git.apps.os.sepia.ceph.com Git - ceph-ci.git/log
ceph-ci.git
2 years agodoc: update multisite doc
parth-gr [Mon, 8 May 2023 13:53:29 +0000 (19:23 +0530)]
doc: update multisite doc

The command for getting the zone group was spelled incorrectly.
Updated to radosgw-admin.

Signed-off-by: parth-gr <paarora@redhat.com>
(cherry picked from commit edab93b2f15b19f05a86aab499ba11b56135aaf3)

2 years agoMerge pull request #51263 from sseshasa/wip-reef-fix-mclk-rec-backfill-cost
Radoslaw Zarzynski [Mon, 8 May 2023 18:23:32 +0000 (20:23 +0200)]
Merge pull request #51263 from sseshasa/wip-reef-fix-mclk-rec-backfill-cost

reef: osd: mClock recovery/backfill cost fixes

Reviewed-by: Sam Just <sjust@redhat.com>
2 years agoMerge pull request #51389 from zdover23/wip-doc-2023-05-08-backport-51387-to-reef
zdover23 [Mon, 8 May 2023 13:37:03 +0000 (23:37 +1000)]
Merge pull request #51389 from zdover23/wip-doc-2023-05-08-backport-51387-to-reef

reef: doc/rados: stretch-mode.rst (other commands)

Reviewed-by: Cole Mitchell <cole.mitchell.ceph@gmail.com>
2 years agodoc/rados: stretch-mode.rst (other commands)
Zac Dover [Mon, 8 May 2023 11:08:49 +0000 (21:08 +1000)]
doc/rados: stretch-mode.rst (other commands)

Edit the "Other Commands" section of
doc/rados/operations/stretch-mode.rst.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit fde33f1a5b8dbd03c096140887e04038a82f3076)

2 years agoMerge pull request #51371 from zdover23/wip-doc-2023-05-06-backport-51359-to-reef
zdover23 [Mon, 8 May 2023 12:41:55 +0000 (22:41 +1000)]
Merge pull request #51371 from zdover23/wip-doc-2023-05-06-backport-51359-to-reef

reef: doc/cephfs: repairing inaccessible FSes

Reviewed-by: Svelar <sunrongqi@huawei.com>
2 years agoMerge pull request #51377 from zdover23/wip-doc-2023-05-07-backport-51322-to-reef
Anthony D'Atri [Sun, 7 May 2023 10:37:21 +0000 (06:37 -0400)]
Merge pull request #51377 from zdover23/wip-doc-2023-05-07-backport-51322-to-reef

reef: doc/rados: stretch-mode: stretch cluster issues

2 years agodoc/rados: stretch-mode: stretch cluster issues
Zac Dover [Wed, 3 May 2023 05:16:07 +0000 (15:16 +1000)]
doc/rados: stretch-mode: stretch cluster issues

Edit "Stretch Cluster Issues", which might better be called "Netsplits"
or "Recognizing Netsplits".

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 6c1baffb85556120672b45cce89b93a20e7b09a2)

2 years agodoc/cephfs: repairing inaccessible FSes
Zac Dover [Fri, 5 May 2023 06:35:28 +0000 (16:35 +1000)]
doc/cephfs: repairing inaccessible FSes

Add a procedure to doc/cephfs/troubleshooting.rst that explains how to
restore access to FileSystems that became inaccessible after
post-Nautilus upgrades. The procedure included here was written by Harry
G Coin, and merely lightly edited by me. I include him here as a
"co-author", but it should be noted that he did the heavy lifting on
this.

See the email thread here for more context:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/HS5FD3QFR77NAKJ43M2T5ZC25UYXFLNW/

Co-authored-by: Harry G Coin <hgcoin@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2 years agoMerge pull request #51253 from rhcs-dashboard/fix-pg-imbalancy-reef
Nizamudeen A [Fri, 5 May 2023 15:19:21 +0000 (20:49 +0530)]
Merge pull request #51253 from rhcs-dashboard/fix-pg-imbalancy-reef

reef: mgr/dashboard: fix CephPGImbalance alert

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
2 years agoMerge pull request #51111 from rhcs-dashboard/wip-59458-reef
Nizamudeen A [Fri, 5 May 2023 05:25:36 +0000 (10:55 +0530)]
Merge pull request #51111 from rhcs-dashboard/wip-59458-reef

reef: mgr/dashboard: expose more grafana configs in service form

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
2 years agoMerge pull request #51349 from zdover23/wip-doc-2023-05-05-backport-51348-to-reef
Anthony D'Atri [Fri, 5 May 2023 03:10:51 +0000 (23:10 -0400)]
Merge pull request #51349 from zdover23/wip-doc-2023-05-05-backport-51348-to-reef

reef: doc: Use `ceph osd crush tree` command to display weight set weights

2 years agodoc: Use `ceph osd crush tree` command to display weight set weights
James Lakin [Thu, 4 May 2023 17:02:36 +0000 (18:02 +0100)]
doc: Use `ceph osd crush tree` command to display weight set weights

The previous `ceph osd tree` doesn't show pool-defined weight-sets as the above documentation suggests.

Signed-off-by: James Lakin <james@jameslakin.co.uk>
(cherry picked from commit 15c3d72a43a37798de823b26f1429f7776f67aaa)

2 years agoMerge pull request #51165 from rhcs-dashboard/wip-59503-reef
Nizamudeen A [Thu, 4 May 2023 15:36:31 +0000 (21:06 +0530)]
Merge pull request #51165 from rhcs-dashboard/wip-59503-reef

reef: mgr/dashboard: hide notification on force promote

Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
2 years agoMerge pull request #50881 from adk3798/reef-backport-49969-50100-50270-50101-50244...
Adam King [Thu, 4 May 2023 14:04:54 +0000 (10:04 -0400)]
Merge pull request #50881 from adk3798/reef-backport-49969-50100-50270-50101-50244-50133-50133-50413-50318-50082-

reef: mgr/cephadm: Reef Batch Backport

Reviewed-by: Teoman ONAY <tonay@ibm.com>
2 years agoMerge pull request #51337 from zdover23/wip-doc-2023-05-04-backport-51292-to-reef
Anthony D'Atri [Thu, 4 May 2023 02:18:43 +0000 (22:18 -0400)]
Merge pull request #51337 from zdover23/wip-doc-2023-05-04-backport-51292-to-reef

reef: doc/rados: edit stretch-mode.rst

2 years agodoc/rados: edit stretch-mode.rst
Zac Dover [Sun, 30 Apr 2023 02:09:51 +0000 (12:09 +1000)]
doc/rados: edit stretch-mode.rst

Edit "Stretch Mode Limitations" (renamed "Limitations of Stretch Mode"
in this commit) in doc/rados/operations/stretch-mode.rst.

Co-authored-by: Greg Farnum <gfarnum@redhat.com>
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 410e2a181c3247d13a1b20d80c4bcbbc1a5f84da)

2 years agoMerge pull request #50978 from batrick/i59295
Yuri Weinstein [Wed, 3 May 2023 22:12:22 +0000 (15:12 -0700)]
Merge pull request #50978 from batrick/i59295

reef: MgrMonitor: batch commit OSDMap and MgrMap mutations

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
2 years agoMerge pull request #50963 from ajarr/wip-58999-reef
Yuri Weinstein [Wed, 3 May 2023 22:11:06 +0000 (15:11 -0700)]
Merge pull request #50963 from ajarr/wip-58999-reef

reef: mgr: store names of modules that register RADOS clients in the MgrMap

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2 years agoMerge pull request #51061 from mkogan1/wip-50842-reef
Casey Bodley [Wed, 3 May 2023 21:36:23 +0000 (17:36 -0400)]
Merge pull request #51061 from mkogan1/wip-50842-reef

reef: rgw : fix python script using s3cmd with error code 403 ubuntu 20.04

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 years agorgw : fix python script using s3cmd with error code 403 ubuntu 20.04
Mark Kogan [Sun, 2 Apr 2023 13:23:25 +0000 (16:23 +0300)]
rgw : fix python script using s3cmd with error code 403 ubuntu 20.04

Fixes: https://tracker.ceph.com/issues/54104
Signed-off-by: Mark Kogan <mkogan@redhat.com>
(cherry picked from commit 5846a9c2677067516f98d502980dab1681cddb69)

2 years agoMerge pull request #51334 from ljflores/wip-59600-reef
Laura Flores [Wed, 3 May 2023 18:38:21 +0000 (13:38 -0500)]
Merge pull request #51334 from ljflores/wip-59600-reef

reef: mgr: add urllib3==1.26.15 to mgr/requirements.txt

2 years agomgr: add urllib3==1.26.15 to mgr/requirements.txt
Laura Flores [Mon, 1 May 2023 16:28:54 +0000 (16:28 +0000)]
mgr: add urllib3==1.26.15 to mgr/requirements.txt

We do not depend on any particular version of
urllib3, but as a workaround to the incompatibility
of urllib3 constraints between kubernetes and
requests, we need to pin it temporarily to
the version both are happy with.

Fixes: https://tracker.ceph.com/issues/59591
Signed-off-by: Laura Flores <lflores@redhat.com>
(cherry picked from commit 80d460005e44649191aa862fa78bd278644b5237)

2 years agoMerge pull request #51309 from zdover23/wip-doc-2023-05-02-backport-51133-to-reef
zdover23 [Tue, 2 May 2023 22:25:04 +0000 (08:25 +1000)]
Merge pull request #51309 from zdover23/wip-doc-2023-05-02-backport-51133-to-reef

reef: doc/mgr: update prompts in prometheus.rst

Reviewed-by: Cole Mitchell <cole.mitchell.ceph@gmail.com>
2 years agoMerge pull request #50631 from trociny/wip-59130-reef
Adam King [Tue, 2 May 2023 21:48:25 +0000 (17:48 -0400)]
Merge pull request #50631 from trociny/wip-59130-reef

reef: mgr/cephadm: don't add mgr into iscsi trusted_ip_list if it's already there

Reviewed-by: Adam King <adking@redhat.com>
2 years agodoc/mgr: update prompts in prometheus.rst
Zac Dover [Tue, 18 Apr 2023 14:28:50 +0000 (16:28 +0200)]
doc/mgr: update prompts in prometheus.rst

Update prompts in prometheus.rst so that they're unselectable.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 5a90d64b343f158d43397c70c267deb4e7ef0e00)

2 years agoMerge pull request #51305 from zdover23/wip-doc-2023-05-02-backport-51299-to-reef
Anthony D'Atri [Mon, 1 May 2023 23:25:48 +0000 (19:25 -0400)]
Merge pull request #51305 from zdover23/wip-doc-2023-05-02-backport-51299-to-reef

reef: doc/radosgw: rabbitmq - push-endpoint edit

2 years agodoc/radosgw: rabbitmq - push-endpoint edit
Zac Dover [Mon, 1 May 2023 17:14:01 +0000 (03:14 +1000)]
doc/radosgw: rabbitmq - push-endpoint edit

Remove a note that directed users to change "push-endpoint" (with a
hyphen) to "push_endpoint" (with an underscore) when using rabbitmq.

Re: https://github.com/ceph/ceph/pull/48486#issuecomment-1529925389

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit e4b35de2abf00d514c76f77645c587c562bab05d)

2 years agoMerge pull request #51302 from zdover23/wip-doc-2023-05-02-backport-51296-to-reef
Anthony D'Atri [Mon, 1 May 2023 20:35:51 +0000 (16:35 -0400)]
Merge pull request #51302 from zdover23/wip-doc-2023-05-02-backport-51296-to-reef

reef: doc/rados: edit stretch-mode.rst

2 years agodoc/rados: edit stretch-mode.rst
Zac Dover [Mon, 1 May 2023 02:29:07 +0000 (12:29 +1000)]
doc/rados: edit stretch-mode.rst

Refine and supplement the introductory and explanatory text at the top
of the /doc/rados/operations/stretch-mode.rst file.

Co-authored-by: Josh Durgin <jdurgin@redhat.com>
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit b642900abc57302e62a5064dba510c3cc5743ac0)

2 years agoqa/: Override mClock profile to 'high_recovery_ops' for qa tests
Sridhar Seshasayee [Sat, 29 Apr 2023 04:48:11 +0000 (10:18 +0530)]
qa/: Override mClock profile to 'high_recovery_ops' for qa tests

The qa tests are not client I/O centric and mostly focus on triggering
recovery/backfills and monitoring them for completion within a finite amount
of time. The same holds true for scrub operations.

Therefore, an mClock profile that optimizes background operations is a
better fit for qa related tests. The osd_mclock_profile is therefore
globally overridden to the 'high_recovery_ops' profile for the Rados suite as
it fits the requirement.

Also, many standalone tests expect recovery and scrub operations to
complete within a finite time. To ensure this, the osd_mclock_profile
option is set to 'high_recovery_ops' as part of the run_osd() function
in ceph-helpers.sh.

A subset of standalone tests explicitly used 'high_recovery_ops' profile.
Since the profile is now set as part of run_osd(), the earlier overrides
are redundant and therefore removed from the tests.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agodoc/: Modify mClock configuration documentation to reflect profile changes
Sridhar Seshasayee [Tue, 11 Apr 2023 17:57:05 +0000 (23:27 +0530)]
doc/: Modify mClock configuration documentation to reflect profile changes

Modify the relevant documentation to reflect:

- change in the default mClock profile to 'balanced'
- new allocations for ops across mClock profiles
- change in the osd_max_backfills limit
- miscellaneous changes related to warnings.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agocommon/options/osd.yaml.in: Change mclock max sequential bandwidth for SSDs
Sridhar Seshasayee [Tue, 11 Apr 2023 16:47:53 +0000 (22:17 +0530)]
common/options/osd.yaml.in: Change mclock max sequential bandwidth for SSDs

The osd_mclock_max_sequential_bandwidth_ssd is changed to 1200 MiB/s as
a reasonable middle ground considering the broad range of SSD capabilities.
This allows mClock's cost model to extract the SSD's capability
depending on the cost of the IO being performed.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agoosd/: Retain the default osd_max_backfills limit to 1 for mClock
Sridhar Seshasayee [Tue, 11 Apr 2023 16:30:11 +0000 (22:00 +0530)]
osd/: Retain the default osd_max_backfills limit to 1 for mClock

The earlier limit of 3 was still aggressive enough to have an impact on
the client and other competing operations. Retain the current default
for mClock. This can be modified if necessary after setting the
osd_mclock_override_recovery_settings option.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agocommon/options/osd.yaml.in: change mclock profile default to balanced
Samuel Just [Tue, 11 Apr 2023 15:15:38 +0000 (08:15 -0700)]
common/options/osd.yaml.in: change mclock profile default to balanced

Let's use the middle profile as the default.
Modify the standalone tests accordingly.

Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agoMerge pull request #51289 from zdover23/wip-doc-2023-04-30-backport-51285-to-reef
Anthony D'Atri [Sat, 29 Apr 2023 20:00:57 +0000 (16:00 -0400)]
Merge pull request #51289 from zdover23/wip-doc-2023-04-30-backport-51285-to-reef

reef: doc/rados: edit stretch-mode procedure

2 years agodoc/rados: edit stretch-mode procedure
Zac Dover [Sat, 29 Apr 2023 00:14:02 +0000 (10:14 +1000)]
doc/rados: edit stretch-mode procedure

Edit the "stretch mode" section in doc/rados/operations/stretch-mode.rst
so that the procedure is formatted as a procedure and the sentences
correctly have heads.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit a19ff7a5ea9bbd24365648a90abfa1b720c5b231)

2 years agoMerge pull request #51286 from zdover23/wip-doc-2023-04-29-backport-51276-to-reef
zdover23 [Sat, 29 Apr 2023 17:32:04 +0000 (03:32 +1000)]
Merge pull request #51286 from zdover23/wip-doc-2023-04-29-backport-51276-to-reef

reef: docs: Update the Prometheus endpoint info

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2 years agodocs: Update the Prometheus endpoint info
Paul Cuzner [Fri, 28 Apr 2023 05:21:39 +0000 (17:21 +1200)]
docs: Update the Prometheus endpoint info

This patch just tidies up some of the links and adds
an example showing how the http_sd_configs option
may be used.

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
(cherry picked from commit 690d34ab08f22cd988828aa2097531627000907e)

2 years agoMerge pull request #51272 from zdover23/wip-doc-2023-04-28-backport-51271-to-reef
Anthony D'Atri [Fri, 28 Apr 2023 00:53:50 +0000 (20:53 -0400)]
Merge pull request #51272 from zdover23/wip-doc-2023-04-28-backport-51271-to-reef

reef: doc/rados: m-config-ref: edit "background"

2 years agodoc/rados: m-config-ref: edit "background"
Zac Dover [Thu, 27 Apr 2023 22:35:17 +0000 (08:35 +1000)]
doc/rados: m-config-ref: edit "background"

Edit the "Background" section of doc/rados/monitor/config-ref.rst

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 9223863fc83095def59b416bf70f9a828a701ccc)

2 years agoMerge pull request #51148 from zdover23/wip-doc-2023-04-20-backport-51143-to-reef
zdover23 [Thu, 27 Apr 2023 20:41:03 +0000 (06:41 +1000)]
Merge pull request #51148 from zdover23/wip-doc-2023-04-20-backport-51143-to-reef

reef: docs: warning and remove few docs section for Filestore

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2 years agodoc/man/rbd: resurrect filestore alloc_size note
Ilya Dryomov [Thu, 20 Apr 2023 10:05:14 +0000 (12:05 +0200)]
doc/man/rbd: resurrect filestore alloc_size note

Mistakenly removed in commit d79f2a81541c ("docs: warning and remove
few docs section for Filestore Update docs after filestore removal.").
The kernel client, however new, will continue to be able to talk to
FileStore OSDs for as long as they exist.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit d3558c49bf0456a199cf55f73c5832f408462ec5)

2 years agoosd/scheduler/mClockScheduler: avoid limits for recovery
Samuel Just [Tue, 11 Apr 2023 15:10:04 +0000 (08:10 -0700)]
osd/scheduler/mClockScheduler: avoid limits for recovery

Now that recovery operations are split between background_recovery and
background_best_effort, rebalance qos params to avoid penalizing
background_recovery while idle.

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/: add counters for ops delayed due to degraded|unreadable target
Samuel Just [Mon, 10 Apr 2023 21:18:49 +0000 (14:18 -0700)]
osd/: add counters for ops delayed due to degraded|unreadable target

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/: add counters for queue latency for PGRecovery[Context]
Samuel Just [Thu, 6 Apr 2023 21:15:02 +0000 (14:15 -0700)]
osd/: add counters for queue latency for PGRecovery[Context]

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/: add per-op latency averages for each recovery related message
Samuel Just [Thu, 6 Apr 2023 20:50:48 +0000 (20:50 +0000)]
osd/: add per-op latency averages for each recovery related message

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/: differentiate priority for PGRecovery[Context]
Samuel Just [Thu, 6 Apr 2023 07:04:05 +0000 (00:04 -0700)]
osd/: differentiate priority for PGRecovery[Context]

PGs with degraded objects should be higher priority.

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/: add MSG_OSD_PG_(BACKFILL|BACKFILL_REMOVE|SCAN) as recovery messages
Samuel Just [Thu, 6 Apr 2023 05:57:48 +0000 (22:57 -0700)]
osd/: add MSG_OSD_PG_(BACKFILL|BACKFILL_REMOVE|SCAN) as recovery messages

Otherwise, these end up as PGOpItem and therefore as immediate:

class PGOpItem : public PGOpQueueable {
...
  op_scheduler_class get_scheduler_class() const final {
    auto type = op->get_req()->get_type();
    if (type == CEPH_MSG_OSD_OP ||
        type == CEPH_MSG_OSD_BACKOFF) {
      return op_scheduler_class::client;
    } else {
      return op_scheduler_class::immediate;
    }
  }
...
};

This was probably causing a bunch of extra interference with client
ops.
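
A minimal C++ sketch of the classification change described above. The enum
names and the classify() helper are hypothetical stand-ins, not the Ceph
types. Only the idea comes from the commit message: backfill,
backfill-remove, and scan messages are treated as recovery messages instead
of falling through to the immediate class.

```cpp
// Hypothetical message types; the real code uses Ceph's message constants.
enum class MsgType {
  OsdOp, OsdBackoff, PgPush, PgPull,
  PgBackfill, PgBackfillRemove, PgScan, Other
};

// Hypothetical, simplified scheduler classes.
enum class SchedClass { Client, Recovery, Immediate };

// True for message types that should be queued as recovery work.
// Treating PgPush as recovery is an assumption made for this sketch.
constexpr bool is_recovery_msg(MsgType t) {
  switch (t) {
    case MsgType::PgPush:
    case MsgType::PgPull:
    case MsgType::PgBackfill:
    case MsgType::PgBackfillRemove:
    case MsgType::PgScan:
      return true;
    default:
      return false;
  }
}

// Recovery messages are checked first, so they no longer end up as immediate.
constexpr SchedClass classify(MsgType t) {
  if (is_recovery_msg(t)) return SchedClass::Recovery;
  if (t == MsgType::OsdOp || t == MsgType::OsdBackoff) return SchedClass::Client;
  return SchedClass::Immediate;
}
```

With this ordering, only messages that are neither client ops nor recovery
traffic reach the immediate class.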

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/: differentiate scheduler class for undersized/degraded vs data movement
Samuel Just [Thu, 6 Apr 2023 05:57:42 +0000 (22:57 -0700)]
osd/: differentiate scheduler class for undersized/degraded vs data movement

Recovery operations on pgs/objects that have fewer than the configured
number of copies should be treated more urgently than operations on
pgs/objects that simply need to be moved to a new location.
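
As a rough illustration, the choice could look like the C++ sketch below. The
enum and function names are hypothetical. The mapping of degraded or
undersized work to a background_recovery-style class and of plain data
movement to a background_best_effort-style class is an assumption based on
the class names that appear elsewhere in this log.

```cpp
// Hypothetical names mirroring the two background classes mentioned in this log.
enum class RecoveryClass { BackgroundRecovery, BackgroundBestEffort };

// Assumption: work on PGs/objects with fewer than the configured number of
// copies is urgent; work that only moves data to a new location is best effort.
constexpr RecoveryClass recovery_class_for(bool degraded_or_undersized) {
  return degraded_or_undersized ? RecoveryClass::BackgroundRecovery
                                : RecoveryClass::BackgroundBestEffort;
}
```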

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/.../OpSchedulerItem: add MSG_OSD_PG_PULL to is_recovery_msg
Samuel Just [Thu, 6 Apr 2023 04:30:18 +0000 (04:30 +0000)]
osd/.../OpSchedulerItem: add MSG_OSD_PG_PULL to is_recovery_msg

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/: move PGRecoveryMsg check from osd into PGRecoveryMsg::is_recovery_msg
Samuel Just [Thu, 6 Apr 2023 04:23:23 +0000 (04:23 +0000)]
osd/: move PGRecoveryMsg check from osd into PGRecoveryMsg::is_recovery_msg

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/: move get_recovery_op_priority into PeeringState next to get_*_priority
Samuel Just [Thu, 6 Apr 2023 03:45:19 +0000 (03:45 +0000)]
osd/: move get_recovery_op_priority into PeeringState next to get_*_priority

Consolidate methods governing recovery scheduling in PeeringState.

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/scheduler: simplify qos specific params in OpSchedulerItem
Samuel Just [Tue, 4 Apr 2023 23:34:17 +0000 (23:34 +0000)]
osd/scheduler: simplify qos specific params in OpSchedulerItem

is_qos_item() was only used in operator<< for OpSchedulerItem.  However,
it's actually useful to see priority for mclock items since it affects
whether it goes into the immediate queues and, for some types, the
class.  Unconditionally display both class_id and priority.

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/scheduler: remove unused PGOpItem::maybe_get_mosd_op
Samuel Just [Tue, 4 Apr 2023 23:22:59 +0000 (23:22 +0000)]
osd/scheduler: remove unused PGOpItem::maybe_get_mosd_op

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/scheduler: remove OpQueueable::get_order_locker() and supporting machinery
Samuel Just [Tue, 4 Apr 2023 23:13:41 +0000 (23:13 +0000)]
osd/scheduler: remove OpQueueable::get_order_locker() and supporting machinery

Apparently unused.

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/scheduler: remove OpQueueable::get_op_type() and supporting machinery
Samuel Just [Tue, 4 Apr 2023 23:05:56 +0000 (23:05 +0000)]
osd/scheduler: remove OpQueueable::get_op_type() and supporting machinery

Apparently unused.

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoPeeringState::clamp_recovery_priority: use std::clamp
Samuel Just [Mon, 3 Apr 2023 20:31:46 +0000 (13:31 -0700)]
PeeringState::clamp_recovery_priority: use std::clamp

Signed-off-by: Samuel Just <sjust@redhat.com>
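
For readers unfamiliar with std::clamp, a small sketch of the pattern the
commit title refers to. The bounds and the free function are hypothetical
and are not the values or signature used in PeeringState.

```cpp
#include <algorithm>

// Hypothetical bounds for illustration; the real limits are defined in Ceph.
constexpr int kMinPriority = 1;
constexpr int kMaxPriority = 100;

// Restrict a priority to [kMinPriority, kMaxPriority] in a single call
// instead of chaining min/max comparisons by hand.
constexpr int clamp_recovery_priority(int priority) {
  return std::clamp(priority, kMinPriority, kMaxPriority);
}
```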
2 years agodoc: Modify mClock configuration documentation to reflect new cost model
Sridhar Seshasayee [Sat, 25 Mar 2023 07:14:40 +0000 (12:44 +0530)]
doc: Modify mClock configuration documentation to reflect new cost model

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agoosd: Retain overridden mClock recovery settings across osd restarts
Sridhar Seshasayee [Tue, 21 Feb 2023 12:24:36 +0000 (17:54 +0530)]
osd: Retain overridden mClock recovery settings across osd restarts

Fix an issue where an overridden mClock recovery setting (set prior to
an osd restart) could be lost after an osd restart.

For example, consider that prior to an osd restart, the option
'osd_max_backfills' was successfully set to a value different from the
mClock default. If the osd was restarted for some reason, the
boot-up sequence was incorrectly resetting the backfill value to the
mClock default within the async local/remote reservers. This fix
ensures that no change is made if the current overridden value is
different from the mClock default.

Modify an existing standalone test to verify that the local and remote
async reservers are updated to the desired number of backfills under
normal conditions and also across osd restarts.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agoosd: Set default max active recovery and backfill limits for mClock
Sridhar Seshasayee [Mon, 20 Mar 2023 12:29:17 +0000 (17:59 +0530)]
osd: Set default max active recovery and backfill limits for mClock

Client ops are sensitive to the recovery load, so recovery limits must be
carefully set for osds whose underlying device is an HDD. Tests revealed that
recoveries with osd_max_backfills = 10 and osd_recovery_max_active_hdd = 5
were still aggressive and overwhelmed client ops. The built-in defaults
for mClock are now set to:

    1) osd_recovery_max_active_hdd = 3
    2) osd_recovery_max_active_ssd = 10
    3) osd_max_backfills = 3

The above may be modified if necessary by setting
osd_mclock_override_recovery_settings option.

Fixes: https://tracker.ceph.com/issues/58529
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agoosd/scheduler/mClockScheduler: make is_rotational const
Samuel Just [Wed, 29 Mar 2023 06:29:58 +0000 (23:29 -0700)]
osd/scheduler/mClockScheduler: make is_rotational const

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd/scheduler/mClockScheduler: simplify profile handling
Samuel Just [Wed, 29 Mar 2023 07:10:57 +0000 (00:10 -0700)]
osd/scheduler/mClockScheduler: simplify profile handling

Previously, setting default configs from the configured profile was
split across:
- enable_mclock_profile_settings
- set_mclock_profile - sets mclock_profile class member
- set_*_allocations - updates client_allocs class member
- set_profile_config - sets profile based on client_allocs class member

This made tracing the effect of changing the profile pretty challenging
due to passing state through class member variables.

Instead, define a simple profile_t with three constexpr values
corresponding to the three profiles and handle it all in a single
set_config_defaults_from_profile() method.
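
A sketch of what such a structure might look like, assuming each profile
carries a reservation, weight, and limit for three classes of ops. The field
layout, the profile_for() helper, the name of the third profile
(high_client_ops), and every number below are placeholders chosen for
illustration. They are not Ceph's real definitions or defaults.

```cpp
// One allocation: reservation, weight, and limit for a class of ops.
struct alloc_t {
  double res;    // reservation as a proportion of capacity
  unsigned wgt;  // relative weight
  double lim;    // limit as a proportion of capacity (0 means no limit here)
};

// All allocations for one profile, kept together as plain data.
struct profile_t {
  alloc_t client;
  alloc_t background_recovery;
  alloc_t background_best_effort;
};

// Placeholder numbers, NOT Ceph's real defaults.
constexpr profile_t high_client_ops  {{0.6, 2, 0.0}, {0.3, 1, 0.0}, {0.1, 1, 0.7}};
constexpr profile_t balanced         {{0.5, 1, 0.0}, {0.5, 1, 0.0}, {0.0, 1, 0.9}};
constexpr profile_t high_recovery_ops{{0.3, 1, 0.0}, {0.7, 2, 0.0}, {0.0, 1, 0.0}};

enum class profile_name { high_client_ops, balanced, high_recovery_ops };

// Single lookup point: no state is passed through member variables.
constexpr const profile_t& profile_for(profile_name n) {
  switch (n) {
    case profile_name::high_client_ops:   return high_client_ops;
    case profile_name::high_recovery_ops: return high_recovery_ops;
    default:                              return balanced;
  }
}
```

Keeping the profiles as constexpr data means the effect of switching profiles
can be read off in one place.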

Signed-off-by: Samuel Just <sjust@redhat.com>
2 years agoosd: Modify mClock scheduler's cost model to represent cost in bytes
Sridhar Seshasayee [Thu, 9 Feb 2023 15:17:44 +0000 (20:47 +0530)]
osd: Modify mClock scheduler's cost model to represent cost in bytes

The mClock scheduler's cost model for HDDs/SSDs is modified and now
represents the cost of an IO in terms of bytes.

The cost parameters, namely osd_mclock_cost_per_io_usec_[hdd|ssd]
and osd_mclock_cost_per_byte_usec_[hdd|ssd], which represent the cost
of an IO in secs, are inaccurate and therefore removed.

The new model considers the following aspects of an osd to calculate
the cost of an IO:

 - osd_mclock_max_capacity_iops_[hdd|ssd] (existing option)
   The measured random write IOPS at 4 KiB block size. This is
   measured during OSD boot-up using the OSD bench tool.
 - osd_mclock_max_sequential_bandwidth_[hdd|ssd] (new config option)
   The maximum sequential bandwidth of the underlying device.
   For HDDs, 150 MiB/s is considered, and for SSDs 750 MiB/s is
   considered in the cost calculation.

The following important changes are made to arrive at the overall
cost of an IO:

1. Represent QoS reservation and limit config parameter as proportion:
The reservation and limit parameters are now set in terms of a
proportion of the OSD's max IOPS capacity. The earlier representation
was in terms of IOPS per OSD shard which required the user to perform
calculations before setting the parameter. Representing the
reservation and limit in terms of proportions is much more intuitive
and simpler for a user.

2. Cost per IO Calculation:
Using the above config options, osd_bandwidth_cost_per_io for the osd is
calculated and set. It is the ratio of the max sequential bandwidth and
the max random write iops of the osd. It is a constant and represents the
base cost of an IO in terms of bytes. This is added to the actual size of
the IO (in bytes) to represent the overall cost of the IO operation. See
mClockScheduler::calc_scaled_cost().
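
The arithmetic in item 2 can be sketched as follows. The function names are
hypothetical and this is not the body of calc_scaled_cost(). It only
restates the description: base cost = max sequential bandwidth / max random
write IOPS, and overall cost = base cost + IO size. The 300 IOPS figure used
in the usage note is an assumed example value.

```cpp
#include <cstdint>

constexpr std::uint64_t MiB = 1024ull * 1024ull;

// Base cost of one IO in bytes: the ratio of the max sequential bandwidth
// (bytes/sec) to the max random write IOPS of the osd.
constexpr double bandwidth_cost_per_io(double max_seq_bw_bytes_per_sec,
                                       double max_iops) {
  return max_seq_bw_bytes_per_sec / max_iops;
}

// Overall cost of an IO in bytes: base cost plus the IO's actual size.
constexpr double scaled_cost(double cost_per_io, std::uint64_t io_size_bytes) {
  return cost_per_io + static_cast<double>(io_size_bytes);
}
```

For example, with the 150 MiB/s HDD bandwidth mentioned above and an assumed
300 IOPS, the base cost is 524288 bytes, so a 4 KiB IO costs 528384 bytes.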

3. Cost calculation in Bytes:
The settings for reservation and limit in terms of a fraction of the OSD's
maximum IOPS capacity are converted to Bytes/sec before updating the
mClock server's ClientInfo structure. This is done for each OSD op shard
using osd_bandwidth_capacity_per_shard shown below:

    (res|lim)  = (IOPS proportion) * osd_bandwidth_capacity_per_shard
    (Bytes/sec)   (unitless)             (bytes/sec)

The above result is updated within the mClock server's ClientInfo
structure for different op_scheduler_class operations. See
mClockScheduler::ClientRegistry::update_from_config().
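
The conversion in item 3 can be sketched like this. The helper names are
hypothetical, and the way the per-shard capacity is derived (an equal split
of the OSD's bandwidth across op shards) is an assumption, since the commit
message only names osd_bandwidth_capacity_per_shard.

```cpp
// Assumption: each op shard gets an equal share of the OSD's bandwidth.
constexpr double bandwidth_capacity_per_shard(double osd_bw_bytes_per_sec,
                                              unsigned num_shards) {
  return osd_bw_bytes_per_sec / num_shards;
}

// (res|lim) in bytes/sec = unitless proportion * per-shard capacity (bytes/sec).
constexpr double proportion_to_bytes_per_sec(double proportion,
                                             double capacity_per_shard) {
  return proportion * capacity_per_shard;
}
```

With 150 MiB/s split over an assumed 5 shards, each shard has 31457280
bytes/sec, and a reservation of 0.5 becomes 15728640 bytes/sec.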

The overall cost of an IO operation (in secs) is finally determined
during the tag calculations performed in the mClock server. See
crimson::dmclock::RequestTag::tag_calc() for more details.

4. Profile Allocations:
Optimize mClock profile allocations due to the change in the cost model
and lower recovery cost.

5. Modify standalone tests to reflect the change in the QoS config
parameter representation of reservation and limit options.

Fixes: https://tracker.ceph.com/issues/58529
Fixes: https://tracker.ceph.com/issues/59080
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agoosd: update PGRecovery queue item cost to reflect object size
Sridhar Seshasayee [Thu, 2 Feb 2023 10:00:26 +0000 (15:30 +0530)]
osd: update PGRecovery queue item cost to reflect object size

Previously, we used a static value of osd_recovery_cost (20M
by default) for PGRecovery. For pools with relatively small
objects, this causes mclock to backfill very very slowly as
20M massively overestimates the amount of IO each recovery
queue operation requires. Instead, add a cost_per_object
parameter to OSDService::awaiting_throttle and set it to the
average object size in the PG being queued.
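
A sketch of the idea, using a hypothetical helper rather than the real
OSDService code: derive the per-object cost from the PG's byte and object
counts instead of a fixed 20M. The floor of 1 for empty PGs or tiny objects
is an assumption added so the cost is never zero.

```cpp
#include <cstdint>

// Average object size in a PG, used as the cost of one recovery queue item.
constexpr std::uint64_t cost_per_object(std::uint64_t pg_bytes,
                                        std::uint64_t pg_objects) {
  if (pg_objects == 0) return 1;  // assumed floor for empty PGs
  const std::uint64_t avg = pg_bytes / pg_objects;
  return avg == 0 ? 1 : avg;      // assumed floor for very small objects
}
```

For a PG holding ten 4 KiB objects this yields 4096, far below the static 20M
value that made backfill so slow for pools with small objects.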

Fixes: https://tracker.ceph.com/issues/58606
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agoosd: update OSDService::queue_recovery_context to specify cost
Sridhar Seshasayee [Thu, 2 Feb 2023 08:12:39 +0000 (13:42 +0530)]
osd: update OSDService::queue_recovery_context to specify cost

Previously, we always queued this with cost osd_recovery_cost which
defaults to 20M. With mclock, this caused these items to be delayed
heavily. Instead, base the cost on the operation queued.

Fixes: https://tracker.ceph.com/issues/58606
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agoosd/osd_types: use appropriate cost value for PullOp
Sridhar Seshasayee [Fri, 3 Feb 2023 05:36:06 +0000 (11:06 +0530)]
osd/osd_types: use appropriate cost value for PullOp

See included comments -- previous values did not account for object
size.  This causes problems for mclock which is much more strict
in how it interprets costs.

Fixes: https://tracker.ceph.com/issues/58607
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agoosd/osd_types: use appropriate cost value for PushReplyOp
Sridhar Seshasayee [Wed, 25 Jan 2023 08:19:59 +0000 (13:49 +0530)]
osd/osd_types: use appropriate cost value for PushReplyOp

See included comments -- previous values did not account for object
size.  This causes problems for mclock which is much more strict
in how it interprets costs.

Fixes: https://tracker.ceph.com/issues/58529
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
2 years agomgr/dashboard: fix CephPGImbalance alert
Aashish Sharma [Mon, 24 Apr 2023 06:14:11 +0000 (11:44 +0530)]
mgr/dashboard: fix CephPGImbalance alert

Fixes: https://tracker.ceph.com/issues/55568
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 8b5c4d27c20bce82bb46064a2cd2928a0736e6cd)

2 years agoMerge pull request #51239 from zdover23/wip-doc-2023-04-27-backport-51154-to-reef
zdover23 [Thu, 27 Apr 2023 00:44:25 +0000 (10:44 +1000)]
Merge pull request #51239 from zdover23/wip-doc-2023-04-27-backport-51154-to-reef

reef: doc/rados/ops: edit user-management.rst (3 of x)

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>
2 years agodoc/rados/ops: edit user-management.rst (3 of x)
Zac Dover [Thu, 20 Apr 2023 08:25:00 +0000 (10:25 +0200)]
doc/rados/ops: edit user-management.rst (3 of x)

Line-edit doc/rados/user-management.rst (3 of x).

https://tracker.ceph.com/issues/58485

Follows https://github.com/ceph/ceph/pull/51140.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 97b751ed8f8917f9d4d9cbca03f224e6518836ef)

2 years agoMerge pull request #51155 from zdover23/wip-doc-2023-04-20-backport-51140-to-reef
zdover23 [Thu, 27 Apr 2023 00:09:09 +0000 (10:09 +1000)]
Merge pull request #51155 from zdover23/wip-doc-2023-04-20-backport-51140-to-reef

reef: doc/rados: edit user-management (2 of x)

Reviewed-by: Cole Mitchell <cole.mitchell@gmail.com>
2 years agoMerge pull request #51235 from zdover23/wip-doc-2023-04-27-backport-51204-to-reef
Anthony D'Atri [Wed, 26 Apr 2023 22:25:55 +0000 (18:25 -0400)]
Merge pull request #51235 from zdover23/wip-doc-2023-04-27-backport-51204-to-reef

reef: doc/cephfs: explain cephfs data and metadata set

2 years agodoc/cephfs: explain cephfs data and metadata set
Zac Dover [Tue, 25 Apr 2023 07:46:53 +0000 (17:46 +1000)]
doc/cephfs: explain cephfs data and metadata set

Explain how to set application metadata for the CephFS data pool and the
CephFS metadata pool.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 9152f9700420f9735533f276559af87dff97bd75)

2 years agoMerge pull request #51012 from cbodley/wip-59358
Casey Bodley [Wed, 26 Apr 2023 15:18:00 +0000 (11:18 -0400)]
Merge pull request #51012 from cbodley/wip-59358

reef: rgw/keystone: use secret key from EC2 for sigv4 streaming mode

Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>
2 years agoMerge pull request #51220 from zdover23/wip-doc-2023-04-26-backport-51193-to-reef
Anthony D'Atri [Wed, 26 Apr 2023 00:21:53 +0000 (20:21 -0400)]
Merge pull request #51220 from zdover23/wip-doc-2023-04-26-backport-51193-to-reef

reef: doc/start: rewrite intro paragraph

2 years agodoc/start: rewrite intro paragraph
Zac Dover [Mon, 24 Apr 2023 11:02:16 +0000 (13:02 +0200)]
doc/start: rewrite intro paragraph

Rewrite the first paragraph in doc/start/intro.rst.

Signed-off-by: Zac Dover <zac.dover@proton.me>
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit bea01d5f1469030253a3403dbb9e2c9fa97806ac)

2 years agoMerge pull request #51022 from cbodley/wip-59151
Casey Bodley [Tue, 25 Apr 2023 18:09:41 +0000 (14:09 -0400)]
Merge pull request #51022 from cbodley/wip-59151

reef: rgw: install rgw scripts with common files rather than radosgw files

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51019 from cbodley/wip-59273
Casey Bodley [Tue, 25 Apr 2023 15:35:26 +0000 (11:35 -0400)]
Merge pull request #51019 from cbodley/wip-59273

reef: rgw/admin: 'data sync status' formats binary error repo entries

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51024 from cbodley/wip-59133
Casey Bodley [Tue, 25 Apr 2023 15:35:01 +0000 (11:35 -0400)]
Merge pull request #51024 from cbodley/wip-59133

reef: rgw/s3: DeleteObjects response uses correct delete_marker flag

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51015 from cbodley/wip-59292
Casey Bodley [Tue, 25 Apr 2023 15:11:45 +0000 (11:11 -0400)]
Merge pull request #51015 from cbodley/wip-59292

reef: qa/rgw: add rgw/upgrade suite

Reviewed-by: Ali Maredia <amaredia@redhat.com>
2 years agoMerge pull request #51014 from cbodley/wip-59280
Casey Bodley [Tue, 25 Apr 2023 15:02:48 +0000 (11:02 -0400)]
Merge pull request #51014 from cbodley/wip-59280

reef: rgw: set init_check_compat when bucket sync status doesn't exist

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51020 from cbodley/wip-59275
Casey Bodley [Tue, 25 Apr 2023 15:01:40 +0000 (11:01 -0400)]
Merge pull request #51020 from cbodley/wip-59275

reef: rgw/sts: Fixes get_cert_url improper url path concatenation

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51145 from cbodley/wip-59493
Casey Bodley [Tue, 25 Apr 2023 14:10:17 +0000 (10:10 -0400)]
Merge pull request #51145 from cbodley/wip-59493

reef: cmake/rgw: librgw tests depend on ALLOC_LIBS

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51013 from cbodley/wip-59278
Casey Bodley [Tue, 25 Apr 2023 14:09:49 +0000 (10:09 -0400)]
Merge pull request #51013 from cbodley/wip-59278

reef: rgw: fix CopyObj crash after admin override

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51017 from cbodley/wip-59360
Casey Bodley [Tue, 25 Apr 2023 14:09:17 +0000 (10:09 -0400)]
Merge pull request #51017 from cbodley/wip-59360

reef: rgw: fix rgw cache invalidation after unregister_watch() error

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51018 from cbodley/wip-59377
Casey Bodley [Tue, 25 Apr 2023 14:09:08 +0000 (10:09 -0400)]
Merge pull request #51018 from cbodley/wip-59377

reef: rgw/civetweb: handle old clients with transfer-encoding: chunked.

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51021 from cbodley/wip-59356
Casey Bodley [Tue, 25 Apr 2023 14:08:33 +0000 (10:08 -0400)]
Merge pull request #51021 from cbodley/wip-59356

reef: rgw/sse-s3: fix bucket encryption of multipart upload

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51023 from cbodley/wip-59232
Casey Bodley [Tue, 25 Apr 2023 14:08:02 +0000 (10:08 -0400)]
Merge pull request #51023 from cbodley/wip-59232

reef: rgw/notifications: support bucket notification with bucket policy

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
2 years agoMerge pull request #51025 from cbodley/wip-59145
Casey Bodley [Tue, 25 Apr 2023 14:07:43 +0000 (10:07 -0400)]
Merge pull request #51025 from cbodley/wip-59145

reef: rgw: Do not duplicate query-string in ops-log

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #51026 from cbodley/wip-59028
Casey Bodley [Tue, 25 Apr 2023 14:07:31 +0000 (10:07 -0400)]
Merge pull request #51026 from cbodley/wip-59028

reef: rgw: use unique_ptr for flat_map emplace in BucketTrimWatcher

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
2 years agoMerge pull request #51027 from cbodley/wip-59013
Casey Bodley [Tue, 25 Apr 2023 14:07:18 +0000 (10:07 -0400)]
Merge pull request #51027 from cbodley/wip-59013

reef: rgw/notifications: fetch object state to get size, in rgw_lc.cc

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
2 years agoMerge pull request #51028 from cbodley/wip-59220
Casey Bodley [Tue, 25 Apr 2023 14:06:46 +0000 (10:06 -0400)]
Merge pull request #51028 from cbodley/wip-59220

reef: qa/rgw: unpin centos for verify suite

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoqa/distros: pass --allowerasing --nobest when installing container-tools
Adam King [Thu, 16 Feb 2023 17:34:06 +0000 (12:34 -0500)]
qa/distros: pass --allowerasing --nobest when installing container-tools

One of the tests in the orch suite is running distro install
commands from multiple distros, causing it to first install
container-tools 3.0 and then later install container-tools,
which fails, causing the test to fail. This is sort of a bandaid
fix to get the test to work. It causes whichever version of the
package is installed last to be the one that ends up installed
(without error), which is what we want in the tests.

Fixes: https://tracker.ceph.com/issues/57771
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 3011d954986e38ae8f7c9cd027ef2a88dff9a3d8)

2 years agomgr/rgw: adding mgr rgw module to ceph image
Redouane Kachach [Mon, 27 Feb 2023 08:56:54 +0000 (09:56 +0100)]
mgr/rgw: adding mgr rgw module to ceph image
Fixes: https://tracker.ceph.com/issues/58856
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 110db72e24dab6667adc21259355cf17dc20ac86)

2 years agomgr/cephadm: use SFTP instead of SCP to copy cephadm remote files
Redouane Kachach [Mon, 3 Apr 2023 16:34:25 +0000 (18:34 +0200)]
mgr/cephadm: use SFTP instead of SCP to copy cephadm remote files
Fixes: https://tracker.ceph.com/issues/59298

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 99148b4a183f54a8779edbf529dfb34bba67e2a9)

2 years agomgr/cephadm: Adding extra arguments support for RGW frontend
Redouane Kachach [Wed, 26 Oct 2022 09:33:38 +0000 (11:33 +0200)]
mgr/cephadm: Adding extra arguments support for RGW frontend
Fixes: https://tracker.ceph.com/issues/57931
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 2c46c0741962e0e6a5ddbc960dfd21948daf0947)

2 years agopython-common: add a dedicated tox env to run mypy
John Mulligan [Thu, 30 Mar 2023 20:49:27 +0000 (16:49 -0400)]
python-common: add a dedicated tox env to run mypy

IMO it's not a good practice to overload a tox rule with multiple
different test tools. It forces the tools to share the same virtualenvs
and makes it impossible to run the tools individually. A separate mypy
env also better matches the other tox.ini files in the ceph tree.
Since the new 'mypy' env is in the default env list it will continue
to get run automatically when no specific envs are selected.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit ff39f87701ba5935902f2a6c67d7ad178bddc5e0)

2 years agomypy: update pinned mypy version to 0.981
John Mulligan [Tue, 28 Mar 2023 20:42:41 +0000 (16:42 -0400)]
mypy: update pinned mypy version to 0.981

mypy version 0.981 fixes a bug where, on newer Python versions, mypy
doesn't properly load pyi files with keyword-only arguments.
As noted in src/mypy-constrains.txt, the mypy version needs to be
manually bumped periodically, and ceph is overdue for an update.
It has not been updated since the file was added in June 2021.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 866f14d40cd3ffd30d85f9d2c09cf4a25948cd5c)

2 years agopython-common: fix variable name reuse to make mypy happy
John Mulligan [Thu, 30 Mar 2023 20:48:02 +0000 (16:48 -0400)]
python-common: fix variable name reuse to make mypy happy

The variables high and low were being used as both `str`s and regex
match objects. Rename the vars in the if block to avoid this problem.
This change makes this file pass mypy checking on mypy 0.981.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit f2646dbaba943baccd2fa7d7860c73fa05e7cd8d)
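
The kind of rename described in the commit above can be sketched as follows. This is a minimal illustration, not the actual python-common code: the function name parse_range, the "LOW:HIGH" input format, and the regex are all assumptions made for the example. The point is only that the regex match objects get their own names (low_match, high_match) so that low and high keep a single type for mypy.

```python
import re
from typing import Optional, Tuple


def parse_range(spec: str) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: split 'LOW:HIGH' into two size strings."""
    low: Optional[str] = None
    high: Optional[str] = None
    if ":" in spec:
        low_part, high_part = spec.split(":", 1)
        # Distinct names for the match objects; reusing `low`/`high` here
        # would make each variable both Optional[str] and Optional[re.Match],
        # which mypy reports as an incompatible assignment.
        low_match = re.match(r"\d+[A-Za-z]*$", low_part)
        high_match = re.match(r"\d+[A-Za-z]*$", high_part)
        if low_match:
            low = low_match.group(0)
        if high_match:
            high = high_match.group(0)
    return low, high
```
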

2 years agosrc/pybind: fix type annotations for signal handler function
John Mulligan [Wed, 29 Mar 2023 14:15:10 +0000 (10:15 -0400)]
src/pybind: fix type annotations for signal handler function

This change makes this file pass mypy checking on mypy 0.981.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 3035ca6c52168245dcc2104ef58615948697e740)
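
For reference, a signal handler annotated in a way that mypy accepts might look like the sketch below. It does not reproduce the src/pybind change; the names handle_signal, install_handler, and received are hypothetical. The handler takes the signal number as an int and the current stack frame as Optional[FrameType], since the frame may be None.

```python
import signal
from types import FrameType
from typing import List, Optional

# Hypothetical record of the signals seen so far, used only for illustration.
received: List[int] = []


def handle_signal(signum: int, frame: Optional[FrameType]) -> None:
    """Record the signal number; the frame argument may be None."""
    received.append(signum)


def install_handler() -> None:
    """Register the handler for SIGINT (must be called from the main thread)."""
    signal.signal(signal.SIGINT, handle_signal)
```
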