librbd: localize snap_remove op for mirror snapshots
A client may not issue a lock request quickly enough to obtain the
exclusive lock for an operation when a competing client responds
faster. This can happen when a peer site has different performance
characteristics or higher latency. Instead of relying on this
unpredictable behavior, localize the operation to the primary cluster.
Fixes: https://tracker.ceph.com/issues/59393
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
(cherry picked from commit ac552c9b4d65198db8038d397a3060d5a030917d)
Conflicts:
src/cls/rbd/cls_rbd.cc [ commit 3a93b40721a1 ("librbd:
s/boost::variant/std::variant/") not in quincy ]
src/librbd/mirror/snapshot/UnlinkPeerRequest.cc [ ditto ]
Adam King [Tue, 2 May 2023 22:59:18 +0000 (18:59 -0400)]
mgr/cephadm: verify mon spec exists before trying to grab from spec store
In a normal deployment, we generally shouldn't have
to worry about this. This is mostly for teuthology,
which deploys clusters in a way that can leave
the cluster without a mon spec. Fixes an issue
seen when backporting the mon crush location work
to quincy, where an upgrade test would fail with
```
[WRN] UPGRADE_REDEPLOY_DAEMON: Upgrading daemon mon.b on host smithi047 failed.
Upgrade daemon: mon.b: Service mon not found.
```
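As a rough illustration of the defensive check, a minimal sketch with hypothetical stand-ins for cephadm's spec store (not the real mgr/cephadm API):
```
# Illustrative only: hypothetical stand-ins for cephadm's spec store,
# not the real mgr/cephadm API.
from typing import Dict, List, Optional


class SpecStore:
    def __init__(self, specs: Dict[str, dict]):
        self._specs = specs

    def get(self, service_name: str) -> Optional[dict]:
        return self._specs.get(service_name)


def mon_deploy_args(store: SpecStore) -> List[str]:
    # Verify the mon spec exists before reading it: teuthology-style
    # deployments can leave the cluster without any mon spec.
    spec = store.get('mon')
    if spec is None:
        return []  # deploy with defaults instead of failing "Service mon not found"
    locations = spec.get('crush_locations', {})
    return [f'--set-crush-location={bucket}={loc}'
            for bucket, loc in locations.items()]


# No mon spec present: no extra args and, importantly, no exception.
print(mon_deploy_args(SpecStore({})))
```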
Adam King [Mon, 28 Nov 2022 16:59:59 +0000 (11:59 -0500)]
mgr/cephadm: Set mon crush locations based on service spec
The part that added the --set-crush-location flag
when deploying the mon was handled in another commit. This
piece finishes the functionality by having cephadm set
the location through commands, to handle the case where
multiple bucket=loc pairs are specified for a single monitor.
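A minimal sketch of what such a follow-up command could look like, assuming the `ceph mon set_location` command; the helper and the spec layout are illustrative, not cephadm's actual code:
```
# Illustrative only: applies a mon's crush location from a service-spec style
# mapping by shelling out to `ceph mon set_location`; the helper and the spec
# layout are assumptions, not cephadm's code.
import subprocess
from typing import Dict


def set_mon_crush_location(mon_id: str, location: Dict[str, str]) -> None:
    # --set-crush-location at deploy time covers a single bucket=loc pair;
    # for multiple pairs, issue a follow-up command carrying all of them.
    pairs = [f'{bucket}={loc}' for bucket, loc in location.items()]
    if pairs:
        subprocess.run(['ceph', 'mon', 'set_location', mon_id, *pairs],
                       check=True)


# Example: mon.a placed in both a datacenter and a rack bucket.
set_mon_crush_location('a', {'datacenter': 'dc1', 'rack': 'r1'})
```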
Adam King [Mon, 28 Nov 2022 19:14:58 +0000 (14:14 -0500)]
mgr/cephadm: redo service level config when spec is updated
Previously, the service config function was only called
when we deployed a new daemon for that service. That meant
that updates to the spec that don't affect daemon placement,
such as changing a cert, wouldn't trigger the service-level
config to happen again. With this change, we now mark
the service as needing its config function run if a daemon
for the service is added or removed or if the spec is updated.
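A minimal sketch of the bookkeeping this describes, with hypothetical names rather than cephadm's actual fields:
```
# Illustrative only: track which services need their service-level config
# reapplied; names are hypothetical, not the actual cephadm fields.
class ServiceConfigTracker:
    def __init__(self) -> None:
        self._needs_config: set = set()

    def on_spec_updated(self, service_name: str) -> None:
        # A cert change or other spec edit may not move any daemons,
        # but the service-level config still has to be reapplied.
        self._needs_config.add(service_name)

    def on_daemon_added_or_removed(self, service_name: str) -> None:
        self._needs_config.add(service_name)

    def pop_services_to_configure(self) -> set:
        pending, self._needs_config = self._needs_config, set()
        return pending


tracker = ServiceConfigTracker()
tracker.on_spec_updated('grafana')          # e.g. a rotated certificate
print(tracker.pop_services_to_configure())  # {'grafana'}
```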
Zac Dover [Mon, 22 May 2023 21:41:09 +0000 (07:41 +1000)]
doc/glossary: update bluestore entry
Update the BlueStore entry in the glossary, explaining that, as of Reef,
BlueStore and only BlueStore (and not FileStore) is the storage backend
for Ceph.
This topic has been discussed many times; recently at the Dev
Summit of Cephalocon 2023.
This commit is the minimal version of the work, contained entirely
within the `doc` directory. However, it will likely be expanded, as there
were ideas such as adding cache tiering back to the experimental feature
list (Sam) to warn users when deploying a new cluster.
doc: Add missing `ceph` command in documentation section `REPLACING AN OSD`
Signed-off-by: Alexander Proschek <alexander.proschek@protonmail.com>
(cherry picked from commit 0557d5e465556adba6d25db62a40ba55a5dd2400)
Zac Dover [Thu, 18 May 2023 21:07:02 +0000 (07:07 +1000)]
doc/radosgw: explain multisite dynamic sharding
Add a note to doc/radosgw/dynamicresharding.rst and a note to
doc/radosgw/multisite.rst that explains that dynamic resharding is not
supported in releases prior to Reef.
This commit is made in response to a request from Mathias Chapelain.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit d4ed4223d914328361528990f89f1ee4acd30e79)
qa/tasks: Change default mClock profile to high_recovery_ops
With the new mClock default profile, tests were failing with "Exiting scrub checking -- not all pgs scrubbed" due to slower scrubs.
Changing the default profile to high_recovery_ops for testing purposes will fix this issue.
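For reference, a sketch of applying the same profile by hand; the wrapper is illustrative, while `osd_mclock_profile` and `high_recovery_ops` are the standard option name and value:
```
# Illustrative wrapper around the ceph CLI; osd_mclock_profile and
# high_recovery_ops are the standard option name and value.
import subprocess


def set_mclock_profile(profile: str = 'high_recovery_ops') -> None:
    subprocess.run(
        ['ceph', 'config', 'set', 'osd', 'osd_mclock_profile', profile],
        check=True)


set_mclock_profile()  # scrubs/recovery get more bandwidth than under the default profile
```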
Zac Dover [Wed, 17 May 2023 12:25:38 +0000 (22:25 +1000)]
doc/cephfs: line-edit "Mirroring Module"
Line-edit the "Mirroring Module" section of
doc/cephfs/cephfs-mirroring.rst. Add prompts and formatting where such
things contribute to the realization of adequate sentences.
This commit is a follow-up to https://github.com/ceph/ceph/pull/51505.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit dd8855d9a934bcdd6a026f1308ba7410b1e143e3)
Aashish Sharma [Mon, 8 May 2023 07:19:13 +0000 (12:49 +0530)]
mgr/dashboard: fix regression caused by CephPGImbalance alert
An earlier fix introduced a regression that prevented alerts from being
displayed in the active alerts tab. This PR fixes that issue.
Venky Shankar [Tue, 16 May 2023 05:25:34 +0000 (10:55 +0530)]
doc: explain cephfs mirroring `peer_add` step in detail
@zdover23 reached out regarding the missing explanation for the `peer_add`
step in the cephfs mirroring documentation. Add some explanation and
an example to make the step clear.
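A hedged illustration of the `peer_add` step being documented; the file system names, user, and cluster name below are placeholders:
```
# Illustrative only: the file system names, user and cluster name are
# placeholders for the documented peer_add invocation.
import subprocess

subprocess.run([
    'ceph', 'fs', 'snapshot', 'mirror', 'peer_add',
    'cephfs',                       # local file system being mirrored
    'client.mirror_remote@remote',  # peer spec: <client-name>@<cluster-name>
    'cephfs',                       # file system name on the remote cluster
], check=True)
```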
Matan Breizman [Wed, 1 Feb 2023 08:49:19 +0000 (08:49 +0000)]
messages/MOSDMap: Rename oldest_map to cluster_osdmap_trim_lower_bound
Previously, MOSDMap messages sent to other OSDs were populated with the
superblock's oldest_map. We should, instead, use the superblock's
cluster_osdmap_trim_lower_bound, because oldest_map is merely a marker
for each OSD's trimming progress.
As specified in the docs:
***
We use cluster_osdmap_trim_lower_bound rather than a specific osd's oldest_map
because we don't necessarily trim all MOSDMap::oldest_map. In order to avoid
doing too much work at once we limit the amount of osdmaps trimmed using
``osd_target_transaction_size`` in OSD::trim_maps().
For this reason, a specific OSD's oldest_map can lag behind
OSDSuperblock::cluster_osdmap_trim_lower_bound
for a while.
***
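A toy sketch of why a single OSD's oldest_map can trail the cluster-wide lower bound: trimming happens in bounded batches. All names here are illustrative, not the actual OSD::trim_maps() code:
```
# Toy model of batched osdmap trimming: each pass removes at most
# target_transaction_size maps, so a local oldest_map can trail the
# cluster-wide trim lower bound for a while. Names are illustrative.
def trim_maps(oldest_map: int, trim_lower_bound: int,
              target_transaction_size: int = 30) -> int:
    # Trim toward the cluster-wide lower bound, but never more than the
    # per-transaction limit in a single pass.
    return max(oldest_map,
               min(trim_lower_bound, oldest_map + target_transaction_size))


oldest = 100
lower_bound = 200
oldest = trim_maps(oldest, lower_bound)  # 130: still behind the lower bound
print(oldest)
```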
Matan Breizman [Thu, 3 Nov 2022 08:59:11 +0000 (08:59 +0000)]
osd: Remove oldest_stored_osdmap()
The only usage was for identifying map gaps on new intervals.
We should use max_oldest_stored_osdmap() instead, since a specific
OSD's oldest_map may lag behind.
Matan Breizman [Wed, 2 Nov 2022 10:40:03 +0000 (10:40 +0000)]
osd: Fix check_past_interval_bounds()
When getting the required past interval bounds we use
oldest_map or the current pg info (lec/ec).
Before this change, the oldest_map epoch was set from the
OSD's superblock.oldest_map.
The fix uses the max_oldest_map received from other peers
instead, since a specific OSD's oldest_map can lag behind
for a while in order to avoid large trimming workloads.
Samuel Just [Wed, 26 Oct 2022 04:46:24 +0000 (21:46 -0700)]
doc/dev/osd_internals: add past_intervals.rst
Add an explanation of past_intervals.
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit cd4c031e5e5f5b0318347a7957310cb7358380f6)
rgw/multisite: set truncated flag to true after fetching remote mdlogs.
This is in addition to 2ed1c3e. Ensures that we continue syncing logs
that have already been fetched and don't end up cloning more logs prematurely.
rgw: read incremental metalog from master cluster based on truncated variable
When the log entry in the meta.log object of the secondary cluster is empty,
the value of max_marker is also empty, which cannot satisfy the requirement
that mdlog_marker <= max_marker. As a result, the secondary cluster cannot
fetch new log entries from the master cluster and loops indefinitely, so the
secondary cluster's metadata can never catch up with the master cluster.
When truncated is false, it means the secondary cluster's meta.log is empty
and we can read more from the master cluster.
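A toy sketch of the decision the truncated flag drives; the names are stand-ins for the actual RGW metadata sync code:
```
# Toy model of the decision the truncated flag drives; names are stand-ins
# for the actual RGW metadata sync code.
def should_fetch_from_master(mdlog_marker: str, max_marker: str,
                             truncated: bool) -> bool:
    if not truncated:
        # The last fetch was not truncated: the secondary's meta.log may
        # simply be empty (empty max_marker), so read more from the master
        # rather than waiting on a marker comparison that can never pass.
        return True
    # Otherwise keep consuming the already-fetched log entries first.
    return False


# Empty local log: without consulting `truncated`, the marker check alone
# would never let the secondary make progress.
print(should_fetch_from_master('1_000123', '', truncated=False))  # True
```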
Zac Dover [Fri, 12 May 2023 10:35:25 +0000 (20:35 +1000)]
doc/cephfs: rectify prompts in fs-volumes.rst
Make sure all prompts are unselectable. This PR is meant to be
backported to Reef, Quincy, and Pacific, to get all of the prompts into
a fit state so that a line-edit can be performed on the English language
in this file.
Ramana Raja [Wed, 15 Feb 2023 15:12:54 +0000 (10:12 -0500)]
mgr/rbd_support: recover from rados client blocklisting
In certain scenarios the OSDs were slow to process RBD requests.
This led to the rbd_support module's RBD client not being able to
gracefully hand over an RBD exclusive lock to another RBD client.
After the condition persisted for some time, the other RBD client
forcefully acquired the lock by blocklisting the rbd_support module's
RBD client, and consequently blocklisted the module's RADOS client. The
rbd_support module stopped working. To recover the module, the entire
mgr service had to be restarted, which reloaded other mgr modules.
Instead of recovering the rbd_support module from client blocklisting
by being disruptive to other mgr modules, recover the module
automatically without restarting the mgr service. When the client is
blocklisted, shut down the module's handlers and the blocklisted client,
create a new rados client for the module, and start new handlers.
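A minimal sketch of that recovery flow, with hypothetical names standing in for the actual rbd_support module:
```
# Illustrative recovery flow only: hypothetical names, not the actual
# rbd_support mgr module code.
import rados


class RbdSupportLikeModule:
    def __init__(self, conf_path: str = '/etc/ceph/ceph.conf'):
        self.conf_path = conf_path
        self.client = None

    def start_handlers(self) -> None:
        # Placeholder for starting the task/mirror/perf handlers with the
        # freshly created rados client.
        pass

    def start(self) -> None:
        self.client = rados.Rados(conffile=self.conf_path)
        self.client.connect()
        self.start_handlers()

    def on_blocklisted(self) -> None:
        # Shut down the handlers and the blocklisted client, then bring up a
        # new client and new handlers, instead of restarting the whole mgr
        # and disrupting other modules.
        if self.client is not None:
            self.client.shutdown()
        self.start()
```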