qa: add "failover / failback loop" test for rbd-mirror
For snapshot-based mirroring, check that demote (or other mirror
snapshots) don't pile up. Nothing in particular to assert on for
journal-based mirroring but the test is still useful.
Ilya Dryomov [Sat, 26 Aug 2023 11:04:52 +0000 (13:04 +0200)]
librbd: make CreatePrimaryRequest remove any unlinked mirror snapshots
After commit ac552c9b4d65 ("librbd: localize snap_remove op for mirror
snapshots"), rbd-mirror daemon no longer removes mirror snapshots when
it's done syncing them -- instead it only unlinks from them. However,
CreatePrimaryRequest state machine was not adjusted to compensate and
hence two cases were missed:
- primary demotion snapshot (rbd-mirror daemon unlinks from primary
demotion snapshots just like it does from regular primary snapshots);
this comes up when an image is demoted but then promoted on the same
cluster
- non-primary demotion snapshot (unlike regular non-primary snapshots,
non-primary demotion snapshots store peer uuids and rbd-mirror daemon
does unlinking just like in the case of primary snapshots); this
comes up when an image is demoted and promoted on the other cluster
Related is the case of orphan snapshots. Since they are dummy to begin
with, CreatePrimaryRequest would now clean up the orphan snapshot after
the creation of the force promote snapshot.
Fixes: https://tracker.ceph.com/issues/61707 Co-authored-by: Christopher Hoffman <choffman@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 9c05d3d81f4b06af2cfd47376e9ad86369bdf8cf)
Conflicts:
src/librbd/mirror/snapshot/CreatePrimaryRequest.cc [ commit 3a93b40721a1 ("librbd: s/boost::variant/std::variant/") not
in pacific ]
Ilya Dryomov [Tue, 22 Aug 2023 15:27:50 +0000 (17:27 +0200)]
librbd: don't attempt to remove image state on orphan snapshots
Despite being mirror snapshots, orphan snapshots don't have image
state: see CreateNonPrimaryRequest::write_image_state() for a similar
is_orphan() check. Attempting to remove image state generates bogus
"failed to read image state object" and "failed to remove image state"
errors.
It's a no-no to acquire locks in these "fast" messenger methods. This
can lead to messenger slow downs in the best case as it's blocking reads
on the wire. In the worse case, the messenger may deadlock with other
threads, preventing any further message reads off the wire.
It's not obvious this method is "fast" so I've added a comment regarding
this.
Fixes: https://tracker.ceph.com/issues/61874 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 69980823e62f67d502c4045e15c41c5c44cd5127)
python-common: drive_selection: fix KeyError when osdspec_affinity is not set
When osdspec_affinity is not set, the drive selection code will fail.
This can happen when a device has multiple LVs where some of are used
by Ceph and at least one LV isn't used by Ceph.
Ilya Dryomov [Mon, 14 Aug 2023 11:16:59 +0000 (13:16 +0200)]
qa/suites/upgrade/octopus-x: skip TestClsRbd.mirror_snapshot test
The behavior of the class method changed in reef; the change was
backported to pacific and quincy. An octopus test binary used against
pacific OSDs produces an expected failure:
[ RUN ] TestClsRbd.mirror_snapshot
.../ceph-15.2.17/src/test/cls_rbd/test_cls_rbd.cc:2279: Failure
Expected equality of these values:
-85
mirror_image_snapshot_unlink_peer(&ioctx, oid, 1, "peer2")
Which is: 0
[ FAILED ] TestClsRbd.mirror_snapshot (6 ms)
liu shi [Fri, 14 May 2021 07:51:01 +0000 (03:51 -0400)]
cpu_profiler: fix asok command crash
fixes: https://tracker.ceph.com/issues/50814 Signed-off-by: liu shi <liu.shi@navercorp.com>
(cherry picked from commit be7303aafe34ae470d2fd74440c3a8d51fcfa3ff)
Patrick Donnelly [Fri, 21 Jul 2023 15:56:49 +0000 (11:56 -0400)]
mds: adjust cap acquisition throttles
For production workloads, these defaults rarely help. Adjust
accordingly. For a steady state "find" workload, these new throttles
will prevent acquiring more than ~2300 caps/second which is quite
manageable with typical recall rates.
-ln(0.5) / 30 * 100k = 2310
Fixes: https://tracker.ceph.com/issues/62114 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit f290ef9d0d2d09fb978d56c46be704c6efd45c43)
Venky Shankar [Wed, 9 Aug 2023 05:43:01 +0000 (11:13 +0530)]
qa: avoid explicit set to client mountpoint as "/"
This causes self.cephfs_mntpt to set as "/" by default which
overrides the config in ceph.conf. `test_client_cache_size`
updates ceph.conf with:
client mountpoint = /subdir
However, the ceph-fuse mount command has --client_mountpoint explicitly
set as "/", thereby causing the root of the file system to get mounted which
confuses the test.
Conflicts:
qa/tasks/cephfs/fuse_mount.py
- merge conflicts due to updated upstream code
- removed offending line; host_mntpt was appended to the mount command
later in the code; this issue was created due to manual conflict
resolution during backporting process;
qa/tasks/cephfs/kernel_mount.py
qa/tasks/cephfs/mount.py
- fixed conflicts between 'main' and 'pacific' branches
Conflicts:
src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py
- Above conflict was due to commit e4a16e2
("mgr/rbd_support: add type annotation") not in pacific
Conflicts:
src/pybind/mgr/rbd_support/module.py
- Above conflict was due to commit dcb51b0
("mgr/rbd_support: define commands using CLICommand") not in pacific
Ramana Raja [Wed, 10 May 2023 18:37:44 +0000 (14:37 -0400)]
rbd_support: recover from "double blocklisting"
Recover from being blocklisted while recovering from blocklisting.
When the rbd_support module is being set up to recover from client
blocklisting, the module's new rados client connection can also get
blocklisted. Currently, this will cause the recovery to fail and
the module will remain inoperable. Instead, retry module recovery
when the new client gets blocklisted during the module setup in the
recovery thread.
Conflicts:
src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py
src/pybind/mgr/rbd_support/module.py
src/pybind/mgr/rbd_support/perf.py
src/pybind/mgr/rbd_support/task.py
src/pybind/mgr/rbd_support/trash_purge_schedule.py
- Above conflicts were due to commit e4a16e2
("mgr/rbd_support: add type annotation") not in pacific
- Above conflicts were due to commit dcb51b0
("mgr/rbd_support: define commands using CLICommand") not in pacific
Ramana Raja [Wed, 15 Feb 2023 15:12:54 +0000 (10:12 -0500)]
mgr/rbd_support: recover from rados client blocklisting
In certain scenarios the OSDs were slow to process RBD requests.
This lead to the rbd_support module's RBD client not being able to
gracefully handover a RBD exclusive lock to another RBD client.
After the condition persisted for some time, the other RBD client
forcefully acquired the lock by blocklisting the rbd_support module's
RBD client, and consequently blocklisted the module's RADOS client. The
rbd_support module stopped working. To recover the module, the entire
mgr service had to be restarted which reloaded other mgr modules.
Instead of recovering the rbd_support module from client blocklisting
by being disruptive to other mgr modules, recover the module
automatically without restarting the mgr serivce. On client getting
blocklisted, shutdown the module's handlers and blocklisted client,
create a new rados client for the module, and start the new handlers.
Conflicts:
src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py
src/pybind/mgr/rbd_support/module.py
src/pybind/mgr/rbd_support/perf.py
src/pybind/mgr/rbd_support/task.py
src/pybind/mgr/rbd_support/trash_purge_schedule.py
- Above conflicts were due to commit e4a16e2
("mgr/rbd_support: add type annotation") not in pacific
- Above conflicts were due to commit dcb51b0
("mgr/rbd_support: define commands using CLICommand") not in pacific
Ramana Raja [Mon, 30 Jan 2023 07:21:54 +0000 (02:21 -0500)]
mgr: store names of modules that register RADOS clients in the MgrMap
The MgrMap stores a list of RADOS clients' addresses registered by the
mgr modules. During failover of ceph-mgr, the list is used to blocklist
clients belonging to the failed ceph-mgr.
Store the names of the mgr modules that registered the RADOS clients
along with the clients' addresses in the MgrMap. During debugging, this
allows easy identification of the mgr module that registered a
particular RADOS client by just dumping the MgrMap (`ceph mgr dump`).
Following is the MgrMap output with a module's client name displayed
along with its client addrvec,
$ ceph mgr dump | jq '.active_clients[0]'
{
"name": "devicehealth",
"addrvec": [
{
"type": "v2",
"addr": "10.0.0.148:0",
"nonce": 612376578
}
]
}
Igor Fedotov [Fri, 11 Nov 2022 14:31:19 +0000 (17:31 +0300)]
os/bluestore: introduce a cooldown period for failed BlueFS allocations.
When using bluefs_shared_alloc_size one might get a long-lasting state when
that large chunks are not available any more and fallback to shared
device min alloc size occurs. The introduced cooldown is intended to
prevent repetitive allocation attempts with bluefs_shared_alloc_size for
a while. The rationale is to eliminate performance penalty these failing
attempts might cause.
Adam Emerson [Thu, 20 Jul 2023 01:03:44 +0000 (21:03 -0400)]
build: install-deps.sh installs system boost on Jammy
Since on Jammy system boost is new enough for Pacific and we don't have
Jammy packages for older boost (we only have those for Bionic), just
install the system packages rather than fetching ceph-libboost.
No analogous commit exists in main as while main's Jammy case installs
ceph-libboost, we just need a system package here.
Fixes: https://tracker.ceph.com/issues/62103 Signed-off-by: Adam Emerson <aemerson@redhat.com>
Adam Emerson [Wed, 19 Jul 2023 21:12:08 +0000 (17:12 -0400)]
build: Remove old ceph-libboost* packages in install-deps
Here, we extract `clean_boost_on_ubuntu()` and call it before other
installs on Debian distributions so that if we install a system boost,
a potentially newer `ceph-libboost` won't get in the way.
As the sources.list.d being removed in the original cleanup code isn't
the one we're currently installing in the install code, add a removal
for the currently used source, then do apt-update so packages from the
removed source are no longer included as available.
Two subsidiary dev packages from conflicting boost libraries can be
installed, but it leaves apt in an inconsistent state. To clean this
up, add `--fix-missing` to the removal line and call
`clean_boost_on_ubuntu()` before other uses of apt.
Fixes: https://tracker.ceph.com/issues/62097 Signed-off-by: Adam Emerson <aemerson@redhat.com>
(cherry picked from commit 0c3f511e14af639b6509e69b889258b2f718f8fd)
Conflicts:
install-deps.sh
- Different boost version for Pacific than Squid.
- ci_debug does not exist in Pacific
- whitespace
- No INSTALL_EXTRA
Fixes: https://tracker.ceph.com/issues/62103 Signed-off-by: Adam Emerson <aemerson@redhat.com>
Any unknown exception causes the module to be unloaded and unresponsive.
So, it'll be ideal to catch all exceptions during command-line interaction
and report them instead of crashing with a traceback.