Patrick Donnelly [Wed, 20 Sep 2023 20:57:01 +0000 (16:57 -0400)]
common: resolve config proxy deadlock using refcounted pointers
7e8c683 introduced some gymnastics with a "CallGate" to maintain a count for
each observer we may be "calling into" with a config change (namely:
handle_conf_change). This was to prevent remove_observer coming in and deleting
the observer in the middle of the call. More importantly, it was to avoid
holding the lock while traversing the observers so that the config_proxy lock
can be dropped while calling handle_conf_change. This is important as e.g. the
MDS may attempt to acquire the config_proxy lock in its
MDSRank::handle_conf_change method (what prompted the change).
However, this introduces a new deadlock:
- Thread 2 acquires the config_proxy lock and then removes an observer. It blocks
waiting for the observer's CallGate to close.
- Thread 1 had dropped the config_proxy lock while traversing the observers to call each
observer's handle_conf_change method. Those methods may attempt to reacquire the
config_proxy lock. This creates the deadlock: Thread 1 waits for Thread 2 to drop the lock,
while Thread 2 waits on a CallGate that Thread 1 cannot release.
The solution, I believe, is to properly refcount "uses" of the observers for the purposes
of flushing these changes. Use std::shared_ptr to effect this.
Reproducing this is fairly simple with several parallel calls to `config set`.
During the course of executing `config set`, the Objecter may receive config
updates that will be flushed and potentially race with cleanup of observers
during shutdown.
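A minimal sketch of the refcounting idea, in Python rather than the C++ of the actual change. Ordinary Python object references stand in for std::shared_ptr here, and the ConfigProxy and Observer classes below are hypothetical stand-ins, not the Ceph classes. flush() takes references to the observers while holding the lock and calls them only after dropping it; remove_observer() drops the proxy's reference and never waits for in-flight calls.

```python
import threading


class Observer:
    """Hypothetical observer whose handler re-enters the proxy lock."""

    def __init__(self, proxy):
        self.proxy = proxy
        self.seen = []

    def handle_conf_change(self, changed):
        # Re-acquiring the proxy lock is safe because flush() dropped it.
        with self.proxy.lock:
            self.seen.append(sorted(changed))


class ConfigProxy:
    """Hypothetical stand-in for the config proxy, not the Ceph class."""

    def __init__(self):
        self.lock = threading.Lock()
        self.observers = []

    def add_observer(self, obs):
        with self.lock:
            self.observers.append(obs)

    def remove_observer(self, obs):
        # Only drops the proxy's reference; never waits for in-flight calls.
        with self.lock:
            self.observers.remove(obs)

    def flush(self, changed):
        # Take strong references under the lock, then call without it.
        with self.lock:
            in_use = list(self.observers)
        for obs in in_use:
            obs.handle_conf_change(changed)
```

An observer removed while a flush is in progress stays alive until the flushing thread drops its reference, so neither side has to block on the other.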
Patrick Donnelly [Thu, 21 Sep 2023 02:00:03 +0000 (22:00 -0400)]
common: add missing locks in config_proxy methods
It's not generally safe to access the md_config_t without these locks. Some
methods are probably harmless (accessing read-only state) but best to be
consistent.
Ville Ojamo [Fri, 3 Nov 2023 05:44:00 +0000 (12:44 +0700)]
doc/cephadm/services: remove excess rendered indentation in osd.rst
Start bash command blocks at the left margin, removing
excessive padding/indentation that would render the
block too much towards the right.
At the same time indent the source consistently:
- Two spaces for command blocks and output blocks.
- Four spaces for notes, code blocks.
There seems to be no uniform style for this, sometimes
commands are indented with three spaces but it would
seem two spaces is common. In the end it all renders
the same I guess.
Ramana Raja [Mon, 18 Sep 2023 02:52:56 +0000 (22:52 -0400)]
qa/suites/rbd: add test to check rbd_support module recovery
... on repeated blocklisting of its client.
There were issues with rbd_support module not being able to recover
from its RADOS client being repeatedly blocklisted. This occurred for
example in clusters with OSDs slow to process RBD requests while the
module's mirror_snapshot_scheduler was taking mirror snapshots by
requesting exclusive locks on the RBD images and workloads were running
on the snapshotted images via kernel clients.
There is no need for CreateSnapshotRequests.__del__() that calls
CreateSnapshotRequests.wait_for_pending().
MirrorSnapshotScheduleHandler.shutdown() already calls
CreateSnapshotRequests.wait_for_pending().
Ramana Raja [Thu, 26 Oct 2023 17:18:52 +0000 (13:18 -0400)]
mgr/rbd_support: fix recursive locking on CreateSnapshotRequests lock
The MirrorSnapshotScheduleHandler's run thread issues asynchronous
create snapshot requests using a CreateSnapshotRequests instance. When
the thread invokes a CreateSnapshotRequests instance's get_ioctx(),
the instance's class variable lock is acquired. With the class
variable lock held, the garbage collection of a CreateSnapshotRequests
instance may race in the thread. The thread would then call
CreateSnapshotRequests __del__() that tries to acquire the class
variable lock that the thread already holds. Fix this
recursive deadlock by converting the CreateSnapshotRequests lock from
a class variable to an instance variable. There is no need to share
the lock across CreateSnapshotRequests instances.
Also convert MirrorSnapshotScheduleHandler, PerfHandler and
TrashPurgeScheduleHandler class variables to instance variables
that don't need to be shared across the instances.
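A minimal sketch of why the class variable lock is a problem, assuming a non-reentrant threading.Lock. The class names are hypothetical, and an explicit cleanup() method stands in for __del__ so that the example does not depend on garbage collection timing. cleanup() uses a non-blocking acquire, so it returns False where a blocking acquire would deadlock.

```python
import threading


class SharedLockRequests:
    # Class variable: one lock object shared by every instance.
    lock = threading.Lock()

    def cleanup(self):
        # Returns False instead of blocking when the lock is already held.
        got = self.lock.acquire(blocking=False)
        if got:
            self.lock.release()
        return got


class PerInstanceLockRequests:
    def __init__(self):
        # Instance variable: each instance gets its own lock.
        self.lock = threading.Lock()

    def cleanup(self):
        got = self.lock.acquire(blocking=False)
        if got:
            self.lock.release()
        return got


def cleanup_while_other_is_locked(cls):
    """Hold one instance's lock and clean up a second instance in the same thread."""
    active, stale = cls(), cls()
    with active.lock:
        return stale.cleanup()
```

With the shared lock, cleanup_while_other_is_locked() reports that the second instance cannot take the lock; with per-instance locks it can.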
Fixes: https://tracker.ceph.com/issues/62994
Signed-off-by: Ramana Raja <rraja@redhat.com>
Co-Authored-By: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 4452bc22d1c6c8499cf55d6e39090adf7ae1dcbf)
Zac Dover [Wed, 1 Nov 2023 01:53:59 +0000 (11:53 +1000)]
doc/cephadm: edit troubleshooting.rst (1 of x)
Edit doc/cephadm/troubleshooting.rst. This commit and the PR of which it
is a part were raised in response to
https://github.com/ceph/ceph/pull/53976. The limits of reStructuredText
are particularly visible here in every instance of a BASH for-loop and
in every instance of a command stretched over multiple lines.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 69472c26af5419faa9ed93c071ed5933d03fa67f)
Laura Flores [Thu, 28 Sep 2023 17:52:11 +0000 (17:52 +0000)]
osd: fix logic in check_pg_upmaps
A Reef refactor changed the logic in check_pg_upmaps,
so the upmap balancer makes recommendations even when
it reports that there are no optimizations.
Zac Dover [Mon, 30 Oct 2023 02:37:39 +0000 (12:37 +1000)]
doc/glossary: improve "BlueStore" entry
Initially s/backend/back end/ but then I added a little more information
about BlueStore's use of RocksDB to map object names to block locations
on disk.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 8713cca328c9373636efdb92449d743b5bd56584)
test/librbd/fsx: wait for resize to propagate in krbd_resize()
With this change, a resize request no longer blocks until the resize is
completed. Because of this, the fsx test fails, as it assumes that a
resize request immediately changes the device size.
Hence we have to add a wait in the resize handler of fsx for the device to
actually get resized.
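The wait can be sketched as a bounded polling loop. This is an illustration in Python, not the fsx code: wait_for_size, get_size, and the timeout and interval defaults are all hypothetical.

```python
import time


def wait_for_size(get_size, expected, timeout=5.0, interval=0.01):
    """Poll get_size() until it returns expected or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if get_size() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The caller treats a False return as a test failure instead of assuming the device size changed as soon as the resize request returned.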
Problem:
-------
Trying to disable any feature on an rbd image mapped with nbd leads to a hang
in rbd-nbd.
rbd-nbd registers a watcher callback to detect image resize in
NBDWatchCtx::handle_notify(). handle_notify calls the image info method, which
calls refresh_if_required and gets stuck there.
It is getting stuck in ImageState::refresh_if_required() because
DisableFeaturesRequest issues update notifications while still holding onto
the exclusive lock with everything that has to do with it blocked.
Solution:
--------
Set only a notify flag in NBDWatchCtx::handle_notify() and handle
the resize detection in a different thread.
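A rough Python model of this split, under the assumption that the size query may block. ResizeWatcher and get_size are hypothetical names and this is not the rbd-nbd implementation. The callback only records that a notification arrived; a worker thread performs the query.

```python
import threading


class ResizeWatcher:
    """Hypothetical model of the handler/worker split; not the rbd-nbd code."""

    def __init__(self, get_size):
        self.get_size = get_size  # may block, so it is never called from the callback
        self.cond = threading.Condition()
        self.pending = False
        self.stopping = False
        self.sizes = []
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def handle_notify(self):
        # Callback path: only set the flag and return immediately.
        with self.cond:
            self.pending = True
            self.cond.notify()

    def _run(self):
        while True:
            with self.cond:
                self.cond.wait_for(lambda: self.pending or self.stopping)
                if not self.pending:
                    return  # stopping with nothing left to process
                self.pending = False
            # Worker path: the potentially blocking query runs outside the lock.
            self.sizes.append(self.get_size())

    def stop(self):
        with self.cond:
            self.stopping = True
            self.cond.notify()
        self.worker.join()
```

Notifications that arrive while one is already pending are coalesced into a single size check, which is enough for resize detection.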
Aashish Sharma [Mon, 30 Oct 2023 07:47:37 +0000 (13:17 +0530)]
mgr/dashboard: update rgw multisite import form helper info
Change 'To obtain the token, generate it from your secondary Ceph cluster' to 'To obtain the token, generate it from your primary Ceph cluster' in rgw multisite import form helper
Zac Dover [Fri, 27 Oct 2023 06:58:28 +0000 (16:58 +1000)]
doc/rados: remove cache-tiering-related keys
Remove information related to cache-tiering-related keys from
doc/rados/operations/pools.rst. Cache-tiering is deprecated in Reef.
This PR is suitable for backporting to the Reef release branch, but not
to release branches prior to Reef.
John Mulligan [Wed, 11 Oct 2023 18:05:17 +0000 (14:05 -0400)]
cephadm: add a --dry-run option to cephadm shell
Instead of creating the shell, the --dry-run option prints the container
command that would be used. This can be used as a starting point for
creating custom container commands similar to what cephadm shell would
generate but with tweaks.
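A minimal argparse sketch of the dry-run pattern. It is not the cephadm implementation: build_container_command, the podman arguments, the default image name, and the run callback are all assumptions made for illustration.

```python
import argparse
import shlex


def build_container_command(image, shell="bash"):
    # Hypothetical command line; the real command has many more options.
    return ["podman", "run", "--rm", "-it", image, shell]


def main(argv, run=None):
    parser = argparse.ArgumentParser(prog="shell-sketch")
    parser.add_argument("--image", default="example.invalid/ceph:latest")
    parser.add_argument("--dry-run", action="store_true",
                        help="print the container command instead of running it")
    args = parser.parse_args(argv)

    cmd = build_container_command(args.image)
    if args.dry_run:
        # Quote each argument so the line can be pasted into a shell and edited.
        line = shlex.join(cmd)
        print(line)
        return line
    if run is not None:
        run(cmd)  # caller-supplied runner, e.g. subprocess.call
    return None
```

Calling main(["--dry-run", "--image", "img"]) prints the command and runs nothing; without --dry-run the same list is handed to the runner.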
Zac Dover [Wed, 25 Oct 2023 23:48:57 +0000 (09:48 +1000)]
doc/rados: remove HitSet-related key information
Remove HitSet-related key information from
doc/rados/operations/pools.rst. HitSet-related keys are relevant only to
releases of Ceph that support cache tiering. Only Quincy and earlier
(inclusive) releases of Ceph support cache tiering. Backport this commit
from main to Reef, but not to Quincy or to release branches earlier than
Quincy.