Zac Dover [Sat, 15 Jun 2024 11:55:18 +0000 (21:55 +1000)]
doc/rados: explain replaceable parts of command
Add an explanation that directs the reader to replace the "X" part of
the command "ceph tell mon.X mon_status" with the value specific to the
reader's Ceph cluster (which is (probably) not "X").
In the future, such replaceable strings in commands may be bounded by
angle brackets ("<" and ">").
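The angle-bracket convention mentioned above can be illustrated with a small Python sketch. The placeholder name `<id>` and the monitor ID `a` are assumptions made for illustration only; they are not taken from the Ceph documentation.

```python
import re


def fill_placeholders(template: str, values: dict) -> str:
    """Replace each <name> placeholder in a command template.

    Raises KeyError if the template contains a placeholder for which
    no value was supplied, so an unreplaced "<id>" is never run as-is.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder <{name}>")
        return values[name]

    return re.sub(r"<([A-Za-z_][A-Za-z0-9_-]*)>", substitute, template)


# Hypothetical template: "<id>" stands for the reader's own monitor ID.
command = fill_placeholders("ceph tell mon.<id> mon_status", {"id": "a"})
print(command)
```

The function only builds the command string; it does not run anything against a cluster.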
This improvement to the documentation was suggested on the [ceph-users]
email list by Joel Davidow. This email, an absolute model of user
engagement with an upstream project, can be reviewed here:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/KF67F5TXFSSTPXV7EKL6JKLA5KZQDLDQ/
The log-only-match entry was backported to squid before the ignorelist changes,
but in main it was introduced after the ignorelist changes.
See https://github.com/ceph/ceph/commit/b4522dd332d40a54b9e0be58bd96aeaa345f8977.
Laura Flores [Wed, 12 Jun 2024 20:21:09 +0000 (15:21 -0500)]
qa/suites/rados/thrash-old-clients: update supported releases and distro
thrash-old-clients tests should only support N-3 releases. To fix this for
main, I have removed all releases < quincy and have added squid.
Also, we are fully switching to centos.9_stream packages/containers after
the centos.8_stream end of life, so I changed the distro from centos.8_stream
to centos.9_stream.
*** Note: If this commit is backported, it should be done in such a way that
only releases >= quincy reference centos.9_stream. For instance, if backporting to squid,
a reef/squid thrash test is okay to make references to centos.9_stream since both reef and
squid support this, but a pacific/squid test will have to take a different approach
since pacific does not support centos.9_stream.
Modifications:
- For this squid backport, I kept pacific since that fits into N-3 where
N is squid.
- Pacific does not build c9 packages, so I picked an alternative distro
that is shared among all represented releases: ubuntu 20.04.
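The N-3 rule described above can be sketched in Python as follows. The ordered release list and the function name are illustrative assumptions; "main" stands in for the release under development.

```python
# Ceph releases in chronological order; "main" is a stand-in for the
# not-yet-released development branch (an assumption for this sketch).
RELEASES = ["octopus", "pacific", "quincy", "reef", "squid", "main"]


def old_client_releases(target: str, window: int = 3) -> list:
    """Return the releases from N-window up to N-1 for target release N."""
    idx = RELEASES.index(target)
    return RELEASES[max(0, idx - window):idx]


print(old_client_releases("main"))
print(old_client_releases("squid"))
```

With this list, "main" yields quincy, reef and squid, while "squid" yields pacific, quincy and reef, which is why pacific is kept in the squid backport.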
Add an explanation of leader-peon conditions that obtain when the
cluster is in the "HEALTH_OK" state. Previously, the text discussed
these two monitor states only in the context of a health detail entry.
This improvement to the documentation was suggested on the [ceph-users]
email list by Joel Davidow. This email, an absolute model of user
engagement with an upstream project, can be reviewed here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/KF67F5TXFSSTPXV7EKL6JKLA5KZQDLDQ/
I will list Joel Davidow here as the co-author for the sake of more
expediently getting this change into the documentation, but though he is
listed as the co-author, he is the true author.
Co-authored-by: Joel Davidow <jdavidow@nso.edu>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 6fb9a5ef817eda5184d51ebcb425a6091ca82299)
Leonid Usov [Thu, 6 Jun 2024 11:48:56 +0000 (14:48 +0300)]
squid: mds: QuiesceDbRequest: update the internal encoding of ops
Excluding the last root from a set will automatically mark it as QS_CANCELED.
Hence, it makes more sense if `exclude` and `cancel` share the same op code,
rather than `exclude` and `release`.
Fixes: https://tracker.ceph.com/issues/66400
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
Fixes: https://tracker.ceph.com/issues/66383
(cherry picked from commit dad52497817c372fd7c61a88a210b5a3613cb807)
Zac Dover [Wed, 5 Jun 2024 16:43:15 +0000 (02:43 +1000)]
doc/start: s/intro.rst/index.rst/
Change the filename "doc/start/intro.rst" to "doc/start/index.rst" so
that Sphinx finds the root filename for the "/start" directory in the
default location.
Zac Dover [Tue, 4 Jun 2024 13:37:27 +0000 (23:37 +1000)]
doc/start: s/http/https/ in links
Replace "http" with "https" in doc/start/get-involved.rst.
This commit is, in a way, a repeat of
https://github.com/ceph/ceph/pull/57213/
(1c5383b91bd7dbfa9670c6485fcc5ff28b79f40d), which targeted the Reef
branch instead of the main branch. When this commit has been merged and
backported, I will close https://github.com/ceph/ceph/pull/57213/.
I am listing Casey Cain here as the co-author, but he is in fact the
true author of this change.
Lucian Petrut [Fri, 24 May 2024 10:03:11 +0000 (10:03 +0000)]
rbd-wnbd: wait for the disk cleanup to complete
The WNBD disk removal workflow is asynchronous, which is why we'll
need to wait for the cleanup to complete when stopping the service.
The "disconnect_all_mappings" function is moved to
RbdMappingDispatcher::stop, allowing us to access the mapping list
more easily and reject new mappings after a stop has been requested.
Rishabh Dave [Fri, 8 Mar 2024 15:31:51 +0000 (21:01 +0530)]
mds: add no counters in warning for standby-replay MDS
Don't include inode and stray counters in the health warnings printed
for standby-replay MDSs. These counters appear in the health warnings
only because of replay and can confuse users, so leave them out.
Fixes: https://tracker.ceph.com/issues/63514
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 03dcdc1329e471aa4aa403519ea5131db2f99b23)
Patrick Donnelly [Thu, 30 May 2024 13:08:04 +0000 (09:08 -0400)]
Merge PR #57730 into squid
* refs/pull/57730/head:
squid: mds: remove unnecssary quiesce finisher variable
squid: mds: attach quiesce_path mdr to finisher at creation not dispatch
squid: mds/quiesce: disable quiesce root debug parameters by default
squid: mds/quiesce-agt: never send a synchronous ack
squid: mds/quiesce-agt: add test for a rapid async ack
squid: mds/quiesce: always abort fragmenting asynchronously to prevent reentrancy
squid: mds/quiesce: overdrive an export if it hasn't frozen the tree yet
squid: mds/quiesce: quiesce_inode should not hold on to remote auth pins
squid: qa/cephfs: check that a completed quiesce doesn't hold remote auth pins
squid: mds: add `--lifetime` parameter to the `lock path` asok command
squid: mds/quiesce: accept a regular file as the quiesce root
squid: mds: command_quiesce_path: rename `--wait` to `--await` for consistency
squid: mds: command_quiesce_path: do not block the asok thread and return an adequate rc
squid: mds/quiesce: drop remote authpins before waiting for the quiesce lock
squid: qa/cephfs/test_quiesce: test proper handling of remote authpins
squid: mds: don't clear `AUTHPIN_FROZEN` until `FROZEN` in rename_prep
squid: mds: enhance the `lock path` asok command
squid: mds/quiesce: overdrive fragmenting that's still freezing
squid: revert: mds: provide a mechanism to authpin while freezing
squid: qa/cephfs/test_quiesce: enhance the fragmentation test
squid: mds/queisce-db: collect acks while bootstrapping
squid: mds/quiesce-db: optimize peer updates
squid: mds/quiesce-db: track db epoch separately from the membership epoch
squid: mds/quiesce-db: test that a peer on a newer membership epoch can ack a root
squid: mds: don't stall the asok thread for flush commands
squid: qa/quiescer: relax some timing requirements in the quiescer
squid: qa/tasks/quiescer: dump ops in parallel
squid: qa/suites/fs: add quiescer to the fs suite
squid: qa/tasks: the quiescer task and a waiter task to test it
squid: qa/tasks/cephfs: don't create a new CephManager if there is one in the context
squid: qa/tasks: vstart_runner: introduce --config-mode
squid: qa/tasks: introduce ThrasherGreenlet
squid: qa: update quiesce tests to expect ipolicy lock
squid: mds: add missing policylock to test F_QUIESCE_BLOCK
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Jos Collin [Mon, 6 May 2024 12:47:29 +0000 (18:17 +0530)]
pybind/mgr/mirroring: Fix KeyError: 'directory_count' in daemon status
The 'directory_count' key is intermittently missing from the JSON returned by
self.mgr.get_daemon_status(). This happens when m_listener.handle_mirroring_enabled()
is delayed in updating directory_count, so that ServiceDaemon::update_status()
creates JSON without the 'directory_count' key/value.
However, mgr/mirroring -> daemon_status() always expects the 'directory_count' key
to be present in the JSON returned by self.mgr.get_daemon_status().
The issue occurs intermittently when mirroring is enabled or disabled and 'daemon status' is checked in between.
This patch fixes the issue by setting a default value of 0 for 'directory_count' in daemon_status().
Fixes: https://tracker.ceph.com/issues/65795
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit b78baa23e562742b8bdc5a75f82e3b6fbf55a8a5)
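The defaulting approach described in this commit can be sketched in Python as follows. This is not the actual mgr/mirroring code; the function name and the shape of the status dictionary are assumptions made for illustration.

```python
def directory_count(status: dict) -> int:
    """Read 'directory_count' from a daemon status dict, defaulting to 0.

    Hypothetical helper: 'status' stands in for one daemon's entry in
    the JSON returned by the manager. Using dict.get() with a default
    avoids a KeyError when the key has not been published yet.
    """
    return int(status.get("directory_count", 0))


# Status published before directory_count was updated: no KeyError.
print(directory_count({"peer_count": 1}))
# Status published after the update.
print(directory_count({"peer_count": 1, "directory_count": 4}))
```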
Zac Dover [Tue, 28 May 2024 16:27:53 +0000 (02:27 +1000)]
doc/dev: add note about intro of perf counters
Add a note to the "perf counter" section of doc/dev/perf_counters.rst
that explains that this feature was introduced in the Reef release of
Ceph. This note will prevent us from accidentally backporting
perf-counter-related PRs to Quincy.
Leonid Usov [Mon, 20 May 2024 16:17:04 +0000 (19:17 +0300)]
squid: mds/quiesce: overdrive an export if it hasn't frozen the tree yet
Just like with the fragmenting, we should abort an ongoing export
if a quiesce is attempted for the directory.
To minimize stress on the system, we only allow the abort
if the export hasn't yet managed to freeze the tree. If the tree is
already frozen, then quiesce will have to wait for the export to finish.
Fixes: https://tracker.ceph.com/issues/66123
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit da5c263b8e7797eac6c9d13d5b6a6b292d9c5def)
Fixes: https://tracker.ceph.com/issues/66259
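The decision rule described in this commit can be sketched in Python as a simplified model. The enum names below are hypothetical and do not correspond to the MDS migrator's real export states.

```python
from enum import Enum, auto


class ExportPhase(Enum):
    """Hypothetical, simplified view of an in-progress export."""
    FREEZING = auto()  # the export has not yet frozen the tree
    FROZEN = auto()    # the tree is frozen; the export is committed


class QuiesceAction(Enum):
    ABORT_EXPORT = auto()
    WAIT_FOR_EXPORT = auto()


def on_quiesce_during_export(phase: ExportPhase) -> QuiesceAction:
    """Abort the export only while it is still trying to freeze the tree."""
    if phase is ExportPhase.FREEZING:
        return QuiesceAction.ABORT_EXPORT
    return QuiesceAction.WAIT_FOR_EXPORT


print(on_quiesce_during_export(ExportPhase.FREEZING).name)
print(on_quiesce_during_export(ExportPhase.FROZEN).name)
```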
Leonid Usov [Mon, 20 May 2024 22:03:15 +0000 (01:03 +0300)]
squid: mds/quiesce: quiesce_inode should not hold on to remote auth pins
1. avoid taking a remote authpin for the quiesce lock
2. drop remote authpins that were taken because of other locks
We should not be forcing a mustpin when taking the quiesce lock.
This creates unnecessary overhead due to the distributed nature
of the quiesce: all ranks will execute quiesce_inode, including
the auth rank, which will authpin the inode.
Auth pinning on the auth rank is important to synchronize quiesce
with operations that are managed by the auth, like fragmenting
and exporting.
If we let a remote quiesce process take a foreign authpin then
it may block freezing on the auth, which will stall quiesce locally.
This wouldn't be a problem if the quiesce that is blocked on the auth
and the quiesce that's holding a remote authpin from the replica side
were unrelated, but in our case it may be the same logical quiesce
that effectively steps on its own toes. This creates an opportunity
for a deadlock.
Fixes: https://tracker.ceph.com/issues/66152
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit b1cb6d985622c6164d99d3fd79b6eeaf6530894c)
Fixes: https://tracker.ceph.com/issues/66258
Leonid Usov [Sat, 11 May 2024 14:00:21 +0000 (17:00 +0300)]
squid: mds: enhance the `lock path` asok command
* when the quiesce lock is taken by this op, don't consider the inode `quiesced`
* drop all locks taken during traversal
* drop all local authpins after the locks are taken
* add --await functionality that will block the command until locks are taken or an error is encountered
* return the RC that represents the operation result. 0 if the operation was scheduled and hasn't failed so far
* add authpin control flags
** --ap-freeze - to auth_pin_freeze the target inode
** --ap-dont-block - to pass auth_pin_nonblocking when acquiring the target inode locks
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 3552fc5a9ea17c173a18be41fa15fbbae8d77edf)
Fixes: https://tracker.ceph.com/issues/66154
Leonid Usov [Thu, 9 May 2024 01:39:12 +0000 (04:39 +0300)]
squid: mds/quiesce: overdrive fragmenting that's still freezing
Quiesce requires revocation of capabilities,
which does not work for freezing/frozen nodes.
Since fragmenting is best effort, abort an ongoing fragmenting
for the sake of a faster quiesce.
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
Fixes: https://tracker.ceph.com/issues/65716
(cherry picked from commit 8b6440652d501644d641c1c8b3255c3720738ec6)
Fixes: https://tracker.ceph.com/issues/66154
Leonid Usov [Sun, 12 May 2024 16:19:34 +0000 (19:19 +0300)]
squid: revert: mds: provide a mechanism to authpin while freezing
This is a functional revert of a9964a7ccc4394f923fb0f1c76eb8fa03fe8733d.
git revert was giving too many conflicts, as the code has changed
too much since the original commit.
The bypass freezing mechanism led us into several deadlocks,
and when we found out that a freezing inode defers reclaiming
client caps, we realized that we needed to try a different approach.
This commit removes the bypass-freezing-related changes to clear the way
for a different approach to resolving the conflict between quiesce
and freezing.
Fixes: https://tracker.ceph.com/issues/65716
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit bf760602a4f02cc07072db2da5cb987e3072afce)
Fixes: https://tracker.ceph.com/issues/66154
Leonid Usov [Mon, 13 May 2024 21:10:04 +0000 (00:10 +0300)]
squid: mds/quiesce-db: track db epoch separately from the membership epoch
Tracking the db epoch separately will make sure that replicas
only follow the leader's epoch choice, even if they are already on
the new membership epoch. This eliminates races due to the
random order of mdsmap updates.
Fixes: https://tracker.ceph.com/issues/65977
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 379ef7196b61142dc7753992f897ad91b37f048f)
Fixes: https://tracker.ceph.com/issues/66070