Zac Dover [Sat, 15 Jun 2024 11:55:18 +0000 (21:55 +1000)]
doc/rados: explain replaceable parts of command
Add an explanation that directs the reader to replace the "X" part of
the command "ceph tell mon.X mon_status" with the value specific to the
reader's Ceph cluster (which is (probably) not "X").
In the future, such replaceable strings in commands may be bounded by
angle brackets ("<" and ">").
This improvement to the documentation was suggested on the [ceph-users]
email list by Joel Davidow. This email, an absolute model of user
engagement with an upstream project, can be reviewed here:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/KF67F5TXFSSTPXV7EKL6JKLA5KZQDLDQ/
The log-only-match entry was backported to squid before the ignorelist changes,
but in main it was introduced after the ignorelist changes.
See https://github.com/ceph/ceph/commit/b4522dd332d40a54b9e0be58bd96aeaa345f8977.
crimson/osd/osdop_params:Unify OpsExecuter::user_modify and osd_op_params_t::user_modify
Before this change OpsExecuter::user_modify was maintained in OpsExecuter::do_write_op.
However, osd_op_params->user_modify was not updated when used in OpsExecuter::prepare_transaction
Laura Flores [Wed, 12 Jun 2024 20:21:09 +0000 (15:21 -0500)]
qa/suites/rados/thrash-old-clients: update supported releases and distro
thrash-old-clients tests should only support N-3 releases. To fix this for
main, I have removed all releases < quincy and have added squid.
Also, we are fully switching to centos.9_stream packages/containers after
the centos.8_stream end of life, so I changed the distro from centos.8_stream
to centos.9_stream.
*** Note: If this commit is backported, it should be done in such a way that
only releases >= quincy reference centos.9_stream. For instance, if backporting to squid,
a reef/squid thrash test is okay to make references to centos.9_stream since both reef and
squid support this, but a pacific/squid test will have to take a different approach
since pacific does not support centos.9_stream.
Modifications:
- For this squid backport, I kept pacific since that fits into N-3 where
N is squid.
- Pacific does not build c9 packages, so I picked an alternative distro
that is shared among all represented releases: ubuntu 20.04.
Add an explanation of leader-peon conditions that obtain when the
cluster is in the "HEALTH_OK" state. Previously, the text discussed
these two monitor states only in the context of a health detail entry.
This improvement to the documentation was suggested on the [ceph-users]
email list by Joel Davidow. This email, an absolute model of user
engagement with an upstream project, can be reviewed here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/KF67F5TXFSSTPXV7EKL6JKLA5KZQDLDQ/
I will list Joel Davidow here as the co-author for the sake of more
expediently getting this change into the documentation, but though he is
listed as the co-author, he is the true author.
Co-authored-by: Joel Davidow <jdavidow@nso.edu> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 6fb9a5ef817eda5184d51ebcb425a6091ca82299)
Leonid Usov [Thu, 6 Jun 2024 11:48:56 +0000 (14:48 +0300)]
squid: mds: QuiesceDbRequest: update the internal encoding of ops
Excluding the last root from a set will automatically mark it as QS_CANCELED.
Hence, it makes more sense if `exclude` and `cancel` share the same op code,
rather than `exclude` and `release`.
Fixes: https://tracker.ceph.com/issues/66400 Signed-off-by: Leonid Usov <leonid.usov@ibm.com> Fixes: https://tracker.ceph.com/issues/66383
(cherry picked from commit dad52497817c372fd7c61a88a210b5a3613cb807)
Zac Dover [Wed, 5 Jun 2024 16:43:15 +0000 (02:43 +1000)]
doc/start: s/intro.rst/index.rst/
Change the filename "doc/start/intro.rst" to "doc/start/index.rst" so
that Sphinx finds the root filename for the "/start" directory in the
default location.
Zac Dover [Tue, 4 Jun 2024 13:37:27 +0000 (23:37 +1000)]
doc/start: s/http/https/ in links
Replace "http" with "https" in doc/start/get-involved.rst.
This commit is, in a way, a repeat of
https://github.com/ceph/ceph/pull/57213/
(1c5383b91bd7dbfa9670c6485fcc5ff28b79f40d), which targeted the Reef
branch instead of the main branch. When this commit has been merged and
backported, I will close https://github.com/ceph/ceph/pull/57213/.
I am listing Casey Cain here as the co-author, but he is in fact the
true author of this change.
Lucian Petrut [Fri, 24 May 2024 10:03:11 +0000 (10:03 +0000)]
rbd-wnbd: wait for the disk cleanup to complete
The WNBD disk removal workflow is asynchronous, which is why we'll
need to wait for the cleanup to complete when stopping the service.
The "disconnect_all_mappings" function is moved to
RbdMappingDispatcher::stop, allowing us to access the mapping list
more easily and reject new mappings after a stop has been requested.
Rishabh Dave [Fri, 8 Mar 2024 15:31:51 +0000 (21:01 +0530)]
mds: add no counters in warning for standby-replay MDS
Don't include inode and stray counters in the health warnings printed
for standby-replay MDSs. Since these counters are present in the health
warnings only due to replay, it can confuse users, and therefore, do not
include them.
Fixes: https://tracker.ceph.com/issues/63514 Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 03dcdc1329e471aa4aa403519ea5131db2f99b23)
Patrick Donnelly [Thu, 30 May 2024 13:08:04 +0000 (09:08 -0400)]
Merge PR #57730 into squid
* refs/pull/57730/head:
squid: mds: remove unnecssary quiesce finisher variable
squid: mds: attach quiesce_path mdr to finisher at creation not dispatch
squid: mds/quiesce: disable quiesce root debug parameters by default
squid: mds/quiesce-agt: never send a synchronous ack
squid: mds/quiesce-agt: add test for a rapid async ack
squid: mds/quiesce: always abort fragmenting asynchronously to prevent reentrancy
squid: mds/quiesce: overdrive an export if it hasn't frozen the tree yet
squid: mds/quiesce: quiesce_inode should not hold on to remote auth pins
squid: qa/cephfs: check that a completed quiesce doesn't hold remote auth pins
squid: mds: add `--lifetime` parameter to the `lock path` asok command
squid: mds/quiesce: accept a regular file as the quiesce root
squid: mds: command_quiesce_path: rename `--wait` to `--await` for consistency
squid: mds: command_quiesce_path: do not block the asok thread and return an adequate rc
squid: mds/quiesce: drop remote authpins before waiting for the quiesce lock
squid: qa/cephfs/test_quiesce: test proper handling of remote authpins
squid: mds: don't clear `AUTHPIN_FROZEN` until `FROZEN` in rename_prep
squid: mds: enhance the `lock path` asok command
squid: mds/quiesce: overdrive fragmenting that's still freezing
squid: revert: mds: provide a mechanism to authpin while freezing
squid: qa/cephfs/test_quiesce: enhance the fragmentation test
squid: mds/queisce-db: collect acks while bootstrapping
squid: mds/quiesce-db: optimize peer updates
squid: mds/quiesce-db: track db epoch separately from the membership epoch
squid: mds/quiesce-db: test that a peer on a newer membership epoch can ack a root
squid: mds: don't stall the asok thread for flush commands
squid: qa/quiescer: relax some timing requirements in the quiescer
squid: qa/tasks/quiescer: dump ops in parallel
squid: qa/suites/fs: add quiescer to the fs suite
squid: qa/tasks: the quiescer task and a waiter task to test it
squid: qa/tasks/cephfs: don't create a new CephManager if there is one in the context
squid: qa/tasks: vstart_runner: introduce --config-mode
squid: qa/tasks: introduce ThrasherGreenlet
squid: qa: update quiesce tests to expect ipolicy lock
squid: mds: add missing policylock to test F_QUIESCE_BLOCK
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Jos Collin [Mon, 6 May 2024 12:47:29 +0000 (18:17 +0530)]
pybind/mgr/mirroring: Fix KeyError: 'directory_count' in daemon status
The directory_count key is missing in self.mgr.get_daemon_status() output json,
intermittently when there is a delay caused by m_listener.handle_mirroring_enabled() to update the
directory_count, which results in ServiceDaemon::update_status() creates a json with out 'directory_count' key/value.
But the mgr/mirroring -> daemon_status() always expects the 'directory_count' key to be present in the json returned by
self.mgr.get_daemon_status().
This issue occurs intermittently when we enable/disable mirroring and check the 'daemon status' in between.
This patch fixes this issue by setting a default value 0 for 'directory_count' in doemon_status().
Fixes: https://tracker.ceph.com/issues/65795 Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit b78baa23e562742b8bdc5a75f82e3b6fbf55a8a5)
Zac Dover [Tue, 28 May 2024 16:27:53 +0000 (02:27 +1000)]
doc/dev: add note about intro of perf counters
Add a note to the "perf counter" section of doc/dev/perf_counters.rst
that explains that this feature was introduced in the Reef release of
Ceph. This note will prevent us from accidentally backporting
perf-counter-related PRs to Quincy.