doc/rados: explain replaceable parts of command

Add an explanation that directs the reader to replace the "X" part of
the command "ceph tell mon.X mon_status" with the value specific to the
reader's Ceph cluster (which is probably not "X").

In the future, such replaceable strings in commands may be bounded by
angle brackets ("<" and ">").

This improvement to the documentation was suggested on the [ceph-users]
email list by Joel Davidow. This email, an absolute model of user
engagement with an upstream project, can be reviewed here:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/KF67F5TXFSSTPXV7EKL6JKLA5KZQDLDQ/

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit d071ad2575c86f300a9ba39df3c4949e5dc9c47d)

Merge pull request #57998 from ljflores/wip-tracker-66460

squid: qa/suites/rados/thrash-old-clients: update supported releases and distro

Merge pull request #58049 from zdover23/wip-doc-2024-06-14-backport-58007-to-squid

squid: doc/rados: add pg-states and pg-concepts to tree

Merge pull request #57525 from batrick/wip-66044-squid

squid: qa: unmount clients before damaging the fs

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #57762 from joscollin/wip-66270-squid

squid: pybind/mgr/mirroring: Fix KeyError: 'directory_count' in daemon status

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #57760 from joscollin/wip-66277-squid

squid: cephfs-journal-tool: Add preventive measures to avoid fs corruption

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #57795 from rishabh-d-dave/wip-65920-squid

squid: mds: don't add counters in warning for standby-replay MDS

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #57945 from ceph/wip-lusov-qdb-exclude-or-cancel-squid

squid: mds: QuiesceDbRequest: update the internal encoding of ops

Reviewed-by: Jos Collin <jcollin@redhat.com>

doc/rados: add pg-states and pg-concepts to tree

Add "pg-states" and "pg-concepts" to the left tree pane on
docs.ceph.com.

This commit has been made in response to a request from the upstream
made in https://pad.ceph.com/p/Report_Documentation_Bugs.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 0629f47faf73a5b88adbeceaf022ee23111bae7d)

qa/suites: add "mon down" log variations to ignorelist

Fixes: https://tracker.ceph.com/issues/64864
Signed-off-by: Laura Flores <lflores@ibm.com>
(cherry picked from commit d475ac3e6ab86a4913e1d318989c617031978bc2)

Conflicts:
qa/suites/orch/cephadm/smoke/start.yaml
qa/suites/orch/cephadm/workunits/task/test_host_drain.yaml
qa/suites/orch/cephadm/workunits/task/test_monitoring_stack_basic.yaml
qa/suites/orch/cephadm/workunits/task/test_rgw_multisite.yaml
qa/suites/orch/cephadm/workunits/task/test_set_mon_crush_locations.yaml

The log-only-match entry was backported to squid before the ignorelist changes,
but in main it was introduced after the ignorelist changes.
See https://github.com/ceph/ceph/commit/b4522dd332d40a54b9e0be58bd96aeaa345f8977.

Merge pull request #57436 from joscollin/wip-65978-squid

squid: cephfs_mirror: increment sync_failures when sync_perms() and sync_snaps() fail

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>

Merge pull request #57440 from joscollin/wip-65982-squid

squid: mgr/stats: initialize mx_last_updated in FSPerfStats

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>

Merge pull request #57449 from joscollin/wip-65989-squid

squid: cephfs_mirror: fix crash in update_fs_mirrors()

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #57117 from lxbsz/wip-65361

squid: qa/cephfs: fix root_squash check failure bug

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #57341 from batrick/wip-65844-squid

squid: qa: ignore variation of PG_DEGRADED health warning

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #57447 from mchangir/wip-65899-squid

squid: mgr/snap_schedule: restore yearly spec to lowercase y

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>

Merge pull request #57523 from batrick/wip-66042-squid

squid: mds: remove superfluous debug message

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #57556 from lxbsz/wip-66054

squid: qa/fsx: use a specified sha1 to build the xfstest-dev

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>

Merge pull request #57566 from dparmar18/wip-66060-squid

squid: qa: add a YAML to ignore MGR_DOWN warning

Reviewed-by: Jos Collin <jcollin@redhat.com>

qa/suites/rados/thrash-old-clients: update supported releases and distro

thrash-old-clients tests should only support N-3 releases. To fix this for
main, I have removed all releases < quincy and have added squid.

Also, we are fully switching to centos.9_stream packages/containers after
the centos.8_stream end of life, so I changed the distro from centos.8_stream
to centos.9_stream.

*** Note: If this commit is backported, it should be done in such a way that
only releases >= quincy reference centos.9_stream. For instance, if backporting to squid,
a reef/squid thrash test is okay to make references to centos.9_stream since both reef and
squid support this, but a pacific/squid test will have to take a different approach
since pacific does not support centos.9_stream.

Fixes: https://tracker.ceph.com/issues/66398
Signed-off-by: Laura Flores <lflores@ibm.com>
(cherry picked from commit 820e4004f3fd17a5daee955f7a1443b1501caaad)

Modifications:
- For this squid backport, I kept pacific since that fits into N-3 where
N is squid.
- Pacific does not build c9 packages, so I picked an alternative distro
that is shared among all represented releases: ubuntu 20.04.
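
The N-3 rule described above can be illustrated with a short sketch. It
assumes the release ordering pacific, quincy, reef, squid (only the names
that appear in this log); the helper name `supported_old_clients` is
hypothetical and is not part of the qa suite.

```python
# Illustrative sketch only; not code from the Ceph qa suite.
# Release names in chronological order, limited to those named in this log.
RELEASES = ["pacific", "quincy", "reef", "squid"]


def supported_old_clients(current, window=3):
    """Return up to `window` releases that immediately precede `current`."""
    idx = RELEASES.index(current)
    return RELEASES[max(0, idx - window):idx]


# For N = squid, the N-3 window still includes pacific.
print(supported_old_clients("squid"))
```

With N set to squid, the window covers pacific, quincy, and reef, which is
why pacific is kept in this backport.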

Merge pull request #56700 from joscollin/wip-65318-squid

squid: cephfs-mirror: use monotonic clock

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #56910 from neesingh-rh/wip-65347-squid

squid: qa: fixing tests in test_cephfs_shell.TestShellOpts

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #56896 from chrisphoffman/wip-65489-squid

squid: mds: Add fragment to scrub

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #56950 from batrick/wip-65519-squid

squid: qa: ignore human-friendly POOL_APP_NOT_ENABLED in clog

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #56952 from batrick/wip-65366-squid

squid: qa: test test_kill_mdstable for all mount types

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #57116 from lxbsz/wip-65675

squid: mds: fix the description for inotable testing only options

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #57982 from zdover23/wip-doc-2024-06-12-backport-57976-to-squid

squid: doc/glossary: Add "S3"

Merge pull request #57173 from batrick/wip-65708-squid

squid: client: clear resend_mds only after sending request

Reviewed-by: Jos Collin <jcollin@redhat.com>

doc/glossary: Add "S3"

Add "S3" entry to the glossary.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit be6d0645c40431e8312244019c0331e4076bd5f2)

Merge pull request #57489 from ljflores/wip-66041-squid

squid: qa/suites/rados/singleton: add POOL_APP_NOT_ENABLED to ignorelist

Merge pull request #57958 from zdover23/wip-doc-2024-06-11-backport-57957-to-squid

squid: doc/rados: improve leader/peon monitor explanation

doc/rados: improve leader/peon monitor explanation

Add an explanation of leader-peon conditions that obtain when the
cluster is in the "HEALTH_OK" state. Previously, the text discussed
these two monitor states only in the context of a health detail entry.

This improvement to the documentation was suggested on the [ceph-users]
email list by Joel Davidow. This email, an absolute model of user
engagement with an upstream project, can be reviewed here:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/KF67F5TXFSSTPXV7EKL6JKLA5KZQDLDQ/

I will list Joel Davidow here as the co-author for the sake of more
expediently getting this change into the documentation. Although he is
listed as the co-author, he is the true author.

Co-authored-by: Joel Davidow <jdavidow@nso.edu>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 6fb9a5ef817eda5184d51ebcb425a6091ca82299)

Merge PR #57358 into squid

* refs/pull/57358/head:
ceph.spec.in: remove command-with-macro line

Reviewed-by: John Mulligan <jmulligan@redhat.com>

Merge pull request #57747 from rhcs-dashboard/wip-66247-squid

squid: mgr/dashboard: fix readonly landingpage

Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #57948 from zdover23/wip-doc-2024-06-10-backport-57947-to-squid

squid: doc/start: remove "intro.rst"

Reviewed-by: Cole Mitchell <cole.mitchell.ceph@gmail.com>

doc/start: remove "intro.rst"

Remove "start/intro.rst", which has been renamed "start/index.rst" in
order to follow the conventions followed elsewhere in the documentation.

Follows https://github.com/ceph/ceph/pull/57900.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 74cc624d002e51769da37c04b3bdc32e0077d370)

Merge pull request #57941 from zdover23/wip-doc-2024-06-09-backport-57939-to-squid

squid: doc/glossary.rst: add "OpenStack Swift" and "Swift"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

squid: mds: QuiesceDbRequest: update the internal encoding of ops

Excluding the last root from a set will automatically mark it as QS_CANCELED.
Hence, it makes more sense if `exclude` and `cancel` share the same op code,
rather than `exclude` and `release`.

Fixes: https://tracker.ceph.com/issues/66400
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
Fixes: https://tracker.ceph.com/issues/66383
(cherry picked from commit dad52497817c372fd7c61a88a210b5a3613cb807)

doc/glossary.rst: add "OpenStack Swift" and "Swift"

Add "OpenStack Swift" and "Swift" entries to the glossary.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit b2d413ee9db1d68392f29db148a7bc2e87a7b419)

Merge pull request #57915 from zdover23/wip-doc-2024-06-07-backport-57887-to-squid

squid: doc/rados: add options to network config ref

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #57883 from zdover23/wip-doc-2024-06-05-backport-57868-to-squid

squid: doc: correct typo

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #57913 from zdover23/wip-doc-2024-06-07-backport-57886-to-squid

squid: doc/dev: origin of Labeled Perf Counters

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

doc/rados: add options to network config ref

Add the following options to
doc/rados/configuration/network-config-ref.rst:

- public_network_interface
- cluster_network_interface

These additions were made in response to a request from Blaine Gardner.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 33bc1a0241cf29d0f1d12aa0a54c6cda5a469adc)

doc/dev: origin of Labeled Perf Counters

Note that Labeled Perf Counters were introduced in Reef.

Fixes: https://github.com/ceph/ceph/pull/57753#discussion_r1626483732
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 36e09fb6894dcec56224d483d36a7315b8d19d60)

Merge pull request #57902 from zdover23/wip-doc-2024-06-06-backport-57900-to-squid

squid: doc/start: s/intro.rst/index.rst/

Reviewed-by: Cole Mitchell <cole.mitchell.ceph@gmail.com>

doc/start: s/intro.rst/index.rst/

Change the filename "doc/start/intro.rst" to "doc/start/index.rst" so
that Sphinx finds the root filename for the "/start" directory in the
default location.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 84ce2212e87a4b6b2416eeab7e8e1718ae3ce87b)

Merge pull request #57870 from zdover23/wip-doc-2024-06-05-backport-57867-to-squid

squid: doc/start: s/http/https/ in links

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc: correct typo

Signed-off-by: Matthew Vernon <mvernon@wikimedia.org>
(cherry picked from commit 4769493887e9f99f990122135d7cab6caee27f71)

doc/start: s/http/https/ in links

Replace "http" with "https" in doc/start/get-involved.rst.

This commit is, in a way, a repeat of
https://github.com/ceph/ceph/pull/57213/
(1c5383b91bd7dbfa9670c6485fcc5ff28b79f40d), which targeted the Reef
branch instead of the main branch. When this commit has been merged and
backported, I will close https://github.com/ceph/ceph/pull/57213/.

I am listing Casey Cain here as the co-author, but he is in fact the
true author of this change.

Co-authored-by: Casey Cain <ccain@linuxfoundation.org>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 922f23f0f30da856a661376527f413dc9424382d)

Merge pull request #57850 from zdover23/wip-doc-2024-06-04-backport-57824-to-squid

squid: doc/rados: add stop monitor command

doc/rados: add stop monitor command

Add the command for stopping a monitor to the procedure that explains
how to inject a monmap into a monitor.

Zac of the future: cf. 05 Aug 2023.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit c032188d66539a78ab0e4de2a5f5fc4329927bf6)

Merge pull request #57844 from zdover23/wip-doc-2024-06-04-backport-57839-to-squid

squid: doc/start: Edit Beginner's Guide

doc/start: Edit Beginner's Guide

Make some improvements to the basic text of the Beginner's Guide.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit f484a156bed655909617f4e351b692d7a23d0e87)

Merge pull request #57821 from zdover23/wip-doc-2024-06-02-backport-57820-to-squid

squid: doc/start: Add Beginner's Guide

Reviewed-by: Cole Mitchell <cole.mitchell.ceph@gmail.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/start: Add Beginner's Guide

Add a Beginner's Guide to docs.ceph.com.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 91aafc6a7f72c105fbf3aa8419863e931d5b9e00)

Merge pull request #57814 from zdover23/wip-doc-2024-06-01-backport-57804-to-squid

squid: doc/cephfs: edit vstart warning text

doc/cephfs: edit vstart warning text

Improve the English in the vstart warning in doc/cephfs/mantle.rst.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 9ce7f9bd6c006ede6e1d563f4273376e2dbc1d03)

Merge pull request #57803 from petrutlucian94/wip-66312-squid

squid: rbd-wnbd: wait for the disk cleanup to complete

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

rbd-wnbd: wait for the disk cleanup to complete

The WNBD disk removal workflow is asynchronous, which is why we'll
need to wait for the cleanup to complete when stopping the service.

The "disconnect_all_mappings" function is moved to
RbdMappingDispatcher::stop, allowing us to access the mapping list
more easily and reject new mappings after a stop has been requested.

While at it, we'll log service stop requests.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
(cherry picked from commit 9136cbdecb520def4fdfbbf696e1802037cac510)

Merge pull request #57791 from zdover23/wip-doc-2024-05-30-backport-57790-to-squid

squid: doc/cephfs: edit front matter in mantle.rst

qa/cephfs: test that counters are not printed for SR MDS

- Add tests to verify that inode and stray counters are not
  replayed/included in the health warnings printed for the
  standby-replay MDS.

- Add "MDS_CACHE_OVERSIZED" health warning to ignorelist to
  failover.yaml.

- Add a helper method to qa.tasks.cephfs.filesystem.Filesystem to get
  MDS name of standby-replay MDS.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 2784e224e7af38d5b96c573df7dfb373de53937b)

mds: add no counters in warning for standby-replay MDS

Don't include inode and stray counters in the health warnings printed
for standby-replay MDSs. These counters appear in the health warnings
only because of replay and can confuse users, so they are omitted.

Fixes: https://tracker.ceph.com/issues/63514
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 03dcdc1329e471aa4aa403519ea5131db2f99b23)

Merge pull request #57466 from soumyakoduri/wip-skoduri-squid

[Squid]rgw/cloud-transition: fix the crash with publish_commit

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #57376 from cbodley/wip-65888-squid

squid: rgw/beast: fix crash observed in SSL stream.async_shutdown()

Reviewed-by: Mark Kogan <mkogan@redhat.com>

Merge pull request #57470 from yuvalif/wip-65996-squid

squid: rgw/notification: start/stop endpoint managers in notification manager

Reviewed-by: Casey Bodley <cbodley@redhat.com>

doc/cephfs: edit front matter in mantle.rst

Improve the structure and grammar of the front matter in the
doc/cephfs/mantle.rst file.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 559d4849ecd6f93b5812f3d8d0448115c5b5beab)

Merge PR #57730 into squid

* refs/pull/57730/head:
squid: mds: remove unnecessary quiesce finisher variable
squid: mds: attach quiesce_path mdr to finisher at creation not dispatch
squid: mds/quiesce: disable quiesce root debug parameters by default
squid: mds/quiesce-agt: never send a synchronous ack
squid: mds/quiesce-agt: add test for a rapid async ack
squid: mds/quiesce: always abort fragmenting asynchronously to prevent reentrancy
squid: mds/quiesce: overdrive an export if it hasn't frozen the tree yet
squid: mds/quiesce: quiesce_inode should not hold on to remote auth pins
squid: qa/cephfs: check that a completed quiesce doesn't hold remote auth pins
squid: mds: add `--lifetime` parameter to the `lock path` asok command
squid: mds/quiesce: accept a regular file as the quiesce root
squid: mds: command_quiesce_path: rename `--wait` to `--await` for consistency
squid: mds: command_quiesce_path: do not block the asok thread and return an adequate rc
squid: mds/quiesce: drop remote authpins before waiting for the quiesce lock
squid: qa/cephfs/test_quiesce: test proper handling of remote authpins
squid: mds: don't clear `AUTHPIN_FROZEN` until `FROZEN` in rename_prep
squid: mds: enhance the `lock path` asok command
squid: mds/quiesce: overdrive fragmenting that's still freezing
squid: revert: mds: provide a mechanism to authpin while freezing
squid: qa/cephfs/test_quiesce: enhance the fragmentation test
squid: mds/quiesce-db: collect acks while bootstrapping
squid: mds/quiesce-db: optimize peer updates
squid: mds/quiesce-db: track db epoch separately from the membership epoch
squid: mds/quiesce-db: test that a peer on a newer membership epoch can ack a root
squid: mds: don't stall the asok thread for flush commands
squid: qa/quiescer: relax some timing requirements in the quiescer
squid: qa/tasks/quiescer: dump ops in parallel
squid: qa/suites/fs: add quiescer to the fs suite
squid: qa/tasks: the quiescer task and a waiter task to test it
squid: qa/tasks/cephfs: don't create a new CephManager if there is one in the context
squid: qa/tasks: vstart_runner: introduce --config-mode
squid: qa/tasks: introduce ThrasherGreenlet
squid: qa: update quiesce tests to expect ipolicy lock
squid: mds: add missing policylock to test F_QUIESCE_BLOCK

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #57657 from rhcs-dashboard/wip-65994-squid

squid: exporter: fix regex for rgw sync metrics

Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>

Merge pull request #57757 from zdover23/wip-doc-2024-05-29-backport-57753-to-squid

squid: doc/dev: add note about intro of perf counters

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

pybind/mgr/mirroring: Fix KeyError: 'directory_count' in daemon status

The 'directory_count' key is intermittently missing from the JSON returned by
self.mgr.get_daemon_status(). When m_listener.handle_mirroring_enabled() is slow
to update directory_count, ServiceDaemon::update_status() creates a JSON object
without the 'directory_count' key/value. However, mgr/mirroring -> daemon_status()
always expects the 'directory_count' key to be present in the JSON returned by
self.mgr.get_daemon_status().

This issue occurs intermittently when mirroring is enabled or disabled and
'daemon status' is checked in between. This patch fixes the issue by setting a
default value of 0 for 'directory_count' in daemon_status().
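
The default-value approach described above can be sketched as follows. This
is a minimal illustration, not the actual mgr/mirroring code: the function
name `directory_count` and the shape of the status dictionary are assumptions
made for the example.

```python
# Illustrative sketch only; the status layout here is hypothetical.
def directory_count(status: dict) -> int:
    """Read 'directory_count', falling back to 0 when the key is absent."""
    # dict.get() avoids the KeyError that status["directory_count"] would
    # raise while the daemon has not yet published the key.
    return int(status.get("directory_count", 0))


print(directory_count({"directory_count": 3}))  # key present
print(directory_count({}))                      # key not yet populated
```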

Fixes: https://tracker.ceph.com/issues/65795
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit b78baa23e562742b8bdc5a75f82e3b6fbf55a8a5)

doc: update 'journal reset' command with --yes-i-really-really-mean-it

Fixes: https://tracker.ceph.com/issues/62925
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 42953ece97b1b082400443b44bab46b1118fb1f8)

qa: fix cephfs-journal-tool command options and make fs inactive

Fixes: https://tracker.ceph.com/issues/62925
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 0820b31d5b1f1542636c56611ec27636afc23b68)

cephfs-journal-tool: Add warning messages during 'journal reset' and prevent execution on active fs

Fixes: https://tracker.ceph.com/issues/62925
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 57f7d8b70f6989dee09a2adf5ff99b2917589488)

squid: mds: remove unnecessary quiesce finisher variable

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ed519f63f632e49fcb6f45bcf03e1022e17378b9)

squid: mds: attach quiesce_path mdr to finisher at creation not dispatch

No functional difference but this is cleaner.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 612b0957ee4eeda4f3b17ca5c3c2ca2346e8ec3d)

doc/dev: add note about intro of perf counters

Add a note to the "perf counter" section of doc/dev/perf_counters.rst
that explains that this feature was introduced in the Reef release of
Ceph. This note will prevent us from accidentally backporting
perf-counter-related PRs to Quincy.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 55e41bc679aeee9119dc815b456bb76880f61ea2)

squid: mds/quiesce: disable quiesce root debug parameters by default

Fixes: https://tracker.ceph.com/issues/66225
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit a9cb3a581a309a99b72997ae5ddb88084f5484c9)
Fixes: https://tracker.ceph.com/issues/66255

squid: mds/quiesce-agt: never send a synchronous ack

Defer to the agent thread to perform all acking.
This avoids race conditions between the updating thread
and the acking thread.
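
The idea of deferring all acks to a single agent thread can be sketched as
below. This is a simplified illustration under stated assumptions, not the
MDS quiesce agent: the class name `AckAgent`, the `send_ack` callback, and
the use of a queue are all hypothetical.

```python
# Illustrative sketch only; names and structure are hypothetical.
import queue
import threading


class AckAgent:
    """Hand every ack to one worker thread so the updater never acks inline."""

    def __init__(self, send_ack):
        self._send_ack = send_ack
        self._pending = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def db_update(self, version):
        # The updating thread only enqueues; it never sends an ack itself.
        self._pending.put(version)

    def stop(self):
        # A None sentinel tells the worker to exit after draining earlier items.
        self._pending.put(None)
        self._thread.join()

    def _run(self):
        while True:
            version = self._pending.get()
            if version is None:
                return
            self._send_ack(version)
```

Because only the worker calls `send_ack`, acks are sent in the order they
were queued and never run concurrently with each other.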

Fixes: https://tracker.ceph.com/issues/66219
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 9a4c5853d1c2a353f72cd6358bbdedd93c4cc209)
Fixes: https://tracker.ceph.com/issues/66256

squid: mds/quiesce-agt: add test for a rapid async ack

In this scenario, the agent thread is able to run and generate an ack
before the db_update call returns to the caller.

Fixes: https://tracker.ceph.com/issues/66219
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 4ab40ea0d3a366e3e2cb7bd7da8da9463b27eb25)
Fixes: https://tracker.ceph.com/issues/66256

squid: mds/quiesce: always abort fragmenting asynchronously to prevent reentrancy

Fixes: https://tracker.ceph.com/issues/66208
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit f24c0dca62590ddaf553ecae9405a52aa27ed613)
Fixes: https://tracker.ceph.com/issues/66257

squid: mds/quiesce: overdrive an export if it hasn't frozen the tree yet

Just as with fragmenting, we should abort an ongoing export
if a quiesce is attempted for the directory.

To minimize the stress on the system, we only allow the abort
if the export hasn't yet managed to freeze the tree. Otherwise,
quiesce will have to wait for the export to finish.

Fixes: https://tracker.ceph.com/issues/66123
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit da5c263b8e7797eac6c9d13d5b6a6b292d9c5def)
Fixes: https://tracker.ceph.com/issues/66259

squid: mds/quiesce: quiesce_inode should not hold on to remote auth pins

1. avoid taking a remote authpin for the quiesce lock
2. drop remote authpins that were taken because of other locks

We should not be forcing a mustpin when taking quiesce lock.
This creates unnecessary overhead due to the distributed nature
of the quiesce: all ranks will execute quiesce_inode, including
the auth rank, which will authpin the inode.

Auth pinning on the auth rank is important to synchronize quiesce
with operations that are managed by the auth, like fragmenting
and exporting.

If we let a remote quiesce process take a foreign authpin then
it may block freezing on the auth, which will stall quiesce locally.
This wouldn't be a problem if the quiesce that is blocked on the auth
and the quiesce that's holding a remote authpin from the replica side
were unrelated, but in our case it may be the same logical quiesce
that effectively steps on its own toes. This creates an opportunity
for a deadlock.

Fixes: https://tracker.ceph.com/issues/66152
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit b1cb6d985622c6164d99d3fd79b6eeaf6530894c)
Fixes: https://tracker.ceph.com/issues/66258

squid: qa/cephfs: check that a completed quiesce doesn't hold remote auth pins

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit e32fb12b8ea105cef82cf5b9304c28bc4dc8e7a5)
Fixes: https://tracker.ceph.com/issues/66258

squid: mds: add `--lifetime` parameter to the `lock path` asok command

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit c395c78e09f9b41088801dcea1fab1cd10b0ba00)
Fixes: https://tracker.ceph.com/issues/66258

squid: mds/quiesce: accept a regular file as the quiesce root

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit f706ae8c2d1993ba11fe32f6cfa87154c7d2b39b)
Fixes: https://tracker.ceph.com/issues/66258

squid: mds: command_quiesce_path: rename `--wait` to `--await` for consistency

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit c20221574e4600d22dd3c0238647cc5671c8b43c)
Fixes: https://tracker.ceph.com/issues/66258

squid: mds: command_quiesce_path: do not block the asok thread and return an adequate rc

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit df546a4fba0d3851644ce1607340484409a3677d)
Fixes: https://tracker.ceph.com/issues/66258

Merge pull request #57749 from zdover23/wip-doc-2024-05-29-backport-57732-to-squid

squid: doc/developer_guide: update doc about installing teuthology

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #57742 from zdover23/wip-doc-2024-05-28-backport-57720-to-squid

squid: doc/cephfs: s/subvolumegroups/subvolume groups

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

squid: mds/quiesce: drop remote authpins before waiting for the quiesce lock

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
Fixes: https://tracker.ceph.com/issues/65802
(cherry picked from commit 5692f7f55ee44e0ab5a576909845549201d6c986)
Fixes: https://tracker.ceph.com/issues/66153

squid: qa/cephfs/test_quiesce: test proper handling of remote authpins

When a request is blocked on the quiesce lock, it should release
all remote authpins, especially those that make an inode AUTHPIN_FROZEN

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit bed8a47b802acc56d7953bb8781165cd1068ab83)
Fixes: https://tracker.ceph.com/issues/66154

squid: mds: don't clear `AUTHPIN_FROZEN` until `FROZEN` in rename_prep

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 86d6533351606a86978e117f127d29d63ea588ce)
Fixes: https://tracker.ceph.com/issues/66154

squid: mds: enhance the `lock path` asok command

* when the quiesce lock is taken by this op, don't consider the inode `quiesced`
* drop all locks taken during traversal
* drop all local authpins after the locks are taken
* add --await functionality that will block the command until locks are taken or an error is encountered
* return the RC that represents the operation result. 0 if the operation was scheduled and hasn't failed so far
* add authpin control flags
** --ap-freeze - to auth_pin_freeze the target inode
** --ap-dont-block - to pass auth_pin_nonblocking when acquiring the target inode locks

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 3552fc5a9ea17c173a18be41fa15fbbae8d77edf)
Fixes: https://tracker.ceph.com/issues/66154

squid: mds/quiesce: overdrive fragmenting that's still freezing

Quiesce requires revocation of capabilities,
which does not work for freezing/frozen nodes.
Since it is best effort, abort an ongoing fragmenting
for the sake of a faster quiesce.

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
Fixes: https://tracker.ceph.com/issues/65716
(cherry picked from commit 8b6440652d501644d641c1c8b3255c3720738ec6)
Fixes: https://tracker.ceph.com/issues/66154

squid: revert: mds: provide a mechanism to authpin while freezing

This is a functional revert of a9964a7ccc4394f923fb0f1c76eb8fa03fe8733d.
git revert was giving too many conflicts, as the code has changed
too much since the original commit.

The bypass freezing mechanism led us into several deadlocks,
and when we found out that a freezing inode defers reclaiming
client caps, we realized that we needed to try a different approach.
This commit removes the bypass freezing related changes to clear the way
for a different approach to resolving the conflict between quiesce
and freezing.

Fixes: https://tracker.ceph.com/issues/65716
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit bf760602a4f02cc07072db2da5cb987e3072afce)
Fixes: https://tracker.ceph.com/issues/66154

squid: qa/cephfs/test_quiesce: enhance the fragmentation test

Repeatedly quiesce under a heavy balancer load

Fixes: https://tracker.ceph.com/issues/65716
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 2b2af17ae45d34eeddb2d31f791ed4f0af77672a)
Fixes: https://tracker.ceph.com/issues/66154

squid: mds/quiesce-db: collect acks while bootstrapping

Keeping the acks that come in will allow processing them
immediately after the bootstrap is over, avoiding unnecessary
set state transitions.

Fixes: https://tracker.ceph.com/issues/66119
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit d6fb8755ca839ef5c1f94c3bc92a0e799c8f2d85)
Fixes: https://tracker.ceph.com/issues/66155

squid: mds/quiesce-db: optimize peer updates

Prevent sending of the same version to the same peer more than once a second
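
The once-per-second limit described above can be sketched as a small
throttle keyed on (peer, version). This is an illustration only, not the
quiesce-db implementation: the class name `PeerUpdateThrottle`, the method
`should_send`, and the injectable clock are assumptions made for the example.

```python
# Illustrative sketch only; names and structure are hypothetical.
import time


class PeerUpdateThrottle:
    """Suppress repeat sends of one version to one peer within `interval` seconds."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so the logic can be tested
        self._last_sent = {}        # (peer, version) -> time of last send

    def should_send(self, peer, version):
        now = self.clock()
        key = (peer, version)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.interval:
            return False            # same version went to this peer too recently
        self._last_sent[key] = now
        return True
```

A new version, or a different peer, is never delayed; only an identical
(peer, version) pair is held back until the interval has elapsed.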

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit eebf597b2193fd15ff80892570ebbe670acf0f53)
Fixes: https://tracker.ceph.com/issues/66070

squid: mds/quiesce-db: track db epoch separately from the membership epoch

Tracking the db epoch separately will make sure that replicas
only follow the leader's epoch choice, even if they are already on
the new membership epoch. This eliminates races due to the
random order of mdsmap updates.

Fixes: https://tracker.ceph.com/issues/65977
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 379ef7196b61142dc7753992f897ad91b37f048f)
Fixes: https://tracker.ceph.com/issues/66070

squid: mds/quiesce-db: test that a peer on a newer membership epoch can ack a root

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit f58f63c4aecc867dfe4fd68f04629e8e45f3e864)
Fixes: https://tracker.ceph.com/issues/66070