osd: Apply randomly determined IO priority cutoff across all OSD shards
Determine the op priority cutoff for an OSD and apply it on all the OSD
shards, which is a more realistic scenario. Previously, the cutoff value
was randomized per OSD shard, leading to issues in testing. The IO
priority cutoff is now determined before the OSD shards are initialized.
The cutoff value is then passed to the OpScheduler implementations, which
are modified to apply it during initialization.
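A minimal sketch of the idea, not the actual Ceph code: the cutoff is
resolved once per OSD and the same value is handed to every shard's
scheduler. The names SketchScheduler, resolve_cutoff(), make_shards(),
all_shards_agree() and the PRIO_* constants and their values are
assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Illustrative priority levels; names and values are assumptions for this sketch.
constexpr unsigned PRIO_LOW = 64;
constexpr unsigned PRIO_HIGH = 196;

// Hypothetical stand-in for an OpScheduler implementation that receives the
// cutoff at construction time instead of deriving it on its own.
class SketchScheduler {
public:
  explicit SketchScheduler(unsigned cutoff) : cutoff_(cutoff) {}
  unsigned cutoff() const { return cutoff_; }
  // In this sketch, ops at or above the cutoff are treated as high priority.
  bool is_high_priority(unsigned op_priority) const {
    return op_priority >= cutoff_;
  }
private:
  unsigned cutoff_;
};

// Resolve the configured cutoff exactly once per OSD. A "debug_random"
// setting picks one of the two levels at random; anything other than "low"
// maps to the high level here.
template <typename Rng>
unsigned resolve_cutoff(const std::string& configured, Rng& rng) {
  if (configured == "debug_random") {
    std::bernoulli_distribution coin(0.5);
    return coin(rng) ? PRIO_HIGH : PRIO_LOW;
  }
  return configured == "low" ? PRIO_LOW : PRIO_HIGH;
}

// Build every shard with the same, already resolved cutoff.
std::vector<SketchScheduler> make_shards(std::size_t num_shards,
                                         unsigned cutoff) {
  assert(num_shards > 0);
  std::vector<SketchScheduler> shards;
  shards.reserve(num_shards);
  for (std::size_t i = 0; i < num_shards; ++i) {
    shards.emplace_back(cutoff);
  }
  return shards;
}

// True when no shard disagrees with the first shard's cutoff.
bool all_shards_agree(const std::vector<SketchScheduler>& shards) {
  for (const auto& s : shards) {
    if (s.cutoff() != shards.front().cutoff()) return false;
  }
  return true;
}
```

Because the random draw happens before any shard exists, the shards
cannot disagree, which is the property the commit restores.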
osd/OSD: Query osd op queue type from scheduler instead of config subsystem
All OSD shards are guaranteed to use the same scheduler type. Therefore,
OSD::osd_op_queue_type() is used where applicable to determine the
scheduler type. This results in the appropriate setting of other config
options based on the randomly selected scheduler type in case the global
'osd_op_queue' config option is set to 'debug_random' (e.g., in CI
tests).
Note: If 'osd_op_queue' is set to 'debug_random', the PG-specific code
(PGPeering, PrimaryLogPG) continues to use the existing mechanism of
querying the config option key (osd_op_queue) using get_val(), as before.
common, osd: Apply randomly selected scheduler type across all OSD shards
Originally, the choice of 'debug_random' for osd_op_queue resulted in the
selection of a random scheduler type for each OSD shard. A more realistic
scenario for testing would be the selection of the random scheduler type
applied globally for all shards of an OSD. In other words, all OSD shards
would employ the same scheduler type. For example, this scenario could
arise during upgrades when the scheduler type has changed between
releases.
The following changes are made as part of the commit:
1. Introduce enum class op_queue_type_t within osd_types.h that holds the
various op queue types supported. This header is included by OpQueue.h.
Add helper functions to osd_types.cc to return the op_queue_type_t as
an enum or as a string representing the enum member.
2. Determine the scheduler type before initializing the OSD shards in
OSD class constructor.
3. Pass the determined op_queue_type_t to the OSDShard's make_scheduler()
method for each shard. This ensures all shards of the OSD are
initialized with the same scheduler type.
4. Rename the unused OSDShard::get_scheduler_type() method to
get_op_queue_type() and modify it to return the op_queue_type_t set
for the queue.
5. Introduce OpScheduler::get_type() and OpQueue::get_type() pure
virtual functions and define them within the respective queue
implementation. These return a value pertaining to the op queue type
and are called by OSDShard::get_op_queue_type().
6. Add OSD::osd_op_queue_type() method for determining the scheduler
type set on the OSD shards. Since all OSD shards are set to use
the same scheduler type, the shard with the lowest id is used to
get the scheduler type using OSDShard::get_op_queue_type().
7. Improve comment description related to 'osd_op_queue' option in
common/options/osd.yaml.in.
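The steps above can be sketched roughly as follows. This is a simplified
illustration under stated assumptions, not the code from the commit: the
enum member names, the string values returned by the helpers, the helper
signatures, and the Sketch* classes are all assumptions. Only
op_queue_type_t, get_type(), make_scheduler(), get_op_queue_type() and
osd_op_queue_type() are names taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <optional>
#include <string_view>

// Step 1: one enum naming the supported op queue types (members assumed).
enum class op_queue_type_t : std::uint8_t {
  WeightedPriorityQueue,
  mClockScheduler,
  PrioritizedQueue
};

// Step 1 helpers: enum to string and string to enum (strings assumed).
std::string_view get_op_queue_type_name(op_queue_type_t q) {
  switch (q) {
    case op_queue_type_t::WeightedPriorityQueue: return "wpq";
    case op_queue_type_t::mClockScheduler: return "mclock_scheduler";
    case op_queue_type_t::PrioritizedQueue: return "PrioritizedQueue";
  }
  return "unknown";
}

std::optional<op_queue_type_t> get_op_queue_type_by_name(std::string_view s) {
  if (s == "wpq") return op_queue_type_t::WeightedPriorityQueue;
  if (s == "mclock_scheduler") return op_queue_type_t::mClockScheduler;
  if (s == "PrioritizedQueue") return op_queue_type_t::PrioritizedQueue;
  return std::nullopt;  // e.g. "debug_random" is not a concrete type
}

// Step 5: a pure virtual get_type() that each implementation defines.
struct SketchOpScheduler {
  virtual ~SketchOpScheduler() = default;
  virtual op_queue_type_t get_type() const = 0;
};

// Hypothetical queue implementation that reports its own type.
template <op_queue_type_t T>
struct SketchQueue : SketchOpScheduler {
  op_queue_type_t get_type() const override { return T; }
};

// Step 3: the already determined type is passed in; nothing is re-rolled here.
std::unique_ptr<SketchOpScheduler> make_scheduler(op_queue_type_t type) {
  switch (type) {
    case op_queue_type_t::WeightedPriorityQueue:
      return std::make_unique<
          SketchQueue<op_queue_type_t::WeightedPriorityQueue>>();
    case op_queue_type_t::mClockScheduler:
      return std::make_unique<SketchQueue<op_queue_type_t::mClockScheduler>>();
    case op_queue_type_t::PrioritizedQueue:
      return std::make_unique<SketchQueue<op_queue_type_t::PrioritizedQueue>>();
  }
  return nullptr;
}

// Step 4: the shard reports the type of the queue it was given.
struct SketchShard {
  std::unique_ptr<SketchOpScheduler> scheduler;
  op_queue_type_t get_op_queue_type() const { return scheduler->get_type(); }
};

struct SketchOSD {
  std::map<std::uint32_t, SketchShard> shards;

  // Step 2: the type is decided before any shard is initialized.
  SketchOSD(std::uint32_t num_shards, op_queue_type_t type) {
    assert(num_shards > 0);
    for (std::uint32_t i = 0; i < num_shards; ++i) {
      shards[i].scheduler = make_scheduler(type);
    }
  }

  // Step 6: all shards agree, so ask the shard with the lowest id.
  op_queue_type_t osd_op_queue_type() const {
    return shards.begin()->second.get_op_queue_type();
  }
};
```

Using an ordered map keyed by shard id makes "the shard with the lowest
id" simply the first element; the real OSD may store its shards
differently.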
common, osd: Remove unused implementation of mClockPriorityQueue
mClockPriorityQueue (mClockQueue class) is an older mClock implementation
of the OpQueue abstraction. This was replaced by a simpler implementation
of the OpScheduler abstraction as part of
https://github.com/ceph/ceph/pull/30650.
The simpler mClockScheduler implementation is the one currently in use.
This commit removes the unused src/common/mClockPriorityQueue.h along
with the associated unit test file: test_mclock_priority_queue.cc.
Other miscellaneous changes:
- Remove the cmake references to the unit test file
- Remove the inclusion of the header file in mClockScheduler.h
Zac Dover [Tue, 19 Dec 2023 09:15:57 +0000 (19:15 +1000)]
doc/install: update "update submodules"
Remove misleading material that would give readers the wrong idea about
when stale submodules are present. This commit is made in response to
information given to me by Ilya Dryomov here: https://github.com/ceph/ceph/pull/54929#issuecomment-1859237986.
The client incorrectly decodes max_xattr_size (type: uint64_t) into
bal_rank_mask (type: string).
This situation arose for a couple of reasons:
* the kclient patchset handling `max_xattr_size` was merged early on
and another MDS side change that bumped the MDSMap encoding version
to 17 got merged in the midst (PR #43284). Details in comment:
Zac Dover [Sat, 9 Dec 2023 03:46:00 +0000 (04:46 +0100)]
doc/radosgw: format POST statements
Format the POST methods so that they appear in the rendered text as
examples of POST API calls and not as plain old unformatted text, which
is how they looked before this commit. The content of these API calls
remains to be tested and confirmed to work, but this is a first step.
Zac Dover [Sat, 2 Dec 2023 05:32:26 +0000 (06:32 +0100)]
doc/radosgw: add gateway starting command
Add a command that properly starts (or restarts) the RADOS gateway after
RGW settings have been changed. This commit has been added in response
to an issue reported anonymously on
https://pad.ceph.com/p/Report_Documentation_Bugs.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit ec7c515490c2ade44d886e423a6601c7ef0cf5e8)
Zac Dover [Tue, 5 Dec 2023 19:46:26 +0000 (20:46 +0100)]
doc/radosgw: update link in rgw-cache.rst
Update link in doc/radosgw/rgw-cache.rst. The link updated here is a
link to all the Nginx configuration files. The old link was broken. This
update comes to us from an anonymous report on
https://pad.ceph.com/p/Report_Documentation_Bugs.
Zac Dover [Sun, 3 Dec 2023 12:17:46 +0000 (13:17 +0100)]
doc/rados: repair stretch-mode.rst
Remove a section of doc/rados/operations/stretch-mode.rst that I wrongly
re-included after its removal. The request for this (re)-removal is
here: https://github.com/ceph/ceph/pull/54689#discussion_r1413007655.
Zac Dover [Sat, 2 Dec 2023 05:38:28 +0000 (06:38 +0100)]
doc/radosgw: fix formatting
Repair the formatting of a string that had a string inside backticks
that itself was inside double asterisks. The presence of the asterisks
around the entire string caused the backticks to appear in the rendered
documentation.
fs suite relies on these debugfs entries to gather mount information
(client-id, addr/inst) which are required by some tests. In fs suite,
the distro kernel gets overridden by the testing kernel and therefore
even if Ubuntu 20.04 is chosen as the distro, the testing kernel is
installed. However, with smoke suite, the distro kernel is used and
the missing patches cause certain essential information gathering to
fail early on (client-id, etc.), causing the test to not even start
execution. PR #54515 fixes a bug in the client-id fetching path but
isn't complete due to the missing patches - details here:
https://tracker.ceph.com/issues/63488#note-8
But it's essential to have the smoke tests running since those tests
have lately uncovered bugs in the MDS (w/ distro kernels). In order
to benefit from those tests, this change ignores failures when
gathering mount information (which isn't used by the fs relevant
smoke tests). The tests (in fs suite) that rely on this piece of
information would fail when run with 20.04 distro kernel (but the
fs suite overrides it with the testing kernel).
Venky Shankar [Mon, 27 Nov 2023 05:12:02 +0000 (10:42 +0530)]
qa: add centos_latest (9.stream) and ubuntu_20.04 yamls to supported-all-distro
A bug in Ceph MDS (MDS crash!) is seen with distros using a not-so-recent kernel
(5.4ish). This crash was first seen in a quincy smoke run and the problematic
backport change was reverted. The smoke suite chooses a random distro for each
job, so to hit this bug, the appropriate distro needs to be (randomly) chosen.
This change points the smoke suite to run against all supported distros.
This affects suites that point to supported-all-distro (powercycle) since it
bloats up the number of jobs. E.g., currently, without --subset, powercycle:osd
INFO:teuthology.suite.run:0/336 jobs were filtered out.
vs
(with this change)
Unable to schedule 560 jobs, too many jobs, when maximum 500 jobs allowed.
For smoke suite
INFO:teuthology.suite.run:Scheduled 24 jobs in total.
vs
(with this change)
INFO:teuthology.suite.run:Scheduled 120 jobs in total.
Eventually, with PR #46882, the testing kernel will no longer override the
distro kernel in fs suite, so we should get good coverage then.
MClientRequest: handle owner_uid and owner_gid from ceph_mds_request_head_legacy
When a client is too old and uses struct ceph_mds_request_head_legacy, we must
fill the new owner_uid and owner_gid fields from the old client_uid and client_gid.
Fixes: https://github.com/ceph/ceph/pull/52575
Fixes: https://tracker.ceph.com/issues/63288
Fixes: commit 46cb244b9c839 ("ceph_fs.h: add separate owner_{u,g}id fields")
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
(cherry picked from commit a70a70f589214d6e2a5b477a61005b13ba2fec46)
(cherry picked from commit 65257baa62eddac0cc3df9d2ca3a57e7fd2b25e2)
MClientRequest: handle ext_num_retry and ext_num_fwd from ceph_mds_request_head_legacy
When a client is too old and uses struct ceph_mds_request_head_legacy, we must
fill the new ext_num_retry and ext_num_fwd fields from the old num_retry and num_fwd.
Fixes: https://github.com/ceph/ceph/pull/45669
Fixes: https://tracker.ceph.com/issues/63288
Fixes: commit cbd7e3040208 ("ceph_fs.h: add 32 bits extended num_retry and num_fwd support")
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
(cherry picked from commit 43f32a46aa9095b19525357ba7ca215e842b4f77)
(cherry picked from commit 312bb5b9f1ada9646205a78f0a0fcc73d2530d5c)
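The two MClientRequest changes above amount to back-filling the newer
fields when only the legacy head is available. A rough sketch, assuming
simplified stand-in structs rather than the real ceph_fs.h wire layouts:
LegacyHeadSketch, HeadSketch, from_legacy(), the field widths, and the
uid/gid field names are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the legacy request head sent by old clients (layout assumed).
struct LegacyHeadSketch {
  std::uint8_t num_retry;
  std::uint8_t num_fwd;
  std::uint32_t uid;
  std::uint32_t gid;
};

// Stand-in for the current request head with the newer fields (layout assumed).
struct HeadSketch {
  std::uint8_t num_retry;
  std::uint8_t num_fwd;
  std::uint32_t ext_num_retry;
  std::uint32_t ext_num_fwd;
  std::uint32_t uid;
  std::uint32_t gid;
  std::uint32_t owner_uid;
  std::uint32_t owner_gid;
};

// Build a current head from a legacy one. Fields the old client never sent
// are derived from their older counterparts instead of being left at zero.
HeadSketch from_legacy(const LegacyHeadSketch& old) {
  HeadSketch h{};
  h.num_retry = old.num_retry;
  h.num_fwd = old.num_fwd;
  h.uid = old.uid;
  h.gid = old.gid;
  // Extended counters: widen the narrow legacy counters.
  h.ext_num_retry = old.num_retry;
  h.ext_num_fwd = old.num_fwd;
  // Owner ids: fall back to the ids the old client did send.
  h.owner_uid = old.uid;
  h.owner_gid = old.gid;
  // Postcondition: widening must preserve the legacy counter values.
  assert(h.ext_num_retry == old.num_retry && h.ext_num_fwd == old.num_fwd);
  return h;
}
```

Without the back-fill, code that reads only the new fields would see
zeroed values for requests coming from old clients.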
Zac Dover [Tue, 28 Nov 2023 05:08:48 +0000 (06:08 +0100)]
doc/rados: improve "Ceph Subsystems"
Improve the English in the subsection "Ceph Subsystems" in the section
"Subsystem, Log and Debug Settings" [sic] in
doc/rados/troubleshooting/log-and-debug.rst.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 7bca5f57cc2c11bdd76dd0edb43c716a1d5ad355)
Lucian Petrut [Wed, 15 Mar 2023 09:04:40 +0000 (09:04 +0000)]
test/libcephfs: skip flaky timestamp assertion on Windows
There's a new libcephfs test that creates a snapshot and
compares ctime/mtime. The issue is that one of the assertions
fails on Windows, potentially due to reduced timestamp
precision.