Conflicts:
src/pybind/mgr/dashboard/controllers/saml2.py
- kept the config options as they are in reef
src/pybind/mgr/dashboard/tox.ini
- kept the file as is in reef
Zac Dover [Wed, 11 Jun 2025 12:44:32 +0000 (22:44 +1000)]
doc/rados/ops: edit cache-tiering.rst
Add material to doc/rados/operations/cache-tiering.rst, as suggested by
Anthony D'Atri in
https://github.com/ceph/ceph/pull/63745#discussion_r2127887785.
J. Eric Ivancich [Mon, 24 Mar 2025 23:45:06 +0000 (19:45 -0400)]
rgw: add force option to `radosgw-admin object rm ...`
The `radosgw-admin object rm ...` sub-command will give up if it
determines that there's an issue with the head object. This can make
it difficult for an admin to clean up a bucket index when there's a
damaged or missing head object.
When the user adds the "--yes-i-really-mean-it" command-line option,
it enables the "force mode". The bucket index entry(ies) will be
removed. If the object being removed is the current version in a
versioned bucket, the appropriate changes to the OLH will take place.
Ville Ojamo [Wed, 30 Apr 2025 18:17:14 +0000 (01:17 +0700)]
doc/radosgw: Improve rgw-cache.rst
Try to improve the language by completely rewriting some sentences.
Attempt to format the document more like the rest of the docs.
Fix several errors in punctuation, capitalization, spaces etc.
Use blocks with bash prompts for CLI commands instead of hardcoded
prompts.
Fix section hierarchy and section title underline lengths.
Use admonition.
Kefu Chai [Wed, 25 Jun 2025 03:02:46 +0000 (11:02 +0800)]
doc: do not depend on typed-ast
The typed-ast project has been marked end-of-life since July 2023 and is
no longer maintained. We build the documentation using the Read the Docs
service, and .readthedocs.yml specifies Python 3.9, whose standard
library already includes the ast module.
The typed-ast dependency was originally added in 30d41597, but now that
we are using Python 3.9, there is no need for this module anymore.
Add comprehensive documentation for defining configuration options in
ceph-mgr modules, including all supported properties and their usage.
Previously, the documentation did not explain how to define ceph-mgr
module configuration options, despite subtle differences from other Ceph
components. This change documents all supported Option properties, their
types, and provides clear examples to help module developers properly
configure their options.
Jos Collin [Fri, 11 Apr 2025 06:08:20 +0000 (11:38 +0530)]
qa: fix multi-fs tests in test_mds_metrics.py
* Avoids unnecessary setup when writing a multi-fs test: instead of creating the
default filesystem, deleting it, and then creating and mounting the required filesystems,
this change uses the filesystems created via 'REQUIRE_BACKUP_FILESYSTEM' for the tests.
* Consequently, old/deleted filesystems no longer appear in the `perf stats` output,
which had made that output stale.
* Drops unused function parameters.
Fixes: https://tracker.ceph.com/issues/68001
Fixes: https://tracker.ceph.com/issues/68446
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit eaf2a8360d0d70b20d5ea61022fdde4f6a9b6464)
J. Eric Ivancich [Tue, 22 Oct 2024 17:17:14 +0000 (13:17 -0400)]
rgw: fix empty storage class on display of multipart uploads
Some multipart uploads do not have a stored storage class; however, the
code is written such that an empty storage class is treated as the
"STANDARD" storage class. So when encoding the storage class in JSON,
use the canonical storage class.
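The rule described above can be sketched in Python. This is only an illustration under stated assumptions, not the RGW code: the function names and the JSON field names are hypothetical, and only the "empty means STANDARD" behavior comes from the commit message.

```python
import json

STANDARD = "STANDARD"


def canonical_storage_class(stored: str) -> str:
    """Map an empty (never stored) storage class to "STANDARD"."""
    return stored if stored else STANDARD


def encode_upload(key: str, stored_storage_class: str) -> str:
    """Encode a multipart upload entry (hypothetical field names)."""
    return json.dumps({
        "Key": key,
        # Always emit the canonical name, never an empty string.
        "StorageClass": canonical_storage_class(stored_storage_class),
    })
```

With this sketch, an upload whose storage class was never stored is displayed as "STANDARD" instead of an empty string, while an explicitly stored class is passed through unchanged.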
The crash module has been enabled by default since commit 18f253aa in
Nautilus and is now in the always_on_modules list. However, the
documentation still contained instructions for manually enabling it.
When users followed these outdated instructions, they encountered:
```
module 'crash' is already enabled (always-on)
```
The module cannot be disabled either. Running:
```
ceph mgr module disable crash
```
returns the error:
```
Error EINVAL: module 'crash' cannot be disabled (always-on)
```
In this change, we remove the obsolete enabling instructions and clarify
that this module is always active and cannot be disabled.
Mark Kogan [Wed, 25 Jun 2025 12:21:49 +0000 (12:21 +0000)]
qa/rgw: fix perl tests missing Amazon::S3 module
and a second case where perl tests can fail without error output
1. Fix errors like: `Can't locate Amazon/S3.pm in @INC (you may need to
install the Amazon::S3 module)`
by installing the Amazon::S3 module from CPAN before the perl tests run.
For example:
```
2025-06-23T19:18:40.162 INFO:tasks.workunit.client.0.smithi090.stderr:Can't locate Amazon/S3.pm in @INC (you may need to install the Amazon::S3 module) (@INC contains: /usr/local/lib64/perl5/5.32 ...
```
Kefu Chai [Wed, 25 Jun 2025 04:14:36 +0000 (12:14 +0800)]
mgr/dashboard: Fix inline markup warning in API documentation
Remove trailing space from summary field that was causing Sphinx build
warning.
Sphinx was generating a warning due to malformed inline markup:
```
/home/kefu/dev/ceph/doc/mgr/ceph_api/index.rst:3349: WARNING: Inline strong start-string without end-string.
```
The openapi directive appears to convert trailing spaces into asterisk
markers, creating unterminated strong markup. This change removes the
trailing space to eliminate the warning and maintain consistency with
other entries in the file.
When the cluster needs to be read, the completion is posted to ASIO.
However, in the two special cases (cluster DNE and zero cluster), the
completion is currently completed inline. This violates invariants
and can eventually lead to a lockup. For example, in a scenario of
a read from a clone image whose parent is under migration:
  io::ObjectReadRequest::read_parent()
    io::util::read_parent()
      < image_lock is taken for read >
      io::ImageDispatchSpec::send()
        migration::ImageDispatch::read()
          migration::QCOWFormat::ReadRequest::send()
            ...
            migration::QCOWFormat::ReadRequest::read_clusters()
              < cluster DNE >
              migration::QCOWFormat::ReadRequest::handle_read_clusters()
                io::AioCompletion::complete()
                  io::ObjectReadRequest::copyup()
                    is_copy_on_read()
                      < image_lock is taken for read >
copyup() expects to be called with no locks held, but going through
QCOWFormat in the "cluster DNE" case keeps the image_lock taken in
read_parent() held, and the same thread then takes it again in
is_copy_on_read(). Under pthreads, it's not a problem:
  A thread may hold multiple concurrent read locks on rwlock (that is,
  successfully call the pthread_rwlock_rdlock() function n times). If
  so, the thread must perform matching unlocks (that is, it must call
  the pthread_rwlock_unlock() function n times).
But according to the C++ standard, it's undefined behavior:
  If lock_shared is called by a thread that already owns the mutex in
  any mode (exclusive or shared), the behavior is undefined.
Other, longer and more elaborate, call chains are possible too and
there it may end up being a write lock, a tripped assertion, etc. To
avoid this, make the special cases in read_clusters() behave the same
as the main path.
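The difference between completing inline and posting the completion can be sketched in Python. This is a minimal model under stated assumptions, not librbd code: `Dispatcher`, `read_inline`, `read_posted` and `on_complete` are hypothetical names, and a non-reentrant `threading.Lock` stands in for a lock that its holder must not take again.

```python
import threading
from collections import deque


class Dispatcher:
    """Hypothetical stand-in for an event loop: posted callbacks run
    later, after the poster has released its locks."""

    def __init__(self):
        self._queue = deque()

    def post(self, fn):
        self._queue.append(fn)

    def run(self):
        while self._queue:
            self._queue.popleft()()


# Non-reentrant lock standing in for image_lock.
image_lock = threading.Lock()


def on_complete(log):
    """Completion handler that expects to be called with no locks held."""
    acquired = image_lock.acquire(blocking=False)
    log.append("lock free" if acquired else "lock already held")
    if acquired:
        image_lock.release()


def read_inline(log):
    """Special case as before the fix: complete while the lock is held."""
    with image_lock:
        on_complete(log)


def read_posted(dispatcher, log):
    """Special case after the fix: post the completion like the main path."""
    with image_lock:
        dispatcher.post(lambda: on_complete(log))
```

In this model, `read_inline()` runs the handler while the lock is still held, whereas `read_posted()` only queues it, so the handler runs from `Dispatcher.run()` with the lock released.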
Kefu Chai [Wed, 25 Jun 2025 03:50:24 +0000 (11:50 +0800)]
doc/dev/config: Document how to use :confval: directive for config options
Add comprehensive guide for documenting configuration options using the
:confval: directive, including naming conventions and cross-referencing.
Previously, the documentation lacked guidance on using the :confval:
directive and the important distinction between regular config options
and mgr module options (which require the mgr/<module>/ namespace
prefix). This change provides detailed examples and best practices for
properly documenting and referencing both types of configuration options.
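The naming distinction can be sketched as a small Python helper. This is an illustration only: the helper names, `example_option` and `example_module` are hypothetical, and only the `mgr/<module>/` prefix and the `:confval:` role come from the text above.

```python
from typing import Optional


def confval_name(option: str, mgr_module: Optional[str] = None) -> str:
    """Return the name under which an option would be referenced.

    Regular options are referenced by their bare name; mgr module
    options carry the mgr/<module>/ namespace prefix.
    """
    if mgr_module is None:
        return option
    return f"mgr/{mgr_module}/{option}"


def confval_role(option: str, mgr_module: Optional[str] = None) -> str:
    """Build the reStructuredText cross-reference for an option."""
    return f":confval:`{confval_name(option, mgr_module)}`"
```

For instance, `confval_role("example_option", "example_module")` yields a reference to `mgr/example_module/example_option`, while omitting the module yields a reference to the bare option name.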
Jos Collin [Tue, 6 May 2025 11:50:39 +0000 (17:20 +0530)]
qa: fix test_cephfs_mirror_stats failure
* Don't create huge files that result in 'No space left on device'.
* Relax last_synced_end > last_synced_start check, so that
the test wouldn't fail even if 'counter dump' delays getting updated
values within a particular snapshot sync.
Fixes: https://tracker.ceph.com/issues/71186
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 9738b8d36275fda42d847058aab55ba1e6e6e7fc)
Jos Collin [Fri, 13 Dec 2024 02:53:07 +0000 (08:23 +0530)]
qa: fix test_cephfs_mirror_stats failure
100MB files would take less than a second to sync, which makes no difference
in 'last_synced_end' and the test fails intermittently. We need to increase the
size of the files, as the time/duration is determined only in seconds.
Because of this, it also needs more sleep time before checking the status.
Fixes: https://tracker.ceph.com/issues/69232
Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 005e492288b71c641f33396cc8b13cc53d52b478)
Kefu Chai [Tue, 24 Jun 2025 08:30:11 +0000 (16:30 +0800)]
ceph.spec.in: Remove rgw-restore-bucket-index.8* from packaging
Fix RPM build failure caused by missing manpage in reef branch.
In commit bbe80059, we backported 8d0ec766 to reef but incorrectly
included `%{_mandir}/man8/rgw-restore-bucket-index.8*` in the package
files section. The original commit 8d0ec766 only added
`%{_mandir}/man8/rgw-gap-list.8*`, and the rgw-restore-bucket-index
manpage is not built in the reef branch.
This caused RPM build failures because rpmbuild requires all packaged
files to exist in the build directory:
```
error: File not found: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.7-518-ge3d44f43/rpm/el9/BUILDROOT/ceph-18.2.7-518.ge3d44f43.el9.x86_64/usr/share/man/man8/rgw-restore-bucket-index.8*
```
In this change, we remove `%{_mandir}/man8/rgw-restore-bucket-index.8*`
from the ceph-common package files section to resolve this issue.
Note: This is a reef-specific fix addressing a backport issue and
is not cherry-picked from master.
Zac Dover [Mon, 23 Jun 2025 08:18:07 +0000 (18:18 +1000)]
doc/radosgw: remove "pubsub_event_lost"
Remove "pubsub_event_lost" from the list of "Notification Performance
Statistics" in doc/radosgw/notifications.rst. "pubsub_event_lost" is now
obsolete.