Ville Ojamo [Wed, 30 Apr 2025 18:17:14 +0000 (01:17 +0700)]
doc/radosgw: Improve rgw-cache.rst
Try to improve the language by completely rewriting some sentences.
Attempt to format the document more like the rest of the docs.
Fix several errors in punctuation, capitalization, spaces etc.
Use blocks with bash prompts for CLI commands instead of hardcoded
prompts.
Fix section hierarchy and section title underline lengths.
Use admonition.
Kefu Chai [Wed, 25 Jun 2025 03:02:46 +0000 (11:02 +0800)]
doc: do not depend on typed-ast
the typed-ast project was marked end of life since July 2023, and
not maintained anymore. since we build the document using readthedocs'
service, and in .readtherdocs.yml we use python 3.9, which comes with
ast module included by its standard library.
the typed-ast dependency was originally added in 30d41597, but now that
we are using python 3.9, there is no need to use this module anymore.
Add comprehensive documentation for defining configuration options in
ceph-mgr modules, including all supported properties and their usage.
Previously, the documentation did not explain how to define ceph-mgr
module configuration options, despite subtle differences from other Ceph
components. This change documents all supported Option properties, their
types, and provides clear examples to help module developers properly
configure their options.
J. Eric Ivancich [Tue, 22 Oct 2024 17:17:14 +0000 (13:17 -0400)]
rgw: fix empty storage class on display of multipart uploads
Some multipart uploads do not have a stored storage class, however the
code is written such that an empty storage class is treated as the
"STANDARD" storage class. So when encoding the storage class in JSON,
use the canonical storage.
The crash module has been enabled by default since commit 18f253aa in
Nautilus and is now in the always_on_modules list. However, the
documentation still contained instructions for manually enabling it.
When users followed these outdated instructions, they encountered:
```
module 'crash' is already enabled (always-on)
```
The module cannot be disabled either. Running:
```
ceph mgr module disable crash
```
Returns the error:
```
Error EINVAL: module 'crash' cannot be disabled (always-on)
```
In this change, we remove the obsolete enabling instructions and clarify
that this module is always active and cannot be disabled.
Mark Kogan [Wed, 25 Jun 2025 12:21:49 +0000 (12:21 +0000)]
qa/rgw: fix perl tests missing Amazon::S3 module
and a second case where perl tests can fail without error output
1. fix errors like: `Can't locate Amazon/S3.pm in @INC (you may need to
install the Amazon::S3 module)`
by priming the perl tests with installing the Amazon::S3 module from cpan
ex:
```
2025-06-23T19:18:40.162 INFO:tasks.workunit.client.0.smithi090.stderr:Can't locate Amazon/S3.pm in @INC (you may need to install the Amazon::S3 module) (@INC contains: /usr/local/lib64/perl5/5.32 ...
```
Kefu Chai [Wed, 25 Jun 2025 04:14:36 +0000 (12:14 +0800)]
mgr/dashboard: Fix inline markup warning in API documentation
Remove trailing space from summary field that was causing Sphinx build
warning.
Sphinx was generating a warning due to malformed inline markup:
```
/home/kefu/dev/ceph/doc/mgr/ceph_api/index.rst:3349: WARNING: Inline strong start-string without end-string.`
```
The openapi directive appears to convert trailing spaces into asterisk
markers, creating unterminated strong markup. This change removes the
trailing space to eliminate the warning and maintain consistency with
other entries in the file.
Kefu Chai [Wed, 25 Jun 2025 03:50:24 +0000 (11:50 +0800)]
doc/dev/config: Document how to use :confval: directive for config options
Add comprehensive guide for documenting configuration options using the
:confval: directive, including naming conventions and cross-referencing.
Previously, the documentation lacked guidance on using the :confval:
directive and the important distinction between regular config options
and mgr module options (which require the mgr/<module>/ namespace
prefix). This change provides detailed examples and best practices for
properly documenting and referencing both types of configuration options.
Jos Collin [Tue, 6 May 2025 11:50:39 +0000 (17:20 +0530)]
qa: fix test_cephfs_mirror_stats failure
* Don't create huge files that results in 'No space left on device'.
* Relax last_synced_end > last_synced_start check, so that
the test wouldn't fail even if 'counter dump' delays getting updated
values within a particular snapshot sync.
Fixes: https://tracker.ceph.com/issues/71186 Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 9738b8d36275fda42d847058aab55ba1e6e6e7fc)
Jos Collin [Fri, 13 Dec 2024 02:53:07 +0000 (08:23 +0530)]
qa: fix test_cephfs_mirror_stats failure
100MB files would take less than a second to sync, which makes no difference
in 'last_synced_end' and the test fails intermittently. We need to increase the
size of the files, as the time/duration is determined only in seconds.
Because of this, it also needs more sleep time before checking the status.
Fixes: https://tracker.ceph.com/issues/69232 Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 005e492288b71c641f33396cc8b13cc53d52b478)
Kefu Chai [Tue, 24 Jun 2025 08:30:11 +0000 (16:30 +0800)]
ceph.spec.in: Remove rgw-restore-bucket-index.8* from packaging
Fix RPM build failure caused by missing manpage in reef branch.
In commit bbe80059, we backported 8d0ec766 to reef but incorrectly
included `%{_mandir}/man8/rgw-restore-bucket-index.8*` in the package
files section. The original commit 8d0ec766 only added
`%{_mandir}/man8/rgw-gap-list.8*`, and the rgw-restore-bucket-index
manpage is not built in the reef branch.
This caused RPM build failures because rpmbuild requires all packaged
files to exist in the build directory:
```
error: File not found: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.7-518-ge3d44f43/rpm/el9/BUILDROOT/ceph-18.2.7-518.ge3d44f43.el9.x86_64/usr/share/man/man8/rgw-restore-bucket-index.8*
```
In this change, we remove `%{_mandir}/man8/rgw-restore-bucket-index.8*`
from the ceph-common package files section to resolve this issue.
Note: This is a reef-specific fix addressing a backport issue and
is not cherry-picked from master.
Zac Dover [Mon, 23 Jun 2025 08:18:07 +0000 (18:18 +1000)]
doc/radosgw: remove "pubsub_event_lost"
Remove "pubsub_event_lost" from the list of "Notification Performance
Statistics" in doc/radosgw/notifications.rst. "pubsub_event_lost" is now
obsolete.
Kefu Chai [Tue, 3 Jun 2025 08:07:33 +0000 (16:07 +0800)]
librbd/cache/pwl: fix memory leak in SyncPoint persist context cleanup
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
#0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
#1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
rbd-mirror: release lock before calling m_async_op_tracker.finish_op()
m_async_op_tracker.finish_op() in InstanceReplayer::start_image_replayers
may invoke a completion that re-enters code paths that attempt to acquire
the same mutex (m_lock), violating the non-recursive lock constraint.
This can be fixed by releasing the lock before calling
m_async_op_tracker.finish_op().
Fixes: https://tracker.ceph.com/issues/70951 Signed-off-by: VinayBhaskar-V <vvarada@redhat.com>
(cherry picked from commit 07e44074ae5d18dc1248e557fd1aa2dbe39792c6)
Add an instruction that includes the --enable-auth flag in a "git orch
apply mgmt-gateway" command, in accordance with a request made by
afreen23 here: https://github.com/ceph/ceph/pull/60440#discussion_r1953530599
Zac Dover [Fri, 30 May 2025 12:38:03 +0000 (22:38 +1000)]
doc/rados/operations: edit cache-tiering.rst
Strengthen the warning against deploying cache tiering in releases after
Reef. This follows up on Anthony D'Atri's request in
https://github.com/ceph/ceph/pull/63465.