John Mulligan [Wed, 2 Jul 2025 18:25:32 +0000 (14:25 -0400)]
mgr/smb: move prune function to be a staging store method
The prune function was tightly linked to the staging store. Re-implement
it as a more generic operation on the staging store.
Continue shrinking handler.py a bit and preparing to make it simpler to
add future SMBResource types.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 2 Jul 2025 18:06:21 +0000 (14:06 -0400)]
mgr/smb: replace cross check if-block with singledispatch
The previous code relied on a cascading block of if-isinstance statements
that was dense and somewhat error prone, as I found out during an
experiment to add a new top level resource type. Refactor the cross
check function to use singledispatch:
https://docs.python.org/3.9/library/functools.html#functools.singledispatch
Now, instead of correctly adding check function(s) and updating the
if-block, only new check functions using the register decorator are
needed.
Note that making this checking more generic is difficult as each
different resource type really has different cross checking needs.
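A minimal sketch of the singledispatch pattern described above. The
resource classes and the check logic here are hypothetical stand-ins,
not the actual mgr/smb types or checks:

```python
from dataclasses import dataclass
from functools import singledispatch


# Hypothetical stand-ins for resource types; not the real mgr/smb classes.
@dataclass
class Cluster:
    cluster_id: str


@dataclass
class Share:
    cluster_id: str
    share_id: str


@singledispatch
def cross_check(resource, known_clusters):
    # Fallback: reached only for types with no registered check function.
    raise TypeError(f"no cross check for {type(resource).__name__}")


@cross_check.register
def _check_cluster(resource: Cluster, known_clusters):
    # In this sketch a cluster has nothing to be checked against.
    return None


@cross_check.register
def _check_share(resource: Share, known_clusters):
    # A share must refer to a cluster that is already known.
    if resource.cluster_id not in known_clusters:
        raise ValueError(f"unknown cluster: {resource.cluster_id}")
```

Supporting a new type in this sketch means adding one more registered
function; the dispatching function itself is left untouched.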
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 2 Jul 2025 17:44:39 +0000 (13:44 -0400)]
mgr/smb: move functions from handler.py to staging.py
Create a new file staging.py to reduce the size of handler.py and
organize this a bit more. The staging.py file will be responsible for
the special staging area store as well as functions that work directly
on the staging store like the cross-check of resources.
This change is nearly a straight move except for renaming some functions
from _foo to foo.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 2 Jul 2025 17:30:14 +0000 (13:30 -0400)]
mgr/smb: extract a cross_check_resource function from handler class
Move the sequence of if-isinstance checks out of the class into a
function that can be called (and tested) on its own. This prepares
for additional future refactoring of this area.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 1 Jul 2025 15:28:02 +0000 (11:28 -0400)]
mgr/smb: reorganize internal.py to use fewer if-isinstance blocks
While working on adding a new top-level resource type I realized that
this code can be a bit error prone and un-DRY. Use a single method for
mapping between resource classes and resource entry classes and refactor
the rest of the module around that idea.
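A rough sketch of the single-mapping idea, assuming one lookup table
between resource classes and entry classes. All class and function
names below are hypothetical and do not come from internal.py:

```python
# Hypothetical resource classes.
class Cluster:
    pass


class Share:
    pass


# Hypothetical entry classes that know where a resource is stored.
class ClusterEntry:
    namespace = "clusters"


class ShareEntry:
    namespace = "shares"


# The one place that maps a resource class to its entry class.
_ENTRY_TYPES = {
    Cluster: ClusterEntry,
    Share: ShareEntry,
}


def entry_type_for(resource_cls):
    """Return the entry class for a resource class, or raise TypeError."""
    try:
        return _ENTRY_TYPES[resource_cls]
    except KeyError:
        raise TypeError(
            f"no entry type for {resource_cls.__name__}"
        ) from None
```

With a table like this, a new resource type needs one new row rather
than a new branch in every if-isinstance block.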
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 1 Jul 2025 21:15:10 +0000 (17:15 -0400)]
mgr/smb: add new ResourceKey protocol class
Add a new protocol class that can be used to uniquely identify a resource
within a given store namespace. The idea is to use this key class where
a resource can be ID'd by either one metadata field or by two, allowing
more common interfaces and fewer special code paths.
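A minimal sketch of how such a key protocol could look, using
typing.Protocol. The method name key_parts, the two key classes and
the store_name helper are assumptions made for illustration and are
not taken from the mgr/smb code:

```python
from dataclasses import dataclass
from typing import Protocol, Tuple


class ResourceKey(Protocol):
    """Anything that can be reduced to a tuple of ID strings."""

    def key_parts(self) -> Tuple[str, ...]:
        ...


# Hypothetical key identified by a single metadata field.
@dataclass(frozen=True)
class SingleKey:
    resource_id: str

    def key_parts(self) -> Tuple[str, ...]:
        return (self.resource_id,)


# Hypothetical key identified by two metadata fields.
@dataclass(frozen=True)
class ChildKey:
    parent_id: str
    resource_id: str

    def key_parts(self) -> Tuple[str, ...]:
        return (self.parent_id, self.resource_id)


def store_name(namespace: str, key: ResourceKey) -> str:
    # One code path serves both key shapes.
    return ".".join((namespace, *key.key_parts()))


print(store_name("clusters", SingleKey("c1")))     # clusters.c1
print(store_name("shares", ChildKey("c1", "s1")))  # shares.c1.s1
```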
Signed-off-by: John Mulligan <jmulligan@redhat.com>
.github/workflows/scripts/config-diff-post-comment.js: fix config check ok logic
Currently, whenever a "config diff tool output" comment is created, it
also contains the string `/config check ok`. The next time the
test runs, it sees this text and assumes that the user has commented it.
We fix the logic to make sure that we ignore such cases.
Kefu Chai [Mon, 30 Jun 2025 08:48:09 +0000 (16:48 +0800)]
osdc: remove unused rados.h include from error_code.h
Remove unnecessary `#include "include/rados.h"` from error_code.h as it's not
used by the header and error_code.h doesn't need to expose any RADOS
declarations.
This improves compilation time and reduces unnecessary dependencies.
mgr/dashboard: Enable rgw module automatically in the primary and secondary cluster if not enabled during multi-site automation
1. Enable rgw module automatically in the primary and secondary cluster if not enabled during multi-site automation
2. Improve progress bar descriptions and add sub-descriptions for steps
The crash module has been enabled by default since commit 18f253aa in
Nautilus and is now in the always_on_modules list. However, the
documentation still contained instructions for manually enabling it.
When users followed these outdated instructions, they encountered:
```
module 'crash' is already enabled (always-on)
```
The module cannot be disabled either. Running:
```
ceph mgr module disable crash
```
Returns the error:
```
Error EINVAL: module 'crash' cannot be disabled (always-on)
```
In this change, we remove the obsolete enabling instructions and clarify
that this module is always active and cannot be disabled.
Kirill Nazarov [Sun, 26 Jan 2025 19:08:24 +0000 (22:08 +0300)]
rbd: add --estimated-size option for import from stdin
One issue with importing from stdin is that it's not easy to track
progress. The only feasible option is to process messages on the highest
log level looking for lines like
but when it comes to large images it takes a lot of effort.
This commit introduces the --estimated-size option, which makes it possible
to print out progress in percents via the standard mechanism. Obviously,
it requires the knowledge of the amount of provided data in advance and
in case of an error nonsensical percents might be printed, but I don't
think it's that big of a deal.
Also use `estimated size` as the base image size, making resizing not
necessary in cases where we know the exact amount of data provided from
stdin.
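The percentage arithmetic can be sketched as follows. This is an
illustration only, not the rbd implementation; the function name and
its parameters are hypothetical:

```python
def progress_percent(bytes_read: int, estimated_size: int) -> int:
    """Return progress as a whole percentage of the estimated size."""
    if estimated_size <= 0:
        raise ValueError("estimated size must be positive")
    # Integer arithmetic: multiply first so small reads are not lost.
    return bytes_read * 100 // estimated_size


print(progress_percent(512, 2048))   # 25
print(progress_percent(3072, 2048))  # 150
```

In this sketch an estimate that is smaller than the data actually read
yields a value above 100.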
Mark Kogan [Wed, 25 Jun 2025 12:21:49 +0000 (12:21 +0000)]
qa/rgw: fix perl tests missing Amazon::S3 module
and a second case where perl tests can fail without error output
1. fix errors like: `Can't locate Amazon/S3.pm in @INC (you may need to
install the Amazon::S3 module)`
by priming the perl tests with installing the Amazon::S3 module from cpan
ex:
```
2025-06-23T19:18:40.162 INFO:tasks.workunit.client.0.smithi090.stderr:Can't locate Amazon/S3.pm in @INC (you may need to install the Amazon::S3 module) (@INC contains: /usr/local/lib64/perl5/5.32 ...
```
2. log an error when RGW process is not detected
Fixes: https://tracker.ceph.com/issues/71577
Signed-off-by: Mark Kogan <mkogan@redhat.com>
Yuval Lifshitz [Wed, 18 Jun 2025 12:11:46 +0000 (12:11 +0000)]
test/rgw/notifications: prevent client retries to avoid duplicates
if the RGW is slow and the client retries, it may cause the test to fail
since the number of notifications would be off.
in addition, with a slow RGW, we need to verify that the expiry time did
not pass before checking the queue, so we see the expected number of
entries in the queue before they expire.
Yuval Lifshitz [Wed, 18 Jun 2025 12:09:12 +0000 (12:09 +0000)]
rgw/notifications: stop processing when we reach a skipped notification
if a notification retry should be skipped, we should stop processing
all notifications. if we successfully process another notification
after it, it will not be removed (as we will remove only up to the
marker of the skipped notification). as a result, the successful
notification will be processed again.
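a minimal sketch of the stop-at-skip rule, assuming a batch is a list
of entries and that only a prefix of the batch can be removed. the
function and the "ok"/"skip" results are hypothetical and are not the
RGW code:

```python
from typing import Callable, List, Tuple


def process_batch(
    entries: List[str], handle: Callable[[str], str]
) -> Tuple[int, List[str]]:
    """Process entries in order until one must be skipped.

    `handle` returns "ok" or "skip". The result is the number of
    entries that can be removed from the front of the queue and the
    list of entries that were delivered.
    """
    delivered = []
    for index, entry in enumerate(entries):
        if handle(entry) == "skip":
            # Stop here: removal only covers entries before this
            # marker, so anything delivered after it would stay in
            # the queue and be delivered again.
            return index, delivered
        delivered.append(entry)
    return len(entries), delivered


result = process_batch(
    ["a", "b", "c"], lambda e: "skip" if e == "b" else "ok"
)
print(result)  # (1, ['a'])
```

here "c" is left untouched in the same pass as the skipped "b", so it
is delivered only once, on a later pass.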
Ronen Friedman [Wed, 25 Jun 2025 14:25:08 +0000 (09:25 -0500)]
osd/scrub: some perf counters priority was '0'
Some scrub perf counters were created without specifying
individual priorities, assuming by mistake that the
default priority is '_INTERESTING'. That was not the case,
and those perf counters were not reported.
Kefu Chai [Wed, 25 Jun 2025 13:51:04 +0000 (21:51 +0800)]
rgw: do not include unused header
previously, when building cls_rgw, we could have the following build
failure:
```
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/cls/rgw/cls_rgw_types.cc:4:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/cls/rgw/cls_rgw_types.h:15:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/rgw/rgw_basic_types.h:32:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/rgw/rgw_user_types.h:27:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/dout.h:29:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/ceph_context.h:41:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/config_proxy.h:7:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/config.h:28:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/config_values.h:59:
/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/options/legacy_config_opts.h:1:10: fatal error: 'global_legacy_options.h' file not found
1 | #include "global_legacy_options.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~
```
but it turned out that `cls_rgw_types.h` does not use `dout.h` at all.
so, in this change, we just drop this include. this helps to reduce
the build dependency.
Kefu Chai [Wed, 25 Jun 2025 04:14:36 +0000 (12:14 +0800)]
mgr/dashboard: Fix inline markup warning in API documentation
Remove trailing space from summary field that was causing Sphinx build
warning.
Sphinx was generating a warning due to malformed inline markup:
```
/home/kefu/dev/ceph/doc/mgr/ceph_api/index.rst:3349: WARNING: Inline strong start-string without end-string.
```
The openapi directive appears to convert trailing spaces into asterisk
markers, creating unterminated strong markup. This change removes the
trailing space to eliminate the warning and maintain consistency with
other entries in the file.
When the cluster needs to be read, the completion is posted to ASIO.
However, in the two special cases (cluster DNE and zero cluster), the
completion is completed inline at the moment. This violates invariants
and can eventually lead to a lockup. For example, in a scenario of
a read from a clone image whose parent is under migration:
io::ObjectReadRequest::read_parent()
io::util::read_parent()
< image_lock is taken for read >
io::ImageDispatchSpec::send()
migration::ImageDispatch::read()
migration::QCOWFormat::ReadRequest::send()
...
migration::QCOWFormat::ReadRequest::read_clusters()
< cluster DNE >
migration::QCOWFormat::ReadRequest::handle_read_clusters()
io::AioCompletion::complete()
io::ObjectReadRequest::copyup()
is_copy_on_read()
< image_lock is taken for read >
copyup() expects to be called with no locks held, but going through
QCOWFormat in the "cluster DNE" case essentially maintains image_lock
taken in read_parent() and then it's taken again by the same thread in
is_copy_on_read(). Under pthreads, it's not a problem:
A thread may hold multiple concurrent read locks on rwlock (that is,
successfully call the pthread_rwlock_rdlock() function n times). If
so, the thread must perform matching unlocks (that is, it must call
the pthread_rwlock_unlock() function n times).
But according to C++ standard it's undefined behavior:
If lock_shared is called by a thread that already owns the mutex in
any mode (exclusive or shared), the behavior is undefined.
Other, longer and more elaborate, call chains are possible too and
there it may end up being a write lock, a tripped assertion, etc. To
avoid this, make the special cases in read_clusters() behave the same
as the main path.
Zac Dover [Wed, 25 Jun 2025 09:19:49 +0000 (19:19 +1000)]
doc/radosgw: line edit bucket_logging.rst
Edit doc/radosgw/bucket_logging.rst so that it is not solecistic and so
that its punctuation is corrected and its use of articles is corrected.
This file remains in my judgment demotic and maybe demotic enough to
warrant another editorial pass in the future.
Venky Shankar [Wed, 25 Jun 2025 06:39:39 +0000 (12:09 +0530)]
Merge PR #59435 into main
* refs/pull/59435/head:
mgr/volumes: Fix json.loads for test on mon caps
mgr/volumes: Add test for mon caps if auth key has remaining mds/osd caps
mgr/volumes: Keep mon caps if auth key has remaining mds/osd caps
Add comprehensive documentation for defining configuration options in
ceph-mgr modules, including all supported properties and their usage.
Previously, the documentation did not explain how to define ceph-mgr
module configuration options, despite subtle differences from other Ceph
components. This change documents all supported Option properties, their
types, and provides clear examples to help module developers properly
configure their options.
Kefu Chai [Wed, 25 Jun 2025 03:02:46 +0000 (11:02 +0800)]
doc: do not depend on typed-ast
the typed-ast project was marked end of life in July 2023 and is
not maintained anymore. we build the document using readthedocs'
service, and in .readthedocs.yml we use python 3.9, which comes with
the ast module included in its standard library.
the typed-ast dependency was originally added in 30d41597, but now that
we are using python 3.9, there is no need to use this module anymore.
Kefu Chai [Wed, 25 Jun 2025 03:50:24 +0000 (11:50 +0800)]
doc/dev/config: Document how to use :confval: directive for config options
Add comprehensive guide for documenting configuration options using the
:confval: directive, including naming conventions and cross-referencing.
Previously, the documentation lacked guidance on using the :confval:
directive and the important distinction between regular config options
and mgr module options (which require the mgr/<module>/ namespace
prefix). This change provides detailed examples and best practices for
properly documenting and referencing both types of configuration options.