Leonid Chernin [Tue, 24 Jun 2025 13:00:49 +0000 (16:00 +0300)]
nvmeofgw: fixing GW delete issues
1.fixing the issue when gw is deleted based on invalid subsystem info
2. in function track_deleting_gws: break from loop only if
delete was really done
3. fix published rebalance index - publish ana-group instead of
index
4. do not dump gw-id string after gw was removed
Fixes: https://tracker.ceph.com/issues/71896 Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
Ronen Friedman [Wed, 25 Jun 2025 14:25:08 +0000 (09:25 -0500)]
osd/scrub: some perf counters priority was '0'
Some scrub perf counters were created without specifying
individual priorities, assuming by mistake that the
default priority is '_INTERESTING'. That was not the case,
and those perf counters were not reported.
Kefu Chai [Wed, 25 Jun 2025 13:51:04 +0000 (21:51 +0800)]
rgw: do not include unused header
previously, when building cls_rgw, we could have following build
failure:
```
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/cls/rgw/cls_rgw_types.cc:4:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/cls/rgw/cls_rgw_types.h:15:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/rgw/rgw_basic_types.h:32:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/rgw/rgw_user_types.h:27:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/dout.h:29:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/ceph_context.h:41:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/config_proxy.h:7:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/config.h:28:
In file included from /home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/config_values.h:59:
/home/jenkins-build/build/workspace/ceph-dashboard-pull-requests/src/common/options/legacy_config_opts.h:1:10: fatal error: 'global_legacy_options.h' file not found
1 | #include "global_legacy_options.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~
```
but it turned out that `cls_rgw_types.h` does not use `dout.h` at all.
so, in this change, we just drop this include. this helps to reduce
the build dependency.
Kefu Chai [Wed, 25 Jun 2025 04:14:36 +0000 (12:14 +0800)]
mgr/dashboard: Fix inline markup warning in API documentation
Remove trailing space from summary field that was causing Sphinx build
warning.
Sphinx was generating a warning due to malformed inline markup:
```
/home/kefu/dev/ceph/doc/mgr/ceph_api/index.rst:3349: WARNING: Inline strong start-string without end-string.`
```
The openapi directive appears to convert trailing spaces into asterisk
markers, creating unterminated strong markup. This change removes the
trailing space to eliminate the warning and maintain consistency with
other entries in the file.
When the cluster needs to be read, the completion is posted to ASIO.
However, in the two special cases (cluster DNE and zero cluster), the
completion is completed inline at the moment. This violates invariants
and can eventually lead to a lockup. For example, in a scenario of
a read from a clone image whose parent is under migration:
io::ObjectReadRequest::read_parent()
io::util::read_parent()
< image_lock is taken for read >
io::ImageDispatchSpec::send()
migration::ImageDispatch::read()
migration::QCOWFormat::ReadRequest::send()
...
migration::QCOWFormat::ReadRequest::read_clusters()
< cluster DNE >
migration::QCOWFormat::ReadRequest::handle_read_clusters()
io::AioCompletion::complete()
io::ObjectReadRequest::copyup()
is_copy_on_read()
< image_lock is taken for read >
copyup() expects to be called with no locks held, but going through
QCOWFormat in the "cluster DNE" case essentially maintains image_lock
taken in read_parent() and then it's taken again by the same thread in
is_copy_on_read(). Under pthreads, it's not a problem:
A thread may hold multiple concurrent read locks on rwlock (that is,
successfully call the pthread_rwlock_rdlock() function n times). If
so, the thread must perform matching unlocks (that is, it must call
the pthread_rwlock_unlock() function n times).
But according to C++ standard it's undefined behavior:
If lock_shared is called by a thread that already owns the mutex in
any mode (exclusive or shared), the behavior is undefined.
Other, longer and more elaborate, call chains are possible too and
there it may end up being a write lock, a tripped assertion, etc. To
avoid this, make the special cases in read_clusters() behave the same
as the main path.
Zac Dover [Wed, 25 Jun 2025 09:19:49 +0000 (19:19 +1000)]
doc/radosgw: line edit bucket_logging.rst
Edit doc/radosgw/bucket_logging.rst so that it is not solecistic and so
that its punctuation is corrected and its use of articles is corrected.
This file remains in my judgment demotic and maybe demotic enough to
warrant another editorial pass in the future.
Venky Shankar [Wed, 25 Jun 2025 06:39:39 +0000 (12:09 +0530)]
Merge PR #59435 into main
* refs/pull/59435/head:
mgr/volumes: Fix json.loads for test on mon caps
mgr/volumes: Add test for mon caps if auth key has remaining mds/osd caps
mgr/volumes: Keep mon caps if auth key has remaining mds/osd caps
Add comprehensive documentation for defining configuration options in
ceph-mgr modules, including all supported properties and their usage.
Previously, the documentation did not explain how to define ceph-mgr
module configuration options, despite subtle differences from other Ceph
components. This change documents all supported Option properties, their
types, and provides clear examples to help module developers properly
configure their options.
Kefu Chai [Wed, 25 Jun 2025 03:02:46 +0000 (11:02 +0800)]
doc: do not depend on typed-ast
the typed-ast project was marked end of life since July 2023, and
not maintained anymore. since we build the document using readthedocs'
service, and in .readtherdocs.yml we use python 3.9, which comes with
ast module included by its standard library.
the typed-ast dependency was originally added in 30d41597, but now that
we are using python 3.9, there is no need to use this module anymore.
Kefu Chai [Wed, 25 Jun 2025 03:50:24 +0000 (11:50 +0800)]
doc/dev/config: Document how to use :confval: directive for config options
Add comprehensive guide for documenting configuration options using the
:confval: directive, including naming conventions and cross-referencing.
Previously, the documentation lacked guidance on using the :confval:
directive and the important distinction between regular config options
and mgr module options (which require the mgr/<module>/ namespace
prefix). This change provides detailed examples and best practices for
properly documenting and referencing both types of configuration options.
Kefu Chai [Tue, 24 Jun 2025 14:38:13 +0000 (22:38 +0800)]
rbd: fix unused function warning when WITH_KRBD is disabled
Guard print_error_description() and get_unsupported_features() with
`#ifdef WITH_KRBD` to prevent compiler warnings when KRBD support is
not enabled.
These functions are only called by do_kernel_map(), which is itself
conditionally compiled. When WITH_KRBD is not defined, the compiler
generates unused function warnings for these helper functions.
Fixes warning:
```
/home/kefu/dev/ceph/src/tools/rbd/action/Kernel.cc:305:13: warning: ‘void rbd::action::kernel::print_error_description(const char*, const char*, const char*, const char*, int)’ defined but not used [-Wunused-function]
305 | static void print_error_description(const char *poolname,
| ^~~~~~~~~~~~~~~~~~~~~~~
```
Zac Dover [Mon, 23 Jun 2025 12:50:03 +0000 (22:50 +1000)]
doc/rados: clarify "upmap_max_deviation"
Clarify the threshold set by "upmap_max_deviation" and add the
information about this configurable that is currently in
src/pybind/mgr/balancer/module.py to src/common/options/global.yaml.in,
so that it will be accessible by means of ".. confval::" declarations.
John Mulligan [Thu, 13 Feb 2025 20:59:42 +0000 (15:59 -0500)]
ceph.spec.in: use rpm macro for python shebang pathfix
To support EL 10 distros, update the source of the pathfix tool (on EL
9+ distros) and use the macro for updating python shebangs that has been
available since at least EL 9.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Let PGShardManager::invoke_on_each_shard_seq pass the local shard_services
instance instead of using an additional helper.
The downside of dropping the generic sharded_map_seq helper is that it is
able to support *any* (seastar::)sharded object. However, as shard_services
is the only user of it - directly using the local instance without the
helper seems easier to read.
Matan Breizman [Sun, 22 Jun 2025 10:10:10 +0000 (10:10 +0000)]
crimson/common/smp_helpers: fix reactor_map_seq
Copy f into reactor_map_seq which would be kept alive
due to this method being a coroutine. That way, we can ensure
the lambdas passed to each core that are capturing f by
reference would be safe.
Alternatively, we can also copy f by using it's copy ctor and
pass a copy to each shard:
co_await crimson::submit_to(core, F(f))
However, avoiding the copy is possible here due to the sequential
traversal. Note, seastar's invoke_on_all do copy each callback to
every shard and is running the invocation in parallel.
The above would have fixed f's captures to be invalid and result
in a segfaults on diffrent shards.
Zac Dover [Mon, 23 Jun 2025 08:18:07 +0000 (18:18 +1000)]
doc/radosgw: remove "pubsub_event_lost"
Remove "pubsub_event_lost" from the list of "Notification Performance
Statistics" in doc/radosgw/notifications.rst. "pubsub_event_lost" is now
obsolete.
Shraddha Agrawal [Wed, 28 May 2025 05:56:26 +0000 (11:26 +0530)]
mon: add command osd pool clear-availability-status
This commit adds a new command to allow users to clear the
calculated availability score for a specified pool. This can be
done by issuing the command:
ceph osd pool clear-availability-status <pool_name>
Nitzan Mordechai [Wed, 21 May 2025 11:41:01 +0000 (11:41 +0000)]
src/mon/MgrStatMonitor: fix invalid iterator increment in calc_pool_availability()
Erasing entries from `pool_availability` inside a range-for
loop invalidated the hidden iterator, triggering an
“Invalid read” under Valgrind.
- Use `std::erase_if(pool_availability, predicate)` for
atomic removal.
- Refactor the stats-update loop to use structured bindings
and a clear `++it` for readability.