Adam C. Emerson [Fri, 18 Apr 2025 07:27:36 +0000 (03:27 -0400)]
rgw: Add run_coro utility
A convenience function for turning coroutines that return values and
use exceptions, `error_code`, or similar into `int`-returning
functions that take references to out parameters.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
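As a rough illustration of the shape of such a wrapper, here is a simplified synchronous sketch. It is not the actual `run_coro` code, which works with Asio coroutines. The name `run_to_int`, the callable-based signature, and the errno mapping are all assumptions made for this example.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <system_error>
#include <utility>

// Hypothetical helper, not the real run_coro: invoke a value-returning
// callable, store its result in `out`, and map exceptions to negative
// errno-style integers.
template <typename F, typename T>
int run_to_int(F&& f, T& out) noexcept {
  try {
    out = std::forward<F>(f)();
    return 0;
  } catch (const std::system_error& e) {
    // Assumes the error code carries a positive errno-like value.
    return -e.code().value();
  } catch (...) {
    return -EIO;  // Assumed fallback for unknown failures.
  }
}

// Example usage of the hypothetical helper.
inline void run_to_int_example() {
  std::string name;
  int r = run_to_int([] { return std::string("bucket"); }, name);
  assert(r == 0 && name == "bucket");

  r = run_to_int([]() -> std::string {
    throw std::system_error(ENOENT, std::generic_category());
  }, name);
  assert(r == -ENOENT);
}
```

The out parameter is only written on success, so callers can keep the usual `if (r < 0) return r;` pattern.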
Adam C. Emerson [Wed, 6 Aug 2025 20:02:32 +0000 (16:02 -0400)]
common/async: Update `use_blocked` for newer asio
Reimplement with `initiate` rather than the old style. This
necessitates getting rid of the old `async::Completion` in anything
that was calling it, and other changes.
Also, use disposition for error handling.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Ronen Friedman [Wed, 6 Aug 2025 05:38:07 +0000 (00:38 -0500)]
osd/scrub: do not limit operator-initiated repairs
'auto-repair' scrubs are limited to a maximum of
'scrub_auto_repair_num_errors' damaged objects.
However, operator-initiated repairs should not be limited
by that number. Alas, a bug in a previous commit
(97de817ad1c253ee1c7c9c9302981ad2435301b9) modified the
code in such a way that it applied the
'scrub_auto_repair_num_errors' limit to all repairs,
including operator-initiated ones. This commit fixes that.
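The intended rule can be sketched as follows. This is a simplified model, not the OSD scrub code: the `RepairRequest` struct and `should_repair` function are hypothetical names, and the limit parameter merely stands in for 'scrub_auto_repair_num_errors'.

```cpp
#include <cassert>

// Hypothetical, simplified model of the decision described above; the
// names are illustrative and not taken from the OSD code.
struct RepairRequest {
  bool operator_initiated;   // true for an explicit operator-requested repair
  unsigned damaged_objects;  // number of damaged objects found by the scrub
};

// Returns true if the repair should proceed. Only auto-repair scrubs are
// subject to the limit.
inline bool should_repair(const RepairRequest& req, unsigned auto_repair_limit) {
  if (req.operator_initiated) {
    return true;  // Operator-initiated repairs are never limited.
  }
  return req.damaged_objects <= auto_repair_limit;
}

// Example usage with an assumed limit of 5 damaged objects.
inline void should_repair_example() {
  assert(should_repair({true, 100}, 5));    // operator repair: not limited
  assert(!should_repair({false, 100}, 5));  // auto-repair: over the limit
  assert(should_repair({false, 5}, 5));     // auto-repair: at the limit
}
```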
Adam C. Emerson [Mon, 30 Jun 2025 20:54:46 +0000 (16:54 -0400)]
rgw/datalog: Manage and shutdown tasks properly
This is slightly ugly but good enough for now. Make sure we can block
when shutting down background tasks.
Remove a few `driver` parameters that are unused. This lets us
simplify the IAM Policy and Lua tests and not construct stores we
never use. (Which is good since we aren't running them under a cluster.)
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Adam C. Emerson [Fri, 30 May 2025 20:54:45 +0000 (16:54 -0400)]
neorados: Hold reference to implementation across operations
Asynchrony combined with cancellations keeps leading to occasional
lifetime issues, so follow the best practices of Asio I/O objects by
having completions keep a reference live.
The original NeoRados backing implements Asio's two-phase shutdown
properly.
The RadosClient backing does not, because it shares an Objecter with
completions that do not belong to it. In practice I don't think this
will matter since librados and neorados get shut down around the same
time.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
* kafka: pass full broker list to consumer in tests
* kafka: use ip instead of localhost
* kafka: make sure topic exists before consumer start
* kafka: fix zookeeper and broker conf in tests
* kafka: verify receiver in the test
* kafka: tests were not running (Fixes: https://tracker.ceph.com/issues/72240)
* kafka: failover tests were failing (Fixes: https://tracker.ceph.com/issues/71585)
* simplify basic tests run command
* v2 migration tests were not running
* fix failing migration tests
Bill Scales [Fri, 1 Aug 2025 15:17:58 +0000 (16:17 +0100)]
doc: erasure coding enhancements for tentacle
* Document new pool flag allow_ec_optimizations
* Reference new conf setting osd_pool_default_flag_ec_optimizations
* Add section describing Erasure Code Optimizations
Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
Ville Ojamo [Tue, 5 Aug 2025 15:34:26 +0000 (22:34 +0700)]
doc/rados: Remove obsolete fs-recomm links
2 files linked to filesystem-recommendations.rst which was removed
around the year 2017.
I understand this was relevant only for Filestore. So simply remove the
references to this file & the link definition if one was used.
Ville Ojamo [Tue, 5 Aug 2025 14:45:05 +0000 (21:45 +0700)]
doc/rados: Use ref instead of relative external links
Instead of external links use :ref: where dst labels exist already in:
operations/erasure-code.rst
operations/pools.rst
troubleshooting/troubleshooting-osd.rst
Use link text generation where it is reasonably close to previous manual
link text.
Delete some unused link definitions.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Zac Dover [Tue, 5 Aug 2025 11:24:41 +0000 (21:24 +1000)]
doc/cephfs: edit troubleshooting.rst
Edit "Stuck in up:replay" under the "Stuck During Recovery" section of
doc/cephfs/troubleshooting.rst. I had planned to edit the entire "Stuck
During Recovery" section in a single commit, but I think that the
material is too involved for that.
Ville Ojamo [Tue, 5 Aug 2025 06:08:09 +0000 (13:08 +0700)]
doc/radosgw: Small improvements in kmip.rst
Major rewrite of the last section that is a copypasta from vault.rst:
- "engines" are relevant only to Hashicorp Vault and not KMIP
- leave only 1 copy of the 2 identical CLI examples
- talk about KMIP and not Vault, it is an alternative to Vault
Also change other mentions of "Vault" to "KMIP".
Auto-generate contents list instead of hardcoding it.
Use '=' in ceph.conf example, I believe ':' cannot be used.
Capitalize "Ceph", "Python", "KMIP", "OpenSSL", "PyKMIP" consistently.
Call it consistently "Ceph Object Gateway".
Format "pykmip" in italic when referring to the binary.
Hyphen in "PEM-encoded" along with capitalization.
Use double backticks for data.
Spell out a lonesome number "1" in text.
Fix typo "correspondent" to "corresponding".
Promptify CLI commands.
Use title case consistently in section titles.
Linkify mention of Ceph configuration file.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ville Ojamo [Tue, 5 Aug 2025 05:14:13 +0000 (12:14 +0700)]
doc/radosgw: Small improvements in s3-notification-compatibility.rst
Attempt a small fix to a grammatical error in a sentence.
It should also refer to "below" and not "above", probably.
End full sentences with full stops.
Indent an unordered list consistently so that it renders consistently
with the same bullets.
Also indent the various blocks to the same column consistently.
Wrap lines before column 80 while we're at it.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ville Ojamo [Mon, 12 May 2025 09:01:44 +0000 (16:01 +0700)]
doc/radosgw: Use ref for hyperlinks, 2nd batch
Use validated ":ref:" hyperlinks instead of "external links" in "target
definitions" when linking within the Ceph docs:
- Add a label at beginning of referenced files if missing.
- Remove unused "target definitions".
- Updated links targeting files: compression encryption keystone
Cleaned hyperlinks usage in kmip.rst:
- Some links were using anonymous links (double underscore) unnecessarily.
- Some links were not using backticks, add for consistency.
- Move anonymous link definition to after the ordered list to avoid
unnecessary empty line between list items.
Use an already existing label for 2 intra-docs links that used full URLs.
Use an already existing label for intra-docs link instead of a file name
reference in s3/authentication.rst.
The rendered PR should look the same as the old docs, only differing in
the source RST.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ville Ojamo [Mon, 4 Aug 2025 06:06:38 +0000 (13:06 +0700)]
doc/cephadm: Small improvements in services/tracing.rst
Use ref instead of a full URL link and add label for it in
doc/jaegertracing/index.rst.
Capitalize "Ceph", "Jaeger", "ElasticSearch" consistently.
Start sentences with capital case consistently.
Fix a typo.
Wrap lines a bit before column 80.
Use an ordered list instead of hardcoding list numbers in separate
paragraphs.
Don't use ordered list for items that do not both fit under the text
paragraph introducing the list.
Rewrite the sentences to be more consistent and hopefully more correct.
Add articles that I believe should be there, also for consistency with
the previous paragraph.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Naman Munet [Tue, 22 Jul 2025 17:08:42 +0000 (22:38 +0530)]
mgr/dashboard: user accounts enhancements
fixes: https://tracker.ceph.com/issues/72072
PR covers:
1) Display the account name instead of the account ID in the bucket list page and bucket edit form for account-owned buckets.
2) Non-root account users can now be assigned managed policies, which allow them to perform operations.
3) The root user indication is moved next to the username in the users list, with a new icon, rather than being shown on the Account Name.
Alex Ainscow [Mon, 14 Jul 2025 15:55:40 +0000 (16:55 +0100)]
osd: Optimised EC avoids ever reading more than K shards (if plugin supports it).
Plugins which support partial reads should never need more than k shards
to read the data, even if some shards have failed. However, rebalancing commonly
requests k + m shards, as very frequently all shards are moved. If this occurs
and all k + m shards are online, the read will be achieved by reading ALL shards
rather than just reading k shards. This commit fixes that issue.
The problem is that we don't want to change the API to the old EC, so we cannot
update the plugin behaviour here. Instead, the EC code itself will reduce
the number of shards it tells minimum_to_decode about.
In a comment we note that bitset_set performance could be improved using _pdep_u64.
This would require fiddly platform-specific code and would likely not show
any performance improvements for most applications. The majority of the calls to
this function will be with a bitset that has <=n set bits and will never enter this
if statement. When there are >n bits set we are going to save one or more read I/Os,
the cost of the for loop is insignificant vs this saving. I have left the comment
in as a hint to future users of this function.
Further notes were made in a review comment that are worth recording:
- If performance is limited by the drives, then fewer read I/Os is a clear advantage.
- If performance is limited by the network then fewer remote read I/Os is a clear advantage.
- If performance is limited by the CPU then the CPU cost of M unnecessary remote
read I/Os (messenger+bluestore) is almost certainly more than the cost of doing an
extra encode operation to calculate the coding parities.
- If performance is limited by system memory bandwidth the encode+crc generation
has less overhead than the read+bluestore crc check+messenger overheads.
Longer term this logic should probably be pushed into the plugins, in particular
to give LRC the opportunity to optimize for locality of the shards. The reason for
not doing this now is that it would be messy because the legacy EC code cannot
support this optimization and LRC isn't yet optimizing for locality.
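The shard reduction described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: `limit_to_k_shards`, `ShardSet`, and the 64-shard bound are hypothetical, and keeping the lowest-numbered shards is an assumption about which shards to prefer.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxShards = 64;  // Assumed upper bound for the sketch.
using ShardSet = std::bitset<kMaxShards>;

// Hypothetical helper: keep only the k lowest-numbered set bits of `want`.
// Sets with k or fewer bits are returned unchanged without entering the loop.
inline ShardSet limit_to_k_shards(const ShardSet& want, std::size_t k) {
  if (want.count() <= k) {
    return want;
  }
  ShardSet out;
  std::size_t kept = 0;
  for (std::size_t i = 0; i < kMaxShards && kept < k; ++i) {
    if (want.test(i)) {
      out.set(i);
      ++kept;
    }
  }
  return out;
}

// Example usage: k = 4 data shards plus m = 2 coding shards, all online.
inline void limit_to_k_shards_example() {
  ShardSet all;
  for (std::size_t i = 0; i < 6; ++i) {
    all.set(i);
  }
  ShardSet reads = limit_to_k_shards(all, 4);
  assert(reads.count() == 4);
}
```

In the common case the early return is taken, so the loop cost is only paid when one or more read I/Os are about to be saved.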
Incorporate into doc/cephfs/ceph-dokan.rst the suggestions made by
Anthony D'Atri in https://github.com/ceph/ceph/pull/64737, and make a
few other small improvements to the English language in that file.
test/rbd-mirror: eliminate a race in ResyncRequestedRemoteNotPrimary
Adjust the wait_for_notification call in TestMockImageReplayerSnapshotReplayer.ResyncRequestedRemoteNotPrimary
to expect 2 notifications instead of 1. This allows the test to correctly wait for both expected events
i.e., for finish_sync() and handle_replay_complete(locker, -EREMOTEIO, "remote image demoted"), ensuring the
replayer transitions to STATE_COMPLETE and is_replaying() returns false as intended.
Max Kellermann [Fri, 11 Oct 2024 22:35:13 +0000 (00:35 +0200)]
mds/MDSDaemon: unlock `mds_lock` while shutting down Beacon and others
This fixes a deadlock bug during MDS shutdown:
- the "signal_handler" thread receives the shutdown signal and invokes
MDSDaemon::suicide() while holding `mds_lock`
- MDSDaemon::suicide() invokes Beacon::send_and_wait() while still
holding `mds_lock`
- meanwhile, all "ms_dispatch" threads get stuck waiting for
`mds_lock`, for example in MDCache::upkeep_main() or
MDSDaemon::ms_dispatch2()
- Beacon::send_and_wait() waits for a `MSG_MDS_BEACON` packet to be
dispatched (via `cvar` with a timeout)
At this point, even if a `MSG_MDS_BEACON` packet is received by one of
the worker threads, they will put it in the `DispatchQueue`, but no
dispatcher thread will be able to handle it because they are all
stuck. The cvar.wait_for() call in Beacon::send_and_wait() will
therefore time out and the `MSG_MDS_BEACON` will never be processed.
The proper solution is to unlock `mds_lock` to prevent the dispatchers
from getting stuck. In general, we should hold a lock only when it is
strictly needed and never make blocking calls while holding it.
Fixes: https://tracker.ceph.com/issues/68760
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
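The locking pattern can be illustrated with a minimal standard-library sketch. This is not the MDS code: the `Daemon` struct and its members are hypothetical stand-ins, with `big_lock` playing the role of `mds_lock` and `dispatch_ack()` the role of a dispatcher handling the beacon reply.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a daemon with one coarse lock and a beacon
// that waits for an acknowledgement.
struct Daemon {
  std::mutex big_lock;             // coarse lock needed by dispatchers
  std::mutex ack_lock;             // protects `acked`
  std::condition_variable ack_cv;
  bool acked = false;
  bool stopping = false;

  // Dispatcher path: must take big_lock before it can deliver the ack.
  void dispatch_ack() {
    std::lock_guard<std::mutex> l(big_lock);
    {
      std::lock_guard<std::mutex> a(ack_lock);
      acked = true;
    }
    ack_cv.notify_all();
  }

  // Shutdown path: update state under big_lock, then release it before
  // the blocking wait so that dispatchers can make progress.
  bool shutdown(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> l(big_lock);
    stopping = true;
    l.unlock();  // Do not block while holding the coarse lock.

    std::unique_lock<std::mutex> a(ack_lock);
    return ack_cv.wait_for(a, timeout, [this] { return acked; });
  }
};

// Example usage: the ack is delivered by another thread during shutdown.
inline void shutdown_example() {
  Daemon d;
  std::thread dispatcher([&d] { d.dispatch_ack(); });
  bool ok = d.shutdown(std::chrono::seconds(5));
  dispatcher.join();
  assert(ok);
}
```

If `shutdown()` kept `big_lock` held across the `wait_for()` call, `dispatch_ack()` could never run and the wait would always time out, which mirrors the deadlock described above.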
Ville Ojamo [Fri, 18 Jul 2025 05:53:08 +0000 (12:53 +0700)]
doc/radosgw: Simple fixes and improvements, links improvements
Fix table with a column separator problem in s3/bucketops.rst.
Remove whitespaces at end of lines in s3/bucketops.rst.
Linkify mention of multizone into multisite.rst in bucket_logging.rst.
Separate units from numbers with a space in bucket_logging.rst.
Consistency in capitalization and full stop usage in table data in
s3-notification-compatibility.rst s3/bucketops.rst.
Use ref for intra-docs link instead of "external links" feature in
s3/bucketops.rst notifications.rst s3.rst, add a label in start of
s3-notification-compatibility.rst for it. Follow label format that seems
to be in the majority.
Use auto-generated link text that ref provides.
Reflow the text in the cell. Extend table syntax width to accommodate
longer text in cell.
Use ref similarly on links to s3/bucketops.rst. Add a label in it and
use it from bucket_logging.rst and notifications.rst.
Delete unused external link definition in s3/bucketops.rst.
Remove multiple whitespace at the end of lines in notifications.rst
s3-notification-compatibility.rst bucketops.rst.
Change tab characters to spaces in indentation in bucketops.rst.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>