osd/scrub: limit scrubbing under snap-trimming overload
When the snap-trim queues are long, scrubbing is likely to
make things worse. This change adds a new scrubbing restriction
for that case, and prevents periodic scrubs from starting when
the total snap-trim queue length across all PGs exceeds a
configurable threshold.
mgr/DaemonServer: auto-tune stats period when message queue gets backed up
The mgr can get overwhelmed when there's a lot of cluster activity and
daemons are sending stats reports faster than we can process them.
This commit adds logic to monitor the messenger queue depth and bump
up mgr_stats_period when things get congested. This reduces the
frequency of daemon stat reports, allowing the mgr to process existing
reports without being overwhelmed by new ones. The period automatically
scales back down when the queue clears up.
Added mgr_stats_period_autotune (on by default) and a queue threshold
setting. Recovery happens automatically when the queue clears up.
Max period is capped at 60 seconds to prevent excessive stat delays.
Kefu Chai [Tue, 19 May 2026 12:58:10 +0000 (20:58 +0800)]
debian/rules: strip ceph-osd-classic and ceph-osd-crimson
override_dh_strip enumerates each binary package explicitly. It was not
updated when ceph-osd was split into the ceph-osd-classic and
ceph-osd-crimson implementation packages, so the OSD binaries in those
two packages are shipped unstripped (ceph-osd-crimson installs at ~4.6
GiB) and their -dbg packages are left empty.
Add the missing dh_strip invocations so the OSD binaries are stripped
and their debug symbols land in the corresponding -dbg packages, as is
already done for every other binary package.
Bill Scales [Tue, 19 May 2026 06:05:13 +0000 (07:05 +0100)]
doc/dev/internals: Improve Ceph Internals TOC
The Ceph internals section of the docs is a bit of a mess
as far as the table of contents is concerned. This commit
tries to add a bit more structure grouping topics by
area and trying to arrange them in a more logical order.
Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
rgw/dedup: add --allow/deny-bucket-list and --allow/deny-storage-class-list to dedup commands
Resolves: bz#2413730 Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
Patrick Donnelly [Mon, 18 May 2026 14:20:08 +0000 (10:20 -0400)]
Merge PR #68937 into main
* refs/pull/68937/head:
.github/workflows/releng-audit: group events to serialize executions
.github/workflows/releng-audit: remove override on reopen
.github/workflows/releng-audit: refactor auth check to function
Afreen Misbah [Mon, 18 May 2026 10:01:58 +0000 (15:31 +0530)]
mgr/dashboard: fix logs e2e tests after carbonization
Update e2e test selectors to match the new Carbon component structure.
The .card-body and .message classes were replaced with .log-viewer
and .log-entry__message after carbonizing the logs component.
Assisted-by: Claude Signed-off-by: Afreen Misbah <afreen@ibm.com>
Afreen Misbah [Sun, 17 May 2026 14:53:54 +0000 (20:23 +0530)]
mgr/dashboard: Carbonize upgrade page
- Made cluster status clickable to navigate to overview when not HEALTH_OK
- Replaced Bootstrap classes with Carbon design tokens
- Updated upgrade.component.scss to use CSS custom properties
Assisted-by: Claude Signed-off-by: Afreen Misbah <afreenmisbah@ibm.com>
Afreen Misbah [Tue, 12 May 2026 12:07:39 +0000 (17:37 +0530)]
mgr/dashboard: Fix edit and delete access for pool-manager role
Fixes https://tracker.ceph.com/issues/76561
- allows deleting pools in pool-manager role by bypassing config-opt read permissions
- allows editing in pool-manager role which failing deu to misisng rbd mirroring permissions
- fixes a bug with pool edit mode where when both compression and name are edited it fails due to an if-else logic bug
Kefu Chai [Wed, 6 May 2026 02:08:20 +0000 (10:08 +0800)]
cmake/BuildISAL: build and install library targets only
Skip building the igzip executables; Ceph only needs libisal.la.
This should speed up the build a little bit, as we don't build the
executables previous built with "make"
Shai Fultheim [Sat, 16 May 2026 20:17:59 +0000 (23:17 +0300)]
crimson/os/seastore: fix cleaner space leak from shadowed result list
TransactionManager::get_extents_if_live() declared an inner
std::list<CachedExtentRef> res inside the "extent is cached" branch
that shadowed the outer res returned by the coroutine. When the
queried extent was present in the cache, it was moved into the inner
list and immediately discarded, and the empty outer list was returned
to the caller.
The async cleaner uses this result to decide whether to rewrite an
extent or treat it as dead. For recently-allocated LBA tree internal
nodes (still hot in cache), the shadowed return caused the cleaner to
skip them, so mark_space_free() never paired with the earlier
mark_space_used(). Each affected reclaim leaked exactly one extent
(4 KiB for LADDR_INTERNAL), tripping the live_bytes != 0 assertion in
SegmentCleaner::clean_space() (async_cleaner.cc:1441) once a victim
segment with such a leftover was selected.
The reproducer (at ~70% full) deterministically aborted within ~3
minutes before this fix; with the fix the OSDs run cleanly past the
trigger point.
Kefu Chai [Sat, 16 May 2026 02:53:41 +0000 (10:53 +0800)]
doc/dev: refresh vstart.sh options in dev_cluster_deployment
Bring doc/dev/dev_cluster_deployment.rst back in line with the current
src/vstart.sh:
* drop the removed -K/--kstore objectstore backend
* drop -N/--not-new, which was dropped in 8dd2e418; reusing the existing
cluster config is simply the default when -n is not given
* correct the --rgw_frontend default from civetweb to beast
* note that -b/--bluestore is the default objectstore backend
* update the example and add a note that a fresh build needs -n on the
first run, while later runs can omit it
* note that the option list is not exhaustive and point at src/vstart.sh
Matthew N. Heler [Fri, 15 May 2026 11:11:35 +0000 (06:11 -0500)]
doc/rados/configuration: recommend wpq for EC clusters seeing slow ops
On large EC clusters, mClock currently routes recovery EC sub-reads
through the immediate queue, skipping throttling. When many OSDs read
from one source during recovery, that source's high-priority queue
saturates and starves client work, producing slow ops. Recommend
falling back to wpq in the mClock config reference until the
scheduler treats those reads as background.
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
Ronen Friedman [Wed, 13 May 2026 08:06:17 +0000 (08:06 +0000)]
qa/crimson: disable test-coredump
This test is there to aid in debugging coredumps creation and collection.
It always fails (as it intentionally leaves the coredump in place for collection).
Disabling it in the suite to avoid noise, but leaving the test in place for manual
runs when needed.
Ronen Friedman [Sun, 10 May 2026 13:16:02 +0000 (13:16 +0000)]
qa/crimson: add coredump generation test using ASOK assert
Trigger a crash on a Crimson OSD via the admin socket 'assert'
command and verify the OSD goes down and a coredump is produced.
Exercises the debug_asok_assert_abort path added in the companion
commit.
rgw: group lifecycle versioned deletes to reduce OLH contention
When multiple versions of the same key expire together, each delete
does a read-modify-write of the OLH on the same bucket index shard.
Buffer versions of the same key during listing and flush on key change.
Groups with multiple versions get pre-evaluated, then hard deletes go
through rgw::multi_delete::dispatch() which skips OLH updates on all
but the last delete. LCOpRule::process() is split into evaluate() and
execute() to support this two-phase pattern.
Non-versioned buckets and single-version groups are unchanged.
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
mgr/dashboard: adding daemon_name as an arg to nvmeof get bundle API
When cephadm-signed are in use, we know to know exacly which nvmeof daemon is
being used so we get the correct certificates for this daemon in
particular
rgw: extract multi-delete OLH grouping for use by lifecycle
Move the OLH-aware dispatch logic out of RGWDeleteMultiObj into a
standalone rgw::multi_delete::dispatch() so lifecycle expiration
can group versioned deletes of the same key and skip redundant
OLH updates.
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>