Sun Yuechi [Wed, 10 Jun 2026 00:13:53 +0000 (08:13 +0800)]
cmake: disable Catch2 tests when Catch2 is unavailable
debhelper on noble passes -DFETCHCONTENT_FULLY_DISCONNECTED=ON, so CPM
cannot fetch Catch2 and silently skips it, leaving no Catch2 targets
behind and breaking the generate step. Fall back to WITH_CATCH2=OFF
with a warning instead.
dheart [Tue, 9 Jun 2026 13:27:14 +0000 (21:27 +0800)]
os/bluestore: prevent reallocation and corruption when shared_blob key is missing/undecodable
When the shared_blob key is missing or fails to decode,
it is necessary to scan the blob's pextents directly as the sole authoritative source
to verify allocated blocks and prevent double-allocation.
Emmanuel Ameh [Tue, 9 Jun 2026 12:40:03 +0000 (13:40 +0100)]
doc/man: Remove stale EOL release names from deprecation notices
ceph.rst: "osd create" deprecation notice cited "the Luminous release"
(2017, EOL 2020). Update to a plain deprecation statement directing
users to the replacement command (osd new).
rbd.rst: cephx_require_signatures option deprecation cited "the Bobtail
release" (2013, EOL 2015) as context for why the option is deprecated.
Remove the EOL release name; retain the deprecation warning. Fix the
companion nocephx_require_signatures notice for consistency ("in a
future release" instead of "in the future").
Reviewed-by: Joseph Mundackal <jmundackal@bloomberg.net>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Adam Kupczyk [Tue, 1 Jul 2025 13:47:14 +0000 (13:47 +0000)]
os/bluestore: Add new onode recovery method
Added read_allocation_from_onodes_mt function
(originally copied from read_allocation_from_onodes).
Added Decoder_AllocationsAndStatFS class
(originally copied from ExtentDecoderPartial).
There are significant differences from originals:
- shared blobs are not scanned at all
- to not account allocations more than once,
collisions are detected on SimpleBitmap level;
only the first onode referencing shared blob will mark allocation
- Blobs are not preserved
- instead we remember only if blob or spanning blob was compressed
The underlying goal is to make recovery faster and to prepare for a
multithreaded refactor.
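The collision handling described above can be pictured with a minimal Python sketch. It is not the BlueStore code: SimpleBitmapSketch, account_onodes, and the (offset, length) extent tuples are hypothetical stand-ins, and the bitmap is modelled as one byte per allocation unit for readability.

```python
class SimpleBitmapSketch:
    """Hypothetical stand-in for a bitmap with one slot per allocation unit."""

    def __init__(self, num_units: int) -> None:
        self.bits = bytearray(num_units)  # one byte per unit, for clarity only

    def mark(self, offset: int, length: int) -> int:
        """Mark [offset, offset + length); return how many units were newly set."""
        newly_set = 0
        for unit in range(offset, offset + length):
            if not self.bits[unit]:
                self.bits[unit] = 1
                newly_set += 1
        return newly_set


def account_onodes(onodes, num_units: int) -> int:
    """Sum allocated units, counting each unit only for the first onode that marks it."""
    bitmap = SimpleBitmapSketch(num_units)
    allocated = 0
    for extents in onodes:
        for offset, length in extents:
            allocated += bitmap.mark(offset, length)
    return allocated


# Two onodes reference the same extent (4, 4); only the first one accounts for it.
onodes = [[(0, 2), (4, 4)], [(4, 4), (10, 1)]]
total_units = account_onodes(onodes, num_units=16)
```

Because the second reference to the shared extent sets no new bits, it adds nothing to the total, which is how shared blobs can be skipped without double accounting.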
Adam Kupczyk [Fri, 4 Jul 2025 16:28:16 +0000 (16:28 +0000)]
os/bluestore: Rework on decoding
Refactored ExtentDecoder.
Introduced decode_create_blob method to it.
Converted bluestore_blob_t::decode and Blob::decode methods into templates.
Created a clear example path for specializing these and other decoders.
mds: fix shutdown hang when ephemeral pins active and max_mds is 0
During shutdown, `ceph fs set <fs> down true` sets max_mds to 0 before
the MDS daemons have finished exporting their subtrees. shutdown_pass()
iterates over auth subtrees and skips any dir whose inode is
ephemerally pinned, expecting handle_export_pins() to re-place them.
However, handle_export_pins() calls hash_into_rank_bucket() which (after
the companion fix) now returns MDS_RANK_NONE when max_mds == 0. With
no valid target rank the export is never scheduled, so the ephemerally-
pinned dirs are skipped by shutdown_pass() indefinitely and the daemon
loops.
mds: fix crash in hash_into_rank_bucket() when max_mds is 0
When a CephFS cluster is paused (e.g. via `ceph fs set <fs> down true`
or `ceph fs pause`) the MDS map's max_mds is set to 0. Any subsequent
call to hash_into_rank_bucket() with max_mds == 0 triggers a crash:
the jump-consistent-hash loop never executes (j starts at 0, condition
j < max_mds is immediately false), leaving b = -1, so the final
assert(result >= 0 && result < max_mds) aborts the daemon.
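The failure mode can be reproduced with a small Python sketch of the published jump-consistent-hash loop. This is an illustration under assumptions, not the MDS source: the multiplier constant and the function names are taken from the generic algorithm, and None stands in for MDS_RANK_NONE.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wrap-around


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Generic jump consistent hash; returns -1 when num_buckets is 0."""
    b, j = -1, 0
    while j < num_buckets:  # never entered when num_buckets == 0
        b = j
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def hash_into_rank_bucket_sketch(ino: int, max_mds: int):
    """Hypothetical guarded wrapper: bail out before the range assertion."""
    if max_mds <= 0:
        return None  # stand-in for MDS_RANK_NONE
    result = jump_consistent_hash(ino, max_mds)
    assert 0 <= result < max_mds
    return result


unguarded = jump_consistent_hash(12345, 0)        # loop skipped, b stays -1
guarded = hash_into_rank_bucket_sketch(12345, 0)  # early return, no assert
rank = hash_into_rank_bucket_sketch(12345, 4)     # some rank in [0, 4)
```

With zero buckets the unguarded loop returns -1, which is exactly the value the range assertion rejects; the guard returns before that assertion is reached.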
This commit enables ceph-osd-crimson and ceph-osd-crimson-dbg
packages for debian builds which have gcc version 13 or above.
This is a first step toward adding noble to the supported distros
for crimson.
Kefu Chai [Sun, 7 Jun 2026 08:58:20 +0000 (16:58 +0800)]
mgr/dashboard: don't mutate the cached osd_map in CephService
test_pool_list fails intermittently:
Traceback (most recent call last):
File "qa/tasks/mgr/dashboard/test_pool.py", line 182, in test_pool_list
self.assertNotIn('pg_status', pool)
AssertionError: 'pg_status' unexpectedly found in
{'pool': 1, 'pool_name': 'rbd', ..., 'pg_status': {'active+clean': 1}, ...}
mgr.get('osd_map') defaults to mutable=False, so cacheable_get_python()
returns the mgr's shared cached object rather than a copy.
get_pool_list_with_stats() writes pool['pg_status'] and pool['stats']
into those cached dicts, and get_erasure_code_profiles() sets ecp['name']
and rewrites ecp['k']/['m'] to int. The writes outlive the request, so
once a stats=true call has run, GET /api/pool with stats=false still
returns pools carrying pg_status and the assertion above fails. It only
triggers while the cache stays valid between the two requests, hence the
flakiness.
Audited the other dashboard readers of cached mgr.get() keys: these two
are the only sites that mutate the result; the rest only read, and
health.py already copies its osd_map before editing.
Copy the dicts before stamping them; the cache stays clean.
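A minimal Python sketch of the bug and the fix, assuming a plain dict as a stand-in for the mgr cache. make_cache, pool_list_mutating, and pool_list_copying are hypothetical names, not the dashboard's functions.

```python
def make_cache():
    """Hypothetical stand-in for the mgr's shared cached osd_map."""
    return {"pools": [{"pool": 1, "pool_name": "rbd"}]}


def pool_list_mutating(cache, stats: bool):
    """Buggy shape: stamps stats directly into the cached dicts."""
    pools = cache["pools"]
    if stats:
        for pool in pools:
            pool["pg_status"] = {"active+clean": 1}  # outlives the request
    return pools


def pool_list_copying(cache, stats: bool):
    """Fixed shape: copy each pool dict before stamping it."""
    pools = [dict(pool) for pool in cache["pools"]]  # shallow copy per pool
    if stats:
        for pool in pools:
            pool["pg_status"] = {"active+clean": 1}
    return pools


buggy_cache = make_cache()
pool_list_mutating(buggy_cache, stats=True)
leaked = "pg_status" in pool_list_mutating(buggy_cache, stats=False)[0]

fixed_cache = make_cache()
pool_list_copying(fixed_cache, stats=True)
clean = "pg_status" not in pool_list_copying(fixed_cache, stats=False)[0]
```

A shallow copy is enough in this sketch because only new top-level keys are added. Code that edits nested values in place would need a deeper copy.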
Sun Yuechi [Sat, 6 Jun 2026 09:44:57 +0000 (17:44 +0800)]
Dockerfile.build: fetch sccache on riscv64
sccache ships a riscv64 release artifact since v0.13.0, published under the
riscv64gc target triple. Map uname -m "riscv64" to that asset name so the
download resolves on riscv64 instead of being skipped.
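The mapping can be sketched in Python as below. The actual change lives in the Dockerfile; ARCH_ALIASES and sccache_arch are hypothetical names, and passing other architectures through unchanged is an assumption. Only the riscv64 to riscv64gc entry comes from the description above.

```python
# Only this entry is taken from the commit message; the table is illustrative.
ARCH_ALIASES = {"riscv64": "riscv64gc"}


def sccache_arch(machine: str) -> str:
    """Translate a `uname -m` value into the arch part of the release asset name."""
    # Assumption: architectures without an alias use their uname value as-is.
    return ARCH_ALIASES.get(machine, machine)


riscv_arch = sccache_arch("riscv64")
other_arch = sccache_arch("x86_64")
```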
Sun Yuechi [Sat, 6 Jun 2026 09:44:33 +0000 (17:44 +0800)]
Dockerfile.build: bump sccache to v0.15.0
The releases since v0.8.2 add caching for C++20 modules, assembly, and C
preprocessor output, plus broader GCC/MSVC flag handling. They also avoid
double-caching when ccache is on PATH and carry assorted cache-correctness
and storage-backend fixes.
Kefu Chai [Fri, 5 Jun 2026 01:34:56 +0000 (09:34 +0800)]
ceph.spec.in: only require c-ares >= 1.28 on el10+
87e233bb2628784c8c59603e74bc728a8944265e added an unconditional
"Requires: c-ares >= 1.28.0" to ceph-osd-crimson: seastar links
ares_query_dnsrec, which c-ares only grew in 1.28, and the libcares.so.2
SONAME doesn't carry the version so rpm can't infer the floor itself.
But the floor only earns its place where the build links the symbol
against a newer c-ares than the runtime has, and that's an EL thing.
el10's minors cross 1.28 under one $releasever (10.1 ships 1.25, 10.2
ships 1.34), so a builder rolls to 1.34 while a frozen 10.1 node stays on
1.25; without the floor the rpm installs there and the osd then crashes
on the missing symbol. el9 builds the legacy ares_query path and doesn't
need it at all.
Fedora and SUSE don't have the skew: one c-ares per release, built and
run against the same one, so the auto libcares.so.2 dep covers them. So
pin it only on el10+, arch-qualified with %{?_isa}.
Ronen Friedman [Thu, 4 Jun 2026 13:05:26 +0000 (13:05 +0000)]
crimson/test: chain invoke_on_all() future instead of calling get()
The reactor start-up code on ARM64 uses invoke_on_all() to
set a configuration option.
Replace smp::invoke_on_all().get() with future chaining. This
avoids waiting on a future from a reactor continuation (outside
of a seastar thread), which throws an exception.
Ronen Friedman [Fri, 29 May 2026 18:21:51 +0000 (18:21 +0000)]
osd/scrub: clean up inconsistent_obj_wrapper and ScrubStore
Add a default constructor to inconsistent_obj_wrapper, allowing
decode_wrapper() to avoid requiring a dummy hobject_t that gets
immediately overwritten by decode(). Remove the now-unnecessary
hobject_t parameter from merge_encoded_error_wrappers().
Introduce a 'last_degraded' timestamp to the pg_stat_t structure to track
the initial point of redundancy loss. This field, used in conjunction
with 'last_clean', allows the manager to calculate a cluster-wide
durability score by measuring the duration of vulnerability windows.
Changes:
1) Add last_degraded (utime_t) to pg_stat_t in osd_types.h.
2) Increment pg_stat_t encoding version to 31. The decode logic
defaults last_degraded to last_clean for backward compatibility
during rolling upgrades.
3) Update operator==, dump(), and generate_test_instances() to
support ceph-dencoder testing and JSON output.
4) Implement latching logic in PeeringState::prepare_stats_for_publish():
- A PG is considered vulnerable if in DEGRADED or UNDERSIZED state.
- last_degraded is set to 'now' only if it is <= last_clean,
effectively latching the timestamp to the start of the failure
event until the PG next becomes clean.
5) Add standalone tests to verify:
- The last_degraded timestamp latching logic.
- That the last_degraded timestamp is modified when OSDs are marked 'out'
for draining purposes, in which case PGs are marked undersized.
6) Add a release note for the new 'last_degraded' field in PG stats.
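The latching rule in item 4 can be sketched in Python as follows. This is an illustration rather than the PeeringState code: PgStatSketch, update_last_degraded, the string state names, and the float timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PgStatSketch:
    """Hypothetical subset of pg_stat_t; timestamps are plain floats here."""
    last_clean: float = 0.0
    last_degraded: float = 0.0


def update_last_degraded(stats: PgStatSketch, states, now: float) -> None:
    """Latch last_degraded at the start of a vulnerability window."""
    vulnerable = bool({"degraded", "undersized"} & set(states))
    # Only stamp when no window is open, i.e. last_degraded <= last_clean.
    if vulnerable and stats.last_degraded <= stats.last_clean:
        stats.last_degraded = now


stats = PgStatSketch(last_clean=100.0, last_degraded=100.0)

update_last_degraded(stats, {"active", "degraded"}, now=110.0)
after_first = stats.last_degraded    # window opens

update_last_degraded(stats, {"active", "degraded"}, now=120.0)
after_second = stats.last_degraded   # still latched to the window start

stats.last_clean = 130.0             # PG becomes clean again
update_last_degraded(stats, {"active", "undersized"}, now=140.0)
after_relatch = stats.last_degraded  # a new window opens
```

The starting state with last_degraded equal to last_clean mirrors the decode default described in item 2, so an upgraded PG begins with no open window.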
John Mulligan [Thu, 14 May 2026 14:02:56 +0000 (10:02 -0400)]
doc: add more details about the remote-control sidecar service
Add a section about how to set up and access the remote-control sidecar
service. Update a bit of the existing config docs that was not accurate.
Cover the three approaches to making use of the remote-control service
as a client.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Fri, 8 May 2026 18:01:36 +0000 (14:01 -0400)]
container: include python3-ceph-smb-ctl in ceph image
The python3-ceph-smb-ctl package provides the ceph-smb-ctl CLI tool (and
requires needed deps) and is a weak dependency of python3-ceph-common.
However, since the container disables weak dependencies by default we
need to explicitly list it if we want it in the container image. Which
we do.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Kefu Chai [Thu, 4 Jun 2026 10:38:24 +0000 (18:38 +0800)]
test/rgw/posix: free the quota handler in TestDriver
TestDriver::init() allocates quota_handler via
RGWQuotaHandler::generate_handler() but nothing frees it. The real
POSIXDriver frees it in finalize(), which the unit tests never call, so
every fixture that runs init() leaks the handler and the stat caches
hanging off it: 274 allocations, ~40KB, all rooted at generate_handler()
under ASan:
==6102==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 3200 byte(s) in 5 object(s) allocated from:
#1 RGWQuotaHandler::generate_handler(...) src/rgw/rgw_quota.cc:989
#2 TestDriver::init(...) src/test/rgw/test_rgw_posix_driver.cc:1100
#3 POSIXDriverTest::SetUp() src/test/rgw/test_rgw_posix_driver.cc:1191
...
SUMMARY: AddressSanitizer: 40099 byte(s) leaked in 274 allocation(s).
So free it in ~TestDriver(), the counterpart to the init() allocation.
~POSIXDriver() is empty and nothing else touches quota_handler, so there
is no double free, and free_handler(nullptr) is a no-op when init()
bailed out early.