Ronen Friedman [Thu, 4 Jun 2026 13:05:26 +0000 (13:05 +0000)]
crimson/test: chain invoke_on_all() future instead of calling get()
The reactor start-up code on ARM64 uses invoke_on_all() to
set a configuration option.
Replace smp::invoke_on_all().get() with future chaining. This
avoids waiting on a future from a reactor continuation (outside
of a seastar thread), which throws an exception.
Kefu Chai [Thu, 4 Jun 2026 10:38:24 +0000 (18:38 +0800)]
test/rgw/posix: free the quota handler in TestDriver
TestDriver::init() allocates quota_handler via
RGWQuotaHandler::generate_handler() but nothing frees it. The real
POSIXDriver frees it in finalize(), which the unit tests never call, so
every fixture that runs init() leaks the handler and the stat caches
hanging off it: 274 allocations, ~40KB, all rooted at generate_handler()
under ASan:
==6102==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 3200 byte(s) in 5 object(s) allocated from:
#1 RGWQuotaHandler::generate_handler(...) src/rgw/rgw_quota.cc:989
#2 TestDriver::init(...) src/test/rgw/test_rgw_posix_driver.cc:1100
#3 POSIXDriverTest::SetUp() src/test/rgw/test_rgw_posix_driver.cc:1191
...
SUMMARY: AddressSanitizer: 40099 byte(s) leaked in 274 allocation(s).
So free it in ~TestDriver(), the counterpart to the init() allocation.
~POSIXDriver() is empty and nothing else touches quota_handler, so there
is no double free, and free_handler(nullptr) is a no-op when init()
bailed out early.
ceph.spec.in: require c-ares >= 1.28 for ceph-osd-crimson
Seastar's DNS stack uses ares_query_dnsrec when built against c-ares
>= 1.28 (ARES_VERSION >= 0x011c00). Only ceph-osd-crimson links that
path; classic-osd does not, so add the version floor on the crimson
subpackage only.
Rocky Linux 10 shaman builds use docker.io/rockylinux/rockylinux:10
(os-release 10.1), but dnf builddeps resolve against the live Rocky 10
BaseOS/AppStream repos, which track the newest minor and install
c-ares-devel/c-ares 1.34.6. CMake links ceph-osd-crimson against that
library. Teuthology nodes are provisioned as Rocky 10.1 and install only
the requested Ceph packages without a full distro upgrade, so their
baseline c-ares stays at 1.25.0 (< 1.28, no ares_query_dnsrec). Install
succeeds but OSD startup fails with "undefined symbol: ares_query_dnsrec".
Require c-ares >= 1.28 on ceph-osd-crimson so dnf upgrades to a suitable
libcares (1.34.6 is already in Rocky 10.1 baseos) or fails cleanly at
install. Ubuntu crimson CI does not show this mismatch: the same LTS is
used for building and testing, and maintainers do not bump upstream
package versions across an LTS lifecycle (only cherry-picked fixes), so
build-time and runtime libc-ares stay aligned.
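The version floor above can be checked by hand. The sketch below assumes the usual c-ares packing of ARES_VERSION as (major << 16) | (minor << 8) | patch; the helper name ares_version_hex is hypothetical and exists only for this illustration.

```python
def ares_version_hex(major: int, minor: int, patch: int = 0) -> int:
    """Pack a c-ares version the way ARES_VERSION is assumed to be packed."""
    return (major << 16) | (minor << 8) | patch


# The floor named in the commit message: c-ares 1.28.0.
FLOOR = ares_version_hex(1, 28, 0)

# Runtime baseline on the test nodes versus the build-time library.
runtime_baseline = ares_version_hex(1, 25, 0)
build_time = ares_version_hex(1, 34, 6)

# Only the build-time library is at or above the floor, which is why the
# binary links against ares_query_dnsrec but cannot resolve it at runtime.
has_dnsrec_at_runtime = runtime_baseline >= FLOOR
has_dnsrec_at_build = build_time >= FLOOR
```

Under that packing, 1.28.0 is 0x011c00, 1.25.0 falls below it, and 1.34.6 sits above it.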
get('osd_map') returns the cached object directly, so del and key
assignments were silently corrupting the cache for subsequent callers.
Take a shallow copy before modifying, and use pop() instead of del in
case the cache was already corrupted.
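A minimal sketch of the aliasing problem and the fix, assuming a stand-in FakeMgr class and made-up map keys ('crush', 'summary'); none of these names come from the actual module.

```python
class FakeMgr:
    """Hypothetical stand-in for a mgr module whose get() returns cached objects."""

    def __init__(self):
        self._cache = {"osd_map": {"epoch": 7, "crush": {"buckets": []}}}

    def get(self, name):
        # Returns the cached dict itself, not a copy.
        return self._cache[name]


def summarize_buggy(mgr):
    osd_map = mgr.get("osd_map")
    del osd_map["crush"]          # mutates the shared cache entry
    osd_map["summary"] = True     # so does this assignment
    return osd_map


def summarize_fixed(mgr):
    osd_map = dict(mgr.get("osd_map"))  # shallow copy: top-level keys are private
    osd_map.pop("crush", None)          # tolerates an already-missing key
    osd_map["summary"] = True
    return osd_map
```

A shallow copy is enough here only because the edits touch top-level keys; nested objects are still shared with the cache.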
mgr/DaemonServer: clarify ok-to-upgrade error message for CRUSH buckets
Refine the error string in DaemonServer.cc returned by the
ok-to-upgrade command when OSDs in a CRUSH bucket cannot be upgraded.
The original message is ambiguous. It fails to clearly convey that
stopping *any* individual OSD in that specific bucket will drop PGs
offline, meaning no OSDs within that bucket can be safely upgraded at
this time.
Update the phrasing to explicitly state that at least X PGs will go offline
if any OSD out of the total count in that CRUSH bucket is stopped. Also
standardize on capitalized acronyms (PG, OSD, CRUSH) and wrap the bucket
name in single quotes for better log readability.
Kobi Ginon [Wed, 3 Jun 2026 14:03:09 +0000 (17:03 +0300)]
doc/rbd: clarify Rocky iSCSI gateway requirements
List Rocky Linux 8+ alongside RHEL/CentOS Stream 7.5+. Note that packaged
ceph-iscsi must recognize Rocky in /etc/os-release (ceph-iscsi#282). Add a
short Rocky note under iSCSI targets; expand the overview maintenance
warning with migration guidance to RBD and the NVMe-oF gateway.
Since we change the 'application' field for the report, we need a
mutable (non-read-only) result in case the API call is served from the cache.
Use the 'pool_stats' map directly to avoid copying the pg_dump,
which can be huge.
mgr: replace TTLCache with MgrMapCache and protect api with readonly
This patch removes the old TTLCache implementation and introduces
a new generic MgrMapCache driven by a runtime toggle:
- Add `mgr_map_cache_enabled` config option in global.yaml
- Swap out `ttl_cache` for `api_cache` (MgrMapCache) in ActivePyModules
- Update cacheable_get_python() and get_python() to use LFU‐based api_cache
- Add a new get_mutable parameter to the get API call to return a mutable copy
- Invalidate api_cache on notify_all events
- Remove all TTLCache headers, sources, and tests
- Include MgrMapCache.cc in CMakeLists and update BaseMgrModule bindings
- Improve logging around cache hits, misses, and state changes
- ActivePyModules
* Remove unused update_cache_metrics()
* Log cache hits/misses inline and only insert into cache when
enabled+cacheable (with proper Py_INCREF)
* Switch get_python() to use PyFormatterRO for cacheable routes, PyFormatter otherwise
- MgrMapCache/LFUCache
* Add can_read_cache()/can_write_cache() helpers and use const& for key parameters
* Guard perf counter increments and improve debug logging
- PyFormatter
* Add PyFormatterRO subclass that freezes dicts/lists into read-only
proxies on the fly
- Python mgr_module
* Simplify get() to return raw result
This change ensures immutable JSON output on cache hits and tightens up cache logic.
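The idea of freezing dicts and lists into read-only views can be sketched in pure Python. This is an illustration only, not the PyFormatterRO implementation: the freeze() helper is hypothetical, and it uses types.MappingProxyType and tuples as the read-only forms.

```python
from types import MappingProxyType


def freeze(obj):
    """Recursively wrap dicts and lists in read-only equivalents."""
    if isinstance(obj, dict):
        return MappingProxyType({k: freeze(v) for k, v in obj.items()})
    if isinstance(obj, list):
        return tuple(freeze(v) for v in obj)
    return obj


cached = freeze({"epoch": 7, "pools": [{"id": 1, "name": "rbd"}]})

# Reads work as usual.
first_pool_name = cached["pools"][0]["name"]

# Writes fail instead of silently changing the shared cached object.
try:
    cached["epoch"] = 8
    write_rejected = False
except TypeError:
    write_rejected = True
```

A caller that needs to modify the result would ask for a mutable copy instead, which is the role the get_mutable parameter plays in the commit message.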
mgr/cli: add cache flush command with proper status reporting
Allow operators to invalidate individual mgr Python caches at runtime
without restarting the manager. Introduces a new CLI command:
$ ceph mgr cli cache flush <map-name>
which returns success or a clear error if the named cache entry doesn’t
exist or isn’t cacheable. This makes it easy to drop stale cached maps
(e.g. osd_map, mon_map) on demand.
Fixes: https://tracker.ceph.com/issues/72447
Signed-off-by: Nitzan Mordechai <nmordec@ibm.com>
mgr: add new unit tests for MgrMapCache
- Guard against null perf‐counter before calling inc(), preventing crashes
- Add “foo” to allowed_keys list (for test coverage)
- Rename and refocus the CMake test target from TTLCache to MgrMapCache
- Introduce test_mgrmapcache.cc with LFUCache tests.
- Remove the obsolete test_ttlcache.cc
Fixes: https://tracker.ceph.com/issues/72447
Signed-off-by: Nitzan Mordechai <nmordec@ibm.com>
mgr/test_cache: add new tests
Naman Munet [Mon, 4 May 2026 12:57:53 +0000 (18:27 +0530)]
mgr/dashboard: multisite sync-policy page should include daemon selection
Fixes: https://tracker.ceph.com/issues/71522
Changes include:
- Added daemon selection support to all sync policy endpoints
- Enhanced backend with daemon context awareness
- Fetch only the sync policies from the specified daemon
Dhairya Parmar [Wed, 20 May 2026 21:18:15 +0000 (02:48 +0530)]
mds: persist session auth_name in ESession journal event
So that it can be applied to the freshly created session when
ESession::replay recreates a session because the OMAP version fell
behind the ESession cmapv. Without it, the newly created session would be
rejected as a target when a client tries to reclaim this session.
Sun Yuechi [Mon, 1 Jun 2026 06:52:03 +0000 (14:52 +0800)]
cmake: link legacy-option-headers from targets that use legacy options
The *_legacy_options.h headers that define the legacy ConfigValues
members are generated at build time by y2c.py. Linking the
legacy-option-headers INTERFACE library adds an order dependency on
that step. A few targets reference legacy members without linking it,
so under a parallel build they can be compiled before the headers
exist and fail with "class ConfigValues has no member ...":
Kefu Chai [Mon, 1 Jun 2026 10:40:06 +0000 (18:40 +0800)]
qa/cephadm: query iSCSI gateway FQDN from inside the container
rbd-target-api validates that the gateway hostname supplied by gwcli
matches the container's own socket.getfqdn(). Running the same call on
the host can return a different value when the host and container resolve
names differently (e.g. on Rocky 10), causing gateway creation to fail
with HTTP 400 and all subsequent gwcli configuration to break silently.
Query the FQDN from inside the iSCSI container directly so the value is
always consistent with what rbd-target-api expects. This also removes the
"run twice" workaround, which was compensating for host-side DNS
warm-up flakiness rather than addressing the underlying mismatch.
Kefu Chai [Mon, 1 Jun 2026 05:19:04 +0000 (13:19 +0800)]
test/libcephfs: reduce SnapDiffDeletionRecreation bulk_count on Windows
This test timed out on Windows, and HugeSnapDiffLargeDelta, at half
the file count, passed in 508 seconds on the same run, suggesting this
test takes ~17 minutes on Windows -- beyond the test runner limit.
We haven't profiled the Windows client yet, but the likely culprit is
EventPoll, the Windows messenger backend, which scans the entire poll
array on every event_wait() and poll_ctl() call rather than using a
keyed data structure.
In this change, we reduce bulk_count to 1 << 12 on Windows. The unique
thing this test covers is the deletion-recreation pattern: a name that
exists as a file in snap1, gets deleted, and reappears as a directory in
snap2 -- it must show up in the diff with both snapids. 4096 produces
1024 such pairs, which is enough to exercise that logic. multi-fragment
snapdiff is already covered by HugeSnapDiffLargeDelta, which derives its
file count from mds_bal_split_size and mds_bal_fragment_fast_factor
explicitly to trigger fragmentation.
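The numbers in this message can be reproduced with a short calculation. It assumes runtime scales linearly with file count, and the one-pair-per-four-entries ratio is inferred from the "4096 produces 1024 such pairs" statement rather than taken from the test source.

```python
# HugeSnapDiffLargeDelta ran at half the file count in 508 seconds.
half_count_runtime_s = 508

# Assumption: runtime grows linearly with the number of files.
estimated_runtime_s = 2 * half_count_runtime_s
estimated_runtime_min = estimated_runtime_s / 60

# New bulk_count on Windows.
bulk_count = 1 << 12

# Inferred ratio: one deletion-recreation pair per four entries.
recreation_pairs = bulk_count // 4
```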
Sun Yuechi [Sat, 30 May 2026 06:15:12 +0000 (14:15 +0800)]
cmake: add WITH_SYSTEM_SPDK to link a system-installed SPDK
By default ceph builds the bundled src/spdk fork via BuildSPDK. Add a
WITH_SYSTEM_SPDK option that instead locates a distro-provided SPDK
through a new Findspdk.cmake (pkg-config based, modelled on
Finddpdk.cmake), exposing the same spdk::spdk target.
Sun Yuechi [Sat, 30 May 2026 06:11:11 +0000 (14:11 +0800)]
blk/spdk: support both old and new spdk_env_opts member names
SPDK 21.01 renamed two struct spdk_env_opts members: pci_whitelist ->
pci_allowed and master_core -> main_core. Guard the assignments in
NVMEDevice with SPDK_VERSION.
Kefu Chai [Sat, 30 May 2026 07:49:18 +0000 (15:49 +0800)]
rgw/posix: fix event replay in BucketCache ev_loop
evec is never cleared after each n->notify() call, so events accumulate
across iterations of ev_loop's inner for loop. Each notify() call
receives not just the current event but all events dispatched in earlier
iterations too.
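The replay effect is easy to reproduce in miniature. The sketch below is a Python analogue with hypothetical function names, not the BucketCache code: a buffer that is filled per iteration but never cleared hands earlier events to every later notify() call.

```python
def dispatch_buggy(batches, notify):
    evec = []
    for batch in batches:
        evec.extend(batch)
        notify(list(evec))   # evec still holds events from earlier iterations


def dispatch_fixed(batches, notify):
    evec = []
    for batch in batches:
        evec.extend(batch)
        notify(list(evec))
        evec.clear()         # start each iteration with an empty buffer


batches = [["add a"], ["remove b"], ["add c"]]

seen_buggy = []
dispatch_buggy(batches, seen_buggy.append)

seen_fixed = []
dispatch_fixed(batches, seen_fixed.append)
```

With three single-event batches, the buggy variant delivers one, two, and then three events, while the fixed variant delivers exactly one event per call.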
Kefu Chai [Sat, 30 May 2026 07:49:14 +0000 (15:49 +0800)]
rgw/posix: fix refcount leaks in BucketCache
get_bucket(FLAG_LOCK) increments the refcount via lru.ref(), but three
paths returned without the paired lru.unref(): the "do nothing" early
return and the INVALIDATE branch in notify(), and unconditionally in
invalidate_bucket(). Entries hitting these paths accumulated inflated
refcounts that the LRU could never reclaim, leaking during
~BucketCache() → cache.drain().
Replace the manual lru.unref() calls in notify(), add_entry(),
remove_entry(), invalidate_bucket(), and list_bucket() with a scope_guard
declared before unique_lock. Since the guard outlives ulk, it fires after
the mutex is released on all paths, including exceptions from
getRWTransaction() or txn->commit() (e.g. MDB_MAP_FULL, EIO) that the
manual calls never reached.
list_bucket() also had a bare b->mtx.unlock() after fill(); replace it
with unique_lock{..., std::adopt_lock} so a throw from fill() releases
the mutex too.
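The ordering argument (guard declared before the lock, so it fires after the lock is released, on every path) can be sketched with nested Python context managers, which unwind in reverse order much like C++ locals. Entry, held_ref, and update are hypothetical names for this illustration.

```python
import threading
from contextlib import contextmanager


class Entry:
    def __init__(self):
        self.refcount = 0
        self.mtx = threading.Lock()


@contextmanager
def held_ref(entry, log):
    """Take a reference and drop it on exit, even if the body raises."""
    entry.refcount += 1
    try:
        yield entry
    finally:
        # Record whether the mutex is still held when the reference is dropped.
        log.append(("unref", entry.mtx.locked()))
        entry.refcount -= 1


def update(entry, log, fail=False):
    with held_ref(entry, log):   # entered first, so exited last
        with entry.mtx:          # released before the reference is dropped
            if fail:
                raise OSError("commit failed")
            log.append(("work", entry.mtx.locked()))
```

Calling update(entry, log, fail=True) still drops the reference and releases the mutex, which is the path the manual unref calls never reached.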
The POSIXBucket copy constructor incorrectly calls .get() on a
temporary unique_ptr returned by clone(), causing immediate
deletion of the Directory object. This leaves a dangling pointer
that triggers a segfault during destruction.
Matt Benjamin [Tue, 3 Feb 2026 22:12:22 +0000 (17:12 -0500)]
posixdriver: fix cksum_type, flags propagation
Posixdriver doesn't serialize POSIXMultipartUpload, but rather a
member mp_obj of type POSIXMPObj--so to avoid losing the latter's
inherited cksum_type and cksum_flags members (which are already
copied in), copy them out in POSIXMultipartUpload::get_info() which
we need to call to copy out dest_placement anyway.
(oops, cksum_type was copied in, but not cksum_flags)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>