Kefu Chai [Thu, 4 Jun 2026 10:38:24 +0000 (18:38 +0800)]
test/rgw/posix: free the quota handler in TestDriver
TestDriver::init() allocates quota_handler via
RGWQuotaHandler::generate_handler() but nothing frees it. The real
POSIXDriver frees it in finalize(), which the unit tests never call, so
every fixture that runs init() leaks the handler and the stat caches
hanging off it: 274 allocations, ~40KB, all rooted at generate_handler()
under ASan:
==6102==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 3200 byte(s) in 5 object(s) allocated from:
#1 RGWQuotaHandler::generate_handler(...) src/rgw/rgw_quota.cc:989
#2 TestDriver::init(...) src/test/rgw/test_rgw_posix_driver.cc:1100
#3 POSIXDriverTest::SetUp() src/test/rgw/test_rgw_posix_driver.cc:1191
...
SUMMARY: AddressSanitizer: 40099 byte(s) leaked in 274 allocation(s).
So free it in ~TestDriver(), the counterpart to the init() allocation.
~POSIXDriver() is empty and nothing else touches quota_handler, so there
is no double free, and free_handler(nullptr) is a no-op when init()
bailed out early.
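
A minimal sketch of the pattern described above, assuming simplified stand-in
types: QuotaHandler, Driver and leaked_after_fixtures() are hypothetical names
for illustration, and the real generate_handler()/free_handler() take arguments
that are omitted here.

```cpp
#include <cassert>

// Hypothetical stand-in for the quota handler and its factory functions.
struct QuotaHandler {
  static int live;  // number of handlers currently allocated
  static QuotaHandler* generate() { ++live; return new QuotaHandler; }
  static void free_handler(QuotaHandler* h) {
    if (h) { --live; delete h; }  // a null handler is a no-op
  }
};
int QuotaHandler::live = 0;

// Hypothetical stand-in for the test driver.
struct Driver {
  QuotaHandler* quota_handler = nullptr;

  // Mirrors init(): may bail out before the handler is allocated.
  int init(bool fail_early) {
    if (fail_early) return -1;
    quota_handler = QuotaHandler::generate();
    return 0;
  }

  // Counterpart to init(): safe whether or not init() allocated anything.
  ~Driver() { QuotaHandler::free_handler(quota_handler); }
};

// Returns the number of handlers still alive after two fixtures are torn down.
int leaked_after_fixtures() {
  {
    Driver d;
    d.init(false);  // normal path: handler allocated
    assert(QuotaHandler::live == 1);
  }                 // destructor frees it
  {
    Driver d;
    d.init(true);   // early bail-out: nothing to free
  }
  return QuotaHandler::live;
}
```

Both fixtures leave the live count at zero, which is the property the leak
report was missing.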
Dhairya Parmar [Wed, 20 May 2026 21:18:15 +0000 (02:48 +0530)]
mds: persist session auth_name in ESession journal event
So that it can be applied to the freshly created session when
ESession::replay recreates a session because the OMAP version fell
behind the ESession cmapv. Otherwise the newly created session would be
rejected as a target when a client tries to reclaim it.
Sun Yuechi [Mon, 1 Jun 2026 06:52:03 +0000 (14:52 +0800)]
cmake: link legacy-option-headers from targets that use legacy options
The *_legacy_options.h headers that define the legacy ConfigValues
members are generated at build time by y2c.py. Linking the
legacy-option-headers INTERFACE library adds an order dependency on
that step. A few targets reference legacy members without linking it,
so under a parallel build they can be compiled before the headers
exist and fail with "class ConfigValues has no member ...":
Kefu Chai [Mon, 1 Jun 2026 10:40:06 +0000 (18:40 +0800)]
qa/cephadm: query iSCSI gateway FQDN from inside the container
rbd-target-api validates that the gateway hostname supplied by gwcli
matches the container's own socket.getfqdn(). Running the same call on
the host can return a different value when the host and container resolve
names differently (e.g. on Rocky 10), causing gateway creation to fail
with HTTP 400 and all subsequent gwcli configuration to break silently.
Query the FQDN from inside the iSCSI container directly so the value is
always consistent with what rbd-target-api expects. This also removes the
"run twice" workaround, which was compensating for host-side DNS
warm-up flakiness rather than addressing the underlying mismatch.
Kefu Chai [Mon, 1 Jun 2026 05:19:04 +0000 (13:19 +0800)]
test/libcephfs: reduce SnapDiffDeletionRecreation bulk_count on Windows
this test timed out on Windows. HugeSnapDiffLargeDelta, at half
the file count, passed in 508 seconds on the same run, suggesting this
test takes ~17 minutes on Windows -- beyond the test runner limit.
we haven't profiled the Windows client yet, but the likely culprit is
EventPoll, the Windows messenger backend, which scans the entire poll
array on every event_wait() and poll_ctl() call rather than using a
keyed data structure.
in this change, we reduce bulk_count to 1 << 12 on Windows. the unique
thing this test covers is the deletion-recreation pattern: a name that
exists as a file in snap1, gets deleted, and reappears as a directory in
snap2 -- it must show up in the diff with both snapids. 4096 produces
1024 such pairs, which is enough to exercise that logic. multi-fragment
snapdiff is already covered by HugeSnapDiffLargeDelta, which derives its
file count from mds_bal_split_size and mds_bal_fragment_fast_factor
explicitly to trigger fragmentation.
Sun Yuechi [Sat, 30 May 2026 06:15:12 +0000 (14:15 +0800)]
cmake: add WITH_SYSTEM_SPDK to link a system-installed SPDK
By default ceph builds the bundled src/spdk fork via BuildSPDK. Add a
WITH_SYSTEM_SPDK option that instead locates a distro-provided SPDK
through a new Findspdk.cmake (pkg-config based, modelled on
Finddpdk.cmake), exposing the same spdk::spdk target.
Sun Yuechi [Sat, 30 May 2026 06:11:11 +0000 (14:11 +0800)]
blk/spdk: support both old and new spdk_env_opts member names
SPDK 21.01 renamed two struct spdk_env_opts members: pci_whitelist ->
pci_allowed and master_core -> main_core. Guard the assignments in
NVMEDevice with SPDK_VERSION.
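
A rough sketch of what such a version guard can look like. Everything prefixed
with DEMO_/demo_ is a hypothetical stand-in so the example compiles without
SPDK installed; the real code would test the version macro from SPDK's headers
and assign members of struct spdk_env_opts.

```cpp
// Stand-in version macros (assumption: real values come from SPDK's headers).
#define DEMO_VERSION_NUM(major, minor, patch) \
  (((major) * 100 + (minor)) * 100 + (patch))
#ifndef DEMO_VERSION
#define DEMO_VERSION DEMO_VERSION_NUM(21, 1, 0)
#endif

// Stand-in for struct spdk_env_opts with only the renamed core member.
struct demo_env_opts {
#if DEMO_VERSION >= DEMO_VERSION_NUM(21, 1, 0)
  int main_core;
#else
  int master_core;
#endif
};

// Assign the core through whichever member name this version provides.
inline void set_core(demo_env_opts& opts, int core) {
#if DEMO_VERSION >= DEMO_VERSION_NUM(21, 1, 0)
  opts.main_core = core;
#else
  opts.master_core = core;
#endif
}

// Read the core back through the same guard.
inline int get_core(const demo_env_opts& opts) {
#if DEMO_VERSION >= DEMO_VERSION_NUM(21, 1, 0)
  return opts.main_core;
#else
  return opts.master_core;
#endif
}
```

Keeping the guard inside small accessors means callers never spell either
member name directly.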
Kefu Chai [Sat, 30 May 2026 07:49:18 +0000 (15:49 +0800)]
rgw/posix: fix event replay in BucketCache ev_loop
evec is never cleared after each n->notify() call, so events accumulate
across iterations of ev_loop's inner for loop. Each notify() call
receives not just the current event but all events dispatched in earlier
iterations too.
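
The accumulation can be reproduced with a small sketch. Watcher and dispatch()
are hypothetical names, and the loop is only assumed to resemble the inner loop
of ev_loop; the clear_each_time flag toggles the fix.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical watcher that records how many events each notify() received.
struct Watcher {
  std::vector<std::size_t> batch_sizes;
  void notify(const std::vector<std::string>& events) {
    batch_sizes.push_back(events.size());
  }
};

// Dispatch each incoming event on its own iteration.
std::vector<std::size_t> dispatch(const std::vector<std::string>& incoming,
                                  bool clear_each_time) {
  Watcher w;
  std::vector<std::string> evec;  // reused across iterations
  for (const auto& ev : incoming) {
    evec.push_back(ev);
    w.notify(evec);
    if (clear_each_time) {
      evec.clear();  // without this, later calls replay earlier events
    }
  }
  return w.batch_sizes;
}
```

With three events and no clear, the batches grow as 1, 2, 3; with the clear in
place every notify() sees exactly one event.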
Kefu Chai [Sat, 30 May 2026 07:49:14 +0000 (15:49 +0800)]
rgw/posix: fix refcount leaks in BucketCache
get_bucket(FLAG_LOCK) increments the refcount via lru.ref(), but three
paths returned without the paired lru.unref(): the "do nothing" early
return and the INVALIDATE branch in notify(), and unconditionally in
invalidate_bucket(). Entries hitting these paths accumulated inflated
refcounts that the LRU could never reclaim, leaking during
~BucketCache() → cache.drain().
Replace the manual lru.unref() calls in notify(), add_entry(),
remove_entry(), invalidate_bucket(), and list_bucket() with a scope_guard
declared before unique_lock. Since the guard outlives ulk, it fires after
the mutex is released on all paths, including exceptions from
getRWTransaction() or txn->commit() (e.g. MDB_MAP_FULL, EIO) that the
manual calls never reached.
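
The declaration-order argument can be sketched as follows. ScopeGuard,
TrackedMutex and touch_entry() are hypothetical stand-ins, not the types used
in BucketCache; the mutex only tracks a flag so the example can observe whether
it was held when the guard ran. Requires C++17 for class template argument
deduction.

```cpp
#include <mutex>
#include <stdexcept>
#include <utility>

// Minimal scope guard: runs fn when it goes out of scope.
template <typename F>
struct ScopeGuard {
  F fn;
  explicit ScopeGuard(F f) : fn(std::move(f)) {}
  ~ScopeGuard() { fn(); }
  ScopeGuard(const ScopeGuard&) = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;
};

// Lockable stand-in that only records whether it is currently held.
struct TrackedMutex {
  bool held = false;
  void lock() { held = true; }
  void unlock() { held = false; }
};

struct Result {
  int refs;
  bool lock_held_at_unref;
};

Result touch_entry(bool do_throw) {
  TrackedMutex mtx;
  int refs = 1;               // as if a lookup had just taken a reference
  bool held_at_unref = true;  // overwritten when the guard fires
  try {
    // Declared first, so destroyed last: it fires after ulk has unlocked.
    ScopeGuard unref{[&] { held_at_unref = mtx.held; --refs; }};
    std::unique_lock<TrackedMutex> ulk{mtx};
    if (do_throw) {
      throw std::runtime_error("commit failed");
    }
  } catch (const std::runtime_error&) {
    // The guard already ran during unwinding.
  }
  return {refs, held_at_unref};
}
```

On both the normal and the throwing path the reference is dropped and the
mutex is no longer held when the guard runs.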
list_bucket() also had a bare b->mtx.unlock() after fill(); replace it
with unique_lock{..., std::adopt_lock} so a throw from fill() releases
the mutex too.
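
A small sketch of the adopt_lock idea, under the assumption that the mutex was
already locked earlier in the function. TrackedMutex, fill() and
held_after_throw() are hypothetical names for illustration only.

```cpp
#include <mutex>
#include <stdexcept>

// Lockable stand-in that only records whether it is currently held.
struct TrackedMutex {
  bool held = false;
  void lock() { held = true; }
  void unlock() { held = false; }
};

// Stand-in for a fill step that may throw.
void fill(bool fail) {
  if (fail) {
    throw std::runtime_error("fill failed");
  }
}

// Returns whether the mutex is still held after fill() throws.
bool held_after_throw(bool use_adopt) {
  TrackedMutex mtx;
  mtx.lock();  // lock taken earlier, outside any RAII wrapper
  try {
    if (use_adopt) {
      // Take over the already-held lock; the destructor unlocks on any exit.
      std::unique_lock<TrackedMutex> ulk{mtx, std::adopt_lock};
      fill(true);
    } else {
      fill(true);
      mtx.unlock();  // skipped when fill() throws
    }
  } catch (const std::runtime_error&) {
  }
  return mtx.held;
}
```

The bare unlock() variant leaves the mutex held after the exception, while the
adopting unique_lock releases it.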
The POSIXBucket copy constructor incorrectly calls .get() on a
temporary unique_ptr returned by clone(), causing immediate
deletion of the Directory object. This leaves a dangling pointer
that triggers a segfault during destruction.
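
A sketch of the ownership problem, using hypothetical Directory and Bucket
stand-ins. The real member type is not shown in this message; the sketch
assumes the copy takes ownership of the clone instead of keeping the raw
pointer from .get().

```cpp
#include <memory>

// Stand-in directory type that counts live instances.
struct Directory {
  static int alive;
  Directory() { ++alive; }
  Directory(const Directory&) { ++alive; }
  ~Directory() { --alive; }
  std::unique_ptr<Directory> clone() const {
    return std::make_unique<Directory>(*this);
  }
};
int Directory::alive = 0;

// Stand-in bucket that owns its directory.
struct Bucket {
  std::unique_ptr<Directory> dir;

  Bucket() : dir(std::make_unique<Directory>()) {}

  // Problematic shape (not compiled here): other.dir->clone().get() returns a
  // raw pointer, and the temporary unique_ptr deletes the object at the end of
  // the full expression, so the stored pointer dangles.
  //
  // Safer shape: move the temporary into the member so ownership transfers.
  Bucket(const Bucket& other) : dir(other.dir->clone()) {}
};

// Returns the number of live directories while an original and a copy exist.
int alive_with_copy() {
  Bucket a;
  Bucket b{a};
  return Directory::alive;
}
```

While both buckets exist there are two live directories, and none remain once
they go out of scope.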
Matt Benjamin [Tue, 3 Feb 2026 22:12:22 +0000 (17:12 -0500)]
posixdriver: fix cksum_type, flags propagation
Posixdriver doesn't serialize POSIXMultipartUpload, but rather a
member mp_obj of type POSIXMPObj--so to avoid losing the latter's
inherited cksum_type and cksum_flags members (which are already
copied in), copy them out in POSIXMultipartUpload::get_info() which
we need to call to copy out dest_placement anyway.
(oops, cksum_type was copied in, but not cksum_flags)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Matt Benjamin [Sun, 15 Feb 2026 20:56:03 +0000 (15:56 -0500)]
posixdriver: fix cache fill of versioned buckets
This change completes the original intent (hypothesized) to
conditionally set the FLAG_CURRENT bit on just the current
entries during bucket listing cache fill.
This avoids interning 2 copies of the current version of each
object in the listing cache, and also correctly sets the
FLAG_CURRENT bit as required--so the current versions are correctly
reported in versioned listings.
Janky logic to find the current version by explicitly chasing
the symlink target and saving it outside the enumeration scope
has been replaced with a proper call to stat() provided by Dang.
Symlink::fill_cache() is no longer used, so removed.
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Matt Benjamin [Sun, 15 Feb 2026 15:21:28 +0000 (10:21 -0500)]
posixdriver: add bde.flags to bucket cache serde cycle
The upstream logic (mostly?) correctly uses bde.flags when filling
the cache for versioned objects, but cache ser(de)ialization has
been discarding that member.
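
To illustrate why the member has to take part in both directions of the serde
cycle, here is a toy round trip. Entry, encode(), decode() and kFlagCurrent are
hypothetical; the byte layout is invented for the example and is not the cache's
actual on-disk format.

```cpp
#include <cstdint>
#include <string>

// Hypothetical flag bit; the real FLAG_CURRENT value is not assumed here.
constexpr std::uint32_t kFlagCurrent = 0x1;

// Toy cache entry with a name and a flags word.
struct Entry {
  std::string name;
  std::uint32_t flags = 0;
};

// Encode flags as 4 little-endian bytes followed by the name.
std::string encode(const Entry& e) {
  std::string out;
  for (int shift = 0; shift < 32; shift += 8) {
    out.push_back(static_cast<char>((e.flags >> shift) & 0xff));
  }
  out += e.name;
  return out;
}

// Decode the same layout; a short buffer yields a default entry.
Entry decode(const std::string& buf) {
  Entry e;
  if (buf.size() < 4) {
    return e;
  }
  for (int i = 0; i < 4; ++i) {
    e.flags |= static_cast<std::uint32_t>(static_cast<unsigned char>(buf[i]))
               << (8 * i);
  }
  e.name = buf.substr(4);
  return e;
}
```

If either encode() or decode() skipped the flags bytes, every entry read back
from the cache would come out with flags == 0, which is the symptom described
above.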
This change suppresses the visible result where RGW incorrectly produces
multiple versions in non-versioned listing because none uniquely sets
FLAG_CURRENT:
Cached listings for versions are still incorrect in containing an
extra entry for the "current" version with an empty instance
(from the Symlink)--the visible effect being that list-object-versions
output is incorrect (no entry is sent with IsLatest, after the
empty instance version has been filtered out).
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Jacques Heunis [Thu, 15 Jan 2026 12:11:11 +0000 (12:11 +0000)]
tools/rados: Remove plain text snippets from rados bench JSON output
`rados bench` emits performance stats as its output. It is very helpful
for this output to be in a machine-readable format and the CLI provides
the `--format=json` flag to achieve this.
There are some logs that do not respect the formatter flag though, as
they provide status updates as the tool is running and do not form part
of the output dataset. This prevents the contents of stdout from being
valid JSON which destroys the machine-readability of the output.
To resolve this we gate those status messages behind a check for the
formatter. If any specific formatter is provided we do not emit the
status logs. This leaves the plaintext output largely untouched while
helping the machine-readable output to be well-formed.
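
The gating can be sketched like this. run_bench(), the status text and the JSON
payload are all made up for illustration and are not the actual rados bench
output; `formatted` stands for "a formatter was requested on the command line".

```cpp
#include <sstream>
#include <string>

// Hypothetical benchmark run that returns everything it would print to stdout.
std::string run_bench(bool formatted) {
  std::ostringstream out;

  // Status updates are not part of the dataset, so only emit them in
  // plain-text mode.
  if (!formatted) {
    out << "status: benchmark running\n";
  }

  // The result itself goes through the chosen output format.
  if (formatted) {
    out << "{\"ops\": 100}";
  } else {
    out << "ops: 100\n";
  }
  return out.str();
}
```

With a formatter selected, stdout contains only the JSON document; without one
the status line is still printed ahead of the plain-text result.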
Fixes: https://tracker.ceph.com/issues/74370
Signed-off-by: Jacques Heunis <jheunis@bloomberg.net>
Jamie Pryde [Fri, 29 May 2026 11:44:56 +0000 (12:44 +0100)]
qa: Ignore deprecated EC plugin warning in teuthology tests
Add DEPRECATED_EC_PLUGIN to the list of health warnings to
ignore in the thrash-erasure-code-* tests that use deprecated
plugins or techniques. It is expected that this warning will
be raised.