Afreen Misbah [Tue, 12 May 2026 12:07:39 +0000 (17:37 +0530)]
mgr/dashboard: Fix edit and delete access for pool-manager role
Fixes https://tracker.ceph.com/issues/76561
- allows deleting pools in pool-manager role by bypassing config-opt read permissions
- allows editing in pool-manager role, which was failing due to missing rbd mirroring permissions
- fixes a bug in pool edit mode where editing both compression and name at once failed due to an if-else logic bug
Shai Fultheim [Sat, 16 May 2026 20:17:59 +0000 (23:17 +0300)]
crimson/os/seastore: fix cleaner space leak from shadowed result list
TransactionManager::get_extents_if_live() declared an inner
std::list<CachedExtentRef> res inside the "extent is cached" branch
that shadowed the outer res returned by the coroutine. When the
queried extent was present in the cache, it was moved into the inner
list and immediately discarded, and the empty outer list was returned
to the caller.
The async cleaner uses this result to decide whether to rewrite an
extent or treat it as dead. For recently-allocated LBA tree internal
nodes (still hot in cache), the shadowed return caused the cleaner to
skip them, so mark_space_free() never paired with the earlier
mark_space_used(). Each affected reclaim leaked exactly one extent
(4 KiB for LADDR_INTERNAL), tripping the live_bytes != 0 assertion in
SegmentCleaner::clean_space() (async_cleaner.cc:1441) once a victim
segment with such a leftover was selected.
The reproducer (at ~70% full) deterministically aborted within ~3
minutes before this fix; with the fix the OSDs run cleanly past the
trigger point.
Matthew N. Heler [Fri, 15 May 2026 11:11:35 +0000 (06:11 -0500)]
doc/rados/configuration: recommend wpq for EC clusters seeing slow ops
On large EC clusters, mClock currently routes recovery EC sub-reads
through the immediate queue, skipping throttling. When many OSDs read
from one source during recovery, that source's high-priority queue
saturates and starves client work, producing slow ops. Recommend
falling back to wpq in the mClock config reference until the
scheduler treats those reads as background.
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
Patrick Donnelly [Thu, 14 May 2026 17:26:33 +0000 (13:26 -0400)]
.github/workflows/releng-audit: consolidate into single job
In order to make this a required check someday, we can't have the main
job ever be skipped. So, consolidate into a single job and skip actions
based on the router logic.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Venky Shankar [Thu, 14 May 2026 09:27:21 +0000 (14:57 +0530)]
Merge PR #64774 into main
* refs/pull/64774/head:
test_cephfs.py: delete purge_dir() helper method, use rmtree() instead
test_cephfs.py: remove redundant call to purge_dir()
test_cephfs.py: test rmtree on root
pybind/cephfs: don't attempt to unlink root in rmtree
test_cephfs.py: test rmtree with and without should_cancel
pybind/cephfs: make should_cancel option parameter for rmtree()
mgr/volumes: clone using cptree() from cephfs python bindings
test_cephfs: add unit tests for cptree() in cephfs python bindings
test/pybind/assertions: add helper method assert_less
pybind/cephfs: use depth-first, non-recursive approach for cloning
test_cephfs: call object setup/teardown for all tests in TestWithRootUser
test_cephfs.py: add tests for utimensat()
pybind/cephfs: add python bindings for utimensat()
qa/cephfs: add tests for chownat()
pybind/cephfs: add python bindings for chownat()
test_cephfs.py: add tests for chmodat()
pybind/cephfs: add python bindings for chmodat()
test_cephfs.py: add tests for symlinkat()
pybind/cephfs: add python binding for symlinkat()
test_cephfs.py: add test for readlinkat()
pybind/cephfs: add python binding for readlinkat()
pybind/cephfs: add tests for statxat()
pybind/cephfs: add python bindings for statxat()
test_cephfs.py: add tests for mkdirat()
pybind/cephfs: add python binding for mkdirat()
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
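The "depth-first, non-recursive approach for cloning" named in this series can be sketched roughly as below. This is only an illustration under stated assumptions: the local filesystem (os, shutil) stands in for the CephFS bindings, and copy_tree_iterative() is a hypothetical name, not the cptree() implementation.

```python
import os
import shutil
import tempfile


def copy_tree_iterative(src_root, dst_root):
    """Copy a directory tree depth-first with an explicit stack (no recursion)."""
    os.makedirs(dst_root, exist_ok=True)
    stack = [(src_root, dst_root)]
    while stack:
        # Popping from the end of the list (LIFO) gives depth-first order.
        src_dir, dst_dir = stack.pop()
        with os.scandir(src_dir) as entries:
            for entry in entries:
                dst_path = os.path.join(dst_dir, entry.name)
                if entry.is_dir(follow_symlinks=False):
                    os.makedirs(dst_path, exist_ok=True)
                    stack.append((entry.path, dst_path))
                else:
                    shutil.copy2(entry.path, dst_path, follow_symlinks=False)


# Small demonstration on a throwaway tree: src/a/b/f.txt -> dst/a/b/f.txt
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, "a", "b"))
    with open(os.path.join(src, "a", "b", "f.txt"), "w") as fh:
        fh.write("data")
    dst = os.path.join(tmp, "dst")
    copy_tree_iterative(src, dst)
    with open(os.path.join(dst, "a", "b", "f.txt")) as fh:
        copied = fh.read()
```

An explicit stack keeps the traversal depth independent of the interpreter's recursion limit, which matters for deeply nested trees.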
Patrick Donnelly [Wed, 13 May 2026 19:48:41 +0000 (15:48 -0400)]
Merge PR #68781 into main
* refs/pull/68781/head:
doc/governance: remove Sam from CSC
Reviewed-by: Joseph Mundackal <jmundackal@bloomberg.net>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Kefu Chai [Tue, 5 May 2026 01:36:01 +0000 (09:36 +0800)]
pybind/mgr/status: drop asserts that fight the defaultdict defaults
The 'assert metadata' checks in the status module were actually fighting
against our own defaults. Since an empty defaultdict is falsy, these
asserts would blow up the whole command if a single daemon was down
after a mgr restart.
This drops those four grumpy asserts. Now, instead of a traceback,
`ceph osd status` and `ceph fs status` will just show a blank hostname
or "unknown" version as intended.
The trigger is common in practice: any mgr restart leaves daemons
that are currently down without metadata in daemon_state, since
they never reconnect via MMgrOpen to repopulate it. After such a
restart, `ceph osd status` and `ceph fs status` blow up:
```
Error EINVAL: Traceback (most recent call last):
...
File ".../status/module.py", line 340, in handle_osd_status
assert metadata
AssertionError
```
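A minimal sketch of why the asserts misfired, assuming a lookup that falls back to a caller-supplied default. get_metadata_or_default() and the store dict are hypothetical stand-ins, not the mgr API.

```python
from collections import defaultdict


def get_metadata_or_default(store, key, default):
    # Hypothetical stand-in for the mgr lookup: return the stored
    # metadata, or the caller-supplied default when there is none.
    found = store.get(key)
    return default if found is None else found


# osd.1 is down after the restart, so it has no metadata entry.
store = {"osd.0": {"hostname": "node-a"}}

up = get_metadata_or_default(store, "osd.0", defaultdict(str))
down = get_metadata_or_default(store, "osd.1", defaultdict(str))

assert up          # a non-empty dict is truthy
assert not down    # an empty defaultdict is falsy, so `assert down` would raise

# Without the assert, the default does its job and yields a blank hostname.
hostname = down["hostname"]
```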
Kefu Chai [Tue, 5 May 2026 01:35:00 +0000 (09:35 +0800)]
mgr: narrow get_metadata return type with @overload
Enable type narrowing for get_metadata() when a non-None default is
provided. Previously, the return type was always `Optional[Dict[str, str]]`,
forcing callers to use defensive `assert metadata` checks even when
a result was guaranteed.
The wrapper returns either the metadata from `_ceph_get_metadata()` or the
caller-supplied default. Providing an `@overload` allows type checkers to
prove the result is non-None, avoiding invalid assertions for falsy
defaults (like an empty defaultdict).
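The narrowing can be sketched as follows. The signature is simplified for illustration: a standalone function with a single svc_id argument, with _lookup() and _DAEMON_METADATA as hypothetical stand-ins for `_ceph_get_metadata()` and its data.

```python
from typing import Dict, Optional, overload

# Hypothetical backing data; osd.1 has no metadata.
_DAEMON_METADATA: Dict[str, Dict[str, str]] = {"osd.0": {"hostname": "node-a"}}


def _lookup(svc_id: str) -> Optional[Dict[str, str]]:
    # Stand-in for the underlying call, which may return None.
    return _DAEMON_METADATA.get(svc_id)


@overload
def get_metadata(svc_id: str, default: None = None) -> Optional[Dict[str, str]]: ...


@overload
def get_metadata(svc_id: str, default: Dict[str, str]) -> Dict[str, str]: ...


def get_metadata(svc_id, default=None):
    # Either the stored metadata or the caller-supplied default.
    metadata = _lookup(svc_id)
    return default if metadata is None else metadata


maybe_missing = get_metadata("osd.1")       # type checkers see Optional[Dict[str, str]]
never_none = get_metadata("osd.1", {})      # type checkers see Dict[str, str]
```

With the second overload, a caller that passes a default no longer needs an `assert metadata` just to satisfy the type checker.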
Kefu Chai [Wed, 13 May 2026 04:32:33 +0000 (12:32 +0800)]
crimson/osd: drop redundant trailing co_return in pg_advance_map
check_for_splits() and split_pg() both ended with a bare co_return
that the compiler inserts implicitly for a coroutine returning
seastar::future<>. Remove the redundant statements.
Venky Shankar [Tue, 12 May 2026 15:26:29 +0000 (20:56 +0530)]
Merge PR #68128 into main
* refs/pull/68128/head:
qa: Fix checksum calculation on empty directories
qa: Add mirror test for snapshot with only dir
tools/cephfs_mirror: Fix sync hang
Olivier Chaze [Tue, 12 May 2026 14:32:10 +0000 (16:32 +0200)]
doc/rgw: warn about rgw_usage_max_shards consistency
Add documentation warnings explaining that all RGW daemons and
radosgw-admin commands must use the same rgw_usage_max_shards value.
Mismatched shard counts cause writes and reads/trim to target different
objects, resulting in seemingly empty usage logs or failed cleanup.
Also document the --rgw-usage-max-shards command-line parameter for
radosgw-admin as an alternative to global config.
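A toy model of the mismatch, for illustration only. toy_hash() and the "usage-shard-N" object names are assumptions made up for this sketch and do not reflect the hash or object naming RGW actually uses.

```python
def toy_hash(user):
    # Hypothetical hash for the sketch: sum of the UTF-8 bytes.
    return sum(user.encode("utf-8"))


def usage_object_for(user, max_shards):
    # The shard index depends on the configured shard count.
    return "usage-shard-%d" % (toy_hash(user) % max_shards)


# A daemon writing with 32 shards and a tool reading with 16 shards
# compute different object names for the same user.
written_to = usage_object_for("alice", 32)
read_from = usage_object_for("alice", 16)
```

In this model the reader looks at an object the writer never touched, which matches the "seemingly empty usage log" symptom described above.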
Kefu Chai [Tue, 12 May 2026 09:17:56 +0000 (17:17 +0800)]
debian: drop explicit libprotobuf dependency from ceph-osd-crimson
The ceph-osd-crimson package already lists ${shlibs:Depends} in its
Depends field, which generates the correct libprotobuf dependency for
the target distribution at build time (e.g. libprotobuf32t64 on
Trixie/Noble). The hardcoded libprotobuf23 entry is redundant and
breaks installations on distributions where protobuf ships under a
different package name.
Afreen Misbah [Tue, 5 May 2026 21:05:11 +0000 (02:35 +0530)]
mgr/dashboard: Updates to empty state component
- added a "no storage" state to the empty state component
- extended the icon component to handle buttons with icons
- fixed unit tests
Shweta Bhosale [Mon, 11 May 2026 10:02:14 +0000 (15:32 +0530)]
mgr/nfs: reuse CephfsClient for path checks and earmark resolver
cephfs_path_is_dir defined an inner function decorated with lru_cache, so
each call got a new function object and an empty cache, and CephfsClient(mgr)
ran every time. Moved caching to module-level cephfs_client_for_mgr(mgr)
and called it from cephfs_path_is_dir.
Passed that shared client into CephFSEarmarkResolver from the NFS module so
export create/apply does not construct a separate CephfsClient for
earmarks.
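The caching pitfall can be reproduced in a few lines. FakeClient and the function names below are hypothetical stand-ins, not the NFS module's code.

```python
from functools import lru_cache


class FakeClient:
    """Stand-in for an expensive client; counts how often it is built."""
    instances = 0

    def __init__(self, mgr):
        FakeClient.instances += 1
        self.mgr = mgr


def path_check_inner_cache(mgr):
    # A new decorated function, with a new empty cache, on every call.
    @lru_cache(maxsize=1)
    def client_for(m):
        return FakeClient(m)
    return client_for(mgr)


@lru_cache(maxsize=1)
def client_for_mgr(mgr):
    # Module-level: one function object, one cache shared by all callers.
    return FakeClient(mgr)


def path_check_module_cache(mgr):
    return client_for_mgr(mgr)


mgr = object()

for _ in range(3):
    path_check_inner_cache(mgr)
inner_built = FakeClient.instances    # client rebuilt on each call

FakeClient.instances = 0
for _ in range(3):
    path_check_module_cache(mgr)
module_built = FakeClient.instances   # client built once, then reused
```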
Kefu Chai [Mon, 11 May 2026 05:46:25 +0000 (13:46 +0800)]
crimson: consolidate the return paths of get_segment_manager()
before this change, two branches both return `BlockSegmentManager`,
which is redundant. in this change, consolidate them so that the
`HAVE_ZNS` path becomes an early return. this improves readability.
Kefu Chai [Mon, 11 May 2026 05:27:42 +0000 (13:27 +0800)]
crimson: abort on ioctl(BLKGETNRZONES) failure
previously, we did not check the return value of ioctl(BLKGETNRZONES).
we query the number of zones of the storage device to determine which
seastore backend to use. the only possible error from this ioctl is
-EFAULT (invalid user pointer), which indicates a programming error
and should never happen in practice. use ceph_assert() to catch this.
Kefu Chai [Mon, 11 May 2026 05:07:25 +0000 (13:07 +0800)]
crimson: use uint32_t when calling ioctl(BLKGETNRZONES)
before this change, we passed a pointer to a `size_t` to
ioctl(BLKGETNRZONES), but the Linux kernel defines it in
include/uapi/linux/blkzoned.h as:
```c
#define BLKGETNRZONES _IOR(0x12, 133, __u32)
```
this ioctl writes 32 bits of data through the pointer. on 64-bit
architectures, size_t is 64 bits. fortunately, we initialize
nr_zones with 0, so the upper 32 bits remain zero. this works
on little-endian systems, but not on big-endian systems, where the
32 bits land in the upper half of the size_t. it is also
semantically wrong. we should pass a pointer to a 32-bit
value when calling ioctl(BLKGETNRZONES).
in this change, we change the type of nr_zones from size_t to
uint32_t to match what the Linux kernel expects.
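the endianness effect can be simulated with a byte buffer. this is a sketch only: the `struct` module stands in for the ioctl, and read_as_u64_after_u32_write() is a hypothetical helper.

```python
import struct


def read_as_u64_after_u32_write(value, byte_order):
    # An 8-byte slot initialised to zero, like the old size_t nr_zones.
    slot = bytearray(8)
    # Store a 32-bit value at the start of the slot, as a 32-bit write
    # through the pointer would.
    struct.pack_into(byte_order + "I", slot, 0, value)
    # Read the whole slot back as one 64-bit integer, as the caller did.
    return struct.unpack(byte_order + "Q", slot)[0]


little = read_as_u64_after_u32_write(7, "<")   # low half written: value preserved
big = read_as_u64_after_u32_write(7, ">")      # high half written: value shifted by 32 bits
```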
```
[1/3] Building CXX object src/crimson/os/seastore/CMakeFiles/crimson-seastore.dir/segment_manager.cc.o
/home/kefu/dev/ceph/src/crimson/os/seastore/segment_manager.cc:45:15: warning: lambda capture 'FNAME' is not used [-Wunused-lambda-capture]
45 | ).then([FNAME,
| ^
```
but we went further by coroutinizing the whole method. because the return
value of ioctl() was not checked before this change, and clang correctly
flagged this with a warning, we mark it with `[[maybe_unused]]`. we
will fix it in a separate change.