Patrick Donnelly [Tue, 31 Mar 2026 13:10:08 +0000 (18:40 +0530)]
mon/MonClient: check stopping for auth request handling
When the MonClient is shutting down, it is no longer safe to
access MonClient::auth and other members. The AuthClient
methods should be checking the stopping flag in this case.
The key bit from the segfault backtrace (thanks Brad Hubbard!) is here:
#13 0x00007f921ee23c40 in ProtocolV2::handle_auth_done (this=0x7f91cc0945f0, payload=...) at /usr/include/c++/12/bits/shared_ptr_base.h:1665
#14 0x00007f921ee16a29 in ProtocolV2::run_continuation (this=0x7f91cc0945f0, continuation=...) at msg/./src/msg/async/ProtocolV2.cc:54
#15 0x00007f921edee56e in std::function<void (char*, long)>::operator()(char*, long) const (__args#1=0, __args#0=<optimized out>, this=0x7f91cc0744d8) at /usr/include/c++/12/bits/std_function.h:591
#16 AsyncConnection::process (this=0x7f91cc074140) at msg/./src/msg/async/AsyncConnection.cc:485
#17 0x00007f921ee3300c in EventCenter::process_events (this=0x55efc9d0a058, timeout_microseconds=<optimized out>, working_dur=0x7f921a891d88) at msg/./src/msg/async/Event.cc:465
#18 0x00007f921ee38bf9 in operator() (__closure=<optimized out>) at msg/./src/msg/async/Stack.cc:50
#19 std::__invoke_impl<void, NetworkStack::add_thread(Worker*)::<lambda()>&> (__f=...) at /usr/include/c++/12/bits/invoke.h:61
#20 std::__invoke_r<void, NetworkStack::add_thread(Worker*)::<lambda()>&> (__fn=...) at /usr/include/c++/12/bits/invoke.h:111
#21 std::_Function_handler<void(), NetworkStack::add_thread(Worker*)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/12/bits/std_function.h:290
#22 0x00007f921e81f253 in std::execute_native_thread_routine (__p=0x55efc9e9c5f0) at ../../../../../src/libstdc++-v3/src/c++11/thread.cc:82
#23 0x00007f921f5e8ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#24 0x00007f921f67a8d0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
I originally thought this might be the cause of [1]; however, that
turned out to be caused by OpenSSL's use of atexit handlers.
I still think there is a bug here, so I am continuing with this change.
[1] https://tracker.ceph.com/issues/59335
Fixes: https://tracker.ceph.com/issues/76017
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
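The guard described in this commit can be illustrated with a small sketch. It is written in Python rather than the C++ of the actual change, and every name in it (AuthClientSketch, the -1 return value, the dict used as auth state) is a hypothetical stand-in, not the MonClient API. It only shows the general pattern: take the lock, check the stopping flag, and return early before touching state that shutdown may have torn down.

```python
import threading


class AuthClientSketch:
    """Hypothetical stand-in for the auth callbacks; not the MonClient API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._stopping = False
        self._auth = {"global_id": 0}  # placeholder for the auth state

    def shutdown(self):
        with self._lock:
            self._stopping = True
            self._auth = None  # torn down; must not be touched again

    def handle_auth_done(self, global_id):
        with self._lock:
            if self._stopping:
                # Bail out before touching any torn-down member.
                # -1 is a placeholder error code for this sketch.
                return -1
            self._auth["global_id"] = global_id
            return 0


client = AuthClientSketch()
before = client.handle_auth_done(7)   # normal path
client.shutdown()
after = client.handle_auth_done(8)    # rejected once stopping is set
```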
Patrick Donnelly [Tue, 14 Apr 2026 15:46:54 +0000 (11:46 -0400)]
Merge PR #66294 into main
* refs/pull/66294/head:
qa: enforce centos9 for test
qa: rename distro
qa/suites/fs/bugs: use centos9 for squid upgrade test
qa: remove unused variables
qa: use centos9 for fs suites using k-testing
qa: update fs suite to rocky10
qa: skip dashboard install due to dependency noise
qa: only setup nat rules during bridge creation
qa: correct wording of comment
qa: use nft instead iptables
qa: use py3 builtin ipaddress module
Patrick Donnelly [Wed, 21 Jan 2026 17:25:31 +0000 (12:25 -0500)]
tools/cephfs: add new cephfs-tool
This patch introduces `cephfs-tool`, a new standalone C++ utility
designed to interact directly with `libcephfs`.
While the tool is architected to support various subcommands in the
future, the initial implementation focuses on a `bench` command to
measure library performance. This allows developers and administrators
to benchmark the userspace library isolated from FUSE or Kernel client
overheads.
Key features include:
* Multi-threaded Read/Write throughput benchmarking.
* Configurable block sizes, file counts, and fsync intervals.
* Detailed statistical reporting (Mean, Std Dev, Min/Max) for throughput and IOPS.
* Support for specific CephFS user/group impersonation (UID/GID) via `ceph_mount_perms_set`.
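As a rough illustration of the statistical reporting listed above, the following Python sketch summarizes a set of per-iteration throughput samples. It is not taken from cephfs-tool (which is C++); the function name, the sample values, and the choice of the sample standard deviation are assumptions made for the example.

```python
import statistics


def summarize(samples):
    """Return mean, sample standard deviation, min and max of the samples."""
    return {
        "mean": statistics.mean(samples),
        # stdev() needs at least two samples; report 0.0 for a single run.
        "stddev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "min": min(samples),
        "max": max(samples),
    }


# Hypothetical per-iteration throughput values in MB/s.
result = summarize([100.0, 110.0, 90.0])
```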
As an example test on a "trial" sepia machine against the new LRC, I
used a command like:
General Options:
-h [ --help ] Produce help message
-c [ --conf ] arg Ceph config file path
-i [ --id ] arg (=admin) Client ID
-k [ --keyring ] arg Path to keyring file
--filesystem arg CephFS filesystem name to mount
--uid arg (=-1) User ID to mount as
--gid arg (=-1) Group ID to mount as
Benchmark Options (used with 'bench' command):
--threads arg (=1) Number of threads
--iterations arg (=1) Number of iterations
--files arg (=100) Total number of files
--size arg (=4MB) File size (e.g. 4MB, 0 for creates only)
--block-size arg (=4MB) IO block size (e.g. 1MB)
--fsync-every arg (=0) Call fsync every N bytes
--prefix arg (=benchmark_) Filename prefix
--dir-prefix arg (=bench_run_) Directory prefix
--root-path arg (=/) Root path in CephFS
--per-thread-mount Use separate mount per thread
--no-cleanup Disable cleanup of files
AI-Assisted: significant portions of this code were AI-generated through dozens of iterative prompts.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
cephadm: wait for latest osd map after ceph-volume before OSD deploy
After ceph-volume creates an OSD, the mgr's cached OSD map can lag
behind the monitors. get_osd_uuid_map() then misses the new OSD id and
deploy_osd_daemons_for_existing_osds() skips deploying the cephadm
daemon, reporting a misleading "Created no osd(s)" even though the OSD
exists.
This behavior is often seen with raw devices (lvm list returns quicker).
This also fixes get_osd_uuid_map(only_up=True) as the previous branch
never populated the map when 'only_up' was true.
Now we only include osds with 'up==1' so a new OSD created (but still down)
is not treated as already present.
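The only_up behavior described above can be sketched as a standalone function. This is a simplified illustration, not the cephadm implementation: it takes an already-decoded OSD map dict as an argument, and the dict layout ('osds' entries with 'osd', 'uuid' and 'up' keys) and the exception type are assumptions for the example.

```python
def get_osd_uuid_map(osd_map, only_up=False):
    """Map OSD id (as a string) to its uuid from a decoded OSD map dict."""
    result = {}
    for osd in osd_map.get("osds", []):
        osd_id = osd.get("osd")
        if osd_id is None:
            raise ValueError("could not retrieve osd id from osd map")
        if only_up and osd.get("up") != 1:
            # A created-but-down OSD is not treated as already present.
            continue
        result[str(osd_id)] = osd.get("uuid", "")
    return result


# Hypothetical map with one up OSD and one freshly created, still-down OSD.
osd_map = {"osds": [
    {"osd": 0, "uuid": "aaaa", "up": 1},
    {"osd": 1, "uuid": "bbbb", "up": 0},
]}
all_osds = get_osd_uuid_map(osd_map)
up_osds = get_osd_uuid_map(osd_map, only_up=True)
```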
Patrick Donnelly [Tue, 14 Apr 2026 00:47:43 +0000 (20:47 -0400)]
qa: rename distro
The kernel mount overrides for the distro have no effect if they are
applied before `supported-random-distro`.
Fixes:
2026-04-13T19:06:13.603 INFO:teuthology.task.pexec:sudo dnf remove nvme-cli -y
2026-04-13T19:06:13.603 INFO:teuthology.task.pexec:sudo dnf install nvmetcli nvme-cli -y
2026-04-13T19:06:13.626 INFO:teuthology.task.pexec:Running commands on host ubuntu@trial005.front.sepia.ceph.com
2026-04-13T19:06:13.627 INFO:teuthology.task.pexec:sudo dnf remove nvme-cli -y
2026-04-13T19:06:13.627 INFO:teuthology.task.pexec:sudo dnf install nvmetcli nvme-cli -y
2026-04-13T19:06:13.652 INFO:teuthology.orchestra.run.trial148.stderr:sudo: dnf: command not found
2026-04-13T19:06:13.653 DEBUG:teuthology.orchestra.run:got remote process result: 1
2026-04-13T19:06:13.654 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/run_tasks.py", line 105, in run_tasks
manager = run_one_task(taskname, ctx=ctx, config=config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/run_tasks.py", line 83, in run_one_task
return task(**kwargs)
^^^^^^^^^^^^^^
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/task/pexec.py", line 149, in task
with parallel() as p:
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/parallel.py", line 84, in __exit__
for result in self:
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/parallel.py", line 98, in __next__
resurrect_traceback(result)
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/parallel.py", line 30, in resurrect_traceback
raise exc.exc_info[1]
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/parallel.py", line 23, in capture_traceback
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/task/pexec.py", line 62, in _exec_host
tor.wait([r])
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/orchestra/run.py", line 485, in wait
proc.wait()
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/orchestra/run.py", line 161, in wait
self._raise_for_status()
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/orchestra/run.py", line 181, in _raise_for_status
raise CommandFailedError(
teuthology.exceptions.CommandFailedError: Command failed on trial148 with status 1: 'TESTDIR=/home/ubuntu/cephtest bash -s'
This happened because these dnf commands were pulled from rocky10.yaml via the kclient overrides, but ubuntu_latest was used for the random distro.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Patrick Donnelly [Thu, 12 Feb 2026 15:36:29 +0000 (10:36 -0500)]
qa: use centos9 for fs suites using k-testing
A better approach would be to include centos9 OR rocky10 for
distribution choice. Then we can just filter out rocky10 when we're
testing the `testing` kernel but keep rocky10 coverage for other
testing.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Patrick Donnelly [Wed, 19 Nov 2025 17:25:45 +0000 (12:25 -0500)]
qa: skip dashboard install due to dependency noise
2025-11-18T19:46:46.226 INFO:teuthology.orchestra.run.smithi008.stdout:/usr/bin/ceph: stderr Error ENOTSUP: Module 'alerts' is not enabled/loaded (required by command 'dashboard set-ssl-certificate'): use `ceph mgr module enable alerts` to enable it
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Gil Bregman [Mon, 13 Apr 2026 21:41:25 +0000 (00:41 +0300)]
mgr/dashboard: Add port and secure-listeners to subsystem add NVMeoF CLI command
Fixes: https://tracker.ceph.com/issues/75998
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
mgr/cephadm: fix nvmeof TLS handling and add coverage for ssl/mTLS
This PR fixes the value of the `ssl` field on `NvmeofServiceSpec` (it
was always set to enable_auth) and adds unit tests to make sure that
specs with ssl only and specs with mTLS enabled (enable_auth) both
generate the expected daemon configuration.
Nizamudeen A [Sun, 12 Apr 2026 06:06:30 +0000 (11:36 +0530)]
ceph.spec.in: replace golang github prometheus with promtool binary path
I no longer see golang-github-prometheus available for CentOS or for
other distros. Different packages provide promtool on different
distros, so instead of identifying every corresponding package and its
name, replace the package name with the binary path so that it works
across distros without distro-specific conditions.
Some recent build failures were captured in our internal runs:
https://github.com/rhcs-dashboard/ceph-dev/actions/runs/24298848427/job/70949666821
Gil Bregman [Sun, 12 Apr 2026 16:18:07 +0000 (19:18 +0300)]
mgr/dashboard: Add location to gateway info command in NVMeoF CLI
Fixes: https://tracker.ceph.com/issues/75968
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
This commit makes these changes to the nvmeof top tool:
1. Improve/cleanup help text
2. Rename args (--group, --server-addr, --subsystem) to
(--gw-group, --server-address, --nqn) to match other nvmeof cmds
3. Validate args --period, --gw-group, --server-address, --sort-by
4. Remove --service arg (since group and service have 1-1 mapping, this is redundant)
5. Show all cpu stats if no args are passed to "ceph nvmeof top cpu"
6. Don't show busy/idle rate more than 100%
mgr/dashboard: replace deprecated codecs.open with open
codecs.open() has been deprecated since Python 3.14, see
https://docs.python.org/3/library/codecs.html#codecs.open.
Let's use the builtin open() as recommended by the official
documentation.
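A minimal sketch of the replacement pattern, assuming a UTF-8 text file; the helper name and the file name are made up for the example and do not come from the dashboard code:

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    # Builtin open() in text mode decodes with the given encoding,
    # which covers the common codecs.open(path, "r", encoding) use.
    with open(path, encoding=encoding) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("héllo\n")
    content = read_text(path)
```

One difference worth keeping in mind: codecs.open() always opens the underlying file in binary mode and does no newline conversion, while open() in text mode applies universal newline handling by default.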
Vinayak Tiwari [Sun, 8 Feb 2026 08:24:29 +0000 (13:54 +0530)]
osd: avoid ceph_abort in build_incremental_map_msg when newest_map is missing
When sharing OSD maps with a peer (e.g. during heartbeat in
maybe_share_map), we may have already trimmed the requested range or
newest_map (e.g. trim race, or store read failure). In that case the
panic path tried to send newest_map; if it could not be loaded, the
code called ceph_abort() and crashed the OSD.
Log and return an empty MOSDMap instead of aborting. The receiver drops
such messages (last <= superblock.get_newest_map()) and can re-request
from the mon.
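The control flow described above can be modeled in a few lines. This is a language-neutral sketch written in Python, not the OSD's C++ code: the store is modeled as a dict of epoch to map, and the function signatures, the should_drop helper, and the log text are assumptions for the illustration.

```python
import logging


def build_incremental_map_msg(store, since, to, newest):
    """Return {epoch: map} for (since, to], or a fallback if incomplete."""
    maps = {}
    for epoch in range(since + 1, to + 1):
        if epoch not in store:
            break  # part of the requested range was trimmed or unreadable
        maps[epoch] = store[epoch]
    else:
        return maps
    # Fallback path: try to send only the newest map.
    if newest in store:
        return {newest: store[newest]}
    # Previously this case aborted; now log and send an empty message.
    logging.warning("unable to load newest map %d; sending empty message", newest)
    return {}


def should_drop(msg, receiver_newest):
    """Receiver side: drop messages that carry nothing newer."""
    last = max(msg) if msg else 0
    return last <= receiver_newest


store = {5: "m5", 6: "m6"}
full = build_incremental_map_msg(store, 4, 6, 6)
fallback = build_incremental_map_msg(store, 2, 6, 6)
empty = build_incremental_map_msg({}, 2, 6, 6)
```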
Gil Bregman [Mon, 6 Apr 2026 22:08:15 +0000 (01:08 +0300)]
mgr/dashboard: Add namespace encryption support to NVMeoF CLI
Fixes: https://tracker.ceph.com/issues/74965
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
Devika Babrekar [Wed, 4 Mar 2026 09:07:12 +0000 (14:37 +0530)]
mgr/dashboard: Making 'ISA' as default plugin for EC profiles created through dashboard
Fixes: https://tracker.ceph.com/issues/75312
Signed-off-by: Devika Babrekar <devika.babrekar@ibm.com>
qa/cephadm: fix NFS ganesha startup failure in containers
The test_cephadm.sh workunit deploys NFS using cephadm _orch deploy with
config_blobs sourced from src/cephadm/samples/nfs.json. The ganesha.conf
section in that sample has no NFS_CORE_PARAM block, so allow_set_io_flusher_fail
defaults to false.
On Rocky Linux 10 (the current base for ceph:main images), ganesha 7.0 calls
prctl(PR_SET_IO_FLUSHER) at startup. Containers lack the required capabilities
(CAP_SYS_ADMIN/CAP_SYS_RAWIO) for this syscall, so it returns EPERM. With
allow_set_io_flusher_fail unset, ganesha treats this as a fatal error and aborts
immediately, before even fetching the %url RADOS config.
The orchestrator path (ganesha.conf.j2) already has allow_set_io_flusher_fail = true
in its NFS_CORE_PARAM block. This fix brings the sample config used by the
standalone test path in line with it.
This commit fixes an issue where the debug suffix is overwritten when
the image is not the base distro. This is required in particular for
crimson debug builds to work on rocky10.
Aashish Sharma [Tue, 31 Mar 2026 04:30:23 +0000 (10:00 +0530)]
mgr/dashboard: Add option to edit zone with keys/arguments like "sync_from" and "sync_from_all"
Currently, there is no option to configure the sync_from and
sync_from_all keys directly while creating or editing a zone from the
dashboard. These arguments are particularly important when setting up
archive zones. In archive zones, duplicate objects appear when
sync_from_all is set to true (which is the default). The fix is to:
1. Set sync_from_all to false
2. Set sync_from to point to the master zone only
This ensures that the archive zone syncs exclusively from the master zone, preventing duplicate object issues.
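For reference, the same two settings can be expressed as a radosgw-admin invocation. The sketch below only builds the argument list; it is not how the dashboard applies the change, and the helper name and zone names are hypothetical. The --rgw-zone, --sync-from-all and --sync-from options are the ones documented for radosgw-admin zone modify.

```python
def archive_zone_modify_args(zone, master_zone):
    """Build a radosgw-admin argv list for the two settings above."""
    return [
        "radosgw-admin", "zone", "modify",
        f"--rgw-zone={zone}",
        "--sync-from-all=false",        # step 1: stop syncing from all peers
        f"--sync-from={master_zone}",   # step 2: sync from the master only
    ]


# Hypothetical zone names.
args = archive_zone_modify_args("archive", "primary")
```

In a multisite setup a zone change like this normally also needs a period update and commit before it takes effect.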
rgw/posix: fix Inotify member initialization order race
wfd and efd were initialized in the Inotify constructor body, but
the inotify thread was started in the member initializer list via
thrd(&Inotify::ev_loop, this). Since C++ initializes members in
declaration order (wfd, efd, thrd), the thread could start before
the constructor body ran, causing ev_loop() to capture indeterminate
fd values into its local pollfd array.
When the destructor later signaled shutdown via the real efd, the
thread never woke because it was polling the wrong fd, causing
thrd.join() to block indefinitely. This predominantly affected arm64
builds due to the weaker memory model widening the race window.
Fix by moving wfd/efd initialization into the member initializer
list so they are set before the thread starts. Also make shutdown
std::atomic<bool> to eliminate the data race, close wfd/efd in the
destructor to fix the fd leak, and add error checking for eventfd().
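The ordering rule behind this fix, that all state read by a thread must be set up before the thread starts, can be shown with a Python analogue. This is not the RGW code and does not reproduce C++ member initialization order: the Watcher class is hypothetical, a pipe stands in for the eventfd, and threading.Event stands in for the atomic shutdown flag. The sketch assumes a Unix-like system where select() works on pipe descriptors.

```python
import os
import select
import threading


class Watcher:
    """Hypothetical stand-in for the Inotify helper; not the RGW code."""

    def __init__(self):
        # Create the wake-up pipe and the shutdown flag *before* starting
        # the thread, so the loop never observes uninitialised state.
        self._rfd, self._wfd = os.pipe()
        self._shutdown = threading.Event()
        self._thread = threading.Thread(target=self._loop)
        self._thread.start()

    def _loop(self):
        while not self._shutdown.is_set():
            # Block until the wake-up descriptor becomes readable.
            readable, _, _ = select.select([self._rfd], [], [])
            if readable:
                os.read(self._rfd, 1)

    def close(self):
        self._shutdown.set()
        os.write(self._wfd, b"x")  # wake the loop so it sees the flag
        self._thread.join()
        # Close both descriptors so they are not leaked.
        os.close(self._rfd)
        os.close(self._wfd)


watcher = Watcher()
watcher.close()
stopped = not watcher._thread.is_alive()
```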
Leonid Chernin [Tue, 17 Mar 2026 15:40:16 +0000 (17:40 +0200)]
nvmeofgw: propagate quorum feature to the NVMeofMonClient,
reverted feature bit NVMEOF_BEACON_DIFF:
- NVMeofGwMon adds a quorum_features indication to the MonClient map.
- MonClient initially sends beacons without applying the BEACON_DIFF logic.
- MonClient begins applying the BEACON_DIFF logic only when the BEACON_DIFF bit
  is set in the quorum_features field of the NVMeoF monitor map.
- Added mon commands:
  nvme-gw set beacon-diff disable
  nvme-gw set beacon-diff enable
- Changed encode/decode of the BEACON_DIFF feature.
- Reverted the NVMEOF_BEACON_DIFF bit.
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>