This commit makes these changes to the nvmeof top tool:
1. Improve/cleanup help text
2. Rename args (--group, --server-addr, --subsystem) to
(--gw-group, --server-address, --nqn) to match other nvmeof cmds
3. Validate args --period, --gw-group, --server-address, --sort-by
4. Remove --service arg (since group and service have 1-1 mapping, this is redundant)
5. Show all cpu stats if no args are passed to "ceph nvmeof top cpu"
6. Don't show busy/idle rates above 100%
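Items 3 and 6 above can be illustrated with a minimal sketch. The function names and the exact validation rule for --period are hypothetical and are not taken from the nvmeof top tool itself; the sketch only shows the general idea of rejecting bad input early and clamping a computed rate for display.

```python
def clamp_percent(value: float) -> float:
    """Clamp a computed busy/idle rate into the range [0.0, 100.0]."""
    return max(0.0, min(100.0, value))


def validate_period(period: int) -> int:
    """Reject non-positive refresh periods (hypothetical rule, for illustration)."""
    if period <= 0:
        raise ValueError(f"--period must be a positive integer, got {period}")
    return period
```

Clamping at display time keeps the underlying counters untouched, so a rate that briefly computes above 100% due to sampling jitter is still shown as 100%.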
debian: remove stale distutils override from py3dist-overrides
distutils was deprecated in Python 3.10 (PEP 632) and removed in
Python 3.12. The `python3-distutils` package no longer exists in
Debian Trixie (Python 3.13) or Ubuntu 24.04+ (Python 3.12).
The only runtime reference was in `debian/ceph-mgr.requires`, already
cleaned up by 3fb3f892aa3. This override is now dead code: no
installed file declares a runtime dependency on `distutils`, so
`dh_python3` never resolves it. Removing it prevents a latent
uninstallable-dependency bug if `distutils` were accidentally
reintroduced in a `.requires` file.
Fixes: https://tracker.ceph.com/issues/75901
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Max R. Carrara <m.carrara@proxmox.com>
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
libcephsqlite: ensure atexit handlers are registered after openssl
When the sqlite3 executable encounters an error with .bail=on, it will
make a call to exit(). The atexit() handlers will execute in LIFO order.
We need to ensure that openssl (before OpenSSL 4.0 [1]) atexit handlers are
registered before libcephsqlite.
[1] http://github.com/openssl/openssl/commit/31659fe32673a6bd66abf3f8a7d803e81c6ffeed (OpenSSL 4.0 no longer arms `OPENSSL_cleanup()` function as an `atexit(3)`)
Fixes: https://tracker.ceph.com/issues/59335
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
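The LIFO ordering of exit handlers described in the libcephsqlite commit can be seen in a small sketch. It uses Python's `atexit` module, which also runs handlers in last-registered, first-run order, as a stand-in for the C `atexit(3)` behavior; the handler names are hypothetical and this is not the libcephsqlite code.

```python
import subprocess
import sys
import textwrap

# Child script: the dependency registers its cleanup first, the dependent
# component registers second, so the dependent's cleanup runs first at exit.
SCRIPT = textwrap.dedent("""
    import atexit

    def dependency_cleanup():
        print("dependency cleanup")

    def dependent_cleanup():
        print("dependent cleanup")

    atexit.register(dependency_cleanup)
    atexit.register(dependent_cleanup)
""")


def run_exit_order() -> list:
    """Run the child script and return the order in which handlers printed."""
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Because the handler registered last runs first, a component that still needs its dependency during cleanup has to register after that dependency does.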
Increase op_delay to pggrow to avoid rapid PG splits.
Excessive splitting with a low reactor count can leave many PGs in
snaptrim, causing tests to hit the (short) snap trimming timeout.
Crimson's pggrow keeps the OSDs clean throughout the entire test,
which is against do_thrash expectations.
Increasing op_delay would reduce do_thrash "actions" back to a normal rate.
Use prompts that cannot be selected in CLI examples. Remove warnings
about selectable prompts.
Use privileged prompt for ceph commands.
Use inline formatting consistently.
Improve capitalization.
Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
test/crimson/seastore: add missing crimson::gtest to link libraries
unittest-transaction-manager, unittest-omap-manager,
unittest-btree-lba-manager, and unittest-seastore all include
gtest_seastar.cc but were not explicitly linking against crimson::gtest.
This worked previously because gtest symbols were pulled in transitively,
but with gcc-toolset-13 and LTO the transitive dependency is no longer
satisfied, producing undefined reference errors for testing::Message,
testing::Test, testing::AssertionSuccess, etc.
crimson/osd: make load_obc(obc, md_ref) return void
load_obc() taking an already-resolved loaded_object_md_t::ref is
synchronous: it just populates obc state and does not yield.
Returning an errorated future was unnecessary and caused a
-Wunused-result warning at its only call site:
ECRecoveryBackend::maybe_load_obc().
In this change, we change it to return void and deduplicate the OBC
population logic: the private async overload (taking future<md_ref>)
now validates ssc and returns object_corrupted on failure.
This silences the warning and simplifies the code. The async error
propagation is preserved.
Kyr Shatskyy [Tue, 27 Jan 2026 18:17:04 +0000 (19:17 +0100)]
qa/standalone/misc: make less noise for asok calculation
The $() notation not only calls the function, it also creates a subshell,
which is a separate process. Additionally, in debug mode the entire body
of the function get_asok_path is printed to the logs each time it is
called.
Instead of calling get_asok_path each time we need the asok file path
for osd.0, etc., we just predefine the corresponding variables.
It not only avoids extra resource usage, but also removes a lot of
noise from the logs.
Previously s3tests_java.py set JAVA_HOME using the `alternatives`
command. That had issues in that `alternatives` is not present on all
Ubuntu systems, and some installations of Java don't update
alternatives. So instead we look for a "java-8" jvm in /usr/lib/jvm/
and set JAVA_HOME to the first one we find.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
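The JAVA_HOME lookup described in the s3tests_java.py commit might look roughly like the sketch below. It runs locally rather than on a remote test node, and the function name, the glob pattern, and the sorted ordering are assumptions made for illustration; only the idea of picking the first "java-8" entry under /usr/lib/jvm/ comes from the commit message.

```python
import glob
import os


def find_java8_home(jvm_root="/usr/lib/jvm"):
    """Return the first directory under jvm_root whose name contains 'java-8'.

    Returns None when no such directory exists. Sorting makes the choice
    deterministic (an assumption; the commit only says "the first one we find").
    """
    candidates = sorted(glob.glob(os.path.join(jvm_root, "*java-8*")))
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None


def java_env(jvm_root="/usr/lib/jvm"):
    """Build an environment mapping with JAVA_HOME set, if a JVM was found."""
    env = dict(os.environ)
    java_home = find_java8_home(jvm_root)
    if java_home is not None:
        env["JAVA_HOME"] = java_home
    return env
```

Scanning the directory avoids any dependency on the `alternatives` command being installed or kept up to date.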
When graph walks are complete, it will actually be possible to link to
supported without repeating the links. The matrix implementation doesn't
support this.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
* refs/pull/67961/head:
qa/distros: link random distro to supported
qa/distros: add .qa links
qa: remove special chars from supported-random-distro
qa: remove redundant single-container-host
qa/distros/container-hosts: remove centos8
qa: remove distros/podman
qa: switch from distros/podman to supported-container-hosts
Kefu Chai [Sun, 29 Mar 2026 11:41:48 +0000 (19:41 +0800)]
crimson/osd: remove unnecessary named string for --smp value
The local variable `smp` is used only in the two immediately following
statements. Inline the fmt::format() call into emplace_back() and pass
reactor_num directly to the logger.
Kefu Chai [Sun, 29 Mar 2026 11:39:40 +0000 (19:39 +0800)]
crimson/osd: make early_config_t::to_ptr_vector private
The helper is an implementation detail of get_early_args() and
get_ceph_args(). Making it private prevents callers from inadvertently
holding the returned const char* pointers past the lifetime of the
input vector. Also fix the truncated doc-comment ("must not outlive in").
Shraddha Agrawal [Thu, 19 Mar 2026 08:01:28 +0000 (13:31 +0530)]
crimson/osd/pg_recovery: call MOSDPGRecoveryDelete instead of MOSDPGBackfillRemove
This commit fixes the abort in Recovered::Recovered.
There is a race to acquire the OBC lock between backfill and
client delete for the same object.
When the lock is acquired first by the backfill, the object is
recovered first, and then deleted by the client delete request.
When recovering the object, the corresponding peer_missing entry
is cleared and we are able to transition to Recovered state
successfully.
When the lock is acquired first by client delete request, the
object is deleted. Then backfill tries to recover the object,
finds it deleted and exits early. The stale peer_missing
entry is not cleared. In Recovered::Recovered, needs_recovery()
sees this stale peer_missing entry and calls abort.
The issue is fixed by sending MOSDPGRecoveryDelete from the client
path to peers and waiting for MOSDPGRecoveryDeleteReply in
recover_object.
Dror Guy [Mon, 23 Mar 2026 07:08:47 +0000 (09:08 +0200)]
osd: fix PrimaryLogPG op ordering during laggy state
When ops are kicked back from the waiting lists (unreadable, degraded,
blocked, etc.) during laggy state, they bypass newer ops in the
waiting_for_readable list. This fix places the kicked-back waiting
lists directly into the waiting_for_readable list in the right op order.
Fixes: https://tracker.ceph.com/issues/75403
Signed-off-by: Dror Guy <drorg@ionir.com>
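The ordering problem in the PrimaryLogPG commit can be pictured with a toy model. Everything here is hypothetical: the `Op` type, the per-op arrival sequence number, and the merge strategy are assumptions chosen to illustrate "requeue in the right op order", and do not reflect how PrimaryLogPG actually tracks or requeues ops.

```python
import heapq
from typing import List, NamedTuple


class Op(NamedTuple):
    seq: int   # hypothetical arrival sequence number
    name: str


def merge_by_arrival(waiting_for_readable: List[Op],
                     kicked_back: List[Op]) -> List[Op]:
    """Merge two lists, each already sorted by seq, into one arrival-ordered list."""
    return list(heapq.merge(kicked_back, waiting_for_readable,
                            key=lambda op: op.seq))
```

For example, merging a kicked-back list with sequence numbers 1 and 4 into a waiting list holding 3 and 5 yields the order 1, 3, 4, 5, so no op is processed ahead of one that arrived before it.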