Alex Ainscow [Wed, 8 Apr 2026 10:49:58 +0000 (11:49 +0100)]
osd: Allow multiple objects with same version in missing list.
Most of the time, a single version in a PG can only correspond to a single object.
However, following a PG merge it is possible, even likely, that two objects will
have the same version. The PG Log works around this by discarding the log.
However, during backfill, it is possible for the missing list to be built with
these duplicate versions.
A recently added assert detected that this scenario was corrupting the reverse
missing list (rmissing). This behaviour has always existed, but was previously
unnoticed. It could cause some bugs and potentially loop-asserts on OSDs,
although in most cases it would go unnoticed.
Here we fix this properly, by converting rmissing to a multimap. This is wrapped
in some insert functions, which assert that the rmissing list does not end up
with duplicate entries. The code is optimised for the case where there are no
duplicate versions.
Additionally, some of the old asserts have been rolled into the insert functions.
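The approach described above can be sketched as follows. This is only an illustration, not the code from the commit: `ReverseMissing`, `Version` and `ObjectId` are hypothetical stand-ins (the real code works with Ceph's own version and object types), and the fast path shown here is one possible way to optimise for the case with no duplicate versions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins for the real version and object identifier types.
using Version = std::uint64_t;
using ObjectId = std::string;

// Sketch of a reverse missing list that tolerates duplicate versions.
class ReverseMissing {
public:
  // Insert a (version, object) pair. Several objects may share a version,
  // but the exact same pair must never be inserted twice.
  void insert(Version v, const ObjectId& oid) {
    auto it = rmissing_.lower_bound(v);
    // Fast path: no entry with this version yet (the common case).
    if (it == rmissing_.end() || it->first != v) {
      rmissing_.emplace_hint(it, v, oid);
      return;
    }
    // Slow path: the version exists, so check for an exact duplicate.
    for (auto dup = it; dup != rmissing_.end() && dup->first == v; ++dup) {
      assert(dup->second != oid && "duplicate rmissing entry");
    }
    rmissing_.emplace(v, oid);
  }

  // Remove exactly one (version, object) pair. Returns true if it was found.
  bool erase(Version v, const ObjectId& oid) {
    auto range = rmissing_.equal_range(v);
    for (auto it = range.first; it != range.second; ++it) {
      if (it->second == oid) {
        rmissing_.erase(it);
        return true;
      }
    }
    return false;
  }

  std::size_t count(Version v) const { return rmissing_.count(v); }
  std::size_t size() const { return rmissing_.size(); }

private:
  std::multimap<Version, ObjectId> rmissing_;
};
```

With a plain map, inserting a second object under an existing version would either overwrite or silently drop an entry. With a multimap both entries survive, and erasing has to match on the object as well as the version.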
Fixes: https://tracker.ceph.com/issues/75778
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
libcephsqlite: ensure atexit handlers are registered after openssl
When the sqlite3 executable encounters an error with .bail=on, it will
make a call to exit(). The atexit() handlers will execute in LIFO order.
We need to ensure that openssl (before OpenSSL 4.0 [1]) atexit handlers are
registered before libcephsqlite.
[1] http://github.com/openssl/openssl/commit/31659fe32673a6bd66abf3f8a7d803e81c6ffeed (OpenSSL 4.0 no longer arms `OPENSSL_cleanup()` function as an `atexit(3)`)
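The LIFO behaviour this relies on can be illustrated with a small sketch. The handler names below are hypothetical stand-ins and do not correspond to functions in OpenSSL or libcephsqlite. They only record the order in which they run.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Records the order in which the exit handlers actually run.
static const char* g_order[2] = {nullptr, nullptr};
static int g_count = 0;

// Hypothetical stand-in for the cleanup that OpenSSL (before 4.0) registers.
void openssl_like_cleanup() {
  if (g_count < 2) g_order[g_count++] = "openssl";
}

// Hypothetical stand-in for the libcephsqlite exit handler.
void cephsqlite_like_cleanup() {
  if (g_count < 2) g_order[g_count++] = "libcephsqlite";
}

// Register the openssl-like handler first and the libcephsqlite-like one
// second. atexit() handlers run in reverse order of registration, so the
// libcephsqlite-like handler runs first, before the openssl-like cleanup.
inline void register_handlers() {
  int rc = std::atexit(openssl_like_cleanup);
  assert(rc == 0);
  rc = std::atexit(cephsqlite_like_cleanup);
  assert(rc == 0);
  (void)rc;
}

// True once both handlers have run in the expected LIFO order.
inline bool ran_in_expected_order() {
  return g_count == 2 &&
         std::strcmp(g_order[0], "libcephsqlite") == 0 &&
         std::strcmp(g_order[1], "openssl") == 0;
}
```

A handler registered before `register_handlers()` is called runs last, so it can call `ran_in_expected_order()` to confirm the ordering at process exit.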
Fixes: https://tracker.ceph.com/issues/59335
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Use prompts that cannot be selected in CLI examples. Remove warnings
about selectable prompts.
Use privileged prompt for ceph commands.
Use inline formatting consistently.
Improve capitalization.
Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
test/crimson/seastore: add missing crimson::gtest to link libraries
unittest-transaction-manager, unittest-omap-manager,
unittest-btree-lba-manager, and unittest-seastore all include
gtest_seastar.cc but were not explicitly linking against crimson::gtest.
This worked previously because gtest symbols were pulled in transitively,
but with gcc-toolset-13 and LTO the transitive dependency is no longer
satisfied, producing undefined reference errors for testing::Message,
testing::Test, testing::AssertionSuccess, etc.
crimson/osd: make load_obc(obc, md_ref) return void
load_obc() taking an already-resolved loaded_object_md_t::ref is
synchronous: because it just populates obc state, it does not yield.
Returning an errorated future was unnecessary and caused a
-Wunused-result warning at its only call site:
ECRecoveryBackend::maybe_load_obc().
In this change, we change it to return void and deduplicate the OBC
population logic: the private async overload (taking future<md_ref>)
now validates ssc and returns object_corrupted on failure.
This silences the warning and makes the code simpler. The async error
propagation is preserved.
Kyr Shatskyy [Tue, 27 Jan 2026 18:17:04 +0000 (19:17 +0100)]
qa/standalone/misc: make less noise for asok calculation
The $() notation not only calls the function, it also creates a subshell,
which is a separate process. Additionally, in debug mode the entire body
of the function get_asok_path is printed to the logs each time it is
called.
Instead of calling get_asok_path each time we need the asok file path
for osd.0 etc., we predefine the corresponding variables.
This not only avoids extra resource usage, but also removes a lot of
noise from the logs.
Previously s3tests_java.py set JAVA_HOME using the `alternatives`
command. That had issues in that `alternatives` is not present on all
Ubuntu systems, and some installations of Java don't update
alternatives. So instead we look for a "java-8" jvm in /usr/lib/jvm/
and set JAVA_HOME to the first one we find.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
When graph walks are complete, it will actually be possible to link to
supported without repeating the links. The matrix implementation doesn't
support this.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
* refs/pull/67961/head:
qa/distros: link random distro to supported
qa/distros: add .qa links
qa: remove special chars from supported-random-distro
qa: remove redundant single-container-host
qa/distros/container-hosts: remove centos8
qa: remove distros/podman
qa: switch from distros/podman to supported-container-hosts
Kefu Chai [Sun, 29 Mar 2026 11:41:48 +0000 (19:41 +0800)]
crimson/osd: remove unnecessary named string for --smp value
The local variable `smp` is used only in the two immediately following
statements. Inline the fmt::format() call into emplace_back() and pass
reactor_num directly to the logger.
Kefu Chai [Sun, 29 Mar 2026 11:39:40 +0000 (19:39 +0800)]
crimson/osd: make early_config_t::to_ptr_vector private
The helper is an implementation detail of get_early_args() and
get_ceph_args(). Making it private prevents callers from inadvertently
holding the returned const char* pointers past the lifetime of the
input vector. Also fix the truncated doc-comment ("must not outlive in").
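The lifetime concern can be sketched like this. `EarlyConfigSketch` is a hypothetical, simplified stand-in for `early_config_t`; the member layout, the constructor and the use of `std::to_string` in place of `fmt::format()` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for early_config_t.
class EarlyConfigSketch {
public:
  explicit EarlyConfigSketch(unsigned reactor_num) {
    assert(reactor_num > 0);
    early_args_.emplace_back("--smp");
    // Format the value directly in emplace_back(), with no named temporary.
    early_args_.emplace_back(std::to_string(reactor_num));
  }

  // The returned pointers alias strings owned by *this, so they stay valid
  // for as long as this object is alive and unmodified.
  std::vector<const char*> get_early_args() const {
    return to_ptr_vector(early_args_);
  }

private:
  // Implementation detail: the result must not outlive `in`. Keeping it
  // private means outside callers cannot pass a short-lived vector and
  // keep dangling pointers.
  static std::vector<const char*> to_ptr_vector(
      const std::vector<std::string>& in) {
    std::vector<const char*> out;
    out.reserve(in.size());
    for (const auto& s : in) {
      out.push_back(s.c_str());
    }
    return out;
  }

  std::vector<std::string> early_args_;
};
```

Because the only callers are member functions that pass member vectors, the pointer lifetime is tied to the owning object rather than to whatever vector a caller happens to pass in.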
Shraddha Agrawal [Thu, 19 Mar 2026 08:01:28 +0000 (13:31 +0530)]
crimson/osd/pg_recovery: call MOSDPGRecoveryDelete instead of MOSDPGBackfillRemove
This commit fixes the abort in Recovered::Recovered.
There is a race to acquire the OBC lock between backfill and
client delete for the same object.
When the lock is acquired first by the backfill, the object is
recovered first, and then deleted by the client delete request.
When recovering the object, the corresponding peer_missing entry
is cleared and we are able to transition to Recovered state
successfully.
When the lock is acquired first by client delete request, the
object is deleted. Then backfill tries to recover the object,
finds it deleted and exits early. The stale peer_missing
entry is not cleared. In Recovered::Recovered, needs_recovery()
sees this stale peer_missing entry and calls abort.
The issue is fixed by sending MOSDPGRecoveryDelete from the client
path to peers and waiting for MOSDPGRecoveryDeleteReply in
recover_object.
Dror Guy [Mon, 23 Mar 2026 07:08:47 +0000 (09:08 +0200)]
osd: fix PrimaryLogPG op ordering during laggy state
When ops are kicked back from the waiting lists (unreadable, degraded,
blocked, etc.) during laggy state, they bypass newer ops in the
waiting_for_readable list. This fix places the kicked-back waiting
list directly into the waiting_for_readable list with the right op order.
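One way to picture the ordering requirement is the sketch below. It is not the PrimaryLogPG code: `Op`, its `seq` field and `requeue_in_order()` are hypothetical, and it assumes every op carries a monotonically increasing sequence number and that both lists are already ordered by it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <list>

// Hypothetical op record: a lower seq means an older op.
struct Op {
  std::uint64_t seq;
};

inline bool older(const Op& a, const Op& b) { return a.seq < b.seq; }

// Move every op from `kicked` into `waiting_for_readable` so that the
// combined list stays ordered by seq. Both inputs must already be ordered.
inline void requeue_in_order(std::list<Op>& waiting_for_readable,
                             std::list<Op>& kicked) {
  assert(std::is_sorted(waiting_for_readable.begin(),
                        waiting_for_readable.end(), older));
  assert(std::is_sorted(kicked.begin(), kicked.end(), older));
  // std::list::merge splices the nodes across without copying and leaves
  // `kicked` empty.
  waiting_for_readable.merge(kicked, older);
}
```

Appending the kicked-back ops to the end of the queue would instead let newer ops that were already waiting run ahead of them.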
Fixes: https://tracker.ceph.com/issues/75403
Signed-off-by: Dror Guy <drorg@ionir.com>
Ville Ojamo [Tue, 31 Mar 2026 06:51:15 +0000 (13:51 +0700)]
doc/ceph-volume: Fix spelling etc errors
Low-hanging spelling, punctuation, and capitalization errors.
Ignore style and other more complex issues.
Use angle brackets consistently for value placeholders.
Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>
Aliaksei Makarau [Tue, 31 Mar 2026 06:40:04 +0000 (08:40 +0200)]
This change introduces shared memory communication (SMC-D) for the cluster network.
SMC-D is faster than Ethernet in IBM Z LPARs and/or VMs (z/VM or KVM).