Shraddha Agrawal [Thu, 19 Mar 2026 08:01:28 +0000 (13:31 +0530)]
crimson/osd/pg_recovery: call MOSDPGRecoveryDelete instead of MOSDPGBackfillRemove
This commit fixes the abort in Recovered::Recovered.
There is a race to acquire the OBC lock between backfill and
client delete for the same object.
When the lock is acquired first by the backfill, the object is
recovered first, and then deleted by the client delete request.
When recovering the object, the corresponding peer_missing entry
is cleared and we are able to transition to Recovered state
successfully.
When the lock is acquired first by client delete request, the
object is deleted. Then backfill tries to recover the object,
finds it deleted and exits early. The stale peer_missing
entry is not cleared. In Recovered::Recovered, needs_recovery()
sees this stale peer_missing entry and calls abort.
The issue is fixed by sending MOSDPGRecoveryDelete from the client
path to peers and waiting for MOSDPGRecoveryDeleteReply in
recover_object.
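
The two lock orderings described above can be illustrated with a toy model. This is only a sketch: ToyPG, run and the notify_peers flag are hypothetical names, not crimson classes, and the model assumes that the recovery-delete round trip to peers results in the peer_missing entry being cleared.

```python
# Toy model only: names are hypothetical and do not mirror crimson's code.
class ToyPG:
    def __init__(self, oid):
        self.objects = {oid}        # objects present on the primary
        self.peer_missing = {oid}   # objects a backfill peer still lacks

    def backfill(self, oid):
        if oid not in self.objects:
            return                  # object already deleted: exit early
        self.peer_missing.discard(oid)  # recovery clears the entry

    def client_delete(self, oid, notify_peers):
        self.objects.discard(oid)
        if notify_peers:
            # Assumed effect of the recovery-delete exchange with peers.
            self.peer_missing.discard(oid)

    def needs_recovery(self):
        return bool(self.peer_missing)


def run(order, notify_peers):
    """Apply the steps in the given order; report whether recovery is still needed."""
    pg = ToyPG("obj")
    for step in order:
        if step == "backfill":
            pg.backfill("obj")
        else:
            pg.client_delete("obj", notify_peers)
    return pg.needs_recovery()
```

In this model only the delete-first ordering without peer notification leaves a stale entry behind, which corresponds to the case where needs_recovery() triggers the abort.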
Aliaksei Makarau [Tue, 31 Mar 2026 06:40:04 +0000 (08:40 +0200)]
This change introduces shared memory communication (SMC-D) for the cluster network.
SMC-D is faster than Ethernet in IBM Z LPARs and/or VMs (zVM or KVM).
bst2002git [Wed, 4 Mar 2026 15:48:20 +0000 (16:48 +0100)]
found duplicate series for the match group {fs_id="-1"}
when 1 MDS active and 2 MDS standby (on 3Node-Cluster)
found duplicate series for the match group {fs_id="-1"} on the right hand-side of the operation
many-to-many matching not allowed: matching labels must be unique on one side
Vallari Agrawal [Thu, 12 Mar 2026 13:50:00 +0000 (19:20 +0530)]
mgr/dashboard: Add 'network_mask' to nvmeof cli
This commit adds the following to the nvmeof cli:
0. Add new param `--network-mask` to 'subsystem add' cmd
It's a list parameter, so multiple netmasks can be passed with
`subsystem add --network-mask <subnet1> --network-mask <subnet2>`
1. Add new cli `subsystem add_network --network-mask <subnet>`
2. Add new cli `subsystem del_network --network-mask <subnet>`
3. Add column 'network_mask' to `subsystem list` output
4. Add column 'manual' to `listener list` output
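
A repeatable list option like the one in item 0 can be sketched with Python's argparse. This is not the dashboard's actual CLI code: build_parser and parse_masks are hypothetical helpers, and the subnets in the usage note are made-up example values.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the 'subsystem add' argument parser.
    parser = argparse.ArgumentParser(prog="subsystem-add-sketch")
    # action="append" collects every occurrence of the flag into one list.
    parser.add_argument("--network-mask", action="append",
                        dest="network_mask", default=None)
    return parser


def parse_masks(argv):
    """Return all --network-mask values in order, or [] if none were given."""
    masks = build_parser().parse_args(argv).network_mask
    return masks or []
```

For example, parse_masks(["--network-mask", "10.0.0.0/24", "--network-mask", "192.168.1.0/24"]) yields both subnets in the order given.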
Shraddha Agrawal [Mon, 30 Mar 2026 10:12:08 +0000 (15:42 +0530)]
qa/tasks/cephadm.py: only pass --objectstore when not bluestore
This commit ensures that the --objectstore argument is passed to
cephadm's add/apply OSD command only when the value is not the
default value, bluestore.
This is done to ensure older ceph releases, like Squid and Tentacle,
do not fail, as the --objectstore argument was only added in Umbrella.
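
The conditional can be sketched as follows. objectstore_args is a hypothetical helper, not the function in qa/tasks/cephadm.py, and passing the value as a separate list element is an assumption about the argument form.

```python
DEFAULT_OBJECTSTORE = "bluestore"


def objectstore_args(objectstore):
    """Return extra CLI arguments only for a non-default objectstore."""
    if objectstore == DEFAULT_OBJECTSTORE:
        # Older releases do not know --objectstore, so omit it entirely.
        return []
    return ["--objectstore", objectstore]
```

With this shape, a bluestore run produces the same command line as before the option existed, so releases without the flag are unaffected.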
Kefu Chai [Sun, 29 Mar 2026 11:41:24 +0000 (19:41 +0800)]
crimson/osd: fix inaccurate comment about child early-exit in get_early_config
The comment contained a typo ("taged") and vaguely referred to "one of
the parameters" without explaining what actually happens: the child
calls exit(0) for early-exit paths such as --help and --version, and
the parent detects this by checking for a clean exit with no pipe data.
Kefu Chai [Sun, 29 Mar 2026 11:40:46 +0000 (19:40 +0800)]
crimson/osd: remove redundant comments
Remove comments that merely restate what the code already says clearly:
- SeastarOption field comments (option_name, config_key, value_type)
- "Define a list of Seastar options" above seastar_options
- "Function to get the option value as a string" above get_option_value
- "stop()s registered using defer() are called here" in main()
Also fix the trailing space before the semicolon in the value_type
field declaration.
Lumir Sliva [Sat, 28 Mar 2026 23:27:10 +0000 (00:27 +0100)]
doc/dev: fix typos in running-tests-locally.rst
Fix grammar error ('is be tested' -> 'can be tested'), misspellings
of 'bootstrap', 'teuthology', and 'environment', a repeated word
('manually manually'), and a missing article ('maybe bootstrap' ->
'maybe the bootstrap').
Lumir Sliva [Sat, 28 Mar 2026 23:43:27 +0000 (00:43 +0100)]
doc: fix typos and outdated refs across developer guide
Fix 'elipsis' to 'ellipsis' in SubmittingPatches.rst, update
outdated 'master' branch references to 'main' in essentials.rst
and running-tests-locally.rst, fix 'sometime' to 'sometimes' in
merging.rst, and remove duplicated word in teuthology-intro.rst.
Nizamudeen A [Sat, 28 Mar 2026 08:20:44 +0000 (13:50 +0530)]
mgr/dashboard: fix subvolume group corruption from smb share form
The SMB share form accidentally corrupts the subvolume group when it
calls the subvolume_info API with an empty subvol_name. This
corrupts the group entirely, and the following subvolume creation
fails.
The fix is to not call subvol_info with an empty name.
Fixes: https://tracker.ceph.com/issues/75771
Signed-off-by: Nizamudeen A <nia@redhat.com>
WenLei [Fri, 27 Mar 2026 08:40:14 +0000 (16:40 +0800)]
src/arch: fix hwprobe include path and ZBC/ZVBC offsets for riscv64
Signed-off-by: WenLei <lei.wen2@zte.com.cn>
Fix runtime detection of RISC-V ZBC and ZVBC crypto extensions.
Problems fixed:
- <sys/hwprobe.h> only exists in glibc >= 2.40 (released 2024-07-22).
Many production RISC-V distros still use older glibc (Ubuntu 22.04: 2.35,
Debian 12: 2.36, etc.) and would fail to compile.
Therefore we switch to the kernel UAPI header <asm/hwprobe.h>,
which works with all current glibc versions.
Proof:
- Absent in glibc 2.39:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h;hb=refs/tags/glibc-2.39
- Present in glibc 2.40:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/riscv/sys/hwprobe.h;hb=refs/tags/glibc-2.40
- Introducing commit:
https://sourceware.org/git/?p=glibc.git;a=commit;h=426d0e1aa8f17426d13707594111df712d2b8911
- Incorrect fallback bit positions:
- RISCV_HWPROBE_EXT_ZBC was (1ULL << 15) → should be (1ULL << 7)
- RISCV_HWPROBE_EXT_ZVBC was (1ULL << 20) → should be (1ULL << 18)
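
The effect of the wrong fallback offsets can be shown with a small bitmask check. The bit positions are the ones stated above. has_ext and the sample mask values are hypothetical and only model a test against the extension bitmask returned by hwprobe.

```python
# Corrected fallback bit positions, as listed in this commit message.
RISCV_HWPROBE_EXT_ZBC = 1 << 7
RISCV_HWPROBE_EXT_ZVBC = 1 << 18

# Previous, incorrect fallback values, kept here only for comparison.
OLD_ZBC = 1 << 15
OLD_ZVBC = 1 << 20


def has_ext(mask, ext_bit):
    """Return True if ext_bit is set in the extension bitmask."""
    return (mask & ext_bit) != 0
```

A mask that has only bit 7 set is reported as ZBC-capable with the corrected constant, but not with the old one, so the old fallback would have missed the extension.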
Ronen Friedman [Mon, 23 Mar 2026 16:24:20 +0000 (16:24 +0000)]
Crimson/osd/run_bench(): make randomness follow Classic more closely
Direct gen() calls for randomness: Crimson uses dis(gen) % onum and
dis(gen) % (osize / bsize) to pick random object indices and
offsets, which limits the range to 0–255. Classic uses mt19937s
directly, allowing the full 32-bit range of randomness.
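
The range limit can be checked by enumeration. This sketch assumes the distribution yields 8-bit values, as the 0–255 range above implies. reachable_indices is a hypothetical helper, and 16 bits stand in for the wider generator output because enumerating all 32-bit values is impractical.

```python
def reachable_indices(value_bits, onum):
    """Return every index that `value % onum` can produce
    when value is drawn from [0, 2**value_bits)."""
    return {v % onum for v in range(2 ** value_bits)}


# With 8-bit draws and 1000 objects, only the first 256 objects are reachable.
narrow = reachable_indices(8, 1000)

# With a wider draw (16 bits as a stand-in), every object index is reachable.
wide = reachable_indices(16, 1000)
```

Here narrow contains 256 indices with a maximum of 255, while wide covers all 1000 indices.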
rbd: improve mirror image status and validation error messages
When a mirror image is left in a transitional state such as DISABLING,
the current mirror image status command reports:
$ rbd mirror image status test_pool/test_image1
rbd: mirroring not enabled on the image
This is the same message shown when mirroring is disabled or not yet
enabled, which can give the impression that mirroring is already
disabled.
Improve the validation logic and error messages to distinguish between
the DISABLED state and other non-enabled states, and include the image
name and current state in the output.
Examples:
When the image is completely disabled:
$ rbd mirror image status test_pool/test_image1
rbd: mirroring disabled on image 'test_image1'
When the image is in a transitional state (ex: DISABLING):
$ rbd mirror image status test_pool/test_image1
rbd: mirroring not enabled on image 'test_image1' (state: disabling)
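
The message selection shown in the examples can be sketched like this. mirror_status_error is a hypothetical function, not the rbd implementation, and representing states as lowercase strings is an assumption.

```python
def mirror_status_error(image_name, state):
    """Return the error message for a non-enabled mirror state, or None."""
    if state == "enabled":
        return None
    if state == "disabled":
        return f"rbd: mirroring disabled on image '{image_name}'"
    # Any other state (for example a transitional one) is reported explicitly.
    return f"rbd: mirroring not enabled on image '{image_name}' (state: {state})"
```

Calling mirror_status_error("test_image1", "disabling") reproduces the transitional-state message from the example above.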
Adam Kupczyk [Thu, 25 Sep 2025 07:03:12 +0000 (03:03 -0400)]
extblkdev/fcm: Refuse to operate on multimedia lvm block devices
BlueStore selects where data is placed on the device.
Merging 2 FCM devices together means that BlueStore will see free space
on one of the devices, but will not know that the other is full and may
ask to put data there. This causes -ENOSPC while free space is reported.