git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log
ceph.git
4 days agoMerge pull request #68825 from phlogistonjohn/jjm-smb-ctl-tool-fe
John Mulligan [Tue, 9 Jun 2026 18:21:32 +0000 (14:21 -0400)]
Merge pull request #68825 from phlogistonjohn/jjm-smb-ctl-tool-fe

smb: add a smb remote control client tool frontend

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
4 days agosrc/common/options: Increase autoscaler PG target and overload values
Anthony D'Atri [Fri, 25 Oct 2024 19:45:27 +0000 (15:45 -0400)]
src/common/options: Increase autoscaler PG target and overload values

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
4 days agoMerge pull request #65275 from ifed01/wip-ifed-no-buffered-wal
Igor Fedotov [Tue, 9 Jun 2026 15:51:59 +0000 (18:51 +0300)]
Merge pull request #65275 from ifed01/wip-ifed-no-buffered-wal

os/bluestore: do not use buffered IO for BlueFS WAL.

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>
4 days agoMerge pull request #69211 from Matan-B/wip-matanb-seastore-conflict-counters
Matan Breizman [Tue, 9 Jun 2026 13:53:42 +0000 (16:53 +0300)]
Merge pull request #69211 from Matan-B/wip-matanb-seastore-conflict-counters

crimson/os/seastore: separate reset accounting from transaction creation

Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>
4 days agoos/bluestore: prevent reallocation and corruption when shared_blob key is missing... 69085/head
dheart [Tue, 9 Jun 2026 13:27:14 +0000 (21:27 +0800)]
os/bluestore: prevent reallocation and corruption when shared_blob key is missing/undecodable

When the shared_blob key is missing or fails to decode,
it is necessary to scan the blob's pextents directly as the sole authoritative source
to verify allocated blocks and prevent double-allocation.

Signed-off-by: dheart <dheart_joe@163.com>
4 days agoMerge pull request #69233 from tchaikov/wip-rgw-posix-thread-last
Casey Bodley [Tue, 9 Jun 2026 13:16:15 +0000 (09:16 -0400)]
Merge pull request #69233 from tchaikov/wip-rgw-posix-thread-last

rgw/posix: start the Inotify thread last, after the rest is built

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 days agodoc/man: Remove stale EOL release names from deprecation notices 69364/head
Emmanuel Ameh [Tue, 9 Jun 2026 12:40:03 +0000 (13:40 +0100)]
doc/man: Remove stale EOL release names from deprecation notices

ceph.rst: "osd create" deprecation notice cited "the Luminous release"
(2017, EOL 2020). Update to a plain deprecation statement directing
users to the replacement command (osd new).

rbd.rst: cephx_require_signatures option deprecation cited "the Bobtail
release" (2013, EOL 2015) as context for why the option is deprecated.
Remove the EOL release name; retain the deprecation warning. Fix the
companion nocephx_require_signatures notice for consistency ("in a
future release" instead of "in the future").

Fixes: https://tracker.ceph.com/issues/77191
Signed-off-by: Emmanuel Ameh <eameh@contractor.linuxfoundation.org>
4 days agoMerge pull request #69253 from cbodley/wip-76725
Casey Bodley [Tue, 9 Jun 2026 12:24:19 +0000 (08:24 -0400)]
Merge pull request #69253 from cbodley/wip-76725

osdc: deliver neorados completions to associated executor

Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>
4 days agoMerge pull request #69246 from eameh-LF/i77075
eameh-LF [Tue, 9 Jun 2026 12:06:30 +0000 (13:06 +0100)]
Merge pull request #69246 from eameh-LF/i77075

doc/cephadm: fix typo and missing quote in activate-existing-osds

4 days agoMerge pull request #65792 from aclamk/aclamk-bs-onode-stall-fix
Jaya Prakash [Tue, 9 Jun 2026 11:53:16 +0000 (17:23 +0530)]
Merge pull request #65792 from aclamk/aclamk-bs-onode-stall-fix

os/bluestore: Fix problem with onode cache causing stalls

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>
4 days agoMerge pull request #68798 from aclamk/aclamk-bs-fix-stray-spanning-blobs
Jaya Prakash [Tue, 9 Jun 2026 11:52:57 +0000 (17:22 +0530)]
Merge pull request #68798 from aclamk/aclamk-bs-fix-stray-spanning-blobs

os/bluestore: Fix ExtentMap::reshard produce stray spanning blobs

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>
4 days agodoc: Update documentation to reflect new functionality 66726/head
Matty Williams [Mon, 23 Feb 2026 16:32:13 +0000 (16:32 +0000)]
doc: Update documentation to reflect new functionality

https://tracker.ceph.com/issues/74188
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>
4 days agotest: Add integration tests for EC Omap operations and recovery
Matty Williams [Tue, 23 Dec 2025 13:42:37 +0000 (13:42 +0000)]
test: Add integration tests for EC Omap operations and recovery

Assisted-by: Bob
Used for writing tests following the pattern of existing tests.

Fixes: https://tracker.ceph.com/issues/74188
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>
4 days agoosd: Hook up omap operations in EC pools
Matty Williams [Mon, 18 May 2026 09:09:32 +0000 (10:09 +0100)]
osd: Hook up omap operations in EC pools

Add pool flag to determine if omap operations are supported in a pool.
- Currently disabled in EC pools (will later be enabled for Fast EC pools)
Require all osds to have umbrella or later release version to enable pool flag.
Change recovery reads to use journal updates.
Clear the journal for a new epoch.
Set omap_complete accurately before recovery.
Encode omap updates and add entry to journal.
Decode omap updates, apply updates to object store, then remove from journal.
Change omap reads in PrimaryLogPG to use PGBackend functions, including omap updates from journal.

Assisted-by: Bob
Used for debugging and copying patterns (e.g. implementing REPLACE type to match MODIFY).

Fixes: https://tracker.ceph.com/issues/74188
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>
4 days agoosd: Allow for recovery of OMAP header and entries in EC pools
Matty Williams [Tue, 12 May 2026 15:11:17 +0000 (16:11 +0100)]
osd: Allow for recovery of OMAP header and entries in EC pools

Add omap fields to read_request_t, read_result_t, ECSubRead and ECSubReadReply.
Read and write omap header and entries if !omap_complete.
Require omap_complete to finish recovery.

Fixes: https://tracker.ceph.com/issues/74244
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>
4 days agodoc: Write design document to explain the reasoning behind implementing this feature
Matty Williams [Tue, 24 Feb 2026 15:16:28 +0000 (15:16 +0000)]
doc: Write design document to explain the reasoning behind implementing this feature

Assisted-by: Bob
Used to create the first draft of the design document.

https://tracker.ceph.com/issues/74187
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>
4 days agoosd: Introduce functions required for EC OMAP support
Matty Williams [Fri, 12 Dec 2025 11:21:10 +0000 (11:21 +0000)]
osd: Introduce functions required for EC OMAP support

Introduced a "supports_omap" pool flag which is always enabled for Replicated pools and currently always disabled for EC pools.
Introduced wrappers around omap read operations in PGBackend to include updates from the journal in EC pools with optimisations enabled.
Introduced a function for encoding an EC_OMAP operation in the ObjectModDesc::Visitor class and a function for committing an operation in the Trimmer struct.

Signed-off-by: Matty Williams <Matty.Williams@ibm.com>
4 days agoMerge pull request #69033 from kchheda3/fix-76729-notif-eventtime-race
Yuval Lifshitz [Tue, 9 Jun 2026 07:58:15 +0000 (10:58 +0300)]
Merge pull request #69033 from kchheda3/fix-76729-notif-eventtime-race

rgw/notification: fix zero eventTime in bucket notifications on concurrent PUT race

4 days agocrimson/osd: init PerShardState::startup_time per-shard 69329/head
Kefu Chai [Mon, 8 Jun 2026 09:17:26 +0000 (17:17 +0800)]
crimson/osd: init PerShardState::startup_time per-shard

previously, we got the mono_clock::now() in OSD::start() and passed it
to PerShardState. this worked fine. but it was a little bit convoluted
-- we pass the startup_time all the way to PerShardState.

in this change, we just call mono_clock::now() in the constructor
of PerShardState. simpler this way.

the startup_time has two consumers:

- the PGs hosted by the sharded_service use it as a reference for the
  monotonic timestamp
- Heartbeat::send_heartbeats() uses it for the mono_ping_stamp.

strictly speaking, we cannot guarantee that all PerShardState
sharded services share the identical startup timestamp, as they are
constructed on different shards. but this does not matter, as PGs
always use the hosting shard service for the referencing timestamp,
and OSD always uses the shard service on local shard for sending
heartbeats.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
4 days agocrimson/osd: remove OSD::startup_time
Kefu Chai [Mon, 8 Jun 2026 08:54:47 +0000 (16:54 +0800)]
crimson/osd: remove OSD::startup_time

OSD::startup_time was added in 91c0df81 to provide
a monotonically increasing timestamp representing the startup time. but
later, we decided to keep track of this timestamp in PerShardState.
but we didn't remove OSD::startup_time when adding
PerShardState::get_mnow().

in this change, we remove the unused OSD::startup_time.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
4 days agocrimson/osd: coroutinize OSD::start()
Kefu Chai [Mon, 8 Jun 2026 08:24:40 +0000 (16:24 +0800)]
crimson/osd: coroutinize OSD::start()

OSD::start() is a long, deeply nested .then() continuation chain. let's
rewrite it as a coroutine to make it readable. the chain was already
sequential and the one concurrent step keeps its when_all_succeed(), so
the rewrite preserves both ordering and concurrency.

start() runs once at boot, off the i/o path, so the small overhead of
co_await over a hand-rolled continuation chain is a fine price for the
readability.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
4 days agomgr/cephadm: add unit tests 68158/head
Aashish Sharma [Tue, 21 Apr 2026 08:10:54 +0000 (13:40 +0530)]
mgr/cephadm: add unit tests

Signed-off-by: Aashish Sharma <aasharma@redhat.com>
4 days agomgr/cephadm: set default prometheus template in config-key store unless
Aashish Sharma [Wed, 1 Apr 2026 10:10:16 +0000 (15:40 +0530)]
mgr/cephadm: set default prometheus template in config-key store unless
overridden by the user

Fixes: https://tracker.ceph.com/issues/75826
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
4 days agoMerge PR #68413 into main
Venky Shankar [Tue, 9 Jun 2026 01:32:00 +0000 (07:02 +0530)]
Merge PR #68413 into main

* refs/pull/68413/head:
mds: fix shutdown hang when ephemeral pins active and max_mds is 0
mds: fix crash in hash_into_rank_bucket() when max_mds is 0

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
4 days agoMerge pull request #69165 from sunyuechi/wip-addcephtest-catch2-imported-target
Kefu Chai [Mon, 8 Jun 2026 23:37:28 +0000 (07:37 +0800)]
Merge pull request #69165 from sunyuechi/wip-addcephtest-catch2-imported-target

cmake/AddCephTest: use namespaced Catch2 imported targets

Reviewed-by: Jesse F. Williamson <jfw@ibm.com>
5 days agoMerge PR #69337 into main
Patrick Donnelly [Mon, 8 Jun 2026 22:31:53 +0000 (18:31 -0400)]
Merge PR #69337 into main

* refs/pull/69337/head:
doc: governance/csc: update email address

Reviewed-by: Joseph Mundackal <jmundackal@bloomberg.net>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
5 days agodoc: governance/csc: update email address 69337/head
Yehuda Sadeh Weinraub [Mon, 8 Jun 2026 18:38:26 +0000 (11:38 -0700)]
doc: governance/csc: update email address

yehuda@redhat.com -> yehuda@ui.com

Signed-off-by: Yehuda Sadeh Weinraub <yehuda@ui.com>
5 days agoMerge pull request #69176 from Ericmzhang/wip-fix-pg_autoscaler-tests
Ericmzhang [Mon, 8 Jun 2026 19:12:11 +0000 (12:12 -0700)]
Merge pull request #69176 from Ericmzhang/wip-fix-pg_autoscaler-tests

qa: Fix pg autoscaler tests

5 days agoMerge pull request #69315 from sunyuechi/wip-sccache-riscv64
Zack Cerza [Mon, 8 Jun 2026 18:37:07 +0000 (12:37 -0600)]
Merge pull request #69315 from sunyuechi/wip-sccache-riscv64

Dockerfile.build: bump sccache and fetch it on riscv64

5 days agoqa/suites: add faster allocation recovery thrashing suite 68984/head
Jaya Prakash [Mon, 18 May 2026 19:57:50 +0000 (19:57 +0000)]
qa/suites: add faster allocation recovery thrashing suite

Signed-off-by: Jaya Prakash <jayaprakash@ibm.com>
5 days agoqa/workunits: add EC fio workload for allocation recovery testing
Jaya Prakash [Mon, 18 May 2026 19:57:33 +0000 (19:57 +0000)]
qa/workunits: add EC fio workload for allocation recovery testing

Signed-off-by: Jaya Prakash <jayaprakash@ibm.com>
5 days agoos/bluestore: Add printout to CBT's recovery-compare command 64369/head
Adam Kupczyk [Fri, 29 May 2026 11:16:39 +0000 (11:16 +0000)]
os/bluestore: Add printout to CBT's recovery-compare command

1) recovery-compare prints on stdout
2) gracefully rejects comparing when multithreaded recovery is not enabled

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Add bluestore_debug_fast_recovery_compare_chance
Adam Kupczyk [Tue, 19 May 2026 19:36:37 +0000 (19:36 +0000)]
os/bluestore: Add bluestore_debug_fast_recovery_compare_chance

The setting is used for testing purposes only.
It allows forcing a compare if required,
or setting the chance to use in teuthology thrash tests.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Make OnodeScan use just one Blob
Adam Kupczyk [Mon, 7 Jul 2025 10:16:43 +0000 (10:16 +0000)]
os/bluestore: Make OnodeScan use just one Blob

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Tell OnodeScan to skip decoding checksums
Adam Kupczyk [Mon, 7 Jul 2025 10:02:01 +0000 (10:02 +0000)]
os/bluestore: Tell OnodeScan to skip decoding checksums

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Adapt multithread recovery
Adam Kupczyk [Mon, 7 Jul 2025 07:24:42 +0000 (07:24 +0000)]
os/bluestore: Adapt multithread recovery

Adapt multithread recovery to modified ExtentDecoder interface.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Multithreaded allocation recovery
Adam Kupczyk [Thu, 3 Jul 2025 08:04:01 +0000 (08:04 +0000)]
os/bluestore: Multithreaded allocation recovery

Added multithreading processing for allocation recovery.
Added new config "bluestore_allocation_recovery_threads".

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Add "recovery-compare" action to CBT
Adam Kupczyk [Tue, 1 Jul 2025 13:25:38 +0000 (13:25 +0000)]
os/bluestore: Add "recovery-compare" action to CBT

New command compares 2 recovery modes:
 - legacy
 - new multithreaded
The command is hidden - it does not show in help.
Its role is devel & test only.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Add new onode recovery method
Adam Kupczyk [Tue, 1 Jul 2025 13:47:14 +0000 (13:47 +0000)]
os/bluestore: Add new onode recovery method

Added read_allocation_from_onodes_mt function
  (originally copied from read_allocation_from_onodes).
Added Decoder_AllocationsAndStatFS class
  (originally copied from ExtentDecoderPartial).

There are significant differences from originals:
- shared blobs are not scanned at all
- to not account allocations more than once,
  collisions are detected on SimpleBitmap level;
  only the first onode referencing shared blob will mark allocation
- Blobs are not preserved
- instead we remember only if blob or spanning blob was compressed

The underlying goal is to make recovery faster and prepare for
the multithreaded refactor.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Tiny refactor
Adam Kupczyk [Tue, 1 Jul 2025 11:54:01 +0000 (11:54 +0000)]
os/bluestore: Tiny refactor

Moved statfs initialization that is done after onode recovery
from read_allocation_from_onodes()
to   reconstruct_allocations().

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Add set_atomic and clr_atomic to SimpleBitmap
Adam Kupczyk [Tue, 1 Jul 2025 11:48:45 +0000 (11:48 +0000)]
os/bluestore: Add set_atomic and clr_atomic to SimpleBitmap

The functions are analogs of set and clr respectively that allow multithreaded use.
In addition, the return value is a count of set/cleared bits.
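
As a rough illustration of the return-value contract described above, the following Python sketch models a bitmap whose set and clear operations report how many bits they changed. It is not the SimpleBitmap code: the class name, the lock-based synchronization (instead of real atomic instructions), and the reading of "count" as "bits whose value actually changed" are all assumptions made for this example.

```python
import threading


class CountingBitmap:
    """Hypothetical stand-in for a bitmap with thread-safe set/clear."""

    def __init__(self, size):
        self._bits = [False] * size
        # Assumption: a plain lock stands in for atomic word operations.
        self._lock = threading.Lock()

    def set_atomic(self, offset, length):
        """Set bits [offset, offset + length); return how many were newly set."""
        changed = 0
        with self._lock:
            for i in range(offset, offset + length):
                if not self._bits[i]:
                    self._bits[i] = True
                    changed += 1
        return changed

    def clr_atomic(self, offset, length):
        """Clear bits [offset, offset + length); return how many were newly cleared."""
        changed = 0
        with self._lock:
            for i in range(offset, offset + length):
                if self._bits[i]:
                    self._bits[i] = False
                    changed += 1
        return changed
```

Under this reading, a caller that gets back fewer bits than it asked for can tell that part of the range had already been marked by someone else.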

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoos/bluestore: Rework on decoding
Adam Kupczyk [Fri, 4 Jul 2025 16:28:16 +0000 (16:28 +0000)]
os/bluestore: Rework on decoding

Refactored ExtentDecoder.
Introduced decode_create_blob method to it.
Converted bluestore_blob_t::decode and Blob::decode methods into templates.
Created a clear example path for how to specialize these and other decoders.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
5 days agoMerge pull request #69212 from shraddhaag/wip-shraddhaag-enable-debian-crimson-builds
Shraddha Agrawal [Mon, 8 Jun 2026 14:54:47 +0000 (20:24 +0530)]
Merge pull request #69212 from shraddhaag/wip-shraddhaag-enable-debian-crimson-builds

debian: enable crimson packages

5 days agoMerge pull request #66746 from datdenkikniet/prologue-not-epilogue
Kefu Chai [Mon, 8 Jun 2026 14:11:05 +0000 (22:11 +0800)]
Merge pull request #66746 from datdenkikniet/prologue-not-epilogue

msg/async/frames_v2: doc: FRAME_EARLY_DATA_COMPRESSED is used in prologue, not epilogue

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
5 days agoMerge pull request #69188 from sunyuechi/zstd-system-include
Kefu Chai [Mon, 8 Jun 2026 13:34:54 +0000 (21:34 +0800)]
Merge pull request #69188 from sunyuechi/zstd-system-include

compressor/zstd: include <zstd.h> instead of the bundled path

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
5 days agotest: Remove invalid unit test 69284/head
Jon Bailey [Thu, 4 Jun 2026 10:27:07 +0000 (11:27 +0100)]
test: Remove invalid unit test

This test claimed to cover invalid ops; however, with the inclusion of sync reads in EC (https://github.com/ceph/ceph/pull/67079), it is valid to perform class reads in EC. In addition, work was done around illegal ops here: https://github.com/ceph/ceph/pull/66258, and the existence of TEST(ClsHello, BadMethods) in test_cls_hello.cc covers illegal ops in that PR, leading me to think this test is unnecessary. For these reasons, I think it's better that this test is removed, as it is incorrect and also not working.

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
5 days agomds: fix shutdown hang when ephemeral pins active and max_mds is 0 68413/head
chungfengz [Thu, 16 Apr 2026 06:54:16 +0000 (06:54 +0000)]
mds: fix shutdown hang when ephemeral pins active and max_mds is 0

During shutdown, `ceph fs set <fs> down true` sets max_mds to 0 before
the MDS daemons have finished exporting their subtrees.  shutdown_pass()
iterates over auth subtrees and skips any dir whose inode is
ephemerally pinned, expecting handle_export_pins() to re-place them.
However, handle_export_pins() calls hash_into_rank_bucket() which (after
the companion fix) now returns MDS_RANK_NONE when max_mds == 0.  With
no valid target rank the export is never scheduled, so the ephemerally-
pinned dirs are skipped by shutdown_pass() indefinitely and the daemon
loops.

Fixes: https://tracker.ceph.com/issues/76059
Signed-off-by: chungfengz <chungfengz@synology.com>
5 days agomds: fix crash in hash_into_rank_bucket() when max_mds is 0
chungfengz [Thu, 16 Apr 2026 06:53:51 +0000 (06:53 +0000)]
mds: fix crash in hash_into_rank_bucket() when max_mds is 0

When a CephFS cluster is paused (e.g. via `ceph fs set <fs> down true`
or `ceph fs pause`) the MDS map's max_mds is set to 0.  Any subsequent
call to hash_into_rank_bucket() with max_mds == 0 triggers a crash:
the jump-consistent-hash loop never executes (j starts at 0, condition
j < max_mds is immediately false), leaving b = -1, so the final
assert(result >= 0 && result < max_mds) aborts the daemon.
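
The failure mode above can be reproduced with a small Python sketch of the textbook jump consistent hash. This is an illustration only: the function names and the RANK_NONE sentinel are hypothetical stand-ins, and the actual Ceph code is C++ and may differ in detail.

```python
MASK64 = (1 << 64) - 1
RANK_NONE = None  # hypothetical sentinel standing in for MDS_RANK_NONE


def jump_hash(key, num_buckets):
    """Textbook jump consistent hash; returns -1 when num_buckets is 0."""
    b, j = -1, 0
    while j < num_buckets:  # never entered when num_buckets == 0
        b = j
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b


def hash_into_rank(key, max_mds):
    """Guarded wrapper: report 'no rank' instead of asserting on -1."""
    if max_mds <= 0:
        return RANK_NONE
    rank = jump_hash(key, max_mds)
    assert 0 <= rank < max_mds
    return rank
```

With zero buckets the unguarded function leaves b at -1, which is exactly the value the range assertion rejects; the guard returns the sentinel before the hash is ever computed.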

Fixes: https://tracker.ceph.com/issues/76059
Signed-off-by: chungfengz <chungfengz@synology.com>
5 days agoMerge pull request #56634 from neesingh-rh/wip-64064
Venky Shankar [Mon, 8 Jun 2026 09:03:10 +0000 (14:33 +0530)]
Merge pull request #56634 from neesingh-rh/wip-64064

mds: comply with the valid range for `mds_log_max_segments`

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 days agoMerge PR #68793 into main
Venky Shankar [Mon, 8 Jun 2026 08:53:57 +0000 (14:23 +0530)]
Merge PR #68793 into main

* refs/pull/68793/head:
mds: prevent CDir omap commit with empty updates/removals/header

Reviewed-by: Igor Golikov <igolikov@ibm.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
5 days agoMerge pull request #69153 from fultheim/rbm-capacity-enforcement
Matan Breizman [Mon, 8 Jun 2026 08:13:54 +0000 (11:13 +0300)]
Merge pull request #69153 from fultheim/rbm-capacity-enforcement

crimson/os/seastore: enforce capacity in RBMCleaner::try_reserve_projected_usage

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
5 days agomgr/dashboard: carbonize table filters 68990/head
Nizamudeen A [Tue, 19 May 2026 04:40:08 +0000 (10:10 +0530)]
mgr/dashboard: carbonize table filters

Fixes: https://tracker.ceph.com/issues/76687
Signed-off-by: Nizamudeen A <nia@redhat.com>
5 days agoMerge pull request #69248 from xxhdx1985126/wip-seastore-get_child_sync-fix
Matan Breizman [Mon, 8 Jun 2026 07:43:56 +0000 (10:43 +0300)]
Merge pull request #69248 from xxhdx1985126/wip-seastore-get_child_sync-fix

crimson/os/seastore/linked_tree_node: get_child_sync should also get transactional views of the extent

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
5 days agoMerge PR #66492 into main
Venky Shankar [Mon, 8 Jun 2026 07:27:10 +0000 (12:57 +0530)]
Merge PR #66492 into main

* refs/pull/66492/head:
src/pybind/mgr: handle json-pretty for perf stats

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 days agodebian: enable crimson packages 69212/head
Shraddha Agrawal [Mon, 1 Jun 2026 10:58:48 +0000 (16:28 +0530)]
debian: enable crimson packages

This commit enables ceph-osd-crimson and ceph-osd-crimson-dbg
packages for debian builds which have gcc version 13 or above.
This is done as a first step to add noble to supported distros
for crimson.

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
5 days agoMerge pull request #68094 from rhcs-dashboard/cleanup-log
Nizamudeen A [Mon, 8 Jun 2026 05:25:57 +0000 (10:55 +0530)]
Merge pull request #68094 from rhcs-dashboard/cleanup-log

mgr/prometheus: cleanup the smb share processing logs

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
5 days agoMerge pull request #69317 from tchaikov/wip-mgr-dashboard-immutable-cache
Nizamudeen A [Mon, 8 Jun 2026 05:23:20 +0000 (10:53 +0530)]
Merge pull request #69317 from tchaikov/wip-mgr-dashboard-immutable-cache

mgr/dashboard: don't mutate the cached osd_map in CephService

Reviewed-by: Nizamudeen A <nia@redhat.com>
5 days agoMerge pull request #65950 from joscollin/wip-71701-near-full
Venky Shankar [Mon, 8 Jun 2026 04:35:28 +0000 (10:05 +0530)]
Merge pull request #65950 from joscollin/wip-71701-near-full

qa: drop creating huge files in test_cephfs_mirror_cancel_sync

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 days agoMerge pull request #67371 from greenx/main
Kefu Chai [Mon, 8 Jun 2026 01:33:54 +0000 (09:33 +0800)]
Merge pull request #67371 from greenx/main

logrotate: send SIGHUP to ceph-exporter on log rotation

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
6 days agomgr/dashboard: don't mutate the cached osd_map in CephService 69317/head
Kefu Chai [Sun, 7 Jun 2026 08:58:20 +0000 (16:58 +0800)]
mgr/dashboard: don't mutate the cached osd_map in CephService

test_pool_list fails intermittently:

  Traceback (most recent call last):
    File "qa/tasks/mgr/dashboard/test_pool.py", line 182, in test_pool_list
      self.assertNotIn('pg_status', pool)
  AssertionError: 'pg_status' unexpectedly found in
    {'pool': 1, 'pool_name': 'rbd', ..., 'pg_status': {'active+clean': 1}, ...}

mgr.get('osd_map') defaults to mutable=False, so cacheable_get_python()
returns the mgr's shared cached object rather than a copy.
get_pool_list_with_stats() writes pool['pg_status'] and pool['stats']
into those cached dicts, and get_erasure_code_profiles() sets ecp['name']
and rewrites ecp['k']/['m'] to int. The writes outlive the request, so
once a stats=true call has run, GET /api/pool with stats=false still
returns pools carrying pg_status and the assertion above fails. It only
triggers while the cache stays valid between the two requests, hence the
flakiness.

Audited the other dashboard readers of cached mgr.get() keys: these two
are the only sites that mutate the result; the rest only read, and
health.py already copies its osd_map before editing.

Copy the dicts before stamping them; the cache stays clean.
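
The leak and the fix can be sketched in a few lines of Python. This is not the dashboard code: the module-level cache, cached_get(), and both pool_list_* functions are hypothetical, chosen only to show why writing into a shared cached dict outlives the request and why copying first avoids it.

```python
import copy

# Hypothetical shared cache, standing in for the mgr's cached osd_map.
_CACHE = {"pools": [{"pool": 1, "pool_name": "rbd"}]}


def cached_get():
    """Return the shared cached object itself, not a copy."""
    return _CACHE


def pool_list_mutating(stats):
    """Buggy pattern: stamps stats directly into the cached dicts."""
    pools = cached_get()["pools"]
    if stats:
        for pool in pools:
            pool["pg_status"] = {"active+clean": 1}  # leaks into the cache
    return pools


def pool_list_copying(stats):
    """Fixed pattern: copy the dicts first, then stamp the copies."""
    pools = copy.deepcopy(cached_get()["pools"])
    if stats:
        for pool in pools:
            pool["pg_status"] = {"active+clean": 1}
    return pools
```

After one call to pool_list_mutating(stats=True), every later reader of the cache sees pg_status, whereas the copying variant leaves the cache untouched.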

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
7 days agoDockerfile.build: fetch sccache on riscv64 69315/head
Sun Yuechi [Sat, 6 Jun 2026 09:44:57 +0000 (17:44 +0800)]
Dockerfile.build: fetch sccache on riscv64

sccache ships a riscv64 release artifact since v0.13.0, published under the
riscv64gc target triple. Map uname -m "riscv64" to that asset name so the
download resolves on riscv64 instead of being skipped.
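
A minimal sketch of the architecture mapping described above, written in Python for illustration only. The Dockerfile itself may well do this in shell; the table and function names here are hypothetical, and only the riscv64 to riscv64gc entry comes from this commit message.

```python
# Assumption: machine names not listed here are used unchanged.
ARCH_TO_ASSET_ARCH = {
    "riscv64": "riscv64gc",  # from the commit message
}


def sccache_asset_arch(machine):
    """Map a `uname -m` value to the arch part of the release asset name."""
    return ARCH_TO_ASSET_ARCH.get(machine, machine)
```

A caller would pass in the output of uname -m and splice the result into whatever asset name the download step builds.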

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
7 days agoDockerfile.build: bump sccache to v0.15.0
Sun Yuechi [Sat, 6 Jun 2026 09:44:33 +0000 (17:44 +0800)]
Dockerfile.build: bump sccache to v0.15.0

The releases since v0.8.2 add caching for C++20 modules, assembly, and C
preprocessor output, plus broader GCC/MSVC flag handling. They also avoid
double-caching when ccache is on PATH and carry assorted cache-correctness
and storage-backend fixes.

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
7 days agocrimson/os/seastore/lba,btree: better debug logs 69248/head
Xuehan Xu [Wed, 3 Jun 2026 02:55:02 +0000 (10:55 +0800)]
crimson/os/seastore/lba,btree: better debug logs

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
7 days agocrimson/os/seastore/btree: correct the sync search of leaf nodes to do
Xuehan Xu [Wed, 3 Jun 2026 02:09:12 +0000 (10:09 +0800)]
crimson/os/seastore/btree: correct the sync search of leaf nodes to do
lower_bound instead of upper_bound

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
7 days agocrimson/os/seastore/linked_tree_node: get_child_sync should also get
Xuehan Xu [Tue, 2 Jun 2026 15:29:15 +0000 (23:29 +0800)]
crimson/os/seastore/linked_tree_node: get_child_sync should also get
transactional views of the extent

Fixes: https://tracker.ceph.com/issues/76945
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
8 days agocephadm: set Grafana http_addr to 0.0.0.0 when unset 69080/head
Kobi Ginon [Mon, 25 May 2026 12:38:34 +0000 (15:38 +0300)]
cephadm: set Grafana http_addr to 0.0.0.0 when unset
Grafana 11.1+ rejects non-literal http_addr values (e.g. localhost)
in grafana-apiserver. Use 0.0.0.0 by default; stop bracket-wrapping
IPv6 addresses in http_addr.
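
The defaulting rule can be sketched as follows. The helper name is hypothetical and this is not the cephadm code; it only illustrates "use 0.0.0.0 when unset, otherwise pass the address through without brackets".

```python
def grafana_http_addr(configured):
    """Hypothetical helper: pick the http_addr value for the Grafana config."""
    if not configured:
        return "0.0.0.0"  # default when nothing is set
    # Assumption: a configured address, IPv6 included, is passed through
    # as given, with no [ ] wrapping added.
    return configured
```
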
Fixes: https://tracker.ceph.com/issues/75365
Signed-off-by: Kobi Ginon <kginon@redhat.com>
8 days agoMerge pull request #69172 from cbodley/wip-76997
Casey Bodley [Fri, 5 Jun 2026 15:03:50 +0000 (11:03 -0400)]
Merge pull request #69172 from cbodley/wip-76997

qa/rgw: bump tempest version from 34.1.0 to 45.0.0

Reviewed-by: Tobias Urdin <tobias.urdin@binero.com>
8 days agoMerge pull request #68977 from rhcs-dashboard/76652-Convert-add-storage-wizard-to... main_base_6.5.26
Afreen Misbah [Fri, 5 Jun 2026 13:56:42 +0000 (19:26 +0530)]
Merge pull request #68977 from rhcs-dashboard/76652-Convert-add-storage-wizard-to-tearsheet

mgr/dashboard: Converting add storage wizard into tearsheet

Reviewed-by: Afreen Misbah <afreen@ibm.com>
8 days agoMerge PR #69118 into main
Venky Shankar [Fri, 5 Jun 2026 13:19:56 +0000 (18:49 +0530)]
Merge PR #69118 into main

* refs/pull/69118/head:
qa/cephfs: install ceph-mgr-modules-standard for cephfs tests

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
8 days agoMerge pull request #69295 from tchaikov/wip-c-ares
Kefu Chai [Fri, 5 Jun 2026 10:45:08 +0000 (18:45 +0800)]
Merge pull request #69295 from tchaikov/wip-c-ares

ceph.spec.in: only require c-ares >= 1.28 on el10+

Reviewed-by: Kautilya Tripathi <kautilya.tripathi@ibm.com>
8 days agoMerge pull request #69263 from JonBailey1993/ec_direct_reads_docs
Ilya Dryomov [Fri, 5 Jun 2026 08:39:05 +0000 (10:39 +0200)]
Merge pull request #69263 from JonBailey1993/ec_direct_reads_docs

doc: Document erasure-coded pool direct reads for balance flag

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
8 days agoMerge pull request #69040 from rhcs-dashboard/76746-combining-quorum-tables-data...
Afreen Misbah [Fri, 5 Jun 2026 08:23:37 +0000 (13:53 +0530)]
Merge pull request #69040 from rhcs-dashboard/76746-combining-quorum-tables-data-on-monitors-page

mgr/dashboard: Combining Quorum tables data on Monitors page

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
8 days agoMerge pull request #68910 from sseshasa/wip-osd-perf-counters-for-durability-score
Sridhar Seshasayee [Fri, 5 Jun 2026 08:20:14 +0000 (13:50 +0530)]
Merge pull request #68910 from sseshasa/wip-osd-perf-counters-for-durability-score

osd: add last_degraded field to pg_stat_t

Reviewed-by: Radoslaw Zarzynski <rzarzynski@redhat.com>
8 days agoqa/cephadm: fix test_repos.sh for jammy nodes 69299/head
Redouane Kachach [Fri, 5 Jun 2026 08:02:34 +0000 (10:02 +0200)]
qa/cephadm: fix test_repos.sh for jammy nodes

The test uses add-repo --release 17.2.6 to verify version-string repo
handling, but debian-17.2.6 only has focal and bullseye suites and
jammy packages weren't built until 17.2.7. This causes apt-get update
to fail with a 404 on ubuntu_22.04 nodes.

see: https://download.ceph.com/debian-17.2.6/dists/

This fix bumps the version to 17.2.7 which includes a jammy suite.

Fixes: https://tracker.ceph.com/issues/77130
Signed-off-by: Redouane Kachach <rkachach@ibm.com>
8 days agoMerge pull request #67901 from aadhikale/wip-75619_progress_module_gives_value_error_...
Nizamudeen A [Fri, 5 Jun 2026 07:07:30 +0000 (12:37 +0530)]
Merge pull request #67901 from aadhikale/wip-75619_progress_module_gives_value_error_for_metadata

dashboard: use metadata = event.get('refs', {}) instead of dict(event…

Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Naman Munet <nmunet@redhat.com>
8 days agoMerge pull request #69240 from amathuria/wip-amat-crimson-debug-snaptrim-timeout
Aishwarya Mathuria [Fri, 5 Jun 2026 05:58:45 +0000 (11:28 +0530)]
Merge pull request #69240 from amathuria/wip-amat-crimson-debug-snaptrim-timeout

crimson/osd: add debug logs for snaptrim and scrub background_process_lock

8 days agoMerge pull request #68989 from tchaikov/wip-slim-mgr-module
Kefu Chai [Fri, 5 Jun 2026 04:53:45 +0000 (12:53 +0800)]
Merge pull request #68989 from tchaikov/wip-slim-mgr-module

debian,rpm: split ceph-mgr-modules-core into per-module packages

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
8 days agoceph.spec.in: only require c-ares >= 1.28 on el10+ 69295/head
Kefu Chai [Fri, 5 Jun 2026 01:34:56 +0000 (09:34 +0800)]
ceph.spec.in: only require c-ares >= 1.28 on el10+

87e233bb2628784c8c59603e74bc728a8944265e added an unconditional
"Requires: c-ares >= 1.28.0" to ceph-osd-crimson: seastar links
ares_query_dnsrec, which c-ares only grew in 1.28, and the libcares.so.2
SONAME doesn't carry the version so rpm can't infer the floor itself.

But the floor only earns its place where the build links the symbol
against a newer c-ares than the runtime has, and that's an EL thing.
el10's minors cross 1.28 under one $releasever (10.1 ships 1.25, 10.2
ships 1.34), so a builder rolls to 1.34 while a frozen 10.1 node stays on
1.25; without the floor the rpm installs there and the osd then crashes
on the missing symbol. el9 builds the legacy ares_query path and doesn't
need it at all.

Fedora and SUSE don't have the skew: one c-ares per release, built and
run against the same one, so the auto libcares.so.2 dep covers them. So
pin it only on el10+, arch-qualified with %{?_isa}.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
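
The skew described in the message above can be modelled as a small sketch. It is only an illustration of the reasoning, not the spec file's logic: the function names and the simplified dotted-version comparison are assumptions (rpm's real version comparison is more involved), while the 1.28.0 floor and the el10 minor versions come from the commit message.

```python
# Illustrative model of the "floor only on el10+" decision.
# All names here are hypothetical; this is not how rpm resolves dependencies.

def parse_version(text):
    """Turn a dotted version such as '1.28.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))

# Floor taken from the commit message: ares_query_dnsrec appeared in c-ares 1.28.
C_ARES_FLOOR = parse_version("1.28.0")

def needs_floor(distro, major):
    """Per the commit message, only el10 and later pin the floor."""
    return distro == "el" and major >= 10

def install_allowed(distro, major, runtime_c_ares):
    """Return True if the package would install on a node with this c-ares."""
    if not needs_floor(distro, major):
        # Fedora, SUSE and el9 rely on the automatic libcares.so.2 dependency.
        return True
    return parse_version(runtime_c_ares) >= C_ARES_FLOOR
```

With the floor in place, a frozen 10.1 node on c-ares 1.25 is refused at install time rather than crashing later on the missing symbol, while a 10.2 node on 1.34 installs normally.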
8 days agoMerge pull request #69288 from ronen-fr/wip-rf-runnerarm
Ronen Friedman [Fri, 5 Jun 2026 01:22:29 +0000 (04:22 +0300)]
Merge pull request #69288 from ronen-fr/wip-rf-runnerarm

crimson/test: chain invoke_on_all() future instead of calling get()

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
8 days agoMerge pull request #69177 from ronen-fr/wip-rf-scrubstore
Ronen Friedman [Fri, 5 Jun 2026 01:15:50 +0000 (04:15 +0300)]
Merge pull request #69177 from ronen-fr/wip-rf-scrubstore

osd/scrub: clean up inconsistent_obj_wrapper and ScrubStore

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
9 days agotest/neorados: test cross-executor completions 69253/head
Casey Bodley [Thu, 4 Jun 2026 18:19:38 +0000 (14:19 -0400)]
test/neorados: test cross-executor completions

repurpose ceph_test_neorados_completions to verify that completions are
delivered to the handler's associated executor

Signed-off-by: Casey Bodley <cbodley@redhat.com>
9 days agocrimson/test: chain invoke_on_all() future instead of calling get() 69288/head
Ronen Friedman [Thu, 4 Jun 2026 13:05:26 +0000 (13:05 +0000)]
crimson/test: chain invoke_on_all() future instead of calling get()

The reactor start-up code on ARM64 uses invoke_on_all() to
set a configuration option.
Replace smp::invoke_on_all().get() with future chaining. This
avoids waiting on a future from a reactor continuation (outside
of a seastar thread), which throws an exception.

See: https://docs.seastar.io/master/classseastar_1_1future.html#a50bfeff0acccd2f365cce40f9954218c

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
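
The difference between reading a pending future synchronously and chaining a continuation onto it can be sketched with Python's asyncio. This is an analogy only: it is not Seastar code, asyncio's semantics differ from Seastar's, and set_option_on_all() is a hypothetical stand-in for the invoke_on_all() call.

```python
import asyncio

async def set_option_on_all(shards, applied):
    """Hypothetical stand-in for invoke_on_all(): visit every shard."""
    for shard in shards:
        await asyncio.sleep(0)  # yield to the event loop between shards
        applied.append(shard)

async def main():
    applied = []
    task = asyncio.ensure_future(set_option_on_all([0, 1, 2], applied))

    # Synchronous style: asking a pending future for its result from inside
    # the event loop fails, because the work has not run yet.
    try:
        task.result()
        sync_read_ok = True
    except asyncio.InvalidStateError:
        sync_read_ok = False

    # Chained style: register what should happen once the future resolves,
    # then let the loop drive it to completion.
    chained = []
    task.add_done_callback(lambda _: chained.append("configured"))
    await task
    await asyncio.sleep(0)  # give the done-callback a chance to run

    return sync_read_ok, applied, chained
```

Calling asyncio.run(main()) runs both styles: the synchronous read is rejected while the task is still pending, and the chained continuation runs after every shard has been visited.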
9 days agoosd/scrub: clean up inconsistent_obj_wrapper and ScrubStore 69177/head
Ronen Friedman [Fri, 29 May 2026 18:21:51 +0000 (18:21 +0000)]
osd/scrub: clean up inconsistent_obj_wrapper and ScrubStore

Add a default constructor to inconsistent_obj_wrapper, allowing
decode_wrapper() to avoid requiring a dummy hobject_t that gets
immediately overwritten by decode(). Remove the now-unnecessary
hobject_t parameter from merge_encoded_error_wrappers().

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
9 days agoMerge pull request #69148 from ronen-fr/wip-rf-scrubjob
Ronen Friedman [Thu, 4 Jun 2026 18:32:30 +0000 (21:32 +0300)]
Merge pull request #69148 from ronen-fr/wip-rf-scrubjob

osd/scrub: scrub_job cleanup

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
9 days agoqa/rgw: disable neutron service in tempest.conf 69172/head
Casey Bodley [Wed, 3 Jun 2026 17:03:49 +0000 (13:03 -0400)]
qa/rgw: disable neutron service in tempest.conf

Suggested-by: Tobias Urdin <tobias.urdin@binero.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
9 days agoqa/rgw: bump tempest version from 34.1.0 to 45.0.0
Casey Bodley [Fri, 29 May 2026 14:56:10 +0000 (10:56 -0400)]
qa/rgw: bump tempest version from 34.1.0 to 45.0.0

Version 34.1.0 fails to pip install under Python 3.12 when testing
Ubuntu 24.04.

I chose the new version 45.0.0 because it corresponds to this commit:
> Use stable constraint in tox to release new tag for 2025.2

which matches the stable/2025.2 tag we use for keystone.

Fixes: https://tracker.ceph.com/issues/76997
Signed-off-by: Casey Bodley <cbodley@redhat.com>
9 days agoMerge pull request #69054 from tchaikov/wip-cls-rgw-cleanup
Casey Bodley [Thu, 4 Jun 2026 17:01:44 +0000 (13:01 -0400)]
Merge pull request #69054 from tchaikov/wip-cls-rgw-cleanup

cls: remove unused variable

Reviewed-by: Casey Bodley <cbodley@redhat.com>
9 days agoosd: Fix condition for rolling forward pg log entries 68888/head
Matty Williams [Thu, 14 May 2026 12:44:12 +0000 (13:44 +0100)]
osd: Fix condition for rolling forward pg log entries

https://tracker.ceph.com/issues/76577
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>
9 days agoMerge pull request #69270 from sseshasa/wip-fix-ok-to-upgrade-error-msg
Sridhar Seshasayee [Thu, 4 Jun 2026 15:23:26 +0000 (20:53 +0530)]
Merge pull request #69270 from sseshasa/wip-fix-ok-to-upgrade-error-msg

mgr/DaemonServer: clarify ok-to-upgrade error message for CRUSH buckets

Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
9 days agodoc: Document erasure-coded pool direct reads for balance flag 69263/head
Jon Bailey [Wed, 3 Jun 2026 10:42:29 +0000 (11:42 +0100)]
doc: Document erasure-coded pool direct reads for balance flag

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
9 days agoMerge pull request #69171 from cbodley/wip-76996
Casey Bodley [Thu, 4 Jun 2026 14:38:30 +0000 (10:38 -0400)]
Merge pull request #69171 from cbodley/wip-76996

qa/rgw: remove ragweed from multifs subsuite

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
9 days agoosd: add last_degraded field to pg_stat_t 68910/head
Sridhar Seshasayee [Wed, 6 May 2026 15:11:33 +0000 (20:41 +0530)]
osd: add last_degraded field to pg_stat_t

Introduce a 'last_degraded' timestamp to the pg_stat_t structure to track
the initial point of redundancy loss. This field, used in conjunction
with 'last_clean', allows the manager to calculate a cluster-wide
durability score by measuring the duration of vulnerability windows.

Changes:
1) Add last_degraded (utime_t) to pg_stat_t in osd_types.h.
2) Increment pg_stat_t encoding version to 31. The decode logic
   defaults last_degraded to last_clean for backward compatibility
   during rolling upgrades.
3) Update operator==, dump(), and generate_test_instances() to
   support ceph-dencoder testing and JSON output.
4) Implement latching logic in PeeringState::prepare_stats_for_publish():
   - A PG is considered vulnerable if in DEGRADED or UNDERSIZED state.
   - last_degraded is set to 'now' only if it is <= last_clean,
     effectively latching the timestamp to the start of the failure
     event until the PG next becomes clean.
5) Add standalone tests that verify:
   - The last_degraded timestamp latching logic.
   - That the last_degraded timestamp is modified when OSDs are marked 'out' for
     draining purposes, in which case PGs are marked undersized.
6) Add a release note for the new 'last_degraded' field in PG stats.

Fixes: https://tracker.ceph.com/issues/76604
Signed-off-by: Sridhar Seshasayee <sridhar.seshasayee@ibm.com>
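
The latching rule in item 4 and the decode default in item 2 can be sketched as follows. This is a minimal Python model under stated assumptions, not the C++ code in PeeringState::prepare_stats_for_publish(): the class and function names are hypothetical, timestamps are plain floats instead of utime_t, and mark_clean() is an assumed helper standing in for whatever advances last_clean.

```python
from dataclasses import dataclass

@dataclass
class PGStatSketch:
    """Hypothetical stand-in for the two pg_stat_t timestamps involved."""
    last_clean: float
    last_degraded: float

def decode_legacy(last_clean):
    """Model of decoding an older encoding: last_degraded defaults to last_clean."""
    return PGStatSketch(last_clean=last_clean, last_degraded=last_clean)

def publish_stats(stat, degraded, undersized, now):
    """Latch last_degraded at the start of a failure event."""
    vulnerable = degraded or undersized
    # Only stamp 'now' if no failure event is already latched, i.e. the
    # stored timestamp is not newer than the last time the PG was clean.
    if vulnerable and stat.last_degraded <= stat.last_clean:
        stat.last_degraded = now
    return stat

def mark_clean(stat, now):
    """Assumed helper: record that the PG became clean again."""
    stat.last_clean = now
    return stat
```

Repeated publishes during one failure event leave last_degraded at the first timestamp; a new event is latched only after last_clean has moved past it.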
9 days agoMerge pull request #69285 from tchaikov/wip-test-rgw-posix-fix-leak
Casey Bodley [Thu, 4 Jun 2026 14:12:04 +0000 (10:12 -0400)]
Merge pull request #69285 from tchaikov/wip-test-rgw-posix-fix-leak

test/rgw/posix: free the quota handler in TestDriver

Reviewed-by: Nithya Balachandran <nithya.balachandran@ibm.com>
9 days agodoc: add more details about the remote-control sidecar service 68825/head
John Mulligan [Thu, 14 May 2026 14:02:56 +0000 (10:02 -0400)]
doc: add more details about the remote-control sidecar service

Add a section about how to set up and access the remote-control sidecar
service. Update a bit of the existing config docs that was not accurate.
Cover the three approaches to making use of the remote-control service
as a client.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
9 days agopython-common/ceph/smb/ctl: add a small help text improvement
John Mulligan [Wed, 3 Jun 2026 15:27:19 +0000 (11:27 -0400)]
python-common/ceph/smb/ctl: add a small help text improvement

Make it clearer that a remote TCP server should be addressed with
an IP address (or hostname) and port.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
9 days agoqa/suites/orch: enable remote control sidecar for a mgr + resources test
John Mulligan [Mon, 11 May 2026 18:28:36 +0000 (14:28 -0400)]
qa/suites/orch: enable remote control sidecar for a mgr + resources test

Signed-off-by: John Mulligan <jmulligan@redhat.com>
9 days agoqa/workunits/smb: add a test sub-suite for the new ceph-smb-ctl tool
John Mulligan [Mon, 11 May 2026 18:27:31 +0000 (14:27 -0400)]
qa/workunits/smb: add a test sub-suite for the new ceph-smb-ctl tool

Signed-off-by: John Mulligan <jmulligan@redhat.com>
9 days agoceph.spec.in: enable new python-common packaging mode on el10
John Mulligan [Wed, 22 Apr 2026 18:12:52 +0000 (14:12 -0400)]
ceph.spec.in: enable new python-common packaging mode on el10

Enable the new packaging mode for python-common by default on el10-style
distributions.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
9 days agocontainer: include python3-ceph-smb-ctl in ceph image
John Mulligan [Fri, 8 May 2026 18:01:36 +0000 (14:01 -0400)]
container: include python3-ceph-smb-ctl in ceph image

The python3-ceph-smb-ctl package provides the ceph-smb-ctl CLI tool (and
requires needed deps) and is a weak dependency of python3-ceph-common.
However, since the container disables weak dependencies by default we
need to explicitly list it if we want it in the container image. Which
we do.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
9 days agoMerge pull request #69259 from knrt10/fix-ares-depedency
Kautilya Tripathi [Thu, 4 Jun 2026 11:05:09 +0000 (16:35 +0530)]
Merge pull request #69259 from knrt10/fix-ares-depedency

ceph.spec.in: require c-ares >= 1.28 for ceph-osd-crimson