git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log

John Mulligan [Tue, 9 Jun 2026 18:21:32 +0000 (14:21 -0400)]

Merge pull request #68825 from phlogistonjohn/jjm-smb-ctl-tool-fe

smb: add a smb remote control client tool frontend

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>

Anthony D'Atri [Fri, 25 Oct 2024 19:45:27 +0000 (15:45 -0400)]

src/common/options: Increase autoscaler PG target and overload values

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>

Igor Fedotov [Tue, 9 Jun 2026 15:51:59 +0000 (18:51 +0300)]

Merge pull request #65275 from ifed01/wip-ifed-no-buffered-wal

os/bluestore: do not use buffered IO for BlueFS WAL.

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>

Matan Breizman [Tue, 9 Jun 2026 13:53:42 +0000 (16:53 +0300)]

Merge pull request #69211 from Matan-B/wip-matanb-seastore-conflict-counters

crimson/os/seastore: separate reset accounting from transaction creation

Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>

dheart [Tue, 9 Jun 2026 13:27:14 +0000 (21:27 +0800)]

os/bluestore: prevent reallocation and corruption when shared_blob key is missing/undecodable

When the shared_blob key is missing or fails to decode,
it is necessary to scan the blob's pextents directly as the sole authoritative source
to verify allocated blocks and prevent double-allocation.

Signed-off-by: dheart <dheart_joe@163.com>

Casey Bodley [Tue, 9 Jun 2026 13:16:15 +0000 (09:16 -0400)]

Merge pull request #69233 from tchaikov/wip-rgw-posix-thread-last

rgw/posix: start the Inotify thread last, after the rest is built

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Emmanuel Ameh [Tue, 9 Jun 2026 12:40:03 +0000 (13:40 +0100)]

doc/man: Remove stale EOL release names from deprecation notices

ceph.rst: "osd create" deprecation notice cited "the Luminous release"
(2017, EOL 2020). Update to a plain deprecation statement directing
users to the replacement command (osd new).

rbd.rst: cephx_require_signatures option deprecation cited "the Bobtail
release" (2013, EOL 2015) as context for why the option is deprecated.
Remove the EOL release name; retain the deprecation warning. Fix the
companion nocephx_require_signatures notice for consistency ("in a
future release" instead of "in the future").

Fixes: https://tracker.ceph.com/issues/77191
Signed-off-by: Emmanuel Ameh <eameh@contractor.linuxfoundation.org>

Casey Bodley [Tue, 9 Jun 2026 12:24:19 +0000 (08:24 -0400)]

Merge pull request #69253 from cbodley/wip-76725

osdc: deliver neorados completions to associated executor

Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>

eameh-LF [Tue, 9 Jun 2026 12:06:30 +0000 (13:06 +0100)]

Merge pull request #69246 from eameh-LF/i77075

doc/cephadm: fix typo and missing quote in activate-existing-osds

Jaya Prakash [Tue, 9 Jun 2026 11:53:16 +0000 (17:23 +0530)]

Merge pull request #65792 from aclamk/aclamk-bs-onode-stall-fix

os/bluestore: Fix problem with onode cache causing stalls

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>

Jaya Prakash [Tue, 9 Jun 2026 11:52:57 +0000 (17:22 +0530)]

Merge pull request #68798 from aclamk/aclamk-bs-fix-stray-spanning-blobs

os/bluestore: Fix ExtentMap::reshard producing stray spanning blobs

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>

Matty Williams [Mon, 23 Feb 2026 16:32:13 +0000 (16:32 +0000)]

doc: Update documentation to reflect new functionality

https://tracker.ceph.com/issues/74188
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>

Matty Williams [Tue, 23 Dec 2025 13:42:37 +0000 (13:42 +0000)]

test: Add integration tests for EC Omap operations and recovery

Assisted-by: Bob
Used for writing tests following the pattern of existing tests.

Fixes: https://tracker.ceph.com/issues/74188
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>

Matty Williams [Mon, 18 May 2026 09:09:32 +0000 (10:09 +0100)]

osd: Hook up omap operations in EC pools

Add pool flag to determine if omap operations are supported in a pool.
- Currently disabled in EC pools (will later be enabled for Fast EC pools)
Require all osds to have umbrella or later release version to enable pool flag.
Change recovery reads to use journal updates.
Clear the journal for a new epoch.
Set omap_complete accurately before recovery.
Encode omap updates and add entry to journal.
Decode omap updates, apply updates to object store, then remove from journal.
Change omap reads in PrimaryLogPG to use PGBackend functions, including omap updates from journal.

Assisted-by: Bob
Used for debugging and copying patterns (e.g. implementing REPLACE type to match MODIFY).

Fixes: https://tracker.ceph.com/issues/74188
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>

Matty Williams [Tue, 12 May 2026 15:11:17 +0000 (16:11 +0100)]

osd: Allow for recovery of OMAP header and entries in EC pools

Add omap fields to read_request_t, read_result_t, ECSubRead and ECSubReadReply.
Read and write omap header and entries if !omap_complete.
Require omap_complete to finish recovery.

Fixes: https://tracker.ceph.com/issues/74244
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>

Matty Williams [Tue, 24 Feb 2026 15:16:28 +0000 (15:16 +0000)]

doc: Write design document to explain the reasoning behind implementing this feature

Assisted-by: Bob
Used to create the first draft of the design document.

https://tracker.ceph.com/issues/74187
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>

Matty Williams [Fri, 12 Dec 2025 11:21:10 +0000 (11:21 +0000)]

osd: Introduce functions required for EC OMAP support

Introduced a "supports_omap" pool flag which is always enabled for Replicated pools and currently always disabled for EC pools.
Introduced wrappers around omap read operations in PGBackend to include updates from the journal in EC pools with optimisations enabled.
Introduced a function for encoding an EC_OMAP operation in the ObjectModDesc::Visitor class and a function for committing an operation in the Trimmer struct.

Signed-off-by: Matty Williams <Matty.Williams@ibm.com>

Yuval Lifshitz [Tue, 9 Jun 2026 07:58:15 +0000 (10:58 +0300)]

Merge pull request #69033 from kchheda3/fix-76729-notif-eventtime-race

rgw/notification: fix zero eventTime in bucket notifications on concurrent PUT race

Kefu Chai [Mon, 8 Jun 2026 09:17:26 +0000 (17:17 +0800)]

crimson/osd: init PerShardState::startup_time per-shard

previously, we got the mono_clock::now() in OSD::start() and passed it
to PerShardState. this worked fine. but it was a little bit convoluted
-- we pass the startup_time all the way to PerShardState.

in this change, we just call mono_clock::now() in the constructor
of PerShardState. simpler this way.

the startup_time has two consumers:

- the PGs hosted by the sharded_service use it as a reference for the
monotonic timestamp
- Heartbeat::send_heartbeats() uses it for the mono_ping_stamp.

strictly speaking, we cannot guarantee that all PerShardState
sharded services share the identical startup timestamp, as they are
constructed on different shards. but this does not matter, as PGs
always use the hosting shard service for the referencing timestamp,
and OSD always uses the shard service on local shard for sending
heartbeats.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

Kefu Chai [Mon, 8 Jun 2026 08:54:47 +0000 (16:54 +0800)]

crimson/osd: remove OSD::startup_time

OSD::startup_time was added in 91c0df81 to provide
a monotonically increasing timestamp representing the startup time. but
later, we decided to keep track of this timestamp in PerShardState.
but we didn't remove OSD::startup_time when adding
PerShardState::get_mnow().

in this change, we remove the unused OSD::startup_time.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

Kefu Chai [Mon, 8 Jun 2026 08:24:40 +0000 (16:24 +0800)]

crimson/osd: coroutinize OSD::start()

OSD::start() is a long, deeply nested .then() continuation chain. let's
rewrite it as a coroutine to make it readable. the chain was already
sequential and the one concurrent step keeps its when_all_succeed(), so
the rewrite preserves both ordering and concurrency.

start() runs once at boot, off the i/o path, so the small overhead of
co_await over a hand-rolled continuation chain is a fine price for the
readability.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

Aashish Sharma [Tue, 21 Apr 2026 08:10:54 +0000 (13:40 +0530)]

mgr/cephadm: add unit tests

Signed-off-by: Aashish Sharma <aasharma@redhat.com>

Aashish Sharma [Wed, 1 Apr 2026 10:10:16 +0000 (15:40 +0530)]

mgr/cephadm: set default prometheus template in config-key store unless
overridden by the user

Fixes: https://tracker.ceph.com/issues/75826
Signed-off-by: Aashish Sharma <aasharma@redhat.com>

Venky Shankar [Tue, 9 Jun 2026 01:32:00 +0000 (07:02 +0530)]

Merge PR #68413 into main

* refs/pull/68413/head:
mds: fix shutdown hang when ephemeral pins active and max_mds is 0
mds: fix crash in hash_into_rank_bucket() when max_mds is 0

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>

Kefu Chai [Mon, 8 Jun 2026 23:37:28 +0000 (07:37 +0800)]

Merge pull request #69165 from sunyuechi/wip-addcephtest-catch2-imported-target

cmake/AddCephTest: use namespaced Catch2 imported targets

Reviewed-by: Jesse F. Williamson <jfw@ibm.com>

Patrick Donnelly [Mon, 8 Jun 2026 22:31:53 +0000 (18:31 -0400)]

Merge PR #69337 into main

* refs/pull/69337/head:
doc: governance/csc: update email address

Reviewed-by: Joseph Mundackal <jmundackal@bloomberg.net>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>

Yehuda Sadeh Weinraub [Mon, 8 Jun 2026 18:38:26 +0000 (11:38 -0700)]

doc: governance/csc: update email address

yehuda@redhat.com -> yehuda@ui.com

Signed-off-by: Yehuda Sadeh Weinraub <yehuda@ui.com>

Ericmzhang [Mon, 8 Jun 2026 19:12:11 +0000 (12:12 -0700)]

Merge pull request #69176 from Ericmzhang/wip-fix-pg_autoscaler-tests

qa: Fix pg autoscaler tests

Zack Cerza [Mon, 8 Jun 2026 18:37:07 +0000 (12:37 -0600)]

Merge pull request #69315 from sunyuechi/wip-sccache-riscv64

Dockerfile.build: bump sccache and fetch it on riscv64

Jaya Prakash [Mon, 18 May 2026 19:57:50 +0000 (19:57 +0000)]

qa/suites: add faster allocation recovery thrashing suite

Signed-off-by: Jaya Prakash <jayaprakash@ibm.com>

Jaya Prakash [Mon, 18 May 2026 19:57:33 +0000 (19:57 +0000)]

qa/workunits: add EC fio workload for allocation recovery testing

Signed-off-by: Jaya Prakash <jayaprakash@ibm.com>

Adam Kupczyk [Fri, 29 May 2026 11:16:39 +0000 (11:16 +0000)]

os/bluestore: Add printout to CBT's recovery-compare command

1) recovery-compare prints on stdout
2) gracefully rejects comparing when multithreaded recovery is not enabled

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Adam Kupczyk [Tue, 19 May 2026 19:36:37 +0000 (19:36 +0000)]

os/bluestore: Add bluestore_debug_fast_recovery_compare_chance

The setting is used for testing purposes only.
It allows forcing a compare if required,
or setting the chance to use in teuthology thrash tests.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Adam Kupczyk [Mon, 7 Jul 2025 10:16:43 +0000 (10:16 +0000)]

os/bluestore: Make OnodeScan use just one Blob

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Adam Kupczyk [Mon, 7 Jul 2025 10:02:01 +0000 (10:02 +0000)]

os/bluestore: Tell OnodeScan to skip decoding checksums

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Adam Kupczyk [Mon, 7 Jul 2025 07:24:42 +0000 (07:24 +0000)]

os/bluestore: Adapt multithread recovery

Adapt multithread recovery to modified ExtentDecoder interface.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Adam Kupczyk [Thu, 3 Jul 2025 08:04:01 +0000 (08:04 +0000)]

os/bluestore: Multithreaded allocation recovery

Added multithreading processing for allocation recovery.
Added new config "bluestore_allocation_recovery_threads".

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Adam Kupczyk [Tue, 1 Jul 2025 13:25:38 +0000 (13:25 +0000)]

os/bluestore: Add "recovery-compare" action to CBT

New command compares 2 recovery modes:
- legacy
- new multithreaded
The command is hidden - it does not show in help.
Its role is devel & test only.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Adam Kupczyk [Tue, 1 Jul 2025 13:47:14 +0000 (13:47 +0000)]

os/bluestore: Add new onode recovery method

Added read_allocation_from_onodes_mt function
  (originally copied from read_allocation_from_onodes).
Added Decoder_AllocationsAndStatFS class
  (originally copied from ExtentDecoderPartial).

There are significant differences from originals:
- shared blobs are not scanned at all
- to not account allocations more than once,
  collisions are detected on SimpleBitmap level;
  only the first onode referencing shared blob will mark allocation
- Blobs are not preserved
- instead we remember only if blob or spanning blob was compressed

The underlying goal is to make recovery faster and prepare for the
multithread refactor.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Adam Kupczyk [Tue, 1 Jul 2025 11:54:01 +0000 (11:54 +0000)]

os/bluestore: Tiny refactor

Moved statfs initialization that is done after onode recovery
from read_allocation_from_onodes()
to reconstruct_allocations().

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Adam Kupczyk [Tue, 1 Jul 2025 11:48:45 +0000 (11:48 +0000)]

os/bluestore: Add set_atomic and clr_atomic to SimpleBitmap

The functions are analogs of set and clr respectively that allow multithreaded use.
In addition, the return value is a count of set/cleared bits.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
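
A minimal sketch of why a set/clear call that returns the number of changed bits is useful for collision detection, as described in the commit above. This is not BlueStore's SimpleBitmap: the class name CountingBitmap and the methods set_range/clr_range are hypothetical, and a lock stands in for the atomic operations a C++ implementation would use.

```python
import threading


class CountingBitmap:
    """Toy bitmap whose set/clear calls report how many bits changed.

    Hypothetical illustration only; a lock replaces real atomics.
    """

    def __init__(self, nbits: int) -> None:
        self._nbits = nbits
        self._bits = 0  # a Python int used as an arbitrary-width bit field
        self._lock = threading.Lock()

    def _mask(self, offset: int, length: int) -> int:
        if offset < 0 or length < 0 or offset + length > self._nbits:
            raise ValueError("range outside bitmap")
        return ((1 << length) - 1) << offset

    def set_range(self, offset: int, length: int) -> int:
        """Set bits [offset, offset+length); return how many were newly set."""
        mask = self._mask(offset, length)
        with self._lock:
            newly_set = mask & ~self._bits
            self._bits |= mask
        return bin(newly_set).count("1")

    def clr_range(self, offset: int, length: int) -> int:
        """Clear bits [offset, offset+length); return how many were set before."""
        mask = self._mask(offset, length)
        with self._lock:
            cleared = mask & self._bits
            self._bits &= ~mask
        return bin(cleared).count("1")
```

If a caller asks to set 4 bits and the call reports fewer than 4, some of that range was already marked by another caller, so the overlap can be detected without a separate lookup.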

Adam Kupczyk [Fri, 4 Jul 2025 16:28:16 +0000 (16:28 +0000)]

os/bluestore: Rework on decoding

Refactored ExtentDecoder.
Introduced decode_create_blob method to it.
Converted bluestore_blob_t::decode and Blob::decode methods into templates.
Created clear example path how to specialize these and other decoders.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Shraddha Agrawal [Mon, 8 Jun 2026 14:54:47 +0000 (20:24 +0530)]

Merge pull request #69212 from shraddhaag/wip-shraddhaag-enable-debian-crimson-builds

debian: enable crimson packages

Kefu Chai [Mon, 8 Jun 2026 14:11:05 +0000 (22:11 +0800)]

Merge pull request #66746 from datdenkikniet/prologue-not-epilogue

msg/async/frames_v2: doc: FRAME_EARLY_DATA_COMPRESSED is used in prologue, not epilogue

Reviewed-by: Kefu Chai <k.chai@proxmox.com>

Kefu Chai [Mon, 8 Jun 2026 13:34:54 +0000 (21:34 +0800)]

Merge pull request #69188 from sunyuechi/zstd-system-include

compressor/zstd: include <zstd.h> instead of the bundled path

Reviewed-by: Kefu Chai <k.chai@proxmox.com>

Jon Bailey [Thu, 4 Jun 2026 10:27:07 +0000 (11:27 +0100)]

test: Remove invalid unit test

This test claimed to exercise invalid ops; however, with the inclusion of sync reads in EC (https://github.com/ceph/ceph/pull/67079), it is valid to perform class reads in EC. In addition, work was done around illegal ops in https://github.com/ceph/ceph/pull/66258, and the existence of TEST(ClsHello, BadMethods) in test_cls_hello.cc covers illegal ops in that PR, leading me to think this test is unnecessary. For these reasons, I think it is better to remove this test, as it is incorrect and also not working.

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>

chungfengz [Thu, 16 Apr 2026 06:54:16 +0000 (06:54 +0000)]

mds: fix shutdown hang when ephemeral pins active and max_mds is 0

During shutdown, `ceph fs set <fs> down true` sets max_mds to 0 before
the MDS daemons have finished exporting their subtrees. shutdown_pass()
iterates over auth subtrees and skips any dir whose inode is
ephemerally pinned, expecting handle_export_pins() to re-place them.
However, handle_export_pins() calls hash_into_rank_bucket() which (after
the companion fix) now returns MDS_RANK_NONE when max_mds == 0. With
no valid target rank the export is never scheduled, so the ephemerally-
pinned dirs are skipped by shutdown_pass() indefinitely and the daemon
loops.

Fixes: https://tracker.ceph.com/issues/76059
Signed-off-by: chungfengz <chungfengz@synology.com>

chungfengz [Thu, 16 Apr 2026 06:53:51 +0000 (06:53 +0000)]

mds: fix crash in hash_into_rank_bucket() when max_mds is 0

When a CephFS cluster is paused (e.g. via `ceph fs set <fs> down true`
or `ceph fs pause`) the MDS map's max_mds is set to 0. Any subsequent
call to hash_into_rank_bucket() with max_mds == 0 triggers a crash:
the jump-consistent-hash loop never executes (j starts at 0, condition
j < max_mds is immediately false), leaving b = -1, so the final
assert(result >= 0 && result < max_mds) aborts the daemon.

Fixes: https://tracker.ceph.com/issues/76059
Signed-off-by: chungfengz <chungfengz@synology.com>
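
The failure mode described above can be reproduced with the generic jump consistent hash algorithm. The sketch below is an assumption-laden illustration, not the MDS code: the function names are hypothetical, and returning None stands in for the MDS_RANK_NONE result mentioned in the companion commit.

```python
from typing import Optional

MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Generic jump consistent hash; returns -1 when num_buckets is 0."""
    b, j = -1, 0
    while j < num_buckets:  # never entered when num_buckets == 0
        b = j
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def hash_into_bucket(key: int, num_buckets: int) -> Optional[int]:
    """Guarded wrapper: report "no bucket" instead of asserting on -1."""
    if num_buckets <= 0:
        return None  # stand-in for a "no valid rank" sentinel
    result = jump_consistent_hash(key, num_buckets)
    assert 0 <= result < num_buckets
    return result
```

With zero buckets the loop body never runs, so the unguarded function returns its initial value of -1; the range assertion would then fail, which is why the guard has to come before it.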

Venky Shankar [Mon, 8 Jun 2026 09:03:10 +0000 (14:33 +0530)]

Merge pull request #56634 from neesingh-rh/wip-64064

mds: comply with the valid range for `mds_log_max_segments`

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Venky Shankar [Mon, 8 Jun 2026 08:53:57 +0000 (14:23 +0530)]

Merge PR #68793 into main

* refs/pull/68793/head:
mds: prevent CDir omap commit with empty updates/removals/header

Reviewed-by: Igor Golikov <igolikov@ibm.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>

Matan Breizman [Mon, 8 Jun 2026 08:13:54 +0000 (11:13 +0300)]

Merge pull request #69153 from fultheim/rbm-capacity-enforcement

crimson/os/seastore: enforce capacity in RBMCleaner::try_reserve_projected_usage

Reviewed-by: Matan Breizman <mbreizma@redhat.com>

Nizamudeen A [Tue, 19 May 2026 04:40:08 +0000 (10:10 +0530)]

mgr/dashboard: carbonize table filters

Fixes: https://tracker.ceph.com/issues/76687
Signed-off-by: Nizamudeen A <nia@redhat.com>

Matan Breizman [Mon, 8 Jun 2026 07:43:56 +0000 (10:43 +0300)]

Merge pull request #69248 from xxhdx1985126/wip-seastore-get_child_sync-fix

crimson/os/seastore/linked_tree_node: get_child_sync should also get transactional views of the extent

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>

Venky Shankar [Mon, 8 Jun 2026 07:27:10 +0000 (12:57 +0530)]

Merge PR #66492 into main

* refs/pull/66492/head:
src/pybind/mgr: handle json-pretty for perf stats

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>

Shraddha Agrawal [Mon, 1 Jun 2026 10:58:48 +0000 (16:28 +0530)]

debian: enable crimson packages

This commit enables ceph-osd-crimson and ceph-osd-crimson-dbg
packages for debian builds which have gcc version 13 or above.
This is done as a first step to add noble to supported distros
for crimson.

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>

Nizamudeen A [Mon, 8 Jun 2026 05:25:57 +0000 (10:55 +0530)]

Merge pull request #68094 from rhcs-dashboard/cleanup-log

mgr/prometheus: cleanup the smb share processing logs

Reviewed-by: Avan Thakkar <athakkar@redhat.com>

Nizamudeen A [Mon, 8 Jun 2026 05:23:20 +0000 (10:53 +0530)]

Merge pull request #69317 from tchaikov/wip-mgr-dashboard-immutable-cache

mgr/dashboard: don't mutate the cached osd_map in CephService

Reviewed-by: Nizamudeen A <nia@redhat.com>

Venky Shankar [Mon, 8 Jun 2026 04:35:28 +0000 (10:05 +0530)]

Merge pull request #65950 from joscollin/wip-71701-near-full

qa: drop creating huge files in test_cephfs_mirror_cancel_sync

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Kefu Chai [Mon, 8 Jun 2026 01:33:54 +0000 (09:33 +0800)]

Merge pull request #67371 from greenx/main

logrotate: send SIGHUP to ceph-exporter on log rotation

Reviewed-by: Kefu Chai <k.chai@proxmox.com>

Kefu Chai [Sun, 7 Jun 2026 08:58:20 +0000 (16:58 +0800)]

mgr/dashboard: don't mutate the cached osd_map in CephService

test_pool_list fails intermittently:

  Traceback (most recent call last):
    File "qa/tasks/mgr/dashboard/test_pool.py", line 182, in test_pool_list
      self.assertNotIn('pg_status', pool)
  AssertionError: 'pg_status' unexpectedly found in
    {'pool': 1, 'pool_name': 'rbd', ..., 'pg_status': {'active+clean': 1}, ...}

mgr.get('osd_map') defaults to mutable=False, so cacheable_get_python()
returns the mgr's shared cached object rather than a copy.
get_pool_list_with_stats() writes pool['pg_status'] and pool['stats']
into those cached dicts, and get_erasure_code_profiles() sets ecp['name']
and rewrites ecp['k']/['m'] to int. The writes outlive the request, so
once a stats=true call has run, GET /api/pool with stats=false still
returns pools carrying pg_status and the assertion above fails. It only
triggers while the cache stays valid between the two requests, hence the
flakiness.

Audited the other dashboard readers of cached mgr.get() keys: these two
are the only sites that mutate the result; the rest only read, and
health.py already copies its osd_map before editing.

Copy the dicts before stamping them; the cache stays clean.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
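
The bug pattern in the commit above can be shown in a few lines. This is a sketch under stated assumptions, not the dashboard code: FakeMgr, pool_list_mutating and pool_list_copying are hypothetical names, and the cached structure is reduced to a single pool.

```python
class FakeMgr:
    """Stand-in for a manager whose get() returns a shared cached object."""

    def __init__(self) -> None:
        self._cache = {"pools": [{"pool": 1, "pool_name": "rbd"}]}

    def get(self, key: str) -> dict:
        assert key == "osd_map"
        return self._cache  # same object on every call, not a copy


def pool_list_mutating(mgr: FakeMgr, stats: bool) -> list:
    """Buggy variant: writes the extra key into the cached dicts."""
    pools = mgr.get("osd_map")["pools"]
    if stats:
        for pool in pools:
            pool["pg_status"] = {"active+clean": 1}
    return pools


def pool_list_copying(mgr: FakeMgr, stats: bool) -> list:
    """Fixed variant: copies each dict before stamping it."""
    pools = [dict(pool) for pool in mgr.get("osd_map")["pools"]]
    if stats:
        for pool in pools:
            pool["pg_status"] = {"active+clean": 1}
    return pools
```

A shallow copy per pool is enough in this sketch because only top-level keys are added; code that edits nested values would need copy.deepcopy instead.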

Sun Yuechi [Sat, 6 Jun 2026 09:44:57 +0000 (17:44 +0800)]

Dockerfile.build: fetch sccache on riscv64

sccache ships a riscv64 release artifact since v0.13.0, published under the
riscv64gc target triple. Map uname -m "riscv64" to that asset name so the
download resolves on riscv64 instead of being skipped.

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>

Sun Yuechi [Sat, 6 Jun 2026 09:44:33 +0000 (17:44 +0800)]

Dockerfile.build: bump sccache to v0.15.0

The releases since v0.8.2 add caching for C++20 modules, assembly, and C
preprocessor output, plus broader GCC/MSVC flag handling. They also avoid
double-caching when ccache is on PATH and carry assorted cache-correctness
and storage-backend fixes.

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>

Xuehan Xu [Wed, 3 Jun 2026 02:55:02 +0000 (10:55 +0800)]

crimson/os/seastore/lba,btree: better debug logs

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>

Xuehan Xu [Wed, 3 Jun 2026 02:09:12 +0000 (10:09 +0800)]

crimson/os/seastore/btree: correct the sync search of leaf nodes to do
lower_bound instead of upper_bound

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>

Xuehan Xu [Tue, 2 Jun 2026 15:29:15 +0000 (23:29 +0800)]

crimson/os/seastore/linked_tree_node: get_child_sync should also get
transactional views of the extent

Fixes: https://tracker.ceph.com/issues/76945
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>

Kobi Ginon [Mon, 25 May 2026 12:38:34 +0000 (15:38 +0300)]

cephadm: set Grafana http_addr to 0.0.0.0 when unset

Grafana 11.1+ rejects non-literal http_addr values (e.g. localhost)
in grafana-apiserver. Use 0.0.0.0 by default; stop bracket-wrapping
IPv6 addresses in http_addr.

Fixes: https://tracker.ceph.com/issues/75365
Signed-off-by: Kobi Ginon <kginon@redhat.com>
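
A rough sketch of the defaulting rule described in this commit, assuming a helper that picks the value written to http_addr. The function names are hypothetical and this is not the cephadm implementation; is_ip_literal is only there to show why a name such as localhost does not qualify as a literal address.

```python
import ipaddress
from typing import Optional


def is_ip_literal(value: str) -> bool:
    """True if value parses as a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def grafana_http_addr(configured: Optional[str]) -> str:
    """Pick the http_addr value: default to 0.0.0.0 when nothing is set.

    Configured addresses are passed through unchanged, so an IPv6
    literal is not wrapped in brackets.
    """
    if not configured:
        return "0.0.0.0"
    return configured
```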

Casey Bodley [Fri, 5 Jun 2026 15:03:50 +0000 (11:03 -0400)]

Merge pull request #69172 from cbodley/wip-76997

qa/rgw: bump tempest version from 34.1.0 to 45.0.0

Reviewed-by: Tobias Urdin <tobias.urdin@binero.com>

Afreen Misbah [Fri, 5 Jun 2026 13:56:42 +0000 (19:26 +0530)]

Merge pull request #68977 from rhcs-dashboard/76652-Convert-add-storage-wizard-to-tearsheet

mgr/dashboard: Converting add storage wizard into tearsheet

Reviewed-by: Afreen Misbah <afreen@ibm.com>

Venky Shankar [Fri, 5 Jun 2026 13:19:56 +0000 (18:49 +0530)]

Merge PR #69118 into main

* refs/pull/69118/head:
qa/cephfs: install ceph-mgr-modules-standard for cephfs tests

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>

Kefu Chai [Fri, 5 Jun 2026 10:45:08 +0000 (18:45 +0800)]

Merge pull request #69295 from tchaikov/wip-c-ares

ceph.spec.in: only require c-ares >= 1.28 on el10+

Reviewed-by: Kautilya Tripathi <kautilya.tripathi@ibm.com>

Ilya Dryomov [Fri, 5 Jun 2026 08:39:05 +0000 (10:39 +0200)]

Merge pull request #69263 from JonBailey1993/ec_direct_reads_docs

doc: Document erasure-coded pool direct reads for balance flag

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

Afreen Misbah [Fri, 5 Jun 2026 08:23:37 +0000 (13:53 +0530)]

Merge pull request #69040 from rhcs-dashboard/76746-combining-quorum-tables-data-on-monitors-page

mgr/dashboard: Combining Quorum tables data on Monitors page

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>

Sridhar Seshasayee [Fri, 5 Jun 2026 08:20:14 +0000 (13:50 +0530)]

Merge pull request #68910 from sseshasa/wip-osd-perf-counters-for-durability-score

osd: add last_degraded field to pg_stat_t

Reviewed-by: Radoslaw Zarzynski <rzarzynski@redhat.com>

Redouane Kachach [Fri, 5 Jun 2026 08:02:34 +0000 (10:02 +0200)]

qa/cephadm: fix test_repos.sh for jammy nodes

The test uses add-repo --release 17.2.6 to verify version-string repo
handling, but debian-17.2.6 only has focal and bullseye suites and
jammy packages weren't built until 17.2.7. This causes apt-get update
to fail with a 404 on ubuntu_22.04 nodes.

see: https://download.ceph.com/debian-17.2.6/dists/

This fix bumps the version to 17.2.7 which includes a jammy suite.

Fixes: https://tracker.ceph.com/issues/77130
Signed-off-by: Redouane Kachach <rkachach@ibm.com>

Nizamudeen A [Fri, 5 Jun 2026 07:07:30 +0000 (12:37 +0530)]

Merge pull request #67901 from aadhikale/wip-75619_progress_module_gives_value_error_for_metadata

dashboard: use metadata = event.get('refs', {}) instead of dict(event…

Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Naman Munet <nmunet@redhat.com>

Aishwarya Mathuria [Fri, 5 Jun 2026 05:58:45 +0000 (11:28 +0530)]

Merge pull request #69240 from amathuria/wip-amat-crimson-debug-snaptrim-timeout

crimson/osd: add debug logs for snaptrim and scrub background_process_lock

Kefu Chai [Fri, 5 Jun 2026 04:53:45 +0000 (12:53 +0800)]

Merge pull request #68989 from tchaikov/wip-slim-mgr-module

debian,rpm: split ceph-mgr-modules-core into per-module packages

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>

Kefu Chai [Fri, 5 Jun 2026 01:34:56 +0000 (09:34 +0800)]

ceph.spec.in: only require c-ares >= 1.28 on el10+

87e233bb2628784c8c59603e74bc728a8944265e added an unconditional
"Requires: c-ares >= 1.28.0" to ceph-osd-crimson: seastar links
ares_query_dnsrec, which c-ares only grew in 1.28, and the libcares.so.2
SONAME doesn't carry the version so rpm can't infer the floor itself.

But the floor only earns its place where the build links the symbol
against a newer c-ares than the runtime has, and that's an EL thing.
el10's minors cross 1.28 under one $releasever (10.1 ships 1.25, 10.2
ships 1.34), so a builder rolls to 1.34 while a frozen 10.1 node stays on
1.25; without the floor the rpm installs there and the osd then crashes
on the missing symbol. el9 builds the legacy ares_query path and doesn't
need it at all.

Fedora and SUSE don't have the skew: one c-ares per release, built and
run against the same one, so the auto libcares.so.2 dep covers them. So
pin it only on el10+, arch-qualified with %{?_isa}.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

Ronen Friedman [Fri, 5 Jun 2026 01:22:29 +0000 (04:22 +0300)]

Merge pull request #69288 from ronen-fr/wip-rf-runnerarm

crimson/test: chain invoke_on_all() future instead of calling get()

Reviewed-by: Matan Breizman <mbreizma@redhat.com>

Ronen Friedman [Fri, 5 Jun 2026 01:15:50 +0000 (04:15 +0300)]

Merge pull request #69177 from ronen-fr/wip-rf-scrubstore

osd/scrub: clean up inconsistent_obj_wrapper and ScrubStore

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Casey Bodley [Thu, 4 Jun 2026 18:19:38 +0000 (14:19 -0400)]

test/neorados: test cross-executor completions

repurpose ceph_test_neorados_completions to verify that completions are
delivered to the handler's associated executor

Signed-off-by: Casey Bodley <cbodley@redhat.com>

Ronen Friedman [Thu, 4 Jun 2026 13:05:26 +0000 (13:05 +0000)]

crimson/test: chain invoke_on_all() future instead of calling get()

The reactor start-up code on ARM64 uses invoke_on_all() to
set a configuration option.
Replace smp::invoke_on_all().get() with future chaining. This
avoids waiting on a future from a reactor continuation (outside
of a seastar thread), which throws an exception.

See: https://docs.seastar.io/master/classseastar_1_1future.html#a50bfeff0acccd2f365cce40f9954218c

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>

Ronen Friedman [Fri, 29 May 2026 18:21:51 +0000 (18:21 +0000)]

osd/scrub: clean up inconsistent_obj_wrapper and ScrubStore

Add a default constructor to inconsistent_obj_wrapper, allowing
decode_wrapper() to avoid requiring a dummy hobject_t that gets
immediately overwritten by decode(). Remove the now-unnecessary
hobject_t parameter from merge_encoded_error_wrappers().

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>

Ronen Friedman [Thu, 4 Jun 2026 18:32:30 +0000 (21:32 +0300)]

Merge pull request #69148 from ronen-fr/wip-rf-scrubjob

osd/scrub: scrub_job cleanup

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Casey Bodley [Wed, 3 Jun 2026 17:03:49 +0000 (13:03 -0400)]

qa/rgw: disable neutron service in tempest.conf

Suggested-by: Tobias Urdin <tobias.urdin@binero.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>

Casey Bodley [Fri, 29 May 2026 14:56:10 +0000 (10:56 -0400)]

qa/rgw: bump tempest version from 34.1.0 to 45.0.0

this 34.1.0 version fails to pip install under python 3.12 when testing
ubuntu 24.04

i chose new version 45.0.0 because it corresponds to this commit:
> Use stable constraint in tox to release new tag for 2025.2

which matches the stable/2025.2 tag we use for keystone

Fixes: https://tracker.ceph.com/issues/76997
Signed-off-by: Casey Bodley <cbodley@redhat.com>

commit | commitdiff | tree

Casey Bodley [Thu, 4 Jun 2026 17:01:44 +0000 (13:01 -0400)]

Merge pull request #69054 from tchaikov/wip-cls-rgw-cleanup

cls: remove unused variable

Reviewed-by: Casey Bodley <cbodley@redhat.com>

commit | commitdiff | tree

Matty Williams [Thu, 14 May 2026 12:44:12 +0000 (13:44 +0100)]

osd: Fix condition for rolling forward pg log entries

https://tracker.ceph.com/issues/76577
Signed-off-by: Matty Williams <Matty.Williams@ibm.com>

commit | commitdiff | tree

Sridhar Seshasayee [Thu, 4 Jun 2026 15:23:26 +0000 (20:53 +0530)]

Merge pull request #69270 from sseshasa/wip-fix-ok-to-upgrade-error-msg

mgr/DaemonServer: clarify ok-to-upgrade error message for CRUSH buckets

Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>

commit | commitdiff | tree

Jon Bailey [Wed, 3 Jun 2026 10:42:29 +0000 (11:42 +0100)]

doc: Document erasure-coded pool direct reads for balance flag

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>

commit | commitdiff | tree

Casey Bodley [Thu, 4 Jun 2026 14:38:30 +0000 (10:38 -0400)]

Merge pull request #69171 from cbodley/wip-76996

qa/rgw: remove ragweed from multifs subsuite

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

commit | commitdiff | tree

Sridhar Seshasayee [Wed, 6 May 2026 15:11:33 +0000 (20:41 +0530)]

osd: add last_degraded field to pg_stat_t

Introduce a 'last_degraded' timestamp to the pg_stat_t structure to track
the initial point of redundancy loss. This field, used in conjunction
with 'last_clean', allows the manager to calculate a cluster-wide
durability score by measuring the duration of vulnerability windows.

Changes:
1) Add last_degraded (utime_t) to pg_stat_t in osd_types.h.
2) Increment pg_stat_t encoding version to 31. The decode logic
   defaults last_degraded to last_clean for backward compatibility
   during rolling upgrades.
3) Update operator==, dump(), and generate_test_instances() to
   support ceph-dencoder testing and JSON output.
4) Implement latching logic in PeeringState::prepare_stats_for_publish():
   - A PG is considered vulnerable if in DEGRADED or UNDERSIZED state.
   - last_degraded is set to 'now' only if it is <= last_clean,
     effectively latching the timestamp to the start of the failure
     event until the PG next becomes clean.
5) Add standalone tests to verify:
   - The last_degraded timestamp latching logic.
   - That the last_degraded timestamp is modified when OSDs are marked 'out'
     for draining purposes, in which case PGs are marked undersized.
6) Add a release note for the new 'last_degraded' field in PG stats.
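
The decode default in item 2 and the latching rule in item 4 can be sketched
as follows. This is a minimal Python illustration, not the C++ code from this
commit: PgStat, decode_last_degraded(), update_last_degraded(), and
mark_clean() are hypothetical names, and timestamps are plain floats rather
than utime_t.

```python
from dataclasses import dataclass

# Encoding version that introduces last_degraded, per the commit message.
VERSION_WITH_LAST_DEGRADED = 31


@dataclass
class PgStat:
    """Hypothetical stand-in for the two pg_stat_t timestamps involved."""
    last_clean: float = 0.0
    last_degraded: float = 0.0


def decode_last_degraded(struct_v, last_clean, encoded=None):
    """Older encodings carry no last_degraded, so fall back to last_clean."""
    if struct_v >= VERSION_WITH_LAST_DEGRADED and encoded is not None:
        return encoded
    return last_clean


def update_last_degraded(stat, degraded, undersized, now):
    """Latch last_degraded at the start of a vulnerability window.

    The timestamp only moves when it is not newer than last_clean, so
    repeated calls during one failure event keep the first value.
    """
    vulnerable = degraded or undersized
    if vulnerable and stat.last_degraded <= stat.last_clean:
        stat.last_degraded = now


def mark_clean(stat, now):
    """Record that the PG became clean (assumed helper for the sketch)."""
    stat.last_clean = now
```

With these assumptions, a PG decoded from a pre-31 encoding starts with
last_degraded equal to last_clean, the first degraded or undersized report
moves last_degraded to 'now', and later reports leave it alone until the PG
has been clean again.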

Fixes: https://tracker.ceph.com/issues/76604
Signed-off-by: Sridhar Seshasayee <sridhar.seshasayee@ibm.com>

commit | commitdiff | tree

Casey Bodley [Thu, 4 Jun 2026 14:12:04 +0000 (10:12 -0400)]

Merge pull request #69285 from tchaikov/wip-test-rgw-posix-fix-leak

test/rgw/posix: free the quota handler in TestDriver

Reviewed-by: Nithya Balachandran <nithya.balachandran@ibm.com>

commit | commitdiff | tree

John Mulligan [Thu, 14 May 2026 14:02:56 +0000 (10:02 -0400)]

doc: add more details about the remote-control sidecar service

Add a section about how to set up and access the remote-control sidecar
service. Correct a part of the existing config docs that was inaccurate.
Cover the three approaches to making use of the remote-control service
as a client.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Wed, 3 Jun 2026 15:27:19 +0000 (11:27 -0400)]

python-common/ceph/smb/ctl: add a small help text improvement

Make it clearer that a remote TCP server should be addressed with an
IP address (or hostname) and a port.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Mon, 11 May 2026 18:28:36 +0000 (14:28 -0400)]

qa/suites/orch: enable remote control sidecar for a mgr + resources test

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Mon, 11 May 2026 18:27:31 +0000 (14:27 -0400)]

qa/workunits/smb: add a test sub-suite for the new ceph-smb-ctl tool

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Wed, 22 Apr 2026 18:12:52 +0000 (14:12 -0400)]

ceph.spec.in: enable new python-common packaging mode on el10

Enable the new packaging mode for python-common by default on el10-style
distributions.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Fri, 8 May 2026 18:01:36 +0000 (14:01 -0400)]

container: include python3-ceph-smb-ctl in ceph image

The python3-ceph-smb-ctl package provides the ceph-smb-ctl CLI tool (and
requires its needed deps) and is a weak dependency of python3-ceph-common.
However, since the container disables weak dependencies by default, we
need to explicitly list it if we want it in the container image. Which
we do.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

Kautilya Tripathi [Thu, 4 Jun 2026 11:05:09 +0000 (16:35 +0530)]

Merge pull request #69259 from knrt10/fix-ares-depedency

ceph.spec.in: require c-ares >= 1.28 for ceph-osd-crimson
