git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 days ago common/options, os/bluestore: add debug option to force bluefs files onto slow device 68430/head
Jaya Prakash [Thu, 7 May 2026 12:09:07 +0000 (12:09 +0000)]
common/options, os/bluestore: add debug option to force bluefs files onto slow device

Fixes: https://tracker.ceph.com/issues/74319
Signed-off-by: Jaya Prakash <jayaprakash@ibm.com>
13 days ago os/bluestore: start/stop BlueFS spillover cleaner on config change
Jaya Prakash [Mon, 16 Mar 2026 19:22:49 +0000 (19:22 +0000)]
os/bluestore: start/stop BlueFS spillover cleaner on config change

Fixes: https://tracker.ceph.com/issues/74319
Signed-off-by: Jaya Prakash <jayaprakash@ibm.com>
(cherry picked from commit dc768b782d54cc6a5dee29a9c4f358e8b9183aa6)

13 days ago os/bluestore: migrated files in 128MB chunks
Jaya Prakash [Fri, 15 May 2026 17:07:32 +0000 (17:07 +0000)]
os/bluestore: migrated files in 128MB chunks

Signed-off-by: Jaya Prakash <jayaprakash@ibm.com>
13 days ago os/bluestore: Spillover Cleaner Thread implementation in BlueFS
Jaya Prakash [Thu, 16 Apr 2026 15:30:28 +0000 (15:30 +0000)]
os/bluestore: Spillover Cleaner Thread implementation in BlueFS

Fixes: https://tracker.ceph.com/issues/74319
Signed-off-by: Jaya Prakash <jayaprakash@ibm.com>
13 days ago common/options: add bluefs_spillover_cleaner option
Jaya Prakash [Mon, 16 Mar 2026 19:23:05 +0000 (19:23 +0000)]
common/options: add bluefs_spillover_cleaner option

Fixes: https://tracker.ceph.com/issues/74319
Signed-off-by: Jaya Prakash <jayaprakash@ibm.com>
13 days ago crimson/os/seastore: enforce capacity in RBMCleaner::try_reserve_projected_usage 69153/head
Shai Fultheim [Sun, 24 May 2026 11:19:56 +0000 (14:19 +0300)]
crimson/os/seastore: enforce capacity in RBMCleaner::try_reserve_projected_usage

RBMCleaner::try_reserve_projected_usage always returned true and just
incremented stats.projected_used_bytes. The EPM BackgroundProcess
relies on the return value to block IO when the device is full, so
this effectively disabled backpressure for the RANDOM_BLOCK_SSD
backend: concurrent transactions could each reserve unbounded amounts,
and the over-commit surfaced downstream as `unexpected enospc` asserts
in the data path (object_data_handler.cc and friends, where ENOSPC is
treated as crimson::ct_error::enospc::assert_failure because the
existing infrastructure assumes ENOSPC is impossible). The OSD aborted
under sustained random-write workloads that exceeded RBM capacity.

Compute the device's data capacity as total - journal, subtract a 5%
headroom (for metadata writes and fragmentation slack the AVL allocator
cannot pack into), and reject reservations that would push
used + projected over the line. The existing EPM blocking-IO path
(extent_placement_manager.cc:726) already queues the IO until
release_projected_usage wakes it, so no caller-side changes are needed.
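
The reservation rule described above can be sketched as follows. This is a minimal Python illustration, not the Crimson C++ code: the class `ReservationTracker`, its fields, and the sizes in the usage lines are hypothetical. Only the arithmetic (capacity is total minus journal, minus 5% headroom, and a reservation is rejected when used + projected would exceed it) comes from the commit message.

```python
from dataclasses import dataclass

HEADROOM_PERCENT = 5  # 5% headroom, as stated in the commit message


@dataclass
class ReservationTracker:
    """Hypothetical stand-in for the cleaner's space accounting."""
    total_bytes: int
    journal_bytes: int
    used_bytes: int = 0
    projected_used_bytes: int = 0

    def usable_bytes(self) -> int:
        # Data capacity is total minus journal, then minus the headroom.
        data_capacity = self.total_bytes - self.journal_bytes
        return data_capacity - data_capacity * HEADROOM_PERCENT // 100

    def try_reserve_projected_usage(self, nbytes: int) -> bool:
        # Reject a reservation that would push used + projected over the line.
        wanted = self.used_bytes + self.projected_used_bytes + nbytes
        if wanted > self.usable_bytes():
            return False
        self.projected_used_bytes += nbytes
        return True

    def release_projected_usage(self, nbytes: int) -> None:
        # Releasing a reservation frees room for callers that were refused.
        self.projected_used_bytes -= nbytes


GiB = 1 << 30
tracker = ReservationTracker(total_bytes=90 * GiB, journal_bytes=3 * GiB)
first = tracker.try_reserve_projected_usage(80 * GiB)   # fits under the limit
second = tracker.try_reserve_projected_usage(10 * GiB)  # would exceed it
```

In this sketch a refused caller simply gets `False`; in the commit, the existing EPM path is what queues the IO until space is released.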

This is the minimal fix to keep the OSD alive under sustained random
writes. It converts a crash into a stall: once the device fills and
the cleaner has nothing to free (RBMCleaner::clean_space is still a
TODO), new writes block indefinitely instead of crashing. Verified
against an 8-job 1MB random-write fio (--size 63g, 90GB RBM, 3GB
journal): 68 GB user-written, host WAF 1.696, OSD survives, watchdog
kills fio after slow-ops timeout. Without this patch the same workload
asserts in the data path.

The headroom is intentionally generous (5%) because there is no GC
yet; once RBMCleaner::clean_space() exists, the headroom can shrink.

Fixes: https://tracker.ceph.com/issues/75598
Signed-off-by: Shai Fultheim <shai.fultheim@gmail.com>
13 days ago rgw: SSE-KMS: Handle Testing Key Per Object 61256/head
Marcel Lauhoff [Tue, 5 May 2026 12:21:03 +0000 (14:21 +0200)]
rgw: SSE-KMS: Handle Testing Key Per Object

The testing backend uses a 'keysel' attribute to derive a per-object
key from the KEK in the config. A single key_id with distinct keysel
values yields different keys, which need to be cached as such.

Add the keysel to the cache key id to handle these collisions.

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago rgw: KMS Cache Shutdown: Reaper first
Marcel Lauhoff [Tue, 10 Mar 2026 10:51:30 +0000 (11:51 +0100)]
rgw: KMS Cache Shutdown: Reaper first

1. Don't delete the KMS cache before draining/joining the frontend
coroutine threads. They may still depend on the KMS cache.
2. Stop the TTL reaper early to get it off the coroutine pool.

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago rgw: KMS Cache: Reset Reaper State in Async+Threaded
Marcel Lauhoff [Fri, 6 Mar 2026 09:44:08 +0000 (10:44 +0100)]
rgw: KMS Cache: Reset Reaper State in Async+Threaded

Reset reaper state to monostate in the async and threaded case.
Fixes a possible use after free in the async reaper case.

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago common/keyring: Fix reset error checking
Marcel Lauhoff [Tue, 3 Mar 2026 20:25:05 +0000 (21:25 +0100)]
common/keyring: Fix reset error checking

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago common/web_cache: Fix _sieve_hand dangling pointer
Marcel Lauhoff [Tue, 3 Mar 2026 20:24:57 +0000 (21:24 +0100)]
common/web_cache: Fix _sieve_hand dangling pointer

sieve_expire_erase_unmutexed did not update the sieve hand passed as
advertised. Make it return the updated hand and use that to
update the global _sieve_hand in expire_erase.

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago common/web_cache: delete perfcounters on destruction
Marcel Lauhoff [Tue, 3 Mar 2026 20:24:26 +0000 (21:24 +0100)]
common/web_cache: delete perfcounters on destruction

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago rgw: SSE-KMS: Fix wrong cache key in lookup_or() call
Marcel Lauhoff [Tue, 3 Mar 2026 20:24:34 +0000 (21:24 +0100)]
rgw: SSE-KMS: Fix wrong cache key in lookup_or() call

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago rgw: SSE-KMS: Handle Vault Transit Key Per Object
Marcel Lauhoff [Tue, 3 Mar 2026 20:24:10 +0000 (21:24 +0100)]
rgw: SSE-KMS: Handle Vault Transit Key Per Object

KMS backends Barbican, Vault KV, and KMIP have a static key per
key_id. However, with Vault Transit, each object has a unique DEK
wrapped by the transit key.

Keying the cache with key_id in Transit mode results in only the first
DEK being cached for all subsequent objects.

Fix this by appending a hash of the wrapped DEK to the cache key.
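
The fix can be pictured with a small sketch. It is illustrative Python rather than the RGW code: the function name `make_cache_key`, the choice of SHA-256, and the `:` separator are all assumptions, since the message only says that a hash of the wrapped DEK is appended to the cache key.

```python
import hashlib
from typing import Optional


def make_cache_key(key_id: str, wrapped_dek: Optional[bytes] = None) -> str:
    """Build a cache key; append a digest of the wrapped DEK when one is given."""
    if wrapped_dek is None:
        # Static-key backends: the key_id alone identifies the secret.
        return key_id
    # Transit-style backends: each object has its own wrapped DEK, so the
    # digest keeps entries for different objects apart.
    digest = hashlib.sha256(wrapped_dek).hexdigest()
    return f"{key_id}:{digest}"


key_a = make_cache_key("my-transit-key", b"wrapped-dek-for-object-a")
key_b = make_cache_key("my-transit-key", b"wrapped-dek-for-object-b")
```

With the same `key_id` but different wrapped DEKs, `key_a` and `key_b` differ, so the first cached DEK is no longer returned for every later object.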

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago rgw: Fix typos in perf counter descriptions
Marcel Lauhoff [Tue, 3 Mar 2026 20:24:48 +0000 (21:24 +0100)]
rgw: Fix typos in perf counter descriptions

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago PendingReleaseNotes: Add SSE-KMS Cache
Marcel Lauhoff [Fri, 19 Dec 2025 09:20:13 +0000 (10:20 +0100)]
PendingReleaseNotes: Add SSE-KMS Cache

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago rgw: SSE-KMS Secrets Cache
Marcel Lauhoff [Thu, 19 Dec 2024 14:41:30 +0000 (15:41 +0100)]
rgw: SSE-KMS Secrets Cache

Add SSE Key Management System secrets cache to RGW.

It is common to have secrets shared by many if not all objects in a
bucket. Without RGW-side caching every PUT/GET will cause a request to
an external KMS. This not only adds load to the KMS, but also slows
down reads and writes.

Combine WebCache, ceph::async::call_once and LinuxKeyringSecret into
KMSCache. WebCache stores async::once_result to wrap results of a KMS
secret fetch to mitigate cache stampedes (concurrent cache requests to
the same key coalesce into one). The retrieved secrets are stored in
the Linux kernel key retention service (LinuxKeyringSecret) for safe
keeping and retrieval by subsequent requests. KMSCache adds a TTL reaper
and life cycle.

Cache values and error handling: The cache stores positive
fetch results, permanent errors (e.g. key does not exist) and
transient errors (e.g. fetch timeout), each with a different TTL.
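
The per-outcome TTL idea can be sketched like this. It is a simplified Python illustration, not KMSCache itself: the class `TtlResultCache`, the `Outcome` enum, and the TTL values are hypothetical (the message does not state the real defaults), and the sketch leaves out the keyring storage and request coalescing described above.

```python
import time
from enum import Enum
from typing import Callable, Dict, Optional, Tuple


class Outcome(Enum):
    OK = "ok"
    PERMANENT_ERROR = "permanent"
    TRANSIENT_ERROR = "transient"


# Hypothetical TTLs in seconds, one per kind of cached result.
TTL_SECONDS = {
    Outcome.OK: 300.0,
    Outcome.PERMANENT_ERROR: 60.0,
    Outcome.TRANSIENT_ERROR: 5.0,
}


class TtlResultCache:
    """Cache fetch results and errors, each expiring after its own TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[Outcome, Optional[bytes], float]] = {}

    def put(self, key: str, outcome: Outcome, value: Optional[bytes] = None) -> None:
        expires_at = self._clock() + TTL_SECONDS[outcome]
        self._entries[key] = (outcome, value, expires_at)

    def get(self, key: str) -> Optional[Tuple[Outcome, Optional[bytes]]]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        outcome, value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on lookup in this sketch.
            del self._entries[key]
            return None
        return outcome, value
```

Caching a transient error only briefly lets a retry happen soon, while a successful fetch can be reused for much longer.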

Unit tests to cover cached / uncached KMS retrieve and runtime cache
disable via config.

Add perf counter `kms_fetch_lat` to track KMS fetch request latency
and error counters to track permanent, transient and key store
errors.

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
Fixes: https://tracker.ceph.com/issues/68524
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago common: Refactor LinuxKeyringSecret into Keyring Interface
Marcel Lauhoff [Thu, 18 Dec 2025 18:42:07 +0000 (19:42 +0100)]
common: Refactor LinuxKeyringSecret into Keyring Interface

Goal: Support multiple backends and faking / mocking for testing.

Add abstract classes Keyring (factory) and KeyringSecret. Add
"Unsupported" implementation for non-Linux platforms. Add a get_best
factory function that currently returns the LinuxKeyring impl on Linux
or Unsupported elsewhere.

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago doc: Document RGW KMS Cache
Marcel Lauhoff [Fri, 27 Jun 2025 10:22:06 +0000 (12:22 +0200)]
doc: Document RGW KMS Cache

Add caching section to the RGW Encryption docs. Add cache
settings to the RGW configuration reference.

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago rgw: Early Linux process keyring initialization
Marcel Lauhoff [Fri, 13 Jun 2025 14:45:41 +0000 (16:45 +0200)]
rgw: Early Linux process keyring initialization

To allow RGW threads to share possession over process keyring keys, the
keyring must be created before a child thread adds keys.

Since we only use the process keyring for KMS cache secrets, only
initialize the keyring if it is enabled on startup.

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago common: Add Linux Keyring Secret Store Wrapper
Marcel Lauhoff [Fri, 25 Apr 2025 14:27:57 +0000 (16:27 +0200)]
common: Add Linux Keyring Secret Store Wrapper

Add RAII wrapper around the Linux Key Retention Service
add_key(2), keyctl_read(3), keyctl_invalidate(3)

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago test: Add Secrets Store µBenchmarks
Marcel Lauhoff [Thu, 20 Mar 2025 16:27:11 +0000 (17:27 +0100)]
test: Add Secrets Store µBenchmarks

Benchmark:
- Linux Kernel Key Retention Service (kernel keystore) [0]
- memfd_secret(2)
- plain memory

Tests:
- Random reads
- (keystore) Write, Read, Remove

[0] https://docs.kernel.org/security/keys/core.html

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago test: Add Cache benchmarks
Marcel Lauhoff [Fri, 21 Feb 2025 11:42:07 +0000 (12:42 +0100)]
test: Add Cache benchmarks

Add Google benchmark [0] based micro benchmarks for Cache/LRU
implementations in the Ceph code base.

[0] https://github.com/google/benchmark

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago common: Add WebCache
Marcel Lauhoff [Tue, 11 Feb 2025 12:48:22 +0000 (13:48 +0100)]
common: Add WebCache

A cache data structure for values that need to be retrieved from
outside systems (e.g. Key Management Systems).

Features:
- Thread safe, optimized for concurrent lookups and cache hits
- Entry TTL expiration
- Cache replacement strategy tuned to "web" workloads (SIEVE)
- Performance Counters on hit, miss, expire, size, capacity, clears
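
For readers unfamiliar with SIEVE, the replacement strategy named in the feature list, here is a toy Python sketch of the general algorithm. It is not the WebCache implementation: the class `SieveCache` is hypothetical, it is not thread safe, it has no TTL or counters, and it uses a plain list where a real implementation would likely use a linked list.

```python
from typing import Dict, Hashable, List, Optional


class SieveCache:
    """Toy SIEVE cache; a list keeps keys ordered from oldest to newest."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._values: Dict[Hashable, object] = {}
        self._visited: Dict[Hashable, bool] = {}
        self._order: List[Hashable] = []  # index 0 is the oldest entry
        self._hand = 0  # index of the next eviction candidate

    def __contains__(self, key: Hashable) -> bool:
        return key in self._values

    def get(self, key: Hashable) -> Optional[object]:
        if key not in self._values:
            return None
        self._visited[key] = True  # a hit only sets a flag; nothing is moved
        return self._values[key]

    def put(self, key: Hashable, value: object) -> None:
        if key in self._values:
            self._values[key] = value
            self._visited[key] = True
            return
        if len(self._order) >= self._capacity:
            self._evict()
        self._order.append(key)  # new entries join at the newest end
        self._values[key] = value
        self._visited[key] = False

    def _evict(self) -> None:
        # Sweep from the hand toward newer entries, clearing visited flags,
        # and evict the first entry that was not visited since the last sweep.
        while self._visited[self._order[self._hand]]:
            self._visited[self._order[self._hand]] = False
            self._hand = (self._hand + 1) % len(self._order)
        victim = self._order.pop(self._hand)
        del self._values[victim]
        del self._visited[victim]
        if self._hand >= len(self._order):
            self._hand = 0  # wrap back to the oldest entry
```

Because a hit only flips a flag instead of reordering entries, lookups do little shared-state work, which fits the "optimized for concurrent lookups and cache hits" goal listed above.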

Signed-off-by: Marcel Lauhoff <marcel.lauhoff@clyso.com>
On-behalf-of: SAP marcel.lauhoff@sap.com

13 days ago common/async: add call_once() algorithm for optional_yield
Casey Bodley [Mon, 24 Mar 2025 16:51:15 +0000 (12:51 -0400)]
common/async: add call_once() algorithm for optional_yield

modeled after std::call_once() to guarantee that racing callers wait for
the initial caller to finish. the main differences here are

* support for coroutine callers to suspend instead of blocking while
  waiting for the initial caller, and
* the wrapped function must return a value, which is cached and returned
  to all callers
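
The value-caching behaviour can be illustrated with a thread-only Python sketch. It is an analogy, not the ceph::async API: the class `OnceResult` and its method are hypothetical, and coroutine callers that suspend instead of blocking are not modelled here.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class OnceResult(Generic[T]):
    """Run a function at most once; every caller gets the cached result."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._done = False
        self._value: Optional[T] = None

    def call_once(self, fn: Callable[[], T]) -> T:
        # Racing callers block on the lock until the first caller has finished.
        with self._lock:
            if not self._done:
                # If fn raises, _done stays False and a later caller retries.
                self._value = fn()
                self._done = True
            return self._value
```

Eight threads calling `call_once` with the same fetch function would trigger a single fetch and all receive the same value, which is the coalescing effect the KMS cache relies on.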

Signed-off-by: Casey Bodley <cbodley@redhat.com>
13 days ago common/async: yield_waiter can return the associated executor
Casey Bodley [Tue, 25 Mar 2025 22:06:36 +0000 (18:06 -0400)]
common/async: yield_waiter can return the associated executor

also adds an empty() function so it's easier to specify its precondition

Signed-off-by: Casey Bodley <cbodley@redhat.com>
13 days ago common/async: yield_waiter overloads for unique_lock
Casey Bodley [Mon, 24 Mar 2025 16:50:16 +0000 (12:50 -0400)]
common/async: yield_waiter overloads for unique_lock

if async_wait() can race with complete() across threads, the
yield_waiter's handler_state needs to be protected by a mutex. add
an async_wait() overload for unique_lock that behaves like
condition_variable::wait(): the lock is released immediately before
suspending, and reacquired immediately before calling its completion
handler

Signed-off-by: Casey Bodley <cbodley@redhat.com>
13 days ago qa: install nvme-cli only if distro remains rocky10 69222/head
Patrick Donnelly [Mon, 1 Jun 2026 15:37:23 +0000 (11:37 -0400)]
qa: install nvme-cli only if distro remains rocky10

Notably, only include the `dnf install` commands if the distro is
not overridden by some other mechanism (like cephfs kernel overrides).

This is only a problem for tentacle presently as the k-stock kernel will
override with centos9.

Fixes: https://tracker.ceph.com/issues/77037
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
13 days ago kv/KeyValueDB: New utility function util_divide_key_range
Adam Kupczyk [Tue, 1 Jul 2025 11:30:59 +0000 (11:30 +0000)]
kv/KeyValueDB: New utility function util_divide_key_range

New function splits provided range into smaller chunks.
Declared in KeyValueDB, but implemented only for RocksDBStore.
Useful for splitting large datasets for multiple threads to
iterate in parallel.
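
The idea of cutting one key range into chunks for parallel iteration can be sketched as below. This is a hypothetical Python illustration over an in-memory sorted list: the function `divide_key_range` and its signature are assumptions and do not reflect how util_divide_key_range is declared or how RocksDBStore picks split points.

```python
from typing import List, Sequence, Tuple


def divide_key_range(sorted_keys: Sequence[str], chunks: int) -> List[Tuple[str, str]]:
    """Split sorted keys into up to `chunks` contiguous (first, last) ranges."""
    if chunks < 1:
        raise ValueError("chunks must be at least 1")
    total = len(sorted_keys)
    if total == 0:
        return []
    chunks = min(chunks, total)
    # Spread the remainder over the first ranges so sizes differ by at most one.
    base, extra = divmod(total, chunks)
    ranges: List[Tuple[str, str]] = []
    start = 0
    for i in range(chunks):
        size = base + (1 if i < extra else 0)
        ranges.append((sorted_keys[start], sorted_keys[start + size - 1]))
        start += size
    return ranges


keys = [chr(ord("a") + i) for i in range(10)]  # "a" .. "j"
parts = divide_key_range(keys, 3)
```

Each returned pair bounds a disjoint slice of the key space, so one worker thread per pair can iterate without overlapping the others.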

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
13 days ago kv/KeyValueDB: New estimate_range_size function
Adam Kupczyk [Tue, 1 Jul 2025 11:23:28 +0000 (11:23 +0000)]
kv/KeyValueDB: New estimate_range_size function

Takes estimate_prefix_size to another level.
Makes detailed inspection of db size possible.
Used primarily for bisecting key range.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
13 days ago Merge pull request #69083 from fultheim/adaptive-cleaner-thresholds
Matan Breizman [Mon, 1 Jun 2026 16:12:08 +0000 (19:12 +0300)]
Merge pull request #69083 from fultheim/adaptive-cleaner-thresholds

crimson/os/seastore: adaptive cleaner thresholds from observed workload

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
13 days ago script/backport-create-issue: catch errors during traversal 69219/head
Patrick Donnelly [Mon, 1 Jun 2026 14:29:13 +0000 (10:29 -0400)]
script/backport-create-issue: catch errors during traversal

A ServerError shouldn't prevent all forward progress.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
13 days ago Merge pull request #69203 from tchaikov/wip-libcephfs-test
Kefu Chai [Mon, 1 Jun 2026 14:08:36 +0000 (22:08 +0800)]
Merge pull request #69203 from tchaikov/wip-libcephfs-test

test/libcephfs: reduce SnapDiffDeletionRecreation bulk_count on Windows

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>
13 days ago Merge pull request #68775 from gardran/wip-gardran-fix-write-v2-deferred-counters
Igor Fedotov [Mon, 1 Jun 2026 13:58:53 +0000 (16:58 +0300)]
Merge pull request #68775 from gardran/wip-gardran-fix-write-v2-deferred-counters

os/bluestore: do not increment *issued_deferred* counters twice

Reviewed-by: Jaya Prakash <jayaprakash@ibm.com>
Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>
13 days ago Merge pull request #69168 from guits/fix-osd-type
Guillaume Abrioux [Mon, 1 Jun 2026 13:49:09 +0000 (15:49 +0200)]
Merge pull request #69168 from guits/fix-osd-type

cephadm: omit --osd-type classic for older ceph-volume

13 days ago Merge PR #69152 into main
Patrick Donnelly [Mon, 1 Jun 2026 13:33:25 +0000 (09:33 -0400)]
Merge PR #69152 into main

* refs/pull/69152/head:
script/backport-create-issue: update custom field name

Reviewed-by: Redouane Kachach <rkachach@redhat.com>
13 days ago mgr/dashboard: Combining Quorum tables data on Monitors page 69040/head
Devika Babrekar [Thu, 21 May 2026 06:33:04 +0000 (12:03 +0530)]
mgr/dashboard: Combining Quorum tables data on Monitors page

Fixes: https://tracker.ceph.com/issues/76746
Signed-off-by: Devika Babrekar <devika.babrekar@ibm.com>
13 days ago cmake: link legacy-option-headers from targets that use legacy options 69215/head
Sun Yuechi [Mon, 1 Jun 2026 06:52:03 +0000 (14:52 +0800)]
cmake: link legacy-option-headers from targets that use legacy options

The *_legacy_options.h headers that define the legacy ConfigValues
members are generated at build time by y2c.py. Linking the
legacy-option-headers INTERFACE library adds an order dependency on
that step. A few targets reference legacy members without linking it,
so under a parallel build they can be compiled before the headers
exist and fail with "class ConfigValues has no member ...":

  neorados_objs, neorados_api_obj - objecter_inflight_ops,
      ms_die_on_unhandled_msg (via Objecter.h / Messenger.h)
  ceph_zstd - compressor_zstd_level
  heap_profiler - log_file

Link legacy-option-headers from them, as ceph_lz4, ceph_snappy and
jerasure_utils already do.

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
13 days ago Merge pull request #67889 from gardran/wip-gardran-no-seq-bytes
Igor Fedotov [Mon, 1 Jun 2026 10:55:28 +0000 (13:55 +0300)]
Merge pull request #67889 from gardran/wip-gardran-no-seq-bytes

os/bluestore: avoid redundant map lookup for deferred op

Reviewed-by: Jaya Prakash <jayaprakash@ibm.com>
13 days ago os/bluestore: do not increment *issued_deferred* counter twice 68775/head
Garry Drankovich [Wed, 6 May 2026 16:19:45 +0000 (19:19 +0300)]
os/bluestore: do not increment *issued_deferred* counter twice
in write v2 mode.

_get_deferred_op() already increments the performance counter on its own.

Signed-off-by: Garry Drankovich <garry.drankovich@clyso.com>
13 days ago qa/cephadm: query iSCSI gateway FQDN from inside the container 69214/head
Kefu Chai [Mon, 1 Jun 2026 10:40:06 +0000 (18:40 +0800)]
qa/cephadm: query iSCSI gateway FQDN from inside the container

rbd-target-api validates that the gateway hostname supplied by gwcli
matches the container's own socket.getfqdn(). Running the same call on
the host can return a different value when the host and container resolve
names differently (e.g. on Rocky 10), causing gateway creation to fail
with HTTP 400 and all subsequent gwcli configuration to break silently.

Query the FQDN from inside the iSCSI container directly so the value is
always consistent with what rbd-target-api expects. This also removes the
"run twice" workaround, which was compensating for host-side DNS
warm-up flakiness rather than addressing the underlying mismatch.

Fixes: https://tracker.ceph.com/issues/74577
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago python-common: Improve profile name string validation
Ashwin M. Joshi [Wed, 18 Feb 2026 05:49:12 +0000 (11:19 +0530)]
python-common: Improve profile name string validation

Fixes: https://tracker.ceph.com/issues/74986
Signed-off-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>
src/python-common/ceph/tests/test_service_spec.py

 Conflicts:
src/python-common/ceph/deployment/service_spec.py

2 weeks ago container: install ceph-mgr-modules-core and ceph-mgr-modules-standard 68989/head
Kefu Chai [Tue, 19 May 2026 14:43:56 +0000 (22:43 +0800)]
container: install ceph-mgr-modules-core and ceph-mgr-modules-standard

The Containerfile uses --setopt=install_weak_deps=False throughout, so
ceph-mgr-modules-core (a Recommends of ceph-mgr, not a Requires) and
the split-out module packages are not automatically installed. Add them
explicitly.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago debian, rpm: add ceph-mgr-modules-standard meta-package
Kefu Chai [Tue, 19 May 2026 14:43:47 +0000 (22:43 +0800)]
debian, rpm: add ceph-mgr-modules-standard meta-package

ceph-mgr-modules-core was split into per-module packages so that users
only need to install what they actually use. To ease migration for
existing deployments that want the full former set, add a meta-package
ceph-mgr-modules-standard that pulls in all modules which were
previously bundled in ceph-mgr-modules-core.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago debian, rpm: add ceph-mgr-cli-api package
Kefu Chai [Tue, 19 May 2026 13:50:23 +0000 (21:50 +0800)]
debian, rpm: add ceph-mgr-cli-api package

cli_api is a new module in this release (not previously shipped in
ceph-mgr-modules-core), so it gets its own package without any
Breaks/Replaces or Obsoletes against the old monolithic package.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago cmake: install cli_api mgr module
Kefu Chai [Tue, 19 May 2026 13:48:38 +0000 (21:48 +0800)]
cmake: install cli_api mgr module

cli_api was missing from the mgr_modules list, so cmake did not install
it into the buildroot, causing RPM packaging to fail with "File not
found".

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago debian: fix missing python3 deps for diskprediction-local and osd-support
Kefu Chai [Thu, 21 May 2026 12:45:36 +0000 (20:45 +0800)]
debian: fix missing python3 deps for diskprediction-local and osd-support

diskprediction-local depends on python3-prettytable and osd-support
depends on python3-cherrypy3; both need to be declared explicitly now
that these modules are separate packages.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago rpm: split ceph-mgr-modules-core into per-module packages
Kefu Chai [Thu, 21 May 2026 12:45:30 +0000 (20:45 +0800)]
rpm: split ceph-mgr-modules-core into per-module packages

ceph-mgr-modules-core has historically bundled always-on modules
together with optional ones, forcing users to install modules and their
dependencies even when they have no use for them. Split each optional
module into its own package so users and distributions can install only
what they need.

ceph-mgr-modules-core is trimmed to the 10 always-on modules defined
in src/mon/MgrMonitor.cc: balancer, crash, devicehealth, orchestrator,
pg_autoscaler, progress, rbd_support, status, telemetry, volumes.
Each optional module now follows the pattern of ceph-mgr-k8sevents and
ceph-mgr-rook.

New packages carry Obsoletes: ceph-mgr-modules-core < 21.0.0 for
proper upgrade path.

The split also exposes cross-module Python dependencies: modules
co-installed in ceph-mgr-modules-core could freely import each other,
but once separated into individual packages those imports require
explicit Requires entries. Now the inter-dependencies are expressed
properly in ceph.spec.in.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago rpm: deduplicate mgr module scriptlets with a macro
Kefu Chai [Thu, 21 May 2026 12:54:02 +0000 (20:54 +0800)]
rpm: deduplicate mgr module scriptlets with a macro

Define %ceph_mgr_module_scripts() to emit the identical %post/%postun
pair for each optional mgr module package, replacing the 5 existing
hand-written copies (dashboard, diskprediction-local, rook, k8sevents,
cephadm) with a single call site per package. Subsequent commits that
introduce new mgr module packages can use the macro from the start.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago debian: split ceph-mgr-modules-core into per-module packages
Kefu Chai [Tue, 19 May 2026 13:45:25 +0000 (21:45 +0800)]
debian: split ceph-mgr-modules-core into per-module packages

ceph-mgr-modules-core has historically bundled always-on modules
together with optional ones, forcing users to install modules and their
dependencies even when they have no use for them. Split each optional
module into its own package so users and distributions can install only
what they need.

ceph-mgr-modules-core is trimmed to the 10 always-on modules defined
in src/mon/MgrMonitor.cc: balancer, crash, devicehealth, orchestrator,
pg_autoscaler, progress, rbd_support, status, telemetry, volumes.
Each optional module now follows the pattern of ceph-mgr-k8sevents and
ceph-mgr-rook.

New packages carry Breaks/Replaces: ceph-mgr-modules-core (<< 21.0.0)
for proper file ownership transfer on upgrade.

The split also exposes cross-module Python dependencies: modules
co-installed in ceph-mgr-modules-core could freely import each other,
but once separated into individual packages those imports require
explicit Depends entries. Now the inter-dependencies are properly
reflected in debian/control.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago Merge pull request #69143 from guits/fix-cv-vg-lv-batch
Guillaume Abrioux [Mon, 1 Jun 2026 07:57:17 +0000 (09:57 +0200)]
Merge pull request #69143 from guits/fix-cv-vg-lv-batch

ceph-volume: retry lvs after empty result and "devices file is missing" stderr

2 weeks ago test/libcephfs: reduce SnapDiffDeletionRecreation bulk_count on Windows 69203/head
Kefu Chai [Mon, 1 Jun 2026 05:19:04 +0000 (13:19 +0800)]
test/libcephfs: reduce SnapDiffDeletionRecreation bulk_count on Windows

this test timed out on Windows, and HugeSnapDiffLargeDelta, at half
the file count, passed in 508 seconds on the same run, suggesting this
test takes ~17 minutes on Windows -- beyond the test runner limit.

we haven't profiled the Windows client yet, but the likely culprit is
EventPoll, the Windows messenger backend, which scans the entire poll
array on every event_wait() and poll_ctl() call rather than using a
keyed data structure.

in this change, we reduce bulk_count to 1 << 12 on Windows. the unique
thing this test covers is the deletion-recreation pattern: a name that
exists as a file in snap1, gets deleted, and reappears as a directory in
snap2 -- it must show up in the diff with both snapids. 4096 produces
1024 such pairs, which is enough to exercise that logic. multi-fragment
snapdiff is already covered by HugeSnapDiffLargeDelta, which derives its
file count from mds_bal_split_size and mds_bal_fragment_fast_factor
explicitly to trigger fragmentation.

Fixes: https://tracker.ceph.com/issues/77015
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago Merge pull request #69135 from VallariAg/wip-nvmeof-teuthology-mon-conf
Vallari Agrawal [Sun, 31 May 2026 16:00:05 +0000 (21:30 +0530)]
Merge pull request #69135 from VallariAg/wip-nvmeof-teuthology-mon-conf

qa/suites/nvmeof: set beacon grace and connect panic

2 weeks ago Merge pull request #66500 from AliMasarweh/wip-alimasa-global-cors
Ali Masarwa [Sun, 31 May 2026 10:30:56 +0000 (13:30 +0300)]
Merge pull request #66500 from AliMasarweh/wip-alimasa-global-cors

RGW: add support for global CORS rule

Reviewed-by: Naman Munet <naman.munet@ibm.com>, Casey Bodley <cbodley@redhat.com>
2 weeks ago Merge pull request #69185 from sunyuechi/wip-with-system-spdk
Kefu Chai [Sun, 31 May 2026 10:26:14 +0000 (18:26 +0800)]
Merge pull request #69185 from sunyuechi/wip-with-system-spdk

cmake,blk/spdk: support WITH_SYSTEM_SPDK

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago compressor/zstd: include <zstd.h> instead of the bundled path 69188/head
Sun Yuechi [Sun, 31 May 2026 09:04:09 +0000 (17:04 +0800)]
compressor/zstd: include <zstd.h> instead of the bundled path

ZstdCompressor.h hard-codes #include "zstd/lib/zstd.h", which only
resolves because include_directories(src) puts the bundled submodule
on the search path. It thus silently depends on src/zstd being checked
out, and breaks with -DWITH_SYSTEM_ZSTD=ON where the submodule is absent.

ceph_zstd already links Zstd::Zstd, whose INTERFACE_INCLUDE_DIRECTORIES
points at the directory holding zstd.h in both modes: src/zstd/lib for
the bundled build, ${Zstd_INCLUDE_DIR} for the system one. Use <zstd.h>
so the include resolves through that interface either way.

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
2 weeks ago Merge pull request #68745 from Hezko/bugfix-13279
Hezko [Sun, 31 May 2026 08:04:07 +0000 (11:04 +0300)]
Merge pull request #68745 from Hezko/bugfix-13279

mgr/dashboard: fix listener add errors

2 weeks ago Merge pull request #69044 from xxhdx1985126/wip-seastore-rewrite-fix
Matan Breizman [Sun, 31 May 2026 07:20:36 +0000 (10:20 +0300)]
Merge pull request #69044 from xxhdx1985126/wip-seastore-rewrite-fix

crimson/os/seastore: force rewrite transactions to conflict with others if they involve insertions on the lba tree

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
2 weeks ago cmake: add WITH_SYSTEM_SPDK to link a system-installed SPDK 69185/head
Sun Yuechi [Sat, 30 May 2026 06:15:12 +0000 (14:15 +0800)]
cmake: add WITH_SYSTEM_SPDK to link a system-installed SPDK

By default ceph builds the bundled src/spdk fork via BuildSPDK. Add a
WITH_SYSTEM_SPDK option that instead locates a distro-provided SPDK
through a new Findspdk.cmake (pkg-config based, modelled on
Finddpdk.cmake), exposing the same spdk::spdk target.

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
2 weeks ago blk/spdk: support both old and new spdk_env_opts member names
Sun Yuechi [Sat, 30 May 2026 06:11:11 +0000 (14:11 +0800)]
blk/spdk: support both old and new spdk_env_opts member names

SPDK 21.01 renamed two struct spdk_env_opts members: pci_whitelist ->
pci_allowed and master_core -> main_core. Guard the assignments in
NVMEDevice with SPDK_VERSION.

pci_whitelist -> pci_allowed:  https://github.com/spdk/spdk/commit/4a6a2824119b
master_core -> main_core:      https://github.com/spdk/spdk/commit/fe137c8970bf

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
2 weeks ago rgw/posix: fix event replay in BucketCache ev_loop 69183/head
Kefu Chai [Sat, 30 May 2026 07:49:18 +0000 (15:49 +0800)]
rgw/posix: fix event replay in BucketCache ev_loop

evec is never cleared after each n->notify() call, so events accumulate
across iterations of ev_loop's inner for loop. Each notify() call
receives not just the current event but all events dispatched in earlier
iterations too.

Add evec.clear() after each n->notify() call.
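
The replay bug and its fix follow a common pattern with reused scratch buffers, sketched here in Python. The function `dispatch_events` and its arguments are hypothetical stand-ins; only the behaviour (a shared event vector that must be cleared after every notify call) is taken from the message.

```python
from typing import Callable, List


def dispatch_events(batches: List[List[str]],
                    notify: Callable[[List[str]], None]) -> None:
    """Deliver each batch to notify(), reusing one scratch vector throughout."""
    evec: List[str] = []
    for batch in batches:
        evec.extend(batch)
        notify(list(evec))  # hand over a copy of the current contents
        evec.clear()        # without this, later calls would replay earlier events


seen: List[List[str]] = []
dispatch_events([["create a"], ["remove b", "create c"]], seen.append)
```

With the `clear()` call removed, the second notification in this sketch would also contain "create a", which mirrors the accumulation described above.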

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago rgw/posix: fix refcount leaks in BucketCache
Kefu Chai [Sat, 30 May 2026 07:49:14 +0000 (15:49 +0800)]
rgw/posix: fix refcount leaks in BucketCache

get_bucket(FLAG_LOCK) increments the refcount via lru.ref(), but three
paths returned without the paired lru.unref(): the "do nothing" early
return and the INVALIDATE branch in notify(), and unconditionally in
invalidate_bucket(). Entries hitting these paths accumulated inflated
refcounts that the LRU could never reclaim, leaking during
~BucketCache() → cache.drain().

Replace the manual lru.unref() calls in notify(), add_entry(),
remove_entry(), invalidate_bucket(), and list_bucket() with a scope_guard
declared before unique_lock. Since the guard outlives ulk, it fires after
the mutex is released on all paths, including exceptions from
getRWTransaction() or txn->commit() (e.g. MDB_MAP_FULL, EIO) that the
manual calls never reached.

list_bucket() also had a bare b->mtx.unlock() after fill(); replace it
with unique_lock{..., std::adopt_lock} so a throw from fill() releases
the mutex too.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks ago qa: Fix teuthology test timing out 69176/head
Eric Zhang [Thu, 28 May 2026 22:56:44 +0000 (15:56 -0700)]
qa: Fix teuthology test timing out

Enable autoscale mode for pools, which is off by default for teuthology.
Increase mon_target_pg_per_osd so pools scale up by enough.

Signed-off-by: Eric Zhang <emzhang@ibm.com>
2 weeks ago qa/workunits/rados: fetch files via GitHub instead of git.ceph.com 69180/head
Laura Flores [Fri, 29 May 2026 21:45:17 +0000 (16:45 -0500)]
qa/workunits/rados: fetch files via GitHub instead of git.ceph.com

The current method fetches files from git.ceph.com, which is unreliable
and sometimes causes the file to contain HTML output instead of the C++ code.

Fetching from GitHub is a more reliable way to get the C++ code.

Fixes: https://tracker.ceph.com/issues/68669
Signed-off-by: Laura Flores <lflores@ibm.com>
2 weeks agoMerge pull request #68934 from cbodley/wip-76578
Casey Bodley [Fri, 29 May 2026 17:52:00 +0000 (13:52 -0400)]
Merge pull request #68934 from cbodley/wip-76578

rgw/beast: add ssl_ciphersuites option for tls 1.3

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agorgw/posix: remove path from table names 69174/head
Nithya Balachandran [Thu, 16 Apr 2026 10:01:50 +0000 (10:01 +0000)]
rgw/posix: remove path from table names

Removes the DB directory path from the table names.

Signed-off-by: Nithya Balachandran <nithya.balachandran@ibm.com>
2 weeks agorgw/posix: implement the quota feature
Nithya Balachandran [Tue, 24 Mar 2026 08:17:52 +0000 (08:17 +0000)]
rgw/posix: implement the quota feature

Implement the quota feature for the POSIX driver.

Signed-off-by: Nithya Balachandran <nithya.balachandran@ibm.com>
2 weeks agoRGW | standalone: add support for accounts in dbstore
Ali Masarwa [Sun, 12 Apr 2026 13:07:38 +0000 (16:07 +0300)]
RGW | standalone: add support for accounts in dbstore

Signed-off-by: Ali Masarwa <amasarwa@redhat.com>
2 weeks agoradosgw-admin: Remove dependence on RADOS
Samarah Uriarte [Tue, 24 Mar 2026 15:21:00 +0000 (15:21 +0000)]
radosgw-admin: Remove dependence on RADOS

Signed-off-by: Samarah Uriarte <samarah.uriarte@ibm.com>
2 weeks agoRGW POSIX - Fix POSIX unittest
Daniel Gryniewicz [Mon, 30 Mar 2026 14:49:47 +0000 (10:49 -0400)]
RGW POSIX - Fix POSIX unittest

Signed-off-by: Daniel Gryniewicz <dang@fprintf.net>
2 weeks agorgw/posix: fix cached size of uploaded objects
Matt Benjamin [Tue, 24 Mar 2026 18:10:28 +0000 (14:10 -0400)]
rgw/posix: fix cached size of uploaded objects

Moves file open and stat into the (atomic) link step, so size
is correctly interned in the cache.  Fix suggested by dang.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agorgw/posix: fix crash in radosgw-admin
Nithya Balachandran [Tue, 24 Mar 2026 11:33:15 +0000 (11:33 +0000)]
rgw/posix: fix crash in radosgw-admin

The POSIXBucket copy constructor incorrectly calls .get() on a
temporary unique_ptr returned by clone(), causing immediate
deletion of the Directory object. This leaves a dangling pointer
that triggers a segfault during destruction.
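
The lifetime problem described above can be sketched as follows. `Directory` and `Bucket` here are simplified hypothetical types, not the POSIX driver classes, and the owning copy constructor is one possible way to avoid the dangling pointer, not necessarily the fix that was applied.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in that counts live instances.
struct Directory {
  static inline int live = 0;
  Directory() { ++live; }
  Directory(const Directory&) { ++live; }
  ~Directory() { --live; }
  std::unique_ptr<Directory> clone() const {
    return std::make_unique<Directory>(*this);
  }
};

// Shows the bug shape: the unique_ptr returned by clone() is a temporary,
// so the clone is destroyed at the end of the full expression and raw
// dangles. The pointer is never dereferenced here.
inline int live_after_temporary_get() {
  Directory d;
  const Directory* raw = d.clone().get();
  (void)raw;
  return Directory::live;  // only d is still alive
}

// One way to keep the clone alive: let the copy own the unique_ptr.
struct Bucket {
  std::unique_ptr<Directory> dir;
  Bucket() : dir(std::make_unique<Directory>()) {}
  Bucket(const Bucket& other) : dir(other.dir->clone()) {}
};
```

`live_after_temporary_get()` reports a single live `Directory`, confirming the clone is already gone when `raw` would be used, whereas copying a `Bucket` leaves two independent live objects.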

Signed-off-by: Nithya Balachandran <nithya.balachandran@ibm.com>
2 weeks agocohort_lru: keep strict discard, but from LRU
Matt Benjamin [Wed, 26 Nov 2025 23:17:02 +0000 (18:17 -0500)]
cohort_lru: keep strict discard, but from LRU

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agoposixdriver: properly destruct BucketCacheEntry objects
Matt Benjamin [Wed, 26 Nov 2025 14:00:03 +0000 (09:00 -0500)]
posixdriver: properly destruct BucketCacheEntry objects

* avoids leak of database handles during eviction

Also adds missing return-ref in invalidate_entry--this would
leak a cache entry.

With this change, we can now tolerate indefinite s3-test runs
with rgw_posix_cache_max_buckets=100.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agocohort_lru: crash fix and reduce lock contention
Matt Benjamin [Tue, 25 Nov 2025 17:41:37 +0000 (12:41 -0500)]
cohort_lru: crash fix and reduce lock contention

Fixes crash induced by taking the address of the last element
of an empty intrusive list (!).

Also, introduces active queue, reducing potential for lock
contention in evict_block():

* entries are tracked on lane::active_queue when lru_refcnt > 1
** on some lane::q otherwise

Objects transition between queues when lru_refcnt changes value--
a value of 0 triggers deletion, as before.
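
The queue rule above can be written as a small decision function. `Placement` and `placement_for` are hypothetical names for illustration only; the thresholds are the ones stated in this message.

```cpp
#include <cassert>

// Hypothetical summary of where an entry lives for a given refcount.
enum class Placement { Deleted, LaneQueue, ActiveQueue };

inline Placement placement_for(unsigned lru_refcnt) {
  if (lru_refcnt == 0) {
    return Placement::Deleted;      // a value of 0 triggers deletion
  }
  if (lru_refcnt > 1) {
    return Placement::ActiveQueue;  // tracked on lane::active_queue
  }
  return Placement::LaneQueue;      // otherwise on some lane::q
}
```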

Fixes: https://tracker.ceph.com/issues/73992
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agoposixdriver: can move buffer::list leaving scope
Matt Benjamin [Fri, 13 Feb 2026 20:29:58 +0000 (15:29 -0500)]
posixdriver: can move buffer::list leaving scope

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agoposixdriver: add provisional manifest
Matt Benjamin [Wed, 4 Feb 2026 02:05:47 +0000 (21:05 -0500)]
posixdriver: add provisional manifest

initially, it is just used to remember the multipart layout, but
likely will see other use.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agoposixdriver: fix cksum_type, flags propagation
Matt Benjamin [Tue, 3 Feb 2026 22:12:22 +0000 (17:12 -0500)]
posixdriver: fix cksum_type, flags propagation

Posixdriver doesn't serialize POSIXMultipartUpload, but rather a
member mp_obj of type POSIXMPObj--so to avoid losing the latter's
inherited cksum_type and cksum_flags members (which are already
copied in), copy them out in POSIXMultiPartUpload::get_info() which
we need to call to copy out dest_placement anyway.

(oops, cksum_type was copied in, but not cksum_flags)

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agoposixdriver: fix cache fill of versioned buckets
Matt Benjamin [Sun, 15 Feb 2026 20:56:03 +0000 (15:56 -0500)]
posixdriver: fix cache fill of versioned buckets

This change completes the original intent (hypothesized) to
conditionally set the FLAG_CURRENT bit on just the current
entries during bucket listing cache fill.

This avoids interning 2 copies of the current version of each
object in the listing cache, and also correctly sets the
FLAG_CURRENT bit as required--so the current versions are correctly
reported in versioned listings.

Janky logic to find the current version by explicitly chasing
the symlink target and saving it outside the enumeration scope
has been replaced with proper call to stat() provided by Dang.

Symlink::fill_cache() is no longer used, so removed.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agoposixdriver: add bde.flags to in bucket cache serde cycle
Matt Benjamin [Sun, 15 Feb 2026 15:21:28 +0000 (10:21 -0500)]
posixdriver: add bde.flags to in bucket cache serde cycle

The upstream logic (mostly?) correctly uses bde.flags when filling
the cache for versioned objects, but cache ser(de)ialization has
been discarding that member.

This change suppresses the visible result where RGW incorrectly produces
multiple versions in non-versioned listing because none uniquely sets
FLAG_CURRENT:

mbenjamin@fedora:~/dev/rgw/s3_py/python$ s3cmd ls s3://sheik2
2026-02-14 22:44           22  s3://sheik2/ginfizz_1
2026-02-14 22:44           22  s3://sheik2/ginfizz_1
2026-02-14 22:44           22  s3://sheik2/ginfizz_1
2026-02-14 22:44           22  s3://sheik2/ginfizz_2
2026-02-14 22:44           22  s3://sheik2/ginfizz_2
2026-02-14 22:44           22  s3://sheik2/ginfizz_2

Corrected result is:

mbenjamin@fedora:~/dev/rgw/s3_py/python$ s3cmd ls s3://sheik2
2026-02-14 22:44           22  s3://sheik2/ginfizz_1
2026-02-14 22:44           22  s3://sheik2/ginfizz_2

Cached listings for versions are still incorrect in that they contain
an extra entry for the "current" version with an empty instance
(from the Symlink)--the visible effect being that list-object-versions
output is incorrect (no entry is sent with IsLatest, after the
empty instance version has been filtered out).
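
As a rough illustration of the serialization gap, here is a minimal round-trip sketch. `DirEntry`, `encode`, `decode`, the text encoding and the bit value of `FLAG_CURRENT` are all assumptions made for this example; the actual cache uses its own entry type and format.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

constexpr std::uint32_t FLAG_CURRENT = 0x1;  // hypothetical bit value

// Hypothetical cache entry with the flags member under discussion.
struct DirEntry {
  std::string name;
  std::uint32_t flags = 0;
};

// Writes flags alongside the name; omitting flags here is the kind of
// omission that loses FLAG_CURRENT across a cache round trip.
inline std::string encode(const DirEntry& e) {
  std::ostringstream os;
  os << e.flags << ' ' << e.name;
  return os.str();
}

inline DirEntry decode(const std::string& s) {
  std::istringstream is(s);
  DirEntry e;
  is >> e.flags;
  is.get();                  // skip the single separator space
  std::getline(is, e.name);  // the rest of the line is the name
  return e;
}
```

After `decode(encode(e))`, an entry marked with `FLAG_CURRENT` still carries that bit, so a listing can pick out the single current version.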

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agoposixdriver: propagate object lock attrs across multipart upload
Matt Benjamin [Thu, 12 Feb 2026 19:13:17 +0000 (14:13 -0500)]
posixdriver: propagate object lock attrs across multipart upload

Retention rules can be specified in init-multipart and, if present,
need to propagate to the final object if the upload completes.

Needed for (e.g.) test_object_lock_delete_multipart_object_with_retention

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agoposixdriver: page in all xattrs in POSIXObject::load_obj_state()
Matt Benjamin [Wed, 11 Feb 2026 21:44:42 +0000 (16:44 -0500)]
posixdriver: page in all xattrs in POSIXObject::load_obj_state()

This seems to be needed for (at least) object lock retention period
checks, e.g., in DeleteObject::execute().

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 weeks agoMerge pull request #68899 from batrick/i76586
Ilya Dryomov [Fri, 29 May 2026 16:02:02 +0000 (18:02 +0200)]
Merge pull request #68899 from batrick/i76586

qa: ignore POOL_FULL for rbd tests exercising full pools

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2 weeks agoMerge PR #67683 into main
Patrick Donnelly [Fri, 29 May 2026 15:08:23 +0000 (11:08 -0400)]
Merge PR #67683 into main

* refs/pull/67683/head:
qa/tasks/cbt: construct venv just for cbt
qa/distros: use consistent naming
qa/tasks/nvme_loop: fix nvme loop task for ubuntu noble
qa/distros: add ubuntu_24.04 as supported container host
qa/distros: bump ubuntu_latest.yaml to 24.04
qa/distros: add all/ubuntu_24.04.yaml
qa/suites/rados/encoder: use random supported distro
qa/ceph-ansible: symlink supported-random-distro$
qa/fs/fscrypt: symlink supported-random-distro$
qa/cephmetrics: symlink supported-random-distro$

Reviewed-by: Redouane Kachach <rkachach@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2 weeks agoMerge PR #69163 into main
Patrick Donnelly [Fri, 29 May 2026 15:07:03 +0000 (11:07 -0400)]
Merge PR #69163 into main

* refs/pull/69163/head:
qa/tasks: capture CommandCrashedError when running nvme list cmd

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
2 weeks agoMerge pull request #66439 from aclamk/aclamk-bs-simpler-flush
Igor Fedotov [Fri, 29 May 2026 15:04:47 +0000 (18:04 +0300)]
Merge pull request #66439 from aclamk/aclamk-bs-simpler-flush

bluestore/bluefs: FileWriter simpler flush

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>
2 weeks agoMerge pull request #68607 from dheart-joe/wip-bluestore-unshare-blob
Igor Fedotov [Fri, 29 May 2026 15:03:05 +0000 (18:03 +0300)]
Merge pull request #68607 from dheart-joe/wip-bluestore-unshare-blob

os/bluestore: optimize shared blob unsharing during snapshot removal

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>
2 weeks agoMerge pull request #69166 from sunyuechi/wip-rgw-swift-error-handler-out-of-line
Casey Bodley [Fri, 29 May 2026 15:01:51 +0000 (11:01 -0400)]
Merge pull request #69166 from sunyuechi/wip-rgw-swift-error-handler-out-of-line

rgw: move SWIFT error_handler out-of-line to fix link failure

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 weeks agoMerge pull request #68898 from gardran/wip-gardran-show-esb-in-metadata
Igor Fedotov [Fri, 29 May 2026 15:01:44 +0000 (18:01 +0300)]
Merge pull request #68898 from gardran/wip-gardran-show-esb-in-metadata

os/bluestore: dump effective elastic shared blobs mode in OSD metadata report

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>
2 weeks agotools/rados: Remove plain text snippets from rados bench JSON output 66936/head
Jacques Heunis [Thu, 15 Jan 2026 12:11:11 +0000 (12:11 +0000)]
tools/rados: Remove plain text snippets from rados bench JSON output

`rados bench` emits performance stats as its output. It is very helpful
for this output to be in a machine-readable format and the CLI provides
the `--format=json` flag to achieve this.

There are some logs that do not respect the formatter flag though, as
they provide status updates as the tool is running and do not form part
of the output dataset. This prevents the contents of stdout from being
valid JSON which destroys the machine-readability of the output.

To resolve this we gate those status messages behind a check for the
formatter. If any specific formatter is provided we do not emit the
status logs. This leaves the plaintext output largely untouched while
helping the machine-readable output to be well-formed.
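
The gating described above might look like the following sketch. `emit_status`, `run_bench` and every message and field in it are hypothetical; they are not the actual `rados bench` output or code, only an illustration of suppressing status lines when a formatter is requested.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Status lines are written only when no formatter was requested
// (an empty formatter name means plain-text output in this sketch).
inline void emit_status(std::ostream& out, const std::string& formatter,
                        const std::string& msg) {
  if (formatter.empty()) {
    out << msg << '\n';
  }
}

// Hypothetical benchmark run: one status line, then the result record.
inline std::string run_bench(const std::string& formatter) {
  std::ostringstream out;
  emit_status(out, formatter, "status: benchmark running");
  if (formatter == "json") {
    out << "{\"ops\":100}";  // made-up result field
  } else {
    out << "ops: 100";
  }
  return out.str();
}
```

With `formatter` set to "json" the output is the bare JSON object, while the plain-text path still shows the status line before the result.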

Fixes: https://tracker.ceph.com/issues/74370
Signed-off-by: Jacques Heunis <jheunis@bloomberg.net>
2 weeks agoqa/rgw: remove ragweed from multifs subsuite 69171/head
Casey Bodley [Fri, 29 May 2026 14:43:33 +0000 (10:43 -0400)]
qa/rgw: remove ragweed from multifs subsuite

it's currently broken with newer python on rocky 10 and ubuntu 24
(tracked in https://tracker.ceph.com/issues/72500) and doesn't provide
interesting test coverage outside of rgw/upgrade

Fixes: https://tracker.ceph.com/issues/76996
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2 weeks agoMerge pull request #69144 from gbregman/main
Gil Bregman [Fri, 29 May 2026 14:39:57 +0000 (17:39 +0300)]
Merge pull request #69144 from gbregman/main

nvmeof: Change the NVMEOF image version to 1.8

2 weeks agorgw/datalog: `radosgw-admin` will no longer convert datalog to omap 68941/head
Adam C. Emerson [Thu, 5 Feb 2026 22:02:44 +0000 (17:02 -0500)]
rgw/datalog: `radosgw-admin` will no longer convert datalog to omap

Omap-backed datalogs are deprecated, so we remove the ability to
convert to them.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2 weeks agorgw/datalog: Remove `rgw default data log backing` option
Adam C. Emerson [Thu, 5 Feb 2026 19:27:43 +0000 (14:27 -0500)]
rgw/datalog: Remove `rgw default data log backing` option

Omap-backed datalogs are deprecated. This option is removed and we no
longer support creating new clusters using them.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
2 weeks agoMerge pull request #68095 from lumir-sliva/fix-deprecated-egrep-fgrep
Ilya Dryomov [Fri, 29 May 2026 13:52:14 +0000 (15:52 +0200)]
Merge pull request #68095 from lumir-sliva/fix-deprecated-egrep-fgrep

qa,src: replace deprecated egrep/fgrep with grep -E/grep -F

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2 weeks agoqa: Ignore deprecated EC plugin warning in teuthology tests 69026/head
Jamie Pryde [Fri, 29 May 2026 11:44:56 +0000 (12:44 +0100)]
qa: Ignore deprecated EC plugin warning in teuthology tests

Add DEPRECATED_EC_PLUGIN to the list of health warnings to
ignore in the thrash-erasure-code-* tests that use deprecated
plugins or techniques. It is expected that this warning will
be raised.

Signed-off-by: Jamie Pryde <jamiepry@uk.ibm.com>
2 weeks agocephadm: cephadm: omit --osd-type classic for older ceph-volume 69168/head
Guillaume Abrioux [Fri, 29 May 2026 11:13:52 +0000 (13:13 +0200)]
cephadm: cephadm: omit --osd-type classic for older ceph-volume

tentacle doesn't know that flag yet.
During an upgrade, teuthology tests can break.
With this fix, we only add the flag when osd_type isn't classic.

Fixes: https://tracker.ceph.com/issues/76968
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2 weeks agomgr/dashboard: Add Sync from/sync from all options on master zone edit 69167/head
Aashish Sharma [Fri, 29 May 2026 11:01:50 +0000 (16:31 +0530)]
mgr/dashboard: Add Sync from/sync from all options on master zone edit

In the dashboard, the master zone's edit functionality includes the expected "Sync from Zones" and "Sync from All Zones" options.

Fixes: https://tracker.ceph.com/issues/76989
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
2 weeks agorgw: move SWIFT error_handler out-of-line to fix link failure 69166/head
Sun Yuechi [Fri, 29 May 2026 10:39:51 +0000 (18:39 +0800)]
rgw: move SWIFT error_handler out-of-line to fix link failure

The two error_handler overrides are defined inline in rgw_rest_swift.h
and delegate to RGWSwiftWebsiteHandler::error_handler, a non-virtual
function defined in rgw_rest_swift.cc (librgw_a.a). Because the header
is included by rgw_rest.cc, the inline bodies are emitted in
librgw_common.a, which then ODR-uses that symbol across archives.

The link line lists librgw_a.a before librgw_common.a, and GNU ld only
pulls archive members on demand: when librgw_a.a is scanned nothing yet
references RGWSwiftWebsiteHandler::error_handler, so rgw_rest_swift.cc.o
is dropped and the symbol is later unresolved. This shows up as a link
failure with gcc 16 -O2.

Move the two bodies into rgw_rest_swift.cc next to the function they
call, so the ODR-use stays within the same object and the build no
longer depends on archive scan order. No functional change.

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
2 weeks agocmake/AddCephTest: use namespaced Catch2 imported targets 69165/head
Sun Yuechi [Fri, 29 May 2026 10:19:18 +0000 (18:19 +0800)]
cmake/AddCephTest: use namespaced Catch2 imported targets

AddCephTest.cmake links the bare target names Catch2 / Catch2WithMain.
With WITH_SYSTEM_CATCH2=ON, CPM resolves Catch2 via find_package(),
which only exports the namespaced IMPORTED targets Catch2::Catch2 /
Catch2::Catch2WithMain. CMake then treats the bare names as plain
library names and the link fails with -lCatch2WithMain, since the
physical library is named libCatch2Main (OUTPUT_NAME "Catch2Main").

Use the namespaced names. Catch2 exports them as ALIASes in the bundled
(CPM subproject) build too, so the default path keeps working.

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>