git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log
ceph.git
2 months agoMerge pull request #68392 from rkachach/fix_issue_cepahdm_qa_task
Redouane Kachach [Fri, 17 Apr 2026 08:50:46 +0000 (10:50 +0200)]
Merge pull request #68392 from rkachach/fix_issue_cepahdm_qa_task

qa: fix misleading "in cluster log" failures during cluster log scan

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Zack Cerza <zack@redhat.com>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>
2 months agomgr/dashboard: sync policy created for a bucket in Object >> Multi-site >> Sync-polic... 67869/head
Naman Munet [Wed, 18 Mar 2026 08:18:06 +0000 (13:48 +0530)]
mgr/dashboard: sync policy created for a bucket in Object >> Multi-site >> Sync-policy, is not reflecting under bucket's replication

Fixes: https://tracker.ceph.com/issues/75581
Signed-off-by: Naman Munet <naman.munet@ibm.com>
2 months agoMerge pull request #68297 from tchaikov/wip-feedback-without-tracker
Kefu Chai [Fri, 17 Apr 2026 05:34:51 +0000 (13:34 +0800)]
Merge pull request #68297 from tchaikov/wip-feedback-without-tracker

mgr/feedback: fix flaky test_issue_tracker_create_with_invalid_key

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Afreen Misbah <afreen@ibm.com>
2 months agoMerge PR #68245 into main
Patrick Donnelly [Thu, 16 Apr 2026 23:58:06 +0000 (19:58 -0400)]
Merge PR #68245 into main

* refs/pull/68245/head:
mon/MonClient: check stopping for auth request handling

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
2 months agomgr: Properly set description in labeled get_perf_schema_python 68435/head
stzuraski898 [Thu, 16 Apr 2026 21:56:53 +0000 (21:56 +0000)]
mgr: Properly set description in labeled get_perf_schema_python

Fixes: https://tracker.ceph.com/issues/76048
Signed-off-by: stzuraski898 <steven.zuraski@ibm.com>
2 months agotest/mgr: Add negative unit tests to MgrCap
szuraski898 [Wed, 4 Feb 2026 22:21:54 +0000 (16:21 -0600)]
test/mgr: Add negative unit tests to MgrCap

Fixes: https://tracker.ceph.com/issues/72938
Signed-off-by: szuraski898 <steven.zuraski@ibm.com>
2 months agotest/mgr: Add ServiceMap unit tests
szuraski898 [Wed, 4 Feb 2026 22:21:49 +0000 (16:21 -0600)]
test/mgr: Add ServiceMap unit tests

Fixes: https://tracker.ceph.com/issues/72938
Signed-off-by: szuraski898 <steven.zuraski@ibm.com>
2 months agotest/mgr: Add PyUtil unit tests
szuraski898 [Wed, 4 Feb 2026 22:21:46 +0000 (16:21 -0600)]
test/mgr: Add PyUtil unit tests

Fixes: https://tracker.ceph.com/issues/72938
Signed-off-by: szuraski898 <steven.zuraski@ibm.com>
2 months agotest/mgr: Add PyFormatter unit tests
szuraski898 [Wed, 4 Feb 2026 22:21:42 +0000 (16:21 -0600)]
test/mgr: Add PyFormatter unit tests

Fixes: https://tracker.ceph.com/issues/72938
Signed-off-by: szuraski898 <steven.zuraski@ibm.com>
2 months agotest/mgr: Add DaemonState unit tests
szuraski898 [Wed, 4 Feb 2026 22:21:36 +0000 (16:21 -0600)]
test/mgr: Add DaemonState unit tests

Fixes: https://tracker.ceph.com/issues/72938
Signed-off-by: szuraski898 <steven.zuraski@ibm.com>
2 months agotest/mgr: Add DaemonKey unit tests
szuraski898 [Wed, 4 Feb 2026 22:21:33 +0000 (16:21 -0600)]
test/mgr: Add DaemonKey unit tests

Fixes: https://tracker.ceph.com/issues/72938
Signed-off-by: szuraski898 <steven.zuraski@ibm.com>
2 months agotest/mgr: Add ClusterState unit tests
szuraski898 [Wed, 4 Feb 2026 22:21:28 +0000 (16:21 -0600)]
test/mgr: Add ClusterState unit tests

Fixes: https://tracker.ceph.com/issues/72938
Signed-off-by: szuraski898 <steven.zuraski@ibm.com>
2 months agotest/mgr: Add TestMgr scaffolding to support unit tests
szuraski898 [Wed, 4 Feb 2026 22:21:16 +0000 (16:21 -0600)]
test/mgr: Add TestMgr scaffolding to support unit tests

Fixes: https://tracker.ceph.com/issues/72938
Signed-off-by: szuraski898 <steven.zuraski@ibm.com>
2 months agorgw: fix uninitialized fields in pubsub topic creation 68204/head
Abhishek Bansal [Fri, 3 Apr 2026 15:42:23 +0000 (21:12 +0530)]
rgw: fix uninitialized fields in pubsub topic creation

Signed-off-by: Abhishek Bansal <abhibansal593@gmail.com>
2 months agoMerge pull request #68289 from cbodley/wip-75945
Casey Bodley [Thu, 16 Apr 2026 20:18:38 +0000 (16:18 -0400)]
Merge pull request #68289 from cbodley/wip-75945

qa/valgrind: generalize suppressions for gcc-14 MismatchedFree

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2 months agorgw/sns: ListTopics uses account root arn for policy evaluation 68429/head
Casey Bodley [Thu, 16 Apr 2026 16:49:43 +0000 (12:49 -0400)]
rgw/sns: ListTopics uses account root arn for policy evaluation

when called by a non-root account user, permissions from identity policy
were not being applied correctly and always resulted in:
> evaluate_iam_policies: implicit deny from identity-based policy

passing a non-empty ARN argument to verify_user_permission() fixes this.
while other SNS APIs use a specific topic's arn, ListTopics doesn't
operate on individual topics so we use the account root user's arn

Fixes: https://tracker.ceph.com/issues/74595
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2 months agorgw/iam: add helper rgw::account::root_arn()
Casey Bodley [Thu, 16 Apr 2026 17:58:13 +0000 (13:58 -0400)]
rgw/iam: add helper rgw::account::root_arn()

we need account root arns for various permission checks, and don't have
a consistent way to construct them

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2 months agoMerge pull request #68113 from benhanokh/dedup_split_head_with_tail_objects
Gabriel Benhanokh [Thu, 16 Apr 2026 18:01:29 +0000 (21:01 +0300)]
Merge pull request #68113 from benhanokh/dedup_split_head_with_tail_objects

rgw/dedup: split-head for objects with tails

2 months agoMerge PR #67823 into main
Patrick Donnelly [Thu, 16 Apr 2026 16:29:26 +0000 (12:29 -0400)]
Merge PR #67823 into main

* refs/pull/67823/head:
qa: remove unused qa_scripts

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2 months agoMerge PR #67822 into main
Patrick Donnelly [Thu, 16 Apr 2026 16:28:52 +0000 (12:28 -0400)]
Merge PR #67822 into main

* refs/pull/67822/head:
qa: remove vestiges of ceph-deploy
doc: remove references to ceph-deploy

Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
2 months agoMerge pull request #67493 from kginonredhat/Bug-56660-Haproxy-error-for-rgw-service...
Redouane Kachach [Thu, 16 Apr 2026 14:21:49 +0000 (16:21 +0200)]
Merge pull request #67493 from kginonredhat/Bug-56660-Haproxy-error-for-rgw-service-with-ipv6

added code to fix failure on Haproxy error for rgw service with ipv6

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
2 months agoMerge pull request #66257 from ShwetaBhosale1/fix_issue_73851_cephadm_crashes_when_ga...
Redouane Kachach [Thu, 16 Apr 2026 13:48:08 +0000 (15:48 +0200)]
Merge pull request #66257 from ShwetaBhosale1/fix_issue_73851_cephadm_crashes_when_ganesha-rados-grace_fails

mgr/cephadm: Handle ganesha-rados-grace tool failure

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
Reviewed-by: Adam King <adking@redhat.com>
2 months agoMerge pull request #66313 from ShwetaBhosale1/fix_issue_73912_prometheus_cannot_acces...
Redouane Kachach [Thu, 16 Apr 2026 13:47:00 +0000 (15:47 +0200)]
Merge pull request #66313 from ShwetaBhosale1/fix_issue_73912_prometheus_cannot_access_nfs_metrics_endpoints

mgr/cephadm: Allow NFS monitoring port through firewall

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@ibm.com>
2 months agoMerge pull request #66381 from ShwetaBhosale1/fix_issue_73949_nfs_with_keepalived_only
Redouane Kachach [Thu, 16 Apr 2026 13:45:58 +0000 (15:45 +0200)]
Merge pull request #66381 from ShwetaBhosale1/fix_issue_73949_nfs_with_keepalived_only

mgr/cephadm: Fix NFS to work properly in keepalived-only ingress mode

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
Reviewed-by: Adam King <adking@redhat.com>
2 months agoMerge pull request #68391 from ifed01/wip-ifed-fix-fcm
Igor Fedotov [Thu, 16 Apr 2026 13:19:12 +0000 (16:19 +0300)]
Merge pull request #68391 from ifed01/wip-ifed-fix-fcm

extblkdev/fcm: do not abort on multi-device volume before we discover…

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>
2 months agoqa/tasks/cbt: install pdsh from el9 RPMs on el10 systems 68424/head
Nitzan Mordechai [Thu, 16 Apr 2026 10:52:52 +0000 (10:52 +0000)]
qa/tasks/cbt: install pdsh from el9 RPMs on el10 systems

PDSH is not available in EPEL10 repos. Install the el9 RPMs directly
via yum, which are compatible with el10.

Fixes: https://tracker.ceph.com/issues/75877
Signed-off-by: Nitzan Mordechai <nmordech@ibm.com>
2 months agotest_cephfs.py: delete purge_dir() helper method, use rmtree() instead 64774/head
Rishabh Dave [Mon, 6 Apr 2026 06:39:12 +0000 (12:09 +0530)]
test_cephfs.py: delete purge_dir() helper method, use rmtree() instead

Use rmtree() instead of purge_dir() for the following two reasons:

1. purge_dir() is recursive, which makes it liable to hit the maximum
   recursion limit.
2. The purge_dir() helper method is redundant, since rmtree() already
   does the same job better.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs.py: remove redundant call to purge_dir()
Rishabh Dave [Wed, 8 Apr 2026 17:49:19 +0000 (23:19 +0530)]
test_cephfs.py: remove redundant call to purge_dir()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs.py: test rmtree on root
Rishabh Dave [Mon, 6 Apr 2026 07:00:03 +0000 (12:30 +0530)]
test_cephfs.py: test rmtree on root

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: don't attempt to unlink root in rmtree
Rishabh Dave [Mon, 6 Apr 2026 06:26:23 +0000 (11:56 +0530)]
pybind/cephfs: don't attempt to unlink root in rmtree

Don't attempt to unlink '/' as part of recursive delete when rmtree('/')
is called as it is an invalid op.

Fixes: https://tracker.ceph.com/issues/75924
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs.py: test rmtree with and without should_cancel
Rishabh Dave [Mon, 6 Apr 2026 07:00:47 +0000 (12:30 +0530)]
test_cephfs.py: test rmtree with and without should_cancel

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: make should_cancel an optional parameter for rmtree()
Rishabh Dave [Mon, 6 Apr 2026 06:21:06 +0000 (11:51 +0530)]
pybind/cephfs: make should_cancel an optional parameter for rmtree()

Make it optional to pass a function reference for the should_cancel
parameter of rmtree() by defining a default value for it. Right now one
has to pass a simple lambda function returning False to make the rmtree
operation uninterruptible:

rmtree('some-dir', should_cancel=lambda: False)

With should_cancel's default value set to "lambda: False", the call is
much simpler:

rmtree('some-dir')
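
The effect of the default value can be sketched as follows. This is only an illustration on the local filesystem using os.walk(); it is not the code from the cephfs python bindings, and the return value is an assumption made for the example.

```python
import os


def rmtree(path, should_cancel=lambda: False):
    """Illustrative local-filesystem sketch, not the libcephfs binding.

    Removes `path` bottom-up and polls should_cancel() before each
    removal. Returns True when the tree was removed, False when the
    operation was cancelled part-way (hypothetical return convention).
    """
    for root, dirs, files in os.walk(path, topdown=False):
        for name in files:
            if should_cancel():
                return False
            os.unlink(os.path.join(root, name))
        for name in dirs:
            if should_cancel():
                return False
            full = os.path.join(root, name)
            # os.walk lists symlinks to directories under `dirs`;
            # they must be unlinked rather than rmdir'ed.
            if os.path.islink(full):
                os.unlink(full)
            else:
                os.rmdir(full)
    os.rmdir(path)
    return True
```

With the default in place, rmtree('some-dir') behaves like the older rmtree('some-dir', should_cancel=lambda: False) call, while callers that need cancellation can still pass their own callable.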

Fixes: https://tracker.ceph.com/issues/75926
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agomgr/volumes: clone using cptree() from cephfs python bindings
Rishabh Dave [Fri, 7 Nov 2025 08:01:55 +0000 (13:31 +0530)]
mgr/volumes: clone using cptree() from cephfs python bindings

Fixes: https://tracker.ceph.com/issues/72357
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs: add unit tests for cptree() in cephfs python bindings
Rishabh Dave [Thu, 16 Oct 2025 14:05:41 +0000 (19:35 +0530)]
test_cephfs: add unit tests for cptree() in cephfs python bindings

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest/pybind/assertions: add helper method assert_less
Rishabh Dave [Sat, 4 Apr 2026 13:03:35 +0000 (18:33 +0530)]
test/pybind/assertions: add helper method assert_less

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: use depth-first, non-recursive approach for cloning
Rishabh Dave [Thu, 31 Jul 2025 12:59:45 +0000 (18:29 +0530)]
pybind/cephfs: use depth-first, non-recursive approach for cloning

Switch to non-recursive approach in cptree() (located in bulk_copy() in
async_cloner.py) to prevent it from crashing with "RecursionError:
maximum recursion depth exceeded".
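
A minimal sketch of the non-recursive idea, assuming a plain local filesystem rather than libcephfs: an explicit stack of (source, destination) directory pairs replaces the call stack, so the depth of the tree no longer counts against Python's recursion limit. The function below is illustrative and is not the cptree() from the bindings.

```python
import os
import shutil


def cptree(src, dst):
    """Copy the directory tree at src to dst without recursion (sketch)."""
    os.makedirs(dst, exist_ok=True)
    # Each stack entry is a directory that still has to be copied.
    stack = [(src, dst)]
    while stack:
        cur_src, cur_dst = stack.pop()  # LIFO pop gives depth-first order
        with os.scandir(cur_src) as entries:
            for entry in entries:
                target = os.path.join(cur_dst, entry.name)
                if entry.is_dir(follow_symlinks=False):
                    os.makedirs(target, exist_ok=True)
                    stack.append((entry.path, target))
                elif entry.is_symlink():
                    # Recreate the link instead of copying what it points to.
                    os.symlink(os.readlink(entry.path), target)
                else:
                    shutil.copy2(entry.path, target)
```

The stack grows with the number of pending directories instead of with call depth, which is what avoids the RecursionError.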

Fixes: https://tracker.ceph.com/issues/72357
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs: call object setup/teardown for all tests in TestWithRootUser
Rishabh Dave [Wed, 12 Nov 2025 17:04:40 +0000 (22:34 +0530)]
test_cephfs: call object setup/teardown for all tests in TestWithRootUser

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs.py: add tests for utimensat()
Rishabh Dave [Wed, 12 Nov 2025 16:50:59 +0000 (22:20 +0530)]
test_cephfs.py: add tests for utimensat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: add python bindings for utimensat()
Rishabh Dave [Mon, 10 Nov 2025 13:36:10 +0000 (19:06 +0530)]
pybind/cephfs: add python bindings for utimensat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agoqa/cephfs: add tests for chownat()
Rishabh Dave [Wed, 12 Nov 2025 15:39:47 +0000 (21:09 +0530)]
qa/cephfs: add tests for chownat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: add python bindings for chownat()
Rishabh Dave [Mon, 10 Nov 2025 13:34:26 +0000 (19:04 +0530)]
pybind/cephfs: add python bindings for chownat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs.py: add tests for chmodat()
Rishabh Dave [Wed, 5 Nov 2025 15:13:12 +0000 (20:43 +0530)]
test_cephfs.py: add tests for chmodat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: add python bindings for chmodat()
Rishabh Dave [Wed, 5 Nov 2025 13:43:16 +0000 (19:13 +0530)]
pybind/cephfs: add python bindings for chmodat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs.py: add tests for symlinkat()
Rishabh Dave [Wed, 5 Nov 2025 08:56:35 +0000 (14:26 +0530)]
test_cephfs.py: add tests for symlinkat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: add python binding for symlinkat()
Rishabh Dave [Wed, 5 Nov 2025 08:55:54 +0000 (14:25 +0530)]
pybind/cephfs: add python binding for symlinkat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs.py: add test for readlinkat()
Rishabh Dave [Wed, 5 Nov 2025 05:23:03 +0000 (10:53 +0530)]
test_cephfs.py: add test for readlinkat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: add python binding for readlinkat()
Rishabh Dave [Wed, 5 Nov 2025 05:14:01 +0000 (10:44 +0530)]
pybind/cephfs: add python binding for readlinkat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: add tests for statxat()
Rishabh Dave [Wed, 5 Nov 2025 15:29:30 +0000 (20:59 +0530)]
pybind/cephfs: add tests for statxat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: add python bindings for statxat()
Rishabh Dave [Tue, 4 Nov 2025 13:07:32 +0000 (18:37 +0530)]
pybind/cephfs: add python bindings for statxat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agotest_cephfs.py: add tests for mkdirat()
Rishabh Dave [Thu, 2 Apr 2026 12:09:13 +0000 (17:39 +0530)]
test_cephfs.py: add tests for mkdirat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agopybind/cephfs: add python binding for mkdirat()
Rishabh Dave [Tue, 4 Nov 2025 10:56:58 +0000 (16:26 +0530)]
pybind/cephfs: add python binding for mkdirat()

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 months agomon/MonClient: check stopping for auth request handling 68245/head
Patrick Donnelly [Tue, 31 Mar 2026 13:10:08 +0000 (18:40 +0530)]
mon/MonClient: check stopping for auth request handling

When the MonClient is shutting down, it is no longer safe to
access MonClient::auth and other members. The AuthClient
methods should be checking the stopping flag in this case.

The key bit from the segfault backtrace (thanks Brad Hubbard!) is here:

#13 0x00007f921ee23c40 in ProtocolV2::handle_auth_done (this=0x7f91cc0945f0, payload=...) at /usr/include/c++/12/bits/shared_ptr_base.h:1665
#14 0x00007f921ee16a29 in ProtocolV2::run_continuation (this=0x7f91cc0945f0, continuation=...) at msg/./src/msg/async/ProtocolV2.cc:54
#15 0x00007f921edee56e in std::function<void (char*, long)>::operator()(char*, long) const (__args#1=0, __args#0=<optimized out>, this=0x7f91cc0744d8) at /usr/include/c++/12/bits/std_function.h:591
#16 AsyncConnection::process (this=0x7f91cc074140) at msg/./src/msg/async/AsyncConnection.cc:485
#17 0x00007f921ee3300c in EventCenter::process_events (this=0x55efc9d0a058, timeout_microseconds=<optimized out>, working_dur=0x7f921a891d88) at msg/./src/msg/async/Event.cc:465
#18 0x00007f921ee38bf9 in operator() (__closure=<optimized out>) at msg/./src/msg/async/Stack.cc:50
#19 std::__invoke_impl<void, NetworkStack::add_thread(Worker*)::<lambda()>&> (__f=...) at /usr/include/c++/12/bits/invoke.h:61
#20 std::__invoke_r<void, NetworkStack::add_thread(Worker*)::<lambda()>&> (__fn=...) at /usr/include/c++/12/bits/invoke.h:111
#21 std::_Function_handler<void(), NetworkStack::add_thread(Worker*)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/12/bits/std_function.h:290
#22 0x00007f921e81f253 in std::execute_native_thread_routine (__p=0x55efc9e9c5f0) at ../../../../../src/libstdc++-v3/src/c++11/thread.cc:82
#23 0x00007f921f5e8ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#24 0x00007f921f67a8d0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

I originally thought this might be the issue causing [1]; however, that
turned out to be an issue caused by OpenSSL's use of atexit handlers.

I still think there is a bug here so I am continuing with this change.

[1] https://tracker.ceph.com/issues/59335

Fixes: https://tracker.ceph.com/issues/76017
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
2 months agoscript/ceph-backport: skip fetch if merge commit already exists locally 68422/head
Kefu Chai [Thu, 16 Apr 2026 10:13:28 +0000 (18:13 +0800)]
script/ceph-backport: skip fetch if merge commit already exists locally

If the merge commit is already present in the local object store, there
is no need to fetch it from the upstream remote. Use git-cat-file to
check before fetching, avoiding unnecessary network traffic.
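
The check can be sketched in Python as below. The actual change is in the ceph-backport script; this is only an illustration. The function names, the `run` parameter (added so the logic can be exercised without a repository) and fetching by commit id are assumptions. `git cat-file -e` exits with status 0 when the named object exists.

```python
import subprocess


def commit_exists_locally(sha, run=subprocess.run):
    """Return True if `sha` names a commit in the local object store."""
    result = run(
        ["git", "cat-file", "-e", f"{sha}^{{commit}}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def fetch_if_missing(sha, remote, run=subprocess.run):
    """Fetch `sha` from `remote` only when it is not already present.

    Returns True if a fetch was performed, False if it was skipped.
    """
    if commit_exists_locally(sha, run):
        return False
    run(["git", "fetch", remote, sha], check=True)
    return True
```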

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
2 months agoMerge pull request #68205 from idryomov/wip-transient-policy-release-note
Ilya Dryomov [Thu, 16 Apr 2026 09:00:58 +0000 (11:00 +0200)]
Merge pull request #68205 from idryomov/wip-transient-policy-release-note

doc: add RBD_LOCK_MODE_EXCLUSIVE_TRANSIENT release note

Reviewed-by: Miki Patel <miki.patel132@gmail.com>
2 months agorgw/dedup: This PR extends the RGW dedup split-head feature to support objects that... 68113/head
benhanokh [Mon, 30 Mar 2026 08:22:51 +0000 (11:22 +0300)]
rgw/dedup: This PR extends the RGW dedup split-head feature to support objects that already have tail RADOS objects (i.e. objects larger than the head chunk size).
Previously, split-head was restricted to objects whose entire data fit in the head (≤4 MiB).
It also migrates the split-head manifest representation from the legacy explicit-objs format to the prefix+index rules-based format.

Refactored should_split_head():
Now performs a larger set of eligibility checks:
 * d_split_head flag is set
 * single-part object only
 * non-empty head
 * not a legacy manifest
 * not an Alibaba Cloud OSS AppendObject

Explicit skips for unsupported manifest types:
 * old-style explicit-objs manifests
 * OSS AppendObject manifests (detected via non-empty override_prefix)
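
The eligibility checks listed above can be read as a single predicate. The sketch below is a Python illustration only (the real code is C++ in RGW); the ManifestInfo type and all of its field names are hypothetical stand-ins for whatever the dedup code actually inspects.

```python
from dataclasses import dataclass


@dataclass
class ManifestInfo:
    # Hypothetical summary of an object's manifest, for illustration.
    num_parts: int           # 1 for a single-part object
    head_size: int           # bytes stored in the head object
    has_explicit_objs: bool  # legacy explicit-objs manifest
    override_prefix: str     # non-empty for OSS AppendObject manifests


def should_split_head(split_head_enabled, manifest):
    """Return True only if every eligibility check passes."""
    if not split_head_enabled:      # split-head flag is not set
        return False
    if manifest.num_parts != 1:     # single-part objects only
        return False
    if manifest.head_size == 0:     # head must be non-empty
        return False
    if manifest.has_explicit_objs:  # legacy manifest: skip
        return False
    if manifest.override_prefix:    # OSS AppendObject: skip
        return False
    return True
```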

New config option: rgw_dedup_split_obj_head:
  Default is true (split-head enabled).
  Setting to false disables split-head entirely.

Tail object lookup via manifest iterator:
  Replaces the old get_tail_ioctx() which manually constructed the tail OID via generate_split_head_tail_name().
  The new function simply calls manifest.obj_begin() and resolves the first tail object location through the standard manifest iterator.

Stats cleanup:
Removed the "Potential Dedup" stats section (small_objs_stat, dup_head_bytes, dup_head_bytes_estimate, ingress_skip_too_small_64KB*)
 which tracked 64KB–4MB objects as potential-but-skipped candidates.
 Since split-head now covers all sizes, this distinction is no longer meaningful. calc_deduped_bytes() is simplified accordingly.

Signed-off-by: benhanokh <gbenhano@redhat.com>
2 months agolibrbd: tweak ReadResult's handler for SparseBufferlist type 68202/head
Ilya Dryomov [Fri, 16 Jan 2026 19:38:48 +0000 (20:38 +0100)]
librbd: tweak ReadResult's handler for SparseBufferlist type

It's similar in concept to the handler for the new ChildObject type.
To highlight that, make comments and log messages consistent between
the two.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2 months agolibrbd: avoid losing sparseness in read_parent()
Ilya Dryomov [Thu, 15 Jan 2026 12:56:13 +0000 (13:56 +0100)]
librbd: avoid losing sparseness in read_parent()

When read_parent() constructs a read for image_ctx->parent, it employs
a thick bufferlist (either re-using the bufferlist on the object extent
or creating a temporary one inside of C_ObjectReadMergedExtents).  This
forgoes any sparseness: even if the result obtained by ObjectRequest is
sparse, it's thickened by ReadResult's handler for Bufferlist type.

This behavior is very old and hasn't been a problem for regular clones
because the public API returns a thick bufferlist in the case of C++ or
equivalent char* buf/struct iovec iov[] buffers in the case of C anyway.
ObjectCacher isn't sparse-aware but it's also not used for caching reads
by default and reading from parent for the purposes of a copyup is done
in CopyupRequest in a way that preserves sparseness.  However, when it
comes to migration, source image reads go through read_parent() and the
destination image gets thickened as an inadvertent side effect.

Fix this by introducing a new ChildObject type for ReadResult whose
handler would plant the result obtained by parent's ObjectRequest into
child's ObjectRequest, as if read_parent() wasn't even called.

Fixes: https://tracker.ceph.com/issues/73831
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2 months agoMerge pull request #68383 from amathuria/wip-amat-fix-bad-statemachine-event
Matan Breizman [Thu, 16 Apr 2026 08:14:11 +0000 (11:14 +0300)]
Merge pull request #68383 from amathuria/wip-amat-fix-bad-statemachine-event

crimson/osd: fix race between AllReplicasRecovered and DeferRecovery

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
2 months agoqa: fix misleading "in cluster log" failures during cluster log scan 68392/head
Redouane Kachach [Wed, 15 Apr 2026 16:05:36 +0000 (18:05 +0200)]
qa: fix misleading "in cluster log" failures during cluster log scan

Summary:

Fix misleading failure reasons reported as `"… in cluster log"` when
no such log entry actually exists.

The cephadm task currently treats `grep` errors from the cluster log
scan as if they were actual log matches. This can produce bogus
failure summaries when `ceph.log` is missing, especially after early
failures such as image pull or bootstrap problems.

Problem:

first_in_ceph_log() currently:

- returns stdout if a match is found
- otherwise returns stderr

The caller then treats any non-None value as a real cluster log hit and formats it as:

"<value>" in cluster log

That means an error like:

  grep: /var/log/ceph/<fsid>/ceph.log: No such file or directory

can be misreported as if it came from the cluster log.

This change makes cluster log scanning robust and accurate by:

- checking whether /var/log/ceph/<fsid>/ceph.log exists before scanning
- using check_status=False for the grep pipeline
- treating only stdout as a real log match
- treating stderr as a scan error instead of log content
- avoiding overwrite of a more accurate pre-existing failure_reason
- reporting scan failures separately as cluster log scan failed
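
The intended behaviour can be sketched as below. This is not the cephadm task code: the function names, the local subprocess call (the real task runs grep on a remote host) and the exact message formats are assumptions made for illustration. grep's exit status 1 means "no match", while higher values indicate an error.

```python
import os
import subprocess


def classify_grep(returncode, stdout, stderr):
    """Map a grep result to (match, error); only stdout counts as a match."""
    if stdout.strip():
        return stdout.strip().splitlines()[0], None
    if returncode in (0, 1):
        return None, None  # scan ran fine, nothing matched
    return None, stderr.strip() or f"grep exited with status {returncode}"


def scan_log(pattern, log_path):
    """Scan log_path for pattern, reporting a missing file as a scan error."""
    if not os.path.exists(log_path):
        return None, f"{log_path} does not exist"
    proc = subprocess.run(
        ["grep", "-E", "-m", "1", "--", pattern, log_path],
        capture_output=True, text=True, check=False,
    )
    return classify_grep(proc.returncode, proc.stdout, proc.stderr)


def failure_reason(existing, match, error):
    """Keep an existing reason; otherwise report a match or a scan failure."""
    if existing:
        return existing
    if match:
        return f'"{match}" in cluster log'
    if error:
        return f"cluster log scan failed: {error}"
    return None
```

Keeping the match and the error in separate slots is what prevents a grep error message from being formatted as if it were a cluster log entry.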

Fixes: https://tracker.ceph.com/issues/76051
Signed-off-by: Redouane Kachach <rkachach@ibm.com>
2 months agoMerge pull request #68154 from leiwen2025/rv64-crc32c
Kefu Chai [Thu, 16 Apr 2026 06:55:48 +0000 (14:55 +0800)]
Merge pull request #68154 from leiwen2025/rv64-crc32c

src/common: optimize crc32c using zbc extension for riscv64

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
2 months agoMerge pull request #68301 from rhcs-dashboard/inline-edit-emitter
Nizamudeen A [Thu, 16 Apr 2026 05:59:12 +0000 (11:29 +0530)]
Merge pull request #68301 from rhcs-dashboard/inline-edit-emitter

mgr/dashboard: table cell inline edit emit editing state

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
2 months agoMerge pull request #68282 from tchaikov/wip-mgr-module-neg-exit-code
Kefu Chai [Thu, 16 Apr 2026 05:13:05 +0000 (13:13 +0800)]
Merge pull request #68282 from tchaikov/wip-mgr-module-neg-exit-code

mgr/crash, mgr/status: return negative errno to fix CLI exit code

Reviewed-by: Dan Mick <dmick@ibm.com>
2 months agoMerge pull request #68283 from tchaikov/wip-ceph-crash-less-noisy
Kefu Chai [Thu, 16 Apr 2026 05:11:59 +0000 (13:11 +0800)]
Merge pull request #68283 from tchaikov/wip-ceph-crash-less-noisy

ceph-crash: reduce log noise from auth fallback in post_crash()

Reviewed-by: Dan Mick <dmick@ibm.com>
2 months agoMerge pull request #68181 from cloudbehl/subvolume-fixes-for-double-values
Aashish Sharma [Thu, 16 Apr 2026 05:00:07 +0000 (10:30 +0530)]
Merge pull request #68181 from cloudbehl/subvolume-fixes-for-double-values

Fixes for subvolume overview in grafana

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
2 months agoMerge pull request #68403 from phlogistonjohn/jjm-codeowners-build-sig
Dan Mick [Thu, 16 Apr 2026 03:45:49 +0000 (20:45 -0700)]
Merge pull request #68403 from phlogistonjohn/jjm-codeowners-build-sig

CODEOWNERS: add a build-sig group for various build / test files

2 months agoqa/radosgw_admin: add a testcase to verify use of --marker and --object-version in... 68163/head
Oguzhan Ozmen [Wed, 1 Apr 2026 17:25:51 +0000 (17:25 +0000)]
qa/radosgw_admin: add a testcase to verify use of --marker and --object-version in bucket listing

Signed-off-by: Oguzhan Ozmen <oozmen@bloomberg.net>
2 months agoqa: Port rgw multifs test to crimson-rados 67372/head
Kautilya Tripathi [Tue, 9 Dec 2025 02:37:39 +0000 (02:37 +0000)]
qa: Port rgw multifs test to crimson-rados

This adds the rgw multifs qa tests to the crimson-rados suite

Signed-off-by: Kautilya Tripathi <kautilya.tripathi@ibm.com>
2 months agocmake/isal-l: explicitly configure libdir to avoid wrong libisal.a 68404/head
Igor Fedotov [Wed, 15 Apr 2026 21:35:13 +0000 (00:35 +0300)]
cmake/isal-l: explicitly configure libdir to avoid wrong libisal.a
location.

This is to fix the following libec_isa.so build error:

FAILED: lib/libec_isa.so
: && /usr/bin/g++-14 -fPIC -O3 -DNDEBUG
-Wl,--dependency-file=src/erasure-code/isa/CMakeFiles/ec_isa.dir/link.d
-shared -Wl,-soname,libec_isa.so -o lib/libec_isa.so
src/erasure-code/CMakeFiles/erasure_code_objs.dir/ErasureCode.cc.o
src/erasure-code/isa/CMakeFiles/ec_isa.dir/ErasureCodeIsa.cc.o
src/erasure-code/isa/CMakeFiles/ec_isa.dir/ErasureCodeIsaTableCache.cc.o
src/erasure-code/isa/CMakeFiles/ec_isa.dir/ErasureCodePluginIsa.cc.o
src/isa-l/install/lib/libisal.a  -ldl  /usr/lib64/librt.a  -lresolv
-Wl,--as-needed -latomic && :
/usr/lib64/gcc/x86_64-suse-linux/14/../../../../x86_64-suse-linux/bin/ld:
cannot find src/isa-l/install/lib/libisal.a: No such file or directory
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.

It looks like [under some circumstances?] the build procedure puts the
resulting .a under the src/isa-l/install/lib64 path, which causes a
lookup error during .so linkage.

This patch enforces lib64 usage.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
2 months agoCODEOWNERS: add a build-sig group for various build / test files 68403/head
John Mulligan [Wed, 15 Apr 2026 21:15:03 +0000 (17:15 -0400)]
CODEOWNERS: add a build-sig group for various build / test files

Add a new build-sig group that covers some of the high level tools and
scripts used in the build and CI processes. This should help PRs not
pass by without notifying people who care about these things.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2 months agomon/MonClient: un-inline MonCommand ctor/dtor to reduce compile times 68400/head
Max Kellermann [Wed, 13 Nov 2024 17:21:49 +0000 (18:21 +0100)]
mon/MonClient: un-inline MonCommand ctor/dtor to reduce compile times

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
2 months agomon/MonClient: eliminate duplicate start_mon_command() overload
Max Kellermann [Tue, 12 Aug 2025 09:30:14 +0000 (11:30 +0200)]
mon/MonClient: eliminate duplicate start_mon_command() overload

The only difference was that the first overload implied mon_rank==-1.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
2 months agomon/MonClient: move inner class ContextVerter out of the header
Max Kellermann [Tue, 12 Aug 2025 09:11:39 +0000 (11:11 +0200)]
mon/MonClient: move inner class ContextVerter out of the header

Reduce compile times by compiling these start_mon_command() overloads
only once.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
2 months agomon/OSDMonitor: include cleanup
Max Kellermann [Fri, 15 Aug 2025 09:59:12 +0000 (11:59 +0200)]
mon/OSDMonitor: include cleanup

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
2 months agoRevert "vstart: disable extblkdev plugins for file-backed OSDs" 68391/head
Igor Fedotov [Wed, 15 Apr 2026 18:42:46 +0000 (21:42 +0300)]
Revert "vstart: disable extblkdev plugins for file-backed OSDs"

This reverts commit 92e902ecfe2cfed217136dc64e47500ec50f9c07.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
2 months agoextblkdev/fcm: do not abort on multi-device volume before we discovered it's FCM one
Igor Fedotov [Wed, 15 Apr 2026 15:46:56 +0000 (18:46 +0300)]
extblkdev/fcm: do not abort on multi-device volume before we discovered it's FCM one

https://tracker.ceph.com/issues/75819

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
2 months agoqa/valgrind: generalize and group MismatchedFree suppressions 68289/head
Casey Bodley [Tue, 14 Apr 2026 18:50:04 +0000 (14:50 -0400)]
qa/valgrind: generalize and group MismatchedFree suppressions

combine the various MismatchedFree suppressions into unconditional ones
for each function

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2 months agoFixes for subvolume overview in grafana 68181/head
Ankush Behl [Thu, 2 Apr 2026 11:00:09 +0000 (16:30 +0530)]
Fixes for subvolume overview in grafana
- multiple values were shown in graph and single state
- Remove All selection from subvolume path

fixes: https://tracker.ceph.com/issues/75849

Signed-off-by: Ankush Behl <cloudbehl@gmail.com>
2 months agoMerge pull request #68355 from nhoad/pretty-format-docs
Casey Bodley [Wed, 15 Apr 2026 13:29:08 +0000 (09:29 -0400)]
Merge pull request #68355 from nhoad/pretty-format-docs

rgw: Add documentation for the --pretty-format option

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 months agoMerge pull request #68062 from anthonyeleven/global.yaml.in
Anthony D'Atri [Wed, 15 Apr 2026 13:22:37 +0000 (09:22 -0400)]
Merge pull request #68062 from anthonyeleven/global.yaml.in

src/common/options: Modernize language in global.yaml.in

2 months agoMerge pull request #65986 from HeinleinSupport/wip-cephadm-72696
Guillaume Abrioux [Wed, 15 Apr 2026 13:18:26 +0000 (15:18 +0200)]
Merge pull request #65986 from HeinleinSupport/wip-cephadm-72696

mgr/cephadm: renames ceph_device to ceph_device_lvm

2 months agoMerge PR #67032 into main
Patrick Donnelly [Wed, 15 Apr 2026 13:11:34 +0000 (09:11 -0400)]
Merge PR #67032 into main

* refs/pull/67032/head:
qa: add trivial cephfs-tool bench test
doc/cephfs: add cephfs-tool documentation
tools/cephfs: add new cephfs-tool

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 months agomgr/dashboard: table cell inline edit emit editing state 68301/head
Nizamudeen A [Fri, 10 Apr 2026 06:25:27 +0000 (11:55 +0530)]
mgr/dashboard: table cell inline edit emit editing state

- Emit the editing state so that the consuming component can manipulate
that to add some extra validations

- Replace button with cds-icon-button.
- Replace submit button with tertiary instead of ghost for visibility.
- Also added a cancel button to cancel the ongoing edit

Fixes: https://tracker.ceph.com/issues/75949
Signed-off-by: Nizamudeen A <nia@redhat.com>
2 months agoMerge pull request #68322 from anthonyeleven/percent
Anthony D'Atri [Wed, 15 Apr 2026 12:07:45 +0000 (08:07 -0400)]
Merge pull request #68322 from anthonyeleven/percent

doc/rados/configuration: Update bluestore-config-ref.rst WAL+DB sizing

2 months agokv/rocksdb: Fix priority of rocksdb cache perf counters 68376/head
Adam Kupczyk [Wed, 15 Apr 2026 05:25:22 +0000 (05:25 +0000)]
kv/rocksdb: Fix priority of rocksdb cache perf counters

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
2 months agocrimson/osd: fix race between AllReplicasRecovered and DeferRecovery 68383/head
Aishwarya Mathuria [Tue, 14 Apr 2026 07:59:36 +0000 (13:29 +0530)]
crimson/osd: fix race between AllReplicasRecovered and DeferRecovery

Fixes a crash where AllReplicasRecovered event arrives at NotRecovering
state due to async event delivery race with DeferRecovery preemption.

The issue occurs when:
1. Recovery completes and AllReplicasRecovered is queued asynchronously
2. A higher priority operation (e.g., client I/O) triggers AsyncReserver
   to preempt recovery, posting DeferRecovery event
3. DeferRecovery is processed first, transitioning PG to NotRecovering
4. AllReplicasRecovered arrives at wrong state → crash with "bad state
   machine event" because NotRecovering doesn't handle it

The fix follows Classic OSD's approach in PrimaryLogPG::start_recovery_ops():
clear PG_STATE_RECOVERING before posting recovery completion events. This
makes the existing safety check in PeeringState::Recovering::react() work:
when DeferRecovery arrives and sees !state_test(PG_STATE_RECOVERING), it
discards itself, preventing the state transition that would cause the crash.
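
The ordering problem and the fix described above can be illustrated with a toy model. This is only a sketch, not the crimson code: `ToyPG`, `recovering_flag`, `finish_recovery`, `preempt` and the "Recovered" end state are hypothetical names, and a plain deque stands in for asynchronous event delivery.

```python
from collections import deque


class BadStateMachineEvent(Exception):
    """Raised when a state receives an event it has no handler for."""


class ToyPG:
    def __init__(self, clear_flag_before_posting):
        self.state = "Recovering"
        self.recovering_flag = True  # stands in for PG_STATE_RECOVERING
        self.clear_flag_before_posting = clear_flag_before_posting
        self.events = deque()

    def finish_recovery(self):
        # With the fix, the flag is cleared before the completion event is queued.
        if self.clear_flag_before_posting:
            self.recovering_flag = False
        self.events.append("AllReplicasRecovered")

    def preempt(self):
        # Model the race: DeferRecovery is processed ahead of the queued completion.
        self.events.appendleft("DeferRecovery")

    def react(self, event):
        if self.state == "Recovering":
            if event == "DeferRecovery":
                if not self.recovering_flag:
                    return  # stale preemption: discard, stay in Recovering
                self.state = "NotRecovering"
            elif event == "AllReplicasRecovered":
                self.state = "Recovered"
        elif self.state == "NotRecovering" and event == "AllReplicasRecovered":
            raise BadStateMachineEvent(event)

    def drain(self):
        while self.events:
            self.react(self.events.popleft())
        return self.state


def run(clear_flag_before_posting):
    pg = ToyPG(clear_flag_before_posting)
    pg.finish_recovery()
    pg.preempt()
    return pg.drain()
```

In this model, `run(True)` ends in the "Recovered" state because the stale DeferRecovery is dropped, while `run(False)` raises `BadStateMachineEvent`, mirroring steps 1-4 above.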

Fixes: https://tracker.ceph.com/issues/73314
Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
2 months agoosd: Allow multiple objects with same version in missing list. 68284/head
Alex Ainscow [Wed, 8 Apr 2026 10:49:58 +0000 (11:49 +0100)]
osd: Allow multiple objects with same version in missing list.

Most of the time, a single version in a PG can only correspond to a single object.

However, following a PG merge it is possible, even likely, that two objects will
have the same version. The PG Log works around this by discarding the log.

However, during backfill, it is possible for the missing list to be built with
these duplicate versions.

A recently added assert detected that this scenario was corrupting the reverse
missing list (rmissing). This behaviour has always existed, but was previously
unnoticed. It could cause some bugs and potentially loop-asserts on OSDs,
although in most cases it would go unnoticed.

Here we fix this properly, by converting rmissing to a multimap. This is wrapped
in some insert functions, which assert that the rmissing list does not end up
with duplicate entries.  The code is optimised for the case where there are no
duplicate versions.

Additionally, some of the old asserts have been rolled into the insert functions.
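
As a rough illustration of the idea (not the actual OSD code), a reverse index that tolerates several objects at one version can be sketched as a multimap with a guarded insert. `ToyMissing`, its method names, and the `(epoch, version)` tuples are hypothetical, and the duplicate check is read here as "the same (version, object) pair must not appear twice".

```python
from collections import defaultdict


class ToyMissing:
    def __init__(self):
        self.missing = {}                   # object id -> needed version
        self.rmissing = defaultdict(list)   # version -> object ids (multimap)

    def _rmissing_insert(self, version, oid):
        # Guarded insert: the same (version, object) pair must not appear twice.
        bucket = self.rmissing[version]
        assert oid not in bucket, "duplicate rmissing entry"
        bucket.append(oid)

    def add(self, oid, version):
        # Re-adding an object replaces its old reverse entry first.
        if oid in self.missing:
            self.remove(oid)
        self._rmissing_insert(version, oid)
        self.missing[oid] = version

    def remove(self, oid):
        version = self.missing.pop(oid)
        bucket = self.rmissing[version]
        bucket.remove(oid)
        if not bucket:
            del self.rmissing[version]


m = ToyMissing()
m.add("obj_a", (10, 5))
m.add("obj_b", (10, 5))   # same version after a merge: both entries are kept
m.remove("obj_a")         # removing one object leaves the other indexed
```

With a plain one-to-one map, the second `add` would silently overwrite the reverse entry for `obj_a`, which is the kind of corruption the commit message describes.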

Fixes: https://tracker.ceph.com/issues/75778
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agocephadm: fix HAProxy/RGW IPv6 failure (ip_nonlocal_bind) 67493/head
kginon [Tue, 24 Feb 2026 18:28:56 +0000 (20:28 +0200)]
cephadm: fix HAProxy/RGW IPv6 failure (ip_nonlocal_bind)

Fixes: https://tracker.ceph.com/issues/56660
Signed-off-by: Kobi Ginon <kginon@redhat.com>
2 months agomgr/dashboard: NVMeoF namespace create form should show subsystem selection first 68175/head
Sagar Gopale [Thu, 2 Apr 2026 08:06:37 +0000 (13:36 +0530)]
mgr/dashboard: NVMeoF namespace create form should show subsystem selection first

Fixes: https://tracker.ceph.com/issues/75846
Signed-off-by: Sagar Gopale <sagar.gopale@ibm.com>
2 months agoMerge pull request #68367 from guits/fix-orch-osd-add-raw
Guillaume Abrioux [Wed, 15 Apr 2026 07:05:04 +0000 (09:05 +0200)]
Merge pull request #68367 from guits/fix-orch-osd-add-raw

cephadm: wait for latest osd map after ceph-volume before OSD deploy

2 months agoMerge pull request #68336 from rhcs-dashboard/rm-golang-gh-prom
Nizamudeen A [Wed, 15 Apr 2026 06:31:49 +0000 (12:01 +0530)]
Merge pull request #68336 from rhcs-dashboard/rm-golang-gh-prom

ceph.spec.in: replace golang github prometheus with promtool binary path

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: John Mulligan
2 months agomgr/dashboard: fix namespace block size in namespace form 68377/head
Sagar Gopale [Wed, 15 Apr 2026 05:55:12 +0000 (11:25 +0530)]
mgr/dashboard: fix namespace block size in namespace form

Fixes: https://tracker.ceph.com/issues/76034
Signed-off-by: Sagar Gopale <sagar.gopale@ibm.com>
2 months agorgw: remove the now unused RGWAsyncLockSystemObj and RGWAsyncUnlockSystemObj 67962/head
Shilpa Jagannath [Mon, 23 Mar 2026 16:46:37 +0000 (16:46 +0000)]
rgw: remove the now unused RGWAsyncLockSystemObj and RGWAsyncUnlockSystemObj
classes.

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
2 months agorgw/multisite: use aio_operate for RGWSimpleRadosLockCR/UnlockCR
Shilpa Jagannath [Mon, 23 Mar 2026 16:38:34 +0000 (16:38 +0000)]
rgw/multisite: use aio_operate for RGWSimpleRadosLockCR/UnlockCR

RGWContinuousLeaseCR renews the sync lock every interval/2
by calling RGWSimpleRadosLockCR, which previously queued an
RGWAsyncLockSystemObj request via async_rados->queue().
At large scale, when the async thread pool is fully saturated,
the cr thread can block, stalling lock renewal for extended periods
of time until the lock eventually expires.

Fix this by allowing RGWSimpleRadosLockCR and RGWSimpleRadosUnlockCR
to use aio_operate without having to queue behind other async requests.

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
2 months agorgw: return additional checksum headers too 68364/head
Matt Benjamin [Tue, 14 Apr 2026 19:25:28 +0000 (15:25 -0400)]
rgw: return additional checksum headers too

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 months agoMerge pull request #68298 from tchaikov/wip-rgw-inotify-ctor
Casey Bodley [Tue, 14 Apr 2026 18:38:58 +0000 (14:38 -0400)]
Merge pull request #68298 from tchaikov/wip-rgw-inotify-ctor

rgw/posix: fix Inotify member initialization order race

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
2 months agoMerge pull request #68351 from joscollin/wip-TestCorruptedSubvolumes-fix
Venky Shankar [Tue, 14 Apr 2026 18:02:22 +0000 (23:32 +0530)]
Merge pull request #68351 from joscollin/wip-TestCorruptedSubvolumes-fix

qa: update yaml file for TestCorruptedSubvolumes

Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 months agoMerge PR #66294 into main
Patrick Donnelly [Tue, 14 Apr 2026 15:46:54 +0000 (11:46 -0400)]
Merge PR #66294 into main

* refs/pull/66294/head:
qa: enforce centos9 for test
qa: rename distro
qa/suites/fs/bugs: use centos9 for squid upgrade test
qa: remove unused variables
qa: use centos9 for fs suites using k-testing
qa: update fs suite to rocky10
qa: skip dashboard install due to dependency noise
qa: only setup nat rules during bridge creation
qa: correct wording of comment
qa: use nft instead of iptables
qa: use py3 builtin ipaddress module

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 months agoMerge pull request #62351 from vshankar/wip-revert-referent-inodes
Venky Shankar [Tue, 14 Apr 2026 15:35:25 +0000 (21:05 +0530)]
Merge pull request #62351 from vshankar/wip-revert-referent-inodes

mds: revert referent inode work

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
2 months agoqa/rgw: add PutACL backward compatibility test for account migration 66170/head
Krunal Chheda [Tue, 14 Apr 2026 15:18:29 +0000 (20:48 +0530)]
qa/rgw: add PutACL backward compatibility test for account migration

Test that put-bucket-acl and put-object-acl work both before and after
migrating a user to an account. After migration, the bucket/object ACL
owner is still the old user id, but the requester authenticates as the
account id.

Signed-off-by: Krunal Chheda <kchheda3@bloomberg.net>
2 months agorgw/account: Update the docs to add note about supporting backward compatibility...
kchheda3 [Mon, 10 Nov 2025 22:33:41 +0000 (17:33 -0500)]
rgw/account: Update the docs to add note about supporting backward compatibility for s3:PutAcls calls for users migrated to account.

Signed-off-by: kchheda3 <kchheda3@bloomberg.net>