git.apps.os.sepia.ceph.com Git - ceph-ci.git/log
ceph-ci.git
4 days agoMerge branch 'lrc_fix' of https://github.com/aainscow/ceph into wip-bharath2-testing... wip-bharath2-testing-2025-12-01-1510
skanta [Mon, 1 Dec 2025 09:40:36 +0000 (15:10 +0530)]
Merge branch 'lrc_fix' of https://github.com/aainscow/ceph into wip-bharath2-testing-2025-12-01-1510

4 days agoMerge branch 'wip-update-cluster-log-warnings' of https://github.com/ljflores/ceph...
skanta [Mon, 1 Dec 2025 09:40:34 +0000 (15:10 +0530)]
Merge branch 'wip-update-cluster-log-warnings' of https://github.com/ljflores/ceph into wip-bharath2-testing-2025-12-01-1510

4 days agoMerge branch 'issue72879' of https://github.com/bill-scales/ceph into wip-bharath2...
skanta [Mon, 1 Dec 2025 09:40:32 +0000 (15:10 +0530)]
Merge branch 'issue72879' of https://github.com/bill-scales/ceph into wip-bharath2-testing-2025-12-01-1510

4 days agoMerge branch 'wip-nitzan-test-encode-update-versions' of https://github.com/NitzanMor...
skanta [Mon, 1 Dec 2025 09:40:30 +0000 (15:10 +0530)]
Merge branch 'wip-nitzan-test-encode-update-versions' of https://github.com/NitzanMordhai/ceph into wip-bharath2-testing-2025-12-01-1510

4 days agoMerge branch 'wip-nitzan-objecter-osdmap-request-override' of https://github.com...
skanta [Mon, 1 Dec 2025 09:40:27 +0000 (15:10 +0530)]
Merge branch 'wip-nitzan-objecter-osdmap-request-override' of https://github.com/NitzanMordhai/ceph into wip-bharath2-testing-2025-12-01-1510

4 days agoMerge pull request #66436 from rhcs-dashboard/add-sagar--to-mailmap-githubmap-organiz...
afreen23 [Mon, 1 Dec 2025 09:29:57 +0000 (14:59 +0530)]
Merge pull request #66436 from rhcs-dashboard/add-sagar--to-mailmap-githubmap-organizationmap

add Sagar Gopale to githubmap mailmap organizationmap

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Abhishek Desai <abhishek.desai1@ibm.com>
4 days agoMerge pull request #65433 from mohit84/repeer_on_acting
SrinivasaBharathKanta [Mon, 1 Dec 2025 09:27:14 +0000 (14:57 +0530)]
Merge pull request #65433 from mohit84/repeer_on_acting

test: repeer_on_down_acting_member_coming_back is continuously failing

4 days agoMerge pull request #66417 from rhcs-dashboard/fix-server-sort
Nizamudeen A [Mon, 1 Dec 2025 08:27:19 +0000 (13:57 +0530)]
Merge pull request #66417 from rhcs-dashboard/fix-server-sort

mgr/dashboard: fix server side table sort

Reviewed-by: Afreen Misbah <afreen@ibm.com>
4 days agoMerge pull request #66375 from ronen-fr/wip-rf-eofread
Ronen Friedman [Mon, 1 Dec 2025 07:31:58 +0000 (09:31 +0200)]
Merge pull request #66375 from ronen-fr/wip-rf-eofread

osd/scrub: do not attempt to read past the end of an object

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>
4 days agoMerge pull request #66330 from rhcs-dashboard/edit-realm
Nizamudeen A [Mon, 1 Dec 2025 07:27:16 +0000 (12:57 +0530)]
Merge pull request #66330 from rhcs-dashboard/edit-realm

mgr/dashboard: edit realm modal not working

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Afreen Misbah <afreen@ibm.com>
4 days agoMerge pull request #66321 from rhcs-dashboard/73832-navigation-e2e-failure
Nizamudeen A [Mon, 1 Dec 2025 06:44:44 +0000 (12:14 +0530)]
Merge pull request #66321 from rhcs-dashboard/73832-navigation-e2e-failure

mgr/dashboard: Fixes navigation e2e test

Reviewed-by: Nizamudeen A <nia@redhat.com>
5 days agoMerge pull request #66425 from baum/rbd_with_crc32c_nvmeof_service_spec
baum [Sun, 30 Nov 2025 09:22:54 +0000 (11:22 +0200)]
Merge pull request #66425 from baum/rbd_with_crc32c_nvmeof_service_spec

mgr/cephadm: add rbd_with_crc32c parameter to nvmeof service spec

5 days agoqa/standalone: osd-scrub-dump.sh: additional tests wip-rf-eofread
Ronen Friedman [Sat, 22 Nov 2025 14:48:46 +0000 (08:48 -0600)]
qa/standalone: osd-scrub-dump.sh: additional tests

Also add a stride override parameter to the existing
large-object scrub test, and improve the output format
of the test results.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
5 days agoosd/scrub: extract the 'read object data' functionality
Ronen Friedman [Fri, 21 Nov 2025 13:39:29 +0000 (07:39 -0600)]
osd/scrub: extract the 'read object data' functionality

from be_deep_scrub() into a separate function,
ReplicatedBackend::be_deep_scrub_read().

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
5 days agoosd/scrub: do not attempt to read past the end of an object
Ronen Friedman [Thu, 20 Nov 2025 13:54:20 +0000 (07:54 -0600)]
osd/scrub: do not attempt to read past the end of an object

When performing deep scrubs, the scrubber reads object data
in strides. Existing code uses a short read to detect the end
of the object; if the object size is a multiple of the
stride, an extra read is performed, which returns 0 bytes.

The proposed change avoids such extra read attempts
by using our knowledge of the object size.

Also - some minor code cleanups in the relevant function.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
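
As an illustration of the commit message above, here is a minimal Python sketch, not the Ceph implementation. The function names are hypothetical and the object is modelled only by its size. It counts the reads issued by each approach to show where the extra zero-byte read comes from.

```python
def reads_with_short_read_detection(object_size, stride):
    """Count reads when the end of the object is detected by a short read."""
    reads = 0
    offset = 0
    while True:
        # Bytes a read at `offset` would return (0 when past the end).
        got = max(0, min(stride, object_size - offset))
        reads += 1
        offset += got
        if got < stride:  # a short (possibly empty) read marks the end
            return reads


def reads_with_known_size(object_size, stride):
    """Count reads when the loop stops once `offset` reaches the known size."""
    reads = 0
    offset = 0
    while offset < object_size:
        offset += min(stride, object_size - offset)
        reads += 1
    return reads
```

With a size that is an exact multiple of the stride (say 8 units read in strides of 4), the first function issues one more read than the second. For sizes that are not a multiple of the stride and leave a final partial read, both issue the same number.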
5 days agoosd: provide fmtlib formatter for ScrubMapBuilder
Ronen Friedman [Sat, 22 Nov 2025 14:55:58 +0000 (08:55 -0600)]
osd: provide fmtlib formatter for ScrubMapBuilder

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
5 days agoMerge pull request #66230 from ronen-fr/wip-rf-larger-stride
Ronen Friedman [Sun, 30 Nov 2025 06:59:25 +0000 (08:59 +0200)]
Merge pull request #66230 from ronen-fr/wip-rf-larger-stride

osd/scrub: increasing the default data-read stride

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>
Reviewed-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 days agomgr/cephadm: add rbd_with_crc32c parameter to nvmeof service spec
Alexander Indenbaum [Wed, 26 Nov 2025 12:28:51 +0000 (14:28 +0200)]
mgr/cephadm: add rbd_with_crc32c parameter to nvmeof service spec

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
7 days agoosd: Perform shard look up correctly in partial EC writes
Alex Ainscow [Fri, 28 Nov 2025 14:33:13 +0000 (14:33 +0000)]
osd: Perform shard look up correctly in partial EC writes

Plugins are permitted to provide a mapping to change the order in which OSDs
are used. In practice only LRC does this and it is not currently enabled
with optimisations, so this is a theoretical bug.

The bug here was that the "first" shard was assumed to be shard_id_t(0).  However,
this is not true for LRC.

Fixes: https://tracker.ceph.com/issues/74016
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
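
The assumption being corrected can be sketched as follows. This is a hedged illustration only: `first_shard` and the list-based `chunk_mapping` layout (entry i holds the shard id that stores raw chunk i) are assumptions made for the example, not the Ceph data structures.

```python
def first_shard(chunk_mapping):
    """Return the shard id that holds raw chunk 0.

    `chunk_mapping[i]` is assumed to be the shard id storing raw chunk i.
    An empty mapping is treated as the identity order, so the answer is 0.
    """
    return chunk_mapping[0] if chunk_mapping else 0
```

With no plugin-provided mapping the result is 0, which is what the old code assumed unconditionally. With a hypothetical remapping such as `[1, 2, 0, 3]` the first shard is 1.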
7 days agoMerge pull request #64248 from zhuwei127/fix-doublefree
Kefu Chai [Fri, 28 Nov 2025 13:35:46 +0000 (21:35 +0800)]
Merge pull request #64248 from zhuwei127/fix-doublefree

examples/librados: fix memory pointed to by 'rs' is freed twice.

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
7 days agoMerge pull request #65771 from aainscow/ec_direct_reads_pr_1
Alex Ainscow [Thu, 27 Nov 2025 23:17:37 +0000 (23:17 +0000)]
Merge pull request #65771 from aainscow/ec_direct_reads_pr_1

EC Direct Reads: First PR, background work

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>
7 days agoMerge pull request #66377 from baum/rbd_aio_write_with_crc32c_initial_fix
Ilya Dryomov [Thu, 27 Nov 2025 22:58:38 +0000 (23:58 +0100)]
Merge pull request #66377 from baum/rbd_aio_write_with_crc32c_initial_fix

librbd: rbd_aio_write_with_crc32c store CRC32C with initial value -1 to match msgr2 validation

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
8 days agoqa: Reduce number of osd threads when using compression
Bill Scales [Fri, 21 Nov 2025 10:06:22 +0000 (10:06 +0000)]
qa: Reduce number of osd threads when using compression

Smithi nodes used by teuthology tests have 8 CPU cores and typically run
4 OSD processes. When bluestore software compression is enabled, the size
of the OSD thread pool needs to be reduced to 2 threads per OSD because
these threads can easily use 100% of a core. This avoids excessive
context switching, which leads to OSD threads timing out, the OSD
dropping heartbeat pings, and the monitor temporarily marking it down.
In extreme cases this can lead to PGs getting stuck in repeated loops
of peering until the teuthology test times out.

Context switches happen opportunistically at the end of system calls,
so functions with lots of logging are some of the worst affected.

Fixes: https://tracker.ceph.com/issues/72879
Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
8 days agoosd: Restrict logging in MissingLoc::add_source_info
Bill Scales [Fri, 21 Nov 2025 10:38:44 +0000 (10:38 +0000)]
osd: Restrict logging in MissingLoc::add_source_info

add_source_info can generate an excessive amount of logging
if a PG has thousands of missing objects. When a system is
under load and threads are repeatedly context switching this
can lead to timeouts (tests showed this function taking up
to 10 seconds to execute with 99% of that time being in
logging calls where the thread was being pre-empted).
Stopping logging after the function has been running for
more than 0.5 seconds strikes a balance between providing
sufficient information to debug problems and providing
more stability when a system is heavily loaded.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
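
The time-budgeted logging described above might look roughly like the following Python sketch. It is an illustration under assumptions: the function signature, the `log` callback and the injectable `clock` are all hypothetical and do not mirror MissingLoc::add_source_info.

```python
import time


def add_source_info(missing, log, budget_seconds=0.5, clock=time.monotonic):
    """Process every missing object, but stop logging once the budget is spent.

    Returns the number of per-object log lines that were emitted.
    """
    start = clock()
    logged = 0
    logging_enabled = True
    for oid in missing:
        if logging_enabled and clock() - start > budget_seconds:
            log("suppressing further per-object logging")
            logging_enabled = False
        if logging_enabled:
            log(f"found {oid}")
            logged += 1
        # The per-object work itself would continue here regardless.
    return logged
```

The work loop always runs to completion; only the logging is cut off, so a heavily loaded system spends less time in logging calls.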
8 days agoosd: Increase log level for listing missing list
Bill Scales [Fri, 21 Nov 2025 10:25:48 +0000 (10:25 +0000)]
osd: Increase log level for listing missing list

Logging the entire contents of a missing list can generate a
1M character log line when there are 8000 missing objects in a
PG. Other places in the code logging the missing list use debug
level 25 which is not enabled by default in teuthology tests.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
8 days agoosd: reset_tp_timeout should reset timeout for all shards
Bill Scales [Mon, 24 Nov 2025 09:18:21 +0000 (09:18 +0000)]
osd: reset_tp_timeout should reset timeout for all shards

ShardedThreadPools are only used by the classic OSD process
which can have more than one thread for the same shard. Each
thread has a heartbeat timeout used to detect stalled threads.
Some code that is known to take a long time makes calls to
reset_tp_timeout to reset this timeout. However for sharded
pools this can be ineffective because it is common for threads
for the same shard to use the same locks (e.g. PG Lock) and
therefore if thread A is taking a long time and resetting
its timeout while holding a lock, thread B for the same shard
is liable to be waiting for the same lock, will not be
resetting its timeout and can be timed out.

Debug for issue 72879 showed heartbeat timeouts occurring at
the same time for both shards. An attempt to fix the problem
by calling reset_tp_timeout for the slow thread still showed
the other threads for the shard timing out waiting for the PG
lock that was held by the slow thread. In most places in the OSD
code where reset_tp_timeout is called, the thread is holding the
PG lock.

This commit moves the concept of shard_index from OSD into
ShardedThreadPool and modifies reset_tp_timeout so that it resets
the timeout for all threads for the same shard.

Some code calls reset_tp_timeout from inside loops that can take
a long time, without consideration for how long the thread has
actually been running. There is a risk that this type of
call could repeatedly reset the timeout for another shard which
is genuinely stuck and hence defeat the heartbeat checks. To
prevent this, reset_tp_timeout is modified to be a NOP unless
the thread has been processing the current work item for more
than 0.5 seconds. Therefore threads have to be slow but making
forward progress to be able to reset the timeout.

Fixes: https://tracker.ceph.com/issues/72879
Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
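
The two rules in this commit message (reset every thread of the same shard, and do nothing unless the caller has been busy for more than 0.5 seconds) can be sketched in Python as below. This is a simplified model, not the ShardedThreadPool code: the class, its fields and the 15-second grace value are assumptions chosen for the example.

```python
import time


class ShardedPoolSketch:
    """Toy model of per-thread heartbeat deadlines grouped by shard."""

    def __init__(self, grace=15.0, min_runtime=0.5, clock=time.monotonic):
        self.grace = grace              # hypothetical heartbeat grace period
        self.min_runtime = min_runtime  # 0.5 s threshold from the commit text
        self.clock = clock
        self.threads = {}               # thread id -> shard/started/deadline

    def start_work_item(self, thread_id, shard_index):
        now = self.clock()
        self.threads[thread_id] = {
            "shard": shard_index,
            "started": now,
            "deadline": now + self.grace,
        }

    def reset_tp_timeout(self, thread_id):
        """Extend deadlines for all threads of the caller's shard.

        Returns False (a NOP) when the caller has not yet been running
        its current work item for more than `min_runtime` seconds.
        """
        now = self.clock()
        me = self.threads[thread_id]
        if now - me["started"] <= self.min_runtime:
            return False
        for info in self.threads.values():
            if info["shard"] == me["shard"]:
                info["deadline"] = now + self.grace
        return True
```

Threads on other shards keep their original deadlines, so a thread that is stuck on a different shard can still be detected by the heartbeat check.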
8 days agoMerge pull request #65739 from tchaikov/rgw-gap-list-manpage
Kefu Chai [Thu, 27 Nov 2025 04:12:08 +0000 (12:12 +0800)]
Merge pull request #65739 from tchaikov/rgw-gap-list-manpage

debian: include rgw-gap-list manpage and rgw-policy-check in ceph-common

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@ibm.com>
9 days agoMerge pull request #66166 from cbodley/wip-cmake-breakpad-arch
Casey Bodley [Wed, 26 Nov 2025 18:37:59 +0000 (13:37 -0500)]
Merge pull request #66166 from cbodley/wip-cmake-breakpad-arch

cmake: disable WITH_BREAKPAD on power arch

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
9 days agoMerge pull request #66416 from bluikko/doc-fscrypt-improvements-cephfs
bluikko [Wed, 26 Nov 2025 13:51:41 +0000 (20:51 +0700)]
Merge pull request #66416 from bluikko/doc-fscrypt-improvements-cephfs

doc/cephfs: Small improvements in fscrypt.rst

9 days agoMerge pull request #66420 from bluikko/doc-sphinx-warnings-202511
bluikko [Wed, 26 Nov 2025 13:51:21 +0000 (20:51 +0700)]
Merge pull request #66420 from bluikko/doc-sphinx-warnings-202511

doc: Fix Sphinx warnings

9 days agoMerge pull request #66421 from bluikko/doc-sphinx-warning-tentacle-202511
bluikko [Wed, 26 Nov 2025 13:50:50 +0000 (20:50 +0700)]
Merge pull request #66421 from bluikko/doc-sphinx-warning-tentacle-202511

doc/releases: Fix Sphinx warning in tentacle.rst

9 days agoMerge pull request #66423 from bluikko/doc-sphinx-warning-theme-202511
bluikko [Wed, 26 Nov 2025 13:50:36 +0000 (20:50 +0700)]
Merge pull request #66423 from bluikko/doc-sphinx-warning-theme-202511

doc: Fix Sphinx warning about theme option

9 days agocommon/options: document osd_deep_scrub_stride wip-rf-larger-stride
Ronen Friedman [Sun, 23 Nov 2025 14:02:04 +0000 (16:02 +0200)]
common/options: document osd_deep_scrub_stride

default value change (from 0.5MB to 4MB)

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
9 days agoosdc: Add SplitOp capability to Objecter
Alex Ainscow [Tue, 14 Oct 2025 08:24:56 +0000 (09:24 +0100)]
osdc: Add SplitOp capability to Objecter

This will provide the ability for Objecter to split up
certain ops and distribute them to the OSDs directly if
that provides a performance advantage.

This is experimental code and is switched off unless the
magic pool flags are enabled. These magic pool flags were
pushed in an earlier commit in the same PR.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosdc: Fix minor typo
Alex Ainscow [Mon, 13 Oct 2025 11:50:11 +0000 (12:50 +0100)]
osdc: Fix minor typo

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosdc: Interface to allow split reads to copy op from client op to split op
Alex Ainscow [Fri, 3 Oct 2025 14:34:55 +0000 (15:34 +0100)]
osdc: Interface to allow split reads to copy op from client op to split op

When splitting ops, certain additional sub-ops (e.g. get xattr) can simply be
passed through to the child op.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosdc: Add stub for ability to force an op to always go to a particular shard
Alex Ainscow [Fri, 3 Oct 2025 14:32:22 +0000 (15:32 +0100)]
osdc: Add stub for ability to force an op to always go to a particular shard

This will eventually be used by SplitIo to direct ops to the correct OSD.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosdc: Split handle_osd_op_reply into two functions
Alex Ainscow [Fri, 3 Oct 2025 14:15:29 +0000 (15:15 +0100)]
osdc: Split handle_osd_op_reply into two functions

The functionality is not altered by this commit.

In the future we want to post-process split-ios after
recombining the read data.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosdc: Remove unused con parameter from Objecter::_calc_target()
Alex Ainscow [Fri, 3 Oct 2025 14:11:00 +0000 (15:11 +0100)]
osdc: Remove unused con parameter from Objecter::_calc_target()

This parameter is not used by the _calc_target code.  It is being
removed just to clean up the code, as we are making some changes
to _calc_target in later stages of the split io PR.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosdc: Interface to submit IO with ASIO Post.
Alex Ainscow [Fri, 3 Oct 2025 13:55:56 +0000 (14:55 +0100)]
osdc: Interface to submit IO with ASIO Post.

For direct read failures, the locking is such that we cannot
immediately send a new IO without deadlocking. This new interface
allows an op to be sent as an asio post.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosd: Call clear_repop_obc for EC as well as Replica.
Alex Ainscow [Fri, 3 Oct 2025 13:51:23 +0000 (14:51 +0100)]
osd: Call clear_repop_obc for EC as well as Replica.

This function is necessary for balanced reads and as such is required for EC too.

Rename the function to make sense, given this change of purpose, but the
functionality does not change.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosd: Implement sync reads and sparse reads for EC for direct reads
Alex Ainscow [Fri, 3 Oct 2025 13:39:03 +0000 (14:39 +0100)]
osd: Implement sync reads and sparse reads for EC for direct reads

Sparse reads for EC are simple to implement, as the code is essentially
identical to that of replica, with some address translation.

When doing a direct read in EC, only a single OSD is involved.
As such we can do the more performant sync read, rather than
an async read.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosd: Add extent_to_shard_extent interface to PGBackend.
Alex Ainscow [Fri, 3 Oct 2025 13:24:49 +0000 (14:24 +0100)]
osd: Add extent_to_shard_extent interface to PGBackend.

This allows a backend to expose how an object offset/length translates to
an offset/length on a particular shard.

For Replica, this is trivial.

For EC, this means looking up the start and end offsets, then translating
this to shard address space.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
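
To make the translation concrete, here is a hedged Python sketch. It assumes a plain round-robin striping of fixed-size chunks across `k` data shards and ignores parity shards and plugin chunk remapping. The helper names are hypothetical and the real PGBackend interface may differ.

```python
def ro_offset_to_shard_offset(ro_offset, shard, k, chunk_size):
    """Bytes of `shard` that lie before object offset `ro_offset`."""
    stripe_width = k * chunk_size
    full_stripes, rem = divmod(ro_offset, stripe_width)
    # Portion of this shard's chunk covered within the partial stripe.
    partial = min(max(rem - shard * chunk_size, 0), chunk_size)
    return full_stripes * chunk_size + partial


def ec_extent_to_shard_extent(offset, length, shard, k, chunk_size):
    """Translate an object extent to (offset, length) on one data shard."""
    start = ro_offset_to_shard_offset(offset, shard, k, chunk_size)
    end = ro_offset_to_shard_offset(offset + length, shard, k, chunk_size)
    return start, end - start


def replica_extent_to_shard_extent(offset, length):
    """For a replicated pool the shard holds the whole object unchanged."""
    return offset, length
```

For example, with `k=2` and `chunk_size=4`, object bytes 2..9 touch bytes 2..5 of shard 0 and bytes 0..3 of shard 1.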
9 days agoosd: Set the from shard in the EC read reply.
Alex Ainscow [Fri, 3 Oct 2025 13:17:48 +0000 (14:17 +0100)]
osd: Set the from shard in the EC read reply.

This was not necessary prior to direct reads, but is essential when the
client needs to know which shard the read came from.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosd: Generalise can_serve_replica_read for consumption by EC.
Alex Ainscow [Fri, 3 Oct 2025 13:15:32 +0000 (14:15 +0100)]
osd: Generalise can_serve_replica_read for consumption by EC.

The can_serve_replica_read() function is called by replica to determine whether there are
any uncommitted writes.  If such writes exist, then the system will reject the IO to avoid
the risk of reading data from a write which may yet be rolled back.

The same code is going to be useful for EC direct reads.

The string_view code is not expensive.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosd: Create EC Direct Read flag and pass through to EC.
Alex Ainscow [Fri, 3 Oct 2025 13:00:10 +0000 (14:00 +0100)]
osd: Create EC Direct Read flag and pass through to EC.

This is in preparation for supporting sparse and sync reads in EC.
Such ops will only be supported for "balance reads".

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosd: Replace unused EC offset translation function with useful one.
Alex Ainscow [Fri, 3 Oct 2025 12:53:33 +0000 (13:53 +0100)]
osd: Replace unused EC offset translation function with useful one.

The old chunk_aligned_shard_offset_to_ro_offset was not only unused, it
didn't actually have the correct logic. We replace it here with a similar
but more useful function that will be used in sparse reads for EC.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
9 days agoosd: Introduce pool flag for "split IO" and Plugin flag for "direct read"
Alex Ainscow [Fri, 3 Oct 2025 12:49:58 +0000 (13:49 +0100)]
osd: Introduce pool flag for "split IO" and Plugin flag for "direct read"

These flags will currently behave as follows:

1. The pool flag is never set, unless by a user with the osd_pool_default_flags
   config option.
2. The pool flag will be removed for EC pools where the plugin does not support
   direct reads.
3. Replica pools will never remove the flag.

The intention is to eventually invert this logic and allow split IOs upon
upgrade to Umbrella in this same function.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
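
The three rules listed above can be expressed as a small decision function. This Python sketch only illustrates the described behaviour: the function and parameter names are hypothetical and it does not reflect how the flags are actually stored or checked in the OSD map code.

```python
def split_io_flag_for_pool(default_flags_has_split_io,
                           is_erasure,
                           plugin_supports_direct_reads):
    """Return whether the pool ends up with the "split IO" flag set."""
    # Rule 1: the flag is only ever set via the user-provided default flags.
    flag = default_flags_has_split_io
    # Rule 2: EC pools lose the flag when the plugin lacks direct reads.
    if is_erasure and not plugin_supports_direct_reads:
        flag = False
    # Rule 3: replicated pools never have the flag removed.
    return flag
```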
9 days agoadd Sagar Gopale to githubmap mailmap organizationmap
Sagar Gopale [Wed, 26 Nov 2025 11:04:50 +0000 (16:34 +0530)]
add Sagar Gopale to githubmap mailmap organizationmap
Signed-off-by: Sagar Gopale <sagar.gopale@ibm.com>
9 days agoMerge pull request #66340 from imran-imtiaz/dashboard
Imran Imtiaz [Wed, 26 Nov 2025 09:33:45 +0000 (09:33 +0000)]
Merge pull request #66340 from imran-imtiaz/dashboard

mgr/dashboard: add GET API endpoint for consistency groups

9 days agoObjecter: respect higher epoch subscription in tick
Nitzan Mordechai [Tue, 18 Nov 2025 09:37:48 +0000 (09:37 +0000)]
Objecter: respect higher epoch subscription in tick

The OSD and Objecter share the same MonClient. During preboot, a potential
race condition exists where the OSD subscribes to osdmap epoch X, while
the Objecter subscribes to epoch X - 1.

The Objecter's subscription overrides the OSD's subscription. Consequently,
the monitor ignores the request (as it believes the OSD already has the
older map), causing the OSD to hang during preboot.

To fix this, check if a higher epoch is already subscribed before calling
_maybe_request_map during Objecter::tick. If a higher epoch is found,
maintain the existing subscription.

Fixes: https://tracker.ceph.com/issues/71931
Signed-off-by: Nitzan Mordechai <nmordech@ibm.com>
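
The check described in the fix can be sketched as follows. This is a minimal Python model with hypothetical names, not the Objecter or MonClient API. The shared subscription is represented by a single epoch value, or `None` when nothing is subscribed.

```python
def epoch_to_subscribe(current_subscription, objecter_wanted):
    """Pick the osdmap epoch to remain subscribed to during a tick.

    If a higher epoch is already subscribed (for example by the OSD during
    preboot), keep it rather than overriding it with a lower one.
    """
    if current_subscription is not None and current_subscription > objecter_wanted:
        return current_subscription
    return objecter_wanted
```

With the OSD subscribed to epoch X and the Objecter wanting X - 1, the subscription stays at X, so the OSD's request is no longer overridden by the lower epoch.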
9 days agodoc: Fix Sphinx warning about theme option
Ville Ojamo [Wed, 26 Nov 2025 08:22:17 +0000 (15:22 +0700)]
doc: Fix Sphinx warning about theme option

The Sphinx theme "sphinx_rtd_theme" dropped support for "display_version"
theme option in version 3 (currently used: 3.0.2).

Because the "ceph" theme inherits that theme, remove all references to
"display_version" from it.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
9 days agolibrbd: store CRC32C with initial value -1 to match msgr2 validation wip-baum-20251126-01
Alexander Indenbaum [Sun, 23 Nov 2025 12:21:39 +0000 (14:21 +0200)]
librbd: store CRC32C with initial value -1 to match msgr2 validation

Fix runtime error, using test command:
   sudo dd if=/dev/zero bs=32k of=/dev/nvme0n1 count=1

The error log:
   2025-11-23T11:24:10.512+0000 7f30f4ec0640  1 --2- [v2:192.168.13.2:6802/3444906816,v1:192.168.13.2:6803/3444906816] >> 192.168.13.3:0/3916714748 conn(0x527d400 0x728f700 crc :-1 s=THROTTLE_DONE pgs=2038703 gs=2038723 cs=0 l=1 c_cookie=0 s_cookie=0 reconnecting=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_read_frame_epilogue_main bad segment crc calculated=1136411986 expected=4294967295

Ceph msgr2 validation (ceph/src/msg/async/frames_v2.cc:47):
   uint32_t crc = segment_bl.crc32c(-1);  // Uses initial value -1

Co-authored-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
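
To show why the initial value matters, here is a small bitwise CRC32C in Python. It is a reference-style sketch rather than the Ceph or librbd code. The function name is hypothetical, and the absence of a final XOR is an assumption of this sketch, chosen so that the seed is the only varying input.

```python
def crc32c_raw(data, crc):
    """Bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78).

    `crc` is the initial value; no final XOR is applied in this sketch.
    """
    crc &= 0xFFFFFFFF  # lets callers pass -1 for 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc
```

The same payload yields different values under different seeds, so a CRC stored with one initial value cannot match a validator that uses another. An all-zero buffer, like the one written by the `dd` test above, makes this easy to see: with an initial value of 0 the result is 0, while with -1 it is not.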
9 days agodoc/release: Fix Sphinx warning in tentacle.rst
Ville Ojamo [Wed, 26 Nov 2025 07:30:36 +0000 (14:30 +0700)]
doc/release: Fix Sphinx warning in tentacle.rst

Add an empty line between blocks to fix a warning:

/home/docs/checkouts/readthedocs.org/user_builds/ceph/checkouts/66416/doc/releases/tentacle.rst:97: ERROR: Unexpected indentation.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
9 days agodoc: Fix Sphinx warnings
Ville Ojamo [Wed, 26 Nov 2025 07:24:39 +0000 (14:24 +0700)]
doc: Fix Sphinx warnings

Fix section title underline lengths in dev/cephfs-fscrypt.rst and
radosgw/adminops.rst.

Use "figure" keyword instead of "image" and use the caption feature in
dev/cephfs-fscrypt.rst.

Remove circular toc reference in dev/crimson/index.rst.

Add an empty line after block in
rados/troubleshooting/troubleshooting-pg.rst.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
9 days agodoc/cephfs: Small improvements in fscrypt.rst
Ville Ojamo [Wed, 26 Nov 2025 05:33:19 +0000 (12:33 +0700)]
doc/cephfs: Small improvements in fscrypt.rst

Fix Sphinx warnings about section title underline lengths.
Use title case in section titles.

Change Unicode quotation marks to ASCII.

Use ordered list for lines that were supposedly intended to be a list.

Use double backticks for literals.
Use image caption formatting.

Remove unnecessary comma and other small language improvements.

Capitalize MDS, OSD, etc.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
9 days agomgr/dashboard: fix server side table sort
Nizamudeen A [Wed, 26 Nov 2025 06:20:40 +0000 (11:50 +0530)]
mgr/dashboard: fix server side table sort

Show a loading screen while a server-side sort is being performed,
since the sort is a little slow.

The slowness is more visible in bigger environments. In a test env, if
you sort too many times in a short interval, you start to see some
inconsistencies. This only applies to tables like OSDs or hosts where
server-side rendering is enabled.

Fixes: https://tracker.ceph.com/issues/73994
Signed-off-by: Nizamudeen A <nia@redhat.com>
10 days agoMerge pull request #66333 from shraddhaag/wip-shraddhaag-increase-reactors
Shraddha Agrawal [Tue, 25 Nov 2025 16:05:39 +0000 (21:35 +0530)]
Merge pull request #66333 from shraddhaag/wip-shraddhaag-increase-reactors

qa/clusters/crimson: increase reactors count

10 days agocmake: disable WITH_BREAKPAD on power arch wip-cmake-breakpad-arch
Casey Bodley [Fri, 7 Nov 2025 14:22:01 +0000 (09:22 -0500)]
cmake: disable WITH_BREAKPAD on power arch

Reported-by: T K Chandra Hasan <t.k.chandra.hasan@ibm.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 days agocmake: use cmake_dependent_option for WITH_BREAKPAD
Casey Bodley [Fri, 7 Nov 2025 14:17:47 +0000 (09:17 -0500)]
cmake: use cmake_dependent_option for WITH_BREAKPAD

a bit simpler without the WITH_BREAKPAD_DEFAULT part, and causes the
WITH_BREAKPAD option to be hidden from cmake-gui on WIN32

Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 days agoMerge pull request #66336 from Matan-B/wip-matanb-crimson-snapmapper-osddriver
Matan Breizman [Tue, 25 Nov 2025 13:59:03 +0000 (15:59 +0200)]
Merge pull request #66336 from Matan-B/wip-matanb-crimson-snapmapper-osddriver

osd/SnapMapper: fix Crimson logs

Reviewed-by: Aishwarya Mathuria <amathuri@redhat.com>
10 days agoMerge pull request #66332 from rhcs-dashboard/73854-CephFS-Authorize-modal-Update...
afreen23 [Tue, 25 Nov 2025 11:00:32 +0000 (16:30 +0530)]
Merge pull request #66332 from rhcs-dashboard/73854-CephFS-Authorize-modal-Update-issues

mgr/dashboard : fix - CephFS Authorize Modal Update issue

Reviewed-by: Dnyaneshwari Talwekar <dtalweka@redhat.com>
10 days agoqa: update release-checklists with encoder suites updates
Nitzan Mordechai [Tue, 25 Nov 2025 09:30:00 +0000 (09:30 +0000)]
qa: update release-checklists with encoder suites updates

Fixes: https://tracker.ceph.com/issues/73921
Signed-off-by: Nitzan Mordechai <nmordech@ibm.com>
10 days agoqa/suite/rados/encoder: update release N-2 for ceph-dencoder tests
Nitzan Mordechai [Thu, 20 Nov 2025 13:39:59 +0000 (13:39 +0000)]
qa/suite/rados/encoder: update release N-2 for ceph-dencoder tests

Removing quincy release and adding tentacle for encoder suite.

Fixes: https://tracker.ceph.com/issues/73921
Signed-off-by: Nitzan Mordechai <nmordech@ibm.com>
10 days agoMerge pull request #66382 from bluikko/doc-mgmt-gateway-improvements-cephadm
bluikko [Tue, 25 Nov 2025 05:22:05 +0000 (12:22 +0700)]
Merge pull request #66382 from bluikko/doc-mgmt-gateway-improvements-cephadm

doc/cephadm: Fix command plus improvements in services/mgmt-gateway.rst

11 days agoqa/suites/upgrade: add "OBJECT_UNFOUND" to ignorelists
Laura Flores [Mon, 24 Nov 2025 17:31:05 +0000 (11:31 -0600)]
qa/suites/upgrade: add "OBJECT_UNFOUND" to ignorelists

The thrashing in the upgrade tests has been configured to be very aggressive;
the tests are permitted to stop up to 4 of the 8 OSDs, so these kinds of
health warnings are expected.

Fixes: https://tracker.ceph.com/issues/72424
Signed-off-by: Laura Flores <lflores@ibm.com>
11 days agoqa/suites/upgrade: ignore "osd down" cluster log variations
Laura Flores [Mon, 24 Nov 2025 17:26:57 +0000 (11:26 -0600)]
qa/suites/upgrade: ignore "osd down" cluster log variations

These warnings are expected during upgrade tests. This commit
updates the list with variations of this warning that weren't
covered.

Fixes: https://tracker.ceph.com/issues/69795
Signed-off-by: Laura Flores <lflores@ibm.com>
11 days agoqa/suites/rados/thrash-old-clients: ignore warnings about peering PGs
Laura Flores [Mon, 24 Nov 2025 17:22:48 +0000 (11:22 -0600)]
qa/suites/rados/thrash-old-clients: ignore warnings about peering PGs

These warnings are expected during thrashing tasks.

https://tracker.ceph.com/issues/73360
Signed-off-by: Laura Flores <lflores@ibm.com>
11 days agoqa/suites/upgrade: add expected filesystem warnings to ignorelist
Laura Flores [Mon, 24 Nov 2025 17:18:08 +0000 (11:18 -0600)]
qa/suites/upgrade: add expected filesystem warnings to ignorelist

These warnings appear after we run ‘fs rm’, which seems expected.

Fixes: https://tracker.ceph.com/issues/73557
Signed-off-by: Laura Flores <lflores@ibm.com>
11 days agoMerge pull request #66006 from afreen23/carbonize-chnage-password
afreen23 [Mon, 24 Nov 2025 12:22:40 +0000 (17:52 +0530)]
Merge pull request #66006 from afreen23/carbonize-chnage-password

mgr/dashboard: Carbonize the Change Password Form

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Dnyaneshwari Talwekar <dtalweka@redhat.com>
11 days agoMerge pull request #66326 from afreen23/fixes-mixins
afreen23 [Mon, 24 Nov 2025 12:17:33 +0000 (17:47 +0530)]
Merge pull request #66326 from afreen23/fixes-mixins

monitoring: Fixes for development

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
11 days agomgr/dashboard : fix - CephFS Authorize Modal Update issue
Devika Babrekar [Thu, 20 Nov 2025 11:33:56 +0000 (17:03 +0530)]
mgr/dashboard : fix - CephFS Authorize Modal Update issue
Fixes: https://tracker.ceph.com/issues/73854
Signed-off-by: Devika Babrekar <devika.babrekar@ibm.com>
11 days agodoc/cephadm: Fix command plus improvements in service/mgmt-gateway.rst
Ville Ojamo [Mon, 24 Nov 2025 09:34:19 +0000 (16:34 +0700)]
doc/cephadm: Fix command plus improvements in service/mgmt-gateway.rst

Remove double backticks from a CLI command.

Use bash prompt consistently for CLI command blocks.

Don't capitalize word in middle of sentence.

Talk about "admin" instead of "user", similarly to the last text
paragraph in the doc.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
11 days agoqa/clusters/crimson: increase reactors count
Shraddha Agrawal [Thu, 20 Nov 2025 12:41:16 +0000 (18:11 +0530)]
qa/clusters/crimson: increase reactors count

This commit increases the number of reactors in fixed-1 and
fixed-2 crimson clusters.

Signed-off-by: Shraddha Agrawal <shraddhaag@ibm.com>
11 days agomgr/dashboard: Carbonize the Change Password Form
Afreen Misbah [Tue, 21 Oct 2025 16:37:46 +0000 (22:07 +0530)]
mgr/dashboard: Carbonize the Change Password Form

Fixes: https://tracker.ceph.com/issues/73193

-  using carbon based stylings, typography and components
-  used grid layout for form arrangement
-  breadcrumb is slightly off, which needs to be fixed by applying grid layout to the app shell

Signed-off-by: Afreen Misbah <afreen@ibm.com>
11 days agoMerge pull request #66372 from tchaikov/wip-qa-encoder-exclude
Kefu Chai [Mon, 24 Nov 2025 08:27:14 +0000 (16:27 +0800)]
Merge pull request #66372 from tchaikov/wip-qa-encoder-exclude

qa/suites/rados/encoder: exclude ceph-osd-classic when installing LTS…

Reviewed-by: Matan Breizman <mbreizma@ibm.com>
11 days agoqa/suites/rados/encoder: exclude ceph-osd-* when installing LTS releases
Kefu Chai [Sat, 22 Nov 2025 00:24:36 +0000 (08:24 +0800)]
qa/suites/rados/encoder: exclude ceph-osd-* when installing LTS releases

In a37b5b5, the ceph-osd-classic and ceph-osd-crimson packages were
added to qa/packages/packages.yaml. The "install" task uses this file as
the default package list for all branches, including LTS releases like
Reef.

However, a37b5b5 only exists in the main branch and won't be backported
to LTS branches. This causes installation failures in the rados/encoder
test suite, which verifies forward compatibility by installing LTS
releases and testing whether they can decode the latest corpus.

Exclude ceph-osd-classic and ceph-osd-crimson from LTS installations to
ensure the test suite can successfully install ceph-dencoder, which is
required for the interoperability tests.

Fixes: https://tracker.ceph.com/issues/73957
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
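
The exclusion described in the commit message above can be illustrated with a minimal sketch. This is not the actual teuthology "install" task or the real contents of qa/packages/packages.yaml; the package list, the glob pattern, and the function name are hypothetical and chosen only to show the idea of filtering a shared default package list before installing an older release.

```python
from fnmatch import fnmatchcase

# Hypothetical default package list shared by all branches (illustration only;
# not the real contents of qa/packages/packages.yaml).
DEFAULT_PACKAGES = [
    "ceph",
    "ceph-osd",
    "ceph-osd-classic",
    "ceph-osd-crimson",
    "ceph-test",
]

# Assumed glob for packages that exist only on main, not on LTS branches.
EXCLUDE_PATTERNS = ["ceph-osd-*"]


def packages_for_release(packages, exclude_patterns):
    """Return the packages that match none of the glob patterns, keeping order."""
    return [
        pkg
        for pkg in packages
        if not any(fnmatchcase(pkg, pattern) for pattern in exclude_patterns)
    ]


lts_packages = packages_for_release(DEFAULT_PACKAGES, EXCLUDE_PATTERNS)
print(lts_packages)
```

Note that the pattern "ceph-osd-*" requires the trailing hyphen, so the plain "ceph-osd" package stays in the list while the two split packages are dropped.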
11 days agoMerge pull request #66293 from anthonyeleven/instore.dbnoonecanhearyouscream
Anthony D'Atri [Mon, 24 Nov 2025 06:07:04 +0000 (01:07 -0500)]
Merge pull request #66293 from anthonyeleven/instore.dbnoonecanhearyouscream

doc: Improve start/hardware-recommendations.rst

13 days agoMerge pull request #65995 from pcuzner/rocksdb_compaction_metric
Laura Flores [Sat, 22 Nov 2025 00:04:21 +0000 (18:04 -0600)]
Merge pull request #65995 from pcuzner/rocksdb_compaction_metric

rados/osd: enable compact_running perfcounter at PRIO=5

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Neha Ojha <nojha@ibm.com>
13 days agoMerge pull request #65694 from mohit84/mclock_scheduler_monc
Laura Flores [Sat, 22 Nov 2025 00:02:45 +0000 (18:02 -0600)]
Merge pull request #65694 from mohit84/mclock_scheduler_monc

osd: Remove monc reference from scheduler

Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
13 days agoMerge pull request #65687 from pdvian/wip-73272-autoscaler
Laura Flores [Sat, 22 Nov 2025 00:00:16 +0000 (18:00 -0600)]
Merge pull request #65687 from pdvian/wip-73272-autoscaler

pybind/mgr/pg_autoscaler: Introduce dynamic threshold to improve scal…

Reviewed-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
13 days agoMerge pull request #65698 from cbodley/wip-72771
Laura Flores [Fri, 21 Nov 2025 23:56:55 +0000 (17:56 -0600)]
Merge pull request #65698 from cbodley/wip-72771

osdc: Objecter::linger_by_cookie() for safe cast from uint64

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
13 days agoMerge pull request #65615 from chungfengz-syno/fix-get_addr_from_invalid_rank
Laura Flores [Fri, 21 Nov 2025 23:52:35 +0000 (17:52 -0600)]
Merge pull request #65615 from chungfengz-syno/fix-get_addr_from_invalid_rank

mon/Elector.cc: prevent assertion failure when receiving pings from r…

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2 weeks agodoc: Improve start/hardware-recommendations.rst
Anthony D'Atri [Mon, 17 Nov 2025 17:57:29 +0000 (12:57 -0500)]
doc: Improve start/hardware-recommendations.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
2 weeks agoMerge pull request #66311 from rhcs-dashboard/fix-73910-main
afreen23 [Fri, 21 Nov 2025 11:51:44 +0000 (17:21 +0530)]
Merge pull request #66311 from rhcs-dashboard/fix-73910-main

monitoring: remove cephfs.libsonnet mention from dashboards.libsonnet

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
2 weeks agomgr/dashboard: edit realm modal not working
Naman Munet [Thu, 20 Nov 2025 07:28:55 +0000 (12:58 +0530)]
mgr/dashboard: edit realm modal not working

Fixes: https://tracker.ceph.com/issues/73937
Signed-off-by: Naman Munet <naman.munet@ibm.com>
2 weeks agoMerge pull request #66319 from rhcs-dashboard/fix-account-group-mode
afreen23 [Fri, 21 Nov 2025 07:18:09 +0000 (12:48 +0530)]
Merge pull request #66319 from rhcs-dashboard/fix-account-group-mode

mgr/dashboard: rgw accounts form group mode disable option is not working

Reviewed-by: Afreen Misbah <afreen@ibm.com>
2 weeks agoMerge pull request #66323 from aainscow/pg_repeer wip-kdhaduk-testing-2025-11-21-1159
Laura Flores [Thu, 20 Nov 2025 18:28:20 +0000 (12:28 -0600)]
Merge pull request #66323 from aainscow/pg_repeer

mon: ceph pg repeer should propose a correctly sized pg temp.

Reviewed-by: Laura Flores <lflores@ibm.com>
2 weeks agoMerge pull request #66229 from Matan-B/wip-matanb-crimson-on
Matan Breizman [Thu, 20 Nov 2025 16:56:00 +0000 (18:56 +0200)]
Merge pull request #66229 from Matan-B/wip-matanb-crimson-on

ceph.spec.in: Include Crimson by default

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 weeks agomgr/dashboard: add GET API endpoint for consistency groups
Imran Imtiaz [Thu, 20 Nov 2025 14:45:32 +0000 (14:45 +0000)]
mgr/dashboard: add GET API endpoint for consistency groups

Add a consistency group dashboard API endpoint to get the list of images
in the consistency groups that match the namespace of the group.

Fixes: https://tracker.ceph.com/issues/73942
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com>

2 weeks agoMerge pull request #66325 from solmagd/wip-doc-jd-clock-skew-option
Anthony D'Atri [Thu, 20 Nov 2025 15:53:49 +0000 (10:53 -0500)]
Merge pull request #66325 from solmagd/wip-doc-jd-clock-skew-option

doc: Harmonize hyphens to underscores in rados/troubleshooting/troubleshooting-mon.rst

2 weeks agoosd/SnapMapper: fix Crimson logs
Matan Breizman [Thu, 20 Nov 2025 13:46:49 +0000 (13:46 +0000)]
osd/SnapMapper: fix Crimson logs

Switch to Crimson's debugging macro and fix the faulty
subsystem definition (ceph_subsys_)

Might help with https://tracker.ceph.com/issues/73790

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 weeks agoMerge pull request #55638 from clwluvw/cmake-boost
Seena Fallah [Thu, 20 Nov 2025 10:23:12 +0000 (11:23 +0100)]
Merge pull request #55638 from clwluvw/cmake-boost

cmake: skip boost dependency on ALIAS executable targets

2 weeks agoMerge pull request #66228 from imran-imtiaz/dashboard
Nizamudeen A [Thu, 20 Nov 2025 08:37:06 +0000 (14:07 +0530)]
Merge pull request #66228 from imran-imtiaz/dashboard

mgr/dashboard: add API endpoint to add images to consistency groups

Reviewed-by: Nizamudeen A <nia@redhat.com>
2 weeks agomgr/dashboard: Fixes navigation e2e test
Abhishek Desai [Wed, 19 Nov 2025 16:29:40 +0000 (21:59 +0530)]
mgr/dashboard: Fixes navigation e2e test
Fixes: https://tracker.ceph.com/issues/73832
Signed-off-by: Abhishek Desai <abhishek.desai1@ibm.com>
2 weeks agoMerge pull request #65915 from edwinzrodriguez/ceph-zstd-update
SrinivasaBharathKanta [Thu, 20 Nov 2025 03:27:33 +0000 (08:57 +0530)]
Merge pull request #65915 from edwinzrodriguez/ceph-zstd-update

zstd: Update zstd to 1.5.6 for cmake 4 compatibility

2 weeks agoMerge pull request #66120 from Jayaprakash-ibm/wip-fix-cot-bz2404644
SrinivasaBharathKanta [Thu, 20 Nov 2025 03:25:50 +0000 (08:55 +0530)]
Merge pull request #66120 from Jayaprakash-ibm/wip-fix-cot-bz2404644

tools: handle get-attr as read-only ops in ceph-objectstore-tool

2 weeks agoMerge pull request #66022 from Jayaprakash-ibm/wip-nref-opt
SrinivasaBharathKanta [Thu, 20 Nov 2025 03:24:50 +0000 (08:54 +0530)]
Merge pull request #66022 from Jayaprakash-ibm/wip-nref-opt

os/bluestore: Optimize Blob::put() for single-reference fast path

2 weeks agoMerge pull request #63390 from taodd/fix-ms_async_op_threads
SrinivasaBharathKanta [Thu, 20 Nov 2025 03:24:19 +0000 (08:54 +0530)]
Merge pull request #63390 from taodd/fix-ms_async_op_threads

common: fix ms_async_op_threads not being applied for daemons running in foreground mode

2 weeks agocmake: skip boost dependency on ALIAS executable targets
Seena Fallah [Mon, 19 Feb 2024 09:39:24 +0000 (10:39 +0100)]
cmake: skip boost dependency on ALIAS executable targets

The current add_executable override in Boost does not support alias
targets. Although Ceph currently has no alias targets that are
affected by this limitation, addressing this issue now will benefit
future developments and personal projects.
This change enhances the robustness of the override logic, ensuring
compatibility with alias targets moving forward.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>