Ilya Dryomov [Fri, 23 Jan 2026 13:48:53 +0000 (14:48 +0100)]
qa: don't assume that /dev/sda or /dev/vda is present in unmap.t
Instead of hard-coding the block device name, use the block device that
is backing the filesystem that the test is running on. We can be quite
sure it won't be an RBD device ;)
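As an illustration of the idea only (the test itself is a cram `.t` file and may use different tooling), the device backing a path can be found by picking the mount entry whose mount point is the longest prefix of that path. The helper name `backing_device` and the sample mount table are hypothetical; the input is assumed to be in `/proc/self/mounts` format.

```python
import posixpath

# Hypothetical sample in /proc/self/mounts format: device, mount point, type, ...
SAMPLE_MOUNTS = """\
/dev/nvme0n1p2 / ext4 rw,relatime 0 0
proc /proc proc rw 0 0
/dev/nvme0n1p3 /home ext4 rw,relatime 0 0
"""

def backing_device(path, mounts_text):
    """Return the device of the mount whose mount point is the longest prefix of path."""
    path = posixpath.normpath(path)
    best_dev, best_len = None, -1
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        dev, mnt = fields[0], fields[1]
        # Match on whole path components so "/home" does not match "/homework".
        if path == mnt or path.startswith(mnt.rstrip("/") + "/"):
            # ">=" lets a later entry win when mount points are stacked.
            if len(mnt) >= best_len:
                best_dev, best_len = dev, len(mnt)
    return best_dev
```

On a real system the second argument could be the contents of `/proc/self/mounts`, with the first being the directory the test runs in.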
Prevent OSD bench from benchmarking the OSDs during teuthology tests. This
helps avoid a cluster warning that is raised when the measured IOPS value
falls outside the typical threshold range.
The tests can rely on the built-in static values as defined by
osd_mclock_max_capacity_iops_[ssd|hdd] which should be good enough.
Ilya Dryomov [Wed, 21 Jan 2026 18:41:41 +0000 (19:41 +0100)]
qa: krbd_blkroset.t: eliminate a race in the open_count test
Even at QD=1, dd may take less than 10 seconds to work its way to the
end of a 10M image, producing "No space left on device" error instead
of the expected "Operation not permitted" error which is supposed to
arise from the device getting marked read-only while opened.
With https://github.com/ceph/ceph-build/pull/2497 merged we no longer
build Tentacle+Crimson regularly. As Crimson no longer backports changes
into Tentacle, there's no reason to keep testing it.
Matan Breizman [Thu, 22 Jan 2026 10:00:25 +0000 (12:00 +0200)]
container/build.sh: Use dedicated debug tags
https://github.com/ceph/ceph-build/pull/2497 introduced a debug flavor.
This seems to cause conflicts with the image being pushed to quay as one
of the flavors might override the other.
Tag debug build containers explicitly.
An alternative solution would be to skip debug containers altogether.
However, these might be useful for development purposes.
Note, prune-quay might also need to be updated once this is merged.
Kefu Chai [Thu, 22 Jan 2026 03:57:37 +0000 (11:57 +0800)]
cmake: fix undefined PY_LDFLAGS in distutils_install_cython_module
The distutils_install_cython_module() function was using ${PY_LDFLAGS}
without defining it, causing the linker to fail with:
/opt/rh/gcc-toolset-13/root/usr/libexec/gcc/x86_64-redhat-linux/13/ld:
cannot find -lrados: No such file or directory
This bug was introduced in commit d22734f6cb0 which changed:
set(ENV{LDFLAGS} "-L${CMAKE_LIBRARY_OUTPUT_DIRECTORY}")
to:
set(ENV{LDFLAGS} "${PY_LDFLAGS}")
However, PY_LDFLAGS was only defined in distutils_add_cython_module(),
not in distutils_install_cython_module(). This meant that during the
install phase, LDFLAGS was set to an empty string, and the linker
couldn't find librados.so and other Ceph libraries in the build
directory.
The bug was exposed by commit 719b74984605b490f23004eb41583a22c934c5fb
which changed rados.pxd to use C preprocessor conditionals (#ifdef
BUILD_DOC) instead of Cython's compile-time IF statements. This meant
the build now required proper linking during the install phase.
Fix by defining PY_LDFLAGS in distutils_install_cython_module():
Ville Ojamo [Fri, 16 Jan 2026 09:43:31 +0000 (16:43 +0700)]
doc/radosgw: change all intra-docs links to use ref (2 of 6)
Part 2 of 6 to make backporting easier. Depends on part 1.
Use the ref role for all remaining links in doc/radosgw/ with the
exception of config-ref.rst which will depend on changes to rgw.yaml.in.
The external link definitions syntax being removed is intended for
linking to external websites and not for intra-docs links. Validity of
ref links will be checked during the docs build process.
Add labels for link targets if necessary.
Remove unused external link definitions in the modified files.
Use confval instead of literal text for 2 configuration keys in
vault.rst.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ville Ojamo [Fri, 16 Jan 2026 08:55:27 +0000 (15:55 +0700)]
doc/radosgw: change all intra-docs links to use ref (1 of 6)
Part 1 of 6 to make backporting easier. Many of the following parts
depend on this.
Use the ref role for all remaining links in doc/radosgw/ with the
exception of config-ref.rst which will depend on changes to rgw.yaml.in.
The external link definitions syntax being removed is intended for
linking to external websites and not for intra-docs links. Validity of
ref links will be checked during the docs build process.
Add labels for link targets if necessary.
Remove unused external link definitions in the modified files.
Use confval instead of literal text for 2 configuration keys in
vault.rst.
Use Ceph Object Gateway consistently in multisite.rst.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ronen Friedman [Wed, 21 Jan 2026 12:37:24 +0000 (14:37 +0200)]
Merge pull request #66626 from ronen-fr/wip-rf-aborthp-justdoc
doc/ceph.rst: scrub-related 'tell pgid' commands
Related to https://github.com/ceph/ceph/pull/66515
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Afreen Misbah [Tue, 13 Jan 2026 20:47:40 +0000 (02:17 +0530)]
mgr/dashboard: Add productive card component
- add generic productive card component
- based on carbon design system
- there are two versions of the card: with shadow (tinted effect) and without.
- applies the gray10 theme, as specified by the new designs.
Fix on_operator_abort_scrub() to handle the case where
the operator-initiated abort request arrives while the
'start scrub' message is still in the queue (i.e.,
is_queued_or_active() is true, but is_scrub_active()
is false).
Unlike our handling of, for example, FullReset in
PrimaryIdle::clear_state(), here we choose to ignore
the request:
Considering the added complexity to the FSM versus
the minimal benefit, it is better to just ignore this
very rare case, leaving it to the operator to re-issue
the abort command if needed.
Ronen Friedman [Thu, 4 Dec 2025 14:49:29 +0000 (08:49 -0600)]
osd/scrub: support an operator-abort command
The new explicit command aborts any ongoing scrub of the target PG,
including operator-initiated scrubs. That additional capability is needed now that
operator-initiated scrubs are no longer blocked by 'no-scrub' settings.
The scenario we are trying to help the operator with is:
- an operator issues a set of operator-initiated scrubs (e.g., via a
script), then realizes the mistake and wants to abort them all.
The abort command also downgrades the urgency level of the scrub target
(as otherwise the target would immediately restart, against the operator's
wishes).
This commit implements the changes to the state machine and to the abort
logic, assuming the operator command was translated into an event.
Ville Ojamo [Tue, 20 Jan 2026 06:17:44 +0000 (13:17 +0700)]
doc/rados: fix links in operations/cache-tiering.rst
Change a link using an external link definition to ref that was missed
in commit 49af82c, PR #66943.
Remove a sentence that also linked to the same destination because the
destination had no cache tier configuration or default values that the
text promised.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Gil Bregman [Mon, 19 Jan 2026 12:18:03 +0000 (14:18 +0200)]
mgr/cephadm: Add some new fields to the cephadm NVMEoF spec file.
Fixes: https://tracker.ceph.com/issues/74446
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
Aashish Sharma [Wed, 17 Dec 2025 09:21:14 +0000 (14:51 +0530)]
monitoring: make cluster matcher backward compatible for pre-7.1 metrics
Ceph 18.* adds a `cluster` label to all Prometheus metrics. When
upgrading from earlier releases, historical metrics lack this label
and are excluded by Grafana queries that strictly match on `cluster`.
Update the shared Grafana matcher logic to use a regex matcher that
also matches series without the `cluster` label, restoring visibility
of pre-upgrade metrics while preserving multi-cluster behavior.
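The matching behaviour described above can be sketched as follows. Prometheus regex matchers are fully anchored and treat a missing label as the empty string, so a pattern with an empty alternative also selects series that lack the label. The exact expression used by this commit is not shown here; the `"<name>|"` pattern and the helper `matches` are assumptions for illustration.

```python
import re

def matches(labels, name, pattern):
    """Mimic a Prometheus regex matcher: fully anchored, missing label == ""."""
    return re.fullmatch(pattern, labels.get(name, "")) is not None

# Hypothetical series: one from after the upgrade, one from before it,
# and one belonging to a different cluster.
new_series = {"__name__": "ceph_health_status", "cluster": "ceph-a"}
old_series = {"__name__": "ceph_health_status"}
other_series = {"__name__": "ceph_health_status", "cluster": "ceph-b"}

strict = "ceph-a"     # strict match: excludes series without the label
relaxed = "ceph-a|"   # empty alternative: also matches a missing label
```

With the relaxed pattern, both `new_series` and `old_series` are selected while `other_series` is still excluded, which is the multi-cluster behaviour the commit message describes.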
Patrick Donnelly [Wed, 14 Jan 2026 16:08:04 +0000 (11:08 -0500)]
script/ptl-tool: fix typo
Traceback (most recent call last):
File "/home/batrick/scm/ceph/src/script/ptl-tool.py", line 657, in <module>
File "/home/batrick/scm/ceph/src/script/ptl-tool.py", line 654, in main
File "/home/batrick/scm/ceph/src/script/ptl-tool.py", line 464, in build_branch
UnboundLocalError: cannot access local variable 'trailer_commit' where it is not associated with a value
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Ville Ojamo [Thu, 15 Jan 2026 07:22:29 +0000 (14:22 +0700)]
doc/rados: improve troubleshooting-pg.rst
Note that a link to a walkthrough uses deprecated Filestore.
Reported in doc bugs pad.
Fix capitalization, use OSD instead of ceph-osd.
Improve language in a list.
Remove escaping from slashes in PG query output, tested on Quincy.
Consistently avoid spaces in states like active+remapped.
Add label for incoming links and change them to refs.
Use privileged prompt for CLI commands, don't highlight in console output.
Use double backticks consistently. Improve markup.
Remove spaces at the end of lines.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ville Ojamo [Fri, 16 Jan 2026 06:47:44 +0000 (13:47 +0700)]
doc/rados: use ref for links and improve links in operations
Add labels for doc top and CRUSH MSR in crush-map.rst.
Add a see more link to crush-map-edits.rst from crush-map.rst.
Use ref for linking if labels were added or existed already in a few
related files.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
J. Eric Ivancich [Thu, 15 Jan 2026 20:32:32 +0000 (15:32 -0500)]
rgw: rgw-orphan-list can continue with empty intermediate file(s)
rgw-orphan-list would exit with an error if either of the intermediate
files were empty. That's not necessarily indicative of an error,
though. If otherwise all the buckets have been removed then the
radosgw-admin intermediate file *should* be empty and the tool will
still find orphans. When an empty intermediate file is found, the tool
now emits a warning instead of an error and does not exit.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
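A minimal sketch of the warn-and-continue behaviour, written in Python for illustration (rgw-orphan-list itself is a shell script, and this is not its code). The function name, the label, and the message wording are hypothetical.

```python
import io
import os
import sys
import tempfile

def intermediate_has_data(path, label, err=sys.stderr):
    """Return True if the file has data; warn, but do not exit, if it is empty."""
    size = os.path.getsize(path)  # still raises OSError if the file is missing
    if size == 0:
        print(f"WARNING: {label} intermediate file {path} is empty; continuing",
              file=err)
        return False
    return True

def demo():
    """Run the check against an empty temporary file and capture the warning."""
    with tempfile.NamedTemporaryFile() as tmp:
        log = io.StringIO()
        ok = intermediate_has_data(tmp.name, "radosgw-admin", err=log)
        return ok, log.getvalue()
```

A missing file remains a hard error in this sketch; only the empty-file case is downgraded to a warning.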
Adam C. Emerson [Thu, 8 May 2025 18:34:54 +0000 (14:34 -0400)]
{test,rgw,tools}: Explicitly use Boost.Process v1
Boost 1.88 removed the default of using the v1 interface
automatically. See https://github.com/boostorg/process/issues/480 for
an example.
https://www.boost.org/doc/libs/1_88_0/libs/process/doc/html/index.html#version_2
describes the new, preferred version which we probably want to migrate
to eventually.
In this change we simply include the v1 files and change the namespace
we alias.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Adam C. Emerson [Thu, 15 Jan 2026 00:58:15 +0000 (19:58 -0500)]
build: Disable `FindBoost` for Boost's included cmake config
Boost has included this since 1.70 and CMake has deprecated the
non-config version since 3.30.
See also
https://cmake.org/cmake/help/latest/policy/CMP0167.html#policy:CMP0167
We enable CMP0167 (The `FindBoost` module is removed.) to force cmake
to use the installed Boost configuration files rather than its own
detection.
We also enable CMP0144 (`find_package()` uses upper-case
`<PACKAGENAME>_ROOT` variables.) to ensure that the `BOOST_ROOT`
parameter continues to function in the config-style `find_package`.
`BuildBoost.cmake` is updated to add the `Boost::headers` interface
target to match configured system boost (retaining the Boost::boost
alias).
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Venky Shankar [Wed, 14 Jan 2026 13:08:58 +0000 (18:38 +0530)]
pybind/cephfs: invoke fcopyfile() libcephfs API without holding GIL
fcopyfile() performs a read+write cycle on the entire file to be copied.
The Python binding invokes this with the GIL held. This causes the GIL
to be held for an extended duration, blocking other Python threads that
try to acquire the GIL when scheduled.
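The effect can be illustrated in plain Python. The binding is written in Cython, where a C call is typically wrapped in a `with nogil:` block; the sketch below does not use the cephfs binding at all and instead uses `time.sleep()`, which releases the GIL while waiting, as a stand-in for a long-running call made without the GIL. The function names are hypothetical.

```python
import threading
import time

def blocking_call(seconds):
    # Stand-in for a long C call made without the GIL:
    # time.sleep() releases the GIL while it waits.
    time.sleep(seconds)

def count_while_blocked(seconds=0.2):
    """Count main-thread loop iterations completed while a worker is blocked."""
    worker = threading.Thread(target=blocking_call, args=(seconds,))
    worker.start()
    ticks = 0
    while worker.is_alive():
        ticks += 1  # progress is possible because the worker does not hold the GIL
    worker.join()
    return ticks
```

If the worker held the GIL for the whole call, the main thread could make no progress until the call returned.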
Ville Ojamo [Tue, 13 Jan 2026 09:52:50 +0000 (16:52 +0700)]
doc/_ext: unbreak releases timeline if other than 3 active releases
The Timeline custom Sphinx directive expected exactly three active
releases listed as arguments. While this is fine for the usual situation
of three active releases, improving the directive to support any number
of active releases may benefit e.g. testing.
Previously, using anything other than 3 release names in the
releases/index.rst ceph_timeline directive caused the release dates
table to not be rendered.
Use the same pattern as the TimelineGantt custom directive by requiring
two arguments, with the second argument being a space-separated string
of release names.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
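The two-argument pattern described above can be sketched as follows. This is not the directive's code: the function name is hypothetical, and the first argument is assumed here to be a data file name.

```python
def parse_timeline_arguments(arguments):
    """Split directive arguments into (source, [release names]).

    Expects exactly two arguments; the second is a space-separated
    string holding any number of release names.
    """
    if len(arguments) != 2:
        raise ValueError("expected 2 arguments, got %d" % len(arguments))
    source, releases = arguments
    names = releases.split()
    if not names:
        raise ValueError("at least one release name is required")
    return source, names
```

Because the release names travel in a single string, the same code path handles two, three, or five active releases.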