Adam King [Thu, 8 Jan 2026 19:59:29 +0000 (14:59 -0500)]
qa/cephadm: add default container image name base
Recent runs in the new lab for cephadm jobs are hitting
```
2026-01-06T00:47:00.247 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_teuthology_8707f5d0ca0e547efc56a3734c7aa4a4cf45b1f4/teuthology/run_tasks.py", line 112, in run_tasks
manager.__enter__()
File "/usr/lib/python3.12/contextlib.py", line 137, in __enter__
return next(self.gen)
^^^^^^^^^^^^^^
File "/home/teuthworker/src/github.com_ceph_ceph-c_73712ccf0d7b99648a0c0d2ebf766adc13587a45/qa/tasks/cephadm.py", line 1961, in task
raise Exception("Configuration error occurred. "
Exception: Configuration error occurred. The 'image' value is undefined for 'cephadm' task. Please provide...
```
but it appears the build is still pushing images to quay.ceph.io/ceph-ci/ceph:<sha1>,
which is the image we would be looking for. The bit that raises an exception if
this isn't set is quite old, and I'm unsure whether we still need it or whether
it's safe to just assume this naming scheme can be used if an explicit image
isn't passed, but this seems worth a try.
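The fallback discussed in this message can be sketched roughly as follows. This is a minimal illustration, not the actual qa/tasks/cephadm.py change: `resolve_image`, `DEFAULT_IMAGE_BASE`, and the shape of `config` are hypothetical names chosen for the example. Only the quay.ceph.io/ceph-ci/ceph:<sha1> naming scheme comes from the message itself.

```python
from typing import Optional

# Image name base taken from the commit message; the constant name is hypothetical.
DEFAULT_IMAGE_BASE = "quay.ceph.io/ceph-ci/ceph"


def resolve_image(config: dict, sha1: Optional[str]) -> str:
    """Return the explicitly configured image, else fall back to <base>:<sha1>."""
    image = config.get("image")
    if image:
        return image
    if not sha1:
        # Nothing to derive a default from, so keep failing loudly.
        raise ValueError("no 'image' configured and no sha1 to build a default from")
    return f"{DEFAULT_IMAGE_BASE}:{sha1}"


if __name__ == "__main__":
    print(resolve_image({}, "0123abcd"))
    print(resolve_image({"image": "example.invalid/ceph:test"}, "0123abcd"))
```

An explicit image still takes precedence in this sketch, so jobs that already pass one would be unaffected by the default.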
Ronen Friedman [Thu, 1 Jan 2026 14:35:33 +0000 (14:35 +0000)]
scripts/build/ceph.spec.in: fix rhel version checks
Fix multiple instances in this file where
the RHEL version is checked without first
ensuring that the OS is indeed RHEL.
%{?rhel} is only defined on RHEL systems, so
0%{?rhel} evaluates to '0' elsewhere. That resulted,
for example, in Fedora 43 having 'gts_version'
incorrectly set to '13'.
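To see why an unguarded version check misfires on non-RHEL systems, the expansion of `0%{?rhel}` can be modeled in a few lines of Python. This is only a sketch of the arithmetic: the helper names and the `< 10` threshold are assumptions made for illustration, not the actual conditions used in ceph.spec.in.

```python
def expand_rhel(macros: dict) -> int:
    """Model `0%{?rhel}`: `%{?rhel}` expands to its value if defined, else to nothing."""
    return int("0" + str(macros.get("rhel", "")))


def unguarded_check(macros: dict) -> bool:
    # Hypothetical spec condition: `0%{?rhel} < 10`
    return expand_rhel(macros) < 10


def guarded_check(macros: dict) -> bool:
    # Hypothetical guarded form: `0%{?rhel} && 0%{?rhel} < 10`
    value = expand_rhel(macros)
    return value != 0 and value < 10


fedora = {"fedora": 43}  # `rhel` is undefined here
rhel9 = {"rhel": 9}
rhel10 = {"rhel": 10}

if __name__ == "__main__":
    for name, macros in (("fedora", fedora), ("rhel9", rhel9), ("rhel10", rhel10)):
        print(name, unguarded_check(macros), guarded_check(macros))
```

On the Fedora-like input the unguarded comparison sees 0 and passes, which is the kind of false match the message describes. The guarded form only passes when `rhel` is actually defined.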
Kefu Chai [Tue, 23 Dec 2025 03:19:12 +0000 (11:19 +0800)]
cmake: clarify WITH_CRIMSON help text
The help text for WITH_CRIMSON previously read "Build seastar
components", which referenced the underlying C++ framework rather
than the user-facing functionality. This was confusing because users
care about Ceph features, not implementation details.
Change the help text to reference "Crimson" directly and explicitly
state the default value, making the option's purpose clearer to users.
Ville Ojamo [Mon, 5 Jan 2026 06:10:45 +0000 (13:10 +0700)]
doc: Remove sphinxcontrib-seqdiag Python package from RTD builds
This is a proactive PR to avoid breaking docs builds when Setuptools 81
starts to be used in the RTD builds process.
The sphinxcontrib-seqdiag Python package is not compatible with
Setuptools 81 or later due to use of pkg_resources:
https://setuptools.pypa.io/en/latest/pkg_resources.html
Setuptools 81 release should be imminent, with the Python deprecation
warning stating pkg_resources "removal as early as 2025-11-30".
Seqdiag seems to be unmaintained, with the latest update on PyPI in
the year 2021 and also no updates to the seqdiag git repo.
There are no seqdiag directives left in the docs after the last seqdiags
were removed in PR #52308.
Two other options would exist for fixing the situation (see PR for
discussion) but this seems to be the suitable one.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ville Ojamo [Mon, 5 Jan 2026 05:46:08 +0000 (12:46 +0700)]
doc/rados: Fix Sphinx warnings in troubleshooting/log-and-debug.rst
Possibly controversial Sphinx fixes:
Fix two Sphinx warnings about more than one confval directive.
Remove the duplicate confval directives from log-and-debug.rst and leave them
only in mon-config-ref.rst, because it has the only copy of a related clog
configuration value, mon_health_to_clog_tick_interval.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Kefu Chai [Wed, 31 Dec 2025 09:01:43 +0000 (17:01 +0800)]
common/options: document log_to_stderr's conditional default value
The default value of `log_to_stderr` varies depending on whether Ceph
runs as a daemon or a library. Previously, this was only documented via
the `default` property, which led to confusion when debugging client
applications.
For example, when debugging a CephFS client, setting `debug <subsystem> = 5`
in the configuration file doesn't produce visible debug logs as expected.
This occurs because `common_preinit()` overrides `log_to_stderr` to `false`
when Ceph runs as a library, preventing logs from appearing on stderr.
This commit adds clarification to the `long_desc` field to document this
conditional behavior and help users understand why debug output may not
appear in client scenarios.
Aashish Sharma [Wed, 10 Dec 2025 10:41:50 +0000 (16:11 +0530)]
mgr/dashboard: fix multi-cluster context switcher
The multi-cluster context switcher stopped working because of a
regression caused by this PR https://github.com/ceph/ceph/pull/66034.
This PR fixes the issue.
Afreen Misbah [Mon, 15 Dec 2025 15:53:44 +0000 (21:23 +0530)]
mgr/dashboard: Fix display of IP address in host page
- Host data is merged with the hosts' facts, which do not include the address, so it is not displayed in the UI
- The value is therefore also empty in the API
- Caused by https://github.com/ceph/ceph/pull/65102
rgw/dedup: Prevent the dup-counter from wrapping around after it reaches 64K of identical copies.
Limit dedup from a single SRC to 128 Target copies to prevent OMAP size
from growing out of control
Tests cleanup
Kefu Chai [Wed, 24 Dec 2025 08:57:12 +0000 (16:57 +0800)]
test/run-cli-tests: install wheel before cram to fix build failure
Fix the run-cli-tests failure that occurs when installing cram from git.
The error happens because the fresh venv lacks build dependencies, causing
pip to fall back to legacy setup.py install which fails:
Using legacy 'setup.py install' for cram, since package 'wheel' is not installed.
Installing collected packages: cram
Running setup.py install for cram: started
error: subprocess-exited-with-error
× Running setup.py install for cram did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Running setup.py install for cram: finished with status 'error'
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> cram
The issue became visible after commit 70880723eaa updated the pip URL
format to the new PEP 440 style, which exposed the missing build tools.
Solution: Upgrade pip, setuptools, and wheel before installing cram to
ensure proper wheel-based installation works correctly with Python 3.13
and modern pip versions.
Kefu Chai [Wed, 24 Dec 2025 05:55:26 +0000 (13:55 +0800)]
debian/control: add iproute2 to build dependencies
Test scripts like qa/tasks/cephfs/mount.py expect the ip command to be
available in the container environment. Without it, tests fail with:
```
/bin/bash: line 1: ip: command not found
File "/ceph/qa/tasks/cephfs/mount.py", line 96, in cleanup_stale_netnses_and_bridge
p = remote.run(args=['ip', 'netns', 'list'],
...
teuthology.exceptions.CommandFailedError: Command failed with status 127: 'ip netns list'
```
Add iproute2 to the debian package build dependencies when the
<pkg.ceph.check> build profile is enabled. This ensures the package is
available during container-based builds, since buildcontainer-setup.sh
→ script/run-make.sh → install-deps.sh → debian/control → generated
dependency package chain respects build profiles configured via
`FOR_MAKE_CHECK` and `WITH_CRIMSON` environment variables set in
Dockerfile.build.
Kefu Chai [Tue, 23 Dec 2025 14:44:10 +0000 (22:44 +0800)]
common: fix TrackedOp intrusive_ptr compatibility with boost 1.89+
Boost 1.89+ includes a new header sp_cxx20_constexpr.hpp (Copyright 2025)
that defines BOOST_SP_CXX20_CONSTEXPR macro. When building with C++20/23
mode and a compiler supporting constexpr dynamic allocation, this macro
expands to 'constexpr', making intrusive_ptr constructors and destructors
constexpr.
This change breaks builds that previously worked with boost 1.87 because:
In boost 1.87, the copy constructor was NOT constexpr:
intrusive_ptr(intrusive_ptr const & rhs): px( rhs.px )
{
if( px != 0 ) intrusive_ptr_add_ref( px );
}
In boost 1.89+ with C++20/23, it becomes constexpr:
BOOST_SP_CXX20_CONSTEXPR intrusive_ptr(intrusive_ptr const & rhs): px( rhs.px )
{
if( px != 0 ) intrusive_ptr_add_ref( px );
}
With constexpr, name lookup for intrusive_ptr_add_ref happens at compile
time during template instantiation, not at runtime. This changes the
lookup behavior significantly.
The issue is the "hidden friend" pattern: friend functions defined inside
a class are in the enclosing (global) namespace, but per C++ standard
[basic.lookup.argdep], they are ONLY visible via ADL (Argument-Dependent
Lookup), not via ordinary unqualified lookup.
When boost::intrusive_ptr's constexpr constructor tries to call
intrusive_ptr_add_ref(px) during template instantiation:
1. Ordinary unqualified lookup finds ceph::common::intrusive_ptr_add_ref
2. Since ordinary lookup succeeded, ADL is not performed [basic.lookup.argdep]/1
3. The friend functions in TrackedOp are never considered
4. Compilation fails due to signature mismatch (TrackedOp* vs RefCountedObject*)
In boost 1.87 (non-constexpr), ADL worked normally at runtime and found
the hidden friend functions. With constexpr in 1.89+, compile-time lookup
finds the wrong function before ADL can trigger.
The fix adds forward declarations before boost::intrusive_ptr<TrackedOp>
is first used. This makes the functions visible to ordinary lookup (not
just ADL), allowing the compiler to find them instead of the ceph::common
versions. The friend functions provide the actual definitions.
Note: Friend functions defined inside a class are already implicitly
inline per the C++ standard, so no explicit inline specifier is needed
on the friend function definitions.
This issue manifests when building with:
- Boost 1.89+ (which introduced sp_cxx20_constexpr.hpp)
- C++23 standard mode
- Compiler with constexpr dynamic allocation support
Fixes build errors like:
error: 'intrusive_ptr_add_ref' was not declared in this scope
Imran Imtiaz [Mon, 8 Dec 2025 07:59:03 +0000 (07:59 +0000)]
mgr/dashboard: add CRUD API endpoints for consistency group snapshots
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com>
Fixes: https://tracker.ceph.com/issues/74258
Create a set of consistency group dashboard API endpoints to:
- List group snapshots
- Get details about a particular snapshot
- Create a snapshot
- Delete a snapshot
Nizamudeen A [Mon, 22 Dec 2025 08:49:00 +0000 (14:19 +0530)]
mgr/dashboard: upgrade angular to 19
* bump nodejs to 22.21.1
* remove swagger-ui from package.json and import the bundled version
of it, `swagger-ui-dist`. This removes the dependencies on react
redux that are brought in by swagger-ui and also reduces the build
assets and build warnings. We don't need the whole swagger-ui
package to be present here. Also import swagger-ui.css lazily inside
the api-docs component.
Since our project is now under nx, the upgrade was done using the nx migrate
command. It took care of adding `standalone: false` to all our files,
since we are still on the modular architecture.
Other changes include:
- adding `flush()` to fakeAsync mock test
- fixing some complaints raised by the tsc linter as per the new TypeScript
type checks
- removing `this` from html components
- fixing the jest config for newer presets
Kefu Chai [Thu, 18 Dec 2025 08:41:49 +0000 (16:41 +0800)]
cmake: build static seastar for release builds
When BUILD_SHARED_LIBS is set, seastar inherits this setting from the
parent CMake project, causing crimson to link against libseastar.so.
While this works in development environments, it breaks package
installation because libseastar.so is not included in the distribution.
Force seastar to build as a static library regardless of the parent
project's BUILD_SHARED_LIBS setting. This fixes the packaging issue
and provides a modest performance improvement by eliminating PLT/GOT
indirection overhead for seastar function calls.
cmake: When building against the fio headers, the BITS_PER_LONG macro defined by fio's build is not being used in our CMake-based build system for plugins.
Fixes: https://tracker.ceph.com/issues/74182
Signed-off-by: T K Chandra Hasan <t.k.chandra.hasan@ibm.com>
David Galloway [Tue, 16 Dec 2025 22:08:00 +0000 (17:08 -0500)]
install-deps: Replace apt-mirror
apt-mirror.front.sepia.ceph.com has happened to always work because we set up CNAMEs to gitbuilder.ceph.com.
That host is making its way to a new home upstate (literally and figuratively) so we'll get rid of the front subdomain since it's publicly accessible anyway and add TLS while we're at it.
Signed-off-by: David Galloway <david.galloway@ibm.com>
Casey Bodley [Thu, 11 Dec 2025 19:19:01 +0000 (14:19 -0500)]
osdc: remove implicit LingerOp reference between watch/unwatch
before this change set, linger_register() returned a raw LingerOp
pointer with an implicit reference for the caller. for librados,
this implicit reference is only dropped when the corresponding
unwatch() calls linger_cancel()
after commit 94f42b648feea77bd09dc3fdb48e6db2b48c7717 introduced
linger_by_cookie(), unwatch() no longer has a safe way to drop this
implicit reference. to prevent LingerOp leaks when unwatch() returns
ENOTCONN, we can't hold this implicit reference count until unwatch()
linger_register() now returns an explicit reference to the caller as
intrusive_ptr<LingerOp>. this helps to guarantee that this reference
count gets dropped before the completion of watch()/aio_watch()
because linger_register() no longer acquires an implicit reference for
the caller, linger_cancel() no longer drops it with info->put()
Casey Bodley [Thu, 11 Dec 2025 16:34:00 +0000 (11:34 -0500)]
librados: aio_unwatch() delivers ENOTCONN to AioCompletion
94f42b648feea77bd09dc3fdb48e6db2b48c7717 added a new error condition to
IoCtx::aio_unwatch() that callers aren't prepared to handle. instead of
returning that error directly, report it asynchronously to the
AioCompletion
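
The change in error reporting can be modeled as below. This is a language-neutral sketch written in Python, not the librados C++ API: the `Completion` class, the `watches` dictionary, and this `aio_unwatch` signature are all hypothetical. The completion is finished synchronously for brevity, where a real implementation would do so asynchronously, and treating an unknown cookie as the ENOTCONN case is an assumption.

```python
import errno


class Completion:
    """Hypothetical stand-in for an AioCompletion: records one result code."""

    def __init__(self) -> None:
        self.done = False
        self.result = None

    def complete(self, rc: int) -> None:
        self.done = True
        self.result = rc


def aio_unwatch(watches: dict, cookie: int, completion: Completion) -> int:
    """Always return 0; report success or -ENOTCONN through the completion."""
    if cookie not in watches:
        # Assumed failure case: no registered watch for this cookie.
        completion.complete(-errno.ENOTCONN)
        return 0
    del watches[cookie]
    completion.complete(0)
    return 0
```

In this model callers only need to inspect the completion, so a caller that ignores the return value of the submit call still observes the error.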