crimson/osd: implement PG merge detection and orchestration in PGAdvanceMap
Integrate PG merge logic into the map advancement pipeline by implementing
helpers within PGAdvanceMap.
- check_for_merges(): Monitors pg_num changes between epochs to identify
pending merges for the current PG.
- merge_pg(): Orchestrates the merge based on the PG's role. For sources,
it stops the PG and registers it with the target shard. For targets,
it waits for all sources to arrive and executes the merge.
- start(): Updated the main loop with a shared stop_flag and exception-based
early exit to halt map advancement immediately once a merge is triggered.
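The detection step described for check_for_merges() can be sketched roughly as below. This is a small Python model, not the crimson code: merge_parent and merge_role are hypothetical names, and the parent computation assumes the usual layout in which a PG's parent is its seed with the highest set bit cleared.

```python
def merge_parent(seed: int, new_pg_num: int) -> int:
    """Walk up the split tree until the seed falls inside the shrunken range."""
    if new_pg_num < 1:
        raise ValueError("new_pg_num must be at least 1")
    while seed >= new_pg_num:
        # Assumption: the parent is the seed with its highest set bit cleared.
        seed &= ~(1 << (seed.bit_length() - 1))
    return seed


def merge_role(seed: int, old_pg_num: int, new_pg_num: int):
    """Return ("source", target_seed), ("target", [source_seeds]) or (None, None)."""
    if new_pg_num >= old_pg_num or seed >= old_pg_num:
        return (None, None)  # pg_num did not shrink, or the PG does not exist
    if seed >= new_pg_num:
        # This PG disappears in the new epoch, so it is a merge source.
        return ("source", merge_parent(seed, new_pg_num))
    # A surviving PG is a target if any disappearing PG maps onto it.
    sources = [s for s in range(new_pg_num, old_pg_num)
               if merge_parent(s, new_pg_num) == seed]
    return ("target", sources) if sources else (None, None)
```

For example, with pg_num going from 8 to 4, seed 5 would be a source whose target is seed 1, and seed 1 would be a target waiting for seed 5.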
Add PG::merge_from to execute the merge of source PGs into a target PG.
This function builds a transaction to remove source-specific metadata
objects and merge source collections into the target collection.
Introduce merge waiter infrastructure to coordinate PG merging across
different CPU cores.
This commit provides the mechanism to 'ship' source PGs between shards
and synchronize the target PG's execution.
The mechanism solves two primary challenges:
- Migration Safety: PGs are tied to their 'birth_shard' for memory management.
To move a PG, we use seastar::foreign_ptr for transit and crimson::local_shared_foreign_ptr for storage.
This ensures that even when a source PG is used on a different core, its eventual destruction is
automatically routed back to its owner core, preventing cross-shard deallocation crashes.
- Target and Source PG Synchronization: Merging involves a race between the source PGs 'checking in'
and the target PG reaching the merge point. We use shared_promises within a 'merge_waiter' registry to allow the
target PG to suspend execution (via wait_for_merge_sources) until all required sources are registered.
Key components include:
- merge_waiter: A registry for tracking arrival of source
PGs and managing synchronization promises.
- register_merge_source: Logic to extract a PG from its current shard
and transport it to the target shard.
- apply_register_source: Handles the race condition between source
arrival and target readiness using shared_promises.
- wait_for_merge_sources: The suspension point where a target PG
awaits its source PGs.
- perform_source_cleanup: A safety mechanism to return source PGs
to their birth shards for destruction.
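The source/target race handled by the merge waiter can be illustrated with a small asyncio model. This is only a sketch under assumptions: MergeWaiter, register_source and the string PG ids are hypothetical stand-ins, an asyncio future plays the role of the shared_promise, and cross-shard transport is left out entirely.

```python
import asyncio


class MergeWaiter:
    """Toy registry: sources check in, the target waits until all have arrived."""

    def __init__(self):
        self._sources = {}  # target id -> {source id: pg}
        self._waiting = {}  # target id -> (expected source ids, future)

    def register_source(self, target, source_id, pg):
        # A source may arrive before or after the target starts waiting.
        self._sources.setdefault(target, {})[source_id] = pg
        self._maybe_resolve(target)

    async def wait_for_merge_sources(self, target, expected):
        fut = asyncio.get_running_loop().create_future()
        self._waiting[target] = (set(expected), fut)
        self._maybe_resolve(target)  # covers the case where sources came first
        return await fut

    def _maybe_resolve(self, target):
        waiter = self._waiting.get(target)
        if waiter is None:
            return  # target has not reached the merge point yet
        expected, fut = waiter
        arrived = self._sources.get(target, {})
        if expected.issubset(arrived):
            del self._waiting[target]
            fut.set_result(self._sources.pop(target))
```

Whichever side reaches the registry last resolves the future, so the target resumes only once every expected source has been registered.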
This function ensures that when a PG that is being removed or merged
calls stop(), it clears primary state and notifies the Monitor to clear
any pending merge flags.
crimson/osd/shard_services: inherit from peering_sharded_service
Update ShardServices to inherit from seastar::peering_sharded_service.
This allows the service to access its own sharded container directly
via container() rather than manually storing a reference to it.
crimson/osd: Add functions to notify mon when PGs are ready to merge
A PG is in the pending merge state when its placement seed is >=
pg_num_pending and < pg_num. In that state IO is paused, and once the PG
peers we notify the mon that we are idle and safe to merge.
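The range check above amounts to the following. This is a hedged Python sketch with hypothetical function names; the `peered` flag stands in for whatever state the OSD actually consults before notifying the mon.

```python
def is_pending_merge_source(ps: int, pg_num: int, pg_num_pending: int) -> bool:
    """True when the placement seed lies in [pg_num_pending, pg_num)."""
    return pg_num_pending <= ps < pg_num


def should_notify_mon(ps: int, pg_num: int, pg_num_pending: int,
                      peered: bool) -> bool:
    """Report readiness only for a pending-merge PG that has finished peering."""
    return is_pending_merge_source(ps, pg_num, pg_num_pending) and peered
```

With pg_num = 8 and pg_num_pending = 7, only seed 7 falls in the range, and it would be reported once it has peered.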
Ville Ojamo [Mon, 5 Jan 2026 06:10:45 +0000 (13:10 +0700)]
doc: Remove sphinxcontrib-seqdiag Python package from RTD builds
This is a proactive PR to avoid breaking docs builds when Setuptools 81
starts to be used in the RTD builds process.
The sphinxcontrib-seqdiag Python package is not compatible with
Setuptools 81 or later due to use of pkg_resources:
https://setuptools.pypa.io/en/latest/pkg_resources.html
Setuptools 81 release should be imminent, with the Python deprecation
warning stating pkg_resources "removal as early as 2025-11-30".
Seqdiag seems to be unmaintained: the latest update on PyPI is from
2021 and the seqdiag git repo has no updates either.
There are no seqdiag directives left in the docs after the last seqdiags
were removed in PR #52308.
Two other options exist for fixing the situation (see PR for
discussion) but this seems to be the most suitable one.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Ville Ojamo [Mon, 5 Jan 2026 05:46:08 +0000 (12:46 +0700)]
doc/rados: Fix Sphinx warnings in troubleshooting/log-and-debug.rst
Possibly controversial Sphinx fixes:
Fix two Sphinx warnings about more than one confval directive.
Remove the dupe confval directives from log-and-debug.rst and leave only
in mon-config-ref.rst because it has the only copy of a related clog
configuration value mon_health_to_clog_tick_interval.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Kefu Chai [Wed, 31 Dec 2025 09:01:43 +0000 (17:01 +0800)]
common/options: document log_to_stderr's conditional default value
The default value of `log_to_stderr` varies depending on whether Ceph
runs as a daemon or a library. Previously, this was only documented via
the `default` property, which led to confusion when debugging client
applications.
For example, when debugging a CephFS client, setting `debug <subsystem> = 5`
in the configuration file doesn't produce visible debug logs as expected.
This occurs because `common_preinit()` overrides `log_to_stderr` to `false`
when Ceph runs as a library, preventing logs from appearing on stderr.
This commit adds clarification to the `long_desc` field to document this
conditional behavior and help users understand why debug output may not
appear in client scenarios.
Aashish Sharma [Wed, 10 Dec 2025 10:41:50 +0000 (16:11 +0530)]
mgr/dashboard: fix multi-cluster context switcher
The multi-cluster context switcher stopped working because of a
regression caused by this PR https://github.com/ceph/ceph/pull/66034.
This PR fixes the issue.
Afreen Misbah [Mon, 15 Dec 2025 15:53:44 +0000 (21:23 +0530)]
mgr/dashboard: Fix display of IP address in host page
- Host data is merged with the hosts' facts, which do not include the address, so the address is not displayed in the UI
- Hence the value is also empty in the API
- Caused by https://github.com/ceph/ceph/pull/65102
rgw/dedup: Prevent the dup-counter from wrapping around after it reaches 64K of identical copies.
Limit dedup from a single SRC to 128 Target copies to prevent OMAP size
from growing out of control
Tests cleanup
Kefu Chai [Wed, 24 Dec 2025 08:57:12 +0000 (16:57 +0800)]
test/run-cli-tests: install wheel before cram to fix build failure
Fix the run-cli-tests failure that occurs when installing cram from git.
The error happens because the fresh venv lacks build dependencies, causing
pip to fall back to legacy setup.py install which fails:
Using legacy 'setup.py install' for cram, since package 'wheel' is not installed.
Installing collected packages: cram
Running setup.py install for cram: started
error: subprocess-exited-with-error
× Running setup.py install for cram did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Running setup.py install for cram: finished with status 'error'
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> cram
The issue became visible after commit 70880723eaa updated the pip URL
format to the new PEP 440 style, which exposed the missing build tools.
Solution: Upgrade pip, setuptools, and wheel before installing cram to
ensure proper wheel-based installation works correctly with Python 3.13
and modern pip versions.
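The install order described above can be sketched as follows. The real change lives in a shell script; this Python model only shows the sequence of pip invocations. venv_bootstrap_commands is a hypothetical helper, and cram_requirement stands in for the requirement string the script actually uses, which is not reproduced here.

```python
def venv_bootstrap_commands(python: str, cram_requirement: str) -> list[list[str]]:
    """Return the pip invocations to run, in order, inside a fresh venv."""
    return [
        # First make sure the build tooling is current, so pip can build a wheel.
        [python, "-m", "pip", "install", "--upgrade", "pip", "setuptools", "wheel"],
        # Only then install cram itself.
        [python, "-m", "pip", "install", cram_requirement],
    ]
```

Each returned list could be passed to subprocess.run(..., check=True) with the venv's interpreter as `python`.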
Kefu Chai [Wed, 24 Dec 2025 05:55:26 +0000 (13:55 +0800)]
debian/control: add iproute2 to build dependencies
Test scripts like qa/tasks/cephfs/mount.py expect the ip command to be
available in the container environment. Without it, tests fail with:
```
/bin/bash: line 1: ip: command not found
File "/ceph/qa/tasks/cephfs/mount.py", line 96, in cleanup_stale_netnses_and_bridge
p = remote.run(args=['ip', 'netns', 'list'],
...
teuthology.exceptions.CommandFailedError: Command failed with status 127: 'ip netns list'
```
Add iproute2 to the debian package build dependencies when the
<pkg.ceph.check> build profile is enabled. This ensures the package is
available during container-based builds, since buildcontainer-setup.sh
→ script/run-make.sh → install-deps.sh → debian/control → generated
dependency package chain respects build profiles configured via
`FOR_MAKE_CHECK` and `WITH_CRIMSON` environment variables set in
Dockerfile.build.
Imran Imtiaz [Mon, 8 Dec 2025 07:59:03 +0000 (07:59 +0000)]
mgr/dashboard: add CRUD API endpoints for consistency group snapshots
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com>
Fixes: https://tracker.ceph.com/issues/74258
Create a set of consistency group dashboard API endpoints to:
- List group snapshots
- Get details about a particular snapshot
- Create a snapshot
- Delete a snapshot
Nizamudeen A [Mon, 22 Dec 2025 08:49:00 +0000 (14:19 +0530)]
mgr/dashboard: upgrade angular to 19
* bump nodejs to 22.21.1
* remove swagger-ui from the package.json and import the bundled version
of it, which is `swagger-ui-dist`. This removes the dependencies on
react redux that are brought in by swagger-ui and also reduces the build
assets and build warnings. We really don't need the whole swagger-ui
package to be present here. Also import the swagger-ui.css lazily inside
the api-docs component.
Since our project is now under nx, the upgrade was done using the nx
migrate command. It took care of adding `standalone: false` to all our
files since we are still on modular architecture.
Other changes include
- adding `flush()` to fakeAsync mock test
- fixing some complaints raised by tsc linter as per the new typescript
type checks
- removed `this` from html components
- fixed jest config for newer presets
Kefu Chai [Thu, 18 Dec 2025 08:41:49 +0000 (16:41 +0800)]
cmake: build static seastar for release builds
When BUILD_SHARED_LIBS is set, seastar inherits this setting from the
parent CMake project, causing crimson to link against libseastar.so.
While this works in development environments, it breaks package
installation because libseastar.so is not included in the distribution.
Force seastar to build as a static library regardless of the parent
project's BUILD_SHARED_LIBS setting. This fixes the packaging issue
and provides a modest performance improvement by eliminating PLT/GOT
indirection overhead for seastar function calls.
cmake: When building against the fio headers, the BITS_PER_LONG macro defined by fio's own build is not used in our CMake-based build system for plugins.
Fixes: https://tracker.ceph.com/issues/74182
Signed-off-by: T K Chandra Hasan <t.k.chandra.hasan@ibm.com>
David Galloway [Tue, 16 Dec 2025 22:08:00 +0000 (17:08 -0500)]
install-deps: Replace apt-mirror
apt-mirror.front.sepia.ceph.com has happened to always work because we set up CNAMEs to gitbuilder.ceph.com.
That host is making its way to a new home upstate (literally and figuratively) so we'll get rid of the front subdomain since it's publicly accessible anyway and add TLS while we're at it.
Signed-off-by: David Galloway <david.galloway@ibm.com>
Casey Bodley [Thu, 11 Dec 2025 19:19:01 +0000 (14:19 -0500)]
osdc: remove implicit LingerOp reference between watch/unwatch
before this change set, linger_register() returned a raw LingerOp
pointer with an implicit reference for the caller. for librados,
this implicit reference is only dropped when the corresponding
unwatch() calls linger_cancel()
after commit 94f42b648feea77bd09dc3fdb48e6db2b48c7717 introduced
linger_by_cookie(), unwatch() no longer has a safe way to drop this
implicit reference. to prevent LingerOp leaks when unwatch() returns
ENOTCONN, we can't hold this implicit reference count until unwatch()
linger_register() now returns an explicit reference to the caller as
intrusive_ptr<LingerOp>. this helps to guarantee that this reference
count gets dropped before the completion of watch()/aio_watch()
because linger_register() no longer acquires an implicit reference for
the caller, linger_cancel() no longer drops it with info->put()
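The ownership change can be modelled with a toy reference counter. This is a Python illustration only, not the Objecter code: the class names echo the commit message, Ref is a stand-in for intrusive_ptr<LingerOp>, and the registry holding its own reference is an assumption made for the sketch.

```python
class LingerOp:
    """Toy refcounted object with explicit get()/put()."""

    def __init__(self):
        self.refs = 0

    def get(self):
        self.refs += 1

    def put(self):
        assert self.refs > 0
        self.refs -= 1


class Ref:
    """Scoped reference: takes a ref on creation, drops it when the scope exits."""

    def __init__(self, op):
        self.op = op
        op.get()

    def __enter__(self):
        return self.op

    def __exit__(self, *exc):
        self.op.put()
        return False


class Objecter:
    def __init__(self):
        self.linger_ops = {}
        self._next_cookie = 1

    def linger_register(self):
        op = LingerOp()
        op.get()  # assumption: the registry keeps its own reference
        cookie = self._next_cookie
        self._next_cookie += 1
        self.linger_ops[cookie] = op
        return cookie, Ref(op)  # explicit reference handed to the caller

    def linger_cancel(self, cookie):
        # Drops only the registry's reference; there is no caller ref to put().
        self.linger_ops.pop(cookie).put()


def watch(objecter):
    cookie, ref = objecter.linger_register()
    with ref:
        pass  # submit the watch while holding the explicit reference
    return cookie  # the caller's reference is already gone at this point
```

In this model the count is back to the registry's single reference as soon as watch() returns, so a later unwatch() that fails early has nothing extra to release.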
Casey Bodley [Thu, 11 Dec 2025 16:34:00 +0000 (11:34 -0500)]
librados: aio_unwatch() delivers ENOTCONN to AioCompletion
94f42b648feea77bd09dc3fdb48e6db2b48c7717 added a new error condition to
IoCtx::aio_unwatch() that callers aren't prepared to handle. instead of
returning that error directly, report it asynchronously to the
AioCompletion
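A rough Python model of the behaviour change is shown below. These are toy classes, not the librados bindings: treating an unknown cookie as the ENOTCONN condition is an assumption, and poll() is a hypothetical stand-in for the asynchronous completion path.

```python
import errno


class AioCompletion:
    def __init__(self):
        self.rc = None  # stays None until the completion fires

    def complete(self, rc):
        self.rc = rc


class IoCtx:
    def __init__(self):
        self.watches = set()
        self._pending = []  # deferred completion callbacks

    def aio_unwatch(self, cookie, completion):
        if cookie not in self.watches:
            # Assumed error condition: report it through the completion.
            rc = -errno.ENOTCONN
        else:
            self.watches.discard(cookie)
            rc = 0
        self._pending.append(lambda: completion.complete(rc))
        return 0  # the call itself succeeds; the result arrives later

    def poll(self):
        pending, self._pending = self._pending, []
        for callback in pending:
            callback()
```

Callers that only check the completion's result therefore see the error without having to handle a new return value from aio_unwatch() itself.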