Afreen Misbah [Mon, 15 Dec 2025 15:53:44 +0000 (21:23 +0530)]
mgr/dashboard: Fix display of IP address in host page
- Hosts data is merged with the hosts' facts, which do not include the address, so it is not displayed in the UI
- Hence the value is empty in the API
- Caused by https://github.com/ceph/ceph/pull/65102
Imran Imtiaz [Thu, 8 Jan 2026 10:37:32 +0000 (10:37 +0000)]
mgr/dashboard: fix RBD mirror schedule inheritance in pool and image APIs
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com> Fixes: https://tracker.ceph.com/issues/74494
Fix the bug where the Pool API was reporting random image schedules
instead of pool schedules. Implement proper schedule inheritance
hierarchy (Image > Pool > Cluster) for both Pool and Image APIs.
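The inheritance order described here can be sketched as follows. This is a minimal illustration only, not the dashboard's code: the function names and the use of plain strings for schedules are assumptions.

```python
from typing import Optional


def effective_schedule(image: Optional[str],
                       pool: Optional[str],
                       cluster: Optional[str]) -> Optional[str]:
    """Return the most specific schedule that is set.

    Precedence is Image > Pool > Cluster, as stated in the commit message.
    The schedule representation (a plain string) is a simplification.
    """
    for candidate in (image, pool, cluster):
        if candidate is not None:
            return candidate
    return None


def pool_level_schedule(pool: Optional[str],
                        cluster: Optional[str]) -> Optional[str]:
    """Hypothetical helper: a pool report never consults image schedules."""
    return effective_schedule(None, pool, cluster)
```

With this shape, a pool with no schedule of its own reports the cluster schedule, and an image schedule can never leak into a pool report.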
Nitzan Mordechai [Thu, 13 Nov 2025 14:03:58 +0000 (14:03 +0000)]
qa/workunits: add Rocky Linux support to librados tests
Add Rocky Linux to the list of supported RPM-based distributions in
test_librados_build.sh and version_number_sanity.sh. Rocky Linux uses
the same package names and commands as CentOS/RHEL, so it can use the
existing RPM codepath.
Without this change, the tests fail on Rocky Linux systems with
"unknown distro" errors.
Leonid Chernin [Mon, 8 Dec 2025 20:54:44 +0000 (22:54 +0200)]
nvmeofgw: prevent map corruption while processing beacons from deleted gws
Fix a race that corrupts the map when a deleted gw keeps sending beacons
after its data was removed from the pending map but still exists in the main map.
Process beacons only if GW's data exists in both maps:
main-map and pending-map, otherwise just ignore beacons.
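The guard described above amounts to a membership check against both maps. A minimal sketch, assuming the maps can be modelled as dictionaries keyed by a gateway id (the real monitor structures are C++ and differ):

```python
def should_process_beacon(gw_id: str, main_map: dict, pending_map: dict) -> bool:
    """Process a beacon only if the gateway is present in both maps.

    `main_map` and `pending_map` are hypothetical stand-ins for the
    monitor's committed and pending gateway maps.
    """
    return gw_id in main_map and gw_id in pending_map
```

A gateway that was deleted from the pending map but is still in the main map fails the check, so its beacons are ignored.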
Samuel Just [Thu, 6 Nov 2025 23:54:50 +0000 (23:54 +0000)]
mon: add NVMEOF_BEACON_DIFF to mon_feature_t and mon CompatSet
In order for the client to safely send BEACON_DIFF messages, it
needs to be the case that the leader at the time of receipt will
support BEACON_DIFF.
Simply using the connection features for the MonClient's target mon is
insufficient, because it might be a peon. If the peon supports
BEACON_DIFF and the leader does not, the leader will either crash or
interpret it as a full BEACON. Neither outcome is acceptable.
Instead, we need to wire up a feature bit to the MonMap mon_feature_t
members and the CompatSet.
Adding FEATURE_BEACON_DIFF to ceph::features::mon get_supported()
and get_persistent() ensures that once all monitors in the quorum
support it, MonMap::get_required_features() will include it.
See Elector::propose_to_peers, Monitor::(win|lose)_election,
MonmapMonitor::apply_mon_features.
Once FEATURE_BEACON_DIFF is present in MonMap::get_required_features():
- Monitor::apply_monmap_to_compatset_features() will prevent
downgrades of the monitors by updating the CompatSet to include
CEPH_MON_FEATURE_INCOMPAT_NVMEOF_BEACON_DIFF
- Monitor::calc_quorum_requirements() will set
Monitor::required_features to require the NVMEOF_BEACON_DIFF
for any monitor peers.
- MonClient::get_monmap_required_features() will eventually include
ceph::features::mon::FEATURE_NVMEOF_BEACON_DIFF.
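The reasoning above is that a feature becomes required only once every monitor in the quorum supports it. A rough sketch of that idea, assuming features are modelled as integer bit masks; the bit positions and function names are hypothetical and do not mirror the mon_feature_t implementation:

```python
import operator
from functools import reduce
from typing import Sequence

# Hypothetical bit positions, for illustration only.
FEATURE_NVMEOF_BEACON_DIFF = 1 << 0
FEATURE_OTHER = 1 << 1


def quorum_required_features(supported_by_mon: Sequence[int]) -> int:
    """Features common to every monitor in the quorum (bitwise AND)."""
    if not supported_by_mon:
        return 0
    return reduce(operator.and_, supported_by_mon)


def client_may_send_beacon_diff(required_features: int) -> bool:
    """The client keys off the required set, not one connection's features."""
    return bool(required_features & FEATURE_NVMEOF_BEACON_DIFF)
```

Because the decision depends on the required set rather than on the features of whichever monitor the client happens to be connected to, a peon that supports the feature cannot mislead the client while the leader does not.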
Leonid Chernin [Mon, 15 Sep 2025 11:04:04 +0000 (14:04 +0300)]
nvmeofgw: beacon diff implementation in the monitor and in the MonClient.
- monclient encodes subsystems by beacon-diff rules if the BEACON_DIFF
  bit is enabled by the quorum
- monitor processes beacons using the new beacon-diff schema
- monitor detects a sequence out-of-order (ooo) condition and handles it
- when ooo is detected, monitor sends an ack to the gw with the expected correct sequence
- monitor skips failovers for some interval when ooo is detected
- monitor ignores all beacons with incorrect sequences until the gw sends the expected one
- coding upgrade rules
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> Fixes: https://tracker.ceph.com/issues/72394
(cherry picked from commit 3555a28e45c5b44289f12abe2fc843e21c7ebf87)
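The out-of-order handling listed in this commit can be sketched roughly as below. This is an assumption-laden illustration: the class, field names, and the choice to start sequences at 0 and increment by 1 are hypothetical, and the failover-skip interval is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GwSeqState:
    """Hypothetical per-gateway sequence tracking."""
    expected_seq: int = 0
    out_of_order: bool = False


def handle_beacon(state: GwSeqState, seq: int) -> Optional[int]:
    """Return None if the beacon is accepted.

    Otherwise return the sequence number the monitor would ack back to
    the gateway as the one it expects.
    """
    if seq != state.expected_seq:
        # Ignore the beacon and remember that we are out of order.
        state.out_of_order = True
        return state.expected_seq
    state.out_of_order = False
    state.expected_seq += 1
    return None
```

Every beacon with an unexpected sequence is dropped and answered with the expected value until the gateway sends the right one, at which point normal processing resumes.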
Ville Ojamo [Fri, 16 Jan 2026 09:43:31 +0000 (16:43 +0700)]
doc/radosgw: change all intra-docs links to use ref (2 of 6)
Part 2 of 6 to make backporting easier. Depends on part 1.
Use the ref role for all remaining links in doc/radosgw/ with the
exception of config-ref.rst which will depend on changes to rgw.yaml.in.
The external link definitions syntax being removed is intended for
linking to external websites and not for intra-docs links. Validity of
ref links will be checked during the docs build process.
Add labels for link targets if necessary.
Remove unused external link definitions in the modified files.
Use confval instead of literal text for 2 configuration keys in
vault.rst.
Ilya Dryomov [Fri, 23 Jan 2026 13:48:53 +0000 (14:48 +0100)]
qa: don't assume that /dev/sda or /dev/vda is present in unmap.t
Instead of hard-coding the block device name, use the block device that
is backing the filesystem that the test is running on. We can be quite
sure it won't be an RBD device ;)
Ilya Dryomov [Wed, 21 Jan 2026 18:41:41 +0000 (19:41 +0100)]
qa: krbd_blkroset.t: eliminate a race in the open_count test
Even at QD=1, dd may take less than 10 seconds to work its way to the
end of a 10M image, producing "No space left on device" error instead
of the expected "Operation not permitted" error which is supposed to
arise from the device getting marked read-only while opened.
Disable OSD bench from benchmarking the OSDs for teuthology tests. This is to
help prevent a cluster warning pertaining to the IOPS value not lying within
a typical threshold range from being raised.
The tests can rely on the built-in static values as defined by
osd_mclock_max_capacity_iops_[ssd|hdd] which should be good enough.
Ville Ojamo [Fri, 16 Jan 2026 08:55:27 +0000 (15:55 +0700)]
doc/radosgw: change all intra-docs links to use ref (1 of 6)
Part 1 of 6 to make backporting easier. Many of the following parts
depend on this.
Use the ref role for all remaining links in doc/radosgw/ with the
exception of config-ref.rst which will depend on changes to rgw.yaml.in.
The external link definitions syntax being removed is intended for
linking to external websites and not for intra-docs links. Validity of
ref links will be checked during the docs build process.
Add labels for link targets if necessary.
Remove unused external link definitions in the modified files.
Use confval instead of literal text for 2 configuration keys in
vault.rst.
Use Ceph Object Gateway consistently in multisite.rst.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Gil Bregman [Mon, 19 Jan 2026 12:18:03 +0000 (14:18 +0200)]
mgr/cephadm: Add some new fields to the cephadm NVMEoF spec file.
Fixes: https://tracker.ceph.com/issues/74446 Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit e872693c151842ea8d6142effe65e604acecf8b8)
Aashish Sharma [Wed, 17 Dec 2025 09:21:14 +0000 (14:51 +0530)]
monitoring: make cluster matcher backward compatible for pre-7.1 metrics
Ceph 18.* adds a `cluster` label to all Prometheus metrics. When
upgrading from earlier releases, historical metrics lack this label
and are excluded by Grafana queries that strictly match on `cluster`.
Update the shared Grafana matcher logic to use a regex matcher that
also matches series without the `cluster` label, restoring visibility
of pre-upgrade metrics while preserving multi-cluster behavior.
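One way to picture such a matcher: in PromQL a missing label is treated as an empty string and regex matchers are fully anchored, so a pattern with an empty alternative (for example `cluster=~"mycluster|"`) also selects series that have no `cluster` label. The sketch below emulates that selection rule in Python; the helper name is hypothetical, the cluster name is assumed to contain no regex metacharacters, and this is not the actual Grafana matcher code.

```python
import re


def series_selected(series_labels: dict, cluster: str) -> bool:
    """Emulate a PromQL matcher of the form cluster=~"<cluster>|".

    A missing label is read as "", and the regex must match the whole value.
    """
    pattern = f"{cluster}|"
    value = series_labels.get("cluster", "")
    return re.fullmatch(pattern, value) is not None
```

Series from other clusters are still excluded, which is what preserves multi-cluster behaviour.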
Alex Ainscow [Sun, 18 Jan 2026 22:13:54 +0000 (22:13 +0000)]
osd: Fix memory leak of ECDummyOp
Upon a pg falling idle, an ECDummy op is immediately generated.
This op causes the pg log to be committed. This op gets added to
the tid_to_op_map, however it does not get removed until the
interval ends.
The lack of removal is essentially a temporary "leak" and since the
op data structure is quite big, this can add up to significant
amounts of memory in a heavily loaded system.
The fix is simple: add the op to the waiting list, so that it
gets cleaned up when the op is finished.
Fixes: https://tracker.ceph.com/issues/74433 Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 5899535841c45468633bccd78859d28798c2fba7)
Nitzan Mordechai [Tue, 18 Nov 2025 09:37:48 +0000 (09:37 +0000)]
Objecter: respect higher epoch subscription in tick
The OSD and Objecter share the same MonClient. During preboot, a potential
race condition exists where the OSD subscribes to osdmap epoch X, while
the Objecter subscribes to epoch X - 1.
The Objecter's subscription overrides the OSD's subscription. Consequently,
the monitor ignores the request (as it believes the OSD already has the
older map), causing the OSD to hang during preboot.
To fix this, check if a higher epoch is already subscribed before calling
_maybe_request_map during Objecter::tick. If a higher epoch is found,
maintain the existing subscription.
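The check can be sketched as below. This is an illustration under stated assumptions: the shared subscription state is modelled as a dictionary with an "osdmap" key, and the function name is hypothetical rather than the Objecter's real interface.

```python
def maybe_request_map(subscriptions: dict, wanted_epoch: int) -> bool:
    """Subscribe to `wanted_epoch` unless a higher epoch is already requested.

    Returns True if the subscription was (re)written, False if the
    existing, higher subscription was kept.
    """
    current = subscriptions.get("osdmap")
    if current is not None and current > wanted_epoch:
        # Another user of the shared MonClient already asked for a newer
        # map; do not override it with an older epoch.
        return False
    subscriptions["osdmap"] = wanted_epoch
    return True
```

In the preboot scenario, the OSD's subscription to epoch X survives a later tick that only wants epoch X - 1.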
Conflicts:
src/python-common/ceph/cephadm/images.py (conflicts with some
new images from the main branch; we just need to change the grafana image's
version)
Afreen Misbah [Tue, 6 Jan 2026 10:47:16 +0000 (16:17 +0530)]
mgr/dashboard: Add full page tearsheet component
Fixes https://tracker.ceph.com/issues/74327
- added "full" page tearsheet
- the full page tearsheet uses a cancel confirmation modal hence added that as well
- as per latest carbon guidelines for tearsheet https://carbondesignsystem.com/community/patterns/create-flows/#anatomy-of-a-full-page
- not added - influencer title and toggle (should be added as per reqs)
Afreen Misbah [Mon, 29 Dec 2025 04:51:36 +0000 (10:21 +0530)]
mgr/dashboard: Add generic wizard component
Fixes https://tracker.ceph.com/issues/74291
- made on top of carbon modal
- carbon design system used - wide tearsheet
- added a step component as well to support navigation code
- added unit tests
Signed-off-by: Afreen Misbah <afreen@ibm.com>
(cherry picked from commit 132a7259c90659eb431b73cbe69ed85cebfa50d4)
- fixes linter errors for scss - alphabetical order
Imran Imtiaz [Wed, 24 Dec 2025 10:14:53 +0000 (10:14 +0000)]
mgr/dashboard: add CRUD API endpoints for consistency group snapshots 2/2
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com> Fixes: https://tracker.ceph.com/issues/74275
Create a consistency group dashboard API endpoint to:
Imran Imtiaz [Mon, 8 Dec 2025 07:59:03 +0000 (07:59 +0000)]
mgr/dashboard: add CRUD API endpoints for consistency group snapshots
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com> Fixes: https://tracker.ceph.com/issues/74258
Create a set of consistency group dashboard API endpoints to:
- List group snapshots
- Get details about a particular snapshot
- Create a snapshot
- Delete a snapshot
Imran Imtiaz [Fri, 12 Dec 2025 10:02:59 +0000 (10:02 +0000)]
mgr/dashboard: add API endpoint to delete consistency group
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com> Fixes: https://tracker.ceph.com/issues/74201
Add a dashboard API endpoint to delete a consistency group.
Imran Imtiaz [Mon, 1 Dec 2025 14:25:07 +0000 (14:25 +0000)]
mgr/dashboard: add API endpoint to delete images from consistency groups
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com> Fixes: https://tracker.ceph.com/issues/74033
Create a consistency group dashboard API endpoint that enables removal
of RBD images from the group.
Imran Imtiaz [Thu, 20 Nov 2025 14:45:32 +0000 (14:45 +0000)]
mgr/dashboard: add GET API endpoint for consistency groups
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com> Fixes: https://tracker.ceph.com/issues/73942
Add a consistency group dashboard API endpoint to get the list of images
in the consistency groups that match the namespace of the group.
Imran Imtiaz [Thu, 13 Nov 2025 10:27:28 +0000 (10:27 +0000)]
mgr/dashboard: add API endpoint to add images to consistency groups
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com> Fixes: https://tracker.ceph.com/issues/73840
Create a consistency group dashboard API endpoint that enables adding
RBD images to the group.
Imran Imtiaz [Wed, 12 Nov 2025 14:04:44 +0000 (14:04 +0000)]
mgr/dashboard: add API endpoint to create consistency groups
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com> Fixes: https://tracker.ceph.com/issues/73821
Add the ability to create a consistency group via the Dashboard API.
Afreen Misbah [Tue, 21 Oct 2025 16:37:46 +0000 (22:07 +0530)]
mgr/dashboard: Generalized errors and validations in forms
Fixes https://tracker.ceph.com/issues/73901
- added a validation directive - `cdValidate` which can be used to set [invalid] form fields
- also added generic template for showing error messages in user password form
- user password form updates that
Nizamudeen A [Thu, 6 Nov 2025 04:53:47 +0000 (10:23 +0530)]
mgr/dashboard: start node virtual-env after starting ceph cluster
In the frontend e2e.sh file, we don't need to start the node venv early on,
before the ceph cluster is started. We only need it for the `npm` or
`npx` commands. Starting the node virtual env and then starting ceph
causes the ceph cluster to treat the node-env python as its python
environment, which breaks the cryptotools call.
So move the node-env venv start to after the ceph cluster is created.
The alert CephPGImbalance doesn't take any device classes configured into account. As a result, there can be false positives when using mixed-size OSD disks.
Ref: https://github.com/rook/rook/discussions/13126#discussioncomment-10043490
John Mulligan [Fri, 25 Apr 2025 15:22:26 +0000 (11:22 -0400)]
mgr/dashboard: add an option to control the dashboard crypto caller
Add a mgr config option `crypto_caller` that lets a ceph user override
the default behavior of using the remote crypto caller. Supported
values are `internal` and `remote`.
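A minimal sketch of how such an option might be validated, assuming a plain string value. The two supported values and the `remote` default come from the commit message; the function and constant names are hypothetical.

```python
from typing import Optional

SUPPORTED_CRYPTO_CALLERS = ("internal", "remote")
DEFAULT_CRYPTO_CALLER = "remote"  # default behaviour per the commit message


def resolve_crypto_caller(value: Optional[str]) -> str:
    """Return the crypto caller to use, falling back to the default."""
    if not value:
        return DEFAULT_CRYPTO_CALLER
    if value not in SUPPORTED_CRYPTO_CALLERS:
        raise ValueError(f"unsupported crypto_caller: {value!r}")
    return value
```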
John Mulligan [Fri, 25 Apr 2025 15:06:41 +0000 (11:06 -0400)]
mgr/cephadm: always use the internal cryptocaller
The cephadm module needs to use the python cryptography module for ssh (via
asyncssh) and thus there's no need to use the remote crypto caller in
cephadm. Configure cephadm to always use the internal cryptocaller.
John Mulligan [Fri, 25 Apr 2025 15:05:46 +0000 (11:05 -0400)]
python-common/cryptotools: catch all failures to read cert
Previously, the internal crypto caller would catch (and convert) some
errors when reading the cert but not all cases. Move the logic to catch
the errors to a common location and do it once consistently.
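The pattern of catching every read failure in one place can be sketched like this. The exception class, the wrapper, and the `loader` callable are all hypothetical and stand in for whatever cryptotools actually uses.

```python
from typing import Any, Callable


class CertReadError(Exception):
    """Hypothetical single error type reported for any cert read failure."""


def read_cert(loader: Callable[[str], Any], data: str) -> Any:
    """Run `loader` on `data`, converting any failure into CertReadError."""
    try:
        return loader(data)
    except Exception as err:
        # One common conversion point instead of per-call-site handling.
        raise CertReadError(f"failed to read certificate: {err}") from err
```

Callers then only need to handle one error type, regardless of which underlying exception the loader raised.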
John Mulligan [Thu, 24 Apr 2025 18:36:58 +0000 (14:36 -0400)]
python-common/cryptotools: unify and organize all endpoint functions
Lightly reorganize and make the "endpoint" functions in cryptotools.py more
consistent and uniform. Use small functions for input and output
handling so that the handling is done the same way throughout. Pass a
pre-constructed crypto caller via the args to the endpoint functions.
Make generating the private key its own named function rather than
one single (and only) function with overloaded behavior controlled by
a cli switch.