mgr/cephadm: add the VIP to the internal mgmt-gateway cert SAN list
Include the VIP as part of the mgmt-gateway internal server
certificate SAN list when operating in HA mode. Otherwise
the communication between internal services might fail.
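A minimal sketch of the idea, not the cephadm implementation: the function name `build_san_list` and its parameters are hypothetical, and the `DNS:`/`IP:` prefixes are only an illustrative way of writing SAN entries.

```python
import ipaddress
from typing import List, Optional


def build_san_list(hostnames: List[str], host_ips: List[str],
                   vip: Optional[str] = None) -> List[str]:
    """Return SAN entries for an internal server certificate.

    `vip` is the virtual IP used in HA mode (None when HA is off).
    All names here are hypothetical, for illustration only.
    """
    sans = [f"DNS:{name}" for name in hostnames]
    for ip in host_ips:
        # ip_address() validates the string and normalizes its form
        sans.append(f"IP:{ipaddress.ip_address(ip)}")
    if vip is not None:
        entry = f"IP:{ipaddress.ip_address(vip)}"
        if entry not in sans:
            # Without this entry, peers connecting through the VIP
            # would fail certificate hostname/IP verification.
            sans.append(entry)
    return sans
```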
Yuval Lifshitz [Sun, 12 Oct 2025 14:14:36 +0000 (14:14 +0000)]
rgw/logging: fix race condition when name update returns ECANCELED
* when the name set operation returns ECANCELED we should
bail out and not continue with the rollover
* this fix revealed a hidden bug: the existing temp name was not
checked during conf change cleanup (rollover)
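The bail-out in the first bullet can be sketched as follows. This is an illustration in Python rather than the RGW C++ code; `rollover`, `set_name` and `do_rollover` are hypothetical names, and the handling of other errors is an assumption.

```python
import errno
from typing import Callable


def rollover(set_name: Callable[[], int],
             do_rollover: Callable[[], None]) -> int:
    """set_name() returns 0 or a negative errno (hypothetical callback)."""
    ret = set_name()
    if ret == -errno.ECANCELED:
        # Someone else updated the name first: bail out and leave the
        # rollover to the winner of the race.
        return ret
    if ret < 0:
        # Assumption: any other error also aborts the rollover.
        return ret
    do_rollover()
    return 0
```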
Adam King [Fri, 10 Oct 2025 14:48:35 +0000 (10:48 -0400)]
mgr/orchestrator: stop passing "default_flow_style" flag to yaml dump
This seems to not be compatible with pyyaml 6.0
```
File "/lib/python3.12/site-packages/ceph/deployment/service_spec.py", line 1350, in __repr__
y = yaml.dump(cast(dict, self), default_flow_style=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib64/python3.12/site-packages/yaml/__init__.py", line 253, in dump
return dump_all([data], stream, Dumper=Dumper, **kwds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lib64/python3.12/site-packages/yaml/__init__.py", line 241, in dump_all
dumper.represent(data)
File "/lib64/python3.12/site-packages/yaml/representer.py", line 28, in represent
self.serialize(node)
File "/lib64/python3.12/site-packages/yaml/serializer.py", line 54, in serialize
self.serialize_node(node, None, None)
File "/lib64/python3.12/site-packages/yaml/serializer.py", line 104, in serialize_node
self.emit(MappingStartEvent(alias, node.tag, implicit,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Prepared.__init__() got an unexpected keyword argument 'flow_style'
```
Dropping the flag didn't seem to cause any issues with making our specs
look readable in the logs or with round-tripping specs
when using `ceph orch ls --export` (minus the known bug
around doing so with multi-line certs)
rgw/lc: At least wait for |rgw_lc_lock_max_time| while trying to fetch the lc-shard lock to get or update the bucket status.
Currently each LC worker tries for 1 second to get the lock on the lc_shard when deciding which bucket to process, and again for 1 second to update the bucket status once the bucket has been LC processed. However, when multiple RGWs are running LC, the shard is often locked by another LC worker, and when RADOS is slow the lock may not be acquired within 1 second. The worker then either skips processing the bucket or skips updating the bucket status, resulting in a missed LC run or a missed bucket status update.
So in the worst case, when another LC worker is already processing a shard, wait for rgw_lc_lock_max_time to get the lock, since any given worker can hold a given shard for at most rgw_lc_lock_max_time.
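The waiting behavior can be sketched as a retry loop bounded by a deadline. This is a Python illustration, not the RGW code; `acquire_with_deadline`, `try_lock` and the injectable `clock`/`sleep` parameters are hypothetical, with `max_wait` standing in for rgw_lc_lock_max_time.

```python
import time
from typing import Callable


def acquire_with_deadline(try_lock: Callable[[], bool],
                          max_wait: float,
                          retry_interval: float = 1.0,
                          clock: Callable[[], float] = time.monotonic,
                          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Retry try_lock() until it succeeds or max_wait seconds elapse."""
    deadline = clock() + max_wait
    while True:
        if try_lock():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            # The current holder cannot keep the shard longer than
            # max_wait, so giving up here means something else is wrong.
            return False
        sleep(min(retry_interval, remaining))
```

Passing `clock` and `sleep` in makes the loop testable without real delays.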
Signed-off-by: kchheda3 <kchheda3@bloomberg.net>
(cherry picked from commit 937ac626afd3bf443edf96aa177854e8eb291af5)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
rgw/lc: if the bucket's last LC processing time is less than the start time of the current LC session, then continue processing the bucket for LC even if the status is not in the initialized state.
Currently the logic inside expired_session() considers an LC session valid for almost 2-3 days. For a bucket whose status update after LC processing failed, the next LC session skips the bucket because expired_session() returns false, as it multiplies num_seconds_day by 2. Instead of hardcoding the logic to 2 days, store the start time of each LC session and compare the bucket update time with the lc_start time: if the bucket process time is less than the current LC start time, the bucket can be processed because the previous session has already expired.
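The comparison described above reduces to a small predicate. A sketch in Python, not the RGW code; the function and parameter names are hypothetical and timestamps are plain seconds.

```python
def should_process_bucket(status_initialized: bool,
                          bucket_last_processed: float,
                          session_start: float) -> bool:
    """Decide whether the current LC session should process a bucket."""
    if status_initialized:
        return True
    # The status was never reset (e.g. the status update after the last
    # run failed), but the bucket was last handled before this session
    # began, so the previous session has expired.
    return bucket_last_processed < session_start
```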
client: adjust `Fb` cap ref count check during synchronous fsync()
The cephfs client holds a ref on Fb caps when handing out a write delegation[0].
An fsync from a (Ganesha) client holding a write delegation will therefore block indefinitely[1]
waiting for the cap ref for Fb to drop to 0, which will never happen until the
delegation is returned/recalled.
If an inode has been write delegated, adjust the cap reference count
check in fsync() accordingly.
Note: This only works for synchronous fsync() since `client_lock` is
held for the entire duration of the call (at least along the path leading
up to the reference count check). Asynchronous fsync() needs to be fixed
separately (as that can drop `client_lock`).
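A sketch of the adjusted check, in Python rather than the client's C++. The function name is hypothetical, and it assumes the write delegation accounts for exactly one Fb reference.

```python
def fb_refs_block_fsync(fb_ref_count: int, write_delegated: bool) -> bool:
    """Return True while fsync() must keep waiting on Fb cap references."""
    # Assumption: a write delegation holds exactly one Fb ref that will
    # not be dropped until the delegation is returned or recalled.
    held_by_delegation = 1 if write_delegated else 0
    return fb_ref_count > held_by_delegation
```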
Nizamudeen A [Thu, 11 Sep 2025 05:29:47 +0000 (10:59 +0530)]
mgr/dashboard: improve search and pagination behavior
Add a throttle to the pagination cycle so that repeatedly cycling
through pages increases the delay. This is done because,
unlike search, a button click to change the page is deliberate and the
first click on the button should respond immediately.
In addition, a keyword search stored every keystroke typed
in the search field and, after the debounce interval, sent
all of those requests one by one.
For example, typing 222 waited 1s for the
debounce timer and then sent a request to find the osd with id 2 first, then
again 2 and then again 2. Instead it should only send 222 at the end.
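The intended debounce behavior can be sketched over a list of timestamped keystrokes. This is a Python model of the logic, not the dashboard's TypeScript; `debounce` is a hypothetical helper and times are integer milliseconds.

```python
from typing import List, Tuple


def debounce(events: List[Tuple[int, str]],
             interval_ms: int) -> List[Tuple[int, str]]:
    """Model a debounced search field.

    `events` holds (timestamp_ms, field_value) pairs sorted by time.
    A value is sent only if no newer keystroke arrives within
    `interval_ms`; the result holds (send_time_ms, value) pairs.
    """
    sent = []
    for i, (ts, value) in enumerate(events):
        is_last = i == len(events) - 1
        if is_last or events[i + 1][0] - ts >= interval_ms:
            sent.append((ts + interval_ms, value))
    return sent
```

Typing "2", "22", "222" within 200 ms yields a single request for "222" once the timer expires.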
Nizamudeen A [Thu, 11 Sep 2025 04:13:13 +0000 (09:43 +0530)]
mgr/dashboard: fix missing schedule interval in rbd API
Fetch the rbd image schedule interval through the rbd_support module's
schedule list command.
GET /api/rbd will have the following field per image:
```
"schedule_info": {
"image": "rbd/rbd_1",
"schedule_time": "2025-09-11 03:00:00",
"schedule_interval": [
{
"interval": "5d",
"start_time": null
},
{
"interval": "3h",
"start_time": null
}
]
},
```
Also fixes the UI, where the schedule interval was missing from the form, and
disables editing of the schedule_interval.
Extended the same thing to the `GET /api/pool` endpoint.
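A client consuming the `schedule_info` field shown above might read the intervals like this. A minimal sketch: the helper `schedule_intervals` is hypothetical, and the sample wraps the fragment from the commit message in braces to make it valid JSON.

```python
import json
from typing import List

SAMPLE = """
{
  "schedule_info": {
    "image": "rbd/rbd_1",
    "schedule_time": "2025-09-11 03:00:00",
    "schedule_interval": [
      {"interval": "5d", "start_time": null},
      {"interval": "3h", "start_time": null}
    ]
  }
}
"""


def schedule_intervals(image: dict) -> List[str]:
    """Return the interval strings of an image, or [] if unscheduled."""
    info = image.get("schedule_info") or {}
    return [entry["interval"] for entry in info.get("schedule_interval", [])]


intervals = schedule_intervals(json.loads(SAMPLE))
```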
Commit includes changes:
1) Renaming Topic to Notification destination
2) Renaming Tiering to Storage class
3) Renaming Users to User Management
4) fix storage class table refresh after delete
5) Also made changes to internal routing for topic and storage class
rgw/dedup: Grant dedup process full RGW permissions.
This is necessary to allow for the creation of intermediate SLAB objects on systems configured with Ceph authentication.
Fixes: https://tracker.ceph.com/issues/72894
Signed-off-by: Mark Kogan <mkogan@ibm.com>
Update PendingReleaseNotes
Co-authored-by: Yuval Lifshitz <yuvalif@yahoo.com>
Signed-off-by: Mark Kogan <31659604+mkogan1@users.noreply.github.com>
Update PendingReleaseNotes
Co-authored-by: Yuval Lifshitz <yuvalif@yahoo.com>
Signed-off-by: Mark Kogan <31659604+mkogan1@users.noreply.github.com>
(cherry picked from commit dae572d50080609c77d7131cfc99b1fb3f16d31b)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Resolves: rhbz#2393790
Rishabh Dave [Thu, 8 May 2025 15:05:39 +0000 (20:35 +0530)]
mgr/vol: keep clone source info even after cloning is finished
Instead of removing the information regarding source of a cloned
subvolume from the .meta file after the cloning has finished, keep it as
it is as the user may find it useful.
Justin Caratzas [Mon, 6 Oct 2025 23:25:44 +0000 (19:25 -0400)]
mgr/dashboard: add an option to control the dashboard crypto caller
Add a mgr config option `crypto_caller` that lets a ceph user override
the default behavior of using the remote crypto caller. Supported
values are `internal` and `remote`.
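How such an option might select an implementation, as a sketch only: the two classes and `make_crypto_caller` are hypothetical stand-ins, not the actual mgr code. The default of `remote` follows the description above.

```python
class InternalCryptoCaller:
    """Hypothetical: performs crypto operations in-process."""


class RemoteCryptoCaller:
    """Hypothetical: delegates crypto operations to an external process."""


def make_crypto_caller(option_value: str = "remote"):
    """Map the `crypto_caller` option value to a caller object."""
    callers = {
        "internal": InternalCryptoCaller,
        "remote": RemoteCryptoCaller,
    }
    try:
        return callers[option_value]()
    except KeyError:
        raise ValueError(
            f"unsupported crypto_caller value: {option_value!r}") from None
```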
Justin Caratzas [Mon, 6 Oct 2025 23:25:44 +0000 (19:25 -0400)]
mgr/cephadm: always use the internal cryptocaller
The cephadm module needs to use the Python cryptography module for ssh (via
asyncssh), so there's no need to use the remote crypto caller in
cephadm. Configure cephadm to always use the internal cryptocaller.
Justin Caratzas [Mon, 6 Oct 2025 23:25:44 +0000 (19:25 -0400)]
python-common/cryptotools: catch all failures to read cert
Previously, the internal crypto caller would catch (and convert) some
errors when reading the cert but not all cases. Move the logic to catch
the errors to a common location and do it once consistently.
Justin Caratzas [Mon, 6 Oct 2025 23:25:44 +0000 (19:25 -0400)]
python-common/cryptotools: unify and organize all endpoint functions
Lightly reorganize and make the "endpoint" functions in cryptotools.py more
consistent and uniform. Use small functions for input and output
handling so that the handling is done the same way throughout. Pass a
pre-constructed crypto caller via the args to the endpoint functions.
Make generating the private key its own named function rather than
one single (and only) function with overloaded behavior controlled by
a cli switch.
Justin Caratzas [Mon, 6 Oct 2025 23:25:44 +0000 (19:25 -0400)]
pybind/mgr: fix test case in test_tls.py
Why violate the typing in a test? mypy never noticed this because tests
are not type checked but there seems to be no need to turn a str into
bytes to pass to a function that is typed only as taking str!
Justin Caratzas [Mon, 6 Oct 2025 23:25:43 +0000 (19:25 -0400)]
python-common/cryptotools: fix error path in verify tls function
The remote verify_tls function was not raising errors when it should.
Fix the function so that it always returns an object when it succeeds or
fails gracefully. Always parse that function's result in the crypto caller class.
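One way to read "always returns an object" is a result envelope that the caller parses and turns back into an exception. The sketch below is an assumption about the shape of such a fix: the JSON keys, `VerifyError`, `remote_verify` and `parse_verify_result` are all hypothetical and not the cryptotools API.

```python
import json
from typing import Callable


class VerifyError(Exception):
    """Hypothetical error raised on the caller side."""


def remote_verify(check: Callable[[], int]) -> str:
    """Run check() and always return a JSON result object."""
    try:
        return json.dumps({"ok": True, "value": check()})
    except Exception as err:
        # Fail gracefully: report the error instead of crashing.
        return json.dumps({"ok": False, "error": str(err)})


def parse_verify_result(raw: str) -> int:
    """Caller side: parse the object and re-raise reported errors."""
    result = json.loads(raw)
    if not result.get("ok"):
        raise VerifyError(result.get("error", "unknown error"))
    return result["value"]
```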
Justin Caratzas [Mon, 6 Oct 2025 23:25:43 +0000 (19:25 -0400)]
python-common/cryptotools: create CryptoCaller interface class
Create a class to act as a common shim between the cryptotools external
functions and the mgr. It provides common conversion mechanisms and
could possibly act as an abstraction in case we decide to make
the external function calls in different ways in the future.
Justin Caratzas [Mon, 6 Oct 2025 23:25:43 +0000 (19:25 -0400)]
pybind/mgr: Hack around the 'ImportError: PyO3 modules may only be initialized once per interpreter process' issue.
Fixes: https://tracker.ceph.com/issues/64213
Signed-off-by: Paulo E. Castro <pecastro@wormholenet.com>
(cherry picked from commit 717d0a6f3530ad3e07f4423002810327b2addcf1)
doc: update Grafana certificate configuration to use certmgr
With the introduction of certmgr, users must register their certificates
via `ceph orch certmgr cert set --hostname ...` instead of the old
config-key method. The updated docs clarify that Grafana certificates
are host-scoped and can only be provided by reference (or default to
cephadm-signed).
doc: update RGW HTTPS configuration to use certmgr and new fields
With the introduction of certmgr, RGW services now support three
certificate sources: cephadm-signed (default), inline, and reference.
Docs have been updated to:
- Show how to provide inline certificates using the new ssl_cert/ssl_key
fields instead of the deprecated rgw_frontend_ssl_certificate.
- Explain how to register and reference user-provided certs/keys
- Clarify that cephadm-signed certificates remain the default, with
optional wildcard SANs support.
The usage of rgw_frontend_ssl_certificate is still supported for
backward compatibility, but is now documented as deprecated.
Remove the code used to migrate Grafana self-signed certificates, as
it is no longer needed. The certmgr logic now handles generating new
certificates during the upgrade, eliminating the need for any migration
code or logic.
Remove the special-case code used for RGW service migration, as it is no
longer needed. The certmgr logic now handles populating the certstore
with the corresponding certificate and key entries by reading their values
directly from the spec. During RGW service redeployment as part of the
upgrade, certmgr will ensure the certstore is updated accordingly.
mgr/cephadm: Fix RGW spec validation for deprecated rgw cert field
Starting from Tentacle, the rgw_frontend_ssl_certificate field has been
deprecated in favor of the new ssl_cert and ssl_key fields. Update the
validation logic to run after this field is automatically transformed into
the new fields, ensuring proper validation of RGW specs.
mgr/cephadm: Include mgmt-gateway/oauth2-proxy in upgrade process
Add the new mgmt-gateway and oauth2-proxy services to the list of
services upgraded by cephadm, ensuring they are updated alongside the
rest of the cephadm-managed services.
pybind/mgr/volumes: add getter and setter APIs for snapdir_visibility
Conflicts:
fscrypt changes exist downstream 01a4d2a0356e5f66b7260dad7de70a5fa9cc3aa7 but not upstream,
so it led to a conflict, kept both the changes in the branch.
client: check client config and snaprealm flag before snapdir lookup
This commit adds a new client config, client_respect_subvolume_snapshot_visibility,
which acts as a knob for per-client control over snapshot visibility, and
checks it along with the snaprealm flag while looking up a subvolume inode.
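The combined check might look like the following. A Python sketch of the decision only; the function name is hypothetical, and the exact semantics (the flag hides the snapdir only for clients that opt in) are an assumption.

```python
def snapdir_lookup_allowed(client_respects_visibility: bool,
                           realm_snapdir_visible: bool) -> bool:
    """Decide whether a snapdir lookup on a subvolume inode may proceed.

    `client_respects_visibility` models the client config
    client_respect_subvolume_snapshot_visibility; `realm_snapdir_visible`
    models the flag carried by the snaprealm.
    """
    if not client_respects_visibility:
        # Assumption: clients that do not opt in ignore the flag.
        return True
    return realm_snapdir_visible
```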
Dhairya Parmar [Wed, 6 Aug 2025 21:32:05 +0000 (03:02 +0530)]
common,mds: transmit SNAPDIR_VISIBILITY flag via SnapRealmInfoNew
at the time of building snap trace
Conflicts:
upstream ed6b71246137f9793f2d56b4d050b271a3da29fd made changes to generate_test_instances()
which is not present downstream in ceph-9.0-rhel-patches, so had to adjust accordingly.
mds: rebuild snaprealm cache if last_modified or change_attr changed
For the server-side snapdir visibility changes to be transported to the
client, the SnapRealm cache needs to be rebuilt; otherwise the same metadata
would be sent via send_snap_update() in C_MDS_inode_update_finish() while
setting the `ceph.dir.subvolume.snaps.visible` vxattr.
The condition used to check `seq` and `last_destroyed` against their
cached values, but for the vxattr change, updating `seq` is impractically
heavy: it involves a set of steps to prepare the op, commit the op,
journal the changes and update snap-server/client(s), just
for a mere flag update (and updating last_destroyed doesn't make sense
for this case anyway). So, compare last_modified and change_attr with their cached
values to check whether the SnapRealm cache should be rebuilt. These values are
incremented in Server::handle_client_setvxattr while toggling the
snapshot visibility xattr, which forces a cache rebuild.
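The rebuild condition can be sketched as a field-by-field comparison. This is a Python illustration, not the MDS code; `RealmState` and `snaprealm_cache_stale` are hypothetical, and it assumes the new checks are added alongside the existing `seq`/`last_destroyed` checks.

```python
from dataclasses import dataclass


@dataclass
class RealmState:
    seq: int
    last_destroyed: int
    last_modified: int
    change_attr: int


def snaprealm_cache_stale(cached: RealmState, current: RealmState) -> bool:
    """Return True when the cached snap trace must be rebuilt."""
    return (cached.seq != current.seq
            or cached.last_destroyed != current.last_destroyed
            # New: a vxattr toggle bumps these without touching seq.
            or cached.last_modified != current.last_modified
            or cached.change_attr != current.change_attr)
```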
Conflicts:
upstream ed6b71246137f9793f2d56b4d050b271a3da29fd made changes to generate_test_instances()
which are not present downstream in ceph-9.0-rhel-patches, so had to adjust accordingly.
librbd: fix segfault when removing non-existent group
Removing a non-existent group triggers a segfault in
librbd::mirror::GroupGetInfoRequest::send(). The issue is caused by a missing
return after finish(), which allows execution to fall through into
GroupGetInfoRequest::get_id() and access invalid memory.
Also, make sure to ignore ENOENT throughout the method Group::remove(),
except at cls_client::dir_get_id().
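The class of bug (a completion call without a following return) can be shown with a generic request object. This is a Python illustration and not the librbd code; the `Request` class and its early-exit condition are hypothetical.

```python
import errno
from typing import Callable, List, Optional


class Request:
    def __init__(self, group_id: Optional[str],
                 on_finish: Callable[[int], None]) -> None:
        self.group_id = group_id
        self.on_finish = on_finish
        self.steps: List[str] = []

    def finish(self, ret: int) -> None:
        self.steps.append("finish")
        self.on_finish(ret)

    def get_id(self) -> None:
        self.steps.append("get_id")

    def send(self) -> None:
        if self.group_id is None:  # hypothetical early-exit condition
            self.finish(-errno.ENOENT)
            # Without this return, execution would fall through into
            # get_id() after the request has already completed.
            return
        self.get_id()
```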
Ramana Raja [Mon, 8 Sep 2025 02:50:51 +0000 (22:50 -0400)]
qa/workunits: add scenario to "test_force_promote_delete_group"
... in rbd_mirror_group_simple test suite.
After the group and its images are removed from the secondary, the test
can run in one of two scenarios. In Scenario 1, the test confirms that
the group is completely synced from the primary to the secondary. In
Scenario 2, the test disables and re-enables the primary, and then
confirms the group syncs from the primary to the secondary. Currently,
both scenarios fail occasionally when trying to confirm that the
group is completely synced from the primary to the secondary.
Signed-off-by: Ramana Raja <rraja@redhat.com>
Resolves: rhbz#2399618
rbd-mirror: skip validation of primary demote snapshots
Problem:
When a primary demotion is in progress, the demote snapshot is in an incomplete
state. However, the group replayer incorrectly attempts to validate this
snapshot using validate_local_group_snapshots(), treating the cluster as if it
were secondary. This results in the group status being incorrectly set to
up+replaying instead of up+unknown.
Solution:
Avoid validating snapshots that are in the process of being demoted on the
primary. This ensures the group replayer does not mistakenly assign an
incorrect role or state during transition.
Adam King [Thu, 25 Sep 2025 20:13:18 +0000 (16:13 -0400)]
mgr/cephadm: split host cache entries if they exceed max mon store entry size
If the json blob we attempt to store for a host entry
exceeds the max mon store entry size, we become unable
to continue to store that host's information in the
config key store. This means that each time the mgr fails
over, we only have the information from the last time the
json blob was under the size limit,
resulting in a number of stray host/daemon warnings
being generated and very outdated information being
reported by `ceph orch ps` and `ceph orch ls` around
the time of the failover.
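Splitting an oversized entry can be sketched as chunking the serialized blob across numbered keys. This is an illustration under assumptions: the `<key>.<n>` naming scheme and both helper functions are hypothetical and not the cephadm implementation.

```python
import json
from typing import Any, Dict


def split_entry(key: str, data: Any, max_size: int) -> Dict[str, str]:
    """Serialize data and split it across keys if it exceeds max_size."""
    # json.dumps escapes non-ASCII by default, so len() equals the byte size.
    blob = json.dumps(data)
    if len(blob) <= max_size:
        return {key: blob}
    chunks = [blob[i:i + max_size] for i in range(0, len(blob), max_size)]
    return {f"{key}.{n}": chunk for n, chunk in enumerate(chunks)}


def join_entry(key: str, store: Dict[str, str]) -> Any:
    """Reassemble an entry written by split_entry()."""
    if key in store:
        return json.loads(store[key])
    parts = []
    n = 0
    while f"{key}.{n}" in store:
        parts.append(store[f"{key}.{n}"])
        n += 1
    if not parts:
        raise KeyError(key)
    return json.loads("".join(parts))
```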
Igor Fedotov [Thu, 21 Aug 2025 10:42:54 +0000 (13:42 +0300)]
test/libcephfs: use more entries to reproduce snapdiff fragmentation issue
Snapdiff listing fragments have different boundaries in Reef and Squid+
releases, hence the original reproducer (made for Reef) doesn't work properly
in S+ releases. This patch fixes that at the cost of longer execution.
This might be redundant/senseless when backporting to Reef.
Igor Fedotov [Tue, 12 Aug 2025 13:17:49 +0000 (16:17 +0300)]
mds: rollback the snapdiff fragment entries with the same name if needed.
This is required when more entries with the same name don't fit into the
fragment. With the existing means for fragment offset specification, such
splitting is prohibited.