The redeploy handler had no boolean "force" parameter, so the CLI could
bind --force to the optional image argument. Pass force through to
daemon_action, validate container image ref in cephadm, and guard
against --force being captured as the image in the CLI.
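A minimal sketch of the kind of guard described above, assuming a hypothetical helper named `check_image_arg` (the name and the exact rejection rule are illustrative, not the actual cephadm code):

```python
from typing import Optional


def check_image_arg(image: Optional[str]) -> Optional[str]:
    """Reject an image value that looks like a command-line flag.

    Hypothetical helper: the name and the rule are assumptions.
    """
    if image is None:
        return None
    # An empty value, a leading dash, or whitespace cannot be a valid image ref.
    if not image or image.startswith("-") or any(c.isspace() for c in image):
        raise ValueError(f"invalid container image reference: {image!r}")
    return image
```

With a check like this, a stray `--force` bound to the image argument fails early instead of being treated as an image name.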
rgw/multisite: fix uninitialized LatencyMonitor average and use exponentially weighted moving average
LatencyMonitor::total was declared without an initializer. Since
std::chrono::duration's default constructor leaves the value indeterminate,
the very first add_latency() call adds a real sample to garbage, producing a
huge average that immediately triggers the "OSD cluster is overloaded" warning
within seconds of RGW startup, before any actual slow ops occur.
Additionally, the old implementation used a naive lifetime average
(total/count), which slows recovery from a transient slow-ops
episode. Once poisoned, the average stayed high for a long time,
keeping sync concurrency throttled to 1.
So, also replace the naive lifetime average in LatencyMonitor with an
exponentially weighted moving average (alpha=0.15). With the weighted average,
after a series of normal lock operations a past spike's influence decays faster,
allowing concurrency to recover without an RGW restart.
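The difference between the two averages can be illustrated with a small Python sketch. Only alpha=0.15 comes from the commit message. The class names, the sample values, and seeding the average with the first sample are assumptions made for illustration; the real LatencyMonitor is C++.

```python
from typing import Iterable, Optional, Tuple

ALPHA = 0.15  # smoothing factor quoted in the commit message


class LifetimeAverage:
    """Old behaviour: total / count over the whole process lifetime."""

    def __init__(self) -> None:
        self.total = 0.0  # explicitly initialized, unlike the original bug
        self.count = 0

    def add(self, sample: float) -> None:
        self.total += sample
        self.count += 1

    def avg(self) -> float:
        return self.total / self.count if self.count else 0.0


class Ewma:
    """Exponentially weighted moving average."""

    def __init__(self, alpha: float = ALPHA) -> None:
        self.alpha = alpha
        self.value: Optional[float] = None

    def add(self, sample: float) -> None:
        if self.value is None:
            self.value = sample  # assumption: seed with the first sample
        else:
            self.value = self.alpha * sample + (1 - self.alpha) * self.value

    def avg(self) -> float:
        return 0.0 if self.value is None else self.value


def replay(samples: Iterable[float]) -> Tuple[float, float]:
    """Feed the same samples to both averages and return (lifetime, ewma)."""
    lifetime, ewma = LifetimeAverage(), Ewma()
    for s in samples:
        lifetime.add(s)
        ewma.add(s)
    return lifetime.avg(), ewma.avg()


# 100 normal lock ops (10 ms), one 30 s spike, then 50 normal ops.
samples = [0.01] * 100 + [30.0] + [0.01] * 50
lifetime_avg, ewma_avg = replay(samples)
```

In this made-up trace the lifetime average stays above 0.2 s after the spike, while the weighted average falls back below 0.02 s within 50 normal operations.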
rgw/multisite: expose lock latency as perf counter for data sync
Add a "lock_latency" perf counter to the per-zone data sync counter.
This tracks the latency of RADOS lock/unlock operations in
RGWContinuousLeaseCR, giving operators visibility into the values
driving the LatencyConcurrencyControl.
The new perf counter can be queried via the admin socket:
ceph daemon <asok> perf dump data-sync-from-<zone>
and reset independently:
ceph daemon <asok> perf reset data-sync-from-<zone>
This allows operators to distinguish a poisoned average from ongoing
OSD latency issues without restarting the RGW process.
ceph-volume: make TPM2 PCR policy configurable (default to PCR 7)
TPM enrollment for dmcrypt OSDs is hardcoded to systemd-cryptenroll
--tpm2-pcrs 9+12, which ties the LUKS key to initrd and kernel
command line measurements. This is brittle on RHEL image mode
systems: after a bootc switch, the kernel, initrd, or cmdline often
changes, the PCR values change, and the volume won't unlock until you
re-enroll or fall back to another key.
Typical error:
```
Apr 27 14:17:25 ceph-jx5fq20u bash[4289]: Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /usr/lib/systemd/systemd-cryptsetup attach M3zE7r-qsGZ-xs0T-610d-SJNZ-U89x-J0cJq8 /dev/ceph-cac05fb6-51d3-4a60-9fc1-4958c568b433/osd-block-b1a495a0-e1a4-4888-baf9-7990f45f1e56 - tpm2-device=auto,discard,headless=true,nofail
Apr 27 14:17:26 ceph-jx5fq20u ceph-e5520e2c-420d-11f1-a7b9-5254001191fb-osd-0-activate[4300]: stderr: Failed to unseal secret using TPM2: Operation not permitted
Apr 27 14:17:26 ceph-jx5fq20u bash[4289]: stderr: Failed to unseal secret using TPM2: Operation not permitted
```
The patch makes the PCR set configurable and defaults to 7 so bootc style
deployments behave correctly.
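A minimal sketch of how a configurable PCR set could be turned into a systemd-cryptenroll command line. The function names, the numeric-only validation, and the 0..23 range check are assumptions for illustration; only the default of 7 and the `--tpm2-pcrs` option come from the text above.

```python
from typing import List

DEFAULT_TPM2_PCRS = "7"  # default named in the commit message


def normalize_pcrs(pcrs: str) -> str:
    """Validate a '+'-separated list of PCR indexes such as '7' or '9+12'.

    Assumption: this sketch only accepts numeric indexes in the range 0..23.
    """
    parts = pcrs.split("+")
    for part in parts:
        if not (part.isascii() and part.isdigit()) or int(part) > 23:
            raise ValueError(f"invalid PCR index: {part!r}")
    return "+".join(str(int(part)) for part in parts)


def cryptenroll_cmd(device: str, pcrs: str = DEFAULT_TPM2_PCRS) -> List[str]:
    """Build the enrollment command (hypothetical helper, not ceph-volume's)."""
    return [
        "systemd-cryptenroll",
        "--tpm2-device=auto",
        f"--tpm2-pcrs={normalize_pcrs(pcrs)}",
        device,
    ]
```

Passing `pcrs="9+12"` would reproduce the previous hardcoded behaviour, while omitting it binds the key to PCR 7 only.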
mgr/dashboard: Update permissions for pool-manager role
Fixes https://tracker.ceph.com/issues/76307
- Clicking the create pool table action reported access denied.
- This was caused by the failing monitor API call added for stretch cluster configuration.
- Also updates overview nav permissions.
ceph-volume: raw activate should ignore lvm backed OSD devices
The generic activate (`ceph-volume activate`) runs the
raw path before LVM. Raw.activate was walking lsblk / raw
list entries and could hit block devices that are actually
logical volumes from `ceph-volume lvm prepare` or `lvm batch`
(with ceph lvm tags on the LV).
That made raw activation poke at LVM backed OSDs instead of
leaving them to `lvm activate`.
With this commit, ceph-volume builds the set of LV paths
that carry those tags once (`lvs` via ceph_volume_lvm_prepare_lv_paths)
and skips any candidate path that matches, so only real raw
OSDs go through the raw activate path.
Also, we now pass `with_tpm` through luks_open() calls for db and
wal so encrypted metadata uses the same systemd-cryptsetup path
as the block LV when ceph.with_tpm is set.
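The filtering step can be sketched as follows. The record layout (`lv_path`, `lv_tags`), the `ceph.` tag prefix test, and both function names are assumptions for illustration; the real code obtains the paths from `lvs`.

```python
from typing import Dict, Iterable, List, Set


def ceph_lv_paths(lvs_report: Iterable[Dict[str, str]]) -> Set[str]:
    """Collect, once, the paths of LVs that carry ceph tags.

    Assumption: each record has 'lv_path' and a comma-separated 'lv_tags'.
    """
    paths = set()
    for lv in lvs_report:
        tags = lv.get("lv_tags", "").split(",")
        if any(tag.startswith("ceph.") for tag in tags):
            paths.add(lv["lv_path"])
    return paths


def raw_candidates(devices: Iterable[str], lv_paths: Set[str]) -> List[str]:
    """Keep only devices that are not ceph-tagged LVs (exact path match)."""
    return [dev for dev in devices if dev not in lv_paths]


report = [
    {"lv_path": "/dev/ceph-vg/osd-block-1", "lv_tags": "ceph.osd_id=1,ceph.type=block"},
    {"lv_path": "/dev/data-vg/scratch", "lv_tags": ""},
]
skip = ceph_lv_paths(report)
to_activate = raw_candidates(["/dev/sdb", "/dev/ceph-vg/osd-block-1"], skip)
```

Building the set once keeps the per-device check a constant-time lookup instead of one `lvs` call per candidate.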
Matthew N. Heler [Thu, 26 Feb 2026 01:03:56 +0000 (19:03 -0600)]
rgw: add RestoreStatus support to object listings
S3 clients can request restore status in listing responses through the
x-amz-optional-object-attributes header, but we had no support for it.
This stores the restore state in the bucket index so listings can
include <RestoreStatus> without having to read each object's attrs
individually.
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
Add script to test for CRUSH retry exhaustion in stretch mode with
2 datacenters. Tests unbiased stretch rules by running multiple
iterations of PG mappings and checking for collisions that exceed
the 50-try limit.
Also add --show-retry-exhaustion flag to crushtool to detect and
report when CRUSH mapping hits the maximum retry limit.
mgr/cephadm: replace md5_hash with FIPS-safe config_hash
Replace md5_hash() usages in cephadm dependency hashing with an
algorithm-agnostic config_hash() helper. config_hash() is backed by
SHA-256, making dependency hash generation unconditionally FIPS-safe
while preserving change-detection behavior.
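A minimal sketch of what an algorithm-agnostic helper backed by SHA-256 might look like. The signature shown here (taking a `str` and returning a hex digest) is an assumption; only the name `config_hash` and the use of SHA-256 come from the commit message.

```python
import hashlib


def config_hash(data: str) -> str:
    """Return a hex digest used only for change detection.

    Callers compare digests for equality, so the algorithm can change
    without affecting them. Assumption: input is text, encoded as UTF-8.
    """
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


old = config_hash("port = 9095\n")
same = config_hash("port = 9095\n")
new = config_hash("port = 9096\n")
```

Because the digest is only compared against a previously stored digest, callers never depend on its length or on the algorithm behind it.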
Ville Ojamo [Wed, 22 Apr 2026 06:51:34 +0000 (13:51 +0700)]
doc/rados: improve troubleshooting-mon.rst
Don't run ceph tell with mon_status and then claim it passes the help command.
Improve language and link to cephadm doc on asok usage. Add label and
note about accessing asok from the host in troubleshooting.rst.
Capitalize and use double backticks consistently.
Add some missing articles and other minor word changes.
Fix indentation.
Use ref and link definitions consistently, use automatic bold.
Use privileged prompts for CLI commands where necessary.
Remove spaces at end of lines and change tabs to four spaces.
Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>
Afreen Misbah [Fri, 27 Mar 2026 16:06:38 +0000 (21:36 +0530)]
mgr/dashboard: Add gray10 theme base color to all pages
- applies #f4f4f4 ($background) to all pages as the base page color
- previously the base page color was white
- also updates tabs/navs/tables CSS to adapt
- fixes some spacing in alerts tabs and nvmeof
Afreen Misbah [Thu, 26 Mar 2026 13:25:18 +0000 (18:55 +0530)]
mgr/dashboard: Remove tooltip and popover defaults
Fixes https://tracker.ceph.com/issues/75410
These defaults are not required: Carbon already gives tooltips a dark color, and going forward we want to align with CDS.
If anything breaks, add the fix in the component that uses it.
The objectstore tool tests restart the OSDs without allowing enough
time for GC to run, which can lead to no-OOL-segments conditions on restart. This
adds a gc_before_restart option to the test config, which when set
to true will run crimson-objectstore-tool --op gc on each OSD
before restarting them.
crimson/tools/objectstore: add GC operation to crimson-objectstore-tool
This adds a GC operation to the crimson-objectstore-tool, allowing
us to trigger GC cycles on demand during testing. This will
help reduce segment pressure and avoid 'no-segments' conditions.
mgr/cephadm: Skip RDMA device check for NFS during upgrade
During image upgrade, prepare_create runs on the asyncio event-loop
thread while an outer wait_async is active. Calling wait_async again for
cephadm list-rdma on that thread blocks the loop and can hang or time out.
Matthew Heler [Sun, 26 Apr 2026 21:00:44 +0000 (16:00 -0500)]
rgw/lc: drop per-bucket LC counters to PRIO_DEBUGONLY
mgr's perf-schema bridge silently drops labeled counters on the way
out, so shipping the per-bucket LC counters up through MgrReport just
costs ingest memory for data mgr can't expose anyway. ceph-exporter
already handles labeled counters via the daemon admin socket, so make
that the only path.
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
Joshua Blanch [Sat, 24 Jan 2026 16:53:14 +0000 (16:53 +0000)]
mgr/cephadm: remove SSH error logs from health detail when host is unreachable
The HostConnectionError exception includes verbose logs from asyncssh, which
create noise when looking at ceph health detail. This moves the SSH logs
to log.exception() and removes them from `health detail`.
mgr/dashboard: use cephadm root CA for RGW SSL and improve error handling
Problem: Dashboard fails to access object pages when RGW is deployed with SSL using cephadm-signed certificates.
Root cause: RGW REST API connection fails with SSL certificate verification error because the cephadm root CA certificate that signed the RGW SSL certificates is not in the dashboard's trust store.
Code Fixes:
1. rgw_client.py:
Added _get_ssl_ca_bundle() which fetches the cephadm root CA certificate from the cert store and writes it atomically (via a temp file and os.replace) to a fixed path (/tmp/ceph-dashboard-ca/rgw-cephadm-root-ca.pem), returning the file path for SSL verification.
Notes:
- The file is written once per mgr process lifetime and reused by all RgwClient instances. On mgr restart it is refetched and overwritten.
- A dedicated subdirectory (/tmp/ceph-dashboard-ca/) is used because /tmp has the sticky bit set, which prevents os.replace from overwriting files owned by a different user.
2. rest_client.py
Fixed a secondary crash in handle_connection_error: when the initial SSL error occurred, the error handler itself crashed while processing the exception, because it assumed reason.args[0] was always a string, but for SSL errors it is an SSLError object.
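Both fixes above can be sketched in a few lines of Python. The helper names `write_atomically` and `reason_text` are hypothetical and the directory mode is an assumption; only the temp-file-plus-`os.replace` pattern and the non-string `reason.args[0]` case come from the description.

```python
import os
import tempfile


def write_atomically(path: str, data: str, mode: int = 0o644) -> str:
    """Write data to path via a temp file in the same directory, then rename.

    os.replace is atomic on POSIX when source and target share a filesystem,
    so readers see either the old file or the complete new one.
    """
    directory = os.path.dirname(path)
    os.makedirs(directory, mode=0o700, exist_ok=True)  # dedicated subdirectory
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return path


def reason_text(reason: Exception) -> str:
    """Return printable text even when args[0] is itself an exception."""
    first = reason.args[0] if reason.args else reason
    return first if isinstance(first, str) else str(first)
```

The temp file is created in the target directory on purpose: a rename across filesystems would not be atomic.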
cephadm: replace call_throws with call in command_inspect_image
Problem:
During the upgrade, when inspecting the new ceph image for the first time, an error is printed to the ceph-mgr log instead of displaying a user-friendly message.
Root cause: During an upgrade, inspect-image is called on each node to check if the target image exists locally before pulling it. This flow, where inspect-image always precedes the pull, occurs on nodes other than the first.
Code Fixes:
1. src/cephadm/cephadm.py:
Replace call_throws with call in command_inspect_image. call_throws raises a RuntimeError on any non-zero exit code, producing a full traceback in the logs. call returns the exit code instead of raising, so the function exits cleanly with errno.ENOENT when the image is not found.
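The behavioural difference can be sketched with simplified stand-ins. These are not cephadm's real `call` and `call_throws` (which take a context object and more options); the signatures and the `inspect_image` wrapper are assumptions made for illustration.

```python
import errno
import subprocess
import sys
from typing import List, Tuple


def call(cmd: List[str]) -> Tuple[str, str, int]:
    """Run cmd and return (stdout, stderr, exit code) without raising."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.stdout, proc.stderr, proc.returncode


def call_throws(cmd: List[str]) -> Tuple[str, str, int]:
    """Like call(), but raise RuntimeError on any non-zero exit code."""
    out, err, code = call(cmd)
    if code != 0:
        raise RuntimeError(f"Failed command: {' '.join(cmd)}")
    return out, err, code


def inspect_image(cmd: List[str]) -> int:
    """Return 0 if the inspect command succeeds, errno.ENOENT otherwise."""
    _out, _err, code = call(cmd)
    return 0 if code == 0 else errno.ENOENT


# Stand-ins for an inspect command that finds / does not find the image.
found_cmd = [sys.executable, "-c", "print('image-id')"]
missing_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
```

With `call`, a missing image becomes an ordinary return value the caller can act on, rather than a traceback in the log.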
cephadm: convert lists back to tuples when loading last_client_files
Problem: ceph mgr fail, or a restart of the active ceph mgr, causes client files to be recreated unnecessarily on _admin hosts. Files such as /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring are rewritten even when their content has not changed.
Root cause:
update_client_file() stores client file metadata as a Python tuple (digest, mode, uid, gid).
When save_host() persists this to the mon store via json.dumps(), the tuple is serialized as a JSON array since JSON has no tuple type.
On mgr failover or restart, cache.load() deserializes the data with json.loads(), which returns a Python list instead of a tuple.
The comparison in _write_client_files(): match = old_files[path] == (digest, mode, uid, gid) then compares a list (from JSON) against a tuple (freshly built), which always evaluates to False.
This causes every client file to be rewritten on every mgr failover or restart.
Code Fixes:
1. src/pybind/mgr/cephadm/inventory.py:
Convert the deserialized lists back to tuples when loading last_client_files.
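The root cause is easy to reproduce in isolation. In this sketch the digest value and the `load_client_files` helper are made up for illustration; the (digest, mode, uid, gid) layout comes from the description above.

```python
import json
from typing import Dict, Tuple

ClientFile = Tuple[str, int, int, int]  # (digest, mode, uid, gid)


def load_client_files(raw: str) -> Dict[str, ClientFile]:
    """Deserialize stored metadata and convert each list back to a tuple."""
    return {path: tuple(meta) for path, meta in json.loads(raw).items()}


fresh: ClientFile = ("abc123", 0o600, 0, 0)
raw = json.dumps({"/etc/ceph/ceph.conf": fresh})  # tuple becomes a JSON array

naive = json.loads(raw)["/etc/ceph/ceph.conf"]  # comes back as a list
fixed = load_client_files(raw)["/etc/ceph/ceph.conf"]

naive_match = naive == fresh  # list vs tuple: never equal
fixed_match = fixed == fresh  # tuple vs tuple: equal when unchanged
```

A list never compares equal to a tuple in Python, even with identical elements, which is why the unconverted value always looked like a changed file.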
Shweta Bhosale [Wed, 18 Feb 2026 14:29:58 +0000 (19:59 +0530)]
mgr/nfs: 1. Removed the option to enable and disable cluster wide qos; it is now enabled by default
2. Removed the cluster_enable_qos field from the cluster-level block as it was causing confusion for the user.
3. Use "global" instead of "cluster" when showing cluster level qos values in export qos get
Shweta Bhosale [Thu, 6 Nov 2025 13:04:19 +0000 (18:34 +0530)]
mgr/cephadm: support nfs cluster level qos
Added the CEPH_NODES_LIST block below to ganesha.conf and enable_cluster_qos to the cluster level QoS block
CEPH_NODES_LIST {
Ceph_Nodes = 192.168.100.100, 192.168.100.101, 192.168.100.102;
}
Fixes: https://tracker.ceph.com/issues/69861
Signed-off-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>
mgr/cephadm: Changes to add NFS cluster qos inter node communication port in spec
mgr/nfs: Addressed review comments for cluster level qos support
mgr/nfs: add enable_cluster_qos = true while enabling qos
Shweta Bhosale [Wed, 19 Mar 2025 11:16:10 +0000 (16:46 +0530)]
mgr/nfs: When cluster level qos is disabled and an export still has qos parameters, allow the nfs export apply command if the file has the same qos block that is already set
mgr/cephadm: plumb force_delete_data through daemon/service removal
This PR wires the existing `force_delete_data` flag in the
binary through cephadm’s daemon and service removal paths, so that
commands such as `ceph orch rm service` or the equivalent daemon removal
can explicitly ask for data deletion instead of the default "move
under <fsid>/removed/" for daemons such as Prometheus, osd and mon.