Matthew N. Heler [Wed, 17 Dec 2025 02:53:20 +0000 (20:53 -0600)]
qa/rgw: add teuthology support for target_by_bucket cloud transition
Add cloud_target_by_bucket and cloud_target_by_bucket_prefix options
to rgw_cloudtier.py and s3tests.py. Create new test suite to run
target_by_bucket-specific s3-tests.
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
Matthew N. Heler [Mon, 20 Apr 2026 21:25:47 +0000 (16:25 -0500)]
rgw/cloud-transition: yield in cloud_tier_bucket_exists HEAD
The HEAD request used null_yield, so every attempt (including the
retries added by retry_on_busy) blocked the LC worker thread for
the full HTTP timeout instead of yielding.
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
rgw/cloud-transition: check bucket existence before create
Add HEAD request to check if target bucket exists before attempting
to create it. This avoids unnecessary PUT requests when the bucket
already exists on the remote endpoint.
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
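The HEAD-before-PUT pattern described above can be sketched as follows. This is an illustrative Python stand-in (the actual cloud-transition code is C++ inside RGW), and `FakeClient` is an invented in-memory endpoint used only to show the behavior:

```python
# Sketch of the HEAD-before-create pattern; this is NOT RGW's real code.
def ensure_bucket(client, bucket):
    """Create `bucket` only if a HEAD request says it does not exist yet."""
    if client.head_bucket(bucket):   # bucket already there: skip the PUT
        return False
    client.put_bucket(bucket)        # not found: create it
    return True

class FakeClient:
    """Minimal in-memory stand-in for the remote cloud endpoint."""
    def __init__(self):
        self.buckets = set()
        self.put_calls = 0
    def head_bucket(self, name):
        return name in self.buckets
    def put_bucket(self, name):
        self.put_calls += 1
        self.buckets.add(name)
```

Repeated transitions to the same target then issue only one PUT, which is the saving the commit describes.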
Add per-bucket cloud tier targeting via new options target_by_bucket
and target_by_bucket_prefix, and use them in transition/restore to
derive the destination bucket name
Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
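A hypothetical sketch of the per-bucket naming rule the options suggest; the real derivation lives in RGW's C++ transition/restore path and may differ, and `default_target` here is an invented parameter for illustration:

```python
# Hypothetical sketch only: option names mirror the commit message,
# but the exact RGW semantics are not shown here.
def dest_bucket(src_bucket, target_by_bucket=False, prefix="",
                default_target="cloud-bucket"):
    if not target_by_bucket:
        return default_target          # single shared target bucket
    return f"{prefix}{src_bucket}"     # one target bucket per source bucket
```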
ceph-volume: make TPM2 PCR policy configurable (default to PCR 7)
TPM enrollment for dmcrypt OSDs is hardcoded to `systemd-cryptenroll
--tpm2-pcrs 9+12`, which ties the LUKS key to initrd and kernel
command-line measurements. That is brittle on RHEL image-mode
systems: after a `bootc switch`, the kernel, initrd, or cmdline often
change, the PCRs move, and the volume won't unlock until you re-enroll
or fall back to another key.
Typical error:
```
Apr 27 14:17:25 ceph-jx5fq20u bash[4289]: Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /usr/lib/systemd/systemd-cryptsetup attach M3zE7r-qsGZ-xs0T-610d-SJNZ-U89x-J0cJq8 /dev/ceph-cac05fb6-51d3-4a60-9fc1-4958c568b433/osd-block-b1a495a0-e1a4-4888-baf9-7990f45f1e56 - tpm2-device=auto,discard,headless=true,nofail
Apr 27 14:17:26 ceph-jx5fq20u ceph-e5520e2c-420d-11f1-a7b9-5254001191fb-osd-0-activate[4300]: stderr: Failed to unseal secret using TPM2: Operation not permitted
Apr 27 14:17:26 ceph-jx5fq20u bash[4289]: stderr: Failed to unseal secret using TPM2: Operation not permitted
```
This patch makes the PCR set configurable and defaults to PCR 7 so bootc-style
deployments behave correctly.
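The change can be pictured as building the enrollment command from a configurable PCR set. The `--tpm2-pcrs` and `--tpm2-device` flags are real systemd-cryptenroll syntax, but the helper below and its defaults are an illustrative sketch, not ceph-volume's actual code:

```python
# Illustrative sketch: a configurable PCR set for TPM2 enrollment.
def cryptenroll_args(device, pcrs="7"):
    """Build a systemd-cryptenroll argv. The default binds only PCR 7
    (Secure Boot state), which survives kernel/initrd/cmdline updates,
    unlike the old hardcoded 9+12."""
    return ["systemd-cryptenroll", "--tpm2-device=auto",
            f"--tpm2-pcrs={pcrs}", device]
```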
mgr/dashboard: Update permissions for pool-manager role
Fixes https://tracker.ceph.com/issues/76307
- access was denied when clicking the create-pool table action
- this was caused by the failing monitor API added for stretch-cluster configuration
- also updates the overview nav permissions
ceph-volume: raw activate should ignore lvm backed OSD devices
The generic activate (`ceph-volume activate`) runs the
raw path before LVM. `Raw.activate` walked lsblk / `raw list`
entries and could hit block devices that are actually
logical volumes from `ceph-volume lvm prepare` or `lvm batch`
(with ceph lvm tags on the LV).
That made raw activation poke at LVM-backed OSDs instead of
leaving them to `lvm activate`.
With this commit, ceph-volume builds the set of LV paths
that carry those tags once (`lvs` via ceph_volume_lvm_prepare_lv_paths)
and skips any candidate path that matches, so only real raw
OSDs go through the raw activate path.
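The filtering step amounts to a set-difference over candidate paths. A minimal sketch, with invented names standing in for ceph-volume's real helpers:

```python
# Illustrative sketch of the raw-activate filter; names are stand-ins.
def raw_candidates(all_devices, lv_paths_with_ceph_tags):
    """Drop any candidate that is a tagged LV, so raw activation only
    touches genuinely raw OSD devices."""
    lvm_backed = set(lv_paths_with_ceph_tags)  # built once from `lvs`
    return [d for d in all_devices if d not in lvm_backed]
```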
Also, we now pass `with_tpm` through luks_open() calls for db and
wal so encrypted metadata uses the same systemd-cryptsetup path
as the block LV when ceph.with_tpm is set.
mgr/cephadm: replace md5_hash with FIPS-safe config_hash
Replace md5_hash() usages in cephadm dependency hashing with an
algorithm-agnostic config_hash() helper. config_hash() is backed by
SHA-256, making dependency hash generation unconditionally FIPS-safe
while preserving change-detection behavior.
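A minimal sketch of what such a helper could look like, assuming a JSON-canonicalized input; the real cephadm `config_hash()` may differ in its exact shape:

```python
import hashlib
import json

# Sketch of an algorithm-agnostic, FIPS-safe config hash (SHA-256).
def config_hash(config: dict) -> str:
    """Stable digest over a canonicalized config: sort_keys makes the
    hash independent of dict ordering, preserving change detection."""
    payload = json.dumps(config, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()
```

Because SHA-256 is FIPS-approved while MD5 is not, the helper works unchanged on FIPS-enabled hosts.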
Ville Ojamo [Wed, 22 Apr 2026 06:51:34 +0000 (13:51 +0700)]
doc/rados: improve troubleshooting-mon.rst
Don't ceph tell mon_status and then claim it passes the help command.
Improve language and link to cephadm doc on asok usage. Add label and
note about accessing asok from the host in troubleshooting.rst.
Capitalize and use double backticks consistently.
Add some missing articles and other minor word changes.
Fix indentation.
Use ref and link definitions consistently, use automatic bold.
Use privileged prompts for CLI commands where necessary.
Remove spaces at end of lines and change tabs to four spaces.
Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>
Afreen Misbah [Fri, 27 Mar 2026 16:06:38 +0000 (21:36 +0530)]
mgr/dashboard: Add gray10 theme base color to all pages
- applies #f4f4f4 ($background) to all pages as the base page color
- previously the base color of the page was white
- also updates tabs/navs/tables CSS to adapt
- some spacing fixes in alerts tabs and nvmeof
Afreen Misbah [Thu, 26 Mar 2026 13:25:18 +0000 (18:55 +0530)]
mgr/dashboard: Remove tooltip and popover defaults
Fixes https://tracker.ceph.com/issues/75410
These defaults are not required, as Carbon already applies a blackish color to tooltips, and going forward we want to align with CDS.
If anything breaks, add or fix it in the component concerned.
cephadm: replace call_throws with call in command_inspect_image
Problem:
During an upgrade, when inspecting the new Ceph image for the first time, an error with a full traceback is printed to the ceph-mgr log instead of a user-friendly message.
Root cause: during an upgrade, inspect-image is called on each node to check whether the target image exists locally before pulling it. This flow, where inspect-image always precedes the pull, occurs on nodes other than the first.
Code Fixes:
1. src/cephadm/cephadm.py:
Replace call_throws with call in command_inspect_image. call_throws raises a RuntimeError on any non-zero exit code, producing a full traceback in the logs. call returns the exit code instead of raising, so the function exits cleanly with errno.ENOENT when the image is not found.
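The contrast between the two helpers can be sketched generically. cephadm's real `call()`/`call_throws()` wrappers have richer signatures; the pair below only illustrates the pattern (raise on non-zero vs. return an error code):

```python
import errno

# Generic sketch of the pattern; NOT cephadm's actual helpers.
def inspect_image_throwing(run):
    out, rc = run()
    if rc != 0:
        # old behavior: RuntimeError -> full traceback in the mgr log
        raise RuntimeError(f"inspect failed: rc={rc}")
    return 0

def inspect_image(run):
    out, rc = run()          # call(): just hands back the exit code
    if rc != 0:
        return -errno.ENOENT # new behavior: clean "image not found"
    return 0
```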
cephadm: convert lists back to tuples when loading last_client_files
Problem: `ceph mgr fail` or a restart of the active ceph-mgr causes unnecessary client-file recreation on _admin hosts. Files such as /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring are rewritten even when their content has not changed.
Root cause:
update_client_file() stores client file metadata as a Python tuple (digest, mode, uid, gid).
When save_host() persists this to the mon store via json.dumps(), the tuple is serialized as a JSON array since JSON has no tuple type.
On mgr failover or restart, cache.load() deserializes the data with json.loads(), which returns a Python list instead of a tuple.
The comparison in _write_client_files(): match = old_files[path] == (digest, mode, uid, gid) then compares a list (from JSON) against a tuple (freshly built), which always evaluates to False.
This causes every client file to be rewritten on every mgr failover or restart.
Code Fixes:
1. src/pybind/mgr/cephadm/inventory.py:
convert the deserialized lists back to tuples when loading last_client_files
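The root cause above is reproducible in a few lines: JSON has no tuple type, so a tuple survives a `dumps`/`loads` round-trip as a list, and in Python a list never compares equal to a tuple. The metadata values below are made up for illustration:

```python
import json

# Reproduce the tuple -> JSON array -> list round-trip described above.
meta = ("sha256:abcd", 0o644, 0, 0)            # (digest, mode, uid, gid)
stored = json.loads(json.dumps({"path": meta}))
loaded = stored["path"]                        # now a list, not a tuple
assert loaded != meta                          # list != tuple: spurious rewrite
fixed = tuple(loaded)                          # the fix applied on load
assert fixed == meta                           # comparison works again
```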
Shweta Bhosale [Wed, 18 Feb 2026 14:29:58 +0000 (19:59 +0530)]
mgr/nfs: cluster-wide QoS changes
1. Removed the option to enable and disable cluster-wide QoS; it is now enabled by default.
2. Removed the cluster_enable_qos field from the cluster-level block, as it was causing confusion for users.
3. Show "global" instead of "cluster" when displaying cluster-level QoS values in `export qos get`.
Shweta Bhosale [Thu, 6 Nov 2025 13:04:19 +0000 (18:34 +0530)]
mgr/cephadm: support nfs cluster level qos
Added the CEPH_NODES_LIST block below to ganesha.conf and enable_cluster_qos to the cluster-level QoS block:
```
CEPH_NODES_LIST {
    Ceph_Nodes = 192.168.100.100, 192.168.100.101, 192.168.100.102;
}
```
Fixes: https://tracker.ceph.com/issues/69861
Signed-off-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>
mgr/cephadm: Changes to add NFS cluster qos inter node communication port in spec
mgr/nfs: Addressed review comments for cluster level qos support
mgr/nfs: add enable_cluster_qos = true while enabling qos
Shweta Bhosale [Wed, 19 Mar 2025 11:16:10 +0000 (16:46 +0530)]
mgr/nfs: when cluster-level QoS is disabled but an export still has QoS parameters, allow `nfs export apply` if the file contains the same QoS block that is already set