git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log

Alex Ainscow [Fri, 17 Apr 2026 07:52:38 +0000 (08:52 +0100)]

test: Add help to ceph_test_rados

Basic help text to complement the full docs in the
previous commit.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>


Alex Ainscow [Fri, 17 Apr 2026 07:50:27 +0000 (08:50 +0100)]

docs: Add documentation for ceph_test_rados

No documentation existed for ceph_test_rados.

This commit adds that documentation, as generated by Claude Code.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>


Nathan Hoad [Thu, 23 Apr 2026 18:03:24 +0000 (14:03 -0400)]

rgw: Remove GC deferred entries options and code.

This code has been disabled since v16.2.8.

Signed-off-by: Nathan Hoad <nhoad@bloomberg.net>


Nathan Hoad [Thu, 30 Apr 2026 18:29:24 +0000 (14:29 -0400)]

rgw: Move declaration inline to solve compiler warning about an unused variable.

Signed-off-by: Nathan Hoad <nhoad@bloomberg.net>


Patrick Donnelly [Thu, 30 Apr 2026 18:33:28 +0000 (11:33 -0700)]

Merge PR #68665 into main

* refs/pull/68665/head:
doc/start/os-recommendations: update for Umbrella and future releases

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>


Shubha Jain [Wed, 29 Apr 2026 11:21:50 +0000 (16:51 +0530)]

cephadm: update NFS scheduling and ingress tests for CEPH block changes

Signed-off-by: Shubha Jain <SHUBHA.JAIN1@ibm.com>
Made-with: Cursor


Shubha Jain [Mon, 16 Mar 2026 11:07:25 +0000 (16:37 +0530)]

cephadm: fix upgrade order validation when using --daemon-types with --hosts

Fix upgrade validation to check earlier daemon types on both same-host and other-host groups when host filtering is used.

Previously only daemons on other hosts were validated, allowing upgrades to bypass the enforced upgrade order in stretch mode clusters.

Add regression tests covering daemon_types+hosts and services+hosts scenarios.

Fixes: https://tracker.ceph.com/issues/75397
Signed-off-by: Shubha Jain <SHUBHA.JAIN1@ibm.com>


Shubha Jain [Mon, 9 Mar 2026 07:11:14 +0000 (12:41 +0530)]

mgr/cephadm: fix upgrade order validation when using daemon_types with hosts

When both daemon_types and hosts filters are provided to
`ceph orch upgrade start`, the validation logic in
`_validate_upgrade_filters()` only checked earlier daemon
types on hosts outside the target host set.

This caused a bug where earlier daemon types running on the
target hosts were ignored, allowing upgrades to proceed out
of order. For example, a crash daemon upgrade could start on
a host even when mon daemons on that same host were still on
an older version.

This patch fixes the validation by checking earlier daemon
types on both:

* daemons running on the same hosts
* daemons running on other hosts

This ensures upgrade order enforcement remains correct when
host filters are applied.

Fixes: https://tracker.ceph.com/issues/75397
Signed-off-by: Shubha Jain <SHUBHA.JAIN1@ibm.com>
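
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in `_validate_upgrade_filters()`: the `UPGRADE_ORDER` list, the `Daemon` type, and `earlier_daemons_pending()` are hypothetical names, and "earlier" is simplified to mean any type that precedes the earliest targeted type.

```python
from dataclasses import dataclass

# Assumed ordering for illustration only; the authoritative list lives in cephadm.
UPGRADE_ORDER = ["mgr", "mon", "crash", "osd", "mds", "rgw"]


@dataclass(frozen=True)
class Daemon:
    daemon_type: str
    hostname: str
    upgraded: bool


def earlier_daemons_pending(daemons, target_types, target_hosts):
    """Return (same_host, other_host) lists of earlier-type daemons not yet upgraded."""
    first_target = min(UPGRADE_ORDER.index(t) for t in target_types)
    earlier_types = set(UPGRADE_ORDER[:first_target])
    pending = [d for d in daemons
               if d.daemon_type in earlier_types and not d.upgraded]
    # Both groups must be empty for the upgrade to be in order; checking
    # only other_host is the behavior the commit message describes as buggy.
    same_host = [d for d in pending if d.hostname in target_hosts]
    other_host = [d for d in pending if d.hostname not in target_hosts]
    return same_host, other_host


daemons = [
    Daemon("mon", "host1", upgraded=False),
    Daemon("mon", "host2", upgraded=True),
    Daemon("crash", "host1", upgraded=False),
]
same_host, other_host = earlier_daemons_pending(daemons, ["crash"], ["host1"])
```

In this example the old mon on host1 lands in `same_host`, so a crash upgrade restricted to host1 would be rejected, while a check of `other_host` alone would have let it through.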


Shraddha Agrawal [Wed, 29 Apr 2026 15:59:07 +0000 (21:29 +0530)]

seastore/omap_manager/btree: prevent heap buffer overflow in log

This commit fixes a heap buffer overflow in omap_btree_node_impl that
occurred when logging the full bufferlist. The issue is tracked in
https://tracker.ceph.com/issues/71524. To prevent the overflow,
we now log the length of the bufferlist instead of its full contents.

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>


Patrick Donnelly [Thu, 30 Apr 2026 17:26:05 +0000 (10:26 -0700)]

Merge PR #68640 into main

* refs/pull/68640/head:
script/ptl-tool: source githubmap from main branch

Reviewed-by: John Mulligan <jmulligan@redhat.com>


Casey Bodley [Thu, 30 Apr 2026 16:54:10 +0000 (12:54 -0400)]

Merge pull request #68007 from cbodley/wip-75722

rgw/iam: User/Group/Role apis map ECANCELED to ERR_CONCURRENT_MODIFICATION

Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>


Ronen Friedman [Thu, 30 Apr 2026 08:50:08 +0000 (08:50 +0000)]

osd/scrub: auto-correct accounting-only stat mismatches

When scrub detects a PG stats mismatch but no object-level
inconsistencies (all replicas agree on actual data), fix the
stats in place rather than reporting a scrub error.

Previously, a pure stat mismatch would log [ERR], increment
shallow_errors, and trigger OSD_SCRUB_ERRORS / PG_STATE_INCONSISTENT
health alerts — yet leave the stats unfixed unless a repair
scrub was manually initiated. The scrubber's own object count
is authoritative in this case.

Persistence of the corrected stats is deferred until the next
transaction that sets dirty_info, consistent with the existing
stats_invalid repair path.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
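
The decision described above might look like the following sketch. It is only an illustration of the branching, not the scrubber's code: `PGStats`, `ScrubResult`, and `finish_scrub()` are hypothetical names, and persistence of the corrected stats is not modeled.

```python
from dataclasses import dataclass


@dataclass
class PGStats:
    num_objects: int


@dataclass
class ScrubResult:
    scrubbed_objects: int  # object count as seen by the scrubber
    object_errors: int     # object-level inconsistencies between replicas


def finish_scrub(stats, result):
    """Return (errors_reported, stats_corrected) after reconciling stats."""
    if result.object_errors > 0:
        # Real inconsistencies: report them and leave the stats to repair.
        return result.object_errors, False
    if stats.num_objects != result.scrubbed_objects:
        # Accounting-only mismatch: the scrubber's count is authoritative,
        # so fix the stats in place and report no error.
        stats.num_objects = result.scrubbed_objects
        return 0, True
    return 0, False


stats = PGStats(num_objects=10)
outcome = finish_scrub(stats, ScrubResult(scrubbed_objects=12, object_errors=0))
```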


Matthew N. Heler [Mon, 20 Apr 2026 21:25:47 +0000 (16:25 -0500)]

rgw/cloud-transition: yield in cloud_tier_bucket_exists HEAD

The HEAD request used null_yield, so every attempt (including the
retries added by retry_on_busy) blocked the LC worker thread for
the full HTTP timeout instead of yielding.

Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>


Matthew N. Heler [Fri, 5 Dec 2025 16:52:55 +0000 (10:52 -0600)]

rgw/cloud-transition: check bucket existence before create

Add HEAD request to check if target bucket exists before attempting
to create it. This avoids unnecessary PUT requests when the bucket
already exists on the remote endpoint.

Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>
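
As a rough sketch of the check-then-create flow, assuming the HEAD and PUT requests are supplied as callables (`head_bucket` and `create_bucket` are hypothetical stand-ins, not RGW functions):

```python
def ensure_bucket(bucket, head_bucket, create_bucket):
    """Create the bucket only if a HEAD-style probe says it is missing.

    Returns True when a create request was issued, False otherwise.
    """
    if head_bucket(bucket):
        return False  # already present on the remote endpoint: skip the PUT
    create_bucket(bucket)
    return True


# A set stands in for the remote endpoint in this illustration.
remote = {"archive"}
put_calls = []


def fake_head(name):
    return name in remote


def fake_create(name):
    put_calls.append(name)
    remote.add(name)


created_existing = ensure_bucket("archive", fake_head, fake_create)
created_new = ensure_bucket("cold-tier", fake_head, fake_create)
```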


Matthew N. Heler [Sat, 22 Nov 2025 14:12:56 +0000 (08:12 -0600)]

rgw/cloud-transition: add per-bucket target options

Add per-bucket cloud tier targeting via new options target_by_bucket
and target_by_bucket_prefix, and use them in transition/restore to
derive the destination bucket name.

Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>


Alex Ainscow [Wed, 18 Mar 2026 09:22:26 +0000 (09:22 +0000)]

osd: PGLog Attach correct version to missing list when ignoring log entries

A previous fix for PR 66698 fixed an issue where log entries associated with
partial writes were being processed incorrectly (see that PR and associated
tracker for details). The fix was to ignore log entries that should not have
been present on the non-primary shard.

The problem with that approach is that in a more complex scenario, where the
log contained a partial write, followed by a full write AND the shard is
backfilling, then the missing list was being given the version prior to the
full write, rather than prior to the clone.

Our fix here corrects how the missing list version is calculated.

See the associated tracker for instructions on how to reproduce the issue.

Fixes: https://tracker.ceph.com/issues/75211
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>


Alex Ainscow [Mon, 27 Apr 2026 13:24:45 +0000 (14:24 +0100)]

osd/test: Add EC peering test infrastructure and recovery test cases

This commit enhances the EC peering test framework and adds test cases
for erasure-coded pool recovery scenarios:

NOTE: Many of the test cases are disabled as they recreate certain
problems. Later commits will enable these tests and fix the production
issues, but under different PRs.

Test Infrastructure Improvements:
- Add MockStore wrapper with read error injection capabilities for testing
error handling in EC recovery
- Enhance ECPeeringTestFixture with recovery callback verification
- Add support for pg_upmap to better simulate OSD placement
- Implement write_attribute() for testing partial vs full stripe writes
- Add read_shard_object_info() to verify on-disk version consistency
- Improve logging with missing object stats (m=, u=, mbc=)
- Add support for doing object recovery in Fast EC.
- Add set_config() helper for runtime configuration changes
- Preserve xinfo features when marking OSDs up/down
- Fix pg_temp handling for EC pools with optimizations

Mock Object Enhancements:
- Update MockPGBackendListener with recovery callback tracking
- Add on_local_recover, on_peer_recover, on_global_recover tracking
- Implement proper stats publishing (pg_stats_publish)
- Add is_missing_object() implementation
- Enhance should_send_op() with async_recovery_target logic
- Add apply_stats() to update PeeringState statistics

Test Cases Added:
- ECRecoveryTest: Verifies recovery with missing objects after OSD failure
- ECSequentialOSDFailoverTest: Tests sequential OSD failure/recovery cycles
- MultiObjectRecoveryReadCrash: Reproduces bug #75432 (multi-object reads)
- RollbackVersionMismatch: Reproduces bug #76213 (version mismatch)
- RollbackAfterMixedBlockedWrites: Reproduces bug #75211 (rollback issues)

These tests validate EC recovery mechanisms including:
- Object version tracking across shards
- Recovery callback invocation (local, peer, global)
- Handling of read errors during recovery
- Rollback behavior after blocked writes
- Multi-object recovery with partial failures

Assisted-by: IBM Bob, using Claude Sonnet
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>


Patrick Donnelly [Tue, 28 Apr 2026 22:25:44 +0000 (15:25 -0700)]

doc/start/os-recommendations: update for Umbrella and future releases

Overhaul the OS recommendations documentation to reflect deployment
practices and map out the support matrices for upcoming releases through
Ceph X (24.x).

Key changes include:

* Emphasized container-based deployments: Added a new section strongly
  recommending containerized deployments via `cephadm` over legacy
  package-based installations to simplify upgrades and avoid host-level
  dependency conflicts.
* Expanded support tables: Updated the Platforms and Container Hosts
  tables to include Umbrella (21.x), Vampire (22.x), W (23.x), and
  X (24.x). Removed EOL releases like Reef.
* Added EOL visibility: Included End-of-Life dates for Linux
  distributions and anticipated EOL dates for Ceph releases to help
  administrators plan lifecycle events.
* Updated OS targets: Added support tracking for Ubuntu 24.04 (Noble),
  Ubuntu 26.04, Ubuntu 28.04, Rocky Linux 10, and Rocky Linux 11.
* Addressed CentOS transition: Added a warning that CentOS 10+ will no
  longer be built or tested by upstream. Documented that Rocky Linux 10
  is the new default container base image for Umbrella, while clarifying
  that the bare-metal host OS can remain any supported distribution.
* Added horizontal upgrade guidance: Introduced a new section outlining
  safe "horizontal" bare-metal OS upgrade paths (e.g., CentOS 9 to
  Rocky 10, Ubuntu 22.04 to 24.04) so users can safely migrate their
  nodes outside of Ceph version upgrade windows.

AI-Assisted: Gemini Pro, through numerous prompts
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>


J. Eric Ivancich [Thu, 30 Apr 2026 14:17:00 +0000 (10:17 -0400)]

Merge pull request #68453 from mheler/wip-coroutine-cloud-transition

rgw/lc: add coroutine support for cloud-transition and cloud-restore

Reviewed-by: Casey Bodley <cbodley@redhat.com>


Casey Bodley [Thu, 30 Apr 2026 14:12:07 +0000 (10:12 -0400)]

Merge pull request #67856 from cbodley/wip-75568

rgw/beast: add frontend option 'tls_groups'

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>


Afreen Misbah [Thu, 30 Apr 2026 14:07:11 +0000 (19:37 +0530)]

mgr/dashboard: Bump lodash

Fixes: https://tracker.ceph.com/issues/76370

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Yuval Lifshitz [Thu, 30 Apr 2026 13:33:58 +0000 (16:33 +0300)]

Merge pull request #68622 from yuvalif/wip-yuval-76262

rgw/notifications: relax topic names validation


Aashish Sharma [Wed, 29 Apr 2026 04:34:23 +0000 (10:04 +0530)]

mgr/dashboard: add remote write section to prometheus configuration

Add CLI commands to add or remove the remote_write section in the Prometheus configuration template.

Fixes: https://tracker.ceph.com/issues/76316
Signed-off-by: Aashish Sharma <aasharma@redhat.com>


Casey Bodley [Thu, 30 Apr 2026 13:14:17 +0000 (09:14 -0400)]

Merge pull request #66170 from kchheda3/wip-fix-account-acls-backward-compatbility

rgw/account: Support backward compatibility for s3:PutAcls calls for users migrated to account.

Reviewed-by: Casey Bodley <cbodley@redhat.com>


Casey Bodley [Thu, 30 Apr 2026 13:13:23 +0000 (09:13 -0400)]

Merge pull request #67962 from smanjara/wip-async-lock

rgw/multisite: convert lock/unlock coroutines to use aio_operate

Reviewed-by: Casey Bodley <cbodley@redhat.com>


Casey Bodley [Thu, 30 Apr 2026 13:09:30 +0000 (09:09 -0400)]

Merge pull request #68055 from cbodley/wip-rgw-req-state-keys

rgw: authorization avoids sal::Object::get_instance()

Reviewed-by: Pritha Srivastava <prsrivas@redhat.com>


Casey Bodley [Thu, 30 Apr 2026 13:07:22 +0000 (09:07 -0400)]

Merge pull request #68210 from lumir-sliva/rgw/ratelimit-response-improvements

rgw: add Retry-After header and configurable rate-limit response

Reviewed-by: Ville Ojamo <git2233+ceph@ojamo.eu>
Reviewed-by: Casey Bodley <cbodley@redhat.com>


Casey Bodley [Thu, 30 Apr 2026 12:51:50 +0000 (08:51 -0400)]

Merge pull request #66146 from tobias-urdin/keystone-cache-miss

rgw/keystone: perf counter for cache hit wrong

Reviewed-by: Casey Bodley <cbodley@redhat.com>


Redouane Kachach [Thu, 30 Apr 2026 12:19:15 +0000 (14:19 +0200)]

Merge pull request #68553 from yaelazulay-redhat/issue_76176_ceph_mgr_fail_or_active_ceph_mgr_restart_causes_unnecessary_client_files_recreation_on_admin_hosts

cephadm: ceph mgr fail or active ceph mgr restart causes unnecessary …

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
Reviewed-by: Adam King <adking@redhat.com>


Redouane Kachach [Thu, 30 Apr 2026 11:10:42 +0000 (13:10 +0200)]

Merge pull request #67417 from webalexeu/feat/mgmt_sso_improvements

mgr/dashboard: Improve oauth2 sso configuration

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>


Soumya Koduri [Thu, 30 Apr 2026 11:08:00 +0000 (16:38 +0530)]

Merge pull request #68452 from soumyakoduri/wip-skoduri-restore-crash

rgw/cloud-restore: Fix the restore workers' shutdown order

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Matthew N. Heler <matthew.heler@hotmail.com>


Afreen Misbah [Thu, 30 Apr 2026 09:58:13 +0000 (15:28 +0530)]

Merge pull request #68658 from rhcs-dashboard/pool-permissions

mgr/dashboard: Update permissions for pool-manager role

Reviewed-by: Nizamudeen A <nia@redhat.com>


Redouane Kachach [Thu, 30 Apr 2026 08:48:49 +0000 (10:48 +0200)]

Merge pull request #67049 from adk3798/cgroup-cleanup-retry

cephadm: retry cleaning old cgroups when it fails

Reviewed-by: Redouane Kachach <rkachach@ibm.com>


Yuval Lifshitz [Sun, 26 Apr 2026 15:17:54 +0000 (15:17 +0000)]

rgw/notifications: relax topic names validation

Fixes: https://tracker.ceph.com/issues/76262
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>


Redouane Kachach [Thu, 30 Apr 2026 08:20:57 +0000 (10:20 +0200)]

Merge pull request #61826 from ShwetaBhosale1/fix_issue_69861_NFS_commands_to_enable_disable_ops_limiting

mgr/nfs: NFS cluster and export commands to enable and disable ops control

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
Reviewed-by: Adam King <adking@redhat.com>


Redouane Kachach [Thu, 30 Apr 2026 08:19:45 +0000 (10:19 +0200)]

Merge pull request #67720 from ShwetaBhosale1/fix_issue_74970_update_haproxy.cfg_to_support_nfs_active_active_deployment

mgr/cephadm: Update haproxy.cfg template to support nfs active active deployment

Reviewed-by: Redouane Kachach <rkachach@ibm.com>


Guillaume Abrioux [Thu, 30 Apr 2026 08:12:11 +0000 (10:12 +0200)]

Merge pull request #68657 from guits/fix-generic-activate-tpm2

ceph-volume: raw activate should ignore lvm backed OSD devices


Guillaume Abrioux [Thu, 30 Apr 2026 08:11:29 +0000 (10:11 +0200)]

Merge pull request #68670 from guits/cv-fix-tpm2-pcrs

ceph-volume: make TPM2 PCR policy configurable (default to PCR 7)


Redouane Kachach [Thu, 30 Apr 2026 08:03:43 +0000 (10:03 +0200)]

Merge pull request #68538 from yaelazulay-redhat/issue_75448_during_upgrade_an_error_is_printed_when_inspecting_the_new_ceph_image_for_the_first_time

cephadm: During the upgrade, when inspecting the new ceph image for t…

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
Reviewed-by: Adam King <adking@redhat.com>


Redouane Kachach [Thu, 30 Apr 2026 08:00:47 +0000 (10:00 +0200)]

Merge pull request #68638 from kginonredhat/issue-76185-enable-mgmt-gateway-on-a-FIPS-cluster-failed

Issue 76185 enable mgmt gateway on a fips cluster failed

Reviewed-by: Redouane Kachach <rkachach@ibm.com>


Yuval Lifshitz [Thu, 30 Apr 2026 07:59:38 +0000 (10:59 +0300)]

Merge pull request #67984 from cheese-cakee/wip-75416-fix-log-req-id

rgw/logging: use trans_id for standard access log record


bluikko [Thu, 30 Apr 2026 07:05:36 +0000 (14:05 +0700)]

Merge pull request #67825 from bluikko/wip-doc-rados-spelling3

doc/rados: Fix spelling errors (3 of 3)


Xuehan Xu [Wed, 22 Apr 2026 02:14:58 +0000 (10:14 +0800)]

crimson/osd/pg: unify the current should_send_op method implementation

Fixes: https://tracker.ceph.com/issues/76196
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>


Patrick Donnelly [Wed, 29 Apr 2026 21:56:45 +0000 (14:56 -0700)]

Merge PR #68639 into main

* refs/pull/68639/head:
script/ptl-tool: get git dir via git command

Reviewed-by: John Mulligan <jmulligan@redhat.com>


Oguzhan Ozmen [Wed, 29 Apr 2026 21:36:48 +0000 (21:36 +0000)]

rgw/multisite: concurrency adjustment - consider the case caller provides 1

Signed-off-by: Oguzhan Ozmen <oozmen@bloomberg.net>


Oguzhan Ozmen [Tue, 28 Apr 2026 19:44:02 +0000 (19:44 +0000)]

rgw/multisite: log concurrency state transitions in adj_concurrency

Replace the timer-based "OSD cluster is overloaded" warning with
state-transition logging. Also, log when concurrency is halved and
eventually recovered.

Signed-off-by: Oguzhan Ozmen <oozmen@bloomberg.net>


Patrick Donnelly [Wed, 29 Apr 2026 20:39:07 +0000 (13:39 -0700)]

Merge PR #68641 into main

* refs/pull/68641/head:
script/ptl-tool: add option to not create a tag

Reviewed-by: John Mulligan <jmulligan@redhat.com>


Patrick Donnelly [Wed, 29 Apr 2026 19:29:38 +0000 (12:29 -0700)]

Merge PR #68655 into main

* refs/pull/68655/head:
script/ptl-tool: allow PR numbers as GH urls

Reviewed-by: John Mulligan <jmulligan@redhat.com>


Patrick Donnelly [Mon, 27 Apr 2026 20:08:56 +0000 (16:08 -0400)]

script/ptl-tool: source githubmap from main branch

To make it authoritative.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>


Kamoltat (Junior) Sirivadhna [Wed, 29 Apr 2026 14:11:41 +0000 (14:11 +0000)]

src/test: update show-choose-tries.t tests

Since we modified the crushtool CLI, its tests need updating
for the new --show-retry-exhaustion flag
and the modified --show-choose-tries option.

Also add /src/script/run-cli-tests.sh to run the cram tests
easily without the configuration headache.

Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>


adatri [Sat, 4 Apr 2026 00:42:20 +0000 (20:42 -0400)]

doc/cephfs/fs-volumes.rst: Correct volume creation with pre-existing pools

Signed-off-by: adatri <anthony.datri@gmail.com>


Casey Bodley [Wed, 29 Apr 2026 17:15:53 +0000 (13:15 -0400)]

Merge pull request #68186 from cbodley/wip-75534

rgw: CompleteMultipartUpload can fail with 404 NoSuchUpload

Reviewed-by: Mark Kogan <mkogan@redhat.com>


John Mulligan [Wed, 29 Apr 2026 15:24:56 +0000 (11:24 -0400)]

Merge pull request #68010 from phlogistonjohn/jjm-smb-mgr-incorrect-type-err

smb: improve smb mgr module resource type error handling

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Shachar Sharon <ssharon@redhat.com>


Kamoltat (Junior) Sirivadhna [Tue, 28 Apr 2026 20:16:00 +0000 (20:16 +0000)]

doc: update crushtool.rst

Add --show-retry-exhaustion flag to doc

Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>


Afreen Misbah [Wed, 29 Apr 2026 13:44:07 +0000 (19:14 +0530)]

Merge pull request #68648 from rhcs-dashboard/76288-fix-ec-profile-pool

mgr/dashboard : Fixes EC profile used pool empty

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Devika Babrekar <devika.babrekar@ibm.com>


Kobi Ginon [Wed, 22 Apr 2026 18:06:56 +0000 (21:06 +0300)]

orch/cephadm: fix redeploy --force, validate container image ref

The redeploy handler had no boolean "force" parameter, so the CLI could
bind --force to the optional image argument. Pass force through to
daemon_action, validate container image ref in cephadm, and guard
against --force being captured as the image in the CLI.

Fixes: https://tracker.ceph.com/issues/75967
Signed-off-by: Kobi Ginon <kginon@redhat.com>
Made-with: Cursor
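
The guard described above could be sketched along these lines. This is a simplified illustration under stated assumptions: `parse_redeploy_args()` is a hypothetical helper, and the image check here only rejects flag-like, empty, or whitespace-containing values rather than fully validating a container image reference.

```python
def parse_redeploy_args(daemon_name, image=None, force=False):
    """Validate redeploy arguments so a flag is never treated as the image."""
    if image is not None:
        if image.startswith("-"):
            # e.g. "--force" bound to the optional image argument
            raise ValueError(f"{image!r} looks like a flag, not an image")
        if not image or any(ch.isspace() for ch in image):
            raise ValueError(f"invalid container image reference: {image!r}")
    return {"daemon": daemon_name, "image": image, "force": force}


ok = parse_redeploy_args("osd.1", force=True)

try:
    parse_redeploy_args("osd.1", image="--force")
    flag_rejected = False
except ValueError:
    flag_rejected = True
```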


Oguzhan Ozmen [Tue, 28 Apr 2026 00:09:16 +0000 (00:09 +0000)]

rgw/multisite: fix uninitialized LatencyMonitor average and use exponentially weighted moving average

LatencyMonitor::total was declared without an initializer. Since
std::chrono::duration's default constructor leaves the value indeterminate,
the very first add_latency() call adds a real sample to garbage, producing a
huge average that immediately triggers the "OSD cluster is overloaded" warning
within seconds of RGW startup, before any actual slow ops occur.

Additionally, the old implementation used a naive lifetime average
(total/count) that could slow recovery from a transient slow-ops
episode. Once poisoned, the average stayed high for a long time,
keeping the throttled sync concurrency at 1.

So, also replace the naive lifetime average in LatencyMonitor with an
exponentially weighted moving average (alpha=0.15). With the weighted average,
after a series of normal lock operations a past spike's influence decays faster,
allowing concurrency to recover without an RGW restart.

Fixes: https://tracker.ceph.com/issues/76308
Signed-off-by: Oguzhan Ozmen <oozmen@bloomberg.net>
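
To illustrate why the weighted average recovers faster, here is a small sketch comparing the two approaches. The class names are hypothetical and the sketch assumes the first sample seeds the average; only alpha=0.15 comes from the commit message, and the real LatencyMonitor is C++ code that may differ in detail.

```python
class LifetimeAverage:
    """Naive average over every sample ever seen (total / count)."""

    def __init__(self):
        self.total = 0.0  # explicitly initialized, unlike the bug described above
        self.count = 0

    def add(self, sample):
        self.total += sample
        self.count += 1

    def value(self):
        return self.total / self.count if self.count else 0.0


class EwmaAverage:
    """Exponentially weighted moving average: recent samples dominate."""

    def __init__(self, alpha=0.15):
        self.alpha = alpha
        self.avg = None

    def add(self, sample):
        if self.avg is None:
            self.avg = sample  # assumption: seed with the first sample
        else:
            self.avg = self.alpha * sample + (1 - self.alpha) * self.avg

    def value(self):
        return 0.0 if self.avg is None else self.avg


lifetime, ewma = LifetimeAverage(), EwmaAverage()
# One latency spike followed by thirty normal lock operations.
for sample in [100.0] + [1.0] * 30:
    lifetime.add(sample)
    ewma.add(sample)
```

After the thirty normal samples the lifetime average is still above 4, while the weighted average has fallen below 2, so a threshold between those values would stay tripped only under the old scheme.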


Oguzhan Ozmen [Mon, 27 Apr 2026 23:07:03 +0000 (23:07 +0000)]

rgw/multisite: expose lock latency as perf counter for data sync

Add a "lock_latency" perf counter to the per-zone data sync counter.
This tracks the latency of RADOS lock/unlock operations in
RGWContinuousLeaseCR, giving operators visibility into the values
driving the LatencyConcurrencyControl.

The new perf counter can be queried via the admin socket:
    ceph daemon <asok> perf dump data-sync-from-<zone>
and reset independently:
    ceph daemon <asok> perf reset data-sync-from-<zone>

This would allow us to distinguish a poisoned average from ongoing
OSD latency issues without restarting the RGW process.

Signed-off-by: Oguzhan Ozmen <oozmen@bloomberg.net>


Shubha Jain [Wed, 29 Apr 2026 12:54:33 +0000 (18:24 +0530)]

python-common/service_spec: fix style in hostname normalization changes

Made-with: Cursor
Signed-off-by: Shubha Jain <SHUBHA.JAIN1@ibm.com>


Shubha Jain [Thu, 8 Jan 2026 08:53:19 +0000 (14:23 +0530)]

python-common/service_spec: simplify hostname normalization in HostPlacementSpec

Signed-off-by: Shubha Jain <SHUBHA.JAIN1@ibm.com>


Xuehan Xu [Mon, 27 Apr 2026 09:10:06 +0000 (17:10 +0800)]

crimson/os/seastore: destroy Transaction only when no other reference exists

Fixes: https://tracker.ceph.com/issues/76268
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>


Guillaume Abrioux [Wed, 29 Apr 2026 09:17:23 +0000 (11:17 +0200)]

ceph-volume: make TPM2 PCR policy configurable (default to PCR 7)

TPM enrollment for dmcrypt OSDs is hardcoded to systemd-cryptenroll
--tpm2-pcrs 9+12, which ties the LUKS key to initrd and kernel
command line measurements. This is brittle on RHEL image mode
systems: after a bootc switch, the kernel, initrd, or cmdline often
change, the PCRs move, and the volume won't unlock until you re-enroll
or fall back to another key.

Typical error:

```
Apr 27 14:17:25 ceph-jx5fq20u bash[4289]: Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /usr/lib/systemd/systemd-cryptsetup attach M3zE7r-qsGZ-xs0T-610d-SJNZ-U89x-J0cJq8 /dev/ceph-cac05fb6-51d3-4a60-9fc1-4958c568b433/osd-block-b1a495a0-e1a4-4888-baf9-7990f45f1e56 - tpm2-device=auto,discard,headless=true,nofail
Apr 27 14:17:26 ceph-jx5fq20u ceph-e5520e2c-420d-11f1-a7b9-5254001191fb-osd-0-activate[4300]: stderr: Failed to unseal secret using TPM2: Operation not permitted
Apr 27 14:17:26 ceph-jx5fq20u bash[4289]: stderr: Failed to unseal secret using TPM2: Operation not permitted
```

The patch makes the PCR set configurable and defaults to PCR 7 so
bootc-style deployments behave correctly.

Fixes: https://tracker.ceph.com/issues/76318
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
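
A configurable PCR set might be threaded into the enrollment command roughly like this. The helper name `cryptenroll_command()` and the `DEFAULT_TPM2_PCRS` constant are hypothetical, and the validation is a simplification that accepts only numeric PCR indices joined by `+`; it is not the ceph-volume implementation.

```python
DEFAULT_TPM2_PCRS = "7"  # default named in the commit message


def cryptenroll_command(device, pcrs=DEFAULT_TPM2_PCRS):
    """Build a systemd-cryptenroll argument list for the given PCR set."""
    for pcr in pcrs.split("+"):
        # Simplified check: numeric PCR indices in the range 0..23 only.
        if not pcr.isdigit() or not 0 <= int(pcr) <= 23:
            raise ValueError(f"invalid PCR index: {pcr!r}")
    return [
        "systemd-cryptenroll",
        "--tpm2-device=auto",
        f"--tpm2-pcrs={pcrs}",
        device,
    ]


default_cmd = cryptenroll_command("/dev/sdb")
legacy_cmd = cryptenroll_command("/dev/sdb", pcrs="9+12")
```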


Afreen Misbah [Tue, 28 Apr 2026 16:55:32 +0000 (22:25 +0530)]

mgr/dashboard: Update permissions for pool-manager role

Fixes: https://tracker.ceph.com/issues/76307

- Clicking the create pool table action reported that access was denied.
- This was caused by a failing monitor API call added for stretch cluster configuration.
- Also updates the overview nav permissions.

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Ashwin M. Joshi [Wed, 29 Apr 2026 08:27:34 +0000 (13:57 +0530)]

mgr: ok-to-upgrade doc review comments fixed

Fixes: https://tracker.ceph.com/issues/75603
Signed-off-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>


Ville Ojamo [Mon, 16 Mar 2026 16:54:06 +0000 (23:54 +0700)]

doc/rados: Fix spelling errors (3 of 3)

Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>


Guillaume Abrioux [Tue, 28 Apr 2026 15:10:59 +0000 (17:10 +0200)]

ceph-volume: raw activate should ignore lvm backed OSD devices

The generic activate (`ceph-volume activate`) runs the
raw path before LVM. Raw.activate was walking lsblk / raw
list entries and could hit block devices that are actually
logical volumes from `ceph-volume lvm prepare` or `lvm batch`
(with ceph lvm tags on the LV).
That made raw activation poke at LVM-backed OSDs instead of
leaving them to `lvm activate`.

With this commit, ceph-volume builds the set of LV paths
that carry those tags once (`lvs` via ceph_volume_lvm_prepare_lv_paths)
and skips any candidate path that matches, so only real raw
OSDs go through the raw activate path.

Also, we now pass `with_tpm` through luks_open() calls for db and
wal so encrypted metadata uses the same systemd-cryptsetup path
as the block LV when ceph.with_tpm is set.

Fixes: https://tracker.ceph.com/issues/76305
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
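
The filtering step can be pictured with a short sketch. The function name `raw_candidates()` and the example device paths are hypothetical; in ceph-volume the tagged LV paths come from `lvs`, which is not modeled here.

```python
def raw_candidates(block_devices, ceph_lv_paths):
    """Drop devices that are ceph-tagged LVs so raw activation skips them."""
    skip = set(ceph_lv_paths)  # built once, then reused for every candidate
    return [dev for dev in block_devices if dev not in skip]


# Illustrative paths only.
devices = ["/dev/sdb", "/dev/ceph-vg/osd-block-1", "/dev/sdc"]
tagged_lvs = ["/dev/ceph-vg/osd-block-1"]
to_activate = raw_candidates(devices, tagged_lvs)
```

Building the set once keeps each membership test cheap, which matters when the candidate list is walked for every device on the host.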


Ashwin M. Joshi [Fri, 24 Apr 2026 09:14:06 +0000 (14:44 +0530)]

mgr: ok-to-upgrade added code comments for flow clarity

Fixes: https://tracker.ceph.com/issues/75603
Signed-off-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>


Ashwin M. Joshi [Thu, 23 Apr 2026 14:37:25 +0000 (20:07 +0530)]

mgr: ok-to-upgrade doc only review comments

Fixes: https://tracker.ceph.com/issues/75603
Signed-off-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>


Ashwin M. Joshi [Thu, 23 Apr 2026 11:19:06 +0000 (16:49 +0530)]

mgr: ok-to-upgrade review comments for dataclass etc

Fixes: https://tracker.ceph.com/issues/75603
Signed-off-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>


Ashwin M. Joshi [Tue, 7 Apr 2026 10:05:10 +0000 (15:35 +0530)]

mgr: Bucket scoped OSD upgrades using ok-to-upgrade

Fixes: https://tracker.ceph.com/issues/75603
Signed-off-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>
Conflicts:
src/pybind/mgr/orchestrator/module.py


bluikko [Wed, 29 Apr 2026 05:53:02 +0000 (12:53 +0700)]

Merge pull request #68561 from bluikko/wip-doc-rados-troubleshooting-mon-improve

doc/rados: improve troubleshooting-mon.rst


Matthew N. Heler [Thu, 26 Feb 2026 01:03:56 +0000 (19:03 -0600)]

rgw: add RestoreStatus support to object listings

S3 clients can request restore status in listing responses through the
x-amz-optional-object-attributes header, but we had no support for it.
This stores the restore state in the bucket index so listings can
include <RestoreStatus> without having to read each object's attrs
individually.

Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>


Kamoltat (Junior) Sirivadhna [Tue, 28 Apr 2026 20:13:10 +0000 (20:13 +0000)]

src/script: init test_stretch crush_collisions.sh

Add script to test for CRUSH retry exhaustion in stretch mode with
2 datacenters. Tests unbiased stretch rules by running multiple
iterations of PG mappings and checking for collisions that exceed
the 50-try limit.

Also add --show-retry-exhaustion flag to crushtool to detect and
report when CRUSH mapping hits the maximum retry limit.

Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>


Patrick Donnelly [Tue, 28 Apr 2026 14:55:06 +0000 (10:55 -0400)]

script/ptl-tool: allow PR numbers as GH urls

For easier pasting.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>


Patrick Donnelly [Mon, 27 Apr 2026 19:37:43 +0000 (15:37 -0400)]

script/ptl-tool: get git dir via git command

Rather than a manual process.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>


Alex Ainscow [Tue, 28 Apr 2026 12:56:07 +0000 (13:56 +0100)]

Merge pull request #66258 from aainscow/read_only_execs

osd/rados/rgw/cephfs: Modernize cls interface with compile time safety

Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Adam Emerson <aemerson@ibm.com>


Afreen Misbah [Tue, 28 Apr 2026 12:45:35 +0000 (18:15 +0530)]

Merge pull request #68026 from rhcs-dashboard/fix-theme

mgr/dashboard: Enable gray 10 theme as per carbon standards

Reviewed-by: Abhishek Desai <abhishek.desai1@ibm.com>


Casey Bodley [Tue, 28 Apr 2026 12:00:01 +0000 (08:00 -0400)]

Merge pull request #68577 from cbodley/wip-74398

rgw: read_obj_policy() consults s3:prefix when deciding between 403/404

Reviewed-by: Oguzhan Ozmen <oozmen@bloomberg.net>


Ronen Friedman [Tue, 24 Feb 2026 12:59:54 +0000 (12:59 +0000)]

crimson/seastore: fixing some 'unused' warnings

in btree_types compilation.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>


Kobi Ginon [Mon, 27 Apr 2026 19:08:54 +0000 (22:08 +0300)]

mgr/cephadm: replace md5_hash with FIPS-safe config_hash

Replace md5_hash() usages in cephadm dependency hashing with an
algorithm-agnostic config_hash() helper. config_hash() is backed by
SHA-256, making dependency hash generation unconditionally FIPS-safe
while preserving change-detection behavior.

Fixes: https://tracker.ceph.com/issues/76185
Signed-off-by: Kobi Ginon <kginon@redhat.com>
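
A minimal sketch of an algorithm-agnostic helper backed by SHA-256, as the message describes. The signature and the JSON canonicalization step are assumptions for illustration; the real config_hash() in cephadm may accept different input and serialize it differently.

```python
import hashlib
import json
from typing import Any


def config_hash(data: Any) -> str:
    """Return a stable hex digest for JSON-serializable dependency data."""
    # Sort keys and fix separators so equal content always hashes identically,
    # which is what change detection needs.
    payload = json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # SHA-256 is available in FIPS mode, unlike MD5.
    return hashlib.sha256(payload).hexdigest()
```

Callers compare digests only for equality, so swapping the underlying algorithm changes the stored values once but not the change-detection behavior.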


Kautilya Tripathi [Tue, 28 Apr 2026 11:14:10 +0000 (16:44 +0530)]

Merge pull request #66993 from ceph/crimson-pg-subcommands

crimson: add pg subcommands support in CLI

Reviewed-by: Aishwarya Mathuria <aishwarya.mathuria@ibm.com>
Reviewed-by: Kefu Chai <tchaikov@gmail.com>


Igor Fedotov [Tue, 28 Apr 2026 09:59:08 +0000 (12:59 +0300)]

Merge pull request #68502 from ifed01/wip-ifed-more-zoned-remove

os/bluestore: remove obsolete "zoned" freelist type

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>
Reviewed-by: Jaya Prakash <jayaprakash@ibm.com>


Ville Ojamo [Wed, 22 Apr 2026 06:51:34 +0000 (13:51 +0700)]

doc/rados: improve troubleshooting-mon.rst

Don't show ceph tell mon_status and then claim that it passes the help command.
Improve language and link to cephadm doc on asok usage. Add label and
note about accessing asok from the host in troubleshooting.rst.
Capitalize and use double backticks consistently.
Add some missing articles and other minor word changes.
Fix indentation.
Use ref and link definitions consistently, use automatic bold.
Use privileged prompts for CLI commands where necessary.
Remove spaces at end of lines and change tabs to four spaces.

Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>


Afreen Misbah [Mon, 13 Apr 2026 23:09:51 +0000 (04:39 +0530)]

mgr/dashboard: Fixed modal forms background color

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Afreen Misbah [Thu, 2 Apr 2026 22:27:59 +0000 (03:57 +0530)]

mgr/dashboard: Fix grid issues in notifications page and password form

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Afreen Misbah [Fri, 27 Mar 2026 16:06:38 +0000 (21:36 +0530)]

mgr/dashboard: Add gray10 theme base color to all pages

- applies #f4f4f4 ($background) to all pages as the base page color
- previously the base page color was white
- also updates the tabs/navs/tables CSS to adapt
- fixes some spacing in the alerts tabs and nvmeof pages

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Afreen Misbah [Fri, 27 Mar 2026 09:16:27 +0000 (14:46 +0530)]

mgr/dashboard: Add gray10 theme background to overview and rgw page

Fixes: https://tracker.ceph.com/issues/75752

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Afreen Misbah [Thu, 26 Mar 2026 13:34:56 +0000 (19:04 +0530)]

mgr/dashboard: Remove dashboard overrides

- we have a responsive layout now, so the overrides are removed
- also removes duplicate spacing CSS

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Afreen Misbah [Thu, 26 Mar 2026 13:31:43 +0000 (19:01 +0530)]

mgr/dashboard: Remove modal defaults

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Afreen Misbah [Thu, 26 Mar 2026 13:25:18 +0000 (18:55 +0530)]

mgr/dashboard: Remove tooltip and popover defaults

Fixes: https://tracker.ceph.com/issues/75410

These defaults are not required because Carbon already applies a blackish color to tooltips, and moving forward we want to align with CDS.
If anything breaks, add or fix the style in the component that uses it.

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Afreen Misbah [Thu, 26 Mar 2026 13:01:54 +0000 (18:31 +0530)]

mgr/dashboard: Enable gray 10 theme as per carbon standards

- this keeps only branding-related colors and removes the others

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Yuval Lifshitz [Tue, 28 Apr 2026 07:35:08 +0000 (10:35 +0300)]

Merge pull request #68540 from nbalacha/wip-nbalacha-76206

rgw/bucket-logging: handle SigV2 presigned URLs


Abhishek Desai [Tue, 28 Apr 2026 07:15:16 +0000 (12:45 +0530)]

mgr/dashboard: Fixes EC profile used pool empty

Fixes: https://tracker.ceph.com/issues/76288
Signed-off-by: Abhishek Desai <abhishek.desai1@ibm.com>


Shraddha Agrawal [Tue, 28 Apr 2026 06:29:05 +0000 (11:59 +0530)]

Merge pull request #68424 from NitzanMordhai/wip-nitzan-rados-perf-test-epel10-pdsh-missing

qa/tasks/cbt: install pdsh from el9 RPMs on el10 systems


Shweta Bhosale [Tue, 28 Apr 2026 05:48:15 +0000 (11:18 +0530)]

mgr/cephadm: log full tracebacks for upgrade exceptions

Fixes: https://tracker.ceph.com/issues/76284
Signed-off-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>
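
The general technique can be sketched with the standard logging module: logger.exception() records the message together with the full traceback of the exception being handled. The logger name and the run_upgrade_step wrapper are hypothetical and are not taken from the cephadm code.

```python
import logging
from typing import Callable

logger = logging.getLogger("upgrade_sketch")


def run_upgrade_step(step: Callable[[], None]) -> bool:
    """Run one step; on failure log the full traceback and report False."""
    try:
        step()
        return True
    except Exception:
        # Unlike logger.error(str(e)), this keeps the stack trace in the log,
        # so the failing call site is visible without reproducing the error.
        logger.exception("upgrade step failed")
        return False
```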


Ronen Friedman [Mon, 20 Apr 2026 15:10:46 +0000 (15:10 +0000)]

crimson/qa/objectstore-tool: reduce segments size

used in testing. This translates into more segments, which helps
in preventing test failures due to insufficient free segments for mounting.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>


Ronen Friedman [Sat, 18 Apr 2026 16:32:03 +0000 (16:32 +0000)]

qa/tasks: add timeout to 'GC' ceph_objectstore_tool calls

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
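
A generic sketch of bounding an external tool call with a timeout, using only the standard library. It is not the qa/tasks code, which runs commands through its own remote-execution helpers; the function name and the None-on-timeout convention are assumptions.

```python
import subprocess
from typing import Optional, Sequence


def run_with_timeout(argv: Sequence[str], timeout_s: float) -> Optional[int]:
    """Run a command; return its exit code, or None if it exceeded timeout_s."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising, so a hung tool
        # cannot stall the rest of the test run.
        return None
    return proc.returncode
```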


Ronen Friedman [Thu, 16 Apr 2026 18:02:21 +0000 (18:02 +0000)]

qa/tasks/ceph_objectstore_tool.py: add gc_before_restart option

The objectstore tool tests restart the OSDs without allowing enough
time for GC to run, which can lead to no-OOL-segments conditions on restart. This
adds a gc_before_restart option to the test config, which when set
to true will run crimson-objectstore-tool --op gc on each OSD
before restarting them.

Fixes: https://tracker.ceph.com/issues/73101
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
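
How such an option might gate the extra step can be sketched as follows. Only the gc_before_restart key and the "crimson-objectstore-tool --op gc" invocation come from the message above; the function name and the per-OSD argument mapping are hypothetical placeholders for whatever arguments the real task passes to locate each OSD's store.

```python
from typing import Dict, List, Mapping, Sequence


def gc_commands(config: Mapping[str, object],
                per_osd_args: Dict[int, Sequence[str]]) -> List[List[str]]:
    """Build one GC command per OSD when gc_before_restart is enabled."""
    if not config.get("gc_before_restart", False):
        # Option unset or false: restart the OSDs without a GC pass.
        return []
    return [
        # per_osd_args supplies any store-location arguments (assumed, not specified here).
        ["crimson-objectstore-tool", "--op", "gc", *per_osd_args[osd]]
        for osd in sorted(per_osd_args)
    ]
```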


Ronen Friedman [Thu, 16 Apr 2026 17:58:09 +0000 (17:58 +0000)]

crimson/tools/objectstore: add GC operation to crimson-objectstore-tool

This adds a GC operation to the crimson-objectstore-tool, allowing
us to trigger GC cycles on demand during testing. This will
help reduce segment pressure and avoid 'no-segments' conditions.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>


Ronen Friedman [Thu, 16 Apr 2026 17:55:22 +0000 (17:55 +0000)]

crimson/os: add GC operation to Seastore

Will be used to force immediate GC cycles in Seastore during testing, to
reduce segment pressure and avoid missing-OOL-segments conditions.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
