git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log
11 days agoContainerfile: Support pulp repo URLs 69554/head 69555/head
David Galloway [Wed, 17 Jun 2026 12:53:12 +0000 (08:53 -0400)]
Containerfile: Support pulp repo URLs

Signed-off-by: David Galloway <david.galloway@ibm.com>
11 days agoMerge pull request #69511 from VallariAg/wip-update-submodule-182
Vallari Agrawal [Wed, 17 Jun 2026 15:44:50 +0000 (21:14 +0530)]
Merge pull request #69511 from VallariAg/wip-update-submodule-182

mgr/dashboard: bump nvmeof submodule to 1.8.2

11 days agoMerge pull request #69356 from eameh-LF/wip-doc-77200
Ilya Dryomov [Wed, 17 Jun 2026 14:21:28 +0000 (16:21 +0200)]
Merge pull request #69356 from eameh-LF/wip-doc-77200

doc: Replace Python 2 package names with Python 3 equivalents

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
11 days agoMerge pull request #69275 from fultheim/seastore-io-wakeup
Matan Breizman [Wed, 17 Jun 2026 13:56:01 +0000 (16:56 +0300)]
Merge pull request #69275 from fultheim/seastore-io-wakeup

crimson/os/seastore: wake blocked IO on BackgroundProcess wakeup

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
11 days agoMerge pull request #67141 from BBoozmen/wip-oozmen-74677
Oguzhan Ozmen [Wed, 17 Jun 2026 13:52:19 +0000 (09:52 -0400)]
Merge pull request #67141 from BBoozmen/wip-oozmen-74677

rgw/multisite: Balance sync traffic across DNS-resolved backends using CURLOPT_CONNECT_TO

11 days agoMerge pull request #69355 from eameh-LF/wip-doc-77208
eameh-LF [Wed, 17 Jun 2026 13:25:50 +0000 (14:25 +0100)]
Merge pull request #69355 from eameh-LF/wip-doc-77208

doc: Remove empty README.md and relocate nvmeof HA design doc

11 days agoMerge pull request #69354 from eameh-LF/wip-doc-77206
eameh-LF [Wed, 17 Jun 2026 13:25:27 +0000 (14:25 +0100)]
Merge pull request #69354 from eameh-LF/wip-doc-77206

doc/security: Add Security Lead and Working Group pages to toctree

11 days agodoc: Replace Python 2 package names with Python 3 equivalents 69356/head
Emmanuel Ameh [Tue, 9 Jun 2026 12:19:27 +0000 (13:19 +0100)]
doc: Replace Python 2 package names with Python 3 equivalents

librados-intro.rst referenced ``python-rados`` for CentOS/RHEL.
rbd-openstack.rst referenced ``python-rbd`` for both apt and yum.
Python 2 reached end-of-life in January 2020; these package names
install the Python 2 bindings (or fail entirely) on current distros.
Replace with the correct Python 3 package names: python3-rados and
python3-rbd.

Fixes: https://tracker.ceph.com/issues/77200
Signed-off-by: Emmanuel Ameh <eameh@contractor.linuxfoundation.org>
11 days agoMerge pull request #65286 from pritha-srivastava/wip-rgw-d4n-mem-leak
Samarah Uriarte [Wed, 17 Jun 2026 13:05:20 +0000 (08:05 -0500)]
Merge pull request #65286 from pritha-srivastava/wip-rgw-d4n-mem-leak

rgw/d4n: deleting LFUDAEntry and LFUDAObjEntry instances

Reviewed-by: Mark Kogan <mkogan@redhat.com>
Reviewed-by: Gal Salomon <gsalomon@redhat.com>
11 days agoMerge pull request #69378 from rhcs-dashboard/custom-filter
Nizamudeen A [Wed, 17 Jun 2026 09:00:15 +0000 (14:30 +0530)]
Merge pull request #69378 from rhcs-dashboard/custom-filter

mgr/dashboard: add custom filtering rules to the table

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Naman Munet <nmunet@redhat.com>
11 days agoMerge pull request #69476 from ronen-fr/wip-rf-fmtxx-crimson
Kefu Chai [Wed, 17 Jun 2026 08:52:07 +0000 (16:52 +0800)]
Merge pull request #69476 from ronen-fr/wip-rf-fmtxx-crimson

crimson/os/seastore: fix laddr_t formatter and its use

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
11 days agoMerge pull request #69297 from VallariAg/wip-prometheus-rados-ns-fix
Vallari Agrawal [Wed, 17 Jun 2026 07:13:44 +0000 (12:43 +0530)]
Merge pull request #69297 from VallariAg/wip-prometheus-rados-ns-fix

monitoring: fix NVMeoFMultipleNamespacesOfRBDImage for different rados_namespace_name

11 days agomgr/dashboard: add custom filtering rules to the table 69378/head
Nizamudeen A [Wed, 10 Jun 2026 05:21:33 +0000 (10:51 +0530)]
mgr/dashboard: add custom filtering rules to the table

```
<cd-table #table
                id="pool-list"
                [data]="pools"
                [columns]="columns"
                selectionType="single"
                [hasDetails]="true"
                [status]="tableStatus"
                [autoReload]="-1"
                (fetchData)="taskListService.fetch()"
                (setExpandedRow)="setExpandedRow($event)"
                (updateSelection)="updateSelection($event)"
                [customFilter]="true" # set this to true
                (customFilterChange)="onCustomFilterChange($event)" # get the new rules from here>
```

Fixes: https://tracker.ceph.com/issues/77290
Signed-off-by: Nizamudeen A <nia@redhat.com>
11 days agoMerge pull request #69515 from tchaikov/wip-rocksdb-fix-ftbfs-gcc-16
Kefu Chai [Wed, 17 Jun 2026 03:17:35 +0000 (11:17 +0800)]
Merge pull request #69515 from tchaikov/wip-rocksdb-fix-ftbfs-gcc-16

rocksdb: update submodule to fix FTBFS due to missing <cstdint>

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
11 days agoMerge pull request #69509 from sunyuechi/wip-fix-unused-warnings
Kefu Chai [Wed, 17 Jun 2026 02:07:27 +0000 (10:07 +0800)]
Merge pull request #69509 from sunyuechi/wip-fix-unused-warnings

crimson,mgr,test: fix unused variable/function warnings

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
11 days agoMerge pull request #69419 from Sodani/shsodani_mac_support
John Mulligan [Tue, 16 Jun 2026 22:50:31 +0000 (18:50 -0400)]
Merge pull request #69419 from Sodani/shsodani_mac_support

mgr/smb: Added a Mac client support for samba cluster

Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Sachin Prabhu <sp@spui.uk>
11 days agoMerge pull request #69339 from Hezko/clis-alignment
Hezko [Tue, 16 Jun 2026 21:11:23 +0000 (00:11 +0300)]
Merge pull request #69339 from Hezko/clis-alignment

mgr/dashboard: align nvmeof cli with missing parameters and commands from the old nvmeof cli

11 days agodoc: add PendingReleaseNotes entry for rgw multisite DNS endpoint resolution 67141/head
Oguzhan Ozmen [Tue, 24 Mar 2026 17:54:22 +0000 (17:54 +0000)]
doc: add PendingReleaseNotes entry for rgw multisite DNS endpoint resolution

Documents the new rgw_rest_conn_connect_to_resolved_ips feature that
enables RGW to resolve HTTP endpoints for RGW services such as multisite,
into all IP addresses and distribute requests across them using
round-robin with per-IP health tracking, supporting DNS service
discovery deployments without external load balancers.

Signed-off-by: Oguzhan Ozmen <oozmen@bloomberg.net>
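
A minimal Python sketch of the behaviour this entry describes: resolve one endpoint into all of its IP addresses, then distribute requests across them round-robin while skipping addresses that are marked unhealthy. The names `ResolvedEndpointPool`, `mark_failed`, `mark_ok` and `next_ip` are hypothetical and are not taken from RGW, which is written in C++; the resolver is injected as a plain function so the example runs without a DNS lookup.

```python
from typing import Callable, Dict, List, Optional


class ResolvedEndpointPool:
    """Round-robin over the IPs of one hostname, skipping unhealthy ones.

    Hypothetical illustration only; this is not the RGW implementation.
    """

    def __init__(self, host: str, resolve: Callable[[str], List[str]]) -> None:
        # `resolve` is injected so the sketch does not need real DNS.
        self.ips: List[str] = resolve(host)
        self.healthy: Dict[str, bool] = {ip: True for ip in self.ips}
        self._next = 0  # round-robin cursor into self.ips

    def mark_failed(self, ip: str) -> None:
        self.healthy[ip] = False

    def mark_ok(self, ip: str) -> None:
        self.healthy[ip] = True

    def next_ip(self) -> Optional[str]:
        # Try each address at most once, starting at the cursor.
        for _ in range(len(self.ips)):
            ip = self.ips[self._next]
            self._next = (self._next + 1) % len(self.ips)
            if self.healthy[ip]:
                return ip
        return None  # every address is currently marked unhealthy
```

A caller would ask `next_ip()` before each request and report the outcome with `mark_failed()` or `mark_ok()`. How and when a failed address is retried is left out of this sketch.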
12 days agoMerge pull request #69132 from BBoozmen/wip-sync-err-log-76950
Oguzhan Ozmen [Tue, 16 Jun 2026 19:11:24 +0000 (15:11 -0400)]
Merge pull request #69132 from BBoozmen/wip-sync-err-log-76950

rgw/multisite: do not log transient per-object EBUSY/EAGAIN errors in sync error log

12 days agoMerge PR #69498 into main
Patrick Donnelly [Tue, 16 Jun 2026 18:51:39 +0000 (14:51 -0400)]
Merge PR #69498 into main

* refs/pull/69498/head:
doc/releases/tentacle: scope wording to package installs
doc/releases/tentacle: remove bogus notable changes
doc/releases/tentacle: add more v20.2.2 blocker fixes
doc: tentacle 20.2.2 release notes

Reviewed-by: Adam King <adking@redhat.com>
12 days agodoc/releases/tentacle: scope wording to package installs 69498/head
Patrick Donnelly [Tue, 16 Jun 2026 02:51:58 +0000 (22:51 -0400)]
doc/releases/tentacle: scope wording to package installs

Resolves: https://tracker.ceph.com/issues/77357
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
12 days agodoc/releases/tentacle: remove bogus notable changes
Patrick Donnelly [Mon, 15 Jun 2026 23:38:20 +0000 (19:38 -0400)]
doc/releases/tentacle: remove bogus notable changes

Resolves: https://tracker.ceph.com/issues/77357
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
12 days agoMerge pull request #68881 from jacquesh/fix-rgw-log-merging
Oguzhan Ozmen [Tue, 16 Jun 2026 17:37:14 +0000 (13:37 -0400)]
Merge pull request #68881 from jacquesh/fix-rgw-log-merging

rgw: Fix ops logs sometimes having several entries per line.

12 days agodoc/mgr/smb: document client compatibility mode 69419/head
Shweta Sodani [Fri, 12 Jun 2026 11:41:15 +0000 (17:11 +0530)]
doc/mgr/smb: document client compatibility mode

Add documentation for --client-compat parameter in 'cluster create'
command and new 'cluster update client-compat' command. This feature
enables macOS-specific SMB optimizations (fruit VFS, streams_xattr)
and can be set during cluster creation or updated for existing clusters.

Signed-off-by: Shweta Sodani <shsodani@redhat.com>
12 days agoMerge pull request #69366 from sseshasa/wip-fix-ok-to-upgrade-osd-sort-order
Sridhar Seshasayee [Tue, 16 Jun 2026 16:16:13 +0000 (21:46 +0530)]
Merge pull request #69366 from sseshasa/wip-fix-ok-to-upgrade-osd-sort-order

mgr/DaemonServer: Aggregate and globally sort OSDs for ok-to-upgrade

Reviewed-by: Radoslaw Zarzynski <rzarzynski@redhat.com>
Reviewed-by: Nitzan Mordechai <nmordech@ibm.com>
12 days agocrimson/os/seastore: fix laddr_t formatter and its use 69476/head
Ronen Friedman [Mon, 15 Jun 2026 11:24:18 +0000 (11:24 +0000)]
crimson/os/seastore: fix laddr_t formatter and its use

The existing 'laddr_t' formatter did not support a ':x' format specifier
(in fact, the output was always hexadecimal).
Here we remove the ':x', but also refactor the custom formatter to
avoid using the streambuf mechanism.
Note - SEASTORE_LADDR_USE_BOOST_U128 is no longer supported by the formatter.

Fixes: https://tracker.ceph.com/issues/77399
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
12 days agoMerge pull request #69395 from aclamk/aclamk-doc-bs-fast-recovery
Adam Kupczyk [Tue, 16 Jun 2026 13:03:31 +0000 (15:03 +0200)]
Merge pull request #69395 from aclamk/aclamk-doc-bs-fast-recovery

doc, bluestore: Documentation for Fast Onode Recovery feature

12 days agomgr/smb: add tests for client compatibility mode
Shweta Sodani [Fri, 12 Jun 2026 11:29:22 +0000 (16:59 +0530)]
mgr/smb: add tests for client compatibility mode

Add comprehensive test coverage for the new client compatibility
feature that enables macOS-specific SMB optimizations:

- test_enums.py: Add tests for ClientSupportMode enum values
  (DEFAULT and MACOS) and string representation
- test_resources.py: Add tests for cluster client_compat field,
  effective_client_compat property, and is_macos_compatibility_enabled
  property with different mode configurations
- test_smb.py: Add integration tests for cluster_update_client_compat
  CLI command including successful updates and error handling for
  non-existent clusters

These tests ensure the client compatibility mode can be properly
set, retrieved, and updated at the cluster level.

Signed-off-by: Shweta Sodani <shsodani@redhat.com>
12 days agomgr/smb: Add client support mode for macOS-specific SMB features
Shweta Sodani [Thu, 11 Jun 2026 13:21:33 +0000 (18:51 +0530)]
mgr/smb: Add client support mode for macOS-specific SMB features

This commit introduces a new cluster-level configuration option to enable
client-specific SMB optimizations, starting with macOS support.

Usage:
  ceph smb cluster create <cluster-id> --client-compat macos
  ceph smb cluster update client-compat macos <cluster-id>

Signed-off-by: Shweta Sodani <ssodani@redhat.com>
12 days agorocksdb: update submodule to fix FTBFS due to missing <cstdint> 69515/head
Kefu Chai [Tue, 16 Jun 2026 08:24:16 +0000 (16:24 +0800)]
rocksdb: update submodule to fix FTBFS due to missing <cstdint>

43dd4cbd370 bumped the rocksdb submodule to v7.10.2 for CVE-2022-23476,
dropping the <cstdint> includes the v7.9.2 pin carried.
db/blob/blob_file_meta.h uses uint64_t but no longer includes <cstdint>,
so it compiles only where another header pulls <cstdint> in transitively.
GCC with libstdc++ 16.1.0 no longer does, so the build fails:

    db/blob/blob_file_meta.h: error: 'uint64_t' has not been declared

our targeted distros still pull it in, so the failure went unnoticed
there: ubuntu jammy (GCC 11.2.0) and noble (GCC 13.2).

bump the submodule to a cherry-pick of upstream rocksdb 72c3887167,
which fixes the same FTBFS.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
12 days agorgw/d4n: Update policy unit test 65286/head
Samarah Uriarte [Thu, 11 Jun 2026 17:48:42 +0000 (17:48 +0000)]
rgw/d4n: Update policy unit test

Signed-off-by: Samarah Uriarte <samarah.uriarte@ibm.com>
12 days agorgw/d4n: adding a thread to asynchronously update
Pritha Srivastava [Mon, 1 Sep 2025 08:56:17 +0000 (14:26 +0530)]
rgw/d4n: adding a thread to asynchronously update
localweight to the cache backend. Removing the code
to update the localweight from GET and PUT requests.

Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
12 days agomgr/dashboard: align nvmeof cli with missing parameters and functions from the old... 69339/head
Tomer Haskalovitch [Tue, 2 Jun 2026 10:20:28 +0000 (13:20 +0300)]
mgr/dashboard: align nvmeof cli with missing parameters and functions from the old nvmeof cli

fixes: https://tracker.ceph.com/issues/77108
Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
12 days agocrimson,mgr: mark assert-only variables [[maybe_unused]] 69509/head
Sun Yuechi [Mon, 15 Jun 2026 19:41:34 +0000 (03:41 +0800)]
crimson,mgr: mark assert-only variables [[maybe_unused]]

These variables are only read inside assert(), which is compiled out
under NDEBUG. Mark them [[maybe_unused]] to silence the warnings while
keeping the debug-only assert() style used by the surrounding code:

  src/crimson/os/seastore/lba/btree_lba_manager.cc:1078: unused variable 'orig_len' [-Wunused-variable]
  src/crimson/os/seastore/omap_manager/log/log_manager.cc:73: variable 'ret' set but not used [-Wunused-but-set-variable]
  src/crimson/os/seastore/transaction_manager.cc:382: variable 'intermediate_key' set but not used [-Wunused-but-set-variable]
  src/mgr/PyModule.cc:166,186: unused variable 'r' [-Wunused-variable]

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
12 days agotest/crimson/seastore: use gtest assertion macros instead of assert()
Sun Yuechi [Mon, 15 Jun 2026 19:41:27 +0000 (03:41 +0800)]
test/crimson/seastore: use gtest assertion macros instead of assert()

Plain assert() is compiled out under NDEBUG, leaving the checked
variables unused. Use the always-evaluated gtest macros instead.

  src/test/crimson/seastore/test_cbjournal.cc:586: variable 'old_written_to' set but not used [-Wunused-but-set-variable]
  src/test/crimson/seastore/test_btree_lba_manager.cc:345: unused structured binding declaration [-Wunused-variable]

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
12 days agorgw/d4n: deleting LFUDAEntry and LFUDAObjEntry instances
Pritha Srivastava [Thu, 28 Aug 2025 07:06:07 +0000 (12:36 +0530)]
rgw/d4n: deleting LFUDAEntry and LFUDAObjEntry instances
in LFUDAPolicy destructor.

Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
12 days agocrimson,test: remove unused functions and dead variable
Sun Yuechi [Mon, 15 Jun 2026 19:41:19 +0000 (03:41 +0800)]
crimson,test: remove unused functions and dead variable

Fixing these warnings:

  src/crimson/os/seastore/seastore.cc:83: 'omaptree_initialize' defined but not used [-Wunused-function]
  src/crimson/osd/replicated_recovery_backend.cc:733: 'nullopt_if_empty' defined but not used [-Wunused-function]
  src/test/rgw/test_rgw_kms_cache.cc:63: 'rethrow' defined but not used [-Wunused-function]
  src/test/librados/test_cxx.cc:215: variable 'cmd' set but not used [-Wunused-but-set-variable]

Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
12 days agoMerge pull request #69106 from bigjust/upgrade-rocksdb-7.10.2-cve-2022-23476
SrinivasaBharathKanta [Tue, 16 Jun 2026 04:24:11 +0000 (09:54 +0530)]
Merge pull request #69106 from bigjust/upgrade-rocksdb-7.10.2-cve-2022-23476

rocksdb: upgrade submodule to v7.10.2 to address CVE-2022-23476

12 days agoMerge pull request #68662 from kamoltat/wip-ksirivad-stretch-crush-experiment
SrinivasaBharathKanta [Tue, 16 Jun 2026 04:23:01 +0000 (09:53 +0530)]
Merge pull request #68662 from kamoltat/wip-ksirivad-stretch-crush-experiment

src/script: init test_stretch_crush_collisions.sh

12 days agoMerge pull request #68425 from dheart-joe/pretty-kvtool-output
SrinivasaBharathKanta [Tue, 16 Jun 2026 04:22:07 +0000 (09:52 +0530)]
Merge pull request #68425 from dheart-joe/pretty-kvtool-output

tool/ceph-kvstore-tool: add --pretty-binary-key option

12 days agoMerge pull request #67747 from indirasawant/wip-isawant-mgr-standby-details
SrinivasaBharathKanta [Tue, 16 Jun 2026 04:21:26 +0000 (09:51 +0530)]
Merge pull request #67747 from indirasawant/wip-isawant-mgr-standby-details

mon/mgr: include standby manager details in ceph mgr stat

12 days agodoc/releases/tentacle: add more v20.2.2 blocker fixes
Patrick Donnelly [Mon, 15 Jun 2026 23:30:24 +0000 (19:30 -0400)]
doc/releases/tentacle: add more v20.2.2 blocker fixes

Resolves: https://tracker.ceph.com/issues/77357
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
12 days agodoc: tentacle 20.2.2 release notes
Yuri Weinstein [Thu, 11 Jun 2026 17:09:25 +0000 (10:09 -0700)]
doc: tentacle 20.2.2 release notes

Resolves: https://tracker.ceph.com/issues/77357
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
12 days agoMerge pull request #69467 from kotreshhr/mirror-handle-dup-add-directory-notification
Kotresh HR [Tue, 16 Jun 2026 02:48:44 +0000 (08:18 +0530)]
Merge pull request #69467 from kotreshhr/mirror-handle-dup-add-directory-notification

tools/cephfs_mirror: Ignore duplicate directory acquire notifications

Reviewed-by: Venky Shankar <vshankar@redhat.com>
12 days agoMerge PR #69494 into main
Patrick Donnelly [Tue, 16 Jun 2026 01:25:50 +0000 (21:25 -0400)]
Merge PR #69494 into main

* refs/pull/69494/head:
doc: Document restarting failed release builds

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
12 days agoMerge PR #69492 into main
Patrick Donnelly [Tue, 16 Jun 2026 01:22:38 +0000 (21:22 -0400)]
Merge PR #69492 into main

* refs/pull/69492/head:
doc/dev/release-process: update according to supported releases

Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
12 days agodoc/dev/release-process: update according to supported releases 69492/head
Patrick Donnelly [Mon, 15 Jun 2026 21:14:13 +0000 (17:14 -0400)]
doc/dev/release-process: update according to supported releases

From: https://docs.ceph.com/en/latest/start/os-recommendations/

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
12 days agodoc: Document restarting failed release builds 69494/head
David Galloway [Mon, 15 Jun 2026 21:29:22 +0000 (17:29 -0400)]
doc: Document restarting failed release builds

Signed-off-by: David Galloway <david.galloway@ibm.com>
12 days agomgr/dashboard: bump nvmeof submodule to 1.8.2 69511/head
Vallari Agrawal [Mon, 15 Jun 2026 21:30:12 +0000 (00:30 +0300)]
mgr/dashboard: bump nvmeof submodule to 1.8.2

Update proto files and gateway submodule

Fixes: https://tracker.ceph.com/issues/77422
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
12 days agoMerge PR #69446 into main
Patrick Donnelly [Mon, 15 Jun 2026 20:14:23 +0000 (16:14 -0400)]
Merge PR #69446 into main

* refs/pull/69446/head:
python-common/cryptotools: stop using the removed X509Req API

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
12 days agoMerge pull request #69289 from anoopcs9/fix-run-samba-perms
John Mulligan [Mon, 15 Jun 2026 19:42:48 +0000 (15:42 -0400)]
Merge pull request #69289 from anoopcs9/fix-run-samba-perms

cephadm/smb: Bind mount /run with 0755

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Xavi Hernandez <xhernandez@gmail.com>
13 days agoMerge pull request #68532 from Ericmzhang/wip-overlapped-roots-autoscale
Ericmzhang [Mon, 15 Jun 2026 17:28:56 +0000 (10:28 -0700)]
Merge pull request #68532 from Ericmzhang/wip-overlapped-roots-autoscale

Mgr: Allow autoscaling for overlapped roots

13 days agoMerge pull request #68856 from tchaikov/wip-ceph-dencoder-dlclose
Kefu Chai [Mon, 15 Jun 2026 12:14:23 +0000 (20:14 +0800)]
Merge pull request #68856 from tchaikov/wip-ceph-dencoder-dlclose

ceph-dencoder: skip dlclose under ASan so leaks symbolise

Reviewed-by: Nitzan Mordechai <nmordec@ibm.com>
13 days agoMerge pull request #69415 from ronen-fr/wip-rf-clsrefcount
Ronen Friedman [Mon, 15 Jun 2026 11:31:57 +0000 (14:31 +0300)]
Merge pull request #69415 from ronen-fr/wip-rf-clsrefcount

crimson/osd: fix PGBackend::remove() to return ENOENT on no-op deletes

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
13 days agotools/cephfs_mirror: Ignore duplicate directory acquire notifications 69467/head
Kotresh HR [Sun, 14 Jun 2026 18:41:07 +0000 (00:11 +0530)]
tools/cephfs_mirror: Ignore duplicate directory acquire notifications

Make PeerReplayer::add_directory() idempotent when the mgr re-sends
acquire for a directory already in the replayer list.

Fixes: https://tracker.ceph.com/issues/77398
Signed-off-by: Kotresh HR <khiremat@redhat.com>
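
The idempotency described in this entry can be sketched in Python as follows. `PeerReplayerSketch` is a hypothetical stand-in, not the C++ `PeerReplayer` class, and its list of paths is an assumption made for illustration. The point is only that a repeated acquire for a known directory leaves the list unchanged, so a single later removal cannot leave a duplicate entry behind.

```python
from typing import List


class PeerReplayerSketch:
    """Hypothetical stand-in for the replayer's directory list."""

    def __init__(self) -> None:
        self.directories: List[str] = []

    def add_directory(self, dir_root: str) -> bool:
        # A repeated acquire for a known directory is ignored, so the
        # list never holds two entries for the same path.
        if dir_root in self.directories:
            return False
        self.directories.append(dir_root)
        return True

    def remove_directory(self, dir_root: str) -> None:
        # Removing an unknown directory is a no-op.
        if dir_root in self.directories:
            self.directories.remove(dir_root)
```

The boolean return value is a choice made for this sketch so callers and tests can tell a fresh add from an ignored duplicate.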
13 days agoqa/cephfs: Add test for duplicate directory acquire notify
Kotresh HR [Sun, 14 Jun 2026 18:41:02 +0000 (00:11 +0530)]
qa/cephfs: Add test for duplicate directory acquire notify

Verify that reloading the mirroring module and removing a directory
does not leave a ghost replayer entry that keeps syncing snapshots.

Fixes: https://tracker.ceph.com/issues/77398
Signed-off-by: Kotresh HR <khiremat@redhat.com>
13 days agoMerge pull request #69281 from Matan-B/wip-matanb-seastore-p95
Matan Breizman [Mon, 15 Jun 2026 09:11:08 +0000 (12:11 +0300)]
Merge pull request #69281 from Matan-B/wip-matanb-seastore-p95

seastore: add latency distribution (p95/p99) support

Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Mohit Agrawal <moagrawa@redhat.com>
13 days agoMerge pull request #66864 from ShwetaBhosale1/fix_issue_74045_ssh_hardning_feature
Shweta Bhosale [Mon, 15 Jun 2026 05:33:35 +0000 (11:03 +0530)]
Merge pull request #66864 from ShwetaBhosale1/fix_issue_74045_ssh_hardning_feature

mgr/cephadm: Cephadm hardening (SSH channel)

2 weeks agoMerge pull request #69005 from Jayaprakash-ibm/wip-jaya-mon-features-test-fix
SrinivasaBharathKanta [Sat, 13 Jun 2026 22:54:52 +0000 (04:24 +0530)]
Merge pull request #69005 from Jayaprakash-ibm/wip-jaya-mon-features-test-fix

qa: fix TEST_mon_features feature checks in mon/misc.sh

2 weeks agoMerge pull request #68650 from leonidc/fix_force_exit_gw
SrinivasaBharathKanta [Sat, 13 Jun 2026 22:54:33 +0000 (04:24 +0530)]
Merge pull request #68650 from leonidc/fix_force_exit_gw

nvmeofgw: fix forcing unavailable gw exit by sending

2 weeks agoMerge pull request #68435 from stzuraski898/wip-sz-76048
SrinivasaBharathKanta [Sat, 13 Jun 2026 22:53:49 +0000 (04:23 +0530)]
Merge pull request #68435 from stzuraski898/wip-sz-76048

mgr: ActivePyModules does not set Description in labeled get_perf_schema_python

2 weeks agoMerge pull request #68018 from kotreshhr/mirror-asok-metrics
Kotresh HR [Sat, 13 Jun 2026 18:29:00 +0000 (23:59 +0530)]
Merge pull request #68018 from kotreshhr/mirror-asok-metrics

tools/cephfs_mirror: Add metrics

2 weeks agopython-common/cryptotools: stop using the removed X509Req API 69446/head
Kefu Chai [Sat, 13 Jun 2026 01:50:09 +0000 (09:50 +0800)]
python-common/cryptotools: stop using the removed X509Req API

pyOpenSSL deprecated OpenSSL.crypto.X509Req in 24.2.0 (2024-07-20) and
removed it in 26.3.0 (2026-06-12). as we don't pin pyopenssl, CI picked
up the new release, and create_self_signed_cert() started failing with:

  AttributeError: module 'OpenSSL.crypto' has no attribute 'X509Req'

this took down run-tox-mgr, run-tox-mgr-dashboard-py3 and the mypy check.

we only used X509Req to build a subject name and then copied it into the
X509 cert. so drop it, and set the subject on the cert directly. the
resulting cert stays the same: subject from dname, issuer set to the same
subject, self-signed.

Fixes: https://tracker.ceph.com/issues/77391
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
2 weeks agoMerge pull request #69421 from ShwetaBhosale1/fix_issue_77340_remove_-P_from_shebang_...
Kefu Chai [Sat, 13 Jun 2026 01:13:48 +0000 (09:13 +0800)]
Merge pull request #69421 from ShwetaBhosale1/fix_issue_77340_remove_-P_from_shebang_flags

ceph.spec.in: disable -P in python shebang for cephadm binary

Reviewed-by: Redouane Kachach <rkachach@redhat.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Kefu Chai <k.chai@proxmox.com>
2 weeks agoqa: Add mirror metrics testcases 68018/head
Kotresh HR [Mon, 25 May 2026 18:44:03 +0000 (00:14 +0530)]
qa: Add mirror metrics testcases

Add testcases for the newly introduced mirror
metrics and validate them via the 'fs mirror peer status'
asok interface.

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2 weeks agodoc: Update the mirroring doc with new metrics fields
Kotresh HR [Mon, 25 May 2026 18:22:29 +0000 (23:52 +0530)]
doc: Update the mirroring doc with new metrics fields

Update the mirroring documentation and the
release notes with the newly introduced metrics and their
availability via the 'fs mirror peer status' asok
interface.

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2 weeks agoqa: Fix the mirroring tests with new nested peer_status output
Kotresh HR [Mon, 25 May 2026 17:34:57 +0000 (23:04 +0530)]
qa: Fix the mirroring tests with new nested peer_status output

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2 weeks agotools/cephfs_mirror: Nest peer_status metrics by dir path and peer uuid
Kotresh HR [Fri, 5 Jun 2026 14:23:14 +0000 (19:53 +0530)]
tools/cephfs_mirror: Nest peer_status metrics by dir path and peer uuid

Restructure peer_status output so mirrored directory paths can be
shared by multiple peers without key collisions. Metrics are grouped
as metrics/<dir_path>/peer/<peer_uuid>/ instead of flat dir keys.

Sample output:
--------------
1. When two dirs are syncing.
{
    "metrics": {
        "/parent/d0": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "syncing",
                    "current_syncing_snap": {
                        "id": 2,
                        "name": "d0_snap0",
                        "sync-mode": "full",
                        "avg_read_throughput_bytes": "9.01 MiB/s",
                        "avg_write_throughput_bytes": "26.74 MiB/s",
                        "crawl": {
                            "state": "completed",
                            "duration": "2s"
                        },
                        "datasync_queue_wait": {
                            "state": "completed",
                            "duration": "0s"
                        },
                        "bytes": {
                            "sync_bytes": "60.83 MiB",
                            "total_bytes": "149.94 MiB",
                            "sync_percent": "40.57%"
                        },
                        "files": {
                            "sync_files": 2028,
                            "total_files": 5000,
                            "sync_percent": "40.56%"
                        },
                        "eta": "10s"
                    },
                    "snaps_synced": 0,
                    "snaps_deleted": 0,
                    "snaps_renamed": 0
                }
            }
        },
        "/parent/d1": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "syncing",
                    "current_syncing_snap": {
                        "id": 3,
                        "name": "d1_snap0",
                        "sync-mode": "full",
                        "avg_read_throughput_bytes": "6.80 MiB/s",
                        "avg_write_throughput_bytes": "20.04 MiB/s",
                        "crawl": {
                            "state": "in-progress",
                            "duration": "2s"
                        },
                        "datasync_queue_wait": {
                            "state": "completed",
                            "duration": "1s"
                        },
                        "bytes": {
                            "sync_bytes": "4.12 MiB",
                            "total_bytes": "124.98 MiB",
                            "sync_percent": "3.30%"
                        },
                        "files": {
                            "sync_files": 125,
                            "total_files": 4189,
                            "sync_percent": "2.98%"
                        },
                        "eta": "18s"
                    },
                    "snaps_synced": 0,
                    "snaps_deleted": 0,
                    "snaps_renamed": 0
                }
            }
        }
    }
}
--------------
2. When two directories are synced.
{
    "metrics": {
        "/parent/d0": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "idle",
                    "last_synced_snap": {
                        "id": 2,
                        "name": "d0_snap0",
                        "crawl_duration": "2s",
                        "datasync_queue_wait_duration": "0s",
                        "sync_duration": "30s",
                        "sync_time_stamp": "422538.254127s",
                        "sync_bytes": "149.94 MiB",
                        "sync_files": 5000
                    },
                    "snaps_synced": 1,
                    "snaps_deleted": 0,
                    "snaps_renamed": 0
                }
            }
        },
        "/parent/d1": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "idle",
                    "last_synced_snap": {
                        "id": 3,
                        "name": "d1_snap0",
                        "crawl_duration": "2s",
                        "datasync_queue_wait_duration": "1s",
                        "sync_duration": "33s",
                        "sync_time_stamp": "422546.205798s",
                        "sync_bytes": "149.94 MiB",
                        "sync_files": 5000
                    },
                    "snaps_synced": 1,
                    "snaps_deleted": 0,
                    "snaps_renamed": 0
                }
            }
        }
    }
}

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>
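
A consumer of the nested layout could walk it as in the Python sketch below. The embedded sample is a trimmed combination of the two outputs above, with one idle and one syncing directory, and is not real command output. The helper names `iter_peer_metrics` and `file_sync_percent` are hypothetical.

```python
import json
from typing import Iterator, Optional, Tuple

# Trimmed sample following the metrics/<dir_path>/peer/<peer_uuid>/ layout.
SAMPLE = json.loads("""
{
    "metrics": {
        "/parent/d0": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "idle",
                    "last_synced_snap": {"id": 2, "name": "d0_snap0"},
                    "snaps_synced": 1
                }
            }
        },
        "/parent/d1": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "syncing",
                    "current_syncing_snap": {
                        "id": 3,
                        "name": "d1_snap0",
                        "files": {"sync_files": 125, "total_files": 4189}
                    },
                    "snaps_synced": 0
                }
            }
        }
    }
}
""")


def iter_peer_metrics(status: dict) -> Iterator[Tuple[str, str, dict]]:
    """Yield (dir_path, peer_uuid, metrics) for every directory/peer pair."""
    for dir_path, entry in status.get("metrics", {}).items():
        for peer_uuid, metrics in entry.get("peer", {}).items():
            yield dir_path, peer_uuid, metrics


def file_sync_percent(metrics: dict) -> Optional[float]:
    """Recompute the file sync percentage; None when nothing is syncing."""
    files = metrics.get("current_syncing_snap", {}).get("files")
    if not files or not files.get("total_files"):
        return None
    return 100.0 * files["sync_files"] / files["total_files"]


for dir_path, peer_uuid, metrics in iter_peer_metrics(SAMPLE):
    print(dir_path, peer_uuid, metrics["state"], file_sync_percent(metrics))
```

Because the peer UUID is its own key level, the same directory path can appear under several peers without one entry overwriting another.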
2 weeks agotools/cephfs_mirror: Add datasync_queue_wait_duration metric
Kotresh HR [Fri, 8 May 2026 00:22:59 +0000 (05:52 +0530)]
tools/cephfs_mirror: Add datasync_queue_wait_duration metric

Add the metric which measures the time spent by the snapshot
in the data queue waiting for the datasync threads.

Sample output:
When still 'waiting' in queue
{
    "/d1": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 18,
            "name": "d1_snap5",
            "sync-mode": "delta",
            "avg_read_throughput_bytes": "0.00 B/s",
            "avg_write_throughput_bytes": "0.00 B/s",
            "crawl": {
                "state": "in-progress",
                "duration": "13s"
            },
            "datasync_queue_wait": {
                "state": "waiting",
                "duration": "12s"
            },
            "bytes": {
                "sync_bytes": "0.00 B",
                "total_bytes": "110.99 MiB",
                "sync_percent": "0.00%"
            },
            "files": {
                "sync_files": 0,
                "total_files": 3719,
                "sync_percent": "0.00%"
            },
            "eta": "calculating..."
        },
        "last_synced_snap": {
            "id": 15,
            "name": "d1_snap4"
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
---------------
After 'complete'
{
    "/d1": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 18,
            "name": "d1_snap5",
            "sync-mode": "delta",
            "avg_read_throughput_bytes": "11.66 MiB/s",
            "avg_write_throughput_bytes": "34.55 MiB/s",
            "crawl": {
                "state": "completed",
                "duration": "17s"
            },
            "datasync_queue_wait": {
                "state": "completed",
                "duration": "19s"
            },
            "bytes": {
                "sync_bytes": "149.94 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "100.00%"
            },
            "files": {
                "sync_files": 5000,
                "total_files": 5000,
                "sync_percent": "100.00%"
            },
            "eta": "0s"
        },
        "last_synced_snap": {
            "id": 15,
            "name": "d1_snap4"
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
-----
Also stored in the last_synced_snap section
{
    "/d1": {
        "state": "idle",
        "last_synced_snap": {
            "id": 18,
            "name": "d1_snap5",
            "crawl_duration": "17s",
            "datasync_queue_wait_duration": "19s",
            "sync_duration": "44s",
            "sync_time_stamp": "8172.009480s",
            "sync_bytes": "149.94 MiB",
            "sync_files": 5000
        },
        "snaps_synced": 1,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2 weeks agotools/cephfs_mirror: Add eta metrics
Kotresh HR [Sat, 28 Mar 2026 11:23:33 +0000 (16:53 +0530)]
tools/cephfs_mirror: Add eta metrics

Add an estimated time of completion (ETA) for
the snapshot that is currently syncing. The
calculation uses the average read/write
throughput since the start of the snapshot sync,
not the current read/write throughput, so the
ETA reflects the average rate rather than
momentary changes in throughput.

Sample output:
-------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "avg_read_throughput_bytes": "3.28 MiB/s",
            "avg_write_throughput_bytes": "71.03 MiB/s",
            "crawl": {
                "state": "completed",
                "duration": "1s"
            },
            "bytes": {
                "sync_bytes": "2.31 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "1.54%"
            },
            "files": {
                "sync_files": 67,
                "total_files": 5000,
                "sync_percent": "1.34%"
            },
            "eta": "calculating..."
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
------------------------------------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "avg_read_throughput_bytes": "12.17 MiB/s",
            "avg_write_throughput_bytes": "66.46 MiB/s",
            "crawl": {
                "state": "completed",
                "duration": "1s"
            },
            "bytes": {
                "sync_bytes": "26.64 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "17.77%"
            },
            "files": {
                "sync_files": 892,
                "total_files": 5000,
                "sync_percent": "17.84%"
            },
            "eta": "10s"
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
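
The ETA arithmetic described above can be sketched as follows. This is a minimal illustration, not the cephfs_mirror code: the function name and its parameters are hypothetical, and it folds read and write throughput into a single average rate of synced bytes, which is an assumption.

```python
from typing import Optional


def estimate_eta_seconds(sync_bytes: int, total_bytes: int,
                         elapsed_seconds: float) -> Optional[float]:
    """Estimate the seconds remaining from the average rate since sync start.

    Hypothetical helper: returns None when no average rate is available yet.
    """
    if sync_bytes <= 0 or elapsed_seconds <= 0:
        return None
    # Average throughput over the whole sync so far, not the current rate.
    avg_bytes_per_second = sync_bytes / elapsed_seconds
    remaining_bytes = max(total_bytes - sync_bytes, 0)
    return remaining_bytes / avg_bytes_per_second
```

Because the rate is averaged from the start of the sync, a brief stall or burst moves the estimate only slightly.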

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2 weeks agotools/cephfs_mirror: Add read/write throughput
Kotresh HR [Sat, 28 Mar 2026 10:57:02 +0000 (16:27 +0530)]
tools/cephfs_mirror: Add read/write throughput

The read throughput measures the bytes read
per second from the source ceph filesystem.
Similarly, the write throughput measures the
bytes written per second to the remote ceph
filesystem. Both are derived from the time
spent in preadv and pwritev calls.

Sample output:
-------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "avg_read_throughput_bytes": "12.69 MiB/s",
            "avg_write_throughput_bytes": "54.49 MiB/s",
            "crawl": {
                "state": "completed",
                "duration": "1s"
            },
            "bytes": {
                "sync_bytes": "149.94 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "100.00%"
            },
            "files": {
                "sync_files": 5000,
                "total_files": 5000,
                "sync_percent": "100.00%"
            }
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
-------------
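
A minimal sketch of deriving an average throughput from the time spent inside I/O calls, as described above. The class and function names are hypothetical and the I/O call is passed in as a plain callable; the real tool times its preadv and pwritev calls.

```python
import time


class ThroughputMeter:
    """Accumulates bytes moved and the time spent inside I/O calls."""

    def __init__(self) -> None:
        self.total_bytes = 0
        self.total_seconds = 0.0

    def record(self, nbytes: int, seconds: float) -> None:
        self.total_bytes += nbytes
        self.total_seconds += seconds

    def bytes_per_second(self) -> float:
        # Average over all recorded calls; 0.0 until some time is recorded.
        if self.total_seconds <= 0:
            return 0.0
        return self.total_bytes / self.total_seconds


def timed_call(meter: ThroughputMeter, io_call, *args) -> int:
    """Run io_call(*args), which must return a byte count, and record it."""
    start = time.monotonic()
    nbytes = io_call(*args)
    meter.record(nbytes, time.monotonic() - start)
    return nbytes
```

One meter would be kept for reads and another for writes, so that the two averages are reported separately.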

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2 weeks agotools/cephfs_mirror: Add crawl-state and sync-mode metric
Kotresh HR [Sat, 28 Mar 2026 10:12:43 +0000 (15:42 +0530)]
tools/cephfs_mirror: Add crawl-state and sync-mode metric

Add the 'crawl' and 'sync-mode' metrics.

sync-mode: full/delta,
"crawl": {
           "state": "completed",
           "duration": "37s"
       }

sync-mode:
---------
The 'sync-mode: full/delta' is added to peer status.
'delta' means that blockdiff along with snapdiff is
being used to sync the files, whereas 'full' means
that the full directory is crawled and each file is
synced entirely.

crawl:
-----
The state can be in-progress/completed. This
identifies whether the crawler thread is done
queuing the files for data sync threads.

The crawl duration is also shown. While the
crawl is in-progress, the duration is the time
elapsed since the start of the crawl. Once the
crawl state is completed, the duration is the
total time taken for the crawl.

The crawl duration is shown in "d h m s" format.
The existing 'sync_duration' in last_synced_snap
is formatted the same way.

The values are as below. When crawl state is
completed, the 'total_files' metric doesn't
grow anymore.

crawl_duration:
--------------
The crawl_duration of last snapshot is saved in last_synced_snap
section as well.

Sample outputs:
---------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "crawl": {
                "state": "in-progress",
                "duration": "21s"
            },
            "bytes": {
                "sync_bytes": "149.25 MiB",
                "total_bytes": "176.47 MiB",
                "sync_percent": "84.57%"
            },
            "files": {
                "sync_files": 4931,
                "total_files": 5845,
                "sync_percent": "84.36%"
            }
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
------------------------------------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "crawl": {
                "state": "completed",
                "duration": "37s"
            },
            "bytes": {
                "sync_bytes": "891.39 MiB",
                "total_bytes": "901.52 MiB",
                "sync_percent": "98.88%"
            },
            "files": {
                "sync_files": 29656,
                "total_files": 30000,
                "sync_percent": "98.85%"
            }
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
---------
  {
        "/d0": {
            "state": "syncing",
            "current_syncing_snap": {
                "id": 3,
                "name": "d0_snap1",
                "sync-mode": "delta",
                "crawl": {
                    "state": "completed",
                    "duration": "15s"
                },
                "bytes": {
                    "sync_bytes": "120.20 MiB",
                    "total_bytes": "149.94 MiB",
                    "sync_percent": "80.16%"
                },
                "files": {
                    "sync_files": 4032,
                    "total_files": 5000,
                    "sync_percent": "80.64%"
                }
            },
            "last_synced_snap": {
                "id": 2,
                "name": "d0_snap0",
                "crawl_duration": "17s",
                "sync_duration": 45,
                "sync_time_stamp": "5642.805770s",
                "sync_bytes": "300.85 MiB",
                "sync_files": 10000
            },
            "snaps_synced": 1,
            "snaps_deleted": 0,
            "snaps_renamed": 0
        }
    }
-------------
{
    "/d0": {
        "state": "idle",
        "last_synced_snap": {
            "id": 2,
            "name": "d0_snap0",
            "crawl_duration": "17s",
            "sync_duration": "2m 38s",
            "sync_time_stamp": "9259.225009s",
            "sync_bytes": "901.52 MiB",
            "sync_files": 30000
        },
        "snaps_synced": 1,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
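
The "d h m s" duration format mentioned above can be sketched as below. The function name is hypothetical, and omitting zero-valued units is an assumption based on sample values such as "17s" and "2m 38s".

```python
def format_duration(total_seconds: int) -> str:
    """Format a duration in seconds as 'd h m s', omitting zero-valued units."""
    days, rem = divmod(int(total_seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    parts = []
    if days:
        parts.append(f"{days}d")
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    # Always emit something, so a zero duration is shown as "0s".
    if seconds or not parts:
        parts.append(f"{seconds}s")
    return " ".join(parts)
```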

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2 weeks agotools/cephfs_mirror: Add inprogress bytes and files metric
Kotresh HR [Mon, 16 Feb 2026 10:59:31 +0000 (16:29 +0530)]
tools/cephfs_mirror: Add inprogress bytes and files metric

Add the following mirroring progress metrics to
current_syncing_snap:

bytes:
  sync_bytes - bytes synced till now
  total_bytes - total bytes to be synced
  sync_percent - Percentage of bytes synced till now
files:
  total_files - Total files to be synced
  sync_files - files synced till now
  sync_percent - Percentage of files synced till now

sync_files and sync_bytes are also stored in last_synced_snap section
after the snapshot is synced.

Byte counts are formatted as shown below.

Sample output:
--------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 3,
            "name": "d0_snap1",
            "bytes": {
                "sync_bytes": "120.20 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "80.16%"
            },
            "files": {
                "sync_files": 4032,
                "total_files": 5000,
                "sync_percent": "80.64%"
            }
        },
        "last_synced_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync_duration": 45,
            "sync_time_stamp": "5642.805770s",
            "sync_bytes": "300.85 MiB",
            "sync_files": 10000
        },
        "snaps_synced": 1,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
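
A minimal sketch of how the progress values above could be formatted. Both helper names are hypothetical; the binary units and two decimal places follow the sample output, and returning "0.00%" when the total is zero is an assumption.

```python
def format_bytes(nbytes: int) -> str:
    """Format a byte count with binary units and two decimals."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(nbytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024
    return f"{value:.2f} {units[-1]}"  # not reached; keeps the return type total


def sync_percent(done: int, total: int) -> str:
    """Format the synced share of a total as a percentage string."""
    if total <= 0:
        return "0.00%"
    return f"{100.0 * done / total:.2f}%"
```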

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>
2 weeks agodoc/rados/bluestore: Fix flags for bluestore_allocation_from_file 69395/head
Adam Kupczyk [Fri, 12 Jun 2026 17:10:23 +0000 (19:10 +0200)]
doc/rados/bluestore: Fix flags for bluestore_allocation_from_file

Set bluestore_allocation_from_file flags to 'startup'.
Without it, the documentation claims the option is 'runtime updateable'.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
2 weeks agodoc/man/8/ceph-bluestore-tool: Add doc for recovery-compare command
Adam Kupczyk [Wed, 10 Jun 2026 06:50:21 +0000 (08:50 +0200)]
doc/man/8/ceph-bluestore-tool: Add doc for recovery-compare command

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
2 weeks agodoc/rados/bluestore: Fast onode recovery
Adam Kupczyk [Tue, 9 Jun 2026 18:16:03 +0000 (20:16 +0200)]
doc/rados/bluestore: Fast onode recovery

Add a new documentation page about a new feature: multithreaded onode
recovery.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
2 weeks agoMerge pull request #69300 from rkachach/fix_issue_mgmt_gw_qa
Redouane Kachach [Fri, 12 Jun 2026 13:16:14 +0000 (15:16 +0200)]
Merge pull request #69300 from rkachach/fix_issue_mgmt_gw_qa

qa: extend the ignore-list for the mgmt-gateway test suite

Reviewed-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>
2 weeks agoMerge pull request #64618 from sseshasa/wip-fix-mclock-slow-ops-during-recovery
Sridhar Seshasayee [Fri, 12 Jun 2026 13:07:50 +0000 (18:37 +0530)]
Merge pull request #64618 from sseshasa/wip-fix-mclock-slow-ops-during-recovery

osd/scheduler: Classify subOp reads according to op priority for mClock

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2 weeks agoMerge pull request #69284 from JonBailey1993/remove_incorrect_unit_tests
Jon Bailey [Fri, 12 Jun 2026 12:49:01 +0000 (13:49 +0100)]
Merge pull request #69284 from JonBailey1993/remove_incorrect_unit_tests

test: Remove invalid unit test

Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>
2 weeks agoMerge pull request #68859 from ashjosh1git/ceph-tracker-74872-cephadm-debug-log
Redouane Kachach [Fri, 12 Jun 2026 11:04:13 +0000 (13:04 +0200)]
Merge pull request #68859 from ashjosh1git/ceph-tracker-74872-cephadm-debug-log

mgr/cephadm: Control cephadm files logging based on a mgr flag

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
Reviewed-by: Adam King <adking@redhat.com>
2 weeks agoqa: extend the ignore-list for the mgmt-gateway test suite 69300/head
Redouane Kachach [Fri, 5 Jun 2026 08:14:28 +0000 (10:14 +0200)]
qa: extend the ignore-list for the mgmt-gateway test suite

Let's add CEPHADM_AGENT_DOWN and CEPHADM_STRAY_DAEMON errors

Fixes: https://tracker.ceph.com/issues/77131
Signed-off-by: Redouane Kachach <rkachach@ibm.com>
2 weeks agoMerge pull request #69247 from rkachach/fix_issue_76979
Redouane Kachach [Fri, 12 Jun 2026 10:51:21 +0000 (12:51 +0200)]
Merge pull request #69247 from rkachach/fix_issue_76979

mgr/cephadm: Don't skip OSDs with non-empty osdspec_affinity

Reviewed-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
2 weeks agoMerge pull request #69299 from rkachach/fix_issue_77130
Redouane Kachach [Fri, 12 Jun 2026 10:50:24 +0000 (12:50 +0200)]
Merge pull request #69299 from rkachach/fix_issue_77130

qa/cephadm: fix test_repos.sh for jammy nodes

Reviewed-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>
Reviewed-by: Adam King <adking@redhat.com>
2 weeks agoMerge pull request #67466 from rkachach/fix_new_secrets_mgr_module_v0
Redouane Kachach [Fri, 12 Jun 2026 10:48:32 +0000 (12:48 +0200)]
Merge pull request #67466 from rkachach/fix_new_secrets_mgr_module_v0

Adding new secrets mgr module support

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
2 weeks agoMerge pull request #67384 from ashjosh1git/ceph-tracker-74986-validate-pro-name
Redouane Kachach [Fri, 12 Jun 2026 10:36:59 +0000 (12:36 +0200)]
Merge pull request #67384 from ashjosh1git/ceph-tracker-74986-validate-pro-name

python-common: Improve profile name string validation

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
2 weeks agoMerge pull request #69080 from kginonredhat/issue-75365-Grafana-container-fails-to...
Redouane Kachach [Fri, 12 Jun 2026 10:34:42 +0000 (12:34 +0200)]
Merge pull request #69080 from kginonredhat/issue-75365-Grafana-container-fails-to-start-reject-localhost

cephadm: set Grafana http_addr to 0.0.0.0 when unset

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
Reviewed-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>
2 weeks agoMerge pull request #68158 from rhcs-dashboard/fix-75826-main
Redouane Kachach [Fri, 12 Jun 2026 10:30:14 +0000 (12:30 +0200)]
Merge pull request #68158 from rhcs-dashboard/fix-75826-main

mgr/cephadm: set default prometheus template in config-key store unless overridden by the user

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
2 weeks agoMerge pull request #68428 from kginonredhat/wip-74058-force-delete-data
Redouane Kachach [Fri, 12 Jun 2026 10:28:50 +0000 (12:28 +0200)]
Merge pull request #68428 from kginonredhat/wip-74058-force-delete-data

mgr/cephadm: plumb force_delete_data through daemon/service removal

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
2 weeks agoMerge pull request #66885 from amathuria/wip-amat-crimson-merge-support
Aishwarya Mathuria [Fri, 12 Jun 2026 09:48:37 +0000 (15:18 +0530)]
Merge pull request #66885 from amathuria/wip-amat-crimson-merge-support

Crimson PG merging support

2 weeks agodoc/cephadm: Document cephadm_binary_logging_level option 68859/head
Ashwin M. Joshi [Fri, 12 Jun 2026 08:38:18 +0000 (14:08 +0530)]
doc/cephadm: Document cephadm_binary_logging_level option

Add documentation for the new cephadm binary logging level configuration

Fixes: https://tracker.ceph.com/issues/74872
Signed-off-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>
2 weeks agomgr/cephadm: Control cephadm.log messages based on a new mgr logging level flag
Ashwin M. Joshi [Tue, 10 Feb 2026 06:29:49 +0000 (11:59 +0530)]
mgr/cephadm: Control cephadm.log messages based on a new mgr logging level flag

  Introduces a new 'cephadm_binary_logging_level' config option to control
  the verbosity of cephadm logging to persistent destinations (cephadm.log, syslog).

  - Adds --logging-level CLI flag (info, debug, error, warning)
  - Adds mgr/cephadm/cephadm_binary_logging_level config option
  - Applies logging level to file and syslog handlers
  - Console handlers maintain their defaults for terminal UX
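
  A minimal sketch of the handler behaviour listed above, using Python's
  standard logging module. The function and constant names are hypothetical
  and this is not the cephadm code; it only shows a level being applied to
  file and syslog handlers while console handlers are left alone.

```python
import logging
import logging.handlers

# Handler types treated as persistent destinations in this sketch.
PERSISTENT_HANDLER_TYPES = (logging.FileHandler,
                            logging.handlers.SysLogHandler)

# The four level names listed for the --logging-level flag.
LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
}


def apply_persistent_log_level(logger: logging.Logger, level_name: str) -> None:
    """Set the level on file and syslog handlers only."""
    try:
        level = LEVELS[level_name.lower()]
    except KeyError:
        raise ValueError(f"unsupported logging level: {level_name}")
    for handler in logger.handlers:
        # Console (stream) handlers are skipped and keep their own defaults.
        if isinstance(handler, PERSISTENT_HANDLER_TYPES):
            handler.setLevel(level)
```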

Fixes: https://tracker.ceph.com/issues/74872
Signed-off-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>
2 weeks agoMerge pull request #69329 from tchaikov/wip-crimson-cleanup
Kefu Chai [Fri, 12 Jun 2026 07:43:57 +0000 (15:43 +0800)]
Merge pull request #69329 from tchaikov/wip-crimson-cleanup

crimson/osd: coroutinize OSD::start and remove OSD::startup_time

Reviewed-by: Aishwarya Mathuria <amathuri@redhat.com>
2 weeks agoMerge pull request #69412 from rhcs-dashboard/fix-77263-main
Aashish Sharma [Fri, 12 Jun 2026 07:32:10 +0000 (13:02 +0530)]
Merge pull request #69412 from rhcs-dashboard/fix-77263-main

mgr/dashboard: fix zone creation in rgw service creation form

Reviewed-by: Abhishek Desai <Abhishek.Desai1@ibm.com>
2 weeks agodoc/rados/configuration: Remove wpq recommendation warning for EC clusters 64618/head
Sridhar Seshasayee [Thu, 4 Jun 2026 06:58:35 +0000 (12:28 +0530)]
doc/rados/configuration: Remove wpq recommendation warning for EC clusters

Remove the warning that recommends using wpq scheduler as a fallback for EC
clusters. This issue is addressed by considering EC recovery reads as
background, assigning an accurate cost for those reads and tuning the QoS
parameters associated with best-effort class of operations.

Signed-off-by: Sridhar Seshasayee <sridhar.seshasayee@ibm.com>
2 weeks agomclock_common: adjust mClock profile parameters to prevent backfill starvation
Sridhar Seshasayee [Mon, 25 May 2026 12:14:54 +0000 (17:44 +0530)]
mclock_common: adjust mClock profile parameters to prevent backfill starvation

Adjust the 'background_best_effort' queue parameters across the
three standard mClock profiles (high_client_ops, balanced, and
high_recovery_ops) to ensure best effort ops are not starved.

Previously, the 'background_best_effort' queue carried a default allocation
of 0% (MIN) reservation and a weight of 1 under these profiles. When
concurrent client traffic is dense, the zero reservation can, for example,
completely starve backfill sub-ops (MSG_OSD_EC_READ) on pools with
'allow_ec_optimizations' set to false. This starvation forces the Primary OSD
to hold internal BlueStore transactions and PG object locks for extended
windows, causing severe client median (50th percentile) latency inflation.

To prevent background starvation and resolve the effects of primary lock
retention, the profile configurations are tuned as listed below.

These profile changes force low-cost sub-ops to clear out of peer queues
rapidly and drop primary locks, which helps improve client completion
latency and tail latency (95th, 99th and 99.5th percentiles).

1. high_client_ops profile:
   - Grant 'background_best_effort' a safe 5% minimum reservation.
   - Scale the queue weight to 4.

2. balanced profile:
   - Grant 'background_best_effort' a 5% minimum reservation.
   - Set the queue weight to 2.

3. high_recovery_ops profile:
   - Grant 'background_best_effort' a 5% minimum reservation.
   - Set the queue weight to 2.

4. Modify the mClock config reference documentation to reflect the tuning
   changes to the best-effort QoS parameters across the profiles.

Note on Proportional Scaling Compatibility:
Configuring these changes shifts total reservations to 105% (e.g., 50%
client + 50% recovery + 5% best-effort under the Balanced profile). Under
heavy concurrent saturation, mClock's internal controls resolve this
gracefully via proportional down-scaling, preserving the underlying
device bandwidth limits for different classes of clients. For example, instead
of the client being allocated 50% bandwidth, a slightly lower reservation is
allocated while shifting the remaining bandwidth to the best-effort queue.
This minor scaling shift is virtually unnoticeable to the client application,
but it prevents the internal queue deadlocks.
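
The proportional down-scaling arithmetic in the note above can be illustrated with a short sketch. This is only the arithmetic, under the assumption that every class is scaled by the same factor; it is not how the mClock scheduler is implemented, and the function name is hypothetical.

```python
def scale_reservations(reservations: dict) -> dict:
    """Scale percentage reservations down proportionally if they exceed 100."""
    total = sum(reservations.values())
    if total <= 100.0:
        return dict(reservations)
    factor = 100.0 / total
    return {name: share * factor for name, share in reservations.items()}


# Balanced profile example from the note: 50% + 50% + 5% = 105%.
balanced = {
    "client": 50.0,
    "background_recovery": 50.0,
    "background_best_effort": 5.0,
}
scaled = scale_reservations(balanced)
```

With these inputs the client share drops from 50% to roughly 47.6%, and the best-effort class keeps roughly 4.8%.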

Signed-off-by: Sridhar Seshasayee <sridhar.seshasayee@ibm.com>
2 weeks agomclock_common, mClockScheduler: Add perf counters for scheduler ops
Sridhar Seshasayee [Tue, 21 Apr 2026 12:30:50 +0000 (18:00 +0530)]
mclock_common, mClockScheduler: Add perf counters for scheduler ops

Add perf counters to show the status pertaining to the number of ops,
dynamic queue lengths, queue latency and bytes read for the following
ops handled in the high queues and in the scheduler queues:
 - peering
 - client
 - ec reads/writes
 - ec recovery reads

Additional counters can be added in the future as required.

Signed-off-by: Sridhar Seshasayee <sridhar.seshasayee@ibm.com>
2 weeks agosrc/messages, osd: Calculate and set cost for subOpReads for mClock scheduler
Sridhar Seshasayee [Mon, 28 Jul 2025 11:09:34 +0000 (16:39 +0530)]
src/messages, osd: Calculate and set cost for subOpReads for mClock scheduler

Previously, sub-op reads returned a hardcoded cost of 0, bypassing
mClock's background bandwidth and tag calculation mechanisms. This
allowed backfill operations to proceed un-metered, occasionally causing
backend resource contention and driving up client tail latencies.

Cost is calculated based on whether the complete chunk/shard or a subchunk
needs to be read. The possible cases are:
1. Read the complete chunk aligned length:
   - Cost is set to the length of the chunk aligned extent size.
2. Fragmented reads:
   - Consider the subchunk length and count to calculate the cost.
   - compute_cost evaluates the exact layout of fragmented shard bytes on
     disk by summing up the active subchunk allocations exactly once
     (`fragmented_shard_bytes += k.second * subchunk_size`).
   - Linear Extent Scaling: Scale the baseline footprint cleanly by
     multiplying it against the true count of read extents (`tl.size()`),
     achieving a highly efficient O(N) time complexity.
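
The two cases above can be sketched as follows. This is an illustration under stated assumptions, not the ECSubRead::cost() code: the function name and argument layout are hypothetical, an empty subchunk list is assumed to mean a complete chunk read, and summing the extent lengths in Case 1 is also an assumption.

```python
from typing import List, Tuple


def subop_read_cost(extents: List[Tuple[int, int]],
                    subchunks: List[Tuple[int, int]],
                    subchunk_size: int) -> int:
    """Return a cost for one shard read.

    extents:   (offset, length) pairs to read, chunk aligned
    subchunks: (first_index, count) pairs; empty means the whole chunk
    """
    if not subchunks:
        # Case 1: complete chunk read, cost is the chunk-aligned length.
        return sum(length for _, length in extents)
    # Case 2: sum the subchunk bytes once, then scale by the extent count.
    fragmented_shard_bytes = 0
    for _, count in subchunks:
        fragmented_shard_bytes += count * subchunk_size
    return fragmented_shard_bytes * len(extents)
```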

This linear cost model is compatible with pools running with
'allow_ec_optimizations' set to true. Under the FastEC optimized
pipeline, most operations are unified and bypass fragment slicing,
meaning requests will primarily match the Case 1 chunk-aligned path.
In Case 2 where applicable, the O(N) loop ensures that cost will
scale proportionally according to the layout.

It is important to note that the amount of data to read was set to an upper
bound defined by osd_recovery_max_chunk (8 MiB) and was rounded up to the
stripe width. The reason for setting a higher than actual upper bound is that
there may be cases where the object doesn't have the xattrs yet to determine
its size. Therefore, the amount to read was ultimately set to ~(8 MiB / k)
where k is the number of data shards. This can cause mClock to prolong
the recovery times as items stay longer in the queue. To address this, the
amount to read is set to the remaining length of the object to recover
if the object size is known. Otherwise, the amount to read is set to the
recovery chunk size as before. Therefore, in some cases, only the first
recovery read could be costly if the object context is not known.
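
A minimal sketch of the read-size choice described above. The function and parameter names are hypothetical; whether the real code also caps or rounds the remaining length is not covered here.

```python
from typing import Optional


def recovery_read_amount(object_size: Optional[int],
                         recovered_so_far: int,
                         recovery_chunk_size: int) -> int:
    """Pick how many bytes the next recovery read should request."""
    if object_size is None:
        # Size not known yet: fall back to the recovery chunk size.
        return recovery_chunk_size
    # Size known: request only what is left of the object.
    return max(object_size - recovered_so_far, 0)
```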

The MOSDECSubOpRead class introduces the following:
 - cost member. This necessitates an increment to the HEAD_VERSION and
   appropriate handling within the encode and decode methods.
 - compute_cost() that is called when creating the message by
   ECCommonL::ReadPipeline::do_read_op(). This calls into ECSubRead::cost()
   that performs the actual calculations to set the cost based on the cases
   mentioned above.
 - The same sequence applies to the EC optimized path in
   ECCommon::ReadPipeline::do_read_op().

Fixes: https://tracker.ceph.com/issues/71655
Signed-off-by: Sridhar Seshasayee <sridhar.seshasayee@ibm.com>
2 weeks agoosd/scheduler: Classify EC subOp reads according to op priority for mClock
Sridhar Seshasayee [Tue, 22 Jul 2025 08:39:16 +0000 (14:09 +0530)]
osd/scheduler: Classify EC subOp reads according to op priority for mClock

The change brings MSG_OSD_EC_READ into the fold of mClock scheduler. This
improves the scheduling of client and other classes of operation as they
are no longer unnecessarily preempted by the 'immediate' queue.
EC SubOps are now handled as follows:

 - EC SubOp reads generated during recovery will either go into the
   'background_recovery' or 'background_best_effort' class based on
   the recovery priority set for the op. EC SubOp reads generated by
   client operations will continue to be classified as 'immediate'.

 - EC SubOp writes generated as a result of client operations will
   continue to be classified as 'immediate'.

 - EC SubOp replies are considered high priority and therefore
   continue to be classed as 'immediate'.
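
The classification rules above can be sketched as below. The function name, the message-kind strings and the priority cutoff are all hypothetical; the commit message does not state which recovery priorities map to which class, so the cutoff and its direction are assumptions made only for illustration.

```python
# Hypothetical cutoff: recovery ops at or above it go to background_recovery.
HYPOTHETICAL_PRIORITY_CUTOFF = 10


def classify_ec_subop(kind: str, is_recovery: bool = False,
                      recovery_priority: int = 0) -> str:
    """Return the scheduler class for an EC sub-op in this sketch."""
    if kind == "ec_read":
        if not is_recovery:
            return "immediate"  # client-generated reads
        if recovery_priority >= HYPOTHETICAL_PRIORITY_CUTOFF:
            return "background_recovery"
        return "background_best_effort"
    if kind in ("ec_write", "ec_reply"):
        # Client writes and all replies stay in the 'immediate' class.
        return "immediate"
    raise ValueError(f"unknown EC sub-op kind: {kind}")
```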

Fixes: https://tracker.ceph.com/issues/71655
Signed-off-by: Sridhar Seshasayee <sridhar.seshasayee@ibm.com>
2 weeks agoosd/scheduler/mClockScheduler: Fix line alignments
Sridhar Seshasayee [Tue, 22 Jul 2025 08:23:07 +0000 (13:53 +0530)]
osd/scheduler/mClockScheduler: Fix line alignments

Signed-off-by: Sridhar Seshasayee <sridhar.seshasayee@ibm.com>
2 weeks agoosd/scheduler/mClockScheduler: Log the size of high priority queues.
Sridhar Seshasayee [Tue, 22 Jul 2025 08:08:16 +0000 (13:38 +0530)]
osd/scheduler/mClockScheduler: Log the size of high priority queues.

Signed-off-by: Sridhar Seshasayee <sridhar.seshasayee@ibm.com>