git-server-git.apps.pok.os.sepia.ceph.com Git - ceph-ci.git/log

skanta [Mon, 9 Feb 2026 04:55:14 +0000 (10:25 +0530)]

Merge branch 'wip-rfe-implement-ok-to-upgrade-command' of https://github.com/sseshasa/ceph into wip-bharath-testing-2026-02-09-1025

Kefu Chai [Mon, 9 Feb 2026 00:00:15 +0000 (08:00 +0800)]

Merge pull request #67254 from tchaikov/wip-doc-build-mgr-module-command

doc/_ext: fix ceph_commands.py for new decorator-based command system

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Jose Juan Palacios-Perez [Sun, 8 Feb 2026 20:16:36 +0000 (20:16 +0000)]

Merge pull request #67186 from perezjosibm/wip-perezjos-tracker74642

crimson: fix dump_metrics skipping metrics argument.

Patrick Donnelly [Sun, 8 Feb 2026 18:28:42 +0000 (13:28 -0500)]

Merge PR #67086 into main

* refs/pull/67086/head:
qa/suites/upgrade: Exclude ceph-osd-classic/crimson when installing LTS releases
qa/suites/fs/upgrade: Exclude ceph-osd-classic/crimson when installing LTS releases

Reviewed-by: Kefu Chai <k.chai@proxmox.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>

Patrick Donnelly [Sun, 8 Feb 2026 18:26:23 +0000 (13:26 -0500)]

Merge PR #67145 into main

* refs/pull/67145/head:
src/script/build-with-container.py: fix a few spelling errors

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>

Kefu Chai [Sun, 8 Feb 2026 12:34:15 +0000 (20:34 +0800)]

doc/_ext: fix ceph_commands.py for new decorator-based command system

After commit 4aa9e246f, mgr modules migrated from using a class-level
COMMANDS list to decorator-based command registration using per-module
CLICommand instances (e.g., @BalancerCLICommand.Read('balancer status')).

This broke the ceph_commands.py Sphinx extension which was hardcoded to
expect m.COMMANDS to be a list, causing documentation builds to fail.

But not all modules are using this per-module CLICommand. Some modules are
fully migrated (balancer, hello, etc.) and use decorators, while others
are partially migrated (volumes, progress, stats, influx, k8sevents,
osd_perf_query, osd_support) - they have CLICommand defined but still
use the old COMMANDS list.

This fix updates _collect_module_commands() to handle three scenarios:

1. Fully migrated modules: Check CLICommand.dump_cmd_list() and use it
if it returns commands
2. Partially migrated modules: Fall back to the old COMMANDS list if
dump_cmd_list() returns empty
3. Legacy modules: Use COMMANDS list if CLICommand doesn't exist

This ensures the Sphinx extension works with modules in any migration
state, maintaining backwards compatibility while supporting the new
decorator pattern.
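
The three-way fallback described above could look roughly like the
following. This is a minimal sketch, not the actual ceph_commands.py code:
the helper name collect_commands and the stand-in module classes are
hypothetical, and only CLICommand.dump_cmd_list() and COMMANDS are taken
from this commit message.

```python
from typing import Any, Dict, List


def collect_commands(module: Any) -> List[Dict[str, Any]]:
    """Return a module's command list regardless of its migration state."""
    cli = getattr(module, "CLICommand", None)
    if cli is not None:
        # Scenario 1: fully migrated, the decorators registered commands.
        cmds = cli.dump_cmd_list()
        if cmds:
            return list(cmds)
    # Scenario 2 (empty dump_cmd_list) and scenario 3 (no CLICommand):
    # fall back to the old class-level COMMANDS list.
    return list(getattr(module, "COMMANDS", []))


class _FakeCLI:
    """Hypothetical stand-in for a per-module CLICommand."""

    def __init__(self, cmds: List[Dict[str, Any]]) -> None:
        self._cmds = cmds

    def dump_cmd_list(self) -> List[Dict[str, Any]]:
        return self._cmds


class FullyMigrated:
    CLICommand = _FakeCLI([{"cmd": "balancer status"}])


class PartiallyMigrated:
    CLICommand = _FakeCLI([])
    COMMANDS = [{"cmd": "illustrative old-style command"}]


class Legacy:
    COMMANDS = [{"cmd": "illustrative legacy command"}]
```

Checking dump_cmd_list() first and treating an empty result the same as a
missing CLICommand keeps all three cases on a single code path.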

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

J. Eric Ivancich [Sat, 7 Feb 2026 03:45:25 +0000 (22:45 -0500)]

Merge pull request #67247 from ivancich/wip-fix-versioning-test-fix

rgw/test: fix rgw versioning test fix

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

Anthony D'Atri [Sat, 7 Feb 2026 00:44:12 +0000 (19:44 -0500)]

Merge pull request #67243 from anthonyeleven/updateslink

doc/start: Update Slack invite link in doc/start/get-involved.rst

J. Eric Ivancich [Fri, 6 Feb 2026 21:19:27 +0000 (16:19 -0500)]

rgw/test: fix rgw versioning test fix

Removing parentheses that are problematic.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>

Dan Mick [Fri, 6 Feb 2026 21:23:51 +0000 (13:23 -0800)]

Merge pull request #66467 from athanatos/wip-sjust-mgr-cli-command-74042

pybind/mgr: update modules to use independent CLICommand subtypes with distinct COMMAND attributes

Anthony D'Atri [Fri, 6 Feb 2026 14:05:07 +0000 (09:05 -0500)]

doc/start: Update Slack invite link in doc/start/get-involved.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>

Shraddha Agrawal [Fri, 6 Feb 2026 11:05:36 +0000 (16:35 +0530)]

Merge pull request #67220 from shraddhaag/wip-shraddhaag-74753

doc: add instructions for deploying crimson with cephadm

Afreen Misbah [Fri, 6 Feb 2026 08:46:22 +0000 (14:16 +0530)]

Merge pull request #67180 from afreen23/fix-notif-panel

mgr/dashboard: Fix footer of notification panel

Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>

Shraddha Agrawal [Wed, 4 Feb 2026 14:03:33 +0000 (19:33 +0530)]

doc: add instructions for deploying crimson with cephadm

This PR adds user-facing instructions on how to deploy crimson
OSDs with cephadm. It also updates the build information to reflect
the latest changes.

Fixes: https://tracker.ceph.com/issues/74753
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>

Josh Durgin [Thu, 5 Feb 2026 23:40:12 +0000 (15:40 -0800)]

Merge pull request #67222 from anthonymicmidd/wip-docs-page

Update foundation.rst

Reviewed-by: Josh Durgin <jdurgin@ibm.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>

SrinivasaBharathKanta [Thu, 5 Feb 2026 22:47:14 +0000 (04:17 +0530)]

Merge pull request #65623 from JonBailey1993/data_digests_are_inconsistent_fix

osd: fix for "data digests are inconsistent"

SrinivasaBharathKanta [Thu, 5 Feb 2026 22:44:43 +0000 (04:14 +0530)]

Merge pull request #65074 from bill-scales/test_pool_min_size

qa: test_pool_min_size should kill osds first then mark them down

Sridhar Seshasayee [Mon, 27 Oct 2025 16:34:54 +0000 (22:04 +0530)]

mgr/DaemonServer: Implement ok-to-upgrade command

Implement a new Mgr command called 'ok-to-upgrade' that returns a set of OSDs
within the provided CRUSH bucket that are safe to upgrade without reducing
immediate data availability.

The command accepts the following as input:
- CRUSH bucket name (required)
   - The CRUSH bucket type is limited to 'rack', 'chassis', 'host' and 'osd'.
     This is to prevent users from specifying a bucket type higher up the tree
     which could result in performance issues if the number of OSDs in the
     bucket is very high.
- The new Ceph version to check against. The accepted format is the short
   form of the Ceph version, e.g. 20.3.0-3803-g63ca1ffb5a2. (required)
- The maximum number of OSDs to consider if specified. (optional)

Implementation Details:

After sanity checks on the provided parameters, the following steps are
performed:

1. The set of OSDs within the CRUSH bucket is first determined.
2. From the main set of OSDs, a filtered set of OSDs not yet running the new
   Ceph version is created.
   - The OSD's 'ceph_version_short' string is read from the metadata using a
     new method called DaemonServer::get_osd_metadata(). The information is
     determined from the DaemonStatePtr maintained within the DaemonServer.
3. If all OSDs are already running the new Ceph version, a success report is
   generated and returned.
4. If OSDs are not running the new Ceph version, a new set (to_upgrade) is
   created.
5. If the current version cannot be determined, an error is logged and the
   output report with 'bad_no_version' field populated with the OSD in question
   is generated.
6. On the new set (to_upgrade), the existing logic in _check_offline_pgs() is
   executed to see if stopping any or all OSDs in the set as part of the upgrade
   can reduce immediate data availability.
   - If data availability is impacted, then the number of OSDs in the filtered
     set is reduced by a factor defined by a new config option called
     'mgr_osd_upgrade_check_convergence_factor' which is set to 0.8 by default.
   - The logic in _check_offline_pgs() is repeated for the new set.
   - The above is repeated until a safe subset of OSDs that can be stopped for
     upgrade is found. Each iteration reduces the number of OSDs to check by
     the convergence factor mentioned above.
7. It must be noted that the default value of
   'mgr_osd_upgrade_check_convergence_factor' is on the higher side in order to
   help determine an optimal set of OSDs to upgrade. In other words, a higher
   convergence factor would help maximize the number of OSDs to upgrade. In this
   case, the number of iterations and therefore the time taken to determine the
   OSDs to upgrade is proportional to the number of OSDs in the CRUSH bucket.
   The converse is true if a lower convergence factor is used.
8. If the number of OSDs determined is lower than the 'max' specified, then an
   additional loop is executed to determine if other children of the CRUSH
   bucket can be added to the existing set.
9. Once a viable set is determined, an output report is generated (sample
   reports are shown further below).

A standalone test is introduced that exercises the logic for both replicated
and erasure-coded pools by manipulating the min_size for a pool and checking for
upgradability. The test also performs other basic sanity checks and exercises
error conditions.
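
The shrinking loop from step 6 of the implementation details can be
sketched as follows. This is an illustration under stated assumptions, not
the DaemonServer code: find_safe_subset and the is_safe callback (standing
in for _check_offline_pgs()) are hypothetical names, and the commit message
does not say how the reduced size is rounded or which OSDs are dropped, so
truncating from the tail is an assumption.

```python
from typing import Callable, List


def find_safe_subset(
    candidates: List[int],
    is_safe: Callable[[List[int]], bool],
    convergence_factor: float = 0.8,
) -> List[int]:
    """Shrink the candidate OSD list until is_safe() accepts it."""
    subset = list(candidates)
    while subset and not is_safe(subset):
        new_size = int(len(subset) * convergence_factor)
        # Guard against a factor >= 1.0 so the loop always makes progress.
        if new_size >= len(subset):
            new_size = len(subset) - 1
        # Assumption: keep the leading OSDs and drop the tail.
        subset = subset[:new_size]
    return subset
```

With ten candidates and the default factor of 0.8, the sizes tried in this
sketch are 10, 8, 6, 4, 3, 2 and 1, which shows why a higher factor costs
more iterations but can settle on a larger set.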

The output shown below is for a cluster running on a single node with 10 OSDs
and with replicated pool configuration:

$ ceph osd ok-to-upgrade incerta06 01.00.00-gversion-test --format=json
{"ok_to_upgrade":true,"all_osds_upgraded":false,\
"osds_in_crush_bucket":[0,1,2,3,4,5,6,7,8,9],\
"osds_ok_to_upgrade":[0],"osds_upgraded":[],"bad_no_version":[]}

The following report is shown if all OSDs are running the desired Ceph version:

$ ceph osd ok-to-upgrade --crush_bucket localrack \
  --ceph_version 20.3.0-3803-g63ca1ffb5a2
{"ok_to_upgrade":false,"all_osds_upgraded":true,\
"osds_in_crush_bucket":[0,1,2,3,4,5,6,7,8,9],"osds_ok_to_upgrade":[],\
"osds_upgraded":[0,1,2,3,4,5,6,7,8,9],"bad_no_version":[]}

Fixes: https://tracker.ceph.com/issues/73031
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>

J. Eric Ivancich [Thu, 5 Feb 2026 19:36:02 +0000 (14:36 -0500)]

Merge pull request #67190 from ivancich/wip-log-more-versioning-test

rgw/test: add more output in boto3 versioning testing

Reviewed-by: Jane Zhu <jzhu116@bloomberg.net>

Pedro Gonzalez Gomez [Thu, 5 Feb 2026 17:02:45 +0000 (18:02 +0100)]

Merge pull request #66616 from rhcs-dashboard/cephfs-mirroring-wizard

mgr/dashboard: Cephfs Mirroring Wizard

Reviewed-by: Naman Munet <naman.munet@ibm.com>
Reviewed-by: Pedro Gonzalez Gomez <pegonzal@ibm.com>
Reviewed-by: Afreen Misbah <afreen@ibm.com>

Anthony M [Wed, 4 Feb 2026 15:28:26 +0000 (09:28 -0600)]

doc: update foundation.rst

Updating the Ceph Foundation members list and the community manager.

Signed-off-by: Anthony M <anthony@amicmid.com>

Ilya Dryomov [Thu, 5 Feb 2026 16:17:06 +0000 (17:17 +0100)]

Merge pull request #66393 from ljflores/wip-update-cluster-log-warnings

qa: update ignorelists for expected cluster log warnings

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

J. Eric Ivancich [Thu, 5 Feb 2026 15:39:16 +0000 (10:39 -0500)]

Merge pull request #66874 from tchaikov/wip-rgw-client-fix-leak

rgw: fix memory leak in RGWHTTPManager thread cleanup

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Guillaume Abrioux [Thu, 5 Feb 2026 14:04:41 +0000 (15:04 +0100)]

Merge pull request #67047 from guits/2430588

ceph-volume: avoid Device() instantiation in lvm OSD filtering

Pedro Gonzalez Gomez [Thu, 20 Nov 2025 14:09:03 +0000 (15:09 +0100)]

mgr/dashboard: Cephfs Mirroring Wizard
Fixes: https://tracker.ceph.com/issues/74200
Signed-off-by: Dnyaneshwari Talwekar <dtalweka@redhat.com>

Gil Bregman [Thu, 5 Feb 2026 08:44:24 +0000 (10:44 +0200)]

Merge pull request #67206 from gbregman/main

mgr/cephadm: Add IO statistics enable field to the cephadm NVMEoF spe…

Gil Bregman [Thu, 5 Feb 2026 06:32:23 +0000 (08:32 +0200)]

Merge branch 'ceph:main' into main

Dan Mick [Sat, 31 Jan 2026 03:42:19 +0000 (19:42 -0800)]

src/script/build-with-container.py: fix a few spelling errors

I finally snapped.

Signed-off-by: Dan Mick <dan.mick@redhat.com>

Kefu Chai [Fri, 9 Jan 2026 23:53:29 +0000 (07:53 +0800)]

rgw: fix memory leak in RGWHTTPManager thread cleanup

Fix memory leak detected by AddressSanitizer in unittest_http_manager.
The test was failing with ASan enabled due to rgw_http_req_data objects
not being properly cleaned up when the HTTP manager thread exits.

ASan reported the following leaks:

  Direct leak of 17152 byte(s) in 32 object(s) allocated from:
    #0 operator new(unsigned long)
    #1 RGWHTTPManager::add_request(RGWHTTPClient*)
       /ceph/src/rgw/rgw_http_client.cc:946:33
    #2 HTTPManager_SignalThread_Test::TestBody()
       /ceph/src/test/rgw/test_http_manager.cc:132:10

  Indirect leak of 768 byte(s) in 32 object(s) allocated from:
    #0 operator new(unsigned long)
    #1 rgw_http_req_data::rgw_http_req_data()
       /ceph/src/rgw/rgw_http_client.cc:52:22
    #2 RGWHTTPManager::add_request(RGWHTTPClient*)
       /ceph/src/rgw/rgw_http_client.cc:946:37

  SUMMARY: AddressSanitizer: 17920 byte(s) leaked in 64 allocation(s).

Root cause: The rgw_http_req_data class uses reference counting
(inherits from RefCountedObject). When a request is unregistered,
unregister_request() calls get() to increment the refcount, expecting
a corresponding put() to be called later.

In manage_pending_requests(), unregistered requests are properly
handled with both _unlink_request() and put(). However, in the thread
cleanup code (reqs_thread_entry exit path), only _unlink_request() was
called without the matching put(), causing a reference count leak.

The fix adds the missing put() call in the thread cleanup code to match
the reference counting pattern used in manage_pending_requests().
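
The unbalanced get()/put() pairing can be illustrated with a toy model. This
is a sketch only: the real code is C++, the class and function names below
are hypothetical, and the assumption that unlinking drops exactly one
reference is a simplification made for the example.

```python
from typing import List


class RefCounted:
    """Toy reference-counted object; freed when the count reaches zero."""

    def __init__(self) -> None:
        self.nref = 1  # reference held by the manager's request table
        self.freed = False

    def get(self) -> "RefCounted":
        self.nref += 1
        return self

    def put(self) -> None:
        self.nref -= 1
        if self.nref == 0:
            self.freed = True


def unregister(req: RefCounted) -> None:
    # Mirrors unregister_request(): take an extra reference that a later
    # put() is expected to release.
    req.get()


def thread_cleanup(unregistered: List[RefCounted], with_put: bool) -> None:
    for req in unregistered:
        req.put()  # stands in for _unlink_request() (assumed: drops one ref)
        if with_put:
            req.put()  # the matching put() that the fix adds
```

Running thread_cleanup() with with_put=False leaves nref at 1 and the object
never freed, which is the shape of the leak ASan reported; with
with_put=True the count reaches zero.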

Test results:
- Before: 17,920 bytes leaked in 64 allocations
- After: 0 leaks, unittest_http_manager passes with ASan

Fixes: https://tracker.ceph.com/issues/74762
Signed-off-by: Kefu Chai <k.chai@proxmox.com>

Nizamudeen A [Thu, 5 Feb 2026 03:34:12 +0000 (09:04 +0530)]

Merge pull request #67201 from rhcs-dashboard/import-error

qa/tasks: fix import error

Nizamudeen A [Wed, 4 Feb 2026 06:39:46 +0000 (12:09 +0530)]

qa/tasks: fix import error

```
2026-02-04 06:04:16,385.385 INFO:__main__: from .helper import DashboardTestCase, MgrModuleTestCase
2026-02-04 06:04:16,385.385 INFO:__main__:ImportError: cannot import name 'MgrModuleTestCase' from 'tasks.mgr.dashboard.helper' (/home/jenkins-build/build/workspace/ceph-api/qa/tasks/mgr/dashboard/helper.py)
```

Signed-off-by: Nizamudeen A <nia@redhat.com>

Patrick Donnelly [Wed, 4 Feb 2026 21:33:52 +0000 (16:33 -0500)]

Merge PR #67094 into main

* refs/pull/67094/head:
script/ptl-tool: supprt --debug-build to add debug flavor
script/ptl-tool: remove debug suffix on branch name

Reviewed-by: John Mulligan <jmulligan@redhat.com>

J. Eric Ivancich [Mon, 2 Feb 2026 21:37:22 +0000 (16:37 -0500)]

rgw/test: add more output in boto3 versioning testing

Saw a spurious error in this test and figured it'd be helpful if more
information was logged in case another spurious error occurs.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>

Igor Fedotov [Wed, 4 Feb 2026 16:33:44 +0000 (19:33 +0300)]

Merge pull request #66886 from ifed01/wip-ifed-revert-bitmap-in-vstart

vstart.sh: revert unintended allocator type change

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>

Vallari Agrawal [Wed, 4 Feb 2026 15:47:25 +0000 (21:17 +0530)]

Merge pull request #67203 from VallariAg/fix-corrupted-issue

qa/workunits/nvmeof/basic_tests: use nvme-cli 2.13

Gil Bregman [Wed, 4 Feb 2026 14:23:04 +0000 (16:23 +0200)]

Merge branch 'ceph:main' into main

Shraddha Agrawal [Wed, 4 Feb 2026 14:22:35 +0000 (19:52 +0530)]

Merge pull request #67181 from shraddhaag/wip-shraddhaag-74178

qa/standalone/availability.sh: retry after feature is turned on

Shraddha Agrawal [Wed, 4 Feb 2026 11:21:46 +0000 (16:51 +0530)]

Merge pull request #66811 from shraddhaag/wip-shraddhaag-cephadm-add-osd-type

cephadm, ceph-volume: deploy crimson OSDs using cephadm

Alex Ainscow [Wed, 4 Feb 2026 10:06:56 +0000 (10:06 +0000)]

Merge pull request #66162 from aainscow/no_obj_ver

rados: Add API to disable version querying with reads in librados

Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Gil Bregman [Wed, 4 Feb 2026 09:14:58 +0000 (11:14 +0200)]

mgr/cephadm: Add IO statistics enable field to the cephadm NVMEoF spec file.

Fixes: https://tracker.ceph.com/issues/74750
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>

Kefu Chai [Wed, 4 Feb 2026 07:59:45 +0000 (15:59 +0800)]

Merge pull request #67175 from bluikko/wip-doc-undo-66059-pip-pin

doc: unpin pip in admin/doc-read-the-docs.txt

Reviewed-by: Kefu Chai <k.chai@proxmox.com>

Vallari Agrawal [Tue, 3 Feb 2026 15:02:17 +0000 (20:32 +0530)]

qa/workunits/nvmeof/basic_tests: use nvme-cli 2.13

Install nvme-cli version 2.13 instead of the latest
version, 2.16, because nvme-cli 2.16 has a bug in the
'nvme list-subsys' command on CentOS 9.

Fixes: https://tracker.ceph.com/issues/74615
Co-authored-by: barakda <barak.davidov@gmail.com>
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>

Xuehan Xu [Wed, 4 Feb 2026 04:58:59 +0000 (12:58 +0800)]

Merge pull request #66488 from xxhdx1985126/wip-seastore-background-trans-cc-opt2

crimson/os/seastore/cache: TRIM_DIRTY/CLEANER_* transactions won't invalidate other transactions anymore

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>

David Galloway [Tue, 3 Feb 2026 21:22:39 +0000 (16:22 -0500)]

Merge pull request #67177 from rhcs-dashboard/fix-feedback-module-failure

qa/tests: wait for module to be available for connection

J. Eric Ivancich [Tue, 3 Feb 2026 17:34:03 +0000 (12:34 -0500)]

Merge pull request #66367 from mheler/lc-tag-scan-reduction

rgw/lc: optimize lifecycle processing for multiple rules

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>

J. Eric Ivancich [Tue, 3 Feb 2026 17:30:28 +0000 (12:30 -0500)]

Merge pull request #66514 from BBoozmen/wip-oozmen-62063

RGW: remove custom copy ctor for RGWObjectCtx and enforce no copy/move

Reviewed-by: Casey Bodley <cbodley@redhat.com>

J. Eric Ivancich [Tue, 3 Feb 2026 17:29:27 +0000 (12:29 -0500)]

Merge pull request #66369 from BBoozmen/wip-oozmen-66100

RGW: prevent shutdown hang by reconciling race between async processor and multisite sync threads

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>

Afreen Misbah [Tue, 3 Feb 2026 11:59:45 +0000 (17:29 +0530)]

mgr/dashboard: Fix footer of notification panel

Fixes https://tracker.ceph.com/issues/74735

Signed-off-by: Afreen Misbah <afreen@ibm.com>

Patrick Donnelly [Tue, 3 Feb 2026 15:55:55 +0000 (10:55 -0500)]

script/ptl-tool: supprt --debug-build to add debug flavor

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

Patrick Donnelly [Tue, 27 Jan 2026 16:09:53 +0000 (11:09 -0500)]

script/ptl-tool: remove debug suffix on branch name

A git trailer is now the preferred way to enable this.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

Patrick Donnelly [Tue, 3 Feb 2026 15:45:20 +0000 (10:45 -0500)]

Merge PR #66666 into main

* refs/pull/66666/head:
ceph: fix a small error in the ceph command help

Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>

Afreen Misbah [Tue, 3 Feb 2026 15:03:53 +0000 (20:33 +0530)]

Merge pull request #67106 from afreen23/subsystem-step-1

mgr/dashboard: Add step 1 for subsystem form

Reviewed-by: Aashish Sharma <aasharma@redhat.com>

Adam Kupczyk [Tue, 3 Feb 2026 14:46:21 +0000 (15:46 +0100)]

Merge pull request #66512 from aclamk/aclamk-fix-bs-wal-envelope-mode-size

os/bluestore/bluefs: Fix stat() for WAL envelope mode

Jose J Palacios-Perez [Tue, 3 Feb 2026 14:19:27 +0000 (14:19 +0000)]

crimson: fix dump_metrics skipping metrics argument. Add minimal qa test for the fix.

Signed-off-by: Jose J Palacios-Perez <perezjos@uk.ibm.com>

Shraddha Agrawal [Tue, 3 Feb 2026 12:26:18 +0000 (17:56 +0530)]

qa/standalone/availability.sh: retry after feature is turned on

This commit adds a retry to ensure we wait for availability score
to be reported after it is turned on and do not fail early.
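
A generic version of such a retry is sketched below. It is not the change
made to availability.sh, which is a shell script: wait_for, the probe
callback, and the attempt and delay values are all hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    probe: Callable[[], Optional[T]],
    attempts: int = 10,
    delay: float = 1.0,
) -> T:
    """Call probe() until it returns a value, sleeping between attempts."""
    for attempt in range(attempts):
        result = probe()
        if result is not None:
            return result
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            time.sleep(delay)
    raise TimeoutError(f"no result after {attempts} attempts")
```

Bounding the number of attempts keeps a test from hanging forever if the
score is never reported.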

Fixes: https://tracker.ceph.com/issues/74178
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>

Pedro Gonzalez Gomez [Tue, 3 Feb 2026 09:40:25 +0000 (10:40 +0100)]

Merge pull request #66962 from rhcs-dashboard/74429-add-cert-mgmt-tabs

mgr/dashboard : Add Certificate tab under service details

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Pedro Gonzalez Gomez <pegonzal@ibm.com>

Matan Breizman [Tue, 3 Feb 2026 08:06:17 +0000 (10:06 +0200)]

Merge pull request #66798 from Matan-B/wip-matanb-seastore-docs

doc/dev/crimson: Update Seastore docs

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>

Nizamudeen A [Tue, 3 Feb 2026 08:03:08 +0000 (13:33 +0530)]

qa/tests: wait for module to be available for connection

Signed-off-by: Nizamudeen A <nia@redhat.com>

Ville Ojamo [Tue, 3 Feb 2026 06:28:12 +0000 (13:28 +0700)]

doc: unpin pip in admin/doc-read-the-docs.txt

7dd00ca introduced a proper fix for pip 25.3/PEP517 compatibility by
adding pyproject.toml files and the workaround in a65c46c is no longer
necessary. RTD builds with pip 25.3 and later work with the proper fix.

Remove the pinned pip in admin/doc-read-the-docs.txt and let RTD use the
default pip version.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>

Radoslaw Zarzynski [Mon, 2 Feb 2026 19:29:28 +0000 (20:29 +0100)]

Merge pull request #66511 from bill-scales/issue74048_deletepg

osd: Deleting PG should discard pwlc

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Abhishek Desai [Mon, 19 Jan 2026 08:47:54 +0000 (14:17 +0530)]

mgr/dashboard : Add Certificate tab under service details
fixes : https://tracker.ceph.com/issues/74429
Signed-off-by: Abhishek Desai <abhishek.desai1@ibm.com>

Alex Ainscow [Mon, 2 Feb 2026 14:13:21 +0000 (14:13 +0000)]

Merge pull request #66698 from aainscow/partial_write_with_clone_fix

osd: Do not remove objects with divergent logs if only partial writes.

Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Matan Breizman [Mon, 2 Feb 2026 13:32:41 +0000 (15:32 +0200)]

Merge pull request #67157 from xxhdx1985126/wip-seastore-fix-possible-chksum-error

crimson/os/seastore/cache: fix possible extent chksum error

Reviewed-by: Matan Breizman <mbreizma@redhat.com>

Matan Breizman [Mon, 2 Feb 2026 12:38:49 +0000 (14:38 +0200)]

Merge pull request #65157 from liu-chunmei/omap_rm_key

crimson/os/seastore: optimize omap_rm_key_range

Reviewed-by: Matan Breizman <mbreizma@redhat.com>

Ilya Dryomov [Mon, 2 Feb 2026 12:34:27 +0000 (13:34 +0100)]

Merge pull request #67138 from idryomov/wip-74672

qa/valgrind.supp: make gcm_cipher_internal suppression more resilient

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>

Sridhar Seshasayee [Mon, 2 Feb 2026 08:44:15 +0000 (14:14 +0530)]

mgr/DaemonServer: Modify offline_pg_report to handle set or vector types

The offline_pg_report structure, used by both the 'ok-to-stop' and
'ok-to-upgrade' commands, is modified to handle either std::set or std::vector
type containers. This is necessary because the two commands work differently.
For the 'ok-to-upgrade' command logic to work optimally, the items in the
specified crush bucket, including items found in the subtree, must keep their
original order. The earlier std::set container sorts the items upon insertion,
which changes that order and causes the offline PG check to report sub-optimal
results.

Therefore, the offline_pg_report struct is modified to use
std::variant<std::vector<int>, std::set<int>> as a ContainerType and handled
accordingly in dump() using std::visit(). This ensures backward compatibility
with the existing 'ok-to-stop' command while catering to the requirements of
the new 'ok-to-upgrade' command.
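
The ordering difference can be shown with a small Python analogue. This is
an illustration only, not the C++ implementation: dump_osds is a
hypothetical name, isinstance() plays the role of std::visit(), and sorted()
is used to mimic std::set iteration order because Python sets are unordered.

```python
from typing import List, Set, Union

# Analogue of std::variant<std::vector<int>, std::set<int>>.
OsdContainer = Union[List[int], Set[int]]


def dump_osds(osds: OsdContainer) -> List[int]:
    """Return OSD ids in the order the underlying container would yield."""
    if isinstance(osds, set):
        # std::set iterates in sorted order (the 'ok-to-stop' behaviour).
        return sorted(osds)
    # std::vector preserves insertion order (needed by 'ok-to-upgrade').
    return list(osds)
```

For example, dump_osds([7, 2, 5]) keeps the CRUSH traversal order, while
dump_osds({7, 2, 5}) comes back sorted.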

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>

Shraddha Agrawal [Thu, 29 Jan 2026 04:28:00 +0000 (09:58 +0530)]

ceph-volume: support crimson osd binary

Prior to this commit, ceph-volume used a hardcoded OSD binary
to issue commands (e.g., to perform mkfs). This commit enables
ceph-volume to support crimson OSDs.

A new argument, --osd-type, is introduced with the default value
'classic'. When this parameter is set to 'crimson', the ceph-osd-crimson
binary is used to execute OSD commands.
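
The selection logic can be sketched with argparse. This is a standalone
illustration, not ceph-volume's argument handling: the parser, the mapping
name OSD_BINARIES and the helper osd_binary are hypothetical, while the
--osd-type flag, its 'classic' default and the two binary names come from
the commits in this log.

```python
import argparse
from typing import List

# Binary used for each OSD type.
OSD_BINARIES = {"classic": "ceph-osd", "crimson": "ceph-osd-crimson"}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="osd-type-demo")
    parser.add_argument(
        "--osd-type",
        choices=sorted(OSD_BINARIES),
        default="classic",
        help="which OSD implementation to run commands with",
    )
    return parser


def osd_binary(argv: List[str]) -> str:
    """Resolve the OSD binary name from command-line arguments."""
    args = build_parser().parse_args(argv)
    return OSD_BINARIES[args.osd_type]
```

Keeping the mapping in one table means callers never need to hardcode a
binary name.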

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>

Shraddha Agrawal [Tue, 6 Jan 2026 12:51:01 +0000 (18:21 +0530)]

cephadm: add osd_type to orchestrator

This commit enables us to deploy both classic and crimson
type OSDs using cephadm. To support this, a new field,
osd_type, is added to DriveGroupSpec. Its default value
is classic, but it can also be set to crimson.
When cephadm reads the value crimson, the entrypoint is
changed from /usr/bin/ceph-osd to /usr/bin/ceph-osd-crimson.

Fixes: https://tracker.ceph.com/issues/74081
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>

Kefu Chai [Mon, 2 Feb 2026 11:12:58 +0000 (19:12 +0800)]

Merge pull request #67133 from rsommer/rsommer-fix-missing-smb-module

debian: package mgr/smb in ceph-mgr-modules-core

Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Kefu Chai <k.chai@proxmox.com>

Afreen Misbah [Wed, 28 Jan 2026 13:18:52 +0000 (18:48 +0530)]

mgr/dashboard: Add step 1 for subsystem form

Fixes https://tracker.ceph.com/issues/74093
Fixes https://tracker.ceph.com/issues/74094

- updates tearsheet component css to match with carbon component
- adds loading state to submit button
- adds support for step validation when angular components are used for steps rather than plain html templates
- adds step one of nvmeof

Signed-off-by: Afreen Misbah <afreen@ibm.com>

Alex Ainscow [Mon, 2 Feb 2026 10:14:24 +0000 (10:14 +0000)]

Merge pull request #66817 from aainscow/bad_erase_after_ro_offset_fix

osd/ECUtil: Fix erase_after_ro_offset length calculation and add tests

Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Shraddha Agrawal [Mon, 2 Feb 2026 10:05:16 +0000 (15:35 +0530)]

Merge pull request #67090 from shraddhaag/wip-shraddhaag-add-more-osd-bootstrap-logs

crimson/osd: add verbose DEBUG logs for OSD startup

Xuehan Xu [Mon, 2 Feb 2026 05:52:47 +0000 (13:52 +0800)]

crimson/os/seastore/cache: fix possible extent chksum error

See: https://github.com/ceph/ceph/pull/66506#issuecomment-3821417465

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>

Anthony D'Atri [Mon, 2 Feb 2026 04:04:51 +0000 (23:04 -0500)]

Merge pull request #66597 from anthonyeleven/reefunstretch

doc/rados/operations: Clarify exiting in stretch-mode.rst

Ilya Dryomov [Sat, 31 Jan 2026 20:41:11 +0000 (21:41 +0100)]

Merge pull request #67144 from idryomov/wip-74676

qa/tasks/rbd_mirror_thrash: don't use random.randrange() on floats

Reviewed-by: Ramana Raja <rraja@redhat.com>

Ilya Dryomov [Sat, 31 Jan 2026 20:40:38 +0000 (21:40 +0100)]

Merge pull request #67143 from idryomov/wip-74671

qa/workunits/rbd: use the same qemu-iotests version throughout

Reviewed-by: Ramana Raja <rraja@redhat.com>

Ilya Dryomov [Sat, 31 Jan 2026 20:39:58 +0000 (21:39 +0100)]

Merge pull request #67142 from idryomov/wip-74670

qa/tasks/qemu: rocky 10 enablement

Reviewed-by: Ramana Raja <rraja@redhat.com>

Anthony D'Atri [Sat, 31 Jan 2026 13:08:15 +0000 (08:08 -0500)]

Merge pull request #65862 from saschalucas/zonegroup_remove

doc: fix syntax for removing zone from zonegroup

bluikko [Sat, 31 Jan 2026 03:56:34 +0000 (10:56 +0700)]

Merge pull request #67130 from bluikko/wip-doc-dev-health-checks-re-add-diagrams

doc/dev: add sequence diagrams back to health-reports.rst

Joseph Mundackal [Sat, 31 Jan 2026 02:29:04 +0000 (21:29 -0500)]

Merge pull request #67120 from artsiukhou/patch-2

docs: monitoring: Fix typo thughtput -> throughput

Laura Flores [Fri, 30 Jan 2026 23:42:41 +0000 (17:42 -0600)]

Merge pull request #66961 from aainscow/ec_memory_leak

osd: Fix memory leak of ECDummyOp

Reviewed-by: Radosław Zarzyński <Radoslaw.Adam.Zarzynski@ibm.com>
Reviewed-by: Ronen Friedman <rfriedma@ibm.com>
Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>

Ilya Dryomov [Fri, 30 Jan 2026 15:32:35 +0000 (16:32 +0100)]

qa/tasks/rbd_mirror_thrash: don't use random.randrange() on floats

This stopped working in Python 3.12:

  Changed in version 3.12: Automatic conversion of non-integer types
  is no longer supported. Calls such as randrange(10.0) and
  randrange(Fraction(10, 1)) now raise a TypeError.
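
Two common ways to stay compatible are sketched below. The commit message
does not say which approach the actual fix took, and the helper names are
hypothetical.

```python
import random


def random_whole_below(upper: float) -> int:
    """Pick a random integer in [0, upper) when the bound is a float."""
    # Convert explicitly; randrange() no longer does this on Python 3.12+.
    return random.randrange(int(upper))


def random_float_between(low: float, high: float) -> float:
    """Pick a random float when fractional values are actually wanted."""
    return random.uniform(low, high)
```

Converting with int() keeps the old integer behaviour, while
random.uniform() is the better fit when the caller really wants a
fractional delay.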

Fixes: https://tracker.ceph.com/issues/74676
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Ilya Dryomov [Thu, 29 Jan 2026 20:25:55 +0000 (21:25 +0100)]

qa/workunits/rbd: use the same qemu-iotests version throughout

"platform:el10" could be appended to the grep pattern for v2.11.0 but
we no longer test on any distro needing v2.3.0 or v2.2.0-rc3.

Fixes: https://tracker.ceph.com/issues/74671
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Pedro Gonzalez Gomez [Fri, 30 Jan 2026 20:00:26 +0000 (21:00 +0100)]

Merge pull request #66949 from rhcs-dashboard/74411-add-cert-column

mgr/dashboard : Add Cert Status column to services page

Reviewed-by: Afreen Misbah <afreen@ibm.com>

Casey Bodley [Fri, 30 Jan 2026 19:10:58 +0000 (14:10 -0500)]

Merge pull request #62201 from tobias-urdin/rgw-keystone-remove-legacy-admin-token

rgw/auth: Remove legacy Keystone admin token

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Ilya Dryomov [Tue, 11 Nov 2025 17:31:56 +0000 (18:31 +0100)]

qa/tasks/qemu: adjust NFS service name for Rocky 10

Fixes: https://tracker.ceph.com/issues/74670
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Ilya Dryomov [Tue, 11 Nov 2025 15:33:16 +0000 (16:33 +0100)]

qa/tasks/qemu: install genisoimage package

genisoimage is expected to be included in our base images but currently
isn't on Rocky 10. Since it's quite a niche thing, let's install the
package explicitly.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Ilya Dryomov [Fri, 30 Jan 2026 16:23:06 +0000 (17:23 +0100)]

Merge pull request #67139 from idryomov/wip-74669

qa/workunits/rbd: reduce randomized sleeps in live import tests

Reviewed-by: Miki Patel <miki.patel132@gmail.com>

Vova Artsiukhou [Thu, 29 Jan 2026 12:24:15 +0000 (12:24 +0000)]

docs: monitoring: Fix typo thughtput -> throughput

s/thughtput/throughput/

Signed-off-by: Vova Artsiukhou <1358483+artsiukhou@users.noreply.github.com>

Ilya Dryomov [Thu, 29 Jan 2026 20:41:03 +0000 (21:41 +0100)]

qa/workunits/rbd: reduce randomized sleeps in live import tests

These tests were tuned for slower hardware than what we have now.
Currently "rbd migration execute" always finishes (successfully) before
the NBD server is killed.

Fixes: https://tracker.ceph.com/issues/74669
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Ilya Dryomov [Tue, 11 Nov 2025 20:39:58 +0000 (21:39 +0100)]

qa/valgrind.supp: make gcm_cipher_internal suppression more resilient

gcm_cipher_internal() and ossl_gcm_stream_final() make it to the stack
trace only on CentOS Stream 9.  On Ubuntu 22.04 and Rocky 10, it looks
as follows:

Thread 4 msgr-worker-1:
Conditional jump or move depends on uninitialised value(s)
   at 0x70A36D4: ??? (in /usr/lib64/libcrypto.so.3.2.2)
   by 0x70A39A1: ??? (in /usr/lib64/libcrypto.so.3.2.2)
   by 0x6F8A09C: EVP_DecryptFinal_ex (in /usr/lib64/libcrypto.so.3.2.2)
   by 0xB498C1F: ceph::crypto::onwire::AES128GCM_OnWireRxHandler::authenticated_decrypt_update_final(ceph::buffer::v15_2_0::list&) (crypto_onwire.cc:271)
   by 0xB4992D7: ceph::msgr::v2::FrameAssembler::disassemble_preamble(ceph::buffer::v15_2_0::list&) (frames_v2.cc:281)
   by 0xB482D98: ProtocolV2::handle_read_frame_preamble_main(std::unique_ptr<ceph::buffer::v15_2_0::ptr_node, ceph::buffer::v15_2_0::ptr_node::disposer>&&, int) (ProtocolV2.cc:1149)
   by 0xB475318: ProtocolV2::run_continuation(Ct<ProtocolV2>&) (ProtocolV2.cc:54)
   by 0xB457012: AsyncConnection::process() (AsyncConnection.cc:495)
   by 0xB49E61A: EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) (Event.cc:492)
   by 0xB49EA9D: UnknownInlinedFun (Stack.cc:50)
   by 0xB49EA9D: UnknownInlinedFun (invoke.h:61)
   by 0xB49EA9D: UnknownInlinedFun (invoke.h:111)
   by 0xB49EA9D: std::_Function_handler<void (), NetworkStack::add_thread(Worker*)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (std_function.h:290)
   by 0xBB11063: ??? (in /usr/lib64/libstdc++.so.6.0.33)
   by 0x4F17119: start_thread (in /usr/lib64/libc.so.6)

The proposal to amend the existing suppression so that it's tied to the
specific callsite rather than libcrypto internals [1] received a thumbs
up from Radoslaw.

[1] https://github.com/ceph/ceph/pull/61689#issuecomment-2650179891

Fixes: https://tracker.ceph.com/issues/74672
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Afreen Misbah [Fri, 30 Jan 2026 11:11:08 +0000 (16:41 +0530)]

Merge pull request #67101 from afreen23/nvme-ns-api

mgr/dashboard: fetch all namespaces in a gateway group

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: pujaoshahu <pshahu@redhat.com>

Nizamudeen A [Fri, 30 Jan 2026 08:56:27 +0000 (14:26 +0530)]

Merge pull request #66771 from rhcs-dashboard/delete-subsystem

mgr/dashboard: Fix nvmeof subsystems delete modal

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Naman Munet <nmunet@redhat.com>

Roland Sommer [Fri, 30 Jan 2026 07:54:49 +0000 (08:54 +0100)]

debian: package mgr/smb in ceph-mgr-modules-core

The `BaseController` auto-imports the packaged
`mgr/dashboard/controllers/smb.py` file, which in turn imports `smb.enums`
and other modules from the `smb` package. That package is not listed in
`debian/ceph-mgr-modules-core.install` and is therefore absent from the
Debian package. On mgr instances of a Ceph Tentacle cluster installed from
Debian packages, the missing module raises
`ModuleNotFoundError: No module named 'smb'`.

See: https://tracker.ceph.com/issues/74268
Signed-off-by: Roland Sommer <rol@ndsommer.de>
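
One way to check for this class of packaging gap on an installed host is to ask the import system whether a top-level module can be located. This is a sketch only: the helper name and the module list are hypothetical, and it assumes the interpreter's sys.path matches the one the mgr uses.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located on sys.path."""
    # find_spec() returns None when a top-level module is not found,
    # without importing the module itself.
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" is in the standard library; the second name is deliberately bogus.
print(missing_modules(["json", "no_such_module_for_illustration"]))
```

On an affected installation, passing `["smb"]` to this helper would be expected to report the module as missing, provided the path assumption above holds.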

Afreen Misbah [Wed, 28 Jan 2026 09:59:08 +0000 (15:29 +0530)]

mgr/dashboard: fetch all namespaces in a gateway group

- adds a new API /api/gateway_group/{group}/namespace
- updates tests
- needed for UI flows and, more generally, to fetch all namespaces; the existing API could not be changed because backward compatibility must be maintained
- server-side pagination will be added in a follow-up PR

Fixes: https://tracker.ceph.com/issues/74622

Signed-off-by: Afreen Misbah <afreen@ibm.com>
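
As a rough illustration of how a client might address the new endpoint, the sketch below only builds the request URL. The base URL, the group name, and the helper name are hypothetical; authentication, HTTP method details, and the response format are not described in this commit message and are left out.

```python
from urllib.parse import quote


def namespaces_url(base_url: str, group: str) -> str:
    """Build the URL for listing all namespaces in a gateway group."""
    # The path follows the /api/gateway_group/{group}/namespace pattern
    # named in the commit message; the group is percent-encoded so that
    # names with spaces or slashes stay inside one path segment.
    return f"{base_url.rstrip('/')}/api/gateway_group/{quote(group, safe='')}/namespace"


# Hypothetical dashboard address and group name.
print(namespaces_url("https://dashboard.example:8443/", "group a"))
```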

Ville Ojamo [Fri, 30 Jan 2026 04:47:40 +0000 (11:47 +0700)]

doc/dev: add sequence diagrams back to health-reports.rst

The sequence diagrams were removed in ce96ddd because they were causing
issues. Add them back as SVG images. Include as comments the source code
used to generate the diagrams.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>

Ilya Dryomov [Fri, 30 Jan 2026 00:19:16 +0000 (01:19 +0100)]

Merge pull request #67114 from tchaikov/wip-vstart-sans-egrep

vstart: replace obsolescent egrep with grep -E

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

John Mulligan [Thu, 29 Jan 2026 23:28:44 +0000 (18:28 -0500)]

Merge pull request #65632 from phlogistonjohn/jjm-smb-hosts-allow

smb: support shares equivalent for hosts allow

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
Reviewed-by: Shwetha Acharya <sacharya@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>

Kefu Chai [Thu, 29 Jan 2026 23:01:24 +0000 (07:01 +0800)]

Merge pull request #66966 from tchaikov/wip-pybind-build-failure

pybind: add pyproject.toml to fix ReadTheDocs builds with pip 25.3+

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

Adam King [Thu, 29 Jan 2026 17:16:34 +0000 (12:16 -0500)]

Merge pull request #67093 from guits/fix-node-proxy-ssl-certs

mgr/cephadm: add certificate support and service spec for node-proxy

Reviewed-by: Adam King <adking@redhat.com>
