git-server-git.apps.pok.os.sepia.ceph.com Git - ceph-ci.git/log
ceph-ci.git
3 days agoMerge pull request #67330 from gadididi/nvmeof/add_rados_ns
Gadi [Mon, 23 Feb 2026 06:54:28 +0000 (08:54 +0200)]
Merge pull request #67330 from gadididi/nvmeof/add_rados_ns

mgr/dashboard: Adding RADOS namespace option into add_ns_req

3 days agoMerge pull request #66575 from Tom-Sollers/ceph-pg-repeer-test
SrinivasaBharathKanta [Mon, 23 Feb 2026 02:04:43 +0000 (07:34 +0530)]
Merge pull request #66575 from Tom-Sollers/ceph-pg-repeer-test

qa/standalone: Add a test for running repeer on simple ec and rep pools

3 days agoMerge pull request #53457 from NitzanMordhai/wip-nitzan-crush-rule-delete
SrinivasaBharathKanta [Mon, 23 Feb 2026 01:54:03 +0000 (07:24 +0530)]
Merge pull request #53457 from NitzanMordhai/wip-nitzan-crush-rule-delete

mon/OSDMonitor: remove unused crush rules after erasure code pools deleted

3 days agoMerge pull request #66876 from tchaikov/wip-librbd-pwl-fix-leaks
Ilya Dryomov [Sun, 22 Feb 2026 15:58:12 +0000 (16:58 +0100)]
Merge pull request #66876 from tchaikov/wip-librbd-pwl-fix-leaks

librbd/pwl: fix memory leaks in discard operations

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 days agomgr/dashboard: Adding rados ns option into add_ns_req
gadi-didi [Thu, 12 Feb 2026 14:17:38 +0000 (16:17 +0200)]
mgr/dashboard: Adding rados ns option into add_ns_req

Add a RADOS namespace name option to the NVMe add-namespace command.

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
3 days agoMerge pull request #67410 from VallariAg/wip-nvmeof-submodule-1.6.6
Vallari Agrawal [Sun, 22 Feb 2026 10:09:27 +0000 (15:39 +0530)]
Merge pull request #67410 from VallariAg/wip-nvmeof-submodule-1.6.6

mgr/dashboard: bump nvmeof submodule to 1.6.7

4 days agoMerge pull request #64500 from tchaikov/wip-os-silence-Wsign-compare
Kefu Chai [Sun, 22 Feb 2026 02:56:02 +0000 (10:56 +0800)]
Merge pull request #64500 from tchaikov/wip-os-silence-Wsign-compare

os,common:change osd_target_transaction_size to uint

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
4 days agoMerge pull request #66735 from ajarr/wip-fix-schedule-start-time
Ilya Dryomov [Sat, 21 Feb 2026 16:12:24 +0000 (17:12 +0100)]
Merge pull request #66735 from ajarr/wip-fix-schedule-start-time

mgr/rbd_support: Fix "start-time" arg behavior

Reviewed-by: Mykola Golub <mykola.golub@clyso.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
4 days agomgr/rbd_support: Fix "start-time" arg behavior
Ramana Raja [Wed, 24 Dec 2025 10:24:50 +0000 (05:24 -0500)]
mgr/rbd_support: Fix "start-time" arg behavior

The "start-time" argument, optionally passed when adding or removing a
mirror image snapshot schedule or a trash purge schedule, does not
behave as intended. It is meant to schedule an initial operation at a
specific time of day in a given time zone. Instead, it offsets the
schedule’s anchor time. By default, the scheduler uses the UNIX epoch as
the anchor to calculate recurring schedule times, and "start-time"
simply shifts this anchor away from UTC, which can confuse users. For
example:

```
$ # current time
$ date --universal
Wed Dec 10 05:55:21 PM UTC 2025
$ rbd mirror snapshot schedule add -p data --image img1 15m 19:00Z
$ rbd mirror snapshot schedule ls -p data --image img1
every 15m starting at 19:00:00+00:00
```

A user might assume that the scheduler will run the first snapshot each
day at 19:00 UTC and then run snapshots every 15 minutes. Instead, the
scheduler runs the first snapshot at 18:00 UTC and then continues at the
configured interval:

```
$ rbd mirror snapshot schedule status -p data --image img1
SCHEDULE TIME        IMAGE
2025-12-10 18:00:00  data/img1
```

Additionally, the "start-time" argument accepts a full ISO 8601
timestamp but silently ignores everything except hour, minute, and time
zone. Even time zone handling is incorrect: specifying "23:00-01:00"
with an interval of "1d" results in a snapshot taken once per day at
22:00 UTC rather than 00:00 UTC, because only utcoffset.seconds is used
while utcoffset.days is ignored.
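
The offset pitfall described above can be reproduced with the Python
standard library alone. This is a minimal sketch, not the rbd_support
code; the date in the timestamp is arbitrary.

```python
from datetime import datetime, timezone

# Parse a timestamp with a negative UTC offset (date chosen arbitrarily).
ts = datetime.fromisoformat("1970-01-01T23:00-01:00")
offset = ts.utcoffset()

# timedelta normalizes -1h to days=-1, seconds=82800, so reading only
# `.seconds` sees +23h rather than -1h.
offset_parts = (offset.days, offset.seconds)  # (-1, 82800)

# total_seconds() keeps the sign of the offset.
offset_seconds = offset.total_seconds()  # -3600.0

# Converting the whole timestamp avoids handling the parts by hand.
utc_ts = ts.astimezone(timezone.utc)
print(offset_parts, offset_seconds, utc_ts.isoformat())
```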

Fix:
Similar to the handling of the "start" argument in the FS snap-schedule
manager module, require "start-time" to use an ISO 8601 date-time format
with a mandatory date component. Time and time zone are optional and
default to 00:00 and UTC respectively.

The "start-time" now defines the anchor time used to compute recurring
schedule times. The default anchor remains the UNIX epoch. Existing
on-disk schedules with legacy-format "start-time" values are updated to
include the date Jan 1, 1970.
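
One way to picture an anchor-based schedule is sketched below. The
helper name next_run() and its signature are assumptions made for
illustration; the scheduler in the module may compute this differently.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def next_run(now, interval, anchor=EPOCH):
    """Return the first time strictly after `now` that lies a whole
    number of intervals away from `anchor` (hypothetical helper)."""
    # Floor division also rounds down when `now` is before the anchor.
    periods = (now - anchor) // interval
    return anchor + (periods + 1) * interval


now = datetime(2025, 12, 10, 17, 55, 21, tzinfo=timezone.utc)
anchor = datetime(2025, 12, 10, 19, 0, tzinfo=timezone.utc)

print(next_run(now, timedelta(minutes=15)))          # default epoch anchor
print(next_run(now, timedelta(minutes=15), anchor))  # anchored at 19:00 UTC
print(next_run(now, timedelta(days=1), anchor))      # daily, anchored at 19:00 UTC
```

With a 15 minute interval both anchors give 18:00 UTC as the next run,
since 19:00 falls on a 15 minute boundary. Only the daily interval makes
the 19:00 anchor visible.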

The `snap schedule ls` output now displays "start-time" with date and
time in the format "%Y-%m-%d %H:%M:00". The display time is in UTC.

Fixes: https://tracker.ceph.com/issues/74192
Signed-off-by: Ramana Raja <rraja@redhat.com>
4 days agocommon/options: change osd_target_transaction_size from int to uint
Kefu Chai [Tue, 15 Jul 2025 06:40:09 +0000 (14:40 +0800)]
common/options: change osd_target_transaction_size from int to uint

Change osd_target_transaction_size from signed int to unsigned int to
match the return type of Transaction::get_num_ops() (ceph_le64).

This change:
- Eliminates compiler warnings when comparing signed/unsigned values
- Enables automatic size conversion (e.g., "4_K" → 4096) via y2c.py
  for improved administrator usability
- Maintains type consistency throughout the codebase

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
5 days agolibrbd/pwl: fix memory leaks in discard operations
Kefu Chai [Tue, 17 Feb 2026 11:41:32 +0000 (19:41 +0800)]
librbd/pwl: fix memory leaks in discard operations

Fix memory leak in librbd persistent write log (PWL) cache discard
operations by properly completing request objects.

ASan reported the following leaks in unittest_librbd:

  Direct leak of 240 byte(s) in 1 object(s) allocated from:
    #0 operator new(unsigned long)
    #1 librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::discard(...)
       /ceph/src/librbd/cache/pwl/AbstractWriteLog.cc:935:5
    #2 TestMockCacheReplicatedWriteLog_discard_Test::TestBody()
       /ceph/src/test/librbd/cache/pwl/test_mock_ReplicatedWriteLog.cc:534:7

  Plus multiple indirect leaks totaling 2,076 bytes through the
  shared_ptr reference chain.

Root cause:

C_DiscardRequest objects were never deleted because their complete()
method was never called. The on_write_persist callback released the
BlockGuard cell but didn't call complete() to trigger self-deletion.

Write requests use WriteLogOperationSet which takes the request as
its on_finish callback, ensuring complete() is eventually called.
Discard requests don't use WriteLogOperationSet and must explicitly
call complete() in their on_write_persist callback.

Solution:

Call discard_req->complete(r) in the on_write_persist callback and
move cell release into finish_req() -- mirroring how C_WriteRequest
handles it. The complete() -> finish() -> finish_req() chain ensures
the cell is released after the user request is completed, preserving
the same ordering as write requests.

Test results:
- Before: 2,316 bytes leaked in 15 allocations
- After: 0 bytes leaked
- unittest_librbd discard tests pass successfully with ASan

Fixes: https://tracker.ceph.com/issues/74972
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
5 days agoos/Transaction: change get_num_ops() return type to uint64_t
Kefu Chai [Mon, 14 Jul 2025 10:50:48 +0000 (18:50 +0800)]
os/Transaction: change get_num_ops() return type to uint64_t

Change Transaction::get_num_ops() to return uint64_t instead of int
to match the underlying data.ops type (ceph_le<__u64>) and eliminate
compiler warnings about signed/unsigned comparison.

Fixes warning in ECTransaction.cc:

```
/home/kefu/dev/ceph/src/osd/ECTransaction.cc: In constructor ‘ECTransaction::Generate::Generate(PGTransaction&, ceph::ErasureCodeInterfaceRef&, pg_t&, const ECUtil::stripe_info_t&, const std::map<hobject_t, ECUtil::shard_extent_map_t>&, std::map<hobject_t, ECUtil::shard_extent_map_t>*, shard_id_map<ceph::os::Transaction>&, const OSDMapRef&, const hobject_t&, PGTransaction::ObjectOperation&, ECTransaction::WritePlanObj&, DoutPrefixProvider*, pg_log_entry_t*)’:
/home/kefu/dev/ceph/src/osd/ECTransaction.cc:589:25: warning: comparison of integer expressions of different signedness: ‘int’ and ‘__gnu_cxx::__alloc_traits<std::allocator<unsigned int>, unsigned int>::value_type’ {aka ‘unsigned int’} [-Wsign-compare]
  589 |     if (t.get_num_ops() > old_transaction_counts[int(shard)] &&
```

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
5 days agoMerge pull request #66557 from phlogistonjohn/jjm-smb-exo-cluster
John Mulligan [Fri, 20 Feb 2026 19:53:21 +0000 (14:53 -0500)]
Merge pull request #66557 from phlogistonjohn/jjm-smb-exo-cluster

smb: allow smb clusters to use cephfs from a different ceph cluster

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Xavi Hernandez <xhernandez@gmail.com>
Reviewed-by: Adam King <adking@redhat.com>
5 days agoMerge pull request #67256 from afreen23/storage-card
Afreen Misbah [Fri, 20 Feb 2026 16:32:36 +0000 (22:02 +0530)]
Merge pull request #67256 from afreen23/storage-card

mgr/dashboard: Add storage card to overview page

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
5 days agoMerge pull request #67423 from cbodley/wip-74573
Adam Emerson [Fri, 20 Feb 2026 16:20:39 +0000 (11:20 -0500)]
Merge pull request #67423 from cbodley/wip-74573

qa/rgw: bucket notifications use pynose

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
5 days agoMerge pull request #67397 from cbodley/wip-74047
Casey Bodley [Fri, 20 Feb 2026 14:12:03 +0000 (09:12 -0500)]
Merge pull request #67397 from cbodley/wip-74047

doc/radosgw: document account-root for PUT and POST /admin/user

Reviewed-by: Ville Ojamo <git2233+ceph@ojamo.eu>
5 days agoMerge pull request #67368 from idryomov/wip-write-log-operation-set-cell
Ilya Dryomov [Fri, 20 Feb 2026 11:26:00 +0000 (12:26 +0100)]
Merge pull request #67368 from idryomov/wip-write-log-operation-set-cell

librbd/cache/pwl: WriteLogOperationSet::cell can be garbage

Reviewed-by: Miki Patel <miki.patel132@gmail.com>
6 days agoMerge PR #66907 into main
Venky Shankar [Fri, 20 Feb 2026 08:49:31 +0000 (14:19 +0530)]
Merge PR #66907 into main

* refs/pull/66907/head:

Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
6 days agoMerge pull request #67274 from shraddhaag/wip-shraddhaag-cephadm-seastore-support
Shraddha Agrawal [Fri, 20 Feb 2026 06:11:16 +0000 (11:41 +0530)]
Merge pull request #67274 from shraddhaag/wip-shraddhaag-cephadm-seastore-support

cephadm: add support for seastore

6 days agoMerge pull request #67374 from shraddhaag/wip-shraddhaag-crimson-cephadm-raw
Shraddha Agrawal [Fri, 20 Feb 2026 06:10:49 +0000 (11:40 +0530)]
Merge pull request #67374 from shraddhaag/wip-shraddhaag-crimson-cephadm-raw

cephdm: add support for raw bluestore + crimson OSD deployment

6 days agoMerge pull request #66685 from bill-scales/issue74218
Bill Scales [Thu, 19 Feb 2026 16:58:36 +0000 (16:58 +0000)]
Merge pull request #66685 from bill-scales/issue74218

osd: FastEC: always update pwlc epoch when activating

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
6 days agoqa/rgw: bucket notifications use pynose
Casey Bodley [Thu, 19 Feb 2026 15:09:44 +0000 (10:09 -0500)]
qa/rgw: bucket notifications use pynose

The nose incompatibility in the multisite tests was fixed by switching to
pynose in https://github.com/ceph/teuthology/pull/1947, so I'm trying the
same here.

Fixes: https://tracker.ceph.com/issues/74573
Signed-off-by: Casey Bodley <cbodley@redhat.com>
6 days agomgr/dashboard: bump nvmeof submodule to 1.6.7 wip-nvmeof-167-centos9-only
Vallari Agrawal [Thu, 19 Feb 2026 14:28:32 +0000 (16:28 +0200)]
mgr/dashboard: bump nvmeof submodule to 1.6.7

update proto files and gateway submodule

Fixes: https://tracker.ceph.com/issues/75015
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
6 days agoMerge pull request #67221 from guits/node-proxy-various-fixes
Guillaume Abrioux [Thu, 19 Feb 2026 12:00:56 +0000 (13:00 +0100)]
Merge pull request #67221 from guits/node-proxy-various-fixes

node-proxy: major refactor and various fixes

6 days agodoc: add docs for seastore support in cephadm
Shraddha Agrawal [Thu, 12 Feb 2026 05:06:31 +0000 (10:36 +0530)]
doc: add docs for seastore support in cephadm

This commit updates the crimson user facing docs to add
instructions on how to deploy a crimson OSD with seastore
objectstore.

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
6 days agoMerge pull request #67277 from afreen23/nvmeof-api
Afreen Misbah [Thu, 19 Feb 2026 10:03:16 +0000 (15:33 +0530)]
Merge pull request #67277 from afreen23/nvmeof-api

mgr/dashboard: Add apis for add/del hosts on namespaces

Reviewed-by: Nizamudeen A <nia@redhat.com>
6 days agoMerge pull request #67165 from Matan-B/wip-matanb-io_uring
Matan Breizman [Thu, 19 Feb 2026 09:37:50 +0000 (11:37 +0200)]
Merge pull request #67165 from Matan-B/wip-matanb-io_uring

src/CMakeLists.txt: Allow Seastar to reuse HAVE_LIBURING

Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
7 days agoMerge pull request #67375 from shraddhaag/wip-shraddhaag-availability-default-state
Shraddha Agrawal [Thu, 19 Feb 2026 06:33:31 +0000 (12:03 +0530)]
Merge pull request #67375 from shraddhaag/wip-shraddhaag-availability-default-state

doc: update default availbility score status

7 days agomon/OSDMonitor: remove unused crush rules after erasure code pools deleted
Nitzan Mordechai [Thu, 14 Sep 2023 09:41:13 +0000 (09:41 +0000)]
mon/OSDMonitor: remove unused crush rules after erasure code pools deleted

When erasure code pools are created, a corresponding Crush rule is concurrently added to the Crush map.
However, when these pools are subsequently deleted, the associated rule persists within the Crush map.
In the event that a pool is re-created with the same name, the rule already exists.
However, if any modifications were made to the rule prior to the pool's deletion,
the new pool will inherit these modifications.

The proposed solution involves the automatic deletion of the Crush rule when a pool is deleted,
but only if no other pools are utilizing that particular rule.
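
The "only if no other pools are utilizing that rule" check can be
sketched in Python as follows. The actual change lives in the C++
OSDMonitor; the function name and the pool-to-rule mapping below are
hypothetical and only model the decision.

```python
def rule_to_remove(pool_rules, deleted_pool):
    """Return the CRUSH rule id to delete along with `deleted_pool`,
    or None when another pool still references that rule.

    `pool_rules` maps pool name -> rule id (a hypothetical model, not
    the OSDMonitor data structures).
    """
    rule = pool_rules[deleted_pool]
    others = [r for name, r in pool_rules.items() if name != deleted_pool]
    return None if rule in others else rule


pools = {"ec1": 3, "ec2": 3, "ec3": 4}
print(rule_to_remove(pools, "ec1"))  # rule 3 is shared with ec2
print(rule_to_remove(pools, "ec3"))  # rule 4 is used by ec3 only
```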

Fixes: https://tracker.ceph.com/issues/62826
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
7 days agoMerge pull request #67132 from rhcs-dashboard/delete-gateway-nodes
naman munet [Thu, 19 Feb 2026 05:45:31 +0000 (11:15 +0530)]
Merge pull request #67132 from rhcs-dashboard/delete-gateway-nodes

mgr/dashboard: delete-gateway-nodes

7 days agomgr/dashboard: Removed Raw capacity toggle
Afreen Misbah [Wed, 18 Feb 2026 02:08:08 +0000 (07:38 +0530)]
mgr/dashboard: Removed Raw capacity toggle

- removed raw capacity toggle
- updated tests
- added polling for Prometheus queries
- added tests for formatter service functions

Signed-off-by: Afreen Misbah <afreen@ibm.com>
7 days agolibrbd/cache/pwl: WriteLogOperationSet::cell can be garbage
Ilya Dryomov [Mon, 16 Feb 2026 21:24:47 +0000 (22:24 +0100)]
librbd/cache/pwl: WriteLogOperationSet::cell can be garbage

The pointer is never initialized but gets printed by operator<<.
Luckily outside of that it's unused.

Fixes: https://tracker.ceph.com/issues/74971
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
7 days agoMerge pull request #67119 from ceph/copilot/add-copilot-instructions-file
Ernesto Puerta [Wed, 18 Feb 2026 18:16:51 +0000 (19:16 +0100)]
Merge pull request #67119 from ceph/copilot/add-copilot-instructions-file

github: define contribution workflows for regular and backport PRs

7 days agocopilot: add GitHub Copilot instructions
copilot-swe-agent[bot] [Thu, 29 Jan 2026 10:01:57 +0000 (10:01 +0000)]
copilot: add GitHub Copilot instructions

GitHub allows adding an instructions file to each repo
(.github/copilot-instructions.md) to improve the behavior
of Copilot Reviews and Agent.

These instructions can also be customized per path, filetype, etc.:
https://docs.github.com/en/copilot/how-tos/configure-custom-instructions/add-repository-instructions

This commit was authored through a GitHub Agent session: https://github.com/ceph/ceph/tasks/edeca07b-eabd-477c-917a-a18e72a0e2c2

Co-authored-by: GitHub Copilot noreply@github.com
Generated-by: Claude Sonnet 4.5
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
7 days agoMerge pull request #66527 from gardran/wip-gardran-dump-omap
Igor Fedotov [Wed, 18 Feb 2026 16:28:49 +0000 (19:28 +0300)]
Merge pull request #66527 from gardran/wip-gardran-dump-omap

os/bluestore: add omap_bytes perf counter.

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>
7 days agodoc/radosgw: document account-root for PUT and POST /admin/user
Casey Bodley [Wed, 18 Feb 2026 15:50:25 +0000 (10:50 -0500)]
doc/radosgw: document account-root for PUT and POST /admin/user

Like the `--account-root` option for `radosgw-admin user create` and
`user modify`, the admin APIs also support the `account-root` query
param.

Fixes: https://tracker.ceph.com/issues/74047
Signed-off-by: Casey Bodley <cbodley@redhat.com>
7 days agodoc/radosgw: document account-id for `POST /admin/user`
Casey Bodley [Wed, 18 Feb 2026 15:48:46 +0000 (10:48 -0500)]
doc/radosgw: document account-id for `POST /admin/user`

the account-id field applies to both `PUT /admin/user` (Create User)
and `POST /admin/user` (Modify User) apis

Signed-off-by: Casey Bodley <cbodley@redhat.com>
7 days agodoc/mgr: document new external ceph cluster support for smb
John Mulligan [Fri, 23 Jan 2026 19:56:02 +0000 (14:56 -0500)]
doc/mgr: document new external ceph cluster support for smb

Document the new feature that allows smb clusters on one Ceph cluster to
make use of CephFS running on a different external Ceph cluster.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
7 days agomgr/cephadm: validate hostname in NodeProxyCache
Guillaume Abrioux [Mon, 16 Feb 2026 13:24:36 +0000 (14:24 +0100)]
mgr/cephadm: validate hostname in NodeProxyCache

This adds a _resolve_hosts() method to resolve hostname from kwargs
and raise OrchestratorError when the host has no node-proxy data.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: improve HTTP error logging in client
Guillaume Abrioux [Mon, 16 Feb 2026 12:49:46 +0000 (13:49 +0100)]
node-proxy: improve HTTP error logging in client

Log the HTTP error code and reason in sessionservice_discover(), and
log the error code along with the body in query() for 5xx responses.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: get serial number instead of SKU
Guillaume Abrioux [Thu, 12 Feb 2026 15:08:39 +0000 (16:08 +0100)]
node-proxy: get serial number instead of SKU

Let's get the serial number instead of SKU.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: allow multiple sources per component
Guillaume Abrioux [Thu, 12 Feb 2026 14:00:11 +0000 (15:00 +0100)]
node-proxy: allow multiple sources per component

COMPONENT_SPECS can now be a single spec or a list of specs per component.
Data from all sources is merged. Unavailable paths are skipped.

Extract get_component_data() from update_component() to support the merge logic.
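
A rough sketch of the merge behavior described above, assuming each
source is fetched by a callable that returns a dict, or None when the
path is unavailable. The signature and the fetch callable are
hypothetical, not the node-proxy API.

```python
def get_component_data(specs, fetch):
    """Merge data from one spec or a list of specs.

    `fetch(spec)` is a hypothetical callable returning a dict, or None
    when the path is unavailable.
    """
    if not isinstance(specs, (list, tuple)):
        specs = [specs]  # accept a single spec as well as a list
    merged = {}
    for spec in specs:
        data = fetch(spec)
        if data is None:
            continue  # unavailable path: skip it
        merged.update(data)
    return merged


# Toy data standing in for Redfish responses.
available = {"Power": {"psu0": {}}, "PowerSubsystem": {"psu1": {}}}
print(get_component_data(["Power", "Missing", "PowerSubsystem"], available.get))
```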

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: re-auth and retry once on 401
Guillaume Abrioux [Wed, 11 Feb 2026 07:41:32 +0000 (08:41 +0100)]
node-proxy: re-auth and retry once on 401

This commit makes node-proxy clear the session, call login() and retry
the request when a 401 http error is caught.
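
The retry-once flow can be sketched as below. Unauthorized, FakeClient
and query_with_reauth() are stand-ins invented for this example; only
the sequence (clear the session, log in, retry once) comes from the
description above.

```python
class Unauthorized(Exception):
    """Stand-in for an HTTP 401 error."""


def query_with_reauth(client, path):
    """Run the request; on a 401, re-authenticate and retry once."""
    try:
        return client.get(path)
    except Unauthorized:
        client.clear_session()
        client.login()
        return client.get(path)  # a second 401 propagates to the caller


class FakeClient:
    """Toy client whose session token starts out expired."""

    def __init__(self):
        self.token = "expired"
        self.logins = 0

    def clear_session(self):
        self.token = None

    def login(self):
        self.logins += 1
        self.token = "fresh"

    def get(self, path):
        if self.token != "fresh":
            raise Unauthorized(path)
        return {"path": path}


client = FakeClient()
result = query_with_reauth(client, "/redfish/v1/Systems")
print(result, client.logins)
```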

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: fix flake8 E721 in _dict_diff
Guillaume Abrioux [Tue, 10 Feb 2026 15:25:47 +0000 (16:25 +0100)]
node-proxy: fix flake8 E721 in _dict_diff

Use "is not" instead of "!=" for type comparison in _dict_diff()
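
For reference, E721 asks for identity checks when comparing types
exactly. A small sketch (the helper name is made up for illustration):

```python
def types_differ(old, new):
    """Exact type comparison using identity, as E721 recommends."""
    return type(old) is not type(new)


# 1 == True in Python, but int and bool are different types.
print(types_differ(1, True))
print(types_differ(1, 2))
```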

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: make the update loop interval configurable
Guillaume Abrioux [Tue, 10 Feb 2026 15:15:42 +0000 (16:15 +0100)]
node-proxy: make the update loop interval configurable

Read system.refresh_interval from config and use it in the update loop
sleep. The default value is 180s when unset.
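
A minimal sketch of the lookup, assuming the config is available as a
nested dict. The helper name and the dict shape are assumptions; the
key and the 180s default come from the text above.

```python
DEFAULT_REFRESH_INTERVAL = 180  # seconds, used when the option is unset


def get_refresh_interval(config):
    """Read system.refresh_interval from a nested dict (assumed shape)."""
    value = config.get("system", {}).get("refresh_interval")
    return DEFAULT_REFRESH_INTERVAL if value is None else value


print(get_refresh_interval({}))
print(get_refresh_interval({"system": {"refresh_interval": 60}}))
```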

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agomgr/node-proxy: fix "ceph orch hardware status --category criticals"
Guillaume Abrioux [Tue, 10 Feb 2026 14:59:55 +0000 (15:59 +0100)]
mgr/node-proxy: fix "ceph orch hardware status --category criticals"

The criticals path was using the wrong data shape:
node-proxy sends status as:

  component -> sys_id -> member

but the code assumed:

  sys_id -> component -> member

This fixes get_critical_from_host() and _criticals_table() to iterate
in the correct order and build the criticals result with the right
nesting.
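
The corrected iteration order can be illustrated with a small sketch.
The function name, the is_critical predicate and the sample data are
hypothetical; only the component -> sys_id -> member nesting is taken
from the description above.

```python
def collect_criticals(status, is_critical):
    """Walk component -> sys_id -> member and keep critical members.

    `is_critical` is a caller-supplied predicate (an assumption here).
    """
    criticals = {}
    for component, systems in status.items():
        for sys_id, members in systems.items():
            for name, member in members.items():
                if is_critical(member):
                    criticals.setdefault(component, {}).setdefault(sys_id, {})[name] = member
    return criticals


# Toy status in the shape node-proxy is described as sending.
status = {
    "power": {"1": {"psu0": {"health": "Critical"}, "psu1": {"health": "OK"}}},
    "fans": {"1": {"fan0": {"health": "OK"}}},
}
found = collect_criticals(status, lambda m: m.get("health") == "Critical")
print(found)
```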

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: normalize storage data per member
Guillaume Abrioux [Tue, 10 Feb 2026 14:46:03 +0000 (15:46 +0100)]
node-proxy: normalize storage data per member

Let's apply normalize_dict() to each member's data only, so the first
level keys (that are redfish member identifiers like "Self") are not
lowercased.

This avoids duplicate entries in hardware status.

Example:

```
[root@node-proxy-1 cephadm]# ./cephadm shell -- ceph orch hardware status --category criticals
Inferring fsid 9d6d6012-067a-11f1-8e61-525400a04a72
Inferring config /var/lib/ceph/9d6d6012-067a-11f1-8e61-525400a04a72/mon.node-proxy-1/config
+--------------+-----------+------+--------+-------+
|     HOST     | COMPONENT | NAME | STATUS | STATE |
+--------------+-----------+------+--------+-------+
| node-proxy-1 |    self   | None |  N/A   |  N/A  |
| node-proxy-1 |    Self   | None |  N/A   |  N/A  |
+--------------+-----------+------+--------+-------+
```
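
A sketch of the per-member normalization. The normalize_dict() below is
a stand-in that lowercases keys recursively, which is an assumption
about the real helper; normalize_members() is a made-up name.

```python
def normalize_dict(data):
    """Stand-in: lowercase the keys of a dict recursively."""
    if not isinstance(data, dict):
        return data
    return {str(key).lower(): normalize_dict(value) for key, value in data.items()}


def normalize_members(members):
    """Normalize each member's data but keep the member identifiers."""
    return {member_id: normalize_dict(member) for member_id, member in members.items()}


members = {"Self": {"Status": {"Health": "OK"}}}
print(normalize_members(members))  # "Self" keeps its case
print(normalize_dict(members))     # normalizing the whole dict lowercases it
```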

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: encapsulate send logic in dedicated method
Guillaume Abrioux [Thu, 5 Feb 2026 09:01:06 +0000 (10:01 +0100)]
node-proxy: encapsulate send logic in dedicated method

Move the "send data to mgr when inventory changed" logic from main()
into a dedicated method _try_send_update().
This flattens the reporter loop and keeps main() to a single call under
the lock.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: log actual data delta in reporter
Guillaume Abrioux [Wed, 4 Feb 2026 14:46:29 +0000 (15:46 +0100)]
node-proxy: log actual data delta in reporter

this adds a _dict_diff() function that computes recursive dict diff
and uses it in reporter to log the delta (truncated at 2048 chars)
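
A recursive dict diff along these lines could look like the sketch
below. The output layout ("added", "removed", "old"/"new") and the
format_delta() helper are assumptions, not the format used by
node-proxy.

```python
def dict_diff(old, new):
    """Return a nested dict describing what changed between old and new."""
    diff = {}
    for key in sorted(set(old) | set(new), key=str):
        if key not in old:
            diff[key] = {"added": new[key]}
        elif key not in new:
            diff[key] = {"removed": old[key]}
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            nested = dict_diff(old[key], new[key])
            if nested:  # only report sub-dicts that actually changed
                diff[key] = nested
        elif type(old[key]) is not type(new[key]) or old[key] != new[key]:
            diff[key] = {"old": old[key], "new": new[key]}
    return diff


def format_delta(old, new, limit=2048):
    """Render the diff as text, truncated to `limit` characters."""
    return str(dict_diff(old, new))[:limit]


before = {"a": {"x": 1, "y": 2}, "b": 1}
after = {"a": {"x": 1, "y": 3}, "c": 0}
print(format_delta(before, after))
```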

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: add periodic heartbeats in main and reporter loops
Guillaume Abrioux [Wed, 4 Feb 2026 14:15:23 +0000 (15:15 +0100)]
node-proxy: add periodic heartbeats in main and reporter loops

This logs an info message every 5 minutes so that logs show the agent
and reporter are still running when nothing else is logged.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: adjust log levels
Guillaume Abrioux [Wed, 4 Feb 2026 13:16:40 +0000 (14:16 +0100)]
node-proxy: adjust log levels

Let's adjust log levels across the project:

- use warning for bad request in the API,
  when thread is not alive and for retry failure,
- use error for OOB load failure,
- use info for backoff interval,
- use debug in send attempts and for member fetch

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: add unit tests
Guillaume Abrioux [Tue, 3 Feb 2026 15:26:16 +0000 (16:26 +0100)]
node-proxy: add unit tests

This adds some unit tests.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: add tox config for mypy, flake8, isort, black
Guillaume Abrioux [Tue, 3 Feb 2026 13:47:54 +0000 (14:47 +0100)]
node-proxy: add tox config for mypy, flake8, isort, black

this adds tox.ini with environments to run mypy, flake8, isort, and
black on the ceph_node_proxy code.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: black and isort formatting pass
Guillaume Abrioux [Tue, 3 Feb 2026 13:45:30 +0000 (14:45 +0100)]
node-proxy: black and isort formatting pass

Format the code with black and isort.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: fix mypy errors
Guillaume Abrioux [Tue, 3 Feb 2026 13:41:13 +0000 (14:41 +0100)]
node-proxy: fix mypy errors

This commit fixes mypy errors by adding explicit types for get_path
and the get_* getter methods, extending SystemBackend with
start/shutdown, and declaring _ca_temp_file on NodeProxyManager.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: handle nested Redfish paths for components
Guillaume Abrioux [Tue, 3 Feb 2026 12:22:03 +0000 (13:22 +0100)]
node-proxy: handle nested Redfish paths for components

Add a _resolve_path() helper to support components whose data
lives under nested Redfish paths when assembling component data.

For instance, power is exposed at 'PowerSubsystem/PowerSupplies'
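
Resolving such a nested path might look like the sketch below. The
function is written as a free function with an invented signature; the
real _resolve_path() helper may differ.

```python
def resolve_path(data, path):
    """Follow a '/'-separated path through nested dicts.

    Returns None when any segment is missing (assumed behavior).
    """
    node = data
    for part in path.split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


# Toy data standing in for fetched Redfish resources.
data = {"PowerSubsystem": {"PowerSupplies": {"Members": []}}}
print(resolve_path(data, "PowerSubsystem/PowerSupplies"))
print(resolve_path(data, "Power/PowerSupplies"))
```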

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: split out config, bootstrap and redfish logic
Guillaume Abrioux [Fri, 30 Jan 2026 15:02:28 +0000 (16:02 +0100)]
node-proxy: split out config, bootstrap and redfish logic

Refactor config, bootstrap, redfish layer, and monitoring.

This commit:
- adds a config module (CephadmConfig, load_cephadm_config and
get_node_proxy_config) and protocols for api/reporter.
- extracts redfish logic to redfish.py
- adds a vendor registry with entrypoints.
- simplifies main() and NodeProxyManager().

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: refactor config loading
Guillaume Abrioux [Fri, 30 Jan 2026 14:33:05 +0000 (15:33 +0100)]
node-proxy: refactor config loading

This commit renames CONFIG to DEFAULTS, adds load_config() with
deep merge, refactors Config to use path + defaults, and makes the
node-proxy config path configurable via bootstrap JSON or env.
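
A deep merge of user settings over defaults can be sketched as follows.
The deep_merge() helper and the sample keys are illustrative
assumptions, not the contents of the real DEFAULTS or load_config().

```python
import copy


def deep_merge(defaults, overrides):
    """Return a new dict where `overrides` is merged into `defaults`,
    recursing into nested dicts instead of replacing them wholesale."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults and user config.
DEFAULTS = {"system": {"refresh_interval": 180, "vendor": "generic"}}
user = {"system": {"refresh_interval": 60}}
config = deep_merge(DEFAULTS, user)
print(config)
```

A shallow dict.update() would drop "vendor" here, which is the case a
deep merge avoids.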

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: add 'vendor based' redfish system selection
Guillaume Abrioux [Fri, 30 Jan 2026 14:12:14 +0000 (15:12 +0100)]
node-proxy: add 'vendor based' redfish system selection

This commit adds REDFISH_SYSTEM_CLASSES registry (generic, dell, ...),
this way the user can choose a system class.

The default value is BaseRedfishSystem (generic) when vendor isn't specified.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: introduce component spec registry and overrides for updates
Guillaume Abrioux [Thu, 29 Jan 2026 12:27:22 +0000 (13:27 +0100)]
node-proxy: introduce component spec registry and overrides for updates

This change introduces a single COMPONENT_SPECS dict and get_update_spec(component)
as the single source of truth for RedFish component update config (collection, path,
fields, attribute). To support hardware that uses different paths or attributes,
get_component_spec_overrides() allows overriding only those fields (via dataclasses.replace())
without duplicating the rest of the spec.
All _update_network, _update_power, etc. now call _run_update(component).

For instance, AtollonSystem uses this to set the power path to 'PowerSubsystem'.
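
The override mechanism relies on dataclasses.replace(), which copies a
dataclass instance with selected fields changed. In the sketch below
the field names follow the list above, while the class name, the field
values and resolve_spec() are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ComponentSpec:
    collection: str
    path: str
    fields: tuple
    attribute: str


# Illustrative values only.
COMPONENT_SPECS = {
    "power": ComponentSpec("Chassis", "Power", ("Name", "Status"), "PowerSupplies"),
}


def resolve_spec(component, overrides=None):
    """Return the base spec with any per-system overrides applied."""
    changes = (overrides or {}).get(component, {})
    return replace(COMPONENT_SPECS[component], **changes)


spec = resolve_spec("power", {"power": {"path": "PowerSubsystem"}})
print(spec)
```

Only the overridden field changes; the base entry in COMPONENT_SPECS is
left untouched because replace() returns a new instance.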

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agomgr/cephadm: safe status/health access in node-proxy agent and inventory
Guillaume Abrioux [Thu, 29 Jan 2026 12:14:40 +0000 (13:14 +0100)]
mgr/cephadm: safe status/health access in node-proxy agent and inventory

This adds helpers in NodeProxyEndpoint and NodeProxyCache to safely
read status.health and status.state.

In NodeProxyEndpoint, methods _get_health_value() and _get_state_value()
are used in get_nok_members() to avoid KeyError on malformed data.

In NodeProxyCache, _get_health_value(), _has_health_value(),
_is_error_status(), and _is_unknown_status() are used in fullreport()
and when filtering 'non ok' members instead of accessing
status['status']['health'] inline.
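
A defensive getter of this kind might look like the sketch below. It is
a free function with assumed behavior (returning None on malformed
data), not the method added to NodeProxyEndpoint or NodeProxyCache.

```python
def get_health_value(member):
    """Return member['status']['health'], or None if the data is malformed."""
    status = member.get("status") if isinstance(member, dict) else None
    if not isinstance(status, dict):
        return None
    return status.get("health")


print(get_health_value({"status": {"health": "OK"}}))
print(get_health_value({"status": None}))
print(get_health_value({}))
```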

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: narrow build_data exception handling and re-raise
Guillaume Abrioux [Thu, 29 Jan 2026 10:38:45 +0000 (11:38 +0100)]
node-proxy: narrow build_data exception handling and re-raise

Catch only KeyError, TypeError, and AttributeError in build_data()
instead of Exception, and re-raise after logging so callers get the
actual error.
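
The pattern is sketched below with a toy build_data(); the fields it
reads are invented for the example and do not reflect the real
function.

```python
import logging

log = logging.getLogger("node-proxy-example")


def build_data(raw):
    """Build a small record, logging and re-raising expected data errors."""
    try:
        return {"name": raw["Name"], "health": raw["Status"]["Health"]}
    except (KeyError, TypeError, AttributeError) as e:
        log.error("failed to build data: %r", e)
        raise  # bare raise keeps the original exception and traceback


print(build_data({"Name": "psu0", "Status": {"Health": "OK"}}))
```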

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: refactor Endpoint/EndpointMgr and fix chassis paths
Guillaume Abrioux [Thu, 29 Jan 2026 09:48:45 +0000 (10:48 +0100)]
node-proxy: refactor Endpoint/EndpointMgr and fix chassis paths

This commit refactors EndpointMgr and Endpoint to use explicit dicts
instead of dynamic attributes. It also fixes member path filtering
so chassis endpoints use Chassis paths.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: use safe field access in storage update
Guillaume Abrioux [Wed, 28 Jan 2026 12:10:26 +0000 (13:10 +0100)]
node-proxy: use safe field access in storage update

Replace direct dictionary access with .get() method when processing
storage fields to handle missing optional fields gracefully.

(extra change: extract get_members_names() call for better readability.)

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agonode-proxy: reduce log verbosity for missing optional fields
Guillaume Abrioux [Wed, 28 Jan 2026 12:05:04 +0000 (13:05 +0100)]
node-proxy: reduce log verbosity for missing optional fields

Change missing field logging from warning to debug level in
RedfishDellSystem, as missing optional fields can be expected behavior
and don't require warning level logging.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
8 days agoMerge pull request #66544 from NitzanMordhai/wip-nitzan-encoder-test-backward-incompa...
SrinivasaBharathKanta [Wed, 18 Feb 2026 08:46:29 +0000 (14:16 +0530)]
Merge pull request #66544 from NitzanMordhai/wip-nitzan-encoder-test-backward-incompability-checks

test/encoding/readable: Add backward incompat checks

8 days agocephdm: add support for raw crimson OSD deployment wip-shraddhaag-crimson-cephadm-raw
Shraddha Agrawal [Tue, 17 Feb 2026 14:41:11 +0000 (20:11 +0530)]
cephdm: add support for raw crimson OSD deployment

This commit adds support for deploying crimson OSDs using
cephadm with the method raw.

Support for lvm crimson OSD was added previously in:
https://github.com/ceph/ceph/pull/66811.

Fixes: https://tracker.ceph.com/issues/74960
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
8 days agoMerge pull request #67287 from nbalacha/wip-nbalacha-71390
Casey Bodley [Tue, 17 Feb 2026 21:40:16 +0000 (16:40 -0500)]
Merge pull request #67287 from nbalacha/wip-nbalacha-71390

rgw/configstore:  don't reinitialize the rados client in RadosRealmWatcher

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Adam Emerson <aemerson@redhat.com>
8 days agoMerge pull request #66455 from cfanz/wip-cfanz-fix-rgw-counter-overflow
J. Eric Ivancich [Tue, 17 Feb 2026 20:07:22 +0000 (15:07 -0500)]
Merge pull request #66455 from cfanz/wip-cfanz-fix-rgw-counter-overflow

rgw: fix overflow of outstanding counter in SimpleThrottler

Reviewed-by: Casey Bodley <cbodley@redhat.com>
8 days agoMerge pull request #67306 from cbodley/wip-rgw-rest-client-strftime
J. Eric Ivancich [Tue, 17 Feb 2026 20:04:27 +0000 (15:04 -0500)]
Merge pull request #67306 from cbodley/wip-rgw-rest-client-strftime

rgw/multisite: use libfmt to format Date header

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
8 days agomgr/dashboard: Add apis for add/del hosts on namespaces
Afreen Misbah [Mon, 9 Feb 2026 16:15:03 +0000 (21:45 +0530)]
mgr/dashboard: Add apis for add/del hosts on namespaces

- these are UI APIs
- also removed the namespace API; the existing API now uses "*" to get all namespaces in a subsystem

Signed-off-by: Afreen Misbah <afreen@ibm.com>
8 days agoMerge pull request #67112 from rhcs-dashboard/cephfs-module-enable
Pedro Gonzalez Gomez [Tue, 17 Feb 2026 17:15:20 +0000 (18:15 +0100)]
Merge pull request #67112 from rhcs-dashboard/cephfs-module-enable

mgr/dashboard: add CephFS Mirroring enablement page

Reviewed-by: Afreen Misbah <afreen@ibm.com>
8 days agomgr/smb: update the handler to support external ceph cluster type
John Mulligan [Fri, 28 Nov 2025 17:53:06 +0000 (12:53 -0500)]
mgr/smb: update the handler to support external ceph cluster type

Signed-off-by: John Mulligan <jmulligan@redhat.com>
8 days agomgr/smb: update staging funcs to include ext ceph cluster resource
John Mulligan [Fri, 23 Jan 2026 19:15:41 +0000 (14:15 -0500)]
mgr/smb: update staging funcs to include ext ceph cluster resource

Update the functions in the staging.py support file to include support
for the newly added external ceph cluster type.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
8 days agomgr/smb: add new external ceph cluster to internal store mechs
John Mulligan [Wed, 3 Dec 2025 19:32:45 +0000 (14:32 -0500)]
mgr/smb: add new external ceph cluster to internal store mechs

Extend the internal store mechanisms to support the newly added
external ceph cluster resource type.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
8 days agomgr/smb: add external ceph cluster to sqlite store
John Mulligan [Wed, 3 Dec 2025 19:35:10 +0000 (14:35 -0500)]
mgr/smb: add external ceph cluster to sqlite store

Starting to get silly with the boilerplate but it should do the trick.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
8 days agoMerge pull request #66484 from samarahu/d4n-remove-erase-from-update
Samarah Uriarte [Tue, 17 Feb 2026 15:21:42 +0000 (09:21 -0600)]
Merge pull request #66484 from samarahu/d4n-remove-erase-from-update

Reviewed-by: Pritha Srivastava <prsrivas@redhat.com>
8 days agodoc: update default availability score status
Shraddha Agrawal [Tue, 17 Feb 2026 14:59:07 +0000 (20:29 +0530)]
doc: update default availability score status

The feature is turned off by default post
https://github.com/ceph/ceph/pull/65545. Update the docs
to reflect the same.

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
8 days agocephadm: add tests for seastore support
Shraddha Agrawal [Wed, 11 Feb 2026 14:23:39 +0000 (19:53 +0530)]
cephadm: add tests for seastore support

This commit adds the following tests:
1. cephadm: JSON roundtrip of a spec with objectstore=seastore.
2. cephadm: validation checks for objectstore values.
3. cephadm to ceph-volume: cmd checks if objectstore=seastore is set.
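
The first two kinds of checks can be sketched with plain dictionaries and the
standard json module. This is only an illustration: it does not use the real
cephadm spec classes, and the set of allowed objectstore values and the
validate_objectstore() helper are assumptions.

```python
import json

# Assumption for this sketch: the accepted objectstore values.
ALLOWED_OBJECTSTORES = {"bluestore", "seastore"}


def validate_objectstore(spec: dict) -> None:
    """Hypothetical validator: reject unknown objectstore values."""
    value = spec.get("spec", {}).get("objectstore", "bluestore")
    if value not in ALLOWED_OBJECTSTORES:
        raise ValueError(f"unsupported objectstore: {value!r}")


spec = {
    "service_type": "osd",
    "service_id": "osd_crimson_seastore",
    "spec": {"objectstore": "seastore", "osd_type": "crimson"},
}

# JSON roundtrip: serializing and parsing should preserve the spec.
roundtripped = json.loads(json.dumps(spec))
validate_objectstore(roundtripped)
```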

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
8 days agocephadm: add support for seastore
Shraddha Agrawal [Mon, 9 Feb 2026 13:48:07 +0000 (19:18 +0530)]
cephadm: add support for seastore

This commit adds support for deploying the seastore objectstore with
cephadm. This can be done in two ways:

1. Using an OSD spec file, set the objectstore argument to
seastore, e.g.:
```
service_type: osd
service_id: osd_crimson_seastore
placement:
  host_pattern: '*'
spec:
  objectstore: seastore
  osd_type: crimson
  data_devices:
    all: true
```

2. Using the --objectstore flag with ceph orch apply osd. Sample command:
```
ceph orch apply osd --all-available-devices --osd-type crimson --objectstore seastore
```

Fixes: https://tracker.ceph.com/issues/74616
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
8 days agomgr/dashboard: delete-gateway-nodes
Sagar Gopale [Fri, 30 Jan 2026 06:42:12 +0000 (12:12 +0530)]
mgr/dashboard: delete-gateway-nodes

Fixes: https://tracker.ceph.com/issues/74336
Signed-off-by: Sagar Gopale <sagar.gopale@ibm.com>
8 days agoMerge pull request #67348 from VallariAg/wip-nvmeof-udisks-disable
Vallari Agrawal [Tue, 17 Feb 2026 12:34:16 +0000 (18:04 +0530)]
Merge pull request #67348 from VallariAg/wip-nvmeof-udisks-disable

qa: Fix coredumps caused by udisks

8 days agodoc/crimson/crimson.rst: introduce Enabling io_uring wip-matanb-io_uring
Matan Breizman [Tue, 17 Feb 2026 12:20:52 +0000 (12:20 +0000)]
doc/crimson/crimson.rst: introduce Enabling io_uring

We could enable this as part of packaged installs, though
letting users enable it explicitly seems like a better approach.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
8 days agoseastar: update submodule to wip-matanb-seastar-feb-26
Matan Breizman [Mon, 9 Feb 2026 14:31:35 +0000 (14:31 +0000)]
seastar: update submodule to wip-matanb-seastar-feb-26

```
Allow provided liburing builds via the URING::uring target.
This allows seastar, when built as a submodule, to use external
liburing.
```

https://github.com/ceph/seastar/tree/wip-matanb-seastar-feb26

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
8 days agoMerge pull request #67370 from kotreshhr/qa-mirror-flake8-fix
Ilya Dryomov [Tue, 17 Feb 2026 12:03:38 +0000 (13:03 +0100)]
Merge pull request #67370 from kotreshhr/qa-mirror-flake8-fix

qa/test_mirroring: Fix flake8 errors

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
9 days agoqa/test_mirroring: Fix flake8 errors
Kotresh HR [Mon, 16 Feb 2026 19:35:59 +0000 (01:05 +0530)]
qa/test_mirroring: Fix flake8 errors

Introduced-by: c1e827247bd20e8a1851bc2d7a9861c12d033ef0
Signed-off-by: Kotresh HR <khiremat@redhat.com>
9 days agoMerge pull request #67231 from guits/ceph-volume-inventory-ls-all
Guillaume Abrioux [Mon, 16 Feb 2026 14:10:45 +0000 (15:10 +0100)]
Merge pull request #67231 from guits/ceph-volume-inventory-ls-all

ceph-volume: include LVM mapper devices in get_devices()

9 days agoMerge pull request #67110 from FredNass/patch-2
Yuval Lifshitz [Mon, 16 Feb 2026 11:48:36 +0000 (13:48 +0200)]
Merge pull request #67110 from FredNass/patch-2

doc/radosgw: rgw_lua_max_memory_per_state defaults to 128K (not 500K)

9 days agomgr/dashboard: Added unit tests
Afreen Misbah [Thu, 12 Feb 2026 10:36:53 +0000 (16:06 +0530)]
mgr/dashboard: Added unit tests

Assisted-by: ChatGPT
Signed-off-by: Afreen Misbah <afreen@ibm.com>
10 days agoMerge pull request #66967 from rhcs-dashboard/gateway-add-modal
Afreen Misbah [Mon, 16 Feb 2026 06:51:32 +0000 (12:21 +0530)]
Merge pull request #66967 from rhcs-dashboard/gateway-add-modal

mgr/dashboard: gateway-add-modal

Reviewed-by: Afreen Misbah <afreen@ibm.com>
10 days agoMerge pull request #67305 from kotreshhr/qa-mirror
Venky Shankar [Mon, 16 Feb 2026 06:02:59 +0000 (11:32 +0530)]
Merge pull request #67305 from kotreshhr/qa-mirror

qa: Add retry logic to remove most sleeps in mirroring tests
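
A generic sketch of replacing a fixed sleep with a bounded retry loop. The
retry_until() helper and its parameters are hypothetical and are not the
helper used in the mirroring tests.

```python
import time


def retry_until(predicate, timeout=30.0, interval=1.0):
    """Poll predicate() until it returns True or the timeout expires.

    Returns True on success and False on timeout. Unlike a fixed
    time.sleep(), the caller waits only as long as it has to.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage: the condition becomes true on the third poll.
attempts = []


def ready():
    attempts.append(1)
    return len(attempts) >= 3


result = retry_until(ready, timeout=5.0, interval=0.01)
```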

Reviewed-by: Venky Shankar <vshankar@redhat.com>
10 days agoMerge pull request #66912 from idryomov/wip-74394
NitzanMordhai [Sun, 15 Feb 2026 15:20:26 +0000 (17:20 +0200)]
Merge pull request #66912 from idryomov/wip-74394

osd/PrimaryLogPG: encode an empty data_bl for empty sparse reads

10 days agoMerge pull request #66894 from tchaikov/wip-ec-isa-fix-cache-collision
NitzanMordhai [Sun, 15 Feb 2026 15:20:05 +0000 (17:20 +0200)]
Merge pull request #66894 from tchaikov/wip-ec-isa-fix-cache-collision

erasure-code/isa: fix cache collision causing buffer overflow

10 days agoMerge pull request #66376 from NitzanMordhai/wip-nitzan-self-test-influx-set-hostname
NitzanMordhai [Sun, 15 Feb 2026 15:17:51 +0000 (17:17 +0200)]
Merge pull request #66376 from NitzanMordhai/wip-nitzan-self-test-influx-set-hostname

qa/tasks/mgr: test_module_selftest set influx hostname to avoid warnings

10 days agoMerge pull request #62067 from ljflores/wip-tracker-67179
NitzanMordhai [Sun, 15 Feb 2026 15:17:33 +0000 (17:17 +0200)]
Merge pull request #62067 from ljflores/wip-tracker-67179

osd: add pg-upmap-primary to clean_pg_upmaps

11 days agoMerge pull request #67353 from idryomov/wip-daemonwatchdog-unbound
Ilya Dryomov [Sat, 14 Feb 2026 21:36:37 +0000 (22:36 +0100)]
Merge pull request #67353 from idryomov/wip-daemonwatchdog-unbound

qa/tasks/daemonwatchdog: fix unbound variable in bark_reason message

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
11 days agoMerge pull request #67351 from idryomov/wip-74712
Ilya Dryomov [Sat, 14 Feb 2026 16:49:59 +0000 (17:49 +0100)]
Merge pull request #67351 from idryomov/wip-74712

qa: krbd_rxbounce.sh: do more reads to generate more errors

Reviewed-by: Ramana Raja <rraja@redhat.com>
12 days agoMerge pull request #65318 from tchaikov/wip-mgr-progress-cleanup
Kefu Chai [Sat, 14 Feb 2026 03:18:18 +0000 (11:18 +0800)]
Merge pull request #65318 from tchaikov/wip-mgr-progress-cleanup

pybind/mgr/progress: cleanups

Reviewed-by: Samuel Just <sjust@redhat.com>
12 days agomgr/smb: add new external ceph cluster resource type
John Mulligan [Wed, 26 Nov 2025 19:59:50 +0000 (14:59 -0500)]
mgr/smb: add new external ceph cluster resource type

Add a new resource type that will allow an smb cluster to use
cephfs from a ceph cluster *other* than the one hosting the smb
instance.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
12 days agomgr/smb: add new enum for external ceph cluster resource type
John Mulligan [Wed, 3 Dec 2025 19:31:53 +0000 (14:31 -0500)]
mgr/smb: add new enum for external ceph cluster resource type

Signed-off-by: John Mulligan <jmulligan@redhat.com>