git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log


Kotresh HR [Sat, 21 Feb 2026 13:51:02 +0000 (19:21 +0530)]

tools/cephfs_mirror: Fix assert while opening handles

Issue:
When the crawler or a datasync thread encounters an error,
the crawler may be notified by a datasync thread and bail
out, which unregisters the particular dir_root. The other
datasync threads might still hold the same syncm object and
try to open the handles, at which point the following assert
is hit.

ceph_assert(it != m_registered.end());

Cause:
This happens because the in_flight counter in the syncm object
only tracked whether a thread was processing an actual job from
the data queue.

Fix:
Make the in_flight counter in the syncm object track the active
syncm object, i.e., increment it as soon as a datasync thread
gets a reference to the syncm and decrement it when the reference
goes out of scope.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>
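
The counting scheme described in the fix can be sketched as follows. This is an illustration only: the real code is C++, and the class, the `hold()` helper and `can_unregister()` are hypothetical names. It also assumes that unregistering a dir_root is gated on the counter reaching zero, which the message implies but does not state.

```python
import threading
from contextlib import contextmanager

class SyncM:
    """Hypothetical stand-in for a syncm object shared by datasync threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self.in_flight = 0

    @contextmanager
    def hold(self):
        # Count the reference as soon as a thread obtains the syncm,
        # not only while it is processing a job from the data queue.
        with self._lock:
            self.in_flight += 1
        try:
            yield self
        finally:
            with self._lock:
                self.in_flight -= 1

    def can_unregister(self):
        # Assumption: unregistering is safe only when nobody holds the syncm.
        with self._lock:
            return self.in_flight == 0

syncm = SyncM()
with syncm.hold():
    held = syncm.can_unregister()      # a thread still references the syncm
released = syncm.can_unregister()      # the reference has been dropped
```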


Kotresh HR [Sat, 21 Feb 2026 10:36:31 +0000 (16:06 +0530)]

tools/cephfs_mirror: Fix dequeue of syncm on error

When an error is encountered in the crawler thread or a datasync
thread while processing a syncm object, it's possible
that multiple datasync threads attempt to dequeue the
syncm object. Though this is safe, add a condition to avoid
it.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Sat, 21 Feb 2026 10:27:42 +0000 (15:57 +0530)]

tools/cephfs_mirror: Handle errors in crawler thread

Any error encountered in crawler threads should be
communicated to the data sync threads by marking the
crawl error in the corresponding syncm object. The
data sync threads would finish pending jobs, dequeue
the syncm object and notify crawler to bail out.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Sat, 21 Feb 2026 10:18:56 +0000 (15:48 +0530)]

tools/cephfs_mirror: Handle error in datasync thread

On any error encountered in datasync threads while syncing
a particular syncm dataq, mark the datasync error and
communicate the error to the corresponding syncm's crawler
which is waiting to take a snapshot. The crawler will log
the error and bail out.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Thu, 8 Jan 2026 09:18:01 +0000 (14:48 +0530)]

tools/cephfs_mirror: Add debug to capture file sync time

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Sat, 21 Feb 2026 08:34:44 +0000 (14:04 +0530)]

tools/cephfs_mirror: Efficient use of data sync threads

The job queue is something like below for data sync threads.

  |syncm1|---------|syncm2|------...---|syncmn|
     |                |                   |
   |m_sync_dataq|   |m_sync_dataq|    |m_sync_dataq|

There is a global queue of SyncMechanism objects (syncm). Each syncm
object represents a single snapshot being synced, and each syncm
object owns an m_sync_dataq holding the list of files in the snapshot
to be synced.

The data sync threads should consume the next syncm job
if the present syncm has no pending work. This can
happen when the last file being synced in the present syncm
job is a large file from its syncm_dataq. In this case, one
data sync thread is busy syncing the large file while the rest of the
data sync threads just wait for it to finish (to avoid a busy loop).
Instead, the idle data sync threads could start consuming the next
syncm job.

This requires a change to the data structure.
- syncm_q has to be a std::deque instead of a std::queue, because a
   syncm in the middle can finish syncing first and then needs to be
   removed before the front

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>
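
The reason for switching containers can be illustrated with a small sketch. The real code is C++ (std::deque); this Python version uses collections.deque as a stand-in, and the queue contents and the `dequeue_finished()` helper are hypothetical.

```python
from collections import deque

# Hypothetical stand-in for the global syncm queue.
syncm_q = deque(["syncm1", "syncm2", "syncm3"])

def dequeue_finished(q, syncm):
    """Remove a finished syncm wherever it sits in the queue."""
    # The membership check makes a repeated dequeue a harmless no-op.
    if syncm in q:
        q.remove(syncm)

# The middle entry finishes first, so it is removed before the front.
dequeue_finished(syncm_q, "syncm2")
remaining = list(syncm_q)
```

A strict FIFO queue only exposes its front element, so it cannot express this removal without draining the entries ahead of it.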


Kotresh HR [Wed, 14 Jan 2026 12:50:26 +0000 (18:20 +0530)]

tools/cephfs_mirror: Make max datasync threads configurable

Add a config to configure the number of data sync threads.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Sat, 21 Feb 2026 08:28:47 +0000 (13:58 +0530)]

tools/cephfs_mirror: Synchronize taking snapshot

The crawler/entry creation thread needs to wait until
all the data is synced by datasync threads to take
the snapshot. This patch adds the necessary conditions
for the same.

It is important for the conditional flag to be part
of SyncMechanism and not part of PeerReplayer class.
The following bug would be hit if it were part of
PeerReplayer class.

When multiple directories are configured for mirroring as below
/d0                /d1              /d2
Crawler1         Crawler2          Crawler3
DoneEntryOps     DoneEntryOps      DoneEntryOps
WaitForSafeSnap  WaitForSafeSnap   WaitForSafeSnap

When all crawler threads are waiting as above, a data sync thread
that is done processing /d1 would notify, waking up all the crawlers
and causing spurious/unwanted wakeups and half-baked snapshots.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>
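
The difference a per-syncm flag makes can be sketched as below. This is an illustration in Python rather than the C++ implementation; the `SyncM` class, the `data_synced` flag and both helper functions are hypothetical names. A single lock and condition variable are shared, but each crawler waits on the flag of its own syncm, so a notification for /d1 does not release the crawler for /d0.

```python
import threading

lock = threading.Lock()
cond = threading.Condition(lock)     # shared by crawlers and data sync threads

class SyncM:
    def __init__(self, dir_root):
        self.dir_root = dir_root
        self.data_synced = False     # per-snapshot flag, not a global one

def crawler_wait(syncm, timeout):
    # Returns True only if this syncm's own data has been synced.
    with cond:
        return cond.wait_for(lambda: syncm.data_synced, timeout=timeout)

def datasync_done(syncm):
    with cond:
        syncm.data_synced = True
        cond.notify_all()            # wakes every waiter; predicates filter

d0, d1 = SyncM("/d0"), SyncM("/d1")
results = {}

def run(syncm, timeout):
    results[syncm.dir_root] = crawler_wait(syncm, timeout)

t0 = threading.Thread(target=run, args=(d0, 0.5))
t1 = threading.Thread(target=run, args=(d1, 5.0))
t0.start()
t1.start()
datasync_done(d1)                    # only /d1 has finished syncing
t0.join()
t1.join()
```

With a single flag shared by all crawlers, the /d0 crawler would also proceed here and take a snapshot before its data is synced.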


Kotresh HR [Wed, 14 Jan 2026 12:17:47 +0000 (17:47 +0530)]

tools/cephfs_mirror: Add SnapDiff entries to dataq

Add SnapDiff entries to dataq and process the same
in datasync threads similar to RemoteSync entries.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Sat, 21 Feb 2026 08:20:56 +0000 (13:50 +0530)]

tools/cephfs_mirror: Process entries from dataq

Consume entries from syncm's data queue and sync
them to remote.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Sat, 21 Feb 2026 08:19:19 +0000 (13:49 +0530)]

tools/cephfs_mirror: Move dir_root to SyncMechanism

Store m_dir_root in the parent (SyncMechanism) to make
it accessible in the data sync threads to sync
files.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Sat, 21 Feb 2026 08:18:15 +0000 (13:48 +0530)]

tools/cephfs_mirror: Fix data sync threads completion logic

We need to know exactly when all data sync threads have completed
processing a syncm. If a few threads finish their jobs, they all
need to wait for the threads still processing that syncm to
complete. Otherwise the finished threads would busy loop until
the in-progress threads finish.

Only after all threads finish processing can the crawler
thread be notified to take the snapshot.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Tue, 9 Dec 2025 10:49:57 +0000 (16:19 +0530)]

tools/cephfs_mirror: Populate dataq for RemoteSync

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Wed, 14 Jan 2026 11:11:50 +0000 (16:41 +0530)]

tools/cephfs_mirror: Move remote_mkdir to SyncMechanism

This is required as SyncMechanism::get_entry would sync
directories during crawl.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Tue, 9 Dec 2025 10:05:08 +0000 (15:35 +0530)]

tools/cephfs_mirror: Mark crawl finished

After entry operations are synced and stack is empty,
mark the crawl as finished so the data sync threads'
wait logic works correctly and doesn't indefinitely wait.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Wed, 14 Jan 2026 09:56:25 +0000 (15:26 +0530)]

tools/cephfs_mirror: Add m_sync_data queue

Add data sync queue for each SyncMechanism.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Wed, 14 Jan 2026 08:47:07 +0000 (14:17 +0530)]

tools/cephfs_mirror: Add SyncMechanism Queue

Add a queue of shared_ptr of type SyncMechanism.
Since the element type is a shared_ptr to SyncMechanism, the queue
can hold shared_ptrs to both RemoteSync and SnapDiffSync objects.
Each SyncMechanism holds the queue of SyncEntry
items to be synced using the data sync threads.

The SyncMechanism queue needs to hold shared_ptrs because
all the data sync threads need to access the SyncMechanism
object to process the SyncEntry queue.

This patch sets up the building blocks for the same.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Tue, 25 Nov 2025 10:25:05 +0000 (15:55 +0530)]

tools/cephfs_mirror: Join datasync threads on shutdown

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Wed, 14 Jan 2026 08:27:34 +0000 (13:57 +0530)]

tools/cephfs_mirror: Use the existing m_lock and m_cond

The entire snapshot is synced outside the lock.
The m_lock and m_cond pair is used for data sync
threads along with crawler threads to work well
with all terminal conditions like shutdown and
existing data structures.

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Kotresh HR [Mon, 24 Nov 2025 14:43:04 +0000 (20:13 +0530)]

tools/cephfs_mirror: Add a pool of datasync threads

Fixes: https://tracker.ceph.com/issues/73452
Signed-off-by: Kotresh HR <khiremat@redhat.com>


Ilya Dryomov [Sat, 21 Feb 2026 16:12:24 +0000 (17:12 +0100)]

Merge pull request #66735 from ajarr/wip-fix-schedule-start-time

mgr/rbd_support: Fix "start-time" arg behavior

Reviewed-by: Mykola Golub <mykola.golub@clyso.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>


Ramana Raja [Wed, 24 Dec 2025 10:24:50 +0000 (05:24 -0500)]

mgr/rbd_support: Fix "start-time" arg behavior

The "start-time" argument, optionally passed when adding or removing a
mirror image snapshot schedule or a trash purge schedule, does not
behave as intended. It is meant to schedule an initial operation at a
specific time of day in a given time zone. Instead, it offsets the
schedule’s anchor time. By default, the scheduler uses the UNIX epoch as
the anchor to calculate recurring schedule times, and "start-time"
simply shifts this anchor away from UTC, which can confuse users. For
example:

```
$ # current time
$ date --universal
Wed Dec 10 05:55:21 PM UTC 2025
$ rbd mirror snapshot schedule add -p data --image img1 1h 19:00Z
$ rbd mirror snapshot schedule ls -p data --image img1
every 15m starting at 19:00:00+00:00
```

A user might assume that the scheduler will run the first snapshot each
day at 19:00 UTC and then run snapshots every 15 minutes. Instead, the
scheduler runs the first snapshot at 18:00 UTC and then continues at the
configured interval:

```
$ rbd mirror snapshot schedule status -p data --image img1
SCHEDULE TIME IMAGE
2025-12-10 18:00:00 data/img1
```

Additionally, the "start-time" argument accepts a full ISO 8601
timestamp but silently ignores everything except hour, minute, and time
zone. Even time zone handling is incorrect: specifying "23:00-01:00"
with an interval of "1d" results in a snapshot taken once per day at
22:00 UTC rather than 00:00 UTC, because only utcoffset.seconds is used
while utcoffset.days is ignored.

Fix:
Similar to the handling of the "start" argument in the FS snap-schedule
manager module, require "start-time" to use an ISO 8601 date-time format
with a mandatory date component. Time and time zone are optional and
default to 00:00 and UTC respectively.

The "start-time" now defines the anchor time used to compute recurring
schedule times. The default anchor remains the UNIX epoch. Existing
on-disk schedules with legacy-format "start-time" values are updated to
include the date Jan 1, 1970.

The `snap schedule ls` output now displays "start-time" with date and
time in the format "%Y-%m-%d %H:%M:00". The display time is in UTC.

Fixes: https://tracker.ceph.com/issues/74192
Signed-off-by: Ramana Raja <rraja@redhat.com>
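
The anchor arithmetic and the time zone pitfall described above can be sketched as follows. This is not the module's code: `next_run()` is a hypothetical helper that assumes schedule times lie on the grid anchor + k * interval, with the UNIX epoch as the default anchor as stated in the message.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def next_run(now, interval, anchor=EPOCH):
    """Return the first time after `now` on the grid anchor + k * interval."""
    k = (now - anchor) // interval + 1   # timedelta // timedelta is a floored int
    return anchor + k * interval

now = datetime(2025, 12, 10, 17, 55, 21, tzinfo=timezone.utc)
first = next_run(now, timedelta(hours=1))

# Why utcoffset.seconds alone is misleading for negative offsets:
# timedelta normalizes -1h to days=-1, seconds=82800 (minus one day plus 23 h).
tz = timezone(timedelta(hours=-1))
offset = datetime(1970, 1, 1, 23, 0, tzinfo=tz).utcoffset()
parts = (offset.days, offset.seconds, offset.total_seconds())
```

Reading only `offset.seconds` yields a 23 hour shift instead of minus one hour, which matches the 22:00 versus 00:00 UTC discrepancy described for "23:00-01:00"; `total_seconds()` gives the signed value.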


John Mulligan [Fri, 20 Feb 2026 19:53:21 +0000 (14:53 -0500)]

Merge pull request #66557 from phlogistonjohn/jjm-smb-exo-cluster

smb: allow smb clusters to use cephfs from a different ceph cluster

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Xavi Hernandez <xhernandez@gmail.com>
Reviewed-by: Adam King <adking@redhat.com>


Afreen Misbah [Fri, 20 Feb 2026 16:32:36 +0000 (22:02 +0530)]

Merge pull request #67256 from afreen23/storage-card

mgr/dashboard: Add storage card to overview page

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>


Adam Emerson [Fri, 20 Feb 2026 16:20:39 +0000 (11:20 -0500)]

Merge pull request #67423 from cbodley/wip-74573

qa/rgw: bucket notifications use pynose

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>


Casey Bodley [Fri, 20 Feb 2026 14:12:03 +0000 (09:12 -0500)]

Merge pull request #67397 from cbodley/wip-74047

doc/radosgw: document account-root for PUT and POST /admin/user

Reviewed-by: Ville Ojamo <git2233+ceph@ojamo.eu>


Ilya Dryomov [Fri, 20 Feb 2026 11:26:00 +0000 (12:26 +0100)]

Merge pull request #67368 from idryomov/wip-write-log-operation-set-cell

librbd/cache/pwl: WriteLogOperationSet::cell can be garbage

Reviewed-by: Miki Patel <miki.patel132@gmail.com>


Venky Shankar [Fri, 20 Feb 2026 08:49:31 +0000 (14:19 +0530)]

Merge PR #66907 into main

* refs/pull/66907/head:

Reviewed-by: Anoop C S <anoopcs@cryptolab.net>


Shraddha Agrawal [Fri, 20 Feb 2026 06:11:16 +0000 (11:41 +0530)]

Merge pull request #67274 from shraddhaag/wip-shraddhaag-cephadm-seastore-support

cephadm: add support for seastore


Shraddha Agrawal [Fri, 20 Feb 2026 06:10:49 +0000 (11:40 +0530)]

Merge pull request #67374 from shraddhaag/wip-shraddhaag-crimson-cephadm-raw

cephadm: add support for raw bluestore + crimson OSD deployment


Bill Scales [Thu, 19 Feb 2026 16:58:36 +0000 (16:58 +0000)]

Merge pull request #66685 from bill-scales/issue74218

osd: FastEC: always update pwlc epoch when activating

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>


Casey Bodley [Thu, 19 Feb 2026 15:09:44 +0000 (10:09 -0500)]

qa/rgw: bucket notifications use pynose

nose incompatibility in multisite tests was fixed by switching to pynose
in https://github.com/ceph/teuthology/pull/1947, so i'm trying the same
here

Fixes: https://tracker.ceph.com/issues/74573
Signed-off-by: Casey Bodley <cbodley@redhat.com>


Guillaume Abrioux [Thu, 19 Feb 2026 12:00:56 +0000 (13:00 +0100)]

Merge pull request #67221 from guits/node-proxy-various-fixes

node-proxy: major refactor and various fixes


Shraddha Agrawal [Thu, 12 Feb 2026 05:06:31 +0000 (10:36 +0530)]

doc: add docs for seastore support in cephadm

This commit updates the crimson user facing docs to add
instructions on how to deploy a crimson OSD with seastore
objectstore.

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>


Afreen Misbah [Thu, 19 Feb 2026 10:03:16 +0000 (15:33 +0530)]

Merge pull request #67277 from afreen23/nvmeof-api

mgr/dashboard: Add apis for add/del hosts on namespaces

Reviewed-by: Nizamudeen A <nia@redhat.com>


Matan Breizman [Thu, 19 Feb 2026 09:37:50 +0000 (11:37 +0200)]

Merge pull request #67165 from Matan-B/wip-matanb-io_uring

src/CMakeLists.txt: Allow Seastar to reuse HAVE_LIBURING

Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>


Shraddha Agrawal [Thu, 19 Feb 2026 06:33:31 +0000 (12:03 +0530)]

Merge pull request #67375 from shraddhaag/wip-shraddhaag-availability-default-state

doc: update default availability score status


naman munet [Thu, 19 Feb 2026 05:45:31 +0000 (11:15 +0530)]

Merge pull request #67132 from rhcs-dashboard/delete-gateway-nodes

mgr/dashboard: delete-gateway-nodes


Afreen Misbah [Wed, 18 Feb 2026 02:08:08 +0000 (07:38 +0530)]

mgr/dashboard: Removed Raw capacity toggle

- removed raw capacity toggle
- updated tests
- added polling for prometheus queries
- added tests for formatter service functions

Signed-off-by: Afreen Misbah <afreen@ibm.com>


Ilya Dryomov [Mon, 16 Feb 2026 21:24:47 +0000 (22:24 +0100)]

librbd/cache/pwl: WriteLogOperationSet::cell can be garbage

The pointer is never initialized but gets printed by operator<<.
Luckily outside of that it's unused.

Fixes: https://tracker.ceph.com/issues/74971
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>


Ernesto Puerta [Wed, 18 Feb 2026 18:16:51 +0000 (19:16 +0100)]

Merge pull request #67119 from ceph/copilot/add-copilot-instructions-file

github: define contribution workflows for regular and backport PRs


copilot-swe-agent[bot] [Thu, 29 Jan 2026 10:01:57 +0000 (10:01 +0000)]

copilot: add GitHub Copilot instructions

GitHub allows adding an instructions file to each repo
(.github/copilot-instructions.md) to improve the behavior
of Copilot Reviews and Agent.

These instructions can also be customized per path, filetype, etc.:
https://docs.github.com/en/copilot/how-tos/configure-custom-instructions/add-repository-instructions

This commit was authored through a Github Agent session: https://github.com/ceph/ceph/tasks/edeca07b-eabd-477c-917a-a18e72a0e2c2

Co-authored-by: GitHub Copilot noreply@github.com
Generated-by: Claude Sonnet 4.5
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>


Igor Fedotov [Wed, 18 Feb 2026 16:28:49 +0000 (19:28 +0300)]

Merge pull request #66527 from gardran/wip-gardran-dump-omap

os/bluestore: add omap_bytes perf counter.

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>


Casey Bodley [Wed, 18 Feb 2026 15:50:25 +0000 (10:50 -0500)]

doc/radosgw: document account-root for PUT and POST /admin/user

like the `--account-root` option for `radosgw-admin user create` and
`user modify`, the admin apis also support the `account-root` query
param

Fixes: https://tracker.ceph.com/issues/74047
Signed-off-by: Casey Bodley <cbodley@redhat.com>


Casey Bodley [Wed, 18 Feb 2026 15:48:46 +0000 (10:48 -0500)]

doc/radosgw: document account-id for `POST /admin/user`

the account-id field applies to both `PUT /admin/user` (Create User)
and `POST /admin/user` (Modify User) apis

Signed-off-by: Casey Bodley <cbodley@redhat.com>


John Mulligan [Fri, 23 Jan 2026 19:56:02 +0000 (14:56 -0500)]

doc/mgr: document new external ceph cluster support for smb

Document the new feature that allows smb clusters on one Ceph cluster to
make use of CephFS running on a different external Ceph cluster.

Signed-off-by: John Mulligan <jmulligan@redhat.com>


Guillaume Abrioux [Mon, 16 Feb 2026 13:24:36 +0000 (14:24 +0100)]

mgr/cephadm: validate hostname in NodeProxyCache

This adds a _resolve_hosts() method to resolve hostname from kwargs
and raise OrchestratorError when the host has no node-proxy data.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Mon, 16 Feb 2026 12:49:46 +0000 (13:49 +0100)]

node-proxy: improve HTTP error logging in client

This commit makes it log the http error with the code and the reason
in sessionservice_discover() and log the error code along with the
body in query() for 5xx responses.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Thu, 12 Feb 2026 15:08:39 +0000 (16:08 +0100)]

node-proxy: get serial number instead of SKU

Let's get the serial number instead of SKU.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Thu, 12 Feb 2026 14:00:11 +0000 (15:00 +0100)]

node-proxy: allow multiple sources per component

COMPONENT_SPECS can now be a single spec or a list of specs per component.
Data from all sources is merged. Unavailable paths are skipped.

Extract get_component_data() from update_component() to support the merge logic.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Wed, 11 Feb 2026 07:41:32 +0000 (08:41 +0100)]

node-proxy: re-auth and retry once on 401

This commit makes node-proxy clear the session, call login() and retry
the request when a 401 http error is caught.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
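
The retry behavior described above can be sketched as follows. This is not the node-proxy client: the `Client` class, the `Unauthorized` exception and the fake transport are hypothetical, and only the control flow (clear the session, log in again, retry exactly once) follows the message.

```python
class Unauthorized(Exception):
    """Stand-in for an HTTP 401 error raised by the transport."""

class Client:
    def __init__(self, transport):
        self.transport = transport   # hypothetical callable(token) -> response
        self.token = None
        self.logins = 0

    def login(self):
        self.logins += 1
        self.token = f"token-{self.logins}"

    def query(self):
        if self.token is None:
            self.login()
        try:
            return self.transport(self.token)
        except Unauthorized:
            # Session expired: clear it, log in again and retry exactly once.
            # A second 401 propagates to the caller instead of looping.
            self.token = None
            self.login()
            return self.transport(self.token)

def flaky_transport(token):
    # Pretend the first session token has expired on the server side.
    if token == "token-1":
        raise Unauthorized()
    return {"ok": True, "token": token}

client = Client(flaky_transport)
response = client.query()
```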


Guillaume Abrioux [Tue, 10 Feb 2026 15:25:47 +0000 (16:25 +0100)]

node-proxy: fix flake8 E721 in _dict_diff

Use "is not" instead of "!=" for type comparison in _dict_diff()

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Tue, 10 Feb 2026 15:15:42 +0000 (16:15 +0100)]

node-proxy: make the update loop interval configurable

Read system.refresh_interval from config and use it in the update loop
sleep. The default value is 180s when unset.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Tue, 10 Feb 2026 14:59:55 +0000 (15:59 +0100)]

mgr/node-proxy: fix "ceph orch hardware status --category criticals"

The criticals path was using the wrong data shape:
node-proxy sends status as:

component -> sys_id -> member

but the code assumed:

sys_id -> component -> member

This fixes get_critical_from_host() and _criticals_table() to iterate
in the correct order and build the criticals result with the right
nesting.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
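
The corrected iteration order can be sketched as below. The sample payload, the field names under each member and the "Critical" comparison are assumptions made for illustration; only the nesting order component -> sys_id -> member comes from the message.

```python
# Hypothetical status payload in the shape node-proxy sends:
# component -> sys_id -> member
status = {
    "power": {"1": {"psu0": {"status": {"health": "Critical"}}}},
    "fans": {"1": {"fan0": {"status": {"health": "OK"}}}},
}

def criticals(status):
    """Collect critical members, keeping the component -> sys_id nesting."""
    result = {}
    for component, systems in status.items():
        for sys_id, members in systems.items():
            for name, data in members.items():
                health = data.get("status", {}).get("health")
                if health == "Critical":
                    result.setdefault(component, {}).setdefault(sys_id, {})[name] = data
    return result

found = criticals(status)
```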


Guillaume Abrioux [Tue, 10 Feb 2026 14:46:03 +0000 (15:46 +0100)]

node-proxy: normalize storage data per member

Let's apply normalize_dict() to each member's data only, so the first
level keys (that are redfish member identifiers like "Self") are not
lowercased.

This avoids duplicate entries in hardware status.

Example:

```
[root@node-proxy-1 cephadm]# ./cephadm shell -- ceph orch hardware status --category criticals
Inferring fsid 9d6d6012-067a-11f1-8e61-525400a04a72
Inferring config /var/lib/ceph/9d6d6012-067a-11f1-8e61-525400a04a72/mon.node-proxy-1/config
+--------------+-----------+------+--------+-------+
|     HOST     | COMPONENT | NAME | STATUS | STATE |
+--------------+-----------+------+--------+-------+
| node-proxy-1 |    self   | None |  N/A   |  N/A  |
| node-proxy-1 |    Self   | None |  N/A   |  N/A  |
+--------------+-----------+------+--------+-------+
```

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
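
The per-member normalization can be sketched as follows. The `normalize_dict()` shown here is a simplified stand-in that only lowercases keys recursively; the real helper may do more. The sample data is hypothetical.

```python
def normalize_dict(d):
    # Simplified stand-in: lowercase every key, recursing into nested dicts.
    return {
        key.lower(): normalize_dict(value) if isinstance(value, dict) else value
        for key, value in d.items()
    }

raw = {"Self": {"Description": "Storage", "Status": {"Health": "OK"}}}

# Normalize each member's data only, so the first-level keys (Redfish member
# identifiers such as "Self") keep their original case.
per_member = {member: normalize_dict(data) for member, data in raw.items()}

# Normalizing the whole mapping would lowercase the identifier as well.
whole = normalize_dict(raw)
```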


Guillaume Abrioux [Thu, 5 Feb 2026 09:01:06 +0000 (10:01 +0100)]

node-proxy: encapsulate send logic in dedicated method

Move the "send data to mgr when inventory changed" logic from main()
into a dedicated method _try_send_update().
This flattens the reporter loop and keeps main() to a single call under
the lock.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Wed, 4 Feb 2026 14:46:29 +0000 (15:46 +0100)]

node-proxy: log actual data delta in reporter

this adds a _dict_diff() function that computes recursive dict diff
and uses it in reporter to log the delta (truncated at 2048 chars)

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
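
A recursive dict diff with truncated logging could look like the sketch below. It is not the actual _dict_diff(): the output shape ({"old": ..., "new": ...}) and the `format_delta()` helper are assumptions, and a missing key is not distinguished from a key set to None.

```python
def dict_diff(old, new):
    """Return the entries that differ between two dicts, recursing into dicts."""
    diff = {}
    for key in old.keys() | new.keys():
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            nested = dict_diff(a, b)
            if nested:
                diff[key] = nested
        elif type(a) is not type(b) or a != b:
            diff[key] = {"old": a, "new": b}
    return diff

def format_delta(diff, limit=2048):
    # Truncate the logged representation so one large change cannot flood logs.
    text = repr(diff)
    return text if len(text) <= limit else text[:limit] + "..."

old = {"fans": {"fan0": {"rpm": 1000}}, "power": {"psu0": "OK"}}
new = {"fans": {"fan0": {"rpm": 1200}}, "power": {"psu0": "OK"}}
delta = dict_diff(old, new)
```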


Guillaume Abrioux [Wed, 4 Feb 2026 14:15:23 +0000 (15:15 +0100)]

node-proxy: add periodic heartbeats in main and reporter loops

This logs an info message every 5 minutes so that logs show the agent
and reporter are still running when nothing else is logged.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Wed, 4 Feb 2026 13:16:40 +0000 (14:16 +0100)]

node-proxy: adjust log levels

Let's adjust log levels across the project:

- use warning for bad request in the API,
when thread is not alive and for retry failure,
- use error for OOB load failure,
- use info for backoff interval,
- use debug in send attempts and for member fetch

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Tue, 3 Feb 2026 15:26:16 +0000 (16:26 +0100)]

node-proxy: add unit tests

This adds some unit tests.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Tue, 3 Feb 2026 13:47:54 +0000 (14:47 +0100)]

node-proxy: add tox config for mypy, flake8, isort, black

this adds tox.ini with environments to run mypy, flake8, isort, and
black on the ceph_node_proxy code.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Tue, 3 Feb 2026 13:45:30 +0000 (14:45 +0100)]

node-proxy: black and isort formatting pass

Format the code with black and isort.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Tue, 3 Feb 2026 13:41:13 +0000 (14:41 +0100)]

node-proxy: fix mypy errors

this commit fixes mypy errors by adding explicit types for get_path
and get_* getters methods, extending SystemBackend with
start/shutdown and declaring _ca_temp_file on NodeProxyManager

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Tue, 3 Feb 2026 12:22:03 +0000 (13:22 +0100)]

node-proxy: handle nested Redfish paths for components

Add a _resolve_path() helper to support components whose data
lives under nested Redfish paths when assembling component data.

For instance, power is exposed at 'PowerSubsystem/PowerSupplies'

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
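
Resolving a nested path can be sketched as below. This assumes the component data is available as nested dicts keyed by path segment, which may not match how the real _resolve_path() walks Redfish endpoints; `resolve_path()` and the sample data are hypothetical.

```python
def resolve_path(data, path):
    """Walk a 'A/B/C' style path through nested dicts; None if any part is missing."""
    node = data
    for part in path.split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node

system = {"PowerSubsystem": {"PowerSupplies": {"psu0": {"Name": "PSU 0"}}}}
supplies = resolve_path(system, "PowerSubsystem/PowerSupplies")
missing = resolve_path(system, "PowerSubsystem/Batteries")
```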


Guillaume Abrioux [Fri, 30 Jan 2026 15:02:28 +0000 (16:02 +0100)]

node-proxy: split out config, bootstrap and redfish logic

refactor config, bootstrap, redfish layer, and monitoring:

this:
- adds a config module (CephadmConfig, load_cephadm_config and
get_node_proxy_config) and protocols for api/reporter.
- extracts redfish logic to redfish.py
- adds a vendor registry with entrypoints.
- simplifies main() and NodeProxyManager().

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Fri, 30 Jan 2026 14:33:05 +0000 (15:33 +0100)]

node-proxy: refactor config loading

This commit renames CONFIG to DEFAULTS, adds load_config() with
deep merge, refactors Config to use path + defaults, and makes the
node-proxy config path configurable via bootstrap JSON or env.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
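
A deep merge of user settings over defaults can be sketched as follows. The `deep_merge()` helper and the keys in DEFAULTS are illustrative assumptions, not the contents of the real module.

```python
import copy

# Illustrative defaults; the real DEFAULTS contents are not shown in the message.
DEFAULTS = {"system": {"refresh_interval": 180}, "logging": {"level": "INFO"}}

def deep_merge(base, override):
    """Return a new dict where `override` wins, merging nested dicts key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

config = deep_merge(DEFAULTS, {"system": {"refresh_interval": 60}})
```

Unlike a shallow `dict.update()`, overriding one nested key leaves its sibling keys and the untouched top-level sections at their default values.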


Guillaume Abrioux [Fri, 30 Jan 2026 14:12:14 +0000 (15:12 +0100)]

node-proxy: add 'vendor based' redfish system selection

This commit adds REDFISH_SYSTEM_CLASSES registry (generic, dell, ...),
this way the user can choose a system class.

The default value is BaseRedfishSystem (generic) when vendor isn't specified.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Thu, 29 Jan 2026 12:27:22 +0000 (13:27 +0100)]

node-proxy: introduce component spec registry and overrides for updates

This change introduces a single COMPONENT_SPECS dict and get_update_spec(component)
as the single source of truth for RedFish component update config (collection, path,
fields, attribute). To support hardware that uses different paths or attributes,
get_component_spec_overrides() allows overriding only those fields (via dataclasses.replace())
without duplicating the rest of the spec.
All _update_network, _update_power, etc. now call _run_update(component).

For instance, AtollonSystem uses this to set the power path to 'PowerSubsystem'.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
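
The override mechanism can be sketched with dataclasses.replace() as below. The field names (collection, path, fields, attribute) and the function names come from the message; the `ComponentSpec` class name, the class layout, the sample values and `VendorSystem` are hypothetical.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ComponentSpec:
    collection: str
    path: str
    fields: tuple
    attribute: str = ""

# Single source of truth; the values here are made up for the example.
COMPONENT_SPECS = {
    "power": ComponentSpec("PowerSupplies", "Power", ("Name", "Status")),
}

class BaseSystem:
    def get_component_spec_overrides(self):
        return {}

    def get_update_spec(self, component):
        spec = COMPONENT_SPECS[component]
        overrides = self.get_component_spec_overrides().get(component, {})
        # replace() copies the spec and changes only the overridden fields.
        return replace(spec, **overrides)

class VendorSystem(BaseSystem):
    def get_component_spec_overrides(self):
        return {"power": {"path": "PowerSubsystem"}}

generic = BaseSystem().get_update_spec("power")
vendor = VendorSystem().get_update_spec("power")
```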


Guillaume Abrioux [Thu, 29 Jan 2026 12:14:40 +0000 (13:14 +0100)]

mgr/cephadm: safe status/health access in node-proxy agent and inventory

This adds helpers in NodeProxyEndpoint and NodeProxyCache to safely
read status.health and status.state.

In NodeProxyEndpoint, methods _get_health_value() and _get_state_value()
are used in get_nok_members() to avoid KeyError on malformed data.

In NodeProxyCache, _get_health_value(), _has_health_value(),
_is_error_status(), and _is_unknown_status() are used in fullreport()
and when filtering 'non ok' members instead of accessing
status['status']['health'] inline.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
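
A safe accessor of the kind described could look like this sketch. It is a standalone function written for illustration, not the actual method, and it assumes malformed data means a missing or non-dict "status" entry.

```python
def get_health_value(member):
    """Return member['status']['health'] or None when the data is malformed."""
    status = member.get("status") if isinstance(member, dict) else None
    if not isinstance(status, dict):
        return None
    return status.get("health")

ok = get_health_value({"status": {"health": "OK"}})
no_status = get_health_value({})
bad_status = get_health_value({"status": None})
```

Direct indexing such as `member['status']['health']` would raise KeyError or TypeError on the last two inputs.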


Guillaume Abrioux [Thu, 29 Jan 2026 10:38:45 +0000 (11:38 +0100)]

node-proxy: narrow build_data exception handling and re-raise

With this commit, build_data() catches only KeyError, TypeError, and
AttributeError instead of Exception, and re-raises after logging
so callers get the actual error.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
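
The narrowed handling can be sketched as follows. The body of `build_data()` and the field names are hypothetical; only the pattern (catch the three specific exceptions, log, re-raise) follows the message.

```python
import logging

log = logging.getLogger("node-proxy-sketch")

def build_data(raw):
    try:
        # Hypothetical field names, used only to trigger the error paths.
        return {"name": raw["Name"], "health": raw["Status"]["Health"]}
    except (KeyError, TypeError, AttributeError) as e:
        # Log for operators, then re-raise so callers see the real error
        # instead of a silently swallowed failure.
        log.error("build_data failed: %r", e)
        raise

good = build_data({"Name": "psu0", "Status": {"Health": "OK"}})

try:
    build_data({"Name": "psu0"})
    raised = None
except KeyError as e:
    raised = e
```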


Guillaume Abrioux [Thu, 29 Jan 2026 09:48:45 +0000 (10:48 +0100)]

node-proxy: refactor Endpoint/EndpointMgr and fix chassis paths

This commit refactors EndpointMgr and Endpoint to use explicit dicts
instead of dynamic attributes. It also fixes member path filtering
so chassis endpoints use Chassis paths.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Wed, 28 Jan 2026 12:10:26 +0000 (13:10 +0100)]

node-proxy: use safe field access in storage update

Replace direct dictionary access with .get() method when processing
storage fields to handle missing optional fields gracefully.

(extra change: extract get_members_names() call for better readability.)

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


Guillaume Abrioux [Wed, 28 Jan 2026 12:05:04 +0000 (13:05 +0100)]

node-proxy: reduce log verbosity for missing optional fields

Change missing field logging from warning to debug level in
RedfishDellSystem, as missing optional fields can be expected behavior
and don't require warning level logging.

Fixes: https://tracker.ceph.com/issues/74749
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>


SrinivasaBharathKanta [Wed, 18 Feb 2026 08:46:29 +0000 (14:16 +0530)]

Merge pull request #66544 from NitzanMordhai/wip-nitzan-encoder-test-backward-incompability-checks

test/encoding/readable: Add backward incompat checks

Shraddha Agrawal [Tue, 17 Feb 2026 14:41:11 +0000 (20:11 +0530)]

cephadm: add support for raw crimson OSD deployment

This commit adds support for deploying crimson OSDs using
cephadm with the raw method.

Support for lvm crimson OSD was added previously in:
https://github.com/ceph/ceph/pull/66811.

Fixes: https://tracker.ceph.com/issues/74960
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>

Casey Bodley [Tue, 17 Feb 2026 21:40:16 +0000 (16:40 -0500)]

Merge pull request #67287 from nbalacha/wip-nbalacha-71390

rgw/configstore: don't reinitialize the rados client in RadosRealmWatcher

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Adam Emerson <aemerson@redhat.com>

J. Eric Ivancich [Tue, 17 Feb 2026 20:07:22 +0000 (15:07 -0500)]

Merge pull request #66455 from cfanz/wip-cfanz-fix-rgw-counter-overflow

rgw: fix overflow of outstanding counter in SimpleThrottler

Reviewed-by: Casey Bodley <cbodley@redhat.com>

J. Eric Ivancich [Tue, 17 Feb 2026 20:04:27 +0000 (15:04 -0500)]

Merge pull request #67306 from cbodley/wip-rgw-rest-client-strftime

rgw/multisite: use libfmt to format Date header

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>

Afreen Misbah [Mon, 9 Feb 2026 16:15:03 +0000 (21:45 +0530)]

mgr/dashboard: Add apis for add/del hosts on namespaces

- these are UI APIs
- also removed the namespace API; the existing API now uses "*" to get all namespaces in a subsystem

Signed-off-by: Afreen Misbah <afreen@ibm.com>

Pedro Gonzalez Gomez [Tue, 17 Feb 2026 17:15:20 +0000 (18:15 +0100)]

Merge pull request #67112 from rhcs-dashboard/cephfs-module-enable

mgr/dashboard: add CephFS Mirroring enablement page

Reviewed-by: Afreen Misbah <afreen@ibm.com>

John Mulligan [Fri, 28 Nov 2025 17:53:06 +0000 (12:53 -0500)]

mgr/smb: update the handler to support external ceph cluster type

Signed-off-by: John Mulligan <jmulligan@redhat.com>

John Mulligan [Fri, 23 Jan 2026 19:15:41 +0000 (14:15 -0500)]

mgr/smb: update staging funcs to include ext ceph cluster resource

Update the functions in the staging.py support file to include support
for the newly added external ceph cluster type.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

John Mulligan [Wed, 3 Dec 2025 19:32:45 +0000 (14:32 -0500)]

mgr/smb: add new external ceph cluster to internal store mechs

Extend the internal store mechanisms to support the newly added
external ceph cluster resource type.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

John Mulligan [Wed, 3 Dec 2025 19:35:10 +0000 (14:35 -0500)]

mgr/smb: add external ceph cluster to sqlite store

Starting to get silly with the boilerplate but it should do the trick.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

Samarah Uriarte [Tue, 17 Feb 2026 15:21:42 +0000 (09:21 -0600)]

Merge pull request #66484 from samarahu/d4n-remove-erase-from-update

Reviewed-by: Pritha Srivastava <prsrivas@redhat.com>

Shraddha Agrawal [Tue, 17 Feb 2026 14:59:07 +0000 (20:29 +0530)]

doc: update default availability score status

The feature is turned off by default post
https://github.com/ceph/ceph/pull/65545. Update the docs
to reflect the same.

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>

Shraddha Agrawal [Wed, 11 Feb 2026 14:23:39 +0000 (19:53 +0530)]

cephadm: add tests for seastore support

This commit adds the following tests:
1. cephadm: JSON roundtrip of a spec with objectstore=seastore.
2. cephadm: validation checks for objectstore values.
3. cephadm to ceph-volume: cmd checks if objectstore=seastore is set.
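
The first two kinds of test can be pictured with the sketch below. It does
not use the real cephadm spec classes: OsdSpecSketch, its defaults, and the
set of allowed objectstore values are all assumptions for illustration. The
ceph-volume command check (item 3) is not covered.

```python
import json
from dataclasses import asdict, dataclass

# Assumed set of accepted values; the real validation may differ.
ALLOWED_OBJECTSTORES = {"bluestore", "seastore"}


@dataclass
class OsdSpecSketch:
    """Hypothetical stand-in for an OSD service spec."""
    service_id: str
    objectstore: str = "bluestore"
    osd_type: str = "classic"  # assumed default, for illustration

    def validate(self) -> None:
        if self.objectstore not in ALLOWED_OBJECTSTORES:
            raise ValueError(f"unsupported objectstore: {self.objectstore}")

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text: str) -> "OsdSpecSketch":
        return cls(**json.loads(text))


spec = OsdSpecSketch("osd_crimson_seastore", objectstore="seastore", osd_type="crimson")
spec.validate()
roundtripped = OsdSpecSketch.from_json(spec.to_json())
```

A roundtrip test then asserts that roundtripped equals spec, and a validation
test asserts that an unknown objectstore value raises an error.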

Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>

Shraddha Agrawal [Mon, 9 Feb 2026 13:48:07 +0000 (19:18 +0530)]

cephadm: add support for seastore

This commit adds support for deploying the seastore objectstore with
cephadm. This can be done in two ways:

1. Using an OSD spec file, set the objectstore argument to
seastore, e.g.:
```
service_type: osd
service_id: osd_crimson_seastore
placement:
  host_pattern: '*'
spec:
  objectstore: seastore
  osd_type: crimson
  data_devices:
    all: true
```

2. Using the --objectstore flag with ceph orch apply osd. Sample command:
```
ceph orch apply osd --all-available-devices --osd-type crimson --objectstore seastore
```

Fixes: https://tracker.ceph.com/issues/74616
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>

Sagar Gopale [Fri, 30 Jan 2026 06:42:12 +0000 (12:12 +0530)]

mgr/dashboard: delete-gateway-nodes

Fixes: https://tracker.ceph.com/issues/74336
Signed-off-by: Sagar Gopale <sagar.gopale@ibm.com>

Vallari Agrawal [Tue, 17 Feb 2026 12:34:16 +0000 (18:04 +0530)]

Merge pull request #67348 from VallariAg/wip-nvmeof-udisks-disable

qa: Fix coredumps caused by udisks

Matan Breizman [Tue, 17 Feb 2026 12:20:52 +0000 (12:20 +0000)]

doc/crimson/crimson.rst: introduce Enabling io_uring

We could enable this as part of packaged installs, but
letting users enable it explicitly seems like a better approach.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Matan Breizman [Mon, 9 Feb 2026 14:31:35 +0000 (14:31 +0000)]

seastar: update submodule to wip-matanb-seastar-feb-26

```
Allow provided liburing builds via the URING::uring target.
This allows seastar, when built as a submodule, to use external
liburing.
```

https://github.com/ceph/seastar/tree/wip-matanb-seastar-feb26

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Ilya Dryomov [Tue, 17 Feb 2026 12:03:38 +0000 (13:03 +0100)]

Merge pull request #67370 from kotreshhr/qa-mirror-flake8-fix

qa/test_mirroring: Fix flake8 errors

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>

Kotresh HR [Mon, 16 Feb 2026 19:35:59 +0000 (01:05 +0530)]

qa/test_mirroring: Fix flake8 errors

Introduced-by: c1e827247bd20e8a1851bc2d7a9861c12d033ef0
Signed-off-by: Kotresh HR <khiremat@redhat.com>

Guillaume Abrioux [Mon, 16 Feb 2026 14:10:45 +0000 (15:10 +0100)]

Merge pull request #67231 from guits/ceph-volume-inventory-ls-all

ceph-volume: include LVM mapper devices in get_devices()

Yuval Lifshitz [Mon, 16 Feb 2026 11:48:36 +0000 (13:48 +0200)]

Merge pull request #67110 from FredNass/patch-2

doc/radosgw: rgw_lua_max_memory_per_state defaults to 128K (not 500K)

Afreen Misbah [Thu, 12 Feb 2026 10:36:53 +0000 (16:06 +0530)]

mgr/dashboard: Added unit tests

Assisted-by: ChatGPT
Signed-off-by: Afreen Misbah <afreen@ibm.com>

Afreen Misbah [Mon, 16 Feb 2026 06:51:32 +0000 (12:21 +0530)]

Merge pull request #66967 from rhcs-dashboard/gateway-add-modal

mgr/dashboard: gateway-add-modal

Reviewed-by: Afreen Misbah <afreen@ibm.com>

Venky Shankar [Mon, 16 Feb 2026 06:02:59 +0000 (11:32 +0530)]

Merge pull request #67305 from kotreshhr/qa-mirror

qa: Add retry logic to remove most sleeps in mirroring tests

Reviewed-by: Venky Shankar <vshankar@redhat.com>

NitzanMordhai [Sun, 15 Feb 2026 15:20:26 +0000 (17:20 +0200)]

Merge pull request #66912 from idryomov/wip-74394

osd/PrimaryLogPG: encode an empty data_bl for empty sparse reads
