git-server-git.apps.pok.os.sepia.ceph.com Git - ceph-ci.git/log
3 weeks ago Merge pull request #66600 from pkalever/fix-resync-on-sync
Ilya Dryomov [Tue, 3 Feb 2026 17:30:47 +0000 (18:30 +0100)]
Merge pull request #66600 from pkalever/fix-resync-on-sync

rbd-mirror: allow resync while a group snapshot is still syncing

Reviewed-by: VinayBhaskar-V <vvarada@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
3 weeks ago rbd-mirror: allow resync while a group snapshot is still syncing
Prasanna Kumar Kalever [Thu, 11 Dec 2025 05:23:50 +0000 (10:53 +0530)]
rbd-mirror: allow resync while a group snapshot is still syncing

Currently, a resync operation is not allowed while a group snapshot sync is
still in progress; the sync has to finish first. This means that if snapshot
synchronization becomes stuck for any reason, a resync cannot be triggered,
which is an undesirable operational limitation.

This change enables resync requests to be processed even when a group snapshot
is still syncing, allowing a resync in the middle of syncing a group snapshot.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
3 weeks ago Merge pull request #66653 from pkalever/remove-creating-snaps-on-restart
Ilya Dryomov [Mon, 2 Feb 2026 14:31:31 +0000 (15:31 +0100)]
Merge pull request #66653 from pkalever/remove-creating-snaps-on-restart

rbd-mirror: fix stuck to sync mirror group snaps on restart of daemon

Reviewed-by: VinayBhaskar-V <vvarada@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
3 weeks ago cleanup: minor improvements throughout the replayer
Prasanna Kumar Kalever [Mon, 2 Feb 2026 13:18:15 +0000 (18:48 +0530)]
cleanup: minor improvements throughout the replayer

The routines below, which make calls to image replayers, are defined:
prune_image_snapshot()
get_replayers_by_image_id()
set_image_replayer_end_limits()

This commit starts using them.

Also use get_replayers_by_image_id() in other places of the group replayer.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
3 weeks ago rbd-mirror: prune group snapshots stuck in CREATING state on restart
Prasanna Kumar Kalever [Tue, 16 Dec 2025 14:07:05 +0000 (19:37 +0530)]
rbd-mirror: prune group snapshots stuck in CREATING state on restart

After a daemon restart, prune the entire group snapshot if it remains in
GROUP_SNAPSHOT_STATE_CREATING. This aligns group snapshot handling with the
image replayer logic and ensures that all member image snapshots are cleanly
deleted and recreated.

This is required because some image snapshots in the group may not have started
object copying prior to the restart, which can otherwise lead to missing image
state, object-map, or related metadata.

Also, add the necessary tests to validate interrupted synchronization during
group snapshot syncing. Specifically, cover the following scenarios:
Scenario 1: The snapshot on the secondary is in the creating phase when the
            daemon is restarted.
Scenario 2: The snapshot on the secondary is in the created phase when the
            daemon is restarted.
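
The restart handling described in this commit message can be sketched roughly as
follows. This is an illustrative model only, not the rbd-mirror code: the
GroupSnapState enum, the LocalGroupSnap struct and snaps_to_prune_on_restart()
are hypothetical names introduced for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the group snapshot states named in the commit
// (GROUP_SNAPSHOT_STATE_CREATING / GROUP_SNAPSHOT_STATE_CREATED).
enum class GroupSnapState { Creating, Created };

// Hypothetical, simplified view of a group snapshot on the secondary.
struct LocalGroupSnap {
  std::string id;
  GroupSnapState state;
};

// After a daemon restart, select the group snapshots that should be pruned
// as a whole so that the sync can recreate them. Snapshots that already
// reached the Created state are left alone.
std::vector<std::string> snaps_to_prune_on_restart(
    const std::vector<LocalGroupSnap>& snaps) {
  std::vector<std::string> to_prune;
  for (const auto& snap : snaps) {
    if (snap.state == GroupSnapState::Creating) {
      to_prune.push_back(snap.id);
    }
  }
  return to_prune;
}
```

The two branches correspond to Scenario 1 (creating phase, pruned) and
Scenario 2 (created phase, kept) above.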

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
4 weeks ago rbd-mirror: avoid deleting image snapshots that are part of a group snapshot
Prasanna Kumar Kalever [Tue, 16 Dec 2025 13:47:26 +0000 (19:17 +0530)]
rbd-mirror: avoid deleting image snapshots that are part of a group snapshot

On daemon restart, the image replayer currently deletes and recreates image
snapshots if object copying has not yet started, in order to avoid missing
image state such as object-map or metadata.

This logic is unnecessary for image snapshots that are part of mirror group
snapshots. By the time a group snapshot reaches GROUP_SNAPSHOT_STATE_CREATED,
all member image snapshots are already guaranteed to be in the CREATED state.
Deleting such image snapshots provides no benefit and can cause group snapshots
to become stuck (as currently happens) waiting for those image snapshots.

Skip image snapshot deletion when the snapshot is part of a group snapshot.
A follow-up commit will address handling group snapshots that remain in
GROUP_SNAPSHOT_STATE_CREATING across a daemon restart by deleting the group
snapshot and letting the sync recreate it as a whole.
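
The decision described above might look something like the sketch below. It is
a simplified model under stated assumptions: ImageSnapInfo and
should_delete_on_restart() are hypothetical names, and the real image replayer
tracks considerably more state.

```cpp
// Hypothetical, simplified view of an image snapshot on the secondary.
struct ImageSnapInfo {
  bool object_copy_started;     // has object copying begun for this snapshot?
  bool part_of_group_snapshot;  // is it a member of a mirror group snapshot?
};

// On daemon restart, decide whether the image replayer should delete and
// recreate the image snapshot.
bool should_delete_on_restart(const ImageSnapInfo& snap) {
  if (snap.part_of_group_snapshot) {
    // Group members are never deleted individually; the group snapshot is
    // handled as a whole.
    return false;
  }
  // Standalone snapshot: recreate it only if copying had not started yet.
  return !snap.object_copy_started;
}
```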

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
4 weeks ago cleanup: simplify group image snapshot validation
Prasanna Kumar Kalever [Tue, 16 Dec 2025 12:53:18 +0000 (18:23 +0530)]
cleanup: simplify group image snapshot validation

Validating that an image snapshot is associated with a group snapshot requires
checking either for a valid group_spec or for the presence of a group_snap_id
in cls::rbd::MirrorSnapshotNamespace.

This commit removes the redundant validation that checked for both and relies
on the minimal condition of checking for a valid group_spec.
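
A minimal sketch of the simplified check, assuming stand-in types: the
GroupSpec and MirrorSnapshotNamespace structs below are reduced, hypothetical
versions of the cls::rbd types, and is_group_image_snapshot() is a name made up
for this example.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, reduced stand-in for the group spec carried by the namespace.
struct GroupSpec {
  std::string group_id;
  int64_t pool_id = -1;

  // Assumed validity rule for this sketch: both fields must be set.
  bool is_valid() const { return !group_id.empty() && pool_id != -1; }
};

// Hypothetical, reduced stand-in for cls::rbd::MirrorSnapshotNamespace.
struct MirrorSnapshotNamespace {
  GroupSpec group_spec;
  std::string group_snap_id;
};

// Simplified validation: a valid group_spec alone marks the image snapshot
// as belonging to a group snapshot; group_snap_id is not consulted.
bool is_group_image_snapshot(const MirrorSnapshotNamespace& ns) {
  return ns.group_spec.is_valid();
}
```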

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2 months ago Merge pull request #66535 from pkalever/restore-readonly-check
Ilya Dryomov [Fri, 5 Dec 2025 18:38:26 +0000 (19:38 +0100)]
Merge pull request #66535 from pkalever/restore-readonly-check

librbd: restore readonly image check in snap_remove()

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2 months ago Merge pull request #66534 from pkalever/avoid-unnecessary-log
Ilya Dryomov [Fri, 5 Dec 2025 13:56:43 +0000 (14:56 +0100)]
Merge pull request #66534 from pkalever/avoid-unnecessary-log

rbd-mirror: suppress unnecessary log message during snapshot unlink

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2 months ago librbd: restore readonly image check in snap_remove()
Prasanna Kumar Kalever [Fri, 5 Dec 2025 11:46:33 +0000 (17:16 +0530)]
librbd: restore readonly image check in snap_remove()

The check preventing snapshot removal on read-only images was previously
commented out. This commit restores the original behavior to ensure that
snap_remove() correctly rejects operations on images that are not writable.
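
As a rough illustration of the restored guard (not the librbd code itself): the
ImageCtxSketch struct and snap_remove_sketch() below are hypothetical, and the
use of -EROFS as the rejection code is an assumption of this sketch.

```cpp
#include <cerrno>
#include <string>

// Hypothetical, minimal image context with only the flag the check needs.
struct ImageCtxSketch {
  bool read_only = false;
};

// Reject snapshot removal on images that are not writable. The error code
// (-EROFS) is an assumption; the actual removal logic is elided.
int snap_remove_sketch(const ImageCtxSketch& ictx,
                       const std::string& snap_name) {
  if (ictx.read_only) {
    return -EROFS;
  }
  (void)snap_name;  // removal itself is outside the scope of this sketch
  return 0;
}
```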

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2 months ago rbd-mirror: suppress unnecessary log message during snapshot unlink
Prasanna Kumar Kalever [Fri, 5 Dec 2025 11:29:13 +0000 (16:59 +0530)]
rbd-mirror: suppress unnecessary log message during snapshot unlink

Remote snapshots without a mirror peer UUID are filtered out early. Once the
peer UUID is removed from a remote snapshot, it no longer appears in
m_remote_group_snaps locally. As a result, mirror_group_snapshot_unlink_peer()
will not find that snapshot in m_remote_group_snaps, and this condition is
now expected.

Avoid printing the log message that warns about the snapshot not being present:
the snapshot is not actually missing (it has only been filtered out), so the
warning can be misleading. Moreover, the message would otherwise be printed
repeatedly until the local snapshot is eventually removed, creating unnecessary
noise in the logs.
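
The filtering and the resulting lookup miss can be modeled as below. This is a
sketch under assumptions: RemoteGroupSnap, filter_remote_snaps() and
find_for_unlink() are hypothetical names, and the real code keeps this data in
m_remote_group_snaps with a different type.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified remote group snapshot record.
struct RemoteGroupSnap {
  std::string id;
  std::set<std::string> mirror_peer_uuids;
};

// Early filter: keep only remote snapshots that still carry a peer UUID.
std::vector<RemoteGroupSnap> filter_remote_snaps(
    const std::vector<RemoteGroupSnap>& all) {
  std::vector<RemoteGroupSnap> kept;
  for (const auto& snap : all) {
    if (!snap.mirror_peer_uuids.empty()) {
      kept.push_back(snap);
    }
  }
  return kept;
}

// Lookup used before unlinking a peer. A miss is an expected outcome once
// the peer UUID has been removed, so the caller should not log a warning.
bool find_for_unlink(const std::vector<RemoteGroupSnap>& filtered,
                     const std::string& id) {
  return std::any_of(filtered.begin(), filtered.end(),
                     [&id](const RemoteGroupSnap& s) { return s.id == id; });
}
```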

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
3 months ago Merge pull request #66424 from VinayBhaskar-V/wip-fix-test
Ilya Dryomov [Wed, 26 Nov 2025 09:45:44 +0000 (10:45 +0100)]
Merge pull request #66424 from VinayBhaskar-V/wip-fix-test

qa/workunits/rbd: fix check_snapshot_info in rbd_groups.sh

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 months ago qa/workunits/rbd: fix check_snapshot_info in rbd_groups.sh
VinayBhaskar-V [Wed, 26 Nov 2025 09:15:35 +0000 (14:45 +0530)]
qa/workunits/rbd: fix check_snapshot_info in rbd_groups.sh

Signed-off-by: VinayBhaskar-V <vvarada@redhat.com>
3 months ago Merge pull request #65164 from VinayBhaskar-V/wip-add-complete-field
Ilya Dryomov [Sat, 22 Nov 2025 14:34:32 +0000 (15:34 +0100)]
Merge pull request #65164 from VinayBhaskar-V/wip-add-complete-field

rbd-mirror: integration of the new GroupSnapshotNamespaceMirror::complete field

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
3 months ago rbd-mirror: integration of the new GroupSnapshotNamespaceMirror::complete field
VinayBhaskar-V [Wed, 20 Aug 2025 18:30:09 +0000 (18:30 +0000)]
rbd-mirror: integration of the new GroupSnapshotNamespaceMirror::complete field

This commit introduces the new field **complete**, of type **MirrorGroupSnapshotCompleteState** enum,
to the GroupSnapshotNamespaceMirror structure. This change is necessary to align behavior of
mirror group snapshots with that of mirror image snapshots, allowing for a precise differentiation
between a group snapshot that has been created and one that has been fully synced.

**1. Handling Old-Style Snapshots**

Decoding Old Snapshots: The original GroupSnapshotNamespaceMirror structure lacked the complete field,
which implicitly defaulted to a bool value of false upon initialization.
When an old snapshot (lacking the complete field) is decoded by an upgraded client,
the implicit default value maps to MIRROR_GROUP_SNAPSHOT_COMPLETE_IF_CREATED.

Completion Check: A snapshot is identified as old-style by checking its complete field, i.e.
complete == MIRROR_GROUP_SNAPSHOT_COMPLETE_IF_CREATED. For such group snapshots, sync completion
is determined by checking the state field, i.e. state == GROUP_SNAPSHOT_STATE_CREATED.

During an upgrade where **OSDs have not yet been updated**, the new client is forced to create
snapshots using the old style. These snapshots are initialized with MIRROR_GROUP_SNAPSHOT_COMPLETE_IF_CREATED
and stay in that state, to prevent immediate, incorrect cleanup by the old OSDs. In this case the
state field is set to **GROUP_SNAPSHOT_STATE_CREATED** only after the snapshot has completed its sync.

**2. Handling New-Style Snapshots**

New snapshots are initialized with complete == **MIRROR_GROUP_SNAPSHOT_INCOMPLETE**,
state == GROUP_SNAPSHOT_STATE_CREATING. The group snapshot's state is marked as GROUP_SNAPSHOT_STATE_CREATED
as soon as its metadata is fully available and stored.

Completion Check: The snapshot's sync is confirmed only when both complete == MIRROR_GROUP_SNAPSHOT_COMPLETE
and state == GROUP_SNAPSHOT_STATE_CREATED hold.

This approach ensures seamless transition and compatibility, allowing the system to correctly interpret the
synchronization status of both old and newly created group snapshots.
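
The two completion checks above can be combined into one predicate, sketched
here under assumptions. The enum and struct definitions are reduced stand-ins
for the Ceph types named in this message, the numeric values are guesses (only
"the old default decodes as COMPLETE_IF_CREATED" comes from the text), and
is_sync_complete() is a hypothetical helper name.

```cpp
#include <cstdint>

// Stand-in for the enum named in the commit; numeric values are assumed.
enum MirrorGroupSnapshotCompleteState : uint8_t {
  MIRROR_GROUP_SNAPSHOT_COMPLETE_IF_CREATED = 0,  // old-style (decoded default)
  MIRROR_GROUP_SNAPSHOT_INCOMPLETE = 1,           // new-style, still syncing
  MIRROR_GROUP_SNAPSHOT_COMPLETE = 2,             // new-style, fully synced
};

// Stand-in for the group snapshot state; numeric values are assumed.
enum GroupSnapshotState : uint8_t {
  GROUP_SNAPSHOT_STATE_CREATING = 0,
  GROUP_SNAPSHOT_STATE_CREATED = 1,
};

// Hypothetical, reduced view of a mirror group snapshot.
struct GroupSnapshotSketch {
  GroupSnapshotState state;
  MirrorGroupSnapshotCompleteState complete;
};

// True when the group snapshot counts as fully synced.
inline bool is_sync_complete(const GroupSnapshotSketch& snap) {
  if (snap.state != GROUP_SNAPSHOT_STATE_CREATED) {
    return false;  // both styles require state == CREATED
  }
  if (snap.complete == MIRROR_GROUP_SNAPSHOT_COMPLETE_IF_CREATED) {
    return true;   // old-style: CREATED alone means synced
  }
  return snap.complete == MIRROR_GROUP_SNAPSHOT_COMPLETE;  // new-style
}
```

Note that a new-style snapshot in CREATED state with complete still set to
MIRROR_GROUP_SNAPSHOT_INCOMPLETE is reported as not synced, which is the
distinction the new field is meant to provide.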

Signed-off-by: VinayBhaskar-V <vvarada@redhat.com>
3 months ago Merge pull request #66315 from pkalever/clean-in-complete-snaps
Ilya Dryomov [Fri, 21 Nov 2025 13:10:13 +0000 (14:10 +0100)]
Merge pull request #66315 from pkalever/clean-in-complete-snaps

librbd: fix incomplete group snapshot not being removed on creation failure

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 months ago librbd: fix incomplete group snapshot not being removed on creation failure
Prasanna Kumar Kalever [Wed, 19 Nov 2025 11:43:36 +0000 (17:13 +0530)]
librbd: fix incomplete group snapshot not being removed on creation failure

Problem:
GroupCreatePrimaryRequest doesn't remove group snapshot when group
snapshot creation encounters an error in notify_quiesce(). As a result,
INCOMPLETE snapshots from previous failed attempts remain uncleaned.

Log snippet:
librbd::watcher::Notifier: 0x7fbdac0168b0 handle_notify: r=-110
librbd::mirror::snapshot::GroupCreatePrimaryRequest:  handle_notify_quiesce: r=-110
librbd::mirror::snapshot::GroupCreatePrimaryRequest:  notify_unquiesce:
librbd::watcher::Notifier: 0x7fbda83c59a0 handle_notify: r=-110
librbd::mirror::snapshot::GroupCreatePrimaryRequest:  handle_notify_unquiesce: r=-110
librbd::mirror::snapshot::GroupCreatePrimaryRequest:  handle_notify_unquiesce: failed to notify the unquiesce requests: (110) Connection timed out
librbd::mirror::snapshot::GroupCreatePrimaryRequest:  close_images:
librbd::mirror::snapshot::GroupCreatePrimaryRequest:  handle_close_images: r=0
librbd::mirror::snapshot::GroupCreatePrimaryRequest:  finish: r=-110

When snapshot creation fails, the remove snap path that cleans the snapshot is
skipped, leaving behind INCOMPLETE snapshot entries.

Solution:
Ensure remove_snap_metadata() is executed when quiesce fails, as in the scenario
above, so that INCOMPLETE snapshots are consistently cleaned up.

Note:
Another issue was identified and fixed around GroupUnlinkPeerRequest::remove_peer_uuid():
in the case of an INCOMPLETE snapshot, group_snap_set() is expected to return
an EEXIST error, and that is now handled.
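
The cleanup-on-failure pattern from the Solution section can be sketched as
follows. This is a toy model, not GroupCreatePrimaryRequest: FakeGroupStore and
create_primary_snapshot_sketch() are hypothetical, the quiesce result is passed
in as a parameter, and erasing the entry stands in for remove_snap_metadata().

```cpp
#include <cerrno>
#include <set>
#include <string>

// Hypothetical in-memory store of group snapshot ids.
struct FakeGroupStore {
  std::set<std::string> snaps;
};

// Record the snapshot first (it starts out INCOMPLETE), then quiesce.
// If quiesce fails, remove the recorded entry before returning the error so
// that no INCOMPLETE snapshot is left behind.
int create_primary_snapshot_sketch(FakeGroupStore& store,
                                   const std::string& snap_id,
                                   int quiesce_result) {
  store.snaps.insert(snap_id);
  if (quiesce_result < 0) {
    store.snaps.erase(snap_id);  // stand-in for remove_snap_metadata()
    return quiesce_result;
  }
  return 0;
}
```

With a quiesce result of -ETIMEDOUT (the r=-110 seen in the log snippet on
Linux), the call returns the error and the store ends up empty.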

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
3 months ago Merge pull request #65958 from VinayBhaskar-V/wip-sync-demote
Ilya Dryomov [Thu, 20 Nov 2025 15:39:48 +0000 (16:39 +0100)]
Merge pull request #65958 from VinayBhaskar-V/wip-sync-demote

rbd-mirror: allow incomplete group demote snapshot to sync after rbd-mirror daemon restart

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>