librbd: fix incomplete group snapshot not being removed on creation failure
Problem:
GroupCreatePrimaryRequest doesn't remove group snapshot when group
snapshot creation encounters an error in notify_quiesce(). As a result,
INCOMPLETE snapshots from previous failed attempts remain uncleaned.
Log snippet:
librbd::watcher::Notifier: 0x7fbdac0168b0 handle_notify: r=-110
librbd::mirror::snapshot::GroupCreatePrimaryRequest: handle_notify_quiesce: r=-110
librbd::mirror::snapshot::GroupCreatePrimaryRequest: notify_unquiesce:
librbd::watcher::Notifier: 0x7fbda83c59a0 handle_notify: r=-110
librbd::mirror::snapshot::GroupCreatePrimaryRequest: handle_notify_unquiesce: r=-110
librbd::mirror::snapshot::GroupCreatePrimaryRequest: handle_notify_unquiesce: failed to notify the unquiesce requests: (110) Connection timed out
librbd::mirror::snapshot::GroupCreatePrimaryRequest: close_images:
librbd::mirror::snapshot::GroupCreatePrimaryRequest: handle_close_images: r=0
librbd::mirror::snapshot::GroupCreatePrimaryRequest: finish: r=-110
When snapshot creation fails, the remove snap path that cleans the snapshot is
skipped, leaving behind INCOMPLETE snapshot entries.
Solution:
Ensure remove_snap_metadata() is executed on failed to quience scenario like
above, allowing INCOMPLETE snapshot to be consistently cleaned up.
Note:
Another issue identified and fixed around GroupUnlinkPeerRequest::remove_peer_uuid(),
i.e in case of INCOMPLETE snapshot, group_snap_set() is expected to return
EEXIST error, and that is now handled.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Resolves: rhbz#
2415401