Xuehan Xu [Sun, 7 Feb 2021 04:40:36 +0000 (12:40 +0800)]
mgr: relax osd ok-to-stop condition on degraded pgs
Right now, the "ok-to-stop" condition is relatively rigorous: it allows
stopping an osd only if no PG on it is non-active or degraded. But there
are situations in which an OSD is part of a degraded pg and the pg
still has > min_size complete replicas after the OSD is stopped.
In 9750061d5d4236aaba156d60790e0b8bcd7cfb64, we changed from considering
just acting to using avail_no_missing (OSDs that have no missing objects).
When the projected pg_acting is constructed this way, we can safely compare
to min_size... even for a PG marked degraded.
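
The relaxed check can be illustrated with a minimal sketch. This is not the mgr code: `PGInfo`, its fields, and `ok_to_stop` are hypothetical names, and treating "at least min_size complete replicas remain" as the safe threshold is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass
class PGInfo:
    """Hypothetical, simplified view of a placement group."""
    pgid: str
    avail_no_missing: Set[int]  # OSD ids that have no missing objects
    min_size: int


def ok_to_stop(pgs: Iterable[PGInfo], osds_to_stop: Set[int]) -> bool:
    """Return True if every PG keeps enough complete replicas.

    The projected acting set is built from avail_no_missing, so a PG
    that is merely marked degraded does not block the stop by itself.
    """
    for pg in pgs:
        projected = pg.avail_no_missing - osds_to_stop
        # Assumed threshold: at least min_size complete replicas remain.
        if len(projected) < pg.min_size:
            return False
    return True
```

With two PGs whose complete replicas are {0, 1, 2} and {1, 2} and min_size 2, stopping osd 0 is allowed in this sketch, while stopping osd 1 is not.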
Jason Dillaman [Fri, 11 Dec 2020 00:31:45 +0000 (19:31 -0500)]
rbd-mirror: validate that remote start snapshot still exists
Perform a basic sanity check to verify that the remote start snapshot
still exists. This was previously being deleted as part of the unlink
process due to a race condition between the remote side completing
a sync between snapshots 1 and 2 and snapshot 2 being unlinked due
to reaching max snapshots.
Jason Dillaman [Thu, 10 Dec 2020 22:32:16 +0000 (17:32 -0500)]
librbd/mirror: tweak which snapshot is unlinked when at capacity
The rbd-mirror daemon will attempt to sync from the last synced
snapshot to the next mirror snapshot. When the limit is at 3, this
currently can result in a situation where an in-use sync snapshot is
deleted. Instead of unlinking the second oldest snapshot, always
unlink the third oldest.
Fixes: https://tracker.ceph.com/issues/48553
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit a888bff8d00e3e496ec80e4273e01a47b67da5dc)
Jason Dillaman [Thu, 10 Dec 2020 21:13:23 +0000 (16:13 -0500)]
librbd/mirror: increase debug logging of snapshot state machines
Try to keep debug level 20 for IO state machines so that setting the
debug level to something lower should show the manipulation of
the mirror snapshots.
Jason Dillaman [Thu, 10 Dec 2020 04:17:24 +0000 (23:17 -0500)]
rbd-mirror: do not attempt to unlink from more recent snapshots
The snapshot-based mirroring replayer should only attempt to unlink
from any snapshots that are older than the end remote snapshot id to
prevent the remote side from incorrectly deleting the snapshot.
Fixes: https://tracker.ceph.com/issues/48527
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 78f8abce2d90d7c9bcf7b4bd4d805c3fe0b39b03)
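
The restriction described in this commit can be sketched as a simple filter. The function name and the assumption that snapshot ids increase with age ordering (a larger id means a more recent snapshot) are illustrative only, not taken from the replayer code.

```python
from typing import List


def unlink_candidates(remote_snap_ids: List[int],
                      end_remote_snap_id: int) -> List[int]:
    """Keep only snapshots strictly older than the end remote snapshot.

    Assumes ids grow over time, so "older" means "smaller id".
    """
    return [s for s in remote_snap_ids if s < end_remote_snap_id]
```

For example, with remote snapshots 3, 5, 8 and 9 and an end remote snapshot id of 8, only 3 and 5 would be considered for unlinking.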
Jason Dillaman [Thu, 10 Dec 2020 03:30:17 +0000 (22:30 -0500)]
librbd/mirror: unlink peer might recursively loop
If the mirror peer set is (incorrectly) empty, it's not currently
possible for the unlink peer state machine to properly delete the
snapshot. This can result in a recursive loop between the create
primary snapshot state machine and the unlink peer state machine
until the stack depth grows too large.
Fixes: https://tracker.ceph.com/issues/48525
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 18a45503011a572325e09b56d5ab799a15ee83d4)
If the requested write length does not match the provided bufferlist
length, disable the move optimization and instead fall back to creating
a new sub-bufferlist for the object request.
Fixes: https://tracker.ceph.com/issues/49173
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 8dbb4a3d971d9a48c171f161f531956dd0030403)
Kefu Chai [Sat, 6 Mar 2021 16:32:42 +0000 (00:32 +0800)]
.github: correct the regex in milestone workflow
also use pull_request_target event so the action is run in the
context of the base of the pull request. this helps us to overcome
the "Resource not accessible by integration" issue where the action
is run in the context of the pull request.
* refs/pull/39906/head:
mgr/volumes: Bump up AuthMetadataManager's version
pybind/ceph_volume_client: Bump up the version and compat_version to 6
pybind/ceph_volume_client: Fix auth-metadata file recovery
pybind/ceph_volume_client: Update the 'volumes' key to 'subvolumes' in auth metadata file
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kotresh HR [Fri, 19 Feb 2021 11:27:23 +0000 (16:57 +0530)]
mgr/volumes: Bump up AuthMetadataManager's version
With ceph_volume_client and mgr-volumes co-existing
for some time, the versions of both need to be the same.
The ceph_volume_client version <=5 can't decode the
'subvolumes' key in the auth-metadata file. Hence, to
handle the version incompatibility, the version of
ceph_volume_client is bumped up to 6 and the same
needs to be done in mgr-volume's AuthMetadataManager.
Kotresh HR [Fri, 19 Feb 2021 11:12:33 +0000 (16:42 +0530)]
pybind/ceph_volume_client: Bump up the version and compat_version to 6
With 'volumes' key updated to 'subvolumes', the version of
ceph_volume_client <= 5 can't decode auth-metadata file. Hence
bumping up ceph_volume_client version and compat_version to 6.
Kotresh HR [Mon, 15 Feb 2021 16:26:51 +0000 (21:56 +0530)]
pybind/ceph_volume_client: Update the 'volumes' key to 'subvolumes' in auth metadata file
The older auth metadata files from before the nautilus release store
the authorized subvolumes using the 'volumes' key. As the
notion of 'subvolumes' was brought in by mgr/volumes, it makes
sense to use the 'subvolumes' key. This patch transparently
updates the 'volumes' key to 'subvolumes', and newer auth metadata
files store them with the 'subvolumes' key.
Also fails the deauthorize if the auth-id doesn't exist.
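
A minimal sketch of such a transparent key update follows. The helper name and the sample metadata layout (including the 'access_level' field and the subvolume path) are assumptions for illustration and do not reproduce the ceph_volume_client code.

```python
import copy


def upgrade_auth_meta(auth_meta: dict) -> dict:
    """Return a copy of the metadata with 'volumes' renamed to 'subvolumes'.

    Metadata that already uses 'subvolumes' is returned unchanged.
    """
    upgraded = copy.deepcopy(auth_meta)
    if 'volumes' in upgraded and 'subvolumes' not in upgraded:
        upgraded['subvolumes'] = upgraded.pop('volumes')
    return upgraded


# Illustrative old-style metadata; the contents are made up.
old_meta = {'version': 5, 'volumes': {'grp/sv1': {'access_level': 'rw'}}}
new_meta = upgrade_auth_meta(old_meta)
```

Working on a copy keeps the on-disk representation untouched until the caller decides to write the upgraded metadata back.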
Gerald Yang [Wed, 3 Mar 2021 04:37:15 +0000 (04:37 +0000)]
common: reset last_log_sent when clog_to_monitors is updated
When clog_to_monitors is disabled, "last_log" still keeps increasing via
get_next_seq() if the OSD writes info to clog.
But "last_log_sent" doesn't increase. If we disable clog_to_monitors for
a while and then re-enable it, num_unsent could be bigger than
log_queue_size(), which triggers an assertion in _get_mon_log_message.
We need to reset last_log_sent to last_log before updating clog_to_monitors.
Signed-off-by: Gerald Yang <gerald.yang@canonical.com>
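
The counter drift described in this commit can be reproduced with a toy model. `ClogSketch`, its `reset` switch, and `replay` are hypothetical; only the counter names last_log and last_log_sent come from the commit message, and the invariant checked here (num_unsent must not exceed the queue length) is a simplified stand-in for the real assertion.

```python
class ClogSketch:
    """Toy model of the sequence counters described above (not Ceph code)."""

    def __init__(self) -> None:
        self.log_to_monitors = True
        self.last_log = 0         # sequence number of the newest entry
        self.last_log_sent = 0    # sequence number of the newest entry sent
        self.log_queue: list = []  # entries waiting to go to the monitors

    def log(self, msg: str) -> None:
        self.last_log += 1  # stands in for get_next_seq()
        if self.log_to_monitors:
            self.log_queue.append((self.last_log, msg))

    def num_unsent(self) -> int:
        return self.last_log - self.last_log_sent

    def set_log_to_monitors(self, enabled: bool, reset: bool = True) -> None:
        if reset:
            # The fix: forget entries that were never queued.
            self.last_log_sent = self.last_log
        self.log_to_monitors = enabled

    def invariant_holds(self) -> bool:
        return self.num_unsent() <= len(self.log_queue)


def replay(reset: bool) -> bool:
    """Disable, log three entries, re-enable, log once, check the invariant."""
    c = ClogSketch()
    c.set_log_to_monitors(False, reset=reset)
    for i in range(3):
        c.log(f"msg {i}")
    c.set_log_to_monitors(True, reset=reset)
    c.log("after re-enable")
    return c.invariant_holds()
```

Without the reset, num_unsent counts the three entries that were never queued, so the invariant fails; with the reset it holds.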
Mykola Golub [Mon, 22 Feb 2021 16:22:54 +0000 (16:22 +0000)]
rbd-mirror: reset update_status_task pointer in timer thread
To avoid a time window when m_update_status_task is invalid. If
cancel_update_mirror_image_replay_status is called during this
time, it may cancel another ImageReplayer's task, if that replayer
happened to add a task with the same address.
Mykola Golub [Mon, 22 Feb 2021 12:53:38 +0000 (12:53 +0000)]
test/librbd: extend TestLibRBD.RenameViaLockOwner
To cover the following case:
- Client A has the image opened but does not own the lock.
- Client B renames the image (client A is not aware of it).
- Client A becomes the lock owner.
- Client B requests rename, which is proxied to the client A.
Conflicts:
src/librbd/Operations.cc
(request_id (async notification) is not used for "rename" op in octopus
-- added in pacific for "serialize maintenance operations by type")
Kotresh HR [Fri, 5 Feb 2021 18:05:22 +0000 (23:35 +0530)]
qa: Fix a few mgr/volume test cases
Recovering dirty auth metadata file might not retain the order,
fixed the comparison in 'test_recover_auth_metadata_during_authorize'
and 'test_recover_auth_metadata_during_deauthorize'.
Kotresh HR [Sat, 23 Jan 2021 17:03:32 +0000 (22:33 +0530)]
ceph_volume_client: Fix failure of test_idempotency
With the test environment, 'args must be encodeable
as a bytearray' error is seen for 'ceph_mds_command'.
Hence removed tuple and passed the JSON formatted string.
Kotresh HR [Tue, 5 Jan 2021 12:55:54 +0000 (18:25 +0530)]
mgr/volumes: Update the 'volumes' key to 'subvolumes' in auth metadata file
The older auth metadata files created by CephVolumeClient store the
authorized subvolumes using the 'volumes' key, whereas the notion of
'subvolumes' was brought in by mgr/volumes. Hence, this is transparently
updated to 'subvolumes' and newer auth metadata files store them
with the 'subvolumes' key.
Also fails the deauthorize if the auth-id doesn't exist.
Optionally allow authorizing auth-ids not created by mgr plugin
via the option 'allow_existing_id'. This can help existing deployers
of manila to disallow/allow authorization of pre-created auth IDs
via a manila driver config that sets 'allow_existing_id' to False/True.
Kotresh HR [Tue, 15 Dec 2020 12:01:54 +0000 (17:31 +0530)]
mgr/volumes: Preserve existing caps while authorize/deauthorize auth-id
Authorize/Deauthorize used to overwrite the caps of the auth-id, which would
end up deleting existing caps. This patch fixes that by retaining
the existing caps and appending or deleting the new caps as needed.
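
The append-or-delete behaviour can be sketched as below. The helper names are hypothetical, and handling a cap string as a comma-separated list of grants is an assumption made for the example rather than the mgr/volumes implementation.

```python
def _split_caps(caps: str) -> list:
    """Split a comma-separated cap string into trimmed grants."""
    return [c.strip() for c in caps.split(',') if c.strip()]


def add_cap(existing: str, new_cap: str) -> str:
    """Append new_cap unless it is already present; keep everything else."""
    caps = _split_caps(existing)
    if new_cap not in caps:
        caps.append(new_cap)
    return ', '.join(caps)


def remove_cap(existing: str, cap: str) -> str:
    """Delete only the given cap; keep everything else."""
    return ', '.join(c for c in _split_caps(existing) if c != cap)
```

Authorizing twice with the same cap leaves the string unchanged in this sketch, and deauthorizing removes only the grant that was added.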
Kotresh HR [Mon, 4 Jan 2021 13:04:54 +0000 (18:34 +0530)]
mgr/volumes: Disallow authorize existing auth_id
This patch disallows the mgr plugin from authorizing an auth_id
that was not created via the mgr plugin. Such auth_ids could be
created by other means for other use cases and should not be modified
via the mgr plugin.
Kotresh HR [Wed, 18 Nov 2020 10:13:25 +0000 (15:43 +0530)]
mgr/volumes: Persist auth and subvolume metadata
1. Subvolume create and delete operations create and delete subvolume
metadata file respectively.
2. Subvolume authorize creates the auth meta file and persists the
required metadata on subvolume metadata file and auth metadata file
on disk. Subvolume deauthorize clears the required metadata on
both metadata files.