David Galloway [Fri, 24 Jun 2022 16:27:43 +0000 (12:27 -0400)]
.github: Add labels while PR is open
I think https://github.com/tibdex/backport will only create backport PRs if our doc/releases PRs are labelled *and then* closed. This action currently labels after the PR is closed.
Signed-off-by: David Galloway <dgallowa@redhat.com>
Ronen Friedman [Mon, 20 Jun 2022 12:47:57 +0000 (12:47 +0000)]
scrub/osd: disable blocked-scrub warnings during some tests
As some Teuthology tests seem to block objects for long minutes,
we must not issue the "scrub is blocked for too long" warning
(that warning causes the tests to fail).
A new configuration parameter now controls the grace period before
the warning is issued. Some tests were modified to set this
configuration parameter to a large value.
Kefu Chai [Tue, 21 Jun 2022 15:28:23 +0000 (23:28 +0800)]
install-deps.sh: do not install libpmem from chacra
this change reverts 17d2bc3707bb0078e2fa1b4eef31b39804e45135, before
we recreate a chacra repo hosting libpmem packages, we are not able
to query the repo from shaman or pull the dependencies from chacra.
in future, we should be able to get the libpmem dependencies from
offical ubuntu package repo and fedora, CentOS Stream and RHEL repos.
Zac Dover [Tue, 21 Jun 2022 14:09:05 +0000 (00:09 +1000)]
doc/dev: add context note to dev guide config
This PR adds a note directing first-time cloners of
their Ceph git forks to make sure to cd into the ceph/
directory before trying to run the "git config" commands.
Ilya Dryomov [Sun, 19 Jun 2022 10:12:01 +0000 (12:12 +0200)]
mgr/rbd_support: always rescan image mirror snapshots on refresh
Establishing a watch on rbd_mirroring object and skipping rescanning
image mirror snapshots on periodic refresh unless rbd_mirroring object
gets notified in the interim is flawed. rbd_mirroring object is
notified when mirroring is enabled or disabled on some image (including
when the image is removed), but it is not notified when images are
promoted or demoted. However, load_pool_images() discards images that
are not primary at the time of the scan. If the image is promoted
later, no snapshots are created even if the schedule is in place. This
happens regardless of whether the schedule is added before or after the
promotion.
This effectively reverts commit 69259c8d3722 ("mgr/rbd_support: make
mirror_snapshot_schedule rescan only updated pools"). An alternative
fix could be to stop discarding non-primary images (i.e. drop
if not info['primary']:
continue
check added in commit d39eb283c5ce ("mgr/rbd_support: mirror snapshot
schedule should skip non-primary images")), but that would clutter the
queue and therefore "rbd mirror snapshot schedule status" output with
bogus entries. Performing a rescan roughly every 60 seconds should be
manageable: currently it amounts to a single mirror_image_status_list
request, followed by mirror_image_get, get_snapcontext and snapshot_get
requests for each snapshot-based mirroring enabled image and concluded
by a single dir_list request. Among these, per-image get_snapcontext
and snapshot_get requests are necessary for determining primaryness.
Ilya Dryomov [Sat, 18 Jun 2022 13:25:49 +0000 (15:25 +0200)]
rbd-mirror: spell out "remote image is not primary" status correctly
There is a difference: non-primary means NON_PRIMARY promotion state,
while "not primary" can refer to any of NON_PRIMARY, ORPHAN or UNKNOWN
promotion states.
Ilya Dryomov [Sat, 18 Jun 2022 11:00:34 +0000 (13:00 +0200)]
rbd-mirror: fix up PrepareReplayDisconnected test case
It was botched in commit 2bca9ee96c65 ("rbd-mirror: consolidate
prepare local/remote image steps to bootstrap") and went unnoticed
because currently no special handling is needed for disconnected
clients -- is_disconnected() check happens to be the last step
and it doesn't generate an error.
Ilya Dryomov [Mon, 20 Jun 2022 12:19:41 +0000 (14:19 +0200)]
rbd-mirror: generally skip replay/resync if remote image is not primary
Replay and resync should generally be skipped if the remote image is
not primary.
If this is not done for replay, snapshot-based mirroring can run into
a livelock if the primary image is demoted while a mirror snapshot is
being synced. On the demote site, rbd-mirror would pick up the just
demoted image, grab the exclusive lock on it and idle waiting for a new
mirror snapshot to be created. On the (still) non-primary site,
rbd-mirror would eventually finish syncing that mirror snapshot and
attempt to unlink from it on the demote site. These attempts would
fail with EROFS due to exclusive lock being held in the "refuse proxied
maintenance operations" mode, blocking forward progress (syncing of the
demotion snapshot so that the non-primary image can be orderly promoted
to primary, etc).
If this is not done for resync, data loss can ensue as the just demoted
image would be immediately trashed, underneath the non-primary site that
is still syncing.
Currently this is done in PrepareReplayRequest only for journal-based
mirroring. Note that it is conditional: if the local image is linked
to the remote image, proceeding is desirable.
Generalize this check, consolidate it with a related check in
PrepareRemoteImageRequest and move the result to BootstrapRequest to
cover both "local image does not exist" and "local image is unlinked"
cases for both modes.
Ilya Dryomov [Sat, 18 Jun 2022 10:35:51 +0000 (12:35 +0200)]
rbd-mirror: strengthen is_local_primary() and is_linked()
Initialize local_promotion_state and remote_promotion_state to UNKNOWN
instead of counterintuitive PRIMARY and NON_PRIMARY -- half the time the
final values are flipped. Then is_local_primary() and is_linked() can
be strengthened as a non-existent image should stay in UNKNOWN.
Ilya Dryomov [Fri, 17 Jun 2022 12:03:20 +0000 (14:03 +0200)]
mgr/rbd_support: avoid losing a schedule on load vs add race
If load_schedules() (i.e. periodic refresh) races with add_schedule()
invoked by the user for a fresh image, that image's schedule may get
lost until the next rebuild (not refresh!) of the queue:
1. periodic refresh invokes load_schedules()
2. load_schedules() creates a new Schedules instance and loads
schedules from rbd_mirror_snapshot_schedule object
3. add_schedule() is invoked for a new image (an image that isn't
present in self.images) by the user
4. before load_schedules() can grab self.lock, add_schedule() commits
the new schedule to rbd_mirror_snapshot_schedule object and adds it
to self.schedules
5. load_schedules() grabs self.lock and reassigns self.schedules with
Schedules instance that is now stale
6. periodic refresh invokes load_pool_images() which discovers the new
image; eventually it is added to self.images
7. periodic refresh invokes refresh_queue() which attempts to enqueue()
the new image; this fails because a matching schedule isn't present
The next periodic refresh recovers the discarded schedule from
rbd_mirror_snapshot_schedule object but no attempt to enqueue() that
image is made since it is already "known" at that point. Despite the
schedule being in place, no snapshots are created until the queue is
rebuilt from scratch or rbd_support module is reloaded.
To fix that, extend self.lock critical sections so that add_schedule()
and remove_schedule() can't get stepped on by load_schedules().
Ilya Dryomov [Fri, 17 Jun 2022 08:28:55 +0000 (10:28 +0200)]
mgr/rbd_support: refresh schedule queue immediately after delay elapses
The existing logic often leads to refresh_pools() and refresh_images()
being invoked after a 120 second delay instead of after an intended 60
second delay.
Nizamudeen A [Mon, 6 Jun 2022 05:51:29 +0000 (11:21 +0530)]
mgr/dashboard: configure rbd mirroring
One-click button in the case of an orch cluster for configuring the
rbd-mirroring when its not properly setup. This button will create an
rbd-mirror service and also an rbd labelled pool(replicated: size-3) (if they are not
existing)
Fixes: https://tracker.ceph.com/issues/55646 Signed-off-by: Nizamudeen A <nia@redhat.com>
Ronen Friedman [Sat, 11 Jun 2022 07:27:47 +0000 (07:27 +0000)]
scrub/osd: add clearer reminders that a scrub is blocked
Whenever a scrub session is waiting for an excessive length
of time for a locked object to be unlocked, the total
number of concurrent scrubs in the system is reduced.
The existing cluster warning issued on such occurrences is
easily overlooked. Here we add a constant reminder each time
the OSD tries to schedule scrubs.
Adam C. Emerson [Wed, 15 Jun 2022 23:39:59 +0000 (19:39 -0400)]
rgw: radosgw-admin includes current time in most status commands
Support folk have asked if we can have a timestamp on the output of
multisite status commands so they can see at a glance how they relate
to other events and changes.
As such, we now have a status command added to any outputs where it
doesn't disrupt things. In practice this means anything whose output
isn't a single JSON array.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>