Pere Diaz Bou [Fri, 13 May 2022 15:15:33 +0000 (17:15 +0200)]
mgr/dashboard: snapshot mirroring from dashboard
Enable snapshot mirroring from the Pools -> Image page.
Also show the mirror snapshot in images where snapshot mirroring is enabled.
When parsing images, if an image has snapshot mirror mode enabled, the
dashboard tries to run commands that don't work with that mode. The solution
for now is to skip those commands and append the mirror mode to the get call.
Fixes: https://tracker.ceph.com/issues/55648
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
Signed-off-by: Nizamudeen A <nia@redhat.com>
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
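A minimal Python sketch of the workaround described above (names are
hypothetical and do not match the actual mgr/dashboard backend code):
journal-only calls are skipped for snapshot-mode images and only the mirror
mode is reported in the get call.

    # Hypothetical sketch; the real code lives in the mgr/dashboard backend.
    def describe_image(name, mirror_mode, journal_status_cmd):
        """Build the image dict returned by the 'get' call."""
        info = {'name': name, 'mirror_mode': mirror_mode}
        if mirror_mode != 'snapshot':
            # Journal-only commands fail for snapshot-mode images, so they
            # are skipped for now and only the mode is appended.
            info['mirror_status'] = journal_status_cmd(name)
        return info

    print(describe_image('img1', 'snapshot', lambda n: 'ok'))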
Kefu Chai [Sun, 5 Jun 2022 10:30:28 +0000 (18:30 +0800)]
crimson/osd: reset logger before exit
* extract the code to set logging fstream into a dedicated function
* do not reset logging until the end of the seastar application.
before this change, `reset_logger` is created in the
`if (auto log_file = local_conf()->log_file; !log_file.empty())` branch,
so its lifetime ends when the `if` block ends. in other words, logging
falls back to `cerr` as soon as the `if` block ends. this is not the
expected behavior.
after this change, `reset_logger` is created outside the `if` block,
so we won't reset the logger back to `cerr` until the lambda passed to
`seastar::async()` exits.
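The fix above is purely about object lifetime; a rough Python analogue of
the same pattern (illustrative only, not the crimson code) keeps the
restore guard alive for the whole application body instead of only inside
the `if` branch:

    import contextlib, sys

    LOG = sys.stderr  # default log sink

    def run_app(log_file, body):
        global LOG
        # The guard that restores the default sink lives for the whole body,
        # not just the `if` branch -- mirroring the crimson/osd fix.
        with contextlib.ExitStack() as reset_logger:
            if log_file:
                LOG = reset_logger.enter_context(open(log_file, 'a'))
                reset_logger.callback(lambda: globals().update(LOG=sys.stderr))
            body()  # everything logged via LOG goes to the file
        # LOG falls back to the default sink only here, after body() returns.

    run_app(None, lambda: print("hello", file=LOG))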
qa/suites/rbd: place cache file on tmpfs for xfstests
The RWL mode needs DAX and is dog slow otherwise -- the qemu_xfstests.yaml
job always hits the 6-hour max_job_time limit.
As our tmpfs instance is limited and qemu_xfstests.yaml opens three
images at the same time, reduce the "big cache" size to 5G. This facet
was added to iron out 32-bit head/tail pointer issues and 5G still does
the job there.
Going through the loop device is needed because tmpfs doesn't support
O_DIRECT.
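A quick way to see that limitation (illustrative probe; it assumes /dev/shm
is a tmpfs mount on a Linux host):

    import os

    # tmpfs typically rejects O_DIRECT with EINVAL, which is why the cache
    # file has to be reached through a loop device instead of directly.
    path = '/dev/shm/o_direct_probe'
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_DIRECT)
        os.close(fd)
        print('O_DIRECT open succeeded')
    except OSError as e:
        print('O_DIRECT open failed:', e)
    finally:
        if os.path.exists(path):
            os.unlink(path)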
qa/tasks/rgw_multisite.py uses 'zonegroup set' to create zonegroups from
their json format. this doesn't enable any of the supported zonegroup
features by default, so this adds the 'enabled_features' field to the
json representations
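A minimal sketch of the idea (the 'resharding' feature name is only an
example; the real list comes from the suite configuration):

    import json

    def add_enabled_features(zonegroup_json, features):
        """Inject 'enabled_features' into a zonegroup JSON blob before
        feeding it back to 'zonegroup set'."""
        zg = json.loads(zonegroup_json)
        zg['enabled_features'] = list(features)
        return json.dumps(zg)

    print(add_enabled_features('{"name": "zg1"}', ['resharding']))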
Jeff Layton [Wed, 1 Jun 2022 17:57:29 +0000 (13:57 -0400)]
qa: fix .teuthology_branch file in qa/
According to teuthology-suite:
-t <branch>, --teuthology-branch <branch>
The teuthology branch to run against.
Default value is determined in the next order.
There is TEUTH_BRANCH environment variable set.
There is `qa/.teuthology_branch` present in
the suite repo and contains non-empty string.
There is `teuthology_branch` present in one of
the user or system `teuthology.yaml` configuration
files respectively, otherwise use `main`.
The .teuthology_branch file in the qa/ dir currently points at "master".
Change it to point to "main".
cephadm: fix osd adoption with custom cluster name
When adopting Ceph OSD containers from a Ceph cluster with a custom name,
adoption fails because the custom name isn't propagated into unit.run.
The idea here is to change the lvm metadata and enforce 'ceph.cluster_name=ceph'
given that cephadm doesn't support custom names anyway.
Fixes: https://tracker.ceph.com/issues/55654
Signed-off-by: Adam King <adking@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
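A hedged sketch of the LVM metadata change (the LV path and helper name
are illustrative; --deltag/--addtag are standard lvchange options):

    import subprocess

    def force_default_cluster_name(lv_path, old_name):
        """Replace a custom ceph.cluster_name tag with 'ceph' on an OSD LV,
        since cephadm doesn't support custom cluster names anyway."""
        subprocess.run(['lvchange', '--deltag',
                        'ceph.cluster_name=%s' % old_name, lv_path], check=True)
        subprocess.run(['lvchange', '--addtag',
                        'ceph.cluster_name=ceph', lv_path], check=True)

    # Example (hypothetical LV path):
    # force_default_cluster_name('/dev/ceph-vg/osd-block-0', 'mycluster')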
Redouane Kachach [Tue, 31 May 2022 10:59:26 +0000 (12:59 +0200)]
mgr/cephadm: capture exception when not able to list upgrade tags
Fixes: https://tracker.ceph.com/issues/55801
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
Ronen Friedman [Tue, 31 May 2022 07:14:06 +0000 (07:14 +0000)]
osd/scrub: do not start scrubbing if the PG is snap-trimming
Both 'snap-trim' and 'snaptrim-wait' PG states now prevent
scrub from starting.
Background:
A PG should not be scrubbed and trimmed concurrently. Unlike
write operations, snap trimming does not verify that a targeted
object is not in the executing scrub's chunk.
The trimmer has always checked for active scrubs before starting; the
scrubber did not check for ongoing snap trimming. This PR fixes that omission.
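Illustrative-only sketch of the new precondition (not the actual C++
scrubber code; state names follow the commit text above and may differ
slightly from the real PG state strings):

    def ok_to_start_scrub(pg_states):
        # A PG that is snap-trimming (or queued for it) must not start a
        # scrub, mirroring the check the trimmer already does for scrubs.
        blocked_by = {'snap-trim', 'snaptrim-wait'}
        return not (blocked_by & set(pg_states))

    print(ok_to_start_scrub({'active', 'clean'}))           # True
    print(ok_to_start_scrub({'active', 'snaptrim-wait'}))   # False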
Ilya Dryomov [Sun, 29 May 2022 16:20:34 +0000 (18:20 +0200)]
librbd: unlink newest mirror snapshot when at capacity, bump capacity
CreatePrimaryRequest::unlink_peer() invoked via "rbd mirror image
snapshot" command or via rbd_support mgr module when creating a new
scheduled mirror snapshot at rbd_mirroring_max_mirroring_snapshots
capacity on the primary cluster can race with Replayer::unlink_peer()
invoked by rbd-mirror when finishing syncing an older snapshot on the
secondary cluster. Consider the following:
0. rbd-mirror is syncing snap1..snap2 delta
1. rbd_support creates primary-snap4
2. due to rbd_mirroring_max_mirroring_snapshots == 3, rbd_support picks
primary-snap3 for unlinking
3. rbd-mirror finishes syncing snap1..snap2 delta and marks
non-primary-snap2 complete
[ snap1 (the old base) is no longer needed on either cluster ]
4. rbd-mirror unlinks and removes primary-snap1
5. rbd-mirror removes non-primary-snap1
6. rbd-mirror picks snap2 as the new base
7. rbd-mirror creates non-primary-snap3 and starts syncing snap2..snap3
delta
8. rbd_support unlinks and removes primary-snap3 which is in-use by
rbd-mirror
If snap trimming on the primary cluster kicks in soon enough, the
secondary image becomes corrupted: rbd-mirror would eventually finish
"syncing" non-primary-snap3 and mark it complete in spite of bogus data
in the HEAD -- the primary cluster OSDs would start returning ENOENT
for snap trimmed objects. Luckily, rbd-mirror's attempt to pick snap3
as the new base would wedge the replayer with "split-brain detected:
failed to find matching non-primary snapshot in remote image" error.
Before commit a888bff8d00e ("librbd/mirror: tweak which snapshot is
unlinked when at capacity") this could happen pretty much all the time
as it was the second oldest snapshot that was unlinked. This commit
changed it to be the third oldest snapshot, turning this into a more
narrow but still very much possible to hit race.
Unfortunately this race condition appears to be inherent to the way
snapshot-based mirroring is currently implemented:
a. when mirror snapshots are created on the producer side of the
snapshot queue, they are already linked
b. mirror snapshots can be concurrently unlinked/removed on both
sides of the snapshot queue by non-cooperating clients (local
rbd_mirror_image_create_snapshot() vs remote rbd-mirror)
c. with mirror peer links off the list due to (a), there is no
existing way for rbd-mirror to persistently mark a snapshot as
in-use
As a workaround, bump rbd_mirroring_max_mirroring_snapshots to 5 and
always unlink the newest snapshot (i.e. slot 4) instead of the third
oldest snapshot (i.e. slot 2). Hopefully this gives enough leeway,
as rbd-mirror would need to sync two snapshots (i.e. transition from
syncing 0-1 to 1-2 and then to 2-3) before potentially colliding with
rbd_mirror_image_create_snapshot() on slot 4.
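A simplified Python sketch of the slot-selection change (illustrative only;
the real logic is in CreatePrimaryRequest::unlink_peer() in C++):

    def snapshot_to_unlink(mirror_snapshots, max_snapshots=5):
        """When at capacity, unlink the newest mirror snapshot (last slot)
        instead of the third oldest, leaving the older snapshots that
        rbd-mirror may still be syncing against untouched."""
        if len(mirror_snapshots) < max_snapshots:
            return None                    # not at capacity yet
        return mirror_snapshots[-1]        # newest, i.e. slot max_snapshots - 1

    snaps = ['primary-snap%d' % i for i in range(5)]
    print(snapshot_to_unlink(snaps))       # primary-snap4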
Ilya Dryomov [Sun, 29 May 2022 17:55:04 +0000 (19:55 +0200)]
test/librbd: fix set_val() call in SuccessUnlink* test cases
rbd_mirroring_max_mirroring_snapshots isn't actually set to 3 there
due to the stray conf_ prefix. It didn't matter until now because the
default was also 3.
Kefu Chai [Sat, 28 May 2022 09:03:34 +0000 (17:03 +0800)]
debian: add .requires for specifying python3 deps
we use dh_python3 to define the ${python3:Depends} subvar as a part
of the runtime dependencies of python3 packages, like the
ceph-mgr modules named "ceph-mgr-*" and the python3 bindings named "python3-*".
but unlike python3 bindings of Ceph APIs, the ceph-mgr modules are
not packaged in a typical python way. in other words, they do not
ship a "dist-info" or an "egg-info" directory. instead, we just
install the python scripts into a directory which can be found by
ceph-mgr, by default it is /usr/share/ceph/mgr/dashboard/plugins.
this does not follow the conventions of python packaging or the
debian packaging policies related to python packages. but it
still makes sense to put these files in this unconventional place, as
they are not supposed to be python packages consumed by the
outside world -- they are merely plugins, and should always work
with the same version of ceph-mgr.
the problem is, even though we have ${python3:Depends} in
the "Depends" field of packages like ceph-mgr-dashboard, dh_python3
is not able to figure out the dependencies by looking at the
installed files. for instance, looking at the resulting "Depends" of
ceph-mgr-dashboard, apparently none of the subvars is materialized into
a non-empty string.
to improve the packaging, in this change:
* drop all subvars from ceph-mgr-*, as they
are all implemented in pure python.
* add debian/ceph-mgr-*.requires; its content
is replicated from the corresponding requirements.txt
files.
* add python3-distutils for distutils, as debian
and its derivatives package the non-essential parts of
distutils into a separate package, see
https://packages.debian.org/stable/python3-distutils
* add ${python3:Depends} so dh_python3
can extract the deps from debian/ceph-mgr-*.pydist
* update the rule for "override_dh_python3" target,
so dh_python3 can pick up the dependencies specified
in .requires file.
* remove the python3 dependencies not used by
ceph-mgr from ceph-mgr's "Depends"
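For illustration, a debian/ceph-mgr-<module>.requires file simply lists
requirement specifiers mirrored from that module's requirements.txt, one
per line, e.g. (hypothetical sample content, not the actual file):

    CherryPy
    more-itertools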
Redouane Kachach [Tue, 31 May 2022 10:11:03 +0000 (12:11 +0200)]
mgr/cephadm: check if a service exists before trying to restart it
Fixes: https://tracker.ceph.com/issues/55800
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
Zac Dover [Mon, 30 May 2022 13:32:06 +0000 (23:32 +1000)]
doc/start: update "memory" in hardware-recs.rst
This PR corrects some usage errors in the "Memory" section
of the hardware-recommendations.rst file. It also closes
some parentheses that were opened but never closed.
Yingxin Cheng [Mon, 30 May 2022 10:35:33 +0000 (18:35 +0800)]
crimson/os/seastore/transaction_manager: set to test mode under debug build
* force test mode under debug builds.
* make reclaim happen and be validated as early as possible.
* do not block user transactions when the reclaim-ratio (unalive/unavailable)
is high, especially in the beginning.
Yingxin Cheng [Mon, 30 May 2022 05:27:30 +0000 (13:27 +0800)]
crimson/os/seastore/segment_cleaner: delay reclaim until near full
It should generally be better to delay reclaim as much as possible, so
that:
* unalive/unavailable can be higher, reducing reclaim effort;
* there are fewer conflicts between mutate and reclaim transactions;