Jason Dillaman [Thu, 10 Dec 2020 22:32:16 +0000 (17:32 -0500)]
librbd/mirror: tweak which snapshot is unlinked when at capacity
The rbd-mirror daemon will attempt to sync from the last synced
snapshot to the next mirror snapshot. When the limit is at 3, this
currently can result in a situation where an in-use sync snapshot is
deleted. Instead of unlinking the second oldest snapshot, always
unlink the third oldest.
Fixes: https://tracker.ceph.com/issues/48553
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 10 Dec 2020 21:13:23 +0000 (16:13 -0500)]
librbd/mirror: increase debug logging of snapshot state machines
Try to keep debug level 20 for IO state machines so that setting the
debug level to something lower should show the manipulation of
the mirror snapshots.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Tue, 8 Dec 2020 19:16:49 +0000 (14:16 -0500)]
librbd/deep_copy: added new migrating flag to object copy
The migration operation and the copyup state machine will set
this flag when attempting to perform a deep-copy due to a
live-migration.
This flag will prevent a possible race condition between the
start of the object deep-copy when migration was enabled and
the writing portion of the deep-copy when migration might
have completed via external means.
Fixes: https://tracker.ceph.com/issues/45694
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Sebastian Wagner [Thu, 10 Dec 2020 21:41:47 +0000 (22:41 +0100)]
Merge pull request #38490 from sebastian-philipp/mypy-0.790
src,qa,dashboard: Upgrade to mypy 0.790
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Jason Dillaman [Thu, 10 Dec 2020 04:17:24 +0000 (23:17 -0500)]
rbd-mirror: do not attempt to unlink from more recent snapshots
The snapshot-based mirroring replayer should only attempt to unlink
from any snapshots that are older than the end remote snapshot id to
prevent the remote side from incorrectly deleting the snapshot.
Fixes: https://tracker.ceph.com/issues/48527
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
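The rule described above can be sketched in Python. This is only an
illustration, not the replayer's code: the `unlink_candidates` helper is
hypothetical, and it assumes snapshot ids are integers that increase with
creation time, so "older" means "smaller id". The real replayer may apply
further conditions that the commit message does not describe.

```python
def unlink_candidates(remote_snap_ids, end_snap_id):
    """Return the remote snapshot ids the replayer may unlink from.

    Hypothetical helper: only ids strictly older (smaller) than the
    end remote snapshot id qualify; the end snapshot itself and any
    more recent snapshots are left alone.
    """
    return [snap_id for snap_id in sorted(remote_snap_ids)
            if snap_id < end_snap_id]


# Snapshots 9 and 12 are not older than the end snapshot, so they are skipped.
candidates = unlink_candidates([12, 4, 9, 7], end_snap_id=9)
```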
Jason Dillaman [Thu, 10 Dec 2020 03:30:17 +0000 (22:30 -0500)]
librbd/mirror: unlink peer might recursively loop
If the mirror peer set is (incorrectly) empty, it's not currently
possible for the unlink peer state machine to properly delete the
snapshot. This can result in a recursive loop between the create
primary snapshot state machine and the unlink peer state machine
until the stack depth grows too large.
Fixes: https://tracker.ceph.com/issues/48525
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
in crimson, we have two families of shared functions.
- one of them is used by alien store. they are compiled with
-DWITH_SEASTAR and -DWITH_ALIEN, to enable the shim code between
seastar and POSIX thread.
- the other is used by crimson in general, where no lock is allowed.
currently, we use the "crimson" and "ceph" namespace to differentiate
these two families of functions, so they can colocate in the same
executable without violating the ODR. see src/include/common_fwd.h for
more details.
the functions defined in src/common/version.cc are also shared by
alien store and crimson code. and because we have different
implementations of `CephContext` in crimson and in classic OSD (i.e.
alienstore), we have to have different implementations of this function
as well, if we follow the same approach. but since these functions are
very simple and non-blocking, there is not much value in
differentiating them. it is better to inject the test settings using
an environment variable instead of the ceph option subsystem.
in this change, the "ceph_debug_version_for_testing" environment variable
is checked instead, so that crimson and alienstore can share the same
compilation unit of version.cc. the "debug_version_for_testing" option
is removed.
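the idea of reading a test override from the environment rather than from
a configuration subsystem can be sketched in Python. this is a minimal
illustration and not the code in version.cc: the `BUILD_VERSION` constant
and the `version_string` helper are hypothetical, and the sketch assumes
the variable's value simply replaces the reported version when it is set.

```python
import os

# Placeholder for the version baked in at build time (hypothetical value).
BUILD_VERSION = "0.0.0-build"

# Name of the environment variable taken from the commit message.
OVERRIDE_VAR = "ceph_debug_version_for_testing"


def version_string(env=None):
    """Return the version to report, honoring the test override.

    No configuration object is consulted, so the same function can be
    shared by callers that have different config implementations.
    """
    if env is None:
        env = os.environ
    override = env.get(OVERRIDE_VAR)
    # Assumption: an unset or empty variable means "no override".
    return override if override else BUILD_VERSION


default_version = version_string({})
forced_version = version_string({OVERRIDE_VAR: "99.0.0"})
```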
Kefu Chai [Wed, 9 Dec 2020 16:13:15 +0000 (00:13 +0800)]
cmake: reorder linked libraries of crimson-alienstore
so the libraries like libkv can access the symbols exposed by
crimson-alien-common.
this change should address the link failures like:
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-8049-g1ab93e4d/rpm/el8/BUILD/ceph-16.0.0-8049-g1ab93e4d/src/common/PriorityCache.cc:175:
undefined reference to `ceph::common::PerfCountersBuilder::~PerfCountersBuilder()'
Kefu Chai [Wed, 9 Dec 2020 11:39:19 +0000 (19:39 +0800)]
cmake: do not link crimson-alienstore against crimson-os
crimson-os contains crimson-alienstore, so we should not link the latter
against the former. this change partially reverts 490b6322fbbece053f1d92b29ae101bfb0976007
Kefu Chai [Wed, 9 Dec 2020 09:42:43 +0000 (17:42 +0800)]
test/crimson: do not link against crimson-{os,common}
this change partially reverts 652dbacc7424efbd3c3175de8ba79ed29edd55c8
quite a few tests do not use crimson-os at all, so no need to link
against this library.
even worse is that crimson-os contains crimson-seastore *and*
crimson-alienstore. this introduces cyclic references.
Alexander Sushko [Fri, 27 Nov 2020 11:04:13 +0000 (14:04 +0300)]
pybind/mgr/prometheus/module.py: defaultdict for num_by_state
num_by_state[state] += count in the get_pg_status method raises KeyError
if the pg state is not in the PG_STATES list. PG_STATES should be kept in
sync with osd_types.cc:pg_state_string(), but sometimes it is not. Once
the KeyError is raised, the mgr metrics are not available at all.
Fixes: https://tracker.ceph.com/issues/46142
Signed-off-by: Alexander Sushko <alexandrsushko@gmail.com>
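The failure mode and the defaultdict approach can be sketched in a few
lines of Python. This is a minimal illustration, not the module's actual
code: the contents of `PG_STATES`, the `count_by_state` helper and the
sample input are assumptions made up for the example. Only the
`num_by_state[state] += count` pattern comes from the commit message.

```python
from collections import defaultdict

# Hypothetical, deliberately incomplete list of known PG states.
PG_STATES = ["active", "clean", "peering"]


def count_by_state(pg_summary):
    """Sum PG counts per individual state (hypothetical helper).

    pg_summary maps a combined state string such as "active+clean"
    to the number of PGs currently in that combined state.
    """
    # Known states start at 0; a state missing from PG_STATES is
    # created on first use instead of raising KeyError.
    num_by_state = defaultdict(int, {state: 0 for state in PG_STATES})
    for combined, count in pg_summary.items():
        for state in combined.split("+"):
            num_by_state[state] += count
    return num_by_state


# "laggy" is not in PG_STATES; a plain dict would raise KeyError here.
counts = count_by_state({"active+clean": 10, "active+laggy": 2})
```

With a plain dict initialized only from `PG_STATES`, the second entry
would abort the whole loop, which matches the symptom described above.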
Kefu Chai [Thu, 10 Dec 2020 04:22:48 +0000 (12:22 +0800)]
cmake: stop rebuilding rocksdb every time
this change was originally introduced as a part of 418bfd7bb5ec1dcec2b011e9df118c33ce38d398, and later migrated / changed
into the current form. but the idea is the same: to rebuild rocksdb even
if the stamp file shows that it has been built. there is no need to do
so, as we don't hack RocksDB as we used to. also, it is distracting to
check this log message when rebuilding the tree. so drop it.
Kefu Chai [Thu, 10 Dec 2020 03:13:55 +0000 (11:13 +0800)]
script/run-cbt.sh: drop bashism
* "local" is not supported by POSIX shell, so drop it
* assign "$*" to a non-array variable, see
https://github.com/koalaman/shellcheck/wiki/SC2124
* fail early if "cd" fails, see
https://github.com/koalaman/shellcheck/wiki/SC2164
* wrap "$(...)" in double quotes to avoid word splitting
Nizamudeen A [Tue, 8 Dec 2020 14:35:28 +0000 (20:05 +0530)]
mgr/dashboard: Adding the alert bad certificate error to the ssl providers error
upstream tracked in https://github.com/cherrypy/cheroot/pull/348
Fixes: https://tracker.ceph.com/issues/48490
Signed-off-by: Nizamudeen A <nia@redhat.com>
Merge pull request #37130 from pcuzner/cephadm-exporter
cephadm: Add a daemon mode for cephadm to provide a metadata endpoint
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>