Jason Dillaman [Wed, 15 Jan 2020 20:09:05 +0000 (15:09 -0500)]
rbd-mirror: pool replayer should instantiate the remote pool poller
Let the poller pull the metadata from the remote before advancing to
the leader watcher initialization. If the remote metadata changes
during runtime, stop the pool replayer so that it can be
re-initialized.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Wed, 15 Jan 2020 20:03:18 +0000 (15:03 -0500)]
rbd-mirror: periodic remote pool metadata poller
The mirror uuid and mirror peer uuid should be periodically retrieved
from the remote peer. The peer ping logic can also be moved to this
helper class.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Casey Bodley [Mon, 13 Jan 2020 19:44:47 +0000 (14:44 -0500)]
Merge pull request #32534 from cbodley/wip-43512
rgw multisite: enforce spawn window for incremental data sync
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Eric J. Ivancich <ivancich@redhat.com>
Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>
Jason Dillaman [Sun, 12 Jan 2020 15:05:04 +0000 (10:05 -0500)]
rbd-mirror: skip closing local image if it was already closed
If the journal replayer finishes replaying (error or promotion), it will
close the local image. However, the image replayer state machine will also
shut down the journal replayer (again) which might result in attempting
to close the local image again.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
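The double-close guard described in the commit above can be illustrated with a minimal Python sketch. This is not the rbd-mirror code (which is C++); the LocalImage and Replayer classes and their members are hypothetical names used only to show the pattern of skipping a close that has already happened.

```python
class LocalImage:
    """Hypothetical stand-in for a local image handle."""

    def __init__(self):
        self.is_open = True
        self.close_calls = 0

    def close(self):
        # Closing twice is treated as an error, as a real handle might do.
        if not self.is_open:
            raise RuntimeError("image already closed")
        self.is_open = False
        self.close_calls += 1


class Replayer:
    """Hypothetical owner whose shut_down() may be invoked more than once."""

    def __init__(self, image):
        self._image = image

    def shut_down(self):
        # Skip the close if an earlier shutdown already released the image.
        if self._image is None:
            return
        image, self._image = self._image, None
        image.close()


image = LocalImage()
replayer = Replayer(image)
replayer.shut_down()  # closes the image
replayer.shut_down()  # second call is a no-op instead of a second close
```

Dropping the reference before closing means that a repeated shutdown finds nothing left to close.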
Ronen Friedman [Sat, 11 Jan 2020 07:42:49 +0000 (09:42 +0200)]
crimson: fix aarch64 ctest failure by removing some lambda attributes
There seems to be no "universally accepted" way to declare a lambda as
[[always_inline]]. "Universally accepted" here meaning: accepted with
no error or warning by gcc8, gcc9 and clang.
See for example: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60503
Patrick Donnelly [Sat, 11 Jan 2020 00:16:00 +0000 (16:16 -0800)]
Merge PR #32213 into master
* refs/pull/32213/head:
cephfs-shell: don't catch libcephfs.Error unnecessarily
cephfs-shell: make every command set a return value on failure
pybind/cephfs: move LibCephFSStateError closer to base class
pybind/cephfs: add method to get error code in Error and OSError
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Milind Changire [Fri, 10 Jan 2020 06:31:34 +0000 (12:01 +0530)]
mds: throttle scrub start for multiple active MDS
* add a check to the "scrub start" command handler that rejects a
recursive scrub request if multiple MDS are "active"
* add a check to MDSRank::handle_mds_map() that aborts scrubbing if a
scrub is in progress and max_mds is greater than 1. This covers the
case where scrubbing was started with max_mds == 1 and max_mds was
later raised to a higher value
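The two checks in the commit above reduce to two small predicates. The following Python sketch is only an illustration of that logic, not the MDS implementation (which is C++); the function names and parameters are hypothetical.

```python
def can_start_scrub(active_mds_count, recursive):
    """Reject a recursive scrub request while several MDS are active."""
    return not (recursive and active_mds_count > 1)


def should_abort_scrub(scrub_in_progress, max_mds):
    """Abort a running scrub once max_mds has been raised above 1."""
    return scrub_in_progress and max_mds > 1


# A recursive scrub is accepted with a single active MDS ...
print(can_start_scrub(active_mds_count=1, recursive=True))
# ... and a scrub started under max_mds == 1 is aborted after a bump to 2.
print(should_abort_scrub(scrub_in_progress=True, max_mds=2))
```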
Kefu Chai [Fri, 10 Jan 2020 05:28:10 +0000 (13:28 +0800)]
qa/tasks/mgr: set mgr module option with --force
if mgr is not active, monitor will refuse to set any option consumed by
mgr modules.
the reason the tests pass sometimes is that there is a race here:
1. stop all mgr daemons
2. MgrMonitor gets updated and updates its mgr_module_options
accordingly.
3. in TestDashboard.setUp(), we reset the port number for dashboard
using "ceph config set mgr mgr/dashboard/y/ssl_server_port 7789"
4. restart all mgr daemons
but the 2nd and 3rd steps can race with each other: if the 2nd step
happens after the 3rd step, the test passes; otherwise it fails.
in this change, "--force" is passed to the "ceph config set" command,
so ConfigMonitor can bypass the sanity test for the option, and just
set this option.
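As a rough sketch of the change above, a test helper could build the "ceph config set" command line and append "--force" on request. The helper name config_set_argv and the position of the flag at the end of the argument list are assumptions made for illustration; only the command and the flag themselves come from the commit message.

```python
def config_set_argv(who, option, value, force=False):
    """Build the argument list for "ceph config set" (hypothetical helper)."""
    argv = ["ceph", "config", "set", who, option, str(value)]
    if force:
        # Assumption: the flag is appended after the value.
        argv.append("--force")
    return argv


argv = config_set_argv("mgr", "mgr/dashboard/y/ssl_server_port", 7789,
                       force=True)
print(" ".join(argv))
```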
Jason Dillaman [Wed, 8 Jan 2020 14:21:30 +0000 (09:21 -0500)]
rbd-mirror: bootstrap and related state machines now use StateBuilder
This removes all the journal-specific variables from the image replayer
and bootstrap state machines. Instead, these details are hidden behind
the abstract StateBuilder class and its derived journal class.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Wed, 8 Jan 2020 19:29:15 +0000 (14:29 -0500)]
rbd-mirror: journal helper state machines now inherit from simple BaseRequest
The StateBuilder will need to construct and return a simple interface when
creating the CreateLocalImageRequest and PrepareReplayRequest state machines
regardless of whether it's for journal or snapshot mirroring.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Tue, 7 Jan 2020 15:50:34 +0000 (10:50 -0500)]
rbd-mirror: created simple shell image replayer state builder
This state builder will separate the common logic for journal and snapshot
mirroring and provide abstract query and builder APIs that hide the
differences between the two implementations.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Mon, 6 Jan 2020 20:22:52 +0000 (15:22 -0500)]
rbd-mirror: allow bootstrap to populate remote image id
The remote image id is used as a sanity check and a test for
permanently shutting down the replayer after the remote image is
deleted. Recent refactor work broke this by passing the remote
image id by value.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Kefu Chai [Thu, 9 Jan 2020 11:36:37 +0000 (19:36 +0800)]
qa/tasks/cephfs_test_runner: setattr to class not instance
before this change, `setattr()` set the params on the test case instance
specialized for a certain test method, so in `MgrTestCase.setUpClass()`
assert cls.mgr_cluster is not None
failed.
after this change, the class of the test suite is updated with the specified
params instead of the test case instance, even if we pass a certain test to
the test runner.
so we can
Rishabh Dave [Fri, 13 Dec 2019 06:51:03 +0000 (12:21 +0530)]
pybind/cephfs: move LibCephFSStateError closer to base class
At first glance, class OSError appears to be the only class derived
from class Error, which makes class Error look redundant. Therefore, it's
better to move class LibCephFSStateError out of the group of exception
classes derived from OSError and closer to class Error.
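The resulting layout of the exception classes can be pictured with the sketch below. It is an assumption-laden illustration, not the pybind/cephfs source: FsOSError stands in for the binding's OSError (renamed here to avoid shadowing the builtin), ObjectNotFound is a hypothetical errno-based subclass, and deriving LibCephFSStateError directly from Error is one reading of "closer to class Error".

```python
class Error(Exception):
    """Base class for all errors raised by the binding."""


class FsOSError(Error):
    """Stand-in for the binding's OSError; carries an errno and a message."""

    def __init__(self, errno_, strerror):
        super().__init__(errno_, strerror)
        self.errno = errno_
        self.strerror = strerror


class ObjectNotFound(FsOSError):
    """Hypothetical example of an errno-based error."""


class LibCephFSStateError(Error):
    """Assumed to derive directly from Error after the move."""


# With a second direct subclass, Error no longer looks redundant.
print([cls.__name__ for cls in Error.__subclasses__()])
```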
Kefu Chai [Thu, 9 Jan 2020 07:35:57 +0000 (15:35 +0800)]
cmake: let vstart depend on radosgwd
in f528f173, in cmake, the target of executable "radosgw" is renamed
to "radosgwd", and the static library of "radosgw_a" was renamed to
"radosgw". this broke the tests which expected radosgw to be available
if "tests" was built.
in this change, both "vstart" and "tests" now depend on "radosgwd"
instead of "radosgw".
Kefu Chai [Tue, 7 Jan 2020 08:15:51 +0000 (16:15 +0800)]
qa/tasks/ceph_manager: do not pick a pool if there are no pools
random.choice(seq) raises IndexError if seq is empty. we cannot ensure
there is always one or more pools in the cluster while using a
pool-related thrasher. so skip the thrasher action if there are no pools
at that moment.
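The guard described in the commit above can be sketched as follows. The helper names pick_pool and thrash_once are hypothetical and are not taken from qa/tasks/ceph_manager; the sketch only shows how an empty pool list is handled before random.choice() is reached.

```python
import random


def pick_pool(pools):
    """Return a random pool name, or None if there is nothing to pick."""
    if not pools:
        # random.choice() would raise IndexError on an empty sequence.
        return None
    return random.choice(pools)


def thrash_once(pools, action):
    """Run one pool-related action; skip it when no pool exists."""
    pool = pick_pool(pools)
    if pool is None:
        return False
    action(pool)
    return True


touched = []
print(thrash_once([], touched.append))        # action skipped
print(thrash_once(["rbd"], touched.append))   # action runs on "rbd"
```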
BuildBoost.cmake (used when we're building the submodule) doesn't
provide parity with FindBoost.cmake (used with system Boost).
Specifically, it doesn't set the _FOUND variables for the various
components, making it hard to depend on finding those features.
Set Boost_<component>_FOUND for all the components we're building in
BuildBoost.cmake to make using these variables possible.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
Casey Bodley [Wed, 8 Jan 2020 15:27:31 +0000 (10:27 -0500)]
qa/rgw: remove failing radosgw_admin_rest from multisite suite
this was added to test that admin apis forward relevant requests to the
master zone, but radosgw_admin_rest.py tries to create an admin user
with 'radosgw-admin user create'. this fails with:
Please run the command on master zone. Performing this operation on
non-master zone leads to inconsistent metadata between zones
Are you sure you want to go ahead? (requires --yes-i-really-mean-it)