Patrick Donnelly [Thu, 13 Feb 2020 23:21:30 +0000 (15:21 -0800)]
Merge PR #33194 into master
* refs/pull/33194/head:
qa: add tests for mds_join_fs cluster affinity
qa: update cluster warning message for removed MDS
doc: add section on new mds_join_fs behavior
mon/MDSMonitor: enforce mds_join_fs cluster affinity
mon/MDSMonitor: use type of info.rank or mds_rank_t
qa: accept operation on current fs status
qa: add method to enable multifs
qa: fix nested generator use
qa: manage config changes through mons
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Thu, 13 Feb 2020 16:26:08 +0000 (10:26 -0600)]
Merge PR #29427 into master
* refs/pull/29427/head:
mgr/rook: Make use of rook-client-python when talking to Rook
cmake: Integrate Rook client generation
mgr/rook: Automatically generate Rook client interface
Add submodule to rook-client-python.git
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Before this, the "mds_join_fs" config only expressed a preference for which
standby the monitors would select. Now the monitors actively enforce this
by purposefully removing an MDS with lower "affinity". An MDS standby
has highest affinity if its mds_join_fs is the file system in question
or if it is a vanilla standby (no mds_join_fs).
Fixes: https://tracker.ceph.com/issues/43392
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
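A minimal sketch of the affinity rule as worded above, taking the message at face value (two tiers: a standby whose mds_join_fs matches the file system or is unset, versus one pinned to another file system). The helper names `has_high_affinity` and `pick_replacement` are hypothetical and are not the MDSMonitor code, which is C++ and may draw finer distinctions.

```python
from typing import List, Optional, Tuple


def has_high_affinity(join_fs: Optional[str], fs_name: str) -> bool:
    """True if mds_join_fs is unset (vanilla) or names this file system."""
    return join_fs is None or join_fs == fs_name


def pick_replacement(active_join_fs: Optional[str],
                     standbys: List[Tuple[str, Optional[str]]],
                     fs_name: str) -> Optional[str]:
    """Return the name of a standby that should replace the active MDS.

    Returns None when the active MDS already has high affinity or when
    no standby would improve on it.
    """
    if has_high_affinity(active_join_fs, fs_name):
        return None
    for name, join_fs in standbys:
        if has_high_affinity(join_fs, fs_name):
            return name
    return None
```

In this sketch an active MDS pinned to another file system is replaced by the first standby that is either vanilla or pinned to the file system in question.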
Patrick Donnelly [Mon, 10 Feb 2020 18:46:09 +0000 (10:46 -0800)]
qa: manage config changes through mons
This provides a generic framework for making Ceph configuration
changes in tests through the monitors rather than through the asok interface or
local ceph.conf changes. Any changes are reverted during test teardown.
A future patch will convert existing tests manipulating the local
ceph.conf or admin socket.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
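The idea can be sketched as follows, assuming a hypothetical `MonConfigChanges` helper that is handed a command runner. It is not the qa framework's actual class. It only records which options were set and removes them at teardown; restoring a value that existed before the test is omitted for brevity.

```python
from typing import Callable, List, Tuple


class MonConfigChanges:
    """Hypothetical helper: apply config via the mons, undo at teardown."""

    def __init__(self, run: Callable[[List[str]], None]) -> None:
        self._run = run  # e.g. a function that executes a CLI command
        self._changed: List[Tuple[str, str]] = []

    def set(self, who: str, name: str, value: object) -> None:
        """Set an option centrally and remember it for later removal."""
        self._run(["ceph", "config", "set", who, name, str(value)])
        self._changed.append((who, name))

    def teardown(self) -> None:
        """Remove every option set by this helper, most recent first."""
        while self._changed:
            who, name = self._changed.pop()
            self._run(["ceph", "config", "rm", who, name])
```

Passing the runner in keeps the sketch testable without a cluster: a test can hand in `list.append` and inspect the recorded commands.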
In the erasure coding section under Architecture, k = 2 + M = 1 is given for the
number of data copies and redundancy copies respectively, which is ambiguous.
The proposal is to change it to k = 2, M = 1, as the + sign is not needed here.
Signed-off-by: Nag Pavan Chilakam <nagpavan.chilakam@gmail.com>
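As a worked illustration of why the two values are independent parameters rather than a sum, here is a small sketch. The function name `ec_profile_summary` is hypothetical; it assumes the usual reading that k counts data chunks and m counts coding (redundancy) chunks.

```python
def ec_profile_summary(k: int, m: int) -> dict:
    """Summarize an erasure-code profile with k data and m coding chunks."""
    return {
        "total_chunks": k + m,                # chunks written per object
        "tolerated_failures": m,              # chunks that may be lost
        "raw_to_usable_ratio": (k + m) / k,   # raw space per usable byte
    }


print(ec_profile_summary(2, 1))
```

With k = 2 and m = 1, each object is stored as three chunks, one chunk may be lost, and raw usage is 1.5 times the usable data.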
Sebastian Wagner [Mon, 13 Jan 2020 12:01:20 +0000 (13:01 +0100)]
mgr/rook: Make use of rook-client-python when talking to Rook
Fixes:
* `CephFilesystem.spec.onlyManageDaemons` does not exist
* `CephObjectStore.spec.gateway.allNodes` does not exist
* Adding directory-osds to existing nodes was broken
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Kefu Chai [Thu, 13 Feb 2020 04:32:00 +0000 (12:32 +0800)]
cmake: disable -Wnon-virtual-dtor when compiling seastar
quite a few base classes with virtual functions mark their destructor
non-virtual and `protected` for better performance, as seastar destructs
them via the concrete type of the instance.
so let's disable this warning. but, please note, this newly added
CXX_FLAG in `Seastar_CXX_FLAGS` won't be populated to crimson, as it is
only added to the CXX_FLAGS used for compiling seastar itself. so we still
have `-Wnon-virtual-dtor` warnings when compiling crimson as long as seastar
headers are included.
so to silence these warnings, we need to add it also to `crimson::cflags`,
probably it's worth trading the noise caused by seastar's optimizations
with the potentially useful warning messages caused by our oversights.
in my case, there are over 300 lines of warnings spit out by GCC-10, so i
still think it'd be better to add it also to crimson to increase the
signal-to-noise ratio. we can always remove it every once in a while to
check if we forgot to mark the destructor of a base class `virtual`.
in the latest version of seastar, we are not able to construct a
`seastar::pollable_fd_state` directly, as its constructor is now
`protected`, and only the reactor is able to create an instance of
`seastar::pollable_fd_state` now.
and `seastar::readable_eventfd` offers all we need to get notified
by the reactor in an alien world. so let's use it instead.
J. Eric Ivancich [Fri, 10 Jan 2020 19:12:35 +0000 (14:12 -0500)]
rgw: clean up address 0-length listing results...
Some minor clean-ups to the previous commit, including adjusting logging
messages, renaming a variable, and converting a #define to a constexpr (and
adjusting its scope).
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
J. Eric Ivancich [Thu, 13 Feb 2020 01:38:44 +0000 (20:38 -0500)]
rgw: address 0-length listing results when non-vis entries dominate
A change to advance the marker in RGWRados::cls_bucket_list_ordered to
the last entry visited rather than the final entry in list to push
progress as far as possible.
Since non-vis entries tend to cluster on the same shard, such as
during incomplete multipart uploads, this can severely limit the
number of entries returned by a call to
RGWRados::cls_bucket_list_ordered since once that shard has provided
all its members, we must stop. This interacts with a recent
optimization to reduce the number of entries requested from each
shard. To address this the number of attempts is sent as a parameter,
so the number of entries requested from each shard can grow with each
attempt. Currently the growth is linear but perhaps exponential growth
(capped at number of entries requested) should be considered.
Previously RGWRados::Bucket::List::list_objects_ordered was capped at
2 attempts, but now we keep attempting to ensure we make forward
progress and return entries when some exist. If we fail to make
forward progress, we log the error condition and stop looping.
Additional logging, mostly at level 20, is added to the two key
functions involved in ordered bucket listing to make it easier to
follow the logic and address potential future issues that might arise.
Additionally modify attempt number based on how many results were
received.
Change the per-shard request number, so it grows exponentially rather
than linearly as the attempts go up.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
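The per-shard growth described above can be sketched like this. The function `entries_per_shard`, its starting point (an even split across shards), and the doubling factor are assumptions for illustration, not the formula used in RGWRados::cls_bucket_list_ordered.

```python
import math


def entries_per_shard(num_entries: int, num_shards: int, attempt: int) -> int:
    """Entries to request from each shard on a given attempt (1-based).

    Starts from an even split and doubles with each attempt, capped at
    the total number of entries requested.
    """
    base = max(1, math.ceil(num_entries / num_shards))
    return min(num_entries, base * 2 ** (attempt - 1))


for attempt in range(1, 6):
    print(attempt, entries_per_shard(1000, 10, attempt))
```

The cap means later attempts never ask a single shard for more than the caller wanted in total, which matches the "capped at number of entries requested" remark.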
Jason Dillaman [Wed, 12 Feb 2020 02:59:41 +0000 (21:59 -0500)]
rbd-mirror: prevent asok commands from dereferencing uninitialized members
If rbd-mirror fails to connect to the remote cluster, there is a window of time
where the asok commands might attempt to dereference the default namespace
replayer or access invalid librados IoCtxs.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Sage Weil [Wed, 12 Feb 2020 16:56:22 +0000 (10:56 -0600)]
cephadm: avoid trigger old podman bug
This ticket seems to suggest that (1) the root cause is related to an
exec that is orphaned and screws up the container state (due to, e.g., ssh
dropping, or a timeout), (2) -f may be needed, sometimes, to recover, and
(3) newer versions fix it.
Sage Weil [Tue, 11 Feb 2020 16:01:33 +0000 (10:01 -0600)]
mgr/orch: service ls -> ps, add DaemonDescription
- We keep ServiceDescription around unmodified (although it will need some
cleanup later)
- We add DaemonDescription, and clean out the service-related ambiguities
- Add a new list_daemons() method for Orchestrator
- Add a new 'ceph orch ps' command
- In cephadm, drop get_services(), and implement list_daemons()
- a million changes to make this work
- Adjust health alert and option names
Sage Weil [Wed, 12 Feb 2020 17:13:41 +0000 (11:13 -0600)]
Merge PR #33205 into master
* refs/pull/33205/head:
mgr/cephadm: Bail if we cannot find a host for services
mgr/cephadm: fix placement of new daemons (mds,rgw,rbd-m)
mgr/orchestrator: minor change to improve type checking
mgr/cephadm: test_cephadm: simplify matching strings
The lvm batch command fails to prepare the OSDs on the created LV.
When using lvm batch, the LV/VG are created prior to the OSD prepare.
During that creation, multiple tags are set with null value.
Sebastian Wagner [Wed, 12 Feb 2020 10:34:40 +0000 (11:34 +0100)]
python-common: add py.typed (PEP 561)
Bugs found:
* Fixed documentation of how `mgr/Orchestrator.create_osds` is called
* mgr/Rook.create_osds: Added missing `.path` when querying paths.
* mgr/Rook.create_osds: Fixed progress message
* mgr/RookCluster.create_osds: Empty list instead of `None`
* python-common: use empty objects instead of `None`
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Kefu Chai [Wed, 12 Feb 2020 08:12:38 +0000 (16:12 +0800)]
crimson/common: correct type of callback func
`md_config_obs_t` is defined as
```
using md_config_obs_t = ceph::md_config_obs_impl<ConfigProxy>;
```
in `common/config_obs.h`. it takes advantage of the fact that
somebody exposes the correct version of `ConfigProxy` to the global
namespace. this is intended to fulfill the needs of other components
which expect `md_config_obs_t`. otherwise we need to specify
`ceph::md_config_obs_impl<ceph::ConfigProxy>` or
`ceph::md_config_obs_impl<crimson::common::ConfigProxy>` depending on
if we are programming crimson or not.
but in this case, we are actually defining
`crimson::common::ConfigProxy`, so it'd be better to define
`md_config_obs_t` explicitly instead of relying on "somebody" which exposes
`ConfigProxy`. and `ConfigObserver` is defined using the current
`ConfigProxy`, so it's more correct and more readable than using
`md_config_obs_t` defined in `common/config_obs.h`.
Kefu Chai [Wed, 12 Feb 2020 08:05:14 +0000 (16:05 +0800)]
cmake: disable concepts in boost::asio
GCC-10 and Clang choke when compiling a concept constrained with
its template parameter, which is in turn another concept. as a
workaround for this bug in boost::asio, we should disable concepts
support in it. but it would be nice to enable it when the compiler is
able to use concepts to do some compile-time checks even if the
concepts are not compliant with C++20.