Sage Weil [Tue, 7 May 2019 17:48:16 +0000 (12:48 -0500)]
Merge PR #27929 into master
* refs/pull/27929/head:
os/bluestore: be verbose about objects that exist on rmcoll
osd/PrimaryLogPG: disallow ops on objects with an empty name
osd/PG: fix cleanup of pgmeta-like objects on PG deletion
Yuri Weinstein [Mon, 6 May 2019 15:55:27 +0000 (08:55 -0700)]
qa/test: reduce overall number of runs
We kill thousands of queued jobs every week, so why do we even schedule them?
Another point is that we run numerous tests as part of PR testing on released versions anyway, so it's duplicating effort.
Kefu Chai [Tue, 7 May 2019 12:57:17 +0000 (20:57 +0800)]
seastar: pick up changes for better performance
to be specific, a78fb44c96e2912c6f39b2151f94a0bb2b5796a6 helps to
improve the performance of the future implementation -- with this change
a future can always reference its local state without checking its `_promise`
and dereferencing it.
The justification behind the change is the behaviour of the classical OSD.
It calls PrimaryLogPG::find_object_context() well before going
through the OSDOps in ::do_osd_ops().
Kefu Chai [Tue, 7 May 2019 07:06:42 +0000 (15:06 +0800)]
crimson/osd: shutdown services in the right order
we should stop config service *after* osd is stopped, as osd depends on
a working and alive config subsystem when stopping itself. for instance,
the destructor of AuthRegistry unregisters itself from the ObserverMgr,
which is in turn a member variable of ConfigProxy, so if ConfigProxy is
destroyed before we destroy mon::Client, we will have a segfault with
the following backtrace:
ObserverMgr<ceph::md_config_obs_impl<ceph::common::ConfigProxy>
>::remove_observer(ceph::md_config_obs_impl<ceph::common::ConfigProxy>*)
at /var/ssd/ceph/build/../src/common/config_obs_mgr.h:78
AuthRegistry::~AuthRegistry() at
/var/ssd/ceph/build/../src/crimson/common/config_proxy.h:101
(inlined by) AuthRegistry::~AuthRegistry() at
/var/ssd/ceph/build/../src/auth/AuthRegistry.cc:28
ceph::mon::Client::~Client() at
/var/ssd/ceph/build/../src/crimson/mon/MonClient.h:44
ceph::mon::Client::~Client() at
/var/ssd/ceph/build/../src/crimson/mon/MonClient.h:44
OSD::~OSD() at /usr/include/c++/9/bits/unique_ptr.h:81
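The ordering constraint can be illustrated with a minimal sketch. All names
here (`ObserverRegistry`, `SelfUnregistering`, `shutdown_in_order`) are
hypothetical stand-ins and not the crimson classes; the only point is that an
object which unregisters itself in its destructor has to be destroyed while
the registry it refers to is still alive.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an observer interface.
struct Observer {};

// Hypothetical stand-in for an observer manager owned by the config service.
class ObserverRegistry {
  std::vector<Observer*> observers_;
public:
  void add(Observer* o) { observers_.push_back(o); }
  void remove(Observer* o) {
    observers_.erase(std::remove(observers_.begin(), observers_.end(), o),
                     observers_.end());
  }
  std::size_t size() const { return observers_.size(); }
};

// Hypothetical component that registers on construction and unregisters
// in its destructor, so it needs a live registry when it is destroyed.
class SelfUnregistering : public Observer {
  ObserverRegistry& registry_;
public:
  explicit SelfUnregistering(ObserverRegistry& r) : registry_(r) {
    registry_.add(this);
  }
  ~SelfUnregistering() { registry_.remove(this); }
};

// Tears things down in reverse dependency order and returns how many
// observers were still registered after the dependent was destroyed.
std::size_t shutdown_in_order(std::unique_ptr<ObserverRegistry> config,
                              std::unique_ptr<SelfUnregistering> dependent) {
  dependent.reset();                  // 1. dependent unregisters itself
  std::size_t left = config->size();  //    registry is still alive here
  config.reset();                     // 2. only now destroy the registry
  return left;
}
```

Swapping the two `reset()` calls would make the destructor of the dependent
touch a destroyed registry, which is the kind of use-after-free the backtrace
above points at.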
vstart.sh: enable creating multiple OSDs backed by spdk backend
Currently vstart.sh only supports deploying one OSD based on NVMe SSD.
The following two cases will cause errors:
1. There are 2 or more NVMe SSDs from the same vendor on the machine
2. Trying to deploy 2 or more OSDs if we only get 1 pci_id available
Add support for deploying multiple OSDs on a machine with
multiple NVMe SSDs.
Changcheng Liu [Mon, 6 May 2019 02:29:11 +0000 (10:29 +0800)]
vstart.sh: correct ceph-run path
ceph-run is in the same directory as vstart.sh. vstart.sh is often
run from the build directory. Without the right directory being given,
the ceph-run file can't be found.
Signed-off-by: Changcheng Liu <changcheng.liu@intel.com>
Sage Weil [Thu, 2 May 2019 16:39:31 +0000 (11:39 -0500)]
os/bluestore: be verbose about objects that exist on rmcoll
This is always a bug (OSD doesn't try to remove a collection unless it
thinks it is empty), and not seeing it at default debug levels makes it
hard to track down.
Sage Weil [Thu, 2 May 2019 16:28:14 +0000 (11:28 -0500)]
osd/PG: fix cleanup of pgmeta-like objects on PG deletion
If an object has an empty 'name' field, it "looks" like a pgmeta object,
and the PG cleanup code was skipping it. However, we were letting these
objects get created.
Fix by only skipping *our* pgmeta object. If there are other pgmeta-like
objects in the PG collection, clean them up.
Fixes: https://tracker.ceph.com/issues/38724
Signed-off-by: Sage Weil <sage@redhat.com>
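The idea of skipping only the PG's own pgmeta object can be sketched as
follows. This is a simplified illustration, not the actual PG deletion code:
`ObjectId`, `looks_like_pgmeta()` and `objects_to_remove()` are hypothetical
names, and the real object identifier has more fields than shown here.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified object identifier.
struct ObjectId {
  std::string name;
  unsigned hash;
  // An empty name is what makes an object "look" like a pgmeta object.
  bool looks_like_pgmeta() const { return name.empty(); }
  bool operator==(const ObjectId& o) const {
    return name == o.name && hash == o.hash;
  }
};

// Selects the objects to delete while removing a PG. Only the PG's own
// pgmeta object is skipped; other pgmeta-like objects are cleaned up.
std::vector<ObjectId> objects_to_remove(const std::vector<ObjectId>& listed,
                                        const ObjectId& own_pgmeta) {
  std::vector<ObjectId> out;
  for (const auto& oid : listed) {
    if (oid == own_pgmeta) {
      continue;  // skip exactly one object, compared by identity
    }
    out.push_back(oid);  // includes stray objects with an empty name
  }
  return out;
}
```

A check based on `looks_like_pgmeta()` alone would have skipped the stray
object as well, leaving it behind in the collection.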
in the previous change, MDSMap::min_compat_client was changed from int8_t to
ceph_release_t, i.e. uint8_t. in
Server::update_required_client_features(), we check
MDSMap::min_compat_client to see if it is greater than the given version, so
a negative "-1" would wrap around and be interpreted as 255, and hence would
always be greater than whatever version it is compared with. so we need to
bump up the encoding version, and
* normalize the number if it is -1
* ignore the number if it is way too large.
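The wrap-around and the normalization can be sketched like this. The helper
`normalize_min_compat_client()` and the `max_known` parameter are
hypothetical, and mapping an out-of-range value to "unknown" (0) is an
assumption about what "ignore" means here, not the actual decoder logic.

```cpp
#include <cstdint>

// -1 stored in an int8_t becomes 255 once it is read as a uint8_t.
static_assert(static_cast<std::uint8_t>(std::int8_t{-1}) == 255,
              "int8_t -1 wraps to 255 as uint8_t");

// Assumption: 0 stands for "unknown / not set".
constexpr std::uint8_t kUnknownRelease = 0;

// Hypothetical helper: turn a value written by an old encoder (int8_t)
// into a sane release number. 'max_known' is the newest release number
// the decoder is aware of.
inline std::uint8_t normalize_min_compat_client(std::int8_t encoded,
                                                std::uint8_t max_known) {
  if (encoded == -1) {
    return kUnknownRelease;  // normalize the legacy "not set" marker
  }
  const auto raw = static_cast<std::uint8_t>(encoded);
  if (raw > max_known) {
    return kUnknownRelease;  // ignore values that are way too large
  }
  return raw;
}
```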
we have the following pains when it comes to ceph release related
programming:
* we use int, uint8_t, uint32_t, unsigned int for representing the ceph
release, e.g., jewel, luminous, nautilus, in different places in our
source tree.
* we always need to add a comment next to `uint8_t release` to help
readers understand that it is CEPH_RELEASE_*.
* also we keep forgetting that "os << release" actually prints the
release as an ASCII character.
* and it's painful to remember that we have to translate the release
number using `ceph_release_name()` before printing it out in a
human-readable format.
* we replicate the n+2 upgrade policy in multiple places.
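The "os << release" pitfall is easy to reproduce in isolation. In this small
sketch the function names are made up for illustration and the value 65 is
arbitrary, not a real release number; it assumes `uint8_t` is an alias for
`unsigned char`, as it is on common platforms.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Streams the uint8_t directly: ostream treats it as a character.
std::string print_raw(std::uint8_t release) {
  std::ostringstream os;
  os << release;
  return os.str();
}

// Widens to int first, so the numeric value is printed.
std::string print_as_number(std::uint8_t release) {
  std::ostringstream os;
  os << static_cast<int>(release);
  return os.str();
}
```

With these, `print_raw(65)` yields the single character `A`, while
`print_as_number(65)` yields the text `65`.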
in this change, `ceph_release_t` and some helper functions are
introduced to alleviate the pains above.
* add a scoped enum for representing ceph releases, so the release
is typed, which means that we can attach different functions to
it. and in future, we can even replace `ceph_release_t` with
a class if we need to support fancier features which cannot be
implemented using free functions.
* add `ostream<<()` operator for `ceph_release_t`, so we can simply
send it to `ostream`.
* add `can_upgrade_from()` so we don't need to repeat ourselves.
* move ceph_release_from_name() to ceph_release.{h,cc}, as currently,
ceph_release.cc uses `ceph_release_name()` for implementing
`ostream<<()`, and after this change, `ceph_release_from_name()`
will return `ceph_release_t`, so if we keep `ceph_release_from_name()`
where it was, these two headers will be included by each other,
which is a no-go.
* reimplement `ceph_release_from_name()` using a loop. before this
change, `ceph_release_from_name()` was implemented using a manually
unrolled if-else structure, which is more performant, but the
downside is that it replicates the mapping between release number
and its name. so after this change, a loop is used instead.
as this function is not used in the critical path, this change
should not have a visible impact on performance.
* always use ceph_release_t::unknown as the default value of the
"release" member variables. before this change, we sometimes used
"0" and sometimes "1". after inspecting the code, i found that
"0" is good enough to cover all the use cases. and since "0" is a
magic number in this context, it is replaced with
`ceph_release_t::unknown`. to facilitate the checking against
`ceph_release_t::unknown`, `operator!()` is added.
* ceph::to_string() and ceph::to_integer<>() are added to help
remove the assumption about the underlying type of `ceph_release_t`.
ideally, users of `ceph_release_t` should not use `static_cast<>` to
cast it into integer types; instead, they should use
`ceph::to_integer<>()` to do this job. if, in future, we want to
use a `class` to represent `ceph_release_t`, we can get this done
with minimum change, if `ceph::to_string()` and `ceph::to_integer<>()`
are used. we cannot specialize them in the `std` namespace, as
it's claimed that it's undefined behavior to do so. see
https://en.cppreference.com/w/cpp/language/extending_std .
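The overall shape of such a typed release can be sketched as below. This is
an illustration under stated assumptions, not the code from this change:
everything lives in a made-up `sketch` namespace, the enum lists only three
releases, the two-argument `can_upgrade_from()` and its "from + 2 >= to" rule
are a simplification of the n+2 policy, and handling of unknown releases is
left out.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

namespace sketch {

// Scoped enum: the release is a distinct type, not a bare integer.
enum class release_t : std::uint8_t {
  unknown = 0,
  luminous = 12,
  mimic = 13,
  nautilus = 14,
};

// Single table for the number <-> name mapping, walked with a loop.
struct release_entry {
  release_t release;
  const char* name;
};
constexpr release_entry release_table[] = {
  {release_t::luminous, "luminous"},
  {release_t::mimic, "mimic"},
  {release_t::nautilus, "nautilus"},
};

// Hides the underlying type from callers.
template <typename IntType>
constexpr IntType to_integer(release_t r) {
  static_assert(std::is_integral<IntType>::value, "integer type required");
  return static_cast<IntType>(r);
}

// True when the release is unknown, so callers can write `if (!release)`.
constexpr bool operator!(release_t r) { return r == release_t::unknown; }

std::string to_string(release_t r) {
  for (const auto& e : release_table) {
    if (e.release == r) return e.name;
  }
  return "unknown";
}

release_t release_from_name(const std::string& name) {
  for (const auto& e : release_table) {
    if (name == e.name) return e.release;
  }
  return release_t::unknown;
}

// Prints the release name rather than a raw character.
std::ostream& operator<<(std::ostream& os, release_t r) {
  return os << to_string(r);
}

// Simplified n+2 rule (assumption): upgrading may skip at most one release.
bool can_upgrade_from(release_t from, release_t to) {
  return to_integer<int>(from) + 2 >= to_integer<int>(to);
}

}  // namespace sketch
```

Because callers only go through `to_integer<>()`, `to_string()` and the
operators, the enum could later be swapped for a class without touching them.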