xie xingguo [Tue, 26 Mar 2019 07:02:02 +0000 (15:02 +0800)]
osd/PG: move down peers out from peer_purged
In purge_strays(), we'll aggressively clear stray_set and
add all related peers into peer_purged.
However, if the corrsponding peer is down and becomes
up again, (unconditionally) adding it to peer_purged
will prevent primary from re-purging it.
(See Active::react(const MNotifyRec& notevt))
On consuming a new osdmap, let's move any down peers out from
peer_purged simutaneously. This way we can lower the risk
of leaving any leftover PGs behind.
xie xingguo [Tue, 26 Mar 2019 12:04:15 +0000 (20:04 +0800)]
osd/PG: introduce all_missing_unfound helper
We use pg_log.missing to track each peer's missing objects separately,
whereas missing_loc records the location of all (probably existing) good copies
for both primary and replicas' missing objects. Hence an item from
pg_log.missing or missing_loc is of different meaning and is not comparable.
During recovery, we can skip recovering primary only if
- primary is good, e.g., has no missing at all
- or all of the primary's missing objects do exist in missing_loc and are
currently unfound
Obviously, the current "all missing objects are unfound" checker is broken.
Fix by introducing an independent all_missing_unfound helper to make the
count of missing objects that are currently unfound correct.
Kefu Chai [Thu, 14 Jun 2018 01:32:08 +0000 (09:32 +0800)]
cmake: update BuildSPDK for spdk-18.05
in spdk v18.05, libuuid is linked by libspdk_util.a, in which,
it is used by lib/util/uuid.c. and libspdk_vol.a uses the wrapper
function exposed by libspdk_util.a, so update the CMakefile script to
reflect the change.
Jonas Jelten [Mon, 1 Apr 2019 10:28:09 +0000 (12:28 +0200)]
osd/PG: discover missing objects when an OSD peers and PG is degraded
When a PG is remapped from OSD `a` to OSD `b`, the objects are
backfilled. When OSD `a` is restarted, objects become degraded
as `a` is no longer queried or considered as a backfill source.
As the PG is degraded, `PG::discover_all_missing` is not called
when a candidate OSD peers with the primary: The PG is already
active, thus `PG::activate` (and in turn missing object discovery)
is not called. Discovery is also not initiated from
`PG::RecoveryState::Active::react(const MNotifyRec& notevt)`
as there are no unfound objects.
This patch adds a call to `discover_all_missing` when
when an OSD sends its `MNotifyRec` message and the PG is degraded.
mon/OSDMonitor: further improve prepare_command_pool_set E2BIG error message
d2c0fe9b5319a4404965c40ec92e291802ef30f6 improved this error message,
but it can be improved further by suggesting that the pg_num be increased in
smaller increments.
Neha Ojha [Wed, 6 Feb 2019 03:23:21 +0000 (19:23 -0800)]
osd/PGLog: should not rollback further than deleted object version
When a deleted object becomes a divergent entry in the pg log,
we should not be able to rollback to a version of the deleted
object that doesn't exist.
To avoid this, we need to preserve the original crt of the pg log,
before we update it in rewind_from_head() and use that to decide whether
we can rollback or not in _merge_object_divergent_entries().
qa/tasks/ceph_deploy: install python3.6 instead of python3.4 for py3 tests
EPEL7 has switched over to python3.6 as the main python3. and we started
packaging python bindings for python3.6 since
https://github.com/ceph/ceph-build/pull/1283
qa: install python3-{cephfs,rados} instead of python34-*
we install the latest python-rpm-macros on all builders since
https://github.com/ceph/ceph-build/pull/1283 . now that we started
building python36-* after that change, for testing the python3 packages on
CentOS/RHEL 7, we need to install python36-* instead of python34-*.
and after the change of 8ae1947, python36-* now "Provides" python3-*, we
can just install python3-* for fulfill the requirement for testing
python3 cephfs bindings.
Fixes: http://tracker.ceph.com/issues/39164 Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts: this change is not cherry-picked from master, because,
in master, we don't install python3 packages after 7e5c85b604.
rpm: add "Provides: python3-*" for python packages
so user can install python3-rados, instead of python36-rados, without
specifying the minor version of python. also, we should not break our
teuthology tests with this naming scheme change. for instance, our
cephfs qa suite installs `python3-cephfs` for testing the `cephfs-shell`
some of our centos7 jenkins builders are failing to build ceph master and
nautilus branches. because EPEL7 recently switched from python3.4 to
python3.6 as the native python3. see
https://lists.fedoraproject.org/archives/list/epel-announce@lists.fedoraproject.org/message/EGUMKAIMPK2UD5VSHXM53BH2MBDGDWMO/
and one of our BuildRequires, cmake3,
was offered by EPEL7. it also followed the python3.6 switch-over to
rebuild against python3.6. as a result, the cmake3-data-3.13.4-2.el7
started to depend on /usr/bin/python3.6, which is in turn offered by
python36 package. after installing python36 as a dependency of the
updated cmake3. but in cmake, we originally checks for the latest
python3 interpreter if WITH_PYTHON3 is enabled, that's why these
builders which happen to install these updated packages started to fail
when detecting the existence of python3.6 related build dependencies.
as a fix, in d1e83082,
python%{python3_pkgversion}-{devel,setuptools,Cython} are listed as
BuildRequires to reflect this change in EPEL7. before d1e83082, we
hardwired them to python34-*.
but as following analysis puts, there are cases where `yum-builddep`
is inconsistent with `rpmbuild`. as `yum-builddep` changes the how
`python3_pkgversion` and `python3_version` macros are expanded:
- none of the packages installed by `yum-builddep` installs the python3
related rpm macros, so the system stays with whatever python3 it was
using. in this case, `rpmbuild` won't complain, as the
`python3_pkgversion` and `python_version` are consistent before and
after `yum-builddep`.
- system has python3.4 installed before `yum-builddep`. but
`yum-builddep` installed python3.6 and also the updated
`python-rpm-macros` packages, which points `python3_version` and
`python3_pkgversion` to 3.6 and 36 respectively. in this case,
`rpmbuild` will complain, because when we run `yum-builddep`,
`python3_version` was still "3.4".
- system does not have python3 installed before `yum-builddep`. so
it was using python34 for preparing the "BuildRequires". but some
of the packages installed by `yum-builddep` installs python36, and
also the updated `python-rpm-macros` packages, which points
`python3_version` and `python3_pkgversion` to 3.6 and 36 respectively.
in this case, `rpmbuild` will complain, because the python36 related
dependencies are missing. what the system has is python34
dependencies.
- system does not have python3 installed before `yum-builddep`. so
it was using python34 for preparing the "BuildRequires". but some
of the packages installed by `yum-builddep` installs python34, and
also the updated `python-rpm-macros` packages, which points
`python3_version` and `python3_pkgversion` to 3.4 and 34 respectively.
in this case, `rpmbuild` won't complain, as the
`python3_pkgversion` and `python_version` are also consistent before and
after `yum-builddep`.
as we cannot tell if the system has python3 or what the python3 version
the system has before `yum-builddep`, so what we can do is to ensure
`rpmbuild` has what it needs to build Ceph. so let's just stick with
python3.6.
to force cmake to use the python3 and python3 modules for building
python3 bindings
on the debian side, it's okay to continue using "-DWITH_PYTHON3=ON", as
- cmake does normalize "ON" to 3
- debian's cmake extension lives on /usr/lib/python3/dist-packages/
not in a specific /usr/lib/python3.x/dist-packages directory
Boris Ranto [Thu, 4 Apr 2019 20:00:55 +0000 (22:00 +0200)]
cmake: check for MAJOR.MINOR version of python3
We can only check for MAJOR.MINOR version of python3 since
FindPython3Libs does not support checking for MAJOR.MINOR.PATCH version
of python3. We also need to make sure we use the PYTHON3 versions of
these variables.
This should fix a regression introduced by c961e00.
use might have multiple python3 installed, some of them has/have all
dependencies installed and is good enough for building Ceph. we should
not always use the latest python installed in the system and complain that
there is missing dependencies, even if user has installed all the
python3 dependencies for the older python3.
put in other words, if user only installs cython module for python3.4, but
she has both python3.6 and python3.4 in her system. we should not force
her to uninstall python3.6 for installing Ceph.
this change also aligns with MGR_PYTHON_VERSION. i am not applying the
same change to WITH_PYTHON2, because python2 is already stablized. and distros
are not likely to release new python2 releases.
Jianpeng Ma [Wed, 20 Jun 2018 12:22:38 +0000 (20:22 +0800)]
os/bluestore: fix length overflow.
In fact, length of 'struct interval_t' and 'struct bluestore_pextent_t'
is uint32_t. But len of AllocatorLevel02::_mark_allocated is uint64_t.
So it may cause data overflow which cause bug.
Ilya Dryomov [Tue, 26 Feb 2019 15:21:10 +0000 (16:21 +0100)]
include/intarith: enforce the same type for p2*() arguments
x and align must be of the same width. Otherwise the lack of
sign-extension can bite, e.g.:
uint64_t x = 128000;
uint32_t align = 32768;
uint64_t res = p2roundup(x, align);
// res = 18446744069414715392
While the templates added in commit c06b97b3d7e3 ("include: Add
templates along side macros in intarith") are technically correct,
P2*() macros weren't meant to be used with different types. The
Solaris kernel (which is where they originated) has P2*_TYPED()
variants for that.