Jason Dillaman [Thu, 4 Apr 2019 01:35:40 +0000 (21:35 -0400)]
librbd: copyup should restart delayed ops against the same object
This avoids the potential for a race condition where an in-flight
copyup is removed from the in-flight copyup list and a subsequent
IO against the same object causes a second in-flight copyup.
Fixes: http://tracker.ceph.com/issues/39021 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Tue, 2 Apr 2019 15:51:36 +0000 (11:51 -0400)]
librbd: merge copyup object map update states
The object map HEAD and HEAD/snapshot update states have been
simplified and merged into a single state. This also fixes
several potential race conditions and an issue where CoR might
incorrectly mark the HEAD object has exists+dirty.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Fri, 29 Mar 2019 17:23:37 +0000 (13:23 -0400)]
librbd: properly hold snap/parent locks during IO
The ImageCtx::parent pointer was dereferenced without holding the lock
which could lead to a crash. The ImageCtx::migration_info structure
was also dereferenced without holding a lock.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Mon, 1 Apr 2019 18:48:47 +0000 (14:48 -0400)]
librbd: deep-copy object copy should register an in-flight op
When handling live migrations, the source image is the parent
image of the destination image. To prevent the parent image from
being closed while a request is in-flight, the object copy
state machine now registers an async operation with the source
image.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Sage Weil [Mon, 8 Apr 2019 14:24:09 +0000 (09:24 -0500)]
Merge PR #27386 into master
* refs/pull/27386/head:
os/filestore/FileJournal: note EIO events
os/filestore: make note of EIO errors when we see them
os/filestore: note devname for later use
global/signal_handler: avoid core dump on EIO
os/bluestore/KernelDevice: note EIO metadata on aio EIO
global: add hook to annotate crash report with EIO information
some of our centos7 jenkins builders are failing to build ceph master and
nautilus branches. because EPEL7 recently switched from python3.4 to
python3.6 as the native python3. see
https://lists.fedoraproject.org/archives/list/epel-announce@lists.fedoraproject.org/message/EGUMKAIMPK2UD5VSHXM53BH2MBDGDWMO/
and one of our BuildRequires, cmake3,
was offered by EPEL7. it also followed the python3.6 switch-over to
rebuild against python3.6. as a result, the cmake3-data-3.13.4-2.el7
started to depend on /usr/bin/python3.6, which is in turn offered by
python36 package. after installing python36 as a dependency of the
updated cmake3. but in cmake, we originally checks for the latest
python3 interpreter if WITH_PYTHON3 is enabled, that's why these
builders which happen to install these updated packages started to fail
when detecting the existence of python3.6 related build dependencies.
as a fix, in d1e83082,
python%{python3_pkgversion}-{devel,setuptools,Cython} are listed as
BuildRequires to reflect this change in EPEL7. before d1e83082, we
hardwired them to python34-*.
but as following analysis puts, there are cases where `yum-builddep`
is inconsistent with `rpmbuild`. as `yum-builddep` changes the how
`python3_pkgversion` and `python3_version` macros are expanded:
- none of the packages installed by `yum-builddep` installs the python3
related rpm macros, so the system stays with whatever python3 it was
using. in this case, `rpmbuild` won't complain, as the
`python3_pkgversion` and `python_version` are consistent before and
after `yum-builddep`.
- system has python3.4 installed before `yum-builddep`. but
`yum-builddep` installed python3.6 and also the updated
`python-rpm-macros` packages, which points `python3_version` and
`python3_pkgversion` to 3.6 and 36 respectively. in this case,
`rpmbuild` will complain, because when we run `yum-builddep`,
`python3_version` was still "3.4".
- system does not have python3 installed before `yum-builddep`. so
it was using python34 for preparing the "BuildRequires". but some
of the packages installed by `yum-builddep` installs python36, and
also the updated `python-rpm-macros` packages, which points
`python3_version` and `python3_pkgversion` to 3.6 and 36 respectively.
in this case, `rpmbuild` will complain, because the python36 related
dependencies are missing. what the system has is python34
dependencies.
- system does not have python3 installed before `yum-builddep`. so
it was using python34 for preparing the "BuildRequires". but some
of the packages installed by `yum-builddep` installs python34, and
also the updated `python-rpm-macros` packages, which points
`python3_version` and `python3_pkgversion` to 3.4 and 34 respectively.
in this case, `rpmbuild` won't complain, as the
`python3_pkgversion` and `python_version` are also consistent before and
after `yum-builddep`.
as we cannot tell if the system has python3 or what the python3 version
the system has before `yum-builddep`, so what we can do is to ensure
`rpmbuild` has what it needs to build Ceph. so let's just stick with
python3.6.
xie xingguo [Sat, 2 Mar 2019 08:23:12 +0000 (16:23 +0800)]
msg/async: add timeout for connections which are not yet ready
There could be various corner cases that may cause an async
connection stuck in the connecting stage (e.g., by manually
creating some loop back connections on the switches of our test cluster,
we can almost 100% reproduce http://tracker.ceph.com/issues/37499).
In 61b9432ef9a3847eceb96f8d5a854567c49bbf61 I try to employ the
existing keep_alive mechanism to get those stuck connections out of the
trap but it does not work if the corresponding connection
is not yet ready, since we always require the underlying connection to be
**ready** in order to send out a keep_alive message.
Fix by making a more general connecting timeout strategy.
If a connecting process can not be finished within a specific interval,
then we simply cut it off and retry.
Sage Weil [Thu, 4 Apr 2019 20:48:40 +0000 (15:48 -0500)]
os/filestore: make note of EIO errors when we see them
This is imprecise, since we can't (easily) map an EIO back to a specific
part of the device, or even (easily) tell whether it was a read or write
error. It's enough to mark a crash dump as an EIO event, though, and to
include the name of the (primary) filestore device.
Sage Weil [Thu, 4 Apr 2019 19:47:17 +0000 (14:47 -0500)]
os/bluestore/KernelDevice: note EIO metadata on aio EIO
Note that we only do this if we're about to induce a crash. If we can
pass EIO up the stack, it's up to the upper layer to handle it or trigger
its own crash if it can't.
Sage Weil [Sun, 7 Apr 2019 18:54:59 +0000 (13:54 -0500)]
common: add --log-early command line option
Sometime it is important and useful to see the logs from the bootstrap
phase where we are getting the initial configs from the monitors. Add
a command-line option --log-early to do that.
doc: Update documentation for the MANY_OBJECTS_PER_PG warning
The current documentation for the MANY_OBJECTS_PER_PG warning
states that The threshold can be raised to silence the health
warning by adjusting the mon_pg_warn_max_object_skew config
option on the monitors. It seems that this is not true (at least)
since the luminous times, and this option should be adjusted on
the managers.
I encountered this problem and I spend quite sometime injecting
the mon_pg_warn_max_object_skew to the monitors, added the option
ceph.conf and restarted the monitors several times but the warning
was not going away. I had to download the code to see what's
happening and I found out this:
$ git grep -A 3 mon_pg_warn_max_object_skew src/common/options.cc
src/common/options.cc:1480: Option("mon_pg_warn_max_object_skew", Option::TYPE_FLOAT, Option::LEVEL_ADVANCED)
src/common/options.cc-1481- .set_default(10.0)
src/common/options.cc-1482- .set_description("max skew few average in objects per pg")
src/common/options.cc-1483- .add_service("mgr"),
After I restarted the ceph-mgr service, the warning went away.
rgw: limit entries in remove_olh_pending_entries()
If there are too many entries to send in a single osd op, the osd rejects
the request with EINVAL. This error happens in follow_olh(), which means
that requests against the object logical head (requests with no version
id) can't be resolved to the current object version. In multisite, this
also causes data sync to get stuck in retries
Yingxin Cheng [Thu, 28 Feb 2019 08:40:24 +0000 (16:40 +0800)]
crimson/net: skeleton code for ProtocolV2 logic
crimson ProtocolV2 class is following a state-machine design style:
* states are defined in ProtocolV2::state_t;
* call `execute_<state_name>()` methods to trigger different states;
* V2 logics are implemented in each execute_<state_name>() methods, and
with explicit transitions to other states at the end of the execute_*;
* each state is associated with a write state defined in Protocol.h:
- none: not allowed to send;
- delay: messages can be queued, but will be delayed to send;
- open: dispatch queued message/keepalive/ack;
- drop: not send any messages, drop them all.
crimson ProtocolV2 is alike async ProtocolV2, with some considerations:
* explicit and encapsulated client/server handshake workflow.
* futurized-exception-based fault handling, which can interrupt protocol
workflow at any time in each state.
* introduced SERVER_WAIT state, meaning to wait for peer-client's socket
to reset or be replaced, and expect no further reads.
* introduced an explicit REPLACING state, async-msgr would be at the
NONE state during replacing.
Yingxin Cheng [Tue, 19 Mar 2019 14:26:59 +0000 (22:26 +0800)]
test/crimson: improved perf tool for crimson-msgr
New features:
* --jobs: start multiple client messengers from core #1 ~ #jobs;
* --core: can assign server core to get away from busy client cores;
* --rounds: a client will send <rounds>/<jobs> messages;
Improved:
* Better configuration report;
* Report individual client results plus a summary;
* Validate if CPU number is sufficient before running;
* Sleep 1 second while connecting, so it won't hurt performance;
* Simplify client logic and bug fixes;
Removed unecessary features:
* finish_decode() for MOSDOp;
* keepalive ratio;
Boris Ranto [Thu, 4 Apr 2019 20:00:55 +0000 (22:00 +0200)]
cmake: check for MAJOR.MINOR version of python3
We can only check for MAJOR.MINOR version of python3 since
FindPython3Libs does not support checking for MAJOR.MINOR.PATCH version
of python3. We also need to make sure we use the PYTHON3 versions of
these variables.
This should fix a regression introduced by c961e00.