Erwan Velu [Fri, 31 Mar 2017 12:54:33 +0000 (14:54 +0200)]
ceph-disk: Adding retry loop in get_partition_dev()
There are very rare cases where get_partition_dev() is called before the actual partition is available in /sys/block/<device>.
It appears that waiting a very short time is usually enough for the partition to be populated.
Analysis:
update_partition() is supposed to be enough to avoid any race between the events sent by parted/sgdisk/partprobe and
the actual creation of the /sys/block/<device>/* entry.
On our CI that race occurs pretty often, but it has never been possible to reproduce it locally.
This patch is more of a workaround than a fix for the real problem.
It retries after a very short delay to give the device a chance to appear.
This approach has been successful on the CI.
Note this patch does not change the timing when the device is created on time; it only adds delays of 1/5th of a second, up to 2 seconds, when the bug occurs.
A typical output from a build running on the CI with this code:
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_partition_dev: Try 1/10 : partition 2 for /dev/sda does not in /sys/block/sda
get_partition_dev: Found partition 2 for /dev/sda after 1 tries
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sda2 uuid path is /sys/dev/block/8:2/dm/uuid
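A minimal Python sketch of the retry behaviour described above (the function name, arguments, and sysfs check are illustrative, not the actual ceph-disk code):

    import os
    import time

    def wait_for_partition(dev_name, part_name, tries=10, delay=0.2):
        # Poll /sys/block/<device> until the partition entry shows up.
        # 10 tries x 0.2 s matches the "1/5th up to 2 seconds" noted above.
        sys_entry = os.path.join('/sys/block', dev_name, part_name)
        for attempt in range(1, tries + 1):
            if os.path.exists(sys_entry):
                return attempt
            time.sleep(delay)
        return None

When the entry is already present the loop returns on the first iteration, so the happy path is not slowed down.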
Erwan Velu [Wed, 22 Mar 2017 09:11:44 +0000 (10:11 +0100)]
ceph-disk: Reporting /sys directory in get_partition_dev()
When get_partition_dev() fails, it reports the following message:
ceph_disk.main.Error: Error: partition 2 for /dev/sdb does not appear to exist
The code searches for a directory inside /sys/block/get_dev_name(os.path.realpath(dev)).
The issue is that the error message does not report that path on failure, even though the path may be involved in the problem.
This patch makes the error report where the code was looking when it tried to determine whether the partition was available.
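A rough Python sketch of the improved error reporting (the helper names and the exception class are stand-ins, not the actual ceph_disk.main code):

    import os

    class Error(Exception):
        pass  # stand-in for ceph_disk.main.Error

    def get_dev_name(path):
        return os.path.basename(path)  # e.g. /dev/sdb -> sdb (simplified)

    def find_partition_dev(dev, pnum):
        # The directory the code inspects; report it if the lookup fails.
        sysfs_dir = '/sys/block/' + get_dev_name(os.path.realpath(dev))
        wanted = get_dev_name(dev) + str(pnum)
        if wanted in os.listdir(sysfs_dir):
            return '/dev/' + wanted
        raise Error('partition %d for %s does not appear to exist in %s'
                    % (pnum, dev, sysfs_dir))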
osd: Return correct osd_objectstore in OSD metadata
Do not simply report the configuration value, as it might have changed
during OSD startup; instead, read the objectstore type from disk.
Fixes: http://tracker.ceph.com/issues/18638
Signed-off-by: Wido den Hollander <wido@42on.com>
(cherry picked from commit 8fe6a0303b02ac1033f5bfced9f94350fe3e33de)
Conflicts:
src/osd/OSD.cc
- g_conf->osd_objectstore was changed to cct->_conf->osd_objectstore by 1d5e967a05ddbcceb10efe3b57e242b3b6b7eb8c which is not in kraken
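A hedged Python sketch of the idea (the real change is in src/osd/OSD.cc; the layout assumed here is the 'type' file in the OSD data directory):

    import os

    def osd_objectstore_metadata(osd_data, configured_type):
        # Prefer the objectstore type recorded on disk over the configured
        # value, which may not match what the OSD is actually running.
        try:
            with open(os.path.join(osd_data, 'type')) as f:
                return f.read().strip()
        except IOError:
            # Fall back to the configuration value if the file is unreadable.
            return configured_type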
Conflicts:
src/librbd/AioImageRequestWQ.h:
- in master this file has morphed into src/librbd/io/ImageRequestWQ.h
- kraken has AioImageRequest<ImageCtx> instead of ImageRequest<ImageCtx>
src/librbd/image/RefreshRequest.cc:
- rename image context element to "aio_work_queue" (from "io_work_queue")
because kraken doesn't have de95d862f57b56738e04d77f2351622f83f17f4a
src/test/librbd/image/test_mock_RefreshRequest.cc:
- rename image context element to "aio_work_queue" (from "io_work_queue")
because kraken doesn't have de95d862f57b56738e04d77f2351622f83f17f4a
Samuel Just [Wed, 18 Jan 2017 18:24:13 +0000 (10:24 -0800)]
PrimaryLogPG::try_lock_for_read: give up if missing
The only users, calc_*_subsets, might try to read_lock an object which is
missing on the primary. Returning false in those cases is perfectly
reasonable and avoids the problem.
Samuel Just [Wed, 23 Nov 2016 23:41:13 +0000 (15:41 -0800)]
ReplicatedBackend: take read locks for clone sources during recovery
Otherwise, we run the risk of a clone source which hasn't actually
come into existence yet being used if we grab a clone which *just*
got added to the ssc, but has not yet actually had time to be
created (can't rely on message ordering here since recovery messages
don't necessarily order with client IO!).
Sage Weil [Fri, 31 Mar 2017 14:06:42 +0000 (10:06 -0400)]
ceph_test_librados_api_misc: fix stupid LibRadosMiscConnectFailure.ConnectFailure test
Sometimes the cond doesn't time out and it wakes up instead. Just repeat
the test many times to ensure that at least once it times out (usually
it does; it's pretty infrequent that it doesn't).
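The pattern being applied, shown as a conceptual Python sketch (the real test is C++ gtest code; the attempt function is hypothetical):

    import time

    def assert_eventually_times_out(attempt_connect, timeout=1.0, repeats=8):
        # A spurious wakeup can make a single timed wait return early, so
        # run the operation several times and require that at least one
        # run lasts the full timeout.
        for _ in range(repeats):
            start = time.time()
            attempt_connect(timeout)
            if time.time() - start >= timeout:
                return
        raise AssertionError('never observed a timeout in %d runs' % repeats)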
Merge pull request #16069 from smithfarm/wip-20345-kraken
kraken: make check fails with Error EIO: load dlopen(build/lib/libec_FAKE.so): build/lib/libec_FAKE.so: cannot open shared object file: No such file or directory
qa/workunits/ceph-helpers: do not error out if is_clean
It would be a race otherwise, because we cannot be sure whether the cluster
PGs are all clean when run_osd() returns, but we can be sure that
they are expected to become active+clean after a while. That's what
wait_for_clean() does.
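The polling idea, as a small Python sketch (the real wait_for_clean() is a shell helper in qa/workunits/ceph-helpers.sh; the callable here is a stand-in for the PG state query):

    import time

    def wait_for_clean(all_pgs_active_clean, timeout=300, interval=5):
        # Poll instead of asserting cleanliness right after run_osd():
        # the PGs are only *expected* to reach active+clean after a while.
        deadline = time.time() + timeout
        while time.time() < deadline:
            if all_pgs_active_clean():
                return
            time.sleep(interval)
        raise RuntimeError('cluster did not become clean within %ss' % timeout)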
Nathan Cutler [Fri, 23 Jun 2017 06:27:42 +0000 (08:27 +0200)]
tests: move swift.py task to qa/tasks
In preparation for moving this task from ceph/teuthology.git into ceph/ceph.git.
The move is necessary because jewel-specific changes are needed, yet teuthology
does not maintain a separate branch for jewel. Also, swift.py is a
Ceph-specific task so it makes more sense to have it in Ceph.
Sage Weil [Tue, 30 May 2017 01:55:33 +0000 (21:55 -0400)]
os/bluestore: deep decode onode value
In particular, we want the attrs (map<string,bufferptr>) to be deep-decoded
so that we do not pin the source buffer, and so that any changed attr
will free the previous memory.
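The shallow-versus-deep distinction can be illustrated with a small Python analogy (purely conceptual; the real code is C++ bufferlist decoding in BlueStore):

    def shallow_decode(buf, offsets):
        # Each attr is a view into buf: the whole source buffer stays
        # pinned for as long as any attr is alive.
        return {name: memoryview(buf)[start:end]
                for name, (start, end) in offsets.items()}

    def deep_decode(buf, offsets):
        # Each attr gets its own copy, so buf can be released and
        # replacing one attr frees only that attr's memory.
        return {name: bytes(buf[start:end])
                for name, (start, end) in offsets.items()}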
test/librbd/test_notify.py: don't disable feature in slave
On jewel it will have stolen the exclusive lock. Instead, ensure that
object map and fast diff are already disabled on the clone before the
start of the test.
Marcus Watts [Wed, 11 Jan 2017 05:06:15 +0000 (00:06 -0500)]
radosgw/swift: clean up flush / newline behavior.
The current code emits a newline after swift errors, but fails
to account for it when it calculates 'content-length'. This results in
some clients (e.g. the Go client github.com/ncw/swift) producing complaints
about the unsolicited newline, such as this:
Unsolicited response received on idle HTTP channel starting with "\n"; err=<nil>
This logic eliminates the newline on flush. This makes the content length
calculation correct and eliminates the stray newline.
There was already separator logic in the rgw plain formatter
that can emit a newline at the correct point. It had been checking
"len" to decide whether previous data had been emitted, but that value
is reset to 0 by flush(). So this logic adds a new per-instance variable
to separately track whether a previous item has been emitted (and a
newline should be emitted).
Fixes: http://tracker.ceph.com/issues/18473
Signed-off-by: Marcus Watts <mwatts@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 5f229d6a33eae4906f22cdb90941835e47ee9f02)
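A conceptual Python sketch of the separator-state fix (the real code is the C++ plain formatter in rgw; names here are made up):

    class PlainFormatterSketch(object):
        def __init__(self):
            self.buf = []
            self.len = 0
            self.wrote_item = False   # survives flush(), unlike self.len

        def write_item(self, item):
            # Emit the separator only if a previous item was written,
            # even if that item has already been flushed.
            if self.wrote_item:
                self.buf.append('\n')
            self.buf.append(item)
            self.wrote_item = True
            self.len += len(item)

        def flush(self):
            data = ''.join(self.buf)
            self.buf = []
            self.len = 0              # why len alone cannot be the flag
            return data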
Boris Ranto [Thu, 16 Feb 2017 10:34:27 +0000 (11:34 +0100)]
ceph-disk: Add more fix targets
It turns out I forgot several more directories that need to be fixed by
this script. We need to fix /var/log/ceph, /var/run/ceph and /etc/ceph
as well.
Boris Ranto [Thu, 9 Feb 2017 18:17:12 +0000 (19:17 +0100)]
ceph-disk: Add unit test for fix command
This will mock the command* functions so that nothing is actually run,
thus exercising the Python code directly. It also checks that the
proper sub-strings are in the output.
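A self-contained Python sketch of that testing approach, using unittest.mock in place of the real ceph-disk helpers (the command() and fix() functions below are toy stand-ins):

    import unittest
    from unittest import mock

    def command(args):
        raise AssertionError('should be mocked out in the test')

    def fix(paths=('/var/lib/ceph',)):
        # Toy stand-in for the fix subcommand: restorecon + chown per path.
        out = []
        for path in paths:
            out.append(command(['restorecon', '-R', path]))
            out.append(command(['chown', '-R', 'ceph:ceph', path]))
        return ' '.join(out)

    class TestFixSketch(unittest.TestCase):
        @mock.patch(__name__ + '.command')
        def test_fix_reports_commands(self, mock_command):
            # The mock records calls instead of actually running anything.
            mock_command.side_effect = lambda args: ' '.join(args)
            out = fix()
            self.assertIn('restorecon', out)
            self.assertIn('chown', out)

    if __name__ == '__main__':
        unittest.main()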
Boris Ranto [Tue, 31 Jan 2017 12:19:33 +0000 (13:19 +0100)]
ceph-disk: Add fix subcommand
This subcommand will fix the SELinux labels and/or file permissions on
ceph data (/var/lib/ceph).
The command is also optimized to run the commands in parallel (per
sub-dir in /var/lib/ceph) and do restorecon and chown at the same time
to take advantage of the caching mechanisms.
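A minimal Python sketch of that parallel layout (the commands, owner, and directory walk are illustrative; the real subcommand is part of ceph-disk):

    import os
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    def fix_one(path):
        # Relabel and re-own one sub-directory back to back so the
        # dentry/inode caches are still warm for the second pass.
        subprocess.check_call(['restorecon', '-R', path])
        subprocess.check_call(['chown', '-R', 'ceph:ceph', path])

    def fix_parallel(base='/var/lib/ceph'):
        subdirs = [os.path.join(base, d) for d in os.listdir(base)] or [base]
        with ThreadPoolExecutor() as pool:
            # One worker per sub-directory of /var/lib/ceph.
            list(pool.map(fix_one, subdirs))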
Sage Weil [Fri, 5 May 2017 20:48:25 +0000 (16:48 -0400)]
messages/MCommand: fix type on decode
Wow, this has been broken since v0.38, but apparently
the message never made it into the object corpus so
we never noticed!
In reality the bug is harmless: decode_message() will
set_header which clobbers whatever version the default
ctor fills in, so this only affects ceph-dencoder's
test.
Jason Dillaman [Tue, 2 May 2017 01:06:19 +0000 (21:06 -0400)]
cls_rbd: default initialize snapshot namespace for legacy clients
Creating a snapshot on >=Kraken OSDs using <=Jewel clients can result
in an improperly initialized snapshot namespace. As a result, attempting
to remove the snapshot using a >=Kraken client will result in an -EINVAL
error.
Fixes: http://tracker.ceph.com/issues/19413
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 03b0b03071f3e04754896664c69f73759ddb907a)