test/librados/tmap_migrate: g_ceph_context->put() upon return
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflict: test/librados/tmap_migrate.cc
this change is not cherry-picked from master, because tmap_migrate was
removed in master. so we are applying the same change in cb1cda96713b2ec0f6418c4cbe3d964c2020729c to this test.
ceph-disk: enable directory backed OSD at boot time
https://github.com/ceph/ceph/commit/539385b143feee3905dceaf7a8faaced42f2d3c6
introduced a regression preventing directory backed OSD from starting at
boot time.
For device backed OSD the boot sequence starts with ceph-disk@.service
and proceeds to
systemctl enable --runtime ceph-osd@.service
where the --runtime ensure ceph-osd@12 is removed when the machine
reboots so that it does not compete with the ceph-disk@/dev/sdb1 unit at
boot time.
However directory backed OSD solely rely on the ceph-osd@.service unit
to start at boot time and will therefore fail to boot.
The --runtime flag is selectively set for device backed OSD only.
osd/ReplicatedPG: limit the number of concurrently trimming pgs
This patch introduces an AsyncReserver for snap trimming to limit the
number of pgs on any single OSD which can be trimming, as with backfill.
Unlike backfill, we don't take remote reservations on the assumption
that the set of pgs with trimming work to do is already well
distributed, so it doesn't seem worth the implementation overhead to get
reservations from the peers as well.
Sage Weil [Fri, 3 Mar 2017 03:20:08 +0000 (21:20 -0600)]
osdc/Objecter: resend RWORDERED ops on full
Our condition for respecting the FULL flag is complex, and involves
the WRITE | RWORDERED flags vs the FULL_FORCE | FULL_TRY flags. Previously,
we could block a read bc of RWORDRED but not resend it later.
Fix by capturing the complex condition in a respects_full() bool and using
it both for the blocking-on-send and resending-on-possibly-notfull-later
checks.
Conflicts:
debian/ceph-common.install (retain /etc/init.d/rbdmap so jewel users can choose sysvinit or systemd)
debian/rules (retain /etc/init.d/rbdmap so jewel users can choose sysvinit or systemd)
qa/tasks/workunit.py: use "overrides" as the default settings of workunit
otherwise the settings in "workunit" tasks are always overridden by the
settings in template config. so we'd better follow the way of how
"install" task updates itself with the "overrides" settings: it uses the
"overrides" as the *defaults*.
Kefu Chai [Thu, 30 Mar 2017 04:37:01 +0000 (12:37 +0800)]
tasks/workunit.py: specify the branch name when cloning a branch
c1309fb failed to specify a branch when cloning using --depth=1, which
by default clones the HEAD. and we can not "git checkout" a specific
sha1 if it is not HEAD, after cloning using '--depth=1', so in this
change, we dispatch "tag", "branch", "HEAD" using three Refspec classes.
Dan Mick [Wed, 29 Mar 2017 03:08:13 +0000 (20:08 -0700)]
tasks/workunit.py: when cloning, use --depth=1
Help avoid killing git.ceph.com. A depth 1 clone takes about
7 seconds, whereas a full one takes about 3:40 (much of it
waiting for the server to create a huge compressed pack)
build/ops: rpm: explicitly provide --with-ocf to configure
Fixes: http://tracker.ceph.com/issues/19546 Signed-off-by: Nathan Cutler <ncutler@suse.com>
(Note: This cannot be cherry-picked because master uses cmake, but
the fix does bring the jewel spec file into better alignment its master
counterpart, at least as far as this one little bit is concerned.)
Erwan Velu [Fri, 31 Mar 2017 12:54:33 +0000 (14:54 +0200)]
ceph-disk: Adding retry loop in get_partition_dev()
There is very rare cases where get_partition_dev() is called before the actual partition is available in /sys/block/<device>.
It appear that waiting a very short is usually enough to get the partition beein populated.
Analysis:
update_partition() is supposed to be enough to avoid any racing between events sent by parted/sgdisk/partprobe and
the actual creation on the /sys/block/<device>/* entrypoint.
On our CI that race occurs pretty often but trying to reproduce it locally never been possible.
This patch is almost a workaround rather than a fix to the real problem.
It offer retrying after a very short to be make a chance the device to appear.
This approach have been succesful on the CI.
Note his patch is not changing the timing when the device is perfectly created on time and just differ by a 1/5th up to 2 seconds when the bug occurs.
A typical output from the build running on a CI with that code.
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_partition_dev: Try 1/10 : partition 2 for /dev/sda does not in /sys/block/sda
get_partition_dev: Found partition 2 for /dev/sda after 1 tries
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sda2 uuid path is /sys/dev/block/8:2/dm/uuid
Erwan Velu [Wed, 22 Mar 2017 09:11:44 +0000 (10:11 +0100)]
ceph-disk: Reporting /sys directory in get_partition_dev()
When get_partition_dev() fails, it reports the following message :
ceph_disk.main.Error: Error: partition 2 for /dev/sdb does not appear to exist
The code search for a directory inside the /sys/block/get_dev_name(os.path.realpath(dev)).
The issue here is the error message doesn't report that path when failing while it might be involved in.
This patch is about reporting where the code was looking at when trying to estimate if the partition was available.
Matt Benjamin [Tue, 28 Feb 2017 20:49:06 +0000 (15:49 -0500)]
rgw_file: use fh_hook::is_linked() to check residence
Previously we assumed that !deleted handles were resident--there
is an observed case where a !deleted handle is !linked. Since
we currently use safe_link mode, an is_linked() check is
available, and exhaustive.
Fixes: http://tracker.ceph.com/issues/19111 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit c0aa515f8d8c57ec5ee09e3b60df3cac60453c40)
Matt Benjamin [Wed, 1 Mar 2017 01:24:12 +0000 (20:24 -0500)]
rgw_file: RGWFileHandle dtor must also cond-unlink from FHCache
Formerly masked in part by the reclaim() action, direct-delete now
substitutes for reclaim() iff its LRU lane is over its high-water
mark, and in particular, like reclaim() the destructor is certain
to see handles still interned on the FHcache when nfs-ganesha is
recycling objects from its own LRU.
Fixes: http://tracker.ceph.com/issues/19112 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit d51a3b1224ba62bb53c6c2c7751fcf7853c35a4b)
Matt Benjamin [Tue, 17 Jan 2017 16:23:45 +0000 (11:23 -0500)]
rgw_file: split last argv on ws, if provided
This is intended to allow an "extra" unparsed argument string
containing various cmdline options to be passed as the last argument
in the argv array of librgw_create(), which nfs-ganesha is
expecting to happen.
Matt Benjamin [Mon, 13 Feb 2017 01:18:26 +0000 (20:18 -0500)]
rgw_file: fix hiwat behavior
Removed logic to skip reclaim processing conditionally on hiwat,
this probably meant to be related to a lowat value, which does
not exist.
Having exercised the hiwat reclaim behavior, noticed that the
path which moves unreachable objects to LRU, could and probably
should remove them altogether when q.size exceeds hiwat. Now
the max unreachable float is lane hiwat, for all lanes.
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit b8791b2217e9ca87b2d17b51f283fa14bd68b581) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Matt Benjamin [Sun, 12 Feb 2017 23:20:43 +0000 (18:20 -0500)]
rgw_file: refcnt bugfixes
This change includes 3 related changes:
1. add required lock flags for FHCache updates--this is a crash
bug under concurrent update/lookup
2. omit to inc/dec refcnt on root filehandles in 2 places--the
root handle current is not on the lru list, so it's not
valid to do so
3. based on observation of LRU behavior during creates/deletes,
update (cohort) LRU unref to move objects to LRU when their
refcount falls to SENTINEL_REFCNT--this cheaply primes the
current reclaim() mechanism, so very significanty improves
space use (e.g., after deletes) in the absence of scans
(which is common due to nfs-ganesha caching)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit beaeff059375b44188160dbde8a81dd4f4f8c6eb) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Matt Benjamin [Sat, 11 Feb 2017 21:38:05 +0000 (16:38 -0500)]
rgw_file: add refcount dout traces at debuglevel 17
These are helpful for checking RGWFileHandle refcnt invariants.
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 462034e17f919fb783ee33e2c9fa8089f93fd97d) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Matt Benjamin [Fri, 10 Feb 2017 22:14:16 +0000 (17:14 -0500)]
rgw_file: add pretty-print for RGWFileHandle
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit ef330f385d3407af5f470b5093145f59cc4dcc79) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Matt Benjamin [Mon, 20 Feb 2017 20:05:18 +0000 (15:05 -0500)]
rgw_file: fix marker computation
Fixes: http://tracker.ceph.com/issues/19018 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 4454765e7dd08535c50d24205858e18dba4b454c)