Kefu Chai [Tue, 3 Jan 2017 12:40:00 +0000 (20:40 +0800)]
ceph-disk: convert non-str to str before printing it
Error('somethings goes wrong', e) is raised when exception `e` is caught
in ceph-disk, where e is not a string, so we cannot simply concatenate it
in Error's __str__(). Cast it to str before doing so.
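The failure mode can be sketched as follows. This is a minimal illustration, not the actual ceph-disk code: the Error class is a simplified stand-in and describe() is a hypothetical helper added only for the example.

```python
class Error(Exception):
    """Error"""

    def __str__(self):
        doc = self.__doc__.strip()
        # Cast every argument to str first: str.join() raises TypeError
        # when any item (for example a caught exception) is not a string.
        return ': '.join([doc] + [str(a) for a in self.args])


def describe(exc):
    # Hypothetical helper: wrap a caught exception the way the commit
    # message describes and render the result as text.
    return str(Error('something goes wrong', exc))
```

Without the str() cast, joining the docstring with a ValueError instance would raise TypeError instead of producing a message.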
rgw: fix handling RGWUserInfo::system in RGWHandler_REST_SWIFT.
Before this patch the flag was wrongly handled in the Swift API
implementation. In rare conditions this might result in setting
req_state::system_request.
This may happen only if both of the following conditions are fulfilled:
* RadosGW is running in a multi-site configuration (at least
one user with the system flag turned on is present),
* the "rgw_swift_account_in_url" configurable has been switched
to true. The value is false by default and our documentation
doesn't actually mention the option.
The issue affects neither Jewel nor any previous release.
Michal Jarzabek [Thu, 12 Jan 2017 21:22:20 +0000 (21:22 +0000)]
client/Client.cc: prevent segfaulting
The segfault in the rmdir function is caused by calling the
filepath::last_dentry() function.
last_dentry() assumes that the bits vector always has at
least one element, which is not the case for the filepath object
created with "/" as input.
This commit also fixes other functions affected by this bug:
link, unlink, rename, mkdir, mknod and symlink.
Fixes: http://tracker.ceph.com/issues/9935
Signed-off-by: Michal Jarzabek <stiopa@gmail.com>
(cherry picked from commit 6ed7f2364ae5507bab14c60b582929aa7b0ba400)
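The problem can be sketched in Python as follows. FilePath and rmdir_target() are hypothetical stand-ins for the C++ filepath class and its callers, and returning None is only a placeholder; how the real fix reports the error is not shown here.

```python
class FilePath:
    """Hypothetical stand-in for the C++ filepath class."""

    def __init__(self, path):
        # "/" splits into empty strings only, so bits ends up empty.
        self.bits = [p for p in path.split('/') if p]

    def depth(self):
        return len(self.bits)

    def last_dentry(self):
        # Assumes at least one component; on an empty list this raises
        # IndexError (the C++ equivalent is undefined behaviour).
        return self.bits[-1]


def rmdir_target(path):
    fp = FilePath(path)
    if fp.depth() == 0:
        # Guard before touching last_dentry(); placeholder for an error.
        return None
    return fp.last_dentry()
```

The same depth check would apply to any caller that needs the last path component.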
Xiaoxi Chen [Wed, 11 Jan 2017 02:11:08 +0000 (19:11 -0700)]
mds/server: skip unwanted dn in handle_client_readdir
We can skip unwanted dn that are < (offset_key, snap) via map.lower_bound, rather than
iterating across them.
Previously we iterated and skipped dn that are < (offset_key, dn->last). Since dn->last >= snap
implies (offset_key, dn->last) >= (offset_key, snap), and the iterate-and-skip logic
is still kept, this commit doesn't change the code logic; it is only an optimization.
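The difference between the two approaches can be sketched with Python's bisect module standing in for std::map::lower_bound. The function names and the sorted list of (name, last) tuples are assumptions made for illustration, not the MDS data structures.

```python
from bisect import bisect_left


def first_wanted_scan(keys, offset_key, snap):
    # Old approach: walk from the start, skipping every smaller key.
    i = 0
    while i < len(keys) and keys[i] < (offset_key, snap):
        i += 1
    return i


def first_wanted_lower_bound(keys, offset_key, snap):
    # New approach: binary search for the first key that is not
    # smaller than (offset_key, snap), like map.lower_bound.
    return bisect_left(keys, (offset_key, snap))
```

Both functions return the same index on a sorted key list; the second avoids visiting the skipped entries one by one.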
Sage Weil [Tue, 17 Jan 2017 15:20:07 +0000 (10:20 -0500)]
os/bluestore: return blocks allocated from allocate()
Instead of having a separate output argument with the number of
blocks allocated, just return it via the return value. Simplifies
the calling convention.
Sage Weil [Tue, 17 Jan 2017 15:56:13 +0000 (10:56 -0500)]
os/bluestore: manage vector from ExtentList
ExtentList was previously relying on the caller to preallocate/size the
vector to be large enough for the worst case allocation of extents,
and keeping its own manual count of the extent list size. Instead,
manage that from ExtentList, and remove the preallocation from the
callers.
John Spray [Fri, 13 Jan 2017 00:30:28 +0000 (00:30 +0000)]
client: populate metadata during mount
This way we avoid having to over-write the "root"
metadata during mount, and any user-set overrides (such
as bad values injected by tests) will survive.
Because Client instances may also open sessions without
mounting to send commands, add a call into populate_metadata
from mds_command as well.
Fixes: http://tracker.ceph.com/issues/18361
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 1dbff09ad553f9ff07f4f4217ba7ece6c2cdc5d2)
Casey Bodley [Wed, 21 Dec 2016 19:32:04 +0000 (14:32 -0500)]
rgw: RGWMetaSyncShardCR drops stack refs on destruction
If the coroutine is canceled before collect_children() can clean up
all of its child stacks, those stack refs will leak. Store these
stacks as boost::intrusive_ptr so the ref is dropped automatically on
destruction.
Matt Benjamin [Fri, 6 Jan 2017 17:30:42 +0000 (12:30 -0500)]
rgw_rados: add guard assert in add_io()
Use the iterator-returning insert operation in std::map and
assert the insert case. As a side effect, this makes the use of the
inserted object record clearer.
Samuel Just [Thu, 12 Jan 2017 20:44:44 +0000 (12:44 -0800)]
Objecter: resend pg commands on interval change
mark_lost_unfound* are now async since the rework, so we need
the Objecter to be able to resend on interval change. This
is preferable to somehow requeueing the Commands because they
don't use the normal op queue.
Fixes: http://tracker.ceph.com/issues/18358
Signed-off-by: Samuel Just <sjust@redhat.com>
Jason Dillaman [Tue, 3 Jan 2017 19:51:14 +0000 (14:51 -0500)]
librbd: add new lock_get_owners / lock_break_lock API methods
If the client application supports failover, let the application
force break the current lock and blacklist the owner. This is
required in case the current lock owner is alive from the point-of-view
of librbd but failover was required due to a higher level reason.
Fixes: http://tracker.ceph.com/issues/18327
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 9a5a8c75a025143cee6f92f3dbc3a12f2b6a9ad7)
Jason Dillaman [Thu, 22 Dec 2016 20:00:23 +0000 (15:00 -0500)]
librbd: separate break lock logic into standalone state machine
The current lockers are now queried before the lock is attempted to
prevent any possible race conditions when one or more clients attempt
to break the lock of a dead client.
Sage Weil [Fri, 30 Dec 2016 17:22:42 +0000 (12:22 -0500)]
os/bluestore/BlueFS: fix reclaim_blocks
We need to return all extents to the caller. The current code
fails to assign *offset so it appears like a single extent from
the start of the device, which is very wrong.
Nathan Cutler [Thu, 5 Jan 2017 20:22:58 +0000 (21:22 +0100)]
tests: override yaml to set client pid file to empty string
Due to http://tracker.ceph.com/issues/18309 the pid file for fuse clients
should always be set to the empty string. (Teuthology's default ceph.conf
sets it to /var/run/ceph/$cluster-$name.pid)
This commit adds a reusable yaml facet for this purpose.
Samuel Just [Tue, 3 Jan 2017 18:50:22 +0000 (10:50 -0800)]
PrimaryLogPG: don't update digests for objects with mismatched names
I've only seen this on one cluster, but let's not issue repops during
scrub on objects where the object_info_t::soid value is not correct.
The cluster in question has been through many different non-release
kernels and osd versions, so the objects presumably came about due to an
old xfs or filestore bug. They recently became fatal since we made
filestore crash on ENOENT for setattrs. In the past, the cluster just
silently tolerated them.
http://tracker.ceph.com/issues/18409 is a larger feature to detect these
better and repair them automatically.
Related: http://tracker.ceph.com/issues/18409
Signed-off-by: Samuel Just <sjust@redhat.com>
huanwen ren [Tue, 27 Dec 2016 10:54:45 +0000 (10:54 +0000)]
mon/OSDMonitor: fixup sortbitwise flag warning
"ceph -s" does not report a warning when the
command "ceph osd unset sortbitwise" is used to drop
the sortbitwise flag.
We should use "osdmap.get_up_osd_features() &
CEPH_FEATURE_OSD_BITWISE_HOBJ_SORT"
instead of "(osdmap.get_features(CEPH_ENTITY_TYPE_OSD, NULL) &
CEPH_FEATURE_OSD_BITWISE_HOBJ_SORT)",
because osdmap.get_features only gets local "features".
Sage Weil [Thu, 22 Dec 2016 18:05:22 +0000 (13:05 -0500)]
qa/tasks/workunit: clear clone dir before retrying checkout
If we check out ceph-ci.git, and don't find a branch,
we'll try again from ceph.git. But the checkout will
already exist and the clone will fail, so we'll still
fail to find the branch.
The same can happen if a previous workunit task already
checked out the repo.
Fix by removing the repo before checkout (the first and
second times). Note that this may break if there are
multiple workunit tasks running in parallel on the same
role. That is already racy, so if it's happening, we'll
want to switch to using a truly unique clonedir for each
instantiation.
Fixes: http://tracker.ceph.com/issues/18336
Signed-off-by: Sage Weil <sage@redhat.com>
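The retry logic described above can be sketched as follows. This is an illustration under stated assumptions, not the workunit task itself: checkout_with_fallback() and fake_checkout() are hypothetical names, and a local directory stands in for the clone directory on the remote host.

```python
import os
import shutil


def checkout_with_fallback(clonedir, repos, try_checkout):
    # Try each repository in order, always starting from a clean
    # clone directory so a leftover checkout cannot break the clone.
    for repo in repos:
        shutil.rmtree(clonedir, ignore_errors=True)
        if try_checkout(repo, clonedir):
            return repo
    return None


def fake_checkout(repo, clonedir):
    # Hypothetical stand-in for clone + checkout: like a clone, it
    # fails if the target directory already exists, and the branch is
    # assumed to be present only in ceph.git.
    os.mkdir(clonedir)
    return repo == 'ceph.git'
```

Calling checkout_with_fallback(clonedir, ['ceph-ci.git', 'ceph.git'], fake_checkout) succeeds on the second repository even when clonedir is left over from an earlier task. As the commit message notes, removing a shared directory remains racy if several tasks use the same path in parallel.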
It so happens that it's not safe to assume the monmap will be in an
empty state upon decoding.
Turns out the MonClient will reuse the MonMap instance when decoding
the just received map from the monitors. Should the monitors be on an
older version that does not support 'mon_info', this field will not be
decoded (after all, there's no field to decode from); but by this time,
the MonClient would already have a built monmap, which could have
populated 'mon_info' with temporary mon names from 'mon initial
members'.
Given the existing entries in 'mon_info', and the conflicting entries in
'mon_addr', we would end up asserting in 'sanitize_mons()'. This becomes
a non-issue if 'mon_info' is empty, as was unfortunately presumed.
Fixes: http://tracker.ceph.com/issues/18265
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
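The stale-state hazard can be sketched as follows. MonMapSketch is a hypothetical Python stand-in, the dict payload is an assumed encoding, and resetting both fields in decode() is just one way to avoid the problem; the actual change to MonMap is not shown here.

```python
class MonMapSketch:
    """Hypothetical stand-in for a map object reused across decodes."""

    def __init__(self):
        self.mon_addr = {}
        self.mon_info = {}

    def decode(self, payload):
        # Reset both fields on every decode. If mon_info were only
        # assigned when the payload carries it, entries built earlier
        # (e.g. temporary names) would survive a decode from an older
        # peer and conflict with the freshly decoded mon_addr.
        self.mon_addr = dict(payload.get('mon_addr', {}))
        self.mon_info = dict(payload.get('mon_info', {}))
```

With this shape, decoding a payload that lacks 'mon_info' leaves the field empty rather than carrying over whatever the reused instance held before.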