Danny Al-Gaaf [Tue, 26 Feb 2013 10:25:49 +0000 (11:25 +0100)]
KeyValueDBMemory.cc: use empty() instead of size() == 0
Use empty() since it should be prefered as it has, following the
standard, a constant time complexity regardless of the containter
type. The same is not guaranteed for size().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf [Tue, 26 Feb 2013 10:19:25 +0000 (11:19 +0100)]
ReplicatedPG.cc: use empty() instead of size() == 0
Use empty() since it should be prefered as it has, following the
standard, a constant time complexity regardless of the containter
type. The same is not guaranteed for size().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf [Tue, 26 Feb 2013 10:16:56 +0000 (11:16 +0100)]
FileStore.cc: use if(!empty()) instead of if(size())
Use empty() since it should be prefered as it has, following the
standard, a constant time complexity regardless of the containter
type. The same is not guaranteed for size().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf [Tue, 26 Feb 2013 10:15:35 +0000 (11:15 +0100)]
mon_store_converter.cc: use empty() instead of size()
Use empty() since it should be prefered as it has, following the
standard, a constant time complexity regardless of the containter
type. The same is not guaranteed for size().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf [Tue, 26 Feb 2013 10:13:45 +0000 (11:13 +0100)]
Paxos.cc: use empty() instead of size()
Use empty() since it should be prefered as it has, following the
standard, a constant time complexity regardless of the containter
type. The same is not guaranteed for size().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf [Mon, 25 Feb 2013 16:08:14 +0000 (17:08 +0100)]
Monitor.h: use empty() instead of !size()
Use empty() since it should be prefered as it has, following the
standard, a constant time complexity regardless of the containter
type. The same is not guaranteed for size().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf [Mon, 25 Feb 2013 16:06:36 +0000 (17:06 +0100)]
Monitor.cc: use empty() instead of size()
Use empty() since it should be prefered as it has, following the
standard, a constant time complexity regardless of the containter
type. The same is not guaranteed for size().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Gary Lowell [Mon, 11 Feb 2013 06:21:52 +0000 (22:21 -0800)]
configure.ac: Add test for c++ compiler.
The AC_PROG_CXX macro sets a flag if a C++ compiler is found
but does not fail if one is not found, it left to application
to test the flags as needed. This fix will issue an error
when a c++ compiler is not found. Bug 3955.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
it's not installed, this fix adds an error message for a
Dan Mick [Mon, 26 Nov 2012 21:43:13 +0000 (13:43 -0800)]
test_lock_fence.sh, rbdrw.py: rbd lock/fence test
qa/workunits/rbd/test_lock_fence.sh runs using test/rbdrw.py
rbdrw.py creates an image, locks it, and runs an I/O loop;
test_lock_fence.sh runs it, waits, and then blacklists that client,
which causes rbdrw.py to get ESHUTDOWN on operations thereafter.
Currently doesn't work with rbd caching enabled.
rbd.py gets new exception type for ESHUTDOWN
Fixes: #3190 Signed-off-by: Dan Mick <dan.mick@inktank.com> Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Sage Weil [Sat, 23 Feb 2013 17:01:07 +0000 (09:01 -0800)]
mon: PaxosService: remove lingering uses of paxos getters and wait methods
We should use the PaxosServices getters, setters, and wait methods when and
wherever possible. These must have fallen through the cracks during the
merge.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Josh Durgin [Mon, 25 Feb 2013 23:55:36 +0000 (15:55 -0800)]
test_rbd: close image before removing it
This error was masked before by watch notify not differentiating
between watches from the same client with different cookies.
Reopen the image at the end of this test so teardown works.
Josh Durgin [Mon, 25 Feb 2013 22:55:34 +0000 (14:55 -0800)]
systest: fix race with pool deletion
The second test have pool deletion and object listing wait on the same
semaphore to connect and start. This led to errors sometimes when the
pool was deleted before it could be opened by the listing process. Add
another semaphore so the pool deletion happens only after the listing
has begun.
Josh Durgin [Mon, 25 Feb 2013 19:33:48 +0000 (11:33 -0800)]
librbd: drop snap_lock before invalidating cache
Writeback will take the snap_lock, so read everything we need under it
before invalidating the cache. This avoids a recursive lock when writeback
uses snap_lock while snap_rollback() was holding it.
Remove a not-very-useful debugging message that depended on snap_lock being held.
Sage Weil [Sun, 24 Feb 2013 00:36:36 +0000 (16:36 -0800)]
mds: reencode MDSMap in MMDSMap if MDSENC feature is not present
In some cases the MMDSMap message from mon -> client passes from leader ->
peon -> client, and the leader doesn't encode with the correct feature
bits. As with MMOSDMap, we reencode the nested MDSMap based on the
features if relevant bits are not present.
We forgot to include this with the mds encoding changes.
which is what we are shoving into statvfs, but we have the b_size and
fr_size arithmetic swapped. However, doing the *correct* reporting would
then break the old stat by making both sizes appear to be 4KB (or
whatever).
Sidestep the issue by making *both* values 4MB.. which is both large enough
to report large FS sizes, and also the default stripe size and thus a
"reasonable" value to report for a block size.
Perhaps in the future, when we no longer care about old userland, we can
report the page size for f_bsize, which is probably the "most correct"
thing to do.
Fixes: #3794. See also #3793. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>
Josh Durgin [Fri, 22 Feb 2013 07:31:21 +0000 (23:31 -0800)]
objecter: don't resend linger ops unnecessarily
recalc_linger_op_target() was checking and then setting
linger_op->pgid and linger_op->active, but these were only set by
recalc_linger_op_target(). This was only called by handle_osd_map(),
so the first osdmap after a watch was established would cause a resend
of the watch. Analogous to the normal Op, set this information by
calling recalc_linger_op_target in send_linger().
Dan Mick [Fri, 22 Feb 2013 05:41:25 +0000 (21:41 -0800)]
configuration parsing: give better error for missing =
A ceph.conf line with "key" and no "= value" currently shows
"unexpected character while parsing putative key value,
at char N line M". There's no reason it can't be clearer.
Fixes: #4229 Signed-off-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
Samuel Just [Thu, 21 Feb 2013 23:31:36 +0000 (15:31 -0800)]
PG::proc_replica_log: adjust oinfo.last_complete based on omissing
Otherwise, search_for_missing may neglect to check the missing
set for some objects assuming that if the need version is
prior to last_complete, the replica must have it.
Fixes: #4994 Signed-off-by: Samuel Just <sam.just@inktank.com>
Greg Farnum [Thu, 21 Feb 2013 22:30:42 +0000 (14:30 -0800)]
MDS: remove a few other unnecessary is_base() checks
We should let users remove xattrs as well as set them. ;) And
the check in handle_client_setlayout was totally useless -- perhaps
intended for setdirlayout?
Greg Farnum [Thu, 21 Feb 2013 22:21:08 +0000 (14:21 -0800)]
mds: allow xattrs on the root inode
This was previously disallowed because Once Upon a Time, the root
inode wasn't persisted to disk and was an entirely in-memory construct. But
it's safe now, and has been for a while.
Greg Farnum [Thu, 21 Feb 2013 17:22:00 +0000 (09:22 -0800)]
mds: use inode_t::layout for dir layout policy
This cherry-pick is going in the reverse direction of normal. That's
because this direction makes for the minimal change -- this patchset
is required to fix the loss of directory layouts we were previously
seeing, but fixing it requires changing the encoding versions. So we
wrote it on top of Bobtail and let it update the struct_v's as they existed
then. Note that we here change a few encoding versions in ways which are
NOT COMPATIBLE with previous development code (but not any releases). In
particular, development code introduced and this removes the
file_layout_policy_t, and some of the CInode and EMetaBlob encoding
struct_v values were used in development code to mean one thing, but
mean something different due to the Bobtail patch.
Remove the default_file_layout struct, which was just a ceph_file_layout,
and store it in the inode_t. Rip out all the annoying code that put this
on the heap.
To aid in this usage, add a clear_layout() function to inode_t.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Signed-off-by: Greg Farnum <greg@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 36ed407e0f939a9bca57c3ffc0ee5608d50ab7ed)
Conflicts:
src/mds/CInode.cc
src/mds/CInode.h
src/mds/MDCache.cc
src/mds/Server.cc
src/mds/events/EMetaBlob.h
Cherry-pick- Reviewed-by: Sage Weil <sage@inktank.com>
Sage Weil [Mon, 21 Jan 2013 05:53:37 +0000 (21:53 -0800)]
mds: parse ceph.*.layout vxattr key/value content
Use qi to parse a strictly formatted set of key/value pairs. Be picky
about whitespace. Any subset of recognized keys is allowed. Parse the
same set of keys as the ceph.*.layout.* vxattrs.
Sage Weil [Thu, 21 Feb 2013 21:28:47 +0000 (13:28 -0800)]
osdc/Objecter: unwatch is a mutation, not a read
This was causing librados to unblock after the ACK on unwatch, which meant
that librbd users raced and tried to delete the image before the unwatch
change was committed..and got EBUSY. See #3958.
Jan Harkes [Thu, 21 Feb 2013 20:17:38 +0000 (15:17 -0500)]
Fix failing > 4MB range requests through radosgw S3 API.
When a range request is made for more than rgw_get_obj_max_req_size
bytes the first returned chunk sets 'ret' to STATUS_PARTIAL_CONTENT and
all remaining chunks behave as if there is an error state and only
return a minimal header.
Fix this by passing STATUS_PARTIAL_CONTENT to set_req_state_err, but
leave the 'ret' member variable untouched.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit c83a01d4e8dcd26eec24c020c5b79fcfa4ae44a3)
Jan Harkes [Thu, 21 Feb 2013 20:17:38 +0000 (15:17 -0500)]
Fix failing > 4MB range requests through radosgw S3 API.
When a range request is made for more than rgw_get_obj_max_req_size
bytes the first returned chunk sets 'ret' to STATUS_PARTIAL_CONTENT and
all remaining chunks behave as if there is an error state and only
return a minimal header.
Fix this by passing STATUS_PARTIAL_CONTENT to set_req_state_err, but
leave the 'ret' member variable untouched.
Josh Durgin [Thu, 21 Feb 2013 19:26:45 +0000 (11:26 -0800)]
librbd: make sure racing flattens don't crash
The only way for a parent to disappear is a racing flatten completing,
or possibly in the future the image being forcibly removed. In either
case, continuing to flatten makes no sense, so stop early.
Josh Durgin [Thu, 21 Feb 2013 19:17:18 +0000 (11:17 -0800)]
librbd: use rwlocks instead of mutexes for several fields
Image metadata like snapshots, size, and parent is frequently read,
but rarely updated. During flatten, we were depending on the parent
lock to prevent the parent ImageCtx from disappearing out from under
us while we read from it. The copy-up path also needed the parent lock
to be able to read from the parent image, which lead to a deadlock.
Convert parent_lock, snap_lock, and md_lock to RWLocks, and change
their use to read instead of exclusive locks where appropriate. The
main place exclusive locks are needed is in ictx_refresh, so this is
pretty simple. This fixes the deadlock, since parent_lock is only
needed for read access in both flatten and the copy-up operation.
cache_lock and refresh_lock are only really used for exclusive access,
so leave them as regular mutexes.
One downside to this is that there's no way to assert is_locked()
for RWLocks, so we'll have to be very careful about changing code
in the future.
Sage Weil [Thu, 21 Feb 2013 18:30:08 +0000 (10:30 -0800)]
osd: clear recovery state on pg removal
This ensures we release our in-progress recovery counters, which prevents
recovery from getting blocked indefinitely when a pool removal races with
recovery ops.
Fixes: #4217
Backport: bobtail Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Samuel Just <sam.just@inktank.com>