Sam Lang [Wed, 27 Mar 2013 15:58:25 +0000 (10:58 -0500)]
mds: Delay session close if in clientreplay
If the mds is in clientreplay, a session close
request needs to be delayed until it reaches
active. Otherwise, the session state gets set to
'closing', and the replay requests get dropped on the
floor.
Fixes #4564.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
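The deferral described above can be modelled roughly as follows. This is a toy sketch in Python, not the MDS code: the class `MdsSketch`, the `waiting_for_active` queue, and the method names are all hypothetical and only illustrate "queue the close until active".

```python
from collections import deque


class MdsSketch:
    """Toy model of an MDS that defers session closes during clientreplay."""

    def __init__(self):
        self.state = "clientreplay"
        self.sessions = {}                 # client id -> session state
        self.waiting_for_active = deque()  # callbacks to run once active

    def open_session(self, client):
        self.sessions[client] = "open"

    def handle_session_close(self, client):
        if self.state == "clientreplay":
            # Marking the session 'closing' now would cause replayed
            # requests to be dropped, so retry once we are active.
            self.waiting_for_active.append(
                lambda: self.handle_session_close(client))
            return False
        self.sessions[client] = "closing"
        return True

    def become_active(self):
        self.state = "active"
        while self.waiting_for_active:
            self.waiting_for_active.popleft()()
```

In this model a close that arrives during replay leaves the session in the 'open' state, and the close is applied only when `become_active()` drains the queue.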
Sam Lang [Wed, 27 Mar 2013 14:35:08 +0000 (09:35 -0500)]
mds: Clear backtrace updates on standby_trim_seg
If the mds is a standby, we need to clear the backtrace updates
list when a segment is trimmed, to avoid an assertion failure
when the segment is deleted.
Samuel Just [Tue, 26 Mar 2013 22:10:37 +0000 (15:10 -0700)]
ReplicatedPG: send entire stats on OP_BACKFILL_FINISH
Otherwise, we update the stat.stat structure, but not the
stat.invalid_stats part. This will result in a recently
split primary propagating the invalid stats but not the
invalid marker. Sending the whole pg_stat_t structure
also mirrors MOSDSubOp.
Fixes: #4557
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Joe Buck [Tue, 26 Mar 2013 21:17:14 +0000 (14:17 -0700)]
testing: fix hadoop-internal-test
Remove now-superfluous directory changes
that are causing tests to fail.
This code should have been removed when we transitioned
from running tests with Ant to using Java to run the tests.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
Sam Lang [Mon, 25 Mar 2013 19:55:20 +0000 (14:55 -0500)]
client: Don't signal requests already handled
The assertion failure reported in #4530 is triggered
by the following:
1. client sends request
2. mds sends unsafe reply
3. before request gets journaled, mds is killed
4. mds restarts
5. client receives session close (from close request before restart)
6. session close does kick_requests()
7. kick_requests tries to signal caller that doesn't exist.
This fix avoids signaling a caller if the unsafe reply
has been received and the make_request() function has completed.
We do this by setting the caller_cond to null once the caller
is woken up, and only signal the caller in kick_requests if
caller_cond is non-null. This avoids trying to resend requests that
are listed in mds_requests but have already received unsafe replies.
The unsafe requests are handled by resend_unsafe_requests() code,
so skipping those requests is allowable.
Fixes #4530.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
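A minimal sketch of the caller_cond idea, written in Python rather than the client's C++. The names `MetaRequestSketch` and `wake_caller` are hypothetical, and the real `kick_requests()` does more than signal; only the "clear after wake-up, signal only if non-null" rule is taken from the description above.

```python
import threading


class MetaRequestSketch:
    """Toy request: caller_cond is set only while a caller is waiting."""

    def __init__(self, tid):
        self.tid = tid
        self.caller_cond = threading.Condition()


def wake_caller(request):
    """Wake the waiting caller once, then clear caller_cond."""
    cond = request.caller_cond
    if cond is not None:
        with cond:
            cond.notify_all()
        request.caller_cond = None


def kick_requests(requests):
    """Signal only requests whose caller is still waiting.

    Returns the tids that were signalled, for illustration.
    """
    kicked = []
    for tid, request in sorted(requests.items()):
        if request.caller_cond is None:
            # Caller already woken (unsafe reply received); skip it.
            continue
        with request.caller_cond:
            request.caller_cond.notify_all()
        kicked.append(tid)
    return kicked
```

With two outstanding requests, waking the first and then calling `kick_requests()` signals only the second.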
Loic Dachary [Mon, 25 Mar 2013 17:40:32 +0000 (13:40 -0400)]
fix append to uninitialized buffer in FlatIndex::created
The long_name variable is not initialized. When the append_oname
function is called, it calls strlen(long_name) and gets a result
that depends on the stack content. long_name is now truncated to a
zero-length string to prevent this unexpected behavior.
There is no sure way to trigger the problem by writing a unit
test. Unit tests are added for all public methods of the FlatIndex
class. Most of the time the tests fail if the long_name variable is
not properly initialized.
* uint32_t collection_version()
* coll_t coll() const
* void set_ref(std::tr1::shared_ptr<CollectionIndex> ref)
* int cleanup()
* int init()
* int created(const hobject_t &hoid, const char *path)
* int unlink(const hobject_t &hoid)
* int lookup(const hobject_t &hoid, IndexedPath *path, int *exist)
* int collection_list(vector<hobject_t> *ls)
* int collection_list_partial(const hobject_t &start, int min_count, int max_count, snapid_t seq, vector<hobject_t> *ls, hobject_t *next)
There are a number of border cases that cannot be tested, such as the
logic of the lfn_get static function. Since FlatIndex code is designed
to transition from older namespace conventions, it is difficult to
figure out.
The tests rely on xattr(2) and their availability is checked before
running them.
Gary Lowell [Tue, 26 Mar 2013 18:31:16 +0000 (11:31 -0700)]
ceph-disk: udevadm settle before partprobe
After changing the partition table, allow the udev event to be
processed before calling partprobe. This helps prevent partprobe
from getting a resource busy error on some platforms.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
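The ordering can be sketched as below. This is an illustration, not the ceph-disk source: the function name `reread_partition_table` and the injectable `run` parameter are assumptions made so the ordering can be checked without touching a real device. `udevadm settle` and `partprobe` are the standard command-line tools.

```python
import subprocess


def reread_partition_table(dev, run=subprocess.check_call):
    """Let udev finish pending events, then ask the kernel to re-read dev."""
    # Wait for the udev event queue to drain first, so that partprobe
    # is less likely to hit a "resource busy" error.
    run(["udevadm", "settle"])
    run(["partprobe", dev])
```

Passing a list's `append` method as `run` records the commands instead of executing them, which is enough to confirm the settle call comes first.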
Sam Lang [Tue, 26 Mar 2013 13:55:40 +0000 (08:55 -0500)]
mds: CInode::build_backtrace() always incr iter
Always increment the iterator when adding old pools
to the backtrace. This fixes a bug on files where
the layout had been set to a different pool and then
back to the same pool, causing continuous looping in
the build_backtrace() function.
Fixes #4537.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
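The class of bug can be illustrated with an index-based loop. This is a sketch under assumptions: the function name `collect_old_pools` and the rule "skip the inode's current pool" are guesses made for illustration, not taken from build_backtrace(). The point is only that the iterator advances on every pass, including the skip path.

```python
def collect_old_pools(old_pools, current_pool):
    """Return the old pools, excluding the current one (hypothetical rule)."""
    result = []
    i = 0
    while i < len(old_pools):
        pool = old_pools[i]
        if pool != current_pool:
            result.append(pool)
        # Always advance, even when the entry was skipped; advancing
        # only on the append path would loop forever on a skipped entry.
        i += 1
    return result
```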
Sam Lang [Mon, 25 Mar 2013 17:58:13 +0000 (12:58 -0500)]
client: Handle duplicate safe replies
If the mds sends a duplicate safe reply, the mds_requests
map won't contain a matching request id (tid). Instead of
failing an assertion, we log a message that we saw a reply without
a matching request.
Also remove redundant mds_requests->erase(tid) line.
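A rough Python model of the lookup-and-log behavior. The function `handle_client_reply`, the logger name, and the use of a plain dict for mds_requests are assumptions for illustration; the client itself is C++.

```python
import logging

log = logging.getLogger("client_sketch")


def handle_client_reply(mds_requests, tid, safe):
    """Return the matching request, or None for an unmatched reply."""
    request = mds_requests.get(tid)
    if request is None:
        # A duplicate safe reply lands here: the request was already
        # removed, so log it instead of asserting.
        log.warning("reply for tid %s has no matching request", tid)
        return None
    if safe:
        # A safe reply always removes the request from the map.
        del mds_requests[tid]
    return request
```

Calling it twice with the same tid and `safe=True` returns the request the first time and `None` the second time, without raising.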
Sam Lang [Mon, 25 Mar 2013 16:43:54 +0000 (11:43 -0500)]
client: Always cleanup request after safe
The client MetaRequest should always be cleaned up
and removed from the mds_requests map once the client
gets a safe reply. This patch avoids a leak where the
mds does not send back an unsafe reply and the request
is never cleaned up.
Sam Lang [Mon, 25 Mar 2013 16:39:19 +0000 (11:39 -0500)]
client: Remove got_safe from MetaRequest
Once a safe reply is received, we remove the
request from the mds_requests map, so checking that
it might be a duplicate won't succeed. This patch
removes the got_safe checks in the reply handling code
and the got_safe field on the MetaRequest to avoid confusion.
Yehuda Sadeh [Mon, 25 Mar 2013 16:50:33 +0000 (09:50 -0700)]
rgw: bucket index ops on system buckets shouldn't do anything
Fixes: #4508
Backport: bobtail
On certain bucket index operations we didn't check whether
the bucket was a system bucket, which caused the operations
to fail. This triggered an error message on bucket removal
operations.
Noah Watkins [Fri, 22 Mar 2013 19:42:47 +0000 (12:42 -0700)]
java: support ceph_get_osd_addr
Adds a few JNI utilities from the Android project (license: Apache) to
help with IP address conversions. These functions are also updated to
work in our environment (use Ceph exception utilities, edit header
paths).
Sage Weil [Fri, 22 Mar 2013 19:32:15 +0000 (12:32 -0700)]
crush, mon: unlink vs remove
Make an 'unlink' mode of remove that will remove a link to a bucket but
not remove the bucket itself. This refactors remove_item[_under] and moves
some of the checks into common helpers so that they are not duplicated. Fix
callers to pass the extra arg.
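The difference between the two modes can be sketched with a toy hierarchy. Everything here is hypothetical (the class `CrushSketch`, the `unlink_only` argument name, and the non-empty check); it only illustrates "drop the link, keep the bucket" versus "drop the bucket too".

```python
class CrushSketch:
    """Toy bucket hierarchy: bucket name -> set of child names."""

    def __init__(self):
        self.children = {}

    def add_bucket(self, name):
        self.children.setdefault(name, set())

    def link(self, parent, child):
        self.children[parent].add(child)

    def remove_item(self, name, unlink_only=False):
        # Assumed check: a full remove refuses a non-empty bucket.
        if not unlink_only and self.children.get(name):
            raise ValueError("bucket %r is not empty" % name)
        # Both modes detach the item from every parent.
        for kids in self.children.values():
            kids.discard(name)
        # Only a full remove deletes the bucket itself.
        if not unlink_only:
            self.children.pop(name, None)
```

After `remove_item("rack1", unlink_only=True)` the bucket still exists but has no parent; a later plain `remove_item("rack1")` deletes it.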
Sage Weil [Thu, 21 Mar 2013 18:04:59 +0000 (11:04 -0700)]
mon: add 'osd crush add-bucket <name> <type>'
This is (I think) the last missing piece to let you construct an entire
map via the CLI. The add/set commands will construct intervening ancestor
nodes provided there is an existing ancestor to stick them under, but this
is needed to create the initial root node.