Dan Mick [Tue, 6 Aug 2013 01:18:59 +0000 (18:18 -0700)]
ceph.in: Re-enable ceph interactive mode (missing its output).
Also, loop on error. There's no reason to exit the interpreter loop on
an error, and it's probably less annoying if we don't. Print the error,
and any output, and continue.
Fixes: #5746 Signed-off-by: Dan Mick <dan.mick@inktank.com>
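The "loop on error" behaviour described above can be sketched roughly as follows. This is an illustrative model only, not the code in ceph.in: the names interactive_loop, run_command, read_line and write are hypothetical, and run_command is assumed to return a (return code, output, error text) tuple.

```python
def interactive_loop(run_command, read_line, write):
    """Read commands until EOF or 'quit'; report errors and keep looping."""
    while True:
        try:
            line = read_line()
        except EOFError:
            break
        line = line.strip()
        if line in ("quit", "q"):
            break
        if not line:
            continue
        # run_command is a hypothetical callable: (ret, out, err)
        ret, out, err = run_command(line.split())
        if ret != 0:
            # Print the error, but do not leave the loop.
            write("Error: %d %s" % (ret, err))
        if out:
            # Print any output, whether or not the command failed.
            write(out)
```

A failing command therefore produces an error line (plus whatever output it returned) and the prompt simply comes back for the next command.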
Sage Weil [Thu, 8 Aug 2013 15:30:01 +0000 (08:30 -0700)]
mon: fix 'osd crush rule rm ...' dup arg
This was broken way back in 0d66c9ebbf626117c641c975a8682a0aaba588c4, but
we were ignoring the dup until recently.
Signed-off-by: Sage Weil <sage@inktank.com>
Dan Mick [Sat, 3 Aug 2013 04:26:51 +0000 (21:26 -0700)]
mon/PGMonitor: add 'pg dump pgs_brief' subcommand
It is useful to map OSDs to PGs and vice-versa; pg dump gives that
information, but gives a lot of other stuff. This is the same dump
as pg dump pgs, but omitting everything except pgid, state, and
osd up and acting sets.
Signed-off-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
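As a rough illustration of what the brief dump keeps, the sketch below filters a list of per-PG records down to the four fields named above and then builds an OSD-to-PG map from the result. The record layout (dictionaries keyed by "pgid", "state", "up" and "acting") and the helper names are assumptions made for this example, not the monitor's actual data structures.

```python
# Fields kept by the brief dump, per the commit message (key names assumed).
BRIEF_FIELDS = ("pgid", "state", "up", "acting")

def pgs_brief(pg_stats):
    """Drop everything except pgid, state, and the up and acting sets."""
    return [{key: pg[key] for key in BRIEF_FIELDS} for pg in pg_stats]

def osds_to_pgs(brief):
    """Map each OSD id to the PGs whose acting set contains it."""
    mapping = {}
    for pg in brief:
        for osd in pg["acting"]:
            mapping.setdefault(osd, []).append(pg["pgid"])
    return mapping
```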
Dan Mick [Sat, 3 Aug 2013 03:46:00 +0000 (20:46 -0700)]
ceph_argparse.py: add stderr note if nonrequired param is invalid
If we run across a user-supplied parameter that doesn't validate against
a non-required descriptor, it may be that it's a valid entry for a later
descriptor...or it may be that it's supposed to match. We can't really tell.
A possible heuristic would be to call it invalid-for-sure if we're at the
end of the descriptor list, but that's not very generic.
Warn about it and try to drive on anyway.
Signed-off-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
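The "warn and drive on" approach might look something like the sketch below. It is not the ceph_argparse.py implementation: match_args, the (name, required, validator) descriptor tuples and the warn callback are all hypothetical names chosen for illustration.

```python
import sys

def match_args(args, descs, warn=None):
    """Match words against descs, a list of (name, required, is_valid)."""
    if warn is None:
        def warn(msg):
            sys.stderr.write(msg + "\n")
    matched = {}
    remaining = list(args)
    for name, required, is_valid in descs:
        if not remaining:
            if required:
                raise ValueError("missing required parameter %s" % name)
            continue
        if is_valid(remaining[0]):
            matched[name] = remaining.pop(0)
        elif required:
            raise ValueError("%r is invalid for %s" % (remaining[0], name))
        else:
            # We cannot tell whether the word was meant for this optional
            # descriptor or a later one, so note it and keep going.
            warn("%r is not valid for optional %s; trying later parameters"
                 % (remaining[0], name))
    if remaining:
        raise ValueError("unused arguments: %r" % remaining)
    return matched
```

With an optional "format" descriptor followed by a required "name" descriptor, the input ["foo"] would produce one warning and still match "foo" as the name.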
Dan Mick [Fri, 2 Aug 2013 05:35:08 +0000 (22:35 -0700)]
Fix "too few args validate"
Check that number of validated arguments matches the number of required
arguments in the signature. Also, sort all possible matches by
length of signature. This way "ceph osd crush set" and
"ceph osd crush set <args>" can work while still insisting that
extra args or too few args are errors.
Also, restructure and factor out some of the work of validate() to make
its inner loop smaller and hopefully more comprehensible.
Signed-off-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
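A minimal sketch of the matching idea, assuming a simplified signature format: each signature is a list of (token, required) pairs, a token such as '<id>' accepts any word, and any other token must match literally. The function names and this format are hypothetical and much simpler than the real descriptors.

```python
def validate(args, sig):
    """Return True if args satisfy sig, a list of (token, required) pairs."""
    consumed = 0
    for token, required in sig:
        if consumed < len(args) and (
            token.startswith("<") or token == args[consumed]
        ):
            consumed += 1
        elif required:
            return False  # too few args, or a literal word did not match
    # Leftover words mean the caller supplied extra args.
    return consumed == len(args)

def find_match(args, signatures):
    """Try candidate signatures shortest first; return the first that fits."""
    for sig in sorted(signatures, key=len):
        if validate(args, sig):
            return sig
    return None
```

In this model a bare "osd crush set" matches the three-word signature, the same prefix with two arguments matches the longer one, and one argument or three arguments match neither.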
commit 0071b8e75b (mds: stay in SCAN state in file_eval) makes
Locker::file_eval() ignore locks in the LOCK_SCAN state. If no
request changes the lock state, the lock can be stuck in
LOCK_SCAN state forever. This can cause client reads/writes to hang
because a lock in LOCK_SCAN state does not allow Frw caps.
The fix is to change LOCK_SCAN to an unstable state. Thanks to the
CInode::STATE_RECOVERING check in Locker::eval_gather(), the
lock stays in the SCAN state while the file is being recovered.
The lock will transition to a stable state once the recovery
finishes.
mds: handle "state == LOCK_LOCK_XLOCK" when cancelling xlock
If we find the lock state is LOCK_LOCK_XLOCK when cancelling xlock,
set the lock state to LOCK_XLOCK_DONE and call Locker::eval_gather().
This makes sure the lock will eventually transition to a stable state.
(LOCK_XLOCK_DONE's next state is stable)
mds: remove "type != CEPH_LOCK_DN" check in Locker::cancel_locking()
For acquiring/cancelling xlock, the lock state transitions for
dentry lock and other types of locks are the same. So I think
the "type != CEPH_LOCK_DN" check doesn't make sense.
If lock state is LOCK_XLOCKDONE, the xlocker can have the GSHARED cap.
So when finishing xlock, we may need to revoke the GSHARED cap.
In most cases Locker::_finish_xlock() directly sets the lock state to
LOCK_LOCK or LOCK_EXCL, which hides the issue. If 'num_rdlock > 0'
or 'num_wrlock > 0' when finishing xlock, the issue is revealed
(the lock gets stuck in LOCK_XLOCKDONE forever).
The fix is to always call Locker::_finish_xlock() when the xlock count
reaches zero. _finish_xlock() checks if it can change the lock state
to LOCK_EXCL immediately. If not, it uses Locker::eval_gather()
to transition the lock state.
Another change in this patch is to avoid changing the lock state to
LOCK_LOCK directly, because a lock in LOCK_XLOCK_DONE state allows the
GSHARED cap, while a lock in LOCK_LOCK state does not.
There are several issues in Capability::confirm_receipt():
1. When receiving a client caps message with 'seq == last_sent',
it doesn't mean we have finished revoking caps. The client can send
a caps message that only flushes dirty metadata.
2. When receiving a client caps message with 'seq == N', we should
forget pending revocations whose seq numbers are less than N.
This is because, when revoking caps, we create a revoke_info
structure and set its seq number to 'last_sent', then increase
'last_sent'.
3. When the client actively releases caps (by request), the code only
works for the 'seq == last_sent' case. If there are pending
revocations, we should update them as if the release message
had been received before we revoked the corresponding caps.
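Point 2 can be illustrated with a small model. This is a simplified Python sketch and not the MDS code (which is C++): the class CapModel, its attribute names and the tuple layout are hypothetical, and only the sequence-number bookkeeping described above is modelled.

```python
from collections import deque

class CapModel:
    """Toy model of pending revocations keyed by sequence number."""

    def __init__(self):
        self.last_sent = 0
        self.revokes = deque()  # (seq, caps_before) pairs, oldest first

    def revoke(self, caps_before):
        # A revocation records the current last_sent as its seq,
        # and only then is last_sent increased.
        self.revokes.append((self.last_sent, caps_before))
        self.last_sent += 1

    def confirm_receipt(self, seq):
        # A message with seq == N lets us forget every pending
        # revocation whose seq number is less than N.
        while self.revokes and self.revokes[0][0] < seq:
            self.revokes.popleft()
```

After three revocations (seq 0, 1 and 2), confirming seq 2 in this model leaves only the revocation recorded with seq 2 pending.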
Yehuda Sadeh [Thu, 1 Aug 2013 20:20:19 +0000 (13:20 -0700)]
rgw: only fetch cors info when needed
Fixes: #5831
This commit moves around the cors handling code. Previously
we were unnecessarily reading the cors headers for every
request, whether that was needed or not. That code is now
only called when needed. While at it, cleaned up the
layering a bit so as not to mix S3-specific code with
the generic functionality (except for debugging).
Erik Logtenberg [Thu, 1 Aug 2013 11:29:45 +0000 (13:29 +0200)]
ceph.spec.in: add missing buildrequires for Fedora
This patch adds two buildrequires to the ceph.spec file that are needed
to build the rpms under Fedora. Danny Al-Gaaf commented that the
snappy-devel dependency should actually be added to the leveldb-devel
package. I will try to get that fixed too; in the meantime, this patch
does make sure Ceph builds on Fedora.
Signed-off-by: Erik Logtenberg <erik@logtenberg.eu>
Danny Al-Gaaf [Wed, 31 Jul 2013 22:34:41 +0000 (00:34 +0200)]
rgw_rados.cc: fix invalid iterator comparison
The iterator should be compared against end() of the same
container it was obtained from, region_conn_map.
CID 1058791 (#1 of 1): Invalid iterator comparison (MISMATCHED_ITERATOR)
mismatched_comparison: Comparing "iter" from "this->region_conn_map" to
"this->zone_conn_map.end()" from "this->zone_conn_map".
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Fixes: 5808
We cannot call get_bucket_instance_info() at that point,
as the bucket structure wasn't initialized, so we don't
have the bucket instance location information. Just call
get_bucket_info() instead.
Samuel Just [Tue, 30 Jul 2013 22:46:22 +0000 (15:46 -0700)]
Objecter: set c->session to NULL if acting is empty
Otherwise, we might leave a session attached to the
CommandOp for a down OSD. handle_osd_map will then
delete the session for the down OSD. tick() will then
attempt to follow the invalid pointer to find a
connection over which to send an MPing.
Fixes: #5798 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
Sage Weil [Tue, 30 Jul 2013 00:14:57 +0000 (17:14 -0700)]
mon: allow others to sync from us across bootstrap calls
If someone is syncing from us and there is an election, they currently get
reset and have to restart their sync. This can lead to situations where
they can never finish, e.g., when the load from them syncing makes us time
out commits and call elections.
There is nothing that changes during bootstrap that would prevent a sync
from proceeding. The only time we need to stop providing is when we
ourselves decide to sync from someone else; modify that reset call to
reset provider state. All other resets become requester resets.
rgw: set bucket attrs is a bucket instance meta operation
Need to do the action through the bucket instance handler
and not through the bucket handler, otherwise it's wrongly
recorded (and wrongly replayed, ouch).