Dan Mick [Sat, 3 Aug 2013 03:46:00 +0000 (20:46 -0700)]
ceph_argparse.py: add stderr note if nonrequired param is invalid
If we run across a user-supplied parameter that doesn't validate against
a non-required descriptor, it may be that it's a valid entry for a later
descriptor...or it may be that it's supposed to match. We can't really tell.
A possible heuristic would be to call it invalid-for-sure if we're at the
end of the descriptor list, but that's not very generic.
Warn about it and try to drive on anyway.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
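The warn-and-continue behavior described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual ceph_argparse.py code: the descriptor layout (dicts with "name", "valid" and "req" keys) and the function name match_args are assumptions made up for the example.

```python
import sys


def match_args(descriptors, args):
    """Match user args against descriptors (hypothetical layout:
    dicts with 'name', 'valid' callable, and 'req' bool)."""
    matched = {}
    remaining = list(args)
    for desc in descriptors:
        if not remaining:
            if desc["req"]:
                raise ValueError("missing required parameter " + desc["name"])
            continue
        candidate = remaining[0]
        if desc["valid"](candidate):
            matched[desc["name"]] = remaining.pop(0)
        elif desc["req"]:
            raise ValueError("%r is invalid for %s" % (candidate, desc["name"]))
        else:
            # We cannot tell whether the arg was meant for this optional
            # descriptor or for a later one: warn and keep going.
            print("note: %r did not validate against optional %s"
                  % (candidate, desc["name"]), file=sys.stderr)
    if remaining:
        raise ValueError("unused arguments: %r" % (remaining,))
    return matched
```

With an optional numeric descriptor followed by a required alphabetic one, the single argument "rbd" produces a note on stderr for the optional descriptor and is then accepted by the required one.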
Dan Mick [Fri, 2 Aug 2013 05:35:08 +0000 (22:35 -0700)]
Fix "too few args validate"
Check that number of validated arguments matches the number of required
arguments in the signature. Also, sort all possible matches by
length of signature. This way "ceph osd crush set" and
"ceph osd crush set <args>" can work while still insisting that
extra args or too few args are errors.
Also, restructure and factor out some of the work of validate() to make
its inner loop smaller and hopefully more comprehensible.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
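A rough sketch of the two checks named in the commit above: try candidate signatures in order of length, and accept one only if every required argument validated and nothing is left over. The tuple layout (name, required, validator), the helper names, and the shortest-first sort order are all assumptions for illustration, not the real validate() code.

```python
def literal(word):
    """Validator that accepts exactly one fixed word."""
    return lambda arg: arg == word


def try_signature(signature, args):
    """Return matched args, or None on too few/extra/invalid args."""
    remaining = list(args)
    matched = {}
    required_matched = 0
    for name, required, is_valid in signature:
        if not remaining:
            break
        if is_valid(remaining[0]):
            matched[name] = remaining.pop(0)
            if required:
                required_matched += 1
        elif required:
            return None
    required_total = sum(1 for _, required, _ in signature if required)
    if remaining or required_matched != required_total:
        return None  # extra args, or too few required args validated
    return matched


def find_match(signatures, args):
    """Try shorter signatures first (assumed order) and return the first hit."""
    for signature in sorted(signatures, key=len):
        result = try_signature(signature, args)
        if result is not None:
            return result
    raise ValueError("no signature matches %r" % (args,))
```

Given one signature for a bare "set" and another for "set" plus a numeric id, both ["set"] and ["set", "3"] find a match, while ["set", "3", "4"] is rejected.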
Yehuda Sadeh [Thu, 1 Aug 2013 20:20:19 +0000 (13:20 -0700)]
rgw: only fetch cors info when needed
Fixes: #5831
This commit moves around the cors handling code. Beforehand
we were unnecessarily reading the cors headers for every
request whether that was needed or not. Moved that code to
be called only when needed. While at it, cleaned up the
layering a bit so as not to mix S3-specific code with
the generic functionality (except for debugging).
Erik Logtenberg [Thu, 1 Aug 2013 11:29:45 +0000 (13:29 +0200)]
ceph.spec.in: add missing buildrequires for Fedora
This patch adds two buildrequires to the ceph.spec file that are needed
to build the rpms under Fedora. Danny Al-Gaaf commented that the
snappy-devel dependency should actually be added to the leveldb-devel
package. I will try to get that fixed too; in the meantime, this patch
does make sure Ceph builds on Fedora.
Signed-off-by: Erik Logtenberg <erik@logtenberg.eu>
Fixes: 5808
We cannot call get_bucket_instance_info() at that point,
as the bucket structure wasn't initialized, so we don't
have the bucket instance location information. Just call
get_bucket_info() instead.
Samuel Just [Tue, 30 Jul 2013 22:46:22 +0000 (15:46 -0700)]
Objecter: set c->session to NULL if acting is empty
Otherwise, we might leave a session attached to the
CommandOp for a down OSD. handle_osd_map will then
delete the session for the down OSD. tick() will then
attempt to follow the invalid pointer to find a
connection over which to send an MPing.
Fixes: #5798
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Sage Weil [Tue, 30 Jul 2013 00:14:57 +0000 (17:14 -0700)]
mon: allow others to sync from us across bootstrap calls
If someone is syncing from us and there is an election, they currently get
reset and have to restart their sync. This can lead to situations where
they can never finish, e.g., when the load from them syncing makes us time
out commits and call elections.
There is nothing that changes during bootstrap that would prevent a sync
from proceeding. The only time we need to stop providing is when we
ourselves decide to sync from someone else; modify that reset call to
reset provider state. All other resets become requester resets.
rgw: set bucket attrs is a bucket instance meta operation
Need to do the action through the bucket instance handler
and not through the bucket handler, otherwise it's wrongly
recorded (and wrongly replayed, ouch).
We now keep the bucket instance oid in rgw_bucket. The reason
we need it is that the bucket might have been created before
the entrypoint / bucket instance separation.
Sage Weil [Mon, 29 Jul 2013 20:13:24 +0000 (13:13 -0700)]
mon/PGMonitor: fix 'pg dump_[pools_]json'
Use the correct type for the dumpcontents arg. Fixes the dump_pools_json
output and avoids these errors:
2013-07-29 13:09:14.089188 7fa0c5d21700 -1 0x7fa0c5d1e7a8
2013-07-29 13:09:16.306560 7fa0c5d21700 -1 bad boost::get: key dumpcontents is not type std::vector<std::string, std::allocator<std::string> >
2013-07-29 13:09:16.317104 7fa0c5d21700 -1 0x7fa0c5d1e7a8
2013-07-29 13:09:16.317136 7fa0c5d21700 -1 bad boost::get: key dumpcontents is not type std::vector<std::string, std::allocator<std::string> >
Fixes: #5786
Signed-off-by: Sage Weil <sage@inktank.com>
check_new_interval must compare old acting with old osdmap
When trying to establish if the old acting set is either empty or
smaller than the min_size of the osdmap,
pg_interval_t::check_new_interval compares with the min_size of the
new osdmap. Since the goal is to try to determine if the previous
interval may have been writeable, it should not enter the if when
there were not enough osds in the acting set ( i.e. < min_size ). But
it may enter it anyway if min_size was decremented in the new osdmap.
A complete set of unit tests was added to cover the logic of
check_new_interval. The parameters are prepared to describe a
situation where the function returns false (i.e. no new
interval). Each case is described in a separate block that introduces
the minimal changes to demonstrate the intended test case.
Because a number of cases have the same output while implementing
different logic, the debug output is parsed to differentiate between them.
A test case demonstrating the problem ( check_new_interval must
compare old acting with old osdmap ) is added, with a link to the bug
number for future reference. The problem is fixed. The text of two
debug messages is slightly changed to make the maintenance of the
tests that match them easier.
http://tracker.ceph.com/issues/5780 refs #5780
Signed-off-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
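The min_size comparison at the heart of the check_new_interval commit above can be illustrated with a small sketch. The function name and its arguments are hypothetical stand-ins, not the real pg_interval_t code; the point is only that the old acting set must be judged against the old osdmap's min_size.

```python
def may_have_been_writeable(old_acting, old_min_size):
    """True if the previous interval had a non-empty acting set that
    met the min_size in force during that interval (old osdmap)."""
    return len(old_acting) > 0 and len(old_acting) >= old_min_size
```

For example, with a single OSD in the old acting set, an old min_size of 2 and a new min_size of 1, passing the old value reports the interval as not writeable, whereas passing the new value would wrongly report it as writeable.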
Samuel Just [Mon, 29 Jul 2013 16:36:04 +0000 (09:36 -0700)]
OSD: suspend tp timeout while taking pg lock in OpWQ
If N op_tp threads are configured, and recovery_max_active
is set to a sufficiently large number, all N op_tp threads
might grab a MOSDPGPush op off of the queue for the same PG.
The last thread to get the lock will have waited
N*time_to_handle_push before completing its item and pinging
the heartbeat timeout. If that time exceeds the timeout
and there are enough ops waiting, each thread subsequently
will end up exceeding the timeout before completing an
item, preventing the OSD from heartbeating indefinitely.
We prevent this by suspending the timeout while we try to
get the PG lock. Even if we do block for an excessive
period of time attempting to get the lock, hopefully,
the thread holding the lock will cause the threadpool
to time out.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
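The idea in the OpWQ commit above, suspending the worker's timeout while it waits for the PG lock and re-arming it once the lock is held, might be sketched like this. HeartbeatHandle and process_item are hypothetical names invented for the example; the real change is in the C++ OSD thread pool code.

```python
import threading
import time


class HeartbeatHandle:
    """Hypothetical per-thread timeout tracker."""

    def __init__(self, grace):
        self.grace = grace
        self.deadline = None  # None means the timeout is suspended

    def reset(self, now):
        self.deadline = now + self.grace

    def suspend(self):
        self.deadline = None

    def is_healthy(self, now):
        return self.deadline is None or now <= self.deadline


def process_item(handle, pg_lock, work, clock=time.monotonic):
    # Time spent waiting for the lock must not count against this thread.
    handle.suspend()
    with pg_lock:
        # Re-arm the timeout once the lock is held, then do the work.
        handle.reset(clock())
        work()
```

While a thread is blocked on the lock its handle reports healthy; the thread that actually holds the lock still has an armed timeout, so a genuinely stuck holder can still be detected.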
Danny Al-Gaaf [Sun, 28 Jul 2013 21:25:58 +0000 (23:25 +0200)]
ceph_authtool.cc: update help/usage text
Added implemented but not listed commands to the help/usage text:
* -g shortcut for --gen-key
* -a shortcut for --add-key
* -u/--set-uid to set auid
* --gen-print-key
* --import-keyring
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Dan Mick [Sat, 27 Jul 2013 00:47:32 +0000 (17:47 -0700)]
ceph-rest-api: clean up options/environment
ceph-rest-api:
* create app from wrapper by calling generate_app()
* pass args to generate_app() (early parsed in wrapper)
* parse -i/--id here as well
* set addr:port on returned app object
* handle only EnvironmentError exceptions; let others spew traceback
* turn off debug when running singlethreaded server
ceph_rest_api.py:
* put glob.* on app.ceph_* instead; pass around app in init code
* drop conf parsing (let librados do its job)
Documentation updated to match.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
CID 1058391 (#1 of 1): Out-of-bounds access (OVERRUN)
32. alloc_strlen: Allocating insufficient memory for the terminating null of the string.
CID 1058390 (#1 of 1): Unchecked return value from library (CHECKED_RETURN)
13. check_return: Calling function "this->class_handler->open_all_classes()" without checking return value. It wraps a library function that may fail and return an error code.
14. unchecked_value: No check of the return value of "this->class_handler->open_all_classes()".
Dan Mick [Tue, 23 Jul 2013 07:50:15 +0000 (00:50 -0700)]
ceph_rest_api.py: obtain and handle tell <osd-or-pgid> commands
Contact an OSD that's up to get a list of the commands, and use
them to add to the URL map.
Special treatment throughout for these commands:
* hack the help signature dump
* keep a 'flavor' per command to allow special handler() processing
* strip off 'tell/<target>' when constructing command
* allow multiple dicts with the same url
(the parameters and get/put methods can change)
* because of above, method must be validated in handler()
* validate the given OSD
* calculate target for command (mon, osd, pg)
Unrelated: make method_dict into global METHOD_DICT
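Two of the steps listed in the commit above, stripping 'tell/<target>' from the URL and calculating the command target, could look roughly like the sketch below. The function names, the pgid pattern, and the rule that a bare number means an OSD id are assumptions for illustration; they are not taken from ceph_rest_api.py.

```python
import re

# Assumed pgid shape for this sketch: "<pool>.<hex seed>", e.g. "1.2f".
PGID_RE = re.compile(r"^\d+\.[0-9a-fA-F]+$")


def split_tell(url_path):
    """Return (target, command); target is None for non-tell URLs."""
    parts = [p for p in url_path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "tell":
        return parts[1], " ".join(parts[2:])
    return None, " ".join(parts)


def command_target(target):
    """Map a tell target to a (kind, id) pair: mon, osd, or pg."""
    if target is None:
        return ("mon", "")
    if PGID_RE.match(target):
        return ("pg", target)
    if target.isdigit():
        return ("osd", int(target))
    raise ValueError("invalid tell target: %r" % target)
```

Under these assumptions, "/tell/3/version" yields the command "version" aimed at OSD 3, while a URL without the tell prefix is sent to the monitor.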
Sage Weil [Fri, 26 Jul 2013 22:25:12 +0000 (15:25 -0700)]
mon/PGMonitor: reset in-core PGMap if on-disk format changes
We might have a sequence like:
- start mon, load pgmap 100
- sync
- including a format upgrade at say v 150
- refresh
- see format_version==1, and try read pgmap:101 as new format
This simply clears our in-memory state if we see that the format has
changed. That will make update_from_paxos reload the latest and prevent
it from walking through the old and useless inc updates.
Note: this does not affect the auth monitor because we unconditionally
load the latest map in update_from_paxos on upgrade. Also, the upgrade
there wasn't a format change--just a translation of cap strings from the
old to new style.
Fixes: #5764
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
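The reset-on-format-change logic described in the PGMonitor commit above can be sketched as follows. The class name, the dict-based store, and its keys are hypothetical; the sketch only shows that clearing the in-memory state forces a reload of the latest full map instead of a walk through old incrementals.

```python
class InCoreMap:
    """Hypothetical in-memory map refreshed from a dict-like store."""

    def __init__(self):
        self.format_version = 0
        self.version = 0
        self.data = {}
        self.incs_applied = 0

    def refresh(self, store):
        if store["format_version"] != self.format_version:
            # On-disk format changed: drop in-memory state so the old
            # incrementals are never read with the new format.
            self.format_version = store["format_version"]
            self.version = 0
            self.data = {}
        if self.version == 0:
            # Nothing in memory: load the latest full map directly.
            self.version = store["latest_version"]
            self.data = dict(store["latest_full"])
            return
        while self.version < store["latest_version"]:
            self.version += 1
            self.data.update(store["incs"][self.version])
            self.incs_applied += 1
```

Replaying the sequence from the commit message (load version 100, then refresh after a sync that upgraded the format at a later version) jumps straight to the latest map without touching any incremental.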
Danny Al-Gaaf [Fri, 26 Jul 2013 21:28:44 +0000 (23:28 +0200)]
rgw/rgw_metadata.cc: delete md_log (RGWMetadataLog) in destructor
Call delete on md_log in the destructor.
CID 1054826 (#1 of 1): Resource leak in object (CTOR_DTOR_LEAK)
1. alloc_new: Allocating memory by calling "new RGWMetadataLog(_cct, _store)".
2. var_assign: Assigning: "this->md_log" = "new RGWMetadataLog(_cct, _store)".
3. ctor_dtor_leak: The constructor allocates field "md_log" of
"RGWMetadataManager" but the destructor and whatever functions it calls
do not free it.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Sage Weil [Fri, 26 Jul 2013 20:58:46 +0000 (13:58 -0700)]
osd: load all classes on startup
This avoids creating a wide window between when ceph-osd is started and
when a request arrives needing a class and it is loaded. In particular,
upgrading the packages in that window may cause linkage errors (if the
class API has changed, for example).
Fixes: #5752
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>