Samuel Just [Tue, 19 Jun 2012 16:11:57 +0000 (09:11 -0700)]
PG: improve find_best_info
07f853db3982e68b952a337cf91cbf7ec0709de9 is actually too conservative,
it suffices to find any info with a last_update of at least the least
last_update from the last period to go active. An info from a previous
interval is acceptable if the last interval never reported a committed
operation and thus still has the same last_update.
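The relaxed rule can be illustrated with a minimal Python sketch. It is not the actual PG code: the Info class, the integer last_update, and the helper names are hypothetical, and "the last period to go active" is assumed here to mean the infos carrying the highest last_epoch_started.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Info:
    # Hypothetical, simplified stand-in for a peer's pg info.
    last_epoch_started: int
    last_update: int


def min_acceptable_last_update(infos: Dict[str, Info]) -> int:
    """Least last_update among infos from the newest interval that went active."""
    newest = max(i.last_epoch_started for i in infos.values())
    return min(i.last_update for i in infos.values()
               if i.last_epoch_started == newest)


def acceptable_peers(infos: Dict[str, Info]) -> List[str]:
    """Peers whose last_update is at least the floor, whatever their interval."""
    floor = min_acceptable_last_update(infos)
    return sorted(p for p, i in infos.items() if i.last_update >= floor)


infos = {
    "osd.0": Info(last_epoch_started=10, last_update=5),
    "osd.1": Info(last_epoch_started=10, last_update=7),
    "osd.2": Info(last_epoch_started=8, last_update=5),  # older interval, same last_update
    "osd.3": Info(last_epoch_started=8, last_update=3),  # older interval, behind the floor
}
print(acceptable_peers(infos))
```

In this sketch osd.2 stays acceptable even though it comes from an earlier interval, because its last_update matches the floor set by the newest interval.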
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Sage Weil [Mon, 18 Jun 2012 21:00:06 +0000 (14:00 -0700)]
mon: gracefully handle slow 'ceph -w' clients
If we are sending log updates to a client (ceph -w), and they are far
enough behind to drop behind first_committed, include a friendly message
in their stream but continue.
Drop useless return value from _create_sub_incremental(). Assert that we
can read the state file.
Samuel Just [Sat, 16 Jun 2012 00:09:42 +0000 (17:09 -0700)]
PG: best_info must have a last_epoch_started as high as any other info
We disregard incomplete infos during find_best_info, but we can't choose an
info with a last_epoch_started less than that of the incomplete info.
This should avoid cases like #2462. In that case, it appears that
a peer with empty info/log was chosen as authoritative even though
there was a non-empty incomplete peer.
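A minimal Python sketch of the constraint, under stated assumptions: PeerInfo and pick_best are hypothetical names, last_update is reduced to an integer, and the tie-breaking among valid candidates is illustrative only and does not reflect the real selection order.

```python
from typing import Dict, NamedTuple, Optional


class PeerInfo(NamedTuple):
    # Hypothetical, simplified stand-in for a peer's pg info.
    last_epoch_started: int
    last_update: int
    incomplete: bool


def pick_best(infos: Dict[str, PeerInfo]) -> Optional[str]:
    """Return a complete peer whose last_epoch_started is as high as any
    other info's (incomplete ones included), or None if there is none."""
    if not infos:
        return None
    # Incomplete infos still raise the bar, even though they can't be chosen.
    required = max(i.last_epoch_started for i in infos.values())
    candidates = [
        (i.last_update, peer)
        for peer, i in infos.items()
        if not i.incomplete and i.last_epoch_started >= required
    ]
    if not candidates:
        return None
    # Illustrative tie-break: prefer the highest last_update.
    return max(candidates)[1]


infos = {
    "empty": PeerInfo(last_epoch_started=0, last_update=0, incomplete=False),
    "partial": PeerInfo(last_epoch_started=20, last_update=50, incomplete=True),
}
print(pick_best(infos))  # the empty peer is not picked as authoritative
```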
Yehuda Sadeh [Tue, 12 Jun 2012 21:42:03 +0000 (14:42 -0700)]
rgw: obj copy respects -metadata-directive
Fixes #2542. The old behavior just merged src object attrs
and provided attributes. The new (and correct) behavior looks
at the x-[amz|rgw|...]-metadata-directive and either copies
the source attrs, or replaces them with the provided attrs.
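The difference between the two behaviors can be sketched in Python. This is an illustration, not the rgw code: resolve_copy_attrs and merged_attrs_old are hypothetical names, the COPY and REPLACE values follow the S3 API, and treating a missing directive as COPY is an assumption taken from S3 semantics.

```python
from typing import Dict


def merged_attrs_old(src_attrs: Dict[str, str],
                     provided_attrs: Dict[str, str]) -> Dict[str, str]:
    """Old behavior as described above: source and provided attrs merged."""
    merged = dict(src_attrs)
    merged.update(provided_attrs)
    return merged


def resolve_copy_attrs(src_attrs: Dict[str, str],
                       provided_attrs: Dict[str, str],
                       directive: str = "COPY") -> Dict[str, str]:
    """New behavior: the directive selects exactly one set of attrs."""
    value = directive.upper()
    if value == "COPY":
        return dict(src_attrs)
    if value == "REPLACE":
        return dict(provided_attrs)
    raise ValueError("unknown metadata directive: %r" % directive)


src = {"color": "red", "owner": "alice"}
provided = {"color": "blue"}
print(merged_attrs_old(src, provided))
print(resolve_copy_attrs(src, provided, "COPY"))
print(resolve_copy_attrs(src, provided, "REPLACE"))
```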
Sage Weil [Wed, 13 Jun 2012 18:05:43 +0000 (11:05 -0700)]
Makefile: link gtest statically
The problem:
- the unittests link against gtest, and gtest is not installed. that's
normally fine, but...
- rbd and rados api unit tests link against gtest, and are installed
by 'make install'. they are needed for teuthology runs, etc.
- if we build gtest as an .la library, we can only control whether *all*
or *no* .la libraries are linked statically.
- we want librados to be linked dynamically.
The solution:
- build gtest as .a instead of a libtool library
- link it statically, always.
Unit test binaries are bigger now. Oh well...
Fixes: #2331
Signed-off-by: Sage Weil <sage@inktank.com>
Samuel Just [Tue, 12 Jun 2012 19:53:02 +0000 (12:53 -0700)]
PG: track purged pgs during active
See bug #2462.
The following sequence could cause the primary to send a log that assumes
a non-empty pg to an empty replica:
1. primary sends query to stray
2. stray sends notify to primary
3. primary sends purge to stray removing stray from peer_info
4. stray receives query and sends a notify
5. stray receives purge and purges its pg
6. primary receives notify from stray and adds it to peer_info
note: peer_info[stray] is now wrong
7. acting set changes, primary is still primary, stray is replica
8. primary sends log to replica based on incorrect info from 6.
This patch adds a purged_peer set which is populated during purge_strays
and cleared during start_peering_interval. The primary will ignore
notifies from the peer once the peer is in this set.
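The race and the fix can be modeled with a small Python sketch. The PrimaryPG class, its method signatures, and the attribute name purged_peers are hypothetical; only the idea of a set filled in purge_strays, cleared in start_peering_interval, and checked on notify comes from the description above.

```python
from typing import Dict, Iterable, Set


class PrimaryPG:
    """Hypothetical, heavily simplified model of the primary's peer tracking."""

    def __init__(self) -> None:
        self.peer_info: Dict[str, dict] = {}
        self.purged_peers: Set[str] = set()

    def start_peering_interval(self) -> None:
        # A new interval starts with a clean slate.
        self.purged_peers.clear()

    def purge_strays(self, strays: Iterable[str]) -> None:
        for stray in strays:
            self.peer_info.pop(stray, None)
            self.purged_peers.add(stray)

    def handle_notify(self, peer: str, info: dict) -> bool:
        """Record the peer's info unless it was purged in this interval."""
        if peer in self.purged_peers:
            return False  # stale notify sent before the purge arrived
        self.peer_info[peer] = info
        return True


pg = PrimaryPG()
pg.handle_notify("stray", {"empty": False})          # step 2
pg.purge_strays(["stray"])                           # step 3
accepted = pg.handle_notify("stray", {"empty": False})  # step 6, now ignored
print(accepted, "stray" in pg.peer_info)
```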
Tommi Virtanen [Mon, 11 Jun 2012 22:27:02 +0000 (15:27 -0700)]
upstart: Read crush location and weight from ceph.conf.
This introduces two new config variables, osd_crush_location
and osd_crush_weight. Not currently included in config_opts.h,
as these are not used in the C++ code.
Yehuda Sadeh [Mon, 11 Jun 2012 17:14:43 +0000 (10:14 -0700)]
rgw: new config options
New config options for usage logging:
- rgw_enable_usage_log: enable usage logging
- rgw_usage_log_flush_threshold - limit on number of pending updates
before synchronously flushing update
- rgw_usage_log_tick_interval - asynchronous flush interval
- rgw_usage_max_shards - split info across that many objects
- rgw_usage_max_user_shards - split single user info across that many
objects
Yehuda Sadeh [Mon, 11 Jun 2012 17:11:17 +0000 (10:11 -0700)]
rgw: new class methods for handling usage information
The new methods are:
- user_usage_log_add: add new usage information
- user_usage_log_read: get usage information
- user_usage_log_trim: remove usage information
Josh Durgin [Mon, 11 Jun 2012 06:21:58 +0000 (23:21 -0700)]
cls_rbd: add get_all_features method
This is useful for reporting which features an osd supports, and for
testing rados_exec. Update the rados api tests to use this method
instead of test_exec, which was removed.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Josh Durgin [Sun, 10 Jun 2012 00:16:45 +0000 (17:16 -0700)]
librbd: ignore RBD_MAX_BLOCK_NAME_SIZE when generating object ids
The actual data object ids don't need to be artificially restricted in
length. RBD_MAX_BLOCK_NAME_SIZE just limits the size of the object
prefix, since it's used in rbd_info_t.
Josh Durgin [Fri, 8 Jun 2012 15:40:27 +0000 (08:40 -0700)]
rados: add commands to interact with object maps
The input values are stored as-is, and any values read are dumped in
hex. Rename listomap to listomapkeys to distinguish from
listomapvalues. Also add it to the man page.
Josh Durgin [Fri, 8 Jun 2012 15:07:40 +0000 (08:07 -0700)]
rbd: update for the new format
No features exist right now, so there are no extra options for them.
The old format is still used by default, and since the default will
change with layering, --new-format will be removed at that point and is
intentionally left undocumented.
Josh Durgin [Fri, 8 Jun 2012 14:43:32 +0000 (07:43 -0700)]
librbd: add create2 to create an image with the new format
This will fail if features are requested that the client or server
does not support. Currently there are no features defined, so
zero is the only valid value.
copy() preserves the format and features of the source image.
Sage Weil [Sat, 9 Jun 2012 05:29:02 +0000 (22:29 -0700)]
crushtool: drop useless clitest
This is an ancient test for an old 'bug' in functionality we're removing.
Also, it is sensitive to tester output, which will be changing a lot in
the coming weeks/months.
Sage Weil [Sat, 9 Jun 2012 03:39:41 +0000 (20:39 -0700)]
CrushTester: simplify, clean up mark down
- put it in a separate function
- operate on temporary weight vector, not user-modified input
- guard the whole thing with an #ifdef
- permute candidates and use first N, to ensure we end up picking the right
number of buckets/items.
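The last two points, working on a temporary weight vector and permuting candidates before taking the first N, can be sketched in Python. This is not the CrushTester code: the function name, the flat list of weights, and the rule that only items with non-zero weight are candidates are assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple


def mark_down_random(weights: List[int], n: int,
                     seed: Optional[int] = None) -> Tuple[List[int], List[int]]:
    """Return (temporary weights, chosen indices) with up to n items marked down."""
    rng = random.Random(seed)
    # Assumption: only items that are currently weighted in are candidates.
    candidates = [i for i, w in enumerate(weights) if w > 0]
    # Permute once and take the first n, so the count is exact whenever
    # enough candidates exist (no repeated picks of the same item).
    rng.shuffle(candidates)
    chosen = candidates[:n]
    temp = list(weights)  # copy; the caller's vector is left untouched
    for i in chosen:
        temp[i] = 0
    return temp, chosen


weights = [1, 1, 0, 1, 1]
temp, chosen = mark_down_random(weights, 2, seed=1)
print(len(chosen), weights)
```

Drawing items independently at random could pick the same item twice and mark down fewer than N; shuffling and slicing avoids that.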