Samuel Just [Thu, 10 Oct 2013 23:12:10 +0000 (16:12 -0700)]
ReplicatedBackend: implement RPGTransaction
RPGTransaction is essentially a wrapped ObjectStore::Transaction.
The coll_t argument is elided; tempness is instead encoded in the
hobject. RPGTransaction tracks which temp objects are created and
cleared so we can update the ReplicatedBackend tracking and possibly
create the temp collection as needed.
Samuel Just [Thu, 10 Oct 2013 23:10:36 +0000 (16:10 -0700)]
hobject_t/ReplicatedPG: tempness is now an hobject thing
PGBackend implementations will have complete control over the temp
collection. Rather than specifying the collection when sending
ops into the PGBackend, hobjects themselves will be temp or not.
Samuel Just [Thu, 5 Dec 2013 00:06:17 +0000 (16:06 -0800)]
test/osd: restructure Object/RadosModel in prep for append
Attribute handling no longer has special support in ContentsGenerator.
The most recent operation information is now stored in a special
attr rather than at the beginning of the object. ObjectDesc layers
include their own ContentsGenerators to allow more flexibility.
Also, writes truncate to the new object size rather than simply
causing reads to stop at that object size.
Samuel Just [Fri, 22 Nov 2013 19:20:23 +0000 (11:20 -0800)]
OSDMonitor: add debug_fake_ec_pool
This flag will cause ReplicatedPG to act as though the
pool were actually an EC pool in that operations will
be restricted to operations which can be locally rolled
back thereby allowing us to test the ReplicatedPG local
log rollback mechanisms independent of EC. It will also
cause ReplicatedPG to use the async read mechanism on
the PGBackend implementation once it is implemented.
Samuel Just [Sat, 7 Dec 2013 21:19:49 +0000 (13:19 -0800)]
PGLog: don't move up log.tail
Moving up log.tail unnecessarily risks backfilling
a replica after a split. Also, it disrupts the
property that replicas from the most recent interval
which performed writes must have overlapping logs.
Ilya Dryomov [Wed, 22 Jan 2014 15:33:39 +0000 (17:33 +0200)]
MOSDMap: reencode maps if target doesn't have OSDMAP_ENC
Reencode both full and incremental maps if target doesn't know how to
decode OSDMAP_ENC maps (CEPH_FEATURE_OSDMAP_ENC bit is not set). This
fixes a compatibility bug that was introduced in 3d7c69fb0986 ("OSDMap:
add a CEPH_FEATURE_OSDMAP_ENC feature, and use new encoding").
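The feature check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the MOSDMap code: the bit value, the reencode_legacy() helper, and the use of byte strings for maps are all hypothetical stand-ins.

```python
# Hypothetical bit position; the real CEPH_FEATURE_OSDMAP_ENC value is not given here.
FEATURE_OSDMAP_ENC = 1 << 0


def reencode_legacy(blob: bytes) -> bytes:
    """Hypothetical stand-in for re-encoding a map in the older format."""
    return b"legacy:" + blob


def maps_for_target(target_features: int, full_maps: dict, inc_maps: dict):
    """Return (full, incremental) maps in a form the target can decode."""
    if target_features & FEATURE_OSDMAP_ENC:
        # Target understands the new encoding; pass the maps through unchanged.
        return dict(full_maps), dict(inc_maps)
    # Target lacks the feature bit: re-encode both full and incremental maps.
    full = {epoch: reencode_legacy(blob) for epoch, blob in full_maps.items()}
    inc = {epoch: reencode_legacy(blob) for epoch, blob in inc_maps.items()}
    return full, inc
```

The point of the sketch is only that both kinds of map go through the same check, so a target without the bit never receives a new-style encoding.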
Kai Zhang [Sat, 18 Jan 2014 20:17:10 +0000 (12:17 -0800)]
Missing a key for perm 'w' in permmap (src/pybin/ceph_rest_api.py:277)
It leads to a 500 error when getting mds help info via rest api.
Changed "w" to "rw" in MonCommands.h
Fixes: #7180
Signed-off-by: Kai Zhang <kazhang2@cisco.com>
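The failure mode can be illustrated with a small sketch. The contents of permmap below are an assumption made for illustration, and status_for() is a hypothetical helper, not a function in ceph_rest_api.py; the sketch only shows how a permission string with no entry in the table turns into a server error.

```python
# Assumed shape of the lookup table; the real table in ceph_rest_api.py may differ.
permmap = {"r": "GET", "rw": "PUT"}


def method_for(perm: str) -> str:
    """Look up the HTTP method for a command's permission string."""
    return permmap[perm]  # raises KeyError when perm has no entry


def status_for(perm: str) -> int:
    """Hypothetical wrapper: an unhandled lookup failure surfaces as a 500."""
    try:
        method_for(perm)
        return 200
    except KeyError:
        return 500
```

With this table, a command declared with "w" fails the lookup, while the same command declared with "rw" succeeds, which is the effect of the change in MonCommands.h.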
Greg Farnum [Sat, 18 Jan 2014 01:23:33 +0000 (17:23 -0800)]
OSDMap: Populate primary_temp values a little more carefully
In _get_temp_osds(), we populate temp_pg from the list in the OSDMap,
but we also skip anybody in the list who's down. We need to account
for those skips when setting the primary. It's easy enough to do -- just
look at the output pg_temp list instead of the OSDMap's starting one.
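A minimal sketch of the idea, assuming the primary defaults to the first entry of the list: the function name, the is_up callback, and the use of -1 for "no primary" are hypothetical and are not taken from OSDMap.

```python
from typing import Callable, List, Tuple


def filter_temp_osds(pg_temp: List[int],
                     is_up: Callable[[int], bool]) -> Tuple[List[int], int]:
    """Drop down OSDs, then pick the primary from the filtered list."""
    # Skip anybody in the stored list who is down.
    temp_pg = [osd for osd in pg_temp if is_up(osd)]
    # Choose the primary from the output list, not from the stored one,
    # so that a down OSD at the front is never selected.
    primary = temp_pg[0] if temp_pg else -1
    return temp_pg, primary
```

For example, with a stored list of [3, 1, 2] and OSD 3 down, picking from the stored list would name OSD 3 as primary, whereas picking from the output list names OSD 1.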
Ilya Dryomov [Fri, 17 Jan 2014 09:49:40 +0000 (11:49 +0200)]
rbd: expose mount_timeout map option
Expose mount_timeout map option. (I missed it in commit 9b7364d2450c,
which added -o / --options option and among other options exposed
osdkeepalive and osd_idle_ttl timeouts.)
Concubidated [Thu, 16 Jan 2014 20:12:13 +0000 (12:12 -0800)]
osd: OSDMap: fix output from ceph status --format=json for num_in_osds
num_up_osds is returned as an int, while num_in_osds is returned as a string.
Since get_num_in_osds() can only return an int, num_in_osds should
also be an int to remain consistent with num_up_osds.
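The difference is visible to any client that parses the JSON output. The sketch below uses hypothetical helper names and a made-up two-field summary; it is not the OSDMap formatter code.

```python
import json


def summary_before(num_up_osds: int, num_in_osds: int) -> str:
    """Mimics the inconsistent output: num_in_osds emitted as a string."""
    return json.dumps({"num_up_osds": num_up_osds,
                       "num_in_osds": str(num_in_osds)})


def summary_after(num_up_osds: int, num_in_osds: int) -> str:
    """Both counters emitted as JSON numbers."""
    return json.dumps({"num_up_osds": num_up_osds,
                       "num_in_osds": num_in_osds})
```

A consumer comparing the two counters would have to convert one of them in the first form, but can compare them directly in the second.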
osd: OSDMap: build reverse name->pool map upon decoding
Commit 3d7c69fb09 introduced a new OSDMap encoding/decoding scheme.
However, while the classic decoding function still kept building the
reverse name->pool map, the new decoding function did not, causing the
monitor to be unable to map pool names to pool ids.
This patch fixes this, by factoring out the loop responsible for
populating the 'name_pool' map, as well as calling 'calc_num_osds()', to
OSDMap::post_decode() and having this function called from both the
classic and the new decode functions.
Fixes: 7166
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
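The refactoring pattern can be sketched as follows. The class, the pool_name field, and the dict-based "decoding" are hypothetical simplifications; the call to calc_num_osds() mentioned in the commit is omitted.

```python
class OSDMapSketch:
    """Toy model of a map with two decode paths sharing one post-decode hook."""

    def __init__(self):
        self.pool_name = {}  # pool id -> pool name (assumed field name)
        self.name_pool = {}  # pool name -> pool id (the reverse map)

    def post_decode(self):
        # Shared step: rebuild the reverse name->pool map from decoded data.
        self.name_pool = {name: pid for pid, name in self.pool_name.items()}

    def decode_classic(self, pools: dict):
        self.pool_name = dict(pools)
        self.post_decode()

    def decode(self, pools: dict):
        self.pool_name = dict(pools)
        # Without this call the new path would leave name_pool empty.
        self.post_decode()
```

Putting the derived-state rebuild in one function means a future third decode path only has to remember a single call.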
6748: rgw: Optionally return the bucket name in a response header.
This can be useful in situations where accounting of traffic is done externally,
for example when HTTP traffic is cached by a reverse proxy like Varnish.
Since not all traffic reaches the RGW daemon, it can't fully account for all traffic,
and thus the caching proxy needs to be aware of which bucket the request was for.
Greg Farnum [Wed, 15 Jan 2014 22:51:35 +0000 (14:51 -0800)]
OSDMonitor: make sure we don't send out maps with a primary_temp mapping
Making sure a cluster supports primary_temp is complicated and we don't
have any of the machinery in place right now (nor a need to actually support
it). We don't have any mechanisms for setting it to begin with, so assert
that we never create anything with any such mapping in update_from_paxos()
to catch any errors.
Greg Farnum [Sat, 11 Jan 2014 00:52:39 +0000 (16:52 -0800)]
test: add an OSDMap unittest
This is not super-sophisticated, but it does basic mapping function
consistency checks and looks at the [pg|primary]_temp manipulations. If
we want to in the future, we can do these programmatically across a range
of pgids instead of just checking hash 0.
Greg Farnum [Fri, 20 Dec 2013 23:21:18 +0000 (15:21 -0800)]
OSDMap: pay attention to the temp_primary in _get_temp_osds
Switch _get_temp_osds to use pointers instead of references, and force callers
to check the out params instead of relying on a return code to tell whether
anything was set (a single return code cannot provide usable semantics when
there are two possible outputs). For the new temp_primary out param, fill it
in from temp_primary if set, or from the pg_temp list if that is set, or leave
it blank if neither is.
Also, don't use pointers to heap elements. Just put the ints and vectors on
the stack, and assign/swap the out parameters with them. This is less
confusing and should be a bit faster in general.
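The precedence for temp_primary can be sketched as below. Python has no out params, so this sketch returns both outputs and lets the caller inspect each one separately; the function name, the dict arguments, and the choice of the first list entry are assumptions for illustration only.

```python
from typing import Dict, List, Optional, Tuple


def resolve_temp(pg: str,
                 pg_temp: Dict[str, List[int]],
                 primary_temp: Dict[str, int]) -> Tuple[List[int], Optional[int]]:
    """Return (temp_pg, temp_primary); either may be empty or None."""
    # Build the results in locals first, then hand them back together.
    temp_pg = list(pg_temp.get(pg, []))
    if pg in primary_temp:
        # An explicit temp primary takes precedence.
        temp_primary: Optional[int] = primary_temp[pg]
    elif temp_pg:
        # Otherwise fall back to the pg_temp list (first entry assumed).
        temp_primary = temp_pg[0]
    else:
        # Neither is set: leave it blank.
        temp_primary = None
    return temp_pg, temp_primary
```

Because the two outputs can be set independently, a single boolean return value could not say which of them was filled in, which is why callers check each output on its own.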