Sage Weil [Thu, 11 Jun 2015 20:30:16 +0000 (13:30 -0700)]
osd/PGBackend: set correct shard in objects_list_partial
We need to list objects within the current shard only. We could get away
with being sloppy about shard previously when the ghobject_t sort order
was broken, but not in the new world. Here, it is only necessary that we
capture all generations of the object to get the marker.
Sage Weil [Fri, 12 Jun 2015 16:33:47 +0000 (12:33 -0400)]
osd: kill META_COLL constant; use named ctor
Note: this kills a subtle bug *somewhere* where meta coll_t's pgid field
is 0.0 vs 0.0s0 (it appears the static META_COLL gets 0.0s0). Statics are
bad news so kill.
Sage Weil [Sun, 11 Jan 2015 20:17:11 +0000 (12:17 -0800)]
ceph-object-corpus: drop coll_t 'foo' and 'bar'
We don't care whether these parse in real clusters: they are no longer
legal, and we still want to test other old coll_t instances, so just drop
these ones.
Sage Weil [Tue, 6 Jan 2015 00:31:15 +0000 (16:31 -0800)]
os/KeyValueStore: change collection master list strategy
Don't use a pseudo-collection COLLECTIONS. We aren't allowed to make
strangely-named coll_t's any more. Instead, store the collection list in
a simple encoded buffer that's stored as a single key in the backend db.
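As a rough illustration of keeping the collection list in one encoded buffer, the sketch below serializes a list of names into a single string that could be stored under one key. The function names and the length-prefixed little-endian layout are assumptions made for this example; the commit does not describe the actual encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helpers: append/read a 32-bit little-endian integer.
inline void put_u32(std::string& out, uint32_t v) {
  for (int i = 0; i < 4; ++i)
    out.push_back(static_cast<char>((v >> (8 * i)) & 0xff));
}

inline bool get_u32(const std::string& in, std::size_t& pos, uint32_t* v) {
  if (in.size() - pos < 4) return false;  // pos never exceeds in.size()
  *v = 0;
  for (int i = 0; i < 4; ++i)
    *v |= static_cast<uint32_t>(static_cast<unsigned char>(in[pos + i])) << (8 * i);
  pos += 4;
  return true;
}

// Encode the whole collection list as: count, then (length, bytes) per name.
inline std::string encode_list(const std::vector<std::string>& names) {
  std::string out;
  put_u32(out, static_cast<uint32_t>(names.size()));
  for (const std::string& n : names) {
    put_u32(out, static_cast<uint32_t>(n.size()));
    out += n;
  }
  return out;
}

// Decode the buffer; returns false on truncated or trailing data.
inline bool decode_list(const std::string& in, std::vector<std::string>* names) {
  std::size_t pos = 0;
  uint32_t count = 0;
  if (!get_u32(in, pos, &count)) return false;
  names->clear();
  for (uint32_t i = 0; i < count; ++i) {
    uint32_t len = 0;
    if (!get_u32(in, pos, &len)) return false;
    if (in.size() - pos < len) return false;
    names->push_back(in.substr(pos, len));
    pos += len;
  }
  return pos == in.size();
}
```

The resulting string would be written as the value of a single backend key, so no specially named pseudo-collection is needed to hold the list.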
Sage Weil [Mon, 5 Jan 2015 23:24:35 +0000 (15:24 -0800)]
osd/osd_types: add coll_t::parse() method
This will explicitly validate the form of the input string to make sure it
is a recognized collection. (Eventually we can then store things
internally as something other than a string.)
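A minimal sketch of what validating the form of a collection string might look like. The accepted forms here ("meta", "<pgid>_head", "<pgid>_TEMP"), the pgid shape, and all names are assumptions for illustration; this is not the actual coll_t::parse() and it ignores shard suffixes.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class coll_kind { invalid, meta, pg, pg_temp };

// Assumed pgid shape for this sketch: decimal pool, '.', hex seed.
inline bool looks_like_pgid(const std::string& s) {
  std::size_t dot = s.find('.');
  if (dot == std::string::npos || dot == 0 || dot + 1 == s.size()) return false;
  for (std::size_t i = 0; i < dot; ++i)
    if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
  for (std::size_t i = dot + 1; i < s.size(); ++i)
    if (!std::isxdigit(static_cast<unsigned char>(s[i]))) return false;
  return true;
}

// If s ends with suffix (and has something before it), store the prefix in rest.
inline bool strip_suffix(const std::string& s, const std::string& suffix,
                         std::string* rest) {
  if (s.size() <= suffix.size()) return false;
  if (s.compare(s.size() - suffix.size(), suffix.size(), suffix) != 0) return false;
  *rest = s.substr(0, s.size() - suffix.size());
  return true;
}

// Accept only recognized forms; anything else is rejected explicitly.
inline coll_kind parse_coll(const std::string& s) {
  if (s == "meta") return coll_kind::meta;
  std::string pgid;
  if (strip_suffix(s, "_head", &pgid) && looks_like_pgid(pgid)) return coll_kind::pg;
  if (strip_suffix(s, "_TEMP", &pgid) && looks_like_pgid(pgid)) return coll_kind::pg_temp;
  return coll_kind::invalid;
}
```

Under these assumptions, arbitrary names such as "foo" or "COLLECTIONS" are rejected, which is the point of validating the string up front.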
Sage Weil [Mon, 22 Dec 2014 22:44:49 +0000 (14:44 -0800)]
os/FileStore: force temp objects into _TEMP temp collection
We've removed the temp collection concept from the ObjectStore interface,
but the FileStore HashIndex will mix objects with different pools together
by hash id, which breaks the ordering.
Compensate by forcing objects with 'temp' pool ids (anything < -1) into
a parallel temp collection. Hide this detail inside FileStore, below the
ObjectStore interface and above the HashIndex backend.
Sage Weil [Tue, 23 Dec 2014 00:39:38 +0000 (16:39 -0800)]
shard_id_t: change NO_SHARD to sort before 0 (min instead of max)
The min ghobject has shard NO_SHARD, and is the default constructed value.
That initial value is assumed in uncounted ways across the code base when
users do
ghobject_t foo;
foo.this = that;
such that changing it is dangerous. It is safer to change the shard_id_t
sort order such that NO_SHARD is signed instead of unsigned. The value
doesn't actually change (still 0xff), but the sorting does. Note that
only a single comparison triggers a signed/unsigned warning from this
change, and it assumes that the shard is not NO_SHARD (ec pool) and we
cast it to preserve the old behavior anyway.
In PGBackend we change the minimum value for the objects_list_partial()
method to start with NO_SHARD.
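The signed versus unsigned point can be illustrated with a small sketch. The names below are hypothetical; only the 0xff value and the "sort before 0" behavior come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the NO_SHARD value, held as a signed byte.
constexpr int8_t NO_SHARD_SKETCH = -1;

// The stored byte is unchanged: -1 as an unsigned 8-bit value is 0xff.
static_assert(static_cast<uint8_t>(NO_SHARD_SKETCH) == 0xff,
              "byte value stays 0xff");

// Same bits, different ordering depending on signedness of the comparison.
constexpr bool less_signed(int8_t a, int8_t b) { return a < b; }
constexpr bool less_unsigned(uint8_t a, uint8_t b) { return a < b; }

// Signed: NO_SHARD sorts before shard 0 (min).
static_assert(less_signed(NO_SHARD_SKETCH, 0), "signed: before 0");
// Unsigned: the same byte sorts after every real shard (max).
static_assert(less_unsigned(0, static_cast<uint8_t>(NO_SHARD_SKETCH)),
              "unsigned: after 0");
```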
Sage Weil [Thu, 11 Dec 2014 23:55:55 +0000 (15:55 -0800)]
ceph_objectstore_test: a few simple collection_list_partial tests
Add a simple list test, and a second one that mixes different pool ids
into the same collection. The latter confirms that we can deal with
ghobject_t's in the temp pool space within a single collection.
Sage Weil [Fri, 12 Dec 2014 00:28:48 +0000 (16:28 -0800)]
ghobject_t: change sort order (max, shard, hobj, gen)
Go from (hobj, shard, gen) -> (max, shard, hobj, gen)
This makes the ghobject_t's sort in a way that groups them by PG, which
will likely be useful in the future for ObjectStore implementations.
Notably, we get a ghobject_t MAX value that is distinct from the
hobject_t one.
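A sketch of the two orderings, using heavily simplified stand-in structs. The field sets and names are assumptions for illustration (the real hobject_t has more fields); only the tuple orders (hobj, shard, gen) and (max, shard, hobj, gen) come from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Simplified, hypothetical stand-in for hobject_t.
struct hobj_sketch {
  int64_t pool;
  uint32_t hash;
  std::string name;
};

inline bool operator<(const hobj_sketch& a, const hobj_sketch& b) {
  return std::tie(a.pool, a.hash, a.name) < std::tie(b.pool, b.hash, b.name);
}

// Simplified, hypothetical stand-in for ghobject_t.
struct ghobj_sketch {
  bool max;
  int8_t shard;
  hobj_sketch hobj;
  uint64_t gen;
};

// Old order: (hobj, shard, gen).
inline bool less_old(const ghobj_sketch& a, const ghobj_sketch& b) {
  return std::tie(a.hobj, a.shard, a.gen) < std::tie(b.hobj, b.shard, b.gen);
}

// New order: (max, shard, hobj, gen). Shard now dominates the object,
// and a set max flag sorts after everything else.
inline bool less_new(const ghobj_sketch& a, const ghobj_sketch& b) {
  return std::tie(a.max, a.shard, a.hobj, a.gen) <
         std::tie(b.max, b.shard, b.hobj, b.gen);
}
```

With these stand-ins, an object in shard 0 of pool 1 sorts after an object in shard 1 of pool 0 under the old order, but before it under the new one, so objects of the same shard stay adjacent.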
Sage Weil [Fri, 12 Dec 2014 00:25:49 +0000 (16:25 -0800)]
hobject_t: modify operator<<
Put the most significant fields to the left so that it matches the sort
order. Also use unambiguous separator when the nspace is present
(like we do with the key).
Sage Weil [Fri, 12 Dec 2014 00:24:16 +0000 (16:24 -0800)]
hobject_t: change default pool id to INT64_MIN
We are including pool in a (more) significant part of the sort and using
negative pool IDs; the default must be the most negative (min) so that
it is a useful starting point for a search.
Sage Weil [Wed, 10 Dec 2014 23:45:26 +0000 (15:45 -0800)]
osd: eliminate temp collections
The temp objects have distinct pool ids. Old temp objects are already
blown away on OSD restart. This patch removes all the futzing with
temp_coll and puts the temp objects in the same collection as everything
else.
Interestingly, collection_move_rename is now always using the same source
and dest collection. Hmm!
Sage Weil [Wed, 10 Dec 2014 22:35:11 +0000 (14:35 -0800)]
osd: use per-pool temp poolid for temp objects
Previously, all temp objects had poolid == -1. Instead, use -2 - poolid.
This, when combined with the PG hash, provides a unique temp namespace
per pg.
This has no impact on upgrade, since we delete all temp objects on startup
by collection (coll_t::is_temp()).
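The pool id mapping can be written out as a short sketch. The helper names are hypothetical; the "-2 - poolid" formula is from this commit, and the "anything < -1 is a temp pool id" test is the rule stated in the FileStore commit above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: temp pool id for a given base pool.
constexpr int64_t temp_pool_for(int64_t pool) { return -2 - pool; }

// Hypothetical helper: the mapping is its own inverse, -2 - (-2 - p) == p.
constexpr int64_t base_pool_for(int64_t temp_pool) { return -2 - temp_pool; }

// Temp pool ids are everything below -1, so -1 itself is not reused.
constexpr bool is_temp_pool(int64_t pool) { return pool < -1; }

static_assert(temp_pool_for(0) == -2, "pool 0 maps to -2");
static_assert(temp_pool_for(5) == -7, "pool 5 maps to -7");
static_assert(base_pool_for(temp_pool_for(5)) == 5, "round trip");
static_assert(is_temp_pool(temp_pool_for(0)), "temp ids are below -1");
static_assert(!is_temp_pool(-1), "-1 is outside the temp range");
```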
Sage Weil [Tue, 30 Dec 2014 18:16:10 +0000 (10:16 -0800)]
osd: use a temporary object for recovery
Currently we recover objects directly into position by deleting and then
overwriting the target object. This means that we lose the object if we
are recovering in multiple steps and we fail partway through.
This is also the last user of collection_move(), which we would like to
deprecate.
Instead, generate a unique temp object name (the tuple of pgid, object
version, and snap is unique) and recover to that. Use the existing temp object cleanup
machinery to throw out a partial recovery result.
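The recover-to-temp-then-rename idea can be sketched against a toy in-memory store. Everything here (the map-based store, the function names, the simulated failure parameter) is hypothetical and only illustrates why a partial recovery no longer destroys the target object.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy object store: object name -> contents.
using Store = std::map<std::string, std::string>;

// Recover an object in several steps into temp_name, then move it into
// place. fail_after simulates a failure before chunk index fail_after;
// pass chunks.size() for a run with no failure.
inline bool recover_via_temp(Store& store, const std::string& target,
                             const std::string& temp_name,
                             const std::vector<std::string>& chunks,
                             std::size_t fail_after) {
  std::string& temp = store[temp_name];
  temp.clear();
  for (std::size_t i = 0; i < chunks.size(); ++i) {
    if (i == fail_after) return false;  // target untouched, temp left behind
    temp += chunks[i];
  }
  std::string done = std::move(temp);
  store.erase(temp_name);
  store[target] = std::move(done);  // only now is the target replaced
  return true;
}

// Stand-in for the temp object cleanup that discards partial results.
inline void discard_temp(Store& store, const std::string& temp_name) {
  store.erase(temp_name);
}
```

If the recovery stops partway, the old target contents are still present and only the temp object needs to be thrown away.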
Owen Synge [Thu, 18 Jun 2015 12:16:03 +0000 (14:16 +0200)]
Fixes to rcceph script
- only start OSDs if mon daemons are also present
- adds support for mask and unmask
- removes support for clusters with a non-default cluster name,
as this was very limited and inconsistent
- Reapplied from a patch, as cherry-picking 66cb46c411d874be009c225450eea5021cf1219b from Mon Jan 12
produced issues with src/gmock