git.apps.os.sepia.ceph.com Git - ceph.git/log

Greg Farnum [Thu, 19 Dec 2013 01:40:11 +0000 (17:40 -0800)]

OSDMonitor: implement remove_down_primary_temp()

Same as remove_down_pg_temp()

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Wed, 15 Jan 2014 23:24:38 +0000 (15:24 -0800)]

OSDMonitor: make remove_redundant_pg_temp clear primary, too

So that this works with future CRUSH changes, we copy the map and clear
out the primary_temp, then compare its output with the real map's output. If
they match, remove the primary_temp from the real map.

Signed-off-by: Greg Farnum <greg@inktank.com>
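
The copy-and-compare approach described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the OSDMonitor code: `ToyMap`, `primary_of` and `remove_redundant_primary_temp` are hypothetical names, and the rule "primary is the first OSD unless overridden" is an assumption made for the example.

```cpp
#include <map>
#include <vector>

// Hypothetical stand-in for the OSD map; not Ceph's real OSDMap class.
struct ToyMap {
    std::map<int, std::vector<int>> raw;  // pg -> OSD list from placement
    std::map<int, int> primary_temp;      // pg -> overriding primary

    // Primary for a PG: the override if present, else the first OSD, else -1.
    int primary_of(int pg) const {
        auto t = primary_temp.find(pg);
        if (t != primary_temp.end()) return t->second;
        auto r = raw.find(pg);
        if (r == raw.end() || r->second.empty()) return -1;
        return r->second.front();
    }
};

// Remove every primary_temp entry whose removal would not change the answer.
// Returns the number of entries removed.
int remove_redundant_primary_temp(ToyMap& m) {
    ToyMap scratch = m;            // copy the map...
    scratch.primary_temp.clear();  // ...and clear the overrides in the copy
    int removed = 0;
    for (auto it = m.primary_temp.begin(); it != m.primary_temp.end();) {
        if (scratch.primary_of(it->first) == m.primary_of(it->first)) {
            it = m.primary_temp.erase(it);  // override changes nothing
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Comparing outputs rather than inspecting the placement logic is what keeps the check independent of how the primary is computed, which appears to be the point of "works with future CRUSH changes".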


Greg Farnum [Thu, 19 Dec 2013 01:53:51 +0000 (17:53 -0800)]

OSDMonitor: remove primary_temp entries when you remove their pool

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Thu, 19 Dec 2013 01:53:28 +0000 (17:53 -0800)]

OSDMap: expose the primary_temp in print()

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Thu, 19 Dec 2013 01:53:11 +0000 (17:53 -0800)]

OSDMap: dedup the primary_temp

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Thu, 19 Dec 2013 01:41:54 +0000 (17:41 -0800)]

OSDMap: add primary_temp to apply_incremental()

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Fri, 13 Dec 2013 01:21:38 +0000 (17:21 -0800)]

OSDMap: add [new_]primary_temp to the map and Incremental

It's not used actively yet, but there it is.

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Thu, 12 Dec 2013 23:35:51 +0000 (15:35 -0800)]

OSDMap: update Incremental encode/decode to match the full map's

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Thu, 12 Dec 2013 23:35:23 +0000 (15:35 -0800)]

OSDMap: add a CEPH_FEATURE_OSDMAP_ENC feature, and use new encoding

Bring our OSDMap encoding into the modern Ceph world! :) This is
fairly straightforward, but has a few rough edges:

Previously we had a "struct_v" which went at the beginning of the
OSDMap encoding, and then later on an ev "extended version" which
was used to store the more-frequently-changed OSDMap pieces. There
was no size information stored explicitly to let clients skip this,
but osd maps were always encoded into their own bufferlist before
being sent to clients, which had the same effect.

We now use the modern ENCODE_START three times:
1) for the overall OSDMap encoding,
2) for the client-usable portion of the map,
3) for the "extended" portion of the map.

This will let us independently rev everything, which may come in
useful if we want to (for instance) add a "monitor" portion to the
map that the OSDs don't care about. It also makes adding new
client information a lot easier since older clients will still
be able to decode the map as a whole.

We may want to merge this OSDMAP_ENC feature with one of the others
we are creating during this cycle, since they're all very closely
related. That will also let us protect more naturally against old
clients getting a map they need to understand but can't (because
we only need the new map features-to-come when used with erasure-encoded
PGs, etc).

Signed-off-by: Greg Farnum <greg@inktank.com>
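
To illustrate why an explicit size lets a reader skip what it does not understand, here is a toy sketch of three nested, length-prefixed sections. It is not Ceph's ENCODE_START macro or wire format: the framing (1-byte version, 4-byte little-endian length) and the names `encode_section`, `decode_section` and `encode_map` are assumptions chosen for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Toy framing, NOT Ceph's wire format: [version:1][length:4 LE][payload].
std::string encode_section(std::uint8_t version, const std::string& payload) {
    std::string out;
    out.push_back(static_cast<char>(version));
    std::uint32_t len = static_cast<std::uint32_t>(payload.size());
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<char>((len >> (8 * i)) & 0xff));
    out += payload;
    return out;
}

struct Section {
    std::uint8_t version;
    std::string payload;
};

// Decode one section at pos and advance pos past it. The stored length is
// what lets a reader step over payload bytes it cannot interpret.
Section decode_section(const std::string& buf, std::size_t& pos) {
    if (pos > buf.size() || buf.size() - pos < 5)
        throw std::runtime_error("truncated header");
    Section s;
    s.version = static_cast<std::uint8_t>(buf[pos]);
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i)
        len |= static_cast<std::uint32_t>(
                   static_cast<std::uint8_t>(buf[pos + 1 + i])) << (8 * i);
    if (buf.size() - pos - 5 < len)
        throw std::runtime_error("truncated payload");
    s.payload = buf.substr(pos + 5, len);
    pos += 5 + len;
    return s;
}

// Outer section wrapping a client-usable section and an extended section,
// mirroring the three-level layout the commit message describes.
std::string encode_map(const std::string& client_part,
                       const std::string& extended_part) {
    std::string inner = encode_section(1, client_part) +
                        encode_section(1, extended_part);
    return encode_section(1, inner);
}
```

Because each section carries its own version and length, the client portion and the extended portion can be revised independently, and a reader that only cares about the first inner section can stop after decoding it.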


Greg Farnum [Fri, 20 Dec 2013 21:45:38 +0000 (13:45 -0800)]

OSDMap: add primary out param to pg_to_raw_up, and use pointers instead of refs

The only user is in the OSDMonitor, and it's going to want that
information anyway.

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Fri, 20 Dec 2013 23:26:11 +0000 (15:26 -0800)]

OSDMap: add primary-specifying pg_to_acting_osds

This works the same as pg_to_up_acting_osds

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Sat, 14 Dec 2013 00:28:05 +0000 (16:28 -0800)]

mon, osdmaptool: switch to primary-specifying pg_to_up_acting_osds

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Fri, 20 Dec 2013 21:35:28 +0000 (13:35 -0800)]

OSDMap: implement pg_to_up_acting_osds with primary interface

Use our pointer calling conventions instead of a reference for the
new version of the function.

Right now we're just setting the primaries equal to the first member
of up and acting (or -1 if none), but very shortly we'll modify our
private OSDMap functions to export them based on the contents of primary_temp.

While in general anybody querying for the mapping information will
need to pay attention to who the primary is as well, we have lots
of callers who will need real code changes to do so. To serve them,
we keep a version that does not export the primary, but asserts
that the primary matches the first entry in its list.

Signed-off-by: Greg Farnum <greg@inktank.com>
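
The two interfaces described in the commit above might look roughly like this. It is a sketch under stated assumptions: the function names, the `first_or_none` helper and the idea of passing precomputed vectors in place of the real mapping computation are all hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: the primary is the first member, or -1 if none.
int first_or_none(const std::vector<int>& osds) {
    return osds.empty() ? -1 : osds.front();
}

// New-style interface: pointer out params, primaries reported explicitly.
// computed_up/computed_acting stand in for the real mapping computation.
// All four out pointers must be non-null in this sketch.
void pg_to_up_acting(const std::vector<int>& computed_up,
                     const std::vector<int>& computed_acting,
                     std::vector<int>* up, int* up_primary,
                     std::vector<int>* acting, int* acting_primary) {
    *up = computed_up;
    *up_primary = first_or_none(computed_up);
    *acting = computed_acting;
    *acting_primary = first_or_none(computed_acting);
}

// Legacy-style interface: does not export the primaries, but asserts that
// each one matches the first entry of its list, which is what callers of
// this version implicitly rely on.
void pg_to_up_acting_legacy(const std::vector<int>& computed_up,
                            const std::vector<int>& computed_acting,
                            std::vector<int>& up, std::vector<int>& acting) {
    int up_primary = -1;
    int acting_primary = -1;
    pg_to_up_acting(computed_up, computed_acting,
                    &up, &up_primary, &acting, &acting_primary);
    assert(up_primary == first_or_none(up));
    assert(acting_primary == first_or_none(acting));
}
```

The assertions in the legacy wrapper would start firing as soon as the primary can differ from the first entry, which flags the callers that still need real code changes.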


Greg Farnum [Fri, 20 Dec 2013 21:30:40 +0000 (13:30 -0800)]

OSDMap: switch pg_to_osds to have an explicit primary param

Use pointers instead of references for the out params, too!

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Thu, 19 Dec 2013 02:14:15 +0000 (18:14 -0800)]

OSDMap: rename _raw_to_temp_osds() -> _get_temp_osds()

This function does not use the raw vector (and never has!), so remove it
and don't use a name which implies it is doing any sort of conversion.

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Fri, 13 Dec 2013 21:55:31 +0000 (13:55 -0800)]

OSDMap: unify the pg_to_acting_osds and pg_to_up_acting_osds implementations

These were the same except for a call to _raw_to_up_osds(). Move the
existing pg_to_up_acting_osds into a private function taking a pointer,
only fill in the up vector if it's a non-NULL pointer, and call it via
the obvious header implementations.

Signed-off-by: Greg Farnum <greg@inktank.com>
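
The unification described in the commit above, one private function with an optional up pointer and two thin public wrappers, can be sketched as follows. `ToyMapping` and its stored vectors are hypothetical; only the shape of the calls follows the commit message.

```cpp
#include <utility>
#include <vector>

// Hypothetical class; the stored vectors stand in for the real computation.
class ToyMapping {
public:
    ToyMapping(std::vector<int> up, std::vector<int> acting)
        : up_(std::move(up)), acting_(std::move(acting)) {}

    // Callers that only want the acting set pass no up vector.
    void pg_to_acting_osds(std::vector<int>& acting) const {
        _pg_to_up_acting_osds(nullptr, &acting);
    }

    // Callers that want both get both from the same code path.
    void pg_to_up_acting_osds(std::vector<int>& up,
                              std::vector<int>& acting) const {
        _pg_to_up_acting_osds(&up, &acting);
    }

private:
    // Single implementation: fill in up only if a non-null pointer is given.
    void _pg_to_up_acting_osds(std::vector<int>* up,
                               std::vector<int>* acting) const {
        if (up) *up = up_;
        *acting = acting_;
    }

    std::vector<int> up_;
    std::vector<int> acting_;
};
```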


Greg Farnum [Fri, 13 Dec 2013 21:28:42 +0000 (13:28 -0800)]

OSDMap: remove get_pg_primary() function

This was used only by SyntheticClient, and that wants get_pg_acting_primary()
anyway. Delete the easily-misused get_pg_primary() and switch.

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Fri, 13 Dec 2013 21:36:23 +0000 (13:36 -0800)]

OSDMap: doc the different pg->OSD mapping functions

Some of these look like what you should use for mapping and they absolutely
are not suitable for that. Make it clearer.

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Tue, 14 Jan 2014 22:56:31 +0000 (14:56 -0800)]

osd: do not misuse calc_pg_role

We've been using the role returned from this to determine if we're
the primary or not. Don't.
This is mostly about removing a few asserts; while in there I also
redirected some calls to use static dereference instead of going through
the osdmap lookup path.

Signed-off-by: Greg Farnum <greg@inktank.com>


Greg Farnum [Fri, 13 Dec 2013 22:48:51 +0000 (14:48 -0800)]

PG: do not use role == 0 as a determinant of primacy

We already have an is_primary() function to use instead.

Signed-off-by: Greg Farnum <greg@inktank.com>
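
A small sketch of why a role of 0 and primacy are not interchangeable once the primary can be named explicitly. `ToyPG` is a hypothetical struct, and the assumption that role is the OSD's position in the acting set is made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical PG view from one OSD; not Ceph's PG class.
struct ToyPG {
    int whoami;               // this OSD's id
    std::vector<int> acting;  // acting set, in order
    int primary;              // explicit primary, may differ from acting[0]

    // Assumed meaning of "role": position in the acting set, or -1.
    int role() const {
        for (std::size_t i = 0; i < acting.size(); ++i)
            if (acting[i] == whoami) return static_cast<int>(i);
        return -1;
    }

    // Primacy is decided by the explicit primary, not by position.
    bool is_primary() const { return whoami == primary; }
};
```

With `acting = {3, 4, 5}` and `primary = 4`, OSD 4 has role 1 yet is the primary, so a `role == 0` test would give the wrong answer on both OSD 3 and OSD 4.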


Josh Durgin [Wed, 15 Jan 2014 23:28:31 +0000 (15:28 -0800)]

Merge pull request #978 from ceph/wip-3454

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>


Yehuda Sadeh [Wed, 15 Jan 2014 23:12:40 +0000 (15:12 -0800)]

radosgw-admin: add temp url params to usage

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>


athanatos [Wed, 15 Jan 2014 18:25:28 +0000 (10:25 -0800)]

Merge pull request #1089 from dachary/wip-mailmap

mailmap: add athanatos <sam.just@inktank.com>

Reviewed-by: Samuel Just <sam.just@inktank.com>


athanatos [Wed, 15 Jan 2014 18:22:47 +0000 (10:22 -0800)]

Merge pull request #963 from dachary/wip-erasure-code-api

erasure code interface helpers

Reviewed-by: Samuel Just <sam.just@inktank.com>


John Wilkins [Wed, 15 Jan 2014 18:08:28 +0000 (10:08 -0800)]

doc: Updated paths for OSDs using the OS disk.

fixes: #6682

Signed-off-by: John Wilkins <john.wilkins@inktank.com>


Loic Dachary [Wed, 15 Jan 2014 08:23:09 +0000 (09:23 +0100)]

mailmap: add athanatos <sam.just@inktank.com>

Signed-off-by: Loic Dachary <loic@dachary.org>


Sage Weil [Wed, 15 Jan 2014 05:57:48 +0000 (21:57 -0800)]

Merge pull request #1084 from dachary/wip-cephtool-test

qa: cleanup cephtool/test.sh tmp files

Reviewed-by: Sage Weil <sage@inktank.com>


Gregory Farnum [Tue, 14 Jan 2014 23:12:34 +0000 (15:12 -0800)]

Merge pull request #1085 from dachary/ceph-master

Reviewed-by: Greg Farnum <greg@inktank.com>


Loic Dachary [Tue, 14 Jan 2014 17:25:55 +0000 (18:25 +0100)]

common: fix bufferlist::append(istream) test

bufferlist::append(istream) now filters out empty lines; reflect this in
the test

Signed-off-by: Loic Dachary <loic@dachary.org>


Sage Weil [Tue, 14 Jan 2014 17:37:41 +0000 (09:37 -0800)]

doc/release-notes: v0.75

Signed-off-by: Sage Weil <sage@inktank.com>


Loic Dachary [Tue, 14 Jan 2014 11:39:24 +0000 (12:39 +0100)]

qa: cleanup cephtool/test.sh tmp files

When run in a shared environment (as opposed to a machine created
solely to run this test), it is important to clean up leftovers to
avoid polluting the /tmp space. Create a common temporary
directory for all tmp files.

Signed-off-by: Loic Dachary <loic@dachary.org>


Ken Dreyer [Tue, 14 Jan 2014 16:16:41 +0000 (16:16 +0000)]

Merge branch 'next'


Loic Dachary [Tue, 14 Jan 2014 16:10:59 +0000 (08:10 -0800)]

Merge pull request #1076 from dachary/wip-vector-op

erasure-code: use uintptr_t instead of long long

Reviewed-by: Andreas Peters <andreas.joachim.peters@cern.ch>


Loic Dachary [Tue, 14 Jan 2014 06:38:09 +0000 (22:38 -0800)]

Merge pull request #1078 from ceph/wip-mon-pgmap

mon: make 'pg getmap' not include a trailing newline

Reviewed-by: Loic Dachary <loic@dachary.org>


Sage Weil [Tue, 14 Jan 2014 01:43:49 +0000 (17:43 -0800)]

Merge pull request #1071 from ceph/wip-max-file-size

allow mds max file size to be adjusted

Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>


Sage Weil [Tue, 14 Jan 2014 00:50:17 +0000 (16:50 -0800)]

Merge pull request #1058 from ceph/wip-cache-snap

snap/clone promotion, flush, and other goodies

This is now passing the thrashing with both cache and snap ops:
sage-2014-01-13_15:45:26-rados:thrash-wip-cache-snap-testing-basic-plana

Reviewed-by: Samuel Just <sam.just@inktank.com>


Sage Weil [Mon, 13 Jan 2014 23:09:27 +0000 (15:09 -0800)]

osd/ReplicatedPG: use get_object_context in trim_object

find_object_context() has all the logic to choose a particular clone given
a logical snap. In the trim case, we want none of that: we just need to
pull the obc for a specific clone instance. Note that this changes
none of the failure cases (previously we asserted r == 0).

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 10 Jan 2014 19:12:48 +0000 (11:12 -0800)]

ceph_test_rados: do not delete in-use snaps

There are a bunch of ops that read from snaps. Do not delete a snap
while they are in use.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 10 Jan 2014 04:59:36 +0000 (20:59 -0800)]

osd/OSDMonitor: fix 'osd tier add ...' pool mangling

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 10 Jan 2014 00:04:21 +0000 (16:04 -0800)]

osd/ReplicatedPG: update ObjectContext's object_info_t for new hit_set objects

We were fabricating an object_info_t correctly and writing it to disk, but
it was not reflected by the in-memory ObjectContext. If something came
along quickly (like backfill) and tried to use it, the info would be
invalid.

Fix this by fabricating it in the obc and copying it to the new_obs for
the update.

Fixes: #7122
Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Thu, 9 Jan 2014 22:49:52 +0000 (14:49 -0800)]

osd/ReplicatedPG: always return ENOENT on deleted snap

Previously, if a snap was deleted but the clone was there and we hadn't
trimmed it yet, we would still return the data. Instead, return ENOENT
unconditionally (even if it's not removed yet). This makes the behavior from
the client perspective more predictable and consistent.

Signed-off-by: Sage Weil <sage@inktank.com>
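
The behavior change in the commit above can be sketched as a check that runs before the clone is even looked at. `read_snap`, its parameters and the use of a `std::set` for removed snaps are hypothetical; only the rule "deleted snap means ENOENT, trimmed or not" comes from the commit message.

```cpp
#include <cerrno>
#include <set>
#include <string>

// Hypothetical read path. Returns 0 and fills *out on success, or -ENOENT.
int read_snap(unsigned snap,
              const std::set<unsigned>& removed_snaps,
              const std::string* clone_data,  // null if no clone is on disk
              std::string* out) {
    // A deleted snap is gone from the client's point of view, whether or
    // not the trimmer has removed the clone yet.
    if (removed_snaps.count(snap)) return -ENOENT;
    if (!clone_data) return -ENOENT;
    *out = *clone_data;
    return 0;
}
```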


Sage Weil [Thu, 9 Jan 2014 10:01:48 +0000 (02:01 -0800)]

ceph_test_rados_api_tier: partial test for promote vs snap trim race

This reliably returns ENODEV due to the test at the finish of flush.  Not
because we are actually racing with trim, though: the trimmer doesn't run
at all.  I believe it captures the important property, though.  Namely:
we should not write a promoted object that is "behind" the snap trimmer's
progress.  The fact that we are in front of it (the trimmer hasn't started
yet) should not matter since the object is logically deleted anyway.

We probably want to make the OSD return ENODEV on read in the normal case
when you try to access a clone that is pending trimming.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 6 Jan 2014 01:44:49 +0000 (17:44 -0800)]

osd/ReplicatedPG: cleanly abort flush if the object no longer exists

If the object no longer exists (for example, because the snap trimmer just
killed it) clean up the flush state without trying to mark the object
clean.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 6 Jan 2014 01:43:57 +0000 (17:43 -0800)]

osd/Replicated: mark obc !exists on snap trim

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 6 Jan 2014 01:43:23 +0000 (17:43 -0800)]

mon: debug propagate_snaps_to_tiers

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 6 Jan 2014 01:43:05 +0000 (17:43 -0800)]

osd: fix propagation of removed snaps to other tiers

When we update removed_snaps we do not update snap_seq. Drop this broken
optimization.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 6 Jan 2014 00:02:19 +0000 (16:02 -0800)]

osd/ReplicatedPG: handle promote that races with snap deletion

If we are promoting a clone and realize that the object is no longer
defined for any snaps, abort the copy and delete any temp object.

If the defined snaps have changed, make sure they are updated in memory
so that on promote completion the snapshot metadata is correct.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Sun, 5 Jan 2014 19:36:55 +0000 (11:36 -0800)]

osd/ReplicatedPG: simplify copy-from temp object handling

Previously the caller was generating a temp object name and passing it
down in several different ways. Instead, generate one when we realize
that we need it, and store it in *one* place (CopyResults), where
the completions can get at the information.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Sun, 5 Jan 2014 20:26:48 +0000 (12:26 -0800)]

ceph_test_rados_misc: test bad version for copy-from

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Sun, 5 Jan 2014 09:04:16 +0000 (01:04 -0800)]

osd/ReplicatedPG: adjust flow in process_copy_chunk

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 3 Jan 2014 23:26:00 +0000 (15:26 -0800)]

osd/ReplicatedPG: make CopyResults inline in CopyOp

No reason to put this on the heap. Make the lifetime match that of the
CopyOp.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Thu, 2 Jan 2014 18:48:57 +0000 (10:48 -0800)]

ceph_test_rados: flush can also fail due to snap trimming

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 30 Dec 2013 22:56:54 +0000 (14:56 -0800)]

osd/ReplicatedPG: handle promotion of rollback, src_oids, etc.

Make other find_object_context() callers handle the case where the object
in question needs to be promoted. We add a flag here that forces a promote
for these secondary objects so that the entire operation happens in the
same pool. Forwarding is not allowed in this case.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 30 Dec 2013 20:54:03 +0000 (12:54 -0800)]

osd/ReplicatedPG: preserve clean/dirty state on clone

If we have a clean object and clone it in make_writeable(), the clone
should also be clean (it does not need to be written back to the base
pool). If the object was dirty, the clone should be dirty.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 30 Dec 2013 20:52:39 +0000 (12:52 -0800)]

ceph_test_rados: improve read debug output

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 30 Dec 2013 20:57:28 +0000 (12:57 -0800)]

osd/ReplicatedPG: infer snaps from head when promoting oldest clean clone

Consider:

- base and cache have same object foo; marked clean in cache pool
- modify + clone foo in cache pool. foo clone is clean.
- foo clone is evicted
- foo clone is read, and promoted
- we read foo@something from base pool, and get the head's content

copy-get does not provide us with a snaps list. Instead, we use the
snap_seq from the head to infer what the snaps vector was in the cache
pool and will be in the base pool when we flush the updates to the object.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 30 Dec 2013 19:47:33 +0000 (11:47 -0800)]

osd: include snap_seq in copy-get results

This is needed by the cache layer when reading a logical snap from a head
object on the backend in order to correctly recreate the clone in the
cache layer.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 30 Dec 2013 20:52:20 +0000 (12:52 -0800)]

osd/ReplicatedPG: always set obc->ssc SnapSetContext for clones

This can be useful!

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 30 Dec 2013 19:10:46 +0000 (11:10 -0800)]

osd/ReplicatedPG: do not promote nonexistent clones

Do not promote a clone for a snap that we know doesn't exist. If
find_object_context() didn't give us a missing_oid, there is nothing to
promote.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 30 Dec 2013 17:04:40 +0000 (09:04 -0800)]

ceph_test_rados: is_dirty on non-flushing objects only

This makes its results reliable. Otherwise, we can't mix the is_dirty
test with flush, which eliminates much of its value.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 30 Dec 2013 17:04:02 +0000 (09:04 -0800)]

ceph_test_rados: assert on read error

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Sat, 28 Dec 2013 01:17:19 +0000 (17:17 -0800)]

ceph_test_rados: make flush clean correct snap in model


Sage Weil [Sat, 28 Dec 2013 01:12:54 +0000 (17:12 -0800)]

ceph_test_rados: IsDirty on random snaps

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Sat, 28 Dec 2013 00:55:15 +0000 (16:55 -0800)]

ceph_test_rados: test flush/evict on snaps

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Sat, 28 Dec 2013 00:52:48 +0000 (16:52 -0800)]

ceph_test_rados: don't update any state on successful cache-evict

- we didn't touch the user_version
- we didn't change the clean/dirty state

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 23:42:09 +0000 (15:42 -0800)]

ceph_test_rados_api_tier: test flush on snaps/clones

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 23:41:47 +0000 (15:41 -0800)]

osd/ReplicatedPG: construct appropriate snapc for flush/writeback

Construct a snap context that will trigger the appropriate cloning (if any)
on the base pool.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 23:14:42 +0000 (15:14 -0800)]

osd: add pg_log_entry_t event type CLEAN

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 21:42:54 +0000 (13:42 -0800)]

osd/ReplicatedPG: refuse to flush when older dirty clones are present

If the next oldest clone is dirty, we cannot flush. That is, we must
always flush starting with the oldest dirty clone.

Note that we can never have a sequence like dirty -> clean -> dirty,
because clones are only dirty on creation, are created in order, and cannot
be flushed (cleaned) out of order. Thus checking the previous clone is
sufficient (and thankfully cheap).

Signed-off-by: Sage Weil <sage@inktank.com>
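
The argument in the commit above, that checking only the previous clone is enough, can be illustrated with a toy model. The vector-of-flags representation and the names `can_flush` and `any_older_dirty` are hypothetical; the invariant (clean clones first, then dirty ones, oldest to newest) is the one the commit message states.

```cpp
#include <cstddef>
#include <vector>

// dirty[i] is the dirty flag of clone i, ordered oldest (0) to newest.

// Cheap check: refuse to flush clone i if the next oldest clone is dirty.
bool can_flush(const std::vector<bool>& dirty, std::size_t i) {
    if (i >= dirty.size()) return false;
    return i == 0 || !dirty[i - 1];
}

// Exhaustive check, for comparison only: is any older clone dirty?
bool any_older_dirty(const std::vector<bool>& dirty, std::size_t i) {
    for (std::size_t j = 0; j < i && j < dirty.size(); ++j)
        if (dirty[j]) return true;
    return false;
}
```

As long as a dirty -> clean -> dirty sequence cannot occur, `can_flush(dirty, i)` agrees with `!any_older_dirty(dirty, i)` for every valid index, which is why the single-neighbor check suffices.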


Sage Weil [Fri, 27 Dec 2013 21:31:07 +0000 (13:31 -0800)]

vstart.sh: allow MDS=0

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 20:53:59 +0000 (12:53 -0800)]

osd/ReplicatedPG: make cache-[try-]flush CACHE instead of WR ops

This will allow us to send a flush op on a snap.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Sat, 28 Dec 2013 00:11:27 +0000 (16:11 -0800)]

osd/ReplicatedPG: allow cache-evict on snaps

We do three things here:

- make cache-evict a CACHE instead of WR op, allowing us to submit it
on snaps (not just head)
- allow eviction of a snap
- verify that all snaps are missing before evicting a head

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 19:15:19 +0000 (11:15 -0800)]

osd: add rados CACHE mode (different from RD and WR)

It is useful to distinguish cache operations from read and modify
operations. Specifically, we will allow cache ops to be sent for
snaps and also allow those ops to result in a write.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 02:06:13 +0000 (18:06 -0800)]

ceph_test_rados_api_tier: test promotion of clones

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 01:32:43 +0000 (17:32 -0800)]

osd/ReplicatedPG: update snap_mapper for promoted clones

A clone that comes into existence via promotion takes an entirely
different path than a typical clone (which comes into existence via a
CLONE op in make_writeable()). Make sure snap_mapper is updated
accordingly.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 23:43:40 +0000 (15:43 -0800)]

osd/ReplicatedPG: only encode SnapSet on head objects in finish_ctx

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Thu, 26 Dec 2013 17:19:08 +0000 (09:19 -0800)]

osd/ReplicatedPG: always encode snaps in finish_ctx

On promote we use finish_ctx to build the final log entries, and need to
encode the snaps vector in that case. (Normally this is done by
make_writeable or explicitly by the snap trimmer.)

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 02:05:22 +0000 (18:05 -0800)]

osd/ReplicatedPG: mirror SnapSet info when promoting head

When we promote the head for an object, get the list of snaps from the
backend pool and construct an appropriate SnapSet. Note that this is
always placed on the head in the cache pool, since we will have a
whiteout object in this case.

Also note that the SnapSet's list of snapids will not include any snaps
for which there were no clones. This is fine, since it is only used for
creating clones, and we've already done that.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 01:51:21 +0000 (17:51 -0800)]

osd/osd_types: SnapSet::from_snap_set

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 01:31:01 +0000 (17:31 -0800)]

osd/ReplicatedPG: add PROMOTE log entry type

This is an alternative to MODIFY that indicates the object was just
promoted from another tier. Thankfully, is_modify() is used in very
few places!

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Fri, 27 Dec 2013 02:02:16 +0000 (18:02 -0800)]

osd/ReplicatedPG: adjust clone stats when promoting clones

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Tue, 24 Dec 2013 16:50:38 +0000 (08:50 -0800)]

osd/ReplicatedPG: include snaps in copy-get results

When promoting a snapped object, we need to also get the set of snaps over
which the clone is defined. This is not strictly available except via the
list-snaps rados call, but that is only used on the snapdir object much
earlier when the head (whiteout) is promoted, and is not conveniently
available now. Adding it to the internal copy-get is not exposed via
librados (copy-get is not exposed at all) so I don't think this is a
problem.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Tue, 24 Dec 2013 01:26:39 +0000 (17:26 -0800)]

osd/ReplicatedPG: use missing_oid to decide which object to promote

find_object_context() now tells us which object it could use if it
doesn't find it on disk. Promote that one.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Tue, 24 Dec 2013 01:25:07 +0000 (17:25 -0800)]

osd/ReplicatedPG: make find_object_context() pass missing_oid

Previously we would return a snapid that we are blocked on if it is
missing. This is necessary because the missing clone does not always
match the logical snap we are trying to read.

Extend this to return a full hobject_t that is the missing object we want.
For the missing clone case, this cleans things up slightly. More
importantly, it lets find_object_context also tell us which on-disk
object is missing that, if it could be promoted, would help.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 13 Jan 2014 23:51:41 +0000 (15:51 -0800)]

mon/PGMap: make decode version match encode version

These should have been bumped way back in 091809b8.

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 13 Jan 2014 23:50:51 +0000 (15:50 -0800)]

ceph-dencoder: include offset in 'stray data' error message

Signed-off-by: Sage Weil <sage@inktank.com>


Sage Weil [Mon, 13 Jan 2014 23:50:29 +0000 (15:50 -0800)]

buffer: do not append trailing newline when appending empty istream

If we call

bl.append(some_istream);

do not include a \n if the istream is empty (which apparently is not
the same thing as eof). This was causing 'ceph pg getmap' to include a
trailing newline.

Probably we don't want this newline at all! But all callers need to be
fixed for that change.

Signed-off-by: Sage Weil <sage@inktank.com>


athanatos [Mon, 13 Jan 2014 22:25:51 +0000 (14:25 -0800)]

Merge pull request #931 from ceph/wip-5858-rebase

Wip 5858 rebase

Reviewed-by: Samuel Just <sam.just@inktank.com>


Ken Dreyer [Mon, 13 Jan 2014 21:07:01 +0000 (21:07 +0000)]

v0.75


John Wilkins [Mon, 13 Jan 2014 20:57:02 +0000 (12:57 -0800)]

doc: Added comment and example for SSL enablement in rgw.conf

Signed-off-by: John Wilkins <john.wilkins@inktank.com>


David Zafman [Mon, 13 Jan 2014 19:38:48 +0000 (11:38 -0800)]

osd: Implement multiple backfill target handling

Fixes: #5858
Signed-off-by: David Zafman <david.zafman@inktank.com>


David Zafman [Thu, 21 Nov 2013 23:21:53 +0000 (15:21 -0800)]

osd: Interim backfill changes

Make peer_backfill_info a map which holds a
BackfillInterval for all backfill targets.
Initially see if recover_backfill() can just backfill
the first one and mark them all finished.

Signed-off-by: David Zafman <david.zafman@inktank.com>


Sage Weil [Mon, 13 Jan 2014 19:22:49 +0000 (11:22 -0800)]

Merge pull request #1077 from ceph/wip-7141

DBObjectMap::clear_keys_header: use generate_new_header, not _generate_n...

Reviewed-by: Sage Weil <sage@inktank.com>


Samuel Just [Mon, 13 Jan 2014 19:02:45 +0000 (11:02 -0800)]

DBObjectMap::clear_keys_header: use generate_new_header, not _generate_new_header

We aren't holding the header_lock here, so we need the locked version.

Signed-off-by: Samuel Just <sam.just@inktank.com>
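
The naming convention the commit above relies on, an underscore-prefixed worker that expects the lock to be held and a public wrapper that takes it, can be sketched like this. `ToyHeaders` and its sequence counter are hypothetical; only the two method names and `header_lock` come from the commit message.

```cpp
#include <mutex>

// Hypothetical class illustrating the locked/unlocked pair.
class ToyHeaders {
public:
    // Locked version: for callers that do NOT hold header_lock.
    int generate_new_header() {
        std::lock_guard<std::mutex> guard(header_lock);
        return _generate_new_header();
    }

private:
    // Unlocked version: the caller must already hold header_lock.
    int _generate_new_header() { return ++next_seq; }

    std::mutex header_lock;
    int next_seq = 0;
};
```

Calling the underscore variant without the lock would race on the shared state, which is the class of bug the commit fixes by switching to the locked version.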


Loic Dachary [Mon, 13 Jan 2014 17:16:09 +0000 (18:16 +0100)]

erasure-code: use uintptr_t instead of long long

Checking the pointer alignment using a cast to long long raises a
warning when -Wpointer-to-int-cast is given.

Signed-off-by: Loic Dachary <loic@dachary.org>
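
A minimal sketch of an alignment check that casts through `uintptr_t`, the integer type intended to hold a pointer value. `is_aligned` is a hypothetical helper, not the erasure-code plugin's function.

```cpp
#include <cstddef>
#include <cstdint>

// True if p is a multiple of `alignment` bytes. uintptr_t is wide enough to
// hold a converted pointer, so the cast does not depend on how large
// `long long` happens to be on the target platform.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

For example, with `alignas(16) char buf[32];`, `is_aligned(buf, 16)` holds while `is_aligned(buf + 1, 16)` does not.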


Sage Weil [Mon, 13 Jan 2014 16:46:04 +0000 (08:46 -0800)]

Merge pull request #1075 from dachary/wip-crush

improve crushtool --build usability and documentation

Reviewed-by: Sage Weil <sage@inktank.com>


Gregory Farnum [Mon, 13 Jan 2014 16:33:52 +0000 (08:33 -0800)]

Merge pull request #1072 from ceph/wip-tier-snap

Reviewed-by: Greg Farnum <greg@inktank.com>


Loic Dachary [Sun, 12 Jan 2014 16:48:00 +0000 (17:48 +0100)]

doc: format man pages with s/2013/2014/

Signed-off-by: Loic Dachary <loic@dachary.org>


Loic Dachary [Sun, 12 Jan 2014 16:46:18 +0000 (17:46 +0100)]


Loic Dachary [Sun, 12 Jan 2014 16:34:52 +0000 (17:34 +0100)]

doc: update the crushtool manual page

* add information about CEPH_ARGS
* rework the --build documentation and example
* add an Author section
* replace vi with emacs for no good reason
* cleanup whitespace

Signed-off-by: Loic Dachary <loic@dachary.org>
