git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years ago mkcephfs: error out if mon data directory is not empty
Sage Weil [Tue, 10 Jul 2012 01:16:44 +0000 (18:16 -0700)]
mkcephfs: error out if mon data directory is not empty

The ceph-mon --mkfs function no longer wipes out the directory; it is in
fact mostly a no-op that just verifies the dir exists.

So, ensure that the directory is empty at mkfs time.  This could
alternatively do an 'rm -r' in that directory (that is in fact what
ceph-mon used to do), but this is safer.

Signed-off-by: Sage Weil <sage@inktank.com>
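
As a rough illustration of the emptiness check described in this commit, here is a minimal C++ sketch. mkcephfs itself is a shell script, so this is not the author's code; the helper name `mon_data_dir_is_empty` is hypothetical and the sketch only assumes a C++17 standard library.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not taken from mkcephfs or ceph-mon): true only when
// `dir` exists, is a directory, and has no entries.
bool mon_data_dir_is_empty(const std::string& dir) {
    std::error_code ec;
    if (!fs::is_directory(dir, ec) || ec) {
        return false;  // missing, not a directory, or not inspectable
    }
    const bool empty = fs::is_empty(dir, ec);
    return !ec && empty;
}
```

A caller would refuse to run mkfs when this returns false, rather than removing the contents itself, which matches the "safer" choice the message argues for.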
13 years ago vstart.sh: blow away mon directory on creation/start
Sage Weil [Tue, 10 Jul 2012 01:17:54 +0000 (18:17 -0700)]
vstart.sh: blow away mon directory on creation/start

Now that ceph-mon doesn't blow away the mon data content, we need to.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years ago mon: stop doing rm -rf on mon mkfs
Sage Weil [Tue, 10 Jul 2012 01:17:16 +0000 (18:17 -0700)]
mon: stop doing rm -rf on mon mkfs

Simply verify that the directory exists, or if it doesn't, create it.
Do nothing about its content.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years ago ReplicatedPG: don't warn if backfill peer stats don't match
Samuel Just [Tue, 10 Jul 2012 00:57:03 +0000 (17:57 -0700)]
ReplicatedPG: don't warn if backfill peer stats don't match

pinfo.stats might be wrong if we did log-based recovery on the
backfilled portion in addition to continuing backfill.

bug #2750

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago ReplicatedPG: fix replay op ordering
Samuel Just [Mon, 9 Jul 2012 22:53:31 +0000 (15:53 -0700)]
ReplicatedPG: fix replay op ordering

After a client reconnect, the client replays outstanding ops.  The
OSD then immediately responds with success if the op has already
committed (version < ReplicatedPG::get_first_in_progress).
Otherwise, we stick it in waiting_for_ondisk to be replied to when
eval_repop concludes that waitfor_disk is empty.

Fixes #2508

Signed-off-by: Samuel Just <sam.just@inktank.com>
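
The replay rule in this message can be sketched as follows. This is a simplified model, not the ReplicatedPG code: `ReplaySketch`, its members, and the assumption that commits arrive in version order are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for the OSD-side bookkeeping.
struct ReplaySketch {
    std::uint64_t first_in_progress = 0;  // lowest version not yet committed
    std::map<std::uint64_t, std::vector<std::string>> waiting_for_ondisk;
    std::vector<std::string> replies;     // op ids, in the order we replied

    // Called for each op the client replays after reconnecting.
    void handle_replay(std::uint64_t version, const std::string& op_id) {
        if (version < first_in_progress) {
            replies.push_back(op_id);                      // already committed
        } else {
            waiting_for_ondisk[version].push_back(op_id);  // reply on commit
        }
    }

    // Called when `version` reaches disk; assumes commits arrive in order.
    void on_commit(std::uint64_t version) {
        assert(version >= first_in_progress);
        auto it = waiting_for_ondisk.find(version);
        if (it != waiting_for_ondisk.end()) {
            replies.insert(replies.end(), it->second.begin(), it->second.end());
            waiting_for_ondisk.erase(it);
        }
        first_in_progress = version + 1;
    }
};
```

With `first_in_progress` at 5, a replay of version 3 is answered at once, while a replay of version 5 waits until `on_commit(5)` runs.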
13 years ago librbd: return an error when removing a non-existent image
Josh Durgin [Tue, 10 Jul 2012 00:24:19 +0000 (17:24 -0700)]
librbd: return an error when removing a non-existent image

Try treating the image as new format if it's not in the old-style
directory, which is the last step in old-style removal. Then if the
image is not found in the new-style directory, -ENOENT will be
returned, preserving the semantics that existed prior to
6f096b6cdc66bb92762aa92e51e5e448039cf3e3.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
13 years ago Merge remote-tracking branch 'gh/wip-rbd-id'
Sage Weil [Mon, 9 Jul 2012 18:43:05 +0000 (11:43 -0700)]
Merge remote-tracking branch 'gh/wip-rbd-id'

13 years ago doc: Removed legacy paths and keyname settings from examples.
John Wilkins [Mon, 9 Jul 2012 18:06:27 +0000 (11:06 -0700)]
doc: Removed legacy paths and keyname settings from examples.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years ago doc: remove reference to 'ceph stop' command
Sage Weil [Sun, 8 Jul 2012 21:39:52 +0000 (14:39 -0700)]
doc: remove reference to 'ceph stop' command

It doesn't exist anymore.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years ago Merge branch 'wip-cond'
Sage Weil [Sat, 7 Jul 2012 03:01:33 +0000 (20:01 -0700)]
Merge branch 'wip-cond'

Reviewed-by: Greg Farnum <greg@inktank.com>
13 years ago doc: added some discussion to libvirt.
John Wilkins [Fri, 6 Jul 2012 19:21:34 +0000 (12:21 -0700)]
doc: added some discussion to libvirt.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years ago cond: cleanup
Sage Weil [Fri, 6 Jul 2012 00:59:19 +0000 (17:59 -0700)]
cond: cleanup

Signed-off-by: Sage Weil <sage@inktank.com>
13 years ago cond: drop unused Wait variant
Sage Weil [Fri, 6 Jul 2012 00:58:55 +0000 (17:58 -0700)]
cond: drop unused Wait variant

This was used for debugging forever ago.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years ago librados: drop unused local variables
Sage Weil [Thu, 5 Jul 2012 04:07:44 +0000 (21:07 -0700)]
librados: drop unused local variables

This is unused boilerplate cruft.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years ago librados: take lock when signaling notify cond
Sage Weil [Fri, 6 Jul 2012 01:08:58 +0000 (18:08 -0700)]
librados: take lock when signaling notify cond

When we are signaling the cond to indicate that a notify is complete,
take the appropriate lock.  This removes the possibility of a race
that loses our signal.  (That would be very difficult given that there
are network round trips involved, but this makes the lock/cond usage
"correct.")

Signed-off-by: Sage Weil <sage@inktank.com>
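
The pattern described here, taking the lock before signaling, looks roughly like this with the standard library primitives. Ceph uses its own Mutex and Cond classes, so `NotifyWaiter` and its members are hypothetical stand-ins.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the notify-completion state.
struct NotifyWaiter {
    std::mutex lock;
    std::condition_variable cond;
    bool done = false;

    // Completion path: set the flag and signal while holding the lock, so a
    // waiter cannot check `done` and then miss the wakeup.
    void complete() {
        std::lock_guard<std::mutex> l(lock);
        done = true;
        cond.notify_all();
    }

    // Waiter path: the predicate is re-checked under the same lock.
    void wait() {
        std::unique_lock<std::mutex> l(lock);
        cond.wait(l, [this] { return done; });
        assert(done);
    }
};
```

In real use `complete()` would run on the thread that receives the notify reply while another thread sits in `wait()`.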
13 years ago workqueue: kick -> wake or _wake, depending on locking
Sage Weil [Thu, 5 Jul 2012 02:50:34 +0000 (19:50 -0700)]
workqueue: kick -> wake or _wake, depending on locking

Break kick() into wake() and _wake() methods, depending on whether the
lock is already held.  (The rename ensures that we audit/fix all
callers.)

Signed-off-by: Sage Weil <sage@inktank.com>
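
The naming convention in this commit can be illustrated with a small sketch. `WorkQueueSketch` and the `wakeups` counter are hypothetical; only the split between `wake()` and `_wake()` comes from the message.

```cpp
#include <condition_variable>
#include <mutex>

struct WorkQueueSketch {
    std::mutex lock;
    std::condition_variable cond;
    int wakeups = 0;  // counts signals, for illustration only

    // Leading underscore: the caller must already hold `lock`.
    void _wake() {
        ++wakeups;
        cond.notify_one();
    }

    // No underscore: takes `lock` itself, so the caller must not hold it.
    void wake() {
        std::lock_guard<std::mutex> l(lock);
        _wake();
    }
};
```

Because the old name no longer exists, any caller that was not updated fails to compile, which is the audit effect the message mentions.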
13 years ago cond: assert that we are holding the same mutex as the waiter
Sage Weil [Fri, 6 Jul 2012 02:12:22 +0000 (19:12 -0700)]
cond: assert that we are holding the same mutex as the waiter

Try to verify that we are holding the same mutex that the waiter is
waiting on.  Specifically:

 * only wait on a single mutex for this cond
 * remember which mutex that is
 * if we signal and someone has waited, try to make sure we are holding
   the mutex as well.  (Mutex::is_locked() is insufficient here; it doesn't
   ensure that *our* thread took the mutex.  It is necessary, though!)

Introduce a sloppy_signal() method that can be used if we actually mean
to signal the cond without holding the proper lock (and, presumably,
don't care about losing a signal).

Signed-off-by: Sage Weil <sage@inktank.com>
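
A minimal sketch of the checks listed in this commit, using standard library types. `TrackedMutex` and `CheckedCond` are hypothetical names, the wait is timed only so the sketch cannot hang, and the owner tracking goes one step further than the is_locked() check the message describes.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical mutex wrapper that records which thread currently holds it.
class TrackedMutex {
public:
    void lock() {
        m_.lock();
        owner_.store(std::this_thread::get_id());
    }
    void unlock() {
        owner_.store(std::thread::id());
        m_.unlock();
    }
    bool is_locked_by_me() const {
        return owner_.load() == std::this_thread::get_id();
    }
private:
    std::mutex m_;
    std::atomic<std::thread::id> owner_{std::thread::id()};
};

class CheckedCond {
public:
    // The caller must hold `m`. Only one mutex may ever be used per cond.
    void wait_for(TrackedMutex& m, std::chrono::milliseconds timeout) {
        TrackedMutex* seen = waiter_mutex_.load();
        assert(seen == nullptr || seen == &m);
        waiter_mutex_.store(&m);  // remember which mutex waiters use
        cv_.wait_for(m, timeout);
    }
    // True if nobody has waited yet, or we hold the waiters' mutex.
    bool can_signal() const {
        TrackedMutex* seen = waiter_mutex_.load();
        return seen == nullptr || seen->is_locked_by_me();
    }
    void signal() {
        assert(can_signal());
        cv_.notify_all();
    }
    // Escape hatch: signal without the lock, accepting a possible lost wakeup.
    void sloppy_signal() { cv_.notify_all(); }
private:
    std::condition_variable_any cv_;
    std::atomic<TrackedMutex*> waiter_mutex_{nullptr};
};
```

Calling `signal()` without holding the remembered mutex trips the assertion, while `sloppy_signal()` skips the check for callers that accept a lost signal.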
13 years ago client: fix locking for SafeCond users
Sage Weil [Wed, 4 Jul 2012 22:11:21 +0000 (15:11 -0700)]
client: fix locking for SafeCond users

Need to wait on flock, not client_lock.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years ago Merge branch 'master' of github.com:ceph/ceph
John Wilkins [Fri, 6 Jul 2012 18:29:55 +0000 (11:29 -0700)]
Merge branch 'master' of github.com:ceph/ceph

13 years ago doc: Minor cleanup on deploy with Chef.
John Wilkins [Fri, 6 Jul 2012 18:29:31 +0000 (11:29 -0700)]
doc: Minor cleanup on deploy with Chef.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years ago osd: make on_removal() pure virtual
Sage Weil [Fri, 6 Jul 2012 04:28:06 +0000 (21:28 -0700)]
osd: make on_removal() pure virtual

Signed-off-by: Sage Weil <sage@inktank.com>
13 years ago osd: fix PG dtor compile error
Sage Weil [Fri, 6 Jul 2012 04:26:27 +0000 (21:26 -0700)]
osd: fix PG dtor compile error

We need at least one non-pure virtual method to tell gcc where the
vtable goes.  The destructor wins!

libosd.a(libosd_a-ReplicatedPG.o): In function `~PG':
/home/sage/src/ceph/src/osd/PG.h:1367: undefined reference to `vtable for PG'
libosd.a(libosd_a-ReplicatedPG.o):(.rodata._ZTI12ReplicatedPG[typeinfo for ReplicatedPG]+0x10): undefined reference to `typeinfo for PG'
libosd.a(libosd_a-PG.o): In function `PG':
/home/sage/src/ceph/src/osd/PG.cc:85: undefined reference to `vtable for PG'
...

Signed-off-by: Sage Weil <sage@inktank.com>
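
The fix can be illustrated with a small, self-contained sketch. Under the ABI that GCC follows, the vtable is emitted alongside the first virtual method that is neither pure nor defined inline. The class names below are hypothetical stand-ins, not the real PG classes.

```cpp
// Abstract base, as it might appear in a header (names are hypothetical).
struct PGSketch {
    virtual ~PGSketch();            // declared only: not inline, not pure
    virtual void on_removal() = 0;  // pure virtual: cannot anchor the vtable
};

// Exactly one source file provides this definition, and that is where the
// compiler emits the vtable and typeinfo for PGSketch.
PGSketch::~PGSketch() {}

struct ReplicatedPGSketch : PGSketch {
    int removals = 0;
    void on_removal() override { ++removals; }
};
```

If the out-of-line destructor definition were missing, linking would fail with undefined references to the vtable, much like the errors quoted above.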
13 years ago Merge remote-tracking branch 'gh/wip_osd_threading'
Sage Weil [Fri, 6 Jul 2012 00:20:14 +0000 (17:20 -0700)]
Merge remote-tracking branch 'gh/wip_osd_threading'

13 years ago PG,ReplicatedPG: on_removal must handle repop and watcher state
Samuel Just [Thu, 5 Jul 2012 22:39:24 +0000 (15:39 -0700)]
PG,ReplicatedPG: on_removal must handle repop and watcher state

on_removal is now in ReplicatedPG in order to handle watcher state
and repop state.  Additionally, workqueue dequeues are handled already
in OSD::_remove_pg.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSDMonitor: disable cluster snapshot
Samuel Just [Thu, 5 Jul 2012 20:41:37 +0000 (13:41 -0700)]
OSDMonitor: disable cluster snapshot

The map handling changes broke cluster snapshot support.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: ensure that OpSequencer lives through on_commit callback
Samuel Just [Thu, 5 Jul 2012 17:12:26 +0000 (10:12 -0700)]
OSD: ensure that OpSequencer lives through on_commit callback

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago ReplicatedPG.cc: C_OSD_CommittedPushedObject move pg->put() to finish
Samuel Just [Tue, 3 Jul 2012 17:50:15 +0000 (10:50 -0700)]
ReplicatedPG.cc: C_OSD_CommittedPushedObject move pg->put() to finish

This should clarify the ownership of the pg ref.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD::PeeringWQ::_dequeue(PG*) drop pg refs
Samuel Just [Tue, 3 Jul 2012 17:47:53 +0000 (10:47 -0700)]
OSD::PeeringWQ::_dequeue(PG*) drop pg refs

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD,PG::replica_scrub: move msg->put() into queue process
Samuel Just [Tue, 3 Jul 2012 16:10:06 +0000 (09:10 -0700)]
OSD,PG::replica_scrub: move msg->put() into queue process

This clarifies the ownership of the reference.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD,ReplicatedPG::snap_trimmer: pg->put() in process, not snap_trimmer()
Samuel Just [Tue, 3 Jul 2012 16:03:53 +0000 (09:03 -0700)]
OSD,ReplicatedPG::snap_trimmer: pg->put() in process, not snap_trimmer()

This clarifies responsibility for the reference.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: drop pg refcounts in OpWQ::_dequeue(PG*)
Samuel Just [Tue, 3 Jul 2012 15:55:40 +0000 (08:55 -0700)]
OSD: drop pg refcounts in OpWQ::_dequeue(PG*)

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: clean up recovery_wq queueing and ref counting
Samuel Just [Tue, 3 Jul 2012 15:53:54 +0000 (08:53 -0700)]
OSD: clean up recovery_wq queueing and ref counting

Previously, we tended to explicitly remove the pg from the queue using
remove_myself on the xlist::item.  This causes us to drop a reference
count.  Manipulating the recovery_wq is now accomplished through the
recovery_wq interface, which also handles pg ref counting.

Signed-off-by: Samuel Just <sam.just@inktank.com>
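
The idea of routing all queue manipulation through one interface that also owns the reference counting might look like the sketch below. `PGRef` and `RecoveryQueueSketch` are hypothetical; the real recovery_wq is built on xlist and the OSD work queue classes.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical stand-in for a reference-counted PG.
struct PGRef {
    int refs = 1;         // the creator's reference
    bool queued = false;
    void get() { ++refs; }
    void put() { assert(refs > 0); --refs; }
};

// All queue manipulation goes through this interface, so every queue entry
// is matched by exactly one reference.
class RecoveryQueueSketch {
public:
    void queue(PGRef* pg) {
        if (pg->queued) return;   // already queued: no second reference
        pg->get();
        pg->queued = true;
        items_.push_back(pg);
    }
    void dequeue(PGRef* pg) {
        if (!pg->queued) return;  // not queued: nothing to release
        items_.remove(pg);
        pg->queued = false;
        pg->put();
    }
    std::size_t size() const { return items_.size(); }
private:
    std::list<PGRef*> items_;
};
```

Callers never touch the list or the count directly, so a removal cannot happen without the matching `put()`.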
13 years ago doc: minor typo
Ross Turk [Thu, 5 Jul 2012 22:29:23 +0000 (15:29 -0700)]
doc: minor typo

Signed-off-by: Ross Turk <ross@inktank.com>
13 years ago doc: update copyright notice in footer
Ross Turk [Thu, 5 Jul 2012 22:24:42 +0000 (15:24 -0700)]
doc: update copyright notice in footer

Signed-off-by: Ross Turk <ross@inktank.com>
13 years ago doc: minor updates to the reStructuredText file.
John Wilkins [Thu, 5 Jul 2012 21:01:45 +0000 (14:01 -0700)]
doc: minor updates to the reStructuredText file.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years ago doc: minor cleanup.
John Wilkins [Thu, 5 Jul 2012 21:00:22 +0000 (14:00 -0700)]
doc: minor cleanup.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years ago doc: Publishing as described. Still requires some verification and QA.
John Wilkins [Thu, 5 Jul 2012 20:47:45 +0000 (13:47 -0700)]
doc: Publishing as described. Still requires some verification and QA.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years ago PG: C_PG_MarkUnfoundLost put pg in finish
Samuel Just [Tue, 3 Jul 2012 15:43:30 +0000 (08:43 -0700)]
PG: C_PG_MarkUnfoundLost put pg in finish

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD::activate_map: don't publish map until pgs in deleted pools have been removed
Samuel Just [Tue, 3 Jul 2012 05:09:31 +0000 (22:09 -0700)]
OSD::activate_map: don't publish map until pgs in deleted pools have been removed

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago doc/scripts/gen_state_diagram.py: make parser a bit more forgiving
Samuel Just [Mon, 2 Jul 2012 21:57:18 +0000 (14:57 -0700)]
doc/scripts/gen_state_diagram.py: make parser a bit more forgiving

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago ReplicatedPG::op_applied: update last_update_applied iff !aborted
Samuel Just [Mon, 2 Jul 2012 19:54:00 +0000 (12:54 -0700)]
ReplicatedPG::op_applied: update last_update_applied iff !aborted

scrub state and last_update_applied will have been reset during
the interval change.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago test/encoding/types.h: disable pg_query_t encoding test
Samuel Just [Thu, 21 Jun 2012 05:21:26 +0000 (22:21 -0700)]
test/encoding/types.h: disable pg_query_t encoding test

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: split notify|info|query messages for old clients
Samuel Just [Sat, 30 Jun 2012 01:05:19 +0000 (18:05 -0700)]
OSD: split notify|info|query messages for old clients

Old clients do not expect mixed epoch compound messages.  Thus, we
send each sub-message independently.

Signed-off-by: Samuel Just <sam.just@inktank.com>
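
The splitting rule can be sketched as below. The types and the `peer_handles_mixed_epochs` flag are hypothetical; how the OSD actually detects an old peer is not stated in the message.

```cpp
#include <vector>

// Hypothetical sub-message: one pg's notify, info, or query.
struct Part {
    int pg;
    unsigned epoch_sent;
};

using Message = std::vector<Part>;  // a compound message is a list of parts

// New peers get one compound message; old peers get one message per part,
// so no message they receive mixes epochs.
std::vector<Message> build_messages(const std::vector<Part>& parts,
                                    bool peer_handles_mixed_epochs) {
    std::vector<Message> out;
    if (parts.empty()) return out;
    if (peer_handles_mixed_epochs) {
        out.push_back(parts);
        return out;
    }
    for (const Part& p : parts) {
        out.push_back(Message{p});
    }
    return out;
}
```

Sending one part per message costs more messages but keeps old peers on the format they expect.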
13 years ago FileStore: delete source collection if not replaying collection_rename
Samuel Just [Fri, 29 Jun 2012 03:26:28 +0000 (20:26 -0700)]
FileStore: delete source collection if not replaying collection_rename

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago ReplicatedPG: RepModify track epoch_started and bail on interval change
Samuel Just [Fri, 22 Jun 2012 17:12:26 +0000 (10:12 -0700)]
ReplicatedPG: RepModify track epoch_started and bail on interval change

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago ReplicatedPG: on_activate for a peer might happen before flush
Samuel Just [Fri, 22 Jun 2012 17:12:26 +0000 (10:12 -0700)]
ReplicatedPG: on_activate for a peer might happen before flush

We don't ensure for a peer that the flush completes before activation,
merely that we don't serve any ops until flush completes.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: _remove_pg not ruin iterator consistency
Samuel Just [Thu, 21 Jun 2012 01:55:29 +0000 (18:55 -0700)]
OSD: _remove_pg not ruin iterator consistency

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: move watch into OSDService
Samuel Just [Wed, 20 Jun 2012 23:42:00 +0000 (16:42 -0700)]
OSD: move watch into OSDService

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: pass activate epoch with Activate event
Samuel Just [Mon, 18 Jun 2012 19:52:56 +0000 (12:52 -0700)]
PG: pass activate epoch with Activate event

This lets us pass into activate() the epoch in which the message
triggering activation occurred, so we can mark the activate committed
callback with the right query_epoch.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago Revert "osd: check against last_peering_reset in _activate_committed"
Samuel Just [Thu, 14 Jun 2012 20:04:27 +0000 (13:04 -0700)]
Revert "osd: check against last_peering_reset in _activate_committed"

This reverts commit 86aa07d7a91ac23074e76551c3a6db3a5736cffa.

13 years ago Revert "osd: reset last_peering_interval on replica activate"
Samuel Just [Thu, 14 Jun 2012 18:25:10 +0000 (11:25 -0700)]
Revert "osd: reset last_peering_interval on replica activate"

This reverts commit 17114f266a336b6edd7e98975d494fdd487eec20.

13 years ago OSD: write_info/log during process_peering_events, do_recovery
Samuel Just [Sun, 17 Jun 2012 23:25:09 +0000 (16:25 -0700)]
OSD: write_info/log during process_peering_events, do_recovery

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: delay ops in do_request, not queue_op
Samuel Just [Fri, 8 Jun 2012 19:17:39 +0000 (12:17 -0700)]
PG: delay ops in do_request, not queue_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: maybe_update_heartbeat_peers, don't print pg
Samuel Just [Fri, 8 Jun 2012 03:05:50 +0000 (20:05 -0700)]
OSD: maybe_update_heartbeat_peers, don't print pg

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: process_peering_event check for new map on each pg
Samuel Just [Fri, 8 Jun 2012 02:33:09 +0000 (19:33 -0700)]
OSD: process_peering_event check for new map on each pg

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: peering_wq is now a BatchWorkQueue
Samuel Just [Sun, 17 Jun 2012 23:23:20 +0000 (16:23 -0700)]
OSD: peering_wq is now a BatchWorkQueue

process_peering_events now handles multiple pgs at once to better
batch up notifies, etc.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: do_(notifies|infos|queries) must now be passed a map
Samuel Just [Mon, 18 Jun 2012 17:09:00 +0000 (10:09 -0700)]
OSD: do_(notifies|infos|queries) must now be passed a map

This removes the need to call them from within the osd lock.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago common/WorkQueue.h: add BatchWorkQueue
Samuel Just [Thu, 7 Jun 2012 23:38:08 +0000 (16:38 -0700)]
common/WorkQueue.h: add BatchWorkQueue

Rather than dispatching one item at a time to process, etc,
BatchWorkQueue dispatches up to a configurable number of
items.

Signed-off-by: Samuel Just <sam.just@inktank.com>
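
The batching behaviour described here can be sketched as a small queue that hands out several items per dispatch. `BatchQueueSketch` is a hypothetical, single-threaded illustration and not the interface in common/WorkQueue.h.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

template <typename T>
class BatchQueueSketch {
public:
    explicit BatchQueueSketch(std::size_t batch_size) : batch_size_(batch_size) {
        assert(batch_size_ > 0);
    }

    void enqueue(T item) { items_.push_back(std::move(item)); }

    // Removes and returns up to batch_size_ items, oldest first.
    std::vector<T> dequeue_batch() {
        std::vector<T> out;
        while (!items_.empty() && out.size() < batch_size_) {
            out.push_back(std::move(items_.front()));
            items_.pop_front();
        }
        return out;
    }

    bool empty() const { return items_.empty(); }

private:
    std::size_t batch_size_;
    std::deque<T> items_;
};
```

A worker would call `dequeue_batch()` once and process the whole vector, which lets it coalesce follow-up work across the batch.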
13 years ago OSD: bail out of do_recovery if no longer primary and active
Samuel Just [Thu, 7 Jun 2012 22:03:50 +0000 (15:03 -0700)]
OSD: bail out of do_recovery if no longer primary and active

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: PG now stores its own PGPool
Samuel Just [Thu, 7 Jun 2012 18:29:31 +0000 (10:29 -0800)]
PG: PG now stores its own PGPool

Otherwise, we need to synchronize access to the shared PGPool objects.
The wasted memory is probably preferable to synchronization overhead.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: on pg_removal, project_pg_history to get current interval
Samuel Just [Thu, 7 Jun 2012 04:27:38 +0000 (21:27 -0700)]
OSD: on pg_removal, project_pg_history to get current interval

First, we don't really want to remove the pg if we can use it.  Second,
there might be messages in the pg peering queue for the next interval.
If one of those happens to be an info request or notify, we would lose
the peering message.

If the message falls in the current interval as determined by the
current osdmap, then we know that any messages currently queued must be
obsolete and can safely be discarded.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago CrushWrapper: add locking around crush_do_rule
Samuel Just [Wed, 6 Jun 2012 22:14:01 +0000 (14:14 -0800)]
CrushWrapper: add locking around crush_do_rule

crush_do_rule uses a cache on the bucket objects.

Signed-off-by: Samuel Just <sam.just@inktank.com>
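
Why a lookup that appears read-only still needs a lock can be shown with a small sketch. `MapperSketch` is hypothetical and the arithmetic is a placeholder, not the CRUSH algorithm; the point is only that filling a cache mutates shared state.

```cpp
#include <cstddef>
#include <map>
#include <mutex>

class MapperSketch {
public:
    // Logically const, but it fills a cache, so concurrent callers must be
    // serialized.
    unsigned do_rule(unsigned x) const {
        std::lock_guard<std::mutex> l(lock_);
        auto it = cache_.find(x);
        if (it != cache_.end()) return it->second;
        const unsigned result = (x * 31u + 7u) % 10u;  // placeholder, not CRUSH
        cache_[x] = result;
        return result;
    }

    std::size_t cached_entries() const {
        std::lock_guard<std::mutex> l(lock_);
        return cache_.size();
    }

private:
    mutable std::mutex lock_;
    mutable std::map<unsigned, unsigned> cache_;
};
```

Without the lock, two threads calling `do_rule` at the same time could corrupt the cache even though neither believes it is writing.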
13 years ago CrushWrapper: rmaps don't need to be mutable
Samuel Just [Wed, 6 Jun 2012 22:13:18 +0000 (15:13 -0700)]
CrushWrapper: rmaps don't need to be mutable

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD,PG: issue pg removals in line, remove remove_list
Samuel Just [Wed, 23 May 2012 17:52:05 +0000 (10:52 -0700)]
OSD,PG: issue pg removals in line, remove remove_list

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: don't advance_pg() if pg is up-to-date
Samuel Just [Wed, 23 May 2012 17:04:37 +0000 (10:04 -0700)]
OSD: don't advance_pg() if pg is up-to-date

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD,PG: clean up _get_or_create_pg and set interval based on msg
Samuel Just [Sun, 17 Jun 2012 23:16:42 +0000 (16:16 -0700)]
OSD,PG: clean up _get_or_create_pg and set interval based on msg

Previously, we set last_peering_reset based on the epoch in which the pg
is created.  We now pass the map from the query_epoch to the creation
methods to set based on that.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: lock recovery_wq before debug output on finish_recovery_op
Samuel Just [Tue, 22 May 2012 17:09:29 +0000 (10:09 -0700)]
OSD: lock recovery_wq before debug output on finish_recovery_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: only do_(notify|info|query) for up osd
Samuel Just [Tue, 22 May 2012 05:52:31 +0000 (21:52 -0800)]
OSD: only do_(notify|info|query) for up osd

pg may have an older map and attempt to notify|info|query on a down
osd.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: map_cache should contain const OSDMap
Samuel Just [Mon, 21 May 2012 22:15:15 +0000 (15:15 -0700)]
OSD: map_cache should contain const OSDMap

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: activate_map() in handle_osd_map only when active
Samuel Just [Thu, 17 May 2012 21:58:36 +0000 (14:58 -0700)]
OSD: activate_map() in handle_osd_map only when active

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD,PG: _share_map_outgoing must not require osd_lock
Samuel Just [Sun, 17 Jun 2012 23:08:19 +0000 (16:08 -0700)]
OSD,PG: _share_map_outgoing must not require osd_lock

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago ReplicatedPG: explicitly block on not active for certain ops
Samuel Just [Thu, 17 May 2012 18:03:01 +0000 (11:03 -0700)]
ReplicatedPG: explicitly block on not active for certain ops

Ops and some subops need to wait for active to ensure correct ordering
with respect to peering operations.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG,OSD: prevent pg from completing peering until deletion is complete
Samuel Just [Mon, 18 Jun 2012 19:52:06 +0000 (12:52 -0700)]
PG,OSD: prevent pg from completing peering until deletion is complete

hobject_t must now be globally unique in the filestore.  Thus, if we
start creating objects in a pg before the removal collections for the
previous incarnation are fully removed, we might end up with a second
instance of the same hobject, violating the filestore rules.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD,PG: clean up pg removal
Samuel Just [Fri, 29 Jun 2012 21:11:07 +0000 (14:11 -0700)]
OSD,PG: clean up pg removal

PG opsequencers will be used for removing a pg.  If the pg is recreated
before the removal is complete, we need the new pg incarnation to be
able to inherit the osr of its predecessor.

Previously, we queued the pg for removal and only rendered it unusable
after the contents were fully removed.  Now, we synchronously remove it
from the map and queue a transaction renaming the collections.  We then
asynchronously clean up those collections.  If the pg is recreated, it
will inherit the same osr until the cleanup is complete ensuring correct
op ordering with respect to the collection rename.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: flush ops by the end of peering without osr.flush
Samuel Just [Wed, 20 Jun 2012 22:42:18 +0000 (15:42 -0700)]
PG: flush ops by the end of peering without osr.flush

Rather than explicitly flushing the filestore, send a noop through the
filestore at the beginning of peering and, at the end, wait for it to
finish by adding an extra state.

Also, delay ops until flushed is true.  Until we have finished flushing,
we cannot safely read objects.

Signed-off-by: Samuel Just <sam.just@inktank.com>
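
The flush-by-noop idea can be sketched with an in-order queue. Everything here is a hypothetical model: `OrderedStoreSketch` stands in for a filestore that applies transactions in order, and a plain `flushed` flag stands in for the extra peering state the message mentions.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical store that applies queued transactions strictly in order.
class OrderedStoreSketch {
public:
    void queue(std::function<void()> on_applied) {
        pending_.push_back(std::move(on_applied));
    }
    // Applies the oldest queued transaction and runs its callback.
    void apply_next() {
        assert(!pending_.empty());
        std::function<void()> cb = std::move(pending_.front());
        pending_.pop_front();
        if (cb) cb();
    }
private:
    std::deque<std::function<void()>> pending_;
};

struct PGFlushSketch {
    bool flushed = false;
    std::vector<int> delayed_ops;
    std::vector<int> served_ops;

    // Start of peering: queue a no-op behind everything already in flight.
    void start_peering(OrderedStoreSketch& store) {
        flushed = false;
        store.queue([this] { on_flushed(); });
    }
    // Ops are delayed until the no-op has completed.
    void handle_op(int op) {
        (flushed ? served_ops : delayed_ops).push_back(op);
    }
    // The no-op completed, so every earlier transaction has been applied.
    void on_flushed() {
        flushed = true;
        served_ops.insert(served_ops.end(), delayed_ops.begin(), delayed_ops.end());
        delayed_ops.clear();
    }
};
```

Because the store applies transactions in order, completion of the no-op implies that all earlier writes are readable, without blocking a thread on an explicit flush.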
13 years ago OSD,PG: added helper methods for creating and dispatching RecoveryCtxs
Samuel Just [Mon, 18 Jun 2012 17:08:11 +0000 (10:08 -0700)]
OSD,PG: added helper methods for creating and dispatching RecoveryCtxs

This is simpler than having to update all of the RecoveryCtx users
whenever we change the types in RecoveryCtx.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD,PG: Move pg accessible methods, objects to OSDService
Samuel Just [Thu, 14 Jun 2012 02:05:47 +0000 (19:05 -0700)]
OSD,PG: Move pg accessible methods, objects to OSDService

In order to clarify data structure locking, PGs will now access
OSDService rather than the OSD directly.  Over time, more structures will
be moved to the OSDService.  osd_lock can no longer be held while pg
locks are held.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG, OSD: info_map shouldn't contain the MOSDPGInfo*
Samuel Just [Thu, 14 Jun 2012 01:56:16 +0000 (18:56 -0700)]
PG, OSD: info_map shouldn't contain the MOSDPGInfo*

Rather, we will just pass the same type as the notifies.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: queue_want_up_thru in process_peering_event
Samuel Just [Tue, 8 May 2012 17:56:36 +0000 (10:56 -0700)]
OSD: queue_want_up_thru in process_peering_event

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: do not drop osd_lock in handle_osd_map
Samuel Just [Mon, 7 May 2012 20:51:55 +0000 (12:51 -0800)]
OSD: do not drop osd_lock in handle_osd_map

PGs have their map updates done in a different thread.  Thus, we no
longer need to grab the pg locks.  activate_map no longer requires
the map_lock in order to allow us to queue events for the pgs.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: get map read lock during queue_want_up_thru
Samuel Just [Fri, 1 Jun 2012 16:59:00 +0000 (09:59 -0700)]
OSD: get map read lock during queue_want_up_thru

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: push_waiters is no longer used
Samuel Just [Fri, 1 Jun 2012 16:58:42 +0000 (09:58 -0700)]
OSD: push_waiters is no longer used

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: do not lock osd during dequeue_op
Samuel Just [Mon, 7 May 2012 20:00:41 +0000 (13:00 -0700)]
OSD: do not lock osd during dequeue_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: don't assume pending pg removals have flushed
Samuel Just [Mon, 7 May 2012 18:32:59 +0000 (11:32 -0700)]
OSD: don't assume pending pg removals have flushed

_create_lock_pg might encounter a preexisting pg collection simply
because the removal transaction had not yet completed.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago ReplicatedPG: change ReplicatedPG debug output to match PG
Samuel Just [Mon, 7 May 2012 18:00:52 +0000 (11:00 -0700)]
ReplicatedPG: change ReplicatedPG debug output to match PG

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago ReplicatedPG: do not eval_repop if aborted
Samuel Just [Mon, 7 May 2012 17:33:59 +0000 (10:33 -0700)]
ReplicatedPG: do not eval_repop if aborted

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: remove superfluous pg get/put around enqueue_op
Samuel Just [Fri, 4 May 2012 00:48:20 +0000 (17:48 -0700)]
OSD: remove superfluous pg get/put around enqueue_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG,OSD: fix op checking in pg, take_waiters during ActMap
Samuel Just [Tue, 12 Jun 2012 00:05:40 +0000 (17:05 -0700)]
PG,OSD: fix op checking in pg, take_waiters during ActMap

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG,OSD: add OSD::queue_for_op, use in PG::queue_op
Samuel Just [Thu, 3 May 2012 20:40:54 +0000 (13:40 -0700)]
PG,OSD: add OSD::queue_for_op, use in PG::queue_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: check for deleting in process_peering_event
Samuel Just [Fri, 1 Jun 2012 16:50:10 +0000 (09:50 -0700)]
OSD: check for deleting in process_peering_event

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: use osd->requeue_ops for ops, pg->queue_for_peering to requeue pg
Samuel Just [Fri, 1 Jun 2012 16:49:55 +0000 (09:49 -0700)]
PG: use osd->requeue_ops for ops, pg->queue_for_peering to requeue pg

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: compound messages must carry epoch_sent for each part
Samuel Just [Wed, 13 Jun 2012 18:27:49 +0000 (11:27 -0700)]
PG: compound messages must carry epoch_sent for each part

Query and Notify messages include logical messages from multiple
pgs.  Each logical message (pg_query_t and pg_notify_t) now
contains an epoch_sent.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: CephPeeringEvents can now be descriptively printed
Samuel Just [Wed, 13 Jun 2012 04:25:39 +0000 (21:25 -0700)]
PG: CephPeeringEvents can now be descriptively printed

The CephPeeringEvt constructor is now templated to allow
storing a description string for debugging.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD: initialize pgs in get_or_create_pg via handle_create
Samuel Just [Thu, 31 May 2012 05:19:58 +0000 (21:19 -0800)]
OSD: initialize pgs in get_or_create_pg via handle_create

Previously, pgs were initialized via Info/Log/etc.  Since the event
which triggered the pg creation may now be queued, map update events may
occur before the event is processed.  Thus, get_or_create_pg now handles
the initialization prior to queuing the event.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: include info and query by value in peering events
Samuel Just [Thu, 31 May 2012 05:19:48 +0000 (22:19 -0700)]
PG: include info and query by value in peering events

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD,PG: handle pg map advance in process_peering_event
Samuel Just [Tue, 24 Apr 2012 23:00:49 +0000 (16:00 -0700)]
OSD,PG: handle pg map advance in process_peering_event

The pg map will now be advanced in process_peering_event (in advance_pg)
to allow handle_osd_map to not grab pg locks in-line.  handle_osd_map
queues NullEvts to ensure that each pg is updated in a timely fashion.

Signed-off-by: Samuel Just <sam.just@inktank.com>
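
The division of labour described here can be sketched as follows. The types are hypothetical, the queue holds bare pg indexes in place of NullEvts, and stepping one epoch at a time is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct PGMapSketch {
    unsigned epoch = 0;
    int advances = 0;  // how many single-epoch steps this pg has taken
};

struct OSDMapSketch {
    unsigned osd_epoch = 0;
    std::vector<PGMapSketch> pgs;
    std::deque<std::size_t> peering_queue;  // pgs with a pending null event

    // handle_osd_map stand-in: publish the new epoch and queue one null
    // event per pg, without touching any pg state in-line.
    void handle_osd_map(unsigned new_epoch) {
        assert(new_epoch >= osd_epoch);
        osd_epoch = new_epoch;
        for (std::size_t i = 0; i < pgs.size(); ++i) {
            peering_queue.push_back(i);
        }
    }

    // process_peering_event stand-in: catch one pg up to the OSD's epoch.
    void process_peering_event() {
        if (peering_queue.empty()) return;
        PGMapSketch& pg = pgs[peering_queue.front()];
        peering_queue.pop_front();
        while (pg.epoch < osd_epoch) {  // up-to-date pgs skip the loop
            ++pg.epoch;
            ++pg.advances;
        }
    }
};
```

The map handler only enqueues work, so it never needs a pg lock; each pg catches up when its queued event is processed.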
13 years ago osd/: Make pg osdmap be independent of osd, other pg maps
Samuel Just [Tue, 12 Jun 2012 23:01:05 +0000 (16:01 -0700)]
osd/: Make pg osdmap be independent of osd, other pg maps

This will allow handle_osd_map to not stop other work queues.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago OSD,PG: Move Op,SubOp queueing into PG
Samuel Just [Tue, 24 Apr 2012 00:40:58 +0000 (17:40 -0700)]
OSD,PG: Move Op,SubOp queueing into PG

PG now handles delaying/discarding messages since pg map epoch may not
be the same as the OSD map.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: process peering events in a queue
Samuel Just [Thu, 31 May 2012 04:51:29 +0000 (21:51 -0700)]
PG: process peering events in a queue

Peering events are now queued via queue_peering_event in the
peering_queue.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years ago PG: use intrusive_ptr in CephPeeringEvt
Samuel Just [Thu, 31 May 2012 04:50:27 +0000 (21:50 -0700)]
PG: use intrusive_ptr in CephPeeringEvt

Properly disposing of the event_base member of CephPeeringEvt
requires use of intrusive_ptr.

Signed-off-by: Samuel Just <sam.just@inktank.com>