git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years agoCrushWrapper: rmaps don't need to be mutable
Samuel Just [Wed, 6 Jun 2012 22:13:18 +0000 (15:13 -0700)]
CrushWrapper: rmaps don't need to be mutable

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,PG: issue pg removals in line, remove remove_list
Samuel Just [Wed, 23 May 2012 17:52:05 +0000 (10:52 -0700)]
OSD,PG: issue pg removals in line, remove remove_list

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: don't advance_pg() if pg is up-to-date
Samuel Just [Wed, 23 May 2012 17:04:37 +0000 (10:04 -0700)]
OSD: don't advance_pg() if pg is up-to-date

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,PG: clean up _get_or_create_pg and set interval based on msg
Samuel Just [Sun, 17 Jun 2012 23:16:42 +0000 (16:16 -0700)]
OSD,PG: clean up _get_or_create_pg and set interval based on msg

Previously, we set last_peering_reset based on the epoch in which the pg
is created.  We now pass the map from the query_epoch to the creation
methods to set based on that.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: lock recovery_wq before debug output on finish_recovery_op
Samuel Just [Tue, 22 May 2012 17:09:29 +0000 (10:09 -0700)]
OSD: lock recovery_wq before debug output on finish_recovery_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: only do_(notify|info|query) for up osd
Samuel Just [Tue, 22 May 2012 05:52:31 +0000 (21:52 -0800)]
OSD: only do_(notify|info|query) for up osd

pg may have an older map and attempt to notify|info|query on a down
osd.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: map_cache should contain const OSDMap
Samuel Just [Mon, 21 May 2012 22:15:15 +0000 (15:15 -0700)]
OSD: map_cache should contain const OSDMap

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: activate_map() in handle_osd_map only when active
Samuel Just [Thu, 17 May 2012 21:58:36 +0000 (14:58 -0700)]
OSD: activate_map() in handle_osd_map only when active

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,PG: _share_map_outgoing must not require osd_lock
Samuel Just [Sun, 17 Jun 2012 23:08:19 +0000 (16:08 -0700)]
OSD,PG: _share_map_outgoing must not require osd_lock

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoReplicatedPG: explicitly block on not active for certain ops
Samuel Just [Thu, 17 May 2012 18:03:01 +0000 (11:03 -0700)]
ReplicatedPG: explicitly block on not active for certain ops

Ops and some subops need to wait for active to ensure correct ordering
with respect to peering operations.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG,OSD: prevent pg from completing peering until deletion is complete
Samuel Just [Mon, 18 Jun 2012 19:52:06 +0000 (12:52 -0700)]
PG,OSD: prevent pg from completing peering until deletion is complete

hobject_t must now be globally unique in the filestore.  Thus, if we
start creating objects in a pg before the removal collections for the
previous incarnation are fully removed, we might end up with a second
instance of the same hobject, violating the filestore rules.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,PG: clean up pg removal
Samuel Just [Fri, 29 Jun 2012 21:11:07 +0000 (14:11 -0700)]
OSD,PG: clean up pg removal

PG opsequencers will be used for removing a pg.  If the pg is recreated
before the removal is complete, we need the new pg incarnation to be
able to inherit the osr of its predecessor.

Previously, we queued the pg for removal and only rendered it unusable
after the contents were fully removed.  Now, we synchronously remove it
from the map and queue a transaction renaming the collections.  We then
asynchronously clean up those collections.  If the pg is recreated, it
will inherit the same osr until the cleanup is complete, ensuring correct
op ordering with respect to the collection rename.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG: flush ops by the end of peering without osr.flush
Samuel Just [Wed, 20 Jun 2012 22:42:18 +0000 (15:42 -0700)]
PG: flush ops by the end of peering without osr.flush

Rather than explicitly flushing the filestore, send a noop through the
filestore at the beginning of peering and, at the end, wait for it to
finish by adding an extra state.

Also, delay ops until flushed is true.  Until we have finished flushing,
we cannot safely read objects.

Signed-off-by: Samuel Just <sam.just@inktank.com>
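
The no-op flush described above can be sketched as follows. This is an illustrative Python model, not Ceph code: OpSequencer, queue_noop and peer are hypothetical names, and a single FIFO worker thread stands in for the filestore. Because the queue is FIFO, the no-op completing implies that every op queued before it has been applied.

```python
import queue
import threading


class OpSequencer:
    """Hypothetical FIFO worker standing in for the filestore's op queue."""

    def __init__(self):
        self._ops = queue.Queue()
        self.applied = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            name, on_done = self._ops.get()
            if name is not None:
                self.applied.append(name)  # "apply" a real op
            if on_done is not None:
                on_done.set()  # tell the waiter this entry has completed

    def queue_op(self, name):
        self._ops.put((name, None))

    def queue_noop(self):
        """Queue a no-op; its event fires once all earlier ops were applied."""
        flushed = threading.Event()
        self._ops.put((None, flushed))
        return flushed


def peer(osr):
    flushed = osr.queue_noop()  # start of peering: send the no-op
    # ... peering work would proceed here without blocking on a flush ...
    flushed.wait(timeout=5)     # end of peering: wait for the no-op
    return flushed.is_set()     # ops stay delayed until this is True
```

The design point is that peering never blocks in the middle; it only waits at the end, by which time the no-op has usually already completed.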
13 years agoOSD,PG: added helper methods for creating and dispatching RecoveryCtxs
Samuel Just [Mon, 18 Jun 2012 17:08:11 +0000 (10:08 -0700)]
OSD,PG: added helper methods for creating and dispatching RecoveryCtxs

This is simpler than having to update all of the RecoveryCtx users
whenever we change the types in RecoveryCtx.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,PG: Move pg accessible methods, objects to OSDService
Samuel Just [Thu, 14 Jun 2012 02:05:47 +0000 (19:05 -0700)]
OSD,PG: Move pg accessible methods, objects to OSDService

In order to clarify data structure locking, PGs will now access
OSDService rather than the OSD directly.  Over time, more structures will
be moved to the OSDService.  osd_lock can no longer be held while pg
locks are held.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG, OSD: info_map shouldn't contain the MOSDPGInfo*
Samuel Just [Thu, 14 Jun 2012 01:56:16 +0000 (18:56 -0700)]
PG, OSD: info_map shouldn't contain the MOSDPGInfo*

Rather, we will just pass the same type as the notifies.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: queue_want_up_thru in process_peering_event
Samuel Just [Tue, 8 May 2012 17:56:36 +0000 (10:56 -0700)]
OSD: queue_want_up_thru in process_peering_event

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: do not drop osd_lock in handle_osd_map
Samuel Just [Mon, 7 May 2012 20:51:55 +0000 (12:51 -0800)]
OSD: do not drop osd_lock in handle_osd_map

PGs have their map updates done in a different thread.  Thus, we no
longer need to grab the pg locks.  activate_map no longer requires
the map_lock in order to allow us to queue events for the pgs.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: get map read lock during queue_want_up_thru
Samuel Just [Fri, 1 Jun 2012 16:59:00 +0000 (09:59 -0700)]
OSD: get map read lock during queue_want_up_thru

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: push_waiters is no longer used
Samuel Just [Fri, 1 Jun 2012 16:58:42 +0000 (09:58 -0700)]
OSD: push_waiters is no longer used

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: do not lock osd during dequeue_op
Samuel Just [Mon, 7 May 2012 20:00:41 +0000 (13:00 -0700)]
OSD: do not lock osd during dequeue_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: don't assume pending pg removals have flushed
Samuel Just [Mon, 7 May 2012 18:32:59 +0000 (11:32 -0700)]
OSD: don't assume pending pg removals have flushed

_create_lock_pg might encounter a preexisting pg collection simply
because the removal transaction had not yet completed.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoReplicatedPG: change ReplicatedPG debug output to match PG
Samuel Just [Mon, 7 May 2012 18:00:52 +0000 (11:00 -0700)]
ReplicatedPG: change ReplicatedPG debug output to match PG

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoReplicatedPG: do not eval_repop if aborted
Samuel Just [Mon, 7 May 2012 17:33:59 +0000 (10:33 -0700)]
ReplicatedPG: do not eval_repop if aborted

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: remove superfluous pg get/put around enqueue_op
Samuel Just [Fri, 4 May 2012 00:48:20 +0000 (17:48 -0700)]
OSD: remove superfluous pg get/put around enqueue_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG,OSD: fix op checking in pg, take_waiters during ActMap
Samuel Just [Tue, 12 Jun 2012 00:05:40 +0000 (17:05 -0700)]
PG,OSD: fix op checking in pg, take_waiters during ActMap

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG,OSD: add OSD::queue_for_op, use in PG::queue_op
Samuel Just [Thu, 3 May 2012 20:40:54 +0000 (13:40 -0700)]
PG,OSD: add OSD::queue_for_op, use in PG::queue_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: check for deleting in process_peering_event
Samuel Just [Fri, 1 Jun 2012 16:50:10 +0000 (09:50 -0700)]
OSD: check for deleting in process_peering_event

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG: use osd->requeue_ops for ops, pg->queue_for_peering to requeue pg
Samuel Just [Fri, 1 Jun 2012 16:49:55 +0000 (09:49 -0700)]
PG: use osd->requeue_ops for ops, pg->queue_for_peering to requeue pg

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG: compound messages must carry epoch_sent for each part
Samuel Just [Wed, 13 Jun 2012 18:27:49 +0000 (11:27 -0700)]
PG: compound messages must carry epoch_sent for each part

Query and Notify messages include logical messages from multiple
pgs.  Each logical message (pg_query_t and pg_notify_t) now
contains an epoch_sent.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG: CephPeeringEvents can now be descriptively printed
Samuel Just [Wed, 13 Jun 2012 04:25:39 +0000 (21:25 -0700)]
PG: CephPeeringEvents can now be descriptively printed

The CephPeeringEvt constructor is now templated to allow
storing a description string for debugging.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: initialize pgs in get_or_create_pg via handle_create
Samuel Just [Thu, 31 May 2012 05:19:58 +0000 (21:19 -0800)]
OSD: initialize pgs in get_or_create_pg via handle_create

Previously, pgs were initialized via Info/Log/etc.  Since the event
which triggered the pg creation may now be queued, map update events may
occur before the event is processed.  Thus, get_or_create_pg now handles
the initialization prior to queuing the event.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG: include info and query by value in peering events
Samuel Just [Thu, 31 May 2012 05:19:48 +0000 (22:19 -0700)]
PG: include info and query by value in peering events

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,PG: handle pg map advance in process_peering_event
Samuel Just [Tue, 24 Apr 2012 23:00:49 +0000 (16:00 -0700)]
OSD,PG: handle pg map advance in process_peering_event

The pg map will now be advanced in process_peering_event (in advance_pg)
to allow handle_osd_map to not grab pg locks in-line.  handle_osd_map
queues NullEvts to ensure that each pg is updated in a timely fashion.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoosd/: Make pg osdmap be independent of osd, other pg maps
Samuel Just [Tue, 12 Jun 2012 23:01:05 +0000 (16:01 -0700)]
osd/: Make pg osdmap be independent of osd, other pg maps

This will allow handle_osd_map to not stop other work queues.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,PG: Move Op,SubOp queueing into PG
Samuel Just [Tue, 24 Apr 2012 00:40:58 +0000 (17:40 -0700)]
OSD,PG: Move Op,SubOp queueing into PG

PG now handles delaying/discarding messages since pg map epoch may not
be the same as the OSD map.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG: process peering events in a queue
Samuel Just [Thu, 31 May 2012 04:51:29 +0000 (21:51 -0700)]
PG: process peering events in a queue

Peering events are now queued via queue_peering_event in the
peering_queue.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG: use intrusive_ptr in CephPeeringEvt
Samuel Just [Thu, 31 May 2012 04:50:27 +0000 (21:50 -0700)]
PG: use intrusive_ptr in CephPeeringEvt

Properly disposing of the event_base member of CephPeeringEvt
requires use of intrusive_ptr.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoosd/: move history update from handle_pg_query into pg
Samuel Just [Wed, 18 Apr 2012 22:39:46 +0000 (15:39 -0700)]
osd/: move history update from handle_pg_query into pg

Previously, replica history was updated in OSD::handle_pg_query.
Updating the history is now handled in the pg state machine.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,PG: push message checking to pg
Samuel Just [Wed, 18 Apr 2012 22:20:02 +0000 (15:20 -0700)]
OSD,PG: push message checking to pg

old_peering_evt now checks CephPeeringEvts generically in
PG::handle_peering_event().

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: Remove handle_pg_missing, MOSDPGMissing no longer used
Samuel Just [Wed, 18 Apr 2012 22:19:31 +0000 (15:19 -0700)]
OSD: Remove handle_pg_missing, MOSDPGMissing no longer used

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoPG: Move handle_* methods to PG
Samuel Just [Wed, 18 Apr 2012 21:11:26 +0000 (14:11 -0700)]
PG: Move handle_* methods to PG

PG now calls handle_event in RecoveryState.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoPG: CephPeeringEvt
Samuel Just [Tue, 17 Apr 2012 23:54:06 +0000 (16:54 -0700)]
PG: CephPeeringEvt

CephPeeringEvt is now the supertype for all peering state machine
events.  This will allow us to generalize checking for stale peering
events and delaying events for future maps.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,PG::scrub() move pg->put() into queue process
Samuel Just [Tue, 3 Jul 2012 20:58:56 +0000 (13:58 -0700)]
OSD,PG::scrub() move pg->put() into queue process

This clarifies ownership of the pg reference.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoosd: add missing formatter close_section() to scrub status
Sage Weil [Wed, 4 Jul 2012 20:59:04 +0000 (13:59 -0700)]
osd: add missing formatter close_section() to scrub status

Also add braces to make the open/close matchups easier to see.  Broken
by f36617392710f9b3538bfd59d45fd72265993d57.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoMerge branch 'stable'
Sage Weil [Wed, 4 Jul 2012 16:30:21 +0000 (09:30 -0700)]
Merge branch 'stable'

Conflicts:
src/test/cli/radosgw-admin/help.t

13 years agolibrados: Bump the version to 0.48
Wido den Hollander [Wed, 4 Jul 2012 13:46:04 +0000 (15:46 +0200)]
librados: Bump the version to 0.48

Signed-off-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Sage Weil <sage@inktank.com>
13 years agolibrados: add assert_version as an operation on an ObjectOperation
Samuel Just [Tue, 3 Jul 2012 19:00:32 +0000 (12:00 -0700)]
librados: add assert_version as an operation on an ObjectOperation

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoReplicatedPG: do not set reply version to last_update
Samuel Just [Tue, 3 Jul 2012 22:35:29 +0000 (15:35 -0700)]
ReplicatedPG: do not set reply version to last_update

The version should be oi.user_version as set above.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agorgw: initialize fields of RGWObjEnt
Sage Weil [Wed, 4 Jul 2012 01:51:02 +0000 (18:51 -0700)]
rgw: initialize fields of RGWObjEnt

This fixes various valgrind warnings triggered by the s3test
test_object_create_unreadable.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoMerge remote-tracking branch 'gh/wip-crush'
Sage Weil [Tue, 3 Jul 2012 23:49:29 +0000 (16:49 -0700)]
Merge remote-tracking branch 'gh/wip-crush'

13 years agorgw-admin: use correct modifier with strptime
Yehuda Sadeh [Wed, 27 Jun 2012 00:28:51 +0000 (17:28 -0700)]
rgw-admin: use correct modifier with strptime

Bug #2658: used %I (12h) instead of %H (24h)

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
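
The difference between the two directives can be illustrated with Python's datetime.strptime, which uses the same %I and %H names as C strptime. This is only a sketch: the fix itself is in C++ code, the helper name parse_time is hypothetical, and the C library may handle a mismatch differently from Python, which raises ValueError.

```python
from datetime import datetime


def parse_time(text, hour_directive):
    """Parse 'hh:mm:ss' using either %H (24-hour) or %I (12-hour)."""
    return datetime.strptime(text, f"{hour_directive}:%M:%S").time()


# %H accepts hours 00-23, so an afternoon timestamp parses as expected.
afternoon = parse_time("17:28:51", "%H")

# %I only accepts hours 01-12, so the same input is rejected.
try:
    parse_time("17:28:51", "%I")
    rejected = False
except ValueError:
    rejected = True
```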
13 years agorgw: send both swift x-storage-token and x-auth-token
Yehuda Sadeh [Thu, 21 Jun 2012 22:40:27 +0000 (15:40 -0700)]
rgw: send both swift x-storage-token and x-auth-token

older clients need x-storage-token, newer x-auth-token

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
13 years agorgw: radosgw-admin date params now also accept time
Yehuda Sadeh [Thu, 21 Jun 2012 22:17:19 +0000 (15:17 -0700)]
rgw: radosgw-admin date params now also accept time

The date format now is "YYYY-MM-DD[ hh:mm:ss]". Got rid of
the --time param for the old ops log stuff.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Conflicts:

src/test/cli/radosgw-admin/help.t
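
A date parameter of the form "YYYY-MM-DD[ hh:mm:ss]" can be handled by trying the full format first and falling back to the date-only format. The sketch below is in Python for illustration; parse_date_param is a hypothetical name and this is not the radosgw-admin implementation.

```python
from datetime import datetime

# Formats tried in order: date with time first, then date only.
_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_date_param(text):
    """Parse 'YYYY-MM-DD' with an optional ' hh:mm:ss' suffix."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"expected YYYY-MM-DD[ hh:mm:ss], got {text!r}")
```

When the time is omitted, the result falls at midnight on the given date.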

13 years agorgw-admin: fix usage help
Yehuda Sadeh [Thu, 21 Jun 2012 20:14:47 +0000 (13:14 -0700)]
rgw-admin: fix usage help

s/show/trim

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
13 years agoceph-disk-prepare: Partition and format OSD data disks automatically.
Tommi Virtanen [Tue, 3 Jul 2012 22:24:26 +0000 (15:24 -0700)]
ceph-disk-prepare: Partition and format OSD data disks automatically.

Uses gdisk, as it seems to be the only tool that can automate GPT uuid
changes. Needs to run as root.

Adds Recommends: gdisk to ceph.deb.

Closes: #2547
Signed-off-by: Tommi Virtanen <tv@inktank.com>
13 years agodoc: removed /srv/osd.$id.journal from ceph.conf example.
John Wilkins [Tue, 3 Jul 2012 21:20:34 +0000 (14:20 -0700)]
doc: removed /srv/osd.$id.journal  from ceph.conf example.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agoCrushTester.cc: remove BOOST dependencies.
caleb miles [Tue, 3 Jul 2012 20:05:48 +0000 (13:05 -0700)]
CrushTester.cc: remove BOOST dependencies.

remove calls to BOOST libraries for computing Chi-squared statistics and
producing discrete random variables with a given probability distribution.

Signed-off-by: caleb miles <caleb.miles@inktank.com>
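
Both computations named above are small enough to write without a library. The Python sketch below shows one common way to do each (Pearson's chi-squared statistic, and inverse-CDF sampling over cumulative weights); it is an assumption about the general technique, not a transcription of CrushTester.cc, and the function names are hypothetical.

```python
import bisect
import itertools
import random


def chi_squared(observed, expected):
    """Pearson's chi-squared statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def sample_discrete(weights, rng):
    """Draw an index with probability proportional to its weight."""
    cumulative = list(itertools.accumulate(weights))
    r = rng.random() * cumulative[-1]  # uniform in [0, total)
    # The first cumulative bound strictly greater than r selects the index.
    index = bisect.bisect_right(cumulative, r)
    return min(index, len(weights) - 1)  # guard against float rounding


rng = random.Random(0)
draws = [sample_discrete([1, 3], rng) for _ in range(10000)]
fraction_of_zero = draws.count(0) / len(draws)  # should be near 0.25
```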
13 years agodoc: Updates to 5-minute quick start.
John Wilkins [Tue, 3 Jul 2012 21:14:42 +0000 (14:14 -0700)]
doc: Updates to 5-minute quick start.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agoradosgw-admin: fix cli test
Sage Weil [Tue, 3 Jul 2012 21:07:16 +0000 (14:07 -0700)]
radosgw-admin: fix cli test

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoMerge branch 'wip-config'
Sage Weil [Tue, 3 Jul 2012 20:04:36 +0000 (13:04 -0700)]
Merge branch 'wip-config'

13 years agolockdep: increase max locks
Sage Weil [Tue, 3 Jul 2012 20:04:28 +0000 (13:04 -0700)]
lockdep: increase max locks

Hit this limit with the rados api tests.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoconfig: add unlocked version of get_my_sections; use it internally
Sage Weil [Tue, 3 Jul 2012 19:07:28 +0000 (12:07 -0700)]
config: add unlocked version of get_my_sections; use it internally

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoceph: fix cli help test
Sage Weil [Tue, 3 Jul 2012 18:32:57 +0000 (11:32 -0700)]
ceph: fix cli help test

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoMerge branch 'master' of github.com:ceph/ceph
John Wilkins [Tue, 3 Jul 2012 18:48:31 +0000 (11:48 -0700)]
Merge branch 'master' of github.com:ceph/ceph

13 years agodoc: Clean up of 5-minute quick start.
John Wilkins [Tue, 3 Jul 2012 18:48:15 +0000 (11:48 -0700)]
doc: Clean up of 5-minute quick start.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agoReplicatedPG: remove faulty scrub assert in sub_op_modify_applied
Samuel Just [Tue, 3 Jul 2012 18:23:16 +0000 (11:23 -0700)]
ReplicatedPG: remove faulty scrub assert in sub_op_modify_applied

This assert assumed that all ops submitted before MOSDRepScrub was
submitted were processed by the time that MOSDRepScrub was
processed.  In fact, MOSDRepScrub's scrub_to may refer to a
last_update yet to be seen by the replica.

Bug #2693

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoReplicatedPG: remove faulty scrub assert in sub_op_modify_applied
Samuel Just [Tue, 3 Jul 2012 18:23:16 +0000 (11:23 -0700)]
ReplicatedPG: remove faulty scrub assert in sub_op_modify_applied

This assert assumed that all ops submitted before MOSDRepScrub was
submitted were processed by the time that MOSDRepScrub was
processed.  In fact, MOSDRepScrub's scrub_to may refer to a
last_update yet to be seen by the replica.

Bug #2693

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agodoc: Updating Getting Started with 5-minute quick start.
John Wilkins [Tue, 3 Jul 2012 18:21:43 +0000 (11:21 -0700)]
doc: Updating Getting Started with 5-minute quick start.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agoceph: better usage
Kyle Bader [Tue, 3 Jul 2012 18:20:38 +0000 (11:20 -0700)]
ceph: better usage

Signed-off-by: Kyle Bader <kyle.bader@dreamhost.com>
13 years agoMerge branch 'master' of github.com:ceph/ceph
John Wilkins [Tue, 3 Jul 2012 18:18:11 +0000 (11:18 -0700)]
Merge branch 'master' of github.com:ceph/ceph

13 years agodoc: restructuring quick start section.
John Wilkins [Tue, 3 Jul 2012 18:17:50 +0000 (11:17 -0700)]
doc: restructuring quick start section.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agoIoCtxImpl: pass objver pointer to aio_operate_read
Samuel Just [Tue, 3 Jul 2012 18:10:54 +0000 (11:10 -0700)]
IoCtxImpl: pass objver pointer to aio_operate_read

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoceph-disk-prepare: Take fsid from config file.
Tommi Virtanen [Tue, 3 Jul 2012 16:22:28 +0000 (09:22 -0700)]
ceph-disk-prepare: Take fsid from config file.

Closes: #2546.
Signed-off-by: Tommi Virtanen <tv@inktank.com>
13 years agoconfig: remove bad argparse_flag argument in parse_option()
Sage Weil [Tue, 3 Jul 2012 13:46:10 +0000 (06:46 -0700)]
config: remove bad argparse_flag argument in parse_option()

This is wrong, and thankfully valgrind picks it up.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agodebian: strip new ceph-mds package
Sage Weil [Tue, 3 Jul 2012 16:20:35 +0000 (09:20 -0700)]
debian: strip new ceph-mds package

Reported-by: Amon Ott <a.ott@m-privacy.de>
Signed-off-by: Sage Weil <sage@inktank.com>
13 years agodoc: Cleaned up rbd snapshots.
John Wilkins [Tue, 3 Jul 2012 15:46:14 +0000 (08:46 -0700)]
doc: Cleaned up rbd snapshots.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agoconfig: fix lock recursion in get_val_from_conf_file()
Sage Weil [Tue, 3 Jul 2012 15:20:06 +0000 (08:20 -0700)]
config: fix lock recursion in get_val_from_conf_file()

Introduce a private, already-locked version.

Signed-off-by: Sage Weil <sage@inktank.com>
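
The "private, already-locked version" pattern can be sketched as below. This is an illustrative Python model with hypothetical names (Config, _get_val_locked, get_with_fallback), not the Ceph config code: public methods take a non-recursive lock once, and internal callers use the helper that assumes the lock is already held.

```python
import threading


class Config:
    def __init__(self, values):
        self._lock = threading.Lock()  # non-recursive: re-acquiring deadlocks
        self._values = dict(values)

    def _get_val_locked(self, key):
        """Private helper: the caller must already hold self._lock."""
        return self._values.get(key)

    def get_val(self, key):
        """Public entry point: takes the lock, then delegates."""
        with self._lock:
            return self._get_val_locked(key)

    def get_with_fallback(self, key, fallback_key):
        """Needs two lookups under one lock acquisition."""
        with self._lock:
            # Calling self.get_val() here would try to retake the lock
            # and hang; the already-locked helper avoids that.
            value = self._get_val_locked(key)
            if value is None:
                value = self._get_val_locked(fallback_key)
            return value
```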
13 years agoconfig: fix recursive lock in parse_config_files()
Sage Weil [Tue, 3 Jul 2012 15:15:08 +0000 (08:15 -0700)]
config: fix recursive lock in parse_config_files()

The _impl() helper is only called from parse_config_files(); don't retake
the lock.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoconfig: remove bad argparse_flag argument in parse_option()
Sage Weil [Tue, 3 Jul 2012 13:46:10 +0000 (06:46 -0700)]
config: remove bad argparse_flag argument in parse_option()

This is wrong, and thankfully valgrind picks it up.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoclient: improve dump_cache output
Sage Weil [Tue, 3 Jul 2012 04:08:27 +0000 (21:08 -0700)]
client: improve dump_cache output

Hunting #1737.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agodoc: release notes for 0.48
Sage Weil [Tue, 3 Jul 2012 03:13:51 +0000 (20:13 -0700)]
doc: release notes for 0.48

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agodoc: 'Configuring a Storage Cluster' -> 'Configuration'
Sage Weil [Tue, 3 Jul 2012 01:03:02 +0000 (18:03 -0700)]
doc: 'Configuring a Storage Cluster' -> 'Configuration'

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoMerge tag 'v0.48argonaut'
Sage Weil [Tue, 3 Jul 2012 04:24:56 +0000 (21:24 -0700)]
Merge tag 'v0.48argonaut'

v0.48argonaut

13 years agoMerge branch 'wip-msgr'
Sage Weil [Tue, 3 Jul 2012 00:54:35 +0000 (17:54 -0700)]
Merge branch 'wip-msgr'

13 years agolockdep: enable in common_init
Sage Weil [Thu, 28 Jun 2012 23:23:30 +0000 (16:23 -0700)]
lockdep: enable in common_init

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomsgr: restart_queue when replacing existing pipe and taking over the queue
Sage Weil [Mon, 2 Jul 2012 00:23:28 +0000 (17:23 -0700)]
msgr: restart_queue when replacing existing pipe and taking over the queue

The queue may have been previously stopped (by discard_queue()), and needs
to be restarted.

Fixes consistent failures from the mon_recovery.py integration tests.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomsgr: choose incoming connection if ours is STANDBY
Sage Weil [Sun, 1 Jul 2012 22:37:31 +0000 (15:37 -0700)]
msgr: choose incoming connection if ours is STANDBY

If the connect_seq matches, but our existing connection is in STANDBY, take
the incoming one.  Otherwise, the other end will wait indefinitely for us
to connect but we won't.

Alternatively, we could "win" the race and trigger a connection by sending
a keepalive (or similar), but that is more work; we may as well accept the
incoming connection we have now.

This removes STANDBY from the acceptable WAIT case states.  It also keeps
responsibility squarely on the shoulders of the peer with something to
deliver.

Without this patch, a 3-osd vstart cluster with
'ms inject socket failures = 100' and rados bench write -b 4096 would start
generating slow request warnings after a few minutes due to the osds
failing to connect to each other.  With the patch, I complete a 10 minute
run without problems.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomsgr: preserve incoming message queue when replacing pipes
Sage Weil [Fri, 29 Jun 2012 00:50:47 +0000 (17:50 -0700)]
msgr: preserve incoming message queue when replacing pipes

If we replace an existing pipe with a new one, move the incoming queue
of messages that have not yet been dispatched over to the new Pipe so that
they are not lost.

Alternatively, we could set in_seq = existing->in_seq - existing->in_qlen,
but that would make the other end resend those messages, which is a waste
of bandwidth.

Very easy to reproduce the original bug with 'ms inject socket failures'.

Signed-off-by: Sage Weil <sage@inktank.com>
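
The queue hand-off can be sketched as follows. This is a simplified Python model, not the messenger code: Pipe here is a hypothetical class holding only an incoming deque, and replace_pipe is an illustrative name. Undispatched messages from the old pipe are placed ahead of anything already queued on the new one so that delivery order is preserved.

```python
from collections import deque


class Pipe:
    """Hypothetical stand-in holding only the incoming message queue."""

    def __init__(self, messages=()):
        self.in_q = deque(messages)


def replace_pipe(existing, new):
    """Move undispatched messages from the old pipe to the new one."""
    # extendleft() prepends one item at a time, so feed it the old queue
    # in reverse to keep the original order at the front of the new queue.
    new.in_q.extendleft(reversed(existing.in_q))
    existing.in_q.clear()
    return new
```

The alternative mentioned above, rewinding in_seq, would give the same end state but makes the peer retransmit messages that were already received.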
13 years agomsgr: move dispatch_entry into DispatchQueue class
Sage Weil [Fri, 29 Jun 2012 00:45:24 +0000 (17:45 -0700)]
msgr: move dispatch_entry into DispatchQueue class

A bit cleaner.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomsgr: move incoming queue to separate class
Sage Weil [Fri, 29 Jun 2012 00:38:34 +0000 (17:38 -0700)]
msgr: move incoming queue to separate class

This extricates the incoming queue and its funky relationship with
DispatchQueue from Pipe and moves it into IncomingQueue.  There is now a
single IncomingQueue attached to each Pipe.  DispatchQueue is now no
longer tied to Pipe.

This modularizes the code a bit better (tho that is still a work in
progress) and (more importantly) will make it possible to move the
incoming messages from one pipe to another in accept().

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomsgr: make D_CONNECT constant non-zero, fix ms_handle_connect() callback
Sage Weil [Thu, 28 Jun 2012 00:06:40 +0000 (17:06 -0700)]
msgr: make D_CONNECT constant non-zero, fix ms_handle_connect() callback

A while ago we inadvertently broke ms_handle_connect() callbacks because
of a check for m being non-zero in the dispatch_entry() thread.  Adjust the
enums so that they get delivered again.

This fixes hangs when, for example, the ceph tool sends a command, gets a
connection reset, and doesn't get the connect callback to resend after
reconnecting to a new monitor.

Signed-off-by: Sage Weil <sage@inktank.com>
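
The failure mode is easy to reproduce in miniature: if queue entries are filtered with a "non-zero" check, an event code whose value is 0 is silently dropped. The Python sketch below is illustrative only. The enum values and the D_OTHER member are assumptions rather than Ceph's actual definitions, and dispatch is a hypothetical stand-in for the dispatch loop.

```python
from enum import IntEnum


class BrokenCode(IntEnum):
    D_CONNECT = 0  # zero is falsy, so a "non-zero" check drops it
    D_OTHER = 1    # hypothetical second code


class FixedCode(IntEnum):
    D_CONNECT = 1  # non-zero: survives the check
    D_OTHER = 2    # hypothetical second code


def dispatch(entries):
    """Deliver every entry that passes a 'non-zero' check."""
    delivered = []
    for m in entries:
        if not m:  # mirrors the check for m being non-zero
            continue
        delivered.append(m)
    return delivered
```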
13 years agomsgr: fix pipe replacement assert
Sage Weil [Wed, 27 Jun 2012 00:10:40 +0000 (17:10 -0700)]
msgr: fix pipe replacement assert

We may replace an existing pipe in the STANDBY state if the previous
attempt failed during accept() (see previous patches).

This might fix #1378.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomsgr: do not try to reconnect con with CLOSED pipe
Sage Weil [Wed, 27 Jun 2012 00:07:31 +0000 (17:07 -0700)]
msgr: do not try to reconnect con with CLOSED pipe

If we have a con with a closed pipe, drop the message.  For lossless
sessions, the state will be STANDBY if we should reconnect.  For lossy
sessions, we will end up with CLOSED and we *should* drop the message.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomsgr: move to STANDBY if we replace during accept and then fail
Sage Weil [Wed, 27 Jun 2012 00:06:41 +0000 (17:06 -0700)]
msgr: move to STANDBY if we replace during accept and then fail

If we replace an existing pipe during accept() and then fail, move to
STANDBY so that our connection state (connect_seq, etc.) is preserved.
Otherwise, we will throw out that information and falsely trigger a
RESETSESSION on the next connection attempt.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agov0.48argonaut v0.48argonaut
Sage Weil [Sat, 30 Jun 2012 21:50:20 +0000 (14:50 -0700)]
v0.48argonaut

13 years agoceph.spec.in: Change license of base package to GPL and use SPDX format
Holger Macht [Mon, 2 Jul 2012 20:54:48 +0000 (13:54 -0700)]
ceph.spec.in: Change license of base package to GPL and use SPDX format

LGPLv2 in spec file is not correct, because some of the included
packages/binaries are GPLv2. For example:

 src/mount/mtab.c     -> package ceph, binary mount.ceph
 src/common/fiemap.cc -> package ceph, binary rbd

Also use SPDX format (http://www.spdx.org/licenses) for the sub-package
licenses.

Signed-off-by: Holger Macht <hmacht@suse.de>
13 years agomon: initialize quorum_features
Sage Weil [Mon, 2 Jul 2012 23:05:16 +0000 (16:05 -0700)]
mon: initialize quorum_features

This could cause us to incorrectly encode new features into the monstore
that an old mon won't understand.

This is overly conservative; we probably need to persist the set of quorum
features that are supported and use those.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agodoc: fixed --cap error and a few additional bits of cleanup.
John Wilkins [Mon, 2 Jul 2012 20:05:26 +0000 (13:05 -0700)]
doc: fixed --cap error and a few additional bits of cleanup.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agoOSD::do_command: unlock pg only if we had it
Samuel Just [Mon, 2 Jul 2012 16:51:37 +0000 (09:51 -0700)]
OSD::do_command: unlock pg only if we had it

Signed-off-by: Samuel Just <sam.just@inktank.com>