git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
14 years agoosd: handle notify+info explicitly in GetInfo state
Sage Weil [Thu, 5 May 2011 15:54:23 +0000 (08:54 -0700)]
osd: handle notify+info explicitly in GetInfo state

This fixes a few things:
 - do not proceed past GetInfo if there are down osds.  ever.
 - if we get a new info that moves last_epoch_started forward,
   rebuild prior, because we may have eliminated said down osds.
 - if we get dup info, do nothing
 - if we get new info, see if we can proceed to GetLog

This is all simpler/cleaner by handling Notify/Info (they're the same)
explicitly in the GetInfo state and not falling back to the parent
state handler.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
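
The four rules listed in this commit message can be read as a small decision function. The following is only a sketch under stated assumptions: the enum, the function name, and the boolean inputs are hypothetical and are not taken from PG.h, and the order of the checks is one plausible reading of the bullet list.

```cpp
#include <cassert>

// Hypothetical outcome of receiving a Notify/Info while in GetInfo.
enum class Next { StayInGetInfo, RebuildPriorThenRecheck, GoToGetLog };

// Hypothetical decision helper mirroring the bullet list above.
Next on_info(bool duplicate,
             bool advances_last_epoch_started,
             bool prior_has_down_osds,
             bool all_infos_received) {
  if (duplicate)
    return Next::StayInGetInfo;            // dup info: do nothing
  if (advances_last_epoch_started)
    return Next::RebuildPriorThenRecheck;  // down osds may have been eliminated
  if (prior_has_down_osds)
    return Next::StayInGetInfo;            // never proceed with down osds
  return all_infos_received ? Next::GoToGetLog : Next::StayInGetInfo;
}
```

Keeping the decision in one place is what the message means by handling Notify/Info explicitly in the state instead of falling back to the parent handler.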
14 years agoosd: statechart whitespace
Sage Weil [Thu, 5 May 2011 15:18:55 +0000 (08:18 -0700)]
osd: statechart whitespace

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: initialize pg state event counters
Sage Weil [Thu, 5 May 2011 15:14:17 +0000 (08:14 -0700)]
osd: initialize pg state event counters

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: fix GetInfo querying
Sage Weil [Thu, 5 May 2011 15:12:24 +0000 (08:12 -0700)]
osd: fix GetInfo querying

Don't query for info we already have, or have already requested.  Remove
unneeded helper so that this is simpler and we have access to the info
we need.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
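
The filtering rule in this message (skip peers whose info we already hold or have already asked for) might look roughly like the sketch below. The function name, the use of plain int osd ids, and the three sets are assumptions made for illustration, not the actual GetInfo code.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper: return the peers that still need an info query,
// and record them in `requested` so they are not queried twice.
std::vector<int> peers_to_query(const std::set<int>& prior,
                                const std::set<int>& have_info,
                                std::set<int>& requested) {
  std::vector<int> out;
  for (int osd : prior) {
    if (have_info.count(osd) || requested.count(osd))
      continue;  // already have it, or a query is already in flight
    requested.insert(osd);
    out.push_back(osd);
  }
  return out;
}
```

Calling the helper a second time with the same sets returns nothing, which is the "do not re-request" behaviour the commit describes.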
14 years agoosd: handle event notify/info/log from Initial
Sage Weil [Thu, 5 May 2011 15:11:41 +0000 (08:11 -0700)]
osd: handle event notify/info/log from Initial

We shouldn't post a creation event and jump into peering/stray based on
pg creation when we are about to process more information or else we will
send out unnecessary queries.  Instead, handle those from Initial and jump
to the appropriate state.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: debug handle_*
Sage Weil [Wed, 4 May 2011 23:35:49 +0000 (16:35 -0700)]
osd: debug handle_*

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: fix min_time in state stats
Sage Weil [Wed, 4 May 2011 23:09:58 +0000 (16:09 -0700)]
osd: fix min_time in state stats

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: rename states to reflect nesting; fix enter/exit msgs
Sage Weil [Wed, 4 May 2011 23:09:46 +0000 (16:09 -0700)]
osd: rename states to reflect nesting; fix enter/exit msgs

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoOSD: fill in rctx properly for pg->handle_create in get_or_create_pg
Greg Farnum [Wed, 4 May 2011 23:16:17 +0000 (16:16 -0700)]
OSD: fill in rctx properly for pg->handle_create in get_or_create_pg

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoosd: first pass at pg peering stats
Sage Weil [Wed, 4 May 2011 21:58:42 +0000 (14:58 -0700)]
osd: first pass at pg peering stats

The numbers are a bit off it seems.  Also lots of potential for cleanup
here.  But it (basically) works!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: use const char * state names
Sage Weil [Wed, 4 May 2011 21:12:00 +0000 (14:12 -0700)]
osd: use const char * state names

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoOSD: assert contents exist when erasing from last_scrub_map.
Greg Farnum [Wed, 4 May 2011 21:30:51 +0000 (14:30 -0700)]
OSD: assert contents exist when erasing from last_scrub_map.

Insert PG into last_scrub_map on creation so that this doesn't
break right away.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoPG: proc_replica_info, oinfo not info
Samuel Just [Wed, 4 May 2011 20:38:57 +0000 (13:38 -0700)]
PG: proc_replica_info, oinfo not info

The method param info shadowed PG::info.

14 years agoosd: move directly to Reset state on pg load
Sage Weil [Wed, 4 May 2011 20:05:09 +0000 (13:05 -0700)]
osd: move directly to Reset state on pg load

Add Initial -> Reset transition on pg load.  This avoids doing any
activation-type stuff (like sending messages) before we are ready.  In
particular, we want to advance through any new OSDMaps and only
send out queries/notifies/whatever when we get to the activate_map
stage.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoOSD: start PG state machine when loading pre-existing PGs
Josh Durgin [Wed, 4 May 2011 19:23:08 +0000 (12:23 -0700)]
OSD: start PG state machine when loading pre-existing PGs

This caused a crash when restarting a killed OSD because the Initial
state was receiving the ActMap event.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoPG: ReplicaActive must respond to requests from discover_all_missing
Samuel Just [Wed, 4 May 2011 17:21:54 +0000 (10:21 -0700)]
PG: ReplicaActive must respond to requests from discover_all_missing

If the peer does not yet have the pg during GetMissing, there won't be
a peer_missing entry for that peer.  In that case, discover_all_missing
can legitimately request a missing set after the pg has gone active.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agoPG: collapse crashed transitions to happen on any unexpected event
Josh Durgin [Wed, 4 May 2011 16:32:49 +0000 (09:32 -0700)]
PG: collapse crashed transitions to happen on any unexpected event

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoPG: use a state_name member instead of overriding get_state_name
Josh Durgin [Wed, 4 May 2011 16:10:00 +0000 (09:10 -0700)]
PG: use a state_name member instead of overriding get_state_name

Also add debugging to each state constructor. Since dout uses
the recovery machine context, anything using it in the constructor
must be a state, not a simple_state.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoRevert "osd: simplify check for unconsumed events"
Samuel Just [Wed, 4 May 2011 00:50:34 +0000 (17:50 -0700)]
Revert "osd: simplify check for unconsumed events"

This reverts commit ab34a3ce3e757a54816bd9b884c3f900361d4930.

It turns out that unconsumed_event supersedes checking outer states. :(

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agoPG: Primary should also discard the ActMap event
Samuel Just [Wed, 4 May 2011 00:19:33 +0000 (17:19 -0700)]
PG: Primary should also discard the ActMap event

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agoPG: ActMap should be discarded if no outer state handles it
Samuel Just [Wed, 4 May 2011 00:02:34 +0000 (17:02 -0700)]
PG: ActMap should be discarded if no outer state handles it

14 years agoosd: simplify check for unconsumed events
Sage Weil [Wed, 4 May 2011 00:03:37 +0000 (17:03 -0700)]
osd: simplify check for unconsumed events

No need for the Crashed pseudo state.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: make debug output include state name
Sage Weil [Tue, 3 May 2011 22:49:02 +0000 (15:49 -0700)]
osd: make debug output include state name

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: fix event names
Sage Weil [Tue, 3 May 2011 23:38:27 +0000 (16:38 -0700)]
osd: fix event names

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoPG.h: transition to crashed on unhandled message
Samuel Just [Tue, 3 May 2011 23:26:55 +0000 (16:26 -0700)]
PG.h: transition to crashed on unhandled message

14 years agoosd: feed new pg mapping into state machine
Sage Weil [Tue, 3 May 2011 22:31:28 +0000 (15:31 -0700)]
osd: feed new pg mapping into state machine

instead of recalculating it.  Also pass the last map into warm_restart,
while we're at it.  Drop the Reset state constructor and instead repost
the AdvMap event before transitioning.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosdmap: fix some constedness
Sage Weil [Tue, 3 May 2011 22:29:48 +0000 (15:29 -0700)]
osdmap: fix some constedness

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: turn off recovery oid sets
Sage Weil [Tue, 3 May 2011 21:31:20 +0000 (14:31 -0700)]
osd: turn off recovery oid sets

This is slow, eats memory, and dumps huge amounts of crap to the debug
logs when enabled.  Leave it off unless we are actually hunting down a bug.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoPG: remove peer_info_requested member
Josh Durgin [Tue, 3 May 2011 21:15:45 +0000 (14:15 -0700)]
PG: remove peer_info_requested member

This is internal to the GetInfo state now.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoPG: don't become clean in purge_strays
Josh Durgin [Tue, 3 May 2011 21:03:14 +0000 (14:03 -0700)]
PG: don't become clean in purge_strays

Our state is already clean here.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoPG: send notifies when a stray or an active replica gets an ActMap
Josh Durgin [Mon, 2 May 2011 23:21:09 +0000 (16:21 -0700)]
PG: send notifies when a stray or an active replica gets an ActMap

This was present before refactoring.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoPG: fix proc_master_log output
Josh Durgin [Mon, 2 May 2011 21:39:04 +0000 (14:39 -0700)]
PG: fix proc_master_log output

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoPG: handle info in proc_replica_log just like we did in _process_pg_info
Josh Durgin [Mon, 2 May 2011 21:38:16 +0000 (14:38 -0700)]
PG: handle info in proc_replica_log just like we did in _process_pg_info

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoMerge branch 'master' into wip_pg_refactor
Sage Weil [Tue, 3 May 2011 21:28:45 +0000 (14:28 -0700)]
Merge branch 'master' into wip_pg_refactor

14 years agoosd: only specify start version for Query::LOG
Sage Weil [Tue, 3 May 2011 21:27:35 +0000 (14:27 -0700)]
osd: only specify start version for Query::LOG

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: use enum instead of const static int members
Sage Weil [Tue, 3 May 2011 21:02:41 +0000 (14:02 -0700)]
osd: use enum instead of const static int members

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: leave recovery hooks in PG
Sage Weil [Tue, 3 May 2011 21:01:42 +0000 (14:01 -0700)]
osd: leave recovery hooks in PG

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: fix pg log entry types to not always be delete
Sage Weil [Tue, 3 May 2011 20:08:35 +0000 (13:08 -0700)]
osd: fix pg log entry types to not always be delete

This was broken by the osd_trans work merged in 01f3526b62.  We need to
use the obs reference to new_obs.  This caused objects to be deleted during
pg recovery.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomon: add 'ceph osd rm N...' command
Sage Weil [Tue, 3 May 2011 19:37:33 +0000 (12:37 -0700)]
mon: add 'ceph osd rm N...' command

So we can mark an old osd as deleted and have it not appear in the osdmap
dump, summary count.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoMerge remote branch 'origin/stable'
Sage Weil [Tue, 3 May 2011 19:36:16 +0000 (12:36 -0700)]
Merge remote branch 'origin/stable'

Conflicts:
src/mon/OSDMonitor.cc

14 years agoosdmap: allow incremental to represent osd deletion
Sage Weil [Tue, 3 May 2011 19:34:54 +0000 (12:34 -0700)]
osdmap: allow incremental to represent osd deletion

Convert new_down to new_state, with values xored onto the old state.  We
preserve compatibility with old incrementals because they were (virtually)
always 0, and we can special case that to mean toggle CEPH_OSD_UP.  We
don't really care if clients get new values right.. if they don't clear
the EXISTS flag that doesn't really hurt them.  It's only important that
the monitor get it right.

To ensure that, we rev the monitor internal protocol.

Signed-off-by: Sage Weil <sage@newdream.net>
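
The xor scheme described in this message can be sketched in a few lines. This is a minimal illustration, not the OSDMap code: the flag constants below are stand-ins for CEPH_OSD_EXISTS and CEPH_OSD_UP with assumed bit positions, and apply_new_state is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for CEPH_OSD_EXISTS / CEPH_OSD_UP; bit positions are assumed.
constexpr uint32_t OSD_EXISTS = 1u << 0;
constexpr uint32_t OSD_UP     = 1u << 1;

// Apply one incremental entry to an OSD's state word.
// A delta of 0 is what old new_down entries (virtually) always carried,
// so it is special-cased to mean "toggle UP"; anything else is xored on.
uint32_t apply_new_state(uint32_t old_state, uint32_t delta) {
  if (delta == 0)
    delta = OSD_UP;
  return old_state ^ delta;
}
```

Under these assumptions, marking an osd down is a delta of 0 (or OSD_UP), while deleting an up osd would be a delta of OSD_EXISTS | OSD_UP, which clears both bits.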
14 years agoobjecter: remove useless mark_down code
Sage Weil [Tue, 3 May 2011 19:28:00 +0000 (12:28 -0700)]
objecter: remove useless mark_down code

We already check sessions a bit further down, and this code only worked
when we got incrementals, not full maps.  Take it out.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agotest-obsync: test ACL translation, run unit tests
Colin Patrick McCabe [Tue, 3 May 2011 18:07:17 +0000 (11:07 -0700)]
test-obsync: test ACL translation, run unit tests

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoobsync: remove --owner, elide owner from ACL XML
Colin Patrick McCabe [Tue, 3 May 2011 17:58:04 +0000 (10:58 -0700)]
obsync: remove --owner, elide owner from ACL XML

Just omit the owner field from the ACL XML. It is optional anyway.

Don't supply an --owner switch. The owner will always be the same user
that created the object.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoobsync: better usage
Colin Patrick McCabe [Tue, 3 May 2011 17:20:30 +0000 (10:20 -0700)]
obsync: better usage

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoOSD,PG: Peering refactor
Samuel Just [Tue, 3 May 2011 00:03:56 +0000 (17:03 -0700)]
OSD,PG: Peering refactor

Previously, peering was handled by a de facto state machine in do_peer
and related methods.  Peering state will now be encapsulated in
RecoveryState, which uses boost::statechart internally to enforce an
explicit state machine abstraction.  OSD::handle_pg_* pass off to
PG::handle_*, which pass messages to the state machine.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agoOSD,PG: Move pg reset code from OSD::advance_map to PG
Samuel Just [Fri, 22 Apr 2011 00:42:51 +0000 (17:42 -0700)]
OSD,PG: Move pg reset code from OSD::advance_map to PG

OSD::advance_map previously handled resetting the PG for peering.  Now,
PG::acting_up_affected returns true if peering needs to be restarted and
PG::warm_restart takes care of resetting the pg.

14 years agoPG: choose_log_location
Josh Durgin [Thu, 21 Apr 2011 23:27:57 +0000 (16:27 -0700)]
PG: choose_log_location

Choosing the master log holder and deciding whether to generate a
backlog are now handled by choose_log_location.

14 years agoPG: Extract query map generation from recover_master_log
Josh Durgin [Thu, 21 Apr 2011 21:59:17 +0000 (14:59 -0700)]
PG: Extract query map generation from recover_master_log

PgPriorSet::gen_query_map now generates the initial info query map.

14 years agoPG: Refactor build_prior into a PgPriorSet constructor.
Samuel Just [Thu, 21 Apr 2011 20:36:13 +0000 (13:36 -0700)]
PG: Refactor build_prior into a PgPriorSet constructor.

14 years agoPG: Add gen_prefix method for generating the pg error prefix
Samuel Just [Mon, 2 May 2011 22:39:36 +0000 (15:39 -0700)]
PG: Add gen_prefix method for generating the pg error prefix

This should make it easier to add dout macros for non-pg methods

14 years agoTestSnaps.cc: default to testing with the data pool
Samuel Just [Mon, 2 May 2011 22:27:04 +0000 (15:27 -0700)]
TestSnaps.cc: default to testing with the data pool

14 years agoOSD.cc: handle_pg_create fix initial last_epoch_started value
Samuel Just [Mon, 2 May 2011 21:33:03 +0000 (14:33 -0700)]
OSD.cc: handle_pg_create fix initial last_epoch_started value

last_epoch_started == same_acting_since should not be true before the pg
goes active for the first time.

14 years agoobsync: only require --owner if --xuser is set
Colin Patrick McCabe [Tue, 3 May 2011 17:06:11 +0000 (10:06 -0700)]
obsync: only require --owner if --xuser is set

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoobsync: implement --owner
Colin Patrick McCabe [Tue, 3 May 2011 01:41:31 +0000 (18:41 -0700)]
obsync: implement --owner

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agocfuse: encode/decode dev_t properly
Sage Weil [Tue, 3 May 2011 01:19:32 +0000 (18:19 -0700)]
cfuse: encode/decode dev_t properly

The fuse layer passes through "encoded" dev_t values (probably for
compatibility reasons or something).  I copied the encode/decode methods
from the kernel and encode/decode the st_rdev values where appropriate
(where struct stat is exposed directly or via the fuse_entry_param
struct).

Fixes: #1031
Signed-off-by: Sage Weil <sage@newdream.net>
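
For readers unfamiliar with the "encoded" dev_t form, the sketch below shows the bit layout used by the Linux kernel's new_encode_dev()/new_decode_dev() helpers. It is an assumption that these are the methods the commit copied; the function and struct names here are hypothetical, and the code works on explicit major/minor numbers instead of a platform dev_t.

```cpp
#include <cassert>
#include <cstdint>

struct DevNumbers {
  unsigned major_num;
  unsigned minor_num;
};

// Pack as: low 8 bits of minor, then 12 bits of major, then the
// remaining high bits of minor shifted above the major field.
uint32_t encode_dev(unsigned major_num, unsigned minor_num) {
  return (minor_num & 0xffu) | (major_num << 8) | ((minor_num & ~0xffu) << 12);
}

// Inverse of encode_dev for 12-bit majors and 20-bit minors.
DevNumbers decode_dev(uint32_t dev) {
  unsigned major_num = (dev & 0xfff00u) >> 8;
  unsigned minor_num = (dev & 0xffu) | ((dev >> 12) & 0xfff00u);
  return {major_num, minor_num};
}
```

A round trip through encode_dev and decode_dev returns the original pair as long as the major fits in 12 bits and the minor in 20 bits.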
14 years agoobsync: implement user translation (--xuser)
Colin Patrick McCabe [Mon, 2 May 2011 21:48:27 +0000 (14:48 -0700)]
obsync: implement user translation (--xuser)

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agorgw: fix ACL XML generation
Colin Patrick McCabe [Mon, 2 May 2011 22:57:06 +0000 (15:57 -0700)]
rgw: fix ACL XML generation

Put AccessControlPolicy in the http://s3.amazonaws.com/doc/2006-03-01/
namespace.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoobsync: refactor LocalCopy
Colin Patrick McCabe [Mon, 2 May 2011 21:45:43 +0000 (14:45 -0700)]
obsync: refactor LocalCopy

Combine LocalCopy, S3StoreLocalCopy, and RadosStoreLocalCopy into one
class called LocalCopy.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoFileStore: use proper object names for linking
Greg Farnum [Mon, 2 May 2011 20:47:23 +0000 (13:47 -0700)]
FileStore: use proper object names for linking

They were backward before, which broke EVERYTHING.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoMDS: fix handle_client_rename use of path_traverse.
Greg Farnum [Wed, 27 Apr 2011 23:22:33 +0000 (16:22 -0700)]
MDS: fix handle_client_rename use of path_traverse.

It was using the MDS_TRAVERSE_DISCOVERXLOCK flag, which allows
path_traverse to return success if it encounters a NULL dentry. When
we're looking for a source inode, though, that doesn't work out! We
want MDS_TRAVERSE_DISCOVER, which will go away and look for the dentry
on other inodes but requires a linked dentry, not a NULL one.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agomds: trim non-auth swallowed subtrees during resolve
Sage Weil [Sat, 30 Apr 2011 00:30:45 +0000 (17:30 -0700)]
mds: trim non-auth swallowed subtrees during resolve

Consider:
 - peer auth for /foo
 - ambiguous import /foo/bar
 - peer claims /foo, swallows /foo/bar.
 - disambiguate_imports sees we didn't get /foo/bar, cancels ambiguous
   import.
 -> we are left with /foo/bar (and content) in cache, even tho it is
   non-auth.

Fix by pulling the try_trim_non_auth_subtree() back out of
cancel_ambiguous_import, and trimming the containing subtree in the
disambiguate (resolve completion) case.  (For the journal replay case the
subtree structure is deterministic and no such check is needed.)

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix replay of EFragment rollback
Sage Weil [Sat, 30 Apr 2011 00:23:58 +0000 (17:23 -0700)]
mds: fix replay of EFragment rollback

Remove from the uncommitted list.

Also, make the uncommitted list update unconditional: we need to do it even
if the inode wasn't already in our cache.

Also, journal the rollback with the same signedness as the prepare, so that
the descriptor/map key matches up.  Adjust signs accordingly.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agorgw: log bytes received
Yehuda Sadeh [Sat, 30 Apr 2011 00:17:23 +0000 (17:17 -0700)]
rgw: log bytes received

14 years agorgw: fix some logging problems
Yehuda Sadeh [Fri, 29 Apr 2011 23:14:22 +0000 (16:14 -0700)]
rgw: fix some logging problems

14 years agorgw_admin: dump also user email
Yehuda Sadeh [Fri, 29 Apr 2011 23:08:04 +0000 (16:08 -0700)]
rgw_admin: dump also user email

14 years agotest/ceph_crypto: Check that the shutdown/fork/init trick works for NSS.
Tommi Virtanen [Fri, 29 Apr 2011 21:48:30 +0000 (14:48 -0700)]
test/ceph_crypto: Check that the shutdown/fork/init trick works for NSS.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
14 years agofilestore: fiemap should close the fd
Yehuda Sadeh [Fri, 29 Apr 2011 21:08:57 +0000 (14:08 -0700)]
filestore: fiemap should close the fd

14 years agofilestore: fiemap should close the fd
Yehuda Sadeh [Fri, 29 Apr 2011 21:08:57 +0000 (14:08 -0700)]
filestore: fiemap should close the fd

14 years agocommon, cfuse: Hook into daemonization and shutdown/init NSS.
Tommi Virtanen [Fri, 29 Apr 2011 18:43:17 +0000 (11:43 -0700)]
common, cfuse: Hook into daemonization and shutdown/init NSS.

NSS cannot tolerate forks without this.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
14 years agomsgr, common: Refactor to extract daemonization out of messenger.
Tommi Virtanen [Fri, 29 Apr 2011 18:29:42 +0000 (11:29 -0700)]
msgr, common: Refactor to extract daemonization out of messenger.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
14 years agomsgr: Rename SimpleMessenger::start(daemonize, nonce) to start_with_nonce.
Tommi Virtanen [Fri, 29 Apr 2011 18:17:53 +0000 (11:17 -0700)]
msgr: Rename SimpleMessenger::start(daemonize, nonce) to start_with_nonce.

Otherwise, once we remove daemonize from the prototype,
all the existing ->start(false) calls will be taken
to mean nonce=0.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
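
The hazard behind this rename is C++'s implicit bool-to-integer conversion. The sketch below uses a hypothetical FakeMessenger, not SimpleMessenger, and assumes the nonce is a 64-bit integer; it shows how a leftover start(false) call would still compile and silently pass 0 as the nonce if the one-argument form kept the old name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the messenger; not Ceph's real class.
struct FakeMessenger {
  uint64_t nonce_used = 42;

  // What start() would look like if `daemonize` were simply dropped
  // from the prototype without a rename.
  void start(uint64_t nonce) { nonce_used = nonce; }

  // A distinct name means old start(false) callers cannot land here.
  void start_with_nonce(uint64_t nonce) { nonce_used = nonce; }
};

// Simulates a legacy call site that was never updated.
uint64_t legacy_call_result() {
  FakeMessenger m;
  m.start(false);  // compiles: false converts to nonce 0
  return m.nonce_used;
}
```

Renaming first, then removing the parameter, turns any missed call site into a visible change in meaning at review time instead of a silent nonce of 0.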
14 years agoceph_crypto: Assert that NSS initialization works.
Tommi Virtanen [Fri, 29 Apr 2011 17:07:10 +0000 (10:07 -0700)]
ceph_crypto: Assert that NSS initialization works.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
14 years agocommon_init: create common_init_daemonize
Colin Patrick McCabe [Fri, 29 Apr 2011 18:17:48 +0000 (11:17 -0700)]
common_init: create common_init_daemonize

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoconfig: Update sample config with more examples
Wido den Hollander [Fri, 29 Apr 2011 17:39:04 +0000 (10:39 -0700)]
config: Update sample config with more examples

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Signed-off-by: Wido den Hollander <wido@widodh.nl>
14 years agocommon_init: set log_file, not log_dir, by default
Colin Patrick McCabe [Fri, 29 Apr 2011 17:23:10 +0000 (10:23 -0700)]
common_init: set log_file, not log_dir, by default

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agocommon_init: don't modify log_per_instance
Colin Patrick McCabe [Fri, 29 Apr 2011 17:21:08 +0000 (10:21 -0700)]
common_init: don't modify log_per_instance

check it in DoutStreambuf instead.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomsgr: remove dup .start() call check, remove cruft
Sage Weil [Fri, 29 Apr 2011 16:49:05 +0000 (09:49 -0700)]
msgr: remove dup .start() call check, remove cruft

There is now no ordering constraint wrt the daemonize bits; those can
safely be pulled out.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agohadoop: cleanups for libceph type update
Jim Schutt [Fri, 29 Apr 2011 15:13:59 +0000 (09:13 -0600)]
hadoop: cleanups for libceph type update

Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agolfn: put lfn outside of user.ceph namespace
Sage Weil [Thu, 28 Apr 2011 23:01:23 +0000 (16:01 -0700)]
lfn: put lfn outside of user.ceph namespace

This completely hides the lfn from the ObjectStore interface users.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoMerge remote branch 'origin/master' into lfn
Sage Weil [Thu, 28 Apr 2011 22:45:19 +0000 (15:45 -0700)]
Merge remote branch 'origin/master' into lfn

14 years agomdsmap: show mds name in summary
Sage Weil [Thu, 28 Apr 2011 22:55:20 +0000 (15:55 -0700)]
mdsmap: show mds name in summary

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agohadoop: update libceph types
Sage Weil [Thu, 28 Apr 2011 22:52:07 +0000 (15:52 -0700)]
hadoop: update libceph types

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agohypertable: update libceph types
Sage Weil [Thu, 28 Apr 2011 22:52:01 +0000 (15:52 -0700)]
hypertable: update libceph types

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agolibceph: error out if USE_FILE_OFFSET64 not defined
Sage Weil [Thu, 28 Apr 2011 22:49:37 +0000 (15:49 -0700)]
libceph: error out if USE_FILE_OFFSET64 not defined

Otherwise struct dirent will not match user code and badness on readdir
will ensue.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agolfn: don't return ENOENT if it's not lfn in some cases
Yehuda Sadeh [Thu, 28 Apr 2011 22:44:28 +0000 (15:44 -0700)]
lfn: don't return ENOENT if it's not lfn in some cases

14 years agomds: ignore fragment_notify when dft state doesn't match
Sage Weil [Thu, 28 Apr 2011 22:17:18 +0000 (15:17 -0700)]
mds: ignore fragment_notify when dft state doesn't match

In particular, if there is a resolve in there somewhere, we may have found
out about this refragment from the src because they send resolve messages
to all nodes (to resolve ambiguous migrations).  If that's the case we
can ignore the message.

Fixes crash like

2011-04-28 14:30:31.179106 7fe72e325710 -- 10.0.1.252:6805/22158 <== mds2 10.0.1.252:6803/25635 548 ==== fragment_notify(300000000b4#00* 1) v1 ==== 17+0+0 (2192211443 0 0) 0x2c9ec00 con 0x323d140
2011-04-28 14:30:31.179116 7fe72e325710 mds1.cache handle_fragment_notify fragment_notify(300000000b4#00* 1) v1 from mds2
2011-04-28 14:30:31.179149 7fe72e325710 mds1.cache adjust_dir_fragments 00* 1 on [inode 300000000b4 [...2,head] /syn.4114.0/dir.0/dir.0/dir.0/dir.0/dir.0/dir.5/dir.5/ auth{0=1,2=1} fragtree_t(*^2 00*^1) v188 ap=1 f(v0 m2011-04-28 14:23:59.074510 7=7+0) n(v2 rc2011-04-28 14:23:59.074510 15=14+1) (idft mix->lock g=0,2 dirty) (inest mix dirty) (ifile excl dirty) (ixattr excl) (iversion lock) caps={4114=pAsLsXsxFsx/-@1},l=4114(-1) | dirtyscattered dirfrag caps replicated dirty authpin 0x32bec70]
2011-04-28 14:30:31.179182 7fe72e325710 mds1.cache adjust_dir_fragments 00* bits 1 srcfrags 0x3080860,0x378da50 on [inode 300000000b4 [...2,head] /syn.4114.0/dir.0/dir.0/dir.0/dir.0/dir.0/dir.5/dir.5/ auth{0=1,2=1} fragtree_t(*^2 00*^1) v188 ap=1 f(v0 m2011-04-28 14:23:59.074510 7=7+0) n(v2 rc2011-04-28 14:23:59.074510 15=14+1) (idft mix->lock g=0,2 dirty) (inest mix dirty) (ifile excl dirty) (ixattr excl) (iversion lock) caps={4114=pAsLsXsxFsx/-@1},l=4114(-1) | dirtyscattered dirfrag caps replicated dirty authpin 0x32bec70]
2011-04-28 14:30:31.179218 7fe72e325710 mds1.cache  new fragtree is fragtree_t(*^2 00*^1)
mds/MDCache.cc: In function 'void MDCache::adjust_dir_fragments(CInode*, std::list<CDir*, std::allocator<CDir*> >&, frag_t, int, std::list<CDir*, std::allocator<CDir*> >&, std::list<Context*, std::allocator<Context*> >&, bool)', in thread '0x7fe72e325710'
mds/MDCache.cc: 9254: FAILED assert(srcfrags.size() == 1)
 ceph version 0.27-165-gaf908f8 (commit:af908f82924a67be3aeb2767eaa05ba04c145f42)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x53) [0xa5775e]
 2: (MDCache::adjust_dir_fragments(CInode*, std::list<CDir*, std::allocator<CDir*> >&, frag_t, int, std::list<CDir*, std::allocator<CDir*> >&, std::list<Context*, std::allocator<Context*> >&, bool)+0x2dd) [0x888bbf]
 3: (MDCache::adjust_dir_fragments(CInode*, frag_t, int, std::list<CDir*, std::allocator<CDir*> >&, std::list<Context*, std::allocator<Context*> >&, bool)+0x13d) [0x88817d]
 4: (MDCache::handle_fragment_notify(MMDSFragmentNotify*)+0x199) [0x88bac5]
 5: (MDCache::dispatch(Message*)+0x124) [0x8765ea]
 6: (MDS::handle_deferrable_message(Message*)+0x1f5) [0x77a607]
 7: (MDS::_dispatch(Message*)+0x784) [0x77ba90]

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: do not send fragment_notify to <= recovering nodes
Sage Weil [Thu, 28 Apr 2011 22:02:58 +0000 (15:02 -0700)]
mds: do not send fragment_notify to <= recovering nodes

They will get sorted out during rejoin.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix uninit warning on cur
Sage Weil [Thu, 28 Apr 2011 21:53:35 +0000 (14:53 -0700)]
mds: fix uninit warning on cur

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: handle import cancel while logging EImportStart
Sage Weil [Thu, 28 Apr 2011 21:17:49 +0000 (14:17 -0700)]
mds: handle import cancel while logging EImportStart

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoclient: do not send request to mds -1
Sage Weil [Thu, 28 Apr 2011 21:09:52 +0000 (14:09 -0700)]
client: do not send request to mds -1

If we can't find a target, or the chosen target isn't active, wait.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agolfn: set hash and file name constants
Yehuda Sadeh [Thu, 28 Apr 2011 20:57:45 +0000 (13:57 -0700)]
lfn: set hash and file name constants

14 years agoosd: remove warning about max object name length
Yehuda Sadeh [Thu, 28 Apr 2011 20:51:48 +0000 (13:51 -0700)]
osd: remove warning about max object name length

14 years agomds: try_trim_non_auth_subtree on any canceled import (including resolve)
Sage Weil [Thu, 28 Apr 2011 20:44:55 +0000 (13:44 -0700)]
mds: try_trim_non_auth_subtree on any canceled import (including resolve)

We were trimming on journal replay of an import failure, but not on a
canceled ambiguous import during resolve.  Fix that by moving the call into
the helper (and passing a CDir* instead of a dirfrag_t).

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: make trim_non_auth paths complete filepaths (not dnames)
Sage Weil [Thu, 28 Apr 2011 20:34:34 +0000 (13:34 -0700)]
mds: make trim_non_auth paths complete filepaths (not dnames)

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix steal_dentry dir_auth_pins adjustment
Sage Weil [Thu, 28 Apr 2011 20:22:30 +0000 (13:22 -0700)]
mds: fix steal_dentry dir_auth_pins adjustment

Pass down the correct value for dir_auth_pins (dh->auth_pins plus the
inode's auth_pins, but nothing nested beneath the inode).  The CDentry
doesn't track dir auth pins independently, and doesn't really need to.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomon: use tcmalloc
Sage Weil [Thu, 28 Apr 2011 20:08:34 +0000 (13:08 -0700)]
mon: use tcmalloc

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix export_prep trace format
Sage Weil [Thu, 28 Apr 2011 20:00:44 +0000 (13:00 -0700)]
mds: fix export_prep trace format

The prep message includes a spanning tree in the interior of the subtree
that includes all parent inodes of bounding dirfrags.  That used to look
like
df dentry inode (dir dentry inode)*

The code to generate those traces was stopping if the df->ino had already
been included.  The problem was that we may have done that inode on a
different dirfrag.

Change this to be

df ('-' | ('f' dir | 'd') dentry inode (dir dentry inode)*)

so that we can start with a dentry (already had the dirfrag, same check
as before) or a dirfrag (already had the inode, the new case), or a '-'
(nothing at all).  A single byte is used to indicate which it is and how
to start decoding.

Signed-off-by: Sage Weil <sage@newdream.net>
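
The single start byte in the new trace grammar tells the decoder which items come first. The sketch below illustrates that dispatch only; the function name and the use of strings for item kinds are hypothetical, and the real decoder reads encoded Ceph structures from a bufferlist.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Given the one-byte start tag, return the item kinds that lead the
// trace before the repeating (dir dentry inode)* tail.
std::vector<std::string> leading_items(char tag) {
  switch (tag) {
  case '-':
    return {};                          // nothing at all follows
  case 'd':
    return {"dentry", "inode"};         // dirfrag was already sent
  case 'f':
    return {"dir", "dentry", "inode"};  // inode was already sent; dirfrag is new
  default:
    throw std::invalid_argument("unknown trace start tag");
  }
}
```

The 'f' case is the one the old format could not express: the inode had been included via a different dirfrag, so the trace has to begin with the dirfrag itself.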
14 years agomon: make 'ceph osd (down,out,in) N' take multiple osd numbers
Sage Weil [Thu, 28 Apr 2011 19:42:23 +0000 (12:42 -0700)]
mon: make 'ceph osd (down,out,in) N' take multiple osd numbers

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agolibceph: no _t types
Sage Weil [Thu, 28 Apr 2011 19:34:11 +0000 (12:34 -0700)]
libceph: no _t types

Signed-off-by: Sage Weil <sage@newdream.net>