git.apps.os.sepia.ceph.com Git - ceph.git/log
Sage Weil [Thu, 2 Jul 2009 21:15:20 +0000 (14:15 -0700)]
uclient: fix op replay

Clear msg payload so it gets reencoded with adjusted values.

Greg Farnum [Wed, 1 Jul 2009 22:59:53 +0000 (15:59 -0700)]
uclient: Cleaned up resend_unsafe_requests; handle_client_reply; debugging.

Sage Weil [Thu, 2 Jul 2009 17:21:51 +0000 (10:21 -0700)]
mds: drop loner on gather before doing waiters

Otherwise we would reissue/use caps on loner in non-loner states, and go
back.

Sage Weil [Thu, 2 Jul 2009 00:04:11 +0000 (17:04 -0700)]
uclient: fix order of session cap removal

Remove caps and kick requests before erasing session.

Sage Weil [Thu, 2 Jul 2009 00:03:23 +0000 (17:03 -0700)]
mds: refcount MDRequest so that timed out client sessions behave

We need to ref count MDRequest so that it will remain valid over the
(possibly extended) lifetime of C_RetryRequest.

Give ownership of {client,slave}_request to MDRequest so that cleanup
is consistent.
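
A minimal sketch of the reference-counting idea described above, not the actual MDS code. `Request` and `RetryContext` are hypothetical stand-ins for MDRequest and C_RetryRequest, and the `freed` flag exists only to make the lifetime observable.

```cpp
#include <cassert>

// Hypothetical stand-in for a request object; not the real MDRequest.
struct Request {
  int nref = 1;           // the creator holds the first reference
  bool* freed = nullptr;  // illustration only: set to true on destruction

  Request* get() { ++nref; return this; }
  void put() {
    assert(nref > 0);
    if (--nref == 0) delete this;  // last reference frees the request
  }
  ~Request() { if (freed) *freed = true; }
};

// Hypothetical retry callback: takes its own reference for as long as it lives,
// so the request stays valid even if the session drops its reference first.
class RetryContext {
  Request* req;
public:
  explicit RetryContext(Request* r) : req(r->get()) {}
  RetryContext(const RetryContext&) = delete;
  RetryContext& operator=(const RetryContext&) = delete;
  ~RetryContext() { req->put(); }
  int refs() const { return req->nref; }
};
```

With this shape, a timed-out session can call `put()` while a retry context is still queued, and the request is destroyed only when the context goes away.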

Sage Weil [Wed, 1 Jul 2009 23:46:36 +0000 (16:46 -0700)]
uclient: use session->caps list as an lru

Touch caps on use, via bool caps_issued_mask(mask)
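
A rough sketch of "touch on use" for an LRU-ordered list, assuming a simplified `Cap` record and a `Session` that owns the list. Only the name `caps_issued_mask` comes from the commit; the fields and the per-session placement of the method are illustrative assumptions.

```cpp
#include <cassert>
#include <list>

// Hypothetical cap record; field names are illustrative only.
struct Cap {
  int id;
  unsigned issued;  // bitmask of issued capabilities
};

struct Session {
  std::list<Cap> caps;  // front = least recently used, back = most recently used

  // Returns true if a single cap covers every bit in `mask`.
  // The matching cap is "touched": moved to the back of the list.
  bool caps_issued_mask(unsigned mask) {
    assert(mask != 0);
    for (auto it = caps.begin(); it != caps.end(); ++it) {
      if ((it->issued & mask) == mask) {
        caps.splice(caps.end(), caps, it);  // O(1) move, iterators stay valid
        return true;
      }
    }
    return false;
  }
};
```

Trimming old caps then amounts to popping from the front of the list.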

Sage Weil [Wed, 1 Jul 2009 20:50:54 +0000 (13:50 -0700)]
osd: change log terminology: bottom..top -> tail..head

Hopefully less confusing.

Sage Weil [Wed, 1 Jul 2009 22:43:46 +0000 (15:43 -0700)]
uclient: fix kickback, reply handler logic

The kickback cond test used to look at the request map, but since the
reply handler removes that, it was never true.  Instead, clear the
dispatch_cond pointer.

Also fix up the reply handler logic.  Any reply implies unsafe.  If it
is the first, signal the calling thread.

Sage Weil [Wed, 1 Jul 2009 22:25:27 +0000 (15:25 -0700)]
uclient: fix MMDSGetMap 'have' epoch

This broke with the monitor changes last week.

Sage Weil [Wed, 1 Jul 2009 22:21:33 +0000 (15:21 -0700)]
Merge commit 'gregskinny/unstable' into unstable

Conflicts:

src/client/Client.cc

Greg Farnum [Wed, 1 Jul 2009 21:46:12 +0000 (14:46 -0700)]
DON'T USE, BROKEN. uclient's MetaRequest extra ref counts removed.

Greg Farnum [Wed, 1 Jul 2009 19:02:42 +0000 (12:02 -0700)]
BROKEN, DON'T USE.

uclient changes that somehow broke message delivery without changing anything in that process

Sage Weil [Wed, 1 Jul 2009 18:43:10 +0000 (11:43 -0700)]
mon: better warning with injectargs on non-up mds

Sage Weil [Wed, 1 Jul 2009 18:41:38 +0000 (11:41 -0700)]
osd: make write mode per-PG

We can't do it per-object because the access mode determines the order
we append to the log, and that has to be sequential.  It has to be per-PG,
unless a whole ton of other stuff is reworked.

This lets us capture the best access mode at least on a per-pool basis,
instead of imposing a global default.

Sage Weil [Wed, 1 Jul 2009 18:39:59 +0000 (11:39 -0700)]
uclient: support mds recall_state

Trim old caps.  Still need to make it an LRU.

Sage Weil [Wed, 1 Jul 2009 18:20:37 +0000 (11:20 -0700)]
uclient: remove debug print

Greg Farnum [Wed, 1 Jul 2009 15:10:12 +0000 (08:10 -0700)]
uclient: DOESN'T WORK, but more ref counting stuff. Now attempts to resend unsafe ops on a reconnect.

Greg Farnum [Tue, 30 Jun 2009 17:01:45 +0000 (10:01 -0700)]
uclient: MetaRequests go on the heap and are ref-counted; safe/unsafe replies dealt with better.

Sage Weil [Tue, 30 Jun 2009 20:30:58 +0000 (13:30 -0700)]
kclient: checkpatch fixes

Sage Weil [Tue, 30 Jun 2009 20:26:30 +0000 (13:26 -0700)]
kclient: use list_for_each_entry macro when possible

Sage Weil [Tue, 30 Jun 2009 21:20:13 +0000 (14:20 -0700)]
osd: fix hb down check

Sage Weil [Tue, 30 Jun 2009 20:10:49 +0000 (13:10 -0700)]
osd: fix failure report on already-down osd

We were potentially sending an osd failure on an osd that was already
down.  Double check before doing so.

Sage Weil [Tue, 30 Jun 2009 20:10:07 +0000 (13:10 -0700)]
osd: fix log msg after var name changes

Sage Weil [Fri, 26 Jun 2009 23:50:00 +0000 (16:50 -0700)]
mds: don't choke on path_traverse_to_dir that fully exists

Sage Weil [Fri, 26 Jun 2009 23:47:56 +0000 (16:47 -0700)]
osd: nicer scrub ok message

Sage Weil [Fri, 26 Jun 2009 23:47:44 +0000 (16:47 -0700)]
osd: rearrange make_writeable prints

Sage Weil [Fri, 26 Jun 2009 23:45:03 +0000 (16:45 -0700)]
osd: fix pg log trimming

We were zeroing out too much of the pg log!

Sage Weil [Tue, 30 Jun 2009 19:16:54 +0000 (12:16 -0700)]
initscripts: fix do_root_cmd

sudo bash -c "echo foo" works, sudo "echo foo" does not.

Sage Weil [Fri, 26 Jun 2009 22:32:29 +0000 (15:32 -0700)]
msgs: clean up v in message prints

Greg Farnum [Fri, 26 Jun 2009 23:28:32 +0000 (16:28 -0700)]
uclient: Kick requests and renew caps on stale.

Sage Weil [Fri, 26 Jun 2009 22:25:28 +0000 (15:25 -0700)]
uclient: fix bad merge

Sage Weil [Fri, 26 Jun 2009 22:08:31 +0000 (15:08 -0700)]
monc: debug option

Sage Weil [Fri, 26 Jun 2009 21:26:53 +0000 (14:26 -0700)]
osd: switch to MonClient, fix cmds and ceph

Sage Weil [Fri, 26 Jun 2009 21:12:56 +0000 (14:12 -0700)]
mds, objecter, ceph: use MonClient

Sage Weil [Fri, 26 Jun 2009 20:51:49 +0000 (13:51 -0700)]
monc: create send_mon_message helper

Sage Weil [Fri, 26 Jun 2009 20:49:12 +0000 (13:49 -0700)]
monclient: refactor MonMap into MonClient

Sage Weil [Fri, 26 Jun 2009 19:56:18 +0000 (12:56 -0700)]
uclient todo

Greg Farnum [Fri, 26 Jun 2009 22:05:13 +0000 (15:05 -0700)]
uclient: Now handles STALE state nicely.

Sage Weil [Fri, 26 Jun 2009 18:40:35 +0000 (11:40 -0700)]
uclient: fix cap reconnect

Pass cap_id to the mds.  Otherwise all subsequent cap ops will
fail due to the mismatch.

Sage Weil [Fri, 26 Jun 2009 17:13:07 +0000 (10:13 -0700)]
buffer: throw exceptions instead of always asserting.

We still assert when the user is doing something wrong.  We throw
exceptions for failed memory allocs, and for buffer overruns.  I
think that'll align with usage... esp the encoding/decoding.

Sage Weil [Fri, 26 Jun 2009 16:44:37 +0000 (09:44 -0700)]
assert: throw FailedAssertion exception instead of inducing segfault

This will allow callers to catch failed assertions, if they so
choose.
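
A minimal sketch of an assertion that throws instead of crashing, so that a caller can catch it. Only the name `FailedAssertion` comes from the commit subject; its definition here, the `assert_fail` helper, the `SKETCH_ASSERT` macro, and `try_decode` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical definition; the real type is not shown in this log.
struct FailedAssertion : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Builds a message in the "FAILED assert(...)" style and throws.
inline void assert_fail(const char* expr, const char* file, int line) {
  throw FailedAssertion(std::string(file) + ":" + std::to_string(line) +
                        ": FAILED assert(" + expr + ")");
}

// Illustrative macro; deliberately not named `assert`.
#define SKETCH_ASSERT(expr) \
  ((expr) ? (void)0 : assert_fail(#expr, __FILE__, __LINE__))

// Example caller that chooses to catch the failure and report an error.
inline bool try_decode(int len) {
  try {
    SKETCH_ASSERT(len >= 0);
    return true;
  } catch (const FailedAssertion&) {
    return false;
  }
}
```

Callers that do not catch the exception still terminate, so the default behavior stays fatal.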

Sage Weil [Fri, 26 Jun 2009 04:26:15 +0000 (21:26 -0700)]
cosd: valgrind off

Sage Weil [Fri, 26 Jun 2009 00:01:20 +0000 (17:01 -0700)]
auth: use string instead of const char* for maps

Sage Weil [Fri, 26 Jun 2009 00:00:10 +0000 (17:00 -0700)]
mds: attach requests to session; cleanup on session close

This is a partial solution.. we also need mdr refcounting
so that any contexts with an mdr* won't do bad things.

Sage Weil [Fri, 26 Jun 2009 04:24:02 +0000 (21:24 -0700)]
osd: fix _scrub head_exists test

Sage Weil [Fri, 26 Jun 2009 04:09:35 +0000 (21:09 -0700)]
uclient: fix test condition

If the dentries item exists, dn must be non-null.

Sage Weil [Fri, 26 Jun 2009 00:49:59 +0000 (17:49 -0700)]
uclient: verify dentries belong to current session

Check against session cap_gen

Greg Farnum [Fri, 26 Jun 2009 00:32:23 +0000 (17:32 -0700)]
Client: put guards around some dentries[foo] accesses without checking for existence.

Sage Weil [Thu, 25 Jun 2009 22:42:37 +0000 (15:42 -0700)]
kclient: typo

Sage Weil [Thu, 25 Jun 2009 22:36:58 +0000 (15:36 -0700)]
paxos: allow wait on newer version

Sage Weil [Thu, 25 Jun 2009 22:36:23 +0000 (15:36 -0700)]
kclient: set have_version in MOSDGetMap

Greg Farnum [Thu, 25 Jun 2009 22:03:17 +0000 (15:03 -0700)]
No more VERSION_T; just 0.

Sage Weil [Thu, 25 Jun 2009 21:58:46 +0000 (14:58 -0700)]
mon: remove old asserts conflicting with new readable semantics

We may now call update_from_paxos while updating OR active.

Greg Farnum [Thu, 25 Jun 2009 21:51:11 +0000 (14:51 -0700)]
messages: Clean up of PaxosServiceMessages, and some fixes for their users.

Greg Farnum [Thu, 25 Jun 2009 19:05:15 +0000 (12:05 -0700)]
messages/MClass[Ack]: Roll back some unification.

version_t last and PaxosServiceMessage::version shouldn't
be the same in these messages. Remove that and add a new
constructor that does set the version (but it's unneeded).

Greg Farnum [Thu, 25 Jun 2009 19:03:56 +0000 (12:03 -0700)]
mon/objecter: The monitors and Objecter now use the version in messages.

Sage Weil [Thu, 25 Jun 2009 21:03:44 +0000 (14:03 -0700)]
initscripts: do mount/mkfs as root, otherwise as any user

We want cosd to run unprivileged if possible.

Sage Weil [Thu, 25 Jun 2009 20:01:16 +0000 (13:01 -0700)]
osd: update primary's notion of peer last_update on activate

We are pushing the peer the log to bring it up to date, so
update our peer_info[peer].last_update to match.  Otherwise,
we get confused if we get, say, stray content and peer() is
called later, and we have out of date peer stats.

Sage Weil [Thu, 25 Jun 2009 19:45:35 +0000 (12:45 -0700)]
osd: force RMW ordering globally

We can't mix RMW and DELAYED in the same PG without screwing
up the ordering of writes, the pg log, and so forth.

So force RMW throughout.  This won't affect the mds log
appends because the client is constant.  It will slow down
concurrent writes to the same object by multiple clients, but
we don't have many (any?) of those yet.

This needs a real solution... :/

Sage Weil [Thu, 25 Jun 2009 19:44:04 +0000 (12:44 -0700)]
osd: fix TMAPUP bug

Trailing bit was put in the wrong place.

Sage Weil [Thu, 25 Jun 2009 17:32:40 +0000 (10:32 -0700)]
osd: fix tmapup

Various problems with decoding and applying an update.

Sage Weil [Thu, 25 Jun 2009 17:22:33 +0000 (10:22 -0700)]
initscripts: allow 'user' option, defaults to current user

Sage Weil [Thu, 25 Jun 2009 17:21:51 +0000 (10:21 -0700)]
mds: fix CDir decoding

Sage Weil [Thu, 25 Jun 2009 04:22:44 +0000 (21:22 -0700)]
mds: rev format (for TMAP changes)

Sage Weil [Thu, 25 Jun 2009 04:16:28 +0000 (21:16 -0700)]
osd: fix head_existed check

ssc isn't always defined, as we pass here for !may_read() too.

Sage Weil [Thu, 25 Jun 2009 04:00:58 +0000 (21:00 -0700)]
Merge branch 'mdsmap' into unstable

Conflicts:

src/mds/CDir.cc

Sage Weil [Thu, 25 Jun 2009 03:57:48 +0000 (20:57 -0700)]
todo

Sage Weil [Thu, 25 Jun 2009 03:48:39 +0000 (20:48 -0700)]
osd: print lost objects

We still need to figure out how to continue...

Sage Weil [Thu, 25 Jun 2009 03:48:21 +0000 (20:48 -0700)]
osd: rebuild missing OI_ATTR from log entry when possible

Sage Weil [Wed, 24 Jun 2009 21:09:57 +0000 (14:09 -0700)]
osd: fix proc_replica_log stop condition

This fixes condition from 4b5572a.

osd/PG.cc: In function 'void PG::activate(ObjectStore::Transaction&, std::map<int, MOSDPGInfo*, std::less<int>, std::allocator<std::pair<const int, MOSDPGInfo*> > >*)':
osd/PG.cc:1401: FAILED assert(log.backlog)
 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a838b]
 2: ./cosd(_ZN2PG8activateERN11ObjectStore11TransactionEPSt3mapIiP10MOSDPGInfoSt4lessIiESaISt4pairIKiS5_EEE+0xbe8) [0x71ff88]
 3: ./cosd(_ZN2PG4peerERN11ObjectStore11TransactionERSt3mapIiS3_I4pg_tNS_5QueryESt4lessIS4_ESaISt4pairIKS4_S5_EEES6_IiESaIS8_IKiSC_EEEPS3_IiP10MOSDPGInfoSD_SaIS8_ISE_SK_EEE+0xfa0) [0x722852]
 4: ./cosd(_ZN3OSD16_process_pg_infoEjiRN2PG4InfoERNS0_3LogERNS0_7MissingEPSt3mapIiP10MOSDPGInfoSt4lessIiESaISt4pairIKiS9_EEERi+0x712) [0x6a7000]
 5: ./cosd(_ZN3OSD13handle_pg_logEP9MOSDPGLog+0x126) [0x6a7768]
 6: ./cosd(_ZN3OSD9_dispatchEP7Message+0x34a) [0x6abe6c]
 7: ./cosd(_ZN3OSD13dispatch_implEP7Message+0x408) [0x6ac9c4]
 8: ./cosd(_ZN10Dispatcher8dispatchEP7Message+0x63) [0x61a4af]
 9: ./cosd(_ZN9Messenger8dispatchEP7Message+0x56) [0x6298f8]
 10: ./cosd(_ZN15SimpleMessenger8Endpoint14dispatch_entryEv+0x5ae) [0x62340a]
 11: ./cosd(_ZN15SimpleMessenger8Endpoint14DispatchThread5entryEv+0x19) [0x62fc9d]
 12: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629e48]
 13: /lib/libpthread.so.0 [0x7fb482c933f7]

Sage Weil [Thu, 25 Jun 2009 03:49:41 +0000 (20:49 -0700)]
todo

Sage Weil [Thu, 25 Jun 2009 03:49:24 +0000 (20:49 -0700)]
osd: rev ondisk format, protocols

For monitor message changes AND osd snapset changes.

Sage Weil [Thu, 25 Jun 2009 03:43:01 +0000 (20:43 -0700)]
osd: store snapset in _snapdir object if head dne

If the _head doesn't logically exist, we can't keep it around just for
the SnapSet or else an 'ls' will have to stat in order to tell if the
head object logically exists and should be included.  That's no good,
so:

- put snapset in SS_ATTR on head if it exists
- otherwise, put it in SS_ATTR on a _snapdir object

Sage Weil [Thu, 25 Jun 2009 03:13:45 +0000 (20:13 -0700)]
osd: zero out pg_pool_t in constructor

Most things were getting initialized, but not snap_seq.

Sage Weil [Thu, 25 Jun 2009 03:13:08 +0000 (20:13 -0700)]
mon: set snap epoch for poolsnap removal, too

Sage Weil [Thu, 25 Jun 2009 02:58:19 +0000 (19:58 -0700)]
osd: fix MOSDBoot, MOSDGetMap initialization

Sage Weil [Wed, 24 Jun 2009 20:15:14 +0000 (13:15 -0700)]
mon: cleanup

Sage Weil [Wed, 24 Jun 2009 20:14:46 +0000 (13:14 -0700)]
kclient: update with new monitor message formats

Sage Weil [Wed, 24 Jun 2009 20:03:31 +0000 (13:03 -0700)]
mon: change MMDSMap to send map we have, not map we want.

Sage Weil [Wed, 24 Jun 2009 19:55:00 +0000 (12:55 -0700)]
osd: make object delete not remove _head if there are clones

Truncate and rmattrs instead, so we can keep the SnapSet.

Still need to make 'ls' work properly.

Sage Weil [Wed, 24 Jun 2009 19:54:20 +0000 (12:54 -0700)]
filestore: rmattrs command

Delete all object attrs

Greg Farnum [Wed, 24 Jun 2009 20:06:33 +0000 (13:06 -0700)]
messages: Added PaxosServiceMessage to repository so previous commits work.

Greg Farnum [Wed, 24 Jun 2009 18:42:24 +0000 (11:42 -0700)]
Monitor/Message: All messages used by Paxos are now PaxosServiceMessages.

Greg Farnum [Tue, 23 Jun 2009 21:03:34 +0000 (14:03 -0700)]
mon/msg: PThey mostly hold version_t's now. Unused, though.

Sage Weil [Wed, 24 Jun 2009 18:17:55 +0000 (11:17 -0700)]
osd: adjust recovery op accounting; explicitly track set of recovering objects

Use a single {start,finish}_recovery_op() func to start and stop
recovery ops, so that there is a single point for counter adjustments
to occur.  On reset, simply call into OSD multiple times.

Also maintain a set<sobject_t> in each PG and on the OSD to track
the set of objects that are recovering.  This can hopefully be
compiled out once all the bugs are identified.

We are chasing this:

osd/OSD.cc:3465: FAILED assert(recovery_ops_active >= 0)
 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a769b]
 2: ./cosd(_ZN3OSD18finish_recovery_opEP2PGib+0x148) [0x696bce]
 3: ./cosd(_ZN12ReplicatedPG18finish_recovery_opEv+0x77) [0x6359c5]
 4: ./cosd(_ZN12ReplicatedPG17sub_op_push_replyEP14MOSDSubOpReply+0x540) [0x63628a]
 5: ./cosd(_ZN12ReplicatedPG15do_sub_op_replyEP14MOSDSubOpReply+0x64) [0x6407fe]
 6: ./cosd(_ZN3OSD10dequeue_opEP2PG+0x224) [0x6996ee]
 7: ./cosd(_ZN3OSD4OpWQ8_processEP2PG+0x21) [0x70d175]
 8: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6c9f78]
 9: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a825c]
 10: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70cb9f]
 11: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629d48]
 12: /lib/libpthread.so.0 [0x7f2f1e3f33f7]
 13: /lib/libc.so.6(clone+0x6d) [0x7f2f1d9c294d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
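
The accounting scheme described in this commit can be sketched roughly as follows. This is a simplified illustration, not the OSD code: `RecoveryTracker` is a hypothetical class, and `std::string` stands in for `sobject_t`.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical tracker with one entry point each for starting and finishing
// a recovery op, so the counter is adjusted in exactly one place.
class RecoveryTracker {
  int ops_active = 0;
  std::set<std::string> recovering;  // objects currently being recovered
public:
  void start_recovery_op(const std::string& oid) {
    bool inserted = recovering.insert(oid).second;
    assert(inserted);  // starting the same object twice is a bug
    (void)inserted;
    ++ops_active;
  }
  void finish_recovery_op(const std::string& oid) {
    auto erased = recovering.erase(oid);
    assert(erased == 1);  // finishing an object that never started is a bug
    (void)erased;
    --ops_active;
    assert(ops_active >= 0);
  }
  // On reset, finish every outstanding op through the same entry point.
  void reset() {
    while (!recovering.empty()) {
      std::string oid = *recovering.begin();  // copy before erasing
      finish_recovery_op(oid);
    }
  }
  int active() const { return ops_active; }
};
```

Because the set and the counter change together, a mismatch trips an assert at the call that caused it rather than later in an unrelated reply handler.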

Sage Weil [Wed, 24 Jun 2009 18:12:03 +0000 (11:12 -0700)]
osd: abort generate_backlog if already canceled

Bail out of generate_backlog if we've been canceled.  Fixes

osd/OSD.cc: In function 'void OSD::generate_backlog(PG*)':
osd/OSD.cc:3305: FAILED assert(!pg->is_active())
 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a833b]
 2: ./cosd(_ZN3OSD16generate_backlogEP2PG+0xb6) [0x69a1a6]
 3: ./cosd(_ZN3OSD9BacklogWQ8_processEP2PG+0x21) [0x70d92b]
 4: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6ca5f8]
 5: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a8efc]
 6: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70d331]
 7: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629e48]
 8: /lib/libpthread.so.0 [0x7f0a8feed3f7]
 9: /lib/libc.so.6(clone+0x6d) [0x7f0a8f4bc94d]

Sage Weil [Wed, 24 Jun 2009 05:06:24 +0000 (22:06 -0700)]
osd: fix merge_log when log and olog share bottom

If log has 6'10 and olog has 7'10, on same object, merge_log
was failing to throw out log's 6'10 entry because the
last_kept iterator was still end().  Use a simple eversion_t
instead, and simplify existing (and otherwise correct)
log.bottom logic, but without the last_kept != end() guard
that threw us off.

09.06.23 16:52:56.032981 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log log(469'11020,476'11021] from osd0 into log(469'11020,469'11021]
09.06.23 16:52:56.033001 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log extending top to 476'11021
09.06.23 16:52:56.033033 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering]   ? 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949
09.06.23 16:52:56.033057 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949
09.06.23 16:52:56.033090 1145465168 osd4 485 pg[1.cd( v 476'11021/469'11021 (469'11020,476'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering m=1 l=1] merge_log result log(469'11020,476'11021] missing(1) changed=1
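
A simplified reconstruction of the idea in this commit, not the actual PG::merge_log. It assumes both logs are sorted, share the same bottom (`tail` below), and that the other log is authoritative. The `eversion_t` here is a stand-in holding only epoch and version. Tracking the last kept entry as a plain value means there is no end() sentinel to mishandle when the logs agree on nothing.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Simplified stand-in: written epoch'version, ordered by epoch, then version.
struct eversion_t {
  unsigned epoch;
  unsigned version;
};
inline bool operator<(const eversion_t& a, const eversion_t& b) {
  return std::tie(a.epoch, a.version) < std::tie(b.epoch, b.version);
}
inline bool operator==(const eversion_t& a, const eversion_t& b) {
  return a.epoch == b.epoch && a.version == b.version;
}

// Merge the authoritative `olog` into `log`; both start just after `tail`.
inline void merge_log(std::vector<eversion_t>& log,
                      const std::vector<eversion_t>& olog,
                      eversion_t tail) {
  assert(std::is_sorted(log.begin(), log.end()));
  assert(std::is_sorted(olog.begin(), olog.end()));

  // Newest entry both logs agree on; starts at the shared bottom.
  eversion_t last_kept = tail;
  for (const eversion_t& e : log) {
    if (!std::binary_search(olog.begin(), olog.end(), e)) break;
    last_kept = e;
  }

  // Throw out local entries newer than the agreed point...
  while (!log.empty() && last_kept < log.back()) log.pop_back();
  // ...then take the authoritative entries that follow it.
  for (const eversion_t& e : olog)
    if (last_kept < e) log.push_back(e);
}
```

With the values from the log output above, a local log holding only 469'11021 merged with an authoritative 476'11021 ends up holding only 476'11021.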

Sage Weil [Wed, 24 Jun 2009 05:03:04 +0000 (22:03 -0700)]
filestore: use readdir_r to avoid SIGBUS badness

We need to use reentrant readdir, since multiple threads
will otherwise share the struct dirent and walk all over
each other.

Sage Weil [Tue, 23 Jun 2009 22:48:37 +0000 (15:48 -0700)]
mds: fix session purge bug

mds/Server.cc: In function 'void Server::_finish_session_purge(Session*)':
mds/Server.cc:410: FAILED assert(session->is_stale_purging())
 1: ./cmds(_ZN6Server21_finish_session_purgeEP7Session+0x392) [0x49edf2]
 2: ./cmds(_ZN6Server18find_idle_sessionsEv+0xa18) [0x4a3188]
 3: ./cmds(_ZN3MDS4tickEv+0x220) [0x484f60]
 4: ./cmds(_ZN9SafeTimer12EventWrapper6finishEi+0x1c1) [0x63eb11]
 5: ./cmds(_ZN5Timer11timer_entryEv+0x6f6) [0x6412d6]
 6: ./cmds(_ZN5Timer11TimerThread5entryEv+0xd) [0x46d53d]
 7: ./cmds(_ZN6Thread11_entry_funcEPv+0xc) [0x480c9c]
 8: /lib/libpthread.so.0 [0x7f51a9f4c3f7]
 9: /lib/libc.so.6(clone+0x6d) [0x7f51a951b94d]

Sage Weil [Tue, 23 Jun 2009 21:54:08 +0000 (14:54 -0700)]
osd: allow recovery of missing objects not in log

This happens when a scrub/repair tells us to recover an item, but
it's older than log.bottom.

Sage Weil [Tue, 23 Jun 2009 04:37:12 +0000 (21:37 -0700)]
osd: avoid using null ctx pointer

Use localt instead, it's on the stack.

Sage Weil [Tue, 23 Jun 2009 04:32:09 +0000 (21:32 -0700)]
osd: stop rewinding replica log when we reach log.bottom

We stop rewinding a replica log when we reach our own
log.bottom, because we don't know enough to do so in any
meaningful way, and because we can assume it is not
divergent at that point (barring any complete screwupedness).

Also, if we do change last_update, make sure last_complete is
rewound too.

Sage Weil [Tue, 23 Jun 2009 03:29:40 +0000 (20:29 -0700)]
mds: no fatal assert on ino allocation failures

We still log them LOG_ERR.  Client will be unhappy, but
that's their problem.

Sage Weil [Tue, 23 Jun 2009 03:25:15 +0000 (20:25 -0700)]
osd: small cleanups

Sage Weil [Mon, 22 Jun 2009 23:11:21 +0000 (16:11 -0700)]
mds: don't choke on bad parallel_fetch paths

e.g., bad reconnect path from client, like /blah/file_not_dir/blah.

Sage Weil [Tue, 23 Jun 2009 03:57:44 +0000 (20:57 -0700)]
rados: cleanup

Sage Weil [Tue, 23 Jun 2009 03:18:58 +0000 (20:18 -0700)]
kclient: make r_path[12] dup strings

The mds_request lifetime differs from the caller's stack, so we need to
duplicate these strings.  Fixes problems with request reply after MDS
recovery.
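
The lifetime issue can be illustrated with a small userspace analogy. The kernel client is C, so this C++ sketch is only an analogy: `MdsRequest` and `make_request` are hypothetical, and `std::string` stands in for a duplicated, heap-owned copy of each path.

```cpp
#include <cassert>
#include <string>

// Hypothetical request type that owns copies of its paths.
struct MdsRequest {
  std::string path1;
  std::string path2;
};

// Copies both paths so the request no longer depends on the caller's buffers,
// which may live on a stack frame that is gone by the time a reply arrives.
inline MdsRequest make_request(const char* p1, const char* p2) {
  assert(p1 != nullptr);
  MdsRequest req;
  req.path1 = p1;
  req.path2 = p2 ? p2 : "";  // second path is optional
  return req;
}
```

Storing only the caller's pointers would leave the request reading freed stack memory if it is resent after the original call has returned.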

Sage Weil [Tue, 23 Jun 2009 03:18:00 +0000 (20:18 -0700)]
kclient: clean up mds_request path generation

Sage Weil [Mon, 22 Jun 2009 22:50:37 +0000 (15:50 -0700)]
todo

Sage Weil [Mon, 22 Jun 2009 22:50:16 +0000 (15:50 -0700)]
Makefile: add missing kernel/ headers