Sage Weil [Wed, 7 Apr 2010 21:12:43 +0000 (14:12 -0700)]
msgr: fail on sd < 0
I saw a case where poll(2) was blocking despite being passed an fd of -1.
Since that's clearly invalid, we can fail tcp_{read,write} before that
point.
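
For context, POSIX specifies that poll(2) ignores any pollfd entry whose fd is
negative instead of reporting an error, so a call with fd -1 simply waits out
the timeout. A minimal sketch of the early check follows; wait_readable is a
hypothetical helper written for illustration, not the messenger's actual
tcp_read or tcp_write.

```cpp
#include <poll.h>
#include <cerrno>

// Hypothetical helper (not the real tcp_read/tcp_write).
// Fails immediately on an invalid descriptor; otherwise returns the
// result of poll(2) waiting for sd to become readable.
int wait_readable(int sd, int timeout_ms) {
  if (sd < 0) {
    // poll(2) would ignore this entry and just block until the timeout,
    // so report the bad descriptor ourselves.
    errno = EBADF;
    return -1;
  }
  struct pollfd pfd;
  pfd.fd = sd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  return poll(&pfd, 1, timeout_ms);
}
```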
Sage Weil [Mon, 15 Mar 2010 17:52:50 +0000 (10:52 -0700)]
osd: distinguish between per-pool snap pools and user-managed snap pools
If removed_snaps is non-empty, then snaps are managed by the
user: snap context is specified for all writes (e.g., MDS or
librados user using the snap context api).
We can enforce this by adding an (unused) snapid (1) to the
removed_snaps the first time a non-pool snap snapid is allocated.
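
A rough sketch of that rule, assuming a simplified pool state: PoolSnapState,
user_managed and allocate_user_snapid are hypothetical names, and a std::set
stands in for whatever container the OSD really uses for removed_snaps.

```cpp
#include <cstdint>
#include <set>

// Hypothetical stand-in for the pool's snapshot state; names are
// illustrative, not the real OSD types.
struct PoolSnapState {
  std::set<uint64_t> removed_snaps;  // empty => per-pool snaps
  uint64_t snap_seq = 0;

  // A non-empty removed_snaps marks the pool as user-managed.
  bool user_managed() const { return !removed_snaps.empty(); }

  // Allocate a snapid for a user-managed (non-pool) snapshot.
  uint64_t allocate_user_snapid() {
    if (removed_snaps.empty()) {
      // First non-pool allocation: burn the unused snapid 1 so that
      // removed_snaps is non-empty from here on.
      removed_snaps.insert(1);
      if (snap_seq < 1)
        snap_seq = 1;
    }
    return ++snap_seq;
  }
};
```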
Sage Weil [Mon, 5 Apr 2010 22:38:45 +0000 (15:38 -0700)]
mds: journal oldest client tid
Journal the client's safe tid with new requests. This keeps the client
completed_requests list trimmed, so that we don't build up a ginormous
list of all requests over the entire journal.
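
The trimming this enables might look roughly like the sketch below.
ClientSession, add_completed and trim_completed are hypothetical names, and
the real session structures differ; the point is only that every completed
tid below the client's oldest tid can be dropped.

```cpp
#include <cstdint>
#include <set>

// Hypothetical per-client session state on the MDS.
struct ClientSession {
  std::set<uint64_t> completed_requests;  // tids already completed

  void add_completed(uint64_t tid) { completed_requests.insert(tid); }

  // Drop every completed tid older than the oldest tid the client
  // still cares about; tids >= oldest_client_tid are kept.
  void trim_completed(uint64_t oldest_client_tid) {
    completed_requests.erase(
        completed_requests.begin(),
        completed_requests.lower_bound(oldest_client_tid));
  }
};
```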
Sage Weil [Thu, 1 Apr 2010 20:46:26 +0000 (13:46 -0700)]
msgr: set OPEN state after accepting connection
Not doing so can eventually lead to:
msg/SimpleMessenger.cc: In function 'int SimpleMessenger::Pipe::accept()':
msg/SimpleMessenger.cc:765: FAILED assert(existing->state == STATE_CONNECTING)
Sage Weil [Thu, 1 Apr 2010 20:03:04 +0000 (13:03 -0700)]
mds: remove dir from 'new' list on any commit, not just on clean
A dir may be redirtied after the commit, such that it never becomes clean.
It only needs to stay on the 'new' list until it's been written to disk
at least once, though.
Sage Weil [Thu, 1 Apr 2010 14:34:24 +0000 (07:34 -0700)]
mds: remove dentry AND inode when dropping snap metadata; add helper
We should only drop obsolete snapped metadata when it is unreferenced, and
at that point we need to drop the dentry AND inode, not just the dentry.
Among other things, this delays the drop until caps are released.
Sage Weil [Wed, 31 Mar 2010 04:27:14 +0000 (21:27 -0700)]
mds: don't adjust subtree map in rename_prepare
Not sure what the reasoning behind this was.
This code is from pre-git history, and the subversion->git conversion
managed to make git-blame pretty unusable. I doubt I really documented
its purpose at that point anyway.
Sage Weil [Tue, 30 Mar 2010 17:30:02 +0000 (10:30 -0700)]
mds: fix MDSTableClient ack double journaling
Do not journal ack unless the tid is registered in the LogSegment. Once
we journal it, we remove it from the LogSegment list, and once it's
journaled, we remove the pending_commit[tid] entry.
This fixes a bug where the mds got two acks, journaled both of them, and
crashed in the completion for the second because pending_commit[tid] was
gone. The second ack should have been ignored.
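
The guard can be sketched as below. This is a simplified model, not the real
MDSTableClient: TableClient, register_commit, handle_ack and ack_journaled are
hypothetical names, and journaling is reduced to a counter.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Simplified stand-in for a log segment's list of tids awaiting an ack.
struct LogSegment {
  std::set<uint64_t> pending_commit_tids;
};

// Hypothetical model of the table client's ack handling.
struct TableClient {
  std::map<uint64_t, LogSegment*> pending_commit;
  int journaled_acks = 0;  // stands in for submitting a journal entry

  void register_commit(uint64_t tid, LogSegment* ls) {
    pending_commit[tid] = ls;
    ls->pending_commit_tids.insert(tid);
  }

  // Returns true if the ack was journaled, false if it was ignored.
  bool handle_ack(uint64_t tid) {
    auto it = pending_commit.find(tid);
    if (it == pending_commit.end())
      return false;  // unknown or already finished
    // Journal only if the tid is still registered in the LogSegment;
    // erasing it here makes any duplicate ack fall through.
    if (it->second->pending_commit_tids.erase(tid) == 0)
      return false;
    ++journaled_acks;
    return true;
  }

  // Completion for the journaled ack.
  void ack_journaled(uint64_t tid) { pending_commit.erase(tid); }
};
```

In this model a second ack for the same tid is dropped whether it arrives
before or after the completion for the first one runs.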
Sage Weil [Mon, 29 Mar 2010 23:26:54 +0000 (16:26 -0700)]
mds: start file recovery after sending rejoin ack
The rejoin ack initializes replica lock states correctly; we can't send any
lock messages before that. This fixes both the check max size call (which
sends lock messages taking the wrlock) and the file_recover() call
(which does the same).
Instead, we make two lists, files to recover and those to fix up. The lock
states for both are set to PRE_SCAN (LOCK on replica). After the rejoin
acks go out, we either check_inode_max_size or file_recover.
If file_recover someday grows another caller, this may need something a bit
more sophisticated.
Sage Weil [Fri, 26 Mar 2010 23:01:35 +0000 (16:01 -0700)]
mds: migrate frag/nest scatterlock info on bounding frags during export
This ensures that the auth inode continues to maintain accurate scatterlock
info about open frags. We include info on export if it is a bounding frag.
On import, we only take it if we are !auth. This mirrors the scatterlock
scatter/gather logic in CInode::{encode,decode}_lock_state.
Sage Weil [Thu, 25 Mar 2010 17:58:21 +0000 (10:58 -0700)]
mds: drop obsolete hack for base inodes
We used to skip base inodes for scatter_writebehind. But we can
journal these just like anything else, and skipping them potentially
breaks try_to_expire if a base inode's lock is dirty, because the
completion queued on WAIT_STABLE by scatter_nudge never gets
completed.
Greg Farnum [Tue, 23 Mar 2010 20:38:02 +0000 (13:38 -0700)]
objecter: add change_pool_auid function.
I'm reluctant to stick this in the objecter since it doesn't quite fit, but
it's a pool management function and putting it here makes it easy to use
elsewhere while maintaining librados' standard function flow.