Sage Weil [Mon, 10 Nov 2008 23:53:26 +0000 (15:53 -0800)]
mds: fix replay lookup of snapshotted null dentries
Look up the replayed dentry by dn->last, not dn->first, as we do with
primary and remote dentries, because that is how we identify
dentry instances in the timeline.
Sage Weil [Mon, 10 Nov 2008 23:25:24 +0000 (15:25 -0800)]
osd: ignore logs i don't expect without freaking out
We may get a log we didn't think we requested if the prior_set gets rebuilt
or our peering is restarted for some other reason. Just ignore it, instead
of asserting.
Sage Weil [Sun, 9 Nov 2008 16:43:14 +0000 (08:43 -0800)]
client: adjust objecter locking
We want to unlock client_lock before calling into the objecter, mainly because
the callbacks rely on SafeCond, which takes a lock to signal a Cond, and that
gets awkward without mutex recursion (see _write()'s sync case).
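The SafeCond pattern referred to above can be sketched roughly as follows. This is a minimal illustration, not the actual Ceph types; the struct and member names are made up to mirror SafeCond's role of signalling a Cond under its own lock:

```cpp
#include <condition_variable>
#include <mutex>

// Sketch of a SafeCond-style completion: the callback takes its own
// lock to flag completion and signal the condition variable.  Because
// this lock is distinct from client_lock, the caller must drop
// client_lock before the objecter can fire the callback, or the two
// locks interact awkwardly (hence the unlock-before-call above).
struct SafeCondSketch {
  std::mutex lock;               // the Cond's own lock, not client_lock
  std::condition_variable cond;
  bool done = false;

  // Called by the objecter when the operation completes.
  void finish() {
    std::lock_guard<std::mutex> l(lock);
    done = true;
    cond.notify_all();
  }

  // Called by the client after dropping client_lock.
  void wait() {
    std::unique_lock<std::mutex> l(lock);
    cond.wait(l, [this] { return done; });
  }
};
```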
Sage Weil [Fri, 7 Nov 2008 21:26:41 +0000 (13:26 -0800)]
mds: match last snap exactly on replay, add_*_dentry
In general, we add new snapped dentries and THEN the new live dentry
to the metablob. That means that during replay, we see [2,2] followed
by [3,head], replacing [2,head]. The [2,2] dentry should be added
anew, without paying heed to [2,head], and then the [3,head] should
replace/update [2,head].
It was mainly just the assertions in add_*_dentry that were getting
in the way, but the lookup_exact_snap is also slightly faster.
Sage Weil [Fri, 7 Nov 2008 00:28:17 +0000 (16:28 -0800)]
mds: check dn->last when finding existing dentries during replay
We can't simply search for an existing dentry based on the name and end
snap, as that may turn up the wrong item. For example, if we have
[2,head] and the replaying operations cowed that to [2,2] and [3,head], then
if we replay the [2,2] item first we'll find [2,head] (the _wrong_ dentry)
and throw an assertion.
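The exact-snap lookup described in these two commits can be sketched with a toy map. The names here (snapid_t, SNAP_HEAD, lookup_exact_snap, the key layout) are illustrative only; the real CDir/CDentry code is more involved:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Toy model of dentry instances in a directory: the same name can
// appear several times in the timeline, so the lookup key must include
// the *end* snap (dn->last), not the start.  A large sentinel stands
// in for "head".
using snapid_t = unsigned long long;
const snapid_t SNAP_HEAD = 0xffffffffULL;

struct Dentry {
  snapid_t first, last;
};

// Keyed by (name, last): an exact-snap lookup cannot confuse [2,2]
// with [2,head], because their last snaps differ.
using DentryMap = std::map<std::pair<std::string, snapid_t>, Dentry>;

Dentry* lookup_exact_snap(DentryMap& dir, const std::string& name,
                          snapid_t last) {
  auto it = dir.find({name, last});
  return it == dir.end() ? nullptr : &it->second;
}
```

With this keying, replaying the [2,2] item first finds nothing (and adds it fresh) rather than matching the still-live [2,head] dentry.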
Sage Weil [Thu, 6 Nov 2008 23:03:49 +0000 (15:03 -0800)]
osd: improve build_prior logic
If, during some interval since the pg last went active, we may have gone
rw but none of the osds from that interval survived, then we include all
of those osds in the prior_set (even though they're down), because they
may have written data that we want.
The prior_set logic appears to have been broken: it was only looking at
the primary osd.
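The build_prior idea above can be sketched like this. This is a hedged toy model, not Ceph's actual build_prior; the Interval struct and function signature are invented for illustration:

```cpp
#include <set>
#include <vector>

// Toy sketch: for each past interval in which the pg may have gone rw,
// take the surviving (still-up) osds from that interval; if *none*
// survived, take all of them, even though they're down, since any of
// them may hold writes we need.  Note this walks the whole acting set,
// not just the primary.
struct Interval {
  std::vector<int> acting;   // osds acting during the interval
  bool maybe_went_rw;
};

std::set<int> build_prior(const std::vector<Interval>& past,
                          const std::set<int>& up_osds) {
  std::set<int> prior;
  for (const auto& in : past) {
    if (!in.maybe_went_rw)
      continue;
    bool any_up = false;
    for (int osd : in.acting)
      if (up_osds.count(osd))
        any_up = true;
    for (int osd : in.acting)
      if (!any_up || up_osds.count(osd))
        prior.insert(osd);
  }
  return prior;
}
```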
Sage Weil [Tue, 4 Nov 2008 21:49:18 +0000 (13:49 -0800)]
osd: repeer osd if prior set may be affected
Previously we only repeered if the active set changed. However, changes
in the up/down state of the prior set (or prior set candidates) or the
primary osd's up_thru can also affect the prior set and peering.
This fixes the problem where PGs get stuck in a "crashed" state without
moving to "crashed+replay": we would sit and wait for info from a peer
we thought was up but is now down, or vice versa.
Sage Weil [Tue, 4 Nov 2008 19:25:06 +0000 (11:25 -0800)]
osd: remove odd divergent log assertion
The divergent log handling is still broken in the face of backlogs, as we
can't really know if an item is really divergent or if it was deleted.
Since we can only diverge with administrator intervention, this is at least
not something we need to worry about _too_ much for now...
Sage Weil [Mon, 3 Nov 2008 23:55:14 +0000 (15:55 -0800)]
lockdep: error out on recursive locks
There is no checking between instances here; this currently just
assumes that taking two locks of the same type is bad. (In practice,
the caller could do this safely with some care.)
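A per-thread recursion check of this flavour might look like the following. This is a sketch only, not the actual lockdep code; the type ids and method names are made up:

```cpp
#include <set>

// Sketch of a per-thread lockdep check that errors out when the same
// lock *type* is acquired twice, regardless of instance.  Real lockdep
// keeps richer state (ordering, backtraces); this shows only the
// recursion rule described above.
struct HeldLocks {
  std::set<int> held;  // lock-type ids currently held by this thread

  // Returns false (an error) if this lock type is already held: two
  // locks of the same type are conservatively treated as recursion.
  bool will_lock(int type_id) {
    return held.insert(type_id).second;
  }

  void did_unlock(int type_id) {
    held.erase(type_id);
  }
};
```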