Sage Weil [Tue, 14 Apr 2009 16:20:36 +0000 (09:20 -0700)]
kclient: do readdir from dcache when possible
If we have I_COMPLETE, do a readdir straight from the dcache. When
we do a readdir prepopulate, reorder parent->d_subdirs to match the
mds order, so that the dcache readdir order matches.
Detecting the end of the directory is somewhat wonky. There is also
some inconsistency if the directory changes mid-readdir and we
have to switch back to talking to the mds.
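A rough sketch of the reordering idea, under assumptions: the cached
children are modeled as a plain array of names rather than the
kernel's parent->d_subdirs list, and the helper names move_to_tail
and reorder_to_mds_order are hypothetical. Moving each child to the
tail as the mds reports it leaves the reported entries at the end,
in mds order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Move the entry at index i to the end of names[0..n), keeping the
 * relative order of all other entries. */
static void move_to_tail(const char **names, size_t n, size_t i)
{
    const char *moved;

    assert(i < n);
    moved = names[i];
    memmove(&names[i], &names[i + 1], (n - i - 1) * sizeof(names[0]));
    names[n - 1] = moved;
}

/* Reorder the cached child names so that every name reported by the
 * mds ends up at the tail, in the order the mds reported it.
 * Names the mds did not report stay at the front. */
static void reorder_to_mds_order(const char **cached, size_t n,
                                 const char *const *mds_order, size_t m)
{
    for (size_t k = 0; k < m; k++) {
        for (size_t i = 0; i < n; i++) {
            if (strcmp(cached[i], mds_order[k]) == 0) {
                move_to_tail(cached, n, i);
                break;
            }
        }
    }
}
```

With cached children { c, a, x, b } and an mds reply listing a, b, c,
the array ends up as { x, a, b, c }, so a walk over the tail matches
the mds order.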
Sage Weil [Thu, 9 Apr 2009 22:28:11 +0000 (15:28 -0700)]
mds: include cap, dentry lease release in request messages
Embed dentry and cap releases inside mds request messages. This
avoids the overhead of sending additional messages, although it
does limit the release to the mds the request is going to. That
is normally fine, since updates go to the auth MDS, and that is
usually who we're dealing with.
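As an illustration only (this is not the actual wire format), the
sketch below appends release records to a request buffer and refuses
any release that belongs to a different mds than the one the request
targets. All struct, field, and function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical release record: which inode, which cap bits, and
 * which mds issued them. */
struct release_rec {
    uint64_t ino;
    uint32_t caps;
    uint32_t mds;
};

/* Hypothetical request buffer with room for a few embedded releases. */
struct request_buf {
    uint32_t target_mds;     /* mds this request is being sent to */
    uint16_t num_releases;   /* number of records appended so far */
    size_t len;              /* bytes used in data[] */
    unsigned char data[64];
};

/* Append a release to the request. Returns 0 on success, -1 if the
 * release is for a different mds or the buffer is full. */
static int request_add_release(struct request_buf *req,
                               const struct release_rec *rel)
{
    assert(req != NULL && rel != NULL);
    if (rel->mds != req->target_mds)
        return -1;   /* can only release to the mds we are talking to */
    if (req->len + sizeof(*rel) > sizeof(req->data))
        return -1;
    memcpy(req->data + req->len, rel, sizeof(*rel));
    req->len += sizeof(*rel);
    req->num_releases++;
    return 0;
}
```

A release rejected here would still have to go out the old way, as a
separate message to the mds that issued it.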
Sage Weil [Thu, 9 Apr 2009 21:43:04 +0000 (14:43 -0700)]
mds: drop 'careful' caps concept
This was originally needed before we were smart about projected
vs non-projected values, and about making them visible only to the
xlocking or excl client. No more.
Sage Weil [Thu, 9 Apr 2009 21:09:45 +0000 (14:09 -0700)]
mds: replace WANT cap op with DROP
While it's not always clear when we are expanding wanted (because the
client may have a skewed perception of mds_wanted), it IS always
clear when we are explicitly _dropping_ caps and don't want them
reissued to us. And that's the behavior we're trying to avoid
anyway.
Sage Weil [Thu, 9 Apr 2009 20:26:17 +0000 (13:26 -0700)]
mds: don't rdlock_try authlock for path_traverse permission check
We don't actually check anything, just ping the lock.
If the client is trusted with AUTH_EXCL, this is pointless anyway.
It is only really useful with untrusted (e.g., fuse) clients, but
they'll need some sort of special support for that later.
Sage Weil [Thu, 9 Apr 2009 20:21:34 +0000 (13:21 -0700)]
mds: issue caps in file_eval only if needed, indicated by WANTED cap op
We can't always issue_caps in file_eval or cap releases won't work.
Sometimes the client's wanted shrink is missed, though, and a wanted
expansion is sent that doesn't actually change wanted. In those
cases, we DO need to issue caps. Use a separate cap op WANTED so
the mds knows when the client is asking for caps.
Sage Weil [Wed, 8 Apr 2009 23:26:22 +0000 (16:26 -0700)]
kclient: only flush caps in write_inode if wait=1
The problem is that on delayed writeback, the vm calls write_inode and
THEN writepages, which means we still have WRBUFFER caps in use, so
sending a cap release at that point is somewhat counterproductive. If
we don't release wanted, we'll have to send another message later. And
if we do,
the mds will explicitly revoke our WRBUFFER caps so that it can
journal the max_size to 0. Yuck.
So. If wait, then do the cap flush immediately, as before. If
!wait, queue up the caps at the front of the delay queue so that
they go out the next time we check_delayed_caps. This catches the
sync() and umount cases (basically the same, really).
On fsync, queue the caps for write, but don't wait, since the mds
can recover the size/mtime anyway.
Sage Weil [Wed, 8 Apr 2009 22:17:04 +0000 (15:17 -0700)]
kclient: maintain min and max cap hold delays
Set a minimum amount of time we keep caps wanted bits even after
the file is closed. Before that time, even if we are sending a
cap message, we tell the mds we still want the caps. After the min
but before the max, we will tell the mds we no longer need them
IF we are sending a cap message for some other reason.
The caps_delay queue works as before based on the max timeout. When
that timer goes off, send a new message to release the wanted bits.
In most cases, this now releases wanted bits when we write back the
file size to the mds after writepages, which means only a single
message after we write and close a file. Yay!
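The min/max rule can be sketched as a small decision function. This
is a simplified model, not the kclient code: the struct, enum, and
function names are hypothetical, times are plain counters (no
jiffies wraparound handling), and treating an open file as "always
keep wanted" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-inode hold state. */
struct cap_hold {
    unsigned long hold_min;  /* before this, always keep wanted bits */
    unsigned long hold_max;  /* when the delayed-caps timer fires */
    bool file_open;          /* assumption: open files keep wanted */
};

enum wanted_action {
    KEEP_WANTED,      /* tell the mds we still want the caps */
    DROP_IF_SENDING,  /* drop wanted only if a cap message goes anyway */
    DROP_NOW          /* timer expired: send a message to release */
};

/* Decide what to do with the wanted bits at time 'now'. */
static enum wanted_action cap_wanted_action(const struct cap_hold *h,
                                            unsigned long now)
{
    assert(h != NULL);
    if (h->file_open || now < h->hold_min)
        return KEEP_WANTED;
    if (now < h->hold_max)
        return DROP_IF_SENDING;
    return DROP_NOW;
}
```

In the write-and-close case, the size writeback after writepages
typically lands in the DROP_IF_SENDING window, which is how a single
message can carry both the size update and the wanted release.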
Sage Weil [Wed, 8 Apr 2009 21:41:24 +0000 (14:41 -0700)]
kclient: drop FILE_RD if !wanted and max_size non-zero
If we are dropping wanted and max_size is non-zero, the mds will
want to journal the update, which will mean syncing the lock and
recalling FILE_RD anyway.
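A minimal sketch of that rule as a predicate. The bit values and
the names inode_caps and caps_to_drop are made up for illustration
and do not match the real cap encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap bits, for illustration only. */
#define CAP_FILE_RD      0x1u
#define CAP_FILE_RDCACHE 0x2u

/* Hypothetical snapshot of an inode's cap state. */
struct inode_caps {
    unsigned held;      /* caps currently issued to us */
    unsigned wanted;    /* caps we still want */
    uint64_t max_size;  /* max_size granted by the mds */
};

/* Return the cap bits to drop voluntarily. If nothing is wanted and
 * max_size is non-zero, the mds would recall FILE_RD anyway in order
 * to journal the update, so give it up front. */
static unsigned caps_to_drop(const struct inode_caps *ci)
{
    assert(ci != NULL);
    if (ci->wanted == 0 && ci->max_size != 0)
        return ci->held & CAP_FILE_RD;
    return 0;
}
```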
Sage Weil [Wed, 8 Apr 2009 21:40:46 +0000 (14:40 -0700)]
mds: adjust mds client request format to include optional releases
The goal is to release caps and/or dentry leases in the same
message as the request we are dropping them for. We will already
get caps/leases reissued with the response, in most cases.
Kill the mds replication hack while we're at it. That should be
cleaned up if/when it is reincarnated.
Sage Weil [Wed, 8 Apr 2009 18:14:54 +0000 (11:14 -0700)]
kclient: mark dentries with dir rdcache_gen, not i_version
The i_version may change if we are doing dir updates with an excl
lock on the dir, but our dcache should remain consistent over those
operations (unless we lack a trace, and fill_trace clears
I_COMPLETE in that case). Instead, we want to effectively
invalidate prior dentries when we are _newly_ issued RDCACHE and
don't know what has changed... which is exactly what rdcache_gen is
for.
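The difference between the two markers can be modeled roughly as
follows. This is a sketch with hypothetical types and helper names,
not the kclient structures: i_version moves on every directory
update, while rdcache_gen only moves when RDCACHE is newly issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical directory state. */
struct dir_state {
    uint32_t rdcache_gen;  /* bumped each time RDCACHE is newly issued */
    bool has_rdcache;      /* do we currently hold RDCACHE? */
    uint64_t i_version;    /* changes on every directory update */
};

/* Hypothetical per-dentry marker. */
struct dentry_mark {
    uint32_t dir_gen;      /* dir rdcache_gen when the dentry was set up */
};

/* RDCACHE newly issued: anything cached before this is suspect. */
static void dir_issue_rdcache(struct dir_state *dir)
{
    assert(dir != NULL);
    if (!dir->has_rdcache) {
        dir->has_rdcache = true;
        dir->rdcache_gen++;
    }
}

static void dir_revoke_rdcache(struct dir_state *dir)
{
    assert(dir != NULL);
    dir->has_rdcache = false;
}

static void dentry_mark_init(struct dentry_mark *d,
                             const struct dir_state *dir)
{
    assert(d != NULL && dir != NULL);
    d->dir_gen = dir->rdcache_gen;
}

/* A dentry is trusted only while we hold RDCACHE and it was marked
 * under the current generation. i_version is deliberately ignored. */
static bool dentry_is_current(const struct dentry_mark *d,
                              const struct dir_state *dir)
{
    assert(d != NULL && dir != NULL);
    return dir->has_rdcache && d->dir_gen == dir->rdcache_gen;
}
```

A local update under an excl lock bumps i_version but leaves the
dentry current. Losing and regaining RDCACHE bumps rdcache_gen, so
older dentries stop validating.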
Sage Weil [Wed, 8 Apr 2009 18:05:31 +0000 (11:05 -0700)]
kclient: don't clear I_COMPLETE on dentry revalidate failure
I_COMPLETE + FILE_RDCACHE means we have the full set of dentries.
If a revalidating dentry is part of the current set, then it will
successfully revalidate. The only reason revalidate would fail is
if the dentry is not part of the current set, or we don't hold
FILE_RDCACHE, and in those cases clearing I_COMPLETE is either
wrong or unnecessary. Revalidate will already fail when we lose
FILE_RDCACHE, or when a dentry is released, or when a new RDCACHE
is issued and our rdcache_gen changes.
Sage Weil [Tue, 7 Apr 2009 16:29:39 +0000 (09:29 -0700)]
kclient: remove alignment restrictions on O_DIRECT reads and writes
These aren't needed, since we aren't restricted by DMA to a hardware disk
device or any such thing. (And even if we were, it'd probably be
sector alignment, not page alignment.)