Sage Weil [Tue, 30 Aug 2011 14:09:06 +0000 (07:09 -0700)]
client: fix readdir result merge
When merging readdir results into the cache, we want to remove any names
_preceding_ the current item before updating it. Then, at the end, we
clean up the trailing items.
This fixes a cfuse crash on workunits/snaps/snaptest-2.sh.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
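The merge order described above can be sketched roughly as follows. This is not the actual client code: DirCache, DirEntry and merge_readdir are hypothetical names, the cache is modeled as a sorted std::map from name to inode number, and the readdir result is assumed to be sorted and to cover the whole directory.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the client's dentry cache and a readdir reply.
using DirCache = std::map<std::string, unsigned long>;
using DirEntry = std::pair<std::string, unsigned long>;

// Merge a sorted readdir result into the sorted cache.
void merge_readdir(DirCache& cache, const std::vector<DirEntry>& result) {
  DirCache::iterator pd = cache.begin();
  for (const DirEntry& e : result) {
    // First remove cached names that sort before the current item;
    // the server no longer reports them.
    while (pd != cache.end() && pd->first < e.first)
      pd = cache.erase(pd);
    if (pd != cache.end() && pd->first == e.first) {
      pd->second = e.second;  // update the existing entry in place
      ++pd;
    } else {
      // Insert just before pd; map insertion leaves pd valid.
      cache.emplace_hint(pd, e.first, e.second);
    }
  }
  // At the end, clean up the trailing items.
  cache.erase(pd, cache.end());
}
```

Erasing through the iterator and continuing from the value erase() returns keeps the cursor valid throughout, which is the property the crash fix depends on.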
Samuel Just [Tue, 23 Aug 2011 15:53:26 +0000 (08:53 -0700)]
FileStore: On mount, scan collections for unstable state
CollectionIndex implementations may perform compound operations
that leave invalid state if interrupted. index->cleanup() gives
the implementation an opportunity to clean up any in-progress
operation. For HashIndex, split and merge fall in this
category.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Wed, 17 Aug 2011 23:23:00 +0000 (16:23 -0700)]
FileStore: Add filestore version stamp
A filestore will now be tagged with a version stamp during
mkfs. If on mount the version stamp detected lags the current
version, the mount will fail unless filestore_update_collections
is set in gconf. If it is set, opening a collection will cause
the version stamp on the collection to be read and the
appropriate indexing implementation to be used. This will allow
for conversion from old collection indexing schemes to new
ones.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
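A minimal sketch of the mount-time decision described above, under stated assumptions: MountDecision and check_version_stamp are hypothetical names, the stamp is modeled as a plain integer, and a stamp that does not lag the current version is assumed to mount normally (the commit message only covers the lagging case).

```cpp
#include <cstdint>

// Hypothetical outcome of comparing the on-disk stamp to the current version.
enum class MountDecision {
  Proceed,            // stamp is current; mount normally
  ProceedAndConvert,  // stamp lags; collections are handled per their own stamp
  Refuse              // stamp lags and updating was not allowed
};

// update_collections models the filestore_update_collections setting.
MountDecision check_version_stamp(std::uint32_t on_disk,
                                  std::uint32_t current,
                                  bool update_collections) {
  if (on_disk >= current)  // assumption: only a lagging stamp is a problem
    return MountDecision::Proceed;
  return update_collections ? MountDecision::ProceedAndConvert
                            : MountDecision::Refuse;
}
```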
Samuel Just [Mon, 11 Jul 2011 20:22:48 +0000 (13:22 -0700)]
osd/: fix hobject_t construction
sobject_t requires only an object_t and a snapid_t. hobject_t also
requires the hash which should be used for the object. In most cases,
the osd must fill this in using the op message. In cases where the hash
used does not matter (as in the metadata collection), the explicit
hobject_t(const sobject_t &) constructor supplies a hash.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Tommi Virtanen [Fri, 19 Aug 2011 23:43:21 +0000 (16:43 -0700)]
First draft of the documentation overhaul.
To build the docs, run ./admin/build-doc. To browse them, either get
them on any static website, or just run ./admin/serve-doc to serve
them quickly off of port 8080.
build-doc sets up a virtualenv to avoid needing Sphinx installed
system-wide. serve-doc needs thttpd installed.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Sage Weil [Mon, 29 Aug 2011 22:02:37 +0000 (15:02 -0700)]
heartbeatmap: fix reset_timeout with mixed-use threads
If you have a ThreadPool used by multiple WorkQueues, and only some of
them set a suicide timeout, we need to clear it when a suicide timeout is
not set.
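The fix can be illustrated with a small sketch. HeartbeatHandle and this reset_timeout signature are illustrative assumptions, not the real heartbeatmap interface; times are plain seconds and 0 means "no deadline armed".

```cpp
#include <ctime>

// Hypothetical per-thread heartbeat handle; 0 means no deadline is armed.
struct HeartbeatHandle {
  std::time_t timeout = 0;
  std::time_t suicide_timeout = 0;
};

// Re-arm the handle before a thread starts work for a queue.
// A suicide_grace of 0 means the queue does not use a suicide timeout.
void reset_timeout(HeartbeatHandle& h, std::time_t now,
                   std::time_t grace, std::time_t suicide_grace) {
  h.timeout = now + grace;
  // Always overwrite the suicide deadline, so a value left behind by a
  // previous queue on the same thread is cleared rather than inherited.
  h.suicide_timeout = suicide_grace ? now + suicide_grace : 0;
}
```

Without the explicit reset to 0, a thread that last served a queue with a suicide timeout would carry that deadline into work for a queue that never asked for one.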
Sage Weil [Mon, 29 Aug 2011 18:54:21 +0000 (11:54 -0700)]
osd: set suicide timeouts on some workqueues
OpWQ: timeout * 10
RecoveryWQ: this does no io; if it stalls we're probably stuck in an
infinite loop. timeout * 10.
ScrubFinalizeWQ: this is cpu only. we're probably stuck in a loop, or
swapping. timeout * 10.
Sage Weil [Mon, 29 Aug 2011 18:41:24 +0000 (11:41 -0700)]
mon: health not ok when up < in osds
We were warning if any osds were not up or not in. Instead, warn if
there are any osds that are in but not up. That means if a node fails
and successfully marks the node out and retracts onto remaining nodes, the
ceph cluster is healthy again.
Presumably the fact that the nodes failed should raise other alerts,
because those specific daemons/nodes are not healthy.
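As a rough sketch of the new check (OsdState, count_in_but_down and health_ok are hypothetical names, and the real monitor works from the osdmap rather than a plain vector):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary of one osd's state.
struct OsdState {
  bool up;  // daemon is running and reachable
  bool in;  // osd is assigned data
};

// Count osds that are marked in but are not up.
std::size_t count_in_but_down(const std::vector<OsdState>& osds) {
  std::size_t n = 0;
  for (const OsdState& o : osds)
    if (o.in && !o.up)
      ++n;
  return n;
}

// Healthy (with respect to this check) when every in osd is also up.
bool health_ok(const std::vector<OsdState>& osds) {
  return count_in_but_down(osds) == 0;
}
```

An osd that is both down and out no longer affects the result, which matches the case of a failed node that has already been marked out.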
Sage Weil [Fri, 26 Aug 2011 16:47:03 +0000 (09:47 -0700)]
objectcacher: only wait for commit
There was some old, weird stuff going on here where we would wait for the
ACK and COMMIT separately. This is just wrong. Writeback does not
complete until the data is committed on disk.
Simplify by waiting only for commit, removing all the 'ack' code, and
going back to a single callback (flush_set).
Greg Farnum [Thu, 25 Aug 2011 18:28:42 +0000 (11:28 -0700)]
mds: server: should apply new layout settings on top of old layout
This way, the MDS can handle updates of some values without needing
the user to specify the entire layout (ie, they can just switch pools).
This brings the behavior more in line with setting the dir layout.
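The idea of applying new settings on top of the old layout can be sketched like this. Layout, LayoutUpdate and apply_update are hypothetical, and the field names are assumptions chosen for illustration rather than the MDS's actual layout structure.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical file layout; field names are illustrative only.
struct Layout {
  std::uint32_t stripe_unit;
  std::uint32_t stripe_count;
  std::uint32_t object_size;
  std::int64_t pool;
};

// A partial update: only the fields the user specified are set.
struct LayoutUpdate {
  std::optional<std::uint32_t> stripe_unit;
  std::optional<std::uint32_t> stripe_count;
  std::optional<std::uint32_t> object_size;
  std::optional<std::int64_t> pool;
};

// Start from the old layout and override only the specified fields.
Layout apply_update(Layout layout, const LayoutUpdate& u) {
  if (u.stripe_unit)  layout.stripe_unit = *u.stripe_unit;
  if (u.stripe_count) layout.stripe_count = *u.stripe_count;
  if (u.object_size)  layout.object_size = *u.object_size;
  if (u.pool)         layout.pool = *u.pool;
  return layout;
}
```

With this shape, switching pools means setting only the pool field of the update and leaving the rest empty.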
- we would dirty some buffers on an object
- bump dirty_tx count
- flush()
- this adds the Object to ObjectSet::uncommitted
- truncate
- client clears FILE_BUFFER cap_ref
- Object::purge()
- clear dirty_tx count
- client puts last inode
- Object::uncommitted is not empty in ~ObjectSet
(This was triggered after several runs of workunits/suites/blogbench.sh
on sepia.)
It turns out the uncommitted xlist<> is pretty useless, though: the same
information is captured in the dirty_tx counter. We add a separate
counter to the Object itself (for the benefit of Object::can_close()).
We also clean up Object::purge() to call truncate(0), a small
simplification.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 24 Aug 2011 23:51:15 +0000 (16:51 -0700)]
client: be careful about replacing dentries during readdir assimilation
When we assimilate readdir results into our cache, we need to be more
careful about replacing existing dentries. We were calling
insert_dentry_inode(), which would replace a name if it already exists,
which might include pd->first, an active iterator.
Move the dentry link/relink into the caller (where we already have an
iterator pointing to the existing item, if any). Then update the dentry
lease information separately.
Fixes: #1391
Signed-off-by: Sage Weil <sage@newdream.net>