Sage Weil [Wed, 31 Aug 2011 22:43:41 +0000 (15:43 -0700)]
osd: flush previous operations to fs before collection list + destroy
We need to flush any prior ops to the fs before we can rely on
collection_list to return all the objects we need to delete. If we miss
any, we will crash shortly after this when the rmdir(2) fails with
-ENOTEMPTY (as with #1471).
Samuel Just [Wed, 31 Aug 2011 21:10:02 +0000 (14:10 -0700)]
OSD: Fix encoding versions affected by hobject switch
PG log did not previously store the object locator. To get the hash for
the hobject, scan the collection for the object during read_log if we
encounter an old-style log entry.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Tue, 30 Aug 2011 22:34:18 +0000 (15:34 -0700)]
cosd,OSD: Improve filestore upgrade path
Previously, fsconverter was required to update an osd filestore to the
most recent version. cosd will now handle that automatically on
startup. cosd --convert-filestore will also update the FileStore
to the most recent version.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Tue, 30 Aug 2011 20:10:42 +0000 (13:10 -0700)]
client: drop mostly-useless relink()
Just use unlink() and then link(). Carry an inode ref to avoid badness.
The relink() is left over from a simpler time when we didn't do proper
refcounting.
Sage Weil [Tue, 30 Aug 2011 14:09:06 +0000 (07:09 -0700)]
client: fix readdir result merge
When merging readdir results into the cache, we want to remove any names
_preceding_ the current item before updating it. Then, at the end, we
clean up the trailing items.
This fixes a cfuse crash on workunits/snaps/snaptest-2.sh.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
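The merge order described above can be sketched as follows. This is a minimal illustration, not the client code: it assumes the cache is a plain dict keyed by name, that the readdir results arrive sorted by name and cover the whole directory, and the function name merge_readdir is hypothetical.

```python
def merge_readdir(cache, results):
    """Merge sorted readdir results into `cache` (name -> inode number).

    Returns the list of stale names that were dropped from the cache.
    Assumes `results` is sorted by name and lists the whole directory.
    """
    cached_names = sorted(cache)
    pos = 0
    removed = []
    for name, ino in results:
        # First drop cached names that precede the current item:
        # readdir did not return them, so they no longer exist.
        while pos < len(cached_names) and cached_names[pos] < name:
            stale = cached_names[pos]
            del cache[stale]
            removed.append(stale)
            pos += 1
        # Step past the cached copy of the current item, if any.
        if pos < len(cached_names) and cached_names[pos] == name:
            pos += 1
        # Only now insert or update the current item.
        cache[name] = ino
    # At the end, clean up the trailing items.
    for stale in cached_names[pos:]:
        del cache[stale]
        removed.append(stale)
    return removed
```

Doing the removal before the update means the cursor never has to step back over an entry that was just inserted.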
Samuel Just [Tue, 23 Aug 2011 15:53:26 +0000 (08:53 -0700)]
FileStore: On mount, scan collections for unstable state
CollectionIndex implementations may perform compound operations
that leave invalid state if interrupted. index->cleanup() gives
the implementation an opportunity to clean up any in-progress
operation. For HashIndex, split and merge fall into this
category.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Wed, 17 Aug 2011 23:23:00 +0000 (16:23 -0700)]
FileStore: Add filestore version stamp
A filestore will now be tagged with a version stamp during
mkfs. If on mount the version stamp detected lags the current
version, the mount will fail unless filestore_update_collections
is set in gconf. If it is set, opening a collection will cause
the version stamp on the collection to be read and the
appropriate indexing implementation to be used. This will allow
for conversion from old collection indexing schemes to new
ones.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
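The mount-time check described above might look roughly like the sketch below. Everything here is an assumption made for illustration: the version numbers, the exception type, the function names, and the index labels are hypothetical and are not taken from the FileStore code.

```python
CURRENT_VERSION = 2  # hypothetical "current" filestore version


class StaleFilestoreError(Exception):
    """Raised when the on-disk stamp lags and updating is not enabled."""


def check_mount(on_disk_version, update_collections,
                current_version=CURRENT_VERSION):
    # Fail the mount if the stamp lags the current version, unless the
    # operator opted in (filestore_update_collections in the commit text).
    if on_disk_version < current_version and not update_collections:
        raise StaleFilestoreError(
            "version stamp %d lags current version %d"
            % (on_disk_version, current_version))
    return on_disk_version


def index_for(collection_version, current_version=CURRENT_VERSION):
    # Pick an indexing implementation from the per-collection stamp.
    # The labels are placeholders, not real class names.
    if collection_version < current_version:
        return "legacy-index"
    return "current-index"
```

Reading the stamp per collection is what lets old and new indexing schemes coexist while a conversion is in progress.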
Samuel Just [Mon, 11 Jul 2011 20:22:48 +0000 (13:22 -0700)]
osd/: fix hobject_t construction
sobject_t requires only an object_t and a snapid_t. hobject_t also
requires the hash which should be used for the object. In most cases,
the osd must fill this in using the op message. In cases where the hash
used does not matter (as in the metadata collection), the explicit
hobject_t(const sobject_t &) constructor supplies a hash.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Tommi Virtanen [Fri, 19 Aug 2011 23:43:21 +0000 (16:43 -0700)]
First draft of the documentation overhaul.
To build the docs, run ./admin/build-doc. To browse them, either put
them on any static website, or just run ./admin/serve-doc to serve
them quickly on port 8080.
build-doc sets up a virtualenv to avoid needing Sphinx installed
system-wide. serve-doc needs thttpd installed.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Sage Weil [Mon, 29 Aug 2011 22:02:37 +0000 (15:02 -0700)]
heartbeatmap: fix reset_timeout with mixed-use threads
If a ThreadPool is used by multiple WorkQueues and only some of them
set a suicide timeout, we need to clear the suicide timeout whenever
one is not set, so that a stale deadline does not linger.
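A minimal sketch of the behaviour this fix is after, assuming a per-thread handle with two deadlines where 0 means "unset". The class, field, and parameter names are hypothetical and only loosely modelled on the commit text.

```python
class HeartbeatHandle:
    """Per-thread heartbeat state; a deadline of 0 means 'not set'."""

    def __init__(self):
        self.timeout = 0
        self.suicide_timeout = 0


def reset_timeout(handle, now, grace, suicide_grace=0):
    handle.timeout = now + grace
    if suicide_grace:
        handle.suicide_timeout = now + suicide_grace
    else:
        # The point of the fix: a work queue that sets no suicide
        # timeout must clear the one left by a previous work queue
        # that ran on the same thread.
        handle.suicide_timeout = 0
```

Without the else branch, a thread that last served a queue with a suicide timeout would carry that deadline into work for a queue that never asked for one.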
Sage Weil [Mon, 29 Aug 2011 18:54:21 +0000 (11:54 -0700)]
osd: set suicide timeouts on some workqueues
OpWQ: timeout * 10
RecoveryWQ: this does no io; if it stalls we're probably stuck in an
infinite loop. timeout * 10.
ScrubFinalizeWQ: this is cpu only. we're probably stuck in a loop, or
swapping. timeout * 10.
Sage Weil [Mon, 29 Aug 2011 18:41:24 +0000 (11:41 -0700)]
mon: health not ok when up < in osds
We were warning if any osds were either not up or not in. Instead, warn
only if there are osds that are in but not up. That means that if a node
fails and the cluster successfully marks it out and re-replicates onto the
remaining nodes, the ceph cluster is healthy again.
Presumably the fact that the nodes failed should raise other alerts,
because those specific daemons/nodes are not healthy.
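The new rule can be sketched as below. This is an illustration only: the dict layout, the function name, and the "OK"/"WARN" labels are assumptions, not the monitor's actual data structures or output.

```python
def osd_health(osds):
    """osds maps osd id -> {"up": bool, "in": bool}.

    Warn only about osds that are in but not up; an osd that is both
    down and out has already been handled by the cluster.
    """
    in_but_down = [osd_id for osd_id, state in sorted(osds.items())
                   if state["in"] and not state["up"]]
    if in_but_down:
        return ("WARN", in_but_down)
    return ("OK", [])
```

For example, a cluster with one osd up and in and another down and out reports "OK", while any osd that is still in but down yields "WARN" along with its id.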