Alex Elder [Tue, 17 Apr 2012 13:33:42 +0000 (08:33 -0500)]
qa: comment out xfstest 232
Test 232 in the xfstests suite produces an XFS error in the log
when run over an RBD device. This is most likely an XFS problem
that will be tracked separately (in tracker 2302).
My original plan in getting this checked in was to have it run a
baseline set of tests--all known to pass on rbd devices--with the
intention of doing ongoing work to add back missing tests (at least
from the "auto" group) as we understand and fix whatever makes them
fail.
So just comment out test 232 so the xfstests script is able to
run to completion without error.
Alex Elder [Sat, 14 Apr 2012 16:43:15 +0000 (11:43 -0500)]
run_xfstests.sh: ensure cleanup on errors
Because we exit on any error (due to 'set -e'), the cleanup call was
never getting made in the event of an error. The net effect of that
was that a filesystem could be left mounted, and rbd cleanup then
couldn't complete because the module was in use.
Fix the trap call so it calls cleanup on exit as well as error.
Switch to using the capitalized signal names in the call.
Alex Elder [Sat, 14 Apr 2012 16:26:21 +0000 (11:26 -0500)]
run_xfstests.sh: pass test result via exit status
It turns out that xfstests *does* exit with non-zero status
when a test fails. Its exit status is the number of tests
that failed (which, now that we have over 255 tests, could be
an issue...)
Save the exit status and make it be the result of the run.
Alex Elder [Sat, 14 Apr 2012 02:26:22 +0000 (21:26 -0500)]
qa: add run_xfstests.sh script
Add a script that runs xfstests over a pair of devices that are
specified using command line arguments. The tests are run using
a specified filesystem type (xfs, ext4, or btrfs).
A default set of tests is run if none is specified on the command
line. Normally there's an "auto" group used for this purpose, but
for now I've laid out a (large) subset of them that I know pass on
rbd devices. These can be updated as we find they work reliably.
Abstract out how writeback is done with a WritebackHandler object.
For RBD caching, this will be done by librados, but the Client uses
the Objecter directly.
This also requires different locks, since librbd does not have access
to the lock the underlying Objecter uses. Thus, both the lock and the
writeback handler are parameters of the ObjectCacher constructor.
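As a rough illustration of that shape (the names and signatures below are
simplified and hypothetical, not the actual ObjectCacher/librbd code), the
cache ends up taking both pieces by reference:

    #include <cstdint>
    #include <mutex>
    #include <string>

    // Hypothetical sketch: the cache no longer owns its lock or knows how
    // writeback happens; the caller supplies both.
    struct WritebackHandler {
      virtual ~WritebackHandler() {}
      // Flush a dirty extent of an object back to the store.
      virtual void write(const std::string& oid, uint64_t off, uint64_t len,
                         const char* data) = 0;
    };

    class ObjectCacherSketch {
     public:
      ObjectCacherSketch(std::mutex& lock, WritebackHandler& wb)
        : lock_(lock), writeback_(wb) {}

      void flush(const std::string& oid, uint64_t off, uint64_t len,
                 const char* data) {
        std::lock_guard<std::mutex> l(lock_);   // caller-provided lock
        writeback_.write(oid, off, len, data);  // caller-provided writeback
      }

     private:
      std::mutex& lock_;
      WritebackHandler& writeback_;
    };

librbd would pass a handler backed by librados, while the Client would pass
one that drives the Objecter directly.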
Sage Weil [Thu, 5 Apr 2012 22:55:50 +0000 (15:55 -0700)]
librados: do aio callbacks in async thread
Call user completions in an async thread. This allows callers to call back
into librados from the callback, and allows them to take locks in their
callbacks that they hold when queuing requests (making their life much
easier).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
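A minimal sketch of the pattern, assuming a hypothetical CallbackFinisher
rather than the real librados internals: completions get queued, and a
dedicated thread invokes them with no internal locks held, so user code can
re-enter the library or take its own locks.

    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>

    // Hypothetical sketch, not the actual librados code.
    class CallbackFinisher {
     public:
      CallbackFinisher() : stop_(false), worker_([this] { run(); }) {}
      ~CallbackFinisher() {
        { std::lock_guard<std::mutex> l(m_); stop_ = true; }
        cv_.notify_one();
        worker_.join();
      }
      // Called from the dispatch path when an aio operation completes.
      void queue(std::function<void()> cb) {
        { std::lock_guard<std::mutex> l(m_); q_.push(std::move(cb)); }
        cv_.notify_one();
      }
     private:
      void run() {
        std::unique_lock<std::mutex> l(m_);
        while (true) {
          cv_.wait(l, [this] { return stop_ || !q_.empty(); });
          if (q_.empty()) return;          // stop_ set and queue drained
          auto cb = std::move(q_.front());
          q_.pop();
          l.unlock();
          cb();    // user callback runs with no internal locks held
          l.lock();
        }
      }
      std::mutex m_;
      std::condition_variable cv_;
      std::queue<std::function<void()>> q_;
      bool stop_;
      std::thread worker_;
    };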
DeterministicOpSequence: writing to object being cloned in the same tx.
We write to the object being cloned prior to the clone to ensure we are
cloning a valid range of bytes.
The write and the clone were being done in two distinct transactions,
which would trigger a diff mismatch if a failure happened to occur within
the write tx. Because the transactions were not differentiated when
building the pristine copy, we were executing one more transaction (the
clone_range one) than were executed on the failed filestore, thus
triggering the mismatch (one more object in the pristine filestore than
on the failed one).
Now we issue a single transaction containing the write() followed by
the clone_range().
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
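Roughly, the combined transaction looks like the sketch below. The
Transaction type here is a simplified stand-in for illustration, not the
real ObjectStore interface.

    #include <cstdint>
    #include <string>
    #include <vector>

    // Hypothetical, simplified transaction: just a recorded list of ops.
    struct Op {
      std::string op, coll, oid, dst;
      uint64_t off, len;
    };

    struct Transaction {
      std::vector<Op> ops;
      void write(const std::string& c, const std::string& o,
                 uint64_t off, uint64_t len) {
        ops.push_back({"write", c, o, "", off, len});
      }
      void clone_range(const std::string& c, const std::string& src,
                       const std::string& dst, uint64_t off, uint64_t len) {
        ops.push_back({"clone_range", c, src, dst, off, len});
      }
    };

    // One transaction: the write and the clone either both apply or
    // neither does, so a failure can no longer land between them.
    Transaction make_clone_tx(const std::string& coll, const std::string& src,
                              const std::string& dst, uint64_t off,
                              uint64_t len) {
      Transaction t;
      t.write(coll, src, off, len);            // make the source range valid
      t.clone_range(coll, src, dst, off, len); // clone it in the same tx
      return t;
    }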
Sage Weil [Thu, 12 Apr 2012 17:40:08 +0000 (10:40 -0700)]
filestore: make perfcounter name configurable
We need to allow the perfcounter name to be controlled so that we can have
two instances of FileStore in the same process that don't step on each
other. Default to 'filestore'.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
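A sketch of the idea with hypothetical names (the real FileStore
perfcounter plumbing differs): the prefix becomes a constructor argument
that defaults to "filestore", so two instances register distinct counters.

    #include <string>

    // Hypothetical sketch, not the actual FileStore code.
    class FileStoreCounters {
     public:
      explicit FileStoreCounters(const std::string& name = "filestore")
        : name_(name) {}
      // e.g. "filestore.ops" for one instance, "filestore-b.ops" for another
      std::string counter_key(const std::string& counter) const {
        return name_ + "." + counter;
      }
     private:
      std::string name_;
    };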
test_idempotent_sequence: Generate a reproducible sequence of txs.
With this test we aim to reproduce the same sequence of transactions
whenever we are provided with the same seed across runs.
We also allow failures to be injected into the filestore when the
--filestore-kill-at <VAL> argument is passed, and we provide verification
when --test-verify-at <VAL> is provided.
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
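The reproducibility rests on nothing more than a fixed seed; a minimal,
illustrative sketch (not the test's actual generator):

    #include <cstddef>
    #include <cstdint>
    #include <random>
    #include <vector>

    // Same seed -> same stream of choices -> same sequence of transactions.
    std::vector<int> generate_ops(uint32_t seed, size_t count) {
      std::mt19937 rng(seed);
      std::uniform_int_distribution<int> pick(0, 5);  // e.g. 6 op types
      std::vector<int> ops;
      for (size_t i = 0; i < count; ++i)
        ops.push_back(pick(rng));   // deterministic given the seed
      return ops;
    }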
VerifyFileStore: Check if two FileStores match after applying a set of operations.
With DeterministicOpSequence we are able to reproduce exactly the same
sequence of operations, over and over. However, if the filestore fails
(e.g., because we injected a failure), we want to check if it is kept
consistent after replaying its journal.
With VerifyFileStore, which extends DeterministicOpSequence, we are able
to bring a brand new filestore to the state the failed filestore would
have reached had it not failed. We can then compare the two to check
whether the failure introduced inconsistencies after replaying the journal.
(This is still work in progress and not fully functional)
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
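Conceptually the check reduces to comparing the contents of the two stores;
a toy sketch with hypothetical types, not the actual VerifyFileStore code:

    #include <map>
    #include <set>
    #include <string>

    // A store's state reduced to its collections and the objects they hold.
    using StoreState = std::map<std::string, std::set<std::string>>;

    // The recovered (failed + journal-replayed) store should match the
    // pristine store rebuilt from the same deterministic sequence; any
    // extra or missing object points at an inconsistency.
    bool stores_match(const StoreState& pristine,
                      const StoreState& recovered) {
      return pristine == recovered;
    }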
TestFileStoreState: Represent a FileStore's state to be used by tests.
Instead of having each test create the same representation of a
FileStore's state, with a map/set of collections and objects, as well as
multiple per-test init() functions that are all similar in nature,
provide this in a single class that can be inherited by test
classes.
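A rough sketch of that shape, using illustrative names rather than the real
class:

    #include <map>
    #include <set>
    #include <string>

    // One base class owns the collection/object bookkeeping and the common
    // init(), so individual tests inherit it instead of re-implementing it.
    class TestFileStoreStateSketch {
     public:
      virtual ~TestFileStoreStateSketch() {}
      virtual void init(int num_collections, int objs_per_collection) {
        for (int c = 0; c < num_collections; ++c) {
          std::string coll = "coll_" + std::to_string(c);
          for (int o = 0; o < objs_per_collection; ++o)
            collections_[coll].insert("obj_" + std::to_string(o));
        }
      }
     protected:
      std::map<std::string, std::set<std::string>> collections_;
    };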
Sage Weil [Sat, 14 Apr 2012 00:11:54 +0000 (17:11 -0700)]
filestore: two-phase guard
For certain operations (collection_add) we need a two-phase guard, and an
"in-progress" state.
- before exposing an object in a new location, we need to mark it so that
old operations affecting the target name don't touch the new object.
- can't just set the guard before starting or else we can't distinguish
between a collection_add that was in-progress and one that happened a
long time ago.
We may need the same for collection_rename().
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
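A conceptual sketch of the two phases, with hypothetical names and an
in-memory map standing in for the on-disk guard:

    #include <cstdint>
    #include <map>
    #include <string>

    enum class GuardState { NONE, IN_PROGRESS, DONE };
    struct Guard { GuardState state; uint64_t spos; };

    // Phase 1 marks the target before the object is exposed; phase 2
    // finalizes the guard once the add has completed.
    void collection_add_sketch(std::map<std::string, Guard>& guards,
                               const std::string& target, uint64_t spos) {
      guards[target] = {GuardState::IN_PROGRESS, spos};
      // ... link the object into its new location here ...
      guards[target] = {GuardState::DONE, spos};
    }

    // On replay, only a *completed* guard lets us skip older operations
    // on the target name; an IN_PROGRESS marker means the add never
    // finished and must be redone.
    bool replay_should_skip(const std::map<std::string, Guard>& guards,
                            const std::string& target, uint64_t replay_spos) {
      auto it = guards.find(target);
      if (it == guards.end())
        return false;
      return it->second.state == GuardState::DONE &&
             it->second.spos >= replay_spos;
    }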
Sage Weil [Sat, 14 Apr 2012 00:17:36 +0000 (17:17 -0700)]
filestore: simple failure injections via --filestore-kill-at <n>
This will make the filestore commit suicide (_exit(1)) at the nth potential
failure call site. We can potentially fail:
- before a transaction
- between each op
- at the end
Additionally, we instrument the guards:
- before/after/inside _set_replay_guard
- between significant steps of callers of _set_replay_guard
All instrumentation points are inside _do_transactions(), so if everything
is done in a single sequencer (or from a single thread) the failure
point is deterministic.
That said, use an atomic so we will still reliably fail (at some point)
when there are multiple filestore threads in action.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
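The mechanism amounts to a shared countdown; a simplified sketch, not the
actual Ceph code:

    #include <atomic>
    #include <unistd.h>

    // Hypothetical sketch: every potential failure site calls
    // inject_failure(); the nth call kills the process outright.
    static std::atomic<int> kill_at{0};   // set from --filestore-kill-at <n>

    inline void inject_failure() {
      // fetch_sub returns the previous value, so exactly one thread -- even
      // with several filestore threads racing -- observes the value 1.
      if (kill_at.load() > 0 && kill_at.fetch_sub(1) == 1)
        _exit(1);   // no destructors, no flush: behaves like a crash
    }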
Sage Weil [Fri, 13 Apr 2012 18:30:49 +0000 (11:30 -0700)]
filestore: replay collection_move using add+remove
This approximates the buggy collection_move. It is still buggy. It is
only there to replay old journals.
Rip out buggy (and now unused) collection_move code.
For the record, the problem there is that a crash between setting the guard
and unlinking the old name will not remove the old name on replay because
the guard for the link stage is indistinguishable from that for the unlink
stage.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 13 Apr 2012 16:56:04 +0000 (09:56 -0700)]
filestore: implement collection_move() as add + remove
This ensures we get add and remove steps with different spos values, which
makes the guard work. The collection_move implementation breaks on replay
because those values match, so the just-set guard prevents unlink from
happening.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
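A sketch of the replacement, with stub functions standing in for the real
guarded primitives (not the actual FileStore code):

    #include <cstdint>
    #include <iostream>
    #include <string>

    // A sequencer position identifies which op of which transaction set a
    // guard.
    struct SequencerPos { uint64_t seq; int op; };

    void collection_add(const std::string& dst, const std::string& src,
                        const std::string& oid, SequencerPos spos) {
      std::cout << "add " << oid << " to " << dst << " (from " << src
                << ") @ " << spos.seq << "." << spos.op << "\n";
    }

    void collection_remove(const std::string& coll, const std::string& oid,
                           SequencerPos spos) {
      std::cout << "remove " << oid << " from " << coll
                << " @ " << spos.seq << "." << spos.op << "\n";
    }

    // The move is two guarded steps at *different* spos values, so on
    // replay the guard set by the add cannot be confused with the guard
    // that should gate the remove.
    void collection_move(const std::string& dst, const std::string& src,
                         const std::string& oid, SequencerPos spos) {
      collection_add(dst, src, oid, spos);
      spos.op++;                    // distinct position for the second step
      collection_remove(src, oid, spos);
    }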