git.apps.os.sepia.ceph.com Git - ceph.git/log
Sage Weil [Sat, 21 Jul 2012 00:36:43 +0000 (17:36 -0700)]
regression: do some tests on ext4
Sage Weil [Fri, 20 Jul 2012 20:14:28 +0000 (13:14 -0700)]
move cfuse+dbench back to regression for verify, too
Sage Weil [Wed, 18 Jul 2012 03:05:30 +0000 (20:05 -0700)]
move cfuse + dbench from marginal to regression
Fixed #1737, yay!
Sage Weil [Mon, 16 Jul 2012 17:35:25 +0000 (10:35 -0700)]
move cfuse + ffsb from marginal to regression
This has had no failures.
Sage Weil [Mon, 16 Jul 2012 16:41:35 +0000 (09:41 -0700)]
move cfuse + fsx back into regression suite
No failures in marginal. The objectcacher fixes that came out of the
rbd_fsx stuff probably fixed the original problem?
Sage Weil [Thu, 12 Jul 2012 23:05:12 +0000 (16:05 -0700)]
fix wrongly marked down whitelist
This used to have '...or wrong addr' but it doesn't any more.
Josh Durgin [Wed, 11 Jul 2012 17:59:08 +0000 (10:59 -0700)]
rbd: test with layering enabled
RBD_FEATURES=0 hits a bug that's fixed in wip-rbd-parent.
Once that's merged, we can add RBD_FEATURES=0 tests back in.
Sage Weil [Wed, 11 Jul 2012 15:27:30 +0000 (08:27 -0700)]
ffsb is marginal, remove from smoke suite
Sage Weil [Wed, 11 Jul 2012 03:26:25 +0000 (20:26 -0700)]
Revert "smoke: add msgr failures"
This reverts commit 9278e231e64f49c3205c2ded8b1f2d3b27265eac.
Sage Weil [Wed, 11 Jul 2012 02:57:56 +0000 (19:57 -0700)]
move cfuse fsx into marginal suite
This should probably pass, given the testing that ObjectCacher gets these
days with librbd_fsx.
Sage Weil [Wed, 11 Jul 2012 02:56:39 +0000 (19:56 -0700)]
remove suites/stress/basic
Sage Weil [Wed, 11 Jul 2012 02:56:01 +0000 (19:56 -0700)]
move some old flaky tasks into marginal suite
These were pulled out of regression a while ago. Put them into the
marginal suite where they will be regularly run and we can evaluate the
severity of the problems they cause.
Sage Weil [Sat, 7 Jul 2012 00:04:02 +0000 (17:04 -0700)]
move qemu_iozone test to marginal suite
Samuel Just [Fri, 6 Jul 2012 17:02:29 +0000 (10:02 -0700)]
increase thrashosds timeout
Sage Weil [Wed, 4 Jul 2012 19:46:03 +0000 (12:46 -0700)]
move other ffsb workloads to marginal suite
Sage Weil [Wed, 4 Jul 2012 00:39:59 +0000 (17:39 -0700)]
move locktest to marginal suite
This fails 1 in 10 times or something like that.
Sage Weil [Sun, 1 Jul 2012 22:36:50 +0000 (15:36 -0700)]
smoke: add msgr failures
Sage Weil [Mon, 2 Jul 2012 19:26:10 +0000 (12:26 -0700)]
fewer hosts for mon tests
Sage Weil [Sun, 1 Jul 2012 21:27:38 +0000 (14:27 -0700)]
add rbd_xfstests to kernel suite
Josh Durgin [Fri, 29 Jun 2012 18:02:29 +0000 (11:02 -0700)]
qemu_iozone: use a larger image
The default is not large enough.
Sage Weil [Fri, 29 Jun 2012 16:12:51 +0000 (09:12 -0700)]
kernel suite
Sage Weil [Tue, 26 Jun 2012 04:21:33 +0000 (21:21 -0700)]
include ceph task in librbd collection
Sage Weil [Mon, 25 Jun 2012 22:30:27 +0000 (15:30 -0700)]
move kclient_workunit_suites_ffsb to marginal suite
until #1947 is fixed
Josh Durgin [Fri, 22 Jun 2012 01:18:03 +0000 (18:18 -0700)]
Add some tests inside qemu for the librbd suite
Josh Durgin [Fri, 22 Jun 2012 01:16:28 +0000 (18:16 -0700)]
Move librbd tests to rbd suite
This lets us generate jobs with different caching settings instead of
hardcoding them.
Sage Weil [Wed, 20 Jun 2012 18:23:20 +0000 (11:23 -0700)]
move cfuse + dbench task that triggers #1737 to marginal suite
Sage Weil [Sun, 17 Jun 2012 15:58:59 +0000 (08:58 -0700)]
don't dup ceph task for new fsx jobs
Josh Durgin [Fri, 15 Jun 2012 18:59:43 +0000 (11:59 -0700)]
Run fsx on rbd with thrashing
Josh Durgin [Fri, 15 Jun 2012 18:55:33 +0000 (11:55 -0700)]
Increase number of ops done by fsx against rbd.
Especially in the no-cache case, this should detect more races. The
fiemap problem is detectable on plana after ~5000 fsx ops.
Sage Weil [Thu, 14 Jun 2012 21:06:34 +0000 (14:06 -0700)]
add radosgw-admin test to regression suite
We wrote this test ages ago, but forgot to add it! Fixed up a few things
that have changed since then.
Josh Durgin [Mon, 11 Jun 2012 05:37:12 +0000 (22:37 -0700)]
Add test for cls_rbd
Josh Durgin [Mon, 11 Jun 2012 04:44:55 +0000 (21:44 -0700)]
Test old and new rbd formats
Josh Durgin [Mon, 11 Jun 2012 04:26:50 +0000 (21:26 -0700)]
Update for new workunit task syntax
Sage Weil [Fri, 8 Jun 2012 21:35:56 +0000 (14:35 -0700)]
regression: fix new rados, rbd test yamls
Don't start cluster twice!
Sage Weil [Fri, 8 Jun 2012 18:55:30 +0000 (11:55 -0700)]
run rados, rbd api tests under thrashing
Sage Weil [Thu, 31 May 2012 23:44:24 +0000 (16:44 -0700)]
add rados_stress_watch to regression
Sage Weil [Tue, 8 May 2012 23:07:10 +0000 (16:07 -0700)]
rbd_fsx in write-through mode
Sage Weil [Tue, 1 May 2012 03:11:44 +0000 (20:11 -0700)]
use fewer nodes for the simple singleton tasks
Sage Weil [Thu, 19 Apr 2012 20:33:32 +0000 (13:33 -0700)]
add rbd_fsx_[no]cache jobs to regression suite
Sage Weil [Wed, 18 Apr 2012 22:19:49 +0000 (15:19 -0700)]
gather logs for cfuse dbench workload, hopefully catch #1737
Sage Weil [Mon, 16 Apr 2012 03:39:56 +0000 (20:39 -0700)]
dump_stuck: whitelist 'wrongly marked me down'
The test marks the osds down. They may generate this error if they get
that faster than they get the signal via the daemon-wrapper.
Sage Weil [Sat, 14 Apr 2012 05:27:24 +0000 (22:27 -0700)]
add rbd_xfstests to regression suite
Sage Weil [Fri, 13 Apr 2012 05:56:09 +0000 (22:56 -0700)]
move tasks:cfuse_workunit_suites_dbench.yaml to stress pending #1737 fix
Sage Weil [Sun, 25 Mar 2012 04:47:15 +0000 (21:47 -0700)]
add smoke suite
This could probably be collapsed into a bunch of singleton tasks to make
it simpler to track how many actual jobs result, but it was simpler to
make it a subset of regression. And probably that'll be easier to maintain
moving forward.
Tried to avoid any jobs that took more than 10 minutes (tho there are a few
in here). Kept both valgrind and lockdep jobs, and dropped many of those
from the basic collection (esp api tests).
We'll see how long this takes on plana and adjust up/down from there,
depending on how long we want to wait for it.
Sage Weil [Sat, 24 Mar 2012 23:07:47 +0000 (16:07 -0700)]
add osd-recovery test
Sage Weil [Sat, 24 Mar 2012 23:07:38 +0000 (16:07 -0700)]
renamed backfill -> osd_backfill
Sage Weil [Wed, 14 Mar 2012 22:51:51 +0000 (15:51 -0700)]
disable rbd thrash workload, #2174
Sage Weil [Thu, 15 Mar 2012 17:32:39 +0000 (10:32 -0700)]
Revert "disable rbd thrash workload, #2174"
This reverts commit 1bec416c7c7ff8a6462d94baaba8e7da73e88ab4.
Fixed with #2174
Sage Weil [Wed, 14 Mar 2012 22:51:51 +0000 (15:51 -0700)]
disable rbd thrash workload, #2174
Sage Weil [Tue, 13 Mar 2012 17:49:33 +0000 (10:49 -0700)]
thrash: put client on separate machine from osds
This allows us to run kernel clients (kclient, rbd) against the thrashing
cluster.
Sage Weil [Mon, 12 Mar 2012 22:22:17 +0000 (15:22 -0700)]
remove dup ceph tasks from new thrash workloads
Sage Weil [Mon, 12 Mar 2012 04:50:03 +0000 (21:50 -0700)]
clusters/fixed-3.yaml: 2 -> 6 osds
plana nodes have 3 scratch disks... use them!
Sage Weil [Mon, 12 Mar 2012 04:32:45 +0000 (21:32 -0700)]
Revert "disable s3tests on valgrind/lockdep until #2103 is fixed"
This reverts commit 9f757ca9511374f6565d74263e242c74e39f8a3f.
Sage Weil [Mon, 12 Mar 2012 04:28:45 +0000 (21:28 -0700)]
add rbd, kclient workloads to regression thrash collection
This will get us some kernel osd_client osd restart coverage.
Sage Weil [Sun, 11 Mar 2012 20:03:41 +0000 (13:03 -0700)]
fix typo, ceph-fyuse -> ceph-fuse
Sage Weil [Sun, 11 Mar 2012 04:01:57 +0000 (20:01 -0800)]
use dbench workunit, not the autotest one
The autotest one uses an old tarball that doesn't build. Workunit assumes
the dbench package is installed.
Sage Weil [Fri, 2 Mar 2012 06:04:15 +0000 (22:04 -0800)]
disable s3tests on valgrind/lockdep until #2103 is fixed
Josh Durgin [Wed, 29 Feb 2012 23:45:25 +0000 (15:45 -0800)]
dump-stuck: set pg stuck threshold to match test
Sage Weil [Mon, 27 Feb 2012 22:52:35 +0000 (14:52 -0800)]
no peer as part of lost_unfound
Sage Weil [Mon, 27 Feb 2012 01:09:41 +0000 (17:09 -0800)]
move peer to separate test for now
Sage Weil [Sun, 26 Feb 2012 05:35:31 +0000 (21:35 -0800)]
lost_unfound: do peer after, until wait_for_clean propagates last_epoch_started
The peer task does wait_for_clean, and then lost_unfound immediately marks
something down. But the PGs become clean before the replica last_epoch_started
is moved forward in time, which means they block waiting for the now down
OSD. Needlessly.
Until we fix this, just do the peer test after.
Sage Weil [Sat, 25 Feb 2012 05:39:55 +0000 (21:39 -0800)]
fix lockdep.yaml conf syntax
Sage Weil [Fri, 24 Feb 2012 23:20:00 +0000 (15:20 -0800)]
run radosgw through valgrind for s3tests
Sage Weil [Fri, 24 Feb 2012 23:04:27 +0000 (15:04 -0800)]
do peer test along with lost_unfound
Sage Weil [Fri, 24 Feb 2012 20:49:26 +0000 (12:49 -0800)]
rename valgrind -> verify, add in runs under lockdep
Josh Durgin [Wed, 22 Feb 2012 00:21:05 +0000 (16:21 -0800)]
Add test for 'ceph pg dump_stuck'
Sage Weil [Tue, 21 Feb 2012 18:02:44 +0000 (10:02 -0800)]
add valgrind collection to regression suite
Run a smaller set of tests with valgrind on the mon, osd, and mds.
Valgrind is currently ignoring leaks, but this will pick up use-after-free
and similar badness.
Sage Weil [Mon, 20 Feb 2012 20:49:35 +0000 (12:49 -0800)]
cfuse -> ceph-fuse
Sage Weil [Sat, 18 Feb 2012 21:56:47 +0000 (13:56 -0800)]
thrashing: whitelist 'objects unfound and apparently lost' message
This can happen when we mark OSDs down... if the objects are found when
the osds come back up then we're fine. If not, it won't go clean, and the
test will fail for that reason.
Sage Weil [Wed, 15 Feb 2012 05:49:26 +0000 (21:49 -0800)]
add regression/multifs collection; run rgw tests under both xfs and btrfs
Sage Weil [Tue, 14 Feb 2012 16:58:30 +0000 (08:58 -0800)]
rename fs files
Sage Weil [Tue, 14 Feb 2012 00:45:04 +0000 (16:45 -0800)]
regression/thrash on xfs and btrfs both
Sage Weil [Mon, 13 Feb 2012 23:29:52 +0000 (15:29 -0800)]
btrfs: 1 -> fs: btrfs
Sage Weil [Sat, 11 Feb 2012 21:40:44 +0000 (13:40 -0800)]
add snap thrashing covering a small number of objects
The snaps-many-objects has a relatively low density of ops-per-object. This
hammers on a small number of them and does a better job of validating the
correctness wrt snaps.
Sage Weil [Sat, 11 Feb 2012 21:39:46 +0000 (13:39 -0800)]
move snap thrashing back into regression suite
Sage Weil [Sat, 11 Feb 2012 00:40:03 +0000 (16:40 -0800)]
move kclient_workunit_suites_blogbench.yaml to stress suite
This is consistently failing due to an mds/kclient interaction.
Sage Weil [Wed, 1 Feb 2012 00:37:57 +0000 (16:37 -0800)]
add backfill test
Sage Weil [Sun, 29 Jan 2012 05:11:32 +0000 (21:11 -0800)]
make 6-osd-2-machine simpler... single monitor
Josh Durgin [Sat, 28 Jan 2012 02:08:31 +0000 (18:08 -0800)]
regression: add admin socket test for objecter requests.
Sage Weil [Wed, 25 Jan 2012 22:04:04 +0000 (14:04 -0800)]
remove snap thrashing from regression suite for time being
Samuel Just [Tue, 17 Jan 2012 23:06:24 +0000 (15:06 -0800)]
Add small cluster thrashing tasks
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Mon, 16 Jan 2012 23:09:29 +0000 (15:09 -0800)]
add simple thrash workload to regression suite
Sage Weil [Mon, 16 Jan 2012 19:08:34 +0000 (11:08 -0800)]
mon.0 -> mon.a
Sage Weil [Mon, 16 Jan 2012 19:08:19 +0000 (11:08 -0800)]
mds.0 -> mds.a
Yehuda Sadeh [Tue, 10 Jan 2012 23:30:53 +0000 (15:30 -0800)]
add rgw readwrite and roundtrip tasks
Sage Weil [Sat, 7 Jan 2012 18:16:39 +0000 (10:16 -0800)]
do not put monitors on the same nodes as clients
Otherwise, for kernel clients (rbd or kclient), ceph-mon can cause a deadlock when it calls sync(2).
Sage Weil [Fri, 6 Jan 2012 23:08:01 +0000 (15:08 -0800)]
move multimon failure thrashing tests into regression
We need to test these nightly.
Josh Durgin [Tue, 3 Jan 2012 21:55:36 +0000 (13:55 -0800)]
Adjust rados model workloads for new config format
Sage Weil [Tue, 13 Dec 2011 16:28:23 +0000 (08:28 -0800)]
rados load-gen workunits
Samuel Just [Thu, 8 Dec 2011 23:35:16 +0000 (15:35 -0800)]
Use btrfs for regression tests
Some of the tests (particularly the s3 tests) use very long filenames
which trigger bugs related to ext4 xattr handling.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Tommi Virtanen [Mon, 5 Dec 2011 18:08:54 +0000 (10:08 -0800)]
Rename "testrados" task to not begin with "test".
See commit e80c32c44293e6453cce1bf89ad3cf5b1b4917ab in teuthology.git
Josh Durgin [Wed, 30 Nov 2011 00:20:36 +0000 (16:20 -0800)]
Move kclient multiple_rsync workunit to stress collection.
Bug #1760 keeps being triggered by this.
Sage Weil [Tue, 22 Nov 2011 05:58:13 +0000 (21:58 -0800)]
Revert "more logs (yuck) for #1682"
This reverts commit ea00114f08440563bce8e27ae2cd887bbc85aba5.
Sage Weil [Sun, 20 Nov 2011 23:24:17 +0000 (15:24 -0800)]
more logs (yuck) for #1682
Sage Weil [Sun, 20 Nov 2011 03:28:26 +0000 (19:28 -0800)]
fix conf thinko
'int' object has no attribute 'iteritems'
Sage Weil [Sat, 19 Nov 2011 21:56:17 +0000 (13:56 -0800)]
regression/basic/tasks/kclient_workunit_misc: turn on mds log
Hopefully will catch #1682
Sage Weil [Sat, 19 Nov 2011 21:45:28 +0000 (13:45 -0800)]
regression/basic/tasks/cfuse_dbench: turn up client debugging
Hopefully we'll hit #1737...
Josh Durgin [Fri, 18 Nov 2011 18:21:16 +0000 (10:21 -0800)]
Move multimds tests to a new suite, 'experimental'.
This suite is for testing features that aren't expected to be stable yet.
Josh Durgin [Fri, 18 Nov 2011 01:57:53 +0000 (17:57 -0800)]
Move collections into separate suites
For now, there are just two suites:
* regression - tests that should always pass
* stress - tests that have problems for one reason or another
Josh Durgin [Mon, 14 Nov 2011 16:06:18 +0000 (08:06 -0800)]
multimon: need at least 2 osds to go healthy