git.apps.os.sepia.ceph.com Git

Tommi Virtanen [Tue, 22 Nov 2011 00:00:19 +0000 (16:00 -0800)]

Properly handle case where first error is inside a context manager __exit__.

Closes: http://tracker.newdream.net/issues/1743

Sage Weil [Sun, 20 Nov 2011 04:56:26 +0000 (20:56 -0800)]

nuke: don't specify full path

/tmp/cephtest/binary may have been removed; kill stray daemons by name
only. We really don't care about false positives here!

Sage Weil [Thu, 17 Nov 2011 21:52:17 +0000 (13:52 -0800)]

Josh Durgin [Fri, 18 Nov 2011 21:53:51 +0000 (13:53 -0800)]

Save summary after nuking machines.

This way you can tell when tests are entirely finished running.

Josh Durgin [Fri, 18 Nov 2011 20:22:18 +0000 (12:22 -0800)]

Add an example overrides file for running regression tests.

Josh Durgin [Fri, 18 Nov 2011 01:26:21 +0000 (17:26 -0800)]

suite: put common config before facets

This lets you add tasks to the beginning of a run, like the chef task.

Josh Durgin [Fri, 18 Nov 2011 01:14:05 +0000 (17:14 -0800)]

suite: schedule a list of collections for running instead of a single suite directory

Yehuda Sadeh [Fri, 18 Nov 2011 00:53:21 +0000 (16:53 -0800)]

testswift: fix config

Tommi Virtanen [Fri, 18 Nov 2011 01:00:44 +0000 (17:00 -0800)]

Tommi Virtanen [Fri, 18 Nov 2011 00:49:47 +0000 (16:49 -0800)]

Add a task for easily running chef-solo on all the nodes.

Sage Weil [Thu, 17 Nov 2011 21:46:02 +0000 (13:46 -0800)]

ceph_manager: fix logging

Josh Durgin [Thu, 17 Nov 2011 21:07:03 +0000 (13:07 -0800)]

ceph: deep merge overrides, so e.g. log whitelists can be overridden

Josh Durgin [Thu, 17 Nov 2011 21:06:36 +0000 (13:06 -0800)]

misc: move deep_merge out of the MergeConfig class - it's generic

Josh Durgin [Thu, 17 Nov 2011 19:57:07 +0000 (11:57 -0800)]

Save config after locking nodes, so targets are included.

Josh Durgin [Thu, 17 Nov 2011 19:18:24 +0000 (11:18 -0800)]

filestore_idempotent: remove unused import

Josh Durgin [Thu, 17 Nov 2011 19:15:47 +0000 (11:15 -0800)]

mon_recovery: remove unused code and import

Josh Durgin [Thu, 17 Nov 2011 19:11:33 +0000 (11:11 -0800)]

thrashosds: timeout for every clean check, not just the last one

Josh Durgin [Thu, 17 Nov 2011 19:05:12 +0000 (11:05 -0800)]

ceph_manager: add a default timeout of 5 minutes for mon quorum

Josh Durgin [Thu, 17 Nov 2011 18:45:19 +0000 (10:45 -0800)]

ceph_manager: log mon quorum status so the logs show progress (or lack thereof)

Yehuda Sadeh [Thu, 17 Nov 2011 00:00:01 +0000 (16:00 -0800)]

rgw: add swift task

still not completely working (for some reason it skips all the tests)

Sage Weil [Fri, 11 Nov 2011 05:35:11 +0000 (21:35 -0800)]

filestore_idempotent.py: simple task to test non-idempotent osd ops

Write some non-idempotent events to the osd. Simulate a failure. Verify
the result is correct on replay.

This must be preceded by the ceph task just so that we get the binaries
installed. Should clean this up later if/when the installation gets
factored out of ceph.py.

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 10 Nov 2011 22:13:24 +0000 (14:13 -0800)]

misc: allow >1 monitor per role in get_mon_names()

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Wed, 9 Nov 2011 21:37:02 +0000 (13:37 -0800)]

add hammer.sh

Simple script to repeat a test until it fails. Can probably do something much more sophisticated
here, but this works.
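
For illustration only (this is not the contents of hammer.sh), the repeat-until-failure idea could be sketched in Python roughly as follows. The function name `hammer` and the `max_runs` parameter are assumptions made for this sketch.

```python
import subprocess
import sys


def hammer(cmd, max_runs=None):
    """Run cmd repeatedly until it exits non-zero.

    Returns the 1-based number of the first failing run, or None if
    max_runs runs all succeeded. With max_runs=None it loops until a
    failure occurs. (Hypothetical helper, not taken from hammer.sh.)
    """
    run = 0
    while max_runs is None or run < max_runs:
        run += 1
        # subprocess.call returns the command's exit status
        if subprocess.call(cmd) != 0:
            return run
    return None


if __name__ == '__main__':
    # Example: hammer the command given on the command line
    if len(sys.argv) > 1:
        print('failed on run', hammer(sys.argv[1:]))
```

Capping the loop with `max_runs` is a design choice of this sketch so that it can also be used in a bounded, non-interactive way.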

Josh Durgin [Wed, 9 Nov 2011 18:39:56 +0000 (10:39 -0800)]

nuke: increase reboot timeout

Some sepia nodes are very slow to reboot.

Sage Weil [Wed, 9 Nov 2011 06:06:43 +0000 (22:06 -0800)]

mon_recovery: add task to test monitor cluster failure recovery

Some simple tests to start with. We still need some sort of mon cluster
thrashing.

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Wed, 9 Nov 2011 06:02:58 +0000 (22:02 -0800)]

ceph_manager: manipulate monitors

Sage Weil [Wed, 9 Nov 2011 06:00:32 +0000 (22:00 -0800)]

ceph: keep ceph.conf at ctx.ceph.conf

Signed-off-by: Sage Weil <sage@newdream.net>

Josh Durgin [Wed, 9 Nov 2011 00:06:33 +0000 (16:06 -0800)]

Remove unused imports and variable.

Josh Durgin [Wed, 9 Nov 2011 00:01:39 +0000 (16:01 -0800)]

Add nuke-on-error option.

This lets automated jobs nuke and unlock machines after failed
tests. Each machine is nuked individually, so one down machine won't
keep others from being nuked and unlocked.

Tommi Virtanen [Mon, 7 Nov 2011 21:05:14 +0000 (13:05 -0800)]

Fix leftover orchestra import clause.

This seems to be a leftover from
a2372fce12b6bd1818e155d1d8ed5134dbd8fd4a,
no idea how it stayed hidden this long.

Josh Durgin [Thu, 3 Nov 2011 20:27:44 +0000 (13:27 -0700)]

ceph_manager: log ceph -s output so progress is visible in the logs

Josh Durgin [Thu, 3 Nov 2011 20:08:39 +0000 (13:08 -0700)]

Keep each ssh connection alive.

With long-running jobs like thrashing, ssh connections were timing
out.

Josh Durgin [Thu, 3 Nov 2011 20:07:21 +0000 (13:07 -0700)]

connection: allow the caller to specify whether keep-alive should be used

Josh Durgin [Thu, 3 Nov 2011 18:26:45 +0000 (11:26 -0700)]

locker: fix race in locking

The isolation level is lower than I thought. This made it possible for
two clients to think they both locked the same machines, since the
update would still be modifying each row to change the locked_since
time.
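
One common way to avoid this kind of race, shown here only as a sketch and not as the actual fix, is to make the update touch only rows that are still unlocked and then compare the affected-row count with the number of machines requested. The example uses SQLite from the Python standard library; the `machine` table, its columns, and the `lock_machines` function are all hypothetical.

```python
import sqlite3


def lock_machines(conn, names, owner):
    """Lock every machine in names for owner, all-or-nothing.

    The WHERE clause matches only rows that are currently unlocked, so
    rows another client already holds are not modified and do not count
    toward rowcount. (Hypothetical schema: machine(name, locked, locked_by).)
    """
    placeholders = ','.join('?' for _ in names)
    # "with conn" commits on success and rolls back if an exception escapes
    with conn:
        cur = conn.execute(
            'UPDATE machine SET locked = 1, locked_by = ? '
            'WHERE locked = 0 AND name IN (%s)' % placeholders,
            [owner] + list(names))
        if cur.rowcount != len(names):
            raise RuntimeError('some machines were already locked')
    return True
```

Raising inside the `with conn:` block rolls the partial update back, so a failed attempt leaves no machine half-claimed.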

Samuel Just [Wed, 2 Nov 2011 18:33:37 +0000 (11:33 -0700)]

testrados: set CEPH_CLIENT_ID without a ;

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Samuel Just [Mon, 31 Oct 2011 21:26:41 +0000 (14:26 -0700)]

testrados: specify CEPH_CONF directly

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Yehuda Sadeh [Thu, 27 Oct 2011 19:11:28 +0000 (12:11 -0700)]

rgw: add user suspend/enable test

Yehuda Sadeh [Thu, 27 Oct 2011 18:32:12 +0000 (11:32 -0700)]

rgw: log-to-stderr is now a binary flag

Samuel Just [Mon, 24 Oct 2011 21:23:48 +0000 (14:23 -0700)]

testrados: rename testsnaps to testrados and make snap testing optional

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Josh Durgin [Mon, 24 Oct 2011 20:52:29 +0000 (13:52 -0700)]

workunit: set PYTHONPATH so we can test python bindings

Sage Weil [Sun, 23 Oct 2011 17:30:27 +0000 (10:30 -0700)]

ceph.conf: python parser doesn't like ; comments

Sage Weil [Sun, 23 Oct 2011 05:16:39 +0000 (22:16 -0700)]

ceph.conf: more frequent osd scrubbing; remove old cruft

Sage Weil [Wed, 19 Oct 2011 17:04:07 +0000 (10:04 -0700)]

ceph_manager: count active+clean+<something else> as active+clean

In my case, one pg was active+clean+scrubbing.

Signed-off-by: Sage Weil <sage@newdream.net>
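
A minimal sketch of such a check, assuming a pg state is reported as a `+`-joined string like the `active+clean+scrubbing` example above. The function name `is_active_clean` is hypothetical and is not taken from ceph_manager.

```python
def is_active_clean(pg_state):
    """Return True if a '+'-joined pg state includes both active and clean.

    Extra flags such as 'scrubbing' are ignored, so
    'active+clean+scrubbing' counts as active+clean.
    (Hypothetical helper; the state format is an assumption.)
    """
    flags = pg_state.split('+')
    return 'active' in flags and 'clean' in flags
```

Splitting on `+` rather than testing for a substring avoids accidentally matching a flag that merely contains the word "clean".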

Josh Durgin [Thu, 20 Oct 2011 23:28:29 +0000 (16:28 -0700)]

coverage: don't remove ceph tarball

We want to keep it for examining core files, and we're already
fetching it here, once per suite run.

Sage Weil [Mon, 17 Oct 2011 22:32:22 +0000 (15:32 -0700)]

add lost_unfound task

Also some misc useful bits to ceph_manager.

Josh Durgin [Mon, 17 Oct 2011 21:42:03 +0000 (14:42 -0700)]

ceph: add whitelist for cluster log errors

Some messages are expected when thrashing osds or creating unfound
objects.

Fixes: #1622

Josh Durgin [Mon, 17 Oct 2011 17:40:16 +0000 (10:40 -0700)]

nuke: reset syslog configuration after rebooting

Previously we removed a file and rebooted without syncing, so the file
was never deleted.

Yehuda Sadeh [Wed, 12 Oct 2011 22:37:33 +0000 (15:37 -0700)]

radosgw-admin: test swift keys creation/removal

Josh Durgin [Fri, 7 Oct 2011 21:51:46 +0000 (14:51 -0700)]

teuthology-worker: remove --keep-locked-on-error

Josh Durgin [Fri, 7 Oct 2011 21:45:01 +0000 (14:45 -0700)]

Remove --keep-locked-on-error, and behave as if it were specified

This will help prevent machines with cephtest dirs still present from
being used. It's easy to unlock machines - the targets yaml fragment
is output during a run.

Josh Durgin [Fri, 7 Oct 2011 00:18:35 +0000 (17:18 -0700)]

reconnect: ignore SSHExceptions before the timeout expires

Fixes: #1587
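
The retry pattern the subject line describes might look roughly like the sketch below. The `SSHException` class here is a local stand-in for the SSH library's exception type, and the `reconnect` signature (including the injectable `clock` and `sleep` parameters) is an assumption made so the example is self-contained.

```python
import time


class SSHException(Exception):
    """Stand-in for the SSH library's exception type (hypothetical here)."""


def reconnect(connect, timeout, interval=1.0,
              clock=time.monotonic, sleep=time.sleep):
    """Call connect() until it succeeds or timeout seconds have passed.

    SSHExceptions raised before the deadline are swallowed and the call
    is retried after interval seconds; once the deadline has passed, the
    exception is re-raised to the caller.
    """
    deadline = clock() + timeout
    while True:
        try:
            return connect()
        except SSHException:
            if clock() >= deadline:
                raise
            sleep(interval)
```

Passing `clock` and `sleep` as parameters is only a convenience of this sketch; it lets the loop be exercised without real waiting.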

Samuel Just [Thu, 6 Oct 2011 20:33:17 +0000 (13:33 -0700)]

task/watch_notify_stress: watch_notify_stress now thrashes clients

This should exercise the watch notify timeout code.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Sage Weil [Wed, 5 Oct 2011 22:54:57 +0000 (15:54 -0700)]

rgw: keep radosgw in foreground

It defaults to a daemon now.

Josh Durgin [Wed, 5 Oct 2011 00:19:56 +0000 (17:19 -0700)]

Retry listing machines if the lock server goes down.

Sage Weil [Tue, 4 Oct 2011 23:09:32 +0000 (16:09 -0700)]

rgw: use normal logging mechanism

Keep capturing stdout/err, even though it should end up empty.

Signed-off-by: Sage Weil <sage@newdream.net>

Josh Durgin [Tue, 4 Oct 2011 19:32:58 +0000 (12:32 -0700)]

teuthology-worker: clean up last_in_suite jobs

There's no reason not to delete them once they start.

Josh Durgin [Tue, 4 Oct 2011 19:16:30 +0000 (12:16 -0700)]

daemon-helper: detect the signal actually sent

I thought I fixed this when I implemented coverage collection, but I
guess it got lost in a rebase or something.

Josh Durgin [Tue, 4 Oct 2011 00:49:53 +0000 (17:49 -0700)]

ceph_manager: remove unused raw_pg_status method

Josh Durgin [Tue, 4 Oct 2011 00:49:13 +0000 (17:49 -0700)]

ceph_manager: run ceph -s as a normal program

This allows failures from it to be detected better.

Josh Durgin [Tue, 4 Oct 2011 00:05:33 +0000 (17:05 -0700)]

teuthology-results: include passed tests in email

Josh Durgin [Tue, 4 Oct 2011 00:00:45 +0000 (17:00 -0700)]

teuthology-results: include reasons for failure in email

Josh Durgin [Mon, 3 Oct 2011 23:32:42 +0000 (16:32 -0700)]

teuthology-ls: show reasons for failures with -v

Josh Durgin [Mon, 3 Oct 2011 23:08:49 +0000 (16:08 -0700)]

Add failure_reason to summary for the first failure detected.

For now, this is the exception raised during a task, the error found
in the central log, or coredumps found. More specific errors
(e.g. s3-tests had 3 failures) can be added later as exceptions raised
by tasks.

Josh Durgin [Mon, 3 Oct 2011 23:41:17 +0000 (16:41 -0700)]

radosbench: get coverage and cores

Samuel Just [Mon, 3 Oct 2011 21:04:53 +0000 (14:04 -0700)]

watch_notify_stress.py: add ceph flags option

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Samuel Just [Mon, 3 Oct 2011 21:03:36 +0000 (14:03 -0700)]

ceph.py: add btrfs option

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Josh Durgin [Mon, 3 Oct 2011 16:55:58 +0000 (09:55 -0700)]

nuke: keep up with renaming cfuse -> ceph-fuse

Sage Weil [Fri, 30 Sep 2011 16:12:45 +0000 (09:12 -0700)]

radosgw-admin: test additional keys, log list/show/rm

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 29 Sep 2011 05:20:38 +0000 (22:20 -0700)]

tasks/radosgw-admin: test radosgw-admin tool

Not yet complete...

Sage Weil [Thu, 29 Sep 2011 03:50:24 +0000 (20:50 -0700)]

nuke: killall apache2 and radosgw too

Greg Farnum [Fri, 30 Sep 2011 16:26:42 +0000 (09:26 -0700)]

s3-tests: use radosgw-admin instead of radosgw_admin

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Josh Durgin [Thu, 29 Sep 2011 16:09:31 +0000 (09:09 -0700)]

ceph_manager: parse osd numbers with dots

This is necessary since wip-dot-names was merged.

Sage Weil [Fri, 23 Sep 2011 15:57:18 +0000 (08:57 -0700)]

rename c* -> ceph-*

Leave cfuse task name unchanged for now...

Josh Durgin [Fri, 23 Sep 2011 01:23:36 +0000 (18:23 -0700)]

queue: results_timeout needs to be converted to a string

Samuel Just [Thu, 22 Sep 2011 20:23:05 +0000 (13:23 -0700)]

task/watch_notify_stress.py: add simple watch_notify stress test

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Josh Durgin [Wed, 21 Sep 2011 18:05:18 +0000 (11:05 -0700)]

schedule: put results timeout in the job

The default was always being used instead.

Greg Farnum [Tue, 20 Sep 2011 17:04:01 +0000 (10:04 -0700)]

lockfile: increase interval to prevent incorrect locking orders

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Greg Farnum [Thu, 15 Sep 2011 16:24:52 +0000 (09:24 -0700)]

lockfile: don't fail cleanup if no lock procs exist

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Tommi Virtanen [Fri, 16 Sep 2011 18:32:15 +0000 (11:32 -0700)]

workunit: Fetch source from github.

Needed an elaborate dance because Github won't let us download
an archive of a subdirectory.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>

Tommi Virtanen [Fri, 16 Sep 2011 18:09:45 +0000 (11:09 -0700)]

s3tests: Clone repository from github.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>

Tommi Virtanen [Fri, 16 Sep 2011 18:08:38 +0000 (11:08 -0700)]

coverage: Fetch source from github.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>

Samuel Just [Fri, 16 Sep 2011 00:26:03 +0000 (17:26 -0700)]

ceph.py: remove unused variables mds_daemons and mon_daemons

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Samuel Just [Wed, 14 Sep 2011 23:31:58 +0000 (16:31 -0700)]

ceph.py/cephmanager.py: add ctx.daemons for restarting daemons

ctx.daemons will now be an instance of CephState.

ctx.daemons.get_daemon(role, id).stop() to stop daemon, restart() to
restart the daemon, etc.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Samuel Just [Wed, 14 Sep 2011 23:28:06 +0000 (16:28 -0700)]

testsnaps: LD_PRELOAD needed for librados

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Tommi Virtanen [Tue, 13 Sep 2011 21:53:02 +0000 (14:53 -0700)]

Move orchestra to teuthology.orchestra so there's just one top-level package.

Tommi Virtanen [Tue, 13 Sep 2011 21:10:12 +0000 (14:10 -0700)]

Merge orchestra into teuthology.

There are too many things called Orchestra out there,
including Ubuntu's new multi-machine service orchestration
framework. The code might still be beneficial outside of
teuthology, but it can be spun off at that time.

Conflicts:
bootstrap
requirements.txt
setup.py

Tommi Virtanen [Fri, 9 Sep 2011 20:22:03 +0000 (13:22 -0700)]

Callers of task s3tests.create_users don't need to provide dummy "fixtures" dict.

Josh Durgin [Fri, 9 Sep 2011 17:31:08 +0000 (10:31 -0700)]

thrashosds: fix timeout when no options are specified

Josh Durgin [Fri, 9 Sep 2011 01:09:11 +0000 (18:09 -0700)]

thrashosds: fail if cluster doesn't finally become clean in 5 minutes

Josh Durgin [Thu, 8 Sep 2011 21:09:13 +0000 (14:09 -0700)]

thrasher: get coverage and cores from calling ceph commands

Josh Durgin [Thu, 8 Sep 2011 21:07:23 +0000 (14:07 -0700)]

thrashosds: wait for every pg to go active and clean before exiting

Josh Durgin [Thu, 8 Sep 2011 19:54:23 +0000 (12:54 -0700)]

thrasher: clean up a bit

Josh Durgin [Thu, 8 Sep 2011 00:50:12 +0000 (17:50 -0700)]

autotest: allow tests to be run on all clients

Josh Durgin [Wed, 7 Sep 2011 23:54:24 +0000 (16:54 -0700)]

rbd: allow specifying all clients

Greg Farnum [Tue, 6 Sep 2011 18:29:04 +0000 (11:29 -0700)]

locktest: don't fail cleanup if the dir doesn't exist

We're doing this the cheapest way possible: make the dir!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Sage Weil [Sat, 3 Sep 2011 22:07:21 +0000 (15:07 -0700)]

teuthology: do a deep merge of input yaml fragments

Concatenate lists, and recursively combine dicts.

If you specify inputs like

foo:
- a
- b

and

foo:
- c

you should get

foo:
- a
- b
- c

Dicts should also be merged (last one wins), and the merging is deep. E.g.

foo:
   a:
     b:
       c: 1

and

foo:
   a:
     b:
       c: 2

is

foo:
   a:
     b:
       c: 2

Fixes: #1497
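
The merge rules described above (concatenate lists, recursively combine dicts, last one wins otherwise) can be sketched in Python as follows. This is an illustrative reimplementation under those stated rules, not the code from the commit; the function is written to return a new value rather than modify its inputs, which is a choice made for this sketch.

```python
def deep_merge(a, b):
    """Merge b into a and return the result without modifying either.

    - two lists are concatenated
    - two dicts are merged key by key, recursing on shared keys
    - anything else: the later value (b) wins
    (Sketch of the rules in the commit message, not the original code.)
    """
    if isinstance(a, list) and isinstance(b, list):
        return a + b
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            if key in merged:
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = value
        return merged
    return b
```

Applied to the two examples in the message, `deep_merge({'foo': ['a', 'b']}, {'foo': ['c']})` gives `{'foo': ['a', 'b', 'c']}`, and merging the two nested `foo: a: b: c:` fragments leaves `c` set to 2.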

Josh Durgin [Sat, 3 Sep 2011 02:12:16 +0000 (19:12 -0700)]

lock: default to only listing machines you have locked

--all removes this restriction

Josh Durgin [Sat, 3 Sep 2011 00:58:19 +0000 (17:58 -0700)]

rgw: run as an external fastcgi server to match dho

Sage Weil [Fri, 2 Sep 2011 18:07:10 +0000 (11:07 -0700)]

don't eat exceptions for breakfast

fixes 0c2bee1514c1b1e65ca5d52459062e5a45da2d7b

Greg Farnum [Wed, 31 Aug 2011 21:40:55 +0000 (14:40 -0700)]

locktest: make it actually run the executable test

This was missing an argument (the file to run on!) and apparently
that didn't cause the command to output a failure return code.

Additionally, the ceph wrappers were blocking a crash and falsely
reporting success back to teuthology. (Yikes!)

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
