git.apps.os.sepia.ceph.com Git

Josh Durgin [Wed, 9 Nov 2011 00:01:39 +0000 (16:01 -0800)]

Add nuke-on-error option.

This lets automated jobs nuke and unlock machines after failed
tests. Each machine is nuked individually, so one down machine won't
keep others from being nuked and unlocked.
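
The per-machine behaviour can be sketched as follows. This is a minimal illustration, not the code from this commit: `nuke_each`, `nuke_one`, and the machine names are hypothetical.

```python
def nuke_each(machines, nuke_one):
    """Nuke machines one at a time and return the names that failed.

    nuke_one is a hypothetical callable that nukes a single machine
    and raises an exception if the machine cannot be reached.
    """
    failed = []
    for name in machines:
        try:
            nuke_one(name)
        except Exception:
            # A down machine is recorded, but the loop keeps going.
            failed.append(name)
    return failed


nuked = []

def fake_nuke(name):
    # Stand-in for the real nuke step; 'b' plays the down machine.
    if name == 'b':
        raise RuntimeError('machine is down')
    nuked.append(name)

failed = nuke_each(['a', 'b', 'c'], fake_nuke)
```

Handling the exception inside the loop is what keeps one unreachable machine from blocking the rest.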

Tommi Virtanen [Mon, 7 Nov 2011 21:05:14 +0000 (13:05 -0800)]

Fix leftover orchestra import clause.

This seems to be a leftover from
a2372fce12b6bd1818e155d1d8ed5134dbd8fd4a,
no idea how it stayed hidden this long.

Josh Durgin [Thu, 3 Nov 2011 20:27:44 +0000 (13:27 -0700)]

ceph_manager: log ceph -s output so progress is visible in the logs

Josh Durgin [Thu, 3 Nov 2011 20:08:39 +0000 (13:08 -0700)]

Keep each ssh connection alive.

With long-running jobs like thrashing, ssh connections were timing
out.

Josh Durgin [Thu, 3 Nov 2011 20:07:21 +0000 (13:07 -0700)]

connection: allow the caller to specify whether keep-alive should be used

Josh Durgin [Thu, 3 Nov 2011 18:26:45 +0000 (11:26 -0700)]

locker: fix race in locking

The isolation level is lower than I thought. This made it possible for
two clients to think they both locked the same machines, since the
update would still be modifying each row to change the locked_since
time.
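
One common way to close this kind of race, not necessarily the approach taken in this commit, is to make the UPDATE match only unlocked rows and then check how many rows it changed. The sketch below uses sqlite3 as a stand-in database; the `machine` table, its columns, and `try_lock` are assumptions made for illustration.

```python
import sqlite3

def try_lock(conn, name, user):
    """Try to lock one machine; return True only if this caller won."""
    cur = conn.execute(
        "UPDATE machine SET locked = 1, locked_by = ?, "
        "locked_since = CURRENT_TIMESTAMP "
        "WHERE name = ? AND locked = 0",
        (user, name),
    )
    conn.commit()
    # The WHERE clause skips rows that are already locked, so a
    # rowcount of 0 means someone else holds the machine.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE machine (name TEXT PRIMARY KEY, locked INTEGER, "
    "locked_by TEXT, locked_since TEXT)"
)
conn.execute("INSERT INTO machine VALUES ('m1', 0, NULL, NULL)")

first = try_lock(conn, "m1", "alice")
second = try_lock(conn, "m1", "bob")
```

Without the `locked = 0` guard, the second UPDATE would still touch the row (it rewrites locked_since), and both callers would see one affected row.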

Samuel Just [Wed, 2 Nov 2011 18:33:37 +0000 (11:33 -0700)]

testrados: set CEPH_CLIENT_ID without a ;

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Samuel Just [Mon, 31 Oct 2011 21:26:41 +0000 (14:26 -0700)]

testrados: specify CEPH_CONF directly

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Yehuda Sadeh [Thu, 27 Oct 2011 19:11:28 +0000 (12:11 -0700)]

rgw: add user suspend/enable test

Yehuda Sadeh [Thu, 27 Oct 2011 18:32:12 +0000 (11:32 -0700)]

rgw: log-to-stderr is now a binary flag

Samuel Just [Mon, 24 Oct 2011 21:23:48 +0000 (14:23 -0700)]

testrados: rename testsnaps to testrados and make snap testing optional

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Josh Durgin [Mon, 24 Oct 2011 20:52:29 +0000 (13:52 -0700)]

workunit: set PYTHONPATH so we can test python bindings

Sage Weil [Sun, 23 Oct 2011 17:30:27 +0000 (10:30 -0700)]

ceph.conf: python parser doesn't like ; comments

Sage Weil [Sun, 23 Oct 2011 05:16:39 +0000 (22:16 -0700)]

ceph.conf: more frequent osd scrubbing; remove old cruft

Sage Weil [Wed, 19 Oct 2011 17:04:07 +0000 (10:04 -0700)]

ceph_manager: count active+clean+<something else> as active+clean

In my case, one pg was active+clean+scrubbing.

Signed-off-by: Sage Weil <sage@newdream.net>
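
A rough sketch of the counting rule, assuming PG states arrive as `+`-separated strings; `count_active_clean` is a hypothetical helper, not the ceph_manager method itself.

```python
def count_active_clean(pg_states):
    """Count PGs whose state includes both 'active' and 'clean'."""
    count = 0
    for state in pg_states:
        flags = state.split('+')
        # Extra flags such as 'scrubbing' do not disqualify a PG.
        if 'active' in flags and 'clean' in flags:
            count += 1
    return count


states = [
    'active+clean',
    'active+clean+scrubbing',
    'active+degraded',
    'peering',
]
num = count_active_clean(states)
```

Splitting on `+` rather than comparing the whole string is what lets active+clean+scrubbing count as clean.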

Josh Durgin [Thu, 20 Oct 2011 23:28:29 +0000 (16:28 -0700)]

coverage: don't remove ceph tarball

We want to keep it for examining core files, and we're already
fetching it here, once per suite run.

Sage Weil [Mon, 17 Oct 2011 22:32:22 +0000 (15:32 -0700)]

add lost_unfound task

Also some misc useful bits to ceph_manager.

Josh Durgin [Mon, 17 Oct 2011 21:42:03 +0000 (14:42 -0700)]

ceph: add whitelist for cluster log errors

Some messages are expected when thrashing osds or creating unfound
objects.

Fixes: #1622
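
A minimal sketch of whitelist filtering, assuming error lines carry an `[ERR]` marker and the whitelist is a list of regular expressions. The function name, the log line format, and the sample messages are illustrative assumptions, not taken from this commit.

```python
import re

def unexpected_errors(log_lines, whitelist):
    """Return error lines that match none of the whitelist patterns."""
    patterns = [re.compile(p) for p in whitelist]
    errors = [line for line in log_lines if '[ERR]' in line]
    return [line for line in errors
            if not any(p.search(line) for p in patterns)]


log = [
    'osd.0 [INF] scrub ok',
    'osd.1 [ERR] wrongly marked me down',
    'osd.2 [ERR] something unexpected',
]
bad = unexpected_errors(log, [r'wrongly marked me down'])
```

A run would then fail only if `bad` is non-empty, so expected thrashing noise no longer counts as a failure.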

Josh Durgin [Mon, 17 Oct 2011 17:40:16 +0000 (10:40 -0700)]

nuke: reset syslog configuration after rebooting

Previously we removed a file and rebooted without syncing, so the file
was never deleted.

Yehuda Sadeh [Wed, 12 Oct 2011 22:37:33 +0000 (15:37 -0700)]

radosgw-admin: test swift keys creation/removal

Josh Durgin [Fri, 7 Oct 2011 21:51:46 +0000 (14:51 -0700)]

teuthology-worker: remove --keep-locked-on-error

Josh Durgin [Fri, 7 Oct 2011 21:45:01 +0000 (14:45 -0700)]

Remove --keep-locked-on-error, and behave as if it were specified

This will help prevent machines with cephtest dirs still present from
being used. It's easy to unlock machines - the targets yaml fragment
is output during a run.

Josh Durgin [Fri, 7 Oct 2011 00:18:35 +0000 (17:18 -0700)]

reconnect: ignore SSHExceptions before the timeout expires

Fixes: #1587
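
The retry pattern can be sketched like this. The `SSHException` class below is a local stand-in for the SSH library's exception type, and `reconnect`, `connect`, and the timing values are hypothetical.

```python
import time

class SSHException(Exception):
    """Stand-in for the SSH library's exception type (assumption)."""


def reconnect(connect, timeout, interval=0.01):
    """Call connect() until it succeeds or the timeout expires."""
    deadline = time.time() + timeout
    while True:
        try:
            return connect()
        except SSHException:
            # Ignore the failure until the deadline, then re-raise it.
            if time.time() >= deadline:
                raise
            time.sleep(interval)


attempts = []

def flaky_connect():
    # Fails twice, as a rebooting host might, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise SSHException('not up yet')
    return 'connected'

result = reconnect(flaky_connect, timeout=5)
```

Only the SSH-level exception is swallowed; any other error still propagates immediately.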

Samuel Just [Thu, 6 Oct 2011 20:33:17 +0000 (13:33 -0700)]

task/watch_notify_stress: watch_notify_stress now thrashes clients

This should exercise the watch notify timeout code.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Sage Weil [Wed, 5 Oct 2011 22:54:57 +0000 (15:54 -0700)]

rgw: keep radosgw in foreground

It defaults to a daemon now.

Josh Durgin [Wed, 5 Oct 2011 00:19:56 +0000 (17:19 -0700)]

Retry listing machines if the lock server goes down.

Sage Weil [Tue, 4 Oct 2011 23:09:32 +0000 (16:09 -0700)]

rgw: use normal logging mechanism

Keep capturing stdout/err, even though it should end up empty.

Signed-off-by: Sage Weil <sage@newdream.net>

Josh Durgin [Tue, 4 Oct 2011 19:32:58 +0000 (12:32 -0700)]

teuthology-worker: clean up last_in_suite jobs

There's no reason not to delete them once they start.

Josh Durgin [Tue, 4 Oct 2011 19:16:30 +0000 (12:16 -0700)]

daemon-helper: detect the signal actually sent

I thought I fixed this when I implemented coverage collection, but I
guess it got lost in a rebase or something.

Josh Durgin [Tue, 4 Oct 2011 00:49:53 +0000 (17:49 -0700)]

ceph_manager: remove unused raw_pg_status method

Josh Durgin [Tue, 4 Oct 2011 00:49:13 +0000 (17:49 -0700)]

ceph_manager: run ceph -s as a normal program

This allows failures from it to be detected better.

Josh Durgin [Tue, 4 Oct 2011 00:05:33 +0000 (17:05 -0700)]

teuthology-results: include passed tests in email

Josh Durgin [Tue, 4 Oct 2011 00:00:45 +0000 (17:00 -0700)]

teuthology-results: include reasons for failure in email

Josh Durgin [Mon, 3 Oct 2011 23:32:42 +0000 (16:32 -0700)]

teuthology-ls: show reasons for failures with -v

Josh Durgin [Mon, 3 Oct 2011 23:08:49 +0000 (16:08 -0700)]

Add failure_reason to summary for the first failure detected.

For now, this is the exception raised during a task, the error found
in the central log, or coredumps found. More specific errors
(e.g. s3-tests had 3 failures) can be added later as exceptions raised
by tasks.
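
A small sketch of the "first failure wins" rule, assuming the summary is a plain dict. `record_failure` and the `success` key are hypothetical; only `failure_reason` is named in the commit.

```python
def record_failure(summary, reason):
    """Mark the run as failed, keeping only the first reason seen."""
    summary['success'] = False
    # setdefault leaves an existing failure_reason untouched.
    summary.setdefault('failure_reason', reason)


summary = {}
record_failure(summary, 'exception raised during task')
record_failure(summary, 'coredump found')
```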

Josh Durgin [Mon, 3 Oct 2011 23:41:17 +0000 (16:41 -0700)]

radosbench: get coverage and cores

Samuel Just [Mon, 3 Oct 2011 21:04:53 +0000 (14:04 -0700)]

watch_notify_stress.py: add ceph flags option

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Samuel Just [Mon, 3 Oct 2011 21:03:36 +0000 (14:03 -0700)]

ceph.py: add btrfs option

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Josh Durgin [Mon, 3 Oct 2011 16:55:58 +0000 (09:55 -0700)]

nuke: keep up with renaming cfuse -> ceph-fuse

Sage Weil [Fri, 30 Sep 2011 16:12:45 +0000 (09:12 -0700)]

radosgw-admin: test additional keys, log list/show/rm

Signed-off-by: Sage Weil <sage@newdream.net>

Sage Weil [Thu, 29 Sep 2011 05:20:38 +0000 (22:20 -0700)]

tasks/radosgw-admin: test radosgw-admin tool

Not yet complete...

Sage Weil [Thu, 29 Sep 2011 03:50:24 +0000 (20:50 -0700)]

nuke: killall apache2 and radosgw too

Greg Farnum [Fri, 30 Sep 2011 16:26:42 +0000 (09:26 -0700)]

s3-tests: use radosgw-admin instead of radosgw_admin

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Josh Durgin [Thu, 29 Sep 2011 16:09:31 +0000 (09:09 -0700)]

ceph_manager: parse osd numbers with dots

This is necessary since wip-dot-names was merged.

Sage Weil [Fri, 23 Sep 2011 15:57:18 +0000 (08:57 -0700)]

rename c* -> ceph-*

Leave cfuse task name unchanged for now...

Josh Durgin [Fri, 23 Sep 2011 01:23:36 +0000 (18:23 -0700)]

queue: results_timeout needs to be converted to a string

Samuel Just [Thu, 22 Sep 2011 20:23:05 +0000 (13:23 -0700)]

task/watch_notify_stress.py: add simple watch_notify stress test

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Josh Durgin [Wed, 21 Sep 2011 18:05:18 +0000 (11:05 -0700)]

schedule: put results timeout in the job

The default was always being used instead.

Greg Farnum [Tue, 20 Sep 2011 17:04:01 +0000 (10:04 -0700)]

lockfile: increase interval to prevent incorrect locking orders

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Greg Farnum [Thu, 15 Sep 2011 16:24:52 +0000 (09:24 -0700)]

lockfile: don't fail cleanup if no lock procs exist

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Tommi Virtanen [Fri, 16 Sep 2011 18:32:15 +0000 (11:32 -0700)]

workunit: Fetch source from github.

Needed an elaborate dance because Github won't let us download
an archive of a subdirectory.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>

Tommi Virtanen [Fri, 16 Sep 2011 18:09:45 +0000 (11:09 -0700)]

s3tests: Clone repository from github.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>

Tommi Virtanen [Fri, 16 Sep 2011 18:08:38 +0000 (11:08 -0700)]

coverage: Fetch source from github.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>

Samuel Just [Fri, 16 Sep 2011 00:26:03 +0000 (17:26 -0700)]

ceph.py: remove unused variables mds_daemons and mon_daemons

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Samuel Just [Wed, 14 Sep 2011 23:31:58 +0000 (16:31 -0700)]

ceph.py/cephmanager.py: add ctx.daemons for restarting daemons

ctx.daemons will now be an instance of CephState.

ctx.daemons.get_daemon(role, id).stop() to stop daemon, restart() to
restart the daemon, etc.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
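
To illustrate the calling convention, here is a toy registry with the same shape. Only `CephState`, `get_daemon(role, id)`, `stop()`, and `restart()` come from the commit message; `DaemonState`, `add_daemon`, the `running` flag, and all internals are assumptions, and a real daemon object would manage an actual process.

```python
class DaemonState(object):
    """Tracks one daemon; this toy version only flips a flag."""

    def __init__(self, role, id_):
        self.role = role
        self.id_ = id_
        self.running = True

    def stop(self):
        self.running = False

    def restart(self):
        self.stop()
        self.running = True


class CephState(object):
    """Registry of daemons keyed by (role, id)."""

    def __init__(self):
        self.daemons = {}

    def add_daemon(self, role, id_):
        # Hypothetical helper for populating the registry.
        self.daemons[(role, id_)] = DaemonState(role, id_)

    def get_daemon(self, role, id_):
        return self.daemons[(role, id_)]


state = CephState()
state.add_daemon('osd', '0')
state.get_daemon('osd', '0').stop()
stopped = not state.get_daemon('osd', '0').running
state.get_daemon('osd', '0').restart()
restarted = state.get_daemon('osd', '0').running
```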

Samuel Just [Wed, 14 Sep 2011 23:28:06 +0000 (16:28 -0700)]

testsnaps: LD_PRELOAD needed for librados

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

Tommi Virtanen [Tue, 13 Sep 2011 21:53:02 +0000 (14:53 -0700)]

Move orchestra to teuthology.orchestra so there's just one top-level package.

Tommi Virtanen [Tue, 13 Sep 2011 21:10:12 +0000 (14:10 -0700)]

Merge orchestra into teuthology.

There are too many things called Orchestra out there,
including Ubuntu's new multi-machine service orchestration
framework. The code might still be beneficial outside of
teuthology, but it can be spun off at that time.

Conflicts:
bootstrap
requirements.txt
setup.py

Tommi Virtanen [Fri, 9 Sep 2011 20:22:03 +0000 (13:22 -0700)]

Callers of task s3tests.create_users don't need to provide dummy "fixtures" dict.

Josh Durgin [Fri, 9 Sep 2011 17:31:08 +0000 (10:31 -0700)]

thrashosds: fix timeout when no options are specified

Josh Durgin [Fri, 9 Sep 2011 01:09:11 +0000 (18:09 -0700)]

thrashosds: fail if cluster doesn't finally become clean in 5 minutes

Josh Durgin [Thu, 8 Sep 2011 21:09:13 +0000 (14:09 -0700)]

thrasher: get coverage and cores from calling ceph commands

Josh Durgin [Thu, 8 Sep 2011 21:07:23 +0000 (14:07 -0700)]

thrashosds: wait for every pg to go active and clean before exiting

Josh Durgin [Thu, 8 Sep 2011 19:54:23 +0000 (12:54 -0700)]

thrasher: clean up a bit

Josh Durgin [Thu, 8 Sep 2011 00:50:12 +0000 (17:50 -0700)]

autotest: allow tests to be run on all clients

Josh Durgin [Wed, 7 Sep 2011 23:54:24 +0000 (16:54 -0700)]

rbd: allow specifying all clients

Greg Farnum [Tue, 6 Sep 2011 18:29:04 +0000 (11:29 -0700)]

locktest: don't fail cleanup if the dir doesn't exist

We're doing this the cheapest way possible: make the dir!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Sage Weil [Sat, 3 Sep 2011 22:07:21 +0000 (15:07 -0700)]

teuthology: do a deep merge of input yaml fragments

Concatenate lists, and recursively combine dicts.

If you specify inputs like

foo:
- a
- b

and

foo:
- c

you should get

foo:
- a
- b
- c

Dicts should also be merged (last one wins), and the merging is deep. E.g.

foo:
   a:
     b:
       c: 1

and

foo:
   a:
     b:
       c: 2

is

foo:
   a:
     b:
       c: 2

Fixes: #1497
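
The merge rule described above can be sketched in a few lines, assuming the yaml fragments have already been parsed into Python dicts and lists. `deep_merge` is a hypothetical name and this is not necessarily how teuthology implements it.

```python
def deep_merge(a, b):
    """Merge fragment b into fragment a and return the result.

    Lists are concatenated, dicts are merged recursively, and for
    any other type the later value (b) wins.
    """
    if isinstance(a, list) and isinstance(b, list):
        return a + b
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            if key in merged:
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = value
        return merged
    return b


first = {'foo': ['a', 'b'], 'bar': {'a': {'b': {'c': 1}}}}
second = {'foo': ['c'], 'bar': {'a': {'b': {'c': 2}}}}
result = deep_merge(first, second)
```

Here `result['foo']` is the concatenated list and `result['bar']['a']['b']['c']` takes the value from the later fragment, matching the two examples in the commit message.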

Josh Durgin [Sat, 3 Sep 2011 02:12:16 +0000 (19:12 -0700)]

lock: default to only listing machines you have locked

--all removes this restriction

Josh Durgin [Sat, 3 Sep 2011 00:58:19 +0000 (17:58 -0700)]

rgw: run as an external fastcgi server to match dho

Sage Weil [Fri, 2 Sep 2011 18:07:10 +0000 (11:07 -0700)]

don't eat exceptions for breakfast

fixes 0c2bee1514c1b1e65ca5d52459062e5a45da2d7b

Greg Farnum [Wed, 31 Aug 2011 21:40:55 +0000 (14:40 -0700)]

locktest: make it actually run the executable test

This was missing an argument (the file to run on!) and apparently
that didn't cause the command to output a failure return code.

Additionally, the ceph wrappers were blocking a crash and falsely
reporting success back to teuthology. (Yikes!)

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Josh Durgin [Thu, 1 Sep 2011 22:35:27 +0000 (15:35 -0700)]

nuke: synchronize clocks after reboot, and optionally synchronize all clocks

Sage Weil [Wed, 31 Aug 2011 20:56:42 +0000 (13:56 -0700)]

thrashosds: make it work when first mon isn't mon.0

Sage Weil [Wed, 31 Aug 2011 20:21:30 +0000 (13:21 -0700)]

thrashosds: no camelcaps, add some whitespace

Josh Durgin [Thu, 1 Sep 2011 17:44:46 +0000 (10:44 -0700)]

nuke: remove unused import

Josh Durgin [Thu, 1 Sep 2011 17:33:20 +0000 (10:33 -0700)]

nuke: localize imports again so they occur after gevent monkey-patching

This is necessary to make ssh work properly.

Josh Durgin [Thu, 1 Sep 2011 02:46:10 +0000 (19:46 -0700)]

nuke: reboot if rbd is mounted

Josh Durgin [Thu, 1 Sep 2011 00:43:14 +0000 (17:43 -0700)]

schedule: add a way to delete jobs from the queue

Josh Durgin [Thu, 1 Sep 2011 00:13:06 +0000 (17:13 -0700)]

parallel: don't hang if no tasks were spawned

This makes 6d919152178cfbd69dc5d50cdab40fc99db166a6 work.

Josh Durgin [Wed, 31 Aug 2011 23:48:58 +0000 (16:48 -0700)]

workunits: remove unused variable

Josh Durgin [Wed, 31 Aug 2011 21:36:32 +0000 (14:36 -0700)]

nuke: add option to reboot all nodes

Josh Durgin [Wed, 31 Aug 2011 21:36:01 +0000 (14:36 -0700)]

Fix pyflakes warnings.

Josh Durgin [Wed, 31 Aug 2011 00:21:36 +0000 (17:21 -0700)]

coverage: remove debugging

Josh Durgin [Wed, 31 Aug 2011 00:12:14 +0000 (17:12 -0700)]

workunit: save coverage and coredumps

Anything that runs a ceph utility should be using these commands.

Greg Farnum [Tue, 30 Aug 2011 22:48:58 +0000 (15:48 -0700)]

workunits: rework a little bit to allow "all" clients in a run

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Sage Weil [Wed, 24 Aug 2011 21:07:11 +0000 (14:07 -0700)]

cfuse: support running through valgrind

Also switch up the config code so we can take per-client options.

Greg Farnum [Mon, 29 Aug 2011 23:47:22 +0000 (16:47 -0700)]

valgrind: don't run valgrind_post if there's no valgrind

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Greg Farnum [Mon, 29 Aug 2011 20:58:09 +0000 (13:58 -0700)]

valgrind: scan logs for bad results

It's not sophisticated but it will warn you about a node
if at least one node has issues.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Greg Farnum [Mon, 29 Aug 2011 19:39:38 +0000 (12:39 -0700)]

valgrind: use xml output for tools that support it

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Josh Durgin [Mon, 29 Aug 2011 19:42:45 +0000 (12:42 -0700)]

suite: add option to send an email if the entire suite passed

Josh Durgin [Fri, 26 Aug 2011 00:11:33 +0000 (17:11 -0700)]

Generate coverage at the end of a suite run,
and optionally email failures and ongoing jobs.

Josh Durgin [Fri, 26 Aug 2011 00:09:03 +0000 (17:09 -0700)]

queue: delete every job when it finishes, so only running jobs are buried

Josh Durgin [Thu, 4 Aug 2011 01:08:14 +0000 (18:08 -0700)]

Add teuthology-coverage for analyzing test coverage for a suite run.

Josh Durgin [Tue, 14 Jun 2011 18:57:29 +0000 (11:57 -0700)]

Add scripts to analyze coverage for a single teuthology run.

Greg Farnum [Thu, 25 Aug 2011 22:27:30 +0000 (15:27 -0700)]

thrasher: improve documentation a little

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Greg Farnum [Thu, 25 Aug 2011 22:19:30 +0000 (15:19 -0700)]

thrasher: add option to mark OSDs down instead of out.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Greg Farnum [Thu, 25 Aug 2011 22:18:42 +0000 (15:18 -0700)]

thrasher: allow a config to set values

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Greg Farnum [Thu, 25 Aug 2011 21:38:34 +0000 (14:38 -0700)]

thrasher: remove redundant wait_till_clean()

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Greg Farnum [Wed, 24 Aug 2011 23:48:14 +0000 (16:48 -0700)]

coverage: create dir conditionally

We don't need to create the dir if we aren't using coverage.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
