git.apps.os.sepia.ceph.com Git - ceph.git/log
Sage Weil [Sun, 21 Aug 2011 22:14:02 +0000 (15:14 -0700)]
rbd: make default image 10G instead of 1G
Sage Weil [Wed, 10 Aug 2011 20:34:38 +0000 (13:34 -0700)]
suite: support a suite consisting of multiple collections
suite = many collections, and maybe some shared files
collection = a collection of facets
facet = a config fragment
Greg Farnum [Wed, 17 Aug 2011 17:35:37 +0000 (10:35 -0700)]
valgrind: Document!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 17 Aug 2011 17:32:57 +0000 (10:32 -0700)]
Merge branch 'wip-valgrind'
Greg Farnum [Wed, 17 Aug 2011 17:06:58 +0000 (10:06 -0700)]
include log in valgrind log file names
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 17 Aug 2011 17:05:13 +0000 (10:05 -0700)]
ceph task: split up arguments a little more
This allows selective daemon kill signal changes. With valgrind
daemons we want term instead of kill, for instance.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 17 Aug 2011 17:04:31 +0000 (10:04 -0700)]
valgrind: move valgrind logs to log dir
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Mon, 15 Aug 2011 22:35:42 +0000 (15:35 -0700)]
ceph: split up daemon-running arguments and insert valgrind ones
This setup should let us insert other kinds of things too, if we
need them.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Mon, 15 Aug 2011 22:32:23 +0000 (15:32 -0700)]
ceph: Set up valgrind as a flavor, and create a dir for logging.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Mon, 15 Aug 2011 22:31:18 +0000 (15:31 -0700)]
ceph task: pass the full config to the daemon startup subs
So far as I can tell there is no reason to reduce them to
the coverage config, and I want the full config for my
soon-to-exist valgrind options.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Tommi Virtanen [Mon, 15 Aug 2011 16:36:06 +0000 (09:36 -0700)]
Add assert to catch simple typos in roles list.
Input of "roles:\n- [mds,1]" used to make teuthology crash
in a non-obvious way.
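A minimal sketch of the kind of check this commit describes, assuming the roles have already been parsed from YAML into nested lists. The function name `validate_roles` and the error text are hypothetical and are not taken from the teuthology source.

```python
def validate_roles(roles):
    """Check that every role is a string such as 'mds.1'."""
    for host_roles in roles:
        for role in host_roles:
            # YAML parses "[mds,1]" as the list ['mds', 1], so a comma typed
            # in place of a dot shows up as a non-string entry.
            assert isinstance(role, str), \
                "Roles in config must be strings: %r" % (role,)


validate_roles([["mon.a", "mds.a", "osd.0"], ["client.0"]])  # passes

try:
    validate_roles([["mds", 1]])  # what "roles:\n- [mds,1]" parses to
except AssertionError as exc:
    print("rejected:", exc)
```

Failing early with a message that names the bad entry points the user at the typo instead of at a later, unrelated crash.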
Greg Farnum [Wed, 10 Aug 2011 23:16:11 +0000 (16:16 -0700)]
Merge branch 'wip-nuke'
Conflicts:
teuthology/task/kernel.py
Greg Farnum [Tue, 9 Aug 2011 20:30:47 +0000 (13:30 -0700)]
manypools: remove commented-out code
This accidentally got left in from my development.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 10 Aug 2011 23:06:45 +0000 (16:06 -0700)]
teuthology-nuke: split the big main function
It was getting a bit big, but now all the functions fit on
one screen each.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 10 Aug 2011 22:38:57 +0000 (15:38 -0700)]
teuthology-nuke: move it into its own file.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 10 Aug 2011 21:19:23 +0000 (14:19 -0700)]
teuthology-nuke: identify and reboot machines with kernel mounts
This includes untested code for just force-unmounting them
when that works again, but for now it does a full reboot-and-
reconnect cycle.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 10 Aug 2011 17:55:02 +0000 (10:55 -0700)]
teuthology-nuke: use a more robust cfuse mount finder
This way it can remove cfuse mounts in any location on
the system.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 10 Aug 2011 17:47:50 +0000 (10:47 -0700)]
teuthology-nuke: split out different pieces into different loops
This will let us behave more intelligently on things like
nuking kernel mounts.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 10 Aug 2011 17:37:04 +0000 (10:37 -0700)]
Move reconnect function from kernel task to misc.py
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Tommi Virtanen [Wed, 10 Aug 2011 20:40:00 +0000 (13:40 -0700)]
Configure grub to default to the right kernel, not the greatest installed one.
This is sticky; that is, even if you install other kernels (manually/via fab/etc),
grub will keep booting up the one that was last enabled via teuthology config.
Use teuthology to switch kernels and it'll just work.
If the kernel the grub default points to is removed, grub will fall back to
booting the kernel with the greatest version number.
Closes: http://tracker.newdream.net/issues/1364
Tommi Virtanen [Wed, 10 Aug 2011 20:22:14 +0000 (13:22 -0700)]
Handle socket.timeout when waiting for a reconnect.
Now it gets ignored, just like the other harmless socket errors.
Tommi Virtanen [Wed, 10 Aug 2011 20:21:39 +0000 (13:21 -0700)]
Wait up to 300 seconds for a reboot.
At least sepia86 was reliably slower than the previous 180 second default.
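The two commits above (a 300 second reboot window, and ignoring socket.timeout while waiting) can be sketched as a retry loop. This is an illustration only: `wait_for_reconnect`, the `connect` callable, and the `clock`/`sleep` parameters are assumptions, not the code in teuthology.

```python
import socket
import time


def wait_for_reconnect(connect, timeout=300, interval=1,
                       clock=time.monotonic, sleep=time.sleep):
    """Retry connect() until it succeeds or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        try:
            return connect()
        except (socket.timeout, socket.error) as exc:
            # socket.timeout is handled like the other harmless errors
            # seen while the machine is still rebooting.
            if clock() >= deadline:
                raise RuntimeError(
                    "gave up waiting for reconnect: %r" % (exc,))
            sleep(interval)
```

Passing `clock` and `sleep` as parameters keeps the loop testable without waiting in real time.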
Sage Weil [Wed, 10 Aug 2011 19:47:20 +0000 (12:47 -0700)]
ceph: fix max_mds calculation
Signed-off-by: Sage Weil <sage@newdream.net>
Greg Farnum [Wed, 10 Aug 2011 00:17:08 +0000 (17:17 -0700)]
kernel: comment reconnect task, clean up reporting
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Tue, 9 Aug 2011 20:30:47 +0000 (13:30 -0700)]
manypools: remove commented-out code
This accidentally got left in from my development.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Tommi Virtanen [Tue, 9 Aug 2011 23:25:00 +0000 (16:25 -0700)]
Make rbd task use mnt.N not mnt.client.N as mountpoint.
Everything else expects this, so e.g. workunits wouldn't work with rbd.
Tommi Virtanen [Tue, 9 Aug 2011 23:11:32 +0000 (16:11 -0700)]
Make sure workunit task does not create mnt.N by itself.
This used to hide a bug in the rbd task, where rbd
created the mountpoint with the wrong name. The workunits
ended up running against the local filesystem.
Tommi Virtanen [Tue, 9 Aug 2011 22:42:17 +0000 (15:42 -0700)]
Add interactive-on-error, to pause and explore on error.
Closes: http://tracker.newdream.net/issues/1291
Stephon Striplin [Tue, 9 Aug 2011 20:43:46 +0000 (13:43 -0700)]
allow s3tests.create_users defaults to be overridden
Tommi Virtanen [Tue, 9 Aug 2011 20:40:56 +0000 (13:40 -0700)]
Add simple unit test for get_clients.
Sage Weil [Tue, 9 Aug 2011 20:23:58 +0000 (13:23 -0700)]
Revert "fix get_clients"
This reverts commit
83b6678e79904793bf31e82bbecad7bf16c1b2b5. The bug I was
hitting was actually fixed by
06e3e69c293b20c0ce5df526fa923a979c1d8cfc.
Gregory Farnum [Mon, 1 Aug 2011 20:19:15 +0000 (13:19 -0700)]
teuthology: add task manypools
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Fri, 5 Aug 2011 21:35:22 +0000 (14:35 -0700)]
new gitbuilder ref/branch naming
no origin_ prefix
Sage Weil [Thu, 4 Aug 2011 22:03:05 +0000 (15:03 -0700)]
cfuse, kclient: print remote host
Sage Weil [Thu, 4 Aug 2011 22:01:49 +0000 (15:01 -0700)]
fix get_clients
Only return the clients that are listed (not _all_ clients). There might
be a combination of cfuse and kclient (or other) clients here!
Sage Weil [Thu, 4 Aug 2011 17:41:50 +0000 (10:41 -0700)]
tasks/kclient: don't clobber remote
Sage Weil [Thu, 28 Jul 2011 17:28:57 +0000 (10:28 -0700)]
use coverage_dir
Josh Durgin [Fri, 5 Aug 2011 18:17:28 +0000 (11:17 -0700)]
kernel: install in parallel
Josh Durgin [Fri, 5 Aug 2011 18:08:02 +0000 (11:08 -0700)]
kernel: debug weird socket exceptions
Josh Durgin [Fri, 5 Aug 2011 18:07:40 +0000 (11:07 -0700)]
kernel: reboot immediately after installing
This hides the latency of rebooting when installing on many machines.
Josh Durgin [Fri, 5 Aug 2011 17:59:16 +0000 (10:59 -0700)]
Down machines shouldn't be considered free.
Josh Durgin [Fri, 5 Aug 2011 01:32:57 +0000 (18:32 -0700)]
Make scheduled tasks leave some machines free.
Josh Durgin [Thu, 4 Aug 2011 22:19:13 +0000 (15:19 -0700)]
Log connections to targets
This way you can tell which machines have problems in case of an
error.
Josh Durgin [Wed, 3 Aug 2011 22:28:46 +0000 (15:28 -0700)]
teuthology-worker: log to a file with timestamps
Josh Durgin [Wed, 3 Aug 2011 21:52:55 +0000 (14:52 -0700)]
teuthology-nuke: run in parallel, and print each node being nuked
Josh Durgin [Wed, 3 Aug 2011 20:59:57 +0000 (13:59 -0700)]
Set success at the beginning of a run.
This way internal tasks like locking can tell whether the run
succeeded, and unlock nodes if it did.
Josh Durgin [Wed, 3 Aug 2011 18:21:32 +0000 (11:21 -0700)]
teuthology-nuke: reset rsyslog config
Josh Durgin [Wed, 3 Aug 2011 00:56:49 +0000 (17:56 -0700)]
teuthology-worker: keep machines locked on error
This prevents a failure to clean up in one case from affecting the
rest of the tests.
Josh Durgin [Tue, 2 Aug 2011 23:13:28 +0000 (16:13 -0700)]
teuthology-lock: update usage
Josh Durgin [Tue, 2 Aug 2011 22:53:37 +0000 (15:53 -0700)]
teuthology-lock: allow list of locks to be filtered by owner and status
Greg Farnum [Fri, 29 Jul 2011 17:35:02 +0000 (10:35 -0700)]
teuthology: convert from bzip2 to gzip.
gzip is much, much faster on large log files. With a 7.7GB client log, gzip
took 2:45 to compress it to 624MB. bzip2 took 34:38 to compress it to
366MB. For our purposes the space savings are not worth the time loss.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
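The trade-off quoted in this commit message can be worked through with a few lines of arithmetic. The figures are the ones given above; the variable names are illustrative.

```python
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds


gzip_seconds = to_seconds(2, 45)     # 165
bzip2_seconds = to_seconds(34, 38)   # 2078
gzip_mb, bzip2_mb = 624, 366

slowdown = bzip2_seconds / gzip_seconds          # how many times longer bzip2 ran
extra_mb = gzip_mb - bzip2_mb                    # 258 MB more disk used by gzip
extra_minutes = (bzip2_seconds - gzip_seconds) / 60

print("bzip2 took %.1fx as long as gzip" % slowdown)
print("gzip output is %d MB larger" % extra_mb)
print("bzip2 cost %.1f extra minutes" % extra_minutes)
```

On these numbers bzip2 saves 258 MB per log at the cost of roughly half an hour of extra compression time.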
Sage Weil [Thu, 28 Jul 2011 17:25:30 +0000 (10:25 -0700)]
set max_mds based on non-standbys
Sage Weil [Wed, 27 Jul 2011 18:45:20 +0000 (11:45 -0700)]
no ++ in python
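For context, Python has no increment operator. The commit body is not shown here, so which form was being fixed is unknown; the snippet below only illustrates the language behavior.

```python
count = 1
same = ++count   # parsed as +(+count): two unary pluses, no increment
count += 1       # the idiomatic way to increment

print(same, count)
```

The postfix form `count++` is a syntax error, while the prefix form runs silently and leaves the value unchanged, which makes it the easier mistake to miss.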
Sage Weil [Wed, 27 Jul 2011 18:45:13 +0000 (11:45 -0700)]
roles/3-simple: include a standby mds
Sage Weil [Wed, 27 Jul 2011 17:04:37 +0000 (10:04 -0700)]
configure mds's with -s suffix as standby
Sage Weil [Wed, 27 Jul 2011 05:06:49 +0000 (22:06 -0700)]
roles: use letters for mon, mds names
Sage Weil [Wed, 27 Jul 2011 04:46:47 +0000 (21:46 -0700)]
tolerate named (not numbered) mons
Sage Weil [Wed, 27 Jul 2011 04:52:39 +0000 (21:52 -0700)]
specify and clean up admin socket
Josh Durgin [Wed, 20 Jul 2011 01:37:05 +0000 (18:37 -0700)]
lock server: configure for apache with mod_wsgi
Josh Durgin [Wed, 20 Jul 2011 01:34:42 +0000 (18:34 -0700)]
Set content-type with PUT.
Josh Durgin [Wed, 20 Jul 2011 00:24:49 +0000 (17:24 -0700)]
schedule: make default owner different from that of a normal run
This way the machines locked by scheduled jobs aren't confused
with those locked by manual runs, so they're harder to accidentally
unlock.
Josh Durgin [Wed, 20 Jul 2011 00:11:12 +0000 (17:11 -0700)]
Update example targets in readme.
Josh Durgin [Tue, 19 Jul 2011 23:24:50 +0000 (16:24 -0700)]
Remove print that clutters the worker logs.
Josh Durgin [Fri, 15 Jul 2011 22:04:08 +0000 (15:04 -0700)]
Connect without using any known_hosts files.
Josh Durgin [Thu, 14 Jul 2011 23:47:29 +0000 (16:47 -0700)]
Make targets a dictionary mapping hosts to ssh host keys.
Josh Durgin [Thu, 14 Jul 2011 00:14:52 +0000 (17:14 -0700)]
Add command to update ssh hostkeys.
Josh Durgin [Thu, 14 Jul 2011 22:26:49 +0000 (15:26 -0700)]
lock server: return host pubkeys with locked machine names
Josh Durgin [Thu, 14 Jul 2011 22:10:50 +0000 (15:10 -0700)]
lock server: allow sshpubkey to be updated
Josh Durgin [Fri, 15 Jul 2011 21:59:33 +0000 (14:59 -0700)]
Update lock db schema.
Josh Durgin [Sat, 16 Jul 2011 00:15:09 +0000 (17:15 -0700)]
Add an overrides section for the ceph task.
This lets you run a suite against a particular version of ceph, or
with special debug settings.
Josh Durgin [Thu, 14 Jul 2011 20:57:07 +0000 (13:57 -0700)]
Better interface for running functions in parallel.
Josh Durgin [Thu, 14 Jul 2011 18:15:55 +0000 (11:15 -0700)]
Merge branch 'wip-parallel'
Sage Weil [Tue, 12 Jul 2011 03:37:48 +0000 (20:37 -0700)]
ceph.conf: remove other random bits
obsolete sections, mds tuning. stick with defaults.
Josh Durgin [Wed, 13 Jul 2011 20:15:28 +0000 (13:15 -0700)]
fusermount runs on a single mount point.
Josh Durgin [Wed, 22 Jun 2011 17:57:16 +0000 (10:57 -0700)]
Download ceph binaries in parallel.
Josh Durgin [Wed, 22 Jun 2011 17:56:40 +0000 (10:56 -0700)]
Run workunits on different clients in parallel.
Josh Durgin [Wed, 22 Jun 2011 17:53:10 +0000 (10:53 -0700)]
Download and run autotests on multiple clients in parallel.
These clients must still be on different machines,
or they'll clobber each other's results.
Josh Durgin [Wed, 22 Jun 2011 17:50:09 +0000 (10:50 -0700)]
Add a utility for running functions in parallel.
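A rough sketch of what a helper for running functions in parallel can look like. It uses the standard library's `concurrent.futures`, which is an assumption made for this illustration; the name `run_in_parallel` is hypothetical and this is not the utility added by the commit.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(func, items):
    """Call func(item) for every item concurrently; return results in input order."""
    items = list(items)
    if not items:
        return []
    with ThreadPoolExecutor(max_workers=len(items)) as pool:
        futures = [pool.submit(func, item) for item in items]
        # result() re-raises any exception raised inside a worker.
        return [future.result() for future in futures]


print(run_in_parallel(lambda host: "nuked " + host, ["sepia1", "sepia2"]))
```

Threads suit this kind of work because the tasks mentioned in the log (downloads, remote commands) spend most of their time waiting on the network.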
Tommi Virtanen [Wed, 13 Jul 2011 19:38:12 +0000 (12:38 -0700)]
Merge branch 'localdir'
Conflicts:
teuthology/task/ceph.py
Tommi Virtanen [Wed, 13 Jul 2011 19:34:39 +0000 (12:34 -0700)]
Feed locally-created binary tarball to remotes in parallel.
This should be faster as long as we have the bandwidth for it.
Tommi Virtanen [Wed, 13 Jul 2011 19:18:55 +0000 (12:18 -0700)]
Use a nameless tempfile for local tarball, avoids cleanup.
Tommi Virtanen [Wed, 13 Jul 2011 19:07:36 +0000 (12:07 -0700)]
More careful error checking, avoid need for shell quoting.
Tommi Virtanen [Wed, 13 Jul 2011 18:32:28 +0000 (11:32 -0700)]
Clean up tarball tmpdir in all cases.
Prefer shutil.rmtree over os.system('rm -rf ...').
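A short sketch of the cleanup pattern named in this commit, assuming a temporary directory created with `tempfile.mkdtemp`. The helper name `with_temp_dir` and the directory prefix are hypothetical.

```python
import os
import shutil
import tempfile


def with_temp_dir(work):
    """Run work(tmpdir) in a fresh temp dir and always remove it afterwards."""
    tmpdir = tempfile.mkdtemp(prefix="teuthology-tarball-")
    try:
        return work(tmpdir)
    finally:
        # No shell is involved, so paths containing spaces or quotes need
        # no escaping, and cleanup runs even if work() raises.
        shutil.rmtree(tmpdir)
```

The `try`/`finally` covers the "in all cases" part: the directory is removed on success and on error alike.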
Tommi Virtanen [Wed, 13 Jul 2011 17:58:01 +0000 (10:58 -0700)]
Use tempfile instead of ad hoc temp dir creation.
Tommi Virtanen [Wed, 13 Jul 2011 17:44:33 +0000 (10:44 -0700)]
Remove TODO note covered by teuthology-nuke.
Tommi Virtanen [Wed, 13 Jul 2011 17:17:04 +0000 (10:17 -0700)]
Avoid identifier clash with builtin "dir".
Sage Weil [Tue, 12 Jul 2011 03:32:34 +0000 (20:32 -0700)]
ceph.conf: clean out random debug level changes
keep it simple!
Sage Weil [Tue, 12 Jul 2011 03:32:07 +0000 (20:32 -0700)]
include sha1 in summary
Redundant (there's also a ceph-sha1 file), but convenient.
Sage Weil [Tue, 12 Jul 2011 03:31:37 +0000 (20:31 -0700)]
ls: mention directories without summary.yaml
Josh Durgin [Tue, 12 Jul 2011 01:04:09 +0000 (18:04 -0700)]
Clean up from pyflakes.
Josh Durgin [Tue, 12 Jul 2011 01:00:03 +0000 (18:00 -0700)]
Whitespace and style cleanup.
Josh Durgin [Tue, 12 Jul 2011 00:39:10 +0000 (17:39 -0700)]
Remove unused variable.
Josh Durgin [Tue, 12 Jul 2011 00:34:36 +0000 (17:34 -0700)]
Success of test may not have been set yet.
Greg Farnum [Mon, 11 Jul 2011 23:40:29 +0000 (16:40 -0700)]
add locktest task
This will retrieve xfstests' locktest and run it on two clients.
I still need to tweak this so the logging output we get is more useful, and
so that we test extra features like wait locks, but it does execute.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Thu, 7 Jul 2011 22:40:37 +0000 (15:40 -0700)]
task ceph: distribute monmap to all nodes, not just mons.
And clean up the monmap, too!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Josh Durgin [Mon, 11 Jul 2011 22:48:42 +0000 (15:48 -0700)]
Add an option to keep machines locked if a test fails.
Sage Weil [Mon, 11 Jul 2011 22:25:36 +0000 (15:25 -0700)]
lock: specify machines as input yaml targets: clause
Sage Weil [Mon, 11 Jul 2011 21:49:53 +0000 (14:49 -0700)]
print --lock-many result as yaml targets: stanza
Sage Weil [Mon, 11 Jul 2011 22:27:50 +0000 (15:27 -0700)]
clean up locked machine list
Sage Weil [Mon, 11 Jul 2011 21:39:21 +0000 (14:39 -0700)]
tell user which machines you locked