Joe Buck [Wed, 20 Feb 2013 19:58:45 +0000 (11:58 -0800)]
teuthology: add an extra_packages flag to install
Some tests require additional packages
(e.g., java bindings, hadoop bindings).
Extend the install task to allow for those
packages to be specified in the yaml files.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
task: mon_thrash: Thrash multiple monitors and 'maintain-quorum' option
Add a new option 'thrash-many' that, when set to true, overrides
the default behaviour of killing only one monitor at a time. Instead,
this option selects up to the maximum number of killable monitors to
kill in each round.
We also add a new 'maintain-quorum' option that limits the number of
monitors that can be killed in each thrashing round. If set to true, this
option caps the number of killable monitors at (n/2-1). This
means that a configuration with at most two monitors has no killable
monitors when 'maintain-quorum' is set to true, so this task won't
run -- in such a scenario, this option should be set to false.
Furthermore, if 'store-thrash' is set to true, then 'maintain-quorum' must
also be set to true, as we cannot let the task thrash all the monitor
stores, or we wouldn't be able to sync from other monitors, nor can we
let quorum be dropped, or we won't be able to resync our way into quorum.
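
The limits described above can be sketched as follows. This is only an illustration, not the task's actual code: the function names are hypothetical, rounding n/2 up is one reading of (n/2-1) that matches the two-monitor case, and returning every monitor when 'maintain-quorum' is false is an assumption.

```python
import math


def max_killable(num_mons, maintain_quorum=True):
    """Upper bound on monitors to kill in one thrashing round."""
    if not maintain_quorum:
        # Assumption: without the quorum guard, every monitor is fair game.
        return num_mons
    # Assumption: n/2 is rounded up, so 1 or 2 monitors give 0 killable
    # and a strict majority always stays alive.
    return max(int(math.ceil(num_mons / 2.0)) - 1, 0)


def check_options(store_thrash, maintain_quorum):
    """Reject the combination the message describes as unsafe."""
    if store_thrash and not maintain_quorum:
        raise ValueError("'store-thrash' requires 'maintain-quorum'")
```

Under these assumptions, max_killable(2) is 0, which is the case where the task cannot run with 'maintain-quorum' enabled.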
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
task: mon_thrash: Add 'seed' and 'store-thrash' options
This patch introduces an option to thrash a monitor store when we thrash
the monitors, as well as a 'store-thrash-probability' option (defaulting
to 50%).
We also took this opportunity to introduce a new 'seed' option, which ought
to allow a given run of this task to be reproducible. This might come in
handy when attempting to reproduce a given behavior that would otherwise
be randomly triggered.
You should note that while the 'seed' option will indeed mimic past
behaviors, this only applies to a past behavior of this task: other tasks
are not affected by this value, nor are any workunits or even ceph daemons.
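
A minimal sketch of how a seed can make the task's choices repeatable, assuming a private random.Random instance so that nothing outside the task is affected. The function name and the shape of the returned plan are hypothetical; only the 50% default comes from the message above.

```python
import random


def plan_rounds(mons, rounds, seed, store_thrash_probability=0.5):
    """Return a list of (monitor, thrash_store) decisions, one per round."""
    # A dedicated generator keeps the seed local to this task.
    rng = random.Random(seed)
    plan = []
    for _ in range(rounds):
        victim = rng.choice(mons)
        thrash_store = rng.random() < store_thrash_probability
        plan.append((victim, thrash_store))
    return plan
```

Calling plan_rounds twice with the same monitors, round count, and seed yields the same plan, which is the reproducibility the 'seed' option is meant to give.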
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Sage Weil [Thu, 21 Feb 2013 05:46:37 +0000 (21:46 -0800)]
install: be more careful about package removal
- call apt separately for each package; it will error out annoyingly if
there is one in the list not in the APT sources.
- use dpkg with appropriate force to clean up broken half-installs.
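
The per-package approach might look roughly like the sketch below, which only builds the command lines. The helper names are hypothetical, and --force-remove-reinstreq is an assumed choice for the "appropriate force"; the message does not say which dpkg option is used.

```python
def purge_commands(packages):
    """One apt-get call per package, so an unknown package fails alone."""
    return [["sudo", "apt-get", "-y", "purge", pkg] for pkg in packages]


def cleanup_commands(packages):
    """dpkg calls for packages left in a broken half-installed state."""
    # Assumption: this force option is what clears half-installs here.
    return [
        ["sudo", "dpkg", "--force-remove-reinstreq", "--purge", pkg]
        for pkg in packages
    ]
```

Running each command separately means a failure for one package does not stop the others from being removed.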
Sage Weil [Mon, 18 Feb 2013 23:36:34 +0000 (15:36 -0800)]
schedule_suite.sh: include install task in all jobs
This is probably temporary. It's simpler than adding the task to every
job in the suite. We'll want to do that later when we want to test
alternative install methods (like ceph-deploy's install function).
Sage Weil [Fri, 15 Feb 2013 23:39:02 +0000 (15:39 -0800)]
ceph: simplify package removal
apt-get doesn't have a nice way to tell if the package is not installed and
we don't need to purge it. Well, not one I found in 5 minutes. Just
do a big purge and assume it works, or failed because there was nothing to
be done.
Sander Pool [Wed, 6 Feb 2013 19:16:52 +0000 (19:16 +0000)]
Install ceph debs and use installed debs
The ceph task installs ceph using the debian
packages now, and all invocations of binaries installed
in {tmpdir}/binary/usr/local/bin/ are replaced with
the use of the binaries installed in standard locations
by the debs.
Author: Sander Pool <sander.pool@inktank.com>
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Sandon Van Ness [Fri, 8 Feb 2013 00:34:14 +0000 (16:34 -0800)]
Merge to include --machine-type and changes to --summary
Added the ability to support multiple types of machines with
--machine-type added to teuthology-lock when used with --lock-many
or --machine-type with teuthology --lock (automated tests). It
defaults to 'plana'; the 'vps' type is currently unused but
should be used in the future.
Also updated teuthology-lock --summary to be machine type aware
and sort things in a nice output.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Sandon Van Ness [Tue, 5 Feb 2013 20:53:08 +0000 (12:53 -0800)]
Added support for multiple types of machines.
Added the ability to support multiple types of machines with
--machine-type added to teuthology-lock when used with --lock-many
or --machine-type with teuthology --lock (automated tests). It
defaults to 'plana'; the 'vps' type is currently unused but
should be used in the future.
Signed-off-by: Sandon Van Ness <sandon@van-ness.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin [Wed, 6 Feb 2013 07:31:37 +0000 (23:31 -0800)]
nuke: don't try unmount if we're rebooting everything anyway
This can cause issues when unmount hangs. Our automatic runs reboot
everything unconditionally, so this caused a bunch of unnecessary hangs
when an fs was accidentally rendered un-unmountable.
Sam Lang [Tue, 5 Feb 2013 22:20:52 +0000 (16:20 -0600)]
misc: Close connections on reboot
When nodes are rebooted, the connections remain open
even after calling reconnect and setting up new ssh
sessions to the rebooted nodes. This causes ECONNRESET
errors to show up in the teuthology output.
Close the existing connections before trying to reconnect.
Sam Lang [Tue, 5 Feb 2013 16:38:48 +0000 (10:38 -0600)]
task/ceph_manager: Fix NoneType config issue
kill_mon is getting a config set to None, which blows
up now due to the check for powercycle. Initialize
the config to an empty dict if we don't get anything
on init. This is the error showing up in teuthology:
2013-02-04T15:04:16.595 ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x1fcafd0>
Traceback (most recent call last):
  File "/var/lib/teuthworker/teuthology-master/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/mon_thrash.py", line 142, in task
    thrash_proc.do_join()
  File "/var/lib/teuthworker/teuthology-master/teuthology/task/mon_thrash.py", line 69, in do_join
    self.thread.get()
  File "/var/lib/teuthworker/teuthology-master/virtualenv/local/lib/python2.7/site-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
AttributeError: 'NoneType' object has no attribute 'get'
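
The kind of fix described in this message can be sketched as below. The class and method names are hypothetical stand-ins rather than the real ceph_manager code, and the 'powercycle' key name is assumed from the message's wording.

```python
class Manager(object):
    """Hypothetical stand-in for the object that receives the task config."""

    def __init__(self, config=None):
        # Fall back to an empty dict so later .get() calls cannot hit None.
        self.config = config if config is not None else {}

    def wants_powercycle(self):
        # Assumed key name; with config=None this used to raise AttributeError.
        return self.config.get("powercycle", False)
```

With this default in place, Manager(None).wants_powercycle() returns False instead of raising the AttributeError shown in the traceback.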