Sage Weil [Tue, 23 Jul 2013 21:43:56 +0000 (14:43 -0700)]
ceph: add wait_for_mon_quorum command
tasks:
...
- ceph.wait_for_mon_quorum: [a, b]
...
will block until the mon quorum consists of exactly [a, b]. This is
compared directly to the relevant field from 'ceph quorum_status'
which has the alphanumeric names only.
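As a rough illustration of the comparison described above, the following sketch polls a status source until the quorum matches. It is not the task's actual code: the `quorum_names` field name, the `get_status` callable, and the order-insensitive comparison are all assumptions made for the example.

```python
import json
import time


def quorum_matches(status_json, expected):
    """Return True if the quorum consists of exactly the expected mon names."""
    status = json.loads(status_json)
    # 'quorum_names' is assumed to be the field that lists the mon names.
    # Sorting makes the comparison order-insensitive (also an assumption).
    return sorted(status.get('quorum_names', [])) == sorted(expected)


def wait_for_mon_quorum(get_status, expected, interval=1.0, sleep=time.sleep):
    """Block until get_status() reports exactly the expected quorum.

    get_status is a hypothetical callable that returns the JSON text
    printed by 'ceph quorum_status'.
    """
    while not quorum_matches(get_status(), expected):
        sleep(interval)
```

Passing the status source in as a callable keeps the polling loop independent of how the command is actually run on the remote.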
Sage Weil [Mon, 22 Jul 2013 20:03:24 +0000 (13:03 -0700)]
sequential, parallel: allow entries to be references to top-level config
Often we want to build a test collection that substitutes different
sequences of tasks into a parallel/sequential construction. However, the
yaml combination that happens when generating jobs is not smart enough to
substitute some fragment into a deeply-nested piece of yaml.
Instead, make these sequences top-level entries in the config dict, and
reference them. For example:
task: mon_clock_skew_check: grab max-skew value from ceph-mon's config
Instead of relying on hardcoded values, obtain the max-skew default from
'ceph-mon --show-config-value mon_clock_drift_allowed' to match the mon's
expectation.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
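A minimal sketch of reading the value from the command quoted above, rather than hardcoding it. The helper names and the injectable `run` parameter are hypothetical, and the sketch assumes the command prints a single numeric value.

```python
import subprocess


def parse_config_value(output):
    """Parse the text printed for a numeric config value into a float."""
    return float(output.strip())


def get_max_skew(run=subprocess.check_output):
    """Ask ceph-mon for its mon_clock_drift_allowed default.

    'run' is a hypothetical hook so the command runner can be replaced;
    by default it runs the command locally.
    """
    out = run(['ceph-mon', '--show-config-value', 'mon_clock_drift_allowed'])
    if isinstance(out, bytes):
        out = out.decode()
    return parse_config_value(out)
```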
Update to describe tasks and parameters to tasks, including the install
parameters requested in 4470. Added more information to the vm section,
and included a section documenting the test suites.
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Fixes: 4470
Reviewed-by: Dan Mick and Alfredo Deza
Worker processes by machine type instead of teuthology branch.
teuthology-suite and schedule will now take --worker instead of
--branch. The branch is set via teuthology_branch in the
yaml used to schedule the job.
The teuthology branches are assumed to be in ~/teuthology-$branch
of whatever user is running the workers.
Sage Weil [Wed, 17 Jul 2013 00:15:55 +0000 (17:15 -0700)]
workunit: set CEPH_CLI_TEST_DUP_COMMAND
This will make the CLI do every mon command twice and make sure they both
succeed. This catches problems with mon command idempotency faster than
waiting for random failures to trigger.
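A small sketch of how a runner might pass the variable to a workunit's environment. The helper name is hypothetical, and the value '1' is an assumption, since the commit message names only the variable.

```python
import os


def workunit_env(base=None):
    """Return a copy of the environment with the duplicate-command flag set."""
    env = dict(os.environ if base is None else base)
    # The value '1' is an assumption; only the variable name is given above.
    env['CEPH_CLI_TEST_DUP_COMMAND'] = '1'
    return env
```

The returned dict can be handed to `subprocess.run(..., env=workunit_env())` without modifying the caller's own environment.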
Created tasktest to test sequential and parallel tasks.
Added sequential task and parallel task.
Changed _run_one_task to run_one_task (now called by new tasks too).
VM: Use mac addresses from DB instead of randomizing.
To make IP addresses less likely to change, and to allow a smaller
DHCP pool to be used, I generated static MAC addresses for all the
vpm entries in the DB. I also recorded the correct primary (eth0)
MAC address for all the other types of machines, to keep things
standardized and to have another location for this information.
Without this fix, going through a few tests would exhaust the DHCP
pool, which at the time held around 460 IP addresses for virtual
machines and has since been raised to ~690 IP addresses.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
Fix for #5494 (despite its poor description). Instead of adding a
wait, the code used to detect whether the guest is back up is fixed.
The previous code appeared to assume only one machine and broke
when it was waiting for multiple machines if the guests did not
come up within 10 seconds of each other.
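The idea of tracking every guest separately, rather than assuming a single machine, can be sketched as follows. This is not the repository's code: the function name, the `is_up` callable, and the timeout and interval defaults are all hypothetical.

```python
import time


def wait_for_guests(hosts, is_up, timeout=300.0, interval=5.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Block until every host reports up, however far apart they come up.

    is_up is a hypothetical callable taking a host name and returning
    True once that guest is reachable.
    """
    pending = set(hosts)
    deadline = clock() + timeout
    while pending:
        # Re-check only the guests that have not come up yet.
        pending = {h for h in pending if not is_up(h)}
        if not pending:
            break
        if clock() >= deadline:
            raise RuntimeError('guests still down: %s' % sorted(pending))
        sleep(interval)
```

Keeping a set of pending hosts means a slow guest cannot be mistaken for a ready one just because another guest has already answered.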
Make nuke not do the normal stuff if the machine is a VPS as we
just destroy them when they get unlocked.
Instead of getting downburst options from ~/.teuthology.yaml, get
them from the yaml given to teuthology for the test/task.
Fixed an error that kept all the default downburst values
from taking effect if any of them were set via a yaml.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
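The defaults fix described above amounts to merging per-key instead of replacing the whole set of defaults. A minimal sketch, with a hypothetical function name and with option keys supplied by the caller:

```python
def merge_downburst_config(defaults, overrides):
    """Return defaults updated key by key with any values set in the yaml.

    Keys absent from overrides keep their default values, so setting
    one option no longer discards the other defaults.
    """
    merged = dict(defaults)
    merged.update(overrides or {})
    return merged
```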
Sandon Van Ness [Fri, 21 Jun 2013 01:36:58 +0000 (18:36 -0700)]
Wipe out existing id_rsa.pub and id_rsa before pushing ssh keys
A very simple change. Just touch the file first (to create it if it
doesn't yet exist, so the delete doesn't error out) and then delete
it before pushing the keys to the file. This should keep the
id_rsa.pub and id_rsa files from getting messed up by previous
runs which were interrupted or failed (or by those files existing
for some other reason). This appears to be what was causing breakage
in the ceph-deploy nightlies.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
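A local Python equivalent of the touch-then-delete step, shown only to illustrate the pattern; the function name is hypothetical, and the actual change presumably runs commands on the remote hosts.

```python
import os


def reset_key_file(path):
    """Remove path whether or not it already exists."""
    # Touch: opening in append mode creates the file if it is missing
    # and leaves an existing file untouched.
    with open(path, 'a'):
        pass
    # The delete can no longer fail because the file is absent.
    os.remove(path)
```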
Samuel Just [Tue, 4 Jun 2013 21:11:29 +0000 (14:11 -0700)]
task/: add args.py
The usage doc string for a task is tedious to write and
hard to keep reconciled with the code as defaults are changed.
args.py includes a helper to put it all in one place.
Sandon Van Ness [Mon, 17 Jun 2013 23:24:37 +0000 (16:24 -0700)]
Use authorized_keys2 instead of authorized_keys
Instead of going through the trouble of adding/removing lines
from authorized_keys, which has all our normal keys in it,
push keys to the unused authorized_keys2 file. This makes key
management significantly simpler, as that file can just be wiped
out each time instead of worrying about preserving contents.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
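The simplification can be sketched as an overwrite of a dedicated file. The function name and arguments are hypothetical, and the sketch writes to a local directory rather than a remote host.

```python
import os


def push_keys(ssh_dir, public_keys):
    """Overwrite authorized_keys2 with exactly the given public keys."""
    path = os.path.join(ssh_dir, 'authorized_keys2')
    # Mode 'w' truncates the file, so earlier test keys are wiped and
    # authorized_keys itself is never edited.
    with open(path, 'w') as f:
        for key in public_keys:
            f.write(key.rstrip('\n') + '\n')
    return path
```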