Kefu Chai [Fri, 24 Aug 2018 13:15:31 +0000 (21:15 +0800)]
suite/run,schedule,result: write rerun memo as the first job in suite
so we don't need to wait for a job to write the results before rerunning
the test suite. without this change, the "result" job is normally the last
job in the suite to be scheduled, so it's likely we will not have
results.log until the suite is almost completed. after this change,
a "first-in-suite" job is scheduled as the first job to note down
the subset and seed used to run the suite.
Kefu Chai [Sat, 4 Aug 2018 00:42:02 +0000 (00:42 +0000)]
setup.py,requirements.txt: add pytest
pytest requires pluggy >= 0.7, while we always use pluggy 0.6, as
specified by requirements.txt, as this version is good enough for
tox. but in tox.ini, we do use pytest, and no version is specified,
so we have a good chance of running into https://github.com/pytest-dev/pytest/issues/3753
also, remove pytest from tox.ini, as this dependency has been
added in requirements.txt
install: support extra_system_packages config option
On DEB systems, packages specified via the extra_packages option are installed
while forcing the same package version number as the version of the project
(i.e. Ceph) under test. So extra_packages can only be used to specify
additional project (Ceph) packages.
If we wanted to specify additional system (non-project, non-Ceph) packages to
install, we were out of luck. This commit implements an extra_system_packages
option for specifying extra non-project packages.
A branch name containing a slash is perfectly legal in git, but
teuthology uses branch names verbatim in run names, which causes POSTs
to fail when submitting runs to paddles. Replace all '/' in run names
with ':' to allow for branches with slashes in their names.
Signed-off-by: Adam Wolfe Gordon <awg@digitalocean.com>
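The slash replacement described in the commit above can be sketched as
follows. This is a minimal illustration, not the actual teuthology code:
the function name sanitize_run_name and the sample run name are
hypothetical.

```python
def sanitize_run_name(run_name: str) -> str:
    """Replace every '/' with ':' so the name can be used in a URL path."""
    # Hypothetical helper: teuthology's real implementation may differ.
    return run_name.replace("/", ":")


# A made-up run name built from a branch called "wip/feature".
print(sanitize_run_name("user-2018-08-01-wip/feature-basic"))
```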
results,schedule,worker: persist --seed and --subset in results.log
to create a repeatable test suite, in addition to `--seed <SEED>`, we also
need to pass the same `--subset <SUBSET>` to teuthology-suite when
rerunning the failed tests. but it would be handy if teuthology-suite
could remember these settings and recall them when `--rerun <RUN>` is passed.
in this change, we repurpose the last job sending the email to report
the test result to note down the subset and seed used for scheduling the
test suite. these variables are stored in results.log at this moment.
I could not figure out the owner of the requested job.
Please pass --owner <owner>.
Worker dies with an unhandled exception in run_with_watchdog
if it can't figure out the owner of a job that it tries
to kill when the job runs longer than the given time limit.
Kyr Shatskyy [Mon, 7 May 2018 15:01:57 +0000 (18:01 +0300)]
Restart dead workers
This patch allows restarting dead workers separately,
without stopping the rest of the teuthology components
and, more importantly, the beanstalkd service.
That also makes it possible to increase the number of workers.
Also, either of pulpito and paddles can be restarted alone.
Kefu Chai [Sat, 16 Jun 2018 16:09:36 +0000 (00:09 +0800)]
teuthology-suite: add --seed option for repeatable random test
currently --rerun does not match tests of
'supported-random-distro$/ubuntu_latest.yaml' with
'supported-random-distro$/centos_latest.yaml'. the former could be part
of the description of a failed test, the latter is a part of the job
description generated by build_matrix(). because the '$' operator
instructs teuthology to choose a random file under the directory ending
with '$', and we expand the '$' to a randomly picked file *before*
filtering the generated job list with the filter collected from the
failed tests, there is a good chance that the job descriptions of the
failed jobs in self.args.filter_in cannot match the randomly
generated ones.
so, we introduce an argument '--seed' for teuthology-suite for the
repeatable random test. this argument allows the user to specify a seed for
the RNG used by build_matrix().
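Why a seed makes the random pick repeatable can be sketched as below.
This is an illustration only and not how build_matrix() is implemented:
the function pick_random_facet and the rhel_latest.yaml entry are
hypothetical.

```python
import random


def pick_random_facet(candidates, seed):
    """Pick one file name the way a '$' directory might, using a seeded RNG."""
    # A private Random instance keeps the pick independent of global state.
    rng = random.Random(seed)
    # Sort first so the result does not depend on directory listing order.
    return rng.choice(sorted(candidates))


distros = ["ubuntu_latest.yaml", "centos_latest.yaml", "rhel_latest.yaml"]
first = pick_random_facet(distros, seed=1234)
second = pick_random_facet(distros, seed=1234)
print(first == second)  # same seed, same pick
```

With the same seed, the rerun expands '$' to the same file as the original
run, so the filter built from the failed job descriptions can match.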
placeholder: whitelist MDS_ALL_DOWN, MDS_UP_LESS_THAN_MAX by default
because, in ceph/qa/tasks/ceph.py, we start mon, mgr, osd, and then mds.
there is a time window where there is no mds around, but mgr is checking
mdsmap for MDS_ALL_DOWN errors. there is no way to disable this check in
this time window. so we just whitelist MDS_ALL_DOWN here.
Zack Cerza [Wed, 21 Mar 2018 23:37:32 +0000 (17:37 -0600)]
task.ansible: Allow passing in custom group_vars
Up until now, if you wanted to inject vars into a playbook run, you had to
use --extra-vars, which doesn't behave the same way that group_vars do.
This commit adds support for passing in group_vars.
We look for a 'group_vars' dict in the task's config object. If it's
there, we create group_vars files with names taken from the keys, and
content taken from the values.
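The mapping from a 'group_vars' dict to files can be sketched like this.
It is a rough illustration under assumptions, not the task's actual code:
the function write_group_vars, the ".yml" suffix, the directory layout and
the sample variables are all hypothetical, and values are serialized as
JSON (which is also valid YAML) to stay within the standard library.

```python
import json
import os
import tempfile


def write_group_vars(group_vars, inventory_dir):
    """Create one file per key under <inventory_dir>/group_vars."""
    gv_dir = os.path.join(inventory_dir, "group_vars")
    os.makedirs(gv_dir, exist_ok=True)
    paths = []
    for group, content in sorted(group_vars.items()):
        # File name comes from the key, file content from the value.
        path = os.path.join(gv_dir, group + ".yml")
        with open(path, "w") as f:
            if isinstance(content, str):
                f.write(content)
            else:
                json.dump(content, f, indent=2)
        paths.append(path)
    return paths


# Hypothetical task config with vars for two groups.
config = {"group_vars": {"all": {"ntp_enabled": True}, "osds": {"devices": ["/dev/sdb"]}}}
with tempfile.TemporaryDirectory() as tmp:
    for p in write_group_vars(config["group_vars"], tmp):
        print(os.path.basename(p))
```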
David Galloway [Tue, 27 Feb 2018 18:29:56 +0000 (13:29 -0500)]
fog: Wait 10 minutes for machine to be reachable after deploy
A lot's going on in rc.local after a machine is provisioned with FOG. 5
minutes is a little aggressive when taking into account the time it
takes for:
- The machine to reboot after the FOG task completes
- BIOS to load
- DHCP/PXE/TFTP to time out (double this if NIC order isn't correct)
- OS to boot and rc.local to do its magic
Signed-off-by: David Galloway <dgallowa@redhat.com>
David Galloway [Tue, 27 Feb 2018 18:28:10 +0000 (13:28 -0500)]
fog: Wait 15 minutes for FOG task to complete
When there are running jobs and a large run gets scheduled, FOG
provisioning gets backed up due to its built-in rate limiting. 15
minutes should be a little more lenient and prevent large spikes of dead
jobs when a run is scheduled in an idle queue.
Signed-off-by: David Galloway <dgallowa@redhat.com>
Zack Cerza [Thu, 15 Mar 2018 19:24:12 +0000 (13:24 -0600)]
Drop libvirt from setup.py
Having an unversioned libvirt-python in setup.py started causing
problems with tox, since it uses an sdist archive as a basis for
installing teuthology in its virtualenvs. Removing it is consistent with
best practices. We'll still keep it in requirements.txt.