Nathan Cutler [Thu, 11 May 2017 20:18:55 +0000 (22:18 +0200)]
packaging: add ceph-rpm-under-test zypper repo with high priority
Otherwise the ceph packages from OBS are preferred because RPM evaluates, e.g.,
12.0.2+git.1493341348.9148e53 as a higher version number than
12.0.2-276.gf27d4b00ed.
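The ordering described above can be illustrated with a simplified sketch of RPM-style version comparison. This is not RPM's actual implementation: it ignores epochs and the special `~` and `^` characters, and the function names are hypothetical. It assumes the string is split into version and release at the last hyphen, and that the version part is compared first.

```python
import re


def split_evr(evr):
    """Split 'version-release' at the last hyphen (epoch is ignored here)."""
    if "-" in evr:
        version, _, release = evr.rpartition("-")
        return version, release
    return evr, ""


def compare_segments(a, b):
    """Simplified rpmvercmp-style comparison: returns 1, 0 or -1."""
    seg_a = re.findall(r"\d+|[A-Za-z]+", a)
    seg_b = re.findall(r"\d+|[A-Za-z]+", b)
    for x, y in zip(seg_a, seg_b):
        if x.isdigit() and y.isdigit():
            if int(x) != int(y):
                return 1 if int(x) > int(y) else -1
        elif x.isdigit() != y.isdigit():
            # A numeric segment is considered newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        elif x != y:
            return 1 if x > y else -1
    # All shared segments are equal: the string with more segments wins.
    return (len(seg_a) > len(seg_b)) - (len(seg_a) < len(seg_b))


def compare_evr(a, b):
    """Compare the version part first, then the release part."""
    ver_a, rel_a = split_evr(a)
    ver_b, rel_b = split_evr(b)
    return compare_segments(ver_a, ver_b) or compare_segments(rel_a, rel_b)


obs_build = "12.0.2+git.1493341348.9148e53"
test_build = "12.0.2-276.gf27d4b00ed"
print(compare_evr(obs_build, test_build))  # prints 1: the OBS build sorts higher
```

In this sketch the version part of the OBS build ("12.0.2+git...") has more segments than the bare "12.0.2" of the build under test, so it wins before the release part is even considered.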
Nathan Cutler [Sat, 11 Feb 2017 08:21:16 +0000 (09:21 +0100)]
packaging: call the zypper repo ceph-rpm-under-test
The zypper repo must have a name/alias and "ceph-rpm-under-test" seems better
than just "ceph-rpm"; it's a repo containing the ceph RPMs that are being
tested.
Nathan Cutler [Fri, 10 Feb 2017 14:38:06 +0000 (15:38 +0100)]
setup: do not set ceph_qa_suite_git_url in ~/.teuthology.yaml
When this value is set, it is necessary to explicitly give --suite-repo and
--suite-branch. We would rather have the defaults for these come from
--ceph-repo and --ceph.
Nathan Cutler [Wed, 15 Mar 2017 14:12:37 +0000 (15:12 +0100)]
openstack: upload absolute paths to remote teuthology machine
If any absolute paths are given on the teuthology-openstack command line,
assume they are local YAML files and upload them to the remote teuthology
machine before running teuthology-suite.
If the local YAML file (absolute path) is PATH, the remote
path will be /home/ubuntu/yaml/$PATH. The remote file is clobbered if
it already exists.
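The path mapping described above could look roughly like the following sketch. The helper names are hypothetical and this is not the teuthology-openstack code; it also assumes the doubled slash in /home/ubuntu/yaml/$PATH is collapsed, which the commit message does not state.

```python
import posixpath

REMOTE_YAML_ROOT = "/home/ubuntu/yaml"


def remote_yaml_path(local_path):
    """Return the remote destination for a local absolute path."""
    # Drop the leading "/" so the join does not discard the remote root.
    return posixpath.join(REMOTE_YAML_ROOT, local_path.lstrip("/"))


def plan_uploads(argv):
    """Map each absolute-path argument to its remote destination."""
    return {arg: remote_yaml_path(arg) for arg in argv if posixpath.isabs(arg)}


args = ["--suite", "rados", "/tmp/override.yaml"]
print(plan_uploads(args))
# prints {'/tmp/override.yaml': '/home/ubuntu/yaml/tmp/override.yaml'}
```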
Patrick Donnelly [Sun, 30 Sep 2018 00:20:03 +0000 (17:20 -0700)]
run: do not block on greenlets after command exits
The stdout/stderr greenlets will not necessarily exit when the command does if
child processes are stuck in an uninterruptible sleep. For example, the fsx.sh
workunit spawns fsx processes that may be left behind in the D state after
/bin/timeout kills fsx.sh. These are connected to the stdout/stderr pipes, which
prevents the greenlets from exiting normally.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
teuthology/task/install/valgrind.supp: add suppression for Boost.Thread
Boost.Thread passes `tls_destructor` to `pthread_key_create()` in the hope
of freeing the allocated memory stored in TLS key `current_thread_tls_key`,
but neither Boost.Thread nor we use `pthread_exit()` to call the
cleanup functions, and Boost.Thread advises against `pthread_exit()`, see [0,1].
Boost.Thread does offer a preprocessor macro that defines a global variable
whose destructor calls `tls_destructor()`, but per [2], this macro is
not defined by default, and per [3], it could cause an assertion
failure in Boost. So it might be advisable not to define it, even though we
could do so in BuildBoost.cmake.
Since this `Leak_StillReachable` leak is a one-shot thing, I am
adding it to the suppression file.
Add the word "seconds" at the end of the log message, since "time.sleep()"
takes a number which is always interpreted as the number of seconds to sleep
for.
Before this commit, the log said:
INFO:teuthology.task.sleep:Sleeping for 10
After:
INFO:teuthology.task.sleep:Sleeping for 10 seconds
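A minimal sketch of a sleep task that logs the unit is shown below. The logger name is taken from the log lines above; the function names and the injectable `sleeper` parameter are assumptions made for illustration, not the actual task code.

```python
import logging
import time

log = logging.getLogger("teuthology.task.sleep")


def sleep_message(duration):
    """Build the log message, stating the unit explicitly."""
    return "Sleeping for {} seconds".format(duration)


def sleep_task(duration, sleeper=time.sleep):
    """Log the duration, then sleep; time.sleep() interprets it as seconds."""
    log.info(sleep_message(duration))
    sleeper(duration)


logging.basicConfig(level=logging.INFO,
                    format="%(levelname)s:%(name)s:%(message)s")
# A no-op sleeper keeps the demonstration instantaneous.
sleep_task(10, sleeper=lambda seconds: None)
```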
Before d488b9bd, these params were mandatory. After d488b9bd, they are
optional, because
- these parameters are passed in only for the "first-in-suite" job
- subset is not mandatory even for "first-in-suite", because the
user may want to run the full combination of the test matrix.
worker: create archive_dir before putting log file in it
Before d488b9bd, the memo for rerunning a suite was written by the
last-in-suite job. By the time the last-in-suite job ran, the
archive_dir had already been created by the jobs that perform tests, see
the `Creating archive dir` line in run_job() in teuthology/worker.py.
But after d488b9bd, the memo is logged by the first-in-suite job. By
then, none of the test jobs has run, so their archive dirs have not been
created. I think that's why the first-in-suite job fails to write the
memo to $archive_dir/results.log.
worker: do not pass --timeout to first-in-suite job
Likewise, do not pass --seed or --subset to the last-in-suite job.
Otherwise, teuthology/schedule.py will raise a ValueError on seeing
--subset or --seed without --first-in-suite, or
--email or --timeout without --last-in-suite.
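The constraint described above can be sketched as a small validation function. This is a hypothetical illustration, not the code in teuthology/schedule.py; the function name, parameter names, and error messages are all assumptions.

```python
def check_suite_args(first_in_suite=False, last_in_suite=False,
                     seed=None, subset=None, email=None, timeout=None):
    """Reject options that only make sense for a specific bookend job."""
    if not first_in_suite and (seed is not None or subset is not None):
        raise ValueError("--seed and --subset require --first-in-suite")
    if not last_in_suite and (email is not None or timeout is not None):
        raise ValueError("--email and --timeout require --last-in-suite")


# Accepted: seed and subset accompany a first-in-suite job.
check_suite_args(first_in_suite=True, seed=42, subset="1/10")

# Rejected: timeout is passed to a first-in-suite job.
try:
    check_suite_args(first_in_suite=True, timeout=3600)
except ValueError as exc:
    print(exc)  # prints --email and --timeout require --last-in-suite
```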
Kefu Chai [Fri, 24 Aug 2018 13:15:31 +0000 (21:15 +0800)]
suite/run,schedule,result: write rerun memo as the first job in suite
so we don't need to wait for the "result" job to write results.log before
rerunning the test suite. Without this change, the "result" job is normally
the last job in the suite to be scheduled, so it's likely we will not have
results.log until the suite is almost complete. After this change,
a "first-in-suite" job is scheduled as the first job to note down
the subset and seed used to run the suite.
Kefu Chai [Sat, 4 Aug 2018 00:42:02 +0000 (00:42 +0000)]
setup.py,requirements.txt: add pytest
pytest requires pluggy >= 0.7, while we always use pluggy 0.6, as
specified by requirements.txt, since this version is good enough for
tox. But tox.ini does use pytest, with no version specified,
so there is a good chance of running into https://github.com/pytest-dev/pytest/issues/3753
Also, remove pytest from tox.ini, as this dependency has been
added to requirements.txt.
install: support extra_system_packages config option
On DEB systems, packages specified via the extra_packages option are installed
while forcing the same package version number as the version of the project
(i.e. Ceph) under test. So extra_packages can only be used to specify
additional project (Ceph) packages.
If we wanted to specify additional system (non-project, non-Ceph) packages to
install, we were out of luck. This commit implements an extra_system_packages
option for specifying extra non-project packages.
A branch name containing a slash is perfectly legal in git, but
teuthology uses branch names verbatim in run names, which causes POSTs
to fail when submitting runs to paddles. Replace all '/' in run names
with ':' to allow for branches with slashes in their names.
Signed-off-by: Adam Wolfe Gordon <awg@digitalocean.com>
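The substitution described in the commit above amounts to a single string replacement. A minimal sketch follows; the function name and the sample run name are hypothetical.

```python
def sanitize_run_name(run_name):
    """Replace every '/' so the name is safe to use in a paddles URL."""
    return run_name.replace("/", ":")


print(sanitize_run_name("user-2018-06-01-wip/feature-rados"))
# prints user-2018-06-01-wip:feature-rados
```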
results,schedule,worker: persist --seed and --subset in results.log
To create a repeatable test suite, in addition to `--seed <SEED>`, we also
need to pass the same `--subset <SUBSET>` to teuthology-suite when
rerunning the failed tests. It would be handy if teuthology-suite
could remember these settings and recall them when `--rerun <RUN>` is given.
In this change, we repurpose the last job, which sends the email reporting
the test result, to note down the subset and seed used for scheduling the
test suite. These variables are stored in results.log for now.
I could not figure out the owner of the requested job.
Please pass --owner <owner>.
The worker dies with an unhandled exception in run_with_watchdog
if it can't figure out the owner of a job that it tries
to kill when the job runs longer than the given time limit.
Kyr Shatskyy [Mon, 7 May 2018 15:01:57 +0000 (18:01 +0300)]
Restart dead workers
This patch allows dead workers to be restarted separately,
without stopping the rest of the teuthology components
and, more importantly, the beanstalkd service.
That also makes it possible to increase the number of workers.
Also, either pulpito or paddles can be restarted on its own.
Kefu Chai [Sat, 16 Jun 2018 16:09:36 +0000 (00:09 +0800)]
teuthology-suite: add --seed option for repeatable random test
Currently --rerun does not match tests of
'supported-random-distro$/ubuntu_latest.yaml' with
'supported-random-distro$/centos_latest.yaml'. The former could be part
of the description of a failed test; the latter is a part of a job
description generated by build_matrix(). Because the '$' operator
instructs teuthology to choose a random file under the directory ending
with '$', and we expand the '$' to a randomly picked file *before*
filtering the generated job list with the filter collected from the
failed tests, there is a good chance that the job descriptions of the
failed jobs in self.args.filter_in will not match the randomly
generated ones.
So, we introduce a '--seed' argument for teuthology-suite for
repeatable random tests. This argument allows the user to specify a seed for
the RNG used by build_matrix().
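The effect of a fixed seed on the random '$' expansion can be sketched as follows. This is an illustration only: how build_matrix() actually uses its RNG is not shown in this log, and the function name and the use of a dedicated `random.Random` instance are assumptions.

```python
import random


def pick_random_file(candidates, seed=None):
    """Pick one file; the same seed always yields the same pick."""
    rng = random.Random(seed)
    # Sort first so the result does not depend on directory listing order.
    return rng.choice(sorted(candidates))


distros = ["centos_latest.yaml", "ubuntu_latest.yaml"]
first = pick_random_file(distros, seed=1234)
second = pick_random_file(distros, seed=1234)
print(first == second)  # prints True
```

With the seed recorded from the original run, a rerun expands the '$' directory to the same file, so the job descriptions collected from the failed tests can match the regenerated ones.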