Nathan Cutler [Thu, 22 Nov 2018 11:59:48 +0000 (12:59 +0100)]
run_tasks.py: allow _import to raise the right ImportError
It turns out it's possible for a file qa/tasks/foo.py to exist,
yet importing it still raises an ImportError because it references a
non-existent symbol.
In this case, teuthology was clobbering the real ImportError with its
own bogus text.
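
The distinction can be sketched as follows. This is a minimal illustration, not the actual run_tasks.py code: import_task is a hypothetical helper, and it assumes a top-level module name. A custom message is raised only when the module cannot be found; an ImportError raised from inside an existing module propagates unchanged.

```python
import importlib
import importlib.util


def import_task(module_name):
    """Import a task module without masking errors raised inside it.

    Hypothetical helper; assumes module_name is a top-level module name.
    """
    # find_spec() returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError("Could not find task module '%s'" % module_name)
    # The module file exists, so an ImportError raised here comes from the
    # module's own code and is propagated as-is.
    return importlib.import_module(module_name)
```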
Sage Weil [Fri, 9 Nov 2018 14:53:03 +0000 (08:53 -0600)]
valgrind: ignore all leaks relating to CPython code
Yes, this is a big hammer, and we are ignoring a lot. However, it is a
HUGE step forward from what we do now, which is not checking for ceph-mgr
leaks at all.
By adding this suppression I found and fixed 3 separate ceph-mgr leaks. This
will let us prevent others (in non-Py code) from being introduced.
Kyr Shatskyy [Tue, 30 Oct 2018 13:17:05 +0000 (14:17 +0100)]
orchestra: add remote.sh commands analogous to misc.sh
Adds a remote.sh similar to misc.sh. It is in fact a shortcut for remote.run,
but returns the output instead of the proc.
Example:
my_name = Remote('127.0.0.1').sh('whoami')
Adds a remote.sh_file to run a script as a file on a remote, with or without sudo.
Example 1: Run python script
Remote('127.0.0.1').sh_file("#!/usr/bin/env python3\n"
"import sys\n"
"print(sys.version_info)")
Example 2: Run script as root
Remote('user@host.domain').sh_file("whoami", sudo=True,
label="who-am-i-for-the-real")
Example 3: Run script as other user
Remote('user@host.domain').sh_file("whoami", sudo='nobody')
Kyrylo Shatskyy [Sun, 28 Oct 2018 17:56:14 +0000 (18:56 +0100)]
Fix ipv4 and ipv6 address logging for Remote.run
The Remote class does not handle IP addresses
when it derives shortnames. As a result,
the hostname is not shown correctly in the log.
For IPv4 it shows only the first octet of the address.
For IPv6 it does not even allow the run to proceed,
and raises an exception in orchestra.run.
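
A minimal sketch of the intended behaviour, assuming the shortname is otherwise the first label of the host name. The function below is hypothetical and is not the Remote class code: it keeps IP literals whole and shortens only DNS names.

```python
import ipaddress


def shortname(hostname):
    """Return a log-friendly short name for 'user@host' or 'host'.

    Hypothetical helper for illustration only.
    """
    host = hostname.split('@')[-1]
    try:
        # IPv4 and IPv6 literals must be kept whole; splitting on '.'
        # would leave only the first octet of an IPv4 address.
        ipaddress.ip_address(host)
        return host
    except ValueError:
        # Not an IP address: shorten a DNS name to its first label.
        return host.split('.')[0]
```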
Kefu Chai [Fri, 26 Oct 2018 06:18:34 +0000 (14:18 +0800)]
orchestra.run: log the ssh command without prefix
run() also supports a single string, but if we pass a long string literal
which contains "\n", it renders the log difficult to read.
With this change, a multi-line command is logged on multiple lines:
the "prefix" is printed on the first line, and the command is printed on
the following lines without the "prefix".
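
The logging rule can be sketched as below. The function name and the "Running: " prefix are assumptions made for illustration, and the sketch reads the description literally (prefix alone on the first line for multi-line commands); the real orchestra.run code may differ.

```python
def format_command_log(prefix, command):
    """Return the list of log lines for a command string.

    Hypothetical helper: single-line commands are logged as
    prefix + command; multi-line commands get the prefix on its own
    first line, followed by the command lines without any prefix.
    """
    if '\n' not in command:
        return [prefix + command]
    return [prefix] + command.splitlines()
```

Each returned string would then be passed to the logger as a separate record, so a script with embedded newlines stays readable in the log.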
Kyr Shatskyy [Thu, 26 Oct 2017 16:15:41 +0000 (18:15 +0200)]
openstack: add --test-repo CLI option
Add custom repos before installing rpm packages on test nodes.
Repository can be specified as a NAME:URL pair. Several repos
can be provided by specifying the option multiple times.
For example,
--test-repo foo:http://example.com/repo/foo \
--test-repo bar:http://example.com/repo/bar
gives two test package repositories named "foo" and "bar".
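
A hedged sketch of how a repeatable NAME:URL option can be parsed. It uses argparse purely for illustration; parse_test_repos and the dictionary layout are assumptions, not the teuthology-openstack implementation.

```python
import argparse


def parse_test_repos(argv):
    """Parse repeated --test-repo NAME:URL options into a list of dicts.

    Hypothetical helper for illustration only.
    """
    parser = argparse.ArgumentParser()
    # action='append' collects every occurrence of the option.
    parser.add_argument('--test-repo', action='append', default=[],
                        metavar='NAME:URL')
    args = parser.parse_args(argv)
    repos = []
    for spec in args.test_repo:
        # Split on the first colon only: the URL itself contains colons.
        name, sep, url = spec.partition(':')
        if not (name and sep and url):
            raise ValueError("expected NAME:URL, got %r" % spec)
        repos.append({'name': name, 'url': url})
    return repos
```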
Nathan Cutler [Wed, 30 Aug 2017 10:18:30 +0000 (12:18 +0200)]
install: rpm: only one option per variable
Although the variables are named, e.g., "pkg_mng_opts", they
really can contain at most one option. Here's what happens
when they contain more than one:
Nathan Cutler [Wed, 30 Aug 2017 09:55:15 +0000 (11:55 +0200)]
Override failing package signature checks
The RPMs built by teuthology's buildpackages task are not signed and
after a recent zypper update the install task started to fail with
File 'repomd.xml' from repository 'ceph-rpm-under-test' is unsigned, continue? [yes/no] (no): no
Error building the cache: [ceph-rpm-under-test|http://149.202.175.91/ceph-rpm-sle12-x86_64-basic/sha1/3804e807353c9d125753b1cf4f6405f79db83d4e/x86_64] Valid metadata not found at specified URL
Nathan Cutler [Thu, 11 May 2017 20:18:55 +0000 (22:18 +0200)]
packaging: add ceph-rpm-under-test zypper repo with high priority
Otherwise the ceph packages from OBS are preferred because RPM evaluates, e.g.,
12.0.2+git.1493341348.9148e53 as a higher version number than
12.0.2-276.gf27d4b00ed.
Nathan Cutler [Sat, 11 Feb 2017 08:21:16 +0000 (09:21 +0100)]
packaging: call the zypper repo ceph-rpm-under-test
The zypper repo must have a name/alias and "ceph-rpm-under-test" seems better
than just "ceph-rpm"; it's a repo containing the ceph RPMs that are being
tested.
Nathan Cutler [Fri, 10 Feb 2017 14:38:06 +0000 (15:38 +0100)]
setup: do not set ceph_qa_suite_git_url in ~/.teuthology.yaml
When this value is set, it is necessary to explicitly give --suite-repo and
--suite-branch. We would rather have the defaults for these come from
--ceph-repo and --ceph.
Nathan Cutler [Wed, 15 Mar 2017 14:12:37 +0000 (15:12 +0100)]
openstack: upload absolute paths to remote teuthology machine
If any absolute paths are given on the teuthology-openstack command line,
assume they are local YAML files and upload them to the remote teuthology
machine before running teuthology-suite.
If the local YAML file (absolute path) is PATH, the remote
path will be /home/ubuntu/yaml/$PATH. The remote file is clobbered if
it already exists.
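
The path mapping can be sketched as below. This is an illustration only: remote_yaml_path is a hypothetical name, and it assumes the leading slash of the local path is dropped before joining it to /home/ubuntu/yaml.

```python
import posixpath


def remote_yaml_path(local_path, remote_root='/home/ubuntu/yaml'):
    """Map an absolute local YAML path to its upload destination.

    Hypothetical helper; assumes the leading '/' is stripped before
    joining, so the result stays under remote_root.
    """
    if not posixpath.isabs(local_path):
        raise ValueError("only absolute paths are uploaded: %r" % local_path)
    return posixpath.join(remote_root, local_path.lstrip('/'))
```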
Patrick Donnelly [Sun, 30 Sep 2018 00:20:03 +0000 (17:20 -0700)]
run: do not block on greenlets after command exits
The stdout/stderr greenlets will not necessarily exit when the command does if
child processes are stuck in an uninterruptible sleep. For example, the fsx.sh
workunit spawns fsx processes that may be left behind in the D state after
/bin/timeout kills fsx.sh. These are connected to the stdout/stderr pipes which
prevent the greenlets from exiting normally.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
teuthology/task/install/valgrind.supp: add suppression for Boost.Thread
Boost.Thread passes `tls_destructor` to `pthread_key_create()` in the hope
of freeing the allocated memory stored in TLS key `current_thread_tls_key`,
but neither Boost.Thread nor we use `pthread_exit()` to call the
cleanup functions, and Boost.Thread is against `pthread_exit()`, see [0,1].
Boost.Thread does offer a preprocessor macro to define a global variable
whose destructor calls `tls_destructor()`, but per [2], this macro is
not defined by default, and per [3], this macro could cause an assertion
failure in Boost. So it might be advisable not to define it, even though we
could do so in BuildBoost.cmake.
Since this `Leak_StillReachable` leak is a one-shot thing, I am
adding it to the suppression file.
Add the word "seconds" at the end of the log message, since "time.sleep()"
takes a number which is always interpreted as the number of seconds to sleep
for.
Before this commit, the log said:
INFO:teuthology.task.sleep:Sleeping for 10
After:
INFO:teuthology.task.sleep:Sleeping for 10 seconds
Before d488b9bd, these params were mandatory. After d488b9bd, they are
optional, because
- these parameters are passed in only for the "first-in-suite" job
- subset is not mandatory even for "first-in-suite", because there is a
chance that the user wants to run the full combination of the test matrix.
worker: create archive_dir before putting log file in it
Before d488b9bd, the memo for rerunning a suite was noted down by the
last-in-suite job. By the time the last-in-suite job runs, the
archive_dir has been created by the jobs which perform tests; see
the `Creating archive dir` line in run_job() in teuthology/worker.py.
But after d488b9bd, the memo is logged by the first-in-suite job. By
then, none of the test jobs has run, so their archive dirs have not been
created. I think that's why the first-in-suite job fails to write the
memo to $archive_dir/results.log.
worker: do not pass --timeout to first-in-suite job
Likewise, do not pass --seed or --subset to the last-in-suite job.
Otherwise, teuthology/schedule.py will raise a ValueError on seeing
--subset or --seed without --first-in-suite, or
--email or --timeout without --last-in-suite.
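
The consistency check described above can be sketched as follows. The function name, keyword arguments, and error messages are assumptions for illustration; they are not copied from teuthology/schedule.py.

```python
def check_suite_options(first_in_suite=False, last_in_suite=False,
                        subset=None, seed=None, email=None, timeout=None):
    """Reject option combinations that do not match the job type.

    Hypothetical sketch of the rule described in the commit message.
    """
    # --subset and --seed only make sense for the first-in-suite job.
    if (subset is not None or seed is not None) and not first_in_suite:
        raise ValueError("--subset and --seed require --first-in-suite")
    # --email and --timeout only make sense for the last-in-suite job.
    if (email is not None or timeout is not None) and not last_in_suite:
        raise ValueError("--email and --timeout require --last-in-suite")
```

A worker that forwards only the options matching the job type never trips this check, which is the point of the change.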
Kefu Chai [Fri, 24 Aug 2018 13:15:31 +0000 (21:15 +0800)]
suite/run,schedule,result: write rerun memo as the first job in suite
This way we don't need to wait for the job that writes the result before
rerunning the test suite. Without this change, the "result" job is normally
the last job in the suite to be scheduled, so it's likely we will not have the
results.log until the suite is almost completed. After this change,
a "first-in-suite" job is scheduled as the first job to note down
the subset and seed used to run the suite.
Kefu Chai [Sat, 4 Aug 2018 00:42:02 +0000 (00:42 +0000)]
setup.py,requirements.txt: add pytest
pytest requires pluggy >= 0.7, while we always use pluggy 0.6, as
specified by requirements.txt, since this version is good enough for
tox. But in tox.ini we do use pytest, and no version is specified,
so there is a good chance of running into https://github.com/pytest-dev/pytest/issues/3753
Also, remove pytest from tox.ini, as this dependency has been
added to requirements.txt.
install: support extra_system_packages config option
On DEB systems, packages specified via the extra_packages option are installed
while forcing the same package version number as the version of the project
(i.e. Ceph) under test. So extra_packages can only be used to specify
additional project (Ceph) packages.
If we wanted to specify additional system (non-project, non-Ceph) packages to
install, we were out of luck. This commit implements an extra_system_packages
option for specifying extra non-project packages.