Josh Durgin [Sun, 21 Mar 2021 22:28:52 +0000 (18:28 -0400)]
report, lock.ops: retry write requests to paddles
For more contended cases of updating job status and machine keys,
where we've seen 500 errors from DB conflicts, use random intervals
for the retries.
This is the teuthology half of fixing:
https://tracker.ceph.com/issues/49864
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
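A minimal sketch of retrying a write with random sleep intervals, as the message above describes. The helper name, its defaults, and the injected `sleep` parameter are illustrative assumptions, not the code in report or lock.ops:

```python
import random
import time


def retry_with_jitter(func, attempts=5, min_sleep=1.0, max_sleep=10.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    Hypothetical helper: the name and default values are illustrative only.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the last error propagate.
                raise
            # A random pause keeps contending writers from retrying in
            # lockstep and hitting the same DB conflict again.
            sleep(random.uniform(min_sleep, max_sleep))
```

Passing `sleep` as a parameter is only there so the pause can be replaced in tests.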
Josh Durgin [Sat, 13 Mar 2021 03:30:42 +0000 (19:30 -0800)]
Merge pull request #1628 from ceph/ignore-systemd-sysusers-core
task/internal: ignore systemd-sysusers core file
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Sage Weil [Fri, 12 Mar 2021 17:58:47 +0000 (11:58 -0600)]
task/internal: ignore systemd-sysusers core file
This is related to dnsmasq. When installing the kubic podman 3.0.1
packages,
Running scriptlet: dnsmasq-2.79-13.el8_3.1.x86_64 14/16
/var/tmp/rpm-tmp.6MFp00: line 5: 9079 Segmentation fault (core dumped) systemd-sysusers - &> /dev/null <<SYSTEMD_INLINE_EOF
u dnsmasq - "Dnsmasq DHCP and DNS server" /var/lib/dnsmasq
SYSTEMD_INLINE_EOF
Installing : dnsmasq-2.79-13.el8_3.1.x86_64 14/16
warning: group dnsmasq does not exist - using root
warning: group dnsmasq does not exist - using root
warning: group dnsmasq does not exist - using root
Running scriptlet: dnsmasq-2.79-13.el8_3.1.x86_64 14/16
/var/tmp/rpm-tmp.pfCGxn: line 3: 9089 Segmentation fault (core dumped) systemd-sysusers &> /dev/null
Installing : podman-3.0.1-2.el8.3.2.x86_64 15/16
Installing : podman-plugins-3.0.1-2.el8.3.2.x86_64 16/16
Running scriptlet: container-selinux-2:2.145.0-1.el8.noarch 16/16
Running scriptlet: podman-plugins-3.0.1-2.el8.3.2.x86_64 16/16
/var/tmp/rpm-tmp.bFfmjl: line 6: 11098 Segmentation fault (core dumped) /usr/bin/systemd-sysusers
warning: %triggerin(systemd-239-18.el8.x86_64) scriptlet failed, exit status 139
Error in <unknown> scriptlet in rpm package podman-plugins
Verifying : dnsmasq-2.79-13.el8_3.1.x86_64 1/16
Nothing to do with us.
Signed-off-by: Sage Weil <sage@newdream.net>
kyr [Fri, 12 Mar 2021 09:20:23 +0000 (10:20 +0100)]
Merge pull request #1573 from smithfarm/wip-45570
orchestra/console: raise RuntimeError when fail to power on
Josh Durgin [Thu, 11 Mar 2021 16:48:34 +0000 (08:48 -0800)]
Merge pull request #1627 from ceph/wip-debug-levels
suite/placeholder.py: lower osd specific debug levels
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Neha Ojha [Wed, 10 Mar 2021 23:33:55 +0000 (23:33 +0000)]
suite/placeholder.py: lower osd specific debug levels
Signed-off-by: Neha Ojha <nojha@redhat.com>
Brad Hubbard [Tue, 9 Mar 2021 22:20:21 +0000 (08:20 +1000)]
Merge pull request #1620 from ceph/wip-badone-ceph-ansible-tracker-49485
ceph_ansible: Satisfy 'six' dependency
Reviewed-by: Yuri Weinstein <yweins@redhat.com>
Sage Weil [Sat, 27 Feb 2021 20:13:30 +0000 (14:13 -0600)]
selinux: fix typo
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sat, 27 Feb 2021 17:39:02 +0000 (11:39 -0600)]
Merge pull request #1622 from ceph/ignore-selinux-sssd
selinux: ignore issues with sssd
Sage Weil [Sat, 27 Feb 2021 15:26:36 +0000 (09:26 -0600)]
selinux: ignore issues with sssd
['type=AVC msg=audit(1614438637.552:5615): avc: denied { read } for pid=876 comm="sssd" name="resolv.conf" dev="sda1" ino=265261 scontext=system_u:system_r:sssd_t:s0 tcontext=unconfined_u:object_r:admin_home_t:s0 tclass=file permissive=1']
(currently seen on rhel 8.3)
Signed-off-by: Sage Weil <sage@newdream.net>
kyr [Fri, 26 Feb 2021 22:49:30 +0000 (23:49 +0100)]
Merge pull request #1621 from kshtsk/wip-math-gcd
suite/matrix: use math.gcd instead of fractions.gcd
Kyr Shatskyy [Fri, 26 Feb 2021 14:13:59 +0000 (15:13 +0100)]
requirements: use ansible 2.9
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Kyr Shatskyy [Fri, 26 Feb 2021 10:20:28 +0000 (11:20 +0100)]
requirements: bump up cffi to 1.14.5
Needed to run on Big Sur with python3.9 from brew; addresses a
build error for the cffi wheel:
clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -DUSE__THREAD -DHAVE_SYNC_SYNCHRONIZE -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/ffi -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/ffi -I/opt/homebrew/include -I/opt/homebrew/opt/openssl@1.1/include -I/opt/homebrew/opt/sqlite/include -I/opt/homebrew/opt/tcl-tk/include -I/Users/kyr/kshtsk/teuthology/virtualenv/include -I/opt/homebrew/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c c/_cffi_backend.c -o build/temp.macosx-11-arm64-3.9/c/_cffi_backend.o
c/_cffi_backend.c:6185:5: warning: 'PyEval_InitThreads' is deprecated [-Wdeprecated-declarations]
PyEval_InitThreads();
^
/opt/homebrew/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/include/python3.9/ceval.h:130:1: note: 'PyEval_InitThreads' has been explicitly marked deprecated here
Py_DEPRECATED(3.9) PyAPI_FUNC(void) PyEval_InitThreads(void);
^
/opt/homebrew/Cellar/python@3.9/3.9.2_1/Frameworks/Python.framework/Versions/3.9/include/python3.9/pyport.h:508:54: note: expanded from macro 'Py_DEPRECATED'
#define Py_DEPRECATED(VERSION_UNUSED) __attribute__((__deprecated__))
^
c/_cffi_backend.c:6245:9: error: implicit declaration of function 'ffi_prep_closure' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
if (ffi_prep_closure(closure, &cif_descr->cif,
^
1 warning and 1 error generated.
error: command '/usr/bin/clang' failed with exit code 1
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Kyr Shatskyy [Fri, 26 Feb 2021 13:13:31 +0000 (14:13 +0100)]
requirements.in: stick ansible version to 2.8 version
Since we are not ready for ansible 3 from ceph-cm-ansible point of view:
2021-02-26T12:45:17.668 INFO:teuthology.task.ansible.out:ERROR! couldn't resolve module/action 'firewalld'. This often indicates a misspelling, missing collection, or incorrect module path.
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Kyr Shatskyy [Fri, 26 Feb 2021 11:57:29 +0000 (12:57 +0100)]
requirements.in: stick pytest to 3.7.1 version
Until someone fixes the unit tests.
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Kyr Shatskyy [Fri, 26 Feb 2021 10:19:46 +0000 (11:19 +0100)]
suite/matrix: latest py3 deprecates fractions.gcd
Signed-off-by: Kyrylo Shatskyy <kyr@top.local>
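For context, `fractions.gcd` was deprecated in Python 3.5 and removed in 3.9, while `math.gcd` has been available since 3.5. A small sketch of the replacement; the `lcm` helper is an illustrative assumption, not necessarily what suite/matrix contains:

```python
import math


def lcm(a, b):
    """Least common multiple built on math.gcd (hypothetical helper)."""
    return a * b // math.gcd(a, b)
```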
Brad Hubbard [Thu, 25 Feb 2021 08:38:31 +0000 (18:38 +1000)]
ceph_ansible: Satisfy 'six' dependency
Fixes: https://tracker.ceph.com/issues/49485
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
Sage Weil [Thu, 18 Feb 2021 21:58:10 +0000 (15:58 -0600)]
Merge pull request #1618 from ceph/valgrind-soname
misc: make valgrind behave with tcmalloc
Sage Weil [Thu, 18 Feb 2021 16:23:14 +0000 (10:23 -0600)]
misc: make valgrind behave with tcmalloc
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 18 Feb 2021 14:43:46 +0000 (08:43 -0600)]
Merge pull request #1617 from ceph/no-fsid-for-state
orchestra/daemon/state: do not pass fsid property to run() later
Sage Weil [Wed, 17 Feb 2021 18:47:45 +0000 (13:47 -0500)]
orchestra/daemon/state: do not pass fsid property to run() later
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 17 Feb 2021 15:59:14 +0000 (09:59 -0600)]
Merge pull request #1616 from ceph/ignore-signal-exceptions
orchestra/daemon/cephadmunit: ignore exception when sending signal
Sage Weil [Wed, 17 Feb 2021 03:27:32 +0000 (21:27 -0600)]
orchestra/daemon/cephadmunit: ignore exception when sending signal
The osd thrashing is sending lots of signals (sighup) and can easily race with
a daemon shutting down entirely.
This makes us match the behavior of the original state.py signal() method.
Signed-off-by: Sage Weil <sage@newdream.net>
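The race described above can be sketched with a local-process analogue. This assumes a plain `os.kill` on a POSIX host; the real cephadmunit code signals remote daemons, and the function name here is hypothetical:

```python
import os
import signal


def send_signal_quietly(pid, sig=signal.SIGHUP):
    """Send sig to pid; return False instead of raising if it has exited."""
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        # The target shut down between the decision to signal it and the
        # kill call. For a thrasher sending many signals this is expected.
        return False
    return True
```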
Josh Durgin [Tue, 16 Feb 2021 01:38:24 +0000 (17:38 -0800)]
Merge pull request #1615 from jdurgin/wip-debug-ms
suite: lower debug_ms for osd back to 1
Reviewed-by: Neha Ojha <nojha@redhat.com>
Josh Durgin [Tue, 16 Feb 2021 00:15:45 +0000 (19:15 -0500)]
suite: lower debug_ms for osd back to 1
This was increased for some mgr issues in
044384be450a557f56a2b39bf7d0e71e69d45cd3, but isn't helping much now
and is filling up disks for long-running tests.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Sat, 13 Feb 2021 00:29:38 +0000 (16:29 -0800)]
Merge pull request #1614 from jdurgin/wip-nuke-tests
nuke: fix no_reboot only being present in the cli and add unit tests
Reviewed-by: Neha Ojha <nojha@redhat.com>
Josh Durgin [Fri, 12 Feb 2021 22:54:17 +0000 (22:54 +0000)]
test_nuke: add unit tests for internal nuke options
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Fri, 12 Feb 2021 22:53:38 +0000 (22:53 +0000)]
nuke: only use no_reboot on the cli
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Fri, 12 Feb 2021 18:29:47 +0000 (10:29 -0800)]
Merge pull request #1613 from jdurgin/wip-nuke-keep-logs
nuke: only use keep_logs from the cli
Reviewed-by: Neha Ojha <nojha@redhat.com>
Josh Durgin [Thu, 11 Feb 2021 22:59:52 +0000 (22:59 +0000)]
nuke: only use keep_logs from the cli
nuke() is called outside of the cli with a ctx that does not include
all the cli args. Use a default parameter for the functions instead of ctx.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
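A sketch of the pattern described above: pass the option as a keyword argument with a default, rather than reading it from ctx. The function names and cleanup steps are hypothetical stand-ins, not the actual nuke code:

```python
from types import SimpleNamespace


def cleanup_steps(ctx, keep_logs=False):
    """Return the cleanup steps to run (illustrative stand-in).

    keep_logs is an explicit keyword argument, so callers whose ctx was not
    built from the CLI arguments do not need a ctx.keep_logs attribute.
    """
    steps = ["stop daemons", "unmount"]
    if not keep_logs:
        steps.append("remove logs")
    return steps


def main(args):
    # Only the CLI entry point reads the flag from the parsed arguments.
    return cleanup_steps(args, keep_logs=args.keep_logs)
```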
Sage Weil [Thu, 11 Feb 2021 22:43:28 +0000 (16:43 -0600)]
Merge pull request #1612 from ceph/nicer-ls
ls: nicer ls output
Sage Weil [Wed, 10 Feb 2021 22:22:53 +0000 (22:22 +0000)]
ls: nicer ls output
- no error when teuthology.log is missing (provisioning)
- leave off pid
Signed-off-by: Sage Weil <sage@redhat.com>
kyr [Thu, 11 Feb 2021 14:15:57 +0000 (15:15 +0100)]
Merge pull request #1611 from ceph/dependabot/pip/cryptography-3.3.2
build(deps): bump cryptography from 3.2 to 3.3.2
dependabot[bot] [Thu, 11 Feb 2021 14:09:41 +0000 (14:09 +0000)]
build(deps): bump cryptography from 3.2 to 3.3.2
Bumps [cryptography](https://github.com/pyca/cryptography) from 3.2 to 3.3.2.
- [Release notes](https://github.com/pyca/cryptography/releases)
- [Changelog](https://github.com/pyca/cryptography/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/3.2...3.3.2)
Signed-off-by: dependabot[bot] <support@github.com>
kyr [Thu, 11 Feb 2021 14:07:51 +0000 (15:07 +0100)]
Merge pull request #1609 from ceph/dependabot/pip/httplib2-0.19.0
build(deps): bump httplib2 from 0.18.0 to 0.19.0
Josh Durgin [Tue, 9 Feb 2021 22:10:00 +0000 (14:10 -0800)]
Merge pull request #1610 from jdurgin/wip-supervisor-timeouts
supervisor: improve error handling for dead jobs
Reviewed-by: Andrew Schoen <aschoen@redhat.com>
Josh Durgin [Tue, 9 Feb 2021 21:33:34 +0000 (21:33 +0000)]
supervisor: send paddles the reason a job is marked dead
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Tue, 9 Feb 2021 21:16:46 +0000 (21:16 +0000)]
supervisor: kill processes before gathering logs
When we hit the max job timeout, we need to stop the test programs
before collecting logs or else we run into errors like 'file size
changed while zipping' trying to compress them, and we can't save them
or stop the job.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Tue, 9 Feb 2021 19:24:02 +0000 (19:24 +0000)]
nuke: allow not rebooting again
The default behavior was changed to always reboot in
1d47a121b385e2656e9314e9d63faf68a8e865e4 but the --reboot-all option
remained. Keep the original option around for compatibility with
existing scripts.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Tue, 9 Feb 2021 18:54:28 +0000 (18:54 +0000)]
nuke: add option to preserve logs on remote machines
This will be helpful for killing jobs that hit the max_job_timeout
while still being able to collect logs from them.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
dependabot[bot] [Mon, 8 Feb 2021 20:52:33 +0000 (20:52 +0000)]
build(deps): bump httplib2 from 0.18.0 to 0.19.0
Bumps [httplib2](https://github.com/httplib2/httplib2) from 0.18.0 to 0.19.0.
- [Release notes](https://github.com/httplib2/httplib2/releases)
- [Changelog](https://github.com/httplib2/httplib2/blob/master/CHANGELOG)
- [Commits](https://github.com/httplib2/httplib2/compare/v0.18.0...v0.19.0)
Signed-off-by: dependabot[bot] <support@github.com>
David Galloway [Fri, 5 Feb 2021 17:37:38 +0000 (12:37 -0500)]
Merge pull request #1608 from kshtsk/fix-docs
readme: fix teuthology docs link at docs.ceph.com
kyr [Fri, 5 Feb 2021 17:19:24 +0000 (18:19 +0100)]
Merge pull request #1601 from sebastian-philipp/prio-add-job-count
teuthology-suite: Add job count to priority error msg.
Kyr Shatskyy [Fri, 5 Feb 2021 17:15:49 +0000 (18:15 +0100)]
readme: fix teuthology docs link at docs.ceph.com
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
kyr [Fri, 5 Feb 2021 16:19:02 +0000 (17:19 +0100)]
Merge pull request #1607 from kshtsk/ver-1.1.0
version: increase version to 1.1.0 since we have dispatcher
kyr [Fri, 5 Feb 2021 16:11:44 +0000 (17:11 +0100)]
Merge pull request #1606 from kshtsk/supervisor-log
dispatcher: add .log extension for supervisor log
Kyr Shatskyy [Fri, 5 Feb 2021 16:09:40 +0000 (17:09 +0100)]
version: increase version to 1.1.0 since we have dispatcher
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Kyr Shatskyy [Fri, 5 Feb 2021 16:04:52 +0000 (17:04 +0100)]
dispatcher: add .log extension for supervisor log
It would be great to have an extension for easy log identification.
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Dan Mick [Thu, 4 Feb 2021 23:09:08 +0000 (15:09 -0800)]
Merge pull request #1605 from jdurgin/wip-supervisor-connect-error
dispatcher/supervisor: always unlock machines and save status
Josh Durgin [Thu, 4 Feb 2021 22:56:53 +0000 (17:56 -0500)]
dispatcher/supervisor: always unlock machines and save status
If we can't connect to the machines anymore, we still need to clean
up.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Tue, 2 Feb 2021 20:04:42 +0000 (12:04 -0800)]
Merge pull request #1604 from jdurgin/wip-dispatcher-commit-bug
dispatcher/repo_utils: handle missing commits better
Reviewed-by: David Galloway <dgallowa@redhat.com>
Josh Durgin [Tue, 2 Feb 2021 19:57:34 +0000 (14:57 -0500)]
dispatcher: keep operating if preparing a job fails
prep_job() handles updating the job status already.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Tue, 2 Feb 2021 19:48:47 +0000 (14:48 -0500)]
repo_utils: clone entire branch if commit is specified
If the commit is not the head of the branch, we need more history to be
able to check it out.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Tue, 2 Feb 2021 19:19:07 +0000 (14:19 -0500)]
worker: handle CommitNotFoundErrors
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Tue, 2 Feb 2021 16:52:22 +0000 (08:52 -0800)]
Merge pull request #1603 from jdurgin/wip-dispatcher-bug
dispatcher: allow empty os_type for fake config
Reviewed-by: David Galloway <dgallowa@redhat.com>
Josh Durgin [Tue, 2 Feb 2021 15:06:11 +0000 (10:06 -0500)]
dispatcher: allow empty os_type for fake config
This is the same default as reimaging uses,
though it's not too important in the supervisor.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Sebastian Wagner [Tue, 26 Jan 2021 12:02:35 +0000 (13:02 +0100)]
teuthology-suite: Add job count to priority error msg.
Don't let users guess the job count.
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Josh Durgin [Thu, 28 Jan 2021 01:28:30 +0000 (17:28 -0800)]
Merge pull request #1546 from ShraddhaAg/add-minimal-dispatcher
Add teuthology-dispatcher
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Wed, 27 Jan 2021 16:01:59 +0000 (08:01 -0800)]
Merge pull request #1599 from ceph/wip-exact-commits
Use the same version of teuthology and ceph-qa-suite for a whole run
Reviewed-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Vasu Kulkarni [Thu, 21 Jan 2021 16:01:35 +0000 (08:01 -0800)]
Merge pull request #1598 from SrinivasaBharath/wip-deb-rm
task/install/redhat: Removing packages based on OS in cleanup task
Josh Durgin [Wed, 20 Jan 2021 18:16:12 +0000 (10:16 -0800)]
Merge branch 'master' into add-minimal-dispatcher
Josh Durgin [Tue, 19 Jan 2021 19:06:45 +0000 (11:06 -0800)]
Merge pull request #1600 from rzarzynski/wip-valgrind-controllable-exit
teuthology/misc: make the Valgrind's early exit configurable.
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Radoslaw Zarzynski [Tue, 19 Jan 2021 13:56:32 +0000 (14:56 +0100)]
teuthology/misc: make the Valgrind's early exit configurable.
This commit is a follow-up to
a98eb3e1405c8ca8f6933eb0356c03955e4e2e83
where Valgrind has been configured to exit on first-seen error as it
was (wrongly!) assumed that all components are green when it comes
to the Valgrind verification.
This assumption turned out to be broken for RGW, which accumulated a
few issues over time as a result of having the Valgrind checks knocked
out as a side effect of the python3 transition [1]. As a consequence,
multiple problems piled up, so a mechanism to disable the early exit,
e.g. to compile a list of these issues, looks desirable.
[1]: https://github.com/ceph/teuthology/pull/1503#issuecomment-762837504
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Josh Durgin [Sun, 17 Jan 2021 02:31:33 +0000 (21:31 -0500)]
worker, run: use exact commits for teuthology and qa suite
This ensures we use the same version across all jobs in a run.
We already have suite_sha1 set by older versions of teuthology, but
for folks who haven't updated their suite command, and thus don't set
teuthology_sha1 in the job config, look up the sha1 in the worker.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Sun, 17 Jan 2021 02:21:02 +0000 (21:21 -0500)]
repo_utils: allow fetching a specific sha1 to per-commit directories
Using a checkout of a single branch used by potentially many
workers/teuthology processes can result in errors when one job
updates the local branch while another job is reading it.
This causes issues particularly easily when using non-master
teuthology branches, and with the teuthology-dispatcher.
This also allows us to guarantee we're using the same
version across an entire run, even if e.g. the master
qa suite is updated between jobs.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Sun, 17 Jan 2021 02:16:59 +0000 (21:16 -0500)]
suite: pass the commit of teuthology with each job's config
We already pass the suite sha1, but do not use it yet. This is the
missing piece to be able to use the same version of everything across
a whole run.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Sun, 17 Jan 2021 02:04:08 +0000 (21:04 -0500)]
repo_utils: allow checking out a specific commit
Since we're cloning a particular branch with git clone --shallow,
assume we're still passed a branch that contains the commit. Otherwise
we'd waste time and space cloning all the branches in the repo.
Assume that this is only used for checking out a particular sha1 once,
to avoid repetitive work.
There are two use cases for these utilities:
1) on the user's machine when they're scheduling a suite - there it
will make sense to maintain a single checkout of a branch e.g. teuthology master
2) on the queue consumer side - here it's best if we use the same commit for an
entire run, so checking out by sha1 makes more sense
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
srinivasabharath [Wed, 13 Jan 2021 06:30:01 +0000 (01:30 -0500)]
task/install/redhat: Removing packages based on OS in cleanup task
Signed-off-by: Bharath <skanta@redhat.com>
Josh Durgin [Thu, 14 Jan 2021 21:00:42 +0000 (13:00 -0800)]
Merge pull request #1597 from jdurgin/wip-workunit-fingerprint
exceptions: only use one of label or command for fingerprint
Reviewed-by: Neha Ojha <nojha@redhat.com>
Josh Durgin [Wed, 13 Jan 2021 03:33:38 +0000 (22:33 -0500)]
exceptions: only use one of label or command for fingerprint
Commands like those running workunits include the ceph sha1 being
tested, so they're not useful for grouping. This also lets us group
together other tests if we like, for example to map tests with small
differences in configuration to the same fingerprint for sentry.
Also use the plain command, it's already a string at this point
so there's no reason to add spaces between its characters.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
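A sketch of the grouping rule described above. The function name and the exact shape of the fingerprint list are assumptions made for illustration:

```python
def command_fingerprint(command, exitstatus, label=None):
    """Build a grouping key from the label if given, else the command."""
    # command is assumed to already be a string. Joining it with spaces
    # would insert a space between every character, so use it as-is.
    return [label or command, "exit status {}".format(exitstatus)]
```

With a label, two runs whose commands differ only in the ceph sha1 under test map to the same key.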
Patrick Donnelly [Tue, 12 Jan 2021 15:31:46 +0000 (07:31 -0800)]
Merge PR #1595 into master
* refs/pull/1595/head:
orchestra: squelch Traceback for expected auth failures
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Patrick Donnelly [Fri, 8 Jan 2021 18:08:04 +0000 (10:08 -0800)]
orchestra: squelch Traceback for expected auth failures
The Traceback clutters the log and messes up greps for Tracebacks.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Josh Durgin [Wed, 6 Jan 2021 15:46:50 +0000 (07:46 -0800)]
Merge pull request #1593 from jdurgin/wip-sentry-sdk
sentry: use new library and group CommandFailedErrors more finely
Reviewed-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Josh Durgin [Mon, 4 Jan 2021 15:43:54 +0000 (10:43 -0500)]
exceptions: group CommandFailedErrors in sentry more finely
By default sentry uses the stack trace / error type / rough error
message, which ends up with many failures from different workunits
grouped together. Include the actual command run, the exit status, and
the optional label to group these more accurately. This will group
failures of the same workunit together, for example.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Mon, 4 Jan 2021 15:39:38 +0000 (10:39 -0500)]
run_tasks: use new sentry_sdk
raven has been deprecated
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
kyr [Thu, 17 Dec 2020 10:06:36 +0000 (11:06 +0100)]
Merge pull request #1591 from rakeshgm/typo-correct
run_tasks: correct typo tuethology -> teuthology
rakeshgm [Thu, 17 Dec 2020 08:31:26 +0000 (14:01 +0530)]
run_tasks: correct typo tuethology -> teuthology
Signed-off-by: rakeshgm <rakeshgm@redhat.com>
Jason Dillaman [Wed, 16 Dec 2020 21:10:24 +0000 (16:10 -0500)]
Merge pull request #1590 from lxbsz/task_ship_utilities
teuthology: run the ship_utilities task only once
Reviewed-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Xiubo Li [Tue, 15 Dec 2020 02:49:28 +0000 (10:49 +0800)]
teuthology: run the ship_utilities task only once
This will make sure that the utilities won't be removed until the last
user is unwound.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
kyr [Mon, 14 Dec 2020 20:14:48 +0000 (21:14 +0100)]
Merge pull request #1589 from kshtsk/wip-wait
Add teuthology-wait command
Kyr Shatskyy [Fri, 11 Dec 2020 17:48:16 +0000 (18:48 +0100)]
scripts: add wait script for watching run
When using teuthology-suite with the --wait option, it is
sometimes useful to split scheduling the suite from
waiting for the run. For example, when using tools like
Jenkins we might want to schedule a suite, report
that scheduling succeeded, and start waiting only in
a later step.
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Kyr Shatskyy [Fri, 11 Dec 2020 18:56:01 +0000 (19:56 +0100)]
suite: improve info message about waiting the run
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Josh Durgin [Fri, 11 Dec 2020 03:31:50 +0000 (19:31 -0800)]
Merge pull request #1584 from kshtsk/wip-quiet-run
orchestra: introduce quiet mode for remote.run
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
kyr [Thu, 10 Dec 2020 00:28:35 +0000 (01:28 +0100)]
Merge pull request #1588 from lxbsz/install1
install: pass the 'shaman' to Shaman class
Xiubo Li [Wed, 9 Dec 2020 15:06:59 +0000 (23:06 +0800)]
install: pass the 'shaman' to Shaman class
Signed-off-by: Xiubo Li <xiubli@redhat.com>
kyr [Tue, 8 Dec 2020 16:59:20 +0000 (17:59 +0100)]
Merge pull request #1585 from kshtsk/wip-rocket
teuthology-suite: add Rocket.Chat notification
Kyr Shatskyy [Mon, 30 Nov 2020 20:37:05 +0000 (21:37 +0100)]
teuthology-suite: add Rocket.Chat notification
Add Rocket.Chat notification for sleep before teardown.
For details see https://rocket.chat/
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Jason Dillaman [Mon, 7 Dec 2020 16:34:49 +0000 (11:34 -0500)]
Merge pull request #1578 from lxbsz/override
packaging: try noarch repo if arch repo doesn't exist
Reviewed-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
kyr [Wed, 2 Dec 2020 08:55:57 +0000 (09:55 +0100)]
Merge pull request #1581 from ideepika/add-interactive-on-error
teuthology command: introduce --interactive-on-error flag
Kefu Chai [Wed, 2 Dec 2020 07:19:57 +0000 (15:19 +0800)]
Merge pull request #1586 from tchaikov/wip-gzip-archive
task.internal.archive: gzip archived file size > 128MB
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Kefu Chai [Wed, 2 Dec 2020 05:11:09 +0000 (13:11 +0800)]
task.internal.archive: gzip archived file size > 128MB
* misc: add an optional write_to argument to misc.pull_directory()
so the caller can optionally specify the function to write
to local file.
* task/internal: add a global option "log-compress-min-size" which
defaults to "128MB". If the size of a file pulled from a remote
host is greater than or equal to the specified size, it will be
compressed with gzip and given the ".gz" extension before being
stored in the archive directory.
Signed-off-by: Kefu Chai <kchai@redhat.com>
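The size-threshold behaviour above can be sketched as follows. The function name is hypothetical, the sketch works on local files only, and reading "128MB" as 128 * 1024 * 1024 bytes is an assumption:

```python
import gzip
import os
import shutil

# Assumption: "128MB" is read here as 128 MiB.
MIN_SIZE = 128 * 1024 * 1024


def archive_file(src, dest_dir, min_size=MIN_SIZE):
    """Copy src into dest_dir, gzip-compressing it if it is large enough."""
    name = os.path.basename(src)
    if os.path.getsize(src) >= min_size:
        dest = os.path.join(dest_dir, name + ".gz")
        with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    else:
        dest = os.path.join(dest_dir, name)
        shutil.copyfile(src, dest)
    return dest
```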
Kefu Chai [Wed, 2 Dec 2020 05:10:33 +0000 (13:10 +0800)]
docs/detailed_test_config: document "archive-on-error" option
Signed-off-by: Kefu Chai <kchai@redhat.com>
Xiubo Li [Thu, 5 Nov 2020 03:25:23 +0000 (22:25 -0500)]
packaging: add force_noarch option support
Add a "force_noarch: True" option to force the use of "noarch"
when building the URI; it defaults to False.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Kyr Shatskyy [Mon, 30 Nov 2020 15:27:10 +0000 (16:27 +0100)]
orchestra: introduce quiet mode for remote.run
Applied changes:
- Add quiet option to remote.run and subsidiary function calls
- Logging commands now directed to DEBUG instead of INFO logger
This is useful when we want to suppress logs for some kinds of commands, such as
reading binary files or logging useless data to stdout/stderr, as well as
dumping sensitive information.
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
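One way to read the quiet option is as a switch for the level at which the command line is logged. The sketch below assumes that reading; the logger name and function are hypothetical and not the orchestra API:

```python
import logging

log = logging.getLogger("remote_sketch")


def log_command(cmd, quiet=False):
    """Log the command at DEBUG when quiet, otherwise at INFO."""
    level = logging.DEBUG if quiet else logging.INFO
    log.log(level, "Running: %s", cmd)
    return level
```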
Kefu Chai [Sat, 21 Nov 2020 05:44:42 +0000 (13:44 +0800)]
Merge pull request #1575 from mdw-at-linuxbox/ssh-ecdsa
orchestra/connection: accept ecdsa (and future) host key types.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Marcus Watts [Sat, 31 Oct 2020 19:31:35 +0000 (15:31 -0400)]
orchestra/connection: accept ecdsa (and future) host key types.
Out of the box, centos 8 ssh daemon makes this file,
/etc/ssh/ssh_host_ecdsa_key.pub
containing a key of type "ecdsa-sha2-nistp256", which was
not recognized by the existing teuthology logic.
Use logic in paramiko.hostkeys to recognize the new key types.
Signed-off-by: Marcus Watts <mwatts@redhat.com>
Deepika Upadhyay [Mon, 16 Nov 2020 09:07:44 +0000 (14:37 +0530)]
docs/detailed_test_config: document ``--block`` option
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
Deepika Upadhyay [Thu, 12 Nov 2020 17:08:41 +0000 (22:38 +0530)]
teuthology command: introduce --interactive-on-error flag
Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>
Vasu Kulkarni [Thu, 12 Nov 2020 03:18:34 +0000 (19:18 -0800)]
Merge pull request #1580 from ceph/rh_ds_yml
redhat downstream yaml location changed and redhat install tasks to accept lists
Dan Mick [Tue, 10 Nov 2020 20:34:37 +0000 (12:34 -0800)]
Merge pull request #1577 from dmick/wip-container-build-complete
Check shaman not only for repo but for build complete