git.apps.os.sepia.ceph.com Git - teuthology.git/log
2 years agoPhysicalConsole: Tolerate invalid UTF-8 characters 1880/head
Zack Cerza [Wed, 2 Aug 2023 18:03:21 +0000 (12:03 -0600)]
PhysicalConsole: Tolerate invalid UTF-8 characters

... in pexpect.spawn() calls.

Fixes: https://tracker.ceph.com/issues/62286
Signed-off-by: Zack Cerza <zack@redhat.com>
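
The commit message does not show the change itself. pexpect.spawn() accepts `encoding` and `codec_errors` arguments, so one plausible approach is to pass a lenient error handler; that is an assumption about the fix, not something stated here. The sketch below uses only the standard library to show the effect of such a handler. `decode_tolerantly` is a hypothetical name.

```python
# Bytes as a serial console might emit them: valid ASCII plus one stray 0xff byte.
raw = b"login: \xff ok"


def decode_tolerantly(data: bytes) -> str:
    """Decode console output, substituting U+FFFD for undecodable bytes."""
    return data.decode("utf-8", errors="replace")


text = decode_tolerantly(raw)

# A strict decode of the same bytes raises instead of returning text.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

With a strict decoder a single bad byte aborts the read; with a replacing decoder the rest of the console output is still usable.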
2 years agoPhysicalConsole.check_status(): Use log.exception
Zack Cerza [Wed, 2 Aug 2023 17:04:21 +0000 (11:04 -0600)]
PhysicalConsole.check_status(): Use log.exception

Signed-off-by: Zack Cerza <zack@redhat.com>
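
As a minimal illustration of what switching to log.exception buys: unlike log.error, it appends the active traceback to the message. The logger name, the message text and the simulated failure below are all made up for the example and are not taken from PhysicalConsole.

```python
import io
import logging

# Capture log output in memory so the effect is easy to inspect.
stream = io.StringIO()
log = logging.getLogger("console_example")  # hypothetical logger name
log.addHandler(logging.StreamHandler(stream))
log.setLevel(logging.DEBUG)
log.propagate = False


def check_status() -> bool:
    """Return False on failure, logging the full traceback."""
    try:
        raise RuntimeError("simulated console failure")
    except Exception:
        # log.exception() logs at ERROR level and includes the traceback.
        log.exception("Failed to get console status")
        return False


ok = check_status()
output = stream.getvalue()
```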
2 years agoMerge pull request #1877 from ceph/sentry-ae
Dan Mick [Tue, 1 Aug 2023 00:51:49 +0000 (17:51 -0700)]
Merge pull request #1877 from ceph/sentry-ae

supervisor: Fix an AttributeError in reimage()

2 years agosupervisor: Fix an AttributeError in reimage() sentry-ae 1877/head
Zack Cerza [Mon, 31 Jul 2023 23:31:43 +0000 (17:31 -0600)]
supervisor: Fix an AttributeError in reimage()

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1859 from ceph/quiet-urllib3
Dan Mick [Fri, 28 Jul 2023 23:08:05 +0000 (16:08 -0700)]
Merge pull request #1859 from ceph/quiet-urllib3

Turn down logging for urllib3.util.retry

2 years agoMerge pull request #1875 from ceph/reimage-errs-sentry
Dan Mick [Fri, 28 Jul 2023 21:52:06 +0000 (14:52 -0700)]
Merge pull request #1875 from ceph/reimage-errs-sentry

Report reimage failures to Sentry

2 years agoMerge pull request #1876 from ceph/afa-sort
Dan Mick [Fri, 28 Jul 2023 21:02:07 +0000 (14:02 -0700)]
Merge pull request #1876 from ceph/afa-sort

task.ansible.FailureAnalyzer: Sort failure items

2 years agotask.ansible.FailureAnalyzer: Sort failure items afa-sort 1876/head
Zack Cerza [Fri, 28 Jul 2023 19:22:52 +0000 (13:22 -0600)]
task.ansible.FailureAnalyzer: Sort failure items

To reduce unnecessary duplication in e.g. Sentry.

Signed-off-by: Zack Cerza <zack@redhat.com>
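
A small sketch of why sorting helps, assuming the failure items end up joined into a single message: without a stable order, the same set of failures can produce different strings, which a grouping service would treat as different events. `failure_summary` and the sample strings are hypothetical.

```python
def failure_summary(failures):
    """Join failure descriptions in a stable order (hypothetical helper)."""
    return "; ".join(sorted(failures))


# The same two failures, discovered in a different order on two runs.
run_a = failure_summary(["host unreachable", "package install failed"])
run_b = failure_summary(["package install failed", "host unreachable"])
```

Both runs yield the same text, so they would be counted as one issue rather than two.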
2 years agoMerge pull request #1874 from ceph/fix-fog-timeout
Dan Mick [Thu, 27 Jul 2023 21:56:28 +0000 (14:56 -0700)]
Merge pull request #1874 from ceph/fix-fog-timeout

fog: Fix a connection timeout bug

2 years agoMerge pull request #1873 from ceph/console-log
Dan Mick [Thu, 27 Jul 2023 21:55:36 +0000 (14:55 -0700)]
Merge pull request #1873 from ceph/console-log

orchestra.console: Scope loggers to shortname

2 years agosupervisor.reimage: Report failures to Sentry reimage-errs-sentry 1875/head
Zack Cerza [Thu, 27 Jul 2023 17:49:16 +0000 (11:49 -0600)]
supervisor.reimage: Report failures to Sentry

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMove Sentry reporting logic to utils
Zack Cerza [Thu, 27 Jul 2023 17:25:23 +0000 (11:25 -0600)]
Move Sentry reporting logic to utils

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoFOG._wait_for_ready(): Catch ConnectionErrors 1874/head
Zack Cerza [Thu, 27 Jul 2023 17:42:46 +0000 (11:42 -0600)]
FOG._wait_for_ready(): Catch ConnectionErrors

Instead of just ConnectionResetErrors, which inherit from
ConnectionError.

Signed-off-by: Zack Cerza <zack@redhat.com>
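
The inheritance relationship mentioned above is part of Python's built-in exception hierarchy, so catching ConnectionError also covers resets, refusals and aborts. The retry helper below is a hypothetical sketch, not the actual FOG._wait_for_ready() code.

```python
def wait_for_ready(connect, tries=3):
    """Call connect() until it succeeds or tries run out (hypothetical helper)."""
    last_error = None
    for _ in range(tries):
        try:
            return connect()
        except ConnectionError as exc:
            # Also catches ConnectionResetError, ConnectionRefusedError, etc.
            last_error = exc
    raise ConnectionError("gave up waiting") from last_error


calls = []


def flaky():
    """Fail with two different ConnectionError subclasses, then succeed."""
    calls.append(1)
    if len(calls) == 1:
        raise ConnectionResetError("reset by peer")
    if len(calls) == 2:
        raise ConnectionRefusedError("refused")
    return "ready"


result = wait_for_ready(flaky)
```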
2 years agoremote: Raise ConnectionError when appropriate
Zack Cerza [Thu, 27 Jul 2023 17:41:11 +0000 (11:41 -0600)]
remote: Raise ConnectionError when appropriate

Instead of just Exception.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoorchestra.console: Scope loggers to shortname 1873/head
Zack Cerza [Thu, 27 Jul 2023 16:24:25 +0000 (10:24 -0600)]
orchestra.console: Scope loggers to shortname

This will make reading console debug logging easier.

Signed-off-by: Zack Cerza <zack@redhat.com>
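
One common way to scope a logger per machine is to create a child logger whose name ends in the host's short name, so every record identifies its console. This is a sketch under that assumption; the `ExampleConsole` class, the base logger name and the host name are hypothetical.

```python
import logging

BASE_LOGGER = "console_example"  # hypothetical base name


class ExampleConsole:
    """Holds a logger named after the machine it talks to."""

    def __init__(self, hostname: str):
        # Use the part before the first dot as the short name.
        self.shortname = hostname.split(".")[0]
        self.log = logging.getLogger(f"{BASE_LOGGER}.{self.shortname}")


first = ExampleConsole("node001.example.com")
second = ExampleConsole("node002.example.com")
```

With names like these, interleaved debug output from several consoles can be told apart, or filtered, by logger name.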
2 years agoMerge pull request #1871 from ceph/sentry-ansible
Zack Cerza [Mon, 24 Jul 2023 19:38:11 +0000 (13:38 -0600)]
Merge pull request #1871 from ceph/sentry-ansible

2 years agoexceptions.AnsibleFailedError: Add fingerprint() 1871/head
Zack Cerza [Mon, 24 Jul 2023 17:22:51 +0000 (11:22 -0600)]
exceptions.AnsibleFailedError: Add fingerprint()

This will cause Sentry to group events by their failure reasons, rather
than lumping all AnsibleFailedErrors together.

Signed-off-by: Zack Cerza <zack@redhat.com>
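
Sentry groups events by a fingerprint, which can be overridden with a list of strings. The sketch below shows the general idea of deriving that list from the failure reasons; the class is a simplified stand-in and does not reproduce the real AnsibleFailedError or its fingerprint() logic.

```python
class ExampleAnsibleFailure(Exception):
    """Simplified stand-in for an Ansible failure exception (hypothetical)."""

    def __init__(self, failures):
        super().__init__(f"{len(failures)} ansible failure(s)")
        self.failures = list(failures)

    def fingerprint(self):
        """Return a stable list of distinct failure reasons."""
        return sorted(set(self.failures))


a = ExampleAnsibleFailure(["unreachable", "apt failed"])
b = ExampleAnsibleFailure(["apt failed", "unreachable", "unreachable"])
c = ExampleAnsibleFailure(["disk full"])
```

Errors `a` and `b` share a fingerprint and would be grouped together, while `c` would form its own group even though all three are the same exception type.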
2 years agoMerge pull request #1869 from dmick/wip-pexpect
Dan Mick [Sat, 22 Jul 2023 03:42:43 +0000 (20:42 -0700)]
Merge pull request #1869 from dmick/wip-pexpect

orchestra/console: log output from pexpect commands

2 years agoorchestra/console: log output from pexpect commands 1869/head
Dan Mick [Thu, 20 Jul 2023 02:45:11 +0000 (19:45 -0700)]
orchestra/console: log output from pexpect commands

In case anything weird is being noticed and communicated by
ipmitool, try to display anything it says.

Signed-off-by: Dan Mick <dmick@redhat.com>
2 years agoMerge pull request #1870 from ceph/pyyaml-fix
Dan Mick [Sat, 22 Jul 2023 00:35:03 +0000 (17:35 -0700)]
Merge pull request #1870 from ceph/pyyaml-fix

Pin PyYAML to fix CI breakage

2 years agoUpdate pip-tools 1870/head
Zack Cerza [Fri, 21 Jul 2023 19:53:29 +0000 (13:53 -0600)]
Update pip-tools

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoExclude PyYAML 5.4.0,5.4.1
Zack Cerza [Fri, 21 Jul 2023 19:49:52 +0000 (13:49 -0600)]
Exclude PyYAML 5.4.0,5.4.1

See https://github.com/yaml/pyyaml/issues/601

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1866 from ceph/aa-fix
Zack Cerza [Tue, 18 Jul 2023 17:48:15 +0000 (11:48 -0600)]
Merge pull request #1866 from ceph/aa-fix

2 years agoMerge pull request #1867 from ceph/keyscan-timout
Zack Cerza [Fri, 14 Jul 2023 20:58:51 +0000 (14:58 -0600)]
Merge pull request #1867 from ceph/keyscan-timout

2 years agomisc.ssh_keyscan: Always retry, and retry more 1867/head
Zack Cerza [Fri, 14 Jul 2023 20:27:32 +0000 (14:27 -0600)]
misc.ssh_keyscan: Always retry, and retry more

We started seeing reimage failures with errors like:
"teuthology.exceptions.MaxWhileTries: 'ssh_keyscan $host' reached
maximum tries (6) after waiting for 5 seconds"

Let's be quite a bit more generous.

Signed-off-by: Zack Cerza <zack@redhat.com>
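
The error quoted above comes from a bounded retry loop. A minimal sketch of such a loop follows; the commit does not state the new limits, so `tries` and `sleep` are left as required parameters, and `MaxTriesError` is only a stand-in for the real exception class.

```python
import time


class MaxTriesError(Exception):
    """Stand-in for teuthology.exceptions.MaxWhileTries (not the real class)."""


def retry(action, tries, sleep, sleeper=time.sleep):
    """Call action() until it returns a non-None result or tries are exhausted."""
    for attempt in range(1, tries + 1):
        result = action()
        if result is not None:
            return result
        if attempt < tries:
            sleeper(sleep)  # wait before the next attempt
    raise MaxTriesError(f"reached maximum tries ({tries})")


attempts = []


def fake_keyscan():
    """Return no keys for the first three calls, then a fake key line."""
    attempts.append(1)
    return "host ssh-ed25519 AAAA" if len(attempts) > 3 else None


# Inject a no-op sleeper so the example runs instantly.
keys = retry(fake_keyscan, tries=10, sleep=1, sleeper=lambda s: None)
```

Raising the `tries` value is what "be more generous" amounts to in a loop of this shape: a host that needs four attempts succeeds with a limit of ten but would fail with a limit of three.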
2 years agoTestFailureAnalyzer: Add tests for dropped items 1866/head
Zack Cerza [Fri, 14 Jul 2023 18:01:35 +0000 (12:01 -0600)]
TestFailureAnalyzer: Add tests for dropped items

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoansible.FailureAnalyzer: Drop malformed records
Zack Cerza [Fri, 14 Jul 2023 17:50:24 +0000 (11:50 -0600)]
ansible.FailureAnalyzer: Drop malformed records

If host_obj is the wrong type, we won't be able to extract anything
useful from it. In these cases, we'll end up using the raw string as we
used to do.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoAnsible._handle_failure: YAMLErrors are special
Zack Cerza [Fri, 14 Jul 2023 17:38:54 +0000 (11:38 -0600)]
Ansible._handle_failure: YAMLErrors are special

Return to treating them differently, but also continue to catch other
exceptions here.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoansible.FailureAnalyzer: Look for SSH errors
Zack Cerza [Fri, 14 Jul 2023 17:37:33 +0000 (11:37 -0600)]
ansible.FailureAnalyzer: Look for SSH errors

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoansible.FailureAnalyzer: items -> values
Zack Cerza [Fri, 14 Jul 2023 17:17:19 +0000 (11:17 -0600)]
ansible.FailureAnalyzer: items -> values

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1865 from ceph/ansible-fail-tolerate-exceptions
Dan Mick [Thu, 13 Jul 2023 23:33:37 +0000 (16:33 -0700)]
Merge pull request #1865 from ceph/ansible-fail-tolerate-exceptions

Ansible._handle_failure: Catch all Exceptions

2 years agoAnsible._handle_failure: Catch all Exceptions 1865/head
Zack Cerza [Thu, 13 Jul 2023 22:58:05 +0000 (16:58 -0600)]
Ansible._handle_failure: Catch all Exceptions

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1864 from ceph/analyze-ansible
Zack Cerza [Thu, 13 Jul 2023 00:58:34 +0000 (18:58 -0600)]
Merge pull request #1864 from ceph/analyze-ansible

2 years agoansible: Try to summarize failure logs analyze-ansible 1864/head
Zack Cerza [Wed, 5 Jul 2023 21:12:05 +0000 (15:12 -0600)]
ansible: Try to summarize failure logs

The failure logs we capture are sometimes helpful, but are often too
long, too complex, and too noisy to understand. Sentry also struggles to
associate related failures because of the presence of unique data such
as timestamps and URLs.

While I don't see a quick and generic solution to this, there are
several common failure modes that can easily be summarized. This commit
begins that work by looking for errors caused by network outages.

Signed-off-by: Zack Cerza <zack@redhat.com>
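
As a rough sketch of the approach described, a summarizer can scan the captured log for known signatures and return a short, stable label instead of the raw text. The patterns and the label below are illustrative guesses, not the ones this commit adds.

```python
import re

# Hypothetical signatures of network trouble; the real list is not shown here.
NETWORK_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"temporary failure in name resolution",
        r"connection timed out",
        r"network is unreachable",
    )
]


def summarize(log_text: str):
    """Return a short label for a recognized failure mode, else None."""
    for pattern in NETWORK_PATTERNS:
        if pattern.search(log_text):
            return "network outage"
    return None


noisy = "2023-07-05T21:12:05 fetch http://mirror.example/pkg: Connection timed out"
label = summarize(noisy)
```

Because the label omits timestamps and URLs, two failures with the same cause produce the same summary.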
2 years agotest_ansible: Use mock_open()
Zack Cerza [Wed, 5 Jul 2023 23:07:32 +0000 (17:07 -0600)]
test_ansible: Use mock_open()

This provides a more complete interface than what we were constructing.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1863 from ceph/fog-wfr-eoferror
Dan Mick [Sat, 1 Jul 2023 00:05:18 +0000 (17:05 -0700)]
Merge pull request #1863 from ceph/fog-wfr-eoferror

FOG._wait_for_ready: Tolerate EOFError

2 years agoFOG._wait_for_ready: Tolerate EOFError fog-wfr-eoferror 1863/head
Zack Cerza [Fri, 30 Jun 2023 22:11:14 +0000 (16:11 -0600)]
FOG._wait_for_ready: Tolerate EOFError

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1862 from ceph/retry-sentinel-connreset
Zack Cerza [Fri, 30 Jun 2023 19:19:08 +0000 (13:19 -0600)]
Merge pull request #1862 from ceph/retry-sentinel-connreset

2 years agoRemote.reconnect(): Use a default timeout of 30s 1862/head
Zack Cerza [Fri, 30 Jun 2023 18:29:52 +0000 (12:29 -0600)]
Remote.reconnect(): Use a default timeout of 30s

And rewrite with safe_while.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoFOG._wait_for_ready: Tolerate ConnectionResetError
Zack Cerza [Thu, 29 Jun 2023 19:08:37 +0000 (13:08 -0600)]
FOG._wait_for_ready: Tolerate ConnectionResetError

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agocontextutil: Remove leftover print statement
Zack Cerza [Thu, 29 Jun 2023 19:02:47 +0000 (13:02 -0600)]
contextutil: Remove leftover print statement

Looks like this was missed during PR submission/review in 8f8d05852

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1858 from ceph/exporter-restart
Zack Cerza [Wed, 28 Jun 2023 17:56:02 +0000 (11:56 -0600)]
Merge pull request #1858 from ceph/exporter-restart

2 years agoMerge pull request #1860 from ceph/disp-rc
Dan Mick [Tue, 27 Jun 2023 22:10:08 +0000 (15:10 -0700)]
Merge pull request #1860 from ceph/disp-rc

dispatcher: Return the highest of the jobs' RCs

2 years agodispatcher: Return the highest of the jobs' RCs 1860/head
Zack Cerza [Tue, 11 Oct 2022 19:10:53 +0000 (13:10 -0600)]
dispatcher: Return the highest of the jobs' RCs

This is so that ceph-devstack can report job failures

Signed-off-by: Zack Cerza <zack@redhat.com>
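
A minimal sketch of the rule in the title, assuming each job yields an integer return code and that zero means success. `overall_return_code` is a hypothetical helper, not the dispatcher's own function.

```python
def overall_return_code(job_rcs):
    """Return the highest job return code, or 0 when no jobs ran."""
    return max(job_rcs, default=0)


all_passed = overall_return_code([0, 0, 0])
one_failed = overall_return_code([0, 2, 1])
no_jobs = overall_return_code([])
```

A caller such as a CI wrapper can then treat any nonzero exit status as "at least one job failed".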
2 years agoMerge pull request #1855 from ceph/ssh-ux
Dan Mick [Mon, 26 Jun 2023 23:09:43 +0000 (16:09 -0700)]
Merge pull request #1855 from ceph/ssh-ux

Improve error message when there is no SSH key

2 years agoTurn down logging for urllib3.util.retry quiet-urllib3 1859/head
Zack Cerza [Mon, 26 Jun 2023 22:54:04 +0000 (16:54 -0600)]
Turn down logging for urllib3.util.retry

This quiets messages like: "Converted retries value: 10 ->
Retry(total=10, connect=None, read=None, redirect=None, status=None)"

Signed-off-by: Zack Cerza <zack@redhat.com>
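
The quoted message appears to be emitted by urllib3 at DEBUG level, so raising the threshold of that one logger hides it without affecting others. The sketch below uses the standard logging API; the choice of INFO as the new level is an assumption, since the commit message does not name it.

```python
import logging

# Raise the threshold only for the noisy logger named in the commit title.
retry_logger = logging.getLogger("urllib3.util.retry")
retry_logger.setLevel(logging.INFO)

debug_enabled = retry_logger.isEnabledFor(logging.DEBUG)
warning_enabled = retry_logger.isEnabledFor(logging.WARNING)
```

Warnings and errors from the same module still pass through; only the lower-priority chatter is dropped.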
2 years agoexporter: Restart every 24h 1858/head
Zack Cerza [Tue, 13 Jun 2023 23:49:48 +0000 (17:49 -0600)]
exporter: Restart every 24h

A design limitation of prometheus-client's multiprocessing mode is that
each process creates files to store its own metrics; the exporter then
has to read each file, even if the process which created it is dead.

This results in request latency growing over time, to the point of
multiple seconds when the file count gets into the thousands. This
eventually results in prometheus failing to fetch, leaving gaps in our
data.

We can work around this by restarting at a regular interval; 24h seems
like a fine place to start.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1856 from ceph/fog-debug
Dan Mick [Fri, 16 Jun 2023 22:48:04 +0000 (15:48 -0700)]
Merge pull request #1856 from ceph/fog-debug

fog: Add more debug logging

2 years agoFOG._wait_for_ready(): Use instance logger 1856/head
Zack Cerza [Fri, 16 Jun 2023 16:24:29 +0000 (10:24 -0600)]
FOG._wait_for_ready(): Use instance logger

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agodispatcher/supervisor: Set root logger level
Zack Cerza [Fri, 16 Jun 2023 16:23:42 +0000 (10:23 -0600)]
dispatcher/supervisor: Set root logger level

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agofog: Add more debug logging
Zack Cerza [Wed, 14 Jun 2023 20:53:36 +0000 (14:53 -0600)]
fog: Add more debug logging

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1854 from ceph/bootstrap-c9s
Zack Cerza [Wed, 14 Jun 2023 15:33:28 +0000 (09:33 -0600)]
Merge pull request #1854 from ceph/bootstrap-c9s

2 years agoImprove error message when there is no SSH key 1855/head
Zack Cerza [Tue, 13 Jun 2023 20:24:12 +0000 (14:24 -0600)]
Improve error message when there is no SSH key

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1853 from ceph/reimage-no-ctx
Zack Cerza [Tue, 13 Jun 2023 19:08:40 +0000 (13:08 -0600)]
Merge pull request #1853 from ceph/reimage-no-ctx

2 years agobootstrap: Tolerate a missing lsb_release 1854/head
Zack Cerza [Mon, 12 Jun 2023 21:48:34 +0000 (15:48 -0600)]
bootstrap: Tolerate a missing lsb_release

This fixes the lack of support for CentOS 9.Stream

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoprovision: Avoid a possible AttributeError 1853/head
Zack Cerza [Mon, 12 Jun 2023 21:37:56 +0000 (15:37 -0600)]
provision: Avoid a possible AttributeError

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1846 from ceph/stdin-killer
Zack Cerza [Wed, 7 Jun 2023 23:26:37 +0000 (17:26 -0600)]
Merge pull request #1846 from ceph/stdin-killer

2 years agoteuthology/task/install: add stdin-killer helper 1846/head
Patrick Donnelly [Thu, 18 May 2023 13:24:57 +0000 (09:24 -0400)]
teuthology/task/install: add stdin-killer helper

This helper tool runs commands which may or may not take data on stdin.
Like "daemon-helper", if stdin signals EOF, stdin-killer will kill the
command but only as a last resort. It forwards EOF to the command by
closing the command's stdin (pipe) and then waiting a configurable
amount of time for the command to gracefully exit.

Additionally, if stdout or stderr are hung up -- i.e. the ssh parent
process has terminated -- then stdin-killer also detects this and
initiates the graceful shutdown of the command. This is something
daemon-helper does not do.

In general, this tool is a superior replacement of the daemon-helper
tool because you can write to the command's stdin normally.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
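
The EOF handling described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the stdin-killer tool itself: it omits the detection of hung-up stdout/stderr, and the function name and `grace_seconds` parameter are hypothetical.

```python
import io
import subprocess
import sys


def run_forwarding_stdin(argv, input_stream, grace_seconds):
    """Copy input to the child; on EOF close its stdin, wait, kill as a last resort."""
    child = subprocess.Popen(argv, stdin=subprocess.PIPE)
    try:
        for chunk in iter(lambda: input_stream.read(4096), b""):
            child.stdin.write(chunk)
            child.stdin.flush()
    except BrokenPipeError:
        pass  # the child closed its stdin or exited early
    try:
        child.stdin.close()  # forward EOF to the child
    except BrokenPipeError:
        pass
    try:
        return child.wait(timeout=grace_seconds)  # give it time to exit cleanly
    except subprocess.TimeoutExpired:
        child.kill()  # last resort
        return child.wait()


# A child that reads stdin to EOF and exits with the number of bytes read.
reader = [sys.executable, "-c", "import sys; sys.exit(len(sys.stdin.buffer.read()))"]
graceful_rc = run_forwarding_stdin(reader, io.BytesIO(b"abc"), grace_seconds=10)

# A child that ignores EOF and keeps running past the grace period.
sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
killed_rc = run_forwarding_stdin(sleeper, io.BytesIO(b""), grace_seconds=0.5)
```

The first child sees EOF and exits on its own; the second is killed only after the grace period expires.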
2 years agosetup.cfg: install binary helpers
Patrick Donnelly [Thu, 18 May 2023 13:20:57 +0000 (09:20 -0400)]
setup.cfg: install binary helpers

These are used by vstart_runner.py for local dev operations. Install
them so they are available in the virtualenv bin directory.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 years agoteuthology/task/install: reorganize binary helpers
Patrick Donnelly [Wed, 17 May 2023 18:32:19 +0000 (14:32 -0400)]
teuthology/task/install: reorganize binary helpers

We intend to install these so move them into an appropriately named
directory.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 years agoMerge pull request #1803 from jdurgin/wip-configurable-timeouts
Zack Cerza [Wed, 31 May 2023 22:12:50 +0000 (16:12 -0600)]
Merge pull request #1803 from jdurgin/wip-configurable-timeouts

2 years agoMerge pull request #1851 from ceph/reimage-failures
Zack Cerza [Wed, 31 May 2023 20:00:35 +0000 (14:00 -0600)]
Merge pull request #1851 from ceph/reimage-failures

2 years agofog: Verify reimaged machine OS 1851/head
Zack Cerza [Thu, 18 May 2023 00:12:13 +0000 (18:12 -0600)]
fog: Verify reimaged machine OS

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1850 from ceph/unmask-unlock-response
Zack Cerza [Fri, 26 May 2023 18:23:34 +0000 (12:23 -0600)]
Merge pull request #1850 from ceph/unmask-unlock-response

2 years agoMerge pull request #1849 from ceph/prom-reimage-results
Dan Mick [Wed, 24 May 2023 21:48:00 +0000 (14:48 -0700)]
Merge pull request #1849 from ceph/prom-reimage-results

exporter: Instrument node reimaging success/fail

2 years agolock.ops.unlock_one: Fail sooner on 403, with msg 1850/head
Zack Cerza [Wed, 24 May 2023 17:53:19 +0000 (11:53 -0600)]
lock.ops.unlock_one: Fail sooner on 403, with msg

In the case of e.g. owners values not matching on an unlock attempt, we
were exhausting all retries and failing to display the exact reason for
the unlock failure. We can simply break on 403 errors and let the rest
of the function do its thing.

Signed-off-by: Zack Cerza <zack@redhat.com>
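
The control flow described above can be sketched as a retry loop that stops early on 403 and keeps the response so its message can be reported. `FakeResponse`, the function signature and the sample message are hypothetical; this is not the actual lock.ops.unlock_one() code.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Minimal stand-in for an HTTP response object."""
    status_code: int
    text: str = ""


def unlock_one(send_request, tries):
    """Return (success, reason); stop retrying as soon as a 403 is seen."""
    response = None
    for _ in range(tries):
        response = send_request()
        if response.status_code == 200:
            return True, ""
        if response.status_code == 403:
            break  # retrying cannot fix a permission error
    reason = response.text if response is not None else "no attempts made"
    return False, reason


requests_sent = []


def forbidden():
    requests_sent.append(1)
    return FakeResponse(403, "owner mismatch")


ok, reason = unlock_one(forbidden, tries=5)
```

Only one request is sent, and the server's explanation reaches the caller instead of a generic "retries exhausted" error.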
2 years agofog: make timeouts configurable 1803/head
Josh Durgin [Tue, 20 Dec 2022 18:53:20 +0000 (18:53 +0000)]
fog: make timeouts configurable

This will help with the sepia lab, being able to increase these
temporarily to handle a new fog server that is sometimes exceeding the
hardcoded timeouts.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2 years agocontextutil: allow safe_while to use an explicit timeout
Josh Durgin [Tue, 20 Dec 2022 18:50:42 +0000 (18:50 +0000)]
contextutil: allow safe_while to use an explicit timeout

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
2 years agonuke: Fix an import issue 1849/head
Zack Cerza [Tue, 23 May 2023 23:43:58 +0000 (17:43 -0600)]
nuke: Fix an import issue

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agolock.ops.reimage_machines: Drop incorrect log msg
Zack Cerza [Tue, 28 Mar 2023 22:22:33 +0000 (16:22 -0600)]
lock.ops.reimage_machines: Drop incorrect log msg

This message was being logged when the reimage started, not finished.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoexporter: Instrument node reimaging success/fail
Zack Cerza [Tue, 23 May 2023 19:53:23 +0000 (13:53 -0600)]
exporter: Instrument node reimaging success/fail

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1847 from ceph/fog-timeout
Dan Mick [Tue, 23 May 2023 00:39:23 +0000 (17:39 -0700)]
Merge pull request #1847 from ceph/fog-timeout

fog: Increase timeout in wait_for_deploy_task()

2 years agotest_fog: Fix up test for preceding commit "Increase timeout" 1847/head
Dan Mick [Tue, 23 May 2023 00:15:25 +0000 (17:15 -0700)]
test_fog: Fix up test for preceding commit "Increase timeout"

Signed-off-by: Dan Mick <dmick@redhat.com>
2 years agofog: Increase timeout in wait_for_deploy_task()
Zack Cerza [Mon, 22 May 2023 23:59:36 +0000 (17:59 -0600)]
fog: Increase timeout in wait_for_deploy_task()

When too many reimaging ops are running concurrently, we're seeing
timeouts. This isn't a true fix, but should help things until we've got
one.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1843 from dmick/wip-kernel-sort
Zack Cerza [Mon, 22 May 2023 20:45:58 +0000 (14:45 -0600)]
Merge pull request #1843 from dmick/wip-kernel-sort

2 years agoMerge pull request #1844 from cbodley/wip-install-copr
Casey Bodley [Fri, 19 May 2023 16:21:52 +0000 (12:21 -0400)]
Merge pull request #1844 from cbodley/wip-install-copr

task: install supports enable_coprs array

Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
Reviewed-by: Zack Cerza <zcerza@redhat.com>
2 years agotask: install supports enable_coprs array 1844/head
Casey Bodley [Wed, 17 May 2023 20:21:37 +0000 (16:21 -0400)]
task: install supports enable_coprs array

Enable the installation of packages from Fedora Copr repositories.

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2 years agotask/kernel.py: sort installed kernels by version 1843/head
Dan Mick [Wed, 17 May 2023 00:05:15 +0000 (17:05 -0700)]
task/kernel.py: sort installed kernels by version

rpm -q --last sorts by timestamp-of-install, which does not
necessarily correlate with "latest version".  sort -rV does.

Signed-off-by: Dan Mick <dmick@redhat.com>
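
To illustrate why a version-aware sort matters, the Python sketch below compares a plain string sort with a key that treats digit runs as numbers. It only approximates what `sort -V` does, and the package names are invented examples.

```python
import re


def version_key(name: str):
    """Split into alternating text and integer parts so 284 sorts above 70."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]


kernels = [
    "kernel-5.14.0-70.el9",
    "kernel-5.14.0-284.el9",
    "kernel-5.14.0-162.el9",
]

newest_by_text = sorted(kernels, reverse=True)[0]
newest_by_version = sorted(kernels, key=version_key, reverse=True)[0]
```

The plain sort picks the `-70` build because it compares character by character, while the version-aware key picks `-284`.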
2 years agoMerge pull request #1804 from ceph/dependabot/pip/wheel-0.38.1
kyr [Wed, 26 Apr 2023 18:49:18 +0000 (20:49 +0200)]
Merge pull request #1804 from ceph/dependabot/pip/wheel-0.38.1

build(deps): bump wheel from 0.36.2 to 0.38.1

2 years agobuild(deps): bump wheel from 0.36.2 to 0.38.1 1804/head
dependabot[bot] [Wed, 26 Apr 2023 18:00:51 +0000 (18:00 +0000)]
build(deps): bump wheel from 0.36.2 to 0.38.1

Bumps [wheel](https://github.com/pypa/wheel) from 0.36.2 to 0.38.1.
- [Release notes](https://github.com/pypa/wheel/releases)
- [Changelog](https://github.com/pypa/wheel/blob/main/docs/news.rst)
- [Commits](https://github.com/pypa/wheel/compare/0.36.2...0.38.1)

---
updated-dependencies:
- dependency-name: wheel
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2 years agoMerge pull request #1831 from kamoltat/wip-ksirivad-rerun-readme
Kamoltat Sirivadhna [Wed, 5 Apr 2023 14:05:41 +0000 (10:05 -0400)]
Merge pull request #1831 from kamoltat/wip-ksirivad-rerun-readme

teuthology-suite: --seed & --subset now also stored in teuthology.log, config.yaml and orig.config.yaml
Reviewed-by: Zack Cerza <zcerza@redhat.com>
2 years agoMerge pull request #1834 from ceph/disp_ls_remote
Dan Mick [Tue, 4 Apr 2023 01:57:20 +0000 (18:57 -0700)]
Merge pull request #1834 from ceph/disp_ls_remote

worker.prep_job: Skip job if ls_remote fails

2 years agoworker.prep_job: Skip job if ls_remote fails 1834/head
Zack Cerza [Mon, 3 Apr 2023 22:07:37 +0000 (16:07 -0600)]
worker.prep_job: Skip job if ls_remote fails

This is preferable to letting the dispatcher die.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agosuite/run.py: Added seed and subset to base_config 1831/head
Kamoltat [Mon, 3 Apr 2023 18:35:01 +0000 (18:35 +0000)]
suite/run.py: Added seed and subset to base_config

In addition to being stored in results.log,
`--seed` and `--subset` are now also stored in:

`teuthology.log`, `config.yaml` and `orig.config.yaml`.

Fixes: https://tracker.ceph.com/issues/59300
Signed-off-by: Kamoltat <ksirivad@redhat.com>
2 years agoteuthology-suite: Log errors & warnings if results.log is missing during --rerun
Kamoltat [Thu, 30 Mar 2023 17:12:17 +0000 (17:12 +0000)]
teuthology-suite: Log errors & warnings if results.log is missing during --rerun

Notify the user if `results.log` is missing when
they issue a rerun.

Also, edited teuthology-suite doc
to inform the user that `--rerun` by
default parses `--seed`, `--subset` and
`--no-nested-subset` from `results.log`.

Fixes: https://tracker.ceph.com/issues/59300
Signed-off-by: Kamoltat <ksirivad@redhat.com>
2 years agoMerge pull request #1832 from ceph/deps
Zack Cerza [Fri, 31 Mar 2023 22:08:58 +0000 (16:08 -0600)]
Merge pull request #1832 from ceph/deps

2 years agoMerge pull request #1827 from ceph/tox-no-osp
Zack Cerza [Fri, 31 Mar 2023 19:54:32 +0000 (13:54 -0600)]
Merge pull request #1827 from ceph/tox-no-osp

2 years agoDrop argparse as a requirement 1832/head
Zack Cerza [Thu, 23 Mar 2023 19:13:31 +0000 (13:13 -0600)]
Drop argparse as a requirement

It's part of the standard library.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agorequirements.txt: Add prometheus_client
Zack Cerza [Thu, 23 Mar 2023 19:12:43 +0000 (13:12 -0600)]
requirements.txt: Add prometheus_client

This should have been added a couple PRs back.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1833 from ceph/gha-ubuntu-version
Zack Cerza [Fri, 31 Mar 2023 17:03:51 +0000 (11:03 -0600)]
Merge pull request #1833 from ceph/gha-ubuntu-version

2 years agobootstrap: apt-get update before installing 1833/head
Zack Cerza [Thu, 30 Mar 2023 22:26:45 +0000 (16:26 -0600)]
bootstrap: apt-get update before installing

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago.github: Explicitly define test matrix
Zack Cerza [Thu, 30 Mar 2023 22:09:18 +0000 (16:09 -0600)]
.github: Explicitly define test matrix

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1824 from batrick/parallel-gzip
Zack Cerza [Thu, 30 Mar 2023 16:43:05 +0000 (10:43 -0600)]
Merge pull request #1824 from batrick/parallel-gzip

2 years agoteuthology: do not compress tarballs when pulling dir 1824/head
Patrick Donnelly [Wed, 22 Mar 2023 14:48:43 +0000 (10:48 -0400)]
teuthology: do not compress tarballs when pulling dir

Where we use this, it's for pulling log files that are already
compressed. Do not waste time double compressing!

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 years agoteuthology/misc: give verbose gzip output
Patrick Donnelly [Tue, 21 Mar 2023 14:37:25 +0000 (10:37 -0400)]
teuthology/misc: give verbose gzip output

For future analysis.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 years agoteuthology/misc: use medium compression
Patrick Donnelly [Tue, 21 Mar 2023 14:36:36 +0000 (10:36 -0400)]
teuthology/misc: use medium compression

To speed things up.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 years agoteuthology/misc: parallelize gzip
Patrick Donnelly [Tue, 21 Mar 2023 14:36:09 +0000 (10:36 -0400)]
teuthology/misc: parallelize gzip

Our machines have lots of cores, use them!

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
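
The commits in this series do not show how gzip is invoked, so the following is only a Python analogue of the two ideas: compress several inputs concurrently and use a mid-range compression level. The function name, the worker count and the level of 5 are all assumptions.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def compress_all(blobs, level=5, workers=4):
    """Gzip each blob in a worker thread at a mid-range compression level."""
    def compress(data: bytes) -> bytes:
        return gzip.compress(data, compresslevel=level)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(compress, blobs))


logs = [b"osd log line\n" * 1000, b"mon log line\n" * 1000, b"mds log line\n" * 1000]
compressed = compress_all(logs)
restored = [gzip.decompress(blob) for blob in compressed]
```

Lower levels trade a slightly larger output for less CPU time per file, which matters most when many logs are compressed at once.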
2 years agoMerge pull request #1826 from ceph/job-time
Zack Cerza [Thu, 23 Mar 2023 18:43:50 +0000 (12:43 -0600)]
Merge pull request #1826 from ceph/job-time

2 years agotox: Don't run openstack by default 1827/head
Zack Cerza [Thu, 23 Mar 2023 18:42:12 +0000 (12:42 -0600)]
tox: Don't run openstack by default

It's quite time-consuming, and we're not sure if it's in use at all.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agosupervisor: Do not instrument certain job times 1826/head
Zack Cerza [Thu, 23 Mar 2023 18:22:15 +0000 (12:22 -0600)]
supervisor: Do not instrument certain job times

This should only really include first/last-in-suite jobs.

Signed-off-by: Zack Cerza <zack@redhat.com>