git.apps.os.sepia.ceph.com Git - teuthology.git/log
teuthology.git
19 months agoAdd teuthology-node-cleanup command
Zack Cerza [Fri, 2 Feb 2024 18:40:52 +0000 (11:40 -0700)]
Add teuthology-node-cleanup command

This replaces teuthology-nuke --stale

Signed-off-by: Zack Cerza <zack@redhat.com>
19 months agoRemove nuke: Rework unlocking
Zack Cerza [Thu, 1 Feb 2024 00:51:09 +0000 (17:51 -0700)]
Remove nuke: Rework unlocking

This commit re-implements functionality that was removed with the nuke system.

Signed-off-by: Zack Cerza <zack@redhat.com>
19 months agoRemove nuke: Trivial changes
Zack Cerza [Thu, 1 Feb 2024 00:28:59 +0000 (17:28 -0700)]
Remove nuke: Trivial changes

This commit contains trivial changes like reference removals, docs changes, and
removal of dead code.

Signed-off-by: Zack Cerza <zack@redhat.com>
19 months agoRemove nuke: deletions
Zack Cerza [Thu, 1 Feb 2024 00:27:35 +0000 (17:27 -0700)]
Remove nuke: deletions

This commit contains only full file deletions, and the relocation of
nuke.actions.clear_firewall() to nuke/__init__.py to retain compatibility with
older ceph.git tasks.

Signed-off-by: Zack Cerza <zack@redhat.com>
19 months agoMerge pull request #1914 from ceph/lock-leaks
Zack Cerza [Wed, 31 Jan 2024 02:30:00 +0000 (19:30 -0700)]
Merge pull request #1914 from ceph/lock-leaks

supervisor: Disregard nuke-on-error when unlocking

19 months agotest_exit: Drop bad test_noop 1914/head
Zack Cerza [Wed, 31 Jan 2024 01:56:19 +0000 (18:56 -0700)]
test_exit: Drop bad test_noop

This test races with other tests because Exiter doesn't have a great way to
remove all installed handlers. This is a test-only issue, so we can drop this
test.

Signed-off-by: Zack Cerza <zack@redhat.com>
19 months agosupervisor: Disregard nuke-on-error when unlocking
Zack Cerza [Wed, 31 Jan 2024 01:04:01 +0000 (18:04 -0700)]
supervisor: Disregard nuke-on-error when unlocking

Signed-off-by: Zack Cerza <zack@redhat.com>
19 months agoMerge pull request #1913 from ceph/wip-64193
Zack Cerza [Mon, 29 Jan 2024 21:15:06 +0000 (14:15 -0700)]
Merge pull request #1913 from ceph/wip-64193

supervisor: Do not nuke nodes after jobs finish

19 months agosupervisor: Do not nuke nodes after jobs finish 1913/head
Zack Cerza [Fri, 26 Jan 2024 21:02:09 +0000 (14:02 -0700)]
supervisor: Do not nuke nodes after jobs finish

This was causing a bad race condition, where we could unlock a node, then unlock
it again via the nuke process after a different job had locked it.

Fixes: https://tracker.ceph.com/issues/64193
Signed-off-by: Zack Cerza <zack@redhat.com>
19 months agoMerge pull request #1912 from ceph/kill-report-dead
Zack Cerza [Mon, 22 Jan 2024 20:30:58 +0000 (13:30 -0700)]
Merge pull request #1912 from ceph/kill-report-dead

kill: After killing a run, report it as dead

19 months agokill: After killing a run, report it as dead 1912/head
Zack Cerza [Mon, 22 Jan 2024 18:33:20 +0000 (11:33 -0700)]
kill: After killing a run, report it as dead

In case processes died a messy death.

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agoMerge pull request #1909 from ceph/deps
Zack Cerza [Mon, 8 Jan 2024 19:07:54 +0000 (12:07 -0700)]
Merge pull request #1909 from ceph/deps

Update dependencies

20 months agoJobProcesses: Ignore zombies safely 1907/head 1909/head
Zack Cerza [Wed, 3 Jan 2024 19:28:18 +0000 (12:28 -0700)]
JobProcesses: Ignore zombies safely

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agofind_dispatcher_processes: Ignore zombies safely
Zack Cerza [Wed, 3 Jan 2024 19:25:50 +0000 (12:25 -0700)]
find_dispatcher_processes: Ignore zombies safely

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agoInstall ansible collections individually
Zack Cerza [Tue, 2 Jan 2024 18:58:36 +0000 (11:58 -0700)]
Install ansible collections individually

Going forward, we can maintain our specific collection requirements in
requirements.yml.

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agoDrop ansible for ansible-core
Zack Cerza [Tue, 2 Jan 2024 18:16:15 +0000 (11:16 -0700)]
Drop ansible for ansible-core

The 'ansible' PyPI package installs _all_ collections, which ends up being
~60% the total size of our virtualenv.

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agotox.ini: Move some deps to setup.cfg
Zack Cerza [Wed, 27 Dec 2023 18:53:14 +0000 (11:53 -0700)]
tox.ini: Move some deps to setup.cfg

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agosetup.cfg: python_requires>=3.8
Zack Cerza [Wed, 27 Dec 2023 18:45:27 +0000 (11:45 -0700)]
setup.cfg: python_requires>=3.8

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorepo_utils.fetch_repo: Use less retries
Zack Cerza [Wed, 27 Dec 2023 18:22:39 +0000 (11:22 -0700)]
repo_utils.fetch_repo: Use less retries

If a particular branch cannot successfully bootstrap, it can cause an accidental
DoS.

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update via pip-compile -U
Zack Cerza [Wed, 27 Dec 2023 17:56:51 +0000 (10:56 -0700)]
requirements: Update via pip-compile -U

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agosetup.cfg: Pin urllib3 for botocore
Zack Cerza [Wed, 27 Dec 2023 18:42:47 +0000 (11:42 -0700)]
setup.cfg: Pin urllib3 for botocore

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update ansible
Zack Cerza [Wed, 27 Dec 2023 17:54:22 +0000 (10:54 -0700)]
requirements: Update ansible

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update pyjwt
Zack Cerza [Wed, 27 Dec 2023 17:51:35 +0000 (10:51 -0700)]
requirements: Update pyjwt

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update paramiko
Zack Cerza [Wed, 27 Dec 2023 17:49:59 +0000 (10:49 -0700)]
requirements: Update paramiko

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update gevent
Zack Cerza [Wed, 27 Dec 2023 17:48:40 +0000 (10:48 -0700)]
requirements: Update gevent

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update configobj
Zack Cerza [Thu, 21 Dec 2023 21:33:24 +0000 (14:33 -0700)]
requirements: Update configobj

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update certifi
Zack Cerza [Thu, 21 Dec 2023 21:32:06 +0000 (14:32 -0700)]
requirements: Update certifi

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update requests
Zack Cerza [Thu, 21 Dec 2023 21:30:39 +0000 (14:30 -0700)]
requirements: Update requests

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update PyYAML
Zack Cerza [Thu, 21 Dec 2023 21:29:21 +0000 (14:29 -0700)]
requirements: Update PyYAML

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Move openstack to its own variant
Zack Cerza [Thu, 21 Dec 2023 21:26:33 +0000 (14:26 -0700)]
requirements: Move openstack to its own variant

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update cryptography
Zack Cerza [Thu, 21 Dec 2023 21:01:36 +0000 (14:01 -0700)]
requirements: Update cryptography

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agorequirements: Update pip-tools
Zack Cerza [Thu, 21 Dec 2023 20:59:34 +0000 (13:59 -0700)]
requirements: Update pip-tools

Signed-off-by: Zack Cerza <zack@redhat.com>
20 months agoMerge pull request #1906 from ceph/kill-unbound
Zack Cerza [Wed, 27 Dec 2023 17:44:55 +0000 (10:44 -0700)]
Merge pull request #1906 from ceph/kill-unbound

kill.kill_processes: Fix possibly-unbound variables

21 months agokill.kill_processes: Fix possibly-unbound variables 1906/head
Zack Cerza [Wed, 20 Dec 2023 23:19:10 +0000 (16:19 -0700)]
kill.kill_processes: Fix possibly-unbound variables

Signed-off-by: Zack Cerza <zack@redhat.com>
21 months agoMerge pull request #1903 from ceph/wip-package-queries
Zack Cerza [Wed, 20 Dec 2023 22:41:10 +0000 (15:41 -0700)]
Merge pull request #1903 from ceph/wip-package-queries

suite: Improve package query caching

21 months agoMerge pull request #1900 from ceph/systemd
Zack Cerza [Wed, 20 Dec 2023 22:39:42 +0000 (15:39 -0700)]
Merge pull request #1900 from ceph/systemd

Add systemd units for exporter and dispatcher

21 months agorun.util.find_git_parents: Drop refresh() 1903/head
Zack Cerza [Wed, 29 Nov 2023 23:34:51 +0000 (16:34 -0700)]
run.util.find_git_parents: Drop refresh()

This takes a long time, and can time out. The mirror is updated every ten
minutes automatically.

Signed-off-by: Zack Cerza <zack@redhat.com>
21 months agoMerge pull request #1896 from ceph/dependabot/pip/urllib3-1.26.18
kyr [Sun, 10 Dec 2023 17:25:23 +0000 (18:25 +0100)]
Merge pull request #1896 from ceph/dependabot/pip/urllib3-1.26.18

build(deps): bump urllib3 from 1.26.6 to 1.26.18

21 months agobuild(deps): bump urllib3 from 1.26.6 to 1.26.18 1896/head
dependabot[bot] [Sun, 10 Dec 2023 16:21:41 +0000 (16:21 +0000)]
build(deps): bump urllib3 from 1.26.6 to 1.26.18

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.6 to 1.26.18.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/1.26.6...1.26.18)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
21 months agorun: Fix some pyright errors
Zack Cerza [Wed, 29 Nov 2023 18:55:28 +0000 (11:55 -0700)]
run: Fix some pyright errors

Signed-off-by: Zack Cerza <zack@redhat.com>
21 months agoorchestra.opsys: Add some newer OS codenames
Zack Cerza [Wed, 29 Nov 2023 00:27:04 +0000 (17:27 -0700)]
orchestra.opsys: Add some newer OS codenames

Signed-off-by: Zack Cerza <zack@redhat.com>
21 months agotests: Remove some gitbuilder-related tests
Zack Cerza [Wed, 29 Nov 2023 00:25:13 +0000 (17:25 -0700)]
tests: Remove some gitbuilder-related tests

Signed-off-by: Zack Cerza <zack@redhat.com>
21 months agoMake logs slightly quieter during scheduling
Zack Cerza [Wed, 22 Nov 2023 01:50:01 +0000 (18:50 -0700)]
Make logs slightly quieter during scheduling

Particularly in non-verbose mode.

Signed-off-by: Zack Cerza <zack@redhat.com>
21 months agorepo_utils.ls_remote: Memoize
Zack Cerza [Wed, 22 Nov 2023 01:54:08 +0000 (18:54 -0700)]
repo_utils.ls_remote: Memoize

Signed-off-by: Zack Cerza <zack@redhat.com>
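
Memoization of a remote lookup like this one can be sketched with `functools.lru_cache`. This is only an illustration: `ls_remote_stub` is a hypothetical stand-in, and the commit's actual implementation is not shown in this log.

```python
import functools

# Records every time the underlying lookup really runs.
lookups = []


@functools.lru_cache(maxsize=None)
def ls_remote_stub(url, ref):
    """Hypothetical stand-in for an expensive `git ls-remote` query."""
    lookups.append((url, ref))
    return f"sha-for-{ref}"


# The second call with the same arguments is served from the cache.
ls_remote_stub("https://example.invalid/repo.git", "main")
ls_remote_stub("https://example.invalid/repo.git", "main")
```

A cache like this never expires on its own, so it suits short-lived processes such as a single scheduling run.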
21 months agosuite: Improve package query caching
Zack Cerza [Wed, 22 Nov 2023 01:25:56 +0000 (18:25 -0700)]
suite: Improve package query caching

We had our own "system" for caching, but it had the unfortunate characteristic
of being a big bowl of spaghetti. While eating said pasta I also noticed we
had two competing "distro defaults" concepts - so that let me delete even more
code. Yum!

Signed-off-by: Zack Cerza <zack@redhat.com>
21 months agoMerge pull request #1899 from ceph/kill-proc-perms
Zack Cerza [Wed, 29 Nov 2023 17:23:58 +0000 (10:23 -0700)]
Merge pull request #1899 from ceph/kill-proc-perms

21 months agoMerge pull request #1902 from ceph/dispatcher-quiet
Dan Mick [Tue, 28 Nov 2023 23:28:51 +0000 (15:28 -0800)]
Merge pull request #1902 from ceph/dispatcher-quiet

dispatcher: Don't spam the journal

21 months agoMerge pull request #1792 from VallariAg/unittest-xml-scanner
Zack Cerza [Mon, 27 Nov 2023 23:25:30 +0000 (16:25 -0700)]
Merge pull request #1792 from VallariAg/unittest-xml-scanner

orch/run: Add unit test xml scanner

21 months agoutil/scanner: add UnitTestScanner.num_of_total_failures 1792/head
Vallari Agrawal [Fri, 27 Oct 2023 08:58:18 +0000 (14:28 +0530)]
util/scanner: add UnitTestScanner.num_of_total_failures

In UnitTestScanner's final error message, add total count of failures
before the first error occurrence, like "(total x failed) <message>".
Another minor change: add "..." if the failure reason is more than 200 chars.

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
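
The message format described in this commit can be sketched as follows. The function name and parameters are hypothetical and are not the actual UnitTestScanner API; only the "(total x failed)" prefix and the 200-character cutoff come from the commit message.

```python
def summarize_failures(first_failure, total_failed, limit=200):
    """Build a summary like "(total x failed) <message>".

    Hypothetical helper: reasons longer than `limit` characters are cut
    and marked with "...".
    """
    if len(first_failure) > limit:
        first_failure = first_failure[:limit] + "..."
    return f"(total {total_failed} failed) {first_failure}"
```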
21 months agoadd utils/tests/test_scanner.py
Vallari Agrawal [Tue, 19 Sep 2023 15:04:28 +0000 (20:34 +0530)]
add utils/tests/test_scanner.py

and test_run_unit_test in test_remote.py

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
21 months agoadd Scanner, UnitTestScanner, ValgrindScanner
Vallari Agrawal [Sat, 1 Oct 2022 11:16:52 +0000 (16:46 +0530)]
add Scanner, UnitTestScanner, ValgrindScanner

1. add 'run_unit_test' to Remote
2. create util/scanner.py
3. new exception: UnitTestError
4. add `lxml` dependency in setup.cfg

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
21 months agokill: Don't unlock nodes if killing procs fails 1899/head
Zack Cerza [Fri, 10 Nov 2023 22:24:21 +0000 (15:24 -0700)]
kill: Don't unlock nodes if killing procs fails

... so that we don't unlock nodes while their jobs are running.

Signed-off-by: Zack Cerza <zack@redhat.com>
22 months agosupervisor: Drop job output 1902/head
Zack Cerza [Fri, 17 Nov 2023 20:48:38 +0000 (13:48 -0700)]
supervisor: Drop job output

It gets logged to its own file in the job archive.

Signed-off-by: Zack Cerza <zack@redhat.com>
22 months agoMerge pull request #1901 from ceph/fog-debug-quieter
Dan Mick [Fri, 17 Nov 2023 20:43:33 +0000 (12:43 -0800)]
Merge pull request #1901 from ceph/fog-debug-quieter

fog: Drop request debug logging

22 months agodispatcher: Drop supervisor output
Zack Cerza [Fri, 17 Nov 2023 20:34:21 +0000 (13:34 -0700)]
dispatcher: Drop supervisor output

It gets logged to its own file in the job archive.

Signed-off-by: Zack Cerza <zack@redhat.com>
22 months agofog: Drop request debug logging 1901/head
Zack Cerza [Fri, 17 Nov 2023 20:22:58 +0000 (13:22 -0700)]
fog: Drop request debug logging

It's too noisy.

Signed-off-by: Zack Cerza <zack@redhat.com>
22 months agoAdd systemd units for exporter and dispatcher 1900/head
Zack Cerza [Wed, 15 Nov 2023 20:03:25 +0000 (13:03 -0700)]
Add systemd units for exporter and dispatcher

These are copies of what is currently in use in sepia.

Signed-off-by: Zack Cerza <zack@redhat.com>
22 months agoMerge pull request #1892 from ceph/devstack-simplified
Zack Cerza [Thu, 26 Oct 2023 19:05:51 +0000 (13:05 -0600)]
Merge pull request #1892 from ceph/devstack-simplified

Add containers/teuthology-dev

22 months agoAdd containers/teuthology-dev 1892/head
Zack Cerza [Tue, 26 Sep 2023 21:32:25 +0000 (14:32 -0700)]
Add containers/teuthology-dev

This is nearly identical to docs/docker-compose/teuthology, but with
some changes to better work with ceph-devstack. The bits in
docs/docker-compose should be able to be adapted easily to work with
this container.

Signed-off-by: Zack Cerza <zack@redhat.com>
23 months agoMerge PR #1895 into main
Patrick Donnelly [Tue, 17 Oct 2023 19:38:56 +0000 (15:38 -0400)]
Merge PR #1895 into main

* refs/pull/1895/head:
install/bin/stdin-killer: macOs (Darwin) compatibility

Reviewed-by: Zack Cerza <zack@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
23 months agoMerge pull request #1894 from VallariAg/fix-readthedocs-builds
Zack Cerza [Tue, 17 Oct 2023 16:16:46 +0000 (10:16 -0600)]
Merge pull request #1894 from VallariAg/fix-readthedocs-builds

fix readthedocs PR builds

23 months agoreadthedocs: fix 'The configuration key "build.image" is deprecated' 1894/head
Vallari Agrawal [Tue, 17 Oct 2023 09:16:42 +0000 (14:46 +0530)]
readthedocs: fix 'The configuration key "build.image" is deprecated'

Builds are failing because readthedocs has fully removed support for the
deprecated "build.image" key; "build.os" must be used instead.

ref: https://blog.readthedocs.com/use-build-os-config/
error: https://readthedocs.org/projects/teuthology/builds/22250705/

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
23 months agoinstall/bin/stdin-killer: macOs (Darwin) compatibility 1895/head
Leonid Usov [Tue, 17 Oct 2023 10:37:27 +0000 (13:37 +0300)]
install/bin/stdin-killer: macOs (Darwin) compatibility

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
23 months agoMerge pull request #1887 from ceph/paramiko-eoferror
Josh Durgin [Wed, 11 Oct 2023 16:42:32 +0000 (09:42 -0700)]
Merge pull request #1887 from ceph/paramiko-eoferror

orchestra: Tolerate EOFError during connect

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2 years agoMerge pull request #1888 from ceph/keyscan-tweak
Zack Cerza [Tue, 5 Sep 2023 23:29:55 +0000 (17:29 -0600)]
Merge pull request #1888 from ceph/keyscan-tweak

2 years agomisc._ssh_keyscan: Sort keys before returning any 1888/head
Zack Cerza [Tue, 5 Sep 2023 18:42:46 +0000 (12:42 -0600)]
misc._ssh_keyscan: Sort keys before returning any

ssh-keyscan's output is unsorted, so this function wasn't deterministic.

Signed-off-by: Zack Cerza <zack@redhat.com>
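
Why sorting makes the result deterministic can be shown with a small sketch. `first_sorted_key` is a hypothetical helper, not the real `misc._ssh_keyscan`; it assumes each input line has the form "<host> <keytype> <key>".

```python
def first_sorted_key(keyscan_output):
    """Pick one key from ssh-keyscan style output, independent of line order."""
    keys = []
    for line in keyscan_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # Drop the host column and keep "<keytype> <key>".
        _host, keytype, key = line.split(None, 2)
        keys.append(f"{keytype} {key}")
    if not keys:
        raise ValueError("no keys found")
    # Sorting first means the same key is chosen whatever order the scan used.
    return sorted(keys)[0]
```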
2 years agoorchestra: Move connection exception handling 1887/head
Zack Cerza [Thu, 31 Aug 2023 18:10:09 +0000 (11:10 -0700)]
orchestra: Move connection exception handling

... to inside the retry loop. Also, add an increment to the safe_while
instance we use.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoorchestra: Treat EOFError as SSHException
Zack Cerza [Thu, 31 Aug 2023 17:34:41 +0000 (10:34 -0700)]
orchestra: Treat EOFError as SSHException

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1886 from ceph/update-paramiko
Zack Cerza [Wed, 30 Aug 2023 17:34:16 +0000 (11:34 -0600)]
Merge pull request #1886 from ceph/update-paramiko

2 years agoMerge pull request #1884 from kamoltat/wip-ksirivad-fix-62445
Kamoltat (Junior) Sirivadhna [Wed, 30 Aug 2023 15:59:27 +0000 (11:59 -0400)]
Merge pull request #1884 from kamoltat/wip-ksirivad-fix-62445

teuthology/scrape: Fix bad backtrace parsing in Teuthology.log
Reviewed-by: Zack Cerza <zcerza@redhat.com>

2 years agoMerge pull request #1885 from ceph/fix-docs-build
Kamoltat (Junior) Sirivadhna [Wed, 30 Aug 2023 15:44:58 +0000 (11:44 -0400)]
Merge pull request #1885 from ceph/fix-docs-build

Fix docs build
Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
2 years agoUpdate paramiko 1886/head
Zack Cerza [Mon, 28 Aug 2023 20:07:19 +0000 (14:07 -0600)]
Update paramiko

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agotox: Avoid buggy sphinx versions 1885/head
Zack Cerza [Wed, 23 Aug 2023 19:24:56 +0000 (13:24 -0600)]
tox: Avoid buggy sphinx versions

See https://github.com/ceph/teuthology/pull/1884

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agosetup.cfg: Drop license_file
Zack Cerza [Wed, 23 Aug 2023 19:21:43 +0000 (13:21 -0600)]
setup.cfg: Drop license_file

It's deprecated in favor of `license_files`, but the default value is
sufficient.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoteuthology/scrape: Fix bad backtrace parsing in Teuthology.log 1884/head
Kamoltat Sirivadhna [Thu, 17 Aug 2023 16:28:00 +0000 (12:28 -0400)]
teuthology/scrape: Fix bad backtrace parsing in Teuthology.log

Problem:

- confusing warning message stating that
the back trace is malformed

- We kept adding to the backtrace buffer
even when we exceeded the `MAX_BT_LINES`

Solution:

- Correct the warning message to be
"Ignoring backtrace that exceeds MAX_BT_LINES"
- reset the buffer once we exceeded MAX_BT_LINES
- Added some cases where we detect start/end of back trace.

Fixes: https://tracker.ceph.com/issues/62445

Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
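
The buffer reset described in this commit can be sketched roughly as below. This is an assumption-laden illustration: the start/end markers, the function name, and the value of `MAX_BT_LINES` are all hypothetical; only the warning text and the reset-on-overflow behavior come from the commit message.

```python
MAX_BT_LINES = 3  # hypothetical value; the real limit is not given in this log


def collect_backtraces(lines, start="BT_START", end="BT_END"):
    """Collect backtraces, dropping any that exceed MAX_BT_LINES."""
    backtraces = []
    warnings = []
    buffer = None  # None means "not currently inside a backtrace"
    for line in lines:
        if line == start:
            buffer = []
        elif line == end:
            if buffer is not None:
                backtraces.append(buffer)
            buffer = None
        elif buffer is not None:
            buffer.append(line)
            if len(buffer) > MAX_BT_LINES:
                warnings.append("Ignoring backtrace that exceeds MAX_BT_LINES")
                buffer = None  # reset instead of continuing to grow the buffer
    return backtraces, warnings
```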
2 years agoMerge pull request #1883 from ceph/nuke-desc-typeerror
Dan Mick [Tue, 15 Aug 2023 19:40:36 +0000 (12:40 -0700)]
Merge pull request #1883 from ceph/nuke-desc-typeerror

nuke: Avoid a TypeError w/ null node description

2 years agonuke: Avoid a TypeError w/ null node description 1883/head
Zack Cerza [Tue, 15 Aug 2023 18:05:38 +0000 (12:05 -0600)]
nuke: Avoid a TypeError w/ null node description

This avoids a `TypeError: argument of type 'NoneType' is not iterable`
when nuking a node whose description is None.

ex: https://sentry.ceph.com/share/issue/91172146663f4c71a6cbfe43725b2e07/

Signed-off-by: Zack Cerza <zack@redhat.com>
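
The error arises because a membership test such as `"text" in None` raises `TypeError`. A minimal sketch of one way to guard against it, using a hypothetical helper name rather than the actual nuke code:

```python
def description_contains(description, needle):
    """Membership test that tolerates a node description of None."""
    # Fall back to an empty string so `in` always has a str to search.
    return needle in (description or "")
```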
2 years agoMerge pull request #1882 from ceph/sentry-reimage-taskname
Dan Mick [Mon, 14 Aug 2023 20:34:08 +0000 (13:34 -0700)]
Merge pull request #1882 from ceph/sentry-reimage-taskname

supervisor.reimage(): Improve Sentry reporting

2 years agosupervisor.reimage(): Improve Sentry reporting 1882/head
Zack Cerza [Mon, 14 Aug 2023 18:48:47 +0000 (12:48 -0600)]
supervisor.reimage(): Improve Sentry reporting

Set the `task` tag value to 'reimage' when reporting reimage failures to
Sentry, to make searching for them in its UI easier.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1881 from ceph/stdin-killer-setpgrp
Zack Cerza [Fri, 4 Aug 2023 17:56:20 +0000 (11:56 -0600)]
Merge pull request #1881 from ceph/stdin-killer-setpgrp

stdin-killer: do not setpgrp if already leader

2 years agostdin-killer: do not setpgrp if already leader 1881/head
Patrick Donnelly [Fri, 4 Aug 2023 13:17:28 +0000 (09:17 -0400)]
stdin-killer: do not setpgrp if already leader

Fixes failure like:

    2023-08-03T19:40:10.942 INFO:teuthology.orchestra.run.smithi100.stderr:Traceback (most recent call last):
    2023-08-03T19:40:10.942 INFO:teuthology.orchestra.run.smithi100.stderr:  File "/usr/bin/stdin-killer", line 213, in <module>
    2023-08-03T19:40:10.943 INFO:teuthology.orchestra.run.smithi100.stderr:    os.setpgrp()
    2023-08-03T19:40:10.943 INFO:teuthology.orchestra.run.smithi100.stderr:PermissionError: [Errno 1] Operation not permitted

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
2 years agoMerge pull request #1880 from ceph/wip-62286
Zack Cerza [Wed, 2 Aug 2023 20:25:13 +0000 (14:25 -0600)]
Merge pull request #1880 from ceph/wip-62286

2 years agoPhysicalConsole: Tolerate invalid UTF-8 characters 1880/head
Zack Cerza [Wed, 2 Aug 2023 18:03:21 +0000 (12:03 -0600)]
PhysicalConsole: Tolerate invalid UTF-8 characters

... in pexpect.spawn() calls.

Fixes: https://tracker.ceph.com/issues/62286
Signed-off-by: Zack Cerza <zack@redhat.com>
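
Tolerating invalid UTF-8 generally means choosing a non-strict decoding error handler. The sketch below shows the idea with plain `bytes.decode`; the helper name is hypothetical. `pexpect.spawn()` accepts a `codec_errors` argument for the same purpose, though this log does not say how the commit configures it.

```python
def decode_console_output(raw):
    """Decode console bytes, replacing invalid UTF-8 instead of raising."""
    # errors="replace" substitutes U+FFFD for each undecodable byte.
    return raw.decode("utf-8", errors="replace")
```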
2 years agoPhysicalConsole.check_status(): Use log.exception
Zack Cerza [Wed, 2 Aug 2023 17:04:21 +0000 (11:04 -0600)]
PhysicalConsole.check_status(): Use log.exception

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1845 from NitzanMordhai/wip-nitzan-correct-typo-osd-default-pool...
Zack Cerza [Wed, 2 Aug 2023 16:33:38 +0000 (10:33 -0600)]
Merge pull request #1845 from NitzanMordhai/wip-nitzan-correct-typo-osd-default-pool-size

2 years agoMerge pull request #1877 from ceph/sentry-ae
Dan Mick [Tue, 1 Aug 2023 00:51:49 +0000 (17:51 -0700)]
Merge pull request #1877 from ceph/sentry-ae

supervisor: Fix an AttributeError in reimage()

2 years agosupervisor: Fix an AttributeError in reimage() sentry-ae 1877/head
Zack Cerza [Mon, 31 Jul 2023 23:31:43 +0000 (17:31 -0600)]
supervisor: Fix an AttributeError in reimage()

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1859 from ceph/quiet-urllib3
Dan Mick [Fri, 28 Jul 2023 23:08:05 +0000 (16:08 -0700)]
Merge pull request #1859 from ceph/quiet-urllib3

Turn down logging for urllib3.util.retry

2 years agoMerge pull request #1875 from ceph/reimage-errs-sentry
Dan Mick [Fri, 28 Jul 2023 21:52:06 +0000 (14:52 -0700)]
Merge pull request #1875 from ceph/reimage-errs-sentry

Report reimage failures to Sentry

2 years agoMerge pull request #1876 from ceph/afa-sort
Dan Mick [Fri, 28 Jul 2023 21:02:07 +0000 (14:02 -0700)]
Merge pull request #1876 from ceph/afa-sort

task.ansible.FailureAnalyzer: Sort failure items

2 years agotask.ansible.FailureAnalyzer: Sort failure items afa-sort 1876/head
Zack Cerza [Fri, 28 Jul 2023 19:22:52 +0000 (13:22 -0600)]
task.ansible.FailureAnalyzer: Sort failure items

To reduce unnecessary duplication in e.g. Sentry.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1874 from ceph/fix-fog-timeout
Dan Mick [Thu, 27 Jul 2023 21:56:28 +0000 (14:56 -0700)]
Merge pull request #1874 from ceph/fix-fog-timeout

fog: Fix a connection timeout bug

2 years agoMerge pull request #1873 from ceph/console-log
Dan Mick [Thu, 27 Jul 2023 21:55:36 +0000 (14:55 -0700)]
Merge pull request #1873 from ceph/console-log

orchestra.console: Scope loggers to shortname

2 years agosupervisor.reimage: Report failures to Sentry reimage-errs-sentry 1875/head
Zack Cerza [Thu, 27 Jul 2023 17:49:16 +0000 (11:49 -0600)]
supervisor.reimage: Report failures to Sentry

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMove Sentry reporting logic to utils
Zack Cerza [Thu, 27 Jul 2023 17:25:23 +0000 (11:25 -0600)]
Move Sentry reporting logic to utils

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoFOG._wait_for_ready(): Catch ConnectionErrors 1874/head
Zack Cerza [Thu, 27 Jul 2023 17:42:46 +0000 (11:42 -0600)]
FOG._wait_for_ready(): Catch ConnectionErrors

Instead of just ConnectionResetErrors, which inherit from
ConnectionError

Signed-off-by: Zack Cerza <zack@redhat.com>
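
Catching the base class covers every connection failure subclass, including `ConnectionResetError` and `ConnectionRefusedError`. A minimal sketch of a readiness loop built on that, where `wait_for_ready`, `probe`, and the retry parameters are hypothetical and not the actual `FOG._wait_for_ready()` signature:

```python
import time


def wait_for_ready(probe, attempts=3, delay=0.0):
    """Call `probe` until it succeeds or `attempts` runs out."""
    for _ in range(attempts):
        try:
            probe()
            return True
        except ConnectionError:
            # Covers ConnectionResetError, ConnectionRefusedError,
            # ConnectionAbortedError and BrokenPipeError alike.
            time.sleep(delay)
    return False
```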
2 years agoremote: Raise ConnectionError when appropriate
Zack Cerza [Thu, 27 Jul 2023 17:41:11 +0000 (11:41 -0600)]
remote: Raise ConnectionError when appropriate

Instead of just Exception.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoorchestra.console: Scope loggers to shortname 1873/head
Zack Cerza [Thu, 27 Jul 2023 16:24:25 +0000 (10:24 -0600)]
orchestra.console: Scope loggers to shortname

This will make reading console debug logging easier.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years agoMerge pull request #1871 from ceph/sentry-ansible
Zack Cerza [Mon, 24 Jul 2023 19:38:11 +0000 (13:38 -0600)]
Merge pull request #1871 from ceph/sentry-ansible

2 years agoexceptions.AnsibleFailedError: Add fingerprint() 1871/head
Zack Cerza [Mon, 24 Jul 2023 17:22:51 +0000 (11:22 -0600)]
exceptions.AnsibleFailedError: Add fingerprint()

This will cause Sentry to group events by their failure reasons, rather
than lumping all AnsibleFailedErrors together

Signed-off-by: Zack Cerza <zack@redhat.com>
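
The grouping idea can be sketched as an exception that derives a stable fingerprint from its failure reasons. The class below is a hypothetical illustration, not the real `exceptions.AnsibleFailedError`; how the actual method builds its fingerprint is not shown in this log.

```python
class AnsibleFailedErrorSketch(Exception):
    """Hypothetical exception carrying a list of failure reasons."""

    def __init__(self, failures):
        super().__init__(f"{len(failures)} ansible failure(s)")
        self.failures = list(failures)

    def fingerprint(self):
        # De-duplicate and sort so the same set of reasons always yields
        # the same fingerprint, regardless of the order they occurred in.
        return sorted({str(failure) for failure in self.failures})
```

An error tracker that groups events by fingerprint would then place two errors with the same reasons in one group and errors with different reasons in separate groups.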