git-server-git.apps.pok.os.sepia.ceph.com Git - teuthology.git/log
teuthology.git
2 weeks ago kill: let pids be None when call kill_process
Kyr Shatskyy [Wed, 11 Feb 2026 13:26:09 +0000 (14:26 +0100)]
kill: let pids be None when call kill_process

If we run teuthology-kill with the -j option, it may not know the
job's PID. Let it pass None as the pids argument to kill_process so
that kill_process can figure it out on its own.

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@clyso.com>
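
The "pids may be None" behaviour described above can be sketched roughly as follows. This is not the teuthology code: the name `kill_process_sketch`, its signature, and the `find_pids` and `send_signal` callables are all hypothetical stand-ins used only to illustrate falling back to a lookup when the caller has no PIDs.

```python
import os
import signal
from typing import Callable, Iterable, List, Optional


def kill_process_sketch(
    job_id: str,
    pids: Optional[Iterable[int]] = None,
    # Hypothetical lookup hook; the real lookup mechanism is not shown in the log.
    find_pids: Callable[[str], Iterable[int]] = lambda job_id: [],
    # Injectable so the sketch can be exercised without signalling real processes.
    send_signal: Callable[[int, int], None] = os.kill,
) -> List[int]:
    """Send SIGTERM to each PID, discovering the PIDs when none are given."""
    if pids is None:
        # The caller did not know the PIDs, so look them up here.
        pids = find_pids(job_id)
    signalled = []
    for pid in pids:
        try:
            send_signal(pid, signal.SIGTERM)
        except ProcessLookupError:
            # The process already exited; nothing left to kill.
            continue
        signalled.append(pid)
    return signalled
```

Passing an explicit list skips the lookup entirely, so callers that already know the PID keep their current behaviour under this assumption.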
2 weeks ago dispatcher/supervisor: fix kill_job call
Kyr Shatskyy [Wed, 11 Feb 2026 12:47:10 +0000 (13:47 +0100)]
dispatcher/supervisor: fix kill_job call

Fixes: bf0242c5599c861d855d13925a565cf437b7a41b
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@clyso.com>
4 weeks ago Merge pull request #2113 from batrick/egrep [1.2.3]
kyr [Sat, 24 Jan 2026 10:43:09 +0000 (11:43 +0100)]
Merge pull request #2113 from batrick/egrep

teuthology/task: use grep switch instead of egrep

4 weeks ago Merge pull request #2137 from kshtsk/wip-rocky-9.7
David Galloway [Fri, 23 Jan 2026 19:32:16 +0000 (14:32 -0500)]
Merge pull request #2137 from kshtsk/wip-rocky-9.7

orchestra: rocky 9.6 and 10.0 are gone

5 weeks ago orchestra: rocky 9.6 and 10.0 are gone [2137/head]
Kyr Shatskyy [Wed, 21 Jan 2026 20:37:14 +0000 (21:37 +0100)]
orchestra: rocky 9.6 and 10.0 are gone

Welcome Rocky and Alma Linux 9.7 and 10.1

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@clyso.com>
5 weeks ago Merge pull request #2134 from ceph/kill-pid
David Galloway [Wed, 21 Jan 2026 13:45:18 +0000 (08:45 -0500)]
Merge pull request #2134 from ceph/kill-pid

run: Send PID to paddles

5 weeks ago Merge pull request #2036 from ceph/reimage-unlock
Dan Mick [Tue, 20 Jan 2026 23:41:41 +0000 (15:41 -0800)]
Merge pull request #2036 from ceph/reimage-unlock

lock.ops.unlock_one_safe: Invert run-match logic

5 weeks ago run: Send PID to paddles [2134/head]
Zack Cerza [Tue, 20 Jan 2026 22:54:27 +0000 (15:54 -0700)]
run: Send PID to paddles

This is a follow-up to bf0242c5599c861d855d13925a565cf437b7a41b

Signed-off-by: Zack Cerza <zack@cerza.org>
5 weeks ago Merge pull request #2131 from ceph/fis-lis
David Galloway [Fri, 16 Jan 2026 00:57:55 +0000 (19:57 -0500)]
Merge pull request #2131 from ceph/fis-lis

supervisor: Avoid prematurely pushing some jobs

5 weeks ago Merge pull request #2130 from ceph/kill-multi-supervisor
David Galloway [Fri, 16 Jan 2026 00:57:43 +0000 (19:57 -0500)]
Merge pull request #2130 from ceph/kill-multi-supervisor

kill: Handle supervisor procs when killing runs

5 weeks ago supervisor: Avoid prematurely pushing some jobs [fis-lis] [2131/head]
Zack Cerza [Fri, 16 Jan 2026 00:38:35 +0000 (17:38 -0700)]
supervisor: Avoid prematurely pushing some jobs

This is a follow-up to ff615aae541032c647e78d3959d368f595c93e31; it caused us to
submit the first-in-suite and last-in-suite jobs to paddles. Those present as
having 'unknown' status, which will be confusing to users.

Signed-off-by: Zack Cerza <zack@cerza.org>
5 weeks ago kill: Handle supervisor procs when killing runs [kill-multi-supervisor] [2130/head]
Zack Cerza [Thu, 15 Jan 2026 19:20:16 +0000 (12:20 -0700)]
kill: Handle supervisor procs when killing runs

This is a follow-up to ff615aae541032c647e78d3959d368f595c93e31, which only
handled killing individual jobs. Since we're using the results server for all
run and job metadata, we can drop all mentions of the archive. This change
is necessary since we've restricted access to the archive from the teuthology
machine for normal users, to avoid resource contention.

Signed-off-by: Zack Cerza <zack@cerza.org>
5 weeks ago Merge pull request #2129 from ceph/maas-fixes
Zack Cerza [Thu, 15 Jan 2026 17:58:08 +0000 (10:58 -0700)]
Merge pull request #2129 from ceph/maas-fixes

maas: handle nodes with unexpected status

6 weeks ago maas: handle nodes with unexpected status [2129/head]
Zack Cerza [Thu, 15 Jan 2026 01:04:58 +0000 (18:04 -0700)]
maas: handle nodes with unexpected status

Oftentimes, we just need to release and re-allocate before provisioning.

Signed-off-by: Zack Cerza <zack@cerza.org>
6 weeks ago Merge pull request #2117 from deepssin/uefi_fix
Dan Mick [Wed, 14 Jan 2026 17:28:20 +0000 (09:28 -0800)]
Merge pull request #2117 from deepssin/uefi_fix

Fix kernel boot on UEFI systems

6 weeks ago Fix kernel boot on UEFI systems [2117/head]
deepssin [Thu, 18 Dec 2025 13:52:21 +0000 (13:52 +0000)]
Fix kernel boot on UEFI systems

- Add _update_uefi_grub_config() to sync UEFI GRUB config
- Fixes issue where systems reboot into old kernel on UEFI

Signed-off-by: deepssin <deepssin@redhat.com>
6 weeks ago Merge pull request #2105 from vamahaja/maas-integration
Zack Cerza [Tue, 13 Jan 2026 21:54:36 +0000 (14:54 -0700)]
Merge pull request #2105 from vamahaja/maas-integration

[Lib] Add MAAS (Metal-as-a-Service) provisioner

6 weeks ago maas: Correct image names [2105/head]
Zack Cerza [Thu, 8 Jan 2026 20:27:24 +0000 (13:27 -0700)]
maas: Correct image names

Signed-off-by: Zack Cerza <zack@cerza.org>
6 weeks ago provisioner: Add maas provisioner
Ceph Teuthology [Tue, 4 Nov 2025 16:19:08 +0000 (21:49 +0530)]
provisioner: Add maas provisioner

Signed-off-by: Vaibhav Mahajan <vaibhavsm04@gmail.com>
6 weeks ago Merge pull request #2126 from kshtsk/wip-defaults-https
kyr [Tue, 13 Jan 2026 13:47:55 +0000 (14:47 +0100)]
Merge pull request #2126 from kshtsk/wip-defaults-https

config: use https by default

6 weeks ago config: use https by default [2126/head]
Kyrylo Shatskyy [Tue, 13 Jan 2026 11:41:00 +0000 (12:41 +0100)]
config: use https by default

For security reasons, the new lab's resources are not permitted to use
port 80 (http), so default to https now.

Signed-off-by: Kyrylo Shatskyy <kyrylo.shatskyy@clyso.com>
6 weeks ago Merge pull request #2125 from ceph/kill-supervisor
David Galloway [Fri, 9 Jan 2026 21:06:45 +0000 (16:06 -0500)]
Merge pull request #2125 from ceph/kill-supervisor

kill: Look for, and kill, the supervisor process

6 weeks ago kill: Look for, and kill, the supervisor process [2125/head]
Zack Cerza [Fri, 9 Jan 2026 00:36:30 +0000 (17:36 -0700)]
kill: Look for, and kill, the supervisor process

Useful if one wants to kill a job that is still waiting for its nodes to be
provisioned.

Signed-off-by: Zack Cerza <zack@cerza.org>
2 months ago Merge pull request #2116 from tchaikov/wip-drop-distutils
Kefu Chai [Fri, 19 Dec 2025 01:36:29 +0000 (09:36 +0800)]
Merge pull request #2116 from tchaikov/wip-drop-distutils

teuthology: remove dependency on distutils

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2 months ago teuthology/task: use grep switch instead of egrep [2113/head]
Patrick Donnelly [Mon, 1 Dec 2025 15:29:52 +0000 (10:29 -0500)]
teuthology/task: use grep switch instead of egrep

Resolves:

    2025-11-30T03:49:48.951 DEBUG:teuthology.task.internal.syslog:Checking ubuntu@smithi155.front.sepia.ceph.com
    2025-11-30T03:49:48.952 DEBUG:teuthology.orchestra.run.smithi155:> egrep --binary-files=text '\bBUG\b|\bINFO\b|\bDEADLOCK\b' /home/ubuntu/cephtest/archive/syslog/kern.log | grep -v 'task .* blocked for more than .* seconds' | grep -v 'lockdep is turned off' | grep -v 'trying to register non-static key' | grep -v 'DEBUG: fsize' | grep -v CRON | grep -v 'BUG: bad unlock balance detected' | grep -v 'inconsistent lock state' | grep -v '*** DEADLOCK ***' | grep -v 'INFO: possible irq lock inversion dependency detected' | grep -v 'INFO: NMI handler (perf_event_nmi_handler) took too long to run' | grep -v 'INFO: recovery required on readonly' | grep -v 'ceph-create-keys: INFO' | grep -v INFO:ceph-create-keys | grep -v 'Loaded datasource DataSourceOpenStack' | grep -v 'container-storage-setup: INFO: Volume group backing root filesystem could not be determined' | egrep -v '\bsalt-master\b|\bsalt-minion\b|\bsalt-api\b' | grep -v ceph-crash | egrep -v '\btcmu-runner\b.*\bINFO\b' | head -n 1
    2025-11-30T03:49:48.983 INFO:teuthology.orchestra.run.smithi155.stderr:egrep: warning: egrep is obsolescent; using grep -E
    2025-11-30T03:49:48.983 INFO:teuthology.orchestra.run.smithi155.stderr:egrep: warning: egrep is obsolescent; using grep -E
    2025-11-30T03:49:48.983 INFO:teuthology.orchestra.run.smithi155.stderr:egrep: warning: egrep is obsolescent; using grep -E

Fixes: https://tracker.ceph.com/issues/74259
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
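
The warnings in the log above come from the obsolescent `egrep` wrapper; `grep -E` is the equivalent spelling. A minimal sketch of building such a command in Python follows. The helper name `grep_extended` and its parameters are hypothetical and are not taken from the teuthology source.

```python
import shlex
from typing import Optional


def grep_extended(pattern: str, path: Optional[str] = None, invert: bool = False) -> str:
    """Build a shell-quoted `grep -E` command line (hypothetical helper)."""
    args = ["grep", "-E"]
    if invert:
        # -v selects non-matching lines, as in the `egrep -v` stages of the pipeline.
        args.append("-v")
    # Treat binary input as text, matching the option used in the log above.
    args.append("--binary-files=text")
    args.append(pattern)
    if path is not None:
        args.append(path)
    # shlex.join (Python 3.8+) quotes each argument for a POSIX shell.
    return shlex.join(args)


if __name__ == "__main__":
    print(grep_extended(r"\bBUG\b|\bINFO\b|\bDEADLOCK\b", "/var/log/kern.log"))
```

Building the argument list first and quoting once at the end keeps the regular expression intact regardless of the backslashes and pipes it contains.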
2 months ago teuthology: implement strtobool and use it [2116/head]
Kefu Chai [Thu, 18 Dec 2025 09:27:35 +0000 (17:27 +0800)]
teuthology: implement strtobool and use it

The distutils module was deprecated in Python 3.10 and removed in
Python 3.12. This commit replaces the deprecated distutils.util.strtobool
imports with strtobool in the teuthology.util module.

Changes:
- Add strtobool.py to teuthology/util
- Replace distutils.util.strtobool with
  teuthology.util.strtobool.strtobool

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
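
For reference, a standalone replacement for `distutils.util.strtobool` can be quite small. The sketch below follows the accepted spellings documented for the distutils function; it is not the code added in this commit, and returning a `bool` rather than the `1`/`0` integers that distutils returned is an assumption made here for readability.

```python
_TRUE_VALUES = {"y", "yes", "t", "true", "on", "1"}
_FALSE_VALUES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a truthy or falsy string to a bool; raise ValueError otherwise."""
    lowered = val.lower()
    if lowered in _TRUE_VALUES:
        return True
    if lowered in _FALSE_VALUES:
        return False
    raise ValueError(f"invalid truth value {val!r}")
```

Because `True == 1` and `False == 0` in Python, callers that compared the old integer result keep working under this assumption.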
2 months ago teuthology/task/install: implement LooseVersion and use it
Kefu Chai [Thu, 18 Dec 2025 09:05:58 +0000 (17:05 +0800)]
teuthology/task/install: implement LooseVersion and use it

The distutils module was deprecated in Python 3.10 and removed in
Python 3.12. This commit replaces the deprecated distutils.version
imports with a homebrew LooseVersion implementation.

Changes:
- implement LooseVersion which is able to parse versions like
  '10.2.2-63-g8542898-1trusty'.
- Replace distutils.version.LooseVersion with
  teuthology.util.version.LooseVersion packaging.version.LooseVersion

Fixes:
```
Traceback (most recent call last):
  File "/home/jenkins-build/build/workspace/ceph-api/build/../qa/tasks/vstart_runner.py", line 81, in <module>
    from teuthology.orchestra.remote import RemoteShell
  File "/tmp/tmp.xwxq8FOScf/teuthology/teuthology/orchestra/remote.py", line 6, in <module>
    import teuthology.lock.util
  File "/tmp/tmp.xwxq8FOScf/teuthology/teuthology/lock/util.py", line 6, in <module>
    import teuthology.provision.downburst
  File "/tmp/tmp.xwxq8FOScf/teuthology/teuthology/provision/__init__.py", line 4, in <module>
    import teuthology.exporter
  File "/tmp/tmp.xwxq8FOScf/teuthology/teuthology/exporter.py", line 11, in <module>
    import teuthology.dispatcher
  File "/tmp/tmp.xwxq8FOScf/teuthology/teuthology/dispatcher/__init__.py", line 22, in <module>
    from teuthology.dispatcher import supervisor
  File "/tmp/tmp.xwxq8FOScf/teuthology/teuthology/dispatcher/supervisor.py", line 18, in <module>
    from teuthology.task import internal
  File "/tmp/tmp.xwxq8FOScf/teuthology/teuthology/task/internal/__init__.py", line 27, in <module>
    from teuthology.task.internal.redhat import (setup_cdn_repo, setup_base_repo,            # noqa
  File "/tmp/tmp.xwxq8FOScf/teuthology/teuthology/task/internal/redhat.py", line 13, in <module>
    from teuthology.task.install.redhat import set_deb_repo
  File "/tmp/tmp.xwxq8FOScf/teuthology/teuthology/task/install/__init__.py", line 14, in <module>
    from distutils.version import LooseVersion
ModuleNotFoundError: No module named 'distutils'
```

Related: https://peps.python.org/pep-0632/

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
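
A loose version parser of the kind described above can be sketched as follows. This is an illustration only, not the implementation from this commit: the tokenising regular expression mirrors the approach of the old distutils class, while the sort key that orders integers before strings is an assumption added here so that mixed components compare without raising `TypeError`.

```python
import re
from functools import total_ordering

# Split into runs of digits, runs of lowercase letters, or dots.
_COMPONENT_RE = re.compile(r"(\d+|[a-z]+|\.)")


@total_ordering
class LooseVersion:
    """Parse versions such as '10.2.2-63-g8542898-1trusty' (illustrative sketch)."""

    def __init__(self, vstring: str):
        self.vstring = vstring
        parts = [p for p in _COMPONENT_RE.split(vstring) if p and p != "."]
        # Numeric components become ints so that 10 sorts after 2.
        self.version = [int(p) if p.isdecimal() else p for p in parts]

    def _key(self):
        # Assumption of this sketch: ints sort before strings at the same position.
        return [(0, p, "") if isinstance(p, int) else (1, 0, p) for p in self.version]

    def __eq__(self, other):
        if not isinstance(other, LooseVersion):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, LooseVersion):
            return NotImplemented
        return self._key() < other._key()

    def __repr__(self):
        return f"LooseVersion({self.vstring!r})"
```

With this key, `LooseVersion("10.2.2") < LooseVersion("10.2.10")` holds because the components are compared as integers rather than as text.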
2 months ago lock.ops.unlock_one_safe: Invert run-match logic [2036/head]
Zack Cerza [Wed, 19 Mar 2025 18:35:11 +0000 (12:35 -0600)]
lock.ops.unlock_one_safe: Invert run-match logic

When unlock_one_safe is called with run_name, the caller means to express
"unlock this node if it belongs to this run".
When it is called with run_name and job_id, it means "unlock this node if it
belongs to this job in this run".
We had inverted the logic, causing leaks on reimage failures.

Signed-off-by: Zack Cerza <zack@cerza.org>
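
The intended matching rule can be sketched as a small predicate. Everything here is hypothetical: the function name `node_belongs_to`, and the assumption that a node record is a dict with `run_name` and `job_id` keys, are illustrative and do not reflect how teuthology stores lock metadata.

```python
from typing import Optional


def node_belongs_to(node: dict, run_name: str, job_id: Optional[str] = None) -> bool:
    """Return True when the node may be unlocked on behalf of this run or job."""
    # With only run_name: unlock the node if it belongs to this run.
    if node.get("run_name") != run_name:
        return False
    # With run_name and job_id: it must also belong to this job in the run.
    if job_id is not None and str(node.get("job_id")) != str(job_id):
        return False
    return True


def unlock_if_owned(node: dict, run_name: str, job_id: Optional[str] = None) -> bool:
    """Unlock only on a match; an inverted check would instead skip owned nodes."""
    if not node_belongs_to(node, run_name, job_id):
        return False
    node["locked"] = False  # stand-in for the real unlock call
    return True
```

An inverted condition would leave nodes owned by the failing job locked, which is consistent with the leak on reimage failure that the commit message describes.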
3 months ago Merge pull request #2110 from Matan-B/wip-matanb-pexec-logs
Matan Breizman [Thu, 20 Nov 2025 08:48:57 +0000 (10:48 +0200)]
Merge pull request #2110 from Matan-B/wip-matanb-pexec-logs

teuthology/task/pexec.py: add logs to command executed

Reviewed-by: Aishwarya Mathuria <amathuri@redhat.com>
3 months ago teuthology/task/pexec.py: add logs to command executed [wip-matanb-pexec-logs] [2110/head]
Matan Breizman [Wed, 19 Nov 2025 13:49:36 +0000 (15:49 +0200)]
teuthology/task/pexec.py: add logs to command executed

The current output by pexec is:
```
INFO:teuthology.run_tasks:Running task pexec...
INFO:teuthology.task.pexec:Executing custom commands...
INFO:teuthology.task.pexec:Running commands on host ubuntu@smithi012.front.sepia.ceph.com
DEBUG:teuthology.orchestra.run.smithi012:> TESTDIR=/home/ubuntu/cephtest bash -s
```

The output should include the actual command executed, similar to
exec.py:

```
INFO:teuthology.run_tasks:Running task exec...
INFO:teuthology.task.exec:Executing custom commands...
INFO:teuthology.task.exec:Running commands on role client.0 host ubuntu@smithi168.front.sepia.ceph.com
DEBUG:teuthology.orchestra.run.smithi168:> sudo TESTDIR=/home/ubuntu/cephtest bash -c 'sudo ceph osd pool create low_tier 4'
```

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
3 months ago Merge pull request #2107 from deepssin/fix-gevent-loopexit-dispatcher-crash
Zack Cerza [Tue, 18 Nov 2025 00:04:46 +0000 (17:04 -0700)]
Merge pull request #2107 from deepssin/fix-gevent-loopexit-dispatcher-crash

Fix dispatcher crash on gevent LoopExit exceptions