teuthology: modify logic to check for multiple completed builds
The current logic assumes that there is only one build for each distro/flavor
per SHA1. However, a bug in the Jenkins infrastructure sometimes causes
multiple builds to trigger for one SHA1. In many of these cases, the first
build succeeds, but the second fails. Teuthology only looks at the latest build,
notices that it failed, and gives up. With this change, teuthology looks
further back through the builds and can find a successful build earlier in the
lineup.
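
As a rough illustration of the idea (not the code in this commit), the check
can be sketched as a scan over every build returned for the SHA1, rather than
only the latest one. The function name and the build field names (`distro`,
`distro_version`, `archs`, `flavor`, `status`) are assumptions made for this
sketch, as is the newest-first ordering of the list:

```python
from typing import Iterable, Optional


def find_ready_build(builds: Iterable[dict], distro: str, version: str,
                     arch: str, flavor: str) -> Optional[dict]:
    """Return the first matching build with status 'ready', else None.

    Hypothetical helper: `builds` is assumed to be ordered newest first,
    and the dictionary keys used here are assumed field names.
    """
    for build in builds:
        # Skip builds for a different distro, release, or flavor.
        if (build.get("distro"), build.get("distro_version"),
                build.get("flavor")) != (distro, version, flavor):
            continue
        # Skip builds for a different architecture.
        if arch not in build.get("archs", []):
            continue
        # A failed duplicate build no longer ends the search: keep looking.
        if build.get("status") == "ready":
            return build
    return None


# Made-up data mirroring the "double build" case: the newer x86_64 build
# failed, while the older one for the same SHA1 succeeded.
example_builds = [
    {"id": 2, "distro": "centos", "distro_version": "8",
     "archs": ["x86_64"], "flavor": "default", "status": "failed"},
    {"id": 3, "distro": "centos", "distro_version": "8",
     "archs": ["arm64"], "flavor": "default", "status": "ready"},
    {"id": 1, "distro": "centos", "distro_version": "8",
     "archs": ["x86_64"], "flavor": "default", "status": "ready"},
]
```

With this data, a lookup for centos 8 x86_64 default skips the failed build and
the arm64 build, and returns the older successful build.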
Here is an example in which the first centos 8 x86_64 build succeeded, but a second
build on top of it failed. Teuthology could only detect the latest failed build:
https://shaman.ceph.com/builds/ceph/wip-pdonnell-testing-20240503.010653-debug/ec1d3bd17a3db9d74296aa618f8d63c801bb647e/
Addresses this failure in teuthology:
lflores@teuthology:~$ ./teuthology/virtualenv/bin/teuthology-suite -v -m smithi -c wip-pdonnell-testing-20240503.010653-debug -s fs --subset 111/12000 -p 75 --dry-run
2024-05-03 16:39:35,231.231 INFO:teuthology.suite:Using random seed=9685
2024-05-03 16:39:35,232.232 INFO:teuthology.suite.run:kernel sha1: distro
2024-05-03 16:39:35,673.673 DEBUG:teuthology.repo_utils:git ls-remote https://git.ceph.com/ceph-ci.git wip-pdonnell-testing-20240503.010653-debug -> ec1d3bd17a3db9d74296aa618f8d63c801bb647e
2024-05-03 16:39:35,673.673 INFO:teuthology.suite.run:ceph sha1: ec1d3bd17a3db9d74296aa618f8d63c801bb647e
2024-05-03 16:39:35,674.674 DEBUG:teuthology.packaging:Querying https://shaman.ceph.com/api/search?status=ready&project=ceph&flavor=default&distros=centos%2F8%2Fx86_64&sha1=ec1d3bd17a3db9d74296aa618f8d63c801bb647e
2024-05-03 16:39:36,176.176 DEBUG:teuthology.packaging:looking for centos/8 x86_64 default
2024-05-03 16:39:36,176.176 DEBUG:teuthology.packaging:build: centos/8 arm64 default
2024-05-03 16:39:36,176.176 DEBUG:teuthology.packaging:build: centos/9 x86_64 crimson
2024-05-03 16:39:36,176.176 DEBUG:teuthology.packaging:build: centos/9 x86_64 default
2024-05-03 16:39:36,176.176 DEBUG:teuthology.packaging:build: centos/8 arm64 default
2024-05-03 16:39:36,176.176 DEBUG:teuthology.packaging:build: centos/8 x86_64 crimson
2024-05-03 16:39:36,177.177 DEBUG:teuthology.packaging:build: centos/8 x86_64 default
2024-05-03 16:39:36,178.178 INFO:teuthology.suite.util:Container build incomplete
Traceback (most recent call last):
  File "./teuthology/virtualenv/bin/teuthology-suite", line 8, in <module>
    sys.exit(main())
  File "/cephfs/home/lflores/teuthology/scripts/suite.py", line 226, in main
    return teuthology.suite.main(args)
  File "/cephfs/home/lflores/teuthology/teuthology/suite/__init__.py", line 143, in main
    run = Run(conf)
  File "/cephfs/home/lflores/teuthology/teuthology/suite/run.py", line 56, in __init__
    self.base_config = self.create_initial_config()
  File "/cephfs/home/lflores/teuthology/teuthology/suite/run.py", line 94, in create_initial_config
    self.choose_ceph_version(ceph_hash)
  File "/cephfs/home/lflores/teuthology/teuthology/suite/run.py", line 216, in choose_ceph_version
    util.schedule_fail(msg, self.name, dry_run=self.args.dry_run)
  File "/cephfs/home/lflores/teuthology/teuthology/suite/util.py", line 77, in schedule_fail
    raise ScheduleFailError(message, name)
teuthology.exceptions.ScheduleFailError: Scheduling lflores-2024-05-03_16:39:35-fs-wip-pdonnell-testing-20240503.010653-debug-distro-default-smithi failed: Packages for os_type 'centos', flavor default and ceph hash 'ec1d3bd17a3db9d74296aa618f8d63c801bb647e' not found
More work should be done to fix the "double build" issue in Jenkins, so this change should be considered a workaround.
Signed-off-by: Laura Flores <lflores@ibm.com>