Patrick Donnelly [Tue, 28 Jun 2022 20:29:40 +0000 (16:29 -0400)]
teuthology: add lua based fragment merge scripting
As part of this change, there is a new generator design for producing
job configs. YAML fragments are memoized and merged manually to avoid
expensive and unnecessary parsing of the merged fragments. This provides
for a dramatic speedup in processing matrices with large numbers of
jobs. For the rados suite with --subset 1/1000, this branch is 5x faster
(77s vs. 15s). (Note: the difference shrinks when there are fewer jobs
or larger subsets are used, because cycling and matrix generation
dominate the runtime.)
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
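The memoize-then-merge idea described above can be sketched roughly as follows. This is only an illustration, not the teuthology implementation: `parse_fragment`, `deep_merge`, and `build_job_config` are hypothetical names, and `json.loads` stands in for the YAML parser so the sketch needs only the standard library.

```python
import copy
import functools
import json


@functools.lru_cache(maxsize=None)
def parse_fragment(text):
    # Hypothetical helper: parse one fragment exactly once per distinct text.
    # json.loads is a stand-in for a YAML parser (assumption for this sketch).
    return json.loads(text)


def deep_merge(base, extra):
    """Recursively merge `extra` into `base` in place and return `base`."""
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_merge(base[key], value)
        else:
            # Copy so the cached, parsed fragment is never mutated later.
            base[key] = copy.deepcopy(value)
    return base


def build_job_config(fragment_texts):
    """Merge already-parsed fragments instead of re-parsing a merged document."""
    config = {}
    for text in fragment_texts:
        deep_merge(config, parse_fragment(text))
    return config
```

With this shape, a fragment shared by thousands of jobs is parsed once, and each job pays only for the dictionary merge.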
Zack Cerza [Wed, 8 Jun 2022 18:41:37 +0000 (12:41 -0600)]
nuke.nuke: Rework lock-checking logic
Previously, we would call list_locks(), then iterate over the response,
each time iterating over the list of targets. If list_locks()
encountered an error and returned an empty response, we'd never actually
verify what we intended to. Instead, we should specifically query for
each target. This is far safer and faster.
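A minimal sketch of the per-target approach, under stated assumptions: `get_status` is a hypothetical callable that returns a status dict for one machine (or nothing on error), and the `locked` and `locked_by` keys are assumed for illustration. The point is that a failed or empty query counts as "not verified" rather than being silently skipped.

```python
def find_unverified_targets(targets, owner, get_status):
    """Return the targets that could not be confirmed as locked by `owner`.

    `get_status` is a hypothetical per-target query; a falsy result
    (for example after a lock-server error) is treated as a failure.
    """
    unverified = []
    for name in targets:
        status = get_status(name)
        if not status:
            unverified.append(name)
        elif not status.get("locked"):
            unverified.append(name)
        elif status.get("locked_by") != owner:
            unverified.append(name)
    return unverified
```

Because every target is looked up explicitly, an empty response can no longer make the check pass vacuously.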
Kyr Shatskyy [Wed, 8 Jun 2022 11:28:15 +0000 (13:28 +0200)]
suite: fix type error when description is none
teuthology-wait fails when the run is complete and a job's description
is None
Example:
2022-06-07 16:41:22,538.538 INFO:teuthology.suite:waiting for the run runner-2022-06-07_16:41:04-suse:tier0-ses7p-none-default-ecp to complete
2022-06-07 16:41:22,539.539 DEBUG:teuthology.suite:the list of unfinished jobs will be displayed every 5.0 minutes
2022-06-07 16:46:22,599.599 DEBUG:teuthology.suite:wait for jobs ['654']
2022-06-07 16:51:22,633.633 DEBUG:teuthology.suite:wait for jobs ['654']
2022-06-07 16:51:22,686.686 INFO:teuthology.suite:wait is done
Traceback (most recent call last):
File "/home/runner/src/teuthology_master/virtualenv/bin/teuthology-wait", line 33, in <module>
sys.exit(load_entry_point('teuthology', 'console_scripts', 'teuthology-wait')())
File "/home/runner/src/teuthology_master/scripts/wait.py", line 30, in main
return teuthology.suite.wait(name, config.max_job_time, None)
File "/home/runner/src/teuthology_master/teuthology/suite/__init__.py", line 234, in wait
log.info(job['status'] + " " + url + " " + job['description'])
TypeError: must be str, not NoneType
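One way to guard against the failure in the traceback above, shown as a sketch rather than the actual patch: `format_job_line` is a hypothetical helper that substitutes an empty string when the description is missing or None before building the log line.

```python
def format_job_line(job, url):
    """Build the status line without assuming the description is a string."""
    # `or ""` covers both a missing key and an explicit None value.
    description = job.get("description") or ""
    return " ".join([job["status"], url, description])
```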
Most runs don't use --no-nested-subset, and for those that used --subset,
the if conditions would correctly pick up "seed" (when it mattered).
However, when --subset was not specified in the original run, the "seed"
was not correctly picked up. Therefore, inserting "if
no_nested_subset is None:" before "elif seed is None:" caused the seed
to never be read for most folks' teuthology runs.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
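The general hazard described above can be illustrated with a small sketch. The original code is not shown in this log, so the function names and the dictionary layout here are assumptions; only the `if` / `elif` shape is taken from the message.

```python
def restore_chained(opts, stored):
    """Buggy shape: at most one branch of the chain ever runs."""
    if opts["no_nested_subset"] is None:
        opts["no_nested_subset"] = stored["no_nested_subset"]
    elif opts["seed"] is None:
        # Skipped whenever the branch above was taken.
        opts["seed"] = stored["seed"]
    return opts


def restore_independent(opts, stored):
    """Each option is checked on its own, so none can shadow another."""
    for key in ("no_nested_subset", "seed"):
        if opts[key] is None:
            opts[key] = stored[key]
    return opts
```

Since most runs leave --no-nested-subset unset, the first branch of the chained version is almost always taken and the seed branch is never reached.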
Add beanstalk as a possible queue backend for Teuthology Jobs along with Paddles
With the --queue-backend argument, the user can specify which backend (paddles or beanstalk) to use for maintaining the teuthology job queue.
To avoid overlapping job IDs, a job scheduled in beanstalk is also written to paddles, which returns a unique ID.
This is the ID teuthology will treat as the job ID throughout the run of the job.
To differentiate between the two queue backends, the teuthology-queue command has been split into the teuthology-paddles-queue and teuthology-beanstalk-queue commands.
Add retry for paddles calls and modify pause queue command
1. Add retry loop for the paddles calls.
2. Add run name as a parameter for updating priority of jobs in paddles.
3. Modify the pause queue command to run on server side with an optional pause duration parameter.
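A retry loop of the kind mentioned in item 1 might look roughly like this. It is a generic sketch, not the code from the change: `call_with_retry` and its parameters are hypothetical, and the linear backoff is an arbitrary choice for illustration.

```python
import time


def call_with_retry(func, attempts=3, delay=1.0,
                    exceptions=(Exception,), sleep=time.sleep):
    """Call `func()` up to `attempts` times, re-raising the last error."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as exc:
            last_error = exc
            if attempt < attempts:
                # Linear backoff between tries (assumption for this sketch).
                sleep(delay * attempt)
    raise last_error
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.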
The following changes support the removal of Beanstalk from Teuthology.
In place of Beanstalk, we will now be using Paddles for queue management in Teuthology.
This PR has the corresponding changes for the paddles PR: https://github.com/ceph/paddles/pull/94/files.
The changes include:
1. Removing all beanstalk related code
2. Teuthology scheduler and dispatcher using Paddles queue for scheduling and dispatching jobs
3. Adding support for Paddles queue management
4. Additional functionality of being able to change the priority of Teuthology jobs in the queued state in the teuthology-queue command
Patrick Donnelly [Tue, 24 May 2022 16:14:04 +0000 (12:14 -0400)]
Merge PR #1704 into master
* refs/pull/1704/head:
teuthology/suite/test: test nested subsets
teuthology: add option to disable nested subsets
teuthology/suite: create nested matrix subsets
teuthology/suite: patch builtin open method
teuthology/test: use correct exception type
teuthology/suite/test: make sure patchers are cleaned up on exception
teuthology/suite/test: clarify variable name
Patrick Donnelly [Fri, 14 Jan 2022 20:25:14 +0000 (15:25 -0500)]
teuthology/suite: create nested matrix subsets
The general idea is to allow the `%` convolution operator to also subset
the resulting matrix. This is done by specifying a number of divisions
for the subset in the `%` file. For example:
dir/%:
8
This commit maps a matrix index range of `[0, Subset.size())` to the
matrix it is taking a subset of, `[0, Matrix.size())`. To get full
coverage, a random number is used to specify "which" subset to use.
Contrast this with the `--subset` argument to `teuthology-suite`, which
lets you specify which subset to use.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
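The index mapping described above can be sketched as follows. The strided layout, the class name `StridedSubset`, and the helper `pick_subset` are assumptions made for illustration; the commit message only states that subset indices `[0, Subset.size())` are mapped onto `[0, Matrix.size())` and that a random number selects which subset is used.

```python
import random


class StridedSubset:
    """Maps subset index i to matrix index which + i * divisions (assumed layout)."""

    def __init__(self, matrix_size, divisions, which):
        if not 0 <= which < divisions:
            raise ValueError("which must be in [0, divisions)")
        self.matrix_size = matrix_size
        self.divisions = divisions
        self.which = which

    def size(self):
        # Count of i with which + i * divisions < matrix_size.
        remaining = self.matrix_size - self.which
        return max(0, (remaining + self.divisions - 1) // self.divisions)

    def index(self, i):
        if not 0 <= i < self.size():
            raise IndexError(i)
        return self.which + i * self.divisions


def pick_subset(matrix_size, divisions, rng):
    """Choose "which" subset at random, as the commit message describes."""
    return StridedSubset(matrix_size, divisions, rng.randrange(divisions))
```

Under this layout, the subsets for all values of `which` together cover every matrix index exactly once, which is why repeated runs with different random choices approach full coverage.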