Commit
953662555840 ("qa/tasks/ceph: use Cluster.sh() and Remote.sh()
when appropriate") dropped run.wait(), which waits for all given
processes to exit. This resulted in errors like

  INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./objectstore_tool..log: file changed as we read it
  INFO:teuthology.orchestra.run.smithi107.stderr:tar: ./ceph-client.admin.175125.log: File removed before we read it

as the task moved on to archiving semi-corrupted and uncompressed logs,
filling up the lab cluster.

Revert that hunk, as Cluster.sh() is useless here -- we don't need
stdout or stderr, but very much need parallel execution and need to
wait for the compression to finish.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
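
For illustration only, the "start everything, then wait for everything" pattern that the revert restores can be sketched with Python's standard subprocess module instead of teuthology. The helper names start_all() and wait_all() and the placeholder commands are hypothetical; they loosely stand in for ctx.cluster.run(..., wait=False) and run.wait(). Raising on a non-zero exit status is a choice made for this sketch, not a claim about teuthology's behaviour.

```python
import subprocess
import sys


def start_all(commands):
    """Start every command without blocking and return the process handles."""
    return [subprocess.Popen(cmd) for cmd in commands]


def wait_all(procs):
    """Block until every process has exited; raise if any of them failed."""
    failed = []
    for proc in procs:
        # wait() returns the exit status once the process has finished.
        if proc.wait() != 0:
            failed.append(proc.args)
    if failed:
        raise RuntimeError(f"commands failed: {failed}")


if __name__ == "__main__":
    # Hypothetical stand-ins for the per-host compression jobs.
    jobs = [[sys.executable, "-c", "import time; time.sleep(0.2)"]
            for _ in range(3)]
    procs = start_all(jobs)   # all three run concurrently
    wait_all(procs)           # nothing below runs until all have exited
    print("all jobs finished; safe to archive")
```

Skipping the wait_all() step corresponds to the bug described above: the next step starts while the jobs are still running.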
                 not (ctx.config.get('archive-on-error') and ctx.summary['success']):
             # and logs
             log.info('Compressing logs...')
-            ctx.cluster.sh(
-                'sudo find /var/log/ceph -name *.log -print0 | '
-                'sudo xargs -0 --no-run-if-empty -- gzip --',
-                wait=False)
+            run.wait(
+                ctx.cluster.run(
+                    args=[
+                        'sudo',
+                        'find',
+                        '/var/log/ceph',
+                        '-name',
+                        '*.log',
+                        '-print0',
+                        run.Raw('|'),
+                        'sudo',
+                        'xargs',
+                        '-0',
+                        '--no-run-if-empty',
+                        '--',
+                        'gzip',
+                        '--',
+                    ],
+                    wait=False,
+                ),
+            )
             log.info('Archiving logs...')
             path = os.path.join(ctx.archive, 'remote')