git.apps.os.sepia.ceph.com Git - teuthology.git/log
teuthology.git
7 years agoceph.conf.template: drop cephfs configs 1186/head
Patrick Donnelly [Wed, 27 Jun 2018 16:31:21 +0000 (09:31 -0700)]
ceph.conf.template: drop cephfs configs

This will be set in the cephfs qa suites.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoMerge pull request #1187 from SUSE/wip-restart-workers
Zack Cerza [Tue, 3 Jul 2018 23:41:54 +0000 (17:41 -0600)]
Merge pull request #1187 from SUSE/wip-restart-workers

Restart dead workers

7 years agoMerge pull request #1188 from SUSE/wip-job-owner
Zack Cerza [Tue, 3 Jul 2018 23:39:08 +0000 (17:39 -0600)]
Merge pull request #1188 from SUSE/wip-job-owner

Fix worker can't figure out owner of a job

7 years agoFix worker can't figure out owner of a job 1188/head
Kyr Shatskyy [Mon, 2 Jul 2018 15:14:58 +0000 (17:14 +0200)]
Fix worker can't figure out owner of a job

The patch fixes uncaught exception RuntimeError:

  I could not figure out the owner of the requested job.
  Please pass --owner <owner>.

The worker dies with an unhandled exception in run_with_watchdog
if it can't figure out the owner of a job that it tries
to kill because the job has run longer than the given time limit.
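
A minimal sketch of the kind of guard described here, assuming a hypothetical `kill_job` helper that raises the RuntimeError quoted above; neither function name is taken from the teuthology source, and the actual patch may handle the error differently:

```python
import logging

log = logging.getLogger(__name__)


def kill_job(job_id, owner=None):
    """Hypothetical stand-in for the kill routine: fails when no owner is known."""
    if owner is None:
        raise RuntimeError(
            "I could not figure out the owner of the requested job. "
            "Please pass --owner <owner>."
        )
    return True


def kill_overdue_job(job_id, owner=None):
    """Try to kill an overdue job; log the failure instead of letting it propagate."""
    try:
        return kill_job(job_id, owner)
    except RuntimeError as exc:
        # Without this handler the exception would end the calling worker.
        log.error("could not kill job %s: %s", job_id, exc)
        return False
```

With the handler in place, a job whose owner cannot be determined is logged and skipped, and the caller keeps running.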

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
7 years agoRestart dead workers 1187/head
Kyr Shatskyy [Mon, 7 May 2018 15:01:57 +0000 (18:01 +0300)]
Restart dead workers

This patch allows dead workers to be restarted separately,
without stopping the rest of the teuthology components or,
more importantly, the beanstalkd service.
That also makes it possible to increase the number of workers.
In addition, either pulpito or paddles can be restarted on its own.

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
7 years agoMerge pull request #1185 from batrick/hidden-file
Zack Cerza [Mon, 25 Jun 2018 22:22:19 +0000 (16:22 -0600)]
Merge pull request #1185 from batrick/hidden-file

build_matrix: ignore hidden files

7 years agotest: add hidden file test 1185/head
Patrick Donnelly [Mon, 25 Jun 2018 20:51:39 +0000 (13:51 -0700)]
test: add hidden file test

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agobuild_matrix: ignore hidden files
Patrick Donnelly [Fri, 22 Jun 2018 19:21:41 +0000 (12:21 -0700)]
build_matrix: ignore hidden files

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoMerge pull request #1182 from tchaikov/wip-rerun
Zack Cerza [Wed, 20 Jun 2018 19:53:37 +0000 (13:53 -0600)]
Merge pull request #1182 from tchaikov/wip-rerun

suite: allow `--rerun` to run full set of failed tests

7 years agoteuthology-suite: add --seed option for repeatable random test 1182/head
Kefu Chai [Sat, 16 Jun 2018 16:09:36 +0000 (00:09 +0800)]
teuthology-suite: add --seed option for repeatable random test

currently --rerun does not match tests of
'supported-random-distro$/ubuntu_latest.yaml' with
'supported-random-distro$/centos_latest.yaml'. the former could be part
of description of a failed test, the latter is a part of job
description generated by build_matrix(). because the '$' operator
instructs teuthology to choose a random file under the directory ending
with '$', and we expand the '$' to a randomly picked file *before*
filtering the generated job list with the filter collected from the
failed tests, there is good chance that the job descriptions of the
failed jobs in self.args.filter_in cannot match with the randomly
generated ones.

so, we introduce an argument '--seed' for teuthology-suite for the
repeatable random test. this argument allows user to specify a seed for
the RNG used by build_matrix().
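
The effect of a fixed seed can be sketched as follows. This is an illustration only: `pick_yaml` is a hypothetical helper, not the code in build_matrix(), and the candidate file names are taken from the example above.

```python
import random


def pick_yaml(candidates, seed=None):
    """Pick one file from a '$' directory using an RNG seeded by the caller."""
    rng = random.Random(seed)
    # Sort first so the result does not depend on directory listing order.
    return rng.choice(sorted(candidates))


distros = ["centos_latest.yaml", "ubuntu_latest.yaml"]
first = pick_yaml(distros, seed=1234)
second = pick_yaml(distros, seed=1234)
# Same seed, same pick: a rerun can regenerate the descriptions it filters on.
```

Passing the seed of the original run when rerunning would then reproduce the same random choices, so the filter built from the failed jobs can match.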

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
7 years agosuite: avoid preparing the full list for filtering
Kefu Chai [Tue, 12 Jun 2018 08:36:39 +0000 (16:36 +0800)]
suite: avoid preparing the full list for filtering

speed up the filtering a little bit

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #1180 from SUSE/wip-fix-fetch_binaries_for_coredumps
Zack Cerza [Thu, 7 Jun 2018 18:22:37 +0000 (12:22 -0600)]
Merge pull request #1180 from SUSE/wip-fix-fetch_binaries_for_coredumps

Fix fetch_binaries_for_coredumps

7 years agoFix fetch_binaries_for_coredumps 1180/head
Kyr Shatskyy [Wed, 6 Jun 2018 19:17:43 +0000 (21:17 +0200)]
Fix fetch_binaries_for_coredumps

Addresses error message:

  AttributeError: 'tuple' object has no attribute 'split'

Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
7 years agoMerge pull request #1122 from ceph/wip-restart-daemon
vasukulkarni [Mon, 4 Jun 2018 18:35:48 +0000 (11:35 -0700)]
Merge pull request #1122 from ceph/wip-restart-daemon

[orchestra/systemd]: check status when daemon is restarted

7 years agoMerge pull request #1176 from cbodley/wip-ansible-requirements
vasukulkarni [Wed, 23 May 2018 19:51:35 +0000 (12:51 -0700)]
Merge pull request #1176 from cbodley/wip-ansible-requirements

ceph-ansible: add hard-coded notario dependency

7 years agoceph-ansible: add hard-coded notario dependency 1176/head
Casey Bodley [Tue, 22 May 2018 19:26:19 +0000 (15:26 -0400)]
ceph-ansible: add hard-coded notario dependency

Fixes: http://tracker.ceph.com/issues/24230
Signed-off-by: Casey Bodley <cbodley@redhat.com>
7 years agoMerge pull request #1175 from ceph/wip-24168
Dan Mick [Fri, 18 May 2018 20:57:46 +0000 (13:57 -0700)]
Merge pull request #1175 from ceph/wip-24168

Unpin libvirt-python

7 years agoUnpin libvirt-python 1175/head
Zack Cerza [Fri, 18 May 2018 19:19:36 +0000 (13:19 -0600)]
Unpin libvirt-python

http://tracker.ceph.com/issues/24168

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1174 from ceph/wip-suite-tests
David Galloway [Wed, 16 May 2018 21:11:08 +0000 (17:11 -0400)]
Merge pull request #1174 from ceph/wip-suite-tests

suite/test/test_run_.py: Don't hit the network!

7 years agosuite/test/test_run_.py: Don't hit the network! 1174/head
Zack Cerza [Wed, 16 May 2018 21:00:07 +0000 (15:00 -0600)]
suite/test/test_run_.py: Don't hit the network!

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1173 from ceph/wip-vault-pass
Zack Cerza [Wed, 16 May 2018 20:37:30 +0000 (14:37 -0600)]
Merge pull request #1173 from ceph/wip-vault-pass

ansible.py: Write "foo" to ~/.vault_pass.txt instead of touching

7 years agoansible.py: Write "foo" to ~/.vault_pass.txt instead of touching 1173/head
David Galloway [Wed, 16 May 2018 19:46:38 +0000 (15:46 -0400)]
ansible.py: Write "foo" to ~/.vault_pass.txt instead of touching

ansible-playbook will not run with an empty vault password file.  It
will run if the password file has something in it even if it doesn't get
used.
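
A minimal sketch of the difference between touching the file and writing a placeholder, assuming a hypothetical helper name; the path and the "foo" content come from the commit title:

```python
import os
import tempfile


def write_vault_pass(path, placeholder="foo"):
    """Create the vault password file with non-empty placeholder content."""
    with open(path, "w") as f:
        f.write(placeholder)
    return os.path.getsize(path)


# Demonstrated in a temporary directory rather than the real home directory.
with tempfile.TemporaryDirectory() as tmp:
    size = write_vault_pass(os.path.join(tmp, ".vault_pass.txt"))
```

Touching the file would leave it at zero bytes; writing any content leaves it non-empty, which is the property the message says ansible-playbook requires.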

Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1165 from ceph/wip-wusui-23208
Zack Cerza [Mon, 7 May 2018 20:57:28 +0000 (14:57 -0600)]
Merge pull request #1165 from ceph/wip-wusui-23208

Allow both $ and directory$ for random yamls.

7 years agoAllow both $ and directory$ for random yamls. 1165/head
Warren Usui [Wed, 11 Apr 2018 00:48:55 +0000 (00:48 +0000)]
Allow both $ and directory$ for random yamls.

If either the directory contains a magic $ file, or if the directory name
ends with $, then the random selection of a yaml file will occur.
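
The rule can be sketched as a small predicate. The function name and arguments are hypothetical and only restate the two conditions in the message; the real check in the suite-building code may be structured differently.

```python
import os


def uses_random_selection(dir_path, entries):
    """Return True if a directory should yield one randomly chosen yaml.

    dir_path: path of the directory being expanded
    entries:  names of the files found directly inside it
    """
    name = os.path.basename(dir_path.rstrip("/"))
    # Either the directory name ends with '$' or it holds a file named '$'.
    return name.endswith("$") or "$" in entries
```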

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years agoMerge pull request #1172 from batrick/slow-ops-whitelist
Sage Weil [Thu, 3 May 2018 13:50:06 +0000 (08:50 -0500)]
Merge pull request #1172 from batrick/slow-ops-whitelist

remove blanket SLOW_OPS whitelist

7 years agoremove blanket SLOW_OPS whitelist 1172/head
Patrick Donnelly [Thu, 3 May 2018 13:45:52 +0000 (06:45 -0700)]
remove blanket SLOW_OPS whitelist

Should no longer be necessary after [1].

[1] https://github.com/ceph/ceph/pull/21684

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoMerge pull request #1170 from ceph/wip-bionic
Zack Cerza [Tue, 1 May 2018 19:51:35 +0000 (13:51 -0600)]
Merge pull request #1170 from ceph/wip-bionic

Add Bionic to distro map

7 years agoAdd Bionic to distro map 1170/head
David Galloway [Mon, 30 Apr 2018 15:43:31 +0000 (11:43 -0400)]
Add Bionic to distro map

Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1167 from tchaikov/wip-job-404
Zack Cerza [Mon, 30 Apr 2018 11:59:58 +0000 (05:59 -0600)]
Merge pull request #1167 from tchaikov/wip-job-404

teuthology/lock: ignore none 200 jobs in node_job_is_active()

7 years agoMerge pull request #1169 from ceph/wip-23798
David Galloway [Tue, 24 Apr 2018 15:56:27 +0000 (11:56 -0400)]
Merge pull request #1169 from ceph/wip-23798

task.selinux: Whitelist syslogd_t denials

7 years agotask.selinux: Whitelist syslogd_t denials 1169/head
David Galloway [Tue, 24 Apr 2018 15:25:40 +0000 (11:25 -0400)]
task.selinux: Whitelist syslogd_t denials

Fixes: http://tracker.ceph.com/issues/23798
Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1168 from tchaikov/wip-mds-all-down
Kefu Chai [Mon, 23 Apr 2018 05:10:56 +0000 (13:10 +0800)]
Merge pull request #1168 from tchaikov/wip-mds-all-down

placeholder: whitelist MDS_ALL_DOWN by default

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoplaceholder: whitelist MDS_ALL_DOWN, MDS_UP_LESS_THAN_MAX by default 1168/head
Kefu Chai [Sat, 21 Apr 2018 15:47:23 +0000 (23:47 +0800)]
placeholder: whitelist MDS_ALL_DOWN, MDS_UP_LESS_THAN_MAX by default

because, in ceph/qa/tasks/ceph.py, we start mon, mgr, osd, and then mds.
there is a time window where there is no mds around, but mgr is checking
mdsmap for MDS_ALL_DOWN errors. there is no way to disable this check in
this time window. so we just whitelist MDS_ALL_DOWN here.

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoteuthology/lock: ignore none 200 jobs in node_job_is_active() 1167/head
Kefu Chai [Sun, 15 Apr 2018 23:56:05 +0000 (07:56 +0800)]
teuthology/lock: ignore none 200 jobs in node_job_is_active()

there is a chance that we will get a 404 when accessing a job's URI.

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #1166 from batrick/i23662
Sage Weil [Thu, 12 Apr 2018 03:27:28 +0000 (22:27 -0500)]
Merge pull request #1166 from batrick/i23662

placeholder: update whitelist for osd slow op wrn

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
7 years agoplaceholder: update whitelist for osd slow op wrn 1166/head
Patrick Donnelly [Wed, 11 Apr 2018 23:19:25 +0000 (16:19 -0700)]
placeholder: update whitelist for osd slow op wrn

Caused by: https://github.com/ceph/ceph/pull/20660 (ea97c120d2173f2fc70d979d57a7edb2a6c5da5e)

Fixes: https://tracker.ceph.com/issues/23662
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoMerge pull request #1160 from ceph/wip-intro-link-new-wiki 1163/head
Zack Cerza [Thu, 5 Apr 2018 22:29:06 +0000 (16:29 -0600)]
Merge pull request #1160 from ceph/wip-intro-link-new-wiki

Update getting started instructions to link to the new sepia wiki

7 years agoMerge pull request #1154 from ceph/wip-distro-head
Zack Cerza [Thu, 5 Apr 2018 22:25:41 +0000 (16:25 -0600)]
Merge pull request #1154 from ceph/wip-distro-head

task.kernel: Only show latest kernel when running rpm -q kernel

7 years agoMerge pull request #1164 from ceph/wip-skip-tags
Zack Cerza [Thu, 5 Apr 2018 19:28:13 +0000 (13:28 -0600)]
Merge pull request #1164 from ceph/wip-skip-tags

ansible: Add ability to skip tags during ansible task

7 years agoansible: Add ability to skip tags during ansible task 1164/head
David Galloway [Thu, 5 Apr 2018 14:42:02 +0000 (10:42 -0400)]
ansible: Add ability to skip tags during ansible task

Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1161 from ceph/wip-ansible-group-vars
vasukulkarni [Wed, 4 Apr 2018 15:26:15 +0000 (08:26 -0700)]
Merge pull request #1161 from ceph/wip-ansible-group-vars

task.ansible: Allow passing in custom group_vars

7 years agoOverhaul ansible task tests 1161/head
Zack Cerza [Tue, 3 Apr 2018 18:03:40 +0000 (12:03 -0600)]
Overhaul ansible task tests

This fixes failures that have only manifested in Jenkins; it also causes
CephLab to run more of the Ansible tests.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoUpdate tox again
Zack Cerza [Tue, 3 Apr 2018 18:03:26 +0000 (12:03 -0600)]
Update tox again

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agotask.internal.syslog: Whitelist a docker warning
Zack Cerza [Tue, 3 Apr 2018 01:33:51 +0000 (19:33 -0600)]
task.internal.syslog: Whitelist a docker warning

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agopytest.ini: Disable cacheprovider
Zack Cerza [Mon, 2 Apr 2018 21:35:47 +0000 (15:35 -0600)]
pytest.ini: Disable cacheprovider

A change in 3.5.0 seems to be buggy; disable the cache rather than
pinning versions.

I believe the offending commit is:
https://github.com/pytest-dev/pytest/commit/dff0500114971b30a7bb9043acb0d0fb6a9e01c4

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agotask.ansible: Allow passing in custom group_vars
Zack Cerza [Wed, 21 Mar 2018 23:37:32 +0000 (17:37 -0600)]
task.ansible: Allow passing in custom group_vars

Up until now, if you wanted to inject vars to a playbook run, you had to
use --extra-vars, which don't behave the same way that group_vars do.
This commit adds that functionality.

We look for a 'group_vars' dict in the task's config object. If it's
there, we create group_vars files with names taken from the keys, and
content taken from the values.
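
A rough sketch of that mapping, under stated assumptions: `write_group_vars`, the `group_vars/` subdirectory layout and the `.yml` suffix are illustrative choices, not taken from the task's source. The values are serialised as JSON, which is also valid YAML, only to keep the sketch free of third-party imports.

```python
import json
import os
import tempfile


def write_group_vars(inventory_dir, group_vars):
    """Write one file per key of group_vars, containing that key's value."""
    gv_dir = os.path.join(inventory_dir, "group_vars")
    os.makedirs(gv_dir, exist_ok=True)
    paths = []
    for group, variables in sorted(group_vars.items()):
        path = os.path.join(gv_dir, group + ".yml")
        with open(path, "w") as f:
            json.dump(variables, f, indent=2)
        paths.append(path)
    return paths


# Example task config fragment with a 'group_vars' dict (hypothetical values).
config = {"group_vars": {"all": {"ntp_enabled": True}, "osds": {"devices": ["/dev/sdb"]}}}
with tempfile.TemporaryDirectory() as tmp:
    written = [os.path.basename(p) for p in write_group_vars(tmp, config["group_vars"])]
```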

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1157 from ceph/wusui-23208
Zack Cerza [Thu, 29 Mar 2018 20:40:14 +0000 (14:40 -0600)]
Merge pull request #1157 from ceph/wusui-23208

Implement $ option to randomly choose yamls.

7 years agoUpdate getting started instructions to link to the new sepia wiki 1160/head
Gregory Meno [Mon, 26 Mar 2018 17:14:06 +0000 (10:14 -0700)]
Update getting started instructions to link to the new sepia wiki

Signed-off-by: Gregory Meno <gmeno@redhat.com>
7 years agoAdd unit tests for '$' file. 1157/head 1159/head
Warren Usui [Fri, 23 Mar 2018 01:51:57 +0000 (01:51 +0000)]
Add unit tests for '$' file.

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years agoMerge pull request #1158 from ceph/wip-wait-fog
Zack Cerza [Tue, 20 Mar 2018 21:39:01 +0000 (15:39 -0600)]
Merge pull request #1158 from ceph/wip-wait-fog

Increase timeouts teuthology will wait for FOG provisioning

7 years agofog: Wait 10 minutes for machine to be reachable after deploy 1158/head
David Galloway [Tue, 27 Feb 2018 18:29:56 +0000 (13:29 -0500)]
fog: Wait 10 minutes for machine to be reachable after deploy

A lot's going on in rc.local after a machine is provisioned with FOG.  5
minutes is a little aggressive when taking into account the time it
takes for:
 - The machine to reboot after the FOG task completes
 - BIOS to load
 - DHCP/PXE/TFTP to time out (double this if NIC order isn't correct)
 - OS to boot and rc.local to do its magic
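
The waiting behaviour can be sketched as a polling loop with a deadline. `wait_until` and its parameters are hypothetical names for illustration, not teuthology's own helper; only the 10-minute figure comes from the commit.

```python
import time


def wait_until(check, timeout, interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


REACHABLE_TIMEOUT = 10 * 60  # seconds; the new limit named in the commit title
```

A caller would pass a reachability probe as `check`, for example `wait_until(probe, REACHABLE_TIMEOUT, interval=15)`, where `probe` is whatever function reports that the machine answers.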

Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agofog: Wait 15 minutes for FOG task to complete
David Galloway [Tue, 27 Feb 2018 18:28:10 +0000 (13:28 -0500)]
fog: Wait 15 minutes for FOG task to complete

When there are running jobs and a large run gets scheduled, FOG
provisioning gets backed up due to its built-in rate limiting.  15
minutes should be a little more lenient and prevent large spikes of dead
jobs when a run is scheduled in an idle queue.

Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoImplement $ option to randomly choose yamls.
Warren Usui [Tue, 20 Mar 2018 01:15:42 +0000 (01:15 +0000)]
Implement $ option to randomly choose yamls.

This implements tracker #23208

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years agoMerge pull request #1156 from ceph/wip-libvirt-sdist
vasukulkarni [Thu, 15 Mar 2018 19:40:53 +0000 (12:40 -0700)]
Merge pull request #1156 from ceph/wip-libvirt-sdist

Drop libvirt from setup.py

7 years agoDrop libvirt from setup.py 1156/head
Zack Cerza [Thu, 15 Mar 2018 19:24:12 +0000 (13:24 -0600)]
Drop libvirt from setup.py

Having an unversioned libvirt-python in setup.py started causing
problems with tox, since it uses an sdist archive as a basis for
installing teuthology in its virtualenvs. Removing it is consistent with
best practices. We'll still keep it in requirements.txt.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agotask.kernel: Only show latest kernel when running rpm -q kernel 1154/head
David Galloway [Thu, 15 Mar 2018 16:12:45 +0000 (12:12 -0400)]
task.kernel: Only show latest kernel when running rpm -q kernel

Fixes: http://tracker.ceph.com/issues/23381
Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1152 from ceph/wip-rhsm-selinux
David Galloway [Wed, 14 Mar 2018 19:33:20 +0000 (15:33 -0400)]
Merge pull request #1152 from ceph/wip-rhsm-selinux

task.selinux: Whitelist rhsmd denials

7 years agotask.selinux: Whitelist rhsmd denials 1152/head
David Galloway [Wed, 14 Mar 2018 17:12:06 +0000 (13:12 -0400)]
task.selinux: Whitelist rhsmd denials

These started showing up once we added RHEL to Sepia.

Fixes: https://tracker.ceph.com/issues/23343#note-5
Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1148 from badone/wip-fedora-community-mysql-devel-depend
vasukulkarni [Wed, 28 Feb 2018 00:24:44 +0000 (16:24 -0800)]
Merge pull request #1148 from badone/wip-fedora-community-mysql-devel-depend

bootstrap: Rename mysql-community-devel for fedora

7 years agoMerge pull request #1150 from ceph/pytest-3.4
vasukulkarni [Tue, 27 Feb 2018 23:55:26 +0000 (15:55 -0800)]
Merge pull request #1150 from ceph/pytest-3.4

Unbreak testing with py.test 3.4

7 years agoUpdate tox 1150/head
Zack Cerza [Tue, 27 Feb 2018 23:46:53 +0000 (16:46 -0700)]
Update tox

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoUnbreak py.test
Zack Cerza [Tue, 27 Feb 2018 23:46:23 +0000 (16:46 -0700)]
Unbreak py.test

https://docs.pytest.org/en/latest/logging.html#incompatible-changes-in-pytest-3-4

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoUpdate ansible and cryptography
Zack Cerza [Mon, 26 Feb 2018 22:47:11 +0000 (15:47 -0700)]
Update ansible and cryptography

via pip-compile -P ansible -P cryptography

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agobootstrap: Rename mysql-community-devel for fedora 1148/head
Brad Hubbard [Fri, 16 Feb 2018 00:41:33 +0000 (10:41 +1000)]
bootstrap: Rename mysql-community-devel for fedora

The name of the package is community-mysql-devel in currently
supported releases.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
7 years agoMerge pull request #1146 from ceph/wip-allow-more-drift
David Galloway [Fri, 9 Feb 2018 19:02:14 +0000 (14:02 -0500)]
Merge pull request #1146 from ceph/wip-allow-more-drift

ceph.conf: mon_clock_drift_allowed .5 -> 1.0

7 years agoceph.conf: mon_clock_drift_allowed .5 -> 1.0 1146/head
Sage Weil [Wed, 31 Jan 2018 12:37:23 +0000 (06:37 -0600)]
ceph.conf: mon_clock_drift_allowed .5 -> 1.0

All of the errors I see seem to be between .5 and .9s.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years agoMerge pull request #1145 from jcsp/wip-mgr-valgrind
Kefu Chai [Thu, 18 Jan 2018 13:31:58 +0000 (21:31 +0800)]
Merge pull request #1145 from jcsp/wip-mgr-valgrind

install: extend python-related valgrind rules

Reviewed-By: Kefu Chai <kchai@redhat.com>
7 years agoinstall: extend python-related valgrind rules 1145/head
John Spray [Mon, 15 Jan 2018 10:20:28 +0000 (10:20 +0000)]
install: extend python-related valgrind rules

These were set for kind 'possible' which was failing
to suppress Leak_DefinitelyLost errors.

Also add several suppressions related to module
initialization.

Signed-off-by: John Spray <john.spray@redhat.com>
7 years agoMerge pull request #1144 from tchaikov/wip-22438
Kefu Chai [Fri, 12 Jan 2018 14:51:31 +0000 (22:51 +0800)]
Merge pull request #1144 from tchaikov/wip-22438

teuthology/task/install/valgrind.supp: add suppression for dlopen()

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoteuthology/task/install/valgrind.supp: add suppression for dlopen() 1144/head
Kefu Chai [Fri, 12 Jan 2018 06:24:30 +0000 (14:24 +0800)]
teuthology/task/install/valgrind.supp: add suppression for dlopen()

should cover the case of malloc() in addition to calloc()

Fixes: http://tracker.ceph.com/issues/22438
Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #1143 from tchaikov/wip-22438
Kefu Chai [Wed, 10 Jan 2018 15:15:43 +0000 (23:15 +0800)]
Merge pull request #1143 from tchaikov/wip-22438

teuthology/task/install/valgrind.supp: add suppression for dlopen()

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoteuthology/task/install/valgrind.supp: add suppression for dlopen() 1143/head
Kefu Chai [Wed, 10 Jan 2018 08:39:15 +0000 (16:39 +0800)]
teuthology/task/install/valgrind.supp: add suppression for dlopen()

the analysis in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700899
also applies to ceph-common.

Fixes: http://tracker.ceph.com/issues/22438
Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #1130 from tchaikov/wip-rerun-no-desc
vasukulkarni [Thu, 4 Jan 2018 20:04:57 +0000 (12:04 -0800)]
Merge pull request #1130 from tchaikov/wip-rerun-no-desc

suite: do not rerun jobs w/o description

7 years agoMerge pull request #1141 from ceph/wip-wusui-22518
Zack Cerza [Tue, 2 Jan 2018 16:29:16 +0000 (09:29 -0700)]
Merge pull request #1141 from ceph/wip-wusui-22518

Fix installer.0 bug.

7 years agoFix installer.0 bug. 1141/head
Warren Usui [Thu, 21 Dec 2017 03:03:03 +0000 (03:03 +0000)]
Fix installer.0 bug.
Ensure that mon node is used for rbd_pools.

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years agoMerge pull request #1140 from ceph/wip-foryuri-wusui
Zack Cerza [Tue, 19 Dec 2017 18:12:53 +0000 (11:12 -0700)]
Merge pull request #1140 from ceph/wip-foryuri-wusui

Implement installer.0 role.

7 years agoMerge pull request #1139 from ceph/wip-valgrind
Mykola Golub [Mon, 18 Dec 2017 08:48:00 +0000 (10:48 +0200)]
Merge pull request #1139 from ceph/wip-valgrind

valgrind: added suppression for cython constants

Reviewed-by: Mykola Golub <to.my.trociny@gmail.com>
7 years agovalgrind: added suppression for cython constants 1139/head
Jason Dillaman [Tue, 12 Dec 2017 21:38:54 +0000 (16:38 -0500)]
valgrind: added suppression for cython constants

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
7 years agoImplement installer.0 role. 1140/head
Warren Usui [Fri, 15 Dec 2017 07:59:56 +0000 (07:59 +0000)]
Implement installer.0 role.
This allows one to place the ceph ansible installer node on another machine.
It defaults back to the lowest monitor if there is no installer.0 role.

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years agoMerge pull request #1138 from ceph/wip-rocks-leak
Sage Weil [Mon, 11 Dec 2017 21:25:34 +0000 (15:25 -0600)]
Merge pull request #1138 from ceph/wip-rocks-leak

teuthology/task/install/valgrind.supp: new rocksdb leak

7 years agoteuthology/task/install/valgrind.supp: new rocksdb leak 1138/head
Sage Weil [Mon, 11 Dec 2017 21:24:35 +0000 (15:24 -0600)]
teuthology/task/install/valgrind.supp: new rocksdb leak

<error>
  <unique>0x7</unique>
  <tid>1</tid>
  <kind>Leak_StillReachable</kind>
  <xwhat>
    <text>40 bytes in 1 blocks are still reachable in loss record 8 of 21</text>
    <leakedbytes>40</leakedbytes>
    <leakedblocks>1</leakedblocks>
  </xwhat>
  <stack>
    <frame>
      <ip>0x9EAF203</ip>
      <obj>/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
      <fn>operator new(unsigned long)</fn>
      <dir>/builddir/build/BUILD/valgrind-3.12.0/coregrind/m_replacemalloc</dir>
      <file>vg_replace_malloc.c</file>
      <line>334</line>
    </frame>
    <frame>
      <ip>0xBDB237</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>allocate</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/ext</dir>
      <file>new_allocator.h</file>
      <line>111</line>
    </frame>
    <frame>
      <ip>0xBDB237</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>allocate</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>alloc_traits.h</file>
      <line>436</line>
    </frame>
    <frame>
      <ip>0xBDB237</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>std::__detail::_Hashtable_alloc&lt;std::allocator&lt;std::__detail::_Hash_node&lt;void const*, false&gt; &gt; &gt;::_M_allocate_buckets(unsigned long) [clone .isra.181]</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable_policy.h</file>
      <line>2107</line>
    </frame>
    <frame>
      <ip>0xBDDA87</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>_M_allocate_buckets</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>354</line>
    </frame>
    <frame>
      <ip>0xBDDA87</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>_M_rehash_aux</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>2098</line>
    </frame>
    <frame>
      <ip>0xBDDA87</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>std::_Hashtable&lt;rocksdb::ThreadStatusData*, rocksdb::ThreadStatusData*, std::allocator&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Identity, std::equal_to&lt;rocksdb::ThreadStatusData*&gt;, std::hash&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits&lt;false, true, true&gt; &gt;::_M_rehash(unsigned long, unsigned long const&amp;)</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>2077</line>
    </frame>
    <frame>
      <ip>0xBDDBCA</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>std::_Hashtable&lt;rocksdb::ThreadStatusData*, rocksdb::ThreadStatusData*, std::allocator&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Identity, std::equal_to&lt;rocksdb::ThreadStatusData*&gt;, std::hash&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits&lt;false, true, true&gt; &gt;::_M_insert_unique_node(unsigned long, unsigned long, std::__detail::_Hash_node&lt;rocksdb::ThreadStatusData*, false&gt;*)</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>1724</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>_M_insert&lt;rocksdb::ThreadStatusData* const&amp;, std::__detail::_AllocNode&lt;std::allocator&lt;std::__detail::_Hash_node&lt;rocksdb::ThreadStatusData*, false&gt; &gt; &gt; &gt;</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>1828</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>insert</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable_policy.h</file>
      <line>843</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>insert</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>unordered_set.h</file>
      <line>420</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>rocksdb::ThreadStatusUpdater::RegisterThread(rocksdb::ThreadStatus::ThreadType, unsigned long)</fn>
      <dir>/usr/src/debug/ceph-13.0.0-3917-g948b47a/src/rocksdb/monitoring</dir>
      <file>thread_status_updater.cc</file>
      <line>25</line>
    </frame>
    <frame>
      <ip>0xBEEE09</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)</fn>
      <dir>/usr/src/debug/ceph-13.0.0-3917-g948b47a/src/rocksdb/util</dir>
      <file>threadpool_imp.cc</file>
      <line>258</line>
    </frame>
    <frame>
      <ip>0xC1AF0E</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>execute_native_thread_routine</fn>
    </frame>
    <frame>
      <ip>0xA8FDE24</ip>
      <obj>/usr/lib64/libpthread-2.17.so</obj>
      <fn>start_thread</fn>
    </frame>
    <frame>
      <ip>0xD75234C</ip>
      <obj>/usr/lib64/libc-2.17.so</obj>
      <fn>clone</fn>
    </frame>
  </stack>
</error>

Signed-off-by: Sage Weil <sage@redhat.com>
7 years agoMerge pull request #1136 from ceph/wip-fog
David Galloway [Wed, 6 Dec 2017 23:18:13 +0000 (18:18 -0500)]
Merge pull request #1136 from ceph/wip-fog

nuke: Power off FOG machines instead of reimaging

7 years agonuke: Power off FOG machines instead of reimaging 1136/head
Zack Cerza [Tue, 5 Dec 2017 23:08:45 +0000 (16:08 -0700)]
nuke: Power off FOG machines instead of reimaging

Now that FOG is in master, we don't have to reimage on nuke. Let's shut
them down - the next job will bring them up again during the reimage
process.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1135 from ceph/wip-fog-conslog
David Galloway [Tue, 5 Dec 2017 21:35:13 +0000 (16:35 -0500)]
Merge pull request #1135 from ceph/wip-fog-conslog

Task: Tolerate a missing ctx.config

7 years agoMerge pull request #1134 from ceph/wip-fog-timeouts
David Galloway [Tue, 5 Dec 2017 21:30:18 +0000 (16:30 -0500)]
Merge pull request #1134 from ceph/wip-fog-timeouts

PhysicalConsole fixes for FOG

7 years agoPhysicalConsole: Correctly report power_off failure 1134/head
Zack Cerza [Tue, 5 Dec 2017 21:24:10 +0000 (14:24 -0700)]
PhysicalConsole: Correctly report power_off failure

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoTask: Tolerate a missing ctx.config 1135/head
Zack Cerza [Tue, 5 Dec 2017 21:20:11 +0000 (14:20 -0700)]
Task: Tolerate a missing ctx.config

When we initialize a ConsoleLog task during FOG reimaging, the ctx
object has no config attribute.
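
One way to tolerate the missing attribute is a defaulted lookup. This is a sketch only; `get_task_config` is a hypothetical helper and the real Task class may handle the case differently.

```python
from types import SimpleNamespace


def get_task_config(ctx):
    """Return ctx.config if present and set, otherwise an empty dict."""
    return getattr(ctx, "config", None) or {}


bare_ctx = SimpleNamespace()                          # no 'config' attribute at all
full_ctx = SimpleNamespace(config={"archive_path": "/tmp/archive"})
```

`get_task_config(bare_ctx)` yields an empty dict instead of raising AttributeError, so later `.get()` lookups keep working.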

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoPhysicalConsole: Correctly report power_on failure
Zack Cerza [Tue, 5 Dec 2017 17:57:43 +0000 (10:57 -0700)]
PhysicalConsole: Correctly report power_on failure

We've been logging an error, then unconditionally reporting success. Fix
it.
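
The bug pattern and its fix can be illustrated with a simplified function. The names below are hypothetical and the real PhysicalConsole method talks to the machine's management interface rather than taking callables.

```python
import logging

log = logging.getLogger(__name__)


def power_on(send_command, is_powered_on):
    """Issue a power-on request and report whether it actually worked."""
    send_command("power on")
    if not is_powered_on():
        log.error("Failed to power on")
        # The point of the fix: return the failure instead of falling
        # through to an unconditional success result.
        return False
    return True
```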

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoPhysicalConsole: replace retry mechanism
Zack Cerza [Tue, 5 Dec 2017 17:53:26 +0000 (10:53 -0700)]
PhysicalConsole: replace retry mechanism

It was buggy and unreadable. Use safe_while.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1133 from tchaikov/wip-hadoop-url
Sage Weil [Tue, 5 Dec 2017 14:50:05 +0000 (08:50 -0600)]
Merge pull request #1133 from tchaikov/wip-hadoop-url

task/hadoop: update hadoop 2.5.2 url

7 years agotask/hadoop: update hadoop 2.5.2 url 1133/head
Kefu Chai [Tue, 5 Dec 2017 07:02:12 +0000 (15:02 +0800)]
task/hadoop: update hadoop 2.5.2 url

hadoop v2.5.2 is pretty old now and has been removed from the hadoop/common
directory, but it can still be found in the archive.

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #1131 from ceph/wip-fog-cancel
David Galloway [Mon, 4 Dec 2017 22:30:24 +0000 (17:30 -0500)]
Merge pull request #1131 from ceph/wip-fog-cancel

fog: Cancel stale deploy tasks

7 years agofog: Cancel stale deploy tasks 1131/head
Zack Cerza [Mon, 4 Dec 2017 19:11:19 +0000 (12:11 -0700)]
fog: Cancel stale deploy tasks

In case, for whatever reason, any active deploy tasks already exist for
a given host, cancel them before we schedule a new one.
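
A sketch of that ordering, assuming tasks are represented as dicts with hypothetical `id` and `host` keys and that cancelling and scheduling are supplied as callables; none of this mirrors the FOG API or the teuthology provisioning code.

```python
def schedule_deploy(active_tasks, hostname, cancel, schedule):
    """Cancel any deploy tasks already active for hostname, then schedule a new one."""
    stale = [task for task in active_tasks if task["host"] == hostname]
    for task in stale:
        cancel(task["id"])
    new_id = schedule(hostname)
    return [task["id"] for task in stale], new_id
```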

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agofog: Fix TypeError when canceling tasks
Zack Cerza [Mon, 4 Dec 2017 18:59:26 +0000 (11:59 -0700)]
fog: Fix TypeError when canceling tasks

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1126 from ceph/wip-fog
Zack Cerza [Fri, 1 Dec 2017 16:21:12 +0000 (09:21 -0700)]
Merge pull request #1126 from ceph/wip-fog

Bare-metal reimaging with FOG

7 years agoupdate_inventory: Canonicalize hostname 1126/head
Zack Cerza [Thu, 30 Nov 2017 17:03:03 +0000 (10:03 -0700)]
update_inventory: Canonicalize hostname

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agonuke: Reimage instead of nuking as necessary
Zack Cerza [Thu, 30 Nov 2017 00:22:38 +0000 (17:22 -0700)]
nuke: Reimage instead of nuking as necessary

Once this branch is in master, we'll probably want to switch this
behavior to no-op nodes that would be reimaged, since the next job will
do that anyway.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agofog: If deploy fails, cancel the deploy task
Zack Cerza [Tue, 28 Nov 2017 22:48:15 +0000 (15:48 -0700)]
fog: If deploy fails, cancel the deploy task

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agolock.ops.lock_many: Log console while reimaging
Zack Cerza [Tue, 28 Nov 2017 16:52:53 +0000 (09:52 -0700)]
lock.ops.lock_many: Log console while reimaging

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agotask/console_log: Allow specifying remotes
Zack Cerza [Tue, 28 Nov 2017 17:37:43 +0000 (10:37 -0700)]
task/console_log: Allow specifying remotes

For use with reimaging; allow passing remotes directly instead of
relying on the ctx.cluster object, which won't exist at that time.

Signed-off-by: Zack Cerza <zack@redhat.com>