git.apps.os.sepia.ceph.com Git - teuthology.git/log
teuthology.git
7 years agoceph-ansible: add hard-coded notario dependency 1176/head
Casey Bodley [Tue, 22 May 2018 19:26:19 +0000 (15:26 -0400)]
ceph-ansible: add hard-coded notario dependency

Fixes: http://tracker.ceph.com/issues/24230
Signed-off-by: Casey Bodley <cbodley@redhat.com>
7 years agoMerge pull request #1175 from ceph/wip-24168
Dan Mick [Fri, 18 May 2018 20:57:46 +0000 (13:57 -0700)]
Merge pull request #1175 from ceph/wip-24168

Unpin libvirt-python

7 years agoUnpin libvirt-python 1175/head
Zack Cerza [Fri, 18 May 2018 19:19:36 +0000 (13:19 -0600)]
Unpin libvirt-python

http://tracker.ceph.com/issues/24168

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1174 from ceph/wip-suite-tests
David Galloway [Wed, 16 May 2018 21:11:08 +0000 (17:11 -0400)]
Merge pull request #1174 from ceph/wip-suite-tests

suite/test/test_run_.py: Don't hit the network!

7 years agosuite/test/test_run_.py: Don't hit the network! 1174/head
Zack Cerza [Wed, 16 May 2018 21:00:07 +0000 (15:00 -0600)]
suite/test/test_run_.py: Don't hit the network!

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1173 from ceph/wip-vault-pass
Zack Cerza [Wed, 16 May 2018 20:37:30 +0000 (14:37 -0600)]
Merge pull request #1173 from ceph/wip-vault-pass

ansible.py: Write "foo" to ~/.vault_pass.txt instead of touching

7 years agoansible.py: Write "foo" to ~/.vault_pass.txt instead of touching 1173/head
David Galloway [Wed, 16 May 2018 19:46:38 +0000 (15:46 -0400)]
ansible.py: Write "foo" to ~/.vault_pass.txt instead of touching

ansible-playbook will not run with an empty vault password file.  It
will run if the password file has something in it even if it doesn't get
used.

Signed-off-by: David Galloway <dgallowa@redhat.com>
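
The behaviour described in this commit can be sketched roughly as follows. This is an illustration only, not the code in ansible.py: the helper name `write_placeholder_vault_pass` and its `home` parameter are hypothetical, and only the file name `.vault_pass.txt` and the placeholder content "foo" come from the commit message.

```python
import os


def write_placeholder_vault_pass(home):
    """Create <home>/.vault_pass.txt with non-empty placeholder content.

    Hypothetical helper. Per the commit message, ansible-playbook refuses
    to run with an empty vault password file, so a dummy value is written
    instead of merely touching the file.
    """
    path = os.path.join(home, ".vault_pass.txt")
    with open(path, "w") as f:
        f.write("foo")
    return path
```

Passing the home directory in explicitly, rather than expanding `~` inside the helper, is a choice made here so the sketch can be tried against a scratch directory.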
7 years agoMerge pull request #1165 from ceph/wip-wusui-23208
Zack Cerza [Mon, 7 May 2018 20:57:28 +0000 (14:57 -0600)]
Merge pull request #1165 from ceph/wip-wusui-23208

Allow both $ and directory$ for random yamls.

7 years agoAllow both $ and directory$ for random yamls. 1165/head
Warren Usui [Wed, 11 Apr 2018 00:48:55 +0000 (00:48 +0000)]
Allow both $ and directory$ for random yamls.

If either the directory contains a magic $ file, or if the directory name
ends with $, then the random selection of a yaml file will occur.

Signed-off-by: Warren Usui <wusui@redhat.com>
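
A minimal sketch of the rule stated in this commit message, assuming a plain directory of `.yaml` files. The function names `uses_random_selection` and `choose_yaml` are hypothetical and are not taken from the teuthology source; the real suite-building code handles many more cases.

```python
import os
import random


def uses_random_selection(dir_path):
    """Return True if the directory asks for random yaml selection.

    Per the commit message: either the directory contains a magic file
    named '$', or the directory name itself ends with '$'.
    """
    name = os.path.basename(os.path.normpath(dir_path))
    has_magic_file = os.path.isfile(os.path.join(dir_path, "$"))
    return has_magic_file or name.endswith("$")


def choose_yaml(dir_path, rng=random):
    """Pick one .yaml file at random, or return None if the rule does not apply."""
    if not uses_random_selection(dir_path):
        return None
    candidates = sorted(f for f in os.listdir(dir_path) if f.endswith(".yaml"))
    return rng.choice(candidates) if candidates else None
```

The `rng` parameter is there so a seeded `random.Random` instance can be passed in when reproducible choices are wanted.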
7 years agoMerge pull request #1172 from batrick/slow-ops-whitelist
Sage Weil [Thu, 3 May 2018 13:50:06 +0000 (08:50 -0500)]
Merge pull request #1172 from batrick/slow-ops-whitelist

remove blanket SLOW_OPS whitelist

7 years agoremove blanket SLOW_OPS whitelist 1172/head
Patrick Donnelly [Thu, 3 May 2018 13:45:52 +0000 (06:45 -0700)]
remove blanket SLOW_OPS whitelist

Should no longer be necessary after [1].

[1] https://github.com/ceph/ceph/pull/21684

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoMerge pull request #1170 from ceph/wip-bionic
Zack Cerza [Tue, 1 May 2018 19:51:35 +0000 (13:51 -0600)]
Merge pull request #1170 from ceph/wip-bionic

Add Bionic to distro map

7 years agoAdd Bionic to distro map 1170/head
David Galloway [Mon, 30 Apr 2018 15:43:31 +0000 (11:43 -0400)]
Add Bionic to distro map

Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1167 from tchaikov/wip-job-404
Zack Cerza [Mon, 30 Apr 2018 11:59:58 +0000 (05:59 -0600)]
Merge pull request #1167 from tchaikov/wip-job-404

teuthology/lock: ignore none 200 jobs in node_job_is_active()

7 years agoMerge pull request #1169 from ceph/wip-23798
David Galloway [Tue, 24 Apr 2018 15:56:27 +0000 (11:56 -0400)]
Merge pull request #1169 from ceph/wip-23798

task.selinux: Whitelist syslogd_t denials

7 years agotask.selinux: Whitelist syslogd_t denials 1169/head
David Galloway [Tue, 24 Apr 2018 15:25:40 +0000 (11:25 -0400)]
task.selinux: Whitelist syslogd_t denials

Fixes: http://tracker.ceph.com/issues/23798
Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1168 from tchaikov/wip-mds-all-down
Kefu Chai [Mon, 23 Apr 2018 05:10:56 +0000 (13:10 +0800)]
Merge pull request #1168 from tchaikov/wip-mds-all-down

placeholder: whitelist MDS_ALL_DOWN by default

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoplaceholder: whitelist MDS_ALL_DOWN, MDS_UP_LESS_THAN_MAX by default 1168/head
Kefu Chai [Sat, 21 Apr 2018 15:47:23 +0000 (23:47 +0800)]
placeholder: whitelist MDS_ALL_DOWN, MDS_UP_LESS_THAN_MAX  by default

Because ceph/qa/tasks/ceph.py starts mon, mgr, osd, and then mds, there is
a time window in which no mds is running while mgr is already checking the
mdsmap for MDS_ALL_DOWN errors. There is no way to disable this check
during that window, so we just whitelist MDS_ALL_DOWN here.

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoteuthology/lock: ignore none 200 jobs in node_job_is_active() 1167/head
Kefu Chai [Sun, 15 Apr 2018 23:56:05 +0000 (07:56 +0800)]
teuthology/lock: ignore none 200 jobs in node_job_is_active()

There is a chance that we will get a 404 when accessing a job's URI.

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #1166 from batrick/i23662
Sage Weil [Thu, 12 Apr 2018 03:27:28 +0000 (22:27 -0500)]
Merge pull request #1166 from batrick/i23662

placeholder: update whitelist for osd slow op wrn

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
7 years agoplaceholder: update whitelist for osd slow op wrn 1166/head
Patrick Donnelly [Wed, 11 Apr 2018 23:19:25 +0000 (16:19 -0700)]
placeholder: update whitelist for osd slow op wrn

Caused by: https://github.com/ceph/ceph/pull/20660 (ea97c120d2173f2fc70d979d57a7edb2a6c5da5e)

Fixes: https://tracker.ceph.com/issues/23662
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoMerge pull request #1160 from ceph/wip-intro-link-new-wiki 1163/head
Zack Cerza [Thu, 5 Apr 2018 22:29:06 +0000 (16:29 -0600)]
Merge pull request #1160 from ceph/wip-intro-link-new-wiki

Update getting started instructions to link to the new sepia wiki

7 years agoMerge pull request #1154 from ceph/wip-distro-head
Zack Cerza [Thu, 5 Apr 2018 22:25:41 +0000 (16:25 -0600)]
Merge pull request #1154 from ceph/wip-distro-head

task.kernel: Only show latest kernel when running rpm -q kernel

7 years agoMerge pull request #1164 from ceph/wip-skip-tags
Zack Cerza [Thu, 5 Apr 2018 19:28:13 +0000 (13:28 -0600)]
Merge pull request #1164 from ceph/wip-skip-tags

ansible: Add ability to skip tags during ansible task

7 years agoansible: Add ability to skip tags during ansible task 1164/head
David Galloway [Thu, 5 Apr 2018 14:42:02 +0000 (10:42 -0400)]
ansible: Add ability to skip tags during ansible task

Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1161 from ceph/wip-ansible-group-vars
vasukulkarni [Wed, 4 Apr 2018 15:26:15 +0000 (08:26 -0700)]
Merge pull request #1161 from ceph/wip-ansible-group-vars

task.ansible: Allow passing in custom group_vars

7 years agoOverhaul ansible task tests 1161/head
Zack Cerza [Tue, 3 Apr 2018 18:03:40 +0000 (12:03 -0600)]
Overhaul ansible task tests

This fixes failures that have only manifested in Jenkins; it also causes
CephLab to run more of the Ansible tests.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoUpdate tox again
Zack Cerza [Tue, 3 Apr 2018 18:03:26 +0000 (12:03 -0600)]
Update tox again

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agotask.internal.syslog: Whitelist a docker warning
Zack Cerza [Tue, 3 Apr 2018 01:33:51 +0000 (19:33 -0600)]
task.internal.syslog: Whitelist a docker warning

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agopytest.ini: Disable cacheprovider
Zack Cerza [Mon, 2 Apr 2018 21:35:47 +0000 (15:35 -0600)]
pytest.ini: Disable cacheprovider

A change in 3.5.0 seems to be buggy; disable the cache rather than
pinning versions.

I believe the offending commit is:
https://github.com/pytest-dev/pytest/commit/dff0500114971b30a7bb9043acb0d0fb6a9e01c4

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agotask.ansible: Allow passing in custom group_vars
Zack Cerza [Wed, 21 Mar 2018 23:37:32 +0000 (17:37 -0600)]
task.ansible: Allow passing in custom group_vars

Up until now, if you wanted to inject vars to a playbook run, you had to
use --extra-vars, which don't behave the same way that group_vars do.
This commit adds that functionality.

We look for a 'group_vars' dict in the task's config object. If it's
there, we create group_vars files with names taken from the keys, and
content taken from the values.

Signed-off-by: Zack Cerza <zack@redhat.com>
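
The mechanism described in this commit could look roughly like the sketch below. It is an illustration under stated assumptions, not the task's implementation: the function name `write_group_vars`, the `inventory_dir` parameter, and the `.yml` suffix are all hypothetical. JSON is used for serialization only to keep the sketch stdlib-only (JSON is, for practical purposes, readable as YAML); the real task may serialize differently.

```python
import json
import os


def write_group_vars(config, inventory_dir):
    """Write one group_vars file per key of config['group_vars'].

    Hypothetical helper. File names are taken from the dict keys and file
    content from the values, as the commit message describes. Returns the
    list of paths written (empty if no 'group_vars' dict is present).
    """
    group_vars = config.get("group_vars")
    if not group_vars:
        return []
    target = os.path.join(inventory_dir, "group_vars")
    os.makedirs(target, exist_ok=True)
    written = []
    for group, variables in sorted(group_vars.items()):
        path = os.path.join(target, group + ".yml")
        with open(path, "w") as f:
            json.dump(variables, f, indent=2)
        written.append(path)
    return written
```

For example, a config of `{"group_vars": {"all": {"foo": "bar"}}}` would produce a single file `group_vars/all.yml` under the given directory.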
7 years agoMerge pull request #1157 from ceph/wusui-23208
Zack Cerza [Thu, 29 Mar 2018 20:40:14 +0000 (14:40 -0600)]
Merge pull request #1157 from ceph/wusui-23208

Implement $ option to randomly choose yamls.

7 years agoUpdate getting started instructions to link to the new sepia wiki 1160/head
Gregory Meno [Mon, 26 Mar 2018 17:14:06 +0000 (10:14 -0700)]
Update getting started instructions to link to the new sepia wiki

Signed-off-by: Gregory Meno <gmeno@redhat.com>
7 years agoAdd unit tests for '$' file. 1157/head 1159/head
Warren Usui [Fri, 23 Mar 2018 01:51:57 +0000 (01:51 +0000)]
Add unit tests for '$' file.

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years agoMerge pull request #1158 from ceph/wip-wait-fog
Zack Cerza [Tue, 20 Mar 2018 21:39:01 +0000 (15:39 -0600)]
Merge pull request #1158 from ceph/wip-wait-fog

Increase timeouts teuthology will wait for FOG provisioning

7 years agofog: Wait 10 minutes for machine to be reachable after deploy 1158/head
David Galloway [Tue, 27 Feb 2018 18:29:56 +0000 (13:29 -0500)]
fog: Wait 10 minutes for machine to be reachable after deploy

A lot's going on in rc.local after a machine is provisioned with FOG.  5
minutes is a little aggressive when taking into account the time it
takes for:
 - The machine to reboot after the FOG task completes
 - BIOS to load
 - DHCP/PXE/TFTP to timeout (double this if NIC order isn't correct)
 - OS to boot and rc.local to do its magic

Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agofog: Wait 15 minutes for FOG task to complete
David Galloway [Tue, 27 Feb 2018 18:28:10 +0000 (13:28 -0500)]
fog: Wait 15 minutes for FOG task to complete

When there are running jobs and a large run gets scheduled, FOG
provisioning gets backed up due to its built-in rate limiting.  15
minutes should be a little more lenient and prevent large spikes of dead
jobs when a run is scheduled in an idle queue.

Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoImplement $ option to randomly choose yamls.
Warren Usui [Tue, 20 Mar 2018 01:15:42 +0000 (01:15 +0000)]
Implement $ option to randomly choose yamls.

This implements tracker #23208

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years agoMerge pull request #1156 from ceph/wip-libvirt-sdist
vasukulkarni [Thu, 15 Mar 2018 19:40:53 +0000 (12:40 -0700)]
Merge pull request #1156 from ceph/wip-libvirt-sdist

Drop libvirt from setup.py

7 years agoDrop libvirt from setup.py 1156/head
Zack Cerza [Thu, 15 Mar 2018 19:24:12 +0000 (13:24 -0600)]
Drop libvirt from setup.py

Having an unversioned libvirt-python in setup.py started causing
problems with tox, since it uses an sdist archive as a basis for
installing teuthology in its virtualenvs. Removing it is consistent with
best practices. We'll still keep it in requirements.txt.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agotask.kernel: Only show latest kernel when running rpm -q kernel 1154/head
David Galloway [Thu, 15 Mar 2018 16:12:45 +0000 (12:12 -0400)]
task.kernel: Only show latest kernel when running rpm -q kernel

Fixes: http://tracker.ceph.com/issues/23381
Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1152 from ceph/wip-rhsm-selinux
David Galloway [Wed, 14 Mar 2018 19:33:20 +0000 (15:33 -0400)]
Merge pull request #1152 from ceph/wip-rhsm-selinux

task.selinux: Whitelist rhsmd denials

7 years agotask.selinux: Whitelist rhsmd denials 1152/head
David Galloway [Wed, 14 Mar 2018 17:12:06 +0000 (13:12 -0400)]
task.selinux: Whitelist rhsmd denials

These started showing up once we added RHEL to Sepia.

Fixes: https://tracker.ceph.com/issues/23343#note-5
Signed-off-by: David Galloway <dgallowa@redhat.com>
7 years agoMerge pull request #1148 from badone/wip-fedora-community-mysql-devel-depend
vasukulkarni [Wed, 28 Feb 2018 00:24:44 +0000 (16:24 -0800)]
Merge pull request #1148 from badone/wip-fedora-community-mysql-devel-depend

bootstrap: Rename mysql-community-devel for fedora

7 years agoMerge pull request #1150 from ceph/pytest-3.4
vasukulkarni [Tue, 27 Feb 2018 23:55:26 +0000 (15:55 -0800)]
Merge pull request #1150 from ceph/pytest-3.4

Unbreak testing with py.test 3.4

7 years agoUpdate tox 1150/head
Zack Cerza [Tue, 27 Feb 2018 23:46:53 +0000 (16:46 -0700)]
Update tox

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoUnbreak py.test
Zack Cerza [Tue, 27 Feb 2018 23:46:23 +0000 (16:46 -0700)]
Unbreak py.test

https://docs.pytest.org/en/latest/logging.html#incompatible-changes-in-pytest-3-4

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoUpdate ansible and cryptography
Zack Cerza [Mon, 26 Feb 2018 22:47:11 +0000 (15:47 -0700)]
Update ansible and cryptography

via pip-compile -P ansible -P cryptography

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agobootstrap: Rename mysql-community-devel for fedora 1148/head
Brad Hubbard [Fri, 16 Feb 2018 00:41:33 +0000 (10:41 +1000)]
bootstrap: Rename mysql-community-devel for fedora

The name of the package is community-mysql-devel in currently
supported releases.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
7 years agoMerge pull request #1146 from ceph/wip-allow-more-drift
David Galloway [Fri, 9 Feb 2018 19:02:14 +0000 (14:02 -0500)]
Merge pull request #1146 from ceph/wip-allow-more-drift

ceph.conf: mon_clock_drift_allowed .5 -> 1.0

7 years agoceph.conf: mon_clock_drift_allowed .5 -> 1.0 1146/head
Sage Weil [Wed, 31 Jan 2018 12:37:23 +0000 (06:37 -0600)]
ceph.conf: mon_clock_drift_allowed .5 -> 1.0

All of the errors I see seem to be between .5 and .9s.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years agoMerge pull request #1145 from jcsp/wip-mgr-valgrind
Kefu Chai [Thu, 18 Jan 2018 13:31:58 +0000 (21:31 +0800)]
Merge pull request #1145 from jcsp/wip-mgr-valgrind

install: extend python-related valgrind rules

Reviewed-By: Kefu Chai <kchai@redhat.com>
7 years agoinstall: extend python-related valgrind rules 1145/head
John Spray [Mon, 15 Jan 2018 10:20:28 +0000 (10:20 +0000)]
install: extend python-related valgrind rules

These were set for kind 'possible' which was failing
to suppress Leak_DefinitelyLost errors.

Also add several suppressions related to module
initialization.

Signed-off-by: John Spray <john.spray@redhat.com>
7 years agoMerge pull request #1144 from tchaikov/wip-22438
Kefu Chai [Fri, 12 Jan 2018 14:51:31 +0000 (22:51 +0800)]
Merge pull request #1144 from tchaikov/wip-22438

teuthology/task/install/valgrind.supp: add suppression for dlopen()

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoteuthology/task/install/valgrind.supp: add suppression for dlopen() 1144/head
Kefu Chai [Fri, 12 Jan 2018 06:24:30 +0000 (14:24 +0800)]
teuthology/task/install/valgrind.supp: add suppression for dlopen()

should cover the case of malloc() in addition to calloc()

Fixes: http://tracker.ceph.com/issues/22438
Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #1143 from tchaikov/wip-22438
Kefu Chai [Wed, 10 Jan 2018 15:15:43 +0000 (23:15 +0800)]
Merge pull request #1143 from tchaikov/wip-22438

teuthology/task/install/valgrind.supp: add suppression for dlopen()

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
7 years agoteuthology/task/install/valgrind.supp: add suppression for dlopen() 1143/head
Kefu Chai [Wed, 10 Jan 2018 08:39:15 +0000 (16:39 +0800)]
teuthology/task/install/valgrind.supp: add suppression for dlopen()

the analysis in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700899
also applies to ceph-common.

Fixes: http://tracker.ceph.com/issues/22438
Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #1130 from tchaikov/wip-rerun-no-desc
vasukulkarni [Thu, 4 Jan 2018 20:04:57 +0000 (12:04 -0800)]
Merge pull request #1130 from tchaikov/wip-rerun-no-desc

suite: do not rerun jobs w/o description

7 years agoMerge pull request #1141 from ceph/wip-wusui-22518
Zack Cerza [Tue, 2 Jan 2018 16:29:16 +0000 (09:29 -0700)]
Merge pull request #1141 from ceph/wip-wusui-22518

Fix installer.0 bug.

7 years agoFix installer.0 bug. 1141/head
Warren Usui [Thu, 21 Dec 2017 03:03:03 +0000 (03:03 +0000)]
Fix installer.0 bug.
Ensure that the mon node is used for rbd_pools.

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years agoMerge pull request #1140 from ceph/wip-foryuri-wusui
Zack Cerza [Tue, 19 Dec 2017 18:12:53 +0000 (11:12 -0700)]
Merge pull request #1140 from ceph/wip-foryuri-wusui

Implement installer.0 role.

7 years agoMerge pull request #1139 from ceph/wip-valgrind
Mykola Golub [Mon, 18 Dec 2017 08:48:00 +0000 (10:48 +0200)]
Merge pull request #1139 from ceph/wip-valgrind

valgrind: added suppression for cython constants

Reviewed-by: Mykola Golub <to.my.trociny@gmail.com>
7 years agovalgrind: added suppression for cython constants 1139/head
Jason Dillaman [Tue, 12 Dec 2017 21:38:54 +0000 (16:38 -0500)]
valgrind: added suppression for cython constants

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
7 years agoImplement installer.0 role. 1140/head
Warren Usui [Fri, 15 Dec 2017 07:59:56 +0000 (07:59 +0000)]
Implement installer.0 role.
This allows one to place the ceph ansible installer node on another machine.
It defaults back to the lowest monitor if there is no installer.0 role.

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years agoMerge pull request #1138 from ceph/wip-rocks-leak
Sage Weil [Mon, 11 Dec 2017 21:25:34 +0000 (15:25 -0600)]
Merge pull request #1138 from ceph/wip-rocks-leak

teuthology/task/install/valgrind.supp: new rocksdb leak

7 years agoteuthology/task/install/valgrind.supp: new rocksdb leak 1138/head
Sage Weil [Mon, 11 Dec 2017 21:24:35 +0000 (15:24 -0600)]
teuthology/task/install/valgrind.supp: new rocksdb leak

<error>
  <unique>0x7</unique>
  <tid>1</tid>
  <kind>Leak_StillReachable</kind>
  <xwhat>
    <text>40 bytes in 1 blocks are still reachable in loss record 8 of 21</text>
    <leakedbytes>40</leakedbytes>
    <leakedblocks>1</leakedblocks>
  </xwhat>
  <stack>
    <frame>
      <ip>0x9EAF203</ip>
      <obj>/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
      <fn>operator new(unsigned long)</fn>
      <dir>/builddir/build/BUILD/valgrind-3.12.0/coregrind/m_replacemalloc</dir>
      <file>vg_replace_malloc.c</file>
      <line>334</line>
    </frame>
    <frame>
      <ip>0xBDB237</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>allocate</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/ext</dir>
      <file>new_allocator.h</file>
      <line>111</line>
    </frame>
    <frame>
      <ip>0xBDB237</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>allocate</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>alloc_traits.h</file>
      <line>436</line>
    </frame>
    <frame>
      <ip>0xBDB237</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>std::__detail::_Hashtable_alloc&lt;std::allocator&lt;std::__detail::_Hash_node&lt;void const*, false&gt; &gt; &gt;::_M_allocate_buckets(unsigned long) [clone .isra.181]</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable_policy.h</file>
      <line>2107</line>
    </frame>
    <frame>
      <ip>0xBDDA87</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>_M_allocate_buckets</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>354</line>
    </frame>
    <frame>
      <ip>0xBDDA87</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>_M_rehash_aux</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>2098</line>
    </frame>
    <frame>
      <ip>0xBDDA87</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>std::_Hashtable&lt;rocksdb::ThreadStatusData*, rocksdb::ThreadStatusData*, std::allocator&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Identity, std::equal_to&lt;rocksdb::ThreadStatusData*&gt;, std::hash&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits&lt;false, true, true&gt; &gt;::_M_rehash(unsigned long, unsigned long const&amp;)</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>2077</line>
    </frame>
    <frame>
      <ip>0xBDDBCA</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>std::_Hashtable&lt;rocksdb::ThreadStatusData*, rocksdb::ThreadStatusData*, std::allocator&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Identity, std::equal_to&lt;rocksdb::ThreadStatusData*&gt;, std::hash&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits&lt;false, true, true&gt; &gt;::_M_insert_unique_node(unsigned long, unsigned long, std::__detail::_Hash_node&lt;rocksdb::ThreadStatusData*, false&gt;*)</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>1724</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>_M_insert&lt;rocksdb::ThreadStatusData* const&amp;, std::__detail::_AllocNode&lt;std::allocator&lt;std::__detail::_Hash_node&lt;rocksdb::ThreadStatusData*, false&gt; &gt; &gt; &gt;</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>1828</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>insert</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable_policy.h</file>
      <line>843</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>insert</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>unordered_set.h</file>
      <line>420</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>rocksdb::ThreadStatusUpdater::RegisterThread(rocksdb::ThreadStatus::ThreadType, unsigned long)</fn>
      <dir>/usr/src/debug/ceph-13.0.0-3917-g948b47a/src/rocksdb/monitoring</dir>
      <file>thread_status_updater.cc</file>
      <line>25</line>
    </frame>
    <frame>
      <ip>0xBEEE09</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)</fn>
      <dir>/usr/src/debug/ceph-13.0.0-3917-g948b47a/src/rocksdb/util</dir>
      <file>threadpool_imp.cc</file>
      <line>258</line>
    </frame>
    <frame>
      <ip>0xC1AF0E</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>execute_native_thread_routine</fn>
    </frame>
    <frame>
      <ip>0xA8FDE24</ip>
      <obj>/usr/lib64/libpthread-2.17.so</obj>
      <fn>start_thread</fn>
    </frame>
    <frame>
      <ip>0xD75234C</ip>
      <obj>/usr/lib64/libc-2.17.so</obj>
      <fn>clone</fn>
    </frame>
  </stack>
</error>

Signed-off-by: Sage Weil <sage@redhat.com>
7 years agoMerge pull request #1136 from ceph/wip-fog
David Galloway [Wed, 6 Dec 2017 23:18:13 +0000 (18:18 -0500)]
Merge pull request #1136 from ceph/wip-fog

nuke: Power off FOG machines instead of reimaging

7 years agonuke: Power off FOG machines instead of reimaging 1136/head
Zack Cerza [Tue, 5 Dec 2017 23:08:45 +0000 (16:08 -0700)]
nuke: Power off FOG machines instead of reimaging

Now that FOG is in master, we don't have to reimage on nuke. Let's shut
them down - the next job will bring them up again during the reimage
process.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1135 from ceph/wip-fog-conslog
David Galloway [Tue, 5 Dec 2017 21:35:13 +0000 (16:35 -0500)]
Merge pull request #1135 from ceph/wip-fog-conslog

Task: Tolerate a missing ctx.config

7 years agoMerge pull request #1134 from ceph/wip-fog-timeouts
David Galloway [Tue, 5 Dec 2017 21:30:18 +0000 (16:30 -0500)]
Merge pull request #1134 from ceph/wip-fog-timeouts

PhysicalConsole fixes for FOG

7 years agoPhysicalConsole: Correctly report power_off failure 1134/head
Zack Cerza [Tue, 5 Dec 2017 21:24:10 +0000 (14:24 -0700)]
PhysicalConsole: Correctly report power_off failure

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoTask: Tolerate a missing ctx.config 1135/head
Zack Cerza [Tue, 5 Dec 2017 21:20:11 +0000 (14:20 -0700)]
Task: Tolerate a missing ctx.config

When we initialize a ConsoleLog task during FOG reimaging, the ctx
object has no config attribute.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoPhysicalConsole: Correctly report power_on failure
Zack Cerza [Tue, 5 Dec 2017 17:57:43 +0000 (10:57 -0700)]
PhysicalConsole: Correctly report power_on failure

We've been logging an error, then unconditionally reporting success. Fix
it.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoPhysicalConsole: replace retry mechanism
Zack Cerza [Tue, 5 Dec 2017 17:53:26 +0000 (10:53 -0700)]
PhysicalConsole: replace retry mechanism

It was buggy and unreadable. Use safe_while.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1133 from tchaikov/wip-hadoop-url
Sage Weil [Tue, 5 Dec 2017 14:50:05 +0000 (08:50 -0600)]
Merge pull request #1133 from tchaikov/wip-hadoop-url

task/hadoop: update hadoop 2.5.2 url

7 years agotask/hadoop: update hadoop 2.5.2 url 1133/head
Kefu Chai [Tue, 5 Dec 2017 07:02:12 +0000 (15:02 +0800)]
task/hadoop: update hadoop 2.5.2 url

Hadoop v2.5.2 is pretty old now and has been removed from the hadoop/common
directory, but it can still be found in the archive.

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoMerge pull request #1131 from ceph/wip-fog-cancel
David Galloway [Mon, 4 Dec 2017 22:30:24 +0000 (17:30 -0500)]
Merge pull request #1131 from ceph/wip-fog-cancel

fog: Cancel stale deploy tasks

7 years agofog: Cancel stale deploy tasks 1131/head
Zack Cerza [Mon, 4 Dec 2017 19:11:19 +0000 (12:11 -0700)]
fog: Cancel stale deploy tasks

In case, for whatever reason, any active deploy tasks already exist for
a given host, cancel them before we schedule a new one.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agofog: Fix TypeError when canceling tasks
Zack Cerza [Mon, 4 Dec 2017 18:59:26 +0000 (11:59 -0700)]
fog: Fix TypeError when canceling tasks

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1126 from ceph/wip-fog
Zack Cerza [Fri, 1 Dec 2017 16:21:12 +0000 (09:21 -0700)]
Merge pull request #1126 from ceph/wip-fog

Bare-metal reimaging with FOG

7 years agoupdate_inventory: Canonicalize hostname 1126/head
Zack Cerza [Thu, 30 Nov 2017 17:03:03 +0000 (10:03 -0700)]
update_inventory: Canonicalize hostname

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agonuke: Reimage instead of nuking as necessary
Zack Cerza [Thu, 30 Nov 2017 00:22:38 +0000 (17:22 -0700)]
nuke: Reimage instead of nuking as necessary

Once this branch is in master, we'll probably want to switch this
behavior to no-op nodes that would be reimaged, since the next job will
do that anyway.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agofog: If deploy fails, cancel the deploy task
Zack Cerza [Tue, 28 Nov 2017 22:48:15 +0000 (15:48 -0700)]
fog: If deploy fails, cancel the deploy task

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agolock.ops.lock_many: Log console while reimaging
Zack Cerza [Tue, 28 Nov 2017 16:52:53 +0000 (09:52 -0700)]
lock.ops.lock_many: Log console while reimaging

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agotask/console_log: Allow specifying remotes
Zack Cerza [Tue, 28 Nov 2017 17:37:43 +0000 (10:37 -0700)]
task/console_log: Allow specifying remotes

For use with reimaging; allow passing remotes directly instead of
relying on the ctx.cluster object, which won't exist at that time.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agotask/console_log: Make logfile names customizable
Zack Cerza [Tue, 28 Nov 2017 16:50:16 +0000 (09:50 -0700)]
task/console_log: Make logfile names customizable

So that we can create separate console logs during the reimage process

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoReimage machines in parallel
Zack Cerza [Tue, 7 Nov 2017 23:05:21 +0000 (16:05 -0700)]
Reimage machines in parallel

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoSupport reimaging with FOG
Zack Cerza [Wed, 23 Aug 2017 20:03:53 +0000 (14:03 -0600)]
Support reimaging with FOG

https://fogproject.org

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agosuite: do not rerun jobs w/o description 1130/head
Kefu Chai [Thu, 30 Nov 2017 06:49:36 +0000 (14:49 +0800)]
suite: do not rerun jobs w/o description

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years agoPhysicalConsole: Add timeout arg to power_cycle()
Zack Cerza [Wed, 23 Aug 2017 19:10:41 +0000 (13:10 -0600)]
PhysicalConsole: Add timeout arg to power_cycle()

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1129 from ceph/wip-keyfix
Zack Cerza [Thu, 30 Nov 2017 00:06:00 +0000 (17:06 -0700)]
Merge pull request #1129 from ceph/wip-keyfix

Tolerate failure to scan VM host keys

7 years agoTolerate failure to scan VM host keys 1129/head
Zack Cerza [Wed, 29 Nov 2017 23:25:18 +0000 (16:25 -0700)]
Tolerate failure to scan VM host keys

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1128 from ceph/wip-keyscan
David Galloway [Wed, 29 Nov 2017 16:59:11 +0000 (11:59 -0500)]
Merge pull request #1128 from ceph/wip-keyscan

misc: Reimplement host key scanning

7 years agoMerge pull request #1123 from tchaikov/wip-wait-util-osds-up
Alfredo Deza [Wed, 29 Nov 2017 11:39:43 +0000 (06:39 -0500)]
Merge pull request #1123 from tchaikov/wip-wait-util-osds-up

misc.wait_until_osds_up: prolong the timeout from 5 min to 9 min

7 years agomisc: Reimplement host key scanning 1128/head
Zack Cerza [Tue, 28 Nov 2017 01:57:29 +0000 (18:57 -0700)]
misc: Reimplement host key scanning

We're seeing very intermittent issues with ssh-keyscan; sometimes given
N hostnames, it only returns N-1 keys. Lack of an error message adds to
the confusion. The solution is to call ssh-keyscan once for each host
instead of batching them together - with a few retries - so that we can
easily ensure we get the right number of keys. If we do not, raise a
RuntimeError.

Signed-off-by: Zack Cerza <zack@redhat.com>
(cherry picked from commit daa28ae9210e1a845840ec80776bd211df2e97e9)
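
The approach described in this commit (one ssh-keyscan call per host, a few retries, RuntimeError if any host is still missing) can be sketched as below. The function names `scan_one_host` and `scan_host_keys`, the `tries` default, and the injectable `scan` parameter are assumptions made for illustration; this is not the code from teuthology's misc module.

```python
import subprocess


def scan_one_host(hostname, timeout=10):
    """Run ssh-keyscan for a single host and return its non-comment output lines."""
    result = subprocess.run(
        ["ssh-keyscan", "-T", str(timeout), hostname],
        capture_output=True,
        text=True,
    )
    return [
        line for line in result.stdout.splitlines()
        if line and not line.startswith("#")
    ]


def scan_host_keys(hostnames, tries=3, scan=scan_one_host):
    """Scan each host separately, with retries, and require a result for every host."""
    keys = {}
    for host in hostnames:
        for _ in range(tries):
            lines = scan(host)
            if lines:
                keys[host] = lines
                break
    missing = [h for h in hostnames if h not in keys]
    if missing:
        # Fail loudly instead of silently returning fewer keys than hosts.
        raise RuntimeError("Unable to scan host keys for: " + ", ".join(missing))
    return keys
```

Scanning per host makes it obvious which host failed, which the batched call could not report. The `scan` parameter exists so the retry logic can be exercised without network access.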

7 years agoDrop pytest-capturelog
Zack Cerza [Tue, 28 Nov 2017 21:20:46 +0000 (14:20 -0700)]
Drop pytest-capturelog

It's apparently been replaced by the built-in pytest-catchlog, which has
a very slightly different API.

Signed-off-by: Zack Cerza <zack@redhat.com>
(cherry picked from commit 939998627e941c2aa06faefae6287dc08e3a0efe)

7 years agoMerge pull request #1127 from ceph/wip-default-os-version
Zack Cerza [Mon, 20 Nov 2017 20:44:44 +0000 (13:44 -0700)]
Merge pull request #1127 from ceph/wip-default-os-version

Update default OS versions

7 years agoUpdate default OS versions 1127/head
Zack Cerza [Mon, 20 Nov 2017 18:54:59 +0000 (10:54 -0800)]
Update default OS versions

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years agoMerge pull request #1124 from tchaikov/wip-more-mon-mgr-mkfs-grace
Kefu Chai [Tue, 7 Nov 2017 11:57:38 +0000 (19:57 +0800)]
Merge pull request #1124 from tchaikov/wip-more-mon-mgr-mkfs-grace

ceph.conf: prolong "mon mgr mkfs grace" to 2 minutes

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@gmail.com>
7 years agoceph.conf: prolong "mon mgr mkfs grace" to 2 minutes 1124/head
Kefu Chai [Tue, 7 Nov 2017 11:15:50 +0000 (19:15 +0800)]
ceph.conf: prolong "mon mgr mkfs grace" to 2 minutes

It might take longer than 1 minute to:
1. create a monmap
2. create the auth keyrings
3. prepare the osd devices and activate OSDs (fdisk,mkfs.xfs,ceph-osd --mkfs)
4. prepare the monitors (mkfs), and start them
5. start monitor

Signed-off-by: Kefu Chai <kchai@redhat.com>