git.apps.os.sepia.ceph.com Git - teuthology.git/log
teuthology.git
7 years ago Merge pull request #1130 from tchaikov/wip-rerun-no-desc
vasukulkarni [Thu, 4 Jan 2018 20:04:57 +0000 (12:04 -0800)]
Merge pull request #1130 from tchaikov/wip-rerun-no-desc

suite: do not rerun jobs w/o description

7 years ago Merge pull request #1141 from ceph/wip-wusui-22518
Zack Cerza [Tue, 2 Jan 2018 16:29:16 +0000 (09:29 -0700)]
Merge pull request #1141 from ceph/wip-wusui-22518

Fix installer.0 bug.

7 years ago Fix installer.0 bug. 1141/head
Warren Usui [Thu, 21 Dec 2017 03:03:03 +0000 (03:03 +0000)]
Fix installer.0 bug.
Ensure that the mon node is used for rbd_pools.

Signed-off-by: Warren Usui <wusui@redhat.com>
7 years ago Merge pull request #1140 from ceph/wip-foryuri-wusui
Zack Cerza [Tue, 19 Dec 2017 18:12:53 +0000 (11:12 -0700)]
Merge pull request #1140 from ceph/wip-foryuri-wusui

Implement installer.0 role.

7 years ago Merge pull request #1139 from ceph/wip-valgrind
Mykola Golub [Mon, 18 Dec 2017 08:48:00 +0000 (10:48 +0200)]
Merge pull request #1139 from ceph/wip-valgrind

valgrind: added suppression for cython constants

Reviewed-by: Mykola Golub <to.my.trociny@gmail.com>
7 years ago valgrind: added suppression for cython constants 1139/head
Jason Dillaman [Tue, 12 Dec 2017 21:38:54 +0000 (16:38 -0500)]
valgrind: added suppression for cython constants

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
7 years ago Implement installer.0 role. 1140/head
Warren Usui [Fri, 15 Dec 2017 07:59:56 +0000 (07:59 +0000)]
Implement installer.0 role.
This allows one to place the ceph ansible installer node on another machine.
It defaults back to the lowest monitor if there is no installer.0 role.

Signed-off-by: Warren Usui <wusui@redhat.com>
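
The fallback described in this commit message can be sketched as follows. This is only an illustration, assuming roles are available as a mapping from hostname to a list of role strings; the function name `pick_installer_host` and the mapping shape are hypothetical and are not taken from the teuthology code.

```python
def pick_installer_host(roles_by_host):
    """Return the host that should run the installer.

    roles_by_host is a hypothetical mapping: hostname -> list of role strings.
    """
    # Prefer a host that explicitly carries the installer.0 role.
    for host, roles in roles_by_host.items():
        if "installer.0" in roles:
            return host

    # Otherwise fall back to the host holding the lowest monitor role,
    # where "lowest" is taken to mean first in sorted order (an assumption).
    mons = sorted(
        (role, host)
        for host, roles in roles_by_host.items()
        for role in roles
        if role.startswith("mon.")
    )
    if not mons:
        raise ValueError("no installer.0 or mon.* role found")
    return mons[0][1]
```

With no installer.0 role present, a host holding `mon.a` is chosen ahead of one holding `mon.b`.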
7 years ago Merge pull request #1138 from ceph/wip-rocks-leak
Sage Weil [Mon, 11 Dec 2017 21:25:34 +0000 (15:25 -0600)]
Merge pull request #1138 from ceph/wip-rocks-leak

teuthology/task/install/valgrind.supp: new rocksdb leak

7 years ago teuthology/task/install/valgrind.supp: new rocksdb leak 1138/head
Sage Weil [Mon, 11 Dec 2017 21:24:35 +0000 (15:24 -0600)]
teuthology/task/install/valgrind.supp: new rocksdb leak

<error>
  <unique>0x7</unique>
  <tid>1</tid>
  <kind>Leak_StillReachable</kind>
  <xwhat>
    <text>40 bytes in 1 blocks are still reachable in loss record 8 of 21</text>
    <leakedbytes>40</leakedbytes>
    <leakedblocks>1</leakedblocks>
  </xwhat>
  <stack>
    <frame>
      <ip>0x9EAF203</ip>
      <obj>/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
      <fn>operator new(unsigned long)</fn>
      <dir>/builddir/build/BUILD/valgrind-3.12.0/coregrind/m_replacemalloc</dir>
      <file>vg_replace_malloc.c</file>
      <line>334</line>
    </frame>
    <frame>
      <ip>0xBDB237</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>allocate</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/ext</dir>
      <file>new_allocator.h</file>
      <line>111</line>
    </frame>
    <frame>
      <ip>0xBDB237</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>allocate</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>alloc_traits.h</file>
      <line>436</line>
    </frame>
    <frame>
      <ip>0xBDB237</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>std::__detail::_Hashtable_alloc&lt;std::allocator&lt;std::__detail::_Hash_node&lt;void const*, false&gt; &gt; &gt;::_M_allocate_buckets(unsigned long) [clone .isra.181]</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable_policy.h</file>
      <line>2107</line>
    </frame>
    <frame>
      <ip>0xBDDA87</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>_M_allocate_buckets</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>354</line>
    </frame>
    <frame>
      <ip>0xBDDA87</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>_M_rehash_aux</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>2098</line>
    </frame>
    <frame>
      <ip>0xBDDA87</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>std::_Hashtable&lt;rocksdb::ThreadStatusData*, rocksdb::ThreadStatusData*, std::allocator&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Identity, std::equal_to&lt;rocksdb::ThreadStatusData*&gt;, std::hash&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits&lt;false, true, true&gt; &gt;::_M_rehash(unsigned long, unsigned long const&amp;)</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>2077</line>
    </frame>
    <frame>
      <ip>0xBDDBCA</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>std::_Hashtable&lt;rocksdb::ThreadStatusData*, rocksdb::ThreadStatusData*, std::allocator&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Identity, std::equal_to&lt;rocksdb::ThreadStatusData*&gt;, std::hash&lt;rocksdb::ThreadStatusData*&gt;, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits&lt;false, true, true&gt; &gt;::_M_insert_unique_node(unsigned long, unsigned long, std::__detail::_Hash_node&lt;rocksdb::ThreadStatusData*, false&gt;*)</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>1724</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>_M_insert&lt;rocksdb::ThreadStatusData* const&amp;, std::__detail::_AllocNode&lt;std::allocator&lt;std::__detail::_Hash_node&lt;rocksdb::ThreadStatusData*, false&gt; &gt; &gt; &gt;</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable.h</file>
      <line>1828</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>insert</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>hashtable_policy.h</file>
      <line>843</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>insert</fn>
      <dir>/opt/rh/devtoolset-7/root/usr/include/c++/7/bits</dir>
      <file>unordered_set.h</file>
      <line>420</line>
    </frame>
    <frame>
      <ip>0xBDC6BE</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>rocksdb::ThreadStatusUpdater::RegisterThread(rocksdb::ThreadStatus::ThreadType, unsigned long)</fn>
      <dir>/usr/src/debug/ceph-13.0.0-3917-g948b47a/src/rocksdb/monitoring</dir>
      <file>thread_status_updater.cc</file>
      <line>25</line>
    </frame>
    <frame>
      <ip>0xBEEE09</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)</fn>
      <dir>/usr/src/debug/ceph-13.0.0-3917-g948b47a/src/rocksdb/util</dir>
      <file>threadpool_imp.cc</file>
      <line>258</line>
    </frame>
    <frame>
      <ip>0xC1AF0E</ip>
      <obj>/usr/bin/ceph-mon</obj>
      <fn>execute_native_thread_routine</fn>
    </frame>
    <frame>
      <ip>0xA8FDE24</ip>
      <obj>/usr/lib64/libpthread-2.17.so</obj>
      <fn>start_thread</fn>
    </frame>
    <frame>
      <ip>0xD75234C</ip>
      <obj>/usr/lib64/libc-2.17.so</obj>
      <fn>clone</fn>
    </frame>
  </stack>
</error>

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago Merge pull request #1136 from ceph/wip-fog
David Galloway [Wed, 6 Dec 2017 23:18:13 +0000 (18:18 -0500)]
Merge pull request #1136 from ceph/wip-fog

nuke: Power off FOG machines instead of reimaging

7 years ago nuke: Power off FOG machines instead of reimaging 1136/head
Zack Cerza [Tue, 5 Dec 2017 23:08:45 +0000 (16:08 -0700)]
nuke: Power off FOG machines instead of reimaging

Now that FOG is in master, we don't have to reimage on nuke. Let's shut
them down - the next job will bring them up again during the reimage
process.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Merge pull request #1135 from ceph/wip-fog-conslog
David Galloway [Tue, 5 Dec 2017 21:35:13 +0000 (16:35 -0500)]
Merge pull request #1135 from ceph/wip-fog-conslog

Task: Tolerate a missing ctx.config

7 years ago Merge pull request #1134 from ceph/wip-fog-timeouts
David Galloway [Tue, 5 Dec 2017 21:30:18 +0000 (16:30 -0500)]
Merge pull request #1134 from ceph/wip-fog-timeouts

PhysicalConsole fixes for FOG

7 years ago PhysicalConsole: Correctly report power_off failure 1134/head
Zack Cerza [Tue, 5 Dec 2017 21:24:10 +0000 (14:24 -0700)]
PhysicalConsole: Correctly report power_off failure

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Task: Tolerate a missing ctx.config 1135/head
Zack Cerza [Tue, 5 Dec 2017 21:20:11 +0000 (14:20 -0700)]
Task: Tolerate a missing ctx.config

When we initialize a ConsoleLog task during FOG reimaging, the ctx
object has no config attribute.

Signed-off-by: Zack Cerza <zack@redhat.com>
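
One common way to tolerate a missing attribute like this is `getattr` with a default. The sketch below is illustrative only: the helper name `task_settings` and the shape of the config are assumptions, and the actual change in teuthology may look different.

```python
from types import SimpleNamespace


def task_settings(ctx, task_name):
    """Return the settings for task_name, or {} when ctx has no usable config.

    ctx stands in for a context object; the nested-dict config layout
    used here is hypothetical.
    """
    # getattr with a default avoids AttributeError when ctx.config is absent.
    config = getattr(ctx, "config", None) or {}
    return config.get(task_name, {})


# A context without a config attribute simply yields empty settings.
bare_ctx = SimpleNamespace()
full_ctx = SimpleNamespace(config={"console_log": {"enabled": True}})
```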
7 years ago PhysicalConsole: Correctly report power_on failure
Zack Cerza [Tue, 5 Dec 2017 17:57:43 +0000 (10:57 -0700)]
PhysicalConsole: Correctly report power_on failure

We've been logging an error, then unconditionally reporting success. Fix
it.

Signed-off-by: Zack Cerza <zack@redhat.com>
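
The bug pattern described above (log an error, then report success anyway) and its fix can be illustrated with a small sketch. The function and its two callables are hypothetical stand-ins, not the PhysicalConsole API.

```python
import logging

log = logging.getLogger(__name__)


def power_on(send_power_on, check_power_state, tries=3):
    """Try to power a machine on; return True only if it really came up.

    send_power_on and check_power_state are hypothetical callables that
    stand in for whatever the console uses to talk to the hardware.
    """
    for _ in range(tries):
        send_power_on()
        if check_power_state() == "on":
            return True
    # Report the failure to the caller instead of only logging it.
    log.error("Failed to power on after %d tries", tries)
    return False
```

The important detail is the final `return False`: the buggy shape would log the error and then fall through to a success result.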
7 years ago PhysicalConsole: replace retry mechanism
Zack Cerza [Tue, 5 Dec 2017 17:53:26 +0000 (10:53 -0700)]
PhysicalConsole: replace retry mechanism

It was buggy and unreadable. Use safe_while.

Signed-off-by: Zack Cerza <zack@redhat.com>
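
For readers unfamiliar with the idea of a bounded retry helper, here is a simplified stand-in. It is not teuthology's `safe_while` and does not reproduce its behavior or signature; the names `bounded_retry`, `MaxTriesExceeded` and `wait_for` are invented for this sketch.

```python
import time
from contextlib import contextmanager


class MaxTriesExceeded(RuntimeError):
    """Raised when the retry budget is used up."""


@contextmanager
def bounded_retry(tries=5, sleep=0.0):
    """Yield a proceed() callable that permits at most `tries` iterations."""
    attempts = 0

    def proceed():
        nonlocal attempts
        if attempts >= tries:
            raise MaxTriesExceeded(f"gave up after {tries} tries")
        if attempts:
            # Sleep between attempts, but not before the first one.
            time.sleep(sleep)
        attempts += 1
        return True

    yield proceed


def wait_for(predicate, tries=5, sleep=0.0):
    """Poll predicate() until it is true or the retry budget runs out."""
    with bounded_retry(tries=tries, sleep=sleep) as proceed:
        while proceed():
            if predicate():
                return True
```

Keeping the counting and sleeping in one helper means each caller only writes the loop body, which is the readability gain the commit message points at.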
7 years ago Merge pull request #1133 from tchaikov/wip-hadoop-url
Sage Weil [Tue, 5 Dec 2017 14:50:05 +0000 (08:50 -0600)]
Merge pull request #1133 from tchaikov/wip-hadoop-url

task/hadoop: update hadoop 2.5.2 url

7 years ago task/hadoop: update hadoop 2.5.2 url 1133/head
Kefu Chai [Tue, 5 Dec 2017 07:02:12 +0000 (15:02 +0800)]
task/hadoop: update hadoop 2.5.2 url

Hadoop v2.5.2 is fairly old and has been removed from the hadoop/common
directory, but it can still be found in the archive.

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years ago Merge pull request #1131 from ceph/wip-fog-cancel
David Galloway [Mon, 4 Dec 2017 22:30:24 +0000 (17:30 -0500)]
Merge pull request #1131 from ceph/wip-fog-cancel

fog: Cancel stale deploy tasks

7 years ago fog: Cancel stale deploy tasks 1131/head
Zack Cerza [Mon, 4 Dec 2017 19:11:19 +0000 (12:11 -0700)]
fog: Cancel stale deploy tasks

In case, for whatever reason, any active deploy tasks already exist for
a given host, cancel them before we schedule a new one.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago fog: Fix TypeError when canceling tasks
Zack Cerza [Mon, 4 Dec 2017 18:59:26 +0000 (11:59 -0700)]
fog: Fix TypeError when canceling tasks

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Merge pull request #1126 from ceph/wip-fog
Zack Cerza [Fri, 1 Dec 2017 16:21:12 +0000 (09:21 -0700)]
Merge pull request #1126 from ceph/wip-fog

Bare-metal reimaging with FOG

7 years ago update_inventory: Canonicalize hostname 1126/head
Zack Cerza [Thu, 30 Nov 2017 17:03:03 +0000 (10:03 -0700)]
update_inventory: Canonicalize hostname

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago nuke: Reimage instead of nuking as necessary
Zack Cerza [Thu, 30 Nov 2017 00:22:38 +0000 (17:22 -0700)]
nuke: Reimage instead of nuking as necessary

Once this branch is in master, we'll probably want to switch this
behavior to no-op nodes that would be reimaged, since the next job will
do that anyway.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago fog: If deploy fails, cancel the deploy task
Zack Cerza [Tue, 28 Nov 2017 22:48:15 +0000 (15:48 -0700)]
fog: If deploy fails, cancel the deploy task

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago lock.ops.lock_many: Log console while reimaging
Zack Cerza [Tue, 28 Nov 2017 16:52:53 +0000 (09:52 -0700)]
lock.ops.lock_many: Log console while reimaging

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago task/console_log: Allow specifying remotes
Zack Cerza [Tue, 28 Nov 2017 17:37:43 +0000 (10:37 -0700)]
task/console_log: Allow specifying remotes

For use with reimaging; allow passing remotes directly instead of
relying on the ctx.cluster object, which won't exist at that time.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago task/console_log: Make logfile names customizable
Zack Cerza [Tue, 28 Nov 2017 16:50:16 +0000 (09:50 -0700)]
task/console_log: Make logfile names customizable

So that we can create separate console logs during the reimage process

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Reimage machines in parallel
Zack Cerza [Tue, 7 Nov 2017 23:05:21 +0000 (16:05 -0700)]
Reimage machines in parallel

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Support reimaging with FOG
Zack Cerza [Wed, 23 Aug 2017 20:03:53 +0000 (14:03 -0600)]
Support reimaging with FOG

https://fogproject.org

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago suite: do not rerun jobs w/o description 1130/head
Kefu Chai [Thu, 30 Nov 2017 06:49:36 +0000 (14:49 +0800)]
suite: do not rerun jobs w/o description

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years ago PhysicalConsole: Add timeout arg to power_cycle()
Zack Cerza [Wed, 23 Aug 2017 19:10:41 +0000 (13:10 -0600)]
PhysicalConsole: Add timeout arg to power_cycle()

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Merge pull request #1129 from ceph/wip-keyfix
Zack Cerza [Thu, 30 Nov 2017 00:06:00 +0000 (17:06 -0700)]
Merge pull request #1129 from ceph/wip-keyfix

Tolerate failure to scan VM host keys

7 years ago Tolerate failure to scan VM host keys 1129/head
Zack Cerza [Wed, 29 Nov 2017 23:25:18 +0000 (16:25 -0700)]
Tolerate failure to scan VM host keys

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Merge pull request #1128 from ceph/wip-keyscan
David Galloway [Wed, 29 Nov 2017 16:59:11 +0000 (11:59 -0500)]
Merge pull request #1128 from ceph/wip-keyscan

misc: Reimplement host key scanning

7 years ago Merge pull request #1123 from tchaikov/wip-wait-util-osds-up
Alfredo Deza [Wed, 29 Nov 2017 11:39:43 +0000 (06:39 -0500)]
Merge pull request #1123 from tchaikov/wip-wait-util-osds-up

misc.wait_until_osds_up: prolong the timeout from 5 min to 9 min

7 years ago misc: Reimplement host key scanning 1128/head
Zack Cerza [Tue, 28 Nov 2017 01:57:29 +0000 (18:57 -0700)]
misc: Reimplement host key scanning

We're seeing very intermittent issues with ssh-keyscan; sometimes given
N hostnames, it only returns N-1 keys. Lack of an error message adds to
the confusion. The solution is to call ssh-keyscan once for each host
instead of batching them together - with a few retries - so that we can
easily ensure we get the right number of keys. If we do not, raise a
RuntimeError.

Signed-off-by: Zack Cerza <zack@redhat.com>
(cherry picked from commit daa28ae9210e1a845840ec80776bd211df2e97e9)
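
The per-host approach described in this commit message can be sketched roughly as below. This is not the teuthology implementation: the function names, the retry count and the injectable `scan` parameter are assumptions made so the example is self-contained. The only external tool used is `ssh-keyscan` with its `-T` timeout option.

```python
import subprocess


def keyscan_host(host, timeout=5):
    """Run ssh-keyscan for a single host and return its output (may be empty)."""
    proc = subprocess.run(
        ["ssh-keyscan", "-T", str(timeout), host],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    return proc.stdout.strip()


def scan_host_keys(hosts, scan=keyscan_host, tries=3):
    """Scan each host separately, retrying a few times per host.

    Raises RuntimeError if any host never returns a key, so a short
    result can no longer go unnoticed.
    """
    keys = {}
    for host in hosts:
        for _ in range(tries):
            output = scan(host)
            if output:
                keys[host] = output
                break
    missing = [host for host in hosts if host not in keys]
    if missing:
        raise RuntimeError(
            "Unable to scan host keys for: " + ", ".join(missing)
        )
    return keys
```

Scanning one host at a time makes it obvious which host failed, which batching N hostnames into a single call does not.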

7 years ago Drop pytest-capturelog
Zack Cerza [Tue, 28 Nov 2017 21:20:46 +0000 (14:20 -0700)]
Drop pytest-capturelog

It's apparently been replaced by the built-in pytest-catchlog, which has
a very slightly different API.

Signed-off-by: Zack Cerza <zack@redhat.com>
(cherry picked from commit 939998627e941c2aa06faefae6287dc08e3a0efe)

7 years ago Merge pull request #1127 from ceph/wip-default-os-version
Zack Cerza [Mon, 20 Nov 2017 20:44:44 +0000 (13:44 -0700)]
Merge pull request #1127 from ceph/wip-default-os-version

Update default OS versions

7 years ago Update default OS versions 1127/head
Zack Cerza [Mon, 20 Nov 2017 18:54:59 +0000 (10:54 -0800)]
Update default OS versions

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Merge pull request #1124 from tchaikov/wip-more-mon-mgr-mkfs-grace
Kefu Chai [Tue, 7 Nov 2017 11:57:38 +0000 (19:57 +0800)]
Merge pull request #1124 from tchaikov/wip-more-mon-mgr-mkfs-grace

ceph.conf: prolong "mon mgr mkfs grace" to 2 minutes

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@gmail.com>
7 years ago ceph.conf: prolong "mon mgr mkfs grace" to 2 minutes 1124/head
Kefu Chai [Tue, 7 Nov 2017 11:15:50 +0000 (19:15 +0800)]
ceph.conf: prolong "mon mgr mkfs grace" to 2 minutes

It might take longer than 1 minute to:
1. create a monmap
2. create the auth keyrings
3. prepare the osd devices and activate OSDs (fdisk,mkfs.xfs,ceph-osd --mkfs)
4. prepare the monitors (mkfs), and start them
5. start monitor

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years ago misc.wait_until_osds_up: prolong the timeout from 5 min to 9 min 1123/head
Kefu Chai [Fri, 3 Nov 2017 13:27:59 +0000 (21:27 +0800)]
misc.wait_until_osds_up: prolong the timeout from 5 min to 9 min

If a node hosts 6 OSDs, it takes longer to boot. This addresses
the failure of
/a/kchai-2017-11-03_05:56:44-rados-wip-jewel-backports-reloaded-distro-basic-mira/1806380.

Signed-off-by: Kefu Chai <kchai@redhat.com>
7 years ago Merge pull request #1121 from ceph/wip-cephmetrics
vasukulkarni [Thu, 26 Oct 2017 18:30:29 +0000 (11:30 -0700)]
Merge pull request #1121 from ceph/wip-cephmetrics

Add a cephmetrics task

7 years ago ceph_ansible: Support custom cluster names 1121/head
Zack Cerza [Tue, 24 Oct 2017 16:48:50 +0000 (10:48 -0600)]
ceph_ansible: Support custom cluster names

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Add cephmetrics task
Zack Cerza [Wed, 18 Oct 2017 20:31:06 +0000 (14:31 -0600)]
Add cephmetrics task

Deploys, then runs integration tests via tox.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago ansible: Optional suffix for inventory filenames
Zack Cerza [Thu, 19 Oct 2017 18:38:57 +0000 (12:38 -0600)]
ansible: Optional suffix for inventory filenames

So we can support e.g. YAML inventories.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago ceph_ansible: Don't collect logs on non-Ceph hosts
Zack Cerza [Thu, 19 Oct 2017 18:32:07 +0000 (12:32 -0600)]
ceph_ansible: Don't collect logs on non-Ceph hosts

If a job uses machines for something other than to install Ceph, avoid
attempting to collect Ceph logs - the job will fail.

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Upgrade ansible
Zack Cerza [Thu, 19 Oct 2017 17:32:31 +0000 (11:32 -0600)]
Upgrade ansible

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Upgrade pip-tools
Zack Cerza [Thu, 19 Oct 2017 17:31:17 +0000 (11:31 -0600)]
Upgrade pip-tools

Signed-off-by: Zack Cerza <zack@redhat.com>
7 years ago Merge pull request #1120 from ceph/wip-ansible-purge-fixc
Zack Cerza [Fri, 20 Oct 2017 18:05:29 +0000 (12:05 -0600)]
Merge pull request #1120 from ceph/wip-ansible-purge-fixc

update ansible package from ansible PPA

7 years ago remove dependency packages the task added from testnode 1120/head
Vasu Kulkarni [Thu, 19 Oct 2017 17:34:19 +0000 (10:34 -0700)]
remove dependency packages the task added from testnode

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
7 years ago update ansible package from ansible PPA
Vasu Kulkarni [Wed, 18 Oct 2017 22:09:20 +0000 (15:09 -0700)]
update ansible package from ansible PPA

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
7 years ago Merge pull request #1118 from ceph/wip-21737-shutdown-assert
Gregory Farnum [Wed, 18 Oct 2017 20:24:11 +0000 (13:24 -0700)]
Merge pull request #1118 from ceph/wip-21737-shutdown-assert

ceph.conf: enable debug asserts on shutdown

Reviewed-by: Sage Weil <sage@redhat.com>
7 years ago Merge pull request #934 from ceph/wip-daemon-helper-systemd
vasukulkarni [Wed, 18 Oct 2017 19:59:54 +0000 (12:59 -0700)]
Merge pull request #934 from ceph/wip-daemon-helper-systemd

Changes for orchestra.daemon to work with systemd

7 years ago DaemonGroup: Avoid using systemd by default 934/head
Zack Cerza [Thu, 12 Oct 2017 19:16:01 +0000 (13:16 -0600)]
DaemonGroup: Avoid using systemd by default

To use systemd, any tasks which create a DaemonGroup object must pass
use_systemd=True. This could be accomplished by looking for a flag in
the job config. Note that this option is mainly for regression testing;
it may be removed in the future, once the default is to use systemd
when possible.

Signed-off-by: Zack Cerza <zack@redhat.com>
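
The opt-in behavior described above might look like this from a task's point of view. The class below is a stand-in and not the real DaemonGroup; the job-config key `use_systemd` is a hypothetical flag name chosen for the example.

```python
class DaemonGroupSketch:
    """Stand-in for a daemon group that only uses systemd when asked to."""

    def __init__(self, use_systemd=False):
        # Off by default: callers must opt in explicitly.
        self.use_systemd = use_systemd


def make_daemon_group(job_config):
    """Build a group, opting in to systemd only if the job config says so.

    The "use_systemd" key is an assumed flag name, not a documented option.
    """
    return DaemonGroupSketch(
        use_systemd=bool(job_config.get("use_systemd", False))
    )
```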
7 years ago Merge pull request #1119 from ceph/wip-dbrst
David Galloway [Mon, 9 Oct 2017 21:49:12 +0000 (17:49 -0400)]
Merge pull request #1119 from ceph/wip-dbrst

downburst: Explicitly type our user-data

7 years ago downburst: Explicitly type our user-data 1119/head
Zack Cerza [Mon, 9 Oct 2017 21:14:40 +0000 (15:14 -0600)]
downburst: Explicitly type our user-data

Something broke for us in cloud-init 0.7.9. Turns out our user-data was
being interpreted as an octet-stream, as opposed to a cloud-config. Mark
it as a cloud-config, so that cloud-init will do the right thing once
more.

Signed-off-by: Zack Cerza <zack@redhat.com>
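
One way cloud-init recognizes user-data as a cloud-config is a first line of `#cloud-config`. The helper below sketches that marking step under the assumption that user-data is handled as a plain string; it is not the change made in downburst, which may have used a different mechanism (for example an explicit MIME type).

```python
CLOUD_CONFIG_HEADER = "#cloud-config"


def as_cloud_config(user_data):
    """Return user_data with a #cloud-config first line, adding it if absent."""
    if user_data.startswith(CLOUD_CONFIG_HEADER):
        return user_data
    return CLOUD_CONFIG_HEADER + "\n" + user_data
```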
7 years ago ceph.conf: enable debug asserts on shutdown 1118/head
Greg Farnum [Mon, 9 Oct 2017 21:17:44 +0000 (14:17 -0700)]
ceph.conf: enable debug asserts on shutdown

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
8 years ago Merge pull request #1117 from ceph/wip-umount-varlib
Zack Cerza [Thu, 5 Oct 2017 18:08:04 +0000 (12:08 -0600)]
Merge pull request #1117 from ceph/wip-umount-varlib

More work to allow mounting /var/lib/ceph to a testnode device

8 years ago internal: Be okay with an empty /var/lib/ceph dir 1117/head
David Galloway [Thu, 5 Oct 2017 17:36:46 +0000 (13:36 -0400)]
internal: Be okay with an empty /var/lib/ceph dir

Signed-off-by: David Galloway <dgallowa@redhat.com>
8 years ago install: Always try to unmount /var/lib/ceph
David Galloway [Thu, 5 Oct 2017 17:29:33 +0000 (13:29 -0400)]
install: Always try to unmount /var/lib/ceph

Signed-off-by: David Galloway <dgallowa@redhat.com>
8 years ago Merge pull request #1116 from ceph/wip-var-lib-ceph
Zack Cerza [Thu, 5 Oct 2017 15:54:38 +0000 (09:54 -0600)]
Merge pull request #1116 from ceph/wip-var-lib-ceph

install: Unmount /var/lib/ceph itself if needed

8 years ago install: Unmount /var/lib/ceph itself if needed 1116/head
David Galloway [Wed, 4 Oct 2017 19:44:43 +0000 (15:44 -0400)]
install: Unmount /var/lib/ceph itself if needed

We plan to mount a small logical volume to /var/lib/ceph.

Fixes: http://tracker.ceph.com/issues/20910
Signed-off-by: David Galloway <dgallowa@redhat.com>
8 years ago Merge pull request #1113 from ceph/wip-debug-shutdown
Sage Weil [Fri, 22 Sep 2017 21:05:38 +0000 (16:05 -0500)]
Merge pull request #1113 from ceph/wip-debug-shutdown

ceph.conf: osd_debug_shutdown=true

8 years ago Merge pull request #1110 from ceph/wip-ceph-ansible-rgw-interface
Zack Cerza [Fri, 22 Sep 2017 20:59:22 +0000 (14:59 -0600)]
Merge pull request #1110 from ceph/wip-ceph-ansible-rgw-interface

[task/ceph-ansible]: Add required radosgw_interface due to ceph-ansible upstream changes

8 years ago Merge pull request #1112 from ceph/wip-set-ansible-lib
vasukulkarni [Thu, 21 Sep 2017 22:12:23 +0000 (15:12 -0700)]
Merge pull request #1112 from ceph/wip-set-ansible-lib

task/ceph-ansible: copy purge-cluster to top-level directory for it to work

8 years ago copy purge-cluster to top level dir for the playbook to work 1112/head
Vasu Kulkarni [Tue, 19 Sep 2017 20:54:52 +0000 (13:54 -0700)]
copy purge-cluster to top level dir for the playbook to work

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
8 years ago ceph.conf: osd_debug_shutdown=true 1113/head
Sage Weil [Thu, 21 Sep 2017 19:21:09 +0000 (15:21 -0400)]
ceph.conf: osd_debug_shutdown=true

The extra logging during osd shutdown is becoming optional; enable it in
qa.

Signed-off-by: Sage Weil <sage@redhat.com>
8 years ago SystemDState: Use correct output_cmd for rgw
Zack Cerza [Wed, 3 May 2017 21:33:56 +0000 (15:33 -0600)]
SystemDState: Use correct output_cmd for rgw

We had this right for systemctl commands, but not for the journalctl
command.

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonGroup.add_daemon(): Properly start daemons
Zack Cerza [Tue, 2 May 2017 23:59:58 +0000 (17:59 -0600)]
DaemonGroup.add_daemon(): Properly start daemons

For some reason we weren't calling restart() inside add_daemon() for
systemd.

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago SystemDState: Tweak logging
Zack Cerza [Tue, 2 May 2017 18:55:32 +0000 (12:55 -0600)]
SystemDState: Tweak logging

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonGroup: Don't use systemd with valgrind
Zack Cerza [Tue, 2 May 2017 17:24:15 +0000 (11:24 -0600)]
DaemonGroup: Don't use systemd with valgrind

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonGroup: Detect and use systemd
Zack Cerza [Tue, 2 May 2017 17:14:59 +0000 (11:14 -0600)]
DaemonGroup: Detect and use systemd

Don't rely on a configuration item to tell us whether or not to use
systemd.

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Remote: add init_system property
Zack Cerza [Tue, 2 May 2017 17:09:40 +0000 (11:09 -0600)]
Remote: add init_system property

So far, this only exists to answer the question: "does this host use
systemd?" - in which case it will return 'systemd'. Else it will return
None.

Signed-off-by: Zack Cerza <zack@redhat.com>
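
A property with this contract ('systemd' or None) could be sketched as follows. The detection method is an assumption: it checks for the `/run/systemd/system` directory on the local machine, which is the usual marker of a systemd-booted host, whereas the real Remote object would have to inspect a remote host. The class name and the injectable `is_dir` parameter are hypothetical.

```python
import os


class HostSketch:
    """Local stand-in for a remote host object with an init_system property."""

    def __init__(self, is_dir=os.path.isdir):
        # is_dir is injectable so the check can be exercised without systemd.
        self._is_dir = is_dir

    @property
    def init_system(self):
        """Return 'systemd' if the host appears to run systemd, else None."""
        if self._is_dir("/run/systemd/system"):
            return "systemd"
        return None
```

Returning None rather than raising keeps callers simple: `if host.init_system == 'systemd'` works on every host.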
8 years ago SystemDState: Treat PIDs as strings when killing
Zack Cerza [Tue, 2 May 2017 15:18:58 +0000 (09:18 -0600)]
SystemDState: Treat PIDs as strings when killing

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Split DaemonState in two
Zack Cerza [Thu, 27 Apr 2017 21:46:33 +0000 (15:46 -0600)]
Split DaemonState in two

Put systemd stuff in SystemDState

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago orchestra/daemon: Convert to a subpackage
Zack Cerza [Thu, 27 Apr 2017 21:01:08 +0000 (15:01 -0600)]
orchestra/daemon: Convert to a subpackage

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonState: Raise on startup failure
Zack Cerza [Tue, 25 Apr 2017 20:11:54 +0000 (14:11 -0600)]
DaemonState: Raise on startup failure

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonState: Use consistent daemon IDs
Zack Cerza [Wed, 26 Apr 2017 17:33:50 +0000 (11:33 -0600)]
DaemonState: Use consistent daemon IDs

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonState.check_status(): support systemd
Zack Cerza [Tue, 25 Apr 2017 20:02:33 +0000 (14:02 -0600)]
DaemonState.check_status(): support systemd

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonState: Drop use of self.id
Zack Cerza [Tue, 18 Apr 2017 20:44:46 +0000 (14:44 -0600)]
DaemonState: Drop use of self.id

... we already had self.id_

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonGroup: Simplify setting self.use_init
Zack Cerza [Thu, 13 Apr 2017 22:11:28 +0000 (16:11 -0600)]
DaemonGroup: Simplify setting self.use_init

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago orchestra/daemon: Tweak logging
Zack Cerza [Thu, 13 Apr 2017 17:54:05 +0000 (11:54 -0600)]
orchestra/daemon: Tweak logging

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonState: Factor out PID discovery
Zack Cerza [Thu, 13 Apr 2017 17:43:48 +0000 (11:43 -0600)]
DaemonState: Factor out PID discovery

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonState: Don't needlessly reset self.id
Zack Cerza [Thu, 13 Apr 2017 17:35:01 +0000 (11:35 -0600)]
DaemonState: Don't needlessly reset self.id

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago orchestra/daemon: Fix linter errors
Zack Cerza [Thu, 13 Apr 2017 17:28:19 +0000 (11:28 -0600)]
orchestra/daemon: Fix linter errors

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago DaemonState: factor out command generation
Zack Cerza [Thu, 13 Apr 2017 17:22:21 +0000 (11:22 -0600)]
DaemonState: factor out command generation

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Changes for daemon-helper to work with systemd
Vasu Kulkarni [Tue, 16 Aug 2016 21:45:04 +0000 (14:45 -0700)]
Changes for daemon-helper to work with systemd

When the cluster is set up using ceph-ansible or ceph-deploy,
use systemd commands to kill/revive daemons.

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
8 years ago Add required radosgw_interface due to ceph-ansible upstream changes 1110/head
Vasu Kulkarni [Fri, 15 Sep 2017 17:53:01 +0000 (10:53 -0700)]
Add required radosgw_interface due to ceph-ansible upstream changes

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
8 years ago Merge pull request #1111 from ceph/wip-pg-limit
Sage Weil [Tue, 19 Sep 2017 12:11:58 +0000 (07:11 -0500)]
Merge pull request #1111 from ceph/wip-pg-limit

ceph.conf: set new pg max per osd limit

8 years ago ceph.conf: set new pg max per osd limit 1111/head
Sage Weil [Sun, 17 Sep 2017 21:37:44 +0000 (16:37 -0500)]
ceph.conf: set new pg max per osd limit

This is replacing the old one, but we need to keep the old one for
upgrades etc.

Signed-off-by: Sage Weil <sage@redhat.com>
8 years ago Merge pull request #1091 from ceph/wip-fix-cephlab-overrides
Zack Cerza [Wed, 13 Sep 2017 21:47:42 +0000 (15:47 -0600)]
Merge pull request #1091 from ceph/wip-fix-cephlab-overrides

[ansible-doc-fix]: correct the doc to fix the overrides name

8 years ago Merge pull request #1109 from ceph/wip-openstack-log
Zack Cerza [Tue, 12 Sep 2017 21:24:56 +0000 (15:24 -0600)]
Merge pull request #1109 from ceph/wip-openstack-log

task/internal/syslog: blacklist openstack datasource log

8 years ago task/internal/syslog: blacklist openstack datasource log 1109/head
David Galloway [Tue, 12 Sep 2017 20:47:48 +0000 (16:47 -0400)]
task/internal/syslog: blacklist openstack datasource log

It caused OVH node jobs to fail when they shouldn't have.

Signed-off-by: David Galloway <dgallowa@redhat.com>
8 years ago Merge pull request #1107 from ceph/wip-prune-fail
David Galloway [Fri, 1 Sep 2017 19:42:49 +0000 (15:42 -0400)]
Merge pull request #1107 from ceph/wip-prune-fail

prune: Optionally remove failed jobs entirely

8 years ago prune: Use summary.yaml to determine age of job 1107/head
Zack Cerza [Fri, 1 Sep 2017 17:48:58 +0000 (11:48 -0600)]
prune: Use summary.yaml to determine age of job

If a job has been partially pruned, its directory's mtime will get
bumped, causing total removal to be delayed by potentially months.
Looking at the mtime of summary.yaml should provide results that are
more intuitive.

Signed-off-by: Zack Cerza <zack@redhat.com>
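
The age calculation described above can be sketched with the standard library. The function name `job_age_days` and the fallback to the directory's own mtime when summary.yaml is missing are assumptions made for this example, not details taken from the prune code.

```python
import os
import time


def job_age_days(job_dir, now=None):
    """Return a job's age in days, based on the mtime of its summary.yaml.

    Falls back to the directory mtime if summary.yaml does not exist
    (an assumption made for this sketch).
    """
    now = time.time() if now is None else now
    summary = os.path.join(job_dir, "summary.yaml")
    path = summary if os.path.exists(summary) else job_dir
    return (now - os.path.getmtime(path)) / 86400.0
```

Because deleting files inside a directory bumps the directory's mtime but leaves summary.yaml untouched, a partially pruned job keeps its original age under this scheme.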
8 years ago prune: Optionally remove failed jobs entirely
Zack Cerza [Thu, 31 Aug 2017 16:46:26 +0000 (10:46 -0600)]
prune: Optionally remove failed jobs entirely

This works exactly like --pass, except it defaults to being disabled

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Merge pull request #1101 from ceph/wip-ansible-fixes
Zack Cerza [Wed, 30 Aug 2017 19:38:17 +0000 (13:38 -0600)]
Merge pull request #1101 from ceph/wip-ansible-fixes

[tasks/ceph-ansible]: use site.yml from the ceph-ansible repo