Sébastien Han [Tue, 4 Apr 2017 08:33:22 +0000 (10:33 +0200)]
osd: autodiscovery mode, use holders to detect device
As reported in
https://github.com/ceph/ceph-ansible/issues/1403, when devices are held
by lvm and `osd_auto_discovery` is set to true, it's not enough to check
for a partition count of 0, since Ansible reports no partitions for such devices.
This patch also looks at 'holders', which in the case of lvm corresponds
to the name of the pv. A device is now picked up only if its holder count is also 0.
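
The check described above can be sketched in Python as follows. This is only an illustration, not the playbook code: it assumes a dictionary shaped like Ansible's `ansible_devices` fact, where each device has a `partitions` mapping and a `holders` list, and the function name and sample data are hypothetical.

```python
def discoverable_devices(ansible_devices):
    """Return the names of devices that have no partitions and no holders."""
    return sorted(
        name
        for name, facts in ansible_devices.items()
        if len(facts.get("partitions", {})) == 0
        and len(facts.get("holders", [])) == 0
    )


# Hypothetical facts: sda is partitioned, sdb is held by lvm, sdc is unused.
facts = {
    "sda": {"partitions": {"sda1": {}}, "holders": []},
    "sdb": {"partitions": {}, "holders": ["vg0-lv0"]},
    "sdc": {"partitions": {}, "holders": []},
}
print(discoverable_devices(facts))
```

With the partition check alone, `sdb` would also be selected even though lvm holds it.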
Fixes: #1403
Signed-off-by: Sébastien Han <seb@redhat.com>
Sébastien Han [Thu, 30 Mar 2017 09:51:38 +0000 (11:51 +0200)]
playbook: homogenize the way we list osd ids
Problem: too many different commands to do the same thing. The 'cut'
command in infrastructure-playbooks/purge-cluster.yml was also wrong.
This sed command from osixia in ceph-docker
https://github.com/ceph/ceph-docker/pull/580/ addresses all the
scenarios.
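
The sed command itself is not reproduced here. As a rough Python illustration of the same idea, the sketch below assumes OSD data directories are named `<cluster>-<id>` and takes the id from the text after the last hyphen, so that cluster names containing hyphens are handled too. The function name and the sample names are hypothetical.

```python
def osd_ids(dir_names):
    """Extract numeric OSD ids from directory names like 'ceph-0'."""
    ids = []
    for name in dir_names:
        # Split on the LAST hyphen so 'my-cluster-3' still yields '3'.
        _, sep, suffix = name.rpartition("-")
        if sep and suffix.isdigit():
            ids.append(int(suffix))
    return sorted(ids)


print(osd_ids(["ceph-0", "ceph-12", "my-cluster-3", "lost+found"]))
```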
Common: Do not install ntp when ntp_service_enabled is false
ntp is still installed even if ntp_service_enabled is set to false.
That could be a problem if time synchronization is managed by
something other than ceph-ansible or if you want to use a different NTP
implementation, as suggested in #1354.
If a group of hosts is empty (for instance 'mdss', in the case of a
deployment without any mds node), the playbook will fail when trying
to restart the service, with a `"'dict object' has no attribute u'XXX'"` error.
The idea here is to force the `with_items` statements in all included handler tasks
to get at least an empty array.
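
The same failure mode and fix can be shown with a plain Python dictionary. This is an analogy only: the `groups` data and the helper name are hypothetical. In a playbook, one way to get the same effect is the Jinja2 `default([])` filter; whether the patch uses exactly that is not stated here.

```python
def hosts_in_group(groups, name):
    """Return the hosts of a group, or an empty list if the group is missing."""
    return groups.get(name, [])


# Hypothetical inventory with no 'mdss' group at all.
groups = {"mons": ["mon0"], "osds": ["osd0", "osd1"]}

# groups["mdss"] would raise KeyError; the fallback lets the loop run zero times.
restarts = ["restart mds on " + host for host in hosts_in_group(groups, "mdss")]
print(restarts)
```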
Concubidated [Fri, 24 Mar 2017 19:52:37 +0000 (12:52 -0700)]
ceph-common: update sysctl file location
sysctl tuning should be in the sysctl.d directory. This creates
a separation between the values set specifically for ceph and the
values set by the operator.
Andrew Schoen [Wed, 22 Mar 2017 21:44:29 +0000 (16:44 -0500)]
tests: set MTU to 1400 on test node interfaces
In the environment we were testing on, MTU was set to 1500, which caused
download failures of our yum repos. There might be a better way to set
this instead of doing it here in ansible.
Andrew Schoen [Wed, 22 Mar 2017 13:49:49 +0000 (08:49 -0500)]
tests: adds a 'rhcs-' prefix to the testing scenarios matrix
This allows for us to have a copy of the existing testing scenarios with
a 'rhcs-' prefix. We can use that in the tox.ini to take actions we need
to properly test Red Hat Ceph Storage.
Daniel Marks [Thu, 16 Mar 2017 22:16:30 +0000 (23:16 +0100)]
Use ansible uri module instead of shell module with curl
This fixes issue #1299. According to @ktdreyer's comment in the ticket,
he fixed the web server config so that older (non-SNI) python clients
can also use the uri module here.
Andrew Schoen [Thu, 16 Mar 2017 21:31:25 +0000 (16:31 -0500)]
ceph-mon: always call ceph-create-keys
After the jewel release the mon startup does not generate keys, but it's
still harmless to call ceph-create-keys with jewel because this task has
a 'creates' argument that will cause it not to run if the keys already
exist.
Removing this when condition also allows the downstream CI tests to
install kraken or luminous without resetting ceph_stable_release, which does not
pertain to rhcs.
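
Why the task is harmless to run every time can be sketched as below. This is a minimal Python imitation of a 'creates' guard, not Ansible's implementation. The helper name and the keyring file name are hypothetical, and a temporary directory stands in for the real key location.

```python
import os
import tempfile


def run_unless_exists(creates, action):
    """Call action() only if the path `creates` does not exist yet."""
    if os.path.exists(creates):
        return False  # skipped: the keys are already there
    action()
    return True


with tempfile.TemporaryDirectory() as tmp:
    keyring = os.path.join(tmp, "example.keyring")  # hypothetical path

    def create_keys():
        # Stand-in for calling ceph-create-keys.
        open(keyring, "w").close()

    first = run_unless_exists(keyring, create_keys)
    second = run_unless_exists(keyring, create_keys)

print(first, second)
```

The first call creates the file and the second call is skipped, which is why an unconditional task stays idempotent.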
Andrew Schoen [Thu, 16 Mar 2017 11:16:09 +0000 (06:16 -0500)]
tests: convert extra-vars to use json
This will prevent ansible from misreading any of these values. There
were failures with xenial deployments because the value set for
``ceph_rhcs`` was being treated as a boolean True even though I'd set
the value to false. This is because boolean values passed in with
--extra-vars must use the json format.
The formatting of the json is very important as you need a '\' to escape
the starting and ending json to make tox happy. Also, each line needs to
end with '\' if it's a multi-line command.
Another thing to note is that if you want to use extra vars at the
command line to respond to a vars_prompt it must be in key/value format.
This is why we have a -e and a --extra-vars on the purge and update
tests.
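
One way to avoid hand-writing the json is to generate it. The sketch below is an assumption about workflow, not what the tox.ini does: it builds the `--extra-vars` argument with the standard `json` and `shlex` modules. The helper name is hypothetical, and tox's own '\' escaping is not handled.

```python
import json
import shlex


def extra_vars_arg(variables):
    """Build a shell-safe --extra-vars argument carrying json-typed values."""
    return "--extra-vars " + shlex.quote(json.dumps(variables))


arg = extra_vars_arg({"ceph_rhcs": False, "ceph_stable_release": "jewel"})
print(arg)

# Round trip: the boolean survives as a real false, not as the string "false".
parsed = json.loads(shlex.split(arg)[1])
print(parsed["ceph_rhcs"] is False)
```

The pitfall has a direct Python analogue: `bool("false")` is `True`, because any non-empty string is truthy.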
Christian Zunker [Wed, 15 Mar 2017 12:32:30 +0000 (13:32 +0100)]
Make ceph-common aware of osd config fragments
This removes the implicit order requirement when using OSD fragments.
When you use OSD fragments and the ceph-osd role is not the last one,
the fragments get removed from ceph.conf by ceph-common.
It is not nice to have this code in two locations, but it is
necessary to prevent problems when ceph-osd is the last role, as
ceph-common gets executed before ceph-osd.
This could be avoided if ceph-common were explicitly called
at the end of the playbook.
Signed-off-by: Christian Zunker <christian.zunker@codecentric.de>
Andrew Schoen [Wed, 15 Mar 2017 20:08:39 +0000 (15:08 -0500)]
tests: adds the ability to set the ceph_stable_release value
Use CEPH_STABLE_RELEASE to set the name of the ceph release you plan to
install. When testing an upgrade scenario you'll also need to set
UPGRADE_CEPH_STABLE_RELEASE.
Andrew Schoen [Wed, 15 Mar 2017 20:01:32 +0000 (15:01 -0500)]
tests: add the ability to run tests with shaman repos
To run tests that deploy shaman repos, set CEPH_DEV=true and optionally
use CEPH_DEV_BRANCH and CEPH_DEV_SHA1 to define which branch and sha1 to
test. CEPH_DEV_BRANCH defaults to master and CEPH_DEV_SHA1 defaults to
latest.
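
The defaulting rules for these variables can be sketched in Python. This only illustrates the behaviour described above and is not how the test suite reads them. The function name and the returned keys are hypothetical.

```python
import os


def dev_repo_settings(env=os.environ):
    """Return branch/sha1 for shaman repos, or None when CEPH_DEV is not true."""
    if env.get("CEPH_DEV", "").lower() != "true":
        return None
    return {
        "branch": env.get("CEPH_DEV_BRANCH", "master"),  # default: master
        "sha1": env.get("CEPH_DEV_SHA1", "latest"),      # default: latest
    }


print(dev_repo_settings({"CEPH_DEV": "true"}))
print(dev_repo_settings({"CEPH_DEV": "true", "CEPH_DEV_BRANCH": "kraken"}))
```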
For example, this would run the journal_collocation test with the latest
build of the master branch:
Ken Dreyer [Mon, 13 Mar 2017 15:34:35 +0000 (09:34 -0600)]
ceph-common: install nfs-ganesha FSALs on Debian
Prior to this change, ceph-ansible would install the main NFS Ganesha
server daemon on Ubuntu, but it would skip the Ceph FSALs.
Running "apt-get install nfs-ganesha" will only install the main NFS Ganesha
server. It does *not* pull in the RGW FSAL
(/usr/lib/x86_64-linux-gnu/ganesha/libfsalrgw.so).
Running "apt-get install nfs-ganesha-fsal" will install the RGW FSAL as
well as the main NFS Ganesha server package.