git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
ceph-ansible.git
6 years ago cleanup repos's root
Sébastien Han [Tue, 30 Oct 2018 10:28:23 +0000 (11:28 +0100)]
cleanup repos's root

Remove old files and move scripts to the contrib directory.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago ceph-volume: fix TypeError exception when setting osds-per-device > 1
Maciej Naruszewicz [Fri, 19 Oct 2018 20:40:36 +0000 (22:40 +0200)]
ceph-volume: fix TypeError exception when setting osds-per-device > 1

osds-per-device needs to be passed to run_command as a string.
Otherwise, expandvars method will try to iterate over an integer.

Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
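
The failure described in the ceph-volume fix above can be reproduced outside Ansible. This is a minimal sketch, assuming (as the commit message says) that the command runner applies an expandvars-style expansion to every argument. `build_batch_args` and `expand_all` are hypothetical helpers written for illustration, not the module's code.

```python
import os


def build_batch_args(devices, osds_per_device):
    """Build an argument list in which every element is a string."""
    cmd = ["ceph-volume", "lvm", "batch"]
    if osds_per_device > 1:
        # Convert the integer explicitly: the runner expands variables in
        # each argument and cannot do that on an int.
        cmd.extend(["--osds-per-device", str(osds_per_device)])
    cmd.extend(devices)
    return cmd


def expand_all(args):
    """Mimic a runner that expands environment variables in every argument."""
    return [os.path.expandvars(arg) for arg in args]
```

Passing the raw integer to `expand_all` raises a TypeError, which is the symptom the commit title names; passing `str(osds_per_device)` avoids it.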
6 years ago testinfra: change test osds for containers
Sébastien Han [Mon, 29 Oct 2018 15:24:45 +0000 (16:24 +0100)]
testinfra: change test osds for containers

We no longer use @<device>, so we don't need to perform the
readlink check anymore.

Also we are making an exception for ooo which is still using ceph-disk.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago ceph_volume: add container support for batch
Sébastien Han [Fri, 26 Oct 2018 14:30:32 +0000 (16:30 +0200)]
ceph_volume: add container support for batch

https://tracker.ceph.com/issues/36363 has been resolved and the patch
has been backported to luminous and mimic so let's enable the container
support.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1541415
Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago test_osd: dynamically get the osd container
Sébastien Han [Mon, 29 Oct 2018 11:00:40 +0000 (12:00 +0100)]
test_osd: dynamically get the osd container

Do not enforce the container name since this will fail when we have
multiple VMs running OSDs.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago test: convert all the tests to use lvm
Sébastien Han [Wed, 10 Oct 2018 19:29:56 +0000 (15:29 -0400)]
test: convert all the tests to use lvm

ceph-disk is now deprecated in ceph-ansible so let's convert all the ci
tests to use lvm instead of ceph-disk.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago tox: change container image to use master
Sébastien Han [Thu, 25 Oct 2018 14:15:36 +0000 (16:15 +0200)]
tox: change container image to use master

We have a latest-master image which contains builds from upstream ceph
so let's use it to verify build.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago test: remove ceph-disk CI tests
Sébastien Han [Wed, 10 Oct 2018 18:55:20 +0000 (14:55 -0400)]
test: remove ceph-disk CI tests

Since we are removing the ceph-disk test from the CI in master, there
is no need to keep the functional tests in master anymore.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago roles: fix *_docker_memory_limit default value
Guillaume Abrioux [Mon, 29 Oct 2018 10:46:46 +0000 (11:46 +0100)]
roles: fix *_docker_memory_limit default value

Append the 'm' suffix to specify the unit used in all
`*_docker_memory_limit` values.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago roles: do not limit docker_memory_limit for various daemons
Neha Ojha [Thu, 25 Oct 2018 17:45:00 +0000 (17:45 +0000)]
roles: do not limit docker_memory_limit for various daemons

Since we do not have enough data to put valid upper bounds for the memory
usage of these daemons, do not put artificial limits by default. This will
help us avoid failures like OOM kills due to low default values.

Whenever required, these limits can be manually enforced by the user.

More details in
https://bugzilla.redhat.com/show_bug.cgi?id=1638148

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1638148
Signed-off-by: Neha Ojha <nojha@redhat.com>
6 years ago Merge branch 'jcsp-wip-rm-calamari'
Sébastien Han [Mon, 29 Oct 2018 13:53:47 +0000 (14:53 +0100)]
Merge branch 'jcsp-wip-rm-calamari'

6 years ago Merge branch 'master' into wip-rm-calamari 3147/head
Sébastien Han [Mon, 29 Oct 2018 13:50:37 +0000 (14:50 +0100)]
Merge branch 'master' into wip-rm-calamari

6 years ago infrastructure playbooks: ensure nvme_device is defined in lv-create.yml
Ali Maredia [Mon, 29 Oct 2018 06:01:25 +0000 (06:01 +0000)]
infrastructure playbooks: ensure nvme_device is defined in lv-create.yml

Signed-off-by: Ali Maredia <amaredia@redhat.com>
6 years ago nfs: do not create the nfs user if already present
Sébastien Han [Fri, 26 Oct 2018 13:27:33 +0000 (15:27 +0200)]
nfs: do not create the nfs user if already present

Check if the user exists and skip its creation if true.

Closes: https://github.com/ceph/ceph-ansible/issues/3254
Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago Fix problem with ceph_key in python3
Jairo Llopis [Thu, 4 Oct 2018 05:48:03 +0000 (07:48 +0200)]
Fix problem with ceph_key in python3

Pretty basic problem of iteritems removal.

Signed-off-by: Jairo Llopis <yajo.sk8@gmail.com>
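
The "iteritems removal" mentioned above refers to `dict.iteritems()`, which exists only on Python 2. A minimal sketch of the portable form follows; `caps_to_args` is a hypothetical helper and does not reproduce the ceph_key module's code.

```python
def caps_to_args(caps):
    """Flatten a {entity: capability} mapping into a flat argument list."""
    args = []
    # dict.items() works on Python 2 and 3; dict.iteritems() is gone in 3.
    for entity, cap in sorted(caps.items()):
        args.extend([entity, cap])
    return args
```

Sorting the items keeps the resulting argument order stable across runs, which is convenient when the list ends up in a command line.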
6 years ago ceph_volume: better error handling
Sébastien Han [Wed, 24 Oct 2018 14:55:52 +0000 (16:55 +0200)]
ceph_volume: better error handling

When loading the json, if invalid, we should fail with a meaningful
error.

Signed-off-by: Sébastien Han <seb@redhat.com>
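
One way to fail "with a meaningful error" when the JSON is invalid is to catch the decode error and re-raise it with the raw output attached. This is only a sketch of that idea; `parse_json_output` is a hypothetical name and the message text is an assumption, not what the module prints.

```python
import json


def parse_json_output(raw_output):
    """Return the parsed JSON document, or raise an error that shows the raw text."""
    try:
        return json.loads(raw_output)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        raise RuntimeError(
            "Could not decode JSON from command output: {}\n"
            "Raw output was: {!r}".format(exc, raw_output)
        )
```

Including the raw output in the message lets the operator see what the command actually printed instead of only a decoder traceback.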
6 years ago ceph_volume: expose ceph-volume logs on the host
Sébastien Han [Wed, 24 Oct 2018 14:53:12 +0000 (16:53 +0200)]
ceph_volume: expose ceph-volume logs on the host

This will tremendously help debugging failures while performing any
ceph-volume command in containers.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago resync group_vars/*.sample files
Guillaume Abrioux [Fri, 26 Oct 2018 07:46:29 +0000 (09:46 +0200)]
resync group_vars/*.sample files

ee2d52d33df2a311cdf0ff62abd353fccb3affbc missed this sync between
ceph-defaults/defaults/main.yml and group_vars/all.yml.sampl

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago tox: fix a typo
Guillaume Abrioux [Thu, 25 Oct 2018 12:42:54 +0000 (14:42 +0200)]
tox: fix a typo

the line setting `ANSIBLE_CONFIG` obviously contains a typo introduced
by 1e283bf69be8b9efbc1a7a873d91212ad57c7351

`ANSIBLE_CONFIG` has to point to a path only (path to an ansible.cfg)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago igw: stop daemons on purge all calls v3.2.0beta9
Mike Christie [Wed, 12 Sep 2018 20:37:44 +0000 (15:37 -0500)]
igw: stop daemons on purge all calls

When purging the entire igw config (lio and rbd), stop and disable the api
and gw daemons.

Fixes Red Hat BZ
https://bugzilla.redhat.com/show_bug.cgi?id=1621255

Signed-off-by: Mike Christie <mchristi@redhat.com>
6 years ago ceph-validate: avoid "list index out of range" error
Rishabh Dave [Tue, 9 Oct 2018 20:47:40 +0000 (02:17 +0530)]
ceph-validate: avoid "list index out of range" error

Be sure that error.path has more than one member before using them.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
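
The guard described above amounts to checking the length of the path before indexing into it. A minimal sketch, assuming the path is a plain list of members; `describe_error_path` and its output format are hypothetical and only illustrate the length check.

```python
def describe_error_path(path):
    """Build a readable location string from a list of path members."""
    if len(path) > 1:
        # Safe: both path[-2] and path[-1] exist.
        return "{} -> {}".format(path[-2], path[-1])
    if len(path) == 1:
        return str(path[0])
    # An empty path would otherwise trigger "list index out of range".
    return "<unknown location>"
```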
6 years ago ceph-infra: reload firewall after rules are added v3.2.0beta8
Guillaume Abrioux [Tue, 23 Oct 2018 07:49:50 +0000 (09:49 +0200)]
ceph-infra: reload firewall after rules are added

We ensure that firewalld is installed and running before adding any
rule. It no longer makes sense not to reload firewalld once the rules
are added.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago allow custom pool size
Rishabh Dave [Mon, 1 Oct 2018 15:11:13 +0000 (11:11 -0400)]
allow custom pool size

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1596339
Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years ago tests: remove unnecessary variables definition v3.2.0beta7
Guillaume Abrioux [Fri, 19 Oct 2018 11:19:59 +0000 (13:19 +0200)]
tests: remove unnecessary variables definition

since we set `configure_firewall: true` in
`ceph-defaults/defaults/main.yml` there is no need to explicitly set it
in `centos7_cluster` and `docker_cluster` testing scenarios.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago defaults: set default `configure_firewall` to `True`
Guillaume Abrioux [Fri, 19 Oct 2018 11:16:23 +0000 (13:16 +0200)]
defaults: set default `configure_firewall` to `True`

Let's configure firewalld by default.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1526400
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago rolling_update: fix upgrade when using fqdn
Sébastien Han [Thu, 9 Aug 2018 09:32:53 +0000 (11:32 +0200)]
rolling_update: fix upgrade when using fqdn

Clusters that were deployed using 'mon_use_fqdn' have a different unit
name, so during the upgrade this must be used otherwise the upgrade will
fail, looking for a unit that does not exist.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1597516
Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago validate: check the version of python-notario
Andrew Schoen [Tue, 16 Oct 2018 15:20:54 +0000 (10:20 -0500)]
validate: check the version of python-notario

If the version of python-notario is < 0.0.13 an error message is given
like "TypeError: validate() got an unexpected keyword argument
'defined_keys'", which is not helpful in figuring
out you've got an incorrect version of python-notario.

This check will avoid that situation by telling the user that they need
to upgrade python-notario before they hit that error.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
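
A version gate like the one described above can be sketched as a tuple comparison. The 0.0.13 threshold comes from the commit message; `parse_version`, `check_notario_version`, and the error text are hypothetical and do not reproduce the actual check.

```python
MINIMUM_NOTARIO = (0, 0, 13)


def parse_version(version):
    """Turn a dotted version string such as '0.0.13' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def check_notario_version(installed):
    """Raise a clear error when the installed python-notario is too old."""
    if parse_version(installed) < MINIMUM_NOTARIO:
        raise RuntimeError(
            "python-notario >= 0.0.13 is required, found {}; "
            "please upgrade python-notario".format(installed)
        )
```

Comparing integer tuples rather than strings matters here: as strings, "0.0.9" sorts after "0.0.13", so a string comparison would wrongly accept the older version.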
6 years ago iscsi: fix networking issue on containerized env
Guillaume Abrioux [Thu, 18 Oct 2018 20:29:02 +0000 (22:29 +0200)]
iscsi: fix networking issue on containerized env

The iscsi-gw containers can't reach monitors without `--net=host`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago Revert "tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes"
Guillaume Abrioux [Thu, 18 Oct 2018 13:43:36 +0000 (15:43 +0200)]
Revert "tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes"

This approach doesn't work with all scenarios because it's comparing a
local OSD number expected to a global OSD number found in the whole
cluster.

This reverts commit b8ad35ceb99cdbd1644c79dd689b818f095ba8b8.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago tests: set configure_firewall: true in centos7|docker_cluster
Guillaume Abrioux [Thu, 18 Oct 2018 11:45:14 +0000 (13:45 +0200)]
tests: set configure_firewall: true in centos7|docker_cluster

This way the CI will cover this part of the code.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago infra: move restart fw handler in ceph-infra role
Guillaume Abrioux [Thu, 18 Oct 2018 11:41:49 +0000 (13:41 +0200)]
infra: move restart fw handler in ceph-infra role

Move the handler to restart firewall in ceph-infra role.

Closes: #3243
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes v3.2.0beta6
Guillaume Abrioux [Tue, 16 Oct 2018 14:25:12 +0000 (16:25 +0200)]
tests: test `test_all_docker_osds_are_up_and_in()` from mon nodes

Let's get the osd tree from mons instead of osds.
This way we don't have to predict an OSD container name.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago add-osds: followup on 3632b26
Guillaume Abrioux [Wed, 17 Oct 2018 11:57:09 +0000 (13:57 +0200)]
add-osds: followup on 3632b26

Three fixes:

- fix a typo in vagrant_variables that causes a networking issue for
the containerized scenario.
- add containerized_deployment: true
- remove a useless block of code: the fact docker_exec_cmd is set in
ceph-defaults, which is played right after.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago infra: add a gather-ceph-logs.yml playbook
Sébastien Han [Thu, 24 May 2018 17:47:29 +0000 (10:47 -0700)]
infra: add a gather-ceph-logs.yml playbook

Add a gather-ceph-logs.yml which will log onto all the machines from
your inventory and will gather ceph logs. This is not intended to work
on containerized environments since the logs are stored in journald.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1582280
Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago tests: add tests for day-2-operation playbook
Guillaume Abrioux [Tue, 16 Oct 2018 15:05:10 +0000 (17:05 +0200)]
tests: add tests for day-2-operation playbook

Adding testing scenarios for day-2-operation playbook.

Steps:
- deploys a cluster,
- run testinfra,
- test idempotency,
- add a new osd node,
- run testinfra

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago infra: rename osd-configure to add-osd and improve it
Sébastien Han [Thu, 27 Sep 2018 14:31:22 +0000 (16:31 +0200)]
infra: rename osd-configure to add-osd and improve it

The playbook has various improvements:

* run ceph-validate role before doing anything
* run ceph-fetch-keys only on the first monitor of the inventory list
* set noup flag so PGs get distributed once all the new OSDs have been
added to the cluster and unset it when they are up and running

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1624962
Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago ceph-fetch-keys: refact
Sébastien Han [Thu, 27 Sep 2018 14:29:22 +0000 (16:29 +0200)]
ceph-fetch-keys: refact

This commit simplifies the usage of the ceph-fetch-keys role. The role
now has a nicer way to find various ceph keys and fetch them on the
ansible server.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1624962
Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago Add ability to use a different client container
Andy McCrae [Fri, 5 Oct 2018 13:36:36 +0000 (14:36 +0100)]
Add ability to use a different client container

Currently a throw-away container is built to run ceph client
commands to setup users, pools & auth keys. This utilises
the same base ceph container which has all the ceph services
inside it.

This PR allows the use of a separate container if the deployer
wishes - but defaults to use the same full ceph container.

This can be used for different architectures or distributions,
which may support the Ceph client, but not Ceph server,
and allows the deployer to build and specify a separate client
container if need be.

Signed-off-by: Andy McCrae <andy.mccrae@gmail.com>
6 years ago infra: fix wrong condition on firewalld start task
Guillaume Abrioux [Tue, 16 Oct 2018 13:09:48 +0000 (15:09 +0200)]
infra: fix wrong condition on firewalld start task

a non skipped task won't have the `skipped` attribute, so `start
firewalld` task will complain about that.
Indeed, `skipped` and `rc` attributes won't exist since the first task
`check firewalld installation on redhat or suse` won't be skipped in
case of non-containerized deployment.

Fixes: #3236
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1541840
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago ceph-defaults: set ceph_stable_openstack_release_uca to queens
Christian Berendt [Thu, 11 Oct 2018 10:26:04 +0000 (12:26 +0200)]
ceph-defaults: set ceph_stable_openstack_release_uca to queens

Liberty is no longer available in the UCA. The last available release there
is currently Queens.

Signed-off-by: Christian Berendt <berendt@betacloud-solutions.de>
6 years ago contrib: add a bash script to snapshort libvirt vms
Guillaume Abrioux [Mon, 15 Oct 2018 21:42:16 +0000 (23:42 +0200)]
contrib: add a bash script to snapshort libvirt vms

This script is still 'work in progress' but could be used to make
snapshots of Libvirt VMs.
This can save some time when deploying again and again.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago handler: remove some leftover in restart_*_daemon.sh.j2
Guillaume Abrioux [Mon, 15 Oct 2018 13:32:17 +0000 (15:32 +0200)]
handler: remove some leftover in restart_*_daemon.sh.j2

Remove some legacy code from those restart scripts.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago doc: update default osd_objectstore value
Guillaume Abrioux [Mon, 15 Oct 2018 21:54:47 +0000 (23:54 +0200)]
doc: update default osd_objectstore value

since dc3319c3c4e2fb58cb1b5e6c60f165ed28260dc8 this should be reflected
in the doc.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago docker-ce is used in aarch64 instead of docker engine
Nan Li [Fri, 12 Oct 2018 03:26:04 +0000 (11:26 +0800)]
docker-ce is used in aarch64 instead of docker engine

Signed-off-by: Nan Li <herbert.nan@linaro.org>
6 years ago Mergify: fix regexp operator
Julien Danjou [Mon, 15 Oct 2018 13:17:26 +0000 (15:17 +0200)]
Mergify: fix regexp operator

6 years ago Update Mergify configuration to v2
Julien Danjou [Mon, 15 Oct 2018 12:30:33 +0000 (14:30 +0200)]
Update Mergify configuration to v2

Signed-off-by: Julien Danjou <julien@danjou.info>
6 years ago vagrantfile: remove disk path of OSD nodes
binhong.hua [Wed, 10 Oct 2018 15:24:30 +0000 (23:24 +0800)]
vagrantfile: remove disk path of OSD nodes

The OSD nodes' disks remain on the vagrant host after running "vagrant destroy",
because we use the time as part of the disk path, and the time at deletion does not equal the time at creation.

We already use random_hostname in the Libvirt backend, which creates the disk
with the hostname as part of the disk name, for example: vagrant_osd2_1539159988_065f15e3e1fa6ceb0770-hda.qcow2.

Signed-off-by: binhong.hua <binhong.hua@gmail.com>
6 years ago handler: fix osd containers handler
Guillaume Abrioux [Sat, 13 Oct 2018 08:42:18 +0000 (10:42 +0200)]
handler: fix osd containers handler

`ceph_osd_container_stat` might not be set on other osd node.
We must ensure we are on the last node before trying to evaluate
`ceph_osd_container_stat`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago remove jewel support
Guillaume Abrioux [Wed, 10 Oct 2018 19:24:22 +0000 (15:24 -0400)]
remove jewel support

As of now, we should no longer support Jewel in ceph-ansible.
The latest ceph-ansible release supporting Jewel is `stable-3.1`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago test: fix docker test for lvm
Sébastien Han [Fri, 12 Oct 2018 16:58:41 +0000 (18:58 +0200)]
test: fix docker test for lvm

The CI is still running ceph-disk tests upstream. So until
https://github.com/ceph/ceph-ansible/pull/3187 is merged nothing will
pass anymore.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago switch: allow switch big clusters (more than 99 osds) v3.2.0beta5
Sébastien Han [Wed, 26 Sep 2018 12:24:26 +0000 (14:24 +0200)]
switch: allow switch big clusters (more than 99 osds)

The current regex had a limitation of 99 OSDs, now this limit has been
removed and regardless of the number of OSDs they will all be collected.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1630430
Signed-off-by: Sébastien Han <seb@redhat.com>
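
The kind of limit described above typically comes from a bounded digit quantifier. The log does not show the playbook's real expression, so both patterns below, the unit-name shape, and `collect_osd_ids` are hypothetical; the sketch only shows how a two-digit bound silently drops larger ids.

```python
import re

# Hypothetical patterns; the playbook's real expression is not shown in the log.
LIMITED = re.compile(r"ceph-osd@([0-9]{1,2})\.service")  # at most two digits
UNBOUNDED = re.compile(r"ceph-osd@([0-9]+)\.service")    # any number of digits


def collect_osd_ids(pattern, unit_names):
    """Return the OSD ids captured by `pattern`, in input order."""
    ids = []
    for name in unit_names:
        match = pattern.fullmatch(name)
        if match:
            ids.append(int(match.group(1)))
    return ids
```

With the bounded pattern, a unit such as `ceph-osd@100.service` simply does not match, so the OSD is skipped without any error; the unbounded pattern collects it.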
6 years ago ceph_volume: refactor
Sébastien Han [Wed, 3 Oct 2018 17:52:42 +0000 (19:52 +0200)]
ceph_volume: refactor

This commit does a couple of things:

* Avoid code duplication
* Clarify the code
* add more unit tests
* add myself to the author of the module

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago tests: do not install lvm2 on atomic host
Guillaume Abrioux [Tue, 9 Oct 2018 20:45:05 +0000 (16:45 -0400)]
tests: do not install lvm2 on atomic host

we need to detect whether we are running on atomic host to not try to
install lvm2 package.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago ci: test lvm in containerized
Sébastien Han [Thu, 12 Jul 2018 18:30:59 +0000 (20:30 +0200)]
ci: test lvm in containerized

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago doc: improve osd configuration section
Sébastien Han [Wed, 3 Oct 2018 14:52:14 +0000 (16:52 +0200)]
doc: improve osd configuration section

Simply add that all the scenarios support the containerized deployment
option.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago osd: do not run when lvm scenario
Sébastien Han [Tue, 2 Oct 2018 16:35:52 +0000 (18:35 +0200)]
osd: do not run when lvm scenario

This task was created for ceph-disk based deployments so it's not needed
when OSDs are prepared with ceph-volume.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago handler: add support for ceph-volume containerized restart
Sébastien Han [Tue, 2 Oct 2018 16:10:19 +0000 (18:10 +0200)]
handler: add support for ceph-volume containerized restart

The restart script wasn't working with the recent addition of
ceph-volume in containers, where OSDs now have the OSD id in the
container name.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago ceph-handler: change osd container check
Sébastien Han [Tue, 2 Oct 2018 15:37:06 +0000 (17:37 +0200)]
ceph-handler: change osd container check

Now that the container is named ceph-osd@<id> looking for something that
contains a host is not necessary. This is also backward compatible as it
will continue to match container names with hostname in them.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago tests: osd adjust osd name
Sébastien Han [Mon, 1 Oct 2018 14:00:21 +0000 (16:00 +0200)]
tests: osd adjust osd name

Now we use id of the OSD instead of the device name.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago validate: add warning for ceph-disk
Sébastien Han [Mon, 1 Oct 2018 13:27:06 +0000 (15:27 +0200)]
validate: add warning for ceph-disk

ceph-disk will be removed in 3.3 and we encourage users to start using
ceph-volume as of 3.2.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago osd: ceph-volume activate, just pass the OSD_ID
Sébastien Han [Fri, 28 Sep 2018 16:07:08 +0000 (18:07 +0200)]
osd: ceph-volume activate, just pass the OSD_ID

We don't need to pass the device and discover the OSD ID. We have a
task that gathers all the OSD IDs present on that machine, so we simply
re-use them and activate them. This also handles the situation when you
have multiple OSDs running on the same device.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago osd: change unit template for ceph-volume container
Sébastien Han [Fri, 28 Sep 2018 16:05:42 +0000 (18:05 +0200)]
osd: change unit template for ceph-volume container

We don't need to pass the hostname on the container name but we can keep
it simple and just call it ceph-osd-$id.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago osd: do not use expose_partitions on lvm
Sébastien Han [Fri, 28 Sep 2018 15:19:46 +0000 (17:19 +0200)]
osd: do not use expose_partitions on lvm

expose_partitions is only needed on ceph-disk OSDs so we don't need to
activate this code when running lvm prepared OSDs.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago ceph_volume: add container support for batch command
Sébastien Han [Fri, 28 Sep 2018 11:06:18 +0000 (13:06 +0200)]
ceph_volume: add container support for batch command

The batch option got recently added, while rebasing this patch it was
necessary to implement it. So now, the batch option can work on
containerized environments.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1630977
Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago ceph_volume: try to get ride of the dummy container
Sébastien Han [Mon, 16 Jul 2018 16:09:33 +0000 (18:09 +0200)]
ceph_volume: try to get ride of the dummy container

If we run on a containerized deployment we pass an env variable which
contains the container image.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago ceph-osd: ceph-volume container support
Sébastien Han [Mon, 9 Jul 2018 14:58:35 +0000 (16:58 +0200)]
ceph-osd: ceph-volume container support

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago infra: fix a typo in filename v3.2.0beta4
Guillaume Abrioux [Wed, 10 Oct 2018 16:30:26 +0000 (12:30 -0400)]
infra: fix a typo in filename

configure_firewall is missing its dot.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago infra: add tags for each subcomponent
Guillaume Abrioux [Tue, 9 Oct 2018 18:02:04 +0000 (14:02 -0400)]
infra: add tags for each subcomponent

This way we can skip one specific component if needed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago infra: add firewall configuration for containerized deployment
Guillaume Abrioux [Tue, 9 Oct 2018 17:38:51 +0000 (13:38 -0400)]
infra: add firewall configuration for containerized deployment

firewalld is available on atomic so there is no reason to not apply
firewall configuration.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago infra: update firewall rules, add cluster_network for osds
Guillaume Abrioux [Tue, 9 Oct 2018 17:35:17 +0000 (13:35 -0400)]
infra: update firewall rules, add cluster_network for osds

At the moment, all daemons accept connections from 0.0.0.0.
We should at least restrict to public_network and add
cluster_network for OSDs.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1541840
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago ceph-infra: add new role ceph-infra
Guillaume Abrioux [Fri, 5 Oct 2018 13:42:52 +0000 (15:42 +0200)]
ceph-infra: add new role ceph-infra

this role manages ceph infra services such as ntp, firewall, ...

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago Stringify ceph_docker_image_tag
Noah Watkins [Fri, 5 Oct 2018 22:56:45 +0000 (15:56 -0700)]
Stringify ceph_docker_image_tag

This could be a numeric input, but is treated like a string leading to
runtime errors.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1635823
Signed-off-by: Noah Watkins <nwatkins@redhat.com>
6 years ago Avoid using tests as filter
Noah Watkins [Fri, 5 Oct 2018 22:53:40 +0000 (15:53 -0700)]
Avoid using tests as filter

Fixes the deprecation warning:

  [DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of
  using `result|search` use `result is search`.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
6 years ago tests: fix lvm2 setup issue
Guillaume Abrioux [Tue, 9 Oct 2018 19:43:08 +0000 (15:43 -0400)]
tests: fix lvm2 setup issue

Not gathering facts causes the `package` module to fail because it needs to
detect which OS we are running on to select the right package manager.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago docs: Correct mandatory config options
Ramana Raja [Tue, 9 Oct 2018 12:31:28 +0000 (18:01 +0530)]
docs: Correct mandatory config options

'radosgw_interface' or 'radosgw_address' config option does
not need to be set for all ceph-ansible deployments.

Closes: https://github.com/ceph/ceph-ansible/issues/3143
Signed-off-by: Ramana Raja <rraja@redhat.com>
6 years ago tests: install lvm2 before setting up ceph-volume/LVM tests
Alfredo Deza [Tue, 9 Oct 2018 17:40:38 +0000 (13:40 -0400)]
tests: install lvm2 before setting up ceph-volume/LVM tests

Signed-off-by: Alfredo Deza <adeza@redhat.com>
6 years ago ceph-validate: remove versions checks for bluestore and lvm scenario
Andrew Schoen [Tue, 9 Oct 2018 14:04:51 +0000 (10:04 -0400)]
ceph-validate: remove versions checks for bluestore and lvm scenario

These checks will never pass unless ceph_stable_release is passed and
ceph-defaults is run before ceph-validate. Additionally, we don't want
to support deploying jewel upstream at ceph-ansible master.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1637537
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago ceph-config: allow the batch --report to fail when getting the OSD num
Andrew Schoen [Tue, 2 Oct 2018 18:56:09 +0000 (13:56 -0500)]
ceph-config: allow the batch --report to fail when getting the OSD num

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago ceph-volume: if --report fails to load json, fail with better info
Andrew Schoen [Tue, 2 Oct 2018 18:50:01 +0000 (13:50 -0500)]
ceph-volume: if --report fails to load json, fail with better info

This handles the case gracefully where --report does not return any JSON
because a validator might have failed.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago tests: remove journal_size from lvm-batch testing scenario
Andrew Schoen [Mon, 1 Oct 2018 20:06:50 +0000 (15:06 -0500)]
tests: remove journal_size from lvm-batch testing scenario

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago ceph-volume: make the batch action idempotent
Andrew Schoen [Mon, 1 Oct 2018 17:51:47 +0000 (12:51 -0500)]
ceph-volume: make the batch action idempotent

The command is run with --report first to see if any OSDs will be
created or not. If they will be, then the command is run. If not, then
changed is set to False and the module exits.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
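
The report-then-run flow described above can be sketched as follows. This is not the module's code: `batch_idempotent` and the `run` callable are hypothetical, and both the `--format=json` argument and the boolean "changed" key in the report are assumptions about the report output.

```python
import json


def batch_idempotent(run, cmd):
    """Run `cmd` only if a dry-run report says it would create OSDs.

    `run` is a hypothetical callable: it takes an argv list and returns stdout.
    """
    report = json.loads(run(cmd + ["--report", "--format=json"]))
    # Assumption: the report carries a boolean "changed" key.
    if not report.get("changed", False):
        return {"changed": False, "cmd": cmd}
    run(cmd)  # second invocation, without --report, does the real work
    return {"changed": True, "cmd": cmd}
```

Injecting `run` keeps the decision logic testable without a real ceph-volume binary: a fake runner can return a canned report and record how many times it was called.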
6 years ago ceph-config: use 'lvm list' to find num_osds for an existing cluster
Andrew Schoen [Tue, 25 Sep 2018 20:25:40 +0000 (15:25 -0500)]
ceph-config: use 'lvm list' to find num_osds for an existing cluster

This makes finding num_osds idempotent for clusters that were deployed
using 'lvm batch'.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago ceph-volume: adds `lvm list` support to the ceph_volume module
Andrew Schoen [Tue, 25 Sep 2018 20:05:08 +0000 (15:05 -0500)]
ceph-volume: adds `lvm list` support to the ceph_volume module

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago ceph-config: use the ceph_volume module to get num_osds for lvm batch
Andrew Schoen [Thu, 20 Sep 2018 18:32:00 +0000 (13:32 -0500)]
ceph-config: use the ceph_volume module to get num_osds for lvm batch

This gives us an accurate number of how many osds will be created.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago ceph_volume: adds the report parameter
Andrew Schoen [Thu, 20 Sep 2018 18:17:29 +0000 (13:17 -0500)]
ceph_volume: adds the report parameter

Will pass the --report command to ceph-volume lvm batch.

Results will be returned in json format.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago ceph-osd: use journal_size and block_db_size for lvm batch
Andrew Schoen [Thu, 20 Sep 2018 17:26:24 +0000 (12:26 -0500)]
ceph-osd: use journal_size and block_db_size for lvm batch

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago ceph-defaults: add the block_db_size option
Andrew Schoen [Thu, 20 Sep 2018 17:24:07 +0000 (12:24 -0500)]
ceph-defaults: add the block_db_size option

This is used in the lvm osd scenario for the 'lvm batch' subcommand
of ceph-volume.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago ceph-volume: add the journal_size and block_db_size options
Andrew Schoen [Thu, 20 Sep 2018 17:18:53 +0000 (12:18 -0500)]
ceph-volume: add the journal_size and block_db_size options

These can be used for the --journal-size and --block-db-size options
of `lvm batch`.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years ago site: use default value for 'cluster' variable
Sébastien Han [Mon, 8 Oct 2018 13:45:58 +0000 (09:45 -0400)]
site: use default value for 'cluster' variable

If someone's cluster name is 'ceph' then the playbook will fail (with no
errors because of ignore_errors) saying it can not find the variable. So
let's declare the default. If the cluster name is different then it'll
be in group_vars and thus there won't be any failure.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636962
Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago rhcs: add helpers for the containerized deployment
Sébastien Han [Fri, 5 Oct 2018 12:05:11 +0000 (14:05 +0200)]
rhcs: add helpers for the containerized deployment

We give more assistance to consultants deploying by setting the registry
and the image name.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago common: remove check_firewall code
Guillaume Abrioux [Fri, 5 Oct 2018 12:33:04 +0000 (14:33 +0200)]
common: remove check_firewall code

Check firewall isn't working as expected and might break deployments.
This part of the code will be reworked soon.

Let's focus on configure_firewall code for now.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1541840
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago follow up on b5d2ea2
Guillaume Abrioux [Thu, 4 Oct 2018 08:02:24 +0000 (10:02 +0200)]
follow up on b5d2ea2

Add some missed statements

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago rolling_update: add ceph-handler role
Guillaume Abrioux [Fri, 5 Oct 2018 11:15:54 +0000 (13:15 +0200)]
rolling_update: add ceph-handler role

Since the introduction of ceph-handler, it has to be added to the
rolling_update playbook as well.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodon't use "static" field while including tasks
Rishabh Dave [Fri, 10 Aug 2018 12:16:30 +0000 (08:16 -0400)]
don't use "static" field while including tasks

Instead, use "import_tasks" and "include_tasks" to tell whether tasks
must be included statically or dynamically.

Fixes: https://github.com/ceph/ceph-ansible/issues/2998
Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agoswitch: copy initial mon keyring
Sébastien Han [Wed, 3 Oct 2018 11:39:35 +0000 (13:39 +0200)]
switch: copy initial mon keyring

We need to copy this key into /etc/ceph so when ceph-docker-common runs
it can fetch it to the ansible server. Previously the task wasn't
failing because `fail_on_missing` was False before 2.5; now it's True,
hence the failure.

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years agoswitch: add missing call to ceph-handler role
Guillaume Abrioux [Tue, 2 Oct 2018 17:22:20 +0000 (19:22 +0200)]
switch: add missing call to ceph-handler role

Add the missing call to the ceph-handler role; otherwise other roles can't
reference variables registered by ceph-handler.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoswitch: support migration when cluster is scrubbing
Guillaume Abrioux [Tue, 2 Oct 2018 15:31:49 +0000 (17:31 +0200)]
switch: support migration when cluster is scrubbing

Similar to c13a3c3 we must allow scrubbing when running this playbook.

In a cluster with a large number of PGs, some of them can be expected to be
scrubbing; it's a normal operation.
Refusing to run while PGs are scrubbing would force users to set the noscrub flag.

This commit allows switching from a non-containerized to a containerized
environment even while PGs are scrubbing.

Closes: #3182
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoconfig: look up for monitor_address_block in hostvars
Guillaume Abrioux [Tue, 2 Oct 2018 13:55:47 +0000 (15:55 +0200)]
config: look up for monitor_address_block in hostvars

`monitor_address_block` should be read from hostvars[host] instead of
from the node currently being played.

e.g.:

Let's assume we have:

```
[mons]
ceph-mon0 monitor_address=192.168.1.10
ceph-mon1 monitor_interface=eth1
ceph-mon2 monitor_address_block=192.168.1.0/24
```

the ceph.conf generation task will end up with:

```
fatal: [ceph-mon0]: FAILED! => {}

MSG:

'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_interface'
```

The reason is that it assumes `monitor_address_block` isn't defined, even for
ceph-mon2, because it looks up `monitor_address_block` instead of
`hostvars[host]['monitor_address_block']`; it therefore falls through to the default branch:

```
    {%- else -%}
      {% set interface = 'ansible_' + (monitor_interface | replace('-', '_')) %}
      {% if ip_version == 'ipv4' -%}
        {{ hostvars[host][interface][ip_version]['address'] }}
      {%- elif ip_version == 'ipv6' -%}
        [{{ hostvars[host][interface][ip_version][0]['address'] }}]
      {%- endif %}
    {%- endif %}
```

`monitor_interface` is set to the default value `'interface'`, so the `interface`
variable is built as 'ansible_' + 'interface'. This makes ansible throw a
confusing message about `'ansible_interface'`.
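
The difference between the two lookups can be mimicked with plain Python
dicts. This is a simplified sketch, not the real template: `hostvars` is a toy
stand-in for the inventory above, `branch_for` is a hypothetical helper, and
only the `monitor_address_block` test and the fallback branch are modelled.

```python
# Toy stand-in for Ansible's hostvars, mirroring the inventory above.
hostvars = {
    "ceph-mon0": {"monitor_address": "192.168.1.10"},
    "ceph-mon1": {"monitor_interface": "eth1"},
    "ceph-mon2": {"monitor_address_block": "192.168.1.0/24"},
}


def branch_for(host, current_node, per_host_lookup):
    """Return the branch taken for `host` while rendering on `current_node`."""
    # Old behaviour: the variable is looked up on the node being played.
    # Fixed behaviour: it is looked up in hostvars[host].
    scope = hostvars[host] if per_host_lookup else hostvars[current_node]
    if "monitor_address_block" in scope:
        return "monitor_address_block"
    return "monitor_interface"  # default branch, builds 'ansible_' + interface


# Rendering on ceph-mon0 the entry for ceph-mon2, before and after the fix.
print(branch_for("ceph-mon2", "ceph-mon0", per_host_lookup=False))
print(branch_for("ceph-mon2", "ceph-mon0", per_host_lookup=True))
```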

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1635303
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoAdd support for different NTP daemons v3.2.0beta3
Benjamin Cherian [Wed, 5 Sep 2018 16:59:50 +0000 (09:59 -0700)]
Add support for different NTP daemons

Allow the user to choose between timesyncd, chronyd and ntpd.
Installation will default to timesyncd since it is distributed as
part of the systemd installation for most distros.
Added a note indicating that the NTP daemon type is not used for
containerized deployments.

Fixes issue #3086 on GitHub

Signed-off-by: Benjamin Cherian <benjamin_cherian@amat.com>
6 years agoigw: valid client CHAP settings.
Mike Christie [Fri, 28 Sep 2018 21:23:10 +0000 (16:23 -0500)]
igw: valid client CHAP settings.

The Linux kernel target layer, LIO, does not support mixing ACLs that
have CHAP enabled and disabled under the same TPG of an iSCSI target. This
patch adds a check and fails if this type of setup is detected.
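
A check of this kind could look roughly like the sketch below. It is not the
code from this patch: the `has_mixed_chap` helper and the client layout (a
list of dicts whose `chap` value is empty when CHAP is disabled) are
assumptions made for illustration.

```python
def has_mixed_chap(clients):
    """Return True if some clients have CHAP enabled and others disabled.

    `clients` is assumed to be a list of dicts with an optional 'chap'
    entry; an empty or missing value is treated as CHAP disabled.
    """
    states = {bool(client.get("chap")) for client in clients}
    return len(states) > 1


# Hypothetical client definitions for a single TPG.
clients = [
    {"client": "iqn.1994-05.com.example:host1", "chap": "user1/secret1"},
    {"client": "iqn.1994-05.com.example:host2", "chap": ""},
]

if has_mixed_chap(clients):
    print("error: CHAP must be enabled for all clients or for none")
```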

This fixes Red Hat BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1615088

Signed-off-by: Mike Christie <mchristi@redhat.com>