]> git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
ceph-ansible.git
5 years agocontainer: add always tag on gather fact tasks
Dimitri Savineau [Thu, 14 Nov 2019 14:29:29 +0000 (09:29 -0500)]
container: add always tag on gather fact tasks

If we execute the site-container.yml playbook with specific tags (like
ceph_update_config) then we need to be sure to gather the facts otherwise
we will see error like:

The task includes an option with an undefined variable. The error was:
'ansible_hostname' is undefined

This commit also adds missing 'gather_facts: false' to mons plays.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1754432
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-osd: add device class to crush rules
Dimitri Savineau [Thu, 31 Oct 2019 20:24:12 +0000 (16:24 -0400)]
ceph-osd: add device class to crush rules

This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated sub command.
If the class key isn't specified then we use the create-simple sub
command for backward compatibility.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agomove crush rule creation from mon to osd role
Dimitri Savineau [Thu, 31 Oct 2019 20:17:33 +0000 (16:17 -0400)]
move crush rule creation from mon to osd role

If we want to create crush rules with the create-replicated sub command
and device class then we need to have the OSD created before the crush
rules otherwise the device classes won't exist.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-defaults: pin prometheus container tags
Dimitri Savineau [Mon, 4 Nov 2019 15:10:26 +0000 (10:10 -0500)]
ceph-defaults: pin prometheus container tags

In addition to the grafana container tag change, we need to do the same
for the prometheus container stack based on the release present in the
OSE 4.1 container image.

$ docker run --rm openshift4/ose-prometheus-node-exporter:v4.1 --version
node_exporter, version 0.17.0
  build user:       root@67fee13ed48f
  build date:       20191023-14:38:12
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus-alertmanager:4.1 --version
alertmanager, version 0.16.2
  build user:       root@70b79a3f29b6
  build date:       20191023-14:57:30
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus:4.1 --version
prometheus, version 2.7.2
  build user:       root@12da054778a3
  build date:       20191023-14:39:36
  go version:       go1.11.13

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoEvades validation of ceph_repository_type in containerized scenario
VasishtaShastry [Thu, 7 Nov 2019 12:00:21 +0000 (17:30 +0530)]
Evades validation of ceph_repository_type in containerized scenario
This will prevent failure of site-docker.yml with configs in doc.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769760
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoceph_key: restore file mode after a key is fetched
Guillaume Abrioux [Thu, 14 Nov 2019 09:30:34 +0000 (10:30 +0100)]
ceph_key: restore file mode after a key is fetched

when `import_key` is enabled, if the key already exists, it will only be
fetched using ceph cli, if the mode specified in the `ceph_key` task is
different from what is applied by the ceph cli, the mode isn't restored because
we don't call `module.set_fs_attributes_if_different()` before
`module.exit_json(**result)`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1734513
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotests: add time command in vagrant_up.sh
Guillaume Abrioux [Thu, 17 Oct 2019 13:37:31 +0000 (15:37 +0200)]
tests: add time command in vagrant_up.sh

monitor how long it takes to get all VMs up and running

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotests: remove legacy in tox-update.ini
Guillaume Abrioux [Fri, 8 Nov 2019 13:01:41 +0000 (14:01 +0100)]
tests: remove legacy in tox-update.ini

This variable isn't used in tox-update.ini so this commit removes it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotests: upgrade from nautilus to octopus in master
Guillaume Abrioux [Wed, 10 Apr 2019 12:17:36 +0000 (14:17 +0200)]
tests: upgrade from nautilus to octopus in master

test upgrades from nautilus to octopus instead of mimic to octopus.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoupdate: reset flags before and after each osd node upgrade
Guillaume Abrioux [Tue, 18 Jun 2019 08:08:48 +0000 (10:08 +0200)]
update: reset flags before and after each osd node upgrade

It might be possible at some point even with osd flags `noout` and
`norebalance` set the PGs states can change depending on the amount of data
written meantime. It means the check for PGs state will fail.

This commit changes the way we set those flags:
we set them before an OSD node upgrade and unset them before the PGs
state check so they can recover.

Fixes: #3961
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoFixes for Makefile
Javier Pena [Fri, 8 Nov 2019 09:38:32 +0000 (10:38 +0100)]
Fixes for Makefile

- Set default mock configuration to epel-8-x86_64, to match the
  default dist value.
- Add support for alpha tags, like the recently added v5.0.0alpha1

Signed-off-by: Javier Pena <jpena@redhat.com>
5 years agotests: add coverage on purge playbook
Guillaume Abrioux [Thu, 7 Nov 2019 12:39:25 +0000 (13:39 +0100)]
tests: add coverage on purge playbook

This commit adds a playbook to be played before we run purge playbook,
it first creates an rbd image then map an rbd device on client0 so the
purge playbook will try to unmap it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agopurge: use sysfs to unmap rbd devices
Guillaume Abrioux [Mon, 4 Nov 2019 14:59:39 +0000 (15:59 +0100)]
purge: use sysfs to unmap rbd devices

in containerized context, using the binary provided in atomic os won't
work because it's an old version provided by ceph-common based on
10.2.5.
Using a container could be an idea but for large cluster with hundreds
of client nodes, that would require to pull the image of each of them
just to unmap the rbd devices.

Let's use the sysfs method in order to avoid any issue related to ceph
version that is shipped on the host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766064
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoceph-validate: add rbdmirror validation
Dimitri Savineau [Tue, 5 Nov 2019 16:53:22 +0000 (11:53 -0500)]
ceph-validate: add rbdmirror validation

When ceph_rbd_mirror_configure is set to true we need to ensure that
the required variables aren't empty.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1760553
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-handler: Use /proc/net/unix for rgw socket
Dimitri Savineau [Tue, 6 Aug 2019 15:41:02 +0000 (11:41 -0400)]
ceph-handler: Use /proc/net/unix for rgw socket

If for some reason, there's an old rgw socket file present in the
/var/run/ceph/ directory then the test command could fail with

test: xxxxxxxxx.asok: binary operator expected

$ ls -hl /var/run/ceph/
total 0
srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94153614631472.asok
srwxr-xr-x. ceph-client.rgw.rgw0.rgw0.68.94240997655088.asok

We can check the radosgw socket in /proc/net/unix to avoid using wildcard
in the socket name.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoadd-{mon,osd}: run raw install python tasks
Dimitri Savineau [Mon, 4 Nov 2019 14:04:48 +0000 (09:04 -0500)]
add-{mon,osd}: run raw install python tasks

If the new mon/osd node doesn't have python installed then we need to
execute the tasks from raw_install_python.yml.

Closes: #4368
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-osd: fix fs.aio-max-nr sysctl condition
Dimitri Savineau [Wed, 6 Nov 2019 15:15:53 +0000 (10:15 -0500)]
ceph-osd: fix fs.aio-max-nr sysctl condition

[1] introduced a regression on the fs.aio-max-nr sysctl value condition.
The enable key isn't a boolean but a string because the expression isn't
evaluated.
This string output "(osd_objectstore == 'bluestore')" is always true
because item.enable condition only matches non empty string. So the
sysctl value was applyied for both filestore and bluestore backend.

[2] added the bool filter to the condition but the filter always returns
false on string and the sysctl wasn't applyed at all.

This commit fixes the enable key value by evaluating the value instead
of using the string.

[1] https://github.com/ceph/ceph-ansible/commit/08a2b58
[2] https://github.com/ceph/ceph-ansible/commit/ab54fe2

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agotests/requirements: bump testinfra and pytest v5.0.0alpha1
Dimitri Savineau [Fri, 1 Nov 2019 14:25:36 +0000 (10:25 -0400)]
tests/requirements: bump testinfra and pytest

The ansible ssh connections are now using the ssh backend instead of
paramiko starting testinfra 3.1 and persistent connections too.
pytest 4.6 is the latest release to be supported by python 2.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-defaults: pin grafana container tag to 5.2.4
Dimitri Savineau [Thu, 31 Oct 2019 15:37:20 +0000 (11:37 -0400)]
ceph-defaults: pin grafana container tag to 5.2.4

The latest grafana container tag is using grafana 6.x release which could
cause issue with the ceph dashboard integration.
Considering that the grafana container in RHCS 3 is based on 5.x then we
should use the same version.

$ docker run --rm rhceph/rhceph-3-dashboard-rhel7:3 -v
Version 5.2.4 (commit: unknown-dev)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-osd: Remove ulimit nofile on container start
Dimitri Savineau [Wed, 30 Oct 2019 15:45:44 +0000 (11:45 -0400)]
ceph-osd: Remove ulimit nofile on container start

Even if this improves ceph-disk/ceph-volume performances then it also
impact the ceph-osd process.
The ceph-osd process shouldn't use 1024:4096 value for the max open
files.
Removing the ulimit option from the container engine and doing this kind
of change on the container side [1].

[1] https://github.com/ceph/ceph-container/pull/1497

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoSet grafana-server user and password in ceph-dashboard role
fmount [Thu, 31 Oct 2019 09:49:22 +0000 (10:49 +0100)]
Set grafana-server user and password in ceph-dashboard role

This change adds two tasks to set grafana-api user and password
that are required to inject dashboard layouts to the external
grafana instance.
Without these two parameters the ceph-ansible playbook fails
showing an authorization error (HTTPError: 401 Client Error:
Unauthorized").

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1767365
Signed-off-by: fmount <fpantano@redhat.com>
5 years agoAllow setting dist and mock configuration in Makefile
Javier Pena [Wed, 23 Oct 2019 14:45:41 +0000 (16:45 +0200)]
Allow setting dist and mock configuration in Makefile

Curently, the dist and mock configurations are hardcoded in the
Makefile to be el8 and epel-7-x86_64, respectively. This commit
allows the user to override those settings using the DIST and
MOCK_CONFIG environment variables, falling back to the current
defaults if not set.

This provides additional flexibility when building the RPM directly
from the repository.

Signed-off-by: Javier Peña <jpena@redhat.com>
5 years agoceph-mon: use --admin-daemon to set default crush rule
Mihai Plasoianu [Mon, 28 Oct 2019 15:30:39 +0000 (16:30 +0100)]
ceph-mon: use --admin-daemon to set default crush rule

Signed-off-by: Mihai Plasoianu <m.plasoianu@vertical.de>
5 years agonfs: support specific keys for rgw nfs user
Radu Toader [Tue, 29 Oct 2019 07:56:00 +0000 (09:56 +0200)]
nfs: support specific keys for rgw nfs user

This brings the possibility to modify the rgw nfs user to use specific
keys when those are defined.

Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
5 years agoupdate: add default values when setting fact
Guillaume Abrioux [Tue, 29 Oct 2019 17:01:50 +0000 (18:01 +0100)]
update: add default values when setting fact

This commit adds a default value in the `with_dict` because when using
python 2.7, if a task using a `with_dict` has a condition, it is
evaluated anyway whereas in python 3 it isn't.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766499
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoceph-nfs: add nfs-ganesha-rados-grace explicitly
Dimitri Savineau [Mon, 28 Oct 2019 17:54:16 +0000 (13:54 -0400)]
ceph-nfs: add nfs-ganesha-rados-grace explicitly

Since nfs-ganesha V3.0-rc4 and [1] we need to explicitly install the
nfs-ganesha-rados-grace package.

[1] https://github.com/nfs-ganesha/nfs-ganesha/commit/0fea990

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agorolling_update: remove default filter on mds group
Dimitri Savineau [Fri, 25 Oct 2019 21:03:46 +0000 (17:03 -0400)]
rolling_update: remove default filter on mds group

There's no need to use the default filter on active/standby groups
because if the group doesn't exist then the play is just skipped.

Currently this generates warnings like:

[WARNING]: Could not match supplied host pattern, ignoring: |
[WARNING]: Could not match supplied host pattern, ignoring: default([])

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agorolling_update: fix active mds host value
Dimitri Savineau [Fri, 25 Oct 2019 20:47:50 +0000 (16:47 -0400)]
rolling_update: fix active mds host value

The active mds host should be based on the inventory hostname and not on
the ansible hostname.
The value returns under the mdsmap structure is based on the OS hostname
so we need to find the right node in the inventory with this value when
doing operation on inventory nodes.

Othewise we could see error like:

The task includes an option with an undefined variable. The error was:
"hostvars[foobar]" is undefined

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoipaddrs_in_ranges: fix python indent
Dimitri Savineau [Fri, 25 Oct 2019 19:52:27 +0000 (15:52 -0400)]
ipaddrs_in_ranges: fix python indent

pycodestyle returns:

 E111 indentation is not a multiple of four

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agomove library/plugins tests files under tests dir
Dimitri Savineau [Fri, 25 Oct 2019 19:47:05 +0000 (15:47 -0400)]
move library/plugins tests files under tests dir

To avoid unnecessary ansible warnings during playbook execution we can
move the library and plugins test files under a different directory.

[WARNING]: Skipping plugin (plugins/filter/test_ipaddrs_in_ranges.py) as
it seems to be invalid:
cannot import name 'ipaddrs_in_ranges'

Closes: #4656
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoadd-mon: add missing become flag
Dimitri Savineau [Fri, 25 Oct 2019 15:09:32 +0000 (11:09 -0400)]
add-mon: add missing become flag

Without the become flag set to true, we can't executed the roles
successfully.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agorolling_update: fix reset mon_host variable
Dimitri Savineau [Fri, 25 Oct 2019 17:36:07 +0000 (13:36 -0400)]
rolling_update: fix reset mon_host variable

mon_host should use the inventory hostname and not the node hostname.
Fix creates an issue when the inventory and node hostname are different.

Closes: #4670
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoadd-{mon,osd}: add ceph-container-engine role
Dimitri Savineau [Thu, 24 Oct 2019 14:02:10 +0000 (10:02 -0400)]
add-{mon,osd}: add ceph-container-engine role

The ceph-container-engine role is missing from both playbooks so the
container engine (docker, podman) isn't install resulting in a failure
on the added nodes.

fatal: [xxxxx]: FAILED! => changed=false
  cmd: docker --version
  msg: '[Errno 2] No such file or directory'
  rc: 2

Closes: #4634
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoupdate: use right node when creating active mds group
Guillaume Abrioux [Thu, 24 Oct 2019 07:41:06 +0000 (09:41 +0200)]
update: use right node when creating active mds group

This must be consistent with what is used in `name` parameter.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agodefaults: add user/pass auth registry variables
Dimitri Savineau [Thu, 24 Oct 2019 15:07:20 +0000 (11:07 -0400)]
defaults: add user/pass auth registry variables

Add ceph_docker_registry_username and ceph_docker_registry_password
variables in ceph-defaults role so they will be present in the group_vars
samples but commented.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agomon: call mon_status from asok
Guillaume Abrioux [Thu, 24 Oct 2019 09:08:27 +0000 (11:08 +0200)]
mon: call mon_status from asok

since c09b82a80a392ccd0da7677c7b424ce5cd3fa5d6 in ceph/ceph we must call
mon_status from asok instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoupdate: avoid skipping single mds deployment upgrade
Guillaume Abrioux [Wed, 23 Oct 2019 17:39:15 +0000 (19:39 +0200)]
update: avoid skipping single mds deployment upgrade

otherwise a single MDS would never be updated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoupdate: skip mds deactivation when no mds in inventory
Guillaume Abrioux [Wed, 23 Oct 2019 13:48:32 +0000 (15:48 +0200)]
update: skip mds deactivation when no mds in inventory

Let's skip this part of the code if there's no mds node in the
inventory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agodashboard: add ceph iscsi management
Dimitri Savineau [Mon, 21 Oct 2019 19:45:19 +0000 (15:45 -0400)]
dashboard: add ceph iscsi management

When deploying with ceph-iscsi nodes and dashboard enabled, we need to
add the ceph iscsi gateway endpoints to the dashboard configuration and
add the mgr ip address in the trusted list in the iscsi gateway
configuration file.

Closes: #4638
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764173
https://docs.ceph.com/docs/master/mgr/dashboard/#enabling-iscsi-management

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-iscsi: add ceph-iscsi stable repositories
Dimitri Savineau [Mon, 21 Oct 2019 14:32:55 +0000 (10:32 -0400)]
ceph-iscsi: add ceph-iscsi stable repositories

This commit adds the support of the ceph-iscsi stable repository when
use ceph_repository community instead of always using the devel
repositories.
We're still using the devel repositories for rtslib and tcmu-runner in
both cases (dev and community).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agorhcs: set ceph_iscsi_config_dev to false
Dimitri Savineau [Mon, 21 Oct 2019 13:58:40 +0000 (09:58 -0400)]
rhcs: set ceph_iscsi_config_dev to false

We don't have to use ceph_iscsi_config_dev (default true) on RHCS
because all iscsi packages are already included in the RHCS
repositories.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoRevert "iscsigw: install python-requests"
Dimitri Savineau [Mon, 21 Oct 2019 13:37:37 +0000 (09:37 -0400)]
Revert "iscsigw: install python-requests"

We don't need this since [1]. Also this was only working for python2 and
not supporting python3.

[1] https://github.com/ceph/ceph-iscsi/commit/00f198a

This reverts commit 167737dd3de02057403fb458c50d22cf94a85b95.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agocontainer/dashboard: run the registry auth task
Dimitri Savineau [Tue, 22 Oct 2019 17:58:50 +0000 (13:58 -0400)]
container/dashboard: run the registry auth task

When deploying with packages then the ceph-container-common role isn't
executed so the registry authentication task is ignored.

Closes: #4636
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agotests: use osd ids instead of device name in ooo_collocation
Guillaume Abrioux [Tue, 22 Oct 2019 11:27:20 +0000 (13:27 +0200)]
tests: use osd ids instead of device name in ooo_collocation

on master, it doesn't make sense anymore to use device name, we should
use osd id instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotests: fix keyring creation in ooo_collocation
Guillaume Abrioux [Tue, 22 Oct 2019 07:30:37 +0000 (09:30 +0200)]
tests: fix keyring creation in ooo_collocation

This commit removes the backslash in allow command parameter, this was
needed before the ceph_key module integration.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotests: update container tag for ooo_collocation
Dimitri Savineau [Wed, 28 Aug 2019 14:49:54 +0000 (10:49 -0400)]
tests: update container tag for ooo_collocation

It doesn't make sense to test the old 3.0.x container images with
nautilus+ ceph releases.
Also disable the dashboard deployment and switch to bluestore backend.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agovalidate: fix credentials validation
Guillaume Abrioux [Mon, 21 Oct 2019 15:09:31 +0000 (17:09 +0200)]
validate: fix credentials validation

This task is failing when `ceph_docker_registry_auth` is enabled and
`ceph_docker_registry_username` is undefined with an ansible error
instead of the expected message.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotravis: fail on ansible-lint errors
Dimitri Savineau [Mon, 21 Oct 2019 14:03:08 +0000 (10:03 -0400)]
travis: fail on ansible-lint errors

If ansible-lint reports an error then it's skipped. We should fail in
this case.

This patch also fixes the pipefail lint in the rbd mirror role

[306] Shells that use pipes should set the pipefail option

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoupdate: add missing quotes
Guillaume Abrioux [Mon, 21 Oct 2019 12:22:58 +0000 (14:22 +0200)]
update: add missing quotes

Add missing quote in order to keep consistency.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotests: add multimds coverage
Guillaume Abrioux [Thu, 17 Oct 2019 12:43:18 +0000 (14:43 +0200)]
tests: add multimds coverage

This commit makes the all_daemons scenario deploying 3 mds in order to
cover the multimds case.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoupgrade: fix standby_mdss group creation
Guillaume Abrioux [Thu, 17 Oct 2019 13:15:58 +0000 (15:15 +0200)]
upgrade: fix standby_mdss group creation

This commit fixes the standby_mdss group creation by using `{{ item }}`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoplaybooks: show cluster status after dashboard
Dimitri Savineau [Thu, 17 Oct 2019 19:15:20 +0000 (15:15 -0400)]
playbooks: show cluster status after dashboard

We should show the ceph cluster status as the last task of the playbook.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoMove the dashboard playbook in the main directory
Dimitri Savineau [Thu, 17 Oct 2019 19:10:59 +0000 (15:10 -0400)]
Move the dashboard playbook in the main directory

The [group|host]_vars directories are ignored for the dashboard playbook
when the inventory file directory doesn't contain those directories.

Closes: #4601
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1761612
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agocommon: do not override ceph_release when using custom repo
Guillaume Abrioux [Thu, 17 Oct 2019 15:05:53 +0000 (17:05 +0200)]
common: do not override ceph_release when using custom repo

Otherwise it fails like following:

```
TASK [ceph-mds : allow multimds] **************************************************************************************************************************************************
Monday 22 July 2019  16:37:38 +0800 (0:00:03.269)       0:13:25.651 ***********
fatal: [rhel7u6clone1]: FAILED! => {"msg": "The conditional check 'ceph_release_num[ceph_release] == ceph_release_num.luminous' failed. The error was: error while evaluating conditional (ceph_release_num[ceph_release] == ceph_release_num.luminous): 'dict object' has no attribute u'dummy'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mds/tasks/create_mds_filesystems.yml': line 43, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: allow multimds\n  ^ here\n"}
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotests: fix the size on the second data LV
Dimitri Savineau [Thu, 17 Oct 2019 18:28:45 +0000 (14:28 -0400)]
tests: fix the size on the second data LV

The commit replaces the pv/vg/lv commands used with the ansible command
module by the lvg and lvol modules.
This also fixes the size of the second data LV because we were only using
50% of the remaining space instead of 100%.

With a 50G device, the result was:
  - data-lv1 was 25G
  - data-lv2 was 12.5G
Instead of:
  - data-lv1 was 25G
  - data-lv2 was 25G

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoiscsi-gw: Fix rtslib installation
Mike Christie [Wed, 16 Oct 2019 03:48:31 +0000 (22:48 -0500)]
iscsi-gw: Fix rtslib installation

When using python3 the name of the rtslib rpm is python3-rtslib. The
packages that use rtslib already have code that detects the python
version and distro deps, so drop it from the ceph iscsi gw task list and
let the ceph-iscsi rpm dependency handle it.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1760930

Signed-off-by: Mike Christie <mchristi@redhat.com>
5 years agoupdate: follow new recommandation to upgrade mds cluster
Guillaume Abrioux [Thu, 10 Oct 2019 17:09:49 +0000 (19:09 +0200)]
update: follow new recommandation to upgrade mds cluster

Refact the mds cluster upgrade code in order to follow the documented
recommandation.
See: https://github.com/ceph/ceph/blob/master/doc/cephfs/upgrading.rst

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agonfs: remove unnecessary set_fact in main.yml
Guillaume Abrioux [Wed, 16 Oct 2019 13:57:03 +0000 (15:57 +0200)]
nfs: remove unnecessary set_fact in main.yml

this task is a leftover and no longer needed.
It even causes bug when collocating nfs with mon.

Closes: #4609
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agorbd-mirror: fail if the peer is not added
Dimitri Savineau [Tue, 15 Oct 2019 15:32:40 +0000 (11:32 -0400)]
rbd-mirror: fail if the peer is not added

Due the 'failed_when: false' statement present in the peer task then
the playbook continues to ran even if the peer task was failing (like
incorrect remote peer format.

"stderr": "rbd: invalid spec 'admin@cluster1'"

This patch adds a task to list the peer present and add the peer only if
it's not already added. With this we don't need the failed_when statement
anymore.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-iscsi: notify rbd target services
Dimitri Savineau [Wed, 9 Oct 2019 20:50:09 +0000 (16:50 -0400)]
ceph-iscsi: notify rbd target services

When the iscsi gateway or the ceph configuration file change then we
need to notify the rbd target api/gw services to be restarted.
This patch also merges the rbd-target-api and rbd-target-gw handler
into a single file and listen.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoExecute common roles once on all nodes
Dimitri Savineau [Fri, 2 Aug 2019 20:20:08 +0000 (16:20 -0400)]
Execute common roles once on all nodes

The common roles don't need to be executed again on each group plays
(like mons, osds, etc..).
We only need to execute them during the first play. That wat, we will
apply the changes on all nodes in parallel instead of doing it once per
group.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoplaybook: remove duplicate facts
Dimitri Savineau [Tue, 15 Oct 2019 20:52:45 +0000 (16:52 -0400)]
playbook: remove duplicate facts

The is_atomic and container_binary facts are already defined in the
ceph-facts role so we don't need to have dedicated tasks for that
before the ceph-facts role exectution.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agomgr: do not copy all keyrings on all mgr
Guillaume Abrioux [Tue, 15 Oct 2019 15:02:18 +0000 (17:02 +0200)]
mgr: do not copy all keyrings on all mgr

There is no need to loop over all mgr nodes to set this fact, it's even
breaking deployments because it tries to copy all mgr keyring on all
mgr.

Closes: #4602
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoRemove validate action and notario dependency
Dimitri Savineau [Fri, 11 Oct 2019 15:42:36 +0000 (11:42 -0400)]
Remove validate action and notario dependency

The current ceph-validate role is using both validate action and fail
module tasks to validate the ceph configuration.
The validate action is based on the notario python library. When one of
the notario validation fails then a python stack trace is reported to the
ansible task. This output isn't understandable by users.

This patch removes the validate action and the notario depencendy. The
validation is now done with only fail ansible module.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654790
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agolint: fix error [303,602,701,702]
Dimitri Savineau [Wed, 9 Oct 2019 19:39:04 +0000 (15:39 -0400)]
lint: fix error [303,602,701,702]

 [303] mktemp used in place of tempfile module
 [602] Don't compare to empty string
 [701] No 'galaxy_info' found
 [702] Use 'galaxy_tags' rather than 'categories'

This patch also changes the ansible log_path value via the
ANSIBLE_LOG_PATH environment variable in the travis configuration to
avoid warnings.

[WARNING]: log file at /home/travis/ansible/ansible.log is not writeable
and we cannot create it, aborting

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-handler: group listen topics and condition
Dimitri Savineau [Wed, 9 Oct 2019 17:58:33 +0000 (13:58 -0400)]
ceph-handler: group listen topics and condition

We are using multiple listen topics with the handlers. That means that
we are notifying 4 tasks for each handler.
Instead we can group the listen on an include_tasks and based on the
group condition.

Before:

NOTIFIED HANDLER ceph-handler : set _mon_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy mon restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph mon daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _mon_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _osd_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy osd restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph osds daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _osd_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _mds_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy mds restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph mds daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _mds_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _rgw_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy rgw restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph rgw daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _rgw_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _mgr_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy mgr restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph mgr daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _mgr_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy rbd mirror restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph rbd mirror daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called after restart for mon0

After:

NOTIFIED HANDLER ceph-handler : mons handler for mon0
NOTIFIED HANDLER ceph-handler : osds handler for mon0
NOTIFIED HANDLER ceph-handler : mdss handler for mon0
NOTIFIED HANDLER ceph-handler : rgws handler for mon0
NOTIFIED HANDLER ceph-handler : mgrs handler for mon0
NOTIFIED HANDLER ceph-handler : rbdmirrors handler for mon0

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agodashboard: disable facts gathering
Dimitri Savineau [Wed, 9 Oct 2019 17:34:50 +0000 (13:34 -0400)]
dashboard: disable facts gathering

This is already done in the main playbooks but absent in the dashboard
playbook.
The facts are already gathered during the first play of the main
playbooks so we don't need to doing twice.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agomgr: improve mgr keyring creation
Guillaume Abrioux [Thu, 10 Oct 2019 19:25:10 +0000 (21:25 +0200)]
mgr: improve mgr keyring creation

Delegating on remote node isn't necessary here since we are already
iterating over the right nodes.

Closes: #4518
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agocommon: do not reset `container_exec_cmd`
Guillaume Abrioux [Tue, 8 Oct 2019 08:38:39 +0000 (10:38 +0200)]
common: do not reset `container_exec_cmd`

This commit removes some legacy tasks.

These tasks aren't needed, they cause the playbook to fail when
collocating daemons.

Closes: #4553
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agovalidate: prevent from installing OSD on same disk as the OS
Guillaume Abrioux [Tue, 8 Oct 2019 12:42:44 +0000 (14:42 +0200)]
validate: prevent from installing OSD on same disk as the OS

This commit adds a validation task to prevent from installing an OSD on
the same disk as the OS.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623580
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agodashboard: if no host is available, let's just skip these plays.
Guillaume Abrioux [Wed, 9 Oct 2019 12:46:43 +0000 (14:46 +0200)]
dashboard: if no host is available, let's just skip these plays.

If there is no host available, let's just skip these plays.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1759917
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agomergify: disable automerge on backport
Dimitri Savineau [Wed, 9 Oct 2019 15:03:00 +0000 (11:03 -0400)]
mergify: disable automerge on backport

For now we disable the automerge on backport PR until we find a better
solution.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agodashboard: update layouts before the restart
Dimitri Savineau [Tue, 8 Oct 2019 13:54:06 +0000 (09:54 -0400)]
dashboard: update layouts before the restart

If the mgr dashboard doesn't restart fast enough then the inject
dashboard task will fail with a HTTP error 400.

Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 914, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/dashboard/module.py", line 450, in handle_command
    push_local_dashboards()
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 132, in push_local_dashboards
    retry()
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 89, in call
    result = self.func(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 127, in push
    grafana.push_dashboard(body)
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 54, in push_dashboard
    response.raise_for_status()
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 834, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
HTTPError: 400 Client Error: Bad Request

Instead we can trigger this task before the module restart.

Closes: #4565
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agotests: reduce handler mon and osd delay
Dimitri Savineau [Tue, 8 Oct 2019 14:43:07 +0000 (10:43 -0400)]
tests: reduce handler mon and osd delay

We don't need to have high handler delay in the CI so reducing to
10 seconds.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agomergify: Update condition on status success
Dimitri Savineau [Tue, 8 Oct 2019 18:17:45 +0000 (14:17 -0400)]
mergify: Update condition on status success

We don't have one global pipeline anymore reporting the jenkins job
status. So we need to match the  CI scenario name.
This also add the Travis CI status as a condition.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agotests: update tox due to pipeline removal
Guillaume Abrioux [Tue, 8 Oct 2019 15:23:08 +0000 (17:23 +0200)]
tests: update tox due to pipeline removal

This commit reflects the recent changes in ceph/ceph-build#1406

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoswitch_to_containers: umount osd lockbox partition
Dimitri Savineau [Mon, 7 Oct 2019 19:47:52 +0000 (15:47 -0400)]
switch_to_containers: umount osd lockbox partition

When switching from a baremetal deployment to a containerized deployment
we only umount the OSD data partition.
If the OSD is encrypted (dmcrypt: true) then there's an additional
partition (part number 5) used for the lockbox and mount in the
/var/lib/ceph/osd-lockbox/ directory.
Because this partition isn't umount then the containerized OSD aren't
able to start. The partition is still mount by the system and can't be
remount from the container.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agonfs: stop nfs server service in all context
Guillaume Abrioux [Mon, 7 Oct 2019 08:34:07 +0000 (10:34 +0200)]
nfs: stop nfs server service in all context

This commit moves this task in order to stop the nfs server service
regardless the deployment type desired (containerized or non
containerized).

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agonfs: stop nfs server service
Guillaume Abrioux [Mon, 7 Oct 2019 08:21:51 +0000 (10:21 +0200)]
nfs: stop nfs server service

The syntax here wasn't working, this refact fixes this task.
Also, removing the `ignore_errors: true` which was hidding the failure.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoswitch_to_containers: do not re-set `ceph_uid`
Guillaume Abrioux [Mon, 7 Oct 2019 09:08:44 +0000 (11:08 +0200)]
switch_to_containers: do not re-set `ceph_uid`

This commit refacts the way we set `ceph_uid` fact in `ceph-facts` and
removes all `set_fact` tasks for `ceph_uid` in switch-to-containers playbook
to avoid duplicated code.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoswitch_to_containers: optimize ownership change
Guillaume Abrioux [Mon, 7 Oct 2019 07:19:50 +0000 (09:19 +0200)]
switch_to_containers: optimize ownership change

As per https://github.com/ceph/ceph-ansible/pull/4323#issuecomment-538420164

using `find` command should be faster.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1757400
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-Authored-by: Giulio Fidente <gfidente@redhat.com>
5 years agoceph-dashboard: remove rgw api host,port,scheme
Dimitri Savineau [Wed, 2 Oct 2019 18:15:45 +0000 (14:15 -0400)]
ceph-dashboard: remove rgw api host,port,scheme

We don't need to have dedicated variables for the RGW integration into
the Ceph Dashboard and need to be manually filled.
Instead we can use the current values from the RGW nodes by using the
IP and port from the first RGW instance of the first RGW node via the
radosgw_address and radosgw_frontend_port variables.
We don't need to specify all RGW nodes, this will be done automatically
with one node.
The RGW api scheme is using the radosgw_frontend_ssl_certificate variable
to determine if the value is http or https. This variable is also reuse
as a condition for the ssl verify task.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-dashboard: Improve https configuration
Dimitri Savineau [Wed, 2 Oct 2019 19:24:38 +0000 (15:24 -0400)]
ceph-dashboard: Improve https configuration

This patch moves the https dashboard configuration into a dedicated
block to avoid the multiple occurence of the dashboard_protocol
condition.
It also fixes the dashboard certificate and key variables handling in
the condition introduced by ab54fe2. Those variables aren't boolean but
strings so we can test them via the length filter.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoupdate: import ceph-defaults role in first play
Guillaume Abrioux [Fri, 4 Oct 2019 09:58:28 +0000 (11:58 +0200)]
update: import ceph-defaults role in first play

Typical error:

```
fatal: [mon0]: FAILED! =>
  msg: |-
    The conditional check 'not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])' failed. The error was: error while evaluating conditional (not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])): 'client_group_name' is undefined
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agosite.yml: remove raw installation of python2-dnf
Guillaume Abrioux [Wed, 2 Oct 2019 14:01:09 +0000 (16:01 +0200)]
site.yml: remove raw installation of python2-dnf

these dependencies aren't needed anymore on recent releases of Fedora.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agomain: exclude client nodes from facts gathering when delegate_facts_host
Guillaume Abrioux [Wed, 2 Oct 2019 13:36:30 +0000 (15:36 +0200)]
main: exclude client nodes from facts gathering when delegate_facts_host

This commit excludes client nodes from facts gathering, they are not
needed and can speed up this task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotests: fix rgw multisite vagrant variables
Dimitri Savineau [Fri, 4 Oct 2019 14:07:05 +0000 (10:07 -0400)]
tests: fix rgw multisite vagrant variables

The secondary vagrant variables didn't have the grafana vm variable
set which create an vagrant error.

There was an error loading a Vagrantfile. The file being loaded
and the error message are shown below. This is usually caused by
an invalid or undefined variable.

This patch also changes the ssh-extra-args parameter to ssh-common-args
to get the same values for ssh/sftp/scp. Otherwise we can see warnings
from ansible and some tasks are failing.

[WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1
to see detailed information

It also updates the ssh-common-args value for the rgw-multisite scenario
to reflect the ANSIBLE_SSH_ARGS environment variable value.

Finally changing the IP addresses due to the Vagrant refact done in the
commit 778c51a

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoplaybook: add missing tags
Guillaume Abrioux [Fri, 4 Oct 2019 12:58:11 +0000 (14:58 +0200)]
playbook: add missing tags

Add missing tag on ceph-handler role call.
Otherwise, we can't use `--tags='ceph_update_config'` for updating the
ceph configuration file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1754432
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agohandler: followup on #4519
Guillaume Abrioux [Fri, 4 Oct 2019 14:03:27 +0000 (16:03 +0200)]
handler: followup on #4519

This commit adds some missing `| bool` filters.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoceph-dashboard: add cluster parameter to ceph cmd
Dimitri Savineau [Thu, 3 Oct 2019 19:47:39 +0000 (15:47 -0400)]
ceph-dashboard: add cluster parameter to ceph cmd

The ceph dashboard tasks didn't use the cluster option if the cluster
name isn't the default value.

Closes: #4529
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agohandlers: refact osd handler
Guillaume Abrioux [Thu, 3 Oct 2019 11:52:11 +0000 (13:52 +0200)]
handlers: refact osd handler

This commit merges the two restart tasks into a single one, this way
it's one task less to notify.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agodashboard: remove useless block section
Dimitri Savineau [Wed, 2 Oct 2019 18:58:42 +0000 (14:58 -0400)]
dashboard: remove useless block section

The block section were used with the dashboard_enabled condition when
the code was included in the main playbooks.
Because this condition isn't present in the dashboard playbook anymore
we can remove the block section.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-handler: don't restart all OSDs with limit
Dimitri Savineau [Wed, 2 Oct 2019 18:48:53 +0000 (14:48 -0400)]
ceph-handler: don't restart all OSDs with limit

When using the ansible --limit option on one or few OSD nodes and if the
handler is triggered then we will restart the OSD service on all OSDs
nodes instead of the hosts limited by the limit value.
Even if the play is limited by the --limit value we are using all OSD
nodes from the OSD group.

  with_items: '{{ groups[osd_group_name] }}'

Instead we should iterate only on the nodes present in both OSD group and
limit list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-facts: fix _radosgw_address with block
Dimitri Savineau [Thu, 3 Oct 2019 12:13:01 +0000 (08:13 -0400)]
ceph-facts: fix _radosgw_address with block

e695efc introduced a regression in the _radosgw_address fact when using
the radosgw_address_block variable.
There's no item there because we don't use the items lookup. This is
only used for _monitor_address with monitor_address_block.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1758099
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoVagrantfile: support more than 9 nodes per daemon type
Guillaume Abrioux [Wed, 2 Oct 2019 08:14:52 +0000 (10:14 +0200)]
Vagrantfile: support more than 9 nodes per daemon type

because of the current ip address assignation, it's not possible to
deploy more than 9 nodes per daemon type.
This commit refact a bit and allows us to get around this limitation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agocommon: improve keyrings generation
Guillaume Abrioux [Wed, 2 Oct 2019 07:57:50 +0000 (09:57 +0200)]
common: improve keyrings generation

There is no need to get n * number of nodes the different keyrings.
Adding a `run_once: true` here avoid running a ceph command too many
times which could be impacting large cluster deployment.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoceph-facts: use --admin-daemon to get fsid
Dimitri Savineau [Tue, 1 Oct 2019 18:41:57 +0000 (14:41 -0400)]
ceph-facts: use --admin-daemon to get fsid

During the rolling_update scenario, the fsid value is retrieve from the
current ceph cluster configuration via the ceph daemon config command.
This command tries first to resolve the admin socket path via the
ceph-conf command.
Unfortunately this command won't work if you have a duplicate key in the
ceph configuration even if it only produces a warning. As a result the
task will fail.

Can't get admin socket path: unable to get conf option admin_socket for
mon.xxx: warning: line 13: 'osd_memory_target' in section 'osd' redefined

Instead of using ceph daemon we can use the --admin-daemon option
because we already know what the socket admin path value based on the
ceph cluster and mon hostname values.

Closes: #4492
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agovalidate: fix gpt header check
Guillaume Abrioux [Tue, 1 Oct 2019 07:34:14 +0000 (09:34 +0200)]
validate: fix gpt header check

Check for gpt header when osd scenario is lvm or lvm batch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agorbdmirror: rename a file
Guillaume Abrioux [Mon, 30 Sep 2019 09:02:52 +0000 (11:02 +0200)]
rbdmirror: rename a file

rename this file to be more generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agorgw: refact tasks directory layout
Guillaume Abrioux [Mon, 30 Sep 2019 07:40:56 +0000 (09:40 +0200)]
rgw: refact tasks directory layout

This commit moves containerized deployment related files to `./tasks/`
directory. This is needed to make `docker-to-podman.yml` working since
we use `tasks_from:` option.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>