git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
ceph-ansible.git
5 years agoceph-facts: move grafana fact to dedicated file
Dimitri Savineau [Mon, 13 Jan 2020 15:24:52 +0000 (10:24 -0500)]
ceph-facts: move grafana fact to dedicated file

We don't need to execute the grafana fact every time, but only during
the dashboard deployment.
Especially for ceph-grafana, ceph-prometheus and ceph-dashboard roles.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790303
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f940e695ab839aafac3be73163d8c84a2d1a8ebf)

5 years agofacts: fix osp/ceph external use case
Guillaume Abrioux [Mon, 13 Jan 2020 14:30:13 +0000 (15:30 +0100)]
facts: fix osp/ceph external use case

d6da508a9b6829d2d0633c7200efdffce14f403f broke the osp/ceph external use case.

We must skip these tasks when no monitor is present in the inventory.
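
A minimal sketch of such a guard, assuming the monitor group name is held
in `mon_group_name`; the task and the `first_mon` fact are hypothetical:

```yaml
- name: remember the first monitor (hypothetical task)
  set_fact:
    first_mon: "{{ groups[mon_group_name][0] }}"
  # skipped when the inventory has no monitor (osp/ceph external use case)
  when: groups.get(mon_group_name, []) | length > 0
```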

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790508
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2592a1e1e84f0c3f407ffd879fc8cee87ad35894)

5 years agodefaults: change monitor|radosgw_address default values
Guillaume Abrioux [Mon, 9 Dec 2019 17:23:15 +0000 (18:23 +0100)]
defaults: change monitor|radosgw_address default values

To avoid confusion, let's change the default value from `0.0.0.0` to
`x.x.x.x`.
Users might think setting `0.0.0.0` will make the daemon bind on all
interfaces.

Fixes: #4827
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fc02fc98ebce0f99e81628e76ad28e7bf65435de)

5 years agoosd: ensure osd ids collected are well restarted
Guillaume Abrioux [Mon, 13 Jan 2020 15:31:00 +0000 (16:31 +0100)]
osd: ensure osd ids collected are well restarted

This commit refactors the condition in the loop of that task so that all
potential osd ids found are properly restarted.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790212
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 58e6bfed2d1c9f6e86fd1a680f26539f539afcd0)

5 years agotests: add time command in vagrant_up.sh v4.0.8
Guillaume Abrioux [Thu, 17 Oct 2019 13:37:31 +0000 (15:37 +0200)]
tests: add time command in vagrant_up.sh

monitor how long it takes to get all VMs up and running

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 16bcef4f28c24b56d3896ac193226be139b4d2f2)

5 years agotests: retry to fire up VMs on vagrant failure
Guillaume Abrioux [Tue, 2 Apr 2019 12:53:19 +0000 (14:53 +0200)]
tests: retry to fire up VMs on vagrant failure

Add a script to retry several times to fire up VMs to avoid vagrant
failures.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 1ecb3a9352d869d8fde694cefae9de8af8f6fee8)

5 years agotests: add a docker2podman scenario
Guillaume Abrioux [Fri, 10 Jan 2020 13:31:42 +0000 (14:31 +0100)]
tests: add a docker2podman scenario

This commit adds a new scenario in order to test docker-to-podman.yml
migration playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit dc672e86eca3cfbd047c5968852511c07368d4b4)

5 years agodocker2podman: use set_fact to override variables
Guillaume Abrioux [Fri, 10 Jan 2020 13:30:35 +0000 (14:30 +0100)]
docker2podman: use set_fact to override variables

play vars have lower precedence than role vars and `set_fact`.
We must use a `set_fact` to reset these variables.
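
As an illustration only (the variable name and value are assumptions, not
necessarily the ones the playbook resets), an override done with
`set_fact` could look like this:

```yaml
- hosts: all
  gather_facts: false
  tasks:
    # set_fact ranks above play vars and role vars in Ansible's
    # variable precedence, so this value wins for the rest of the run
    - name: override the container binary (assumed variable name)
      set_fact:
        container_binary: podman
```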

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b0c491800a785df88613ef7a9c2680a7540a8c90)

5 years agodocker2podman: force systemd to reload config
Guillaume Abrioux [Fri, 10 Jan 2020 13:29:50 +0000 (14:29 +0100)]
docker2podman: force systemd to reload config

This is needed after a change is made in systemd unit files.
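
A sketch of such a reload using the `systemd` module (the task name is
illustrative):

```yaml
- name: reload systemd unit files
  systemd:
    daemon_reload: true
```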

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1c2ec9fb4042b65088d055ee4cd8bc773e241dcf)

5 years agodocker2podman: install podman
Guillaume Abrioux [Fri, 10 Jan 2020 10:17:27 +0000 (11:17 +0100)]
docker2podman: install podman

This commit adds a package installation task in order to install podman
during the docker-to-podman.yml migration playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d746575fd0ac83d5861b6aae0143aa8d390760e6)

5 years agoupdate: only run post osd upgrade play on 1 mon
Guillaume Abrioux [Mon, 18 Nov 2019 17:12:00 +0000 (18:12 +0100)]
update: only run post osd upgrade play on 1 mon

There is no need to run these tasks n times from each monitor.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c878e99589bde0eecb8ac72a7ec8bc1f66403eeb)

5 years agoupdate: use flags noout and nodeep-scrub only
Guillaume Abrioux [Mon, 18 Nov 2019 16:59:56 +0000 (17:59 +0100)]
update: use flags noout and nodeep-scrub only

1. set noout and nodeep-scrub flags,
2. upgrade each OSD node, one by one, wait for active+clean pgs
3. after all osd nodes are upgraded, unset flags
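
The sequence above can be sketched as follows. This is a simplified,
non containerized illustration; the cluster name `ceph` and the task
names are assumptions:

```yaml
# step 1: set the flags once, before touching any OSD node
- name: set osd flags
  command: "ceph --cluster ceph osd set {{ item }}"
  loop:
    - noout
    - nodeep-scrub
  run_once: true

# step 2 (not shown): upgrade each OSD node, one by one, and wait for
# active+clean pgs before moving to the next node

# step 3: unset the flags once all OSD nodes are upgraded
- name: unset osd flags
  command: "ceph --cluster ceph osd unset {{ item }}"
  loop:
    - noout
    - nodeep-scrub
  run_once: true
```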

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Rachana Patel <racpatel@redhat.com>
(cherry picked from commit 548db78b9535348dff616665be749503f80c4fca)

5 years agoceph-validate: add rbdmirror validation
Dimitri Savineau [Tue, 5 Nov 2019 16:53:22 +0000 (11:53 -0500)]
ceph-validate: add rbdmirror validation

When ceph_rbd_mirror_configure is set to true we need to ensure that
the required variables aren't empty.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1760553
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4a065cebd70d259bfd59b6f5f9baa45d516a9c3a)

5 years agoceph-osd: wait for all osds once
Dimitri Savineau [Wed, 27 Nov 2019 14:29:06 +0000 (09:29 -0500)]
ceph-osd: wait for all osds once

cf8c6a3 moves the 'wait for all osds' task from openstack_config to the
main tasks list.
But the openstack_config code was executed only on the last OSD node.
We don't need to do this check on every OSD node, so we need to set
run_once to true on that task.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5bd1cf40eb5823aab3c4e16b60b37c30600f9283)

5 years agoceph-osd: wait for all osd before crush rules
Dimitri Savineau [Tue, 26 Nov 2019 16:09:11 +0000 (11:09 -0500)]
ceph-osd: wait for all osd before crush rules

When creating crush rules with device class parameter we need to be sure
that all OSDs are up and running because the device class list is
populated with this information.
This is now enabled for all scenarios, not for openstack_config only.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit cf8c6a384999be8caedce1121dfd57ae114d5bb6)

5 years agoceph-osd: add device class to crush rules
Dimitri Savineau [Thu, 31 Oct 2019 20:24:12 +0000 (16:24 -0400)]
ceph-osd: add device class to crush rules

This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated sub command.
If the class key isn't specified then we use the create-simple sub
command for backward compatibility.
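
For illustration, two rule dicts and the sub command each one would map
to. Only the `class` key comes from this commit message; the list name
and the other keys are assumptions:

```yaml
crush_rules:
  # class present:
  #   ceph osd crush rule create-replicated replicated_hdd default host hdd
  - name: replicated_hdd
    root: default
    type: host
    class: hdd
  # no class key:
  #   ceph osd crush rule create-simple replicated_legacy default host
  - name: replicated_legacy
    root: default
    type: host
```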

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ef2cb99f739ade80e285d83050ac01184aafc753)

5 years agomove crush rule creation from mon to osd role
Dimitri Savineau [Thu, 31 Oct 2019 20:17:33 +0000 (16:17 -0400)]
move crush rule creation from mon to osd role

If we want to create crush rules with the create-replicated sub command
and device class then we need to have the OSD created before the crush
rules otherwise the device classes won't exist.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ed36a11eabbdbb040652991300cdfc93d51ed491)

5 years agopurge-iscsi-gateways: don't run all ceph-facts
Dimitri Savineau [Fri, 10 Jan 2020 14:31:26 +0000 (09:31 -0500)]
purge-iscsi-gateways: don't run all ceph-facts

We only need to have the container_binary fact. Because we're not
gathering the facts from all nodes, the purge fails when trying to get
one of the grafana facts.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786686
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a09d1c38bf80e412265f58d732c554262ef23cc7)

5 years agorolling_update: run registry auth before upgrading
Dimitri Savineau [Thu, 9 Jan 2020 19:57:08 +0000 (14:57 -0500)]
rolling_update: run registry auth before upgrading

Some tasks in the rolling upgrade playbook use the new container image
and need the registry login to be executed first, otherwise
the nodes won't be able to pull the container image.

Unable to find image 'xxx.io/foo/bar:latest' locally
Trying to pull repository xxx.io/foo/bar ...
/usr/bin/docker-current: Get https://xxx.io/v2/foo/bar/manifests/latest:
unauthorized

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3f344fdefe02c3b597b886cbef8b7456a7db28eb)

5 years agoconfig: exclude ceph-disk prepared osds in lvm batch report
Guillaume Abrioux [Thu, 9 Jan 2020 18:31:57 +0000 (19:31 +0100)]
config: exclude ceph-disk prepared osds in lvm batch report

We must exclude the devices already used and prepared by ceph-disk when
doing the lvm batch report. Otherwise it fails because ceph-volume
complains about GPT header.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786682
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fd1718f3796312e29cd5fd64fcc46826741303d2)

5 years agotests: use community repository v4.0.7
Dimitri Savineau [Thu, 9 Jan 2020 20:24:17 +0000 (15:24 -0500)]
tests: use community repository

We don't need to use dev repository on stable branches.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>

5 years agoshrink-rgw: refact global workflow
Dimitri Savineau [Thu, 9 Jan 2020 16:48:13 +0000 (11:48 -0500)]
shrink-rgw: refact global workflow

Instead of running the ceph roles against localhost we should do it
on the first mon.
The ansible and inventory hostname of the rgw nodes could be different.
Ensure that the rgw instance to remove is present in the cluster.
Fix rgw service and directory path.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 747555dfa601b4925204fd878735c296ef728e5d)

5 years agomon: support replacing a mon
Guillaume Abrioux [Thu, 9 Jan 2020 15:46:34 +0000 (16:46 +0100)]
mon: support replacing a mon

We must pick up a mon which actually exists in ceph-facts in order to
detect if a cluster is running. Otherwise, it will state no cluster is
already running which will end up deploying a new monitor isolated in a
new quorum.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 86f3eeb717c7daac8c6330fdaa7f8a3c83f94b0d)

5 years agoceph-iscsi: manage ipv6 in trusted_ip_list
Dimitri Savineau [Tue, 7 Jan 2020 20:01:48 +0000 (15:01 -0500)]
ceph-iscsi: manage ipv6 in trusted_ip_list

Only the ipv4 addresses from the nodes running the dashboard mgr module
were added to the trusted_ip_list configuration file on the iscsigws
nodes.
This also adds the iscsi gateways with ipv6 configuration to the ceph
dashboard.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 70eba66182aebfcb7056521eb9da7c6c13f574da)

5 years agoceph-rgw: Fix custom pool size setting
Benoît Knecht [Mon, 30 Dec 2019 09:53:20 +0000 (10:53 +0100)]
ceph-rgw: Fix custom pool size setting

RadosGW pools can be created by setting

```yaml
rgw_create_pools:
  .rgw.root:
    pg_num: 512
    size: 2
```

for instance. However, doing so would create pools of size
`osd_pool_default_size` regardless of the `size` value. This was due to
the fact that the Ansible task used

```
{{ item.size | default(osd_pool_default_size) }}
```

as the pool size value, but `item.size` is always undefined; the
correct variable is `item.value.size`.
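
A simplified sketch of why: when looping over a dict with `with_dict`,
each `item` only has `key` and `value` attributes, so the per-pool
settings live under `item.value`. The task below is illustrative, not
the role's actual task:

```yaml
- name: set the size of each rgw pool (illustrative)
  command: >
    ceph --cluster ceph osd pool set {{ item.key }} size
    {{ item.value.size | default(osd_pool_default_size) }}
  # item.key   -> ".rgw.root"
  # item.value -> {"pg_num": 512, "size": 2}
  with_dict: "{{ rgw_create_pools }}"
  changed_when: false
```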

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 3c31b19ab39f297635c84edb9e8a5de6c2da7707)

5 years agohandler: fix bug
Guillaume Abrioux [Thu, 19 Dec 2019 10:29:41 +0000 (11:29 +0100)]
handler: fix bug

411bd07d54fc3f585296b68f2fd04484328399b5 introduced a bug in handlers

using `handler_*_status` instead of `hostvars[item]['handler_*_status']`
causes handlers to be triggered in any case even though
`handler_*_status` was set to `False` on a specific node.
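
A hedged sketch of the pattern, taking `handler_mon_status` as one
instance of `handler_*_status`; the script path and `mon_group_name`
are assumptions:

```yaml
- name: restart ceph mon daemon(s) (simplified sketch)
  command: /usr/bin/env bash /tmp/restart_mon_daemon.sh
  delegate_to: "{{ item }}"
  with_items: "{{ groups[mon_group_name] }}"
  run_once: true
  # read the flag from the node being iterated over, not from the
  # host that happens to run the handler
  when: hostvars[item]['handler_mon_status'] | default(False) | bool
```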

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 30200802d97abf56c09ebd39f64184b2b4622c50)

5 years agoceph-nfs: add ganesha_t type to selinux
Dimitri Savineau [Mon, 6 Jan 2020 14:09:42 +0000 (09:09 -0500)]
ceph-nfs: add ganesha_t type to selinux

Since RHEL 8.1 we need to add the ganesha_t type to the permissive
SELinux list.
Otherwise the nfs-ganesha service won't start.
This was done on RHEL 7 previously and part of the nfs-ganesha-selinux
package on RHEL 8.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786110
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d75812529069244734732d05cc5aa3ddbc99b7c5)

5 years agoshrink-osd: support fqdn in inventory
Guillaume Abrioux [Mon, 9 Dec 2019 14:52:26 +0000 (15:52 +0100)]
shrink-osd: support fqdn in inventory

When using fqdn in inventory, that playbook fails because some tasks
use the result of ceph osd tree (which returns the shortname) to get
some data from hostvars[].

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1779021
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6d9ca6b05b52694dec53ce61fdc16bb83c93979d)

5 years agoceph-defaults: exclude rbd devices from discovery
Dimitri Savineau [Mon, 16 Dec 2019 20:12:47 +0000 (15:12 -0500)]
ceph-defaults: exclude rbd devices from discovery

The RBD devices aren't excluded from the devices list in the LVM auto
discovery scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783908
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 6f0556f01536932bdf47e8f1aab341b2c6761537)

5 years agoceph-infra: replace hardcoded grafana group name
Dimitri Savineau [Mon, 16 Dec 2019 16:03:21 +0000 (11:03 -0500)]
ceph-infra: replace hardcoded grafana group name

The grafana-server group name was hardcoded for the grafana/prometheus
firewalld tasks condition.
We should use the associated variable: grafana_server_group_name

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2c06678cdeed20f0d40f1693abbf8678250c25ea)

5 years agoceph-infra: move dashboard into a dedicated file
Dimitri Savineau [Mon, 16 Dec 2019 16:00:35 +0000 (11:00 -0500)]
ceph-infra: move dashboard into a dedicated file

Instead of using multiple dashboard_enabled conditions in the
configure_firewall file we could just have the condition once
and include the dedicated tasks list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f4c261ef9023d006cabfac28b45e7820bb132ceb)

5 years agoceph-infra: open dashboard port on monitor
Dimitri Savineau [Mon, 16 Dec 2019 15:48:26 +0000 (10:48 -0500)]
ceph-infra: open dashboard port on monitor

When there's no mgr group defined in the ansible inventory then the
mgrs are deployed implicitly on the mon nodes.
If the dashboard is enabled then we need to open the dashboard port on
the node that is running the ceph mgr process (mgr or mon).
The current code only allows opening that port on the mgr nodes when they
are present explicitly in the inventory but not implicitly.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1783520
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4535985188dcc656ff4da60318dc07b44eabf3a6)

5 years agodashboard: use fqdn in external url
Guillaume Abrioux [Thu, 2 Jan 2020 17:09:38 +0000 (18:09 +0100)]
dashboard: use fqdn in external url

Force fqdn to be used in external url for prometheus and alertmanager.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 498bc45859f9a7ac4b3ac419e21852164f8a762e)

5 years agotests: use ceph iscsi stable repository
Dimitri Savineau [Wed, 8 Jan 2020 14:20:01 +0000 (09:20 -0500)]
tests: use ceph iscsi stable repository

The ceph iscsi repository was still set to dev (shaman) instead of
using the stable ceph-iscsi repository.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agopurge-iscsi-gateways: remove node from dashboard
Dimitri Savineau [Mon, 6 Jan 2020 20:22:51 +0000 (15:22 -0500)]
purge-iscsi-gateways: remove node from dashboard

When using the ceph dashboard with iscsi gateway nodes we also need to
remove the nodes from the ceph dashboard list.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786686
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 931a842f21e5eb847ad371640307b7c0fef198bd)

5 years agofilestore-to-bluestore: umount partitions before zapping them
Guillaume Abrioux [Wed, 18 Dec 2019 14:48:32 +0000 (15:48 +0100)]
filestore-to-bluestore: umount partitions before zapping them

When an OSD is stopped, it leaves partitions mounted.
We must umount them before zapping them, otherwise errors like "Device is
busy" will show up.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8056514134512f20a4b02028fb051f075ad7a145)

5 years agoshrink-mds: do not play ceph-facts entirely
Guillaume Abrioux [Wed, 8 Jan 2020 15:10:17 +0000 (16:10 +0100)]
shrink-mds: do not play ceph-facts entirely

We only need to set `container_binary`.
Let's use `tasks_from` option.
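
Something along these lines; the tasks file name `container_binary` is
an assumption:

```yaml
- import_role:
    name: ceph-facts
    # only run the tasks file that sets container_binary
    tasks_from: container_binary
```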

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0ae0a9ce2812796d943085c6622f4188f16d6231)

5 years agoshrink-mds: use fact from delegated node
Guillaume Abrioux [Wed, 8 Jan 2020 14:02:24 +0000 (15:02 +0100)]
shrink-mds: use fact from delegated node

The command is delegated on the first monitor so we must use the fact
`container_binary` from this node.
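
A sketch of the idea, assuming the monitor group name is held in
`mon_group_name` and that the mon container is named after the node's
short hostname; the command itself is only an example:

```yaml
- name: run a ceph command on the first monitor (sketch)
  command: >
    {{ hostvars[groups[mon_group_name][0]]['container_binary'] }} exec
    ceph-mon-{{ hostvars[groups[mon_group_name][0]]['ansible_hostname'] }}
    ceph --cluster ceph fs dump -f json
  # the fact is read from the node the command actually runs on
  delegate_to: "{{ groups[mon_group_name][0] }}"
  changed_when: false
```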

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 77b39d235b9b713b7e814296164db27b4d428ae0)

5 years agofacts: use correct python interpreter
Guillaume Abrioux [Wed, 8 Jan 2020 13:14:41 +0000 (14:14 +0100)]
facts: use correct python interpreter

that task is delegated on the first mon so we should always use the
`discovered_interpreter_python` from that node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5adb735c78767545993192c67cf12b9e03f42138)

5 years agoshrink-mds: fix filesystem removal task
Guillaume Abrioux [Fri, 3 Jan 2020 15:02:48 +0000 (16:02 +0100)]
shrink-mds: fix filesystem removal task

This commit deletes the filesystem when no more MDS is present after
shrinking operation.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787543
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 38278a6bb5eb2ec4186984f2006094fe7e36dd79)

5 years agoshrink-mds: ensure max_mds is always honored
Guillaume Abrioux [Fri, 3 Jan 2020 14:56:43 +0000 (15:56 +0100)]
shrink-mds: ensure max_mds is always honored

This commit prevents shrinking an mds node when max_mds wouldn't be
honored after that operation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2cfe5a04bfcb4aae6b7842621dfacab90bdcc7c3)

5 years agoceph_volume: support filestore to bluestore migration
Guillaume Abrioux [Tue, 7 Jan 2020 15:29:48 +0000 (16:29 +0100)]
ceph_volume: support filestore to bluestore migration

This commit adds the filestore to bluestore migration support in
ceph_volume module.

We must append to the executed command only the relevant options
according to what is passed in `osd_objectstore`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit aabba3baab50fe4fb86535cd8838edc4af87d917)

5 years agofilestore-to-bluestore: ensure all dm are closed 4885/head v4.0.6
Guillaume Abrioux [Tue, 10 Dec 2019 22:04:57 +0000 (23:04 +0100)]
filestore-to-bluestore: ensure all dm are closed

This commit adds a task to ensure device mappers are properly closed when
the lvm batch scenario is used.
Otherwise, OSDs can't be redeployed because the devices are rejected
by ceph-volume as they are locked.

Adding a condition `devices | default([]) | length > 0` to remove these
dm only when using lvm batch scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8e6ef818a287e8bf139420142493843077ea3851)

5 years agofilestore-to-bluestore: force OSDs to be marked down
Guillaume Abrioux [Tue, 10 Dec 2019 22:03:40 +0000 (23:03 +0100)]
filestore-to-bluestore: force OSDs to be marked down

Otherwise, sometimes it can take a while for an OSD to be seen as down
and causes the `ceph osd purge` command to fail.
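
A simplified, non containerized sketch of the ordering; `osd_ids` is a
hypothetical list of OSD ids and the cluster name `ceph` is assumed:

```yaml
- name: mark the osds down explicitly
  command: "ceph --cluster ceph osd down {{ item }}"
  loop: "{{ osd_ids }}"
  changed_when: false

- name: purge the osds once they are seen as down
  command: "ceph --cluster ceph osd purge {{ item }} --yes-i-really-mean-it"
  loop: "{{ osd_ids }}"
```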

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 51d601193ee2050a002dafd29005019e26e2a804)

5 years agofilestore-to-bluestore: do not use --destroy
Guillaume Abrioux [Tue, 10 Dec 2019 14:59:50 +0000 (15:59 +0100)]
filestore-to-bluestore: do not use --destroy

Do not use `--destroy` when zapping a device.
Otherwise, it destroys VGs while they are still needed to redeploy the
OSDs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e3305e6bb655aa64f936687097cbcb6fc62f43cb)

5 years agoceph_volume: add destroy option support
Guillaume Abrioux [Tue, 10 Dec 2019 14:57:42 +0000 (15:57 +0100)]
ceph_volume: add destroy option support

The zap action from ceph_volume module always implies `--destroy`.
This commit adds the destroy option support so we can ask ceph-volume to
not use `--destroy` when zapping a device.
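
Usage would look roughly like this. Only `destroy` comes from this
commit; the `action` and `data` parameter names and the device path are
assumptions:

```yaml
- name: zap a device but keep its VGs (sketch)
  ceph_volume:
    action: zap
    data: /dev/sdb     # hypothetical device
    destroy: false     # do not pass --destroy to ceph-volume
```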

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0dcacdbed09ef78350c03112021cae885ff47ba9)

5 years agofilestore-to-bluestore: add non containerized support
Guillaume Abrioux [Tue, 10 Dec 2019 10:07:30 +0000 (11:07 +0100)]
filestore-to-bluestore: add non containerized support

This commit adds the non containerized context support to the
filestore-to-bluestore.yml infrastructure playbook.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4833b85e0428349f99c68a94f5f90867799ae224)

5 years agotests: add filestore_to_bluestore job
Guillaume Abrioux [Tue, 10 Dec 2019 13:37:47 +0000 (14:37 +0100)]
tests: add filestore_to_bluestore job

This commit adds a new job in order to test the
filestore-to-bluestore.yml infrastructure playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 40de34fb5e42cffa43e59afe1cacc4d5675287b0)

5 years agoansible.cfg: do not enforce PreferredAuthentications
Guillaume Abrioux [Mon, 9 Dec 2019 16:10:11 +0000 (17:10 +0100)]
ansible.cfg: do not enforce PreferredAuthentications

There's no need to enforce PreferredAuthentications by default.
Users can still choose to override the ansible.cfg with any additional
parameter like this one to fit their infrastructure.

Fixes: #4826
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d682412e2aa5eeb411cc0dff9a3ffef4b4aa8683)

5 years agodefaults: change default value for dashboard_admin_password
Guillaume Abrioux [Thu, 5 Dec 2019 14:21:41 +0000 (15:21 +0100)]
defaults: change default value for dashboard_admin_password

A recent change in ceph/ceph prevents having the username in the
password:

`Error EINVAL: Password cannot contain username.`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0756fa467d2bdbe6b758b07b89eabc0a8b00e715)

5 years agoupdate: restart iscsigws daemons after upgrade
Guillaume Abrioux [Thu, 5 Dec 2019 10:06:06 +0000 (11:06 +0100)]
update: restart iscsigws daemons after upgrade

In containerized context, containers aren't stopped early in the
sequence.
It means they aren't restarted after the upgrade because the task is
just checking the daemon status is started (eg: `state: started`).

This commit also removes the task which ensures services are started
because it's already done in the role ceph-iscsigw.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c7708eb4582f65717599d2d7bee7a7a5885ecc2b)

5 years agoupgrade: add dashboard deployment
Guillaume Abrioux [Wed, 4 Dec 2019 16:17:36 +0000 (17:17 +0100)]
upgrade: add dashboard deployment

when upgrading from RHCS 3, dashboard has obviously never been deployed
and it forces us to deploy it later manually.
This commit adds the dashboard deployment as part of the upgrade to
RHCS 4.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1779092
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 451c5ca93499fc0520b60729a170f0aa6f6132db)

5 years agodefaults: add a comment
Guillaume Abrioux [Mon, 9 Dec 2019 17:31:52 +0000 (18:31 +0100)]
defaults: add a comment

This commit isolates and adds an explicit comment about variables not
intended to be modified by the user.

Fixes: #4828
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a234338eff4b4e35e1c0504ae2ee005e4ca76cf7)

5 years agodashboard: run node_export as privileged container
Guillaume Abrioux [Tue, 3 Dec 2019 13:39:53 +0000 (14:39 +0100)]
dashboard: run node_export as privileged container

Typical error:

```
type=AVC msg=audit(1575367499.582:3210): avc:  denied  { search } for  pid=26680 comm="node_exporter" name="1" dev="proc" ino=11528 scontext=system_u:system_r:container_t:s0:c100,c1014 tcontext=system_u:system_r:init_t:s0 tclass=dir permissive=0
```

node_exporter needs to be run as privileged to avoid AVC denied errors
since it gathers a lot of information on the host.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762168
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d245eb7e7d9453af04e141ed0abd3fbdef1e563c)

5 years agoceph-defaults: exclude md devices from discovery
Dimitri Savineau [Wed, 4 Dec 2019 17:32:49 +0000 (12:32 -0500)]
ceph-defaults: exclude md devices from discovery

The md devices (RAID software) aren't excluded from the devices list in
the auto discovery scenario.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764601
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 014f51c2a42e4922b43da07b97c4b810ede32200)

5 years agofacts: avoid duplicated element in devices list
Guillaume Abrioux [Wed, 20 Nov 2019 10:02:49 +0000 (11:02 +0100)]
facts: avoid duplicated element in devices list

When using `osd_auto_discovery`, `devices` is built multiple times due
to multiple runs of `ceph-facts` role. It ends up with duplicate
instances of the same device in the list.

Using `unique` filter when building the list fixes this issue.
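
A minimal sketch of the idea; `discovered_devices` is a hypothetical
list standing in for whatever the discovery loop iterates over:

```yaml
- name: add discovered devices to the devices list (sketch)
  set_fact:
    # unique keeps a single instance of each device, even if this
    # task runs several times over the same input
    devices: "{{ (devices | default([]) + [item]) | unique }}"
  loop: "{{ discovered_devices }}"
```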

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 23b1f43897db0a03ef94f51e83ed3c562c4584d0)

5 years agopurge-cluster: add podman support
Dimitri Savineau [Wed, 4 Dec 2019 15:10:08 +0000 (10:10 -0500)]
purge-cluster: add podman support

The podman support was added to the purge-container-cluster playbook but
containers are always used for the dashboard even on non containerized
deployment.
This commit adds the podman support on purging the dashboard resources
in the purge-cluster playbook.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 89f6cc54a2f6310541eb243390a696cb914b0c7a)

5 years agotests: reduce max_mds from 3 to 2
Dimitri Savineau [Wed, 4 Dec 2019 17:12:05 +0000 (12:12 -0500)]
tests: reduce max_mds from 3 to 2

Having a max_mds value equal to the number of mds nodes generates a
warning in the ceph cluster status:

cluster:
id:     6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a
health: HEALTH_WARN
        insufficient standby MDS daemons available
(...)
services:
  mds:     cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active}

Let's use 2 active and 1 standby mds.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4a6d19dae296969954e5101e9bd53443fddde03d)

5 years agopurge: rename playbook (container)
Guillaume Abrioux [Tue, 3 Dec 2019 14:48:59 +0000 (15:48 +0100)]
purge: rename playbook (container)

Since we now support podman, let's rename the playbook so it's more
generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7bc7e3669d26fb41d096bdf50e2301ee782a428b)

5 years agoadd-{mon,osd}: run raw install python tasks
Dimitri Savineau [Mon, 4 Nov 2019 14:04:48 +0000 (09:04 -0500)]
add-{mon,osd}: run raw install python tasks

If the new mon/osd node doesn't have python installed then we need to
execute the tasks from raw_install_python.yml.

Closes: #4368
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 34b03d1873f6a5fba8baddbf61b08b50a224e555)

5 years agoceph-grafana: remove ipv6 brakets on wait_for
Dimitri Savineau [Mon, 25 Nov 2019 20:58:27 +0000 (15:58 -0500)]
ceph-grafana: remove ipv6 brakets on wait_for

The wait_for ansible module doesn't support brackets around IPv6
addresses, so we need to remove them.
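
One way to do that, shown as a sketch; the `grafana_server_addr`
variable name and port 3000 are assumptions:

```yaml
- name: wait for grafana to start (sketch)
  wait_for:
    # "[2001:db8::10]" becomes "2001:db8::10"; ipv4 is left untouched
    host: "{{ grafana_server_addr | replace('[', '') | replace(']', '') }}"
    port: 3000
```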

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769710
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 55adc10be31f313ac97b3b3c3c9ea7f17181922d)

5 years agoceph-defaults: pin prometheus container tags
Dimitri Savineau [Mon, 4 Nov 2019 15:10:26 +0000 (10:10 -0500)]
ceph-defaults: pin prometheus container tags

In addition to the grafana container tag change, we need to do the same
for the prometheus container stack based on the release present in the
OSE 4.1 container image.

$ docker run --rm openshift4/ose-prometheus-node-exporter:v4.1 --version
node_exporter, version 0.17.0
  build user:       root@67fee13ed48f
  build date:       20191023-14:38:12
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus-alertmanager:4.1 --version
alertmanager, version 0.16.2
  build user:       root@70b79a3f29b6
  build date:       20191023-14:57:30
  go version:       go1.11.13
$ docker run --rm openshift4/ose-prometheus:4.1 --version
prometheus, version 2.7.2
  build user:       root@12da054778a3
  build date:       20191023-14:39:36
  go version:       go1.11.13

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3e29b8d5ffcfe4d0c74f44d6d974bc770e21e828)

5 years agoswitch_to_containers: fix umount ceph partitions
Dimitri Savineau [Wed, 27 Nov 2019 16:27:09 +0000 (11:27 -0500)]
switch_to_containers: fix umount ceph partitions

When a container is already running on a non containerized node then the
umount ceph partition task is skipped.
This is due to the container ps command which always returns 0 even if
the filter matches nothing.

We should run the umount task when:
1/ the container command is failing (not installed) : rc != 0
2/ the container command reports running ceph-osd containers : rc == 0

Also we should not fail on the ceph directory listing.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 39cfe0aa65ddd96458ba9d0a031d801efbb0d394)

5 years agopurge: do not try to stop docker when binary is podman
Guillaume Abrioux [Tue, 26 Nov 2019 15:18:28 +0000 (16:18 +0100)]
purge: do not try to stop docker when binary is podman

If the container binary is podman, we shouldn't try to stop docker here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b18476a1a6ff23550ca3b36864771f7afbfaf373)

5 years agofacts: isolate container_binary facts
Guillaume Abrioux [Tue, 26 Nov 2019 15:10:17 +0000 (16:10 +0100)]
facts: isolate container_binary facts

in order to be able to call container_binary without having to run the
whole ceph-facts role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fe5ffe589e3cf19d1b03d73f3e2867a870c86192)

5 years agopurge: remove docker_* task
Guillaume Abrioux [Tue, 26 Nov 2019 14:26:35 +0000 (15:26 +0100)]
purge: remove docker_* task

All containers are removed when systemd stops them.
There is no need to call this module in purge container playbook.

This commit also removes all docker_image task and remove all container
images in the final cleanup play.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1776736
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d23383a820ed23e675af28e80b161e17d17fe40d)

5 years agodashboard: use fqdn url for active alert
Guillaume Abrioux [Mon, 2 Dec 2019 13:31:41 +0000 (14:31 +0100)]
dashboard: use fqdn url for active alert

When using the shortname, the URL for active alert launches with short
hostname and fails to connect to the server.

This commit changes the template in order to use the fqdn.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a8d76d72d71c8440003ac1ac78fd187f7b7a65e6)

5 years agodashboard: only print dashboard url of the grafana-server node
Guillaume Abrioux [Tue, 26 Nov 2019 09:59:29 +0000 (10:59 +0100)]
dashboard: only print dashboard url of the grafana-server node

This commit makes the ceph-dashboard role print the ceph-dashboard URL
only for the nodes present in the grafana-server group.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762163
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cc0c1ce30154118670d7e9b19e9e092b22dbfc2c)

5 years agodocker2podman: import ceph-handler role
Guillaume Abrioux [Mon, 2 Dec 2019 08:47:21 +0000 (09:47 +0100)]
docker2podman: import ceph-handler role

This is needed to avoid following error:

```
ERROR! The requested handler 'restart ceph mons' was not found in either the main handlers list nor in the listening handlers list
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a43a8721050aa858fec30da7ab8845e2ae845659)

5 years agodocker2podman: do not hardcode group name
Guillaume Abrioux [Thu, 28 Nov 2019 14:12:59 +0000 (15:12 +0100)]
docker2podman: do not hardcode group name

let's use `client_group_name` instead of hardcoding the name.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7fe0d55efff1b3ec9571ba3f6881d68cfa66ffed)

5 years agodocker2podman: import ceph-defaults in first play
Guillaume Abrioux [Thu, 28 Nov 2019 13:01:13 +0000 (14:01 +0100)]
docker2podman: import ceph-defaults in first play

We must import this role in the first play otherwise the first call to
`client_group_name` fails.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6526a25ab52c12841403191539297ff8adc9cacd)

5 years agotests: revert vagrant_variable file name detection
Guillaume Abrioux [Mon, 25 Nov 2019 09:03:08 +0000 (10:03 +0100)]
tests: revert vagrant_variable file name detection

This commit reverts the following change:

https://github.com/ceph/ceph-ansible/pull/4510/commits/fcf181342a70b78a355d1c985699028012326b5f#diff-23b6f443c01ea2efcb4f36eedfea9089R7-R14

this is causing CI failures so this commit is intended to unlock the CI.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5353ab8a23aee92ebab8146eeeeffcfcb25c0865)

5 years agoFixes failure of cephfs configuration using --limit
VasishtaShastry [Mon, 18 Nov 2019 09:49:17 +0000 (15:19 +0530)]
Fixes failure of cephfs configuration using --limit
Configuration of cephfs with an existing cluster using --limit used to fail
at different tasks while running with site-docker.yml
This commit addresses both of those tasks

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1773489
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 72c43cc5d9052455f37414f7cd4fba1e37f99ef0)

5 years agocontainer: add always tag on gather fact tasks
Dimitri Savineau [Thu, 14 Nov 2019 14:29:29 +0000 (09:29 -0500)]
container: add always tag on gather fact tasks

If we execute the site-container.yml playbook with specific tags (like
ceph_update_config) then we need to be sure to gather the facts, otherwise
we will see an error like:

The task includes an option with an undefined variable. The error was:
'ansible_hostname' is undefined

This commit also adds missing 'gather_facts: false' to mons plays.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1754432
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d7fd769b6ddcb63086cc414e00cce31433d56673)

5 years agoEvades validation of ceph_repository_type in containerized scenario
VasishtaShastry [Thu, 7 Nov 2019 12:00:21 +0000 (17:30 +0530)]
Evades validation of ceph_repository_type in containerized scenario
This will prevent failure of site-docker.yml with configs in doc.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769760
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9a1f1626c3e57e64bdcd8d37ae600c21f3ea2a24)

5 years agoceph_key: restore file mode after a key is fetched
Guillaume Abrioux [Thu, 14 Nov 2019 09:30:34 +0000 (10:30 +0100)]
ceph_key: restore file mode after a key is fetched

When `import_key` is enabled and the key already exists, it is only
fetched using the ceph CLI. If the mode specified in the `ceph_key` task
differs from the one applied by the ceph CLI, the mode isn't restored
because we don't call `module.set_fs_attributes_if_different()` before
`module.exit_json(**result)`.
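
As a rough illustration of the ordering issue described above, here is a
minimal standard-library sketch. It is not the `ceph_key` module code:
`restore_mode_if_different` and `fetch_existing_key` are hypothetical helpers
that only mimic "reapply the requested mode before returning the result".

```python
import os
import stat


def restore_mode_if_different(path, wanted_mode):
    """Reapply wanted_mode to path; return True when a change was made."""
    current_mode = stat.S_IMODE(os.stat(path).st_mode)
    if current_mode == wanted_mode:
        return False
    os.chmod(path, wanted_mode)
    return True


def fetch_existing_key(path, wanted_mode, result):
    """Hypothetical 'key already exists' branch of a key-management module."""
    # The key would be fetched with the ceph CLI here, which may rewrite
    # the file with its own mode (assumption, not shown).
    # Restore the requested mode *before* handing the result back.
    mode_changed = restore_mode_if_different(path, wanted_mode)
    result["changed"] = mode_changed or result.get("changed", False)
    return result
```

The point of the sketch is only the order of operations: the mode check runs
on every code path that returns, including the "already exists" one.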

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1734513
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b717b5f736448903c69882392e0691fba60893aa)

5 years agotests: add coverage on purge playbook
Guillaume Abrioux [Thu, 7 Nov 2019 12:39:25 +0000 (13:39 +0100)]
tests: add coverage on purge playbook

This commit adds a playbook to be played before we run the purge playbook.
It first creates an rbd image, then maps an rbd device on client0 so the
purge playbook will try to unmap it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit db77fbda15bf9f79f8122559b01b6625005ae29c)

5 years agopurge: use sysfs to unmap rbd devices
Guillaume Abrioux [Mon, 4 Nov 2019 14:59:39 +0000 (15:59 +0100)]
purge: use sysfs to unmap rbd devices

In a containerized context, using the binary provided in atomic os won't
work because it's an old version provided by ceph-common based on
10.2.5.
Using a container could be an idea, but for large clusters with hundreds
of client nodes, that would require pulling the image on each of them
just to unmap the rbd devices.

Let's use the sysfs method in order to avoid any issue related to ceph
version that is shipped on the host.
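
A minimal sketch of the sysfs approach, assuming the kernel rbd sysfs layout
(`devices/`, `remove`, `remove_single_major` under `/sys/bus/rbd`). The
function name and the `sysfs_rbd` parameter are hypothetical, and this is not
necessarily how the purge playbook implements it.

```python
import os


def unmap_rbd_devices(sysfs_rbd="/sys/bus/rbd"):
    """Unmap every mapped rbd device through sysfs; return the ids written."""
    devices_dir = os.path.join(sysfs_rbd, "devices")
    if not os.path.isdir(devices_dir):
        return []  # rbd kernel module not loaded, nothing is mapped

    # Prefer remove_single_major when the kernel exposes it (assumption),
    # and fall back to the plain remove file otherwise.
    remove_file = os.path.join(sysfs_rbd, "remove_single_major")
    if not os.path.exists(remove_file):
        remove_file = os.path.join(sysfs_rbd, "remove")

    removed = []
    for dev_id in sorted(os.listdir(devices_dir), key=int):
        # Writing a device id to the remove file asks the kernel to unmap it.
        with open(remove_file, "w") as handle:
            handle.write(dev_id)
        removed.append(dev_id)
    return removed
```

Because nothing here depends on the `rbd` binary, the ceph version shipped on
the host does not matter.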

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766064
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3cfcc7a105156dfde65b23e9d8662cd848537094)

5 years agomergify: remove mergify config on stable-4.0
Guillaume Abrioux [Thu, 7 Nov 2019 20:11:26 +0000 (21:11 +0100)]
mergify: remove mergify config on stable-4.0

This commit removes the mergify config on stable-4.0

At the moment there is no need to have a mergify config on this branch
given that we don't use it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

5 years agoceph-osd: fix fs.aio-max-nr sysctl condition
Dimitri Savineau [Wed, 6 Nov 2019 15:15:53 +0000 (10:15 -0500)]
ceph-osd: fix fs.aio-max-nr sysctl condition

[1] introduced a regression on the fs.aio-max-nr sysctl value condition.
The enable key isn't a boolean but a string because the expression isn't
evaluated.
This string output "(osd_objectstore == 'bluestore')" is always true
because the item.enable condition only matches non-empty strings. So the
sysctl value was applied for both filestore and bluestore backends.

[2] added the bool filter to the condition, but the filter always returns
false on this string and the sysctl wasn't applied at all.

This commit fixes the enable key value by evaluating the value instead
of using the string.
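
The two failure modes can be reproduced in plain Python. This is only an
analogy: `to_bool` is a simplified stand-in for a bool filter (an assumption,
not the Ansible implementation), and the variable names are illustrative.

```python
def to_bool(value):
    """Simplified stand-in for a bool filter: only known truthy tokens pass."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, str):
        value = value.lower()
    return value in ("yes", "on", "1", "true", 1)


osd_objectstore = "filestore"

# Regression from [1]: the expression is never evaluated, so this is a string.
enable_as_string = "(osd_objectstore == 'bluestore')"
# Fix: evaluate the expression so the value is a real boolean.
enable_evaluated = (osd_objectstore == "bluestore")

print(bool(enable_as_string))     # truthiness of a non-empty string
print(to_bool(enable_as_string))  # bool-filter-like conversion of that string
print(enable_evaluated)           # result of the evaluated expression
```

Only the evaluated form follows the objectstore. The string form gives the
same answer whatever the backend is.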

[1] https://github.com/ceph/ceph-ansible/commit/08a2b58
[2] https://github.com/ceph/ceph-ansible/commit/ab54fe2

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ece46d33be566994d6ce799fdc4547299b352429)

5 years agotests/requirements: bump testinfra and pytest
Dimitri Savineau [Fri, 1 Nov 2019 14:25:36 +0000 (10:25 -0400)]
tests/requirements: bump testinfra and pytest

Starting with testinfra 3.1, the ansible ssh connections use the ssh
backend instead of paramiko, along with persistent connections.
pytest 4.6 is the last release to support python 2.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 02df2ab5ea37ab7d9cd42b1a4d324515cb503677)

5 years agoceph-defaults: pin grafana container tag to 5.2.4 v4.0.5
Dimitri Savineau [Thu, 31 Oct 2019 15:37:20 +0000 (11:37 -0400)]
ceph-defaults: pin grafana container tag to 5.2.4

The latest grafana container tag uses the grafana 6.x release, which could
cause issues with the ceph dashboard integration.
Since the grafana container in RHCS 3 is based on 5.x, we should use the
same version.

$ docker run --rm rhceph/rhceph-3-dashboard-rhel7:3 -v
Version 5.2.4 (commit: unknown-dev)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2037fb87b63ef0c6f9f70ae43af3a0fc82ff7d1a)

5 years agoceph-osd: Remove ulimit nofile on container start
Dimitri Savineau [Wed, 30 Oct 2019 15:45:44 +0000 (11:45 -0400)]
ceph-osd: Remove ulimit nofile on container start

Even though this improves ceph-disk/ceph-volume performance, it also
impacts the ceph-osd process.
The ceph-osd process shouldn't use the 1024:4096 value for the max open
files.
Remove the ulimit option from the container engine and make this kind
of change on the container side [1].

[1] https://github.com/ceph/ceph-container/pull/1497

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9a996aef7f79d5018e6999362fd025e9c04c9b3f)

5 years agoSet grafana-server user and password in ceph-dashboard role
fmount [Thu, 31 Oct 2019 09:49:22 +0000 (10:49 +0100)]
Set grafana-server user and password in ceph-dashboard role

This change adds two tasks to set grafana-api user and password
that are required to inject dashboard layouts to the external
grafana instance.
Without these two parameters the ceph-ansible playbook fails
with an authorization error ("HTTPError: 401 Client Error:
Unauthorized").

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1767365
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 41b8c17356fa1273761c3d864f959fbcb11813e7)

5 years agoceph-mon: use --admin-daemon to set default crush rule
Mihai Plasoianu [Mon, 28 Oct 2019 15:30:39 +0000 (16:30 +0100)]
ceph-mon: use --admin-daemon to set default crush rule

Signed-off-by: Mihai Plasoianu <m.plasoianu@vertical.de>
(cherry picked from commit d3f67d63aebd31133310a39b4e42a16d064397da)

5 years agoupdate: add default values when setting fact
Guillaume Abrioux [Tue, 29 Oct 2019 17:01:50 +0000 (18:01 +0100)]
update: add default values when setting fact

This commit adds a default value in the `with_dict` because when using
python 2.7, if a task using a `with_dict` has a condition, it is
evaluated anyway whereas in python 3 it isn't.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766499
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e9823f319ba8deb2f653c51868eda4566ebff609)

5 years agorolling_update: remove default filter on mds group
Dimitri Savineau [Fri, 25 Oct 2019 21:03:46 +0000 (17:03 -0400)]
rolling_update: remove default filter on mds group

There's no need to use the default filter on active/standby groups
because if the group doesn't exist then the play is just skipped.

Currently this generates warnings like:

[WARNING]: Could not match supplied host pattern, ignoring: |
[WARNING]: Could not match supplied host pattern, ignoring: default([])

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2ca79fcc99bcff6f73478f11e67ba7edb178b029)

5 years agorolling_update: fix active mds host value
Dimitri Savineau [Fri, 25 Oct 2019 20:47:50 +0000 (16:47 -0400)]
rolling_update: fix active mds host value

The active mds host should be based on the inventory hostname and not on
the ansible hostname.
The value returned under the mdsmap structure is based on the OS hostname
so we need to find the right node in the inventory with this value when
doing operations on inventory nodes.
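
A small sketch of that lookup, using plain dictionaries in place of Ansible's
`hostvars`. The helper name and the sample inventory are hypothetical.

```python
def inventory_name_for(os_hostname, hostvars, group_hosts):
    """Return the inventory hostname whose gathered hostname matches."""
    for inventory_hostname in group_hosts:
        facts = hostvars.get(inventory_hostname, {})
        if facts.get("ansible_hostname") == os_hostname:
            return inventory_hostname
    return None


# Hypothetical inventory: inventory names differ from the OS hostnames.
hostvars = {
    "mds0": {"ansible_hostname": "node-a"},
    "mds1": {"ansible_hostname": "node-b"},
}

# "node-b" stands for the name reported under the mdsmap structure.
active_mds = inventory_name_for("node-b", hostvars, ["mds0", "mds1"])
print(active_mds)
```

Indexing `hostvars` directly with the OS hostname would miss here, since
`node-b` is not an inventory key.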

Without that lookup, we could see an error like:

The task includes an option with an undefined variable. The error was:
"hostvars[foobar]" is undefined

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f1f2352c7974f0839b5e74cb23849e943a1131c6)

5 years agoipaddrs_in_ranges: fix python indent
Dimitri Savineau [Fri, 25 Oct 2019 19:52:27 +0000 (15:52 -0400)]
ipaddrs_in_ranges: fix python indent

pycodestyle returns:

 E111 indentation is not a multiple of four

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8a0a13f67a56bd3c1eb1078892456cdb692b0fcb)

5 years agomove library/plugins tests files under tests dir
Dimitri Savineau [Fri, 25 Oct 2019 19:47:05 +0000 (15:47 -0400)]
move library/plugins tests files under tests dir

To avoid unnecessary ansible warnings during playbook execution we can
move the library and plugins test files under a different directory.

[WARNING]: Skipping plugin (plugins/filter/test_ipaddrs_in_ranges.py) as
it seems to be invalid:
cannot import name 'ipaddrs_in_ranges'

Closes: #4656
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 6ce4fde82074c82e762fd7c56f2b5901f0705d09)

5 years agorolling_update: fix reset mon_host variable
Dimitri Savineau [Fri, 25 Oct 2019 17:36:07 +0000 (13:36 -0400)]
rolling_update: fix reset mon_host variable

mon_host should use the inventory hostname and not the node hostname.
Using the node hostname causes an issue when the inventory and node
hostnames are different.

Closes: #4670
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 650bc0c3f0598feb0a6d9b0f7688b773836819a2)

5 years agoadd-mon: add missing become flag
Dimitri Savineau [Fri, 25 Oct 2019 15:09:32 +0000 (11:09 -0400)]
add-mon: add missing become flag

Without the become flag set to true, we can't execute the roles
successfully.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 77b212833e22cb6584ad020aa9bd0e477c50b461)

5 years agoupdate: use right node when creating active mds group v4.0.4
Guillaume Abrioux [Thu, 24 Oct 2019 07:41:06 +0000 (09:41 +0200)]
update: use right node when creating active mds group

This must be consistent with what is used in `name` parameter.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d06057ebd2f5730dea67e687ad596c98f1e5fad8)

5 years agoupdate: avoid skipping single mds deployment upgrade
Guillaume Abrioux [Wed, 23 Oct 2019 17:39:15 +0000 (19:39 +0200)]
update: avoid skipping single mds deployment upgrade

otherwise a single MDS would never be updated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d8ab11d2f8e08c28731f783d592507dd198b5742)

5 years agoupdate: skip mds deactivation when no mds in inventory
Guillaume Abrioux [Wed, 23 Oct 2019 13:48:32 +0000 (15:48 +0200)]
update: skip mds deactivation when no mds in inventory

Let's skip this part of the code if there's no mds node in the
inventory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5ec906c3af2d188de23cc354ecb9ddcfc0af9d90)

5 years agoadd-{mon,osd}: add ceph-container-engine role
Dimitri Savineau [Thu, 24 Oct 2019 14:02:10 +0000 (10:02 -0400)]
add-{mon,osd}: add ceph-container-engine role

The ceph-container-engine role is missing from both playbooks so the
container engine (docker, podman) isn't installed, resulting in a failure
on the added nodes.

fatal: [xxxxx]: FAILED! => changed=false
  cmd: docker --version
  msg: '[Errno 2] No such file or directory'
  rc: 2

Closes: #4634
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit bfb1d6be12541bb0fff4db07a6389fc9ea24ac51)

5 years agodefaults: add user/pass auth registry variables
Dimitri Savineau [Thu, 24 Oct 2019 15:07:20 +0000 (11:07 -0400)]
defaults: add user/pass auth registry variables

Add ceph_docker_registry_username and ceph_docker_registry_password
variables in ceph-defaults role so they will be present in the group_vars
samples but commented.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1763139
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b33c476f16a17abaedc5f4a31e37fbed121b968e)

5 years agotests: use osd ids instead of device name in ooo_collocation
Guillaume Abrioux [Tue, 22 Oct 2019 11:27:20 +0000 (13:27 +0200)]
tests: use osd ids instead of device name in ooo_collocation

on master, it doesn't make sense anymore to use device name, we should
use osd id instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b5a61fe2e3b4a509bb80536b634543c5e7a44d07)

5 years agotests: fix keyring creation in ooo_collocation
Guillaume Abrioux [Tue, 22 Oct 2019 07:30:37 +0000 (09:30 +0200)]
tests: fix keyring creation in ooo_collocation

This commit removes the backslash in allow command parameter, this was
needed before the ceph_key module integration.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 384161edcd10730a1354641418d74acb4d3f82bc)

5 years agotests: update container tag for ooo_collocation
Dimitri Savineau [Wed, 28 Aug 2019 14:49:54 +0000 (10:49 -0400)]
tests: update container tag for ooo_collocation

It doesn't make sense to test the old 3.0.x container images with
nautilus+ ceph releases.
Also disable the dashboard deployment and switch to bluestore backend.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3c2840da03ac7bd131f60c9550e6e890b2abeffd)