ceph-ansible.git log (git.apps.os.sepia.ceph.com)
Guillaume Abrioux [Tue, 25 Jun 2019 15:11:28 +0000 (17:11 +0200)]
nfs: add missing | bool filters

To address this warning:
```
[DEPRECATION WARNING]: evaluating nfs_ganesha_dev as a bare variable, this
behaviour will go away and you might need to add |bool to the expression in the
 future
```
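The fix is mechanical; here is a minimal before/after sketch on a hypothetical task (the task itself is illustrative, only the variable name comes from the warning):

```yaml
# Before: bare variable in a conditional (triggers the deprecation warning)
- name: install nfs-ganesha development packages
  package:
    name: nfs-ganesha
  when: nfs_ganesha_dev

# After: explicit boolean cast
- name: install nfs-ganesha development packages
  package:
    name: nfs-ganesha
  when: nfs_ganesha_dev | bool
```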

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Tue, 25 Jun 2019 08:23:57 +0000 (10:23 +0200)]
nfs: remove duplicate task

This task is already present in pre_requisite_non_container.yml

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Tue, 25 Jun 2019 05:44:43 +0000 (07:44 +0200)]
tests: test nfs-ganesha deployment

Add back the nfs-ganesha deployment testing which was removed because of
broken dependencies.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Gabriel Ramirez [Tue, 25 Jun 2019 04:52:11 +0000 (21:52 -0700)]
validate.py: Fix alphabetical order on uca

Alphabetized the ceph_repository_uca keys due to errors validating when
using the UCA/queens repository on Ubuntu 16.04.

An exception occurred during task execution. To see the full
traceback, use -vvv. The error was:
SchemaError: -> ceph_stable_repo_uca  schema item is not
alphabetically ordered

Closes: #4154
Signed-off-by: Gabriel Ramirez <gabrielramirez1109@gmail.com>
Guillaume Abrioux [Fri, 21 Jun 2019 16:16:51 +0000 (18:16 +0200)]
tests: deploy nfs-ganesha in container-all_daemons

This commit brings back the nfs-ganesha testing in containerized
deployment.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Fri, 21 Jun 2019 14:10:16 +0000 (16:10 +0200)]
purge: ensure no ceph kernel thread is present

This tries to first unmount any cephfs/nfs-ganesha mount points on client
nodes, then unmap any mapped rbd devices, and finally tries to remove the
ceph kernel modules.
If it fails, it means some resources are still busy and should be cleaned
up manually before continuing to purge the cluster.
This is done early in the playbook so the cluster stays untouched until
everything is ready for that operation; otherwise a redeployment could
get confused by leftovers from a previous deployment.
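The cleanup order described above could look roughly like this (a sketch only; the mount point and device are placeholders, not the playbook's actual values):

```yaml
- name: unmount cephfs/nfs-ganesha mount points on client nodes
  mount:
    path: /mnt/cephfs        # placeholder mount point
    state: unmounted

- name: unmap mapped rbd devices
  command: rbd unmap /dev/rbd0   # placeholder device
  failed_when: false

- name: remove ceph kernel modules, failing if they are still in use
  modprobe:
    name: "{{ item }}"
    state: absent
  loop:
    - rbd
    - ceph
    - libceph
```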

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1337915
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Thu, 20 Jun 2019 21:33:39 +0000 (17:33 -0400)]
ceph-handler: Fix OSD restart script

There are two big issues with the current OSD restart script.

1/ We try to test whether the ceph osd daemon socket exists, but we use a
wildcard for the socket name: /var/run/ceph/*.asok.
This fails because there are usually multiple ceph osd sockets (or sockets
of other collocated ceph daemons) present in the /var/run/ceph directory.
Currently the test fails with:

bash: line xxx: [: too many arguments

But it doesn't stop the script execution.
Instead we can specify the full ceph osd socket name because we already
know the OSD id.

2/ The container filter pattern is wrong and could match multiple
containers, causing the script to fail.
We use the filter with two different patterns: one with the device name
(sda, sdb, ..) and the other with the OSD id (ceph-osd-0,
ceph-osd-15, ..).
In both cases we could match more than needed.

$ docker container ls
CONTAINER ID IMAGE              NAMES
958121a7cc7d ceph-daemon:latest ceph-osd-strg0-sda
589a982d43b5 ceph-daemon:latest ceph-osd-strg0-sdb
46c7240d71f3 ceph-daemon:latest ceph-osd-strg0-sdaa
877985ec3aca ceph-daemon:latest ceph-osd-strg0-sdab
$ docker container ls -q -f "name=sda"
958121a7cc7d
46c7240d71f3
877985ec3aca

$ docker container ls
CONTAINER ID IMAGE              NAMES
2db399b3ee85 ceph-daemon:latest ceph-osd-5
099dc13f08f1 ceph-daemon:latest ceph-osd-13
5d0c2fe8f121 ceph-daemon:latest ceph-osd-17
d6c7b89db1d1 ceph-daemon:latest ceph-osd-1
$ docker container ls -q -f "name=ceph-osd-1"
099dc13f08f1
5d0c2fe8f121
d6c7b89db1d1

Adding an extra '$' character at the end of the pattern solves the
problem.
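Docker's name filter is a regular expression, so the effect of the trailing '$' can be reproduced with grep on the container names from the second example above:

```shell
# Container names from the second example
names="ceph-osd-5
ceph-osd-13
ceph-osd-17
ceph-osd-1"

# Unanchored: "ceph-osd-1" also matches ceph-osd-13 and ceph-osd-17
unanchored=$(echo "$names" | grep -c "ceph-osd-1")

# Anchored: only the exact name matches
anchored=$(echo "$names" | grep -c "ceph-osd-1$")

echo "$unanchored $anchored"   # prints: 3 1
```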

Finally, the get_container_osd_id function is removed because it isn't
used in the script at all.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 20 Jun 2019 18:04:24 +0000 (14:04 -0400)]
Change ansible_lsb by ansible_distribution_release

The ansible_lsb fact is based on the lsb package (lsb-base,
lsb-release or redhat-lsb-core).
If the package isn't installed on the remote host then the fact isn't
populated.

--------
"ansible_lsb": {},
--------

Switching to the ansible_distribution_release fact instead.
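A sketch of the kind of conditional affected (the task is illustrative):

```yaml
# Before: breaks when the lsb package is missing (ansible_lsb is {})
- name: do something on xenial
  debug:
    msg: "running on xenial"
  when: ansible_lsb.codename == 'xenial'

# After: ansible_distribution_release is always populated
- name: do something on xenial
  debug:
    msg: "running on xenial"
  when: ansible_distribution_release == 'xenial'
```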

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Wed, 19 Jun 2019 13:07:24 +0000 (15:07 +0200)]
upgrade: accept HEALTH_OK and HEALTH_WARN as valid state

3a100cfa5265b3a5327ef6a8d382a8059391b903 introduced a check that is a
bit too restrictive; let's accept both HEALTH_OK and HEALTH_WARN.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
fpantano [Wed, 19 Jun 2019 17:13:37 +0000 (19:13 +0200)]
Add higher retry/delay defaults to check the quorum status.

As per bz1718981, this commit adds higher retry/delay values to the
quorum status check. This is helpful for several OSP deployments
that fail during scale up.
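The general shape of such a check in Ansible, with hedged retry/delay values (the actual numbers and the `until` condition in the commit may differ):

```yaml
- name: wait for the monitors to form a quorum
  command: ceph --cluster {{ cluster }} quorum_status --format json
  register: quorum_status
  until: quorum_status.rc == 0
  retries: 40   # illustrative "higher" value
  delay: 10
  changed_when: false
```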

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1718981
Signed-off-by: fpantano <fpantano@redhat.com>
Dimitri Savineau [Thu, 20 Jun 2019 14:28:44 +0000 (10:28 -0400)]
ceph-volume: Set max open files limit on container

The ceph-volume lvm list command takes ages to complete when there are
a lot of LV devices on a containerized deployment.
For instance, with 25 OSDs on a node it takes 3 mins 44s to list the
OSDs.
Adding the max open files limit to the container engine cli when
executing the ceph-volume command improves the execution time a lot
(~30s).
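A sketch of the idea using docker's real --ulimit flag; image name, volumes and limit values are illustrative, not the role's actual ones:

```yaml
- name: list ceph-volume lvm OSDs
  command: >
    docker run --rm --privileged
    --ulimit nofile=1024:4096
    -v /dev:/dev -v /etc/ceph:/etc/ceph:z
    ceph/daemon:latest
    ceph-volume lvm list --format json
  changed_when: false
```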

This was impacting the OSDs creation with ceph-volume (both filestore
and bluestore) when using multiple LV devices.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1702285
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Thu, 20 Jun 2019 12:45:07 +0000 (14:45 +0200)]
facts: add a retry on get current fsid task

Sometimes the following task fails:

```
TASK [ceph-facts : get current fsid] *******************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-centos-container-update/roles/ceph-facts/tasks/facts.yml:78
Wednesday 19 June 2019  18:12:49 +0000 (0:00:00.203)       0:02:39.995 ********
fatal: [mon2 -> mon1]: FAILED! => changed=true
  cmd:
  - timeout
  - --foreground
  - -s
  - KILL
  - 600s
  - docker
  - exec
  - ceph-mon-mon1
  - ceph
  - --cluster
  - ceph
  - daemon
  - mon.mon1
  - config
  - get
  - fsid
  delta: '0:00:00.239339'
  end: '2019-06-19 18:12:49.812099'
  msg: non-zero return code
  rc: 22
  start: '2019-06-19 18:12:49.572760'
  stderr: 'admin_socket: exception getting command descriptions: [Errno 2] No such file or directory'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
```

It's not clear exactly why, since just before this task mon1 seems to be
well up, otherwise it wouldn't have passed the task `waiting for the
containerized monitor to join the quorum`.

As a quick fix/workaround, let's add a retry which allows us to get
around this situation:

```
TASK [ceph-facts : get current fsid] *******************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible-scenario/roles/ceph-facts/tasks/facts.yml:78
Thursday 20 June 2019  15:35:07 +0000 (0:00:00.201)       0:03:47.288 *********
FAILED - RETRYING: get current fsid (3 retries left).
changed: [mon2 -> mon1] => changed=true
  attempts: 2
  cmd:
  - timeout
  - --foreground
  - -s
  - KILL
  - 600s
  - docker
  - exec
  - ceph-mon-mon1
  - ceph
  - --cluster
  - ceph
  - daemon
  - mon.mon1
  - config
  - get
  - fsid
  delta: '0:00:00.290252'
  end: '2019-06-20 15:35:13.960188'
  rc: 0
  start: '2019-06-20 15:35:13.669936'
  stderr: ''
  stderr_lines: <omitted>
  stdout: |-
    {
        "fsid": "153e159d-7ade-42a7-842c-4d04348b901e"
    }
  stdout_lines: <omitted>
```
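The workaround boils down to the standard register/until/retries pattern; a sketch (the variable names here are illustrative, the real task in the role carries more context):

```yaml
- name: get current fsid
  command: "{{ container_exec_cmd }} ceph --cluster {{ cluster }} daemon mon.{{ monitor_name }} config get fsid"
  register: current_fsid
  until: current_fsid.rc == 0
  retries: 3
  delay: 5
  delegate_to: "{{ groups['mons'][0] }}"
```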

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Tue, 18 Jun 2019 17:50:11 +0000 (13:50 -0400)]
roles: Remove useless become (true) flag

We already set the become flag to true at the play level in the site*
playbooks, so we don't need to set it at the task level.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Tue, 11 Jun 2019 09:16:51 +0000 (11:16 +0200)]
osd: remove legacy task

`parted_results` isn't used anymore in the playbook.

Also, `parted` seems to cause an issue because it changes the
ownership of devices:

```
[root@osd0 ~]# ls -l /dev/sdc*
brw-rw----. 1 root disk 8, 32 Jun 11 08:53 /dev/sdc
brw-rw----. 1 ceph ceph 8, 33 Jun 11 08:53 /dev/sdc1
brw-rw----. 1 ceph ceph 8, 34 Jun 11 08:53 /dev/sdc2

[root@osd0 ~]# parted -s /dev/sdc print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdc: 53.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name           Flags
 1      1049kB  1075MB  1074MB               ceph block.db
 2      1075MB  2149MB  1074MB               ceph block.db

[root@osd0 ~]# #We can see ownerships have changed from ceph:ceph to root:disk:
[root@osd0 ~]# ls -l /dev/sdc*
brw-rw----. 1 root disk 8, 32 Jun 11 08:57 /dev/sdc
brw-rw----. 1 root disk 8, 33 Jun 11 08:57 /dev/sdc1
brw-rw----. 1 root disk 8, 34 Jun 11 08:57 /dev/sdc2
[root@osd0 ~]#
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Mon, 10 Jun 2019 14:26:18 +0000 (16:26 +0200)]
rolling_update: fail early if cluster state is not OK

Starting an upgrade when the cluster isn't HEALTH_OK isn't a good idea.
Let's check the cluster status before trying to upgrade.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Mon, 10 Jun 2019 13:18:43 +0000 (15:18 +0200)]
rolling_update: only mask and stop unit in mgr part

Otherwise it fails as follows:

```
fatal: [mon0]: FAILED! => changed=false
  msg: |-
    Unable to enable service ceph-mgr@mon0: Failed to execute operation: Cannot send after transport endpoint shutdown
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Mon, 17 Jun 2019 19:52:04 +0000 (15:52 -0400)]
Add installer phase for dashboard roles

This commit adds support for the installer phase to the dashboard,
grafana and node-exporter roles.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Mon, 17 Jun 2019 19:02:25 +0000 (15:02 -0400)]
remove ceph restapi references

The ceph restapi configuration was only available until the Luminous
release, so we don't need those leftovers for nautilus+.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Fri, 14 Jun 2019 13:27:11 +0000 (15:27 +0200)]
dashboard: fix hosts sections in main playbook

ceph-dashboard should be deployed on either a dedicated mgr node or a
mon if they are collocated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Rishabh Dave [Sat, 15 Jun 2019 12:07:13 +0000 (17:37 +0530)]
use pre_tasks and post_tasks in shrink-mon.yml too

This commit should've been part of commit
2fb12ae55462f5601a439a104a5b0c01929accd9.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
Dimitri Savineau [Fri, 14 Jun 2019 21:31:39 +0000 (17:31 -0400)]
tests: Update ansible ssh_args variable

Because we're using vagrant, an ssh config file is created for each
node with options like user, host, port, identity, etc.
But via tox we override ANSIBLE_SSH_ARGS to use this file, which
removes the default value set in ansible.cfg.

Also adding PreferredAuthentications=publickey because CentOS/RHEL
servers are configured with GSSAPIAuthentication enabled on the ssh
server, forcing the client to make a PTR DNS query.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Fri, 14 Jun 2019 09:45:29 +0000 (11:45 +0200)]
tests: increase docker pull timeout

CI is facing issues where docker pull reaches the timeout; let's
increase it to avoid CI failures.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Rishabh Dave [Wed, 12 Jun 2019 09:09:44 +0000 (14:39 +0530)]
ceph-infra: make chronyd default NTP daemon

Since timesyncd is not available on RHEL-based OSs, change the default
to chronyd for RHEL-based OSs. Also, chronyd is chrony on Ubuntu, so
set the Ansible fact accordingly.
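A minimal sketch of that fact logic (the fact name is illustrative; Ubuntu falls under the Debian OS family):

```yaml
- name: set chronyd daemon name
  set_fact:
    chrony_daemon_name: "{{ 'chrony' if ansible_os_family == 'Debian' else 'chronyd' }}"
```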

Fixes: https://github.com/ceph/ceph-ansible/issues/3628
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Rishabh Dave [Thu, 13 Jun 2019 08:06:00 +0000 (13:36 +0530)]
ceph-infra: update cache for Ubuntu

Ubuntu-based CI jobs often fail with error code 404 while installing
NTP daemons. Updating the cache beforehand should fix the issue.
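The fix amounts to refreshing the apt cache before the install; a sketch:

```yaml
- name: update apt cache on Debian-family hosts
  apt:
    update_cache: yes
  when: ansible_os_family == 'Debian'
```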

Signed-off-by: Rishabh Dave <ridave@redhat.com>
Rishabh Dave [Tue, 10 Apr 2018 09:32:58 +0000 (11:32 +0200)]
align cephfs pool creation

The definitions of cephfs pools should match openstack pools.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
Co-Authored-by: Simone Caronni <simone.caronni@teralytics.net>
Guillaume Abrioux [Tue, 11 Jun 2019 20:03:59 +0000 (22:03 +0200)]
iscsi: assign application (rbd) to pool 'rbd'

If we don't assign the rbd application tag on this pool,
the cluster will enter `HEALTH_WARN` state like the following:

```
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'rbd'
```
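The fix uses the standard `ceph osd pool application enable` command; wrapped in a task it looks roughly like this (the `container_exec_cmd` prefix is an assumption about how the role wraps ceph commands):

```yaml
- name: assign rbd application to the rbd pool
  command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} osd pool application enable rbd rbd"
  changed_when: false
```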

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Thu, 6 Jun 2019 18:08:18 +0000 (14:08 -0400)]
ceph-handler: replace fuser by /proc/net/unix

We're using the fuser command to see if a process is using a ceph unix
socket file. But the fuser command runs through every PID present in
/proc/<PID> to see if one of them is using the file.
On a system running thousands of processes, the fuser command can take
a long time to finish.
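The replacement is a single lookup in /proc/net/unix, which lists every bound unix socket, instead of a scan of every process (the socket path here is an example):

```shell
SOCKET=/var/run/ceph/ceph-osd.0.asok   # example socket path

# One read of /proc/net/unix instead of fuser's walk over /proc/<PID>
if grep -q "$SOCKET" /proc/net/unix 2>/dev/null; then
    state="in use"
else
    state="free"
fi
echo "$state"
```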

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1717011

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Wed, 12 Jun 2019 09:38:49 +0000 (11:38 +0200)]
mon: enforce mon0 delegation for initial_mon_key register

Since this task is designed to always run on the first monitor, let's
enforce the container name accordingly, otherwise it could fail as
follows:

```
fatal: [mon1 -> mon0]: FAILED! => changed=true
  cmd:
  - docker
  - exec
  - ceph-mon-mon1
  - ceph
  - --cluster
  - ceph
  - --name
  - mon.
  - -k
  - /var/lib/ceph/mon/ceph-mon0/keyring
  - auth
  - get-key
  - mon.
  delta: '0:00:00.085025'
  end: '2019-06-12 06:12:27.677936'
  msg: non-zero return code
  rc: 1
  start: '2019-06-12 06:12:27.592911'
  stderr: 'Error response from daemon: No such container: ceph-mon-mon1'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 12 Jun 2019 14:35:14 +0000 (16:35 +0200)]
tests: remove unused variable

`e MGR_DASHBOARD=0` isn't needed anymore here; let's remove this leftover.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 12 Jun 2019 14:31:32 +0000 (16:31 +0200)]
tests: update docker image tag used in ooo job

ceph-ansible@master isn't intended to deploy luminous.
Let's use latest-master on the ceph-ansible@master branch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 12 Jun 2019 06:01:06 +0000 (08:01 +0200)]
dashboard: add allow_embedding support

Add a variable to support allow_embedding.

See ceph/ceph-ansible/issues/4084 for details.
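A variable like this typically ends up rendered into grafana.ini, where allow_embedding is a real option in the [security] section; the wiring is an assumption, this is only a sketch of the rendered config:

```ini
# grafana.ini (fragment) - lets the Ceph dashboard embed Grafana panels
[security]
allow_embedding = true
```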

Fixes: #4084
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 12 Jun 2019 06:31:47 +0000 (08:31 +0200)]
dashboard: fix dashboard_url setting

This setting must be set to something resolvable.

See: ceph/ceph-ansible/issues/4085 for details

Fixes: #4085
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Tue, 11 Jun 2019 14:46:35 +0000 (10:46 -0400)]
ceph-node-exporter: Fix systemd template

069076b introduced a bug in the systemd unit script template. This
commit fixes the options used by the node-exporter container.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Tue, 11 Jun 2019 13:35:28 +0000 (09:35 -0400)]
ceph-node-exporter: use modprobe ansible module

Instead of using the modprobe command from the path in the systemd
unit script, we can use the modprobe ansible module.
That way we don't have to manage the binary path based on the linux
distribution.

Resolves: #4072

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
fmount [Thu, 23 May 2019 14:21:08 +0000 (16:21 +0200)]
Fix units and add ability to have a dedicated instance

A few fixes on the systemd unit templates for the node_exporter and
alertmanager container parameters.
Added the ability to use a dedicated instance to deploy the
dashboard components (prometheus and grafana).
This commit also introduces the grafana_group_name variable
to refer to the grafana group and keep consistency with the other
groups.
During the integration with TripleO, some grafana/prometheus
template variables ended up undefined. This commit adds the
ability to check whether the group exists and create, accordingly,
different job groups in the prometheus template.

Signed-off-by: fmount <fpantano@redhat.com>
Guillaume Abrioux [Fri, 7 Jun 2019 08:50:28 +0000 (10:50 +0200)]
validate: fail in check_devices at the right task

see https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17 for details.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168#c17
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Fri, 7 Jun 2019 08:16:16 +0000 (10:16 +0200)]
spec: bring back possibility to install ceph with custom repo

This can be seen as a regression for customers who were used to
deploying in offline environments with custom repositories.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1673254
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Thu, 6 Jun 2019 19:41:35 +0000 (15:41 -0400)]
podman: Add systemd dependency on network.target

When using podman, the systemd unit scripts don't have a dependency
on the network, so we can't be sure the network is up and running
when the containers start.
With docker this behaviour is already handled because the systemd
unit scripts depend on the docker service, which is started after the
network.
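The fix is a plain unit-level dependency; the relevant fragment of such a systemd unit template could look like this (the unit description is illustrative):

```ini
[Unit]
Description=Ceph container (podman)
# Ensure the network is up before starting the container
After=network.target
Wants=network.target
```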

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 6 Jun 2019 17:51:16 +0000 (13:51 -0400)]
purge-cluster: clean all ceph repo files

We currently only purge the rh_storage yum repository file, but
depending on the ceph_repository value in use, the ceph repository
file could have a different name.

Resolves: #4056

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
guihecheng [Fri, 1 Mar 2019 07:51:43 +0000 (15:51 +0800)]
Add section for purging rgw loadbalancer in purge-cluster.yml

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
guihecheng [Thu, 4 Apr 2019 03:33:15 +0000 (11:33 +0800)]
Add section for rgw loadbalancer in site.yml

This drives ceph rgw loadbalancer stuff to run.

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
guihecheng [Thu, 4 Apr 2019 02:54:41 +0000 (10:54 +0800)]
Add role definitions of ceph-rgw-loadbalancer

This adds support for an rgw loadbalancer based on HAProxy and Keepalived.
We define a single role, ceph-rgw-loadbalancer, which includes both the
HAProxy and Keepalived configurations.

A single haproxy backend is used to balance all RGW instances, and
a single frontend is exported via a single port, 80 by default.

Keepalived is used to maintain the high availability of all haproxy
instances. You are free to use any number of VIPs. A single VIP is
shared across all keepalived instances, and there will be one
master per VIP, selected sequentially, while the others serve as
backups.
This assumes that each keepalived instance is on the same node as
one haproxy instance; a simple check script detects the state of each
haproxy instance and triggers the VIP failover upon its failure.
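A hypothetical group_vars sketch for such a setup (the variable names and addresses are illustrative and may not match the role's actual defaults):

```yaml
# group_vars for the rgw loadbalancer hosts (illustrative)
virtual_ips:
  - 192.168.100.10
  - 192.168.100.11
haproxy_frontend_port: 80
```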

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
L3D [Wed, 22 May 2019 08:02:42 +0000 (10:02 +0200)]
ansible: use 'bool' filter on boolean conditionals

Running ceph-ansible produces a lot of ``[DEPRECATION WARNING]`` messages like these:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```

This commit appends ``| bool`` to a lot of the affected variables.

In some places the coding style changed from ``variable|bool`` to ``variable | bool`` *(with spaces around the pipe)*.

Closes: #4022
Signed-off-by: L3D <l3d@c3woc.de>
Dimitri Savineau [Fri, 17 May 2019 21:10:34 +0000 (17:10 -0400)]
container-common: support podman on Ubuntu

Currently we're only able to use podman on Ubuntu if podman is
installed manually before the ceph-ansible execution, because the deb
package lives in an external repository.
We already manage the docker-ce installation via an external
repository, so we should be able to allow the podman installation
with the same mechanism too.

https://github.com/containers/libpod/blob/master/install.md#ubuntu

Resolves: #3947

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Mon, 3 Jun 2019 17:15:30 +0000 (19:15 +0200)]
ceph-osd: do not relabel /run/udev in containerized context

Otherwise content in /run/udev is mislabeled and prevents some
services, like NetworkManager, from starting.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 23 May 2019 08:49:54 +0000 (10:49 +0200)]
tests: test podman against atomic os instead rhel8

The rhel8 image used is an outdated beta version; it is not worth
maintaining this image upstream, since it's possible to test podman
with a newer version of the centos/atomic-host image.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Tue, 28 May 2019 20:43:48 +0000 (16:43 -0400)]
site-container: update container-engine role

Since the split between the container-engine and container-common
roles, the tags and conditions were not updated to reflect the change.

- ceph-container-engine needs the with_pkg tag
- ceph-container-common needs fetch_container_images
- we don't need to pull the container image in a dedicated task for
atomic hosts; we can now use the ceph-container-common role.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Mon, 3 Jun 2019 19:28:39 +0000 (15:28 -0400)]
ceph-nfs: use template module for configuration

789cef7 introduced a regression in the ganesha configuration file
generation: the new config_template module version broke it.
But the ganesha.conf file isn't an ini file and doesn't really
need the config_template module. Instead we can use the
classic template module.

Resolves: #4045

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Mon, 20 May 2019 14:32:08 +0000 (16:32 +0200)]
dashboard: add a last default value in grafana-server host section

If there are no mgrs and no mons in the inventory, it fails with the following error:

```
ERROR! The field 'hosts' has an invalid value, which includes an undefined variable. The error was: 'dict object' has no attribute 'mons'

The error appears to be in '/home/guits/ceph-ansible/site-docker.yml.sample': line 539, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

- hosts: '{{ (groups["mgrs"] | default(groups["mons"]))[0] }}'
  ^ here
We could be wrong, but this one looks like it might be an issue with
missing quotes. Always quote template expression brackets when they
start a value. For instance:

    with_items:
      - {{ foo }}

Should be written as:

    with_items:
      - "{{ foo }}"
```

Let's add an `omit` so it just displays this message instead:

```
PLAY [[]] *******************
skipping: no hosts matched
```
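One way to express such a fallback chain so an empty inventory resolves to `omit` instead of raising an error; this is a sketch and not necessarily the exact expression used in the commit:

```yaml
- hosts: "{{ (groups['mgrs'] | default(groups['mons']) | default([]))[0] | default(omit) }}"
  tasks: []
```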

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 23 May 2019 08:26:36 +0000 (10:26 +0200)]
dashboard: move ceph-grafana-dashboards package installation

This commit moves the package installation into the ceph-dashboard role.
This is needed to install the ceph dashboard json files in
`/etc/grafana/dashboards/ceph-dashboard/`.

Closes: #4026
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 22 May 2019 14:31:21 +0000 (16:31 +0200)]
infra: refact dashboard firewall rules

- There is no need to open ports 3000, 8234, 9283 on all nodes.
- Add missing rule for alertmanager (port 9093)

Closes: #4023
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 22 May 2019 14:12:16 +0000 (16:12 +0200)]
dashboard: append mgr modules to ceph_mgr_modules

When `dashboard_enabled` is `True`, let's append the `dashboard` and
`prometheus` modules to `ceph_mgr_modules` so they are automatically
loaded.
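A sketch of that fact manipulation using Jinja's union filter:

```yaml
- name: append dashboard modules to ceph_mgr_modules
  set_fact:
    ceph_mgr_modules: "{{ ceph_mgr_modules | union(['dashboard', 'prometheus']) }}"
  when: dashboard_enabled | bool
```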

Closes: #4026
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Tue, 28 May 2019 14:55:03 +0000 (10:55 -0400)]
remove ceph-agent role and references

The ceph-agent role was only used for RHCS 2 (jewel), so it's not
useful anymore.
The current code fails on the CentOS distribution because the rhscon
package is only available on Red Hat with the RHCS 2 repository, and
that ceph release is supported on the stable-3.0 branch.

Resolves: #4020

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Mon, 20 May 2019 14:28:42 +0000 (16:28 +0200)]
validate: add a check for nfs standalone

if `nfs_obj_gw` is True when deploying an internal ganesha with an
external ceph cluster, `ceph_nfs_rgw_access_key` and
`ceph_nfs_rgw_secret_key` must be provided so the
ganesha configuration file can be generated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Mon, 20 May 2019 13:58:10 +0000 (15:58 +0200)]
nfs: support internal Ganesha with external ceph cluster

This commit allows deploying an internal ganesha with an external ceph
cluster.

This requires defining `external_cluster_mon_ips` with a comma-separated
list of external monitors.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710358
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Fri, 31 May 2019 17:26:30 +0000 (13:26 -0400)]
ceph-facts: generate fsid on mon node

The fsid generation is done via a python command. When the ansible
controller node only has python3 available (as on RHEL 8), the
python command isn't necessarily present, causing the fsid generation
to fail.
We already do some resource creation (like the ceph keyring secret)
with the python command too, but from the mon node, so we should do
the same for the fsid.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1714631

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Fri, 31 May 2019 14:22:15 +0000 (10:22 -0400)]
vagrant: Default box to centos/7

We don't use ceph/ubuntu-xenial anymore but only centos/7 and
centos/atomic-host.
Changing the default to centos/7.

Resolves: #4036

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Kevin Carter [Wed, 22 May 2019 18:08:10 +0000 (13:08 -0500)]
Sync config_template from upstream

This change pulls in the most recent release of the config_template module
into the ceph_ansible action plugins.

Signed-off-by: Kevin Carter <kecarter@redhat.com>
Guillaume Abrioux [Wed, 22 May 2019 08:42:33 +0000 (10:42 +0200)]
tests: add retries on failing tests in testinfra

This commit adds `pytest-rerunfailures` to requirements.txt so we can
retry failing tests in testinfra and avoid false positives (e.g.
sometimes a service takes too much time to start).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Mon, 20 May 2019 07:46:10 +0000 (09:46 +0200)]
roles: introduce `ceph-container-engine` role

This commit splits the current `ceph-container-common` role.

This introduces a new role `ceph-container-engine` which handles the
tasks specific to the installation of containers tools (docker/podman).

This is needed for the ceph-dashboard implementation for 2 main reasons:

1/ Since the ceph-dashboard stack is only containerized, we must install
everything needed to run containers even in non-containerized
deployments. Splitting this role allows us to not have to call the full
`ceph-container-common` role, which would run a bunch of unneeded tasks
that would have been skipped anyway.

2/ The current implementation would have required running
`ceph-container-common` on all ceph-client nodes, which would have
conflicted with 9d3517c670ea2e944565e1a3e150a966b2d399de (we don't want
to run ceph-container-common on all client nodes, see the mentioned
commit for more details)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
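As a hedged sketch of how the split might be consumed (host and role names here are assumptions for illustration, not necessarily the playbook's actual wording), a play that only needs a container runtime can now skip the full common role:

```yaml
# Illustrative only: a play that needs docker/podman installed
# but not the rest of ceph-container-common.
- hosts: grafana-server
  gather_facts: true
  roles:
    - role: ceph-defaults
    - role: ceph-container-engine   # installs the container tools only
```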
6 years agoceph-mgr: install python-routes for dashboard
Dimitri Savineau [Fri, 17 May 2019 15:24:00 +0000 (11:24 -0400)]
ceph-mgr: install python-routes for dashboard

The ceph mgr dashboard requires the routes python library to be
installed on the system.

Resolves: #3995

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agocommon: use gnupg instead of gpg
Dimitri Savineau [Tue, 21 May 2019 13:21:16 +0000 (09:21 -0400)]
common: use gnupg instead of gpg

The gpg package isn't available for all Debian/Ubuntu distributions but
gnupg is.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoceph-prometheus: fix error in templates
Dimitri Savineau [Tue, 21 May 2019 14:29:16 +0000 (10:29 -0400)]
ceph-prometheus: fix error in templates

- remove trailing double quotes in jinja templates
- add jinja filename without .j2 suffix

Resolves: #4011

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoconfig: fix ipv6
Guillaume Abrioux [Tue, 21 May 2019 13:48:34 +0000 (15:48 +0200)]
config: fix ipv6

As of nautilus, if you set `ms bind ipv6 = True` you must explicitly set
`ms bind ipv4 = False` too, otherwise OSDs will still try to pick up an
IPv4 address.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710319
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
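A minimal sketch of a task enforcing both options (the `ini_file` module choice and the `ip_version` variable are assumptions for illustration, not necessarily the role's actual implementation):

```yaml
- name: bind on ipv6 only
  ini_file:
    path: /etc/ceph/ceph.conf
    section: global
    option: "{{ item.option }}"
    value: "{{ item.value }}"
  loop:
    - { option: 'ms bind ipv6', value: 'true' }
    - { option: 'ms bind ipv4', value: 'false' }
  when: ip_version == 'ipv6'
```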
6 years agotests: update testinfra release
Dimitri Savineau [Tue, 30 Apr 2019 14:24:25 +0000 (10:24 -0400)]
tests: update testinfra release

In order to support ansible 2.8 with testinfra we need to use the
latest release (3.0.x).
Adding the ssh-config option to py.test.
Also bumping the pytest and xdist versions.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoceph-nfs: apply selinux fix anyway
Dimitri Savineau [Thu, 18 Apr 2019 14:02:12 +0000 (10:02 -0400)]
ceph-nfs: apply selinux fix anyway

Because ansible_distribution_version doesn't return the minor version on
CentOS with ansible 2.8, we apply the selinux fix anyway, but only for
CentOS/RHEL 7.
Starting with RHEL 8, there's a dedicated package for selinux called
nfs-ganesha-selinux [1].

Also replace the command module + semanage with the selinux_permissive
module.

[1] https://github.com/nfs-ganesha/nfs-ganesha/commit/a7911f

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
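A minimal sketch of the replacement task (the `ganesha_t` domain name and the conditions are assumptions for illustration):

```yaml
- name: make the nfs-ganesha selinux domain permissive
  selinux_permissive:
    name: ganesha_t
    permissive: true
  when:
    - ansible_os_family == 'RedHat'
    - ansible_distribution_major_version == '7'
```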
6 years agoceph-validate: use kernel validation for iscsi
Dimitri Savineau [Thu, 18 Apr 2019 13:37:07 +0000 (09:37 -0400)]
ceph-validate: use kernel validation for iscsi

Ceph iSCSI gateway requires Red Hat Enterprise Linux or CentOS 7.5
or later.
Because we cannot check the ansible_distribution_version fact for
CentOS with ansible 2.8 (it returns only the major version) we fall
back to checking the kernel options.

  - CONFIG_TARGET_CORE=m
  - CONFIG_TCM_USER2=m
  - CONFIG_ISCSI_TARGET=m

http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
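A hedged sketch of such a fallback check (the `/boot/config-*` path is the usual kernel-config convention, not necessarily what ceph-validate actually reads):

```yaml
- name: check that the required iscsi kernel options are built as modules
  command: grep -q "^{{ item }}=m" /boot/config-{{ ansible_kernel }}
  changed_when: false
  loop:
    - CONFIG_TARGET_CORE
    - CONFIG_TCM_USER2
    - CONFIG_ISCSI_TARGET
```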
6 years agoswitch to ansible 2.8
Guillaume Abrioux [Tue, 9 Apr 2019 07:22:06 +0000 (09:22 +0200)]
switch to ansible 2.8

- remove private attribute with import_role.
- update documentation.
- update rpm spec requirement.
- fix MagicMock python import in unit tests.

Closes: #3765
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agocommon: install dependencies for apt modules
Dimitri Savineau [Fri, 17 May 2019 14:31:46 +0000 (10:31 -0400)]
common: install dependencies for apt modules

When using a minimal Debian/Ubuntu distribution there are no
ca-certificates and gpg packages installed, so the apt modules will
fail:

Failed to find required executable gpg in paths:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

apt.cache.FetchFailedException:
W:https://download.ceph.com/debian-luminous/dists/bionic/InRelease:
No system certificates available. Try installing ca-certificates.

Resolves: #3994

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
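A minimal sketch of the fix, assuming a plain apt task runs early enough, before any https repository is configured:

```yaml
- name: install ca-certificates and gnupg for the apt modules
  apt:
    name:
      - ca-certificates
      - gnupg
    state: present
    update_cache: true
  when: ansible_os_family == 'Debian'
```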
6 years agodashboard: move the call to ceph-node-exporter
Guillaume Abrioux [Fri, 17 May 2019 15:34:09 +0000 (17:34 +0200)]
dashboard: move the call to ceph-node-exporter

This moves the call to the ceph-node-exporter role after
ceph-container-common, otherwise it will try to run containers before
docker or podman is installed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agotox: Don't copy infrastructure playbook
Dimitri Savineau [Tue, 23 Apr 2019 14:40:09 +0000 (10:40 -0400)]
tox: Don't copy infrastructure playbook

Since a1a871c we don't need to copy the infrastructure playbooks
under the ceph-ansible root directory.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agopurge-docker-cluster: don't remove data on atomic
Dimitri Savineau [Thu, 16 May 2019 14:00:58 +0000 (10:00 -0400)]
purge-docker-cluster: don't remove data on atomic

Because we don't manage the docker service on atomic (yet) via the
ceph-container-common role, we can't stop docker and remove the data.
For now let's do that only for non-atomic hosts.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agodashboard: move defaults variables to ceph-defaults
Guillaume Abrioux [Thu, 16 May 2019 13:58:20 +0000 (15:58 +0200)]
dashboard: move defaults variables to ceph-defaults

There is no need to have default values for these variables in each role
since there are no corresponding host groups

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agorename docker_exec_cmd variable
Guillaume Abrioux [Tue, 14 May 2019 12:51:32 +0000 (14:51 +0200)]
rename docker_exec_cmd variable

This commit renames the `docker_exec_cmd` variable to
`container_exec_cmd` so it's more generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: fix a typo
Guillaume Abrioux [Thu, 16 May 2019 12:36:53 +0000 (14:36 +0200)]
dashboard: fix a typo

6f0643c8e introduced a typo, the role that should be run is
ceph-container-common, not ceph-common

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agotests: add dashboard scenario testing
Guillaume Abrioux [Thu, 16 May 2019 09:19:11 +0000 (11:19 +0200)]
tests: add dashboard scenario testing

This commit adds a new scenario to test the dashboard deployment via
ceph-ansible.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: align the way containers are managed
Guillaume Abrioux [Thu, 16 May 2019 08:56:06 +0000 (10:56 +0200)]
dashboard: align the way containers are managed

This commit aligns the way the different containers are managed with how
it's currently done with the other ceph daemons.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: convert dashboard_rgw_api_no_ssl_verify to a bool
Guillaume Abrioux [Wed, 15 May 2019 14:16:55 +0000 (16:16 +0200)]
dashboard: convert dashboard_rgw_api_no_ssl_verify to a bool

Make `dashboard_rgw_api_no_ssl_verify` a bool variable since it seems to
be used as one.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: generate group_vars sample files
Guillaume Abrioux [Wed, 15 May 2019 14:15:48 +0000 (16:15 +0200)]
dashboard: generate group_vars sample files

Generate all group_vars sample files corresponding to the new roles
added for the ceph-dashboard implementation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: remove legacy file
Guillaume Abrioux [Wed, 15 May 2019 13:00:26 +0000 (15:00 +0200)]
dashboard: remove legacy file

This file no longer seems to be used, so let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: set less permissive permissions on dashboard certificate/key
Guillaume Abrioux [Wed, 15 May 2019 12:38:46 +0000 (14:38 +0200)]
dashboard: set less permissive permissions on dashboard certificate/key

Using `0440` instead of `0644` is enough.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: simplify config-key command
Guillaume Abrioux [Wed, 15 May 2019 12:35:24 +0000 (14:35 +0200)]
dashboard: simplify config-key command

Since stable-4.0 isn't meant to deploy ceph releases prior to nautilus,
there's no need to add this complexity here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoplaybook: use blocks for grafana-server section
Guillaume Abrioux [Wed, 15 May 2019 12:11:00 +0000 (14:11 +0200)]
playbook: use blocks for grafana-server section

Use a block in the grafana-server section to avoid duplicating the condition.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
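A sketch of the pattern (the role names and the `dashboard_enabled` variable are assumptions for illustration):

```yaml
- block:
    - import_role:
        name: ceph-grafana
    - import_role:
        name: ceph-prometheus
  when: dashboard_enabled | bool   # condition stated once for the whole block
```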
6 years agodashboard: do not call ceph-container-common from other role
Guillaume Abrioux [Tue, 14 May 2019 14:34:50 +0000 (16:34 +0200)]
dashboard: do not call ceph-container-common from other role

Use site.yml to deploy ceph-container-common in order to install docker
even in non-containerized deployments, since there's no RPM available to
deploy the different applications needed for ceph-dashboard.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: use existing variable to detect containerized deployment
Guillaume Abrioux [Tue, 14 May 2019 12:46:25 +0000 (14:46 +0200)]
dashboard: use existing variable to detect containerized deployment

There is no need to add more complexity for this; let's use
`containerized_deployment` in order to detect if we are running a
containerized deployment.
The idea is to use `container_exec_cmd` the same way we do in the rest of
the playbook to run the different ceph commands needed to deploy the
ceph-dashboard role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
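A hedged sketch of the pattern described above (fact value and container name are illustrative, not the playbook's exact wording):

```yaml
- name: set container_exec_cmd for containerized deployments
  set_fact:
    container_exec_cmd: "{{ container_binary }} exec ceph-mon-{{ ansible_hostname }}"
  when: containerized_deployment | bool

# later, ceph commands can be prefixed the same way everywhere:
# command: "{{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }} -s"
```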
6 years agofacts: set container_binary fact in non-containerized deployment
Guillaume Abrioux [Mon, 13 May 2019 14:34:53 +0000 (16:34 +0200)]
facts: set container_binary fact in non-containerized deployment

This is needed for the ceph-dashboard implementation since it requires
running containerized applications which aren't packaged as RPMs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: rename template files
Guillaume Abrioux [Mon, 13 May 2019 14:21:16 +0000 (16:21 +0200)]
dashboard: rename template files

Add .j2 to all template files related to the dashboard roles.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: Support podman
Boris Ranto [Mon, 8 Apr 2019 13:40:25 +0000 (15:40 +0200)]
dashboard: Support podman

This adds support for podman in dashboard-related roles. It also drops
the creation of custom network for the dashboard-related roles as this
functionality works in a different way with podman.

Signed-off-by: Boris Ranto <branto@redhat.com>
6 years agodashboard: Set ssl_server_port if it is supported
Boris Ranto [Thu, 4 Apr 2019 17:51:16 +0000 (19:51 +0200)]
dashboard: Set ssl_server_port if it is supported

We cannot use the old-fashioned config-key way here. It was not
supported when the option was introduced (post 14.2.0). Since the option
is not always supported we can simply ignore the potential failure on
ceph clusters that do not support it.

Signed-off-by: Boris Ranto <branto@redhat.com>
6 years agodashboard: Add and copy alerting rules
Boris Ranto [Fri, 15 Feb 2019 19:27:15 +0000 (20:27 +0100)]
dashboard: Add and copy alerting rules

This commit adds a list of alerting rules for ceph-dashboard from the
old cephmetrics project. It also installs the configuration file so that
the rules get recognized by the prometheus server.

Signed-off-by: Boris Ranto <branto@redhat.com>
6 years agopurge-docker-cluster.yml: Default lvm_volumes
Zack Cerza [Fri, 4 Jan 2019 20:26:59 +0000 (13:26 -0700)]
purge-docker-cluster.yml: Default lvm_volumes

We were failing when that variable was unset; purge-cluster.yml contains
this workaround.

Signed-off-by: Zack Cerza <zack@redhat.com>
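The workaround amounts to defaulting the variable to an empty list when it is unset; a minimal sketch (task name is illustrative):

```yaml
- name: default lvm_volumes to an empty list when unset
  set_fact:
    lvm_volumes: "{{ lvm_volumes | default([]) }}"
```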
6 years agoMerge cephmetrics/dashboard-ansible repo
Boris Ranto [Wed, 5 Dec 2018 18:59:47 +0000 (19:59 +0100)]
Merge cephmetrics/dashboard-ansible repo

This commit merges the dashboard-ansible installation scripts with
ceph-ansible. This includes several new roles to set up ceph-dashboard
and the underlying technologies like prometheus and grafana server.

Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com>
Co-authored-by: Zack Cerza <zcerza@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoshrink_osd: mark all osd(s) out in one command
wumingqiao [Wed, 15 May 2019 07:27:21 +0000 (15:27 +0800)]
shrink_osd: mark all osd(s) out in one command

Signed-off-by: wumingqiao <wumingqiao@beyondcent.com>
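`ceph osd out` accepts several ids in one invocation, so the per-osd loop can be collapsed into a single call; a hedged sketch (the `osd_to_kill` variable and command prefix are assumptions for illustration):

```yaml
- name: mark all osd(s) out in one command
  command: >
    {{ container_exec_cmd | default('') }} ceph --cluster {{ cluster }}
    osd out {{ osd_to_kill.split(',') | join(' ') }}
```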
6 years agotests: fix a typo in dev_setup.yml
Guillaume Abrioux [Tue, 14 May 2019 12:27:19 +0000 (14:27 +0200)]
tests: fix a typo in dev_setup.yml

c907ec41ae0698b7627ebcbe97f1c293611d41d7 introduced a typo.
This commit fixes it.

```
[WARNING]: While constructing a mapping from /home/guits/ceph-ansible/tests/functional/dev_setup.yml, line 21, column 9, found a duplicate dict key (replace).
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agopurge-docker-cluster: remove docker data
Dimitri Savineau [Mon, 13 May 2019 21:03:55 +0000 (17:03 -0400)]
purge-docker-cluster: remove docker data

We never clean the content of /var/lib/docker so we can still have
some data present in this directory after running the purge playbook.
Pip isn't used anymore.
Also update the docker package name (especially the python binding
one).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agocontainer-common: allow podman for other distros
Dimitri Savineau [Fri, 10 May 2019 19:35:17 +0000 (15:35 -0400)]
container-common: allow podman for other distros

Currently podman installation is tightly tied to RHEL 8 even though we're
able to install it on Debian/Ubuntu distributions.
This patch changes the way we decide whether to start the (fat) container
daemon. Before, the condition was based on the distribution release;
now it is based on the container_service_name variable.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoceph-nfs: fixed with_items
Bruceforce [Sun, 12 May 2019 11:10:30 +0000 (13:10 +0200)]
ceph-nfs: fixed with_items

If we do this in one line we get the error described in #3968

fixes #3968

Signed-off-by: Bruceforce <markus.greis@gmx.de>
6 years agogather-ceph-logs: fix logs list generation
Dimitri Savineau [Mon, 13 May 2019 14:12:42 +0000 (10:12 -0400)]
gather-ceph-logs: fix logs list generation

The shell module doesn't have a stdout_lines attribute. Instead of
using the shell module, we can use the find module.

Also adding `become: false` to the local tmp directory creation,
otherwise we won't have sufficient permissions to fetch the files into
this directory.

Resolves: #3966

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
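A sketch of the two fixes (paths and variable names are illustrative, not the playbook's exact wording):

```yaml
- name: create local tmp directory
  file:
    path: "{{ localtempdir }}"
    state: directory
  delegate_to: localhost
  become: false

- name: generate the list of ceph log files
  find:
    paths: /var/log/ceph
    patterns: '*.log'
  register: ceph_log_files
```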
6 years agoceph-nfs: fixed condition for "stable repos specific tasks"
Bruceforce [Sun, 12 May 2019 09:40:05 +0000 (11:40 +0200)]
ceph-nfs: fixed condition for "stable repos specific tasks"

The old condition would resolve to
"when": "nfs_ganesha_stable - ceph_repository == 'community'"

now it is
"when": [
          "nfs_ganesha_stable",
          "ceph_repository == 'community'"
        ]

Please backport to stable-4.0

Signed-off-by: Bruceforce <markus.greis@gmx.de>
6 years agoUpdate RHCS version with Nautilus
Dimitri Savineau [Fri, 10 May 2019 19:28:18 +0000 (15:28 -0400)]
Update RHCS version with Nautilus

RHCS 4 will be based on Nautilus and only usable on RHEL 8.
Update the default ceph_rhcs_version to 4 and update the rhcs
repositories to rhcs 4 with RHEL 8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>