]> git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
ceph-ansible.git
5 years agoswitch_to_container: fix osd systemd regex
Dimitri Savineau [Thu, 4 Jun 2020 20:57:17 +0000 (16:57 -0400)]
switch_to_container: fix osd systemd regex

The systemd LOAD and ACTIVE fileds could have more than one space between
both values.
This update the systemd regex the same way we're using it in different
part of the code.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1843500
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 50140c9b5dfd4c865b36b333db8d2f725a905a5a)

5 years agodashboard: allow disabling grafana api ssl verify
Dimitri Savineau [Tue, 28 Apr 2020 17:31:01 +0000 (13:31 -0400)]
dashboard: allow disabling grafana api ssl verify

When using an untrusted TLS certificate (like self-signed) on grafana
then the grafana dashboards update subcommand will fail.
One solution could be to trust the TLS certificate.
The other one is to disable the TLS verification on the grafana API.

Closes: #5324
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b20519efd0b9af4f2467daa311b9dca6086d4f87)

5 years agorgw multisite: add master zone endpoints to zonegroup
Ali Maredia [Fri, 5 Jun 2020 21:21:27 +0000 (21:21 +0000)]
rgw multisite: add master zone endpoints to zonegroup

We were only adding the endpoints to the master zone but not to the
zonegroup.
This patch fixes the issue.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1839228
Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 0175c205fa16c05e2bbf5b4d8111092555aefa66)

5 years agocommon: fix target_size_ratio task enablement
Guillaume Abrioux [Thu, 14 May 2020 09:00:12 +0000 (11:00 +0200)]
common: fix target_size_ratio task enablement

The condition on this task is wrong, we have to check whether
`target_size_ratio` is set in the pool definition instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8c7a48832cd62524982c9ebe193a5ca6ea2c7bfa)

5 years agofacts: always set ceph_run_cmd and ceph_admin_command
Guillaume Abrioux [Thu, 14 May 2020 09:06:41 +0000 (11:06 +0200)]
facts: always set ceph_run_cmd and ceph_admin_command

always set these facts on monitor nodes whatever we run with `--limit`.
Otherwise, playbook will fail when using `--limit` on nodes where these
facts are used on a delegated task to monitor.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5e81843e918ed9aa57a5675af2888499700eac2)

5 years agoosd: add a default value for 'default' in crush_rules
Guillaume Abrioux [Tue, 24 Mar 2020 08:56:45 +0000 (09:56 +0100)]
osd: add a default value for 'default' in crush_rules

Let's default to `False` for the `default` attribute in `crush_rules`
variable.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1797774
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1b0b7af119929eca86d8f4684e4dbd228d6509f4)

5 years agodocker2podman: manage dashboard nodes
Dimitri Savineau [Thu, 16 Apr 2020 16:17:12 +0000 (12:17 -0400)]
docker2podman: manage dashboard nodes

The dashboard nodes (alertmanager, grafana, node-exporter, and prometheus)
were not manage during the docker to podman migration.

This adds the systemd container template of those services to a dedicated
file (systemd.yml) in order to include it in the docker2podman playbook.

This also adds the dashboard container images pull from docker to podman.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829389
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 252e78b4e4e90bc1c21d9dfd4a7c9bd132e94730)

5 years agodocker2podman: pull images from docker daemon
Dimitri Savineau [Thu, 16 Apr 2020 15:30:11 +0000 (11:30 -0400)]
docker2podman: pull images from docker daemon

The docker2podman playbook only installs the podman package and updates
the systemd units with the right container_binary value.

We never pull the container image so if one service is restarted then
the container image will be pulled first before the service can start
which could cause longer downstream.

To avoid to download the container image from internet again we can just
pull it from the local docker daemon.

The container_{binding,package,service}_name variables are removed
because they are only used in the ceph-container-engine role which
isn't call in this playbook.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d38f21aeba29f341dc737b8cdeaa9fdaa9f55408)

5 years agorolling_update: fix rbdmirror group name
Dimitri Savineau [Thu, 30 Apr 2020 20:06:55 +0000 (16:06 -0400)]
rolling_update: fix rbdmirror group name

The rbdmirror group name was using the wrong variable definition.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c0a213f9284eac733de99ebcc7f18b1ebdf8f115)

5 years agoceph-nfs: bind mount ganesha log directory
Dimitri Savineau [Mon, 4 May 2020 22:39:05 +0000 (18:39 -0400)]
ceph-nfs: bind mount ganesha log directory

The current ganesha log directory is only present in the container
and not bind mount on the host.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 222fe4abd8771fbb5f3e9c793fcf67123a85e8ab)

5 years agodocker-to-podman: conditional docker commands
Dimitri Savineau [Tue, 12 May 2020 15:38:47 +0000 (11:38 -0400)]
docker-to-podman: conditional docker commands

The docker commands should be based on the container_binary variable
otherwise running the playbook on a host without docker (like podman
only) will failed.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1829985
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-validate: Expand templates in rgw_create_pools
Benoît Knecht [Mon, 11 May 2020 14:21:55 +0000 (16:21 +0200)]
ceph-validate: Expand templates in rgw_create_pools

Same fix as `ceph-rgw` for `rgw_create_pools` pool names that contain Jinja
templates.

See #5348 for details.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 444b46ea24a71f0e719296abc677a975afa8ac7f)

5 years agoceph-rgw: Make sure pool name templates are expanded
Benoît Knecht [Mon, 11 May 2020 13:49:32 +0000 (15:49 +0200)]
ceph-rgw: Make sure pool name templates are expanded

It is common to set templated pool names in `rgw_create_pools`, e.g.

```yaml
rgw_create_pools:
  "{{ rgw_zone }}.rgw.buckets.index":
    pg_num: 16
    size: 3
    type: replicated
```

This worked fine with Ansible 2.8, but broke in Ansible 2.9 due to a change in
the way `with_dict` works [1].

This commit replaces the use of `with_dict` with

```yaml
loop: "{{ rgw_create_pools | dict2items }}"
```

which works as intended and expands the template in the pool name.

[1]: https://docs.ansible.com/ansible/latest/porting_guides/porting_guide_2.9.html#loops

Closes #5348

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit d2b7670c7dea29bd1072b65ce2ccdccccf97550d)

5 years agoceph-facts: fix IPv6 _radosgw_address interface
Dimitri Savineau [Mon, 27 Apr 2020 20:01:24 +0000 (16:01 -0400)]
ceph-facts: fix IPv6 _radosgw_address interface

When using radosgw_interface and IPv6 setup then the _radosgw_address
fact doesn't use square brackets compared to the radosgw_address and
radosgw_address_block configuration.

Closes: #5325
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ed4f23d5303f62c814796835ecfad637678641be)

5 years agoRefresh ceph dashboard user role
fmount [Fri, 10 Apr 2020 13:04:52 +0000 (15:04 +0200)]
Refresh ceph dashboard user role

This change allows the operator to refresh the
ceph dashboard admin role on multiple ceph-ansible
executions.
In the current state the role is set only when the
user is created, and there's no way to change it if
the user exists.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826002
Signed-off-by: fmount <fpantano@redhat.com>
(cherry picked from commit 5eb363e0331caf5d0be8e1242d1e57f4f5045812)

5 years agomds: don't enable application pool on cephfs pools
Guillaume Abrioux [Tue, 21 Apr 2020 08:29:23 +0000 (10:29 +0200)]
mds: don't enable application pool on cephfs pools

this commit removes the task which enable application on cephfs pools.

See: https://tracker.ceph.com/issues/43761

Fixes: #5278
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 86dc6f8206f11d18baeef100837cc72c23baa3f4)

5 years agotest: set sitepackages=false in tox
Guillaume Abrioux [Wed, 13 May 2020 15:49:07 +0000 (17:49 +0200)]
test: set sitepackages=false in tox

Otherwise it might try to use the system installed version of ansible
when there's one available.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6d9acb5e6de24043c9f858ea66def63113af3bd4)

5 years agodocs: minor fixes to README-MULTISITE.md
Ali Maredia [Mon, 27 Apr 2020 22:04:58 +0000 (18:04 -0400)]
docs: minor fixes to README-MULTISITE.md

Make all of the hosts start at 1 and not 0,
also make some minor changes in scenario 3 to
remova an inconsistency.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit bd1440f2cd7e52435c6533970789e0f7ee791f31)

5 years agoceph-rgw: use match instead of equalto from jinja2 v4.0.23
Dimitri Savineau [Wed, 6 May 2020 17:32:18 +0000 (13:32 -0400)]
ceph-rgw: use match instead of equalto from jinja2

The '==' jinja2 operator (or 'equalto') has been introduced in jinja2
2.8.
On EL7, jinja2 version is 2.7 so the operator isn't present creating
templating error like:

The error was: TemplateRuntimeError: no test named '=='

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1747206
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 34e6e8e06c8993c18f0f75df9989f39f67727ef9)

5 years agoceph-nfs: fix internal ganesha deployment
Dimitri Savineau [Wed, 6 May 2020 13:31:34 +0000 (09:31 -0400)]
ceph-nfs: fix internal ganesha deployment

Since ea2b654d9 we're not running the rados command from the monitor
nodes but from the ganesha node. Unfortunately we don't have the
required keyring on that node to run the rados command as we don't
import the right keyring.
This commit restores the workflow for internal ganesha deployment like
before ea2b654d9 but keeps the rados commands from the ganesha node for
external deployment until we have a better design.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8a890306ad870f0174f76c6445644d7f8db6396e)

5 years agoceph-nfs: fix keyring copy for external ganesha
Dimitri Savineau [Tue, 5 May 2020 14:46:14 +0000 (10:46 -0400)]
ceph-nfs: fix keyring copy for external ganesha

Fix the condition on the keyring copy task that prevent the ganesha
keyring to be created in the /var/lib/ceph directory.
Also ensure that the directory exists first.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831285
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 748ac4b928f8f44323e1d3bbef0afec5da72dbfb)

5 years agonfs: fix 2 typo
Guillaume Abrioux [Thu, 30 Apr 2020 14:21:14 +0000 (16:21 +0200)]
nfs: fix 2 typo

The condition is missing an index here which makes the playbook failing.

Typical error:
```
The conditional check 'not item.get('skipped', False)' failed. The error was: error while evaluating conditional (not item.get('skipped', False)): 'list object' has no attribute 'get'",
```

Also, adds the missing '/keyring' on the `exec_cmd_nfs` fact.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1831342
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cf460274c7489940968fed176c113ad473b22f4d)

5 years agoceph-dashboard: fix mgr dashboard IPv6 fact v4.0.22
Dimitri Savineau [Thu, 23 Apr 2020 18:34:39 +0000 (14:34 -0400)]
ceph-dashboard: fix mgr dashboard IPv6 fact

15ed9ee introduced a regression for the mgr dashboard daemon using
IPv6 since the mgr dashboard configuration doesn't support brackets.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1827299
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f1728929cdea745f613b60f215db6494dd23740a)

5 years agodocs: fix multisite docs add endpoints var in rgw_instances section
Ali Maredia [Thu, 23 Apr 2020 15:12:13 +0000 (11:12 -0400)]
docs: fix multisite docs add endpoints var in rgw_instances section

+ Mention of this variable was missing in the original version.

+ Minor revisions around the concept of secondary zone.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 2b3260457745bf568573cd95463e9210ebbb17c9)

5 years agodocs: Update and consolidate rgw multisite documentation
Ali Maredia [Thu, 16 Apr 2020 19:47:17 +0000 (19:47 +0000)]
docs: Update and consolidate rgw multisite documentation

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit afa78bd0c001a7732a6632c563d6065194f0df6e)

5 years agotypo: updating type check on rc v4.0.21
ianwatsonrh [Tue, 21 Apr 2020 13:14:46 +0000 (14:14 +0100)]
typo: updating type check on rc

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1826884
Signed-off-by: ianwatsonrh <ianwatson@redhat.com>
(cherry picked from commit ccf6a7f153c7a36a700b914db956058f2408304b)

5 years agodoc: update release note
Guillaume Abrioux [Fri, 27 Mar 2020 04:39:08 +0000 (05:39 +0100)]
doc: update release note

This commit mentions:
- the Debian support removal from RHCS deployment.
- site-docker.yml.sample and purge-docker-cluster.yml symlinks removal.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoceph-container-engine: add CentOS 8 support
Dimitri Savineau [Wed, 8 Apr 2020 14:24:02 +0000 (10:24 -0400)]
ceph-container-engine: add CentOS 8 support

This adds CentOS 8 support for containerized deployment allowing podman
installation as the default container engine for this distribution.

Closes: #5130
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agodoc: add day-2 operations documentation
Guillaume Abrioux [Tue, 21 Apr 2020 07:50:27 +0000 (09:50 +0200)]
doc: add day-2 operations documentation

This commit is the first of a serie in order to describe all day-2 operations
that are possible via ceph-ansible using a set of playbook provided in
`infrastructure-playbooks` directory.

Fixes: #5061
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7e800303e9933cb61a5288b608e8d4f2cfdd7746)

5 years agofilestore-to-bluestore: fix py2 on skipped tasks v4.0.20
Dimitri Savineau [Mon, 20 Apr 2020 13:47:31 +0000 (09:47 -0400)]
filestore-to-bluestore: fix py2 on skipped tasks

When using skipped variables with from_json filter and python2 then we
need to have a default value otherwise the skipped task will fail.

Unexpected templating type error occurred on
({{ (ceph_volume_lvm_list.stdout | from_json) }}): expected string or
buffer

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790472
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2b9edba13149cdf326cbe944defbd7e05e328497)

5 years agoUpdated use of deprecated filter
abaird-rh [Fri, 17 Apr 2020 16:34:32 +0000 (17:34 +0100)]
Updated use of deprecated filter

This was removed in Ansible 2.9.

[DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of
using `result|version_compare` use `result is version_compare`. This
feature will be removed in version 2.9. Deprecation warnings can be
disabled by setting deprecation_warnings=False in ansible.cfg.

Rename 'version_compare' to the function 'version'.

version_compose was renamed to version since ansible 2.5

Signed-off-by: abaird-rh <abaird@redhat.com>
(cherry picked from commit eb71244bfd2f3c59c8245a1cf45dc6b44fa442f8)

5 years agolibrary/ceph_volume: look for error messages in stderr
Rishabh Dave [Tue, 7 Apr 2020 11:50:35 +0000 (17:20 +0530)]
library/ceph_volume: look for error messages in stderr

Error message were moved to from stdout in stderr here -
https://github.com/ceph/ceph/commit/b8d6dcbe9f803c96c0af68da54f1262e9b6a9e77#diff-20f7c578a4e69ec61a5869d706567a24R137.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1793542
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 4249d1e02d6da07466a4ddf1282cf4600a131773)

5 years agomds: fix --limit run against mds nodes v4.0.19
Guillaume Abrioux [Thu, 9 Apr 2020 23:02:06 +0000 (01:02 +0200)]
mds: fix --limit run against mds nodes

This commit fixes --limit runs against mds nodes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 378405e3285bac412ca0ead90212f456af051574)

5 years agonfs: create empty rados index object for nfs standalone
Guillaume Abrioux [Fri, 10 Apr 2020 09:05:25 +0000 (11:05 +0200)]
nfs: create empty rados index object for nfs standalone

This commit creates an empty rados index object even when deploying
standalone nfs-ganesha.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822328
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ea2b654d951f0ddb4abed3d4e96d66458baf80f8)

5 years agoceph-validate: update RHEL requirement for RHCS
Dimitri Savineau [Thu, 9 Apr 2020 18:00:52 +0000 (14:00 -0400)]
ceph-validate: update RHEL requirement for RHCS

We were not testing the right ansible_distribution fact value for RHEL
distribution.
This commit also updates the minial RHEL version supported by RHCS.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5de74fe512575b2873b5863f5817f676954d3469)

5 years agoosd: fix monitor_name error when scaling out OSDs
Guillaume Abrioux [Thu, 9 Apr 2020 12:48:53 +0000 (14:48 +0200)]
osd: fix monitor_name error when scaling out OSDs

This commit fixes a bug when trying to scale out osd nodes with
`crush_rule_config` is enabled.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1822599
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4bcc52cb2aad9bac563025ba4fd9ae4565524b67)

5 years agotests: update mgr dashboard socket listening test
Dimitri Savineau [Tue, 24 Mar 2020 18:28:51 +0000 (14:28 -0400)]
tests: update mgr dashboard socket listening test

Since 15ed9ee the ceph-mgr daemon binds on the IP address on the public
network instead of binding on all addresses.
This commit updates the testinfra code to reflect that change.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0f0a14772c21fbeadcc0c92b48a8c54ab8b729d6)

5 years agotests: register mark in pytest configuration
Dimitri Savineau [Wed, 15 Jan 2020 17:48:10 +0000 (12:48 -0500)]
tests: register mark in pytest configuration

Unregister marks generates warnings like:

PytestUnknownMarkWarning: Unknown pytest.mark.docker - is this a typo?
You can register custom marks to avoid this warning

https://docs.pytest.org/en/latest/mark.html

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ac4f8763aab8ad62771b412662230349676242c7)

5 years agotests: add dashboard testinfra configuration
Dimitri Savineau [Fri, 12 Jul 2019 18:56:01 +0000 (14:56 -0400)]
tests: add dashboard testinfra configuration

This commit adds basic tests for grafana, prometheus, node-exporter and
ceph mgr dashboard services.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f2c6281207b1cdc200812cf8f389164a586232fd)

5 years agoceph-mgr: add saml python lib for dashboard SSO v4.0.18
Dimitri Savineau [Fri, 3 Apr 2020 20:33:11 +0000 (16:33 -0400)]
ceph-mgr: add saml python lib for dashboard SSO

The dashboard SSO mgr module requires the saml python library to be
installed. This is only a valid scenario for RHCS deployment because
the saml python library isn't available in other classic repositories.
This package is present in RHCS Tools repository so we also need to
enable it on the mgr nodes.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1820233
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 6617d90733d08b03779657c0129b7a7089eb4a90)

5 years agoceph_key: fetch key when needed
Guillaume Abrioux [Fri, 3 Apr 2020 16:23:00 +0000 (18:23 +0200)]
ceph_key: fetch key when needed

Fetch the key when it is present in the cluster but not on the node.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ccfa249919b338197daec353cb5d4e535b3fb734)

5 years agoceph_key: fix idempotency when no secret is passed
Guillaume Abrioux [Fri, 3 Apr 2020 08:24:32 +0000 (10:24 +0200)]
ceph_key: fix idempotency when no secret is passed

553584cbd0d014429e665f998776e8d198f72d2b introduced a regression when no
secret is passed, it overwrites the secret each time the task is run.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 003defec0311af0f03da861f80d596852bdb9cf5)

5 years agotox: replace testinfra by pytest for add-mgrs
Dimitri Savineau [Thu, 2 Apr 2020 20:26:48 +0000 (16:26 -0400)]
tox: replace testinfra by pytest for add-mgrs

The add-mgrs scenario is still using the testinfra command instead of
pytest so the tests exectution are failling.

ERROR: InvocationError for command could not find executable testinfra

This also adds the missing --ssh-config option to testinfra.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 92f538f1af56c8f4cd28a4409e32397efbc77e52)

5 years agodocker2podman: call `container_options_facts.yml` on osd nodes
Guillaume Abrioux [Wed, 1 Apr 2020 12:20:05 +0000 (14:20 +0200)]
docker2podman: call `container_options_facts.yml` on osd nodes

We must call `ceph-osd` role from `container_options_facts.yml` because
ceph-osd-run.sh.j2 needs variables set in this file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1819681
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4a4f54f6eeaac99f04faad82259129a906f4ccb9)

5 years agoceph_key: remove 'update' state
Guillaume Abrioux [Tue, 17 Mar 2020 14:34:11 +0000 (15:34 +0100)]
ceph_key: remove 'update' state

With this change, the state `present` is enough to update a keyring.
If the keyring already exist, it will be updated if caps or secret
passed to the module are different.
If the keyring doen't exist, it will be created.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1808367
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 553584cbd0d014429e665f998776e8d198f72d2b)

5 years agoosd: use default crush rule name when needed
Guillaume Abrioux [Fri, 27 Mar 2020 15:21:09 +0000 (16:21 +0100)]
osd: use default crush rule name when needed

When `rule_name` isn't set in `crush_rules` the osd pool creation will
fail.
This commit adds a new fact `ceph_osd_pool_default_crush_rule_name` with
the default crush rule name.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1817586
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1bb9860dfd32a79587f053a417a841a57b6c0192)

5 years agotests: add more coverage in external_clients scenario
Guillaume Abrioux [Fri, 27 Mar 2020 16:56:26 +0000 (17:56 +0100)]
tests: add more coverage in external_clients scenario

Run create_users_keys.yml in external_clients scenario

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8c1c34b20157f91910ec9052dabcc543e30c2035)

5 years agoosd: support changing default rule even when osd_crush_location isn't defined
Guillaume Abrioux [Thu, 12 Mar 2020 11:14:01 +0000 (12:14 +0100)]
osd: support changing default rule even when osd_crush_location isn't defined

Creating crush rules even with no crush hierarchy configuration is a
valid scenario so we shouldn't be bound to the first task result (which
configure crush hierarchy) to be able to add new crush rules.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816989
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5b0476385ccb00a9edb9092a183c18e2637afd5d)

5 years agopurge-container: get *all* osds id
Guillaume Abrioux [Tue, 31 Mar 2020 11:59:23 +0000 (13:59 +0200)]
purge-container: get *all* osds id

Adding `--all` to the `systemctl list-units` command in order to get
*all* osds id on the node (including stoppped osds). Otherwise, it will
purge the cluster but there will be leftover after that.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1814542
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5e7962ccf6c9fa35f5611888826131c4e9e8f043)

5 years agoFixes for Makefile
Javier Pena [Fri, 8 Nov 2019 09:38:32 +0000 (10:38 +0100)]
Fixes for Makefile

- Set default mock configuration to epel-8-x86_64, to match the
  default dist value.
- Add support for alpha tags, like the recently added v5.0.0alpha1

Signed-off-by: Javier Pena <jpena@redhat.com>
(cherry picked from commit 19a43ff26134b1a59b4cd74cc27364ab754f3d03)

5 years agoAllow setting dist and mock configuration in Makefile
Javier Pena [Wed, 23 Oct 2019 14:45:41 +0000 (16:45 +0200)]
Allow setting dist and mock configuration in Makefile

Curently, the dist and mock configurations are hardcoded in the
Makefile to be el8 and epel-7-x86_64, respectively. This commit
allows the user to override those settings using the DIST and
MOCK_CONFIG environment variables, falling back to the current
defaults if not set.

This provides additional flexibility when building the RPM directly
from the repository.

Signed-off-by: Javier Peña <jpena@redhat.com>
(cherry picked from commit ed8341568bd4f25243bed7eecb3c0dfa7f3e007b)

5 years agoceph_volume: fix multiple db/wal devices
Dimitri Savineau [Fri, 27 Mar 2020 21:16:41 +0000 (17:16 -0400)]
ceph_volume: fix multiple db/wal devices

When using the lvm batch ceph-volume subcommand with dedicated devices
for bluestore (db/wal) then the list of devices is convert to a string
instead of being extended via an iterable.
This was working with only one dedicated device but starting with more
then the ceph_volume module fails.

TASK [ceph-osd : use ceph-volume lvm batch to create bluestore osds] **
fatal: [xxxxxx]: FAILED! => changed=true
  cmd:
  - ceph-volume
  - --cluster
  - ceph
  - lvm
  - batch
  - --bluestore
  - --yes
  - --prepare
  - --osds-per-device
  - '4'
  - /dev/nvme2n1
  - /dev/nvme3n1
  - /dev/nvme4n1
  - /dev/nvme5n1
  - /dev/nvme6n1
  - --db-devices
  - /dev/nvme0n1 /dev/nvme1n1
  - --report
  - --format=json
  msg: non-zero return code
  rc: 2
  stderr: |2-
     stderr: lsblk: /dev/nvme0n1 /dev/nvme1n1: not a block device
     stderr: error: /dev/nvme0n1 /dev/nvme1n1: No such file or directory
     stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
    usage: ceph-volume lvm batch [-h] [--db-devices [DB_DEVICES [DB_DEVICES ...]]]
                                 [--wal-devices [WAL_DEVICES [WAL_DEVICES ...]]]
                                 [--journal-devices [JOURNAL_DEVICES [JOURNAL_DEVICES ...]]]
                                 [--no-auto] [--bluestore] [--filestore]
                                 [--report] [--yes] [--format {json,pretty}]
                                 [--dmcrypt]
                                 [--crush-device-class CRUSH_DEVICE_CLASS]
                                 [--no-systemd]
                                 [--osds-per-device OSDS_PER_DEVICE]
                                 [--block-db-size BLOCK_DB_SIZE]
                                 [--block-wal-size BLOCK_WAL_SIZE]
                                 [--journal-size JOURNAL_SIZE] [--prepare]
                                 [--osd-ids [OSD_IDS [OSD_IDS ...]]]
                                 [DEVICES [DEVICES ...]]
    ceph-volume lvm batch: error: Unable to proceed with non-existing device: /dev/nvme0n1 /dev/nvme1n1

So the dedicated device list is considered as a single string.

This commit also adds the block_db_devices and wal_devices documentation
to the ceph_volume module.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1816713
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 760b6cd7b0b671d63e2d6ea32ec336edd2f8cd32)

5 years agoceph-defaults: update container tag for nautilus
Dimitri Savineau [Fri, 27 Mar 2020 19:53:07 +0000 (15:53 -0400)]
ceph-defaults: update container tag for nautilus

The latest Ceph stable release is now Octopus so the "latest" container
image tag is pointing to Octopus and not Nautilus anymore.
This commit updates the ceph_docker_image_tag with "latest-nautilus".

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agorhcs: drop debian support
Dimitri Savineau [Thu, 26 Mar 2020 21:39:09 +0000 (17:39 -0400)]
rhcs: drop debian support

Support for debian with RHCS has been dropped starting RHCS 4

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4ac99223b2dff5cf264e1b1632bf89583bff3a25)

5 years agoAdd a release note
Dimitri Savineau [Thu, 26 Mar 2020 21:56:33 +0000 (17:56 -0400)]
Add a release note

This adds a release not for the Ceph Nautilus release used in the
stable-4.0 branch.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-defaults: update ganesha to 2.8
Dimitri Savineau [Thu, 26 Mar 2020 14:57:03 +0000 (10:57 -0400)]
ceph-defaults: update ganesha to 2.8

With Ceph Nautilus release we should use nfs-ganesha 2.8

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agodefaults: remove legacy comment
Guillaume Abrioux [Thu, 26 Mar 2020 06:38:36 +0000 (07:38 +0100)]
defaults: remove legacy comment

This is no longer true, let's remove this comment given that this option
is not ignored in containerized deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e551b5ba1a65e653540b5b5c7cb3a4f5d32b2540)

5 years agotests: add inventory host for 5.0 upgrade job
Guillaume Abrioux [Thu, 26 Mar 2020 10:20:47 +0000 (11:20 +0100)]
tests: add inventory host for 5.0 upgrade job

This inventory is intended to be used in the upgrade scenario in
stable-5.0 branch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoceph-facts: fix rgw_instances_all fact
Dimitri Savineau [Tue, 24 Mar 2020 21:11:44 +0000 (17:11 -0400)]
ceph-facts: fix rgw_instances_all fact

The rgw_instances_all fact is supposed to be the list of all radosgw
instances from all rgw nodes.
But the fact is always using the local rgw_instances variable so this
won't work on multiple nodes.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0487d21938c91d4f3a48cb7157aa9b9a00f21f8f)

5 years agoceph-defaults: regenerate group_vars samples v4.0.17
Dimitri Savineau [Mon, 16 Dec 2019 20:19:35 +0000 (15:19 -0500)]
ceph-defaults: regenerate group_vars samples

In fc02fc9 the group_vars samples have been generated but only for
monitor_address variable not radosgw_address.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b46839bad0732486db8a8c698f5963716c3b2b5b)

5 years agonfs: fix nfs with external ceph cluster support
Guillaume Abrioux [Thu, 19 Mar 2020 19:44:20 +0000 (20:44 +0100)]
nfs: fix nfs with external ceph cluster support

This commit refact and fix the nfs deployment with external ceph cluster
support.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1814942
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cc28d9ec2669f14fa2e75627093aa65905af969f)

5 years agodashboard: allow to set read-only admin user
Dimitri Savineau [Wed, 18 Mar 2020 14:53:40 +0000 (10:53 -0400)]
dashboard: allow to set read-only admin user

This commit allows one to set the role for the admin user as read-only.
This can be controlled via the dashboard_admin_user_ro variable but the
default value is false for backward compatibility.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1810176
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit fb69f6990ce0bf4c9cd4caf9ce7a29e15ab07cfd)

5 years agoceph-defaults: update grafana container tag
Dimitri Savineau [Mon, 16 Mar 2020 21:52:30 +0000 (17:52 -0400)]
ceph-defaults: update grafana container tag

Since 8e8aa73 we're using grafana 5.4.3 in RHCS 4.1 via [1].
We should also update the grafana container tag from docker.io when
using the community release.

[1] registry.redhat.io/rhceph/rhceph-4-dashboard-rhel8:4

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit b97a4d520158319a12c06ac169f645f733e47cd9)

5 years agorhcs_edits: Update grafana version
Boris Ranto [Mon, 16 Mar 2020 16:08:03 +0000 (17:08 +0100)]
rhcs_edits: Update grafana version

We are planning to release updated grafana image for ceph dashboard in
RHCS 4.1. We need to update the rhcs edut to point to the new image
then.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786107
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 8e8aa735e0193786a11e36c84e3dfeb5ec50e95f)

5 years agoceph-facts: Fix system_secret_key variable handling
petruha [Mon, 16 Mar 2020 16:35:20 +0000 (17:35 +0100)]
ceph-facts: Fix system_secret_key variable handling

This commit fixes the system_secret_key variable not substitued by the
right value and always using the 'system_secret_key' string instead.

$ egrep 'system_(access|secret)_key' group_vars/all.yml
system_access_key: foofoofoofoofoofoofo
system_secret_key: barbarbarbarbarbarbarbarbarbarbarbarbarb

$ ansible-playbook -vv -i hosts site.yml.sample -e rgw_multisite=true
(...)
  - hostname: storage0
    endpoint: http://192.168.100.42:8080
    instance_name: rgw0
    radosgw_address: 192.168.50.3
    radosgw_frontend_port: 8085
    rgw_realm: canada
    rgw_zone: montreal
    rgw_zone_user: justin.trudeau
    rgw_zone_user_display_name: Justin Trudeau
    rgw_zonegroup: quebec
    system_access_key: foofoofoofoofoofoofo
    system_secret_key: system_secret_key

Fixes https://github.com/ceph/ceph-ansible/issues/5150

Signed-off-by: petruha <5363545+p37ruh4@users.noreply.github.com>
(cherry picked from commit 73b3fadb0ed4cd9eba62669c189fa76178431f07)

5 years agoconfig: remove legacy option in ceph.conf.j2
Guillaume Abrioux [Mon, 16 Mar 2020 08:55:20 +0000 (09:55 +0100)]
config: remove legacy option in ceph.conf.j2

This option has been deprecated (As of 0.51).
By the way, ceph-ansible already sets the
auth_{service,client,cluster}_required variables.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623586
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 152c2caa9f8e2d250f5d11db7543c951b2ee5d4d)

5 years agodoc: update infra playbooks statements
Dimitri Savineau [Mon, 24 Feb 2020 18:58:38 +0000 (13:58 -0500)]
doc: update infra playbooks statements

We don't need to copy the infrastructure playbooks in the root
ceph-ansible directory.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 195944b123c0dc6780f3e3ede070f2ad9235f54d)

5 years agohandler: add rgw multi-instances support v4.0.16
Dimitri Savineau [Thu, 12 Mar 2020 16:06:55 +0000 (17:06 +0100)]
handler: add rgw multi-instances support

This commit adds the rgw multi-instances support in ceph-handler
(restart_rgw_daemons.sh.j2)

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3626c688cfb655ef2ab31a68dd9aedc806ad9038)

5 years agorgw: add multi-instances support when deploying multisite
Guillaume Abrioux [Mon, 9 Mar 2020 10:05:01 +0000 (11:05 +0100)]
rgw: add multi-instances support when deploying multisite

This commit adds the multi-instances when deploying rgw multisite

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 60a2e28189758c239e543d360fef8e66e70e7bf7)

5 years agoceph-infra: open radosgw ports for multi instances
Dimitri Savineau [Wed, 11 Mar 2020 02:41:27 +0000 (22:41 -0400)]
ceph-infra: open radosgw ports for multi instances

When using the radosgw multi instances configuration then the firewall
rules aren't adapted to that setup.
We only open the port according to the radosgw_frontend_port variable
so only the first radosgw instance port will be opened in the firewall
configuration.
We should instead iterate over the rgw_instances list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e8bf0a0cf2fdd9d02e442b6778b8b3f76a1c9473)

5 years agorgw: fix a typo in create_realm_zonegroup_zone_lists
Guillaume Abrioux [Tue, 10 Mar 2020 13:07:24 +0000 (14:07 +0100)]
rgw: fix a typo in create_realm_zonegroup_zone_lists

This commit fixes a typo.

`s/realms/secondary_realms`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b3bbd6bb774de5281f117efd146557ca691aa69c)

5 years agoinfra: add retries/until on firewalld start task
Guillaume Abrioux [Mon, 9 Mar 2020 09:40:54 +0000 (10:40 +0100)]
infra: add retries/until on firewalld start task

This commit make that task retrying 5 times to start the service
firewalld to avoid failure like following:

```
TASK [ceph-infra : start firewalld] ********************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-centos-container-purge/roles/ceph-infra/tasks/configure_firewall.yml:22
Monday 09 March 2020  08:58:48 +0000 (0:00:00.963)       0:02:16.457 **********
fatal: [osd4]: FAILED! => changed=false
  msg: |-
    Unable to enable service firewalld: Created symlink from /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service to /usr/lib/systemd/system/firewalld.service.
    Created symlink from /etc/systemd/system/multi-user.target.wants/firewalld.service to /usr/lib/systemd/system/firewalld.service.
    Failed to execute operation: Connection reset by peer
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b3d943fe9f1edc2f62af5c147997ef8378c82d84)

5 years agofilestore-to-bluestore: stop ceph-volume services
Dimitri Savineau [Thu, 5 Mar 2020 19:18:33 +0000 (14:18 -0500)]
filestore-to-bluestore: stop ceph-volume services

We only disable the ceph-osd services but not the ceph-volume lvm
services during the filestore to bluestore migration.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 38a683e5bf18e0e8ae45ee004715cdb0fae1e09d)

5 years agofilestore-to-bluestore: reuse dedicated journal
Dimitri Savineau [Mon, 3 Feb 2020 20:03:17 +0000 (15:03 -0500)]
filestore-to-bluestore: reuse dedicated journal

If the filestore configuration was using a dedicated journal with either
a partition or a LV/VG then we need to reuse this for bluestore DB.

When filestore is using a raw devices then we shouldn't destroy
everything (data + journal) but only data otherwise the journal
partition won't exist anymore.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790479
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 535da53d69bdb3702acfcc23167a5f6a872811ba)

5 years agotests/requirements: bump testinfra
Dimitri Savineau [Thu, 5 Mar 2020 14:52:56 +0000 (09:52 -0500)]
tests/requirements: bump testinfra

3.4 is the latest testinfra release available but python2 is dropped
starting 4.0.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ccec67aa6a5e1a36f9a22b784baf0cefe0b1d002)

5 years agorgw: add retry/until on pools tasks
Guillaume Abrioux [Fri, 6 Mar 2020 07:06:37 +0000 (08:06 +0100)]
rgw: add retry/until on pools tasks

Sometimes, these task can timeout for some reason.
Adding these retries can help to avoid unexcepted failures.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7a8a719e7568f7c3b9936230e506e37869007fd6)

5 years agotests: fix update scenario
Guillaume Abrioux [Wed, 4 Mar 2020 21:52:24 +0000 (22:52 +0100)]
tests: fix update scenario

Update this scenario due to recent changes.
e361c0ff94de34dd0d627ffe6e6dd616f41c26e3 added more nodes in order to
cover code erasure pool creation in the CI. Therefore, we must add more
nodes in the 3.2 deployment too.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoclient: skip create_users_keys.yml when rolling_update
Guillaume Abrioux [Wed, 4 Mar 2020 15:33:46 +0000 (16:33 +0100)]
client: skip create_users_keys.yml when rolling_update

There's no need to run this part of the role when upgrading clients
node. Let's skip it when rolling_update.yml is being run.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit eac207091b0574e16084c62e71372b510f25de6b)

5 years agotests: add more osd nodes in all_daemons scenario
Guillaume Abrioux [Tue, 3 Mar 2020 18:01:27 +0000 (19:01 +0100)]
tests: add more osd nodes in all_daemons scenario

This commit adds more osd nodes in all_daemons scenario in order to test
erasure pool creation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9f0c6df94fabdfcd177603dfa72aadbbb5c6539a)

5 years agotests: update ooo job
Guillaume Abrioux [Tue, 3 Mar 2020 14:06:40 +0000 (15:06 +0100)]
tests: update ooo job

This commit changes the value passed for the attribute 'rule_name' in
openstack_pools definition. It doesn't make sense to have emptry string
as passed value here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 248978596a1b444e1f49e5083a58195474925e0a)

5 years agoosd: do not change pool size on erasure pool
Guillaume Abrioux [Tue, 3 Mar 2020 09:47:19 +0000 (10:47 +0100)]
osd: do not change pool size on erasure pool

This commit adds condition in order to not try to customize pools size
when its type is erasure.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e17c79b871600b5488148a32c994e888fff0919f)

5 years agotests: add erasure pool creation test in CI
Guillaume Abrioux [Mon, 2 Mar 2020 16:05:18 +0000 (17:05 +0100)]
tests: add erasure pool creation test in CI

This commit makes the CI testing an OSD pool erasure creation due to the
recent refact of the OSD pool creation tasks in the playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8cacba1f544216720f9b555bbf8265ff41f4176c)

5 years agotests: enable pg autoscaler on 1 pool
Guillaume Abrioux [Fri, 28 Feb 2020 17:38:38 +0000 (18:38 +0100)]
tests: enable pg autoscaler on 1 pool

This commit enables the pg autoscaler on 1 pool.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a3b797e059cdd69be74841c652cdcaaa9862a169)

5 years agoosd: add pg autoscaler support
Guillaume Abrioux [Fri, 28 Feb 2020 15:03:15 +0000 (16:03 +0100)]
osd: add pg autoscaler support

This commit adds the pg autoscaler support.

The structure for pool definition has now two additional attributes
`pg_autoscale_mode` and `target_size_ratio`, eg:

```
test:
  name: "test"
  pg_num: "{{ osd_pool_default_pg_num }}"
  pgp_num: "{{ osd_pool_default_pg_num }}"
  rule_name: "replicated_rule"
  application: "rbd"
  type: 1
  erasure_profile: ""
  expected_num_objects: ""
  size: "{{ osd_pool_default_size }}"
  min_size: "{{ osd_pool_default_min_size }}"
  pg_autoscale_mode: False
  target_size_ratio": 0.1
```

when `pg_autoscale_mode` is `True` user has to set a decent value in
`target_size_ratio`.

Given that it's a new feature, it's still disabled by default.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1782253
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 47adc2bb08b18845c34d90bd0dafb43298e6bee5)

5 years agoosd: refact osd pool creation
Guillaume Abrioux [Fri, 28 Feb 2020 10:41:00 +0000 (11:41 +0100)]
osd: refact osd pool creation

Currently, the command executed is wrong, eg:

```
  cmd:
  - podman
  - exec
  - ceph-mon-controller-0
  - ceph
  - --cluster
  - ceph
  - osd
  - pool
  - create
  - volumes
  - '32'
  - '32'
  - replicated_rule
  - '1'
  delta: '0:00:01.625525'
  end: '2020-02-27 16:41:05.232705'
  item:
```

From documentation, the osd pool creation command is :

```
ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] \
     [crush-rule-name] [expected-num-objects]
ceph osd pool create {pool-name} {pg-num}  {pgp-num}   erasure \
     [erasure-code-profile] [crush-rule-name] [expected_num_objects]
```

it means we pass '1' (from item.type) as value for
`expected_num_objects` by default which is very likely not what we want.

Also, this commit modifies the default value when no `rule_name` is set
to use the existing variable `osd_pool_default_crush_rule`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1808495
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bf1f125d718c386264f19ce82cae5060ac166ce6)

5 years agorgw multisite: enable more than 1 realm per cluster
Ali Maredia [Fri, 4 Oct 2019 19:31:25 +0000 (19:31 +0000)]
rgw multisite: enable more than 1 realm per cluster

Make it so that more than one realm, zonegroup,
or zone can be created during a run of the rgw
multisite ansible playbooks.

The rgw hosts now need to be grouped into zones
and realms in the inventory.

.yml files need to be created in group_vars
for the realms and zones. Sample yaml files
are available.

Also remove multsite destroy playbook
and add --cluster before radosgw-admin commands

remove manually added rgw_zone_endpoints var
and have ceph-ansible automatically add the
correct endpoints of all the rgws in a rgw_zone
from the information provided in that rgws hostvars.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
(cherry picked from commit 71f55bd54d923f610e451e11828378c14f1ec1cc)

5 years agoshrink-rbdmirror: fix presence after removal
Dimitri Savineau [Fri, 28 Feb 2020 16:37:14 +0000 (11:37 -0500)]
shrink-rbdmirror: fix presence after removal

We should add retry/delay to check the presence of the rbdmirror daemon
in the cluster status because the status takes some time to be updated.
Also the metadata.hostname isn't a good key to check because it doesn't
reflect the ansible_hostname fact. We should use metadata.id instead.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d1316ce77b796ce969c2fd36c975baca3ffe0bc9)

5 years agoshrink-mgr: fix systemd condition
Dimitri Savineau [Wed, 26 Feb 2020 16:16:56 +0000 (11:16 -0500)]
shrink-mgr: fix systemd condition

This playbook was using mds systemd condition.
Also a command task was using pipeline which is not allowed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a664159061d377d3dc03703dc3927a4d022a6b97)

5 years agotox: update shrink scenario configuration
Dimitri Savineau [Wed, 26 Feb 2020 16:03:32 +0000 (11:03 -0500)]
tox: update shrink scenario configuration

The shrink scenarios don't need the docker variables (except for OSD).
Removing pytest for shrink-mgr.
Adding environment variables for xxx_to_kill ansible variable.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2f4413f5ce620d736db895862c8023134736ec85)

5 years agoshrink: don't use localhost node
Dimitri Savineau [Wed, 26 Feb 2020 15:20:05 +0000 (10:20 -0500)]
shrink: don't use localhost node

The ceph-facts are running on localhost so if this node is using a
different OS/release that the ceph node we can have a mismatch between
docker/podman container binary.
This commit also reduces the scope of the ceph-facts role because we only
need the container_binary tasks.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 08ac2e30340e5ed6c69a74e538a109731c5e6652)

5 years agopurge: stop rgw instances by iteration
Dimitri Savineau [Fri, 31 Jan 2020 15:42:10 +0000 (10:42 -0500)]
purge: stop rgw instances by iteration

It looks like that the service module doesn't support wildcard anymore
for stopping/disabling multiple services.

fatal: [rgw0]: FAILED! => changed=false
  msg: 'This module does not currently support using glob patterns,
        found ''*'' in service name: ceph-radosgw@*'
...ignoring

Instead we should iterate over the rgw_instances list.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9d3b49293d785b9b2b6bb902bce0ae8bce6b4e6e)

5 years agoceph-infra: install firewalld python bindings
Dimitri Savineau [Fri, 15 Nov 2019 15:37:27 +0000 (10:37 -0500)]
ceph-infra: install firewalld python bindings

When using the firewalld ansible module we need to be sure that the
python bindings are installed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 90b1fc8fe94a5349ca6f2e42f578636a877dc5d3)

5 years agoceph-infra: split firewalld tasks
Dimitri Savineau [Fri, 15 Nov 2019 15:11:33 +0000 (10:11 -0500)]
ceph-infra: split firewalld tasks

Since ansible 2.9 the firewalld task could not be used with service and
source in the same time anymore.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 45fb9241c06ae46486c0fb9664decfaa87600acf)

5 years agoAdd ansible 2.9 support
Dimitri Savineau [Fri, 1 Nov 2019 13:28:03 +0000 (09:28 -0400)]
Add ansible 2.9 support

This commit adds ansible 2.9 support in addition of 2.8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit aefba82a2e2ebecd24eb743f687469d070246ca2)

5 years agoceph-validate: start with ansible version test
Dimitri Savineau [Fri, 6 Dec 2019 21:11:51 +0000 (16:11 -0500)]
ceph-validate: start with ansible version test

It doesn't make sense to start validating configuration if the ansible
version isn't the good one.
This commit moves the check_system task as the first task in the
ceph-validate role.
The ansible version test tasks are moved at the top of this file.
Also moving the iscsi kernel tests from check_system to check_iscsi
file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1a77dd7e914bc96614aae0014b39e3dbe5ff526a)

5 years agocommon: support OSDs with more than 2 digits
Guillaume Abrioux [Fri, 21 Feb 2020 09:22:32 +0000 (10:22 +0100)]
common: support OSDs with more than 2 digits

When running environment with OSDs having ID with more than 2 digits,
some tasks don't match the system units and therefore, playbook can fail.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1805643
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a084a2a3470887bfaa3616acac1052664b52e12a)

5 years agorequirements: enforce ansible version requirement
Guillaume Abrioux [Thu, 27 Feb 2020 12:33:31 +0000 (13:33 +0100)]
requirements: enforce ansible version requirement

See https://github.com/advisories/GHSA-3m93-m4q6-mc6v

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a2d2e70ac26b5e3a9dbb3acbf5bbbc2a04defbc2)

5 years agoshrink-osd: support shrinking ceph-disk prepared osds
Guillaume Abrioux [Wed, 19 Feb 2020 17:30:14 +0000 (18:30 +0100)]
shrink-osd: support shrinking ceph-disk prepared osds

This commit adds the ceph-disk prepared osds support

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1796453
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1de2bf99919af771dbd34f5cc768a1ba12fcffc8)

5 years agoshrink-osd: don't run ceph-facts entirely
Guillaume Abrioux [Wed, 19 Feb 2020 12:51:49 +0000 (13:51 +0100)]
shrink-osd: don't run ceph-facts entirely

We need to call ceph-facts only for setting `container_binary`.
Since this task has been isolated we can use `tasks_from` to only execute the
needed task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 55970b18f1506ee581ef6f2f8438a023f0f01807)

5 years agoinfrastructure-playbooks: Run shrink-osd tasks on monitor
Benoît Knecht [Fri, 3 Jan 2020 09:38:20 +0000 (10:38 +0100)]
infrastructure-playbooks: Run shrink-osd tasks on monitor

Instead of running shring-osd tasks on localhost and delegating most of
them to the first monitor, run all of them on the first monitor
directly.

This has the added advantage of becoming root on the monitor only, not
on localhost.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 8b3df4e4180c90141b3fc8ca25573171eb45bb0a)