git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
ceph-ansible.git
3 years agocommon: fix a typo rhcs-5.1 v6.0.25.9
Guillaume Abrioux [Sun, 3 Jul 2022 05:17:58 +0000 (07:17 +0200)]
common: fix a typo

s/of/or ..

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2099828#c25
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2e823b117ef5e6c7396517aa0eb2c8553bd33887)
(cherry picked from commit c36bac39035a1d185662e93e4587818299b586a8)

3 years agopurge: reset-failed ceph-crash v6.0.25.8
Guillaume Abrioux [Mon, 23 May 2022 07:49:10 +0000 (09:49 +0200)]
purge: reset-failed ceph-crash

This ensures we always reset-failed the ceph-crash service.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5ab46f836dd0903b6e009178c3ef950d7524bf32)

3 years agocephadm-adopt: remove legacy directory after adoption v6.0.25.7
Guillaume Abrioux [Wed, 11 May 2022 11:47:46 +0000 (13:47 +0200)]
cephadm-adopt: remove legacy directory after adoption

When this directory is left after the osd adoption, it leads to the following error:

```
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
    host axdesec2ocs1n002.ecommerce.inditex.grp `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config
ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config'.
```

this is because of an unexpected behavior regarding 'config inferring' when a legacy directory is present in /var/lib/ceph.

Note: this doesn't fix the root cause, this is a workaround.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2075510
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 364fe36e927e7d885089aece8f65cd23684f7b32)

3 years agocommon: config rhcs tools repo on all nodes v6.0.25.6
Guillaume Abrioux [Thu, 28 Apr 2022 08:46:35 +0000 (10:46 +0200)]
common: config rhcs tools repo on all nodes

Otherwise `cephadm` can't be installed during cephadm-adopt.yml
playbook execution.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2073480
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 380382a3c117036729a9d18456c43bee3cea60bd)

3 years agovalidate: drop a check v6.0.25.5
Guillaume Abrioux [Mon, 28 Mar 2022 09:49:39 +0000 (11:49 +0200)]
validate: drop a check

Since the ISO install method was removed, ceph-ansible isn't able
to detect whether the user is deploying in a 'disconnected environment'.
Besides, given that ceph-ansible is available only for upgrading to RHCS 5,
this check doesn't make sense anymore, so let's drop it.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2062147
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1cd1fa05601384c5cd8d907b0581fdfaf125c098)
(cherry picked from commit 0d6763d4ef74a217b4122cc77792e0f6e48e4092)

3 years agoUsing another user than root for cephadm ssh connections fails v6.0.25.4
Teoman ONAY [Thu, 17 Mar 2022 14:13:06 +0000 (15:13 +0100)]
Using another user than root for cephadm ssh connections fails

Fixes commit da42f3d139e595d09edfb30334fbc7ce17ffa3fe

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2048734
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit f851d3232c6237c46044fb68fdb70ccc0d0deb3c)
(cherry picked from commit 274a78023795f0c1cacab18606975aed73f6fc7f)

3 years agoupgrade: block upgrade when rgw multisite is active
Guillaume Abrioux [Fri, 18 Mar 2022 12:41:17 +0000 (13:41 +0100)]
upgrade: block upgrade when rgw multisite is active

With this commit, upgrading a cluster from Nautilus to Pacific with
active rgw multisite replication will be blocked.
This is because a lot of bugs are currently present in Pacific regarding
RGW multisite.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063702
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 51bc8cb63642b5b1ca460e79adf4649d92773aa4)
(cherry picked from commit f7b7ba30d9680af3058afbd81290243f81bd3998)

3 years agoupdate: allow qe testing with rgw_multisite v6.0.25.3
Guillaume Abrioux [Wed, 16 Mar 2022 11:58:33 +0000 (12:58 +0100)]
update: allow qe testing with rgw_multisite

This allows QE testing with rgw multisite enabled.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063702
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

3 years agoupgrade: block upgrade when rgw multisite is deployed v6.0.25.2
Guillaume Abrioux [Tue, 15 Mar 2022 10:18:21 +0000 (11:18 +0100)]
upgrade: block upgrade when rgw multisite is deployed

See BZ for details.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063702
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

3 years agoRevert "rolling_update: block upgrade for RHCS deployments"
Guillaume Abrioux [Tue, 15 Mar 2022 10:16:37 +0000 (11:16 +0100)]
Revert "rolling_update: block upgrade for RHCS deployments"

This reverts commit 686de207567cad4895f5b82c393d809484ef3968.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

3 years agoRevert "update: allow qe testing"
Guillaume Abrioux [Tue, 15 Mar 2022 10:16:25 +0000 (11:16 +0100)]
Revert "update: allow qe testing"

This reverts commit 445acc99f754c1ba5cd82c52b443519301e19e1d.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

3 years agoadopt: fix node labelling v6.0.25.1
Guillaume Abrioux [Thu, 3 Mar 2022 12:44:53 +0000 (13:44 +0100)]
adopt: fix node labelling

When using groups of groups, the playbook applies undesired
labels on nodes.
This commit fixes it by applying only the expected labels.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2057528
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 266b6e739c26a931f1c975a990f22906223015ef)
(cherry picked from commit bcab0d7a5571ec3195e79e0b776343f3dda70ec0)

3 years agoAdd cluster custom name support
Teoman ONAY [Thu, 24 Feb 2022 11:01:19 +0000 (12:01 +0100)]
Add cluster custom name support

When using cluster custom names, cephadm commands are executed using
the default admin keyring name, which fails.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit f8c6bba65782a861e5be51432e072efb0c4b0110)
(cherry picked from commit 839ad5927deaa51f0155d31528ca83318fdf9c16)

3 years agoEnable user to change the account used for ssh connection
Teoman ONAY [Mon, 7 Feb 2022 13:23:49 +0000 (14:23 +0100)]
Enable user to change the account used for ssh connection

By default, cephadm uses the root account to connect remotely
to other nodes in the cluster. This change allows choosing
another account.
This commit also allows using a dedicated subnet for cephadm mgmt.

Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit da42f3d139e595d09edfb30334fbc7ce17ffa3fe)
(cherry picked from commit c3ce6fc41ad146c118916d3f0e5819f3519f4f65)

3 years agoupdate: allow qe testing
Guillaume Abrioux [Thu, 3 Mar 2022 14:42:25 +0000 (15:42 +0100)]
update: allow qe testing

Enable QE to test upgrade from 4.x to 5.1.

pass `-e qe_testing=true` to get around the upgrade blocking.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

3 years agorolling_update: block upgrade for RHCS deployments v6.0.25.0
Guillaume Abrioux [Thu, 17 Feb 2022 08:57:42 +0000 (09:57 +0100)]
rolling_update: block upgrade for RHCS deployments

Specific to RHCS deployments. See BZ for details.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2052614
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

3 years agoadopt: fix rbd-mirror adoption v6.0.25
Guillaume Abrioux [Wed, 9 Feb 2022 16:29:29 +0000 (17:29 +0100)]
adopt: fix rbd-mirror adoption

We can't use `{{ cephadm_cmd }}` here because the monitors aren't yet adopted.
We must use `{{ ceph_cmd }}` instead.
This also fixes some filters `| default()` (they must be moved before `| from_json()`)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 94e51d5c14719fa08477a8df1c5cf05d359b5ab5)

3 years agoadopt: fix bug in mon_ip_list set_fact v6.0.24
Guillaume Abrioux [Tue, 8 Feb 2022 17:02:24 +0000 (18:02 +0100)]
adopt: fix bug in mon_ip_list set_fact

`default('{}')` must be before `| from_json`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f30767432b2c3d5df60ff7087b77920ccd1ae0be)
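
The ordering constraint above can be sketched in Python. This is an analogy, not the Jinja2 implementation: the fallback has to be applied to the raw value before parsing, because the parser fails on a missing value before any later fallback can run. The function names are hypothetical.

```python
import json


def parse_with_default_first(raw):
    """Apply the fallback to the raw value, then parse.

    Mirrors the order `default('{}') | from_json`.
    """
    return json.loads(raw if raw is not None else "{}")


def parse_then_default(raw):
    """Parse first and fall back afterwards.

    Mirrors the order `from_json | default('{}')`.
    """
    parsed = json.loads(raw)  # raises TypeError when raw is None
    return parsed if parsed is not None else {}


print(parse_with_default_first(None))
try:
    parse_then_default(None)
except TypeError as exc:
    print("parse failed before the fallback could apply:", type(exc).__name__)
```

In the second function the fallback is never reached for a missing value, which is the situation the commit describes.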

3 years agoadopt: check for POOL_APP_NOT_ENABLED warning
Guillaume Abrioux [Mon, 7 Feb 2022 15:08:40 +0000 (16:08 +0100)]
adopt: check for POOL_APP_NOT_ENABLED warning

This commit makes the cephadm-adopt playbook fail if the cluster
has the `POOL_APP_NOT_ENABLED` warning raised.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2040243
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ddae06e1a23db7a8e45a7adc22421212bc0dedfa)

3 years agoceph-grafana: Add proxy env vars to grafana service template
John Karasev [Tue, 27 Apr 2021 20:52:48 +0000 (13:52 -0700)]
ceph-grafana: Add proxy env vars to grafana service template

When installing grafana plugins, the container makes http requests.
This requires an http proxy, otherwise the installation cannot be performed. This
commit passes the proxy vars from all.yml as env args.
Fixes: ceph#6484, ceph#6481
Signed-off-by: John Karasev <john.karasev@intel.com>
(cherry picked from commit 79ca442d53ba25a463fc5bbb9a863da22cec55d1)

3 years agoRemove the remaining packages
jowsiewski [Thu, 20 Jan 2022 13:24:00 +0000 (14:24 +0100)]
Remove the remaining packages

Signed-off-by: jowsiewski <owsiewski@gmail.com>
(cherry picked from commit 1dfd195c7e3782082cf4fbf43ba3bab19156ce3f)

3 years agoAdd with_pkg tag on package related tasks
Francesco Pantano [Mon, 31 Jan 2022 16:25:19 +0000 (17:25 +0100)]
Add with_pkg tag on package related tasks

In the OpenStack context we let the integration tool (TripleO)
deal with repositories and packages.
This change just adds the with_pkg tag to allow TripleO to skip
both the repositories and packages installation.

Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit 12dd8b5df10c403d09292168e1c9ad51431ff822)

3 years agotests: use centos stream-8 instead of centos 8
Guillaume Abrioux [Mon, 31 Jan 2022 12:42:10 +0000 (13:42 +0100)]
tests: use centos stream-8 instead of centos 8

CentOS 8 is EOL as of December 2021.
Let's use CentOS stream 8 instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit bc36f60e8d44986697c5467bf98b2a68bac6efbd)

3 years agonfs-ganesha: fix debian based OS deployments v6.0.23
Guillaume Abrioux [Wed, 19 Jan 2022 09:19:37 +0000 (10:19 +0100)]
nfs-ganesha: fix debian based OS deployments

Let's use ppa repositories in order to deploy nfs-ganesha on Debian based OS.

Fixes: #7031
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c491e67486d8ed2717b5b9eda96544eb4f7eae2d)

3 years agoadopt: create nfs exports at the user level
Guillaume Abrioux [Fri, 28 Jan 2022 13:12:07 +0000 (14:12 +0100)]
adopt: create nfs exports at the user level

The current implementation is wrong.
ceph-ansible lists all existing buckets and tries to create
an export for each of them.
Instead, it's easier to create the export at the user level.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7f517cdd2253f9bd939659d916f6977c4a237bb8)

3 years agoFix rich version for ansible-lint
Dmitriy Rabotyagov [Thu, 13 Jan 2022 16:17:14 +0000 (18:17 +0200)]
Fix rich version for ansible-lint

Ansible-lint prior to v5.3.1 has an issue with rich version >=11.0.0.
In order to cherry-pick the fix to stable branches, we pin the rich version.

This should be reverted with ansible-lint version bump.

Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>
(cherry picked from commit 583e60af84180f0414d67ee52c3ec7cd64ddb4dd)

3 years agomake grafana network a configurable option
Danny Webb [Tue, 23 Nov 2021 16:28:02 +0000 (16:28 +0000)]
make grafana network a configurable option

Signed-off-by: Danny Webb <danny.webb@thehutgroup.com>
(cherry picked from commit 189ff9337202ce1900a8f6f8c3e48a6e3ecb7519)

3 years agocephadm-adopt: use named args in rgw export creation v6.0.22
Guillaume Abrioux [Thu, 6 Jan 2022 13:33:42 +0000 (14:33 +0100)]
cephadm-adopt: use named args in rgw export creation

In order to avoid breaking changes, let's use named argument
instead of positional argument syntax in the command line
used to create rgw export.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2037691
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit aee1f06497d1e3737d286186d24338a03990644a)

3 years agopurge: remove ceph directories on client nodes
Guillaume Abrioux [Mon, 22 Nov 2021 08:22:45 +0000 (09:22 +0100)]
purge: remove ceph directories on client nodes

Otherwise, ceph directories are left over on client nodes
after the purge.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2024815
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 20035852a4e51f3800808bc4c71b0e0be86b8132)

3 years agocontainer: align systemd units with rpm
Guillaume Abrioux [Wed, 8 Dec 2021 16:37:14 +0000 (17:37 +0100)]
container: align systemd units with rpm

Update `After=` and `Wants=` parameters in container systemd units
and make them be aligned with the systemd units that come
from the packaging.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027440
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f01536ea195a56c3ea2b31c7232391387e909c41)

3 years agoupdate: speed up client play
Guillaume Abrioux [Tue, 9 Nov 2021 14:35:12 +0000 (15:35 +0100)]
update: speed up client play

wip

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 817c03bc0ebee9df270093d6f414d02f8b95866a)

3 years agocommon: remove legacy repositories v6.0.21
Guillaume Abrioux [Wed, 15 Dec 2021 12:25:49 +0000 (13:25 +0100)]
common: remove legacy repositories

As of rhceph-5, those repositories no longer exist.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2032790
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit dc8940fe1c4fefd656cc52c782f6d7297a42d5ed)

3 years agocephadm-adopt: ensure /etc/ceph is present on monitoring node
Guillaume Abrioux [Tue, 7 Dec 2021 20:11:50 +0000 (21:11 +0100)]
cephadm-adopt: ensure /etc/ceph is present on monitoring node

When deploying the monitoring stack on a dedicated node, the directory
`/etc/ceph` has never been created. Therefore, the play for adopting the
monitoring stack fails because it can't write the minimal config file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2029697
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7ece59b41defc81ffd3bf184a24b63b45ec7d097)

3 years agocephadm-adopt: bindmount /var/lib/ceph with 'ro' v6.0.20
Guillaume Abrioux [Tue, 30 Nov 2021 09:00:20 +0000 (10:00 +0100)]
cephadm-adopt: bindmount /var/lib/ceph with 'ro'

When collocating osds with iscsigw daemons, cephadm bindmounts the
following:

```
-v /var/lib/ceph/6126c064-6a9e-4092-8a64-977930df0843/iscsi.rbd.ceph-ameenasuhani-4fs3bq-node5.vomtqb/configfs:/sys/kernel/config
```

this prevents cephadm-adopt playbook from running container and bindmounting `/var/lib/ceph:/var/lib/ceph:z`

since 'ro' is enough in this playbook, let's replace the ':z' option on
this bindmount with ':ro'

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c4fdf956bd7269cb457023c45366d0edc17a8a67)

3 years agoceph_volume: support overriding bind-mounts
Guillaume Abrioux [Tue, 30 Nov 2021 08:52:59 +0000 (09:52 +0100)]
ceph_volume: support overriding bind-mounts

This makes it possible to call `podman run` with custom bind-mounts.

cephadm-adopt.yml playbook needs it for a very specific use case:

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027411
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b02d71c3076de181fab98b1afc0858314414d0f4)

3 years agoadopt: fix ceph_origin and ceph_repository defaults
Guillaume Abrioux [Mon, 29 Nov 2021 09:48:23 +0000 (10:48 +0100)]
adopt: fix ceph_origin and ceph_repository defaults

This is overriding those variables because the precedence at the 'block
var' level is greater than the group_vars/host_vars.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2026861
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5ea2ece993bd54f50a8a7419321734ec03389f2)

3 years agovalidate: fix bug when using vault
Guillaume Abrioux [Wed, 10 Nov 2021 13:32:26 +0000 (14:32 +0100)]
validate: fix bug when using vault

Since a variable encrypted with vault is no longer a string but an
encrypted object, we can't use the filter | length; we have to convert it
to a string first.

Fixes: #6991
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6ad7e5286920fceff6f4483672cbdca44f06a25f)
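
A rough Python analogy of the problem described above, under stated assumptions: `EncryptedValue` is a hypothetical stand-in for a vault-encrypted variable (it is not Ansible's class), chosen only to show why a length check needs a string conversion first.

```python
class EncryptedValue:
    """Hypothetical stand-in for a vault-encrypted variable: not a str, no len()."""

    def __init__(self, plaintext):
        self._plaintext = plaintext

    def __str__(self):
        return self._plaintext


def is_set(value):
    """Check that a value is non-empty by converting it to a string first."""
    return len(str(value)) > 0


secret = EncryptedValue("s3cr3t")
try:
    len(secret)
except TypeError:
    print("len() is not supported on the encrypted object")
print(is_set(secret))
```

The same `is_set` check keeps working for plain strings, so the conversion does not change behavior for unencrypted variables.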

3 years agocephadm: support adding hosts with ipv6 v6.0.19
Guillaume Abrioux [Thu, 28 Oct 2021 12:12:46 +0000 (14:12 +0200)]
cephadm: support adding hosts with ipv6

The current implementation doesn't support adding hosts when using ipv6
addresses.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4f2c2af9b4de8a35e7db2894956d625e69488c34)

3 years agocephadm: use public_network when adding hosts
Guillaume Abrioux [Thu, 28 Oct 2021 12:10:26 +0000 (14:10 +0200)]
cephadm: use public_network when adding hosts

When adding host, using ansible_facts['default_ipv4']['address'] might
not be the desired network, we shouldn't enforce the subnet with the
default route.
Let's use the public_network instead.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2006415
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f34531304cc1f7a718118ca1931ea600e59ea7e)
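
The idea of selecting a host address by network membership, rather than by default route, can be sketched with Python's `ipaddress` module. The helper name and all addresses below are hypothetical and are not taken from the playbook.

```python
import ipaddress


def address_in_network(addresses, public_network):
    """Return the first address that belongs to public_network, or None."""
    network = ipaddress.ip_network(public_network)
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        # An address of a different IP version is never a member of the network.
        if ip.version == network.version and ip in network:
            return addr
    return None


# Hypothetical host with a default-route address and a public-network address.
host_addresses = ["10.0.0.5", "192.168.1.20", "fd00::20"]
print(address_in_network(host_addresses, "192.168.1.0/24"))
print(address_in_network(host_addresses, "fd00::/64"))
```

The same lookup works for IPv4 and IPv6 networks, which is why matching on the configured network is more robust than relying on the IPv4 default route.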

3 years agocephadm-adopt: remove logrotate configuration
Dimitri Savineau [Thu, 28 Oct 2021 21:15:49 +0000 (17:15 -0400)]
cephadm-adopt: remove logrotate configuration

cephadm uses its own logrotate configuration file so ceph-ansible needs
to remove that custom file during the cephadm-adopt playbook.

Closes: #6944
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c41241244e835ada1988b16252b80330b5e80efb)

3 years agoupdate: move a set_fact
Guillaume Abrioux [Thu, 28 Oct 2021 21:40:18 +0000 (23:40 +0200)]
update: move a set_fact

The ceph-facts role makes decisions based on the fact `rolling_update`, so
the set_fact must run before we run this role.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5edcc4214348945e1566641a490b68ef8886cf0)

3 years agoupdate: support --limit on monitor nodes
Guillaume Abrioux [Thu, 28 Oct 2021 14:17:24 +0000 (16:17 +0200)]
update: support --limit on monitor nodes

Change needed in order to support --limit on mon nodes.
Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']`
throws an error:

```
"The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'"
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304#c28
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 82eee4303bce3e41b5043bcb03fa3143dcdfd30d)

3 years agoRevert "update: block upgrade when nfs+rgw is deployed"
Guillaume Abrioux [Tue, 26 Oct 2021 18:31:13 +0000 (20:31 +0200)]
Revert "update: block upgrade when nfs+rgw is deployed"

This reverts commit 93f17652595cc9290cdba9dcba53611bdb9cd07c.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2017508
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

3 years agorolling_update: modify default health_osd_check_*
Guillaume Abrioux [Mon, 25 Oct 2021 12:28:41 +0000 (14:28 +0200)]
rolling_update: modify default health_osd_check_*

let's do more retries with a shorter delay.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 50a21d695eb571bdc5b4d67bde914cf58c502b44)
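
A small simulation of why more retries with a shorter delay can help: with a similar overall time budget, a shorter delay notices a recovered cluster sooner. The numbers and names below are hypothetical and are not the playbook's defaults; a fake clock replaces real sleeping.

```python
def wait_until(check, retries, delay, sleep):
    """Call check() up to `retries` times, sleeping `delay` seconds after each failure."""
    for _ in range(retries):
        if check():
            return True
        sleep(delay)
    return False


class FakeClock:
    """Simulated clock so the example runs instantly."""

    def __init__(self):
        self.now = 0

    def sleep(self, seconds):
        self.now += seconds


def time_to_detect(healthy_at, retries, delay):
    """Return the simulated time at which health is detected, or None on timeout."""
    clock = FakeClock()
    ok = wait_until(lambda: clock.now >= healthy_at, retries, delay, clock.sleep)
    return clock.now if ok else None


# Hypothetical cluster that becomes healthy 40 seconds after the check starts.
print(time_to_detect(healthy_at=40, retries=5, delay=30))
print(time_to_detect(healthy_at=40, retries=50, delay=3))
```

Both configurations allow 150 simulated seconds in total, but the second one polls ten times as often.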

3 years agoadopt: fix rbd mirror adoption v6.0.18
Guillaume Abrioux [Tue, 12 Oct 2021 14:01:20 +0000 (16:01 +0200)]
adopt: fix rbd mirror adoption

The rbd mirroring is broken because cephadm doesn't bindmount /etc/ceph anymore.
It means the keyrings and ceph config file aren't available after the
migration.
The idea here is to remove the current rbd mirror peer and add it back
to the mon config store so we aren't bound to the /etc/ceph directory.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1967440
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9c794aa9bcfab177f437f503bda0ff37dceca319)

3 years agoadopt: use mgr/nfs volume
Guillaume Abrioux [Thu, 14 Oct 2021 22:44:02 +0000 (00:44 +0200)]
adopt: use mgr/nfs volume

use the mgr 'nfs' module to recreate nfs exports.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1954971
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4257410dcdc7e143893a7d442026a92570b03783)

3 years agorolling_update: fix pre and post osd upgrade play
Guillaume Abrioux [Mon, 25 Oct 2021 11:43:25 +0000 (13:43 +0200)]
rolling_update: fix pre and post osd upgrade play

When using --limit osds, the plays before and after the osd upgrade are
skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"`.
Using `hosts: "{{ osds_group_name | default('osds') }}"` with
`delegate_to` pointing at the first monitor addresses this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fc9f87c45f6e58a595f59365b58bf1b0e3909bf5)

3 years agotests: add new scenario subset_update
Guillaume Abrioux [Wed, 20 Oct 2021 07:59:48 +0000 (09:59 +0200)]
tests: add new scenario subset_update

new scenario in order to test the subset upgrade approach using tags.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fb8a66149bc5605c0e51ab137f46c2c48580452a)

3 years agoupdate: support upgrading a subset of nodes
Guillaume Abrioux [Wed, 20 Oct 2021 08:01:05 +0000 (10:01 +0200)]
update: support upgrading a subset of nodes

It can be useful in a large cluster deployment to split the upgrade and
only upgrade a group of nodes at a time.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e5cf9db2b04f55196d867f5a7248b455307f4407)

3 years agoshrink-osd: fix regression because of a wrong regex
Per Abildgaard Toft [Wed, 20 Oct 2021 07:45:16 +0000 (09:45 +0200)]
shrink-osd: fix regression because of a wrong regex

968891f4498da9625acfdd34bfb01fe445d1eef2 introduced a regression.
The regex is wrong because it doesn't allow shrinking osds with an id
greater than 9.

Fixes: #6950
Signed-off-by: Per Abildgaard Toft <per@minfejl.dk>
(cherry picked from commit 84118a3063e38ed9d274cca90d115809353819b4)
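
The kind of regression described in this commit can be illustrated with a minimal Python sketch. The two patterns are hypothetical stand-ins, not the expressions used in the playbook: one accepts a single digit only, the other accepts one or more digits.

```python
import re

# Hypothetical patterns for illustration; the playbook's actual regex is not shown in this log.
SINGLE_DIGIT = re.compile(r"\d")    # with fullmatch: rejects ids greater than 9
ANY_DIGITS = re.compile(r"\d+")     # with fullmatch: accepts any non-negative integer id


def invalid_ids(osd_ids, pattern):
    """Return the ids from a comma-separated string that the pattern rejects."""
    return [i for i in osd_ids.split(",") if not pattern.fullmatch(i)]


print(invalid_ids("1,9,10,42", SINGLE_DIGIT))
print(invalid_ids("1,9,10,42", ANY_DIGITS))
```

A test input that contains at least one multi-digit id is enough to catch this class of mistake.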

3 years agocephadm: set ssh configs at bootstrap step v6.0.17
Seena Fallah [Sat, 9 Oct 2021 22:52:08 +0000 (02:22 +0330)]
cephadm: set ssh configs at bootstrap step

Add support for ssh_user and ssh_config to the cephadm bootstrap plugin

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit ae6be71b081b379e19035d6abc05475ed8a00e5d)

3 years agoshrink-osd: check osd id format
Guillaume Abrioux [Tue, 12 Oct 2021 15:55:40 +0000 (17:55 +0200)]
shrink-osd: check osd id format

This adds a check early in order to ensure the format of osd ids passed
is correct.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2005734
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 968891f4498da9625acfdd34bfb01fe445d1eef2)

3 years agocephadm: install cephadm from repository
Seena Fallah [Wed, 15 Sep 2021 12:53:04 +0000 (17:23 +0430)]
cephadm: install cephadm from repository

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 582293625237c1c02dd28a05097783a2c42e665e)

3 years agocephadm-adopt: configure repository for cephadm installation
Seena Fallah [Thu, 5 Aug 2021 15:48:38 +0000 (20:18 +0430)]
cephadm-adopt: configure repository for cephadm installation

Configure repository for cephadm installation and use package install in both containerized and non containerized deployment

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 339212a7c602ba17538f091c4134ca73c1cd626b)

3 years agoceph-validate: export validate repository vars as a task
Seena Fallah [Thu, 5 Aug 2021 15:47:10 +0000 (20:17 +0430)]
ceph-validate: export validate repository vars as a task

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 4f6da9d92ff264e05dbcd7c9d36398acd1692935)

3 years agoceph-common: export repository configuration to a single task
Seena Fallah [Thu, 5 Aug 2021 15:46:04 +0000 (20:16 +0430)]
ceph-common: export repository configuration to a single task

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit e79bda9a054d6a3ffc69c52711ea9a8bb2bfb514)

3 years agotests: remove all references to ceph_stable_release v6.0.16
Guillaume Abrioux [Wed, 29 Sep 2021 14:25:42 +0000 (16:25 +0200)]
tests: remove all references to ceph_stable_release

this is legacy and not needed anymore.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f277a39dfe4ea7ea7d7f211a6a554866ac519f52)

3 years agoceph-defaults: set ceph_stable_release default to the stable branch release
Seena Fallah [Tue, 21 Sep 2021 07:54:13 +0000 (12:24 +0430)]
ceph-defaults: set ceph_stable_release default to the stable branch release

ceph_stable_release is a legacy from the time when a single branch of ceph-ansible supported more than one release of ceph

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit fb99626987740d676d649b0bce2215bce72ca0cf)

3 years agoAdd ceph_nfs_adopt tag to the cephadm-adopt playbook
Francesco Pantano [Thu, 30 Sep 2021 07:34:37 +0000 (09:34 +0200)]
Add ceph_nfs_adopt tag to the cephadm-adopt playbook

There are existing OpenStack scenarios where nfs is still not managed
by cephadm. For this reason, it is sometimes useful to skip the nfs part of
the adoption playbook and leave this daemon unmanaged.
The purpose of this patch is to provide a tag that enables OpenStack
operators to skip this playbook section.

Closes: https://bugzilla.redhat.com/2009212
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit b7299f258b607dc57e4c9c0ce693261a736d1777)

3 years agocephadm: use cephadm_ssh_user for ssh user
Seena Fallah [Wed, 15 Sep 2021 13:02:05 +0000 (17:32 +0430)]
cephadm: use cephadm_ssh_user for ssh user

Use cephadm_ssh_user to set custom user (not root) for cephadm to ssh to the hosts

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 0b78faa723c818c5dd476fc917199be4a88b1bf3)

3 years agocephadm: add admin label on mon nodes
Guillaume Abrioux [Fri, 1 Oct 2021 12:41:23 +0000 (14:41 +0200)]
cephadm: add admin label on mon nodes

This is needed if you want a copy of the admin keyring on the admin
nodes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b555f1d1cdbcf3e2bf902fe5a64d32adc61d0266)

3 years agodashboard: retry setting rgw-credentials
Guillaume Abrioux [Wed, 29 Sep 2021 06:34:09 +0000 (08:34 +0200)]
dashboard: retry setting rgw-credentials

for some reason, this task can fail in the CI.
Adding a retry can help to avoid this failure.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f8d49827a4194316ada5ff89c41bf4a363f9d3a6)

3 years agotests: add osd node in collocation
Guillaume Abrioux [Tue, 28 Sep 2021 20:24:43 +0000 (22:24 +0200)]
tests: add osd node in collocation

we update the pool size from 1 to 2 in idempotency test
but only 1 node is available.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b6c470c7e238793deb1ec638cbdb61e0ad58b142)

3 years agotests: set rgw_instances in collect-logs.yml
Guillaume Abrioux [Thu, 30 Sep 2021 09:32:12 +0000 (11:32 +0200)]
tests: set rgw_instances in collect-logs.yml

in order to gather rgw logs, we need rgw_instances to be set.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c2e46fe5a5b9ebeab0828da1b7dd6540b3766fb2)

3 years agotests: update collect-logs.yml playbook
Guillaume Abrioux [Thu, 30 Sep 2021 06:23:42 +0000 (08:23 +0200)]
tests: update collect-logs.yml playbook

- change `ceph -s` output to json-pretty.
- gather rgw logs
- add `health detail` command

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b2ccc7234a8413a8bbd5d471da9ff2cf8f3ccde2)

3 years agotests: move collect-logs.yml to ceph-ansible repo
Guillaume Abrioux [Wed, 29 Sep 2021 12:29:58 +0000 (14:29 +0200)]
tests: move collect-logs.yml to ceph-ansible repo

related ceph-build PR: ceph/ceph-build#1914

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 702564518b9cb5019648f1a9edcdc4cc962a36d9)

3 years agodashboard: allow disabling of unused features
Alex Lambert [Tue, 21 Sep 2021 09:14:43 +0000 (10:14 +0100)]
dashboard: allow disabling of unused features

Unconfigured dashboard features can lead to empty tabs in the dashboard
containing no meaningful content. Allow users to disable dashboard features
they know will not be used.

A list of features to be disabled allows the user to define a streamlined
dashboard as standard across deployments. Defaults to disabling no features,
ensuring that users are sure they do not need the dashboard feature before
disabling it.

Signed-off-by: Alex Lambert <lamberta@microsoft.com>
(cherry picked from commit a9680ab17f19cc37809ef016897244f975034666)

3 years agotests: fix container-cephadm job
Guillaume Abrioux [Thu, 16 Sep 2021 14:53:33 +0000 (16:53 +0200)]
tests: fix container-cephadm job

add missing variable `containerized_deployment` in group_vars

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 66f3eb377c20b9d670f68df50079d4a5a125489b)

3 years agocephadm-adopt: add no_log: true
Guillaume Abrioux [Tue, 21 Sep 2021 08:41:53 +0000 (10:41 +0200)]
cephadm-adopt: add no_log: true

Let's add a `no_log: true` on the `cephadm registry-login` task.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0a3b916ee75277e14358af9e8a8aff4ffa194ee6)

3 years agoadopt: stop iscsi services in the first place
Guillaume Abrioux [Fri, 24 Sep 2021 12:45:11 +0000 (14:45 +0200)]
adopt: stop iscsi services in the first place

If old containers are still running, it can make tcmu-runner process
unable to open devices and there's nothing else to do than restarting
the container.

Also, as per discussion with iscsi experts, iscsi should be migrated before
OSDs. (the client should be closed before the server)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000412
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d12efa1ab4fb6c4641d05ba07cb32db9fd1aa409)

4 years agoceph-dashboard: fix object gateway integration
Dimitri Savineau [Tue, 17 Aug 2021 15:27:57 +0000 (11:27 -0400)]
ceph-dashboard: fix object gateway integration

Since [1] multiple ceph dashboard commands have been removed and this is
breaking the current ceph-ansible dashboard with RGW automation.
This removes the following dashboard rgw commands:

- ceph dashboard set-rgw-api-access-key
- ceph dashboard set-rgw-api-secret-key
- ceph dashboard set-rgw-api-host
- ceph dashboard set-rgw-api-port
- ceph dashboard set-rgw-api-scheme

Which are replaced by `ceph dashboard set-rgw-credentials`

The RGW user creation task is also removed.

Finally, this moves the delegate_to statement from the rgw tasks to the block
level.

[1] https://github.com/ceph/ceph/pull/42252

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2ee2194ee0aff2e9d2f185e46366f879b9def1ab)

4 years agocephadm-adopt: use cephadm_ssh_user for ssh user v6.0.15
Seena Fallah [Tue, 27 Jul 2021 17:44:38 +0000 (22:14 +0430)]
cephadm-adopt: use cephadm_ssh_user for ssh user

Use cephadm_ssh_user to set a custom (non-root) user for cephadm to SSH to the hosts.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 67389d08d4657a918af3b01ecd727b536ebfd28d)

4 years agocephadm-adopt: set cephadm registry login info
Daniel Pivonka [Thu, 9 Sep 2021 21:14:10 +0000 (17:14 -0400)]
cephadm-adopt: set cephadm registry login info

Registry login info needs to be stored in the cluster for cephadm and future hosts.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2000103
Signed-off-by: Daniel Pivonka <dpivonka@redhat.com>
(cherry picked from commit 1c50dc29cf9e9d08d668f82b679bd0a308ed5835)

4 years agopurge: add remove_docker tag
Seena Fallah [Mon, 16 Aug 2021 20:37:40 +0000 (01:07 +0430)]
purge: add remove_docker tag

This can help to skip docker removal tasks

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit ff39c8d70b7326f4215d32e78e2f89b632b07008)

4 years agopurge: add container_binary needed for zap osds
Seena Fallah [Mon, 16 Aug 2021 20:08:47 +0000 (00:38 +0430)]
purge: add container_binary needed for zap osds

`container_binary` isn't set anymore in the purge osd play because of a
regression introduced by 60aa70a.
The CI didn't catch it because the play purging node-exporter sets this
variable for all nodes before we run the purge osd play.

This commit fixes this regression.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit a51ce767ca6749b3eb4c0c871e436daf3828e6c6)

4 years agoceph-defaults: set quay.io as the default registry
Dimitri Savineau [Fri, 27 Aug 2021 16:01:27 +0000 (12:01 -0400)]
ceph-defaults: set quay.io as the default registry

Because the ceph container images are now only pushed to the quay.io
registry, this updates the default registry value.
The docker.io registry can still be used but doesn't receive updated
container images.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e7b43c1fc632237376351f43363597c23bd33cb7)

4 years agopurge-dashboard: remove cid files
Dimitri Savineau [Tue, 7 Sep 2021 16:13:37 +0000 (12:13 -0400)]
purge-dashboard: remove cid files

This adds the service cid file cleanup as supported in the classic purge
playbook since b9dd253

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit cddc23f51134a6a95fd7492ee27a5d89bf7ebf9f)

4 years agoceph-container-engine: allow override container_package_name and container_service_name
Seena Fallah [Thu, 5 Aug 2021 11:03:55 +0000 (15:33 +0430)]
ceph-container-engine: allow override container_package_name and container_service_name

Only include specific variables when they are undefined

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit 95bce32270c7f5ea7e397588340b674efd7db63f)

4 years agotests/rgw: use json format output for user info v6.0.14
Dimitri Savineau [Thu, 26 Aug 2021 20:45:07 +0000 (16:45 -0400)]
tests/rgw: use json format output for user info

If the radosgw user already exists then we need to have the output in json
format because we are expecting to load the output with json.loads()
Otherwise we have pytest failure like:

```console
self = <json.decoder.JSONDecoder object at 0x7fa2f00a5fd0>, s = '', idx = 0

    def raw_decode(self, s, idx=0):
        """Decode a JSON document from ``s`` (a ``str`` beginning with
        a JSON document) and return a 2-tuple of the Python
        representation and the index in ``s`` where the document ended.

        This can be used to decode a JSON document from a string that may
        have extraneous data at the end.

        """
        try:
            obj, end = self.scan_once(s, idx)
        except StopIteration as err:
>           raise JSONDecodeError("Expecting value", s, err.value) from None
E           json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f2bd8ae70f0ae5d0c7c7a36bd6ecdad6383feed9)
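
The failure described in the commit above can be reproduced outside pytest. This is a minimal sketch, assuming the command printed nothing on stdout (the `s = ''` in the traceback). `parse_user_info` is a hypothetical helper, not part of the test suite.

```python
import json


def parse_user_info(stdout):
    """Parse command output as JSON, returning None when it is not JSON."""
    try:
        return json.loads(stdout)
    except json.JSONDecodeError:
        # Raised for empty or otherwise non-JSON output.
        return None


# Empty output is the case that raised "Expecting value" in the traceback.
empty = parse_user_info("")

# JSON-formatted output loads into a dict, which is what the test expects.
user = parse_user_info('{"user_id": "test", "keys": []}')
```

In the test itself the simpler choice is the one the commit makes: request JSON output so that `json.loads()` always receives a JSON document.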

4 years agotests/rgw: add timeout 5s to radosgw-admin command
Dimitri Savineau [Tue, 10 Aug 2021 15:57:01 +0000 (11:57 -0400)]
tests/rgw: add timeout 5s to radosgw-admin command

If the radosgw daemons aren't up and running correctly (like not registered
in the servicemap or the OSDs are down) then the radosgw-admin command will hang
forever.
Jenkins will kill the jobs after 3h but we don't want to wait until this global
timeout.
Adding the timeout 5 command to the radosgw-admin commands (which is already
present on other ceph calls) allows the job to fail earlier.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f01ae82eeccb1c3ebc97a84b0d8a547f360741d3)
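
The fail-early idea can be sketched in Python. Note that this uses the `timeout` argument of `subprocess.run` rather than the coreutils `timeout 5` prefix the commit adds, and the hanging command is a stand-in (a sleeping Python process), not `radosgw-admin`. `run_with_timeout` is a hypothetical helper.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd and return its exit code, or None if it exceeded the timeout."""
    try:
        return subprocess.run(cmd, timeout=seconds, capture_output=True).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return None


# Stand-in for a command that hangs: sleeps far longer than the timeout.
hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
quick = [sys.executable, "-c", "print('ok')"]

hung_result = run_with_timeout(hanging, 1)
quick_result = run_with_timeout(quick, 10)
```

With the coreutils command the caller sees exit status 124 on timeout instead of an exception, so a test only needs to check the return code.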

4 years agocephadm-adopt: fix orch host add with FQDN
Dimitri Savineau [Thu, 26 Aug 2021 16:06:11 +0000 (12:06 -0400)]
cephadm-adopt: fix orch host add with FQDN

When a node is configured with FQDN as the hostname value then the
`ceph orch host add` command will fail because the `ansible_hostname` used
by that command contains the short hostname which won't match the current
hostname (FQDN)
Instead we can use the ansible_nodename fact.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1997083
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2630f8d47a790d3b0c02da6e8fcbb01649e354fd)
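
The mismatch can be illustrated with plain string handling. This sketch assumes that `ansible_hostname` is the node name truncated at the first dot while `ansible_nodename` is the full node name. The host name used here is made up.

```python
def short_hostname(nodename):
    """Return the part of the node name before the first dot."""
    return nodename.split(".")[0]


# Hypothetical node whose hostname is configured as an FQDN.
nodename = "node1.example.com"

short = short_hostname(nodename)

# The short name does not match the configured hostname, which is the
# situation in which `ceph orch host add` fails in the commit above.
matches = short == nodename
```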

4 years agocontainer: explicitly pull monitoring images
Dimitri Savineau [Thu, 19 Aug 2021 18:08:06 +0000 (14:08 -0400)]
container: explicitly pull monitoring images

We don't pull the monitoring container images (alertmanager, prometheus,
node-exporter and grafana) in a dedicated task like we're doing for the
ceph container image.
This means that the container image pull is done during the start of the
systemd service.
As a result, pulling the image behind a proxy doesn't work with podman.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1995574
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5bb7240f878ff9b369ea028839d26fd46342ff77)

4 years agoiscsi: don't set default value for trusted_ip_list
Guillaume Abrioux [Wed, 18 Aug 2021 11:23:44 +0000 (13:23 +0200)]
iscsi: don't set default value for trusted_ip_list

It restricts access to the iSCSI API.
It can be left empty if the API isn't going to be accessed from outside the
gateway node.

Even though this seems to be a limited use case, it's better to leave it
empty by default than having a meaningless default value.

We could make this variable mandatory but that would be a breaking
change. Let's just add logic in the template in order to set this
variable in the configuration file only if it was specified by users.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1994930
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 6802b8dddd7f8d1f1c47f4eb3b7dd6a6a48820dc)
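
The "only if specified" logic can be sketched as follows. This is an illustration in Python rather than the role's actual Jinja2 template. The `[config]` section header and the comma-separated value format are assumptions, and `render_gateway_config` is a hypothetical helper.

```python
def render_gateway_config(trusted_ip_list=None):
    """Render config lines, emitting trusted_ip_list only when it is set."""
    lines = ["[config]"]
    if trusted_ip_list:
        # Only written when the user provided at least one address.
        lines.append("trusted_ip_list = " + ",".join(trusted_ip_list))
    return "\n".join(lines) + "\n"


default_cfg = render_gateway_config()
restricted_cfg = render_gateway_config(["192.0.2.10", "192.0.2.11"])
```

Leaving the key out entirely, rather than writing an empty or placeholder value, is what avoids the meaningless default mentioned above.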

4 years agocephadm-adopt: remove ceph-nfs.target
Dimitri Savineau [Wed, 18 Aug 2021 15:15:39 +0000 (11:15 -0400)]
cephadm-adopt: remove ceph-nfs.target

This systemd target doesn't exist at all.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8ba6101bbbfaeefb184d835b6e23db8be58d08ea)

4 years agocontainers: introduce target systemd unit
Guillaume Abrioux [Tue, 10 Aug 2021 13:21:19 +0000 (15:21 +0200)]
containers: introduce target systemd unit

This adds ceph-*.target systemd unit files support for containerized
deployments.
This also fixes a regression introduced by PR #6719 (rgw and nfs systemd
units not getting purged)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1962748
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 09ef465f62fde775bd2490be5b43d7796e2a9c6c)

4 years agoroles: remove leftover from pr #4319
Guillaume Abrioux [Tue, 10 Aug 2021 13:34:50 +0000 (15:34 +0200)]
roles: remove leftover from pr #4319

pr #4319 introduced some useless `become: true` on systemd tasks.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1db8fa89895546571a831289ebbe0f83d02b1e0a)

4 years agoVagrantfile: fallback on 'vagrant_variables.yml.sample'
Guillaume Abrioux [Tue, 10 Aug 2021 14:11:37 +0000 (16:11 +0200)]
Vagrantfile: fallback on 'vagrant_variables.yml.sample'

When using a vagrant command from the root directory of the repo, it
throws an error if no 'vagrant_variables.yml' file is present.

```
Message: Errno::ENOENT: No such file or directory @ rb_sysopen - /home/guits/workspaces/ceph-ansible/vagrant_variables.yml
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3d27f9e7dc7ee775be57c27c3620009f9935ddcc)

4 years agoupdate: gather facts only one time
Guillaume Abrioux [Tue, 17 Aug 2021 14:07:03 +0000 (16:07 +0200)]
update: gather facts only one time

this play doesn't need to gather facts from localhost

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c14e9114baebd155996b42b18744567698178836)

4 years agoceph-mon: do not log monitor keyring
Dimitri Savineau [Wed, 11 Aug 2021 20:01:08 +0000 (16:01 -0400)]
ceph-mon: do not log monitor keyring

We don't want to display the keyring in the ansible log.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e44075abd607648da88b4e3555353a99ecb171a6)

4 years agocommon: do not log keyring secret
Guillaume Abrioux [Mon, 9 Aug 2021 12:57:33 +0000 (14:57 +0200)]
common: do not log keyring secret

let's not display any keyring secret by default in ansible log.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1980744
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7511195738e9d1e8f3d3ec77ad4473fa90d17d22)

4 years agoceph-dashboard: fix TLS cert openssl generation
Dimitri Savineau [Mon, 9 Aug 2021 14:33:40 +0000 (10:33 -0400)]
ceph-dashboard: fix TLS cert openssl generation

With OpenSSL versions prior to 1.1.1 (like CentOS 7 with 1.0.2k), the -addext
option doesn't exist.
As a solution, this uses the default openssl.cnf configuration file as a
template and adds the subjectAltName in the v3_ca section. This temp openssl
configuration file is removed after the TLS certificate creation.
This patch also moves the run_once statement to the block level.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 5e0ace7e5493f7d8299155e915435691a0f1a007)
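
The template approach can be sketched as plain text processing. This is not the role's actual task: `add_subject_alt_name` is a hypothetical helper, the sample configuration is a reduced stand-in for a real openssl.cnf, and the `DNS:`/`IP:` entries are illustrative values.

```python
def add_subject_alt_name(cnf_text, alt_names, section="v3_ca"):
    """Return cnf_text with a subjectAltName line added right after [section]."""
    out = []
    inserted = False
    for line in cnf_text.splitlines():
        out.append(line)
        stripped = line.strip()
        # Section headers may be written "[v3_ca]" or "[ v3_ca ]".
        if not inserted and stripped.startswith("[") and stripped.endswith("]"):
            if stripped[1:-1].strip() == section:
                out.append("subjectAltName = " + ",".join(alt_names))
                inserted = True
    if not inserted:
        raise ValueError("section [%s] not found" % section)
    return "\n".join(out) + "\n"


# Reduced stand-in for the default openssl.cnf.
sample = (
    "[ req ]\n"
    "x509_extensions = v3_ca\n"
    "\n"
    "[ v3_ca ]\n"
    "basicConstraints = critical,CA:true\n"
)

patched = add_subject_alt_name(sample, ["DNS:node1.example.com", "IP:192.0.2.10"])
```

The patched text would be written to a temporary file, passed to `openssl req` with `-config`, and deleted afterwards, matching the cleanup described above.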

4 years agodashboard: subj_alt_names fact refactor
Guillaume Abrioux [Thu, 5 Aug 2021 13:00:49 +0000 (15:00 +0200)]
dashboard: subj_alt_names fact refactor

the current way the variable is built results in:

```
2021-08-03 04:18:23,020 - ceph.ceph - INFO - ok: [ceph-sangadi-4x-indpt6-node1-installer] => changed=false
  ansible_facts:
    subj_alt_names: |-
      subjectAltName=ceph-sangadi-4x-indpt6-node1-installer/subjectAltName=10.0.210.223/subjectAltName=ceph-sangadi-4x-indpt6-node1-installersubjectAltName=ceph-sangadi-4x-indpt6-node2/subjectAltName=10.0.210.252/subjectAltName=ceph-sangadi-4x-indpt6-node2/
```

which is incorrect.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6f1a0634f73ad1f41af613a9452dc9c5f70b2702)

4 years agoFixes typo in rgw-add-users-buckets playbook
VasishtaShastry [Fri, 6 Aug 2021 10:40:19 +0000 (16:10 +0530)]
Fixes typo in rgw-add-users-buckets playbook

Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 478d9fdcb6fe6fb6ef7d00c9fe09dd48acd345cd)

4 years agoadopt: import rgw ssl certificate into kv store
Guillaume Abrioux [Wed, 28 Jul 2021 19:50:15 +0000 (21:50 +0200)]
adopt: import rgw ssl certificate into kv store

Without this, when rgw is managed by cephadm, it fails to start because
the ssl certificate isn't present in the kv store.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987010
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1988404
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 930fc4c8500b62d6fbf487a172e6da5b9b54cb86)

4 years agopodman pids.max default value is 2048, docker's one is 4096 which are
Teoman ONAY [Tue, 3 Aug 2021 14:06:53 +0000 (16:06 +0200)]
podman pids.max default value is 2048, docker's one is 4096 which are
sufficient for the default value (512) of rgw thread pool size.
But if its value is increased close to the pids-limit value,
it does not leave room for the other processes to spawn and run within
the container and the container crashes.

pids-limit set to unlimited regardless of the container engine.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987041
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit 9b5d97adb95a788bc1fdedbba562a9c71a1808be)

4 years agoinfra: use dedicated variables for balancer status
Dimitri Savineau [Tue, 3 Aug 2021 15:58:49 +0000 (11:58 -0400)]
infra: use dedicated variables for balancer status

The balancer status is registered during the cephadm-adopt, rolling_update
and switch2container playbooks. But it is also used in the ceph-handler role
which is included in those playbooks too.
Even if the ceph-handler tasks are skipped for rolling_update and
switch2container, the balancer_status variable is erased with the skip task
result.

play1:
  register: balancer_status
play2:
  register: balancer_status <-- skipped
play3:
  when: (balancer_status.stdout | from_json)['active'] | bool

This leads to issues like:

The conditional check '(balancer_status.stdout | from_json)['active'] | bool'
failed. The error was: Unexpected templating type error occurred on
({% if (balancer_status.stdout | from_json)['active'] | bool %} True
{% else %} False {% endif %}): expected string or buffer.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1982054
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 386661699bcfe05a220de6d58b9d50baa7eb6dc1)
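
The overwrite can be modelled in a few lines of Python. This is a simplified model of `register` semantics, not Ansible code: a skipped task still stores a result, but without `stdout`. The dedicated variable name `balancer_status_update` is used here for illustration only.

```python
import json

registered = {}  # stands in for the host's registered variables


def run_task(register, when=True, stdout=""):
    """Model 'register': a skipped task still overwrites the variable."""
    if when:
        registered[register] = {"stdout": stdout}
    else:
        # A skipped task registers a result that has no stdout.
        registered[register] = {"skipped": True}


# play1 registers the status, then play2's skipped task clobbers it.
run_task("balancer_status", stdout='{"active": true}')
run_task("balancer_status", when=False)

try:
    active = json.loads(registered["balancer_status"].get("stdout"))["active"]
    error = None
except TypeError as exc:
    # json.loads(None) fails, much like from_json in the error above.
    active, error = None, exc

# With a dedicated variable name, the skipped task no longer interferes.
registered.clear()
run_task("balancer_status_update", stdout='{"active": true}')
run_task("balancer_status", when=False)
active_fixed = json.loads(registered["balancer_status_update"]["stdout"])["active"]
```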

4 years agoosds: use osd pool ls instead of osd dump command
Dimitri Savineau [Wed, 28 Jul 2021 18:54:15 +0000 (14:54 -0400)]
osds: use osd pool ls instead of osd dump command

The output of the ceph osd pool ls detail command is a subset of the
ceph osd dump output.

$ ceph osd dump --format json|wc -c
10117
$ ceph osd pool ls detail --format json|wc -c
4740

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 06471a4b82d63ebb35f80d45aa6ae629a4daeedc)

4 years agolibrary: exit on user creation failure
Dimitri Savineau [Wed, 28 Jul 2021 16:27:00 +0000 (12:27 -0400)]
library: exit on user creation failure

When the ceph dashboard user creation fails then the issue is hidden
as we don't check the return code and don't print the error message
in the module output.

This ends up with a failure on the ceph dashboard set roles command saying
that the user doesn't exist.

By failing on the user creation, we will have an explicit explanation of
the issue (like weak password).

Closes: #6197
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 17784624e0fb02080e14d15b4105ef92a78ec8c4)

4 years agorolling_update: get ceph version when mons exist
Dimitri Savineau [Thu, 29 Jul 2021 16:26:33 +0000 (12:26 -0400)]
rolling_update: get ceph version when mons exist

eec3878 introduced a regression for upgrade scenarios where there are no
monitor nodes at all (like ganesha standalone, external clients, etc.)

TASK [get the ceph release being deployed] ************************************
task path: infrastructure-playbooks/rolling_update.yml:121
Thursday 29 July 2021  15:55:29 +0000 (0:00:00.484)       0:00:15.802 *********
fatal: [client0]: FAILED! =>
  msg: '''dict object'' has no attribute ''mons'''

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit e87a47cf0cc0d01050d0cb94cabbb8bc42db0c57)
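
The guard amounts to treating a missing group as empty. This is a minimal sketch with the inventory groups modelled as a plain dict. `mon_hosts` is a hypothetical helper and the group and host names are illustrative.

```python
def mon_hosts(groups, mon_group_name="mons"):
    """Return the monitor hosts, or an empty list if the group is absent."""
    return groups.get(mon_group_name, [])


# Inventory with only external clients: there is no 'mons' group at all.
groups = {"clients": ["client0"]}

try:
    groups["mons"]  # direct lookup, analogous to the failing task
    direct_lookup_failed = False
except KeyError:
    direct_lookup_failed = True

# Only query the ceph version when at least one monitor exists.
should_query_version = len(mon_hosts(groups)) > 0
```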

4 years agoinfrastructure-playbooks: Get Ceph info in check mode
Benoît Knecht [Mon, 26 Jul 2021 15:10:19 +0000 (17:10 +0200)]
infrastructure-playbooks: Get Ceph info in check mode

In the `set osd flags` block, run the Ceph commands that gather information
from the cluster (and don't make any changes to it) even when running in check
mode.

This allows the tasks that depend on the variables set by those tasks to
succeed in check mode.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit d7653dca95247e52c4a6821c1eec00748263082a)