git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
4 years ago  container: force rm --storage on ExecStartPre
Guillaume Abrioux [Thu, 12 Nov 2020 10:34:41 +0000 (11:34 +0100)]
container: force rm --storage on ExecStartPre

This is a workaround to avoid error like following:
```
Error: error creating container storage: the container name "ceph-mgr-magna022" is already in use by "4a5f674e113f837a0cc561dea5d2cd55d16ca159a647b7794ab06c4c276ef701"
```

The error doesn't seem to be 100% reproducible, but it shows up after a
reboot. The only workaround we came up with so far is to run
`podman rm --storage <container>` before starting the container.
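As a sketch, the workaround can be wired into the systemd unit as an
ExecStartPre directive (the drop-in path, unit template and container name
below are illustrative, not the exact ones ceph-ansible generates):

```
# hypothetical drop-in, e.g. /etc/systemd/system/ceph-mgr@.service.d/rm-storage.conf
[Service]
# leading '-' ignores failure when there is no leftover storage to remove
ExecStartPre=-/usr/bin/podman rm --storage ceph-mgr-%i
```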

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1887716
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph-facts: Fix osd_pool_default_crush_rule fact
Benoît Knecht [Wed, 7 Oct 2020 07:44:29 +0000 (09:44 +0200)]
ceph-facts: Fix osd_pool_default_crush_rule fact

The `osd_pool_default_crush_rule` is set based on `crush_rule_variable`, which
is the output of a `grep` command.

However, two consecutive tasks can set that variable, and if the second task is
skipped, it still overwrites the `crush_rule_variable`, leading the
`osd_pool_default_crush_rule` to be set to `ceph_osd_pool_default_crush_rule`
instead of the output of the first task.

This commit ensures that the fact is set right after the `crush_rule_variable`
is assigned, before it can be overwritten.
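The shape of the fix can be sketched as follows: register the grep output and
set the fact immediately, before a later task can overwrite the registered
variable (task names and the parsing expression are illustrative, not the
exact role content):

```yaml
- name: read osd_pool_default_crush_rule from ceph.conf
  command: grep osd_pool_default_crush_rule /etc/ceph/ceph.conf
  register: crush_rule_variable
  failed_when: false

- name: set the fact right away, before the variable can be overwritten
  set_fact:
    osd_pool_default_crush_rule: "{{ crush_rule_variable.stdout.split('=')[-1] | trim if crush_rule_variable.rc == 0 else ceph_osd_pool_default_crush_rule }}"
```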

Closes #5912

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
4 years ago  config: Always use osd_memory_target if set
Gaudenz Steinlin [Mon, 28 Oct 2019 09:41:26 +0000 (10:41 +0100)]
config: Always use osd_memory_target if set

The osd_memory_target variable was only used if it was higher than the
calculated value based on the number of OSDs. This is changed to always
use the value if it is set in the configuration. This allows this value
to be intentionally set lower so that it does not have to be changed
when more OSDs are added later.
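The new behavior boils down to a simple precedence rule, sketched below
(`calculated_osd_memory_target` stands in for the per-OSD calculation and is a
hypothetical name):

```yaml
- name: honor osd_memory_target whenever it is explicitly set
  set_fact:
    # fall back to the calculated value only when the user didn't set one
    _osd_memory_target: "{{ osd_memory_target if osd_memory_target is defined else calculated_osd_memory_target }}"
```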

Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
4 years ago  main: followup on pr 6012
Guillaume Abrioux [Thu, 12 Nov 2020 14:19:42 +0000 (15:19 +0100)]
main: followup on pr 6012

This tag can be set at the play level.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  switch2container: disable ceph-osd enabled-runtime
Dimitri Savineau [Mon, 19 Oct 2020 21:22:31 +0000 (17:22 -0400)]
switch2container: disable ceph-osd enabled-runtime

When deploying ceph OSDs via packages, the ceph-osd@.service unit is
configured as enabled-runtime, and each ceph-osd service inherits that
state. The enabled-runtime systemd state doesn't survive a reboot.
For non-containerized deployments the OSDs still start after a reboot
because the ceph-volume@.service and/or ceph-osd.target units do the
job.

$ systemctl list-unit-files|egrep '^ceph-(volume|osd)'|column -t
ceph-osd@.service     enabled-runtime
ceph-volume@.service  enabled
ceph-osd.target       enabled

When switching to a containerized deployment we stop/disable
ceph-osd@XX.service, ceph-volume and ceph.target and then remove the
systemd unit files.
But the new systemd units for the containerized ceph-osd services still
inherit from the ceph-osd@.service unit file.

As a consequence, if an OSD host reboots after the playbook execution,
the ceph-osd services won't come back because they aren't enabled at
boot.
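A hedged sketch of the resulting fix: explicitly enable the containerized OSD
units so they survive a reboot (the unit name and the `osd_ids` list are
illustrative):

```yaml
- name: enable containerized ceph-osd units at boot
  systemd:
    name: "ceph-osd@{{ item }}"
    enabled: yes
  loop: "{{ osd_ids }}"
```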

This patch also adds a reboot and testinfra run after running the switch
to container playbook.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1881288
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  Add ceph_client tag to execute or skip the playbook
Francesco Pantano [Mon, 9 Nov 2020 16:25:17 +0000 (17:25 +0100)]
Add ceph_client tag to execute or skip the playbook

There are some use cases where the execution of the ceph-ansible client
role needs to be skipped even though the client section of the
inventory isn't empty.
This can happen in contexts where services are colocated or when an
all-in-one deployment is performed.
This change adds a 'ceph_client' tag: it doesn't alter the ceph-ansible
execution flow, but makes it possible to include or exclude a set of
tasks using the tag.
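For illustration, such a tag can be applied at the play level and then
selected or skipped at run time (the play below is a sketch, not the actual
site.yml content):

```yaml
- hosts: clients
  tags: ceph_client
  roles:
    - role: ceph-client
```

Running `ansible-playbook site.yml --skip-tags ceph_client` would then bypass
the client tasks, while `--tags ceph_client` would run only them.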

Signed-off-by: Francesco Pantano <fpantano@redhat.com>
4 years ago  rolling_update: always run cv simple scan/activate
Dimitri Savineau [Mon, 9 Nov 2020 22:02:17 +0000 (17:02 -0500)]
rolling_update: always run cv simple scan/activate

There's no need to use a condition on the ceph release for the
ceph-volume simple commands.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  dashboard: change dashboard_grafana_api_no_ssl_verify default value
Guillaume Abrioux [Tue, 3 Nov 2020 15:32:17 +0000 (16:32 +0100)]
dashboard: change dashboard_grafana_api_no_ssl_verify default value

This sets the `dashboard_grafana_api_no_ssl_verify` default value
according to the length of `dashboard_crt` and `dashboard_key`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  dashboard: enable https by default
Guillaume Abrioux [Tue, 3 Nov 2020 12:49:59 +0000 (13:49 +0100)]
dashboard: enable https by default

see linked bz for details

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1889426
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  osd: Fix number of OSD calculation
Gaudenz Steinlin [Tue, 27 Aug 2019 13:15:35 +0000 (15:15 +0200)]
osd: Fix number of OSD calculation

If some OSDs are to be created while others already exist, the
calculation only counted the OSDs to be created. This changes the
calculation to take all OSDs into account.

Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
4 years ago  rolling_update: fix mgr start with mon collocation
Dimitri Savineau [Fri, 30 Oct 2020 14:54:16 +0000 (10:54 -0400)]
rolling_update: fix mgr start with mon collocation

cec994b introduced a regression when a mgr is collocated with a mon.
During the mon upgrade, the mgr service is masked to avoid it being
restarted on package updates.
The start mgr task then fails because the service is still masked.
Instead we should unmask it.

Fixes: #5983
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  infrastructure: consume ceph_fs module
Dimitri Savineau [Fri, 23 Oct 2020 15:46:30 +0000 (11:46 -0400)]
infrastructure: consume ceph_fs module

bd611a7 introduced the new ceph_fs module but missed some tasks in
rolling_update and shrink-mds playbooks.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  rolling_update: use ceph health instead of ceph -s
Dimitri Savineau [Mon, 26 Oct 2020 23:35:06 +0000 (19:35 -0400)]
rolling_update: use ceph health instead of ceph -s

The ceph status command returns a lot of information stored in variables
and/or facts which could consume resources for nothing.
When checking the cluster health, we're using the health structure in the
ceph status output.
To optimize this, we could use the ceph health command which contains
the same needed information.

$ ceph status -f json | wc -c
2001
$ ceph health -f json | wc -c
46
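As a sketch, the health check can then parse the much smaller `ceph health`
output. The task below is illustrative; that `ceph health -f json` exposes a
top-level `status` field is an assumption based on recent releases:

```yaml
- name: wait for the cluster to be healthy
  command: ceph health -f json
  register: ceph_health
  until: (ceph_health.stdout | from_json).status == 'HEALTH_OK'
  retries: 30
  delay: 10
  changed_when: false
```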

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  rgw/rbdmirror: use service dump instead of ceph -s
Dimitri Savineau [Mon, 26 Oct 2020 21:49:47 +0000 (17:49 -0400)]
rgw/rbdmirror: use service dump instead of ceph -s

The ceph status command returns a lot of information stored in variables
and/or facts which could consume resources for nothing.
When checking the rgw/rbdmirror services status, we're only using the
servicemap structure in the ceph status output.
To optimize this, we could use the ceph service dump command which contains
the same needed information.
This command returns less information and is slightly faster than the ceph
status command.

$ ceph status -f json | wc -c
2001
$ ceph service dump -f json | wc -c
1105
$ time ceph status -f json > /dev/null

real 0m0.557s
user 0m0.516s
sys 0m0.040s
$ time ceph service dump -f json > /dev/null

real 0m0.454s
user 0m0.434s
sys 0m0.020s

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  monitor: use quorum_status instead of ceph status
Dimitri Savineau [Mon, 26 Oct 2020 21:33:45 +0000 (17:33 -0400)]
monitor: use quorum_status instead of ceph status

The ceph status command returns a lot of information stored in variables
and/or facts which could consume resources for nothing.
When checking the quorum status, we're only using the quorum_names
structure in the ceph status output.
To optimize this, we could use the ceph quorum_status command which contains
the same needed information.
This command returns less information.

$ ceph status -f json  | wc -c
2001
$ ceph quorum_status -f json  | wc -c
957
$ time ceph status -f json > /dev/null

real 0m0.577s
user 0m0.538s
sys 0m0.029s
$ time ceph quorum_status -f json > /dev/null

real 0m0.544s
user 0m0.527s
sys 0m0.016s

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  osds: use pg stat command instead of ceph status
Dimitri Savineau [Mon, 26 Oct 2020 15:23:01 +0000 (11:23 -0400)]
osds: use pg stat command instead of ceph status

The ceph status command returns a lot of information stored in variables
and/or facts which could consume resources for nothing.
When checking the pgs state, we're using the pgmap structure in the ceph
status output.
To optimize this, we could use the ceph pg stat command which contains
the same needed information.
This command returns less information (only about pgs) and is slightly
faster than the ceph status command.

$ ceph status -f json | wc -c
2000
$ ceph pg stat -f json | wc -c
240
$ time ceph status -f json > /dev/null

real 0m0.529s
user 0m0.503s
sys 0m0.024s
$ time ceph pg stat -f json > /dev/null

real 0m0.426s
user 0m0.409s
sys 0m0.016s

The data returned by the ceph status is even bigger when using the
nautilus release.

$ ceph status -f json | wc -c
35005
$ ceph pg stat -f json | wc -c
240

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  osds: use ceph osd stat instead of ceph status
wangxiaotong [Sat, 24 Oct 2020 13:59:17 +0000 (21:59 +0800)]
osds: use ceph osd stat instead of ceph status

Improve the way the OSD creation is checked.
This replaces the ceph status command with the ceph osd stat command.
The osdmap structure isn't needed anymore.

$ ceph status -f json | wc -c
2001
$ ceph osd stat -f json | wc -c
132
$ time ceph status -f json > /dev/null

real    0m0.563s
user    0m0.526s
sys     0m0.036s
$ time ceph osd stat -f json > /dev/null

real 0m0.457s
user 0m0.411s
sys 0m0.045s

Signed-off-by: wangxiaotong <wangxiaotong@fiberhome.com>
4 years ago  common: follow up on #5948
Guillaume Abrioux [Mon, 2 Nov 2020 14:56:28 +0000 (15:56 +0100)]
common: follow up on #5948

In addition to f7e2b2c608eef4bbba47586f1e24d6ade1572758

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph-mon: Don't set monitor directory mode recursively
Benoît Knecht [Wed, 28 Oct 2020 15:09:58 +0000 (16:09 +0100)]
ceph-mon: Don't set monitor directory mode recursively

After rolling updates performed with
`infrastructure-playbooks/rolling_updates.yml`, files located in
`/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}` had mode 0755 (including
the keyring), making them world-readable.

This commit separates the task that configured permissions recursively on
`/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}` into two separate tasks:

1. Set the ownership and mode of the directory itself;
2. Recursively set ownership in the directory, but don't modify the mode.
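The two-task split described above could look roughly like this (paths and
ownership follow the commit message; the task names are illustrative):

```yaml
- name: set ownership and mode on the monitor directory itself
  file:
    path: "/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}"
    state: directory
    owner: ceph
    group: ceph
    mode: "0755"

- name: recursively set ownership without touching file modes
  file:
    path: "/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }}"
    state: directory
    recurse: true
    owner: ceph
    group: ceph
```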

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
4 years ago  library: remove unused module import
Dimitri Savineau [Tue, 27 Oct 2020 16:14:19 +0000 (12:14 -0400)]
library: remove unused module import

Move the import at the top of the file and remove unused module import.

- E402 module level import not at top of file
- F401 'xxxx' imported but unused

This also removes the '# noqa E402' statement from the code.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  keyring: use ceph_key module for get-or-create cmd
Dimitri Savineau [Fri, 23 Oct 2020 22:03:49 +0000 (18:03 -0400)]
keyring: use ceph_key module for get-or-create cmd

Instead of using the ceph auth get-or-create command via the ansible
command module, we can use the ceph_key module.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  keyring: use ceph_key module for auth get command
Dimitri Savineau [Fri, 23 Oct 2020 19:19:53 +0000 (15:19 -0400)]
keyring: use ceph_key module for auth get command

Instead of using the ceph auth get command via the ansible command
module, we can use the ceph_key module with the info state.
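A hedged example of the replacement (the exact parameter names accepted by
ceph_key may differ slightly):

```yaml
- name: get client.admin key via the ceph_key module
  ceph_key:
    name: client.admin
    cluster: ceph
    state: info
  register: admin_key_info
```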

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  library/ceph_key: add output format parameter
Dimitri Savineau [Fri, 23 Oct 2020 18:50:52 +0000 (14:50 -0400)]
library/ceph_key: add output format parameter

The ceph_key module currently only supports json output for the info
state.
When using this state on an entity, we sometimes want the output as:
  - plain, for copying it to another node.
  - json, in order to get only a subset of the entity's information
(like the key or caps).

This patch adds the output_format parameter, which defaults to json for
backward compatibility. It also removes the internal, hardcoded
variable of the same name.
In addition to json and plain outputs, xml and yaml values are
available.
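An illustrative use of the new parameter, e.g. to fetch a keyring in plain
format for copying to another node (the entity name is hypothetical):

```yaml
- name: fetch a keyring in plain format
  ceph_key:
    name: client.rgw.myrgw-node
    state: info
    output_format: plain
  register: rgw_keyring
```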

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  openstack: use ceph_keyring_permissions by default
Gaudenz Steinlin [Mon, 10 Aug 2020 09:52:56 +0000 (11:52 +0200)]
openstack: use ceph_keyring_permissions by default

Otherwise this task fails if no permission is set on the item.
Previously the code omitted the mode parameter when it was not set, but
this was lost with commit ab370b6ad823e551cfc324fd9c264633a34b72b5.

Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
4 years ago  podman: force log driver to journald
Dimitri Savineau [Thu, 22 Oct 2020 14:59:15 +0000 (10:59 -0400)]
podman: force log driver to journald

Since we changed the podman configuration to use detach mode and the
forking systemd type, the container logs aren't present in journald
anymore.
The default conmon log driver is k8s-file.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1890439
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  ceph-handler: fix curl ipv6 command with rgw
Dimitri Savineau [Thu, 22 Oct 2020 19:05:12 +0000 (15:05 -0400)]
ceph-handler: fix curl ipv6 command with rgw

When using the curl command with a bracketed ipv6 address, we need to
use the -g option, otherwise the command fails:

$ curl http://[fdc2:328:750b:6983::6]:8080
curl: (3) [globbing] error: bad range specification after pos 9
$ curl -g http://[fdc2:328:750b:6983::6]:8080

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  iscsi: fix ownership on iscsi-gateway.cfg
Guillaume Abrioux [Wed, 21 Oct 2020 12:26:57 +0000 (14:26 +0200)]
iscsi: fix ownership on iscsi-gateway.cfg

This file is currently deployed with mode '0644', making it readable by
any user on the system.
Since it contains sensitive information, it should be readable by the
owner only.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1890119
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  common: drop `fetch_directory` feature
Guillaume Abrioux [Tue, 6 Oct 2020 05:53:06 +0000 (07:53 +0200)]
common: drop `fetch_directory` feature

This commit drops the `fetch_directory` feature.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph-config: ceph.conf rendering refactor
Guillaume Abrioux [Mon, 5 Oct 2020 15:41:20 +0000 (17:41 +0200)]
ceph-config: ceph.conf rendering refactor

This commit cleans up the `main.yml` task file of `ceph-config`.
It drops the local ceph.conf generation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  crash: refact caps definition
Guillaume Abrioux [Mon, 19 Oct 2020 14:57:53 +0000 (16:57 +0200)]
crash: refact caps definition

There is no need to use the `{{ }}` syntax here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph-volume: refresh lvm metadata cache
Guillaume Abrioux [Mon, 19 Oct 2020 08:22:21 +0000 (10:22 +0200)]
ceph-volume: refresh lvm metadata cache

When running rhel8 containers on a rhel7 host, after zapping an OSD
there's a discrepancy with the lvmetad cache that needs to be
refreshed.
Otherwise, the host still sees the lv, which can confuse the user.
If the user tries to redeploy an OSD, it will fail because the LV isn't
present and needs to be recreated.

ie:

```
 stderr: lsblk: ceph-block-8/block-8: not a block device
 stderr: blkid: error: ceph-block-8/block-8: No such file or directory
 stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
usage: ceph-volume lvm prepare [-h] --data DATA [--data-size DATA_SIZE]
                               [--data-slots DATA_SLOTS] [--filestore]
                               [--journal JOURNAL]
                               [--journal-size JOURNAL_SIZE] [--bluestore]
                               [--block.db BLOCK_DB]
                               [--block.db-size BLOCK_DB_SIZE]
                               [--block.db-slots BLOCK_DB_SLOTS]
                               [--block.wal BLOCK_WAL]
                               [--block.wal-size BLOCK_WAL_SIZE]
                               [--block.wal-slots BLOCK_WAL_SLOTS]
                               [--osd-id OSD_ID] [--osd-fsid OSD_FSID]
                               [--cluster-fsid CLUSTER_FSID]
                               [--crush-device-class CRUSH_DEVICE_CLASS]
                               [--dmcrypt] [--no-systemd]
ceph-volume lvm prepare: error: Unable to proceed with non-existing device: ceph-block-8/block-8
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1886534
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph-osd: Fix check mode for start osds tasks
Benoît Knecht [Mon, 19 Oct 2020 09:39:06 +0000 (11:39 +0200)]
ceph-osd: Fix check mode for start osds tasks

Correctly set `osd_ids_non_container.stdout_lines` to an empty list if it's
undefined (i.e. in check mode).

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
4 years ago  ceph-mon: Fix check mode for deploy monitor tasks
Benoît Knecht [Mon, 19 Oct 2020 09:23:59 +0000 (11:23 +0200)]
ceph-mon: Fix check mode for deploy monitor tasks

Skip the `get initial keyring when it already exists` task when both commands
whose `stdout` output it requires have been skipped (e.g. when running in check
mode).

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
4 years ago  ceph-crash: Only deploy key to targeted hosts
Gaudenz Steinlin [Mon, 10 Aug 2020 09:38:47 +0000 (11:38 +0200)]
ceph-crash: Only deploy key to targeted hosts

The current task installs the ceph-crash key to "most" hosts via
"delegate_to". This key is only used by the ceph-crash daemon and should
just be installed on all hosts targeted by this role. There is no need
for using a delegated task.

Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
4 years ago  ceph-osd: start osd after systemd overrides
Guillaume Abrioux [Wed, 14 Oct 2020 06:52:02 +0000 (08:52 +0200)]
ceph-osd: start osd after systemd overrides

The service should be started after the ceph-osd systemd overrides have
been added; otherwise, the overrides aren't taken into account.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1860739
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  container: remove container_binding_name variable
Dimitri Savineau [Fri, 9 Oct 2020 18:34:14 +0000 (14:34 -0400)]
container: remove container_binding_name variable

The container_binding_name package was only mandatory when we were
using the docker modules (docker_image and docker_container), but since
we now manage both docker and podman containers without the dedicated
modules, we can remove it.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  ceph-osd: don't start the OSD services twice
Dimitri Savineau [Wed, 14 Oct 2020 00:43:53 +0000 (20:43 -0400)]
ceph-osd: don't start the OSD services twice

Using the + operator on two lists doesn't filter out duplicate
entries.
Currently each OSD is started (via systemd) twice.
Instead we can use the union filter.
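For illustration, the difference between the two approaches (the list names
are hypothetical):

```yaml
# '+' keeps duplicates:            [0, 1] + [1, 2]        -> [0, 1, 1, 2]
# 'union' de-duplicates (jinja2):  [0, 1] | union([1, 2]) -> [0, 1, 2]
- set_fact:
    osd_ids_to_start: "{{ running_osd_ids | union(created_osd_ids) }}"
```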

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  handler: refact check_socket_non_container
Guillaume Abrioux [Tue, 6 Oct 2020 12:58:46 +0000 (14:58 +0200)]
handler: refact check_socket_non_container

the `stat --printf=%n` command returns something like the following:

```
ok: [osd0] => changed=false
  cmd: |-
    stat --printf=%n /var/run/ceph/ceph-osd*.asok
  delta: '0:00:00.009388'
  end: '2020-10-06 06:18:28.109500'
  failed_when_result: false
  rc: 0
  start: '2020-10-06 06:18:28.100112'
  stderr: ''
  stderr_lines: <omitted>
  stdout: /var/run/ceph/ceph-osd.2.asok/var/run/ceph/ceph-osd.5.asok
  stdout_lines: <omitted>
```

it makes the next task "check if the ceph osd socket is in-use" grep
like this:

```
ok: [osd0] => changed=false
  cmd:
  - grep
  - -q
  - /var/run/ceph/ceph-osd.2.asok/var/run/ceph/ceph-osd.5.asok
  - /proc/net/unix
```

which will obviously fail because this path never exists, breaking the
OSD handler.

Let's use the `find` module instead.
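A sketch of what the `find`-based check could look like (the paths and
patterns follow the stat example above; the task and register names are
illustrative):

```yaml
- name: check for ceph osd sockets
  find:
    paths: /var/run/ceph
    patterns: 'ceph-osd*.asok'
  register: osd_socket_stat
```

Each socket then comes back as a separate item in `osd_socket_stat.files`
instead of a single concatenated string.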

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  Fix Ansible check mode for site.yml.sample playbook
Benoît Knecht [Tue, 1 Sep 2020 09:24:59 +0000 (11:24 +0200)]
Fix Ansible check mode for site.yml.sample playbook

Make sure the `site.yml.sample` playbook can be run in check mode by skipping
tasks that try to read the output of commands that have been skipped.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
4 years ago  tests: change cephfs pool size
Guillaume Abrioux [Tue, 6 Oct 2020 08:55:37 +0000 (10:55 +0200)]
tests: change cephfs pool size

`all_daemons` scenario can't handle pools with `size: 3` because we have
1 osd node in root=HDD and two nodes in root=default.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  library: add radosgw_zone module
Dimitri Savineau [Wed, 26 Aug 2020 22:53:04 +0000 (18:53 -0400)]
library: add radosgw_zone module

This adds radosgw_zone ansible module for replacing the command module
usage with the radosgw-admin zone command.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  library: add radosgw_zonegroup module
Dimitri Savineau [Wed, 12 Aug 2020 21:43:29 +0000 (17:43 -0400)]
library: add radosgw_zonegroup module

This adds radosgw_zonegroup ansible module for replacing the command
module usage with the radosgw-admin zonegroup command.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  library: add radosgw_realm module
Dimitri Savineau [Thu, 6 Aug 2020 13:48:58 +0000 (09:48 -0400)]
library: add radosgw_realm module

This adds radosgw_realm ansible module for replacing the command module
usage with the radosgw-admin realm command.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  library: add radosgw_user module
Dimitri Savineau [Fri, 22 May 2020 19:47:45 +0000 (15:47 -0400)]
library: add radosgw_user module

This adds radosgw_user ansible module for replacing the command module
usage with the radosgw-admin user command.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  infrastructure-playbooks: drop add-osd playbook
Guillaume Abrioux [Tue, 15 Sep 2020 12:11:59 +0000 (14:11 +0200)]
infrastructure-playbooks: drop add-osd playbook

This playbook isn't needed anymore; the same operation can be achieved
by running the main playbook with the `--limit` option.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  library: add ceph_fs module
Dimitri Savineau [Wed, 30 Sep 2020 15:57:20 +0000 (11:57 -0400)]
library: add ceph_fs module

This adds the ceph_fs ansible module for replacing the command module
usage with the ceph fs command.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  flake8: run the workflow conditionally
Dimitri Savineau [Fri, 2 Oct 2020 16:14:36 +0000 (12:14 -0400)]
flake8: run the workflow conditionally

We don't need to run flake8 on the ansible modules and their tests if
they have no modifications.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  flake8: fix pep8 syntax on tests/functional/tests/
Guillaume Abrioux [Sun, 4 Oct 2020 08:32:45 +0000 (10:32 +0200)]
flake8: fix pep8 syntax on tests/functional/tests/

tests/conftest.py and the tests present in tests/functional/tests/ were
missed in the previous commit.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph_key: remove backward compatibility
Dimitri Savineau [Mon, 5 Oct 2020 15:16:44 +0000 (11:16 -0400)]
ceph_key: remove backward compatibility

It's time to remove this backward compatibility. Users had enough time
to convert their openstack_keys and key values.
We now fail in ceph-validate if the caps key isn't set.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  ceph_key: support using different keyring
Guillaume Abrioux [Sat, 3 Oct 2020 04:56:06 +0000 (06:56 +0200)]
ceph_key: support using different keyring

Currently the `ceph_key` module doesn't support using a different
keyring than `client.admin`.
This commit adds the possibility to use a different keyring.

Usage:
```
      ceph_key:
        name: "client.rgw.myrgw-node.rgw123"
        cluster: "ceph"
        user: "client.bootstrap-rgw"
        user_key: /var/lib/ceph/bootstrap-rgw/ceph.keyring
        dest: "/var/lib/ceph/radosgw/ceph-rgw.myrgw-node.rgw123/keyring"
        caps:
          osd: 'allow rwx'
          mon: 'allow rw'
        import_key: False
        owner: "ceph"
        group: "ceph"
        mode: "0400"
```

Where:
`user` corresponds to `-n (--name)`
`user_key` corresponds to `-k (--keyring)`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  rgw: fix multi instances scaleout in baremetal
Guillaume Abrioux [Wed, 23 Sep 2020 15:47:20 +0000 (17:47 +0200)]
rgw: fix multi instances scaleout in baremetal

When rgw and osd are collocated, the current workflow prevents scaling
out the radosgw_num_instances parameter when rerunning the playbook on
baremetal deployments.

When ceph-osd notifies handlers, the rgw handlers are triggered too.
The issue is that they are triggered before the ceph-rgw role is run.
If a scaleout operation is expected on `radosgw_num_instances`, this
causes a problem because the keyrings haven't been created yet, so the
new instances won't start.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1881313
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  tests: reboot and test idempotency on collocation
Guillaume Abrioux [Wed, 23 Sep 2020 15:58:39 +0000 (17:58 +0200)]
tests: reboot and test idempotency on collocation

test reboot and idempotency on collocation scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph-osd: refact `docker_exec_start_osd`
Guillaume Abrioux [Fri, 2 Oct 2020 09:00:29 +0000 (11:00 +0200)]
ceph-osd: refact `docker_exec_start_osd`

This commit drops the nested jinja construction in this set_fact task.
It also renames it to `container_exec_start_osd`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  tests: remove ooo_collocation job
Guillaume Abrioux [Sun, 4 Oct 2020 08:18:39 +0000 (10:18 +0200)]
tests: remove ooo_collocation job

This job is redundant with the 'collocation' job.
The only difference is the osd/rgw collocation, so let's add that use
case to 'collocation'.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 19d683d7acfb5344b38ac1ba4c123dcdd4d80f35)

4 years ago  ceph-volume: dirty hack
Guillaume Abrioux [Fri, 2 Oct 2020 14:12:13 +0000 (16:12 +0200)]
ceph-volume: dirty hack

ceph-volume recently introduced a breaking change because of an `lvm
batch` refactor:
when rerunning `lvm batch --report --format json` on existing OSDs, it
doesn't output valid json on stdout.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  flake8: fix all tests/library/*.py files
Guillaume Abrioux [Thu, 1 Oct 2020 20:28:17 +0000 (22:28 +0200)]
flake8: fix all tests/library/*.py files

This commit modifies all *.py files in ./tests/library/ so flake8
passes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  tests: refact flake8 workflow
Guillaume Abrioux [Thu, 1 Oct 2020 19:59:53 +0000 (21:59 +0200)]
tests: refact flake8 workflow

drop ricardochaves/python-lint action and use `run` steps instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  Revert "tests: disable nfs-ganesha testing"
Dimitri Savineau [Mon, 14 Sep 2020 18:45:33 +0000 (14:45 -0400)]
Revert "tests: disable nfs-ganesha testing"

This reverts commit 7348e9a253518904724b05565c97fa1f35c47006.

The nfs-ganesha rpm build for CentOS 8 has been fixed, and the
nfs-ganesha segfault caused by an issue in librgw has also been fixed.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years ago  defaults: change defaults value
Guillaume Abrioux [Wed, 30 Sep 2020 14:32:56 +0000 (16:32 +0200)]
defaults: change defaults value

This commit changes default values in the default pool definitions.

There's no need to define `pg_num`, `pgp_num`, `size` and `min_size`;
the `ceph_pool` module will use the current defaults if needed.

This also drops the 3 following `set_fact` in `ceph-facts`:

- osd_pool_default_pg_num,
- osd_pool_default_pgp_num,
- osd_pool_default_size_num

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph_pool: update tests
Guillaume Abrioux [Wed, 30 Sep 2020 12:10:20 +0000 (14:10 +0200)]
ceph_pool: update tests

Update test_ceph_pool.py following the recent refactor.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph_pool: improve pg_autoscaler support
Guillaume Abrioux [Wed, 30 Sep 2020 09:42:12 +0000 (11:42 +0200)]
ceph_pool: improve pg_autoscaler support

This commit modifies how the `pg_autoscaler` feature is handled by the
ceph_pool module.

1/ If a pool has the pg_autoscaler feature enabled, we shouldn't try to
update pg/pgp.
2/ Make it more readable

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph_pool: pep8
Guillaume Abrioux [Wed, 30 Sep 2020 07:22:58 +0000 (09:22 +0200)]
ceph_pool: pep8

Adopt pep8 syntax in ceph_pool module

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  ceph_pool: refact module
Guillaume Abrioux [Mon, 28 Sep 2020 21:27:47 +0000 (23:27 +0200)]
ceph_pool: refact module

Remove the complexity around handling current defaults in the running
cluster.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  library: remove legacy file
Guillaume Abrioux [Thu, 1 Oct 2020 09:25:19 +0000 (11:25 +0200)]
library: remove legacy file

This file is a leftover and should have been removed when we dropped the
validate module.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  tests: add github workflows
Guillaume Abrioux [Mon, 7 Sep 2020 07:55:41 +0000 (09:55 +0200)]
tests: add github workflows

Add github workflows, starting with flake8 for now.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  library: flake8 ceph-ansible modules
Wong Hoi Sing Edison [Sun, 6 Sep 2020 02:17:02 +0000 (10:17 +0800)]
library: flake8 ceph-ansible modules

This commit ensures all ceph-ansible modules pass flake8 properly.

Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  tests: remove sleep commands from tox ini files
Guillaume Abrioux [Wed, 30 Sep 2020 15:59:39 +0000 (17:59 +0200)]
tests: remove sleep commands from tox ini files

Since we use the rerun plugin in tox, we shouldn't need these `sleep`
commands.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years ago  fs2bs: support `osd_auto_discovery` scenario (tag: v6.0.0alpha2)
Guillaume Abrioux [Wed, 23 Sep 2020 14:21:21 +0000 (16:21 +0200)]
fs2bs: support `osd_auto_discovery` scenario

This commit adds the `osd_auto_discovery` scenario support in the
filestore-to-bluestore playbook.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1881523
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
4 years agoceph-facts: add get default crush rule from running monitor
Seena Fallah [Sun, 27 Sep 2020 17:11:07 +0000 (20:41 +0330)]
ceph-facts: add get default crush rule from running monitor

When deploying a new monitor node to an existing cluster,
osd_pool_default_crush_rule should be taken from a running monitor,
because the ceph-osd role won't be run and the new monitor would
otherwise end up with an osd_pool_default_crush_rule different from the
other monitors'.
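The approach can be sketched like this (the exact command and JSON shape are assumptions for illustration, not the role's actual implementation): query a running mon, e.g. via its admin socket, and parse the value instead of computing it locally.

```python
import json

# Assumed JSON shape of something like
# `ceph daemon mon.<id> config get osd_pool_default_crush_rule`;
# parse the value so the new monitor inherits the cluster's setting.
def default_crush_rule(daemon_config_json):
    return int(json.loads(daemon_config_json)["osd_pool_default_crush_rule"])
```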

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
4 years agodefaults: change default grafana-server name
Guillaume Abrioux [Fri, 24 Jul 2020 22:05:41 +0000 (00:05 +0200)]
defaults: change default grafana-server name

This changes the default value of the grafana-server group name.
Some tasks are added in ceph-defaults in order to keep backward
compatibility.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years agorgw multisite: check connection for realm endpoint
Ali Maredia [Thu, 17 Sep 2020 04:19:45 +0000 (00:19 -0400)]
rgw multisite: check connection for realm endpoint

This commit adds connection checks before realm pulls:
curl requests are performed against the endpoint being pulled, from
both the mons and the rgws.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731158
Signed-off-by: Ali Maredia <amaredia@redhat.com>
4 years agoRemove unused centos docker tasks
Dimitri Savineau [Fri, 25 Sep 2020 17:49:41 +0000 (13:49 -0400)]
Remove unused centos docker tasks

The `enable extras on centos` task just doesn't work when setting the
variable ceph_docker_enable_centos_extra_repo to true.

fatal: [xxx]: FAILED! => {"changed": false, "msg": "Parameter
'baseurl', 'metalink' or 'mirrorlist' is required."}

The CentOS extras repository is enabled by default so it's pretty
safe to remove this task and the associated variable.

This also removes the ceph_docker_on_openstack variable, as it's an
unused leftover.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years agoceph-handler: set handler on xxx_stat result
Dimitri Savineau [Fri, 25 Sep 2020 18:27:33 +0000 (14:27 -0400)]
ceph-handler: set handler on xxx_stat result

In non containerized deployments we check whether the service is
running via the presence of its socket file.
This is done via the xxx_socket_stat variable, which checks for the
socket file in the /var/run/ceph/ directory.
In some scenarios, we could have the socket file still present in
that directory but not used by any process.
That's why we have the xxx_stat variable, which cleans up those
leftovers.

The problem here is that we set the handler status variables
(like handler_mon_status) based on xxx_socket_stat instead of xxx_stat.
That means we would trigger the handlers if there's an old socket file
present on the system without any associated process.
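The intended logic, reduced to a sketch (ceph-ansible implements this with stat/grep tasks; the function and parameter names here are purely illustrative):

```python
# Trigger handlers only when the socket both exists on disk AND is held
# by a live process; a stale file alone must not cause a restart.
def handler_status(socket_file_exists, pids_using_socket):
    """xxx_socket_stat ~ socket_file_exists; xxx_stat ~ pids_using_socket."""
    return socket_file_exists and len(pids_using_socket) > 0
```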

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1866834
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years agoceph-iscsi: create pool once from monitor
Dimitri Savineau [Sat, 26 Sep 2020 00:58:30 +0000 (20:58 -0400)]
ceph-iscsi: create pool once from monitor

af9f6684 introduced a regression on the ceph iscsi pool creation,
because before that change it was delegated to the first monitor node.
This patch restores the initial workflow.
When the iscsi node doesn't have the admin keyring, the pool
creation fails.
This commit also ensures that the pool creation is only executed once
when having multiple iscsi nodes.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years agoceph-facts: check for mon socket in its own host
Seena Fallah [Sun, 27 Sep 2020 17:04:14 +0000 (20:34 +0330)]
ceph-facts: check for mon socket in its own host

Delegate the mon socket check to its own host to find out whether the mon socket is in use or not.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
4 years agoadd missing boolean filter
Dimitri Savineau [Fri, 25 Sep 2020 16:15:02 +0000 (12:15 -0400)]
add missing boolean filter

Otherwise this will generate an ansible warning about the missing
filter.

[DEPRECATION WARNING]: evaluating xxx as a bare variable, this behaviour
will go away and you might need to add |bool to the expression in the
future.
Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will
be removed in version 2.12.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
4 years agoRevert "ceph-rgw: remove ceph_pool state and default value"
Guillaume Abrioux [Fri, 25 Sep 2020 19:01:16 +0000 (21:01 +0200)]
Revert "ceph-rgw: remove ceph_pool state and default value"

This reverts commit ba3512a8fcffdbbd40fbd41f4e16b0d3ca1ca328.

4 years agoceph-mds: remove unused block condition
Dimitri Savineau [Sat, 26 Sep 2020 01:11:03 +0000 (21:11 -0400)]
ceph-mds: remove unused block condition

Since af9f6684 the cephfs pool(s) creation doesn't use the
fs_pools_created variable anymore, because the ceph_pool module is
idempotent.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agodoc: Update methods.rst
Raffael [Thu, 9 Jul 2020 21:12:28 +0000 (23:12 +0200)]
doc: Update methods.rst

Based on the discussion in issue #5392, this adds a new paragraph to this page.

Signed-off-by: Raffael Luthiger <r.luthiger@huanga.com>
5 years agofacts: support device aliases for (dedicated|bluestore_wal)_devices
Tyler Bishop [Thu, 23 Jul 2020 13:36:01 +0000 (09:36 -0400)]
facts: support device aliases for (dedicated|bluestore_wal)_devices

Just like `devices`, this commit adds support for linux device aliases
for `dedicated_devices` and `bluestore_wal_devices`.

Signed-off-by: Tyler Bishop <tbishop@liquidweb.com>
5 years agolibrary: Fix new-style modules check mode
Benoît Knecht [Tue, 1 Sep 2020 11:06:57 +0000 (13:06 +0200)]
library: Fix new-style modules check mode

Running the `ceph_crush.py`, `ceph_key.py` or `ceph_volume.py` modules in check
mode resulted in the following error:

```
New-style module did not handle its own exit
```

This was due to the fact that they simply returned a `dict` in that case,
instead of calling `module.exit_json()`.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
5 years agoREADME-MULTISITE: Fix syntax issues from markdownlint
Benoît Knecht [Thu, 3 Sep 2020 07:01:16 +0000 (09:01 +0200)]
README-MULTISITE: Fix syntax issues from markdownlint

This commit makes the following changes:

- Remove trailing whitespace;
- Use consistent header levels;
- Fix code blocks;
- Remove hard tabs;
- Fix ordered lists;
- Fix bare URLs;
- Use markdown list of sections.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
5 years agoceph-rgw: remove ceph_pool state and default value
Dimitri Savineau [Mon, 14 Sep 2020 18:34:07 +0000 (14:34 -0400)]
ceph-rgw: remove ceph_pool state and default value

The state is now optional, and default values are handled in the
ceph_pool module itself.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph_pool: add idempotency to absent state
Dimitri Savineau [Thu, 3 Sep 2020 17:11:31 +0000 (13:11 -0400)]
ceph_pool: add idempotency to absent state

When using the "absent" state on a non-existing pool, the ceph_pool
module fails and returns a python traceback.

Instead we should check whether the pool exists and execute the pool
deletion according to the result.
The changed state is now only set when the pool is actually deleted.
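The check-before-delete flow amounts to something like this (the surrounding structure is a sketch, not the module's real code; the delete invocation itself is the standard ceph CLI form):

```python
# Idempotent "absent" handling: deleting a pool that doesn't exist is a
# no-op, and `changed` is only reported when a deletion really happens.
def ensure_absent(pool, existing_pools):
    """Return (changed, cmds): delete only when the pool actually exists."""
    if pool not in existing_pools:
        return False, []
    return True, [["ceph", "osd", "pool", "delete", pool, pool,
                   "--yes-i-really-really-mean-it"]]
```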

This also disables add_file_common_args because we don't manipulate
files with this module.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agorolling_update: remove msgr2 migration
Dimitri Savineau [Fri, 25 Sep 2020 14:20:35 +0000 (10:20 -0400)]
rolling_update: remove msgr2 migration

In Pacific we are sure that users have already migrated to msgr2,
because it was introduced in Nautilus.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-config: remove ceph_release from ceph.conf.j2
Dimitri Savineau [Fri, 25 Sep 2020 14:44:08 +0000 (10:44 -0400)]
ceph-config: remove ceph_release from ceph.conf.j2

We don't use the ceph_release variable in the ceph.conf jinja template.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agodocs: update URLs to point to the RTD links
Kefu Chai [Thu, 24 Sep 2020 16:46:30 +0000 (00:46 +0800)]
docs: update URLs to point to the RTD links

Fixes #5798
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
5 years agoansible.cfg: remove cfg file in infrastructure-playbooks
Guillaume Abrioux [Thu, 24 Sep 2020 02:20:34 +0000 (04:20 +0200)]
ansible.cfg: remove cfg file in infrastructure-playbooks

There's no need to have a copy of this file in the
infrastructure-playbooks directory.
Playbooks in that directory can be run from the root directory of
ceph-ansible.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoansible.cfg: set force_valid_group_names param
Guillaume Abrioux [Thu, 24 Sep 2020 01:51:56 +0000 (03:51 +0200)]
ansible.cfg: set force_valid_group_names param

As of 2.10, group names containing a dash are invalid.
However, setting this option makes it still possible to use a dash in
group names and prevents this warning from showing up.
This may need to be addressed for good in a future Ansible release.
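For reference, the setting looks like this in ansible.cfg (`ignore` is one of the values accepted by the Ansible configuration option):

```
[defaults]
# accept group names containing dashes without transforming them or warning
force_valid_group_names = ignore
```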

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1880476
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agolibrary/ceph_key: set no_log on secret
Dimitri Savineau [Wed, 23 Sep 2020 16:00:30 +0000 (12:00 -0400)]
library/ceph_key: set no_log on secret

We don't need to show this information during the module execution.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoRemove libjemalloc1 installation task
Dmitriy Rabotyagov [Wed, 23 Sep 2020 13:06:33 +0000 (16:06 +0300)]
Remove libjemalloc1 installation task

The libjemalloc1 package is required neither as a ganesha dependency
nor for the package build process, so this task can simply be dropped.

Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>
5 years agocontainer: quote registry password
Dimitri Savineau [Fri, 18 Sep 2020 14:03:13 +0000 (10:03 -0400)]
container: quote registry password

When using a quote in the registry password, we get the following
error:

The error was: ValueError: No closing quotation

To fix this we need to use the quote filter.
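Ansible's Jinja2 `quote` filter is backed by Python's `shlex.quote`; a quick sketch of the effect (the command strings are illustrative only):

```python
import shlex

# A password containing a single quote breaks naive shell interpolation;
# shlex.quote() (what the `quote` filter calls) makes it shell-safe.
password = "sup3r'secret"
unsafe = "podman login -p %s registry.example.com" % password
safe = "podman login -p %s registry.example.com" % shlex.quote(password)
```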

Close: https://bugzilla.redhat.com/show_bug.cgi?id=1880252

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agofacts: fix 'set_fact rgw_instances with rgw multisite'
Guillaume Abrioux [Fri, 18 Sep 2020 07:09:57 +0000 (09:09 +0200)]
facts: fix 'set_fact rgw_instances with rgw multisite'

The current condition doesn't work: as soon as the first iteration is
done, the condition makes the next iterations skip, since
`rgw_instances` was already set by the first iteration.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1859872
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agotests: disable container nfs testing
Dimitri Savineau [Thu, 17 Sep 2020 18:43:08 +0000 (14:43 -0400)]
tests: disable container nfs testing

It looks like nfs-ganesha 3.3 and 4.-dev don't work with recent changes
in librgw 16.0.0.
The nfs-ganesha daemon is segfaulting and restarts in a loop.

See https://tracker.ceph.com/issues/47520

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoceph-infra: include iscsi nodes for logrotate
Dimitri Savineau [Thu, 17 Sep 2020 18:11:22 +0000 (14:11 -0400)]
ceph-infra: include iscsi nodes for logrotate

The iscsi nodes aren't included in the logrotate condition.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoinfra: support log rotation for tcmu-runner
Guillaume Abrioux [Tue, 15 Sep 2020 07:48:31 +0000 (09:48 +0200)]
infra: support log rotation for tcmu-runner

This commit adds the log rotation support for tcmu-runner.

ceph-container related PR: ceph/ceph-container#1726

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1873915
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
5 years agoceph-prometheus: update pool stat counter
Dimitri Savineau [Tue, 15 Sep 2020 13:30:42 +0000 (09:30 -0400)]
ceph-prometheus: update pool stat counter

Since [1] The bytes_used pool counter in prometheus has been renamed
to stored.

Closes: #5781
[1] https://github.com/ceph/ceph/commit/71fe9149

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agocontainer: add optional http(s) proxy option
Dimitri Savineau [Tue, 15 Sep 2020 00:13:13 +0000 (20:13 -0400)]
container: add optional http(s) proxy option

When using a http(s) proxy with either docker or podman we can rely on
the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables.
But with ansible, even if those variables are defined in a source file
then they aren't loaded during the container pull/login tasks.
This implements the http(s) proxy support with docker/podman.
Both implementations are different:
  1/ docker doesn't rely on the environment variables with the CLI.
Those are needed by the docker daemon via systemd.
  2/ podman uses the environment variables so we need to add them to
the login/pull tasks.
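The podman side can be sketched as a task like the following (task and variable names are illustrative, not necessarily the merged implementation):

```
- name: container registry login
  command: podman login --username {{ registry_username }} {{ registry }}
  environment:
    HTTP_PROXY: "{{ ceph_docker_http_proxy | default(omit) }}"
    HTTPS_PROXY: "{{ ceph_docker_https_proxy | default(omit) }}"
    NO_PROXY: "{{ ceph_docker_no_proxy | default(omit) }}"
```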

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876692
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoswitch2container: chown symlink for devices
Dimitri Savineau [Tue, 15 Sep 2020 13:59:06 +0000 (09:59 -0400)]
switch2container: chown symlink for devices

If the OSD directory is using symlinks for referencing devices (like
block, db, wal for bluestore and journal for filestore) then the chown
command could fail to change the owner:group on some systems.

$ ls -hl /var/lib/ceph/osd/ceph-0/
total 28K
lrwxrwxrwx 1 ceph ceph 92 Sep 15 01:53 block -> /dev/ceph-45113532-95ca-471b-bd75-51de46f1339c/osd-data-570a1aee-60c0-44c9-8036-ffed7d67a4e6
-rw------- 1 ceph ceph 37 Sep 15 01:53 ceph_fsid
-rw------- 1 ceph ceph 37 Sep 15 01:53 fsid
-rw------- 1 ceph ceph 55 Sep 15 01:53 keyring
-rw------- 1 ceph ceph  6 Sep 15 01:53 ready
-rw------- 1 ceph ceph  3 Sep 15 02:00 require_osd_release
-rw------- 1 ceph ceph 10 Sep 15 01:53 type
-rw------- 1 ceph ceph  2 Sep 15 01:53 whoami
$ find /var/lib/ceph/osd/ceph-0 -not -user 167 -execdir chown 167:167 {} +
chown: cannot dereference './block': Permission denied
$ find /var/lib/ceph/osd/ceph-0 -not -user 167
/var/lib/ceph/osd/ceph-0/block
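In Python terms, the fix amounts to re-owning entries without following symlinks (a sketch; 167:167 is the ceph uid/gid used in the containers, and the shell equivalent is passing `-h` to chown so the link itself is changed instead of its target):

```python
import os

def chown_osd_dir(path, uid=167, gid=167):
    """Re-own an OSD directory without dereferencing symlinks, so the
    block/db/wal links are re-owned themselves rather than chown failing
    on their targets."""
    os.chown(path, uid, gid, follow_symlinks=False)
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            os.chown(os.path.join(root, name), uid, gid,
                     follow_symlinks=False)
```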

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
5 years agoswitch2container: remove deb systemd units
Dimitri Savineau [Tue, 15 Sep 2020 13:46:30 +0000 (09:46 -0400)]
switch2container: remove deb systemd units

When running the switch2container playbook on a Debian based system,
the systemd unit path isn't the same as on a Red Hat based system.
Because the old systemd unit files aren't removed, the new container
systemd unit isn't taken into account.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>