git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
Guillaume Abrioux [Fri, 2 Oct 2020 09:00:29 +0000 (11:00 +0200)]
ceph-osd: refact `docker_exec_start_osd`
This commit drops nested jinja construction in this set_fact task.
It also renames it to `container_exec_start_osd`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Sun, 4 Oct 2020 08:18:39 +0000 (10:18 +0200)]
tests: remove ooo_collocation job
This job is redundant with 'collocation' job.
The only difference is osd/rgw collocation so let's add this usecase in
'collocation'.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 19d683d7acfb5344b38ac1ba4c123dcdd4d80f35)
Guillaume Abrioux [Fri, 2 Oct 2020 14:12:13 +0000 (16:12 +0200)]
ceph-volume: dirty hack
ceph-volume recently introduced a breaking change because of a `lvm
batch` refactor.
When rerunning `lvm batch --report --format json` on existing OSDs, it
doesn't output valid JSON on stdout.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 1 Oct 2020 20:28:17 +0000 (22:28 +0200)]
flake8: fix all tests/library/*.py files
This commit modifies all *.py files in ./tests/library/ so flake8
passes.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 1 Oct 2020 19:59:53 +0000 (21:59 +0200)]
tests: refact flake8 workflow
drop ricardochaves/python-lint action and use `run` steps instead.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Mon, 14 Sep 2020 18:45:33 +0000 (14:45 -0400)]
Revert "tests: disable nfs-ganesha testing"
This reverts commit 7348e9a253518904724b05565c97fa1f35c47006.
The nfs-ganesha rpm build for CentOS 8 has been fixed, and
the nfs-ganesha segfault caused by an issue in librgw has also been
fixed.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Wed, 30 Sep 2020 14:32:56 +0000 (16:32 +0200)]
defaults: change defaults value
This commit changes the default values in the default pool definitions.
There's no need to define `pg_num`, `pgp_num`, `size` and `min_size`;
the `ceph_pool` module will use the current defaults if needed.
This also drops the 3 following `set_fact` in `ceph-facts`:
- osd_pool_default_pg_num,
- osd_pool_default_pgp_num,
- osd_pool_default_size_num
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 30 Sep 2020 12:10:20 +0000 (14:10 +0200)]
ceph_pool: update tests
update test_ceph_pool.py due to recent refact
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 30 Sep 2020 09:42:12 +0000 (11:42 +0200)]
ceph_pool: improve pg_autoscaler support
This commit modifies how the `pg_autoscaler` feature is handled by the
ceph_pool module.
1/ If a pool has the pg_autoscaler feature enabled, we shouldn't try to
update pg/pgp.
2/ Make it more readable
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
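Point 1/ of the pg_autoscaler commit above can be illustrated with a small Python sketch. It is not the code of the `ceph_pool` module: `pool_keys_to_update` and the dict layout are hypothetical, and the only Ceph-specific assumption is that a pool exposes its `pg_autoscale_mode` with the value `on` when the autoscaler manages it.

```python
def pool_keys_to_update(current, wanted):
    """Return the settings from `wanted` that differ from `current`.

    Hypothetical helper: when the autoscaler is on for the pool,
    pg_num and pgp_num are never part of the update.
    """
    autoscale_on = current.get("pg_autoscale_mode") == "on"
    delta = {}
    for key, value in wanted.items():
        if autoscale_on and key in ("pg_num", "pgp_num"):
            continue  # the autoscaler owns these values
        if current.get(key) != value:
            delta[key] = value
    return delta
```

With the autoscaler on, a requested `pg_num` change is ignored while other differences, such as `size`, are still reported.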
Guillaume Abrioux [Wed, 30 Sep 2020 07:22:58 +0000 (09:22 +0200)]
ceph_pool: pep8
Adopt pep8 syntax in ceph_pool module
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Mon, 28 Sep 2020 21:27:47 +0000 (23:27 +0200)]
ceph_pool: refact module
remove complexity about current defaults in running cluster
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 1 Oct 2020 09:25:19 +0000 (11:25 +0200)]
library: remove legacy file
This file is a leftover and should have been removed when we dropped the
validate module.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Mon, 7 Sep 2020 07:55:41 +0000 (09:55 +0200)]
tests: add github workflows
Add github workflow. Especially for flake8 for now.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Wong Hoi Sing Edison [Sun, 6 Sep 2020 02:17:02 +0000 (10:17 +0800)]
library: flake8 ceph-ansible modules
This commit ensures all ceph-ansible modules pass flake8 properly.
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 30 Sep 2020 15:59:39 +0000 (17:59 +0200)]
tests: remove sleep commands from tox ini files
Since we use the rerun plugin in tox, we shouldn't need to add these
`sleep` commands.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 23 Sep 2020 14:21:21 +0000 (16:21 +0200)]
fs2bs: support `osd_auto_discovery` scenario
This commit adds the `osd_auto_discovery` scenario support in the
filestore-to-bluestore playbook.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1881523
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
Seena Fallah [Sun, 27 Sep 2020 17:11:07 +0000 (20:41 +0330)]
ceph-facts: add get default crush rule from running monitor
When deploying a new monitor node to an existing cluster,
osd_pool_default_crush_rule should be taken from a running monitor because
the ceph-osd role won't be run and the new monitor would otherwise have a
different osd_pool_default_crush_rule from the other monitors.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
Guillaume Abrioux [Fri, 24 Jul 2020 22:05:41 +0000 (00:05 +0200)]
defaults: change default grafana-server name
This changes the default value of the grafana-server group name.
Adding some tasks in ceph-defaults in order to keep backward
compatibility.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Ali Maredia [Thu, 17 Sep 2020 04:19:45 +0000 (00:19 -0400)]
rgw multisite: check connection for realm endpoint
This commit adds connection checks before realm pulls.
Curl requests against the endpoint being pulled are performed from
the mons and the rgws.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1731158
Signed-off-by: Ali Maredia <amaredia@redhat.com>
Dimitri Savineau [Fri, 25 Sep 2020 17:49:41 +0000 (13:49 -0400)]
Remove unused centos docker tasks
The `enable extras on centos` task just doesn't work when the
variable ceph_docker_enable_centos_extra_repo is set to true.
fatal: [xxx]; FAILED! => {"changed": false, "msg": "Parameter
'baseurl', 'metalink' or 'mirrorlist' is required."}
The CentOS extras repository is enabled by default so it's pretty
safe to remove this task and the associated variable.
This also removes the ceph_docker_on_openstack variable as it's a
leftover and it is unused.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Fri, 25 Sep 2020 18:27:33 +0000 (14:27 -0400)]
ceph-handler: set handler on xxx_stat result
In non containerized deployment we check if the service is running
via the socket file presence.
This is done via the xxx_socket_stat variable that checks for the socket
file in the /var/run/ceph/ directory.
In some scenarios, we could have the socket file still present in
that directory but not used by any process.
That's why we have the xxx_stat variable, which cleans up those leftovers.
The problem here is that we set the variable for the handlers status
(like handler_mon_status) based on xxx_socket_stat instead of xxx_stat.
That means we will trigger the handlers if there's an old socket file
present on the system without any process associated.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1866834
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
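The distinction made in the ceph-handler commit above (a socket file that is present versus a socket that is actually used by a process) can be illustrated with a short Python sketch. This is not how ceph-ansible performs the check; `socket_in_use` is a hypothetical helper that simply tries to connect to the unix socket, assuming a Linux host.

```python
import os
import socket


def socket_in_use(path):
    """Return True only if a process is listening on the unix socket at `path`.

    Hypothetical helper for illustration: a leftover socket file with no
    listener is reported as not in use.
    """
    if not os.path.exists(path):
        return False
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
        return True
    except OSError:
        # Typically ConnectionRefusedError: the file exists but nobody listens.
        return False
    finally:
        probe.close()
```

A handler status derived from a check like this would not fire for a stale socket file, which is the behaviour the commit aims for by relying on xxx_stat.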
Dimitri Savineau [Sat, 26 Sep 2020 00:58:30 +0000 (20:58 -0400)]
ceph-iscsi: create pool once from monitor
af9f6684 introduced a regression on the ceph iscsi pool creation
because it was delegated to the first monitor node before that change.
This patch restores the initial workflow.
When the iscsi node doesn't have the admin keyring then the pool
creation fails.
This commit also ensures that the pool creation is only executed once
when having multiple iscsi nodes.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Seena Fallah [Sun, 27 Sep 2020 17:04:14 +0000 (20:34 +0330)]
ceph-facts: check for mon socket in its own host
Delegate to its own host after checking the mon socket to find out whether the mon socket is in use or not.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
Dimitri Savineau [Fri, 25 Sep 2020 16:15:02 +0000 (12:15 -0400)]
add missing boolean filter
Otherwise this will generate an ansible warning about the missing
filter.
[DEPRECATION WARNING]: evaluating xxx as a bare variable, this behaviour
will go away and you might need to add |bool to the expression in the
future.
Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will
be removed in version 2.12.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Fri, 25 Sep 2020 19:01:16 +0000 (21:01 +0200)]
Revert "ceph-rgw: remove ceph_pool state and default value"
This reverts commit ba3512a8fcffdbbd40fbd41f4e16b0d3ca1ca328.
Dimitri Savineau [Sat, 26 Sep 2020 01:11:03 +0000 (21:11 -0400)]
ceph-mds: remove unused block condition
Since af9f6684 the cephfs pool(s) creation doesn't use the fs_pools_created
variable anymore because the ceph_pool module is idempotent.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Raffael [Thu, 9 Jul 2020 21:12:28 +0000 (23:12 +0200)]
doc: Update methods.rst
Based on the discussion in issue #5392, I have added this paragraph to this page.
Signed-off-by: Raffael Luthiger <r.luthiger@huanga.com>
Tyler Bishop [Thu, 23 Jul 2020 13:36:01 +0000 (09:36 -0400)]
facts: support device aliases for (dedicated|bluestore_wal)_devices
Just like `devices`, this commit adds the support for linux device aliases for
`dedicated_devices` and `bluestore_wal_devices`.
Signed-off-by: Tyler Bishop <tbishop@liquidweb.com>
Benoît Knecht [Tue, 1 Sep 2020 11:06:57 +0000 (13:06 +0200)]
library: Fix new-style modules check mode
Running the `ceph_crush.py`, `ceph_key.py` or `ceph_volume.py` modules in check
mode resulted in the following error:
```
New-style module did not handle its own exit
```
This was due to the fact that they simply returned a `dict` in that case,
instead of calling `module.exit_json()`.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
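The check-mode fix described above can be sketched in Python as follows. This is a simplified illustration, not the code of `ceph_crush.py`, `ceph_key.py` or `ceph_volume.py`: `run_module` and `FakeModule` are hypothetical names, and `FakeModule` only mimics the `check_mode` attribute and `exit_json()` method of Ansible's `AnsibleModule` so the snippet runs without Ansible installed.

```python
def run_module(module):
    """Sketch of a new-style module body that handles its own exit.

    `module` is expected to behave like AnsibleModule: it exposes
    `check_mode` and `exit_json(**kwargs)`.
    """
    if module.check_mode:
        # Returning a dict here is what produced
        # "New-style module did not handle its own exit".
        module.exit_json(changed=False)
        return  # only reached with the stand-in; the real exit_json exits
    # ... the real work of the module would go here (omitted) ...
    module.exit_json(changed=True)


class FakeModule:
    """Hypothetical stand-in for AnsibleModule, for illustration only."""

    def __init__(self, check_mode):
        self.check_mode = check_mode
        self.result = None

    def exit_json(self, **kwargs):
        # The real method prints JSON and exits; this one just records it.
        self.result = kwargs
```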
Benoît Knecht [Thu, 3 Sep 2020 07:01:16 +0000 (09:01 +0200)]
README-MULTISITE: Fix syntax issues from markdownlint
This commit makes the following changes:
- Remove trailing whitespace;
- Use consistent header levels;
- Fix code blocks;
- Remove hard tabs;
- Fix ordered lists;
- Fix bare URLs;
- Use markdown list of sections.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
Dimitri Savineau [Mon, 14 Sep 2020 18:34:07 +0000 (14:34 -0400)]
ceph-rgw: remove ceph_pool state and default value
Since the state is now optional and default values are handled in the
ceph_pool module itself.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 3 Sep 2020 17:11:31 +0000 (13:11 -0400)]
ceph_pool: add idempotency to absent state
When using the "absent" state on a non-existing pool, the ceph_pool
module fails and returns a python traceback.
Instead we should check whether the pool exists or not and execute the pool
deletion according to the result.
The changed state is now set only when the pool is actually deleted.
This also disables add_file_common_args because we don't manipulate
files with this module.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
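The idempotent "absent" behaviour described above boils down to "check first, delete only if present, report changed only on an actual deletion". A minimal Python sketch under that reading follows; `remove_pool` is a hypothetical helper working on an in-memory set of pool names, not the module's real implementation.

```python
def remove_pool(existing_pools, name):
    """Delete `name` if it exists; return (changed, remaining_pools).

    Hypothetical illustration: `existing_pools` is any iterable of pool
    names. Asking to remove a missing pool is a no-op, not an error.
    """
    pools = set(existing_pools)
    if name not in pools:
        return False, pools
    pools.discard(name)
    return True, pools
```

Calling it twice with the same name reports a change the first time and none the second time, which is what idempotency means here.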
Dimitri Savineau [Fri, 25 Sep 2020 14:20:35 +0000 (10:20 -0400)]
rolling_update: remove msgr2 migration
In Pacific we are sure that users have already migrated to msgr2 because
it was introduced in Nautilus.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Fri, 25 Sep 2020 14:44:08 +0000 (10:44 -0400)]
ceph-config: remove ceph_release from ceph.conf.j2
We don't use ceph_release variable in the ceph.conf jinja template.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Kefu Chai [Thu, 24 Sep 2020 16:46:30 +0000 (00:46 +0800)]
docs: update URLs to point to the RTD links
Fixes #5798
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Guillaume Abrioux [Thu, 24 Sep 2020 02:20:34 +0000 (04:20 +0200)]
ansible.cfg: remove cfg file in infrastructure-playbooks
There's no need to have a copy of this file in infrastructure-playbooks
directory.
playbooks in that directory can be run from the root dir of
ceph-ansible.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 24 Sep 2020 01:51:56 +0000 (03:51 +0200)]
ansible.cfg: set force_valid_group_names param
As of 2.10, group names containing a dash are invalid.
However, setting this option makes it still possible to use a dash in
group names and prevents this warning from showing up.
It might need to be definitively addressed in a future ansible release.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1880476
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Wed, 23 Sep 2020 16:00:30 +0000 (12:00 -0400)]
library/ceph_key: set no_log on secret
We don't need to show this information during the module execution.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dmitriy Rabotyagov [Wed, 23 Sep 2020 13:06:33 +0000 (16:06 +0300)]
Remove libjemalloc1 installation task
The libjemalloc1 package is required neither as a ganesha dependency nor
for the package build process, so this task can simply be dropped.
Signed-off-by: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>
Dimitri Savineau [Fri, 18 Sep 2020 14:03:13 +0000 (10:03 -0400)]
container: quote registry password
When using a quote in the registry password then we have the following
error:
The error was: ValueError: No closing quotation
To fix this we need to use the quote filter.
Close: https://bugzilla.redhat.com/show_bug.cgi?id=1880252
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
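The `No closing quotation` error quoted above is the message Python's `shlex` raises on an unbalanced quote, and `shlex.quote` is the standard-library way to escape such a value. The sketch below reproduces the failure and the fix outside of Ansible; `login_command` and the `podman login` command line are illustrative assumptions, not the task from the role.

```python
import shlex


def login_command(registry, username, password):
    """Build a shell-safe login command line (hypothetical helper)."""
    return "podman login -u {} -p {} {}".format(
        shlex.quote(username), shlex.quote(password), shlex.quote(registry)
    )


def splits_cleanly(command_line):
    """Return True if the command line can be tokenized by shlex."""
    try:
        shlex.split(command_line)
        return True
    except ValueError:
        # Raised with "No closing quotation" on an unbalanced quote.
        return False
```

With a password such as `pa'ss`, the unquoted command line fails to tokenize, while the one built by `login_command` splits back into the original arguments.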
Guillaume Abrioux [Fri, 18 Sep 2020 07:09:57 +0000 (09:09 +0200)]
facts: fix 'set_fact rgw_instances with rgw multisite'
The current condition doesn't work: as soon as the first iteration is
done, the condition makes the next iterations skip since `rgw_instances` got
set with the first iteration.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1859872
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Thu, 17 Sep 2020 18:43:08 +0000 (14:43 -0400)]
tests: disable container nfs testing
Looks like nfs-ganesha 3.3 and 4.-dev don't work with recent changes
in librgw 16.0.0.
The nfs-ganesha daemon is segfaulting and restarting in a loop.
See https://tracker.ceph.com/issues/47520
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 17 Sep 2020 18:11:22 +0000 (14:11 -0400)]
ceph-infra: include iscsi nodes for logrotate
The iscsi nodes aren't included in the logrotate condition.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Tue, 15 Sep 2020 07:48:31 +0000 (09:48 +0200)]
infra: support log rotation for tcmu-runner
This commit adds the log rotation support for tcmu-runner.
ceph-container related PR: ceph/ceph-container#1726
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1873915
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Tue, 15 Sep 2020 13:30:42 +0000 (09:30 -0400)]
ceph-prometheus: update pool stat counter
Since [1] the bytes_used pool counter in prometheus has been renamed
to stored.
Closes: #5781
[1] https://github.com/ceph/ceph/commit/71fe9149
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Tue, 15 Sep 2020 00:13:13 +0000 (20:13 -0400)]
container: add optional http(s) proxy option
When using a http(s) proxy with either docker or podman we can rely on
the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables.
But with ansible, even if those variables are defined in a source file
then they aren't loaded during the container pull/login tasks.
This implements the http(s) proxy support with docker/podman.
Both implementations are different:
1/ docker doesn't rely on the environment variables with the CLI.
Those are needed by the docker daemon via systemd.
2/ podman uses the environment variables so we need to add them to
the login/pull tasks.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876692
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
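For the podman case described above, where the proxy settings have to be passed explicitly to the login/pull commands, the idea can be sketched in Python. `proxy_env` and its parameter names are hypothetical; the only names taken from the commit message are the `HTTP_PROXY`, `HTTPS_PROXY` and `NO_PROXY` environment variables.

```python
import os


def proxy_env(http_proxy=None, https_proxy=None, no_proxy=None, base=None):
    """Return an environment dict with the proxy variables that are set.

    Hypothetical helper: `base` defaults to a copy of the current process
    environment; unset proxy values are left out rather than exported empty.
    """
    env = dict(os.environ if base is None else base)
    pairs = (
        ("HTTP_PROXY", http_proxy),
        ("HTTPS_PROXY", https_proxy),
        ("NO_PROXY", no_proxy),
    )
    for key, value in pairs:
        if value:
            env[key] = value
    return env
```

The resulting dict could then be handed to a command runner, for example as the `env` argument of `subprocess.run`, so that the pull or login command sees the proxy settings even when no shell profile was sourced.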
Dimitri Savineau [Tue, 15 Sep 2020 13:59:06 +0000 (09:59 -0400)]
switch2container: chown symlink for devices
If the OSD directory is using symlinks for referencing devices (like
block, db, wal for bluestore and journal for filestore) then the chown
command could fail to change the owner:group on some systems.
$ ls -hl /var/lib/ceph/osd/ceph-0/
total 28K
lrwxrwxrwx 1 ceph ceph 92 Sep 15 01:53 block -> /dev/ceph-45113532-95ca-471b-bd75-51de46f1339c/osd-data-570a1aee-60c0-44c9-8036-ffed7d67a4e6
-rw------- 1 ceph ceph 37 Sep 15 01:53 ceph_fsid
-rw------- 1 ceph ceph 37 Sep 15 01:53 fsid
-rw------- 1 ceph ceph 55 Sep 15 01:53 keyring
-rw------- 1 ceph ceph 6 Sep 15 01:53 ready
-rw------- 1 ceph ceph 3 Sep 15 02:00 require_osd_release
-rw------- 1 ceph ceph 10 Sep 15 01:53 type
-rw------- 1 ceph ceph 2 Sep 15 01:53 whoami
$ find /var/lib/ceph/osd/ceph-0 -not -user 167 -execdir chown 167:167 {} +
chown: cannot dereference './block': Permission denied
$ find /var/lib/ceph/osd/ceph-0 -not -user 167
/var/lib/ceph/osd/ceph-0/block
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
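The failure shown above comes from `chown` trying to dereference the `block` symlink. One way to change the ownership of the link itself rather than its target is sketched below in Python with `os.lchown`; this illustrates the idea only and is not the command used in the playbook. `chown_tree_no_follow` is a hypothetical helper.

```python
import os


def chown_tree_no_follow(root, uid, gid):
    """Set owner:group on `root` and everything below it, never following
    symlinks. Returns the list of paths that were processed.

    Passing -1 for uid or gid leaves that id unchanged.
    """
    paths = [root]
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)
    for path in paths:
        # lchown operates on the link itself, so a link whose target is
        # unreadable or missing does not cause a dereference error.
        os.lchown(path, uid, gid)
    return paths
```

With the uid/gid from the listing above this would be called as `chown_tree_no_follow("/var/lib/ceph/osd/ceph-0", 167, 167)`, which requires sufficient privileges.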
Dimitri Savineau [Tue, 15 Sep 2020 13:46:30 +0000 (09:46 -0400)]
switch2container: remove deb systemd units
When running the switch2container playbook on a Debian based system,
the systemd unit path isn't the same as on a Red Hat based system.
Because the systemd unit files aren't removed, the new container
systemd unit isn't taken into account.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Fri, 11 Sep 2020 15:30:33 +0000 (17:30 +0200)]
purge: remove potential socket leftover
This commit ensures we remove any socket left by ceph and the
`ceph-osd-run.sh` script.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1861755
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Mon, 14 Sep 2020 19:13:03 +0000 (15:13 -0400)]
tests/library: rename ceph_dashboard_user class
Rename the test class with the right information.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Mon, 14 Sep 2020 13:14:24 +0000 (15:14 +0200)]
tests: do not run node_exporter test on clients
We need to skip these tests on client nodes since we don't deploy
node_exporter on them anymore
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Fri, 11 Sep 2020 13:34:05 +0000 (09:34 -0400)]
ceph_key: set state as optional
Most ansible modules using a state parameter default to the present
value (when available) instead of using it as a mandatory option.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Fri, 11 Sep 2020 08:29:28 +0000 (10:29 +0200)]
Revert "ceph_pool: use default size/min_size and rule_name"
This reverts commit 142934057f7a0485eca060c02892d8dac22a0f12.
This is already handled in the ceph_pool module itself
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Mon, 24 Aug 2020 20:05:57 +0000 (16:05 -0400)]
dashboard: use run_once at block level
Instead of using run_once: true on each task in a block section, we
can use the run_once statement at the block level.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Fri, 11 Sep 2020 15:25:57 +0000 (11:25 -0400)]
node-exporter: exclude client nodes
We don't need to install node-exporter on client nodes because there
are no ceph services running on them.
This also makes sure we use the group name variables in the prometheus
service template instead of hardcoding the values.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 10 Sep 2020 00:44:54 +0000 (20:44 -0400)]
ceph_pool: set state as optional
Most ansible modules using a state parameter default to the present
value (when available) instead of using it as a mandatory option.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
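Making `state` optional with a default of `present` can be pictured with a small, Ansible-free Python sketch. The argument layout below is an assumption for illustration (the real module accepts more options and possibly more states), and `resolve_params` is a hypothetical stand-in for the defaulting that Ansible's argument handling performs.

```python
# Hypothetical, reduced argument spec: only `name` is mandatory.
ARGUMENT_SPEC = {
    "name": {"type": "str", "required": True},
    "state": {
        "type": "str",
        "required": False,
        "default": "present",
        "choices": ["present", "absent"],
    },
}


def resolve_params(params, spec=ARGUMENT_SPEC):
    """Apply defaults and basic validation to user-supplied parameters."""
    resolved = {}
    for key, rules in spec.items():
        if key in params:
            value = params[key]
        elif rules.get("required"):
            raise ValueError("missing required argument: " + key)
        else:
            value = rules.get("default")
        choices = rules.get("choices")
        if choices is not None and value not in choices:
            raise ValueError("invalid value for {}: {}".format(key, value))
        resolved[key] = value
    return resolved
```

A task that only gives the pool name then behaves as if `state: present` had been written out.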
Dimitri Savineau [Fri, 4 Sep 2020 18:49:07 +0000 (14:49 -0400)]
library: add ceph_dashboard_user module
This adds the ceph_dashboard_user ansible module for replacing the
command module usage with the ceph dashboard ac-user-xxx command.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 10 Sep 2020 00:54:30 +0000 (20:54 -0400)]
ceph_pool: use default size/min_size and rule_name
Before [1] we were using default value for
- size
- min_size
- rule_name
when the key wasn't present in the pool dict.
The commit [1] changed this by defaulting to omit.
This patch restores the original workflow by using facts:
- osd_pool_default_size
- osd_pool_default_min_size
- ceph_osd_pool_default_crush_rule_name
[1] af9f6684f297d223b7bffc77ea50d3eec2665c15
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 10 Sep 2020 17:43:26 +0000 (13:43 -0400)]
tests: add quay registry for collocation baremetal
Even though the non containerized collocation scenario deploys ceph with
RPMs, we also deploy the dashboard/monitoring, but with containers.
This requires setting the registry variable to ceph's quay.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 10 Sep 2020 15:27:37 +0000 (11:27 -0400)]
container: run engine/common roles on first client
We already do this in the site-container.yml playbook because we don't
need docker/podman installed on all client nodes; having the
container image only on the first client node is enough.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 10 Sep 2020 14:12:13 +0000 (10:12 -0400)]
ceph-facts: only get fsid when monitor are present
When running the rolling_update playbook with an inventory without
monitor nodes defined (like external scenario) then we can't retrieve
the cluster fsid from the running monitor.
In this scenario we have to pass this information manually (group_vars
or host_vars).
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877426
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Wed, 9 Sep 2020 22:38:33 +0000 (18:38 -0400)]
ceph-rgw: use ceph_pool module
Since [1] we can use the ceph_pool module instead of using the command
module combined with ceph osd pool commands.
[1] bddcb439ce1b46735946e9fd5d147bc6604bcda3
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Tue, 8 Sep 2020 14:36:20 +0000 (10:36 -0400)]
tests: use grafana from quay.io
This changes the grafana container image registry from docker.io to
quay.io to avoid rate limiting.
This also adds the missing container image values for docker2podman
and podman scenarios.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Guillaume Abrioux [Tue, 8 Sep 2020 08:00:06 +0000 (10:00 +0200)]
tests: clean legacy
clean some legacies since quay.ceph.io migration
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Francesco Pantano [Tue, 8 Sep 2020 11:16:33 +0000 (13:16 +0200)]
Fix hosts field in rolling_update playbook when mds are processed
In the OSP context, during the rolling update the playbook fails
with the following error:
'''
ERROR! The field 'hosts' has an invalid value, which includes an
undefined variable. The error was: list object has no element 0
'''
This PR just changes the hosts field, providing a valid mons group
value.
Closes: https://bugzilla.redhat.com/1876803
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
Francesco Pantano [Mon, 7 Sep 2020 12:02:06 +0000 (14:02 +0200)]
Add --cluster option on ceph require-osd-release command
On DCN environments, or when multiple ceph clusters are configured,
we need to specify the cluster name before running the command or
the rolling_update playbook will fail during minor updates.
Closes: https://bugzilla.redhat.com/1876447
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
Guillaume Abrioux [Mon, 7 Sep 2020 07:02:03 +0000 (09:02 +0200)]
tests: disable nfs-ganesha testing
This commit disables nfs-ganesha testing on master for non-containerized
deployment because the dev repos are broken at the moment.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Fri, 4 Sep 2020 14:50:26 +0000 (16:50 +0200)]
tests: migrate to quay.ceph.io registry
in order to avoid docker.io rate limiting
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dai Dang Van [Wed, 26 Aug 2020 10:02:34 +0000 (17:02 +0700)]
Fix typo shrink osd file name in day-2 docs
Signed-off-by: Dai Dang Van <daikk115@gmail.com>
Dimitri Savineau [Thu, 27 Aug 2020 13:57:35 +0000 (09:57 -0400)]
tests: reenable ceph-iscsi testing
This re-adds the ceph-iscsi testing for both non containerized and
containerized deployment since the rados connection error on ceph
dev has been fixed [1].
[1] https://tracker.ceph.com/issues/47002
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Niko Smeds [Thu, 5 Mar 2020 22:24:56 +0000 (14:24 -0800)]
Enable HAProxy backend checks for Ceph RGW
Add the `check` option to server definitions to enable basic HAProxy health
checks for Ceph RADOS gateway backends.
Currently traffic will be forwarded to unhealthy `radosgw.service` servers.
These changes resolve the issue.
Signed-off-by: Niko Smeds nikosmeds@gmail.com
Guillaume Abrioux [Fri, 21 Aug 2020 08:51:22 +0000 (10:51 +0200)]
rolling_update: remove 'ignore_errors'
There's no need to use `ignore_errors: true` on these tasks.
Using a loop on the task stopping mon daemons allows us to avoid
duplicating this task. The `ignore_errors` isn't needed here because the
playbook won't fail if one of the IDs doesn't exist (shortname vs. fqdn).
Using the right condition on the task starting the mgr daemon allows us
to avoid using an `ignore_errors: true` as well.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Tue, 4 Aug 2020 11:53:24 +0000 (13:53 +0200)]
ceph_key: refact the code and minor fixes
This commit refactors the code to remove a duplicate condition and it
makes the `state: absent` code idempotent
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 15 Jul 2020 15:28:51 +0000 (17:28 +0200)]
tests: add more coverage for test_ceph_key
This commit adds more coverage regarding the testing of ceph_key module
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 19 Aug 2020 21:33:51 +0000 (23:33 +0200)]
dashboard: refact admin user creation task
this commit splits this task in order to avoid using a `shell` module.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Mon, 17 Aug 2020 08:31:11 +0000 (10:31 +0200)]
facts: refact and optimize memory consumption
there's no need to run this task on all nodes.
This uses too much memory for nothing.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1856981
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Mon, 6 Jul 2020 15:04:13 +0000 (11:04 -0400)]
tests: reenable nfs-ganesha testing
This re-adds the nfs-ganesha testing in non containerized deployment.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
George Shuklin [Mon, 13 Jul 2020 10:40:17 +0000 (13:40 +0300)]
Make 'disable ssl for dashboard task' idempotent.
This should reduce number of 'changed' tasks during convergence test.
Signed-off-by: George Shuklin <george.shuklin@gmail.com>
Rafał Wądołowski [Thu, 20 Aug 2020 08:13:43 +0000 (10:13 +0200)]
Comment out ceph_custom_key
Since there is a check if ceph_custom_key is defined, there is no reason
to define it by default.
Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>
Guillaume Abrioux [Wed, 19 Aug 2020 16:11:02 +0000 (18:11 +0200)]
iscsigw: add retry/until
In order to avoid failures that could be fixed by simply
retrying.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Tue, 11 Aug 2020 13:26:16 +0000 (15:26 +0200)]
tests: move erasure pool testing in lvm_osds
This commit moves the erasure pool creation testing from `all_daemons`
to `lvm_osds` so we can decrease the number of osd nodes we spawn so the
OVH Jenkins slaves are less overwhelmed when an `all_daemons` based
scenario is being tested.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
John Fulton [Tue, 18 Aug 2020 14:41:42 +0000 (10:41 -0400)]
Set default permission for prometheus config files
Regardless of the outcome of Ansible 2.9.12 issue 71200
we can set a default permission for these files.
Closes: https://github.com/ceph/ceph-ansible/issues/5677
Signed-off-by: John Fulton <fulton@redhat.com>
Guillaume Abrioux [Tue, 18 Aug 2020 18:35:17 +0000 (20:35 +0200)]
shrink-mds: use mds_to_kill_hostname instead
When using fqdn in inventory host file, this task will fail because the
mds is registered with its shortname.
It means we must use `mds_to_kill_hostname` in this task.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1869837
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
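The shortname versus fqdn mismatch behind the shrink-mds fix above can be shown with a one-function Python sketch. How `mds_to_kill_hostname` is actually computed in the playbook is not stated in the commit message; `short_hostname` below is a hypothetical helper that only illustrates why an fqdn from the inventory does not match an mds registered under its short name.

```python
def short_hostname(inventory_hostname):
    """Return the part of a hostname before the first dot.

    Hypothetical helper: it assumes the inventory entry is a DNS name,
    not an IP address.
    """
    return inventory_hostname.split(".", 1)[0]
```

An inventory entry like `mds0.example.com` would thus map to `mds0`, the name the daemon is registered with in this scenario.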
Guillaume Abrioux [Thu, 13 Aug 2020 18:37:11 +0000 (20:37 +0200)]
infra: only install logrotate on right nodes
For instance, there is no need to install logrotate on client nodes.
This also ensures logrotate is installed only for containerized
deployments since the packaging has an explicit dependency on logrotate.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Tue, 18 Aug 2020 13:37:08 +0000 (15:37 +0200)]
travis: enforce ansible-lint 4.2.0
Let's pin to 4.2.0
(because of ansible/ansible-lint/issues/966)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Tue, 18 Aug 2020 07:48:54 +0000 (09:48 +0200)]
tests: remove hosts-ubuntu inventories
Since we've dropped ubuntu testing, we don't need these inventories
anymore. Let's remove this leftover.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Tue, 18 Aug 2020 07:45:53 +0000 (09:45 +0200)]
tests: disable iscsigw testing (container)
Temporarily disable iscsigw testing for containerized deployments
because it's broken upstream on ceph@master.
non-containerized deployments use stable build for iscsigw to get around
this issue.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Dimitri Savineau [Mon, 17 Aug 2020 17:55:47 +0000 (13:55 -0400)]
ceph-rgw: allow specifying crush rule on pool
We already support specifying a custom crush rule during pool creation
in ceph-osd role but not in ceph-rgw role.
This patch adds the missing code to implement this feature.
Note this is only available for replicated pools, not erasure pools. The rule
must also exist prior to the pool creation.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1855439
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Mon, 17 Aug 2020 18:56:17 +0000 (14:56 -0400)]
container: don't install the engine on all clients
We only need the container engine to be installed on the first client
node in order to execute the pools/keys operation. We already use the
same workflow with the ceph-container-common role, which pulls the ceph
container image.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Ali Maredia [Thu, 4 Jun 2020 21:00:16 +0000 (21:00 +0000)]
rgw: allow rgws to be concurrently with or without multisite
Allows rgws in a ceph cluster to be run with
multisite and without multisite at the same time.
Signed-off-by: Ali Maredia <amaredia@redhat.com>
Guillaume Abrioux [Tue, 4 Aug 2020 15:29:41 +0000 (17:29 +0200)]
purge-cluster: use sysfs method for unmapping rbd devices
This way we keep consistency with purge-container-cluster.yml playbook.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 13 Aug 2020 13:29:28 +0000 (15:29 +0200)]
infra: add missing tag
This commit adds the missing `with_pkg` tag on the logrotate
installation task.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 12 Aug 2020 19:05:57 +0000 (21:05 +0200)]
tests: test iscsigw against stable
Since it is broken at the moment with dev repos, let's test against
stable builds so the CI is unlocked.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 6 Aug 2020 07:46:12 +0000 (09:46 +0200)]
purge: import ceph-defaults in purge osd play
Otherwise, `ceph_volume_debug` variable is undefined
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Tue, 4 Aug 2020 23:47:04 +0000 (01:47 +0200)]
infra: add log rotation support (containers)
This commit adds the log rotation support via logrotate in containerized
deployments.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1848388
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 5 Aug 2020 16:02:48 +0000 (18:02 +0200)]
common: don't enable debug log on ceph-volume calls by default
ceph-volume can generate large logs at some point.
debug logs by definition should be enabled only when debugging.
Let's make it customizable with a variable which is set to `False` by
default.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
raul [Mon, 3 Aug 2020 10:58:50 +0000 (12:58 +0200)]
rgw: support 1+ rgw instance in `radosgw_frontend_port`
Change `radosgw_frontend_port` to take into account more than one RGW instance.
In its original form, `radosgw_frontend_port: radosgw_frontend_port | int`,
it configured port 8080 for all instances. With the following modification,
`radosgw_frontend_port: radosgw_frontend_port | int + item|int`, the port
number is increased by 1 for each instance.
Co-authored-by: Daniel Parkes <dparkes@redhat.com>
Signed-off-by: raul <rmahique@redhat.com>
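The port arithmetic in the `radosgw_frontend_port` change above can be written out in Python. This sketch assumes that `item` in the Jinja expression is a zero-based instance index, which the commit message does not state explicitly; `instance_ports` is a hypothetical helper.

```python
def instance_ports(base_port, instance_count):
    """Return one frontend port per RGW instance: base, base + 1, ...

    Mirrors `radosgw_frontend_port | int + item | int`, assuming `item`
    runs over 0 .. instance_count - 1.
    """
    return [int(base_port) + index for index in range(instance_count)]
```

With a base port of 8080 and three instances this gives 8080, 8081 and 8082, instead of 8080 for every instance as before the change.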
Guillaume Abrioux [Fri, 7 Aug 2020 08:12:50 +0000 (10:12 +0200)]
nfs: do not copy rgw keyring when `nfs_obj_gw` is true
This keyring shouldn't be copied when `nfs_obj_gw` is `True` if the
cluster doesn't contain an rgw node, which can be the case given we are
using `nfs_obj_gw` instead of `nfs_file_gw` (cephfs vs. object). Otherwise,
the deployment will fail trying to copy a key that doesn't exist.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 6 Aug 2020 13:26:24 +0000 (15:26 +0200)]
tox: only wait 30sec for right jobs
There's no need to call `sleep 30` for jobs other than `all_daemons` and
`all_in_one`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Benoît Knecht [Fri, 31 Jul 2020 06:11:31 +0000 (08:11 +0200)]
purge-cluster: check if rbdmap exists
When running `infrastructure-playbooks/purge-cluster.yml` twice, it fails the
second time on the `ensure rbd devices are unmapped` task, because `rbdmap`
isn't installed anymore at that point.
This commit adds a check that ensures `rbdmap` is available, and skips the
`ensure rbd devices are unmapped` task if it isn't.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
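The guard described in the purge-cluster commit above (skip the unmapping step when `rbdmap` is no longer installed) can be sketched in Python with `shutil.which`. The playbook itself is written as Ansible tasks, so this is an analogy rather than the actual change; `should_unmap_rbd` is a hypothetical name.

```python
import shutil


def should_unmap_rbd(command="rbdmap"):
    """Return True if `command` is found on PATH, False otherwise.

    Hypothetical guard: the unmap step only makes sense while the
    rbdmap executable is still available on the host.
    """
    return shutil.which(command) is not None
```

On a second purge run, where the package has already been removed, the guard returns `False` and the unmapping step is skipped instead of failing.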