git-server-git.apps.pok.os.sepia.ceph.com Git - ceph-ansible.git/log
Guillaume Abrioux [Wed, 3 Aug 2022 05:43:35 +0000 (07:43 +0200)]
replace 'master' references with 'main'
Because the branch 'master' was renamed to 'main'.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
0ce39416e8dcd8d5b1093492df4b7e275fc3abc1 )
Guillaume Abrioux [Mon, 1 Aug 2022 21:12:13 +0000 (23:12 +0200)]
tests: skip rbdmirror tests on non-secondary daemon
the daemon is not running on the 'primary' daemon.
Therefore, these tests are not needed.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
37e67fb67253b3864a938e592b4a8e52d9a36d3f )
Guillaume Abrioux [Mon, 1 Aug 2022 18:22:10 +0000 (20:22 +0200)]
tests: set no_log_on_ceph_key_tasks=false
Having the failure logged can save us some time, since we don't always have
to reproduce it when a failure shows up in the CI.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
f1239b6907da5750b8dac17dca78338f864eb5c9 )
Guillaume Abrioux [Mon, 1 Aug 2022 18:18:50 +0000 (20:18 +0200)]
rbd-mirror: follow up on recent rbd-mirror refactor
- ensure /var/lib/ceph/bootstrap-rbd-mirror exists
- always install ceph-base on rbdmirror nodes (otherwise, ceph-crash
isn't present)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
041435e1e371f00b7112efed3306beec91d5a434 )
(cherry picked from commit
b634fb1cb3cef6d895678501dfcd817a4fdea99c )
(cherry picked from commit
302da16c274b6b9e5cced970dd0220d7c56e6e5c )
Teoman ONAY [Mon, 1 Aug 2022 14:28:23 +0000 (16:28 +0200)]
Set ceph_rbd_mirror_pool default value
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit
0c50bfac98cd174bd785487574af54a468b82ae6 )
(cherry picked from commit
8a0b5a9571df55cff3b4d4d5ab8b41d739a6e1cb )
(cherry picked from commit
4359464bb652f6efad9cb82b34c7cc17fca30a15 )
Guillaume Abrioux [Thu, 12 May 2022 15:22:54 +0000 (17:22 +0200)]
rbd-mirror: major refactor
- Use config-key store to add cluster peer.
- Support multiple pools mirroring.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
b74ff6e22c0d1b95e71384e4d7e2fb2ad556ac39 )
Guillaume Abrioux [Mon, 8 Aug 2022 05:43:54 +0000 (07:43 +0200)]
config: do not always set _osd_memory_target
When 'osd_memory_target' is overridden in ceph_conf_overrides,
the task that sets the fact `osd_memory_target` in the ceph-config role
should be skipped.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2056675#c11
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
cb5d6b48fb2395e99a7f7ee6e8d2afbcba45901d )
Teoman ONAY [Mon, 1 Aug 2022 13:36:48 +0000 (15:36 +0200)]
Playbook fails when using --limit to install new MDS
"set_fact container_run_cmd" is not set when using --limit on MDS as facts
were not run on first MON.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2111017
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit
9a4a3f5f1921b76fa3a2a28ada2a632b5df70ca2 )
Guillaume Abrioux [Mon, 11 Jul 2022 14:07:13 +0000 (16:07 +0200)]
config: followup on 8a5628b51
Add missing `--cluster {{ cluster }}` on task
`set osd_memory_target` in the main.yml file of the
ceph-config role.
It also moves the task after the ceph configuration file is actually written.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
9f59c7286fcfcf1c15cb02c19f86a7ad91d4e4a3 )
Guillaume Abrioux [Mon, 18 Jul 2022 08:11:27 +0000 (10:11 +0200)]
shrink-osd: use command instead of ceph_volume_simple_scan
This module isn't available in RHCS 4
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2071035
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Thu, 7 Jul 2022 15:03:34 +0000 (17:03 +0200)]
backup-and-restore: use archive/unarchive approach
The current approach is too complex and causes too many permission
issues.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
dffe7b47de70b6eeec71a3fa86f8c407adb4dd8e )
Guillaume Abrioux [Tue, 5 Jul 2022 07:58:02 +0000 (09:58 +0200)]
backup-and-restore: various fixes
- preserve mode and ownership on main directories
- make sure the directories are present prior to restoring files.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
047af3a3f66b6135cf7ab6a7c28b9c30dea3c836 )
Guillaume Abrioux [Wed, 29 Jun 2022 06:59:55 +0000 (08:59 +0200)]
backup-and-restore: fix check on 'target_node' variable
If the user doesn't pass a valid name (present in the inventory),
the playbook will fail as follows:
```
fatal: [localhost -> {{ target_node }}]: FAILED! =>
msg: |-
The task includes an option with an undefined variable. The error was: "hostvars['10.70.46.40']" is undefined
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
b18a1aa3cafa235dfa4f564d6574318ee539f961 )
Guillaume Abrioux [Wed, 29 Jun 2022 06:40:13 +0000 (08:40 +0200)]
backup-and-restore: fix check on 'mode' variable
Typical failure:
```
fatal: [localhost]: FAILED! =>
msg: |-
The conditional check 'mode not in ['backup', 'restore']' failed. The error was: error while evaluating conditional (mode not in ['backup', 'restore']): 'mode' is undefined
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
848dd03fa66c0febaf1ed081d4986d5de8472a09 )
Guillaume Abrioux [Wed, 15 Jun 2022 08:46:52 +0000 (10:46 +0200)]
backup-and-restore: fix a typo
Typo introduced during initial implementation.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
e28c486e528517439997583bcb0faa3df16ab848 )
Guillaume Abrioux [Tue, 12 Apr 2022 17:48:18 +0000 (19:48 +0200)]
contrib: add a playbook
this playbook can backup or restore some ceph files.
(/etc/ceph, /var/lib/ceph, ...)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2051640
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
ed0bba4d770d7b886de83efe25d482e08d558f97 )
Guillaume Abrioux [Mon, 11 Jul 2022 08:23:56 +0000 (10:23 +0200)]
config/osd: various fixes
- sets `osd_memory_target` per osd host.
- ceph.conf refactor (osd)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2056675
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
8a5628b51616b0b9740680bd14c669d4316941f3 )
Guillaume Abrioux [Mon, 11 Jul 2022 08:03:53 +0000 (10:03 +0200)]
config: fix indentation in main.yml
For consistency and readability.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
5283fa6e966ae3bcde82559240794b8fd5c0d8f3 )
Teoman ONAY [Mon, 4 Jul 2022 09:54:41 +0000 (11:54 +0200)]
Refresh /etc/ceph/osd json files content before zapping the disks
If the physical disk to device path mapping has changed since the
last ceph-volume simple scan (e.g. addition or removal of disks),
a wrong disk could be deleted.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2071035
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit
64e08f2c0bdea6f4c4ad5862dc8f350c6adbe2cd )
Guillaume Abrioux [Tue, 5 Jul 2022 08:27:39 +0000 (10:27 +0200)]
facts: fix set_radosgw_address.yml
use `include_tasks` instead of `import_tasks`.
Given that `import_tasks` statements are preprocessed
and the task that defines it hasn't been run yet, it will fail
and complain as follows:
```
The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_interface'
```
Using `include_tasks` instead fixes this.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
434793e2feec64fdbbb8a7bdb959b08dacbb2006 )
(cherry picked from commit
d57377ef619c3304c1eaf4d466dae4055b492d7b )
Guillaume Abrioux [Thu, 16 Jun 2022 14:53:14 +0000 (16:53 +0200)]
facts: fix deployments with different net interface names
This fixes deployments where radosgws don't have the same names for
their network interfaces.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2095605
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
f6b49f78a9f14c37b2ca81fa6172beba8f43adc4 )
(cherry picked from commit
ab6fc3e6f771f17c8726618158a512cf9a375c71 )
Guillaume Abrioux [Mon, 23 May 2022 07:49:10 +0000 (09:49 +0200)]
purge: reset-failed ceph-crash
This ensures we always reset-failed the ceph-crash service.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
e368ee0fc9455d7aa23b84f7690a4d3ee4deb156 )
Guillaume Abrioux [Mon, 22 Nov 2021 08:22:45 +0000 (09:22 +0100)]
purge: remove ceph directories on client nodes
Otherwise any ceph directories are left over on client nodes
after the purge.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2024815
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
20035852a4e51f3800808bc4c71b0e0be86b8132 )
(cherry picked from commit
346d4a1e1dc19238020886e3572b296e6b711c70 )
Guillaume Abrioux [Tue, 9 Nov 2021 14:35:12 +0000 (15:35 +0100)]
update: speed up client play
there's no need to run the roles ceph-facts, ceph-config and ceph-client
altogether on client nodes in rolling update playbook.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2019831
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
817c03bc0ebee9df270093d6f414d02f8b95866a )
(cherry picked from commit
c0da98b1d67177a55ed29bd3b72ac9774d2e320c )
Guillaume Abrioux [Wed, 8 Dec 2021 16:37:14 +0000 (17:37 +0100)]
container: align systemd units with rpm
Update `After=` and `Wants=` parameters in container systemd units
and make them be aligned with the systemd units that come
from the packaging.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2027440
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
f01536ea195a56c3ea2b31c7232391387e909c41 )
(cherry picked from commit
690c879aef47984136244d1fc8672c0afd6961f5 )
Guillaume Abrioux [Mon, 21 Feb 2022 16:12:06 +0000 (17:12 +0100)]
switch2containers: fail if less than 3 monitors
This playbook doesn't support less than 3 monitors present in the inventory.
Just like the rolling_update playbook, let's fail if less than
3 monitors are present.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2049132
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
f08129edf2efb8bad21043b7e90af59e6f3e1a8d )
(cherry picked from commit
b970ab6691b10c0156fb8cf4a7b8fcc123a453e3 )
Guillaume Abrioux [Tue, 7 Dec 2021 14:43:45 +0000 (15:43 +0100)]
dashboard: allow collecting stats from the host
This commit makes podman bindmount `/:/rootfs:ro` so the container can
collect data from the host.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2028775
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
0f34cd16d808dbb800efd0f558b29bcfb218bbfa )
(cherry picked from commit
2e2d893d28d74f101a9c8c94ca98d12445ff2267 )
Guillaume Abrioux [Mon, 28 Feb 2022 08:51:36 +0000 (09:51 +0100)]
purge: ceph-crash purge fixes
This fixes the service file removal and makes the playbook
call `systemctl reset-failed` on the service because in Ceph
Nautilus, ceph-crash doesn't handle `SIGTERM` signal.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2055992
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
2f11982590e298e92f7d3c4a06026fb00bea40cd )
(cherry picked from commit
7a570c719e16b9b802aab3703c747b7e429b7b77 )
Guillaume Abrioux [Thu, 14 Apr 2022 12:48:06 +0000 (14:48 +0200)]
common: support setting pg autoscaler to off
The current implementation doesn't allow disabling the pg autoscaler
on created pools; it allows only 'on' or 'warn'.
With this commit, it is now possible to disable it.
Valid values would be ['on', 'yes', 'true', 'off', 'no', 'false']
```
openstack_glance_pool:
  name: "images"
  pg_num: 128
  pgp_num: 128
  rule_name: "replicated_rule"
  type: 1
  application: "rbd"
  size: 3
  pg_autoscale_mode: off
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2062621
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
9d1ff8f236ed02b6ac7fc8a7f8e056ff6f52f14c )
Teoman ONAY [Mon, 7 Mar 2022 09:31:14 +0000 (10:31 +0100)]
Turn off SELinux separation for containers MON and RGW
Initially, MONs and RGW bind-mounted /etc/pki/ca-trust/extracted using the :z flag
(introduced to solve an OSP TripleO issue on RHEL - #3638), but using
this flag prevents local services (like sssd) running on the host from accessing
the certificates/files in that folder.
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit
7e8ce2567ec7f163c763547252a7a5bcc983fd98 )
(cherry picked from commit
cf44ad76f6858520bae97940bb3aaf171f74e24c )
Guillaume Abrioux [Thu, 21 Apr 2022 08:06:56 +0000 (10:06 +0200)]
facts: follow up on aa0cc93
When these variables are defined in the inventory host file,
all tasks are skipped because the node being played isn't
aware of the values from the rgw nodes.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
328bd7c975f5578e1e852872adb72d105f74a876 )
Guillaume Abrioux [Tue, 19 Apr 2022 21:39:12 +0000 (23:39 +0200)]
facts: fix mon/mgr collocation
`service dump` hangs when no active mgr is available.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
617dce5e10bfb2e309b043124999c8f32ebdaf4f )
Guillaume Abrioux [Tue, 19 Apr 2022 05:12:44 +0000 (07:12 +0200)]
dashboard: fix regression
introduced by ceph/ceph-ansible/pull/7150
when no rgw is present, it fails.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2076192
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
1a56fd6a21b3e39c8acd4d3d20451df9b96754d2 )
Guillaume Abrioux [Wed, 13 Apr 2022 08:42:47 +0000 (10:42 +0200)]
dashboard: support --limit execution with rgw
When the following conditions are met:
- rgw is deployed,
- dashboard is deployed,
- playbook is called with --limit,
- a node being processed is collocated on either a mon or mgr.
The playbook fails because `rgw_instances` is undefined.
The idea here is to make sure this variable is always defined.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
aa0cc9381d8491c30a47d27b395f58e776c5ef0b )
Guillaume Abrioux [Fri, 25 Mar 2022 08:14:56 +0000 (09:14 +0100)]
dashboard: always set `dashboard_server_addr`
When running the playbook with `--limit`, if the play targeted doesn't match
hosts present in the mgr group the playbook can fail.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2063029
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
72e4654aaea9638627c3e44d917d5af2a6ecdf0c )
(cherry picked from commit
d1e4b831062988af8b43416ce9ac3cb97f7f3c80 )
Guillaume Abrioux [Tue, 21 Dec 2021 13:03:45 +0000 (14:03 +0100)]
dashboard: fix radosgw system user creation
The radosgw system user creation will fail when `rgw_instances`
is set at the host_var level because this variable won't be set
on monitor nodes. Given that this is where the task is delegated, it fails.
The idea here is to check over all rgw instances that are defined and set a
boolean fact in order to check if at least one instance has `rgw_zonemaster` set
to `True`
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2034595
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 10 Nov 2021 13:32:26 +0000 (14:32 +0100)]
validate: fix bug when using vault
Since a variable encrypted with vault is no longer a string but an
encrypted object, we can't use the filter `| length`; we have to convert it
to a string first.
Fixes: #6991
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
6ad7e5286920fceff6f4483672cbdca44f06a25f )
Guillaume Abrioux [Thu, 28 Oct 2021 23:28:38 +0000 (01:28 +0200)]
mgr: append balancer module to ceph_mgr_modules
otherwise the osd play in rolling_update can fail when it tries to
disable it before upgrading osd nodes.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
45a1d634d850ad971c71dbe9191b97caa3068b5a )
Guillaume Abrioux [Thu, 28 Oct 2021 21:40:18 +0000 (23:40 +0200)]
update: move a set_fact
The ceph-facts role makes decisions based on the fact `rolling_update`, so
it must be set before we run this role.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
e5edcc4214348945e1566641a490b68ef8886cf0 )
Guillaume Abrioux [Thu, 28 Oct 2021 14:17:24 +0000 (16:17 +0200)]
update: support --limit on monitor nodes
Change needed in order to support --limit on mon nodes.
Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']`
throws an error:
```
"The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'"
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304#c28
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
82eee4303bce3e41b5043bcb03fa3143dcdfd30d )
Guillaume Abrioux [Wed, 13 Oct 2021 08:26:59 +0000 (10:26 +0200)]
nfs/rgw: support enforcing keys
if one sets `ceph_nfs_rgw_access_key` and/or `ceph_nfs_rgw_secret_key`,
the nfs/rgw user creation won't take those variables into account and it
will generate a user with automatically generated credentials.
It ends up with a mismatch between what will be set in ganesha.conf and
the created user.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2010754
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Per Abildgaard Toft [Wed, 20 Oct 2021 07:45:16 +0000 (09:45 +0200)]
shrink-osd: fix regression because of a wrong regex
968891f4498da9625acfdd34bfb01fe445d1eef2 introduced a regression.
The regex is wrong because it doesn't allow shrinking osds with an id
greater than 9.
Fixes: #6950
Signed-off-by: Per Abildgaard Toft <per@minfejl.dk>
(cherry picked from commit
84118a3063e38ed9d274cca90d115809353819b4 )
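A minimal sketch of the kind of regex mistake described above, written in Python. Both patterns are hypothetical (the actual expression used by the playbook is not shown in this log); they only illustrate how a single-digit pattern rejects ids greater than 9 while a one-or-more-digits pattern accepts them.

```python
import re

# Hypothetical patterns, for illustration only.
SINGLE_DIGIT = re.compile(r"^\d$")   # matches exactly one digit: rejects "10"
ANY_DIGITS = re.compile(r"^\d+$")    # matches one or more digits


def valid_osd_ids(ids, pattern):
    """Return True when every id in the comma-separated string matches."""
    return all(pattern.match(osd_id) is not None for osd_id in ids.split(","))
```

With these assumptions, `valid_osd_ids("10", SINGLE_DIGIT)` is false while `valid_osd_ids("10", ANY_DIGITS)` is true, which is the shape of the regression reported in #6950.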
Guillaume Abrioux [Tue, 12 Oct 2021 15:55:40 +0000 (17:55 +0200)]
shrink-osd: check osd id format
This adds a check early in order to ensure the format of osd ids passed
is correct.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2005734
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
968891f4498da9625acfdd34bfb01fe445d1eef2 )
Guillaume Abrioux [Mon, 25 Oct 2021 12:28:41 +0000 (14:28 +0200)]
rolling_update: modify default health_osd_check_*
let's do more retries with a shorter delay.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
50a21d695eb571bdc5b4d67bde914cf58c502b44 )
Guillaume Abrioux [Wed, 20 Oct 2021 07:59:48 +0000 (09:59 +0200)]
tests: add new scenario subset_update
new scenario in order to test the subset upgrade approach using tags.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
fb8a66149bc5605c0e51ab137f46c2c48580452a )
Guillaume Abrioux [Mon, 25 Oct 2021 11:43:25 +0000 (13:43 +0200)]
rolling_update: fix pre and post osd upgrade play
When using --limit osds, the plays before and after the osd upgrade are
skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"`.
Using `hosts: "{{ osds_group_name | default('osds') }}"` with
`delegate_to` to the first monitor addresses this issue.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
fc9f87c45f6e58a595f59365b58bf1b0e3909bf5 )
Guillaume Abrioux [Wed, 20 Oct 2021 08:01:05 +0000 (10:01 +0200)]
update: support upgrading a subset of nodes
It can be useful in a large cluster deployment to split the upgrade and
only upgrade a group of nodes at a time.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2014304
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
e5cf9db2b04f55196d867f5a7248b455307f4407 )
Guillaume Abrioux [Fri, 1 Oct 2021 22:41:03 +0000 (00:41 +0200)]
tests: ensure ca-certificates is up to date
otherwise the `rpm_key` module fails because it can't verify the
certificate.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Guillaume Abrioux [Wed, 29 Sep 2021 14:25:42 +0000 (16:25 +0200)]
tests: remove all references to ceph_stable_release
this is legacy and not needed anymore.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
f277a39dfe4ea7ea7d7f211a6a554866ac519f52 )
Seena Fallah [Tue, 21 Sep 2021 07:54:13 +0000 (12:24 +0430)]
ceph-defaults: set ceph_stable_release default to the stable branch release
ceph_stable_release is a legacy from the time where a single branch of ceph-ansible supported more than one release of ceph
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit
fb99626987740d676d649b0bce2215bce72ca0cf )
Guillaume Abrioux [Thu, 30 Sep 2021 09:32:12 +0000 (11:32 +0200)]
tests: set rgw_instances in collect-logs.yml
in order to gather rgw logs, we need rgw_instances to be set.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
c2e46fe5a5b9ebeab0828da1b7dd6540b3766fb2 )
Guillaume Abrioux [Thu, 30 Sep 2021 06:23:42 +0000 (08:23 +0200)]
tests: update collect-logs.yml playbook
- change `ceph -s` output to json-pretty.
- gather rgw logs
- add `health detail` command
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
b2ccc7234a8413a8bbd5d471da9ff2cf8f3ccde2 )
Guillaume Abrioux [Wed, 29 Sep 2021 12:29:58 +0000 (14:29 +0200)]
tests: move collect-logs.yml to ceph-ansible repo
related ceph-build PR: ceph/ceph-build#1914
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
702564518b9cb5019648f1a9edcdc4cc962a36d9 )
Seena Fallah [Mon, 16 Aug 2021 20:37:40 +0000 (01:07 +0430)]
purge: add remove_docker tag
This can help to skip docker removal tasks
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit
ff39c8d70b7326f4215d32e78e2f89b632b07008 )
Seena Fallah [Mon, 16 Aug 2021 20:08:47 +0000 (00:38 +0430)]
purge: add container_binary needed for zap osds
`container_binary` isn't set anymore in the purge osd play because of a
regression introduced by 60aa70a.
The CI didn't catch it because the play purging node-exporter sets this
variable for all nodes before we run the purge osd play.
This commit fixes this regression.
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit
a51ce767ca6749b3eb4c0c871e436daf3828e6c6 )
Dimitri Savineau [Fri, 27 Aug 2021 16:01:27 +0000 (12:01 -0400)]
ceph-defaults: set quay.io as the default registry
Because the ceph container images are now only pushed to the quay.io
registry then this updates the default registry value.
The docker.io registry can still be used but doesn't receive updated
container images.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
e7b43c1fc632237376351f43363597c23bd33cb7 )
Seena Fallah [Thu, 5 Aug 2021 11:03:55 +0000 (15:33 +0430)]
ceph-container-engine: allow override container_package_name and container_service_name
Only include specific variables when they are undefined
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
(cherry picked from commit
95bce32270c7f5ea7e397588340b674efd7db63f )
Dimitri Savineau [Tue, 7 Sep 2021 16:13:37 +0000 (12:13 -0400)]
purge-dashboard: remove cid files
This adds the service cid file cleanup as supported in the classic purge
playbook since b9dd253
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786691
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
cddc23f51134a6a95fd7492ee27a5d89bf7ebf9f )
Dimitri Savineau [Thu, 26 Aug 2021 20:45:07 +0000 (16:45 -0400)]
tests/rgw: use json format output for user info
If the radosgw user already exists then we need to have the output in json
format because we are expecting to load the output with json.loads()
Otherwise we have pytest failure like:
```console
self = <json.decoder.JSONDecoder object at 0x7fa2f00a5fd0>, s = '', idx = 0
def raw_decode(self, s, idx=0):
"""Decode a JSON document from ``s`` (a ``str`` beginning with
a JSON document) and return a 2-tuple of the Python
representation and the index in ``s`` where the document ended.
This can be used to decode a JSON document from a string that may
have extraneous data at the end.
"""
try:
obj, end = self.scan_once(s, idx)
except StopIteration as err:
> raise JSONDecodeError("Expecting value", s, err.value) from None
E json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
f2bd8ae70f0ae5d0c7c7a36bd6ecdad6383feed9 )
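The failure in the traceback above can be reproduced with the standard library alone: `json.loads()` raises `JSONDecodeError` when given an empty string. The sketch below is not the test suite's code; `parse_user_info` is a hypothetical helper that shows one defensive way to handle non-JSON output.

```python
import json


def parse_user_info(stdout):
    """Parse command output as JSON; return None if it is empty or not JSON."""
    try:
        return json.loads(stdout)
    except json.JSONDecodeError:
        return None


# Reproduce the error message from the traceback with empty output.
try:
    json.loads("")
except json.JSONDecodeError as err:
    empty_output_error = str(err)
```

Requesting JSON output from the command, as the commit does, avoids the problem at the source; the helper only makes the failure mode explicit.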
Dimitri Savineau [Tue, 10 Aug 2021 15:57:01 +0000 (11:57 -0400)]
tests/rgw: add timeout 5s to radosgw-admin command
If the radosgw daemons aren't up and running correctly (like not registered
in the servicemap or the OSD are down) then the radosgw-admin will hang
forever.
Jenkins will kill the jobs after 3h but we don't want to wait until this global
timeout.
Adding the timeout 5 command to the radosgw-admin commands (which is already
present on other ceph calls) allows the job to fail earlier.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
f01ae82eeccb1c3ebc97a84b0d8a547f360741d3 )
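The commit relies on the `timeout` shell command. As an analogous sketch in Python (a different mechanism, not what the tests use), `subprocess.run()` accepts a `timeout` argument that bounds how long a call may hang. The helper name `run_with_timeout` is an assumption made for this example.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds=5):
    """Run cmd; return its stdout, or None if it exceeds the time limit."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this exception.
        return None
    return result.stdout


# A command that would otherwise block for 5 seconds fails fast instead.
hung = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(5)"], seconds=0.5
)
```

The point is the same as in the commit message: a hanging call fails after a few seconds rather than waiting for a global job timeout.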
Dimitri Savineau [Thu, 19 Aug 2021 18:08:06 +0000 (14:08 -0400)]
container: explicitly pull monitoring images
We don't pull the monitoring container images (alertmanager, prometheus,
node-exporter and grafana) in a dedicated task like we're doing for the
ceph container image.
This means that the container image pull is done during the start of the
systemd service.
By doing this, pulling the image behind a proxy isn't working with podman.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1995574
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
5bb7240f878ff9b369ea028839d26fd46342ff77 )
Guillaume Abrioux [Wed, 18 Aug 2021 11:23:44 +0000 (13:23 +0200)]
iscsi: don't set default value for trusted_ip_list
It restricts access to the iSCSI API.
It can be left empty if the API isn't going to be accessed from outside the
gateway node.
Even though this seems to be a limited use case, it's better to leave it
empty by default than having a meaningless default value.
We could make this variable mandatory but that would be a breaking
change. Let's just add logic in the template in order to set this
variable in the configuration file only if it was specified by users.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1994930
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
6802b8dddd7f8d1f1c47f4eb3b7dd6a6a48820dc )
Guillaume Abrioux [Tue, 10 Aug 2021 13:21:19 +0000 (15:21 +0200)]
containers: introduce target systemd unit
This adds ceph-*.target systemd unit files support for containerized
deployments.
This also fixes a regression introduced by PR #6719 (rgw and nfs systemd
units not getting purged)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1962748
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
09ef465f62fde775bd2490be5b43d7796e2a9c6c )
Guillaume Abrioux [Tue, 10 Aug 2021 13:34:50 +0000 (15:34 +0200)]
roles: remove leftover from pr #4319
pr #4319 introduced some useless `become: true` on systemd tasks.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
1db8fa89895546571a831289ebbe0f83d02b1e0a )
Guillaume Abrioux [Tue, 10 Aug 2021 14:11:37 +0000 (16:11 +0200)]
Vagrantfile: fallback on 'vagrant_variables.yml.sample'
When using a vagrant command from the root directory of the repo, it
throws an error if no 'vagrant_variables.yml' file is present.
```
Message: Errno::ENOENT: No such file or directory @ rb_sysopen - /home/guits/workspaces/ceph-ansible/vagrant_variables.yml
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
3d27f9e7dc7ee775be57c27c3620009f9935ddcc )
Guillaume Abrioux [Tue, 17 Aug 2021 14:07:03 +0000 (16:07 +0200)]
update: gather facts only one time
this play doesn't need to gather facts from localhost
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
c14e9114baebd155996b42b18744567698178836 )
Dimitri Savineau [Wed, 11 Aug 2021 20:01:08 +0000 (16:01 -0400)]
ceph-mon: do not log monitor keyring
We don't want to display the keyring in the ansible log.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
e44075abd607648da88b4e3555353a99ecb171a6 )
Guillaume Abrioux [Mon, 9 Aug 2021 12:57:33 +0000 (14:57 +0200)]
common: do not log keyring secret
let's not display any keyring secret by default in ansible log.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1980744
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
7511195738e9d1e8f3d3ec77ad4473fa90d17d22 )
Benoît Knecht [Wed, 4 Aug 2021 13:12:37 +0000 (15:12 +0200)]
ceph-rgw: Work around Jinja2 < 2.8 missing eq test
EL7 ships with Jinja2 version 2.7, which is missing the `eq` test.
Work around this by using `match` instead.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
Benoît Knecht [Tue, 27 Jul 2021 07:31:35 +0000 (09:31 +0200)]
ceph-rgw: Set pg_num on RGW pool if required
If the `pg_num` value specified in `rgw_create_pools` is different from the
actual value in the cluster, apply it with `ceph osd pool set`.
This corresponds to the behavior of the `ceph_pool` module used in Ceph Ansible
5.0 onward.
Also avoid setting the pool application if it's already done.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
Dimitri Savineau [Mon, 9 Aug 2021 19:41:40 +0000 (15:41 -0400)]
switch2container: fix mon quorum check
This was reverted by 7ddbe74
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1990733
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Mon, 9 Aug 2021 14:33:40 +0000 (10:33 -0400)]
ceph-dashboard: fix TLS cert openssl generation
With OpenSSL versions prior to 1.1.1 (like CentOS 7 with 1.0.2k), the -addext
option doesn't exist.
As a solution, this uses the default openssl.cnf configuration file as a
template and adds the subjectAltName in the v3_ca section. This temp openssl
configuration file is removed after the TLS certificate creation.
This patch also moves the run_once statement to the block level.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
5e0ace7e5493f7d8299155e915435691a0f1a007 )
Guillaume Abrioux [Thu, 5 Aug 2021 13:00:49 +0000 (15:00 +0200)]
dashboard: subj_alt_names fact refactor
the current way the variable is built results in:
```
2021-08-03 04:18:23,020 - ceph.ceph - INFO - ok: [ceph-sangadi-4x-indpt6-node1-installer] => changed=false
ansible_facts:
subj_alt_names: |-
subjectAltName=ceph-sangadi-4x-indpt6-node1-installer/subjectAltName=10.0.210.223/subjectAltName=ceph-sangadi-4x-indpt6-node1-installersubjectAltName=ceph-sangadi-4x-indpt6-node2/subjectAltName=10.0.210.252/subjectAltName=ceph-sangadi-4x-indpt6-node2/
```
which is incorrect.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978869
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
6f1a0634f73ad1f41af613a9452dc9c5f70b2702 )
VasishtaShastry [Fri, 6 Aug 2021 10:40:19 +0000 (16:10 +0530)]
Fixes typo in rgw-add-users-buckets playbook
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit
478d9fdcb6fe6fb6ef7d00c9fe09dd48acd345cd )
Dimitri Savineau [Fri, 6 Aug 2021 15:27:08 +0000 (11:27 -0400)]
add-osd: use container_exec_cmd fact from mon host
Because we're delegating the task to the first monitor node, we need to be
sure that the container_exec_cmd fact is the one from that node too otherwise
we could have a mismatch on the ceph-mon container name.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1990772
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Teoman ONAY [Tue, 3 Aug 2021 14:06:53 +0000 (16:06 +0200)]
podman pids.max default value is 2048 and docker's is 4096, both of which are
sufficient for the default value (512) of the rgw thread pool size.
But if that value is increased close to the pids-limit value,
it leaves no room for the other processes to spawn and run within
the container, and the container crashes.
This sets pids-limit to unlimited regardless of the container engine.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1987041
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit
9b5d97adb95a788bc1fdedbba562a9c71a1808be )
Dimitri Savineau [Tue, 3 Aug 2021 15:58:49 +0000 (11:58 -0400)]
infra: use dedicated variables for balancer status
The balancer status is registered during the cephadm-adopt, rolling_update
and switch2container playbooks. But it is also used in the ceph-handler role
which is included in those playbooks too.
Even if the ceph-handler tasks are skipped for rolling_update and
switch2container, the balancer_status variable is erased with the skip task
result.
play1:
  register: balancer_status
play2:
  register: balancer_status <-- skipped
play3:
  when: (balancer_status.stdout | from_json)['active'] | bool
This leads to issues like:
The conditional check '(balancer_status.stdout | from_json)['active'] | bool'
failed. The error was: Unexpected templating type error occurred on
({% if (balancer_status.stdout | from_json)['active'] | bool %} True
{% else %} False {% endif %}): expected string or buffer.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1982054
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
386661699bcfe05a220de6d58b9d50baa7eb6dc1 )
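The overwrite described above can be mimicked in a small Python simulation. This is only a model of the behaviour (a skipped task still registers its result under the given name); the registry dict, the helper names, and the dedicated variable name `balancer_status_update` are illustrative assumptions, not the names used by the playbooks.

```python
import json


def run_task(registry, name, stdout=None, skipped=False):
    """Mimic `register`: even a skipped task stores a result under `name`."""
    if skipped:
        registry[name] = {"changed": False, "skipped": True}
    else:
        registry[name] = {"changed": False, "stdout": stdout}


def balancer_active(result):
    """Equivalent of (result.stdout | from_json)['active'] | bool."""
    return bool(json.loads(result["stdout"])["active"])


def check(registry, name):
    """Return the balancer state, or None when the result has no stdout."""
    try:
        return balancer_active(registry[name])
    except KeyError:
        return None


# Shared name: the skipped task erases the earlier result.
shared = {}
run_task(shared, "balancer_status", stdout='{"active": true}')
run_task(shared, "balancer_status", skipped=True)
shared_result = check(shared, "balancer_status")

# Dedicated name: the skipped task writes somewhere else.
dedicated = {}
run_task(dedicated, "balancer_status_update", stdout='{"active": true}')
run_task(dedicated, "balancer_status", skipped=True)
dedicated_result = check(dedicated, "balancer_status_update")

print(shared_result, dedicated_result)
```

With a shared name the condition has nothing to parse, while the dedicated name keeps the original command output available to the later play.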
Dimitri Savineau [Wed, 28 Jul 2021 18:54:15 +0000 (14:54 -0400)]
osds: use osd pool ls instead of osd dump command
The ceph osd pool ls detail command is a subset of the ceph osd dump
command.
$ ceph osd dump --format json|wc -c
10117
$ ceph osd pool ls detail --format json|wc -c
4740
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
06471a4b82d63ebb35f80d45aa6ae629a4daeedc )
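As a hedged illustration of consuming the smaller output, the sketch below parses `ceph osd pool ls detail --format json` in Python. The helper names are hypothetical, the sample JSON is made up and far shorter than real output, and `fetch_pool_ls_detail` needs a reachable cluster, so it is defined but not called here.

```python
import json
import subprocess


def fetch_pool_ls_detail(cluster="ceph"):
    """Run the command from the commit message; requires a reachable cluster."""
    cmd = ["ceph", "--cluster", cluster,
           "osd", "pool", "ls", "detail", "--format", "json"]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def pool_names(pool_ls_detail_json):
    """Extract the pool names from the JSON list of pools."""
    return [pool["pool_name"] for pool in json.loads(pool_ls_detail_json)]


# Made-up sample standing in for real command output.
SAMPLE = '[{"pool_name": "rbd", "size": 3}, {"pool_name": "cephfs_data", "size": 3}]'
names = pool_names(SAMPLE)
print(names)
```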
Dimitri Savineau [Thu, 29 Jul 2021 16:26:33 +0000 (12:26 -0400)]
rolling_update: get ceph version when mons exist
eec3878 introduced a regression for upgrade scenarios where there are no
monitor nodes at all (like ganesha standalone, external clients, etc.)
TASK [get the ceph release being deployed] ************************************
task path: infrastructure-playbooks/rolling_update.yml:121
Thursday 29 July 2021 15:55:29 +0000 (0:00:00.484) 0:00:15.802 *********
fatal: [client0]: FAILED! =>
msg: '''dict object'' has no attribute ''mons'''
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
e87a47cf0cc0d01050d0cb94cabbb8bc42db0c57 )
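The failure above comes from looking up a `mons` group that does not exist in the inventory. A minimal Python sketch of the guard, assuming the inventory groups are available as a plain dict (the function name `first_monitor` is hypothetical):

```python
def first_monitor(groups, mon_group_name="mons"):
    """Return the first monitor host, or None when the group is absent or empty."""
    mons = groups.get(mon_group_name, [])
    return mons[0] if mons else None


# Inventory with only external clients: no 'mons' group at all.
no_mons = first_monitor({"clients": ["client0"]})
# Inventory with monitors: the version check can be delegated to mon0.
with_mons = first_monitor({"mons": ["mon0", "mon1"], "clients": ["client0"]})
print(no_mons, with_mons)
```

When the result is None, the release check would simply be skipped instead of failing on the missing attribute.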
Benoît Knecht [Mon, 26 Jul 2021 15:10:19 +0000 (17:10 +0200)]
infrastructure-playbooks: Get Ceph info in check mode
In the `set osd flags` block, run the Ceph commands that gather information
from the cluster (and don't make any changes to it) even when running in check
mode.
This allows the tasks that depend on the variables set by those tasks to
succeed in check mode.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit
d7653dca95247e52c4a6821c1eec00748263082a )
Benoît Knecht [Mon, 26 Jul 2021 11:03:56 +0000 (13:03 +0200)]
ceph-handler: Fix osd handler in check mode
Run the Ceph commands that only gather information (without making any changes
to the cluster) when running Ansible in check mode.
This allows the tasks that depend on the variables set by those tasks to
succeed in check mode.
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit
498acd7527410f7f359b2b0181e83ca39c682ec0 )
Dimitri Savineau [Tue, 27 Oct 2020 16:14:19 +0000 (12:14 -0400)]
library: remove unused module import
Move the import to the top of the file and remove unused module import.
- E402 module level import not at top of file
- F401 'xxxx' imported but unused
This also removes the '# noqa E402' statement from the code.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
2138a00a3294b222d5e8325495300841ed5a7f5f )
Wong Hoi Sing Edison [Thu, 17 Jun 2021 16:18:07 +0000 (00:18 +0800)]
library: flake8 ceph-ansible modules
This commit ensures all ceph-ansible modules pass flake8 properly.
Signed-off-by: Wong Hoi Sing Edison <hswong3i@pantarei-design.com>
(cherry picked from commit
beda1fe77381fbacb40fb75e5c06f36fbbad4a4a )
Dimitri Savineau [Tue, 27 Jul 2021 14:30:30 +0000 (10:30 -0400)]
ceph-defaults: add missing grafana dashboards
The radosgw-sync-overview and rbd-details grafana dashboards were missing
from the list.
Closes: #6758
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
f0ccf3ebf0b1ad6737f0d65174c0024b49db00a4 )
Guillaume Abrioux [Mon, 26 Jul 2021 09:19:36 +0000 (11:19 +0200)]
update: check the ceph release
Check early which Ceph release is going to be deployed and fail if it
doesn't correspond to the ceph-ansible version being used.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1978643
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
eec38784ecfaae6bf51af6cc5e3aea934d1d3d58 )
Dimitri Savineau [Fri, 23 Jul 2021 14:27:55 +0000 (10:27 -0400)]
alertmanager: allow disable dashboard tls verify
When using self-signed/untrusted CA certificates, alertmanager displays
an error in logs. This commit should make those messages
disappear.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1936299
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
9f77b929d145512e0d8886b96caf6047c5072a68 )
Guillaume Abrioux [Mon, 5 Jul 2021 15:49:26 +0000 (17:49 +0200)]
dashboard: support dedicated network for the dashboard
This introduces a new variable `dashboard_network` in order to support
deploying the dashboard on a different subnet.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1927574
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
f4f73b61972f416db9fe6ec305de282094581e07 )
Dimitri Savineau [Fri, 9 Jul 2021 21:24:09 +0000 (17:24 -0400)]
multisite: use node fqdn for endpoints when https
When the rgw_multisite_proto variable is set to https then we shouldn't use
the IP address in the zone endpoints list but the node FQDN to match the
TLS certificate CN.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1965504
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
ad05a0816048a69adba0e9b27683ed799e3c40bd )
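The endpoint rule in this commit can be sketched in a few lines of Python. The function name `zone_endpoint`, its parameters, and the sample values are hypothetical; only the choice between FQDN and IP address based on the protocol comes from the commit message. IPv6 addresses, which need brackets in a URL, are not handled.

```python
def zone_endpoint(proto, ip_address, fqdn, port):
    """Use the FQDN for https so the host matches the certificate CN."""
    host = fqdn if proto == "https" else ip_address
    return f"{proto}://{host}:{port}"


https_endpoint = zone_endpoint("https", "192.0.2.11", "rgw0.example.com", 443)
http_endpoint = zone_endpoint("http", "192.0.2.11", "rgw0.example.com", 8080)
print(https_endpoint)
print(http_endpoint)
```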
Guillaume Abrioux [Wed, 21 Jul 2021 21:16:59 +0000 (23:16 +0200)]
purge: support osd_auto_discovery
This adds a task that zaps by osd id so we can support the scenario
where osds were deployed with `osd_auto_discovery` set to true.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1876860
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
4144074a50ffd1e8893e3af2242fc44a23fd9c3e )
Guillaume Abrioux [Tue, 13 Jul 2021 16:48:42 +0000 (18:48 +0200)]
purge: merge playbooks
This refactor merges the two playbooks so we only have to maintain 1
playbook.
(Symlink the old purge-container-cluster.yml playbook for backward
compatibility).
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
17cd83bf3a35b482fa453b6bef8445e5e1ad8bce )
Guillaume Abrioux [Tue, 13 Jul 2021 15:11:22 +0000 (17:11 +0200)]
purge: drop variables from 'hosts' sections
Those variables are useless given that it is not possible to override them.
Let's replace them with the hardcoded name instead.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
6b50401d0c2021fe691ee4f2be083b059d991c8b )
Guillaume Abrioux [Tue, 13 Jul 2021 12:26:40 +0000 (14:26 +0200)]
purge: reindent playbook
This commit reindents the playbook.
Also improve readability by adding an extra line between plays.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
60aa70a12820835412835063972c34a1c93cac7d )
Dimitri Savineau [Mon, 19 Jul 2021 19:19:23 +0000 (15:19 -0400)]
ceph-mgr: don't install dashboard pkg by default
This is a partial backport of
2547ab60 .
We are currently installing the ceph-mgr-dashboard package even if the
dashboard_enabled variable is set to false.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
Dimitri Savineau [Thu, 15 Jul 2021 19:38:07 +0000 (15:38 -0400)]
ceph-mgr: move mgr module list to common
Populating the ceph_mgr_modules list in the mgr_modules doesn't make sense
since that file is only executed if the list isn't empty or we're using the
dashboard.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
cd06e7c046b3e56920b1f9bdc1907429382bee5c )
Dimitri Savineau [Thu, 15 Jul 2021 20:24:28 +0000 (16:24 -0400)]
ceph-nfs: allow overriding NFS_CORE_PARAM
We already have config override variables for existing blocks (like
ganesha_ceph_export_overrides, ganesha_log_overrides, etc...) or a
global one (ganesha_conf_overrides) but redefining the NFS_CORE_PARAM
block in that variable will erase all previous values (currently only
Bind_Addr).
ganesha_core_param_overrides: |
    Enable_UDP = false;
    NFS_Port = 2050;
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1941775
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
9817d29543099ca640ce8b23da2ab9f26179cba5 )
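To illustrate why a dedicated override is useful, the Python sketch below renders an NFS_CORE_PARAM block that keeps Bind_Addr and appends the override lines, rather than replacing the whole block. The function `render_nfs_core_param` and the address are hypothetical, and the real role renders its configuration from a template, so this only models the intended result.

```python
def render_nfs_core_param(bind_addr, overrides=""):
    """Render an NFS_CORE_PARAM block: existing value first, overrides appended."""
    lines = ["NFS_CORE_PARAM", "{", f"    Bind_Addr = {bind_addr};"]
    for line in overrides.splitlines():
        if line.strip():
            lines.append(f"    {line.strip()}")
    lines.append("}")
    return "\n".join(lines)


# Same content as the ganesha_core_param_overrides example above.
OVERRIDES = """\
Enable_UDP = false;
NFS_Port = 2050;
"""

block = render_nfs_core_param("192.0.2.20", OVERRIDES)
print(block)
```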
Guillaume Abrioux [Fri, 9 Jul 2021 09:07:08 +0000 (11:07 +0200)]
lib/ceph-volume: support zapping by osd_id
This commit adds the support for zapping an osd by osd_id in the
ceph_volume module.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
70f1d6e2cd9ed4abb4db599f9faa816703430d80 )
Dimitri Savineau [Fri, 9 Jul 2021 20:09:49 +0000 (16:09 -0400)]
rolling_update: check quorum state before upgrade
If one of the monitors is out of the quorum then nothing prevents the upgrade
playbook from running.
We only check if we have at least three monitor nodes but we should also
check if those monitor nodes are correctly present in the quorum.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1952571
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
97148dd58c77a84aff1235dc9be3cb8c9d73cc09 )
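The additional check can be sketched in Python as a comparison between the expected monitors and the `quorum_names` list reported by `ceph quorum_status --format json`. The function name, the monitor names, and the sample JSON (reduced to the one field used) are illustrative assumptions, not the playbook's actual task.

```python
import json


def monitors_out_of_quorum(quorum_status_json, expected_monitors):
    """Return the expected monitors that are missing from the quorum."""
    quorum = set(json.loads(quorum_status_json)["quorum_names"])
    return sorted(set(expected_monitors) - quorum)


# Made-up sample: mon2 has dropped out of the quorum.
SAMPLE = '{"quorum_names": ["mon0", "mon1"]}'
missing = monitors_out_of_quorum(SAMPLE, ["mon0", "mon1", "mon2"])
print(missing)
```

An upgrade would only proceed when the returned list is empty, in addition to the existing check on the number of monitor nodes.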
Dimitri Savineau [Wed, 16 Dec 2020 19:18:08 +0000 (14:18 -0500)]
ceph-facts: move device facts to its own file
Instead of reusing the condition 'inventory_hostname in groups[osds]'
on each device facts task, we can move all the tasks into a
dedicated file and set the condition on the import_tasks statement.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
d704b05e52d10910cd68c49033933bd7e6ded268 )
Dimitri Savineau [Tue, 15 Dec 2020 22:34:34 +0000 (17:34 -0500)]
ceph-validate: check logical volumes
We currently don't check if the logical volumes used in lvm_volumes list
for either bluestore data/db/wal or filestore data/journal exist.
We're only doing this on raw devices for batch scenario.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
55bca07cb612b766bc099e14e0a5661185a7f9a6 )
Dimitri Savineau [Tue, 15 Dec 2020 20:08:00 +0000 (15:08 -0500)]
ceph-validate: check db/journal/wal devices too
When using dedicated devices for db/journal/wal objectstore with
ceph-volume lvm batch then we should also validate that those devices
exist and don't use a gpt partition table in addition to the devices
and lvm_volume.data variables.
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit
808e7106dec5f3f7a743fe343ba3023c9390a1ba )