git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
6 years ago Extends check_devices tasks to non-collocated and lvm-batch scenarios
VasishtaShastry [Fri, 9 Nov 2018 17:20:05 +0000 (22:50 +0530)]
Extends check_devices tasks to non-collocated and lvm-batch scenarios

Tuned the task name and error message to make them easier for users to understand

Fixes BZ 1648168 - ceph-validate : devices are not validated in non-collocated and lvm_batch scenario

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1648168
Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com>
(cherry picked from commit 34c25ef49b10ef6c789447e785a4bf6938c2a804)

6 years ago Convert interface names to underscores
ToprHarley [Mon, 18 Feb 2019 18:02:03 +0000 (19:02 +0100)]
Convert interface names to underscores

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1540881
Signed-off-by: Tomas Petr <tpetr@redhat.com>
(cherry picked from commit 573adce7dd4f306c384b3308c8049ae49ef59716)

6 years ago osd: add ipc=host in systemd template for containers v3.2.8
Guillaume Abrioux [Thu, 28 Feb 2019 12:13:35 +0000 (13:13 +0100)]
osd: add ipc=host in systemd template for containers

in addition to 15812970f033206b8680cc68351952d49cc18314

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d5be83e5042a5e22ace6250234ccd81acaffb0a2)

6 years ago tests: update ceph_volume tests
Guillaume Abrioux [Thu, 28 Feb 2019 09:54:03 +0000 (10:54 +0100)]
tests: update ceph_volume tests

according to the change introduced by b5548ea9412cd7741bee993dddcbfd9daa34cb02

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f2dcb02d213e862c5a5498c2d12cd86b22676c84)

6 years ago cv: expose host ipc namespace to ceph-volume container
Noah Watkins [Thu, 28 Feb 2019 00:05:19 +0000 (16:05 -0800)]
cv: expose host ipc namespace to ceph-volume container

this is needed to properly handle semaphore synchronization for udev
actions via dmcrypt/cryptsetup.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683770
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 15812970f033206b8680cc68351952d49cc18314)

# Conflicts:
# library/ceph_volume.py

6 years ago tests: add lvm bluestore dmcrypt support
Guillaume Abrioux [Thu, 28 Feb 2019 09:42:03 +0000 (10:42 +0100)]
tests: add lvm bluestore dmcrypt support

Add coverage for container / non container lvm bluestore dmcrypt OSDs

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 207fae38d480f7de369106c5bda1dfe0f1b6033c)

6 years ago Removed not needed mountpoint and removed ubuntu section
fpantano [Thu, 28 Feb 2019 07:55:48 +0000 (08:55 +0100)]
Removed not needed mountpoint and removed ubuntu section

Referring to BZ#1683290: since this bug is TripleO specific, removed
the ubuntu section and the useless mountpoints, as dsavineau
suggests.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683290
Signed-off-by: fpantano <fpantano@redhat.com>
(cherry picked from commit 21fad7ced344e441ffcd5c4010d634b81ead517f)

6 years ago Added to the ceph-radosgw service template the ca-trust
fpantano [Tue, 26 Feb 2019 18:51:05 +0000 (19:51 +0100)]
Added to the ceph-radosgw service template the ca-trust
volume, to avoid exposing useless information.
This bug is tracked in the following bugzilla:

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683290
Signed-off-by: fpantano <fpantano@redhat.com>
(cherry picked from commit 0c1944236bfb397e9dff6ef436569556bc00379d)

6 years ago Set permissions on monitor directory to u=rwX,g=rX,o=rX recursive
Kevin Coakley [Tue, 26 Feb 2019 17:30:31 +0000 (09:30 -0800)]
Set permissions on monitor directory to u=rwX,g=rX,o=rX recursive

Set directories to 755 and files to 644 in
/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} recursively instead of
setting files and directories to 755 recursively. The ceph mon
process writes files to this path with permissions 644. This update stops
ansible from updating the permissions in
/var/lib/ceph/mon/{{ cluster }}-{{ monitor_name }} every time ceph mon writes
a file and increases idempotency.
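
A minimal sketch of the permission scheme described above, written in Python rather than as the role's Ansible task. It assumes a POSIX filesystem; the function names are illustrative and not part of ceph-ansible. The symbolic mode u=rwX,g=rX,o=rX gives 755 to directories (and to files that already have an execute bit) and 644 to all other files.

```python
import os
import stat


def target_mode(path):
    """Mode that u=rwX,g=rX,o=rX yields: 755 for directories and for files
    that already have an execute bit, 644 for every other file."""
    mode = os.stat(path).st_mode
    if stat.S_ISDIR(mode) or mode & 0o111:
        return 0o755
    return 0o644


def apply_recursively(root):
    """Apply target_mode() to root and everything below it.

    Returns the number of paths whose mode actually changed, so a return
    value of 0 means the tree was already in the desired state."""
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)
    changed = 0
    for path in paths:
        wanted = target_mode(path)
        if stat.S_IMODE(os.stat(path).st_mode) != wanted:
            os.chmod(path, wanted)
            changed += 1
    return changed
```

Because files written with mode 644 already match the target, a second run reports no changes, which is the idempotency gain the message refers to.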

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683997
Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
(cherry picked from commit d327681b99915578fc8b389fda69556966db905f)

6 years ago mon: Move client admin variable to defaults
Dimitri Savineau [Wed, 27 Feb 2019 16:40:36 +0000 (11:40 -0500)]
mon: Move client admin variable to defaults

There's no need to set the client_admin_ceph_authtool_cap variable
via a set_fact task.
Instead we can set this in the role defaults.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 58a9d310d5651171214dc2a621cf2ba197229951)

6 years ago mon: Add mds permissions to client.admin
Dimitri Savineau [Wed, 27 Feb 2019 16:07:38 +0000 (11:07 -0500)]
mon: Add mds permissions to client.admin

The administrator keyring needs full capabilities on mds like mon,
osd and mgr.
Without this, the client.admin key won't be able to run commands
against mds (like ceph tell mds.0 session ls)

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1672878
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit dd7b7604de62c49cc979adfc89b4b89c1b39ae6e)

6 years ago common: do not override ceph_release when ceph_repository is 'rhcs' v3.2.7
Guillaume Abrioux [Thu, 21 Feb 2019 09:30:29 +0000 (10:30 +0100)]
common: do not override ceph_release when ceph_repository is 'rhcs'

We shouldn't reset `ceph_release` with `ceph_stable_release` when
`ceph_repository` is `rhcs`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2b60a356343677da6371b7861ee657bfd42c54fd)

6 years ago osd: make the 'wait for all osd to be up' task configurable v3.2.6
Guillaume Abrioux [Wed, 20 Feb 2019 15:24:25 +0000 (16:24 +0100)]
osd: make the 'wait for all osd to be up' task configurable

introduce two new variables to make the check that 'wait for all osd to
be up' configurable.
It's possible that for some deployments, OSDs can take longer to be seen
as UP and IN.
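
A rough Python sketch of what a configurable wait looks like. The commit message does not name the two new variables, so `retries` and `delay` below are hypothetical stand-ins, and `wait_until` is not a ceph-ansible function.

```python
import time


def wait_until(check, retries, delay, sleep=time.sleep):
    """Call check() up to `retries` times, sleeping `delay` seconds
    between attempts. Returns the attempt number that succeeded."""
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        if attempt < retries:
            sleep(delay)
    raise TimeoutError("condition not met after %d attempts" % retries)
```

A slow deployment would then raise `retries` or `delay` instead of patching the task.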

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 21e5db8982afd6e075541e7fc88620d59a1df498)

6 years ago ensure at least one osd is up
David Waiting [Mon, 10 Dec 2018 14:54:18 +0000 (09:54 -0500)]
ensure at least one osd is up

The existing task checks that the number of OSDs is equal to the number of up OSDs before continuing.

The problem is that if none of the OSDs have been discovered yet, the task will exit immediately and subsequent pool creation will fail (num_osds = 0, num_up_osds = 0).

This is related to Bugzilla 1578086.

In this change, we also check that at least one OSD is present. In our testing, this results in the task correctly waiting for all OSDs to come up before continuing.
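
The difference between the two checks can be sketched in Python. The function names are illustrative only; the real check lives in an Ansible task.

```python
def osds_ready_old(num_osds, num_up_osds):
    # Old condition: true as soon as the two counters match,
    # including the "nothing discovered yet" case 0 == 0.
    return num_osds == num_up_osds


def osds_ready(num_osds, num_up_osds):
    # New condition: additionally require at least one OSD to exist.
    return num_osds > 0 and num_osds == num_up_osds
```

With `num_osds = 0` and `num_up_osds = 0`, the old check passes immediately while the new one keeps waiting.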

Signed-off-by: David Waiting <david_waiting@comcast.com>
(cherry picked from commit 3930791cb7d2872e3388d33713171d7a0c1951e8)

6 years ago ceph_key: fix rstrip for python 3
Sébastien Han [Mon, 19 Nov 2018 08:56:45 +0000 (09:56 +0100)]
ceph_key: fix rstrip for python 3

Removing bytes literals since rstrip only supports type String or None.
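
For illustration, the Python 3 behaviour behind this fix. The characters being stripped here are an assumption; the commit message does not show the module's actual call.

```python
def strip_newlines(text):
    # Python 3: str.rstrip() accepts a str (or None) argument.
    return text.rstrip("\r\n")


def strip_newlines_bytes_literal(text):
    # Passing a bytes literal to str.rstrip() raises TypeError on Python 3.
    return text.rstrip(b"\r\n")
```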

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit f5c2ca3710844f73960b5e3652c521de97fb3383)

6 years ago setup_ntp: call handler to disable ntpd if chronyd used
Patrick C. F. Ernzer [Thu, 7 Feb 2019 15:36:20 +0000 (16:36 +0100)]
setup_ntp: call handler to disable ntpd if chronyd used

The task setup chronyd called the handler disable chronyd, which of
course defeats the purpose.

Changing the task to disable ntpd instead fixes the issue of chronyd
being disabled after it got enabled.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1673664
Fixes: #3582
Signed-off-by: Patrick C. F. Ernzer <pcfe@redhat.com>
(cherry picked from commit c605ff6a68720ab43b63086c3ac1d529a651f585)

6 years ago iscsi: fix permission denied error
Guillaume Abrioux [Thu, 7 Feb 2019 13:16:13 +0000 (14:16 +0100)]
iscsi: fix permission denied error

Typical error:
```
fatal: [iscsi-gw0]: FAILED! =>
  msg: 'an error occurred while trying to read the file ''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'': [Errno 13] Permission denied: b''/home/guits/ceph-ansible/tests/functional/all_daemons/fetch/e5f4ab94-c099-4781-b592-dbd440a9d6f3/iscsi-gateway.key'''
```

`become: True` is not needed on the following task:

`copy crt file(s) to gateway nodes`.

Since it's already set in the main playbook (site.yml/site-container.yml)

The thing is that the files get generated in the 'fetch_directory' with
root user because there is a 'delegate_to' + we run the playbook with
`become: True` (from main playbook).

The idea here is to create files under ansible user so we can open them
later to copy them on the remote machine.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9d590f4339a4d758f07388bf97b7eabdcbca6043)

6 years ago add 'custom' as valid ceph_repository value
Justin Riley [Fri, 21 Dec 2018 02:33:05 +0000 (21:33 -0500)]
add 'custom' as valid ceph_repository value

This is documented as valid:

https://github.com/ceph/ceph-ansible/blob/561746f75e3913b30e6ae3f14768ebc8a516bf66/group_vars/all.yml.sample#L245

Signed-off-by: Justin Riley <justin.t.riley@gmail.com>
(cherry picked from commit 6a79870d62565eae9ae34a7e5d386941fc8ba590)

6 years ago Fix uses of default(omit) with string concatenation
Leah Neukirchen [Thu, 7 Feb 2019 17:09:21 +0000 (18:09 +0100)]
Fix uses of default(omit) with string concatenation

When {{omit}} is concatenated with another string, it expands to something
like __omit_place_holder__63eea0d96dd6ed867b95405e11d87dddf61f448d.
However, in these use-cases we need an empty string.
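
A plain-Python model of the problem, not Ansible or Jinja2 code. `OMIT` reuses the placeholder quoted above; the `render` helper and the `--cluster ` prefix are hypothetical.

```python
# Placeholder string as quoted in the commit message.
OMIT = "__omit_place_holder__63eea0d96dd6ed867b95405e11d87dddf61f448d"


def render(prefix, value, fallback):
    """Concatenate prefix with value, or with fallback when value is undefined."""
    return prefix + (fallback if value is None else value)


broken = render("--cluster ", None, OMIT)  # placeholder leaks into the string
fixed = render("--cluster ", None, "")     # empty-string default stays clean
```

Once the placeholder is part of a longer string, it is just text and ends up in the final value, which is why an empty string is the right default in these spots.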

Regression introduced in d53f55e807e.

Signed-off-by: Leah Neukirchen <leah.neukirchen@mayflower.de>
6 years ago tests: do not deploy iscsigw on ubuntu
Guillaume Abrioux [Wed, 6 Feb 2019 12:23:38 +0000 (13:23 +0100)]
tests: do not deploy iscsigw on ubuntu

iscsigw is not supported on non-RHEL-based distributions

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago tests: add inventory file
Guillaume Abrioux [Wed, 6 Feb 2019 12:22:51 +0000 (13:22 +0100)]
tests: add inventory file

add missing inventory file for ubuntu-container-all_daemons job

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago ansible: increase fact cache timeout
Guillaume Abrioux [Wed, 6 Feb 2019 07:32:04 +0000 (08:32 +0100)]
ansible: increase fact cache timeout

10m seems a bit low, indeed, a complete run can take more than 1h.
Let's increase it to 2h

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b37c4adb32715b8749b7d6714a20b8b538bdf214)

6 years ago osd: expose udev into the container
Sébastien Han [Mon, 26 Nov 2018 16:58:49 +0000 (17:58 +0100)]
osd: expose udev into the container

In order to be able to retrieve udev information, we must expose its
socket. As per, https://github.com/ceph/ceph/pull/25201 ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 997667a8734eddaa616fe642e57f6378408736a9)

6 years ago osd: bind mount /var/run/udev/
Guillaume Abrioux [Tue, 5 Feb 2019 08:25:20 +0000 (09:25 +0100)]
osd: bind mount /var/run/udev/

without this, the command `ceph-volume lvm list --format json` hangs and
takes a very long time to complete.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7ade0328072896e99817b070b6a82448024bfb84)

6 years ago shrink_osd: use cv zap by fsid to remove parts/lvs
Noah Watkins [Thu, 17 Jan 2019 23:08:19 +0000 (15:08 -0800)]
shrink_osd: use cv zap by fsid to remove parts/lvs

Fixes:
  https://bugzilla.redhat.com/show_bug.cgi?id=1569413
  https://bugzilla.redhat.com/show_bug.cgi?id=1572933

Note: rebased

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 9a43674d2e91ef46917cabe49651c46b630e5ace)

6 years ago test: add missing test dependency
Noah Watkins [Wed, 16 Jan 2019 23:50:23 +0000 (15:50 -0800)]
test: add missing test dependency

[nwatkins@smash ceph-ansible]$ virtualenv env
[nwatkins@smash ceph-ansible]$ env/bin/pip install -r tests/requirements.txt
[nwatkins@smash ceph-ansible]$ env/bin/python -c "import mock"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'mock'

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 8a5530ee98d3128c9558e8e8e38f9517fb34d7cf)

6 years ago cv: support zap by osd fsid
Noah Watkins [Wed, 16 Jan 2019 23:50:08 +0000 (15:50 -0800)]
cv: support zap by osd fsid

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit fce9f6ef60e3725ac6912bcb150ae59e36ff56fb)

6 years ago set `any_errors_fatal` true for left out host sections v3.2.5
Rishabh Dave [Mon, 28 Jan 2019 09:02:32 +0000 (14:32 +0530)]
set `any_errors_fatal` true for left out host sections

Many hosts sections in site.yml.sample were left out during the
backport commit 6e2cd0930fa17f5d50c73496eff71074301f55bd.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years ago use shortname in keyring path
Patrick Donnelly [Sat, 26 Jan 2019 02:48:28 +0000 (18:48 -0800)]
use shortname in keyring path

socket.gethostname may return a FQDN. Problem found in Linode.
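
A small Python illustration of reducing a possibly fully qualified name to its short form. This is a generic sketch; the commit message does not show how the keyring path is actually built.

```python
import socket


def short_hostname(name=None):
    """Return the part of the hostname before the first dot."""
    if name is None:
        name = socket.gethostname()  # may be a short name or an FQDN
    return name.split(".")[0]
```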

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 8cd0308f5f570635d66295c442ea49dc2c043194)

6 years ago tests: run lvm_setup.yml only when osd_scenario is lvm
Guillaume Abrioux [Wed, 30 Jan 2019 21:42:42 +0000 (22:42 +0100)]
tests: run lvm_setup.yml only when osd_scenario is lvm

especially for ooo_collocation scenario which is still using ceph-disk
testing.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago tests: add nodes for container-all_daemons scenario
Guillaume Abrioux [Wed, 30 Jan 2019 12:11:56 +0000 (13:11 +0100)]
tests: add nodes for container-all_daemons scenario

add back iscsigw and rbdmirror vm in all_daemons testing

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago Add a ceph-volume aware shrink-osd playbook
Noah Watkins [Wed, 31 Oct 2018 18:17:16 +0000 (11:17 -0700)]
Add a ceph-volume aware shrink-osd playbook

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit f5dacbf7de38a9b08cfcf041438d49acce792afe)

6 years ago Rename ceph-disk version of shrink-osd playbook
Noah Watkins [Wed, 31 Oct 2018 18:14:08 +0000 (11:14 -0700)]
Rename ceph-disk version of shrink-osd playbook

This will be replaced by a ceph-volume aware version.

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit 0782cfc546ec398cfa405fb4c9c8226ab52a7960)

6 years ago tests: specify docker params for shrink-osd
Guillaume Abrioux [Tue, 29 Jan 2019 09:42:57 +0000 (10:42 +0100)]
tests: specify docker params for shrink-osd

Otherwise, it will go with the default values, eg:

"latest" for `ceph_docker_image_tag`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago Fixup shrink_osd[_container] scenario config
Noah Watkins [Tue, 6 Nov 2018 16:49:39 +0000 (08:49 -0800)]
Fixup shrink_osd[_container] scenario config

** configuration seems to be for filestore:

[ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes

** Removing `radosgw_interface: eth1` to resolve:

The task includes an option with an undefined variable. The error was:
'ansible.vars.hostvars.HostVarsVars object' has no attribute
u'ansible_eth1'

The error appears to have been in
'/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml':
line 21, column 5, but may be elsewhere in the file depending on the
exact syntax problem.

The offending line appears to be:

  - name: set_fact _radosgw_address to radosgw_interface - ipv4
    ^ here

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
(cherry picked from commit 50255b964084ab52d6ca949b50f413c0ad9e2362)

6 years ago tests: refact testing in stable-3.2
Guillaume Abrioux [Tue, 22 Jan 2019 13:25:45 +0000 (14:25 +0100)]
tests: refact testing in stable-3.2

Apply the same refact recently introduced in master to stable-3.2

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago override ceph_release with ceph_stable_release
Guillaume Abrioux [Tue, 22 Jan 2019 13:25:45 +0000 (14:25 +0100)]
override ceph_release with ceph_stable_release

when `ceph_origin` is set to `'repository'` and `ceph_repository` to
`'community'` we need to ensure `ceph_release` reflects
`ceph_stable_release`.

4a3f180f9d29d5a31468ebb3d1c5f31a53a93960 simply removed the override,
whereas it should still run, but only when the condition mentioned
above is satisfied.
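
The intended rule, expressed as a Python sketch. The function is illustrative; in the playbooks this is a conditional task, not Python.

```python
def resolve_ceph_release(ceph_origin, ceph_repository,
                         ceph_release, ceph_stable_release):
    """Override ceph_release only for community repository installs."""
    if ceph_origin == "repository" and ceph_repository == "community":
        return ceph_stable_release
    # Any other combination (for example ceph_repository == "rhcs")
    # keeps the value that was already set.
    return ceph_release
```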

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0bfefdd5bc06b4f1dd03d9060b0a38a6f447b207)

6 years ago config: remove code related to ceph release prior to luminous
Guillaume Abrioux [Thu, 29 Nov 2018 09:57:54 +0000 (10:57 +0100)]
config: remove code related to ceph release prior to luminous

This part of the code is not needed since ceph-ansible@master is
intended to deploy ceph@master only.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1bbdde272f6ae2107174cd521132b4c2f7ed325a)

6 years ago ceph-default: rm useless condition
Guillaume Abrioux [Wed, 2 Jan 2019 13:30:27 +0000 (14:30 +0100)]
ceph-default: rm useless condition

This condition is useless and it's also creating issues we don't see in
our CI. ceph_release is set by either ceph-common or ceph-docker-common
so let's keep it this way.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1645379
(cherry picked from commit e9188cd202663656a773eb9e2276c6dbc0684599)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago Preserve rolling_update backward compatibility with ansible < 2.5 v3.2.4
Giulio Fidente [Fri, 18 Jan 2019 08:03:40 +0000 (09:03 +0100)]
Preserve rolling_update backward compatibility with ansible < 2.5

Let's enforce the default value for `client_update_batch` to 20 since
`ansible_forks` isn't always available.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650184
Signed-off-by: Giulio Fidente <gfidente@redhat.com>
(cherry picked from commit ff8dbe114cc1e13c8972993c340cd3b1a189d326)

6 years ago Vagrantfile: remove useless default values
Guillaume Abrioux [Wed, 2 Jan 2019 13:46:54 +0000 (14:46 +0100)]
Vagrantfile: remove useless default values

Those default values are useless and might cause issues.

- `osd_scenario` should be mandatory anyway.
- `pool_default_size` is not used anymore (this has been refactored
recently).

Closes: #3468
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit c7a929b2dc75afefc1c30deb6e32e3513a81dc1c)

6 years ago start_osds: use list instead of keys (re-introduce)
Noah Watkins [Wed, 5 Dec 2018 22:04:48 +0000 (14:04 -0800)]
start_osds: use list instead of keys (re-introduce)

the python3 fix merged by:

  https://github.com/ceph/ceph-ansible/pull/3346

was reintroduced a few days later by:

  https://github.com/ceph/ceph-ansible/commit/82a6b5adec4d72eb4b7219147f2225b7b2904460

and this patch fixes it again :)
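
The commit message does not show the changed line, so the following is only a general Python 3 illustration of why `.keys()` and a list are not interchangeable. The helper names are hypothetical.

```python
def first_key(mapping):
    # Works on Python 2 and 3: materialise the keys as a list first.
    return list(mapping)[0]


def first_key_py2_style(mapping):
    # On Python 3, dict.keys() returns a view that cannot be indexed,
    # so this raises TypeError.
    return mapping.keys()[0]
```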

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit 3cf5fd2c3ee1fc342ac8dc3365ed82d863c7127e)

6 years ago site: Make sure is_atomic is defined
Brad Hubbard [Wed, 9 Jan 2019 23:19:02 +0000 (09:19 +1000)]
site: Make sure is_atomic is defined

configure_firewall tests the is_atomic variable if the firewalld package
is not present. is_atomic is defined in ceph_facts so include that.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit 55fab6f547db8c0604abf536b8f1f469c4cfe2ec)

6 years ago ceph-facts: resync group_vars file v3.2.3
Sébastien Han [Mon, 14 Jan 2019 16:53:26 +0000 (17:53 +0100)]
ceph-facts: resync group_vars file

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago switch: do not fail on missing key
Sébastien Han [Mon, 14 Jan 2019 15:31:45 +0000 (16:31 +0100)]
switch: do not fail on missing key

Some people use the switch playbook to perform upgrade so they end up in
the same situation as https://bugzilla.redhat.com/show_bug.cgi?id=1650572
This is applying the same fix as
729744c6a8c69f5fdf66b67fb28063297996e30a.

We don't want to fail on keys that are not present since they will get
created after the mons are updated. They will be created by the task
"create potentially missing keys (rbd and rbd-mirror)".

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago ceph-infra: remove ntp_rpm.yml and ntp_debian.yml v3.2.2
Rishabh Dave [Wed, 9 Jan 2019 15:37:00 +0000 (21:07 +0530)]
ceph-infra: remove ntp_rpm.yml and ntp_debian.yml

This commit fixes the merge conflict that occurred during the
auto-backport and auto-merge of the commit
488281187e8ac6c587db74961db9e075f31c8eae.

Also please note that the commit
488281187e8ac6c587db74961db9e075f31c8eae was merged (on PR 3477)
"as it is" (despite merge conflicts), which was not supposed to be
the case ideally. This had a side-effect that the feature of supporting
multiple NTP daemons (new ones are namely chronyd and timesyncd) was
also backported which is itself against the convention. For
consistency's sake the feature was backported to stable-3.1 as well.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years ago introduce new role ceph-facts
Guillaume Abrioux [Mon, 10 Dec 2018 14:46:32 +0000 (15:46 +0100)]
introduce new role ceph-facts

sometimes we play the whole role `ceph-defaults` just to access the
default value of some variables. It means we play the `facts.yml` part
in this role while it's not desired. Splitting this role will speed up
the playbook.

Closes: #3282
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0eb56e36f8ce52015aa6c343faccd589e5fd2c6c)

6 years ago purge-container: move facts gathering after ceph-defaults role import
Guillaume Abrioux [Wed, 12 Dec 2018 15:34:14 +0000 (16:34 +0100)]
purge-container: move facts gathering after ceph-defaults role import

This task has to be called after the role `ceph-defaults` has been
played, otherwise, `mon_group_name` will never be known.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a12de3e048e1b62b8cbf6bc6d089db1bb880d37c)

6 years ago purge-container: fix wrong syntax
Guillaume Abrioux [Wed, 12 Dec 2018 08:53:32 +0000 (09:53 +0100)]
purge-container: fix wrong syntax

we want a default value for `mon_group_name`, not for
`groups[mon_group_name]`.
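
A Python analogy for the distinction, using dictionaries in place of Ansible variables. The fallback name `"mons"` and both helpers are assumptions made for the example.

```python
def monitor_hosts_wrong(groups, variables):
    # Default applied to the group lookup: the variable itself is still
    # read unconditionally, so an undefined mon_group_name raises KeyError.
    return groups.get(variables["mon_group_name"], [])


def monitor_hosts(groups, variables, fallback_name="mons"):
    # Default applied to the variable; the group is then looked up by name.
    return groups[variables.get("mon_group_name", fallback_name)]
```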

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d0b3cb7f851ece584da47af65c4416ad93541065)

6 years ago purge-docker: do not call ceph-osd role
Guillaume Abrioux [Mon, 10 Dec 2018 20:43:35 +0000 (21:43 +0100)]
purge-docker: do not call ceph-osd role

calling ceph-osd role in purge playbook is not needed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ae7f3d66a67d9e446bf66aed4f26dd5701afdbda)

6 years ago purge: gather monitors facts in OSD purge
Guillaume Abrioux [Wed, 5 Dec 2018 08:06:53 +0000 (09:06 +0100)]
purge: gather monitors facts in OSD purge

the OSD part of the purge delegates commands on monitor node, we need to
gather monitors facts to know the `ansible_hostname` fact that is used
in the `docker_exec_cmd` fact.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1a4a6ec855fcb2943d78d2a84c29a7e8c588824f)

6 years ago purge-container: gather fact before calling ceph-defaults
Sébastien Han [Tue, 4 Dec 2018 08:22:34 +0000 (09:22 +0100)]
purge-container: gather fact before calling ceph-defaults

ceph-defaults relies on facts so we must gather facts before running it.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 62111ff53c6a139b7ce2195c2b13d0c0b22e2769)

6 years ago purge-cluster: add support for mon/mgr collocation
Sébastien Han [Mon, 3 Dec 2018 21:59:17 +0000 (22:59 +0100)]
purge-cluster: add support for mon/mgr collocation

Recently we introduced the default collocation of mon/mgr without the
need of a dedicated mgrs section. This means we have to stop the mgr
process on that machine too.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit fc6ebd8ebb39c74544ece57c401783908ad1ed24)

6 years ago purge-cluster: remove support for other init system
Sébastien Han [Mon, 3 Dec 2018 21:58:19 +0000 (22:58 +0100)]
purge-cluster: remove support for other init system

We only support systemd and use the service module anyway.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 3a154fa0ad64f6704a832743571e5d20b84e9813)

6 years ago purge-docker-cluster: add support for mgr/mon collocation
Sébastien Han [Mon, 3 Dec 2018 21:46:52 +0000 (22:46 +0100)]
purge-docker-cluster: add support for mgr/mon collocation

Recently we introduced the collocation of mon and mgr by default, so we
don't need to have an explicit mgrs section for this. This means we have
to remove the mgr container on the mon machines too.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 325a159415a0eb8699a45c04b2d8ea233b2157c2)

# Conflicts:
# infrastructure-playbooks/purge-docker-cluster.yml

6 years ago purge-docker-cluster: add a task to check hosts
Sébastien Han [Mon, 3 Dec 2018 15:46:38 +0000 (16:46 +0100)]
purge-docker-cluster: add a task to check hosts

It's useful when running on CI to see what might remain on the machines.
So we list all the containers and images. We expect the list to be
empty.

We fail if we see containers running.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 2bcc00896f623e3baa2f306128115134bdce84ce)

6 years ago purge-docker-cluster: add ceph-volume support
Sébastien Han [Thu, 4 Oct 2018 15:40:25 +0000 (17:40 +0200)]
purge-docker-cluster: add ceph-volume support

This commit adds support for purging clusters that were deployed
with ceph-volume. It also uses a block instruction to cleanly separate
the work to do depending on whether lvm is used.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 1751885bc9292581e5114f88fe5b513cb396ed72)

6 years ago The nfs_ganesha_dev_apt_repo variable was set incorrectly in task
Bruceforce [Fri, 4 Jan 2019 16:26:02 +0000 (17:26 +0100)]
The nfs_ganesha_dev_apt_repo variable was set incorrectly in task
"fetch nfs-ganesha development repository".
This has to be pushed directly to stable-3.2 since master has diverged.

Signed-off-by: Bruceforce <Bruceforce@users.noreply.github.com>
6 years ago ceph-infra: disable unrequired NTP services
Rishabh Dave [Wed, 12 Dec 2018 11:23:23 +0000 (16:53 +0530)]
ceph-infra: disable unrequired NTP services

When one of the currently supported NTP services has been set up,
disable the rest of the NTP services on Ceph nodes.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 6fa757d34358e90ae3a2f035b50d319193521ec5)

6 years ago ceph-infra: merge ntp_debian.yml and ntp_rpm.yml
Rishabh Dave [Wed, 12 Dec 2018 11:15:00 +0000 (16:45 +0530)]
ceph-infra: merge ntp_debian.yml and ntp_rpm.yml

Merge ntp_debian.yml and ntp_rpm.yml into one (the new file is called
setup_ntp.yml) since they are almost identical. Also avoid repetition
of the common setup step for ntpd and chronyd services.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit b03ab607422eda0094d74223d52024a373b7ee9a)

# Conflicts:
# roles/ceph-infra/tasks/ntp_debian.yml
# roles/ceph-infra/tasks/ntp_rpm.yml

6 years ago fix json data type v3.2.1
Sébastien Han [Tue, 4 Dec 2018 08:59:47 +0000 (09:59 +0100)]
fix json data type

JSON data is always represented as a string; before this change we
were declaring a dict, which is not valid JSON.
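
In Python terms the distinction looks like this. The `caps` value is a made-up example; the commit message does not say which variable was affected.

```python
import json

# A dict is a Python data structure, not JSON text.
caps = {"mon": "allow *", "osd": "allow *"}  # hypothetical value

# Serialising it produces the JSON document, which is a plain string.
caps_json = json.dumps(caps, sort_keys=True)
```

Code that expects JSON needs `caps_json` (a `str`), and can recover the structure with `json.loads`.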

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1663026
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 896676ee80226121785f44f50d1f01fff5aa2fd7)

6 years ago update: do not enforce `serial: 1` on client nodes
Guillaume Abrioux [Wed, 2 Jan 2019 15:53:06 +0000 (16:53 +0100)]
update: do not enforce `serial: 1` on client nodes

There is no need to enforce `serial: 1` on client nodes.
Let's make it parameterizable by introducing a new *extra* variable
`client_update_batch`, if not filled this will default to `{{
ansible_forks }}`.

NOTE: this is only usable as an extra variable passed with
`-e client_update_batch=<num>`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650184
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 268f2cef821dcb5835bd925c42585ddda5a07861)

6 years ago set any_errors_fatal to true for all host sections
Rishabh Dave [Mon, 17 Dec 2018 10:34:46 +0000 (16:04 +0530)]
set any_errors_fatal to true for all host sections

Add `any_errors_fatal: true` to all host sections in `site.yml.sample`
and `site-container.yml.sample` so that playbook execution
stops immediately when an error occurs.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 5f43dae5938b4c0a3bfbafccf9e2aa13816a237f)

6 years ago add support for rocksdb and wal on the same partition in non-collocated
Kai Wembacher [Thu, 13 Dec 2018 07:42:49 +0000 (08:42 +0100)]
add support for rocksdb and wal on the same partition in non-collocated

Signed-off-by: Kai Wembacher <kai@ktwe.de>
(cherry picked from commit a273ed7f6038b51d3ddb5198d4f3ab57d45bc328)

6 years ago purge: tox add lvm-setup
Sébastien Han [Tue, 4 Dec 2018 08:21:51 +0000 (09:21 +0100)]
purge: tox add lvm-setup

Since we deploy > purge > deploy the LVs are gone so we must recreate
them.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 656fbd290121a79722bd5f3af4bd44e928e74ae2)

6 years ago purge-cluster: skip tasks that use ceph-volume if it's not installed
Andrew Schoen [Tue, 11 Dec 2018 16:52:26 +0000 (10:52 -0600)]
purge-cluster: skip tasks that use ceph-volume if it's not installed

This will allow the playbook to be idempotent.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1656935
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit ffd56177e7616ba6345f1f1cc1f3b3e6ea7d66f3)

6 years ago ceph_keys: pass in module for error messages
Noah Watkins [Thu, 6 Dec 2018 18:34:49 +0000 (10:34 -0800)]
ceph_keys: pass in module for error messages

fixes: #3421

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
(cherry picked from commit 114fac15dc3200bbf9da183c75d889fd75794654)

6 years ago RELEASE-NOTE: fix PR links
Sébastien Han [Mon, 10 Dec 2018 08:47:39 +0000 (09:47 +0100)]
RELEASE-NOTE: fix PR links

Fix wrong position of link and names. The format is [name](link).

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago Add release note for stable-3.2 v3.2.0
Sébastien Han [Mon, 29 Oct 2018 10:38:22 +0000 (11:38 +0100)]
Add release note for stable-3.2

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years ago tests: reintroduce purge_cluster scenario v3.2.0rc8
Guillaume Abrioux [Tue, 4 Dec 2018 09:29:22 +0000 (10:29 +0100)]
tests: reintroduce purge_cluster scenario

- reintroduce `purge_cluster_container` and `purge_cluster_non_container`
on `stable-3.2`,
- remove all purge scenario based on ceph-disk,
- remove purge_lvm_osds_* scenarios.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years ago tests: add purge_lvm_osds_container scenario
Guillaume Abrioux [Fri, 30 Nov 2018 10:25:25 +0000 (11:25 +0100)]
tests: add purge_lvm_osds_container scenario

This commit adds the purge_lvm_osds_container scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b04fe72f35a1d857968463b8ac0421e9b0e03872)

6 years ago purge: add iscsi support
Guillaume Abrioux [Thu, 29 Nov 2018 16:52:18 +0000 (17:52 +0100)]
purge: add iscsi support

add iscsi support for both non containerized and containerized
deployment in purge playbooks.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1651054
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 78116fa6dbe269d1319213251b01e43f1b8f3cff)

6 years agorevert infra: don't restart firewalld if unit is masked
Guillaume Abrioux [Fri, 30 Nov 2018 16:12:21 +0000 (17:12 +0100)]
revert infra: don't restart firewalld if unit is masked

If firewalld unit is masked, setting `configure_firewall: false` is
enough

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1655059
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1cff1f98065bf3b4056810a15998411f7300b58a)

6 years agorolling_update: fail if less than 3 MONs
Ramana Raja [Mon, 3 Dec 2018 14:25:42 +0000 (19:55 +0530)]
rolling_update: fail if less than 3 MONs

... for non-containerized deployments as well.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1655470
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit cb784c601d2063b95fb7d2514e39518137164e12)

6 years agodisable nfs scenario
Sébastien Han [Mon, 3 Dec 2018 09:04:38 +0000 (10:04 +0100)]
disable nfs scenario

The packages are broken, so let's remove the scenario until this is solved.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit a502327e52b2577a721790fce1cdc5e3201678bf)

6 years agotest: disable nfs for containers
Sébastien Han [Tue, 4 Dec 2018 09:44:28 +0000 (10:44 +0100)]
test: disable nfs for containers

Based on https://github.com/ceph/ceph-container/pull/1269 and given
there are no stable packages or reliable repository, we disable nfs
ganesha temporarily.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 6c3ef90ebe94eb874b415d1cfcf329e20232ba9a)

6 years agoosd: discover osd_objectstore on the fly
Sébastien Han [Fri, 30 Nov 2018 10:20:03 +0000 (11:20 +0100)]
osd: discover osd_objectstore on the fly

Applying and passing the OSD_BLUESTORE/FILESTORE on the fly is wrong for
existing clusters as their config will be changed.

Typically, if an OSD was prepared with ceph-disk on filestore and we
change the default objectstore to bluestore, the activation will fail.
The flag osd_objectstore should only be used for the preparation, not
activation. The activation in this case detects the osd objectstore, which
prevents failures like the one described above.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 4c5113019893c92c4d75c9fc457b04158b86398b)

6 years agoceph-osd: change jinja condition
Sébastien Han [Tue, 27 Nov 2018 16:50:44 +0000 (17:50 +0100)]
ceph-osd: change jinja condition

If an existing cluster runs this config and has ceph-disk OSDs, the
`expose_partitions` won't be expected by jinja since it's inside the
'old' if. We need it as part of the osd_scenario != 'lvm' condition.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit bef522627e1e9827b86710c7a54f35a0cd596fbb)

6 years agorolling_update: do not fail on missing keys v3.2.0rc7
Sébastien Han [Thu, 29 Nov 2018 13:26:41 +0000 (14:26 +0100)]
rolling_update: do not fail on missing keys

We don't want to fail on keys that are not present since they will get
created after the mons are updated. They will be created by the task
"create potentially missing keys (rbd and rbd-mirror)".

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1650572
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit ebc901c6af67300f7b7b8da1b2d0a74147798da5)

6 years agorgw: use correct default rgw frontend address
Noah Watkins [Fri, 30 Nov 2018 23:46:42 +0000 (15:46 -0800)]
rgw: use correct default rgw frontend address

since 0.0.0.0 is the default radosgw address (not 'address'), not
configuring an address explicitly, and instead configuring the radosgw
interface, would result in 0.0.0.0 being used, instead of falling
through to section that inspects the interface config option.
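
The fall-through can be sketched in Python as follows. This is an illustration under assumptions, not the role's template code: `pick_frontend_address` and the `interface_ips` mapping are hypothetical, and only the `0.0.0.0` default comes from the message above.

```python
DEFAULT_RADOSGW_ADDRESS = "0.0.0.0"  # default named in the commit message


def pick_frontend_address(radosgw_address, radosgw_interface, interface_ips):
    """Choose the address the rgw frontend should bind to.

    interface_ips is a hypothetical mapping of interface name -> IPv4 address.
    """
    # An explicitly configured address always wins.
    if radosgw_address != DEFAULT_RADOSGW_ADDRESS:
        return radosgw_address
    # Address left at its default: fall through to the interface option.
    if radosgw_interface in interface_ips:
        return interface_ips[radosgw_interface]
    # Nothing configured at all: keep the default.
    return DEFAULT_RADOSGW_ADDRESS
```

Comparing against the real default (`0.0.0.0`) rather than a placeholder string is what lets the interface branch be reached.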

backport note: this cannot be cherry-picked from master since this code
doesn't exist in master.

fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1655131

Signed-off-by: Noah Watkins <nwatkins@redhat.com>
6 years agotox.ini: setup LVs in OSD hosts for '*-cluster' scenarios
Ramana Raja [Fri, 30 Nov 2018 15:01:13 +0000 (20:31 +0530)]
tox.ini: setup LVs in OSD hosts for '*-cluster' scenarios

... as the scenarios set up ceph clusters with LVM OSDs.

Closes: https://github.com/ceph/ceph-ansible/issues/3399
Signed-off-by: Ramana Raja <rraja@redhat.com>
6 years agoosd: manage legacy ceph-disk non-container startup v3.2.0rc6
Sébastien Han [Thu, 29 Nov 2018 13:59:25 +0000 (14:59 +0100)]
osd: manage legacy ceph-disk non-container startup

The code is now able (again) to start osds that were configured with
ceph-disk on a non-container scenario.

Closes: https://github.com/ceph/ceph-ansible/issues/3388
Signed-off-by: Sébastien Han <seb@redhat.com>
6 years agoconfig: write jinja comment with appropriate syntax
Guillaume Abrioux [Thu, 29 Nov 2018 09:16:52 +0000 (10:16 +0100)]
config: write jinja comment with appropriate syntax

jinja comment should be written using the jinja syntax `{# ... #}`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654441
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a86c2b85263f84891e2cbf7e782f7ac8891257b3)

6 years agorolling_update: default ceph json output to empty dict v3.2.0rc5
Sébastien Han [Wed, 28 Nov 2018 23:27:49 +0000 (00:27 +0100)]
rolling_update: default ceph json output to empty dict

So we can avoid the following failure:

The conditional check 'hostvars[mon_host]['ansible_hostname'] in (ceph_health_raw.stdout | from_json)["quorum_names"] or hostvars[mon_host]['ansible_fqdn'] in (ceph_health_raw.stdout | from_json)["quorum_names"]
' failed. The error was: No JSON object could be decoded

We just need to set a default, the next iteration will have a more
complete json since the command won't fail.
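
As a rough Python analogy of that default (the playbook itself uses Jinja filters, and the helper names here are hypothetical), empty output is treated as an empty dict so the membership test simply evaluates to false instead of raising a decode error:

```python
import json


def quorum_names(stdout):
    """Parse ceph JSON output, treating empty or missing output as {}."""
    data = json.loads(stdout) if stdout else {}
    # A missing "quorum_names" key yields an empty list rather than an error.
    return data.get("quorum_names", [])


def mon_in_quorum(stdout, hostname, fqdn):
    """Mirror the conditional: short hostname or FQDN present in the quorum."""
    names = quorum_names(stdout)
    return hostname in names or fqdn in names
```

With `stdout` empty, `mon_in_quorum` returns `False`, so a retry loop would just move on to its next iteration.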

Signed-off-by: Sébastien Han <seb@redhat.com>
6 years agoclient: change default pool size
Guillaume Abrioux [Wed, 21 Nov 2018 16:28:00 +0000 (17:28 +0100)]
client: change default pool size

default pool size should match the real default that is defined in ceph
itself.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ed42262b372ace8688c2b20a05d143e46174ec08)

6 years agodefaults: change default size for openstack pools
Guillaume Abrioux [Wed, 21 Nov 2018 16:27:11 +0000 (17:27 +0100)]
defaults: change default size for openstack pools

default pool size should match the real default that is defined in ceph
itself.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6d1fe329980b91944cae58e68b909d34892667e7)

6 years agodefaults: change for default pool size for cephfs_pools
Guillaume Abrioux [Wed, 21 Nov 2018 16:08:19 +0000 (17:08 +0100)]
defaults: change for default pool size for cephfs_pools

default pool size should match the real default that is defined in ceph
itself.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit fdc438dd0dd7ad91b296008a7335460a88c2ca4a)

6 years agodefaults: add ceph related vars file
Guillaume Abrioux [Wed, 21 Nov 2018 10:06:45 +0000 (11:06 +0100)]
defaults: add ceph related vars file

This is to add a granularity level.
We can have ceph specific variables that users shouldn't have to change
here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f1735e9bb016dd30c2164b9f8ec6f644914052b1)

6 years agorefact osd pool size customization
Guillaume Abrioux [Wed, 21 Nov 2018 10:00:11 +0000 (11:00 +0100)]
refact osd pool size customization

Add real default value for osd pool size customization.
Ceph itself has an `osd_pool_default_size` default value of `3`.

If users don't specify a pool size in the various pool definitions within
ceph-ansible, we should default to `3`.

By the way, this kind of condition isn't really clear:
```
when:
  - rbd_pool_size | default ("")
```

we should try to get the customized value then default to what is in
`osd_pool_default_size` (which has its default value pointing to
`ceph_osd_pool_default_size` (`3`) as well) and compare it to
`ceph_osd_pool_default_size`.
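
The intended resolution order can be sketched in Python. This is an assumption-laden illustration rather than the role's Jinja: the `pool` dict with an optional `size` key and both function names are hypothetical.

```python
CEPH_OSD_POOL_DEFAULT_SIZE = 3  # Ceph's own default, per the message above


def effective_pool_size(pool, osd_pool_default_size=CEPH_OSD_POOL_DEFAULT_SIZE):
    """Return the pool's customized size, else the configured default."""
    return pool.get("size", osd_pool_default_size)


def needs_size_customization(pool, osd_pool_default_size=CEPH_OSD_POOL_DEFAULT_SIZE):
    """True when the resolved size differs from Ceph's built-in default."""
    resolved = effective_pool_size(pool, osd_pool_default_size)
    return resolved != CEPH_OSD_POOL_DEFAULT_SIZE
```

Unlike a truthiness check on an empty-string default, the comparison states explicitly when a size change is required.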

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7774069d45477df9f37c98bc414b3bf38cf41feb)

6 years agomon: move `osd_pool_default_pg_num` in `ceph-defaults`
Guillaume Abrioux [Tue, 13 Nov 2018 14:40:35 +0000 (15:40 +0100)]
mon: move `osd_pool_default_pg_num` in `ceph-defaults`

`osd_pool_default_pg_num` parameter is set in `ceph-mon`.
When using ceph-ansible with `--limit` on a specific group of nodes, it
will fail when trying to access this variable since it wouldn't be
defined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d4c0960f04342e995db2453b50940aa9933ceb09)

6 years agotests: change default pools size
Guillaume Abrioux [Wed, 21 Nov 2018 16:28:31 +0000 (17:28 +0100)]
tests: change default pools size

default pool size in our test should be explicitly set to 1

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoupdate: fix a typo
Guillaume Abrioux [Mon, 26 Nov 2018 13:10:19 +0000 (14:10 +0100)]
update: fix a typo

`hostvars[groups[mon_host]]['ansible_hostname']` seems to be a typo.
That should be `hostvars[mon_host]['ansible_hostname']`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7c99b6df6d8f0daa05ed8da987984d638af3a794)

6 years agotests: do not fully override previous ceph_conf_overrides
Guillaume Abrioux [Thu, 22 Nov 2018 10:33:20 +0000 (11:33 +0100)]
tests: do not fully override previous ceph_conf_overrides

We run an initial deployment with `osd_pool_default_size: 1` in
`ceph_conf_overrides`.
When re-running the playbook to test idempotency and handlers, we reset
`ceph_conf_overrides`; we must append a new value instead of just
overwriting it, otherwise this can lead to errors in the CI.
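
The difference between appending and overwriting can be sketched in Python. `merge_overrides` is a hypothetical helper and `example_option` is a made-up setting; only `osd_pool_default_size: 1` comes from the message above.

```python
import copy


def merge_overrides(base, extra):
    """Return base with extra merged in per section, leaving base untouched."""
    merged = copy.deepcopy(base)
    for section, options in extra.items():
        # Add to an existing section instead of replacing it wholesale.
        merged.setdefault(section, {}).update(options)
    return merged


initial = {"global": {"osd_pool_default_size": 1}}
rerun = {"global": {"example_option": "value"}}  # hypothetical second-run override
merged = merge_overrides(initial, rerun)
```

A plain assignment of `rerun` would silently drop `osd_pool_default_size`, which is the situation the commit avoids.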

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f290e49df86a6c878dfffa4d017537f3be6ff615)

6 years agorolling_update: refact set_fact `mon_host`
Guillaume Abrioux [Thu, 22 Nov 2018 16:52:58 +0000 (17:52 +0100)]
rolling_update: refact set_fact `mon_host`

Each monitor node should select another monitor which isn't itself.
Otherwise, one node in the monitor group won't set this fact, which
causes a failure.

Typical error:
```
TASK [create potentially missing keys (rbd and rbd-mirror) when mon is containerized] ***
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-dev-update_docker_cluster/rolling_update.yml:200
Thursday 22 November 2018  14:02:30 +0000 (0:00:07.493)       0:02:50.005 *****
fatal: [mon1]: FAILED! => {}

MSG:

The task includes an option with an undefined variable. The error was: 'dict object' has no attribute u'mon2'
```
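
The selection rule can be sketched in Python. `pick_mon_host` is a hypothetical helper illustrating the idea, not the playbook's `set_fact` task:

```python
def pick_mon_host(mons, current):
    """Return the first monitor in mons that is not current."""
    for mon in mons:
        if mon != current:
            return mon
    # A single-monitor group has no other monitor to delegate to.
    raise ValueError("need at least two monitors to pick another one")
```

Because every node applies the same rule against its own name, each monitor ends up with the fact defined.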

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit af78173584f1b3a99515e9b94f450be22420c545)

6 years agorolling_update: create rbd and rbd-mirror keyrings
Sébastien Han [Wed, 21 Nov 2018 15:18:58 +0000 (16:18 +0100)]
rolling_update: create rbd and rbd-mirror keyrings

During an upgrade, ceph won't create keys that did not exist in the
previous version. So after an upgrade from, say, Jewel to Luminous, once
all the monitors have the new version they should get or create the
keys. It's ok to have the task fail, especially for the rbd-mirror
key, which only appears in Nautilus.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650572
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 4e267bee4f9263b9ac3b5649f1e3cf3cbaf12d10)

6 years agoceph_key: add a get_key function
Sébastien Han [Wed, 21 Nov 2018 15:17:04 +0000 (16:17 +0100)]
ceph_key: add a get_key function

When checking if a key exists we also have to ensure that the key exists
on the filesystem, the key can change on Ceph but still have an outdated
version on the filesystem. This solves this issue.
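
A minimal sketch of such a comparison in Python, assuming the keyring file uses the usual INI-like layout (`[entity]` followed by `key = ...`). The helper names and the key value are made up, and this is not the module's actual `get_key` code:

```python
import configparser


def key_on_disk(keyring_text, entity):
    """Return the secret stored for entity in keyring text, or None."""
    # Interpolation is disabled so values are read back verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(keyring_text)
    if not parser.has_section(entity):
        return None
    return parser.get(entity, "key", fallback=None)


def key_is_current(cluster_key, keyring_text, entity):
    """True when the on-disk secret matches the one the cluster reports."""
    return key_on_disk(keyring_text, entity) == cluster_key
```

When `key_is_current` is false, the caller would know the file on disk is stale and needs to be rewritten.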

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 691f373543d96d26b1af61c4ff7731fd888a9ce9)

6 years agoswitch: do not look for devices anymore
Sébastien Han [Mon, 19 Nov 2018 13:58:03 +0000 (14:58 +0100)]
switch: do not look for devices anymore

It's easier to look up a directory instead of the block devices,
especially because ceph-volume and ceph-disk have different ways to
handle devices.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit c14f9b78ff7b88419148ac2dd01611b7ec830598)

6 years agoswitch: disable all ceph units
Sébastien Han [Fri, 16 Nov 2018 15:15:24 +0000 (16:15 +0100)]
switch: disable all ceph units

Prior to this commit we were only disabling ceph-osd units, but forgot
the ceph.target which is controlling everything and will restart the
ceph-osd units at each reboot.
Now that everything gets disabled there won't be any conflicts between
the old non-container and the new container units.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit cd56dad9fa4574f8474c362083d97003f62926ab)

6 years agoswitch: do not mask systemd unit
Sébastien Han [Tue, 13 Nov 2018 16:43:21 +0000 (17:43 +0100)]
switch: do not mask systemd unit

If we mask it we won't be able to start the OSD container since now the
osd container uses the osd ID as a name such as: ceph-osd@0

Fixes the error:  Failed to execute operation: Cannot send after transport endpoint shutdown

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit fe1d09925ae1525e99f22a3eab9ca1823c079bda)

6 years agoosd: re-introduce disk_list check
Sébastien Han [Wed, 28 Nov 2018 23:10:29 +0000 (00:10 +0100)]
osd: re-introduce disk_list check

This commit
https://github.com/ceph/ceph-ansible/commit/4cc1506303739f13bb7a6e1022646ef90e004c90#diff-51bbe3572e46e3b219ad726da44b64ebL13
accidentally removed this check.

This is a must have for ceph-disk based containerized OSDs.

Signed-off-by: Sébastien Han <seb@redhat.com>