Sébastien Han [Fri, 16 Nov 2018 09:46:10 +0000 (10:46 +0100)]
ceph_key: apply permissions using the ansible file module
Instead of applying file permissions from our code, let's rely on the
Ansible 'file' module for this. This is now handled at the task
declaration level instead of inside the module.
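A minimal sketch of what such a task-level handling could look like (path, owner and mode are illustrative, not the module's actual defaults):
```
- name: set keyring permissions
  file:
    path: "/etc/ceph/{{ cluster }}.client.admin.keyring"  # illustrative path
    owner: ceph
    group: ceph
    mode: "0600"
```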
Sébastien Han [Mon, 26 Nov 2018 10:06:10 +0000 (11:06 +0100)]
sites: fail the playbook on any failure
We need to apply any_errors_fatal: true to every play so it can take
effect, not only on the initial pass. With this flag, any error in the
playbook will cause the playbook to stop.
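A hedged sketch of a play header with the flag applied (hosts and roles are illustrative):
```
- hosts: mons
  any_errors_fatal: true
  become: true
  roles:
    - ceph-defaults
    - ceph-mon
```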
rolling_update: create missing keyring only on running mon
Try to create the potentially missing keys only on monitors that are
actually running.
The current node being played is stopped before this task.
By the way, delegating the command to all nodes but the one currently
being played ensures that the generated keys will be present on all
monitors.
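A minimal sketch of that delegation, assuming the keys are created with the ceph_key module; `keys_to_create` is a placeholder and the exact task differs:
```
- name: create the potentially missing keys on the monitors still running
  ceph_key:
    name: "{{ item.0 }}"
    state: present
    cluster: "{{ cluster }}"
  with_nested:
    - "{{ keys_to_create }}"                                          # hypothetical list of key names
    - "{{ groups[mon_group_name] | difference([inventory_hostname]) }}"
  delegate_to: "{{ item.1 }}"                                         # every mon except the stopped one
```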
Sébastien Han [Wed, 28 Nov 2018 23:27:49 +0000 (00:27 +0100)]
rolling_update: default ceph json output to empty dict
So we can avoid the following failure:
```
The conditional check 'hostvars[mon_host]['ansible_hostname'] in (ceph_health_raw.stdout | from_json)["quorum_names"] or hostvars[mon_host]['ansible_fqdn'] in (ceph_health_raw.stdout | from_json)["quorum_names"]
' failed. The error was: No JSON object could be decoded
```
We just need to set a default; the next iteration will have a more
complete json since the command won't fail.
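One way to express the default, sketched here with `default('{}', true)` so an empty or missing stdout is replaced by an empty JSON object (task shape and retries are illustrative):
```
- name: wait for the monitor to join the quorum
  command: "ceph --cluster {{ cluster }} -s --format json"
  register: ceph_health_raw
  until: >
    hostvars[mon_host]['ansible_hostname'] in
    (ceph_health_raw.stdout | default('{}', true) | from_json).get('quorum_names', [])
  retries: 5
  delay: 10
```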
validate: change default value for `radosgw_address`
Change the default value of `radosgw_address` to keep consistency with
`monitor_address`.
Moreover, `ceph-validate` checks if the value is '0.0.0.0' to determine
if it has to run `check_eth_rgw.yml`.
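Sketched, the check in `ceph-validate` could look like this (the include mechanism shown is an assumption):
```
- name: include check_eth_rgw.yml
  include_tasks: check_eth_rgw.yml
  when: radosgw_address == '0.0.0.0'
```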
When upgrading from RHCS 2.5 to 3.2, it fails because the task `create
ceph mgr keyring(s) when mon is containerized` has a when condition
`inventory_hostname == groups[mon_group_name]|last`.
First, this is incorrect because `inventory_hostname` refers to a
mgr node here, which means this condition would never have been satisfied.
Then, this condition combined with `serial: 1` means the mgr keyring creation
is skipped on the first node. Furthermore, the `ceph-mgr` role tries to copy
the mgr keyring (it's not aware we are running with `serial: 1`), which leads
to a failure like the following:
```
TASK [ceph-mgr : copy ceph keyring(s) if needed] ***************************************************************************************************************************************************************************************************************************************************************************
task path: /usr/share/ceph-ansible/roles/ceph-mgr/tasks/common.yml:10
Tuesday 27 November 2018 12:03:34 +0000 (0:00:00.296) 0:11:01.290 ******
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AnsibleFileNotFound: Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'
failed: [magna021] (item={u'dest': u'/var/lib/ceph/mgr/local-magna021/keyring', u'name': u'/etc/ceph/local.mgr.magna021.keyring', u'copy_key': True}) => {"changed": false, "item": {"copy_key": true, "dest": "/var/lib/ceph/mgr/local-magna021/keyring", "name": "/etc/ceph/local.mgr.magna021.keyring"}, "msg": "Could not find or access '~/ceph-ansible-keys/48d78ac1-e0d6-4e35-ab3e-772aea7828fc//etc/ceph/local.mgr.magna021.keyring'"}
```
The ceph_key module is idempotent, so there is no need to have such a
condition.
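A hedged sketch of the task once the `| last` condition is dropped; the caps, delegation target and remaining `when` are illustrative:
```
- name: create ceph mgr keyring(s) when mon is containerized
  ceph_key:
    name: "mgr.{{ hostvars[item]['ansible_hostname'] }}"
    state: present
    cluster: "{{ cluster }}"
    caps:
      mon: allow profile mgr
      osd: allow *
      mds: allow *
  with_items: "{{ groups[mgr_group_name] }}"
  delegate_to: "{{ groups[mon_group_name][0] }}"
  when: containerized_deployment | bool
```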
```
The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_eth1'
```
Indeed, `radosgw_interface` is the network interface on rgw only. It is
expected that this same interface doesn't exist on `localhost`, so, when
running `shrink_mon.yml`, the role `ceph-defaults` is called in
`hosts: localhost` and causes the playbook to fail.
Sébastien Han [Tue, 20 Nov 2018 10:28:02 +0000 (11:28 +0100)]
ceph-defaults: use podman on Fedora only
It seems Atomic 7.5 already has podman; however, this is an old version
(0.4). The podman integration is targeting RHEL 8, so Fedora is
currently the closest to that.
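A minimal sketch of the resulting selection logic (the fact name is illustrative):
```
- name: set_fact container_binary
  set_fact:
    container_binary: "{{ 'podman' if ansible_distribution == 'Fedora' else 'docker' }}"
```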
Sébastien Han [Mon, 19 Nov 2018 13:12:43 +0000 (14:12 +0100)]
iscsi: expose /dev/log in the container
During their initialisation, both rbd-target-api and rbd-target-gw try to
open /dev/log for their syslog handler. If the device is not present, the
services fail to start. Exposing /dev/log from the host inside the
container solves that problem.
Sébastien Han [Mon, 19 Nov 2018 10:03:45 +0000 (11:03 +0100)]
testinfra: add support for podman
Since we are now testing on both docker and podman, our functional tests must
reflect that. So now, if we detect the podman binary we will use it;
otherwise we default to docker.
Sébastien Han [Fri, 16 Nov 2018 09:41:12 +0000 (10:41 +0100)]
test_build_key_path_bootstrap_osd: fix
The entity name is client.bootstrap-osd (as returned by Ceph), not
bootstrap-osd. The build_key_path function splits 'client.bootstrap-osd'
on the '.', so using bootstrap-osd fails with an index-out-of-range error.
Sébastien Han [Fri, 16 Nov 2018 09:37:07 +0000 (10:37 +0100)]
ceph_key: remove set-uid support
The support for set-uid was removed from Ceph during the Nautilus cycle by
commit d6def8ba1126209f8dcb40e296977dc2b09a376e, so this will not work
anymore when deploying Nautilus clusters and above.
Sébastien Han [Sat, 17 Nov 2018 18:58:54 +0000 (19:58 +0100)]
client: do not use a dummy container anymore
Since 84fcf4639140c390a7f1fcd790ba190503713f86 we now use the container
binary cli to create ceph keys instead of creating a container and
'docker execing' into it.
Sébastien Han [Fri, 16 Nov 2018 09:46:10 +0000 (10:46 +0100)]
ceph_key: rework container support
Previously, we were doing a 'docker exec' inside a mon container. This
worked, but it wasn't ideal since it required a mon to be up to
generate keys. We must be able to generate a key without a running mon,
e.g. when we create the initial key or simply when we want to generate
a key from any node that is not a mon.
Now, just like the ceph_volume module, we use a 'docker run' command with
the right binary as an entrypoint to perform the chosen action. This is
more elegant and also only requires an env variable to be set in the
playbook: CEPH_CONTAINER_IMAGE.
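A hedged usage sketch; the environment variable name comes from the text above, the `ceph_docker_*` variables are assumptions:
```
- name: create a key with the ceph_key module in a containerized deployment
  ceph_key:
    name: client.admin
    state: present
    cluster: "{{ cluster }}"
  environment:
    CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
```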
Andrew Schoen [Tue, 20 Nov 2018 20:28:58 +0000 (14:28 -0600)]
ceph-volume: be idempotent when the batch strategy changes
If you deploy with 2 HDDs and 1 SSD, then on each subsequent deploy both
HDD drives will be filtered out because they're already used by Ceph.
ceph-volume will report this as a 'strategy change' because the device
list went from a mixed type of HDD and SSD to a single type of only SSD.
This situation results in a non-zero exit code from ceph-volume. We want
to handle this situation gracefully and report that nothing will be changed.
A json structure similar to what ceph-volume would have produced is
returned in the 'stdout' key.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650306
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
Sébastien Han [Mon, 26 Nov 2018 16:58:49 +0000 (17:58 +0100)]
osd: expose udev into the container
In order to be able to retrieve udev information, we must expose its
socket. As per https://github.com/ceph/ceph/pull/25201, ceph-volume will
start consuming udev output.
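Purely illustrative sketch of the bind mount, expressed with the docker_container module for brevity (ceph-ansible's real OSD unit templates differ):
```
- name: run the osd container with the udev socket exposed
  docker_container:
    name: "ceph-osd-{{ ansible_hostname }}"
    image: "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
    privileged: true
    volumes:
      - /run/udev/:/run/udev/:ro
      - /dev:/dev
```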
tests: do not fully override previous ceph_conf_overrides
We run an initial deployment with `osd_pool_default_size: 1` in
`ceph_conf_overrides`.
When re-running the playbook to test idempotency and handlers, we reset
`ceph_conf_overrides`; we must append the new value instead of just
overwriting it, otherwise this can lead to errors in the CI.
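A sketch of the idea: the override used for the second run keeps the initial key and adds the new one instead of replacing the whole dict (the added key is illustrative):
```
ceph_conf_overrides:
  global:
    osd_pool_default_size: 1      # keep the value from the initial deployment
    mon_max_pg_per_osd: 512       # illustrative key being added for the re-run
```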
Sébastien Han [Wed, 21 Nov 2018 15:18:58 +0000 (16:18 +0100)]
rolling_update: create rbd and rbd-mirror keyrings
During an upgrade, Ceph won't create keys that did not exist on the
previous version. So after an upgrade from, say, Jewel to Luminous, once
all the monitors have the new version they should get or create the
keys. It's ok for the task to fail, especially for the rbd-mirror
key, which only appears in Nautilus.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1650572
Signed-off-by: Sébastien Han <seb@redhat.com>
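A minimal sketch of such a key-creation step, tolerating failure as described; the key names and exact invocation are illustrative:
```
- name: get or create the rbd and rbd-mirror keys
  command: "ceph --cluster {{ cluster }} auth get-or-create client.{{ item }}"
  with_items:
    - rbd            # illustrative key names
    - rbd-mirror
  failed_when: false
  delegate_to: "{{ groups[mon_group_name][0] }}"
```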
Sébastien Han [Wed, 21 Nov 2018 15:17:04 +0000 (16:17 +0100)]
ceph_key: add a get_key function
When checking if a key exists, we also have to ensure that the key exists
on the filesystem; the key can change in Ceph but still have an outdated
version on the filesystem. This solves that issue.
Sébastien Han [Fri, 16 Nov 2018 15:15:24 +0000 (16:15 +0100)]
switch: disable all ceph units
Prior to this commit we were only disabling the ceph-osd units, but forgot
ceph.target, which controls everything and will restart the
ceph-osd units at each reboot.
Now that everything gets disabled there won't be any conflicts between
the old non-container and the new container units.
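A hedged sketch using the systemd module (the exact list of units iterated over in the playbook differs):
```
- name: disable ceph.target and the osd units on the legacy host
  systemd:
    name: "{{ item }}"
    enabled: no
  with_items:
    - ceph.target
    - ceph-osd.target
```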
Add real default value for osd pool size customization.
Ceph itself has a default value of `3` for `osd_pool_default_size`.
If users don't specify a pool size in the various pool definitions within
ceph-ansible, we should default to `3`.
Besides, this kind of condition isn't really clear:
```
when:
- rbd_pool_size | default ("")
```
We should instead take the customized value, falling back to
`osd_pool_default_size` (whose default in turn points to
`ceph_osd_pool_default_size`, i.e. `3`), and compare it to
`ceph_osd_pool_default_size`.
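That logic, sketched with illustrative pool and variable names:
```
- name: customize pool size
  command: >
    ceph --cluster {{ cluster }} osd pool set rbd
    size {{ rbd_pool_size | default(osd_pool_default_size) }}
  when: (rbd_pool_size | default(osd_pool_default_size) | int) != (ceph_osd_pool_default_size | int)
```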
mon: move `osd_pool_default_pg_num` in `ceph-defaults`
The `osd_pool_default_pg_num` parameter is set in `ceph-mon`.
When using ceph-ansible with `--limit` on a specific group of nodes, it
will fail when trying to access this variable since it wouldn't be
defined.
Typical error seen:
```
$ sudo ceph daemon osd.2 config get osd_memory_target
Can't get admin socket path: unable to get conf option admin_socket for osd.2:
parse error setting 'osd_memory_target' to '7823740108,8' (strict_si_cast:
unit prefix not recognized)
```
This commit ensures the value inserted in ceph.conf will be an integer.
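A minimal sketch of the fix, assuming the value is derived from the host memory; the formula here is an assumption, the point is the trailing `| int` cast so ceph.conf never receives a value like '7823740108,8':
```
- name: set_fact _osd_memory_target
  set_fact:
    _osd_memory_target: "{{ (ansible_memtotal_mb * 1048576 * 0.7) | int }}"  # formula is illustrative
```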
Sébastien Han [Tue, 20 Nov 2018 17:03:14 +0000 (18:03 +0100)]
site: resync container playbook
This PR https://github.com/ceph/ceph-ansible/pull/3251 forgot to create a symlink from site-docker.yml.sample to site-container.yml.sample.
This commit resyncs the playbooks and puts the symlink in place.
Boris Ranto [Tue, 20 Nov 2018 10:46:08 +0000 (11:46 +0100)]
defaults/facts: Use list instead of keys
It is safer to use the list filter than the keys() method, since the keys()
method has some interoperability issues between python2- and
python3-based ansible/jinja.
Boris Ranto [Mon, 19 Nov 2018 23:45:40 +0000 (00:45 +0100)]
start_osds: Use list instead of keys
If you use python3-based ansible then keys() returns a dict_keys object,
not a list of keys. This breaks the installation on such a system. Using
the list filter provides a more robust solution that should work on both
python2- and python3-based ansible. You can find some more information
about the issue here:
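A before/after sketch (the dict variable and task are illustrative):
```
# before: breaks with python3-based ansible, keys() returns a dict_keys object
#   with_items: "{{ osd_ids.keys() }}"
# after: the list filter yields a plain list of keys on python2 and python3
- name: start the osd services
  systemd:
    name: "ceph-osd@{{ item }}"
    state: started
  with_items: "{{ osd_ids | list }}"
```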
Dan Mick [Sat, 17 Nov 2018 00:28:54 +0000 (16:28 -0800)]
validate plugin: handle missing exception fields without traceback
"missing variable" errors introduced by PR3058 would attempt to
be reported, but since the exception contained no "path" definition,
would cause a second exception in the Invalid exception handler.
Make the exception handler verify that any field it tries to use
exists, clean up its message formatting, and reduce the verbose
level to see the literal error from notario in case more goes
wrong in future.
Sébastien Han [Sun, 18 Nov 2018 20:48:47 +0000 (21:48 +0100)]
ceph.ceph-container-common remove symlink
This error was introduced in the recent refactor of ceph-docker-common
in https://github.com/ceph/ceph-ansible/pull/3251. However, the Ansible
Galaxy linter is not happy about it and fails to import the role.
Removing this since it's not used anymore.
Since the `ceph-volume` introduction, there is no need to split those tasks.
Let's refactor this part of the code so it's clearer.
By the way, this was breaking rolling_update.yml when running with
`openstack_config: true`, because nothing in the ceph-osd role ensured OSDs
were started (in `openstack_config.yml` there is a check ensuring all OSDs
are up, which was obviously failing), and it resulted in the OSDs on the
last OSD node not being started anyway.
Setting this to 1 makes the CI cover the related code in the
playbook without breaking the upgrade scenarios.
Those scenarios were broken because of the `TASK [waiting for
clean pgs...]` check in rolling_update.yml: since the pool size for
`cephfs_metadata` and `cephfs_data` is updated to `2` in
`ceph-override.json` and there are not enough OSDs to honor this size,
some PGs are degraded, which makes the mentioned check fail.
Add quotes around package names added in the commit da6f38422396307605d62ef63980bd0c5b7868f6 so that the difference between
the Ansible variables and package names is clear.
Rishabh Dave [Thu, 8 Nov 2018 09:26:58 +0000 (04:26 -0500)]
pass the list of packages to package management modules
Instead of looping over a list of packages or repeating the task
separately for different packages, pass the list of packages to the
task performing package management.
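For example (package names are illustrative):
```
- name: install ceph packages
  package:
    name:
      - ceph-common
      - ceph-mon
      - ceph-osd
    state: present
```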
Sébastien Han [Tue, 23 Oct 2018 16:38:40 +0000 (18:38 +0200)]
ceph_key: add fetch_initial_keys capability
This is needed for Nautilus since the ceph-create-keys script goes away
(https://github.com/ceph/ceph/pull/21305).
Now, if called with 'state: fetch_initial_keys', the module will look up
the keys generated by the monitor and write them down on the filesystem to
the right location (/etc/ceph and /var/lib/ceph/bootstrap*).
This is not applicable to containers since the keys are generated by the
container itself.
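A hedged usage sketch; apart from `state`, the parameter names and the `when` guard are assumptions:
```
- name: fetch the initial keys generated by the monitor
  ceph_key:
    state: fetch_initial_keys
    cluster: "{{ cluster }}"
  when: not containerized_deployment | bool
```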
Noah Watkins [Tue, 6 Nov 2018 16:49:39 +0000 (08:49 -0800)]
Fixup shrink_osd[_container] scenario config
** The configuration seems to be for filestore:
```
[ERROR]: [ceph-osd0] Validation failed for variable: lvm_volumes
```
** Removing `radosgw_interface: eth1` to resolve:
```
The task includes an option with an undefined variable. The error was:
'ansible.vars.hostvars.HostVarsVars object' has no attribute
u'ansible_eth1'

The error appears to have been in
'/home/nwatkins/src/ceph-ansible/roles/ceph-defaults/tasks/set_radosgw_address.yml':
line 21, column 5, but may be elsewhere in the file depending on the
exact syntax problem.

The offending line appears to be:

- name: set_fact _radosgw_address to radosgw_interface - ipv4
  ^ here
```