git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
ceph-ansible.git
6 years ago container-common: support podman on Ubuntu
Dimitri Savineau [Fri, 17 May 2019 21:10:34 +0000 (17:10 -0400)]
container-common: support podman on Ubuntu

Currently we're only able to use podman on Ubuntu if podman is
installed manually before the ceph-ansible execution,
because the deb package is provided by an external repository.
We already manage the docker-ce installation via an external
repository so we should be able to allow the podman installation
with the same mechanism too.

https://github.com/containers/libpod/blob/master/install.md#ubuntu

Resolves: #3947

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 518ab794fb0965c6ca8af56f18e0c54529eca8d5)

6 years ago podman: Add systemd dependency on network.target
Dimitri Savineau [Thu, 6 Jun 2019 19:41:35 +0000 (15:41 -0400)]
podman: Add systemd dependency on network.target

When using podman, the systemd unit scripts don't have a dependency
on the network. So we're not sure that the network is up and running
when the containers are starting.
With docker this behaviour is already handled because the systemd
unit scripts depend on docker service which is started after the
network.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f49090df7ef82419c69dfd7a22250a79c17de42f)

6 years ago ansible: use 'bool' filter on boolean conditionals
L3D [Wed, 22 May 2019 08:02:42 +0000 (10:02 +0200)]
ansible: use 'bool' filter on boolean conditionals

Running ceph-ansible produces a lot of ``[DEPRECATION WARNING]`` messages like this one:
```
[DEPRECATION WARNING]: evaluating containerized_deployment as a bare variable,
this behaviour will go away and you might need to add |bool to the expression
in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This
feature will be removed in version 2.12. Deprecation warnings can be disabled
by setting deprecation_warnings=False in ansible.cfg.
```

This commit appends ``| bool`` to many of the affected variables.

In some places the coding style also changed from ``variable|bool`` to ``variable | bool`` *(with spaces around the pipe)*.

Closes: #4022
Signed-off-by: L3D <l3d@c3woc.de>
(cherry picked from commit ab54fe20ec2e3bf16e4544c39548d1e21dacf0d5)
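
The difference between testing a string for truthiness and converting it with a bool-style filter can be sketched in Python. This is only an illustration: `to_bool` is a hypothetical stand-in that approximates the spellings such a filter accepts, not Ansible's own code.

```python
def to_bool(value):
    """Approximate a bool-style filter: only a few spellings count as true."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, str):
        value = value.lower()
    return value in ("yes", "on", "1", "true", 1)


# A value passed on the command line typically arrives as a string.
containerized_deployment = "false"

# Plain Python truthiness: any non-empty string is true.
bare_truthiness = bool(containerized_deployment)

# With the conversion: the string is interpreted by its content.
filtered = to_bool(containerized_deployment)

print(bare_truthiness, filtered)
```

Making the conversion explicit removes the ambiguity about how a string such as "false" is interpreted in a conditional.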

6 years ago purge-cluster: clean all ceph repo files
Dimitri Savineau [Thu, 6 Jun 2019 17:51:16 +0000 (13:51 -0400)]
purge-cluster: clean all ceph repo files

We currently only purge the rh_storage yum repository file, but depending
on the ceph_repository value we are using, the ceph repository file
could have a different name.

Resolves: #4056

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 44c63903cacb06fd6a32fcc591d31b2be3c7e82a)

6 years ago Add section for purging rgw loadbalancer in purge-cluster.yml
guihecheng [Fri, 1 Mar 2019 07:51:43 +0000 (15:51 +0800)]
Add section for purging rgw loadbalancer in purge-cluster.yml

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
(cherry picked from commit 59e702ec39f5b6b109138f30aa6c45b56b544554)

6 years ago Add section for rgw loadbalancer in site.yml
guihecheng [Thu, 4 Apr 2019 03:33:15 +0000 (11:33 +0800)]
Add section for rgw loadbalancer in site.yml

This makes site.yml run the ceph rgw loadbalancer tasks.

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
(cherry picked from commit 96c346743ba7cff2e737f13b8b442f14c54a9a55)

6 years ago Add role definitions of ceph-rgw-loadbalancer
guihecheng [Thu, 4 Apr 2019 02:54:41 +0000 (10:54 +0800)]
Add role definitions of ceph-rgw-loadbalancer

This adds support for an rgw loadbalancer based on HAProxy and Keepalived.
We define a single role ceph-rgw-loadbalancer and include HAProxy and
Keepalived configurations all in this.

A single haproxy backend is used to balance all RGW instances and
a single frontend is exported via a single port, default 80.

Keepalived is used to maintain the high availability of all haproxy
instances. You are free to use any number of VIPs. A single VIP is
shared across all keepalived instances and there will be one
master for one VIP, selected sequentially, and others serve as
backups.
This assumes that each keepalived instance is on the same node as
one haproxy instance and we use a simple check script to detect
the state of each haproxy instance and trigger the VIP failover
upon its failure.

Signed-off-by: guihecheng <guihecheng@cmiot.chinamobile.com>
(cherry picked from commit 35d40c65f8c7f785a53978210c54f642e1384feb)
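
One way to read "one master for one VIP, selected sequentially" is a round-robin assignment of masters over the keepalived instances. The sketch below illustrates that reading only; the function name, host names and addresses are hypothetical, and the role's actual selection logic may differ.

```python
def assign_vip_masters(vips, hosts):
    """Map each VIP to one master host (round-robin) and the remaining backups."""
    plan = {}
    for index, vip in enumerate(vips):
        # Assumption: masters are picked in order, wrapping around the host list.
        master = hosts[index % len(hosts)]
        backups = [host for host in hosts if host != master]
        plan[vip] = {"master": master, "backups": backups}
    return plan


# Hypothetical inventory: two VIPs shared across three loadbalancer nodes.
plan = assign_vip_masters(["192.168.1.10", "192.168.1.11"], ["lb0", "lb1", "lb2"])
for vip, roles in plan.items():
    print(vip, roles["master"], roles["backups"])
```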

6 years ago validate: add a check for nfs standalone
Guillaume Abrioux [Mon, 20 May 2019 14:28:42 +0000 (16:28 +0200)]
validate: add a check for nfs standalone

If `nfs_obj_gw` is True when deploying an internal ganesha with an
external ceph cluster, `ceph_nfs_rgw_access_key` and
`ceph_nfs_rgw_secret_key` must be provided so the
ganesha configuration file can be generated.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 003aeea45a8e232d2bd592c0dc866eb768e9d812)

6 years ago nfs: support internal Ganesha with external ceph cluster
Guillaume Abrioux [Mon, 20 May 2019 13:58:10 +0000 (15:58 +0200)]
nfs: support internal Ganesha with external ceph cluster

This commit allows deploying an internal ganesha with an external ceph
cluster.

This requires defining `external_cluster_mon_ips` as a comma-separated
list of external monitors.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710358
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6a6785b719d126cf54ebad8b2a22c97d90afd05e)

6 years ago ceph-osd: do not relabel /run/udev in containerized context
Guillaume Abrioux [Mon, 3 Jun 2019 17:15:30 +0000 (19:15 +0200)]
ceph-osd: do not relabel /run/udev in containerized context

Otherwise content in /run/udev is mislabeled, which prevents some services
like NetworkManager from starting.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 80875adba791b732713f686a4e4eba182758dc9d)

6 years ago tests: test podman against atomic os instead rhel8
Guillaume Abrioux [Thu, 23 May 2019 08:49:54 +0000 (10:49 +0200)]
tests: test podman against atomic os instead rhel8

The rhel8 image used is an outdated beta version. It is not worth
maintaining this image upstream, since it's possible to test podman with a
newer version of the centos/atomic-host image.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a78fb209b18e4f8e4f60c92e6f62520446eda486)

6 years ago site-container: update container-engine role
Dimitri Savineau [Tue, 28 May 2019 20:43:48 +0000 (16:43 -0400)]
site-container: update container-engine role

Since the split between container-engine and container-common roles,
the tags and condition were not updated to reflect the change.

- ceph-container-engine needs with_pkg tag
- ceph-container-common needs fetch_container_images
- we don't need to pull the container image in a dedicated task for
atomic host. We can now use the ceph-container-common role.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2d375e1aa779e19a39ac435d7064133add07c3ce)

6 years ago ceph-nfs: use template module for configuration
Dimitri Savineau [Mon, 3 Jun 2019 19:28:39 +0000 (15:28 -0400)]
ceph-nfs: use template module for configuration

789cef7 introduced a regression in the ganesha configuration file
generation. The new config_template module version broke it.
But the ganesha.conf file isn't an ini file and doesn't really
need to use the config_template module. Instead we can use the
classic template module.

Resolves: #4045

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 616c4846980bc01144417416d60fd9bb46aa14a9)

6 years ago ceph-facts: generate fsid on mon node
Dimitri Savineau [Fri, 31 May 2019 17:26:30 +0000 (13:26 -0400)]
ceph-facts: generate fsid on mon node

The fsid generation is done via a python command. When the ansible
controller node only has python3 available (like RHEL 8) then the
python command isn't necessarily present causing the fsid generation
to fail.
We already do some resource creation (like ceph keyring secret) with
the python command too but from the mon node so we should do the same
for fsid.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1714631

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit daf92a9e1f8ed14e03e20a4d908f49c411eb8887)
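
As an illustration of what generating an fsid with Python can look like, here is a minimal sketch. It assumes the fsid is simply a random UUID; `generate_fsid` is a hypothetical helper, not the command the playbook runs.

```python
import uuid


def generate_fsid():
    """Return a random (version 4) UUID string, usable as a cluster fsid."""
    return str(uuid.uuid4())


fsid = generate_fsid()
print(fsid)
```

Whichever node runs this needs a working Python interpreter, which is the reason the commit moves the generation to the mon node.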

6 years ago vagrant: Default box to centos/7
Dimitri Savineau [Fri, 31 May 2019 14:22:15 +0000 (10:22 -0400)]
vagrant: Default box to centos/7

We don't use ceph/ubuntu-xenial anymore but only centos/7 and
centos/atomic-host.
Changing the default to centos/7.

Resolves: #4036

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 24d0fd70030e3014405bf3bf2d628ede4cee6466)

6 years ago Sync config_template from upstream
Kevin Carter [Wed, 22 May 2019 18:08:10 +0000 (13:08 -0500)]
Sync config_template from upstream

This change pulls in the most recent release of the config_template module
into the ceph_ansible action plugins.

Signed-off-by: Kevin Carter <kecarter@redhat.com>
(cherry picked from commit 789cef7621a3869fb42d4b2749f22d11ff08f6e0)

6 years ago tests: add retries on failing tests in testinfra
Guillaume Abrioux [Wed, 22 May 2019 08:42:33 +0000 (10:42 +0200)]
tests: add retries on failing tests in testinfra

This commit adds `pytest-rerunfailures` to requirements.txt so we can
retry failing tests in testinfra to avoid false positives (e.g. a service
sometimes takes too long to start).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4708b7615f6acab9fe9f251eaf2a6da2e1f859ab)
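
The idea behind rerunning a failing test can be sketched in plain Python. This is not how `pytest-rerunfailures` is implemented (the plugin is driven by options such as `--reruns`); `run_with_reruns` and `service_is_running` are hypothetical names used only for illustration.

```python
import time


def run_with_reruns(check, reruns=3, delay=0.0):
    """Call `check` until it passes or `reruns` extra attempts are used up."""
    attempts = reruns + 1
    for attempt in range(1, attempts + 1):
        try:
            return check()
        except AssertionError:
            if attempt == attempts:
                raise  # out of attempts: report the real failure
            time.sleep(delay)


# Simulated slow service: the check only passes on the third call.
calls = {"count": 0}


def service_is_running():
    calls["count"] += 1
    assert calls["count"] >= 3, "service not started yet"
    return True


result = run_with_reruns(service_is_running, reruns=3)
print(result, calls["count"])
```

A test that keeps failing after the last attempt is still reported as a failure, so only transient errors are masked.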

6 years ago roles: introduce `ceph-container-engine` role
Guillaume Abrioux [Mon, 20 May 2019 07:46:10 +0000 (09:46 +0200)]
roles: introduce `ceph-container-engine` role

This commit splits the current `ceph-container-common` role.

This introduces a new role `ceph-container-engine` which handles the
tasks specific to the installation of containers tools (docker/podman).

This is needed for the ceph-dashboard implementation for 2 main reasons:

1/ Since the ceph-dashboard stack is only containerized, we must install
everything needed to run containers even in non containerized
deployments. Splitting this role allows us to not have to call the full
`ceph-container-common` role which would run a bunch of unneeded tasks
that would have been skipped anyway.

2/ The current implementation would have required running
`ceph-container-common` on all ceph-clients nodes which would have been
conflicting with 9d3517c670ea2e944565e1a3e150a966b2d399de (we don't want
to run ceph-container-common on all client nodes, see mentioned commit
for more details)

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 55420d6253bc6605738fe3f4745e2ba08a6ea5b8)

6 years ago ceph-mgr: install python-routes for dashboard
Dimitri Savineau [Fri, 17 May 2019 15:24:00 +0000 (11:24 -0400)]
ceph-mgr: install python-routes for dashboard

The ceph mgr dashboard requires the routes python library to be installed
on the system.

Resolves: #3995

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f37edfa113cc16844b5b76cb218f180124acb283)

6 years ago ceph-prometheus: fix error in templates
Dimitri Savineau [Tue, 21 May 2019 14:29:16 +0000 (10:29 -0400)]
ceph-prometheus: fix error in templates

- remove trailing double quotes in jinja templates
- add jinja filename without .j2 suffix

Resolves: #4011

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 29b0d47c8cc3943ee89aaa660455616f87f90caa)

6 years ago common: use gnupg instead of gpg
Dimitri Savineau [Tue, 21 May 2019 13:21:16 +0000 (09:21 -0400)]
common: use gnupg instead of gpg

The gpg package isn't available on all Debian/Ubuntu distributions, but
gnupg is.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 622d9feae924b216d02fc10a90c5a2089ab98794)

6 years ago config: fix ipv6
Guillaume Abrioux [Tue, 21 May 2019 13:48:34 +0000 (15:48 +0200)]
config: fix ipv6

As of nautilus, if you set `ms bind ipv6 = True` you must explicitly set
`ms bind ipv4 = False` too, otherwise OSDs will still try to pick up an
IPv4 address.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1710319
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6ca7372a2df1cb1ad7ef56b121ebfc94afc24ec7)
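
The rule described above can be sketched as a small helper that emits both options together when IPv6 is requested. The helper names and the `[global]` rendering are assumptions for illustration, not the template shipped in ceph-ansible.

```python
def messenger_bind_options(ip_version):
    """Return the messenger bind options to set for the given IP version."""
    if ip_version not in ("ipv4", "ipv6"):
        raise ValueError("ip_version must be 'ipv4' or 'ipv6'")
    if ip_version == "ipv6":
        # Enabling IPv6 alone is not enough: IPv4 must be disabled explicitly.
        return {"ms bind ipv6": True, "ms bind ipv4": False}
    # Assumption: nothing extra is emitted in the IPv4 case.
    return {}


def render_global_section(options):
    """Render the options as lines of a [global] section."""
    lines = ["[global]"]
    for key, value in options.items():
        lines.append(f"{key} = {value}")
    return "\n".join(lines)


print(render_global_section(messenger_bind_options("ipv6")))
```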

6 years ago tests: update testinfra release
Dimitri Savineau [Tue, 30 Apr 2019 14:24:25 +0000 (10:24 -0400)]
tests: update testinfra release

In order to support ansible 2.8 with testinfra we need to use the
latest release (3.0.x).
Adding ssh-config option to py.test.
Also bumping the pytest and xdist version.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit de147469d7e78be51874575602226a684280ef4a)

6 years ago ceph-nfs: apply selinux fix anyway
Dimitri Savineau [Thu, 18 Apr 2019 14:02:12 +0000 (10:02 -0400)]
ceph-nfs: apply selinux fix anyway

Because ansible_distribution_version doesn't return the minor version on
CentOS with ansible 2.8, we apply the selinux fix anyway, but only for
CentOS/RHEL 7.
Starting with RHEL 8, there's a dedicated package for selinux called
nfs-ganesha-selinux [1].

Also replace the command module + semanage by the selinux_permissive
module.

[1] https://github.com/nfs-ganesha/nfs-ganesha/commit/a7911f

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0ee833432eb5d2b4998002c495ff08a65a3b26c6)

6 years ago ceph-validate: use kernel validation for iscsi
Dimitri Savineau [Thu, 18 Apr 2019 13:37:07 +0000 (09:37 -0400)]
ceph-validate: use kernel validation for iscsi

Ceph iSCSI gateway requires Red Hat Enterprise Linux or CentOS 7.5
or later.
Because we cannot check the ansible_distribution_version fact for
CentOS with ansible 2.8 (it returns only the major version), we
fall back to checking the kernel options:

  - CONFIG_TARGET_CORE=m
  - CONFIG_TCM_USER2=m
  - CONFIG_ISCSI_TARGET=m

http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0c7fd79865d216d1caa1228bbbc9c021551ab12c)
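
A check of this kind can be sketched as a parser over kernel config text. Where the text comes from (for example a file such as `/boot/config-<release>`) is an assumption here, and the function names are hypothetical; the actual validation task may be written differently.

```python
REQUIRED_OPTIONS = ("CONFIG_TARGET_CORE", "CONFIG_TCM_USER2", "CONFIG_ISCSI_TARGET")


def parse_kernel_config(text):
    """Parse NAME=value lines, skipping blank lines and comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options


def missing_iscsi_options(text, required=REQUIRED_OPTIONS, expected="m"):
    """Return the required options that are not set to the expected value."""
    options = parse_kernel_config(text)
    return [name for name in required if options.get(name) != expected]


# Made-up sample config: one of the three options is disabled.
sample = """\
CONFIG_TARGET_CORE=m
# CONFIG_TCM_USER2 is not set
CONFIG_ISCSI_TARGET=m
"""
print(missing_iscsi_options(sample))
```

An empty result means all three options are built as modules, which is the condition the commit message lists.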

6 years ago switch to ansible 2.8
Guillaume Abrioux [Tue, 9 Apr 2019 07:22:06 +0000 (09:22 +0200)]
switch to ansible 2.8

- remove private attribute with import_role.
- update documentation.
- update rpm spec requirement.
- fix MagicMock python import in unit tests.

Closes: #3765
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 72d8315299aa56889d8c680269fdf5da57f9654e)

6 years ago dashboard: move the call to ceph-node-exporter
Guillaume Abrioux [Fri, 17 May 2019 15:34:09 +0000 (17:34 +0200)]
dashboard: move the call to ceph-node-exporter

This moves the call to the ceph-node-exporter role after
ceph-container-common, otherwise it will try to run containers before
docker or podman is installed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7c6a3bf825cfd94aad98734ec1d00109189d005b)

6 years ago common: install dependencies for apt modules
Dimitri Savineau [Fri, 17 May 2019 14:31:46 +0000 (10:31 -0400)]
common: install dependencies for apt modules

When using a minimal Debian/Ubuntu distribution, the
ca-certificates and gpg packages aren't installed, so the apt modules
fail:

Failed to find required executable gpg in paths:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

apt.cache.FetchFailedException:
W:https://download.ceph.com/debian-luminous/dists/bionic/InRelease:
No system certificates available. Try installing ca-certificates.

Resolves: #3994

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 494746b7a661efcf99addd20cfe2ec7b34c4f490)

6 years ago tox: Don't copy infrastructure playbook
Dimitri Savineau [Tue, 23 Apr 2019 14:40:09 +0000 (10:40 -0400)]
tox: Don't copy infrastructure playbook

Since a1a871c we don't need to copy the infrastructure playbooks
under the ceph-ansible root directory.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 0f89a3f7a5924802d32d57bd8a4510a025f5b07e)

6 years ago purge-docker-cluster: don't remove data on atomic
Dimitri Savineau [Thu, 16 May 2019 14:00:58 +0000 (10:00 -0400)]
purge-docker-cluster: don't remove data on atomic

Because we don't manage the docker service on atomic (yet) via the
ceph-container-common role, we can't stop docker and remove
the data.
For now let's do that only for non atomic hosts.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 638604929b2105c1c224a2858df90d976f91761e)

6 years ago dashboard: move defaults variables to ceph-defaults v4.0.0rc8
Guillaume Abrioux [Thu, 16 May 2019 13:58:20 +0000 (15:58 +0200)]
dashboard: move defaults variables to ceph-defaults

There is no need to have default values for these variables in each role
since there are no corresponding host groups.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9f0d4d6847c6edff6804969ab4fdd34451a5d2cc)

6 years ago rename docker_exec_cmd variable
Guillaume Abrioux [Tue, 14 May 2019 12:51:32 +0000 (14:51 +0200)]
rename docker_exec_cmd variable

This commit renames the `docker_exec_cmd` variable to
`container_exec_cmd` so it's more generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e74d80e72fa5044569d30d5185fd16b7debf1dea)

6 years ago dashboard: fix a typo
Guillaume Abrioux [Thu, 16 May 2019 12:36:53 +0000 (14:36 +0200)]
dashboard: fix a typo

6f0643c8e introduced a typo, the role that should be run is
ceph-container-common, not ceph-common

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit acac24d9847b7708a27bfdea36ee73625440720a)

6 years ago tests: add dashboard scenario testing
Guillaume Abrioux [Thu, 16 May 2019 09:19:11 +0000 (11:19 +0200)]
tests: add dashboard scenario testing

This commit adds a new scenario to test the dashboard deployment via
ceph-ansible.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 17634fc3df5adf3b5eca70c42eeb5dd5c235aaae)

6 years ago dashboard: align the way containers are managed
Guillaume Abrioux [Thu, 16 May 2019 08:56:06 +0000 (10:56 +0200)]
dashboard: align the way containers are managed

This commit aligns the way the different containers are managed with how
it's currently done for the other ceph daemons.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cc285c417ab42fa8c3d1bc08bdb95e981ba9444f)

6 years ago dashboard: convert dashboard_rgw_api_no_ssl_verify to a bool
Guillaume Abrioux [Wed, 15 May 2019 14:16:55 +0000 (16:16 +0200)]
dashboard: convert dashboard_rgw_api_no_ssl_verify to a bool

Make `dashboard_rgw_api_no_ssl_verify` a bool variable since it seems to
be used as one.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cd5f3fca649ad823247f96e4060456ca44b1415e)

6 years ago dashboard: generate group_vars sample files
Guillaume Abrioux [Wed, 15 May 2019 14:15:48 +0000 (16:15 +0200)]
dashboard: generate group_vars sample files

generate all group_vars sample files corresponding to new roles added
for ceph-dashboard implementation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 50672c65a6ab87e7536764f43666998301330f53)

6 years ago dashboard: remove legacy file
Guillaume Abrioux [Wed, 15 May 2019 13:00:26 +0000 (15:00 +0200)]
dashboard: remove legacy file

this file seems to be no longer used, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8bbcc46ae4f78c477d041082706f1ddf34c85dee)

6 years ago dashboard: set less permissive permissions on dashboard certificate/key
Guillaume Abrioux [Wed, 15 May 2019 12:38:46 +0000 (14:38 +0200)]
dashboard: set less permissive permissions on dashboard certificate/key

Using `0440` instead of `0644` is enough.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 14f381200d7341ff5c5ce19e8768da8e97f43fcd)
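
What the two modes mean can be checked with Python's standard `stat` module. This is only an illustration of the permission bits; `describe_mode` is a hypothetical helper and nothing here touches real files.

```python
import stat


def describe_mode(mode):
    """Render permission bits the way `ls -l` shows them for a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


old_mode = describe_mode(0o644)  # owner read/write, group and others read
new_mode = describe_mode(0o440)  # owner and group read only, others nothing

# The practical difference: "others" can no longer read the certificate/key.
world_readable_before = bool(0o644 & stat.S_IROTH)
world_readable_after = bool(0o440 & stat.S_IROTH)

print(old_mode, new_mode, world_readable_before, world_readable_after)
```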

6 years ago dashboard: simplify config-key command
Guillaume Abrioux [Wed, 15 May 2019 12:35:24 +0000 (14:35 +0200)]
dashboard: simplify config-key command

Since stable-4.0 isn't meant to deploy ceph releases prior to nautilus,
there's no need to add this complexity here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4405f50c85720ff9d0cee78eb784621b527b28cc)

6 years ago playbook: use blocks for grafana-server section
Guillaume Abrioux [Wed, 15 May 2019 12:11:00 +0000 (14:11 +0200)]
playbook: use blocks for grafana-server section

use a block in grafana-server section to avoid duplicate condition.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit be4a5656125cb26589564e5b8b15829da3db414d)

6 years ago dashboard: do not call ceph-container-common from other role
Guillaume Abrioux [Tue, 14 May 2019 14:34:50 +0000 (16:34 +0200)]
dashboard: do not call ceph-container-common from other role

Use site.yml to deploy ceph-container-common in order to install docker
even in non-containerized deployments since there's no RPM available to
deploy the different applications needed for ceph-dashboard.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit cdff0da7d421b6761842d2d6195f33a4e1030541)

6 years ago dashboard: use existing variable to detect containerized deployment
Guillaume Abrioux [Tue, 14 May 2019 12:46:25 +0000 (14:46 +0200)]
dashboard: use existing variable to detect containerized deployment

there is no need to add more complexity for this, let's use
`containerized_deployment` in order to detect if we are running a
containerized deployment.
The idea is to use `container_exec_cmd` the same way we do in the rest of
the playbook to run the different ceph commands needed to deploy the
ceph-dashboard role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 742bb6214c806cb7db1f1ea54276aecf0bf22049)

6 years ago facts: set container_binary fact in non-containerized deployment
Guillaume Abrioux [Mon, 13 May 2019 14:34:53 +0000 (16:34 +0200)]
facts: set container_binary fact in non-containerized deployment

This is needed for the ceph-dashboard implementation since it requires
running containerized applications which aren't packaged as RPMs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6d9dbb1d3908507763c9f07609d7fe192ec51f5e)

6 years ago dashboard: rename template files
Guillaume Abrioux [Mon, 13 May 2019 14:21:16 +0000 (16:21 +0200)]
dashboard: rename template files

Add .j2 to all template files related to dashboard roles.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3578d576a4c16b9be68c403addf07d3c30c67117)

6 years ago dashboard: Support podman
Boris Ranto [Mon, 8 Apr 2019 13:40:25 +0000 (15:40 +0200)]
dashboard: Support podman

This adds support for podman in dashboard-related roles. It also drops
the creation of custom network for the dashboard-related roles as this
functionality works in a different way with podman.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit b4d1c3693bba386f73c9bc1bedf772d07827ecb1)

6 years ago dashboard: Set ssl_server_port if it is supported
Boris Ranto [Thu, 4 Apr 2019 17:51:16 +0000 (19:51 +0200)]
dashboard: Set ssl_server_port if it is supported

We cannot use the old fashioned config-key way, here. It was not
supported when the option was introduced (post 14.2.0). Since the option
is not always supported we can simply ignore the potential failure on
ceph clusters that do not support it.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit e737a1f83edbdda09663f18c55841befb21ffdfd)

6 years ago dashboard: Add and copy alerting rules
Boris Ranto [Fri, 15 Feb 2019 19:27:15 +0000 (20:27 +0100)]
dashboard: Add and copy alerting rules

This commit adds a list of alerting rules for ceph-dashboard from the
old cephmetrics project. It also installs the configuration file so that
the rules get recognized by the prometheus server.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 8f77caa932f80e03e9f978855d22e8b40d240933)

6 years ago purge-docker-cluster.yml: Default lvm_volumes
Zack Cerza [Fri, 4 Jan 2019 20:26:59 +0000 (13:26 -0700)]
purge-docker-cluster.yml: Default lvm_volumes

We were failing when that variable is unset; purge-cluster.yml contains
this workaround.

Signed-off-by: Zack Cerza <zack@redhat.com>
(cherry picked from commit 9b4339a2baf3f42fbeb0fce76af31a6b6d87c3b6)

6 years ago Merge cephmetrics/dashboard-ansible repo
Boris Ranto [Wed, 5 Dec 2018 18:59:47 +0000 (19:59 +0100)]
Merge cephmetrics/dashboard-ansible repo

This commit will merge dashboard-ansible installation scripts with
ceph-ansible. This includes several new roles to set up ceph-dashboard
and the underlying technologies like prometheus and grafana server.

Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com>
Co-authored-by: Zack Cerza <zcerza@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2f141a6e808766bb6cd406ccc67ba0353b46e780)

6 years ago shrink_osd: mark all osd(s) out in one command v4.0.0rc7
wumingqiao [Wed, 15 May 2019 07:27:21 +0000 (15:27 +0800)]
shrink_osd: mark all osd(s) out in one command

Signed-off-by: wumingqiao <wumingqiao@beyondcent.com>
(cherry picked from commit 5320aa11c4fdd568fe4d123907633696412a080a)
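
The commit has no description, so the following is only a sketch of the idea in its title: build a single `ceph osd out` invocation for all ids instead of one call per OSD. The helper name and the `cluster` default are assumptions, not the playbook's task.

```python
def osd_out_command(osd_ids, cluster="ceph"):
    """Build one argv that marks every given OSD out in a single call."""
    if not osd_ids:
        raise ValueError("at least one OSD id is required")
    return ["ceph", "--cluster", cluster, "osd", "out"] + [str(i) for i in osd_ids]


# One command for three OSDs rather than three separate invocations.
command = osd_out_command([1, 4, 7])
print(" ".join(command))
```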

6 years ago tests: fix a typo in dev_setup.yml
Guillaume Abrioux [Tue, 14 May 2019 12:27:19 +0000 (14:27 +0200)]
tests: fix a typo in dev_setup.yml

c907ec41ae0698b7627ebcbe97f1c293611d41d7 introduced a typo.
This commit fixes it.

```
[WARNING]: While constructing a mapping from /home/guits/ceph-ansible/tests/functional/dev_setup.yml, line 21, column 9, found a duplicate dict key (replace).
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2798774e965a045787ce9d05fd82e63d8329bffb)

6 years ago purge-docker-cluster: remove docker data
Dimitri Savineau [Mon, 13 May 2019 21:03:55 +0000 (17:03 -0400)]
purge-docker-cluster: remove docker data

We never clean the content of /var/lib/docker so we can still have
some data present in this directory after running the purge playbook.
Pip isn't used anymore.
Also update the docker package name (especially the python binding
one).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 168d7cd016a9aa3f771df0e805b5a35f137a9e41)

6 years ago container-common: allow podman for other distros
Dimitri Savineau [Fri, 10 May 2019 19:35:17 +0000 (15:35 -0400)]
container-common: allow podman for other distros

Currently podman installation is tightly tied to RHEL 8, even though we're
able to install it on Debian/Ubuntu distributions.
This patch changes how we decide whether to start the (fat) container
daemon. Previously the condition was based on the distribution release;
now it is based on the container_service_name variable.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit d2ad191eca015ba3a6a66b4bc16f5c19cf7053ed)

6 years ago ceph-nfs: fixed with_items
Bruceforce [Sun, 12 May 2019 11:10:30 +0000 (13:10 +0200)]
ceph-nfs: fixed with_items

If we do this in one line we get the error described in #3968

fixes #3968

Signed-off-by: Bruceforce <markus.greis@gmx.de>
(cherry picked from commit c3b0ee30a1d4d30f4775b149f65ed735a3c79c9a)

6 years ago gather-ceph-logs: fix logs list generation
Dimitri Savineau [Mon, 13 May 2019 14:12:42 +0000 (10:12 -0400)]
gather-ceph-logs: fix logs list generation

The shell module doesn't have a stdout_lines attribute. Instead of
using the shell module, we can use the find module.

Also add `become: false` to the local tmp directory creation,
otherwise we won't have sufficient rights to fetch the files into this
directory.

Resolves: #3966

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ea1f8f551cafd3dcd23435630d231811d8cb0e15)

6 years ago Update RHCS version with Nautilus
Dimitri Savineau [Fri, 10 May 2019 19:28:18 +0000 (15:28 -0400)]
Update RHCS version with Nautilus

RHCS 4 will be based on Nautilus and only usable on RHEL 8.
Updated the default ceph_rhcs_version to 4 and update the rhcs
repositories to rhcs 4 with RHEL 8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ba49225eabadb4fac6cc7cf5eb56a8ffe64ad47c)

6 years ago ceph-nfs: fixed condition for "stable repos specific tasks"
Bruceforce [Sun, 12 May 2019 09:40:05 +0000 (11:40 +0200)]
ceph-nfs: fixed condition for "stable repos specific tasks"

The old condition would resolve to
"when": "nfs_ganesha_stable - ceph_repository == 'community'"

now it is
"when": [
          "nfs_ganesha_stable",
          "ceph_repository == 'community'"
        ]

Please backport to stable-4.0

Signed-off-by: Bruceforce <markus.greis@gmx.de>
(cherry picked from commit 29f2c953b44041d0fe2119d3433b0e8cdcbe6470)

6 years ago Set the rgw_create_pools pools application to rgw
Kevin Coakley [Fri, 10 May 2019 13:32:00 +0000 (06:32 -0700)]
Set the rgw_create_pools pools application to rgw

Set the application to rgw for pools created from rgw_create_pools. On Ceph Nautilus the health is set to HEALTH_WARN with the message "application not enabled on X pool(s)" if an application isn't specified for a pool.

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
(cherry picked from commit 381c58ca3e860ec8f0b3641e76035c55d5e0732f)

6 years ago ceph-mds: group similar tasks in create_mds_filesystem.yml
Rishabh Dave [Wed, 24 Apr 2019 09:08:15 +0000 (14:38 +0530)]
ceph-mds: group similar tasks in create_mds_filesystem.yml

Group similar tasks together using block keyword.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 1a4dccdbb9266cc1e86b3d302aa8b00bfb3cd4e2)

6 years ago ceph-rbd-mirror: refactor tasks/main.yml
Rishabh Dave [Wed, 24 Apr 2019 09:19:04 +0000 (14:49 +0530)]
ceph-rbd-mirror: refactor tasks/main.yml

Use blocks for similar tasks in main.yml, and move `when` keywords before
`block` keywords.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 121b5e4184a24bc5e862dd515cd2678590881c1f)

6 years ago igw: Fix rolling update service ordering
Mike Christie [Thu, 9 May 2019 19:52:08 +0000 (14:52 -0500)]
igw: Fix rolling update service ordering

We must stop tcmu-runner after the other rbd-target-* services
because they may need to interact with tcmu-runner during shutdown.
There is also a bug in some kernels where IO can get stuck in the
kernel and by stopping rbd-target-* first we can make sure all IO is
flushed.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1659611

Signed-off-by: Mike Christie <mchristi@redhat.com>
(cherry picked from commit d7ef12910e7b583fa42f84a7173a87e7c679e79e)

6 years ago tox: Refact lvm_osds scenario v4.0.0rc6
Dimitri Savineau [Wed, 3 Apr 2019 20:22:47 +0000 (16:22 -0400)]
tox: Refact lvm_osds scenario

The current lvm_osds only tests filestore on one OSD node.
We also have bs_lvm_osds to test bluestore and encryption.
Let's use only one scenario to test filestore/bluestore and with or
without dmcrypt on four OSD nodes.
Also use validate_dmcrypt_bool_value instead of types.boolean on
dmcrypt validation via notario.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 52b9f3fb2886d703b25f650221ea973147c68ed6)

6 years ago facts: fix external cluster bug
Guillaume Abrioux [Tue, 7 May 2019 14:42:49 +0000 (16:42 +0200)]
facts: fix external cluster bug

Running an external ceph cluster deployment with (obviously) no
monitors defined in inventory breaks with an undefined error because
`_monitor_addresses` never gets defined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707460
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 936c6fca7851287e03e687c2386bd1f3cb785505)

6 years agodon't access other node's docker_exec_cmd variable
Rishabh Dave [Sun, 28 Apr 2019 16:42:45 +0000 (22:12 +0530)]
don't access other node's docker_exec_cmd variable

Except for some corner cases, it's not correct to access another
node's copy of the docker_exec_cmd variable. Therefore replace
"hostvars[groups[mon_group_name][0]]['docker_exec_cmd']" with
"docker_exec_cmd".

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 89748d579af9e5fb16aa4443198a4de0cf9cd39c)

6 years agoceph-mgr: create keys for MGRs
Rishabh Dave [Thu, 2 May 2019 12:48:00 +0000 (08:48 -0400)]
ceph-mgr: create keys for MGRs

Add code in ceph-mgr to create a keyring for the manager so that
managers can also be deployed on a separate node.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 56bfec7c58407e269f6e6fa7b4c8a5928953dc6f)

6 years agoallow adding a manager to a deployed cluster
Rishabh Dave [Sat, 9 Feb 2019 07:46:12 +0000 (13:16 +0530)]
allow adding a manager to a deployed cluster

Add a playbook that deploys a manager on a new node and adds that node
to the already deployed Ceph cluster.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit d2cfd8b780e78c18148b47a1b512ed996b8ef8b1)

6 years agoallow adding a RGW to already deployed cluster
Rishabh Dave [Sun, 7 Apr 2019 06:36:31 +0000 (02:36 -0400)]
allow adding a RGW to already deployed cluster

Add a tox scenario that adds a new RGW node as part of an already
deployed Ceph cluster and deploys RGW there.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f2012224475dc7b8352ed4b179c1d5f1aac55a50)

Conflicts:
tox.ini
replaced "dev" and "nautilus" during cherry-pick.

6 years agoremove infrastructure-playbooks/rgw-standalone.yml
Rishabh Dave [Tue, 7 May 2019 10:58:36 +0000 (16:28 +0530)]
remove infrastructure-playbooks/rgw-standalone.yml

We don't need infrastructure-playbooks/rgw-standalone.yml since
site.yml.sample and site-container.yml.sample can add a new RGW node to
an already deployed Ceph cluster.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 6e8fb2b3ea87f03ef6389a05c63f541aef51f162)

6 years agoFix check mode support
Gaudenz Steinlin [Mon, 6 May 2019 08:14:36 +0000 (10:14 +0200)]
Fix check mode support

Adds "check_mode: no" to commands which register cluster state in a
variable and don't modify anything. These commands have to run in order
to support running the playbook in check mode.

Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
(cherry picked from commit 3c8987c7a549b63b2e615c6daa4a3a93f5049967)

6 years agoallow adding a RBD mirror to already deployed cluster
Rishabh Dave [Sun, 7 Apr 2019 06:14:05 +0000 (02:14 -0400)]
allow adding a RBD mirror to already deployed cluster

Add a tox scenario that adds a new RBD mirror node as part of an already
deployed Ceph cluster and deploys RBD mirror there.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 221b2b4988a735a3ebe08f227b5fa85352022e05)

Conflicts:
tox.ini
"dev" was replaced by "nautilus" in "envlist"

6 years agoFix comment content
letterwuyu [Sun, 28 Apr 2019 09:56:29 +0000 (17:56 +0800)]
Fix comment content

Signed-off-by: lishuhao <letterwuyu@gmail.com>
(cherry picked from commit d57f6fcdc601e04239c59344c2bab7f05dc1f87f)

6 years agoimprove coding style
Rishabh Dave [Mon, 1 Apr 2019 15:46:15 +0000 (21:16 +0530)]
improve coding style

Keywords that take only one item shouldn't express it as a list with a
single item.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 739a662c8084a80cf5565640556389d2b59c7daa)

Conflicts:
roles/ceph-mon/tasks/ceph_keys.yml
roles/ceph-validate/tasks/check_devices.yml

6 years agoansible: remove private and static attribute v4.0.0rc5
Dimitri Savineau [Thu, 2 May 2019 13:57:19 +0000 (09:57 -0400)]
ansible: remove private and static attribute

These attributes are removed in ansible 2.8, and using them breaks
playbook execution with that release.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit ae266c6f2b9eca06dc1083331d49ea68a4b60562)

6 years agoceph-mds: Increase cpu limit to 4
Dimitri Savineau [Tue, 23 Apr 2019 19:54:38 +0000 (15:54 -0400)]
ceph-mds: Increase cpu limit to 4

In containerized deployments the default mds cpu quota is too low
for production environments.
This causes performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1999cf3d1902456aa123ed3c96116c21e88799bb)

6 years agoceph-osd: Increase cpu limit to 4
Dimitri Savineau [Fri, 5 Apr 2019 13:45:28 +0000 (09:45 -0400)]
ceph-osd: Increase cpu limit to 4

In containerized deployments the default osd cpu quota is too low
for production environments using NVMe devices.
This causes performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c17106874c29f3eafb196a30b97fd1f8fd52e768)

6 years agorolling_update: restart all ceph-iscsi services
Dimitri Savineau [Tue, 23 Apr 2019 18:58:37 +0000 (14:58 -0400)]
rolling_update: restart all ceph-iscsi services

Currently only rbd-target-gw service is restarted during an update.
We also need to restart tcmu-runner and rbd-target-api services
during the ceph iscsi upgrade.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1659611
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit f1048627eaab27563511011fa3cc31b525e2f4c9)

6 years agoceph-iscsi: start tcmu-runner for non-container
Dimitri Savineau [Tue, 23 Apr 2019 14:08:30 +0000 (10:08 -0400)]
ceph-iscsi: start tcmu-runner for non-container

Only rbd-target-api and rbd-target-gw were started/enabled for
non-containerized deployments.
The issue doesn't happen with containerized setups.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 4ae5ce399be740a71699ac55c719b97aa1522df6)

6 years agotests: group and parametrize tests
Dimitri Savineau [Thu, 18 Apr 2019 21:08:13 +0000 (17:08 -0400)]
tests: group and parametrize tests

Instead of creating a dedicated test for each check that uses the same
testinfra module, we can group them into a single test to avoid multiple
ansible connections and testinfra module executions.
This patch also adds the pytest parametrize decorator where possible.
Finally, fix some minor flake issues.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 564ec9c99242a99982f31ea724ddeb36a95bd168)

6 years agoceph-config: remove redundant condition on a block
Rishabh Dave [Wed, 24 Apr 2019 12:06:56 +0000 (17:36 +0530)]
ceph-config: remove redundant condition on a block

Signed-off-by: Rishabh Dave <ridave@redhat.com>

6 years agotox: Remove update scenario reference
Dimitri Savineau [Tue, 23 Apr 2019 20:33:46 +0000 (16:33 -0400)]
tox: Remove update scenario reference

The update scenario is now handled by the tox-update.ini file, so we
shouldn't have any update reference in the tox.ini file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8ab6a3391f0c51db2bcba6d9afc35a6a442c0e84)

6 years agoUpdate group_vars according to defaults
Dimitri Savineau [Tue, 23 Apr 2019 20:19:00 +0000 (16:19 -0400)]
Update group_vars according to defaults

b2f2426 didn't use the generate_group_vars_sample.sh script so we
currently have a difference between the content in group_vars and the
ceph-defaults/defaults directories.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1eeddc394d27895509e6e5e5afabe71e6993a006)

6 years ago"when" keyword should precede "block" keyword
Rishabh Dave [Thu, 28 Mar 2019 08:13:30 +0000 (13:43 +0530)]
"when" keyword should precede "block" keyword

Otherwise the reader is forced to search for "when" when blocks are too
long.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit e0beaf123ac8c4de8212377c24c3e11c8a774b89)

Conflicts:
roles/ceph-config/tasks/main.yml
roles/ceph-container-common/tasks/pre_requisites/prerequisites.yml
roles/ceph-validate/tasks/check_devices.yml

6 years agovalidate: fix a typo
Guillaume Abrioux [Tue, 23 Apr 2019 14:04:27 +0000 (16:04 +0200)]
validate: fix a typo

5aa27794615e7d4521b1dbf1444b61388aacb852 introduced a typo.
This commit fixes it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit d6e28ffd277b0052fa5d8e247b0088630b1fb383)

6 years agovalidate: fix notario error
Guillaume Abrioux [Tue, 23 Apr 2019 13:19:26 +0000 (15:19 +0200)]
validate: fix notario error

Typical error:

```
AttributeError: 'Invalid' object has no attribute 'message'
```

As of python 2.6, `BaseException.message` has been deprecated.
When using python3, it fails because it has been removed.

Let's use `str(error)` instead so we don't hit this error when using
python3.
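
A minimal sketch of the difference, assuming a stand-in exception class
(the Invalid class below is hypothetical and is not notario's real
implementation):

```python
class Invalid(Exception):
    """Stand-in for a validation error; not notario's actual class."""


def describe(error):
    # str() works on any exception in both python2 and python3,
    # whereas `error.message` raises AttributeError on python3.
    return str(error)


err = Invalid("dmcrypt must be a boolean")
has_message = hasattr(err, "message")  # False on python3
print(describe(err))
```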

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2326180bf98ef311ebddb8c6e18d83ae819cd29f)

6 years agorgw: add cpuset support v4.0.0rc4
Kyle Bader [Thu, 21 Mar 2019 18:54:34 +0000 (11:54 -0700)]
rgw: add cpuset support

1/ The OSD already supports cpuset for containerized deployments through
the ceph_osd_docker_cpuset_cpus variable. This adds similar support to
the RGW service for containerized deployments through a new variable
named ceph_rgw_docker_cpuset_cpus. As with the OSD, there are times
where using distinct cores has advantages over relying on the in-kernel
CFS scheduler.

ceph_rgw_docker_cpuset_cpus accepts a comma-delimited set of CPU ids.

2/ Add support for specifying the --cpuset-mems option to restrict the
cgroup's memory allocations to a particular NUMA node, which should
typically correspond to the cpu ids of that NUMA node that were provided
with --cpuset-cpus. To ensure the correct cpu ids are used, one can run
`numactl --hardware` to list the nodes and which cpu ids correspond to
each.
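
As a rough illustration of how two such comma-delimited values map onto
container runtime arguments (the helper name cpuset_args is hypothetical,
and this is not the role's actual template; docker spells the memory-node
flag `--cpuset-mems`):

```python
def cpuset_args(cpus="", mems=""):
    """Build container CLI arguments from comma-delimited id strings.

    Empty strings mean "no restriction", so no flag is emitted.
    """
    args = []
    if cpus:
        args.append("--cpuset-cpus=" + cpus)
    if mems:
        args.append("--cpuset-mems=" + mems)
    return args


# Example: pin to three cores that belong to NUMA node 0.
print(" ".join(cpuset_args(cpus="0,2,4", mems="0")))
```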

Signed-off-by: Kyle Bader <kbader@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0bee90b20195f0764c3d9001fb06d0999b5fc6cf)

6 years agoAllow CephFS pool to be created with specific rule_name, erasure_profile just like...
Radu Toader [Thu, 18 Apr 2019 19:12:55 +0000 (22:12 +0300)]
Allow CephFS pool to be created with specific rule_name, erasure_profile just like rbd pools

Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
(cherry picked from commit b2f242660edc63eece5f2dd2eaefb187c4b425a2)

6 years agoceph-container-common: modify requirement flow
Dimitri Savineau [Tue, 16 Apr 2019 13:33:02 +0000 (09:33 -0400)]
ceph-container-common: modify requirement flow

Until now it was not possible to install a specific container package
because the package name was hardcoded.
This patch allows overriding the container package name (docker.io
vs docker-ce) via the container_package_name variable and refactors
the package installation.
Instead of using one task per distribution, we can set the package and
service names in vars. This allows a single unified package task.
Also refactor the debian_prerequisites tasks because their content
was outdated.

https://docs.docker.com/install/linux/docker-ce/debian/
https://docs.docker.com/install/linux/docker-ce/ubuntu/

Resolves: #3609

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8105a1cefb065b8a519de0aad1c89c8c887ee2a4)

6 years agorolling_update: ceph commands should use --cluster
Andrew Schoen [Thu, 28 Mar 2019 21:05:09 +0000 (16:05 -0500)]
rolling_update: ceph commands should use --cluster

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit e2529dcd7f92176cb69e31c88e5fce7c1058044e)

6 years agorolling_update: set num_osds to the number of running osds
Andrew Schoen [Thu, 28 Mar 2019 19:34:48 +0000 (14:34 -0500)]
rolling_update: set num_osds to the number of running osds

We do this so that the ceph-config role can most accurately
report the number of osds for the generation of the ceph.conf
file.

We don't want to use ceph-volume to determine the number of
osds because in an upgrade to nautilus ceph-volume won't be able to
accurately count osds created by ceph-disk.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 67453853ff559f1c19bc969ca3116c9c13f877bb)

6 years agoceph-osd: do not run lvm batch tasks during update
Andrew Schoen [Thu, 28 Mar 2019 19:02:54 +0000 (14:02 -0500)]
ceph-osd: do not run lvm batch tasks during update

When performing a rolling update do not try to create
any new osds with `ceph-volume lvm batch`. This is troublesome
because when upgrading to nautilus the devices list might contain
devices that are currently being used by ceph-disk and have GPT
headers on them, which will cause ceph-volume to fail when
trying to use such a device. Any devices originally created
by ceph-disk will need to be removed from the devices list
before any new osds can be created.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 5e3dfe50210f6b5e49f6c15160086a0dd84f9e6b)

6 years agotests: adds the migrate_ceph_disk_to_ceph_volume scenario
Andrew Schoen [Wed, 27 Mar 2019 19:36:51 +0000 (14:36 -0500)]
tests: adds the migrate_ceph_disk_to_ceph_volume scenario

This test deploys a luminous cluster with ceph-disk created osds
and then upgrades to nautilus and migrates those osds to ceph-volume.
The nodes are then rebooted and cluster state verified.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 399a821439f8f0997dc9b66e97aacd61ea870ea3)

6 years agorolling_update: migrate ceph-disk osds to ceph-volume
Andrew Schoen [Tue, 19 Mar 2019 20:08:32 +0000 (15:08 -0500)]
rolling_update: migrate ceph-disk osds to ceph-volume

When upgrading to nautilus, run ``ceph-volume simple scan`` and
``ceph-volume simple activate --all`` to migrate any running
ceph-disk osds to ceph-volume.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1656460
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 28c47e4d1bb22b0c89f67b315a732b624f650f4b)

6 years agoceph-mgr: Add extra module packages
Dimitri Savineau [Mon, 15 Apr 2019 16:15:49 +0000 (12:15 -0400)]
ceph-mgr: Add extra module packages

Since Nautilus, some extra mgr modules are no longer shipped in the
ceph-mgr package but in dedicated packages.

Resolves: #3860

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 86315272c75f62dd9fa07eec91b9e6f05f54c284)

6 years agoupdate: ensure tasks are executed on an upgraded mon
Guillaume Abrioux [Wed, 17 Apr 2019 12:02:06 +0000 (14:02 +0200)]
update: ensure tasks are executed on an upgraded mon

These tasks must be run from a monitor that has already been upgraded,
otherwise they might fail.
See: https://tracker.ceph.com/issues/39355

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7eb42c9e8ec9f28b81eb1f1348c20749c4c8adc2)

6 years agoupdate: ensure ceph command returns 0
Guillaume Abrioux [Wed, 17 Apr 2019 11:57:29 +0000 (13:57 +0200)]
update: ensure ceph command returns 0

These commands could return something other than 0.
Let's ensure all retries have been done before actually failing.
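
The retry-until-success idea can be sketched in Python as follows. This is
only an illustration, not the playbook's tasks; the function name
run_with_retries and its parameters are hypothetical, and the runner
argument exists so the loop can be exercised without a cluster.

```python
import subprocess
import time


def run_with_retries(cmd, retries=5, delay=2.0, runner=subprocess.call):
    """Run cmd until it exits with 0 or all attempts are used up."""
    rc = None
    for attempt in range(1, retries + 1):
        rc = runner(cmd)
        if rc == 0:
            return rc
        if attempt < retries:
            time.sleep(delay)  # wait before the next attempt
    # Fail only after every retry has been consumed.
    raise RuntimeError(
        "%r still failing after %d attempts (rc=%s)" % (cmd, retries, rc)
    )
```

A call such as run_with_retries(["ceph", "--cluster", "ceph", "osd", "set",
"noout"]) would then fail only once every attempt has returned a non-zero
code.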

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ed84325b1d97cc7faf0f478a68d8bc2b1edcaaf6)

6 years agoupdate: set osd flags before upgrading any mon
Guillaume Abrioux [Wed, 17 Apr 2019 06:47:25 +0000 (08:47 +0200)]
update: set osd flags before upgrading any mon

Typical error:

```
failed: [mon0 -> mon2] (item=noout) => changed=true
  cmd:
  - ceph
  - --cluster
  - ceph
  - osd
  - set
  - noout
  delta: '0:00:00.293756'
  end: '2019-04-17 06:31:57.552386'
  item: noout
  msg: non-zero return code
  rc: 1
  start: '2019-04-17 06:31:57.258630'
  stderr: |-
    Traceback (most recent call last):
      File "/bin/ceph", line 1222, in <module>
        retval = main()
      File "/bin/ceph", line 1146, in main
        sigdict = parse_json_funcsigs(outbuf.decode('utf-8'), 'cli')
      File "/usr/lib/python2.7/site-packages/ceph_argparse.py", line 788, in parse_json_funcsigs
        cmd['sig'] = parse_funcsig(cmd['sig'])
      File "/usr/lib/python2.7/site-packages/ceph_argparse.py", line 728, in parse_funcsig
        raise JsonFormat(s)
    ceph_argparse.JsonFormat: unknown type CephBool
  stderr_lines:
  - 'Traceback (most recent call last):'
  - '  File "/bin/ceph", line 1222, in <module>'
  - '    retval = main()'
  - '  File "/bin/ceph", line 1146, in main'
  - '    sigdict = parse_json_funcsigs(outbuf.decode(''utf-8''), ''cli'')'
  - '  File "/usr/lib/python2.7/site-packages/ceph_argparse.py", line 788, in parse_json_funcsigs'
  - '    cmd[''sig''] = parse_funcsig(cmd[''sig''])'
  - '  File "/usr/lib/python2.7/site-packages/ceph_argparse.py", line 728, in parse_funcsig'
  - '    raise JsonFormat(s)'
  - 'ceph_argparse.JsonFormat: unknown type CephBool'
  stdout: ''
  stdout_lines: <omitted>
```

Having mixed versions of monitors seems to cause this error.
Moving these tasks before any monitor gets upgraded seems to be enough
to get around this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 543d1e2e41c00bccf611c7e240ba0d8f4eda5d95)

6 years agoupdate: refact msgr2 migration
Guillaume Abrioux [Tue, 16 Apr 2019 08:31:44 +0000 (10:31 +0200)]
update: refact msgr2 migration

This commit refactors the introduction of the msgr2 protocol.

If it's a fresh install, let's go with v2 only.
If we upgrade to nautilus, we should go with v2+v1 syntax to ensure
nothing breaks.
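
A small sketch of the two address forms, assuming the default msgr2 and
msgr1 ports (3300 and 6789). The helper name mon_addr is hypothetical and
the exact string that ceph-ansible renders into ceph.conf may differ:

```python
def mon_addr(ip, upgrading, v2_port=3300, v1_port=6789):
    """Format one monitor address in ceph's bracketed addrvec syntax.

    Fresh installs get v2 only; upgrades keep a v1 address as well so
    that clients still speaking the old protocol can connect.
    """
    v2 = "v2:%s:%d" % (ip, v2_port)
    if not upgrading:
        return "[%s]" % v2
    return "[%s,v1:%s:%d]" % (v2, ip, v1_port)


print(mon_addr("192.168.0.10", upgrading=False))
print(mon_addr("192.168.0.10", upgrading=True))
```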

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a4bc7bda5148a94e917d228037447a138bd08cab)

6 years agoceph-iscsi-gw: Remove library directory
Dimitri Savineau [Wed, 17 Apr 2019 15:37:03 +0000 (11:37 -0400)]
ceph-iscsi-gw: Remove library directory

The library directory that contains the custom ceph modules is present
in the ceph-ansible root directory.
All igw_* modules are already present there so we don't need the ones
in roles/ceph-iscsi-gw/library.
Also remove the associated spec file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit c8814d13310c67665da6d39058807f8b3e089d8d)

6 years agomds: remove legacy task
Guillaume Abrioux [Thu, 18 Apr 2019 08:44:41 +0000 (10:44 +0200)]
mds: remove legacy task

This task has no purpose in stable-4.0 and later.
Let's remove it since stable-4.0 and later aren't intended to deploy
luminous.

Closes: #3873
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 58f38515730730771f363c3b79acab14d1093b6d)