git.apps.os.sepia.ceph.com Git - ceph-ansible.git/log
ceph-ansible.git
6 years agoceph-nfs: apply selinux fix anyway
Dimitri Savineau [Thu, 18 Apr 2019 14:02:12 +0000 (10:02 -0400)]
ceph-nfs: apply selinux fix anyway

Because ansible_distribution_version doesn't return the minor version
on CentOS with ansible 2.8, we can apply the selinux fix anyway, but
only for CentOS/RHEL 7.
Starting with RHEL 8, there's a dedicated package for selinux called
nfs-ganesha-selinux [1].
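
The version condition described above can be sketched in Python. This
is only an illustration of the logic, not the role's actual task: the
function name and the "RedHat" family string are assumptions made for
the example.

```python
def needs_selinux_fix(os_family, distribution_version):
    """Return True for CentOS/RHEL 7, whatever the minor version is.

    Hypothetical helper: only the major version is compared because the
    version fact may be "7" (major only) or "7.6" (major and minor).
    """
    major = str(distribution_version).split(".")[0]
    return os_family == "RedHat" and major == "7"
```

Comparing only the major version gives the same answer whether the
fact is reported as "7" or as "7.6".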

Also replace the command module + semanage with the selinux_permissive
module.

[1] https://github.com/nfs-ganesha/nfs-ganesha/commit/a7911f

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoceph-validate: use kernel validation for iscsi
Dimitri Savineau [Thu, 18 Apr 2019 13:37:07 +0000 (09:37 -0400)]
ceph-validate: use kernel validation for iscsi

Ceph iSCSI gateway requires Red Hat Enterprise Linux or CentOS 7.5
or later.
Because we cannot check the ansible_distribution_version fact for
CentOS with ansible 2.8 (it returns only the major version), we can
fall back to checking the kernel options:

  - CONFIG_TARGET_CORE=m
  - CONFIG_TCM_USER2=m
  - CONFIG_ISCSI_TARGET=m
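
A minimal sketch of such a kernel option check, written in Python for
illustration. It is not the ceph-validate implementation: the function
name is hypothetical, and it assumes the kernel configuration is
available as text (for example the content of a /boot/config-* file).

```python
REQUIRED_OPTIONS = (
    "CONFIG_TARGET_CORE",
    "CONFIG_TCM_USER2",
    "CONFIG_ISCSI_TARGET",
)


def missing_iscsi_options(kernel_config_text, required=REQUIRED_OPTIONS):
    """Return the required options that are not built as modules (=m)."""
    built_as_module = set()
    for line in kernel_config_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments such as "# CONFIG_X is not set".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        if value == "m":
            built_as_module.add(key)
    return [opt for opt in required if opt not in built_as_module]
```

An empty result means all three options are present; anything else
lists what the validation should report.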

http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoswitch to ansible 2.8
Guillaume Abrioux [Tue, 9 Apr 2019 07:22:06 +0000 (09:22 +0200)]
switch to ansible 2.8

- remove private attribute with import_role.
- update documentation.
- update rpm spec requirement.
- fix MagicMock python import in unit tests.

Closes: #3765
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agocommon: install dependencies for apt modules
Dimitri Savineau [Fri, 17 May 2019 14:31:46 +0000 (10:31 -0400)]
common: install dependencies for apt modules

When using a minimal Debian/Ubuntu distribution there's no
ca-certificates and gpg packages installed so the apt modules will
fail:

Failed to find required executable gpg in paths:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

apt.cache.FetchFailedException:
W:https://download.ceph.com/debian-luminous/dists/bionic/InRelease:
No system certificates available. Try installing ca-certificates.

Resolves: #3994

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agodashboard: move the call to ceph-node-exporter
Guillaume Abrioux [Fri, 17 May 2019 15:34:09 +0000 (17:34 +0200)]
dashboard: move the call to ceph-node-exporter

This moves the call to ceph-node-exporter role after
ceph-container-common, otherwise it will try to run container before
docker or podman are installed.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agotox: Don't copy infrastructure playbook
Dimitri Savineau [Tue, 23 Apr 2019 14:40:09 +0000 (10:40 -0400)]
tox: Don't copy infrastructure playbook

Since a1a871c we don't need to copy the infrastructure playbooks
under the ceph-ansible root directory.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agopurge-docker-cluster: don't remove data on atomic
Dimitri Savineau [Thu, 16 May 2019 14:00:58 +0000 (10:00 -0400)]
purge-docker-cluster: don't remove data on atomic

Because we don't manage the docker service on atomic (yet) via the
ceph-container-common role, we can't stop docker and remove
the data.
For now let's do that only for non-atomic hosts.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agodashboard: move defaults variables to ceph-defaults
Guillaume Abrioux [Thu, 16 May 2019 13:58:20 +0000 (15:58 +0200)]
dashboard: move defaults variables to ceph-defaults

There is no need to have default values for these variables in each role
since there are no corresponding host groups.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agorename docker_exec_cmd variable
Guillaume Abrioux [Tue, 14 May 2019 12:51:32 +0000 (14:51 +0200)]
rename docker_exec_cmd variable

This commit renames the `docker_exec_cmd` variable to
`container_exec_cmd` so it's more generic.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: fix a typo
Guillaume Abrioux [Thu, 16 May 2019 12:36:53 +0000 (14:36 +0200)]
dashboard: fix a typo

6f0643c8e introduced a typo, the role that should be run is
ceph-container-common, not ceph-common

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agotests: add dashboard scenario testing
Guillaume Abrioux [Thu, 16 May 2019 09:19:11 +0000 (11:19 +0200)]
tests: add dashboard scenario testing

This commit adds a new scenario to test the dashboard deployment via
ceph-ansible.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: align the way containers are managed
Guillaume Abrioux [Thu, 16 May 2019 08:56:06 +0000 (10:56 +0200)]
dashboard: align the way containers are managed

This commit aligns the way the different containers are managed with how
it's currently done for the other ceph daemons.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: convert dashboard_rgw_api_no_ssl_verify to a bool
Guillaume Abrioux [Wed, 15 May 2019 14:16:55 +0000 (16:16 +0200)]
dashboard: convert dashboard_rgw_api_no_ssl_verify to a bool

make `dashboard_rgw_api_no_ssl_verify` a bool variable since it seems to
be used as one.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: generate group_vars sample files
Guillaume Abrioux [Wed, 15 May 2019 14:15:48 +0000 (16:15 +0200)]
dashboard: generate group_vars sample files

generate all group_vars sample files corresponding to new roles added
for ceph-dashboard implementation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: remove legacy file
Guillaume Abrioux [Wed, 15 May 2019 13:00:26 +0000 (15:00 +0200)]
dashboard: remove legacy file

this file seems to be no longer used, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: set less permissive permissions on dashboard certificate/key
Guillaume Abrioux [Wed, 15 May 2019 12:38:46 +0000 (14:38 +0200)]
dashboard: set less permissive permissions on dashboard certificate/key

using `0440` instead of `0644` is enough
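
To illustrate the difference between the two modes, here is a small
Python sketch. The helper names are hypothetical and the code is not
part of the role; it only shows what `0440` removes compared to `0644`.

```python
import os
import stat


def restrict_to_owner_and_group(path):
    """Set mode 0440 (read-only for owner and group) and return the mode."""
    os.chmod(path, 0o440)
    return stat.S_IMODE(os.stat(path).st_mode)


def world_readable(mode):
    """True when 'others' have the read bit set, as is the case with 0644."""
    return bool(mode & stat.S_IROTH)
```

With `0440` the certificate and key can no longer be read by users
outside the owning user and group.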

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: simplify config-key command
Guillaume Abrioux [Wed, 15 May 2019 12:35:24 +0000 (14:35 +0200)]
dashboard: simplify config-key command

since stable-4.0 isn't intended to deploy ceph releases prior to
nautilus, there's no need to add this complexity here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoplaybook: use blocks for grafana-server section
Guillaume Abrioux [Wed, 15 May 2019 12:11:00 +0000 (14:11 +0200)]
playbook: use blocks for grafana-server section

use a block in grafana-server section to avoid duplicate condition.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: do not call ceph-container-common from other role
Guillaume Abrioux [Tue, 14 May 2019 14:34:50 +0000 (16:34 +0200)]
dashboard: do not call ceph-container-common from other role

use site.yml to deploy ceph-container-common in order to install docker
even in non-containerized deployments since there's no RPM available to
deploy the different applications needed for ceph-dashboard.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: use existing variable to detect containerized deployment
Guillaume Abrioux [Tue, 14 May 2019 12:46:25 +0000 (14:46 +0200)]
dashboard: use existing variable to detect containerized deployment

there is no need to add more complexity for this, let's use
`containerized_deployment` in order to detect if we are running a
containerized deployment.
The idea is to use `container_exec_cmd` the same way we do in the rest of
the playbook to run the different ceph commands needed to deploy the
ceph-dashboard role.
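
The idea can be sketched in Python as follows. This is an illustration
only: the function, its parameters and the container name are
hypothetical, and the real playbook expresses this with Ansible
variables rather than Python.

```python
def ceph_command(args, containerized_deployment,
                 container_binary="docker",
                 container_name="ceph-mon-mon0"):
    """Prefix a ceph command with '<binary> exec <container>' if needed.

    container_name is a made-up example value, not a name taken from
    the playbook.
    """
    prefix = []
    if containerized_deployment:
        prefix = [container_binary, "exec", container_name]
    return prefix + ["ceph"] + list(args)
```

The same ceph arguments are used in both cases; only the prefix
changes with the deployment type.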

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agofacts: set container_binary fact in non-containerized deployment
Guillaume Abrioux [Mon, 13 May 2019 14:34:53 +0000 (16:34 +0200)]
facts: set container_binary fact in non-containerized deployment

This is needed for the ceph-dashboard implementation since it requires
running containerized applications which aren't packaged as RPMs.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: rename template files
Guillaume Abrioux [Mon, 13 May 2019 14:21:16 +0000 (16:21 +0200)]
dashboard: rename template files

add .j2 to all template files related to dashboard roles.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodashboard: Support podman
Boris Ranto [Mon, 8 Apr 2019 13:40:25 +0000 (15:40 +0200)]
dashboard: Support podman

This adds support for podman in dashboard-related roles. It also drops
the creation of custom network for the dashboard-related roles as this
functionality works in a different way with podman.

Signed-off-by: Boris Ranto <branto@redhat.com>
6 years agodashboard: Set ssl_server_port if it is supported
Boris Ranto [Thu, 4 Apr 2019 17:51:16 +0000 (19:51 +0200)]
dashboard: Set ssl_server_port if it is supported

We cannot use the old fashioned config-key way, here. It was not
supported when the option was introduced (post 14.2.0). Since the option
is not always supported we can simply ignore the potential failure on
ceph clusters that do not support it.

Signed-off-by: Boris Ranto <branto@redhat.com>
6 years agodashboard: Add and copy alerting rules
Boris Ranto [Fri, 15 Feb 2019 19:27:15 +0000 (20:27 +0100)]
dashboard: Add and copy alerting rules

This commit adds a list of alerting rules for ceph-dashboard from the
old cephmetrics project. It also installs the configuration file so that
the rules get recognized by the prometheus server.

Signed-off-by: Boris Ranto <branto@redhat.com>
6 years agopurge-docker-cluster.yml: Default lvm_volumes
Zack Cerza [Fri, 4 Jan 2019 20:26:59 +0000 (13:26 -0700)]
purge-docker-cluster.yml: Default lvm_volumes

We were failing when that variable is unset; purge-cluster.yml contains
this workaround.

Signed-off-by: Zack Cerza <zack@redhat.com>
6 years agoMerge cephmetrics/dashboard-ansible repo
Boris Ranto [Wed, 5 Dec 2018 18:59:47 +0000 (19:59 +0100)]
Merge cephmetrics/dashboard-ansible repo

This commit will merge dashboard-ansible installation scripts with
ceph-ansible. This includes several new roles to setup ceph-dashboard
and the underlying technologies like prometheus and grafana server.

Signed-off-by: Boris Ranto & Zack Cerza <team-gmeno@redhat.com>
Co-authored-by: Zack Cerza <zcerza@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoshrink_osd: mark all osd(s) out in one command
wumingqiao [Wed, 15 May 2019 07:27:21 +0000 (15:27 +0800)]
shrink_osd: mark all osd(s) out in one command

Signed-off-by: wumingqiao <wumingqiao@beyondcent.com>
6 years agotests: fix a typo in dev_setup.yml
Guillaume Abrioux [Tue, 14 May 2019 12:27:19 +0000 (14:27 +0200)]
tests: fix a typo in dev_setup.yml

c907ec41ae0698b7627ebcbe97f1c293611d41d7 introduced a typo.
This commit fixes it.

```
[WARNING]: While constructing a mapping from /home/guits/ceph-ansible/tests/functional/dev_setup.yml, line 21, column 9, found a duplicate dict key (replace).
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agopurge-docker-cluster: remove docker data
Dimitri Savineau [Mon, 13 May 2019 21:03:55 +0000 (17:03 -0400)]
purge-docker-cluster: remove docker data

We never clean the content of /var/lib/docker so we can still have
some data present in this directory after running the purge playbook.
Pip isn't used anymore.
Also update the docker package names (especially the python binding
one).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agocontainer-common: allow podman for other distros
Dimitri Savineau [Fri, 10 May 2019 19:35:17 +0000 (15:35 -0400)]
container-common: allow podman for other distros

Currently podman installation is tightly tied to RHEL 8 even though
we're able to install it on Debian/Ubuntu distributions.
This patch changes how we decide whether to start the (fat) container
daemon. Previously the condition was based on the distribution
release; it is now based on the container_service_name variable.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoceph-nfs: fixed with_items
Bruceforce [Sun, 12 May 2019 11:10:30 +0000 (13:10 +0200)]
ceph-nfs: fixed with_items

If we do this in one line we get the error described in #3968

fixes #3968

Signed-off-by: Bruceforce <markus.greis@gmx.de>
6 years agogather-ceph-logs: fix logs list generation
Dimitri Savineau [Mon, 13 May 2019 14:12:42 +0000 (10:12 -0400)]
gather-ceph-logs: fix logs list generation

The shell module doesn't have a stdout_lines attribute. Instead of
using the shell module, we can use the find module.

Also add `become: false` to the local tmp directory creation,
otherwise we won't have enough rights to fetch the files into this
directory.

Resolves: #3966

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoceph-nfs: fixed condition for "stable repos specific tasks"
Bruceforce [Sun, 12 May 2019 09:40:05 +0000 (11:40 +0200)]
ceph-nfs: fixed condition for "stable repos specific tasks"

The old condition would resolve to
"when": "nfs_ganesha_stable - ceph_repository == 'community'"

now it is
"when": [
          "nfs_ganesha_stable",
          "ceph_repository == 'community'"
        ]

Please backport to stable-4.0

Signed-off-by: Bruceforce <markus.greis@gmx.de>
6 years agoUpdate RHCS version with Nautilus
Dimitri Savineau [Fri, 10 May 2019 19:28:18 +0000 (15:28 -0400)]
Update RHCS version with Nautilus

RHCS 4 will be based on Nautilus and only usable on RHEL 8.
Updated the default ceph_rhcs_version to 4 and update the rhcs
repositories to rhcs 4 with RHEL 8.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoSet the rgw_create_pools pools application to rgw
Kevin Coakley [Fri, 10 May 2019 13:32:00 +0000 (06:32 -0700)]
Set the rgw_create_pools pools application to rgw

Set the application to rgw for pools created from rgw_create_pools. On Ceph Nautilus the health is set to HEALTH_WARN with the message "application not enabled on X pool(s)" if an application isn't specified for a pool.
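
As an illustration, the commands involved can be generated with a few
lines of Python. This is a sketch, not the role's task: the helper name
is hypothetical, and it assumes rgw_create_pools is a mapping keyed by
pool name. `ceph osd pool application enable` is the existing ceph CLI
command for tagging a pool.

```python
def rgw_pool_application_cmds(rgw_create_pools, cluster="ceph"):
    """Build one 'application enable' command per pool in the mapping."""
    cmds = []
    for pool_name in rgw_create_pools:
        cmds.append([
            "ceph", "--cluster", cluster,
            "osd", "pool", "application", "enable", pool_name, "rgw",
        ])
    return cmds
```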

Signed-off-by: Kevin Coakley <kcoakley@sdsc.edu>
6 years agoigw: Fix rolling update service ordering
Mike Christie [Thu, 9 May 2019 19:52:08 +0000 (14:52 -0500)]
igw: Fix rolling update service ordering

We must stop tcmu-runner after the other rbd-target-* services
because they may need to interact with tcmu-runner during shutdown.
There is also a bug in some kernels where IO can get stuck in the
kernel and by stopping rbd-target-* first we can make sure all IO is
flushed.
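
A small Python sketch of the resulting stop order. The helper is
hypothetical, and the relative order of rbd-target-api and
rbd-target-gw is an assumption; the only constraint stated above is
that tcmu-runner is stopped last.

```python
# tcmu-runner must come last; the order of the first two is assumed.
STOP_ORDER = ("rbd-target-api", "rbd-target-gw", "tcmu-runner")


def stop_sequence(running_services):
    """Return the running iSCSI gateway services in a safe stop order."""
    running = set(running_services)
    return [name for name in STOP_ORDER if name in running]
```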

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1659611

Signed-off-by: Mike Christie <mchristi@redhat.com>
6 years agoceph-rbd-mirror: refactor tasks/main.yml
Rishabh Dave [Wed, 24 Apr 2019 09:19:04 +0000 (14:49 +0530)]
ceph-rbd-mirror: refactor tasks/main.yml

Use blocks for similar tasks in main.yml. And move when keywords before
block keywords.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agoceph-mds: group similar tasks in create_mds_filesystem.yml
Rishabh Dave [Wed, 24 Apr 2019 09:08:15 +0000 (14:38 +0530)]
ceph-mds: group similar tasks in create_mds_filesystem.yml

Group similar tasks together using block keyword.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agotox: Refact lvm_osds scenario
Dimitri Savineau [Wed, 3 Apr 2019 20:22:47 +0000 (16:22 -0400)]
tox: Refact lvm_osds scenario

The current lvm_osds only tests filestore on one OSD node.
We also have bs_lvm_osds to test bluestore and encryption.
Let's use only one scenario to test filestore/bluestore and with or
without dmcrypt on four OSD nodes.
Also use validate_dmcrypt_bool_value instead of types.boolean on
dmcrypt validation via notario.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agofacts: fix external cluster bug
Guillaume Abrioux [Tue, 7 May 2019 14:42:49 +0000 (16:42 +0200)]
facts: fix external cluster bug

running an external ceph cluster deployment with (obviously) no
monitors defined in inventory breaks with an undefined error because
`_monitor_addresses` never gets defined.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1707460
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoceph-mgr: create keys for MGRs
Rishabh Dave [Thu, 2 May 2019 12:48:00 +0000 (08:48 -0400)]
ceph-mgr: create keys for MGRs

Add code in ceph-mgr for creating a keyring for the manager so that
managers can be deployed on a separate node too.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agoallow adding a manager to a deployed cluster
Rishabh Dave [Sat, 9 Feb 2019 07:46:12 +0000 (13:16 +0530)]
allow adding a manager to a deployed cluster

Add a playbook that deploys manager on a new node and adds that node to
the already deployed Ceph cluster.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agoremove infrastructure-playbooks/rgw-standalone.yml
Rishabh Dave [Tue, 7 May 2019 10:58:36 +0000 (16:28 +0530)]
remove infrastructure-playbooks/rgw-standalone.yml

We don't need infrastructure-playbooks/rgw-standalone.yml since
site.yml.sample and site-container.yml.sample can add a new RGW node to
an already deployed Ceph cluster.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agodon't access other node's docker_exec_cmd variable
Rishabh Dave [Sun, 28 Apr 2019 16:42:45 +0000 (22:12 +0530)]
don't access other node's docker_exec_cmd variable

Except for some corner cases, it's not correct to access some other
node's copy of the variable docker_exec_cmd. Therefore replace
"hostvars[groups[mon_group_name][0]]['docker_exec_cmd']" by
"docker_exec_cmd".

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agoallow adding a RGW to already deployed cluster
Rishabh Dave [Sun, 7 Apr 2019 06:36:31 +0000 (02:36 -0400)]
allow adding a RGW to already deployed cluster

Add a tox scenario that adds a new RGW node as a part of already
deployed Ceph cluster and deploys RGW there.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agoFix comment content
letterwuyu [Sun, 28 Apr 2019 09:56:29 +0000 (17:56 +0800)]
Fix comment content

Signed-off-by: lishuhao letterwuyu@gmail.com
6 years agoFix check mode support
Gaudenz Steinlin [Mon, 6 May 2019 08:14:36 +0000 (10:14 +0200)]
Fix check mode support

Adds "check_mode: no" to commands which register cluster state in a
variable and don't modify anything. These commands have to run in order
to support running the playbook in check mode.

Signed-off-by: Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch>
6 years agoallow adding a RBD mirror to already deployed cluster
Rishabh Dave [Sun, 7 Apr 2019 06:14:05 +0000 (02:14 -0400)]
allow adding a RBD mirror to already deployed cluster

Add a tox scenario that adds a new RBD mirror node as a part of already
deployed Ceph cluster and deploys RBD mirror there.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431
Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agoansible: remove private and static attribute
Dimitri Savineau [Thu, 2 May 2019 13:57:19 +0000 (09:57 -0400)]
ansible: remove private and static attribute

This will be removed in ansible 2.8 and breaks the playbook execution
with this release.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoceph-mds: Increase cpu limit to 4
Dimitri Savineau [Tue, 23 Apr 2019 19:54:38 +0000 (15:54 -0400)]
ceph-mds: Increase cpu limit to 4

In containerized deployment the default mds cpu quota is too low
for production environment.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695850
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoceph-osd: Increase cpu limit to 4
Dimitri Savineau [Fri, 5 Apr 2019 13:45:28 +0000 (09:45 -0400)]
ceph-osd: Increase cpu limit to 4

In containerized deployment the default osd cpu quota is too low
for production environment using NVMe devices.
This is causing performance degradation compared to bare-metal.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1695880
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agovalidate: check custom repository config options
Jugwan Eom [Mon, 21 Jan 2019 08:08:39 +0000 (08:08 +0000)]
validate: check custom repository config options

This adds missing configuration options when the 'custom'
repository is used.

Signed-off-by: Jugwan Eom <zugwan@gmail.com>
6 years agoceph-iscsi: start tcmu-runner for non-container
Dimitri Savineau [Tue, 23 Apr 2019 14:08:30 +0000 (10:08 -0400)]
ceph-iscsi: start tcmu-runner for non-container

Only rbd-target-api and rbd-target-gw were started/enabled for non
containerized deployment.
The issue doesn't happen with containerized setup.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agotests: group and parametrize tests
Dimitri Savineau [Thu, 18 Apr 2019 21:08:13 +0000 (17:08 -0400)]
tests: group and parametrize tests

Instead of creating dedicated tests that each use the same testinfra
module, we can group them into a single test to avoid multiple ansible
connections and testinfra module executions.
This patch also adds the pytest parametrize decorator where possible.
Finally, fix some minor flake issues.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agotox: Remove update scenario reference
Dimitri Savineau [Tue, 23 Apr 2019 20:33:46 +0000 (16:33 -0400)]
tox: Remove update scenario reference

The update scenario is now handled by the tox-update.ini file so we
shouldn't have an update reference in the tox.ini file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoUpdate group_vars according to defaults
Dimitri Savineau [Tue, 23 Apr 2019 20:19:00 +0000 (16:19 -0400)]
Update group_vars according to defaults

b2f2426 didn't use the generate_group_vars_sample.sh script so we
currently have a difference between the content in group_vars and the
ceph-defaults/defaults directories.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agorolling_update: restart all ceph-iscsi services
Dimitri Savineau [Tue, 23 Apr 2019 18:58:37 +0000 (14:58 -0400)]
rolling_update: restart all ceph-iscsi services

Currently only rbd-target-gw service is restarted during an update.
We also need to restart tcmu-runner and rbd-target-api services
during the ceph iscsi upgrade.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1659611
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agovalidate: fix a typo
Guillaume Abrioux [Tue, 23 Apr 2019 14:04:27 +0000 (16:04 +0200)]
validate: fix a typo

5aa27794615e7d4521b1dbf1444b61388aacb852 introduced a typo.
This commit fixes it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoimprove coding style
Rishabh Dave [Mon, 1 Apr 2019 15:46:15 +0000 (21:16 +0530)]
improve coding style

Keywords requiring only one item shouldn't express it by creating a
list with single item.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agovalidate: fix notario error
Guillaume Abrioux [Tue, 23 Apr 2019 13:19:26 +0000 (15:19 +0200)]
validate: fix notario error

Typical error:

```
AttributeError: 'Invalid' object has no attribute 'message'
```

As of python 2.6, `BaseException.message` has been deprecated.
When using python3, it fails because it has been removed.

Let's use `str(error)` instead so we don't hit this error when using
python3.
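
A minimal Python 3 illustration of the difference. The `Invalid` class
below is a stand-in defined for this example, not notario's own class.

```python
class Invalid(Exception):
    """Stand-in exception for the example (hypothetical)."""


def error_text(error):
    """Return the error text in a way that works on python2 and python3."""
    # str(error) works everywhere; error.message does not exist on python3.
    return str(error)
```

On python3, `hasattr(Invalid("x"), "message")` is false, which is what
triggers the AttributeError shown above.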

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoAllow CephFS pool to be created with specific rule_name, erasure_profile just like...
Radu Toader [Thu, 18 Apr 2019 19:12:55 +0000 (22:12 +0300)]
Allow CephFS pool to be created with specific rule_name, erasure_profile just like rbd pools

Signed-off-by: Radu Toader <radu.m.toader@gmail.com>
6 years agoceph-container-common: modify requirement flow
Dimitri Savineau [Tue, 16 Apr 2019 13:33:02 +0000 (09:33 -0400)]
ceph-container-common: modify requirement flow

Until now it was not possible to install a specific container package
because the package name was hardcoded.
This patch allows overriding the container package name (docker.io
vs docker-ce) via the container_package_name variable and refactors
the package installation.
Instead of using one task per distribution we can set the package and
service name in vars. This allows a unified package task.
Also refactor the debian_prerequisites tasks because the content
was outdated.

https://docs.docker.com/install/linux/docker-ce/debian/
https://docs.docker.com/install/linux/docker-ce/ubuntu/

Resolves: #3609

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agodoc: update index.rst with current information for stable-4.0
Florian Haas [Thu, 18 Apr 2019 13:59:11 +0000 (15:59 +0200)]
doc: update index.rst with current information for stable-4.0

With the stable-4.0 branch nearing release, update
docs/source/index.rst with current information about which Ceph
releases are supported, and which Ansible versions are required, for
each branch.

Signed-off-by: Florian Haas <florian@citynetwork.eu>
6 years agomds: remove legacy task
Guillaume Abrioux [Thu, 18 Apr 2019 08:44:41 +0000 (10:44 +0200)]
mds: remove legacy task

this task has no purpose in stable-4.0 and later.
Let's remove it since stable-4.0 and later aren't intended to deploy
luminous.

Closes: #3873
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agorgw: add cpuset support
Kyle Bader [Thu, 21 Mar 2019 18:54:34 +0000 (11:54 -0700)]
rgw: add cpuset support

1/ The OSD already supports cpuset to be used for containerized deployments
through the use of the ceph_osd_docker_cpuset_cpus variable. This adds similar
support to the RGW service for containerized deployments by setting a new
variable named ceph_rgw_docker_cpuset_cpus. Like the OSD, there are times where
using distinct cores has advantages over using the CFS in kernel scheduler.

ceph_rgw_docker_cpuset_cpus accepts a comma delimited set of CPU ids

2/ Add support for specifying the --cpuset-mems option to restrict the cgroup's
memory allocations to a particular NUMA node, which should typically correspond
to the cpu ids of that NUMA node that were provided with --cpuset-cpus. To ensure
the correct cpu ids are used one can run `numactl --hardware` to list the nodes
and which cpu ids correspond to each.
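
As a sketch of how the two settings translate into container run
arguments, here is a small Python helper. It is hypothetical and not
part of the role; it assumes the docker/podman option names
--cpuset-cpus and --cpuset-mems, and it deliberately accepts only a
comma-delimited list of numeric ids (ranges such as 0-3 are left out).

```python
def cpuset_args(cpus, mems=None):
    """Build --cpuset-* arguments from a comma-delimited list of CPU ids."""
    ids = [item.strip() for item in str(cpus).split(",")]
    if not ids or not all(item.isdigit() for item in ids):
        raise ValueError("expected a comma-delimited list of CPU ids")
    args = ["--cpuset-cpus=" + ",".join(ids)]
    if mems is not None:
        # NUMA node that holds the CPUs listed above.
        args.append("--cpuset-mems=" + str(mems))
    return args
```

For example, `cpuset_args("0,2,4", mems=0)` pins the container to CPUs
0, 2 and 4 and to the memory of NUMA node 0.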

Signed-off-by: Kyle Bader <kbader@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoceph-mgr: Add extra module packages
Dimitri Savineau [Mon, 15 Apr 2019 16:15:49 +0000 (12:15 -0400)]
ceph-mgr: Add extra module packages

Since Nautilus there are extra mgr modules that are not present in the
ceph-mgr package but in dedicated packages.

Resolves: #3860

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoupdate: ensure tasks are executed on an upgraded mon
Guillaume Abrioux [Wed, 17 Apr 2019 12:02:06 +0000 (14:02 +0200)]
update: ensure tasks are executed on an upgraded mon

These tasks must be run from a monitor which is upgraded, otherwise
they might fail.
See: https://tracker.ceph.com/issues/39355

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoupdate: ensure ceph command returns 0
Guillaume Abrioux [Wed, 17 Apr 2019 11:57:29 +0000 (13:57 +0200)]
update: ensure ceph command returns 0

these commands could return something other than 0.
Let's ensure all retries have been done before actually failing.
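
The retry behaviour can be sketched in Python as follows. This is an
illustration under assumptions, not the playbook's code: the function
name, the default retry count and the delay are made up, and the
runner and sleep parameters exist only so the helper can be exercised
without a real cluster.

```python
import subprocess
import time


def run_until_success(cmd, retries=5, delay=2.0,
                      runner=subprocess.run, sleep=time.sleep):
    """Run cmd up to `retries` times; fail only after the last attempt."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last = None
    for attempt in range(1, retries + 1):
        last = runner(cmd, capture_output=True, text=True)
        if last.returncode == 0:
            return last
        if attempt < retries:
            sleep(delay)  # wait before the next attempt
    raise RuntimeError(
        "command failed after {} attempts (rc={})".format(
            retries, last.returncode))
```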

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoupdate: set osd flags before upgrading any mon
Guillaume Abrioux [Wed, 17 Apr 2019 06:47:25 +0000 (08:47 +0200)]
update: set osd flags before upgrading any mon

Typical error:

```
failed: [mon0 -> mon2] (item=noout) => changed=true
  cmd:
  - ceph
  - --cluster
  - ceph
  - osd
  - set
  - noout
  delta: '0:00:00.293756'
  end: '2019-04-17 06:31:57.552386'
  item: noout
  msg: non-zero return code
  rc: 1
  start: '2019-04-17 06:31:57.258630'
  stderr: |-
    Traceback (most recent call last):
      File "/bin/ceph", line 1222, in <module>
        retval = main()
      File "/bin/ceph", line 1146, in main
        sigdict = parse_json_funcsigs(outbuf.decode('utf-8'), 'cli')
      File "/usr/lib/python2.7/site-packages/ceph_argparse.py", line 788, in parse_json_funcsigs
        cmd['sig'] = parse_funcsig(cmd['sig'])
      File "/usr/lib/python2.7/site-packages/ceph_argparse.py", line 728, in parse_funcsig
        raise JsonFormat(s)
    ceph_argparse.JsonFormat: unknown type CephBool
  stderr_lines:
  - 'Traceback (most recent call last):'
  - '  File "/bin/ceph", line 1222, in <module>'
  - '    retval = main()'
  - '  File "/bin/ceph", line 1146, in main'
  - '    sigdict = parse_json_funcsigs(outbuf.decode(''utf-8''), ''cli'')'
  - '  File "/usr/lib/python2.7/site-packages/ceph_argparse.py", line 788, in parse_json_funcsigs'
  - '    cmd[''sig''] = parse_funcsig(cmd[''sig''])'
  - '  File "/usr/lib/python2.7/site-packages/ceph_argparse.py", line 728, in parse_funcsig'
  - '    raise JsonFormat(s)'
  - 'ceph_argparse.JsonFormat: unknown type CephBool'
  stdout: ''
  stdout_lines: <omitted>
```

Having mixed versions of monitors seems to cause this error.
Moving these tasks before any monitor gets upgraded seems to be enough
to get around this issue.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoupdate: refact msgr2 migration
Guillaume Abrioux [Tue, 16 Apr 2019 08:31:44 +0000 (10:31 +0200)]
update: refact msgr2 migration

this commit refactors the msgr2 protocol introduction.

If it's a fresh install, let's go with v2 only.
If we upgrade to nautilus, we should go with v2+v1 syntax to ensure
nothing breaks.
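
A Python sketch of the two cases. The helper is hypothetical and the
exact string layout is an assumption based on the bracketed v2/v1
address-vector notation; ports 3300 (msgr2) and 6789 (msgr1) are the
monitor defaults.

```python
def mon_host_entry(ip, upgrade):
    """Return a monitor address entry for a fresh install or an upgrade."""
    if upgrade:
        # Keep v1 alongside v2 so existing clients and daemons still connect.
        return "[v2:{0}:3300,v1:{0}:6789]".format(ip)
    return "[v2:{0}:3300]".format(ip)
```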

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agorolling_update: ceph commands should use --cluster
Andrew Schoen [Thu, 28 Mar 2019 21:05:09 +0000 (16:05 -0500)]
rolling_update: ceph commands should use --cluster

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years agorolling_update: set num_osds to the number of running osds
Andrew Schoen [Thu, 28 Mar 2019 19:34:48 +0000 (14:34 -0500)]
rolling_update: set num_osds to the number of running osds

We do this so that the ceph-config role can most accurately
report the number of osds for the generation of the ceph.conf
file.

We don't want to use ceph-volume to determine the number of
osds because in an upgrade to nautilus ceph-volume won't be able to
accurately count osds created by ceph-disk.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years agoceph-osd: do not run lvm batch tasks during update
Andrew Schoen [Thu, 28 Mar 2019 19:02:54 +0000 (14:02 -0500)]
ceph-osd: do not run lvm batch tasks during update

When performing a rolling update do not try to create
any new osds with `ceph-volume lvm batch`. This is troublesome
because when upgrading to nautilus the devices list might contain
devices that are currently being used by ceph-disk and have GPT
headers on them, which will cause ceph-volume to fail when
trying to use such a device. Any devices originally created
by ceph-disk will need to be removed from the devices list
before any new osds can be created.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years agotests: adds the migrate_ceph_disk_to_ceph_volume scenario
Andrew Schoen [Wed, 27 Mar 2019 19:36:51 +0000 (14:36 -0500)]
tests: adds the migrate_ceph_disk_to_ceph_volume scenario

This test deploys a luminous cluster with ceph-disk created osds
and then upgrades to nautilus and migrates those osds to ceph-volume.
The nodes are then rebooted and cluster state verified.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
6 years agorolling_update: migrate ceph-disk osds to ceph-volume
Andrew Schoen [Tue, 19 Mar 2019 20:08:32 +0000 (15:08 -0500)]
rolling_update: migrate ceph-disk osds to ceph-volume

When upgrading to nautilus run ``ceph-volume simple scan`` and
``ceph-volume simple activate --all`` to migrate any running
ceph-disk osds to ceph-volume.
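
The two steps can be sketched in Python as below. The wrapper function
is hypothetical; only the two ceph-volume commands come from the text
above. The runner parameter defaults to subprocess.check_call, which
raises on a non-zero exit so the activation is not attempted if the
scan fails.

```python
import subprocess

MIGRATION_COMMANDS = (
    ("ceph-volume", "simple", "scan"),
    ("ceph-volume", "simple", "activate", "--all"),
)


def migrate_ceph_disk_osds(runner=subprocess.check_call):
    """Run the scan, then the activation, and return what was executed."""
    executed = []
    for cmd in MIGRATION_COMMANDS:
        runner(list(cmd))
        executed.append(list(cmd))
    return executed
```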

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1656460
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
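
A minimal sketch of the two-step migration described above, assuming the commands are simply run in order on each OSD node. The `migrate_ceph_disk_osds` function and its injectable `run` parameter are hypothetical names used for illustration; they are not taken from the rolling_update playbook.

```python
import subprocess
from typing import Callable, List, Sequence

# The two commands named in the commit message, in the order they are run.
MIGRATION_COMMANDS: List[List[str]] = [
    ["ceph-volume", "simple", "scan"],
    ["ceph-volume", "simple", "activate", "--all"],
]


def _run_checked(cmd: Sequence[str]) -> None:
    # Raise if a command fails so activation never follows a failed scan.
    subprocess.run(list(cmd), check=True)


def migrate_ceph_disk_osds(
    run: Callable[[Sequence[str]], object] = _run_checked,
) -> List[List[str]]:
    """Run each migration command in order and return what was executed.

    `run` is a hypothetical hook so the sequence can be exercised without
    touching a real host.
    """
    executed = []
    for cmd in MIGRATION_COMMANDS:
        run(cmd)
        executed.append(list(cmd))
    return executed
```

Passing a recording callable such as `calls.append` as `run` shows the order of the commands without executing them.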
6 years agoceph-iscsi-gw: Remove library directory
Dimitri Savineau [Wed, 17 Apr 2019 15:37:03 +0000 (11:37 -0400)]
ceph-iscsi-gw: Remove library directory

The library directory that contains the custom ceph modules is present
in the ceph-ansible root directory.
All igw_* modules are already present there so we don't need the ones
present in roles/ceph-iscsi-gw/library.
Also remove the associated spec file.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agotest_osds: remove scenario leftover
Dimitri Savineau [Tue, 16 Apr 2019 20:23:51 +0000 (16:23 -0400)]
test_osds: remove scenario leftover

Since there's only one scenario available we don't need lvm_scenario
and no_lvm_scenario.
Also add missing assert for ceph-volume tests.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agoallow using ansible 2.8
Dimitri Savineau [Wed, 17 Apr 2019 14:22:59 +0000 (10:22 -0400)]
allow using ansible 2.8

Currently we only support ansible 2.7.
We plan to use 2.8 once it is released, so we have to support both
2.7 and 2.8.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1700548
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
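
The dual-version requirement above can be illustrated with a small version gate. This is only a sketch: `SUPPORTED_SERIES`, `ansible_series` and `is_supported` are hypothetical names, and the actual check in ceph-ansible may be implemented differently.

```python
from typing import Tuple

# Release series the commit message says must both be supported.
SUPPORTED_SERIES = {(2, 7), (2, 8)}


def ansible_series(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '2.8.0'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_supported(version: str) -> bool:
    """True when the version belongs to one of the supported series."""
    return ansible_series(version) in SUPPORTED_SERIES
```

Comparing on the (major, minor) pair rather than the full string means any patch release within 2.7 or 2.8 is accepted.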
6 years agotests/functional/setup: change mount options
Dimitri Savineau [Fri, 12 Apr 2019 14:46:20 +0000 (10:46 -0400)]
tests/functional/setup: change mount options

In the CI jobs we can change the mount options of the main partition
to avoid extra operations on disk.
Adding jmespath to tests/requirements.txt due to the json_query
filter usage.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agotest_mons: test mon listening on port 3300
Dimitri Savineau [Tue, 16 Apr 2019 20:52:42 +0000 (16:52 -0400)]
test_mons: test mon listening on port 3300

Since nautilus and msgr2 the monitors also bind on port 3300 in
addition to 6789.
This patch updates test_mons to reflect that change.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
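
A hedged sketch of what such a check could look like with testinfra, whose `host.socket(...).is_listening` call is used here. The helper names and the `address` argument are assumptions for illustration and are not copied from test_mons.

```python
# Ports from the commit message: 6789 and, since nautilus/msgr2, 3300.
MON_PORTS = (6789, 3300)


def mon_socket_specs(address):
    """Build testinfra-style socket specs for every monitor port."""
    return ["tcp://%s:%d" % (address, port) for port in MON_PORTS]


def check_mon_listens(host, address):
    # `host` is expected to be a testinfra host object; `address` is the
    # monitor's IP address (hypothetical parameter).
    for spec in mon_socket_specs(address):
        assert host.socket(spec).is_listening, "%s is not listening" % spec
```

In a real test module the same loop would live in a `test_*` function that receives testinfra's `host` fixture.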
6 years agodefaults: refact package dependencies installation.
Guillaume Abrioux [Tue, 16 Apr 2019 07:58:52 +0000 (09:58 +0200)]
defaults: refact package dependencies installation.

Because 5c98e361df5241fbfa5bd0a2ae1317219b7e1244 could be seen as a non
backward compatible change this commit reverts it and brings back package
dependencies installation support.
Let's just modify the default value instead.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agodefaults: remove some package dependencies
Guillaume Abrioux [Mon, 15 Apr 2019 14:38:50 +0000 (16:38 +0200)]
defaults: remove some package dependencies

These packages aren't needed anymore.
They were needed for ceph-init-detect, but ceph-init-detect doesn't exist
anymore.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1683885
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoallow adding a monitor to a deployed cluster
Rishabh Dave [Thu, 8 Nov 2018 13:47:51 +0000 (08:47 -0500)]
allow adding a monitor to a deployed cluster

Add a playbook that deploys a new monitor on a new node, adds that node
to the Ceph cluster and the monitor to the quorum and updates the ceph
configuration file on OSD nodes.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agocheck if mon daemon is installed before restarting it
Rishabh Dave [Sat, 6 Apr 2019 06:15:31 +0000 (02:15 -0400)]
check if mon daemon is installed before restarting it

Signed-off-by: Rishabh Dave <ridave@redhat.com>
6 years agomon: check if an initial monitor keyring already exists
Guillaume Abrioux [Wed, 30 Jan 2019 09:11:26 +0000 (10:11 +0100)]
mon: check if an initial monitor keyring already exists

When adding a new monitor, we must reuse the existing initial monitor
keyring. Otherwise, the new monitor will issue its 'mkfs' with a new
monitor keyring and it will result in a mismatch between them. The
new monitor will be unable to join the quorum in the end.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Rishabh Dave <ridave@redhat.com>
6 years agopurge-cluster: remove python-ceph-argparse package
Dimitri Savineau [Fri, 12 Apr 2019 19:30:35 +0000 (15:30 -0400)]
purge-cluster: remove python-ceph-argparse package

When using the purge-cluster playbook with nautilus, the
python-ceph-argparse package is still installed on the host, which prevents
reinstalling a ceph cluster with a different version (like luminous or
mimic).

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agodocs: Update ceph.conf supported section
Dimitri Savineau [Thu, 11 Apr 2019 21:09:04 +0000 (17:09 -0400)]
docs: Update ceph.conf supported section

[rgw] isn't a valid section.
[client.rgw.{instance_name}] should be used instead.

Resolves: #3841

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
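
To make the section naming concrete, here is a small sketch that renders a `[client.rgw.{instance_name}]` section with Python's `configparser`. The instance name `rgw0` and the option `example_option` in the usage note are placeholders, not real defaults, and the helper names are hypothetical.

```python
import configparser
import io


def rgw_section(instance_name):
    """Return the per-instance section name described in the commit."""
    return "client.rgw.%s" % instance_name


def render_rgw_conf(instance_name, options):
    """Render one rgw section as ini text (option values must be strings)."""
    parser = configparser.ConfigParser()
    parser[rgw_section(instance_name)] = options
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()
```

For example, `render_rgw_conf("rgw0", {"example_option": "value"})` produces text that begins with the `[client.rgw.rgw0]` header rather than a bare `[rgw]` one.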
6 years agoswitch-from-non-containerized: stop all osds
Dimitri Savineau [Thu, 11 Apr 2019 20:20:41 +0000 (16:20 -0400)]
switch-from-non-containerized: stop all osds

e6bfb84 introduced a regression in the switch from non-containerized
to containerized deployment.
We need to stop all previous OSD services. We just don't need the
ceph-disk pattern in the regex.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
6 years agopurge: remove references to ceph-disk
Guillaume Abrioux [Thu, 11 Apr 2019 15:03:44 +0000 (17:03 +0200)]
purge: remove references to ceph-disk

As of stable-4.0, ceph-disk is no longer supported.
These tasks aren't needed anymore.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoshrink-osd: remove legacy playbook
Guillaume Abrioux [Thu, 11 Apr 2019 15:01:39 +0000 (17:01 +0200)]
shrink-osd: remove legacy playbook

As of stable-4.0, ceph-disk is no longer supported.
Let's remove this legacy version of the playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoswitch_to_containers: remove ceph-disk references
Guillaume Abrioux [Thu, 11 Apr 2019 15:00:58 +0000 (17:00 +0200)]
switch_to_containers: remove ceph-disk references

As of stable-4.0, ceph-disk is no longer supported.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoosd: remove legacy file
Guillaume Abrioux [Thu, 11 Apr 2019 14:51:03 +0000 (16:51 +0200)]
osd: remove legacy file

This file is not used anymore, so let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agotests: pass osd_scenario value to lvm_setup.yml
Guillaume Abrioux [Thu, 11 Apr 2019 15:18:02 +0000 (17:18 +0200)]
tests: pass osd_scenario value to lvm_setup.yml

We must pass the value of osd_scenario from the stable-3.2 branch, which
is used for the initial deployment.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agotests: remove test_journal_collocation.py in OSD testing
Guillaume Abrioux [Thu, 11 Apr 2019 12:57:56 +0000 (14:57 +0200)]
tests: remove test_journal_collocation.py in OSD testing

This test is related to ceph-disk, which is dropped as of stable-4.0.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoresync sample file
Guillaume Abrioux [Thu, 11 Apr 2019 08:13:17 +0000 (10:13 +0200)]
resync sample file

d17b1b48b6 introduced a change that hasn't been reflected in the sample files.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoosd: remove ceph-disk scenarios files
Guillaume Abrioux [Thu, 11 Apr 2019 08:09:31 +0000 (10:09 +0200)]
osd: remove ceph-disk scenarios files

These files aren't needed anymore since we only use the lvm scenario.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoosd: remove dedicated_devices variable
Guillaume Abrioux [Thu, 11 Apr 2019 08:08:22 +0000 (10:08 +0200)]
osd: remove dedicated_devices variable

This variable was related to ceph-disk scenarios.
Since we are entirely dropping ceph-disk support as of stable-4.0, let's
remove this variable.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoosd: remove variable osd_scenario
Guillaume Abrioux [Thu, 11 Apr 2019 08:01:15 +0000 (10:01 +0200)]
osd: remove variable osd_scenario

As of stable-4.0, the only valid scenario is `lvm`.
Thus, this makes this variable useless.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
6 years agoosd: remove legacy file
Guillaume Abrioux [Wed, 10 Apr 2019 11:33:57 +0000 (13:33 +0200)]
osd: remove legacy file

ceph_disk_cli_options_facts.yml is not used anymore, let's remove it.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>