ceph-rgw: use match instead of equalto from jinja2
The '==' jinja2 operator (or 'equalto') has been introduced in jinja2
2.8.
On EL7, jinja2 version is 2.7 so the operator isn't present creating
templating error like:
The error was: TemplateRuntimeError: no test named '=='
Since ea2b654d9 we're not running the rados command from the monitor
nodes but from the ganesha node. Unfortunately we don't have the
required keyring on that node to run the rados command as we don't
import the right keyring.
This commit restores the workflow for internal ganesha deployment like
before ea2b654d9 but keeps the rados commands from the ganesha node for
external deployment until we have a better design.
Fix the condition on the keyring copy task that prevent the ganesha
keyring to be created in the /var/lib/ceph directory.
Also ensure that the directory exists first.
When using radosgw_interface and IPv6 setup then the _radosgw_address
fact doesn't use square brackets compared to the radosgw_address and
radosgw_address_block configuration.
This change allows the operator to refresh the
ceph dashboard admin role on multiple ceph-ansible
executions.
In the current state the role is set only when the
user is created, and there's no way to change it if
the user exists.
The CentOS 7 distribution could still be used be deploying ceph if
- it's a containerized deployment
- it's a non containerized deployment without the dashboard (due to
missing python3 libraries).
The ceph_stable_redhat_distro variable has been remove because we can
rely on the ansible_distribution_major_version fact instead.
The copr el8 repository configuration is only applied for CentOS 8.
The ceph-mgr-dashboard package is only installed when the
dashboard_enabled variable is set to true.
This commit is the first of a serie in order to describe all day-2 operations
that are possible via ceph-ansible using a set of playbook provided in
`infrastructure-playbooks` directory.
[DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of
using `result|version_compare` use `result is version_compare`. This
feature will be removed in version 2.9. Deprecation warnings can be
disabled by setting deprecation_warnings=False in ansible.cfg.
Rename 'version_compare' to the function 'version'.
version_compose was renamed to version since ansible 2.5
Rishabh Dave [Tue, 7 Apr 2020 11:50:35 +0000 (17:20 +0530)]
library/ceph_volume: look for error messages in stderr
Error message were moved to from stdout in stderr here -
https://github.com/ceph/ceph/commit/b8d6dcbe9f803c96c0af68da54f1262e9b6a9e77#diff-20f7c578a4e69ec61a5869d706567a24R137.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1793542 Signed-off-by: Rishabh Dave <ridave@redhat.com>
We were not testing the right ansible_distribution fact value for RHEL
distribution.
This commit also updates the minial RHEL version supported by RHCS.
The workflow in this playbook should be the same than in rolling_update,
we should first set noout and nodeep-scrub flags before migrating the
first osd and unset osd flags after the last osd is migrated.
When setting/unsetting osd flags, we can use `tasks_from` when importing
`ceph-facts` role to save some times given that we only need this role
for setting `container_binary`
The dashboard SSO mgr module requires the saml python library to be
installed. This is only a valid scenario for RHCS deployment because
the saml python library isn't available in other classic repositories.
This package is present in RHCS Tools repository so we also need to
enable it on the mgr nodes.
The current centos/8 vagrant image (libvirt) is still using the
CentOS 8.0 release (1905) while the 8.1 release (1911) is already
available since few months.
Using an update CentOS 8 release fixes slow ceph-volume/lvm commands.
With this change, the state `present` is enough to update a keyring.
If the keyring already exist, it will be updated if caps or secret
passed to the module are different.
If the keyring doen't exist, it will be created.
When `rule_name` isn't set in `crush_rules` the osd pool creation will
fail.
This commit adds a new fact `ceph_osd_pool_default_crush_rule_name` with
the default crush rule name.
osd: support changing default rule even when osd_crush_location isn't defined
Creating crush rules even with no crush hierarchy configuration is a
valid scenario so we shouldn't be bound to the first task result (which
configure crush hierarchy) to be able to add new crush rules.
Adding `--all` to the `systemctl list-units` command in order to get
*all* osds id on the node (including stoppped osds). Otherwise, it will
purge the cluster but there will be leftover after that.
Dimitri Savineau [Fri, 27 Mar 2020 21:16:41 +0000 (17:16 -0400)]
ceph_volume: fix multiple db/wal/journal devices
When using the lvm batch ceph-volume subcommand with dedicated devices
for filestore (journal) or bluestore (db/wal) then the list of devices
is convert to a string instead of being extended via an iterable.
This was working with only one dedicated device but starting with more
then the ceph_volume module fails.
Dimitri Savineau [Tue, 24 Mar 2020 21:11:44 +0000 (17:11 -0400)]
ceph-facts: fix rgw_instances_all fact
The rgw_instances_all fact is supposed to be the list of all radosgw
instances from all rgw nodes.
But the fact is always using the local rgw_instances variable so this
won't work on multiple nodes.
Dimitri Savineau [Tue, 24 Mar 2020 18:28:51 +0000 (14:28 -0400)]
tests: update mgr dashboard socket listening test
Since 15ed9ee the ceph-mgr daemon binds on the IP address on the public
network instead of binding on all addresses.
This commit updates the testinfra code to reflect that change.
Dimitri Savineau [Wed, 18 Mar 2020 14:53:40 +0000 (10:53 -0400)]
dashboard: allow to set read-only admin user
This commit allows one to set the role for the admin user as read-only.
This can be controlled via the dashboard_admin_user_ro variable but the
default value is false for backward compatibility.
Dimitri Savineau [Tue, 17 Mar 2020 00:45:03 +0000 (20:45 -0400)]
ceph-defaults: add registry name on dashboard vars
We don't use the registry name when using the community dashboard
container images (grafana, prometheus, alertmanager & node exporter).
This commit adds the docker.io registry explicitly in the default
dashboard container image name values.
Dimitri Savineau [Mon, 16 Mar 2020 21:52:30 +0000 (17:52 -0400)]
ceph-defaults: update grafana container tag
Since 8e8aa73 we're using grafana 5.4.3 in RHCS 4.1 via [1].
We should also update the grafana container tag from docker.io when
using the community release.
Dimitri Savineau [Wed, 11 Mar 2020 02:41:27 +0000 (22:41 -0400)]
ceph-infra: open radosgw ports for multi instances
When using the radosgw multi instances configuration then the firewall
rules aren't adapted to that setup.
We only open the port according to the radosgw_frontend_port variable
so only the first radosgw instance port will be opened in the firewall
configuration.
We should instead iterate over the rgw_instances list.
Dimitri Savineau [Wed, 11 Mar 2020 00:50:55 +0000 (20:50 -0400)]
update osd pool set size command
Since [1] we can't use osd pool without replicas (size: 1) by default.
We now need to set the mon_allow_pool_size_one flag to true in the ceph
configuration and add the --yes-i-really-mean-it flag to the osd pool
set size cli.
This commit make that task retrying 5 times to start the service
firewalld to avoid failure like following:
```
TASK [ceph-infra : start firewalld] ********************************************
task path: /home/jenkins-build/build/workspace/ceph-ansible-prs-centos-container-purge/roles/ceph-infra/tasks/configure_firewall.yml:22
Monday 09 March 2020 08:58:48 +0000 (0:00:00.963) 0:02:16.457 **********
fatal: [osd4]: FAILED! => changed=false
msg: |-
Unable to enable service firewalld: Created symlink from /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service to /usr/lib/systemd/system/firewalld.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/firewalld.service to /usr/lib/systemd/system/firewalld.service.
Failed to execute operation: Connection reset by peer
```
Nizamudeen [Fri, 6 Mar 2020 10:38:31 +0000 (16:08 +0530)]
doc: Fixed a minor typo in the document
In the Demo part of the Document, for both Vagrant and Bare metal the description was mentioned as "Deployment from scratch on bare metal machines".
Changed "bare metal" to "vagrant" for Vagrant section
Ali Maredia [Fri, 4 Oct 2019 19:31:25 +0000 (19:31 +0000)]
rgw multisite: enable more than 1 realm per cluster
Make it so that more than one realm, zonegroup,
or zone can be created during a run of the rgw
multisite ansible playbooks.
The rgw hosts now need to be grouped into zones
and realms in the inventory.
.yml files need to be created in group_vars
for the realms and zones. Sample yaml files
are available.
Also remove multsite destroy playbook
and add --cluster before radosgw-admin commands
remove manually added rgw_zone_endpoints var
and have ceph-ansible automatically add the
correct endpoints of all the rgws in a rgw_zone
from the information provided in that rgws hostvars.
This commit changes the value passed for the attribute 'rule_name' in
openstack_pools definition. It doesn't make sense to have emptry string
as passed value here.
Dimitri Savineau [Fri, 28 Feb 2020 16:37:14 +0000 (11:37 -0500)]
shrink-rbdmirror: fix presence after removal
We should add retry/delay to check the presence of the rbdmirror daemon
in the cluster status because the status takes some time to be updated.
Also the metadata.hostname isn't a good key to check because it doesn't
reflect the ansible_hostname fact. We should use metadata.id instead.
Dimitri Savineau [Wed, 26 Feb 2020 16:03:32 +0000 (11:03 -0500)]
tox: update shrink scenario configuration
The shrink scenarios don't need the docker variables (except for OSD).
Removing pytest for shrink-mgr.
Adding environment variables for xxx_to_kill ansible variable.
Dimitri Savineau [Wed, 26 Feb 2020 15:20:05 +0000 (10:20 -0500)]
shrink: don't use localhost node
The ceph-facts are running on localhost so if this node is using a
different OS/release that the ceph node we can have a mismatch between
docker/podman container binary.
This commit also reduces the scope of the ceph-facts role because we only
need the container_binary tasks.
Dimitri Savineau [Fri, 28 Feb 2020 14:42:44 +0000 (09:42 -0500)]
ceph-validate: add key format validation
If the user provides manually the key value for a specific keyring then
there's not valation on the content which could lead to unexpected
failures in the ceph_key module.
Dimitri Savineau [Fri, 31 Jan 2020 15:42:10 +0000 (10:42 -0500)]
purge: stop rgw instances by iteration
It looks like that the service module doesn't support wildcard anymore
for stopping/disabling multiple services.
fatal: [rgw0]: FAILED! => changed=false
msg: 'This module does not currently support using glob patterns,
found ''*'' in service name: ceph-radosgw@*'
...ignoring
Instead we should iterate over the rgw_instances list.