For some reason we changed the check of pgs but it appears it could be
dangerous because the current check might satisfied as long as 1 PG is
active+clean.
`ceph-docker-common`:
At the moment there is a lot of duplicated tasks in each
`./roles/ceph-<role>/tasks/docker/main.yml` that could be refactored in
`./roles/ceph-docker-common/tasks/main.yml`.
`*_containerized_deployment` variables:
All `*_containerized_deployment` have been refactored to a single
variable `containerized_deployment`
duplicate `cephx` variables in `group_vars/* have been removed.
Sébastien Han [Mon, 22 May 2017 12:18:45 +0000 (14:18 +0200)]
common: fix installation condition
Problem: we could end up in situation where we would install a package
on a machine that does not have the right repo enabled. Because the
condition was set to OR we weren't pinning a particular host but just a
condition. Let's say someone sets 'ceph_origin == "distro"', this would
try to install OSD packages on Monitors.
Solution: use a AND condition to first pin to the group_name (which
identifies a set of hosts) AND then after this one of the installation
condition.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1453119 Co-Authored-By: https://github.com/zhsj Signed-off-by: Sébastien Han <seb@redhat.com>
Ali Maredia [Wed, 10 May 2017 17:00:41 +0000 (13:00 -0400)]
rgw: move default bucket quota conf vars to global
"rgw override bucket index max shards" and
"rgw bucket default quota max objects" were in the
client section of the ceph.conf and not being
applied, this commit moves them to global
Sébastien Han [Thu, 16 Mar 2017 09:17:08 +0000 (10:17 +0100)]
mgr: add new role for ceph-mgr
The Ceph Manager daemon (ceph-mgr) runs alongside monitor daemons, to
provide additional monitoring and interfaces to external monitoring and
management systems.
Only works as of the Kraken release.
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com> Signed-off-by: Sébastien Han <seb@redhat.com>
Common: Fix handlers that are not properly triggered.
Until now, only the first task were executed.
The idea here is to use `listen` statement to be able to notify multiple
handler and regroup all of them in `./handlers/main.yml` as notifying an
included handler task is not possible.
Sébastien Han [Thu, 30 Mar 2017 09:51:38 +0000 (11:51 +0200)]
playbook: homogenize the way list osd ids
Problem: too many different commands to do the same thing. The 'cut'
command on infrastructure-playbooks/purge-cluster.yml was also wrong.
This sed command from osixia in ceph-docker
https://github.com/ceph/ceph-docker/pull/580/ addresses all the
scenarios.
Common: Do not install ntp when ntp_service_enabled is false
ntp is still installed even if ntp_service_enabled is set to false.
That could be a problem if the time synchronization is managed by
something else than ceph-ansible or if you want to use different NTP
implementation as suggested in #1354.
If a group of hosts is empty, (for instance 'mdss', in case of a
deployment without any mds node), the playbook will fails when trying
to restart service with `"'dict object' has no attribute u'XXX'"` error.
The idea here is to force the `with_items` statements in all included handler tasks
to get at least an empty array.
Concubidated [Fri, 24 Mar 2017 19:52:37 +0000 (12:52 -0700)]
ceph-common: update sysctl file location
systctl tuning should be in the sysctl.d directory. This creates
a seperation from what values were set specific to ceph, and what
values were set by the operator.
Daniel Marks [Thu, 16 Mar 2017 22:16:30 +0000 (23:16 +0100)]
Use ansible uri module instead of shell module with curl
This fixes issue #1299. According to @ktdreyer s comment in the ticket,
he fixed the web server config so also older (non-SNI) python clients
can use the uri module here.
Christian Zunker [Wed, 15 Mar 2017 12:32:30 +0000 (13:32 +0100)]
Make ceph-common aware off osd config fragments
This removes the implicit order requirement when using OSD fragments.
When you use OSD fragments and ceph-osd role is not the last one,
the fragments get removed from ceph.conf by ceph-common.
It is not nice to have this code at two locations, but this is
necessary to prevent problems, when ceph-osd is the last role as
ceph-common gets executed before ceph-osd.
This could be prevented when ceph-common would be explicitly called
at the end of the playbook.
Signed-off-by: Christian Zunker <christian.zunker@codecentric.de>
Ken Dreyer [Mon, 13 Mar 2017 15:34:35 +0000 (09:34 -0600)]
ceph-common: install nfs-ganesha FSALs on Debian
Prior to this change, ceph-ansible would install the main NFS Ganesha
server daemon on Ubuntu, but it would skip the Ceph FSALs.
Running "apt-get install nfs-ganesha" will only install the main NFS Ganesha
server. It does *not* pull in the RGW FSAL
(/usr/lib/x86_64-linux-gnu/ganesha/libfsalrgw.so)
Running "apt-get install nfs-ganesha-fsal" will install the RGW FSAL as
well as the main NFS Ganesha server package.
Ken Dreyer [Fri, 3 Mar 2017 18:20:05 +0000 (11:20 -0700)]
avoid setting vfs_cache_pressure
From Josh Durgin, "I'd recommend not setting vfs_cache_pressure in
ceph-ansible. The syncfs issue is still there, and has caused real
problems in the past, whereas there hasn't been good data showing lower
vfs_cache_pressure is very helpful - the only cases I'm aware of have
shown it makes little difference to performance."
Andrew Schoen [Tue, 21 Feb 2017 18:35:00 +0000 (12:35 -0600)]
ceph-common: remove infernalis comment on radosgw_civetweb_port
As of Infernalis, the Ceph daemons run as an unprivileged "ceph" UID,
and this is by design.
Commit f19b765 altered the default
civetweb port from 80 to 8080 with a comment in the commit log about
"until this gets solved"
Remove the comment about permissions on Infernalis, because this is
always going to be the case on the Ceph versions we support, and it
is just confusing.
If users want to expose civetweb to s3 clients using privileged TCP
ports, they can redirect traffic with iptables, or use a reverse proxy
application like HAproxy.
Andrew Schoen [Fri, 17 Feb 2017 20:27:15 +0000 (14:27 -0600)]
ceph-common: use yum_repository when adding the ceph_stable repo
This gives us more flexibility than installing the ceph-release package
as we can easily use different mirrors. Also, I noticed an issue when
upgrading from jewel -> kraken as the ceph-release package for those
releases both have the same version number and yum doesn't know to
update anything.
Sébastien Han [Mon, 20 Feb 2017 22:07:53 +0000 (17:07 -0500)]
common: fix "disable transparent hugepage"
To configure kernel the task is using "command" module which is not
respect operator ">". So this task just print to "stdout": "never >
/sys/kernel/mm/transparent_hugepage/enabled"
Sébastien Han [Thu, 9 Feb 2017 14:16:39 +0000 (15:16 +0100)]
docker: use a better method to pull images
We changed the way we declare image.
Prior to this patch we must have a "user/image:tag"
format, which is incompatible with non docker-hub registry where you
usually don't have a "user". On the docker hub a "user" is also
identified as a namespace, so for Ceph the user was "ceph".
Variables have been simplified with only:
* ceph_docker_image
* ceph_docker_image_tag
1. For docker hub images: ceph_docker_name: "ceph/daemon" will give
you the 'daemon' image of the 'ceph' user.
2. For non docker hub images: ceph_docker_name: "daemon" will simply
give you the "daemon" image.
Infrastructure playbooks have been modified as well.
The file group_vars/all.docker.yml.sample has been removed as well.
It is hard to maintain since we have to generate it manually. If
you want to configure specific variables for a specific daemon simply
edit group_vars/$DAEMON.yml
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1420207 Signed-off-by: Sébastien Han <seb@redhat.com>
Sébastien Han [Mon, 30 Jan 2017 10:05:01 +0000 (11:05 +0100)]
common: create ceph initial directories
Some users purge their environments and leave it in a non-optimal state.
e.g: packages are still installed but /etc/ceph and /var/lib/ceph don't
exist anymore. This will result in multiple failures across the play,
sometimes hard to detect. Populating these directories "just in case"
should help us solving these problems.
Closes: #1253 Signed-off-by: Sébastien Han <seb@redhat.com>
Andrew Schoen [Tue, 24 Jan 2017 15:06:10 +0000 (09:06 -0600)]
Adds ip_version configuration option
This allows the user to set ip_version to either ipv4 or ipv6. This
resolves a bug where monitor_address is set to an ipv6 address, but the
template fails to render because it's hardcoded to look for an 'ipv4'
key in the ansible facts.
Logan V [Mon, 16 Jan 2017 14:14:02 +0000 (08:14 -0600)]
RGW: Allow configurable rgw frontends setting
Allow for more operator flexibility in the `rgw frontends` setting
while maintaining backwards compatibility with the old vars. This
allows an operator to, for example, use the civetweb settings for
implementing SSL ports.
For available civetweb configuration parameters, see:
https://github.com/civetweb/civetweb/blob/master/docs/UserManual.md
Logan V [Mon, 16 Jan 2017 14:12:15 +0000 (08:12 -0600)]
Remove libcephfs1 from group_vars sample
The libcephfs1 package was removed from ceph-common in cb1c06901e02a9f44c24a5d20737a9f33ac8ab2b, however it was not synced
to group_vars/all.yml.sample using the `generate_group_vars_sample.sh`
script. This fixes up the comment formatting in the ceph-common
defaults and brings the group_vars sample back into sync.
Ken Dreyer [Thu, 5 Jan 2017 21:29:53 +0000 (14:29 -0700)]
ceph-common: always include release.yml
Prior to this change, a playbook run with '--tags' or '--skip-tags'
would fail, because the ceph-common role would not include the
release.yml task, and this file defines critical things like
ceph_release.
Thanks Andrew Schoen <aschoen@redhat.com> for help with the fix.
Sébastien Han [Fri, 16 Dec 2016 10:36:42 +0000 (11:36 +0100)]
common: do not become root on local task
There is no need to become root on local_action. This will event trigger
an error on some systems as it will try to run a sudo command. If the
current user does not have passwordless sudo, Ansible will fail. Anyway
using the current user is perfectly fine and no elevation privilege is
needed.
Logan V [Thu, 14 Jul 2016 19:27:03 +0000 (14:27 -0500)]
Add support for Keystone v3 API
The Keystone v2 APIs are deprecated and scheduled to be removed in
Q release of Openstack. This adds support for configuring RGW to
use the current Keystone v3 API.
Logan V [Thu, 14 Jul 2016 19:09:31 +0000 (14:09 -0500)]
Add a switch to disable nss PKI database initialization
The PKI keys are used to decrypt the Keystone revocation list when
PKI tokens are used. When UUID or Fernet token providers are used in
Keystone, PKI certs may not exist, so we now accommodate this scenario
by allowing the operator to disable the PKI tasks.
Logan V [Mon, 11 Jul 2016 12:52:11 +0000 (07:52 -0500)]
Add support for Keystone user authentication with RGW
Jewel added support for user/pass authentication with Keystone,
allowing deployers to disable Keystone admin token as required
for production deployments.
This implements configuration for the new RGW Keystone user/pass
authentication feature added in Jewel.
See docs here: http://docs.ceph.com/docs/master/radosgw/keystone/
common: do not regenerate initial mon keyring if cluster exists
This commit solves the situation where you lost your fetch directory and
you are running ansible against an existing cluster. Since no fetch
directory is present the file containing the initial mon keyring
doesn't exist so we are generating a new one.
Sébastien Han [Wed, 14 Dec 2016 18:03:04 +0000 (19:03 +0100)]
common: remove uncessary conditions and spell red hat entirely
We do not need to run another condition for 'ceph_rhcs' since the
include we came from already has it, so we are already inside this
condition.
We also spell red hat entirely instead of rh and we remove capital
letters.
Casey Bodley [Fri, 9 Dec 2016 15:41:54 +0000 (10:41 -0500)]
ceph-common: remove libcephfs1 from debian_ceph_packages
in hammer, ceph-common depended on libcephfs (indirectly, via
python-cephfs). this is no longer the case in jewel or later, so it can
be removed from debian_ceph_packages
Sébastien Han [Fri, 9 Dec 2016 13:51:35 +0000 (14:51 +0100)]
common: do not run tasks in main.yml, use include
For readibility and clarity we do not run any tasks directly in the
main.yml file. This file should only contain include, which helps us
later to apply conditionnals if we want to.
Logan V [Thu, 8 Dec 2016 19:16:02 +0000 (13:16 -0600)]
Fix the mons running check to use group name var
mon_group_name variable can be used to override mons group, but
this task assumes the group is always 'mons'. So we need to use
the var to find the group name instead.
Sébastien Han [Tue, 6 Dec 2016 14:59:49 +0000 (15:59 +0100)]
common: do not regenerate a cluster fsid if cluster exists
This commit solves the situation where you lost your fetch directory and
you are running ansible against an existing cluster. Since no fetch
directory is present the file containing the fsid doesn't exist so we
are creating a new one. Later the ceph.conf gets updated with a wrong
fsid which causes problems for clients and ceph processes.
Closes: #1148 Signed-off-by: Sébastien Han <seb@redhat.com>