Matthew Vernon [Tue, 11 Apr 2017 12:27:19 +0000 (13:27 +0100)]
Only assemble {{ cluster }}.conf and osd.conf
Ansible's assemble module by default will put all files in the src
directory together into dest. We only want to put {{ cluster }}.conf
and osd.conf together, not anything that might have found its way into
/etc/ceph/ceph.d (e.g. files left by the sysadmin taking backups
before an ansible run). So specify a regexp that matches only those
two files.
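The effect of such a filter can be sketched in Python, the language the assemble module is written in. This is a minimal illustration, not the regexp from the commit: the pattern, the helper name fragments_to_assemble, and the cluster name "ceph" are all assumptions.

```python
import re


def fragments_to_assemble(filenames, cluster="ceph"):
    """Keep only <cluster>.conf and osd.conf; ignore everything else.

    The pattern is a guess at what "matches only those two files" could
    look like; the commit's actual regexp is not shown in the log.
    """
    pattern = re.compile(r"^(%s|osd)\.conf$" % re.escape(cluster))
    return sorted(f for f in filenames if pattern.search(f))


# A backup left behind by a sysadmin no longer ends up in the result.
files = ["ceph.conf", "osd.conf", "ceph.conf.bak", "osd.conf~", "notes.txt"]
print(fragments_to_assemble(files))  # ['ceph.conf', 'osd.conf']
```

Anchoring the pattern with ^ and $ matters here: without the anchors, a backup such as ceph.conf.bak would still match.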
Christian Zunker [Wed, 15 Mar 2017 12:32:30 +0000 (13:32 +0100)]
Make ceph-common aware of osd config fragments
This removes the implicit order requirement when using OSD fragments.
When you use OSD fragments and ceph-osd role is not the last one,
the fragments get removed from ceph.conf by ceph-common.
It is not nice to have this code at two locations, but this is
necessary to prevent problems when ceph-osd is the last role, as
ceph-common gets executed before ceph-osd.
This could be avoided if ceph-common were explicitly called
at the end of the playbook.
Andrew Schoen [Tue, 7 Feb 2017 20:38:02 +0000 (14:38 -0600)]
purge-cluster: remove all include tasks
Including variables from role defaults or files in a group_vars
directory relative to the playbook is a bad practice. We don't want to
do this because including these defaults at the task level overrides
values that would be set in a group_vars directory relative to the
inventory file, which is the correct usage if you wish to override
those default values.
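The override problem described above can be sketched as a layered dictionary merge. This is a simplified model of variable precedence, assuming only three layers; the function name resolve and the values shown are illustrative, not taken from the playbook.

```python
def resolve(role_defaults, inventory_group_vars, included_vars=None):
    """Merge variable layers from lowest to highest precedence.

    Simplified model: role defaults lose to inventory group_vars, and
    anything included at the task level wins over both.
    """
    merged = dict(role_defaults)
    merged.update(inventory_group_vars)
    merged.update(included_vars or {})
    return merged


defaults = {"journal_size": 5120}   # hypothetical role default
inventory = {"journal_size": 1024}  # operator override next to the inventory

# Without the include, the operator's override is respected.
print(resolve(defaults, inventory)["journal_size"])            # 1024

# Including the role defaults at the task level puts them back on top.
print(resolve(defaults, inventory, defaults)["journal_size"])  # 5120
```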
Chris Wells [Sat, 28 Jan 2017 17:30:27 +0000 (12:30 -0500)]
Using ini_file with ansible_hostname to ensure each INI block gets the rgw_zone setting in a multi-RGW setup. Also, ansible_hostname better matches what ceph-common does for the actual hostname (ansible_host is not always equal to ansible_hostname).
Kyle Squizzato [Wed, 1 Feb 2017 18:29:45 +0000 (13:29 -0500)]
README: Don't use underscores for opts in ceph_conf_overrides
Using underscores in option names in the ceph_conf_overrides variable can
result in incorrect config options appearing. A note has been added to clarify
that using underscores here can cause this behavior and to recommend against it.
Sébastien Han [Mon, 30 Jan 2017 10:05:01 +0000 (11:05 +0100)]
common: create ceph initial directories
Some users purge their environments and leave them in a non-optimal state,
e.g. packages are still installed but /etc/ceph and /var/lib/ceph don't
exist anymore. This results in multiple failures across the play that are
sometimes hard to detect. Populating these directories "just in case"
should help us solve these problems.
Sébastien Han [Fri, 27 Jan 2017 14:40:41 +0000 (15:40 +0100)]
purge: do not stop ceph.target on each daemon
Doing this causes all the daemons to go down at the same time. In a
scenario where we colocate a monitor and an osd, the OSD will take
some time to go down, which will make the 'umount' task fail.
Sébastien Han [Fri, 27 Jan 2017 12:45:16 +0000 (13:45 +0100)]
purge: do not fail on purge ceph files
On systems running docker there is an issue with lxfs that results in
the find command returning 1 even though it actually did the job.
e.g. on a system with docker running, find /var will give us the
following error:
Sébastien Han [Fri, 27 Jan 2017 10:33:37 +0000 (11:33 +0100)]
purge: fix ubuntu purge when not using systemd
We now rely on the cli tool ceph-detect-init which will tell us the init
system in use on the distribution. We do this instead of the previous
lookup for systemd unit files to call the right task depending on the
init system.
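The dispatch described above can be sketched as a lookup from the detected init system to a task. The task names below are hypothetical placeholders, and the three keys assume that ceph-detect-init prints one of systemd, upstart or sysvinit on the platforms concerned.

```python
# Hypothetical task names; the playbook's real task names are not shown here.
INIT_TASKS = {
    "systemd": "stop ceph daemons with systemctl",
    "upstart": "stop ceph daemons with initctl",
    "sysvinit": "stop ceph daemons with service",
}


def purge_task_for(detected_init):
    """Pick the task matching the output of the init detection tool."""
    key = detected_init.strip()
    if key not in INIT_TASKS:
        raise ValueError("unsupported init system: %r" % key)
    return INIT_TASKS[key]


# Command output typically carries a trailing newline, hence the strip().
print(purge_task_for("upstart\n"))  # stop ceph daemons with initctl
```

Keying on the detected init system rather than on the presence of unit files avoids choosing the systemd path on a host that ships unit files but boots with another init.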
Sébastien Han [Fri, 27 Jan 2017 10:21:04 +0000 (11:21 +0100)]
purge: allow purge to run multiple times
with_items is evaluated before the when, so in a second run where the
variable is empty it will fail with "'dict object' has no attribute
'stdout_lines'". To fix this we add a default array so with_items does
not fail and the task is skipped with the when.
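The ordering problem can be sketched in plain Python: the item list is built first, and the condition is only checked per item afterwards. This is a rough model rather than Ansible's implementation; run_task and the sample register contents are illustrative.

```python
def run_task(register, condition):
    """Model a looped task: resolve the items first, then test each one.

    register.get(..., []) plays the role of the default array: a result
    without 'stdout_lines' yields an empty loop instead of an error.
    """
    items = register.get("stdout_lines", [])
    return [item for item in items if condition(item)]


first_run = {"stdout_lines": ["/dev/sdb1", "/dev/sdc1"]}
second_run = {"skipped": True}  # nothing was found, so no stdout_lines key

print(run_task(first_run, lambda item: True))   # ['/dev/sdb1', '/dev/sdc1']
print(run_task(second_run, lambda item: True))  # []
```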
Sébastien Han [Fri, 27 Jan 2017 10:10:21 +0000 (11:10 +0100)]
osd: make sure osd directory exists
Sometimes users, for testing, delete the whole /var/lib/ceph and
then run ansible again. OSDs will never come up if we do not create their
directory.
Andrew Schoen [Thu, 26 Jan 2017 18:07:42 +0000 (12:07 -0600)]
purge-cluster: fix failure when raw_multi_journal is not defined
Because the purge-cluster.yml playbook does not have access to the roles'
default vars, we cannot be sure that raw_multi_journal is defined. For
example, if this was purging a dmcrypt journal then raw_multi_journal
might not be defined at all in group_vars/all.yml or
group_vars/osds.yml.
Sébastien Han [Tue, 3 Jan 2017 12:48:59 +0000 (13:48 +0100)]
mon: make sure osd_pool_default_size is honoured
This patch makes sure we set the proper pool size on the rbd pool.
Usually during bootstrap the rbd pool size is not honoured so we need to
add this workaround.
Sébastien Han [Thu, 19 Jan 2017 14:28:44 +0000 (15:28 +0100)]
purge: remove dm-crypt devices
When running encrypted OSDs, an encrypted device mapper is used (it is
created by the cryptsetup tool). So before attempting to remove all the
partitions on a device we must delete all the encrypted device mappers,
then we can delete all the partitions.
Signed-off-by: Sébastien Han <seb@redhat.com>
Sébastien Han [Wed, 18 Jan 2017 09:55:01 +0000 (10:55 +0100)]
purge: remove zap_block_devs variable
The name of this variable was a bit confusing since its activation will
zap all the block devices no matter which osd scenario we are using.
Removing this variable and applying a condition on the OSD scenario is
now feasible and easier since we import group_vars variable files for
OSDs.
Andrew Schoen [Tue, 24 Jan 2017 15:06:10 +0000 (09:06 -0600)]
Adds ip_version configuration option
This allows the user to set ip_version to either ipv4 or ipv6. This
resolves a bug where monitor_address is set to an ipv6 address, but the
template fails to render because it's hardcoded to look for an 'ipv4'
key in the ansible facts.
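The lookup can be sketched as follows. The facts dictionary is a deliberately simplified, hypothetical shape (real Ansible facts are structured differently per address family), and monitor_address is an illustrative helper, not the template's code.

```python
def monitor_address(interface_facts, ip_version="ipv4"):
    """Look up the address under the configured family, not a fixed key."""
    if ip_version not in ("ipv4", "ipv6"):
        raise ValueError("ip_version must be 'ipv4' or 'ipv6'")
    return interface_facts[ip_version]["address"]


# Simplified, made-up facts for a host that only has an IPv6 address.
facts = {"ipv6": {"address": "fd00::10"}}

print(monitor_address(facts, ip_version="ipv6"))  # fd00::10

# With the key hardcoded to 'ipv4', the same host cannot be rendered.
try:
    monitor_address(facts)
except KeyError as missing:
    print("lookup failed for key", missing)
```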
Sébastien Han [Thu, 19 Jan 2017 13:35:00 +0000 (14:35 +0100)]
mon: fix mds pool creation
It is not enough to check that the mds variable exists: it always does
because we declare it. So we need to make sure that there is an
mds host.
Sébastien Han [Mon, 5 Dec 2016 13:21:54 +0000 (14:21 +0100)]
mon: pool creation and pgs
Since we introduced config_overrides we removed a lot of options from
the default template. In some cases, like mds pool, openstack pools etc
we need to know the amount of PGs required. The idea here is to skip the
task if ceph_conf_overrides.global.osd_pool_default_pg_num is not defined
in your `group_vars/all.yml`.
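The skip condition amounts to a guarded nested lookup, sketched here in Python. The helper names are illustrative and the dictionary stands in for the variables loaded from `group_vars/all.yml`; the value 128 is only an example.

```python
def pg_num_or_none(group_vars):
    """Return osd_pool_default_pg_num if every level of the path exists."""
    overrides = group_vars.get("ceph_conf_overrides", {})
    return overrides.get("global", {}).get("osd_pool_default_pg_num")


def should_create_pools(group_vars):
    """Pool creation is skipped unless a PG count has been provided."""
    return pg_num_or_none(group_vars) is not None


configured = {"ceph_conf_overrides": {"global": {"osd_pool_default_pg_num": 128}}}
unconfigured = {"ceph_conf_overrides": {"global": {}}}

print(should_create_pools(configured))    # True
print(should_create_pools(unconfigured))  # False
print(should_create_pools({}))            # False
```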
Closes: #1145
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ddac3a1fb5e319d6f778256504c2787746454c79)
Andrew Schoen [Tue, 10 Jan 2017 22:56:07 +0000 (16:56 -0600)]
tests: copy purge-cluster.yml to root of ceph-ansible
There is an Ansible bug which makes the playbook fail when we run
a playbook from a directory other than the git root. The real problem is
that the ansible.cfg is not honoured and we are including variables from
roles/<role>/defaults/main.yml
The fix is to copy the purge cluster playbook to the git root directory
and execute it.
Andrew Schoen [Tue, 10 Jan 2017 22:57:58 +0000 (16:57 -0600)]
purge-cluster: do not include ceph-osd and ceph-common defaults for osds
When purging OSDs we do not need to include these defaults as nothing in
the following tasks uses them. Also, it has the side effect of
overwriting any variables defined in group_vars files that are relative
to the inventory you are using with the default values. That behavior
was causing the CI tests to fail.
Andrew Schoen [Thu, 22 Dec 2016 19:47:22 +0000 (13:47 -0600)]
purge-cluster: get journal partitions after zapping osd disks
In my testing zapping the osd disks deleted the journal
partitions, making the 'zap ceph journal partitions' task fail because
the partitions it found previously do not exist anymore.
This moves the task that finds the journal partitions after 'zap osd disks'
to catch any partitions ceph-disk might have missed.