git-server-git.apps.pok.os.sepia.ceph.com Git - ceph-ansible.git/log


commit | commitdiff | tree

Guillaume Abrioux [Wed, 16 May 2018 15:34:38 +0000 (17:34 +0200)]

purge_cluster: wipe all partitions

In order to ensure there is no leftover after having purged a cluster,
we must wipe all partitions properly.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492242
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a9247c4de78dec8a63f17400deb8b06ce91e7267)

commit | commitdiff | tree

Guillaume Abrioux [Wed, 16 May 2018 14:04:25 +0000 (16:04 +0200)]

purge_cluster: fix bug when building device list

there is some leftover on devices when purging osds because of an invalid
device list construction.

typical error:
```
changed: [osd3] => (item=/dev/sda sda1) => {
    "changed": true,
    "cmd": "# if the disk passed is a raw device AND the boot system disk\n if parted -s \"/dev/sda sda1\" print | grep -sq boot; then\n echo \"Looks like /dev/sda sda1 has a boot partition,\"\n echo \"if you want to delete specific partitions point to the partition instead of the raw device\"\n echo \"Do not use your system disk!\"\n exit 1\n fi\n echo sgdisk -Z \"/dev/sda sda1\"\n echo dd if=/dev/zero of=\"/dev/sda sda1\" bs=1M count=200\n echo udevadm settle --timeout=600",
    "delta": "0:00:00.015188",
    "end": "2018-05-16 12:41:40.408597",
    "item": "/dev/sda sda1",
    "rc": 0,
    "start": "2018-05-16 12:41:40.393409"
}

STDOUT:

sgdisk -Z /dev/sda sda1
dd if=/dev/zero of=/dev/sda sda1 bs=1M count=200
udevadm settle --timeout=600

STDERR:

Error: Could not stat device /dev/sda sda1 - No such file or directory.
```

the devices list in the task `resolve parent device` isn't built
properly because the command used to resolve the parent device doesn't
return the expected output

eg:

```
changed: [osd3] => (item=/dev/sda1) => {
    "changed": true,
    "cmd": "echo /dev/$(lsblk -no pkname \"/dev/sda1\")",
    "delta": "0:00:00.013634",
    "end": "2018-05-16 12:41:09.068166",
    "item": "/dev/sda1",
    "rc": 0,
    "start": "2018-05-16 12:41:09.054532"
}

STDOUT:

/dev/sda sda1
```

For instance, it will result in a devices list like:
`['/dev/sda sda1', '/dev/sdb', '/dev/sdc sdc1']`
where we expect to have:
`['/dev/sda', '/dev/sdb', '/dev/sdc']`

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1492242
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9cad113e2f22132d08208cd58462f11056c41305)
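
The malformed entries appear to come from `lsblk` printing more than one line, which the unquoted `echo` then joins with a space. A minimal Python sketch of a more defensive parse follows. `parent_device` is a hypothetical helper used only for illustration, not the fix applied in this commit, and it assumes the parent name is the first non-empty line of the output.

```python
def parent_device(lsblk_output: str) -> str:
    """Return /dev/<parent> from raw `lsblk -no pkname` output.

    Assumption: the parent kernel name is the first non-empty line.
    """
    for line in lsblk_output.splitlines():
        name = line.strip()
        if name:
            return f"/dev/{name}"
    raise ValueError("no parent device name in lsblk output")


# Hypothetical raw outputs, modelled on the log above.
raw_outputs = ["sda\nsda1\n", "sdb\n", "sdc\nsdc1\n"]
devices = [parent_device(out) for out in raw_outputs]
print(devices)
```

Taking a single line per device keeps every list entry a valid path, which is what the later `sgdisk` and `dd` calls need.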

commit | commitdiff | tree

Sébastien Han [Fri, 18 May 2018 12:43:57 +0000 (14:43 +0200)]

defaults: restart_osd_daemon unit spaces

Extra space in systemctl list-units can cause restart_osd_daemon.sh to
fail

It looks like, when more services are enabled on the node, the gap
between "loaded" and "active" grows to more than the single space
expected by the command[1].

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1573317
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 2f43e9dab5f077276162069f449978ea97c2e9c0)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
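
The spacing issue can be illustrated with a whitespace-tolerant match. This is a sketch in Python, not the shell code in restart_osd_daemon.sh; the pattern and the sample unit lines are assumptions made for the example.

```python
import re

# Hypothetical pattern: one or more whitespace characters between the
# "loaded" and "active" columns instead of exactly one space.
LOADED_ACTIVE = re.compile(r"\bloaded\s+active\b")


def is_loaded_active(line: str) -> bool:
    """True when the line shows a unit that is both loaded and active."""
    return LOADED_ACTIVE.search(line) is not None


# Illustrative unit lines; the unit names are made up.
narrow = "ceph-osd@0.service loaded active running Ceph object storage daemon"
wide = "ceph-osd@1.service    loaded    active running Ceph object storage daemon"
print(is_loaded_active(narrow), is_loaded_active(wide))
```

Matching on `\s+` rather than a literal space makes the check independent of how wide `systemctl list-units` decides to make its columns.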

commit | commitdiff | tree

Michael Vollman [Thu, 17 May 2018 19:17:29 +0000 (15:17 -0400)]

Do nothing when mgr module is in good state

Check whether a mgr module is supposed to be disabled before disabling
it and whether it is already enabled before enabling it.

Signed-off-by: Michael Vollman <michael.b.vollman@gmail.com>
(cherry picked from commit ed050bf3f682e74d9453451276d10af8c6b5947f)
Signed-off-by: Sébastien Han <seb@redhat.com>
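
The idea of acting only on modules whose state differs can be sketched as a set difference. `plan_mgr_modules` is a hypothetical helper and the module names are only examples; the playbook itself expresses this with task conditions.

```python
def plan_mgr_modules(enabled, wanted):
    """Return (to_enable, to_disable); modules already in the right state are skipped."""
    enabled, wanted = set(enabled), set(wanted)
    to_enable = sorted(wanted - enabled)
    to_disable = sorted(enabled - wanted)
    return to_enable, to_disable


# Example: "status" is already in the desired state, so it is left alone.
print(plan_mgr_modules(["status", "restful"], ["status", "dashboard"]))
```

When both lists come back empty, nothing has to run, so a second pass over an unchanged cluster reports no changes.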

commit | commitdiff | tree

Guillaume Abrioux [Thu, 17 May 2018 15:29:20 +0000 (17:29 +0200)]

take-over: fix bug when trying to override variable

A customer has been facing an issue when trying to override
`monitor_interface` in the inventory host file.
In their use case, all nodes had the same interface name for
`monitor_interface` except one. Therefore, they tried to override
this variable for that node in the inventory host file, but the
take-over-existing-cluster playbook was failing when trying to generate
the new ceph.conf file because of an undefined variable.

Typical error:

```
fatal: [srvcto103cnodep01]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute u'ansible_bond0.15'"}
```

Including variables like this `include_vars: group_vars/all.yml` prevents
us from overriding anything in the inventory host file because it
overwrites everything you would have defined in inventory.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1575915
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 415dc0a29b10b28cbd047fe28eb4dd38419ea5dc)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
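
The precedence problem can be modelled as a dictionary merge, as a rough sketch. `effective_vars` is a hypothetical function, `eth1` is an invented interface name, and the model only captures the one fact that matters here: variables loaded with `include_vars` rank above inventory host variables in Ansible.

```python
def effective_vars(inventory_host_vars, all_yml_vars, loaded_with_include_vars):
    """Model which value a host ends up with for each variable."""
    if loaded_with_include_vars:
        # include_vars outranks inventory host variables, so all.yml wins.
        return {**inventory_host_vars, **all_yml_vars}
    # Loaded as ordinary group variables, all.yml ranks below host variables.
    return {**all_yml_vars, **inventory_host_vars}


host = {"monitor_interface": "eth1"}         # hypothetical per-host override
all_yml = {"monitor_interface": "bond0.15"}  # value shared by the other nodes
print(effective_vars(host, all_yml, True)["monitor_interface"])
print(effective_vars(host, all_yml, False)["monitor_interface"])
```

With `include_vars` the host keeps the shared interface name, which does not exist on that node, hence the undefined `ansible_bond0.15` fact.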

commit | commitdiff | tree

Sébastien Han [Wed, 16 May 2018 14:02:41 +0000 (16:02 +0200)]

rolling_update: move osd flag section

During a minor update from a jewel to a higher jewel version (10.2.9 to
10.2.10 for example) osd flags don't get applied because they were done
in the mgr section which is skipped in jewel since this daemon does not
exist.
Moving the set flag section after all the mons have been updated solves
that problem.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1548071
Co-authored-by: Tomas Petr <tpetr@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit d80a871a078a175d0775e91df00baf625dc39725)

commit | commitdiff | tree

Guillaume Abrioux [Thu, 3 May 2018 19:36:21 +0000 (21:36 +0200)]

client: remove default value for pg_num in pools creation

trying to set the default value for pg_num to
`hostvars[groups[mon_group_name][0]]['osd_pool_default_pg_num']` will
break in case of external client nodes deployment.
the `pg_num` attribute should be mandatory and be tested in future
`ceph-validate` role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit f60b049ae53bbf54dd550587e84b986fef15fbe6)
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 10 May 2018 17:38:55 +0000 (10:38 -0700)]

rolling_update: move mgr key creation

Until all the mons have been updated to Luminous, there is no way to
create a key. So we should do the key creation in the mon role only if
we are not part of an update.
If we are then the key creation is done after the mons upgrade to
Luminous.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 52fc8a0385a7bc58b8b33fc0c5e05db1a03c5c1f)
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 10 May 2018 17:02:44 +0000 (10:02 -0700)]

Revert "mon: fix mgr keyring creation when upgrading from jewel"

This reverts commit 259fae931d77f056b7e1077b023710cfab1e5cca.

(cherry picked from commit e810fb217f1b78df4039ee50593b8c770fb70dde)
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Tue, 15 May 2018 09:41:26 +0000 (11:41 +0200)]

rolling_update: fix dest path for mgr keys fetching

the role `ceph-mgr` that is played later in the playbook fails because
the destination path for the fetched keys is wrong.
This patch fixes the destination path used in the task `fetch ceph mgr
key(s)` so there is no mismatch.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 1b4c3f292d8779158ea445a8c9a11c8ed26abe11)
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Mon, 14 May 2018 15:39:25 +0000 (17:39 +0200)]

iscsi-gw: fix issue when trying to mask target

trying to mask target when `/etc/systemd/system/target.service` doesn't
exist seems to be a bug.
There is no need to mask a unit file which doesn't exist.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a145caf947aec64467150a007b7aafe57abe2891)

commit | commitdiff | tree

Sébastien Han [Mon, 14 May 2018 07:21:48 +0000 (09:21 +0200)]

iscsi: add python-rtslib repository

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 8c7c11b774f54078b32b652481145699dbbd79ff)

commit | commitdiff | tree

Andy McCrae [Thu, 10 May 2018 10:15:30 +0000 (11:15 +0100)]

Allow os_tuning_params to overwrite fs.aio-max-nr

The order of fs.aio-max-nr (which is hard-coded to 1048576) means that
if you set fs.aio-max-nr in os_tuning_params it will effectively be
ignored for bluestore scenarios.

To resolve this we should move the setting of fs.aio-max-nr above the
setting of os_tuning_params, in this way the operator can define the
value of fs.aio-max-nr to be something other than 1048576 if they want
to.

Additionally, we can make the sysctl settings happen in 1 task rather
than multiple.

(cherry picked from commit 08a2b58d39a687e25436afdf3fda1591d3be8ca1)
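
The effect of the ordering can be sketched with a dictionary where the last write wins. `build_sysctl_settings` is a hypothetical helper, and the `name`/`value` shape of the `os_tuning_params` entries is an assumption made for this example.

```python
def build_sysctl_settings(os_tuning_params, bluestore=True):
    """Apply the hard-coded default first so operator values can override it."""
    settings = {}
    if bluestore:
        settings["fs.aio-max-nr"] = "1048576"  # hard-coded default from the message
    for param in os_tuning_params:
        # Later entries win, so an operator-supplied fs.aio-max-nr takes effect.
        settings[param["name"]] = str(param["value"])
    return settings


print(build_sysctl_settings([{"name": "fs.aio-max-nr", "value": 2097152}]))
```

Collecting everything into one mapping also mirrors the second point of the message: the settings can then be applied by a single task.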

commit | commitdiff | tree

Gregory Meno [Wed, 9 May 2018 18:17:26 +0000 (11:17 -0700)]

adds missing state needed to upgrade nfs-ganesha

in tasks for os_family Red Hat we were missing this

fixes: bz1575859
Signed-off-by: Gregory Meno <gmeno@redhat.com>
(cherry picked from commit 26f6a650425517216fb57c08e1a8bda39ddcf2b5)
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Wed, 9 May 2018 12:42:27 +0000 (14:42 +0200)]

mon: fix mgr keyring creation when upgrading from jewel

On containerized deployment,
when upgrading from jewel to luminous, mgr keyring creation fails because the
command to create mgr keyring is executed on a container that is still
running jewel since the container is restarted later to run the new
image, therefore, it fails with bad entity error.

To get around this situation, we can delegate the command to create
these keyrings on the first monitor when we are running the playbook on the last monitor.
That way we ensure we will issue the command on a container that has
been properly restarted with the new image.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574995
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Wed, 9 May 2018 01:10:30 +0000 (03:10 +0200)]

osd: clean legacy syntax in ceph-osd-run.sh.j2

Quick clean on a legacy syntax due to e0a264c7e

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Simone Caronni [Thu, 5 Apr 2018 14:14:23 +0000 (16:14 +0200)]

Make sure the restart_mds_daemon script is created with the correct MDS name

commit | commitdiff | tree

Sébastien Han [Tue, 8 May 2018 14:11:14 +0000 (07:11 -0700)]

common: enable Tools repo for rhcs clients

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1574458
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Andy McCrae [Thu, 22 Mar 2018 12:19:22 +0000 (12:19 +0000)]

Fix install of nfs-ganesha-ceph for Debian/SuSE

The Debian and SuSE installs for nfs-ganesha on the non-rhcs repository
require you to set allow_unauthenticated for Debian, and disable_gpg_check
for SuSE. The nfs-ganesha-rgw package already does this, but the
nfs-ganesha-ceph package will fail to install because of this same
issue.

This PR moves the installations to happen when the appropriate flags are
set to True (nfs_obj_gw & nfs_file_gw), but does it per distro (one for
SuSE and one for Debian) so that the appropriate flag can be passed to
ignore the GPG check.

commit | commitdiff | tree

Guillaume Abrioux [Thu, 3 May 2018 16:41:16 +0000 (18:41 +0200)]

playbook: improve facts gathering

there is no need to gather facts in an O(N^2) way.
Only one node should gather facts from the other nodes.

Fixes: #2553
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
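
The cost difference is simple arithmetic, sketched below. `fact_gathering_runs` is a hypothetical function that only counts gather operations under the assumption that each node otherwise gathers facts from every node.

```python
def fact_gathering_runs(node_count: int, single_gatherer: bool) -> int:
    """Count fact-gathering operations for a cluster of `node_count` nodes."""
    if single_gatherer:
        return node_count            # one node gathers from each node once
    return node_count * node_count   # every node gathers from every node


for nodes in (10, 100):
    print(nodes, fact_gathering_runs(nodes, False), fact_gathering_runs(nodes, True))
```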

commit | commitdiff | tree

Ramana Raja [Thu, 3 May 2018 12:10:13 +0000 (17:40 +0530)]

ceph-nfs: disable attribute caching

When 'ceph_nfs_disable_caching' is set to True, disable attribute
caching done by Ganesha for all Ganesha exports.

Signed-off-by: Ramana Raja <rraja@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 3 May 2018 14:54:53 +0000 (16:54 +0200)]

common: copy iso files if rolling_update

If we are in the middle of an update we want to get the new package
version being installed so the task that copies the repo files should
not be skipped.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1572032
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Andy McCrae [Thu, 26 Apr 2018 09:42:11 +0000 (10:42 +0100)]

Move apt cache update to individual task per role

The apt-cache update can fail due to transient issues related to the
action being a network operation. To reduce the impact of these
transient failures this patch adds a retry to the update_cache task.

However, the apt_repository tasks which would perform an apt_update
won't retry the apt_update on a failure in the same way, as such this PR
moves the apt_update into an individual task, once per role.

Finally, the apt_repository tasks no longer have a changed_when: false,
and the apt_cache update is only performed once per role, if the
repositories change. Otherwise the cache is updated on the "apt" install
tasks if the cache_timeout has been reached.
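
The retry behaviour described above can be sketched generically. `run_with_retries` is a hypothetical helper, and `OSError` stands in for whatever transient network failure the cache update hits; in the playbook the same idea is expressed with task-level retry settings.

```python
import time


def run_with_retries(action, retries=3, delay=0.0):
    """Call `action` until it succeeds or `retries` attempts are used up."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return action()
        except OSError as exc:  # stand-in for a transient network failure
            last_error = exc
            if attempt < retries:
                time.sleep(delay)
    raise last_error
```

Keeping the cache update in its own step is what makes it retryable on its own, without repeating the repository configuration around it.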

commit | commitdiff | tree

Guillaume Abrioux [Mon, 30 Apr 2018 18:53:42 +0000 (20:53 +0200)]

client: fix pool creation

the value in `docker_exec_client_cmd` doesn't allow checking for
existing pools because it's set with a wrong value for the entrypoint
that is going to be used.
It means the check was going to fail anyway even if pools actually exist.

Using jinja syntax to set `docker_exec_cmd` allows to handle the case
where you don't have monitors in your inventory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 26 Apr 2018 17:55:48 +0000 (19:55 +0200)]

mon: change application pool support

If openstack_pools contains an application key it will be used to apply
this application pool type to a pool.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1562220
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Fri, 27 Apr 2018 12:48:33 +0000 (14:48 +0200)]

check if pools already exist before creating them

Add a task to check if pools already exist before we create them.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Wed, 25 Apr 2018 15:33:35 +0000 (17:33 +0200)]

tests: update the type for the rule used in pools

As of ceph 12.2.5 the type of the parameter `type` is not a name anymore but
an id, therefore an `int` is expected otherwise it will fail with the
following error

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Wed, 25 Apr 2018 12:20:35 +0000 (14:20 +0200)]

switch: fix ceph_uid fact for osd

In addition to b324c17 this commit fixes the ceph uid for osd role in the
switch from non containerized to containerized playbook.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 19 Apr 2018 12:45:03 +0000 (14:45 +0200)]

switch: resolve device path so we can umount the osd data dir

If we don't do this, umounting devices declared like this
/dev/disk/by-id/ata-QEMU_HARDDISK_QM00001

will fail like:

umount: /dev/disk/by-id/ata-QEMU_HARDDISK_QM000011: mountpoint not found

Since we append '1' (partition 1), this won't work.
So we need to resolve the link to get something like /dev/sdb and then
append 1 to get /dev/sdb1

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Guillaume Abrioux <gabrioux@redhat.com>
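
Resolving the link before appending the partition number can be sketched as follows. `first_partition` is a hypothetical helper, the demonstration uses a temporary symlink rather than a real /dev entry, and it assumes a bare "1" suffix as for /dev/sdX disks.

```python
import os
import tempfile


def first_partition(device_path: str) -> str:
    """Resolve symlinks first, then append the partition number.

    Assumption: the partition suffix is a bare "1", as for /dev/sdX disks.
    """
    return os.path.realpath(device_path) + "1"


# Demonstration with a temporary symlink instead of a real /dev entry.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "sdb")
    open(target, "w").close()
    link = os.path.join(tmp, "ata-QEMU_HARDDISK_QM00001")
    os.symlink(target, link)
    print(first_partition(link).endswith("sdb1"))
```

Appending to the unresolved by-id path would instead yield a name ending in `QM000011`, which is the failing mountpoint shown in the message.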

commit | commitdiff | tree

Sébastien Han [Thu, 19 Apr 2018 08:28:56 +0000 (10:28 +0200)]

switch: fix ceph_uid fact

Latest is now centos not ubuntu anymore so the condition was wrong.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Fri, 27 Apr 2018 11:19:25 +0000 (13:19 +0200)]

Revert "add .vscode/ to gitignore"

This reverts commit 3c4319ca4b5355d69b2925e916420f86d29ee524.

commit | commitdiff | tree

Sébastien Han [Mon, 23 Apr 2018 08:02:16 +0000 (10:02 +0200)]

mon/client: honor key mode when copying it to other nodes

The last mon creates the keys with a particular mode, while copying them
to the other mons (first and second) we must re-use the mode that was
set.

The same applies for the client node, the slurp preserves the initial
'item' so we can get the mode for the copy.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Mon, 23 Apr 2018 08:01:23 +0000 (10:01 +0200)]

ci: bump client nodes to 2

In order to test the key distribution is correct we must have 2 client
nodes.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Mon, 23 Apr 2018 07:52:18 +0000 (09:52 +0200)]

mon: remove redundant copy task

We had the same task twice, also one was overriding the mode.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Fri, 20 Apr 2018 14:44:41 +0000 (16:44 +0200)]

mon/client: remove acl code

Applying ACL on the keyrings is not used anymore so let's remove this
code.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Fri, 20 Apr 2018 14:37:05 +0000 (16:37 +0200)]

mon/client: apply mode from ceph_key

Do not use a dedicated task for this but use the ceph_key module
capability to set file mode.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Fri, 20 Apr 2018 14:35:39 +0000 (16:35 +0200)]

ceph_key: ability to apply a mode to a file

You can now create keys and set file mode on them. Use the 'mode'
parameter for that, mode must be in octal so 0644.

Signed-off-by: Sébastien Han <seb@redhat.com>
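
Why the mode has to be read as octal can be shown with a short sketch. `apply_mode` is a hypothetical helper, not the ceph_key implementation, and the example assumes a POSIX filesystem.

```python
import os
import stat
import tempfile


def apply_mode(path: str, mode: str) -> None:
    """Apply a permission string such as "0644", read as octal."""
    os.chmod(path, int(mode, 8))


with tempfile.NamedTemporaryFile() as keyring:
    apply_mode(keyring.name, "0644")
    print(oct(stat.S_IMODE(os.stat(keyring.name).st_mode)))
```

Read as a decimal number, 644 would correspond to the octal value 1204, which is not the intended rw-r--r-- permission set.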

commit | commitdiff | tree

Di Xu [Mon, 23 Apr 2018 02:08:48 +0000 (10:08 +0800)]

add AArch64 to supported architecture

works on AArch64 platform

commit | commitdiff | tree

Sébastien Han [Thu, 19 Apr 2018 16:54:53 +0000 (18:54 +0200)]

mon: remove mgr key from ceph_config_keys

This key is created after the last mon is up so there is no need to try
to push it from the first mon. The initial mon container is not creating
the mgr key, ansible does. So this key will never exist.
The key will go into the fetch dir once the last mon is up, then when
the ceph-mgr plays it will try to get it from the fetch directory.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 19 Apr 2018 16:40:16 +0000 (18:40 +0200)]

mon: remove mon map from ceph_config_keys

During the initial bootstrap of the first mon, the monmap file is
destroyed so it's not available and ansible will never find it.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Sat, 31 Mar 2018 10:43:42 +0000 (12:43 +0200)]

config_template: resync with upstream

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Wed, 28 Mar 2018 19:52:40 +0000 (21:52 +0200)]

ci: test ansible 2.5

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 12 Apr 2018 13:52:30 +0000 (15:52 +0200)]

Expose /var/run/ceph

Useful for software that does data collection/monitoring like collectd.
They can connect to the socket and then retrieve information.

Even though the sockets are exposed now, I'm keeping the docker exec to
check the socket, this will allow newer version of ceph-ansible to work
with older versions.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1563280
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Fri, 13 Apr 2018 17:42:17 +0000 (19:42 +0200)]

default: extend ceph_uid and gid

We now have the ability to detect the uid/gid of the ceph user depending
on the distribution we are running on and so we are doing non-container
deployments.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Fri, 13 Apr 2018 15:56:06 +0000 (17:56 +0200)]

move create ceph initial directories to default

This is needed for both non-container and container deployments.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Fri, 20 Apr 2018 09:13:51 +0000 (11:13 +0200)]

shrink-osd: ability to shrink NVMe drives

Now if the service name contains nvme we know we need to remove the last
2 characters instead of 1.

If nvme then osd_to_kill_disks is nvme0n1, we need nvme0
If ssd or hdd then osd_to_kill_disks is sda1, we need sda

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1561456
Signed-off-by: Sébastien Han <seb@redhat.com>
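
The rule stated above can be sketched in a few lines. `parent_disk` is a hypothetical helper that follows the two examples in the message literally; the playbook performs the trimming in its own task logic.

```python
def parent_disk(osd_to_kill_disk: str) -> str:
    """Drop the trailing 2 characters for NVMe names, otherwise 1."""
    strip = 2 if "nvme" in osd_to_kill_disk else 1
    return osd_to_kill_disk[:-strip]


for name in ("nvme0n1", "sda1"):
    print(name, parent_disk(name))
```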

commit | commitdiff | tree

Sébastien Han [Tue, 17 Apr 2018 13:32:53 +0000 (15:32 +0200)]

selinux: remove chcon calls

We now bindmount with the :z option at the end of the -v command so
this will basically run the exact same command as we used to run. So to
speak:

chcon -Rt svirt_sandbox_file_t /var/lib/ceph

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Tue, 17 Apr 2018 12:16:41 +0000 (14:16 +0200)]

client: add a --rm option to run the container

This fixes the case where the playbook died and never removed the
container. So now, once the container exits it will remove itself from
the container list.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1568157
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Wed, 18 Apr 2018 13:44:36 +0000 (15:44 +0200)]

client: import the key in ceph if copy_admin_key is true

If the user has set copy_admin_key to true we assume he/she wants to
import the key in Ceph and not only create the key on the filesystem.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Wed, 18 Apr 2018 13:11:55 +0000 (15:11 +0200)]

client: add quotes to the dict values

ceph-authtool does not support raw arguments so we have to quote caps
declarations like this: allow 'bla bla' instead of allow bla bla

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1568157
Signed-off-by: Sébastien Han <seb@redhat.com>
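
The quoting requirement can be sketched with the standard `shlex.quote` function. `cap_arguments` is a hypothetical helper and the capability strings are examples; only the `--cap` option of ceph-authtool is assumed here.

```python
import shlex


def cap_arguments(caps: dict) -> str:
    """Build `--cap <entity> <capability>` arguments with shell-safe quoting."""
    parts = []
    for entity, capability in caps.items():
        parts.append(f"--cap {shlex.quote(entity)} {shlex.quote(capability)}")
    return " ".join(parts)


print(cap_arguments({"mon": "allow r", "osd": "allow rwx pool=images"}))
```

Without the quotes, each word of a capability would reach the tool as a separate argument.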

commit | commitdiff | tree

Andy McCrae [Wed, 21 Mar 2018 15:57:00 +0000 (15:57 +0000)]

Add support for --diff in config_template

Add support for the Ansible --diff mode in config_template. This will
show the before/after for config_template changes, in the same way as
the base copy and template modules do.

To utilise this run your playbooks with "--diff --check".
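
The kind of before/after output this enables can be illustrated with Python's `difflib`. This is only a sketch of what a rendered diff looks like, not the code in config_template; `config_diff` and the sample settings are assumptions made for the example.

```python
import difflib


def config_diff(before: str, after: str, name: str = "ceph.conf") -> str:
    """Render a unified diff between two versions of a config file."""
    lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"before: {name}",
        tofile=f"after: {name}",
    )
    return "".join(lines)


before = "[global]\nosd_pool_default_size = 2\n"
after = "[global]\nosd_pool_default_size = 3\n"
print(config_diff(before, after))
```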

commit | commitdiff | tree

Sébastien Han [Wed, 11 Apr 2018 15:15:29 +0000 (17:15 +0200)]

refactor the way we copy keys

This commit does a couple of things:

* use a common.yml file that contains things that can be played on both
container and non-container

* refactor the ability to copy the admin key to the nodes

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Randy J. Martinez [Thu, 29 Mar 2018 04:17:02 +0000 (23:17 -0500)]

ceph-defaults: fix ceph_uid fact on container deployments

Red Hat is now using tags[3,latest] for image rhceph/rhceph-3-rhel7.
Because of this, the ceph_uid conditional passes for Debian
when 'ceph_docker_image_tag: latest' on RH deployments.
I've added an additional task to check for rhceph image specifically,
and also updated the RH family task for ceph/daemon [centos|fedora]tags.

Signed-off-by: Randy J. Martinez <ramartin@redhat.com>

commit | commitdiff | tree

Sébastien Han [Tue, 17 Apr 2018 13:59:52 +0000 (15:59 +0200)]

rhcs: re-add apt-pinning

When installing rhcs on Debian systems the Red Hat repos must have the
highest priority so we avoid package conflicts and install the rhcs
version.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1565850
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Mon, 9 Apr 2018 16:07:31 +0000 (18:07 +0200)]

defaults: check only 1 time if there is a running cluster

There is no need to check for a running cluster n*nodes times in
`ceph-defaults` so let's add a `run_once: true` to save some resources
and time.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Tue, 10 Apr 2018 13:30:16 +0000 (15:30 +0200)]

site: make it more readable

These conditions introduced by d981c6bd2 were insane.
This should be a bit easier to read.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Sébastien Han [Fri, 13 Apr 2018 14:36:43 +0000 (16:36 +0200)]

osd: do not do anything if the dev has a partition

Regardless of whether the partition is 'ceph' or something else, we don't want
to be as strict as checking for a particular partition.
If the drive has a partition, we just don't do anything.

This solves the case where the server reboots, disks get a different
/dev/sda (node) allocation. In this case, prior to restarting the server
/dev/sda was an OSD, but now it's /dev/sdb and the other way around.
In such a scenario, we will try to prepare the OSD and create a new
partition, so let's not mess around with devices that have partitions.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1498303
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Thu, 12 Apr 2018 07:55:25 +0000 (09:55 +0200)]

tests: update tests for mds to cover multimds case

in case of multimds we must check for the number of mds up instead of
just checking if the hostname of the node is in the fsmap.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 12 Apr 2018 10:15:35 +0000 (12:15 +0200)]

common: add tools repo for iscsi gw

To install iscsi gw packages we need to enable the tools repo.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1547849
Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Douglas Fuller [Wed, 4 Apr 2018 18:23:25 +0000 (14:23 -0400)]

Remove deprecated allow_multimds

allow_multimds will be officially deprecated in Mimic, specify it
only for all versions of Ceph where it was declared stable. Going
forward, specify only max_mds.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>

commit | commitdiff | tree

vasishta p shastry [Tue, 10 Apr 2018 13:37:35 +0000 (19:07 +0530)]

Fixed a typo (extra space)

commit | commitdiff | tree

vasishta p shastry [Tue, 10 Apr 2018 13:21:50 +0000 (18:51 +0530)]

osd: to support copy_admin_key

commit | commitdiff | tree

vasishta p shastry [Tue, 10 Apr 2018 12:39:43 +0000 (18:09 +0530)]

mds: to support copy_admin_keyring

commit | commitdiff | tree

vasishta p shastry [Tue, 10 Apr 2018 12:37:11 +0000 (18:07 +0530)]

nfs: to support copy_admin_key - containerized

commit | commitdiff | tree

Ali Maredia [Mon, 2 Apr 2018 17:47:31 +0000 (13:47 -0400)]

nfs: ensure nfs-server is stopped

NFS-ganesha cannot start if the nfs-server service
is running. This commit stops nfs-server in case it
is running on a (debian, redhat, suse) node before
the nfs-ganesha service starts up.

fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506

Signed-off-by: Ali Maredia <amaredia@redhat.com>

commit | commitdiff | tree

Ramana Raja [Mon, 9 Apr 2018 12:03:33 +0000 (17:33 +0530)]

ceph-nfs: allow disabling ganesha caching

Add a variable, ceph_nfs_disable_caching, that if set to true
disables ganesha's directory and attribute caching as much as
possible.

Also, disable caching done by ganesha, when 'nfs_file_gw'
variable is true, i.e., when Ganesha is used as CephFS's gateway.
This is the recommended Ganesha setting as libcephfs already caches
information. And doing so helps avoid cache incoherency issues
especially with clustered ganesha over CephFS.

Fixes: https://tracker.ceph.com/issues/23393
Signed-off-by: Ramana Raja <rraja@redhat.com>

commit | commitdiff | tree

Sébastien Han [Tue, 10 Apr 2018 13:39:44 +0000 (15:39 +0200)]

ceph-defaults: bring backward compatibility for old syntax

If people keep on using the mon_cap, osd_cap etc. the playbook will
translate this old syntax on the fly.

Signed-off-by: Sébastien Han <seb@redhat.com>
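
An on-the-fly translation of this kind could look roughly like the sketch below. `translate_key` is a hypothetical function, and the target shape (a single `caps` mapping per key) is an assumption; the message itself only names the legacy `mon_cap` and `osd_cap` style.

```python
LEGACY_SUFFIX = "_cap"


def translate_key(key: dict) -> dict:
    """Fold legacy `<entity>_cap` entries into a single `caps` mapping."""
    translated = {k: v for k, v in key.items() if not k.endswith(LEGACY_SUFFIX)}
    caps = dict(translated.get("caps", {}))
    for name, value in key.items():
        if name.endswith(LEGACY_SUFFIX):
            # Values already given in the new style take precedence.
            caps.setdefault(name[: -len(LEGACY_SUFFIX)], value)
    if caps:
        translated["caps"] = caps
    return translated


old_style = {"name": "client.test", "mon_cap": "allow r", "osd_cap": "allow rw"}
print(translate_key(old_style))
```

Keys already written in the new style pass through unchanged, so both syntaxes can coexist in one variable file.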

commit | commitdiff | tree

Sébastien Han [Mon, 9 Apr 2018 22:33:33 +0000 (00:33 +0200)]

ci: fix tripleO scenario

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 5 Apr 2018 16:52:23 +0000 (18:52 +0200)]

ci: client copy admin key

If we don't copy the admin key we can't add the key into ceph.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Wed, 4 Apr 2018 14:31:04 +0000 (16:31 +0200)]

ci: remove useless tests

These are already handled by ceph-client/defaults/main.yml so the keys
will be created once user_config is set to True.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Wed, 4 Apr 2018 14:22:36 +0000 (16:22 +0200)]

ceph_key: use ceph_key in the playbook

Replaced all the occurrences of raw commands using the 'command' module
with the ceph_key module instead.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Fri, 30 Mar 2018 14:56:44 +0000 (16:56 +0200)]

infra: add playbook example for ceph_key module

Helper playbook to manage CephX keys.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Sun, 18 Mar 2018 14:53:45 +0000 (15:53 +0100)]

add ceph_key module

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Thu, 5 Apr 2018 14:12:32 +0000 (09:12 -0500)]

ceph_volume: objectstore should default to 'bluestore'

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Tue, 3 Apr 2018 16:55:36 +0000 (11:55 -0500)]

ceph_volume: refactor to not run ceph osd destroy

This changes state to action and gives the options 'create'
or 'zap'. The zap parameter is also removed.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 28 Mar 2018 16:10:17 +0000 (11:10 -0500)]

ceph_volume: preserve newlines in stdout and stderr when zapping

Because we have many commands we might need to run, the
ANSIBLE_STDOUT_CALLBACK won't format these nicely because we're
not reporting these back at the root level of the json result.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 14 Mar 2018 19:46:37 +0000 (14:46 -0500)]

purge-cluster: no need to use objectstore for ceph_volume module

When zapping objectstore is not required.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 14 Mar 2018 17:26:43 +0000 (12:26 -0500)]

ceph_volume: rc should be 0 on successful runs

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 14 Mar 2018 17:19:42 +0000 (12:19 -0500)]

ceph_volume: defines the zap param in module_args

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 14 Mar 2018 16:49:48 +0000 (11:49 -0500)]

ceph_volume: make state not required so I can provide a default

I want a default value of 'present' for state, so it cannot
be made required. Otherwise it'll throw a 'Module alias error'
from ansible.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 14 Mar 2018 16:47:07 +0000 (11:47 -0500)]

ceph_volume: objectstore is now optional except when state is present

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 14 Mar 2018 16:32:19 +0000 (11:32 -0500)]

purge-cluster: use ceph_volume module to zap and destroy OSDs

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Mon, 12 Mar 2018 19:06:39 +0000 (14:06 -0500)]

tests: no need to remove partitions in lvm_setup.yml

Now that we are using ceph_volume_zap, the partitions are
kept around and can be reused.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 14 Mar 2018 16:24:40 +0000 (11:24 -0500)]

ceph_volume: adds a zap property and reworks to support state: absent

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 14 Mar 2018 15:14:21 +0000 (10:14 -0500)]

ceph_volume: adds a state property

This can be either present or absent.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Wed, 14 Mar 2018 14:57:49 +0000 (09:57 -0500)]

ceph_volume: remove the subcommand argument

This really isn't needed currently and I don't believe it is a good
mechanism for switching subcommands anyway. The user of this module
should not have to be familiar with all ceph-volume subcommands.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Randy J. Martinez [Wed, 28 Mar 2018 23:46:54 +0000 (18:46 -0500)]

purge-docker: added conditionals needed to successfully re-run purge

Added 'ignore_errors: true' to multiple tasks which run docker commands, even in cases where docker is no longer installed. Without it, certain tasks in purge-docker-cluster.yml cause the playbook to fail if re-run, which stops the purge. This leaves behind a dirty environment and a playbook which can no longer be run.
Fix regex on line 275: sometimes 'list-units' will output 4 spaces between loaded+active. The update accounts for both scenarios.
purge fetch_directory: in other roles fetch_directory is hard-coded into paths, e.g.: "{{ fetch_directory }}"/"{{ somedir }}". As a result, fetch_directory never has a trailing slash in all.yml, so this task was never being run (causing failures when trying to re-deploy).

Signed-off-by: Randy J. Martinez <ramartin@redhat.com>
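
The regex fix mentioned in this commit can be illustrated with a small Python sketch. The pattern and the sample lines are assumptions made for illustration; the commit message does not show the playbook's actual expression. The idea is simply to accept any run of whitespace between `loaded` and `active` rather than a fixed number of spaces.

```python
import re

# Assumed pattern: one or more whitespace characters between the two
# columns, so both the single-space and the four-space output match.
LOADED_ACTIVE = re.compile(r"loaded\s+active")


def is_loaded_active(line):
    """Return True when a unit line reports the loaded+active pair."""
    return LOADED_ACTIVE.search(line) is not None
```

Both `"ceph-osd@0.service loaded active running"` and the same line with four spaces between `loaded` and `active` would match under this assumption.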

commit | commitdiff | tree

JohnHaan [Tue, 10 Apr 2018 00:48:47 +0000 (09:48 +0900)]

Fixed wrong path of ceph.conf in docs.

The ceph.conf sample template moved to ceph-config.
Therefore the docs need to be changed to point to the right directory.

Signed-off-by: JohnHaan <yongiman@gmail.com>

commit | commitdiff | tree

Guillaume Abrioux [Mon, 9 Apr 2018 11:02:44 +0000 (13:02 +0200)]

defaults: fix backward compatibility

backward compatibility with `ceph_mon_docker_interface` and
`ceph_mon_docker_subnet` was not working since there was no lookup on
`monitor_interface` and `public_network`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Ken Dreyer [Thu, 5 Apr 2018 19:40:15 +0000 (13:40 -0600)]

common: upgrade/install ceph-test RPM first

Prior to this change, if a user had ceph-test-12.2.1 installed, and
upgraded to ceph v12.2.3 or newer, the RPM upgrade process would
fail.

The problem is that the ceph-test RPM did not depend on an exact version
of ceph-common until v12.2.3.

In Ceph v12.2.3, ceph-{osdomap,kvstore,monstore}-tool binaries moved
from ceph-test into ceph-base. When ceph-test is not yet up-to-date, Yum
encounters package conflicts between the older ceph-test and newer
ceph-base.

Once all users have upgraded to Ceph 12.2.3 or newer, this is no longer
relevant.

commit | commitdiff | tree

Sébastien Han [Mon, 9 Apr 2018 08:01:30 +0000 (10:01 +0200)]

ceph-defaults: fix ceph_uid for container image tag latest

According to our recent change, we now use "CentOS" as the latest
container image. We need to reflect this in ceph_uid.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Sébastien Han [Thu, 5 Apr 2018 08:28:51 +0000 (10:28 +0200)]

tox: use container latest tag for upgrades

Currently tag-build-master-luminous-ubuntu-16.04 is not used anymore.
Also now, 'latest' points to CentOS so we need to make that switch here
too.

We now have latest tags for each stable release, so let's use them and
point tox at them to deploy the right version.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Zack Cerza [Fri, 6 Apr 2018 16:17:48 +0000 (10:17 -0600)]

Use the CentOS repo for Red Hat dev packages

No use even trying to use something that doesn't exist.

Signed-off-by: Zack Cerza <zack@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Wed, 4 Apr 2018 09:46:51 +0000 (11:46 +0200)]

site-docker: followup on #2487

get a non-empty array as the default value for `groups.get('clients')`,
otherwise the `| first` filter will complain because it can't work with
an empty array.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
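
The failure mode described in this commit can be mimicked in plain Python. This is a sketch under stated assumptions: the `first` helper only imitates a "first item" filter that cannot handle an empty sequence, the `groups` dictionary is a hypothetical inventory, and the placeholder default value is invented for illustration.

```python
def first(seq):
    """Return the first item, failing loudly on an empty sequence."""
    items = list(seq)
    if not items:
        raise ValueError("No first item, sequence was empty.")
    return items[0]


# Hypothetical inventory that has no 'clients' group at all.
groups = {"mons": ["mon0"]}

# An empty default would make first() fail:
#   first(groups.get("clients", []))
# A non-empty default keeps the expression valid; the placeholder
# value below is an assumption, not the one used in the playbook.
first_client = first(groups.get("clients", ["placeholder"]))
```

The design point is that the default passed to `get` must itself be usable by whatever consumes the result, not merely be of the right type.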

commit | commitdiff | tree

Sébastien Han [Wed, 4 Apr 2018 14:23:54 +0000 (16:23 +0200)]

add .vscode/ to gitignore

I personally dev on vscode and I have some preferences to save when it
comes to running the python unit tests. So ignoring this directory is
actually useful.

Signed-off-by: Sébastien Han <seb@redhat.com>

commit | commitdiff | tree

Attila Fazekas [Wed, 4 Apr 2018 13:30:55 +0000 (15:30 +0200)]

Deploying without managed monitors failed

TripleO deployment failed when the monitors are not managed
by TripleO itself, with:
FAILED! => {"msg": "list object has no element 0"}

The failing play item was introduced by
f46217b69ae18317cb0c1cc3e391a0bca5767eb6 .

fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1552327

Signed-off-by: Attila Fazekas <afazekas@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Tue, 3 Apr 2018 11:43:53 +0000 (13:43 +0200)]

defaults: remove `run_once: true` when creating fetch_directory

because of `serial: 1`, it can be an issue when the playbook is being
run on client nodes.
Since the refactor of `ceph-client`, we skip the role `ceph-defaults` on
every node except the first client node, which means that the task is not
going to be played because of `run_once: true`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Tue, 3 Apr 2018 11:41:07 +0000 (13:41 +0200)]

config: use fact `ceph_uid`

Use fact `ceph_uid` in the task which ensures `/etc/ceph` exists in
containerized deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Fri, 30 Mar 2018 11:48:17 +0000 (13:48 +0200)]

clients: refactor `ceph-clients` role

This commit refactors this role so we don't have to pull the container image
on client nodes just to create pools and keys.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1550977
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Fri, 30 Mar 2018 10:50:14 +0000 (12:50 +0200)]

client: remove legacy code

This seems to be a leftover.
This commit removes an unnecessary 'set linux permissions' on
`/var/lib/ceph`

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
