This commit changes the bind mount option for the mount point
`/var/lib/ceph` in the systemd template for mon and mgr containers. This
is needed when collocating mon/mgr with OSDs in a dmcrypt scenario.
Once the mon/mgr nodes have been converted to containers, the dmcrypt layer sub mount is still visible in `/var/lib/ceph`. For some reason it keeps the corresponding devices busy, so no other container can open/close them. As a result, it prevents the OSDs from starting properly.
Since it only happens on the nodes converted before the OSD play, the idea is to bind mount `/var/lib/ceph` on the mon and mgr containers with the `rshared` option, so that once the sub mount is unmounted on the host, the unmount is propagated inside the container and the stale mount point is no longer seen.
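As a rough sketch of what that option change looks like (the variable names below are purely illustrative; the real change lives in the `-v` argument rendered into the mon/mgr systemd unit templates):
```
# Illustrative only: the bind mount passed to the container engine.
var_lib_ceph_bind_mount_before: "-v /var/lib/ceph:/var/lib/ceph:z"
var_lib_ceph_bind_mount_after: "-v /var/lib/ceph:/var/lib/ceph:z,rshared"
```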
Dimitri Savineau [Mon, 16 Nov 2020 15:31:11 +0000 (10:31 -0500)]
switch2container: chown symlink in mon/mgr plays
fa2bb3a only fixes the symlink owner/group issue in the OSD play. If the OSDs are collocated with other services like MONs and MGRs, the chown command in the mon/mgr plays will still fail.
When using `monitor_interface`, if the nodes don't have the same interface names then this task will fail like the following:
```
fatal: [argo010]: FAILED! => {
"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute u'ansible_enp1s0f0'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mon/tasks/docker/main.yml': line 19, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: ipv4 - force peer addition as potential bootstrap peer for cluster bringup - monitor_interface\n ^ here\n"
}
90f3f61 introduced the docker-to-podman.yml playbook but the ceph-osd-run.sh.j2 template still has some docker commands hardcoded instead of using the container_binary variable.
If the OSD directory uses symlinks to reference devices (like block, db and wal for bluestore, or journal for filestore) then the chown command could fail to change the owner:group on some systems.
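A minimal sketch of this kind of fix (assumed task shape and example paths, not necessarily the exact change): pass `-h` so chown changes the owner of the symlink itself instead of following it to the underlying device node.
```
- name: change ownership of OSD symlinks without following them
  command: chown -h ceph:ceph {{ item }}
  loop:
    # example paths only
    - /var/lib/ceph/osd/ceph-0/block
    - /var/lib/ceph/osd/ceph-0/block.db
```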
When running the switch2container playbook on a Debian based system, the systemd unit path isn't the same as on a Red Hat based system.
Because the old systemd unit files aren't removed, the new container systemd unit isn't taken into account.
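A hedged sketch of the kind of cleanup this implies (the tasks themselves are illustrative; the per-distribution paths are the usual packaging locations): remove the old non-container unit file from the distribution path and reload systemd so the new container unit in /etc/systemd/system is taken into account.
```
- name: remove the old non-container ceph-osd unit file
  file:
    path: "{{ '/lib/systemd/system' if ansible_os_family == 'Debian' else '/usr/lib/systemd/system' }}/ceph-osd@.service"
    state: absent

- name: reload systemd so the new container unit is picked up
  systemd:
    daemon_reload: true
```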
RPietrzak [Thu, 20 Aug 2020 13:17:22 +0000 (15:17 +0200)]
Remove 'run_once: true' from the 'wait for all osd to be up' task in the ceph-osd/tasks/main.yml role.
Together with the 'ansible_play_hosts_all | last' condition, 'run_once: true' made the task evaluate only on the first host of the play, where that condition is false, so the task ended up being skipped.
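A sketch of why the combination broke (the exact `when` expression is assumed): with `run_once: true` the task is only evaluated on the first host of the play, where the 'last host' condition is false, so it never runs; dropping `run_once` lets the condition alone select the last host.
```
# Before (always skipped on multi-node plays):
- name: wait for all osd to be up
  command: ceph --cluster {{ cluster }} osd stat   # retries/until omitted for brevity
  run_once: true
  when: inventory_hostname == ansible_play_hosts_all | last

# After: the 'last host' condition already restricts the task to one host.
- name: wait for all osd to be up
  command: ceph --cluster {{ cluster }} osd stat   # retries/until omitted for brevity
  when: inventory_hostname == ansible_play_hosts_all | last
```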
This node was needed for the upgrade job in stable-4.0.
Since we moved the erasure code pool testing into lvm_osds, we don't need to fire up that node anymore.
This commit moves the systemd rendering task into a `systemd.yml` file.
Otherwise, when running the docker-to-podman playbook, the systemd unit file isn't updated as it should be.
This commit makes the bindmount a bit more generic, otherwise it currently causes the OSDs to fail to start in an OSP FFU upgrade (with a RHEL7 to RHEL8 OS upgrade).
The docker2podman playbook is run from the ceph-ansible stable-3.2 branch against RHEL7 nodes, where `/var/run/lvmetad.socket` exists, but once the system is upgraded to RHEL8 this socket doesn't exist anymore and prevents the OSDs from starting after the reboot.
As a workaround we can make this bindmount a bit more generic, like what is done in the `stable-4.0` branch, by mounting `/run/lvm` instead.
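A rough before/after sketch of that bind mount (illustrative strings only, not the exact template content):
```
# RHEL7-only socket, gone after the upgrade to RHEL8:
lvm_bind_mount_before: "-v /var/run/lvmetad.socket:/var/run/lvmetad.socket"
# More generic mount, matching what stable-4.0 does:
lvm_bind_mount_after: "-v /run/lvm:/run/lvm"
```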
When using non-lvm scenarios (collocated or non-collocated), the disk_list variable isn't set because this is done in the ceph-osd role (start_osds.yml), which isn't executed by the docker2podman playbook.
Dimitri Savineau [Tue, 30 Jun 2020 14:13:42 +0000 (10:13 -0400)]
facts: explicitly disable facter and ohai
By default, Ansible gathers facts from facter and ohai if they are installed on the remote nodes. Given we don't need them, let's exclude these facts from our fact gathering.
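A minimal sketch of how that exclusion can be expressed (using the setup module's gather_subset negation; the actual change may live elsewhere, e.g. in ansible.cfg):
```
- name: gather facts without facter and ohai
  setup:
    gather_subset:
      - '!facter'
      - '!ohai'
```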
Dimitri Savineau [Fri, 26 Jun 2020 17:49:17 +0000 (13:49 -0400)]
ceph-osd: exit gracefully when no data partition
When using the collocated or non-collocated osd_scenarios (ceph-disk) and trying to determine the OSD_DEVICE from the OSD_ID passed to the systemd unit, we can be in a situation where the OSD hasn't been activated but the OSD ID exists.
This means the data partition isn't in the activated state and the ceph-disk list command won't show the OSD ID on the data partition.
This isn't backported from master because there are too many changes
between stable-3.2 and other newer branches.
NOTE:
This playbook *doesn't* add podman support in stable-3.2 at all.
This is a TripleO dedicated playbook which is intended to be run early during the FFU workflow in order to prepare the OS upgrade.
```
Warning, treated as error:
/home/jenkins-build/build/workspace/ceph-ansible-docs-pull-requests/docs/source/day-2/upgrade.rst:2:Title underline too short.
```
The workflow in this playbook should be the same as in rolling_update: we should first set the noout and nodeep-scrub flags before migrating the first OSD, and unset them after the last OSD is migrated.
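A minimal sketch of that flag handling (task shape, delegation and the non-containerized ceph command are assumed): set the flags once from a monitor before the first OSD is touched, and unset them once at the end.
```
- name: set osd flags before migrating the first osd
  command: ceph --cluster {{ cluster }} osd set {{ item }}
  delegate_to: "{{ groups[mon_group_name][0] }}"
  run_once: true
  loop:
    - noout
    - nodeep-scrub

# ... migrate the OSDs ...

- name: unset osd flags after the last osd is migrated
  command: ceph --cluster {{ cluster }} osd unset {{ item }}
  delegate_to: "{{ groups[mon_group_name][0] }}"
  run_once: true
  loop:
    - noout
    - nodeep-scrub
```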
Dimitri Savineau [Mon, 22 Jun 2020 17:58:10 +0000 (13:58 -0400)]
docker: Add Requires on docker service
When using the docker container engine, the systemd unit scripts only declare a dependency on the docker daemon via the After parameter.
But if docker is restarted on a live system, the ceph systemd units should wait for the docker daemon to be fully restarted.
The workflow in this playbook should be the same as in rolling_update: we should first set the noout and nodeep-scrub flags before migrating the first OSD, and unset them after the last OSD is migrated.
This commit is the first of a series describing all the day-2 operations that are possible via ceph-ansible using the set of playbooks provided in the `infrastructure-playbooks` directory.
Rishabh Dave [Tue, 7 Apr 2020 11:50:35 +0000 (17:20 +0530)]
library/ceph_volume: look for error messages in stderr
Error messages were moved from stdout to stderr here -
https://github.com/ceph/ceph/commit/b8d6dcbe9f803c96c0af68da54f1262e9b6a9e77#diff-20f7c578a4e69ec61a5869d706567a24R137.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1793542
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 4249d1e02d6da07466a4ddf1282cf4600a131773)
We were not testing the right ansible_distribution fact value for the RHEL distribution.
This commit also updates the minimal RHEL version supported by RHCS.
add-osd: unset noup flag after last osd is deployed
This commit fixes a bug when using the `add-osd.yml` playbook.
The `noup` flag is set early but never gets unset before the "wait for pgs clean" check, so the playbook always fails because the OSDs are never seen UP.
With this change, the state `present` is enough to update a keyring.
If the keyring already exists, it will be updated if the caps or secret passed to the module are different.
If the keyring doesn't exist, it will be created.
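A hedged example of what this means for playbook authors (parameter names recalled from ceph-ansible's ceph_key module and not guaranteed to be exact): a single `state: present` task now both creates the keyring and reconciles its caps/secret.
```
- name: create or update a client keyring
  ceph_key:
    name: client.myapp        # example entity name
    state: present
    caps:
      mon: "allow r"
      osd: "allow rw pool=myapp"
```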
osd: support changing default rule even when osd_crush_location isn't defined
Creating crush rules even with no crush hierarchy configuration is a valid scenario, so we shouldn't be bound to the result of the first task (which configures the crush hierarchy) to be able to add new crush rules.
John Fulton [Thu, 6 Feb 2020 02:23:54 +0000 (21:23 -0500)]
The _filtered_clients list should intersect with ansible_play_batch
Client configuration with --limit fails without this patch because certain tasks are only run on the first host in the _filtered_clients list, and it's likely that first host will not be included in what's specified with --limit. To fix this, the _filtered_clients list should be built from all clients in the inventory that are also in the running play.
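A minimal sketch of that intersection (the group variable and exact expression are assumed):
```
- name: build the client list from hosts that are actually in the running play
  set_fact:
    _filtered_clients: "{{ groups[client_group_name] | default([]) | intersect(ansible_play_batch) }}"
  run_once: true
```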
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1798781
Signed-off-by: John Fulton <fulton@redhat.com>
(cherry picked from commit e4bf4857f556465c60f89d32d5f2a92d25d5c90f)
Benoît Knecht [Mon, 20 Jan 2020 10:36:27 +0000 (11:36 +0100)]
ceph-rgw: Fix customize pool size "when" condition
In 3c31b19ab39f297635c84edb9e8a5de6c2da7707, I fixed the `customize pool
size` task by replacing `item.size` with `item.value.size`. However, I
missed the same issue in the `when` condition.
for instance. However, doing so would create pools of size
`osd_pool_default_size` regardless of the `size` value. This was due to
the fact that the Ansible task used
Dimitri Savineau [Mon, 10 Feb 2020 18:43:31 +0000 (13:43 -0500)]
ceph-{mon,osd}: move default crush variables
Since ed36a11 we moved the crush rules creation code from the ceph-mon to the ceph-osd role.
To keep backward compatibility we kept the possibility to set the crush variables on the mons side, but we didn't move the default values.
As a result, when crush_rule_config is set to true and the default values for crush_rules are expected, the crush rule creation task will fail:
"msg": "'ansible.vars.hostvars.HostVarsVars object' has no attribute
'crush_rules'"
This patch moves the default crush variables from the ceph-mon to the ceph-osd role, but also uses those default values when nothing is defined on the mons side.
When using the ceph aliases with commands that require manual intervention to stop (like pressing Ctrl+C), the command keeps running inside the container.
To handle this, we should use the interactive session options (-it) with the docker commands.
Mike Christie [Tue, 28 Jan 2020 22:31:55 +0000 (16:31 -0600)]
iscsi: Fix crashes during rolling update
During a rolling update we will run the ceph iscsigw tasks that start the daemons, then run the configure_iscsi.yml tasks which can create iscsi objects like targets, disks, clients, etc. The problem is that once the daemons are started they will accept configuration requests, or may want to update the system themselves. Those operations can then conflict with the configure_iscsi.yml tasks that set up objects, and we can end up with crashes due to the kernel being in an unsupported state.
This could also happen during initial creation, but it is less likely because no objects have been set up yet, so there are no watchers or users accessing the gateways yet. The fix in this patch works for both update and initial setup.
validate: allow running ceph-ansible 3.2 against ansible 2.7
This commit allows ceph-ansible 3.2 to be run against ansible 2.7
However, note that running stable-3.2 against ansible 2.7 doesn't get any testing upstream and might break the playbook; only ansible 2.6 is officially supported.
Dimitri Savineau [Tue, 28 Jan 2020 15:27:34 +0000 (10:27 -0500)]
ceph-defaults: remove rgw from ceph_conf_overrides
The [rgw] section, whether set in the ceph.conf file or via the ceph_conf_overrides variable, isn't a valid section and has no effect.
To apply overrides to all radosgw instances we should use either the
[global] or [client] sections.
Overrides per radosgw instance should still use the
[client.rgw.{instance-name}] section.
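A hedged example of the valid sections in ceph_conf_overrides (the option values and the instance name are illustrative and depend on the deployment):
```
ceph_conf_overrides:
  # applies to all radosgw instances (and other clients)
  client:
    rgw_enable_usage_log: true
  # applies to one specific radosgw instance
  "client.rgw.{{ ansible_hostname }}":
    rgw_frontends: "beast port=8080"
```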
To avoid confusion, let's change the default value from `0.0.0.0` to
`x.x.x.x`.
Users might think setting `0.0.0.0` will make the daemon bind on all interfaces.
This commit adds a playbook to be played before we run the purge playbook; it first creates an rbd image, then maps an rbd device on client0 so the purge playbook will try to unmap it.
In a containerized context, using the binary provided by the Atomic OS won't work because it's an old version provided by ceph-common based on 10.2.5.
Using a container could be an option, but for a large cluster with hundreds of client nodes that would require pulling the image on each of them just to unmap the rbd devices.
Let's use the sysfs method in order to avoid any issue related to the ceph version shipped on the host.
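A minimal sketch of the sysfs approach (assumed task, relying on the classic /sys/bus/rbd interface): each mapped device has an id under /sys/bus/rbd/devices, and writing that id to /sys/bus/rbd/remove unmaps it without needing the rbd binary.
```
- name: unmap rbd devices through sysfs
  shell: |
    for id in $(ls /sys/bus/rbd/devices 2>/dev/null); do
      echo "$id" > /sys/bus/rbd/remove
    done
```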
Dimitri Savineau [Wed, 27 Nov 2019 14:29:06 +0000 (09:29 -0500)]
ceph-osd: wait for all osds once
cf8c6a3 moves the 'wait for all osds' task from openstack_config to the main tasks list.
But the openstack_config code was executed only on the last OSD node.
We don't need to run this check on every OSD node, so we need to set run_once to true on that task.
Dimitri Savineau [Tue, 26 Nov 2019 16:09:11 +0000 (11:09 -0500)]
ceph-osd: wait for all osd before crush rules
When creating crush rules with the device class parameter we need to be sure that all OSDs are up and running, because the device class list is populated with this information.
This is now enabled for all scenarios, not only openstack_config.
Dimitri Savineau [Thu, 31 Oct 2019 20:24:12 +0000 (16:24 -0400)]
ceph-osd: add device class to crush rules
This adds device class support to crush rules when using the class key
in the rule dict via the create-replicated sub command.
If the class key isn't specified then we use the create-simple sub
command for backward compatibility.
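A hedged illustration of the rule dict (key names assumed from the message above): with a `class` key ceph-ansible would go through `ceph osd crush rule create-replicated`, without it through `create-simple`.
```
crush_rules:
  - name: replicated_hdd_rule
    root: default
    type: host
    class: hdd          # device class -> create-replicated
  - name: replicated_rule
    root: default
    type: host          # no class key -> create-simple
```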
Dimitri Savineau [Thu, 31 Oct 2019 20:17:33 +0000 (16:17 -0400)]
move crush rule creation from mon to osd role
If we want to create crush rules with the create-replicated sub command and a device class, then we need to have the OSDs created before the crush rules, otherwise the device classes won't exist.
Dimitri Savineau [Mon, 16 Dec 2019 21:41:20 +0000 (16:41 -0500)]
switch_to_containers: set GUID on lockbox part
The ceph lockbox partition (partition number 5) used with non-lvm scenarios and in non-containerized deployments doesn't have a valid PARTUUID. The value is set to 00000000-0000-0000-0000-000000000000 for each OSD device.
When switching to a containerized deployment we manually mount the lockbox partition by using the PARTUUID.
Unfortunately, because we usually have multiple OSDs on the same node, we can't have the right symlink in /dev/disk/by-partuuid because it will point to only one partition.
After the switch_to_containers playbook, only one OSD will restart correctly and the others will try to access the wrong device, causing errors like 'xxxx is still in use'.
When deploying with containers and dmcrypt OSDs we force a PARTUUID value during the ceph-disk prepare task.
When using `osd_auto_discovery`, `devices` is built multiple times due to multiple runs of the `ceph-facts` role. It ends up with duplicate instances of the same device in the list.
Using the `unique` filter when building the list fixes this issue.
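A minimal sketch of the deduplication (the device filtering done by ceph-facts is omitted; only the `unique` part matters here):
```
- name: build the devices list without duplicates
  set_fact:
    devices: "{{ (devices | default([]) + ['/dev/' + item.key]) | unique }}"
  with_dict: "{{ ansible_devices }}"
```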
When using fqdn in the inventory, that playbook fails because some tasks use the result of `ceph osd tree` (which returns shortnames) to get data from hostvars[].
This commit adds support for the ceph-iscsi stable repository when `ceph_repository` is set to community, instead of always using the devel repositories.
We're still using the devel repositories for rtslib and tcmu-runner in both cases (dev and community).
ansible.cfg: do not enforce PreferredAuthentications
There's no need to enforce PreferredAuthentications by default.
Users can still choose to override the ansible.cfg with any additional
parameter like this one to fit their infrastructure.
The systemd unit script wasn't updated with the new container name
format (without the hostname).
We now have the same start/stop docker commands for all scenarios.
During the device-to-id OSD migration we need to be sure that the old containers named with the hostname are stopped.
The previous approach was wrong: checking if `item.key` is in `osd_auto_discovery_exclude` (`['dm-', 'loop']`) is incorrect because an exact membership test against those prefixes will obviously never match. Therefore, the condition will return `True` whatever device we are checking.
Dimitri Savineau [Wed, 27 Nov 2019 16:27:09 +0000 (11:27 -0500)]
switch_to_containers: fix umount ceph partitions
When a container is already running on a non containerized node, the umount ceph partition task is skipped. This is due to the container ps command, which always returns 0 even if the filter matches nothing.
We should run the umount task when:
1/ the container command is failing (not installed) : rc != 0
2/ the container command reports running ceph-osd containers : rc == 0
Also we should not fail on the ceph directory listing.
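A hedged sketch of those two conditions (the mount point list and exact commands are illustrative):
```
- name: check for running ceph-osd containers
  command: "{{ container_binary }} ps -q --filter name=ceph-osd"
  register: osd_containers
  failed_when: false
  changed_when: false

- name: umount ceph osd data partitions
  command: umount {{ item }}
  loop: "{{ ceph_osd_mount_points | default([]) }}"   # hypothetical list of /var/lib/ceph/osd/* mounts
  when: osd_containers.rc != 0 or osd_containers.stdout | length > 0
```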
Dimitri Savineau [Wed, 20 Nov 2019 19:40:52 +0000 (14:40 -0500)]
rolling_update: don't enable ceph-mon unit
On non containerized deployments the ceph-mon hostname/fqdn systemd services are stopped at the beginning of the mon upgrade.
But the enabled parameter is set to true for both tasks, so even if we're not using the fqdn it will enable the systemd unit based on it.
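A hedged sketch of the stop tasks once enabling is no longer forced (task shape assumed): stop both the shortname and fqdn flavours, but don't touch their enabled state.
```
- name: stop ceph-mon systemd units (shortname and fqdn flavours)
  systemd:
    name: "ceph-mon@{{ item }}"
    state: stopped
  failed_when: false
  loop:
    - "{{ ansible_hostname }}"
    - "{{ ansible_fqdn }}"
```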