Dimitri Savineau [Thu, 22 Oct 2020 14:59:15 +0000 (10:59 -0400)]
podman: force log driver to journald
Since we've changed to podman configuration using the detach mode and
systemd type to forking then the container logs aren't present in the
journald anymore.
The default conmon log driver is using k8s-file.
This file is currently deployed with '0644' ownership making this file
readable by any user on the system.
Since it contains sensitive information it should be readable by the
owner only.
Benoît Knecht [Mon, 19 Oct 2020 09:23:59 +0000 (11:23 +0200)]
ceph-mon: Fix check mode for deploy monitor tasks
Skip the `get initial keyring when it already exists` task when both commands
whose `stdout` output it requires have been skipped (e.g. when running in check
mode).
When running rhel8 containers on a rhel7 host, after zapping an OSD
there's a discrepancy with the lvmetad cache that needs to be refreshed.
Otherwise, the host still sees the lv and can makes the user confused.
If user tries to redeploy an OSD, it will fail because the LV isn't
present and need to be recreated.
ie:
```
stderr: lsblk: ceph-block-8/block-8: not a block device
stderr: blkid: error: ceph-block-8/block-8: No such file or directory
stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
usage: ceph-volume lvm prepare [-h] --data DATA [--data-size DATA_SIZE]
[--data-slots DATA_SLOTS] [--filestore]
[--journal JOURNAL]
[--journal-size JOURNAL_SIZE] [--bluestore]
[--block.db BLOCK_DB]
[--block.db-size BLOCK_DB_SIZE]
[--block.db-slots BLOCK_DB_SLOTS]
[--block.wal BLOCK_WAL]
[--block.wal-size BLOCK_WAL_SIZE]
[--block.wal-slots BLOCK_WAL_SLOTS]
[--osd-id OSD_ID] [--osd-fsid OSD_FSID]
[--cluster-fsid CLUSTER_FSID]
[--crush-device-class CRUSH_DEVICE_CLASS]
[--dmcrypt] [--no-systemd]
ceph-volume lvm prepare: error: Unable to proceed with non-existing device: ceph-block-8/block-8
```
Gaudenz Steinlin [Mon, 10 Aug 2020 09:38:47 +0000 (11:38 +0200)]
ceph-crash: Only deploy key to targeted hosts
The current task installs the ceph-crash key to "most" hosts via
"delegate_to". This key is only used by the ceph-crash daemon and should
just be installed on all hosts targeted by this role. There is no need
for using a delegated task.
Dimitri Savineau [Wed, 14 Oct 2020 00:43:53 +0000 (20:43 -0400)]
ceph-osd: don't start the OSD services twice
Using the + operation on two lists doesn't filter out the duplicate
keys.
Currently each OSDs is started (via systemd) twice.
Instead we could use the union filter.
Currently the `ceph_key` module doesn't support using a different
keyring than `client.admin`.
This commit adds the possibility to use a different keyring.
When rgw and osd are collocated, the current workflow prevents from
scaling out the radosgw_num_instances parameter when rerunning the
playbook in baremetal deployments.
When ceph-osd notifies handlers, it means rgw handlers are triggered
too. The issue with this is that they are triggered before the role
ceph-rgw is run.
In the case a scaleout operation is expected on `radosgw_num_instances`
it causes an issue because keyrings haven't been created yet so the new
instances won't start.
ceph-facts: add get default crush rule from running monitor
In case of deploying new monitor node to an existing cluster,
osd_pool_default_crush_rule should be taken from running monitor because
ceph-osd role won't be run and the new monitor will have different
osd_pool_default_crush_role from other monitors.
In non containerized deployment we check if the service is running
via the socket file presence.
This is done via the xxx_socket_stat variable that check the file
socket in the /var/run/ceph/ directory.
In some scenarios, we could have the socket file still present in
that directory but not used by any process.
That's why we have the xxx_stat variable which clean those leftovers.
The problem here is that we're set the variable for the handlers status
(like handler_mon_status) based on xxx_socket_stat instead of xxx_stat.
That means we will trigger the handlers if there's an old socket file
present on the system without any process associated.
ansible.cfg: remove cfg file in infrastructure-playbooks
There's no need ot have a copy of this file in infrastructure-playbooks
directory.
playbooks in that directory can be run from the root dir of
ceph-ansible.
As of 2.10, group names containing a dash are invalid.
However, setting this option makes it still possible to use a dash in
group names and prevent this warning to show up.
It might need to be definitely addressed in a future ansible release.
Dimitri Savineau [Thu, 16 Jan 2020 14:38:08 +0000 (09:38 -0500)]
ceph-facts: move facts to defaults value
There's no need to define a variable via a fact if we can do it via a
default value. Using a fact could be interesseting to override the
default value on some condition.
- ceph_uid could be set to 167 by default because it's only different on
non containerized deployment on Debian/Ubuntu.
- rbd_client_directory_{owner,group,mode} could be set to ceph,ceph,0770
by default install of null as we are doing in the facts.
facts: fix 'set_fact rgw_instances with rgw multisite'
the current condition doesn't work, as soon as the first iteration is
done the condition makes next iterations skip since `rgw_instances` got
set with the first iteration.
When using a http(s) proxy with either docker or podman we can rely on
the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables.
But with ansible, even if those variables are defined in a source file
then they aren't loaded during the container pull/login tasks.
This implements the http(s) proxy support with docker/podman.
Both implementations are different:
1/ docker doesn't rely en the environment variables with the CLI.
Thos are needed by the docker daemon via systemd.
2/ podman uses the environment variables so we need to add them to
the login/pull tasks.
If the OSD directory is using symlinks for referencing devices (like
block, db, wal for bluestore and journal for filestore) then the chown
command could fail to change the owner:group on some system.
When running the switch2container playbook on a Debian based system
then the systemd unit path isn't the same than Red Hat based system.
Because the systemd unit files aren't removed then the new container
systemd unit isn't take in count.
We don't need to install node-exporter on client node because there's
no ceph services running on them.
This also makes sure we use the group name variables in the prometheus
service template instead of hardcoding the values.
> That commit [1] introduced a regression in the dashboard configuration
> because the ceph config get mgr xxxx command doesn't work with
> nautilus.
> In that release the get operation needs an entity.
We were only supporting CentOS 8 for containerized deployment.
Since Nautilus 14.2.10 we now have el8 rpm packages so we should be
able to deploy a nautilus ceph cluster with el8.
Note that the nfs-ganesha isn't supported because there's no el8 rpm
packages for nfs-ganesha V2.8.
Dimitri Savineau [Mon, 17 Aug 2020 17:55:47 +0000 (13:55 -0400)]
ceph-rgw: allow specifying crush rule on pool
We already support specifiying a custom crush rule during pool creation
in ceph-osd role but not in ceph-rgw role.
This patch adds the missing code to implement this feature.
Note this is only available for replicated pool not erasure. The rule
must also exist prior the pool creation.
container: run engine/common roles on first client
We already do this in the site-container.yml playbook because we don't
need docker/podman installed on all client nodes and having the
container image only on the first client node.
Dimitri Savineau [Mon, 17 Aug 2020 18:56:17 +0000 (14:56 -0400)]
container: don't install the engine on all clients
We only need the container engine to be installed on the first clients
node in order to execute the pools/keys operation. We already do the
same worflow with the ceph-container-common role which pull the ceph
container image.
Dimitri Savineau [Mon, 17 Aug 2020 19:27:16 +0000 (15:27 -0400)]
Allow updating crush rule on existing pool
The crush rule value was only set once during the pool creation. It was
not possible to update the crush rule value by updating the value in the
configuration.
PytestUnknownMarkWarning: Unknown pytest.mark.ceph_crash - is this a typo?
You can register custom marks to avoid this warning - for details,
see https://docs.pytest.org/en/latest/mark.html
ceph-facts: only get fsid when monitor are present
When running the rolling_update playbook with an inventory without
monitor nodes defined (like external scenario) then we can't retrieve
the cluster fsid from the running monitor.
In this scenario we have to pass this information manually (group_vars
or host_vars).
This changes the grafana container image regitry from docker.io to
quay.io to avoid rate limit.
This also adds the missing container image values for docker2podman
and podman scenarios.
Add --cluster option on ceph require-osd-release command
On DCN environments, or when multiple ceph cluster are configured,
we need to specify the cluster name before running the command or
the rolling_update playbook will fail during minor updates.
Closes: https://bugzilla.redhat.com/1876447 Signed-off-by: Francesco Pantano <fpantano@redhat.com>
(cherry picked from commit cb64df30b687d95704bac76ed0b4f83dfc3ca992)
This commit moves the erasure pool creation testing from `all_daemons`
to `lvm_osds` so we can decrease the number of osd nodes we spawn so the
OVH Jenkins slaves aren't less overwhelmed when a `all_daemons` based
scenario is being tested.
nfs: do not copy rgw keyring when `nfs_obj_gw` is true
This keyring shouldn't be copied when `nfs_obj_gw` is `True` if the
cluster doesn't contain a rgw node, which can be the case given we are
using `nfs_obj_gw` instead of `nfs_file_gw` (cephfs vs. object), the
deployment will fail trying to copy a key that doesn't exist.
raul [Mon, 3 Aug 2020 10:58:50 +0000 (12:58 +0200)]
rgw: support 1+ rgw instance in `radosgw_frontend_port`
Change the radosgw_frontend_port to take in account more than 1 RGW instance,
in it's original form `radosgw_frontend_port: radosgw_frontend_port | int`,
it configured the 8080 port to all instances, with the following modification
`radosgw_frontend_port: radosgw_frontend_port | int + item|int` we increase in
1 the port count.
Co-authored-by: Daniel Parkes <dparkes@redhat.com> Signed-off-by: raul <rmahique@redhat.com>
(cherry picked from commit 110eaf5f9f8a2fe26993e2e663849a74531da9d2)
When running `infrastructure-playbooks/purge-cluster.yml` twice, it fails the
second time on the `ensure rbd devices are unmapped` task, because `rbdmap`
isn't installed anymore at that point.
This commit adds a check that ensures `rbdmap` is available, and skips the
`ensure rbd devices are unmapped` task if it isn't.
Kevin Coakley [Mon, 3 Aug 2020 17:03:34 +0000 (10:03 -0700)]
Remove ceph-radosgw.target when switching to containerize daemons
The task "remove old systemd unit file" under "switching from
non-containerized to containerized ceph rgw" only removes
the ceph-radosgw@.service file. The task should also remove
the ceph-radosgw.target file, like the "remove old systemd unit
files" tasks for the mons, mgrs, osds, etc, in order to clean up
all of the unused systemd unit files.
When using TLS on the ceph dashboard or grafana services, we can provide
the TLS certificate and key.
Those files should be present on the ansible controller and they will be
copyied to the right node(s).
In some situation, the TLS certificate and key could be already present
on the target node and not on the ansible controller.
For this scenario, we just need to copy the files locally (on each remote
host).
This patch adds the dashboard_tls_external variable (with default to
false) to allow users to achieve this scenario when configuring this
variable to true.
In addition of 155e2a2, the active mds daemons isn't stop/start
correctly as opposed as the other services so that daemon doesn't come
back after the upgrade.
The dashboard upgrade workflow should do the same process than the ceph
upgrade otherwise any systemd unit modification won't be apply on the
monitoring/dashboard stack.