ceph_pool: set target size ratio on both 'on' and 'warn' mode
when we set target_size_ratio to warn it means that the administrator wants to get suggestion from the mgr module but apply it manually when he/she wants. So it's in the same approach as 'on' mode just triggered by hand.
So there is no need to set pg_num when target_size_ratio is 'warn' and the mgr module will calculate the correct pg_num and the administrator will adjust it whenever he/she wants.
When the upgrade from Ceph 4 to 5 is performed in the OpenStack context,
ceph-ansible triggers the rolling_update playbook, which is supposed to
rollout new Ceph containers. The ceph-infra role tries to take care
about firewall, ntp config and logrotate; however, TripleO manages them
through tripleo-heat-templates. This patch just add an additional tag
to skip the ceph-infra role in the OpenStack context.
Closes: https://bugzilla.redhat.com/2090456 Signed-off-by: Francesco Pantano <fpantano@redhat.com>
cephadm-adopt: remove legacy directory after adoption
When this directory is left after the osd adoption, it leads to the following error:
```
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
host axdesec2ocs1n002.ecommerce.inditex.grp `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config
ERROR: [Errno 2] No such file or directory: '/var/lib/ceph/41555360-e96b-4b16-a37c-873e0c940091/mon.axdesec2ocs1n002/config'.
```
this is because of an unexpected behavior regarding 'config inferring' when a legacy directory is present in /var/lib/ceph.
Note: this doesn't fix the root cause, this is a workaround.
ceph-facts: fix ansible templating error for auto osd discovery
This commit fixes templating error that occurs when using auto osd discovery. Getting the len before converting the result to a list causes "object of type generator has no len()" error.
Since the ISO install method removal, ceph-ansible isn't able
to detect wheter the user is deploying in a 'disconnected environment'.
By the way, given that ceph-ansible is available only for upgrading to RHCS 5,
this check doesn't make sense anymore, let's drop it.
insatomcat [Wed, 30 Mar 2022 13:48:02 +0000 (15:48 +0200)]
do not update Debian cache when package-install is disabled
When deploying with --skip-tags=package-install (when there is no access to a repository), the playbook is still trying to update the package cache, which makes the playbook fail.
This change prevents the playbook to try to update the cache when the package-install tag is skipped.
upgrade: block upgrade when rgw multisite is active
With this commit, upgrading a cluster from Nautilus to Pacific with
active rgw multisite replication will be blocked.
This is because a lot of bugs are currently present in Pacific regarding
RGW multisite.
Teoman ONAY [Mon, 7 Mar 2022 09:31:14 +0000 (10:31 +0100)]
Turn off SELinux separation for containers MON and RGW
Initially MONs and RGW binded /etc/pki/ca-trust/extracted using the :z flag
(introduced to solve an OSP TripleO issue on RHEL - #3638) but using
this flag prevents local services (like sssd) running on the host from accessing
the certificates/files in that folder.
Teoman ONAY [Mon, 7 Feb 2022 13:23:49 +0000 (14:23 +0100)]
Enable user to change the account used for ssh connection
By default cephadm uses root account to connect remotely
to other nodes in the cluster. This change allows to choose
another account.
This commit also allows to use a dedicated subnet for cephadm mgmt.
This fixes the service file removal and makes the playbook
call `systemctl reset-failed` on the service because in Ceph
Nautilus, ceph-crash doesn't handle `SIGTERM` signal.
This playbook doesn't support less than 3 monitors present in the inventory.
Just like the rolling_update playbook, let's fail if less than
3 monitors are present.
We can't use `{{ cephadm_cmd }}` here because the monitors aren't yet adopted.
We must use `{{ ceph_cmd }}` instead.
This also fixes some filters `| default()` (they must be moved before `| from_json()`)
Benoît Knecht [Mon, 13 Dec 2021 15:36:27 +0000 (16:36 +0100)]
ceph-facts: Fix get_def_crush_rule_name.yml in check mode
This construct doesn't work as intended since ansible/ansible#74212:
```
item.stdout | default('{}') | from_json
```
That PR made the `command` module return `stdout` even in check mode (setting
it to the empty string), so `default()` has no effect in that case and
`from_json()` fails to parse an empty string.
Instead, `default()` needs to be invoked with its second argument set to
`True`, so that it replaces any `False` value (such as an empty string) with
its first argument:
Benoît Knecht [Mon, 13 Dec 2021 13:41:23 +0000 (14:41 +0100)]
ceph-osd: Fix crush_rules.yml in check mode
Set a default value for `item.stdout` before passing it to `from_json()`. The
`when` condition doesn't prevent this template from being evaluated in check
mode, so it fails if `item.stdout` doesn't contain a valid JSON string.
That PR made the `command` module return `stdout` even in check mode (setting
it to the empty string), so `default()` has no effect in that case and
`from_json()` fails to parse an empty string.
Instead, `default()` needs to be invoked with its second argument set to
`True`, so that it replaces any `False` value (such as an empty string) with
its first argument:
John Karasev [Tue, 27 Apr 2021 20:52:48 +0000 (13:52 -0700)]
ceph-grafana: Add proxy env vars to grafana service template
When installing grafana plugins, the container will make http requests.
This requires http proxy otherwise installation cannot be performed. Passed
the proxy vars from all.yml as env args. Fixes: ceph#6484, ceph#6481 Signed-off-by: John Karasev <john.karasev@intel.com>
In the OpenStack context we let the integration tool (TripleO)
deal with repositories and packages.
This change just adds the with_pkg tag to allow TripleO skipping
both the repositories and packages installation.
Signed-off-by: Francesco Pantano <fpantano@redhat.com>
The current implementation is wrong.
ceph-ansible lists all existing buckets and try to create
an export for each of them.
Instead, it's easier to create the export at the user level.
In order to reduce need of module
internal maintenance and to join forces on plugin development,
it's proposed to switch to using upstream version of
config_template module.
As it's shipped as collection, it's installation for end-users
is trivial and aligns with general approach of shipping extra modules.
Benoît Knecht [Thu, 30 Dec 2021 14:08:08 +0000 (15:08 +0100)]
ceph-handler: Fix check mode
When running in check mode with one or more Ceph daemons that need to be
restarted, the `tmpdirpath.path` variable that several handlers rely on is
undefined, leading to fatal errors.
This commit ensures the tasks that require `tmpdirpath.path` are skipped when
it's undefined.
cephadm-adopt: ensure /etc/ceph is present on monitoring node
When deploying the monitoring stack on a dedicated node, the directory
`/etc/ceph` has never been created. Therefore, the play for adopting the
monitoring stack fails because it can't write the minimal config file.
Benoît Knecht [Tue, 26 Oct 2021 14:00:05 +0000 (16:00 +0200)]
roles/ceph-rgw: Support CRUSH device class
The pools created by `ceph-rgw` (listed in `rgw_create_pools`) now support a
`ec_crush_device_class` option to specify which device class the EC pool should
use.
It default to being omitted, which means it will use OSDs from any device class
by default.
since a variable encrypted with vault is no longer a string but a
encrypted object we can't use the filter | length, we have to convert it
to a string before.
Change needed in order to support --limit on mon nodes.
Otherwise, a call to `hostvars[groups[mon_group_name][0]]['_current_monitor_address']`
throws an error:
```
"The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute '_current_monitor_address'"
```
When adding host, using ansible_facts['default_ipv4']['address'] might
not be the desired network, we shouldn't enforce the subnet with the
default route.
Let's use the public_network instead.
When adding host, using `ansible_facts['default_ipv4']['address']` might
not be the desired network, we shouldn't enforce the subnet with the
default route.
Let's use the public_network instead.
The rbd mirroring is broken because cephadm doesn't bindmount /etc/ceph anymore.
It means the keyrings and ceph config file aren't available after the
migration.
The idea here is to remove the current rbd mirror peer and add it back
to the mon config store so we aren't bound to the /etc/ceph directory.
when using --limit osds, the play before and after osd upgrade are
skipped because we use `hosts: "{{ mon_group_name | default('mons') }}[0]"`
using `hosts: "{{ osds_group_name | default('osds') }}" with
`delegate_to` to the first monitor addresses this issue.
Add ceph_nfs_adopt tag to the cephadm-adopt playbook
There are existing OpenStack scenarios where nfs is still not managed
by cephadm. For this reason sometimes is useful skip the nfs part of
the adoption playbook and leave this daemon unmanaged.
The purpose of this patch is providing a tag to enable the OpenStack
operators to skip this playbook section.
Closes: https://bugzilla.redhat.com/2009212 Signed-off-by: Francesco Pantano <fpantano@redhat.com>