Damian Dabrowski [Tue, 12 May 2026 19:17:37 +0000 (21:17 +0200)]
Make sure CEPH_CONTAINER_IMAGE is fixed everywhere
In [1], `CEPH_CONTAINER_IMAGE` was fixed to work properly with
ansible-core 2.19, where the way ansible handles environment variables
with "None" value has changed.
Unfortunately, `CEPH_CONTAINER_IMAGE` was left unchanged in a few
places.
This patch fixes that by making sure that the value of
`CEPH_CONTAINER_IMAGE` is correct everywhere.
The recently introduced ansible-lint rule name[unique] [1] raises
errors because task names are duplicated within the same playbooks.
Fix the task names so that they are unique.
Christian Brandt [Tue, 16 Sep 2025 15:33:55 +0000 (17:33 +0200)]
ceph-pool: pass pg_num for pg_autoscale_mode warn
The pg_autoscale_mode parameter has three possible values 'on', 'off', and
'warn'. In case of 'warn' it should be possible to pass the number of
placement groups (pg_num) to the create_pool function. Nothing is lost by
allowing this. The health check will still warn if the value needs
adjustment.
The evaluation of target_size_ratio works as before.
Add a test iterating through possible value combinations of the parameters
target_size_ratio, pg_num and pg_autoscale_mode.
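A minimal sketch of the behaviour described above, not the module's
actual code. The helper name pool_create_params and the returned dict
are hypothetical, and forwarding target_size_ratio whenever it is set
is an assumption, since this message only says its evaluation is
unchanged. The loop mirrors the idea of iterating through the value
combinations.

```python
from itertools import product

MODES = ("on", "off", "warn")


def pool_create_params(pg_autoscale_mode, pg_num=None, target_size_ratio=None):
    """Hypothetical helper: pick the parameters to forward on pool creation."""
    params = {"pg_autoscale_mode": pg_autoscale_mode}
    # 'off' and 'warn' both allow an explicit placement group count.
    if pg_autoscale_mode in ("off", "warn") and pg_num is not None:
        params["pg_num"] = pg_num
    # Assumption: target_size_ratio is forwarded whenever it is set.
    if target_size_ratio is not None:
        params["target_size_ratio"] = target_size_ratio
    return params


# Iterate through every combination of the three parameters.
for mode, pg_num, ratio in product(MODES, (None, 32), (None, 0.2)):
    params = pool_create_params(mode, pg_num, ratio)
    assert ("pg_num" in params) == (mode != "on" and pg_num is not None)
    assert ("target_size_ratio" in params) == (ratio is not None)
```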
Signed-off-by: Christian Brandt <cbrandt@strato.de>
John Fulton [Fri, 7 Feb 2025 18:08:03 +0000 (13:08 -0500)]
Adopt with grafana_network not grafana_server_addr
The networks list in a spec is usually a list of ranges,
not a single IP. The grafana_server_addr is a fact created
from the grafana_network range so it is a more appropriate
parameter to pass to the spec.
John Fulton [Thu, 6 Feb 2025 00:01:08 +0000 (19:01 -0500)]
Use grafana_server_addr to set prometheus networks list
When dashboard is enabled and module ceph_orch_apply is
called, if the grafana_server_addr is defined, then it
is used to populate the networks list in the spec of type
alertmanager. This is the case without this patch. With
this patch the same logic is applied to the spec of type
prometheus. Also, if the grafana_server_addr is a comma
delimited list, then a jinja2 expression handles passing
the IPs as a list.
Without this patch prometheus binds to all networks even
if grafana_server_addr is set which can create conflicts
with other services.
Fixes: https://bugzilla.redhat.com/2269009
Signed-off-by: John Fulton <fulton@redhat.com>
John Fulton [Thu, 16 Jan 2025 21:30:20 +0000 (16:30 -0500)]
Handle radosgw hosts placement with non-default cluster name
In cephadm-adopt.yml, TASK "Update the placement of radosgw hosts"
fails to handle the case where the Ansible var cluster is something
other than "ceph". This patch adds that handling.
Update module ceph_orch_apply to support optional cluster
parameter using the same style as in module ceph_config.
The command is only extended to include the new keyring
and config options if the cluster name is not "ceph".
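A rough sketch of that conditional extension, under stated
assumptions: the function name orch_apply_cmd, the base command and
the /etc/ceph/<cluster>.* paths are illustrative guesses and are not
taken from the module itself.

```python
def orch_apply_cmd(cluster="ceph"):
    """Hypothetical helper: build the command used to apply a spec."""
    cmd = ["cephadm", "shell"]
    if cluster != "ceph":
        # Assumed file locations for a cluster with a custom name.
        cmd += [
            "--keyring", f"/etc/ceph/{cluster}.client.admin.keyring",
            "--config", f"/etc/ceph/{cluster}.conf",
        ]
    # Assumed base command; the spec itself would be read from stdin.
    cmd += ["--", "ceph", "orch", "apply", "-i", "-"]
    return cmd
```

With the default cluster name the command is left untouched, so
existing deployments named "ceph" see no change.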
This patch is necessary to migrate older clusters which were
deployed when custom names were supported.
Closes: https://issues.redhat.com/browse/RHCEPH-10442
Signed-off-by: John Fulton <fulton@redhat.com>
John Fulton [Thu, 16 Jan 2025 20:50:50 +0000 (15:50 -0500)]
Handle adoption when radosgw_address_block is comma delimited list
In cephadm-adopt.yml TASK "Update the placement of radosgw hosts"
passes module ceph_orch_apply embedded YAML via a block scalar.
This YAML creates a Ceph spec of service_type RGW. The networks
key of this spec supports either a list or a string. Without this
patch, the networks key of the spec will only contain a string.
With this patch, a string is still set for the networks key when
Ansible var radosgw_address_block contains no commas. If it does
contain commas, radosgw_address_block is split on them and the
networks key of the spec is set to the resulting list.
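The playbook does this with a Jinja2 expression. The same rule can be
sketched in Python as follows. The function name spec_networks is
hypothetical, and stripping whitespace around each entry is an
assumption not stated in this message.

```python
def spec_networks(radosgw_address_block):
    """Return the value for the spec's networks key (string or list)."""
    if "," in radosgw_address_block:
        # Comma delimited: split into a list of ranges.
        # Assumption: surrounding whitespace is not significant.
        return [part.strip() for part in radosgw_address_block.split(",")]
    # No comma: keep the single string, as before the patch.
    return radosgw_address_block
```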
Closes: https://issues.redhat.com/browse/RHCEPH-10418
Signed-off-by: John Fulton <fulton@redhat.com>
Don't try to set devices fact when osd_auto_discovery was skipped
Right now, under certain OS and Ansible versions, e.g. Rocky Linux and
ansible-core 2.17, the `devices_check` variable gets defined even if
the task was skipped.
This causes set_fact to fail, as the resulting variable has no
`results` key in it.
The structure of such a variable looks like this:
```
"devices_check": {
"changed": false,
"false_condition": "osd_auto_discovery | default(False) | bool",
"skip_reason": "Conditional result was False",
"skipped": true
}
```
Checking that the task was not skipped solves the issue.
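The guard can be illustrated in Python with the structure shown above.
This is only a sketch: the function name results_or_empty is
hypothetical, and the playbook itself expresses the check as a task
condition rather than Python code.

```python
def results_or_empty(devices_check):
    """Return the registered results, or nothing if the task was skipped."""
    # A skipped task still registers a variable, but without 'results'.
    if devices_check.get("skipped", False):
        return []
    return devices_check["results"]


skipped = {
    "changed": False,
    "false_condition": "osd_auto_discovery | default(False) | bool",
    "skip_reason": "Conditional result was False",
    "skipped": True,
}
assert results_or_empty(skipped) == []
```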
devices: test devices before collecting on auto discovery
In some scenarios with NVMe, a device might be identified by
Ansible but could actually be a multipath device rather than an
actual device. We need to exclude these as Ceph cannot create
OSDs on them.
cephadm-adopt: Fixes binding network for alertmanager
Alertmanager was bound to the default * network instead of
grafana_server_addr as it was before. From now on, if
grafana_server_addr is defined, Alertmanager is bound to that network.
Seena Fallah [Mon, 10 Jun 2024 10:11:55 +0000 (12:11 +0200)]
ceph-handler: use haproxy maintenance for rgw restarts
RGW currently restarts without waiting for existing connections to
close. By adjusting the HAProxy weight before the restart, we can
ensure that no active connections are disrupted during the restart
process.
Add new module ceph_orch_spec which applies ceph spec files.
This feature was needed to bind extra mount points to the RGW
container (/etc/pki/ca-trust/).
Teoman ONAY [Wed, 15 May 2024 14:04:04 +0000 (16:04 +0200)]
Fix cephadm-adopt test scenario
Fixes the cephadm-adopt test scenario by configuring the rbd application
on the test and test2 pools. Otherwise cephadm-adopt fails at task
"Check pools have an application enabled".
Seena Fallah [Tue, 7 May 2024 18:41:53 +0000 (20:41 +0200)]
ceph-rgw: introduce rgw zone to the name schema
This is needed by ceph-exporter as it parses the socket name by the number of dots.
Note that the rgw_zone variable is only used for constructing the client name
and has nothing to do with multisite.
Although custom cluster name support was dropped, it breaks ceph-volume
functional testing as it uses "test" as cluster name.
The plan is to make ceph-volume use "ceph" but for now it's easier to
address the issue in this task.
Seena Fallah [Sat, 16 Mar 2024 15:07:02 +0000 (16:07 +0100)]
radosgw_zonegroup: parse master as boolean
The payload returned by rgw has is_master as a boolean. With master passed as a string, the comparison always reported a change and the module tried to modify the zonegroup on every run.
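A small sketch of why the string form never matches, assuming a
simplified comparison. The helper names to_bool and needs_update are
hypothetical and do not come from the module.

```python
def to_bool(value):
    """Hypothetical helper: normalise a user-supplied flag to a boolean."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("true", "yes", "1")


def needs_update(current, desired_master):
    """Compare the payload's boolean is_master with the requested value."""
    return current["is_master"] != to_bool(desired_master)


current = {"is_master": True}
# A raw string never equals the boolean, so a change is always reported.
assert current["is_master"] != "true"
# After parsing, the values match and no modification is attempted.
assert not needs_update(current, "true")
```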
Seena Fallah [Wed, 6 Mar 2024 12:37:06 +0000 (13:37 +0100)]
site: install mgrs with mons if sharing the same host
If mgr is meant to be installed on the mon host, it needs to be installed in the same playbook, as restart handlers might otherwise fail because of a non-existent mgr.
Seena Fallah [Sun, 18 Feb 2024 02:41:41 +0000 (03:41 +0100)]
container: cleanup container systemd units
* Move the common params of the container args into a var to avoid duplication
* The /var/lib/ceph/crash mount was missing after 637ca81c9cf801e4d1d125dc8a2492b90fd78eea
* Add CEPH_USE_RANDOM_NONCE as it's needed when running inside a container (can be removed for squid later)
* Add NODE_NAME as some parts of the ceph code rely on this var
* add default logging opts for