Dan Mick [Tue, 18 Feb 2025 23:21:00 +0000 (15:21 -0800)]
checkcerts.py: actually fix "send email"
argparse can't do a nargs="*" optional arg *and* check for its
presence; add a separate arg -E to send the email, and keep -e as
an optional list of addressees.
Also add the full path and host where checkcerts.py is running.
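A minimal sketch of the -E / -e split described above, not the actual
checkcerts.py code. The long option names, the help text, and the default
addressee are assumptions made for illustration.

```python
import argparse

# Placeholder default list; the real addressees are not given in the commit.
DEFAULT_ADDRESSEES = ["admin@example.com"]


def build_parser():
    # Hypothetical parser; only the -E / -e split comes from the commit message.
    parser = argparse.ArgumentParser(description="check certificate expiry")
    # -E is a plain flag: its presence alone means "send the email".
    parser.add_argument("-E", "--send-email", action="store_true",
                        help="send email instead of only printing")
    # -e optionally lists addressees; a bare -e and an absent -e both give [].
    parser.add_argument("-e", "--email", nargs="*", default=[],
                        help="addressees (defaults are used when empty)")
    return parser


def recipients(args):
    """Return the addressees to mail, or None when no email was requested."""
    if not args.send_email:
        return None
    return args.email or DEFAULT_ADDRESSEES
```

With this shape, -e on its own never triggers mail; only -E does, and -e
just overrides who receives it.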
Dan Mick [Wed, 18 Dec 2024 21:48:54 +0000 (13:48 -0800)]
openvpn/sepia/new-client: save a tarball of secret and secret.hash
Also, explain a little bit more about what new-client has done.
Hopefully this helps users understand and keep track of their secrets,
and streamlines diagnosing when things go wrong.
Ilya Dryomov [Tue, 1 Oct 2024 10:57:09 +0000 (12:57 +0200)]
testnode: move removing openmpi-common to apt_systems.yml
Otherwise, because it's placed in ubuntu.yml, it resets
packages_to_remove list defined in apt_systems.yml to just
openmpi-common and commit 701d3594d220 ("testnode: remove tgt")
doesn't take effect.
While at it, fix a typo in the comment -- it's mpich.
As a follow up for commit 67a92953a5a2 ("testnode: don't install tgt"),
explicitly remove tgt to cover the case of it already being there. This
also documents that we actively don't want tgt to be there.
We haven't done anything with tgt for over 10 years and tgt can
interfere with setting up ceph-iscsi targets, resulting in "Could not
create NetworkPortal in configFS" errors.
FOG capture requires dbus-uuidgen (in dbus-tools) and the presence of
/var/lib/dbus (from dbus-daemon) to do the bug-avoidance dance for
Satellite. It's not clear that bug-avoidance dance is still necessary,
but this is the minimal-investigation minimal-touch change to make the
capture process work.
Vallari Agrawal [Mon, 20 May 2024 03:34:39 +0000 (09:04 +0530)]
Fix "Unsupported parameters for (stat) module: get_md5"
Remove get_md5 because it was removed in Ansible 9:
https://github.com/ansible-community/ansible-build-data/blob/0dee49ac8a7674153606ddc6432d4029eb20172d/9/CHANGELOG-v9.rst#L5195
Dan Mick [Fri, 19 Apr 2024 02:15:46 +0000 (19:15 -0700)]
Remove unnecessary duplicate repo definitions for centos9
CentOS 8 had different .repo files for each major section (BaseOS,
AppStream, etc.). CentOS 9 has apparently moved to a single file,
centos.repo. This change 1) removes the management of separate repo files
for BaseOS and AppStream, since those repos are included in centos.repo,
and 2) stops using the perhaps-questionable single baseurl in favor of
the default metalink/mirrors setup.
There are errors occurring for teuthology tests on centos9 that may
be related to this, with the errors of the form "<pkg> from <repo>
does not belong to a distupgrade repository". As near as I can tell,
a "distupgrade repository" is one used only for upgrade, and I can't
find information on how exactly it's indicated, so I don't know if this
change will resolve the error or not.
Dan Mick [Wed, 13 Mar 2024 19:33:50 +0000 (12:33 -0700)]
checkcerts.py: certificate errors were not noted
When a certificate is already expired, its expiry was not noted
(loop exited early). This still doesn't explain the lack of early
warning, but at least it'll fix the "no email on actual errors" issue.
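A rough sketch of the kind of loop this fix implies, assuming a mapping of
hostname to expiry time. The function name, the warn_days threshold, and the
message wording are hypothetical and not taken from checkcerts.py.

```python
from datetime import datetime, timedelta, timezone


def collect_warnings(expiries, now, warn_days=14):
    """Return one message per host whose cert is expired or expiring soon.

    `expiries` maps hostname -> timezone-aware expiry datetime.
    """
    warnings = []
    for host, expires in sorted(expiries.items()):
        remaining = expires - now
        if remaining <= timedelta(0):
            # Record the expired cert and move on to the next host,
            # rather than leaving the loop before anything is noted.
            warnings.append(f"{host}: certificate EXPIRED on {expires:%Y-%m-%d}")
            continue
        if remaining <= timedelta(days=warn_days):
            warnings.append(f"{host}: certificate expires in {remaining.days} days")
    return warnings
```

The point is only that an already-expired certificate produces a message and
the remaining hosts are still checked.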
We can't leave /etc/machine-id blank; it breaks things, one of
which is the kernel install, which runs a postinstall script to update
/boot/loader/entries, which does nothing (silently) if there's
nothing in /etc/machine-id. Since it can come from the dbus id,
and does by default, and there's a command to generate the dbus
id, generate both, dbus first. This fixes the kernel postinstall.
I don't know if there should be any correlation between
machine-id and the subscription-manager/katello IDs.
Dan Mick [Fri, 2 Jun 2023 09:12:59 +0000 (02:12 -0700)]
testnode: Make sure PowerTools repo is enabled on CentOS
https://github.com/ceph/ceph-cm-ansible/pull/731 removed the
custom-made repo files that added mirrorlists; however, it also
removed the side-effect of enabling the Power Tools repo (which
is not enabled by default). This adds a call to dnf config-manager
to enable the repo, whatever its repo file name, on CentOS
testnodes.
Fixes: https://tracker.ceph.com/issues/59678
Signed-off-by: Dan Mick <dmick@redhat.com>
Dan Mick [Thu, 4 May 2023 07:58:02 +0000 (00:58 -0700)]
cephlab_ansible.sh: use scl rh-python38 on CentOS 7
cephlab_ansible.sh runs at the very end of the installation process
during a cobbler install for fog image capture, on first reboot of the
freshly-cobblered system.
Cobbler runs on a CentOS 7 installation today, but its python is too
old to support modern ansible. The SCL for python 3.8 is installed
on cobbler. Add code here to enable the SCL, if installed (by setting
some paths), in the trigger script that is executed on the cobbler server
after the installed host reboots; a curl fetch is placed at the end of
/etc/rc.local, and this script runs to finish up all the configuration
of the host for teuthology use.
Ken Dreyer [Fri, 21 Apr 2023 14:57:15 +0000 (10:57 -0400)]
public_facing: skip no-tabs linter rule on single task
Instead of skipping ansible-lint's no-tabs rule globally, skip it only on
this single task that uses a tab (\t) character.
Longer-term, we could replace this tab with a space because /etc/hosts
can use either whitespace character. I'm taking a cautious approach
today for simplicity.
Dan Mick [Thu, 20 Apr 2023 20:50:12 +0000 (13:50 -0700)]
Remove mirrorlists for CentOS 8
They were failing similarly to EPEL mirrorlists (old broken mirror
machines, out-of-date lists), so let's try going back to out-of-the-box
repo configurations. Perhaps several years later they'll work better.
Dan Mick [Thu, 20 Apr 2023 20:26:12 +0000 (13:26 -0700)]
Remove "switch back from mirrorlist" code for CentOS
The plan is to use mirrorlist exclusively (as we've done for
EPEL) because the upstream infra is changing more rapidly than
our fixed list of mirrors, and hopefully it's more stable than
it was in the past when we were driven to this coping mechanism
of caching mirror lists.
Ken Dreyer [Mon, 17 Apr 2023 19:28:22 +0000 (15:28 -0400)]
common: use ansible_distribution_major_version in epel repos
RHEL systems use roles/common/tasks/rhel-entitlements.yml, and this sets
Yum's $releasever to a specific RHEL minor release (e.g. 8.4 or 8.6). As
a result, Fedora's MirrorManager does not return any EPEL repositories
for these minor RHEL versions.
We set a static $releasever in rhel-entitlements.yml so that we pin to
old RHEL RPM content in our old RHEL nodes. We probably need to re-think
this strategy since our CentOS Stream nodes do not (cannot) do this, and
Red Hat does not really support pinning to old versions without an EUS
subscription.
Rather than untangling all that and removing our $releasever
manipulation altogether, this commit simply hard-codes
ansible_distribution_major_version ("8", "9", etc) into the EPEL .repo
files, ignoring $releasever for EPEL.
A longer-term fix would be to stop mangling $releasever on RHEL.
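To illustrate the idea in Python (the real change lives in the EPEL .repo
templates, which are not shown here): take only the major version and build
the metalink query from it. The function name is hypothetical, and the URL
form is the usual MirrorManager metalink endpoint, assumed rather than copied
from this repository.

```python
from urllib.parse import urlencode

# Assumed MirrorManager metalink endpoint.
METALINK_BASE = "https://mirrors.fedoraproject.org/metalink"


def epel_metalink(distribution_version, arch="$basearch"):
    """Build an EPEL metalink URL keyed on the major version only."""
    # "8.6" -> "8", so a pinned minor release no longer reaches the query.
    major = distribution_version.split(".")[0]
    # Keep "$" unescaped so a yum variable such as $basearch survives.
    query = urlencode({"repo": f"epel-{major}", "arch": arch}, safe="$")
    return f"{METALINK_BASE}?{query}"
```

Whatever $releasever is pinned to on a RHEL node, the EPEL query then asks
for epel-8 or epel-9, which MirrorManager knows about.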
Ken Dreyer [Wed, 12 Apr 2023 18:55:02 +0000 (14:55 -0400)]
common: use EPEL metalink
Some mirrors are stale (https://pagure.io/fedora-infrastructure/issue/11233)
Use MirrorManager's metalink application so we always get up-to-date
mirrors.
MirrorManager will also return the list of mirrors that carry each
architecture (x86_64, aarch64, etc) so we will not need to manage that
information ourselves here.
Dan Mick [Wed, 15 Feb 2023 04:24:04 +0000 (20:24 -0800)]
Add checkcerts, to use with cron to warn about expiring certs
This originally lived on gitbuilder-archive, and I've moved it,
revamped it, added some args, added some hosts, modified some emails,
ported to Py3, stopped using external programs. It's quick to run
in default mode where it just reports to the terminal; it'll also
be quiet and only send email about old certs.
The timezone processing is nonexistent on the reported expiry
date; Python timezone handling is a mess. That could be improved but
not without a deep dive.
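One possible starting point for the timezone improvement, offered as a sketch
rather than what checkcerts.py does: the standard library's
ssl.cert_time_to_seconds() parses the GMT notAfter string found in the dict
returned by SSLSocket.getpeercert(), and the result can be turned into an
aware UTC datetime. The helper name is hypothetical.

```python
import ssl
from datetime import datetime, timezone


def parse_not_after(not_after):
    """Convert a certificate notAfter string to an aware UTC datetime."""
    # cert_time_to_seconds expects the "%b %d %H:%M:%S %Y GMT" form and
    # returns seconds since the epoch.
    seconds = ssl.cert_time_to_seconds(not_after)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

An aware datetime can then be compared against datetime.now(timezone.utc)
without guessing at the local offset.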