David Galloway [Wed, 2 Dec 2020 22:46:25 +0000 (17:46 -0500)]
Move container tasks to separate role
I would've liked to keep all this in common, but there's a chicken-and-egg situation.
docker and/or podman get installed during the testnode role. The testnode role can only be run after the common role. The testnode role is also where some repos are added.
So we need to install docker/podman and configure it after the testnode role runs. Since we also want to be able to configure docker/podman on other systems, I couldn't put these tasks in the testnode role.
Signed-off-by: David Galloway <dgallowa@redhat.com>
David Galloway [Tue, 1 Dec 2020 18:23:10 +0000 (13:23 -0500)]
testnode: Basic filesystem and mountpoint support
Fixes: https://tracker.ceph.com/issues/6373
This will really only be useful if `drives_to_partition` or `logical_volumes` gets overridden in the ansible.cephlab overrides in teuthology yaml. e.g.,
David Galloway [Wed, 25 Nov 2020 14:56:58 +0000 (09:56 -0500)]
common: Support smart.sh reporting a failed drive
The SSDs in the bruuni, specifically, have mostly catastrophically failed. There are no details about sectors in the SMART output. Just obvious failure. `smart.sh` wasn't capable of detecting that.
Signed-off-by: David Galloway <dgallowa@redhat.com>
Dan Mick [Tue, 25 Aug 2020 20:31:43 +0000 (20:31 +0000)]
teuthology nginx: add gzip_ variables to allow compressed responses
In particular, `gzip_static on` allows the server to return file.gz
(if compression is allowable) for files that are requested as "file".
This handles the teuthology.log.gz files that are still referred to as
teuthology.log from pulpito pages.
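
The lookup described above can be modelled roughly as follows. This is an illustrative Python sketch, not nginx code: the function name `resolve_static` is hypothetical, and the Accept-Encoding check is deliberately simplified (it ignores q-values).

```python
import os


def resolve_static(root, request_path, accept_encoding=""):
    """Model of the gzip_static lookup (hypothetical helper, not nginx code).

    Returns (path_on_disk, content_encoding), or None if nothing can be served.
    """
    plain = os.path.join(root, request_path.lstrip("/"))
    precompressed = plain + ".gz"

    # Simplified assumption: any mention of "gzip" means the client accepts it.
    client_accepts_gzip = "gzip" in accept_encoding.lower()

    # Prefer the precompressed sibling when the client can handle it.
    if client_accepts_gzip and os.path.isfile(precompressed):
        return precompressed, "gzip"

    # Otherwise fall back to the file exactly as requested.
    if os.path.isfile(plain):
        return plain, None

    return None
```

In this model, a request for teuthology.log from a gzip-capable client resolves to teuthology.log.gz even when the uncompressed file no longer exists on disk.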
David Galloway [Thu, 11 Jun 2020 19:14:53 +0000 (15:14 -0400)]
generate-fog-csv: Only print a line if the MAC is defined
There are some hosts that are "cobbler_managed" but don't have a MAC and don't need one. Ansible doesn't help us out here by printing WHICH host doesn't have a MAC; it just throws an error.
Also, there's no need to gather facts on localhost, so I set that to `false`.
Signed-off-by: David Galloway <dgallowa@redhat.com>
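
The "only print a line if the MAC is defined" behaviour might look like this in outline. This is a hedged Python sketch rather than the playbook itself: the function name `fog_csv_lines`, the `mac` key, and the two-column layout are assumptions made for illustration and do not reflect the real FOG import format.

```python
import csv
import io


def fog_csv_lines(hosts):
    """Build CSV text for hosts that have a MAC (hypothetical helper).

    `hosts` maps hostname -> dict of host variables; the "mac" key and the
    two-column output (mac, hostname) are assumptions for this sketch.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for name in sorted(hosts):
        mac = hosts[name].get("mac")
        if not mac:
            # Skip hosts with no MAC instead of failing the whole run.
            continue
        writer.writerow([mac, name])
    return buf.getvalue()
```

Skipping silently keeps one MAC-less host from aborting the whole export, which is the failure mode the commit message describes.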
David Galloway [Fri, 15 May 2020 14:37:55 +0000 (10:37 -0400)]
cobbler: Print the date during rc.local
This would have been particularly useful before https://tracker.ceph.com/issues/45341 got fixed, so I could say *definitively* that the NIC bouncing was to blame for job failures, but it can't hurt to have it now.
Signed-off-by: David Galloway <dgallowa@redhat.com>
David Galloway [Thu, 30 Apr 2020 18:11:48 +0000 (14:11 -0400)]
testnode: Don't attempt to install base or core
These groups are already taken care of during Cobbler installation. We're hitting https://bugzilla.redhat.com/show_bug.cgi?id=1782899 in RHEL8 so let's not bother trying to reinstall during ceph-cm-ansible.
Signed-off-by: David Galloway <dgallowa@redhat.com>
David Galloway [Thu, 30 Apr 2020 14:48:11 +0000 (10:48 -0400)]
common: Streamline RHEL repo registration
Same thing here as the last commit. Satellite has slowed down significantly, so the first repo would get enabled successfully, but the Satellite DB hadn't caught up yet, so subsequent repo enablements would throw 500 errors. This enables all the appropriate repos at the same time, reducing the number of times we have to interact with Satellite.
Signed-off-by: David Galloway <dgallowa@redhat.com>
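
The idea of enabling every repo in one round trip can be illustrated with a small Python sketch. It is not the role's actual task: the helper name `build_enable_command` is hypothetical, and it only builds the argument list for a single `subscription-manager repos` call with one `--enable` option per repo, without running anything.

```python
import shlex


def build_enable_command(repo_ids):
    """Return one argv that enables all given repos at once (hypothetical helper)."""
    if not repo_ids:
        raise ValueError("at least one repo ID is required")
    argv = ["subscription-manager", "repos"]
    # One --enable option per repo, all in the same invocation.
    argv += [f"--enable={repo_id}" for repo_id in repo_ids]
    return argv


if __name__ == "__main__":
    # Example repo IDs only; the real list depends on the distro and release.
    example = ["rhel-8-for-x86_64-baseos-rpms", "rhel-8-for-x86_64-appstream-rpms"]
    print(shlex.join(build_enable_command(example)))
```

A single invocation means a single interaction with Satellite, rather than one per repo.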
David Galloway [Thu, 30 Apr 2020 14:37:49 +0000 (10:37 -0400)]
common: Set release when registering
Because we're running Satellite in a VM in RHV on Gluster, we get really slow I/O performance. Sadly *well* below the recommended performance: https://access.redhat.com/solutions/3397771
When attempting to set the release in a task later in this tasklist, Satellite would throw 500 errors because it was busy still trying to let the postgres DB know the system had been registered.
Setting the release during registration will save minutes per registration.
Signed-off-by: David Galloway <dgallowa@redhat.com>
David Galloway [Sun, 26 Apr 2020 15:48:00 +0000 (11:48 -0400)]
testnode: Fix CentOS repo mirrorlists
I think there used to be individual directories with the ansible_lsb.release in the path but now there only seems to be "8" and "8.1.1911" so we'll just use 8.
Signed-off-by: David Galloway <dgallowa@redhat.com>
David Galloway [Fri, 24 Apr 2020 18:20:56 +0000 (14:20 -0400)]
testnode: Fix pip vars for CentOS/RHEL8
Apparently ansible is smart enough to figure this out on CentOS/RHEL8 (python3 distros) but not Ubuntu Focal. So when I manually set the vars for a python3 distro and only included Fossa, it broke CentOS8.
Signed-off-by: David Galloway <dgallowa@redhat.com>
David Galloway [Thu, 2 Apr 2020 15:25:03 +0000 (11:25 -0400)]
dhcp-server: Only create host declaration if its IP is in subnet we're writing
In the Octo lab, we have multiple subnets with the same domain and VLAN. This is problematic because in the `dhcp_subnets` dict, we only anticipate there being one cidr.
This change will let you have multiple "subnets" in `dhcp_subnets` without fiddling with the ansible inventory ipvars or macvars.
Signed-off-by: David Galloway <dgallowa@redhat.com>
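
The subnet membership check at the heart of this change can be sketched with Python's standard `ipaddress` module. This is an illustration under assumptions, not the role's template logic: the function name `hosts_in_subnet` and the sample addresses in the note below are hypothetical.

```python
import ipaddress


def hosts_in_subnet(cidr, host_ips):
    """Return only the hosts whose IP falls inside `cidr` (hypothetical helper).

    `host_ips` maps hostname -> IP address string.
    """
    network = ipaddress.ip_network(cidr)
    return {
        name: ip
        for name, ip in host_ips.items()
        if ipaddress.ip_address(ip) in network
    }
```

With hypothetical hosts at 10.0.1.5 and 10.0.2.7, calling the function once per subnet (10.0.1.0/24, then 10.0.2.0/24) yields one host each, so each subnet block would only receive the declarations for its own hosts.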
David Galloway [Sun, 22 Mar 2020 15:28:02 +0000 (11:28 -0400)]
nameserver: Support forwarders
When setting up this nameserver role in Octo, recursive lookups were failing. I suspect maybe BIND is doing an `NS` lookup when it doesn't know about a domain it is asked about. Red Hat blocks all external DNS queries so I've defined an internal DNS server for the Octo BIND server to forward to. Now external lookups work.
Signed-off-by: David Galloway <dgallowa@redhat.com>