--- /dev/null
+---
+# ansible-lint configuration: every rule listed under skip_list is
+# suppressed repo-wide.  Remove entries from this list as individual
+# roles are cleaned up to satisfy the corresponding rule.
+skip_list:
+  - command-instead-of-module
+  - command-instead-of-shell
+  - deprecated-command-syntax
+  - deprecated-local-action
+  - empty-string-compare
+  - experimental
+  - fqcn[action-core]
+  - fqcn[action]
+  - git-latest
+  - jinja
+  - literal-compare
+  - load-failure
+  - meta-no-info
+  - name[casing]
+  - no-changed-when
+  - no-handler
+  - no-jinja-when
+  - no-relative-paths
+  - package-latest
+  - risky-file-permissions
+  - risky-shell-pipe
+  - role-name
+  - unnamed-task
--- /dev/null
+name: tests
+
+on: [push, pull_request]
+
+jobs:
+  syntax-check:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Install ansible
+        run: |
+          sudo apt-get update
+          # -y keeps apt-get from prompting for confirmation; there is
+          # no tty in CI, so an interactive prompt would abort the step.
+          sudo apt-get -y purge ansible
+          sudo apt-get -y install python3-setuptools
+          pip3 install ansible --user
+      - name: ansible-playbook syntax check
+        run: |
+          export PATH=$PATH:$HOME/.local/bin
+          # The vault password file is not available in CI; drop the
+          # setting from ansible.cfg so ansible does not fail looking
+          # for it.
+          sed -i /^vault_password_file/d ansible.cfg
+          ansible-playbook -i localhost, cephlab.yml --syntax-check
+  ansible-lint:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Install ansible-lint
+        run: |
+          sudo apt-get update
+          # -y: see the note in the syntax-check job above.
+          sudo apt-get -y purge ansible
+          sudo apt-get -y install python3-setuptools
+          # This pinned ansible version should match teuthology's
+          # requirements.txt.
+          # And we choose an ansible-lint version to be compatible with this
+          # Ansible version.
+          pip3 install ansible==2.10.7 ansible-lint[core]==5.4.0 --user
+      - name: Run ansible-lint
+        run: |
+          export PATH=$PATH:$HOME/.local/bin
+          ansible-lint -v roles/*
--- /dev/null
+# Editor swap files, local virtualenvs, and compiled Python bytecode.
+*.swp
+virtualenv
+*.pyc
--- /dev/null
+Overview
+========
+
+This project is meant to store ansible roles for managing the nodes in the ceph
+testing labs.
+
+Inventory
+=========
+
+As this repo only contains roles, it does not define the ansible inventory or
+any associated group_vars or host_vars. However, it does depend on these
+things existing in a separate repository or otherwise accessible by these roles
+when they are used. Any vars a role needs should be added to its
+``defaults/main.yml`` file to document what must be defined per node or group
+in your inventory.
+
+This separation is important because we have multiple labs we manage with these
+same roles and each lab has different configuration needs. We call these our
+``secrets`` or ``*-secrets`` repos throughout the rest of the documentation and
+in the roles.
+
+Besides the inventory, ``secrets`` repos also may contain certain secret or
+encrypted files that we can not include in ceph-cm-ansible for various reasons.
+
+The directory structure for one of our ``secrets`` repos is::
+
+ ├── ansible
+ ├── inventory
+ │ ├── group_vars
+ │ │ ├── all.yml
+ │ │ ├── cobbler.yml
+ │ │ ├── testnodes.yml
+ │ │ ├── teuthology.yml
+ │ │ └── typica.yml
+ │ └── sepia
+ └── secrets
+ └── entitlements.yml
+
+Refer to Step 2 below for instructions on how to setup a ``secrets`` repo for
+use by ceph-cm-ansible. If set up this way, -i is not necessary for
+ansible-playbook to find the repo. However, you can choose your own setup and
+point to the ``secrets`` repo with -i if you prefer.
+
+**NOTE:** Some playbooks require specific groups to be defined in your
+inventory. Please refer to ``hosts`` in the playbook you want to use to ensure
+you've got the proper groups defined.
+
+Where should I put variables?
+-----------------------------
+
+All variables should be defined in ``defaults/main.yml`` for the role they're
+primarily used in. If the variable you're adding can be used in multiple roles
+define it in ``defaults/main.yml`` for both roles. If the variable can contain
+a reasonable default value that should work for all possible labs then define
+that value in ``defaults/main.yml`` as well. If not, you should still default
+the variable to something, but make the tasks that use the variable either fail
+gracefully without that var or prompt the user to define it if it's mandatory.
+
+If the variable is something that might need to be defined with a value
+specific to the lab in use, then it'll need to be added to your ``secrets``
+repo as well. Variables in ``group_vars/all.yml`` will apply to all nodes
+unless a group_var file exists that is more specific for that node. For
+example, if you define the var ``foo: bar`` in ``all.yml`` and the node you're
+running ansible against exists in the ``testnodes`` group and there is a
+``group_vars/testnodes.yml`` file defined with ``foo: baz`` included in it then
+the role using the variable will use the value defined in ``testnodes.yml``.
+The playbook you're using knows which group_var file to use because of the
+``hosts`` value defined for it.
+
+
+Setting up a local dev environment
+==================================
+
+We assume that your SSH key is present and active for passwordless access to
+the "ubuntu" shell user on the hosts that ansible will manage.
+
+Step 1: Install ansible
+-----------------------
+
+You can use pip::
+
+ pip install ansible
+
+or use the OS package manager::
+
+ yum install ansible
+
+Step 2: Set up secrets repository
+---------------------------------
+
+Clone the secrets repository and symlink the ``hosts`` and ``secrets``
+directories into place::
+
+ cd $HOME/src/
+ git clone git@github.com:ceph/ceph-sepia-secrets.git
+
+ # If needed, get the path for ceph-octo-secrets from a downstream dev
+
+ sudo mv /etc/ansible/hosts /etc/ansible/hosts.default
+
+ sudo ln -s ~/src/ceph-sepia-secrets/ansible/inventory /etc/ansible/hosts
+ sudo ln -s ~/src/ceph-sepia-secrets/ansible/secrets /etc/ansible/secrets
+
+Step 3: Clone the main Ceph ansible repo
+----------------------------------------
+
+Clone the main Ceph ansible repository::
+
+ git clone git@github.com:ceph/ceph-cm-ansible.git
+ cd ceph-cm-ansible
+
+Step 4 (Optional): Modify ``hosts`` files
+-----------------------------------------
+
+If you have any new hosts on which you'd like to run ansible, or if you're
+using separate testing VMs, edit the files in ``/etc/ansible/hosts`` to add
+your new (or testing) hosts::
+
+ vi /etc/ansible/hosts/<labname>
+
+If you don't need to test on any new hosts, you can skip this step and just use
+``/etc/ansible/hosts`` as-is.
+
+Step 5: Run ``ansible-playbook``
+--------------------------------
+
+You can now run ``ansible-playbook``::
+
+ vi myplaybook.yml
+ ansible-playbook myplaybook.yml -vv --check --diff
+
+This will print a lot of debugging output to your console.
+
+Adding a new host to ansible
+============================
+
+Ansible runs using the "cm" shell account.
+
+Let's say you've created a new VM host using downburst. At this point you
+should have a new VM with the "ubuntu" UID present. The problem is that Ansible
+uses the "cm" user. In order to get that UID set up:
+
+1. Add your host to the inventory. Look in your lab's ``secrets`` repository,
+ in the ``ansible/inventory/`` directory, and add your new node.
+
+2. Run the ``cephlab.yml`` playbook, limited to your new host "mynewhost"::
+
+ ansible-playbook -vv --limit mynewhost cephlab.yml
+
--- /dev/null
+[defaults]
+ansible_managed = This file is managed by ansible, don't make changes here - they will be overwritten.
+# this works when testing from my laptop, but will need to
+# be changed when it lives in a production environment
+vault_password_file = ~/.vault_pass.txt
+# Connection timeout, in seconds.
+timeout = 120
+# Print per-task timing information at the end of each playbook run.
+callback_whitelist = profile_tasks
+# default is 0.001, resulting in a storm of select(NULL, ..., 1ms) syscalls
+internal_poll_interval = 0.01
+
+[ssh_connection]
+# Retry failed SSH connection attempts this many times before giving up.
+retries = 5
--- /dev/null
+---
+# a playbook to create the necessary users, groups and
+# sudoer settings needed for ansible to manage a node.
+- hosts: all
+  # "free" lets each host run through its tasks without waiting for the
+  # slowest host in the batch.
+  strategy: free
+  # this used to be set to ubuntu but the {{ cm_user }} is the only
+  # user that gets created during kickstart
+  vars:
+    ansible_ssh_user: "{{ cm_user }}"
+  roles:
+    - ansible-managed
+  become: true
--- /dev/null
+"""
+This callback plugin writes ansible failures to a log as yaml. This way you
+can parse the file later and use the ansible failures for other reporting
+or logging.
+
+A log will not be written unless the environment variable ANSIBLE_FAILURE_LOG
+is present and contains a path to a file to write the log to.
+"""
+import yaml
+import os
+import logging
+
+import ansible
+ANSIBLE_MAJOR = int(ansible.__version__.split('.')[0])
+
+if ANSIBLE_MAJOR >= 2:
+ from ansible.plugins.callback import CallbackBase as callback_base
+else:
+ callback_base = object
+
+# Add a default representer so that we don't crash upon encountering
+# instances of AnsibleUnicode or AnsibleUnsafeText
+def default_representer(dumper, data):
+    # Serialize any otherwise-unhandled object as its str() form,
+    # tagged as a plain YAML string.
+    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data))
+
+yaml.SafeDumper.add_representer(None, default_representer)
+
+log = logging.getLogger(__name__)
+# We only want to log if this env var is populated with
+# a file path of where the log should live.
+fail_log = os.environ.get('ANSIBLE_FAILURE_LOG')
+if fail_log:
+    handler = logging.FileHandler(filename=fail_log)
+    log.addHandler(handler)
+
+
+def log_failure(host, result):
+ """
+ If the environment variable ANSIBLE_FAILURE_LOG is present
+ a log of all failures in the playbook will be persisted to
+ the file path given in ANSIBLE_FAILURE_LOG.
+ """
+ if fail_log:
+ failure = {"{0}".format(host): dict()}
+ failure[host] = result
+ try:
+ log.error(yaml.safe_dump(failure))
+ except Exception:
+ log.exception("Failure object was: %s", str(failure))
+
+
+class CallbackModule(callback_base):
+ """
+ This Ansible callback plugin writes task failures to a yaml file.
+ """
+ CALLBACK_VERSION = 2.0
+ CALLBACK_TYPE = 'notification'
+ CALLBACK_NAME = 'failure_log'
+
+ def runner_on_failed(self, host, result, ignore_errors=False):
+ """
+ A hook that will be called on every task failure.
+ """
+ if ignore_errors:
+ return
+ try:
+ log_failure(host, result)
+ except:
+ import traceback
+ traceback.print_exc()
+
+ def runner_on_unreachable(self, host, result):
+ """
+ A hook that will be called on every task that is unreachable.
+ """
+ log_failure(host, result)
--- /dev/null
+---
+# ensure the node is setup to be managed by ansible
+# eventually, most of the things here will be done by
+# cobbler / downburst / cloud-init.
+- import_playbook: ansible_managed.yml
+
+# if this node is in the teuthology group, configure it
+- import_playbook: teuthology.yml
+
+# Record that we are running from this top-level playbook; the fact
+# gates the /ceph-qa-ready marker at the end of this file.
+- hosts: testnodes
+  tasks:
+    - set_fact:
+        ran_from_cephlab_playbook: true
+
+# if this node is in the testnode group, configure it
+- import_playbook: testnodes.yml
+
+# a number of different groups get docker/podman installed and configured
+- import_playbook: container-host.yml
+
+# if this node is in the pcp group, configure it
+# NOTE(review): the pcp import is commented out - confirm this is still
+# intentional.
+#- import_playbook: pcp.yml
+
+# if this node is in the cobbler group, configure it
+- import_playbook: cobbler.yml
+
+# if this node is in the paddles group, configure it
+- import_playbook: paddles.yml
+
+# if this node is in the pulpito group, configure it
+- import_playbook: pulpito.yml
+
+# Touch a file to indicate we are done. This is something chef did;
+# teuthology.task.internal.vm_setup() expects it.
+- hosts: testnodes
+  become: true
+  tasks:
+    - name: Touch /ceph-qa-ready
+      file:
+        path: /ceph-qa-ready
+        state: touch
+      when: ran_from_cephlab_playbook|bool
--- /dev/null
+---
+# Configure cobbler provisioning hosts: common setup, the cobbler role,
+# then one cobbler_profile invocation per supported distro image.  Each
+# profile entry carries a tag so a single distro can be (re)imported
+# with --tags.
+- hosts: cobbler
+  roles:
+    - common
+    - cobbler
+    - { role: cobbler_profile, distro_name: inktank-rescue, tags: ['inktank-rescue'] }
+    - { role: cobbler_profile, distro_name: dban-2.3.0-autonuke, tags: ['dban-autonuke'] }
+    - { role: cobbler_profile, distro_name: RHEL-6.6-Server-x86_64, tags: ['rhel6.6'] }
+    - { role: cobbler_profile, distro_name: RHEL-6.7-Server-x86_64, tags: ['rhel6.7'] }
+    - { role: cobbler_profile, distro_name: RHEL-6.8-Server-x86_64, tags: ['rhel6.8'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.0-Server-x86_64, tags: ['rhel7.0'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.1-Server-x86_64, tags: ['rhel7.1'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.2-Server-x86_64, tags: ['rhel7.2'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.3-Server-x86_64, tags: ['rhel7.3'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.4-Server-x86_64, tags: ['rhel7.4'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.5-Server-x86_64, tags: ['rhel7.5'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.6-Server-x86_64, tags: ['rhel7.6'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.7-Server-x86_64, tags: ['rhel7.7'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.8-Server-x86_64, tags: ['rhel7.8'] }
+    - { role: cobbler_profile, distro_name: RHEL-7.9-Server-x86_64, tags: ['rhel7.9'] }
+    - { role: cobbler_profile, distro_name: RHEL-8.0-Server-x86_64, tags: ['rhel8.0'] }
+    - { role: cobbler_profile, distro_name: RHEL-8.1-Server-x86_64, tags: ['rhel8.1'] }
+    - { role: cobbler_profile, distro_name: RHEL-8.2-Server-x86_64, tags: ['rhel8.2'] }
+    - { role: cobbler_profile, distro_name: RHEL-8.3-Server-x86_64, tags: ['rhel8.3'] }
+    - { role: cobbler_profile, distro_name: RHEL-8.4-Server-x86_64, tags: ['rhel8.4'] }
+    - { role: cobbler_profile, distro_name: RHEL-8.5-Server-x86_64, tags: ['rhel8.5'] }
+    - { role: cobbler_profile, distro_name: RHEL-8.6-Server-x86_64, tags: ['rhel8.6'] }
+    - { role: cobbler_profile, distro_name: RHEL-9.0-Server-x86_64, tags: ['rhel9.0'] }
+    - { role: cobbler_profile, distro_name: RHEL-9.3-Server-x86_64, tags: ['rhel9.3'] }
+    - { role: cobbler_profile, distro_name: Fedora-22-Server-x86_64, tags: ['fedora22'] }
+    - { role: cobbler_profile, distro_name: Fedora-31-Server-x86_64, tags: ['fedora31'] }
+    - { role: cobbler_profile, distro_name: CentOS-6.7-x86_64, tags: ['centos6.7'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.0-x86_64, tags: ['centos7.0'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.1-x86_64, tags: ['centos7.1'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.2-x86_64, tags: ['centos7.2'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.3-x86_64, tags: ['centos7.3'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.4-x86_64, tags: ['centos7.4'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.5-x86_64, tags: ['centos7.5'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.6-x86_64, tags: ['centos7.6'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.7-x86_64, tags: ['centos7.7'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.8-arm, tags: ['centos7.8-arm'] }
+    - { role: cobbler_profile, distro_name: CentOS-7.9-x86_64, tags: ['centos7.9'] }
+    - { role: cobbler_profile, distro_name: CentOS-8.0-x86_64, tags: ['centos8.0'] }
+    - { role: cobbler_profile, distro_name: CentOS-8.1-x86_64, tags: ['centos8.1'] }
+    - { role: cobbler_profile, distro_name: CentOS-8.1-aarch64, tags: ['centos8.1-aarch64'] }
+    - { role: cobbler_profile, distro_name: CentOS-8.2-x86_64, tags: ['centos8.2'] }
+    - { role: cobbler_profile, distro_name: CentOS-8.3-x86_64, tags: ['centos8.3'] }
+    - { role: cobbler_profile, distro_name: CentOS-8.4-x86_64, tags: ['centos8.4'] }
+    - { role: cobbler_profile, distro_name: CentOS-8.5-x86_64, tags: ['centos8.5'] }
+    - { role: cobbler_profile, distro_name: CentOS-8.stream-x86_64, tags: ['centos8.stream'] }
+    - { role: cobbler_profile, distro_name: CentOS-9.stream-x86_64, tags: ['centos9.stream'] }
+    - { role: cobbler_profile, distro_name: Rocky-9.5-x86_64, tags: ['rocky9.5'] }
+    - { role: cobbler_profile, distro_name: Ubuntu-12.04-server-x86_64, tags: ['ubuntu-precise'] }
+    - { role: cobbler_profile, distro_name: Ubuntu-14.04-server-x86_64, tags: ['ubuntu-trusty'] }
+    - { role: cobbler_profile, distro_name: Ubuntu-15.04-server-x86_64, tags: ['ubuntu-vivid'] }
+    - { role: cobbler_profile, distro_name: Ubuntu-16.04-server-x86_64, tags: ['ubuntu-xenial'] }
+    - { role: cobbler_profile, distro_name: Ubuntu-18.04-server-x86_64, tags: ['ubuntu-bionic'] }
+    - { role: cobbler_profile, distro_name: Ubuntu-20.04-server-x86_64, tags: ['ubuntu-focal'] }
+    - { role: cobbler_profile, distro_name: openSUSE-15.0-x86_64, tags: ['opensuse-15.0'] }
+    - { role: cobbler_profile, distro_name: openSUSE-15.1-x86_64, tags: ['opensuse-15.1'] }
+    - { role: cobbler_profile, distro_name: openSUSE-15.2-x86_64, tags: ['opensuse-15.2'] }
+    - { role: cobbler_profile, distro_name: VMware-ESXi-7.0-x86_64, tags: ['esxi-7.0'] }
+    - cobbler_systems
+  become: true
--- /dev/null
+---
+# Apply the common role to every host in the inventory.
+- hosts: all
+  strategy: free
+  roles:
+    - common
+  become: true
--- /dev/null
+---
+# Install and configure a container runtime (docker/podman) on the
+# groups listed below.
+- hosts:
+    - testnodes
+    - senta
+    - vossi
+    - jenkins_builders
+    - folio
+  roles:
+    - secrets
+    - container-host
+  tags:
+    - container
+    - container-mirror
+  strategy: free
+  become: true
--- /dev/null
+---
+# Configure hosts in the dhcp_server group as DHCP servers.
+- hosts: dhcp_server
+  roles:
+    - dhcp-server
+  become: true
--- /dev/null
+---
+# A playbook used to setup a node for downstream RHCeph testing by
+# applying the downstream-setup role to testnodes.
+- hosts: testnodes
+  roles:
+    - downstream-setup
+  become: true
--- /dev/null
+---
+# Apply firmware updates via the firmware role.
+#
+# "any_errors_fatal: true" makes sure the run stops if any problems happen.
+# This gives you the ability to flash backed up firmwares or diagnose
+# problems without the playbook cleaning up after itself or causing more damage.
+
+- hosts: all
+  any_errors_fatal: true
+  strategy: free
+  roles:
+    - secrets
+    - firmware
+  become: true
--- /dev/null
+---
+# Install/update the FOG imaging server.  Prompts for confirmation
+# first, since restarting a live FOG server can disrupt scheduled
+# imaging tasks.
+- hosts: fog_server
+  roles:
+    - fog-server
+  become: true
+  vars_prompt:
+    - name: "fog_force"
+      prompt: "\nWARNING: It is not safe to run this role on a running FOG server that\nhas or may have scheduled tasks.\nDo you want to forcefully install/update/restart FOG? (yes|no)"
+      default: "no"
+      private: no
--- /dev/null
+---
+# Configure hosts in the gateway group.
+- hosts: gateway
+  roles:
+    - common
+    - gateway
+  become: true
--- /dev/null
+---
+# Install the grafana agent on every host.
+- hosts: all
+  strategy: free
+  roles:
+    - grafana_agent
+  become: true
--- /dev/null
+---
+# Configure the long-running ceph cluster hosts.
+- hosts: long_running_cluster
+  tasks:
+    # Load the package-manager-specific vars the role expects.
+    - name: Pull in vars from common role
+      include_vars: "roles/common/vars/{{ ansible_pkg_mgr }}_systems.yml"
+
+- hosts: long_running_cluster
+  become: true
+  roles:
+    - long_running_cluster
+  handlers:
+    # Reuse the common role's handlers so its notify targets resolve.
+    - import_tasks: roles/common/handlers/main.yml
--- /dev/null
+---
+# Configure hosts in the maas group.
+- hosts: maas
+  roles:
+    - secrets
+    - maas
+  become: true
--- /dev/null
+---
+# Configure hosts in the nameserver group.
+- hosts: nameserver
+  roles:
+    - common
+    - nameserver
+  become: true
--- /dev/null
+---
+# Configure hosts in the nsupdate_web group.
+- hosts: nsupdate_web
+  roles:
+    - common
+    - nsupdate_web
+  become: true
--- /dev/null
+---
+# Configure hosts in the ntp_server group as NTP servers.
+- hosts: ntp_server
+  roles:
+    - ntp-server
+  become: true
--- /dev/null
+---
+# Apply the packages role to every host.
+# NOTE(review): unlike most playbooks here there is no "become: true" -
+# confirm the packages role escalates privileges itself where needed.
+- hosts: all
+  roles:
+    - packages
--- /dev/null
+---
+# Configure hosts in the paddles group.
+- hosts: paddles
+  roles:
+    - common
+    - paddles
+  become: true
--- /dev/null
+---
+# Configure hosts in the pcp (Performance Co-Pilot) group.
+- hosts: pcp
+  strategy: free
+  roles:
+    - pcp
+  become: true
--- /dev/null
+---
+# Configure hosts in the public_facing group.
+- hosts: public_facing
+  roles:
+    - public_facing
+  become: true
--- /dev/null
+---
+# Configure hosts in the pulpito group.
+# NOTE(review): no "become: true" here, unlike paddles.yml - confirm the
+# pulpito role does not need root.
+- hosts: pulpito
+  roles:
+    - common
+    - pulpito
--- /dev/null
+---
+# Bootstrap a node for management by ansible: create the admin group and
+# user, grant passwordless sudo, and install the user's SSH keys.
+- name: Create the sudo group.
+  group:
+    name: sudo
+    state: present
+  tags:
+    - user
+
+- name: Create the ansible user.
+  user:
+    name: "{{ cm_user }}"
+    groups: sudo
+    shell: /bin/bash
+    uid: "{{ cm_user_uid }}"
+    # Only touch the password when the account is first created.
+    update_password: on_create
+  when: cm_user is defined and cm_user_uid is defined
+  register: user_created
+  # Tolerate the "user cm is currently used" error the user module
+  # raises when the account already exists and is in use.
+  failed_when: >
+    user_created.rc is defined and
+    user_created.rc != 0 and
+    ('user cm is currently used' not in user_created.msg | default(''))
+  tags:
+    - user
+
+- name: Delete the ansible users password.
+  command: "passwd -d {{ cm_user }}"
+  when: user_created is defined and user_created is changed
+  tags:
+    - user
+
+- name: Ensure includedir is present in sudoers.
+  lineinfile:
+    dest: /etc/sudoers
+    # "#includedir" is a sudoers directive, not a comment.
+    line: "#includedir /etc/sudoers.d"
+    state: present
+    validate: visudo -cf %s
+  tags:
+    - sudoers
+    - user
+
+- name: Create the cephlab_sudo sudoers.d file.
+  template:
+    src: cephlab_sudo
+    dest: /etc/sudoers.d/cephlab_sudo
+    owner: root
+    group: root
+    # Quoted so YAML does not reinterpret the octal literal (an unquoted
+    # 0440 parses as the integer 288).
+    mode: "0440"
+    validate: visudo -cf %s
+  tags:
+    - sudoers
+    - user
+
+- name: Add authorized keys for the ansible user.
+  authorized_key:
+    user: "{{ cm_user }}"
+    key: "{{ cm_user_ssh_keys|join('\n') }}"
+    # Remove any existing keys not present in cm_user_ssh_keys.
+    exclusive: true
+  when: cm_user_ssh_keys is defined and
+        cm_user is defined
+  become: true
+  tags:
+    - pubkeys
--- /dev/null
+# {{ ansible_managed }}
+# Members of the sudo group may run any command without a password.
+%sudo ALL=(ALL) NOPASSWD: ALL
+# For ansible pipelining
+Defaults !requiretty
+# NOTE(review): visiblepw allows sudo to prompt even when the typed
+# password would be echoed - confirm this is intentional.
+Defaults visiblepw
--- /dev/null
+---
+# Default variables for the cobbler role.
+# These defaults are present to allow certain tasks to no-op if a secrets repo
+# hasn't been defined. If you want to override these, do so in the secrets repo
+# itself. We override these in $repo/ansible/inventory/group_vars/cobbler.yml
+secrets_repo:
+  name: UNDEFINED
+  url: null
+
+# Where to download ISOs
+iso_dir: /var/lib/cobbler/isos
+# Mount point to use for ISOs during import
+iso_mount: /mnt/iso
+# Where to put kernel/initrd files for image-based ISOs
+other_image_dir: /var/lib/cobbler/other_boot_images
+
+users_digest_lines:
+  # default password is 'cobbler' - change it in a secrets repo!
+  - "cobbler:Cobbler:a2d6bae81669d707b72c0bd9806e01f3"
+
+# Name/value pairs applied via "cobbler setting edit" in settings.yml.
+settings:
+  - name: yum_post_install_mirror
+    value: 0
+  - name: signature_url
+    value: http://cobbler.github.io/signatures/2.6.x/latest.json
+  - name: server
+    value: "{{ ip }}"
+  - name: next_server
+    value: "{{ ip }}"
+  - name: pxe_just_once
+    value: 1
+
+# Template files uploaded by upload_templates.yml.
+kickstarts:
+  - cephlab_rhel.ks
+  - cephlab_rhel_sdc.ks
+  - cephlab_ubuntu.preseed
+  - cephlab_opensuse_leap.xml
+
+snippets:
+  - cephlab_user
+  - cephlab_hostname
+  - cephlab_packages_rhel
+  - cephlab_rc_local
+  - cephlab_rhel_disks
+  - cephlab_post_install_kernel_options
+  - cephlab_rhel_rhsm
+
+scripts:
+  - cephlab_preseed_late
+
+triggers:
+  - install/post/cephlab_ansible.sh
+
+utils:
+  - console.sh
+  - reboot.sh
+  - reimage.sh
+
+# Empty by default; populate these in a secrets repo / inventory.
+cm_user_ssh_keys: []
+
+cm_user: ''
+cm_user_uid: ''
+
+# A list of lines to add to resolv.conf and resolv.conf.d/base
+# An example:
+# resolvconf:
+#   - "nameserver x.x.x.x"
+#   - "search an.example.com"
+resolvconf: []
+
+power_type: ipmilan
+# power_user and power_pass defaults will need to be overridden in a secrets
+# repo to be useful
+power_user: poweruser
+power_pass: powerpass
+
+pip_packages:
+  - pip
+  - ansible
+
+cobbler_settings_file: /etc/cobbler/settings
+
+# NOTE(review): these CLI flag defaults are presumably overridden for
+# newer cobbler releases whose flags changed - confirm where.
+kopts_flag: "--kopts"
+
+autoinstall_flag: "--kickstart"
+
+ks_dir: /var/lib/cobbler/kickstarts
--- /dev/null
+---
+# Role dependencies: pull in the secrets role before this role runs.
+dependencies:
+  - role: secrets
--- /dev/null
+---
+# Debian/Ubuntu package installation for the cobbler role.
+- name: Install cobbler
+  apt:
+    name: "{{ cobbler_package }}"
+    state: latest
+  # NOTE(review): install_cobbler is not consumed in this file -
+  # presumably used elsewhere in the role; verify before removing.
+  register: install_cobbler
+
+- name: Install extra cobbler packages
+  apt:
+    name: "{{ cobbler_extra_packages|list }}"
+    state: latest
+  when: cobbler_extra_packages|length > 0
--- /dev/null
+---
+# Prepare directories used when importing distro ISOs into cobbler.
+- name: Update distro signatures
+  command: cobbler signature update
+
+- name: Create ISO directory
+  file:
+    path: "{{ iso_dir }}"
+    state: directory
+
+- name: Create ISO mountpoint
+  file:
+    path: "{{ iso_mount }}"
+    state: directory
+
+- name: Create directory for other boot images
+  file:
+    path: "{{ other_image_dir }}"
+    state: directory
--- /dev/null
+---
+# Clone ceph-cm-ansible and the lab's secrets repo onto the cobbler host
+# and point /etc/ansible at the secrets inventory/secrets directories.
+- name: Checkout ceph-cm-ansible
+  git:
+    repo: https://github.com/ceph/ceph-cm-ansible.git
+    dest: /root/ceph-cm-ansible
+    accept_hostkey: true
+
+- name: Checkout secrets repo
+  git:
+    repo: "{{ secrets_repo.url }}"
+    dest: /root/{{ secrets_repo.name }}
+    accept_hostkey: true
+
+- name: Symlink /etc/ansible/hosts
+  file:
+    src: /root/{{ secrets_repo.name }}/ansible/inventory/
+    dest: /etc/ansible/hosts
+    state: link
+    # Replace any pre-existing hosts file/link.
+    force: yes
+
+- name: Symlink /etc/ansible/secrets
+  file:
+    src: /root/{{ secrets_repo.name }}/ansible/secrets/
+    dest: /etc/ansible/secrets
+    state: link
+    force: yes
--- /dev/null
+---
+# Load IPMI credentials.  with_first_found falls back to the empty vars
+# file so the play still works when no credentials file exists.
+- name: Set path to IPMI credentials
+  set_fact:
+    ipmi_creds_path: "{{ secrets_path }}/ipmi.yml"
+  when: ipmi_creds_path is undefined
+
+- name: Include IPMI credentials
+  include_vars: "{{ item }}"
+  with_first_found:
+    - "{{ ipmi_creds_path }}"
+    - empty.yml
+  # Keep credential contents out of ansible output.
+  no_log: true
--- /dev/null
+---
+# Main task list for the cobbler role: load secrets, install packages,
+# start services, apply settings, fetch repos, upload templates, prep
+# distros, restart cobbler, and run a final sanity check.  Task order
+# matters here: services must be running before settings are applied.
+- import_tasks: ipmi_secrets.yml
+  tags:
+    - always
+
+- name: Include cobbler keys.
+  include_vars: "{{ secrets_path | mandatory }}/cobbler_keys.yml"
+  no_log: true
+  tags:
+    - vars
+
+- name: Create /root/.ssh
+  file:
+    path: /root/.ssh
+    mode: '700'
+    state: directory
+
+- name: Write cobbler keys
+  copy:
+    content: "{{ item.data }}"
+    dest: "{{ item.path }}"
+    mode: '600'
+  # Only the key entries for this particular cobbler host.
+  with_items: "{{ cobbler_keys[ansible_hostname] }}"
+  no_log: true
+
+- name: Include package type specific vars.
+  include_vars: "{{ ansible_pkg_mgr }}_systems.yml"
+  tags:
+    - always
+
+- import_tasks: yum_systems.yml
+  when: ansible_os_family == "RedHat"
+
+- import_tasks: apt_systems.yml
+  when: ansible_pkg_mgr == "apt"
+
+- import_tasks: pip.yml
+  tags:
+    - pip
+
+- name: Start cobbler
+  service:
+    name: "{{ cobbler_service }}"
+    state: started
+    enabled: yes
+
+- name: Enable tftpd
+  lineinfile:
+    dest: /etc/xinetd.d/tftp
+    # Matches the existing "disable = yes" line in the xinetd stanza.
+    regexp: disable
+    line: " disable = no"
+  when: ansible_pkg_mgr == "yum"
+  register: tftp_enabled
+  tags:
+    - tftp
+
+- name: Reload xinetd
+  service:
+    name: xinetd
+    state: reloaded
+    enabled: yes
+  when: tftp_enabled is defined and tftp_enabled is changed
+  tags:
+    - tftp
+
+- name: Start httpd
+  service:
+    name: "{{ httpd_service }}"
+    state: started
+    enabled: yes
+
+- name: Update settings
+  import_tasks: settings.yml
+  tags:
+    - settings
+
+- import_tasks: fetch_cm_repos.yml
+  tags:
+    - cm_repos
+
+- import_tasks: upload_templates.yml
+  tags:
+    - templates
+
+- import_tasks: distro_prep.yml
+  tags:
+    - distros
+    - distro_prep
+
+- import_tasks: restart.yml
+
+- name: Run cobbler check
+  command: cobbler check
--- /dev/null
+---
+# Install/upgrade the pip-managed packages the cobbler role needs.
+- name: Install pip packages
+  pip:
+    name: "{{ pip_packages|list }}"
+    state: latest
--- /dev/null
+---
+# Red Hat specific setup.
+# NOTE(review): presumably iptables is stopped so cobbler's services are
+# reachable - confirm before replacing with firewalld rules.
+- name: Stop iptables
+  service:
+    name: iptables
+    state: stopped
--- /dev/null
+---
+# Open http/https, but only when firewalld is actually running.
+- name: Check if firewalld is enabled
+  command: systemctl status firewalld
+  register: firewalld
+  # The status command fails when firewalld is absent or stopped; that
+  # outcome is expected and handled by the "when" below.
+  ignore_errors: true
+  no_log: true
+  tags:
+    - firewall
+
+- name: Enable http and https using firewalld
+  firewalld:
+    service: "{{ item }}"
+    state: enabled
+    permanent: yes
+    immediate: yes
+  with_items:
+    - http
+    - https
+  when: "'running' in firewalld.stdout"
+  tags:
+    - firewall
--- /dev/null
+---
+# Restart cobbler and wait for its xmlrpc port to come back up.
+- name: Get cobbler port
+  # NOTE(review): assigning FS inside the awk action only affects the
+  # *next* record, so $3 is actually split on whitespace here; this
+  # presumably works because the report output pads the colon with
+  # spaces - confirm before changing.
+  shell: cobbler setting report | grep xmlrpc_port | awk '{ FS=":"; print $3 }'
+  register: cobbler_port_cmd
+
+- name: Set cobbler port var
+  set_fact:
+    cobbler_port: "{{ cobbler_port_cmd.stdout.strip() }}"
+
+- name: Restart cobbler
+  service:
+    name: "{{ cobbler_service }}"
+    state: restarted
+  # A deliberate restart should not be reported as a change.
+  changed_when: false
+
+- name: Wait for cobbler to start
+  wait_for: port={{ cobbler_port|int }}
--- /dev/null
+---
+# Seed cobbler's users.digest and settings file, then apply the settings
+# list via the cobbler CLI (which requires allow_dynamic_settings).
+- name: Write users.digest
+  copy:
+    content: "{% for line in users_digest_lines %}{{ line + '\n' }}{% endfor %}"
+    dest: /etc/cobbler/users.digest
+    owner: root
+    group: root
+    mode: 0600
+  register: users_digest
+
+- name: Enable dynamic settings modification
+  lineinfile:
+    dest: "{{ cobbler_settings_file }}"
+    regexp: ^allow_dynamic_settings
+    # Escape the colon below so the line will parse
+    line: "allow_dynamic_settings{{':'}} 1"
+  register: dynamic_settings
+
+- name: Set server value
+  lineinfile:
+    dest: "{{ cobbler_settings_file }}"
+    # Escape the colons below so the lines will parse
+    regexp: "^server{{':'}}"
+    line: "server{{':'}} {% for setting in settings %}{% if setting.name == 'server' %}{{ setting.value }}{% endif %}{% endfor %}"
+  register: server_value
+
+# Only restart when one of the edits above actually changed something.
+- import_tasks: restart.yml
+  when: users_digest is changed or dynamic_settings is changed or server_value is changed
+
+- name: Update settings
+  command: cobbler setting edit --name={{ item.name }} --value={{ item.value }}
+  with_items: "{{ settings }}"
--- /dev/null
+---
+# Dispatch to RHEL-major-version specific task files.
+- name: Include rhel 7.x specific tasks.
+  import_tasks: redhat/rhel_7.yml
+  when: ansible_distribution_major_version == "7"
+
+- name: Include rhel 6.x specific tasks.
+  import_tasks: redhat/rhel_6.yml
+  when: ansible_distribution_major_version == "6"
--- /dev/null
+---
+# Upload cobbler content: the httpd index page, kickstarts/preseeds,
+# snippets, scripts, triggers and convenience utilities.  The file lists
+# come from this role's defaults (kickstarts, snippets, scripts, ...).
+#
+# We need to include our RHSM entitlements from the secrets repo to subscribe
+# RHEL systems during post-install.
+- name: Include RHSM entitlement credentials
+  include_vars: "{{ item }}"
+  with_first_found:
+    - "{{ secrets_path }}/entitlements.yml"
+    - roles/common/vars/empty.yml
+  no_log: true
+  tags:
+    - always
+
+- name: Upload index.html template
+  template:
+    src: "httpd/index.html"
+    dest: "/var/www/html/"
+    owner: root
+    group: root
+    mode: 0644
+  tags:
+    - httpd
+
+- name: Upload kickstarts and preseeds.
+  template:
+    src: "kickstarts/{{ item }}"
+    dest: "{{ ks_dir }}/{{ item }}"
+    owner: root
+    group: root
+    mode: 0644
+  with_items: "{{ kickstarts }}"
+  tags:
+    - kickstarts
+
+- name: Upload snippets
+  template:
+    src: "snippets/{{ item }}"
+    dest: "/var/lib/cobbler/snippets/{{ item }}"
+    owner: root
+    group: root
+    mode: 0644
+  with_items: "{{ snippets }}"
+  tags:
+    - snippets
+
+- name: Upload scripts.
+  template:
+    src: "scripts/{{ item }}"
+    dest: "/var/lib/cobbler/scripts/{{ item }}"
+    owner: root
+    group: root
+    mode: 0644
+  with_items: "{{ scripts }}"
+  tags:
+    - scripts
+
+- name: Upload triggers.
+  template:
+    src: "triggers/{{ item }}"
+    dest: "/var/lib/cobbler/triggers/{{ item }}"
+    owner: root
+    group: root
+    # Executable: triggers are run by cobbler.
+    mode: 0744
+  with_items: "{{ triggers }}"
+  tags:
+    - triggers
+
+- name: Create /root/bin
+  file:
+    path: /root/bin
+    state: directory
+    owner: root
+    group: root
+    mode: 0755
+  tags:
+    - utils
+
+- name: Upload utilities for convenience.
+  template:
+    src: "utils/{{ item }}"
+    dest: "/root/bin/{{ item }}"
+    owner: root
+    group: root
+    mode: 0755
+  with_items: "{{ utils }}"
+  tags:
+    - utils
--- /dev/null
+---
+# RHEL/CentOS package installation for the cobbler role.
+- name: Enable Cobbler 3 Stream on RHEL8
+  # -y is required: without it dnf prompts for confirmation and, when no
+  # tty is attached during an ansible run, the prompt is answered "no"
+  # and the command fails.
+  command: "dnf module enable -y cobbler:3"
+  when: ansible_distribution_major_version|int >= 8
+
+- name: Install cobbler
+  yum:
+    name: "{{ cobbler_package }}"
+    state: latest
+  register: install_cobbler
+
+- name: Install extra cobbler packages
+  yum:
+    name: "{{ cobbler_extra_packages|list }}"
+    state: latest
+  when: cobbler_extra_packages|length > 0
+
+# configure red hat specific things
+- import_tasks: setup-redhat.yml
+  when: ansible_distribution in ('RedHat', 'CentOS')
--- /dev/null
+<!--
+{{ ansible_managed }}
+Simple landing page linking to the cobbler web UI on this host.
+-->
+<html>
+  <body>
+    <a href="https://{{ ansible_fqdn }}/cobbler_web/">Cobbler!</a>
+  </body>
+</html>
--- /dev/null
+<?xml version="1.0"?>
+<!DOCTYPE profile>
+<profile xmlns="http://www.suse.com/1.0/yast2ns" xmlns:config="http://www.suse.com/1.0/configns">
+ <deploy_image>
+ <image_installation config:type="boolean">false</image_installation>
+ </deploy_image>
+ <general>
+ <mode>
+ <confirm config:type="boolean">false</confirm>
+ <final_reboot config:type="boolean">true</final_reboot>
+ </mode>
+ </general>
+ <software>
+ <packages config:type="list">
+ <package>python</package>
+ <package>python-xml</package>
+ <package>sudo</package>
+ <package>gptfdisk</package>
+ <package>vim</package>
+ <package>curl</package>
+ <package>iputils</package>
+ <package>ethtool</package>
+ <package>bind-utils</package>
+ <package>wget</package>
+ </packages>
+ </software>
+ <partitioning config:type="list">
+ <drive>
+ <device>/dev/sda</device>
+ <use>all</use>
+ <partitions config:type="list">
+ <partition>
+ <create config:type="boolean">true</create>
+ <format config:type="boolean">true</format>
+ <mount>/</mount>
+ <filesystem config:type="symbol">ext4</filesystem>
+ <size>100%</size>
+ </partition>
+ </partitions>
+ </drive>
+ </partitioning>
+ $SNIPPET('addons.xml')
+ $SNIPPET('kdump.xml')
+ <keyboard>
+ <keymap>english</keymap>
+ </keyboard>
+ <language>
+ <language>en_US</language>
+ <languages></languages>
+ </language>
+ <login_settings/>
+ $SNIPPET('networking.xml')
+ <runlevel>
+ <default>3</default>
+ </runlevel>
+ <services-manager>
+ <default_target>multi-user</default_target>
+ <services>
+ <enable config:type="list">
+ <service>sshd</service>
+ <service>rc-local</service>
+ </enable>
+ </services>
+ </services-manager>
+ <users config:type="list">
+ <user>
+ <encrypted config:type="boolean">true</encrypted>
+ <fullname>root</fullname>
+ <gid>0</gid>
+ <home>/root</home>
+ <password_settings>
+ <expire></expire>
+ <flag></flag>
+ <inact></inact>
+ <max></max>
+ <min></min>
+ <warn></warn>
+ </password_settings>
+ <shell>/bin/bash</shell>
+ <uid>0</uid>
+ <user_password>$default_password_crypted</user_password>
+ <username>root</username>
+ </user>
+ </users>
+ <scripts>
+ ## we have to include the pre-scripts tag to get kickstart_start included
+ <pre-scripts config:type="list">
+ #set global $wrappedscript = 'kickstart_start'
+ $SNIPPET('suse_scriptwrapper.xml')
+ ## SuSE has an annoying habit on ppc64 of changing the system
+ ## boot order after installation. This makes it non-trivial to
+ ## automatically re-install future OS.
+ #set global $wrappedscript = 'save_boot_device'
+ $SNIPPET('suse_scriptwrapper.xml')
+ </pre-scripts>
+ <chroot-scripts config:type="list">
+ #set global $wrappedscript = 'cephlab_user'
+ $SNIPPET('suse_scriptwrapper.xml')
+ </chroot-scripts>
+ <post-scripts config:type="list">
+ ##
+ ## This plugin wrapper provides the flexibility to call pure shell
+ ## snippets which can be used directly on autoinst file and with
+ ## wrapper on SuSE.
+ ##
+ ## To use it
+ ## - exchange name_of_pure_shell_snippet with the name of this shell snippet
+ ## - and remove the '##' in front of the line with suse_scriptwrapper.xml
+ ##
+ #set global $wrappedscript = 'name_of_pure_shell_snippet'
+ ## $SNIPPET('suse_scriptwrapper.xml')
+
+ ## SuSE has an annoying habit on ppc64 of changing the system
+ ## boot order after installation. This makes it non-trivial to
+ ## automatically re-install future OS.
+ #set global $wrappedscript = 'restore_boot_device'
+ $SNIPPET('suse_scriptwrapper.xml')
+
+ #set global $wrappedscript = 'cephlab_rc_local'
+ $SNIPPET('suse_scriptwrapper.xml')
+
+ #set global $wrappedscript = 'cephlab_user'
+ $SNIPPET('suse_scriptwrapper.xml')
+ </post-scripts>
+ ## we have to include the init-scripts tag to get kickstart_done included
+ <init-scripts config:type="list">
+ #set global $wrappedscript = 'kickstart_done'
+ $SNIPPET('suse_scriptwrapper.xml')
+ </init-scripts>
+ </scripts>
+</profile>
--- /dev/null
+## {{ ansible_managed }}
+# kickstart template for Fedora 8 and later.
+# (includes %end blocks)
+# do not use with earlier distros
+#set distro = $getVar('distro','').split("-")[0]
+#set distro_ver = $getVar('distro','').split("-")[1]
+#if $distro == 'RHEL' or $distro == 'CentOS'
+#set distro_ver_major = $distro_ver.split(".")[0]
+#set distro_ver_minor = $distro_ver.split(".")[1]
+#end if
+
+#platform=x86, AMD64, or Intel EM64T
+# System authorization information
+#if int($distro_ver_major) < 9
+auth --useshadow --enablemd5
+#else
+authselect select minimal
+#end if
+$SNIPPET('cephlab_rhel_disks')
+# Use text mode install
+text
+# Firewall configuration
+firewall --enabled
+# Run the Setup Agent on first boot
+firstboot --disable
+# System keyboard
+keyboard us
+# System language
+lang en_US
+# Use network installation
+url --url=$tree
+# If any cobbler repo definitions were referenced in the kickstart profile, include them here.
+$yum_repo_stanza
+# Network information
+network --bootproto=dhcp --device=$mac_address_eth0 --onboot=on
+# Reboot after installation
+reboot
+
+#Root password
+rootpw --iscrypted $default_password_crypted
+# SELinux configuration
+selinux --enforcing
+# Do not configure the X Window System
+skipx
+# System timezone
+timezone Etc/UTC --utc
+#if int($distro_ver_major) < 9
+# Install OS instead of upgrade
+install
+#end if
+
+%pre
+$SNIPPET('log_ks_pre')
+$SNIPPET('kickstart_start')
+# Enable installation monitoring
+$SNIPPET('pre_anamon')
+%end
+
+%packages
+@core
+$SNIPPET('cephlab_packages_rhel')
+$SNIPPET('func_install_if_enabled')
+%end
+
+%post --nochroot
+$SNIPPET('log_ks_post_nochroot')
+%end
+
+%post
+$SNIPPET('log_ks_post')
+# Start yum configuration
+$yum_config_stanza
+# End yum configuration
+$SNIPPET('post_install_kernel_options')
+$SNIPPET('func_register_if_enabled')
+$SNIPPET('download_config_files')
+$SNIPPET('koan_environment')
+$SNIPPET('cobbler_register')
+# Enable post-install boot notification
+$SNIPPET('post_anamon')
+# Start final steps
+$SNIPPET('cephlab_hostname')
+$SNIPPET('cephlab_user')
+#set distro = $getVar('distro','').split("-")[0]
+#if $distro == 'RHEL'
+$SNIPPET('cephlab_rhel_rhsm')
+#end if
+#if $distro_ver_minor == 'stream'
+# We want the latest packages because it's Stream
+yum -y update
+#else
+# Update to latest kernel before rebooting
+yum -y update kernel
+#end if
+$SNIPPET('cephlab_rc_local')
+$SNIPPET('kickstart_done')
+# End final steps
+%end
--- /dev/null
+## {{ ansible_managed }}
+## This kickstart for use with systems where /dev/sdc is the root drive (e.g., cali)
+# kickstart template for Fedora 8 and later.
+# (includes %end blocks)
+# do not use with earlier distros
+#set distro = $getVar('distro','').split("-")[0]
+#set distro_ver = $getVar('distro','').split("-")[1]
+#if $distro == 'RHEL' or $distro == 'CentOS'
+#set distro_ver_major = $distro_ver.split(".")[0]
+#set distro_ver_minor = $distro_ver.split(".")[1]
+#end if
+
+#platform=x86, AMD64, or Intel EM64T
+# System authorization information
+#if int($distro_ver_major) < 9
+auth --useshadow --enablemd5
+#else
+authselect select minimal
+#end if
+#set os_version = $getVar('os_version','')
+# Partition clearing information
+clearpart --all --initlabel
+# Use all of /dev/sdc for the root partition (20G minimum)
+part / --fstype="ext4" --ondisk=sdc --size=20000 --grow
+# Clear the Master Boot Record
+zerombr
+# System bootloader configuration
+#if $os_version == 'rhel7'
+ #set bootloader_args = "--location=mbr --boot-drive=sdc"
+#else
+ #set bootloader_args = "--location=mbr --driveorder=sdc"
+#end if
+bootloader $bootloader_args
+# Use text mode install
+text
+# Firewall configuration
+firewall --enabled
+# Run the Setup Agent on first boot
+firstboot --disable
+# System keyboard
+keyboard us
+# System language
+lang en_US
+# Use network installation
+url --url=$tree
+# If any cobbler repo definitions were referenced in the kickstart profile, include them here.
+$yum_repo_stanza
+# Network information
+network --bootproto=dhcp --device=$mac_address_eth0 --onboot=on
+# Reboot after installation
+reboot
+
+#Root password
+rootpw --iscrypted $default_password_crypted
+# SELinux configuration
+selinux --enforcing
+# Do not configure the X Window System
+skipx
+# System timezone
+timezone Etc/UTC --utc
+#if int($distro_ver_major) < 9
+# Install OS instead of upgrade
+install
+#end if
+
+%pre
+$SNIPPET('log_ks_pre')
+$SNIPPET('kickstart_start')
+# Enable installation monitoring
+$SNIPPET('pre_anamon')
+%end
+
+%packages
+@core
+$SNIPPET('cephlab_packages_rhel')
+$SNIPPET('func_install_if_enabled')
+%end
+
+%post --nochroot
+$SNIPPET('log_ks_post_nochroot')
+%end
+
+%post
+$SNIPPET('log_ks_post')
+# Start yum configuration
+$yum_config_stanza
+# End yum configuration
+$SNIPPET('post_install_kernel_options')
+$SNIPPET('func_register_if_enabled')
+$SNIPPET('download_config_files')
+$SNIPPET('koan_environment')
+$SNIPPET('cobbler_register')
+# Enable post-install boot notification
+$SNIPPET('post_anamon')
+# Start final steps
+$SNIPPET('cephlab_hostname')
+$SNIPPET('cephlab_user')
+#set distro = $getVar('distro','').split("-")[0]
+#if $distro == 'RHEL'
+$SNIPPET('cephlab_rhel_rhsm')
+#end if
+#if $distro_ver_minor == 'stream'
+# We want the latest packages because it's Stream
+yum -y update
+#else
+# Update to latest kernel before rebooting
+yum -y update kernel
+#end if
+$SNIPPET('cephlab_rc_local')
+$SNIPPET('kickstart_done')
+# End final steps
+%end
--- /dev/null
+## {{ ansible_managed }}
+## This kickstart for use with systems where /dev/sdi is the root drive (e.g., callypso)
+# kickstart template for Fedora 8 and later.
+# (includes %end blocks)
+# do not use with earlier distros
+#set distro = $getVar('distro','').split("-")[0]
+#set distro_ver = $getVar('distro','').split("-")[1]
+#if $distro == 'RHEL' or $distro == 'CentOS'
+#set distro_ver_major = $distro_ver.split(".")[0]
+#set distro_ver_minor = $distro_ver.split(".")[1]
+#end if
+
+#platform=x86, AMD64, or Intel EM64T
+# System authorization information
+#if int($distro_ver_major) < 9
+auth --useshadow --enablemd5
+#else
+authselect select minimal
+#end if
+#set os_version = $getVar('os_version','')
+# Partition clearing information
+clearpart --all --initlabel
+# Use all of /dev/sdi for the root partition (20G minimum)
+part / --fstype="ext4" --ondisk=sdi --size=20000 --grow
+# Clear the Master Boot Record
+zerombr
+# System bootloader configuration
+#if $os_version == 'rhel7'
+ #set bootloader_args = "--location=mbr --boot-drive=sdi"
+#else
+ #set bootloader_args = "--location=mbr --driveorder=sdi"
+#end if
+bootloader $bootloader_args
+# Use text mode install
+text
+# Firewall configuration
+firewall --enabled
+# Run the Setup Agent on first boot
+firstboot --disable
+# System keyboard
+keyboard us
+# System language
+lang en_US
+# Use network installation
+url --url=$tree
+# If any cobbler repo definitions were referenced in the kickstart profile, include them here.
+$yum_repo_stanza
+# Network information
+network --bootproto=dhcp --device=$mac_address_eth0 --onboot=on
+# Reboot after installation
+reboot
+
+#Root password
+rootpw --iscrypted $default_password_crypted
+# SELinux configuration
+selinux --enforcing
+# Do not configure the X Window System
+skipx
+# System timezone
+timezone Etc/UTC --utc
+#if int($distro_ver_major) < 9
+# Install OS instead of upgrade
+install
+#end if
+
+%pre
+$SNIPPET('log_ks_pre')
+$SNIPPET('kickstart_start')
+# Enable installation monitoring
+$SNIPPET('pre_anamon')
+%end
+
+%packages
+@core
+$SNIPPET('cephlab_packages_rhel')
+$SNIPPET('func_install_if_enabled')
+%end
+
+%post --nochroot
+$SNIPPET('log_ks_post_nochroot')
+%end
+
+%post
+$SNIPPET('log_ks_post')
+# Start yum configuration
+$yum_config_stanza
+# End yum configuration
+$SNIPPET('post_install_kernel_options')
+$SNIPPET('func_register_if_enabled')
+$SNIPPET('download_config_files')
+$SNIPPET('koan_environment')
+$SNIPPET('cobbler_register')
+# Enable post-install boot notification
+$SNIPPET('post_anamon')
+# Start final steps
+$SNIPPET('cephlab_hostname')
+$SNIPPET('cephlab_user')
+#set distro = $getVar('distro','').split("-")[0]
+#if $distro == 'RHEL'
+$SNIPPET('cephlab_rhel_rhsm')
+#end if
+#if $distro_ver_minor == 'stream'
+# We want the latest packages because it's Stream
+yum -y update
+#else
+# Update to latest kernel before rebooting
+yum -y update kernel
+#end if
+$SNIPPET('cephlab_rc_local')
+$SNIPPET('kickstart_done')
+# End final steps
+%end
--- /dev/null
+## {{ ansible_managed }}
+## This kickstart for use with systems where /dev/sdm is the root drive (e.g., mero)
+# kickstart template for Fedora 8 and later.
+# (includes %end blocks)
+# do not use with earlier distros
+#set distro = $getVar('distro','').split("-")[0]
+#set distro_ver = $getVar('distro','').split("-")[1]
+#if $distro == 'RHEL' or $distro == 'CentOS'
+#set distro_ver_major = $distro_ver.split(".")[0]
+#set distro_ver_minor = $distro_ver.split(".")[1]
+#end if
+
+#platform=x86, AMD64, or Intel EM64T
+# System authorization information
+#if int($distro_ver_major) < 9
+auth --useshadow --enablemd5
+#else
+authselect select minimal
+#end if
+#set os_version = $getVar('os_version','')
+# Partition clearing information
+clearpart --all --initlabel
+# Use all of /dev/sdm for the root partition (20G minimum)
+part / --fstype="ext4" --ondisk=sdm --size=20000 --grow
+# Clear the Master Boot Record
+zerombr
+# System bootloader configuration
+#if $os_version == 'rhel7'
+ #set bootloader_args = "--location=mbr --boot-drive=sdm"
+#else
+ #set bootloader_args = "--location=mbr --driveorder=sdm"
+#end if
+bootloader $bootloader_args
+# Use text mode install
+text
+# Firewall configuration
+firewall --enabled
+# Run the Setup Agent on first boot
+firstboot --disable
+# System keyboard
+keyboard us
+# System language
+lang en_US
+# Use network installation
+url --url=$tree
+# If any cobbler repo definitions were referenced in the kickstart profile, include them here.
+$yum_repo_stanza
+# Network information
+network --bootproto=dhcp --device=$mac_address_eth0 --onboot=on
+# Reboot after installation
+reboot
+
+#Root password
+rootpw --iscrypted $default_password_crypted
+# SELinux configuration
+selinux --enforcing
+# Do not configure the X Window System
+skipx
+# System timezone
+timezone Etc/UTC --utc
+#if int($distro_ver_major) < 9
+# Install OS instead of upgrade
+install
+#end if
+
+%pre
+$SNIPPET('log_ks_pre')
+$SNIPPET('kickstart_start')
+# Enable installation monitoring
+$SNIPPET('pre_anamon')
+%end
+
+%packages
+@core
+$SNIPPET('cephlab_packages_rhel')
+$SNIPPET('func_install_if_enabled')
+%end
+
+%post --nochroot
+$SNIPPET('log_ks_post_nochroot')
+%end
+
+%post
+$SNIPPET('log_ks_post')
+# Start yum configuration
+$yum_config_stanza
+# End yum configuration
+$SNIPPET('post_install_kernel_options')
+$SNIPPET('func_register_if_enabled')
+$SNIPPET('download_config_files')
+$SNIPPET('koan_environment')
+$SNIPPET('cobbler_register')
+# Enable post-install boot notification
+$SNIPPET('post_anamon')
+# Start final steps
+$SNIPPET('cephlab_hostname')
+$SNIPPET('cephlab_user')
+#set distro = $getVar('distro','').split("-")[0]
+#if $distro == 'RHEL'
+$SNIPPET('cephlab_rhel_rhsm')
+#end if
+#if $distro_ver_minor == 'stream'
+# We want the latest packages because it's Stream
+yum -y update
+#else
+# Update to latest kernel before rebooting
+yum -y update kernel
+#end if
+$SNIPPET('cephlab_rc_local')
+$SNIPPET('kickstart_done')
+# End final steps
+%end
--- /dev/null
+## {{ ansible_managed }}
+
+# Fetch the os_version from the distro using this profile.
+#set os_version = $getVar('os_version','')
+
+# Fetch Ubuntu version (e.g., 14.04)
+#set distro_ver = $getVar('distro','').split("-")[1]
+
+# Fetch Ubuntu major version (e.g., 14)
+#set distro_ver_major = $distro_ver.split(".")[0]
+
+### Apt setup
+# You can choose to install non-free and contrib software.
+#d-i apt-setup/non-free boolean true
+#d-i apt-setup/contrib boolean true
+
+# Preseeding only locale sets language, country and locale.
+d-i debian-installer/locale string en_US
+
+# Keyboard selection.
+# Disable automatic (interactive) keymap detection.
+d-i console-setup/ask_detect boolean false
+
+# If you select ftp, the mirror/country string does not need to be set.
+#d-i mirror/protocol string ftp
+d-i mirror/country string manual
+d-i mirror/http/hostname string archive.ubuntu.com
+d-i mirror/http/directory string /ubuntu
+d-i mirror/suite string $os_version
+
+#Removes the prompt about missing modules:
+# Continue without installing a kernel?
+#d-i base-installer/kernel/skip-install boolean true
+# Continue the install without loading kernel modules?
+#d-i anna/no_kernel_modules boolean true
+
+# Stop Ubuntu from installing random kernel choice
+#d-i base-installer/kernel/image select none
+
+# Controls whether or not the hardware clock is set to UTC.
+d-i clock-setup/utc boolean true
+#
+# # You may set this to any valid setting for $TZ; see the contents of
+# # /usr/share/zoneinfo/ for valid values.
+d-i time/zone string Etc/UTC
+
+# Controls whether to use NTP to set the clock during the install
+d-i clock-setup/ntp boolean true
+# NTP server to use. The default is almost always fine here.
+d-i clock-setup/ntp-server string pool.ntp.org
+
+# This makes partman automatically partition without confirmation.
+#d-i partman/confirm_write_new_label boolean true
+#d-i partman/choose_partition select finish
+#d-i partman/confirm boolean true
+#d-i partman/choose_partition select finish
+d-i partman-basicfilesystems/no_swap boolean false
+d-i partman-basicfilesystems/no_swap seen true
+d-i partman-auto/disk string /dev/sda
+d-i partman-auto/method string regular
+#d-i partman-auto/purge_lvm_from_device boolean true
+d-i partman-auto/confirm_nooverwrite boolean true
+d-i partman-auto/choose_partition select finish
+
+
+d-i partman/choose_partition select finish
+d-i partman/confirm boolean true
+d-i partman/confirm_nooverwrite boolean true
+d-i partman-partitioning/confirm_write_new_label boolean true
+d-i partman/default_filesystem string ext4
+d-i partman-auto/expert_recipe string \
+ root :: \
+ 500 10000 1000000000 ext4 \
+ $primary{ } $bootable{ } \
+ method{ format } format{ } \
+ use_filesystem{ } filesystem{ ext4 } \
+ mountpoint{ / } \
+ .
+#\
+# 64 512 1% linux-swap \
+# method{ swap } format{ } \
+# .
+d-i partman/confirm_write_new_label boolean true
+d-i partman/choose_partition \
+ select Finish partitioning and write changes to disk
+d-i partman/confirm boolean true
+
+d-i grub-pc/install_devices multiselect /dev/sda
+
+#User account.
+d-i passwd/root-login boolean false
+d-i passwd/make-user boolean true
+d-i passwd/user-fullname string {{ cm_user }}
+d-i passwd/username string {{ cm_user }}
+d-i passwd/user-password-crypted password $default_password_crypted
+d-i passwd/user-uid string {{ cm_user_uid }}
+d-i user-setup/allow-password-weak boolean false
+d-i user-setup/encrypt-home boolean false
+
+# Individual additional packages to install
+#if $os_version == 'precise'
+d-i pkgsel/include string wget ntpdate bash sudo openssh-server
+#else if int($distro_ver_major) == 16
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
+#else if int($distro_ver_major) == 18
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server gawk gdisk ethtool net-tools ifupdown python ntp curl
+#else if int($distro_ver_major) >= 20
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server gawk gdisk ethtool net-tools ifupdown ntp curl gpg
+#else
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware linux-firmware-nonfree ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
+#end if
+
+# Whether to upgrade packages after debootstrap.
+# Allowed values: none, safe-upgrade, full-upgrade
+d-i pkgsel/upgrade select safe-upgrade
+
+# Policy for applying updates. May be "none" (no automatic updates),
+# "unattended-upgrades" (install security updates automatically), or
+# "landscape" (manage system with Landscape).
+d-i pkgsel/update-policy select none
+
+# Set GRUB bootdev to '/dev/sda' if Xenial or later
+#if int($distro_ver_major) >= 16
+d-i grub-installer/bootdev string /dev/sda
+#end if
+
+# During installations from serial console, the regular virtual consoles
+# (VT1-VT6) are normally disabled in /etc/inittab. Uncomment the next
+# line to prevent this.
+d-i finish-install/keep-consoles boolean true
+
+# Avoid that last message about the install being complete.
+d-i finish-install/reboot_in_progress note
+
+# This command is run just before the install finishes, but when there is
+# still a usable /target directory. You can chroot to /target and use it
+# directly, or use the apt-install and in-target commands to easily install
+# packages and run commands in the target system.
+
+# cephlab_preseed_late lives in /var/lib/cobbler/scripts
+# It is passed to the cobbler xmlrpc generate_scripts function where it's rendered.
+# This means that snippets or other templating features can be used.
+d-i preseed/late_command string \
+in-target wget http://$http_server/cblr/svc/op/script/system/$system_name/?script=cephlab_preseed_late -O /tmp/postinst.sh; \
+in-target /bin/chmod 755 /tmp/postinst.sh; \
+in-target /tmp/postinst.sh;
--- /dev/null
+## {{ ansible_managed }}
+## This preseed only for systems where /dev/sdi is the root drive (e.g., callypso)
+
+# Fetch the os_version from the distro using this profile.
+#set os_version = $getVar('os_version','')
+
+# Fetch Ubuntu version (e.g., 14.04)
+#set distro_ver = $getVar('distro','').split("-")[1]
+
+# Fetch Ubuntu major version (e.g., 14)
+#set distro_ver_major = $distro_ver.split(".")[0]
+
+### Apt setup
+# You can choose to install non-free and contrib software.
+#d-i apt-setup/non-free boolean true
+#d-i apt-setup/contrib boolean true
+
+# Preseeding only locale sets language, country and locale.
+d-i debian-installer/locale string en_US
+
+# Keyboard selection.
+# Disable automatic (interactive) keymap detection.
+d-i console-setup/ask_detect boolean false
+
+# If you select ftp, the mirror/country string does not need to be set.
+#d-i mirror/protocol string ftp
+d-i mirror/country string manual
+d-i mirror/http/hostname string archive.ubuntu.com
+d-i mirror/http/directory string /ubuntu
+d-i mirror/suite string $os_version
+
+#Removes the prompt about missing modules:
+# Continue without installing a kernel?
+#d-i base-installer/kernel/skip-install boolean true
+# Continue the install without loading kernel modules?
+#d-i anna/no_kernel_modules boolean true
+
+# Stop Ubuntu from installing random kernel choice
+#d-i base-installer/kernel/image select none
+
+# Controls whether or not the hardware clock is set to UTC.
+d-i clock-setup/utc boolean true
+#
+# # You may set this to any valid setting for $TZ; see the contents of
+# # /usr/share/zoneinfo/ for valid values.
+d-i time/zone string Etc/UTC
+
+# Controls whether to use NTP to set the clock during the install
+d-i clock-setup/ntp boolean true
+# NTP server to use. The default is almost always fine here.
+d-i clock-setup/ntp-server string pool.ntp.org
+
+# This makes partman automatically partition without confirmation.
+#d-i partman/confirm_write_new_label boolean true
+#d-i partman/choose_partition select finish
+#d-i partman/confirm boolean true
+#d-i partman/choose_partition select finish
+d-i partman-basicfilesystems/no_swap boolean false
+d-i partman-basicfilesystems/no_swap seen true
+d-i partman-auto/disk string /dev/sdi
+d-i partman-auto/method string regular
+#d-i partman-auto/purge_lvm_from_device boolean true
+d-i partman-auto/confirm_nooverwrite boolean true
+d-i partman-auto/choose_partition select finish
+
+
+d-i partman/choose_partition select finish
+d-i partman/confirm boolean true
+d-i partman/confirm_nooverwrite boolean true
+d-i partman-partitioning/confirm_write_new_label boolean true
+d-i partman/default_filesystem string ext4
+d-i partman-auto/expert_recipe string \
+ root :: \
+ 500 10000 1000000000 ext4 \
+ $primary{ } $bootable{ } \
+ method{ format } format{ } \
+ use_filesystem{ } filesystem{ ext4 } \
+ mountpoint{ / } \
+ .
+#\
+# 64 512 1% linux-swap \
+# method{ swap } format{ } \
+# .
+d-i partman/confirm_write_new_label boolean true
+d-i partman/choose_partition \
+ select Finish partitioning and write changes to disk
+d-i partman/confirm boolean true
+
+d-i grub-pc/install_devices multiselect /dev/sdi
+
+#User account.
+d-i passwd/root-login boolean false
+d-i passwd/make-user boolean true
+d-i passwd/user-fullname string {{ cm_user }}
+d-i passwd/username string {{ cm_user }}
+d-i passwd/user-password-crypted password $default_password_crypted
+d-i passwd/user-uid string {{ cm_user_uid }}
+d-i user-setup/allow-password-weak boolean false
+d-i user-setup/encrypt-home boolean false
+
+# Individual additional packages to install
+#if $os_version == 'precise'
+d-i pkgsel/include string wget ntpdate bash sudo openssh-server
+#else if int($distro_ver_major) == 16
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
+#else if int($distro_ver_major) == 18
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server gawk gdisk ethtool net-tools ifupdown python ntp curl
+#else
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware linux-firmware-nonfree ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
+#end if
+
+# Whether to upgrade packages after debootstrap.
+# Allowed values: none, safe-upgrade, full-upgrade
+d-i pkgsel/upgrade select safe-upgrade
+
+# Policy for applying updates. May be "none" (no automatic updates),
+# "unattended-upgrades" (install security updates automatically), or
+# "landscape" (manage system with Landscape).
+d-i pkgsel/update-policy select none
+
+# Set GRUB bootdev to '/dev/sdi' if Xenial or later
+#if int($distro_ver_major) >= 16
+d-i grub-installer/bootdev string /dev/sdi
+#end if
+
+# During installations from serial console, the regular virtual consoles
+# (VT1-VT6) are normally disabled in /etc/inittab. Uncomment the next
+# line to prevent this.
+d-i finish-install/keep-consoles boolean true
+
+# Avoid that last message about the install being complete.
+d-i finish-install/reboot_in_progress note
+
+# This command is run just before the install finishes, but when there is
+# still a usable /target directory. You can chroot to /target and use it
+# directly, or use the apt-install and in-target commands to easily install
+# packages and run commands in the target system.
+
+# cephlab_preseed_late lives in /var/lib/cobbler/scripts
+# It is passed to the cobbler xmlrpc generate_scripts function where it's rendered.
+# This means that snippets or other templating features can be used.
+d-i preseed/late_command string \
+in-target wget http://$http_server/cblr/svc/op/script/system/$system_name/?script=cephlab_preseed_late -O /tmp/postinst.sh; \
+in-target /bin/chmod 755 /tmp/postinst.sh; \
+in-target /tmp/postinst.sh;
--- /dev/null
+## {{ ansible_managed }}
+## This preseed only for systems where /dev/sdm is the root drive (e.g., mero)
+
+# Fetch the os_version from the distro using this profile.
+#set os_version = $getVar('os_version','')
+
+# Fetch Ubuntu version (e.g., 14.04)
+#set distro_ver = $getVar('distro','').split("-")[1]
+
+# Fetch Ubuntu major version (e.g., 14)
+#set distro_ver_major = $distro_ver.split(".")[0]
+
+### Apt setup
+# You can choose to install non-free and contrib software.
+#d-i apt-setup/non-free boolean true
+#d-i apt-setup/contrib boolean true
+
+# Preseeding only locale sets language, country and locale.
+d-i debian-installer/locale string en_US
+
+# Keyboard selection.
+# Disable automatic (interactive) keymap detection.
+d-i console-setup/ask_detect boolean false
+
+# If you select ftp, the mirror/country string does not need to be set.
+#d-i mirror/protocol string ftp
+d-i mirror/country string manual
+d-i mirror/http/hostname string archive.ubuntu.com
+d-i mirror/http/directory string /ubuntu
+d-i mirror/suite string $os_version
+
+#Removes the prompt about missing modules:
+# Continue without installing a kernel?
+#d-i base-installer/kernel/skip-install boolean true
+# Continue the install without loading kernel modules?
+#d-i anna/no_kernel_modules boolean true
+
+# Stop Ubuntu from installing random kernel choice
+#d-i base-installer/kernel/image select none
+
+# Controls whether or not the hardware clock is set to UTC.
+d-i clock-setup/utc boolean true
+#
+# # You may set this to any valid setting for $TZ; see the contents of
+# # /usr/share/zoneinfo/ for valid values.
+d-i time/zone string Etc/UTC
+
+# Controls whether to use NTP to set the clock during the install
+d-i clock-setup/ntp boolean true
+# NTP server to use. The default is almost always fine here.
+d-i clock-setup/ntp-server string pool.ntp.org
+
+# This makes partman automatically partition without confirmation.
+#d-i partman/confirm_write_new_label boolean true
+#d-i partman/choose_partition select finish
+#d-i partman/confirm boolean true
+#d-i partman/choose_partition select finish
+d-i partman-basicfilesystems/no_swap boolean false
+d-i partman-basicfilesystems/no_swap seen true
+d-i partman-auto/disk string /dev/sdm
+d-i partman-auto/method string regular
+#d-i partman-auto/purge_lvm_from_device boolean true
+d-i partman-auto/confirm_nooverwrite boolean true
+d-i partman-auto/choose_partition select finish
+
+
+d-i partman/choose_partition select finish
+d-i partman/confirm boolean true
+d-i partman/confirm_nooverwrite boolean true
+d-i partman-partitioning/confirm_write_new_label boolean true
+d-i partman/default_filesystem string ext4
+d-i partman-auto/expert_recipe string \
+ root :: \
+ 500 10000 1000000000 ext4 \
+ $primary{ } $bootable{ } \
+ method{ format } format{ } \
+ use_filesystem{ } filesystem{ ext4 } \
+ mountpoint{ / } \
+ .
+#\
+# 64 512 1% linux-swap \
+# method{ swap } format{ } \
+# .
+d-i partman/confirm_write_new_label boolean true
+d-i partman/choose_partition \
+ select Finish partitioning and write changes to disk
+d-i partman/confirm boolean true
+
+d-i grub-pc/install_devices multiselect /dev/sdm
+
+#User account.
+d-i passwd/root-login boolean false
+d-i passwd/make-user boolean true
+d-i passwd/user-fullname string {{ cm_user }}
+d-i passwd/username string {{ cm_user }}
+d-i passwd/user-password-crypted password $default_password_crypted
+d-i passwd/user-uid string {{ cm_user_uid }}
+d-i user-setup/allow-password-weak boolean false
+d-i user-setup/encrypt-home boolean false
+
+# Individual additional packages to install
+#if $os_version == 'precise'
+d-i pkgsel/include string wget ntpdate bash sudo openssh-server
+#else if int($distro_ver_major) == 16
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
+#else if int($distro_ver_major) == 18
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware ntpdate bash devmem2 fbset sudo openssh-server gawk gdisk ethtool net-tools ifupdown python ntp curl
+#else
+d-i pkgsel/include string u-boot-tools pastebinit initramfs-tools wget linux-firmware linux-firmware-nonfree ntpdate bash devmem2 fbset sudo openssh-server udev-discover gawk gdisk ethtool curl
+#end if
+
+# Whether to upgrade packages after debootstrap.
+# Allowed values: none, safe-upgrade, full-upgrade
+d-i pkgsel/upgrade select safe-upgrade
+
+# Policy for applying updates. May be "none" (no automatic updates),
+# "unattended-upgrades" (install security updates automatically), or
+# "landscape" (manage system with Landscape).
+d-i pkgsel/update-policy select none
+
+# Set GRUB bootdev to '/dev/sdm' if Xenial or later
+#if int($distro_ver_major) >= 16
+d-i grub-installer/bootdev string /dev/sdm
+#end if
+
+# During installations from serial console, the regular virtual consoles
+# (VT1-VT6) are normally disabled in /etc/inittab. Uncomment the next
+# line to prevent this.
+d-i finish-install/keep-consoles boolean true
+
+# Avoid that last message about the install being complete.
+d-i finish-install/reboot_in_progress note
+
+# This command is run just before the install finishes, but when there is
+# still a usable /target directory. You can chroot to /target and use it
+# directly, or use the apt-install and in-target commands to easily install
+# packages and run commands in the target system.
+
+# cephlab_preseed_late lives in /var/lib/cobbler/scripts
+# It is passed to the cobbler xmlrpc generate_scripts function where it's rendered.
+# This means that snippets or other templating features can be used.
+d-i preseed/late_command string \
+in-target wget http://$http_server/cblr/svc/op/script/system/$system_name/?script=cephlab_preseed_late -O /tmp/postinst.sh; \
+in-target /bin/chmod 755 /tmp/postinst.sh; \
+in-target /tmp/postinst.sh;
--- /dev/null
+## {{ ansible_managed }}
+# Start preseed_late_default
+# This script runs in the chroot /target by default
+# set kernel options as defined by the system, profile or distro
+# in the Kernel Options (Post Install) field which populates the var kernel_options_post
+$SNIPPET('cephlab_post_install_kernel_options')
+$SNIPPET('post_run_deb')
+$SNIPPET('download_config_files')
+# custom
+$SNIPPET('cephlab_hostname')
+$SNIPPET('cephlab_user')
+$SNIPPET('cephlab_rc_local')
+# end custom
+$SNIPPET('kickstart_done')
+# Exit with status 0
+true
+# End preseed_late_default
--- /dev/null
+## {{ ansible_managed }}
+hostname $system_name
+echo $system_name > /etc/hostname
--- /dev/null
+## {{ ansible_managed }}
+## @base group no longer exists in >=Fedora-22
+#set distro = $getVar('distro','').split("-")[0]
+#set distro_ver = $getVar('distro','').split("-")[1]
+#if $distro == 'Fedora' and int($distro_ver) >= 22 and int($distro_ver) < 31
+@^infrastructure-server-environment
+#else if $distro == 'Fedora' and int($distro_ver) >= 31
+## We can't figure out what the new server group name is in F31 but we do need python3 so...
+python3
+#else
+@base
+#end if
+#if $distro == 'RHEL' or $distro == 'CentOS'
+#set distro_ver_major = $distro_ver.split(".")[0]
+#set distro_ver_minor = $distro_ver.split(".")[1]
+## These packages are available in all RHEL/CentOS versions but not Fedora
+perl
+#if int($distro_ver_major) >= 9
+#if $distro == 'RHEL'
+# Needed in RHEL9 but not CentOS9
+NetworkManager-initscripts-updown
+dbus-tools
+dbus-daemon
+#end if
+#if $distro == 'CentOS'
+# CentOS 9 Stream only packages
+centos-gpg-keys
+-subscription-manager
+python3-pip
+#end if
+#end if
+## These packages are not available in CentOS 9 Stream
+#if int($distro_ver_major) < 9
+redhat-lsb-core
+#end if
+#if int($distro_ver_major) < 8
+## These packages should be installed on RHEL/CentOS 7
+libselinux-python
+libsemanage-python
+policycoreutils-python
+ntp
+#if int($distro_ver_major) == 7 and int($distro_ver_minor) >= 5
+## These packages are only available in RHEL7.5 and later
+python-jwt
+#end if
+#else
+## These packages should be installed on RHEL/CentOS 8
+python3
+#end if
+#end if
+## These packages should be installed on all distros and versions
+ethtool
+wget
+smartmontools
+selinux-policy-targeted
+gdisk
--- /dev/null
+## {{ ansible_managed }}
+# Start post install kernel options update
+cat > /etc/default/grub <<-EOF
+ # {{ ansible_managed }}
+ GRUB_DEFAULT=0
+ GRUB_TIMEOUT=5
+ GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
+ GRUB_CMDLINE_LINUX_DEFAULT=""
+ GRUB_TERMINAL="console"
+ GRUB_SERIAL_COMMAND="console --unit=1 --speed=115200 --stop=1"
+#if $getVar('kernel_options_post','') != ''
+ GRUB_CMDLINE_LINUX="$kernel_options_post"
+#else
+ GRUB_CMDLINE_LINUX="console=tty0"
+#end if
+ EOF
+update-grub
+# End post install kernel options update
--- /dev/null
+## {{ ansible_managed }}
+#set lockfile = '/.cephlab_rc_local'
+# Set proper location for firstboot ansible post-install trigger
+#set distro = $getVar('distro','').split("-")[0]
+#set distro_ver = $getVar('distro','').split("-")[1]
+#if ($distro == 'RHEL') or ($distro == 'CentOS')
+#set distro_ver = $distro_ver.split(".")[0]
+#end if
+#if ($distro == 'Fedora' and int($distro_ver) >= 22) or ($distro == 'RHEL' and int($distro_ver) >= 8)
+#set script = '/etc/rc.d/rc.local'
+#else if $distro == 'CentOS' and int($distro_ver) >= 9
+#set script = '/etc/rc.d/rc.local'
+systemctl enable rc-local.service
+#else if $distro == 'openSUSE'
+#set script = '/etc/init.d/boot.local'
+#else
+#set script = '/etc/rc.local'
+#end if
+
+cat > $script <<\EOF
+#!/bin/bash
+# Redirect rc.local output to our console so it's in teuthology console logs
+exec 2> /dev/ttyS1
+exec 1>&2
+set -ex
+
+# This function will print the date to console in a clean way.
+# In other words, it'll just print the date without it looking like this:
+# + date -u +%FT%T.%N
+# + cut -c1-23
+# 2020-05-15T14:15:33.087
+TheTimeIs ()
+{
+ { set +x; } 2>/dev/null
+ date -u +%FT%T.%N | cut -c1-23
+ { set -x; } 2>/dev/null
+}
+
+{% if rclocal_nameserver is defined %}
+if [ ! -f /.cephlab_net_configured ]; then
+#if $distro == 'openSUSE'
+ udevadm trigger
+ sleep 5
+#end if
+#raw
+ nics=$(ls -1 /sys/class/net | grep -v lo)
+
+ for nic in $nics; do
+ TheTimeIs
+ # Bring the NIC up so we can detect if a link is present
+ ifconfig $nic up || ip link set $nic up
+ # Sleep for a bit to let the NIC come up
+ sleep 5
+ if ethtool $nic | grep -q "Link detected: yes"; then
+ if command -v zypper &>/dev/null; then
+ echo -e "DEVICE=$nic\nBOOTPROTO=dhcp\nSTARTMODE=auto" > /etc/sysconfig/network/ifcfg-$nic
+ elif command -v apt-get &>/dev/null; then
+ echo -e "auto lo\niface lo inet loopback\n\nauto $nic\niface $nic inet dhcp" > /etc/network/interfaces
+ else
+ echo -e "DEVICE=$nic\nBOOTPROTO=dhcp\nONBOOT=yes" > /etc/sysconfig/network-scripts/ifcfg-$nic
+ fi
+ # Don't bail if NIC fails to go down or come up
+ { set +e; } 2>/dev/null
+ TheTimeIs
+ # Bounce the NIC so it gets a DHCP address
+ ifdown $nic
+ ifup $nic
+ attempts=0
+ # Try for 5 seconds to ping our Cobbler host
+#end raw
+ while ! ping -I $nic -nq -c1 $http_server && [ $attempts -lt 5 ]; do
+#raw
+ sleep 1
+ attempts=$[$attempts+1]
+ done
+ if [ $attempts == 5 ]; then
+ # If we can't ping our Cobbler host, remove the DHCP config for this NIC.
+ # It must either be on a non-routable network or has no reachable DHCP server.
+ ifdown $nic
+ rm -f /etc/sysconfig/network-scripts/ifcfg-$nic
+ sed -i "/$nic/d" /etc/network/interfaces
+ # Go back to bailing if anything fails bringing the next NIC up
+ set -e
+ else
+ # We found our routable NIC!
+ # Write our lockfile so this only gets run on firstboot
+ TheTimeIs
+ touch /.cephlab_net_configured
+ # Break out of the loop once we've found our routable NIC
+ break
+ fi
+ else
+ # Take the NIC back down if it's not connected
+ ifconfig $nic down || ip link set $nic down
+ fi
+ done
+fi
+
+# Don't error out if the `ip` command returns rc 1
+set +e
+
+attempts=0
+myips=""
+until [ "$myips" != "" ] || [ $attempts -ge 10 ]; do
+ myips=$(ip -4 addr | grep -oP '(?<=inet\s)\d+(\.\d+){3}' | grep -v '127.0.0.1\|127.0.1.1')
+ attempts=$[$attempts+1]
+ sleep 1
+done
+
+set -e
+
+if [ -n "$myips" ]; then
+ for ip in $myips; do
+ if timeout 1s ping -I $ip -nq -c1 {{ rclocal_nameserver }} 2>&1 >/dev/null; then
+ newhostname=$(dig +short -x $ip @{{ rclocal_nameserver }} | sed 's/\.com.*/\.com/g')
+ if [ -n "$newhostname" ]; then
+ hostname $newhostname
+ newdomain=$(hostname -d)
+ shorthostname=$(hostname -s)
+ echo $shorthostname > /etc/hostname
+ if grep -q $newdomain /etc/hosts; then
+ # Replace
+ sed -i "s/.*$newdomain.*/$ip $newhostname $shorthostname/g" /etc/hosts
+ else
+ # Or add to top of file
+ sed -i '1i'$ip' '$newhostname' '$shorthostname'\' /etc/hosts
+ fi
+ fi
+ # Quit after first IP that can ping our nameserver
+ # in the extremely unlikely event the testnode has two IPs
+ break
+ fi
+ done
+fi
+#end raw
+
+{% endif %}
+
+# Regenerate SSH host keys on boot if needed
+if command -v zypper &> /dev/null; then
+ if [ ! -f /etc/ssh/ssh_host_rsa_key ]; then
+ ssh-keygen -f /etc/ssh/ssh_host_rsa_key -N '' -t rsa
+ systemctl restart sshd
+ fi
+elif command -v apt-get &>/dev/null; then
+ if [ ! -f /etc/ssh/ssh_host_rsa_key ]; then
+ dpkg-reconfigure openssh-server
+ fi
+fi
+
+# Only run once.
+if [ -e $lockfile ]; then
+ exit 0
+fi
+
+# Wait until we get 10 ping responses from Cobbler host
+# before calling post-install trigger
+until ping -nq -c10 $http_server
+do
+ echo "Waiting for network"
+ sleep 3
+done
+# Output message to console indicating Ansible is being run
+set +x
+echo -e "==================================\nInstructing Cobbler to run Ansible\n Waiting for completion\n==================================" > /dev/console
+TheTimeIs
+set -x
+# Run the post-install trigger a second time
+curl --max-time 1800 --silent "http://$http_server:$http_port/cblr/svc/op/trig/mode/post/system/$system_name" -o /dev/null || true
+TheTimeIs
+touch $lockfile
+EOF
+
+chmod +x $script
--- /dev/null
+## {{ ansible_managed }}
+#set os_version = $getVar('os_version','')
+# #set hostname = $getVar('name','')
+#set distro = $getVar('distro','').split("-")[0]
+#set distro_ver = $getVar('distro','').split("-")[1]
+#if $distro == 'RHEL' or $distro == 'CentOS'
+#set distro_ver_major = $distro_ver.split(".")[0]
+#set distro_ver_minor = $distro_ver.split(".")[1]
+#end if
+# Partition clearing information
+clearpart --all --initlabel
+# Use all of /dev/sda for the root partition (20G minimum)
+part / --fstype="ext4" --ondisk=sda --size=20000 --grow
+# Clear the Master Boot Record
+zerombr
+# System bootloader configuration
+#if $os_version == 'rhel7'
+ #set bootloader_args = "--location=mbr --boot-drive=sda"
+#else if int($distro_ver_major) >= 8 and 'braggi' not in $hostname
+ #set bootloader_args = "--location=mbr --boot-drive=sda"
+ignoredisk --only-use=sda
+# On CentOS9 on braggi, the smaller "root" drive is sdb during kickstart and sda after booting into the OS.
+#else if int($distro_ver_major) == 9 and 'braggi' in $hostname
+ #set bootloader_args = "--location=mbr --driveorder=sdb,sda"
+ignoredisk --only-use=sda
+#else
+ #set bootloader_args = "--location=mbr --driveorder=sda"
+#end if
+bootloader $bootloader_args
--- /dev/null
+## {{ ansible_managed }}
+{% if use_satellite %}
+## Install our satellite server's CA RPM if use_satellite is true
+wget -O /tmp/satellite-ca.rpm {{ satellite_cert_rpm }}
+rpm -U /tmp/satellite-ca.rpm
+{% endif %}
+## Subscribe to RHSM. (If the subscription-manager vars aren't set — as in the Sepia lab — they will be empty and this snippet won't be run.)
+subscription-manager register --activationkey={{ subscription_manager_activationkey }} --org={{ subscription_manager_org }}
+## Disable all repos
+subscription-manager repos --disable '*'
+## Enable repos
+#if $os_version == 'rhel6'
+subscription-manager repos --enable=rhel-6-server-rpms --enable=rhel-6-server-optional-rpms --enable=rhel-6-server-extras-rpms --enable=rhel-scalefs-for-rhel-6-server-rpms
+#else if $os_version == 'rhel7'
+subscription-manager repos --enable=rhel-7-server-rpms --enable=rhel-7-server-optional-rpms --enable=rhel-7-server-extras-rpms
+#else if $os_version == 'rhel8'
+subscription-manager repos --enable=rhel-8-for-x86_64-baseos-rpms --enable=rhel-8-for-x86_64-appstream-rpms
+#end if
--- /dev/null
## {{ ansible_managed }}
## Create the configuration-management user, grant it passwordless sudo, and
## seed its authorized_keys so ansible can reach the node after first boot.
#set $user = '{{ cm_user }}'
#set $home = '/home/' + $user
#set $auth_keys = $home + '/.ssh/authorized_keys'
groupadd sudo
#set distro = $getVar('distro','').split("-")[0]
#set distro_ver = $getVar('distro','').split("-")[1]
#if $distro == 'openSUSE'
## openSUSE's useradd does not create a matching per-user group by default; -U asks for one
useradd -U -u {{ cm_user_uid }} -G sudo $user
#else
useradd -u {{ cm_user_uid }} -G sudo $user
#end if
passwd -d $user

cat >> /etc/sudoers.d/cephlab_sudo << EOF
%sudo ALL=(ALL) NOPASSWD: ALL
# For ansible pipelining
Defaults !requiretty
Defaults visiblepw
EOF

chmod 0440 /etc/sudoers.d/cephlab_sudo

install -d -m0755 --owner=$user --group=$user /home/$user/.ssh

cat >> $auth_keys << EOF
{% for key in cm_user_ssh_keys %}
{{ key }}
{% endfor %}
EOF

## Use the portable "user:group" separator; the historical "user.group" form is
## deprecated and rejected by some chown implementations. This also matches the
## recursive chown below.
chown $user:$user $auth_keys
chmod 644 $auth_keys
chown -Rf $user:$user /home/$user
curl "http://$http_server:$http_port/cblr/svc/op/nopxe/system/$system_name" -o /dev/null
--- /dev/null
#!/bin/bash
## {{ ansible_managed }}
# Cobbler post-install trigger: refresh the ansible checkouts and run the
# appropriate playbooks against the freshly installed system ($2).
set -ex

# Cobbler on CentOS 7 in May 2023 needed a later python than the default 3.6
# check for SCL 3.8 and enable if so. scl enable starts a child shell; the undocumented
# scl_source sets the environment variables (PATH, LD_LIBRARY_PATH, MANPATH, PKG_CONFIG_PATH,
# and XDG_DATA_DIRS) in the current shell.

if scl -l | grep -s rh-python38 >/dev/null 2>&1 ; then source scl_source enable rh-python38; fi

name=$2
# NOTE: "cut -d ':' -f2" leaves a leading space in $profile; the glob matches
# below ( *"8.stream"* , *"-stock" ) are unaffected by it.
profile=$(cobbler system dumpvars --name $2 | grep profile_name | cut -d ':' -f2)
export USER=root
export HOME=/root
ANSIBLE_CM_PATH=/root/ceph-cm-ansible
SECRETS_REPO_NAME={{ secrets_repo.name }}

# Bail if the ssh port isn't open, as will be the case when this is run
# while the installer is still running. When this is triggered by
# /etc/rc.local after a reboot, the port will be open and we'll continue
nmap -sT -oG - -p 22 "$name" | grep 22/open

mkdir -p /var/log/ansible

# Refresh the secrets repo (if one is configured), serializing concurrent
# pulls with a lock file in the checkout.
if [ "$SECRETS_REPO_NAME" != 'UNDEFINED' ]
then
    ANSIBLE_SECRETS_PATH=/root/$SECRETS_REPO_NAME
    pushd $ANSIBLE_SECRETS_PATH
    flock --close ./.lock git pull
    popd
fi
pushd $ANSIBLE_CM_PATH
flock --close ./.lock git pull
export ANSIBLE_SSH_PIPELINING=1
export ANSIBLE_HOST_KEY_CHECKING=False

# Set up Stream repos
# We have to do it this way because
# 1) Stream ISOs don't work with Cobbler https://bugs.centos.org/view.php?id=18188
# 2) Since we use a non-stream profile then convert it to stream, we can't run any package related tasks
#    until the stream repo files are in place. e.g., The zap ansible tag has some package tasks that fail
#    unless we get the repos in place first.
# Redirect order fixed throughout: it was "2>&1 >> file", which points stderr
# at the caller's console instead of the log. "$name*" is quoted so the glob
# reaches ansible's --limit instead of matching files in the cwd.
if [[ $profile == *"8.stream"* ]]
then
    ansible-playbook tools/convert-to-centos-stream.yml -v --limit "$name*" >> /var/log/ansible/$name.log 2>&1
fi

# Tell ansible to create users, populate authorized_keys, and zap non-root disks
ansible-playbook testnodes.yml -v --limit "$name*" --tags user,pubkeys,zap > /var/log/ansible/$name.log 2>&1
# Now run the rest of the playbook. If it fails, at least we have access.
# Background it so that the request doesn't block for this part and end up
# causing the client to retry, thus spawning this trigger multiple times

# Skip the rest of the testnodes playbook if stock profile requested
if [[ $profile == *"-stock" ]]
then
    exit 0
fi
ansible-playbook cephlab.yml -v --limit "$name*" --skip-tags user,pubkeys,zap >> /var/log/ansible/$name.log 2>&1 &
popd
--- /dev/null
#!/bin/bash
## {{ ansible_managed }}
# Open a serial-over-LAN console to a host's BMC.
# Usage: <script> <short-hostname>
set -ex
name=$1
# Quote the expansions so an empty or odd $name fails loudly instead of
# silently changing the argument list.
# NOTE(review): -P puts the password in the process list; switch to
# ipmitool -f or the IPMI_PASSWORD env var if that matters on this host.
ipmitool -H "$name.{{ ipmi_domain }}" -I lanplus -U "{{ power_user }}" -P "{{ power_pass }}" sol activate
--- /dev/null
#!/bin/bash
## {{ ansible_managed }}
# Power-cycle a system through Cobbler's configured power management.
# Usage: <script> <cobbler-system-name>
# set -ex added for consistency with the sibling trigger scripts: echo each
# command and abort on failure so errors surface in the caller's output.
set -ex
name=$1
cobbler system reboot --name "$name"
--- /dev/null
#!/bin/bash
## {{ ansible_managed }}
# Reimage a system: point it at a new Cobbler profile and PXE-boot it.
# Usage: <script> <cobbler-system-name> <cobbler-profile-name>
set -ex
name=$1
profile=$2
echo "Reimaging $name with profile $profile"
# First turn netboot off so that cobbler removes any stale PXE data
# NOTE(review): this previously read "netboot off" without the leading "--";
# option parsers typically ignore bare positionals, which would make the
# stale-PXE cleanup a silent no-op. Now matches the "--netboot on" form below.
cobbler system edit --name="$name" --netboot off
cobbler system edit --name="$name" --profile "$profile" --netboot on && cobbler system reboot --name "$name"
--- /dev/null
+---
+cobbler_package: cobbler
+cobbler_service: cobbler
+httpd_service: apache2
+cobbler_extra_packages:
+ - git
+ - syslinux
+ - python-pykickstart
+ - fence-agents
+ - nmap
+ - python-pip
--- /dev/null
+---
+# cobbler-web pulls in cobbler
+cobbler_package: cobbler-web
+cobbler_service: cobblerd
+httpd_service: httpd
+cobbler_extra_packages:
+ - git
+ - syslinux
+ - pykickstart
+ - fence-agents-all
+ - nmap
+ - ansible
+
+pip_packages: []
+
+settings:
+ - name: yum_post_install_mirror
+ value: 0
+ - name: signature_url
+ value: https://raw.githubusercontent.com/cobbler/cobbler/master/config/cobbler/distro_signatures.json
+ - name: server
+ value: "{{ ip }}"
+ - name: next_server
+ value: "{{ ip }}"
+ - name: pxe_just_once
+ value: 1
+
+cobbler_settings_file: /etc/cobbler/settings.yaml
+
+kopts_flag: "--kernel-options"
+
+autoinstall_flag: "--autoinstall"
+
+autoinstall_meta_flag: "--autoinstall-meta"
+
+ks_dir: /var/lib/cobbler/templates
--- /dev/null
+---
+# cobbler-web pulls in cobbler
+cobbler_package: cobbler-web
+cobbler_service: cobblerd
+httpd_service: httpd
+cobbler_extra_packages:
+ - git
+ - syslinux
+ - pykickstart
+ - fence-agents-all
+ - nmap
+ - python-pip
+ - python2-crypto
--- /dev/null
+---
+distros:
+ # Distros with empty iso values will be skipped. These dicts will be
+ # updated with same-named items in an 'extra_distros' var, which can be
+ # set in the secrets repo.
+ "inktank-rescue":
+ iso: ""
+ kernel_options: "nokeymap"
+ "dban-2.3.0-autonuke":
+ iso: ""
+ "RHEL-6.6-Server-x86_64":
+ iso: ""
+ "RHEL-6.7-Server-x86_64":
+ iso: ""
+ "RHEL-6.8-Server-x86_64":
+ iso: ""
+ "RHEL-7.0-Server-x86_64":
+ iso: ""
+ "RHEL-7.1-Server-x86_64":
+ iso: ""
+ "RHEL-7.2-Server-x86_64":
+ iso: ""
+ "RHEL-7.3-Server-x86_64":
+ iso: ""
+ "RHEL-7.4-Server-x86_64":
+ iso: ""
+ "RHEL-7.5-Server-x86_64":
+ iso: ""
+ "RHEL-7.6-Server-x86_64":
+ iso: ""
+ "RHEL-7.7-Server-x86_64":
+ iso: ""
+ "RHEL-7.8-Server-x86_64":
+ iso: ""
+ "RHEL-7.9-Server-x86_64":
+ iso: ""
+ "RHEL-8.0-Server-x86_64":
+ iso: ""
+ "RHEL-8.1-Server-x86_64":
+ iso: ""
+ "RHEL-8.2-Server-x86_64":
+ iso: ""
+ "RHEL-8.3-Server-x86_64":
+ iso: ""
+ "RHEL-8.4-Server-x86_64":
+ iso: ""
+ "RHEL-8.5-Server-x86_64":
+ iso: ""
+ "RHEL-8.6-Server-x86_64":
+ iso: ""
+ "RHEL-9.0-Server-x86_64":
+ iso: ""
+ "RHEL-9.3-Server-x86_64":
+ iso: ""
+ "CentOS-8.stream-x86_64":
+ iso: ""
+ "CentOS-9.stream-x86_64":
+ iso: http://mirror.lanet.network/centos-stream/9-stream/BaseOS/x86_64/iso/CentOS-Stream-9-latest-x86_64-dvd1.iso
+ sha256: 774db59bf99570cfd0703c7e2751c37702bc961fdd32c59e52828ca739f86121
+ kickstart: cephlab_rhel.ks
+ kernel_options: "inst.stage2=http://@@http_server@@/cblr/links/{{ distro_name }}/ inst.ks=http://@@http_server@@/cblr/svc/op/ks/system/@@name@@"
+ "Fedora-22-Server-x86_64":
+ iso: http://ftp.linux.ncsu.edu/mirror/ftp.redhat.com/pub/fedora/linux/releases/22/Server/x86_64/iso/Fedora-Server-DVD-x86_64-22.iso
+ sha256: b2acfa7c7c6b5d2f51d3337600c2e52eeaa1a1084991181c28ca30343e52e0df
+ kickstart: cephlab_rhel.ks
+ "Fedora-31-Server-x86_64":
+ iso: https://dl.fedoraproject.org/pub/fedora/linux/releases/31/Server/x86_64/iso/Fedora-Server-dvd-x86_64-31-1.9.iso
+ sha256: 225ebc160e40bb43c5de28bad9680e3a78a9db40c9e3f4f42f3ee3f10f95dbeb
+ kickstart: cephlab_rhel.ks
+ "CentOS-6.7-x86_64":
+ iso: http://ftp.linux.ncsu.edu/pub/CentOS/6.7/isos/x86_64/CentOS-6.7-x86_64-bin-DVD1.iso
+ sha256: c0c1a05d3d74fb093c6232003da4b22b0680f59d3b2fa2cb7da736bc40b3f2c5
+ kickstart: cephlab_rhel.ks
+ "CentOS-7.0-x86_64":
+ iso: http://archive.kernel.org/centos-vault/7.0.1406/isos/x86_64/CentOS-7.0-1406-x86_64-DVD.iso
+ sha256: ee505335bcd4943ffc7e6e6e55e5aaa8da09710b6ceecda82a5619342f1d24d9
+ kickstart: cephlab_rhel.ks
+ "CentOS-7.1-x86_64":
+ iso: http://archive.kernel.org/centos-vault/7.1.1503/isos/x86_64/CentOS-7-x86_64-DVD-1503-01.iso
+ sha256: 85bcf62462fb678adc0cec159bf8b39ab5515404bc3828c432f743a1b0b30157
+ kickstart: cephlab_rhel.ks
+ "CentOS-7.2-x86_64":
+ iso: http://ftp.linux.ncsu.edu/pub/CentOS/7.2.1511/isos/x86_64/CentOS-7-x86_64-DVD-1511.iso
+ sha256: 907e5755f824c5848b9c8efbb484f3cd945e93faa024bad6ba875226f9683b16
+ kickstart: cephlab_rhel.ks
+ "CentOS-7.3-x86_64":
+ iso: http://ftp.linux.ncsu.edu/pub/CentOS/7.3.1611/isos/x86_64/CentOS-7-x86_64-DVD-1611.iso
+ sha256: c455ee948e872ad2194bdddd39045b83634e8613249182b88f549bb2319d97eb
+ kickstart: cephlab_rhel.ks
+ "CentOS-7.4-x86_64":
+ iso: http://ftp.linux.ncsu.edu/pub/CentOS/7.4.1708/isos/x86_64/CentOS-7-x86_64-DVD-1708.iso
+ sha256: ec7500d4b006702af6af023b1f8f1b890b6c7ee54400bb98cef968b883cd6546
+ kickstart: cephlab_rhel.ks
+ "CentOS-7.5-x86_64":
+ iso: http://ftp.linux.ncsu.edu/pub/CentOS/7.5.1804/isos/x86_64/CentOS-7-x86_64-DVD-1804.iso
+ sha256: 506e4e06abf778c3435b4e5745df13e79ebfc86565d7ea1e128067ef6b5a6345
+ kickstart: cephlab_rhel.ks
+ "CentOS-7.6-x86_64":
+ iso: http://ftp.linux.ncsu.edu/pub/CentOS/7.6.1810/isos/x86_64/CentOS-7-x86_64-DVD-1810.iso
+ sha256: 6d44331cc4f6c506c7bbe9feb8468fad6c51a88ca1393ca6b8b486ea04bec3c1
+ kickstart: cephlab_rhel.ks
+ "CentOS-7.7-x86_64":
+ iso: http://ftp.linux.ncsu.edu/pub/CentOS/7.7.1908/isos/x86_64/CentOS-7-x86_64-DVD-1908.iso
+ sha256: 9bba3da2876cb9fcf6c28fb636bcbd01832fe6d84cd7445fa58e44e569b3b4fe
+ kickstart: cephlab_rhel.ks
+ "CentOS-7.8-arm":
+ iso: http://centos.mirror.garr.it/centos-altarch/7.8.2003/isos/aarch64/CentOS-7-aarch64-Everything-2003.iso
+ sha256: 386e85a0d49d457252fcdbfa23d2082fc3f132f8405622831b07fd27a6071c7e
+ kickstart: cephlab_rhel.ks
+ arch: arm
+ "CentOS-7.9-x86_64":
+ iso: http://mirror.linux.duke.edu/pub/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-DVD-2009.iso
+ sha256: e33d7b1ea7a9e2f38c8f693215dd85254c3a4fe446f93f563279715b68d07987
+ kickstart: cephlab_rhel.ks
+ "CentOS-8.0-x86_64":
+ iso: http://mirror.linux.duke.edu/pub/centos/8.0.1905/isos/x86_64/CentOS-8-x86_64-1905-dvd1.iso
+ sha256: ea17ef71e0df3f6bf1d4bf1fc25bec1a76d1f211c115d39618fe688be34503e8
+ kickstart: cephlab_rhel.ks
+ "CentOS-8.1-x86_64":
+ iso: http://mirror.linux.duke.edu/pub/centos/8.1.1911/isos/x86_64/CentOS-8.1.1911-x86_64-dvd1.iso
+ sha256: 3ee3f4ea1538e026fff763e2b284a6f20b259d91d1ad5688f5783a67d279423b
+ kickstart: cephlab_rhel.ks
+ "CentOS-8.1-aarch64":
+ iso: http://mirror.linux.duke.edu/pub/centos/8/isos/aarch64/CentOS-8.1.1911-aarch64-dvd1.iso
+ sha256: 357f34e86a28c86aaf1661462ef41ec4cf5f58c120f46e66e1985a9f71c246e3
+ kickstart: cephlab_rhel.ks
+ arch: aarch64
+ "CentOS-8.2-x86_64":
+ iso: http://ftp.linux.ncsu.edu/pub/CentOS/8.2.2004/isos/x86_64/CentOS-8.2.2004-x86_64-dvd1.iso
+ sha256: c87a2d81d67bbaeaf646aea5bedd70990078ec252fc52f5a7d65ff609871e255
+ kickstart: cephlab_rhel.ks
+ "CentOS-8.3-x86_64":
+ iso: http://mirror.linux.duke.edu/pub/centos/8.3.2011/isos/x86_64/CentOS-8.3.2011-x86_64-dvd1.iso
+ sha256: aaf9d4b3071c16dbbda01dfe06085e5d0fdac76df323e3bbe87cce4318052247
+ kickstart: cephlab_rhel.ks
+ "CentOS-8.4-x86_64":
+ iso: http://packages.oit.ncsu.edu/centos/8.4.2105/isos/x86_64/CentOS-8.4.2105-x86_64-dvd1.iso
+ sha256: 0394ecfa994db75efc1413207d2e5ac67af4f6685b3b896e2837c682221fd6b2
+ kickstart: cephlab_rhel.ks
+ "CentOS-8.5-x86_64":
+ iso: https://mirror.cs.pitt.edu/centos-vault/8.5.2111/isos/x86_64/CentOS-8.5.2111-x86_64-dvd1.iso
+ sha256: 3b795863001461d4f670b0dedd02d25296b6d64683faceb8f2b60c53ac5ebb3e
+ kickstart: cephlab_rhel.ks
+ "Rocky-9.5-x86_64":
+ iso: https://download.rockylinux.org/pub/rocky/9/isos/x86_64/Rocky-9.5-x86_64-dvd.iso
+ sha256: ba60c3653640b5747610ddfb4d09520529bef2d1d83c1feb86b0c84dff31e04e
+ kickstart: cephlab_rhel.ks
+ "Ubuntu-12.04-server-x86_64":
+ iso: "http://releases.ubuntu.com/12.04/ubuntu-12.04.5-server-amd64.iso"
+ sha256: af224223de99e2a730b67d7785b657f549be0d63221188e105445f75fb8305c9
+ kickstart: cephlab_ubuntu.preseed
+ kernel_options: "netcfg/choose_interface=auto console=tty0 console=ttyS1,115200"
+ kernel_options_post: "pci=realloc=off console=tty0 console=ttyS1,115200"
+ "Ubuntu-14.04-server-x86_64":
+ iso: "http://releases.ubuntu.com/14.04/ubuntu-14.04.3-server-amd64.iso"
+ sha256: a3b345908a826e262f4ea1afeb357fd09ec0558cf34e6c9112cead4bb55ccdfb
+ kickstart: cephlab_ubuntu.preseed
+ kernel_options: "netcfg/choose_interface=auto console=tty0 console=ttyS1,115200"
+ kernel_options_post: "pci=realloc=off console=tty0 console=ttyS1,115200"
+ "Ubuntu-15.04-server-x86_64":
+ iso: "http://releases.ubuntu.com/15.04/ubuntu-15.04-server-amd64.iso"
+ sha256: 6501c8545374665823384bbb6235f865108f56d8a30bbf69dd18df73c14ccb84
+ kickstart: cephlab_ubuntu.preseed
+ kernel_options: "netcfg/choose_interface=auto console=tty0 console=ttyS1,115200"
+ kernel_options_post: "pci=realloc=off console=tty0 console=ttyS1,115200"
+ "Ubuntu-16.04-server-x86_64":
+ iso: "http://releases.ubuntu.com/16.04/ubuntu-16.04.6-server-amd64.iso"
+ sha256: 16afb1375372c57471ea5e29803a89a5a6bd1f6aabea2e5e34ac1ab7eb9786ac
+ kickstart: cephlab_ubuntu.preseed
+ kernel_options: "netcfg/choose_interface=auto console=tty0 console=ttyS1,115200"
+ kernel_options_post: "pci=realloc=off console=tty0 console=ttyS1,115200"
+ "Ubuntu-18.04-server-x86_64":
+ iso: "http://cdimage.ubuntu.com/releases/18.04/release/ubuntu-18.04-server-amd64.iso"
+ sha256: a7f5c7b0cdd0e9560d78f1e47660e066353bb8a79eb78d1fc3f4ea62a07e6cbc
+ kickstart: cephlab_ubuntu.preseed
+ kernel_options: "netcfg/choose_interface=auto console=tty0 console=ttyS1,115200 GRUB_DISABLE_OS_PROBER=true"
+ kernel_options_post: "pci=realloc=off console=tty0 console=ttyS1,115200"
+ "Ubuntu-20.04-server-x86_64":
+ iso: "http://cdimage.ubuntu.com/ubuntu-legacy-server/releases/20.04/release/ubuntu-20.04.1-legacy-server-amd64.iso"
+ sha256: f11bda2f2caed8f420802b59f382c25160b114ccc665dbac9c5046e7fceaced2
+ kickstart: cephlab_ubuntu.preseed
+ kernel_options: "netcfg/choose_interface=auto console=tty0 console=ttyS1,115200 GRUB_DISABLE_OS_PROBER=true"
+ kernel_options_post: "pci=realloc=off console=tty0 console=ttyS1,115200"
+ "openSUSE-15.0-x86_64":
+ iso: "https://download.opensuse.org/distribution/leap/15.0/iso/openSUSE-Leap-15.0-DVD-x86_64.iso"
+ sha256: c477428c7830ca76762d2f78603e13067c33952b936ff100189523e1fabe5a77
+ kickstart: cephlab_opensuse_leap.xml
+ kernel_options: "install=http://@@http_server@@/cblr/links/{{ distro_name }}/"
+ "openSUSE-15.1-x86_64":
+ iso: "https://download.opensuse.org/distribution/leap/15.1/iso/openSUSE-Leap-15.1-DVD-x86_64.iso"
+ sha256: c6d3ed19fe5cc25c4667bf0b46cc86aebcfbca3b0073aed0a288834600cb8b97
+ kickstart: cephlab_opensuse_leap.xml
+ kernel_options: "install=http://@@http_server@@/cblr/links/{{ distro_name }}/"
+ "openSUSE-15.2-x86_64":
+ iso: "https://download.opensuse.org/distribution/leap/15.2/iso/openSUSE-Leap-15.2-DVD-x86_64-Current.iso"
+ sha256: 8bc7d3e1ad515c86a285098b98a4def14e43d19e7a393cf66e980b849d2a1ddf
+ kickstart: cephlab_opensuse_leap.xml
+ kernel_options: "install=http://@@http_server@@/cblr/links/{{ distro_name }}/"
--- /dev/null
+---
+- name: Check to see if the kernel exists
+ stat: path={{ kernel_path }} get_checksum=no
+ register: kernel_stat
+
+- name: Check to see if the initrd exists
+ stat: path={{ initrd_path }} get_checksum=no
+ register: initrd_stat
+
+- name: Download kernel
+ get_url:
+ url={{ distro.kernel }}
+ dest={{ kernel_path }}
+ checksum=sha256:{{ distro.kernel_sha256 }}
+ when: profile is defined and profile.stdout == ''
+ register: download_kernel
+
+- name: Download initrd
+ get_url:
+ url={{ distro.initrd }}
+ dest={{ initrd_path }}
+ checksum=sha256:{{ distro.initrd_sha256 }}
+ when: profile is defined and profile.stdout == ''
+ register: download_initrd
+
+- name: Set files_exist if the required files are in place
+ set_fact:
+ files_exist: "{{ ( kernel_stat.stat.exists or download_kernel is changed) and ( initrd_stat.stat.exists or download_initrd is changed ) }}"
--- /dev/null
+---
+- name: Check to see if the ISO exists
+ stat: path={{ iso_path }} get_checksum=no
+ register: iso_stat
+
+- name: Download ISO
+ get_url:
+ url={{ distro.iso }}
+ dest={{ iso_path }}
+ checksum=sha256:{{ distro.sha256 }}
+ when: profile is defined and profile.stdout == ''
+ register: download
--- /dev/null
+---
+# This profile will do all the work necessary to create a new distro/profile
+# pair in Cobbler.
+
+# Since this profile will be used several times in the same playbook,
+# mention the distro name each time.
+- name: Distro name
+ debug: var=distro_name
+
+- name: Load extra_distros from secrets
+ set_fact:
+ distros: "{{ distros|combine(extra_distros, recursive=True) }}"
+
+- name: Find distro settings
+ set_fact:
+ distro: "{{ distros[distro_name] }}"
+
+- name: Fail if an iso is provided in combination with either a kernel or initrd
+ fail: msg="Cannot specify both 'iso' and 'kernel' or 'initrd'. distro '{{ distro_name }}'"
+ when: distro.iso != '' and (distro.kernel is defined or distro.initrd is defined)
+
+- name: Set profile_type to iso
+ set_fact:
+ profile_type: 'iso'
+ when: distro.iso is defined and distro.iso != ''
+
+- name: Set profile_type to image
+ set_fact:
+ profile_type: 'image'
+ when: (distro.kernel is defined and distro.kernel != '') and (distro.initrd is defined and distro.initrd != '')
+
+- name: Determine if distro profile exists
+ command: cobbler profile find --name {{ distro_name }}
+ # Skip if the profile_type is empty; this allows us to mention distros with
+ # ISOs that are internal, but leave the URL out.
+ when: profile_type|default('') != ''
+ register: profile
+ ignore_errors: true
+ changed_when: false
+
# Import from ISO, except for 8.stream, which is handled by
# import_stream_profile.yml below. The exclusion previously read
# "and '"stream" not in distro_name'" — a quoted string literal inside the
# expression, which Jinja treats as always-truthy, making the guard a no-op.
# Scoped to "8.stream" so 9.stream (which ships a real ISO above) still
# imports via ISO, matching the 8.stream-only special case at the stream task.
- import_tasks: import_distro_iso.yml
  when:
    - profile_type|default('') == 'iso'
    - '"8.stream" not in distro_name'
+
+- import_tasks: import_distro_image.yml
+ when: profile_type|default('') == 'image'
+
+- import_tasks: import_stream_profile.yml
+ when: '"8.stream" in distro_name'
+
+# If either the profile already existed or we successfully imported the
+# distro, we might want to update other options in the profile. i.e. kickstarts
# Whether a usable profile exists: either "cobbler profile find" returned the
# profile name, or one of the import tasks above just created it.
# The expression must be wrapped in {{ }} — without it, set_fact stored the
# raw text as a literal (always-truthy) string, so every downstream
# "... and profile_found" condition was a no-op.
- name: Set profile_found
  set_fact:
    profile_found: "{{ (profile is defined and profile.stdout == distro_name) or (imported is defined and imported.rc == 0) }}"
+
+- import_tasks: update_kickstart.yml
+ when: distro.kickstart is defined and
+ distro.kickstart != '' and
+ profile_found
+
+- import_tasks: update_kernel_options.yml
+ when: distro.kernel_options is defined and
+ distro.kernel_options != '' and
+ profile_found
+
+- import_tasks: update_kernel_options_post.yml
+ when: distro.kernel_options_post is defined and
+ distro.kernel_options_post != '' and
+ profile_found
--- /dev/null
+---
+- name: Set image scratch directory
+ set_fact:
+ image_path: "{{ other_image_dir }}/{{ distro_name }}"
+
+- name: Set kernel name
+ set_fact:
+ kernel_name: "{{ distro.kernel.split('/')[-1] }}"
+
+- name: Set kernel path
+ set_fact:
+ kernel_path: "{{ other_image_dir }}/{{ kernel_name }}"
+
+- name: Set initrd name
+ set_fact:
+ initrd_name: "{{ distro.initrd.split('/')[-1] }}"
+
+- name: Set initrd path
+ set_fact:
+ initrd_path: "{{ other_image_dir }}/{{ initrd_name }}"
+
+- import_tasks: download_image.yml
+ when: distro.kernel != ''
+
+- name: Set arch
+ set_fact:
+ arch: "{{ distro.arch|default('x86_64') }}"
+ when: download_kernel is defined and download_kernel is success
+
+- name: Add the distro to cobbler
+ command: cobbler distro add --kernel {{ kernel_path }} --initrd {{ initrd_path }} --name {{ distro_name }}
+ when: download is changed or (files_exist and
+ profile is defined and profile.stdout == '')
+ register: imported
+
+- name: Add the profile to cobbler
+ command: cobbler profile add --name {{ distro_name }} --distro {{ distro_name }}
+ when: imported is defined and imported.stdout == ''
+ register: imported
--- /dev/null
+---
+- name: Set ISO name
+ set_fact:
+ iso_name: "{{ distro.iso.split('/')[-1] }}"
+
+- name: Set ISO path
+ set_fact:
+ iso_path: "{{ iso_dir }}/{{ iso_name }}"
+
+- import_tasks: download_iso.yml
+ when: distro.iso != ''
+
+# we do this so that if the playbook fails
+# after mounting and we need to run it again
+# then we'll remount and complete the rest
+# of the tasks like it's the first run
+- name: Clear the mount point.
+ mount:
+ name: "{{ iso_mount }}"
+ src: "{{ iso_path }}"
+ fstype: "iso9660"
+ state: unmounted
+
+- name: Mount ISO
+ mount:
+ name: "{{ iso_mount }}"
+ src: "{{ iso_path }}"
+ opts: "loop"
+ fstype: "iso9660"
+ state: mounted
+ when: download is changed or (iso_stat.stat is defined and iso_stat.stat.exists and
+ profile is defined and profile.stdout == '')
+ register: mount
+
+- name: Set arch
+ set_fact:
+ arch: "{{ distro.arch|default('x86_64') }}"
+ when: mount is defined and mount is changed
+
+- name: Import the distro (also creates the profile)
+ command: cobbler import --path={{ iso_mount }} --name={{ distro_name }} --arch={{ arch }}
+ register: imported
+ when: mount is defined and mount is changed
+
+# In the next two step we need to
+# rename the distro and profile only when the arch is arm
+# because cobbler is adding the arm word twice to the name instead of once
+- name: Rename the distro if the arch is arm
+ command: cobbler distro rename --name={{ distro_name }}-arm --newname={{ distro_name }}
+ when: mount is defined and mount is changed and
+ arch == "arm"
+
+- name: Rename the profile if the arch is arm
+ command: cobbler profile rename --name={{ distro_name }}-arm --newname={{ distro_name }}
+ when: mount is defined and mount is changed and
+ arch == "arm"
+
+- name: Unmount ISO
+ mount:
+ name: "{{ iso_mount }}"
+ src: "{{ iso_path }}"
+ fstype: "iso9660"
+ state: unmounted
+ when: mount is defined and mount is changed
--- /dev/null
---
# Derive the pieces of a Stream distro name, e.g. for
# distro_name "CentOS-8.3-Stream-x86_64":
#   distro_and_version  -> "CentOS-8"   (everything before the first '.')
#   stream_distro_name  -> "CentOS"     (everything before the first '-')
#   stream_distro_version -> "8"        (second '-'-separated field)
- name: "Extract distro name and major version from {{ distro_name }}"
  set_fact:
    distro_and_version: "{{ distro_name.split('.')[0] }}"

- name: "Extract distro name from {{ distro_name }}"
  set_fact:
    stream_distro_name: "{{ distro_name.split('-')[0] }}"

- name: "Extract the major version number from {{ distro_and_version }}"
  set_fact:
    stream_distro_version: "{{ distro_and_version.split('-')[1] }}"

# Exclude stream/arm/aarch/stock profiles, version-sort the rest, and take
# the newest; xargs trims surrounding whitespace.
- name: "Get the latest non-Stream profile that matches this {{ stream_distro_name }} Stream distro version (e.g., CentOS-8.3-x86_64)"
  shell: "cobbler profile list | grep {{ distro_and_version }} | grep -v 'stream\\|arm\\|aarch\\|stock' | sort -V | tail -n 1 | xargs"
  register: latest_non_stream_profile

# See commit message for why we do it this way
- name: "Add {{ distro_name }} to Cobbler as a sub-profile of {{ latest_non_stream_profile.stdout }}"
  command: "cobbler profile add --name {{ distro_name }} --parent {{ latest_non_stream_profile.stdout }} --clobber"
  register: imported
  when: latest_non_stream_profile.stdout_lines|length != 0

# Try importing as an ISO instead if we can't create a sub-profile
- import_tasks: import_distro_iso.yml
  when: latest_non_stream_profile.stdout_lines|length == 0
--- /dev/null
---
# Role entry point: run the distro import tasks, selectable via the
# "distros" tag.
- import_tasks: import_distro.yml
  tags:
    - distros
--- /dev/null
---
# This returns additional kernel_options not explicitly set in the profile by us.
# These values come from the distro import, I believe. Here's some example output from the vivid profile:
# ksdevice=bootif lang= biosdevname=0 text netcfg/choose_interface=auto console=tty0 console=ttyS1,115200
# The 'ksdevice=bootif lang=' was not added by the profile and persists even when resetting the kernel_options
# in the next task. This means that setting kernel_options will never be idempotent.
- name: Check to see if kernel_options needs updating
  shell: "cobbler profile dumpvars --name={{ distro_name }} | grep '^kernel_options :' | cut -d : -f 2"
  changed_when: false
  register: kernel_options

# This task is not idempotent because of the reason mentioned above.
# NOTE(review): kopts_flag is defined elsewhere in the role -- presumably it
# abstracts the Cobbler CLI flag (--kopts vs --kernel-options); confirm there.
- name: "Set the profile's kernel_options"
  command: cobbler profile edit --name={{ distro_name }} "{{ kopts_flag }}"='{{ distro.kernel_options }}'
  when: kernel_options.stdout.strip() != distro.kernel_options
--- /dev/null
---
# Same pattern as kernel_options.yml: read the current value, then only
# issue the edit when it differs from the desired value.
- name: Get current value for kernel_options_post
  shell: "cobbler profile dumpvars --name={{ distro_name }} | grep '^kernel_options_post :' | cut -d : -f 2"
  changed_when: false
  register: kernel_options_post

- name: "Set the profile's kernel_options_post if needed."
  command: cobbler profile edit --name={{ distro_name }} "{{ kopts_flag }}"-post='{{ distro.kernel_options_post }}'
  when: kernel_options_post.stdout.strip() != distro.kernel_options_post
--- /dev/null
---
- name: Set kickstart path
  set_fact:
    kickstart_path: "{{ ks_dir }}/{{ distro.kickstart }}"

# dumpvars prints "kickstart : <path>"; awk field 3 is the path itself.
- name: Check to see if the kickstart needs updating
  shell: cobbler profile dumpvars --name={{ distro_name }} | grep '^kickstart :' | awk '{ print $3 }'
  when: kickstart_path is defined
  changed_when: false
  register: kickstart

# NOTE(review): autoinstall_flag is defined elsewhere in the role --
# presumably --kickstart (Cobbler 2) vs --autoinstall (Cobbler 3); confirm.
- name: "Set the profile's kickstart"
  command: cobbler profile edit --name={{ distro_name }} "{{ autoinstall_flag }}"={{ kickstart_path }}
  when: kickstart is defined and
        kickstart.stdout != kickstart_path
--- /dev/null
---
# NIC cobbler configures for a system unless overridden per host.
interface: eth0
# Per-host kernel options; empty by default, override via hostvars.
kernel_options: ''
kernel_options_post: ''
# Profile assigned to systems when they are first added to cobbler.
default_profile: "RHEL-8.6-Server-x86_64"
--- /dev/null
---
# Populate/refresh cobbler's system records, then sync so the changes are
# written out to the bootloader/DHCP/DNS configs.
- import_tasks: populate_systems.yml
  tags:
    - systems

- name: Run cobbler sync
  command: cobbler sync
  # NOTE(review): no_log also suppresses output on failure -- presumably set
  # to keep the large sync output out of logs; confirm before relying on it.
  no_log: true
  tags:
    - systems
--- /dev/null
---
# Reconcile cobbler's system list with the "cobbler_managed" inventory group:
# add hosts cobbler doesn't know about yet, and refresh the ones it does.
- name: Get list of cobbler systems
  command: cobbler system list
  register: cmd_cobbler_systems
  no_log: true

- name: Set cobbler_systems_current
  set_fact:
    # Cobbler stores short hostnames; append lab_domain so the entries are
    # comparable with the FQDNs used in the inventory. Each element must be
    # a fully quoted string -- the previous template omitted the closing
    # quote after the domain, yielding a malformed list literal that broke
    # the difference/intersect filters below.
    cobbler_systems_current: "[{% for host in cmd_cobbler_systems.stdout.strip().split() %}'{{ host }}.{{ lab_domain }}', {% endfor %}]"

- name: set cobbler_systems_add
  set_fact:
    # Inventory hosts cobbler doesn't have yet.
    cobbler_systems_add:
      "{{ groups.cobbler_managed | difference(cobbler_systems_current) }}"

# Skipped for hosts missing a mac or ip hostvar, since cobbler requires both.
- name: Add missing systems to cobbler
  command: cobbler system add --name={{ item.split('.')[0] }} --profile={{ default_profile }} --mac={{ hostvars[item].mac }} --ip-address={{ hostvars[item].ip }} --interface={{ hostvars[item].interface|default(interface) }} --hostname={{ item.split('.')[0] }}.{{ lab_domain }} "{{ kopts_flag }}"="{{ hostvars[item].kernel_options|default(kernel_options) }}" "{{ autoinstall_meta_flag|default('--ksmeta') }}"="{{ hostvars[item].kickstart_metadata|default(kickstart_metadata) }}" --power-type={{ hostvars[item].power_type|default(power_type) }} --power-address={{ item.split('.')[0] }}.{{ ipmi_domain }} --power-user={{ hostvars[item].power_user|default(power_user) }} --power-pass={{ hostvars[item].power_pass|default(power_pass) }} --netboot-enabled false
  with_items: "{{ cobbler_systems_add }}"
  when:
    - hostvars[item].mac is defined
    - hostvars[item].ip is defined

- name: set cobbler_systems_update
  set_fact:
    # Inventory hosts cobbler already knows about.
    cobbler_systems_update:
      "{{ groups.cobbler_managed | intersect(cobbler_systems_current) }}"

- name: Update existing systems in cobbler
  command: cobbler system edit --name={{ item.split('.')[0] }} --mac={{ hostvars[item].mac }} --ip-address={{ hostvars[item].ip }} --interface={{ hostvars[item].interface|default(interface) }} --hostname={{ item.split('.')[0] }}.{{ lab_domain }} "{{ kopts_flag }}"="{{ hostvars[item].kernel_options|default(kernel_options) }}" "{{ kopts_flag }}"-post="{{ hostvars[item].kernel_options_post|default(kernel_options_post) }}" "{{ autoinstall_meta_flag|default('--ksmeta') }}"="{{ hostvars[item].kickstart_metadata|default(kickstart_metadata) }}" --power-type={{ hostvars[item].power_type|default(power_type) }} --power-address={{ item.split('.')[0] }}.{{ ipmi_domain }} --power-user={{ hostvars[item].power_user|default(power_user) }} --power-pass={{ hostvars[item].power_pass|default(power_pass) }}
  with_items: "{{ cobbler_systems_update }}"
  when:
    - hostvars[item].mac is defined
    - hostvars[item].ip is defined
--- /dev/null
+Common
+======
+
+The common role consists of tasks we want run on all hosts in the Ansible
+inventory (i.e., not just testnodes). This includes things like setting the
+timezone and enabling repos.
+
+Usage
++++++
+
+The common role is run on every host in the Ansible inventory and is typically
+called by another role's playbook. Calling it manually to run a
+specific task (such as setting the timezone) can be done like so::
+
+ ansible-playbook common.yml --limit="host.example.com" --tags="timezone"
+
+**WARNING:** If the common role is run without a valid tag, the full role will run. See ``roles/common/tasks`` for what this includes.
+
+Variables
++++++++++
+
+``timezone`` is the desired timezone for all hosts in the Ansible inventory.
+Defined in ``roles/common/defaults/main.yml``. Values in the TZ column here_ can be used
+in place of the default value.
+
+``subscription_manager_activationkey`` and ``subscription_manager_org`` are used
+to register systems with Red Hat's Subscription Manager tool. Blank defaults
+are set in ``roles/common/defaults/main.yml`` and should be overridden in the
+secrets repo.
+
+``rhsm_repos`` is a list of Red Hat repos that a system should subscribe to. We
+have them defined in ``roles/common/vars/redhat_{6,7}.yml``.
+
``use_satellite`` is a boolean that sets whether a local Red Hat Satellite server is available and should be used instead of Red Hat's CDN. If ``use_satellite`` is set to true, you must also define ``subscription_manager_activationkey``, ``subscription_manager_org``, and ``satellite_cert_rpm`` in your secrets repo. ``set_rhsm_release: true`` will add ``--release=X.Y`` to the ``subscription-manager register`` command; this prevents a RHEL7.6 install from being upgraded to RHEL7.7, for example.::
+
+ # Red Hat Satellite vars
+ use_satellite: true
+ satellite_cert_rpm: "http://satellite.example.com/pub/katello-ca-consumer-latest.noarch.rpm"
+ subscription_manager_org: "Your Org"
+ subscription_manager_activationkey: "abc123"
+ set_rhsm_release: false
+
+``epel_mirror_baseurl`` is self explanatory and defined in
+``roles/common/defaults/main.yml``. Can be overwritten in secrets if you run
+your own local epel mirror.
+
+``epel_repos`` is a dictionary used to create epel repo files. Defined in ``roles/common/defaults/main.yml``.
+
+``enable_epel`` is a boolean that sets whether epel repos should be enabled.
+Defined in ``roles/common/defaults/main.yml``.
+
+``yum_timeout`` is an integer used to set the yum timeout. Defined in
+``roles/common/defaults/main.yml``.
+
+``nagios_allowed_hosts`` should be a comma-separated list of hosts allowed to query NRPE. Override in the secrets repo.
+
+The following variables are used to configure NRPE_ (Nagios Remote Plugin
+Executor) on hosts in ``/etc/nagios/nrpe.cfg``. The system defaults differ between distros (``nrpe`` in
+RHEL vs ``nagios-nrpe-server`` in Ubuntu). Setting these allows us to make
tasks OS-agnostic. The variables are mostly self-explanatory and defined in
+``roles/common/vars/{yum,apt}_systems.yml``::
+
+ ## Ubuntu variables are used in this example
+
+ # Used to install the package and start/stop the service
+ nrpe_service_name: nagios-nrpe-server
+
+ # NRPE service runs as this user/group
+ nrpe_user: nagios
+ nrpe_group: nagios
+
+ # Where nagios plugins can be found
+ nagios_plugins_directory: /usr/lib/nagios/plugins
+
+ # List of packages needed for NRPE use
+ nrpe_packages:
+ - nagios-nrpe-server
+ - nagios-plugins-basic
+
Defining ``secondary_nic_mac`` as a hostvar will configure the corresponding NIC to use DHCP. This
+assumes you've configured a static IP definition on your DHCP server and the NIC is cabled.
The tasks will automatically set the MTU to 9000 if the NIC is 10Gb or 25Gb. Override in ``group_vars/group.yml`` as ``secondary_nic_mtu=1500``
+This taskset only supports one secondary NIC.::
+
+ secondary_nic_mac: 'DE:AD:BE:EF:00:11'
+
+Tags
+++++
+
+timezone
+ Sets the timezone
+
+monitoring-scripts
+ Installs smartmontools (if necessary) and uploads custom monitoring scripts.
+ See ``roles/common/tasks/disk_monitoring.yml``.
+
+entitlements
+ Registers a Red Hat host then subscribes and enables repos. See
+ ``roles/common/tasks/rhel-entitlements.yml``.
+
+kerberos
+ Configures kerberos. See ``roles/common/tasks/kerberos.yml``.
+
+nagios
+ Installs and configures nrpe service (including firewalld and SELinux if
+ applicable). ``monitoring-scripts`` is also always run with this tag since
+ NRPE isn't very useful without them.
+
+secondary-nic
+ Configure secondary NIC if ``secondary_nic_mac`` is defined.
+
+To Do
++++++
+
+- Rewrite ``roles/common/tasks/rhel-entitlements.yml`` to use Ansible's
+ redhat_subscription_module_.
+
+.. _here: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
+.. _NRPE: https://github.com/NagiosEnterprises/nrpe
+.. _redhat_subscription_module: https://docs.ansible.com/ansible/redhat_subscription_module.html
--- /dev/null
+---
+timezone: "Etc/UTC"
+
+# Red Hat Subscription Manager credentials
+subscription_manager_activationkey: ""
+subscription_manager_org: ""
+
+# Repos to enable in Red Hat Subscription Manager
+rhsm_repos: []
+
+# Defines whether to use a Red Hat Satellite server
+use_satellite: false
+
+kerberos_realm: EXAMPLE.COM
+
+epel_mirror_baseurl: "http://dl.fedoraproject.org/pub/epel"
+epel_repos:
+ epel:
+ name: "Extra Packages for Enterprise Linux"
+ metalink: "https://mirrors.fedoraproject.org/metalink?repo=epel-{{ ansible_distribution_major_version }}&arch=$basearch&infra=$infra&content=$contentdir"
+ # ternary requires ansible >= 1.9
+ enabled: "{{ enable_epel | ternary(1, 0) }}"
+ gpgcheck: 0
+ epel-testing:
+ name: "Extra Packages for Enterprise Linux - Testing"
+ metalink: "https://mirrors.fedoraproject.org/metalink?repo=testing-epel{{ ansible_distribution_major_version }}&arch=$basearch&infra=$infra&content=$contentdir"
+ enabled: 0
+ gpgcheck: 0
+
+enable_epel: true
+yum_timeout: 300
+
+# Override in secrets repo
+nagios_allowed_hosts: "127.0.0.1"
+
+# Override in roles/common/vars/os_version.yml
+nrpe_selinux_packages:
+ - libsemanage-python
+ - policycoreutils-python
+
+# Is this a containerized node?
+containerized_node: false
--- /dev/null
+#!/usr/bin/perl
+
+# {{ ansible_managed }}
+
+#******************************************************************************************
+#
+# NRPE DISK USAGE PLUGIN
+#
+# Program: Disk Usage plugin written to be used with Netsaint and NRPE
+# License: GPL
+# Copyright (c) 2000 Jeremy Hanmer (jeremy@newdream.net)
+#
+# Last Modified: 10/23/00
+#
+# Information: Basically, I wrote this because I had to deal with large numbers of
+# machines with a wide range of disk configurations, and with dynamically mounted
+# partitions. The basic check_disk plugin relied on a static configuration file which
+# doesn't lend itself to being used in a heterogeneous environnment (especially when
+# you can't guarantee that the devices listed in the configuration file will be mounted).
+#
+# Bugs: Currently, this plugin only works on EXT2 partitions (although it's easy to change).
+#
+# Command Line: diskusage.pl <warning percentage> <critical percentage>
+#
+# Tested Systems: Mandrake 7.1/Intel, Debian 2.2/Intel, Debian 2.1/Intel
+#
+# License Information:
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+#
+#*******************************************************************************************
+
+
use strict;

# Warning and critical thresholds (percentages) from the command line:
#   diskusage.pl <warning percentage> <critical percentage>
my $wrn = shift @ARGV;
my $crt = shift @ARGV;
my $output;
my $result = 0;
my $warn = 0;
my $crit = 0;
my @parts;
my $hostname = `hostname`;
chomp $hostname;
# List mounted filesystems, excluding fuse mounts and snap loopbacks.
# The pattern must be single-quoted for the shell and the backslash doubled
# for Perl so grep receives the BRE 'fuse\|/snap'. The previous form
# (`grep -vi fuse\|/snap`) collapsed to an unquoted pipe in the shell,
# which piped grep's output into /snap as a command and excluded nothing.
@parts = `mount | grep -vi 'fuse\\|/snap'`;

for (@parts) {
    # mount output: "<dev> on <mountpoint> type <fstype> (<options>)"
    my ($dev,$on,$mount,$tp,$type,$options) = split(/\s+/,$_);
    next if ($type eq 'nfs' && !($hostname eq 'zartan'));
    next if ($type eq 'proc' || $type eq 'devpts');
    my @df= `df -k $mount`;
    my @df_inode = `df -i $mount`;
    # Drop the header line from each df report.
    shift @df;
    shift @df_inode;
    # Block usage: warn between $wrn and $crt; critical at/above $crt,
    # except mounts under /mnt/ which are never escalated to critical.
    for(@df) {
        my ($dev1,$blocks,$used,$free,$pc,$mount) = split(/\s+/,$_);
        my ($percent,$blah) = split(/\%/,$pc);
        if ( ($percent >= $wrn ) && (!($percent >= $crt) || ($mount =~ m/\/mnt\//)) ) {
            $output .= "$mount is at $pc ";
            $warn = 1;
        }
        if ( ($percent >= $crt ) && !($mount =~ m/\/mnt\//) ){
            # First critical finding discards any warning-only text.
            $output = "" unless $crit eq '1';
            $output .= "$mount is at $pc ";
            $crit = 1;
        }
    }
    # Inode usage: same thresholds as block usage.
    for(@df_inode) {
        my ($dev1,$inodes,$used,$free,$pc,$mount) = split(/\s+/,$_);
        my ($percent,$blah) = split(/\%/,$pc);
        if ( ($percent >= $wrn ) && (!($percent >= $crt) ) ) {
            $output .= "$mount is at $pc inode usage ";
            $warn = 1;
        }
        if ( ($percent >= $crt ) && !($mount =~ m/\/mnt\//) ){
            $output = "" unless $crit eq '1';
            $output .= "$mount is at $pc inode usage ";
            $crit = 1;
        }
    }
}

# NOTE(review): only critical findings are ever printed or reflected in the
# exit code; the warning-only path was deliberately disabled long ago (the
# old code is in version history), so $warn is accumulated but unused.
if ( $crit eq '1' ) {
    print "$output\n";
    $result = 2;
}
else {
    print "Disks are OK now\n";
}

exit $result;
--- /dev/null
+#!/usr/bin/perl
+
+# {{ ansible_managed }}
+
use strict;

# Exit-state flags: $warn/$crit select the Nagios return code at the end.
my $warn;
my $crit;
my $out;       # summary line printed at the end

my @out;       # per-finding status messages, joined for the summary
my $devices;   # count of RAID/logical devices inspected
my $pci;       # lspci RAID-controller lines, used to pick a code path below
my $scsi;      # lspci SCSI-controller line, used for the mpt-status check
my $derp;      # NOTE(review): never assigned anywhere in this script

# Probe the PCI bus for RAID/SCSI controllers; PATA controllers are ignored.
$pci = `/usr/bin/lspci | /bin/grep -i raid | /bin/grep -v PATA | /usr/bin/head -2`;
$scsi = `/usr/bin/lspci | /bin/grep -i scsi | /bin/grep -v PATA | /usr/bin/head -1`;
+
# software raid!
if (-e "/proc/mdstat") {
    # check software raid!
#    open(R,"/tmp/mdstat");
    open(R,"/proc/mdstat");
    while (<R>) {
        # Array header line, e.g. "md0 : active raid1 sda1[0] sdb1[1]"
        if (/^(md\d+) : (\w+)/) {
            my $dev = $1;
            my $status = $2;
            my $rest = <R>;
            $devices++;

            # $rest holds e.g. "[2/2] [UU]"; an underscore in the state
            # block marks a failed/missing member.
            my ($disks,$states) = $rest =~ /(\[.*\]) (\[.*\])/;
            # NOTE(review): "my $x .= ..." appends to a freshly declared
            # (undef) variable -- it works but is unconventional.
            my $mout .= "$dev is $status $disks $states" if $states =~ /_/;

            # recovery?
            my $next = <R>; # possibly recovery?
            if ($next =~ / recovery = /) {
                my ($progress,$per) = $next =~ /(\[.*\])\s+recovery =\s+(\S+%)/;
                $mout .= " recovery $per";
                my $next = <R>;
                if (my ($finish,$speed) = $next =~ /finish=(.*)min speed=(.*)\/sec/) {
                    $mout .= " finish $finish min";
                }
                $warn = 1;
            } elsif ($next =~ / resync = /) {
                my ($progress,$per) = $next =~ /(\[.*\])\s+resync =\s+(\S+%)/;
                $mout .= " resync $per";
                if (my ($finish,$speed) = $next =~ /finish=(.*)min speed=(.*)\/sec/) {
                    $mout .= " finish $finish min";
                }
                $warn = 1;
            } elsif ($states =~ /_/) { # not all U
                # Degraded and not rebuilding: critical.
                $crit = 1;
            }

            push( @out, $mout ) if $mout;
        }
    }
}
+
+
# mylex raid!
# Reads the Mylex DAC960 driver's /proc/rd interface; only digs into the
# per-controller status when the overall status file is not "OK".
if ($pci =~ /Mylex/i) {
#if (1) {
    my $s = `cat /proc/rd/status`;
    chomp($s);
    unless ($s =~ /OK/) {
        my @myinfo;
        for my $ctl (`ls -d /proc/rd/c*`) {
#        for my $ctl ('/proc/rd/c0') {
            chomp $ctl;
            my %bad;   # logical drives seen, minus those known to be rebuilding
            my ($c) = $ctl =~ /\/(c\d)$/;
            open(S,"$ctl/current_status") || print "can't open $ctl/current_status\n";;
#            open(S,"/tmp/mylex.bad");
            my $lastdevice;
            while (<S>) {
                # disk status
                if (/^ (\d:\d) Vendor/) {
                    $lastdevice = $1;
                }
                if (/ Disk Status: (\S+),/) {
                    if ($1 ne 'Online') {
                        push( @myinfo, "$c disk $lastdevice $1");
                    }
                }

                # logical drives
                if (/ (\/dev\/rd\/\S+): (\S+), (\w+),/) {
                    my $dev = $1;
                    my $type = $2;
                    my $status = $3;
                    $devices++;
                    $bad{$dev} = 1;
                    if ($status ne 'Online') {
                        push( @myinfo, "$dev ($type) $status");
                    }
                }

                # rebuild?
                if (/ Rebuild in Progress: .* \((\S+)\) (\d+%) completed/) {
                    push( @myinfo, "$1 rebuild $2 complete" );
                    delete $bad{$1};
                }
            }
            if (keys %bad) {
                $crit = 1; # at least 1 is failed and !recovering
            } else {
                $warn = 1; # all are recovering
            }
        }

        push( @out, "Mylex $s: " . join(', ',@myinfo)) if @myinfo;
    }
}
+
+
# icp vortex raid!
# NOTE(review): despite the comment this matches "intel" in the lspci
# output, not "vortex" -- confirm this is intentional for this hardware.
# It parses the gdth driver's /proc/scsi/gdth/<host> files section by
# section (Controller / Host / Logical / Array ...).
if ( $pci =~ /intel/i) {
    opendir(D,"/proc/scsi/gdth");
    my @dev = readdir(D);
    closedir D;
    my @vortex;
    for my $dev (@dev) {
        next if $dev =~ /^\./;
        my $read = `cat /proc/scsi/gdth/$dev`;
        # my $read = `cat /tmp/asdf9.warn`;
        my $cur; # Logical | Physical | Host | Array
        my @myinfo;
#        print "dev $dev\n";
        for $_ (split(/\n/,$read)) {
            chomp;
            if (/^\w/) {
                # new section
                ($cur) = /^(\w+)/;
#                print "cur = $cur\n";
                next;
            }
            if ($cur eq 'Logical') {
                my ($num,$status) = /Number:\s+(\d+)\s+Status:\s+(\w+)/;
                next unless $status;
                if ($status ne 'ok') {
                    $warn = 1;
                    #push( @myinfo, "Logical #$num $status" );
                    unshift( @myinfo, "Logical #$num $status" );
                }
            }
            if ($cur eq 'Array') {
                my ($num,$status) = /Number:\s+(\d+)\s+Status:\s+(\w+)/;
                next unless $status;
                if ($status ne 'ready') {
                    $warn = 1;
                    #push( @myinfo, "Array #$num $status" );
                    unshift( @myinfo, "Array #$num $status" );
                }
            }
            if ($cur eq 'Host') {
                if (/Number/) {
                    $devices++;
                }
            }
            if ($cur eq 'Controller') {
                # push( @myinfo, $_ );
                unshift( @myinfo, $_ );
            }
        }

        if (@myinfo) {
            # push( @vortex, "dev $dev: " . join(', ', @myinfo) );
            # unshift( @vortex, "dev $dev: " . join(', ', @myinfo) );
            # Only the first five findings per device are reported.
            push( @vortex, "dev $dev: " . join(', ', $myinfo[0], $myinfo[1], $myinfo[2], $myinfo[3], $myinfo[4] ) );
            # $warn = 1;
        }
    }

    if (@vortex) {
        # push( @out, 'Vortex: ' . join('. ', @vortex) );
        push( @out, 'Vortex: ' . join('. ', @vortex) );
    }
}
# SAS megaraid
# Parses `megacli -LDInfo -lall -a0`; any logical drive whose State is not
# "Optimal" is investigated further (rebuild in progress vs. hard failure).
if ( $pci =~ /LSI\ Logic/i) {
    my $read = `/usr/bin/sudo /usr/sbin/megacli -LDInfo -lall -a0`;
    for $_ (split(/\n/,$read)) {
        chomp;
        # The line we care about is State: Optimal, if we don't have that, we've problems
        if ($_ =~/^State\s*\:\s*(.*)/m) {
            $devices++;
            #/^State\?:\s?(\w+)/;
            my $state = $1;
            next unless $state;
            if ($state ne 'Optimal') {
                my $rebuild = `/usr/bin/sudo /usr/sbin/megacli -PDList -a0 | /bin/grep -i firmware`;
                if ( $rebuild =~ /Rebuild/i) {
                    # Derive the Enclosure:Slot id of the rebuilding drive so
                    # we can ask megacli for its rebuild progress.
                    my $enclosure = `/usr/bin/sudo /usr/sbin/megacli -PDList -a0 | /bin/grep -B15 Rebuild | /bin/grep -e Enclosure -e Slot | /usr/bin/cut -d':' -f2 | /usr/bin/awk '{printf \$1\":\"}' | /usr/bin/awk -F ":" '{printf \$1":"\$2}'`;
                    #my $rebuildstatus = `/usr/bin/sudo /usr/sbin/megacli -PDRbld -ShowProg -PhysDrv\[$enclosure\] -a0 | /bin/grep -i rebuild`;
                    my $rebuildstatus = `/usr/bin/sudo /usr/sbin/megacli -PDRbld -ShowProg -PhysDrv\[$enclosure\] -a0 | /bin/egrep -i \'\(rebuild\|not found\)\'`;
                    if ($rebuildstatus =~ /not found/m) {
                        # check by device id instead of enclosure id if we get a not found error above
                        $enclosure = `/usr/bin/sudo /usr/sbin/megacli -PDList -a0 | /bin/grep -B15 Rebuild | /bin/grep -e Enclosure -e Slot | /bin/grep -v position | /usr/bin/cut -d':' -f2 | /usr/bin/awk '{printf \$1\":\"}' | /usr/bin/awk -F ":" '{printf \$1":"\$2}'`;
                        $rebuildstatus = `/usr/bin/sudo /usr/sbin/megacli -PDRbld -ShowProg -PhysDrv\[$enclosure\] -a0 | /bin/grep -i rebuild`;
                    }
                    for $_ ($rebuildstatus) {
                        $crit = 1;
                        push(@out,$_);
                    }
                } else {
                    # Not rebuilding: report which virtual drive failed.
                    $crit = 1;
                    my $virtual=`/usr/bin/sudo /usr/sbin/megacli -LDInfo -lall -a0 | grep -i failed -B6 | grep -i virtual | cut -d'(' -f1`;
                    push(@out, $virtual, $_);
                }
            }
        }
        # Should to catch the syntax or permissions errors this thing spits out
        if (/ERROR/i) {
            $crit = 1;
            push(@out, $_);
            foreach my $k (@out)
            {
                print $_;
            }
        }
    }
}
+
# e3ware
# 3ware controllers: enumerate controllers with `tw_cli show`, then check
# each controller's units (u*) and physical ports (p*) for non-OK status.
if ( $pci =~ /3ware/i) {
    open(CLI,"/usr/bin/sudo /usr/sbin/tw_cli show|");
    #my $read = `/usr/sbin/megacli -LDInfo -l0 -a0`;

    $devices++;
    my @controllers;
    while (<CLI>) {
        if ( $_ =~ /^c[0-9]/ ) {
            my ($c) = split(/\s+/,$_);
            push(@controllers,$c);
        }
    }
    close(CLI);

    foreach my $cont (@controllers) {
        open(CLI,"/usr/bin/sudo /usr/sbin/tw_cli /$cont show|");
        while (<CLI>) {
            # Unit (logical volume) status lines.
            if ( $_ =~ /^u[0-9]+/ ) {
                my @info = split(/\s+/,$_);
                if ( $info[2] ne 'OK' ) {
                    if ( $info[2] =~ /REBUILDING/i) {
                        # Report rebuild progress for the degraded unit.
                        my $rebuildstatus = `/usr/bin/sudo /usr/sbin/tw_cli /$cont/$info[0] show | /bin/grep REBUILD | /bin/grep -v RAID-10`;
                        for $_ ($rebuildstatus) {
                            $crit = 1;
                            push(@out,$_);
                        }
                    } else {
                        $crit = 1;
                        push(@out,$_);
                    }
                }
            }
            # Physical port status lines.
            if ( $_ =~ /^p[0-9]+/ ) {
                my @info = split(/\s+/,$_);
                if ( $info[1] ne 'OK' ) {
                    $crit = 1;
                    push(@out,$_);
                }
            }
        }
    }
}
+
#Areca
# Volume-set lines from `cli64 vsf info` start with two spaces and an index;
# anything not reporting "Normal" is critical.
if ( $pci =~ /areca/i) {
    open(CLI,"sudo /usr/sbin/cli64 vsf info|");
    while (<CLI>) {
        if ( $_ =~ /^\ \ [0-9]+/ ) {
            $devices++;
            # NOTE(review): @info is populated but unused here.
            my @info = split(/\s+/,$_);
            if ( $_ !~ /Normal/i) {
                $crit = 1;
                push(@out,$_);
            }
        }
    }
}
+
# LSI mpt (non-megaraid) controllers: the first line of `mpt-status` is an
# "ioc..." summary; field 10 must read "OPTIMAL," or we flag critical.
if ( $scsi =~ /LSI Logic/i) {
    open(CLI,"sudo /usr/sbin/mpt-status | /usr/bin/head -1 |");
    $devices++;
    while (<CLI>) {
        if ( $_ =~ /^ioc/ ) {
            my @info = split(/\s+/,$_);
            if ( $info[10] ne 'OPTIMAL,' ) {
                $crit = 1;
                push(@out,$_);
            }
        }
    }
}
+
# show results: map the accumulated flags to the Nagios exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL; critical wins).
my $result = 0;
$result = 1 if $warn;
$result = 2 if $crit;
# (A stray `print $derp;` was removed here -- $derp is never assigned
# anywhere in this script, so it only printed an uninitialized value.)
# Summary line: the joined findings if any, otherwise an all-clear, or a
# "none found" notice that includes the lspci probe output for debugging.
my $out = "No raid devices found $pci";
$out = "All $devices raid devices happy as clams" if $devices;
if (@out) {
    $out = join('; ', @out);
}

print "$out\n";
exit $result;
--- /dev/null
#!/bin/bash
# Description: Bash script to check drive health using pending, uncorrectable,
# and reallocated sector count
#
# Nagios return codes: 0 = OK; 1 = WARNING; 2 = CRITICAL; 3 = UNKNOWN
#
# See https://en.wikipedia.org/wiki/S.M.A.R.T.#ATA_S.M.A.R.T._attributes

### Define global variables ###
# total number of drives (or RAID slots) discovered
numdrives=0
# Number of failed, failing, and/or missing drives
failingdrives=0
# Fallback message for UNKNOWN return code output
unknownmsg="Unknown error"
# Return code for nagios (Default to SUCCESS)
rc=0
# Location of nvme-cli executable
nvmecli="/usr/sbin/nvme"
# Array of messages indicating drive health. Output after nagios status.
# ("raid" and "nvme" globals are set later by preflight.)
declare -a messages
+
+### Functions ###
# Entry point: run the hardware checks, then emit the Nagios status line,
# the sorted per-drive messages, and exit with the Nagios return code.
main ()
{
    preflight

    # Dispatch to the appropriate SMART collector(s) based on preflight.
    case "$raid" in
        true)
            areca_smart
            areca_failed
            ;;
        false)
            normal_smart
            ;;
        *)
            echo "ERROR - Could not determine if RAID present"
            exit 3
            ;;
    esac

    if [ "$nvme" = true ]
    then
        nvme_smart
    fi

    ## Return UNKNOWN if no drives found
    if [ "$numdrives" -eq "0" ]
    then
        unknownmsg="No drives found!"
        rc=3
    fi

    ## Return code and service status for nagios
    case "$rc" in
        0) echo "OK - All $numdrives drives healthy" ;;
        1) echo "WARNING - $failingdrives of $numdrives drives sick" ;;
        2) echo "CRITICAL - $failingdrives of $numdrives drives need replacing" ;;
        3) echo "UNKNOWN - $unknownmsg" ;;
        *) echo "ERROR - Got no return code" ;;
    esac

    ## Iterate through array of messages
    # Nagios reads and displays the first line of output on the Services page.
    # All individual messages about failed/failing disk statistics can be viewed
    # on the individual system's SMART detail page in nagios.
    readarray -t sorted < <(for msg in "${messages[@]}"; do echo "$msg"; done | sort)
    for msg in "${sorted[@]}"; do
        echo "$msg"
    done

    exit $rc
}
+
+# Pre-flight checks
# Pre-flight checks: detect an Areca RAID controller and NVMe devices, and
# verify the tooling each code path needs is present.
# Sets globals: raid (true/false) and nvme (true/false).
preflight ()
{
    # Set raid var then check for cli64 command and bail if missing
    if lspci | grep -qi areca
    then
        raid=true
    else
        raid=false
    fi

    if [ "$raid" = true ] && ! [ -x "$(command -v cli64)" ]
    then
        echo "ERROR - cli64 command not found or is not executable"
        exit 3
    fi

    # Check for smartmontools and bail if missing
    if ! [ -x "$(command -v smartctl)" ]
    then
        echo "ERROR - smartctl is not installed or is not executable"
        echo "yum/apt-get install smartmontools"
        exit 3
    fi

    # Check for nvme devices and nvme-cli executable
    # (grep reads the file directly; the old "cat | grep" was a useless
    # use of cat)
    if grep -q nvme /proc/partitions
    then
        nvme=true
        if ! [ -x "$nvmecli" ]
        then
            echo "ERROR - NVMe Device detected but no nvme-cli executable"
            exit 3
        fi
    else
        # Explicit default so later '"$nvme" = true' tests never compare
        # against an unset variable
        nvme=false
    fi
}
+
+# Gather smart data for drives behind Areca RAID controller
# Gather smart data for drives behind Areca RAID controller
# For each populated, non-failed slot, reads SMART attributes 5/197/198
# (reallocated / pending / uncorrectable sectors) via cli64 and escalates
# rc and appends to messages accordingly.
areca_smart ()
{
    # Store output of cli64 to reduce repeated executions
    cli64out=$(sudo cli64 disk info | grep -E "Slot#[[:digit:]]")
    # Loop through all disks not marked as 'N.A.' or 'Failed'
    for slot in $(echo "$cli64out" | grep -v 'N.A.\|Failed' \
        | grep -o "Slot#[[:digit:]]" | cut -c6-)
    do
        let "numdrives+=1"
        failed=false
        # Determine if disk is JBOD or part of hardware RAID
        if echo "$cli64out" | grep -E "Slot#$slot" | grep -q 'JBOD'
        then
            jbod=true
        else
            jbod=false
        fi
        # Pull attributes 5, 197, 198 (second-to-last column) as one line.
        output=$(sudo cli64 disk smart drv=$slot \
            | grep -E "^ "5"|^"197"|^"198"" | awk '{ print $(NF-1) }' | tr '\n' ' ')
        outputcount=$(echo $output | wc -w)
        # Only continue if we received 3 SMART data points
        if [ "$outputcount" = "3" ]
        then
            # Only do slot to drive letter matching once per bad JBOD
            if [[ $output != "0 0 0 " ]] && [ "$jbod" = true ]
            then
                dl=$(areca_bay_to_letter $slot)
            elif [ "$jbod" = false ]
            then
                dl="(RAID)"
            fi
            read reallocated pending uncorrect <<< $output
            if [ "$reallocated" != "0" ]
            then
                messages+=("Drive $slot $dl has $reallocated reallocated sectors")
                failed=true
                # A small number of reallocated sectors is OK
                # Don't set rc to WARN if we were already CRIT from previous drive
                if [ "$reallocated" -le 5 ] && [ "$rc" != 2 ]
                then
                    rc=1 # Warn if <= 5
                else
                    rc=2 # Crit if >5
                fi
            fi
            if [ "$pending" != "0" ]
            then
                messages+=("Drive $slot $dl has $pending pending sectors")
                failed=true
                rc=2
            fi
            if [ "$uncorrect" != "0" ]
            then
                messages+=("Drive $slot $dl has $uncorrect uncorrect sectors")
                failed=true
                rc=2
            fi
        else
            messages+=("Drive $slot returned $outputcount of 3 expected attributes")
            unknownmsg="SMART data could not be read for one or more drives"
            rc=3
        fi
        # Make sure drives with multiple types of bad sectors only get counted once
        if [ "$failed" = true ]
        then
            let "failingdrives+=1"
        fi
    done
}
+
+# Correlate Areca drive bay to drive letter
# Correlate Areca drive bay to drive letter.
# $1: Areca slot number. Echoes "(sdX)" for the matching block device, or
# nothing if no serial-number match is found.
areca_bay_to_letter ()
{
    # Get S/N according to RAID controller given argument $1 (slot #)
    areca_serial=$(sudo cli64 disk info drv="$1" | grep 'Serial Number' \
        | awk '{ print $NF }')
    # Loop through and get S/N according to smartctl given drive name
    # (grep reads /proc/partitions directly; the old "cat | grep" was a
    # useless use of cat)
    for dl in $(grep -w 'sd[a-z]\|sd[a-z]\{2\}' /proc/partitions \
        | awk '{ print $NF }')
    do
        # NOTE(review): smartctl prints "Serial Number" for ATA drives but
        # "Serial number" for SCSI -- this matches the lowercase form only;
        # confirm against the drives in use.
        smart_serial=$(sudo smartctl -a "/dev/$dl" | grep "Serial number" \
            | awk '{ print $NF }')
        # If cli64 and smartctl find a S/N match, return drive letter
        if [ "$areca_serial" = "$smart_serial" ]
        then
            echo "($dl)"
        fi
    done
}
+
+# Tally missing and failed drives connected to Areca RAID
# Tally missing and failed drives connected to Areca RAID
# Any slot reported as 'N.A.' (missing) or 'Failed' is critical.
areca_failed ()
{
    # Store output of cli64 to reduce repeated executions
    cli64out=$(sudo cli64 disk info | grep -E "Slot#[[:digit:]]")
    # Missing (N.A.) drives
    for drive in $(echo "$cli64out" | grep -E "Slot#[[:digit:]]" \
        | grep "N.A." | awk '{ print $1 }')
    do
        messages+=("Drive $drive is missing")
        let "failingdrives+=1"
        rc=2
    done
    # Hard failed drives
    for drive in $(echo "$cli64out" | grep -E "Slot#[[:digit:]]" \
        | grep 'Failed' | awk '{ print $1 }')
    do
        messages+=("Drive $drive failed")
        let "failingdrives+=1"
        rc=2
    done
}
+
+# Standard SATA/SAS drive smartctl check
# Standard SATA/SAS drive smartctl check
# Walks every sdX/sdXY device, reads its SMART attribute table, and
# escalates rc / appends messages for each worrisome attribute.
normal_smart ()
{
    # The grep regex will include drives named sdaa, for example
    for l in $(cat /proc/partitions | grep -w 'sd[a-z]\|sd[a-z]\{2\}' \
        | awk '{ print $NF }')
    do
        let "numdrives+=1"
        failed=false
        # The general consensus online is that some SMART attributes are less
        # worrisome when it comes to SSDs (e.g., Reallocated_Sector_Ct)
        if sudo smartctl -i /dev/$l | grep -q 'Solid State Device'; then
            is_ssd=true
        else
            is_ssd=false
        fi
        # Attribute table with hex IDs; every attribute row starts with '0'.
        output=$(sudo smartctl -f hex -A /dev/$l | grep '^0')
        # This block is mainly for the SAS drives in the reesi since they
        # don't report regular SMART attributes
        # ($? here is the exit status of the command substitution above.)
        if [ $? != 0 ]; then
            if output=$(sudo smartctl -l error /dev/$l | grep '^read:\|^write:'); then
                uncorrect_read=$(echo "$output" | grep '^read:' | awk '{print $NF}')
                uncorrect_write=$(echo "$output" | grep '^write:' | awk '{print $NF}')
                if [ "$uncorrect_read" != "0" ]; then
                    messages+=("Drive $l reports $uncorrect_read uncorrected read errors")
                    failed=true
                    rc=2
                fi
                if [ "$uncorrect_write" != "0" ]; then
                    messages+=("Drive $l reports $uncorrect_write uncorrected write errors")
                    failed=true
                    rc=2
                fi
            # The SSDs in the bruuni just straight up say failed with no additional detail
            elif sudo smartctl -a /dev/$l | grep -q "FAILED!"; then
                messages+=("Drive $l ($(get_serial $l)) has completely failed")
                failed=true
                rc=2
            else
                messages+=("No SMART data found for drive $l")
                failed=true
                rc=3
            fi
        fi
        # 0x05 (5) Reallocated_Sector_Ct
        if echo "$output" | grep -q '^0x05'; then
            reallocated=$(echo "$output" | grep '^0x05' | awk '{print $NF}')
            if [ "$reallocated" != "0" ] && [ $is_ssd = false ]; then
                messages+=("Drive $l ($(get_serial $l)) has $reallocated reallocated sectors")
                failed=true
                # A small number of reallocated sectors is OK
                # Don't set rc to WARN if we were already CRIT from previous drive
                if [ $reallocated -le 5 ] && [ "$rc" -lt 2 ]
                then
                    rc=1 # Warn if <= 5
                else
                    rc=2 # Crit if >5
                fi
            fi
        fi
        # 0xbb (187) Reported_Uncorrect
        if echo "$output" | grep -q '^0xbb'; then
            uncorrect=$(echo "$output" | grep '^0xbb' | awk '{print $NF}')
            if [ "$uncorrect" != "0" ]; then
                messages+=("Drive $l ($(get_serial $l)) had $uncorrect reported uncorrect sectors")
                failed=true
                rc=2
            fi
        fi
        # 0xc4 (196) Reallocated_Event_Count
        if echo "$output" | grep -q '^0xc4'; then
            reallocatedevents=$(echo "$output" | grep '^0xc4' | awk '{print $NF}')
            if [ "$reallocatedevents" != "0" ]; then
                messages+=("Drive $l ($(get_serial $l)) has $reallocatedevents reallocated events")
                failed=true
                rc=2
            fi
        fi
        # 0xc5 (197) Current_Pending_Sector
        if echo "$output" | grep -q '^0xc5'; then
            pending=$(echo "$output" | grep '^0xc5' | awk '{print $NF}')
            if [ "$pending" != "0" ]; then
                messages+=("Drive $l ($(get_serial $l)) has $pending pending sectors")
                failed=true
                rc=2
            fi
        fi
        # 0xc6 (198) Offline_Uncorrectable
        if echo "$output" | grep -q '^0xc6'; then
            uncorrect=$(echo "$output" | grep '^0xc6' | awk '{print $NF}')
            if [ "$uncorrect" != "0" ]; then
                messages+=("Drive $l ($(get_serial $l)) has $uncorrect uncorrect sectors")
                failed=true
                rc=2
            fi
        fi
        # 0xe9 (233) Media_Wearout_Indicator
        if echo -e "$output" | grep -q '^0xe9'; then
            wearout=$(echo "$output" | grep '^0xe9' | awk '{print $NF}')
            if [ "$wearout" == "1" ]; then
                messages+=("Drive $l ($(get_serial $l)) has exhausted its Media_Wearout_Indicator")
                failed=true
                # Don't set rc to WARN if we were already CRIT from previous drive
                if [ "$rc" != 2 ]
                then
                    rc=1
                else
                    rc=2
                fi
            fi
        fi
        # Make sure drives with multiple types of bad sectors only get counted once
        if [ "$failed" = true ]
        then
            let "failingdrives+=1"
        fi
    done
}
+
+nvme_smart ()
+{
+ # Check SMART health for every NVMe device reported by nvme-cli ($nvmecli).
+ # Reads/updates globals: numdrives, failingdrives, messages, rc, unknownmsg.
+ # Loop through NVMe devices
+ for nvmedisk in $(sudo $nvmecli list | grep nvme | awk '{ print $1 }')
+ do
+ # Include NVMe devices in overall drive count
+ let "numdrives+=1"
+ failed=false
+ # Clear output variable from any previous disk checks
+ output=""
+ # Collect the four attributes of interest as a space-separated list
+ output=$(sudo $nvmecli smart-log $nvmedisk | \
+ grep -E '^critical_warning|^percentage_used|^media_errors|^num_err_log_entries' \
+ | awk '{ print $NF }' | sed 's/%//' | tr '\n' ' ')
+ outputcount=$(echo $output | wc -w)
+ # Only continue if we received 4 SMART data points
+ if [ "$outputcount" = "4" ]
+ then
+ read critical_warning percentage_used media_errors num_err_log_entries <<< $output
+ # Check for critical warnings
+ if [ "$critical_warning" != "0" ]
+ then
+ # (fixed: this message previously referenced an undefined $nvmedrive)
+ messages+=("$nvmedisk indicates there is a critical warning")
+ failed=true
+ rc=1
+ fi
+ # Alert if >= 90% of manufacturer predicted life consumed
+ if [ "$percentage_used" -ge 90 ] && [ "$percentage_used" -lt 100 ]
+ then
+ messages+=("$nvmedisk has estimated $(expr 100 - $percentage_used)% life remaining")
+ failed=true
+ rc=1 # Warn if >= 90 and < 100
+ elif [ "$percentage_used" -ge 100 ]
+ then
+ messages+=("$nvmedisk has consumed $percentage_used% of its estimated life")
+ failed=true
+ rc=2 # Crit if >= 100
+ fi
+ # Check for media errors
+ if [ "$media_errors" != "0" ]
+ then
+ messages+=("$nvmedisk indicates there are $media_errors media errors")
+ failed=true
+ rc=2
+ fi
+ # Check for error log entries
+# This doesn't appear to be a useful or reliable method of measuring NVMe health.
+# I've done a bunch of research and haven't been able to find much of anything
+# about this metric. On top of that, all our new reesi NVMe indicate errors but
+# there's nothing in the error-logs so I'm commenting this for now.
+# if [ "$num_err_log_entries" != "0" ]
+# then
+# messages+=("$nvmedisk indicates there are $num_err_log_entries error log entries")
+# failed=true
+# rc=2
+# fi
+ else
+ # We got something other than the 4 expected data points: report UNKNOWN.
+ # (The previous "elif outputcount != 4 ... else" chain left its final
+ # else branch unreachable, so it has been folded into this branch.)
+ messages+=("$nvmedisk returned $outputcount of 4 expected attributes")
+ unknownmsg="SMART data could not be read for one or more drives"
+ rc=3
+ fi
+ # Make sure NVMe devices with more than one type of error only get counted once
+ if [ "$failed" = true ]
+ then
+ let "failingdrives+=1"
+ fi
+ done
+}
+
+get_serial() {
+ # Print the serial number smartctl reports for /dev/$1, or a
+ # placeholder when no serial can be read (e.g. drive behind a RAID
+ # controller).
+ local sn
+ sn=$(sudo smartctl -i /dev/$1 | awk '/Serial Number:/ { print $3 }')
+ if [ -z "$sn" ]; then
+ echo "S/N unknown"
+ else
+ echo $sn
+ fi
+}
+
+## Call main() function
+main
--- /dev/null
+#!/bin/bash
+# Source: https://github.com/whereisaaron/linux-check-mem-nagios-plugin
+
+if [ "$1" = "-w" ] && [ "$2" -gt "0" ] && [ "$3" = "-c" ] && [ "$4" -gt "0" ]; then
+
+ freem=`free -m | grep Mem`
+ freem_bits=(${freem// / })
+
+ memTotal_m=${freem_bits[1]}
+ memFree_m=${freem_bits[3]}
+ memBuffer_m=${freem_bits[5]}
+ memCache_m=${freem_bits[6]}
+
+ memUsed_m=$(($memTotal_m-$memFree_m-$memBuffer_m-$memCache_m))
+ memUsedPrc=$((($memUsed_m*100)/$memTotal_m))
+
+ warn=$(((($memTotal_m*100)-($memTotal_m*(100-$2)))/100))
+ crit=$(((($memTotal_m*100)-($memTotal_m*(100-$4)))/100))
+
+ memTotal_b=$(($memTotal_m*1024*1024))
+ memFree_b=$(($memFree_m*1024*1024))
+ memUsed_b=$(($memUsed_m*1024*1024))
+ memBuffer_b=$(($memBuffer_m*1024*1024))
+ memCache_b=$(($memCache_m*1024*1024))
+
+ minmax="0;$memTotal_b";
+ data="TOTAL=$memTotal_b;;;$minmax USED=$memUsed_b;$warn;$crit;$minmax CACHE=$memCache_b;;;$minmax BUFFER=$memBuffer_b;;;$minmax"
+
+ if [ "$memUsedPrc" -ge "$4" ]; then
+ echo "MEMORY CRITICAL - Total: $memTotal_m MB - Used: $memUsed_m MB - $memUsedPrc% used!|$data"
+ $(exit 2)
+ elif [ "$memUsedPrc" -ge "$2" ]; then
+ echo "MEMORY WARNING - Total: $memTotal_m MB - Used: $memUsed_m MB - $memUsedPrc% used!|$data"
+ $(exit 1)
+ else
+ echo "MEMORY OK - Total: $memTotal_m MB - Used: $memUsed_m MB - $memUsedPrc% used|$data"
+ $(exit 0)
+ fi
+
+else
+ echo "check_mem v1.3"
+ echo ""
+ echo "Usage:"
+ echo "check_mem.sh -w <warnlevel> -c <critlevel>"
+ echo ""
+ echo "warnlevel and critlevel is percentage value without %"
+ echo ""
+ echo "v1.1 Copyright (C) 2012 Lukasz Gogolin (lukasz.gogolin@gmail.com)"
+ echo "v1.2 Modified 2014 by Aaron Roydhouse (aaron@roydhouse.com)"
+ echo "v1.3 Modified 2015 by Aaron Roydhouse (aaron@roydhouse.com)"
+ exit
+fi
--- /dev/null
+# Local SELinux policy module letting the NRPE daemon (nrpe_t) run the
+# custom RAID/SMART monitoring scripts, which need raw disk access,
+# fsadm utilities, hwdata lookups and scratch files under /tmp.
+module nrpe 1.0;
+
+require {
+	type fsadm_exec_t;
+	type tmp_t;
+	type fixed_disk_device_t;
+	type nrpe_t;
+	type hwdata_t;
+	class capability { dac_read_search sys_admin sys_rawio dac_override };
+	class blk_file { read getattr open ioctl };
+	class unix_dgram_socket sendto;
+	class dir { write remove_name search add_name };
+	class file { execute read create execute_no_trans write getattr unlink
+open };
+}
+
+#============= nrpe_t ==============
+
+# Raw access to whole-disk block devices (smartctl/megacli/nvme)
+allow nrpe_t fixed_disk_device_t:blk_file { read getattr open ioctl };
+# Run filesystem admin tools (fsadm domain) without a domain transition
+allow nrpe_t fsadm_exec_t:file { read execute open getattr execute_no_trans };
+# Look up PCI/vendor IDs in /usr/share/hwdata
+allow nrpe_t hwdata_t:dir search;
+allow nrpe_t hwdata_t:file { read getattr open };
+# Capabilities required for raw SCSI/SATA passthrough ioctls
+allow nrpe_t self:capability { dac_read_search sys_admin dac_override sys_rawio };
+allow nrpe_t self:unix_dgram_socket sendto;
+# Scratch files created by the check scripts under /tmp
+allow nrpe_t tmp_t:dir { write remove_name add_name };
+allow nrpe_t tmp_t:file unlink;
+allow nrpe_t tmp_t:file { write create open };
--- /dev/null
+---
+- name: restart nagios-nrpe-server
+ service:
+ name: "{{ nrpe_service_name }}"
+ state: restarted
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
--- /dev/null
+---
+dependencies:
+ - role: secrets
+ - role: users
+
--- /dev/null
+---
+- name: Get the current timezone
+ shell: cat /etc/timezone
+ register: current_tz
+ changed_when: false
+ tags:
+ - timezone
+
+- name: Set the timezone in /etc/timezone
+ copy:
+ content: "{{ timezone }}"
+ dest: /etc/timezone
+ owner: root
+ group: root
+ mode: 0644
+ # Default is used below to avoid breaking check mode
+ when: current_tz.stdout|default("") != timezone
+ register: etc_timezone
+ tags:
+ - timezone
+
+- name: Inform the OS of the new timezone
+ command: dpkg-reconfigure --frontend noninteractive tzdata
+ when: etc_timezone is changed
+ tags:
+ - timezone
+
+- name: Mask sleep units
+ systemd:
+ name: "{{ item }}"
+ state: stopped
+ masked: yes
+ with_items:
+ - sleep.target
+ - suspend.target
+ - hibernate.target
+ - hybrid-sleep.target
+ when:
+ - ansible_distribution_major_version|int >= 20
+ - not containerized_node
--- /dev/null
+---
+# We use these scripts to check to see if any of our test nodes have bad disks
+
+# Ignore errors in case there are no repos enabled and package install fails
+- name: Make sure smartmontools is installed
+ package:
+ name: smartmontools
+ state: latest
+ ignore_errors: true
+
+- name: Upload megacli and cli64 for raid monitoring and smart.pl to /usr/sbin/.
+ copy:
+ src: "../files/sbin/{{ item }}"
+ dest: "/usr/sbin/{{ item }}"
+ owner: root
+ group: root
+ mode: 0755
+ with_items:
+ - megacli
+ - cli64
+ - nvme
+
+- name: Create /usr/libexec.
+ file:
+ path: /usr/libexec
+ owner: root
+ group: root
+ mode: 0755
+ state: directory
+
+- name: Upload custom netsaint scripts for raid/disk/smart/monitoring to /usr/libexec/.
+ copy:
+ src: "../files/libexec/{{ item }}"
+ dest: "/usr/libexec/{{ item }}"
+ owner: root
+ group: root
+ mode: 0755
+ with_items:
+ - smart.sh
+ - raid.pl
+ - diskusage.pl
--- /dev/null
+---
+- name: Increase the yum timeout.
+ lineinfile:
+ dest: /etc/yum.conf
+ line: "timeout={{ yum_timeout }}"
+ regexp: "^timeout="
+ state: present
+
+- name: Configure epel repos in /etc/yum.repos.d/
+ template:
+ src: yum_repo.j2
+ dest: /etc/yum.repos.d/{{ item.key }}.repo
+ owner: root
+ group: root
+ mode: 0644
+ register: epel_repo
+ with_dict: "{{ epel_repos }}"
+
+- name: Clean yum cache
+ shell: yum clean all
+ when: epel_repo is defined and epel_repo is changed
--- /dev/null
+---
+# Install and Configure a Kerberos client
+
+- name: Install Kerberos Packages (RedHat)
+ package:
+ name: krb5-workstation
+ state: present
+ when: ansible_os_family == 'RedHat'
+
+# See http://tracker.ceph.com/issues/15439
+- name: Clean apt cache
+ command: apt-get clean
+ when: ansible_os_family == 'Debian'
+
+- name: Update apt cache
+ apt:
+ update_cache: yes
+ # Register and retry to work around transient http issues
+ register: apt_cache_update
+ until: apt_cache_update is success
+ # try for 2 minutes before failing
+ retries: 24
+ delay: 5
+ when: ansible_os_family == 'Debian'
+
+- name: Install Kerberos Packages (Debian)
+ apt:
+ name: krb5-user
+ state: present
+ when: ansible_os_family == 'Debian'
+
+- name: Install Kerberos Packages (OpenSUSE Leap)
+ zypper:
+ name: krb5-client
+ state: present
+ when: ansible_os_family == 'Suse'
+
+- name: Add krb5 config file
+ template:
+ src: 'krb5.conf'
+ dest: '/etc/krb5.conf'
+ owner: root
+ group: root
+ mode: 0644
--- /dev/null
+---
+
+# Native YAML module args instead of inline key=value, matching the rest
+# of this role's tasks.
+- name: Log the OS name, version and release
+  debug:
+    msg: "Host {{ inventory_hostname }} is running {{ ansible_distribution }} {{ ansible_distribution_version }} ({{ ansible_distribution_release }})"
+
+# loading major version specific vars
+- name: Including major version specific variables.
+ include_vars: "{{ item }}"
+ with_first_found:
+ - "{{ ansible_distribution | lower }}_{{ ansible_distribution_major_version }}.yml"
+ - empty.yml
+ tags:
+ - vars
+ # We need these vars for the entitlements tag to work
+ - entitlements
+
+# configure things specific to yum systems
+- import_tasks: yum_systems.yml
+ when: ansible_os_family == "RedHat"
+
+# configure things specific to apt systems
+- import_tasks: apt_systems.yml
+ when: ansible_pkg_mgr == "apt"
+
+- import_tasks: zypper_systems.yml
+ when: ansible_pkg_mgr == "zypper"
+
+- name: Set the hardware clock
+ command: hwclock --systohc
+ tags:
+ - timezone
+
+# configure Kerberos
+- import_tasks: kerberos.yml
+ tags:
+ - kerberos
+
+# upload custom disk monitoring scripts
+- import_tasks: disk_monitoring.yml
+ tags:
+ - monitoring-scripts
+ - nagios
+
+# configure nagios (Except CentOS 9 Stream)
+- import_tasks: nagios.yml
+ tags:
+ - nagios
+
+- name: Get SELinux status
+ command: getenforce
+ register: selinux_status
+ when: ansible_os_family == "RedHat"
+ tags:
+ - nagios
+
+# configure selinux for nagios
+- import_tasks: nrpe-selinux.yml
+ when: ansible_os_family == "RedHat" and
+ (selinux_status is defined and selinux_status.stdout != "Disabled")
+ tags:
+ - nagios
+
+- name: include secondary NIC config tasks
+ import_tasks: secondary_nic.yml
+ when: secondary_nic_mac is defined
+ tags:
+ - secondary-nic
--- /dev/null
+---
+- name: "Include {{ ansible_pkg_mgr }}_system vars"
+ include_vars: "{{ ansible_pkg_mgr }}_systems.yml"
+
+# Returns 0 if found and 1 if not found
+# Task fails if not found. Hence ignore_errors: true
+- name: Check for epel
+ shell: "grep -q 'epel' /etc/yum.repos.d/*"
+ register: have_epel
+ no_log: true
+ ignore_errors: true
+ when: ansible_os_family == "RedHat"
+
+# This task is only run when epel isn't present
+- name: Install nrpe without epel
+ package:
+ name: "{{ item }}"
+ state: present
+ with_items:
+ - http://{{ mirror_host }}/lab-extras/rhel7/x86_64/nagios-common-4.0.8-2.el7.x86_64.rpm
+ - http://{{ mirror_host }}/lab-extras/rhel7/x86_64/nrpe-2.15-7.el7.x86_64.rpm
+ - http://{{ mirror_host }}/lab-extras/rhel7/x86_64/nagios-plugins-2.0.3-3.el7.x86_64.rpm
+ - http://{{ mirror_host }}/lab-extras/rhel7/x86_64/nagios-plugins-load-2.0.3-3.el7.x86_64.rpm
+ when:
+ - ansible_os_family == "RedHat"
+ - ansible_distribution_major_version|int <= 7
+ - have_epel.rc == 1
+
+- name: Install nrpe package and dependencies (RHEL/CentOS)
+ package:
+ name: "{{ nrpe_packages|list }}"
+ state: latest
+ enablerepo: epel
+ when:
+ - ansible_os_family == "RedHat"
+ - have_epel.rc == 0
+
+- name: Install nrpe package and dependencies (non-RHEL/CentOS)
+ package:
+ name: "{{ nrpe_packages|list }}"
+ state: latest
+ when: ansible_os_family != "RedHat"
+
+- name: Upload nagios sudoers.d for raid utilities.
+ template:
+ src: nagios/90-nagios
+ dest: /etc/sudoers.d/90-nagios
+ owner: root
+ group: root
+ mode: 0440
+ validate: visudo -cf %s
+
+- name: Upload nagios check_mem script
+ copy:
+ src: nagios/check_mem.sh
+ dest: "{{ nagios_plugins_directory }}/check_mem.sh"
+ owner: root
+ group: root
+ mode: 0755
+
+- name: Configure nagios nrpe settings (Ubuntu)
+ lineinfile:
+ dest: /etc/default/{{ nrpe_service_name }}
+ regexp: "^{{ item }}"
+ line: "{{ item }}=\"--no-ssl\""
+ when: ansible_pkg_mgr == "apt"
+ with_items:
+ - DAEMON_OPTS
+ - NRPE_OPTS
+ notify:
+ - restart nagios-nrpe-server
+
+- name: Configure nagios nrpe settings (RHEL/CentOS)
+  lineinfile:
+    dest: /etc/sysconfig/{{ nrpe_service_name }}
+    regexp: "^NRPE_SSL_OPT"
+    line: "NRPE_SSL_OPT=\"-n\""
+  when: ansible_os_family == "RedHat"
+  # Restart nrpe when its launch options change, consistent with the
+  # Ubuntu task above; otherwise the running daemon keeps stale options.
+  notify:
+    - restart nagios-nrpe-server
+
+- name: Check firewalld status
+ command: systemctl status firewalld
+ register: firewalld
+ ignore_errors: true
+ no_log: true
+ when: ansible_os_family == "RedHat"
+
+- name: Open nrpe port if firewalld enabled
+ firewalld:
+ port: 5666/tcp
+ state: enabled
+ permanent: yes
+ immediate: yes
+ when: ansible_os_family == "RedHat" and (firewalld is defined and firewalld.stdout.find('running') != -1)
+
+- name: Upload nagios nrpe config.
+ template:
+ src: nagios/nrpe.cfg
+ dest: /etc/nagios/nrpe.cfg
+ owner: root
+ group: root
+ mode: 0644
+ notify:
+ - restart nagios-nrpe-server
+
+- name: Make sure nagios nrpe service is running.
+ service:
+ name: "{{ nrpe_service_name }}"
+ enabled: yes
+ state: started
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
--- /dev/null
+---
+- name: nrpe - Install SELinux dependencies
+ package:
+ name: "{{ nrpe_selinux_packages|list }}"
+ state: installed
+
+# ignore_errors in case we don't have any repos
+- name: nrpe - Ensure SELinux policy is up to date
+ package:
+ name: selinux-policy-targeted
+ state: latest
+ ignore_errors: true
+
+- name: nrpe - Set SELinux boolean nagios_run_sudo true
+ seboolean:
+ name: nagios_run_sudo
+ state: yes
+ persistent: yes
+
+- name: nrpe - Remove SELinux policy package
+ command: semodule -r nrpe
+ failed_when: false
+
+- name: nrpe - Copy SELinux type enforcement file
+ copy:
+ src: nagios/nrpe.te
+ dest: /tmp/nrpe.te
+
+- name: nrpe - Compile SELinux module file
+ command: checkmodule -M -m -o /tmp/nrpe.mod /tmp/nrpe.te
+
+- name: nrpe - Build SELinux policy package
+ command: semodule_package -o /tmp/nrpe.pp -m /tmp/nrpe.mod
+
+- name: nrpe - Load SELinux policy package
+ command: semodule -i /tmp/nrpe.pp
+
+- name: nrpe - Remove temporary files
+  # The `file` module does not expand globs, so the previous
+  # "path: /tmp/nrpe.*" never matched anything and the scratch files
+  # were left behind; list them explicitly instead.
+  file:
+    path: "{{ item }}"
+    state: absent
+  with_items:
+    - /tmp/nrpe.te
+    - /tmp/nrpe.mod
+    - /tmp/nrpe.pp
--- /dev/null
+---
+# Register a RHEL-based system with subscription-manager.
+
+- name: Set entitlements_path
+ set_fact:
+ entitlements_path: "{{ secrets_path }}/entitlements.yml"
+
+- name: Include Red Hat encrypted variables.
+ include_vars: "{{ item }}"
+ with_first_found:
+ - "{{ entitlements_path }}"
+ - empty.yml
+ no_log: true
+ tags:
+ - vars
+
+- name: Set have_entitlements
+  # default('') keeps this from raising "undefined variable" when
+  # entitlements.yml was not found (empty.yml fallback above) and the
+  # subscription_manager_* vars are therefore unset.
+  set_fact:
+    have_entitlements: "{{ subscription_manager_org|default('') != '' and subscription_manager_activationkey|default('') != '' }}"
+
+- name: Find existing CA Cert RPMs
+ command: rpm -qa katello-ca-consumer*
+ register: existing_satellite_cert
+ when: use_satellite == true
+
+- name: Uninstall previous CA Certs from Satellite Servers
+ command: rpm -e "{{ existing_satellite_cert.stdout }}"
+ when:
+ - use_satellite == true
+ - existing_satellite_cert.stdout|length>0
+
+- name: Subscription-manager clean
+ command: subscription-manager clean
+ when: use_satellite == true
+
+- name: remove host UUID files
+ file:
+ state: absent
+ path: "{{ item }}"
+ with_items:
+ - /var/lib/dbus/machine-id
+ - /etc/machine-id
+ - /etc/rhsm/facts/dmi_system_uuid.facts
+ - /etc/rhsm/facts/katello.facts
+ - /etc/insights-client/machine-id
+ when: use_satellite == true
+
+- name: Generate new UUID
+ shell: uuidgen
+ register: new_uuid
+ when: use_satellite == true
+
+- name: Run dbus-uuidgen to create /var/lib/dbus/machine-id
+ shell: dbus-uuidgen --ensure
+
+- name: Run systemd-machine-id-setup to set /etc/machine-id
+ shell: systemd-machine-id-setup
+
+- name: Add new UUID to dmi_system_uuid.facts
+ ansible.builtin.lineinfile:
+ path: /etc/rhsm/facts/dmi_system_uuid.facts
+ create: yes
+ line: |
+ WA{"dmi.system.uuid": "{{ new_uuid.stdout }}"}WA
+ when: use_satellite == true
+
+# Strip the 'WA' guard markers added by the lineinfile task above.
+- name: remove 'WA' PREFIX from dmi_system_uuid.facts
+  replace:
+    path: /etc/rhsm/facts/dmi_system_uuid.facts
+    regexp: "WA"
+    replace: ""
+  when: use_satellite == true
+
+- name: Add fqdn to katello.facts
+ ansible.builtin.lineinfile:
+ path: /etc/rhsm/facts/katello.facts
+ create: yes
+ line: |
+ WA{"network.hostname-override": "{{ ansible_fqdn }}"}WA
+ when: use_satellite == true
+
+# Strip the 'WA' guard markers added by the lineinfile task above.
+- name: remove 'WA' PREFIX from katello.facts
+  replace:
+    path: /etc/rhsm/facts/katello.facts
+    regexp: "WA"
+    replace: ""
+  when: use_satellite == true
+
+- name: Install CA Cert from Satellite Server
+ yum:
+ name: "{{ satellite_cert_rpm }}"
+ state: present
+ validate_certs: no
+ disable_gpg_check: yes
+ when: use_satellite == true
+
+# set the releasever cause without it rhel-7-server-rpms repo fails on rhel7.9 machines
+# https://tracker.ceph.com/issues/49771
+# We have to do this here (instead of in testnodes role) because some package transactions fail during the common role.
+# However, we do not want to lock the release ver on all our systems; just testnodes.
+- name: Set the releasever
+ copy:
+ content: "{{ ansible_distribution_version }}"
+ dest: /etc/yum/vars/releasever
+ when: inventory_hostname in groups['testnodes'] and ansible_distribution_version.startswith("7")
+
+- name: Determine if node is registered with subscription-manager.
+ command: subscription-manager identity
+ register: subscription
+ ignore_errors: true
+ changed_when: false
+ no_log: true
+
+- name: Set rhsm_registered if we're already registered
+ set_fact:
+ rhsm_registered: "{{ subscription.rc == 0 }}"
+
+# A `dnf group upgrade base` which happens later in the testnodes role will
+# update a 8.X system to 8.Y. We don't want that to happen because we
+# expect to test on a specific version. set_rhsm_release=true locks a 8.X install to 8.X packages.
+- name: Register with subscription-manager.
+ command: subscription-manager register
+ --activationkey={{ subscription_manager_activationkey }}
+ --org={{ subscription_manager_org }}
+ --name={{ ansible_fqdn }}
+ {% if set_rhsm_release|default(false)|bool == true %}--release={{ ansible_distribution_version }}{% endif %}
+ --force
+ when: rhsm_registered == false and have_entitlements == true
+ register: entitled
+ until: entitled is success
+ retries: 12
+ delay: 10
+ failed_when:
+ - entitled.rc != 0
+
+- name: Set rhsm_registered if we just registered
+ set_fact:
+ rhsm_registered: true
+ when: entitled is success
+
+# Output of this command is, for example:
+# 7.1
+# 7.2
+# 7Server
+- name: List CDN releases available to system
+ shell: "subscription-manager release --list | grep -E '[0-9]'"
+ register: rhsm_release_list
+ changed_when: false
+ failed_when:
+ - rhsm_release_list.rc != 0
+
+- name: Get list of enabled RHSM repos
+ shell: subscription-manager repos --list | grep -B4 'Enabled:.*1' | grep 'Repo ID:' | sed -e 's/Repo ID:\s*\(.*\)/\1/' | sort
+ register: repo_list_cmd
+ when: rhsm_registered == true
+ changed_when: false
+
+- name: Store list of enabled repos
+ set_fact:
+ repo_list: "{{ repo_list_cmd.stdout.split('\n') }}"
+ when: repo_list_cmd is defined and repo_list_cmd is not skipped
+
+- name: Set replace_repos false if entitlements are missing
+ set_fact:
+ replace_repos: false
+ when: have_entitlements == false
+
+- name: Set replace_repos true if rhsm_repos differs from repo_list
+ set_fact:
+ replace_repos: "{{ repo_list|sort != rhsm_repos|sort }}"
+ when: repo_list is defined
+
+- name: Set replace_repos true if newly-subscribed
+ set_fact:
+ replace_repos: true
+ when: rhsm_registered == true and
+ (entitled is changed and entitled.rc == 0)
+
+- name: Disable all rhsm repos
+ command: subscription-manager repos --disable '*'
+ when: rhsm_registered == true and
+ replace_repos|bool == true
+ # This produces an absurd amount of useless output
+ no_log: true
+
+- name: Enable necessary rhsm repos
+ command: subscription-manager repos {% for repo in rhsm_repos|list %}--enable={{ repo }} {% endfor %}
+ when: rhsm_registered == true and
+ replace_repos|bool == true
+ retries: 5
+ delay: 10
+
+# recreate the removed machine-id files to avoid breaking
+# other parts of the system, /bin/install-kernel for instance
+
+- name: Run dbus-uuidgen to create /var/lib/dbus/machine-id
+ shell: dbus-uuidgen --ensure
+
+- name: Run systemd-machine-id-setup to set /etc/machine-id
+ shell: systemd-machine-id-setup
+
+- name: Remove old apt-mirror repository definition.
+ file:
+ path: /etc/yum.repos.d/cd.repo
+ state: absent
+ when: entitled is success
--- /dev/null
+---
+- name: Make sure ethtool is installed (Ubuntu)
+ apt:
+ name: ethtool
+ state: present
+ when: ansible_os_family == 'Debian'
+
+- name: Make sure ethtool is installed (CentOS/RHEL)
+ yum:
+ name: ethtool
+ state: present
+ enablerepo: epel
+ when:
+ - ansible_os_family == 'RedHat'
+ - enable_epel|bool == true
+
+- name: grep ethtool for secondary NIC MAC address
+ shell: "ethtool -P {{ item }} | awk '{ print $3 }' | grep -q -i '{{ secondary_nic_mac }}'"
+ register: ethtool_grep_output
+ with_items: "{{ ansible_interfaces }}"
+ failed_when: false
+ changed_when: false
+
+- name: Define net_to_configure var
+ set_fact:
+ nic_to_configure: "{{ item.item }}"
+ with_items: "{{ ethtool_grep_output.results }}"
+ when: item.rc == 0
+
+- name: "Check if {{ nic_to_configure }} is 10Gb"
+  shell: "ethtool {{ nic_to_configure }} | grep Speed | awk '{ print $2 }'"
+  register: nic_to_configure_speed
+  changed_when: false
+  # Without this guard the task errors out with an undefined variable
+  # when no interface matched secondary_nic_mac above.
+  when: nic_to_configure is defined
+
+# Assume jumbo frames if 10Gb
+- name: Set MTU to 9000 if 10Gb
+ set_fact: mtu=9000
+ when:
+ - mtu is not defined
+ - nic_to_configure_speed is defined
+ - (nic_to_configure_speed.stdout == '10000Mb/s' or nic_to_configure_speed.stdout == '25000Mb/s')
+
+- name: "Write Ubuntu network config for {{ nic_to_configure }}"
+ blockinfile:
+ path: /etc/network/interfaces
+ block: |
+ auto {{ nic_to_configure }}
+ iface {{ nic_to_configure }} inet dhcp
+ register: wrote_network_config
+ when:
+ - nic_to_configure is defined
+ - ansible_os_family == 'Debian'
+
+# Can't set MTU for DHCP interfaces on Ubuntu in /etc/network/interfaces
+- name: Set MTU on Ubuntu
+  shell: "ifconfig {{ nic_to_configure }} mtu {{ mtu|default('1500') }}"
+  when:
+    # Guard against an undefined nic_to_configure (no matching NIC found)
+    - nic_to_configure is defined
+    - ansible_os_family == 'Debian'
+
+- name: "Bounce {{ nic_to_configure }}"
+ shell: "ifdown {{ nic_to_configure }} && ifup {{ nic_to_configure }}"
+ when:
+ - wrote_network_config is changed
+ - ansible_os_family == 'Debian'
+
+- name: "Write RHEL/CentOS network config for {{ nic_to_configure }}"
+ lineinfile:
+ path: "/etc/sysconfig/network-scripts/ifcfg-{{ nic_to_configure }}"
+ create: yes
+ owner: root
+ group: root
+ mode: 0644
+ regexp: "{{ item.regexp }}"
+ line: "{{ item.line }}"
+ register: wrote_network_config
+ with_items:
+ - { regexp: '^DEVICE=', line: 'DEVICE={{ nic_to_configure }}' }
+ - { regexp: '^BOOTPROTO=', line: 'BOOTPROTO=dhcp' }
+ - { regexp: '^ONBOOT=', line: 'ONBOOT=yes' }
+ - { regexp: '^MTU=', line: 'MTU={{ mtu|default("1500") }}' }
+ when:
+ - nic_to_configure is defined
+ - ansible_os_family == 'RedHat'
+
+- name: "Bounce {{ nic_to_configure }}"
+ shell: "ifdown {{ nic_to_configure }}; ifup {{ nic_to_configure }}"
+ when:
+ - wrote_network_config is changed
+ - ansible_os_family == 'RedHat'
--- /dev/null
+---
+- name: Get the current timezone (RHEL/CentOS 6)
+ shell: cut -d'"' -f2 /etc/sysconfig/clock
+ when: ansible_distribution_major_version == "6"
+ register: current_tz
+ changed_when: false
+ tags:
+ - timezone
+
+- name: Get the current timezone (RHEL/CentOS 7)
+ shell: 'timedatectl | grep -E "Time ?zone" | sed -e "s/.*: \(.*\) (.*/\1/"'
+ when: ansible_distribution_major_version|int >= 7
+ register: current_tz
+ changed_when: false
+ tags:
+ - timezone
+
+# See http://tracker.ceph.com/issues/24197
+# If/when we use ansible 2.7, the next two tasks can be replaced with the 'reboot' ansible module
+- name: Reboot RHEL7 to workaround systemd bug
+ shell: "sleep 5 && reboot"
+ async: 1
+ poll: 0
+ when: '"Connection timed out" in current_tz.stderr'
+ tags:
+ - timezone
+
+- name: Wait for reboot in case of systemd workaround
+ wait_for_connection:
+ delay: 40
+ timeout: 300
+ when: '"Connection timed out" in current_tz.stderr'
+ tags:
+ - timezone
+
+- name: Set /etc/localtime (RHEL/CentOS 6)
+ file:
+ src: /usr/share/zoneinfo/{{ timezone }}
+ dest: /etc/localtime
+ state: link
+ force: yes
+ # Default is used below to avoid breaking check mode
+ when: ansible_distribution_major_version == "6" and current_tz.stdout|default("") != timezone
+ tags:
+ - timezone
+
+- name: Set the timezone (RHEL/CentOS >= 7)
+ command: timedatectl set-timezone {{ timezone }}
+ # Default is used below to avoid breaking check mode
+ when: ansible_distribution_major_version|int >= 7 and current_tz.stdout|default("") != timezone
+ tags:
+ - timezone
+
+# This is temporary to provide reverse compatibility with certain
+# tasks that call yum specifically.
+# Should be deprecated once we move to ansible v2
+- name: Install yum on Fedora 22 and later
+ dnf:
+ name: yum
+ state: present
+ when: ansible_distribution == 'Fedora' and ansible_distribution_major_version|int >= 22
+
+# configure Red Hat entitlements with subscription-manager
+# skip_entitlements=true on OVH testnodes
+- import_tasks: rhel-entitlements.yml
+ when:
+ ansible_distribution == 'RedHat' and
+ skip_entitlements|default(false)|bool != true
+ tags:
+ - entitlements
+
+# create and manage epel.repo
+- import_tasks: epel.yml
+ when: ansible_distribution == "CentOS" or ansible_distribution == 'RedHat'
+ tags:
+ - epel
+ - repos
--- /dev/null
+---
+
+- name: Get the current timezone
+ shell: 'timedatectl | grep -E "Time ?zone" | sed -e "s/.*: \(.*\) (.*/\1/"'
+ register: current_tz
+ changed_when: false
+ tags:
+ - timezone
+
+- name: Set the timezone
+ command: timedatectl set-timezone {{ timezone }}
+ when: current_tz.stdout|default("") != timezone
+ tags:
+ - timezone
+
+- name: Add base OpenSUSE Leap repo
+ zypper_repository:
+ name: repo-oss
+ repo: "http://download.opensuse.org/distribution/leap/{{ ansible_distribution_version }}/repo/oss/"
+ state: present
+ auto_import_keys: yes
+
+- name: Add updates OpenSUSE Leap repo
+ zypper_repository:
+ name: repo-update-oss
+ repo: "http://download.opensuse.org/update/leap/{{ ansible_distribution_version }}/oss/"
+ state: present
+ auto_import_keys: yes
+
+- name: Refresh repos
+ zypper_repository:
+ repo: '*'
+ runrefresh: yes
+ auto_import_keys: yes
--- /dev/null
+# {{ ansible_managed }}
+
+[libdefaults]
+ default_realm = {{ kerberos_realm }}
--- /dev/null
+## {{ ansible_managed }}
+{{ nrpe_user }} ALL=NOPASSWD: /usr/sbin/megacli, /usr/sbin/cli64, /usr/sbin/smartctl, /usr/sbin/nvme
--- /dev/null
+# {{ ansible_managed }}
+log_facility=daemon
+{% if ansible_os_family == "Debian" %}
+pid_file=/var/run/nagios/nrpe.pid
+{% else %}
+pid_file=/var/run/nrpe/nrpe.pid
+{% endif %}
+server_port=5666
+nrpe_user={{ nrpe_user }}
+nrpe_group={{ nrpe_group }}
+
+allowed_hosts={{ nagios_allowed_hosts }}
+dont_blame_nrpe=0
+debug=0
+command_timeout=60
+connection_timeout=300
+
+command[check_users]={{ nagios_plugins_directory }}/check_users --warning=5 --critical=10
+command[check_load]={{ nagios_plugins_directory }}/check_load --percpu --warning=1.5,1.4,1.3 --critical=2.0,1.9,1.8
+command[check_mem]={{ nagios_plugins_directory }}/check_mem.sh -w 85 -c 95
+command[check_hda1]={{ nagios_plugins_directory }}/check_disk --warning=20% --critical=10% --partition=/dev/hda1
+command[check_root]={{ nagios_plugins_directory }}/check_disk --warning=10% --critical=5% --units=GB --path=/
+command[check_zombie_procs]={{ nagios_plugins_directory }}/check_procs --warning=5 --critical=10 --state=Z
+command[check_total_procs]={{ nagios_plugins_directory }}/check_procs --warning=300 --critical=500
+command[check_raid]=/usr/libexec/raid.pl
+command[check_disks]=/usr/libexec/diskusage.pl 90 95
+command[check_smart]=/usr/libexec/smart.sh
+
+include=/etc/nagios/nrpe_local.cfg
+
+include_dir=/etc/nagios/nrpe.d/
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+
+[{{ item.key }}]
+{% for k, v in item.value.items() | sort -%}
+ {{ k }}={{ v }}
+{% endfor %}
--- /dev/null
+---
+nrpe_service_name: nagios-nrpe-server
+nrpe_user: nagios
+nrpe_group: nagios
+nagios_plugins_directory: /usr/lib/nagios/plugins
+
+nrpe_packages:
+ - nagios-nrpe-server
+ - nagios-plugins-basic
--- /dev/null
+---
+nrpe_selinux_packages:
+ - python3-libsemanage
+ - python3-policycoreutils
--- /dev/null
+---
+nrpe_selinux_packages:
+ - python3-libsemanage
+ - python3-policycoreutils
--- /dev/null
+yum_systems.yml
\ No newline at end of file
--- /dev/null
+---
+nrpe_selinux_packages:
+ - python-libsemanage
+ - policycoreutils-python-utils
--- /dev/null
+---
+rhsm_repos:
+ - rhel-6-server-rpms
+ - rhel-6-server-optional-rpms
+ - rhel-6-server-extras-rpms
+ # for xfsprogs
+ - rhel-scalefs-for-rhel-6-server-rpms
--- /dev/null
+---
+rhsm_repos:
+ - rhel-7-server-rpms
+ - rhel-7-server-optional-rpms
+ - rhel-7-server-extras-rpms
+ - rhel-ha-for-rhel-7-server-rpms
--- /dev/null
+---
+rhsm_repos:
+ - rhel-8-for-x86_64-baseos-rpms
+ - rhel-8-for-x86_64-appstream-rpms
+ - codeready-builder-for-rhel-8-x86_64-rpms
+
+nrpe_selinux_packages:
+ - python3-libsemanage
+ - python3-policycoreutils
--- /dev/null
+---
+rhsm_repos:
+ - rhel-9-for-x86_64-baseos-rpms
+ - rhel-9-for-x86_64-appstream-rpms
+ - codeready-builder-for-rhel-9-x86_64-rpms
+
+nrpe_selinux_packages:
+ - python3-libsemanage
+ - python3-policycoreutils
--- /dev/null
+---
+nrpe_service_name: nrpe
+nrpe_user: nrpe
+nrpe_group: nrpe
+nagios_plugins_directory: /usr/lib64/nagios/plugins
+
+nrpe_packages:
+ - nagios-common
+ - nrpe
+ - nagios-plugins
+ - nagios-plugins-load
--- /dev/null
+---
+nrpe_service_name: nrpe
+nrpe_user: nrpe
+nrpe_group: nrpe
+nagios_plugins_directory: /usr/lib/nagios/plugins
+
+nrpe_packages:
+ - nrpe
+ - monitoring-plugins-nrpe
--- /dev/null
+container-host
+==============
+
+The container-host role will:
+
+- Install ``docker`` or ``podman``
+- Configure a local ``docker.io`` mirror if configured
+
+Variables
++++++++++
+
+``container_packages: []`` is the list of container packages to install. We default to podman on RedHat based distros and docker.io on Debian-based distros.
+
+The following variables are used to optionally configure a docker.io mirror CA certificate. The role will install the certificate in both ``/etc/containers/certs.d`` (for podman) and ``/etc/docker/certs.d`` (for docker).::
+
+ # Defined in all.yml in secrets repo
+ container_mirror: docker-mirror.front.sepia.ceph.com:5000
+
+ # Defined in all.yml in secrets repo
+ container_mirror_cert: |
+ -----BEGIN CERTIFICATE-----
+ ...
+ -----END CERTIFICATE-----
+
+Tags
+++++
+
+registries-conf-ctl
+ Add ``--skip-tags registries-conf-ctl`` to your ``ansible-playbook`` command if you don't want to use registries-conf-ctl_ to configure the container service's conf file.
+
+.. _registries-conf-ctl: https://github.com/sebastian-philipp/registries-conf-ctl
--- /dev/null
+---
+dependencies:
+ - role: secrets
--- /dev/null
+---
+- name: "Create container_mirror_cert_paths"
+ file:
+ path: "{{ item }}"
+ state: directory
+ with_items: "{{ container_mirror_cert_paths }}"
+
+- name: "Copy {{ container_mirror }} self-signed cert"
+ copy:
+ dest: "{{ item }}/docker-mirror.crt"
+ content: "{{ container_mirror_cert }}"
+ with_items: "{{ container_mirror_cert_paths }}"
+
+- name: Ensure git is installed
+ package:
+ name: git
+ state: present
+ tags:
+ - registries-conf-ctl
+
+- name: Install registries-conf-ctl
+ pip:
+ name: git+https://github.com/sebastian-philipp/registries-conf-ctl
+ state: latest
+ executable: "{{ pip_executable|default('pip3') }}"
+ tags:
+ - registries-conf-ctl
+
+- name: "Check for docker's daemon.json"
+ stat:
+ path: "{{ container_service_conf }}"
+ when:
+ - "'docker.io' in container_packages"
+ - "'podman' not in container_packages"
+ register: container_conf
+
+- name: "Create {{ container_service_conf }} if necessary"
+ copy:
+ dest: "{{ container_service_conf }}"
+ content: "{}"
+ when:
+ - "'docker.io' in container_packages"
+ - "'podman' not in container_packages"
+ - container_conf.stat.exists == False
+
+- name: Add local docker.io registry mirror
+ command: registries-conf-ctl add-mirror docker.io "{{ container_mirror }}"
+ environment:
+ PATH: /usr/local/bin:/usr/bin
+ tags:
+ - registries-conf-ctl
+
+# not very elegant but it's a workaround for now
+- name: Restart docker service
+ service:
+ name: docker
+ state: restarted
+ when: "'docker.io' in container_packages"
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
--- /dev/null
+---
+- set_fact:
+ package_manager: apt
+ when: ansible_os_family == "Debian"
+
+- set_fact:
+ package_manager: yum
+ when: ansible_os_family == "RedHat"
+
+# Load distro-specific vars, most specific match wins (with_first_found stops
+# at the first file that exists): a secrets-repo override, then
+# <distro>_<major_version>.yml, then <package_manager>_systems.yml, and
+# finally empty.yml as a no-op fallback so the task never fails.
+- name: Including distro specific variables
+ include_vars: "{{ item }}"
+ with_first_found:
+ - "{{ secrets_path }}/container-host/{{ ansible_distribution | lower }}_{{ ansible_distribution_major_version }}.yml"
+ - "{{ ansible_distribution | lower }}_{{ ansible_distribution_major_version }}.yml"
+ - "{{ package_manager }}_systems.yml"
+ - empty.yml
+
+- name: Install container packages
+ package:
+ name: "{{ container_packages }}"
+ state: latest
+ when: container_packages|length > 0
+
+- set_fact:
+ container_service_conf: "/etc/containers/registries.conf"
+ when:
+ - "'podman' in container_packages"
+ tags:
+ - container-mirror
+
+- set_fact:
+ container_service_conf: "/etc/docker/daemon.json"
+ when:
+ - "'docker.io' in container_packages"
+ - "'podman' not in container_packages"
+ tags:
+ - container-mirror
+
+- import_tasks: container_mirror.yml
+ when:
+ - container_mirror is defined
+ - container_mirror_cert is defined
+ tags:
+ - container-mirror
--- /dev/null
+---
+- name: "Create container_mirror_cert_paths"
+ file:
+ path: "{{ item }}"
+ state: directory
+ with_items: "{{ container_mirror_cert_paths }}"
+
+- name: "Copy {{ container_mirror }} self-signed cert"
+ copy:
+ dest: "{{ item }}/docker-mirror.crt"
+ content: "{{ container_mirror_cert }}"
+ with_items: "{{ container_mirror_cert_paths }}"
+
+- name: Ensure git is installed
+ package:
+ name: git
+ state: present
+ tags:
+ - registries-conf-ctl
+
+- name: Check for pipx
+ ansible.builtin.shell: "command -v pipx"
+ register: pipx_check
+ changed_when: false
+ failed_when: false
+ tags:
+ - registries-conf-ctl
+
+- import_tasks: pipx_install_reg_conf_ctl.yml
+ when: pipx_check.rc == 0
+ tags:
+ - registries-conf-ctl
+
+- name: Install registries-conf-ctl via pip
+ pip:
+ name: git+https://github.com/sebastian-philipp/registries-conf-ctl
+ state: latest
+ executable: "{{ pip_executable|default('pip3') }}"
+ when: pipx_check.rc != 0
+ tags:
+ - registries-conf-ctl
+
+- name: "Check for docker's daemon.json"
+ stat:
+ path: "{{ container_service_conf }}"
+ when:
+ - "'docker.io' in container_packages"
+ - "'podman' not in container_packages"
+ register: container_conf
+
+- name: "Create {{ container_service_conf }} if necessary"
+ copy:
+ dest: "{{ container_service_conf }}"
+ content: "{}"
+ when:
+ - "'docker.io' in container_packages"
+ - "'podman' not in container_packages"
+ - container_conf.stat.exists == False
+
+- name: Add local docker.io registry mirror
+ command: registries-conf-ctl add-mirror docker.io "{{ container_mirror }}"
+ environment:
+ PATH: /usr/local/bin:/usr/bin
+ tags:
+ - registries-conf-ctl
+
+# not very elegant but it's a workaround for now
+- name: Restart docker service
+ service:
+ name: docker
+ state: restarted
+ when: "'docker.io' in container_packages"
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
--- /dev/null
+---
+container_packages:
+ - docker.io
+ - python3-setuptools
+ - python3-pip
--- /dev/null
+---
+container_packages:
+ - podman
+ - podman-docker
+
+pip_executable: pip
--- /dev/null
+---
+container_packages:
+ - podman
+# Doesn't exist yet
+# - podman-docker
--- /dev/null
+---
+container_mirror_cert_paths:
+ - "/etc/docker/certs.d/{{ container_mirror }}"
+ - "/etc/containers/certs.d/{{ container_mirror }}"
--- /dev/null
+---
+container_packages:
+ - docker.io
+ - python-setuptools
+ - python-pip
+
+pip_executable: pip
--- /dev/null
+---
+container_packages:
+ - docker.io
+ - python3-setuptools
+ - python3-pip
+ - pipx
--- /dev/null
+---
+container_packages:
+ - podman
+ - podman-docker
--- /dev/null
+dhcp-server
+===========
+
+This role can be used to install, update, and manage a DHCP server running on CentOS 7.
+
+Notes
++++++
+
+This role is heavily modified to be primarily useful for our test labs that only have two or three subnets. See https://wiki.sepia.ceph.com/doku.php?id=services:networking.
+
+This role checks for firewalld and iptables. It will configure firewalld unless iptables is running. It **does not** configure iptables and will not install or configure firewalld if it's not installed. At the time the role was created, our DHCP server was running other services and its iptables was already heavily modified and configured. This reason, along with firewalld being the default in CentOS 7, is why iptables configuration is skipped.
+
+Variables
++++++++++
+This role basically has two required and four optional variables:
+
++----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| **Required Variables** |
++---------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+|:: | This list will be used to populate the global ``/etc/dhcpd.conf``. You can add additional keys and values. Just make sure they follow the syntax required for dhcpd.conf. |
+| | |
+| dhcp_global_options: | |
+| - ddns-update-style: none | Here's the dhcpd_ man page. |
+| - default-lease-time: 43200 | |
+| - max-lease-time: 172800 | |
+| - one-lease-per-client: "true" | |
+| | |
++---------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+|:: | This is large dictionary that gets parsed out into individual dhcpd config files. Each top-level key (``front`` and ``ipmi`` in the example) will get its own dhcp conf file created. The example shown to the left is our actual ``dhcp_subnets`` dictionary. |
+| | |
+| dhcp_subnets: | |
+| front: | Under each subnet, ``cidr``, ``ipvar``, and ``macvar`` are required. ``ipvar`` and ``macvar`` tell the Jinja2 template which IP address and MAC address should be used for each host in each subnet config file. |
+| cidr: 172.21.0.0/20 | |
+| ipvar: ip | Here's a line from our Ansible inventory host file |
+| macvar: mac | |
+| domain_name: front.sepia.ceph.com | ``smithi001.front.sepia.ceph.com mac=0C:C4:7A:BD:15:E8 ip=172.21.15.1 ipmi=172.21.47.1 bmc=0C:C4:7A:6E:21:A7`` |
+| domain_search: | |
+| - front.sepia.ceph.com | This will result in a static IP entry for smithi001-front with IP 172.21.15.1 and MAC 0C:C4:7A:BD:15:E8 in ``dhcpd.front.conf`` and a smithi001-ipmi entry with IP 172.21.47.1 with MAC 0C:C4:7A:6E:21:A7 in ``dhcpd.ipmi.conf``. |
+| - sepia.ceph.com | |
+| domain_name_server: | The ``next_server`` and ``filename`` values can be overridden by ansible group or host. See below. |
+| - 172.21.0.1 | |
+| - 172.21.0.2 | All the other keys are optional. |
+| routers: 172.21.15.254 | |
+| next_server: 172.21.0.11 | |
+| filename: "/pxelinux.0" | |
+| classes: | |
+| virtual: "match if substring(hardware, 0, 4) = 01:52:54:00" | |
+| lxc: "match if substring(hardware, 0, 4) = 01:52:54:ff" | |
+| pools: | |
+| virtual: | |
+| range: 172.21.10.20 172.21.10.250 | |
+| unknown_clients: | |
+| range: | |
+| - 172.21.11.0 172.21.11.19 | |
+| - 172.21.13.170 172.21.13.250 | |
+| next_server: 172.21.0.11 | |
+| filename: "/pxelinux.0" | |
+| lxc: | |
+| range: 172.21.14.1 172.21.14.200 | |
+| ipmi: | |
+| cidr: 172.21.32.0/20 | |
+| ipvar: ipmi | |
+| macvar: bmc | |
+| domain_name: ipmi.sepia.ceph.com | |
+| domain_search: | |
+| - ipmi.sepia.ceph.com | |
+| - sepia.ceph.com | |
+| domain_name_servers: | |
+| - 172.21.0.1 | |
+| - 172.21.0.2 | |
+| routers: 172.21.47.254 | |
+| next_server: 172.21.0.11 | |
+| filename: "/pxelinux.0" | |
+| pools: | |
+| unknown_clients: | |
+| range: 172.21.43.1 172.21.43.100 | |
+| next_server: 172.21.0.11 | |
+| filename: "/pxelinux.0" | |
+| | |
++---------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| **Optional Variables** |
++---------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| ``dhcp_next_server: 1.2.3.4`` | This is your PXE/TFTP server's IP address. This will **override** the subnet's ``next_server`` defined in the ``dhcp_subnets`` dictionary. It can be defined in your Ansible inventory in a couple ways: |
+| | |
+| | #. In ``ansible/inventory/group_vars/group.yml`` if some hosts should use a different PXE server |
+| | #. In your inventory ``hosts`` file on a per-host basis. See Ansible's docs_ on variable precedence. |
++---------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| ``dhcp_filename: "/pxelinux.0"`` | Same rules as above. This is the TFTP filename the DHCP server should instruct DHCP clients to download. |
++---------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+|:: | This can be set at a ``host_var`` or ``group_var`` level in the Ansible inventory. It will **override** the subnet's ``domain_name_servers`` defined in the ``dhcp_subnets`` dictionary. |
+| | |
+| domain_name_servers: | |
+| - 1.2.3.4 | |
+| - 5.6.7.8 | |
+| | |
++---------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| ``dhcp_option_hostname: False`` | Should this host get ``option host-name "{{ ansible_host }}";`` defined in its host declaration? Defaults to False. Override in secrets repo per host/group. |
++---------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+
+Tags
+++++
+
+Available tags are listed below:
+
+packages
+ Run (or skip) package install/update tasks
+
+.. _docs: https://docs.ansible.com/ansible/latest/user_guide/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable
+.. _dhcpd: https://linux.die.net/man/8/dhcpd
--- /dev/null
+---
+- name: Install/update packages
+ yum:
+ name: dhcp
+ state: latest
+ register: dhcp_yum_transaction
+ tags: packages
+
+- name: Check for firewalld
+ command: firewall-cmd --state
+ register: firewalld_state
+ ignore_errors: true
+
+- name: Check for iptables
+ command: systemctl status iptables
+ register: iptables_state
+ ignore_errors: true
+
+# Only manage firewalld when iptables is not in use and firewalld is actually
+# installed. The "No such file or directory" test detects a missing
+# firewall-cmd binary from the earlier "Check for firewalld" task.
+#
+# BUGFIX: the substring check was previously wrapped in double quotes, making
+# it a constant-true Jinja string literal instead of an `in` test, so the
+# "firewalld not installed" case was never detected.
+- name: Make sure firewalld is running
+  service:
+    name: firewalld
+    state: started
+    enabled: yes
+  when:
+    - iptables_state.rc != 0
+    - not (firewalld_state.msg is defined and 'No such file or directory' in firewalld_state.msg)
+
+# Permanently open the dhcp service port and apply immediately.
+- name: Configure firewalld
+  firewalld:
+    service: dhcp
+    state: enabled
+    permanent: true
+    immediate: yes
+  when:
+    - iptables_state.rc != 0
+    - not (firewalld_state.msg is defined and 'No such file or directory' in firewalld_state.msg)
+
+- name: Write global dhcpd.conf
+ template:
+ src: dhcpd.conf.j2
+ dest: /etc/dhcp/dhcpd.conf
+ backup: yes
+ register: dhcp_global_config
+
+- name: Write each subnet config
+ template:
+ src: dhcpd.subnet.conf.j2
+ dest: "/etc/dhcp/dhcpd.{{ item }}.conf"
+ backup: yes
+ with_items: "{{ dhcp_subnets }}"
+ register: dhcp_subnet_config
+
+# Validate the rendered configs with dhcpd's built-in syntax check (-t) before
+# restarting, so a bad template cannot take the DHCP server down.
+- name: Test new config
+ command: dhcpd -t -cf /etc/dhcp/dhcpd.conf
+ register: dhcpd_config_test_result
+ when: dhcp_global_config is changed or dhcp_subnet_config is changed
+
+# Restart only when something changed AND the syntax check above passed.
+- name: Restart dhcpd
+ service:
+ name: dhcpd
+ state: restarted
+ when:
+ - (dhcp_global_config is changed or dhcp_subnet_config is changed)
+ - dhcpd_config_test_result is defined
+ - dhcpd_config_test_result.rc == 0
--- /dev/null
+{% for item in dhcp_global_options %}
+{% for key, value in item.items() %}
+{{ key }} {{ value }};
+{% endfor %}
+{% endfor %}
+
+{% for key, value in dhcp_subnets.items() %}
+include "/etc/dhcp/dhcpd.{{ key }}.conf";
+{% endfor %}
--- /dev/null
+{% for subnet, subnet_item in dhcp_subnets.items() %}
+{% if subnet == item %}
+subnet {{ subnet_item.cidr | ipaddr('network') }} netmask {{ subnet_item.cidr | ipaddr('netmask') }} {
+ {% if subnet_item.domain_name is defined -%}
+ option domain-name "{{ subnet_item.domain_name }}";
+ {% endif -%}
+ {% if subnet_item.domain_search is defined -%}
+ option domain-search "{{ subnet_item.domain_search|join('", "') }}";
+ {% endif -%}
+ {% if subnet_item.domain_name_servers is defined -%}
+ option domain-name-servers {{ subnet_item.domain_name_servers|join(', ') }};
+ {% endif -%}
+ {% if subnet_item.routers is defined -%}
+ option routers {{ subnet_item.routers }};
+ {% endif -%}
+ {% if subnet_item.next_server is defined -%}
+ next-server {{ subnet_item.next_server }};
+ {% endif -%}
+ {% if subnet_item.filename is defined -%}
+ filename "{{ subnet_item.filename }}";
+ {% endif %}
+
+ {% if subnet_item.classes is defined -%}
+ {% for class_name, class_string in subnet_item.classes.items() -%}
+ class "{{ class_name }}" {
+ {{ class_string }};
+ }
+
+ {% endfor -%}
+ {%- endif -%}
+
+ {% if subnet_item.pools is defined -%}
+ {% for pool, pool_value in subnet_item.pools.items() -%}
+ pool {
+ {% if pool == "unknown_clients" -%}
+ allow unknown-clients;
+ {% else -%}
+ allow members of "{{ pool }}";
+ {% endif -%}
+ {% if pool_value.range is string -%}
+ range {{ pool_value.range }};
+ {% else -%}
+ range {{ pool_value.range|join(';\n range ') }};
+ {% endif -%}
+ {% if pool_value.next_server is defined -%}
+ next-server {{ pool_value.next_server }};
+ {% endif -%}
+ {% if pool_value.filename is defined -%}
+ filename "{{ pool_value.filename }}";
+ {% endif -%}
+ }
+
+ {% endfor -%}
+ {%- endif -%}
+
+ {% for host in groups['all'] | sort | unique -%}
+ {% if hostvars[host][subnet_item.macvar] is defined -%}
+ {% if hostvars[host][subnet_item.ipvar] | ipaddr(subnet_item.cidr) | ipaddr('bool') -%}
+ host {{ host.split('.')[0] }}-{{ subnet }} {
+ {% if hostvars[host]['dhcp_next_server'] is defined -%}
+ next-server {{ hostvars[host]['dhcp_next_server'] }};
+ filename "{{ hostvars[host]['dhcp_filename'] }}";
+ {% endif -%}
+ {% if hostvars[host]['domain_name_servers'] is defined -%}
+ option domain-name-servers {{ hostvars[host]['domain_name_servers']|join(', ') }};
+ {% endif -%}
+ hardware ethernet {{ hostvars[host][subnet_item.macvar] }};
+ fixed-address {{ hostvars[host][subnet_item.ipvar] }};
+ {% if hostvars[host]['dhcp_option_hostname'] is defined and hostvars[host]['dhcp_option_hostname'] == true %}
+ option host-name "{{ host.split('.')[0] }}";
+ {% endif -%}
+ }
+ {% endif -%}
+ {% endif -%}
+ {% endfor -%}
+} # end subnet
+{% endif %}
+{% endfor %}
--- /dev/null
+---
+# When cleanup is true the tasks being used might
+# perform cleanup steps if applicable.
+cleanup: false
+
+
+# yum_repos is a list of hashes that
+# define the url to download the yum repo
+# from and the name to save it as in etc/yum.repos.d
+#
+# For example:
+# yum_repos:
+# - url: "http://path/to/epel.repo"
+# name: "epel"
+#
+# When using the yum_repos var and if cleanup is true it will
+# delete the repos instead of creating them.
+yum_repos: []
+
+# a list of repo names as strings to delete from /etc/yum.repos.d
+# the name should not include the .repo extension
+remove_yum_repos: []
+
+# a list of repo names as strings to disable in /etc/yum.repos.d
+# the name should not include the .repo extension
+# When using the disable_yum_repos var and if cleanup is true it will
+# delete the repos instead of creating them.
+# NOTE: this does not work on repo files with multiple entries in them,
+# it will only disable the first entry in the repo file.
+disable_yum_repos: []
+
+# a list of repo names as strings to enable in /etc/yum.repos.d
+# the name should not include the .repo extension
+# NOTE: this does not work on repo files with multiple entries in them,
+# it will only enable the first entry in the repo file.
+enable_yum_repos: []
+
+# defining empty var for ansible v2.2 compatibility.
+repos_to_remove: []
--- /dev/null
+---
+- debug: msg="Performing cleanup related tasks..."
+
+- import_tasks: yum_repos.yml
+ when: remove_yum_repos|length > 0
+ vars:
+ repos: "{{ remove_yum_repos }}"
+ tags:
+ - yum-repos
+
+# During cleanup, repos that setup.yml downloaded are deleted again. Derive
+# the list of repo file names directly from yum_repos instead of building a
+# comma-joined string and splitting it back apart.
+- import_tasks: remove_yum_repos.yml
+  when: yum_repos|length > 0
+  vars:
+    repos: "{{ yum_repos | map(attribute='name') | list }}"
+  tags:
+    - delete-yum-repos
+
+# NOTE: the apparent enable/disable swap below is intentional. Cleanup
+# *reverses* the setup actions (see tasks/main.yml): repos listed in
+# enable_yum_repos are disabled again, and repos listed in disable_yum_repos
+# are re-enabled.
+- import_tasks: disable_yum_repos.yml
+ when: enable_yum_repos|length > 0
+ vars:
+ repos: "{{ enable_yum_repos }}"
+ tags:
+ - disable-yum-repos
+
+- import_tasks: enable_yum_repos.yml
+ when: disable_yum_repos|length > 0
+ vars:
+ repos: "{{ disable_yum_repos }}"
+ tags:
+ - enable-yum-repos
--- /dev/null
+---
+- name: Disable yum repos.
+ lineinfile:
+ dest: "/etc/yum.repos.d/{{ item }}.repo"
+ line: "enabled=0"
+ regexp: "enabled=1"
+ backrefs: yes
+ state: present
+ with_items: "{{ repos }}"
+ ignore_errors: true
--- /dev/null
+---
+- name: Enable yum repos.
+ lineinfile:
+ dest: "/etc/yum.repos.d/{{ item }}.repo"
+ line: "enabled=1"
+ regexp: "enabled=0"
+ backrefs: yes
+ state: present
+ with_items: "{{ repos }}"
+ ignore_errors: true
--- /dev/null
+---
+# Entry point for the yum-repos role. Only CentOS/RHEL hosts are supported;
+# all tasks are skipped on other distributions.
+#
+# The deprecated 'static: no' keyword (see
+# https://github.com/ansible/ansible/issues/18483) has been dropped:
+# import_tasks is always processed statically and modern Ansible releases
+# reject the keyword outright.
+
+# These are tasks which perform actions corresponding to the names of
+# the variables they use. For example, `disable_yum_repos` would actually
+# disable all repos defined in that list.
+- import_tasks: setup.yml
+  when: not cleanup and (ansible_distribution == "CentOS" or ansible_distribution == "RedHat")
+
+# These are tasks which reverse the actions corresponding to the names of
+# the variables they use. For example, `disable_yum_repos` would actually
+# enable all repos defined in that list. The primary use for this is through
+# teuthology, so that you can tell a teuthology run to disable a set of repos
+# for the test run but then re-enable them during the teuthology cleanup process.
+- import_tasks: cleanup.yml
+  when: cleanup and (ansible_distribution == "CentOS" or ansible_distribution == "RedHat")
--- /dev/null
+---
+- name: Delete yum repos from /etc/yum.repos.d
+ file:
+ path: "/etc/yum.repos.d/{{ item }}.repo"
+ state: absent
+ with_items: "{{ repos }}"
--- /dev/null
+---
+- import_tasks: yum_repos.yml
+ when: yum_repos|length > 0
+ vars:
+ repos: "{{ yum_repos }}"
+ tags:
+ - yum-repos
+
+- import_tasks: remove_yum_repos.yml
+ when: remove_yum_repos|length > 0
+ vars:
+ repos: "{{ remove_yum_repos }}"
+ tags:
+ - delete-yum-repos
+
+- import_tasks: disable_yum_repos.yml
+ when: disable_yum_repos|length > 0
+ vars:
+ repos: "{{ disable_yum_repos }}"
+ tags:
+ - disable-yum-repos
+
+- import_tasks: enable_yum_repos.yml
+ when: enable_yum_repos|length > 0
+ vars:
+ repos: "{{ enable_yum_repos }}"
+ tags:
+ - enable-yum-repos
--- /dev/null
+---
+- name: Download yum repos to /etc/yum.repos.d
+ get_url:
+ url: "{{ item.url }}"
+ dest: "/etc/yum.repos.d/{{ item.name }}.repo"
+ force: yes
+ with_items: "{{ repos }}"
--- /dev/null
+firmware
+========
+
+This role will largely only be useful for the Ceph upstream Sepia_ test lab.
+Some of the firmware flashing methods can be applied to other machine types however.
+
+Prerequisites
++++++++++++++
+
+Prerequisites are ordered by machine type (smithi, mira, etc.) then device type (BIOS, BMC, etc.)
+
+Universal device types (RAID controllers) are listed separately last.
+
+Mira
+----
+**BIOS**
+
+#. Download the latest BIOS firmware from Supermicro_'s website.
+#. Extract the binary blob from the archive and upload it somewhere that is http-accessible within the lab.
+#. Define ``bios_location`` as the http path to that file.
+#. Define ``latest_bios_version``. This is listed under ``Rev`` on Supermicro_'s website. See example under the *Variables* section.
+
+**BMC**
+
+#. Download the latest BMC firmware from Supermicro_'s website.
+#. Copy the full zip archive somewhere http-accessible within the lab.
+#. Define ``bmc_location`` as the http path to that archive.
+#. Define ``latest_bmc_version``. This is listed under ``Rev`` on Supermicro_'s website. See example under the *Variables* section.
+
+----
+
+Smithi
+------
+The Smithi machines have X10 generation system boards which require a DOS prompt or Windows in order to flash the BIOS. The flashrom tool doesn't yet support those boards.
+
+**BMC**
+
+#. Download the latest BMC firmware from Supermicro_'s website.
+#. Copy the full zip archive somewhere http-accessible within the lab.
+#. Define ``bmc_location`` in the secrets repo as the http path to that archive.
+#. Define ``latest_bmc_version`` in the secrets repo. This is listed under ``Rev`` on Supermicro_'s website. See example under the *Variables* section.
+
+**NVMe**
+
+RHEL and CentOS are the only supported distros for NVMe firmware flashing. Intel bakes the latest firmware into RPMs.
+
+#. Download the latest Intel SSD Data Center Tool archive from Intel_'s website.
+#. Extract the appropriate architecture RPM (probably x86_64) from the zip archive and upload it somewhere http-accessible within the lab.
+#. Define ``nvme_firmware_package`` in the secrets repo as the HTTP path to the RPM.
+
+----
+
+Areca RAID Controllers
+----------------------
+We have multiple different model controllers but the firmware update process is the same for the models we have. Following these steps carefully allows the process to be used for any model controller.
+
+#. Download firmware archives for each model RAID controller you have from Areca_'s website.
+#. Create an empty directory on your http server and upload each archive there.
+#. Rename each zip archive to match the model output you get from ``cli64 sys info | grep "Controller Name"`` (e.g., ARC-1222.zip).
+#. Define a ``latest_{{ model_lower_pretty }}_version`` variable for each model controller you have. This *must* match the ``Firmware Version`` output of ``cli64 sys info``. See examples under the *Variables* section.
+
+Variables
++++++++++
+
+``flashrom_location: "http://download.flashrom.org/releases/flashrom-0.9.9.tar.bz2"``. Tool used to flash BIOSes for certain machine types. Defined in ``roles/firmware/defaults/main.yml``.
+
+``firmware_update_path: "/home/{{ ansible_user }}/firmware-update"`` is just a temporary dir used on the target ansible host to work out of and download firmware and tools to. It gets deleted at the end of a successful playbook run. Defined in ``roles/firmware/defaults/main.yml``.
+
+``latest_bios_version: null`` should be overridden in your ansible inventory based on machine type. The format should match what you get when running ``dmidecode --type bios | grep Version``. Not all machine types have BIOSes that can be updated using ``flashrom`` so this variable is defined as ``null`` in ``roles/firmware/defaults/main.yml``. See example for a supported machine type::
+
+ # From ansible/inventory/group_vars/mira.yml
+ latest_bios_version: "1.2a"
+
+``latest_bmc_version: null`` should be overridden in your ansible inventory based on machine type. The format should match what you get when running ``ipmitool mc info | grep "Firmware Revision"``. See example::
+
+ # From ansible/inventory/group_vars/mira.yml
+ latest_bmc_version: "3.16"
+
+``bios_location: null`` should be the direct HTTP path to the BIOS binary. Override in your ansible inventory based on machine type. See example::
+
+ # From ansible/inventory/group_vars/mira.yml
+ bios_location: "http://drop.front.sepia.ceph.com/firmware/mira/X8SIL2.627"
+
+``bmc_location: null`` should be the direct HTTP path to the BMC firmware zip archive. Override in your ansible inventory based on machine type. See example::
+
+ # From ansible/inventory/group_vars/mira.yml
+ bmc_location: "http://drop.front.sepia.ceph.com/firmware/mira/ipmi_316.zip"
+
+``areca_download_location: null`` should be the HTTP path to a directory serving all your Areca firmware zip archives. Override in your ansible inventory. See example::
+
+ # From ansible/inventory/group_vars/all.yml
+ areca_download_location: "http://drop.front.sepia.ceph.com/firmware/areca"
+
+You should have a ``latest_{{ areca_lower_pretty }}_version`` variable for each model Areca controller you have. ``areca_lower_pretty`` should be lowercase with no special characters. Obtain the firmware version format and model from ``cli64 sys info`` output. Override in your ansible inventory. See examples::
+
+ # From ansible/inventory/group_vars/all.yml
+ latest_arc1222_version: "V1.51"
+ latest_arc1880_version: "V1.53"
+
+``nvme_firmware_package: null`` should be overridden in your ansible inventory. It is the direct HTTP path to Intel's SSD Datacenter Tool RPM. We only have NVMe drives in our ``smithi`` machine type so we define it in ``group_vars``. See example::
+
+ # From ansible/inventory/group_vars/smithi.yml
+ nvme_firmware_package: "http://drop.front.sepia.ceph.com/firmware/smithi/isdct-3.0.2.400-17.x86_64.rpm"
+
+Tags
+++++
+Running the role without a tag will update all firmwares a system has available to it.
+
+bios
+ If the system(s) you're running this role against supports flashing the BIOS from the OS (current method uses ``flashrom`` and a BIOS binary), this tag will update the BIOS if an update is required.
+
+bmc
+ If the system(s) you're running this role against supports flashing the BMC from the OS (Supermicro provides an executable and firmware binary), this tag will update the BMC if an update is required.
+
+areca
+ Updates only Areca RAID controller firmwares/BIOS
+
+nvme
+ Updates Intel NVMe device firmware. Supports RHEL/CentOS only.
+
+To Do
++++++
+
+- Monitor ``flashrom`` releases to check if Supermicro X10 boards are supported yet
+
+.. _Sepia: https://ceph.github.io/sepia/
+.. _Supermicro: https://www.supermicro.com/ResourceApps/BIOS_IPMI.aspx
+.. _Intel: https://downloadcenter.intel.com/download/26221/Intel-SSD-Data-Center-Tool
+.. _Areca: http://www.areca.us/support/main.htm
--- /dev/null
+---
+# Defaults should be overridden in the secrets repo in each machine type's
+# group_vars file
+latest_bios_version: null
+latest_bmc_version: null
+
+flashrom_location: "http://download.flashrom.org/releases/flashrom-0.9.9.tar.bz2"
+
+areca_download_location: null
+
+firmware_update_path: "/home/{{ ansible_user }}/firmware-update"
--- /dev/null
+---
+# This file is only called when current_areca_version
+# and latest_{{ areca_model_pretty }}_version do not match
+
+# unzip is needed to extract the firmware archive downloaded below.
+- name: Install Unzip
+ package:
+ name: unzip
+ state: latest
+
+- name: Create Areca update working directory structure
+ file:
+ path: "{{ firmware_update_path }}/areca-update"
+ state: directory
+
+# Download Areca zip archive and name it something we can consume reliably
+- name: Download Areca firmware
+ get_url:
+ url: "{{ areca_download_location }}/{{ areca_model.stdout }}.zip"
+ dest: "{{ firmware_update_path }}/areca-update/areca.zip"
+ force: yes
+
+# Only extract the binary blobs and don't recreate dir structure
+# NOTE(review): the unquoted *.BIN pattern relies on the shell passing it
+# through to unzip when no matching file exists in the cwd -- confirm before
+# changing the shell or enabling nullglob/failglob.
+- name: Unzip Areca firmware archive
+ shell: "cd {{ firmware_update_path }}/areca-update && unzip -j areca.zip *.BIN"
+
+# Flash each extracted firmware image in turn with Areca's cli64 tool.
+- name: Flash Areca firmware
+ shell: "for file in $(ls {{ firmware_update_path }}/areca-update/*.BIN); do cli64 sys updatefw path=$file; done"
--- /dev/null
+---
+- name: Check for Areca devices
+ shell: "lspci | grep -q -i areca"
+ register: lspci_output
+# ignore_errors: true
+ failed_when: False
+
+- name: Determine Areca RAID Controller Model
+ shell: "cli64 sys info | grep 'Controller Name' | awk '{ print $4 }'"
+ register: areca_model
+ when: "lspci_output.rc == 0"
+
+- name: Set areca_model_pretty var
+ set_fact:
+ areca_model_pretty: "{{ areca_model.stdout|lower|replace('-', '') }}"
+ when: "lspci_output.rc == 0"
+
+- name: Determine current Areca firmware version
+ shell: "cli64 sys info | grep 'Firmware Version' | awk '{ print $4 }'"
+ register: current_areca_version
+ when: "lspci_output.rc == 0"
+
+# We have Areca 1222 and 1880 covered. If any other models exist, the 'when'
+# statement will gracefully allow the rest of this playbook to be skipped.
+#
+# Registered vars from skipped tasks still exist (carrying .skipped), so we
+# test current_areca_version.stdout rather than the var itself. The
+# model-specific "latest" var is looked up via vars[] so the condition
+# needs no inline Jinja and cannot fail to template on hosts where
+# areca_model_pretty was never set (no Areca controller present).
+- name: Determine if Areca firmware update needed
+  set_fact:
+    need_areca_update: true
+  when:
+    - areca_model_pretty is defined
+    - current_areca_version.stdout is defined
+    - vars['latest_' ~ areca_model_pretty ~ '_version'] is defined
+    - current_areca_version.stdout != vars['latest_' ~ areca_model_pretty ~ '_version']
+
+- name: Run Areca firmware update playbook
+ import_tasks: areca/areca-update.yml
+ when: need_areca_update is defined and need_areca_update == true
--- /dev/null
+---
+- import_tasks: mira/bios.yml
+ tags:
+ - bios
+ when: '"mira" in ansible_hostname'
+
+- import_tasks: mira/bmc.yml
+ tags:
+ - bmc
+ when: '"mira" in ansible_hostname'
+
+- import_tasks: areca/main.yml
+ tags:
+ - areca
+
+- import_tasks: smithi/bmc.yml
+ tags:
+ - bmc
+ when: '"smithi" in ansible_hostname'
+
+# NVMe firmware flashing is only supported on RHEL/CentOS
+- import_tasks: smithi/nvme.yml
+ tags:
+ - nvme
+ when: '"smithi" in ansible_hostname and ansible_pkg_mgr == "yum"'
+
+# This won't get run if a previous playbook fails. So if a backup of a BIOS is
+# needed to restore, it'll still be there
+- name: Clean up firmware update directory
+ file:
+ path: "{{ firmware_update_path }}"
+ state: absent
+ tags:
+ - always
--- /dev/null
+---
+# This file is only called when current_bios_version
+# and latest_bios_version do not match
+
+- name: Install packages for CentOS/RHEL
+  yum:
+    # Build dependencies for compiling flashrom from source.
+    # Passing the whole list to `name` installs everything in a single
+    # transaction instead of looping one package at a time.
+    name:
+      - pciutils-devel
+      - zlib-devel
+      - libftdi-devel
+      - libusb-devel
+      - make
+      - gcc
+    state: latest
+  when: ansible_pkg_mgr == "yum"
+
+- name: Install packages for Ubuntu
+  apt:
+    name:
+      - flashrom
+    state: latest
+  when: ansible_pkg_mgr == "apt"
+
+# Flashrom has to be built on CentOS so we add an extra dir for it
+# This is equivalent to 'mkdir -p'
+- name: Create BIOS update working directory structure
+ file:
+ path: "{{ firmware_update_path }}/bios-update/flashrom"
+ state: directory
+
+# This file must be the already-extracted binary blob from the Supermicro
+# firmware archive. Naming scheme is PPPPPY.MDD
+# PPPPP = Project name; Y = Year; M = Month; DD = Day
+# We rename it to 'new-bios' here so the playbook can consume a universal name
+- name: Download BIOS binary
+ get_url:
+ url: "{{ bios_location }}"
+ dest: "{{ firmware_update_path }}/bios-update/new-bios"
+
+# There is no flashrom RPM in any trusted repositories so we have to compile it
+- name: Download flashrom archive (CentOS)
+ get_url:
+ url: "{{ flashrom_location }}"
+ dest: "{{ firmware_update_path }}/bios-update/flashrom.tar.bz2"
+ validate_certs: no
+ when: ansible_pkg_mgr == "yum"
+
+# The flashrom tarballs extract to a directory with its version number by default
+# '--strip-components 1' gets rid of that dir so the playbook can run with any
+# flashrom version
+- name: Extract flashrom (CentOS)
+ shell: "tar -xjf {{ firmware_update_path }}/bios-update/flashrom.tar.bz2 --directory {{ firmware_update_path }}/bios-update/flashrom --strip-components 1"
+ when: ansible_pkg_mgr == "yum"
+
+- name: Compile flashrom (CentOS)
+ shell: "cd {{ firmware_update_path }}/bios-update/flashrom && make"
+ when: ansible_pkg_mgr == "yum"
+
+- name: Back up existing BIOS (CentOS)
+ shell: "cd {{ firmware_update_path }}/bios-update && flashrom/flashrom --programmer internal --read BIOS.bak"
+ when: ansible_pkg_mgr == "yum"
+
+- name: Flash new BIOS (CentOS)
+ shell: "cd {{ firmware_update_path }}/bios-update && flashrom/flashrom --programmer internal --write new-bios"
+ when: ansible_pkg_mgr == "yum"
+
+- name: Back up existing BIOS (Ubuntu)
+ shell: "cd {{ firmware_update_path }}/bios-update && flashrom --programmer internal --read BIOS.bak"
+ when: ansible_pkg_mgr == "apt"
+
+- name: Flash new BIOS (Ubuntu)
+ shell: "flashrom --programmer internal --write {{ firmware_update_path }}/bios-update/new-bios"
+ when: ansible_pkg_mgr == "apt"
--- /dev/null
+---
+- name: Determine current BIOS firmware version
+ shell: dmidecode --type bios | grep Version | awk '{ print $2 }'
+ register: current_bios_version
+ changed_when: False
+
+- name: Determine if BIOS update is needed
+ set_fact:
+ need_bios_update: true
+ when: current_bios_version.stdout != latest_bios_version
+
+- name: Include BIOS update logic
+ import_tasks: mira/bios-update.yml
+ when: need_bios_update is defined and need_bios_update == true
--- /dev/null
+---
+# This file is only called when current_bmc_version
+# and latest_bmc_version do not match
+
+- name: Install unzip
+ package:
+ name: unzip
+ state: latest
+
+- name: Create BMC update working directory structure
+ file:
+ path: "{{ firmware_update_path }}/bmc-update"
+ state: directory
+
+# Download the archive and rename to something the playbook can consume
+- name: Download BMC archive
+ get_url:
+ url: "{{ bmc_location }}"
+ dest: "{{ firmware_update_path }}/bmc-update/bmc.zip"
+ force: yes
+
+- name: Extract IPMI archive
+ shell: "cd {{ firmware_update_path }}/bmc-update && unzip bmc.zip"
+
+- name: Flash new BMC (Takes around 5 minutes)
+ shell: "cd {{ firmware_update_path }}/bmc-update/Linux* && chmod +x lUpdate && ./lUpdate -f ../*.bin -i kcs -r y"
+ register: bmc_flash_output
+
+# Print output of flash script
+- debug: var=bmc_flash_output.stdout_lines|last
--- /dev/null
+---
+- name: Install ipmitool
+ package:
+ name: ipmitool
+ state: latest
+
+- name: Enable IPMI kernel modules
+ modprobe:
+ name: "{{ item }}"
+ state: present
+ with_items:
+ - ipmi_devintf
+ - ipmi_si
+
+- name: Determine current BMC firmware version
+ shell: ipmitool mc info | grep "Firmware Revision" | awk '{ print $4 }'
+ register: current_bmc_version
+ changed_when: False
+
+- name: Determine if BMC update is needed
+ set_fact:
+ need_bmc_update: true
+ when: current_bmc_version.stdout != latest_bmc_version
+
+- name: Include BMC update logic
+ import_tasks: mira/bmc-update.yml
+ when: need_bmc_update is defined and need_bmc_update == true
--- /dev/null
+---
+# This file is only called when current_bmc_version
+# and latest_bmc_version do not match
+
+- name: Install unzip
+ package:
+ name: unzip
+ state: latest
+
+- name: Create BMC update working directory structure
+ file:
+ path: "{{ firmware_update_path }}/bmc-update"
+ state: directory
+
+# Download the archive and rename to something the playbook can consume
+- name: Download BMC archive
+ get_url:
+ url: "{{ bmc_location }}"
+ dest: "{{ firmware_update_path }}/bmc-update/bmc.zip"
+ force: yes
+
+# Extract only the binary blob and the Linux flashing executable
+- name: Extract IPMI archive
+ shell: "cd {{ firmware_update_path }}/bmc-update && unzip -j bmc.zip *.bin */linux/x64/AlUpdate"
+
+- name: Flash new BMC (Takes around 11 minutes)
+ shell: "cd {{ firmware_update_path }}/bmc-update && chmod +x AlUpdate && ./AlUpdate -f *.bin -i kcs -r y"
+ register: bmc_flash_output
+
+# Print output of flash script
+- debug: var=bmc_flash_output.stdout_lines|last
--- /dev/null
+---
+- name: Install ipmitool
+ package:
+ name: ipmitool
+ state: latest
+
+- name: Enable IPMI kernel modules
+ modprobe:
+ name: "{{ item }}"
+ state: present
+ with_items:
+ - ipmi_devintf
+ - ipmi_si
+
+- name: Determine current BMC firmware version
+ shell: ipmitool mc info | grep "Firmware Revision" | awk '{ print $4 }'
+ register: current_bmc_version
+ changed_when: False
+
+- name: Determine if BMC update is needed
+ set_fact:
+ need_bmc_update: true
+ when: current_bmc_version.stdout != latest_bmc_version
+
+- name: Include BMC update logic
+ import_tasks: smithi/bmc-update.yml
+ when: need_bmc_update is defined and need_bmc_update == true
--- /dev/null
+---
+- name: Install Intel SSD Data Center Tool
+ yum:
+ name: "{{ nvme_firmware_package }}"
+ state: present
+
+# This will gather a list of serial numbers in case there are multiple NVMe drives.
+- name: Gather list of NVMe device serial numbers
+ shell: isdct show -d SerialNumber -intelssd | grep SerialNumber | awk '{ print $3 }'
+ register: nvme_serial_list_raw
+
+- name: Store ansible-friendly list of NVMe device Serial Numbers
+ set_fact:
+ nvme_device_list: "{{ nvme_serial_list_raw.stdout.split('\n') }}"
+
+# Despite the -force flag, this command won't flash firmware on a device that
+# already has the latest firmware. It'll just return 3 as the exit code.
+# Ansible fails a task with an rc of 3 hence the added failed_when logic.
+# A successful firmware update return code is 0.
+- name: Update each NVMe device's firmware
+ shell: "isdct load -force -intelssd {{ item }}"
+ with_items: "{{ nvme_device_list|default([]) }}"
+ register: nvme_update_output
+ failed_when: "'Error' in nvme_update_output.stdout"
+ changed_when: nvme_update_output.rc == 0
+
+# Print firmware flash output
+# Syntax discovered here: https://github.com/ansible/ansible/issues/5564
+- debug: var=nvme_update_output.results|map(attribute='stdout_lines')|list
--- /dev/null
+fog-server
+==========
+
+This role can be used to install and update a FOG_ server. It has been minimally tested on Ubuntu 16.04 and CentOS 7.4.
+
+Notes
++++++
+
+* You must manually configure firewall, SELinux, and repos on RHEL/CentOS/Fedora.
+* This role assumes the ``sudo`` group already exists and has passwordless sudo access.
+* We'd recommend running in verbose mode to see shell output. It can take around 10 minutes for the Install and Update tasks to complete.
+
+Variables
++++++++++
+
++-----------------------------------------------------------------------------------------------------------------------------------------------+
+| **Required Variables** |
++----------------------------+------------------------------------------------------------------------------------------------------------------+
+| ``fog_user: fog`` | Name for user account to be created on the system. The application will be run from this user's home directory. |
++----------------------------+------------------------------------------------------------------------------------------------------------------+
+| ``fog_branch: master`` | Branch of FOG to checkout and install. Defaults to master but could be set to ``working`` for bleeding edge. |
++----------------------------+------------------------------------------------------------------------------------------------------------------+
+| ``fog_dhcp_server: false`` | Set to ``true`` if you want FOG to install and configure the host as a DHCP server. |
++----------------------------+------------------------------------------------------------------------------------------------------------------+
+
+**Optional Variables**
+
+If none of these are set, the FOG defaults will be used. For simplicity's sake, the variables have been named after the variables in fogsettings_. Read the official documentation for a description of what each does.
+
+* fog_ipaddress
+* fog_interface
+* fog_submask
+* fog_routeraddress
+* fog_plainrouter
+* fog_dnsaddress
+* fog_password
+* fog_startrange (Required if ``fog_dhcp_server: true``)
+* fog_endrange (Required if ``fog_dhcp_server: true``)
+* fog_snmysqluser
+* fog_snmysqlpass
+* fog_snmysqlhost
+* fog_images_path
+* fog_docroot
+* fog_webroot
+* fog_httpproto
+
+.. _FOG: https://fogproject.org/
+.. _fogsettings: https://wiki.fogproject.org/wiki/index.php?title=.fogsettings
--- /dev/null
+---
+fog_user: fog
+fog_branch: master
+fog_dhcp_server: false
--- /dev/null
+---
+- name: Clone FOG
+ git:
+ repo: https://github.com/FOGProject/fogproject.git
+ dest: "/home/{{ fog_user }}/fog"
+ version: "{{ fog_branch }}"
+
+- name: Install FOG
+ shell: "sudo ./installfog.sh -Y -f /home/{{ fog_user }}/temp_settings"
+ args:
+ chdir: "/home/{{ fog_user }}/fog/bin"
+ become_user: "{{ fog_user }}"
--- /dev/null
+---
+- name: Ensure a user for FOG
+ user:
+ name: "{{ fog_user }}"
+ shell: /bin/bash
+ group: sudo
+ append: yes
+ createhome: yes
+
+- name: Ensure a path for FOG
+ file:
+ path: "/home/{{ fog_user }}/fog"
+ owner: "{{ fog_user }}"
+ state: directory
+
+- name: Write temp settings/answer file for FOG
+ template:
+ src: temp_settings.j2
+ dest: "/home/{{ fog_user }}/temp_settings"
+ owner: "{{ fog_user }}"
+
+# Unattended upgrades (of mysql specifically) will break FOG
+# https://forums.fogproject.org/topic/10006/ubuntu-is-fog-s-enemy
+- name: Make sure unattended-upgrades is not installed
+ apt:
+ name: unattended-upgrades
+ state: absent
+ when: ansible_os_family == "Debian"
+
+- name: Check if FOG is already installed
+ stat:
+ path: /opt/fog
+ register: fog_path_found
+
+- import_tasks: install.yml
+ when:
+ - fog_path_found.stat.exists == false
+ - fog_force == "yes"
+
+- import_tasks: update.yml
+ when:
+ - fog_path_found.stat.exists == true
+ - fog_force == "yes"
+
+- name: Clean up temp settings/answer file for FOG
+ file:
+ path: "/home/{{ fog_user }}/temp_settings"
+ state: absent
--- /dev/null
+---
+- name: Update FOG checkout
+ git:
+ repo: https://github.com/FOGProject/fogproject.git
+ dest: "/home/{{ fog_user }}/fog"
+ version: "{{ fog_branch }}"
+ update: yes
+
+- name: Update FOG
+ shell: "sudo ./installfog.sh -Y -f /home/{{ fog_user }}/temp_settings"
+ args:
+ chdir: "/home/{{ fog_user }}/fog/bin"
+ become_user: "{{ fog_user }}"
--- /dev/null
+{% if fog_ipaddress is defined %}
+ipaddress='{{ fog_ipaddress }}'
+{% else %}
+ipaddress='{{ ansible_default_ipv4.address }}'
+{% endif %}
+{% if fog_interface is defined %}
+interface='{{ fog_interface }}'
+{% else %}
+interface='{{ ansible_default_ipv4.alias }}'
+{% endif %}
+{% if fog_submask is defined %}
+submask='{{ fog_submask }}'
+{% else %}
+submask='{{ ansible_default_ipv4.netmask }}'
+{% endif %}
+{% if fog_routeraddress is defined %}
+routeraddress='{{ fog_routeraddress }}'
+{% else %}
+routeraddress='{{ ansible_default_ipv4.gateway }}'
+{% endif %}
+{% if fog_plainrouter is defined %}
+plainrouter='{{ fog_plainrouter }}'
+{% else %}
+plainrouter=''
+{% endif %}
+{% if fog_dnsaddress is defined %}
+dnsaddress='{{ fog_dnsaddress }}'
+{% else %}
+dnsaddress=''
+{% endif %}
+username='{{ fog_user }}'
+{% if fog_password is defined %}
+password='{{ fog_password }}'
+{% endif %}
+{% if ansible_os_family == "RedHat" %}
+osid='1'
+{% elif ansible_os_family == "Debian" %}
+osid='2'
+{% elif ansible_os_family == "Archlinux" %}
+osid='3'
+{% endif %}
+{% if fog_dhcp_server == true %}
+dodhcp='Y'
+bldhcp='1'
+startrange='{{ fog_startrange }}'
+endrange='{{ fog_endrange }}'
+{% else %}
+dodhcp='N'
+bldhcp='0'
+startrange=''
+endrange=''
+{% endif %}
+dhcpd='isc-dhcp-server'
+blexports='1'
+installtype='N'
+{% if fog_snmysqluser is defined %}
+snmysqluser='{{ fog_snmysqluser }}'
+{% else %}
+snmysqluser='root'
+{% endif %}
+{% if fog_snmysqlpass is defined %}
+snmysqlpass='{{ fog_snmysqlpass }}'
+{% else %}
+snmysqlpass=''
+{% endif %}
+{% if fog_snmysqlhost is defined %}
+snmysqlhost='{{ fog_snmysqlhost }}'
+{% else %}
+snmysqlhost='localhost'
+{% endif %}
+installlang='0'
+{% if fog_images_path is defined %}
+storageLocation='{{ fog_images_path }}'
+{% else %}
+storageLocation='/images'
+{% endif %}
+fogupdateloaded=1
+{% if fog_docroot is defined %}
+docroot='{{ fog_docroot }}'
+{% else %}
+docroot='/var/www/html/'
+{% endif %}
+{% if fog_webroot is defined %}
+webroot='{{ fog_webroot }}'
+{% else %}
+webroot='/fog/'
+{% endif %}
+caCreated='yes'
+bootfilename='undionly.kpxe'
+noTftpBuild=''
+notpxedefaultfile=''
+sslpath='/opt/fog/snapins/ssl/'
+backupPath=''
+sslprivkey='/opt/fog/snapins/ssl//.srvprivate.key'
+{% if fog_httpproto is defined %}
+httpproto='{{ fog_httpproto }}'
+{% else %}
+httpproto='http'
+{% endif %}
--- /dev/null
+gateway
+=======
+
+This role can be used to set up a new OpenVPN gateway for a Ceph test lab
+as well as maintain user access provided a secrets repo is configured.
+
+This role supports CentOS 7.2 only at this time. Its current intended use
+is to maintain the existing OpenVPN gateway in our Sepia_ lab.
+
+It does the following:
+- Configures network devices
+- Configures firewalld
+- Configures fail2ban
+- Installs and updates necessary packages
+- Maintains user list
+
+Prerequisites
++++++++++++++
+
+- CentOS 7.2
+
+Variables
++++++++++
+
+A list of packages to install that is specific to the role. The list is defined in ``roles/gateway/vars/packages.yml``::
+
+ packages: []
+
+A unique name to give to your OpenVPN service. This name is used to organize configuration files and start/stop the service. Defined in the secrets repo::
+
+ openvpn_server_name: []
+
+The directory in which the OpenVPN server CA, keys, certs, and user file should be saved. Defined in the secrets repo::
+
+ openvpn_data_dir: []
+
+Contains paths, file permission (modes), and data to store and maintain OpenVPN CA, cert, key, and main server config. Consult your server.conf on what you should define here. For reference, we have dh1024.pem, server.crt, server.key, tlsauth, and server.conf defined. Defined in the secrets repo::
+
+ gateway_secrets: []
+
+ # Example:
+ gateway_secrets:
+ - path: "{{ openvpn_data_dir }}/server.crt"
+ mode: 0644
+ data: |
+ -----BEGIN CERTIFICATE-----
+ ...
+ -----END CERTIFICATE-----
+ - path: /etc/openvpn/server.conf
+ mode: 0644
+ data: |
+ script-security 2
+ ...
+ cert {{ openvpn_data_dir }}/server.crt
+
+A list of users that don't have their ssh pubkey added to the ``teuthology_user`` authorized_keys but still need VPN access::
+
+ openvpn_users: []
+
+ # Example:
+ openvpn_users:
+ - ovpn: user@host etc...
+
+The following vars are used to populate ``/etc/resolv.conf``. Defined in the
+secrets repo::
+
+ gw_resolv_search: []
+ # Example: gw_resolv_search: "front.example.com"
+
+ gw_resolv_ns: []
+ # Example:
+ gw_resolv_ns:
+ - 1.2.3.4
+ - 8.8.8.8
+
+The ``gw_networks`` dictionary assumes you have individual NICs for each
+VLAN in your lab. The subelements ``peerdns`` and ``dns{1,2}`` are optional for
+all but one NIC. These are what set your nameservers in
+``/etc/resolv.conf``.
+``dns1`` and ``dns2`` should be defined under a single NIC and ``peerdns``
+should be set to ``"yes"``. ``routes`` is optional but must be formatted as documented in RHEL_ documentation.
+Defined in the secrets repo::
+
+ # Example:
+ gw_networks:
+ private:
+ ifname: "eth0"
+ mac: "de:ad:be:ef:12:34"
+ ip4: "192.168.1.100"
+ netmask: "255.255.240.0"
+ gw4: "192.168.1.1"
+ defroute: "yes"
+ peerdns: "yes"
+ search: "private.example.com"
+ dns1: "192.168.1.1"
+ dns2: "8.8.8.8"
+ routes: |
+ ADDRESS0=192.168.1.0
+ NETMASK0=255.255.240.0
+ GATEWAY0=192.168.1.1
+ ADDRESS1=172.21.64.0
+ NETMASK1=255.255.252.0
+ GATEWAY1=192.168.1.1
+ public:
+ ifname: "eth1"
+ etc...
+
+The *fail2ban* vars are explained in /etc/fail2ban/jail.conf. We've set
+defaults in ``roles/gateway/defaults/main.yml`` but they can be overridden in
+the secrets repo::
+
+ gw_f2b_ignoreip: "127.0.0.1/8"
+ gw_f2b_bantime: "43200"
+ gw_f2b_findtime: "600"
+ gw_f2b_maxretry: "5"
+
+``gw_f2b_services`` is a dictionary listing services fail2ban should monitor. Defined in
+``roles/gateway/defaults/main.yml``. See example below::
+
+ gw_f2b_services:
+ sshd:
+ enabled: "true"
+ port: "ssh"
+ logpath: "%(sshd_log)s"
+ apache:
+ enabled: "true"
+ port: "http"
+
+Tags
+++++
+
+packages
+ Install *and update* packages
+
+users
+ Update OpenVPN users list
+
+networking
+ Configure basic networking (NICs, IP forwarding, resolv.conf)
+
+firewall
+ Configure firewalld
+
+**NOTE:** Ansible v2.1 or later is required for the initial firewall setup as the ``masquerade`` parameter is new to that version.
+
+fail2ban
+ Configure fail2ban
+
+Dependencies
+++++++++++++
+
+This role depends on the following roles:
+
+secrets
+ Provides a var, ``secrets_path``, containing the path of the secrets repository, a tree of ansible variable files.
+
+To Do
++++++
+
+- Support installation of new OpenVPN gateway from scratch
+- Generate and pull (to secrets?) CA, keys, and certificates
+
+.. _Sepia: https://ceph.github.io/sepia/
+.. _RHEL: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-Configuring_Static_Routes_in_ifcfg_files#bh-Static_Routes_Using_the_Network-Netmask_Directives_Format
--- /dev/null
+---
+# These defaults are present to allow certain tasks to no-op if a secrets repo
+# hasn't been defined. If you want to override these, do so in the secrets repo
+# itself. We override these in $repo/ansible/inventory/group_vars/gateway.yml
+secrets_repo:
+ name: UNDEFINED
+ url: null
+
+openvpn_server_name: server
+
+openvpn_data_dir: /etc/openvpn/data
+
+gw_allow_http: "true"
+gw_allow_https: "true"
+
+# fail2ban-specific vars
+gw_f2b_ignoreip: "127.0.0.1/8"
+gw_f2b_bantime: "43200" # 12hrs
+gw_f2b_findtime: "600" # 10min
+gw_f2b_maxretry: "5"
+
+gw_f2b_services:
+ sshd:
+ enabled: "true"
+ port: "ssh"
+ logpath: "%(sshd_log)s"
--- /dev/null
+/var/log/openvpn/*.log {
+ daily
+ rotate 90
+ compress
+ missingok
+ copytruncate
+ notifempty
+ create 644 nobody nobody
+}
--- /dev/null
+# Log syslog messages matching 'ovpn-' or 'openvpn' to /var/log/openvpn/openvpn.log
+if $programname startswith 'ovpn-' or $programname startswith 'openvpn' then /var/log/openvpn/openvpn.log
+
+# Stop processing matched logs (don't log them anywhere else)
+if $programname startswith 'ovpn-' or $programname startswith 'openvpn' then stop
--- /dev/null
+---
+# Restart networking
+- name: restart networking
+ service:
+ name: network
+ state: restarted
+
+# Restart fail2ban
+- name: restart fail2ban
+ service:
+ name: fail2ban
+ state: restarted
+
+# Reload fail2ban
+- name: reload fail2ban
+ service:
+ name: fail2ban
+ state: reloaded
+
+# Restart OpenVPN
+- name: restart openvpn
+ service:
+ name: "openvpn@{{ openvpn_server_name }}"
+ state: restarted
+
+# Restart rsyslog
+- name: restart rsyslog
+ service:
+ name: rsyslog
+ state: restarted
--- /dev/null
+---
+dependencies:
+ - role: secrets
--- /dev/null
+---
+- name: Write fail2ban defaults conf file
+ template:
+ src: templates/f2b.jail.local.j2
+ dest: /etc/fail2ban/jail.local
+ notify: restart fail2ban
+
+# Set a var equal to our ansible_managed var since ansible_managed
+# can't be called directly in the next task.
+# See https://github.com/ansible/ansible/issues/11317
+- name: Set f2b_grep_var to ansible_managed string
+ set_fact:
+ f2b_grep_var: "This file is managed by ansible, don't make changes here - they will be overwritten."
+
+# Remove all service files in case a malformed config was previously shipped.
+# Malformed service files cause fail2ban to not start.
+- name: Clean up ansible-written service conf files
+ shell: for file in $(grep -l {{ f2b_grep_var|quote }} /etc/fail2ban/jail.d/*); do rm -vf $file; done
+ register: f2b_rm_out
+
+# Show what files were deleted
+- debug: var=f2b_rm_out.stdout
+
+- name: Write fail2ban service conf files
+ template:
+ src: templates/f2b.service.j2
+ dest: "/etc/fail2ban/jail.d/{{ item.key }}.local"
+ with_dict: "{{ gw_f2b_services }}"
+ notify: reload fail2ban
+
+- name: Make sure fail2ban service is running
+ service:
+ name: fail2ban
+ state: started
+
+- name: Check fail2ban status
+ shell: fail2ban-client status
+ register: fail2ban_status
+
+# Show fail2ban status
+- debug: var=fail2ban_status.stdout_lines
--- /dev/null
+---
+- name: Make sure iptables isn't running
+ service:
+ name: iptables
+ state: stopped
+ enabled: false
+ ignore_errors: true
+
+- name: Make sure firewalld is enabled
+ service:
+ name: firewalld
+ state: started
+ enabled: yes
+
+- name: firewalld | Allow openvpn traffic
+ firewalld:
+ service: openvpn
+ zone: public
+ state: enabled
+ permanent: true
+ immediate: yes
+
+- name: firewalld | Allow http traffic
+ firewalld:
+ service: http
+ zone: public
+ state: enabled
+ permanent: true
+ immediate: yes
+ when: gw_allow_http == "true"
+
+- name: firewalld | Allow https traffic
+  firewalld:
+    service: https
+    zone: public
+    state: enabled
+    permanent: true
+    immediate: yes
+  # Spacing fixed: was `gw_allow_https =="true"`, which only worked by
+  # accident of Jinja tokenization; now matches the http task above.
+  when: gw_allow_https == "true"
+
+# The following two tasks require Ansible v2.1 due to the 'masquerade'
+# and 'interface' parameters being new to that version. They only need to be
+# run the first time the role is run so it's okay for them to be skipped.
+#
+# Use the version() test instead of comparing major/minor separately: the
+# old `major >= 2 and minor >= 1` check wrongly skipped e.g. 3.0, and bare
+# `{{ }}` in `when` is deprecated.
+- name: firewalld | Add connection masquerading
+  firewalld:
+    masquerade: yes
+    zone: public
+    state: enabled
+    permanent: true
+    immediate: yes
+  when: ansible_version.full is version('2.1', '>=')
+
+- name: firewalld | Add tun0 to internal zone
+  firewalld:
+    zone: internal
+    interface: tun0
+    state: enabled
+    permanent: true
+    immediate: yes
+  when: ansible_version.full is version('2.1', '>=')
--- /dev/null
+---
+- name: Create log directory
+ file:
+ path: /var/log/openvpn
+ state: directory
+
+- name: Set log dir SELinux context
+ command: restorecon -R /var/log/openvpn
+
+- name: Write logrotate conf file
+ copy:
+ src: files/openvpn.logrotate
+ dest: /etc/logrotate.d/openvpn
+ notify: restart rsyslog
+
+- name: Write rsyslog conf file
+ copy:
+ src: files/openvpn.rsyslog
+ dest: /etc/rsyslog.d/20-openvpn.conf
+ notify: restart rsyslog
--- /dev/null
+---
+- name: Include secrets
+ include_vars: "{{ secrets_path | mandatory }}/gateway.yml"
+ no_log: true
+ tags:
+ - always
+
+# Install and update system packages
+- import_tasks: packages.yml
+ tags:
+ - packages
+
+# Configure networking
+- import_tasks: network.yml
+ tags:
+ - networking
+
+# Configure firewalld
+- import_tasks: firewall.yml
+ tags:
+ - firewall
+
+# Configure fail2ban
+- import_tasks: fail2ban.yml
+ tags:
+ - fail2ban
+
+- name: Ensure data directory exists
+ file:
+ path: "{{ openvpn_data_dir }}"
+ state: directory
+ mode: 0755
+
+# Manage OpenVPN users list using secrets repo
+- import_tasks: users.yml
+ tags:
+ - users
+
+- name: Write OpenVPN secrets
+ copy:
+ content: "{{ item.data }}"
+ dest: "{{ item.path }}"
+ mode: "{{ item.mode }}"
+ with_items: "{{ gateway_secrets }}"
+ no_log: true
+ notify: restart openvpn
+
+# Configure logging
+- import_tasks: logging.yml
+ tags:
+ - logging
+
+- name: Make sure OpenVPN service is running and enabled
+ service:
+ name: "openvpn@{{ openvpn_server_name }}"
+ state: started
+ enabled: yes
--- /dev/null
+---
+- name: Write ifcfg scripts
+ template:
+ src: ifcfg.j2
+ dest: "/etc/sysconfig/network-scripts/ifcfg-{{ item.value.ifname }}"
+ with_dict: "{{ gw_networks }}"
+ register: interfaces
+
+- name: Write additional routes
+ copy:
+ content: "{{ item.value.routes }}"
+ dest: "/etc/sysconfig/network-scripts/route-{{ item.value.ifname }}"
+ with_dict: "{{ gw_networks }}"
+ when: item.value.routes is defined
+
+# Restart networking right away if changes made. This makes sure
+# the public interface is up and ready for OpenVPN to bind to.
+- name: Restart networking
+ service:
+ name: network
+ state: restarted
+ when: interfaces.changed
+
+- name: Write resolv.conf
+ template:
+ src: resolvconf.j2
+ dest: "/etc/resolv.conf"
+
+- name: Disable IPv6
+ sysctl:
+ name: net.ipv6.conf.all.disable_ipv6
+ value: 1
+ sysctl_set: yes
+ state: present
+ reload: yes
+
+- name: Enable IPv4 forwarding
+ sysctl:
+ name: net.ipv4.ip_forward
+ value: 1
+ sysctl_set: yes
+ state: present
+ reload: yes
--- /dev/null
+---
+- name: Include gateway package list
+ include_vars: packages.yml
+
+- name: Install and update packages
+ yum:
+ name: "{{ packages|list }}"
+ state: latest
+ enablerepo: epel
--- /dev/null
+---
+- name: Populate list of OpenVPN users
+ set_fact:
+ openvpn_users:
+ "{{ admin_users|list + lab_users|list + openvpn_users|list }}"
+
+- name: Update users file
+ template:
+ src: users.j2
+ dest: "{{ openvpn_data_dir }}/users"
+ owner: root
+ group: root
+ mode: 0644
+
+- name: Upload auth-openvpn script
+ template:
+ src: auth-openvpn
+ dest: "{{ openvpn_data_dir }}/auth-openvpn"
+ owner: root
+ group: root
+ mode: 0755
--- /dev/null
+#!/usr/bin/python3
+
+import hashlib
+import logging
+import logging.handlers
+import os
+import re
+import sys
+import time
+
+log = logging.getLogger('auth-openvpn')
+
+def authenticate():
+ # annoy attackers
+ time.sleep(1)
+
+ path = sys.argv[1]
+ with open(path, 'rb') as f:
+ user = f.readline(8192)
+ assert user.endswith(b'\n')
+ user = user[:-1]
+ assert user
+ secret = f.readline(8192)
+ assert secret.endswith(b'\n')
+ secret = secret[:-1]
+ assert secret
+
+ # From openvpn(8):
+ #
+ # To protect against a client passing a maliciously formed username or
+ # password string, the username string must consist only of these
+ # characters: alphanumeric, underbar ('_'), dash ('-'), dot ('.'), or
+ # at ('@'). The password string can consist of any printable
+ # characters except for CR or LF. Any illegal characters in either the
+ # username or password string will be converted to underbar ('_').
+ #
+ # We'll just redo that quickly for usernames, to ensure they are safe.
+
+ user = re.sub(rb'[^a-zA-Z0-9_.@-]', '_', user)
+
+ def find_user(wanted):
+ with open('{{ openvpn_data_dir }}/users', 'rb') as f:
+ for line in f:
+ assert line.endswith(b'\n')
+ line = line[:-1]
+ if line.startswith(b'#') or len(line) == 0:
+ continue
+ (username, salt, correct) = line.split(b' ', 2)
+ if username == wanted:
+ return (salt, correct)
+
+ # these will never match
+ log.error('User not found: %r', wanted)
+ salt = b'not-found'
+ correct = 64*b'x'
+ return (salt, correct)
+
+ (salt, correct) = find_user(user)
+
+ inner = hashlib.new('sha256')
+ inner.update(salt)
+ inner.update(secret)
+ outer = hashlib.new('sha256')
+ outer.update(inner.digest())
+ outer.update(salt)
+ attempt = outer.hexdigest().encode()
+
+ if attempt != correct:
+ log.error('{prog}: invalid auth for user {user!r}.'.format(prog=os.path.basename(sys.argv[0]), user=user))
+ sys.exit(1)
+
+def main():
+ handler = logging.handlers.SysLogHandler(
+ address='/dev/log',
+ facility=logging.handlers.SysLogHandler.LOG_DAEMON,
+ )
+ fmt = logging.Formatter('%(name)s: %(message)s')
+ handler.setFormatter(fmt)
+ logging.basicConfig()
+ root = logging.getLogger('')
+ root.addHandler(handler)
+ log.setLevel(logging.INFO)
+
+ try:
+ authenticate()
+ except SystemExit:
+ raise
+ except:
+ log.exception('Unhandled error: ')
+ raise
+
+if __name__ == '__main__':
+ sys.exit(main())
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+[DEFAULT]
+ignoreip = {{ gw_f2b_ignoreip }}
+bantime = {{ gw_f2b_bantime }}
+findtime = {{ gw_f2b_findtime }}
+maxretry = {{ gw_f2b_maxretry }}
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+[{{ item.key }}]
+enabled = {{ item.value.enabled }}
+port = {{ item.value.port }}
+{% if item.value.logpath is defined %}
+logpath = {{ item.value.logpath }}
+{% endif %}
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+NAME="{{ item.key }}"
+DEVICE="{{ item.value.ifname }}"
+HWADDR="{{ item.value.mac }}"
+NM_CONTROLLED="no"
+ONBOOT="yes"
+BOOTPROTO="static"
+IPADDR="{{ item.value.ip4 }}"
+NETMASK="{{ item.value.netmask }}"
+GATEWAY="{{ item.value.gw4 }}"
+DEFROUTE="{{ item.value.defroute }}"
+
+# Optional values
+{% if item.value.search is defined %}
+SEARCH="{{ item.value.search }}"
+{% endif %}
+{% if item.value.peerdns is defined %}
+PEERDNS="{{ item.value.peerdns }}"
+{% endif %}
+{% if item.value.dns1 is defined %}
+DNS1="{{ item.value.dns1 }}"
+{% endif %}
+{% if item.value.dns2 is defined %}
+DNS2="{{ item.value.dns2 }}"
+{% endif %}
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+search {{ gw_resolv_search }}
+{% for nameserver in gw_resolv_ns %}
+nameserver {{ nameserver }}
+{% endfor %}
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+{% for user in openvpn_users %}
+{{ user.ovpn }}
+{% endfor %}
--- /dev/null
+---
+# Package set installed on the lab gateway host.  network-scripts is needed
+# because the ifcfg template in this role disables NetworkManager control.
+packages:
+  ## misc tools
+  - vim
+  - wget
+  - mlocate
+  - ipmitool
+  - git
+  - fail2ban
+  - fail2ban-firewalld
+  - network-scripts
+  ## VPN-specific stuff
+  - openvpn
+  - easy-rsa
+  ## monitoring
+  - nrpe
+  - nagios-plugins-all
--- /dev/null
+---
+# Defaults for the grafana-agent role.  The matching password
+# (agent_mimir_password) is loaded from the secrets repo at runtime.
+# Mimir URL and creds
+agent_mimir_url: "http://sepia-grafana.front.sepia.ceph.com:9009/api/v1/push"
+agent_mimir_username: "admin"
+# Upstream Grafana package repositories (apt and rpm variants).
+grafana_apt_repo_url: "https://apt.grafana.com"
+grafana_apt_repo_key_url: "https://apt.grafana.com/gpg.key"
+grafana_rpm_repo_url: "https://rpm.grafana.com"
+grafana_rpm_repo_key_url: "https://rpm.grafana.com/gpg.key"
+
+# Scrape intervals fed into grafana-agent.yaml.j2.
+scrape_interval_global: "60s"
+scrape_interval_node: "30s"
+
+# Selinux packages
+useradd_selinux_packages:
+  - policycoreutils
+  - checkpolicy
--- /dev/null
+# Local SELinux policy module: permit useradd(8) (useradd_t) to create and
+# modify files labelled var_lib_t.  Compiled and loaded by
+# tasks/useradd-selinux.yml in the grafana-agent role.
+module customuseradd 1.0;
+
+require {
+	type useradd_t;
+	type var_lib_t;
+	class file { execute read create write getattr setattr open };
+}
+
+#============= useradd_t ==============
+
+allow useradd_t var_lib_t:file { write create open setattr getattr };
--- /dev/null
+---
+# Restart the grafana-agent service; notified by "Configure agent" in
+# tasks/main.yml whenever /etc/grafana-agent.yaml changes.
+- name: "Restart grafana agent instance"
+  become: true
+  ansible.builtin.service:
+    name: "grafana-agent"
+    state: "restarted"
--- /dev/null
+---
+# The secrets role must run first so secrets_path is available to main.yml.
+dependencies:
+  - role: secrets
--- /dev/null
+---
+# Install and configure Grafana Agent to ship node metrics to Mimir.
+- name: Include secrets
+  include_vars: "{{ secrets_path | mandatory }}/mimir_password.yml"
+  no_log: true
+  tags:
+    - always
+
+- name: Gather facts on listening ports
+  community.general.listen_ports_facts:
+
+# Resolving selinux conflicts
+- import_tasks: useradd-selinux.yml
+  when: ansible_os_family == "RedHat"
+
+# grafana-agent serves metrics on 9090 by default, so fail early when a local
+# prometheus already owns that port (loop is empty otherwise, so the
+# failed_when never fires).
+- name: Check if prometheus is listening on port 9090
+  ansible.builtin.debug:
+    msg: The {{ item.name }} service - pid {{ item.pid }} is running on same port as grafana-agent please set {{ item.name }} to listen on a different port than {{ item.port }}
+  vars:
+    tcp_listen_violations: "{{ ansible_facts.tcp_listen | selectattr('name', 'in', tcp_whitelist) | list }}"
+    tcp_whitelist:
+      - prometheus
+  loop: "{{ tcp_listen_violations }}"
+  failed_when: true
+
+- name: "Ensure that path /etc/apt/keyrings exists"
+  become: true
+  ansible.builtin.file:
+    path: /etc/apt/keyrings
+    state: directory
+    mode: '0755'
+    force: true
+  when: ansible_pkg_mgr == "apt"
+
+- name: "Import Grafana GPG key"
+  become: true
+  ansible.builtin.get_url:
+    url: "{{ grafana_apt_repo_key_url }}"
+    dest: /etc/apt/keyrings/grafana.gpg
+    mode: '0644'
+    force: true
+  when: ansible_pkg_mgr == "apt"
+
+# The downloaded key is ASCII-armored but apt's signed-by expects a binary
+# keyring.  Dearmor into a temp file and move it into place: the previous
+# "cat FILE | gpg --dearmor | tee FILE" truncated the file while it was
+# still being read, and emptied it on re-runs (dearmoring already-binary
+# input fails).
+- name: Ensure downloaded file for key is a binary keyring
+  become: true
+  shell: |
+    set -e
+    if gpg --dearmor < /etc/apt/keyrings/grafana.gpg > /tmp/grafana.gpg.bin 2>/dev/null; then
+        mv /tmp/grafana.gpg.bin /etc/apt/keyrings/grafana.gpg
+    else
+        rm -f /tmp/grafana.gpg.bin
+    fi
+  when: ansible_pkg_mgr == "apt"
+
+- name: "Add Grafana's repository to APT sources list"
+  become: true
+  ansible.builtin.apt_repository:
+    repo: "deb [signed-by=/etc/apt/keyrings/grafana.gpg] {{ grafana_apt_repo_url }} stable main"
+    state: present
+  when: ansible_pkg_mgr == "apt"
+
+- name: "Add Grafana's repository to yum/dnf systems"
+  become: true
+  ansible.builtin.yum_repository:
+    baseurl: "{{ grafana_rpm_repo_url }}"
+    name: "grafana"
+    description: "grafana"
+    gpgcheck: true
+    gpgkey: "{{ grafana_rpm_repo_key_url }}"
+    state: present
+  when: ansible_os_family == "RedHat"
+
+- name: "Install grafana-agent"
+  become: true
+  ansible.builtin.package:
+    name: "grafana-agent"
+    state: "present"
+
+- name: "Enable grafana-agent"
+  become: true
+  ansible.builtin.service:
+    name: "grafana-agent"
+    state: "started"
+    enabled: true
+
+# Deploy config file from template and restart the agent
+- name: "Configure agent"
+  become: true
+  ansible.builtin.template:
+    src: "templates/grafana-agent.yaml.j2"
+    dest: "/etc/grafana-agent.yaml"
+    mode: "0440"
+    owner: "root"
+    group: "grafana-agent"
+  notify: "Restart grafana agent instance"
--- /dev/null
+---
+# Build and load a local SELinux policy module allowing useradd_t to write
+# var_lib_t files (see files/grafana/customuseradd.te), then verify it and
+# clean up the build artifacts.
+- name: useradd - Install SELinux dependencies
+  package:
+    name: "{{ useradd_selinux_packages | list }}"
+    state: present
+
+# ignore_errors in case we don't have any repos
+- name: useradd - Ensure SELinux policy is up to date
+  package:
+    name: selinux-policy-targeted
+    state: latest
+  ignore_errors: true
+
+- name: useradd - Copy SELinux type enforcement file
+  copy:
+    src: grafana/customuseradd.te
+    dest: /tmp/customuseradd.te
+    mode: "0644"
+
+- name: useradd - Compile SELinux module file
+  command: checkmodule -M -m -o /tmp/customuseradd.mod /tmp/customuseradd.te
+
+- name: useradd - Build SELinux policy package
+  command: semodule_package -o /tmp/customuseradd.pp -m /tmp/customuseradd.mod
+
+- name: useradd - Load SELinux policy package
+  command: semodule -i /tmp/customuseradd.pp
+
+# The file module does not expand wildcards, so each temp file is listed
+# explicitly (the previous path "/tmp/customuseradd.*" removed nothing).
+- name: useradd - Remove temporary files
+  file:
+    path: "{{ item }}"
+    state: absent
+  loop:
+    - /tmp/customuseradd.te
+    - /tmp/customuseradd.mod
+    - /tmp/customuseradd.pp
+
+- name: Verify SELinux module is installed
+  command: semodule -l
+  register: semodule_list
+  changed_when: false
+  failed_when: "'customuseradd' not in semodule_list.stdout"
+
--- /dev/null
+# Grafana Agent (static mode) configuration.  All Jinja-templated scalars are
+# quoted so empty or boolean/number-looking expansions cannot change the YAML
+# type the agent parses.
+server:
+  log_level: info
+
+metrics:
+  global:
+    remote_write:
+      - url: "{{ agent_mimir_url }}"
+        basic_auth:
+          username: "{{ agent_mimir_username }}"
+          password: "{{ agent_mimir_password }}"
+        queue_config:
+          max_backoff: 5m
+    external_labels:
+      nodetype: unknown_nodetype
+      ingest_instance: "{{ inventory_hostname }}"
+    scrape_interval: "{{ scrape_interval_global }}"
+  configs:
+    - name: "{{ inventory_hostname }}"
+      scrape_configs:
+        - job_name: 'grafana-agent-exporter'
+          relabel_configs:
+            - source_labels: [__address__]
+              target_label: instance
+              replacement: "{{ inventory_hostname }}"
+
+integrations:
+  node_exporter:
+    enabled: true
+    scrape_interval: "{{ scrape_interval_node }}"
+    instance: "{{ inventory_hostname }}"
+    rootfs_path: /
+    sysfs_path: /sys
+    procfs_path: /proc
--- /dev/null
+---
+# Because of the high debug level enabled for LRC daemons, the root drives
+# fill up rather quickly. The drives fill up before the daily logrotate can
+# run so we rotate every 6 hours and keep 3 days worth. This can be adjusted
+# as needed.
+
+# On LRC hosts (lrc_fsid set) cephadm already ships a logrotate config named
+# after the fsid, so the custom config is only written elsewhere.
+- name: "Write custom ceph logrotate config"
+  template:
+    src: ceph-common.logrotate
+    dest: /etc/logrotate.d/cm-ansible-ceph-common
+  when: lrc_fsid is not defined
+
+# Force a rotation four times a day; the -f target is the cephadm config on
+# LRC hosts and the custom config above everywhere else.
+- name: "Create cronjob to logrotate every 6 hours"
+  cron:
+    name: "Logrotate ceph logs every 6 hours"
+    minute: "25"
+    hour: "0,6,12,18"
+    job: "/usr/sbin/logrotate -f /etc/logrotate.d/{{ lrc_fsid|default('cm-ansible-ceph-common') }}"
+    user: root
--- /dev/null
+---
+# We only need to install nagios checks on MON nodes
+# A non-zero rc (no running ceph-mon unit for this host) simply skips the
+# nagios tasks; ignore_errors keeps the play alive on non-MON hosts.
+- name: Check if MON node
+  command: "systemctl status ceph-mon@{{ ansible_hostname }}"
+  ignore_errors: true
+  changed_when: false
+  register: mon_service_status
+
+- import_tasks: nagios.yml
+  when: mon_service_status.rc == 0
+
+# Logrotate setup applies to every host and can be run alone via the tag.
+- import_tasks: logrotate.yml
+  tags:
+    - logrotate
--- /dev/null
+---
+# Install ceph-nagios-plugins and the NRPE command definitions used to
+# monitor ceph health and capacity from the nagios server.
+- name: Clone ceph-nagios-plugins on MON nodes
+  git:
+    repo: https://github.com/ceph/ceph-nagios-plugins.git
+    dest: "{{ nagios_plugins_directory }}/ceph-nagios-plugins"
+    update: yes
+
+# Build from the clone destination above (this previously cd'd into a
+# hard-coded /tmp/ceph-nagios-plugins, which is not where the repo is
+# cloned).
+- name: Make install ceph-nagios-plugins
+  shell: "cd {{ nagios_plugins_directory }}/ceph-nagios-plugins && make libdir={{ nagios_plugins_directory|replace('/nagios/plugins', '') }} install"
+
+- name: Check for nagios ceph keyring
+  stat:
+    path: /etc/ceph/client.nagios.keyring
+  register: nagios_keyring
+
+# Create a read-only ceph client for the checks; only once, so an existing
+# keyring is never clobbered.
+- name: Create nagios ceph keyring
+  shell: "ceph auth get-or-create client.nagios mon 'allow r' > /etc/ceph/client.nagios.keyring && chown ceph:ceph /etc/ceph/client.nagios.keyring"
+  when: not nagios_keyring.stat.exists
+
+- name: Write nrpe config for ceph health checks
+  lineinfile:
+    dest: /etc/nagios/nrpe_local.cfg
+    regexp: '.*check_ceph_health.*'
+    line: "command[check_ceph_health]={{ nagios_plugins_directory }}/check_ceph_health --name client.nagios -k /etc/ceph/client.nagios.keyring --whitelist 'failing to respond to cache pressure|requests are blocked'"
+    state: present
+    create: yes
+  notify: restart nagios-nrpe-server
+
+- name: Write nrpe config for ceph cluster capacity
+  lineinfile:
+    dest: /etc/nagios/nrpe_local.cfg
+    regexp: '.*check_ceph_df.*'
+    line: "command[check_ceph_df]={{ nagios_plugins_directory }}/check_ceph_df --name client.nagios -k /etc/ceph/client.nagios.keyring --pool data --warn 90 --critical 95"
+    state: present
+    create: yes
+  notify: restart nagios-nrpe-server
--- /dev/null
+# {{ ansible_managed }}
+# Keep 6 rotations of every ceph log; signal 1 (SIGHUP) makes the daemons
+# reopen their log files after rotation (pkill as a fallback when killall is
+# unavailable).
+/var/log/ceph/*.log {
+    rotate 6
+    compress
+    sharedscripts
+    postrotate
+        killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw || pkill -1 -x "ceph-mon|ceph-mgr|ceph-mds|ceph-osd|ceph-fuse|radosgw" || true
+    endscript
+    missingok
+    notifempty
+    su root ceph
+}
--- /dev/null
+# Ansible Playbook: MAAS Installation and Configuration
+
+This Ansible playbook automates the installation and initial configuration of [MAAS (Metal as a Service)](https://maas.io/) on Ubuntu-based systems.
+
+## Features
+
+- Installs MAAS packages
+- Initializes MAAS with a default user with High Availability
+- Configures networking (DHCP, DNS, etc.)
+- Adds Machines from inventory into MAAS
+
+## Requirements
+
+- Ansible 2.10+
+- Ubuntu 20.04 or later on the target system(s)
+- Sudo access on target host
+- Internet access (for downloading MAAS packages and images)
+- At least 2 Nodes to deploy MAAS with High Availability
+
+## Inventory
+
+Define your inventory in `hosts.ini` with the following structure:
+
+```ini
+[maas_region_rack_server]
+test1 ip=172.x.x.x ipmi=10.0.8.x mac=08:00:27:ed:43:x
+
+[maas_rack_server]
+test2 ip=172.x.x.x ipmi=10.0.8.x mac=08:00:27:ed:43:x
+
+[maas_db_server]
+test1 ip=172.x.x.x ipmi=10.0.8.x mac=08:00:27:ed:43:x
+```
+
+You can do this installation with 3 or 2 nodes depending on your needs.
+If you want to use a dedicated DB server you can just put it in the maas_db_server group, use a different server in maas_region_rack_server and another in maas_rack_server.
+Or, if you want to simplify and you don't mind using your MAAS server as the DB server too, you can use the same node in maas_db_server and in maas_region_rack_server; as they are different services using different ports, they can be installed on the same node. This way you use only 2 nodes for the installation: the db+region+rack server and the secondary rack for high availability.
+
+The systems you want to add into MAAS should be in a group called [testnodes] with the same structure.
+
+## Variables
+
+You can configure the playbook via group_vars/maas.yml in the secret repo or defaults/main.yml. Common variables include:
+maas_admin_username: "admin"
+maas_admin_password: "adminpass"
+maas_admin_email: "admin@example.com"
+maas_db_name: "maasdb"
+maas_db_user: "maas"
+maas_db_password: "maaspassword"
+maas_version: "3.5"
+
+NTP variables include:
+maas_ntp_servers: "ntp.ubuntu.com" # NTP servers, specified as IP addresses or hostnames delimited by commas and/or spaces, to be used as time references for MAAS itself, the machines MAAS deploys, and devices that make use of MAAS's DHCP services. MAAS uses ntp.ubuntu.com by default. You can put a single server or multiple servers.
+maas_ntp_external_only: "false" # Configure all region controller hosts, rack controller hosts, and subsequently deployed machines to refer directly to the configured external NTP servers. Otherwise only region controller hosts will be configured to use those external NTP servers, rack controller hosts will in turn refer to the regions' NTP servers, and deployed machines will refer to the racks' NTP servers. The value of this variable can be true or false.
+
+DNS variables include:
+dns_domains: # This is the list of domains you want to create, in this case we have 2 domains, but you can list here all the domains you need.
+ - ceph: Static primary domain (e.g., `front.sepia.ceph.com`).
+ - ipmi: Static IPMI domain (`ipmi.sepia.ceph.com`).
+default_domains: List of domains to preserve/ignore (default: `["maas"]`). The default domain is a DNS domain that is used by maas when you deploy a machine it is used by maas for internal dns records so we choose to exclude it from our ansible role.
+
+DHCP variables include:
+dhcp_maas_global:
+ - ddns-update-style: none
+ - default-lease-time: 43200
+ - max-lease-time: 172800
+ - one-lease-per-client: "true"
+
+This list will be used to populate the global DHCP snippet. You can add additional keys and values. Just make sure they follow the syntax required for dhcpd.conf.
+The global configuration is optional, so you can just remove the elements of the list if you do not need them.
+
+dhcp_maas_subnets: #This is a list of dictionaries, you can list here all the subnets you want to configure and use any name you want in this case we use front and back but you can include here any other or change the names.
+ front:
+ cidr: 10.0.8.0/24
+ ipvar: ip
+ macvar: mac
+ start_ip: 10.0.8.10
+ end_ip: 10.0.8.20
+ ip_range_type: dynamic
+ classes:
+ virtual: "match if substring(hardware, 0, 4) = 01:52:54:00"
+ lxc: "match if substring(hardware, 0, 4) = 01:52:54:ff"
+ pools:
+ virtual:
+ range: 172.21.10.20 172.21.10.250
+ unknown_clients:
+ range:
+ - 172.21.11.0 172.21.11.19
+ - 172.21.13.170 172.21.13.250
+ lxc:
+ range: 172.21.14.1 172.21.14.200
+ back:
+ cidr: 172.21.16.0/20
+ ipvar: back
+ macvar: backmac
+ start_ip: 172.21.16.10
+ end_ip: 172.21.16.20
+ ip_range_type: dynamic
+
+This is a large dictionary that gets parsed out into individual snippet files. Each top-level key (front and back in the example) will get its own snippet file created.
+
+Under each subnet, cidr, ipvar, and macvar are required. ipvar and macvar tell the Jinja2 template which IP address and MAC address should be used for each host in each subnet snippet, the value of these variables should be the name of the variable that holds the ip address and mac address, respectively (for hosts that have more than one interface). That is, you might have "ipfront=1.2.3.4 ipback=5.6.7.8", and for the front subnet, 'ipvar' would be set to 'ipfront', and for the back network, 'ipvar' would be set to 'ipback', if those variables are not defined in the inventory then that host will not be included into the subnet configuration.
+
+Here's a line from our Ansible inventory host file
+
+smithi001.front.sepia.ceph.com mac=0C:C4:7A:BD:15:E8 ip=172.21.15.1 ipmi=172.21.47.1 bmc=0C:C4:7A:6E:21:A7
+
+This will result in a static lease for smithi001-front with IP 172.21.15.1 and MAC 0C:C4:7A:BD:15:E8 in front_hosts snippet and a smithi001-ipmi entry with IP 172.21.47.1 with MAC 0C:C4:7A:6E:21:A7 in ipmi_hosts snippet.
+
+start_ip, end_ip and ip_range_type are required too in order to create an IP range. MAAS needs a range in order to enable DHCP on the subnet. In this case the ip_range_type is configured as dynamic, it could be dynamic or static.
+
+The classes are optional; they are groups of DHCP clients defined by specific criteria, allowing the possibility to apply custom DHCP options or behaviors to those groups. This enables more granular control over how DHCP services are delivered to different client types, like assigning specific IP addresses or configuring other network parameters based on device type or other characteristics. In this case we have virtual and lxc, but you can include here any group you want with any name. In our specific case we are including in these groups hosts that match specific MAC address criteria.
+
+The pools are optional too, they are ranges of IP addresses that a DHCP server uses to automatically assign to DHCP clients on a network. These addresses are dynamically allocated, meaning they are leased to clients for a specific duration and can be reclaimed when no longer in use. DHCP pools allow for efficient IP address management and are essential for networks where devices are frequently added or moved. In the example above we are using pools to assign IPs to the classes we just defined and to the unknown_clients which are servers that are not defined into the DHCP config file.
+
+## Usage
+
+1. Clone the repository:
+
+git clone https://github.com/ceph/ceph-cm-ansible.git
+cd ceph-cm-ansible
+
+2. Update inventory and variables.
+
+3. Run the playbook:
+
+ansible-playbook maas.yml
+
+## Role Structure
+
+maas
+ ├── defaults
+ │ └── main.yml
+ ├── meta
+ │ └── main.yml
+ ├── README.md
+ ├── tasks
+ │ ├── add_machines.yml
+ │ ├── config_dhcpd_subnet.yml
+ │ ├── config_dns.yml
+ │ ├── config_ntp.yml
+ │ ├── initialize_region_rack.yml
+ │ ├── initialize_secondary_rack.yml
+ │ ├── install_maasdb.yml
+ │ └── main.yml
+ └── templates
+ ├── dhcpd.classes.snippet.j2
+ ├── dhcpd.global.snippet.j2
+ ├── dhcpd.hosts.snippet.j2
+ └── dhcpd.pools.snippet.j2
+
+## Tags
+
+- install_maas #Install MAAS and postgreSQL only and initializes the region+rack server and the secondary rack.
+- add-machines #Add Machines to MAAS only if they are not already present.
+- config_dhcp #Configures DHCP options only if there are any changes in the DHCP variables.
+- config_dns #Configure DNS domains and add the DNS Records that are not currently into a domain.
--- /dev/null
+---
+# Role defaults for the MAAS role; most are overridden by group_vars in the
+# secrets repo (see README.md).
+# MAAS user and database variables
+maas_admin_username: "admin"
+maas_db_name: "maasdb"
+maas_db_user: "maas"
+postgres_version: "16"
+
+#General variables
+maas_version: "3.6"
+# "apt" or "snap"; several task files branch on this.
+maas_install_method: "apt"
+maas_home_dir: "/home/ubuntu/maas"
+global_kernel_opt: "console=tty0 console=ttyS1,115200"
+
+# DNS variables
+# Domains MAAS manages internally; never deleted by config_dns.yml.
+default_domains:
+  - "maas"
+
+maas_dns_domains:
+  ceph: "front.sepia.ceph.com"
+  ipmi: "ipmi.sepia.ceph.com"
+
+# NTP variables
+# NOTE(review): kept as the string "false" -- presumably passed verbatim to
+# the MAAS CLI; confirm before switching to a YAML boolean.
+maas_ntp_servers: "ntp.ubuntu.com"
+maas_ntp_external_only: "false"
+
+# Users variables
+keys_repo: "https://github.com/ceph/keys"
+keys_branch: main
+keys_repo_path: "~/.cache/src/keys"
+
+# Should MAAS mark machines broken in order to update their network interface configurations in MAAS?
+maas_force_machine_update: false
+
+# Override in secrets
+maas_ipmi_username: ADMIN
+maas_ipmi_password: ADMIN
--- /dev/null
+---
+# Handlers chained on the "Rebuild MAAS machine indexes" topic: refresh the
+# OAuth header, re-read the machine list, then rebuild the lookup indexes.
+# Handlers listening on the same topic run in file order.
+- include_tasks: _auth_header.yml
+  listen: "Rebuild MAAS machine indexes"
+
+- name: Read machines from MAAS (handler)
+  listen: "Rebuild MAAS machine indexes"
+  include_tasks: machines/_read_machines.yml
+
+- name: Build machine indexes (handler)
+  listen: "Rebuild MAAS machine indexes"
+  include_tasks: machines/_build_indexes.yml
--- /dev/null
+---
+# The secrets role must run first so MAAS credentials are available.
+dependencies:
+  - role: secrets
--- /dev/null
+---
+# Build a FRESH OAuth header using the pre-encoded pieces from the pretasks.
+# Requires: maas_ck_enc, maas_tk_enc, maas_sig_enc (set in api_auth_pretasks.yml)
+
+- name: Build OAuth header (fresh nonce/timestamp)
+  vars:
+    # A new nonce and unix timestamp are generated on every invocation.
+    _nonce: "{{ lookup('community.general.random_string', length=24, upper=false, special=false) }}"
+    _ts: "{{ lookup('pipe', 'date +%s') }}"
+  set_fact:
+    maas_auth_header: >-
+      OAuth oauth_version="1.0",
+      oauth_signature_method="PLAINTEXT",
+      oauth_consumer_key="{{ maas_ck_enc }}",
+      oauth_token="{{ maas_tk_enc }}",
+      oauth_signature="{{ maas_sig_enc }}",
+      oauth_nonce="{{ _nonce | urlencode }}",
+      oauth_timestamp="{{ _ts }}"
+# NOTE(review): no_log is currently disabled; the header embeds API
+# credentials, so consider re-enabling it once debugging is done.
+# no_log: true
--- /dev/null
+---
+# Create MAAS admin accounts from inventory vars and upload their SSH public
+# keys (from the keys repo, or from a literal "key" var per user).
+- name: Add all users from inventory variables to MAAS
+  when: inventory_hostname in groups['maas_region_rack_server']
+  tags: add_users
+  block:
+    - name: Get existing users in MAAS
+      command: "maas {{ maas_admin_username }} users read"
+      register: existing_users
+
+    - name: Extract existing usernames
+      set_fact:
+        existing_usernames: "{{ existing_users.stdout | from_json | map(attribute='username') | list }}"
+
+    # NOTE(review): every new account gets the temporary password
+    # "<name>temp" -- presumably users are expected to change it; confirm.
+    - name: Create all admin users.
+      command: "maas {{ maas_admin_username }} users create username={{ item.name }} email={{ item.email }} password={{ item.name}}temp is_superuser=1"
+      with_items: "{{ admin_users }}"
+      when: item.name not in existing_usernames
+
+    # lab_users is intentionally commented out for now; only admins get keys.
+    - name: Merge admin_users and lab_users
+      set_fact:
+        pubkey_users: "{{ admin_users|list }}" #+ lab_users|list }}"
+
+    # The keys repo is cloned on the control node (local_action +
+    # connection: local) so the file lookup below can read the .pub files.
+    - name: Clone the keys repo
+      local_action:
+        module: git
+        repo: "{{ keys_repo }}"
+        version: "{{ keys_branch }}"
+        force: yes
+        dest: "{{ keys_repo_path }}"
+      become: false
+      when: keys_repo is defined
+      connection: local
+      run_once: true
+      register: clone_keys
+      until: clone_keys is success
+      retries: 5
+      delay: 10
+
+    - name: Update authorized_keys using the keys repo
+      vars:
+        user: "{{ item.name }}"
+        key: "{{ lookup('file', keys_repo_path + '/ssh/' + item.name + '.pub') }}"
+      command: "maas {{ maas_admin_username }} sshkeys create user={{ user }} key='{{ key }}'"
+      with_items: "{{ pubkey_users }}"
+      when: item.key is undefined and keys_repo is defined
+
+    # Users carrying a literal public key in their "key" var bypass the repo.
+    - name: Update authorized_keys for each user with literal keys
+      vars:
+        user: "{{ item.name }}"
+        key: "{{ item.key }}"
+      command: "maas {{ maas_admin_username }} sshkeys create user={{ user }} key='{{ key }}'"
+      with_items: "{{ pubkey_users }}"
+      when: item.key is defined
--- /dev/null
+---
+# Parse the MAAS API key ONCE and pre-encode the static OAuth pieces.
+
+- name: Bail if no MAAS key
+  assert:
+    that:
+      - maas_api_key is defined
+      - (maas_api_key | length) > 0
+    fail_msg: "maas_api_key not available."
+
+# Split key: <consumer_key>:<token_key>:<token_secret>
+- name: Parse MAAS API key once
+  set_fact:
+    maas_ck_raw: "{{ (maas_api_key.split(':'))[0] }}"
+    maas_tk_raw: "{{ (maas_api_key.split(':'))[1] }}"
+    maas_ts_raw: "{{ (maas_api_key.split(':'))[2] }}"
+
+# Pre-encode static values used in every header.  The PLAINTEXT OAuth
+# signature is "<consumer_secret>&<token_secret>" with an empty consumer
+# secret, hence the leading "&" before the token secret.
+- name: Pre-encode OAuth static pieces
+  set_fact:
+    maas_ck_enc: "{{ maas_ck_raw | urlencode }}"
+    maas_tk_enc: "{{ maas_tk_raw | urlencode }}"
+    maas_sig_enc: "{{ ('&' ~ maas_ts_raw) | urlencode }}"
--- /dev/null
+---
+- name: Configure MAAS DHCP
+ when: inventory_hostname in groups['maas_region_rack_server']
+ tags: config_dhcp
+ block:
+ # This section enables DHCP on the subnets included into the secrets repo group_vars and creates an IP range for them
+ - name: Read maas ipranges
+ command: "maas {{ maas_admin_username }} ipranges read"
+ register: ip_ranges_raw
+
+ - name: Parse IP range JSON
+ set_fact:
+ existing_start_ips: "{{ ip_ranges_raw.stdout | from_json | map(attribute='start_ip') | list }}"
+ existing_end_ips: "{{ ip_ranges_raw.stdout | from_json | map(attribute='end_ip') | list }}"
+
+# - name: Create IP Range for {{ subnet_name }} subnet
+# command: "maas {{ maas_admin_username }} ipranges create type={{ subnet_data.ip_range_type }} start_ip={{ subnet_data.start_ip }} end_ip={{ subnet_data.end_ip }}"
+# when: subnet_data.start_ip not in existing_start_ips and subnet_data.end_ip not in existing_end_ips
+
+ - name: Read maas subnet information
+ command: "maas {{ maas_admin_username }} subnet read {{ subnet_data.cidr }}"
+ register: subnet_info
+
+ - name: Define subnet variables
+ set_fact:
+ fabric_name: "{{ (subnet_info.stdout | from_json).vlan.fabric }}"
+ vlan_vid: "{{ (subnet_info.stdout | from_json).vlan.vid }}"
+ vlan_id: "{{ (subnet_info.stdout | from_json).id }}"
+
+ - name: Enable DHCP on {{ subnet_name }} subnet
+ #command: "maas {{ maas_admin_username }} vlan update {{ fabric_name }} {{ vlan_vid }} dhcp_on=True primary_rack={{ groups['maas_region_rack_server'][0].split('.')[0] }} secondary_rack={{ groups['maas_rack_server'][0].split('.')[0] }}"
+ command: "maas {{ maas_admin_username }} vlan update {{ fabric_name }} {{ vlan_vid }} dhcp_on=True"
+
+ # This section creates the directory where the snippets are going to be copied
+
+ - name: Define snippets path
+ set_fact:
+ snippets_path: "{{ '/var/snap/maas/common/maas/dhcp/snippets' if maas_install_method == 'snap' else '/var/lib/maas/dhcp/snippets' }}"
+
+ - name: Create snippets directory
+ file:
+ path: "{{ snippets_path }}"
+ state: directory
+ mode: '0755'
+ register: snippets_directory
+ failed_when: snippets_directory.failed == true
+
+ # This section verifies if the snippets already exist and creates the name variables
+ - name: Get current snippet names
+ command: bash -c "maas {{ maas_admin_username }} dhcpsnippets read"
+ register: current_snippets
+
+ - name: Parse snippet names JSON
+ set_fact:
+ existing_snippets: "{{ current_snippets.stdout | from_json | map(attribute='name') | list }}"
+
+ - name: Define snippet name variables
+ set_fact:
+ global_snippet: "global_dhcp"
+ classes_snippet: "{{ subnet_name }}_classes"
+ pools_snippet: "{{ subnet_name }}_pools"
+ hosts_snippet: "{{ subnet_name }}_hosts"
+
+ # This section copies the snippets
+
+ - name: Copy global DHCP snippet
+ template:
+ src: dhcpd.global.snippet.j2
+ dest: "{{ snippets_path }}/global_dhcp_snippet"
+ register: dhcp_global_config
+
+ - name: Copy {{ subnet_name }} subnet classes snippet
+ template:
+ src: dhcpd.classes.snippet.j2
+ dest: "{{ snippets_path }}/{{ subnet_name }}_classes_snippet"
+ when: subnet_data.classes is defined
+ register: dhcp_classes_config
+
+ - name: Copy {{ subnet_name }} subnet pools snippet
+ template:
+ src: dhcpd.pools.snippet.j2
+ dest: "{{ snippets_path }}/{{ subnet_name }}_pools_snippet"
+ when: subnet_data.pools is defined
+ register: dhcp_pools_config
+
+ - name: Copy {{ subnet_name }} subnet hosts snippet
+ template:
+ src: dhcpd.hosts.snippet.j2
+ dest: "{{ snippets_path }}/{{ subnet_name }}_hosts_snippet"
+ register: dhcp_hosts_config
+
+    # NOTE(review): a leftover debug task ("pause: minutes: 500") was removed
+    # here; it blocked every run of this task file for over eight hours.
+
+ # This section decodes the snippet files and creates the variables to add them into MAAS
+
+ - name: Slurp global DHCP file content
+ slurp:
+ src: "{{ snippets_path }}/global_dhcp_snippet"
+ when: dhcp_global_config.failed == false
+ register: global_file
+
+ - name: Decode global DHCP file content
+ set_fact:
+ global_content: "{{ global_file.content | b64decode }}"
+ when: dhcp_global_config.failed == false
+
+ - name: Slurp {{ subnet_name }} classes file content
+ slurp:
+ src: "{{ snippets_path }}/{{ subnet_name }}_classes_snippet"
+ when: subnet_data.classes is defined and dhcp_classes_config.failed == false
+ register: classes_file
+
+ - name: Decode {{ subnet_name }} classes file content
+ set_fact:
+ classes_content: "{{ classes_file.content | b64decode }}"
+ when: subnet_data.classes is defined and dhcp_classes_config.failed == false
+
+ - name: Slurp {{ subnet_name }} pools file content
+ slurp:
+ src: "{{ snippets_path }}/{{ subnet_name }}_pools_snippet"
+ when: subnet_data.pools is defined and dhcp_pools_config.failed == false
+ register: pools_file
+
+ - name: Decode {{ subnet_name }} pools file content
+ set_fact:
+ pools_content: "{{ pools_file.content | b64decode }}"
+ when: subnet_data.pools is defined and dhcp_pools_config.failed == false
+
+ - name: Slurp {{ subnet_name }} hosts file content
+ slurp:
+ src: "{{ snippets_path }}/{{ subnet_name }}_hosts_snippet"
+ register: hosts_file
+
+ - name: Decode {{ subnet_name }} hosts file content
+ set_fact:
+ hosts_content: "{{ hosts_file.content | b64decode }}"
+
+ # This section deletes the snippets if already exist
+
+ - name: Delete global DHCP snippet if already exists
+ command: "maas {{ maas_admin_username }} dhcpsnippet delete {{ global_snippet }}"
+ when: dhcp_global_config.changed == true and global_snippet in existing_snippets
+
+ - name: Delete {{ subnet_name }} subnet classes snippet if already exists
+ command: "maas {{ maas_admin_username }} dhcpsnippet delete {{ classes_snippet }}"
+ when: subnet_data.classes is defined and dhcp_classes_config.changed == true and classes_snippet in existing_snippets
+
+ - name: Delete {{ subnet_name }} subnet pools snippet if already exists
+ command: "maas {{ maas_admin_username }} dhcpsnippet delete {{ pools_snippet }}"
+ when: subnet_data.pools is defined and dhcp_pools_config.changed == true and pools_snippet in existing_snippets
+
+ - name: Delete {{ subnet_name }} subnet hosts snippet if already exists
+ command: "maas {{ maas_admin_username }} dhcpsnippet delete {{ hosts_snippet }}"
+ when: dhcp_hosts_config.changed == true and hosts_snippet in existing_snippets
+
+ # This section adds snippets into MAAS
+
+ - name: Add global DHCP snippet into MAAS
+ command: "maas {{ maas_admin_username }} dhcpsnippets create name='{{ global_snippet }}' value='{{ global_content }}' description='This snippet configures the global DHCP options' global_snippet=true"
+ when: dhcp_global_config.failed == false and dhcp_global_config.changed == true
+
+ - name: Add {{ subnet_name }} classes snippet into MAAS
+ command: "maas {{ maas_admin_username }} dhcpsnippets create name='{{ classes_snippet }}' value='{{ classes_content }}' description='This snippet configures the classes in {{ subnet_name }} subnet' subnet='{{ vlan_id }}'"
+ when: subnet_data.classes is defined and dhcp_classes_config.failed == false and dhcp_classes_config.changed == true
+
+ - name: Add {{ subnet_name }} pools snippet into MAAS
+ command: "maas {{ maas_admin_username }} dhcpsnippets create name='{{ pools_snippet }}' value='{{ pools_content }}' description='This snippet configures the pools in {{ subnet_name }} subnet' subnet='{{ vlan_id }}'"
+ when: subnet_data.pools is defined and dhcp_pools_config.failed == false and dhcp_pools_config.changed == true
+
+ - name: Add {{ subnet_name }} hosts snippet into MAAS
+ command: "maas {{ maas_admin_username }} dhcpsnippets create name='{{ hosts_snippet }}' value='{{ hosts_content }}' description='This snippet configures the hosts in {{ subnet_name }} subnet' subnet='{{ vlan_id }}'"
+ when: dhcp_hosts_config.failed == false and dhcp_hosts_config.changed == true
--- /dev/null
+---
+# Reconcile MAAS DNS domains and A records with the Ansible inventory.  Only
+# the region+rack controller talks to the MAAS CLI.
+- name: Configures MAAS DNS
+  when: inventory_hostname in groups['maas_region_rack_server']
+  tags: config_dns
+  block:
+    - name: Get existing DNS resources
+      ansible.builtin.command: "maas {{ maas_admin_username }} dnsresources read"
+      register: existing_resources
+      changed_when: false
+
+    - name: Initialize DNS records list
+      ansible.builtin.set_fact:
+        dns_records: []
+
+    # Every inventory host except members of the "maas" group gets records.
+    - name: Define target hosts for DNS records
+      ansible.builtin.set_fact:
+        target_hosts: "{{ groups | dict2items | rejectattr('key', 'equalto', 'maas') | map(attribute='value') | flatten | unique | default([]) }}"
+      when: groups.keys() | length > 1
+
+    # Cartesian product of hosts x maas_dns_domains; the hostvar named by the
+    # domain key supplies the address, except the special key "ceph" which
+    # reads the plain "ip" hostvar.  Hosts lacking the hostvar are skipped.
+    - name: Build DNS records for all interfaces
+      ansible.builtin.set_fact:
+        dns_records: "{{ dns_records + [{'name': item[0].split('.')[0], 'ip': interface_ip, 'type': 'A', 'domain': item[1].value}] }}"
+      loop: "{{ (target_hosts | default([])) | product(maas_dns_domains | dict2items) | list }}"
+      vars:
+        interface_ip: "{{ hostvars[item[0]][item[1].key] if item[1].key != 'ceph' else hostvars[item[0]]['ip'] }}"
+      when:
+        - target_hosts is defined and target_hosts | length > 0
+        - "item[1].key in hostvars[item[0]] or (item[1].key == 'ceph' and 'ip' in hostvars[item[0]])"
+
+    - name: Parse desired FQDNs
+      ansible.builtin.set_fact:
+        desired_fqdns: "{{ dns_records | map(attribute='name') | zip(dns_records | map(attribute='domain')) | map('join', '.') | list }}"
+      when: dns_records | length > 0
+
+    # Tolerate records that vanish between the read above and the delete.
+    - name: Remove unwanted DNS records
+      ansible.builtin.command: "maas {{ maas_admin_username }} dnsresource delete {{ item.id }}"
+      loop: "{{ existing_resources.stdout | from_json }}"
+      when: >
+        dns_records | length > 0 and
+        item.fqdn not in desired_fqdns
+      register: dns_deletion
+      failed_when: dns_deletion.rc != 0 and "does not exist" not in dns_deletion.stderr
+
+    - name: Get updated DNS resources after deletions
+      ansible.builtin.command: "maas {{ maas_admin_username }} dnsresources read"
+      register: updated_resources
+      changed_when: false
+
+    - name: Get existing DNS domains
+      ansible.builtin.command: "maas {{ maas_admin_username }} domains read"
+      register: existing_domains
+      changed_when: false
+
+    - name: Parse existing domains
+      ansible.builtin.set_fact:
+        current_domains: "{{ existing_domains.stdout | from_json | map(attribute='name') | list }}"
+
+    # default_domains (e.g. "maas") are internal to MAAS and never removed;
+    # "protected foreign keys" means the domain still has records attached.
+    - name: Remove unwanted domains
+      ansible.builtin.command: "maas {{ maas_admin_username }} domain delete {{ item.id }}"
+      loop: "{{ existing_domains.stdout | from_json }}"
+      when: >
+        item.name not in default_domains and
+        item.name not in maas_dns_domains.values()
+      register: domain_deletion
+      failed_when: domain_deletion.rc != 0 and "does not exist" not in domain_deletion.stderr and "protected foreign keys" not in domain_deletion.stderr
+
+    - name: Ensure new DNS domains exist
+      ansible.builtin.command: "maas {{ maas_admin_username }} domains create name={{ item.value }}"
+      loop: "{{ maas_dns_domains | dict2items }}"
+      when: item.value not in current_domains
+      register: domain_creation
+      failed_when: domain_creation.rc != 0 and "already exists" not in domain_creation.stderr
+
+    - name: Ensure DNS records exist
+      ansible.builtin.command: >
+        maas {{ maas_admin_username }} dnsresources create
+        fqdn={{ item.name }}.{{ item.domain }}
+        ip_addresses={{ item.ip }}
+      loop: "{{ dns_records }}"
+      when: >
+        dns_records | length > 0 and
+        (item.name + '.' + item.domain) not in
+        (updated_resources.stdout | from_json | map(attribute='fqdn') | list)
+      register: dns_creation
+      failed_when: dns_creation.rc != 0 and "already exists" not in dns_creation.stderr
+---
+# Post-install MAAS configuration: unsquash the snap so that templates
+# and curtin scripts can be patched in place, then apply ARM/UEFI and
+# curtin_userdata tweaks. Runs only on the region+rack controller.
+- name: Config MAAS
+  when: inventory_hostname in groups['maas_region_rack_server']
+  tags: config_maas
+  block:
+    # maas_x1.snap appears once "snap try" has been run on an unsquashed
+    # tree; it serves as the "already done" marker for this block.
+    - name: Check if MAAS was already unsquashed
+      stat:
+        path: "/var/lib/snapd/snaps/maas_x1.snap"
+      register: maas_x1
+
+    - name: Verify that MAAS directory exist
+      ansible.builtin.file:
+        path: "{{ maas_home_dir }}"
+        state: directory
+        owner: root
+        group: root
+        mode: '0755'
+      when: "maas_install_method == 'snap' and not maas_x1.stat.exists"
+      register: maas_home
+
+    # NOTE(review): with several maas_*.snap revisions present this
+    # emits multiple lines and maas_snap.stdout would contain newlines;
+    # presumably only the newest snap is intended — verify.
+    - name: Check installed MAAS snap
+      shell: "sudo ls -t /var/lib/snapd/snaps/maas_*"
+      when: "maas_install_method == 'snap' and not maas_x1.stat.exists"
+      register: maas_snap
+
+    # NOTE(review): 'Unsquahs' typo in the task name — cosmetic only.
+    - name: Unsquahs MAAS FS
+      command: "sudo unsquashfs -d {{ maas_home_dir }} {{ maas_snap.stdout }}"
+      when: "maas_install_method == 'snap' and maas_home is defined and not maas_x1.stat.exists"
+      register: maas_fs
+
+    # Point snapd at the unsquashed tree so edits below take effect.
+    - name: Change MAAS current to home directory
+      command: "sudo snap try {{ maas_home_dir }}"
+      when: "maas_install_method == 'snap' and maas_fs is defined and not maas_x1.stat.exists"
+
+    - name: Check UEFI template directory
+      shell: "ls {{ maas_home_dir }}/lib/python*/site-packages/provisioningserver/templates/uefi/config.local.arm64.template"
+      when: "maas_install_method == 'snap'"
+      register: uefi_template_path
+
+    - name: Copy UEFI template to support ARM OS's
+      ansible.builtin.template:
+        src: arm_uefi.j2
+        dest: "{{ uefi_template_path.stdout if maas_install_method == 'snap' else '/usr/lib/python3/dist-packages/provisioningserver/templates/uefi/config.local.arm64.template' }}"
+        owner: root
+        group: root
+        mode: '0644'
+
+    - name: Check curtin scripts directory
+      shell: "ls {{ maas_home_dir }}/usr/lib/python3/dist-packages/curtin/commands/install_grub.py"
+      when: "maas_install_method == 'snap'"
+      register: curtin_scripts_path
+
+    # grub-install needs --force on some ARM boards; patch curtin's
+    # install_grub command list in place.
+    - name: Add force flag into install_grub curtin script to allow ARM deployment
+      ansible.builtin.replace:
+        path: "{{ curtin_scripts_path.stdout if maas_install_method == 'snap' else '/usr/lib/python3/dist-packages/curtin/commands/install_grub.py' }}"
+        regexp: "'--recheck']"
+        replace: "'--recheck', '--force']"
+
+    - name: Check curtin_userdata directory
+      shell: "ls {{ maas_home_dir }}/etc/maas/preseeds/curtin_userdata"
+      when: "maas_install_method == 'snap'"
+      register: curtin_userdata_path
+
+    # Append late_commands that create the cm management user with
+    # passwordless sudo and its authorized SSH keys on deployed nodes.
+    # NOTE(review): the |2 indentation indicator fixes the leading
+    # indent of the appended lines — verify the rendered block lines up
+    # with the existing late_commands entries in curtin_userdata.
+    - name: Copy curtin_userdata template to generate CM user
+      ansible.builtin.blockinfile:
+        path: "{{ curtin_userdata_path.stdout if maas_install_method == 'snap' else '/etc/maas/preseeds/curtin_userdata' }}"
+        insertafter: EOF
+        block: |2
+          90_create_cm_user: ["curtin", "in-target", "--", "sh", "-c", "useradd {{ cm_user }} -u 1001 -m -s /bin/bash -g sudo"]
+          92_delete_cm_pass: ["curtin", "in-target", "--", "sh", "-c", "passwd -d cm"]
+          94_configure_sudo: ["curtin", "in-target", "--", "sh", "-c", "printf '%%sudo ALL=(ALL) NOPASSWD: ALL\nDefaults !requiretty\nDefaults visiblepw' >> /etc/sudoers.d/cephlab_sudo"]
+          96_create_ssh_directory: ["curtin", "in-target", "--", "sh", "-c", "mkdir -p /home/cm/.ssh"]
+          98_copy_ssh_keys_cm: ["curtin", "in-target", "--", "sh", "-c", "echo '{{ cm_user_ssh_keys|join('\n') }}' >> /home/cm/.ssh/authorized_keys"]
+      when: "cm_user_ssh_keys is defined and cm_user is defined"
+
+    - name: Configure global kernel options
+      command: "maas {{ maas_admin_username }} maas set-config name=kernel_opts value='{{ global_kernel_opt }}'"
+      when: "global_kernel_opt is defined"
--- /dev/null
+---
+# Point MAAS at external NTP servers and optionally make deployed
+# machines use them directly instead of the rack controllers.
+- name: Configure NTP service
+  when: inventory_hostname in groups['maas_region_rack_server']
+  tags: config_ntp
+  block:
+    - name: Configure NTP servers to sync MAAS
+      command: "maas {{ maas_admin_username }} maas set-config name=ntp_servers value={{ maas_ntp_servers }}"
+
+    - name: Configure the option to use NTP external only
+      command: "maas {{ maas_admin_username }} maas set-config name=ntp_external_only value={{ maas_ntp_external_only }}"
--- /dev/null
+---
+# Initialize the MAAS region+rack controller after a fresh install.
+# Runs only on the region/rack host and only when the install task
+# reported a change (fresh install or upgrade).
+- name: Initialize MAAS Region + Rack Controller
+  when:
+    - inventory_hostname in groups['maas_region_rack_server']
+    - not maas_install.failed
+    - maas_install.changed
+  tags: install_maas
+  block:
+    - name: List all enabled services
+      ansible.builtin.service_facts:
+      when: "maas_install_method == 'snap'"
+
+    # The MAAS snap ships its own chrony; stop competing time daemons.
+    - name: Disable timesyncd service
+      systemd_service:
+        name: "{{ item }}"
+        state: stopped
+        enabled: false
+      # BUG FIX: do not use '{{ item }}' templating delimiters inside a
+      # when expression; build the unit name with Jinja concatenation.
+      when:
+        - maas_install_method == 'snap'
+        - (item ~ '.service') in ansible_facts.services
+        - ansible_facts.services[item ~ '.service'].status != 'not-found'
+      loop:
+        - systemd-timesyncd
+        - chrony
+
+    # Answer the interactive prompts with defaults (empty responses).
+    - name: Initialize MAAS Region Controller Snap
+      expect:
+        command: "maas init region+rack --database-uri postgres://{{ maas_db_user }}:{{ maas_db_password }}@localhost/{{ maas_db_name }}"
+        responses:
+          "MAAS URL*": ""
+          "Controller has already been initialized*": ""
+        timeout: 300
+      when: "maas_install_method == 'snap'"
+
+    - name: Starting MAAS region service Apt
+      ansible.builtin.systemd:
+        name: maas-regiond.service
+        state: started
+        no_block: false
+      when: "maas_install_method == 'apt'"
+
+    - name: Perform database migrations
+      command: "{{ 'maas' if maas_install_method == 'snap' else 'maas-region' }} migrate"
+
+    # createadmin fails when the user already exists; that is tolerated.
+    - name: Create MAAS admin user
+      command: "sudo maas createadmin --username={{ maas_admin_username }} --password={{ maas_admin_password }} --email={{ maas_admin_email }}"
+      register: admin_user_created
+      ignore_errors: true
+
+    - name: Restart MAAS services
+      command: "snap restart maas"
+      when: "maas_install_method == 'snap'"
--- /dev/null
+---
+# Fetch the region controller's shared secret so rack-only controllers
+# can register against it, then initialize each rack controller.
+- name: Get secret for init-rack
+  command: "cat {{ '/var/snap/maas/common/maas/secret' if maas_install_method == 'snap' else '/var/lib/maas/secret' }}"
+  when:
+    - inventory_hostname in groups['maas_region_rack_server']
+    - not maas_install.failed
+    - maas_install.changed
+  tags: install_maas
+  register: secret_var
+
+- name: Initialize MAAS Rack Controller
+  when:
+    - inventory_hostname in groups['maas_rack_server']
+    - not maas_install.failed
+    - secret_var is defined
+    - maas_install.changed
+  tags: install_maas
+  block:
+    - name: List all enabled services
+      ansible.builtin.service_facts:
+      when: "maas_install_method == 'snap'"
+
+    - name: Disable timesyncd service
+      systemd_service:
+        name: "{{ item }}"
+        state: stopped
+        enabled: false
+      # BUG FIX: avoid '{{ item }}' templating delimiters inside when;
+      # use the Jinja concatenation operator to build the unit name.
+      when:
+        - maas_install_method == 'snap'
+        - (item ~ '.service') in ansible_facts.services
+        - ansible_facts.services[item ~ '.service'].status != 'not-found'
+      loop:
+        - systemd-timesyncd
+        - chrony
+
+    # The secret read above on the region host is fetched via hostvars.
+    - name: Register Rack Controller with Region Controller Snap
+      command: "maas init rack --maas-url http://{{ hostvars[groups['maas_region_rack_server'].0]['ip'] }}:5240/MAAS/ --secret {{ hostvars[groups['maas_region_rack_server'].0]['secret_var']['stdout'] }}"
+      when: "maas_install_method == 'snap'"
+
+    - name: Register Rack Controller with Region Controller Apt
+      command: "maas-rack register --url=http://{{ hostvars[groups['maas_region_rack_server'].0]['ip'] }}:5240/MAAS/ --secret={{ hostvars[groups['maas_region_rack_server'].0]['secret_var']['stdout'] }}"
+      when: "maas_install_method == 'apt'"
+
+    - name: Restart MAAS Rack Controller
+      command: "snap restart maas"
+      when: "maas_install_method == 'snap'"
--- /dev/null
+---
+# Install and prepare PostgreSQL for the MAAS region controller.
+- name: Install PostgreSQL
+  apt:
+    name: postgresql-{{ postgres_version}}
+    state: present
+  when: inventory_hostname in groups['maas_db_server']
+  tags:
+    - install_maas
+    - install_db
+  register: postgres_install
+
+# Runs only when the package install changed (first install); later
+# runs of the play skip user/DB creation entirely.
+- name: Configure PostgreSQL for MAAS
+  when: inventory_hostname in groups['maas_db_server'] and postgres_install is changed
+  tags:
+    - install_maas
+    - install_db
+  block:
+    - name: Create PostgreSQL user for MAAS
+      command: sudo -i -u postgres psql -c "CREATE USER \"{{ maas_db_user }}\" WITH ENCRYPTED PASSWORD '{{ maas_db_password }}'"
+
+    - name: Create PostgreSQL database for MAAS
+      command: sudo -i -u postgres createdb -O "{{ maas_db_user }}" "{{ maas_db_name }}"
+
+    # NOTE(review): '0/0' as the pg_hba address — confirm the target
+    # PostgreSQL version accepts this shorthand; '0.0.0.0/0' is the
+    # canonical any-IPv4 form.
+    - name: Allow MAAS region controller to connect
+      lineinfile:
+        path: /etc/postgresql/{{ postgres_version }}/main/pg_hba.conf
+        line: "host {{ maas_db_name }} {{ maas_db_user }} 0/0 md5"
+        insertafter: EOF
+
+    - name: Restart PostgreSQL
+      systemd:
+        name: postgresql
+        state: restarted
--- /dev/null
+---
+################################################################################
+# API base
+################################################################################
+- name: Set MAAS API base URL
+  set_fact:
+    _maas_api: "{{ maas_api_url | trim('/') }}/MAAS/api/2.0"
+
+- include_tasks: _auth_header.yml
+  tags:
+    - ipmi
+
+- include_tasks: machines/_read_machines.yml
+  tags:
+    - ipmi
+
+- include_tasks: machines/_build_indexes.yml
+  tags:
+    - ipmi
+
+# Abort early if two hosts collapse to the same short hostname.
+- name: Ensure short hostnames are unique in MAAS
+  fail:
+    msg: "Duplicate short hostnames found in MAAS: {{ (_short_names | difference(_short_names | unique)) | unique | join(', ') }}"
+  when: (_short_names | difference(_short_names | unique)) | length > 0
+
+# Initialize the list of nodes we will mark Fixed later
+- name: Init empty _marked_broken list
+  set_fact:
+    _marked_broken: "{{ hostvars['localhost']._marked_broken | default([]) }}"
+  delegate_to: localhost
+  run_once: true
+
+- include_tasks: machines/_plan_sets.yml
+
+# CREATE: loop over SHORT names only
+- name: Include create.yml for missing hosts
+  include_tasks: machines/create.yml
+  loop: "{{ _create_short }}"
+  loop_control:
+    label: "{{ item }}"
+  vars:
+    # short name we planned against
+    host: "{{ item }}"
+
+    # creating: there should be no system_id; keep safe default
+    system_id: "{{ maas_short_to_id[item] | default(omit) }}"
+
+    # resolve inventory host (FQDN if inventory uses it)
+    inv_host: "{{ (inventory_by_short | default({})).get(item, item) }}"
+
+    desired_arch: "{{ hostvars[(inventory_by_short | default({})).get(item, item)].maas_arch
+                      | default(maas_arch | default('amd64/generic')) }}"
+    desired_domain: "{{ hostvars[(inventory_by_short | default({})).get(item, item)].maas_domain
+                        | default(maas_domain | default(omit)) }}"
+
+    # collect MACs from inventory: for each iface prefix, read <prefix>_mac var
+    mac_addresses: >-
+      {{
+        (hostvars[(inventory_by_short | default({})).get(item, item)].maas_interfaces | default([]))
+        | map(attribute='prefix')
+        | map('regex_replace', '$', '_mac')
+        | map('extract', hostvars[(inventory_by_short | default({})).get(item, item)])
+        | select('defined')
+        | list
+      }}
+  tags: create_machines
+
+# create.yml only creates a skeleton machine entry; a handler re-reads
+# all machines from MAAS and refreshes the planning indexes, so flush
+# handlers before re-planning.
+- meta: flush_handlers
+
+- name: Set timestamp for when machines get marked broken
+  set_fact:
+    broken_at: "{{ lookup('pipe', 'date +%Y-%m-%d\\ %H:%M:%S') }}"
+
+- include_tasks: machines/_plan_sets.yml
+
+# UPDATE: loop over SHORT names only
+- name: Include update.yml for existing hosts
+  include_tasks: machines/update.yml
+  loop: "{{ _update_short }}"
+  loop_control:
+    label: "{{ item }}"
+  vars:
+    # MAAS object for this short name (safe default to {})
+    existing: "{{ maas_by_short[item] | default({}) }}"
+
+    # updating requires a real system_id; keep strict so we notice problems
+    system_id: "{{ maas_short_to_id[item] }}"
+
+    # status map may be absent during initial runs; keep safe default
+    system_status: "{{ maas_host_to_status[item] | default('Unknown') }}"
+
+    host: "{{ item }}"
+    inv_host: "{{ (inventory_by_short | default({})).get(item, item) }}"
+
+    desired_arch: "{{ hostvars[(inventory_by_short | default({})).get(item, item)].maas_arch
+                      | default(maas_arch | default('amd64/generic')) }}"
+    desired_domain: "{{ hostvars[(inventory_by_short | default({})).get(item, item)].maas_domain
+                        | default(maas_domain | default(omit)) }}"
+  when:
+    # Don’t touch Deployed machines
+    - system_status is not match('(?i)^deployed$')
+  tags: update_machines
+
+#- pause:
+
+# IPMI credentials live in the secrets repo; a missing file is tolerated.
+- include_vars: "{{ secrets_path }}/ipmi.yml"
+  tags:
+    - ipmi
+  failed_when: false
+
+- debug: var=power_user
+
+- name: Build list of hosts that have a MAAS system_id
+  set_fact:
+    _ipmi_with_id: >-
+      {{ _plan_ipmi
+         | select('in', (maas_short_to_id | default({})).keys() | list)
+         | list }}
+  tags:
+    - ipmi
+
+# Apply IPMI creds for all hosts we can resolve to a system_id
+- name: Include set_ipmi_creds.yml
+  include_tasks: machines/set_ipmi_creds.yml
+  loop: "{{ _ipmi_with_id | default([]) }}"
+  loop_control:
+    loop_var: ipmi_short
+    label: "{{ ipmi_short }}"
+  vars:
+    host: "{{ ipmi_short }}"
+    system_id: "{{ maas_short_to_id[ipmi_short] }}"
+    # If inventory uses FQDNs, resolve to inventory hostname; else short
+    inv_host: "{{ (inventory_by_short | default({})).get(ipmi_short, ipmi_short) }}"
+  when:
+    - power_user is defined
+    - power_pass is defined
+  tags:
+    - ipmi
+
+# Deletion of machines not in inventory is opt-in via maas_delete_hosts.
+- name: Include delete.yml for extra hosts
+  include_tasks: machines/delete.yml
+  loop: "{{ _delete_names }}"
+  loop_control:
+    label: "{{ item }}"
+  vars:
+    host: "{{ item }}"
+    system_id: "{{ maas_short_to_id[item] }}"
+    # If inventory uses FQDNs, this resolves to the inventory hostname; else returns the short
+    inv_host: "{{ (inventory_by_short | default({})).get(item, item) }}"
+  when: (maas_delete_hosts | default(false)) | bool
+
+- name: Include cleanup.yml when we marked nodes broken
+  include_tasks: machines/cleanup.yml
+  when: _marked_broken | default([]) | length > 0
+  run_once: true
--- /dev/null
+---
+# Reconcile the Ansible 'testnodes' inventory with MAAS machines:
+# read current state, build desired state, compute create/update sets,
+# and create missing machines via the MAAS CLI.
+- name: Add all machines from inventory to MAAS
+  when: inventory_hostname in groups['maas_region_rack_server']
+  tags: machines
+  block:
+
+    - name: Read machines from MAAS
+      ansible.builtin.command:
+        argv: [ maas, "{{ maas_admin_username }}", machines, read ]
+      register: maas_read
+
+    - name: Parse MAAS machines JSON
+      ansible.builtin.set_fact:
+        maas_nodes_list: "{{ (maas_read.stdout | from_json) | list }}"
+
+    - name: Init MAAS map
+      ansible.builtin.set_fact:
+        maas_by_hostname: {}
+
+    # Index machines by lowercase hostname, keeping only the fields we
+    # later diff (system_id, arch, mac, power_type, ip, status_name).
+    - name: Populate MAAS map
+      vars:
+        # Boot-interface MAC, else first interface that has a MAC.
+        boot_mac: >-
+          {{
+            (
+              (item.boot_interface.mac_address
+               if (item.boot_interface is defined and item.boot_interface and item.boot_interface.mac_address is defined)
+               else (item.interface_set | selectattr('mac_address','defined') | list | first).mac_address
+              ) | default('')
+            ) | lower
+          }}
+        # First IP on the boot interface, else the machine's first IP.
+        boot_ip: >-
+          {{
+            (
+              (
+                (item.boot_interface.links | selectattr('ip_address','defined') | list | first).ip_address
+                if (item.boot_interface is defined and item.boot_interface and item.boot_interface.links | default([]))
+                else (item.ip_addresses | first)
+              ) | default('')
+            )
+          }}
+      loop: "{{ maas_nodes_list }}"
+      loop_control: { label: "{{ item.hostname | default('UNKNOWN') }}" }
+      ansible.builtin.set_fact:
+        maas_by_hostname: >-
+          {{
+            maas_by_hostname | combine({
+              (item.hostname | lower): {
+                'system_id': item.system_id | default(''),
+                'arch': item.architecture | default(''),
+                'mac': boot_mac,
+                'power_type': item.power_type | default(''),
+                'ip': boot_ip,
+                'status_name': item.status_name | default('')
+              }
+            })
+          }}
+
+    - name: Init desired inventory map
+      ansible.builtin.set_fact:
+        desired_by_hostname: {}
+
+    # Desired per-host record from inventory vars; boot MAC/IP var names
+    # are indirected through maas_boot_mac_var / maas_boot_ip_var.
+    - name: Populate desired map from inventory
+      vars:
+        node: "{{ item }}"
+        hostname: "{{ node.split('.')[0] | lower }}"
+        boot_mac_key: "{{ hostvars[node]['maas_boot_mac_var'] | default(maas_boot_mac_var | default('ext_pere_mac')) }}"
+        want_mac_raw: "{{ hostvars[node][boot_mac_key] | default('') }}"
+        want_mac: "{{ want_mac_raw | lower }}"
+        boot_ip_key: "{{ hostvars[node]['maas_boot_ip_var'] | default(maas_boot_ip_var | default('ext_pere_ip')) }}"
+        want_ip: "{{ hostvars[node][boot_ip_key] | default('') }}"
+        want_arch: "{{ hostvars[node].get('arch', hostvars[node].get('maas_arch', maas_arch | default('amd64/generic'))) }}"
+        want_power: "{{ 'ipmi' if (hostvars[node].ipmi is defined and hostvars[node].ipmi|length>0) else hostvars[node].get('power_type','manual') }}"
+      loop: "{{ groups['testnodes'] | default([]) }}"
+      loop_control: { label: "{{ item }}" }
+      ansible.builtin.set_fact:
+        desired_by_hostname: >-
+          {{
+            desired_by_hostname | combine({
+              hostname: {
+                'hostname': hostname,
+                'mac': want_mac,
+                'arch': want_arch,
+                'power_type': want_power,
+                'ip': want_ip,
+                'ipmi_address': hostvars[node].ipmi | default(''),
+                'current_state': (maas_by_hostname.get(hostname, {}).status_name | default(''))
+              }
+            })
+          }}
+
+    - name: Assert each node has boot MAC and arch
+      vars:
+        node: "{{ item }}"
+        hostname: "{{ node.split('.')[0] | lower }}"
+        boot_mac_key: "{{ hostvars[node]['maas_boot_mac_var'] | default(maas_boot_mac_var | default('ext_pere_mac')) }}"
+      loop: "{{ groups['testnodes'] | default([]) }}"
+      loop_control: { label: "{{ item }}" }
+      ansible.builtin.assert:
+        that:
+          - hostvars[node][boot_mac_key] is defined
+          - (hostvars[node].get('arch', hostvars[node].get('maas_arch', maas_arch | default('amd64/generic')))) | string | length > 0
+
+    # Hosts present in inventory but absent from MAAS.
+    - name: Compute hosts to create
+      ansible.builtin.set_fact:
+        to_create: >-
+          {{
+            (desired_by_hostname.keys() | difference(maas_by_hostname.keys()))
+            | map('extract', desired_by_hostname)
+            | list
+          }}
+
+    # A) Try IPMI on each create-candidate
+    - name: Probe IPMI for create candidates
+      when: to_create | length > 0
+      loop: "{{ to_create }}"
+      loop_control: { label: "{{ item.hostname }} -> {{ item.ipmi_address | default('') }}" }
+      ansible.builtin.command:
+        argv:
+          - ipmitool
+          - -I
+          - lanplus
+          - -H
+          - "{{ item.ipmi_address }}"
+          - -U
+          - "{{ maas_ipmi_username }}"
+          - -P
+          - "{{ maas_ipmi_password }}"
+          - -N
+          - "1"
+          - -R
+          - "1"
+          - chassis
+          - power
+          - status
+      register: ipmi_probe_create
+      changed_when: false
+      failed_when: false
+
+    - name: Build IPMI OK map for creates
+      when: ipmi_probe_create is defined
+      ansible.builtin.set_fact:
+        ipmi_ok_create_map: {}
+
+    # hostname -> true when the IPMI probe exited 0.
+    - name: Accumulate IPMI OK map for creates
+      when: ipmi_probe_create is defined
+      loop: "{{ ipmi_probe_create.results }}"
+      loop_control: { label: "{{ item.item.hostname }} rc={{ item.rc }}" }
+      ansible.builtin.set_fact:
+        ipmi_ok_create_map: >-
+          {{
+            (ipmi_ok_create_map | default({}))
+            | combine({ (item.item.hostname): ((item.rc | int) == 0) })
+          }}
+
+    # C) Rewrite to_create so power_type is 'ipmi' only if ipmi_ok else 'manual'
+    # init an empty list we’ll fill
+    - name: Init effective create list
+      ansible.builtin.set_fact:
+        to_create_effective: []
+
+    # append each host with power_type decided by the probe result
+    - name: Build effective create list (ipmi if reachable else manual)
+      when: to_create | length > 0
+      loop: "{{ to_create }}"
+      loop_control: { label: "{{ item.hostname }}" }
+      ansible.builtin.set_fact:
+        to_create_effective: >-
+          {{
+            (to_create_effective | default([]))
+            + [ item | combine({
+                'power_type': (ipmi_ok_create_map | default({})).get(item.hostname, false)
+                              | ternary('ipmi','manual')
+              }) ]
+          }}
+
+    # replace the original list
+    - name: Apply effective create list
+      when: to_create_effective | length > 0
+      ansible.builtin.set_fact:
+        to_create: "{{ to_create_effective }}"
+
+    # Hosts present in both maps whose fields drifted; drift lists the
+    # changed fields so update logic can act selectively.
+    - name: Compute hosts to update
+      vars:
+        both_keys: "{{ desired_by_hostname.keys() | intersect(maas_by_hostname.keys()) }}"
+        diffs: >-
+          {%- set out = [] -%}
+          {%- for k in both_keys -%}
+          {%- set d = desired_by_hostname[k] -%}
+          {%- set m = maas_by_hostname[k] -%}
+          {%- set drift = [] -%}
+          {%- if (d.mac | default('')) != (m.mac | default('')) -%}{%- set _ = drift.append('mac') -%}{%- endif -%}
+          {%- if (d.arch | default('')) != (m.arch | default('')) -%}{%- set _ = drift.append('arch') -%}{%- endif -%}
+          {%- if (d.power_type | default('')) != (m.power_type | default('')) -%}{%- set _ = drift.append('power_type') -%}{%- endif -%}
+          {%- set ip_drift = ((d.ip | default('')) and ((d.ip | default('')) != (m.ip | default('')))) -%}
+          {%- if drift | length > 0 or ip_drift -%}
+          {%- set _ = out.append({
+               'hostname': k,
+               'mac': d.mac,
+               'arch': d.arch,
+               'power_type': d.power_type,
+               'want_ip': d.ip,
+               'have_ip': m.ip | default(''),
+               'ip_drift': ip_drift,
+               'drift': drift,
+               'system_id': m.system_id,
+               'ipmi_address': d.ipmi_address | default('')
+             }) -%}
+          {%- endif -%}
+          {%- endfor -%}
+          {{ out }}
+      ansible.builtin.set_fact:
+        to_update: "{{ diffs }}"
+
+    - name: Create missing machines in MAAS
+      when: to_create | length > 0
+      loop: "{{ to_create }}"
+      loop_control: { label: "{{ item.hostname }}" }
+      ansible.builtin.command:
+        argv:
+          - maas
+          - "{{ maas_admin_username }}"
+          - machines
+          - create
+          - "architecture={{ item.arch }}"
+          - "mac_addresses={{ item.mac }}"
+          - "hostname={{ item.hostname }}"
+          - "power_type={{ item.power_type | default('manual') }}"
+#          - "deployed=true"
+
+    - name: Re-read machines from MAAS after creates
+      when: to_create | default([]) | length > 0
+      ansible.builtin.command:
+        argv: [ maas, "{{ maas_admin_username }}", machines, read ]
+      register: maas_read_after_create
+      changed_when: false
+
+    - name: Parse machines JSON (post-create)
+      when: maas_read_after_create is defined and (maas_read_after_create.stdout | default('')) | length > 0
+      ansible.builtin.set_fact:
+        maas_nodes_list: "{{ (maas_read_after_create.stdout | from_json) | list }}"
+
+    # Re-index MAAS machines by hostname after the create pass so later
+    # interface/bond tasks see system_ids for newly created machines.
+    - name: Rebuild maas_by_hostname (post-create)
+      when: maas_read_after_create is defined and (maas_read_after_create.stdout | default('')) | length > 0
+      vars:
+        # Boot-interface MAC, else first interface that has a MAC.
+        # BUG FIX: the original expression was missing the closing
+        # parenthesis of the outer group before '| lower', which made
+        # the template unparseable (the earlier "Populate MAAS map"
+        # task has the balanced form).
+        boot_mac: >-
+          {{
+            (
+              (item.boot_interface.mac_address
+               if (item.boot_interface is defined and item.boot_interface and item.boot_interface.mac_address is defined)
+               else (item.interface_set | selectattr('mac_address','defined') | list | first).mac_address
+              ) | default('')
+            ) | lower
+          }}
+        # First IP on the boot interface, else the machine's first IP.
+        boot_ip: >-
+          {{
+            (
+              (
+                (item.boot_interface.links | selectattr('ip_address','defined') | list | first).ip_address
+                if (item.boot_interface is defined and item.boot_interface and item.boot_interface.links | default([]))
+                else (item.ip_addresses | first)
+              ) | default('')
+            )
+          }}
+      loop: "{{ maas_nodes_list }}"
+      loop_control: { label: "{{ item.hostname | default('UNKNOWN') }}" }
+      ansible.builtin.set_fact:
+        maas_by_hostname: >-
+          {{
+            (maas_by_hostname | default({})) | combine({
+              (item.hostname | lower): {
+                'system_id': item.system_id | default(''),
+                'arch': item.architecture | default(''),
+                'mac': boot_mac,
+                'power_type': item.power_type | default(''),
+                'ip': boot_ip,
+                'status_name': item.status_name | default('')
+              }
+            })
+          }}
+
+    # Desired set of physical NIC MACs per host, read from the vars
+    # named in each node's maas_mac_keys list.
+    - name: Build desired physical MAC set per host
+      vars:
+        node: "{{ item }}"
+        hostname: "{{ node.split('.')[0] | lower }}"
+        # keys must come from the NODE (and coerce to strings to support names like "25Gb_2")
+        keys: "{{ (hostvars[node].maas_mac_keys | default([])) | map('string') | list }}"
+        # extract values safely, then default missing ones to ''
+        macs_raw: >-
+          {{
+            (keys | map('extract', hostvars[node]) | list)
+            | map('default','')
+            | list
+          }}
+        desired_macs: "{{ macs_raw | reject('equalto','') | map('lower') | list | unique }}"
+      loop: "{{ groups['testnodes'] | default([]) }}"
+      loop_control: { label: "{{ item }}" }
+      ansible.builtin.set_fact:
+        desired_phys_macs: "{{ (desired_phys_macs | default({})) | combine({ hostname: desired_macs }) }}"
+
+    - name: Read MAAS interfaces for each host
+      vars:
+        hostname: "{{ item.split('.')[0] | lower }}"
+        sid: "{{ maas_by_hostname.get(hostname, {}).get('system_id') | default('') }}"
+      loop: "{{ groups['testnodes'] | default([]) }}"
+      loop_control: { label: "{{ item }} (sid={{ sid | default('') }})" }
+      when: sid | length > 0
+      ansible.builtin.command:
+        argv: [ maas, "{{ maas_admin_username }}", interfaces, read, "{{ sid }}" ]
+      register: iface_reads
+      changed_when: false
+
+    # Map short hostname -> parsed interface list from the reads above.
+    - name: Index existing physical interfaces by host (normalized)
+      ansible.builtin.set_fact:
+        existing_phys_by_host: >-
+          {{
+            dict(
+              iface_reads.results
+              | selectattr('stdout','defined')
+              | map(attribute='item') | map('split','.') | map('first') | list
+              | zip(
+                  iface_reads.results
+                  | map(attribute='stdout') | map('from_json') | list
+                )
+            )
+          }}
+
+    - name: Show desired vs existing MACs (debug)
+      vars:
+        h: "{{ item.split('.')[0] | lower }}"
+      loop: "{{ groups['testnodes'] | default([]) }}"
+      loop_control: { label: "{{ item }}" }
+      ansible.builtin.debug:
+        msg:
+          desired: "{{ desired_phys_macs[h] | default([]) }}"
+          have: "{{ (existing_phys_by_host[h] | default([])) | selectattr('type','equalto','physical') | map(attribute='mac_address') | list }}"
+
+    # Per-host drift: MACs to add ('missing'), unexpected MACs
+    # ('extra'), plus a mac->interface-id map used later for bonds.
+    - name: Compute phys interface drift + mac->id per host
+      vars:
+        hostname: "{{ item }}"
+        interfaces: "{{ existing_phys_by_host[hostname] | default([]) }}"
+        phys_ifaces: "{{ interfaces | selectattr('type','equalto','physical') | list }}"
+        have_macs: "{{ phys_ifaces | map(attribute='mac_address') | map('lower') | list }}"
+        want_macs: "{{ desired_phys_macs[hostname] | default([]) }}"
+        mac_to_id: >-
+          {{ dict(
+               (phys_ifaces | map(attribute='mac_address') | map('lower') | list)
+               | zip(phys_ifaces | map(attribute='id') | list)
+             )
+          }}
+        missing_macs: "{{ want_macs | difference(have_macs) }}"
+        extra_macs: "{{ have_macs | difference(want_macs) }}"
+      loop: "{{ (desired_phys_macs | default({})).keys() | list }}"
+      loop_control: { label: "{{ item }}" }
+      ansible.builtin.set_fact:
+        iface_drift: "{{ (iface_drift | default({})) | combine({ hostname: {
+            'missing': missing_macs,
+            'extra': extra_macs,
+            'mac_to_id': mac_to_id
+          }}) }}"
+
+    # Flatten drift into a create worklist of {hostname, sid, mac}.
+    - name: Build phys_create_list
+      ansible.builtin.set_fact:
+        phys_create_list: >-
+          {%- set out = [] -%}
+          {%- for h, want_macs in (desired_phys_macs | default({})).items() -%}
+          {%- set sid = (maas_by_hostname[h].system_id | default('')) -%}
+          {%- set missing = (iface_drift[h].missing | default([])) -%}
+          {%- for m in missing -%}
+          {%- set _ = out.append({'hostname': h, 'sid': sid, 'mac': m}) -%}
+          {%- endfor -%}
+          {%- endfor -%}
+          {{ out }}
+
+    - name: Define allowed states for NIC changes
+      ansible.builtin.set_fact:
+        maas_allowed_states_for_phys: "{{ maas_allowed_states_for_phys | default(['New','Ready','Allocated','Broken']) }}"
+
+    - name: Ensure status_name_map exists (hostname -> status_name)
+      when: status_name_map is not defined
+      ansible.builtin.set_fact:
+        status_name_map: >-
+          {{
+            dict(
+              (maas_nodes_list | map(attribute='hostname') | map('lower') | list)
+              | zip(maas_nodes_list | map(attribute='status_name') | list)
+            )
+          }}
+
+    - name: Split phys_create_list by eligibility (simple & clear)
+      ansible.builtin.set_fact:
+        phys_create_eligible: []
+        phys_create_ineligible: []
+
+    - name: Accumulate phys_create elig / inelig
+      vars:
+        eligible_states: "{{ maas_allowed_states_for_phys }}"
+        st: "{{ status_name_map.get(item.hostname) | default('') }}"
+      loop: "{{ phys_create_list | default([]) }}"
+      loop_control: { label: "{{ item.hostname }} -> {{ st }}" }
+      ansible.builtin.set_fact:
+        phys_create_eligible: "{{ phys_create_eligible + [item] if st in eligible_states else phys_create_eligible }}"
+        phys_create_ineligible: "{{ phys_create_ineligible + [item] if st not in eligible_states else phys_create_ineligible }}"
+
+    - name: Create missing physical interfaces in MAAS (eligible hosts)
+      when: phys_create_eligible | length > 0
+      loop: "{{ phys_create_eligible }}"
+      loop_control: { label: "{{ item.hostname }} -> {{ item.mac }}" }
+      ansible.builtin.command:
+        argv:
+          - maas
+          - "{{ maas_admin_username }}"
+          - interfaces
+          - create-physical
+          - "{{ item.sid }}"
+          - "mac_address={{ item.mac }}"
+      register: phys_create_results
+      changed_when: true
+
+    - name: Re-read interfaces after physical creates
+      when: phys_create_eligible | length > 0
+      loop: "{{ phys_create_eligible | map(attribute='sid') | unique | list }}"
+      loop_control: { label: "{{ item }}" }
+      ansible.builtin.command:
+        argv: [ maas, "{{ maas_admin_username }}", interfaces, read, "{{ item }}" ]
+      register: iface_reads_after_phys_create
+      changed_when: false
+
+    # Without force, only record which hosts were skipped because their
+    # MAAS state forbids NIC changes.
+    - name: Record phys-create skipped due to state (force=false)
+      when:
+        - not (maas_force_machine_update | default(false) | bool)
+        - (phys_create_ineligible | length) > 0
+      ansible.builtin.set_fact:
+        machines_skipped_due_to_state: >-
+          {{
+            (machines_skipped_due_to_state | default([]))
+            + (phys_create_ineligible | map(attribute='hostname') | list)
+          }}
+
+    # Force path: mark broken -> create NICs -> mark fixed.
+    # NOTE(review): '{{ item }}' in a task name is not re-templated per
+    # loop iteration; the loop label shows the actual system_id.
+    - name: "Mark {{ item }} broken to update physical interfaces"
+      when:
+        - (maas_force_machine_update | default(false) | bool)
+        - (phys_create_ineligible | length) > 0
+      loop: "{{ phys_create_ineligible | map(attribute='sid') | unique | list }}"
+      loop_control: { label: "{{ item }}" }
+      ansible.builtin.command:
+        argv: [ maas, "{{ maas_admin_username }}", machine, mark-broken, "{{ item }}" ]
+      register: phys_force_mark_broken
+      failed_when: >
+        (phys_force_mark_broken.rc != 0)
+        and ('No rack controllers can access the BMC' not in (phys_force_mark_broken.stdout | default('')))
+      changed_when: true
+
+    - name: Create physical interfaces (while broken)
+      when:
+        - (maas_force_machine_update | default(false) | bool)
+        - (phys_create_ineligible | length) > 0
+      loop: "{{ phys_create_ineligible }}"
+      loop_control: { label: "{{ item.hostname }} -> {{ item.mac }}" }
+      ansible.builtin.command:
+        argv:
+          - maas
+          - "{{ maas_admin_username }}"
+          - interfaces
+          - create-physical
+          - "{{ item.sid }}"
+          - "mac_address={{ item.mac }}"
+      register: phys_force_create_results
+      changed_when: true
+
+    - name: Mark fixed after physical interface create
+      when:
+        - (maas_force_machine_update | default(false) | bool)
+        - (phys_create_ineligible | length) > 0
+      loop: "{{ phys_create_ineligible | map(attribute='sid') | unique | list }}"
+      loop_control: { label: "{{ item }}" }
+      ansible.builtin.command:
+        argv: [ maas, "{{ maas_admin_username }}", machine, mark-fixed, "{{ item }}" ]
+      register: phys_force_mark_fixed
+      failed_when: >
+        (phys_force_mark_fixed.rc != 0)
+        and ('No rack controllers can access the BMC' not in (phys_force_mark_fixed.stdout | default('')))
+      changed_when: true
+
+    # Read each test node's interfaces to discover existing bonds.
+    - name: Read interfaces for bond scan
+      loop: "{{ groups['testnodes'] | default([]) }}"
+      loop_control: { label: "{{ item }}" }
+      vars:
+        h: "{{ item.split('.')[0] | lower }}"
+        # BUG FIX: default(omit) yields the omit placeholder string, so
+        # the original 'when: sid is defined' was always true and hosts
+        # without a system_id were not skipped. Default to '' and test
+        # length, matching "Read MAAS interfaces for each host" above.
+        sid: "{{ maas_by_hostname[h].system_id | default('') }}"
+      when: sid | length > 0
+      ansible.builtin.command:
+        argv: [ maas, "{{ maas_admin_username }}", interfaces, read, "{{ sid }}" ]
+      register: bond_scan
+      changed_when: false
+
+ - name: Init bond maps
+ ansible.builtin.set_fact:
+ current_bonds_map: {}
+ current_bond_members: {}
+
+ - name: Build current bond maps (per host)
+ loop: "{{ bond_scan.results | selectattr('stdout','defined') | list }}"
+ loop_control:
+ label: "{{ item.item.split('.')[0] | lower }}"
+ vars:
+ h: "{{ item.item.split('.')[0] | lower }}"
+ bonds: "{{ (item.stdout | from_json) | selectattr('type','equalto','bond') | list }}"
+ bond_names: "{{ bonds | map(attribute='name') | list }}"
+ bond_ids: "{{ bonds | map(attribute='id') | list }}"
+ bond_parents: "{{ bonds | map(attribute='parents') | list }}"
+ name_to_id: "{{ dict(bond_names | zip(bond_ids)) }}"
+ id_to_parents: "{{ dict(bond_ids | zip(bond_parents)) }}"
+ ansible.builtin.set_fact:
+ current_bonds_map: "{{ current_bonds_map | combine({ h: name_to_id }) }}"
+ current_bond_members: "{{ current_bond_members | combine({ h: id_to_parents }) }}"
+
+ - name: Ensure bond action lists exist
+ ansible.builtin.set_fact:
+ bond_create_list: "{{ bond_create_list | default([]) }}"
+ bond_update_list: "{{ bond_update_list | default([]) }}"
+
+ # Split each host's desired bonds (hostvars[...].maas_bonds) into "create"
+ # (bond name absent from MAAS) and "update" (bond name already present).
+ # Uses the {%- set _ = out.append(...) -%} idiom to build lists imperatively.
+ # NOTE(review): iface_drift is assumed to be built by earlier tasks as
+ # hostname -> {mac_to_id: {mac(lower): interface_id}} -- confirm upstream.
+ - name: Compute bond actions per host
+ loop: "{{ groups['testnodes'] | default([]) }}"
+ loop_control: { label: "{{ item }}" }
+ vars:
+ node: "{{ item }}"
+ h: "{{ node.split('.')[0] | lower }}"
+ sid: "{{ maas_by_hostname[h].system_id | default('') }}"
+ want_bonds: "{{ hostvars[node].maas_bonds | default([]) }}"
+ mac_to_id: "{{ iface_drift[h].mac_to_id | default({}) }}"
+ have_bonds: "{{ current_bonds_map.get(h, {}) }}"
+ ansible.builtin.set_fact:
+ bond_create_list: >-
+ {%- set out = bond_create_list | default([]) -%}
+ {%- for b in want_bonds -%}
+ {%- set parent_macs = (b.interfaces | default([])) | map('extract', hostvars[node]) | map('lower') | list -%}
+ {%- set parent_ids = parent_macs | map('extract', mac_to_id) | select('defined') | list -%}
+ {%- if b.name not in have_bonds.keys() -%}
+ {%- set _ = out.append({
+ 'hostname': h,
+ 'sid': sid,
+ 'name': b.name,
+ 'mode': b.mode | default('802.3ad'),
+ 'mtu': b.mtu | default(9000),
+ 'parent_ids': parent_ids
+ }) -%}
+ {%- endif -%}
+ {%- endfor -%}
+ {{ out }}
+ bond_update_list: >-
+ {%- set out = bond_update_list | default([]) -%}
+ {%- for b in want_bonds -%}
+ {%- if b.name in have_bonds.keys() -%}
+ {%- set parent_macs = (b.interfaces | default([])) | map('extract', hostvars[node]) | map('lower') | list -%}
+ {%- set parent_ids = parent_macs | map('extract', mac_to_id) | select('defined') | list -%}
+ {%- set _ = out.append({
+ 'hostname': h,
+ 'sid': sid,
+ 'name': b.name,
+ 'mode': b.mode | default('802.3ad'),
+ 'mtu': b.mtu | default(9000),
+ 'parent_ids': parent_ids,
+ 'have_bond_id': have_bonds[b.name]
+ }) -%}
+ {%- endif -%}
+ {%- endfor -%}
+ {{ out }}
+
+ # Only machines in these MAAS lifecycle states may have interfaces edited.
+ - name: Define allowed MAAS states for bond changes
+ ansible.builtin.set_fact:
+ maas_allowed_states_for_bonds: ['New','Ready','Allocated','Broken']
+
+
+ # Split the pending bond work into eligible vs ineligible hosts.
+ # NOTE(review): status_name_map is assumed to map hostname -> MAAS state
+ # name and to be built by earlier (out-of-view) tasks -- confirm upstream.
+ - name: Build eligibility lists for bond changes
+ vars:
+ eligible_hosts: >-
+ {{
+ status_name_map | dict2items
+ | selectattr('value','in', maas_allowed_states_for_bonds)
+ | map(attribute='key') | list
+ }}
+ all_bond_hosts: >-
+ {{
+ (
+ (bond_create_list | default([])) + (bond_update_list | default([]))
+ )
+ | map(attribute='hostname') | list
+ | unique | list
+ }}
+ ansible.builtin.set_fact:
+ bond_create_eligible: "{{ (bond_create_list | default([])) | selectattr('hostname','in', eligible_hosts) | list }}"
+ bond_update_eligible: "{{ (bond_update_list | default([])) | selectattr('hostname','in', eligible_hosts) | list }}"
+ bond_ineligible_hosts: "{{ all_bond_hosts | difference(eligible_hosts) | list }}"
+
+ - name: Recompute desired parent IDs for each bond update
+   # For every bond that already exists but may have drifted, re-derive the
+   # set of MAAS interface IDs that should be its parents, then queue a
+   # `maas interface update` argv for it.
+   when: bond_update_eligible | length > 0
+   loop: "{{ bond_update_eligible }}"
+   loop_control:
+     label: "{{ item.hostname }} -> {{ item.name }}"
+   vars:
+     hostname: "{{ item.hostname }}"
+     # get this host's bond definition from inventory/group_vars
+     bond_cfg: >-
+       {{
+         (hostvars[hostname].maas_bonds | default([]))
+         | selectattr('name','equalto', item.name) | first | default({})
+       }}
+     # the inventory keys for this bond (e.g. ['ext_pere_mac','25Gb_2'])
+     mac_keys: "{{ bond_cfg.interfaces | default([]) }}"
+     # resolve keys -> MACs from that host, normalize/lower, drop empties
+     macs_for_bond: >-
+       {{
+         mac_keys
+         | map('extract', hostvars[hostname]) | map('default','')
+         | map('lower') | reject('equalto','') | list
+       }}
+     # existing interface id map for this host: mac(lower) -> id
+     id_by_mac: "{{ iface_drift[hostname].mac_to_id | default({}) }}"
+     # BUG FIX: the old `map('extract', id_by_mac, None)` passed None as
+     # extract's *morph list* argument, and MACs missing from the map became
+     # Undefined values that `reject('equalto', None)` kept -- blowing up in
+     # map('string'). Drop unresolvable MACs with select('defined'), the
+     # same pattern used by the compute-bond-actions task.
+     desired_parent_ids: >-
+       {{
+         macs_for_bond
+         | map('extract', id_by_mac)
+         | select('defined')
+         | map('string') | unique | sort | list
+       }}
+   ansible.builtin.set_fact:
+     bond_update_argvs: >-
+       {{
+         (bond_update_argvs | default([]))
+         + [ {
+           'sid': item.sid,
+           'bond_id': item.have_bond_id,
+           'argv': [
+             'maas', maas_admin_username, 'interface', 'update',
+             item.sid, (item.have_bond_id | string),
+             'parents=' ~ (desired_parent_ids | join(',')),
+             'bond_mode=' ~ (item.mode | default('802.3ad')),
+             'mtu=' ~ (item.mtu | default(9000) | string)
+           ]
+         } ]
+       }}
+
+ - name: Apply bond parents/mode/mtu (idempotent)
+   # Re-issue `maas interface update` with one repeated `parents=<id>` arg
+   # per parent (the MAAS CLI wants the key repeated, not comma-joined).
+   when: (bond_update_argvs | default([])) | length > 0
+   loop: "{{ bond_update_argvs }}"
+   loop_control: { label: "{{ item.sid }} -> bond {{ item.bond_id }}" }
+   vars:
+     # BUG FIX: the old expression inspected `item.argv | last`, which is the
+     # `mtu=` element (parents= is third from last), and the mis-parenthesized
+     # `... and ... is search(...) | ternary(...)` never yielded a clean list.
+     # Pick each `key=value` element out of the queued argv explicitly.
+     parents_csv: >-
+       {{
+         (item.argv | select('match', '^parents=') | list | first | default('parents='))
+         | regex_replace('^parents=', '')
+       }}
+     parents_ids: "{{ parents_csv.split(',') | reject('equalto', '') | list }}"
+     parents_args: "{{ parents_ids | map('regex_replace', '^(.*)$', 'parents=\\1') | list }}"
+     # BUG FIX: keep the per-bond mode/mtu computed earlier instead of
+     # hard-coding bond_mode=802.3ad and mtu=9000 for every bond.
+     mode_arg: "{{ item.argv | select('match', '^bond_mode=') | list | first | default('bond_mode=802.3ad') }}"
+     mtu_arg: "{{ item.argv | select('match', '^mtu=') | list | first | default('mtu=9000') }}"
+     base_args: "{{ ['maas', maas_admin_username, 'interface', 'update', item.sid, (item.bond_id | string)] }}"
+     final_argv: "{{ base_args + parents_args + [mode_arg, mtu_arg] }}"
+   ansible.builtin.command:
+     argv: "{{ final_argv }}"
+   register: bond_parent_updates
+   changed_when: true
+
+ # When not forcing, remember which hosts were left untouched because their
+ # MAAS state forbids bond changes; they are reported at the end of the play.
+ - name: Record machines skipped due to state (force=false)
+   when:
+     - not (maas_force_machine_update | default(false) | bool)
+     - bond_ineligible_hosts | length > 0
+   ansible.builtin.set_fact:
+     bond_skipped_due_to_state: "{{ (bond_skipped_due_to_state | default([])) + bond_ineligible_hosts }}"
+
+ # Create each missing bond via `maas interfaces create-bond`, repeating
+ # `parents=<id>` for every member NIC. After YAML/Jinja string-escape
+ # processing both '\\1' (here) and '\1' (in the update task below) reach
+ # regex_replace as the backreference \1 -- the styles differ, behavior
+ # presumably does not; TODO unify.
+ - name: Create bonds (machines in modifiable state)
+ when: bond_create_eligible | length > 0
+ loop: "{{ bond_create_eligible }}"
+ loop_control: { label: "{{ item.hostname }} -> {{ item.name }}" }
+ vars:
+ parents_args: >-
+ {{
+ item.parent_ids
+ | map('string')
+ | map('regex_replace','^(.*)$','parents=\\1')
+ | list
+ }}
+ argv_final: >-
+ {{
+ ['maas', maas_admin_username, 'interfaces', 'create-bond',
+ item.sid, 'name=' ~ item.name,
+ 'bond_mode=' ~ (item.mode | default('802.3ad'))]
+ + parents_args
+ + ['mtu=' ~ (item.mtu | default(9000) | string)]
+ + ((item.vlan is defined) | ternary(['vlan=' ~ (item.vlan | string)], []))
+ }}
+ ansible.builtin.command:
+ argv: "{{ argv_final }}"
+ register: bond_create_results
+ changed_when: true
+
+ # Push parents/mode/mtu for bonds that already exist.
+ # NOTE(review): this overlaps with the earlier "Apply bond parents/mode/mtu"
+ # task, which updates the same bonds from bond_update_argvs -- confirm both
+ # passes are intended (the second run is idempotent but redundant).
+ - name: Update bonds (machine in modifiable state)
+ when: bond_update_eligible | length > 0
+ loop: "{{ bond_update_eligible }}"
+ loop_control:
+ label: "{{ item.hostname }} -> {{ item.name }} (id={{ item.have_bond_id }})"
+ vars:
+ parents_args: >-
+ {{
+ item.parent_ids
+ | map('string')
+ | map('regex_replace','^(.*)$','parents=\1')
+ | list
+ }}
+ argv_final: >-
+ {{
+ ['maas', maas_admin_username, 'interface', 'update',
+ item.sid, (item.have_bond_id | string)]
+ + parents_args
+ + ['bond_mode=' ~ (item.mode | default('802.3ad')),
+ 'mtu=' ~ (item.mtu | default(9000) | string)]
+ }}
+ ansible.builtin.command:
+ argv: "{{ argv_final }}"
+ register: bond_update_calls
+ changed_when: true
+
+ # When forcing, work out which hosts were ineligible by state; those get
+ # temporarily marked broken below, updated, then marked fixed again.
+ # The *_force_hosts vars are task-local; only the three set_fact keys persist.
+ - name: Build force lists (ineligible hosts only, when forcing)
+ when: (maas_force_machine_update | default(false) | bool)
+ vars:
+ bond_create_force_hosts: "{{ (bond_create_list | default([])) | map(attribute='hostname') | list | unique | list | difference(bond_create_eligible | map(attribute='hostname') | list | unique | list) }}"
+ bond_update_force_hosts: "{{ (bond_update_list | default([])) | map(attribute='hostname') | list | unique | list | difference(bond_update_eligible | map(attribute='hostname') | list | unique | list) }}"
+ ansible.builtin.set_fact:
+ bond_create_force: "{{ (bond_create_list | default([])) | selectattr('hostname','in', bond_create_force_hosts) | list }}"
+ bond_update_force: "{{ (bond_update_list | default([])) | selectattr('hostname','in', bond_update_force_hosts) | list }}"
+ force_hosts_unique: "{{ (bond_create_force_hosts + bond_update_force_hosts) | unique | list }}"
+
+ # Flip forced hosts into the Broken state so interface edits are allowed.
+ # BMC-unreachable errors are tolerated (see failed_when) so an offline BMC
+ # does not abort the run.
+ - name: Mark machines broken for forced bond updates
+ when:
+ - (maas_force_machine_update | default(false) | bool)
+ - force_hosts_unique | length > 0
+ loop: "{{ force_hosts_unique }}"
+ loop_control: { label: "{{ item }}" }
+ ansible.builtin.command:
+ argv:
+ - maas
+ - "{{ maas_admin_username }}"
+ - machine
+ - mark-broken
+ - "{{ maas_by_hostname[item].system_id }}"
+ register: mark_broken_result
+ # Treat only *other* non-zero failures as fatal
+ failed_when: >
+ (mark_broken_result.rc != 0) and
+ ('No rack controllers can access the BMC of node' not in (mark_broken_result.stdout | default(''))) and
+ ('No rack controllers can access the BMC of machine' not in (mark_broken_result.stdout | default('')))
+ # Still count as "changed" so downstream tasks run
+ changed_when: >
+ (mark_broken_result.rc == 0) or
+ ('No rack controllers can access the BMC' in (mark_broken_result.stdout | default('')))
+
+ # Forced-path bond creation. Unlike the eligible-path task, this passes the
+ # parent ids as a single comma-separated `parents=` value -- NOTE(review):
+ # confirm the MAAS CLI accepts the CSV form as well as repeated keys.
+ - name: Create bonds (forced, machine temporarily broken)
+ when:
+ - (maas_force_machine_update | default(false) | bool)
+ - bond_create_force | length > 0
+ loop: "{{ bond_create_force }}"
+ loop_control:
+ label: "{{ item.hostname }} -> {{ item.name }}"
+ vars:
+ parents_csv: "{{ item.parent_ids | map('string') | join(',') }}"
+ bond_create_argv: >-
+ {{
+ [
+ 'maas',
+ maas_admin_username,
+ 'interfaces',
+ 'create-bond',
+ item.sid,
+ 'name=' ~ item.name,
+ 'bond_mode=' ~ (item.mode | default('802.3ad')),
+ 'parents=' ~ parents_csv,
+ 'mtu=' ~ (item.mtu | default(9000) | string)
+ ]
+ + ((item.vlan is defined) | ternary(['vlan=' ~ (item.vlan | string)], []))
+ }}
+ ansible.builtin.command:
+ argv: "{{ bond_create_argv }}"
+ register: bond_create_force_results
+ changed_when: true
+
+ # Read all fabrics and vlans to build a vid -> vlan_id map
+ - name: Read fabrics
+   # Dump every fabric definition from MAAS (JSON on stdout).
+   ansible.builtin.command:
+     argv:
+       - maas
+       - "{{ maas_admin_username }}"
+       - fabrics
+       - read
+   register: maas_fabrics
+   changed_when: false
+
+ - name: Build list of fabric IDs
+   # Collect the distinct fabric ids we need to enumerate VLANs for.
+   ansible.builtin.set_fact:
+     fabric_ids: "{{ maas_fabrics.stdout | from_json | map(attribute='id') | unique | list }}"
+
+ - name: Read VLANs for each fabric
+   # One `maas vlans read` call per fabric; per-call JSON lands in .results.
+   loop: "{{ fabric_ids }}"
+   loop_control:
+     label: "fabric={{ item }}"
+   ansible.builtin.command:
+     argv:
+       - maas
+       - "{{ maas_admin_username }}"
+       - vlans
+       - read
+       - "{{ item }}"
+   register: maas_vlans_reads
+   changed_when: false
+
+ - name: Build vid -> vlan_id map
+   # Flatten the per-fabric VLAN lists once into a task-local var, then pair
+   # each VLAN's vid with its MAAS id. Same mapping as before, but the JSON
+   # is parsed a single time instead of twice.
+   vars:
+     all_vlans: >-
+       {{
+         maas_vlans_reads.results
+         | map(attribute='stdout') | map('from_json') | list
+         | sum(start=[])
+       }}
+   ansible.builtin.set_fact:
+     vid_to_vlan_id: >-
+       {{
+         dict(
+           (all_vlans | map(attribute='vid') | list)
+           | zip(all_vlans | map(attribute='id') | list)
+         )
+       }}
+
+ # For each bond needing a VLAN, compute the MAAS VLAN id
+ - name: Build bond->vlan_id updates
+   when: bond_update_eligible | length > 0
+   loop: "{{ bond_update_eligible }}"
+   loop_control: { label: "{{ item.hostname }} -> {{ item.name }}" }
+   vars:
+     # NOTE(review): this reads the play-scoped `maas_bonds`, while the other
+     # bond tasks read hostvars[...].maas_bonds per host -- confirm this is
+     # intentional.
+     bond_cfg: "{{ (maas_bonds | selectattr('name','equalto', item.name) | first) | default({}) }}"
+     # BUG FIX: templated vars are stringified, so the old `default(None)`
+     # produced the literal string "None" and `is not none` was always true;
+     # the vid was also looked up with a string key against the int-keyed
+     # vid_to_vlan_id map. Use an empty-string sentinel and an int key.
+     desired_vid: "{{ bond_cfg.vlan | default('') }}"
+     vlan_id: >-
+       {{
+         vid_to_vlan_id.get(desired_vid | int, '')
+         if (desired_vid | string | length > 0) else ''
+       }}
+   ansible.builtin.set_fact:
+     bond_vlan_updates: >-
+       {{
+         (bond_vlan_updates | default([]))
+         + ([{
+             'sid': item.sid,
+             'bond_id': item.have_bond_id,
+             'vlan_id': vlan_id
+           }] if (vlan_id | string | length > 0) else [])
+       }}
+
+ - name: Attach bond to VLAN (set vlan=VLAN_ID)
+   # Point each queued bond at its MAAS VLAN id via `maas interface update`.
+   when: (bond_vlan_updates | default([])) | length > 0
+   loop: "{{ bond_vlan_updates }}"
+   loop_control: { label: "{{ item.sid }} -> bond {{ item.bond_id }} vlan={{ item.vlan_id }}" }
+   ansible.builtin.command:
+     argv: [ maas, "{{ maas_admin_username }}", interface, update, "{{ item.sid }}", "{{ item.bond_id }}", "vlan={{ item.vlan_id }}" ]
+   register: bond_vlan_set
+   changed_when: true
+
+ - name: Update bonds (forced, machine temporarily broken)
+   # Push parents/mode/mtu for bonds on hosts that had to be force-marked
+   # broken first. Note that the bond_update_force entries always carry
+   # `mode` and `mtu` (they were defaulted when the list was built), so the
+   # last `when` clause is effectively a member_drift escape hatch.
+   when:
+     - (maas_force_machine_update | default(false) | bool)
+     - bond_update_force | length > 0
+     # BUG FIX: `member_drift` is not guaranteed to be defined on this path;
+     # an undefined var inside `when` aborts the play. Default it to false.
+     - (member_drift | default(false)) or (item.mode is defined) or (item.mtu is defined)
+   loop: "{{ bond_update_force }}"
+   loop_control: { label: "{{ item.hostname }} -> {{ item.name }} (id={{ item.have_bond_id }})" }
+   vars:
+     parents_csv: "{{ item.parent_ids | map('string') | join(',') }}"
+     bond_update_force_argv: >-
+       {{
+         [
+           'maas',
+           maas_admin_username,
+           'interface',
+           'update',
+           item.sid,
+           (item.have_bond_id | string),
+           'bond_mode=' ~ (item.mode | default('802.3ad')),
+           'parents=' ~ parents_csv,
+           'mtu=' ~ (item.mtu | default(9000) | string)
+         ]
+       }}
+   ansible.builtin.command:
+     argv: "{{ bond_update_force_argv }}"
+   register: bond_update_force_results
+   changed_when: true
+
+ # Undo the temporary Broken state once forced bond edits are done.
+ # Mirrors the mark-broken task: a BMC-unreachable message from MAAS is
+ # tolerated rather than treated as a failure.
+ - name: Mark machines fixed after forced bond updates
+ when:
+ - (maas_force_machine_update | default(false) | bool)
+ - force_hosts_unique | length > 0
+ loop: "{{ force_hosts_unique }}"
+ loop_control: { label: "{{ item }}" }
+ ansible.builtin.command:
+ argv:
+ - maas
+ - "{{ maas_admin_username }}"
+ - machine
+ - mark-fixed
+ - "{{ maas_by_hostname[item].system_id }}"
+ register: mark_fixed_result
+ failed_when: >
+ (mark_fixed_result.rc != 0) and
+ ('No rack controllers can access the BMC of node' not in (mark_fixed_result.stdout | default(''))) and
+ ('No rack controllers can access the BMC of machine' not in (mark_fixed_result.stdout | default('')))
+ changed_when: >
+ (mark_fixed_result.rc == 0) or
+ ('No rack controllers can access the BMC' in (mark_fixed_result.stdout | default('')))
+
+ - name: Read machine details to inspect power parameters
+   # Fetch the full machine document for every known testnode so the next
+   # task can compare current vs desired power settings.
+   vars:
+     hostname: "{{ item.split('.')[0] | lower }}"
+     # BUG FIX: .get() returns None (stringified to "None") and
+     # `default(omit)` never applies to a defined value, so the old
+     # `when: sid is defined` guard always passed and the literal string
+     # "None" leaked into the maas command. default('', true) maps any
+     # falsey value to the empty sentinel instead.
+     sid: "{{ maas_by_hostname.get(hostname, {}).get('system_id') | default('', true) }}"
+   when: sid | length > 0
+   loop: "{{ groups['testnodes'] | default([]) }}"
+   loop_control: { label: "{{ item }}" }
+   ansible.builtin.command:
+     argv: [ maas, "{{ maas_admin_username }}", machine, read, "{{ sid }}" ]
+   register: machine_reads
+   changed_when: false
+
+ # Map short hostname -> parsed machine JSON for every successful read.
+ - name: Build map of current power settings
+ when: machine_reads is defined and machine_reads.results is defined
+ ansible.builtin.set_fact:
+ power_map: >-
+ {{
+ dict(
+ machine_reads.results
+ | selectattr('stdout','defined')
+ | map(attribute='item')
+ | map('split','.') | map('first') | list
+ | zip(machine_reads.results | map(attribute='stdout') | map('from_json'))
+ )
+ }}
+
+ # Keep only entries with a non-empty drift list.
+ # NOTE(review): `to_update` is built by earlier (out-of-view) tasks and is
+ # assumed to be a list of dicts carrying at least hostname/system_id/drift.
+ - name: Select update candidates
+ ansible.builtin.set_fact:
+ update_candidates: >-
+ {{
+ to_update
+ | selectattr('drift', 'defined')
+ | selectattr('drift', 'ne', [])
+ | list
+ }}
+
+ # A) Try IPMI for each update-candidate that *wants* ipmi and has an address
+ # The `when` list is evaluated per loop item, so candidates without an IPMI
+ # address or with power_type != ipmi produce skipped results (no rc field).
+ # NOTE(review): -P puts the IPMI password on the ipmitool command line,
+ # which is visible in the process list -- consider no_log / an env var.
+ - name: Probe IPMI for update candidates
+ loop: "{{ update_candidates }}"
+ loop_control: { label: "{{ item.hostname }} -> {{ ipmi_addr }}" }
+ vars:
+ ipmi_addr: "{{ desired_by_hostname[item.hostname].ipmi_address | default('') }}"
+ ansible.builtin.command:
+ argv:
+ - ipmitool
+ - -I
+ - lanplus
+ - -H
+ - "{{ ipmi_addr }}"
+ - -U
+ - "{{ maas_ipmi_username }}"
+ - -P
+ - "{{ maas_ipmi_password }}"
+ - -N
+ - "1"
+ - -R
+ - "1"
+ - chassis
+ - power
+ - status
+ register: ipmi_probe_update
+ changed_when: false
+ failed_when: false
+ when:
+ - update_candidates | default([]) | length > 0
+ - (item.power_type | default('manual')) == 'ipmi'
+ - ipmi_addr | length > 0
+ - maas_ipmi_username is defined
+ - maas_ipmi_password is defined
+
+ # B) Build "hostname -> ipmi_ok" (rc == 0) lookup for updates
+ - name: Init IPMI OK map for updates
+ when: ipmi_probe_update is defined
+ ansible.builtin.set_fact:
+ ipmi_ok_update_map: {}
+
+ - name: Accumulate IPMI OK map for updates
+   # Record, per hostname, whether the IPMI probe exited 0.
+   when: ipmi_probe_update is defined
+   loop: "{{ ipmi_probe_update.results | default([]) }}"
+   loop_control:
+     # BUG FIX: skipped probe results carry no `rc`, which made both this
+     # label and the rc comparison below fail on an undefined attribute.
+     # Default rc and treat skipped probes as "not OK".
+     label: "{{ item.item.hostname | default('unknown') }} rc={{ item.rc | default('skipped') }}"
+   ansible.builtin.set_fact:
+     ipmi_ok_update_map: >-
+       {{
+         (ipmi_ok_update_map | default({}))
+         | combine({ (item.item.hostname): ((item.rc | default(1) | int) == 0) })
+       }}
+
+ # C) Produce update list with an *effective* power_type (ipmi if ok, else manual)
+ - name: Init effective update list
+ ansible.builtin.set_fact:
+ update_candidates_effective: []
+
+ # Annotate every candidate with its IPMI address and the power type we can
+ # actually use: 'ipmi' only when the candidate wanted ipmi AND the probe
+ # above succeeded for that hostname; otherwise fall back to 'manual'.
+ - name: Compute effective power_type for updates
+ when: update_candidates | default([]) | length > 0
+ loop: "{{ update_candidates }}"
+ loop_control: { label: "{{ item.hostname }}" }
+ ansible.builtin.set_fact:
+ update_candidates_effective: >-
+ {{
+ (update_candidates_effective | default([]))
+ + [ item | combine({
+ 'ipmi_address': (desired_by_hostname[item.hostname].ipmi_address | default('')),
+ 'effective_power_type':
+ (
+ ((item.power_type | default('manual')) == 'ipmi')
+ and (ipmi_ok_update_map | default({})).get(item.hostname, false)
+ )
+ | ternary('ipmi','manual')
+ }) ]
+ }}
+
+ # Rewrite hostname/arch/MAC and IPMI power parameters for candidates whose
+ # BMC answered. NOTE(review): the IPMI password is passed as a CLI argument
+ # (visible in the process list and in verbose logs) -- consider no_log.
+ - name: Update machines (ipmi reachable)
+ when: update_candidates_effective | selectattr('effective_power_type','equalto','ipmi') | list | length > 0
+ loop: "{{ update_candidates_effective | selectattr('effective_power_type','equalto','ipmi') | list }}"
+ loop_control: { label: "{{ item.hostname }}" }
+ ansible.builtin.command:
+ argv:
+ - maas
+ - "{{ maas_admin_username }}"
+ - machine
+ - update
+ - "{{ item.system_id }}"
+ - "hostname={{ item.hostname }}"
+ - "architecture={{ item.arch }}"
+ - "power_type=ipmi"
+ - "mac_addresses={{ item.mac }}"
+ - "power_parameters_power_address={{ item.ipmi_address | default('') }}"
+ - "power_parameters_power_user={{ maas_ipmi_username }}"
+ - "power_parameters_power_pass={{ maas_ipmi_password }}"
+
+ # Same update but with power_type=manual for hosts whose BMC did not answer
+ # (or that never wanted ipmi).
+ - name: Update machines (fallback to manual)
+ when: update_candidates_effective | selectattr('effective_power_type','equalto','manual') | list | length > 0
+ loop: "{{ update_candidates_effective | selectattr('effective_power_type','equalto','manual') | list }}"
+ loop_control: { label: "{{ item.hostname }}" }
+ ansible.builtin.command:
+ argv:
+ - maas
+ - "{{ maas_admin_username }}"
+ - machine
+ - update
+ - "{{ item.system_id }}"
+ - "hostname={{ item.hostname }}"
+ - "architecture={{ item.arch }}"
+ - "power_type=manual"
+ - "mac_addresses={{ item.mac }}"
+
+ # Final summary of everything that was skipped because its MAAS state was
+ # not modifiable (only shown when force mode is off).
+ - name: These machines need to be updated but were skipped for being in the wrong state
+ run_once: true
+ when:
+ - not (maas_force_machine_update | default(false) | bool)
+ - ((bond_skipped_due_to_state | default([])) | length > 0) or
+ ((machines_skipped_due_to_state | default([])) | length > 0)
+ ansible.builtin.debug:
+ msg: >-
+ These machines need to be updated but were skipped for being in the wrong state:
+ {{
+ ((bond_skipped_due_to_state | default([])) + (machines_skipped_due_to_state | default([])))
+ | unique | sort | list
+ }}
--- /dev/null
+---
+# TODOs:
+# - REMOVE VLAN interfaces that should not exist
+
+# Fresh auth (nonce) for any API calls in this include
+- include_tasks: ../_auth_header.yml
+
+# Normalize incoming iface object; never use a loop var named "iface" anywhere.
+# Normalize incoming iface object; never use a loop var named "iface" anywhere.
+# The caller passes iface_obj; everything below reads the local `iface` fact.
+- name: Normalize iface object
+ set_fact:
+ iface: "{{ iface_obj }}"
+
+# Ensure we have a vlan map; if empty, fetch it from MAAS
+- name: Ensure vlan map exists
+ set_fact:
+ _vlan_by_vid: "{{ _vlan_by_vid | default({}) }}"
+
+# Fetched only once per play: subsequent includes see a non-empty map and
+# skip the API call. no_log hides the auth header from output.
+- name: Read all fabrics (for VLAN lookup)
+ when: (_vlan_by_vid | length) == 0
+ uri:
+ url: "{{ _maas_api }}/fabrics/"
+ method: GET
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ return_content: yes
+ status_code: 200
+ register: _fabrics_resp
+ no_log: true
+
+# Flatten all VLANs from every fabric into one list
+- name: Collect all VLANs from fabrics payload
+ when:
+ - (_vlan_by_vid | length) == 0
+ - _fabrics_resp.json is defined
+ set_fact:
+ _all_vlans: "{{ (_fabrics_resp.json | map(attribute='vlans') | list) | flatten }}"
+
+# Build { "<vid>": <vlan_obj>, ... } for fast lookup
+- name: Build _vlan_by_vid map (keyed by VID as string)
+ when:
+ - (_vlan_by_vid | length) == 0
+ - _all_vlans is defined
+ set_fact:
+ _vlan_by_vid: >-
+ {{
+ dict(
+ (_all_vlans | map(attribute='vid') | map('string') | list)
+ | zip(_all_vlans)
+ )
+ }}
+
+# Build quick lookups
+# NOTE(review): `_ifaces` (this node's MAAS interface list) is assumed to be
+# set by an earlier include before this file runs -- confirm upstream.
+- name: Build interface lookups
+ set_fact:
+ _iface_id_by_mac: >-
+ {{
+ dict(
+ (_ifaces | selectattr('mac_address','defined')
+ | map(attribute='mac_address')
+ | map('lower') | list)
+ | zip(_ifaces | map(attribute='id'))
+ )
+ }}
+ _iface_name_by_id: >-
+ {{
+ dict(
+ (_ifaces | selectattr('id','defined') | map(attribute='id') | list)
+ | zip(_ifaces | selectattr('name','defined') | map(attribute='name') | list)
+ )
+ }}
+
+# Normalize VLAN lookup for int/string keys
+# NOTE(review): _vlan_by_vid is already keyed by string VIDs, so this combine
+# is presumably a no-op safety net for maps built elsewhere with int keys.
+- name: Build VLAN lookup (int & string keys)
+ set_fact:
+ _vlan_lookup: >-
+ {{
+ (_vlan_by_vid | default({}))
+ | combine(
+ dict(((_vlan_by_vid | default({})).keys() | list | map('string') | list)
+ | zip((_vlan_by_vid | default({})).values())),
+ recursive=True
+ )
+ }}
+
+# Resolve node system_id from interface facts (avoids mismatch)
+# Uses the and/or idiom: first interface's system_id when _ifaces is
+# non-empty, otherwise the caller-supplied system_id.
+- name: Resolve node system_id for interface ops
+ set_fact:
+ _node_system_id: >-
+ {{
+ (_ifaces | length) > 0 and ((_ifaces | first).system_id) or system_id
+ }}
+
+# Validate prefix_mac exists
+- name: "Ensure {{ iface.prefix | default('iface') }}_mac exists for {{ inv_host }}"
+  # BUG FIX: the old task name referenced an undefined `prefix` variable;
+  # use iface.prefix (guarded, since the assert itself checks it is defined)
+  # and identify the node the way the fail_msg already does (inv_host).
+  assert:
+    that:
+      - iface.prefix is defined
+      - hostvars[inv_host][iface.prefix ~ '_mac'] is defined
+    fail_msg: "Missing {{ iface.prefix }}_mac for {{ inv_host }}"
+
+# Resolve parent MAC from inventory (normalize to lower)
+# e.g. iface.prefix 'eth0' -> hostvars[inv_host]['eth0_mac'], lowercased so it
+# can be used as a key into _iface_id_by_mac.
+- name: Set _parent_mac
+ set_fact:
+ _parent_mac: "{{ hostvars[inv_host][iface.prefix ~ '_mac'] | string | lower }}"
+
+# Try to resolve an interface id for this MAC
+- name: Resolve parent interface id
+  set_fact:
+    # BUG FIX: `.get(mac) | default(None)` templates to the literal string
+    # "None" (never undefined and never a real none), so every downstream
+    # `_parent_id is none` check was dead. Use '' as the "not found"
+    # sentinel; the downstream guards also accept the empty string.
+    _parent_id: "{{ _iface_id_by_mac.get(_parent_mac, '') }}"
+
+- include_tasks: ../_auth_header.yml
+
+# Optionally create missing PHYSICAL interface (when allowed)
+- name: "Create missing physical interface for {{ inv_host }}"
+  # BUG FIX: the old task name referenced `host`, which is not defined in
+  # this include -- the rest of the file identifies the node via `inv_host`.
+  # Triggered when the MAC could not be resolved to an interface id; `omit`
+  # drops the mtu form field when the iface spec has none.
+  when:
+    - (_parent_id is none) or (_parent_id | string) == ''
+    - (maas_allow_create_physical | default(true)) | bool
+  uri:
+    url: "{{ _maas_api }}/nodes/{{ _node_system_id }}/interfaces/?op=create_physical"
+    method: POST
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+      Content-Type: application/x-www-form-urlencoded
+    body_format: form-urlencoded
+    body:
+      type: "physical"
+      mac_address: "{{ _parent_mac }}"
+      mtu: "{{ iface.mtu | default(omit) }}"
+      # name: "{{ iface.prefix }}" # optional; MAAS may auto-name (ethX)
+    status_code: [200, 201]
+    return_content: true
+  register: _create_phys
+  no_log: true
+
+- include_tasks: ../_auth_header.yml
+
+# Refresh interfaces + lookups after possible create
+# NOTE(review): a registered var is defined even when its task was skipped,
+# so `_create_phys is defined` is always true here and the refresh runs on
+# every pass -- harmless but probably not the intended "(if needed)".
+- name: Refresh MAAS interface facts after create (if needed)
+ when:
+ - _create_phys is defined
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ _node_system_id }}/interfaces/"
+ method: GET
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ return_content: true
+ status_code: 200
+ register: _ifaces_after_create
+ no_log: true
+- name: Re-set _ifaces after create
+ when:
+ - _ifaces_after_create is defined
+ set_fact:
+ _ifaces: "{{ _ifaces_after_create.json | list }}"
+
+# Rebuild the mac->id, name->id and id->name maps from the fresh _ifaces so
+# an interface created above becomes resolvable.
+- name: Rebuild interface facts + maps after create
+ when:
+ - _ifaces_after_create is defined
+ set_fact:
+ _iface_id_by_mac: >-
+ {{
+ dict(
+ (_ifaces | selectattr('mac_address','defined') | map(attribute='mac_address')
+ | map('lower') | list)
+ | zip(_ifaces | map(attribute='id'))
+ )
+ }}
+ _iface_id_by_name: >-
+ {{
+ dict(
+ (_ifaces | selectattr('name','defined') | map(attribute='name') | list)
+ | zip(_ifaces | map(attribute='id'))
+ )
+ }}
+ _iface_name_by_id: >-
+ {{
+ dict(
+ (_ifaces | selectattr('id','defined') | map(attribute='id') | list)
+ | zip(_ifaces | selectattr('name','defined') | map(attribute='name') | list)
+ )
+ }}
+
+# Resolve again now that we may have created it
+- name: Resolve parent interface id (post-create)
+  # BUG FIX: the original `default(None)` sentinel stringifies to "None",
+  # so also re-resolve when a previous lookup stored that string, and store
+  # '' (not "None") when the MAC is still unknown.
+  when: (_parent_id is none) or ((_parent_id | string) in ['', 'None'])
+  set_fact:
+    _parent_id: "{{ _iface_id_by_mac.get(_parent_mac, '') }}"
+
+# If still missing, fail cleanly (or switch to 'warn + skip' if you prefer)
+- name: Abort when parent interface is missing and auto-create disabled/failed
+  # BUG FIX: `_parent_id is none` could never fire because set_fact always
+  # stores strings ("None" for a failed lookup); treat '', 'None' and a
+  # genuine none all as "missing".
+  when: (_parent_id is none) or ((_parent_id | string) in ['', 'None'])
+  fail:
+    msg: >-
+      Could not find or create physical interface with MAC {{ _parent_mac }}
+      on {{ inv_host }} (system_id={{ _node_system_id }}).
+      Either re-commission the node or allow auto-create via
+      maas_allow_create_physical=true.
+
+# Load parent object (safe now)
+# An unresolvable _parent_id casts to int 0 and matches nothing -> {}.
+- name: Load parent interface object
+ set_fact:
+ _parent_obj: "{{ (_ifaces | selectattr('id','equalto', (_parent_id|int)) | list | first) | default({}) }}"
+
+# Reset the per-iface bond-match scratch state before scanning _desired_bonds.
+- name: Ensure prerequisites for bond MAC match exist
+ set_fact:
+ _desired_bonds: "{{ _desired_bonds | default([]) }}"
+ _parent_mac: "{{ _parent_mac | string | lower }}"
+ _bond_match: {}
+
+# Scan the desired bonds for one whose member MACs include this parent's MAC.
+# Each matching iteration overwrites _bond_match, so the *last* match wins;
+# only bonds that declare native_vid are considered.
+- name: Collect matching bond (by MAC) for this parent
+ set_fact:
+ _bond_match: "{{ bond }}"
+ loop: "{{ _desired_bonds }}"
+ loop_control:
+ loop_var: bond
+ label: "{{ bond.name | default('∅') }}"
+ when:
+ - bond.interfaces is defined
+ - bond.native_vid is defined
+ - _parent_mac in (bond.interfaces | map('extract', hostvars[inv_host]) | map('string') | map('lower') | list)
+
+# If the parent belongs to a bond, the bond's native VLAN takes precedence;
+# guarded so an unknown VID cannot blow up the _vlan_lookup access.
+- name: Inherit native VLAN from matched bond
+ set_fact:
+ _effective_native_vid: "{{ _bond_match.native_vid }}"
+ _effective_native_vlan_id: "{{ _vlan_lookup[_bond_match.native_vid | string].id }}"
+ when:
+ - _bond_match is mapping
+ - _bond_match | length > 0
+ - _bond_match.native_vid is defined
+ - (_bond_match.native_vid | string) in _vlan_lookup
+
+# If the loaded parent is a VLAN (e.g. eth0.1300), use its physical parent (e.g. eth0)
+- name: Detect if loaded parent is a VLAN
+ set_fact:
+ _parent_is_vlan: "{{ _parent_obj is mapping and (_parent_obj.type | default('')) == 'vlan' }}"
+
+# MAAS lists a VLAN interface's parents by *name*; take the first one.
+- name: Extract physical parent name from VLAN
+ when: _parent_is_vlan
+ set_fact:
+ _phys_parent_name: "{{ (_parent_obj.parents | default([])) | first | default('') }}"
+
+# Find the full interface object for that name; default({}, true) also maps
+# falsey results to {}.
+- name: Resolve physical parent object by name from _ifaces
+ when: _parent_is_vlan and (_phys_parent_name | length) > 0
+ set_fact:
+ _phys_parent_obj: >-
+ {{
+ (_ifaces | default([])
+ | selectattr('name','equalto', _phys_parent_name)
+ | list | first) | default({}, true)
+ }}
+
+- name: Set parent_id to the physical iface id (obj → name map → keep old)
+  # Fallback chain: resolved object id -> name->id map -> whatever we had.
+  when: _parent_is_vlan
+  set_fact:
+    # BUG FIX: `_iface_id_by_name` only exists after the post-create refresh
+    # task ran; calling .get() on an undefined variable aborts the play on
+    # the no-create path. Guard it with default({}) so the chain degrades
+    # to keeping the old _parent_id instead.
+    _parent_id: >-
+      {{
+        _phys_parent_obj.id
+        | default((_iface_id_by_name | default({})).get(_phys_parent_name), true)
+        | default(_parent_id, true)
+      }}
+    _parent_obj: >-
+      {{
+        (_phys_parent_obj if (_phys_parent_obj | length > 0) else _parent_obj)
+      }}
+
+# Safety net so we never send parent=0 again
+# An unresolved sentinel ('' or "None") casts to int 0 and fails this assert
+# before any create_vlan call can be issued with a bogus parent.
+- name: Assert parent interface id resolved before creating VLAN subinterface
+ assert:
+ that:
+ - _parent_id is defined
+ - (_parent_id | int) > 0
+ fail_msg: >-
+ Could not resolve physical parent for '{{ iface.prefix }}'.
+ parent_obj={{ _parent_obj | default({}) }} maps: by_name={{ _iface_id_by_name | default({}) }}.
+
+
+# Only check type if we actually have an object
+- name: Ensure parent is physical/bond before native VLAN update
+ when: _parent_obj is mapping and _parent_obj.type is defined
+ assert:
+ that:
+ - _parent_obj.type in ['physical','bond']
+ fail_msg: "Native VLAN can only be set on a physical/bond parent (id={{ _parent_id }})."
+
+# Only check type if we actually have an object
+- name: Ensure parent is physical/bond before native VLAN update
+ when: _parent_obj is mapping and _parent_obj.type is defined
+ assert:
+ that:
+ - _parent_obj.type in ['physical','bond']
+ fail_msg: "Native VLAN can only be set on a physical/bond parent (id={{ _parent_id }})."
+
+# --- MTU handling for parent interface ------------------------------------
+# Desired MTU comes from the iface spec; current MTU from the loaded parent
+# object (0 when the parent has no mtu field). Both guarded on iface.mtu so
+# ifaces without an MTU request skip the whole section.
+- name: Set desired MTU fact (if specified)
+ when: iface.mtu is defined
+ set_fact:
+ _desired_mtu: "{{ iface.mtu | int }}"
+
+- name: Set current MTU from parent object
+ when:
+ - iface.mtu is defined
+ - _parent_obj is mapping
+ set_fact:
+ _current_mtu: "{{ (_parent_obj.mtu | default(0)) | int }}"
+
+- include_tasks: ../_auth_header.yml
+
+- name: "Update MTU on {{ inv_host }}'s {{ _parent_obj.name | default('parent') }} (if different)"
+  # BUG FIX: the old task name referenced `host`, which is undefined in this
+  # include -- the rest of the file identifies the node via `inv_host`; the
+  # parent name is guarded in case _parent_obj is {}.
+  # PUTs the new MTU only when it differs from the current value.
+  when:
+    - iface.mtu is defined
+    - (_desired_mtu | int) > 0
+    - (_current_mtu | int) != (_desired_mtu | int)
+  uri:
+    url: "{{ _maas_api }}/nodes/{{ _node_system_id }}/interfaces/{{ _parent_id }}/"
+    method: PUT
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+      Content-Type: application/x-www-form-urlencoded
+    body_format: form-urlencoded
+    body:
+      mtu: "{{ _desired_mtu }}"
+    status_code: 200
+
+- name: Check current native on parent
+  # Capture the parent's current VLAN object from _ifaces (stringified
+  # "None" when the parent has no vlan; handled by the `is mapping` check
+  # in the next task).
+  # NOTE: this set_fact was duplicated on both sides of the auth-header
+  # include; the second, byte-identical computation was dropped and the
+  # auth refresh kept.
+  set_fact:
+    _current_native: "{{ (_ifaces | selectattr('id','equalto', (_parent_id|int)) | map(attribute='vlan') | list | first) | default(None) }}"
+
+- include_tasks: ../_auth_header.yml
+
+# _current_native is a dict when the parent has a VLAN; anything else
+# (including the stringified "None") falls through to 0.
+- name: Set _current_native_id from _current_native dict. Default to 0.
+ set_fact:
+ _current_native_id: >-
+ {{ (_current_native.id | int)
+ if (_current_native is mapping)
+ else 0 }}
+
+# If iface.native_vid is missing, and bond logic didn’t set anything, fallback to 'untagged'
+# Picks the VID of the VLAN literally named 'untagged'; omit when none exists.
+- name: Derive native VID from _vlan_lookup when native_vid is missing
+ when:
+ - iface.native_vid is not defined
+ - _effective_native_vlan_id is not defined
+ set_fact:
+ _effective_native_vid: >-
+ {{
+ (_vlan_lookup
+ | dict2items
+ | selectattr('value.name','equalto','untagged')
+ | map(attribute='value.vid')
+ | list
+ | first) | default(omit)
+ }}
+
+# Translate the effective native VID into a MAAS VLAN id.
+# NOTE(review): if the VID is missing from _vlan_lookup this raises an
+# undefined error -- presumably the lookup is complete by now; verify.
+- name: Resolve native VLAN ID from VID
+ when: _effective_native_vid is defined
+ set_fact:
+ _effective_native_vlan_id: "{{ _vlan_lookup[_effective_native_vid|string].id }}"
+
+# Figure out what ID to send to MAAS
+- name: Choose final native VLAN id to apply
+  set_fact:
+    # BUG FIX: ternary() evaluates BOTH branches eagerly, so when
+    # native_vid was absent the _vlan_lookup['None'].id lookup raised
+    # before the false branch could be selected. An inline if/else is
+    # lazy; also guard against a VID that is missing from the lookup
+    # (falls back to the bond/untagged-derived id, or omits the fact).
+    _native_vlan_id_to_apply: >-
+      {{
+        _vlan_lookup[iface.native_vid | string].id
+        if (iface.native_vid is defined
+            and iface.native_vid is not none
+            and (iface.native_vid | string) in _vlan_lookup)
+        else (_effective_native_vlan_id | default(omit))
+      }}
+
+# Only if we actually have an ID, and parent is physical/bond
+- name: "Set native VLAN on {{ inv_host }}'s {{ _parent_obj.name | default('parent') }} (if different)"
+  # BUG FIX: the old task name referenced `host`, which is undefined in this
+  # include -- every other task here identifies the node via `inv_host`; the
+  # parent name is guarded in case _parent_obj is {}.
+  when:
+    - _native_vlan_id_to_apply is defined
+    - _current_native is defined
+    - (_current_native_id | int) != (_native_vlan_id_to_apply | int)
+    - (_ifaces | selectattr('id','equalto', (_parent_id|int)) | map(attribute='type') | list | first) in ['physical','bond']
+  uri:
+    url: "{{ _maas_api }}/nodes/{{ _node_system_id }}/interfaces/{{ _parent_id }}/"
+    method: PUT
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+      Content-Type: application/x-www-form-urlencoded
+    body_format: form-urlencoded
+    body:
+      vlan: "{{ _native_vlan_id_to_apply }}"
+      link_connected: true
+    status_code: 200
+# no_log: true
+
+- include_tasks: ../_auth_header.yml
+
+# --- Index existing VLAN subinterfaces (by parent_id + vlan_id) ----------------
+- name: Init list of existing VLAN subinterfaces
+  # Fresh accumulator for the (parent, vlan) pairs collected below.
+  set_fact:
+    _existing_vlan_pairs: []
+
+# Optional (fast lookup): build name -> id map once
+- name: Build iface name→id map
+  set_fact:
+    _iface_name_to_id: >-
+      {{
+        dict((_ifaces | map(attribute='name') | list)
+             | zip(_ifaces | map(attribute='id') | list))
+      }}
+
+# Collect existing VLAN subinterfaces (translate parent name -> id)
+# One record per VLAN-type interface. Note that set_fact stringifies the
+# templated leaf values, so parent_id / vlan_id are stored as *strings*.
+- name: Collect existing VLAN subinterfaces (translate parent name -> id)
+ vars:
+ _parent_name: "{{ vlan_iface.parents | default([]) | first }}"
+ _parent_id: "{{ (_iface_name_to_id | default({})).get(_parent_name) | default(omit) }}"
+ _pair:
+ id: "{{ vlan_iface.id }}"
+ name: "{{ vlan_iface.name }}"
+ parent_name: "{{ _parent_name }}"
+ parent_id: "{{ _parent_id }}"
+ vlan_id: "{{ vlan_iface.vlan.id }}"
+ set_fact:
+ _existing_vlan_pairs: "{{ (_existing_vlan_pairs | default([])) + [_pair] }}"
+ loop: "{{ _ifaces | selectattr('type','equalto','vlan') | list }}"
+ loop_control:
+ loop_var: vlan_iface
+ label: "{{ vlan_iface.name }} ← {{ _parent_name }} (vlan_id={{ vlan_iface.vlan.id }})"
+
+- name: Ensure tagged VLAN subinterfaces exist (?op=create_vlan) # guarded
+  # Create <parent>.<vid> for every requested tagged VID that is not already
+  # present on this parent.
+  when:
+    - iface.tagged_vids is defined
+    - vid in iface.tagged_vids
+    - _vlan_lookup[vid|string] is defined
+    # BUG FIX: set_fact stringifies values, so the recorded parent_id /
+    # vlan_id pairs are strings; comparing them `equalto` *ints* never
+    # matched, and every VID was re-created on every run (masked only by
+    # the "already has an interface named" escape hatch in failed_when).
+    # Compare as strings on both sides.
+    - (
+        _existing_vlan_pairs
+        | selectattr('parent_id','equalto', (_parent_id | string))
+        | selectattr('vlan_id','equalto', (_vlan_lookup[vid|string].id | string))
+        | list | length
+      ) == 0
+  uri:
+    url: "{{ _maas_api }}/nodes/{{ _node_system_id }}/interfaces/?op=create_vlan"
+    method: POST
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+      Content-Type: application/x-www-form-urlencoded
+    body_format: form-urlencoded
+    body:
+      parent: "{{ _parent_id }}"
+      vlan: "{{ _vlan_lookup[vid|string].id }}"
+    status_code: [200]
+    return_content: true
+  loop: "{{ iface.tagged_vids | default([]) }}"
+  loop_control:
+    loop_var: vid
+    label: "{{ iface.prefix }} → VID {{ vid }}"
+  register: _create_vlan_results
+  # Tolerate "already has an interface named ..." responses; any other
+  # non-200 is a genuine failure.
+  failed_when: >
+    (_create_vlan_results.status | default(0)) != 200 and
+    ('already has an interface named' not in
+     (
+       (_create_vlan_results.content | default('') | lower) ~ ' ' ~
+       (_create_vlan_results.msg | default('') | lower) ~ ' ' ~
+       ((_create_vlan_results.json.name | default([])) | join(' ') | lower)
+     )
+    )
+  #no_log: true
+
+# Informational task: for each desired tagged VID that is already present on
+# the parent interface, say why the create task above skipped it.
+# Fix: guard _vlan_lookup[vid] like the create task does (L9997 block), so a
+# VID missing from _vlan_lookup does not blow up the `when` evaluation.
+- name: Skip note (VLAN subinterface already present)
+ debug:
+ msg: >-
+ Skipping create: parent_id={{ _parent_id }} already has vlan_id={{ _vlan_lookup[vid|string].id }}
+ ({{ iface.prefix }}.{{ vid }})
+ loop: "{{ iface.tagged_vids | default([]) }}"
+ loop_control:
+ loop_var: vid
+ label: "{{ iface.prefix }} → VID {{ vid }}"
+ when:
+ - iface.tagged_vids is defined
+ - _vlan_lookup[vid|string] is defined
+ - >
+ (_existing_vlan_pairs
+ | selectattr('parent_id','equalto', (_parent_id|int))
+ | selectattr('vlan_id','equalto', _vlan_lookup[vid|string].id)
+ | list | length) > 0
+
+- name: Rebuild interface facts + maps after create (if any changed)
+ when:
+ - _create_vlan_results is defined
+ - (_create_vlan_results.results | selectattr('status','defined') | list | length) > 0
+ include_tasks: machines/_refresh_iface_facts.yml
--- /dev/null
+# roles/maas/tasks/machines/_apply_subnet.yml
+# Expects: iface (id, name, vlan_id[, type]), candidate_subnets (list), system_id, _maas_api, maas_auth_header
+# Optional: iface.desired_mode or maas_iface_mode_default (defaults to "DHCP")
+
+# Safety: only operate on bond/vlan interfaces if iface.type is provided
+#- block:
+# Make sure we have iface as the current item
+ - name: Build candidate subnets for {{ iface.name }}
+ set_fact:
+ _candidate_subnets: >-
+ {{
+ _subnets_by_vlan[iface.vlan_id | string]
+ | default([])
+ }}
+
+ - name: Fail clearly if no subnet found for {{ iface.name }}
+ when: _candidate_subnets | length == 0
+ fail:
+ msg: >-
+ No subnet found for iface {{ iface.name }} (type={{ iface.type }},
+ vlan_id={{ iface.vlan_id }}). Known VLAN IDs: {{ _subnets_by_vlan.keys() | list }}
+
+ - name: Choose subnet for {{ iface.name }}
+ when: _candidate_subnets | length > 0
+ set_fact:
+ iface_subnet: "{{ _candidate_subnets[0] }}"
+
+# - name: Choose subnet for {{ iface.name }}
+# set_fact:
+# _chosen_subnet: >-
+# {{
+# (
+# candidate_subnets
+# | selectattr('managed','defined')
+# | selectattr('managed','eq', true)
+# | list
+# | first
+# )
+# | default((candidate_subnets | first), true)
+# }}
+
+# NOTE(review): this task checks `candidate_subnets` (the documented incoming
+# var, see header comment) while the tasks above compute `_candidate_subnets`.
+# If only `_candidate_subnets` exists at runtime this `when` errors on an
+# undefined variable — confirm which name is intended and unify.
+ - name: Skip if no candidate subnets for VLAN {{ iface.vlan_id }}
+ when: (candidate_subnets | length) == 0
+ debug:
+ msg: "No subnets on VLAN {{ iface.vlan_id }}; leaving {{ iface.name }} unchanged."
+
+ - block:
+ - include_tasks: _auth_header.yml
+
+ - name: Read current interface links
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/{{ iface.id }}/"
+ method: GET
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ return_content: yes
+ status_code: 200
+ register: _if_detail
+ no_log: true
+
+ # -------------------------
+ # Normalize / derive facts
+ # -------------------------
+ - name: Compute candidate subnet IDs and desired mode
+ set_fact:
+ _candidate_ids: "{{ candidate_subnets | map(attribute='id') | list }}"
+ _desired_mode: "{{ iface.desired_mode | default(maas_iface_mode_default | default('DHCP')) }}"
+
+ # Normalize links to a predictable shape
+ - name: Normalize current links on iface
+ set_fact:
+ _links_norm: []
+ - name: Append normalized link
+ set_fact:
+ _links_norm: >-
+ {{
+ _links_norm + [
+ {
+ 'id': l.id | default(omit),
+ 'subnet_id': (
+ l.subnet.id
+ if (l.subnet is mapping and (l.subnet.id is defined))
+ else (l.subnet if (l.subnet is defined) else omit)
+ ),
+ 'mode': (l.mode | default('AUTO')),
+ 'ip_address': l.ip_address | default(omit),
+ 'default_gateway': l.default_gateway | default(false)
+ }
+ ]
+ }}
+ loop: "{{ _if_detail.json.links | default([]) }}"
+ loop_control:
+ loop_var: l
+
+ - name: Collect existing links on this VLAN
+ set_fact:
+ _existing_on_vlan: >-
+ {{
+ _links_norm
+ | selectattr('subnet_id', 'defined')
+ | selectattr('subnet_id', 'in', _candidate_ids)
+ | list
+ }}
+
+ # Select first existing link (if any)
+ - name: Select first existing link (if any)
+ set_fact:
+ _existing_link: >-
+ {{
+ (_existing_on_vlan | list) | first | default(omit, true)
+ }}
+ _has_link_on_vlan: "{{ (_existing_on_vlan | length | int) > 0 }}"
+ _current_mode: "{{ (_existing_on_vlan | first).mode | default(None) if (_existing_on_vlan | length | int) > 0 else None }}"
+ _mode_mismatch: >-
+ {{
+ (_existing_on_vlan | length | int) > 0 and
+ (((_existing_on_vlan | first).mode | default('') | upper)
+ != (_desired_mode | upper))
+ }}
+
+ - name: Show link decision inputs
+ debug:
+ msg:
+ has_link_on_vlan: "{{ _has_link_on_vlan }}"
+ candidate_ids: "{{ _candidate_ids }}"
+ desired_mode: "{{ _desired_mode }}"
+
+ # -------------------------
+ # Actions
+ # -------------------------
+ - include_tasks: _auth_header.yml
+
+# # Case 1: No link on this VLAN -> link with desired mode
+# - name: Link subnet with desired mode (no existing link)
+# when:
+# - not _has_link_on_vlan
+# - _chosen_subnet is defined
+# uri:
+# url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/{{ iface.id }}/?op=link_subnet"
+# method: POST
+# headers:
+# Authorization: "{{ maas_auth_header }}"
+# Accept: application/json
+# Content-Type: application/x-www-form-urlencoded
+# body_format: form-urlencoded
+# body:
+# mode: "{{ _desired_mode }}" # DHCP / STATIC / AUTO / LINK_UP
+# subnet: "{{ _chosen_subnet.id }}" # integer id
+# status_code: [200, 201, 409]
+# no_log: true
+
+ - name: Link subnet with desired mode (no existing link)
+ when:
+ - not _has_link_on_vlan
+ - _candidate_ids | length > 0
+ #no_log: true
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/{{ iface.id }}/?op=link_subnet"
+ method: POST
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ body_format: json
+ body:
+ mode: "{{ _desired_mode | lower }}" # "dhcp"
+ subnet: "{{ _candidate_ids[0] }}"
+ status_code: 200
+ return_content: yes
+ register: _link_resp
+
+ # Case 2: Link exists but wrong mode -> unlink then relink with desired mode
+ - name: Unlink existing subnet (mode mismatch)
+ when:
+ - _mode_mismatch
+ - _existing_link is defined
+ - _existing_link.id is defined
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/{{ iface.id }}/?op=unlink_subnet"
+ method: POST
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body_format: form-urlencoded
+ body:
+ id: "{{ _existing_link.id }}"
+ status_code: [200, 204, 409]
+ #no_log: true
+
+ - include_tasks: _auth_header.yml
+ when: _mode_mismatch
+
+ # Case 2b: after unlinking a wrong-mode link, re-link the first candidate
+ # subnet with the desired mode.
+ # Fix: the original guarded on `_chosen_subnet is defined`, but the task that
+ # set _chosen_subnet is commented out above — the relink could never run,
+ # leaving mode-mismatched interfaces permanently unlinked. Use
+ # _candidate_ids[0], the same subnet the "no existing link" case links.
+ - name: Relink subnet with desired mode (after unlink)
+ when:
+ - _mode_mismatch
+ - _candidate_ids | length > 0
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/{{ iface.id }}/?op=link_subnet"
+ method: POST
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body_format: form-urlencoded
+ body:
+ mode: "{{ _desired_mode }}"
+ subnet: "{{ _candidate_ids[0] }}"
+ status_code: [200, 201, 409]
+ #no_log: true
+
+ # Case 3: Already correct -> noop
+ - name: Note existing correct link
+ when:
+ - _has_link_on_vlan
+ - not _mode_mismatch
+ debug:
+ msg: >-
+ "{{ iface.name }} already linked to VLAN {{ iface.vlan_id }} subnet with mode {{ _current_mode }}; skipping."
+ when: (candidate_subnets | length) > 0
+# when: (iface.type is not defined) or (iface.type in ['bond','vlan'])
--- /dev/null
+---
+# Filter the raw MAAS node payload down to entries that have a hostname.
+# Fix: the task was named "Init _nodes dict" but the value is a *list*
+# (selectattr | list) — the name was misleading when debugging downstream maps.
+- name: Init _nodes list (nodes with a hostname)
+ set_fact:
+ _nodes: "{{ maas_nodes_list | selectattr('hostname','defined') | list }}"
+ no_log: true
+
+#- debug: var=_nodes
+#- pause:
+
+- name: Build maps keyed by FQDN (single pass, no loop)
+ set_fact:
+ maas_by_hostname: >-
+ {{ dict(
+ _nodes | map(attribute='hostname')
+ | zip(_nodes)
+ ) }}
+ maas_host_to_macs: >-
+ {{ dict(
+ _nodes | map(attribute='hostname')
+ | zip(
+ _nodes
+ | map(attribute='interface_set')
+ | map('default', [])
+ | map('map', attribute='mac_address')
+ | map('list')
+ )
+ ) }}
+ maas_host_to_ifaces: >-
+ {{ dict(
+ _nodes | map(attribute='hostname')
+ | zip(
+ _nodes | map(attribute='interface_set') | map('default', [])
+ )
+ ) }}
+ maas_host_to_status: >-
+ {{ dict(
+ _nodes | map(attribute='hostname')
+ | zip(_nodes | map(attribute='status_name'))
+ ) }}
+ no_log: true
+
+# Short names list (dedup check can use this)
+# Build short name list (from MAAS payload, no regex needed)
+- name: Build short name list
+ set_fact:
+ _short_names: >-
+ {{
+ (maas_by_hostname | default({}))
+ | dict2items
+ | map(attribute='value.hostname')
+ | reject('equalto', None)
+ | list
+ }}
+
+# short -> id
+- name: Build maas_short_to_id
+ set_fact:
+ maas_short_to_id: >-
+ {{
+ dict(
+ (
+ (maas_by_hostname | default({}))
+ | dict2items
+ | map(attribute='value.hostname')
+ | reject('equalto', None)
+ )
+ | zip(
+ (maas_by_hostname | default({}))
+ | dict2items
+ | map(attribute='value.system_id')
+ )
+ )
+ }}
+
+# short -> object
+- name: Build maas_by_short
+ set_fact:
+ maas_by_short: >-
+ {{
+ dict(
+ (
+ (maas_by_hostname | default({}))
+ | dict2items
+ | map(attribute='value.hostname')
+ | reject('equalto', None)
+ )
+ | zip(
+ (maas_by_hostname | default({}))
+ | dict2items
+ | map(attribute='value')
+ )
+ )
+ }}
+ no_log: true
+
+# short -> ansible inventory_host
+- name: Build inventory_by_short
+ set_fact:
+ inventory_by_short: >-
+ {{
+ (inventory_by_short | default({}))
+ | combine({ (inv_fqdn.split('.')[0]): inv_fqdn })
+ }}
+ loop: "{{ groups['testnodes'] }}"
+ loop_control:
+ loop_var: inv_fqdn
--- /dev/null
+---
+# Expected vars (passed by caller):
+# - parent_id (int/string MAAS iface ID of the parent, e.g. bond id)
+# - vlan_id (int/string MAAS VLAN object id, not VID)
+# - system_id (MAAS node system_id, e.g. gseprg)
+# - vid_label (optional, for nicer labels/logging)
+
+- name: Validate required vars
+ fail:
+ msg: >-
+ Missing var(s). parent_id={{ parent_id|default('UNSET') }},
+ vlan_id={{ vlan_id|default('UNSET') }},
+ system_id={{ system_id|default('UNSET') }}
+ when: parent_id is not defined or vlan_id is not defined or system_id is not defined
+
+# Optional: quick sanity that the parent exists in _ifaces (if _ifaces available)
+- name: Sanity-check parent exists on node (optional)
+ vars:
+ _parent_found: >-
+ {{
+ (_ifaces | selectattr('id','equalto', parent_id|int) | list | length) > 0
+ }}
+ when:
+ - _ifaces is defined
+ - not _parent_found | bool
+ fail:
+ msg: "Parent interface id {{ parent_id }} not found on node {{ system_id }}"
+
+- include_tasks: ../_auth_header.yml
+
+- name: "POST op=create_vlan (parent={{ parent_id }}, vlan={{ vlan_id }})"
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/?op=create_vlan"
+ method: POST
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body: "parent={{ parent_id }}&vlan={{ vlan_id }}"
+ body_format: form-urlencoded
+ status_code: 200
+ register: _create_vlan_resp
+ changed_when: true
--- /dev/null
+---
+# Assumes incoming vars:
+# - system_id
+# - bond: { name, mode, mtu, link_speed?, interfaces[] or parents[], tagged_vids? }
+# - _ifaces: current MAAS interface list for node (from your refresh task)
+# - _vlan_lookup: { vid(str) -> vlan_obj with .id }
+# Uses ../_auth_header.yml to set maas_auth_header for MAAS API calls.
+
+- name: Reset per-bond scratch facts
+ set_fact:
+ _existing_bond_obj: {}
+ _existing_bond_id: 0
+ _desired_parent_tokens: []
+ _desired_parent_macs: []
+ _bond_parent_ids: []
+ _bond_parent_names: []
+ _bond_existing_tagged_vids: []
+ _bond_desired_tagged_vids: []
+
+- name: Build iface lookup maps (name→mac, mac→id, id→name)
+ set_fact:
+ _name_to_mac: "{{ dict(_ifaces | map(attribute='name') | zip(_ifaces | map(attribute='mac_address') | map('lower'))) }}"
+ _name_to_mac_ci: "{{ dict((_ifaces | map(attribute='name') | map('lower')) | zip(_ifaces | map(attribute='mac_address') | map('lower'))) }}"
+ _mac_to_id: "{{ dict(_ifaces | map(attribute='mac_address') | map('lower') | zip(_ifaces | map(attribute='id'))) }}"
+ _id_to_name: "{{ dict(_ifaces | map(attribute='id') | zip(_ifaces | map(attribute='name'))) }}"
+ _iface_by_name: "{{ dict(_ifaces | map(attribute='name') | zip(_ifaces)) }}"
+
+- name: Build id→name map with INT keys
+ set_fact:
+ _id_to_name_int: >-
+ {{
+ dict(
+ (_iface_by_name | dict2items | map(attribute='value.id') | list)
+ | zip(_iface_by_name | dict2items | map(attribute='key') | list)
+ )
+ }}
+
+- name: Build mac→id map for physical/virtual (exclude bonds)
+ set_fact:
+ _mac_to_id_phys: >-
+ {{
+ dict(
+ (_ifaces
+ | rejectattr('type', 'equalto', 'bond')
+ | map(attribute='mac_address')
+ | map('lower')
+ | list)
+ |
+ zip(
+ _ifaces
+ | rejectattr('type', 'equalto', 'bond')
+ | map(attribute='id')
+ | list
+ )
+ )
+ }}
+
+- name: Collect desired parent tokens (could be var names or MACs)
+ set_fact:
+ _desired_parent_tokens: "{{ (bond.interfaces | default(_bond_parent_names) | default([])) | map('string') | list }}"
+
+# init
+- name: Resolve inventory host for token lookup
+ set_fact:
+ _inv_host_resolved: "{{ inv_host | default(inventory_hostname) }}"
+ changed_when: false
+
+# init
+- set_fact:
+ _desired_parent_macs: []
+ _unresolved_tokens: []
+ _unresolved_parent_tokens: []
+ changed_when: false
+
+# Resolve each token to a MAC:
+# precedence: direct MAC token → iface name (CI) → inventory var value
+- name: Resolve desired parent tokens to MACs
+ vars:
+ mac_from_tok: "{{ (parent_tok | lower) if (parent_tok | lower is match('^([0-9a-f]{2}:){5}[0-9a-f]{2}$')) else '' }}"
+ mac_from_name: "{{ _name_to_mac_ci.get(parent_tok | lower, '') }}"
+ mac_from_var: "{{ (hostvars[_inv_host_resolved][parent_tok] | default('')) | string | lower }}"
+ mac_candidate: "{{ [mac_from_tok, mac_from_name, mac_from_var] | select('match','^([0-9a-f]{2}:){5}[0-9a-f]{2}$') | list | first | default('') }}"
+ set_fact:
+ _desired_parent_macs: "{{ _desired_parent_macs + [mac_candidate] if mac_candidate else _desired_parent_macs }}"
+ _unresolved_tokens: "{{ _unresolved_tokens + [parent_tok] if not mac_candidate else _unresolved_tokens }}"
+ loop: "{{ _desired_parent_tokens }}"
+ loop_control:
+ loop_var: parent_tok
+
+- name: Fail if any desired parents didn’t normalize to a MAC
+ assert:
+ that:
+ - _unresolved_tokens | length == 0
+ fail_msg: "For bond {{ bond.name }}, could not resolve parent(s) to MACs: {{ _unresolved_tokens }}"
+
+- name: Find existing bond by name
+ set_fact:
+ _bond_by_name: >-
+ {{
+ (_ifaces | selectattr('type','equalto','bond')
+ | selectattr('name','equalto', bond.name)
+ | list | first) | default({})
+ }}
+
+# set from _bond_by_name (computed in the previous task)
+- name: Cache bond object/id from _bond_by_name
+ set_fact:
+ _existing_bond_obj: "{{ _bond_by_name | default({}) }}"
+ _existing_bond_id: "{{ (_bond_by_name.id | default(0)) | int }}"
+
+# (delete the “If not found by name…” task — it does nothing)
+
+# now, only scan by parent MACs if the id is still 0
+# now, only scan by parent MACs if the id is still 0
+# Fix: the task claims to be order-insensitive, but it compared a *sorted*
+# observed MAC list against the *unsorted* _desired_parent_macs (which is only
+# normalized/sorted later) — so a bond whose parents were declared in a
+# different order was never matched. Sort both sides of the comparison.
+- name: Scan bonds to match desired parent MACs (order-insensitive)
+ when: (_existing_bond_id | int) == 0
+ set_fact:
+ _existing_bond_obj: >-
+ {{
+ bond_iface if (
+ ((bond_iface.parents | default([]))
+ | map('extract', _name_to_mac) | map('lower') | list | sort)
+ == (_desired_parent_macs | sort)
+ )
+ else _existing_bond_obj | default({})
+ }}
+ _existing_bond_id: >-
+ {{
+ (
+ bond_iface.id if (
+ ((bond_iface.parents | default([]))
+ | map('extract', _name_to_mac) | map('lower') | list | sort)
+ == (_desired_parent_macs | sort)
+ )
+ else _existing_bond_id | default(0)
+ ) | int
+ }}
+ loop: "{{ _ifaces | selectattr('type','equalto','bond') | list }}"
+ loop_control:
+ loop_var: bond_iface
+
+
+
+# 1) Compute observed parent MACs from the bond object
+- name: Compute observed parent MACs
+ set_fact:
+ _observed_parent_macs: >-
+ {{
+ (_existing_bond_obj.parents | default([]))
+ | map('extract', _name_to_mac)
+ | select('defined')
+ | map('lower') | list | sort
+ }}
+
+# 2) (Idempotent) normalize desired list just in case
+- name: Normalize desired parent MACs
+ set_fact:
+ _desired_parent_macs: "{{ (_desired_parent_macs | default([])) | map('lower') | list | sort }}"
+
+# 3) Compare using normalized types/lists
+- name: Compute MAC-based parent match flag
+ set_fact:
+ _bond_parents_match: "{{ (_existing_bond_id | int) > 0 and (_observed_parent_macs == _desired_parent_macs) }}"
+
+- name: Derive bond parent names from desired MACs (phys-only, robust to renames)
+ set_fact:
+ _bond_parent_names: >-
+ {{
+ _desired_parent_macs
+ | map('extract', _mac_to_id_phys)
+ | select('defined')
+ | list
+ | map('int')
+ | map('extract', _id_to_name_int)
+ | list
+ }}
+
+- name: Require bond parent names
+ assert:
+ that:
+ - _bond_parent_names | length > 0
+ - _bond_parent_names | length == (_desired_parent_macs | length)
+ fail_msg: >-
+ Could not derive parent names from desired MACs (got {{ _bond_parent_names | default([]) }}).
+ Check _mac_to_id_phys={{ _mac_to_id_phys }} and _id_to_name_int={{ _id_to_name_int }}.
+
+#- name: Compute desired parent IDs from MACs
+# set_fact:
+# _bond_parent_ids: >-
+# {{
+# _desired_parent_macs
+# | map('lower')
+# | map('extract', _mac_to_id)
+# | list
+# }}
+
+- name: Compute desired parent IDs from MACs (prefer non-bond ifaces)
+ set_fact:
+ _bond_parent_ids: >-
+ {{
+ _desired_parent_macs
+ | map('lower')
+ | map('extract', _mac_to_id_phys)
+ | list
+ }}
+
+# Guard: parent-name derivation from MACs must have produced something.
+# Fix: the fail_msg referenced `_iface_id_by_mac`, which is never defined in
+# this file — when the assert fired, Ansible raised an undefined-variable
+# templating error instead of the diagnostic. Use the real maps
+# (_mac_to_id_phys and _id_to_name) built earlier in this file.
+- name: Assert we derived parent names
+ assert:
+ that:
+ - _bond_parent_names | length > 0
+ fail_msg: >-
+ Could not derive parent names from MACs={{ _desired_parent_macs }}.
+ Known MAC->ID map={{ _mac_to_id_phys | to_nice_json }} id->name={{ _id_to_name | to_nice_json }}
+
+- name: Fail if any desired MACs are unknown to MAAS (non-bond)
+ vars:
+ _missing: >-
+ {{
+ _desired_parent_macs
+ | map('lower')
+ | reject('in', _mac_to_id_phys.keys())
+ | list
+ }}
+ assert:
+ that:
+ - _missing | length == 0
+ - (_bond_parent_ids | select('gt', 0) | list | length) == (_bond_parent_ids | length)
+ fail_msg: >-
+ Unresolved parent MACs for {{ bond.name }}: {{ _missing }}
+
+# Resolve the bond's native VID (a 802.1Q VID) to the MAAS VLAN object id
+# used by the create_bond payload below.
+# NOTE(review): _bond_create_native_vid is NOT cleared by the
+# "Reset per-bond scratch facts" task at the top of this file, so when this
+# file runs in a per-bond loop, a value from a previous bond can leak into the
+# current bond's create payload (the payload only checks `is defined`) —
+# TODO confirm and add it to the reset list.
+- name: Temporarily set _bond_create_native_vid
+ set_fact:
+ _bond_create_native_vid: "{{ _vlan_lookup[bond.native_vid|string].id }}"
+ when:
+ - bond.native_vid is defined
+ - (bond.native_vid|string) in _vlan_lookup
+
+- name: Build create_bond payload (no link_speed on create)
+ when: not _bond_parents_match
+ set_fact:
+ _create_bond_qs: >-
+ {{
+ (
+ ['name=' ~ (bond.name | urlencode)]
+ + (_bond_parent_ids
+ | map('string')
+ | map('regex_replace','^(.*)$','parents=\1')
+ | list)
+ + (bond.mtu is defined | ternary(['mtu=' ~ (bond.mtu|string)], []))
+ + (bond.mode is defined | ternary(['bond_mode=' ~ (bond.mode | urlencode)], []))
+ + (_bond_create_native_vid is defined
+ | ternary(['vlan=' ~ (_bond_create_native_vid|string)], []))
+ )
+ | join('&')
+ }}
+
+- include_tasks: ../_auth_header.yml
+ when: not _bond_parents_match
+
+- name: "POST ?op=create_bond (only if needed) for {{ _inv_host_resolved }}"
+ when: not _bond_parents_match
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/?op=create_bond"
+ method: POST
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body: "{{ _create_bond_qs }}"
+ body_format: form-urlencoded
+ status_code: 200
+ register: _bond_create_resp
+ changed_when: true
+# no_log: true
+
+- name: Refresh interface facts (post create)
+ when: not _bond_parents_match
+ include_tasks: ../_refresh_iface_facts.yml
+
+- name: Re-resolve bond by name (after create)
+ set_fact:
+ _existing_bond_obj: "{{ _iface_by_name.get(bond.name, {}) }}"
+
+- name: Resolve bond by parents when name lookup fails
+ when:
+ - (_existing_bond_id | int) == 0
+ - _bond_parent_names | length > 0
+ set_fact:
+ _existing_bond_obj: >-
+ {{
+ (_ifaces | selectattr('type','equalto','bond') | list)
+ | selectattr('parents','equalto', _bond_parent_names)
+ | first | default({})
+ }}
+
+# 1) id + params first
+- name: Cache normalized bond id + params
+ set_fact:
+ _existing_bond_id: "{{ _existing_bond_obj.id | default(0) | int }}"
+ _existing_params: "{{ _existing_bond_obj.params | default({}) }}"
+
+# 2) then derive fields from those
+- name: Cache normalized bond fields
+ set_fact:
+ _existing_mtu: "{{ _existing_params.mtu | default(0) | int }}"
+ _existing_mode: "{{ _existing_params.bond_mode | default('') | string }}"
+ _existing_link_speed: "{{ _existing_bond_obj.link_speed | default(0) | int }}"
+ _existing_link_connected: "{{ _existing_bond_obj.link_connected | default(false) | bool }}"
+
+# Derive booleans describing which bond properties need an API update.
+# Fixes:
+# - `bond.link_speed | int | default(0)` applied `int` *before* `default`,
+# so an undefined link_speed raised an undefined-variable error instead of
+# defaulting; the filters must run default-first.
+# - `bond.mtu | int` was unguarded; if mtu is omitted (the create payload
+# treats it as optional) the comparison errored. Guard with `is defined`.
+- name: Decide create/mtu/mode/speed update flags
+ set_fact:
+ _needs_bond_create: "{{ (_existing_bond_id | int) == 0 }}"
+ _needs_bond_mtu_update: "{{ (_existing_bond_id | int) > 0 and bond.mtu is defined and (_existing_mtu | int) != (bond.mtu | int) }}"
+ _needs_bond_mode_update: "{{ (_existing_bond_id | int) > 0 and (_existing_mode | string) != (bond.mode | default('') | string) }}"
+ _needs_bond_speed_update: >-
+ {{
+ (_existing_bond_id | int) > 0
+ and (_existing_link_connected | bool)
+ and ((_existing_link_speed | int) != (bond.link_speed | default(0) | int))
+ }}
+
+- name: Assemble bond update payload
+ set_fact:
+ _bond_update_payload: >-
+ {{
+ dict()
+ | combine( {'mtu': (bond.mtu | int)} if _needs_bond_mtu_update else {} )
+ | combine( {'bond_mode': bond.mode} if (_needs_bond_mode_update and (bond.mode | default('') | length > 0)) else {} )
+ }}
+
+- include_tasks: ../_auth_header.yml
+
+- name: "PUT /interfaces/{id} (bond_mode/mtu) for {{ _inv_host_resolved }}"
+ when:
+ - (_existing_bond_id | int) > 0
+ - (_bond_update_payload | length) > 0
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/{{ _existing_bond_id }}/"
+ method: PUT
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body: "{{ _bond_update_payload }}"
+ body_format: form-urlencoded
+ status_code: 200
+ register: _bond_base_update
+ changed_when: true
+
+# Optional: visibility when there's nothing to change
+- name: No bond base update needed
+ when:
+ - (_existing_bond_id | int) > 0
+ - (_bond_update_payload | length) == 0
+ debug:
+ msg: "Bond {{ _existing_bond_id }} already has requested mtu/mode. No update."
+
+- name: Refresh interface facts (post base update)
+ when:
+ - (_existing_bond_id | int) > 0
+ - (_bond_update_payload | length) > 0
+ include_tasks: ../_refresh_iface_facts.yml
+
+# Only re-resolve if we actually updated anything
+- name: Re-resolve bond after base update
+ when: (_bond_update_payload | length) > 0
+ set_fact:
+ _existing_bond_obj: >-
+ {{
+ (
+ _ifaces
+ | selectattr('type', 'equalto', 'bond')
+ | selectattr('name', 'equalto', bond.name)
+ | list
+ ) | first | default(_existing_bond_obj)
+ }}
+
+- name: Re-cache normalized bond facts (fresh after base update)
+ when:
+ - (_existing_bond_id | int) > 0
+ set_fact:
+ _existing_params: "{{ _existing_bond_obj.params | default({}) }}"
+ _existing_mtu: "{{ _existing_params.mtu | default(0) | int }}"
+ _existing_mode: "{{ _existing_params.bond_mode | default('') }}"
+ _existing_link_speed: "{{ _existing_bond_obj.link_speed | default(0) | int }}"
+ _existing_link_connected: "{{ _existing_bond_obj.link_connected | default(false) | bool }}"
+
+- name: Read link_connected for bond iface (normalized)
+ set_fact:
+ _bond_link_connected: "{{ _existing_link_connected | bool }}"
+
+# Recompute the link-speed update flag from the refreshed bond facts.
+# Fix: `bond.link_speed | int | default(0)` ran `int` before `default`, so an
+# undefined link_speed errored whenever the bond was link-connected; the
+# default must be applied first.
+- name: Decide if we need to update the link speed
+ set_fact:
+ _needs_bond_speed_update: >-
+ {{
+ (_existing_bond_id | int) > 0
+ and (_bond_link_connected | bool)
+ and ((_existing_link_speed | int) != (bond.link_speed | default(0) | int))
+ }}
+
+- include_tasks: ../_auth_header.yml
+
+- name: "PUT /interfaces/{id} (link_speed) for {{ _inv_host_resolved }}"
+ when:
+ - bond.link_speed is defined
+ - _bond_link_connected | bool
+ - (_existing_bond_id | int) > 0
+ - _needs_bond_speed_update
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/{{ _existing_bond_id }}/"
+ method: PUT
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body: "link_speed={{ bond.link_speed }}"
+ body_format: form-urlencoded
+ status_code: 200
+ register: _bond_speed_update
+ changed_when: true
+
+- name: Derive bond parent names (prefer explicit, else MAAS, else cache)
+ set_fact:
+ _bond_parent_names: >-
+ {{
+ (
+ bond.parents
+ | default(_existing_bond_obj.parents
+ | default(_iface_by_name[bond.name].parents | default([])))
+ ) | sort
+ }}
+
+- name: Require bond parent names
+ assert:
+ that:
+ - _bond_parent_names | length > 0
+ fail_msg: >-
+ Need parent names for {{ bond.name }}. Checked:
+ bond.parents, _existing_bond_obj.parents, and _iface_by_name[{{ bond.name }}].parents.
+
+- name: Collect parent VLAN ids for bond parents
+ set_fact:
+ _parent_vlan_ids: >-
+ {{
+ _bond_parent_names
+ | map('extract', _iface_by_name)
+ | map(attribute='vlan')
+ | select('defined')
+ | map(attribute='id')
+ | list
+ }}
+
+- name: Unique-ify parent VLAN ids
+ set_fact:
+ _parent_vlan_ids_unique: "{{ _parent_vlan_ids | unique }}"
+
+# Pick the native VLAN object id to apply to the bond's parents, by priority:
+# explicit bond.native_vlan_id → bond.native_vid resolved via _vlan_lookup →
+# the single VLAN the parents already agree on.
+# NOTE(review): the empty `else` branch makes set_fact store an *empty string*
+# when none of the cases match — the variable is then still "defined", so any
+# downstream `is defined` check must also treat the empty string as "no
+# target" (and `| int` on it silently yields 0) — confirm downstream guards.
+- name: Decide target native VLAN id
+ set_fact:
+ _target_native_vlan_id: >-
+ {%- if bond is defined and bond.native_vlan_id is defined -%}
+ {{ bond.native_vlan_id | int }}
+ {%- elif bond is defined and bond.native_vid is defined and (bond.native_vid|string) in _vlan_lookup -%}
+ {{ _vlan_lookup[bond.native_vid|string].id | int }}
+ {%- elif (_parent_vlan_ids_unique | length) == 1 -%}
+ {{ (_parent_vlan_ids_unique | first) | int }}
+ {%- else -%}
+ {%- endif -%}
+
+# True when a target native VLAN exists and at least one parent is not on it.
+# Fix: the task above always defines _target_native_vlan_id (its empty `else`
+# branch stores an empty string), so `is defined` alone was always true and
+# `'' | int` silently became VLAN 0. Treat an empty value as "no target".
+- name: Compute need to change parent native VLANs
+ set_fact:
+ _need_parent_vlan_change: >-
+ {{
+ (_target_native_vlan_id is defined)
+ and ((_target_native_vlan_id | string | trim | length) > 0)
+ and (_parent_vlan_ids | select('ne', _target_native_vlan_id | int) | list | length > 0)
+ }}
+
+# Hard error when parents sit on different native VLANs and no target was
+# given to reconcile them.
+# Fix: `_target_native_vlan_id is not defined` could never be true — the
+# deciding set_fact always defines it (empty string in the no-target case) —
+# so this safety check was dead code. Also treat an empty value as "unset".
+- name: Fail if parents disagree and no target VLAN provided
+ fail:
+ msg: >-
+ Parents {{ _bond_parent_names }} have different native VLANs {{ _parent_vlan_ids_unique }},
+ and no bond.native_vid/native_vlan_id was provided to reconcile them.
+ when:
+ - (_target_native_vlan_id is not defined) or ((_target_native_vlan_id | string | trim | length) == 0)
+ - (_parent_vlan_ids_unique | length) > 1
+
+- name: Parent native VLAN already correct; skipping updates
+ debug:
+ msg: >-
+ Parents {{ _bond_parent_names }} already on VLAN {{ _target_native_vlan_id }}; no change.
+ when: not _need_parent_vlan_change
+
+- name: Ensure bond parents have native VLAN set (only when needed)
+ include_tasks: ../machines/_set_parent_native.yml
+ loop: "{{ _bond_parent_names | map('extract', _iface_by_name) | map(attribute='id') | list }}"
+ loop_control:
+ loop_var: parent_id
+ label: "{{ parent_id }} → vlan {{ _target_native_vlan_id }}"
+ when:
+ - _need_parent_vlan_change
+
+# Refresh facts to see the parents’ new native VLAN
+- name: Refresh interface facts (after setting parents’ native VLAN)
+ include_tasks: "_refresh_iface_facts.yml"
+ when:
+ - _need_parent_vlan_change
+
+# Re-resolve bond by name, else by parents (no item/loop)
+- name: Re-resolve bond by name
+ set_fact:
+ _existing_bond_obj: "{{ _iface_by_name.get(bond.name, {}) }}"
+
+- name: Resolve bond by parents when name lookup fails (order-insensitive)
+ set_fact:
+ _existing_bond_obj: >-
+ {{
+ (
+ _ifaces
+ | selectattr('type','equalto','bond')
+ | selectattr('parents','defined')
+ | list
+ )
+ | selectattr('parents', 'equalto', _bond_parent_names | sort)
+ | list
+ | first
+ | default({})
+ }}
+ when: _existing_bond_obj | length == 0
+
+- name: Re-cache normalized bond id (after final resolve)
+ set_fact:
+ _existing_bond_id: "{{ _existing_bond_obj.id | default(0) | int }}"
+
+- name: Gather VLAN vids already on {{ bond.name }}
+ set_fact:
+ _bond_existing_tagged_vids: >-
+ {{
+ (_ifaces
+ | selectattr('type','equalto','vlan')
+ | selectattr('parents','defined')
+ | selectattr('parents','contains', bond.name)
+ | map(attribute='vlan') | select('defined')
+ | map(attribute='vid') | map('string') | list)
+ }}
+
+# (2) Desired tagged vids (EXCLUDING native)
+- name: Compute desired vids
+ set_fact:
+ _bond_desired_tagged_vids: >-
+ {{
+ (bond.tagged_vids | default([]) | map('string') | unique | list)
+ | difference([ (bond.native_vid | default('') | string) ])
+ }}
+
+# (3) Missing tagged vids
+- name: Compute missing vids
+ set_fact:
+ _bond_missing_tagged_vids: "{{ _bond_desired_tagged_vids | difference(_bond_existing_tagged_vids | default([])) }}"
+
+# (4) Create missing VLAN subinterfaces on the bond
+- name: Create missing VLAN subinterfaces on {{ bond.name }}
+ include_tasks: ../_create_vlan_on_parent.yml
+ loop: "{{ _bond_missing_tagged_vids }}"
+ loop_control:
+ loop_var: vid
+ label: "{{ bond.name }} → VID {{ vid }}"
+ vars:
+ parent_id: "{{ _existing_bond_id | int }}"
+ vlan_id: "{{ _vlan_lookup[vid|string].id }}"
+ vid_label: "{{ vid|string }}"
+ system_id: "{{ _existing_bond_obj.system_id }}"
+ when: (_existing_bond_id | int) > 0
--- /dev/null
+---
+# Expects:
+# - desired_iface (from the loop item), with .prefix and boot:true
+# - inv_host, _nodes, _node_system_id available in scope
+# - hostvars[inv_host]["<prefix>_mac"] defined (e.g., 25Gb_1_mac)
+
+# 1) Resolve prefix and MAC
+- name: Resolve boot prefix from loop item
+ set_fact:
+ _boot_prefix: "{{ desired_iface.prefix }}"
+
+- name: Ensure <prefix>_mac exists for boot NIC
+ assert:
+ that:
+ - _boot_prefix is defined
+ - hostvars[inv_host][_boot_prefix ~ '_mac'] is defined
+ fail_msg: "Missing {{ _boot_prefix }}_mac for {{ inv_host }}"
+
+- name: Normalize boot MAC
+ set_fact:
+ _boot_mac: "{{ hostvars[inv_host][_boot_prefix ~ '_mac'] | string | lower }}"
+
+# 2) Resolve node object from local _nodes (mapping or list)
+- name: Resolve _node_obj from _nodes (no API call)
+ set_fact:
+ _node_obj: >-
+ {{
+ (_nodes if (_nodes is mapping)
+ else ((_nodes | selectattr('system_id','equalto', _node_system_id) | list | first) | default({}, true)))
+ }}
+
+# 3) Build MAC→id map from that node’s interface_set (physical/bond only)
+- name: Build MAC→id map from _nodes.interface_set
+ set_fact:
+ _node_mac_to_id: >-
+ {{
+ dict(
+ (
+ (_node_obj.interface_set | default([]))
+ | selectattr('type','in',['physical','bond'])
+ | map(attribute='mac_address')
+ | map('lower') | list
+ )
+ | zip(
+ (_node_obj.interface_set | default([]))
+ | selectattr('type','in',['physical','bond'])
+ | map(attribute='id') | list
+ )
+ )
+ }}
+
+- name: Resolve desired boot interface id from _nodes by MAC
+ set_fact:
+ _desired_boot_iface_id: "{{ _node_mac_to_id.get(_boot_mac, 0) | int }}"
+
+- name: Fail if desired boot MAC not found in _nodes.interface_set
+ when: _desired_boot_iface_id | int == 0
+ fail:
+ msg: >-
+ Could not map {{ _boot_prefix }}_mac={{ _boot_mac }} to an interface id in _nodes.interface_set
+ for {{ inv_host }} (system_id={{ _node_system_id }}). Refresh _nodes / re-commission the node.
+
+# 4) Read current boot interface id from the same _nodes payload
+- name: Extract current boot interface id from _nodes
+ set_fact:
+ _current_boot_iface_id: >-
+ {{
+ (_node_obj.boot_interface.id | int)
+ if (_node_obj is mapping
+ and _node_obj.boot_interface is defined
+ and _node_obj.boot_interface is mapping
+ and _node_obj.boot_interface.id is defined)
+ else 0
+ }}
+
+# 5) Only POST if different
+- include_tasks: ../_auth_header.yml
+
+- name: "Set boot interface to id={{ _desired_boot_iface_id }} (if different)"
+ when:
+ - _desired_boot_iface_id | int > 0
+ - _current_boot_iface_id | int != _desired_boot_iface_id | int
+ uri:
+ url: "{{ _maas_api }}/machines/{{ _node_system_id }}/?op=set-boot-interface"
+ method: POST
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body_format: form-urlencoded
+ body:
+ id: "{{ _desired_boot_iface_id }}"
+ status_code: [200]
+ register: _set_boot_iface
+# no_log: true
+
+# Report whether the preceding POST changed the boot interface.
+- name: Note boot-interface outcome
+  debug:
+    msg: >-
+      Boot interface {{ ((_current_boot_iface_id | int) == (_desired_boot_iface_id | int))
+      | ternary('unchanged', 'updated') }}
+      previous={{ _current_boot_iface_id }}, desired={{ _desired_boot_iface_id }}
--- /dev/null
+# Expects: vlan_id
+# Produces/updates: _subnets_by_vlan (dict: { vlan_id: <subnet_list> })
+
+- include_tasks: _auth_header.yml
+
+- name: Query subnets for VLAN {{ vlan_id }}
+  uri:
+    # Use the shared API base (_maas_api already carries the /MAAS/api/2.0
+    # prefix) and the role-wide OAuth header built by _auth_header.yml. The
+    # original used a raw Bearer token and omitted the /MAAS path segment,
+    # inconsistent with every other call in this role.
+    url: "{{ _maas_api }}/vlans/{{ vlan_id }}/?op=subnets"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    status_code: 200
+  register: _subnets_resp
+
+- name: Accumulate subnets into map
+ set_fact:
+ _subnets_by_vlan: >-
+ {{
+ (_subnets_by_vlan | default({})) |
+ combine({ (vlan_id|string): (_subnets_resp.json | default([])) })
+ }}
--- /dev/null
+---
+# 1) Refresh MAAS auth header (new nonce)
+- include_tasks: ../_auth_header.yml
+
+# 2) GET vlans for this fabric
+- name: Read VLANs for fabric {{ fab.id }}
+ uri:
+ url: "{{ _maas_api }}/fabrics/{{ fab.id }}/vlans/"
+ method: GET
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ return_content: yes
+ status_code: 200
+ register: _vlans_this_fabric
+
+# 3) Merge into vid -> vlan-object map
+- name: Merge VLANs from fabric {{ fab.id }} into _vlan_by_vid
+ set_fact:
+ _vlan_by_vid: >-
+ {{
+ _vlan_by_vid
+ | combine(
+ dict(
+ (_vlans_this_fabric.json | map(attribute='vid') | list)
+ | zip(_vlans_this_fabric.json)
+ ),
+ recursive=True
+ )
+ }}
--- /dev/null
+---
+# _mark_broken.yml (uses system_status; no GET/cache lookup)
+
+# Normalize status from the passed var (already computed upstream)
+- name: Resolve current status from passed var
+ set_fact:
+ _maas_status_name: "{{ system_status | default('') | string }}"
+
+- block:
+ - name: Build mark_broken comment body
+ set_fact:
+ _mark_broken_body: "comment={{ ('Temp: editing NIC at ' ~ broken_at) | urlencode }}"
+
+ # Refresh header again right before POST (avoids timestamp drift)
+ - include_tasks: ../_auth_header.yml
+
+ - name: POST {{ inv_host }} ?op=mark_broken (with note)
+ when: _maas_status_name != 'Broken'
+ uri:
+ url: "{{ _maas_api }}/machines/{{ system_id }}/op-mark_broken"
+ method: POST
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body: "{{ _mark_broken_body }}"
+ body_format: form-urlencoded
+ status_code: [200, 403] # handle both; branch below
+ register: _mark_broken_resp
+ changed_when: "_maas_status_name != 'Broken' and _mark_broken_resp.status == 200"
+ failed_when: "_mark_broken_resp.status not in [200, 403]"
+
+    - name: Remember that we marked {{ inv_host }} Broken
+      when: _maas_status_name != 'Broken' and _mark_broken_resp.status == 200
+      set_fact:
+        _marked_broken: "{{ (hostvars['localhost']._marked_broken | default([])) + [ system_id ] }}"
+      delegate_to: localhost
+      # Without delegate_facts, set_fact writes to the *current* host's facts
+      # even when delegated, so hostvars['localhost']._marked_broken would
+      # never accumulate across hosts — the cleanup task reads it from there.
+      delegate_facts: true
+      changed_when: false
+
+    - name: Remember that we failed to mark {{ inv_host }} broken
+      when: _maas_status_name != 'Broken' and _mark_broken_resp.status == 403
+      set_fact:
+        _failed_to_mark_broken: "{{ (hostvars['localhost']._failed_to_mark_broken | default([])) + [ system_id ] }}"
+      delegate_to: localhost
+      # delegate_facts is required for the fact to land on localhost; plain
+      # delegate_to does not redirect set_fact, so the list read back via
+      # hostvars['localhost'] would otherwise stay empty.
+      delegate_facts: true
+      changed_when: false
+
+ # Skip if upstream says it's already Broken (and, if desired, skip Ready)
+ when: _maas_status_name not in ['Broken', 'Ready', 'New', 'Allocated']
--- /dev/null
+---
+# 1) Normalize everything to SHORT names (no regex needed for MAAS)
+- name: Normalize hostnames (ignore domains)
+ set_fact:
+ # Short names that exist in MAAS right now
+ _existing_names: >-
+ {{
+ (maas_by_hostname | default({}))
+ | dict2items
+ | map(attribute='value.hostname')
+ | reject('equalto', None)
+ | list
+ }}
+
+ # Short names from your inventory group
+ testnode_names: >-
+ {{
+ groups.get('testnodes', [])
+ | map('extract', hostvars, 'inventory_hostname_short')
+ | reject('equalto', None)
+ | list
+ }}
+
+ # Short names that must be excluded
+ maas_excluded_hosts: >-
+ {{
+ (
+ groups.get('maas_region_rack_server', []) +
+ groups.get('maas_db_server', []) +
+ groups.get('maas_dont_delete', [])
+ )
+ | map('extract', hostvars, 'inventory_hostname_short')
+ | reject('equalto', None)
+ | unique
+ | list
+ }}
+
+# 2) Plan using SHORT names only
+- name: Determine which hosts to create, update, and delete
+ set_fact:
+ _create_short: "{{ testnode_names | difference(_existing_names + maas_excluded_hosts) | list }}"
+ _delete_short: "{{ _existing_names | difference(testnode_names + maas_excluded_hosts) | list }}"
+ _update_short: "{{ (_existing_names | intersect(testnode_names)) | difference(maas_excluded_hosts) | list }}"
+
+# Plan: set IPMI creds for everything in create + update (short names)
+- name: Build combined IPMI plan list (create + update)
+ set_fact:
+ _plan_ipmi: >-
+ {{
+ ((_create_short | default([])) + (_update_short | default([])))
+ | unique
+ | list
+ }}
--- /dev/null
+---
+- include_tasks: _auth_header.yml
+
+# Queries MAAS and builds maas_nodes_list + _with_names
+- name: Read all machines from MAAS
+ uri:
+ url: "{{ _maas_api }}/machines/"
+ method: GET
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ return_content: yes
+ status_code: 200
+ register: _all_machines
+ no_log: true
+
+#- pause:
+
+- name: Parse MAAS machines JSON
+ set_fact:
+ maas_nodes_list: "{{ _all_machines.json | list }}"
+
+#- pause:
+
+- name: Keep only entries with hostname
+ set_fact:
+ _with_names: "{{ maas_nodes_list | selectattr('hostname', 'defined') | list }}"
--- /dev/null
+---
+# Fresh auth (new nonce/timestamp) for every API call
+#- name: Build OAuth header (fresh nonce/timestamp)
+# include_tasks: ../_auth_header.yml
+
+## 1) Fetch all interfaces for this node
+#- name: Read MAAS interfaces for this node
+# uri:
+# url: "{{ _maas_api }}/nodes/{{ _node_system_id }}/interfaces/"
+# method: GET
+# headers:
+# Authorization: "{{ maas_auth_header }}"
+# Accept: application/json
+# return_content: true
+# status_code: 200
+# register: _ifaces_resp
+
+## TODO(review): confirm whether this machine refresh is actually required
+## here, or whether data from an earlier read is still current.
+- include_tasks: machines/_read_machines.yml
+
+#- pause:
+
+- include_tasks: machines/_build_indexes.yml
+
+#- pause:
+
+- name: Set raw interface list
+ set_fact:
+# _ifaces: "{{ _ifaces_resp.json | default([]) }}"
+ _ifaces: "{{ maas_host_to_ifaces[host] }}"
+
+#- debug: var=_ifaces
+
+#- pause:
+
+# 2) Rebuild quick lookups
+- name: Build interface lookup maps (by name, by id, by mac)
+ set_fact:
+ _iface_by_name: >-
+ {{
+ dict(
+ (_ifaces | map(attribute='name') | list)
+ | zip(_ifaces | list)
+ )
+ }}
+ _iface_id_by_name: >-
+ {{
+ dict(
+ (_ifaces | map(attribute='name') | list)
+ | zip(_ifaces | map(attribute='id') | list)
+ )
+ }}
+ _iface_id_by_mac: >-
+ {{
+ dict(
+ (
+ _ifaces
+ | selectattr('mac_address','defined')
+ | map(attribute='mac_address')
+ | map('lower')
+ | list
+ )
+ | zip(
+ _ifaces
+ | selectattr('mac_address','defined')
+ | map(attribute='id')
+ | list
+ )
+ )
+ }}
+
+# 3) Index existing VLAN subinterfaces as (parent_id, vlan_id) pairs
+- name: Init existing VLAN pair index
+ set_fact:
+ _existing_vlan_pairs: []
+
+- name: Build existing VLAN pair index
+ set_fact:
+ _existing_vlan_pairs: >-
+ {{
+ _existing_vlan_pairs + [ {
+ 'parent_id': (_iface_id_by_name.get(item.parents[0]) | int),
+ 'vlan_id': item.vlan.id,
+ 'iface_id': item.id,
+ 'name': item.name
+ } ]
+ }}
+ loop: "{{ _ifaces | selectattr('type','equalto','vlan') | list }}"
+ when:
+ - item.parents is defined
+ - (item.parents | length) > 0
+ - item.vlan is defined
+ - item.vlan.id is defined
+ loop_control:
+ label: "{{ item.name | default(item.id) }}"
+
+# 4) Track current native VLAN per *parent* interface (physical/bond)
+- name: Init native VLAN map
+ set_fact:
+ _native_by_parent: {}
+
+- name: Build native VLAN map (parent_id -> vlan_id or None)
+ set_fact:
+ _native_by_parent: "{{ _native_by_parent | combine({ (iface_for_vlan_map.id | int): (iface_for_vlan_map.vlan.id if iface_for_vlan_map.vlan is mapping else None) }) }}"
+ loop: "{{ _ifaces | rejectattr('type','equalto','vlan') | list }}"
+ loop_control:
+ loop_var: iface_for_vlan_map
+ label: "{{ iface_for_vlan_map.name | default(iface_for_vlan_map.id) }}"
--- /dev/null
+---
+- include_tasks: ../_auth_header.yml
+
+- name: PUT vlan on parent {{ parent_id }} on {{ _inv_host_resolved }}
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/{{ parent_id }}/"
+ method: PUT
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/json
+ body_format: json
+ body:
+ vlan: "{{ _target_native_vlan_id | int }}"
+ status_code: [200, 201]
+ register: _put_parent_vlan
+ changed_when: _put_parent_vlan.status in [200, 201]
--- /dev/null
+# Ensure auth header for cleanup
+- include_tasks: ../_auth_header.yml
+
+# Normalize unique list (in case the same node was handled twice)
+- name: Normalize _marked_broken unique list
+ set_fact:
+ _marked_broken: "{{ _marked_broken | default([]) | unique }}"
+ run_once: true
+ delegate_to: localhost
+
+# Fetch current status for each before flipping (idempotent safeguard)
+- name: GET node details before un-breaking
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ sid }}/"
+ method: GET
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ status_code: 200
+ return_content: true
+ loop: "{{ _marked_broken | default([]) }}"
+ loop_control:
+ loop_var: sid
+ register: _cleanup_status
+
+- include_tasks: ../_auth_header.yml
+
+# Un-break only those still Broken
+- name: POST op=mark_fixed
+ uri:
+ url: "{{ _maas_api }}/machines/{{ sid }}/op-mark_fixed"
+ method: POST
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body: ""
+ body_format: form-urlencoded
+ status_code: 200
+ loop: >-
+ {{
+ (_cleanup_status.results | default([]))
+ | selectattr('json.status_name','defined')
+ | selectattr('json.status_name','equalto','Broken')
+ | map(attribute='sid') | list
+ }}
+ loop_control:
+ loop_var: sid
+ register: _mark_fixed_resp
+ changed_when: true
+
+# Optional: clear the list so a later run doesn’t try to un-break again
+- name: Clear shared _marked_broken list
+ set_fact:
+ _marked_broken: []
+ run_once: true
--- /dev/null
+---
+#- include_tasks: ../_resolve_host.yml
+
+- include_tasks: _auth_header.yml
+
+# Assemble the form-encoded body for the machine-create POST. The 'domain'
+# key is only included when the inventory defines desired_domain.
+- name: Build machine create body
+  set_fact:
+    maas_create_body: >-
+      {{
+        {
+          'hostname': host,
+          'deployed': true,
+          'architecture': desired_arch,
+          'mac_addresses': mac_addresses
+        }
+        | combine({'domain': desired_domain} if desired_domain is defined else {})
+      }}
+
+- name: machines create body for {{ host }} (system_id={{ system_id }})
+ debug:
+ var: maas_create_body
+
+- name: Create machine in MAAS
+ uri:
+ url: "{{ _maas_api }}/machines/"
+ method: POST
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Content-Type: application/x-www-form-urlencoded
+ Accept: application/json
+ body_format: form-urlencoded
+ body: "{{ maas_create_body }}"
+ status_code: 200
+ register: create_result
+ changed_when: create_result.status in [200, 201]
+ notify: "Rebuild MAAS machine indexes"
--- /dev/null
+---
+#- include_tasks: ../_resolve_host.yml
+
+- name: Would have deleted host {{ host }}
+ debug:
+ msg: "Would have deleted host {{ host }}"
--- /dev/null
+---
+# Derive short hostname and base group (strip trailing digits)
+- name: Prep IPMI secrets lookup context
+ set_fact:
+ _inv_short: "{{ hostvars[inv_host].inventory_hostname_short | default(inventory_hostname_short) }}"
+ _base_group: "{{ (hostvars[inv_host].inventory_hostname_short | default(inventory_hostname_short)) | regex_replace('\\d+$', '') }}"
+
+# Build candidates in priority order
+- name: Build IPMI secrets candidate list
+ set_fact:
+ _ipmi_files:
+ - "{{ secrets_path }}/host_vars/{{ _inv_short }}.yml"
+ - "{{ secrets_path }}/group_vars/{{ _base_group }}.yml"
+ - "{{ secrets_path }}/ipmi.yml"
+
+# Load first found file (host_vars short -> group_vars/<base>.yml -> ipmi.yml)
+- name: Load IPMI secrets (first found)
+ include_vars:
+ file: "{{ lookup('first_found', {'files': _ipmi_files, 'skip': True}) }}"
+ name: ipmi_secrets
+ # add this if secrets live on the controller:
+ # delegate_to: localhost
+
+## Ensure required keys exist
+#- name: Ensure IPMI user/pass are present from secrets
+# assert:
+# that:
+# - ipmi_secrets is defined
+# - ipmi_secrets.power_user is defined
+# - ipmi_secrets.power_pass is defined
+# fail_msg: >-
+# Missing IPMI secrets for {{ inv_host }}. Looked in: {{ _ipmi_files }}
+#
+## Build payload using inventory IPMI address + secrets user/pass
+#- name: Build power configuration payload
+# set_fact:
+# maas_power_payload:
+# power_type: "ipmi"
+# power_parameters_power_address: "{{ hostvars[inv_host].ipmi }}"
+# power_parameters_power_user: "{{ ipmi_secrets.power_user }}"
+# power_parameters_power_pass: "{{ ipmi_secrets.power_pass }}"
+# power_parameters_power_boot_type: "{{ maas_power_boot_type|default('auto') }}"
+
+# Ensure creds exist
+- name: Ensure IPMI user/pass are present from secrets
+ assert:
+ that:
+ - ipmi_secrets is defined
+ - ipmi_secrets.power_user is defined
+ - ipmi_secrets.power_pass is defined
+ fail_msg: >-
+ Missing IPMI secrets for {{ inv_host }}. Searched: {{ _ipmi_files }}
+
+# Build payload using inventory IPMI address + secrets user/pass
+- name: Build power configuration payload
+ set_fact:
+ maas_power_payload:
+ power_type: "ipmi"
+ power_parameters_power_address: "{{ hostvars[inv_host].ipmi }}"
+ power_parameters_power_user: "{{ ipmi_secrets.power_user }}"
+ power_parameters_power_pass: "{{ ipmi_secrets.power_pass }}"
+ power_parameters_power_boot_type: "{{ maas_power_boot_type|default('efi') }}"
+
+- include_tasks: ../_auth_header.yml
+
+- name: "Set IPMI Credentials on {{ _inv_short }}"
+ uri:
+ url: "{{ _maas_api }}/machines/{{ system_id }}/"
+ method: PUT
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ Content-Type: application/x-www-form-urlencoded
+ body: "{{ maas_power_payload }}"
+ body_format: form-urlencoded
+ status_code: 200
+ register: set_ipmi_creds_result
+ changed_when: set_ipmi_creds_result.status in [200, 201]
+# no_log: true
--- /dev/null
+---
+# roles/maas/tasks/machines/update.yml
+
+# 1) Fresh OAuth header (nonce/timestamp)
+- name: Build OAuth header
+ include_tasks: _auth_header.yml
+
+# 2) Record node system_id for downstream includes
+- name: Remember {{ inv_host }} = system id {{ system_id }}
+  set_fact:
+    _node_system_id: "{{ system_id }}"
+
+# 5) Initialize desired structures so later tasks never explode on undefined
+# Load desired bonds & interfaces from group_vars
+- name: Load desired bonds & interfaces from group_vars
+ set_fact:
+ _desired_bonds: "{{ hostvars[inv_host].maas_bonds | default([]) }}"
+ _desired_ifaces: "{{ hostvars[inv_host].maas_interfaces | default([]) }}"
+
+- include_tasks: machines/_refresh_iface_facts.yml
+
+- include_tasks: machines/_mark_broken.yml
+ when: system_status not in ['Broken', 'Ready', 'New', 'Allocated']
+
+- name: Apply interfaces (native_vid + tagged_vids)
+ include_tasks: machines/_apply_one_iface.yml
+ loop: "{{ _desired_ifaces }}"
+ loop_control:
+ loop_var: desired_iface
+ label: "{{ desired_iface.prefix | default('(no prefix)') }}"
+ vars:
+ iface_obj: "{{ desired_iface }}"
+
+# 9) Ensure bonds (each include runs per bond; no block-looping)
+- name: Ensure each bond
+ when: (_desired_bonds | default([])) | length > 0
+ include_tasks: machines/_ensure_bond.yml
+ loop: "{{ _desired_bonds | default([]) }}"
+ loop_control:
+ loop_var: bond
+ label: "{{ bond.name | default('unnamed-bond') }}"
+
+# Ensure we have fresh auth + base url
+- include_tasks: _auth_header.yml
+
+# Read all interfaces for this node
+- name: Read machine interfaces (for subnet assignment)
+ uri:
+ url: "{{ _maas_api }}/nodes/{{ system_id }}/interfaces/"
+ method: GET
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ return_content: yes
+ status_code: 200
+ register: _ifaces_read
+# no_log: true
+
+# INIT
+- name: Init iface list (id, name, vlan_id)
+ set_fact:
+ _iface_rows: []
+
+# Build iface list (id, name, vlan_id, type) — bonds, vlans, physical NICs
+- name: Build iface list (id, name, vlan_id, type, mac)
+ set_fact:
+ _iface_rows: "{{ _iface_rows + [ {
+ 'id': i.id,
+ 'name': i.name,
+ 'vlan_id': i.vlan.id,
+ 'type': i.type,
+ 'mac': (i.mac_address | default('') | lower)
+ } ] }}"
+ loop: >-
+ {{
+ _ifaces_read.json
+ | selectattr('vlan','defined')
+ | selectattr('vlan.id','defined')
+ | selectattr('type','defined')
+ | list
+ }}
+ loop_control:
+ loop_var: i
+
+- include_tasks: _auth_header.yml
+
+# Fetch ALL subnets once (we'll group them locally)
+- name: Read all subnets
+ uri:
+ url: "{{ _maas_api }}/subnets/"
+ method: GET
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ return_content: yes
+ status_code: 200
+ register: _all_subnets
+ no_log: true
+
+- name: Init subnets_by_vlan map
+ set_fact:
+ _subnets_by_vlan: {}
+
+# Build map: vlan_id -> [subnets...]
+- name: Group subnets by VLAN id
+ set_fact:
+ _subnets_by_vlan: >-
+ {{
+ _subnets_by_vlan | default({}) | combine({
+ (s.vlan.id|string):
+ (_subnets_by_vlan.get(s.vlan.id|string, []) + [s])
+ })
+ }}
+ loop: "{{ _all_subnets.json | default([]) }}"
+ loop_control:
+ loop_var: s
+ when: s.vlan is defined and s.vlan.id is defined
+
+- name: Collect known MACs from inventory (any var ending in _mac)
+ set_fact:
+ _inv_macs: >-
+ {{
+ hostvars[inventory_hostname]
+ | dict2items
+ | selectattr('key', 'search', '_mac$')
+ | map(attribute='value')
+ | map('lower')
+ | reject('equalto', '')
+ | list
+ }}
+
+#- name: Debug iface rows and inventory MACs
+# debug:
+# msg:
+# iface_rows: "{{ _iface_rows }}"
+# inv_macs: "{{ _inv_macs }}"
+
+#- name: Apply subnet assignment for each iface (bond/vlan/physical)
+# when: (_iface_rows | length) > 0
+# include_tasks: machines/_apply_subnet.yml
+# loop: "{{ _iface_rows }}"
+# loop_control:
+# loop_var: row
+# vars:
+# iface: "{{ row }}"
+# candidate_subnets: "{{ _subnets_by_vlan.get(row.vlan_id|string, []) }}"
+#- name: Apply subnet assignment for each iface (bond/vlan/physical) we manage
+# include_tasks: machines/_apply_subnet.yml
+# loop: "{{ _iface_rows }}"
+# loop_control:
+# loop_var: row
+# when:
+# - (_iface_rows | length) > 0
+# - row.name in (maas_iface_vlan_map | default({})).keys()
+# vars:
+# iface: "{{ row }}"
+# candidate_subnets: "{{ _subnets_by_vlan.get(row.vlan_id | string, []) }}"
+- name: Apply subnet assignment for each iface (bond/vlan/physical) we know from inventory
+ include_tasks: machines/_apply_subnet.yml
+ loop: "{{ _iface_rows | selectattr('mac', 'in', _inv_macs) | list }}"
+ loop_control:
+ loop_var: row
+ when:
+ - _iface_rows | length > 0
+ vars:
+ iface: "{{ row }}"
+ candidate_subnets: "{{ _subnets_by_vlan.get(row.vlan_id | string, []) }}"
+
+#- name: Build iface list (id, name, vlan_id, type) — only bond/vlan
+# set_fact:
+# _iface_rows: "{{ _iface_rows + [ {'id': i.id, 'name': i.name, 'vlan_id': i.vlan.id, 'type': i.type} ] }}"
+# loop: >-
+# {{
+# _ifaces_read.json
+# | selectattr('vlan','defined')
+# | selectattr('vlan.id','defined')
+# | selectattr('type','defined')
+# | selectattr('type','in',['bond','vlan'])
+# | list
+# }}
+# loop_control:
+# loop_var: i
+#
+#- include_tasks: _auth_header.yml
+#
+## Fetch ALL subnets once (we'll group them locally)
+#- name: Read all subnets
+# uri:
+# url: "{{ _maas_api }}/subnets/"
+# method: GET
+# headers:
+# Authorization: "{{ maas_auth_header }}"
+# Accept: application/json
+# return_content: yes
+# status_code: 200
+# register: _all_subnets
+# no_log: true
+#
+#- name: Init subnets_by_vlan map
+# set_fact:
+# _subnets_by_vlan: {}
+#
+## Build map: vlan_id -> [subnets...]
+#- name: Group subnets by VLAN id
+# set_fact:
+# _subnets_by_vlan: "{{ _subnets_by_vlan | default({}) | combine({ (s.vlan.id|string): ( _subnets_by_vlan.get(s.vlan.id|string, []) + [s] ) }) }}"
+# loop: "{{ _all_subnets.json | default([]) }}"
+# loop_control:
+# loop_var: s
+# when: s.vlan is defined and s.vlan.id is defined
+#
+#- debug: var=_iface_rows
+#- pause:
+#
+#- name: Apply subnet assignment for each iface (bond/vlan only)
+# when: (_iface_rows | length) > 0
+# include_tasks: machines/_apply_subnet.yml
+# loop: "{{ _iface_rows }}"
+# loop_control:
+# loop_var: row
+# vars:
+# iface: "{{ row }}"
+# candidate_subnets: "{{ _subnets_by_vlan.get(row.vlan_id|string, []) }}"
--- /dev/null
+---
+# Playbook to install and configure MAAS
+- name: Fail if not an Ubuntu system
+ fail:
+ msg: "This playbook only supports Ubuntu systems"
+ when: ansible_distribution != "Ubuntu"
+
+- name: Ensure system is up-to-date
+ apt:
+ update_cache: yes
+ upgrade: full
+
+# Install and configure the MAAS DB
+- import_tasks: install_maasdb.yml
+
+# Install MAAS
+- name: Install MAAS with Snap
+ snap:
+ name: maas
+ classic: yes
+ channel: "{{ maas_version }}/stable"
+ state: present
+ tags: install_maas
+ when: "maas_install_method == 'snap'"
+ register: maas_install_snap
+
+- name: Add MAAS apt repository
+ ansible.builtin.apt_repository:
+ repo: "ppa:maas/{{ maas_version }}"
+ tags: install_maas
+ when: "maas_install_method == 'apt'"
+
+- name: Install MAAS with Apt
+ ansible.builtin.apt:
+ name: maas
+ state: present
+ tags: install_maas
+ when: "maas_install_method == 'apt'"
+ register: maas_install_apt
+
+- name: Normalize install result
+ set_fact:
+ maas_install: "{{ maas_install_snap if maas_install_method == 'snap' else maas_install_apt }}"
+ changed_when: "(maas_install_method == 'apt' and maas_install_apt is defined and maas_install_apt.changed) or (maas_install_method == 'snap' and maas_install_snap is defined and maas_install_snap.changed)"
+ tags: install_maas
+
+# Initialize MAAS
+- import_tasks: initialize_region_rack.yml
+
+- import_tasks: initialize_secondary_rack.yml
+
+# Log in to the MAAS API so the maas CLI can be used
+- name: Get API key
+ command: maas apikey --username={{ maas_admin_username }}
+ when: inventory_hostname in groups['maas_region_rack_server']
+ tags:
+ - config_dhcp
+ - config_maas
+# - machines
+ - config_dns
+ - config_ntp
+ - add_users
+ register: maas_api_key
+
+- name: Log into MAAS API
+ command: "maas login {{ maas_admin_username }} http://{{ hostvars[groups['maas_region_rack_server'].0]['ip'] }}:5240/MAAS/api/2.0/ {{ maas_api_key.stdout }}"
+ when: inventory_hostname in groups['maas_region_rack_server']
+ tags:
+ - config_dhcp
+ - config_maas
+# - machines
+ - config_dns
+ - config_ntp
+ - add_users
+
+# Configure MAAS
+- import_tasks: config_maas.yml
+
+- import_tasks: api_auth_pretasks.yml
+ tags:
+ - always
+ - api
+
+# Configure Networks
+- import_tasks: networking.yml
+ tags:
+ - networking
+
+# Configure NTP Service
+- import_tasks: config_ntp.yml
+
+# Configure DNS Service
+- import_tasks: config_dns.yml
+
+# Configure DHCP Service
+- name: dhcp_configuration
+ include_tasks: config_dhcpd_subnet.yml
+ loop: "{{ dhcp_maas_subnets|dict2items }}"
+ loop_control:
+ loop_var: subnet
+ vars:
+ subnet_name: "{{ subnet.key }}"
+ subnet_data: "{{ subnet.value }}"
+ tags: config_dhcp
+
+# Add Machines into MAAS
+- import_tasks: machines.yml
+ tags: machines
+
+# Add Users into MAAS
+- import_tasks: add_users.yml
+
+# Logout from MAAS API
+- name: Logout from MAAS
+ command: "maas logout {{ maas_admin_username }}"
+ tags:
+ - config_dhcp
+ - config_maas
+# - machines
+ - config_dns
+ - config_ntp
+ - add_users
+ when: inventory_hostname in groups['maas_region_rack_server']
--- /dev/null
+---
+# Prereqs (set by your own auth tasks):
+# - maas_api_url: e.g. "http://10.64.1.25:5240"
+# - maas_auth_header: OAuth 1.0 PLAINTEXT header string
+# Inputs:
+# - maas_networking: your fabric/vlan/subnet structure
+# - maas_global_dns_servers: optional list of DNS servers
+# - maas_global_primary_rack_controller: optional Controller *hostname*
+# Rack Controller must be defined at the VLAN level if not defined globally.
+
+################################################################################
+# API base
+################################################################################
+- name: Set MAAS API base URL
+ set_fact:
+ _maas_api: "{{ maas_api_url | trim('/') }}/MAAS/api/2.0"
+
+################################################################################
+# Inventory Validation
+################################################################################
+
+# --- Check for DHCP-enabled VLANs that are missing dynamic ip_ranges ----------
+
+# Always init so the assert never sees an undefined var
+- name: Init list of DHCP violations
+ set_fact:
+ _dhcp_missing_dynamic: []
+
+- name: Build list of fabric/vlan pairs
+ set_fact:
+ _fabric_vlans: "{{ maas_networking | subelements('vlans', skip_missing=True) }}"
+
+# Flag any VLAN with dhcp_on=true but no dynamic ranges on any of its subnets
+- name: Find DHCP-enabled VLANs missing dynamic ranges
+ vars:
+ _vlan: "{{ item.1 }}"
+ _dyn_count: >-
+ {{
+ (_vlan.subnets | default([]))
+ | selectattr('ip_ranges','defined')
+ | map(attribute='ip_ranges')
+ | flatten
+ | selectattr('type','equalto','dynamic')
+ | list
+ | length
+ }}
+ when:
+ - _vlan.dhcp_on | default(false) | bool
+ - (_dyn_count | int) == 0
+ set_fact:
+ _dhcp_missing_dynamic: >-
+ {{
+ (_dhcp_missing_dynamic | default([]))
+ + [ { 'fabric': item.0.fabric, 'vid': _vlan.vid, 'name': _vlan.name | default('') } ]
+ }}
+ loop: "{{ _fabric_vlans }}"
+ loop_control:
+ label: "{{ item.0.fabric }}:{{ item.1.vid }}"
+
+- name: Fail if any DHCP-enabled VLAN lacks a dynamic range
+ assert:
+ that:
+ - (_dhcp_missing_dynamic | default([])) | length == 0
+ fail_msg: >-
+ DHCP is enabled but no dynamic range is defined on these VLANs:
+ {{ (_dhcp_missing_dynamic | default([])) | to_nice_json }}
+
+# --- Check for undefined primary rack controller per VLAN ---------------------
+
+# 1) Capture global if provided (and non-empty)
+- name: Capture global primary rack controller id (if set)
+ set_fact:
+ _global_primary_rack_controller: "{{ maas_global_primary_rack_controller | string }}"
+ when:
+ - maas_global_primary_rack_controller is defined
+ - (maas_global_primary_rack_controller | string) | length > 0
+
+# 2) If no global, ensure every VLAN declares primary_rack_controller
+- name: Build list of VLANs missing primary_rack_controller (when no global set)
+ set_fact:
+ _vlans_missing_prc: |
+ {% set missing = [] %}
+ {% for pair in (maas_networking | subelements('vlans', skip_missing=True)) %}
+ {% set fab = pair[0] %}
+ {% set v = pair[1] %}
+ {% if v.primary_rack_controller is not defined or (v.primary_rack_controller | string) | length == 0 %}
+ {% set _ = missing.append(fab.fabric ~ ":VID " ~ (v.vid | string)) %}
+ {% endif %}
+ {% endfor %}
+ {{ missing }}
+ when: _global_primary_rack_controller is not defined
+
+- name: Require maas_global_primary_rack_controller or per-VLAN primary_rack_controller
+  assert:
+    that:
+      # Fixed variable name: the original tested '_global_primary_rack',
+      # which is never defined anywhere (the fact set above is
+      # '_global_primary_rack_controller'), so that alternative could never
+      # hold and the assert relied on the missing-VLANs check by accident.
+      - (_global_primary_rack_controller is defined) or (_vlans_missing_prc | length == 0)
+    fail_msg: >-
+      Missing primary rack controller configuration.
+      Either set 'maas_global_primary_rack_controller' or add 'primary_rack_controller'
+      on each VLAN. Missing for:
+      {{ (_vlans_missing_prc | default([])) | join('\n') }}
+  when: _global_primary_rack_controller is not defined
+
+################################################################################
+# Domains
+################################################################################
+- name: Collect unique domains from maas_networking
+ set_fact:
+ _wanted_domains: >-
+ {{
+ maas_networking
+ | map(attribute='vlans') | flatten
+ | map(attribute='subnets') | flatten
+ | selectattr('domain','defined')
+ | map(attribute='domain')
+ | list | unique
+ }}
+
+- include_tasks: _auth_header.yml
+#- name: Read existing RCs
+# uri:
+# url: "{{ _maas_api }}/rackcontrollers/"
+# method: GET
+# headers: { Authorization: "{{ maas_auth_header }}" }
+# return_content: true
+# register: _domains_resp
+#
+#- pause:
+
+- name: Read existing domains
+ uri:
+ url: "{{ _maas_api }}/domains/"
+ method: GET
+ headers: { Authorization: "{{ maas_auth_header }}" }
+ return_content: true
+ register: _domains_resp
+
+- name: Index domains by name
+ set_fact:
+ _domains_by_name: "{{ (_domains_resp.json | default([])) | items2dict(key_name='name', value_name='id') }}"
+
+- name: Compute domains to create
+ set_fact:
+ _new_domains: "{{ _wanted_domains | difference((_domains_by_name.keys() | list)) }}"
+
+# _wanted_domains must be a real list (use the unique/flatten filter recipe)
+
+- name: Ensure desired domains exist
+ include_tasks: networking/domain_create.yml
+ loop: "{{ _new_domains }}"
+ loop_control:
+ loop_var: domain_name
+# #no_log: true
+
+################################################################################
+# Spaces
+################################################################################
+- name: Collect unique spaces from maas_networking
+ set_fact:
+ _wanted_spaces: >-
+ {{
+ maas_networking
+ | map(attribute='vlans') | flatten
+ | map(attribute='subnets') | flatten
+ | selectattr('space','defined')
+ | map(attribute='space')
+ | list | unique
+ }}
+
+- include_tasks: _auth_header.yml
+ #no_log: true
+
+- name: Read existing spaces
+ uri:
+ url: "{{ _maas_api }}/spaces/"
+ method: GET
+ headers:
+ Authorization: "{{ maas_auth_header }}"
+ Accept: application/json
+ return_content: true
+ use_netrc: false
+ register: _spaces_resp
+ #no_log: true
+
+- name: Index spaces by name
+ set_fact:
+ _spaces_by_name: "{{ (_spaces_resp.json | default([])) | items2dict(key_name='name', value_name='id') }}"
+
+- name: Compute spaces to create
+ set_fact:
+ _new_spaces: "{{ _wanted_spaces | difference((_spaces_by_name.keys() | list)) }}"
+
+- name: Ensure desired spaces exist
+ include_tasks: networking/space_create.yml
+ loop: "{{ _new_spaces }}"
+ loop_control:
+ loop_var: space_name
+ #no_log: true
+
+################################################################################
+# Fabrics
+################################################################################
+# Refresh the OAuth header before each API call (it is time-sensitive).
+- include_tasks: _auth_header.yml
+  #no_log: true
+
+- name: Read fabrics
+  uri:
+    url: "{{ _maas_api }}/fabrics/"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    use_netrc: false
+  register: _fabrics_resp
+  #no_log: true
+
+- name: Index fabrics by name
+  set_fact:
+    # {fabric_name: fabric_id} for quick lookups below.
+    _fabric_by_name: "{{ (_fabrics_resp.json | default([])) | items2dict(key_name='name', value_name='id') }}"
+
+- name: Collect desired fabric names from maas_networking
+  set_fact:
+    _wanted_fabrics: "{{ maas_networking | map(attribute='fabric') | list | unique }}"
+
+- name: Compute fabrics to create
+  set_fact:
+    _new_fabrics: "{{ _wanted_fabrics | difference((_fabric_by_name.keys() | list)) }}"
+
+- name: Ensure fabrics exist
+  include_tasks: networking/fabric_create.yml
+  loop: "{{ _new_fabrics }}"
+  loop_control:
+    loop_var: fabric_name
+  #no_log: true
+
+# Refresh fabrics after creates
+- include_tasks: _auth_header.yml
+  #no_log: true
+
+- name: Refresh fabrics
+  uri:
+    url: "{{ _maas_api }}/fabrics/"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    use_netrc: false
+  register: _fabrics_resp2
+  #no_log: true
+
+- name: Re-index fabrics
+  set_fact:
+    # Rebuild the map so newly created fabrics get their ids too.
+    _fabric_by_name: "{{ (_fabrics_resp2.json | default([])) | items2dict(key_name='name', value_name='id') }}"
+
+################################################################################
+# VLANs
+################################################################################
+# Every VLAN must have a DNS-safe name before we touch the API.
+- name: Validate VLAN names
+  loop: "{{ maas_networking | subelements('vlans', skip_missing=True) }}"
+  assert:
+    that:
+      # Check presence first so a missing name fails with the clear message
+      # below instead of an "undefined variable" error from the match test.
+      - item.1.name is defined
+      - item.1.name is match('^[a-z0-9-]+$')
+    # Message now matches the regex: digits are allowed too.
+    fail_msg: "Invalid VLAN name '{{ item.1.name | default('<missing>') }}' — only lowercase letters, digits, and dashes are allowed."
+
+# Read VLANs per fabric (looped helper so each GET has fresh auth)
+- name: init raw vlans holder
+  set_fact:
+    # { fabric_name: [vlan objects] }, filled by fabric_vlans_read.yml.
+    _vlans_raw_by_fabric: {}
+
+- name: Read VLANs for each fabric
+  include_tasks: networking/fabric_vlans_read.yml
+  loop: "{{ maas_networking }}"
+  loop_control:
+    loop_var: fab_obj
+  #no_log: true
+
+- name: Build VLAN index (first pass)
+  include_tasks: networking/vlan_build_index.yml
+
+# vars here are lazy Jinja: _exists is evaluated per loop item inside `when`.
+- name: Create VLANs that are missing
+  vars:
+    _fname: "{{ pair.0.fabric }}"
+    vlan: "{{ pair.1 }}"
+    _vrec: "{{ _vlan_index.get(_fname, {}) }}"
+    # handle both string and int vid keys so creation works regardless of index build
+    _exists: "{{ (_vrec.get(vlan.vid | string) is not none) or (_vrec.get(vlan.vid) is not none) }}"
+  include_tasks: networking/vlan_create.yml
+  loop: "{{ maas_networking | subelements('vlans', skip_missing=True) }}"
+  loop_control:
+    loop_var: pair
+    label: "{{ pair.0.fabric }}:{{ pair.1.vid }}"
+  when: not _exists
+
+# Refresh VLANs after creates (read again via helper) and rebuild index
+- name: Reset raw vlans holder
+  set_fact:
+    _vlans_raw_by_fabric: {}
+
+- name: Re-read VLANs for each fabric
+  include_tasks: networking/fabric_vlans_read.yml
+  loop: "{{ maas_networking }}"
+  loop_control:
+    loop_var: fab_obj
+
+- name: Build VLAN index (second pass)
+  include_tasks: networking/vlan_build_index.yml
+
+################################################################################
+# Subnets (create/update DNS + ranges) BEFORE enabling VLAN DHCP
+################################################################################
+# Build (fabric, vlan) pairs
+- name: Build list of fabric/vlan pairs
+  set_fact:
+    _fabric_vlans: "{{ maas_networking | subelements('vlans', skip_missing=True) }}"
+
+- name: Build list of (fabric, vlan, subnet) triples
+  set_fact:
+    # Flatten fabric -> vlan -> subnet into [fabric, vlan, subnet] triples so
+    # subnet_apply.yml can be included once per subnet.
+    _subnet_triples: |
+      {% set out = [] %}
+      {% for pair in _fabric_vlans %}
+      {% set fab = pair[0] %}
+      {% set vlan = pair[1] %}
+      {% for sn in vlan.subnets | default([]) %}
+      {% set _ = out.append([fab, vlan, sn]) %}
+      {% endfor %}
+      {% endfor %}
+      {{ out }}
+
+- name: Ensure subnets, DNS servers, and IP ranges
+  include_tasks: networking/subnet_apply.yml
+  vars:
+    trio: "{{ item }}"
+  loop: "{{ _subnet_triples }}"
+  loop_control:
+    label: "{{ item[0].fabric }} : VID {{ item[1].vid }} : {{ item[2].cidr }}"
+
+################################################################################
+# VLAN property updates (name/mtu/dhcp_on/space) AFTER ranges exist
+################################################################################
+
+## Resolve the VLAN id safely (handles string/int VID keys)
+#- name: Resolve VLAN id for update
+# vars:
+# _fname: "{{ pair.0.fabric }}"
+# vlan: "{{ pair.1 }}"
+# set_fact:
+# _vobj: >-
+# {{
+# _vlan_index[_fname].get(vlan.vid|string)
+# or _vlan_index[_fname].get(vlan.vid)
+# }}
+# _vlan_id: "{{ _vobj.id if (_vobj is defined and _vobj) else None }}"
+# loop: "{{ maas_networking | subelements('vlans', skip_missing=True) }}"
+# loop_control:
+# loop_var: pair
+# label: "{{ pair.0.fabric }}:{{ pair.1.vid }}"
+#
+#- name: Ensure VLAN exists in index before updating
+# assert:
+# that:
+# - _vlan_id is not none
+# fail_msg: >-
+# VLAN {{ pair.1.vid }} on fabric {{ pair.0.fabric }} not found in _vlan_index.
+# Known VIDs: {{ _vlan_index[pair.0.fabric] | dict2items | map(attribute='key') | list }}
+# loop: "{{ maas_networking | subelements('vlans', skip_missing=True) }}"
+# loop_control:
+# loop_var: pair
+# label: "{{ pair.0.fabric }}:{{ pair.1.vid }}"
+#
+## Build update body (name/mtu/space + dhcp_on only if we saw a dynamic range in inventory)
+#- name: Build VLAN update body
+# vars:
+# _fname: "{{ pair.0.fabric }}"
+# vlan: "{{ pair.1 }}"
+#
+# # unique space from subnets (if exactly one specified)
+# _spaces_list: >-
+# {{
+# (vlan.subnets | default([]))
+# | selectattr('space','defined')
+# | map(attribute='space') | list | unique
+# }}
+# _desired_space: "{{ _spaces_list[0] if (_spaces_list | length) == 1 else omit }}"
+#
+# # does inventory declare at least one dynamic range on any subnet of this VLAN?
+# _has_dynamic_for_vlan: >-
+# {{
+# (vlan.subnets | default([]))
+# | selectattr('ip_ranges','defined')
+# | map(attribute='ip_ranges') | flatten
+# | selectattr('type','equalto','dynamic')
+# | list | length > 0
+# }}
+# set_fact:
+# _body: >-
+# {{
+# {'name': vlan.name}
+# | combine( (vlan.mtu is defined) | ternary({'mtu': vlan.mtu}, {}), recursive=True )
+# | combine( (_desired_space is not none) | ternary({'space': _desired_space}, {}), recursive=True )
+# | combine(
+# (vlan.dhcp_on | default(false) | bool and _has_dynamic_for_vlan)
+# | ternary({'dhcp_on': true}, {}), recursive=True
+# )
+# }}
+# loop: "{{ maas_networking | subelements('vlans', skip_missing=True) }}"
+# loop_control:
+# loop_var: pair
+# label: "{{ pair.0.fabric }}:{{ pair.1.vid }}"
+
+## Do the actual VLAN update (expects _vlan_id and _body set by the two tasks above)
+#- name: Call VLAN Update tasks
+# include_tasks: networking/vlan_update.yml
+# loop: "{{ maas_networking | subelements('vlans', skip_missing=True) | map('join', ':') | list }}"
+# loop_control:
+# label: "{{ item }}"
+
+# Apply name/mtu/space/dhcp_on per VLAN now that subnets and ranges exist.
+- name: Call VLAN Update tasks
+  include_tasks: networking/vlan_update.yml
+  loop: "{{ maas_networking | subelements('vlans', skip_missing=True) }}"
+  loop_control:
+    loop_var: pair
+    label: "{{ pair.0.fabric }}:{{ pair.1.vid }}"
+  vars:
+    _fname: "{{ pair.0.fabric }}"
+    vlan: "{{ pair.1 }}"
--- /dev/null
+---
+# Expects: _maas_api, maas_api_key, domain_name
+# Builds a fresh OAuth header and creates the domain.
+# 409 is accepted so re-runs are idempotent when the domain already exists.
+
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+- name: Create DNS domain
+  uri:
+    url: "{{ _maas_api }}/domains/"
+    method: POST
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Content-Type: application/x-www-form-urlencoded
+      Accept: application/json
+    body_format: form-urlencoded
+    body:
+      name: "{{ domain_name }}"
+    status_code: [200, 201, 409]
+    use_netrc: false
+    return_content: false
+    validate_certs: true
+  # Hide the OAuth Authorization header from logs, matching the sibling
+  # fabric_create.yml / space_create.yml helpers.
+  no_log: true
--- /dev/null
+---
+# Expects: _maas_api, maas_api_key, fabric_name
+# Creates one MAAS fabric; 409 is accepted so re-runs are idempotent.
+
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+- uri:
+    url: "{{ _maas_api }}/fabrics/"
+    method: POST
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Content-Type: application/x-www-form-urlencoded
+      Accept: application/json
+    body_format: form-urlencoded
+    body:
+      name: "{{ fabric_name }}"
+    status_code: [200, 201, 409]
+    use_netrc: false
+  no_log: true
--- /dev/null
+---
+# Expects: _maas_api, maas_api_key, _fabric_by_name, fab_obj (with .fabric)
+# Reads this fabric's VLANs and merges them into _vlans_raw_by_fabric.
+
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+- uri:
+    url: "{{ _maas_api }}/fabrics/{{ _fabric_by_name[fab_obj.fabric] }}/vlans/"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    use_netrc: false
+  register: _vlans_resp
+  no_log: true
+
+- set_fact:
+    # Merge under the fabric name. recursive=True is a no-op for a list value
+    # but harmless here.
+    _vlans_raw_by_fabric: "{{ _vlans_raw_by_fabric | combine({ fab_obj.fabric: (_vlans_resp.json | default([])) }, recursive=True) }}"
+  no_log: true
--- /dev/null
+---
+# Expects: _maas_api, maas_api_key, space_name
+# Creates one MAAS space; 409 is accepted so re-runs are idempotent.
+
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+- uri:
+    url: "{{ _maas_api }}/spaces/"
+    method: POST
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Content-Type: application/x-www-form-urlencoded
+      Accept: application/json
+    body_format: form-urlencoded
+    body:
+      name: "{{ space_name }}"
+    status_code: [200, 201, 409]
+    use_netrc: false
+  no_log: true
--- /dev/null
+---
+# Expects: trio=[fabric_obj, vlan_obj, subnet_obj], _vlan_index, _maas_api, maas_auth_header
+# Ensures one subnet exists (create/update), then its DNS servers and IP ranges.
+
+# 0) Validate input triple
+- name: Verify triple input
+  assert:
+    that:
+      - trio is defined
+      - trio | length == 3
+    fail_msg: "subnet_apply.yml expects trio=[fabric, vlan, subnet], got: {{ trio | default('undefined') }}"
+
+# 1) Unpack triple
+- name: Extract fabric, vlan, and subnet
+  set_fact:
+    _fname: "{{ trio[0].fabric }}"
+    vlan: "{{ trio[1] }}"
+    subnet: "{{ trio[2] }}"
+
+# 2) Ensure VLAN exists in index & resolve its numeric id
+- name: Ensure VLAN is present in index
+  assert:
+    that:
+      - _vlan_index[_fname] is defined
+      - _vlan_index[_fname][vlan.vid | string] is defined
+    fail_msg: >-
+      VLAN {{ vlan.vid }} not found in index for fabric {{ _fname }}.
+      Known vids here: {{ (_vlan_index.get(_fname, {}) | dict2items | map(attribute='key') | list) }}
+
+- name: Resolve VLAN object from index
+  set_fact:
+    _vobj: "{{ _vlan_index[_fname][vlan.vid | string] }}"
+
+- name: Extract VLAN numeric id
+  set_fact:
+    # _vid is the VLAN's database id, used in the subnet create/update bodies.
+    _vid: "{{ _vobj.id }}"
+
+# 3) Read subnets (global) and normalize to a list
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+- name: Read subnets (global list)
+  uri:
+    url: "{{ _maas_api }}/subnets/"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    use_netrc: false
+  register: _subnets_resp
+  no_log: true
+
+- name: Normalize subnets list
+  set_fact:
+    # Some responses wrap the list in {'subnets': [...]}; accept both shapes.
+    _subnets_list: >-
+      {{
+        (_subnets_resp.json.subnets
+         if (_subnets_resp.json is mapping and 'subnets' in _subnets_resp.json)
+         else (_subnets_resp.json | default([])))
+      }}
+
+# Find the existing subnet id by CIDR (none if missing)
+- name: Extract existing subnet id by CIDR
+  set_fact:
+    _existing_subnet_id: >-
+      {{
+        (_subnets_list
+         | selectattr('cidr','equalto', subnet.cidr)
+         | map(attribute='id') | list | first)
+        | default(none)
+      }}
+
+- name: Decide if subnet already exists
+  set_fact:
+    # The length check matters: a templated `none` renders as an empty string.
+    _subnet_exists: "{{ _existing_subnet_id is not none and (_existing_subnet_id|string)|length > 0 }}"
+
+# Working subnet id variable (may be set later by create)
+- set_fact:
+    _subnet_id: "{{ _existing_subnet_id }}"
+
+# figure out the parent VLAN (we’re looping subelements('subnets'), so pair.0 is the VLAN)
+# NOTE(review): `pair` is never set in this file (the caller loops with trio),
+# so only the else-branch ever runs; _vlan_id also appears unused below —
+# candidate for removal once confirmed against all callers.
+- name: Resolve VLAN id for this subnet
+  set_fact:
+    _vlan_id: >-
+      {{
+        (
+          _vlan_index[pair.0.fabric][(pair.0.vid | string)].id
+          if (pair is defined and pair.0 is defined and pair.0.vid is defined)
+          else _vlan_index[_fname][(vlan.vid | string)].id
+        ) | string
+      }}
+
+#- name: Locate existing subnet by CIDR
+# set_fact:
+# _existing_subnet: "{{ (_subnets_resp.json | default([])) | selectattr('cidr','equalto', subnet.cidr) | list | first | default(none) }}"
+
+# 4) CREATE if missing
+- name: Build subnet create body
+  set_fact:
+    # cidr + vlan id always; gateway/managed only when the inventory sets them.
+    _subnet_create_body: >-
+      {{
+        {'cidr': subnet.cidr, 'vlan': _vid}
+        | combine( (subnet.gateway is defined) | ternary({'gateway_ip': subnet.gateway}, {}), recursive=True )
+        | combine( (subnet.managed is defined) | ternary({'managed': subnet.managed|bool}, {}), recursive=True )
+      }}
+
+- include_tasks: ../_auth_header.yml
+  when: not _subnet_exists
+  no_log: true
+
+- name: Create subnet (if missing)
+  uri:
+    url: "{{ _maas_api }}/subnets/"
+    method: POST
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Content-Type: application/x-www-form-urlencoded
+      Accept: application/json
+    body_format: form-urlencoded
+    body: "{{ _subnet_create_body }}"
+    status_code: [200, 201, 409]
+    return_content: true
+    use_netrc: false
+  register: _subnet_create_resp
+  when: not _subnet_exists
+  no_log: true
+
+- name: Set final _subnet_id
+  set_fact:
+    # Prefer the pre-existing id; otherwise take the id from the create
+    # response (a 409 carries no usable body, hence the defensive checks).
+    _subnet_id: >-
+      {{
+        (
+          _existing_subnet_id
+          if _subnet_exists
+          else (
+            _subnet_create_resp.json.id
+            if (_subnet_create_resp is defined and _subnet_create_resp.json is defined and _subnet_create_resp.json.id is defined)
+            else none
+          )
+        )
+      }}
+
+- name: Ensure _subnet_id is set (fallback lookup)
+  set_fact:
+    _subnet_id: >-
+      {{
+        _subnet_id
+        if (_subnet_id is not none and (_subnet_id|string)|length > 0)
+        else (
+          (_subnets_list
+           | selectattr('cidr','equalto', subnet.cidr)
+           | map(attribute='id') | list | first) | default(none)
+        )
+      }}
+
+- include_tasks: ../_auth_header.yml
+  when: _subnet_id is none or (_subnet_id|string)|length == 0
+  no_log: true
+
+- name: Re-read subnets (only if _subnet_id still missing)
+  uri:
+    url: "{{ _maas_api }}/subnets/"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    use_netrc: false
+  register: _subnets_resp_refetch
+  when: _subnet_id is none or (_subnet_id|string)|length == 0
+  no_log: true
+
+- name: Normalize subnets list (refetch)
+  set_fact:
+    _subnets_list: >-
+      {{
+        (_subnets_resp_refetch.json.subnets
+         if (_subnets_resp_refetch is defined and _subnets_resp_refetch.json is mapping and 'subnets' in _subnets_resp_refetch.json)
+         else (_subnets_resp_refetch.json | default([])))
+      }}
+  when: _subnets_resp_refetch is defined
+
+- name: Final fallback - derive _subnet_id from refetch
+  set_fact:
+    _subnet_id: >-
+      {{
+        _subnet_id
+        if (_subnet_id is not none and (_subnet_id|string)|length > 0)
+        else (
+          (_subnets_list
+           | selectattr('cidr','equalto', subnet.cidr)
+           | map(attribute='id') | list | first) | default(none)
+        )
+      }}
+
+# 5) UPDATE if present
+- name: Build subnet update body
+  set_fact:
+    # Same shape as the create body: cidr + vlan always, gateway/managed only
+    # when the inventory supplies them.
+    _subnet_update_body: >-
+      {{
+        {'cidr': subnet.cidr, 'vlan': _vid}
+        | combine( (subnet.gateway is defined) | ternary({'gateway_ip': subnet.gateway}, {}), recursive=True )
+        | combine( (subnet.managed is defined) | ternary({'managed': subnet.managed|bool}, {}), recursive=True )
+      }}
+
+# Refresh auth only when the PUT below will actually run; the guard now
+# matches the update task's condition (previously it omitted the length
+# check, so a blank _subnet_id still triggered a pointless auth refresh).
+- include_tasks: ../_auth_header.yml
+  when: _subnet_id is not none and (_subnet_id|string)|length > 0
+  no_log: true
+
+- name: Update subnet (if exists)
+  uri:
+    url: "{{ _maas_api }}/subnets/{{ _subnet_id }}/"
+    method: PUT
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Content-Type: application/x-www-form-urlencoded
+      Accept: application/json
+    body_format: form-urlencoded
+    body: "{{ _subnet_update_body }}"
+    status_code: [200]
+    return_content: true
+    use_netrc: false
+  when: _subnet_id is not none and (_subnet_id|string)|length > 0
+
+# 7) DNS servers
+# DNS servers: prefer subnet.dns_servers[], else maas_global_dns_servers
+- name: Choose DNS servers for this subnet
+  set_fact:
+    _dns_list: "{{ subnet.dns_servers | default(maas_global_dns_servers | default([])) | list }}"
+
+- include_tasks: ../_auth_header.yml
+  when: _dns_list | length > 0 and _subnet_id is not none and (_subnet_id|string)|length > 0
+  no_log: true
+
+- name: Set DNS servers on subnet
+  uri:
+    url: "{{ _maas_api }}/subnets/{{ _subnet_id }}/"
+    method: PUT
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Content-Type: application/x-www-form-urlencoded
+      Accept: application/json
+    body_format: form-urlencoded
+    # MAAS takes the server list as one space-separated string.
+    body: "{{ {'dns_servers': _dns_list | join(' ')} }}"
+    status_code: [200]
+    use_netrc: false
+  when: _dns_list | length > 0 and _subnet_id is not none and (_subnet_id|string)|length > 0
+
+# 8) IP ranges
+# IP ranges (read from top-level /ipranges/, not /subnets/{id}/ipranges/)
+- include_tasks: ../_auth_header.yml
+  when:
+    - _subnet_id is not none
+    - subnet.ip_ranges is defined
+  no_log: true
+
+- name: Read all ipranges (we'll filter by subnet)
+  uri:
+    url: "{{ _maas_api }}/ipranges/"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    use_netrc: false
+    status_code: [200]
+  register: _all_ranges_resp
+  when:
+    - _subnet_id is not none
+    - subnet.ip_ranges is defined
+
+# Normalize payload so later tasks don’t depend on .json vs .content
+- name: Normalize ipranges payload to a list
+  set_fact:
+    _ipranges_list: >-
+      {{
+        _all_ranges_resp.json
+        if (_all_ranges_resp is defined and _all_ranges_resp.json is defined and _all_ranges_resp.json != '')
+        else (_all_ranges_resp.content | from_json)
+      }}
+  when:
+    - _subnet_id is not none
+    - subnet.ip_ranges is defined
+    - _all_ranges_resp is defined
+
+- name: Show _subnet_id and ipranges count
+  debug:
+    msg:
+      - "_subnet_id (int) = {{ _subnet_id | int }}"
+      - "ipranges total = {{ (_ipranges_list | default([])) | length }}"
+  when:
+    - _subnet_id is not none
+    - subnet.ip_ranges is defined
+    - _ipranges_list is defined
+
+# Flatten each range to simple keys; `subnet` may be a nested object or a
+# bare id, so compute an int subnet id either way.
+- name: Build normalized ipranges list
+  set_fact:
+    _ipranges_normalized: |
+      {% set out = [] %}
+      {% for r in (_ipranges_list | default([])) %}
+      {% set sid = ((r.subnet.id if (r.subnet is mapping and 'id' in r.subnet) else r.subnet) | int) %}
+      {% set _ = out.append({
+           'id': r.id,
+           'type': r.type,
+           'start_ip': r.start_ip,
+           'end_ip': r.end_ip,
+           'computed_subnet_id': sid
+         }) %}
+      {% endfor %}
+      {{ out }}
+  when:
+    - _subnet_id is not none
+    - subnet.ip_ranges is defined
+    - _ipranges_list is defined
+
+- name: Filter normalized ipranges to this subnet (robust int compare)
+  set_fact:
+    _subnet_ranges_existing: |
+      {% set sid = _subnet_id | int %}
+      {% set out = [] %}
+      {% for r in (_ipranges_normalized | default([])) %}
+      {% if (r.computed_subnet_id | int) == sid %}
+      {% set _ = out.append(r) %}
+      {% endif %}
+      {% endfor %}
+      {{ out }}
+  when:
+    - _subnet_id is not none
+    - subnet.ip_ranges is defined
+    - _ipranges_normalized is defined
+
+# Create any requested ranges that don't already exist verbatim on the subnet.
+# subnet_range_create.yml re-checks existence/overlaps against the server, so
+# the _exists pre-filter here is just an optimisation to skip obvious no-ops.
+- name: Create missing ranges
+  vars:
+    _exists: >-
+      {{
+        (_subnet_ranges_existing | default([]))
+        | selectattr('type','equalto', ipr.type | default('reserved'))
+        | selectattr('start_ip','equalto', ipr.start_ip)
+        | selectattr('end_ip','equalto', ipr.end_ip)
+        | list | length > 0
+      }}
+  include_tasks: subnet_range_create.yml
+  loop: "{{ subnet.ip_ranges | default([]) }}"
+  loop_control:
+    loop_var: ipr
+    # `type` is optional in inventory (defaulted to 'reserved' everywhere
+    # else); default it here too so the loop label can't fail on a missing key.
+    label: "{{ ipr.type | default('reserved') }} {{ ipr.start_ip }}-{{ ipr.end_ip }}"
+  when:
+    - _subnet_id is not none
+    - subnet.ip_ranges is defined
+    - not _exists
--- /dev/null
+---
+# Expects: _subnet_id, ipr (range spec with type/start_ip/end_ip), maas_auth_header, _subnet_ranges_existing
+# Optional: maas_overwrite_ipranges (default: false)
+# Creates one IP range; fails on overlaps (or deletes them with overwrite on).
+
+- name: Default overwrite flag
+  set_fact:
+    maas_overwrite_ipranges: "{{ maas_overwrite_ipranges | default(false) | bool }}"
+
+# Helper facts
+- set_fact:
+    _ipr_type: "{{ ipr.type | default('reserved') }}"
+    _ipr_start: "{{ ipr.start_ip }}"
+    _ipr_end: "{{ ipr.end_ip }}"
+    _overlaps: []
+
+# --- exact match detection (boolean, no None pitfalls) ---
+- name: Compute exact-match flag for this subnet/type/span
+  vars:
+    # Compare as strings on both sides to avoid int/str mismatches.
+    _want_type: "{{ _ipr_type | string }}"
+    _want_start: "{{ _ipr_start | string }}"
+    _want_end: "{{ _ipr_end | string }}"
+  set_fact:
+    _exact_exists: >-
+      {{
+        (
+          (_subnet_ranges_existing | default([]))
+          | selectattr('type', 'equalto', _want_type)
+          | selectattr('start_ip', 'equalto', _want_start)
+          | selectattr('end_ip', 'equalto', _want_end)
+          | list | length
+        ) > 0
+      }}
+
+# (optional) tiny debug so you can see it flip true/false
+- name: Tiny debug so you can see it flip true/false
+  debug:
+    msg:
+      - "subnet_id: {{ _subnet_id }}"
+      - "existing ranges on this subnet: {{ _subnet_ranges_existing | length }}"
+      - "looking for: {{ _ipr_type }} {{ _ipr_start }}-{{ _ipr_end }}"
+      - "exact_exists={{ _exact_exists }}"
+    verbosity: 0
+
+# --- overlap detection stays as you had it ---
+
+# Skip only when an exact already exists
+- name: Skip create when exact range already exists
+  debug:
+    msg: "IP range already present ({{ _ipr_type }} {{ _ipr_start }}-{{ _ipr_end }}); skipping."
+  when: _exact_exists
+
+# Always define _overlaps, even if earlier overlap-compute tasks were skipped
+- name: Ensure _overlaps is defined
+  set_fact:
+    _overlaps: "{{ _overlaps | default([]) }}"
+
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+- name: Read all ipranges (server truth)
+  uri:
+    url: "{{ _maas_api }}/ipranges/"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    status_code: [200]
+  register: _ipr_read
+  no_log: true
+
+- name: Filter ipranges down to this subnet id
+  set_fact:
+    # _subnet_id comes from set_fact and is therefore a *string*, while the
+    # API returns subnet.id as an integer; compare as int or the equalto test
+    # never matches and overlap detection silently sees zero ranges.
+    _subnet_ranges_existing: >-
+      {{ (_ipr_read.json | default([]))
+         | selectattr('subnet.id','equalto', _subnet_id | int)
+         | list }}
+
+# Build tuple/list forms of the new range once
+- name: Compute tuple forms of new start/end
+  set_fact:
+    # Dotted-quad -> [int, int, int, int]; the list comparison below is then
+    # lexicographic per octet. IPv4-only — IPv6 would need the ipaddr filter.
+    _new_start_t: "{{ _ipr_start | split('.') | map('int') | list }}"
+    _new_end_t: "{{ _ipr_end | split('.') | map('int') | list }}"
+
+# existing.start <= new.end AND existing.end >= new.start (inclusive)
+- name: Accumulate overlaps for this subnet/type/span (inclusive, no ipaddr)
+  set_fact:
+    _overlaps: "{{ _overlaps + [r] }}"
+  loop: "{{ _subnet_ranges_existing | default([]) }}"
+  loop_control:
+    loop_var: r
+  when:
+    - (r.start_ip | split('.') | map('int') | list) <= _new_end_t
+    - (r.end_ip | split('.') | map('int') | list) >= _new_start_t
+
+- name: Debug overlaps (if any)
+  debug:
+    msg:
+      - "Overlaps (ids): {{ _overlaps | map(attribute='id') | list }}"
+      - "Overlaps (types): {{ _overlaps | map(attribute='type') | list }}"
+      - "Overlaps (spans): {{ _overlaps | map(attribute='start_ip') | list }} — {{ _overlaps | map(attribute='end_ip') | list }}"
+  when: _overlaps | length > 0
+
+# Fail on overlapping ranges (unless overwrite enabled)
+- name: Fail on overlapping ranges (unless overwrite enabled)
+  fail:
+    msg: >-
+      Requested {{ _ipr_type }} range {{ _ipr_start }}-{{ _ipr_end }}
+      overlaps existing ranges:
+      {{ (_overlaps | default([])) | map(attribute='start_ip') | list }} - {{ (_overlaps | default([])) | map(attribute='end_ip') | list }}.
+      Re-run with maas_overwrite_ipranges=true to replace them.
+  when:
+    - not _exact_exists
+    - (_overlaps | default([])) | length > 0
+    - not maas_overwrite_ipranges
+
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+- name: Read this subnet to check for managed=true and dynamic range mismatch
+  uri:
+    url: "{{ _maas_api }}/subnets/{{ _subnet_id }}/"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    status_code: [200]
+  register: _subnet_read
+  no_log: true
+
+- set_fact:
+    _server_subnet_managed: "{{ (_subnet_read.json.managed | default(false)) | bool }}"
+
+- name: Fail if subnet is unmanaged but a dynamic range is requested
+  fail:
+    msg: >-
+      Refusing to create a dynamic range on unmanaged subnet id={{ _subnet_id }}
+      ({{ _subnet_read.json.cidr }}). Set 'managed: true' on the subnet or use a
+      reserved range instead. Requested: {{ _ipr_type }} {{ _ipr_start }}-{{ _ipr_end }}.
+  when:
+    - _ipr_type == 'dynamic'
+    - not _server_subnet_managed
+
+# Delete overlapping ipranges before create
+- include_tasks: ../_auth_header.yml
+  when:
+    - not _exact_exists
+    - (_overlaps | default([])) | length > 0
+    - maas_overwrite_ipranges
+  no_log: true
+
+# before delete loop
+- set_fact:
+    _overlap_ids: "{{ _overlaps | map(attribute='id') | list | unique | list }}"
+
+- name: Delete overlapping ipranges before create
+  uri:
+    url: "{{ _maas_api }}/ipranges/{{ ov_id }}/"
+    method: DELETE
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    status_code: [200, 204, 404]
+    return_content: false
+  loop: "{{ _overlap_ids }}"
+  loop_control:
+    loop_var: ov_id
+    label: "delete id={{ ov_id }}"
+  # Best-effort: a range already gone (404/race) must not abort the run.
+  failed_when: false
+  when:
+    - (_overlaps | length) > 0
+    - maas_overwrite_ipranges
+    - not _exact_exists
+  no_log: true
+
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+- name: Read all ipranges again (post-delete)
+  uri:
+    url: "{{ _maas_api }}/ipranges/"
+    method: GET
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Accept: application/json
+    return_content: true
+    status_code: [200]
+  register: _ipr_read_after
+  no_log: true
+
+- name: Filter ipranges down to this subnet id (post-delete)
+  set_fact:
+    # Cast _subnet_id (a set_fact string) to int: the API returns subnet.id
+    # as an integer, so a string equalto comparison would never match.
+    _subnet_ranges_existing: >-
+      {{ (_ipr_read_after.json | default([]))
+         | selectattr('subnet.id','equalto', _subnet_id | int)
+         | list }}
+
+- include_tasks: ../_auth_header.yml
+  when:
+    - not _exact_exists
+    - ((_overlaps | default([])) | length == 0) or maas_overwrite_ipranges
+  no_log: true
+
+# 409 accepted: the range may already exist from a previous/concurrent run.
+- name: Create iprange
+  uri:
+    url: "{{ _maas_api }}/ipranges/"
+    method: POST
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Content-Type: application/x-www-form-urlencoded
+      Accept: application/json
+    body_format: form-urlencoded
+    body:
+      subnet: "{{ _subnet_id | string }}"
+      type: "{{ _ipr_type | default('reserved') }}"
+      start_ip: "{{ _ipr_start }}"
+      end_ip: "{{ _ipr_end }}"
+    status_code: [200, 201, 409]
+    return_content: true
+    use_netrc: false
+  register: _range_create_resp
+  when:
+    - not _exact_exists
+    - ((_overlaps | default([])) | length == 0) or maas_overwrite_ipranges
--- /dev/null
+---
+# Build `_vlan_index` as: { "<fabric_name>": { "<vid as string>": <vlan_obj> } }
+
+# Start clean
+- set_fact:
+    _vlan_index: {}
+
+# Expect `_vlans_raw_by_fabric` to be a dict like:
+# { "tucson-qe": [ {vid: 1300, id: 5011, ...}, ... ], ... }
+# Vid keys are stringified so lookups don't depend on int-vs-str vid types.
+- name: Normalize VLAN index with string vid keys
+  set_fact:
+    _vlan_index: |
+      {% set out = {} %}
+      {% for it in (_vlans_raw_by_fabric | default({}) | dict2items) %}
+      {% set fname = it.key %}
+      {% set vlist = it.value | default([]) %}
+      {% set _ = out.update({ fname: {} }) %}
+      {% for v in vlist %}
+      {% set _ = out[fname].update({ (v.vid | string): v }) %}
+      {% endfor %}
+      {% endfor %}
+      {{ out }}
--- /dev/null
+---
+# Expects: _maas_api, maas_api_key, pair, _fabric_by_name
+# pair.0 = fabric obj; pair.1 = vlan obj
+# Creates one VLAN on its fabric; optional attributes sent only when supplied.
+
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+- set_fact:
+    _fid: "{{ _fabric_by_name[pair.0.fabric] }}"
+    _vlan_create_body: >-
+      {{
+        {'vid': pair.1.vid}
+        | combine( (pair.1.name is defined) | ternary({'name': pair.1.name}, {}), recursive=True )
+        | combine( (pair.1.description is defined) | ternary({'description': pair.1.description}, {}), recursive=True )
+        | combine( (pair.1.mtu is defined) | ternary({'mtu': pair.1.mtu}, {}), recursive=True )
+        | combine( (pair.1.space is defined) | ternary({'space': pair.1.space}, {}), recursive=True )
+      }}
+
+# NOTE: dhcp_on is not created here; we set it in the separate "vlan_update" task because
+# ipranges must be created first.
+- uri:
+    url: "{{ _maas_api }}/fabrics/{{ _fid }}/vlans/"
+    method: POST
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Content-Type: application/x-www-form-urlencoded
+      Accept: application/json
+    body_format: form-urlencoded
+    body: "{{ _vlan_create_body }}"
+    status_code: [200, 201, 409]
+    return_content: true
+    use_netrc: false
--- /dev/null
+---
+# Expects: _fname, _fabric_by_name, vlan, _vlan_index, _body, _maas_api, maas_auth_header
+# (_fname and vlan are often passed from the caller; we normalize if pair=* is used)
+
+- name: Normalize inputs
+  set_fact:
+    _fname: "{{ _fname | default(pair.0.fabric) }}"
+    # Resolve the fabric id from whichever form the caller used; previously
+    # this read pair.0.fabric unconditionally, which broke callers that pass
+    # _fname/vlan without a `pair` loop variable.
+    _fid: "{{ _fabric_by_name[_fname | default(pair.0.fabric)] }}"
+    vlan: "{{ vlan | default(pair.1) }}"
+
+- name: Ensure VLAN present in index
+  assert:
+    that:
+      - _vlan_index[_fname] is defined
+      - _vlan_index[_fname][vlan.vid | string] is defined
+    fail_msg: >-
+      VLAN {{ vlan.vid }} not found in index for fabric {{ _fname }}.
+      Known vids: {{ _vlan_index.get(_fname, {}) | dict2items | map(attribute='key') | list }}
+
+# Reset per-VLAN working vars so values from a previous loop iteration can't
+# leak into this one.
+# NOTE(review): "{{ none }}" renders to an empty string under default (non-
+# native) Jinja templating, so downstream `is none` tests may not behave as
+# intended — confirm with the Ansible version in use.
+- name: Clear any stale per-VLAN variables
+  set_fact:
+    _vlan_id: "{{ none }}"
+    _vobj: "{{ none }}"
+    _prc_candidate: ""
+    _primary_rack_controller: "{{ none }}"
+
+- name: Resolve VLAN object
+  set_fact:
+    _vobj: "{{ _vlan_index[_fname][vlan.vid | string] }}"
+
+- name: And ID
+  set_fact:
+    _vlan_id: "{{ _vobj.id | string }}"
+
+# Set the Primary Rack Controller to the VLAN-level defined one. Otherwise empty string.
+- name: Compute per-VLAN primary rack controller candidate
+  set_fact:
+    _prc_candidate: "{{ (vlan | default({})).get('primary_rack_controller') | default('', true) | string | trim }}"
+
+# Use the VLAN-level PRC discovered above, else the global one, else ''.
+# NOTE: `omit` is only honoured as a direct module argument; embedded inside
+# an expression it renders as a literal placeholder string, which made the
+# old `is defined` guard below always true and could send garbage as
+# primary_rack. Empty string + length checks replace that pattern.
+- name: Decide primary rack controller for this VLAN
+  set_fact:
+    _primary_rack_controller: "{{ _prc_candidate if (_prc_candidate | length) > 0 else (_global_primary_rack_controller | default('') | string | trim) }}"
+
+- name: Build VLAN update body
+  vars:
+    # Unique space across this VLAN's subnets; applied only when exactly one.
+    _spaces_list: >-
+      {{
+        (vlan.subnets | default([]))
+        | selectattr('space','defined')
+        | map(attribute='space') | list | unique
+      }}
+
+    # does inventory declare at least one dynamic range on any subnet of this VLAN?
+    _has_dynamic_for_vlan: >-
+      {{
+        (vlan.subnets | default([]))
+        | selectattr('ip_ranges','defined')
+        | map(attribute='ip_ranges') | flatten
+        | selectattr('type','equalto','dynamic')
+        | list | length > 0
+      }}
+  set_fact:
+    _vlan_update_body: >-
+      {{
+        {'name': vlan.name}
+        | combine( (vlan.mtu is defined) | ternary({'mtu': vlan.mtu}, {}), recursive=True )
+        | combine( ({'space': (_spaces_list | first)} if ((_spaces_list | length) == 1) else {}), recursive=True )
+        | combine(
+            (vlan.dhcp_on | default(false) | bool and (_primary_rack_controller | length) > 0)
+            | ternary({'primary_rack': _primary_rack_controller}, {}), recursive=True
+          )
+        | combine(
+            (vlan.dhcp_on | default(false) | bool and _has_dynamic_for_vlan)
+            | ternary({'dhcp_on': true}, {}), recursive=True
+          )
+      }}
+
+- include_tasks: ../_auth_header.yml
+  no_log: true
+
+# The URL addresses the VLAN by its *vid* under the fabric (matches the MAAS
+# fabrics/{fid}/vlans/{vid} path used elsewhere in this role).
+- name: Update VLAN properties
+  uri:
+    url: "{{ _maas_api }}/fabrics/{{ _fid }}/vlans/{{ vlan.vid }}/"
+    method: PUT
+    headers:
+      Authorization: "{{ maas_auth_header }}"
+      Content-Type: application/x-www-form-urlencoded
+      Accept: application/json
+    body_format: form-urlencoded
+    body: "{{ _vlan_update_body }}"
+    status_code: [200]
+    return_content: true
+    use_netrc: false
+  no_log: true
--- /dev/null
+---
+# Expects:
+#   - _maas_api
+#   - maas_auth_header
+#   - _fabric_by_name
+#   - _vlan_index
+#   - trio (tuple: [fabric_obj, vlan_obj, subnet_obj])
+
+# NOTE(review): values inside one set_fact are templated against *pre-task*
+# vars, so _vobj/_vid rely on _fname/vlan from a previous iteration/include —
+# confirm this is intended or split into two set_fact tasks.
+- name: Unpack current triple
+  set_fact:
+    _fname: "{{ trio.0.fabric }}"
+    vlan: "{{ trio.1 }}"
+    subnet: "{{ trio.2 }}"
+    # _vlan_index is keyed by *string* vids (see vlan_build_index.yml), so the
+    # vid must be stringified or an integer vid from inventory never matches.
+    _vobj: "{{ _vlan_index[_fname][vlan.vid | string] }}"
+    _vid: "{{ _vobj.id }}"
+
+# NOTE(review): this file appears to be an older variant of subnet_apply.yml;
+# the /vlans/{id}/subnets/ endpoint differs from the top-level /subnets/ used
+# there — confirm which file the role actually includes and that the endpoint
+# exists on the targeted MAAS version.
+- name: Read existing subnets on VLAN
+  uri:
+    url: "{{ _maas_api }}/vlans/{{ _vid }}/subnets/"
+    method: GET
+    headers: { Authorization: "{{ maas_auth_header }}" }
+    return_content: true
+  register: _subnets_resp
+
+- name: Get existing subnet (by CIDR) if present
+  set_fact:
+    _existing_subnet: "{{ (_subnets_resp.json | default([])) | selectattr('cidr','equalto', subnet.cidr) | list | first | default(None) }}"
+
+- name: Build create body for subnet
+  set_fact:
+    # cidr + vlan id always; gateway/space/managed only when supplied.
+    _subnet_create_body: >-
+      {{
+        {'cidr': subnet.cidr, 'vlan': _vid}
+        | combine( (subnet.gateway is defined) | ternary({'gateway_ip': subnet.gateway}, {}), recursive=True )
+        | combine( (subnet.space is defined) | ternary({'space': subnet.space}, {}), recursive=True )
+        | combine( (subnet.managed is defined) | ternary({'managed': subnet.managed|bool}, {}), recursive=True )
+      }}
+
+- name: Create subnet if missing
+  when: _existing_subnet is none
+  uri:
+    url: "{{ _maas_api }}/subnets/"
+    method: POST
+    headers: { Authorization: "{{ maas_auth_header }}" }
+    body_format: form-urlencoded
+    body: "{{ _subnet_create_body }}"
+    status_code: [200, 201, 409]
+    return_content: true
+
+- name: Build update body for subnet
+  set_fact:
+    # Only fields the inventory supplies are sent.
+    _subnet_update_body: >-
+      {{
+        {}
+        | combine( (subnet.gateway is defined) | ternary({'gateway_ip': subnet.gateway}, {}), recursive=True )
+        | combine( (subnet.space is defined) | ternary({'space': subnet.space}, {}), recursive=True )
+        | combine( (subnet.managed is defined) | ternary({'managed': subnet.managed|bool}, {}), recursive=True )
+      }}
+
+- name: Update subnet if exists
+  when: _existing_subnet is not none
+  uri:
+    url: "{{ _maas_api }}/subnets/{{ _existing_subnet.id }}/"
+    # The MAAS 2.0 API updates a subnet with PUT on its detail URL (POST is
+    # create-on-collection only); subnet_apply.yml already uses PUT here.
+    method: PUT
+    headers: { Authorization: "{{ maas_auth_header }}" }
+    body_format: form-urlencoded
+    body: "{{ _subnet_update_body }}"
+    status_code: [200]
+    return_content: true
+
+- name: Re-read subnets to get current subnet_id
+ uri:
+ url: "{{ _maas_api }}/vlans/{{ _vid }}/subnets/"
+ method: GET
+ headers: { Authorization: "{{ maas_auth_header }}" }
+ return_content: true
+ register: _subnets_after
+
+- name: Compute subnet id
+ set_fact:
+ _subnet_id: "{{ (_subnets_after.json | default([])) | selectattr('cidr','equalto', subnet.cidr) | map(attribute='id') | first }}"
+
+- name: Determine DNS servers for subnet (per-subnet or global)
+ set_fact:
+ _dns_list: "{{ subnet.dns_servers | default(maas_global_dns_servers | default([])) | list }}"
+
+- name: Set DNS servers on subnet when provided
+ when: _dns_list | length > 0
+ uri:
+ url: "{{ _maas_api }}/subnets/{{ _subnet_id }}/"
+ method: POST
+ headers: { Authorization: "{{ maas_auth_header }}" }
+ body_format: form-urlencoded
+ body: "{{ {'dns_servers': _dns_list | join(' ')} }}"
+ status_code: [200, 201]
+
+- name: Ensure IP ranges on subnet (if any)
+ when: subnet.ip_ranges is defined
+ block:
+ - name: Read existing ranges
+ uri:
+ url: "{{ _maas_api }}/subnets/{{ _subnet_id }}/ipranges/"
+ method: GET
+ headers: { Authorization: "{{ maas_auth_header }}" }
+ return_content: true
+ register: _ranges_resp
+
+ - name: Create/ensure each range (by type/start/end)
+ vars:
+ ipr_body: >-
+ {{
+ {'type': ipr.type | default('reserved'),
+ 'start_ip': ipr.start_ip,
+ 'end_ip': ipr.end_ip}
+ }}
+ exists: >-
+ {{
+ (_ranges_resp.json | default([]))
+ | selectattr('type','equalto', ipr.type | default('reserved'))
+ | selectattr('start_ip','equalto', ipr.start_ip)
+ | selectattr('end_ip','equalto', ipr.end_ip)
+ | list | first | default(None)
+ }}
+ when: exists is none
+ uri:
+ url: "{{ _maas_api }}/subnets/{{ _subnet_id }}/ipranges/"
+ method: POST
+ headers: { Authorization: "{{ maas_auth_header }}" }
+ body_format: form-urlencoded
+ body: "{{ ipr_body }}"
+ status_code: [200, 201, 409]
+ loop: "{{ subnet.ip_ranges }}"
+ loop_control: { loop_var: ipr }
--- /dev/null
+{# The escaped '{{'/'}}' render literally, producing a {{if debug}}...{{endif}}
+   block for a second templating pass — this appears to be MAAS/tempita-style
+   syntax; confirm the consumer. #}
+{{ '{{' }}if debug{{ '}}' }}set debug="all"{{ '{{' }}endif{{ '}}' }}
+set default="0"
+set timeout=0
+
+{# Try the UEFI default removable-media path first, then each distro's arm64
+   GRUB binary; if nothing is found, exit so firmware tries the next device. #}
+menuentry 'Local' {
+ echo 'Booting local disk...'
+ # This is the default bootloader location according to the UEFI spec.
+ search --set=root --file /efi/boot/bootaa64.efi
+ if [ $? -eq 0 ]; then
+ chainloader /efi/boot/bootaa64.efi
+ boot
+ fi
+
+{% set distros = ["rocky", "centos", "ubuntu"] %}
+
+{% for item in distros %}
+
+ search --set=root --file /efi/{{ item }}/grubaa64.efi
+ if [ $? -eq 0 ]; then
+ chainloader /efi/{{ item }}/grubaa64.efi
+ boot
+ fi
+
+{% endfor %}
+ # If no bootloader is found exit and allow the next device to boot.
+ exit
+}
--- /dev/null
+ {# Emit one dhcpd class declaration per entry in subnet_data.classes;
+    class_string is the raw match/statement body of the class. #}
+ {% if subnet_data.classes is defined -%}
+ {% for class_name, class_string in subnet_data.classes.items() -%}
+ class "{{ class_name }}" {
+ {{ class_string }};
+ }
+
+ {% endfor -%}
+ {%- endif -%}
--- /dev/null
+{# Render global dhcpd options: dhcp_maas_global is a list of mappings,
+   each key/value pair emitted as a "key value;" line. #}
+{% for item in dhcp_maas_global %}
+{% for key, value in item.items() %}
+{{ key }} {{ value }};
+{% endfor %}
+{% endfor %}
--- /dev/null
+ {# One static dhcpd host entry per inventory host that defines this subnet's
+    MAC variable (subnet_data.macvar) and whose subnet_data.ipvar address
+    falls inside the subnet's CIDR. #}
+ {% for host in groups['all'] | sort | unique -%}
+ {% if hostvars[host][subnet_data.macvar] is defined -%}
+ {% if hostvars[host][subnet_data.ipvar] | ansible.utils.ipaddr(subnet_data.cidr) -%}
+ host {{ host.split('.')[0] }}-{{ subnet_name }} {
+ {% if hostvars[host]['domain_name_servers'] is defined -%}
+ option domain-name-servers {{ hostvars[host]['domain_name_servers']|join(', ') }};
+ {% endif -%}
+ hardware ethernet {{ hostvars[host][subnet_data.macvar] }};
+ fixed-address {{ hostvars[host][subnet_data.ipvar] }};
+ {# Only advertise a hostname when the host explicitly opts in. #}
+ {% if hostvars[host]['dhcp_option_hostname'] is defined and hostvars[host]['dhcp_option_hostname'] == true %}
+ option host-name "{{ host.split('.')[0] }}";
+ {% endif -%}
+ }
+ {% endif -%}
+ {% endif -%}
+ {% endfor -%}
--- /dev/null
+ {# Emit dhcpd address pools. The special pool name "unknown_clients" becomes
+    an 'allow unknown-clients' pool; any other name allows members of the
+    class with that name. pool_value.range may be a single string or a list
+    of ranges. #}
+ {% if subnet_data.pools is defined -%}
+ {% for pool, pool_value in subnet_data.pools.items() -%}
+ pool {
+ {% if pool == "unknown_clients" -%}
+ allow unknown-clients;
+ {% else -%}
+ allow members of "{{ pool }}";
+ {% endif -%}
+ {% if pool_value.range is string -%}
+ range {{ pool_value.range }};
+ {% else -%}
+ range {{ pool_value.range|join(';\n range ') }};
+ {% endif -%}
+ {% if pool_value.next_server is defined -%}
+ next-server {{ pool_value.next_server }};
+ {% endif -%}
+ {% if pool_value.filename is defined -%}
+ filename "{{ pool_value.filename }}";
+ {% endif -%}
+ }
+
+ {% endfor -%}
+ {%- endif -%}
--- /dev/null
+nameserver
+==========
+
+This role is used to set up and configure a very basic **internal** BIND DNS server.
+
+This role has only been tested on CentOS 7.2 using BIND9.
+
+It does the following:
+
+- Installs and updates necessary packages
+- Enables and configures firewalld
+- Manages named.conf and BIND daemon config
+- Manages forward and reverse DNS records
+
+Prerequisites
++++++++++++++
+
+- CentOS 7.2
+
+Variables
++++++++++
+Most variables are defined in ``roles/nameserver/defaults/main.yml`` and values are chosen to support our Sepia_ lab. They can be overridden in the ``secrets`` repo.
+
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|Variable |Description |
++========================================================+===========================================================================================================================+
+|``packages: []`` |A list of packages to install that is specific to the role. The list is defined in ``roles/nameserver/vars/packages.yml`` |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|``named_conf_dir: "/var/named"`` |BIND main configuration directory. Defined in ``roles/nameserver/defaults/main.yml`` |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|``named_conf_file: "/etc/named.conf"`` |BIND main configuration file. This is the default CentOS path. Defined in ``roles/nameserver/defaults/main.yml`` |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|``named_conf_data_dir: "/var/named/data"`` |BIND data directory. named debug output and statistics are stored here. Defined in ``roles/nameserver/defaults/main.yml``|
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|``named_conf_listen_port: 53`` |Port BIND should listen on. Defined in ``roles/nameserver/defaults/main.yml`` |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|:: | |
+| | |
+| named_conf_listen_iface: |Interface(s) BIND should listen on. This defaults to listen on all IPv4 interfaces Ansible detects for the nameserver. |
+| - 127.0.0.1 |Defined in ``roles/nameserver/defaults/main.yml`` |
+| - "{{ ansible_all_ipv4_addresses[0] }}" | |
+| | |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|``named_conf_zones_path: "/var/named/zones"`` |Path to BIND zone files. Defined in ``roles/nameserver/defaults/main.yml`` |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|:: |named daemon options. Writes to ``/etc/sysconfig/named``. Defined in ``roles/nameserver/defaults/main.yml`` |
+| | |
+| named_conf_daemon_opts: "" | |
+| | |
+| # Example for IPv4 support only: | |
+| named_conf_daemon_opts: "-4" | |
+| | |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|:: |Values used to populate corresponding settings in each zone file's SOA record |
+| |Defined in ``roles/nameserver/defaults/main.yml`` |
+| named_conf_soa_ttl: 3600 | |
+| named_conf_soa_refresh: 3600 | |
+| named_conf_soa_retry: 3600 | |
+| named_conf_soa_expire: 604800 | |
+| | |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|:: |Desired primary nameserver and admin e-mail for each zone file. Defined in the secrets repo |
+| | |
+| named_conf_soa: [] | |
+| | |
+| # Example: | |
+| named_conf_soa: "ns1.example.com. admin.example.com." | |
+| | |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|``named_conf_recursion: "no"`` |Define whether recursion should be allowed or not. Defaults to "no". Override in Ansible inventory as a hostvar. |
+| | |
+| |**NOTE:** Setting to "yes" will add ``allow-recursion { any; }``. See To-Do. |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|:: |A list of nameservers BIND should forward external DNS queries to. This is not required but should be defined in |
+| |``ansible/inventory/group_vars/nameserver.yml`` if desired. |
+| named_forwarders: | |
+| - 8.8.8.8 | |
+| - 1.1.1.1 | |
+| | |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|``named_conf_slave: true`` |Will configure the server as a DNS slave if true. This variable is not required but should be set to true in the hostvars |
+| |if desired. |
+| | |
+| |**NOTE:** You must also set ``named_conf_master`` if ``named_conf_slave`` is true. See below. |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|``named_conf_master: "1.2.3.4"`` |Specifies the master server's IP which zones should be transferred from. Define in hostvars. |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|:: |A list of hosts or subnets you want to allow zone transfers to. This variable is not required but should be defined in |
+| |hostvars if you wish. BIND allows AXFR transfers to anywhere by default. |
+| named_conf_allow_axfr: | |
+| - localhost |See http://www.zytrax.com/books/dns/ch7/xfer.html#allow-transfer. |
+| - 1.2.3.4 | |
+| | |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|``ddns_keys: {}`` |A dictionary defining each Dynamic DNS zone's authorized key. See **Dynamic DNS** below. Defined in an encrypted file in |
+| |the secrets repo |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+
+**named_domains: []**
+
+The ``named_domains`` dictionary is the bread and butter of creating zone files. It is in standard YAML syntax. Each domain (key) must have ``forward``, ``ipvar``, and ``dynamic`` defined. ``ipvar`` can be set to ``NULL``. Optional values include ``miscrecords``, ``reverse``, and ``ddns_hostname_prefixes``.
+
+``forward``
+ The domain of the forward lookup zone for each domain (key)
+
+``ipvar``
+ The variable assigned to a system in the Ansible inventory. This allows systems to have multiple IPs assigned for a front and ipmi network, for example. See **Inventory Example** below.
+
+``dynamic``
+ Specifies whether the parent zone/domain should allow Dynamic DNS records. See **Dynamic DNS** below for more information.
+
+``ddns_hostname_prefixes``
+ This should be a list of dynamic hostname prefixes you don't want overwritten if a zone/domain has static and dynamic records. See **Dynamic DNS** below.
+
+``miscrecords``
+ Records to add to corresponding ``forward`` zone file. This is a good place for CNAMEs and MX records and records for hosts you don't have in your Ansible inventory. If your main nameserver is in a subdomain, you should create its glue record here. See example.
+
+``reverse``
+ This should be a list of each reverse lookup IP C-Block address corresponding to the domain (key). See example.
+
+**Example**::
+
+ named_domains:
+ example.com:
+ ipvar: NULL
+ dynamic: false
+ forward: example.com
+ miscrecords:
+ - www IN A 8.8.8.8
+ - www IN TXT "my www host"
+ - ns1.private IN A 192.168.0.1
+ private.example.com:
+ ipvar: ip
+ dynamic: true
+ ddns_hostname_prefixes:
+ - dyn
+ forward: private.example.com
+ miscrecords:
+ - mail IN MX 192.168.0.2
+ - email IN CNAME mail
+ reverse:
+ - 192.168.0.0
+ - 192.168.1.0
+ - 192.168.2.0
+ mgmt.example.com:
+ ipvar: mgmt
+ dynamic: false
+ forward: mgmt.example.com
+ reverse:
+ - 192.168.10.0
+ - 192.168.11.0
+ - 192.168.12.0
+ ddns.example.com:
+ ipvar: NULL
+ dynamic: true
+ forward: ddns.example.com
+
+Inventory
++++++++++
+This role will create forward and reverse DNS records for any host defined in your Ansible inventory when given an IP address assigned to a variable matching ``ipvar`` in ``named_domains``.
+
+Using the ``named_domains`` example above and inventory below, forward *and reverse* records for ``ns1.private.example.com``, ``tester050.private.example.com``, and ``tester050.mgmt.example.com`` would be created.
+
+**Example**::
+
+ [nameserver]
+ ns1.private.example.com ip=192.168.0.1
+
+ [testnodes]
+ tester050.private.example.com ip=192.168.1.50 mgmt=192.168.11.50
+
+**Note:** Hosts in inventory with no IP address defined will not have records created and should be added to ``miscrecords`` in ``named_domains``.
+
+Dynamic DNS
++++++++++++
+If you wish to use the Dynamic DNS feature of this role, you should generate an HMAC-MD5 keypair using dnssec-keygen_ for each zone you want to be able to dynamically update. The key generated should be pasted in the ``secret`` value of the ``ddns_keys`` dictionary for the corresponding domain.
+
+**Example**::
+
+ $ dnssec-keygen -a HMAC-MD5 -b 512 -n USER ddns.example.com
+ Kddns.example.com.+157+57501
+ $ cat Kddns.example.com.+157+57501.key
+ ddns.example.com. IN KEY 0 3 157 LxFSAiBgKYtsTTV/hjaK7LNdsbk19xQv0ZY9xLtrpdIWhf2S4gurD5GJ JjP9N8bnlCPKc7zVy+JcBYbSMSsm2A==
+
+ # In {{ secrets_path }}/nameserver.yml
+ ---
+ ddns_keys:
+ ddns.example.com:
+ secret: "LxFSAiBgKYtsTTV/hjaK7LNdsbk19xQv0ZY9xLtrpdIWhf2S4gurD5GJ JjP9N8bnlCPKc7zVy+JcBYbSMSsm2A=="
+
+``roles/nameserver/templates/named.conf.j2`` loops through each domain in ``named_domains``, checks whether ``dynamic: true`` and if so, then loops through ``ddns_keys`` and matches the secret key to the domain.
+
+These instructions assume you'll either have one host updating DNS records or you'll be sharing the resulting key. Clients can use nsupdate_ to update the nameserver. Configuring that is outside the scope of this role.
+
+You can have two types of Dynamic DNS zones:
+
+ 1. A pure dynamic DNS zone with no static A records
+ 2. A mixed zone consisting of both dynamic and static records
+
+For a mixed zone, you must specify ``ddns_hostname_prefixes`` under the domain in ``named_domains`` or else your dynamic records will be overwritten each time the records task is run. **Example**::
+
+ named_domains:
+ private.example.com:
+ forward: private.example.com
+ ipvar: ip
+ dynamic: true
+ ddns_hostname_prefixes:
+ - foo
+ ddns.example.com:
+ forward: ddns.example.com
+ ipvar: NULL
+ dynamic: true
+
+In the example above, a dynamic hostname of ``foo001.private.example.com`` will be saved and restored at the end of the records task. If you *dynamically* added a hostname of ``bar001.private.example.com`` however, the records task will remove it. Do not create static hostnames in your Ansible inventory using any of the prefixes in ``ddns_hostname_prefixes`` or you'll end up with duplicates in the zone file.
+
+The records task will not modify the ddns.example.com zone file.
+
+For our upstream test lab's purposes, this allows us to combine static and dynamic records in our ``front.sepia.ceph.com`` domain so teuthology_'s ``lab_domain`` variable can remain unchanged.
+
+This role also configures DNS slaves to accept DDNS updates and will forward them to the master using the ``allow-update-forwarding`` parameter in ``/etc/named.conf``. This is particularly useful in our Sepia lab since our master server can't send ``NOTIFY`` messages directly to the slave.
+
+**NOTE:** Reverse zone Dynamic DNS is not supported at this time.
+
+Tags
+++++
+
+packages
+ Install *and update* packages
+
+config
+ Configure and restart named service (if config changes)
+
+firewall
+ Enable firewalld and allow dns traffic
+
+records
+ Compiles and writes forward and reverse zone files using ``named_domains`` and Ansible inventory
+
+Dependencies
+++++++++++++
+
+This role depends on the following roles:
+
+secrets
+ Provides a var, ``secrets_path``, containing the path of the secrets repository, a tree of Ansible variable files.
+
+sudo
+ Sets ``ansible_sudo: true`` for this role which causes all the plays in this role to execute with sudo.
+
+To-Do
++++++
+
+- Allow additional user-defined firewall rules
+- DNSSEC
+- Add support for specifying networks to allow recursion from
+
+.. _Sepia: https://ceph.github.io/sepia/
+.. _dnssec-keygen: https://ftp.isc.org/isc/bind9/cur/9.9/doc/arm/man.dnssec-keygen.html
+.. _nsupdate: https://linux.die.net/man/8/nsupdate
+.. _teuthology: http://docs.ceph.com/teuthology/docs/siteconfig.html?highlight=lab_domain
--- /dev/null
+---
+# These defaults are present to allow certain tasks to no-op if a secrets repo
+# hasn't been defined. If you want to override these, do so in the secrets repo
+# itself. We override these in $repo/ansible/inventory/group_vars/nameserver.yml
+secrets_repo:
+ name: null
+ url: null
+
+# Main BIND conf vars
+named_conf_dir: "/var/named"
+named_conf_file: "/etc/named.conf"
+named_conf_data_dir: "/var/named/data"
+named_conf_listen_port: 53
+# Listen on loopback plus the first IPv4 address Ansible discovers.
+named_conf_listen_iface:
+ - 127.0.0.1
+ - "{{ ansible_all_ipv4_addresses[0] }}"
+named_conf_zones_path: "/var/named/zones"
+# Extra named daemon flags written to /etc/sysconfig/named (e.g. "-4").
+named_conf_daemon_opts: ""
+# Quoted on purpose: named.conf needs the literal string yes/no; an unquoted
+# 'no' would be parsed by YAML as boolean false.
+named_conf_recursion: "no" # Allow recursion? [yes|no]
+
+# Zone file conf vars
+# SOA timer values (seconds) rendered into every generated zone file.
+named_conf_soa_ttl: 3600
+named_conf_soa_refresh: 3600
+named_conf_soa_retry: 3600
+named_conf_soa_expire: 604800
+
+# Per-zone Dynamic DNS TSIG keys; populated from the secrets repo.
+ddns_keys: {}
--- /dev/null
+---
+# Handlers for the nameserver role: a full restart is required for named.conf
+# changes, while a reload suffices for zone file changes.
+# Restart for config file updates
+- name: restart named
+ service:
+ name: named
+ state: restarted
+
+# Reload for zone file updates
+- name: reload named
+ service:
+ name: named
+ state: reloaded
--- /dev/null
+---
+# The secrets role provides secrets_path, which tasks/main.yml uses to load
+# this role's private variables.
+dependencies:
+ - role: secrets
--- /dev/null
+---
+# Configure the BIND daemon: data dir, named.conf, daemon options, SELinux
+# boolean, and a conntrack bump. Imported from main.yml with the 'config' tag.
+- name: Create named data directory
+  file:
+    path: "{{ named_conf_data_dir }}"
+    state: directory
+    owner: named
+    group: named
+
+# Validate with named-checkconf before replacing, so a broken template cannot
+# take down the nameserver on the subsequent restart.
+- name: Create named.conf
+  template:
+    src: named.conf.j2
+    dest: "{{ named_conf_file }}"
+    validate: named-checkconf %s
+  notify: restart named
+
+- name: Apply named daemon options
+  lineinfile:
+    dest: /etc/sysconfig/named
+    regexp: '^OPTIONS='
+    line: "OPTIONS=\"{{ named_conf_daemon_opts }}\""
+    state: present
+    create: true
+  notify: restart named
+
+- name: Configure SELinux to allow named to write to master zone files
+  seboolean:
+    name: named_write_master_zones
+    state: true
+    persistent: true
+  when:
+    - ansible_selinux.status is defined
+    - ansible_selinux.status == "enabled"
+
+# Helps prevent accidental DoS
+- name: Double maximum configured connections
+  sysctl:
+    name: net.nf_conntrack_max
+    value: "131072"
+    state: present
--- /dev/null
+---
+# Enable firewalld and open the dns service. Imported from main.yml with the
+# 'firewall' tag.
+- name: Enable firewalld
+  service:
+    name: firewalld
+    enabled: true
+    state: started
+
+# permanent + immediate: persist the rule and apply it to the running
+# firewall in one step.
+- name: Allow incoming DNS traffic
+  firewalld:
+    service: dns
+    permanent: true
+    immediate: true
+    state: enabled
--- /dev/null
+---
+# Load role secrets when available; empty.yml is a deliberate no-op fallback
+# so the task succeeds when the secrets file does not exist.
+- name: Include secrets
+ include_vars: "{{ item }}"
+ no_log: true
+ with_first_found:
+ - "{{ secrets_path | mandatory }}/nameserver.yml"
+ - empty.yml
+ tags:
+ - always
+
+# Install and update system packages
+- import_tasks: packages.yml
+ tags:
+ - packages
+
+# Keep time in sync — important for DNS serial handling and DDNS.
+- name: Enable and start ntpd
+ service:
+ name: ntpd
+ state: started
+ enabled: yes
+ tags:
+ - always
+
+# DDNS updates fail to create or edit jnl files without this
+- name: Ensure permissions set for "{{ named_conf_zones_path }}"
+ file:
+ path: "{{ named_conf_zones_path }}"
+ mode: '0700'
+ state: directory
+ owner: named
+ group: named
+ tags:
+ - always
+
+# Configure firewalld
+- import_tasks: firewall.yml
+ tags:
+ - firewall
+
+# Configure BIND
+- import_tasks: config.yml
+ tags:
+ - config
+
+# Compile and write zone files
+# Slaves receive zones via transfer from the master, so records are only
+# generated when this host is not configured as a slave.
+- import_tasks: records.yml
+ tags:
+ - records
+ when: (named_conf_slave is undefined) or
+ (named_conf_slave is defined and named_conf_slave == false)
+
+# The tasks below are last so the grep output is near the end of the play
+# nameserver_collisions_grep is registered in records.yml; any stdout means
+# at least one IPv4 address appears in more than one generated record.
+- set_fact:
+ have_collisions: true
+ when:
+ - (named_conf_slave is undefined) or (named_conf_slave is defined and named_conf_slave == false)
+ - nameserver_collisions_grep is defined and nameserver_collisions_grep.stdout | length > 0
+ tags:
+ - records
+
+- name: Print IP collisions
+ debug:
+ msg:
+ - "WARNING: The following IP addresses have multiple records in DNS. Check for IP collisions!"
+ - "Either re-run this playbook with '-vvv' or `grep -r -w {{ inventory_dir }}/{{ lab_name }} {{ inventory_dir }}/group_vars/nameserver.yml` for the IPs below."
+ - "{{ nameserver_collisions_grep.stdout_lines }}"
+ when: have_collisions is defined and have_collisions|bool
+ tags:
+ - records
+
+# Run the suggested grep on the control node so the offending inventory lines
+# show up in the play output.
+- name: grep duplicated IPs in ansible inventory
+ local_action:
+ module: command
+ cmd: "grep -r -w {{ item }} {{ inventory_dir }}/{{ lab_name }} {{ inventory_dir }}/group_vars/nameserver.yml"
+ become: false
+ connection: local
+ with_items: "{{ nameserver_collisions_grep.stdout_lines }}"
+ when: have_collisions is defined and have_collisions|bool
+ tags:
+ - records
--- /dev/null
+---
+# OS-specific package installation. The import of this file in tasks/main.yml
+# carries the 'packages' tag, so every task here already inherits it — no
+# task-level tags are needed.
+- name: Include nameserver package list
+  include_vars: packages_redhat.yml
+  when: ansible_os_family == "RedHat"
+
+- name: Include nameserver package list
+  include_vars: packages_suse.yml
+  when: ansible_os_family == "Suse"
+
+# state: latest is deliberate — the 'packages' tag both installs and updates
+# (see the role README).
+- name: Install and update packages via yum
+  yum:
+    name: "{{ packages }}"
+    state: latest
+    enablerepo: epel
+  when: ansible_pkg_mgr == "yum"
+
+- name: Install and update packages via zypper
+  zypper:
+    name: "{{ packages }}"
+    state: latest
+    update_cache: true
+  when: ansible_pkg_mgr == "zypper"
--- /dev/null
+---
+# Creating reverse records requires ansible_version.major >= 2
+# to use the skip_missing flag of with_subelements
+# https://github.com/ansible/ansible/issues/9827
+# The 'that' expression is a bare conditional: embedding {{ }} delimiters in
+# assert/when expressions is warned against by Ansible and can template the
+# value before the conditional is evaluated.
+- name: Bail if local ansible version is older than v2.0
+  assert:
+    that: "ansible_version.major >= 2"
+
+- name: Create zone file path
+ file:
+ path: "{{ named_conf_zones_path }}"
+ state: directory
+
+# The zone serial is the current epoch; it is swapped for a fresh value near
+# the end of this file, after all zone files have been written.
+- name: Set named_serial variable
+ set_fact:
+ named_serial: "{{ ansible_date_time.epoch }}"
+
+- name: Create non-existent forward zone files for dynamic domains
+ template:
+ src: forward.j2
+ dest: "{{ named_conf_zones_path }}/{{ item.key }}"
+ validate: named-checkzone {{ item.key }} %s
+ # only write if zone file doesn't already exist
+ # this makes sure we don't clobber ddns records
+ force: no
+ with_dict: "{{ named_domains }}"
+ notify: reload named
+ when: item.value.dynamic == true
+
+# We store new zone files in a temp directory because it takes ansible minutes
+# to write all the files. If we prevented DDNS updates while they were
+# getting written, there's a good chance some updates would get refused.
+# We copy these to named_conf_zones_path at the end.
+- name: Create temporary directory for zone files
+ command: "mktemp -d"
+ register: named_tempdir
+
+- name: Write forward zone files to tempdir
+ template:
+ src: forward.j2
+ dest: "{{ named_tempdir.stdout }}/{{ item.key }}"
+ validate: named-checkzone {{ item.key }} %s
+ with_dict: "{{ named_domains }}"
+ notify: reload named
+ # Don't write zone files for pure dynamic zones
+ when: (item.value.dynamic != true) or
+ (item.value.dynamic == true and item.value.ddns_hostname_prefixes is defined)
+
+# Any IPv4 address appearing more than once across the freshly written zone
+# files indicates a collision; tasks/main.yml reports the result at the end
+# of the play.
+- name: grep temp zone files for IP collisions
+ shell: 'grep -E -o -h "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)" {{ named_tempdir.stdout }}/* | sort | uniq -d'
+ register: nameserver_collisions_grep
+ when: (named_conf_slave is undefined) or
+ (named_conf_slave is defined and named_conf_slave == false)
+
+# skip_missing lets domains without a 'reverse' list be skipped silently
+# (requires Ansible >= 2, asserted at the top of this file).
+- name: Write reverse zone files to tempdir
+ template:
+ src: reverse.j2
+ dest: "{{ named_tempdir.stdout }}/{{ item.1 }}"
+ validate: named-checkzone {{ item.1 }} %s
+ with_subelements:
+ - "{{ named_domains }}"
+ - reverse
+ - flags:
+ skip_missing: True
+ notify: reload named
+
+# This makes sure dynamic DNS records in the journal files are in sync with the
+# actual zone files so we can store them in the next 2 steps.
+- name: Sync Dynamic DNS journals with zone files
+ command: "rndc sync -clean {{ item.key }}"
+ with_dict: "{{ named_domains }}"
+ when: item.value.dynamic == true and
+ item.value.ddns_hostname_prefixes is defined
+ # Don't fail if there is no journal file
+ failed_when: false
+
+# Prevents dynamic DNS record updates so we can capture current DDNS records
+# and move our new zone files into place without them getting overwritten.
+- name: Freeze Dynamic DNS zones to prevent updates
+ command: "rndc freeze {{ item.key }}"
+ register: freeze_output
+ with_dict: "{{ named_domains }}"
+ when: item.value.dynamic == true and
+ item.value.ddns_hostname_prefixes is defined
+ failed_when: (freeze_output.rc != 0) and ("no matching zone" not in freeze_output.stderr)
+
+# Carry forward dynamically registered A records (hostnames matching
+# ddns_hostname_prefixes) into the newly generated zone file so a records run
+# does not drop them.
+- name: Spit existing dynamic A records into new/temp forward zone file
+ shell: "grep -E '^({% for prefix in item.value.ddns_hostname_prefixes %}{{ prefix }}{% if not loop.last %}|{% endif %}{% endfor %})[0-9]+\\s+A' {{ named_conf_zones_path }}/{{ item.key }} >> {{ named_tempdir.stdout }}/{{ item.key }}"
+ with_dict: "{{ named_domains }}"
+ when: item.value.dynamic == true and
+ item.value.ddns_hostname_prefixes is defined
+ # Don't fail if there are no records to store
+ failed_when: false
+
+- name: Move all new/temp zone files to actual zone file dir
+ shell: "mv -vf {{ named_tempdir.stdout }}/* {{ named_conf_zones_path }}/"
+
+# Re-run setup module to update ansible_date_time.epoch
+- name:
+ setup:
+
+- name: Set new_named_serial variable
+ set_fact:
+ new_named_serial: "{{ ansible_date_time.epoch }}"
+
+# Since ansible takes a while to write the new/temp zone files, it is likely
+# a DDNS record update incremented the serial so the original named_serial is
+# too old. We replace it here to be safe.
+- name: Overwrite zone file serial number
+ shell: "sed -i 's/{{ named_serial }}/{{ new_named_serial }}/g' {{ named_conf_zones_path }}/*"
+
+# Context is incorrect due to the files being written to a temp directory first
+- name: Restore SELinux context on zone files
+ command: "restorecon -r {{ named_conf_zones_path }}"
+
+# This re-enables dynamic DNS record updates
+- name: Thaw frozen zone files
+ shell: "rndc thaw {{ item.key }}"
+ with_dict: "{{ named_domains }}"
+ when: item.value.dynamic == true and
+ item.value.ddns_hostname_prefixes is defined
+
+- name: Clean up temp dir
+ file:
+ path: "{{ named_tempdir.stdout }}"
+ state: absent
--- /dev/null
+{# Forward zone file template, rendered once per named_domains entry (item).
+   Emits the SOA, an NS record per inventory nameserver, any miscrecords,
+   and one A record per inventory host that defines this zone's ipvar. #}
+{% set domain = item.key %}
+{% if item.value.ipvar is defined and item.value.ipvar.0 is defined %}
+{% set ipvar = item.value.ipvar %}
+{% endif %}
+;
+; {{ ansible_managed }}
+;
+$TTL {{ named_conf_soa_ttl }}
+@ IN SOA {{ named_conf_soa }} (
+ {{ named_serial }} ; Serial
+ {{ named_conf_soa_refresh }} ; Refresh
+ {{ named_conf_soa_retry }} ; Retry
+ {{ named_conf_soa_expire }} ; Expire
+ {{ named_conf_soa_ttl }} ; TTL
+ )
+
+{% for nameserver in groups['nameserver'] %}
+ IN NS {{ nameserver }}.
+{% endfor %}
+
+$ORIGIN {{ domain }}.
+
+{% if item.value.miscrecords is defined %}
+{% for record in item.value.miscrecords %}
+{{ record }}
+{% endfor %}
+{% endif %}
+
+{% if item.value.ipvar is defined and item.value.ipvar.0 is defined %}
+{% for host in groups['all'] %}
+{% if hostvars[host][ipvar] is defined %}
+{{ hostvars[host]['inventory_hostname_short'] }} IN A {{ hostvars[host][ipvar] }}
+{% endif %}
+{% endfor %}
+{% endif %}
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+
+{# Main BIND configuration: options, logging, DDNS keys, and one zone stanza
+   per forward and reverse zone in named_domains. Master vs slave stanzas are
+   selected by named_conf_slave. #}
+options {
+ listen-on port {{ named_conf_listen_port }} { {% for interface in named_conf_listen_iface -%}{{ interface }}; {% endfor -%} };
+
+ directory "{{ named_conf_dir }}";
+ dump-file "{{ named_conf_data_dir }}/cache_dump.db";
+ statistics-file "{{ named_conf_data_dir }}/named_stats.txt";
+ memstatistics-file "{{ named_conf_data_dir }}/named_mem_stats.txt";
+
+ allow-query { any; };
+ recursion {{ named_conf_recursion }};
+{% if named_conf_recursion == "yes" %}
+ allow-recursion { any; };
+{% endif %}
+{% if named_forwarders is defined %}
+ forwarders { {% for forwarder in named_forwarders -%}{{ forwarder }}; {% endfor -%} };
+{% endif %}
+{% if named_conf_allow_axfr is defined %}
+ allow-transfer { {% for ip in named_conf_allow_axfr -%}{{ ip }}; {% endfor -%} };
+{% endif %}
+
+{% if named_conf_slave is defined and named_conf_slave == true %}
+ ## Slave-specific config
+ # Set these in case named_conf_soa vars are lower than the BIND default.
+ # Forces refresh and retries at the specified intervals.
+ min-refresh-time {{ named_conf_soa_refresh }};
+ max-refresh-time {{ named_conf_soa_refresh }};
+ min-retry-time {{ named_conf_soa_retry }};
+ max-retry-time {{ named_conf_soa_retry }};
+ notify master-only;
+{% endif %}
+};
+
+logging {
+ channel default_debug {
+ file "{{ named_conf_data_dir }}/named.run";
+ severity dynamic;
+ };
+};
+
+# Dynamic DNS
+{# Declare a TSIG key stanza for every dynamic zone that has a matching entry
+   (keyed by domain name) in ddns_keys. #}
+{% for key, zone in named_domains.items() %}
+{% if zone.dynamic == true %}
+{% for domain, values in ddns_keys.items() %}
+{% if key == domain %}
+key "{{ key }}" {
+ algorithm {{ values.algorithm|default('hmac-md5') }};
+ secret "{{ values.secret }}";
+};
+{% endif %}
+{% endfor %}
+{% endif %}
+{% endfor %}
+
+# Forward zones
+{# Slaves transfer zones from named_conf_master and forward DDNS updates to
+   it; masters serve the generated zone files and accept keyed updates. #}
+{% for key, zone in named_domains.items() %}
+zone "{{ key }}" {
+{% if named_conf_slave is defined and named_conf_slave == true %}
+ type slave;
+ file "{{ named_conf_dir }}/slaves/{{ key }}";
+ masters { {{ named_conf_master }}; };
+{% if zone.dynamic == true %}
+ allow-update-forwarding { key "{{ key }}"; };
+{% endif %}
+{% else %}
+ type master;
+ file "{{ named_conf_zones_path }}/{{ key }}";
+{% if zone.dynamic == true %}
+ allow-update { key "{{ key }}"; };
+{% endif %}
+{% endif %}
+};
+
+{% endfor %}
+
+# Reverse zones
+{# NOTE(review): the ansible_env._ check below guesses the controller's
+   Python from the invoking command path and changes how many fields a
+   reverse entry is split into (4 vs 3). This looks fragile — confirm
+   whether reverse entries are "a.b.c" or "a.b.c.0" and simplify if
+   possible. #}
+{% for key, zone in named_domains.items() %}
+{% if zone.reverse is defined and zone.reverse.0 is defined %}
+{% for reverse in zone.reverse %}
+{% if ansible_env._ == "/usr/bin/python3" %}
+{% set octet1,octet2,octet3,_ = reverse.split('.') %}
+{% else %}
+{% set octet1,octet2,octet3 = reverse.split('.') %}
+{% endif %}
+zone "{{ octet3 }}.{{ octet2 }}.{{ octet1 }}.in-addr.arpa" {
+{% if named_conf_slave is defined and named_conf_slave == true %}
+ type slave;
+ file "{{ named_conf_dir }}/slaves/{{ reverse }}";
+ masters { {{ named_conf_master }}; };
+{% else %}
+ type master;
+ file "{{ named_conf_zones_path }}/{{ reverse }}";
+{% endif %}
+};
+
+{% endfor %}
+{% endif %}
+{% endfor %}
--- /dev/null
+{% set zone = item.1 %}
+{% set domain = item.0.forward %}
+{% set ipvar = item.0.ipvar %}
+;
+; {{ ansible_managed }}
+;
+$TTL {{ named_conf_soa_ttl }}
+@ IN SOA {{ named_conf_soa }} (
+ {{ named_serial }} ; Serial
+ {{ named_conf_soa_refresh }} ; Refresh
+ {{ named_conf_soa_retry }} ; Retry
+ {{ named_conf_soa_expire }} ; Expire
+ {{ named_conf_soa_ttl }} ; TTL
+ )
+
+{% for nameserver in groups['nameserver'] %}
+ IN NS {{ nameserver }}.
+{% endfor %}
+
+; Reverse zone {{ zone }} belongs to forward zone {{ domain }}
+
+{% for host in groups['all'] %}
+{% if hostvars[host][ipvar] is defined %}
+{% set octet1,octet2,octet3,octet4 = hostvars[host][ipvar].split('.') %}
+{% set cutip = octet1 + '.' + octet2 + '.' + octet3 %}
+{% if cutip == zone %}
+{{ octet4 }} IN PTR {{ hostvars[host]['inventory_hostname_short'] }}.{{ domain }}.
+{% endif %}
+{% endif %}
+{% endfor %}
--- /dev/null
+---
+# This is empty on purpose. Used as the last line
+# when using include_vars with with_first_found when
+# the var file might not exist.
+#
+# Maybe related issue:
+# https://github.com/ansible/ansible/issues/10000
--- /dev/null
+---
+packages:
+ ## misc tools
+ - vim
+ - wget
+ - mlocate
+ - git
+ - redhat-lsb-core
+ ## bind-specific packages
+ - bind
+ - bind-utils
+ ## firewall
+ - firewalld
+ ## monitoring
+ - nrpe
+ - nagios-plugins-all
+ ## for NTP
+ - ntp
+ - ntpdate
--- /dev/null
+---
+packages:
+ ## misc tools
+ - vim
+ - wget
+ - mlocate
+ - git
+ - lsb
+ ## bind-specific packages
+ - bind
+ - bind-utils
+ ## firewall
+ - firewalld
+ ## monitoring
+ - nrpe
+ - nagios-plugins-all
+ ## for NTP
+ - ntp
+ #- ntpdate
+ # do we really need selinux on opensuse?
+ - python-selinux
--- /dev/null
+nsupdate-web
+============
+
+This role sets up `nsupdate-web <https://github.com/zmc/nsupdate-web>`_ for updating dynamic DNS records.
+
+To use the role, you must first have:
+
+- A DNS server supporting `RFC 2136 <https://tools.ietf.org/html/rfc2136>`_. We use `bind <https://www.isc.org/downloads/bind/>`_ and the `nameserver` role to help configure ours.
+- Key files stored in the location pointed to by `keys_dir`
+
+You must set the following vars. Here are examples::
+
+ nsupdate_web_server: "ns1.front.sepia.ceph.com"
+ pubkey_name: "Kfront.sepia.ceph.com.+157+12548.key"
+
--- /dev/null
+---
+packages: []
+nsupdate_web_user: "nsupdate"
+nsupdate_web_port: "8080"
+nsupdate_web_ttl: "60"
+virtualenv_path: "~/venv"
+python_version: "python3"
+nsupdate_web_repo: "https://github.com/ceph/nsupdate-web.git"
+nsupdate_web_path: "/home/{{ nsupdate_web_user }}/nsupdate_web"
+nsupdate_web_branch: "main"
+# The public and private keys must be manually placed on the host;
+# The pubkey name must be provided - most likely via group_vars
+pubkey_name: "your_pubkey.key"
+keys_dir: "/home/{{ nsupdate_web_user }}/keys"
+allow_hosts: ""
--- /dev/null
+---
+- name: Build args to pass to nsupdate_web
+ set_fact:
+ nsupdate_web_args: "--ttl {{ nsupdate_web_ttl }} -d {{ lab_domain }} -K {{ keys_dir }}/{{ pubkey_name }} -s {{ nsupdate_web_server }}{% if allow_hosts %} -a {{ allow_hosts }}{% endif %}"
+
+- name: Including major version specific variables.
+ include_vars: "{{ item }}"
+ with_first_found:
+ - "{{ ansible_distribution | lower | replace(' ', '_') }}_{{ ansible_distribution_major_version }}.yml"
+ - empty.yml
+
+# Install everything in one transaction by passing the whole list to the
+# package module; looping with_items over package modules is deprecated
+# and slower (one transaction per item).
+- name: Install packages
+  package:
+    name: "{{ packages }}"
+    state: latest
+
+- name: Create nsupdate group
+ group:
+ name: "{{ nsupdate_web_user }}"
+ state: present
+ system: true
+
+- name: Create nsupdate user
+ user:
+ name: "{{ nsupdate_web_user }}"
+ group: "{{ nsupdate_web_user }}"
+ state: present
+ system: true
+ shell: "/bin/false"
+
+# Only create the key directory when the public key actually exists on the
+# controller (the "file" test runs controller-side). Note: "when" already
+# implies templating, so raw Jinja expressions are used instead of the
+# deprecated "{{ ... }}"-in-a-string form.
+- name: Make sure keys_dir exists
+  file:
+    path: "{{ keys_dir }}"
+    state: directory
+    owner: "{{ nsupdate_web_user }}"
+    group: "{{ nsupdate_web_user }}"
+  when: (secrets_path + '/' + pubkey_name) is file
+
+# The fileglob strips pubkey_name's extension so both the .key and the
+# matching .private file are copied. Skipped entirely when the public key
+# is absent from the controller's secrets_path ("file" test is
+# controller-side; bare Jinja in "when" — no moustaches).
+- name: Copy .key and .private keys to keys_dir
+  copy:
+    src: "{{ item }}"
+    dest: "{{ keys_dir }}/"
+    owner: "{{ nsupdate_web_user }}"
+    group: "{{ nsupdate_web_user }}"
+  with_fileglob:
+    - "{{ secrets_path }}/{{ pubkey_name | regex_replace('\\.[^\\.]+$', '') }}.*"
+  when: (secrets_path + '/' + pubkey_name) is file
+
+- name: Clone nsupdate_web repo
+ git:
+ repo: "{{ nsupdate_web_repo }}"
+ dest: "{{ nsupdate_web_path }}"
+ version: "{{ nsupdate_web_branch }}"
+ become_user: "{{ nsupdate_web_user }}"
+
+- name: Create/update virtualenv
+ pip:
+ name: pip
+ virtualenv_python: "{{ python_version }}"
+ virtualenv: "{{ virtualenv_path }}"
+ become_user: "{{ nsupdate_web_user }}"
+
+- name: Set up nsupdate_web
+ shell: "source {{ virtualenv_path }}/bin/activate && python setup.py develop"
+ args:
+ chdir: "{{ nsupdate_web_path }}"
+ executable: "/bin/bash"
+ become_user: "{{ nsupdate_web_user }}"
+
+- name: Ship systemd service
+ template:
+ src: nsupdate-web.service
+ dest: "/etc/systemd/system/"
+ owner: root
+ group: root
+ mode: 0644
+ register: ship_service
+
+- name: Reload systemd and enable/restart service
+ # We use the systemd module here so we can use the daemon_reload feature,
+ # since we're shipping the .service file ourselves
+ systemd:
+ name: nsupdate-web
+ daemon_reload: true
+ enabled: true
+ state: restarted
+ when: ship_service is changed
+
+- name: Ship nginx configuration
+ template:
+ src: "nsupdate_web_nginx_{{ ansible_distribution | lower | replace(' ', '_') }}_{{ ansible_distribution_major_version }}"
+ dest: "{{ nginx_available }}/nsupdate_web"
+ owner: root
+ group: root
+ mode: 0644
+
+- name: Disable default nginx configuration
+ file:
+ path: "{{ nginx_enabled }}/default"
+ state: absent
+
+- name: Enable our nginx configuration
+ file:
+ src: "{{ nginx_available }}/nsupdate_web"
+ dest: "{{ nginx_enabled }}/{{ nsupdate_web_conf }}"
+ state: link
+
+- name: Enable and restart nginx
+ service:
+ name: nginx
+ enabled: true
+ state: restarted
--- /dev/null
+# {{ ansible_managed }}
+[Unit]
+Description=DDNS HTTP update service.
+
+[Service]
+Type=simple
+User={{ nsupdate_web_user }}
+Group={{ nsupdate_web_user }}
+ExecStart=/usr/bin/python3 {{ nsupdate_web_path }}/ddns-server.py -p {{ nsupdate_web_port }} {{ nsupdate_web_args }}
+
+[Install]
+WantedBy=multi-user.target
--- /dev/null
+server {
+ listen 80;
+
+ location = /update {
+ proxy_pass http://localhost:{{ nsupdate_web_port }};
+ }
+}
--- /dev/null
+server {
+ listen 80;
+
+ location = /update {
+ include proxy_params;
+ proxy_pass http://localhost:{{ nsupdate_web_port }};
+ }
+}
--- /dev/null
+packages:
+ - git
+ - python3
+ - python3-virtualenv
+ - bind-utils
+ - nginx
+nginx_available: "/etc/nginx"
+nginx_enabled: "/etc/nginx/vhosts.d"
+nsupdate_web_conf: "nsupdate_web.conf"
--- /dev/null
+packages:
+  - git
+  - python3
+  - python3-virtualenv
+  - bind-utils
+  - nginx
+nginx_available: "/etc/nginx"
+nginx_enabled: "/etc/nginx/vhosts.d"
+# Required unconditionally by the "Enable our nginx configuration" task;
+# without it the role fails with an undefined-variable error on this
+# distro (the other SUSE vars file already defines it).
+nsupdate_web_conf: "nsupdate_web.conf"
+packages:
+ - git
+ - python3-minimal
+ - virtualenv
+ - dnsutils
+ - nginx
+
+nginx_available: "/etc/nginx/sites-available"
+nginx_enabled: "/etc/nginx/sites-enabled"
+nsupdate_web_conf: "nsupdate_web"
--- /dev/null
+ntp-server
+==========
+
+This role is used to set up and configure an NTP server on RHEL or CentOS 7 using NTPd or Chronyd.
+
+Notes
++++++
+
+Virtual machines should not be used as NTP servers.
+
+Red Hat best practices were followed: https://access.redhat.com/solutions/778603
+
+Variables
++++++++++
+
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
+|Variable |Description |
++========================================================+===========================================================================================================================+
+|:: |A list of LANs that are permitted to query the NTP server running on the host. |
+| | |
+| ntp_permitted_lans: | |
+| - 192.168.0.0/24 |Must be in CIDR format as shown. |
+| - 172.20.20.0/20 | |
+| | |
++--------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------+
--- /dev/null
+---
+# Probe which time daemon is already installed; the registered rc values
+# drive the use_ntp/use_chrony facts below. failed_when: false (rather
+# than ignore_errors) keeps a missing package from being reported as an
+# ignored failure, matching the firewalld check later in this file.
+- name: Check if ntp package installed
+  command: rpm -q ntp
+  failed_when: false
+  register: ntp_installed
+
+- name: Check if chrony package installed
+  command: rpm -q chrony
+  failed_when: false
+  register: chrony_installed
+
+# Use NTP if neither time service is installed
+- set_fact:
+ use_ntp: true
+ use_chrony: false
+ when:
+ - ntp_installed.rc != 0
+ - chrony_installed.rc != 0
+
+# Use NTP if it's installed and Chrony isn't
+- set_fact:
+ use_ntp: true
+ use_chrony: false
+ when:
+ - ntp_installed.rc == 0
+ - chrony_installed.rc != 0
+
+# Use Chrony if it's installed and NTP isn't
+- set_fact:
+ use_ntp: false
+ use_chrony: true
+ when:
+ - ntp_installed.rc != 0
+ - chrony_installed.rc == 0
+
+# It's unlikely we have four baremetal hosts doing nothing but serving as NTP servers.
+# Thus, we shouldn't go uninstalling anything since either package could be a dependency
+# of an already running service.
+- fail:
+ msg: "Both NTP and Chrony are installed. Check dependencies before removing either package and proceeding."
+ when:
+ - ntp_installed.rc == 0
+ - chrony_installed.rc == 0
+
+- name: Install and update ntp package
+ yum:
+ name: ntp
+ state: latest
+ when: use_ntp == true
+
+- name: Install and update chrony package
+ yum:
+ name: chrony
+ state: latest
+ when: use_chrony == true
+
+- name: Write NTP config file
+ template:
+ src: ntp.conf.j2
+ dest: /etc/ntp.conf
+ register: conf_written
+ when: use_ntp == true
+
+- name: Write chronyd config file
+ template:
+ src: chrony.conf.j2
+ dest: /etc/chrony.conf
+ register: conf_written
+ when: use_chrony == true
+
+- name: Start and enable NTP service
+ service:
+ name: ntpd
+ state: started
+ enabled: yes
+ when: use_ntp == true
+
+- name: Start and enable chronyd service
+ service:
+ name: chronyd
+ state: started
+ enabled: yes
+ when: use_chrony == true
+
+- name: Restart NTP service when conf changed
+ service:
+ name: ntpd
+ state: restarted
+ when:
+ - conf_written is changed
+ - use_ntp == true
+
+- name: Restart chronyd service when conf changed
+ service:
+ name: chronyd
+ state: restarted
+ when:
+ - conf_written is changed
+ - use_chrony == true
+
+- name: Check for firewalld
+ command: firewall-cmd --state
+ failed_when: false
+ register: firewalld_state
+
+- name: Allow NTP traffic through firewalld
+ firewalld:
+ service: ntp
+ permanent: true
+ immediate: true
+ state: enabled
+ when: firewalld_state.rc == 0
+
+- name: Allow NTP traffic through iptables
+ command: "{{ item }}"
+ with_items:
+ - "iptables -I INPUT -p udp -m udp --dport 123 -j ACCEPT"
+ - "service iptables save"
+ when: firewalld_state.rc != 0
--- /dev/null
+# {{ ansible_managed }}
+
+# Allow these networks to query this NTP server
+{% for lan in ntp_permitted_lans %}
+allow {{ lan }}
+{% endfor %}
+
+# Get time from these public hosts
+server 0.rhel.pool.ntp.org
+server 1.rhel.pool.ntp.org
+server 2.rhel.pool.ntp.org
+server 3.rhel.pool.ntp.org
+
+log measurements statistics tracking
+
+logdir /var/log/chrony
--- /dev/null
+# {{ ansible_managed }}
+
+# For more information about this file, see the man pages
+# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).
+
+driftfile /var/lib/ntp/drift
+
+# Permit time synchronization with our time source, but do not
+# permit the source to query or modify the service on this system.
+restrict default kod nomodify notrap nopeer noquery
+restrict -6 default kod nomodify notrap nopeer noquery
+
+# Permit all access over the loopback interface. This could
+# be tightened as well, but to do so would affect some of
+# the administrative functions.
+restrict 127.0.0.1
+restrict -6 ::1
+
+# Allow these networks to query this NTP server
+{% for lan in ntp_permitted_lans %}
+restrict {{ lan | ipaddr('network') }} mask {{ lan | ipaddr('netmask') }} nomodify notrap
+{% endfor %}
+
+# Get time from these public hosts
+server 0.rhel.pool.ntp.org
+server 1.rhel.pool.ntp.org
+server 2.rhel.pool.ntp.org
+server 3.rhel.pool.ntp.org
+
+includefile /etc/ntp/crypto/pw
+
+# Key file containing the keys and key identifiers used when operating
+# with symmetric key cryptography.
+keys /etc/ntp/keys
+
+# Enable writing of statistics records.
+statistics clockstats cryptostats loopstats peerstats sysstats rawstats
--- /dev/null
+Packages
+========
+
+This role is used to install and remove packages.
+
+Usage
++++++
+
+To install packages, use --extra-vars and pass in lists of packages you
+wish to install for both yum and apt based systems.
+
+For example::
+
+ ansible-playbook packages.yml --extra-vars='{"yum_packages": "foo", "apt_packages": ["foo", "bar"]}'
+
+To remove packages, use --extra-vars and pass in the list of packages you wish
+to remove while also including the ``cleanup`` variable.
+
+For example::
+
+ ansible-playbook packages.yml --extra-vars='{"yum_packages": "foo", "cleanup": true}'
+
+The following is an example of how you might accomplish this in a teuthology job::
+
+ tasks:
+ - ansible:
+ repo: https://github.com/ceph/ceph-cm-ansible.git
+ playbook: packages.yml
+ cleanup: true
+ vars:
+ yum_packages: "foo"
+ apt_packages:
+ - "foo"
+ - "bar"
--- /dev/null
+---
+# When cleanup is true the tasks being used might
+# perform cleanup steps if applicable.
+cleanup: false
+
+apt_packages: []
+
+yum_packages: []
--- /dev/null
+---
+- debug: msg="Performing cleanup related tasks..."
+
+- import_tasks: packages.yml
+ vars:
+ state: "absent"
--- /dev/null
+---
+# These are tasks which perform actions corresponding to the names of
+# the variables they use. For example, `disable_yum_repos` would actually
+# disable all repos defined in that list.
+- import_tasks: setup.yml
+ when: not cleanup
+
+# These are tasks which reverse the actions corresponding to the names of
+# the variables they use. For example, `disable_yum_repos` would actually
+# enable all repos defined in that list. The primary use for this is through
+# teuthology, so that you can tell a teuthology run to disable a set of repos
+# for the test run but then re-enable them during the teuthology cleanup process.
+- import_tasks: cleanup.yml
+ when: cleanup
--- /dev/null
+---
+# Install or remove the requested packages. The package lists and the
+# desired "state" ("present"/"absent") are supplied by the importing
+# task file (setup.yml / cleanup.yml). Both modules accept a list
+# natively, so the whole set is handled in one transaction instead of a
+# deprecated with_items loop.
+- name: Install or remove apt packages
+  apt:
+    update_cache: true
+    name: "{{ apt_packages }}"
+    state: "{{ state }}"
+  when: apt_packages|length > 0 and
+        ansible_pkg_mgr == "apt"
+
+- name: Install or remove yum packages
+  yum:
+    name: "{{ yum_packages }}"
+    state: "{{ state }}"
+  when: yum_packages|length > 0 and
+        ansible_pkg_mgr == "yum"
--- /dev/null
+---
+- import_tasks: packages.yml
+ vars:
+ state: "present"
--- /dev/null
+Paddles
+==========
+This role is used to configure a node to run paddles_. It is able to deploy via two methods:
+
+1. Using a Docker service to manage replicated containers
+2. Cloning paddles_ directly and using supervisord to run it
+
+Both use postgresql for the database and nginx as a reverse proxy.
+
+It has been tested on:
+
+- Ubuntu 18.04
+
+Usage
++++++
+
+Typically::
+
+ ansible-playbook -l 'paddles.*' ./paddles.yml
+
+Variables
++++++++++
+
+``paddles_user``: The system account to create and use (Default: paddles)
+
+``paddles_db_user``: The postgresql account to create and use (Default: paddles)
+
+``paddles_port``: The port to use for paddles internally (Default: 8080; external port is always 80)
+
+``paddles_statsd_host``: Optionally send metrics to a statsd host
+
+``paddles_statsd_prefix``: The prefix to use for statsd metrics
+
+``paddles_sentry_dsn``: Optionally send errors to a Sentry DSN
+
+``paddles_containerized``: Whether or not to deploy containers
+
+``paddles_container_image``: The container image to use for paddles
+
+``paddles_container_replicas``: The number of replica containers to run (Default: 10)
+
+``paddles_repo``: Optionally override the paddles git repo - not relevant for containers
+
+``paddles_branch``: Optionally override the paddles repo branch.
+For GitHub pull requests it is also possible to use refs/pull/X/merge or refs/pull/X/head
+instead of branch.
+
+``log_host``: The host where teuthology logs are stored
+
+.. _paddles: https://github.com/ceph/paddles
--- /dev/null
+---
+paddles_user: paddles
+paddles_db_user: paddles
+paddles_port: 8080
+paddles_statsd_host: ""
+paddles_statsd_prefix: ""
+paddles_sentry_dsn: ""
+
+paddles_containerized: false
+paddles_container_image: quay.io/ceph-infra/paddles:latest
+paddles_container_replicas: 10
+
+paddles_repo: https://github.com/ceph/paddles.git
+paddles_branch: main
+
+
+log_host: localhost
--- /dev/null
+---
+dependencies:
+ - role: secrets
+ - role: users
--- /dev/null
+---
+- name: Include package type specific vars.
+ include_vars: "apt_systems.yml"
+ tags:
+ - always
+
+- name: Install packages via apt
+ apt:
+ name: "{{ paddles_extra_packages|list }}"
+ state: latest
+ update_cache: yes
+ cache_valid_time: 600
+ tags:
+ - packages
+
+- name: Install docker packages
+ apt:
+ name: "{{ paddles_docker_packages|list }}"
+ state: latest
+ update_cache: yes
+ cache_valid_time: 600
+ when: paddles_containerized
+ tags:
+ - packages
--- /dev/null
+---
+- name: Include secrets
+ include_vars: "{{ secrets_path | mandatory }}/paddles.yml"
+ no_log: true
+ tags:
+ - always
+
+- name: Set repo location
+ set_fact:
+ paddles_repo_path: "/home/{{ paddles_user }}/paddles"
+ tags:
+ - always
+
+- name: Set paddles_address
+ set_fact:
+ paddles_address: http://{{ ansible_hostname }}/
+ when: paddles_address is not defined or not paddles_address.startswith('http')
+ tags:
+ - always
+
+- name: Set db_host
+ set_fact:
+ db_host: "{% if paddles_containerized %}{{ inventory_hostname }}{% else %}localhost{% endif %}"
+ tags:
+ - always
+
+- name: Set db_url
+ set_fact:
+ db_url: "postgresql+psycopg2://{{ paddles_db_user }}:{{ db_pass }}@{{ db_host }}/paddles"
+ no_log: true
+ tags:
+ - always
+
+- import_tasks: zypper_systems.yml
+ when: ansible_pkg_mgr == "zypper"
+
+- import_tasks: apt_systems.yml
+ when: ansible_pkg_mgr == "apt"
+
+# Yum systems support is not implemented yet.
+- import_tasks: yum_systems.yml
+ when: ansible_pkg_mgr == "yum"
+
+# Set up the paddles user
+- import_tasks: setup_user.yml
+
+# Set up the actual paddles project
+- import_tasks: setup_paddles.yml
+ when: not paddles_containerized
+
+# Set up the DB which paddles uses
+- import_tasks: setup_db.yml
+ tags:
+ - db
+
+# Set up docker if necessary
+- import_tasks: setup_docker.yml
+ when: paddles_containerized
+ tags:
+ - service
+
+- import_tasks: setup_postgres_exporter.yml
+ when: paddles_containerized
+ tags:
+ - service
+ - prometheus
+
+# Configure the system to run paddles as a daemon
+- import_tasks: setup_service.yml
+ when: not paddles_containerized
+ tags:
+ - service
+
+# Configure nginx as a reverse proxy
+- import_tasks: nginx.yml
+ when:
+ - not ansible_distribution is search("openSUSE")
--- /dev/null
+---
+- name: Disable default nginx config
+ file:
+ name: /etc/nginx/sites-enabled/default
+ state: absent
+
+- name: Ship nginx config
+ template:
+ src: nginx.conf
+ dest: /etc/nginx/sites-available/paddles
+
+- name: Enable nginx config
+ file:
+ src: /etc/nginx/sites-available/paddles
+ dest: /etc/nginx/sites-enabled/paddles
+ state: link
+
+- name: Disable apache httpd
+ service:
+ name: "{{ apache_service }}"
+ enabled: no
+ state: stopped
+ failed_when: false
+
+- name: Enable nginx
+ service:
+ name: nginx
+ enabled: yes
+ state: reloaded
+ changed_when: false
--- /dev/null
+---
+- name: Listen on all interfaces
+ postgresql_set:
+ name: listen_addresses
+ value: "*"
+ become_user: postgres
+ register: pg_listen
+
+- name: Restart postgres to listen on all interfaces
+ service:
+ name: postgresql
+ state: restarted
+ when: pg_listen is changed
+
+# Create the paddles database; "changed" here gates the one-time setup
+# steps below (user creation, pecan populate).
+- name: Create the postgresql database
+  postgresql_db:
+    name: paddles
+  become_user: postgres
+  register: create_db
+
+# NOTE(review): this runs only when the database was just created, so a
+# later change to db_pass is never applied to an existing installation's
+# role — confirm whether that is intended before relying on rotation.
+- name: Set up access to the database
+  postgresql_user:
+    db: paddles
+    name: "{{ paddles_db_user }}"
+    password: "{{ db_pass }}"
+  become_user: postgres
+  when: create_db is changed
+
+- name: Run pecan populate
+ command: ./virtualenv/bin/pecan populate prod.py
+ args:
+ chdir: "{{ paddles_repo_path }}"
+ become_user: "{{ paddles_user }}"
+ when:
+ - create_db is changed
+ - not paddles_containerized
+
+- name: Copy alembic config template to alembic.ini
+ command: cp ./alembic.ini.in alembic.ini
+ args:
+ creates: alembic.ini
+ chdir: "{{ paddles_repo_path }}"
+ register: alembic_ini
+ become_user: "{{ paddles_user }}"
+ when: not paddles_containerized
+
+- name: Update alembic.ini
+ lineinfile:
+ dest: "{{ paddles_repo_path }}/alembic.ini"
+ line: "sqlalchemy.url = {{ db_url }}"
+ regexp: "^sqlalchemy.url = "
+ when: not paddles_containerized
+
+- name: Set the alembic revision
+ shell: |
+ source virtualenv/bin/activate
+ alembic stamp head
+ args:
+ chdir: "{{ paddles_repo_path }}"
+ when:
+ - alembic_ini is changed
+ - not paddles_containerized
+ become_user: "{{ paddles_user }}"
--- /dev/null
+---
+- name: Add paddles_user to the docker group
+ user:
+ name: "{{ paddles_user }}"
+ append: yes
+ groups:
+ - docker
+
+- name: Install docker's python module
+ become_user: "{{ paddles_user }}"
+ pip:
+ name: docker
+ state: latest
+ executable: pip3
+ extra_args: --user
+
+- name: Init docker swarm
+ become_user: "{{ paddles_user }}"
+ docker_swarm:
+ state: present
+
+- name: Create secret for the database URL
+ become_user: "{{ paddles_user }}"
+ docker_secret:
+ name: paddles_sqlalchemy_url
+ data: "{{ db_url }}"
+
+- name: Pull the paddles container image
+ become_user: "{{ paddles_user }}"
+ docker_image:
+ name: "{{ paddles_container_image }}"
+ source: pull
+ register: image_pull
+
+- name: Get postgres hba conf file location
+ postgresql_info:
+ db: paddles
+ filter: settings
+ become_user: postgres
+ register: pg_info
+
+- name: Tell postgres to trust the Docker network
+ postgresql_pg_hba:
+ dest: "{{ pg_info.settings.hba_file.setting }}"
+ contype: host
+ users: all
+ databases: all
+ method: md5
+ source: "{{ ansible_docker_gwbridge.ipv4.address }}/{{ ansible_docker_gwbridge.ipv4.prefix }}"
+
+- name: Create docker swarm service
+ become_user: "{{ paddles_user }}"
+ docker_swarm_service:
+ name: paddles
+ state: present
+ replicas: "{{ paddles_container_replicas }}"
+ update_config:
+ parallelism: 1
+ delay: 10s
+ monitor: 10s
+ failure_action: rollback
+ rollback_config:
+ order: start-first
+ image: "{{ paddles_container_image }}"
+ resolve_image: true
+ force_update: "{{ image_pull.changed }}"
+ publish:
+ - published_port: "{{ paddles_port }}"
+ target_port: 8080
+ logging:
+ driver: journald
+ options:
+ tag: paddles
+ env:
+ - "PADDLES_ADDRESS={{ paddles_address }}"
+ - "PADDLES_SERVER_HOST=0.0.0.0"
+ - "SENTRY_DSN={{ paddles_sentry_dsn }}"
+ - "PADDLES_STATSD_HOST={{ paddles_statsd_host }}"
+ - "PADDLES_STATSD_PREFIX={{ paddles_statsd_prefix }}"
+ - "GUNICORN_CMD_ARGS=--workers=2 --max-requests=10000"
+ secrets:
+ - secret_name: paddles_sqlalchemy_url
+ filename: "/run/secrets/paddles_sqlalchemy_url"
+ healthcheck:
+ test: ["CMD", "curl", "--fail", "http://localhost:8080"]
+ interval: 1m
+ timeout: 5s
+ start_period: 10s
--- /dev/null
+---
+- name: Determine GitHub Pull Request
+ set_fact:
+ paddles_pull: "{{ paddles_branch | regex_replace( '^refs/pull/([^/]+)/.*$', '\\1') }}"
+
+- name: Clone the repo and checkout pull request branch
+ git:
+ repo: "{{ paddles_repo }}"
+ dest: "{{ paddles_repo_path }}"
+ version: "pull-{{ paddles_pull }}"
+ refspec: '+{{ paddles_branch }}:refs/remotes/origin/pull-{{ paddles_pull }}'
+ become_user: "{{ paddles_user }}"
+ tags:
+ - repos
+ when: paddles_pull is defined and paddles_pull != paddles_branch
+
+- name: Checkout the repo
+ git:
+ repo: "{{ paddles_repo }}"
+ dest: "{{ paddles_repo_path }}"
+ version: "{{ paddles_branch }}"
+ become_user: "{{ paddles_user }}"
+ tags:
+ - repos
+ when: paddles_pull is not defined or paddles_pull == paddles_branch
+
+- name: Install latest pip via pip
+ pip:
+ name: "pip"
+ state: "latest"
+ chdir: "{{ paddles_repo_path }}"
+ virtualenv_python: "python3"
+ virtualenv: "{{ paddles_repo_path }}/virtualenv"
+ become_user: "{{ paddles_user }}"
+- name: Install requirements via pip
+ pip:
+ chdir: "{{ paddles_repo_path }}"
+ requirements: "./requirements.txt"
+ virtualenv: "{{ paddles_repo_path }}/virtualenv"
+ become_user: "{{ paddles_user }}"
+
+- name: Run setup inside virtualenv
+ command: "./virtualenv/bin/python setup.py develop"
+ args:
+ chdir: "{{ paddles_repo_path }}"
+ changed_when: false
+ become_user: "{{ paddles_user }}"
+
+- name: Ship prod.py
+ template:
+ src: prod.py
+ dest: "{{ paddles_repo_path }}/prod.py"
+ owner: "{{ paddles_user }}"
+ group: "{{ paddles_user }}"
+ mode: 0755
+ register: prod_conf
+ tags:
+ - config
--- /dev/null
+---
+- name: Add postgres user to the docker group
+ user:
+ name: "postgres"
+ append: yes
+ groups:
+ - docker
+
+- name: Create secret for the database password
+ become_user: "{{ paddles_user }}"
+ docker_secret:
+ name: postgres_exporter_password
+ data: "{{ db_pass }}"
+
+- name: Create docker swarm service for postgres exporter
+ become_user: postgres
+ docker_swarm_service:
+ name: postgres-exporter
+ state: present
+ replicas: 1
+ update_config:
+ parallelism: 1
+ delay: 10s
+ monitor: 10s
+ failure_action: rollback
+ rollback_config:
+ order: start-first
+ image: "quay.io/prometheuscommunity/postgres-exporter:latest"
+ resolve_image: true
+ publish:
+ - published_port: 9187
+ target_port: 9187
+ logging:
+ driver: journald
+ options:
+ tag: prometheus-exporter
+ env:
+ - "DATA_SOURCE_URI={{ db_host }}"
+ - "DATA_SOURCE_USER={{ paddles_db_user }}"
+ - "DATA_SOURCE_PASS_FILE=/run/secrets/postgres_exporter_password"
+ secrets:
+ - secret_name: postgres_exporter_password
--- /dev/null
+---
+- name: Make sure supervisor config directory exists
+ file:
+ path: "{{ supervisor_conf_d }}"
+ state: directory
+ recurse: yes
+ mode: 0755
+
+- name: Ship supervisor config
+ template:
+ src: supervisor.conf
+ dest: "{{ supervisor_conf_d }}/paddles.{{ supervisor_conf_suffix }}"
+ mode: 0755
+ register: supervisor_conf
+
+- name: Read supervisord config
+ command: supervisorctl update
+ when: supervisor_conf is changed
+
+- name: Ensure paddles is running
+ supervisorctl:
+ name: paddles
+ state: started
+
+- name: Restart paddles if prod.py changed
+ supervisorctl:
+ name: paddles
+ state: restarted
+ when: prod_conf is defined and prod_conf is changed
+ tags:
+ - config
+
+- name: Wait for paddles to start
+ wait_for:
+ port: "{{ paddles_port }}"
--- /dev/null
+---
+- name: Create group
+ group:
+ name: "{{ paddles_user }}"
+ state: present
+ tags:
+ - user
+- name: Create user
+ user:
+ name: "{{ paddles_user }}"
+ state: present
+ shell: /bin/bash
+ tags:
+ - user
--- /dev/null
+---
+- name: Fail on yum systems as support is not implemented
+ fail:
+ msg: "yum systems are not supported at this time"
--- /dev/null
+---
+- name: Fail on zypper systems if paddles_containerized is set
+ fail:
+ msg: "'paddles_containerized' is not yet supported on zypper systems"
+ when: paddles_containerized
+
+- name: Include package type specific vars.
+ include_vars: "zypper_systems.yml"
+ tags:
+ - always
+
+- name: Install packages via zypper
+ zypper:
+ name: "{{ paddles_extra_packages|list }}"
+ state: latest
+ update_cache: yes
+ tags:
+ - packages
+
+- name: Enable and start database
+ service:
+ name: postgresql
+ state: started
+ enabled: yes
+
+- name: Enable and start supervisor
+ service:
+ name: supervisord
+ state: started
+ enabled: yes
+
+- name: Disable ProtectHome=true for supervisor
+ lineinfile:
+ path: "/etc/systemd/system/multi-user.target.wants/supervisord.service"
+ state: present
+ regexp: "^(ProtectHome=true.*)"
+ line: '#\1'
+ backrefs: yes
+
+- name: Reload supervisor
+ service:
+ name: supervisord
+ state: restarted
+ daemon_reload: yes
+
+- name: Setup hba_conf
+ lineinfile:
+ path: "/var/lib/pgsql/data/pg_hba.conf"
+ insertafter: "^#\\s+TYPE\\s+DATABASE\\s+USER\\s+ADDRESS\\s+METHOD.*"
+ line: "host paddles {{ paddles_user }} ::1/128 trust"
+
+- name: Reload database
+ service:
+ name: postgresql
+ state: reloaded
--- /dev/null
+server {
+ server_name {{ inventory_hostname }};
+ listen 80;
+ gzip on;
+ gzip_types text/plain application/json;
+ gzip_proxied any;
+ proxy_send_timeout 600;
+ proxy_connect_timeout 240;
+ location / {
+ proxy_pass http://127.0.0.1:{{ paddles_port }}/;
+ proxy_set_header Host $host;
+ proxy_set_header X-Real-IP $remote_addr;
+ }
+
+}
--- /dev/null
+# {{ ansible_managed }}
+from paddles.hooks import IsolatedTransactionHook
+from paddles import models
+from paddles.hooks.cors import CorsHook
+
+server = {
+ 'port': '8080',
+ 'host': '127.0.0.1'
+}
+
+address = '{{ paddles_address }}'
+job_log_href_templ = 'http://{{ log_host }}/teuthology/{run_name}/{job_id}/teuthology.log' # noqa
+default_latest_runs_count = 25
+
+sqlalchemy = {
+ 'url': '{{ db_url }}',
+ 'echo': True,
+ 'echo_pool': True,
+ 'pool_recycle': 3600,
+ 'encoding': 'utf-8'
+}
+
+app = {
+ 'root': 'paddles.controllers.root.RootController',
+ 'modules': ['paddles'],
+ 'template_path': '%(confdir)s/paddles/templates',
+ 'default_renderer': 'json',
+ 'guess_content_type_from_ext': False,
+ 'debug': False,
+ 'hooks': [
+ IsolatedTransactionHook(
+ models.start,
+ models.start_read_only,
+ models.commit,
+ models.rollback,
+ models.clear
+ ),
+ CorsHook(),
+ ],
+}
+
+logging = {
+ 'disable_existing_loggers': False,
+ 'loggers': {
+ 'root': {'level': 'INFO', 'handlers': ['console']},
+ 'paddles': {'level': 'DEBUG', 'handlers': ['console']},
+ 'sqlalchemy': {'level': 'WARN'},
+ 'py.warnings': {'handlers': ['console']},
+ '__force_dict__': True
+ },
+ 'handlers': {
+ 'console': {
+ 'level': 'DEBUG',
+ 'class': 'logging.StreamHandler',
+ 'formatter': 'simple'
+ }
+ },
+ 'formatters': {
+ 'simple': {
+ 'format': ('%(asctime)s %(levelname)-5.5s [%(name)s]'
+ ' %(message)s')
+ }
+ }
+}
--- /dev/null
+# {{ ansible_managed }}
+[program:paddles]
+user={{ paddles_user }}
+environment=HOME="/home/{{ paddles_user }}",USER="{{ paddles_user }}"
+directory=/home/{{ paddles_user }}/paddles
+command=/home/{{ paddles_user }}/paddles/virtualenv/bin/gunicorn_pecan -c gunicorn_config.py prod.py
+autostart=true
+autorestart=true
+redirect_stderr=true
+stdout_logfile = /home/{{ paddles_user }}/paddles.out.log
+stderr_logfile = /home/{{ paddles_user }}/paddles.err.log
--- /dev/null
+---
+paddles_extra_packages:
+ # The following is a requirement of ansible's postgresql module
+ - python3-psycopg2
+ # The following packages are requirements for running paddles
+ - git-all
+ - python3-dev
+ - python3-pip
+ - python3-virtualenv
+ - virtualenv
+ - postgresql
+ - postgresql-contrib
+ - postgresql-server-dev-all
+ - supervisor
+ # We use nginx to reverse-proxy
+ - nginx
+ - liblz4-tool
+
+paddles_docker_packages:
+ - docker.io
+ - python3-docker
+
+# We need this so we can disable apache2 to get out of the way of nginx
+apache_service: 'apache2'
+
+supervisor_conf_d: /etc/supervisor/conf.d
+supervisor_conf_suffix: conf
--- /dev/null
+---
+# RHEL/CentOS (yum) specific variables for the paddles role.
+paddles_extra_packages:
+  # The following is a requirement of ansible's postgresql module
+  - python-psycopg2
+  # The following packages are requirements for running paddles
+  - gcc
+  - git
+  - python3-devel
+  - python3-pip
+  - python3-virtualenv
+  - postgresql
+  - postgresql-contrib
+  - postgresql-devel
+  - postgresql-server
+  - supervisor
+  # We use nginx to reverse-proxy
+  - nginx
+
+# We need this so we can disable apache2 to get out of the way of nginx
+# NOTE(review): on RHEL-family systems Apache's service is usually 'httpd',
+# not 'apache2' -- confirm this value is intended here.
+apache_service: 'apache2'
+
+# Unlike Debian-family systems (/etc/supervisor/conf.d), supervisord on
+# RHEL-family systems reads per-program configs from /etc/supervisord.d.
+supervisor_conf_d: /etc/supervisord.d
+supervisor_conf_suffix: conf
--- /dev/null
+PCP
+===
+This role is used to configure a node to run PCP_.
+
+PCP's main function is to collect performance-related metrics. By default, this
+role will set up each node as a ``pcp_collector``. It is also capable of
+installing and configuring the necessary packages to act as a ``pcp_manager``,
+collecting data from all the ``pcp_collector`` nodes; and also as a ``pcp_web``
+host, providing various web UIs to display the data graphically.
+
+These distros should be fully supported:
+
+- CentOS 7
+- Ubuntu 14.04 (Trusty)
+
+These distros are supported as ``pcp_collector`` nodes:
+
+- CentOS 6
+- Debian 8
+- Fedora 22 (Only via ansible 2)
+
+.. _PCP: https://github.com/performancecopilot/pcp
+
+Variables
++++++++++
+
+Defaults for these variables are defined in ``roles/pcp/defaults/main.yml``.
+
+To use upstream-provided packages instead of the distro's packages, set::
+
+ upstream_repo: true
+
+To tell a given host to collect performance data using ``pmcd``, and to run
+``pmlogger`` to create archive logs::
+
+ pcp_collector: true
+
+To tell the host to aggregate data from other systems using ``pmmgr`` and
+corresponding ``pmlogger`` processes for each ``pcp_collector`` node::
+
+ pcp_manager: true
+
+To tell a ``pcp_manager`` host to use Avahi to auto-discover other hosts running PCP::
+
+ pcp_use_avahi: true
+
+To tell a ``pcp_manager`` host to probe hosts on its local network for the PCP service::
+
+ pcp_probe: true
+
+To tell a ``pcp_manager`` host to use a larger timeout when attempting to
+connect to hosts that it monitors (in seconds)::
+
+ pmcd_connect_timeout: 1
+
+To tell a ``pcp_manager`` host to retain full-resolution archives for a year
+(format is a `PCP time window`_)::
+
+ pmlogmerge_retain: "365days"
+
+To tell a ``pcp_manager`` host to delete reduced archives after two years
+(format is a `PCP time window`_)::
+
+ pmlogmerge_reduce: "730days"
+
+To tell a ``pcp_manager`` host to run PCP's various web UIs::
+
+ pcp_web: true
+
+
+.. _PCP time window: http://www.pcp.io/books/PCP_UAG/html/LE14729-PARENT.html
--- /dev/null
+---
+# Defaults for the pcp role; see the role README for per-variable details.
+# Whether or not to use upstream repos
+upstream_repo: false
+
+## PCP Collector options
+# Set the host up to collect data
+pcp_collector: true
+
+## PCP Manager options
+# Set the host up to be able to monitor other systems
+pcp_manager: false
+# Whether or not to use avahi to auto-discover hosts
+pcp_use_avahi: false
+# Whether or not to probe the local network to auto-discover hosts
+pcp_probe: false
+# PMCD_CONNECT_TIMEOUT in /etc/pcp/pmmgr/pmmgr.options
+# (kept small so unreachable hosts don't stall polling)
+pmcd_connect_timeout: "0.1"
+# How long to keep full-resolution archives before reducing to save space
+pmlogmerge_retain: "90days"
+# How long before deleting reduced archives
+pmlogmerge_reduce: "99999weeks"
+
+## PCP Web UI options
+# Set up the web UI
+pcp_web: false
--- /dev/null
+{
+ "id": null,
+ "title": "past 1h every 1m",
+ "originalTitle": "past 1h every 1m",
+ "tags": [],
+ "style": "light",
+ "timezone": "utc",
+ "editable": true,
+ "hideControls": false,
+ "sharedCrosshair": false,
+ "rows": [
+ {
+ "title": "load",
+ "height": "200px",
+ "editable": true,
+ "collapse": false,
+ "collapsable": true,
+ "panels": [
+ {
+ "span": 12,
+ "editable": true,
+ "type": "graph",
+ "x-axis": true,
+ "y-axis": true,
+ "scale": 1,
+ "y_formats": [
+ "short",
+ "short"
+ ],
+ "grid": {
+ "max": null,
+ "min": null,
+ "leftMax": null,
+ "rightMax": null,
+ "leftMin": null,
+ "rightMin": null,
+ "threshold1": null,
+ "threshold2": null,
+ "threshold1Color": "rgba(216, 200, 27, 0.27)",
+ "threshold2Color": "rgba(234, 112, 112, 0.22)"
+ },
+ "resolution": 100,
+ "lines": true,
+ "fill": 1,
+ "linewidth": 2,
+ "points": false,
+ "pointradius": 5,
+ "bars": false,
+ "stack": false,
+ "spyable": true,
+ "options": false,
+ "legend": {
+ "show": false,
+ "values": false,
+ "min": false,
+ "max": false,
+ "current": false,
+ "total": false,
+ "avg": false
+ },
+ "interactive": true,
+ "legend_counts": true,
+ "timezone": "utc",
+ "percentage": false,
+ "zerofill": true,
+ "nullPointMode": "connected",
+ "steppedLine": false,
+ "tooltip": {
+ "value_type": "cumulative",
+ "shared": false
+ },
+ "targets": [
+ {
+ "target": "*.kernel.all.load.1 minute"
+ }
+ ],
+ "aliasColors": {},
+ "title": "1-minute load average",
+ "id": 2,
+ "datasource": null,
+ "renderer": "flot",
+ "seriesOverrides": []
+ }
+ ],
+ "notice": false
+ },
+ {
+ "net": "demo2",
+ "height": "200px",
+ "editable": true,
+ "collapse": false,
+ "collapsable": true,
+ "panels": [
+ {
+ "span": 12,
+ "editable": true,
+ "type": "graph",
+ "x-axis": true,
+ "y-axis": true,
+ "scale": 1,
+ "y_formats": [
+ "short",
+ "short"
+ ],
+ "grid": {
+ "max": null,
+ "min": null,
+ "leftMax": null,
+ "rightMax": null,
+ "leftMin": null,
+ "rightMin": null,
+ "threshold1": null,
+ "threshold2": null,
+ "threshold1Color": "rgba(216, 200, 27, 0.27)",
+ "threshold2Color": "rgba(234, 112, 112, 0.22)"
+ },
+ "resolution": 100,
+ "lines": true,
+ "fill": 1,
+ "linewidth": 2,
+ "points": false,
+ "pointradius": 5,
+ "bars": false,
+ "stack": false,
+ "spyable": true,
+ "options": false,
+ "legend": {
+ "show": false,
+ "values": false,
+ "min": false,
+ "max": false,
+ "current": false,
+ "total": false,
+ "avg": false
+ },
+ "interactive": true,
+ "legend_counts": true,
+ "timezone": "utc",
+ "percentage": false,
+ "zerofill": true,
+ "nullPointMode": "connected",
+ "steppedLine": false,
+ "tooltip": {
+ "value_type": "cumulative",
+ "shared": false
+ },
+ "targets": [
+ {
+ "target": "*.network.interface.*.bytes.*"
+ }
+ ],
+ "aliasColors": {},
+ "title": "network i/o bytes/s",
+ "id": 3,
+ "datasource": null,
+ "renderer": "flot",
+ "seriesOverrides": [],
+ "links": []
+ }
+ ],
+ "notice": false
+ },
+ {
+ "disk": "demo3",
+ "height": "200px",
+ "editable": true,
+ "collapse": false,
+ "collapsable": true,
+ "panels": [
+ {
+ "span": 12,
+ "editable": true,
+ "type": "graph",
+ "x-axis": true,
+ "y-axis": true,
+ "scale": 1,
+ "y_formats": [
+ "short",
+ "short"
+ ],
+ "grid": {
+ "max": null,
+ "min": null,
+ "leftMax": null,
+ "rightMax": null,
+ "leftMin": null,
+ "rightMin": null,
+ "threshold1": null,
+ "threshold2": null,
+ "threshold1Color": "rgba(216, 200, 27, 0.27)",
+ "threshold2Color": "rgba(234, 112, 112, 0.22)"
+ },
+ "resolution": 100,
+ "lines": true,
+ "fill": 1,
+ "linewidth": 2,
+ "points": false,
+ "pointradius": 5,
+ "bars": false,
+ "stack": false,
+ "spyable": true,
+ "options": false,
+ "legend": {
+ "show": false,
+ "values": false,
+ "min": false,
+ "max": false,
+ "current": false,
+ "total": false,
+ "avg": false
+ },
+ "interactive": true,
+ "legend_counts": true,
+ "timezone": "utc",
+ "percentage": false,
+ "zerofill": true,
+ "nullPointMode": "connected",
+ "steppedLine": false,
+ "tooltip": {
+ "value_type": "cumulative",
+ "shared": false
+ },
+ "targets": [
+ {
+ "target": "*.disk.all.read_bytes"
+ },
+ {
+ "target": "*.disk.all.write_bytes"
+ }
+ ],
+ "aliasColors": {},
+ "title": "disk read/write kbytes/s",
+ "id": 4,
+ "datasource": null,
+ "renderer": "flot",
+ "seriesOverrides": []
+ }
+ ],
+ "notice": false
+ },
+ {
+ "mem": "demo3",
+ "height": "200px",
+ "editable": true,
+ "collapse": false,
+ "collapsable": true,
+ "panels": [
+ {
+ "span": 12,
+ "editable": true,
+ "type": "graph",
+ "x-axis": true,
+ "y-axis": true,
+ "scale": 1,
+ "y_formats": [
+ "short",
+ "short"
+ ],
+ "grid": {
+ "max": null,
+ "min": null,
+ "leftMax": null,
+ "rightMax": null,
+ "leftMin": null,
+ "rightMin": null,
+ "threshold1": null,
+ "threshold2": null,
+ "threshold1Color": "rgba(216, 200, 27, 0.27)",
+ "threshold2Color": "rgba(234, 112, 112, 0.22)"
+ },
+ "resolution": 100,
+ "lines": true,
+ "fill": 1,
+ "linewidth": 2,
+ "points": false,
+ "pointradius": 5,
+ "bars": false,
+ "stack": false,
+ "spyable": true,
+ "options": false,
+ "legend": {
+ "show": false,
+ "values": false,
+ "min": false,
+ "max": false,
+ "current": false,
+ "total": false,
+ "avg": false
+ },
+ "interactive": true,
+ "legend_counts": true,
+ "timezone": "utc",
+ "percentage": false,
+ "zerofill": true,
+ "nullPointMode": "connected",
+ "steppedLine": false,
+ "tooltip": {
+ "value_type": "cumulative",
+ "shared": false
+ },
+ "targets": [
+ {
+ "target": "*.mem.util.available"
+ },
+ {
+ "target": "*.mem.util.used"
+ }
+ ],
+ "aliasColors": {},
+ "title": "available/used memory kbytes",
+ "id": 5,
+ "datasource": null,
+ "renderer": "flot",
+ "seriesOverrides": []
+ }
+ ],
+ "notice": false
+ }
+ ],
+ "nav": [
+ {
+ "type": "timepicker",
+ "collapse": false,
+ "notice": false,
+ "enable": true,
+ "status": "Stable",
+ "time_options": [
+ "5m",
+ "15m",
+ "1h",
+ "6h",
+ "12h",
+ "24h",
+ "2d",
+ "7d",
+ "30d"
+ ],
+ "refresh_intervals": [
+ "5s",
+ "10s",
+ "30s",
+ "1m",
+ "5m",
+ "15m",
+ "30m",
+ "1h",
+ "2h",
+ "1d"
+ ],
+ "now": true
+ }
+ ],
+ "time": {
+ "from": "now-1h",
+ "to": "now"
+ },
+ "templating": {
+ "list": [],
+ "enable": false
+ },
+ "annotations": {
+ "list": []
+ },
+ "refresh": "1m",
+ "version": 6,
+ "hideAllLegends": false
+}
--- /dev/null
+/*jslint indent: 2 nomen: true */
+"use strict";
+
+// This is what's in scope at this point
+var window, document, ARGS, $, jQuery, moment, kbn, _;
+
+var USAGE = {
+ title: "Invalid or missing argument",
+ content: "Arguments taken by this dashboard are:\n\n" +
+ "``hosts``: A comma-separated list of hosts to monitor (required)\n\n" +
+ "``title``: The title of the dashboard (default: the hosts list)\n\n" +
+ "``time_from``: The start of the time window (default: 'now-1h')\n\n" +
+ "``time_to``: The end of the time window (ignored if time_from is not set)\n\n" +
+ "``refresh``: How often to refresh the dashboard (default: never)\n\n" +
+ "All arguments are to be passed as a [query string](https://en.wikipedia.org/wiki/Query_string)",
+ error: true,
+};
+
+// This is the base configuration for the dashboard; build_dashboard()
+// deep-copies it before filling in rows, so it is never mutated.
+var dashboard_stub = {
+  rows: [],
+  services: {},
+  time: {
+    from: "now-1h",
+    to: "now",
+  },
+  timezone: "utc",
+  // NOTE(review): this is the string "true", not the boolean true -- other
+  // flags in this file use real booleans; confirm Grafana coerces it.
+  editable: "true",
+  nav: [
+    {
+      type: "timepicker",
+      collapse: false,
+      notice: false,
+      enable: true,
+      status: "Stable",
+      time_options:
+        ["5m", "15m", "1h", "6h", "12h", "24h", "2d", "7d", "30d"],
+      refresh_intervals:
+        ["1m", "5m", "15m", "30m", "1h", "6h", "1d"],
+      now: false,
+    },
+  ],
+};
+
+// This is the base configuration for each row
+var row_stub = {
+ showTitle: true,
+ height: '300px',
+ panels: [],
+};
+
+// This is the base configuration for each panel
+var graph_panel_stub = {
+ type: 'graph',
+ editable: true,
+ collapse: false,
+ collapsable: true,
+ legend_counts: true,
+ legend: {
+ show: false,
+ values: false,
+ min: false,
+ max: false,
+ current: false,
+ total: false,
+ avg: false
+ },
+ spyable: true,
+ options: false,
+};
+
+// This represents each of the panels that we want.
+// Each row may contain multiple panels.
+// Each panel has a title and one or more metrics.
+var dashboard_rows = [
+ {
+ title: "load (1 minute)",
+ panels: [
+ {
+ metrics: ["kernel.all.load.1 minute"],
+ },
+ ],
+ },
+ {
+ title: "network (bytes/s)",
+ panels: [
+ // We use e* here to select only Ethernet interfaces and ignore
+ // loopbacks
+ {
+ title: "in",
+ metrics: ["network.interface.in.bytes.e*"],
+ span: 6,
+ },
+ {
+ title: "out",
+ metrics: ["network.interface.out.bytes.e*"],
+ span: 6,
+ },
+ ],
+ },
+ {
+ title: "disk (kbytes)",
+ panels: [
+ {
+ title: "read",
+ metrics: ["disk.all.read_bytes"],
+ span: 6,
+ },
+ {
+ title: "write",
+ metrics: ["disk.all.write_bytes"],
+ span: 6,
+ },
+ ],
+ },
+ {
+ title: "memory (kbytes)",
+ panels: [
+ {
+ title: "free",
+ metrics: ["mem.util.free"],
+ span: 6,
+ },
+ {
+ title: "used",
+ metrics: ["mem.util.used"],
+ span: 6,
+ },
+ ],
+ },
+];
+
+var text_panel_stub = {
+ title: "",
+ type: "text",
+ mode: "markdown",
+ content: "",
+ error: false,
+};
+
+function get_text_panel(values) {
+ // values is a hash that optionally overrides text_panel_stub's values.
+ var panel;
+ panel = $.extend(true, text_panel_stub, values);
+ return panel;
+}
+
+function set_targets(rows_base, hosts) {
+ // Now let's flesh out our row values. For each row we want, we need to
+ // create a set of 'targets' which consist of wildcarded host values
+ // concatenated with each metric we want.
+ var i_row, i_panel, i_metric, i_host, row_templ, panel, metrics, metric, host;
+ for (i_row = 0; i_row < rows_base.length; i_row += 1) {
+ row_templ = rows_base[i_row];
+ for (i_panel = 0; i_panel < row_templ.panels.length; i_panel += 1) {
+ panel = row_templ.panels[i_panel];
+ panel.targets = [];
+ metrics = panel.metrics;
+ for (i_metric = 0; i_metric < metrics.length; i_metric += 1) {
+ metric = metrics[i_metric];
+ for (i_host = 0; i_host < hosts.length; i_host += 1) {
+ host = hosts[i_host];
+ panel.targets.push(
+ {target: '*' + host + '*.' + metric}
+ );
+ }
+ }
+ }
+ }
+ return rows_base;
+}
+
+function build_dashboard(rows_base) {
+ var dashboard, i_row, row_templ, row, i_panel, panel;
+ dashboard = $.extend(true, {}, dashboard_stub);
+ for (i_row = 0; i_row < rows_base.length; i_row += 1) {
+ row_templ = rows_base[i_row];
+ row = $.extend(true, {}, row_stub);
+ row.title = row_templ.title;
+ for (i_panel = 0; i_panel < row_templ.panels.length; i_panel += 1) {
+ panel = $.extend(true, {}, graph_panel_stub, row_templ.panels[i_panel]);
+ row.panels.push(panel);
+ }
+ dashboard.rows.push(row);
+ }
+ return dashboard;
+}
+
+function main(callback) {
+  // Entry point invoked by Grafana's scripted-dashboard loader. Builds the
+  // dashboard from the URL query-string arguments (ARGS) and hands it to
+  // `callback` once a probe request to '/' completes.
+  var dashboard, hosts, title, rows, panel;
+  if (!_.isUndefined(ARGS.hosts)) {
+    hosts = ARGS.hosts.split(',');
+    // We provide a default title based on the hosts arg, but it may be
+    // overridden via the title arg
+    title = hosts.join(', ');
+    rows = set_targets(dashboard_rows, hosts);
+  } else {
+    // No hosts given: render a single text panel with usage instructions.
+    title = 'usage';
+    panel = get_text_panel(USAGE);
+    rows = [{
+      title: "error",
+      panels: [panel],
+    }];
+  }
+  dashboard = build_dashboard(rows);
+  if (!_.isUndefined(ARGS.refresh)) {
+    dashboard.refresh = ARGS.refresh;
+  }
+  // time_to is deliberately ignored unless time_from was also supplied.
+  if (!_.isUndefined(ARGS.time_from)) {
+    dashboard.time.from = ARGS.time_from;
+    if (!_.isUndefined(ARGS.time_to)) {
+      dashboard.time.to = ARGS.time_to;
+    }
+  }
+  if (!_.isUndefined(ARGS.title)) {
+    title = ARGS.title;
+  }
+  dashboard.title = title;
+
+  // NOTE(review): the GET on '/' appears to only delay the callback until
+  // the server responds; the response body is unused -- confirm intended.
+  $.ajax({
+    method: 'GET',
+    url: '/'
+  })
+  .done(function () {
+    callback(dashboard);
+  });
+}
+
+return main;
--- /dev/null
+---
+# Refresh the apt package cache; a no-op on non-apt systems.
+- name: Update apt cache
+  apt:
+    update_cache: yes
+  when:
+    ansible_pkg_mgr == "apt"
--- /dev/null
+---
+# Install PCP and ensure its collector daemons (pmcd, pmlogger) are enabled
+# and running. The install task is repeated per package manager; only the
+# task matching ansible_pkg_mgr runs, the others are skipped.
+- name: Install pcp
+  apt:
+    name: "{{ pcp_package }}"
+    state: latest
+  register: install_pcp_apt
+  when:
+    ansible_pkg_mgr == "apt"
+
+- name: Install pcp
+  yum:
+    name: "{{ pcp_package }}"
+    state: latest
+  register: install_pcp_yum
+  when:
+    ansible_pkg_mgr == "yum"
+
+- name: Install pcp
+  dnf:
+    name: "{{ pcp_package }}"
+    state: latest
+  register: install_pcp_dnf
+  when:
+    ansible_pkg_mgr == "dnf"
+
+# NOTE(review): "permissons" looks like a typo for "permissions", but the
+# task file must actually be named permissons.yml for this import to work;
+# rename both together if fixing.
+- import_tasks: permissons.yml
+
+# Restart/enable the daemons only when a package install/upgrade actually
+# changed something ('is changed' is false for skipped registers).
+- name: Restart pcp
+  service:
+    name: "{{ pmcd_service }}"
+    state: restarted
+    enabled: yes
+  when:
+    install_pcp_apt is changed or
+    install_pcp_yum is changed or
+    install_pcp_dnf is changed
+
+- name: Restart pmlogger
+  service:
+    name: "{{ pmlogger_service }}"
+    state: restarted
+    enabled: yes
+  when:
+    install_pcp_apt is changed or
+    install_pcp_yum is changed or
+    install_pcp_dnf is changed
--- /dev/null
+---
+# Entry point for the pcp role: load per-package-manager vars, optionally
+# configure the upstream repo, then apply whichever of the collector /
+# manager / web profiles are enabled for this host.
+- name: Include package type specific vars.
+  include_vars: "{{ ansible_pkg_mgr }}_systems.yml"
+  tags:
+    - always
+
+- name: Set up upstream repo
+  import_tasks: repo.yml
+  when:
+    upstream_repo|bool == true
+  tags:
+    - repo
+
+# repo.yml already refreshes the apt cache, so only do it here when the
+# upstream repo path was skipped.
+- import_tasks: apt_update.yml
+  when:
+    upstream_repo|bool == false
+  tags:
+    - always
+
+- name: Set up as collector
+  import_tasks: collector.yml
+  when:
+    pcp_collector|bool == true
+  tags:
+    - collector
+
+- name: Set up as manager
+  import_tasks: manager.yml
+  when:
+    pcp_manager|bool == true
+  tags:
+    - manager
+
+- name: Set up web UI
+  import_tasks: web.yml
+  when:
+    pcp_web|bool == true
+  tags:
+    - web
--- /dev/null
+---
+# Configure this host as a PCP "manager": it runs pmmgr, which spawns a
+# pmlogger per monitored host, optionally discovering targets via Avahi or
+# by probing the local network.
+- name: Install avahi
+  apt:
+    name: "{{ avahi_package }}"
+    state: latest
+  when:
+    ansible_pkg_mgr == "apt" and
+    pcp_use_avahi|bool == true
+
+- name: Install avahi
+  yum:
+    name: "{{ avahi_package }}"
+    state: latest
+  when:
+    ansible_pkg_mgr == "yum" and
+    pcp_use_avahi|bool == true
+
+- name: Install pcp-manager
+  apt:
+    name: "{{ pcp_manager_package }}"
+    state: latest
+  when:
+    ansible_pkg_mgr == "apt"
+  register: install_pmmgr_apt
+
+- name: Install pcp-manager
+  yum:
+    name: "{{ pcp_manager_package }}"
+    state: latest
+  when:
+    ansible_pkg_mgr == "yum"
+  register: install_pmmgr_yum
+
+- name: Enable pmmgr
+  service:
+    name: "{{ pmmgr_service }}"
+    enabled: yes
+
+# Build a Jinja-evaluable list literal of all hosts in the 'pcp' inventory
+# group; the target-host template iterates over it.
+- set_fact:
+    pcp_target_hosts: "[{% for host in groups.pcp %}'{{ host }}',{% endfor %}]"
+
+- name: Write target-host
+  template:
+    src: target-host
+    dest: /etc/pcp/pmmgr/target-host
+    owner: root
+    group: root
+    mode: 0644
+  register: target_host
+
+- set_fact:
+    network_and_netmask: "{{ ansible_default_ipv4.network }}/{{ ansible_default_ipv4.netmask }}"
+
+- set_fact:
+    # ipaddr('net') converts a 'network/netmask' string to 'network/CIDR' format
+    network_and_cidr: "{{ network_and_netmask|ipaddr('net') }}"
+
+- name: Write target-discovery
+  template:
+    src: target-discovery
+    dest: /etc/pcp/pmmgr/target-discovery
+    owner: root
+    group: root
+    mode: 0644
+  register: target_discovery
+
+- import_tasks: permissons.yml
+
+# This greatly speeds up polling for hosts that are down. A duplicate task
+# earlier in this file used to hard-code PMCD_CONNECT_TIMEOUT=0.1 (the
+# variable's default) before this one overwrote it; it was removed so this
+# parameterized, registered task is the single source of truth and the
+# restart below fires when the value changes.
+- name: Set PMCD_CONNECT_TIMEOUT in pmmgr.options
+  lineinfile:
+    dest: /etc/pcp/pmmgr/pmmgr.options
+    regexp: "^PMCD_CONNECT_TIMEOUT="
+    line: "PMCD_CONNECT_TIMEOUT={{ pmcd_connect_timeout }}"
+  register: pmmgr_options
+
+- name: Set /etc/pcp/pmmgr/pmlogmerge-retain
+  copy:
+    dest: /etc/pcp/pmmgr/pmlogmerge-retain
+    content: "{{ pmlogmerge_retain }}"
+    owner: root
+    group: root
+    mode: 0644
+  register: update_pmlogmerge_retain
+
+- name: Set /etc/pcp/pmmgr/pmlogmerge-reduce
+  copy:
+    dest: /etc/pcp/pmmgr/pmlogmerge-reduce
+    content: "{{ pmlogmerge_reduce }}"
+    owner: root
+    group: root
+    mode: 0644
+  register: update_pmlogmerge_reduce
+
+# Restart only when an install or any managed config file actually changed.
+- name: Restart pmmgr
+  service:
+    name: "{{ pmmgr_service }}"
+    state: restarted
+    enabled: yes
+  when:
+    install_pmmgr_apt is changed or
+    install_pmmgr_yum is changed or
+    target_host is changed or
+    target_discovery is changed or
+    pmmgr_options is changed or
+    update_pmlogmerge_retain is changed or
+    update_pmlogmerge_reduce is changed
--- /dev/null
+---
+# The PCP daemons run as the pcp user and must be able to write their logs.
+- name: Ensure /var/log/pcp is owned by pcp
+  file:
+    path: /var/log/pcp
+    owner: "{{ pcp_user }}"
+    group: "{{ pcp_user }}"
+    recurse: yes
+  # Deliberately best-effort: failures here are ignored, see
+  # http://tracker.ceph.com/issues/16119
+  failed_when: false
--- /dev/null
+---
+# Configure the upstream PCP package repositories and upgrade any installed
+# pcp packages to the upstream versions.
+# NOTE(review): Bintray (dl.bintray.com / bintray.com) was shut down in
+# 2021; these URLs likely need to point at PCP's current package hosting.
+- name: Add upstream apt repo
+  copy:
+    content: "{{ upstream_repo_source }}"
+    dest: /etc/apt/sources.list.d/pcp.list
+  when:
+    ansible_pkg_mgr == "apt"
+
+- name: Add upstream GPG key to apt
+  apt_key:
+    url: https://bintray.com/user/downloadSubjectPublicKey?username=pcp
+    keyring: /etc/apt/trusted.gpg.d/pcp.gpg
+    state: present
+    validate_certs: true
+  when:
+    ansible_pkg_mgr == "apt"
+
+- name: Add upstream yum repo
+  get_url:
+    url: "{{ upstream_repo_url }}"
+    dest: /etc/yum.repos.d/pcp.repo
+  when:
+    ansible_pkg_mgr == "yum"
+
+- name: Add upstream GPG key to rpm
+  rpm_key:
+    key: https://bintray.com/user/downloadSubjectPublicKey?username=pcp
+    state: present
+    validate_certs: true
+  when:
+    ansible_pkg_mgr == "yum"
+
+- import_tasks: apt_update.yml
+
+# Quote the package pattern so the shell passes it through literally instead
+# of glob-expanding it against files in the working directory.
+- name: Ensure packages are updated (apt)
+  shell: "DEBIAN_FRONTEND=noninteractive apt -y install --only-upgrade '.*pcp.*'"
+  when:
+    ansible_pkg_mgr == "apt"
+
+# yum needs -y to run unattended; without it the update prompts for
+# confirmation and aborts when run under ansible.
+- name: Ensure packages are updated (yum)
+  shell: "yum -y update '*pcp*'"
+  when:
+    ansible_pkg_mgr == "yum"
--- /dev/null
+---
+# Set up PCP's web UI: pmwebd plus the bundled web apps (grafana, etc.).
+- name: Fail when on Ubuntu
+  fail:
+    msg: "pcp-webapi is only available when using upstream packages. Set upstream_repo to true."
+  when: ansible_distribution == "Ubuntu" and upstream_repo|bool != true
+
+# NOTE: ansible writes a register variable even when the task is skipped, so
+# the yum and apt install tasks must use distinct register names; the
+# original shared 'install_pcp_webapi' meant the skipped task clobbered the
+# real result and the restart at the bottom could never fire.
+- name: Install pcp-webapi
+  yum:
+    name: "{{ pcp_webapi_package }}"
+    state: latest
+  register: install_pcp_webapi_yum
+  when: ansible_pkg_mgr == "yum"
+
+- name: Install pcp-webjs
+  yum:
+    name: "{{ pcp_webjs_package }}"
+    state: latest
+  register: install_pcp_webjs
+  when: ansible_pkg_mgr == "yum"
+
+- name: Install pcp-webapi
+  apt:
+    name: "{{ pcp_webapi_package }}"
+    state: latest
+  register: install_pcp_webapi_apt
+  when: ansible_pkg_mgr == "apt"
+
+- name: Enable pmwebd
+  service:
+    name: "{{ pmwebd_service }}"
+    enabled: yes
+  register: enable_pmwebd
+
+- name: Ship dashboard
+  copy:
+    src: "../files/{{ item }}"
+    dest: "/usr/share/pcp/webapps/grafana/app/dashboards/"
+    owner: root
+    group: root
+    mode: 0644
+  with_items:
+    - 1h1m.json
+    - index.js
+  tags:
+    - dashboards
+
+# Restart the web daemon itself (the original mistakenly restarted
+# pmmgr_service here instead of pmwebd_service).
+- name: Start pmwebd
+  service:
+    name: "{{ pmwebd_service }}"
+    state: restarted
+  when:
+    install_pcp_webapi_yum is changed or
+    install_pcp_webapi_apt is changed or
+    install_pcp_webjs is changed or
+    enable_pmwebd is changed
--- /dev/null
+# {{ ansible_managed }}
+{% if pcp_use_avahi %}
+avahi,timeout=5
+{% endif %}
+{% if pcp_probe %}
+probe={{ network_and_cidr }},maxThreads=256
+{% endif %}
--- /dev/null
+# {{ ansible_managed }}
+{% for host in pcp_target_hosts %}
+{{ host }}
+{% endfor %}
--- /dev/null
+---
+# Debian/Ubuntu (apt) specific variables for the pcp role.
+# NOTE(review): Bintray was shut down in 2021; this repo source (and the GPG
+# key URL used in repo.yml) likely need new upstream locations.
+upstream_repo_source: "deb https://dl.bintray.com/pcp/trusty {{ ansible_distribution_release }} main"
+pcp_user: pcp
+pcp_package: pcp
+pmcd_service: pmcd
+pmlogger_service: pmlogger
+# Upstream ships a separate pcp-manager package; distro packages bundle it.
+pcp_manager_package: "{% if upstream_repo %}pcp-manager{% else %}pcp{% endif %}"
+pmmgr_service: pmmgr
+avahi_package: avahi-daemon
+pcp_webapi_package: pcp-webapi
+pmwebd_service: pmwebd
--- /dev/null
+---
+# Variables for dnf-based systems (Fedora) for the pcp role.
+pcp_user: pcp
+pcp_package: pcp
+pmcd_service: pmcd
+pmlogger_service: pmlogger
+pcp_manager_package: pcp-manager
+pmmgr_service: pmmgr
+avahi_package: avahi
+pcp_webapi_package: pcp-webapi
+pcp_webjs_package: pcp-webjs
+pmwebd_service: pmwebd
--- /dev/null
+---
+# Variables for yum-based systems (CentOS/RHEL/older Fedora) for the pcp
+# role. The repo URL maps the distro name to bintray's 'f'/'el' prefixes.
+# NOTE(review): Bintray was shut down in 2021; this URL likely needs a new
+# upstream location.
+upstream_repo_url: "https://bintray.com/pcp/{{ {'Fedora': 'f', 'CentOS': 'el', 'RedHat': 'el'}[ansible_distribution] }}{{ ansible_distribution_major_version }}/rpm"
+pcp_user: pcp
+pcp_package: pcp
+pmcd_service: pmcd
+pmlogger_service: pmlogger
+pcp_manager_package: pcp-manager
+pmmgr_service: pmmgr
+avahi_package: avahi
+pcp_webapi_package: pcp-webapi
+pcp_webjs_package: pcp-webjs
+pmwebd_service: pmwebd
--- /dev/null
+public_facing
+=============
+
+This role is used to manage the various public-facing hosts we have. Each host has various configs not managed by the ``common`` role. This playbook aims to:
+
+- Provide automation in the event of disaster recovery
+- Automate repeatable tasks
+- Automate 'one-off' host or service nuances
+
+Usage
++++++
+
+Example::
+
+ ansible-playbook public_facing.yml --limit="download.ceph.com"
+
+Variables
++++++++++
+
+Defaults
+--------
+Defined in ``roles/public_facing/defaults/main.yml``. Override these in the ansible inventory ``host_vars`` file.
+
+``use_ufw: false`` specifies whether an Ubuntu host should use UFW_
+
+``f2b_ignoreip: "127.0.0.1"``
+``f2b_bantime: "43200"``
+``f2b_findtime: "900"``
+``f2b_maxretry: 5``
+
+``use_fail2ban: true`` specifies whether a host should use fail2ban_
+
+``f2b_services: {}`` is a dictionary listing services fail2ban should monitor. See example below::
+
+ f2b_services:
+ sshd:
+ enabled: "true"
+ port: "22"
+ maxretry: 3
+ findtime: "3600" # 1hr
+ filter: "sshd"
+ logpath: "{{ sshd_logpath }}"
+ sshd-ddos:
+ enabled: "true"
+ port: "22"
+ maxretry: 3
+ filter: "sshd-ddos"
+ logpath: "{{ sshd_logpath }}"
+ bantime: -1 # optionally set in host_vars
+
+ # Note: sshd_logpath gets defined automatically in roles/public_facing/tasks/fail2ban.yml
+
+host_vars
+---------
+If required, define these in your ansible inventory ``host_vars`` file.
+
+``ufw_allowed_ports: []`` should be a list of ports you want UFW to allow traffic through. You may optionally define a ``source_ip`` by adding ``:1.2.3.4`` after the port. List items must be double-quoted due to the way the task processes stdout of ``ufw status``. Example::
+
+ ufw_allowed_ports:
+ - "22"
+ - "80"
+ - "443"
+ - "3306:1.2.3.4"
+
+``f2b_filters: {}`` is a dictionary of additional filters fail2ban should use. For example, our status portal running Cachet has an additional fail2ban service monitoring repeated login attempts to the admin portal. ``maxlines`` is an optional variable. See filter example::
+
+ f2b_filters:
+ apache-cachet:
+ failregex: "<HOST> .*GET /auth/login.*$"
+ example-filter:
+ failregex: "<HOST> .*foo$"
+ maxlines: 3
+
+Common Tasks
+++++++++++++
+
+These are tasks that are applicable to all our public-facing hosts.
+
+UFW
+---
+At the time of this writing, we only have one public-facing host that doesn't run Ubuntu -- the nameserver. Its firewall is managed in the ``nameserver`` role.
+
+Despite having network port ACLs defined for each host in our cloud provider's interface, enabling a firewall local to the system will allow us to block abusive IPs using fail2ban.
+
+fail2ban
+--------
+If ``use_fail2ban`` is set to ``true`` this role will install, configure, and enable fail2ban.
+
+To-Do
++++++
+
+status.sepia.ceph.com
+---------------------
+
+ - Install and update Cachet_?
+
+.. _UFW: https://wiki.ubuntu.com/UncomplicatedFirewall
+.. _fail2ban: http://www.fail2ban.org/wiki/index.php/Main_Page
+.. _Cachet: https://cachethq.io
--- /dev/null
+---
+## Any of these vars can be overridden in inventory host_vars.
+
+# Don't use ufw by default.
+use_ufw: false
+
+# Default to allow SSH traffic.
+ufw_allowed_ports:
+ - "22"
+
+# Use fail2ban by default
+use_fail2ban: true
+
+# Defaults for global fail2ban overrides in /etc/fail2ban/jail.local
+# Override in ansible inventory host_vars, group_vars, or some can be
+# overridden by service files in the f2b_services dict. See README.
+f2b_ignoreip: "127.0.0.1"
+f2b_bantime: "43200" # 12 hours
+f2b_findtime: "900" # 15 minutes
+f2b_maxretry: 5
+
+# Default fail2ban services to block. This can be overridden in ansible
+# inventory group_vars or host_vars.
+f2b_services:
+ sshd:
+ enabled: "true"
+ port: "22"
+ maxretry: 3
+ findtime: "3600" # 1hr
+ filter: "sshd"
+ logpath: "{{ sshd_logpath }}"
+ sshd-ddos:
+ enabled: "true"
+ port: "22"
+ maxretry: 3
+ filter: "sshd-ddos"
+ logpath: "{{ sshd_logpath }}"
--- /dev/null
+---
+# Restart fail2ban
+- name: restart fail2ban
+ service:
+ name: fail2ban
+ state: restarted
+
+# Reload fail2ban
+- name: reload fail2ban
+ service:
+ name: fail2ban
+ state: reloaded
+
+# Restart sshd
+- name: restart sshd
+ service:
+ name: sshd
+ state: restarted
--- /dev/null
+---
+# Host-specific tasks for download.ceph.com: log rotation, the 'signer' user
+# used by ceph-build's sync-push, the rsync-only 'bitergia' metrics user, a
+# timestamp cron, and the letsencrypt/nginx setup.
+- name: Put logrotate config in place
+  template:
+    src: templates/download.ceph.com/logrotate.j2
+    dest: /etc/logrotate.d/download.ceph.com
+
+# Used for pushing upstream builds
+# https://github.com/ceph/ceph-build/blob/main/scripts/sync-push
+- name: Add signer user
+  user:
+    name: signer
+
+# signer_pubkey defined in inventory host_vars
+- name: Update signer user's authorized_keys
+  authorized_key:
+    user: signer
+    state: present
+    key: "{{ signer_pubkey }}"
+
+# Used to rsync download.ceph.com http logs and compile metrics
+# for metrics.ceph.com
+- name: Create Bitergia user
+  user:
+    name: bitergia
+    # 'adm' group membership grants read access to logs
+    groups: adm
+
+# bitergia_pubkey defined in inventory host_vars
+- name: Update bitergia user's authorized_keys
+  authorized_key:
+    user: bitergia
+    state: present
+    key: "{{ bitergia_pubkey }}"
+
+- name: Create ~/bin dir for bitergia user
+  file:
+    path: /home/bitergia/bin
+    state: directory
+    owner: bitergia
+    group: bitergia
+
+# Rsync is almost certainly already installed but it's required for the next task
+- name: Make sure rsync is installed
+  apt:
+    name: rsync
+    state: latest
+
+# Extract the restricted-rsync helper shipped with the rsync package.
+# This re-runs on every play; changed_when: false keeps it from being
+# reported as a change.
+- name: Put rrsync script in place for bitergia user
+  shell: "gunzip /usr/share/doc/rsync/scripts/rrsync.gz --to-stdout > /home/bitergia/bin/rrsync"
+  changed_when: false
+
+- name: Set permissions for bitergia rrsync script
+  file:
+    dest: /home/bitergia/bin/rrsync
+    owner: bitergia
+    group: bitergia
+    mode: 0774
+
+# Updates download.ceph.com/timestamp
+- name: Put make_timestamp script in place
+  template:
+    src: templates/download.ceph.com/make_timestamp.j2
+    dest: /usr/libexec/make_timestamp
+    mode: 0775
+
+# Only 'minute' is pinned, so this runs at the top of every hour.
+- name: Create cron entry for make_timestamp
+  cron:
+    name: "Update download.ceph.com/timestamp"
+    minute: "0"
+    job: "/usr/libexec/make_timestamp"
+
+- import_tasks: letsencrypt_nginx.yml
--- /dev/null
+---
+# Install and configure fail2ban, auto-detecting whether firewalld or UFW is
+# the active firewall so the matching ban action can be used.
+- name: Install or update fail2ban
+  package:
+    name: fail2ban
+    state: latest
+
+- name: Check if firewalld is running
+  shell: firewall-cmd --state
+  register: firewalld_status
+  # Don't fail if command not found
+  failed_when: false
+
+- name: Set f2b_banaction if using firewalld
+  set_fact:
+    f2b_banaction: "firewallcmd-ipset"
+  when: firewalld_status.stdout == "running"
+
+- name: Check if UFW is running
+  shell: ufw status | grep Status | cut -d ' ' -f2
+  register: ufw_status
+  # Don't fail if command not found
+  failed_when: false
+
+- name: Set f2b_banaction if using UFW
+  set_fact:
+    f2b_banaction: "ufw"
+  when: ufw_status.stdout == "active"
+
+# Older fail2ban packages don't ship a ufw action; provide our own.
+- name: Write /etc/fail2ban/action.d/ufw.conf if it's missing
+  template:
+    src: f2b_ufw.conf.j2
+    dest: /etc/fail2ban/action.d/ufw.conf
+  when: use_ufw == true
+
+# Any parameters defined in this file overwrite the package-provided jail.conf
+- name: Write global fail2ban defaults
+  template:
+    src: templates/f2b.jail.local.j2
+    dest: /etc/fail2ban/jail.local
+  notify: restart fail2ban
+
+# sshd_logpath is used in the f2b_services dictionary. fail2ban doesn't know
+# where ssh logs are for services other than sshd so sshd-ddos, for example
+# needs to be told where to look. For other services (e.g., nginx), the logpath
+# can be set directly in the dict.
+- name: Set sshd_logpath for CentOS/RHEL
+  set_fact:
+    sshd_logpath: "/var/log/messages"
+  when: ansible_os_family == "RedHat"
+
+- name: Set sshd_logpath for Ubuntu
+  set_fact:
+    sshd_logpath: "/var/log/auth.log"
+  when: ansible_os_family == "Debian"
+
+# This makes sure there are no old or malformed service conf files.
+# We'll rewrite them in the next task.
+- name: Clean up local service conf files
+  shell: rm -f /etc/fail2ban/jail.d/*.local
+
+- name: Write fail2ban service conf files
+  template:
+    src: templates/f2b.service.j2
+    dest: "/etc/fail2ban/jail.d/{{ item.key }}.local"
+  with_dict: "{{ f2b_services }}"
+  notify: reload fail2ban
+
+- name: Clean up local filter conf files
+  shell: rm -f /etc/fail2ban/filter.d/*.local
+
+# f2b_filters has no default, so only write filters when a host defines it.
+- name: Write fail2ban filter conf files
+  template:
+    src: templates/f2b.filter.j2
+    dest: "/etc/fail2ban/filter.d/{{ item.key }}.local"
+  with_dict: "{{ f2b_filters }}"
+  when: f2b_filters is defined
+  notify: reload fail2ban
--- /dev/null
+---
+# NOTE: Initial cert creation is a manual process primarily because we'll hopefully never
+# have to start from scratch again. This playbook just keeps the existing certs up to date.
+# NOTE(review): acme-v01.api.letsencrypt.org is the deprecated ACMEv1
+# endpoint (retired by Let's Encrypt); this whole IPv4-pinning scheme likely
+# needs updating for ACMEv2 -- confirm before relying on renewals.
+
+# Get letsencrypt authority server IPv4 address
+- local_action: shell dig -4 +short acme-v01.api.letsencrypt.org | tail -n 1
+  register: letsencrypt_ipv4_address
+
+# This task really only needs to be run the first time download.ceph.com is set up.
+# An entry matching *letsencrypt* in /etc/hosts is required for the cronjob in the next task however.
+- name: Create entry for letsencrypt authority server in /etc/hosts
+  lineinfile:
+    path: /etc/hosts
+    regexp: '(.*)letsencrypt(.*)'
+    line: '{{ letsencrypt_ipv4_address.stdout }} acme-v01.api.letsencrypt.org'
+    state: present
+
+# 'letsencrypt renew' fails because it can't reach the letsencrypt authority server using IPv6
+- name: Create cron entry to force IPv4 connectivity to letsencrypt authority server # noqa no-tabs
+  cron:
+    name: "Forces letsencrypt to use IPv4 when accessing acme-v01.api.letsencrypt.org"
+    hour: "0"
+    job: "IP=$(dig -4 +short acme-v01.api.letsencrypt.org | tail -n 1) && sed -i \"s/.*letsencrypt.*/$IP\tacme-v01.api.letsencrypt.org/g\" /etc/hosts"
+
+# letsencrypt doesn't recommend using the Ubuntu-provided letsencrypt package
+# https://github.com/certbot/certbot/issues/3538
+# They do recommend using certbot from their PPA for Xenial
+# https://certbot.eff.org/#ubuntuxenial-nginx
+# NOTE(review): the certbot PPA has since been deprecated upstream in favor
+# of snap/pip installs -- confirm it still resolves for this distro.
+
+- name: install software-properties-common
+  apt:
+    name: software-properties-common
+    state: latest
+    update_cache: yes
+
+- name: add certbot PPA
+  apt_repository:
+    repo: "ppa:certbot/certbot"
+
+- name: install certbot
+  apt:
+    name: python-certbot-nginx
+    state: latest
+    update_cache: yes
+
+- name: setup a cron to attempt to renew the SSL cert every 15ish days
+  cron:
+    name: "renew letsencrypt cert"
+    minute: "0"
+    hour: "0"
+    day: "1,15"
+    job: "certbot renew --renew-hook='systemctl reload nginx'"
+
+# This cronjob would attempt to renew the cert twice a day but doesn't have our required --renew-hook
+- name: make sure certbot's cronjob is not present
+  file:
+    path: /etc/cron.d/certbot
+    state: absent
+
+# Same thing here. Let me automate how I wanna automate plz.
+- name: make sure certbot's systemd services are disabled
+  service:
+    name: "{{ item }}"
+    state: stopped
+    enabled: no
+  with_items:
+    - "certbot.service"
+    - "certbot.timer"
--- /dev/null
+---
+## Common tasks
+
+# Most of our public-facing hosts are running Ubuntu.
+# use_ufw defaults to false but is overridden in inventory host_vars
+- import_tasks: ufw.yml
+ when: use_ufw == true
+ tags:
+ - always
+
+- import_tasks: fail2ban.yml
+ tags:
+ - always
+ when: use_fail2ban == true
+
+# Key-based auth only; the handler restarts sshd when this line changes.
+- name: Disable password authentication
+ lineinfile:
+ dest: /etc/ssh/sshd_config
+ regexp: "^PasswordAuthentication"
+ line: "PasswordAuthentication no"
+ state: present
+ notify: restart sshd
+
+## Individual host tasks
+
+# local_action in the task after this causes 'ansible_host' to change to 'localhost'
+# we set a temporary variable here to search for in the local_action task
+- set_fact:
+ target_host: "{{ ansible_host }}"
+
+# stat runs on the control node: host-specific task files live in this repo.
+- name: Check for host-specific playbooks
+ local_action: "stat path=roles/public_facing/tasks/{{ target_host }}.yml"
+ register: host_playbook
+
+- name: Include any host-specific playbooks if present
+ include_tasks: "{{ ansible_host }}.yml"
+ when: host_playbook.stat.exists
--- /dev/null
+---
+- name: Create /root/checks directory for Cachet checks
+ file:
+ path: "{{ cachet_checks_path }}"
+ state: directory
+
+# Provides the cachet_notify helper called by the templated check scripts below.
+- name: Clone nagios-eventhandler-cachet to /root/checks dir
+ git:
+ repo: https://github.com/djgalloway/nagios-eventhandler-cachet.git
+ dest: "{{ cachet_checks_path }}/nagios-eventhandler-cachet"
+ update: yes
+
+# Check scripts are executable (0755); the PHP config holding the API key is 0644.
+- name: Put templated Cachet checks in place
+ template:
+ dest: "{{ cachet_checks_path }}/{{ item.dest }}"
+ src: "{{ item.src }}"
+ mode: "{{ item.mode }}"
+ with_items:
+ - { src: 'templates/status.sepia.ceph.com/lab-pings.j2', dest: 'lab-pings.sh', mode: '0755' }
+ - { src: 'templates/status.sepia.ceph.com/openvpn.j2', dest: 'openvpn.sh', mode: '0755' }
+ - { src: 'templates/status.sepia.ceph.com/nagios-eventhandler-cachet.config.j2', dest: 'nagios-eventhandler-cachet/config.inc.php', mode: '0644' }
--- /dev/null
+---
+# ufw and iptables-persistent both manage iptables rules; keep only ufw.
+- name: Make sure iptables-persistent is not installed
+ apt:
+ name: iptables-persistent
+ state: absent
+
+- name: Install or update ufw
+ apt:
+ name: ufw
+ state: latest
+
+# stdout is "active" or "inactive".
+- name: Get current ufw status
+ shell: ufw status | grep 'Status' | cut -d ' ' -f2
+ register: ufw_status
+
+# policy: allow makes sure we can still ssh if ufw is inactive.
+# We revert this at the end of the playbook
+- name: Enable ufw if inactive
+ ufw:
+ state: enabled
+ policy: allow
+ when: ufw_status.stdout == "inactive"
+
+# Instead of deleting all rules and re-opening ports with each playbook run,
+# we'll compare a list of ports we specify should be open with a list of currently open ports.
+- name: Get list of currently allowed ports
+ shell: ufw status | grep 'ALLOW' | grep -v v6 | awk '{ print $1 }'
+ register: ufw_current_allowed_raw
+ # Don't fail if we don't get any output
+ failed_when: false
+
+- name: Determine ports to disable
+ set_fact:
+ ufw_ports_to_disable: "{{ ufw_current_allowed_raw.stdout_lines | difference(ufw_allowed_ports) }}"
+
+# NOTE(review): ufw_allowed_ports entries of the form "port:source_ip" never
+# match the bare ports reported by `ufw status`, so they always land in this
+# list and get re-applied on each run -- confirm that is acceptable.
+- name: Determine ports to enable
+ set_fact:
+ ufw_ports_to_enable: "{{ ufw_allowed_ports | difference(ufw_current_allowed_raw.stdout_lines) }}"
+
+- name: Disable any open ports that aren't specified in ufw_allowed_ports
+ ufw:
+ rule: allow
+ port: "{{ item }}"
+ delete: yes
+ with_items: "{{ ufw_ports_to_disable }}"
+
+# Items may be "port" or "port:source_ip"; the inline Jinja splits the latter.
+- name: Enable any ports we're missing
+ ufw:
+ rule: allow
+ port: "{% if ':' in item %}{% set port_and_src = item.split(':') %}{{ port_and_src[0] }}{% else %}{{ item }}{% endif %}"
+ from_ip: "{% if ':' in item %}{% set port_and_src = item.split(':') %}{{ port_and_src[1] }}{% else %}any{% endif %}"
+ with_items: "{{ ufw_ports_to_enable }}"
+
+# ufw_allowed_ports are excluded from the default policy
+- name: Set default policy to deny
+ ufw:
+ policy: deny
--- /dev/null
+---
+# Wordpress has its own cron system that only runs queued jobs when the site
+# is visited. We want certain jobs to run regardless of page visits.
+# 5 minutes was used because that's the most frequent any job is queued.
+# See http://docs.wprssaggregator.com/cron-intervals/#getting-around-the-limitations
+- name: Cron entry for Wordpress cron
+ cron:
+ name: "Call wp-cron.php to run Wordpress cronjobs"
+ minute: "*/5"
+ job: "/usr/bin/wget -q -O - http://ceph.com/wp-cron.php?doing_wp_cron"
+
+# Keep this host's letsencrypt cert renewed (see letsencrypt_nginx.yml).
+- import_tasks: letsencrypt_nginx.yml
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+{# Rotate download.ceph.com nginx logs daily, keeping 30 compressed copies. #}
+/data/download.ceph.com/logs/*.log {
+ daily
+ missingok
+ rotate 30
+ compress
+ delaycompress
+ notifempty
+ dateext
+ create 640 www-data adm
+ sharedscripts
+ prerotate
+ if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
+ run-parts /etc/logrotate.d/httpd-prerotate; \
+ fi \
+ endscript
+ postrotate
+{# USR1 tells nginx to reopen its log files without a restart. #}
+ [ -s /run/nginx.pid ] && kill -USR1 `cat /run/nginx.pid`
+ endscript
+}
--- /dev/null
+#!/bin/bash
+# {{ ansible_managed }}
+# Write the current epoch time to a web-served file -- presumably consumed
+# by mirror/freshness monitoring; confirm against the consumer.
+/bin/date "+%s" > /data/download.ceph.com/www/timestamp
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+{# Rendered once per f2b_filters entry; item.key names the filter file. #}
+[Definition]
+failregex = {{ item.value.failregex }}
+
+{% if item.value.maxlines is defined %}
+[Init]
+maxlines = {{ item.value.maxlines }}
+{% endif %}
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+
+# These are global overrides of jail.conf
+[DEFAULT]
+ignoreip = {{ f2b_ignoreip }}
+bantime = {{ f2b_bantime }}
+findtime = {{ f2b_findtime }}
+maxretry = {{ f2b_maxretry }}
+{# banaction is optional; fail2ban's packaged default applies otherwise. #}
+{% if f2b_banaction is defined %}
+banaction = {{ f2b_banaction }}
+{% endif %}
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+{# One jail per f2b_services entry; only 'enabled' is required, every other
+   key overrides the jail.conf/jail.local default when present. #}
+[{{ item.key }}]
+enabled = {{ item.value.enabled }}
+{% if item.value.maxretry is defined %}
+maxretry = {{ item.value.maxretry }}
+{% endif %}
+{% if item.value.port is defined %}
+port = {{ item.value.port }}
+{% endif %}
+{% if item.value.findtime is defined %}
+findtime = {{ item.value.findtime }}
+{% endif %}
+{% if item.value.logpath is defined %}
+logpath = {{ item.value.logpath }}
+{% endif %}
+{% if item.value.filter is defined %}
+filter = {{ item.value.filter }}
+{% endif %}
+{% if item.value.bantime is defined %}
+bantime = {{ item.value.bantime }}
+{% endif %}
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+# Fail2Ban action configuration file for ufw
+
+# start/stop/check are intentionally no-ops; bans just insert/delete ufw rules
+# at position 1 so they take precedence over any allow rules.
+[Definition]
+actionstart =
+actionstop =
+actioncheck =
+actionban = ufw insert 1 deny from <ip> to any port <port>
+ ufw insert 1 deny proto tcp from <ip> to any port <port>
+actionunban = ufw delete deny from <ip> to any port <port>
+ ufw delete deny proto tcp from <ip> to any port <port>
--- /dev/null
+#!/bin/bash
+#
+# {{ ansible_managed }}
+#
+# Pings the Community Cage edge router, measures packet loss, and reports status to cachet using nagios event handler
+#
+# cachet_notify usage: ./cachet_notify $cachet_component $service_name $service_state $service_state_type $service_output
+
+# Extract the packet-loss percentage from ping's summary line.
+PERCENT=$(ping -c 10 -q {{ community_cage_ip }} | grep -oP '\d+(?=% packet loss)')
+EXEC=/root/checks/nagios-eventhandler-cachet/cachet_notify
+
+# The "2> /dev/null" on each test hides [ ]'s error when PERCENT is empty or
+# non-numeric; that case falls through to the final else (unknown error).
+if [ "$PERCENT" -eq 0 ] 2> /dev/null
+then
+ $EXEC 'Community Cage Network' 'Packet Loss' OK HARD '0% packet loss' ''
+elif [ "$PERCENT" -ge 1 ] 2> /dev/null && [ "$PERCENT" -le 99 ] 2> /dev/null
+then
+ $EXEC 'Community Cage Network' 'Packet Loss' CRITICAL SOFT "$PERCENT% packet loss" ''
+elif [ "$PERCENT" -eq 100 ] 2> /dev/null
+then
+ $EXEC 'Community Cage Network' 'Packet Loss' CRITICAL HARD "$PERCENT% packet loss" ''
+else
+ $EXEC 'Community Cage Network' 'Packet Loss' CRITICAL HARD "Couldn't measure packet loss. Unknown error" ''
+fi
--- /dev/null
+<?php
+
+// Cachet API settings consumed by the nagios-eventhandler-cachet scripts;
+// values are filled in by Ansible from the secrets repo.
+$cachet_url = '{{ cachet_api_url }}';
+// The trailing semicolon was missing here, which is a PHP parse error.
+$api_key = '{{ cachet_api_key }}';
+
+?>
--- /dev/null
+#!/bin/bash
+#
+# {{ ansible_managed }}
+#
+# Checks whether Sepia openvpn server is up and listening on 1194
+
+EXEC=/root/checks/nagios-eventhandler-cachet/cachet_notify
+
+# Returns 0 if string found
+# (grep's '|' is literal here, matching nmap's "open|filtered" UDP state)
+sudo nmap --max-retries 3 --host-timeout 5s -sU -n -p 1194 gw.sepia.ceph.com | grep -q '1194/udp open|filtered openvpn'
+
+if [ $? -ne 0 ]
+then
+ # If nmap didn't return 0, check if we're having overall network issues
+ ping -c 1 -q 8.8.8.8
+ # If we can ping Google DNS but didn't get expected nmap output, alert
+ if [ $? -eq 0 ]
+ then
+ $EXEC 'OpenVPN Server' 'OpenVPN' CRITICAL HARD "gw.sepia.ceph.com is unreachable or port 1194 closed" ''
+ fi
+else
+ $EXEC 'OpenVPN Server' 'OpenVPN' OK HARD 'OK' ''
+fi
--- /dev/null
+Pulpito
+=======
+
+This role is used to configure a node to run pulpito_.
+
+It has been tested on:
+
+- CentOS 7.x
+- Debian 8.x (Jessie)
+- Ubuntu 14.04 (Trusty)
+
+Dependencies
+++++++++++++
+
+Since pulpito_ is only useful as a frontend to paddles_, it requires a paddles_ instance to function. Additionally, you must set ``paddles_address`` in e.g. your secrets repository to the URL of your instance.
+
+
+.. _pulpito: https://github.com/ceph/pulpito
+.. _paddles: https://github.com/ceph/paddles
+
+Variables
++++++++++
+
+``pulpito_repo``: Optionally override the pulpito git repo.
+
+``pulpito_branch``: Optionally override the pulpito repo branch.
+For GitHub pull requests the values refs/pull/X/merge or refs/pull/X/head
+can be used.
+
+``pulpito_user``: The system account to create and use (Default: pulpito)
--- /dev/null
+# Defaults for the pulpito role; override per-host or in the secrets repo.
+pulpito_repo: https://github.com/ceph/pulpito.git
+pulpito_user: pulpito
+pulpito_branch: main
--- /dev/null
+---
+# NOTE(review): no_log also hides apt's error output, making failed installs
+# hard to debug -- confirm the package list is actually sensitive.
+- name: Install packages via apt
+ apt:
+ name: "{{ pulpito_extra_packages|list }}"
+ state: latest
+ update_cache: yes
+ cache_valid_time: 600
+ no_log: true
+ tags:
+ - packages
--- /dev/null
+---
+- name: Include package type specific vars.
+ include_vars: "{{ ansible_pkg_mgr }}_systems.yml"
+ tags:
+ - always
+
+- import_tasks: yum_systems.yml
+ when: ansible_pkg_mgr == "yum"
+
+- import_tasks: apt_systems.yml
+ when: ansible_pkg_mgr == "apt"
+
+- import_tasks: zypper_systems.yml
+ when: ansible_pkg_mgr == "zypper"
+
+- name: Create the user
+ user:
+ name: "{{ pulpito_user }}"
+ state: present
+ shell: /bin/bash
+ tags:
+ - user
+
+- name: Set repo location
+ set_fact:
+ pulpito_repo_path: "/home/{{ pulpito_user }}/pulpito"
+
+# Set up the actual pulpito project
+# (also registers pulpito_config, used by the restart task further down)
+- import_tasks: setup_pulpito.yml
+
+
+- name: Enable supervisord
+ service:
+ name: "{{ supervisor_service }}"
+ enabled: yes
+ state: started
+
+- name: Set supervisord config path
+ set_fact:
+ supervisor_conf_path: "{{ supervisor_conf_d }}/pulpito.{{ supervisor_conf_suffix }}"
+
+- name: Look for supervisord config
+ stat:
+ path: "{{ supervisor_conf_path }}"
+ get_checksum: no
+ register: supervisor_conf
+
+# Re-registers supervisor_conf so 'supervisorctl update' only runs when the
+# config was just copied (i.e. on first install).
+- name: Copy supervisord config
+ shell: cp ./supervisord_pulpito.conf {{ supervisor_conf_path }} chdir={{ pulpito_repo_path }}
+ when: supervisor_conf.stat.exists == false
+ register: supervisor_conf
+
+- name: Read supervisord config
+ command: supervisorctl update
+ when: supervisor_conf is changed
+
+- name: Check if pulpito is running
+ command: supervisorctl status pulpito
+ register: pulpito_status
+ changed_when: false
+
+# Only restart when pulpito is already running AND its config changed.
+- name: Restart pulpito if necessary
+ supervisorctl:
+ name: pulpito
+ state: restarted
+ when: pulpito_status.stdout is match('.*RUNNING.*') and pulpito_config is changed
+
+- name: Wait for pulpito to start
+ wait_for:
+ port: 8081
--- /dev/null
+---
+# Extracts "X" from refs/pull/X/merge or refs/pull/X/head. If pulpito_branch
+# is an ordinary branch name the regex doesn't match, so
+# pulpito_pull == pulpito_branch and the plain checkout below runs instead.
+- name: Determine GitHub Pull Request
+ set_fact:
+ pulpito_pull: "{{ pulpito_branch | regex_replace( '^refs/pull/([^/]+)/.*$', '\\1') }}"
+
+- name: Clone the repo and checkout pull request branch
+ git:
+ repo: "{{ pulpito_repo }}"
+ dest: "{{ pulpito_repo_path }}"
+ version: "pull-{{ pulpito_pull }}"
+ refspec: '+{{ pulpito_branch }}:refs/remotes/origin/pull-{{ pulpito_pull }}'
+ become_user: "{{ pulpito_user }}"
+ tags:
+ - repos
+ when: pulpito_pull is defined and pulpito_pull != pulpito_branch
+
+- name: Checkout the repo
+ git:
+ repo: "{{ pulpito_repo }}"
+ dest: "{{ pulpito_repo_path }}"
+ version: "{{ pulpito_branch }}"
+ become_user: "{{ pulpito_user }}"
+ tags:
+ - repos
+ when: pulpito_pull is not defined or pulpito_pull == pulpito_branch
+
+- name: Look for the virtualenv
+ stat:
+ path: "{{ pulpito_repo_path }}/virtualenv"
+ get_checksum: no
+ register: virtualenv
+
+- name: Create the virtualenv
+ shell: virtualenv -p python3 ./virtualenv chdir={{ pulpito_repo_path }}
+ become_user: "{{ pulpito_user }}"
+ when: virtualenv.stat.exists == false
+
+- name: Self-upgrade pip
+ pip:
+ name: "pip"
+ state: "latest"
+ chdir: "{{ pulpito_repo_path }}"
+ virtualenv: "{{ pulpito_repo_path }}/virtualenv"
+ become_user: "{{ pulpito_user }}"
+
+- name: Install requirements via pip
+ pip:
+ chdir: "{{ pulpito_repo_path }}"
+ requirements: "./requirements.txt"
+ virtualenv: "{{ pulpito_repo_path }}/virtualenv"
+ #no_log: true
+ become_user: "{{ pulpito_user }}"
+
+- name: Check for pulpito config
+ stat:
+ path: "{{ pulpito_repo_path }}/prod.py"
+ get_checksum: no
+ register: pulpito_config
+
+# Seed prod.py from the sample config on first run only.
+- name: Copy pulpito config
+ shell: cp ./config.py.in prod.py chdir={{ pulpito_repo_path }}
+ when: pulpito_config.stat.exists == false
+ become_user: "{{ pulpito_user }}"
+
+# Re-registers pulpito_config; main.yml restarts pulpito when this changes.
+- name: Set paddles_address
+ lineinfile:
+ dest: "{{ pulpito_repo_path }}/prod.py"
+ regexp: "^paddles_address = "
+ line: "paddles_address = '{{ paddles_address|mandatory }}'"
+ register: pulpito_config
+
--- /dev/null
+---
+# NOTE(review): no_log also hides yum's error output -- confirm it is needed.
+- name: Install packages via yum
+ yum:
+ name: "{{ pulpito_extra_packages|list }}"
+ state: latest
+ no_log: true
+ tags:
+ - packages
--- /dev/null
+---
+# Package installation for zypper-based (SUSE) systems.
+- name: Install packages via zypper
+ zypper:
+ name: "{{ pulpito_extra_packages|list }}"
+ state: latest
+ update_cache: yes
+ #no_log: true
+ tags:
+ - packages
--- /dev/null
+---
+# Packages and supervisor settings for apt-based (Debian/Ubuntu) systems.
+pulpito_extra_packages:
+ - git-core
+ - supervisor
+ - python3-pip
+ - python3-virtualenv
+ - virtualenv
+
+supervisor_service: supervisor
+# No trailing slash: main.yml appends "/pulpito.<suffix>" to this path, so a
+# trailing slash produced a double slash and was inconsistent with the yum vars.
+supervisor_conf_d: /etc/supervisor/conf.d
+supervisor_conf_suffix: conf
--- /dev/null
+---
+# Packages and supervisor settings for yum-based (CentOS/RHEL) systems.
+pulpito_extra_packages:
+ - git-all
+ - supervisor
+ - python3-pip
+ - python3-virtualenv
+
+supervisor_service: supervisord
+supervisor_conf_d: /etc/supervisord.d
+supervisor_conf_suffix: ini
--- /dev/null
+---
+# Packages and supervisor settings for zypper-based (SUSE) systems.
+pulpito_extra_packages:
+ - git
+ - python3-pip
+ - python3-virtualenv
+ - supervisor
+
+supervisor_service: supervisord
+# No trailing slash: main.yml appends "/pulpito.<suffix>" to this path, so a
+# trailing slash produced a double slash and was inconsistent with the yum vars.
+supervisor_conf_d: /etc/supervisord.d
+supervisor_conf_suffix: conf
--- /dev/null
+Rook
+====
+
+This role is used for updating and recovering the rook jenkins in the rook ci Virtual Private Cloud (VPC).
+
+The functions in this role are:
+
+**rook-jenkins-update:** For updating rook jenkins version to the version defined in the "jenkins_controller_image" variable
+
+**rook-os-update:** For updating rook jenkins OS packages
+
+**rook-recovery:** For recovering the Prod-jenkins instance from the image defined in the "image" variable in a case that the instance was deleted or crashed
+
+Usage
++++++
+
+The rook role is used by the ``rook.yml`` playbook. Run this playbook with one of the optional **Tags** listed in the tags section to upgrade rook jenkins OS packages/recover it from an image or update the rook jenkins app.
+
+**Pre-requisites:** Before running ``rook.yml`` make sure your IP address has ssh access to the VPC. This is configured in the `AWS dashboard`_ under the "rook-jenkins-group" security group inbound rules.
+
+- The Rook-Recovery Playbook is used for deploying rook jenkins from an image in case of a crash/corruption:
+ - Run the playbook with the ``rook-recovery`` tag, then you will need to make the newly created instance available to the public network as explained in the next step.
+
+ - Once the instance is deployed, add it to the load balancing target group named "jenkins-rook-new" so that it becomes available to the public network.
+
+- AWS dashboard access
+ Access details to the AWS dashboard can be found in here_ (Red Hat VPN Access required)
+
+**NOTE:** ``rook.yml`` currently uses only localhost and not any host from the inventory. This is because the ``rook-recovery`` play deploys and configures the rook jenkins during its run.
+
+Examples
+++++++++
+
+Updating the rook jenkins app to version 2.289.1::
+
+ ansible-playbook rook.yml --tags="rook-jenkins-update" --extra-vars="jenkins_controller_image=jenkins/jenkins:2.289.1"
+
+Updating the rook jenkins OS packages::
+
+ ansible-playbook rook.yml --tags="rook-os-update"
+
+Variables
++++++++++
+
+Available variables are listed below. These overrides are included by ``tasks/vars.yml``.
+
+The rook jenkins version::
+
+ jenkins_controller_image: jenkins/jenkins:2.289.1
+
+The rook jenkins ssh key-pair defined in the aws dashboard::
+
+ keypair: root-jenkins-new-key
+
+The rook jenkins instance type::
+
+ controller_instance_type: m4.large
+
+The rook jenkins instance aws security group::
+
+ security_group: rook-jenkins-group
+
+The rook jenkins instance aws region::
+
+ region: us-east-1
+
+The rook jenkins instance aws vpc subnet id::
+
+ vpc_subnet_id: subnet-c72b609b
+
+The rook jenkins image is the backup image used for creating the recovery instance of rook jenkins::
+
+ image: ami-0aaf5dbaa4cbe5771
+
+The rook jenkins instance name, used by the rook-recovery play when creating the instance from image::
+
+ instance_name: Recovery-Rook-Jenkins
+
+A list of the rook jenkins aws instance tags, used by the rook-recovery play when creating the instance from image::
+
+ aws_tags:
+ Name: "{{ instance_name }}"
+ Application: "Jenkins"
+
+The rook jenkins running aws instance name::
+
+ controller_name: Prod-Jenkins
+
+The rook jenkins instance ssh key::
+
+ rook_key: "{{ secrets_path | mandatory }}/rook_key.yml"
+
+Tags
+++++
+
+Available tags are listed below:
+
+- rook-jenkins-update
+ Update the rook jenkins app to the version defined in the "jenkins_controller_image" variable.
+
+- rook-os-update
+ Update the rook jenkins OS packages.
+
+- rook-recovery
+ Recover the rook jenkins instance from the image defined in "image" variable.
+
+Dependencies
+++++++++++++
+
+This role depends on the following roles:
+
+- secrets
+ Provides a var, ``secrets_path``, containing the path of the secrets repository.
+
+ .. _AWS dashboard: https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Home:
+ .. _here: http://wiki.ceph.redhat.com/dokuwiki/doku.php?id=rook_aws_account
--- /dev/null
+---
+# The secrets role provides secrets_path, required by tasks/main.yml.
+dependencies:
+ - role: secrets
--- /dev/null
+---
+- name: Include secrets
+ include_vars: "{{ secrets_path | mandatory }}/aws.yaml"
+ no_log: true
+ tags:
+ - always
+
+# Each import below only runs when its tag is selected; invoke the playbook
+# with one of rook-jenkins-update / rook-os-update / rook-recovery.
+- import_tasks: rook-jenkins-update.yml
+ tags:
+ - rook-jenkins-update
+
+- import_tasks: rook-os-update.yml
+ tags:
+ - rook-os-update
+
+- import_tasks: rook-recovery.yml
+ tags:
+ - rook-recovery
--- /dev/null
+---
+# NOTE(review): unlike ec2_ami below, ec2_instance_facts passes no AWS
+# credentials -- presumably they come from the environment; confirm.
+- name: Gather facts
+ ec2_instance_facts:
+ filters:
+ "tag:Name": "{{ controller_name }}"
+ instance-state-name: running
+ register: controller_metadata
+
+# Snapshot the instance before replacing the container so we can roll back.
+- name: Take a backup image of the Prod-jenkins instance
+ ec2_ami:
+ aws_access_key: "{{ aws_access_key }}"
+ aws_secret_key: "{{ aws_secret_key }}"
+ instance_id: "{{ controller_metadata.instances[0].instance_id }}"
+ no_reboot: yes
+ wait: yes
+ wait_timeout: 3000
+ name: "{{ controller_name }}-{{ ansible_date_time.date }}"
+ tags:
+ Name: "{{ controller_name }}-{{ ansible_date_time.date }}"
+
+# The '| grep | wc -l' pipeline executes in the remote shell that ssh spawns,
+# so it works even though this uses the command (not shell) module.
+- name: Check if container is running
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" docker ps -a | grep -i jenkins | wc -l
+ register: container
+
+- name: Kill the jenkins container
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" sudo docker kill jenkins
+ when: container.stdout == '1'
+
+- name: Remove the jenkins container
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" sudo docker rm jenkins
+ when: container.stdout == '1'
+
+- name: Start the new jenkins container with the new LTS version
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" sudo docker run -d --name jenkins -p 8080:8080 -p 50000:50000 -v /mnt/jenkins/jenkins:/var/jenkins_home "{{ jenkins_controller_image }}"
--- /dev/null
+---
+- name: Gather facts
+ ec2_instance_facts:
+ filters:
+ "tag:Name": "{{ controller_name }}"
+ instance-state-name: running
+ register: controller_metadata
+
+# Snapshot before patching so a bad upgrade can be rolled back.
+- name: Take a image of the controller
+ ec2_ami:
+ aws_access_key: "{{ aws_access_key }}"
+ aws_secret_key: "{{ aws_secret_key }}"
+ instance_id: "{{ controller_metadata.instances[0].instance_id }}"
+ no_reboot: yes
+ wait: yes
+ wait_timeout: 3000
+ name: "{{ controller_name }}-{{ ansible_date_time.date }}"
+ tags:
+ Name: "{{ controller_name }}-{{ ansible_date_time.date }}"
+
+- name: Update apt cache
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" sudo apt-get update
+
+- name: Update packages
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" sudo apt-get upgrade -y
+
+# stdout '0' means /var/run/reboot-required exists (the test runs remotely).
+- name: Check if system requires reboot
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" [ -f /var/run/reboot-required ]; echo $?
+ register: reboot
+
+# The reboot drops our ssh session, hence ignore_errors.
+- name: Reboot if required
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" sudo reboot
+ ignore_errors: yes
+ when: reboot.stdout == '0'
+
+- name: Wait for SSH to come up
+ wait_for: host={{ controller_metadata.instances[0].public_dns_name }} port=22 delay=60 timeout=320 state=started
+ when: reboot.stdout == '0'
+
+- name: Check if old container exist
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" docker ps -a | grep -i jenkins | wc -l
+ register: container
+
+# The two tasks below only run when a jenkins container exists AND a reboot
+# happened: recreate the container after the reboot.
+- name: Remove jenkins old container if exist
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" sudo docker rm jenkins
+ when:
+ - container.stdout == '1'
+ - reboot.stdout == '0'
+
+- name: Start jenkins container
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ controller_metadata.instances[0].public_dns_name }}" sudo docker run -d --name jenkins -p 8080:8080 -p 50000:50000 -v /mnt/jenkins/jenkins:/var/jenkins_home "{{ jenkins_controller_image }}"
+ when:
+ - container.stdout == '1'
+ - reboot.stdout == '0'
--- /dev/null
+---
+# Recover the rook Jenkins controller: launch a replacement EC2 instance
+# from the backup AMI ("image" var) and start the jenkins container on it.
+- name: Launch instance
+ ec2:
+ aws_access_key: "{{ aws_access_key }}"
+ aws_secret_key: "{{ aws_secret_key }}"
+ key_name: "{{ keypair }}"
+ group: "{{ security_group }}"
+ instance_type: "{{ controller_instance_type }}"
+ image: "{{ image }}"
+ region: "{{ region }}"
+ vpc_subnet_id: "{{ vpc_subnet_id }}"
+ assign_public_ip: yes
+ instance_tags: "{{ aws_tags }}"
+ wait: yes
+ register: ec2_instances
+
+- name: print ec2 facts
+ debug:
+ var: ec2_instances
+
+# Tag each launched instance "<Name>-01", "<Name>-02", ...
+- name: Set name tag for AWS instance
+ ec2_tag:
+ aws_access_key: "{{ aws_access_key }}"
+ aws_secret_key: "{{ aws_secret_key }}"
+ region: "{{ region }}"
+ resource: "{{ item.1.id }}"
+ tags:
+ Name: "{{ aws_tags.Name }}-{{ '%02d' | format(item.0 + 1) }}"
+ with_indexed_items: "{{ ec2_instances.instances }}"
+ loop_control:
+ label: "{{ item.1.id }} - {{ aws_tags.Name }}-{{ '%02d' | format(item.0 + 1) }}"
+
+- name: Wait for SSH to come up
+ wait_for: host={{ item.public_ip }} port=22 delay=60 timeout=320 state=started
+ with_items: '{{ ec2_instances.instances }}'
+ loop_control:
+ label: "{{ item.id }} - {{ item.public_ip }}"
+
+# Remove any container named "jenkins" baked into the AMI so the docker run
+# below doesn't fail on a name conflict.
+- name: Remove jenkins docker old container
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ item.public_ip }}" sudo docker rm jenkins
+ with_items: '{{ ec2_instances.instances }}'
+ loop_control:
+ label: "{{ item.id }} - {{ item.public_ip }}"
+
+- name: Start jenkins container
+ command: ssh -i "{{ rook_key }}" ubuntu@"{{ item.public_ip }}" sudo docker run -d --name jenkins -p 8080:8080 -p 50000:50000 -v /mnt/jenkins/jenkins:/var/jenkins_home "{{ jenkins_controller_image }}"
+ with_items: '{{ ec2_instances.instances }}'
+ loop_control:
+ label: "{{ item.id }} - {{ item.public_ip }}"
+
+# Typos fixed in the operator-facing message below ("succssfuly"/"avalible").
+- name: The instance was successfully started
+ debug:
+ msg:
+ - "The Rook Jenkins is up and running; the instance is named: {{ aws_tags.Name }}-{{ '%02d' | format(item.0 + 1) }}"
+ - "To make it available to the public network, add it to the load balancing target group"
+ with_indexed_items: "{{ ec2_instances.instances }}"
+ loop_control:
+ label: "{{ item.1.id }} - {{ aws_tags.Name }}-{{ '%02d' | format(item.0 + 1) }}"
--- /dev/null
+---
+# Defaults for the rook role; see README.rst for per-variable descriptions.
+jenkins_controller_image: jenkins/jenkins:2.289.1
+keypair: root-jenkins-new-key
+controller_instance_type: m4.large
+security_group: rook-jenkins-group
+# Backup AMI used by rook-recovery.yml to rebuild the controller.
+image: ami-0aaf5dbaa4cbe5771
+region: us-east-1
+vpc_subnet_id: subnet-c72b609b
+instance_name: Recovery-Rook-Jenkins
+aws_tags:
+ Name: "{{ instance_name }}"
+ Application: "Jenkins"
+controller_name: Prod-Jenkins
+rook_key: "{{ secrets_path | mandatory }}/rook_key.yml"
--- /dev/null
+---
+# Resolve the secrets repo path from $ANSIBLE_SECRETS_PATH, falling back to
+# /etc/ansible/secrets when the env var is unset or empty.
+secrets_path: "{{ lookup('env', 'ANSIBLE_SECRETS_PATH') | default('/etc/ansible/secrets', true) }}"
--- /dev/null
+signalfx_splunk_agent_configuration
+===================================
+
+This role helps you configure any server node to monitor services such as HTTP endpoints and systemd units.
+This will create the necessary configuration files and add the server for monitoring on the dashboard.
+
+Prerequisites
+-------------
+
+Requires an access_token which needs to be generated in your profile.
+
+HTTP - Monitoring
++++++++++++++++++
+
+Create a variable file as follows. Example: http_vars.yml::
+
+ ---
+ access_token: "<Your access token>"
+ basic_attributes:
+ appcode: "<Your preferred appcode>"
+ http_enabled: true
+ http_monitors:
+ - host: example1.domain.com
+ http_timeout: 1s
+ - host: example2.domain.com
+ port: 80
+ use_https: false
+ - host: example3.domain.com
+ port: 8443
+ path: /my/path/index.html
+ skip_verify: true
+
++++++++++++++++++
+
+
+SYSTEMD - Monitoring
+++++++++++++++++++++
+
+Create a variable file as follows. Example: systemd_vars.yml::
+
+ ---
+ access_token: "<Your access token>"
+ basic_attributes:
+ appcode: "<Your preferred appcode>"
+ systemd_enabled: true
+ systemd_services:
+ - ssh
+ - nginx
+ - firewall
+ systemd_sendactivestate: true
+ systemd_extrametrics:
+ - gauge.active_state.active
+
+++++++++++++++++++++
+
+How to run
+----------
+
+You can pass the variables file name as an extra variable `var_file_name`.
+
+If nothing is provided then it will make use of the vars/main.yml parameters and configure the node to default settings.
+
+NOTE: If you wish to configure the node with default setting, please remember to change the values below.
+
+- access_token
+- appcode
+
+The way of passing the variable to the ansible playbook can be achieved by running the following command::
+
+ Example: If your variables file name is http_vars.yml
+ ansible-playbook -i hosts -e "var_file_name=http_vars.yml" signalfx.yml
+
+----------
--- /dev/null
+---
+agent_interval_seconds: 20
+agent_realm: us1
+# restorecon location per RedHat major version; indexed below and used by
+# tasks/main.yml to reapply SELinux contexts.
+agent_restorecon_map:
+ RedHat6: /sbin/restorecon
+ RedHat7: /usr/sbin/restorecon
+ RedHat8: /sbin/restorecon
+agent_restorecon_path: "{{ agent_restorecon_map[ ansible_distribution + ansible_distribution_major_version ] }}"
+signalfx_skip_repo: true
+
+# Per-monitor toggles; enable via a vars file (see README).
+http_enabled: false
+http_monitors: []
+
+systemd_enabled: false
+systemd_services: []
+
+signalfx_repo_base_url: https://splunk.jfrog.io/splunk
+signalfx_package_stage: release
+signalfx_version: latest
+signalfx_conf_file_path: /etc/signalfx/agent.yaml
+signalfx_service_user: signalfx-agent
+signalfx_service_group: signalfx-agent
+signalfx_service_state: started
--- /dev/null
+---
+# Reload systemd unit files after tasks edit the agent's service definition.
+- name: agent_systemd_reload
+ systemd:
+ daemon_reload: yes
+
+- name: agent_restart
+ service:
+ name: signalfx-agent
+ state: restarted
--- /dev/null
+---
+# Drop an extra-monitor file that the agent picks up via the '#from' glob
+# added to agent.yaml in tasks/main.yml.
+- name: Configure HTTP monitoring
+ template:
+ src: http.yaml.j2
+ dest: "{{ agent_extra_monitor_path }}/http.yaml"
+ owner: "{{ signalfx_service_user }}"
+ group: "{{ signalfx_service_group }}"
+ mode: 0600
+ notify: agent_restart
+
+# Writable cache dir for the agent -- presumably for the http monitor's OCSP
+# checks, per the task name; confirm against the agent docs.
+- name: Ensure OCSP cache can be created
+ file:
+ state: directory
+ path: '/usr/lib/signalfx-agent/.cache/'
+ owner: 'signalfx-agent'
+ group: 'signalfx-agent'
+ mode: '0700'
--- /dev/null
+---
+# NOTE(review): rhel_distro / ubuntu_distro lists are not defined in this
+# role's defaults shown here -- confirm where they are provided.
+- name: Import signalfx-agent deploy for CentOS or RHEL
+ import_tasks: yum_installation.yml
+ when: ansible_os_family in rhel_distro
+
+- name: Import signalfx-agent deploy for Debian or Ubuntu
+ import_tasks: ubuntu_installation.yml
+ when: ansible_os_family in ubuntu_distro
+
+- name: Set signalfx-agent service owner
+ import_tasks: service_owner.yml
+
+# signalfx_agent_config is supplied by the caller (see tasks/main.yml).
+- name: Write signalfx config
+ copy:
+ content: "{{ signalfx_agent_config | to_nice_yaml }}"
+ dest: "{{ signalfx_conf_file_path }}"
+ owner: "{{ signalfx_service_user }}"
+ group: "{{ signalfx_service_group }}"
+ mode: 0600
+
+- name: Start signalfx-agent
+ service:
+ name: signalfx-agent
+ state: "{{ signalfx_service_state }}"
+ enabled: yes
--- /dev/null
+---
+- name: Validate the variable definitions
+ assert:
+ that:
+ - basic_attributes is defined
+ - basic_attributes['appcode'] is defined
+ - access_token is defined
+ quiet: true
+
+# Fall back to default_monitors when the caller did not supply agent_monitors.
+# (Previously _agent_monitors was only set when agent_monitors was undefined,
+# leaving _agent_monitors -- used by the config write below -- undefined
+# whenever a caller DID define agent_monitors.)
+- name: Default monitors
+ set_fact:
+ _agent_monitors: "{{ agent_monitors | default(default_monitors) }}"
+
+- name: Configure SELinux for SignalFX Smart Agent
+ seboolean:
+ name: nis_enabled
+ state: yes
+ persistent: yes
+ when: ansible_distribution_major_version | int > 6
+
+- name: Create the SignalFX Smart Agent configuration directory
+ file:
+ path: "{{ access_token_path | dirname }}"
+ state: directory
+ mode: 0700
+
+# Keep the token out of agent.yaml; the agent reads it via '#from' below.
+- name: Store SignalFX access token in a separate file
+ copy:
+ dest: "{{ access_token_path }}"
+ content: "{{ access_token }}"
+ mode: 0600
+ no_log: true
+
+- name: Import the SignalFX Smart Agent role
+ import_tasks: signalfx_main.yml
+ vars:
+ signalfx_agent_config:
+ signalFxAccessToken: "{'#from': '{{ access_token_path }}'}"
+ signalFxRealm: "{{ agent_realm }}"
+ intervalSeconds: "{{ agent_interval_seconds }}"
+ globalDimensions: "{{ basic_attributes }}"
+ monitors: "{{ _agent_monitors }}"
+
+# Let drop-in files under /etc/signalfx/monitors extend the monitor list.
+- name: Include extra monitors in agent configuration
+ blockinfile:
+ path: "{{ signalfx_conf_file_path }}"
+ insertafter: 'monitors:'
+ block: |
+ - '#from': /etc/signalfx/monitors/*
+ flatten: true
+ optional: true
+
+- name: Create directory for SignalFX extra monitors
+ file:
+ path: "{{ agent_extra_monitor_path }}"
+ state: directory
+ owner: "{{ signalfx_service_user }}"
+ group: "{{ signalfx_service_group }}"
+ mode: 0700
+
+- name: Correct bundled binaries SELinux context types to work around an upstream bug
+ sefcontext:
+ target: "{{ agent_bin_path }}"
+ setype: "{{ agent_bin_setype }}"
+ state: present
+
+- name: Apply the SELinux context type to collectd
+ command: "{{ agent_restorecon_path }} -RvF {{ agent_bin_restore }}"
+
+- name: Fix the SignalFX Smart Agent service startup
+ blockinfile:
+ path: "{{ agent_systemd_config }}"
+ backup: yes
+ insertbefore: BOF
+ block: |
+ [Unit]
+ Description=SignalFX Smart Agent
+ After=network.target nss-lookup.target multi-user.target
+ notify: agent_systemd_reload
+ when: ansible_distribution_major_version | int > 6
+
+- name: Configure HTTP monitoring
+ import_tasks: http.yml
+ when: http_enabled
+
+- name: Configure Systemd services monitoring
+ import_tasks: systemd.yml
+ when: systemd_enabled
--- /dev/null
+---
+# Create the dedicated service account and make signalfx-agent run as
+# it, under both systemd and SysV init.
+- name: Create user/group
+ block:
+ # fail_key: no makes a missing key yield a null entry instead of a
+ # task failure, so the create tasks below can test for absence.
+ - name: Get groups
+ getent:
+ database: group
+ key: "{{ signalfx_service_group }}"
+ fail_key: no
+ - name: Create group
+ group:
+ name: "{{ signalfx_service_group }}"
+ system: yes
+ when: not getent_group[signalfx_service_group]
+ - name: Get users
+ getent:
+ database: passwd
+ key: "{{ signalfx_service_user }}"
+ fail_key: no
+ - name: Create user
+ user:
+ name: "{{ signalfx_service_user }}"
+ group: "{{ signalfx_service_group }}"
+ createhome: no
+ shell: /sbin/nologin
+ system: yes
+ when: not getent_passwd[signalfx_service_user]
+
+- name: Set user/group for signalfx-agent systemd service
+ block:
+ - name: Stop systemd service
+ service:
+ name: signalfx-agent
+ state: stopped
+ # Keep /run/signalfx-agent owned by the service user across reboots.
+ - name: Create tmpfile override
+ lineinfile:
+ path: /etc/tmpfiles.d/signalfx-agent.conf
+ create: yes
+ line: "D /run/signalfx-agent 0755 {{ signalfx_service_user }} {{ signalfx_service_group }} - -"
+ regexp: '^D /run/signalfx-agent .*'
+ insertafter: EOF
+ - name: Initialize tmpfile override
+ command: systemd-tmpfiles --create --remove /etc/tmpfiles.d/signalfx-agent.conf
+ - name: Create systemd override directory
+ file:
+ path: /etc/systemd/system/signalfx-agent.service.d/
+ state: directory
+ # Build the drop-in line by line so re-runs are idempotent:
+ # [Service] header first, then User= and Group= anchored after it.
+ - name: Create systemd service owner override file
+ lineinfile:
+ path: /etc/systemd/system/signalfx-agent.service.d/service-owner.conf
+ create: yes
+ line: '[Service]'
+ regexp: '^\[Service\].*'
+ insertafter: EOF
+ - name: Set systemd service owner user
+ lineinfile:
+ path: /etc/systemd/system/signalfx-agent.service.d/service-owner.conf
+ line: "User={{ signalfx_service_user }}"
+ regexp: '^User=.*'
+ insertafter: '^\[Service\].*'
+ - name: Set systemd service owner group
+ lineinfile:
+ path: /etc/systemd/system/signalfx-agent.service.d/service-owner.conf
+ line: "Group={{ signalfx_service_group }}"
+ regexp: '^Group=.*'
+ insertafter: '^User=.*'
+ - name: Reload systemd service
+ systemd:
+ daemon_reload: yes
+ when: ansible_service_mgr == 'systemd'
+
+# SysV fallback: the init script reads user/group from the defaults file.
+- name: Set user/group for signalfx-agent initd service
+ block:
+ - name: Stop initd service
+ service:
+ name: signalfx-agent
+ state: stopped
+ - name: Set initd service owner user
+ lineinfile:
+ path: /etc/default/signalfx-agent
+ create: yes
+ line: "user={{ signalfx_service_user }}"
+ regexp: '^user=.*'
+ insertafter: EOF
+ - name: Set initd service owner group
+ lineinfile:
+ path: /etc/default/signalfx-agent
+ line: "group={{ signalfx_service_group }}"
+ regexp: '^group=.*'
+ insertafter: '^user=.*'
+ when: ansible_service_mgr != 'systemd'
--- /dev/null
+---
+# Shared sanity checks for every distro, then hand off to the
+# distro-specific installation tasks.
+- name: Accepted distros
+  set_fact:
+    ubuntu_distro: ['Ubuntu']
+    rhel_distro: ['RedHat', 'Red Hat Enterprise Linux', 'CentOS', 'Amazon']
+  cacheable: true
+
+- name: Confirm if agent configuration is provided!
+  fail: msg='Please provide a populated signalfx_agent_config'
+  when: not (signalfx_agent_config | default(false))
+
+# The token must be present and non-blank.  default('') guards against
+# the key being undefined and trim catches whitespace-only values; the
+# former second clause ("or not signalfx_agent_config.signalFxAccessToken")
+# was redundant and raised an undefined-attribute error instead of
+# failing with this message when the key was missing.
+- name: Confirm if SignalFx Access Token is defined!
+  fail: msg='Please specify a signalFxAccessToken in your signalfx_agent_config'
+  when: not (signalfx_agent_config.signalFxAccessToken | default('') | trim)
+
+- name: Acceptable distribution check
+  fail:
+    msg: >
+      Failed! The target is {{ ansible_os_family }} and this role only supports {{ ubuntu_distro }} and {{ rhel_distro }}.
+  when: (ansible_os_family not in ubuntu_distro) and
+        (ansible_os_family not in rhel_distro)
+
+- name: Linux installation
+  include_tasks: linux_installation.yml
+  when: (ansible_os_family in ubuntu_distro) or (ansible_os_family in rhel_distro)
--- /dev/null
+---
+# Render the collectd/systemd monitor definition into the agent's
+# extra-monitors directory; agent_restart (a role handler) picks it up.
+- name: Configure systemd monitoring
+ template:
+ src: systemd.yaml.j2
+ dest: "{{ agent_extra_monitor_path }}/systemd.yaml"
+ owner: "{{ signalfx_service_user }}"
+ group: "{{ signalfx_service_group }}"
+ mode: 0600
+ notify: agent_restart
--- /dev/null
+---
+# Rotate the apt signing key (drop the old SignalFx key, install the
+# current Splunk one), configure the repo, then install the package.
+- name: Delete old signing key for SignalFx Agent
+  apt_key:
+    id: 91668001288D1C6D2885D651185894C15AE495F6
+    state: absent
+
+- name: Delete old signing key file for SignalFx Agent
+  file:
+    path: /etc/apt/trusted.gpg.d/signalfx.gpg
+    state: absent
+
+- name: Add an Apt signing key for Signalfx Agent
+  get_url:
+    url: "{{ sfx_repo_base_url }}/signalfx-agent-deb/splunk-B3CD4420.gpg"
+    dest: /etc/apt/trusted.gpg.d/splunk.gpg
+    mode: "0644"
+
+- name: Add Signalfx Agent repository into sources list
+  apt_repository:
+    repo: "deb {{ sfx_repo_base_url }}/signalfx-agent-deb {{ sfx_package_stage }} main"
+    filename: 'signalfx-agent'
+    # Quoted octal: the previous bare 644 was read as *decimal* 644,
+    # i.e. mode 1204 octal, not rw-r--r--.
+    mode: "0644"
+    state: present
+  when: not (sfx_skip_repo | bool)
+
+# Pin the requested version when one is given; otherwise track latest.
+# The else branch must be the literal 'latest' -- it is reached exactly
+# when sfx_version is undefined or equals "latest", so interpolating
+# sfx_version there (as before) raised an undefined-variable error.
+# policy_rc_d: 101 stops the package's postinst from starting the
+# service before our configuration is in place.
+- name: Install signalfx-agent via apt package manager
+  apt:
+    name: signalfx-agent{% if sfx_version is defined and sfx_version != "latest" %}={{ sfx_version }}{% endif %}
+    state: "{% if sfx_version is defined and sfx_version != 'latest' %}present{% else %}latest{% endif %}"
+    force: yes
+    update_cache: yes
+    policy_rc_d: 101
--- /dev/null
+---
+# NOTE(review): this file uses signalfx_* variable names while the deb
+# counterpart uses sfx_* -- confirm both sets are defined in defaults.
+- name: Delete old signing key for SignalFx Agent
+  rpm_key:
+    key: 098acf3b
+    state: absent
+
+- name: Add Signalfx Agent repo into source list
+  yum_repository:
+    name: signalfx-agent
+    description: SignalFx Agent Repository
+    baseurl: "{{ signalfx_repo_base_url }}/signalfx-agent-rpm/{{ signalfx_package_stage }}"
+    gpgkey: "{{ signalfx_repo_base_url }}/signalfx-agent-rpm/splunk-B3CD4420.pub"
+    gpgcheck: yes
+    enabled: yes
+  when: not (signalfx_skip_repo | bool)
+
+# Pin the requested version when one is given; otherwise track latest.
+# The else branch must be the literal 'latest' -- it is reached exactly
+# when signalfx_version is undefined or equals "latest", so
+# interpolating signalfx_version there (as before) raised an
+# undefined-variable error when the variable was unset.
+- name: Install signalfx-agent via yum package manager
+  yum:
+    name: signalfx-agent{% if signalfx_version is defined and signalfx_version != "latest" %}-{{ signalfx_version }}{% endif %}
+    state: "{% if signalfx_version is defined and signalfx_version != 'latest' %}present{% else %}latest{% endif %}"
+    allow_downgrade: yes
+    update_cache: yes
--- /dev/null
+{#
+  Render one http monitor entry per element of http_monitors.
+  Required-with-default keys are always emitted; optional keys only
+  when defined.  NOTE(review): scalar values are emitted unquoted, so
+  a value containing YAML specials (': ', ' #') would corrupt the
+  rendered file -- assumed safe for the values used here; confirm.
+#}
+{% for http_monitor in http_monitors %}
+- type: http
+ host: {{ http_monitor.host | default(ansible_fqdn) }}
+ port: {{ http_monitor.port | default(443) }}
+ path: {{ http_monitor.path | default('/')}}
+ httpTimeout: {{ http_monitor.http_timeout | default('5s') }}
+ useHTTPS: {{ http_monitor.use_https | default(true) }}
+ skipVerify: {{ http_monitor.skip_verify | default(false) }}
+ noRedirects: {{ http_monitor.no_redirects | default(false) }}
+ method: {{ http_monitor.method | default('GET') }}
+ desiredCode: {{ http_monitor.desired_code | default(200)}}
+ addRedirectURL: {{ http_monitor.add_redirect_url | default(false) }}
+{% if http_monitor.username is defined %}
+ username: {{ http_monitor.username }}
+{% endif %}
+{% if http_monitor.password is defined %}
+ password: {{ http_monitor.password }}
+{% endif %}
+{% if http_monitor.http_headers is defined %}
+ httpHeaders: {{ http_monitor.http_headers }}
+{% endif %}
+{% if http_monitor.ca_cert_path is defined %}
+ caCertPath: {{ http_monitor.ca_cert_path }}
+{% endif %}
+{% if http_monitor.client_cert_path is defined %}
+ clientCertPath: {{ http_monitor.client_cert_path }}
+{% endif %}
+{% if http_monitor.client_key_path is defined %}
+ clientKeyPath: {{ http_monitor.client_key_path }}
+{% endif %}
+{% if http_monitor.request_body is defined %}
+ requestBody: {{ http_monitor.request_body }}
+{% endif %}
+{% if http_monitor.regex is defined %}
+ regex: {{ http_monitor.regex }}
+{% endif %}
+{% if http_monitor.extra_dimensions is defined %}
+ extraDimensions:
+{{ http_monitor.extra_dimensions | to_nice_yaml | indent(4, True) }}
+{% endif %}
+
+{% endfor %}
--- /dev/null
+{#
+  Render the collectd/systemd monitor: the watched service list plus
+  optional state-reporting flags and extra metrics.
+#}
+- type: collectd/systemd
+ services:
+{% for service in systemd_services %}
+ - {{ service }}
+{% endfor %}
+{% if systemd_sendactivestate is defined %}
+ sendActiveState: {{ systemd_sendactivestate }}
+{% endif %}
+{% if systemd_sendsubstate is defined %}
+ sendSubState: {{ systemd_sendsubstate }}
+{% endif %}
+{% if systemd_sendloadstate is defined %}
+ sendLoadState: {{ systemd_sendloadstate }}
+{% endif %}
+{% if systemd_extrametrics is defined and systemd_extrametrics | length > 0 %}
+ extraMetrics:
+{% for metric in systemd_extrametrics %}
+ - {{ metric }}
+{% endfor %}
+{% endif %}
--- /dev/null
+---
+# Systemd drop-in created/edited by service_owner.yml and main.yml.
+agent_systemd_config: /etc/systemd/system/signalfx-agent.service.d/service-owner.conf
+# The access token is stored here, outside the main config file.
+access_token_path: /etc/signalfx/token
+# Drop-in directory merged into the config via a '#from' glob.
+agent_extra_monitor_path: /etc/signalfx/monitors
+# Monitors used when the caller supplies no agent_monitors.
+default_monitors:
+ - type: cpu
+ extraMetrics:
+ - cpu.user
+ - cpu.wait
+ - cpu.system
+ - cpu.steal
+ - type: filesystems
+ - type: disk-io
+ - type: net-io
+ - type: load
+ - type: memory
+ - type: vmem
+ - type: host-metadata
+ - type: processlist
+# Regex target pattern handed to sefcontext.
+agent_bin_path: '/usr/lib/signalfx-agent/bin(/.*)'
+# Literal path passed to restorecon.
+agent_bin_restore: '/usr/lib/signalfx-agent/bin'
+agent_bin_setype: bin_t
--- /dev/null
+Testnode
+========
+
+This role is used to configure a node for ceph testing using teuthology_ and ceph-qa-suite_.
+It will manage the necessary groups, users and configuration needed for teuthology to connect to and use the node.
+It also installs a number of packages needed for tasks in ceph-qa-suite and teuthology.
+
+The following distros are supported:
+
+- RHEL 6.X
+- RHEL 7.X
+- CentOS 6.x
+- CentOS 7.x
+- Fedora 20
+- Debian Wheezy
+- Ubuntu Precise
+- Ubuntu Trusty
+- Ubuntu Vivid
+
+**NOTE:** This role was first created as a port of ceph-qa-chef_.
+
+Usage
++++++
+
+The testnode role is primarily used by the ``testnodes.yml`` playbook. This playbook is run by cobbler during
+bare-metal imaging to prepare a node for testing and is also used by teuthology during test runs to ensure the config
+is correct before testing.
+
+**NOTE:** ``testnodes.yml`` is limited to run against hosts in the ``testnodes`` group by the ``hosts`` key in the playbook.
+
+Variables
++++++++++
+
+Available variables are listed below, along with default values (see ``roles/testnode/defaults/main.yml``). The ``testnode`` role
+also allows for variables to be defined per package type (apt, yum), distro, distro major version and distro version.
+These overrides are included by ``tasks/vars.yml`` and the specific var files live in ``vars/``.
+
+The host to use as a package mirror::
+
+ mirror_host: apt-mirror.sepia.ceph.com
+
+The host to use as a github mirror::
+
+ git_mirror_host: git.ceph.com
+
+The host to find package-signing keys on (at https://{{key_host}}/keys/{release,autobuild}.asc)::
+
+ key_host: download.ceph.com
+
+This host is used by teuthology to download ceph packages and will be given higher priority on apt systems::
+
+ gitbuilder_host: gitbuilder.ceph.com
+
+The mirror to download and install ``pip`` from::
+
+ pip_mirror_url: "http://{{ mirror_host }}/pypi/simple"
+
+A hash defining yum repos that would be common across a major version. Each key in the hash represents
+the filename of a yum repo created in /etc/yum.repos.d. The key/value pairs as the value for that repo
+will be used as the properties for the repo file::
+
+ common_yum_repos: {}
+
+ # An example:
+ common_yum_repos:
+ rhel-7-fcgi-ceph:
+ name: "RHEL 7 Local fastcgi Repo"
+ baseurl: http://gitbuilder.ceph.com/mod_fastcgi-rpm-rhel7-x86_64-basic/ref/master/
+ enabled: 1
+ gpgcheck: 0
+ priority: 2
+
+A hash defining version-specific yum repos. Each key in the hash represents
+the filename of a yum repo created in /etc/yum.repos.d. The key/value pairs as the value for that repo
+will be used as the properties for the repo file::
+
+ yum_repos: {}
+
+ # An example:
+ yum_repos:
+ fedora-fcgi-ceph:
+ name: Fedora Local fastcgi Repo
+ baseurl: http://gitbuilder.ceph.com/mod_fastcgi-rpm-fedora20-x86_64-basic/ref/master/
+ enabled: 1
+ gpgcheck: 0
+ priority: 0
+
+Another dictionary of yum repos to put in place. We have this dictionary defined in the Octo lab secrets repo because it contains
+devel repos whose baseurls we don't want to expose. This dict gets combined with ``yum_repos`` in ``roles/testnode/tasks/yum/repos.yml``::
+
+ additional_yum_repos: {}
+
+ # An example:
+ additional_yum_repos:
+ devel-ceph-repo:
+ name: This is a repo with devel packages
+ baseurl: http://some/private/repo/
+ enabled: 0
+ gpgcheck: 0
+
+A list of copr repos to enable using ``dnf copr enable``::
+
+ copr_repos: []
+
+ # An example:
+ copr_repos:
+ - ktdreyer/ceph-el8
+
+A list of mirrorlist template **filenames** to upload to ``/etc/yum.repos.d/``.
+Mirrorlist templates should live in ``roles/testnode/vars/mirrorlists/{{ ansible_distribution_major_version }}/``
+We were already doing this with epel mirrorlists in the ``common`` role but started seeing metalink issues with CentOS repos::
+
+ yum_mirrorlists: []
+
+ # Example:
+ yum_mirrorlists:
+ - CentOS-AppStream-mirrorlist
+
+ $ cat roles/testnode/templates/mirrorlists/8/CentOS-AppStream-mirrorlist
+ # {{ ansible_managed }}
+ https://download-cc-rdu01.fedoraproject.org/pub/centos/{{ ansible_lsb.release }}/AppStream/x86_64/os/
+ https://path/to/another/mirror
+
+
+A list defining apt repos that would be common across a major version or distro. Each item in the list represents
+an apt repo to be added to sources.list::
+
+ common_apt_repos: []
+
+ # An Example:
+ common_apt_repos:
+ # mod_fastcgi for radosgw
+ - "deb http://gitbuilder.ceph.com/libapache-mod-fastcgi-deb-{{ansible_distribution_release}}-x86_64-basic/ref/master/ {{ansible_distribution_release}} main"
+
+A list defining version-specific apt repos. Each item in the list represents an apt repo to be added to sources.list::
+
+ apt_repos: []
+
+A list of packages to install that is specific to a distro version. These lists are defined in the var files in ``vars/``::
+
+ packages: []
+
+A list of packages to install that are common to a distro or distro version. These lists are defined in the var files in ``vars/``::
+
+ common_packages: []
+
+A list of packages that must be installed from epel. These packages are installed with the epel repo explicitly enabled for any
+yum-based distro that provides the list in their var file in ``/vars``::
+
+ epel_packages: []
+
+**NOTE:** A good example of how ``packages`` and ``common_packages`` work together is with Ubuntu. The var file ``roles/testnode/vars/ubuntu.yml`` defines
+a number of packages in ``common_packages`` that need to be installed across all versions of ubuntu, while the version-specific files
+(for example, ``roles/testnode/vars/ubuntu_14.yml``) define packages in ``packages`` that either have varying names across versions or are only needed
+for that specific version. This is the same idea behind the vars that control apt and yum repos as well.
+
+A list of ceph packages to remove. It's safe to add packages to this list that aren't currently installed or don't exist. Both ``apt-get`` and ``yum``
+handle this case correctly. This list is defined in ``vars/apt_systems.yml`` and ``vars/yum_systems.yml``::
+
+ ceph_packages_to_remove: []
+
+A list of packages to remove. These lists are defined in the var files in ``vars/``::
+
+ packages_to_remove: []
+
+A list of packages to upgrade. These lists are defined in the vars files in ``vars/``::
+
+ packages_to_upgrade: []
+
+A list of packages to install via ``apt install --no-install-recommends``::
+
+ no_recommended_packages: []
+
+A list of packages to install via pip. These lists are defined in the vars files in ``vars/``::
+
+ pip_packages_to_install: []
+
+The user that teuthology will use to connect to testnodes. This user will be created by this role and assigned to the appropriate groups.
+Even though this variable exists, teuthology is not quite ready to support a configurable user::
+
+ teuthology_user: "ubuntu"
+
+This user is created for use in running xfstests from ceph-qa-suite::
+
+ xfstests_user: "fsgqa"
+
+This will control whether or not rpcbind is started before nfs. Some distros require this, others don't::
+
+ start_rpcbind: true
+
+Set to true if /etc/fstab must be modified to persist things like mount options, which is useful for long-lived
+bare-metal machines, less useful for virtual machines that are re-imaged before each job::
+
+ modify_fstab: true
+
+A list of ntp servers to use::
+
+ ntp_servers:
+ - 0.us.pool.ntp.org
+ - 1.us.pool.ntp.org
+ - 2.us.pool.ntp.org
+ - 3.us.pool.ntp.org
+
+The lab domain to use when populating systems in cobbler. (See ``roles/cobbler_systems/tasks/populate_systems.yml``)
+This variable is also used to strip the domain from RHEL and CentOS testnode hostnames
+The latter is only done if ``lab_domain`` is defined::
+
+ lab_domain: ''
+
+A dictionary of drives/devices you want to partition. ``scratch_devs`` is not required. All other values are self-explanatory given this example::
+
+ # Example:
+ drives_to_partition:
+ nvme0n1:
+ device: "/dev/nvme0n1"
+ unit: "GB"
+ sizes:
+ - "0 95"
+ - "95 190"
+ - "190 285"
+ - "285 380"
+ - "380 400"
+ scratch_devs:
+ - p1
+ - p2
+ - p3
+ - p4
+ sdb:
+ device: "/dev/sdb"
+ unit: "%"
+ sizes:
+ - "0 50"
+ - "50 100"
+ scratch_devs:
+ - 2
+
+An optional dictionary of filesystems you want created and where to mount them. (You must use a ``drives_to_partition`` or ``logical_volumes`` dictionary to carve up drives first.) Example::
+
+ filesystems:
+ varfoo:
+ device: "/dev/nvme0n1p5"
+ fstype: ext4
+ mountpoint: "/var/lib/foo"
+ fscache:
+ device: "/dev/nvme0n1p6"
+ fstype: xfs
+ mountpoint: "/var/cache/fscache"
+
+A dictionary of volume groups you want created. ``pvs`` should be a comma-delimited list. Example::
+
+ volume_groups:
+ vg_nvme:
+ pvs: "/dev/nvme0n1"
+ vg_hdd:
+ pvs: "/dev/sdb,/dev/sdc"
+
+A dictionary of logical volumes you want created. See Ansible's docs_ on available sizing options. The ``vg`` value is the volume group you want the logical volume created on. Define ``scratch_dev`` if you want it added to ``/scratch_devices`` on the testnode::
+
+ logical_volumes:
+ lv_1:
+ vg: vg_nvme
+ size: "25%VG"
+ scratch_dev: true
+ lv_2:
+ vg: vg_nvme
+ size: "75%VG"
+ scratch_dev: true
+ lv_foo:
+ vg: vg_hdd
+ size: "100%VG"
+
+Setting ``quick_lvs_to_create`` will:
+
+ #. Create one large volume group using all non-root devices listed in ``ansible_devices``
+ #. Create X number of logical volumes equal in size
+
+ Defining this variable will override ``volume_groups`` and ``logical_volumes`` dicts if defined in secrets::
+
+ # Example would create 4 logical volumes each using 25% of a volume group created using all non-root physical volumes
+ quick_lvs_to_create: 4
+
+Define ``check_for_nvme: true`` in Ansible inventory group_vars (by machine type) if the testnode should have an NVMe device. This will include a few tasks to verify an NVMe device is present. If the drive is missing, the tasks will mark the testnode down in the paddles_ lock database so the node doesn't repeatedly fail jobs. Defaults to false::
+
+ check_for_nvme: false
+
+Downstream QE requested ABRT be configured in a certain way. Overridden in Octo secrets::
+
+ configure_abrt: false
+
+Configure ``cachefilesd``. See https://tracker.ceph.com/issues/6373. Defaults to ``false``::
+
+ configure_cachefilesd: true
+
+ # Optionally override any of the following variables to change their
+ # corresponding values in /etc/cachefilesd.conf
+ cachefilesd_dir
+ cachefilesd_tag
+ cachefilesd_brun
+ cachefilesd_bcull
+ cachefilesd_bstop
+ cachefilesd_frun
+ cachefilesd_fcull
+ cachefilesd_fstop
+ cachefilesd_secctx
+
+Tags
+++++
+
+Available tags are listed below:
+
+cachefilesd
+ Install and configure cachefilesd.
+
+cpan
+ Install and configure cpan and Amazon::S3.
+
+filesystems
+ Create and mount filesystems.
+
+gpg-keys
+ Install gpg keys on Fedora.
+
+hostname
+ Check and set proper fqdn. See ``roles/testnode/tasks/set_hostname.yml``.
+
+lvm
+ Configures logical volumes if dicts are defined in the secrets repo.
+
+nfs
+ Install and start nfs.
+
+ntp-client
+ Setup ntp.
+
+packages
+ Install, update and remove packages.
+
+partition
+ Partition any drives/devices if ``drives_to_partition`` is defined in secrets.
+
+pip
+ Install and configure pip.
+
+pubkeys
+ Adds the ssh public keys for the ``teuthology_user``.
+
+remove-ceph
+ Ensure all ceph related packages are removed. See ``packages_to_remove`` in the distros var file for the list.
+
+repos
+ Perform all repo related tasks. Creates and manages our custom repo files.
+
+selinux
+ Configure selinux on yum systems.
+
+ssh
+ Manage things ssh related. Will upload the distro specific sshd_config, ssh_config and addition of pubkeys for the ``teuthology_user``.
+
+sudoers
+ Manage the /etc/sudoers and the nagios sudoers.d files.
+
+user
+ Manages the ``teuthology_user`` and ``xfstests_user``.
+
+zap
+ Zap (``sgdisk -Z``) all non-root drives and **all** logical volumes and volume groups.
+
+Dependencies
+++++++++++++
+
+This role depends on the following roles:
+
+secrets
+ Provides a var, ``secrets_path``, containing the path of the secrets repository, a tree of ansible variable files.
+
+sudo
+ Sets ``ansible_sudo: true`` for this role which causes all the plays in this role to execute with sudo.
+
+To Do
++++++
+
+- Noop creating custom repos if ``mirror_host`` is not defined. Change the default to ``mirror_host: ''`` and skip
+ creating custom repo files if a mirror is not needed for that specific distro. This is currently hacked in for Vivid.
+
+.. _ceph-qa-chef: https://github.com/ceph/ceph-qa-chef
+.. _teuthology: https://github.com/ceph/teuthology
+.. _ceph-qa-suite: https://github.com/ceph/ceph-qa-suite
+.. _docs: https://docs.ansible.com/ansible/latest/lvol_module.html
+.. _paddles: https://github.com/ceph/paddles
--- /dev/null
+---
+# Defaults for the testnode role; see the role README for full
+# descriptions.  Most lists are overridden per distro in vars/.
+mirror_host: apt-mirror.sepia.ceph.com
+git_mirror_host: git.ceph.com
+key_host: download.ceph.com
+gitbuilder_host: gitbuilder.ceph.com
+pip_mirror_url: "http://{{ mirror_host }}/pypi/simple"
+
+# yum repos common to a major version or distro
+common_yum_repos: {}
+
+# version-specific yum repos, defined in the version specific var file
+yum_repos: {}
+
+# list of copr repo *names* to enable (e.g., user/repo)
+copr_repos: []
+
+# apt repos common to a major version or distro
+common_apt_repos: []
+
+# version-specific apt repos, defined in the version-specific var files
+apt_repos: []
+
+# packages to install, see common_packages below as well. The set of packages to install
+# is packages + common_packages
+packages: []
+
+# a list of packages that have to be installed from epel
+epel_packages: []
+
+# packages common to a major version, distro or package type. This means that they
+# have the same name and are intended to be installed for all other versions in that major
+# version, distro or package type
+common_packages: []
+
+# common packages that aren't available in aarch64 architecture
+non_aarch64_packages: []
+non_aarch64_packages_to_upgrade: []
+non_aarch64_common_packages: []
+
+# packages used by ceph we want to ensure are removed
+ceph_packages_to_remove: []
+ceph_dependency_packages_to_remove: []
+packages_to_remove: []
+packages_to_upgrade: []
+
+# the user teuthology will use
+teuthology_user: "ubuntu"
+xfstests_user: "fsgqa"
+
+# some distros need to start rpcbind before
+# trying to use nfs while others don't.
+start_rpcbind: true
+
+# set to true if /etc/fstab must be modified to persist things like
+# mount options, which is useful for long lived bare metal machines,
+# less useful for virtual machines that are re-imaged before each job
+modify_fstab: true
+
+# used to ensure proper full and short fqdn on testnodes
+lab_domain: ""
+
+ntp_servers:
+ - 0.us.pool.ntp.org
+ - 1.us.pool.ntp.org
+ - 2.us.pool.ntp.org
+ - 3.us.pool.ntp.org
+
+# Set to true in group_vars if the testnode/machine type should have an NVMe device
+check_for_nvme: false
+
+# packages to install via pip
+pip_packages_to_install: []
+
+# Configure ABRT (probably only for downstream use)
+configure_abrt: false
+
+# Configure cachefilesd (https://tracker.ceph.com/issues/6373)
+configure_cachefilesd: false
+
+# Is this a containerized testnode?
+containerized_node: false
--- /dev/null
+---
+# Handlers for the testnode role.  Each service handler tolerates
+# failures on hosts running our custom-built 'ceph' kernels, where the
+# service/systemd modules in ansible<=2.9 cannot drive init scripts.
+- name: restart ntp
+ service:
+ name: "{{ ntp_service_name }}"
+ state: restarted
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
+
+- name: restart ssh
+ service:
+ name: "{{ ssh_service_name }}"
+ state: restarted
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
+
+- name: start rpcbind
+ service:
+ name: rpcbind
+ state: started
+ enabled: yes
+ when: start_rpcbind
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
+
+- name: restart nfs-server
+ service:
+ name: "{{ nfs_service }}"
+ state: restarted
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
+
+- name: restart cron
+ service:
+ name: cron
+ state: restarted
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
--- /dev/null
+---
+# The secrets role provides secrets_path, the root of the ansible
+# variable-file tree this role reads.
+dependencies:
+ - role: secrets
--- /dev/null
+---
+# Package management for apt systems.  Every task is guarded so apt is
+# never invoked with an empty package list.
+- name: Ensure packages are not present.
+  apt:
+    name: "{{ ceph_packages_to_remove|list + packages_to_remove|list }}"
+    state: absent
+    force: yes
+  when: ceph_packages_to_remove|length > 0 or
+        packages_to_remove|length > 0
+
+- name: Upgrade packages
+  apt:
+    name: "{{ packages_to_upgrade|list }}"
+    state: latest
+    force: yes
+  when: packages_to_upgrade|length > 0
+
+- name: Upgrade non aarch64 packages
+  apt:
+    name: "{{ non_aarch64_packages_to_upgrade|list }}"
+    state: latest
+    force: yes
+  when:
+    non_aarch64_packages_to_upgrade|length > 0 and
+    ansible_architecture != "aarch64"
+
+- name: Install packages
+  apt:
+    name: "{{ packages|list + common_packages|list }}"
+    state: present
+    force: yes
+  when: packages|length > 0 or
+        common_packages|length > 0
+
+# Empty-list guard added for consistency with the sibling tasks above;
+# previously this task ran even when both lists were empty.
+- name: Install non aarch64 packages
+  apt:
+    name: "{{ non_aarch64_packages|list + non_aarch64_common_packages|list }}"
+    state: present
+    force: yes
+  when:
+    - ansible_architecture != "aarch64"
+    - non_aarch64_packages|length > 0 or non_aarch64_common_packages|length > 0
+
+- name: Install packages with --no-install-recommends
+  apt:
+    name: "{{ no_recommended_packages|list }}"
+    state: present
+    install_recommends: no
+  when: no_recommended_packages|length > 0
--- /dev/null
+---
+# Check for and remove custom repos.
+# http://tracker.ceph.com/issues/12794
+- name: Check for custom repos
+ shell: "ls -1 /etc/apt/sources.list.d/"
+ register: custom_repos
+ changed_when: false
+
+- name: Remove custom repos
+ file: path=/etc/apt/sources.list.d/{{ item }} state=absent
+ with_items: "{{ custom_repos.stdout_lines|default([]) }}"
+ # Ignore changes here because we will be removing repos that we end up re-adding later
+ changed_when: false
+
+# Pin ceph packages to our gitbuilder host (template in apt/ceph.pref).
+- name: Set apt preferences
+ template:
+ dest: "/etc/apt/preferences.d/ceph.pref"
+ src: "apt/ceph.pref"
+ owner: root
+ group: root
+ mode: 0644
+ register: apt_prefs
+
+# Starting with ubuntu 15.04 we no longer maintain our own package mirrors.
+# For anything ubuntu < 15.04 or debian <=7 we still do.
+# NOTE(review): the < 15 test below is applied to every distro's major
+# version, so it also matches debian 8-14, not just <=7 -- confirm
+# that is intended for the debian case.
+- name: Add sources list
+ template:
+ dest: /etc/apt/sources.list
+ src: "apt/sources.list.{{ ansible_distribution_release | lower }}"
+ owner: root
+ group: root
+ mode: 0644
+ register: sources
+ when: ansible_architecture != "aarch64" and
+ ansible_distribution_major_version|int < 15
+
+- name: Install apt keys
+ apt_key:
+ url: "{{ item }}"
+ state: present
+ with_items:
+ - "http://{{ key_host }}/keys/autobuild.asc"
+ - "http://{{ key_host }}/keys/release.asc"
+ # try for 2 minutes before failing
+ # NOTE(review): retries without an 'until' condition is ignored by
+ # some Ansible versions -- confirm this actually retries.
+ retries: 24
+ delay: 5
+
+# required for apt_repository
+- name: Install python-apt
+ apt:
+ name: "{{ python_apt_package_name|default('python-apt') }}"
+ state: present
+
+- name: Add local apt repos.
+ apt_repository:
+ repo: "{{ item }}"
+ state: present
+ update_cache: no
+ mode: 0644
+ with_items: "{{ apt_repos|list + common_apt_repos|list }}"
+ register: local_apt_repos
+ when: ansible_architecture != "aarch64"
--- /dev/null
+---
+# Main task flow for apt-based testnodes: repos, cache, packages, then
+# host configuration (fstab xattr, groups, fuse, static IP, apache).
+- name: Setup local repo files.
+ import_tasks: apt/repos.yml
+ tags:
+ - repos
+
+# http://tracker.ceph.com/issues/15090
+# We don't know why it's happening, but something is corrupting the
+# apt-cache. Let's try just blasting it each time.
+# NOTE(review): 'sudo' inside the command assumes passwordless sudo on
+# the target; the rest of the role relies on privilege escalation via
+# the sudo role dependency -- confirm before changing.
+- name: Blast the apt cache.
+ command:
+ sudo apt-get clean
+
+- name: Update apt cache.
+ apt:
+ update_cache: yes
+ # try for 2 minutes before failing
+ # NOTE(review): retries without an 'until' condition is ignored by
+ # some Ansible versions -- confirm this actually retries.
+ retries: 24
+ delay: 5
+ tags:
+ - repos
+ - packages
+
+- name: Perform package related tasks.
+ import_tasks: apt/packages.yml
+ tags:
+ - packages
+
+# This was ported directly from chef. I was unable to figure out a better way
+# to do this, but it seems to just be adding the user_xattr option to the root filesystem mount.
+# I believe perl was used here initially because the mount resources provided by chef and ansible
+# require both the name (i.e. /) and the source (UUID="<some_uuid>") to ensure it's editing the correct line
+# in /etc/fstab. This won't work for us because the root file system source (UUID or label) is different depending
+# on the image used to create this node (downburst and cobbler use different images).
+# Idempotence comes from 'creates:': perl -i.bak writes /etc/fstab.bak,
+# and the task is skipped once that backup exists.
+- name: Use perl to add user_xattr to the root mount options in fstab.
+ command:
+ perl -pe 'if (m{^([^#]\S*\s+/\s+\S+\s+)(\S+)(\s+.*)$}) { $_="$1$2,user_xattr$3\n" unless $2=~m{(^|,)user_xattr(,|$)}; }' -i.bak /etc/fstab
+ args:
+ creates: /etc/fstab.bak
+ register: add_user_xattr
+ when:
+ - modify_fstab == true
+ - not containerized_node
+
+# Apply the new mount option immediately, but only on the run that
+# actually edited fstab.
+- name: Enable xattr for this boot.
+ command:
+ mount -o remount,user_xattr /
+ when: add_user_xattr is defined and
+ add_user_xattr is changed
+
+- name: Ensure fuse, kvm and disk groups exist.
+ group:
+ name: "{{ item }}"
+ state: present
+ with_items:
+ - fuse
+ - kvm
+ - disk
+
+- name: Upload /etc/fuse.conf.
+ template:
+ src: fuse.conf
+ dest: /etc/fuse.conf
+ owner: root
+ group: fuse
+ mode: 0644
+
+- name: Add teuthology user to groups fuse, kvm and disk.
+ user:
+ name: "{{ teuthology_user }}"
+ # group sets the primary group, while groups just adds
+ # the user to the specified group or groups.
+ groups: fuse,kvm,disk
+ append: yes
+
+# Static networking is skipped for VPS and containerized nodes.
+- import_tasks: static_ip.yml
+ when:
+ - "'vps' not in group_names"
+ - not containerized_node
+
+- name: Stop apache2
+ service:
+ name: apache2
+ state: stopped
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
+ when:
+ - not containerized_node
+---
+# Install cachefilesd (the FS-Cache userspace daemon), template its config,
+# and restart it so the new /etc/cachefilesd.conf takes effect.
+- name: Install cachefilesd
+ package:
+ name: cachefilesd
+ state: latest
+
+- name: Install cachefilesd conf file
+ template:
+ src: cachefilesd.j2
+ dest: /etc/cachefilesd.conf
+
+# Restart unconditionally so config changes always take effect. On the
+# custom "ceph" kernels the service module may fail (see inline note), so
+# errors are ignored here and the shell fallback task below does the restart.
+- name: Restart cachefilesd
+ service:
+ name: cachefilesd
+ state: restarted
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
+
+# Fallback for the case above: drive systemctl directly on ceph kernels.
+- name: Restart cachefilesd
+ shell: systemctl restart cachefilesd
+ when: "'ceph' in ansible_kernel"
--- /dev/null
+---
+# NVMe cards have started failing frequently. These tasks will mark a
+# system down in the paddles DB so it doesn't repeatedly fail jobs if the device is missing.
+# https://wiki.sepia.ceph.com/doku.php?id=hardware:smithi&#nvme_failure_tracking
+# These tasks can also be used by a few machine types in Octo
+
+# Default to false
+- set_fact:
+ nvme_card_present: false
+
+# ansible_devices keys are kernel device names (e.g. nvme0n1); any name
+# containing "nvme" flips the fact to true.
+- name: Check for NVMe drive
+ set_fact:
+ nvme_card_present: true
+ with_items: "{{ ansible_devices }}"
+ when: "'nvme' in item"
+
+# local_action runs on the control host, which is where teuthology-lock
+# would be installed (not on the test node itself).
+- name: Check for teuthology-lock command
+ local_action: shell which teuthology-lock
+ register: teuthology_lock
+ ignore_errors: true
+ become: false
+
+# teuthology_lock.stdout holds the full path that `which` printed above.
+- name: Mark system down if NVMe card missing
+ local_action: "shell {{ teuthology_lock.stdout }} --update --status down {{ inventory_hostname }}"
+ become: false
+ when:
+ - teuthology_lock.rc == 0
+ - nvme_card_present == false
+
+- name: Update description in paddles lock DB if NVMe card missing
+ local_action: "shell {{ teuthology_lock.stdout }} --update --desc 'Marked down by ceph-cm-ansible due to missing NVMe card {{ ansible_date_time.iso8601 }}' {{ inventory_hostname }}"
+ become: false
+ when:
+ - teuthology_lock.rc == 0
+ - nvme_card_present == false
+
+# Hard-stop the play for this host; runs even when teuthology-lock was not
+# found, so a missing card always aborts the run.
+- name: Fail rest of playbook due to missing NVMe card
+ fail:
+ msg: "Failing rest of playbook due to missing NVMe card"
+ when:
+ - nvme_card_present == false
--- /dev/null
+---
+# Older versions of cloud-init are not writing to a file needed
+# to keep hostname across reboots on a non-centos/rhel kernel.
+# Idempotently pin a HOSTNAME= entry, replacing any existing HOSTNAME line.
+- name: Include hostname in /etc/sysconfig/network
+ lineinfile:
+ dest: /etc/sysconfig/network
+ line: "HOSTNAME={{ ansible_hostname }}"
+ regexp: "^HOSTNAME=*"
--- /dev/null
+---
+# Build a single VG spanning all non-root disks and split it into
+# quick_lvs_to_create equally sized LVs. Every task is gated on
+# quick_lvs_to_create (or facts derived from it), so the file is a no-op
+# unless the inventory sets that var.
+- name: Set root disk
+ set_fact:
+ root_disk: "{{ item.device|regex_replace('[0-9]+', '')|regex_replace('/dev/', '') }}"
+ with_items: "{{ ansible_mounts }}"
+ when:
+ - item.mount == '/'
+ - quick_lvs_to_create is defined
+
+# Everything in ansible_devices except the root disk, loop/ram devices and
+# device-mapper nodes becomes a PV; result is a "/dev/a,/dev/b" string.
+- name: Combine list of non-root disks
+ set_fact:
+ disks_for_vg: "{{ ansible_devices.keys() | sort | reject('match',root_disk) | reject('match','loop') | reject('match','ram') | reject('match','dm-') | map('regex_replace','^','/dev/') | join(',') }}"
+ when: quick_lvs_to_create is defined
+
+# Name the VG after the media type so scratch devices are self-describing.
+- set_fact: vg_name=vg_hdd
+ when:
+ - disks_for_vg is defined
+ - "'nvme' not in disks_for_vg"
+
+- set_fact: vg_name=vg_nvme
+ when:
+ - disks_for_vg is defined
+ - "'nvme' in disks_for_vg"
+
+- name: Create volume_groups dict
+ set_fact:
+ volume_groups:
+ "{'{{ vg_name }}': {'pvs': '{{ disks_for_vg }}' }}"
+ when: vg_name is defined
+
+# This isn't perfect but with the |int at the end, this'll just round down
+# if quick_lvs_to_create won't divide evenly to make sure the VG doesn't run out of space
+- name: Determine desired logical volume percentage size
+ set_fact:
+ quick_lv_size: "{{ (100 / quick_lvs_to_create|int)|int }}"
+ when: quick_lvs_to_create is defined
+
+# The inline Jinja loop renders a dict literal of lv_1..lv_N, each an equal
+# percentage of the VG and flagged scratch_dev for the /scratch_devs task.
+- name: Create logical_volumes dict
+ set_fact:
+ logical_volumes:
+ "{
+ {%- for lv in range(quick_lvs_to_create|int) -%}
+ 'lv_{{ lv + 1 }}':
+ {
+ 'vg': '{{ vg_name }}',
+ 'size': '{{ quick_lv_size }}%VG',
+ 'scratch_dev': true
+ }
+ {%- if not loop.last -%}
+ ,
+ {%- endif -%}
+ {%- endfor -%}
+ }"
+ when: quick_lvs_to_create is defined
+
+- name: "Create volume group(s)"
+ lvg:
+ vg: "{{ item.key }}"
+ pvs: "{{ item.value.pvs }}"
+ state: present
+ with_dict: "{{ volume_groups }}"
+ when: volume_groups is defined
+
+- name: "Create logical volume(s)"
+ lvol:
+ vg: "{{ item.value.vg }}"
+ lv: "{{ item.key }}"
+ size: "{{ item.value.size }}"
+ with_dict: "{{ logical_volumes }}"
+ when: logical_volumes is defined
+
+- name: "Erase /scratch_devs so we know it's accurate"
+ file:
+ path: /scratch_devs
+ state: absent
+
+# One LV path per line; presumably consumed by test tooling to find scratch
+# space — verify against callers before changing the format.
+- name: "Write /scratch_devs"
+ lineinfile:
+ dest: /scratch_devs
+ create: yes
+ owner: root
+ group: root
+ mode: 0644
+ line: "/dev/{{ item.value.vg }}/{{ item.key }}"
+ with_dict: "{{ logical_volumes }}"
+ when:
+ - logical_volumes is defined
+ - item.value.scratch_dev is defined
--- /dev/null
+---
+# Configure CPAN for both the teuthology user and root, then ensure the
+# Amazon::S3 perl module is installed.
+- name: Add CPAN config directory for the teuthology user.
+ file:
+ path: "/home/{{ teuthology_user }}/.cpan/CPAN/"
+ owner: "{{ teuthology_user }}"
+ group: "{{ teuthology_user }}"
+ mode: 0755
+ recurse: yes
+ state: directory
+
+- name: Add CPAN config directory for the root user.
+ file:
+ path: /root/.cpan/CPAN/
+ owner: root
+ group: root
+ mode: 0755
+ recurse: yes
+ state: directory
+
+# The same template is rendered for both users (and again for root below).
+- name: Upload CPAN config for the teuthology user.
+ template:
+ src: cpan_config.pm
+ dest: "/home/{{ teuthology_user }}/.cpan/CPAN/MyConfig.pm"
+ owner: "{{ teuthology_user }}"
+ group: "{{ teuthology_user }}"
+ mode: 0755
+
+- name: Upload CPAN config for root.
+ template:
+ src: cpan_config.pm
+ dest: /root/.cpan/CPAN/MyConfig.pm
+ owner: root
+ group: root
+ mode: 0755
+
+- name: Ensure perl-doc and cpanminus is installed on apt systems.
+ apt: name={{ item }} state=present
+ with_items:
+ - cpanminus
+ - perl-doc
+ when: ansible_pkg_mgr == "apt"
+
+# `perldoc -l` exits non-zero when the module is absent; the rc drives the
+# install task below so cpanm only runs when needed.
+- name: "Check to see if Amazon::S3 is installed."
+ command: "perldoc -l Amazon::S3"
+ register: cpan_check
+ ignore_errors: true
+ changed_when: false
+
+- name: "Install Amazon::S3."
+ cpanm:
+ name: "Amazon::S3"
+ when: cpan_check is defined and
+ cpan_check.rc != 0
--- /dev/null
+---
+# Partition a data drive, like the nvme devices in smithi. Only included
+# if drives_to_partition is defined.
+
+# NOTE(review): mktable rewrites the partition table on every run — this is
+# not idempotent and destroys any existing partitions on the device.
+- name: "Write a new partition table to {{ item.value.device }}"
+ command: "parted -s {{ item.value.device }} mktable gpt"
+ with_dict: "{{ drives_to_partition }}"
+
+# Each entry of a drive's "sizes" list becomes one partition ("foo" is just
+# a placeholder partition name).
+- name: "Write partition entries to {{ item.0.device }}"
+ command: "parted {{ item.0.device }} unit '{{ item.0.unit }}' mkpart foo {{ item.1 }}"
+ with_subelements:
+ - "{{ drives_to_partition }}"
+ - sizes
+
+- name: "Erase /scratch_devs so we know it's accurate"
+ file:
+ path: /scratch_devs
+ state: absent
+
+# One "<device><partition-suffix>" path per line.
+- name: "Write /scratch_devs for {{ item.0.device }}"
+ lineinfile:
+ dest: /scratch_devs
+ create: yes
+ owner: root
+ group: root
+ mode: 0644
+ line: "{{ item.0.device }}{{ item.1 }}"
+ with_subelements:
+ - "{{ drives_to_partition }}"
+ - scratch_devs
+ - flags:
+ # In case you want to partition a drive but not use it as a scratch device
+ skip_missing: True
--- /dev/null
+---
+# Create and mount the filesystems described by the `filesystems` dict;
+# each value supplies device, fstype and mountpoint. `state: mounted` also
+# persists the entry in /etc/fstab.
+- name: Create filesystems
+ filesystem:
+ dev: "{{ item.value.device }}"
+ fstype: "{{ item.value.fstype }}"
+ with_dict: "{{ filesystems }}"
+
+- name: Mount filesystems
+ mount:
+ path: "{{ item.value.mountpoint }}"
+ src: "{{ item.value.device }}"
+ fstype: "{{ item.value.fstype }}"
+ state: mounted
+ with_dict: "{{ filesystems }}"
--- /dev/null
+---
+# plays that make centos and rhel act or look
+# like an ubuntu system for ease of testing
+
+# force: yes replaces any existing file/link at the destination.
+- name: Make raid/smart scripts work.
+ file:
+ state: link
+ src: /sbin/lspci
+ dest: /usr/bin/lspci
+ force: yes
+
+# Same directory layout LTP uses on ubuntu, so test paths line up.
+- name: Create FStest ubuntu directory.
+ file:
+ state: directory
+ dest: /usr/lib/ltp/testcases/bin
+
+- name: Make fsstress same path as ubuntu.
+ file:
+ state: link
+ src: /usr/bin/fsstress
+ dest: /usr/lib/ltp/testcases/bin/fsstress
+ force: yes
--- /dev/null
+---
+# Let LVM use kRBD block devices: uncomment/replace the "types" filter in
+# lvm.conf so the "rbd" device type is accepted (the 16 is the per-type
+# partition limit that lvm.conf's types entries carry — see lvm.conf(5)).
+- name: Edit lvm.conf to support LVM on kRBD.
+ lineinfile:
+ dest: /etc/lvm/lvm.conf
+ regexp: "# types ="
+ line: 'types = [ "rbd", 16 ]'
+ backrefs: yes
+ state: present
+
+# Only build VGs/LVs when the inventory asked for any of them.
+- import_tasks: configure_lvm.yml
+ when: (logical_volumes is defined) or
+ (volume_groups is defined) or
+ (quick_lvs_to_create is defined)
--- /dev/null
+---
+# Entry point of the testnode role: load vars, create the teuthology user,
+# then dispatch to pkg-manager- and distro-specific task files, and finally
+# set up disks/LVM/filesystems.
+# loading vars
+- import_tasks: vars.yml
+ tags:
+ # "always" makes vars load even when the play is run with a tag subset.
+ - vars
+ - always
+
+- import_tasks: user.yml
+ tags:
+ - user
+
+# NOTE(review): 0755 on a limits.d conf file is unusual (0644 would do);
+# kept as-is since it is harmless — confirm before "fixing".
+- name: Set a high max open files limit for the teuthology user.
+ template:
+ src: security_limits.conf
+ dest: "/etc/security/limits.d/{{ teuthology_user }}.conf"
+ owner: root
+ group: root
+ mode: 0755
+ when: ansible_pkg_mgr != "zypper"
+
+- name: Set the hostname
+ import_tasks: set_hostname.yml
+ when: lab_domain != ""
+ tags:
+ - hostname
+
+- name: configure ssh
+ import_tasks: ssh.yml
+ tags:
+ - ssh
+
+- name: configure things specific to yum systems
+ import_tasks: yum_systems.yml
+ when: ansible_os_family == "RedHat"
+
+- name: configure things specific to apt systems
+ import_tasks: apt_systems.yml
+ when: ansible_pkg_mgr == "apt"
+
+- name: configure things specific to zypper systems
+ import_tasks: zypper_systems.yml
+ when: ansible_pkg_mgr == "zypper"
+
+- name: configure centos specific things
+ import_tasks: setup-centos.yml
+ when: ansible_distribution == "CentOS"
+
+- name: configure red hat specific things
+ import_tasks: setup-redhat.yml
+ when: ansible_distribution == 'RedHat'
+
+- name: configure fedora specific things
+ import_tasks: setup-fedora.yml
+ when: ansible_distribution == "Fedora"
+
+- name: configure ubuntu specific things
+ import_tasks: setup-ubuntu.yml
+ when: ansible_distribution == "Ubuntu"
+
+- name: configure ubuntu non-aarch64 specific things
+ import_tasks: setup-ubuntu-non-aarch64.yml
+ when:
+ ansible_distribution == "Ubuntu" and
+ ansible_architecture != "aarch64" and
+ not containerized_node
+
+- name: configure debian specific things
+ import_tasks: setup-debian.yml
+ when: ansible_distribution == "Debian"
+
+- name: configure opensuse specific things
+ import_tasks: setup-opensuse.yml
+ when: ansible_distribution == "openSUSE"
+
+# May abort the whole play for this host if the card is missing.
+- import_tasks: check-for-nvme.yml
+ when: check_for_nvme == true
+
+- import_tasks: zap_disks.yml
+ tags:
+ - zap
+
+- name: partition drives, if any are requested
+ import_tasks: drive_partitioning.yml
+ when: drives_to_partition is defined
+ tags:
+ - partition
+
+# lvm.yml gates itself internally on the relevant vars being defined.
+- name: set up LVM
+ import_tasks: lvm.yml
+ tags:
+ - lvm
+
+- name: set up filesystems
+ import_tasks: filesystems.yml
+ tags:
+ - filesystems
+ when: filesystems is defined
+
+- name: mount /var/lib/ceph to specified partition
+ import_tasks: var_lib.yml
+ when: var_lib_partition is defined
+ tags:
+ - varlib
+
+- import_tasks: cachefilesd.yaml
+ when: configure_cachefilesd|bool == true
+ tags:
+ - cachefilesd
+
+# Install and configure cpan and Amazon::S3
+- import_tasks: cpan.yml
+ tags:
+ - cpan
+ when:
+ - ansible_os_family != "RedHat"
+ # ansible_distribution_major_version is a string fact; the old bare-int
+ # comparison (!= 8) was always true in Jinja, so this guard never
+ # actually skipped major version 8. Quote the literal so it compares.
+ - ansible_distribution_major_version != '8'
+
+# configure ntp
+- import_tasks: ntp.yml
+ tags:
+ - ntp-client
+
+- name: configure pip to use our mirror
+ import_tasks: pip.yml
+ tags:
+ - pip
+
+- name: include resolv.conf setup
+ import_tasks: resolvconf.yml
+ tags:
+ - resolvconf
+
+# http://tracker.ceph.com/issues/20623
+# failed_when tolerates find's "No such file or directory" / fs-loop noise
+# so only real failures abort; changed status flags that leftovers exist.
+- name: List any leftover Ceph artifacts from previous jobs
+ shell: 'find {{ item }} -name "*ceph*"'
+ with_items:
+ - /var/run/
+ - /etc/systemd/system/
+ - /etc/ceph
+ - /var/log/
+ register: ceph_test_artifacts
+ changed_when: ceph_test_artifacts.stdout != ""
+ failed_when: ceph_test_artifacts.rc != 0 and
+ "No such file or directory" not in ceph_test_artifacts.stderr and
+ "File system loop detected" not in ceph_test_artifacts.stderr
+
+# with_items flattens the per-directory stdout_lines lists into individual
+# paths, so each leftover file/dir is removed one by one.
+- name: Delete any leftover Ceph artifacts from previous jobs
+ file:
+ path: "{{ item }}"
+ state: absent
+ with_items: "{{ ceph_test_artifacts.results|map(attribute='stdout_lines')|list }}"
+
+# Touch a file to indicate we are done. This is something chef did;
+# teuthology.task.internal.vm_setup() expects it.
+- name: Touch /ceph-qa-ready
+ file:
+ path: /ceph-qa-ready
+ state: touch
+ when: (ran_from_cephlab_playbook is undefined or not ran_from_cephlab_playbook|bool)
--- /dev/null
+---
+# The handlers (defined elsewhere in the role) start rpcbind and then
+# restart the nfs server so the templated export list is loaded.
+- name: Upload a dummy nfs export so that the nfs kernel server starts.
+ template:
+ src: exports
+ dest: /etc/exports
+ owner: root
+ group: root
+ mode: 0644
+ notify:
+ - start rpcbind
+ - restart nfs-server
+
+# nfs_service comes from the role's per-distro vars files.
+- name: Enable nfs-server on rhel 7.x.
+ service:
+ name: "{{ nfs_service }}"
+ enabled: true
+ when: ansible_distribution == "RedHat" and
+ ansible_distribution_major_version == "7"
--- /dev/null
+---
+# Install and configure time sync. Which daemon is used (ntpd vs chronyd)
+# is decided by ntp_service_name from the role's vars files.
+- name: Install ntp package on rpm based systems.
+ yum:
+ name: ntp
+ state: present
+ when: ansible_pkg_mgr == "yum"
+ tags:
+ - packages
+
+- name: Install ntp package on apt based systems.
+ apt:
+ name: ntp
+ state: present
+ when: ansible_pkg_mgr == "apt"
+ tags:
+ - packages
+
+# See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=806556.
+# ifdown/ifup would often leave ntpd not running on xenial.
+# We do our own ntpdate dance in teuthology's clock task.
+- name: Remove racy /etc/network/if-up.d/ntpdate on xenial
+ file:
+ name: /etc/network/if-up.d/ntpdate
+ state: absent
+ when: ansible_distribution == "Ubuntu" and
+ ansible_distribution_major_version == '16'
+
+- name: Create the ntp.conf file.
+ template:
+ src: ntp.conf
+ dest: /etc/ntp.conf
+ owner: root
+ group: root
+ mode: 0644
+ notify:
+ - restart ntp
+ when: ntp_service_name == "ntp" or ntp_service_name == "ntpd"
+
+- name: Create the chrony.conf file
+ template:
+ src: chrony.conf
+ dest: /etc/chrony.conf
+ owner: root
+ group: root
+ mode: 0644
+ notify:
+ - restart ntp
+ when: ntp_service_name == "chronyd"
+
+- name: Make sure ntpd is running.
+ service:
+ name: "{{ ntp_service_name }}"
+ enabled: yes
+ state: started
+ # There's an issue with ansible<=2.9 and our custom built kernels (5.8 as of this commit) where the service and systemd modules don't have backwards compatibility with init scripts
+ ignore_errors: "{{ 'ceph' in ansible_kernel }}"
--- /dev/null
+---
+# Pick the right pip package/executable per distro generation, install it,
+# then (below) point pip at the lab mirror and install required packages.
+# Default to python2 version
+- set_fact:
+ pip_version: python-pip
+ pip_executable: pip
+
+# Start using python3-pip on Ubuntu 20.04 and later
+# Add appropriate `or` statements for other python3-only distros
+- set_fact:
+ pip_version: python3-pip
+ pip_executable: pip3
+ # You would think this ansible_python_interpreter=/usr/bin/python3 is already the default
+ # (hint: it is) but the pip module at the bottom insisted on using the python2 version of
+ # setuptools despite this default *and* giving you the option to set the executable to pip3.
+ # For some reason, reminding ansible this is a python3 host here makes the pip module work.
+ ansible_python_interpreter: /usr/bin/python3
+ when: (ansible_distribution == 'Ubuntu' and ansible_distribution_major_version|int >= 20) or
+ (ansible_os_family == 'RedHat' and ansible_distribution_major_version|int >= 8)
+
+# python-pip installed during packages task on Fedora since epel doesn't exist
+- name: Install python-pip on rpm based systems.
+ yum:
+ name: "{{ pip_version }}"
+ state: present
+ enablerepo: epel
+ when: ansible_pkg_mgr == "yum" and ansible_distribution != 'Fedora'
+
+- name: Install python-pip on apt based systems.
+ apt:
+ name: "{{ pip_version }}"
+ state: present
+ when: ansible_pkg_mgr == "apt"
+
+# zypper systems get both python generations unconditionally.
+- name: Install python-pip on zypper based systems.
+ zypper:
+ name:
+ - python2-pip
+ - python3-pip
+ state: present
+ when: ansible_pkg_mgr == "zypper"
+
+- name: Create the .pip directory for the teuthology user.
+ file:
+ path: "/home/{{ teuthology_user }}/.pip"
+ owner: "{{ teuthology_user }}"
+ group: "{{ teuthology_user }}"
+ # Directories need the execute bit to be traversable; the previous 0644
+ # made ~/.pip (and the pip.conf inside it) unreachable even by its owner.
+ mode: 0755
+ state: directory
+
+# pip.conf template points pip at the lab mirror (see task name).
+- name: Create pip.conf and configure it to use our mirror
+ template:
+ src: pip.conf
+ dest: "/home/{{ teuthology_user }}/.pip/pip.conf"
+ owner: "{{ teuthology_user }}"
+ group: "{{ teuthology_user }}"
+ mode: 0644
+
+# pip_packages_to_install comes from the role's vars files.
+- name: Install packages via pip
+ pip:
+ name: "{{ pip_packages_to_install|list }}"
+ executable: "{{ pip_executable }}"
--- /dev/null
+---
+# RHEL-major-version task file (imported from setup-redhat.yml); reuses the
+# shared cloud-init fix and ubuntu-imitation shims from the parent directory.
+- name: Fix broken cloud-init
+ import_tasks: ../cloud-init.yml
+
+- import_tasks: ../imitate_ubuntu.yml
--- /dev/null
+---
+# Thin shim: pull in the shared nfs setup under the "nfs" tag.
+- import_tasks: ../nfs.yml
+ tags:
+ - nfs
--- /dev/null
+---
+# Remove the resolvconf package (it rewrites /etc/resolv.conf), switch the
+# primary interface to dhcp, and pin domain/search entries to lab_domain.
+- name: Purge resolvconf
+ apt:
+ name: resolvconf
+ state: absent
+ purge: yes
+ when: ansible_pkg_mgr == "apt"
+
+- name: Set interface
+ set_fact:
+ interface: "{{ ansible_default_ipv4.interface }}"
+
+- name: Check for presence of /etc/network/interfaces
+ stat:
+ path: /etc/network/interfaces
+ get_checksum: no
+ register: etc_network_interfaces
+
+- name: Rewrite /etc/network/interfaces to use dhcp
+ replace:
+ dest: /etc/network/interfaces
+ # This regexp matches a stanza like:
+ #
+ # iface eth0 inet static
+ # address 10.8.128.17
+ # netmask 255.255.248.0
+ # gateway 10.8.135.254
+ # broadcast 10.8.135.255
+ #
+ # It also handles cases where the entire stanza has whitespace in front of it.
+ regexp: '^ *iface {{ interface }} inet static(\n\ +[^\s]+.*)+'
+ replace: "iface {{ interface }} inet dhcp\n"
+ when: etc_network_interfaces.stat.exists
+ register: dhcp_enabled
+
+# NOTE(review): on the Ansible versions this repo targets, set_fact renders
+# the Jinja result to the *string* "True"/"False", which is why the task
+# below compares against "True" — confirm before changing either side.
+- name: Set bounce_interface if we just enabled dhcp
+ set_fact:
+ bounce_interface: "{{ dhcp_enabled is changed }}"
+
+- name: ifdown and ifup
+ shell: "ifdown {{ interface }} && ifup {{ interface }}"
+ # Even if bounce_interface is False, we need to work around a Xenial issue
+ # where purging resolvconf breaks DNS by removing /etc/resolv.conf. Bouncing
+ # the interface rebuilds it.
+ # The Ubuntu bug is:
+ # https://bugs.launchpad.net/ubuntu/+source/resolvconf/+bug/1593489
+ when: bounce_interface == "True" or
+ (ansible_distribution|lower == 'ubuntu' and
+ ansible_distribution_major_version|int == 16)
+
+- name: Ensure lab_domain is in search domains in /etc/resolv.conf
+ lineinfile:
+ dest: /etc/resolv.conf
+ regexp: "^search .*"
+ line: "search {{ lab_domain }}"
+
+- name: Ensure domain is set in /etc/resolv.conf
+ lineinfile:
+ dest: /etc/resolv.conf
+ regexp: "^domain .*"
+ line: "domain {{ lab_domain }}"
--- /dev/null
+---
+# Derive the short hostname from the inventory name (strip the domain) and
+# apply it with the hostname module.
+- name: Set hostname var
+ set_fact:
+ hostname: "{{ inventory_hostname.split('.')[0] }}"
+
+- name: "Set the system's hostname"
+ hostname:
+ name: "{{ hostname }}"
+ # https://github.com/ansible/ansible/issues/42726
+ when: ansible_os_family != "Suse"
--- /dev/null
+---
+# Distro-specific task file: the cloud-init hostname fix is only needed on
+# major version 6; the ubuntu-imitation shims apply to all versions.
+- name: Fix broken cloud-init
+ import_tasks: cloud-init.yml
+ when: ansible_distribution_major_version == "6"
+
+- import_tasks: imitate_ubuntu.yml
--- /dev/null
+---
+# Debian-specific fixes: wgetrc workaround, stop collectl, and make sure
+# /usr/sbin is on PATH for the teuthology user and in /etc/profile.
+- name: Work around broken wget on wheezy.
+ template:
+ src: wgetrc
+ dest: /etc/wgetrc
+ owner: root
+ group: root
+ mode: 0644
+
+- name: Stop collectl
+ service:
+ name: collectl
+ state: stopped
+
+- name: Add PATH to the teuthology_user .bashrc.
+ lineinfile:
+ dest: "/home/{{ teuthology_user }}/.bashrc"
+ line: "export PATH=$PATH:/usr/sbin"
+ insertbefore: BOF
+ state: present
+
+# grep's exit code (0 = already present) gates the sed edit below, making
+# the non-idempotent sed effectively run-once.
+- name: Check to see if we need to edit /etc/profile.
+ command:
+ grep '/usr/games:/usr/sbin' /etc/profile
+ register: update_profile
+ changed_when: false
+ ignore_errors: true
+
+- name: Update /etc/profile if needed.
+ command:
+ sed -i 's/\/usr\/games"/\/usr\/games:\/usr\/sbin"/g' /etc/profile
+ when: update_profile is defined and
+ update_profile.rc != 0
--- /dev/null
+---
+# Apply the ubuntu-imitation shims, then install the role's grub defaults.
+- import_tasks: imitate_ubuntu.yml
+
+- name: Set grub config.
+ template:
+ src: grub
+ dest: /etc/default/grub
+ owner: root
+ group: root
+ mode: 0644
--- /dev/null
+---
+# Dispatch to RHEL-major-version specific task files.
+- name: Include rhel 7.x specific tasks.
+ import_tasks: redhat/rhel_7.yml
+ when: ansible_distribution_major_version == "7"
+
+- name: Include rhel 6.x specific tasks.
+ import_tasks: redhat/rhel_6.yml
+ when: ansible_distribution_major_version == "6"
--- /dev/null
+---
+# Ubuntu-specific setup: kernel-cleanup cron, grub timeout override, boot
+# time kernel modules, and an fsck-at-boot fix.
+- name: Upload weekly kernel-clean crontab.
+ template:
+ src: cron/kernel-clean
+ dest: /etc/cron.weekly/kernel-clean
+ owner: root
+ group: root
+ mode: 0755
+ notify:
+ - restart cron
+
+- name: Upload /etc/grub.d/02_force_timeout.
+ template:
+ src: grub.d/02_force_timeout
+ dest: /etc/grub.d/02_force_timeout
+ owner: root
+ group: root
+ mode: 0755
+
+- name: Enable kernel modules to load at boot time.
+ template:
+ src: modules
+ dest: /etc/modules
+ owner: root
+ group: root
+ mode: 0644
+
+- name: Enabling auto-fsck fix to prevent boot hangup.
+ lineinfile:
+ dest: /etc/default/rcS
+ line: "FSCKFIX=yes"
+ regexp: "FSCKFIX=no"
+ create: yes
+ # backrefs makes it so that if the regexp
+ # isn't found the file is left unchanged
+ backrefs: yes
+ state: present
--- /dev/null
+---
+# Distro-specific cleanup: drop any packaged /etc/ceph config dir, then run
+# the shared nfs setup.
+- name: Remove /etc/ceph
+ file:
+ path: /etc/ceph
+ state: absent
+
+- import_tasks: nfs.yml
+ tags:
+ - nfs
--- /dev/null
+---
+# Install distro/version specific sshd_config (template name is built from
+# the distribution facts) plus a common ssh_config.
+# NOTE(review): 0755 on these config files is unusual (0644 expected);
+# harmless, but confirm intent before changing.
+- name: Upload distro major version specific sshd_config
+ template:
+ src: "ssh/sshd_config_{{ ansible_distribution | lower | regex_replace(' ', '_') }}_{{ ansible_distribution_major_version }}"
+ dest: /etc/ssh/sshd_config
+ owner: root
+ group: root
+ mode: 0755
+ notify:
+ - restart ssh
+
+- name: Upload ssh_config
+ template:
+ src: ssh/ssh_config
+ dest: /etc/ssh/ssh_config
+ owner: root
+ group: root
+ mode: 0755
+
+# Fetch the lab's autogenerated pubkey bundle straight from GitHub; the
+# authorized_key module accepts a URL as the key source.
+- name: Add ssh pubkeys
+ authorized_key:
+ # Use block YAML module args rather than legacy key=value inline syntax.
+ user: "{{ teuthology_user }}"
+ key: https://raw.githubusercontent.com/ceph/keys/autogenerated/ssh/@all.pub
+ # Register and retry to work around transient githubusercontent.com issues
+ register: ssh_key_update
+ until: ssh_key_update is success
+ # try for 2 minutes to retrieve the key before failing
+ retries: 24
+ delay: 5
+ tags:
+ - pubkeys
--- /dev/null
+---
+# Replace the Debian/cloud-init style "127.0.1.1 <fqdn>" entry with the
+# node's real primary IPv4 address; backrefs leaves /etc/hosts untouched
+# when no 127.0.1.1 line exists.
+- name: Set up static IP in /etc/hosts.
+ lineinfile:
+ dest: /etc/hosts
+ line: "{{ ansible_default_ipv4['address'] }} {{ ansible_fqdn }} {{ ansible_hostname }}"
+ regexp: "^127.0.1.1"
+ backrefs: yes
+ state: present
--- /dev/null
+---
+# Create the teuthology user (fixed uid, passwordless, sudo-capable) and
+# the auxiliary xfstests quota-test user.
+- name: Ensure the sudo group exists.
+ group:
+ name: sudo
+ state: present
+
+- name: Ensure the teuthology_user group exists.
+ group:
+ name: "{{ teuthology_user }}"
+ state: present
+
+- name: Create the teuthology user.
+ user:
+ name: "{{ teuthology_user }}"
+ # apparently some ceph tests fail without this uid
+ # https://github.com/ceph/ceph-qa-chef/commit/5678cc3893fd1cc291254e4d1abe6705e6a9bbb0
+ uid: 1000
+ group: "{{ teuthology_user }}"
+ groups: sudo
+ shell: /bin/bash
+ state: present
+ # If we're currently running as teuthology_user, we won't be able to modify
+ # the account
+ when: teuthology_user != ansible_ssh_user
+ register: teuthology_user_existence
+
+# If the teuthology_user was just created, delete its password
+- name: Delete the teuthology users password.
+ command: "passwd -d {{ teuthology_user }}"
+ when: teuthology_user_existence is defined and
+ teuthology_user_existence is changed
+
+- name: Add a user for xfstests to test user quotas.
+ user:
+ name: "{{ xfstests_user }}"
+ uid: 10101
+ state: present
--- /dev/null
+---
+# This set of tasks is intended to mount a small NVMe partition to /var/lib/ceph
+# to fix http://tracker.ceph.com/issues/20910
+
+- name: "Create /var/lib/ceph"
+ file:
+ path: "/var/lib/ceph"
+ state: directory
+
+# Newer mkfs.xfs defaults to a v5 superblock; these opts force v4-compatible
+# features so older test kernels can still mount the fs (see note below).
+- name: Set xfs_opts on newer OSes
+ set_fact:
+ xfs_opts: "-m crc=0,finobt=0"
+ when: (ansible_distribution | lower == 'ubuntu' and ansible_distribution_major_version|int >= 16) or
+ (ansible_distribution | lower in ['centos', 'rhel'] and ansible_distribution_major_version|int >= 7)
+
+# force: yes re-creates the fs even if one already exists on the partition.
+- name: "Create xfs filesystem on {{ var_lib_partition }}"
+ filesystem:
+ dev: "{{ var_lib_partition }}"
+ fstype: xfs
+ force: yes
+ # Don't use a version 5 superblock as it's too new for some kernels
+ opts: "{{ xfs_opts|default('') }}"
+
+- name: "Mount {{ var_lib_partition }} to /var/lib/ceph"
+ mount:
+ path: "/var/lib/ceph"
+ src: "{{ var_lib_partition }}"
+ fstype: xfs
+ # Don't fail to boot if the mount fails
+ opts: defaults,nofail
+ state: mounted
--- /dev/null
+---
+# Layered variable loading: pkg-manager vars first, then distro, then
+# distro+major-version, then distro+exact-version. Later include_vars
+# override overlapping keys from earlier ones, and empty.yml is a no-op
+# fallback so with_first_found never fails when a file doesn't exist.
+- name: Include package type specific vars.
+ include_vars: "{{ ansible_pkg_mgr }}_systems.yml"
+
+- name: Including distro specific variables.
+ include_vars: "{{ item }}"
+ with_first_found:
+ - "{{ ansible_distribution | lower | regex_replace(' ', '_') }}.yml"
+ - empty.yml
+
+- name: Including major version specific variables.
+ include_vars: "{{ item }}"
+ with_first_found:
+ - "{{ ansible_distribution | lower | regex_replace(' ', '_') }}_{{ ansible_distribution_major_version }}.yml"
+ - empty.yml
+
+- name: Including version specific variables.
+ include_vars: "{{ item }}"
+ with_first_found:
+ - "{{ ansible_distribution | lower | regex_replace(' ', '_') }}_{{ ansible_distribution_version }}.yml"
+ - empty.yml
--- /dev/null
+---
+# Install abrt, enable auto-reporting, and relax its package checks so
+# crash dumps are collected; restart to apply.
+- name: Install abrt
+ yum:
+ name: abrt
+ state: installed
+
+# Not idempotent: abrt-auto-reporting is invoked on every run.
+- name: Enable abrt-auto-reporting
+ command: abrt-auto-reporting enabled
+
+- name: Set OpenGPGCheck in abrt-action-save-package-data.conf
+ lineinfile:
+ path: /etc/abrt/abrt-action-save-package-data.conf
+ regexp: '^OpenGPGCheck'
+ line: 'OpenGPGCheck no'
+
+- name: Set ProcessUnpackaged in abrt-action-save-package-data.conf
+ lineinfile:
+ path: /etc/abrt/abrt-action-save-package-data.conf
+ regexp: '^ProcessUnpackaged'
+ line: 'ProcessUnpackaged no'
+
+# Unconditional restart so the config edits above take effect.
+- name: Restart abrtd
+ service:
+ name: abrtd
+ state: restarted
--- /dev/null
+---
+# There have been instances where iptables is installed on EL7 testnodes.
+# This task will make sure both services are stopped and disabled regardless
+# of OS version.
+# ignore_errors because one (or both) of the services may simply not be
+# installed on a given host.
+
+- name: Stop and disable firewalld
+ service:
+ name: firewalld
+ state: stopped
+ enabled: no
+ ignore_errors: true
+
+- name: Stop and disable iptables
+ service:
+ name: iptables
+ state: stopped
+ enabled: no
+ ignore_errors: true
--- /dev/null
+---
+# this is needed for the rpm_key module so it can
+# figure out if the key you're adding is already
+# installed or not.
+- name: Install GPG
+ yum:
+ name: gpg
+ state: present
+
+# gpg_keys is registered so repos.yml can clean the yum cache when keys
+# change. validate_certs is disabled for key_host — presumably an internal
+# host with a non-public cert; verify before tightening.
+- name: Install GPG keys
+ rpm_key:
+ state: present
+ key: "{{ item }}"
+ validate_certs: no
+ with_items:
+ - 'https://{{ key_host }}/keys/release.asc'
+ - 'https://{{ key_host }}/keys/autobuild.asc'
+ register: gpg_keys
--- /dev/null
+---
+# Package housekeeping for yum systems: finish interrupted transactions,
+# strip any leftover ceph packages, then install/remove/upgrade the lists
+# supplied by the role's vars files.
+# this is needed for the yum-complete-transation command next
+- name: Ensure yum_utils is present.
+ package:
+ name: yum-utils
+ state: present
+ when:
+ - ansible_os_family == "RedHat"
+ - ansible_distribution_major_version|int <= 7
+
+- name: Removing saved yum transactions
+ command: yum-complete-transaction --cleanup-only
+ register: transaction_cleanup
+ changed_when: "'Cleaning up' in transaction_cleanup.stdout"
+ when:
+ - ansible_os_family == "RedHat"
+ - ansible_distribution_major_version|int <= 7
+
+# rpm -q's exit code (0 = installed) gates the removal task below.
+- name: Check if ceph-debuginfo is installed
+ command: rpm -q ceph-debuginfo
+ ignore_errors: yes
+ changed_when: false
+ register: bz1234967
+ tags:
+ - remove-ceph
+
+- name: Work around https://bugzilla.redhat.com/show_bug.cgi?id=1234967
+ command: rpm -e ceph-debuginfo
+ when: bz1234967 is defined and bz1234967.rc == 0
+ tags:
+ - remove-ceph
+
+- name: Ensure ceph packages are not present.
+ package:
+ name: "{{ ceph_packages_to_remove|list }}"
+ state: absent
+ tags:
+ - remove-ceph
+
+- name: Ensure ceph dependency packages are not present.
+ package:
+ name: "{{ ceph_dependency_packages_to_remove|list }}"
+ state: absent
+ tags:
+ - remove-ceph-dependency
+
+- name: Install packages
+ package:
+ name: "{{ packages|list }}"
+ state: present
+ when: packages|length > 0
+
+- name: Install epel packages
+ package:
+ name: "{{ epel_packages|list }}"
+ state: present
+ enablerepo: epel
+ when: epel_packages|length > 0
+
+- name: Remove packages
+ package:
+ name: "{{ packages_to_remove|list }}"
+ state: absent
+ when: packages_to_remove|length > 0
+
+- name: Upgrade packages
+ package:
+ name: "{{ packages_to_upgrade|list }}"
+ state: latest
+ when: packages_to_upgrade|length > 0
--- /dev/null
+---
+# Lay down lab mirrorlists and .repo files; registered results feed the
+# "Clean yum cache" task at the bottom of this file.
+- name: Configure local mirrorlists
+ template:
+ src: 'mirrorlists/{{ ansible_distribution_major_version }}/{{ item }}'
+ dest: '/etc/yum.repos.d/{{ item }}'
+ owner: root
+ group: root
+ mode: 0644
+ with_items: "{{ yum_mirrorlists }}"
+ when: yum_mirrorlists is defined
+
+# NOTE(review): unlike the version-specific task below, this `when` assumes
+# common_yum_repos is always defined (presumably via role defaults) —
+# confirm, or add |default({}) here too.
+- name: Configure common additional repos in /etc/yum.repos.d/
+ template:
+ src: yum_repo.j2
+ dest: /etc/yum.repos.d/{{ item.key }}.repo
+ owner: root
+ group: root
+ mode: 0644
+ register: repo_file
+ with_dict: "{{ common_yum_repos }}"
+ when: common_yum_repos.keys() | length > 0
+
+# yum_repos and additional_yum_repos are optional vars; the with_dict below
+# already defaults them, but the old `when` called .keys() on the raw vars
+# and errored out whenever either one was undefined. Default them in the
+# condition too.
+- name: Configure version specific repos in /etc/yum.repos.d/
+ template:
+ src: yum_repo.j2
+ dest: /etc/yum.repos.d/{{ item.key }}.repo
+ owner: root
+ group: root
+ mode: 0644
+ register: version_repo_file
+ with_dict: "{{ yum_repos|default({}) | combine(additional_yum_repos|default({}), recursive=True) }}"
+ when: (yum_repos|default({}) | length > 0) or (additional_yum_repos|default({}) | length > 0)
+
+# copr is dnf-only, hence the >= 8 guard; copr_repos defaults to an empty
+# list when none are requested.
+- name: Enable copr repos
+ command: "dnf -y copr enable {{ item }}"
+ with_items: "{{ copr_repos }}"
+ when:
+ - (ansible_os_family == "RedHat" and ansible_distribution_major_version|int >= 8)
+ - copr_repos|length > 0
+
+- name: Enable PowerTools on CentOS
+ command: "dnf -y config-manager --set-enabled powertools"
+ when:
+ - ansible_distribution == 'CentOS'
+ # The powertools repo (and dnf config-manager) only exists on CentOS 8;
+ # the old "< 9" guard also matched CentOS 7/6, where this command fails.
+ # CentOS 9+ uses the crb repo instead (next task).
+ - ansible_distribution_major_version | int == 8
+
+# On 9+ the PowerTools equivalent is the "crb" repo.
+- name: Enable CodeReady Linux Builder on CentOS 9
+ command: "dnf -y config-manager --set-enabled crb"
+ when:
+ - ansible_distribution == 'CentOS'
+ - ansible_distribution_major_version | int >= 9
+
+- import_tasks: gpg_keys.yml
+ when: ansible_distribution == "Fedora"
+ tags:
+ - gpg-keys
+
+# Flush the cache whenever any repo file or GPG key changed above, so the
+# next yum operation sees the new metadata.
+- name: Clean yum cache
+ shell: yum clean all
+ when: (repo_file is defined and repo_file is changed) or
+ (gpg_keys is defined and gpg_keys is changed) or
+ (version_repo_file is defined and version_repo_file is changed)
--- /dev/null
+---
+# Tasks common to all systems that use the yum
+# package manager
+
+- name: Create remote.conf
+ template:
+ src: remote.conf
+ dest: /etc/security/limits.d/remote.conf
+ group: root
+ owner: root
+ mode: 0644
+ when:
+ - not containerized_node
+
+# touch + changed_when:false = idempotent permission normalization (also
+# creates the file if missing) without polluting change reporting.
+- name: Set mode on /etc/fuse.conf
+ file:
+ path: /etc/fuse.conf
+ mode: 0644
+ state: touch
+ changed_when: false
+
+- name: Ensure the group kvm exists.
+ group:
+ name: kvm
+ state: present
+
+- name: Add the teuthology user to groups kvm,disk
+ user:
+ name: "{{ teuthology_user }}"
+ groups: kvm,disk
+ append: yes
+
+# validate runs visudo against the rendered file before it replaces the
+# real sudoers, so a bad template can't lock everyone out.
+- name: Configure /etc/sudoers.
+ template:
+ src: sudoers
+ dest: /etc/sudoers
+ owner: root
+ group: root
+ mode: 0440
+ validate: visudo -cf %s
+ tags:
+ - sudoers
+
+- name: Configure /etc/security/limits.conf
+ template:
+ src: limits.conf
+ dest: /etc/security/limits.conf
+ group: root
+ owner: root
+ mode: 0644
+
+# http://tracker.ceph.com/issues/15272
+# We don't know why it's happening, but something is corrupting the
+# rpmdb. Let's try just rebuilding it every time.
+- name: Rebuild rpmdb
+ command:
+ rpm --rebuilddb
+ # https://bugzilla.redhat.com/show_bug.cgi?id=1680124
+ when:
+ not containerized_node
+
+# "|| true" keeps the grep from failing the task on non-Stream hosts.
+- name: Check /etc/os-release to see if this is CentOS Stream
+ shell: "grep 'CentOS Stream' /etc/os-release || true"
+ register: stream_in_osrelease
+ tags:
+ - repos
+
+# Setting this var will add "-stream" to the mirrorlist/baseurl URLs in .repo files
+- set_fact:
+ dash_stream: "-stream"
+ is_stream: true
+ when: (ansible_lsb.description is defined and "Stream" in ansible_lsb.description) or
+ stream_in_osrelease.stdout is search("CentOS Stream")
+ tags:
+ - repos
+
+- name: Setup local repo files.
+ import_tasks: yum/repos.yml
+ tags:
+ - repos
+
+# skip_packaging=true set in group_vars for OVH testnodes. We still want these
+# tasks to run on CentOS though so we set it back to false here.
+- set_fact:
+ skip_packaging: false
+ when: ansible_distribution != "RedHat"
+ tags:
+ - packages
+
+- name: Perform package related tasks.
+ import_tasks: yum/packages.yml
+ when: skip_packaging|default(false)|bool != true
+ tags:
+ - packages
+
+- name: Disable firewall
+ import_tasks: yum/firewall.yml
+
+# NOTE(review): despite the task name, this sets SELinux to *permissive*
+# (not enforcing) — presumably deliberate for test nodes; confirm intent.
+- name: Enable SELinux
+ selinux: state=permissive policy=targeted
+ tags:
+ - selinux
+
+- name: Configure ABRT
+ import_tasks: yum/abrt.yml
+ when: configure_abrt|bool
+ tags: abrt
--- /dev/null
+---
+# These zap tasks are run on freshly reimaged cobbler_managed machines
+# even when using the -stock profiles. Therefore, testnode package
+# installation hasn't happened yet so we install zap dependencies here.
+
+- name: Make sure apt dependencies are installed
+ apt:
+ name: ['gdisk', 'dmsetup']
+ state: present
+ when: ansible_os_family == "Debian"
+
+- name: Make sure rpm dependencies are installed
+ package:
+ name: ['gdisk', 'device-mapper']
+ state: present
+ when: (ansible_distribution == "RedHat" and rhsm_registered is defined and rhsm_registered == true) or
+ (ansible_os_family == "RedHat" and ansible_distribution != "RedHat")
+
+# Derive the root disk device from whichever device is mounted at /
+# by stripping partition digits (e.g. /dev/sda1 -> /dev/sda).
+# NOTE(review): the regex removes ALL digits, so an NVMe root such as
+# /dev/nvme0n1p1 would become /dev/nvmenp -- presumably only sdX/vdX
+# style devices are expected here; confirm before using on NVMe hosts.
+- name: Set root disk
+ set_fact:
+ root_disk: "{{ item.device|regex_replace('[0-9]+', '') }}"
+ with_items: "{{ ansible_mounts }}"
+ when: item.mount == '/'
+
+- name: Compile list of non-root partitions
+ shell: "lsblk --list --noheadings | grep part | grep -v {{ root_disk|regex_replace('/dev/', '') }} | awk '{ print $1 }'"
+ register: non_root_partitions
+
+- name: Unmount any non-root mountpoints
+ mount:
+ path: "{{ item.mount }}"
+ state: unmounted
+ with_items: "{{ ansible_mounts }}"
+ when:
+ - item.mount != '/' and
+ not item.mount is match("/(boot|home|opt|root|srv|tmp|usr/local|var|.snapshots)")
+
+## http://tracker.ceph.com/issues/20533
+## Trusty version of wipefs lacks --force option, so fall back to the
+## plain invocation when the --force one fails.
+- name: Wipe filesystems on non-root partitions
+ shell: "wipefs --force --all /dev/{{ item }} || wipefs --all /dev/{{ item }}"
+ with_items: "{{ non_root_partitions.stdout_lines }}"
+ # Guard on the actual list of partition names. The previous check,
+ # non_root_partitions|length > 0, measured the registered result dict
+ # (always non-empty), so it never skipped anything.
+ when: non_root_partitions.stdout_lines|length > 0
+
+## See https://github.com/ceph/ceph-ansible/issues/759#issue-153248281
+- name: Zap all non-root disks
+ shell: "sgdisk --zap-all /dev/{{ item.key }} || sgdisk --zap-all /dev/{{ item.key }}"
+ with_dict: "{{ ansible_devices }}"
+ when:
+ - item.key not in root_disk
+ - '"loop" not in item.key'
+ - '"ram" not in item.key'
+ - '"sr" not in item.key'
+
+## See https://tracker.ceph.com/issues/22354 and
+## https://github.com/ceph/ceph/pull/20400
+- name: Blow away lingering OSD data and FSIDs
+ shell: "dd if=/dev/zero of=/dev/{{ item.key }} bs=1M count=110"
+ with_dict: "{{ ansible_devices }}"
+ when:
+ - item.key not in root_disk
+ - '"loop" not in item.key'
+ - '"ram" not in item.key'
+ - '"sr" not in item.key'
+
+- name: Remove all LVM data
+ shell: "dmsetup remove_all --force"
+ register: removed_lvm_data
+ until: "'Unable to remove' not in removed_lvm_data.stderr"
+ retries: 5
+ delay: 1
+ ignore_errors: true
+
+## See http://tracker.ceph.com/issues/21989
+- name: Check for physical volumes
+ shell: "pvdisplay | grep 'PV Name' | awk '{ print $3 }'"
+ register: pvs_to_remove
+
+- name: Remove physical volumes
+ # pvremove needs --force twice to wipe PVs that still belong to a VG.
+ shell: "pvremove --force --force --yes {{ item }}"
+ with_items: "{{ pvs_to_remove.stdout_lines }}"
+ # pvs_to_remove is always registered by the preceding task, so the old
+ # "is defined" guard was dead; only the non-empty check is meaningful.
+ when: pvs_to_remove.stdout_lines|length > 0
--- /dev/null
+---
+- name: Ensure ceph packages are not present.
+ zypper:
+ name: "{{ ceph_packages_to_remove|list }}"
+ state: absent
+ tags:
+ - remove-ceph
+
+- name: Ensure ceph dependency packages are not present.
+ zypper:
+ name: "{{ ceph_dependency_packages_to_remove|list }}"
+ state: absent
+ tags:
+ - remove-ceph-dependency
+
+# https://tracker.ceph.com/issues/44501
+- set_fact:
+ ansible_python_interpreter: /usr/bin/python3
+
+- name: Remove packages
+ zypper:
+ name: "{{ packages_to_remove|list }}"
+ state: absent
+ when: packages_to_remove|length > 0
+
+- name: Install packages
+ zypper:
+ name: "{{ packages|list }}"
+ state: present
+ when: packages|length > 0
+
+- name: Upgrade packages
+ zypper:
+ name: "{{ packages_to_upgrade|list }}"
+ state: latest
+ when: packages_to_upgrade|length > 0
--- /dev/null
+---
+# Tasks common to all systems that use the zypper package manager
+# This is mostly a copy of the yum_systems.yml
+
+- name: Set mode on /etc/fuse.conf
+ file:
+ path: /etc/fuse.conf
+ mode: 0644
+ state: touch
+ changed_when: false
+
+- name: Ensure the group kvm exists.
+ group:
+ name: kvm
+ state: present
+
+- name: Add the teuthology user to groups kvm,disk
+ user:
+ name: "{{ teuthology_user }}"
+ groups: kvm,disk
+ append: yes
+
+- name: Configure /etc/sudoers.
+ template:
+ src: sudoers
+ dest: /etc/sudoers
+ owner: root
+ group: root
+ mode: 0440
+ validate: visudo -cf %s
+ tags:
+ - sudoers
+
+- name: Configure /etc/security/limits.conf
+ template:
+ src: limits.conf
+ dest: /etc/security/limits.conf
+ group: root
+ owner: root
+ mode: 0644
+
+# http://tracker.ceph.com/issues/15272
+# We don't know why it's happening, but something is corrupting the
+# rpmdb. Let's try just rebuilding it every time.
+- name: Rebuild rpmdb
+ command:
+ rpm --rebuilddb
+
+- name: Perform package related tasks.
+ import_tasks: zypper/packages.yml
+ tags:
+ - packages
+
--- /dev/null
+{# {{ ansible_managed }} #}
+Package: *
+Pin: origin *.ceph.com
+Pin-Priority: 999
--- /dev/null
+# {{ ansible_managed }}
+deb http://http.debian.net/debian jessie main contrib non-free
+deb http://security.debian.org/ jessie/updates main contrib non-free
+deb http://http.debian.net/debian jessie-updates main contrib non-free
--- /dev/null
+# {{ ansible_managed }}
+# deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise main restricted
+
+# deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise-updates main restricted
+# deb http://security.ubuntu.com/ubuntu precise-security main restricted
+
+# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
+# newer versions of the distribution.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise main restricted
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise main restricted
+
+## Major bug fix updates produced after the final release of the
+## distribution.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise-updates main restricted
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise-updates main restricted
+
+## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
+## team. Also, please note that software in universe WILL NOT receive any
+## review or updates from the Ubuntu security team.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise universe
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise universe
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise-updates universe
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise-updates universe
+
+## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
+## team, and may not be under a free licence. Please satisfy yourself as to
+## your rights to use the software. Also, please note that software in
+## multiverse WILL NOT receive any review or updates from the Ubuntu
+## security team.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise multiverse
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise multiverse
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise-updates multiverse
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise-updates multiverse
+
+## N.B. software from this repository may not have been tested as
+## extensively as that contained in the main release, although it includes
+## newer versions of some applications which may provide useful features.
+## Also, please note that software in backports WILL NOT receive any review
+## or updates from the Ubuntu security team.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise-backports main restricted universe multiverse
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu/ precise-backports main restricted universe multiverse
+
+deb http://security.ubuntu.com/ubuntu precise-security main restricted
+deb-src http://security.ubuntu.com/ubuntu precise-security main restricted
+deb http://security.ubuntu.com/ubuntu precise-security universe
+deb-src http://security.ubuntu.com/ubuntu precise-security universe
+deb http://security.ubuntu.com/ubuntu precise-security multiverse
+deb-src http://security.ubuntu.com/ubuntu precise-security multiverse
+
+## Uncomment the following two lines to add software from Canonical's
+## 'partner' repository.
+## This software is not part of Ubuntu, but is offered by Canonical and the
+## respective vendors as a service to Ubuntu users.
+# deb http://{{ mirror_host }}/archive.canonical.com/ubuntu precise partner
+# deb-src http://{{ mirror_host }}/archive.canonical.com/ubuntu precise partner
+
+## Uncomment the following two lines to add software from Ubuntu's
+## 'extras' repository.
+## This software is not part of Ubuntu, but is offered by third-party
+## developers who want to ship their latest software.
+# deb http://extras.ubuntu.com/ubuntu precise main
+# deb-src http://extras.ubuntu.com/ubuntu precise main
--- /dev/null
+# {{ ansible_managed }}
+# deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty main restricted
+
+# deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty-updates main restricted
+# deb http://security.ubuntu.com/ubuntu trusty-security main restricted
+
+# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
+# newer versions of the distribution.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty main restricted
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty main restricted
+
+## Major bug fix updates produced after the final release of the
+## distribution.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty-updates main restricted
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty-updates main restricted
+
+## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
+## team. Also, please note that software in universe WILL NOT receive any
+## review or updates from the Ubuntu security team.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty universe
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty universe
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty-updates universe
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty-updates universe
+
+## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
+## team, and may not be under a free licence. Please satisfy yourself as to
+## your rights to use the software. Also, please note that software in
+## multiverse WILL NOT receive any review or updates from the Ubuntu
+## security team.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty multiverse
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty multiverse
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty-updates multiverse
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty-updates multiverse
+
+## N.B. software from this repository may not have been tested as
+## extensively as that contained in the main release, although it includes
+## newer versions of some applications which may provide useful features.
+## Also, please note that software in backports WILL NOT receive any review
+## or updates from the Ubuntu security team.
+deb http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty-backports main restricted universe multiverse
+deb-src http://{{ mirror_host }}/archive.ubuntu.com/ubuntu trusty-backports main restricted universe multiverse
+
+deb http://security.ubuntu.com/ubuntu trusty-security main restricted
+deb-src http://security.ubuntu.com/ubuntu trusty-security main restricted
+deb http://security.ubuntu.com/ubuntu trusty-security universe
+deb-src http://security.ubuntu.com/ubuntu trusty-security universe
+deb http://security.ubuntu.com/ubuntu trusty-security multiverse
+deb-src http://security.ubuntu.com/ubuntu trusty-security multiverse
+
+## Uncomment the following two lines to add software from Canonical's
+## 'partner' repository.
+## This software is not part of Ubuntu, but is offered by Canonical and the
+## respective vendors as a service to Ubuntu users.
+# deb http://archive.canonical.com/ubuntu trusty partner
+# deb-src http://archive.canonical.com/ubuntu trusty partner
+
+## Uncomment the following two lines to add software from Ubuntu's
+## 'extras' repository.
+## This software is not part of Ubuntu, but is offered by third-party
+## developers who want to ship their latest software.
+# deb http://extras.ubuntu.com/ubuntu trusty main
+# deb-src http://extras.ubuntu.com/ubuntu trusty main
+
--- /dev/null
+# {{ ansible_managed }}
+deb http://{{ mirror_host }}/ftp.us.debian.org/debian wheezy main contrib non-free
+deb http://{{ mirror_host }}/security.debian.org/debian-security/ wheezy/updates main contrib non-free
+deb http://{{ mirror_host }}/ftp.us.debian.org/debian wheezy-backports main contrib non-free
--- /dev/null
+dir {{ cachefilesd_dir|default('/var/cache/fscache') }}
+tag {{ cachefilesd_tag|default('mycache') }}
+brun {{ cachefilesd_brun|default('10%') }}
+bcull {{ cachefilesd_bcull|default('7%') }}
+bstop {{ cachefilesd_bstop|default('3%') }}
+frun {{ cachefilesd_frun|default('10%') }}
+fcull {{ cachefilesd_fcull|default('7%') }}
+fstop {{ cachefilesd_fstop|default('3%') }}
+secctx {{ cachefilesd_secctx|default('system_u:system_r:cachefiles_kernel_t:s0') }}
--- /dev/null
+{% for server in ntp_servers %}
+server {{ server }} iburst
+{% endfor %}
+driftfile /var/lib/chrony/drift
+makestep 1.0 3
+rtcsync
--- /dev/null
+# {{ ansible_managed }}
+$CPAN::Config = {
+ 'applypatch' => q[],
+ 'auto_commit' => q[0],
+ 'build_cache' => q[100],
+ 'build_dir' => q[/home/{{ teuthology_user }}/.cpan/build],
+ 'build_dir_reuse' => q[0],
+ 'build_requires_install_policy' => q[yes],
+ 'bzip2' => q[/bin/bzip2],
+ 'cache_metadata' => q[1],
+ 'check_sigs' => q[0],
+ 'colorize_output' => q[0],
+ 'commandnumber_in_prompt' => q[1],
+ 'connect_to_internet_ok' => q[1],
+ 'cpan_home' => q[/home/{{ teuthology_user }}/.cpan],
+ 'ftp_passive' => q[1],
+ 'ftp_proxy' => q[],
+ 'getcwd' => q[cwd],
+ 'gpg' => q[/usr/bin/gpg],
+ 'gzip' => q[/bin/gzip],
+ 'halt_on_failure' => q[0],
+ 'histfile' => q[/home/{{ teuthology_user }}/.cpan/histfile],
+ 'histsize' => q[100],
+ 'http_proxy' => q[],
+ 'inactivity_timeout' => q[0],
+ 'index_expire' => q[1],
+ 'inhibit_startup_message' => q[0],
+ 'keep_source_where' => q[/home/{{ teuthology_user }}/.cpan/sources],
+ 'load_module_verbosity' => q[none],
+ 'make' => q[/usr/bin/make],
+ 'make_arg' => q[],
+ 'make_install_arg' => q[],
+ 'make_install_make_command' => q[/usr/bin/make],
+ 'makepl_arg' => q[INSTALLDIRS=site],
+ 'mbuild_arg' => q[],
+ 'mbuild_install_arg' => q[],
+ 'mbuild_install_build_command' => q[./Build],
+ 'mbuildpl_arg' => q[--installdirs site],
+ 'no_proxy' => q[],
+ 'pager' => q[/usr/bin/less],
+ 'patch' => q[/usr/bin/patch],
+ 'perl5lib_verbosity' => q[none],
+ 'prefer_external_tar' => q[1],
+ 'prefer_installer' => q[MB],
+ 'prefs_dir' => q[/home/{{ teuthology_user }}/.cpan/prefs],
+ 'prerequisites_policy' => q[follow],
+ 'scan_cache' => q[atstart],
+ 'shell' => q[/bin/bash],
+ 'show_unparsable_versions' => q[0],
+ 'show_upload_date' => q[0],
+ 'show_zero_versions' => q[0],
+ 'tar' => q[/bin/tar],
+ 'tar_verbosity' => q[none],
+ 'term_is_latin' => q[1],
+ 'term_ornaments' => q[1],
+ 'test_report' => q[0],
+ 'trust_test_report_history' => q[0],
+ 'unzip' => q[/usr/bin/unzip],
+ 'urllist' => [q[http://apt-mirror.sepia.ceph.com/CPAN/]],
+ 'use_sqlite' => q[0],
+ 'version_timeout' => q[15],
+ 'wget' => q[/usr/bin/wget],
+ 'yaml_load_code' => q[0],
+ 'yaml_module' => q[YAML],
+};
+1;
+__END__
--- /dev/null
+#!/bin/bash
+# {{ ansible_managed }}
+
+# Use a predictable PATH for cron/non-login invocations.
+PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
+
+# Don't run any post-remove scripts: doing that once per kernel takes
+# too long, and we don't normally remove kernels other than via this
+# script, which runs update-grub manually at the end.
+rm -f /etc/kernel/postrm.d/* 2> /dev/null
+
+# Collect every installed ceph kernel image except the running one.
+running_kernel=$(uname -r)
+kernels_to_remove=""
+for pkg in $(dpkg -l | grep linux-image | grep -i -e '^ii ' | grep ceph | grep -v "$running_kernel" | awk '{print $2}')
+do
+ kernels_to_remove="$pkg $kernels_to_remove"
+done
+
+# Word splitting of the package list is intentional here.
+apt-get -y remove $kernels_to_remove
+
+# Manually update grub since we disabled dpkg from doing it.
+update-grub
+
+# Clean the apt cache.
+apt-get clean
--- /dev/null
+# {{ ansible_managed }}
+#
+# /etc/exports: the access control list for filesystems which may be exported
+# to NFS clients. See exports(5).
+#
+# Example for NFSv2 and NFSv3:
+# /srv/homes hostname1(rw,sync,no_subtree_check) hostname2(ro,sync,no_subtree_check)
+#
+# Example for NFSv4:
+# /srv/nfs4 gss/krb5i(rw,sync,fsid=0,crossmnt,no_subtree_check)
+# /srv/nfs4/homes gss/krb5i(rw,sync,no_subtree_check)
+#
+# dummy export just to make nfs-kernel-server start
+/tmp 1.1.1.1(ro,sync,no_subtree_check)
--- /dev/null
+# {{ ansible_managed }}
+# /etc/fuse.conf - Configuration file for Filesystem in Userspace (FUSE)
+
+# Set the maximum number of FUSE mounts allowed to non-root users.
+# The default is 1000.
+#mount_max = 1000
+
+# Allow non-root users to specify the allow_other or allow_root mount options.
+user_allow_other
--- /dev/null
+# {{ ansible_managed }}
+GRUB_DEFAULT=saved
+GRUB_TIMEOUT=5
+GRUB_DISABLE_LINUX_UUID="true"
--- /dev/null
+# {{ ansible_managed }}
+cat <<EOF
+set timeout=5
+EOF
--- /dev/null
+# {{ ansible_managed }}
+#
+# /etc/security/limits.conf
+#
+#This file sets the resource limits for the users logged in via PAM.
+#It does not affect resource limits of the system services.
+#
+#Also note that configuration files in /etc/security/limits.d directory,
+#which are read in alphabetical order, override the settings in this
+#file in case the domain is the same or more specific.
+#That means for example that setting a limit for wildcard domain here
+#can be overriden with a wildcard setting in a config file in the
+#subdirectory, but a user specific setting here can be overriden only
+#with a user specific setting in the subdirectory.
+#
+#Each line describes a limit for a user in the form:
+#
+#<domain> <type> <item> <value>
+
+* soft core unlimited
--- /dev/null
+# {{ ansible_managed }}
+# /etc/modules: kernel modules to load at boot time.
+#
+# This file contains the names of kernel modules that should be loaded
+# at boot time, one per line. Lines beginning with "#" are ignored.
+
+loop
+lp
+rtc
+scsi_transport_iscsi
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+# /etc/ntp.conf, configuration for ntpd; see ntp.conf(5) for help
+
+driftfile /var/lib/ntp/ntp.drift
+
+
+# Enable this if you want statistics to be logged.
+statsdir /var/log/ntpstats/
+
+statistics loopstats peerstats rawstats clockstats sysstats
+filegen loopstats file loopstats type day enable
+filegen peerstats file peerstats type day enable
+filegen rawstats file rawstats type day enable
+filegen clockstats file clockstats type day enable
+filegen sysstats file sysstats type day enable
+
+
+# You do need to talk to an NTP server or two (or three).
+#server ntp.your-provider.example
+
+# pool.ntp.org maps to about 1000 low-stratum NTP servers. Your server will
+# pick a different set every time it starts up. Please consider joining the
+# pool: <http://www.pool.ntp.org/join.html>
+
+#clock1 is currently an alias to public ntp servers, which are 20-50ms off from
+#our internal ones!
+
+# found this guy from http://www.pool.ntp.org/user/ask, ~2.5ms ping time
+#server tock.phyber.com iburst minpoll 4 maxpoll 7
+
+#server clock1.dreamhost.com iburst dynamic
+#server clock2.dreamhost.com iburst dynamic
+#server clock3.dreamhost.com iburst minpoll 4 maxpoll 7
+#server 0.debian.pool.ntp.org iburst dynamic
+#server 1.debian.pool.ntp.org iburst dynamic
+#server 2.debian.pool.ntp.org iburst dynamic
+#server 3.debian.pool.ntp.org iburst dynamic
+
+{% for server in ntp_servers %}
+server {{ server }}
+{% endfor %}
+
+
+# Access control configuration; see /usr/share/doc/ntp-doc/html/accopt.html for
+# details. The web page <http://support.ntp.org/bin/view/Support/AccessRestrictions>
+# might also be helpful.
+#
+# Note that "restrict" applies to both servers and clients, so a configuration
+# that might be intended to block requests from certain clients could also end
+# up blocking replies from your own upstream servers.
+
+# By default, exchange time with everybody, but don't allow configuration.
+restrict -4 default kod notrap nomodify nopeer noquery
+restrict -6 default kod notrap nomodify nopeer noquery
+
+# Local users may interrogate the ntp server more closely.
+restrict 127.0.0.1
+restrict ::1
+
+# Clients from this (example!) subnet have unlimited access, but only if
+# cryptographically authenticated.
+#restrict 192.168.123.0 mask 255.255.255.0 notrust
+
+
+# If you want to provide time to your local subnet, change the next line.
+# (Again, the address is an example only.)
+#broadcast 192.168.123.255
+
+# If you want to listen to time broadcasts on your local subnet, de-comment the
+# next lines. Please do this only if you trust everybody on the network!
+#disable auth
+#broadcastclient
+
+#Greater accuracy
+tinker step 0.025
--- /dev/null
+[global]
+index-url = {{ pip_mirror_url }}
--- /dev/null
+# {{ ansible_managed }}
+* hard core unlimited
--- /dev/null
+# {{ ansible_managed }}
+{{ teuthology_user }} hard nofile 16384
--- /dev/null
+# {{ ansible_managed }}
+#
+# This is the ssh client system-wide configuration file. See
+# ssh_config(5) for more information. This file provides defaults for
+# users, and the values can be changed in per-user configuration files
+# or on the command line.
+
+# Applies to all hosts. Note ssh_config uses the first value obtained
+# for each option; the previous verbatim duplicates of the four
+# SendEnv/HashKnownHosts/GSSAPI lines were dead and have been removed.
+Host *
+ SendEnv LANG LC_*
+ HashKnownHosts yes
+ GSSAPIAuthentication yes
+ GSSAPIDelegateCredentials no
+ StrictHostKeyChecking no
--- /dev/null
+# {{ ansible_managed }}
+# $OpenBSD: sshd_config,v 1.80 2008/07/02 02:24:18 djm Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+# This sshd was compiled with PATH=/usr/local/bin:/bin:/usr/bin
+
+Protocol 2
+
+SyslogFacility AUTHPRIV
+
+PasswordAuthentication yes
+
+ChallengeResponseAuthentication no
+
+# GSSAPI options
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials yes
+
+UsePAM yes
+
+# Accept locale-related environment variables
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
+AcceptEnv XMODIFIERS
+
+X11Forwarding yes
+
+# override default of no subsystems
+Subsystem sftp /usr/libexec/openssh/sftp-server
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# $OpenBSD: sshd_config,v 1.90 2013/05/16 04:09:14 dtucker Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+# This sshd was compiled with PATH=/usr/local/bin:/usr/bin
+
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+
+SyslogFacility AUTHPRIV
+
+AuthorizedKeysFile .ssh/authorized_keys
+
+PasswordAuthentication no
+
+ChallengeResponseAuthentication no
+
+# GSSAPI options
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials yes
+
+UsePAM yes
+
+X11Forwarding yes
+UsePrivilegeSeparation sandbox # Default for new installations.
+
+# Accept locale-related environment variables
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
+AcceptEnv XMODIFIERS
+
+# override default of no subsystems
+Subsystem sftp /usr/libexec/openssh/sftp-server
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# $OpenBSD: sshd_config,v 1.90 2013/05/16 04:09:14 dtucker Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+# This sshd was compiled with PATH=/usr/local/bin:/usr/bin
+
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+
+SyslogFacility AUTHPRIV
+
+AuthorizedKeysFile .ssh/authorized_keys
+
+PasswordAuthentication yes
+
+ChallengeResponseAuthentication no
+
+# GSSAPI options
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials yes
+
+UsePAM yes
+
+X11Forwarding yes
+UsePrivilegeSeparation sandbox # Default for new installations.
+
+# Accept locale-related environment variables
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
+AcceptEnv XMODIFIERS
+
+# override default of no subsystems
+Subsystem sftp /usr/libexec/openssh/sftp-server
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# $OpenBSD: sshd_config,v 1.90 2013/05/16 04:09:14 dtucker Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+# This sshd was compiled with PATH=/usr/local/bin:/usr/bin
+
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+
+SyslogFacility AUTHPRIV
+
+AuthorizedKeysFile .ssh/authorized_keys
+
+PasswordAuthentication yes
+
+ChallengeResponseAuthentication no
+
+# GSSAPI options
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials yes
+
+UsePAM yes
+
+X11Forwarding yes
+UsePrivilegeSeparation sandbox # Default for new installations.
+
+# Accept locale-related environment variables
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
+AcceptEnv XMODIFIERS
+
+# override default of no subsystems
+Subsystem sftp /usr/libexec/openssh/sftp-server
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 768
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin yes
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+#PasswordAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+HostKey /etc/ssh/ssh_host_ed25519_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 1024
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin yes
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+#PasswordAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# $OpenBSD: sshd_config,v 1.90 2013/05/16 04:09:14 dtucker Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+SyslogFacility AUTHPRIV
+
+PasswordAuthentication no
+
+ChallengeResponseAuthentication no
+
+# GSSAPI options
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials yes
+
+UsePAM yes
+
+X11Forwarding yes
+UsePrivilegeSeparation sandbox # Default for new installations.
+
+# Accept locale-related environment variables
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
+AcceptEnv XMODIFIERS
+
+# override default of no subsystems
+Subsystem sftp /usr/libexec/openssh/sftp-server
+
+MaxSessions 1000
--- /dev/null
+# $OpenBSD: sshd_config,v 1.103 2018/04/09 20:41:22 tj Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin
+
+# The strategy used for options in the default sshd_config shipped with
+# OpenSSH is to specify options with their default value where
+# possible, but leave them commented. Uncommented options override the
+# default value.
+
+#Port 22
+#AddressFamily any
+#ListenAddress 0.0.0.0
+#ListenAddress ::
+
+#HostKey /etc/ssh/ssh_host_rsa_key
+#HostKey /etc/ssh/ssh_host_ecdsa_key
+#HostKey /etc/ssh/ssh_host_ed25519_key
+
+# Ciphers and keying
+#RekeyLimit default none
+
+# Logging
+#SyslogFacility AUTH
+#LogLevel INFO
+
+# Authentication:
+
+#LoginGraceTime 2m
+PermitRootLogin yes
+#StrictModes yes
+#MaxAuthTries 6
+#MaxSessions 10
+
+#PubkeyAuthentication yes
+
+# The default is to check both .ssh/authorized_keys and .ssh/authorized_keys2
+# but this is overridden so installations will only check .ssh/authorized_keys
+AuthorizedKeysFile .ssh/authorized_keys
+
+#AuthorizedPrincipalsFile none
+
+#AuthorizedKeysCommand none
+#AuthorizedKeysCommandUser nobody
+
+# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
+#HostbasedAuthentication no
+# Change to yes if you don't trust ~/.ssh/known_hosts for
+# HostbasedAuthentication
+#IgnoreUserKnownHosts no
+# Don't read the user's ~/.rhosts and ~/.shosts files
+#IgnoreRhosts yes
+
+# To disable tunneled clear text passwords, change to no here!
+#PasswordAuthentication yes
+#PermitEmptyPasswords no
+
+# Change to no to disable s/key passwords
+#ChallengeResponseAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+#KerberosGetAFSToken no
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+#GSSAPIStrictAcceptorCheck yes
+#GSSAPIKeyExchange no
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+#AllowAgentForwarding yes
+#AllowTcpForwarding yes
+#GatewayPorts no
+X11Forwarding yes
+#X11DisplayOffset 10
+#X11UseLocalhost yes
+#PermitTTY yes
+#PrintMotd yes
+#PrintLastLog yes
+#TCPKeepAlive yes
+#PermitUserEnvironment no
+#Compression delayed
+#ClientAliveInterval 0
+#ClientAliveCountMax 3
+#UseDNS no
+#PidFile /run/sshd.pid
+#MaxStartups 10:30:100
+#PermitTunnel no
+#ChrootDirectory none
+#VersionAddendum none
+
+# no default banner path
+#Banner none
+
+# override default of no subsystems
+Subsystem sftp /usr/lib/ssh/sftp-server
+
+# This enables accepting locale environment variables LC_* LANG, see sshd_config(5).
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL
+
+# Example of overriding settings on a per-user basis
+#Match User anoncvs
+# X11Forwarding no
+# AllowTcpForwarding no
+# PermitTTY no
+# ForceCommand cvs server
--- /dev/null
+AuthorizedKeysFile .ssh/authorized_keys
+UsePAM yes
+UsePrivilegeSeparation sandbox
+Subsystem sftp /usr/lib/ssh/sftp-server
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+HostKey /etc/ssh/ssh_host_ed25519_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 1024
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin without-password
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+#PasswordAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# $OpenBSD: sshd_config,v 1.80 2008/07/02 02:24:18 djm Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+# This sshd was compiled with PATH=/usr/local/bin:/bin:/usr/bin
+
+Protocol 2
+
+SyslogFacility AUTHPRIV
+PasswordAuthentication yes
+
+ChallengeResponseAuthentication no
+
+# GSSAPI options
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials yes
+
+UsePAM yes
+
+# Accept locale-related environment variables
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
+AcceptEnv XMODIFIERS
+
+X11Forwarding yes
+
+# override default of no subsystems
+Subsystem sftp /usr/libexec/openssh/sftp-server
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# $OpenBSD: sshd_config,v 1.90 2013/05/16 04:09:14 dtucker Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+# This sshd was compiled with PATH=/usr/local/bin:/usr/bin
+
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+
+SyslogFacility AUTHPRIV
+
+AuthorizedKeysFile .ssh/authorized_keys
+
+PasswordAuthentication yes
+
+ChallengeResponseAuthentication no
+
+# GSSAPI options
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials yes
+
+UsePAM yes
+
+X11Forwarding yes
+UsePrivilegeSeparation sandbox # Default for new installations.
+
+# Accept locale-related environment variables
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
+AcceptEnv XMODIFIERS
+
+# override default of no subsystems
+Subsystem sftp /usr/libexec/openssh/sftp-server
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# $OpenBSD: sshd_config,v 1.90 2013/05/16 04:09:14 dtucker Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+# This sshd was compiled with PATH=/usr/local/bin:/usr/bin
+
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+
+SyslogFacility AUTHPRIV
+
+AuthorizedKeysFile .ssh/authorized_keys
+
+PasswordAuthentication yes
+
+ChallengeResponseAuthentication no
+
+# GSSAPI options
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials yes
+
+UsePAM yes
+
+X11Forwarding yes
+UsePrivilegeSeparation sandbox # Default for new installations.
+
+# Accept locale-related environment variables
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
+AcceptEnv XMODIFIERS
+
+# override default of no subsystems
+Subsystem sftp /usr/libexec/openssh/sftp-server
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# $OpenBSD: sshd_config,v 1.90 2013/05/16 04:09:14 dtucker Exp $
+
+# This is the sshd server system-wide configuration file. See
+# sshd_config(5) for more information.
+
+# This sshd was compiled with PATH=/usr/local/bin:/usr/bin
+
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+
+SyslogFacility AUTHPRIV
+
+AuthorizedKeysFile .ssh/authorized_keys
+
+PasswordAuthentication yes
+
+ChallengeResponseAuthentication no
+
+# GSSAPI options
+GSSAPIAuthentication yes
+GSSAPICleanupCredentials yes
+
+UsePAM yes
+
+X11Forwarding yes
+UsePrivilegeSeparation sandbox # Default for new installations.
+
+# Accept locale-related environment variables
+AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
+AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
+AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
+AcceptEnv XMODIFIERS
+
+# override default of no subsystems
+Subsystem sftp /usr/libexec/openssh/sftp-server
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 768
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin yes
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+PasswordAuthentication no
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+HostKey /etc/ssh/ssh_host_ed25519_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 1024
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin without-password
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+#PasswordAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+HostKey /etc/ssh/ssh_host_ed25519_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 1024
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin without-password
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+#PasswordAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+HostKey /etc/ssh/ssh_host_ed25519_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 1024
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin without-password
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+#PasswordAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+HostKey /etc/ssh/ssh_host_ed25519_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 1024
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin without-password
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+#PasswordAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+HostKey /etc/ssh/ssh_host_ed25519_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 1024
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin without-password
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+#PasswordAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+# {{ ansible_managed }}
+# Package generated configuration file
+# See the sshd_config(5) manpage for details
+
+# What ports, IPs and protocols we listen for
+Port 22
+# Use these options to restrict which interfaces/protocols sshd will bind to
+#ListenAddress ::
+#ListenAddress 0.0.0.0
+Protocol 2
+# HostKeys for protocol version 2
+HostKey /etc/ssh/ssh_host_rsa_key
+HostKey /etc/ssh/ssh_host_dsa_key
+HostKey /etc/ssh/ssh_host_ecdsa_key
+HostKey /etc/ssh/ssh_host_ed25519_key
+#Privilege Separation is turned on for security
+UsePrivilegeSeparation yes
+
+# Lifetime and size of ephemeral version 1 server key
+KeyRegenerationInterval 3600
+ServerKeyBits 1024
+
+# Logging
+SyslogFacility AUTH
+LogLevel INFO
+
+# Authentication:
+LoginGraceTime 120
+PermitRootLogin without-password
+StrictModes yes
+
+RSAAuthentication yes
+PubkeyAuthentication yes
+#AuthorizedKeysFile %h/.ssh/authorized_keys
+
+# Don't read the user's ~/.rhosts and ~/.shosts files
+IgnoreRhosts yes
+# For this to work you will also need host keys in /etc/ssh_known_hosts
+RhostsRSAAuthentication no
+# similar for protocol version 2
+HostbasedAuthentication no
+# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
+#IgnoreUserKnownHosts yes
+
+# To enable empty passwords, change to yes (NOT RECOMMENDED)
+PermitEmptyPasswords no
+
+# Change to yes to enable challenge-response passwords (beware issues with
+# some PAM modules and threads)
+ChallengeResponseAuthentication no
+
+# Change to no to disable tunnelled clear text passwords
+#PasswordAuthentication yes
+
+# Kerberos options
+#KerberosAuthentication no
+#KerberosGetAFSToken no
+#KerberosOrLocalPasswd yes
+#KerberosTicketCleanup yes
+
+# GSSAPI options
+#GSSAPIAuthentication no
+#GSSAPICleanupCredentials yes
+
+X11Forwarding yes
+X11DisplayOffset 10
+PrintMotd no
+PrintLastLog yes
+TCPKeepAlive yes
+#UseLogin no
+
+#MaxStartups 10:30:60
+#Banner /etc/issue.net
+
+# Allow client to pass locale environment variables
+AcceptEnv LANG LC_*
+
+Subsystem sftp /usr/lib/openssh/sftp-server
+
+# Set this to 'yes' to enable PAM authentication, account processing,
+# and session processing. If this is enabled, PAM authentication will
+# be allowed through the ChallengeResponseAuthentication and
+# PasswordAuthentication. Depending on your PAM configuration,
+# PAM authentication via ChallengeResponseAuthentication may bypass
+# the setting of "PermitRootLogin without-password".
+# If you just want the PAM account and session checks to run without
+# PAM authentication, then enable this but set PasswordAuthentication
+# and ChallengeResponseAuthentication to 'no'.
+UsePAM yes
+
+MaxSessions 1000
--- /dev/null
+## {{ ansible_managed }}
+
+## Sudoers allows particular users to run various commands as
+## the root user, without needing the root password.
+##
+## Examples are provided at the bottom of the file for collections
+## of related commands, which can then be delegated out to particular
+## users or groups.
+##
+## This file must be edited with the 'visudo' command.
+
+# Disable "ssh hostname sudo <cmd>", because it will show the password in clear text.
+# You have to run "ssh -t hostname sudo <cmd>".
+#
+Defaults !requiretty
+
+# Allow sudo to run even when it cannot disable echo on the tty (the stock
+# default, !visiblepw, refuses to run in that case). Required, together with
+# !requiretty above, so sudo works without a tty (e.g. non-interactive ssh).
+#
+Defaults visiblepw
+
+# Preserving HOME has security implications since many programs
+# use it when searching for configuration files. Note that HOME
+# is already set when the env_reset option is enabled, so
+# this option is only effective for configurations where either
+# env_reset is disabled or HOME is present in the env_keep list.
+#
+Defaults always_set_home
+
+Defaults env_reset
+Defaults env_keep = "COLORS DISPLAY HOSTNAME HISTSIZE INPUTRC KDEDIR LS_COLORS"
+Defaults env_keep += "MAIL PS1 PS2 QTDIR USERNAME LANG LC_ADDRESS LC_CTYPE"
+Defaults env_keep += "LC_COLLATE LC_IDENTIFICATION LC_MEASUREMENT LC_MESSAGES"
+Defaults env_keep += "LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER LC_TELEPHONE"
+Defaults env_keep += "LC_TIME LC_ALL LANGUAGE LINGUAS _XKB_CHARSET XAUTHORITY"
+
+Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin
+
+## Allow root to run any commands anywhere
+root ALL=(ALL) ALL
+
+## Allows people in group wheel to run all commands
+%wheel ALL=(ALL) ALL
+
+{{ teuthology_user }} ALL=(ALL) NOPASSWD:ALL
+
+#includedir /etc/sudoers.d
--- /dev/null
+# {{ ansible_managed }}
+check_certificate = off
+passive_ftp = on
--- /dev/null
+#
+# {{ ansible_managed }}
+#
+
+[{{ item.key }}]
+{% for k, v in item.value.items() | sort -%}
+ {{ k }}={{ v }}
+{% endfor %}
--- /dev/null
+---
+ntp_service_name: ntp
+ssh_service_name: ssh
+nfs_service: nfs-kernel-server
+
+packages_to_remove:
+ # multipath interferes with krbd tests
+ - multipath-tools
+ # openmpi-common conflicts with mpich stuff
+ - openmpi-common
+ # tgt interferes with ceph-iscsi tests
+ - tgt
+
+ceph_packages_to_remove:
+ - ceph
+ - ceph-common
+ - libcephfs1
+ - radosgw
+ - python-ceph
+ - python-rados
+ - python-cephfs
+ - python-rbd
+ - librbd1
+ - librados2
+ - ceph-fs-common-dbg
+ - ceph-fs-common
+
+packages: []
+common_packages: []
+
+apt_repos: []
+common_apt_repos: []
+
+pip_packages_to_install:
+ - remoto>=0.0.35
--- /dev/null
+---
+# vars specific to centos 6.x
+
+yum_repos:
+ centos6-fcgi-ceph:
+ name: CentOS 6 Local fastcgi Repo
+ baseurl: "http://{{ gitbuilder_host }}/mod_fastcgi-rpm-centos6-x86_64-basic/ref/master/"
+ enabled: 1
+ gpgcheck: 0
+ priority: 2
+ centos6-misc-ceph:
+ name: Cent OS 6 Local misc Repo
+ baseurl: "http://{{ mirror_host }}/misc-rpms/"
+ enabled: 1
+ gpgcheck: 0
+ priority: 2
+ rpmforge:
+ name: Red Hat Enterprise $releasever - RPMforge.net - dag
+ baseurl: "http://{{ mirror_host }}/rpmforge/"
+ enabled: 1
+ gpgcheck: 0
+ protect: 0
+ lab-extras:
+ name: lab-extras
+ baseurl: "http://{{ mirror_host }}/lab-extras/centos6/"
+ enabled: 1
+ gpgcheck: 0
+ priority: 2
+
+packages:
+ - '@core'
+ - '@base'
+ - yum-plugin-priorities
+ - yum-plugin-fastestmirror
+ - redhat-lsb
+ - sysstat
+ - gdb
+ - git-all
+ - python-configobj
+ # for running ceph
+ - libedit
+ - openssl098e
+ - boost-thread
+ - xfsprogs
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse
+ - fuse-libs
+ ###
+ ###
+ ###
+ - openssl
+ - libuuid
+ - btrfs-progs
+ # for compiling helpers
+ - libatomic_ops-devel
+ ###
+ # used by workunits
+ - attr
+ - valgrind
+ - python-nose
+ - mpich2
+ - mpich2-devel
+ - ant
+ - fsstress
+ - iozone
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext
+ - uuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ ###
+ # for blktrace and seekwatcher
+ - blktrace
+ - numpy
+ - python-matplotlib
+ ###
+ # for qemu
+ - qemu-kvm
+ - usbredir
+ - genisoimage
+ ###
+ # for apache and rgw
+ - httpd
+ - httpd-devel
+ - httpd-tools
+ - mod_ssl
+ - mod_fastcgi-2.4.7-1.ceph.el6
+ ### for swift and s3-tests
+ - libev-devel
+ - python-devel
+ # for pretty-printing xml
+ - perl-XML-Twig
+ # for java bindings, hadoop, etc.
+ - java-1.7.0-openjdk-devel
+ - junit4
+ # for nfs
+ - nfs-utils
+
+epel_packages:
+ # for running ceph
+ - cryptopp-devel
+ - cryptopp
+ - fcgi
+ # used by workunits
+ - dbench
+ # used by workunits
+ - fuse-sshfs
+ - bonnie++
+ # for json_xs to investigate JSON by hand
+ - perl-JSON
+ # for ceph-deploy
+ - python-virtualenv
+ # for setting BIOS settings
+ - smbios-utils
--- /dev/null
+---
+# vars specific to centos 7.x
+
+yum_repos:
+ centos7-fcgi-ceph:
+ name: CentOS 7 Local fastcgi Repo
+ baseurl: "http://{{ gitbuilder_host }}/mod_fastcgi-rpm-centos7-x86_64-basic/ref/master/"
+ enabled: 1
+ gpgcheck: 0
+ lab-extras:
+ name: lab-extras
+ baseurl: "http://{{ mirror_host }}/lab-extras/centos7/"
+ enabled: 1
+ gpgcheck: 0
+
+packages:
+ - '@core'
+ - '@base'
+ - yum-plugin-priorities
+ - yum-plugin-fastestmirror
+ - redhat-lsb
+ - sysstat
+ - gdb
+ - git-all
+ - python-configobj
+ - gcc-c++
+ # for running ceph
+ - libedit
+ - openssl098e
+ - boost-thread
+ - xfsprogs
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse
+ - fuse-libs
+ ###
+ # for ceph-deploy
+ - python-virtualenv
+ ###
+ - openssl
+ - libuuid
+ - btrfs-progs
+ # used by workunits
+ - attr
+ - valgrind
+ - python-nose
+ - mpich
+ - podman
+  # for cephadmunit.py to uniformly run 'docker kill -p ...'
+ - podman-docker
+ - ant
+ - iozone
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext
+ - libuuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ - xfsprogs-devel
+ ###
+ # for blktrace and seekwatcher
+ - blktrace
+ - numpy
+ - python-matplotlib
+ ###
+ # for qemu
+ - qemu-kvm
+ - usbredir
+ - genisoimage
+ ###
+ # for apache and rgw
+ - httpd
+ - httpd-devel
+ - httpd-tools
+ - mod_ssl
+ - mod_fastcgi-2.4.7-1.ceph.el7.centos
+ ### for swift and s3-tests
+ - libev-devel
+ # for pretty-printing xml
+ - perl-XML-Twig
+ # for java bindings, hadoop, etc.
+ - java-1.6.0-openjdk-devel
+ - junit4
+ # for nfs
+ - nfs-utils
+ # for xfstests
+ - ncurses-devel
+ # for s3 tests
+  - python-devel
+ - perl-CPAN
+ - python3
+
+epel_packages:
+ # for running ceph
+ - cryptopp-devel
+ - cryptopp
+ - fcgi
+ # used by workunits
+ - dbench
+ # used by workunits
+ - fuse-sshfs
+ - bonnie++
+ # for json_xs to investigate JSON by hand
+ - perl-JSON-XS
--- /dev/null
+---
+# vars specific to any centos 8.x version
+# some of these will be overridden by vars in centos_8_stream.yml
+
+common_yum_repos:
+ lab-extras:
+ name: "lab-extras"
+ baseurl: "http://{{ mirror_host }}/lab-extras/8/"
+ enabled: 1
+ gpgcheck: 0
+
+copr_repos:
+ - ceph/python3-asyncssh
+
+packages_to_upgrade:
+ - libgcrypt # explicitly tied to qemu build
+
+packages:
+ - redhat-lsb-core
+ # for package-cleanup
+ - dnf-utils
+ - sysstat
+ - libedit
+ - boost-thread
+ - xfsprogs
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse-libs
+ - openssl
+ - libuuid
+ - podman
+ # for cephadmunit.py to uniformly run 'docker kill -p ...'
+ - podman-docker
+ - attr
+ - ant
+ - lsof
+ - gettext
+ - bc
+ - xfsdump
+ - blktrace
+ - usbredir
+ - libev-devel
+ - valgrind
+ - nfs-utils
+ # for xfstests
+ - ncurses-devel
+ # for s3 tests
+ # for workunits,
+ - gcc
+ - git
+ # qa/workunits/rados/test_python.sh
+ - python3-nose
+ # for cram tests
+ - python3-virtualenv
+ # for rbd qemu tests
+ - genisoimage
+ - qemu-img
+ - qemu-kvm-core
+ - qemu-kvm-block-rbd
+ # for pjd tests
+ - libacl-devel
+ # for fs tests,
+ - autoconf
+ # for test-crash.sh
+ - gdb
+ - iozone
+
+epel_packages:
+ - dbench
+
+nfs_service: nfs-server
+
+ntp_service_name: chronyd
--- /dev/null
+---
+# vars specific to centos stream version 8.x
+# these will override vars in centos_8.yml
+
+packages_to_upgrade:
+ - libgcrypt # explicitly tied to qemu build
+
+ # centos stream additions start here
+ - systemd
+
+packages:
+ - redhat-lsb-core
+ # for package-cleanup
+ - dnf-utils
+ - sysstat
+ - libedit
+ - boost-thread
+ - xfsprogs
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse-libs
+ - openssl
+ - libuuid
+ - podman
+ # for cephadmunit.py to uniformly run 'docker kill -p ...'
+ - podman-docker
+ - attr
+ - ant
+ - lsof
+ - gettext
+ - bc
+ - xfsdump
+ - blktrace
+ - usbredir
+ - libev-devel
+ - valgrind
+ - nfs-utils
+ # for xfstests
+ - ncurses-devel
+ # for s3 tests
+ # for workunits,
+ - gcc
+ - git
+ # qa/workunits/rados/test_python.sh
+ - python3-nose
+ # for cram tests
+ - python3-virtualenv
+ # for rbd qemu tests
+ - genisoimage
+ - qemu-img
+ - qemu-kvm-core
+ - qemu-kvm-block-rbd
+ # for pjd tests
+ - libacl-devel
+ # for fs tests,
+ - autoconf
+ # for test-crash.sh
+ - gdb
+ - iozone
+
+ # centos stream additions start here
+ - lvm2
+
+epel_packages:
+ - dbench
--- /dev/null
+---
+# vars specific to any centos 9.x version
+
+common_yum_repos:
+ lab-extras:
+ name: "lab-extras"
+ baseurl: "http://{{ mirror_host }}/lab-extras/9/"
+ enabled: 1
+ gpgcheck: 0
+
+
+# When mirrors become available, these will be filenames in roles/testnodes/templates/mirrorlists/9/
+yum_mirrorlists: []
+
+packages_to_upgrade:
+ - libgcrypt # explicitly tied to qemu build
+
+packages:
+ # for package-cleanup
+ - dnf-utils
+ - sysstat
+ - libedit
+ - boost-thread
+ - xfsprogs
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse-libs
+ - openssl
+ - libuuid
+ - podman
+ # for cephadmunit.py to uniformly run 'docker kill -p ...'
+# Doesn't exist yet
+# - podman-docker
+ - attr
+# - ant
+ - lsof
+ - gettext
+ - bc
+ - xfsdump
+ - blktrace
+ - usbredir
+# - libev-devel
+ - valgrind
+ - nfs-utils
+ # for xfstests
+ - ncurses-devel
+ # for s3 tests
+ # for workunits,
+ - gcc
+ - git
+ # qa/workunits/rados/test_python.sh
+# - python3-nose
+ # for cram tests
+# - python3-virtualenv
+ # for rbd qemu tests
+ - genisoimage
+ - qemu-img
+ - qemu-kvm-core
+ - qemu-kvm-block-rbd
+ # for pjd tests
+ - libacl-devel
+ # for fs tests,
+ - autoconf
+ # for test-crash.sh
+ - gdb
+ - iozone
+
+epel_packages:
+ - dbench
+
+nfs_service: nfs-server
+
+ntp_service_name: chronyd
--- /dev/null
+---
+apt_repos:
+ - "deb http://ceph.com/debian-dumpling/ wheezy main"
+ - "deb http://gitbuilder.ceph.com/libapache-mod-fastcgi-deb-wheezy-x86_64-basic/ref/master/ wheezy main"
+
+packages:
+ - lsb-release
+ - build-essential
+ - sysstat
+ - gdb
+ - python-configobj
+ - python-gevent
+ - python-dev
+ - python-virtualenv
+ - libev-dev
+ - fuse
+ - libssl1.0.0
+ - libgoogle-perftools4
+ - libboost-thread1.49.0
+ - cryptsetup-bin
+ - libcrypto++9
+ - iozone3
+ - libmpich2-3
+ - collectl
+ - nfs-kernel-server
+ # for running ceph
+ - libedit2
+ - xfsprogs
+ - gdisk
+ - parted
+ ###
+ # for setting BIOS settings
+ - libsmbios-bin
+ ###
+ - libuuid1
+ - libfcgi
+ - btrfs-tools
+ # for compiling helpers and such
+ - libatomic-ops-dev
+ ###
+ # used by workunits
+ - git-core
+ - attr
+ - dbench
+ - bonnie++
+ - valgrind
+ - python-nose
+ - mpich2
+ - libmpich2-dev
+ - ant
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext
+ - uuid-dev
+ - libacl1-dev
+ - bc
+ - xfsdump
+ - dmapi
+ - xfslibs-dev
+ ###
+ # For Mark Nelson
+ - sysprof
+ - pdsh
+ ###
+ # for blktrace and seekwatcher
+ - blktrace
+ - python-numpy
+ - python-matplotlib
+ - mencoder
+ ###
+ # for qemu
+ - kvm
+ - genisoimage
+ ###
+ # for json_xs to investigate JSON by hand
+ - libjson-xs-perl
+ ###
+ # for pretty-printing xml
+ - xml-twig-tools
+ ###
+ # for java bindings, hadoop, etc.
+ - default-jdk
+ - junit4
+ ###
+ # for samba testing
+ - cifs-utils
+ ###
+ # DistCC for arm
+ - distcc
+
+packages_to_upgrade:
+ - apt
+ - libcurl3-gnutls
+ - apache2
+ - libapache2-mod-fastcgi
+ - libfcgi0ldbl
--- /dev/null
+---
+packages:
+ - lsb-release
+ - build-essential
+ - sysstat
+ - gdb
+ - python-configobj
+ - python-gevent
+ - python-dev
+ - python-virtualenv
+ - libev-dev
+ - fuse
+ - libssl1.0.0
+ - libgoogle-perftools4
+ - cryptsetup-bin
+ - libcrypto++9
+ - iozone3
+ - docker.io
+ - collectl
+ - nfs-kernel-server
+ # for running ceph
+ - libedit2
+ - xfsprogs
+ - gdisk
+ - parted
+ ###
+ # for setting BIOS settings
+ - libsmbios-bin
+ ###
+ - libuuid1
+ - libfcgi
+ - btrfs-tools
+ # for compiling helpers and such
+ - libatomic-ops-dev
+ ###
+ # used by workunits
+ - git-core
+ - attr
+ - dbench
+ - bonnie++
+ - valgrind
+ - python-nose
+ - mpich2
+ - libmpich2-dev
+ - ant
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext
+ - uuid-dev
+ - libacl1-dev
+ - bc
+ - xfsdump
+ - dmapi
+ - xfslibs-dev
+ ###
+ # For Mark Nelson
+ - sysprof
+ - pdsh
+ ###
+ # for blktrace and seekwatcher
+ - blktrace
+ - python-numpy
+ - python-matplotlib
+ ###
+ # for qemu
+ - kvm
+ - genisoimage
+ ###
+ # for json_xs to investigate JSON by hand
+ - libjson-xs-perl
+ ###
+ # for pretty-printing xml
+ - xml-twig-tools
+ ###
+ # for java bindings, hadoop, etc.
+ - default-jdk
+ - junit4
+ ###
+ # for samba testing
+ - cifs-utils
+ ###
+ # DistCC for arm
+ - distcc
+
+#NOTE: these packages were not found for debian 8, but are present for debian 7
+#- mencoder
+#- libmpich2-3
+#- libboost-thread1.49.0
+
+packages_to_upgrade:
+ - apt
+ - libcurl3-gnutls
+ - apache2
+ - libapache2-mod-fastcgi
+ - libfcgi0ldbl
--- /dev/null
+yum_systems.yml
\ No newline at end of file
--- /dev/null
+---
+# This is empty on purpose. Used as the last line
+# when using include_vars with with_first_found when
+# the var file might not exist.
+#
+# For example, there is no rhel 6.5 var file because it's not needed
+# but there is a rhel 7.0 var file that we need to include. Using this empty.yml
+# as the last line in with_first_found allows include_vars to work across different distros
+# where the var file might not be needed.
+#
+# Maybe related issue:
+# https://github.com/ansible/ansible/issues/10000
--- /dev/null
+---
+packages_to_upgrade:
+ - leveldb
+
+packages_to_remove:
+ - ceph-libs
+
+packages:
+ - '@core'
+ - yum-plugin-priorities
+ - redhat-lsb
+ - sysstat
+ - gdb
+ - git-all
+ - python-configobj
+ # for running ceph
+ - libedit
+ - openssl-devel
+ - google-perftools-devel
+ - boost-thread
+ - xfsprogs
+ - gdisk
+ - parted
+ - libgcrypt
+ - cryptopp-devel
+ - cryptopp
+ ###
+ # for ceph-deploy
+ - python-virtualenv
+ ###
+ # for setting BIOS settings
+ - smbios-utils
+ ###
+ - openssl
+ - libuuid
+ - fcgi-devel
+ - btrfs-progs
+ # for compiling helpers
+ - libatomic_ops-devel
+ ###
+ # used by workunits
+ - attr
+ - valgrind
+ - python-nose
+ - mpich2
+ - mpich2-devel
+ - ant
+ - dbench
+ - bonnie++
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext
+ - uuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ ###
+ # for blktrace and seekwatcher
+ - blktrace
+ - numpy
+ - python-matplotlib
+ # for json_xs to investigate JSON by hand
+ - perl-JSON
+ # for pretty-printing xml
+ - perl-XML-Twig
+ # for java bindings, hadoop, etc.
+ - java-1.8.0-openjdk-devel
+ - junit
+ # for nfs
+ - nfs-utils
+ # python-pip is installed via roles/testnode/tasks/pip.yml on other rpm-based distros
+ - python-pip
--- /dev/null
+---
+# vars specific to openSUSE Leap 15.0
+packages_to_remove:
+ - gettext-runtime-mini
+
+packages:
+ - lsb-release
+ - sysstat
+ - gdb
+ - make
+ - git
+ - python-configobj
+ # for running ceph
+ - libedit0
+# - libboost_thread1_54_0
+ - libboost_thread1_66_0
+ - xfsprogs
+ - podman
+ - gptfdisk
+ - parted
+ - libgcrypt20
+ - fuse
+ - fuse-devel
+ - libfuse2
+ ###
+ # for ceph-deploy
+ - python-virtualenv
+ ###
+ - openssl
+ - libuuid1
+ - btrfsprogs
+ # used by workunits
+ - attr
+ - valgrind
+ - python-nose
+ - ant
+# - iozone
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext-runtime
+ - libuuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ - xfsprogs-devel
+ ###
+ # for blktrace and seekwatcher
+ - blktrace
+ - python-numpy
+ - python-matplotlib
+ ###
+ # for qemu
+ - qemu-kvm
+ - usbredir
+# - genisoimage
+ ###
+ # for apache and rgw
+ - apache2
+ - apache2-devel
+ - apache2-utils
+# - apache2-mod_fastcgi
+ ###
+ - libevent-devel
+ # for pretty-printing xml
+ - perl-XML-Twig
+ # for java bindings, hadoop, etc.
+ - java-1_8_0-openjdk-devel
+ - junit
+ # for disk/etc monitoring
+ - smartmontools
+ # for nfs
+ - nfs-kernel-server
+ # for xfstests
+ - ncurses-devel
--- /dev/null
+---
+# vars specific to openSUSE Leap 15.1
+packages_to_remove:
+ - gettext-runtime-mini
+
+packages:
+ - lsb-release
+ - sysstat
+ - gdb
+ - make
+ - git
+ - python-configobj
+ # for running ceph
+ - libedit0
+# - libboost_thread1_54_0
+ - libboost_thread1_66_0
+ - xfsprogs
+ - podman
+ - gptfdisk
+ - parted
+ - libgcrypt20
+ - fuse
+ - fuse-devel
+ - libfuse2
+ ###
+ # for ceph-deploy
+ - python-virtualenv
+ ###
+ - openssl
+ - libuuid1
+ - btrfsprogs
+ # used by workunits
+ - attr
+ - valgrind
+ - python-nose
+ - ant
+# - iozone
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext-runtime
+ - libuuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ - xfsprogs-devel
+ ###
+ # for blktrace and seekwatcher
+ - blktrace
+ - python-numpy
+ - python-matplotlib
+ ###
+ # for qemu
+ - qemu-kvm
+ - usbredir
+# - genisoimage
+ ###
+ # for apache and rgw
+ - apache2
+ - apache2-devel
+ - apache2-utils
+# - apache2-mod_fastcgi
+ ###
+ - libevent-devel
+ # for pretty-printing xml
+ - perl-XML-Twig
+ # for java bindings, hadoop, etc.
+ - java-1_8_0-openjdk-devel
+ - junit
+ # for disk/etc monitoring
+ - smartmontools
+ # for nfs
+ - nfs-kernel-server
+ # for xfstests
+ - ncurses-devel
+ - lvm2
+ # missing packages in openSUSE minimal image
+ #- chrony
+ #- make
+ #- gcc
+ #- gcc-c++
+  - rsyslog
+ - wget
--- /dev/null
+---
+# vars specific to openSUSE Leap 15.2
+packages_to_remove:
+ - gettext-runtime-mini
+ - python
+ - python-base
+
+packages:
+ - python3-base
+ - lsb-release
+ - sysstat
+ - gdb
+ - make
+ - zypper
+ - git
+ - python3-configobj
+ # for running ceph
+ - libedit0
+ - xfsprogs
+ - podman
+ - gptfdisk
+ - parted
+ - libgcrypt20
+ - fuse
+ - fuse-devel
+ - libfuse2
+ ###
+ - openssl
+ - libuuid1
+ - btrfsprogs
+ # used by workunits
+ - attr
+ - valgrind
+ - python3-nose
+ - ant
+# - iozone
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext-runtime
+ - libuuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ - xfsprogs-devel
+ ###
+ # for blktrace and seekwatcher
+ - blktrace
+ - python3-numpy
+ - python3-matplotlib
+ ###
+ # for qemu
+ - qemu-kvm
+ - usbredir
+# - genisoimage
+ ###
+ - libevent-devel
+ # for pretty-printing xml
+# - perl-XML-Twig
+ # for java bindings, hadoop, etc.
+# - java-1_8_0-openjdk-devel
+# - junit
+ # for disk/etc monitoring
+ - smartmontools
+ # for nfs
+ - nfs-kernel-server
+ # for xfstests
+ - ncurses-devel
+ - lvm2
+ # missing packages in openSUSE minimal image
+# - chrony
+# - gcc
+# - gcc-c++
+ - rsyslog
+ - wget
--- /dev/null
+---
+# vars specific to rhel 6.x
+
+common_yum_repos:
+ lab-extras:
+ name: "lab-extras"
+ baseurl: "http://{{ mirror_host }}/lab-extras/rhel6/"
+ enabled: 1
+ gpgcheck: 0
+ priority: 2
+ centos6-fcgi-ceph:
+ name: "Cent OS 6 Local fastcgi Repo"
+ baseurl: "http://{{ gitbuilder_host }}/mod_fastcgi-rpm-rhel6-x86_64-basic/ref/master/"
+ enabled: 1
+ gpgcheck: 0
+ priority: 2
+ centos6-misc-ceph:
+ name: "Cent OS 6 Local misc Repo"
+ baseurl: "http://{{ mirror_host }}/misc-rpms/"
+ enabled: 1
+ gpgcheck: 0
+ priority: 2
+
+packages:
+ - '@core'
+ - '@base'
+ - yum-plugin-priorities
+ - yum-plugin-fastestmirror
+ - redhat-lsb
+ - sysstat
+ - gdb
+ - git-all
+ - python-configobj
+ # for running ceph
+ - libedit
+ - openssl098e
+ - boost-thread
+ - xfsprogs
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse
+ - fuse-libs
+ ###
+ - openssl
+ - libuuid
+ - btrfs-progs
+ # used by workunits
+ - attr
+ - valgrind
+ - python-nose
+ - mpich2
+ - ant
+ - fsstress
+ - iozone
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext
+ - libuuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ ###
+ # for blktrace and seekwatcher
+ - blktrace
+ - numpy
+ - python-matplotlib
+ ###
+ # for qemu
+ - qemu-kvm
+ - usbredir
+ - genisoimage
+ ###
+ # for apache and rgw
+ - httpd
+ - httpd-devel
+ - httpd-tools
+ - mod_ssl
+ - mod_fastcgi-2.4.7-1.ceph.el6
+ ### for swift and s3-tests
+ - libev-devel
+ # for pretty-printing xml
+ - perl-XML-Twig
+ # for java bindings, hadoop, etc.
+ - java-1.6.0-openjdk-devel
+ - junit4
+ # for nfs
+ - nfs-utils
+
+
+epel_packages:
+ # for running ceph
+ - cryptopp-devel
+ - cryptopp
+ - fcgi
+ # used by workunits
+ - dbench
+ - fuse-sshfs
+ - bonnie++
+ # for json_xs to investigate JSON by hand
+ - perl-JSON-XS
+ # for ceph-deploy
+ - python-virtualenv
+ # for setting BIOS settings
+ - smbios-utils
+
+nfs_service: nfs
--- /dev/null
+---
+# vars specific to any rhel 7.x version
+
+common_yum_repos:
+ rhel-7-fcgi-ceph:
+ name: "RHEL 7 Local fastcgi Repo"
+ baseurl: "http://{{ gitbuilder_host }}/mod_fastcgi-rpm-rhel7-x86_64-basic/ref/master/"
+ enabled: 1
+ gpgcheck: 0
+ lab-extras:
+ name: "lab-extras"
+ baseurl: "http://{{ mirror_host }}/lab-extras/rhel7/"
+ enabled: 1
+ gpgcheck: 0
+
+packages:
+ - '@core'
+ - '@base'
+ - yum-plugin-priorities
+ - yum-plugin-fastestmirror
+ - redhat-lsb
+ - sysstat
+ - gdb
+ - git-all
+ - python-configobj
+ - libedit
+ - openssl098e
+ - boost-thread
+ - xfsprogs
+ - xfsprogs-devel
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse
+ - fuse-libs
+ - openssl
+ - libuuid
+ - btrfs-progs
+ - attr
+ - valgrind
+ - python-nose
+ - mpich
+ - ant
+ - lsof
+ - iozone
+ - libtool
+ - automake
+ - gettext
+ - libuuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ - blktrace
+ - numpy
+ - python-matplotlib
+ - qemu-kvm
+ - usbredir
+ - genisoimage
+ - httpd
+ - httpd-devel
+ - httpd-tools
+ - mod_ssl
+ - mod_fastcgi-2.4.7-1.ceph.el7
+ - perl-XML-Twig
+ - java-1.6.0-openjdk-devel
+ - junit4
+ - nfs-utils
+ # for xfstests
+ - ncurses-devel
+ # for s3 tests
+ - python-devel
+ - python-virtualenv
+ - perl-CPAN
+ - python3
+
+epel_packages:
+ - cryptopp-devel
+ - cryptopp
+ - dbench
+ - fcgi
+ - fuse-sshfs
+ - perl-JSON-XS
+ - leveldb
+ - xmlstarlet
+
+nfs_service: nfs-server
--- /dev/null
+---
+# vars specific to any rhel 7.x version
+
+common_yum_repos:
+ rhel-7-fcgi-ceph:
+ name: "RHEL 7 Local fastcgi Repo"
+ baseurl: "http://{{ gitbuilder_host }}/mod_fastcgi-rpm-rhel7-x86_64-basic/ref/master/"
+ enabled: 1
+ gpgcheck: 0
+ lab-extras:
+ name: "lab-extras"
+ baseurl: "http://{{ mirror_host }}/lab-extras/rhel7/"
+ enabled: 1
+ gpgcheck: 0
+
+packages:
+ - '@core'
+ - '@base'
+ - yum-plugin-priorities
+ - yum-plugin-fastestmirror
+ - redhat-lsb
+ - sysstat
+ - gdb
+ - git-all
+ - python-configobj
+ - gcc-c++
+ - libedit
+ - openssl098e
+ - boost-thread
+ - xfsprogs
+ - xfsprogs-devel
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse
+ - fuse-libs
+ - lvm2
+ - openssl
+ - libuuid
+ - btrfs-progs
+ - attr
+ - valgrind
+ - python-nose
+ - mpich
+ - ant
+ - lsof
+ - iozone
+ - libtool
+ - automake
+ - gettext
+ - libuuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ - blktrace
+ - numpy
+ - python-matplotlib
+ - qemu-kvm
+ - usbredir
+ - genisoimage
+ - httpd
+ - httpd-devel
+ - httpd-tools
+ - mod_ssl
+ - mod_fastcgi-2.4.7-1.ceph.el7
+ - perl-XML-Twig
+ - java-1.6.0-openjdk-devel
+ - junit4
+ - nfs-utils
+ # for xfstests
+ - ncurses-devel
+ # for s3 tests
+ - python-devel
+ - python-virtualenv
+ - perl-CPAN
+ - python3
+
+epel_packages:
+ - cryptopp-devel
+ - cryptopp
+ - dbench
+ - fcgi
+ - fuse-sshfs
+ - perl-JSON-XS
+ - leveldb
+ - xmlstarlet
+
+nfs_service: nfs-server
--- /dev/null
+---
+# vars specific to any rhel 7.x version
+
+common_yum_repos:
+ rhel-7-fcgi-ceph:
+ name: "RHEL 7 Local fastcgi Repo"
+ baseurl: "http://{{ gitbuilder_host }}/mod_fastcgi-rpm-rhel7-x86_64-basic/ref/master/"
+ enabled: 1
+ gpgcheck: 0
+ lab-extras:
+ name: "lab-extras"
+ baseurl: "http://{{ mirror_host }}/lab-extras/rhel7/"
+ enabled: 1
+ gpgcheck: 0
+
+packages:
+ - '@core'
+ - '@base'
+ - yum-plugin-priorities
+ - yum-plugin-fastestmirror
+ - redhat-lsb
+ - sysstat
+ - gdb
+ - git-all
+ - python-configobj
+ - gcc-c++
+ - libedit
+ - openssl098e
+ - boost-thread
+ - xfsprogs
+ - xfsprogs-devel
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse
+ - fuse-libs
+ - lvm2
+ - openssl
+ - libuuid
+ - btrfs-progs
+ - attr
+ - valgrind
+ - python-nose
+ - mpich
+ - ant
+ - lsof
+ - iozone
+ - libtool
+ - automake
+ - gettext
+ - libuuid-devel
+ - libacl-devel
+ - bc
+ - xfsdump
+ - blktrace
+ - numpy
+ - python-matplotlib
+ - qemu-kvm
+ - usbredir
+ - genisoimage
+ - httpd
+ - httpd-devel
+ - httpd-tools
+ - mod_ssl
+ - mod_fastcgi-2.4.7-1.ceph.el7
+ - libev-devel
+ - perl-XML-Twig
+ - java-1.6.0-openjdk-devel
+ - junit4
+ - nfs-utils
+ # for xfstests
+ - ncurses-devel
+ # for s3 tests
+ - python-devel
+ - python-virtualenv
+ - perl-CPAN
+ - podman
+ - python3
+
+epel_packages:
+ - cryptopp-devel
+ - cryptopp
+ - dbench
+ - fcgi
+ - fuse-sshfs
+ - perl-JSON-XS
+ - leveldb
+ - xmlstarlet
+
+nfs_service: nfs-server
--- /dev/null
+---
+# vars specific to any rhel 8.x version
+
+common_yum_repos:
+ lab-extras:
+ name: "lab-extras"
+ baseurl: "http://{{ mirror_host }}/lab-extras/8/"
+ enabled: 1
+ gpgcheck: 0
+
+copr_repos:
+ - ceph/python3-asyncssh
+
+packages:
+ # for package-cleanup
+ - dnf-utils
+ - git-all
+ - sysstat
+ - libedit
+ - boost-thread
+ - xfsprogs
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse-libs
+ - openssl
+ - libuuid
+ - attr
+ - ant
+ - lsof
+ - gettext
+ - bc
+ - xfsdump
+ - blktrace
+ - usbredir
+ - podman
+ - redhat-lsb
+ - firewalld
+ - wget
+ - libev-devel
+ - valgrind
+ - nfs-utils
+ # for xfstests
+ - ncurses-devel
+ # for s3 tests
+ # for workunits,
+ - gcc
+ - git
+ - make
+ # qa/workunits/rados/test_python.sh
+ - python3-nose
+ # for cram tests
+ - python3-virtualenv
+ # for rbd qemu tests
+ - genisoimage
+ - qemu-img
+ - qemu-kvm-core
+ - qemu-kvm-block-rbd
+ # for pjd tests
+ - libacl-devel
+ # for fs tests,
+ - autoconf
+ # for test-crash.sh
+ - gdb
+ - iozone
+ # cephadm
+ - lvm2
+
+epel_packages:
+ - dbench
+
+nfs_service: nfs-server
+
+ntp_service_name: chronyd
--- /dev/null
+---
+# Packages that are in RHEL8 but not in RHEL9:
+# redhat-lsb libev-devel python3-nose python3-virtualenv iozone
+# Packages that needed to be added to this vars file:
+# python3-pip
+
+packages:
+ - dnf-utils
+ - git-all
+ - sysstat
+ - libedit
+ - boost-thread
+ - xfsprogs
+ - gdisk
+ - parted
+ - libgcrypt
+ - fuse-libs
+ - openssl
+ - libuuid
+ - attr
+ - ant
+ - lsof
+ - gettext
+ - bc
+ - xfsdump
+ - blktrace
+ - usbredir
+ - podman
+ - firewalld
+ - wget
+ - valgrind
+ - nfs-utils
+ - ncurses-devel
+ - gcc
+ - git
+ - make
+ - genisoimage
+ - qemu-img
+ - qemu-kvm-core
+ - qemu-kvm-block-rbd
+ - libacl-devel
+ - autoconf
+ - gdb
+ - lvm2
+ - python3-pip
+ - NetworkManager-initscripts-updown
+
+ceph_dependency_packages_to_remove:
+ - boost-random
+ - boost-program-options
+ - leveldb
+  - xmlstarlet
+ - hdparm
+
+epel_packages:
+ - dbench
+
+nfs_service: nfs-server
+
+ntp_service_name: chronyd
+
+configure_abrt: false
--- /dev/null
+---
+common_packages:
+ # for apache
+ - libfcgi0ldbl
+ ###
+ # for s3 tests
+ - libev-dev
+ ###
+ # for cpan
+ - perl
+ - libwww-perl
+ ###
+ - lsb-release
+ - build-essential
+ - sysstat
+ - gdb
+ # for running ceph
+ - libedit2
+ - cryptsetup-bin
+ - xfsprogs
+ - gdisk
+ - parted
+ ###
+ # for setting BIOS settings
+ ###
+ - libuuid1
+ # for compiling helpers and such
+ - libatomic-ops-dev
+ ###
+ # used by workunits
+ - git-core
+ - attr
+ - dbench
+ - bonnie++
+ - valgrind
+ - ant
+ ###
+ # used by the xfstests tasks
+ - libtool
+ - automake
+ - gettext
+ - uuid-dev
+ - libacl1-dev
+ - bc
+ - xfsdump
+ - xfslibs-dev
+ - libattr1-dev
+ - quota
+ - libcap2-bin
+ - libncurses5-dev
+ - lvm2
+ ###
+ - vim
+ - pdsh
+ # for blktrace and seekwatcher
+ - blktrace
+ ###
+ # qemu
+ - genisoimage
+ ###
+ # for json_xs to investigate JSON by hand
+ - libjson-xs-perl
+ # for pretty-printing xml
+ - xml-twig-tools
+ # for java bindings, hadoop, etc.
+ - default-jdk
+ - junit4
+ ###
+ # for samba testing
+ - cifs-utils
+ # for Static IP
+ - ipcalc
+ # nfs
+ - nfs-common
+ - nfs-kernel-server
+ # for add-apt-repository
+ - software-properties-common
+ # for https://twitter.com/letsencrypt/status/1443621997288767491
+ - libgnutls30
+
+non_aarch64_common_packages:
+ - smbios-utils
+ - libfcgi
+ - sysprof
+
+packages_to_upgrade:
+ - apt
+ - apache2
+
+non_aarch64_packages_to_upgrade:
+ - libapache2-mod-fastcgi
+
+no_recommended_packages:
+ - collectl
--- /dev/null
+---
+packages:
+ - libgoogle-perftools0
+ - libboost-thread1.46.1
+ - ltp-kernel-test
+ - libmpich2-3
+ - kvm
+ ###
+ # for setting BIOS settings
+ ###
+ - libcrypto++9
+ ###
+ # used by workunits
+ - mpich2
+ - libmpich2-dev
+ - python-dev
+
+non_aarch64_packages:
+ - iozone3
+ - dmapi
+ - libssl0.9.8
+
+# on precise rpcbind does not provide a way to
+# be managed with upstart
+start_rpcbind: false
--- /dev/null
+---
+apt_repos:
+ # mod_fastcgi for radosgw
+ - "deb http://gitbuilder.ceph.com/libapache-mod-fastcgi-deb-trusty-x86_64-basic/ref/master/ trusty main"
+
+packages:
+ - libboost-thread1.54.0
+ - mpich
+ - qemu-system-x86
+# - blkin
+ - lttng-tools
+ ###
+ # for setting BIOS settings
+ ###
+ - libcrypto++9
+ ###
+ # used by workunits
+ - mpich2
+ - libmpich2-dev
+ - python-dev
+
+non_aarch64_packages:
+ - libgoogle-perftools4
+ - iozone3
+ - dmapi
+ - libssl0.9.8
--- /dev/null
+---
+apt_repos: []
+
+packages:
+ - libgoogle-perftools4
+# FIXME: not available on vivid, figure out what's available and use it
+# - libboost-thread1.54.0
+ - mpich
+ - qemu-system-x86
+# FIXME: not available on vivid, figure out what's available and use it
+# - blkin
+ - lttng-tools
+ ###
+ # for setting BIOS settings
+ ###
+ - libcrypto++9
+ ###
+ # used by workunits
+ - mpich2
+ - libmpich2-dev
+ - python-dev
+
+non_aarch64_packages:
+ - iozone3
+ - dmapi
+ - libssl0.9.8
--- /dev/null
+---
+apt_repos:
+ # http://tracker.ceph.com/issues/18126
+ - "deb [trusted=yes] https://chacra.ceph.com/r/valgrind/latest/HEAD/ubuntu/xenial/flavors/default/ xenial main"
+
+packages:
+ - libboost-thread1.58.0
+ - mpich
+ - qemu-system-x86
+ - python-virtualenv
+ - python-configobj
+ - python-gevent
+ - python-numpy
+ - python-matplotlib
+ - python-nose
+ - btrfs-tools
+# - blkin
+ - lttng-tools
+ ###
+ # for setting BIOS settings
+ ###
+ - libcrypto++9v5
+ ###
+ # for building xfstests #18067
+ - libtool-bin
+ - python-dev
+
+packages_to_upgrade:
+ # http://tracker.ceph.com/issues/13522#note-51
+ - libgoogle-perftools4
+ # http://tracker.ceph.com/issues/18126#note-11
+ - valgrind
+
+non_aarch64_packages:
+ - libgoogle-perftools4
+ - iozone3
+ - libssl1.0.0
--- /dev/null
+---
+packages:
+ - mpich
+ - qemu-system-x86
+ - python-virtualenv
+ - python-configobj
+ - python-gevent
+ - python-numpy
+ - python-matplotlib
+ - python-nose
+ - btrfs-tools
+# - blkin
+ - lttng-tools
+ # for building xfstests #18067
+ - libtool-bin
+ # for ceph-daemon (no podman on ubuntu/debian, yet)
+ - docker.io
+ # qa/workunits/rbd/test_librbd_python.sh
+ - python3-nose
+ - python-dev
+
+non_aarch64_packages:
+ - libgoogle-perftools4
+ - iozone3
+ - libssl1.0.0
+
+non_aarch64_packages_to_upgrade: []
--- /dev/null
+---
+packages:
+ - mpich
+ - qemu-system-x86
+# - blkin
+ - lttng-tools
+ # for building xfstests #18067
+ - libtool-bin
+ # for ceph-daemon (no podman on ubuntu/debian, yet)
+ - docker.io
+ # qa/workunits/rbd/test_librbd_python.sh
+ - python3-nose
+ # python3 version of deps
+ - python3-venv
+ - python3-virtualenv
+ - python3-configobj
+ - python3-gevent
+ - python3-numpy
+ - python3-matplotlib
+ - python3-setuptools
+ - python-dev
+
+non_aarch64_packages:
+ - libgoogle-perftools4
+ - iozone3
+
+non_aarch64_packages_to_upgrade: []
--- /dev/null
+---
+packages:
+ - mpich
+ - qemu-system-x86
+# - blkin
+ - lttng-tools
+ # for building xfstests #18067
+ - libtool-bin
+ # for ceph-daemon (no podman on ubuntu/debian, yet)
+ - docker.io
+ # qa/workunits/rbd/test_librbd_python.sh
+ - python3-nose
+ # python3 version of deps
+ - python3-venv
+ - python3-virtualenv
+ - python3-configobj
+ - python3-gevent
+ - python3-numpy
+ - python3-matplotlib
+ - python3-setuptools
+ - python3-dev
+
+non_aarch64_packages:
+ - libgoogle-perftools4
+ - iozone3
+
+non_aarch64_packages_to_upgrade: []
+
+python_apt_package_name: python3-apt
--- /dev/null
+---
+ntp_service_name: ntpd
+ssh_service_name: sshd
+
+packages_to_remove:
+ # multipath interferes with krbd tests
+ - device-mapper-multipath
+ # tgt interferes with ceph-iscsi tests
+ - scsi-target-utils
+
+# ceph packages that we ensure do not exist
+ceph_packages_to_remove:
+ - ceph
+ - ceph-base
+ - ceph-selinux
+ - ceph-common
+ - ceph-debuginfo
+ - ceph-release
+ - libcephfs1
+ - ceph-radosgw
+ - python-ceph
+ - python-rados
+ - python-rbd
+ - python-cephfs
+ - librbd1
+ - librados2
+ - mod_fastcgi
+
+ceph_dependency_packages_to_remove:
+ - boost-random
+ - boost-program-options
+ - leveldb
+ - xmlstarlet
+ - python-jinja2
+ - python-ceph
+ - python-flask
+  - python-requests
+ - python-urllib3
+ - python-babel
+ - hdparm
+ - python-markupsafe
+ - python-werkzeug
+ - python-itsdangerous
+
+pip_packages_to_install:
+ - remoto>=0.0.35
+
+# This gets defined to "-stream" in roles/testnode/tasks/yum_systems.yml when CentOS Stream is the OS.
+# It adds "-stream" to yum repo mirrorlist URLs.
+dash_stream: ""
--- /dev/null
+---
+ntp_service_name: chronyd
+ssh_service_name: sshd
+nrpe_service_name: nrpe
+nrpe_user: nrpe
+nrpe_group: nrpe
+nagios_plugins_directory: /usr/lib64/nagios/plugins
+
+packages_to_remove:
+ # multipath interferes with krbd tests
+ - multipath-tools
+ # tgt interferes with ceph-iscsi tests
+ - tgt
+
+# ceph packages that we ensure do not exist
+ceph_packages_to_remove:
+ - ceph
+ - ceph-base
+ - ceph-selinux
+ - ceph-common
+ - ceph-debuginfo
+ - ceph-release
+ - libcephfs1
+ - ceph-radosgw
+ - python-ceph
+ - python-rados
+ - python-rbd
+ - python-cephfs
+ - librbd1
+ - librados2
+ - mod_fastcgi
+ - iozone
+
+ceph_dependency_packages_to_remove:
+ - boost-random
+ - boost-program-options
+ - leveldb
+ - xmlstarlet
+ - python-jinja2
+ - python-ceph
+ - python-flask
+  - python-requests
+ - python-urllib3
+ - python-babel
+ - hdparm
+ - python-markupsafe
+ - python-werkzeug
+ - python-itsdangerous
--- /dev/null
+Teuthology
+==========
+
+This role is used to manage the main teuthology node in a lab, e.g.
+``teuthology.front.sepia.ceph.com``.
+
+It only depends on the ``common`` role.
+
+It also does the following:
+
+- Install dependencies required for ``teuthology``
+- Create the ``teuthology`` and ``teuthworker`` users which are used for
+ scheduling and executing tests, respectively
+- Clone ``teuthology`` repos into ``~/src/teuthology_main`` under those user accounts
+- Run ``teuthology``'s ``bootstrap`` script
+- Manages user accounts and sudo privileges using the ``test_admins`` group_var in the secrets repo
+- Includes a script to keep the ``teuthology`` user's crontab up to date with remote version-controlled versions (``--tags="crontab"``)
+
+It currently does NOT do these things:
+
+- Manage ``teuthology-worker`` processes
+- Run ``teuthology-nuke --stale``
--- /dev/null
+---
+# Account used to schedule test runs.
+teuthology_scheduler_user: teuthology
+# Account used to execute the scheduled test runs.
+teuthology_execution_user: teuthworker
+
+teuthology_users:
+ # for scheduling tests
+ - "{{ teuthology_scheduler_user }}"
+ # for executing tests
+ - "{{ teuthology_execution_user }}"
+
+# Repo and branch cloned into each user's ~/src/teuthology_main.
+teuthology_repo: https://github.com/ceph/teuthology.git
+teuthology_branch: "main"
+# Extra raw YAML appended verbatim to /etc/teuthology.yaml (see template).
+teuthology_yaml_extra: ""
+teuthology_ceph_git_base_url: "git://git.ceph.com/"
+# Where the execution user stores test run logs; served over HTTP by nginx.
+archive_base: "/home/{{ teuthology_execution_user }}/archive"
+
+# Version-controlled crontab pulled by the update-crontab.sh script.
+remote_crontab_url: "https://raw.githubusercontent.com/ceph/ceph/main/qa/crontab/teuthology-cronjobs"
--- /dev/null
+---
+# The users role must run first so managed accounts exist before this role.
+dependencies:
+ - role: users
--- /dev/null
+---
+# Debian/Ubuntu-specific package setup for the teuthology node.
+- name: Include package type specific vars.
+  include_vars: "apt_systems.yml"
+  tags:
+    - always
+
+- name: Install packages via apt
+  apt:
+    name: "{{ teuthology_extra_packages|list }}"
+    state: latest
+    # Canonical boolean ("true") instead of YAML 1.1 truthy "yes".
+    update_cache: true
+    # Skip the cache refresh if it was updated within the last 10 minutes.
+    cache_valid_time: 600
+  tags:
+    - packages
--- /dev/null
+---
+- import_tasks: zypper_systems.yml
+  when: ansible_pkg_mgr == "zypper"
+
+- import_tasks: apt_systems.yml
+  when: ansible_pkg_mgr == "apt"
+
+# Yum systems support is not implemented yet.
+- import_tasks: yum_systems.yml
+  when: ansible_pkg_mgr == "yum"
+
+# Set up the different users that teuthology uses
+- import_tasks: setup_users.yml
+
+- name: Ship /etc/teuthology.yaml
+  template:
+    src: teuthology.yaml
+    dest: /etc/teuthology.yaml
+    # Mode quoted so YAML cannot mis-parse the octal literal.
+    # NOTE(review): 0755 (executable) is unusual for a config file; 0644 is
+    # probably intended — confirm before changing the permission itself.
+    mode: "0755"
+  tags:
+    - config
+
+- name: Ship /etc/init.d/teuthology-worker
+  template:
+    src: teuthology-worker.init
+    dest: /etc/init.d/teuthology-worker
+    mode: "0755"
+  tags:
+    - config
+
+- name: Ensure scheduler user binary directory exists
+  file:
+    state: directory
+    owner: "{{ teuthology_scheduler_user }}"
+    group: "{{ teuthology_scheduler_user }}"
+    path: "/home/{{ teuthology_scheduler_user }}/bin"
+    mode: "0755"
+  tags:
+    - crontab
+
+- name: Ship teuthology user's crontab update script
+  template:
+    src: update-crontab.sh
+    dest: "/home/{{ teuthology_scheduler_user }}/bin/update-crontab.sh"
+    mode: "0775"
+    owner: "{{ teuthology_scheduler_user }}"
+    group: "{{ teuthology_scheduler_user }}"
+  tags:
+    - crontab
+
+# Serve logs over HTTP
+- import_tasks: setup_log_access.yml
+  tags:
+    - logs
+
+# beanstalkd is the job queue teuthology workers consume from.
+- name: Enable and start beanstalkd
+  service:
+    name: beanstalkd
+    state: started
+    enabled: true
--- /dev/null
+---
+# Replace any default web server config with an nginx vhost that serves
+# the teuthology log archive (see templates/nginx.conf).
+- name: Disable default nginx config
+ file:
+ name: /etc/nginx/sites-enabled/default
+ state: absent
+
+- name: Ship nginx config
+ template:
+ src: nginx.conf
+ dest: "{{ nginx_available }}/test_logs.conf"
+
+- name: Enable nginx config
+ file:
+ src: "{{ nginx_available }}/test_logs.conf"
+ dest: "{{ nginx_enabled }}/test_logs.conf"
+ state: link
+
+# Ignore errors in case service doesn't exist
+- name: Disable apache httpd
+ service:
+ name: "{{ apache_service }}"
+ enabled: no
+ state: stopped
+ ignore_errors: true
+
+# NOTE(review): "reloaded" reloads nginx on every run, and changed_when:
+# false hides that from the play recap — intentional-looking but confirm.
+- name: Enable nginx
+ service:
+ name: nginx
+ enabled: yes
+ state: reloaded
+ changed_when: false
--- /dev/null
+---
+# Create one group per teuthology user (scheduler + execution accounts).
+- name: Create group
+ group:
+ name: "{{ item }}"
+ state: present
+ with_items: "{{ teuthology_users }}"
+ tags:
+ - user
+
+- name: Create users
+ user:
+ name: "{{ item }}"
+ state: present
+ shell: /bin/bash
+ with_items: "{{ teuthology_users }}"
+ tags:
+ - user
+
+# test-admins group gets sudo rights to /bin/kill pids (used by teuthology-kill)
+- name: Create test-admins group
+ group:
+ name: test-admins
+ state: present
+ tags:
+ - user
+
+# test_admins comes from the (secret) inventory; skip cleanly when unset.
+- name: Add test_admins to test-admins group
+ user:
+ name: "{{ item }}"
+ groups: test-admins
+ append: yes
+ with_items: "{{ test_admins }}"
+ tags:
+ - user
+ when: test_admins is defined and test_admins|length > 0
+
+# validate runs visudo on the candidate file so a bad line can never be
+# installed and lock everyone out of sudo.
+- name: Grant test-admins sudo access to /bin/kill
+ lineinfile:
+ dest: /etc/sudoers.d/cephlab_sudo
+ regexp: "^%test-admins"
+ line: "%test-admins ALL=NOPASSWD: /bin/kill, /usr/bin/kill"
+ state: present
+ validate: visudo -cf %s
+ tags:
+ - user
+
+# Extract the PR number from a branch of the form "origin/pr/<N>/...".
+# If the branch doesn't match, regex_replace returns the branch unchanged,
+# so teuthology_ghpr == teuthology_branch signals "not a PR".
+- name: Determine teuthology GitHub PR
+ set_fact:
+ teuthology_ghpr: "{{ teuthology_branch | regex_replace( '^origin/pr/([^/]+)/.*$', '\\1') }}"
+
+# PR checkout needs an explicit refspec, since GitHub PR refs are not
+# fetched by default.
+- name: Clone the teuthology repo for GitHub PR
+ git:
+ repo: "https://github.com/ceph/teuthology"
+ dest: /home/{{ item }}/src/teuthology_main
+ version: "{{ teuthology_branch }}"
+ refspec: '+refs/pull/{{ teuthology_ghpr }}/*:refs/origin/pr/{{ teuthology_ghpr }}/*'
+ become_user: "{{ item }}"
+ with_items: "{{ teuthology_users }}"
+ tags:
+ - repos
+ when: teuthology_ghpr is defined and teuthology_ghpr != teuthology_branch
+
+# Ordinary (non-PR) clone of the configured repo/branch.
+- name: Clone the teuthology repo
+ git:
+ repo: "{{ teuthology_repo }}"
+ dest: /home/{{ item }}/src/teuthology_main
+ version: "{{ teuthology_branch }}"
+ become_user: "{{ item }}"
+ with_items: "{{ teuthology_users }}"
+ tags:
+ - repos
+ when: teuthology_ghpr is not defined or teuthology_ghpr == teuthology_branch
+
+# NOTE(review): the changed_when heuristic assumes bootstrap's last output
+# line is long (>60 chars) only when it did real work — fragile, and it
+# will raise if stdout is empty; confirm before relying on it.
+- name: Run bootstrap
+ shell: NO_CLOBBER=true ./bootstrap
+ args:
+ chdir: /home/{{ item }}/src/teuthology_main/
+ become_user: "{{ item }}"
+ with_items: "{{ teuthology_users }}"
+ register: bootstrap
+ changed_when: bootstrap.stdout_lines[-1]|length > 60
+ tags:
+ - repos
+
+- name: Add teuthology scripts to PATH
+ lineinfile:
+ dest: /home/{{ item }}/.profile
+ regexp: teuthology_main
+ line: 'PATH="$HOME/src/teuthology_main/virtualenv/bin:$PATH"'
+ become_user: "{{ item }}"
+ with_items: "{{ teuthology_users }}"
+
+# Smoke test: the virtualenv's teuthology entry point must run.
+- name: Ensure teuthology is usable
+ shell: "./teuthology --version"
+ args:
+ chdir: /home/{{ item }}/src/teuthology_main/virtualenv/bin/
+ become_user: "{{ item }}"
+ with_items: "{{ teuthology_users }}"
+ changed_when: false
+
+- name: Ensure archive directory exists
+ shell: "mkdir -p {{ archive_base }}/worker_logs"
+ become_user: "{{ teuthology_execution_user }}"
+ tags:
+ - logs
--- /dev/null
+---
+# Placeholder: RPM/yum-based hosts are not supported by this role yet.
+- fail:
+ msg: "yum systems are not supported at this time"
--- /dev/null
+---
+# SUSE-specific package setup; vars file is picked per distro + version,
+# e.g. zypper_opensuse_leap_15.2.yml.
+- name: Include package type specific vars.
+  include_vars: "zypper_{{ ansible_distribution | lower | replace(' ', '_') }}_{{ ansible_distribution_version }}.yml"
+  tags:
+    - always
+
+- name: Install packages via zypper
+  zypper:
+    name: "{{ teuthology_extra_packages|list }}"
+    state: latest
+    # Canonical boolean ("true") instead of YAML 1.1 truthy "yes".
+    update_cache: true
+  tags:
+    - packages
--- /dev/null
+# {{ ansible_managed }}
+# Serves the teuthology log archive with aggressive gzip settings.
+server {
+ gzip on;
+ gzip_types *;
+ gzip_comp_level 9;
+ gzip_proxied any;
+ gzip_vary on;
+ gzip_static on;
+ allow all;
+ autoindex on;
+ server_name {{ inventory_hostname }};
+ location /teuthology {
+ alias {{ archive_base }};
+ # Prevents Chromium from thinking certain text files are binary,
+ # e.g. console logs while reimaging is underway
+ add_header X-Content-Type-Options nosniff;
+ }
+ # NOTE(review): redefining "types" at server level replaces the inherited
+ # mime.types map entirely; everything else falls back to default_type.
+ # That appears intentional (serve logs/yaml as text) — confirm.
+ types {
+ text/plain log;
+ text/plain yaml yml;
+ }
+}
--- /dev/null
+#!/bin/bash
+#
+# Copyright (c) 2015 Red Hat, Inc.
+#
+# Author: Loic Dachary <loic@dachary.org>
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+# THE SOFTWARE.
+#
+### BEGIN INIT INFO
+# Provides: teuthology
+# Required-Start: $network $remote_fs $syslog beanstalkd nginx
+# Required-Stop: $network $remote_fs $syslog
+# Default-Start: 2 3 4 5
+# Default-Stop:
+# Short-Description: Start teuthology
+### END INIT INFO
+
+# Number of worker processes to manage; override via /etc/default/teuthology.
+export NWORKERS=20
+
+[ -f /etc/default/teuthology ] && source /etc/default/teuthology
+
+# Worker processes run as this user (default comes from ansible vars).
+user=${TEUTHOLOGY_USERNAME:-"{{ teuthology_execution_user }}"}
+export HOME=/home/$user
+export WORKER_HOME=$HOME/src/teuthology_main
+#/usr/share/nginx/html
+export WORKER_ARCH=$HOME/archive
+
+[ -d $WORKER_ARCH ] || sudo -u $user mkdir -p $WORKER_ARCH
+
+# Path of the pidfile for worker number $1.
+function worker_pidfile() {
+ echo /var/run/teuthology-worker.$1.pid
+}
+# Path of the logfile for worker number $1.
+function worker_logfile() {
+ echo /var/log/teuthology.${1}.log
+}
+
+# Stop worker number $1: kill its child processes, then the worker itself,
+# then remove its pidfile. No-op if the pidfile does not exist.
+function stop_worker() {
+    wnum=$1
+    wpidfile=$(worker_pidfile $wnum)
+    if [[ -f $wpidfile ]] ; then
+        wpid=$(cat $wpidfile)
+        echo Killing worker $wnum with pid=$wpid...
+        # Kill the worker's children first...
+        pkill -P $wpid
+        # ...then the worker itself. Bugfix: this used "pkill $wpid", but
+        # pkill matches by process *name*, so the parent was never killed;
+        # kill(1) is the correct tool for a numeric pid.
+        kill $wpid
+        rm -f $wpidfile
+    fi
+}
+
+# Stop all NWORKERS workers.
+function stop_workers() {
+ for i in $(seq 1 $NWORKERS) ; do
+ stop_worker $i
+ done
+}
+
+# Launch a single teuthology-worker as $user via su, recording its pid in
+# $2 and redirecting all output to $1.
+function start_worker() {
+ local wlogfile=$1
+ local wpidfile=$2
+ local worklogs=/tmp/$user-logs
+ mkdir -p $worklogs && chown $user: $worklogs
+ su - -c "
+cd /home/$user
+source openrc.sh
+cd $WORKER_HOME
+export LC_ALL=C
+virtualenv/bin/teuthology-worker --tube openstack -l $worklogs --archive-dir $WORKER_ARCH
+" $user > $wlogfile 2>&1 & {
+ echo $! > $wpidfile
+ echo "Started worker with pid=$! see log $wlogfile"
+ }
+}
+
+# Recursively kill a process tree, children first, with SIGKILL.
+function rkill() {
+ local pid=$1
+ for i in $(pgrep -P $pid) ; do
+ rkill $i
+ done
+ echo Killing process $pid
+ kill -9 $pid
+}
+
+# Kill the process tree recorded in pidfile $1 and remove the pidfile once
+# the process is confirmed gone.
+function stop_process() {
+ local pidfile=$1
+ [[ -f $pidfile ]] && {
+ local pid=$(cat $pidfile)
+ rkill $pid
+ # NOTE(review): "2>&1 > /dev/null" sends stderr to the tty, not to
+ # /dev/null ("> /dev/null 2>&1" was probably intended) — confirm.
+ ps --no-headers $pid 2>&1 > /dev/null || rm $pidfile
+ }
+}
+
+# Start any of the NWORKERS workers that are not already running, skipping
+# those whose pidfile points at a live process.
+function start_workers() {
+ for i in $(seq 1 $NWORKERS) ; do
+ local wpidfile=$(worker_pidfile $i)
+ local wlogfile=$(worker_logfile $i)
+ [[ -f $wpidfile ]] && {
+ local wpid=$(cat $wpidfile)
+ ps --no-headers -p $wpid 2>&1 > /dev/null && {
+ echo Worker $i is already running with process $wpid
+ continue
+ }
+ }
+ start_worker $wlogfile $wpidfile
+ done
+}
+echo $1
+# Dispatch on the init subcommand.
+case $1 in
+    start-workers)
+        start_workers
+        ;;
+    list-workers)
+        # Report each worker pidfile with its pid and alive/dead status.
+        for i in $(ls /var/run | grep teuthology-worker | sort) ; do
+            WPID=$(cat /var/run/$i)
+            WORKER=${i##teuthology-worker.}
+            WORKER=${WORKER%%.pid}
+            STATUS=$(ps --no-headers -p $WPID 2>&1 > /dev/null && echo running || echo dead)
+            echo $WORKER PID:$WPID STATUS:$STATUS
+        done
+        ;;
+    stop-workers)
+        echo Stopping workers
+        stop_workers
+        ;;
+    stop-worker)
+        stop_worker $2
+        ;;
+    restart-workers)
+        $0 stop-workers
+        # Bugfix: this previously ran "$1 start-workers", i.e. tried to
+        # execute the literal word "restart-workers"; the script itself
+        # ($0) must be re-invoked.
+        $0 start-workers
+        ;;
+    start)
+        # Unlock any targets still held by a previous run, then start workers.
+        (
+            cd /home/$user
+            source openrc.sh
+            cd teuthology
+            . virtualenv/bin/activate
+            teuthology-lock --list-targets --owner scheduled_$user@teuthology > /tmp/t
+            if test -s /tmp/t && ! grep -qq 'targets: {}' /tmp/t ; then
+                teuthology-lock --unlock -t /tmp/t --owner scheduled_$user@teuthology
+            fi
+            start_workers
+        )
+        ;;
+    stop)
+        $0 stop-workers
+        ;;
+    restart)
+        $0 stop
+        $0 start
+        ;;
+    *)
+esac
--- /dev/null
+# {{ ansible_managed }}
+# Rendered to /etc/teuthology.yaml; consumed by all teuthology commands.
+lock_server: {{ paddles_address }}
+results_server: {{ paddles_address }}
+results_ui_server: {{ pulpito_address }}
+results_email: {{ teuthology_results_email|default('null') }}
+results_sending_email: {{ teuthology_results_sending_email|default('null') }}
+lab_domain: {{ lab_domain|default('teuthology') }}
+default_machine_type: {{ teuthology_default_machine_type|default('null') }}
+max_job_time: {{ teuthology_max_job_time|default(129600) }}
+{{ teuthology_yaml_extra }}
+# Not yet configurable via ansible
+archive_server: http://{{ inventory_hostname }}/
+archive_base: {{ archive_base }}
+ceph_git_base_url: {{ teuthology_ceph_git_base_url }}
+queue_host: localhost
+queue_port: 11300
--- /dev/null
+#!/bin/bash
+#
+# {{ ansible_managed }}
+#
+# Script to update teuthology user's crontab for scheduling suite runs.
+# (Bugfix: the shebang was "#/bin/bash" — a plain comment — so the script
+# previously ran under whatever shell happened to invoke it.)
+
+REMOTE_CRONTAB_URL="{{ remote_crontab_url }}"
+TEMP_DIR="$(mktemp -d /tmp/XXXXXXXX)"
+CHKCRONTAB_PATH=~/bin/chkcrontab-venv
+
+# Output remote crontab to temp file
+curl -s -o $TEMP_DIR/new $REMOTE_CRONTAB_URL > /dev/null
+
+# Output existing crontab
+crontab -l > $TEMP_DIR/old
+
+# Check for differences (diff prints them for the log as a side effect)
+diff $TEMP_DIR/old $TEMP_DIR/new
+
+if [ $? -eq 0 ]; then
+    echo "No changes. Exiting."
+    exit 0
+fi
+
+# Install chkcrontab in its own virtualenv if needed
+# https://pypi.python.org/pypi/chkcrontab
+if ! [ -x ${CHKCRONTAB_PATH}/bin/chkcrontab ]; then
+    rm -rf $CHKCRONTAB_PATH
+    mkdir $CHKCRONTAB_PATH
+    virtualenv $CHKCRONTAB_PATH
+    source $CHKCRONTAB_PATH/bin/activate
+    pip install chkcrontab
+else
+    source $CHKCRONTAB_PATH/bin/activate
+fi
+
+# Perform the actual crontab syntax check
+chkcrontab $TEMP_DIR/new
+
+if [ $? -eq 0 ]; then
+    # Install crontab
+    deactivate
+    crontab $TEMP_DIR/new
+    rm -rf $TEMP_DIR
+    echo "Installed new crontab successfully at $(date)"
+else
+    # Leave $TEMP_DIR in place for debugging the failed check.
+    echo "Checking crontab in $TEMP_DIR/new failed"
+    exit 1
+fi
--- /dev/null
+---
+teuthology_extra_packages:
+  # The following packages are requirements for bootstrapping teuthology
+  # (duplicate "libev-dev" entry removed)
+  - git-all
+  - virtualenv
+  - python3-dev
+  - python3-pip
+  - python3-virtualenv
+  - libev-dev
+  - python3-libvirt
+  - beanstalkd
+  - qemu-utils
+  - libvirt-dev
+  # The following packages are requirements for running teuthology
+  - libmysqlclient-dev
+  - libffi-dev
+  - libssl-dev
+  - libyaml-dev
+  # The following are requirements for serving teuthology logs
+  - nginx
+
+apache_service: apache2
+nginx_available: "/etc/nginx/sites-available"
+nginx_enabled: "/etc/nginx/sites-enabled"
--- /dev/null
+---
+# SUSE package list (python2-era distro variant); see zypper_systems.yml.
+teuthology_extra_packages:
+ - beanstalkd
+ - git
+ - gcc
+ - libev-devel
+ - libffi-devel
+ - libmysqlclient-devel
+ - libopenssl-devel
+ - libvirt-devel
+ - libvirt-python
+ - libyaml-devel
+ - lsb-release
+ - nginx
+ - python-devel
+ - python-pip
+ - python-virtualenv
+ - qemu-tools
+
+#apache_service: apache2
+nginx_available: "/etc/nginx"
+nginx_enabled: "/etc/nginx/vhosts.d"
--- /dev/null
+---
+# SUSE package list (dual python2/python3 distro variant).
+teuthology_extra_packages:
+ - beanstalkd
+ - git
+ - gcc
+ - libev-devel
+ - libffi-devel
+ - libmysqlclient-devel
+ - libopenssl-devel
+ - libvirt-devel
+ - libyaml-devel
+ - lsb-release
+ - nginx
+ - python2-devel
+ - python3-devel
+ - python2-pip
+ - python3-pip
+ - python2-virtualenv
+ - python3-virtualenv
+ - python2-libvirt-python
+ - python3-libvirt-python
+ - qemu-tools
+
+#apache_service: apache2
+nginx_available: "/etc/nginx"
+nginx_enabled: "/etc/nginx/vhosts.d"
--- /dev/null
+---
+# SUSE package list (python3-only distro variant).
+teuthology_extra_packages:
+ - beanstalkd
+ - git
+ - gcc
+ - libev-devel
+ - libffi-devel
+ - libmysqlclient-devel
+ - libopenssl-devel
+ - libvirt-devel
+ - libyaml-devel
+ - lsb-release
+ - nginx
+ - python3-devel
+ - python3-pip
+ - python3-virtualenv
+ - python3-libvirt-python
+ - qemu-tools
+
+nginx_available: "/etc/nginx"
+nginx_enabled: "/etc/nginx/vhosts.d"
--- /dev/null
+---
+# SUSE package list (python3-only distro variant).
+teuthology_extra_packages:
+ - beanstalkd
+ - git
+ - gcc
+ - libev-devel
+ - libffi-devel
+ - libmysqlclient-devel
+ - libopenssl-devel
+ - libvirt-devel
+ - libyaml-devel
+ - lsb-release
+ - nginx
+ - python3-devel
+ - python3-pip
+ - python3-virtualenv
+ - python3-libvirt-python
+ - qemu-tools
+
+nginx_available: "/etc/nginx"
+nginx_enabled: "/etc/nginx/vhosts.d"
--- /dev/null
+---
+# SUSE package list (older distro variant; note libffi48-devel).
+teuthology_extra_packages:
+ - beanstalkd
+ - git
+ - gcc
+ - libev-devel
+ - libffi48-devel
+ - libmysqlclient-devel
+ - libopenssl-devel
+ - libvirt-devel
+ - libvirt-python
+ - libyaml-devel
+ - lsb-release
+ - nginx
+ - python-devel
+ - python-pip
+ - python-virtualenv
+ - qemu-tools
+
+#apache_service: apache2
+nginx_available: "/etc/nginx"
+nginx_enabled: "/etc/nginx/vhosts.d"
--- /dev/null
+Users
+=====
+
+This role is used to manage user accounts on a node. In either your group_vars
+or host_vars files you must define two variables for this role to use:
+``managed_users`` and ``managed_admin_users``. The ``managed_users`` variable
+will create users without sudo access while users in the
+``managed_admin_users`` list will be granted sudo access. Sudo access is
+granted by adding the ``managed_admin_users`` to the group ``sudo`` which
+should be created beforehand. It is not required to add both of these vars to
+your inventory, only use what makes sense for the node being managed.
+
+Additionally, if you have defined ``managed_users`` and ``managed_admin_users``
+for a set of hosts and want to grant sudo access to users on a subset of those
+hosts, you may define ``extra_admin_users`` for that group. The format of that
+variable is similar to the other two, except the ``key`` field is optional for
+each user which is already present in ``managed_users``. This is to allow
+flexibility without as much repetition.
+
+When adding a user, these steps are performed for each user:
+
+- Ensures that the user exists (tags: users)
+
+- Sets the user's shell to /bin/bash (tags: users)
+
+- Ensures that the user's homedir exists (tags: users)
+
+- Adds the user to the ``sudo`` group if in ``managed_admin_users`` (tags: users)
+
+- Adds the user's public key to ~/.ssh/authorized_keys (tags: pubkeys)
+
+
+This role also supports revoking user access by removing all users in the
+``revoked_users`` variable.
+
+
+Usage
++++++
+
+This role is required as a dependency for the ``common`` role so it's already in use for
+almost all groups and playbooks, but if you need to manage users for a specific node or for a
+one-off situation you can use the users.yml playbook.
+
+For example, this would create and update keys for all users defined for $NODE. First, be
+sure to define either ``managed_users`` or ``managed_admin_users`` in your inventory; then::
+
+ $ ansible-playbook users.yml --limit="$NODE"
+
+You can also filter the list of users being managed by passing the 'users' variable::
+
+ $ ansible-playbook users.yml --limit="$NODE" --extra-vars='{"users": ["user1"]}'
+
+Variables
++++++++++
+
+Available variables are listed below, along with default values (see ``defaults/main.yml``):
+
+A list of hashes that define users that will be created **without** sudo access::
+
+ managed_users: []
+
+A list of hashes that define users that will be created **with** sudo access::
+
+ managed_admin_users: []
+
+Both of these lists require that the user data be a yaml hash that defines both a ``name``
+and ``key`` property. The ``name`` will become the user's username and ``key`` is either
+an SSH public key as a string or a URL.
+
+For example, in inventory/group_vars/webservers.yml you might have a list of users like this::
+
+ ---
+ managed_users:
+ - name: user1
+ key: <ssh_key_string>
+ - name: user2
+ key: <ssh_key_url>
+
+ managed_admin_users:
+ - name: admin
+ key: <ssh_key_string>
+
+A list of usernames to filter ``managed_users`` and ``managed_admin_users`` by::
+
+ users: []
+
+A list of usernames whose access is to be revoked::
+
+ revoked_users: []
+
+The users role writes a sentinel file, ``/keys-repo-sha1``, to indicate the sha1 of the keys repo when ceph-cm-ansible last ran. If the sha1 in that file matches the current keys repo HEAD sha1, users tasks will be skipped unless you set ``force_users_update: True``::
+
+ force_users_update: False
+
+By default, the users and pubkeys should be updated. A task in ``main.yml`` changes this to ``False`` if the machine's users and keys are already up to date (unless ``force_users_update: True``)::
+
+ perform_users_role: True
+
+Tags
+++++
+
+Available tags are listed below:
+
+users
+ Perform only user creation/removal tasks; ssh keys will not be updated.
+
+revoke
+ Perform only user removal tasks.
+
+pubkeys
+ Perform only authorized keys tasks, users will not be created but all
+ SSH keys will be updated for both ``managed_users`` and ``managed_admin_users``.
+
+TODO
+++++
+
+- Allow management of the UID for each user
+
+- Allow management of the shell for each user
+
+- Ensure that the sudo group exists with the correct permissions. We currently depend on it
+ being created already by other playbooks (ansible_managed.yml) or created by cobbler
+ during imaging.
--- /dev/null
+---
+# this should be a list of users in the
+# following format:
+#
+# managed_users:
+# - name: username
+# key: <ssh key as a string>
+# - name: user2
+# key: <url to an ssh key>
+
+# not given sudo access
+managed_users: []
+# are given sudo access
+managed_admin_users: []
+
+# A list of usernames to filter managed_users and
+# managed_admin_users by. For example, if given ['user1']
+# both managed_users and managed_admin_users would be filtered
+# to only contain the information for 'user1'.
+users: []
+
+# A list of users whose access is to be revoked. These accounts will be deleted.
+revoked_users: []
+
+# A repo containing SSH pubkeys. Will be used for each user that has no key
+# specified.
+keys_repo: "https://github.com/ceph/keys"
+# Branch of above repo to use
+keys_branch: main
+# Where to clone keys_repo on the *local* disk
+# (i.e. on the ansible control host, not the managed node)
+keys_repo_path: "~/.cache/src/keys"
+
+# If the keys git repo HEAD sha1 matches the sha1 of the host's /keys-repo-sha1 file, the users role will get skipped to save time.
+# Update users and pubkeys by default (this is changed to False during the play if keys_repo_head.stdout == sentinel_sha1.stdout)
+perform_users_role: True
+# Set this to True if you want to run the users tasks anyway
+force_users_update: False
--- /dev/null
+---
+# This is to prevent normal (read: human) users from ending up with UID 1000,
+# which testnodes needs for the teuthology user.
+- name: Set UID_MIN to 1001
+  lineinfile:
+    dest: /etc/login.defs
+    regexp: "^UID_MIN"
+    line: "UID_MIN 1001"
+
+- debug: var=managed_admin_users
+- debug: var=managed_users
+
+# Guard against managed_admin_users arriving as a non-list (e.g. a string
+# from a bad extra-vars override): anything that is not a list becomes [].
+- name: Normalize managed_admin_users (only if it’s already a list)
+  set_fact:
+    managed_admin_users: "{{ managed_admin_users if (managed_admin_users is iterable and (managed_admin_users|type_debug) == 'list') else [] }}"
+
+- name: Sanity check
+  debug:
+    msg:
+      - "managed_admin_users type: {{ managed_admin_users | type_debug }}"
+      - "first admin: {{ (managed_admin_users | first) | default({}) }}"
+# Expect: list
+
+# Admin users receive sudo via group membership. Bugfix: this task targets
+# Ubuntu/Debian, where the admin group is "sudo" — the "wheel" group used
+# previously does not exist there (and the README documents the "sudo"
+# group), so the user module would fail.
+- name: Create admin users (Ubuntu/Debian)
+  become: yes
+  ansible.builtin.user:
+    name: "{{ item.name }}"
+    groups: sudo
+    shell: /bin/bash
+    state: present
+    append: yes
+  loop: "{{ managed_admin_users }}"
+
+- name: Create all users without sudo access.
+  user:
+    name: "{{ item.name }}"
+    shell: /bin/bash
+    state: present
+  with_items: "{{ managed_users }}"
--- /dev/null
+---
+# 0) Safety defaults — make the optional vars exist so later filters
+#    never hit an undefined variable. (Also adds the missing YAML
+#    document-start marker and a name for this previously unnamed task.)
+- name: Default optional user vars to empty lists
+  set_fact:
+    managed_admin_users: "{{ managed_admin_users | default([]) }}"
+    users: "{{ users | default([]) }}"
+
+# 1) De-duplicate lab_users by name (keeps the first occurrence)
+- name: De-dup lab_users by name
+  set_fact:
+    _lab_users_unique: >-
+      {{
+        (lab_users | groupby('name'))
+        | map('last') | map('first') | list
+      }}
+
+# 2) Build admin names list (supports list-of-dicts OR list-of-strings)
+- name: Build _admin_names safely
+  set_fact:
+    _admin_names: >-
+      {{
+        (
+          (managed_admin_users | length > 0) and ((managed_admin_users | first) is mapping)
+        )
+        | ternary(
+          managed_admin_users | map(attribute='name') | list,
+          managed_admin_users | list
+        )
+      }}
+
+# 3) managed_users = lab_users_unique MINUS admins
+- name: Recompute managed_users
+  set_fact:
+    managed_users: "{{ _lab_users_unique | rejectattr('name','in', _admin_names) | list }}"
+
+# 4) Optional allowlist (only if users provided)
+- name: Apply allowlist "users"
+  when: users | length > 0
+  set_fact:
+    managed_users: "{{ managed_users | selectattr('name','in', users) | list }}"
+    managed_admin_users: >-
+      {{
+        (
+          (managed_admin_users | length > 0) and ((managed_admin_users | first) is mapping)
+        )
+        | ternary(
+          managed_admin_users | selectattr('name','in', users) | list,
+          (managed_admin_users | select('in', users) | list)
+        )
+      }}
--- /dev/null
+---
+- name: Check keys_repo HEAD sha1
+  shell: "git ls-remote {{ keys_repo }} HEAD | awk '{ print $1 }'"
+  register: keys_repo_head
+  become: false
+  when: keys_repo is defined
+  delegate_to: localhost
+  connection: local
+  run_once: true
+  # retries/delay only take effect together with an "until" condition.
+  until: keys_repo_head is success
+  retries: 5
+  delay: 10
+  # perform_users_role is True by default so no need to fail the play if there's an error.
+  ignore_errors: true
+  tags:
+    - pubkeys
+
+- name: Check host's /keys-repo-sha1 sentinel file
+  command: cat /keys-repo-sha1
+  register: sentinel_sha1
+  # perform_users_role is True by default so no need to fail the play if there's an error.
+  failed_when: false
+  tags:
+    - pubkeys
+
+# Skip the (slow) users/pubkeys tasks only when we positively know the keys
+# repo has not changed since the last run and no update is being forced.
+# Bugfix: the old condition was "A or (B and C)" — "and" binds tighter than
+# "or" — so force_users_update could not override the first branch, and
+# comparing .stdout of a skipped/failed lookup could raise. Each condition
+# below must hold for the skip to happen; otherwise we perform the role
+# (the safe default).
+- name: Determine if we can skip users and pubkeys updates
+  set_fact:
+    perform_users_role: False
+  when:
+    - not force_users_update|bool
+    - keys_repo_head.stdout is defined
+    - sentinel_sha1.stdout is defined
+    - keys_repo_head.stdout == sentinel_sha1.stdout
+
+- import_tasks: filter_users.yml
+  when: perform_users_role|bool
+  tags:
+    - always
+
+- import_tasks: create_users.yml
+  when: perform_users_role|bool
+  tags:
+    - user
+
+- import_tasks: update_keys.yml
+  when: perform_users_role|bool
+  tags:
+    - pubkeys
+
+- import_tasks: revoke_users.yml
+  when: perform_users_role|bool
+  tags:
+    - user
+    - revoke
+
+# Only record a sentinel when we actually obtained a sha1; never write an
+# empty file (which would defeat the comparison on the next run).
+- name: Write /keys-repo-sha1 sentinel file
+  copy:
+    content: "{{ keys_repo_head.stdout }}"
+    dest: /keys-repo-sha1
+  when: keys_repo_head.stdout is defined and keys_repo_head.stdout | length > 0
+  tags:
+    - pubkeys
--- /dev/null
+---
+# When an allowlist ("users") is given, only revoke accounts that are in
+# both lists. The Jinja loop below renders a literal Python-style list
+# string, which set_fact then re-parses as a list.
+- name: Filter the revoked_users list
+ set_fact:
+ revoked_users:
+ "[{% for user in revoked_users %}
+ {% if user in users %}'{{ user }}',{%endif%}
+ {%endfor%}]"
+ when: users|length > 0
+ tags:
+ - always
+
+# state: absent removes the account (home dirs are left in place).
+- name: Remove revoked users
+ user:
+ name: "{{ item }}"
+ state: absent
+ with_items: "{{ revoked_users }}"
--- /dev/null
+---
+# Both admin and non-admin users get their authorized_keys managed.
+- name: Merge managed_users and managed_admin users
+ set_fact:
+ pubkey_users: "{{ managed_users|list + managed_admin_users|list }}"
+
+# The keys repo is cloned on the ansible control host (local_action),
+# retried to ride out transient network failures.
+- name: Clone the keys repo
+ local_action:
+ module: git
+ repo: "{{ keys_repo }}"
+ version: "{{ keys_branch }}"
+ # http://tracker.ceph.com/issues/16615
+ # depth: 1
+ force: yes
+ dest: "{{ keys_repo_path }}"
+ become: false
+ when: keys_repo is defined
+ connection: local
+ run_once: true
+ register: clone_keys
+ until: clone_keys is success
+ retries: 5
+ delay: 10
+
+# Users without an explicit "key" fall back to <keys_repo>/ssh/<name>.pub,
+# read from the local clone via the file lookup.
+- name: Update authorized_keys using the keys repo
+ authorized_key:
+ user: "{{ item.name }}"
+ key: "{{ lookup('file', keys_repo_path + '/ssh/' + item.name + '.pub') }}"
+ with_items: "{{ pubkey_users }}"
+ when: item.key is undefined and keys_repo is defined
+
+# Users with an inline "key" (string or URL) use it directly.
+- name: Update authorized_keys for each user with literal keys
+ authorized_key:
+ user: "{{ item.name }}"
+ key: "{{ item.key }}"
+ with_items: "{{ pubkey_users }}"
+ when: item.key is defined
--- /dev/null
+vmhost
+======
+
+This role does a lot of the setup for a mira node running Ubuntu
+(probably sticking with an LTS of trusty or later is a good idea;
+trusty is where it's got the most testing) to turn it into a
+'standard' VPS host. Our standard is: 8 qemu-kvm virtual machines,
+provisioned by libvirt through downburst, as noted in the lock
+database on paddles for the sepia lab. The first of those uses
+data storage sharing the root drive, and the last seven use
+the seven free mira drives as their storage pool.
+
+This role does not set up the storage pool directories/mount
+points, and does not add any mapping of which vpm VMs belong
+on any particular node (from the vps_hosts group). It assumes
+that you have already:
+
+- created /srv/libvirtpool on the vmhost
+
+- made subdirs there named after the vpms
+
+On mira, we then use disks b..h as separate filesystems to
+mount on vpmNNN+1..vpmNNN+7, so for miras, we will have:
+
+- made filesystems (xfs is the usual choice)
+
+- mounted those filesystems on /srv/libvirtpool/<vpm#2--N>
+
+- added UUID= lines to /etc/fstab so the mounts happen at reboot
+
+Note that the role does not assume any particular structure
+of what provides /srv/libvirtpool/vpmNNN, but simply uses that
+to drive creating libvirt pools.
+
+It is certainly possible to do the above with ansible as well,
+and a later version may.
+
+
+Variables
++++++++++
+
+Only one variable is defined, ``vmhost_apt_packages``. The default
+is empty, but the current definition in vars/ is not expected to change
+soon.
+
+Tags
+++++
+
+packages
+ Just install packages
+
+networking
+ Set up the bridge for qemu to use as the 'front' network
+
+libvirt
+ All the libvirt-related setup (pools, networks, etc.)
--- /dev/null
+# /etc/network/interfaces shipped by the vmhost role: bridges eth0 into
+# br-front so libvirt guests can join the 'front' network via DHCP.
+auto lo
+iface lo inet loopback
+
+# eth0 carries no address itself; it is enslaved to the bridge below.
+iface eth0 inet manual
+
+auto br-front
+iface br-front inet dhcp
+	bridge_ports eth0
+	bridge_fd 9
+	bridge_hello 2
+	bridge_maxage 12
+	bridge_stp off
--- /dev/null
+<!-- libvirt network 'front': forwards guest traffic through the br-front
+     host bridge defined in the vmhost role's /etc/network/interfaces. -->
+<network>
+ <name>front</name>
+ <bridge name='br-front'/>
+ <forward mode='bridge'/>
+</network>
--- /dev/null
+---
+# default pool
+# Note: failed_when: false means the registered result is never marked
+# "failed"; a missing pool must be detected via the command's return code.
+- name: Query libvirt pool 'default'
+  command: virsh pool-uuid default
+  register: pool_uuid
+  failed_when: false
+
+# Bugfix: "pool_uuid | failed" used the Jinja "failed" filter, removed in
+# Ansible 2.9 — and it could never be true anyway because of failed_when
+# above. Check rc instead.
+- name: Define libvirt pool 'default'
+  command: virsh pool-define-as --name default dir --target /var/lib/libvirt/images
+  when: pool_uuid.rc is defined and pool_uuid.rc != 0
+
+- name: Query 'default' pool state
+  command: virsh -q pool-info default
+  ignore_errors: yes
+  register: default_pool_info
+
+# Bugfix: the "| search" filter was removed in Ansible 2.9; use the
+# "search" test ("is search(...)") instead.
+- name: Start pool 'default'
+  command: virsh pool-start default
+  when: default_pool_info is defined and default_pool_info.stdout is search("State: *inactive")
+
+- name: Autostart pool 'default'
+  command: virsh pool-autostart default
+  when: default_pool_info is defined and default_pool_info.stdout is search("Autostart: *no")
+
+# Per-vpm storage pools
+
+# /srv/libvirtpool (with one subdir per vpm) must be created by hand
+# before running this role — see the README; fail early if it is missing.
+- name: Test for /srv/libvirtpool
+ stat:
+ path: /srv/libvirtpool
+ register: srv_libvirtpool
+ failed_when: srv_libvirtpool.stat.exists == False
+
+- name: Ensure proper ownership in /srv/libvirtpool
+ file:
+ path: /srv/libvirtpool
+ state: directory
+ owner: libvirt-qemu
+ group: kvm
+ recurse: yes
+ when: srv_libvirtpool.stat.exists
+
+# the dance here is to figure out which pools are already defined,
+# and avoid trying to defining them again.
+
+# Each subdirectory name of /srv/libvirtpool is treated as a vpm/pool name.
+- name: Find defined vpm names
+ command: ls /srv/libvirtpool
+ register: ls_libvirtpool
+ when: srv_libvirtpool.stat.exists
+
+# virsh pool-info exits non-zero for an undefined pool; that rc is what the
+# next task keys off.
+- name: See which pools are defined and which are not
+ shell: virsh pool-info {{ item }}
+ with_items: "{{ ls_libvirtpool.stdout_lines }}"
+ register: pool_info
+ when: srv_libvirtpool.stat.exists
+ # don't bother reporting anything about this command; it's not useful
+ failed_when: false
+
+# pool_info.results is a now list of dicts, one per item, with 'rc',
+# 'changed', 'stdout', 'stderr' etc. Make a new list for
+# all of the above that failed (i.e. rc == 1), as those
+# are the pools that still need definition. "" stop
+# jinja templating from being confused with yaml, as usual;
+# {%- and -%} suppress blank lines so that the only thing
+# that expands is the list declaration.
+
+- name: Form list of undefined pools
+ set_fact:
+ pools_to_define:
+ "{%- set l = [] %}
+ {%- for result in pool_info.results %}
+ {%- if result.rc == 1 %}
+ {%- set dummy = l.append(result.item) %}
+ {%- endif %}
+ {%- endfor -%}
+ {{ l | list }}"
+
+# Define, autostart, build and start each missing pool in one shell pass.
+- name: Define pools which are left to be defined
+ shell: |
+ virsh pool-define-as --name {{ item | quote }} --type dir --target /srv/libvirtpool/{{ item }};
+ virsh pool-autostart {{ item | quote }};
+ virsh pool-build {{ item | quote }};
+ virsh pool-start {{ item | quote }}
+ with_items: "{{ pools_to_define }}"
+ when: pools_to_define|length > 0
+
+# Front network
+
+- name: Query for front network definition
+  command: virsh net-info front
+  ignore_errors: true
+  register: front_net
+
+# Bugfix: "front_net | failed" used the Jinja "failed" filter, which was
+# removed in Ansible 2.9; the equivalent test is "front_net is failed".
+- name: Send front network definition file
+  copy:
+    src: ../files/libvirt-net-front.xml
+    dest: /tmp/libvirt-net-front.xml
+  when: front_net is defined and front_net is failed
+
+- name: Create front network
+  command: virsh net-define /tmp/libvirt-net-front.xml
+  when: front_net is defined and front_net is failed
+
+- name: Remove tmp network definition file
+  file:
+    dest: /tmp/libvirt-net-front.xml
+    state: absent
+  when: front_net is defined and front_net is failed
+
+- name: Re-query for front network definition
+  command: virsh net-info front
+  ignore_errors: yes
+  register: front_net
+
+# Bugfix: the "| search" filter was also removed in Ansible 2.9; use the
+# "search" test instead.
+- name: Start front network
+  command: virsh net-start front
+  when: front_net is defined and front_net.stdout is search("Active: *no")
+
+- name: Set front network to autostart
+  command: virsh net-autostart front
+  when: front_net is defined and front_net.stdout is search("Autostart: *no")
+
+# Final steps
+
+# Group name 'libvirtd' is the Ubuntu libvirt-bin era name; append: yes
+# preserves the user's existing group memberships.
+- name: Allow libvirt for teuthology user
+  user:
+    name: "{{ teuthology_user }}"
+    groups: libvirtd
+    append: yes
+
+# Plain 'service' command rather than the service module; this repo's
+# .ansible-lint deliberately skips command-instead-of-module.
+- name: Restart libvirt-bin
+  command: service libvirt-bin restart
--- /dev/null
+# Entry point for the vmhost role: packages first, then host network
+# config, then libvirt pool/network setup.  Tags allow running each
+# phase selectively with --tags.
+- import_tasks: packages.yml
+  tags: packages
+
+- import_tasks: networking.yml
+  tags: networking
+
+- import_tasks: libvirt.yml
+  tags: libvirt
--- /dev/null
+# front_mac = ansible_eth0.macaddress
+# front_ip = ansible_eth0.ipv4.address
+
+# Push a static /etc/network/interfaces; backup: yes keeps the previous
+# file on the host in case the new config breaks connectivity.
+- name: Install /etc/network/interfaces
+  copy:
+    src: interfaces
+    dest: /etc/network/interfaces
+    force: yes
+    owner: root
+    group: root
+    mode: 0644
+    backup: yes
+  register: interface_install
+
+# Only bounce all interfaces when the file actually changed.
+- name: Activate new network config
+  shell: /sbin/ifdown -a; /sbin/ifup -a
+  when: interface_install is changed
--- /dev/null
+---
+# Install the VM-host package set (list comes from defaults/main.yml).
+# cache_valid_time avoids an apt update when the cache is <10 min old.
+- name: Install packages via apt
+  apt:
+    name: "{{ vmhost_apt_packages|list }}"
+    state: latest
+    update_cache: yes
+    cache_valid_time: 600
+  tags:
+    - packages
--- /dev/null
+---
+# Packages required on a libvirt/KVM VM host (Ubuntu/Debian).
+vmhost_apt_packages:
+  - qemu-kvm
+  - libvirt-bin
+  - bridge-utils
--- /dev/null
+---
+# Apply the rook role on the machine running ansible itself.
+- hosts: localhost
+  gather_facts: True
+  roles:
+    - rook
--- /dev/null
+---
+# This will set ansible_python_interpreter to use python3
+# if the shell module fails (like it will on RHEL8 since
+# /usr/bin/python is no more).
+- hosts: all
+  gather_facts: false
+  vars:
+    ansible_ssh_user: "{{ cm_user }}"
+  become: true
+  tasks:
+    # Trivial shell task: it only fails when Ansible's default
+    # interpreter (/usr/bin/python) is missing on the host.
+    - name: Check for /usr/bin/python
+      shell: echo marco
+      register: polo
+      ignore_errors: true
+    - name: Set ansible_python_interpreter=/usr/bin/python3
+      set_fact:
+        ansible_python_interpreter: /usr/bin/python3
+      when: polo.rc != 0
--- /dev/null
+---
+# Configure the SignalFx/Splunk agent for systemd monitoring.  The
+# vars file is supplied at runtime, e.g. -e var_file_name=foo.yml.
+# (Fixed the "configurarion" typo in the play name.)
+- name: The signalfx-configuration for systemd monitoring
+  hosts: all
+  gather_facts: yes
+
+  vars_files:
+    - "{{ var_file_name }}"
+
+  roles:
+    - signalfx_splunk_agent_configuration
--- /dev/null
+---
+# Configure all test nodes: base setup (common), test-node specifics
+# (testnode), metrics shipping (grafana_agent).  'free' strategy lets
+# each host proceed without waiting for the slowest one.
+- hosts: testnodes
+  strategy: free
+  roles:
+    - common
+    - testnode
+    - grafana_agent
+  become: true
--- /dev/null
+---
+# Configure teuthology scheduler/worker hosts; mirrors the testnodes
+# play (common base role + host-specific role, free strategy).
+- hosts: teuthology
+  strategy: free
+  roles:
+    - common
+    - teuthology
+  become: true
--- /dev/null
+#!/usr/bin/python3
+
+import argparse
+import socket
+import ssl
+import subprocess
+import sys
+import os
+import tempfile
+import datetime
+import smtplib
+
+DAYS_BEFORE_WARN=7
+
+DEFAULT_DOMAINS = [
+ '1.chacra.ceph.com',
+ '2.chacra.ceph.com',
+ '3.chacra.ceph.com',
+ '4.chacra.ceph.com',
+ 'ceph.com',
+ 'ceph.io',
+ 'chacra.ceph.com',
+ 'console-openshift-console.apps.os.sepia.ceph.com',
+ 'docs.ceph.com',
+ 'download.ceph.com',
+ 'git.ceph.com',
+ 'grafana.ceph.com',
+ 'jenkins.ceph.com',
+ 'jenkins.rook.io',
+ 'lists.ceph.io',
+ 'pad.ceph.com',
+ 'paddles.front.sepia.ceph.com',
+ 'pulpito.ceph.com',
+ 'quay.ceph.io',
+ 'sentry.ceph.com',
+ 'shaman.ceph.com',
+ 'status.sepia.ceph.com',
+ 'telemetry-public.ceph.com',
+ 'tracker.ceph.com',
+ 'wiki.sepia.ceph.com',
+ 'www.ceph.io',
+ ]
+DEFAULT_EMAIL = [
+ 'dmick@redhat.com',
+ 'ceph-infra@redhat.com',
+ 'akraitman@redhat.com',
+ 'aschoen@redhat.com',
+ 'zcerza@redhat.com',
+ ]
+
+
+def parse_args():
+ ap = argparse.ArgumentParser()
+ ap.add_argument('-q', '--quiet', action='store_true')
+ ap.add_argument('-E', '--send-email', action='store_true', help="send email with warnings")
+ ap.add_argument('-e', '--email', nargs='*', default=DEFAULT_EMAIL, help=f'list of addresses to send to (default: {DEFAULT_EMAIL})')
+ ap.add_argument('-d', '--domains', nargs='*', default=DEFAULT_DOMAINS)
+ return ap.parse_args()
+
+def sendmail(emailto, subject, body):
+    """Send a plain-text report mail through the local SMTP server.
+
+    emailto: list of recipient address strings (must be a list).
+    subject: subject line.
+    body:    message text; a footer naming this script and host is
+             appended.
+    """
+    FROM = 'ceph-infra-admins@redhat.com'
+    TO = emailto # must be a list
+    SUBJECT = subject
+    TEXT = body
+
+    # Prepare actual message
+
+    # NOTE: the lines of this literal form the raw message headers and
+    # body; they must stay at column zero inside the string.
+    message = """\
+From: %s
+To: %s
+Subject: %s
+
+%s
+
+Report from %s running on %s
+""" % (FROM, ", ".join(TO), SUBJECT, TEXT, os.path.realpath(sys.argv[0]), socket.gethostname())
+
+    # send it
+    server = smtplib.SMTP('localhost')
+    server.sendmail(FROM, TO, message)
+    server.quit()
+
+def main():
+    """Check each domain's TLS certificate; optionally mail warnings.
+
+    Returns 1 (int(True)) when at least one warning email was sent,
+    else 0 — suitable as a process exit status.
+    """
+    context = ssl.create_default_context()
+
+    args = parse_args()
+    domains = args.domains
+
+    warned = False
+    for domain in domains:
+        errstr = None
+        certerr = False
+        warn = datetime.timedelta(days=DAYS_BEFORE_WARN)
+        try:
+            # Handshake on 443; certificate verification happens inside
+            # wrap_socket() against the default CA store.
+            with socket.create_connection((domain, 443)) as sock:
+                with context.wrap_socket(sock, server_hostname=domain) as ssock:
+                    cert = ssock.getpeercert()
+        except (ssl.CertificateError, ssl.SSLError) as e:
+            certerr = True
+            errstr = f'{domain} cert error: {e}'
+
+        if not certerr:
+            # 'notAfter' looks like 'Jun  1 12:00:00 2030 GMT'.
+            expire = datetime.datetime.strptime(cert['notAfter'],
+                '%b %d %H:%M:%S %Y %Z')
+            # NOTE(review): utcnow() is naive (and deprecated in newer
+            # Pythons); switching to tz-aware now() would also require
+            # an aware 'expire' — confirm before changing.
+            now = datetime.datetime.utcnow()
+            left = expire - now
+
+            errstr = f'{domain:30s} cert: {str(left).rsplit(".",1)[0]} left until it expires'
+        if not args.quiet:
+            print(errstr, file=sys.stderr)
+        # 'left' is unset when certerr is True, but the short-circuit
+        # on certerr means it is never evaluated in that case.
+        if (certerr or (left < warn)) and (args.send_email):
+            subject = f'Certificate problem with {domain}'
+            body = errstr
+            email = args.email
+            # -e with no values yields []; fall back to the defaults.
+            if email == []:
+                email = DEFAULT_EMAIL
+            sendmail(email, subject, body)
+            warned = True
+    return int(warned)
+
+if __name__ == '__main__':
+    sys.exit(main())
+
--- /dev/null
+#!/bin/bash
+# Script to generate Cobbler credentials
+
+tmpfile=$(mktemp)
+
+# Basically `mkpasswd` but uses a small subset of special characters
+password=$(head /dev/urandom | tr -dc 'A-Za-z0-9!@#$%&' | head -c 12 && echo)
+
+if [ $# -eq 0 ]; then
+ printf "Enter username: "
+ read -r username
+else
+ username=$1
+fi
+
+cat << EOF
+
+======== String for cobbler.yml ========
+--- Cobbler v2 ---
+$(echo -n "$username:Cobbler:" && echo -n "$username:Cobbler:$password" | md5sum | awk '{ print $1 }')
+
+--- Cobbler v3 ---
+$username:Cobbler:$(printf "$password" | openssl dgst -sha3-512 | awk '{ print $2 }')
+
+======== E-mail to $username ========
+Hi FIRSTNAME,
+
+Here are your Cobbler user credentials.
+
+Username: $username
+Password: $password
+
+Please do not share these credentials.
+
+Thank you.
+
+EOF
--- /dev/null
+---
+### This playbook simply converts a CentOS host to CentOS Stream.
+### It is primarily intended to be run during Cobbler's cephlab_ansible.sh post-install trigger.
+
+- hosts:
+ - all
+ become: true
+ gather_facts: true
+ tasks:
+
+ - name: List repo files
+ find:
+ paths: /etc/yum.repos.d/
+ file_type: file
+ patterns: 'CentOS-Linux-*.repo'
+ register: pre_stream_repo_files
+ when: ansible_distribution == 'CentOS'
+
+ # From ansible docs: 'replace: If not set, matches are removed entirely.'
+ - name: Remove all mirrorlists
+ replace:
+ path: "{{ item.path }}"
+ regexp: '^mirrorlist=.*'
+ with_items: "{{ pre_stream_repo_files.files }}"
+ when: ansible_distribution == 'CentOS'
+
+  - name: Uncomment baseurls
+    replace:
+      path: "{{ item.path }}"
+      # A duplicate 'regexp: ^mirrorlist=.*' key (copy-paste leftover
+      # from the task above) was removed here: YAML keeps only the
+      # last duplicate key, so it was dead weight that masked intent.
+      regexp: '^\s*#*\s*(baseurl=.*)'
+      replace: '\1'
+    with_items: "{{ pre_stream_repo_files.files }}"
+    when: ansible_distribution == 'CentOS'
+
+ - name: Point baseurls to archive server
+ replace:
+ path: "{{ item.path }}"
+ regexp: 'mirror.centos.org/\$contentdir/\$releasever'
+ replace: 'vault.centos.org/8.5.2111'
+ with_items: "{{ pre_stream_repo_files.files }}"
+ when: ansible_distribution == 'CentOS'
+
+ - name: Swap to Stream Repos
+ command: dnf -y swap centos-linux-repos centos-stream-repos
+ when: ansible_distribution == 'CentOS'
+
+ - name: Sync Stream Repos
+ command: dnf -y distro-sync
+ when: ansible_distribution == 'CentOS'
--- /dev/null
+# put this in ~/.vmlist.conf
+
+[global]
+# which hosts to examine for lxc and virsh output
+# vm_hosts:
+# where to put the cache file for dump without -r
+# cachefile: ~/.vmlist.cache
+# what version of the novaclient API to use
+# novaclient_version: 2
+
+# Sections named 'cloud-XXXX' will be interpreted as nova providers;
+# each cloud is connected to and all of its servers are listed.
+# If cloud_region_names is set, a list for each region is acquired.
+
+#[cloud-ovh-cattle]
+#cloud_user: <userid>
+#cloud_password: <password>
+#cloud_project_id: 5633955729735406
+#cloud_tenant_id: 131b886b156a4f84b5f41baf2fbe646c
+#cloud_region_names: GRA1, BHS1
+#cloud_auth_url: https://auth.cloud.ovh.net/v2.0
+
+#[cloud-ovh-pets]
+#cloud_user: <userid>
+#cloud_password: <password>
+#cloud_project_id: 4867549786842007
+#cloud_auth_url: https://auth.cloud.ovh.net/v2.0
+#cloud_tenant_id: 8f16c274eb514336a8844ed418dfc1a0
+#cloud_region_names: GRA1, BHS1
+
+#[cloud-dreamcompute]
+#cloud_user: <userid>
+#cloud_password: <password>
+#cloud_project_id: dhc1268222
+#cloud_auth_url: https://keystone.dream.io/v2.0
+
+
--- /dev/null
+---
+# This playbook is used to sync the jenkins jobs from one ocp pod folder to another pod folder
+# Usage:
+# ansible-playbook downstream-jenkins-sync-jobs.yml --extra-vars "src_pod=gluster-downstream-jenkins src_folder=/var/lib/jenkins/restore/jobs/ dest_pod=gluster-new-jenkins dest_folder=/var/lib/jenkins/jobs/"
+# Variables:
+# src_pod - The pod name that holds the jobs that will get copied to the destination pod
+# src_folder - The folder on the src_pod that includes the jobs that will be copied to the dest_pod
+# dest_pod - The pod name that the jobs will get copied to
+# dest_folder - The folder on the dest_pod where the jobs will be copied to
+#
+- hosts: localhost
+ gather_facts: false
+ tasks:
+
+ - name: Check oc tool installation status
+ command: which oc
+ changed_when: false
+ failed_when: false
+ register: oc_installed
+
+ - name: Fail if oc tool is not installed
+ fail:
+ msg: "oc tool appears to be missing, install first and connect with your user to the ocp cluster by running oc login"
+ when: oc_installed is failed
+
+ - name: Check connected oc client user
+ command: oc whoami
+ ignore_errors: True
+ register: oc_whoami
+
+ - name: Fail if oc user is not connected
+ fail:
+ msg: "Please login to the ocp cluster by running oc login"
+ when: oc_whoami.rc != 0
+
+ - name: Create temporary directory
+ tempfile:
+ state: directory
+ register: tmpdir
+
+ - name: rsync jobs from source pod to local folder
+ shell: oc rsync $(oc get pods | grep -i Running | grep -i "{{ src_pod }}" | awk '{ print $1 }'):"{{ src_folder }}" "{{ tmpdir.path }}"
+
+ - name: rsync jobs from local folder to destination pod
+ shell: oc rsync "{{ tmpdir.path }}" $(oc get pods | grep -i Running | grep -i "{{ dest_pod }}" | awk '{ print $1 }'):"{{ dest_folder }}"
+
+ - name: Remove the temporary directory
+ file:
+ path: "{{ tmpdir.path }}"
+ state: absent
+ when: tmpdir.path is defined
--- /dev/null
+---
+# This playbook can be used to generate a CSV file of testnodes
+# that can be imported to the FOG web UI.
+# It outputs a CSV file to /tmp/fog_hostfile.csv
+
+- hosts: localhost
+ roles:
+ - generate-fog-csv
+ become: false
+ gather_facts: false
--- /dev/null
+# The incerta nodes in the Sepia lab are connected to a private (not uplinked) Mellanox 40Gb switch.
+# This playbook is used in conjunction with individual host_vars files for each host to configure
+# the second/back interface on each server.
+#
+# https://wiki.sepia.ceph.com/doku.php?id=hardware:incerta
+# https://wiki.sepia.ceph.com/doku.php?id=services:networking#hardware
+# mlx-sw01.ipmi.sepia.ceph.com
+
+- hosts: incerta
+ become: true
+ gather_facts: true
+ tasks:
+ - name: Make sure ethtool is installed (Ubuntu)
+ apt:
+ name: ethtool
+ state: present
+ when: ansible_os_family == 'Debian'
+
+ - name: Make sure ethtool is installed (CentOS/RHEL)
+ yum:
+ name: ethtool
+ state: present
+ enablerepo: epel
+ when: ansible_os_family == 'RedHat'
+
+ - name: grep ethtool for secondary NIC MAC address
+ shell: "ethtool -P {{ item }} | awk '{ print $3 }' | grep -q -i '{{ incerta_back_mac }}'"
+ register: ethtool_grep_output
+ with_items: "{{ ansible_interfaces }}"
+ failed_when: false
+ changed_when: false
+
+ - name: Define net_to_configure var
+ set_fact:
+ nic_to_configure: "{{ item.item }}"
+ with_items: "{{ ethtool_grep_output.results }}"
+ when: item.rc == 0
+
+ - name: Check for /etc/network/interfaces
+ stat:
+ path: /etc/network/interfaces
+ register: etc_network_interfaces
+ when: ansible_os_family == 'Debian'
+
+ - name: "Write Ubuntu network config for {{ nic_to_configure }}"
+ blockinfile:
+ path: /etc/network/interfaces
+ block: |
+ auto {{ nic_to_configure }}
+ iface {{ nic_to_configure }} inet static
+ address {{ incerta_back_ip }}
+ network 10.0.10.0
+ netmask 255.255.255.0
+ broadcast 10.0.10.255
+ post-up /sbin/ifconfig {{ nic_to_configure }} mtu 9216 up
+ register: wrote_network_config
+ when:
+ - nic_to_configure is defined
+ - ansible_os_family == 'Debian'
+ - etc_network_interfaces.stat.exists
+
+ - name: "Bounce {{ nic_to_configure }}"
+ shell: "ifdown {{ nic_to_configure }} && ifup {{ nic_to_configure }}"
+ when:
+ - wrote_network_config is changed
+ - ansible_os_family == 'Debian'
+ - etc_network_interfaces.stat.exists
+
+ - name: Check for /etc/netplan/01-netcfg.yaml
+ stat:
+ path: /etc/netplan/01-netcfg.yaml
+ register: netplan_conf
+ when: ansible_os_family == 'Debian'
+
+ - name: "Configure {{ nic_to_configure }} using ifconfig"
+ command: "ifconfig {{ nic_to_configure }} {{ incerta_back_ip }} netmask 255.255.255.0 mtu 9216"
+ when:
+ - ansible_os_family == 'Debian'
+ - not etc_network_interfaces.stat.exists
+ - netplan_conf.stat.exists
+
+ - name: "Write RHEL/CentOS network config for {{ nic_to_configure }}"
+ lineinfile:
+ path: "/etc/sysconfig/network-scripts/ifcfg-{{ nic_to_configure }}"
+ create: yes
+ owner: root
+ group: root
+ mode: 0644
+ regexp: "{{ item.regexp }}"
+ line: "{{ item.line }}"
+ register: wrote_network_config
+ with_items:
+ - { regexp: '^DEVICE=', line: 'DEVICE={{ nic_to_configure }}' }
+ - { regexp: '^NAME=', line: 'NAME={{ nic_to_configure }}' }
+ - { regexp: '^BOOTPROTO=', line: 'BOOTPROTO=static' }
+ - { regexp: '^ONBOOT=', line: 'ONBOOT=yes' }
+ - { regexp: '^MTU=', line: 'MTU=9216' }
+ - { regexp: '^IPADDR=', line: 'IPADDR={{ incerta_back_ip }}' }
+ - { regexp: '^PREFIX=', line: 'PREFIX=24' }
+ - { regexp: '^DEFROUTE=', line: 'DEFROUTE=no' }
+ when:
+ - nic_to_configure is defined
+ - ansible_os_family == 'RedHat'
+
+ - name: "Bounce {{ nic_to_configure }}"
+ shell: "ifdown {{ nic_to_configure }}; ifup {{ nic_to_configure }}"
+ when:
+ - wrote_network_config is changed
+ - ansible_os_family == 'RedHat'
+
+ - fail:
+ msg: "WARNING: {{ ansible_hostname }} IS USING NETPLAN TO CONFIGURE ITS NICS. EDITING NETPLAN YAML FILES USING ANSIBLE IS NOT TRIVIAL. THEREFORE, THIS NETWORK SETTING WILL NOT SURVIVE A REBOOT! RECOMMEND MANUALLY EDITING /etc/netplan/01-netcfg.yaml"
+ when:
+ - ansible_os_family == 'Debian'
+ - not etc_network_interfaces.stat.exists
+ - netplan_conf.stat.exists
--- /dev/null
+### This playbook configures a braggi host to be a Jenkins slave.
+
+- hosts:
+ - braggi
+ - incerta
+ - irvingi
+ - adami
+ become: true
+ tasks:
+
+# CentOS 9 on the braggi nodes likes to flip around which disk is sda and which is sdb. Sometimes it comes up as sdb and sometimes sda.
+ - name: Check if /dev/sda is the 400GB disk on a braggi
+ parted:
+ device: "/dev/sda"
+ unit: GiB
+ register: "sda_parted"
+ when: '"braggi" in ansible_hostname'
+
+ - name: Check if /dev/sdb is the 400GB disk on a braggi
+ parted:
+ device: "/dev/sdb"
+ unit: GiB
+ register: "sdb_parted"
+ when: '"braggi" in ansible_hostname'
+
+ - set_fact:
+ mount_point: /home/jenkins-build
+ when: '"braggi" in ansible_hostname'
+
+ - set_fact:
+ disk: /dev/sda
+ when:
+ - '"braggi" in ansible_hostname'
+ - "sda_parted.disk.size < 500"
+
+ - set_fact:
+ disk: /dev/sdb
+ when:
+ - '"braggi" in ansible_hostname'
+ - "sdb_parted.disk.size < 500"
+
+ - set_fact:
+ disk: /dev/sdb
+ mount_point: /home/jenkins-build
+ when: '"adami" in ansible_hostname'
+
+ - set_fact:
+ disk: /dev/nvme0n1
+ mount_point: /home/jenkins-build
+ when: '"incerta" in ansible_hostname'
+
+# Setting the mountpoint to libvirt/images on irvingi because I'm adding two
+# right now as CentOS7 Vagrant builders.
+ - set_fact:
+ disk: /dev/sdc
+ mount_point: /var/lib/libvirt/images
+ when: '"irvingi" in ansible_hostname'
+
+ - name: "Create {{ mount_point }} home dir"
+ file:
+ path: "{{ mount_point }}"
+ state: directory
+
+ - name: Install xfsprogs (Ubuntu)
+ package:
+ name: xfsprogs
+ state: latest
+ when: ansible_os_family == "Debian"
+
+ - name: Unmount
+ mount:
+ path: "{{ mount_point }}"
+ src: "{{ disk }}"
+ state: unmounted
+ fstype: xfs
+ ignore_errors: true
+
+ - name: Zap disk
+ command: "sgdisk -Z {{ disk }}"
+
+ - name: Configure disk
+ filesystem:
+ fstype: xfs
+ dev: "{{ disk }}"
+
+ - name: Mount disk
+ mount:
+ path: "{{ mount_point }}"
+ src: "{{ disk }}"
+ state: mounted
+ fstype: xfs
--- /dev/null
+#!/bin/bash
+#
+# make a tarball for distribution of this configuration and
+# secret generator
+#
+tar cfz sepia-vpn-client.tar.gz sepia/ca.crt sepia/client.conf sepia/new-client sepia/tlsauth
--- /dev/null
+-----BEGIN CERTIFICATE-----
+MIIDVzCCAj+gAwIBAgIUOAVvdnT5AeNHmQVerBNGyBipF+0wDQYJKoZIhvcNAQEL
+BQAwGjEYMBYGA1UEAwwPb3BlbnZwbmNhLXNlcGlhMB4XDTI0MTIwMjE3MTc1MloX
+DTM0MTEzMDE3MTc1MlowGjEYMBYGA1UEAwwPb3BlbnZwbmNhLXNlcGlhMIIBIjAN
+BgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEApPbQdUr74nVphtcdV9VhJs1cgKGq
+iZNBRdVxW92JurMJuIJXSiBwGochYTs4NQprlD5jYStnpzoe7c1HsFKwVEY3xSmT
+h7wdj0JIRgAdspG2XxxSU63k2t4Ezm6z7W7jnRvXjNhD55AMpxHAQpS0YhpxTm95
+SJDlk7gCmdIN087ioTYW8Fh+NI/ASjz5m3XWjsF/mTOHLYmlRL4bSWLwpKXuxpPW
+YVeScyDC6olc0MOfNKihxY3Q4IJiLcBPXQhGp3pnKCSut+f+nHu+sSLssliuvGBh
+6rn5c/5TceGbVvK1BX53F5Znx/AGC7XEEXKddUQbZDVN8pg1VygKt8tDIQIDAQAB
+o4GUMIGRMB0GA1UdDgQWBBSCoc5pUrxKfAoguqWqY25PhYuYrjBVBgNVHSMETjBM
+gBSCoc5pUrxKfAoguqWqY25PhYuYrqEepBwwGjEYMBYGA1UEAwwPb3BlbnZwbmNh
+LXNlcGlhghQ4BW92dPkB40eZBV6sE0bIGKkX7TAMBgNVHRMEBTADAQH/MAsGA1Ud
+DwQEAwIBBjANBgkqhkiG9w0BAQsFAAOCAQEAIPJAeutTT6llsHQcC8CUPxSGe98l
+IPGHFX3AE9tRU1C2jfsidovNnxfpYksctjVcv3Zo6UbY6w83+UXciu4uusfjgJ/X
+dc5na7J+PCNcgNY34fsFmX4yQNF7ffTEUAS91FJ2bXs+Ob/dIQvZ0ZJopLia4C0m
+IT0DJfQV6Xx+R+mQ+MB1c2bmW17C88PCOygTUyn8ssrUkttkrf9xebp2TqyggdSH
+myw4nD/iQz+l7lwmDitEJY6cyLBDihhpKEyeCcIMp2+ytEsqaCKOASvjKnG24O19
+N0+ctqX/JPZzCEEpYhlFtZEFKjnYV7DiGvC6GiGZAMWNB3oY2bm+Gf2mNQ==
+-----END CERTIFICATE-----
--- /dev/null
+script-security 1
+client
+remote vpn.sepia.ceph.com 1194
+dev tun
+remote-random
+resolv-retry infinite
+nobind
+user nobody
+group nogroup
+persist-tun
+persist-key
+comp-lzo
+verb 2
+mute 10
+remote-cert-tls server
+tls-auth sepia/tlsauth 1
+ca sepia/ca.crt
+auth-user-pass sepia/secret
--- /dev/null
+#!/usr/bin/python3
+
+# How to set up a client (on Ubuntu/Debian):
+#
+# sudo apt-get install openvpn
+# cd /etc/openvpn
+# sudo tar xvzf ~/sepia-vpn-client.tar.gz
+# sudo ./sepia/new-client MYUSERNAME@MYHOST
+#
+# ... submit the secret to admin and wait for acknowledgment ...
+#
+# sudo service openvpn start sepia
+
+import base64
+import datetime
+import hashlib
+import os
+import re
+import sys
+import tarfile
+
+path = os.path.dirname(sys.argv[0])
+os.chdir(path)
+
+try:
+ (user,) = sys.argv[1:]
+except ValueError:
+ raise SystemExit('Usage: new-client USERNAME@HOST')
+
+# From openvpn(8):
+#
+# To protect against a client passing a maliciously formed username or
+# password string, the username string must consist only of these
+# characters: alphanumeric, underbar ('_'), dash ('-'), dot ('.'), or
+# at ('@'). The password string can consist of any printable
+# characters except for CR or LF. Any illegal characters in either the
+# username or password string will be converted to underbar ('_').
+#
+# Verifying this here to avoid confusion down the road.
+if not re.match(r'^[a-zA-Z0-9_.@-]+$', user):
+ raise SystemExit('new-client: Invalid characters in username')
+
+salt = base64.b64encode(os.urandom(16)).rstrip(b'=')
+secret = base64.b64encode(os.urandom(64)).rstrip(b'=')
+
+inner = hashlib.new('sha256')
+inner.update(salt)
+inner.update(secret)
+outer = hashlib.new('sha256')
+outer.update(inner.digest())
+outer.update(salt)
+hashed = outer.hexdigest()
+
+with open('secret', 'wb') as f:
+ os.fchmod(f.fileno(), 0o600)
+ f.write('{user}\n{secret}\n'.format(user=user, secret=secret.decode()).encode('utf-8'))
+
+base = os.path.basename(path)
+os.symlink(os.path.join(base, 'client.conf'), '../sepia.conf')
+
+sys.stdout.write(
+ "\n!!!!! DO NOT RUN THIS SCRIPT MORE THAN ONCE !!!!!\n\nPlease paste the following line in your Sepia Lab Access Request tracker ticket:\n\n")
+sys.stdout.write("{user} {salt} {hashed}\n\n".format(
+ user=user,
+ salt=salt.decode('utf-8'),
+ hashed=hashed,
+))
+
+with open('secret.hash', 'w') as f:
+ f.write(f"{user} {salt.decode('utf-8')} {hashed}")
+
+datestr = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
+tarfilename = f'secrets.{datestr}.tar.gz'
+tarfile = tarfile.open(tarfilename, 'w:gz')
+for f in ['secret', 'secret.hash']:
+ tarfile.add(f)
+tarfile.close()
+
+sys.stdout.write(f"""
+The secret file (private to you) and secret.hash (the above hashed secret
+information, to be placed on the OpenVPN server) are a matched pair.
+They've been placed into {tarfilename} for safekeeping.""")
--- /dev/null
+#
+# 2048 bit OpenVPN static key
+#
+-----BEGIN OpenVPN Static key V1-----
+45839625d348b4d5c0af603d94110313
+9d6960d0b3c3b22365f0e5ded5281664
+3473d1ece7bfc8fcb990232886aec346
+db726c28f8f6423648a7274d975abd1a
+587953b38323cf13b763724d5c8e2b77
+b6a9d12ca751d8e3de0e56be37300855
+e6864c047148a30cb0b7d87fbd7f5f80
+d19c05a808ba1b48e9a8139051b63e47
+02ab07478c34d75f77d16ecafcaae81c
+303c64f334e73d9b6ba71d2397941402
+51bbd5ab903e89a85cf05ae1158e6258
+d39b9f9e9a3b00cd96d6b6c8a3b93bf1
+9fd3fab9ce8513a525a55feb731ca46c
+185555b2771351422b703b2c3ecbc809
+05cf68e6fd95226c5a45adc01e7645e6
+aaadeb236c0f44fb42c01decd819e849
+-----END OpenVPN Static key V1-----
--- /dev/null
+---
+### This standalone playbook can be used to prep a COBBLER-IMAGED testnode
+### so that it can be used to capture an OS image for FOG.
+### This playbook is needed for a couple reasons
+### - NIC configs get hard coded into the captured FOG images so nodes reimaged by FOG don't come up with network
+
+- hosts:
+ - testnodes
+ become: true
+ gather_facts: false
+ tasks:
+
+ # (Missing in RHEL8)
+ - name: Check for /usr/bin/python
+ shell: echo marco
+ register: polo
+ ignore_errors: true
+
+ - name: Set ansible_python_interpreter=/usr/bin/python3
+ set_fact:
+ ansible_python_interpreter: /usr/bin/python3
+ when: polo is failed
+
+ # Now that we know where python is, we can gather_facts
+ - setup:
+
+ # We need to leave /.cephlab_rc_local or else each FOG reimage would tell Cobbler to run ceph-cm-ansible
+ - name: Remove lock files and udev rules
+ file:
+ path: "{{ item }}"
+ state: absent
+ with_items:
+ - /etc/udev/rules.d/70-persistent-net.rules
+ - /.cephlab_net_configured
+ - /ceph-qa-ready
+
+ - name: Get list of ifcfg scripts from host used to capture image
+ shell: "ls -1 /etc/sysconfig/network-scripts/ifcfg-* | grep -v ifcfg-lo"
+ register: ifcfg_scripts
+ when: ansible_os_family == "RedHat"
+ ignore_errors: true
+
+ - name: Get list of ifcfg scripts from host used to capture image
+ shell: "ls -1 /etc/sysconfig/network/ifcfg-* | grep -v ifcfg-lo"
+ register: ifcfg_scripts
+ when: ansible_os_family == "Suse"
+ ignore_errors: true
+
+ - name: Delete ifcfg scripts
+ file:
+ path: "{{ item }}"
+ state: absent
+ with_items: "{{ ifcfg_scripts.stdout_lines|default([]) }}"
+ when: ifcfg_scripts is defined
+
+ - name: Remove /var/lib/ceph mountpoint from fstab
+ shell: sed -i '/\/var\/lib\/ceph/d' /etc/fstab
+
+ - name: Unmount /var/lib/ceph
+ ansible.posix.mount:
+ path: /var/lib/ceph
+ state: unmounted
+
+ - name: Get list of SSH host keys
+ shell: "ls -1 /etc/ssh/ssh_host_*"
+ register: ssh_host_keys
+ ignore_errors: true
+
+ # Key regeneration is done automatically on CentOS firstboot.
+ # For Ubuntu, we'll add `dpkg-reconfigure openssh-server` to rc.local
+ - name: Delete SSH host keys so they're generated during firstboot on cloned machines
+ file:
+ path: "{{ item }}"
+ state: absent
+ with_items: "{{ ssh_host_keys.stdout_lines|default([]) }}"
+ when: ssh_host_keys is defined
+
+ - name: Unsubscribe RHEL
+ command: subscription-manager unregister
+ when: ansible_distribution == "RedHat"
+ failed_when: false
+
+ # A file gets leftover when a testnode is registered with Satellite that caused
+ # each registered subsequent testnode to report the wrong hostname
+ - name: Clean up katello facts
+ file:
+ path: /etc/rhsm/facts/katello.facts
+ state: absent
+ when: ansible_distribution == "RedHat"
+
+ # https://bugzilla.redhat.com/show_bug.cgi?id=1814337
+ - name: Disable dnf-makecache service
+ service:
+ name: dnf-makecache.timer
+ state: stopped
+ enabled: no
+ when:
+ - ansible_os_family == "RedHat"
+ - ansible_distribution_major_version|int >= 8
+
+ # Hopefully fixes https://github.com/ceph/ceph-cm-ansible/pull/544#issuecomment-599076564
+ - name: Clean DNF cache
+ shell: "dnf clean all && rm -rf /var/cache/dnf/*"
+ when:
+ - ansible_os_family == "RedHat"
+ - ansible_distribution_major_version|int >= 8
+
+ - set_fact:
+ ntp_service: ntp
+ when: ansible_os_family == "Debian"
+
+ - set_fact:
+ ntp_service: ntpd
+ when: ansible_os_family == "RedHat" and ansible_distribution_major_version|int <= 7
+
+ - set_fact:
+ ntp_service: chronyd
+ when: (ansible_os_family == "RedHat" and ansible_distribution_major_version|int >= 8) or
+ ansible_os_family == "Suse"
+
+ - name: "Stop {{ ntp_service }} service"
+ service:
+ name: "{{ ntp_service }}"
+ state: stopped
+ when: '"ntp" in ntp_service'
+
+ # The theory here is although we do have the ntp service running on boot,
+ # if the time is off, it slowly drifts back in sync. Since our testnodes
+ # are ephemeral, they don't ever have enough time to correctly drift
+ # back to the correct time. So we'll force it in the captured OS images.
+ - name: Force time synchronization using stepping | ntp
+ command: "ntpdate -b {{ ntp_servers|join(' ') }}"
+ when: '"ntp" in ntp_service'
+
+ - name: "Start {{ ntp_service }}"
+ service:
+ name: "{{ ntp_service }}"
+ state: started
+
+ # chronyd needs to be started in order to force time sync. This differs from ntpd.
+ - name: Force time synchronization using stepping | chrony
+ command: chronyc -a makestep
+ when: '"chrony" in ntp_service'
+
+ - name: Sync the hardware clock
+ command: "hwclock --systohc"
--- /dev/null
+---
+# Render the FOG host-import CSV from inventory data on the control
+# node; output lands in /tmp/fog_hostfile.csv.
+- name: Generate FOG host CSV
+  template:
+    src: csv.j2
+    dest: /tmp/fog_hostfile.csv
+  delegate_to: localhost
--- /dev/null
+{% for host in groups['cobbler_managed'] %}
+{% if hostvars[host]['mac'] is defined %}
+"{{ hostvars[host]['mac'] }}","{{ hostvars[host]['inventory_hostname_short'] }}","","","1","0","","","fog","","","","","","","","","{{ hostvars[host]['kernel_options'] }}","","{{ hostvars[host]['fog_install_drive']|default('/dev/sda') }}","","","","","0000-00-00 00:00:00","110","","",""
+{% endif %}
+{% endfor %}
--- /dev/null
+---
+### This standalone playbook can be used to (re)configure BMC network settings.
+### Override vars at the top of file if needed. This has only been tested on
+### Supermicro BMCs but could easily be adapted for other manufacturers.
+###
+### This playbook should allow you to configure a BMC whether you have
+### SSH access to the host or not
+
+- hosts:
+ - ipmi
+ become: true
+ gather_facts: false
+ vars:
+ # Set to true if setting up a bunch of BMCs for the first time
+ setup_user: false
+ initial_user: ADMIN
+ initial_pass: ADMIN
+ # On Supermicro BMCs, Anonymous is UID 1 and reserved. UID 2 is the default ADMIN:ADMIN
+ power_uid: 2
+ # Change this if the ipmi interface isn't found at channel 1
+ # (i.e., if `ipmitool lan print 1` returns 'Invalid channel: 1')
+ ipmi_channel_id: 1
+ use_dhcp: false
+ # "off" will disable setting a VLAN ID. Octo needs VLAN 101 set.
+ vlan_id: "off"
+ # Define these for static settings. These defaults are for Sepia.
+ static_netmask: 255.255.240.0
+ static_gateway: 172.21.47.254
+ # Change to true if you want to force an 'mc reset cold' no matter what
+ force_mc_reset: false
+ tasks:
+
+ # Pull in IPMI creds from secrets repo.
+ # Override power_user and power_pass with --extra-vars if needed
+ - include_vars: ../roles/secrets/defaults/main.yml
+ - include_vars: "{{ secrets_path }}/ipmi.yml"
+
+ - name: Check if we have SSH access
+ shell: "timeout 3s ssh {{ inventory_hostname }} true"
+ register: have_ssh_access
+ delegate_to: localhost
+ failed_when: false
+ changed_when: false
+
+ # These first 4 tasks assume you don't have SSH access to the host yet. We'll try again via SSH later if these fail.
+ - name: Initial setup of username from localhost
+ shell: "ipmitool -I lanplus -U {{ initial_user }} -P {{ initial_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} user set name {{ power_uid }} {{ power_user }}"
+ register: set_username_locally
+ delegate_to: localhost
+ when:
+ - setup_user
+ - have_ssh_access.rc != 0
+ ignore_errors: true
+
+ - name: Initial setup of permissions from localhost
+ shell: "ipmitool -I lanplus -U {{ power_user }} -P {{ initial_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} channel setaccess {{ ipmi_channel_id }} {{ power_uid }} privilege=4"
+ register: set_permissions_locally
+ delegate_to: localhost
+ when:
+ - setup_user
+ - have_ssh_access.rc != 0
+ ignore_errors: true
+
+ - name: Initial setup of password from localhost
+ shell: "ipmitool -I lanplus -U {{ power_user }} -P {{ initial_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} user set password {{ power_uid }} {{ power_pass }}"
+ register: set_password_locally
+ delegate_to: localhost
+ when:
+ - setup_user
+ - have_ssh_access.rc != 0
+ ignore_errors: true
+
+ - name: Check if DHCP already enabled
+ shell: "ipmitool -I lanplus -U {{ power_user }} -P {{ power_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} lan print 1 | grep -q DHCP"
+ register: dhcp_already_enabled
+ delegate_to: localhost
+ when: use_dhcp
+ failed_when: dhcp_already_enabled.stderr != ''
+ changed_when: false
+
+ - name: Set BMC to use DHCP from localhost
+ shell: "ipmitool -I lanplus -U {{ power_user }} -P {{ power_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} lan set {{ ipmi_channel_id }} ipsrc dhcp"
+ register: set_to_dhcp_locally
+ delegate_to: localhost
+ when:
+ - use_dhcp
+ - (dhcp_already_enabled is defined and dhcp_already_enabled.rc != 0)
+ ignore_errors: true
+
+ - name: Install ipmitool
+ package:
+ name: ipmitool
+ state: latest
+ when: have_ssh_access.rc == 0
+
+ - name: Activate kernel modules
+ modprobe:
+ name: "{{ item }}"
+ state: present
+ with_items:
+ - ipmi_devintf
+ - ipmi_si
+ when: have_ssh_access.rc == 0
+ ignore_errors: true
+
+ - name: Initial setup of username
+ shell: "ipmitool user set name {{ power_uid }} {{ power_user }}"
+ when:
+ - setup_user
+ - (set_username_locally is defined and set_username_locally is failed)
+
+ - name: Initial setup of permissions
+ shell: "ipmitool channel setaccess {{ ipmi_channel_id }} {{ power_uid }} privilege=4"
+ when:
+ - setup_user
+ - (set_permissions_locally is defined and set_permissions_locally is failed)
+
+ - name: Initial setup of password
+ shell: "ipmitool user set password {{ power_uid }} {{ power_pass }}"
+ register: set_password_locally
+ when:
+ - setup_user
+ - (set_password_locally is defined and set_password_locally is failed)
+ ignore_errors: true
+
+ - name: Set BMC to use DHCP via SSH
+ # In-band retry: only when the out-of-band DHCP attempt above failed.
+ shell: "ipmitool lan set {{ ipmi_channel_id }} ipsrc dhcp"
+ register: set_to_dhcp_remotely
+ when:
+ - use_dhcp
+ - set_to_dhcp_locally is failed
+
+ - name: Check existing network settings via SSH
+ # Static-IP path: scrape the current source/IP/netmask/gateway/VLAN
+ # values (one per line) so they can be compared with the desired
+ # settings below.  Read-only, hence changed_when: false.
+ shell: "ipmitool lan print {{ ipmi_channel_id }} | grep 'IP Address Source\\|IP Address\\|Subnet Mask\\|Default Gateway IP\\|VLAN ID' | cut -d ':' -f2 | sed 's/^ //g'"
+ register: existing_network_settings
+ changed_when: false
+ when:
+ - not use_dhcp
+ - have_ssh_access.rc == 0
+
+ - name: Check existing network settings via localhost
+ # Same scrape, out-of-band, for hosts we cannot SSH into.  Exactly one
+ # of these two tasks runs, so existing_network_settings is single-valued.
+ shell: "ipmitool -I lanplus -U {{ power_user }} -P {{ power_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} lan print {{ ipmi_channel_id }} | grep 'IP Address Source\\|IP Address\\|Subnet Mask\\|Default Gateway IP\\|VLAN ID' | cut -d ':' -f2 | sed 's/^ //g'"
+ register: existing_network_settings
+ delegate_to: localhost
+ changed_when: false
+ when:
+ - not use_dhcp
+ - have_ssh_access.rc != 0
+
+ # Split the scraped output into a list, one setting per element.
+ - set_fact:
+ existing_network_settings_list: "{{ existing_network_settings.stdout.split('\n') }}"
+ when:
+ - not use_dhcp
+
+ # Desired settings; the BMC reports VLAN as "Disabled" when vlan_id is
+ # "off", so the expected list differs only in its last element.
+ - set_fact:
+ desired_network_settings_list: "[ 'Static Address', '{{ hostvars[inventory_hostname].ipmi }}', '{{ static_netmask }}', '{{ static_gateway }}', 'Disabled' ]"
+ when:
+ - not use_dhcp
+ - vlan_id == "off"
+
+ - set_fact:
+ desired_network_settings_list: "[ 'Static Address', '{{ hostvars[inventory_hostname].ipmi }}', '{{ static_netmask }}', '{{ static_gateway }}', '{{ vlan_id }}' ]"
+ when:
+ - not use_dhcp
+ - vlan_id != "off"
+
+ # Compare sorted so element order cannot force a spurious "change".
+ - set_fact:
+ network_settings_change_required: "{{ existing_network_settings_list|sort != desired_network_settings_list|sort }}"
+ when:
+ - not use_dhcp
+ - desired_network_settings_list is defined
+
+ - name: Set BMC to use static IP via SSH
+ # Apply each static setting in-band.  Any stderr output from ipmitool
+ # is treated as failure, and errors are tolerated so the out-of-band
+ # variant below can still be attempted.
+ shell: "{{ item }}"
+ with_items:
+ - "ipmitool lan set {{ ipmi_channel_id }} ipsrc static"
+ - "ipmitool lan set {{ ipmi_channel_id }} ipaddr {{ hostvars[inventory_hostname].ipmi }}"
+ - "ipmitool lan set {{ ipmi_channel_id }} netmask {{ static_netmask }}"
+ - "ipmitool lan set {{ ipmi_channel_id }} defgw ipaddr {{ static_gateway }}"
+ - "ipmitool lan set {{ ipmi_channel_id }} vlan id {{ vlan_id }}"
+ register: set_to_static
+ when:
+ - not use_dhcp
+ - network_settings_change_required
+ - have_ssh_access.rc == 0
+ failed_when: "set_to_static.stderr != ''"
+ ignore_errors: true
+
+ - name: Set BMC to use static IP via localhost
+ # Out-of-band variant for hosts without SSH access.  The VLAN step is
+ # deliberately commented out (upstream ipmitool bug, link below).
+ shell: "{{ item }}"
+ with_items:
+ - "ipmitool -I lanplus -U {{ power_user }} -P {{ power_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} lan set {{ ipmi_channel_id }} ipsrc static"
+ - "ipmitool -I lanplus -U {{ power_user }} -P {{ power_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} lan set {{ ipmi_channel_id }} ipaddr {{ hostvars[inventory_hostname].ipmi }}"
+ - "ipmitool -I lanplus -U {{ power_user }} -P {{ power_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} lan set {{ ipmi_channel_id }} netmask {{ static_netmask }}"
+ - "ipmitool -I lanplus -U {{ power_user }} -P {{ power_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} lan set {{ ipmi_channel_id }} defgw ipaddr {{ static_gateway }}"
+ # https://sourceforge.net/p/ipmitool/bugs/456/
+ #- "ipmitool -I lanplus -U {{ power_user }} -P {{ power_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} lan set {{ ipmi_channel_id }} vlan id {{ vlan_id }}"
+ register: set_to_static
+ delegate_to: localhost
+ when:
+ - not use_dhcp
+ - network_settings_change_required
+ - have_ssh_access.rc != 0
+ failed_when: "set_to_static.stderr != ''"
+ ignore_errors: true
+
+ - name: Reset BMC
+ # Cold-reset the controller when anything above changed the network
+ # configuration (or when a reset is explicitly forced) so the new
+ # settings take effect.
+ shell: "ipmitool -I lanplus -U {{ power_user }} -P {{ power_pass }} -H {{ inventory_hostname_short }}.{{ ipmi_domain }} mc reset cold"
+ delegate_to: localhost
+ when: force_mc_reset or
+ (set_to_dhcp_locally is defined and set_to_dhcp_locally is changed) or
+ (set_to_dhcp_remotely is defined and set_to_dhcp_remotely is changed) or
+ (network_settings_change_required is defined and network_settings_change_required and not set_to_static is failed)
--- /dev/null
+#!/bin/bash
+# Modifies dhcp config file to add or remove a next-server and filename
+# The fog next-server and filename are the default for all DHCP hosts so
+# entering 'cobbler' for $2 adds its next-server and filename.
+# Setting 'fog' for $2 just removes it so the host entry uses the global default.
+#
+# This script should live on your workstation somewhere executable.
+#
+# It also assumes you are using tools/switch-secrets to switch between
+# octo and sepia ansible inventories
+
+if [ $# -ne 2 ]; then
+ echo "Usage: $(basename $0) hostname [cobbler|fog]"
+ echo
+ echo "Example: \`$(basename $0) mira042 cobbler\` would add Cobbler's next-server and filename to mira042's DHCP entry"
+ echo
+ exit 1
+elif [ "$2" != "cobbler" ] && [ "$2" != "fog" ]; then
+ echo "Unrecognized option $2. Must use 'cobbler' or 'fog'"
+ exit 1
+else
+ # Strip any domain suffix; the server-side script expects a short hostname
+ host=$(echo "$1" | cut -d '.' -f1)
+fi
+
+# Pick the DHCP server matching the currently-selected inventory: the
+# /etc/ansible/hosts symlink target names the active lab.  Testing the
+# pipeline directly avoids the fragile "$?" check.
+if ls -lah /etc/ansible/hosts | grep -q octo; then
+ dhcp_server="magna001.ceph.redhat.com"
+else
+ dhcp_server="store01.front.sepia.ceph.com"
+fi
+
+set -x
+
+ssh "$dhcp_server" "sudo /usr/local/sbin/set-next-server.sh $host $2 && sudo service dhcpd restart"
--- /dev/null
+#!/bin/bash
+# Modifies dhcp config file to add or remove a next-server and filename
+# The fog next-server and filename are the default for all DHCP hosts so
+# entering 'cobbler' for $2 adds its next-server and filename.
+# Setting 'fog' for $2 just removes it so the host entry uses the global default.
+#
+# This script should live on the DHCP server somewhere executable
+#
+# NOTE: DHCP entries *must* be in the following format
+# (dhcp-server role write entries like this)
+#
+# host foo-front {
+# hardware ethernet aa:bb:cc:11:22:33;
+# fixed-address 1.2.3.4;
+# }
+
+if [ $# -ne 2 ]; then
+ echo "Usage: $(basename $0) hostname [cobbler|fog]"
+ echo
+ echo "Example: \`$(basename $0) mira042 cobbler\` would add Cobbler's next-server and filename to mira042's DHCP entry"
+ echo
+ exit 1
+elif [ "$2" != "cobbler" ] && [ "$2" != "fog" ]; then
+ echo "Unrecognized option $2. Must use 'cobbler' or 'fog'"
+ exit 1
+else
+ # Strip any domain suffix so we match the short hostname in the config
+ host=$(echo "$1" | cut -d '.' -f1)
+fi
+
+set -x
+
+dhcpconfig="/etc/dhcp/dhcpd.front.conf"
+timestamp=$(date +%s)
+cobblerip="172.21.0.11"
+cobblerfilename="/pxelinux.0"
+fogip="172.21.0.72"
+fogfilename="/undionly.kpxe"
+# Pull the MAC and IP out of the host's existing stanza
+macaddr=$(sed -n "/host ${host}-front/,/}/p" "$dhcpconfig" | grep 'hardware ethernet' | awk '{ print $3 }' | tr -d ';')
+ipaddr=$(sed -n "/host ${host}-front/,/}/p" "$dhcpconfig" | grep 'fixed-address' | awk '{ print $2 }' | tr -d ';')
+# Anchor the search to the stanza header and take only the first match:
+# a bare `grep -n $host` can hit multiple lines (comments, hostnames that
+# contain this one as a substring), producing a multi-line $linenum that
+# silently breaks the sed insert below.
+linenum=$(grep -n "host ${host}-front" "$dhcpconfig" | head -n 1 | cut -d ':' -f1)
+
+if [ -z "$macaddr" ]; then
+ echo "No MAC address found for $host"
+ exit 1
+elif [ -z "$ipaddr" ]; then
+ echo "No IP address found for $host"
+ exit 1
+elif [ -z "$linenum" ]; then
+ echo "Unable to determine line number for $host entry"
+ exit 1
+fi
+
+# Back up dhcp config
+cp "$dhcpconfig" "${dhcpconfig}_$timestamp.bak"
+
+# Delete the existing stanza; the rewritten one is inserted at the same
+# line number, so the file keeps its ordering.
+sed -i "/host ${host}-front {/,/}/d" "$dhcpconfig"
+
+if [ "$2" == "cobbler" ]; then
+ sed -i "${linenum} i \ host ${host}-front {\n\ hardware ethernet $macaddr;\n\ fixed-address $ipaddr;\n\ next-server $cobblerip;\n\ filename \"$cobblerfilename\";\n\ }" "$dhcpconfig"
+elif [ "$2" == "fog" ]; then
+ sed -i "${linenum} i \ host ${host}-front {\n\ hardware ethernet $macaddr;\n\ fixed-address $ipaddr;\n\ next-server $fogip;\n\ filename \"$fogfilename\";\n\ }" "$dhcpconfig"
+fi
+
+# Syntax-check the new config; restore the backup if it fails.
+if ! dhcpd -q -t -cf "$dhcpconfig"; then
+ mv "$dhcpconfig" "${dhcpconfig}_$timestamp.broken"
+ mv "${dhcpconfig}_$timestamp.bak" "$dhcpconfig"
+ echo "New config failed config test. Restored backup."
+ exit 1
+else
+ rm "${dhcpconfig}_$timestamp.bak"
+# service dhcpd restart
+fi
--- /dev/null
+#!/bin/bash
+# Switches your ansible inventory between ceph-sepia-secrets or ceph-octo-secrets
+
+val=$(ls -lah /etc/ansible/secrets | grep -c "octo")
+if [ $val -eq 1 ]; then
+ sudo rm /etc/ansible/secrets
+ sudo ln -s ~/git/ceph/ceph-sepia-secrets/ansible/secrets /etc/ansible/secrets
+ sudo rm /etc/ansible/hosts
+ sudo ln -s ~/git/ceph/ceph-sepia-secrets/ansible/inventory /etc/ansible/hosts
+ cat ~/.teuthology.yaml.sepia > ~/.teuthology.yaml
+elif [ $val -eq 0 ]; then
+ sudo rm /etc/ansible/secrets
+ sudo ln -s ~/git/ceph/ceph-octo-secrets/ansible/secrets /etc/ansible/secrets
+ sudo rm /etc/ansible/hosts
+ sudo ln -s ~/git/ceph/ceph-octo-secrets/ansible/inventory /etc/ansible/hosts
+ cat ~/.teuthology.yaml.octo > ~/.teuthology.yaml
+fi
--- /dev/null
+---
+# This playbook can be used to mass update NVMe card firmware.
+# The isdct RPM (no DEB unfortunately) can be obtained from Intel's website.
+# Download the zip, unpack, and push to drop.front.
+# https://downloadcenter.intel.com/product/87278/Intel-SSD-Data-Center-Tool
+
+- hosts:
+ - smithi
+ become: true
+ tasks:
+
+ - name: Install tool
+ yum:
+ name: http://drop.front.sepia.ceph.com/firmware/smithi/isdct-3.0.9.400-17.x86_64.rpm
+ # "present" is the supported spelling; "installed" is a legacy alias.
+ state: present
+ register: installed
+
+ - name: Update firmware
+ # Only flash when the RPM install actually changed something, so
+ # re-running the playbook does not re-flash every card.
+ command: "isdct load -f -intelssd 0"
+ when: installed is changed
--- /dev/null
+#!/usr/bin/env python
+
+import ConfigParser
+import docopt
+import multiprocessing
+import novaclient.client
+import os
+import subprocess
+import sys
+import tempfile
+import textwrap
+
+# Default locations for the cached VM list and the user config file;
+# both are ~-expanded before use.
+CACHEFILE = "~/.vmlist.cache"
+CONFFILE = "~/.vmlist.conf"
+
+
+# mira074.front.sepia.ceph.com
+# mira015.front.sepia.ceph.com
+
+# Built-in default hypervisor host list, one FQDN per line; can be
+# overridden via the vm_hosts key of the [global] config section.
+VM_HOSTS = textwrap.dedent('''\
+ senta01.front.sepia.ceph.com
+ senta02.front.sepia.ceph.com
+ senta03.front.sepia.ceph.com
+ senta04.front.sepia.ceph.com
+ mira001.front.sepia.ceph.com
+ mira003.front.sepia.ceph.com
+ mira004.front.sepia.ceph.com
+ mira005.front.sepia.ceph.com
+ mira006.front.sepia.ceph.com
+ mira007.front.sepia.ceph.com
+ mira008.front.sepia.ceph.com
+ mira009.front.sepia.ceph.com
+ mira010.front.sepia.ceph.com
+ mira011.front.sepia.ceph.com
+ mira013.front.sepia.ceph.com
+ mira014.front.sepia.ceph.com
+ mira017.front.sepia.ceph.com
+ mira018.front.sepia.ceph.com
+ mira020.front.sepia.ceph.com
+ mira024.front.sepia.ceph.com
+ mira029.front.sepia.ceph.com
+ mira036.front.sepia.ceph.com
+ mira043.front.sepia.ceph.com
+ mira044.front.sepia.ceph.com
+ mira079.front.sepia.ceph.com
+ mira081.front.sepia.ceph.com
+ mira098.front.sepia.ceph.com
+ irvingi01.front.sepia.ceph.com
+ irvingi02.front.sepia.ceph.com
+ irvingi03.front.sepia.ceph.com
+ irvingi04.front.sepia.ceph.com
+ irvingi05.front.sepia.ceph.com
+ irvingi06.front.sepia.ceph.com
+ irvingi07.front.sepia.ceph.com
+ irvingi08.front.sepia.ceph.com
+ hv01.front.sepia.ceph.com
+ hv02.front.sepia.ceph.com
+ hv03.front.sepia.ceph.com''')
+
+# novaclient API version passed to novaclient.client.Client().
+NOVACLIENT_VERSION = '2'
+
+
+# Fallback values for keys missing from the [global] config section.
+global_defaults = {
+ 'vm_hosts': VM_HOSTS,
+ 'cachefile': CACHEFILE,
+ 'novaclient_version': NOVACLIENT_VERSION,
+}
+
+class Cfg(object):
+
+ '''
+ Read INI-style config file; allow uppercase versions of
+ keys present in environment to override keys in the file
+ '''
+
+ def __init__(self, cfgfile):
+ self.cfgparser = ConfigParser.SafeConfigParser()
+ self.cfgparser.read(cfgfile)
+ self.cloud_providers = list()
+ self.cloud_providers = [s for s in self.cfgparser.sections()
+ if s.startswith('cloud')]
+
+ # set up global defaults
+ if not self.cfgparser.has_section('global'):
+ self.cfgparser.add_section('global')
+ for k, v in global_defaults.iteritems():
+ if not self.cfgparser.has_option('global', k):
+ self.cfgparser.set('global', k, v)
+
+ def get(self, section, key):
+ env_val = os.environ.get(key.upper())
+ if env_val:
+ return env_val
+ if self.cfgparser.has_option(section, key):
+ return self.cfgparser.get(section, key)
+ else:
+ return None
+
+
+# Module-level config singleton, read from ~/.vmlist.conf at import time.
+cfg = Cfg(os.path.expanduser(CONFFILE))
+
+
+def list_vms(host, outputfile=None):
+ """
+ Connect to host and collect lxc-ls and virsh list --all output
+ """
+ if not host:
+ return
+ lxc_output = []
+ if subprocess.call(['ssh', host, 'test', '-x', '/usr/bin/lxc-ls']) == 0:
+ lxc_output = subprocess.check_output(
+ ['ssh', host, 'sudo', 'lxc-ls']
+ ).strip().split('\n')
+ # avoid ['']; there must be a better way
+ lxc_output = [line for line in lxc_output if line]
+
+ virsh_output = subprocess.check_output(
+ ['ssh', host, 'sudo', 'virsh', '-r', 'list', '--all']
+ ).strip().split('\n')
+ virsh_output = [line.split()[1] for line in virsh_output[2:] if line]
+ virsh_output = [line for line in virsh_output if line]
+
+ if not outputfile:
+ outputfile = sys.stdout
+
+ shorthost = host.split('.')[0]
+ if lxc_output:
+ outputfile.writelines(['{} {} (lxc)\n'.format(shorthost, line)
+ for line in (lxc_output)])
+ if virsh_output:
+ outputfile.writelines(['{} {} (kvm)\n'.format(shorthost, line)
+ for line in (virsh_output)])
+ outputfile.flush()
+ if outputfile != sys.stdout:
+ outputfile.seek(0)
+
+
+def list_nova(provider, outputfile=None):
+ if outputfile is None:
+ outputfile = sys.stdout
+
+ cloud_regions = [None]
+ regions = cfg.get(provider, 'cloud_region_names')
+ if regions:
+ cloud_regions = [r.strip() for r in regions.split(',')]
+
+ for region in cloud_regions:
+ nova = novaclient.client.Client(
+ int(cfg.get('global', 'novaclient_version')),
+ cfg.get(provider, 'cloud_user'),
+ cfg.get(provider, 'cloud_password'),
+ project_id=cfg.get(provider, 'cloud_project_id'),
+ auth_url=cfg.get(provider, 'cloud_auth_url'),
+ region_name=region,
+ tenant_id=cfg.get(provider, 'cloud_tenant_id'),
+ )
+ output = [
+ '{} {} {}\n'.format(
+ provider,
+ getattr(s, s.NAME_ATTR).strip(),
+ '(%s)' % region if region else '',
+ ) for s in nova.servers.list()
+ ]
+ outputfile.writelines(output)
+ outputfile.flush()
+ if outputfile != sys.stdout:
+ outputfile.seek(0)
+
+
+# docopt usage/help text; it doubles as the CLI parser specification.
+# The default cache path is interpolated from the config at import time.
+usage = """
+Usage: vmlist [-r] [-h VM_HOST]
+
+List all KVM, LXC, and OpenStack vms known
+
+Options:
+ -r, --refresh refresh cached list (cache in {cachefile})
+ -h, --host MACHINE get list from only this host, and do not cache
+""".format(cachefile=cfg.get('global', 'cachefile'))
+
+
+def main():
+
+ args = docopt.docopt(usage)
+ cachefile = os.path.expanduser(cfg.get('global', 'cachefile'))
+
+ if args['--host']:
+ list_vms(args['--host'])
+ return 0
+
+ if args['--refresh']:
+
+ procs = []
+ outfiles = []
+ for host in cfg.get('global', 'vm_hosts').split('\n'):
+ outfile = tempfile.NamedTemporaryFile()
+ proc = multiprocessing.Process(
+ target=list_vms, args=(host, outfile)
+ )
+ procs.append(proc)
+ outfiles.append(outfile)
+ proc.start()
+
+ # all the nova providers
+ for provider in cfg.cloud_providers:
+ outfile = tempfile.NamedTemporaryFile()
+ proc = multiprocessing.Process(
+ target=list_nova,
+ args=(provider, outfile,),
+ )
+ procs.append(proc)
+ outfiles.append(outfile)
+ proc.start()
+
+ for proc in procs:
+ proc.join()
+
+ lines = []
+ for fil in outfiles:
+ lines.extend(fil.readlines())
+ lines = sorted(lines)
+
+ with open(os.path.expanduser(cachefile), 'w') as cache:
+ cache.write(''.join(lines))
+
+ # dump the cache
+ sys.stdout.write(open(os.path.expanduser(cachefile), 'r').read())
+
+
+if __name__ == '__main__':
+ sys.exit(main())
--- /dev/null
+---
+# Apply the "users" role (from this repo) to every host in the inventory.
+- hosts: all
+ roles:
+ - users
+ become: true
--- /dev/null
+---
+# Apply the "vmhost" role to the vps_hosts inventory group.
+- hosts: vps_hosts
+ roles:
+ - vmhost
+ become: true