When you create or update your PR, the Ceph project's `Continuous Integration
(CI) <https://en.wikipedia.org/wiki/Continuous_integration>`_ infrastructure
automatically tests it. At the time of this writing (September 2020), the
-automated CI testing included five tests:
+automated CI testing included five tests:
-#. a test to check that the commits are properly signed (see :ref:`submitting-patches`):
+#. a test to check that the commits are properly signed (see :ref:`submitting-patches`):
#. a test to check that the documentation builds
#. a test to check that the submodules are unmodified
#. a test to check that the API is in order
-#. a :ref:`make check<make-check>` test
-
+#. a :ref:`make check<make-check>` test
+
Additional tests may be performed depending on which files your PR modifies.
The :ref:`make check<make-check>` test builds the PR and runs it through a battery of
.. _`teuthology repository`: https://github.com/ceph/teuthology
.. _`teuthology framework`: https://github.com/ceph/teuthology
-The Ceph community has access to the `Sepia lab
-<https://wiki.sepia.ceph.com/doku.php>`_ where `Integration Testing` _ can be
-run on real hardware. Other developers may add tags like "needs-qa" to your
+The Ceph community has access to the `Sepia lab`_ where `Integration Testing`_
+can be run on physical hardware. Other developers may add tags like "needs-qa"
+to your
PR. This allows PRs that need testing to be merged into a single branch and
tested all at the same time. Since teuthology suites can take hours (even
days in some cases) to run, this can save a lot of time.
To request access to the Sepia lab, start `here <https://wiki.sepia.ceph.com/doku.php?id=vpnaccess>`_.
-Integration testing is discussed in more detail in the `Integration Testing` _
+Integration testing is discussed in more detail in the `Integration Testing`_
chapter.
Code review
client: add timer_lock support
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
-.. _Integration Testing: ./testing-integration-tests/tests-integration-testing-teuthology-intro.rst
+.. _Integration Testing: ../testing_integration_tests/tests-integration-testing-teuthology-intro
+.. _Sepia lab: https://wiki.sepia.ceph.com/doku.php
``target149202171058.teuthology``) are resolvable within the teuthology
cluster.
-.. _Integration Testing: ../testing-integration-tests/tests-integration-testing-teuthology-intro.rst
+.. _Integration Testing: ../testing_integration_tests/tests-integration-testing-teuthology-intro
.. _IRC: ../essentials/#irc
.. _Mailing List: ../essentials/#mailing-list
.. _teuthology framework: https://github.com/ceph/teuthology
.. rubric:: Contents
.. toctree::
- :maxdepth: 1
:glob:
+ :titlesonly:
Introduction <tests-integration-testing-teuthology-intro>
Workflow <tests-integration-testing-teuthology-workflow>
.. _tests-integration-testing-teuthology-debugging-tips:
-Analysing and Debugging A Teuthology Job
------------------------------------------
+Analyzing and Debugging A Teuthology Job
+========================================
-For scheduling an integration test please refer to, `Scheduling Test Run`_
-Here, we will be discussing how to analyse failed/dead jobs to root cause the problem and amend it.
-
-Triaging the cause of failure
-------------------------------
+To schedule an integration test, refer to `Scheduling Test Run`_.
Once a teuthology run is successfully completed, we can access the results using
-pulpito dashboard for example:
+the pulpito dashboard, at a URL of the form:
+
+http://pulpito.front.sepia.ceph.com/<job-name>/<job-id>/
+
+or by connecting to the teuthology server over SSH::
+
+ ssh <username>@teuthology.front.sepia.ceph.com
+
+and accessing `teuthology archives`_, for example::
+
+ nano /a/teuthology-2021-01-06_07:01:02-rados-master-distro-basic-smithi/
+
+.. note:: This requires Sepia lab access. To learn how to request it, see:
+ https://ceph.github.io/sepia/adding_users/
+
+On pulpito, jobs shown in red are either failed or dead.
+Here, a job is a combination of daemons and configurations that are formed from
+`qa/suites`_ yaml fragments.
+Taking these configurations, teuthology runs the tasks that are present in
+`qa/tasks`_, which are commands used for setting up the test environment and
+testing Ceph's components.
+These tasks help cover a large subset of use-case scenarios and so expose
+bugs that were not caught by `make check`_ testing.
+
+.. _make check: ../tests-integration-testing-teuthology-intro/#make-check
-http://pulpito.front.sepia.ceph.com/ideepika-2020-11-03_04:03:28-rados-wip-yuri-testing-2020-10-28-0947-octopus-distro-basic-smithi/ which might look something
+A job failure hence might be because of:
-This run has 2 job run failures. To triage, open the teuthology log for it using either:
+* environment setup (`testing on varied systems <https://github.com/ceph/ceph/tree/master/qa/distros/supported>`_):
+ testing compatibility with stable releases for supported versions.
-http://pulpito.front.sepia.ceph.com/<job-name>/<job-id>/teuthology.log
+* permutation of config values: for instance, qa/suites/rados/thrash tests
+ Ceph under a stressful workload, so that we are able to catch corner-case
+ bugs.
+ The final setup config yaml that would be used for testing can be accessed
+ at::
-or via sshing into teuthology server using::
+ /a/<job-name>/<job-id>/orig.config.yaml
- ssh teuthology.front.sepia.ceph.com
+More details about config.yaml can be found in `detailed test config`_.
+
+Triaging the cause of failure
+------------------------------
+
+To triage a job failure, open the teuthology log for it using either (from the
+pulpito page):
+
+http://qa-proxy.ceph.com/<job-name>/<job-id>/teuthology.log
and then opening log file with signature as:
for example in our case::
- nano /a/ideepika-2020-11-03_04:03:28-rados-wip-yuri-testing-2020-10-28-0947-octopus-distro-basic-smithi/5585704/teuthology.log
+ nano /a/teuthology-2021-01-06_07:01:02-rados-master-distro-basic-smithi/5759282/teuthology.log
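The locations mentioned in this section all derive from the job name and the
job ID. As a minimal sketch (the helper name ``job_locations`` is hypothetical
and not part of teuthology; the URL and path layouts are simply the patterns
quoted in this document), they could be assembled like this:

```python
def job_locations(job_name, job_id):
    """Return the pulpito URL, log URL and archive paths for one job.

    Hypothetical helper: the layouts are copied from the patterns shown in
    this document, not taken from any teuthology API.
    """
    archive_dir = f"/a/{job_name}/{job_id}"
    return {
        "pulpito": f"http://pulpito.front.sepia.ceph.com/{job_name}/{job_id}/",
        "log_url": f"http://qa-proxy.ceph.com/{job_name}/{job_id}/teuthology.log",
        "log_path": f"{archive_dir}/teuthology.log",
        "config_path": f"{archive_dir}/orig.config.yaml",
    }


if __name__ == "__main__":
    # Run name and job ID taken from the example used in this document.
    run = "teuthology-2021-01-06_07:01:02-rados-master-distro-basic-smithi"
    for name, location in job_locations(run, 5759282).items():
        print(f"{name}: {location}")
```

The ``/a/`` paths are only meaningful on the teuthology server, so this is
useful mainly for building the strings to paste into a browser or an SSH
session.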
Generally, a job failure is recorded in teuthology log as a Traceback which gets
-added to job summary. While analysing a job failure, we generally start looking
-for ``Traceback`` keyword and further see the call stack and logs that might had
-lead to failure Most of the time, traceback will also be including the failing
-command.
+added to job summary.
+While analyzing a job failure, we generally start by looking for the
+``Traceback`` keyword and then examine the call stack and logs that might have
+led to the failure. Most of the time, the traceback also includes the failing
+command.
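Searching for the ``Traceback`` keyword can be scripted. The following is a
rough sketch, not a teuthology tool: the function name ``traceback_lines`` and
the ``context`` parameter are assumptions made for illustration. It streams the
log line by line, since these logs can be very large, and yields each matching
line together with a few lines that follow it:

```python
import sys


def traceback_lines(lines, context=5):
    """Yield (line_number, line) for each 'Traceback' line and the
    `context` lines that follow it. Works on any iterable of lines."""
    remaining = 0
    for number, line in enumerate(lines, start=1):
        if "Traceback" in line:
            # Restart the window: this line plus `context` more.
            remaining = context + 1
        if remaining > 0:
            remaining -= 1
            yield number, line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_tracebacks.py /a/<job-name>/<job-id>/teuthology.log
    with open(sys.argv[1], errors="replace") as log:
        for number, line in traceback_lines(log):
            print(f"{number}: {line}", end="")
```

The fixed-size window is a simplification; a long call stack may need a larger
``context`` value to reach the failing command.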
.. note:: the teuthology logs are deleted every once in a while, if you are
unable to access example link, please feel free to refer any other case from
Once the cause of failure is triaged, and is something which might not be
related to the developer's code change, this indicates that it might be a
-generic failure for the upstream branch(in our case octopus), in which case, we
-look for related failure keywords on https://tracker.ceph.com/ If a similar
-issue has been reported via a tracker.ceph.com ticket, please add any relevant
-feedback to it. Otherwise, please create a new tracker ticket for it. If you are
-not familiar with the cause of failure, someone else will look at it.
+generic failure for the upstream branch (in our case octopus), in which case, we
+look for related failure keywords on https://tracker.ceph.com/.
+If a similar issue has been reported via a tracker.ceph.com ticket, please add
+any relevant feedback to it. Otherwise, please create a new tracker ticket for
+it. If you are not familiar with the cause of failure, someone else will look at
+it.
-Debugging An Issue
-------------------
+Debugging an issue using interactive-on-error
+---------------------------------------------
-If you want to work on a tracker issue, assign it to yourself, and try to
-reproduce that issue. For this purpose you can run a job similar to the failed
-job, using interactive-on-error mode in teuthology::
+To investigate an issue, the first step is to try to reproduce it. For this
+purpose you can run a job similar to the failed job, using
+`interactive-on-error`_ mode in teuthology::
ideepika@teuthology:~/teuthology$ ./virtualenv/bin/teuthology -v --lock --block $<your-config-yaml> --interactive-on-error
-More details on using teuthology command please read `detailed test config`_
+We can either write a `custom config.yaml`_ or use the one from the failed job.
+For the latter, copy ``orig.config.yaml`` to your local directory and change
+the `testing priority`_ accordingly, which would look like::
+
+ ideepika@teuthology:~/teuthology$ cp /a/teuthology-2021-01-06_07:01:02-rados-master-distro-basic-smithi/5759282/orig.config.yaml test.yaml
+ ideepika@teuthology:~/teuthology$ ./virtualenv/bin/teuthology -v --lock --block test.yaml --interactive-on-error
+
+
+Teuthology will then lock the machines required by the ``config.yaml``. When
+there is a job failure, teuthology halts at an interactive python session which
+lets us investigate the ctx values and the targets by sshing into them. Once we
+have investigated the system, we can manually terminate the session and let
+teuthology clean up.
+
+Suggested Resources
+--------------------
+
+ * `Testing Ceph: Pains & Pleasures <https://www.youtube.com/watch?v=gj1OXrKdSrs>`_
-.. _Scheduling Test Run: ../tests-integration-testing-teuthology-workflow.rst/#scheduling-test-run
-.. _detailed test config: https://github.com/ceph/teuthology/blob/master/docs/detailed_test_config.rst
+.. _Scheduling Test Run: ../tests-integration-testing-teuthology-workflow/#scheduling-test-run
+.. _detailed test config: https://docs.ceph.com/projects/teuthology/en/latest/detailed_test_config.html
+.. _teuthology archives: ../tests-integration-testing-teuthology-workflow/#teuthology-archives
+.. _qa/suites: https://github.com/ceph/ceph/tree/master/qa/suites
+.. _qa/tasks: https://github.com/ceph/ceph/tree/master/qa/tasks
+.. _interactive-on-error: https://docs.ceph.com/projects/teuthology/en/latest/detailed_test_config.html#troubleshooting
+.. _custom config.yaml: https://docs.ceph.com/projects/teuthology/en/latest/detailed_test_config.html#test-configuration
+.. _testing priority: ../tests-integration-testing-teuthology-intro/#testing-priority
Testing - Integration Tests - Introduction
==========================================
-Ceph has two types of tests: :ref:`make check <make-check>` tests and integration tests.
-When a test requires multiple machines, root access or lasts for a
-longer time (for example, to simulate a realistic Ceph deployment), it
-is deemed to be an integration test. Integration tests are organized into
-"suites", which are defined in the `ceph/qa sub-directory`_ and run with
-the ``teuthology-suite`` command.
+Ceph has two types of tests: :ref:`make check <make-check>` tests and
+integration tests. When a test requires multiple machines, root access or lasts
+for a longer time (for example, to simulate a realistic Ceph deployment), it is
+deemed to be an integration test. Integration tests are organized into "suites",
+which are defined in the `ceph/qa sub-directory`_ and run with the
+``teuthology-suite`` command.
The ``teuthology-suite`` command is part of the `teuthology framework`_.
In the sections that follow we attempt to provide a detailed introduction
installed on any machine running those platforms.
Teuthology has a `list of platforms that it supports
-<https://github.com/ceph/ceph/tree/master/qa/distros/supported>`_ (as
-of September 2020 the list consisted of "RHEL/CentOS 8" and "Ubuntu 18.04"). It
-expects to be provided pre-built Ceph packages for these platforms.
-Teuthology deploys these platforms on machines (bare-metal or
-cloud-provisioned), installs the packages on them, and deploys Ceph
-clusters on them - all as called for by the test.
+<https://github.com/ceph/ceph/tree/master/qa/distros/supported>`_ (as of
+September 2020 the list consisted of "RHEL/CentOS 8" and "Ubuntu 18.04"). It
+expects to be provided pre-built Ceph packages for these platforms. Teuthology
+deploys these platforms on machines (bare-metal or cloud-provisioned), installs
+the packages on them, and deploys Ceph clusters on them - all as called for by
+the test.
The Nightlies
-------------
the same time zone and from their perspective the tests were run overnight.
The results of the nightlies are published at http://pulpito.ceph.com/. The
-developer nick shows in the
-test results URL and in the first column of the Pulpito dashboard. The
-results are also reported on the `ceph-qa mailing list
+developer nick shows in the test results URL and in the first column of the
+Pulpito dashboard. The results are also reported on the `ceph-qa mailing list
<https://ceph.com/irc/>`_ for analysis.
Testing Priority
* **200 <= Priority < 1000:** Use this priority for large test runs that can
be done over the course of a week.
-In case you don't know how many jobs would be triggered by
-``teuthology-suite`` command, use ``--dry-run`` to get a count first and then
-issue ``teuthology-suite`` command again, this time without ``--dry-run`` and
-with ``-p`` and an appropriate number as an argument to it.
+In case you don't know how many jobs would be triggered by ``teuthology-suite``
+command, use ``--dry-run`` to get a count first and then issue
+``teuthology-suite`` command again, this time without ``--dry-run`` and with
+``-p`` and an appropriate number as an argument to it.
To skip the priority check, use ``--force-priority``. In order to be sensitive
to the runs of other developers who also need to do testing, please use it in
Suites Inventory
----------------
-The ``suites`` directory of the `ceph/qa sub-directory`_ contains
-all the integration tests, for all the Ceph components.
+The ``suites`` directory of the `ceph/qa sub-directory`_ contains all the
+integration tests, for all the Ceph components.
`ceph-deploy <https://github.com/ceph/ceph/tree/master/qa/suites/ceph-deploy>`_
- install a Ceph cluster with ``ceph-deploy`` (:ref:`ceph-deploy man page <ceph-deploy>`)
+ install a Ceph cluster with ``ceph-deploy`` (`ceph-deploy man page`_)
`dummy <https://github.com/ceph/ceph/tree/master/qa/suites/dummy>`_
get a machine, do nothing and return success (commonly used to
- verify the `Integration Testing` _ infrastructure works as expected)
+ verify the integration testing infrastructure works as expected)
`fs <https://github.com/ceph/ceph/tree/master/qa/suites/fs>`_
test CephFS mounted using FUSE
for various versions of Ceph, verify that upgrades can happen
without disrupting an ongoing workload
-.. _`ceph-deploy man page`: ../../man/8/ceph-deploy
+`ceph-deploy man page`_
teuthology-describe-tests
-------------------------
The upshot is that tests can be documented by embedding ``meta:``
annotations in the yaml files used to define the tests. The results can be
-seen in the `ceph-qa-suite wiki
-<http://tracker.ceph.com/projects/ceph-qa-suite/wiki/>`_.
+seen in the `teuthology-desribe usecases`_.
Since this is a new feature, many yaml files have yet to be annotated.
Developers are encouraged to improve the documentation, in terms of both
coverage and quality.
-Please also see, `teuthology-desribe usecases`_
-
How integration tests are run
-----------------------------
to the `Sepia lab`_, you may rightly ask how you can run the integration
tests in your own environment.
-One option is to set up a teuthology cluster on bare metal. Though this is
-a non-trivial task, it `is` possible. Here are `some notes
-<http://docs.ceph.com/teuthology/docs/LAB_SETUP.html>`_ to get you started
-if you decide to go this route.
+One option is to set up a teuthology cluster on bare metal. Though this is a
+non-trivial task, it `is` possible. Here are `some notes
+<https://docs.ceph.com/projects/teuthology/en/latest/LAB_SETUP.html>`_ to get
+you started if you decide to go this route.
If you have access to an OpenStack tenant, you have another option: the
`teuthology framework`_ has an OpenStack backend, which is documented `here
-<https://github.com/dachary/teuthology/tree/openstack#openstack-backend>`__.
+<https://docs.ceph.com/projects/teuthology/en/latest/openstack_backend.html>`__.
This OpenStack backend can build packages from a given git commit or
branch, provision VMs, install the packages and run integration tests
on those VMs. This process is controlled using a tool called
teuthology-suite --help
-.. _teuthology-suite: http://docs.ceph.com/teuthology/docs/teuthology.suite.html
+.. _teuthology-suite: https://docs.ceph.com/projects/teuthology/en/latest/commands/teuthology-suite.html
How integration tests are defined
---------------------------------
.. _ceph/qa sub-directory: https://github.com/ceph/ceph/tree/master/qa
.. _Sepia Lab: https://wiki.sepia.ceph.com/doku.php
-.. _Integration Testing: ../testing_integration_tests/tests-integration-testing-teuthology-intro.rst
.. _teuthology repository: https://github.com/ceph/teuthology
.. _teuthology framework: https://github.com/ceph/teuthology
.. _teuthology-desribe usecases: https://gist.github.com/jdurgin/09711d5923b583f60afc
-
+.. _ceph-deploy man page: ../../../../man/8/ceph-deploy
built for your branch. Follow these steps to initiate the build process -
#. Push the branch to `ceph-ci`_ repository. This triggers the process of
- building the binaries.
+ building the binaries on Jenkins CI.
#. To confirm that the build process has been initiated, spot the branch name
at `Shaman`_. Little after the build process has been initiated, the single
entry with your branch name would multiply, each new entry for a different
combination of distro and flavour.
-#. Wait until the packages are built and uploaded, and the repository offering
- them are created. This is marked by colouring the entries for the branch
- name green. Preferably, wait until each entry is coloured green. Usually,
- it takes around 2-3 hours depending on the availability of the machines.
+#. Wait until the packages are built and uploaded to `Chacra`_, and the
+ repositories offering them are created. This is marked by colouring the entries
+ for the branch name green. Preferably, wait until each entry is coloured
+ green. Usually, it takes around 2-3 hours depending on the availability of
+ the machines.
.. note:: Branch to be pushed on ceph-ci can be any branch, it shouldn't
necessarily be a PR branch.
ssh <username>@teuthology.front.sepia.ceph.com
- This would require Sepia lab access. To know how to request it, see: https://ceph.github.io/sepia/adding_users/
+ This would require Sepia lab access. To know how to request it, see:
+ https://ceph.github.io/sepia/adding_users/
#. Next, get teuthology installed. Run the first set of commands in
`Running Your First Test`_ for that. After that, activate the virtual
-R fail
Following are the options used in above command with their meanings -
- -v verbose
- -m machine name
- -c branch name, the branch that was pushed on ceph-ci
- -s test-suite name
- -p higher the number, lower the priority of the job
- --filter filter tests in given suite that needs to run, the arg to
- filter should be the test you want to run
- -e <email> When tests finish or time out, send an email
- here. May also be specified in ~/.teuthology.yaml
- as 'results_email'
- -R A comma-separated list of statuses to be used
- with --rerun. Supported statuses are: 'dead',
- 'fail', 'pass', 'queued', 'running', 'waiting'
- [default: fail,dead]
+ -v verbose
+ -m machine name
+ -c branch name, the branch that was pushed on ceph-ci
+ -s test-suite name
+ -p higher the number, lower the priority of the job
+ --filter filter tests in given suite that needs to run, the arg to
+ filter should be the test you want to run
+ -e <email> When tests finish or time out, send an email
+ here. May also be specified in ~/.teuthology.yaml
+ as 'results_email'
+ -R A comma-separated list of statuses to be used
+ with --rerun. Supported statuses are: 'dead',
+ 'fail', 'pass', 'queued', 'running', 'waiting'
+ [default: fail,dead]
#. Wait for the tests to run. ``teuthology-suite`` prints a link to the
`Pulpito`_ page created for the tests triggered.
.. note:: Don't skip passing a priority number, the default value is 1000
which is way too high; the job probably might never run.
-#. Wait for the tests to run. ``teuthology-suite`` prints a link to the
- `Pulpito`_ page created for the tests triggered.
-
Other frequently used/useful options are ``-d`` (or ``--distro``),
``--distroversion``, ``--filter-out``, ``--timeout``, ``flavor``, ``-rerun``,
``-l`` (for limiting number of jobs) , ``-n`` (for how many times job would
While writing a PR you might need to test your PR repeatedly using teuthology.
If you are making non-QA changes, you need to follow the standard process of
triggering builds, waiting for it to finish and then triggering tests and
-wait for the result. But if changes you made are purely changes in qa/,
-you don't need rebuild the binaries. Instead you can test binaries built for
-the ceph-ci branch and instruct ``teuthology-suite`` command to use a separate
-branch for running tests.
+wait for the result.
+But if the changes you made are purely in qa/, you don't need to rebuild the
+binaries. Instead you can test binaries built for the ceph-ci branch and
+instruct the ``teuthology-suite`` command to use a separate branch for running
+tests.
The separate branch can be passed to the command by using ``--suite-repo`` and
``--suite-branch``. Pass the link to the GitHub fork where your PR branch exists
to the first option and pass the PR branch name to the second option.
Once the teuthology job is scheduled, the status/results for test run could
be checked from https://pulpito.ceph.com/.
-It could be used for quickly checking out job logs... their status etc.
+It could be used for quickly checking out job logs, their status, etc.
Teuthology Archives
*******************
Once the tests have finished running, the log for the job can be obtained by
clicking on job ID at the Pulpito page for your tests. It's more convenient to
download the log and then view it rather than viewing it in an internet browser
-since these logs can easily be up to size of 1 GB. It is easier to
+since these logs can easily be up to size of 1 GB. It is easier to
ssh into the teuthology machine again (``teuthology.front.sepia.ceph.com``), and
access the following path::
much.
.. note:: To access archives more conveniently, ``/a/`` has been symbolically
- linked to ``/ceph/teuthology-archive/``. For instance, to access the previous
- example, we can use something like::
+ linked to ``/ceph/teuthology-archive/``. For instance, to access the previous
+ example, we can use something like::
/a/teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi/4588482/teuthology.log
Re-running Tests
----------------
-You can pass --rerun option, with test ID as an argument to it, to
-teuthology-suite command. Generally, this is useful in cases where teuthology test
+You can pass the ``--rerun`` option, with the test ID as its argument, to the
+``teuthology-suite`` command. Generally, this is useful in cases where a teuthology test
batch has some failed/dead jobs that we might want to retrigger. We can trigger
jobs based on their status using::
-R fail,dead,queued,running \
-e $CEPH_QA_MAIL
-The meaning of rest the of the options is already covered in `Triggering Tests`_
+The meaning of the rest of the options is already covered in the `Triggering Tests`_
section.
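As an illustration only, a rerun command line can be assembled
programmatically. The helper name ``rerun_command`` is hypothetical; it uses
only the ``--rerun``, ``-R`` and ``-e`` options described in this document, and
a real invocation may need further options that are not shown here:

```python
import shlex


def rerun_command(run_name, statuses=("fail", "dead"), email=None):
    """Build an argument list for ``teuthology-suite --rerun``.

    Hypothetical helper. `statuses` defaults to fail,dead, which this
    document gives as the default for -R.
    """
    argv = ["teuthology-suite", "--rerun", run_name, "-R", ",".join(statuses)]
    if email:
        # -e: address that receives mail when tests finish or time out.
        argv += ["-e", email]
    return argv


if __name__ == "__main__":
    # "<run-name>" is a placeholder for the run to be retriggered.
    print(shlex.join(rerun_command("<run-name>", email="qa@example.com")))
```

Returning a list rather than a string keeps the arguments safe to pass to
``subprocess.run`` without shell quoting problems.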
Naming the ceph-ci branch
https://github.com/ceph/ceph-ci/branches.
.. _ceph-ci: https://github.com/ceph/ceph-ci
+.. _Chacra: https://github.com/ceph/chacra/blob/master/README.rst
.. _Pulpito: http://pulpito.front.sepia.ceph.com/
-.. _Running Your First Test: ../running-tests-locally/#running-your-first-test
+.. _Running Your First Test: ../../running-tests-locally/#running-your-first-test
.. _Shaman: https://shaman.ceph.com/builds/ceph/
-.. _Suites Inventory: ../tests-integration-testing-teuthology-intro.rst/#suites-inventory
-.. _Testing Priority: ../tests-integration-testing-teuthology-intro.rst/#testing-priority
-.. _Triggering Tests: ../tests-integration-testing-teuthology-workflow.rst/#triggering-tests
+.. _Suites Inventory: ../tests-integration-testing-teuthology-intro/#suites-inventory
+.. _Testing Priority: ../tests-integration-testing-teuthology-intro/#testing-priority
+.. _Triggering Tests: ../tests-integration-testing-teuthology-workflow/#triggering-tests
+.. _tests-sentry-developers-guide: ../tests-sentry-developers-guide/