From c77d0817d55e3baace11eab353c4429d21d0bf50 Mon Sep 17 00:00:00 2001 From: Deepika Upadhyay Date: Wed, 4 Nov 2020 19:51:27 +0530 Subject: [PATCH] doc/dev/developer_guide: rearrange and improve docs * move running-tests-using-teuth.rst to doc/dev/developer_guide/tests-integration-testing-teuthology-workflow.rst * introduce developer's guide for Sentry and improve teuthology docs * add teuthology debugging guide * create testing_integration_tests subfolder for teuthology Signed-off-by: Deepika Upadhyay --- doc/dev/developer_guide/basic-workflow.rst | 7 +- doc/dev/developer_guide/index.rst | 7 +- .../running-tests-in-cloud.rst | 8 +- .../testing_integration_tests/index.rst | 15 ++ ...tion-testing-teuthology-debugging-tips.rst | 66 +++++++ ...-integration-testing-teuthology-intro.rst} | 19 +- ...tegration-testing-teuthology-workflow.rst} | 164 ++++++++++++------ .../tests-sentry-developers-guide.rst | 6 + 8 files changed, 222 insertions(+), 70 deletions(-) create mode 100644 doc/dev/developer_guide/testing_integration_tests/index.rst create mode 100644 doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst rename doc/dev/developer_guide/{tests-integration-tests.rst => testing_integration_tests/tests-integration-testing-teuthology-intro.rst} (97%) rename doc/dev/developer_guide/{running-tests-using-teuth.rst => testing_integration_tests/tests-integration-testing-teuthology-workflow.rst} (61%) create mode 100644 doc/dev/developer_guide/testing_integration_tests/tests-sentry-developers-guide.rst diff --git a/doc/dev/developer_guide/basic-workflow.rst b/doc/dev/developer_guide/basic-workflow.rst index 1a387a94e18..c2a3201d6b9 100644 --- a/doc/dev/developer_guide/basic-workflow.rst +++ b/doc/dev/developer_guide/basic-workflow.rst @@ -282,15 +282,15 @@ sub-directory`_ and are run via the `teuthology framework`_. .. 
_`teuthology framework`: https://github.com/ceph/teuthology The Ceph community has access to the `Sepia lab -`_ where :ref:`testing-integration-tests` can be -run on physical hardware. Other developers may add tags like "needs-qa" to your +`_ where `Integration Testing`_ can be +run on real hardware. Other developers may add tags like "needs-qa" to your PR. This allows PRs that need testing to be merged into a single branch and tested all at the same time. Since teuthology suites can take hours (even days in some cases) to run, this can save a lot of time. To request access to the Sepia lab, start `here `_. -Integration testing is discussed in more detail in the :ref:`testing-integration-tests` +Integration testing is discussed in more detail in the `Integration Testing`_ chapter. Code review @@ -387,3 +387,4 @@ the **ptl-tool** have the following form:: client: add timer_lock support Reviewed-by: Patrick Donnelly +.. _Integration Testing: ./testing-integration-tests/tests-integration-testing-teuthology-intro.rst diff --git a/doc/dev/developer_guide/index.rst b/doc/dev/developer_guide/index.rst index c33f0a575a1..c8a8600227c 100644 --- a/doc/dev/developer_guide/index.rst +++ b/doc/dev/developer_guide/index.rst @@ -17,8 +17,7 @@ Contributing to Ceph: A Guide for Developers Issue tracker Basic workflow Tests: Unit Tests - Tests: Integration Tests - Running Tests Locally - Running Integration Tests using Teuthology - Running Tests in the Cloud + Tests: Integration Tests + Tests: Running Tests (Locally) + Tests: Running Tests in the Cloud Ceph Dashboard Developer Documentation (formerly HACKING.rst) diff --git a/doc/dev/developer_guide/running-tests-in-cloud.rst b/doc/dev/developer_guide/running-tests-in-cloud.rst index 60118aefdb8..17885c4d664 100644 --- a/doc/dev/developer_guide/running-tests-in-cloud.rst +++ b/doc/dev/developer_guide/running-tests-in-cloud.rst @@ -2,7 +2,7 @@ Running Tests in the Cloud ========================== In this chapter, we will explain in 
detail how use an OpenStack -tenant as an environment for Ceph `integration testing`_. +tenant as an environment for Ceph `Integration Testing`_. Assumptions and caveat ---------------------- @@ -124,8 +124,7 @@ uploaded to http://teuthology-logs.public.ceph.com. Run a standalone test --------------------- -The standalone test explained in `Reading a standalone test`_ can be run -with the following command +The standalone test can be run with the following command .. prompt:: bash $ @@ -282,8 +281,7 @@ server list`` on the teuthology machine, but the target VM hostnames (e.g. ``target149202171058.teuthology``) are resolvable within the teuthology cluster. -.. _Integration testing: ../tests-integration-tests +.. _Integration Testing: ../testing-integration-tests/tests-integration-testing-teuthology-intro.rst .. _IRC: ../essentials/#irc .. _Mailing List: ../essentials/#mailing-list -.. _Reading A Standalone Test: ../testing-integration-tests/#reading-a-standalone-test .. _teuthology framework: https://github.com/ceph/teuthology diff --git a/doc/dev/developer_guide/testing_integration_tests/index.rst b/doc/dev/developer_guide/testing_integration_tests/index.rst new file mode 100644 index 00000000000..8cbe3855470 --- /dev/null +++ b/doc/dev/developer_guide/testing_integration_tests/index.rst @@ -0,0 +1,15 @@ +======================= +Teuthology User Guide +======================= + +.. rubric:: Contents + +.. toctree:: + :maxdepth: 1 + :glob: + + Introduction + Workflow + Debugging Tips + Sentry Notes + diff --git a/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst new file mode 100644 index 00000000000..84d7e06a1fe --- /dev/null +++ b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-debugging-tips.rst @@ -0,0 +1,66 @@ +.. 
_tests-integration-testing-teuthology-debugging-tips: + +Analysing and Debugging A Teuthology Job +----------------------------------------- + +To schedule an integration test, please refer to `Scheduling Test Run`_. +Here, we discuss how to analyse failed/dead jobs to find the root cause of the problem and fix it. + +Triaging the cause of failure +------------------------------ + +Once a teuthology run has completed, we can access the results using the +Pulpito dashboard, for example: + +http://pulpito.front.sepia.ceph.com/ideepika-2020-11-03_04:03:28-rados-wip-yuri-testing-2020-10-28-0947-octopus-distro-basic-smithi/ which might look something + +This run has 2 job run failures. To triage, open the teuthology log for it using either: + +http://pulpito.front.sepia.ceph.com///teuthology.log + +or by connecting to the teuthology server over ssh:: + + ssh teuthology.front.sepia.ceph.com + +and then opening the log file, whose path has the form:: + + /a///teuthology.log + +for example in our case:: + + nano /a/ideepika-2020-11-03_04:03:28-rados-wip-yuri-testing-2020-10-28-0947-octopus-distro-basic-smithi/5585704/teuthology.log + +Generally, a job failure is recorded in the teuthology log as a Traceback, which is also +added to the job summary. When analysing a job failure, we generally start by searching +for the ``Traceback`` keyword and then examine the call stack and logs that might have +led to the failure. Most of the time, the traceback also includes the failing +command. + +.. 
note:: The teuthology logs are deleted periodically. If you are + unable to access the example link, please refer to any other run from + http://pulpito.front.sepia.ceph.com/ + +Reporting the Issue +------------------- + +Once the cause of failure has been triaged and it appears to be unrelated to +the developer's code change, it may be a generic failure on the +upstream branch (in our case, octopus). In that case, +look for related failure keywords on https://tracker.ceph.com/. If a similar +issue has been reported via a tracker.ceph.com ticket, please add any relevant +feedback to it. Otherwise, please create a new tracker ticket for it. If you are +not familiar with the cause of failure, someone else will look at it. + +Debugging An Issue +------------------ + +If you want to work on a tracker issue, assign it to yourself, and try to +reproduce that issue. For this purpose you can run a job similar to the failed +job, using interactive-on-error mode in teuthology:: + + ideepika@teuthology:~/teuthology$ ./virtualenv/bin/teuthology -v --lock --block $ --interactive-on-error + +For more details on using the ``teuthology`` command, please read `detailed test config`_. + +.. _Scheduling Test Run: ../tests-integration-testing-teuthology-workflow.rst/#scheduling-test-run +.. 
_detailed test config: https://github.com/ceph/teuthology/blob/master/docs/detailed_test_config.rst diff --git a/doc/dev/developer_guide/tests-integration-tests.rst b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-intro.rst similarity index 97% rename from doc/dev/developer_guide/tests-integration-tests.rst rename to doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-intro.rst index 06590320402..dcbd8c79c52 100644 --- a/doc/dev/developer_guide/tests-integration-tests.rst +++ b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-intro.rst @@ -1,7 +1,7 @@ -.. _testing-integration-tests: +.. _tests-integration-testing-teuthology-intro: -Testing - Integration Tests -=========================== +Testing - Integration Tests - Introduction +========================================== Ceph has two types of tests: :ref:`make check ` tests and integration tests. When a test requires multiple machines, root access or lasts for a @@ -100,7 +100,7 @@ all the integration tests, for all the Ceph components. `dummy `_ get a machine, do nothing and return success (commonly used to - verify the :ref:`testing-integration-tests` infrastructure works as expected) + verify the `Integration Testing`_ infrastructure works as expected) `fs `_ test CephFS mounted using FUSE @@ -143,10 +143,8 @@ all the integration tests, for all the Ceph components. teuthology-describe-tests ------------------------- -In February 2016, a new feature called ``teuthology-describe-tests`` was -added to the `teuthology framework`_ to facilitate documentation and better -understanding of integration tests (`feature announcement -`_). +``teuthology-describe`` was added to the `teuthology framework`_ to facilitate +documentation and better understanding of integration tests. The upshot is that tests can be documented by embedding ``meta:`` annotations in the yaml files used to define the tests. 
The results can be @@ -157,6 +155,8 @@ Since this is a new feature, many yaml files have yet to be annotated. Developers are encouraged to improve the documentation, in terms of both coverage and quality. +Please also see the `teuthology-describe usecases`_. + How integration tests are run ----------------------------- @@ -524,5 +524,8 @@ test will be first. .. _ceph/qa sub-directory: https://github.com/ceph/ceph/tree/master/qa .. _Sepia Lab: https://wiki.sepia.ceph.com/doku.php +.. _Integration Testing: ../testing_integration_tests/tests-integration-testing-teuthology-intro.rst .. _teuthology repository: https://github.com/ceph/teuthology .. _teuthology framework: https://github.com/ceph/teuthology +.. _teuthology-describe usecases: https://gist.github.com/jdurgin/09711d5923b583f60afc + diff --git a/doc/dev/developer_guide/running-tests-using-teuth.rst b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-workflow.rst similarity index 61% rename from doc/dev/developer_guide/running-tests-using-teuth.rst rename to doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-workflow.rst index 492b7790e9e..9321210c395 100644 --- a/doc/dev/developer_guide/running-tests-using-teuth.rst +++ b/doc/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-workflow.rst @@ -1,8 +1,14 @@ -Running Integration Tests using Teuthology -========================================== +.. _tests-integration-testing-teuthology-workflow: + +Integration Tests using Teuthology Workflow +=========================================== + +Scheduling Test Run +------------------- Getting binaries ----------------- +**************** + To run integration tests using teuthology, you need to have Ceph binaries built for your branch. 
Follow these steps to initiate the build process - `Shaman`_ beforehand since it already might have builds ready for it. Triggering Tests ----------------- +**************** + After building is complete, proceed to trigger tests - #. Log in to the teuthology machine:: @@ -41,7 +48,14 @@ After building is complete, proceed to trigger tests - #. Run the ``teuthology-suite`` command:: - teuthology-suite -v -m smithi -c wip-devname-feature-x -s fs -p 110 --filter "cephfs-shell" + teuthology-suite -v \ + -m smithi \ + -c wip-devname-feature-x \ + -s fs \ + -p 110 \ + --filter "cephfs-shell" \ + -e foo@gmail.com \ + -R fail Following are the options used in above command with their meanings - -v verbose @@ -51,13 +65,23 @@ After building is complete, proceed to trigger tests - -p higher the number, lower the priority of the job --filter filter tests in given suite that needs to run, the arg to filter should be the test you want to run + -e When tests finish or time out, send an email + here. May also be specified in ~/.teuthology.yaml + as 'results_email' + -R A comma-separated list of statuses to be used + with --rerun. Supported statuses are: 'dead', + 'fail', 'pass', 'queued', 'running', 'waiting' + [default: fail,dead] + +#. Wait for the tests to run. ``teuthology-suite`` prints a link to the + `Pulpito`_ page created for the tests triggered. .. note:: The priority number present in the command above is just a placeholder. It might be highly inappropriate for the jobs you may want to trigger. See `Testing Priority`_ section to pick a priority number. .. note:: Don't skip passing a priority number, the default value is 1000 - which way too high; the job probably might never run. + which is way too high; the job probably might never run. #. Wait for the tests to run. ``teuthology-suite`` prints a link to the `Pulpito`_ page created for the tests triggered. 
@@ -65,39 +89,53 @@ After building is complete, proceed to trigger tests - Other frequently used/useful options are ``-d`` (or ``--distro``), ``--distroversion``, ``--filter-out``, ``--timeout``, ``flavor``, ``-rerun``, ``-l`` (for limiting number of jobs) , ``-n`` (for how many times job would -run) and ``-e`` (for email notifications). Run ``teuthology-suite --help`` -to read description of these and every other options available. +run). Run ``teuthology-suite --help`` to read the descriptions of these and all +other available options. Testing QA changes (without re-building binaires) -------------------------------------------------- +************************************************* + While writing a PR you might need to test your PR repeatedly using teuthology. If you are making non-QA changes, you need to follow the standard process of triggering builds, waiting for it to finish and then triggering tests and wait for the result. But if changes you made are purely changes in qa/, you don't need rebuild the binaries. Instead you can test binaries built for the ceph-ci branch and instruct ``teuthology-suite`` command to use a separate -branch for running tests. The separate branch can be passed to the command -by using ``--suite-repo`` and ``--suite-branch``. Pass the link to the GitHub -fork where your PR branch exists to the first option and pass the PR branch -name to the second option. +branch for running tests. +The separate branch can be passed to the command by using ``--suite-repo`` and +``--suite-branch``. Pass the link to the GitHub fork where your PR branch exists +to the first option and pass the PR branch name to the second option. 
For example, if you want to make changes in ``qa/`` after testing ``branch-x`` (of which has ceph-ci branch is ``wip-username-branch-x``) by running following command:: - teuthology-suite -v -m smithi -c wip-username-branch-x -s fs -p 50 --filter cephfs-shell + teuthology-suite -v \ + -m smithi \ + -c wip-username-branch-x \ + -s fs \ + -p 50 \ + --filter cephfs-shell + You can make the modifications locally, update the PR branch and then trigger tests from your PR branch as follows:: - teuthology-suite -v -m smithi -c wip-username-branch-x -s fs -p 50 --filter cephfs-shell --suite-repo https://github.com/username/ceph --suite-branch branch-x + teuthology-suite -v \ + -m smithi \ + -c wip-username-branch-x \ + -s fs -p 50 \ + --filter cephfs-shell \ + --suite-repo https://github.com/$username/ceph \ + --suite-branch branch-x You can verify if the tests were run using this branch by looking at values for the keys ``suite_branch``, ``suite_repo`` and ``suite_sha1`` in the job config printed at the very beginning of the teuthology job. About Suites and Filters ------------------------- +************************ + See `Suites Inventory`_ for a list of suites of integration tests present right now. Alternatively, each directory under ``qa/suites`` in Ceph repository is an integration test suite, so looking within that directory @@ -107,13 +145,48 @@ For picking an argument for ``--filter``, look within ``qa/suites///tasks`` to get keywords for filtering tests. Each YAML file in there can trigger a bunch of tests; using the name of the file, without the extension part of the file name, as an argument to the -``--filter`` will trigger those tests. For example, the sample command above -uses ``cephfs-shell`` since there's a file named ``cephfs-shell.yaml`` in -``qa/suites/fs/basic_functional/tasks/``. In case, the file name doesn't hint -what bunch of tests it would trigger, look at the contents of the file for -``modules`` attribute. 
For ``cephfs-shell.yaml`` the ``modules`` attribute -is ``tasks.cephfs.test_cephfs_shell`` which means it'll trigger all tests in -``qa/tasks/cephfs/test_cephfs_shell.py``. +``--filter`` will trigger those tests. +For example, the sample command above uses ``cephfs-shell`` since there's a file +named ``cephfs-shell.yaml`` in ``qa/suites/fs/basic_functional/tasks/``. If +the file name doesn't hint at which tests it would trigger, look at +the contents of the file for the ``modules`` attribute. For ``cephfs-shell.yaml`` +the ``modules`` attribute is ``tasks.cephfs.test_cephfs_shell`` which means +it'll trigger all tests in ``qa/tasks/cephfs/test_cephfs_shell.py``. + +Viewing Tests Results +--------------------- + +Pulpito Dashboard +***************** + +Once the teuthology job is scheduled, the status and results of the test run can +be checked at https://pulpito.ceph.com/. +It can be used to quickly check job logs, their status, etc. + +Teuthology Archives +******************* + +Once the tests have finished running, the log for the job can be obtained by +clicking on the job ID on the Pulpito page for your tests. It's more convenient to +download the log and then view it rather than viewing it in an internet browser +since these logs can easily be up to 1 GB in size. It is easier to +ssh into the teuthology machine again (``teuthology.front.sepia.ceph.com``), and +access the following path:: + + /ceph/teuthology-archive///teuthology.log + +For example, for the above test ID, the path is:: + + /ceph/teuthology-archive/teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi/4588482/teuthology.log + +This way the log can be viewed remotely without having to wait too +much. + +.. note:: To access archives more conveniently, ``/a/`` has been symbolically + linked to ``/ceph/teuthology-archive/``. 
For instance, to access the previous + example, we can use something like:: + + /a/teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi/4588482/teuthology.log Killing Tests ------------- @@ -123,7 +196,7 @@ times wrong set of tests can be triggered is filter wasn't chosen carefully. To save resource it's better to termniate such a job. Following is the command to terminate a job:: - teuthology-kill -r teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi + teuthology-kill -r teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi Let's call the argument passed to ``-r`` as test ID. It can be found easily in the link to the Pulpito page for the tests you triggered. For @@ -131,32 +204,22 @@ example, for the above test ID, the link is - http://pulpito.front.sepia.ceph.co Re-running Tests ---------------- -Pass ``--rerun`` option, with test ID as an argument to it, to -``teuthology-suite`` command:: - - teuthology-suite -v -m smithi -c wip-rishabh-fs-test_cephfs_shell-fix -p 50 --rerun teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi - -The meaning of rest of the options is already covered in `Triggering Tests` +You can pass the ``--rerun`` option, with the test ID as its argument, to the +``teuthology-suite`` command. This is generally useful when a teuthology test +batch has some failed/dead jobs that we might want to retrigger. We can retrigger +jobs based on their status using:: + + teuthology-suite -v \ + -m smithi \ + -c wip-rishabh-fs-test_cephfs_shell-fix \ + -p 50 \ + --rerun teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi \ + -R fail,dead,queued,running \ + -e $CEPH_QA_MAIL + +The meaning of the rest of the options is already covered in the `Triggering Tests`_ section. -Teuthology Archives -------------------- -Once the tests have finished running, the log for the job can be obtained by -clicking on job ID at the Pulpito page for your tests. 
It's more convenient to -download the log and then view it rather than viewing it in an internet -browser since these logs can easily be upto size of 1 GB. What's much more -easier is to log in to the teuthology machine again -(``teuthology.front.sepia.ceph.com``), and access the following path:: - - /ceph/teuthology-archive///teuthology.log - -For example, for above test ID path is:: - - /ceph/teuthology-archive/teuthology-2019-12-10_05:00:03-smoke-master-testing-basic-smithi/4588482/teuthology.log - -This way the log remotely can be viewed remotely without having to wait too -much. - Naming the ceph-ci branch ------------------------- There are no hard conventions (except for the case of stable branch; see @@ -179,5 +242,6 @@ https://github.com/ceph/ceph-ci/branches. .. _Pulpito: http://pulpito.front.sepia.ceph.com/ .. _Running Your First Test: ../running-tests-locally/#running-your-first-test .. _Shaman: https://shaman.ceph.com/builds/ceph/ -.. _Suites Inventory: ../tests-integration-tests/#suites-inventory -.. _Testing Priority: ../tests-integration-tests/#testing-priority +.. _Suites Inventory: ../tests-integration-testing-teuthology-intro.rst/#suites-inventory +.. _Testing Priority: ../tests-integration-testing-teuthology-intro.rst/#testing-priority +.. _Triggering Tests: ../tests-integration-testing-teuthology-workflow.rst/#triggering-tests diff --git a/doc/dev/developer_guide/testing_integration_tests/tests-sentry-developers-guide.rst b/doc/dev/developer_guide/testing_integration_tests/tests-sentry-developers-guide.rst new file mode 100644 index 00000000000..94dfae39aa6 --- /dev/null +++ b/doc/dev/developer_guide/testing_integration_tests/tests-sentry-developers-guide.rst @@ -0,0 +1,6 @@ +.. _tests-sentry-developers-guide: + +Sentry Notes +============ + +To be updated. Feel free to contribute. -- 2.39.5