Zack Cerza [Fri, 23 Oct 2015 22:00:06 +0000 (16:00 -0600)]
misc.sh(): Don't log.exception() before raise
log.exception() logs the traceback, and raise will also cause it to be
logged. There's no need to have it logged twice; additionally, when sh()
was being called within a try/except clause we were confusingly logging
an expected failure. Callers can choose to log if they want.
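A minimal sketch of the pattern described above, assuming a hypothetical sh() wrapper built on subprocess (the real misc.sh() may differ). The helper raises without logging the traceback, and the hypothetical caller succeeds() decides what, if anything, to log:

```python
import logging
import subprocess

log = logging.getLogger(__name__)


def sh(command):
    """Run a shell command and return its output (illustrative only)."""
    log.debug("running: %s", command)
    # check_output raises CalledProcessError on a non-zero exit status.
    # There is no log.exception() here: the exception propagates with its
    # traceback, so whoever handles it can log it exactly once.
    output = subprocess.check_output(
        command, shell=True, stderr=subprocess.STDOUT)
    return output.decode()


def succeeds(command):
    """Hypothetical caller that treats failure as an expected outcome."""
    try:
        sh(command)
    except subprocess.CalledProcessError as e:
        # The caller chooses the log level; no traceback is emitted.
        log.debug("command failed with status %d", e.returncode)
        return False
    return True
```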
Zack Cerza [Thu, 22 Oct 2015 16:39:36 +0000 (10:39 -0600)]
OpenStack.exists(): Don't list every instance
Instead of "openstack server list", which dumps the tenant's entire list of
instances, use "openstack server show" to show a single instance. While
"list" can accept a "--name" argument to filter, it does not have an
"--id" argument.
Zack Cerza [Mon, 19 Oct 2015 21:35:37 +0000 (15:35 -0600)]
Use safe_while to work around an OVH problem
We're seeing intermittent network failures when running inside OVH; they
present as:
https://github.com/kennethreitz/requests/issues/2364
This should help work around the issue.
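The retry idea can be sketched as follows. This is a generic stand-in, not teuthology's safe_while helper; the retry() name, its parameters, and the use of the built-in ConnectionError are assumptions made for illustration:

```python
import time


def retry(func, exceptions, tries=5, sleep=1.0, _sleep=time.sleep):
    """Call func() until it succeeds or `tries` attempts have failed.

    `exceptions` is an exception class (or tuple of classes) considered
    transient; anything else propagates immediately.
    """
    for attempt in range(1, tries + 1):
        try:
            return func()
        except exceptions:
            if attempt == tries:
                # Out of attempts: let the last error propagate.
                raise
            _sleep(sleep)
```

A caller would wrap only the network operation that fails intermittently, so unrelated errors are not retried.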
openstack: clear the buildpackages directory on stop
When running /etc/init.d/teuthology stop, all OpenStack resources are
destroyed, including the instance hosting the repository where the
buildpackages task artefacts are archived. Remove /tmp/stampsdir so that
everything gets rebuilt.
config: ~/.teuthology.yaml check_package_signatures hint
If check_package_signatures is false, the tasks that install
packages (install, ceph-deploy, ...) are allowed to skip package
signature verification.
Set this as the default for a cluster dynamically generated by the
OpenStack backend.
Dan Mick [Fri, 9 Oct 2015 01:22:25 +0000 (18:22 -0700)]
run_tasks.py: fix Sentry URL
I don't know when or why it changed, but the existing URL format, which
uses '/search?q=<id>', fails; what works, by observing the web UI's URL
submission and by testing, is to omit the 'search' part of the path:
'/?q=<id>'
Loic Dachary [Thu, 8 Oct 2015 21:16:45 +0000 (23:16 +0200)]
install: split the upgrade_common function
The upgrade_common function implements nontrivial logic that defines
how overrides are applied to the install.upgrade task, as well as the
way upgrades are applied to the desired targets.
The function is split in two:
- upgrade_common, which remains the entry point
- upgrade_remote_to_config, which encapsulates the logic
This allows other parts of teuthology to obey the same logic by calling
the function instead of replicating it.
config: add ceph_git_url and ceph_qa_suite_git_url
The ~/.teuthology.yaml ceph_git_base_url configuration does not allow
modifying the URL of the Ceph repository without also modifying the URL of
the teuthology repository. Although it is frequently needed to point to
an alternate ceph or ceph-qa-suite repository, it is rarely necessary to
point to an alternate teuthology repository.
This is not a blocker: it is enough to mirror the teuthology,
ceph-cm-ansible, ceph-deploy and maybe a few other repositories to
satisfy this requirement. This is however inconvenient because the
exact list of repositories that need to be mirrored is not easily
accessible. In addition, unless the user is careful about updating the
mirrors prior to running teuthology, there is a good chance that an
obsolete version of the repository will be used and this may lead to
problems that are difficult to diagnose.
The ceph_git_url and ceph_qa_suite_git_url configuration variables are
added to specify the URL of the ceph and ceph-qa-suite repositories
without modifying the ceph_git_base_url value so that all other
repositories retain their default location.
For easier consumption within teuthology and ceph-qa-suite, the
get_ceph_git_url() and get_ceph_qa_suite_git_url() accessors are added
to the config class. They use the user-provided value, if available, and
otherwise fall back to constructing the URL with ceph_git_base_url, which
is the legacy behavior.
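The fallback described above can be sketched as follows. The Config class here is a simplified, hypothetical stand-in for the real config class; the option names follow the commit title, and the default base URL is an assumption:

```python
class Config:
    """Simplified stand-in for the teuthology config class."""

    def __init__(self, ceph_git_base_url="https://github.com/ceph/",
                 ceph_git_url=None, ceph_qa_suite_git_url=None):
        self.ceph_git_base_url = ceph_git_base_url
        self.ceph_git_url = ceph_git_url
        self.ceph_qa_suite_git_url = ceph_qa_suite_git_url

    def get_ceph_git_url(self):
        # Prefer the explicit URL; otherwise build it from the base URL
        # (the legacy behavior).
        return self.ceph_git_url or self.ceph_git_base_url + "ceph.git"

    def get_ceph_qa_suite_git_url(self):
        return (self.ceph_qa_suite_git_url or
                self.ceph_git_base_url + "ceph-qa-suite.git")
```

With this shape, overriding only ceph_git_url leaves every other repository at its default location under the base URL.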
Loic Dachary [Tue, 6 Oct 2015 16:18:44 +0000 (18:18 +0200)]
misc: wait_until_osds_up must verify 'up' in state
It is not enough to count the number of entries in the osds
array; wait_until_osds_up must count how many are actually up by
checking if the string "up" is in the "state" array.
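A minimal sketch of that check, assuming the OSD map has already been parsed into a dict with an "osds" list whose entries carry a "state" list (the count_up_osds name and the sample data are hypothetical):

```python
def count_up_osds(osd_dump):
    """Count the OSDs whose "state" list contains the string "up"."""
    return sum(
        1 for osd in osd_dump.get("osds", [])
        if "up" in osd.get("state", [])
    )


# Three entries exist, but only two of them are actually up.
sample = {
    "osds": [
        {"osd": 0, "state": ["exists", "up"]},
        {"osd": 1, "state": ["exists"]},
        {"osd": 2, "state": ["exists", "up"]},
    ]
}
up_count = count_up_osds(sample)
```

Counting len(sample["osds"]) would report three here, which is the miscount the commit fixes.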
openstack: throttling helps the instance running the cluster
The instance throttling (not launching more than X instances per minute)
helps the instance running the teuthology cluster when running multiple
workers. The workload does not spike when launching a suite, which
allows running more workers on a machine with the same hardware configuration.
openstack: do not rely on gitbuilder.ceph.com by default
Make it so the default when using the OpenStack backend is to build the
packages transparently using the OpenStack cluster instead of relying on
http://ceph.com/gitbuilder.cgi.
If a buildpackages task is found, ensure it is always before the install
task because it is intended to produce the packages that will be used by
the install task.
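The ordering rule can be sketched as follows, assuming the job's tasks are a list of single-key dicts as in a job YAML; the function name is hypothetical and the real implementation may differ:

```python
def buildpackages_before_install(tasks):
    """Return a copy of tasks with buildpackages placed before install."""
    names = [next(iter(task)) for task in tasks]
    reordered = list(tasks)
    if "buildpackages" not in names or "install" not in names:
        return reordered
    build_index = names.index("buildpackages")
    install_index = names.index("install")
    if build_index > install_index:
        # Popping an element located after install does not shift the
        # install index, so inserting there lands just before install.
        task = reordered.pop(build_index)
        reordered.insert(install_index, task)
    return reordered
```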
The call to teuthology_schedule was buried inside an 'if dry_run:'
clause. That clause is unnecessary since teuthology_schedule handles
dry_run cases; we pass it the same value as an argument.
When running a suite with 50 jobs, it will schedule about 100 instance
creations in less than a minute. This is likely to exceed the API
quotas of the OpenStack provider (number of instance creations per
minute) and lead to instance creation failures.
Call teuthology-suite with --throttle 15 by default. That is
60 / 15 = 4 jobs per minute and, at about two instances per job,
roughly 8 server creations per minute.
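The arithmetic can be checked with a short sketch. The function name is hypothetical, and the figure of two instances per job is taken from the 50 jobs and roughly 100 instance creations mentioned above:

```python
def creations_per_minute(throttle_seconds, instances_per_job=2):
    """Estimate server creations per minute for a given throttle."""
    jobs_per_minute = 60 / throttle_seconds
    return jobs_per_minute * instances_per_job


rate = creations_per_minute(15)  # 60 / 15 = 4 jobs, times 2 instances
```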
When scheduling, wait SLEEP seconds between jobs. Useful to avoid
bursts that may be too hard on the underlying infrastructure or exceed
OpenStack API limits (server creation per minute for instance).
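A minimal sketch of throttled scheduling, assuming a hypothetical schedule_all() helper that receives the jobs and a callable that schedules one job; it is not the actual teuthology-suite code:

```python
import time


def schedule_all(jobs, schedule, throttle=0, _sleep=time.sleep):
    """Schedule each job, pausing `throttle` seconds between jobs."""
    for index, job in enumerate(jobs):
        if throttle and index > 0:
            # No pause before the first job, only between jobs.
            _sleep(throttle)
        schedule(job)
```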
Add ansible task to install repo if config provides it
If a playbook is specified as shown in the docstring usage, it will run the ansible task and install
the repos so that RPMs can be installed from the provided repo.