Zack Cerza [Thu, 12 Dec 2013 23:33:53 +0000 (17:33 -0600)]
Make sure to report all results.
If a just-finished job was using a teuthology branch not known to
contain the reporting feature, then report the job via the
teuthology-report script. Note that in some cases this will result in
double reporting but the extra load should be negligible.
John Spray [Thu, 12 Dec 2013 21:33:19 +0000 (13:33 -0800)]
Fix FSID not being set in ceph.conf
Symptom was that 'ceph --admin-daemon... config get fsid'
returned zeros, while correct fsid was present in cluster maps.
Fix it by populating FSID in ceph.conf, after extracting it from
monmap.
Sandon Van Ness [Thu, 12 Dec 2013 02:07:43 +0000 (18:07 -0800)]
Longer timeout after sync/reboot.
With only a 5-second sleep via ssh and python, a race condition was
sometimes hit in which the machine was considered back up before the
reboot command had completed.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sage Weil [Mon, 9 Dec 2013 19:42:12 +0000 (11:42 -0800)]
nuke: ignore exceptions while issuing reboot command
I'm seeing failed tasks (and nuke) leak machines. It looks like we are
getting an exception on the '... reboot -f -n' command when we should be
ignoring it and waiting for the machine to restart.
For example:
http://qa-proxy.ceph.com/teuthology/sage-2013-12-08_19:25:06-rados:thrash-wip-tier-foo-basic-plana/136321/teuthology.log
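The behaviour described in the nuke commit above can be sketched roughly as follows. This is a minimal illustration, not the actual nuke code: reboot_and_wait, run_reboot, is_up, and the settle/timeout/poll parameters are all hypothetical names chosen for the example.

```python
import time


def reboot_and_wait(run_reboot, is_up, settle=0.0, timeout=5.0, poll=0.01):
    """Issue a reboot, ignore errors from the command itself, then wait.

    run_reboot and is_up are hypothetical callables supplied by the caller;
    they stand in for the remote reboot command and a reachability check.
    """
    try:
        run_reboot()
    except Exception as exc:
        # A forced reboot often drops the connection mid-command, so an
        # exception here is expected and must not abort the cleanup.
        print("ignoring error from reboot command: %s" % exc)

    # Give the machine time to actually go down before polling for it.
    time.sleep(settle)

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_up():
            return True
        time.sleep(poll)
    return False
```

The point of the design is that a failure of the reboot command is treated as a normal outcome, and only the later reachability check decides whether the machine came back.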
Warren Usui [Thu, 5 Dec 2013 01:49:21 +0000 (17:49 -0800)]
A create_if_vm call was made more than once when a lock-many style lock
was performed. This caused downburst to run twice, and the second
downburst fails as a result of the first downburst running.
Warren Usui [Mon, 2 Dec 2013 22:37:12 +0000 (14:37 -0800)]
Implement --downburst-conf parameter for teuthology-lock.
Load the appropriate yaml information when found (this formerly
did not work). Make sure teuthology --lock works with a downburst
entry in the yaml files. Document how this works in README.rst.
Warren Usui [Fri, 15 Nov 2013 04:24:38 +0000 (20:24 -0800)]
tgt and iscsi code need some minor fixes. Moved the settle call during
simple read testing. In iscsi.py, generic_mkfs and generic_mount need
to be called from the main body of the task. An extraneous iscsiadm
command was removed. The tgt size is now not hard-coded. It is extracted
from the property and defaults to 10240.
Sandon Van Ness [Thu, 21 Nov 2013 00:37:31 +0000 (16:37 -0800)]
Use shortened version in order to avoid revision/arch mishaps.
Sometimes -X is added to package names, and that suffix does not exist
in the /version file. Simply using the version string does not work on
RHEL (it does on centos). Until the version and the packages match
exactly, we will instead just split the version at the - and no longer
specify the dist, for better reliability but slightly lower accuracy.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
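The split described in the commit above might look like the following sketch. The function name short_version and the sample version strings in the usage note are assumptions made for illustration; they are not taken from the teuthology source.

```python
def short_version(version):
    """Return the part of a version string before the first '-'.

    Assumes the revision/dist suffix, when present, follows the first
    hyphen; a string without a hyphen is returned unchanged.
    """
    return version.strip().split('-', 1)[0]
```

For a hypothetical string such as "0.72.1-123.gabcdef0.el6", this keeps only "0.72.1", which trades some accuracy for a match that does not depend on the revision or dist suffix.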
Sandon Van Ness [Sat, 9 Nov 2013 01:00:27 +0000 (17:00 -0800)]
For saya (arm) use arm gitbuilder for ceph sha1.
Since the arm gitbuilder (even using a ton of nodes for distcc) is
much slower than x86, let's grab the sha1 from its own gitbuilder
when machine_type is saya, rather than from the x86 one.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sandon Van Ness [Fri, 8 Nov 2013 22:35:51 +0000 (14:35 -0800)]
Distro kernel bug-fixes.
Fixed some things that were being done incorrectly.
Some distro kernels have no debug, so added | true when disabling
kdb. Also changed the check that skipped kernels on non-ubuntu so
that a kernel install is also scheduled for a distro kernel.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Warren Usui [Tue, 15 Oct 2013 21:43:39 +0000 (14:43 -0700)]
Added two new tasks. tgt starts up the tgt service. iscsi starts
up the iscsi service and logs in to an rbd image using the tgt service
(either locally or remotely). The iscsi service runs some
simple tests, and then sets up the isci-image to be useable by
rbd test scripts. Later workunits can perform further testing
on the isci-image interface.
In order to add the new tasks, common_fs_utils.py was formed
from code extracted out of rbd.py. Rbd.py and iscsi.py both
call the functions in this module.
Sage Weil [Fri, 1 Nov 2013 17:56:42 +0000 (10:56 -0700)]
install.upgrade: fix overrides of sha1|tag|branch
If the upgrade task config has a branch: (or tag or sha1), do
not apply the sha1|branch|tag overrides keys. This fixes the
breakage from 280f783c2e8dda0df6afb4de0b115aad1614fbdc which
made
...use the sha1 from the overrides instead of the explicitly
specified branch. The intention was to only use the overrides
when the version was not specified (whether it was sha1 or
branch or tag).
At some point we should probably make the same change for
install function in install.py, but let's fix this first to
get the upgrade tests working.
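The intended rule in the install.upgrade commit above can be sketched as below. This is an illustration under stated assumptions, not the code in install.py: apply_overrides and VERSION_KEYS are hypothetical names, and the configs are modelled as plain dictionaries.

```python
VERSION_KEYS = ('sha1', 'tag', 'branch')


def apply_overrides(task_config, overrides):
    """Merge overrides into a task config (hypothetical helper).

    If the task config explicitly names a sha1, tag, or branch, none of
    the sha1/tag/branch override keys are applied; other keys still are.
    """
    merged = dict(task_config)
    explicit = any(key in task_config for key in VERSION_KEYS)
    for key, value in overrides.items():
        if explicit and key in VERSION_KEYS:
            # The version was specified explicitly, so leave it alone.
            continue
        merged[key] = value
    return merged
```

With this rule, a task that says branch: keeps its branch even when the overrides carry a sha1, while a task with no version at all still picks up the override.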
Sandon Van Ness [Thu, 31 Oct 2013 00:06:14 +0000 (17:06 -0700)]
Initial ugly commit.
Definitely some enhancements can be done. I think I have everything
needed but I have not been able to test this yet. If this needs to
get done before I am back feel free to work on it.
Completely untested and probably a few mistakes somewhere...
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sandon Van Ness [Mon, 28 Oct 2013 18:04:28 +0000 (11:04 -0700)]
Fixed errors. Tests pass.
Since the default OS version is different for each distro, the
argument default is None instead of being explicitly set to a value
as with get_distro. Fixed some logic around that and the tests,
making the argument always take precedence.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
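The precedence described in the commit above could be sketched like this. Everything here is assumed for illustration: get_os_version is a hypothetical name, the fallback to a yaml value is an assumption about the intended order, and the per-distro defaults in DEFAULT_OS_VERSION are made-up placeholders.

```python
# Placeholder defaults; the real per-distro values are not given above.
DEFAULT_OS_VERSION = {'ubuntu': '12.04', 'centos': '6.4'}


def get_os_version(os_type, arg_version=None, yaml_version=None):
    """Pick an OS version, letting the command-line argument win.

    arg_version defaults to None because no single default fits every
    distro; None means "not given", so the lookup falls through.
    """
    if arg_version is not None:
        return arg_version
    if yaml_version is not None:
        return yaml_version
    return DEFAULT_OS_VERSION.get(os_type)
```

Using None as the default is what makes "not specified" distinguishable from any real version string, so the per-distro default can be chosen late.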
Sandon Van Ness [Sat, 26 Oct 2013 00:48:50 +0000 (17:48 -0700)]
Support --os-version as argument.
You can use --os-type as an argument when not running teuthology
tests but instead just using teuthology-lock. This adds the ability
to also use --os-version, so you can specify the version of the
distro without having to run an actual test with a yaml that sets
os_version, as you would otherwise have had to do.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sandon Van Ness [Fri, 25 Oct 2013 20:14:21 +0000 (13:14 -0700)]
Use worker httpd instead of prefork (like ubuntu) on rpm distros.
Ubuntu's default apache uses worker instead of prefork, unlike rpm
based distros. On rpm distros, use httpd.worker instead of httpd so
that the -X behavior will not be blocked by a single request.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sage Weil [Thu, 24 Oct 2013 02:24:06 +0000 (19:24 -0700)]
rbd_fsx: do not exceed 250GB for fsx image
This breaks on 32-bit architectures because fsx can't allocate the
in-memory buffer or hits some other limit. 500MB also seems to run out
of memory (though it at least does not segfault).
Fixes: #6576
Signed-off-by: Sage Weil <sage@inktank.com>
Samuel Just [Wed, 23 Oct 2013 17:52:55 +0000 (10:52 -0700)]
ceph_manager: workaround for 6116
This is an annoying race; we really should delay going
clean until the backfill peer has acknowledged the clean
info, but we currently don't. In order to prevent this
bug from messing up the nightlies, we'll delay killing
the peer for 20s to make it likely that the backfill
peer has gotten the clean info.
Workaround: #6116
Signed-off-by: Samuel Just <sam.just@inktank.com>