git.apps.os.sepia.ceph.com Git - teuthology.git/log
Zack Cerza [Mon, 6 Jan 2014 16:56:26 +0000 (10:56 -0600)]
Enable reporting of entire runs as dead
Zack Cerza [Fri, 3 Jan 2014 21:45:18 +0000 (15:45 -0600)]
Re-raise exceptions caught in the watchdog
Zack Cerza [Fri, 3 Jan 2014 21:08:45 +0000 (15:08 -0600)]
Use response.text if response.json is None
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
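A minimal sketch of the fallback named in the commit title above, assuming a response object that exposes `json` and `text` attributes. `FakeResponse` and `response_body` are hypothetical names used only for illustration; this is not the project's actual code.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""
    text: str
    json: Optional[Any] = None


def response_body(response):
    """Return the decoded JSON body, or the raw text if there is none."""
    if response.json is None:
        return response.text
    return response.json
```

Falling back to the raw text keeps an error page or plain-text reply visible in the logs instead of discarding it.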
Zack Cerza [Fri, 3 Jan 2014 21:01:31 +0000 (15:01 -0600)]
Strip stdout lines
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Fri, 3 Jan 2014 20:56:46 +0000 (14:56 -0600)]
Catch and log unhandled exceptions in the watchdog
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Fri, 3 Jan 2014 20:45:25 +0000 (14:45 -0600)]
Add 'emperor' to list of branches with reporting
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Fri, 3 Jan 2014 18:41:11 +0000 (12:41 -0600)]
Work around a change in pip 1.5 regarding wheels
The error message was "pip's wheel support requires setuptools >= 0.8
for dist-info support."
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Fri, 3 Jan 2014 17:55:13 +0000 (11:55 -0600)]
Be safer when calling ./bootstrap
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Sandon Van Ness [Fri, 3 Jan 2014 02:30:08 +0000 (18:30 -0800)]
Use CentOS Gitbuilder sha1 instead of Fedora for non-ubuntu.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Alfredo Deza [Fri, 13 Dec 2013 19:46:29 +0000 (14:46 -0500)]
break out of the while loop after 15 minutes
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
(cherry picked from commit bef6eb74dcaa37b70b1eab4d28bfa10abb0049d0)
Signed-off-by: Zack Cerza <zack@cerza.org>
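One way to bound a polling loop as the commit above describes is to compute a deadline up front and give up once it passes. This is a sketch under assumptions: `wait_for` and its parameters are hypothetical, and the injectable `clock` and `sleep` exist only to make the example testable.

```python
import time


def wait_for(condition, timeout=15 * 60, interval=1.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it is true or `timeout` seconds elapse.

    Returns True if the condition was met, False if the loop gave up.
    """
    deadline = clock() + timeout
    while not condition():
        if clock() >= deadline:
            return False  # break out instead of looping forever
        sleep(interval)
    return True
```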
Zack Cerza [Tue, 31 Dec 2013 20:25:05 +0000 (14:25 -0600)]
Sleep once outside of the watchdog loop
Hopefully this will prevent the double-posting of jobs.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Alfredo Deza [Tue, 31 Dec 2013 13:53:51 +0000 (05:53 -0800)]
Merge pull request #168 from ktdreyer/readme-formatting
format bullets in README
Ken Dreyer [Tue, 31 Dec 2013 02:42:39 +0000 (19:42 -0700)]
format bullets in README
Zack Cerza [Mon, 30 Dec 2013 22:20:52 +0000 (16:20 -0600)]
Set the content-type in report_job()
Zack Cerza [Mon, 30 Dec 2013 16:05:16 +0000 (10:05 -0600)]
Split out ResultsSerializer.job_info()
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Mon, 16 Dec 2013 17:39:49 +0000 (11:39 -0600)]
Port from httplib2 to requests module
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Sage Weil [Sun, 22 Dec 2013 06:21:49 +0000 (22:21 -0800)]
valgrind.supp: ignore libnss3 leaks
These just started popping up when I updated the notcmalloc gitbuilder, probably
because of an updated libnss version. Whitelist it!
Signed-off-by: Sage Weil <sage@inktank.com>
Ilya Dryomov [Mon, 23 Dec 2013 17:54:11 +0000 (19:54 +0200)]
rbd: bump the default scratch size for xfstests to 10G
autobuild-ceph.git commit 53db7a34aba5 had silently changed the default
elevator from cfq to deadline, which made xfstests 167 very unhappy.
It looks like with deadline and noop elevators it requires a ~6G
scratch partition. Bump the default scratch image size to 10G.
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Sage Weil [Sun, 22 Dec 2013 17:50:12 +0000 (09:50 -0800)]
Revert "valgrind.supp: ignore libnss3 leaks"
This reverts commit 572dc88a7cc295cb06354e6f004f7ad665b101f4.
This didn't occur on next; I think there may be a real leak on the ceph
side.
Sage Weil [Sun, 22 Dec 2013 06:21:49 +0000 (22:21 -0800)]
valgrind.supp: ignore libnss3 leaks
These just started popping up. Probably because I gave the
gitbuilders a kick?
Signed-off-by: Sage Weil <sage@inktank.com>
SandonV [Fri, 20 Dec 2013 20:48:42 +0000 (12:48 -0800)]
Merge pull request #166 from ceph/wip-lockspell-wusui
Fix spelling error in comment.
Warren Usui [Fri, 20 Dec 2013 20:31:24 +0000 (12:31 -0800)]
Fix spelling error in teuthology/task/locktest.py comment
Zack Cerza [Fri, 20 Dec 2013 15:52:12 +0000 (09:52 -0600)]
Add ability to mark jobs as 'dead'
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Thu, 19 Dec 2013 22:43:11 +0000 (16:43 -0600)]
Allow passing multiple job_ids
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Thu, 19 Dec 2013 22:12:56 +0000 (16:12 -0600)]
Implement single-job killing
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Thu, 19 Dec 2013 21:39:15 +0000 (15:39 -0600)]
For teuthology-kill, s/suite/run/
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
SandonV [Thu, 19 Dec 2013 22:27:16 +0000 (14:27 -0800)]
Merge pull request #165 from ceph/wip-7042-fix-wusui
Do not run local handling fix if local parameter is not found.
Warren Usui [Thu, 19 Dec 2013 22:20:12 +0000 (14:20 -0800)]
Do not run local handling fix if local parameter is not found.
Fixes: 7042
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Zack Cerza [Thu, 19 Dec 2013 17:27:14 +0000 (09:27 -0800)]
Merge pull request #156 from ceph/teuthology-doc-hadoop-wusui
Added docstrings. Cleaned up code (broke up long lines, removed unused
Zack Cerza [Thu, 19 Dec 2013 17:24:21 +0000 (09:24 -0800)]
Merge pull request #164 from ceph/wip-rados
rados: add in more (optional) op types
Zack Cerza [Thu, 19 Dec 2013 17:23:36 +0000 (09:23 -0800)]
Merge pull request #160 from ceph/wip-fix-5149-wusui
Added handling of a 'local' option inside install.py which specifies
Zack Cerza [Thu, 19 Dec 2013 16:25:51 +0000 (10:25 -0600)]
Log calls to teuthology-report more verbosely
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Tue, 17 Dec 2013 17:02:30 +0000 (11:02 -0600)]
Catch every exception here, for now.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Sandon Van Ness [Wed, 18 Dec 2013 20:38:50 +0000 (12:38 -0800)]
Use saucy gitbuilder for arm package checking.
Somehow missed that it checks both the sha1 and the package version file,
and the package version was still using the quantal gitbuilder, which won't
work as the hardware is down.
This was causing scheduling failures.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sage Weil [Wed, 18 Dec 2013 19:41:58 +0000 (11:41 -0800)]
rados: add in more (optional) op types
Signed-off-by: Sage Weil <sage@inktank.com>
Zack Cerza [Mon, 16 Dec 2013 20:22:22 +0000 (14:22 -0600)]
Use shell=True to call teuthology-report
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Mon, 16 Dec 2013 19:34:37 +0000 (13:34 -0600)]
Catch OSError if script isn't in $PATH
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
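A small sketch of the pattern in the commit title above: when the executable cannot be found, `subprocess` raises `OSError` (in Python 3, the `FileNotFoundError` subclass), which the caller can catch. The helper name `try_call` is hypothetical.

```python
import subprocess


def try_call(args):
    """Run a command; return its exit code, or None if it cannot be started."""
    try:
        return subprocess.call(args)
    except OSError:
        # Raised when the executable is missing from $PATH (among other causes).
        return None
```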
Zack Cerza [Mon, 16 Dec 2013 17:43:06 +0000 (11:43 -0600)]
Revert "Use path when calling teuthology-report. …"
This reverts commit e4b5ab811e954a5b134d413aeb338805b5e3441d.
Sandon Van Ness [Sat, 14 Dec 2013 15:14:51 +0000 (07:14 -0800)]
Use path when calling teuthology-report. …
The 'teuthology-report' command is probably not going to exist
in $PATH, so get the location of the running command and assume it's
in the same path.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Zack Cerza [Fri, 13 Dec 2013 17:25:30 +0000 (09:25 -0800)]
Merge pull request #162 from jcsp/fsid-conf
Fix FSID not being set in ceph.conf
Zack Cerza [Fri, 13 Dec 2013 17:24:23 +0000 (09:24 -0800)]
Merge pull request #161 from jcsp/ssh-config
Respect .ssh/config when opening SSH connections
Zack Cerza [Fri, 13 Dec 2013 15:56:23 +0000 (09:56 -0600)]
Skip the 'dead' report on old branches
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Sandon Van Ness [Fri, 13 Dec 2013 00:04:38 +0000 (16:04 -0800)]
Use saucy gitbuilder when grabbing sha1 for arm.
Old quantal gitbuilders are gone until hardware comes back. Use
the new saucy gitbuilders instead.
Zack Cerza [Thu, 12 Dec 2013 23:33:53 +0000 (17:33 -0600)]
Make sure to report all results.
If a just-finished job was using a teuthology branch not known to
contain the reporting feature, then report the job via the
teuthology-report script. Note that in some cases this will result in
double reporting but the extra load should be negligible.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Thu, 12 Dec 2013 22:54:56 +0000 (16:54 -0600)]
Enable reporting of single jobs
(also switch to docopt)
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Thu, 12 Dec 2013 21:45:58 +0000 (15:45 -0600)]
Remove the child's stderr completely
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
John Spray [Thu, 12 Dec 2013 21:33:19 +0000 (13:33 -0800)]
Fix FSID not being set in ceph.conf
Symptom was that 'ceph --admin-daemon... config get fsid'
returned zeros, while correct fsid was present in cluster maps.
Fix it by populating FSID in ceph.conf, after extracting it from
monmap.
Zack Cerza [Thu, 12 Dec 2013 17:47:45 +0000 (11:47 -0600)]
When starting a job, tell paddles it's running
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Sandon Van Ness [Thu, 12 Dec 2013 02:07:43 +0000 (18:07 -0800)]
Longer timeout after sync/reboot.
With only a 5-second sleep via ssh and python, it looks like a
race condition was sometimes hit where it would think
the machine was back up before the reboot command had completed.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
John Spray [Wed, 11 Dec 2013 21:08:51 +0000 (13:08 -0800)]
Respect .ssh/config when opening SSH connections
This handles the case where your private key is
in a non-default location that you're pointing
to in ~/.ssh/config.
Warren Usui [Wed, 11 Dec 2013 07:45:38 +0000 (23:45 -0800)]
Added handling of a 'local' option inside install.py which specifies
a local directory containing deb or rpm files to be installed.
Fixes: 5149
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Zack Cerza [Tue, 10 Dec 2013 22:47:35 +0000 (16:47 -0600)]
Use continue, not break
Fixes a bug where not all pids were being collected
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
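The difference between `continue` and `break` in a collecting loop can be sketched as follows. The `(pid, name)` input shape and the function name `collect_pids` are assumptions made for this example, not the project's actual data structures.

```python
def collect_pids(processes, wanted):
    """Return the pids of every process whose name matches `wanted`.

    `processes` is an iterable of (pid, name) pairs; this shape is an
    assumption made for the example.
    """
    pids = []
    for pid, name in processes:
        if name != wanted:
            continue  # skip this entry; `break` would stop the scan early
        pids.append(pid)
    return pids
```

With `break` in place of `continue`, the first non-matching entry would end the loop and any later matches would be missed.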
Zack Cerza [Tue, 10 Dec 2013 22:35:05 +0000 (16:35 -0600)]
Tweak logic for pid lookup
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Tue, 10 Dec 2013 22:25:28 +0000 (16:25 -0600)]
Fix indentation
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Tue, 10 Dec 2013 19:19:56 +0000 (13:19 -0600)]
Don't show child's stderr, but show archive path
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Tue, 10 Dec 2013 16:06:16 +0000 (10:06 -0600)]
Add debug statements
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Tue, 10 Dec 2013 16:02:51 +0000 (08:02 -0800)]
Merge pull request #159 from ceph/wip-cache
rados: allow existing pool(s) to be used
Sage Weil [Tue, 10 Dec 2013 00:02:13 +0000 (16:02 -0800)]
rados: allow existing pool(s) to be used
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Mon, 9 Dec 2013 23:37:58 +0000 (15:37 -0800)]
ceph.conf: put 2x command in [global]
so that osdmaptool sees it.
Signed-off-by: Sage Weil <sage@inktank.com>
Zack Cerza [Mon, 9 Dec 2013 22:57:11 +0000 (16:57 -0600)]
Create a DateTime object from the timestamp
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Mon, 9 Dec 2013 22:40:27 +0000 (16:40 -0600)]
Make -a optional
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Zack Cerza [Mon, 9 Dec 2013 22:32:45 +0000 (16:32 -0600)]
Add missing req: psutil
Zack Cerza [Mon, 9 Dec 2013 21:16:33 +0000 (13:16 -0800)]
Merge pull request #151 from ceph/wip-distro-kernel
Wip distro kernel
Zack Cerza [Mon, 9 Dec 2013 20:56:49 +0000 (14:56 -0600)]
Auto-restart
If /tmp/teuthology-restart-workers is newer than the running process,
restart.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
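The check described above amounts to comparing a file's modification time with the process start time. A minimal sketch, assuming the start time is available as seconds since the epoch; `need_restart` is a hypothetical name, and how the real worker obtains its start time is not shown in the log.

```python
import os


def need_restart(sentinel_path, process_start_time):
    """Return True if the sentinel file was modified after the process started.

    `process_start_time` is seconds since the epoch. A missing sentinel
    file means no restart was requested.
    """
    try:
        return os.path.getmtime(sentinel_path) > process_start_time
    except OSError:
        return False
```

Touching the sentinel file then becomes the signal for every long-running worker to restart at its next check.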
Zack Cerza [Mon, 9 Dec 2013 21:01:03 +0000 (13:01 -0800)]
Merge pull request #158 from ceph/wip-nuke
make nuke behave
Sage Weil [Mon, 9 Dec 2013 19:42:12 +0000 (11:42 -0800)]
nuke: ignore exceptions while issuing reboot command
I'm seeing failed tasks (and nuke) leak machines. It looks like we are
getting an exception on the '... reboot -f -n' command when we should be
ignoring it and waiting for the machine to restart.
For example:
http://qa-proxy.ceph.com/teuthology/sage-2013-12-08_19:25:06-rados:thrash-wip-tier-foo-basic-plana/136321/teuthology.log
Signed-off-by: Sage Weil <sage@inktank.com>
Sandon Van Ness [Mon, 9 Dec 2013 19:42:06 +0000 (11:42 -0800)]
Remove unused variable.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sandon Van Ness [Mon, 9 Dec 2013 19:35:23 +0000 (11:35 -0800)]
Added additional comments.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sage Weil [Sat, 7 Dec 2013 21:20:58 +0000 (13:20 -0800)]
ceph.conf: default to 2x
A bunch of our tests rely on this; they need to be fixed
before we can run at 3x.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Sat, 7 Dec 2013 01:42:23 +0000 (17:42 -0800)]
nuke: fix sync before reboot timeout
If you do 'timeout 5 sync' and sync hangs, timeout will block trying to
kill it.
Instead, just background sync, wait a few seconds, and reboot. This means
we wait a few seconds even if sync returns immediately, but who cares!
Signed-off-by: Sage Weil <sage@inktank.com>
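The "background it, wait a little, move on" approach described above can be sketched locally with `subprocess`. This is an illustration under assumptions: the real change runs shell commands on remote machines, and `run_in_background_then_continue` is a hypothetical helper.

```python
import subprocess
import time


def run_in_background_then_continue(args, grace=5.0):
    """Start `args` without waiting for it, pause `grace` seconds, return.

    The caller continues even if the command never finishes, which is the
    point: a hung `sync` cannot block the reboot that follows.
    """
    proc = subprocess.Popen(args)
    time.sleep(grace)
    return proc
```

The cost is the one the commit message accepts: the pause happens even when the command returns immediately.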
Alfredo Deza [Fri, 6 Dec 2013 14:18:14 +0000 (06:18 -0800)]
Merge pull request #157 from ceph/wip-watchdog
Implement a watchdog for queued jobs
Zack Cerza [Thu, 5 Dec 2013 23:37:25 +0000 (17:37 -0600)]
Implement a watchdog for queued jobs
This continually posts the run's status to the results server, if
configured, at an interval defaulting to 600 seconds.
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
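A periodic reporter like the one described above can be sketched with a `threading.Event` as the stop signal. The 600-second default comes from the commit message; the names `watchdog`, `report`, and `stop` are hypothetical, and the real implementation may be structured differently.

```python
import threading


def watchdog(report, stop, interval=600):
    """Call report() repeatedly, waiting `interval` seconds between calls.

    `stop` is a threading.Event; setting it ends the loop promptly because
    Event.wait() returns as soon as the event is set.
    """
    while not stop.is_set():
        report()
        stop.wait(interval)
```

Using `Event.wait` rather than `time.sleep` lets the loop exit without waiting out the full interval.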
Warren Usui [Thu, 5 Dec 2013 01:49:21 +0000 (17:49 -0800)]
A create_if_vm call was made more than once when a lock-many style lock
was performed. This caused downburst to run twice, and the second
downburst fails as a result of the first downburst running.
Fixes: 6933
Warren Usui [Thu, 5 Dec 2013 01:36:14 +0000 (17:36 -0800)]
Merge branch 'teuthology-fix-downburst-yaml-wusui'
Warren Usui [Mon, 2 Dec 2013 22:37:12 +0000 (14:37 -0800)]
Implement --downburst-conf parameter for teuthology-lock.
Load the appropriate yaml information when found (this formerly
did not work). Make sure teuthology --lock works with a downburst
entry in the yaml files. Document how this works in README.rst.
Fixes: #6921
Reviewed-by: Dan Mick
Warren Usui [Wed, 4 Dec 2013 02:16:04 +0000 (18:16 -0800)]
Added docstrings. Cleaned up code (broke up long lines, removed unused
variable references, pep8 formatted most of the code (one set of long lines
remains), and changed some variable and method names to conform to pylint
standards).
Fixes: 6530
Josh Durgin [Wed, 4 Dec 2013 01:31:45 +0000 (17:31 -0800)]
rbd: make default size larger for xfstests
Test 167 runs out of space on newer kernels
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Warren Usui [Tue, 26 Nov 2013 04:56:24 +0000 (20:56 -0800)]
Merge branch 'wip-fix-teuth-tgt-wusui'
Warren Usui [Fri, 15 Nov 2013 04:24:38 +0000 (20:24 -0800)]
tgt and iscsi code need some minor fixes. Moved the settle call during
simple read testing. In iscsi.py, generic_mkfs and generic_mount need
to be called from the main body of the task. An extraneous iscsiadm
command was removed. The tgt size is now not hard-coded. It is extracted
from the property and defaults to 10240.
Fixes: #6782
Zack Cerza [Mon, 25 Nov 2013 23:31:34 +0000 (15:31 -0800)]
Merge pull request #154 from ceph/wip-multi-mtype
Wip multi mtype
Sandon Van Ness [Mon, 25 Nov 2013 09:19:13 +0000 (01:19 -0800)]
Changes suggested per review.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Zack Cerza [Fri, 22 Nov 2013 23:03:29 +0000 (17:03 -0600)]
Also catch httplib2.ServerNotFoundError
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Dan Mick [Fri, 22 Nov 2013 06:04:19 +0000 (22:04 -0800)]
internal.py: nitty little spelling error
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Sandon Van Ness [Thu, 21 Nov 2013 23:21:19 +0000 (15:21 -0800)]
Schedule-suite Use 'multi' tube for multiple types. Scheduling.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sandon Van Ness [Thu, 21 Nov 2013 22:19:44 +0000 (14:19 -0800)]
Allow ability to use multi machine type delimited by ,- \t.
I was originally attempting a more complicated locking mechanism
but I think it's almost as good to just have it attempt the other
machine type if one.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
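Splitting on the delimiters listed above (comma, hyphen, space, tab) might look like the following. `parse_machine_types` is a hypothetical helper, and the regular expression is one possible reading of the commit title rather than the code that was committed.

```python
import re


def parse_machine_types(spec):
    """Split a machine-type string on commas, hyphens, spaces, and tabs."""
    return [part for part in re.split(r"[,\- \t]+", spec) if part]
```

Filtering out empty parts makes leading, trailing, or doubled delimiters harmless.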
Zack Cerza [Thu, 21 Nov 2013 19:56:41 +0000 (13:56 -0600)]
Skip cluster() if use_existing_cluster is True
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
SandonV [Thu, 21 Nov 2013 02:03:04 +0000 (18:03 -0800)]
Merge pull request #153 from ceph/wip-6790
Reviewed by Warren.
Sandon Van Ness [Thu, 21 Nov 2013 00:37:31 +0000 (16:37 -0800)]
Use shortened version in order to avoid revision/arch mishaps.
Sometimes -X is added to package names which does not exist in the
/version file. Simply using the version string does not work on
RHEL (it does on centos). Until version and the packages match
identically we instead will just split the version at the - and
no longer specify the dist for better reliability but slightly
lower accuracy.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
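Splitting the version at the `-`, as described above, can be sketched in one line. `short_version` is a hypothetical name, and the version string in the usage note is a made-up example of the format.

```python
def short_version(version):
    """Return the part of a package version string before the first '-'."""
    return version.split("-", 1)[0]
```

For instance, `short_version("0.72.1-1.el6")` yields `"0.72.1"`, dropping the revision and dist suffix.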
Zack Cerza [Wed, 20 Nov 2013 22:23:07 +0000 (16:23 -0600)]
Add optional 'use_existing_cluster' flag
If this flag is present, skip a few unnecessary steps
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Sandon Van Ness [Fri, 15 Nov 2013 05:47:41 +0000 (21:47 -0800)]
Fix ceph.repo so it uses URI value.
Basically some weird cases where ceph-releases would be pointing
to the wrong branch/build when two branches had the same sha1.
This fixes that.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Samuel Just [Thu, 14 Nov 2013 22:01:51 +0000 (14:01 -0800)]
ceph_manager: provide unique pool names to avoid collision
Fixes: #6769
Signed-off-by: Samuel Just <sam.just@inktank.com>
Alfredo Deza [Thu, 14 Nov 2013 13:46:28 +0000 (05:46 -0800)]
Merge pull request #152 from dachary/master
add git clone to installation instructions
Loic Dachary [Thu, 14 Nov 2013 13:12:35 +0000 (14:12 +0100)]
add git clone to installation instructions
Signed-off-by: Loic Dachary <loic@dachary.org>
Josh Durgin [Wed, 13 Nov 2013 23:26:37 +0000 (15:26 -0800)]
syslog: ignore perf nmi handler timeout
This seems to have started appearing in recent 3.12+ kernels
with perf enabled.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Zack Cerza [Tue, 12 Nov 2013 23:07:15 +0000 (17:07 -0600)]
Make report_job() always return an int
Sandon Van Ness [Tue, 12 Nov 2013 21:04:00 +0000 (13:04 -0800)]
Add some debug logging.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sandon Van Ness [Sat, 9 Nov 2013 01:00:27 +0000 (17:00 -0800)]
For saya (arm) use arm gitbuilder for ceph sha1.
Since the arm gitbuilder (even using a ton of nodes for distcc) is
much slower than x86, let's grab the sha1 from its own gitbuilder
when machine_type is saya rather than the x86 one.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sandon Van Ness [Fri, 8 Nov 2013 22:35:51 +0000 (14:35 -0800)]
Distro kernel bug-fixes.
Fixed some things that were being done incorrectly.
Some distro kernels have no debug so added | true when disabling
kdb. Also changed what was skipping kernels if non-ubuntu to also
schedule kernel install if a distro kernel.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Zack Cerza [Fri, 8 Nov 2013 20:24:42 +0000 (12:24 -0800)]
Merge pull request #146 from ceph/wip-os-type
Wip os type
Sandon Van Ness [Fri, 8 Nov 2013 19:02:48 +0000 (11:02 -0800)]
Consolidate two excepts into one.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>