git.apps.os.sepia.ceph.com Git - teuthology.git/log
Zack Cerza [Thu, 30 Mar 2017 21:30:23 +0000 (15:30 -0600)]
Merge pull request #1055 from tchaikov/wip-workunit-to-use-suite-branch
suite: use "suite_branch" for workunit if available
Kefu Chai [Thu, 30 Mar 2017 12:40:28 +0000 (20:40 +0800)]
suite: use "suite_branch" for workunit if available
"git clone" allows us to clone a single branch, but not a sha1, so
prefer "branch" if it's available.
Signed-off-by: Kefu Chai <kchai@redhat.com>
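The preference described in this commit can be sketched as follows. This is a minimal illustration, not the teuthology code: the function name `clone_commands` and its parameters are hypothetical, and it only builds argument lists using standard `git` options (`--branch`, `--single-branch`, `--depth`) instead of running them.

```python
def clone_commands(repo_url, dest, branch=None, sha1=None):
    """Return the git commands (as argv lists) needed to fetch the code.

    Hypothetical helper: prefers a shallow single-branch clone when a
    branch name is known (ignoring sha1 in that case), and falls back to
    a full clone plus checkout when only a sha1 is available.
    """
    if branch:
        # "git clone" can restrict itself to one named branch.
        return [["git", "clone", "--depth", "1", "--single-branch",
                 "--branch", branch, repo_url, dest]]
    commands = [["git", "clone", repo_url, dest]]
    if sha1:
        # A sha1 cannot be passed to "git clone", so check it out afterwards.
        commands.append(["git", "-C", dest, "checkout", sha1])
    return commands
```

Each returned list could be handed to `subprocess.check_call()`; keeping command construction separate from execution makes the branch-versus-sha1 choice easy to test.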
Zack Cerza [Thu, 23 Mar 2017 18:14:17 +0000 (12:14 -0600)]
Merge pull request #1053 from ceph/wip-ovh-destroy
cloud.openstack: Destroy multiple nodes
Zack Cerza [Wed, 22 Mar 2017 16:33:33 +0000 (10:33 -0600)]
cloud.openstack: Destroy multiple nodes
In the very rare case where more than one node exists with the same name
(this can happen in case of an outage), destroy all nodes that match
instead of failing altogether.
Signed-off-by: Zack Cerza <zack@redhat.com>
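A rough sketch of the "destroy every match" idea, assuming nodes are plain dictionaries with a "name" key. The helper names and the `destroy_one` callable are hypothetical and stand in for whatever the cloud driver provides; this is not the cloud.openstack implementation.

```python
def find_nodes(nodes, name):
    """Return every node whose name matches (there may be more than one)."""
    return [node for node in nodes if node["name"] == name]


def destroy_matching(nodes, name, destroy_one):
    """Destroy all nodes called `name`; True only if every destroy succeeded.

    `destroy_one` is a hypothetical callable returning True on success.
    With no matches this returns True, which is a choice made for this
    sketch and not something the commit message specifies.
    """
    results = [destroy_one(node) for node in find_nodes(nodes, name)]
    return all(results)
```

Building the full result list before calling `all()` ensures that one failed destroy does not stop the remaining duplicates from being attempted.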
Zack Cerza [Wed, 22 Mar 2017 16:28:25 +0000 (10:28 -0600)]
cloud.openstack: Split out node-finding
... into its own method.
Signed-off-by: Zack Cerza <zack@redhat.com>
Sage Weil [Tue, 21 Mar 2017 19:45:39 +0000 (14:45 -0500)]
Merge pull request #1052 from ceph/wip-mgr
ceph.conf: debug mgr by default
Sage Weil [Tue, 21 Mar 2017 19:44:27 +0000 (14:44 -0500)]
ceph.conf: debug mgr by default
Signed-off-by: Sage Weil <sage@redhat.com>
Zack Cerza [Tue, 21 Mar 2017 17:12:15 +0000 (11:12 -0600)]
Merge pull request #1007 from ceph/wip-rh-internal-task
Add redhat internal tasks that set up repo
Vasu Kulkarni [Mon, 20 Mar 2017 16:55:19 +0000 (09:55 -0700)]
run redhat internal tasks based on config item
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
Vasu Kulkarni [Mon, 20 Mar 2017 16:52:33 +0000 (09:52 -0700)]
Always run tests with latest kernel on RHEL
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
Vasu Kulkarni [Mon, 20 Mar 2017 16:48:10 +0000 (09:48 -0700)]
Add redhat internal task to set up repos
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
Zack Cerza [Fri, 17 Mar 2017 20:00:48 +0000 (14:00 -0600)]
Merge pull request #1047 from ceph/wip-daemon-register
orchestra/daemon: separate register_daemon from add_daemon
Sage Weil [Fri, 17 Mar 2017 19:46:22 +0000 (15:46 -0400)]
orchestra/daemon: separate register_daemon from add_daemon
The former registers without starting; the latter does both (as before).
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 17 Mar 2017 18:59:39 +0000 (13:59 -0500)]
Merge pull request #1050 from ceph/wip-pg-remap
ceph.conf: mon osd allow pg remap = true
Sage Weil [Fri, 17 Mar 2017 18:58:50 +0000 (14:58 -0400)]
ceph.conf: mon osd allow pg remap = true
Signed-off-by: Sage Weil <sage@redhat.com>
Dan Mick [Thu, 16 Mar 2017 17:39:53 +0000 (10:39 -0700)]
Merge pull request #1045 from ceph/wip-daemon
Fix a couple bugs re: remote daemon execution
Reviewed-by: Dan Mick <dmick@redhat.com>
David Galloway [Thu, 16 Mar 2017 00:19:44 +0000 (20:19 -0400)]
Merge pull request #1048 from dmick/master
setup.py: disallow requests 2.13.0
Dan Mick [Thu, 16 Mar 2017 00:01:03 +0000 (17:01 -0700)]
setup.py: disallow requests 2.13.0
python-cinderclient and/or keystoneauth1 apparently don't want
requests 2.13.0, so make it illegal for now
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Zack Cerza [Fri, 10 Mar 2017 23:14:30 +0000 (16:14 -0700)]
orchestra.run: More consistently notice failures
check_status=True in combination with wait=False resulted in
CommandFailedError never being raised if poll() was used instead of
wait(). Fix that.
Signed-off-by: Zack Cerza <zack@redhat.com>
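The fix described here amounts to routing both poll() and wait() through the same status check. The sketch below illustrates that shape with a stand-in class; `RemoteProcessSketch`, its constructor arguments, and the local `CommandFailedError` are assumptions made for this example and are not the orchestra.run classes.

```python
import time


class CommandFailedError(Exception):
    """Stand-in for the error raised when a command exits non-zero."""


class RemoteProcessSketch(object):
    def __init__(self, get_exit_status, check_status=True):
        # get_exit_status() returns None while the command is still running.
        self._get_exit_status = get_exit_status
        self.check_status = check_status
        self.returncode = None

    def _finish(self, status):
        # Shared by poll() and wait() so both notice failures the same way.
        self.returncode = status
        if self.check_status and status != 0:
            raise CommandFailedError("exit status %s" % status)
        return status

    def poll(self):
        status = self._get_exit_status()
        return None if status is None else self._finish(status)

    def wait(self, interval=0.01):
        while True:
            status = self._get_exit_status()
            if status is not None:
                return self._finish(status)
            time.sleep(interval)
```

With this layout a caller that only ever polls still gets the exception once the command has exited with a non-zero status.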
Zack Cerza [Fri, 10 Mar 2017 16:41:46 +0000 (09:41 -0700)]
orchestra.run: Notice when short-lived procs exit
If a command exits immediately, there is a race between greenlet
completion (which flushes the ChannelFile buffers) and the call to
exit_status_ready(). Waiting for 0.1s on the greenlets removes the race
condition.
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Thu, 9 Mar 2017 22:27:31 +0000 (15:27 -0700)]
orchestra.run: Fix bug in RemoteProcess.finished
Previously, if wait() had not been called, we didn't have the returncode
attribute set. Set it from within the finished property if it isn't
already.
Signed-off-by: Zack Cerza <zack@redhat.com>
Dan Mick [Wed, 8 Mar 2017 23:02:59 +0000 (15:02 -0800)]
Merge pull request #1044 from ceph/wip-vps-docs
docs: Update libvirt conf to match current vps_hosts list
Reviewed-by: Dan Mick <dmick@redhat.com>
David Galloway [Wed, 8 Mar 2017 20:07:41 +0000 (15:07 -0500)]
docs: Update libvirt conf to match current vps_hosts list
Signed-off-by: David Galloway <dgallowa@redhat.com>
Zack Cerza [Tue, 7 Mar 2017 16:50:26 +0000 (09:50 -0700)]
Merge pull request #1043 from ceph/wip-fix-1041
provision.downburst: Fix a bug in PR #1041
Zack Cerza [Tue, 7 Mar 2017 16:18:56 +0000 (09:18 -0700)]
provision.downburst: Fix a bug in PR #1041
Looks like an argument was dropped to a string format().
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Tue, 7 Mar 2017 00:57:30 +0000 (17:57 -0700)]
Merge pull request #1042 from dmick/wip-downburst-logging
provision: ctx does not always have 'config' attribute
Dan Mick [Tue, 7 Mar 2017 00:14:30 +0000 (16:14 -0800)]
provision: ctx does not always have 'config' attribute
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Zack Cerza [Tue, 7 Mar 2017 00:05:22 +0000 (17:05 -0700)]
Merge pull request #1041 from dmick/wip-downburst-logging
Invoke downburst with logging
Dan Mick [Mon, 6 Mar 2017 23:41:36 +0000 (15:41 -0800)]
downburst: always log output and error, check returncode for failure
The more info the better; always log everything about the downburst
execution to the teuthology log. Check for command failure by
checking for returncode != 0 rather than "presence of stderr", since
logging always happens to stderr.
Signed-off-by: Dan Mick <dan.mick@redhat.com>
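As an illustration of why the return code is the better failure signal, the sketch below logs everything a child process prints and decides success purely from `returncode`. The function name `run_and_check` is hypothetical, and it is written against the standard `subprocess` module, not teuthology's own wrappers.

```python
import logging
import subprocess
import sys

log = logging.getLogger(__name__)


def run_and_check(args):
    """Run a command, log all of its output, and report success by exit code."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()
    if out:
        log.info("stdout: %s", out.decode(errors="replace"))
    if err:
        # Output on stderr is logged but is not treated as a failure.
        log.info("stderr: %s", err.decode(errors="replace"))
    return proc.returncode == 0


if __name__ == "__main__":
    # A child that writes to stderr but exits 0 still counts as success.
    chatty = [sys.executable, "-c", "import sys; sys.stderr.write('log line')"]
    print(run_and_check(chatty))
```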
Dan Mick [Mon, 6 Mar 2017 23:40:05 +0000 (15:40 -0800)]
provision: invoke downburst with -v and --logfile
Verbose output isn't voluminous enough to be a problem, and can be
helpful in tracking down weirdness. Also, log to a private file in case
downburst hangs mid-operation, to avoid having to do any
select() madness in teuthology.
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Dan Mick [Mon, 6 Mar 2017 21:43:12 +0000 (13:43 -0800)]
Merge pull request #1040 from ceph/wip-downburst-unlock
provision.downburst: Tweak destroy return code
Reviewed-by: Dan Mick <dmick@redhat.com>
Zack Cerza [Mon, 6 Mar 2017 19:35:27 +0000 (19:35 +0000)]
provision.downburst: Tweak destroy return code
Specifically, if the instance doesn't even exist, consider the destroy
op to have succeeded.
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 6 Mar 2017 16:19:28 +0000 (09:19 -0700)]
Merge pull request #1035 from ceph/wip-f25
bootstrap: Update required package names for Fedora 25
Zack Cerza [Fri, 3 Mar 2017 21:38:21 +0000 (14:38 -0700)]
Merge pull request #1039 from dmick/wip-clock
Include missing pieces from ntpd clock task change
Dan Mick [Mon, 30 Jan 2017 21:25:57 +0000 (13:25 -0800)]
ntpdc is deprecated; use ntpq (in the ntp package everywhere)
See https://www.eecis.udel.edu/~mills/ntp/html/ntpdc.html
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Dan Mick [Fri, 13 Jan 2017 22:49:05 +0000 (14:49 -0800)]
clock: handle different service names
Restarting ntpd involves two different service names (thanks
again, distro maintainers, much appreciated)
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Zack Cerza [Fri, 3 Mar 2017 18:10:46 +0000 (11:10 -0700)]
Merge pull request #1038 from tchaikov/wip-yaml-with-repo
docs/detailed_test_config.rst: fix the sample yaml
Kefu Chai [Fri, 3 Mar 2017 10:54:22 +0000 (18:54 +0800)]
docs/detailed_test_config.rst: fix the sample yaml
so it is able to work with latest teuthology
Signed-off-by: Kefu Chai <kchai@redhat.com>
Dan Mick [Wed, 1 Mar 2017 23:43:57 +0000 (15:43 -0800)]
Merge pull request #1037 from ceph/wip-cloud-volume-retry
cloud.openstack: Also retry on BaseHTTPError
Zack Cerza [Wed, 1 Mar 2017 23:20:35 +0000 (16:20 -0700)]
cloud.openstack: Also retry on BaseHTTPError
We attach volumes immediately after creating them; sometimes they are
still momentarily in the 'creating' state, causing the attach call to
throw a BaseHTTPError. When that happens, simply retry the request
instead of failing node creation, starting the entire cycle all over
again.
Signed-off-by: Zack Cerza <zack@redhat.com>
David Galloway [Thu, 16 Feb 2017 22:04:26 +0000 (17:04 -0500)]
bootstrap: Update required package names for Fedora 25
Signed-off-by: David Galloway <dgallowa@redhat.com>
Dan Mick [Tue, 28 Feb 2017 20:55:54 +0000 (12:55 -0800)]
Merge pull request #1012 from ceph/wip-libcloud
libcloud provisioning backend with OpenStack support
Reviewed-by: Dan Mick <dmick@redhat.com>
Zack Cerza [Fri, 24 Feb 2017 20:56:46 +0000 (13:56 -0700)]
Merge pull request #1034 from ceph/wip-nuke-nfsganesha
nuke: Remove nfs-ganesha repos
David Galloway [Fri, 24 Feb 2017 19:01:12 +0000 (14:01 -0500)]
nuke: Remove nfs-ganesha repos
Fixes: http://tracker.ceph.com/issues/18974
Signed-off-by: David Galloway <dgallowa@redhat.com>
Zack Cerza [Tue, 21 Feb 2017 20:17:43 +0000 (13:17 -0700)]
cloud.openstack: Exclude windows-specific sizes
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 10 Feb 2017 21:32:01 +0000 (14:32 -0700)]
nuke: Do not log the namespace object
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 10 Feb 2017 17:25:53 +0000 (10:25 -0700)]
cloud.openstack: Cache authentication tokens
Constantly causing Keystone to regenerate auth tokens was the cause of
our hitting rate limits during testing. This will let us reuse auth
tokens - including across processes - to avoid hitting those limits.
Signed-off-by: Zack Cerza <zack@redhat.com>
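One way to share a token across processes is a small cache file guarded by an exclusive lock. The sketch below assumes a Unix host (it uses `fcntl.flock`), a JSON cache file, and a caller-supplied `fetch_token` callable returning `(token, expires_at)`; all of these names and the file format are assumptions for illustration and do not describe how cloud.util.AuthToken is actually written.

```python
import fcntl
import json
import os
import time


def get_cached_token(path, fetch_token, now=time.time):
    """Return a cached token if still valid, else fetch and store a new one."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as f:
        # Only one process at a time may read or refresh the cache.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            raw = f.read()
            if raw:
                data = json.loads(raw)
                if data["expires_at"] > now():
                    return data["token"]
            token, expires_at = fetch_token()
            # Overwrite the old contents with the fresh token.
            f.seek(0)
            f.truncate()
            json.dump({"token": token, "expires_at": expires_at}, f)
            f.flush()
            return token
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Because the lock is held while the token is fetched, processes that start at the same moment wait for the first one and then reuse its token, so they do not each request their own.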
Zack Cerza [Thu, 9 Feb 2017 20:50:20 +0000 (13:50 -0700)]
Add cloud.util.AuthToken
This provides a mechanism for caching OpenStack authentication tokens
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Thu, 9 Feb 2017 20:53:57 +0000 (13:53 -0700)]
Move FileLock out of repo_utils
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Thu, 9 Feb 2017 00:13:51 +0000 (17:13 -0700)]
cloud: Volume creation fails node creation
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 6 Feb 2017 21:16:13 +0000 (14:16 -0700)]
cloud: Retry failed requests in libcloud
It's common to see "429 Rate limit exceeded", at least with OVH. When we
encounter the exception associated with that error, back off and
retry for an interval before eventually giving up.
Signed-off-by: Zack Cerza <zack@redhat.com>
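The back-off-and-retry pattern can be sketched generically as below. `RateLimitError`, `retry_with_backoff`, and the delay parameters are hypothetical names chosen for this example; the real code would catch whichever exception libcloud raises for a 429 response.

```python
import time


class RateLimitError(Exception):
    """Stand-in for the exception raised on '429 Rate limit exceeded'."""


def retry_with_backoff(func, exceptions=(RateLimitError,),
                       initial_delay=1.0, max_total=60.0, sleep=time.sleep):
    """Call func(), doubling the pause after each failure.

    Gives up (re-raising the last error) once the next pause would push
    the total time slept past max_total seconds.
    """
    delay = initial_delay
    waited = 0.0
    while True:
        try:
            return func()
        except exceptions:
            if waited + delay > max_total:
                raise
            sleep(delay)
            waited += delay
            delay *= 2
```

Passing `sleep` as a parameter is a testing convenience: a test can record the requested pauses without actually waiting.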
Zack Cerza [Fri, 20 Jan 2017 20:00:38 +0000 (13:00 -0700)]
updatekeys: Fix argument parsing
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Wed, 18 Jan 2017 21:03:03 +0000 (14:03 -0700)]
.gitignore: .idea
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Tue, 10 Jan 2017 19:47:21 +0000 (12:47 -0700)]
Don't unlock VMs when destroy fails
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 6 Jan 2017 20:47:59 +0000 (13:47 -0700)]
Update SSH keys when creating VMs
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 6 Jan 2017 20:01:42 +0000 (13:01 -0700)]
Handle VMs missing consoles safely
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 6 Jan 2017 00:04:11 +0000 (17:04 -0700)]
Make teuthology.lock a subpackage
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Thu, 5 Jan 2017 20:24:11 +0000 (13:24 -0700)]
More intelligently implement is_vm()
Move to using one decent implementation instead of multiple (some naive)
implementations
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Wed, 4 Jan 2017 17:22:44 +0000 (10:22 -0700)]
Locking changes needed for libcloud
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Thu, 8 Dec 2016 03:30:57 +0000 (20:30 -0700)]
Add libcloud backend
Initially this supports OpenStack but will grow to support other methods
of cloud-like deployment. Some assumptions are made regarding supporting
infrastructure (FIXME document these)
Signed-off-by: Zack Cerza <zack@redhat.com>
Dan Mick [Fri, 24 Feb 2017 01:25:20 +0000 (17:25 -0800)]
Merge pull request #1033 from ceph/wip-wait-osd-timeout
misc.wait_until_osds_up(): timeout after 5min
Zack Cerza [Fri, 24 Feb 2017 00:36:05 +0000 (17:36 -0700)]
misc.wait_until_osds_up(): timeout after 5min
It doesn't make any sense to wait more than a few minutes for OSDs to
come up. If they take more than five minutes, fail the job.
Signed-off-by: Zack Cerza <zack@redhat.com>
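A bounded wait of this kind can be sketched as a polling loop with a deadline. The helper name `wait_until`, the polling interval, and the use of `RuntimeError` are assumptions for this example; only the five-minute limit comes from the commit message.

```python
import time


def wait_until(condition, timeout=300, interval=3,
               sleep=time.sleep, clock=time.monotonic):
    """Poll condition() until it is true, failing after `timeout` seconds."""
    deadline = clock() + timeout
    while not condition():
        if clock() >= deadline:
            # Fail the caller instead of waiting forever.
            raise RuntimeError("condition not met after %s seconds" % timeout)
        sleep(interval)
```

In a real job the `condition` callable would check whether all OSDs report as up; here it is left abstract.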
Zack Cerza [Thu, 23 Feb 2017 23:34:36 +0000 (16:34 -0700)]
Merge pull request #1030 from tchaikov/wip-suite-sha1-for-workunit
suite: use "suite_hash" as the default sha1 for workunit
Zack Cerza [Thu, 23 Feb 2017 17:17:35 +0000 (10:17 -0700)]
Merge pull request #1032 from ceph/wip-nuke-samba
nuke: Remove samba repos
David Galloway [Thu, 23 Feb 2017 16:21:47 +0000 (11:21 -0500)]
nuke: Remove samba repos
Fixes: http://tracker.ceph.com/issues/19061
Signed-off-by: David Galloway <dgallowa@redhat.com>
Ilya Dryomov [Fri, 17 Feb 2017 11:56:20 +0000 (12:56 +0100)]
run: allow using alternate suite repo
Do the same thing we do for ceph repo to make ceph.git commit
1f82b9b9446d ("qa/tasks/workunit: use the suite repo for cloning
workunit") work for scheduled jobs.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Kefu Chai [Thu, 16 Feb 2017 06:59:21 +0000 (14:59 +0800)]
suite: use "suite_hash" as the default sha1 for workunit
As "workunits" reside in ceph/qa/workunits, it's more intuitive to
respect the suite-branch option when cloning workunits.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Zack Cerza [Fri, 10 Feb 2017 20:43:36 +0000 (13:43 -0700)]
Merge pull request #1027 from ceph/wip-ceph-ansible-boot
ceph_ansible: update pip and ansible versions
Zack Cerza [Thu, 9 Feb 2017 23:08:21 +0000 (16:08 -0700)]
Merge pull request #1013 from ceph/wip-clock
Change to use task 'clock' instead of 'clock.check'
Vasu Kulkarni [Thu, 9 Feb 2017 00:16:55 +0000 (16:16 -0800)]
update default ansible version to 2.2.1
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
Vasu Kulkarni [Thu, 9 Feb 2017 00:15:38 +0000 (16:15 -0800)]
upgrade pip first before setuptools
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
Zack Cerza [Wed, 8 Feb 2017 22:11:04 +0000 (15:11 -0700)]
Merge pull request #1018 from jan--f/wip-bootstrap-tumbleweed
bootstrap: add support for opensuse tumbleweed.
Zack Cerza [Wed, 8 Feb 2017 22:10:22 +0000 (15:10 -0700)]
Merge pull request #1021 from ceph/wip-suite-default
teuthology-suite: Drop default for --ceph
Zack Cerza [Wed, 8 Feb 2017 22:00:19 +0000 (15:00 -0700)]
Merge pull request #1004 from SUSE/wip-17981
nuke: Use pkill -KILL to unconditionally wipe out hadoop processes
Zack Cerza [Tue, 7 Feb 2017 20:22:40 +0000 (13:22 -0700)]
Merge pull request #1026 from dmick/master
prune: use the shortest time of -p or -r to decide on processing
Dan Mick [Tue, 7 Feb 2017 20:07:47 +0000 (12:07 -0800)]
prune: use the shortest time of -p or -r to decide on processing
Previously, invoking -p 7 -r 30 only removed passed jobs that were 30 days or older.
Signed-off-by: Dan Mick <dan.mick@redhat.com>
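The decision rule can be illustrated with a small predicate: a run is worth examining once it is older than the smaller of the two thresholds. The function and parameter names are hypothetical and the sketch says nothing about what each option goes on to remove.

```python
def should_process(age_days, p_days=None, r_days=None):
    """Decide whether a run is old enough to be examined at all.

    p_days and r_days mirror the values given to -p and -r; either may
    be None when the option was not supplied.
    """
    thresholds = [d for d in (p_days, r_days) if d is not None]
    if not thresholds:
        return False
    # Use the shortest threshold so the stricter option is not ignored.
    return age_days >= min(thresholds)
```

With `p_days=7` and `r_days=30`, a 10-day-old run is processed, which is the case the commit message describes as previously being skipped.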
Zack Cerza [Tue, 7 Feb 2017 17:06:09 +0000 (10:06 -0700)]
Merge pull request #1025 from ceph/wip-console-force-into-spy
console: force existing connections into spy mode if !readonly
Ilya Dryomov [Tue, 7 Feb 2017 09:55:45 +0000 (10:55 +0100)]
console: force existing connections into spy mode if !readonly
If someone watching the console didn't think of using "console -s", we
end up power cycling the node in an attempt to get the login prompt.
This is futile -- if the watcher is still there after the node comes
back up, our connection will get dropped to spy mode again.
Use -f to temporarily force existing connections into spy mode when we
attach to save a power cycle.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Zack Cerza [Fri, 3 Feb 2017 20:55:48 +0000 (13:55 -0700)]
Merge pull request #1024 from ceph/wip-kclient-nuke-nosync
nuke: work around trouble with reboot -n
Ilya Dryomov [Fri, 3 Feb 2017 10:21:02 +0000 (11:21 +0100)]
nuke: work around trouble with reboot -n
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Ilya Dryomov [Fri, 3 Feb 2017 09:59:43 +0000 (10:59 +0100)]
nuke: improve stale_kernel_mount() check
Commit 7db9e8b76fd5 ("nuke: bring stale kernel client handling back")
resurrected the check that was removed in commit 1d47a121b385 ("Fix
nuke, redo some cleanup functions"). It isn't sufficient though -- for
example, if a workunit already issued a umount, /etc/mtab won't have
a '^/dev/rbd' entry.
debugfs is enabled and mounted on all distros we care about.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Zack Cerza [Wed, 1 Feb 2017 23:58:59 +0000 (16:58 -0700)]
Merge pull request #1022 from ceph/wip-unbreak-kclient-nuke
nuke: bring stale kernel client handling back
Ilya Dryomov [Wed, 1 Feb 2017 19:37:49 +0000 (20:37 +0100)]
nuke: drop remove_kernel_mounts()
Calling remove_kernel_mounts() after reboot() is pretty useless. Also,
as explained in the previous commit, there isn't much we can do in the
krbd case, so just drop it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Ilya Dryomov [Wed, 1 Feb 2017 19:37:49 +0000 (20:37 +0100)]
nuke: bring stale kernel client handling back
Commit 1d47a121b385 ("Fix nuke, redo some cleanup functions") broke
stale kernel client map/mount handling by dropping reboot arguments.
While for kcephfs we can use 'umount -f' to avoid sync (it used to not
work, but is mostly fixed now, I believe), currently there is nothing
we can do for a local filesystem mounted on top of krbd.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Zack Cerza [Wed, 25 Jan 2017 22:20:43 +0000 (15:20 -0700)]
teuthology-suite: Drop default for --ceph
Signed-off-by: Zack Cerza <zack@redhat.com>
Dan Mick [Wed, 25 Jan 2017 21:23:10 +0000 (13:23 -0800)]
Merge pull request #1020 from ceph/wip-waitpid
Tell gevent not to patch os.waitpid()
Reviewed-by: Dan Mick <dmick@redhat.com>
Zack Cerza [Wed, 25 Jan 2017 21:08:35 +0000 (14:08 -0700)]
Tell gevent not to patch os.waitpid()
Signed-off-by: Zack Cerza <zack@redhat.com>
Jan Fajerski [Wed, 25 Jan 2017 13:38:39 +0000 (14:38 +0100)]
bootstrap: add support for opensuse tumbleweed.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Dan Mick [Tue, 24 Jan 2017 19:27:17 +0000 (11:27 -0800)]
Merge pull request #1017 from ceph/wip-manhole
Use manhole to provide a way to debug hung jobs
Reviewed-by: Dan Mick <dmick@redhat.com>
Zack Cerza [Tue, 24 Jan 2017 18:56:49 +0000 (11:56 -0700)]
Use manhole to provide a way to debug hung jobs
https://pypi.python.org/pypi/manhole
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Tue, 24 Jan 2017 16:47:29 +0000 (09:47 -0700)]
Merge pull request #1014 from jcsp/wip-18594
pcp: use a timeout when downloading graphite graphs
John Spray [Thu, 19 Jan 2017 13:47:59 +0000 (13:47 +0000)]
pcp: use a timeout when downloading graphite graphs
Fixes: http://tracker.ceph.com/issues/18597
Signed-off-by: John Spray <john.spray@redhat.com>
Zack Cerza [Tue, 17 Jan 2017 17:41:41 +0000 (10:41 -0700)]
Merge pull request #1011 from ceph/wip-kernel-kill-vsplitter
kernel: be more flexible about sha1 matching
Ilya Dryomov [Tue, 17 Jan 2017 14:03:46 +0000 (15:03 +0100)]
kernel: be more flexible about sha1 matching
Some rpm scripts don't allow dashes in the Release field, so let's
accept both -g and _g. Kill _vsplitter() as Calxeda is no more.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
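Accepting both separators comes down to a one-character class in the pattern. The sketch below is an assumption-laden illustration: the function name, the minimum hash length, and the sample version strings are invented for the example and are not taken from the kernel task.

```python
import re

# Match "-g<hex>" or "_g<hex>"; at least 7 hex digits is an assumption.
SHA1_RE = re.compile(r"[-_]g([0-9a-f]{7,40})")


def extract_sha1(version):
    """Return the abbreviated sha1 embedded in a package version, or None."""
    match = SHA1_RE.search(version)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical version strings, one per separator style.
    print(extract_sha1("4.10.0-rc1-ceph-g1a2b3c4d5e6f"))
    print(extract_sha1("4.10.0_rc1_ceph_g1a2b3c4d5e6f"))
```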
Dan Mick [Tue, 17 Jan 2017 00:11:30 +0000 (16:11 -0800)]
clock: Fix clock's use of Remote.run()
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Dan Mick [Fri, 13 Jan 2017 22:49:05 +0000 (14:49 -0800)]
Use clock by default (instead of clock.check)
We're seeing clocks desynchronized. My theory is that this might
be because ntp can take five minutes or more to actually sync the
clocks, and clock.check doesn't do any setting of the clocks, just
reporting. clock, OTOH, stops ntpd and does an ntpdate, and then
restarts ntpd, which should kickstart it with a much-closer-to-correct
time.
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Dan Mick [Tue, 17 Jan 2017 00:19:41 +0000 (16:19 -0800)]
Merge pull request #1009 from ceph/wip-archive
Avoid a race condition with job archive creation
Reviewed-by: Dan Mick <dmick@redhat.com>
Zack Cerza [Mon, 16 Jan 2017 23:39:49 +0000 (16:39 -0700)]
Fix a conflict with openstack requirements
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 16 Jan 2017 23:16:41 +0000 (16:16 -0700)]
worker: Create job archive directories
... not just run archive directories. This is to resolve a race
condition between the job creating its archive directory and the worker
symlinking its log into that directory.
Signed-off-by: Zack Cerza <zack@redhat.com>
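When two processes may both try to create the same directory, the usual remedy is to attempt the creation and tolerate "already exists". A minimal sketch, with the hypothetical helper name `ensure_dir`:

```python
import errno
import os


def ensure_dir(path):
    """Create path (and parents) without failing if it already exists."""
    try:
        os.makedirs(path)
    except OSError as e:
        # Another process may have created it first; that is fine as long
        # as what exists is really a directory.
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

Checking for the directory first and then creating it would leave a window between the two steps; catching the error closes that window. On Python 3, `os.makedirs(path, exist_ok=True)` gives similar behaviour.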
Zack Cerza [Mon, 16 Jan 2017 23:14:49 +0000 (16:14 -0700)]
run: Don't fail if the archive dir exists
Signed-off-by: Zack Cerza <zack@redhat.com>