git.apps.os.sepia.ceph.com Git - ceph.git/log
Sage Weil [Wed, 17 Jul 2013 18:20:01 +0000 (11:20 -0700)]
ceph: do not ignore osd leaks
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Thu, 18 Jul 2013 19:31:11 +0000 (12:31 -0700)]
nuke: killall ceph-disk, too
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Thu, 18 Jul 2013 18:38:00 +0000 (11:38 -0700)]
schedule_suite.sh: escape ceph-deploy overrides
Sage Weil [Thu, 18 Jul 2013 18:21:07 +0000 (11:21 -0700)]
ceph-deploy: support overrides
Something like
  overrides:
    ceph-deploy:
      foo: bar
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Thu, 18 Jul 2013 04:33:50 +0000 (21:33 -0700)]
Merge remote-tracking branch 'gh/next'
Sage Weil [Thu, 18 Jul 2013 03:59:54 +0000 (20:59 -0700)]
Merge branch 'wip-machine-type'
Reviewed-by: Sandon Van Ness <sandon@inktank.com>
Sage Weil [Sat, 13 Jul 2013 20:11:40 +0000 (13:11 -0700)]
lock: filter machine type for --list, --list-targets
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Sat, 13 Jul 2013 20:09:15 +0000 (13:09 -0700)]
lock: make --summary list all machines by default
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Sat, 13 Jul 2013 20:09:07 +0000 (13:09 -0700)]
lock: drop machine-type default, but require for lock-many
Signed-off-by: Sage Weil <sage@inktank.com>
Samuel Just [Thu, 18 Jul 2013 01:14:58 +0000 (18:14 -0700)]
ceph.conf.template: enable osd debug verify stray on activate
Signed-off-by: Samuel Just <sam.just@inktank.com>
Yehuda Sadeh [Wed, 17 Jul 2013 21:05:26 +0000 (14:05 -0700)]
radosgw-admin: adapt task to recent changes
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
tamil [Wed, 17 Jul 2013 00:41:57 +0000 (17:41 -0700)]
Merge branch 'master' of github.com:ceph/teuthology
tamil [Wed, 17 Jul 2013 00:41:32 +0000 (17:41 -0700)]
added overrides for ceph-deploy
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
Sage Weil [Wed, 17 Jul 2013 00:15:55 +0000 (17:15 -0700)]
workunit: set CEPH_CLI_TEST_DUP_COMMAND
This will make the CLI do every mon command twice and make sure they both
succeed. This catches problems with mon command idempotency faster than
waiting for random failures to trigger.
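A minimal sketch of the duplicate-command idea, not the actual CLI code. Only the CEPH_CLI_TEST_DUP_COMMAND variable comes from the commit; the `send_mon_command` helper and the `send` callable are hypothetical names used for illustration.

```python
import os


def send_mon_command(send, cmd, env=None):
    """Send cmd once, or twice when CEPH_CLI_TEST_DUP_COMMAND is set.

    `send` is a hypothetical callable that returns 0 on success.
    """
    env = os.environ if env is None else env
    attempts = 2 if env.get("CEPH_CLI_TEST_DUP_COMMAND") else 1
    for attempt in range(1, attempts + 1):
        ret = send(cmd)
        if ret != 0:
            # A failure on the second attempt points at a
            # non-idempotent command.
            raise RuntimeError(
                "attempt %d of %r failed with %d" % (attempt, cmd, ret))
    return 0
```

With the variable set, a command that succeeds once but fails when repeated raises immediately instead of surfacing later as a random failure.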
tamil [Wed, 17 Jul 2013 00:14:33 +0000 (17:14 -0700)]
added conf section to ceph-deploy task
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
Warren Usui [Fri, 12 Jul 2013 03:24:09 +0000 (20:24 -0700)]
Created tasktest to test sequential and parallel tasks.
Added sequential task and parallel task.
Changed _run_one_task to run_one_task (now called by new tasks too).
Fix #4969
Signed-off-by: Warren Usui <warren.usui@inktank.com>
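A rough sketch of the difference between a sequential and a parallel task runner. It models a task as a plain zero-argument callable and uses standard-library threads; both are assumptions made for illustration, not a description of how the teuthology tasks are implemented. Only the name `run_one_task` comes from the commit message.

```python
from concurrent.futures import ThreadPoolExecutor


def run_one_task(task):
    """Run a single task; here a task is just a zero-argument callable."""
    return task()


def run_sequential(tasks):
    """Run tasks one after another, in the order given."""
    return [run_one_task(task) for task in tasks]


def run_parallel(tasks):
    """Run tasks concurrently and return results in input order."""
    if not tasks:
        return []
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(run_one_task, tasks))
```

Both runners delegate to the same `run_one_task`, which is why making it callable from other tasks is useful.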
tamil [Tue, 16 Jul 2013 00:04:21 +0000 (17:04 -0700)]
calling mon destroy command after mds create
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
Sage Weil [Sat, 13 Jul 2013 21:07:28 +0000 (14:07 -0700)]
ceph_manager: drop -t arg prefix for pg dump_stuck
This is no longer needed, and ugly to support.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Fri, 12 Jul 2013 22:18:50 +0000 (15:18 -0700)]
ceph.conf: enable old message assert
If this triggers, the RECONNECT_SEQ feature is broken (and
maybe we've caught #5517).
tamil [Tue, 9 Jul 2013 18:12:29 +0000 (11:12 -0700)]
Add mon create and destroy with an optional argument mon_initial_members
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
Sage Weil [Tue, 9 Jul 2013 05:22:22 +0000 (22:22 -0700)]
lock: fix typo
Sandon Van Ness [Mon, 8 Jul 2013 23:54:22 +0000 (16:54 -0700)]
VM: Use mac addresses from DB instead of randomizing.
In order to make IP addresses less likely to change and to allow
a smaller DHCP pool to be used, I generated static MAC addresses
for all the vpm entries in the DB. I also put in the correct entries
for the primary (eth0) MAC address of all the other types of
machines, in order to keep things standardized and so that there is
another location where we have this information.
Without this fix, going through a few tests would exhaust the DHCP
pool, which at the time was around 460 IP addresses for virtual
machines and has since been upped to ~690 IP addresses.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
Sage Weil [Mon, 8 Jul 2013 17:40:27 +0000 (10:40 -0700)]
Merge pull request #17 from ceph/wip-mon-thrash
mon thrash improvements
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Sage Weil [Sat, 6 Jul 2013 01:04:40 +0000 (18:04 -0700)]
mon_thrasher: add pause/unpause of mons to thrashing
This adds an additional element of lagginess to the cluster, which should
cause mons to call new elections.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Sat, 6 Jul 2013 01:01:57 +0000 (18:01 -0700)]
daemon-helper: send arbitrary signals via stdin
Each byte written to stdin will be interpreted as a signal.
Signed-off-by: Sage Weil <sage@inktank.com>
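A minimal sketch of the byte-per-signal protocol described above, assuming the value of each byte is the signal number. The `forward_signals` name and the `send_signal` callable are hypothetical and are not taken from daemon-helper itself.

```python
import io


def forward_signals(stream, send_signal):
    """Read stream one byte at a time and pass each byte value on.

    Returns the number of signals forwarded. Stops at end of input.
    """
    count = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            break
        send_signal(chunk[0])  # indexing bytes yields an int in Python 3
        count += 1
    return count


if __name__ == "__main__":
    received = []
    forward_signals(io.BytesIO(bytes([15, 9])), received.append)
    print(received)
```

In a real helper, `stream` could be `sys.stdin.buffer` and `send_signal` the `send_signal` method of a `subprocess.Popen` object.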
Sage Weil [Fri, 5 Jul 2013 21:23:56 +0000 (14:23 -0700)]
mon_thrash: optionally scrub after each iteration (default true)
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Fri, 5 Jul 2013 21:23:37 +0000 (14:23 -0700)]
mon_thrash: fix more naming
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Fri, 5 Jul 2013 17:30:25 +0000 (10:30 -0700)]
mon_thrash: use _ instead of - consistently
Signed-off-by: Sage Weil <sage@inktank.com>
Sandon Van Ness [Thu, 4 Jul 2013 02:07:35 +0000 (19:07 -0700)]
Fix VM issues.
Fix of #5494 although bad description. Instead of adding a wait,
the code used to detect whether the guest was back up is fixed. The
previous code appeared to assume only one machine and broke
when it was waiting for multiple machines if the guests did not
come up within 10 seconds of each other.
Make nuke not do the normal stuff if the machine is a VPS as we
just destroy them when they get unlocked.
Instead of getting downburst options from ~/.teuthology.yaml get
it from the yaml given to teuthology for the test/task instead.
Fixed an error that would make all the default downburst values
not take effect if any of them were set via a yaml.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
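The defaults problem mentioned above (defaults lost as soon as any one value is set via yaml) is typically avoided by overlaying settings key by key. A minimal sketch follows; the option names, the default values, and the `downburst` key are assumptions for illustration, not the real configuration.

```python
# Hypothetical defaults; the real option names and values may differ.
DOWNBURST_DEFAULTS = {"disk-size": "30G", "ram": "4G", "cpus": 1}


def downburst_options(job_config):
    """Overlay per-job settings on the defaults, key by key."""
    options = dict(DOWNBURST_DEFAULTS)  # copy so the defaults stay intact
    options.update(job_config.get("downburst") or {})
    return options
```

Setting only `ram` in the job yaml then still leaves the default `disk-size` and `cpus` in effect.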
Sage Weil [Wed, 3 Jul 2013 16:59:21 +0000 (09:59 -0700)]
ceph: don't check leaks on client.* (i.e., radosgw)
...until we fix them. This way we can see other valgrind issues.
Sage Weil [Mon, 1 Jul 2013 21:21:55 +0000 (14:21 -0700)]
radosgw-admin: add missing quote
Sage Weil [Mon, 1 Jul 2013 21:21:48 +0000 (14:21 -0700)]
radosgw-admin: test 'bucket list' command (all buckets)
Verifies fix for #5455
Signed-off-by: Sage Weil <sage@inktank.com>
Sandon Van Ness [Thu, 27 Jun 2013 21:08:09 +0000 (14:08 -0700)]
Update keys if they have changed before locking
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sage Weil [Thu, 27 Jun 2013 00:48:03 +0000 (17:48 -0700)]
ceph: disable logrotate
This screwed up the log archival step at the end, and generally makes a
mess of automated runs.
Fixes: #5451
Sage Weil [Tue, 25 Jun 2013 19:45:22 +0000 (12:45 -0700)]
dump_stuck: fix test
The mon-osd-report-timeout setting shouldn't be there! We will set the
other item explicitly, and remove both from the suite yaml.
Fixes: #5440
Sage Weil [Mon, 24 Jun 2013 23:18:36 +0000 (16:18 -0700)]
Merge pull request #15 from ceph/wip-ulimits
Reviewed-by: Warren Usui <warren.usui@inktank.com>
Sage Weil [Mon, 24 Jun 2013 18:01:48 +0000 (11:01 -0700)]
Merge pull request #16 from ceph/wip-5431
Reviewed-by: Warren Usui <warren.usui@inktank.com>
Sage Weil [Mon, 24 Jun 2013 03:44:38 +0000 (20:44 -0700)]
rados: fix multiclient tests
Each client (not run) gets its own pool!
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Sun, 23 Jun 2013 23:21:45 +0000 (16:21 -0700)]
dump_stuck: fix race with osd start
Occasionally we don't wait long enough for the osd to start and
mark itself up. Keep trying until flush succeeds.
Fixes: #5431
Signed-off-by: Sage Weil <sage@inktank.com>
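The "keep trying until flush succeeds" approach can be sketched as a bounded retry loop. The helper name, the attempt limit, and the delay are assumptions for illustration; the actual test may loop differently.

```python
import time


def retry_until_success(action, attempts=10, delay=1.0, sleep=time.sleep):
    """Call action() until it stops raising, up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # give up and surface the last error
            sleep(delay)
```

The `sleep` parameter exists only so that the delay can be replaced in tests.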
Sage Weil [Sun, 23 Jun 2013 16:15:28 +0000 (09:15 -0700)]
enable-coredump -> adjust-ulimits
and set max_files to be big, too!
Sandon Van Ness [Fri, 21 Jun 2013 22:53:53 +0000 (15:53 -0700)]
Merge remote-tracking branch 'remotes/origin/wip-sandon-cephdeploy'
Sage Weil [Fri, 21 Jun 2013 22:27:49 +0000 (15:27 -0700)]
Merge pull request #14 from clee/cleanup
Clean up nested-if logic
Reviewed-by: Sage Weil <sage@inktank.com>
Sandon Van Ness [Fri, 21 Jun 2013 01:36:58 +0000 (18:36 -0700)]
Wipe out existing id_rsa.pub and id_rsa before pushing ssh keys
A very simple change. Just touch a file first (to create it if it
doesn't yet exist so the delete doesn't error out) and then delete
it before pushing the keys to the file. This should keep the
id_rsa.pub and id_rsa files from getting messed up due to previous
runs which were interrupted or failed (or if those files exist for
some reason). This appears to be what was causing breakage in the
ceph-deploy nightlies.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
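A minimal Python sketch of the touch-then-delete sequence described above. The function names and the file mode are assumptions for illustration; the commit itself describes the steps, not this code.

```python
import os


def remove_if_present(path):
    """Mirror 'touch path; rm path': create if missing, then delete."""
    with open(path, "a"):
        pass  # like touch: creates the file without truncating it
    os.remove(path)


def push_key(path, key_data):
    """Start from a clean slate, then write the key with owner-only mode."""
    remove_if_present(path)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(key_data)
```

In Python one could equally catch `FileNotFoundError` around `os.remove`; the touch step matters mostly for a shell command chain, where a failing `rm` would abort the run.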
Chris Lee [Thu, 20 Jun 2013 20:42:33 +0000 (13:42 -0700)]
Clean up nested-if logic
Samuel Just [Tue, 4 Jun 2013 21:12:07 +0000 (14:12 -0700)]
task/peering_speed_test.py: add test which summarizes pg peering speed
Running this regularly may warn us about slow peering.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just [Tue, 4 Jun 2013 21:11:29 +0000 (14:11 -0700)]
task/: add args.py
The usage doc string for a task is tedious to write and
hard to keep reconciled with the code as defaults are changed.
args.py includes a helper to put it all in one place.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Sage Weil [Wed, 19 Jun 2013 20:35:50 +0000 (13:35 -0700)]
schedule_suite.sh: specify admin_socket branch in overrides yaml
Signed-off-by: Sage Weil <sage@inktank.com>
Warren Usui [Wed, 19 Jun 2013 18:29:38 +0000 (11:29 -0700)]
Include MySQLdb
Fixes: #5120
Warren Usui [Tue, 18 Jun 2013 21:23:22 +0000 (14:23 -0700)]
Fix to ignore ssh-key checking if running on virtual machines or
if a line that reads 'sshkey: ignore' is in the yaml file.
Fix #5364
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Warren Usui [Wed, 12 Jun 2013 21:21:14 +0000 (14:21 -0700)]
Make reset of ssh key code conditional on being a virtual machine.
Add and use is_vm to determine if we are running on a virtual machine.
Fix #5364
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Sage Weil [Wed, 19 Jun 2013 17:36:49 +0000 (10:36 -0700)]
admin_socket: fetch test from correct branch
Sage Weil [Wed, 19 Jun 2013 16:08:17 +0000 (09:08 -0700)]
valgrind: give up and ignore all leveldb leaks
Hopefully if it is our fault we will have our own struct wrapping the
leveldb resource that we leak.
Sandon Van Ness [Mon, 17 Jun 2013 23:24:37 +0000 (16:24 -0700)]
Use authorized_keys2 instead of authorized_keys
Instead of going through the trouble of adding/removing lines
from authorized_keys, which has all our normal keys in it,
push keys to the unused authorized_keys2 file. This makes key
management significantly simpler, as that file can just be wiped
out each time instead of worrying about preserving contents.
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
Sage Weil [Mon, 17 Jun 2013 20:53:10 +0000 (13:53 -0700)]
valgrind: another leveldb leak
Sage Weil [Sun, 16 Jun 2013 21:53:49 +0000 (14:53 -0700)]
misc: let clients use any pool
rados.py, for example, creates new pools for each instance.
Sage Weil [Sun, 16 Jun 2013 20:11:50 +0000 (13:11 -0700)]
ceph_manager: fix ceph tell mon.*
Need -- to make cli stop parsing (or quote the options).
Otherwise, the options will be parsed/applied to the cli's
librados instance.
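To illustrate why the `--` separator matters, here is a small sketch of how a command-line front end might split its own arguments from the ones it passes through. The function and the example argument list are hypothetical and do not reproduce the ceph CLI parser.

```python
def split_at_double_dash(argv):
    """Split argv at the first '--' into (own_args, passthrough_args)."""
    if "--" in argv:
        index = argv.index("--")
        return argv[:index], argv[index + 1:]
    return list(argv), []


# Hypothetical invocation: everything after '--' is left untouched for
# the remote daemon instead of being applied to the local client.
own, passthrough = split_at_double_dash(
    ["tell", "mon.a", "--", "injectargs", "--debug-ms", "1"])
```

Without the separator, an option such as `--debug-ms` would be indistinguishable from an option meant for the client itself.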
Sage Weil [Sun, 16 Jun 2013 16:10:25 +0000 (09:10 -0700)]
no need for ceph --concise argument
Samuel Just [Fri, 14 Jun 2013 17:30:58 +0000 (10:30 -0700)]
ceph_manager: use new ceph tell mon.* syntax
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Sage Weil [Fri, 14 Jun 2013 05:27:50 +0000 (22:27 -0700)]
rados: fix up for parallel work
- use a separate pool for each client
- create pool at start, destroy pool at end
- use all clients, if not explicitly specified
Signed-off-by: Sage Weil <sage@inktank.com>
tamil [Fri, 14 Jun 2013 00:13:09 +0000 (17:13 -0700)]
adding a newline to auth key data
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
tamil [Thu, 13 Jun 2013 23:27:05 +0000 (16:27 -0700)]
Merge branch 'master' of github.com:ceph/teuthology
tamil [Thu, 13 Jun 2013 23:26:42 +0000 (16:26 -0700)]
modified ceph-deploy to throw appropriate exceptions
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
Sage Weil [Thu, 13 Jun 2013 21:51:21 +0000 (14:51 -0700)]
stop stripping leading \n from osd commands
leaving them in for mon command, but not for any good reason.
Warren Usui [Thu, 13 Jun 2013 00:05:51 +0000 (17:05 -0700)]
Merge branch 'wip-RhelFix-wusui'
Sage Weil [Wed, 12 Jun 2013 02:33:59 +0000 (19:33 -0700)]
valgrind: make leveldb thread suppression more general
The thread can get created from a range of callers; ignore them all.
Warren Usui [Tue, 11 Jun 2013 23:50:09 +0000 (16:50 -0700)]
Use install -d for /var/log/ceph.
Additional fix needed for #4946
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Warren Usui [Tue, 11 Jun 2013 21:14:07 +0000 (14:14 -0700)]
Fix capitalization of CentOS
Fixes: #5313
Signed-off-by: Warren Usui <warren.usui@inktank.com>
tamil [Mon, 10 Jun 2013 22:41:48 +0000 (15:41 -0700)]
added support for rhel
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
Dan Mick [Wed, 5 Jun 2013 00:46:05 +0000 (17:46 -0700)]
teuthology-lock --summary: allow --machine-type=all
Somehow this got lost; putting it back
Signed-off-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit e4eb4aa23b66a5b33cf6ff14305ae8c3d328fb50)
Sage Weil [Mon, 10 Jun 2013 17:45:05 +0000 (10:45 -0700)]
ceph: ignore ceph-osd leaks for now :(
Warren Usui [Mon, 10 Jun 2013 16:46:42 +0000 (09:46 -0700)]
Merge branch 'wip-teuthVm-wusui'
Sage Weil [Sun, 9 Jun 2013 05:26:31 +0000 (22:26 -0700)]
valgrind: glibc/boost_thread leak suppressions
Sage Weil [Sat, 8 Jun 2013 04:58:41 +0000 (21:58 -0700)]
ceph_manager: drop -- before --format=json arg
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Sat, 1 Jun 2013 04:15:41 +0000 (21:15 -0700)]
valgrind: more leveldb whitelisting
Warren Usui [Fri, 7 Jun 2013 01:43:43 +0000 (18:43 -0700)]
Support added for running scheduled tasks on virtual machines.
This included:
A.) Changes made so that full path names on some files were used
(scheduled tasks started in different home directories).
B.) Changes to ensure tasks come up on the beanstalkc queue properly.
C.) Finding and inserting the libvirt equivalent code for virtual
machines in order to simulate ipmi actions.
D.) Fix host key code, report valgrind issue more clearly.
E.) Some message and downburst call changes.
Fix #4988
Fix #5122
Signed-off-by: Warren Usui <warren.usui@inktank.com>
tamil [Sat, 8 Jun 2013 00:40:39 +0000 (17:40 -0700)]
merged system_value for rpms
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
tamil [Fri, 7 Jun 2013 23:00:29 +0000 (16:00 -0700)]
support install task for fedora
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
Warren Usui [Fri, 7 Jun 2013 22:00:39 +0000 (15:00 -0700)]
Merge branch 'wip-RhelInstall-wusui'
Warren Usui [Fri, 7 Jun 2013 21:35:08 +0000 (14:35 -0700)]
Add RHEL support to teuthology
Fix #4946
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Dan Mick [Thu, 6 Jun 2013 22:41:40 +0000 (15:41 -0700)]
task/install.py: extraneous subscript in upgrade() for only some remotes
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Dan Mick [Wed, 5 Jun 2013 00:46:05 +0000 (17:46 -0700)]
teuthology-lock --summary: allow --machine-type=all
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Dan Mick [Tue, 4 Jun 2013 23:11:19 +0000 (16:11 -0700)]
ceph_manager: don't say you have no arguments and then list them
Calling ceph pg dump --format=json works better without -- before pg
(how did this work before?...)
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Sage Weil [Tue, 4 Jun 2013 16:07:53 +0000 (09:07 -0700)]
ceph: fix valgrind grep output parsing
When you pass a single file to zgrep you don't get the filename prefix,
which confuses the split line a few lines down.
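One possible way to make the parsing tolerant of both output shapes is sketched below. It assumes the matched text itself contains no colon, and the `parse_match` name and `default_file` parameter are hypothetical; the actual fix in ceph.py may differ.

```python
def parse_match(line, default_file):
    """Return (file, text) for one line of grep output.

    Falls back to default_file when the 'file:' prefix is absent,
    as happens when grep is given a single file.
    """
    line = line.rstrip("\n")
    if ":" in line:
        name, text = line.split(":", 1)
        return name, text.strip()
    return default_file, line.strip()
```

Unlike a bare `line.split(':')` unpacked into two names, this never raises when the prefix is missing.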
Sage Weil [Mon, 3 Jun 2013 16:57:17 +0000 (09:57 -0700)]
ceph: debug valgrind error
File "/var/lib/teuthworker/teuthology-master/teuthology/task/ceph.py", line 215, in valgrind_post
(file, kind) = line.split(':')
ValueError: need more than 1 value to unpack
Sage Weil [Fri, 31 May 2013 05:07:30 +0000 (22:07 -0700)]
valgrind: add another leveldb suppression
Sage Weil [Thu, 30 May 2013 18:25:32 +0000 (11:25 -0700)]
valgrind: update suppressions for leveldb, libc leaks from mon
These result in clean valgrind leak checks on the mon (at least with my
limited vstart testing).
Warren Usui [Fri, 17 May 2013 17:39:15 +0000 (10:39 -0700)]
Rhel support added
Fixes: #4946
Signed-off-by: Warren Usui <warren.usui@inktank.com>
Sage Weil [Wed, 22 May 2013 20:22:21 +0000 (13:22 -0700)]
ceph: fix valgrind log check
- logs are gzipped; use zgrep
- wait for the proc to exit before looking at stdout
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Wed, 22 May 2013 16:25:40 +0000 (09:25 -0700)]
schedule_suite.sh: resolve ceph sha1 using deb gitbuilder, not tarball
The tarball one is old and largely obsolete.
Sage Weil [Mon, 20 May 2013 19:26:49 +0000 (12:26 -0700)]
thrashosds: sync before doing powercycle testing
Hopefully fixes #5112
Sage Weil [Mon, 20 May 2013 18:23:50 +0000 (11:23 -0700)]
schedule_suite.sh: 8hr -> 10hr suite timeout
Still missing some slow rbd tests.
Sage Weil [Sat, 18 May 2013 01:53:02 +0000 (18:53 -0700)]
install: make overrides grouped by project
This lets us set different overrides for e.g. ceph vs samba, and makes it
so the schedule_teuthology.sh overrides don't specify a ceph sha1 for
samba installs.
Signed-off-by: Sage Weil <sage@inktank.com>
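A sketch of what per-project override lookup could look like. The nested `install` then project layout follows the commit description; the helper name, the `branch` key, and the placeholder sha1 value are assumptions for illustration.

```python
def install_config(base, overrides, project="ceph"):
    """Merge overrides['install'][project] into a copy of base."""
    merged = dict(base)
    per_project = (overrides.get("install") or {}).get(project) or {}
    merged.update(per_project)
    return merged


# Placeholder values, not real revisions.
overrides = {"install": {"ceph": {"sha1": "abc123"}}}
ceph_cfg = install_config({"branch": "master"}, overrides, "ceph")
samba_cfg = install_config({"branch": "master"}, overrides, "samba")
```

Here `ceph_cfg` picks up the sha1 while `samba_cfg` is left with only its base settings, which is the separation the commit is after.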
tamil [Fri, 17 May 2013 19:08:45 +0000 (12:08 -0700)]
client config will be done only after the cluster is operational.
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
tamil [Thu, 16 May 2013 20:14:06 +0000 (13:14 -0700)]
set permission for config file
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
Sage Weil [Thu, 16 May 2013 18:29:42 +0000 (11:29 -0700)]
schedule_suite.sh: put sha1 in install: overrides, not ceph:
Signed-off-by: Sage Weil <sage@inktank.com>
tamil [Thu, 16 May 2013 16:49:40 +0000 (09:49 -0700)]
added UserKnownHostsfile to ssh config
Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
David Zafman [Tue, 14 May 2013 23:17:10 +0000 (16:17 -0700)]
Fix scrub_test.py permission error
Add description of yaml file including log-whitelist
Add sudo to dd that corrupts data
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Warren Usui <warren.usui@inktank.com>
Josh Durgin [Mon, 13 May 2013 21:19:59 +0000 (14:19 -0700)]
qemu: load the kvm module before trying to use it
It should be loaded before this, but in some cases it is not for some reason.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Sage Weil [Sun, 12 May 2013 00:07:14 +0000 (17:07 -0700)]
schedule_suite.sh: bump suite timeout from 6->8 hours
This captures the current slow rbd tasks.
Signed-off-by: Sage Weil <sage@inktank.com>