John Spray [Wed, 29 Apr 2015 18:53:59 +0000 (19:53 +0100)]
tasks/ceph_deploy: fix for multiple mons
Now that service IDs are modified during run, we have
to avoid repeatedly evaluating first_mon for where
to run ceph_deploy, as the answer will change.
Fixes: #11495 Signed-off-by: John Spray <john.spray@redhat.com>
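A minimal sketch of why the repeated lookup goes wrong, not the actual task code. The first_mon helper, the role names and the rename mapping here are all hypothetical stand-ins for what happens when service IDs are rewritten mid-run:

```python
def first_mon(roles):
    """Return the alphabetically first mon role (hypothetical helper)."""
    return sorted(r for r in roles if r.startswith("mon."))[0]


roles = ["mon.a", "mon.b", "osd.0"]
deploy_role = first_mon(roles)  # evaluated once, up front

# Hypothetical renaming, standing in for service IDs changing during the run.
rename = {"mon.a": "mon.node2", "mon.b": "mon.node1"}
roles = [rename.get(r, r) for r in roles]

# Follow the host we already chose instead of asking "who is first?" again.
deploy_role = rename.get(deploy_role, deploy_role)
```

Calling first_mon(roles) again after the rename would pick the host formerly known as mon.b, which is the kind of drift the fix avoids.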
The early non-defaults caused failures due to xfstests_url: None not
being overridden by run_xfstests(). Move the defaults to xfstests() and
don't pass xfstests_branch past that point.
John Spray [Mon, 6 Apr 2015 18:54:23 +0000 (19:54 +0100)]
tasks/ceph_deploy: configure CephFS
This test apparently had not been touched since
"fs new" was added. In addition to calling
Filesystem.create:
* modify the get_nodes_using_role
function to modify ctx.cluster.remotes so that the
service IDs match what ceph-deploy will set
* log exceptions during ceph_deploy setup, as otherwise
they can get lost if another exception occurs during
teardown (so that it's all easier to debug).
* default to passing --dev=master during install, so
that we don't error out horribly when run without
an explicit branch set (e.g. when run outside
scheduled suite)
Fixes: #11316 Signed-off-by: John Spray <john.spray@redhat.com>
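The exception-logging point above can be sketched roughly as follows. This is an illustration under assumptions, not the task's code: run_with_teardown, the logger name and the setup/teardown callables are hypothetical.

```python
import logging

log = logging.getLogger("ceph_deploy_sketch")  # hypothetical logger name


def run_with_teardown(setup, teardown):
    """Run setup, log any failure immediately, then always run teardown."""
    try:
        setup()
    except Exception:
        # Record the setup failure now; a later teardown error
        # would otherwise be the only thing the caller sees.
        log.exception("setup failed")
        raise
    finally:
        teardown()
```

If teardown also raises, the caller sees the teardown error, but the setup traceback has already been written to the log.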
John Spray [Tue, 7 Apr 2015 13:14:15 +0000 (14:14 +0100)]
tasks/cephfs: tweak use of mon for admin commands
... s/mon_remote/admin_remote/ and allow caller to pass
in which remote they want to use for that. Enables use
with ceph_deploy task which does not give admin keys
to mons.
John Spray [Wed, 1 Apr 2015 12:56:13 +0000 (13:56 +0100)]
tasks: update test_journal_repair
This broke with recent Client changes that
do better caching of readdir results, such
that doing an ls twice is no longer sufficient
to see a fresh result after repair - we need
to remount instead.
John Spray [Thu, 26 Mar 2015 17:15:28 +0000 (17:15 +0000)]
tasks: update journal_repair test for 'damaged' state
To track recent change in master where instead of
crashing on missing MDSTable object we'll go
into damaged state.
Instead of catching a crash, handle the rank's
transition to the damaged state. Leave the crash
handling code (unused for the moment) in the
Filesystem class in case it's needed elsewhere
soon.
John Spray [Wed, 4 Feb 2015 12:52:42 +0000 (12:52 +0000)]
tasks/cephfs: be tolerant of multiple MDSs
...as long as only one is active, all the ops
that default to talking to a single MDS should
be happy to talk to the active MDS, even if there
happens to be a standby lying around too.
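As a rough sketch of the idea (the data shape and the get_active_mds helper are assumptions for illustration, not the Filesystem class API): pick the single active daemon and ignore standbys.

```python
def get_active_mds(mds_infos):
    """Return the name of the only active MDS, ignoring standbys.

    mds_infos is assumed to be a list of dicts with 'name' and 'state'
    keys; that shape is hypothetical.
    """
    active = [m["name"] for m in mds_infos if m["state"] == "up:active"]
    if len(active) != 1:
        raise RuntimeError("expected exactly one active MDS, found %d"
                           % len(active))
    return active[0]
```

With one active daemon and any number of standbys this returns the active one; zero or several active daemons is treated as an error.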
Douglas Fuller [Tue, 31 Mar 2015 15:52:52 +0000 (08:52 -0700)]
RBD: added optional YAML parameters to test xfstests from different repos
These variables are needed because ceph-qa-suite bootstraps ceph-qa-chef via
http download of solo-from/scratch/run. This adds a variable to override the
default script. It also adds variables to the rbd task to override the versions
of run_xfstests_krbd.sh and run_xfstests.sh downloaded by the default task.
variables added
======
tasks:
- chef:
    script_url: # override default location for solo-from-scratch for Chef
    chef_repo: # override default Chef repo used by solo-from-scratch
    chef_branch: # to choose a different git upstream branch for ceph-qa-chef
- rbd.xfstests:
    client.0:
      xfstests_branch: # to choose a different git upstream branch for xfstests
      xfstests_url: # override git base URL for run_xfstests{_krbd}.sh
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
Loic Dachary [Tue, 24 Mar 2015 19:14:45 +0000 (20:14 +0100)]
erasure-code: ec-cache-agent in firefly-x/stress-split-erasure-code
Immediately after the Firefly installation, create an erasure code pool
behind a replicated cache pool. Run deep-scrub on all OSDs while a rados
task runs. After upgrading half of the cluster (MON and OSD), run a
rados task again, with deep-scrub running in parallel.
Sage Weil [Tue, 31 Mar 2015 04:11:35 +0000 (21:11 -0700)]
ceph: fix mkfs -f bug
Pass -f by default to btrfs instead of first trying without and *then*
trying with.
Among other things, this avoids a confusing failure where we try mkfs.ext4
device (no -f), fail for some reason, and then try again with -f and get
a usage error (-f does not mean force for mke2fs).
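A hedged sketch of the single-attempt approach, assuming a hypothetical mkfs_args helper and a small per-filesystem table. Only filesystems whose mkfs treats -f as "force" get the flag, so mkfs.ext4 is never handed an option it reads differently:

```python
# Force flags per filesystem type; this table is an illustrative assumption.
# mkfs.btrfs and mkfs.xfs both use -f to overwrite an existing filesystem.
FORCE_FLAGS = {
    "btrfs": ["-f"],
    "xfs": ["-f"],
}


def mkfs_args(fs_type, device):
    """Build one mkfs command line, with -f only where it means force."""
    return ["mkfs", "-t", fs_type] + FORCE_FLAGS.get(fs_type, []) + [device]
```

For example, mkfs_args("btrfs", "/dev/sdb") includes -f from the start, while mkfs_args("ext4", "/dev/sdb") never does.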
Dan Mick [Thu, 26 Mar 2015 23:46:35 +0000 (16:46 -0700)]
calamari_setup: centralize config defaults
Make a DEFAULTS dict that is updated by any user parms, so that
defaults are documented centrally and so config.get(key, defval) is
no longer necessary everywhere.
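The pattern looks roughly like this. The keys and values below are made up for illustration and are not the task's real defaults:

```python
import copy

# Hypothetical defaults; the real task documents its own keys here.
DEFAULTS = {
    "version": "v0.80.9",
    "start_browser": False,
    "no_epel": True,
}


def effective_config(user_config):
    """Overlay user-supplied settings on a copy of DEFAULTS."""
    config = copy.deepcopy(DEFAULTS)
    config.update(user_config or {})
    return config
```

After this, code can read config["start_browser"] directly rather than repeating config.get("start_browser", False) at each call site.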
Dan Mick [Thu, 26 Mar 2015 22:29:37 +0000 (15:29 -0700)]
calamari_setup: remove "build test image" code; add 'test_image' cfgvar
Stop trying to build test images inside this test; presume the test
image is available built externally (in a file path or an http URL).
Config vars ice_tool_dir, ice_version, iceball_location, and
ice_git_location go away in favor of 'test_image', the path to the
testable image (which can still be a tar.gz or an .iso).
Dan Mick [Thu, 26 Mar 2015 22:31:05 +0000 (15:31 -0700)]
calamari_setup: mounting iso on older distros requires -o loop
Ubuntu's mount/kernel support "mount <file> <mntpnt>" directly;
apparently CentOS 6 (and presumably RHEL 6) require specifying at
least '-o loop' (a /dev/loopN will be dynamically allocated and removed
on unmount).
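A small sketch of building the command so it works on both kinds of distro. The iso_mount_args helper and the use of sudo are assumptions, not the task's code:

```python
def iso_mount_args(iso_path, mount_point):
    """Build a mount command for an ISO file.

    Always passing '-o loop' is harmless where mount would have picked a
    loop device on its own, and is needed on older distros.
    """
    return ["sudo", "mount", "-o", "loop", iso_path, mount_point]
```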
The ec-rados-default.yaml is modified to be a plain task instead of a
sequential task inside a parallel task. The ec-rados-sequential.yaml is
added and is linked in
suites/upgrade/giant-x/parallel/2-workload/sequential_run instead of ec-rados-default.yaml.
Loic Dachary [Sun, 22 Mar 2015 16:43:02 +0000 (17:43 +0100)]
ensure summary is looked for the user we need (part 2)
Move the get_user_summary(out, user) logic to util.rgw so that it can be
shared between radosgw_admin_rest.py and radosgw_admin.py and modify
them accordingly.
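A sketch of what such a shared helper might look like. It assumes the usage output is a dict whose 'summary' key holds a list of per-user dicts, each with a 'user' field; that shape is an assumption here:

```python
def get_user_summary(out, user):
    """Return the summary entry for the given user from usage output."""
    for summary in out["summary"]:
        if summary.get("user") == user:
            return summary
    raise AssertionError("No summary info found for user: %s" % user)
```

Looking the entry up by user, rather than taking the first element of the list, keeps the check valid when other users also have usage recorded.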