Sébastien Han [Thu, 13 Nov 2014 18:11:36 +0000 (19:11 +0100)]
Improve readability of the exception
The error messages were not clear from a non-programmer's
perspective. In the context of OpenStack, all the drivers fall
back to the exceptions provided by the rados library. Clearer
error messages will help debug misconfigured environments.
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
David Zafman [Wed, 12 Nov 2014 23:22:04 +0000 (15:22 -0800)]
ceph_objectstore_tool: Fixes to make import work again
The is_pg() call now returns true even for pgs pending removal. Fix the
broken finish_remove_pgs() by removing the is_pg() check.
Also add create_collection() to the initial transaction on import.
Fixes: #10090
Signed-off-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Loic Dachary [Wed, 12 Nov 2014 17:49:54 +0000 (18:49 +0100)]
qa: handle CEPH_CLI_TEST_DUP_COMMAND on ceph osd create
If CEPH_CLI_TEST_DUP_COMMAND is set when ceph osd create is called, two
OSDs are created. Both must be cleaned up afterwards instead of
assuming only one was created.
Sébastien Han [Mon, 10 Nov 2014 14:06:20 +0000 (15:06 +0100)]
doc: enable RBD cache and socket on OpenStack deployments
Enabling the RBD cache improves sequential I/O, and the socket helps a
lot while troubleshooting. These two items are considered best
practice for OpenStack deployments with Ceph.
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
Josh Durgin [Wed, 12 Nov 2014 02:16:02 +0000 (18:16 -0800)]
qa: allow small allocation diffs for exported rbds
The local filesystem may behave slightly differently. This isn't
foolproof, but seems to be reliable enough on rhel7 rootfs, where
exact comparison was failing.
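A minimal Python sketch of a tolerant allocation comparison, as opposed to an exact one. The helper names and the 64 KiB default tolerance are assumptions made for illustration; they are not taken from the qa script.

```python
import os


def allocated_bytes(path):
    """Bytes actually allocated on disk for path (POSIX only)."""
    # st_blocks is reported in 512-byte units on POSIX systems.
    return os.stat(path).st_blocks * 512


def allocation_close(a_bytes, b_bytes, tolerance_bytes=64 * 1024):
    """Return True when two allocation sizes differ by at most tolerance_bytes.

    The default tolerance is a hypothetical value, not the one used by the test.
    """
    return abs(a_bytes - b_bytes) <= tolerance_bytes
```

An exact comparison is the special case tolerance_bytes=0, which is what was failing on the rhel7 rootfs.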
Rongze Zhu [Mon, 10 Nov 2014 16:13:42 +0000 (00:13 +0800)]
crush: fix tree bucket functions
Node weights are computed incorrectly when a tree bucket is
constructed, and the tree bucket does not store item ids in its items
array, so it does not work correctly. This patch fixes both bugs and
adds a simple test for tree buckets.
The check for 'nextkey < last_disk_key' does not make much sense, since
last_disk_key is an empty string that is never set beforehand. A
decoded string never compares less than an empty string.
Since this if() isn't part of a loop, last_disk_key is only set
once and there is no other consumer: remove this dead code.
Danny Al-Gaaf [Thu, 30 Oct 2014 02:14:41 +0000 (03:14 +0100)]
rados_sync.cc: fix xattr_diff() for the only_in_b checks
In the checks that build up only_in_b, the wrong const_iterator x is
built. It should compare the rhs->xattrs entries against xattrs, not
rhs->xattrs against itself.
Fix for:
CID 716957 (#1 of 1): Invalid iterator comparison (MISMATCHED_ITERATOR)
mismatched_comparison: Comparing x from rhs->xattrs to this->xattrs.end()
from this->xattrs.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
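A minimal Python sketch of the intended comparison, assuming the xattrs are plain name-to-value dicts. The function signature and return shape are hypothetical and do not mirror the C++ code in rados_sync.cc.

```python
def xattr_diff(lhs, rhs):
    """Compare two xattr maps (name -> value).

    Returns (only_in_a, only_in_b, diff) as sorted lists of names.
    """
    # Names present on the left side only.
    only_in_a = sorted(name for name in lhs if name not in rhs)
    # Walk rhs and look each name up in lhs. Looking it up in rhs again
    # (the mistake described above) would never find a missing name.
    only_in_b = sorted(name for name in rhs if name not in lhs)
    # Names present on both sides with different values.
    diff = sorted(
        name for name in lhs if name in rhs and lhs[name] != rhs[name]
    )
    return only_in_a, only_in_b, diff
```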
CID 717177 (#2-1 of 3): Uncaught exception (UNCAUGHT_EXCEPT)
root_function: In function main(int, char const **) an exception of
type ceph::FailedAssertion is thrown and never caught.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Jason Dillaman [Tue, 11 Nov 2014 07:17:28 +0000 (02:17 -0500)]
librbd: Python unit tests now use unique pools and images
RBD python unit tests no longer utilize the 'rbd' pool for
test cases. Instead, a new temporary pool is created and
deleted. Additionally, each unit test now uses a unique
image name to reduce the possibility of test case failures
affecting subsequent tests.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Mon, 27 Oct 2014 18:47:19 +0000 (14:47 -0400)]
osdc: Constrain max number of in-flight read requests
Constrain the number of in-flight RADOS read requests to the
cache size. This reduces the chance of the cache memory
ballooning during certain scenarios like copy-up which can
invoke many concurrent read requests.
Fixes: #9854
Backport: giant, firefly, dumpling
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
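A minimal Python sketch of bounding in-flight reads by a byte budget. The class name is hypothetical, and whether the actual fix accounts in bytes or in requests is not stated above, so the byte accounting here is an assumption.

```python
import threading


class ReadBudget:
    """Block new reads while outstanding read bytes would exceed max_bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.in_flight = 0
        self._cond = threading.Condition()

    def acquire(self, length):
        """Reserve length bytes, waiting until the budget allows it."""
        with self._cond:
            # A single oversized read is admitted when nothing else is
            # in flight, so it cannot wait forever.
            while self.in_flight > 0 and self.in_flight + length > self.max_bytes:
                self._cond.wait()
            self.in_flight += length

    def release(self, length):
        """Return length bytes to the budget and wake waiting readers."""
        with self._cond:
            self.in_flight -= length
            self._cond.notify_all()
```

A caller would pair acquire() before issuing a read with release() in the completion callback, so that a burst of concurrent reads cannot reserve more than max_bytes at once.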
Rongze Zhu [Fri, 10 Oct 2014 11:18:00 +0000 (19:18 +0800)]
crush: fix incorrect use of adjust_item_weight method
The adjust_item_weight method adjusts every bucket that contains the
item. If osd.0 is in both host=fake01 and host=fake02 and we execute
"ceph osd crush osd.0 10 host=fake01", it adjusts not only fake01's
weight but also fake02's weight.
This patch adds the adjust_item_weightf_in_loc method and fixes the
remove_item, _remove_item_under, update_item, insert_item and
detach_bucket methods.
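A minimal Python sketch of the difference between adjusting an item everywhere and adjusting it in one location only. Buckets are modelled as a dict of dicts, and both function names are illustrative; this is not the CrushWrapper implementation.

```python
def adjust_item_weight(buckets, item, weight):
    """Set the weight of item in every bucket that contains it.

    Returns the number of buckets changed.
    """
    changed = 0
    for items in buckets.values():
        if item in items:
            items[item] = weight
            changed += 1
    return changed


def adjust_item_weight_in_loc(buckets, item, weight, loc):
    """Set the weight of item only in the bucket named loc.

    Returns 1 if the bucket was changed, 0 otherwise.
    """
    items = buckets.get(loc)
    if items is None or item not in items:
        return 0
    items[item] = weight
    return 1
```

With osd.0 present under both fake01 and fake02, the first function changes two buckets while the second, called with loc="fake01", leaves fake02 untouched.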
Introduce ceph_erasure_code_non_regression to check and compare how an
erasure code plugin encodes and decodes content with a given set of
parameters. For instance:
Will create an encoded object (--create) and store it into a directory
along with the chunks, one chunk per file. The directory name is derived
from the parameters. The content of the object is a random pattern of 31
bytes repeated to fill the object size specified with --stripe-width.
The check function (--check) reads the object back from the file,
encodes it and compares the result with the content of the chunks read
from the files. It also attempts to recover from one or two erasures.
Chunks encoded by a given version of Ceph are expected to be encoded
exactly in the same way by all Ceph versions going forward.
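A minimal Python sketch of how such object content could be built: a random 31-byte pattern repeated to fill the requested size. The function name is hypothetical, and truncating the last repetition to fit is an assumption.

```python
import os


def make_payload(stripe_width, pattern_len=31):
    """Return stripe_width bytes made of one random pattern, repeated."""
    pattern = os.urandom(pattern_len)
    # Ceiling division: enough repetitions to cover stripe_width.
    repeats = -(-stripe_width // pattern_len)
    # Assumption: the final repetition is cut to fit the requested size.
    return (pattern * repeats)[:stripe_width]
```

Because 31 is prime, the pattern does not line up with power-of-two chunk sizes, so each chunk tends to hold different bytes.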
Loic Dachary [Sun, 9 Nov 2014 02:23:06 +0000 (03:23 +0100)]
erasure-code: document pool operations
A short introduction for the first-time user of an erasure coded pool.
It includes a reminder of how it relates to cache tiering and links
explaining how to define new profiles, with an example.
There were examples in the developer documentation, but operators
expect to find such a guide in the rados operations chapter.
Loic Dachary [Wed, 22 Oct 2014 03:05:45 +0000 (20:05 -0700)]
tests: use kill -0 to check process existence
When killing a daemon, use kill -0 instead of kill -9 to check that the
process was terminated. Should the pid of the process be reused
immediately after, it would be wrong to kill the new process. In the
worst case, the kill_daemon function returns before the process is
confirmed to be killed, but this is not treated as an error and is
unlikely to cause any problem.
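A minimal Python sketch of the same existence check, using signal 0 through os.kill. The function name is hypothetical, and the sketch is POSIX only: on Windows, os.kill does not treat signal 0 as a harmless probe.

```python
import os


def process_exists(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        # Signal 0 delivers nothing; the kernel only performs the
        # existence and permission checks.
        os.kill(pid, 0)
    except ProcessLookupError:
        # ESRCH: no such process.
        return False
    except PermissionError:
        # EPERM: the process exists but belongs to another user.
        return True
    return True
```

Unlike kill -9, the probe has no effect on a new process that happens to reuse the pid.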
Loic Dachary [Sat, 18 Oct 2014 22:41:40 +0000 (15:41 -0700)]
tests: remove vstart_wrapped_tests.sh
Listing tests to be run in a single script does not take advantage of
parallel runs in make.
The vstart_wrapper.sh script is reworked and made less specialized: it
lets the caller decide which daemons to run via CEPH_START and does not
enforce the number of daemons of each type. It no longer uses stop.sh, to
avoid killing osd/mon/mds daemons that are unrelated to the tests.