Was not tracking high markers correctly. Could only work if there was a single
hole in the completion range. Just keep a map of all the complete entries.
Mark Nelson [Wed, 20 Jul 2016 13:28:40 +0000 (08:28 -0500)]
Merge pull request #10327 from xiexingguo/xxg-wip-bluestore-2016-07-18
os/bluestore: misc fixes/cleanups
Mark's Comments:
This passed "ceph_test_objectstore --gtest_filter=*/2".
This PR may have contributed to write performance improvements along with #10257.
Mark Nelson [Wed, 20 Jul 2016 13:22:31 +0000 (08:22 -0500)]
Merge pull request #10257 from chhabaramesh/extent_alloc
os/bluestore: extent alloc functionality for stupid and bitmap allocator
Mark's Comments:
This passed "ceph_test_objectstore --gtest_filter=*/2".
This PR appears to result in a large sequential write performance increase (Up to ~3.5X), and may contribute to a small random write performance increase as well.
rgw: back off bucket sync on failures, don't store marker
Fixes: http://tracker.ceph.com/issues/16742
If we fail on any single entry in bucket, skip updating the marker tracker
so that next time we'll go over that entry, and back off. This will trigger
a report to the data sync error repo and eventually a retry on the failing
object.
this silences the warning of:
```
In file included from
/home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:58:0,
from
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osdcap.cc:20:
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osdcap.cc:
In member function ‘virtual void
OSDCap_AllowClassMulti_Test::TestBody()’:
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osdcap.cc:766:6:
note: variable tracking size limit exceeded with
-fvar-tracking-assignments, retrying without
TEST(OSDCap, AllowClassMulti) {
^
/home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/internal/gtest-internal.h:1211:3:
note: in definition of macro ‘GTEST_TEST_CLASS_NAME_’
test_case_name##_##test_name##_Test
^
/home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2181:3:
note: in expansion of macro ‘GTEST_TEST_’
GTEST_TEST_(test_case_name, test_name, \
^
/home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2187:42:
note: in expansion of macro ‘GTEST_TEST’
# define TEST(test_case_name, test_name) GTEST_TEST(test_case_name,
# test_name)
^
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osdcap.cc:766:1:
note: in expansion of macro ‘TEST’
TEST(OSDCap, AllowClassMulti) {
^
```
cmake: remove util.cc from lib{rados,cephfs},ceph-objectstore-tool
util.cc is included by both librados and libcephfs, the `lvm` static
variable in `lsb_release_parse()` will be free twice by them. this
could lead to double free issue. and util.cc is not used by client at all, so
remove it from them.
Jason Dillaman [Fri, 8 Jul 2016 02:16:51 +0000 (22:16 -0400)]
qa/workunits/rbd: exercise snapshot renames within rbd-mirror test
Snapshot rename operations utilize the (cluster) unique snapshot
sequence to prevent attempts at replays. When mirroring to a
different cluster, these sequences will not align.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
os/bluestore: fix potential uninitialized nid of onode
The _zero() process may implicitly create a new onode,
thus we shall call _assign_nid() to initialize the nid
properly. And if the onode already has one, _assign_nid()
does nothing.
So it is proper to call _assign_nid() here under any case.
Mark's comments:
This passed "ceph_test_objectstore --gtest_filter=*/2".
This PR did not appear to have a significant impact on performance tests.
Closes #10236
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
os/bluestore: check against we don't overflow
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
os/bluestore: try to reap as many collections as we can
So if there is one collection getting contiguously stucking,
we don't abort at the same point each time.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
os/bluestore: make device size of BitFreelistManager is block-size aligned
Otherwise if we try to set past-eof blocks as allocated durint create(),
the call to _xor() will trigger the firing of the following assert:
Ramana Raja [Thu, 7 Jul 2016 11:45:13 +0000 (17:15 +0530)]
ceph_volume_client: version on-disk metadata
Version on-disk metadata with two attributes,
'compat version', the minimum CephFSVolume Client
version that can decode the metadata, and
'version', the version that encoded the metadata.
Ramana Raja [Thu, 23 Jun 2016 10:36:53 +0000 (16:06 +0530)]
ceph_volume_client: modify locking of meta files
File locks are applied on meta files before updating the meta
file contents. These meta files would need to be cleaned up
sometime, which could lead to locks being held on unlinked meta
files. Prevent this by checking whether the file had been deleted
after lock was acquired on it.
Ramana Raja [Tue, 7 Jun 2016 19:12:18 +0000 (00:42 +0530)]
ceph_volume_client: recover from dirty auth and auth meta updates
Check dirty flag after locking something and call recover() if we are
opening something dirty (racing with another instance of the driver
restarting after failure) -- only required if someone running multiple
manila-share instances with Ceph loaded.
Ramana Raja [Tue, 21 Jun 2016 06:44:56 +0000 (12:14 +0530)]
ceph_volume_client: modify data layout in meta files
Notable changes to data layout in auth meta and volume meta files:
In the auth meta files, add a 'dirty' flag to track the status of auth
updates to a single volume.
In the volume meta file, make the 'dirty' flag track the status of
auth updates for a single ID.
Optimize the recovery of partial auth update changes to auth meta,
volume meta, and the Ceph backend, facilitated by changes in the
data layout in the meta files.
Store a two-way mapping between auth IDs and volumes.
Enables us to record some metadata on auth ids (which
openstack tenant created it) so that we can avoid exposing
keys to other tenants who try to use the same ceph
auth id.
Enables us to expose the list of which auth ids have access
to a volume, so that Manila's update_access() can be
implemented efficiently.
DNM: see TODOs inline.
Fixes: http://tracker.ceph.com/issues/15615 Signed-off-by: John Spray <john.spray@redhat.com>
John Spray [Mon, 7 Mar 2016 13:06:41 +0000 (13:06 +0000)]
pybind: enable integer flags to libcephfs open
The 'rw+' style flags are handy and convenient, but
they don't capture all possibilities. Change to
optionally accept an integer here for advance users
who want to specify arbitrary combinations of
flags.
Mark Nelson [Sun, 17 Jul 2016 11:39:41 +0000 (06:39 -0500)]
os/bluestore: revert preferred csum behavior
This passes "ceph_test_objectstore --gtest_filter=*/2".
This restores 4K random read performance to previous levels when objects
are were previously written out using large IOs (4MB in this case):
Somnath Roy [Sat, 9 Jul 2016 02:41:46 +0000 (22:41 -0400)]
Bluestore: Fixed a Bluestore crash
A bluestore race condition is been fixed by protecting txc structures
within _txc_state_poc with collection lock.
Mark's comments:
This fixes segfaults during random write tests with bluestore.
This passes "ceph_test_objectstore --gtest_filter=*/2".
This may introduce a small performance regresion, though there is enough
noise in the results to make it inconclusive.
Closes #10220
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
Patrick Donnelly [Sat, 16 Jul 2016 03:29:22 +0000 (23:29 -0400)]
mds: add assertions for standby_daemons invariant
These assertions catch state changes of an mds in standby_daemons to a state
other than MDSMap::STATE_STANDBY. Currently this invariant is (sometimes!)
checked in other locations on access of standby_daemons. This commit allows us
catch the violated invariant at the time it occurred.
Related to: http://tracker.ceph.com/issues/16592
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
cmake: do not pass --disable-pip-version-check if not supported
on older versions of pip, this option is not supported, and
--disable-pip-version-check is implied with --no-index. so no need to
use them when --no-index is passed to pip.
this is a workaround of the timeout found in jenkins. currently three
tests are found timeout, and they are labeld with "Racing" and
"LongRunning". so, to workaround this issue, we run the tests in two
phases:
1. run the racing tests with -j1
2. run the non-racing tests with -jN
if we all all tests with -j1, the total test time is 2683.57 sec
please note "make test" is used by cmake to run tests, so we cannot just
repurpose it to *build* them.
* AddCephTest.cmake: depends on "tests"
* CMakeLists.txt: let "check" depend on "tests"
* src/CMakeLists.txt: update the run-tox tests
* run-make-check.sh: use "make tests" and "ctest" instead of "make check"
* ceph-detect-init/CMakeLists.txt: let "tests" depend on
"ceph-detect-init"
* ceph-disk/CMakeLists.txt: let "tests" depend on "ceph-disk"