Kefu Chai [Sun, 23 Nov 2014 19:12:24 +0000 (03:12 +0800)]
lockdep: do not use $CEPH_LOCKDEP for g_lockdep
* a non-zero CEPH_LOCKDEP brings ceph down because g_lockdep_ceph_ctx
is still being constructed when dout_impl() dereferences it.
* fix a typo in comment.
* remove dead code.
Haomai Wang [Wed, 19 Nov 2014 06:34:52 +0000 (14:34 +0800)]
KeyValueStore: Add KEY_ENDING sign to the end of key
Keys are stored in alphabetical order and need to follow the ghobject_t
comparison rule. "generation" and "shard_id" are optional fields of the
object key, but a ghobject with a UINT64_MAX generation (the default)
compares larger than the same ghobject with any other generation.
GenericObjectMap does not store the generation when it is UINT64_MAX, to
keep keys short. So we need to add a MAX sign to the end of the key to make
the key ordering the same as ghobject's comparison rule.
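The ordering problem can be illustrated with a minimal sketch. The key layout, the "!" separator, the "\xff" end marker and the helper names below are assumptions made for illustration only; they are not taken from GenericObjectMap, and the real ghobject_t comparison involves more fields than name and generation.

```python
NO_GEN = 2**64 - 1   # UINT64_MAX, the default generation
SEP = "!"            # hypothetical field separator
KEY_ENDING = "\xff"  # hypothetical end marker; sorts after SEP and hex digits


def make_key(name, generation=NO_GEN, with_ending=True):
    """Build a sortable key; the generation is omitted when it is NO_GEN."""
    key = name
    if generation != NO_GEN:
        key += SEP + format(generation, "016x")
    if with_ending:
        key += KEY_ENDING
    return key


def ghobject_order(obj):
    """Simplified comparison rule: order by name, then by generation."""
    name, generation = obj
    return (name, generation)


objects = [("obj", NO_GEN), ("obj", 5), ("obj", 1)]

expected = sorted(objects, key=ghobject_order)
# Without the end marker the NO_GEN key is a prefix of the others,
# so it sorts first instead of last.
without_ending = sorted(objects, key=lambda o: make_key(*o, with_ending=False))
# With the end marker the NO_GEN key sorts after the keys carrying a generation.
with_ending = sorted(objects, key=lambda o: make_key(*o, with_ending=True))
```

In this sketch the end marker only has to compare greater than whatever byte follows the object name in keys that carry a generation.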
David Zafman [Tue, 18 Nov 2014 21:00:15 +0000 (13:00 -0800)]
ceph_objectstore_tool: Add feature called set-allow-sharded-objects
Uses --op set-allow-sharded-objects option
This operation will be rejected if on the target OSD's osdmap there is
at least one OSD which does not support ERASURE CODES.
Prompt the user that they could import if sharded state allowed
Prompt the user to use new feature if sharded state found inconsistent
Fixes: #10077
Signed-off-by: David Zafman <dzafman@redhat.com>
Jianpeng Ma [Thu, 13 Nov 2014 03:32:57 +0000 (11:32 +0800)]
FileJournal: Add ssd discard for journal which uses ssd disk as journal.
The journal is like a ring buffer. After data has been written to the media
disk, the journal can overwrite it. But the ssd doesn't know that this data
is no longer of use and can be removed. So add discard to tell the ssd to
remove this data.
This may not increase performance, but it can increase the
lifetime of the ssd.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
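As a rough illustration of discarding committed space in a ring buffer, the sketch below computes which byte extents could be handed to a discard request once the journal region from start up to (but not including) end is no longer needed. The function name discard_ranges and the (offset, length) representation are hypothetical; the commit message does not describe how FileJournal computes or issues discards, and details such as the journal header are ignored here.

```python
def discard_ranges(start, end, size):
    """Return (offset, length) extents covering ring positions [start, end).

    start and end are byte offsets inside a ring buffer of `size` bytes.
    When the region wraps past the end of the buffer, two extents are returned.
    """
    if not (0 <= start < size and 0 <= end < size):
        raise ValueError("offsets must lie inside the ring buffer")
    if start == end:
        return []  # nothing to discard
    if start < end:
        return [(start, end - start)]
    # The region wraps: discard the tail of the buffer, then the head.
    extents = [(start, size - start), (0, end)]
    return [(offset, length) for offset, length in extents if length > 0]
```

For example, in a 100-byte ring a committed region from 90 to 20 would be split into the tail extent (90, 10) and the head extent (0, 20).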
06a245a added a section def to assembly files; I added it twice to
this file. There's no damage, but a compiler warning (on machines with
yasm installed)
Loic Dachary [Tue, 7 Oct 2014 19:18:00 +0000 (21:18 +0200)]
autotools: add --enable-docker
Docker based tests should be explicit instead of auto-detected. It is
good that they do not run if docker is not available. It would be bad if
they run when the developer does not expect them to create docker
containers.
Loic Dachary [Tue, 7 Oct 2014 17:02:45 +0000 (19:02 +0200)]
ceph-disk: test prepare / activate on a device
This indirectly tests that partprobe is called after zap because it
would fail to map the partitions to /dev/disk/by-partuuid otherwise.
It also indirectly tests the implementation of init=none when using a
block device because the test would fail to put an object into the rbd
pool using the device otherwise.
runs test/ceph-disk.sh in an ubuntu 14.04 docker container. Once the
container is populated and ceph compiled, running a test script roughly
requires entering the container and running make TESTS=tests/foo.sh check
* docker build ceph-ubuntu-14.04 using ubuntu.dockerfile as a Dockerfile
* it will run apt-get install ceph compilation / run dependencies
* git clone the-local-clone ceph-ubuntu-14.04
* docker run ceph-ubuntu-14.04 make -j4 in the ceph-ubuntu-14.04 clone
* docker run test/ceph-disk.sh
test/docker-test.sh is the command line interface for
test/docker-test-helper.sh which can be invoked from shell scripts.
test/ubuntu.dockerfile and test/ubuntu.dockerfile are regular
Dockerfiles which allow substitution of environment variables.
Loic Dachary [Thu, 9 Oct 2014 16:52:17 +0000 (18:52 +0200)]
ceph-disk: run partprobe after zap
Not running partprobe after zapping a device can lead to the following:
* ceph-disk prepare /dev/loop2
* links are created in /dev/disk/by-partuuid
* ceph-disk zap /dev/loop2
* links are not removed from /dev/disk/by-partuuid
* ceph-disk prepare /dev/loop2
* some links are not created in /dev/disk/by-partuuid
This is assuming there is a bug in the way udev events are handled by
the operating system.
Loic Dachary [Fri, 10 Oct 2014 08:23:34 +0000 (10:23 +0200)]
ceph-disk: encapsulate partprobe / partx calls
Add the update_partition function to reduce code duplication.
The action is made an argument even though it is currently always -a,
because it will be -d when deleting a partition.
Use the update_partition function in prepare_journal_dev
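A minimal sketch of what such a helper could look like follows. It is not the ceph-disk implementation: the use_partx switch and the injectable run parameter are assumptions added for illustration, and the commit message does not say how the choice between partprobe and partx is made.

```python
import subprocess


def update_partition(action, dev, use_partx=False, run=subprocess.check_call):
    """Ask the kernel to refresh its view of the partition table on dev.

    action is the partx flag: "-a" to add partitions, "-d" to delete them.
    use_partx is a hypothetical switch; partprobe takes no action flag.
    run is injectable so the command can be inspected without executing it.
    """
    if use_partx:
        cmd = ["partx", action, dev]
    else:
        cmd = ["partprobe", dev]
    run(cmd)
    return cmd
```

Passing the action as an argument keeps a single call site for both cases, so a later caller that deletes a partition can pass -d without duplicating the logic.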
Jason Dillaman [Tue, 18 Nov 2014 02:49:26 +0000 (21:49 -0500)]
librbd: protect list_children from invalid child pool IoCtxs
While listing child images, don't ignore error codes returned
from librados when creating an IoCtx. This will prevent seg
faults from occurring when an invalid IoCtx is used.
Fixes: #10123
Backport: giant, firefly, dumpling
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Blaine Gardner [Mon, 17 Nov 2014 23:17:15 +0000 (17:17 -0600)]
Fix bug #10096 (ceph-disk umount race condition)
Bug: http://tracker.ceph.com/issues/10096
Brief: Unmounting temporary mount point failed due to file being 'busy'.
Root cause could not be easily determined due to timing variances caused
by debug attempts. Race condition exists.
Solution: Implement a retry with incremental backoff as a viable
workaround. This workaround is okay because (1) Finding the root cause
would take a not insignificant amount of time/effort. (2) The workaround
is a more general fix for any process that might cause the exhibited
behavior.
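A retry with incremental backoff can be sketched as below. The function names, the number of attempts and the delay values are assumptions chosen for illustration, not the values used in ceph-disk.

```python
import subprocess
import time


def retry_with_backoff(operation, attempts=5, first_delay=0.5, step=0.5,
                       sleep=time.sleep):
    """Call operation() until it succeeds or the attempts are exhausted.

    The wait grows by `step` seconds after every failure (incremental
    backoff). The last failure is re-raised to the caller.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except subprocess.CalledProcessError:
            if attempt == attempts:
                raise
            sleep(delay)
            delay += step


def unmount(path):
    """Unmount path; raises CalledProcessError if umount fails (e.g. busy)."""
    subprocess.check_call(["umount", path])
```

A caller would wrap the unmount of the temporary mount point as retry_with_backoff(lambda: unmount(path)), which gives whatever process is keeping the file busy some time to release it.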
Adam Spiers [Sun, 16 Nov 2014 20:52:36 +0000 (15:52 -0500)]
doc: fix typos in diagram for incomplete write
In this example of a write of v2 of the object being interrupted, OSD2
would never have any version of the D1 chunk. It only has the old v1
version of the D2 chunk.
Loic Dachary [Fri, 14 Nov 2014 00:16:10 +0000 (01:16 +0100)]
common: do not omit shard when ghobject NO_GEN is set
Do not silence the display of shard_id when generation is NO_GEN.
The JSON representation of erasure coded objects used by
ceph_objectstore_tool needs the shard_id to find the file containing the chunk.
Minimal testing is added to ceph_objectstore_tool.py
Loic Dachary [Thu, 13 Nov 2014 16:32:14 +0000 (17:32 +0100)]
tests: ceph_objectstore_tool.py replace stop.sh with init-ceph
The stop.sh will stop all ceph-* processes. Use the init-ceph script
instead to selectively kill the daemons run by the vstart.sh cluster
used for ceph_objectstore_tool.
Loic Dachary [Thu, 13 Nov 2014 16:27:01 +0000 (17:27 +0100)]
tests: ceph_objectstore_tool.py run faster by default
By default use only a small number of objects to speed up the tests. If
the argument "big" is given, use a large number of objects as it may
help find some problems.
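The selection between the two modes could look like the sketch below. The object counts and the names SMALL, BIG and pick_sizes are hypothetical; the commit message does not give the values used by ceph_objectstore_tool.py.

```python
import sys

# Hypothetical sizes; the real values are not stated in the commit message.
SMALL = {"num_objects": 40}
BIG = {"num_objects": 800}


def pick_sizes(argv):
    """Return the large configuration only when "big" is the first argument."""
    if len(argv) > 1 and argv[1] == "big":
        return BIG
    return SMALL


if __name__ == "__main__":
    print(pick_sizes(sys.argv))
```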
Loic Dachary [Thu, 13 Nov 2014 16:21:48 +0000 (17:21 +0100)]
tests: ceph_objectstore_tool.py run mon and osd on specific port
By default vstart.sh runs MDS daemons, but they are not needed for the
tests; run only mon and osd instead. Instead of using the default vstart.sh
port, which may conflict with an already running vstart.sh, set
CEPH_PORT=7400, which is not used by any other test run with make check.
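One way a Python test could pass the port to vstart.sh is through the environment of the child process. The helper name vstart_env below is hypothetical, and how the test actually invokes vstart.sh is not shown in the commit message.

```python
import os


def vstart_env(port=7400, base=None):
    """Return a copy of the environment with CEPH_PORT set to port.

    base defaults to the current process environment; it is never modified.
    """
    env = dict(os.environ if base is None else base)
    env["CEPH_PORT"] = str(port)
    return env
```

The returned dictionary can be given as the env argument of subprocess calls that launch vstart.sh, so the port setting does not leak into the environment of the test process itself.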