git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 years agotest: encoding: add LevelDBStoreStats and ceph_data_stats to types.h 3835/head
Joao Eduardo Luis [Fri, 23 Jan 2015 11:30:02 +0000 (11:30 +0000)]
test: encoding: add LevelDBStoreStats and ceph_data_stats to types.h

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agomon/mon_types.h: allow testing encode/decode of LevelDBStoreStats
Joao Eduardo Luis [Mon, 26 Jan 2015 12:59:21 +0000 (12:59 +0000)]
mon/mon_types.h: allow testing encode/decode of LevelDBStoreStats

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoinclude/util.h: allow testing encoding/decoding of ceph_data_stats
Joao Eduardo Luis [Mon, 26 Jan 2015 12:58:24 +0000 (12:58 +0000)]
include/util.h: allow testing encoding/decoding of ceph_data_stats

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoinclude/util.h: initialize ceph_data_stats to zero
Joao Eduardo Luis [Wed, 21 Jan 2015 17:47:20 +0000 (17:47 +0000)]
include/util.h: initialize ceph_data_stats to zero

We decode this struct on the monitor.  Although at the moment there are no
reports of any weird behavior from not initializing it, let's avoid the
problem completely by setting member values to zero -- just in case and
because it's a good policy.

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agomon: mon_types.h: initialize LevelDBStoreStats and avoid craziness
Joao Eduardo Luis [Wed, 21 Jan 2015 17:45:02 +0000 (17:45 +0000)]
mon: mon_types.h: initialize LevelDBStoreStats and avoid craziness

On a mixed-version cluster, say firefly and dumpling, the first round of
data health checks could end up with crazy values being reported for
data usage/availability for dumpling monitors.

This happens because dumpling does not support reporting of store
stats; since we did not default the values to zero on decoding, we
would end up decoding trash.

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
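
The failure mode described in the commit above can be illustrated with a
minimal Python sketch. The field names and the dictionary standing in for the
wire format are hypothetical, not Ceph's actual encoding; the only point is
that fields an older peer never sends should fall back to zero instead of
whatever happened to be in memory.

```python
from dataclasses import dataclass


@dataclass
class StoreStats:
    # Hypothetical fields. Every member defaults to zero so that a partial
    # decode never exposes uninitialized values.
    bytes_total: int = 0
    bytes_sst: int = 0
    bytes_log: int = 0


def decode_store_stats(payload: dict) -> StoreStats:
    """Decode stats from a peer; fields the peer did not send stay zero."""
    stats = StoreStats()
    for name in ("bytes_total", "bytes_sst", "bytes_log"):
        if name in payload:
            setattr(stats, name, int(payload[name]))
    return stats
```

An empty payload, as an older monitor that reports no store stats might
produce, decodes to all zeros rather than to arbitrary values.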
10 years agoMerge branch 'hammer' of jenkins.front.sepia.ceph.com:ceph/ceph into hammer
Jenkins [Fri, 27 Feb 2015 21:21:03 +0000 (13:21 -0800)]
Merge branch 'hammer' of jenkins.front.sepia.ceph.com:ceph/ceph into hammer

10 years agoMerge pull request #3825 from ceph/wip-hammer-gplv2-text
Loic Dachary [Fri, 27 Feb 2015 18:09:06 +0000 (19:09 +0100)]
Merge pull request #3825 from ceph/wip-hammer-gplv2-text

Add GPLv2 text file

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years ago0.93 v0.93
Jenkins [Fri, 27 Feb 2015 17:52:54 +0000 (09:52 -0800)]
0.93

10 years agoAdd GPLv2 text file 3825/head
Ken Dreyer [Fri, 27 Feb 2015 17:32:37 +0000 (10:32 -0700)]
Add GPLv2 text file

Most of the ceph tree is LGPLv2.1, but there are some files that are
under the full GPLv2.

Add a copy of the GNU General Public License (version 2) to the
distribution. This file was copied verbatim from
https://www.gnu.org/licenses/gpl-2.0.txt

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
10 years agoMerge pull request #3681 from ceph/wip-fusesystem-10710
Gregory Farnum [Thu, 26 Feb 2015 23:54:21 +0000 (15:54 -0800)]
Merge pull request #3681 from ceph/wip-fusesystem-10710

ceph-fuse: check for failures on system() invocation

Reviewed-by: John Spray <john.spray@redhat.com>
10 years agoceph-fuse: test dentry invalidation options and fail out if we fail 3681/head
Greg Farnum [Thu, 26 Feb 2015 23:20:11 +0000 (15:20 -0800)]
ceph-fuse: test dentry invalidation options and fail out if we fail

We identify the Linux kernel version and based on that either expect to
be able to invalidate dentries effectively, or expect to be able to remount
the ceph-fuse mountpoint. Test it using the Client functions and callbacks by
spinning off a thread to invoke the test that is separate from the main
FUSE loop.

Most unfortunately, there doesn't seem to be a good interface to tell
FUSE to shut down if we need to do that. See
http://fuse.996288.n3.nabble.com/libfuse-exiting-fuse-session-loop-td10686.html
I tried changing our signal invocation or attempting a simple action on
the mount point but those were ineffectual at terminating the remaining
processes; fusermount actually gets rid of them all.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
10 years agoClient: support using dentry invalidation callbacks on older kernels
Greg Farnum [Thu, 26 Feb 2015 23:12:47 +0000 (15:12 -0800)]
Client: support using dentry invalidation callbacks on older kernels

This brings back a few small code chunks that were removed in
0827bb79ea5127e6763f6e904dfa1a3266046ffb. We check the kernel version,
and if it is less than 3.18 we use these dentry invalidation callbacks
instead of the remount callback. This should resolve a number of
issues with racing against remount, including #10916, and lets
unprivileged users on older kernels run even if they can't apply
options on mount (#10542).

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
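
The version check described in the commit above might look roughly like the
following Python sketch. The function names and the returned strategy labels
are illustrative assumptions; only the 3.18 threshold comes from the commit
message.

```python
import re


def parse_kernel_version(release: str) -> tuple:
    """Extract (major, minor) from a release string like '3.13.0-45-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def invalidation_strategy(release: str) -> str:
    # Kernels older than 3.18 use the dentry invalidation callbacks;
    # 3.18 and newer use the remount callback. The labels are hypothetical.
    if parse_kernel_version(release) < (3, 18):
        return "dentry_invalidate"
    return "remount"
```

On a live system the release string could come from `platform.release()`;
the sketch takes it as a parameter so the decision is easy to test.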
10 years agoClient: add functions to test remount functionality
Greg Farnum [Thu, 26 Feb 2015 23:18:31 +0000 (15:18 -0800)]
Client: add functions to test remount functionality

Unprivileged users can't use options when remounting; see
http://tracker.ceph.com/issues/10542. We're about to use this
in ceph-fuse when starting up.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
10 years agoClient: check for failures on system() invocation
Greg Farnum [Tue, 10 Feb 2015 19:11:06 +0000 (11:11 -0800)]
Client: check for failures on system() invocation

Fixes: #10710
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
10 years agoMerge pull request #3806 from ceph/wip-10961
Josh Durgin [Thu, 26 Feb 2015 20:01:01 +0000 (12:01 -0800)]
Merge pull request #3806 from ceph/wip-10961

qa/workunits/rbd/copy.sh: explicitly choose the image format

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3802 from ceph/hammer-10912
Gregory Farnum [Thu, 26 Feb 2015 18:14:09 +0000 (10:14 -0800)]
Merge pull request #3802 from ceph/hammer-10912

client: re-send requsets before composing the cap reconnect message

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoqa/workunits/rbd/copy.sh: explicitly choose the image format 3806/head
Jason Dillaman [Thu, 26 Feb 2015 17:00:41 +0000 (12:00 -0500)]
qa/workunits/rbd/copy.sh: explicitly choose the image format

The rbd CLI now utilizes the rbd_default_format configuration
setting, therefore the copy test now needs to tell rbd which image
format it is expecting to create.

Fixes: #10961
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agoMerge pull request #3800 from ceph/wip-10864-hammer-packaging-rbd-udev
Sage Weil [Thu, 26 Feb 2015 05:05:09 +0000 (21:05 -0800)]
Merge pull request #3800 from ceph/wip-10864-hammer-packaging-rbd-udev

packaging: move rbd udev rules to ceph-common

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoclient: re-send requsets before composing the cap reconnect message 3802/head
Yan, Zheng [Wed, 25 Feb 2015 07:27:59 +0000 (15:27 +0800)]
client: re-send requsets before composing the cap reconnect message

After commit 419800fe (client: re-send request when MDS enters reconnecting
stage), cephfs client can send both unsafe requests and normal requests when
MDS is in reconnecting stage. Normal requests can have embedded cap releases, and
the client code encodes these embedded cap releases after composing the cap
reconnect message. This causes the client to silently drop some caps. The fix
is to re-send requests (which adds embedded cap releases) before composing the
cap reconnect message.

Fixes: #10912
Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit 8ea5a811b3b32b99b65e6170976af3d42e6c9ba0)

10 years agoMerge pull request #3796 from ceph/wip-librbd-async-operations
Josh Durgin [Thu, 26 Feb 2015 02:52:19 +0000 (18:52 -0800)]
Merge pull request #3796 from ceph/wip-librbd-async-operations

librbd: better handling for async maintenance requests

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3791 from ceph/wip-librbd-mdlock
Jason Dillaman [Thu, 26 Feb 2015 02:37:48 +0000 (21:37 -0500)]
Merge pull request #3791 from ceph/wip-librbd-mdlock

librbd: fix object map locking

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
10 years agopackaging: move rbd udev rules to ceph-common 3800/head
Ken Dreyer [Wed, 25 Feb 2015 22:27:32 +0000 (15:27 -0700)]
packaging: move rbd udev rules to ceph-common

We should ship the RBD udev rules in the same package that ships
/usr/bin/rbd.  This package happens to be ceph-common, so move the udev
rules there.

The udev rules rely on the ceph-rbdnamer utility, so move that utility
and its man page as well.

http://tracker.ceph.com/issues/10864 Refs: #10864

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
10 years agolibrbd: remove unnecessary md_lock usage 3791/head
Josh Durgin [Thu, 26 Feb 2015 01:41:52 +0000 (17:41 -0800)]
librbd: remove unnecessary md_lock usage

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: move object_map_lock acquisition into refresh()
Josh Durgin [Thu, 26 Feb 2015 01:02:42 +0000 (17:02 -0800)]
librbd: move object_map_lock acquisition into refresh()

Every caller was acquiring this just for these calls.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: don't check if object map is enabled before refreshing
Josh Durgin [Wed, 25 Feb 2015 23:54:00 +0000 (15:54 -0800)]
librbd: don't check if object map is enabled before refreshing

This check is now done internally by the object map.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: remove object map on rollback if needed
Josh Durgin [Wed, 25 Feb 2015 22:34:15 +0000 (14:34 -0800)]
librbd: remove object map on rollback if needed

When rolling back to a snapshot that did not have object map enabled,
delete the head object map.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: clarify md_lock usage
Josh Durgin [Wed, 25 Feb 2015 02:49:26 +0000 (18:49 -0800)]
librbd: clarify md_lock usage

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agotest_librbd: add simple test for object map snapshot consistency
Josh Durgin [Wed, 25 Feb 2015 02:31:15 +0000 (18:31 -0800)]
test_librbd: add simple test for object map snapshot consistency

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: use snap_lock to protect ImageCtx->size
Josh Durgin [Wed, 25 Feb 2015 01:19:59 +0000 (17:19 -0800)]
librbd: use snap_lock to protect ImageCtx->size

Since this is often looked up by snap_id anyway, snap_lock
is easy to use for this.

This lets us avoid taking md_lock in many places.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: hold snap_lock while reading parent info in diff_iterate
Josh Durgin [Tue, 24 Feb 2015 22:43:10 +0000 (14:43 -0800)]
librbd: hold snap_lock while reading parent info in diff_iterate

Caught by the re-added assertions in ImageCtx::get_parent_info()

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agotest_librbd: close ioctx after imagectx
Josh Durgin [Tue, 24 Feb 2015 22:31:27 +0000 (14:31 -0800)]
test_librbd: close ioctx after imagectx

There's no need to explicitly close the ioctx. Doing so may cause
problems when the Images using it are destroyed afterwards.  Just let
normal cleanup at the end of the block take care of it in the correct
order.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agorbd: fix --image-feature parsing
Josh Durgin [Tue, 24 Feb 2015 05:36:13 +0000 (21:36 -0800)]
rbd: fix --image-feature parsing

Need to use _witharg(), not _flag()

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: apply flag read failure to all snaps
Josh Durgin [Tue, 24 Feb 2015 04:51:23 +0000 (20:51 -0800)]
librbd: apply flag read failure to all snaps

Don't check just the features of head, since it may be possible to
disable object map in the future.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: make ImageCtx->object_map always present
Josh Durgin [Tue, 24 Feb 2015 04:28:38 +0000 (20:28 -0800)]
librbd: make ImageCtx->object_map always present

This simplifies locking by obviating the NULL checks.  We no longer
need md_lock to protect these accesses. We can use object_map_lock
instead, to make sure no one reads an object map while it's being
updated.

Keep track of whether the object map is enabled for a given snapshot
internally. In each public method, check this state, and automatically
set it correctly when refreshing the object map. During snapshot
removal, unconditionally try to remove the object map object, to
protect against bugs leaking objects, and to be consistent with image
removal.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agotests: add unit test to verify async requests time out 3796/head
Jason Dillaman [Wed, 25 Feb 2015 17:02:00 +0000 (12:02 -0500)]
tests: add unit test to verify async requests time out

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agolibrbd: restart async requests if lock owner doesn't report progress
Jason Dillaman [Wed, 25 Feb 2015 17:00:26 +0000 (12:00 -0500)]
librbd: restart async requests if lock owner doesn't report progress

Detect the case of a crashed lock owner by waiting for up to 30 seconds
for an async request progress message from the leader.  If a progress
message isn't received, restart the request (and possibly take ownership
of the lock).

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agolibrbd: replace Finisher/SafeTimer use with facade
Jason Dillaman [Wed, 25 Feb 2015 04:35:31 +0000 (23:35 -0500)]
librbd: replace Finisher/SafeTimer use with facade

Replace the two Context threading classes used within
ImageWatcher with a facade to orchestrate the scheduling
and canceling of Context task callbacks.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agolibrbd: cancel in-progress maint operations before releasing lock
Jason Dillaman [Tue, 24 Feb 2015 19:33:44 +0000 (14:33 -0500)]
librbd: cancel in-progress maint operations before releasing lock

Ensure that all in-flight maintenance operations (resize, flatten) are
not running when the exclusive lock is released.  The lock will be
released when transitioning to a snapshot, closing the image, or
cooperatively when another client requests the lock.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agolibrbd: flush context potentially completing too early
Jason Dillaman [Tue, 24 Feb 2015 17:53:45 +0000 (12:53 -0500)]
librbd: flush context potentially completing too early

If the async operation associated with a flush request completes,
only complete the flush contexts if no previous operations are
still in flight. Otherwise, move the flush contexts to an older
in-flight async operation.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agoMerge pull request #3799 from ceph/wip-librbd-image-watcher-tests
Josh Durgin [Thu, 26 Feb 2015 00:42:46 +0000 (16:42 -0800)]
Merge pull request #3799 from ceph/wip-librbd-image-watcher-tests

tests: add additional test coverage for ImageWatcher RPC

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: take ImageCtx->snap_lock for write in add_snap()
Josh Durgin [Tue, 24 Feb 2015 03:50:55 +0000 (19:50 -0800)]
librbd: take ImageCtx->snap_lock for write in add_snap()

add_snap() updates the ImageCtx snapshot metadata in memory, as well
as reading the flags as part of the object map snapshot. Both of these
require holding snap_lock.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: use snap_lock to protect ImageCtx->flags
Josh Durgin [Tue, 24 Feb 2015 03:49:12 +0000 (19:49 -0800)]
librbd: use snap_lock to protect ImageCtx->flags

This is another step towards eliminating md_lock from the writeback
path. Almost all the places that use ImageCtx->flags already use
snap_lock, so there's no need to create a new lock. For the others,
add a helper, test_flags() that acquires the lock, similar to
test_features().

This also makes sure we look up flags of the snapshot we're operating
on, instead of those for head.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: add locking asserts to ImageCtx
Josh Durgin [Tue, 24 Feb 2015 03:03:32 +0000 (19:03 -0800)]
librbd: add locking asserts to ImageCtx

A bunch of these used to be here, but were removed when converting to
RWLocks, before RWLocks had is_[w]locked() methods.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: fix ImageWatcher::is_lock_supported() locking
Josh Durgin [Tue, 24 Feb 2015 02:49:34 +0000 (18:49 -0800)]
librbd: fix ImageWatcher::is_lock_supported() locking

Take snap_lock while reading ImageCtx->snap_id, and
look up the features by snap_id as well.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: add and use a test_features() helper
Josh Durgin [Tue, 24 Feb 2015 02:46:26 +0000 (18:46 -0800)]
librbd: add and use a test_features() helper

This gets the appropriate locks, and checks the currently open
snapshot instead of head.  Looking up features by snap_id prepares us
for future addition or removal of e.g. an object map throughout the
life of an image.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: use ImageCtx->snap_lock for ImageCtx->features
Josh Durgin [Tue, 24 Feb 2015 02:44:05 +0000 (18:44 -0800)]
librbd: use ImageCtx->snap_lock for ImageCtx->features

This was being protected by md_lock, but that has become too coarse
since it is used to prevent writes from proceeding while flushing
caches for a snapshot. With the addition of ObjectMap and
ImageWatcher, writeback could try to acquire md_lock again, leading to
a deadlock.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years agotests: add additional test coverage for ImageWatcher RPC 3799/head
Jason Dillaman [Wed, 25 Feb 2015 19:59:38 +0000 (14:59 -0500)]
tests: add additional test coverage for ImageWatcher RPC

Test flatten, resize, and snap create RPC messages along with
basic error code return paths.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agolibrbd: add ostream formatter for NotifyOp
Jason Dillaman [Wed, 25 Feb 2015 17:31:55 +0000 (12:31 -0500)]
librbd: add ostream formatter for NotifyOp

Allow for reuse of the NotifyOp to string conversions within
dencoder and tests.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agofuse: do not invoke ll_register_callbacks() on finalize
Greg Farnum [Fri, 13 Feb 2015 03:23:43 +0000 (19:23 -0800)]
fuse: do not invoke ll_register_callbacks() on finalize

We were passing in a NULL data structure, probably in an attempt to
let things clean up -- but our implementation just returns with a NULL
pass-in value, so drop it for clarity.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
10 years agoMerge pull request #3794 from ceph/wip-10862-hammer
Gregory Farnum [Wed, 25 Feb 2015 19:07:59 +0000 (11:07 -0800)]
Merge pull request #3794 from ceph/wip-10862-hammer

Backport: mon: do not try and "deactivate" the last MDS

10 years agoMerge pull request #3788 from ceph/wip-devel-python-split
Ken Dreyer [Wed, 25 Feb 2015 19:00:43 +0000 (12:00 -0700)]
Merge pull request #3788 from ceph/wip-devel-python-split

split python-ceph into python-{rados,rbd,cephfs}

Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
10 years agomon: do not try and "deactivate" the last MDS 3794/head
John Spray [Mon, 23 Feb 2015 14:23:56 +0000 (14:23 +0000)]
mon: do not try and "deactivate" the last MDS

Fixes: #10862
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit a2867987bc561479733839c3891fa14bfcebb849)

10 years agoqa: fix python-ceph reference 3788/head
Boris Ranto [Tue, 24 Feb 2015 20:13:15 +0000 (12:13 -0800)]
qa: fix python-ceph reference

Signed-off-by: Boris Ranto <branto@redhat.com>
10 years agodoc: fix python-ceph refs in docs
Boris Ranto [Tue, 24 Feb 2015 20:12:59 +0000 (12:12 -0800)]
doc: fix python-ceph refs in docs

Signed-off-by: Boris Ranto <branto@redhat.com>
10 years agoceph.spec: specify version
Boris Ranto [Tue, 24 Feb 2015 20:11:15 +0000 (12:11 -0800)]
ceph.spec: specify version

Signed-off-by: Boris Ranto <branto@redhat.com>
10 years agodebian: split python-ceph
Sage Weil [Tue, 24 Feb 2015 18:09:47 +0000 (10:09 -0800)]
debian: split python-ceph

- move argparse to ceph-common
- split out rados, rbd, and cephfs bindings into their own packages
- keep python-ceph as a metapackage

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoSplit python-ceph to appropriate python-* packages
Boris Ranto [Wed, 7 Jan 2015 09:26:49 +0000 (10:26 +0100)]
Split python-ceph to appropriate python-* packages

python-ceph contains various header files/bindings for several
libraries, this patch creates *-devel packages for all the
libraries separately and provides the compatibility layer for
the split.

Signed-off-by: Boris Ranto <branto@redhat.com>
10 years agoMerge pull request #3742 from ceph/wip-10788
Samuel Just [Tue, 24 Feb 2015 19:15:06 +0000 (11:15 -0800)]
Merge pull request #3742 from ceph/wip-10788

osd: proxy features with proxied reads; only proxy reads to new peers

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agoMerge pull request #3786 from ceph/wip-librbd-python-tests
Josh Durgin [Tue, 24 Feb 2015 16:45:48 +0000 (08:45 -0800)]
Merge pull request #3786 from ceph/wip-librbd-python-tests

tests: speed up Python RBD random data generation

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agotests: speed up Python RBD random data generation 3786/head
Jason Dillaman [Tue, 24 Feb 2015 14:25:14 +0000 (09:25 -0500)]
tests: speed up Python RBD random data generation

The RBD large_write test case was taking multiple minutes to
run under a Fedora 21 VM.  Replaced the million+ random number
generator calls with a single call to os.urandom. The test
now completes within seconds.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
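
The change described in the commit above can be sketched as follows. This is
not the test's actual code; the function names are made up for illustration,
and the slow variant simply shows the per-byte pattern the commit says was
replaced.

```python
import os
import random


def random_data_slow(size: int) -> bytes:
    # One random number generator call per byte: roughly a million calls
    # for a 1 MiB buffer.
    return bytes(random.randint(0, 255) for _ in range(size))


def random_data_fast(size: int) -> bytes:
    # A single call that returns `size` random bytes.
    return os.urandom(size)
```

Both helpers return a buffer of the requested length; the fast one avoids
the per-byte interpreter overhead.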
10 years agoMerge pull request #3780 from ceph/wip-osdc-watch-error
Josh Durgin [Tue, 24 Feb 2015 01:21:04 +0000 (17:21 -0800)]
Merge pull request #3780 from ceph/wip-osdc-watch-error

osdc: watch error callback invoked on cancelled context

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3781 from ceph/wip-librbd-image-watcher-tests
Josh Durgin [Tue, 24 Feb 2015 01:20:52 +0000 (17:20 -0800)]
Merge pull request #3781 from ceph/wip-librbd-image-watcher-tests

tests: fix potential race conditions in test_ImageWatcher

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agotests: fix potential race conditions in test_ImageWatcher 3781/head
Jason Dillaman [Tue, 24 Feb 2015 01:09:56 +0000 (20:09 -0500)]
tests: fix potential race conditions in test_ImageWatcher

The tests were sending invalid responses back to ImageWatchers
(missing the result code), which had the potential to allow the
lock to be acquired sooner than the test was expecting since
ImageWatcher would assume the lack of a response code meant no
clients owned the exclusive lock and would retry as fast as
possible.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agoosdc: watch error callback invoked on cancelled context 3780/head
Jason Dillaman [Tue, 24 Feb 2015 00:45:03 +0000 (19:45 -0500)]
osdc: watch error callback invoked on cancelled context

The C_DoWatchError context did not verify whether or not the
watch was cancelled prior to invoking the callback.  This
resulted in sporadic crashes when reconnect errors bubbled
up to destroyed objects.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
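
The guard described in the commit above amounts to checking a cancellation
flag before invoking the callback. A minimal Python sketch, with a
hypothetical class name and a plain callable standing in for the watch
context:

```python
class WatchErrorCallback:
    """Deliver a watch error to a handler unless the watch was cancelled."""

    def __init__(self, handler):
        self._handler = handler
        self._cancelled = False

    def cancel(self) -> None:
        # Called when the watch is torn down; later errors must be dropped.
        self._cancelled = True

    def complete(self, err: int) -> bool:
        # Verify the watch is still live before touching the handler, so an
        # error never reaches an object that has already been destroyed.
        if self._cancelled:
            return False
        self._handler(err)
        return True
```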
10 years agoMerge pull request #3779 from liewegas/wip-watch-timeout-test
Josh Durgin [Mon, 23 Feb 2015 22:51:49 +0000 (14:51 -0800)]
Merge pull request #3779 from liewegas/wip-watch-timeout-test

make watch timeout test less likely to fail under thrashing

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoceph_test_rados_api_watch_notify: wait longer for watch timeout 3779/head
Sage Weil [Mon, 23 Feb 2015 22:46:16 +0000 (14:46 -0800)]
ceph_test_rados_api_watch_notify: wait longer for watch timeout

OSD thrashing can delay this indefinitely; longer delay means lower
probability of that happening.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: better debug for maybe_handle_cache 3742/head
Sage Weil [Thu, 19 Feb 2015 17:05:49 +0000 (09:05 -0800)]
osd: better debug for maybe_handle_cache

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd,mon: explicitly specify OSD features in MOSDBoot
Sage Weil [Wed, 18 Feb 2015 22:53:04 +0000 (14:53 -0800)]
osd,mon: explicitly specify OSD features in MOSDBoot

We are using the connection features to populate the features field in the
OSDMap, but this is the *intersection* of mon and osd features, not the
osd features.  Fix this by explicitly specifying the features in
MOSDBoot.

Fixes: #10911
Backport: giant, firefly
Signed-off-by: Sage Weil <sage@redhat.com>
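
The distinction in the commit above, between the features negotiated on a
connection and the features a daemon actually supports, can be shown with
plain bitmasks. The feature bits below are hypothetical and chosen only for
illustration.

```python
# Hypothetical feature bits, not Ceph's real feature flags.
FEATURE_A = 1 << 0
FEATURE_B = 1 << 1
FEATURE_C = 1 << 2


def connection_features(local: int, peer: int) -> int:
    """Features usable on a connection: only those both ends support."""
    return local & peer


mon_features = FEATURE_A | FEATURE_B
osd_features = FEATURE_A | FEATURE_B | FEATURE_C

# The negotiated set drops FEATURE_C even though the OSD supports it, so
# recording it as "the OSD's features" would under-report the OSD.
negotiated = connection_features(mon_features, osd_features)
```

Sending the OSD's own mask explicitly, as the commit does in MOSDBoot, avoids
depending on what the monitor happens to support.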
10 years agoosd: do not proxy reads unless all OSDs proxy features too
Sage Weil [Mon, 16 Feb 2015 22:18:40 +0000 (14:18 -0800)]
osd: do not proxy reads unless all OSDs proxy features too

Specifically, the object_copy_data_t encoding changed such that the reply
encoding is dependent on features; if we proxy such a read to an old
OSD it will use *our* features to encode instead of the original OSD's.

This effectively conditionally reverts 8e145e08ede625adfb5d41216d7777d6c9707bd0
when the cluster features aren't all present.

Fixes: #10788
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/OSDMap: cache get_up_osd_features
Sage Weil [Mon, 16 Feb 2015 17:30:39 +0000 (09:30 -0800)]
osd/OSDMap: cache get_up_osd_features

This method is O(n) and called from a few places for each IO operation.
Cache the value since it does not change over the lifetime of a single
epoch.  Invalidate on apply_incremental() and decode.

Signed-off-by: Sage Weil <sage@redhat.com>
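
The caching scheme described in the commit above can be sketched in Python as
below. The class is a simplified stand-in, not the real OSDMap: it assumes
the map is just a dictionary of per-OSD feature masks, and the `scans`
counter exists only to make the effect of the cache observable.

```python
class OSDMapSketch:
    ALL_FEATURES = (1 << 64) - 1

    def __init__(self, up_osd_features):
        self._up_osd_features = dict(up_osd_features)  # osd id -> feature mask
        self._cached_features = None
        self.scans = 0  # number of O(n) scans performed

    def get_up_osd_features(self) -> int:
        # Recompute only when the cache is empty; the result cannot change
        # within a single epoch.
        if self._cached_features is None:
            self.scans += 1
            features = self.ALL_FEATURES
            for mask in self._up_osd_features.values():
                features &= mask
            self._cached_features = features
        return self._cached_features

    def apply_incremental(self, changes) -> None:
        # Any change to the map invalidates the cached value.
        self._up_osd_features.update(changes)
        self._cached_features = None
```

Repeated calls within one epoch hit the cache; the scan runs again only after
`apply_incremental()`.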
10 years agoMerge pull request #3777 from ceph/wip-librbd-snap-create-race
Josh Durgin [Mon, 23 Feb 2015 18:00:01 +0000 (10:00 -0800)]
Merge pull request #3777 from ceph/wip-librbd-snap-create-race

librbd: fixed snap create race conditions

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: fixed snap create race conditions 3777/head
Jason Dillaman [Mon, 23 Feb 2015 17:16:39 +0000 (12:16 -0500)]
librbd: fixed snap create race conditions

Since the post-snap create header update runs asynchronously
in a finalizer callback, it's possible that the snapshot
is not immediately visible.  Also, if a proxied snap create
message is replayed, it's possible for the client to receive
an EEXIST error.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agoMerge pull request #3758 from ceph/wip-10898
Josh Durgin [Mon, 23 Feb 2015 17:08:11 +0000 (09:08 -0800)]
Merge pull request #3758 from ceph/wip-10898

librbd: improved ImageWatcher duplicate message detection

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3770 from ceph/wip-librbd-exclusive-lock-config
Josh Durgin [Mon, 23 Feb 2015 16:37:04 +0000 (08:37 -0800)]
Merge pull request #3770 from ceph/wip-librbd-exclusive-lock-config

rbd: disable RBD exclusive locking by default

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agolibrbd: improved ImageWatcher duplicate message detection 3758/head
Jason Dillaman [Thu, 19 Feb 2015 19:43:15 +0000 (14:43 -0500)]
librbd: improved ImageWatcher duplicate message detection

Added a unique client id to announcement messages so that duplicate
lock release / acquired / requested messages can be detected and
ignored by the client.  Also fixed an issue processing the result
code for async operations.

Fixes: #10898
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
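
One way a client id can be used to drop duplicate announcements is sketched
below. This is an assumption about the general idea, not librbd's actual
logic: the class name, the operation strings, and the rule that a repeated
announcement from the same client is a duplicate are all hypothetical.

```python
class LockAnnouncementFilter:
    def __init__(self):
        self.owner = None  # client id of the current lock owner, if known

    def handle(self, op: str, client_id: str) -> bool:
        """Return True if the message changed state, False if it was ignored."""
        if op == "acquired":
            if self.owner == client_id:
                return False  # same client announcing again: duplicate
            self.owner = client_id
            return True
        if op == "released":
            if self.owner != client_id:
                return False  # stale or repeated release: ignore
            self.owner = None
            return True
        raise ValueError(f"unknown op: {op!r}")
```

Without the client id, a replayed "released" message could not be told apart
from a genuine release by the current owner.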
10 years agolibrbd: add test instances for watch/notify messages
Jason Dillaman [Thu, 19 Feb 2015 17:33:39 +0000 (12:33 -0500)]
librbd: add test instances for watch/notify messages

Ensure that the librbd watch/notify messages are tested
for backwards compatibility.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agolibrbd: minor cleanup of ImageWatcher messages
Jason Dillaman [Thu, 19 Feb 2015 02:56:34 +0000 (21:56 -0500)]
librbd: minor cleanup of ImageWatcher messages

Moved all RPC messages to their own classes to facilitate cleaner
version control and backward compatibility.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agorbd: disable RBD exclusive locking by default 3770/head
Jason Dillaman [Fri, 20 Feb 2015 17:50:26 +0000 (12:50 -0500)]
rbd: disable RBD exclusive locking by default

Utilize the existing rbd_default_features config option to
control whether or not to enable RBD exclusive locking and
object map features by default.  Also added a new option to
the rbd cli to specify the image features when creating images.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agoMerge pull request #3741 from ceph/wip-cmake-vstart
Kefu Chai [Sun, 22 Feb 2015 02:10:40 +0000 (10:10 +0800)]
Merge pull request #3741 from ceph/wip-cmake-vstart

cmake fixes and enable vstart with cmake build

Reviewed-by: Kefu Chai <kchai@redhat.com>
10 years agoMerge pull request #3729 from guangyy/wip-4254-hammer
Sage Weil [Sat, 21 Feb 2015 18:10:12 +0000 (10:10 -0800)]
Merge pull request #3729 from guangyy/wip-4254-hammer

osd: number of degraded objects in EC pool is wrong when there is OSD down(in).

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3722 from ceph/wip-10787
Sage Weil [Sat, 21 Feb 2015 18:08:59 +0000 (10:08 -0800)]
Merge pull request #3722 from ceph/wip-10787

mon: fix osd_epoch cache bug 10787

Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoMerge branch 'osd-fix' of git://github.com/wonzhq/ceph into hammer
Sage Weil [Sat, 21 Feb 2015 18:07:20 +0000 (10:07 -0800)]
Merge branch 'osd-fix' of git://github.com/wonzhq/ceph into hammer

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agoMerge branch 'wip-5639' of git://github.com/rzarzynski/ceph into hammer
Sage Weil [Sat, 21 Feb 2015 18:05:55 +0000 (10:05 -0800)]
Merge branch 'wip-5639' of git://github.com/rzarzynski/ceph into hammer

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3760 from ceph/wip-10883
Sage Weil [Fri, 20 Feb 2015 23:18:41 +0000 (15:18 -0800)]
Merge pull request #3760 from ceph/wip-10883

osd: Fix FileJournal wrap to get header out first

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3767 from athanatos/wip-10881
Sage Weil [Fri, 20 Feb 2015 23:14:03 +0000 (15:14 -0800)]
Merge pull request #3767 from athanatos/wip-10881

Wip 10881

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3768 from athanatos/wip-10780
Sage Weil [Fri, 20 Feb 2015 23:10:35 +0000 (15:10 -0800)]
Merge pull request #3768 from athanatos/wip-10780

Wip 10780

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3769 from athanatos/wip-10908
Sage Weil [Fri, 20 Feb 2015 23:08:20 +0000 (15:08 -0800)]
Merge pull request #3769 from athanatos/wip-10908

Wip 10908

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agocmake: radosgw, radosgw-admin related fixes 3741/head
Yehuda Sadeh [Fri, 13 Feb 2015 23:41:29 +0000 (15:41 -0800)]
cmake: radosgw, radosgw-admin related fixes

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years agovstart.sh: can use binaries outside of ceph/src
Yehuda Sadeh [Fri, 13 Feb 2015 22:34:06 +0000 (14:34 -0800)]
vstart.sh: can use binaries outside of ceph/src

If CEPH_BUILD_ROOT is set, use that path; otherwise run
everything from the current directory as before.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
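
The fallback described in the commit above is sketched here in Python rather
than shell. The helper name and the `bin` subdirectory under the build root
are assumptions made for illustration; only the CEPH_BUILD_ROOT variable and
the current-directory default come from the commit message.

```python
import os


def resolve_binary(name: str, environ=os.environ) -> str:
    """Pick the path of a binary based on CEPH_BUILD_ROOT, if it is set."""
    build_root = environ.get("CEPH_BUILD_ROOT")
    if build_root:
        # Assumed layout: binaries live under <build root>/bin.
        return os.path.join(build_root, "bin", name)
    # Default: run from the current directory, as before.
    return os.path.join(".", name)
```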
10 years agoMerge pull request #3744 from ceph/wip-10884-hammer-rpm-devel-split
Ken Dreyer [Fri, 20 Feb 2015 17:44:45 +0000 (10:44 -0700)]
Merge pull request #3744 from ceph/wip-10884-hammer-rpm-devel-split

ceph.spec: split ceph-devel to appropriate *-devel packages

Reviewed-by: Sandon Van Ness <sandon@redhat.com>
Reviewed-by: Boris Ranto <branto@redhat.com>
10 years agoMerge pull request #3764 from ceph/wip-10919
Josh Durgin [Fri, 20 Feb 2015 17:00:26 +0000 (09:00 -0800)]
Merge pull request #3764 from ceph/wip-10919

cls_rbd: invalidate bufferlist CRC when updating object map

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agocls_rbd: invalidate bufferlist CRC when updating object map 3764/head
Jason Dillaman [Fri, 20 Feb 2015 15:37:59 +0000 (10:37 -0500)]
cls_rbd: invalidate bufferlist CRC when updating object map

The bit vector was not invalidating the bufferlist's CRC, resulting
in peer OSDs rejecting the write op due to a mismatched CRC on the
message.

Fixes: #10919
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
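
The class of bug fixed in the commit above, a cached checksum that goes stale
when the underlying bytes change, can be reproduced with a small Python
sketch. The class is hypothetical and uses `zlib.crc32` purely as a stand-in
checksum; it is not the bufferlist implementation.

```python
import zlib


class CachedCrcBuffer:
    def __init__(self, data: bytes):
        self._data = bytearray(data)
        self._crc = None  # cached checksum, or None if not yet computed

    def crc32(self) -> int:
        if self._crc is None:
            self._crc = zlib.crc32(bytes(self._data))
        return self._crc

    def set_byte(self, index: int, value: int) -> None:
        self._data[index] = value
        # Without this invalidation, crc32() would keep returning the
        # checksum of the old contents.
        self._crc = None
```

A receiver that recomputes the checksum over the new bytes would reject a
message carrying the stale cached value, which matches the symptom the commit
describes.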
10 years agoMerge pull request #3759 from ceph/wip-10914
Josh Durgin [Fri, 20 Feb 2015 02:47:16 +0000 (18:47 -0800)]
Merge pull request #3759 from ceph/wip-10914

osdc: pass fadvise op flags to WritebackHandler read requests

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoosd: Fix FileJournal wrap to get header out first 3760/head
David Zafman [Thu, 19 Feb 2015 00:21:12 +0000 (16:21 -0800)]
osd: Fix FileJournal wrap to get header out first

Correct and restore assert that was removed

Caused by f46b1b473fce0322a672b16c7739e569a45054b6
Fixes: #10883
Backport: dumpling, firefly, giant

Signed-off-by: David Zafman <dzafman@redhat.com>
10 years agoosdc: pass fadvise op flags to WritebackHandler read requests 3759/head
Jason Dillaman [Thu, 19 Feb 2015 20:38:32 +0000 (15:38 -0500)]
osdc: pass fadvise op flags to WritebackHandler read requests

librbd was previously attempting to cast the provided Context to
retrieve the fadvise flags.  To eliminate the unsafe cast, now
the fadvise flags are directly passed to the WritebackHandler::read
callback.

Fixes: #10914
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agoosd/OSDMap: include pg_temp count in summary
Sage Weil [Thu, 12 Feb 2015 22:16:53 +0000 (14:16 -0800)]
osd/OSDMap: include pg_temp count in summary

It is useful to know how big the pg_temp map is.  Strictly speaking
this is part of the OSDMap so I'm including it here.  It looks like
this:

     osdmap e25: 3 osds: 3 up, 3 in; 1 remapped pgs

It might be more user-friendly to put it in a line with the pgmap
somewhere (where other pg counts are included), but it doesn't quite
fit there either.  So sticking with where it lives in the data
structure!

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit db06582a067439a57e0d7f0da2193fc479736200)

10 years agoMerge pull request #3663 from ceph/wip-10765
Sage Weil [Thu, 19 Feb 2015 19:02:26 +0000 (11:02 -0800)]
Merge pull request #3663 from ceph/wip-10765

librados: close watch/notify race

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3693 from ceph/wip-objecter-linger-locking
Sage Weil [Thu, 19 Feb 2015 19:02:08 +0000 (11:02 -0800)]
Merge pull request #3693 from ceph/wip-objecter-linger-locking

objecter: clean up linger op locking

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3710 from ceph/wip-10844
Sage Weil [Thu, 19 Feb 2015 19:01:07 +0000 (11:01 -0800)]
Merge pull request #3710 from ceph/wip-10844

mon: MonCap: take EntityName instead when expanding profiles

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoPG: compensate for bug 10780 on older peers 3768/head
Samuel Just [Tue, 17 Feb 2015 18:08:01 +0000 (10:08 -0800)]
PG: compensate for bug 10780 on older peers

Previously, there was a harmless bug where we didn't fill in the
last_epoch_started field for a peer which we are resetting the
last_backfill line for.  It's no longer harmless since we use that
as the activation epoch, so if the peer is missing the MIN_SIZE
feature bit, we fill in the last_epoch_started it meant to fill in.

Signed-off-by: Samuel Just <sjust@redhat.com>