Jason Dillaman [Mon, 19 Jan 2015 22:33:41 +0000 (17:33 -0500)]
librbd: throttle async progress callbacks
Ensure that no more than one outstanding progress callback
is queued for notification. This will allow remote progress
updates to be sent at a rate that all watch/notify
clients can support.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
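The throttling idea can be sketched roughly as follows. This is a minimal illustration, not the librbd implementation: the ProgressThrottle class, its method names, and the choice to keep only the newest update while one is outstanding are all assumptions made for the example.

```python
import threading


class ProgressThrottle:
    """Hypothetical helper: allow at most one progress notification in flight."""

    def __init__(self, send):
        self._send = send            # assumed callable: send(offset, total)
        self._lock = threading.Lock()
        self._in_flight = False
        self._pending = None         # newest update seen while one was in flight

    def update(self, offset, total):
        with self._lock:
            if self._in_flight:
                # Assumption: keep only the latest update, dropping older ones.
                self._pending = (offset, total)
                return
            self._in_flight = True
        self._send(offset, total)

    def notify_complete(self):
        """Call once the outstanding notification has been acknowledged."""
        with self._lock:
            nxt = self._pending
            self._pending = None
            if nxt is None:
                self._in_flight = False
                return
        # Still in flight: forward the newest update that was held back.
        self._send(*nxt)
```

With this shape, a slow watch/notify client only ever sees the most recent progress value once it acknowledges the previous one.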
Jason Dillaman [Wed, 14 Jan 2015 20:56:15 +0000 (15:56 -0500)]
librbd: add more robust retry handling to maintenance ops
When image locking is enabled, snapshot create, resize, and
flatten are coordinated with the lock owner. Previously, if the
lock owner changed during one of these operations, the
operation would fail. Now librbd will attempt to restart the
operation with the new lock owner (or become the owner itself).
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
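A rough sketch of the restart behaviour, under stated assumptions: the LockOwnerChanged exception, the run_with_retry helper, and the attempt limit are hypothetical names chosen for illustration and do not come from librbd.

```python
class LockOwnerChanged(Exception):
    """Hypothetical signal that the lock owner changed mid-operation."""


def run_with_retry(request_op, max_attempts=5):
    """Re-issue request_op until it succeeds or the attempt limit is hit.

    request_op is assumed to locate the current lock owner (or take the
    lock itself) each time it is called, so a plain re-invocation is
    enough to restart the operation against the new owner.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request_op()
        except LockOwnerChanged:
            if attempt == max_attempts:
                raise
```

The attempt limit is an assumption of this sketch; the commit message does not say whether retries are bounded.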
Jason Dillaman [Wed, 14 Jan 2015 16:49:13 +0000 (11:49 -0500)]
librbd: assert header lock ownership for maint operations
The resize, flatten, and snapshot maintenance operations now
use the new assert_lock feature to ensure that the current
client still owns the header lock when making changes.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Tue, 18 Nov 2014 08:56:41 +0000 (03:56 -0500)]
cls_lock: New assert_locked operation
The assert_locked operation can be combined with other
RADOS ops to prevent an update to a locked object when
the client doesn't own the lock. It will not attempt to
acquire the lock if the object is not currently locked.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Thu, 9 Oct 2014 04:00:17 +0000 (00:00 -0400)]
librbd: Coordinate maintenance through exclusive lock leader
When the exclusive lock feature is enabled, only a single client can
modify the image. As a result, certain maintenance activities
need to be proxied from the maintenance client to the active
leader.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Mon, 10 Nov 2014 17:25:50 +0000 (12:25 -0500)]
librbd: Create async versions of long-running maintenance operations
Resize and flatten now have async versions. The existing resize
and flatten operations now use the async versions internally. The
async operations will be used by the client holding the exclusive
lock when it receives maintenance requests from other clients.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Sat, 24 Jan 2015 07:28:07 +0000 (02:28 -0500)]
librbd: trim would not complete if exclusive lock is lost
The trim completion context was not properly invoked if the
image's exclusive lock was lost between issuing a librados call
and receiving its completion.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
delco225 [Fri, 23 Jan 2015 09:32:30 +0000 (10:32 +0100)]
bug: error when installing ceph dependencies with install-deps.sh
The parsing is sensitive to i18n and will fail if, for instance, the locale is set to French.
Work around the problem by always setting the language to C so the script
can safely assume all output will be in English.
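The same workaround can be illustrated outside the shell script. This is a minimal sketch only: run_in_c_locale is a hypothetical helper, and setting both LC_ALL and LANG is an assumption, since the commit message does not say which variable install-deps.sh sets.

```python
import os
import subprocess
import sys


def run_in_c_locale(argv):
    """Run a command with the locale forced to C and return its stdout."""
    # Assumption: overriding LC_ALL and LANG is enough to get untranslated output.
    env = dict(os.environ, LC_ALL="C", LANG="C")
    result = subprocess.run(
        argv, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # The child process reports the locale variable it actually received.
    print(run_in_c_locale(
        [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"]
    ))
```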
Ken Dreyer [Fri, 23 Jan 2015 22:08:34 +0000 (15:08 -0700)]
ceph.spec.in: use wildcards to capture man pages
Use wildcards to capture gzipped man pages for ceph-clsinfo(8) and
librados-config(8). In addition to future-proofing us against
possible compression type changes down the road, this also aligns us
with the existing convention that's used to capture the rest of the man
page files.
Jason Dillaman [Fri, 23 Jan 2015 17:56:56 +0000 (12:56 -0500)]
librbd: potential deadlock on close_image
The owner_lock was incorrectly held when unregistering the image
watcher. It was possible for the ImageWatcher finisher to be
running code that was then deadlocked waiting to acquire the
owner_lock while the close_image thread was attempting to shut down
the deadlocked finisher.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Wed, 21 Jan 2015 13:58:57 +0000 (08:58 -0500)]
librbd: fix copy-on-read / resize down race condition
There was a rare race condition between a pending CoR operation
and a resize down operation resulting in a CoR copyup past the
new, reduced parent overlap. This commit also adds more detail
to the CoR log messages.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Tue, 20 Jan 2015 17:12:43 +0000 (12:12 -0500)]
test: add rados_nobjects_list_xyz functions to librados test stub
The new RBD copy-on-read unit test case uses these RADOS functions
to verify that the CoR operation was successful. This implements
these functions in the librados_test_stub library.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Sat, 17 Jan 2015 05:18:24 +0000 (00:18 -0500)]
librbd: use finisher for copy-on-read copyup fulfillment
When the RBD cache is enabled, the ObjectCacher does not allow
reentrancy to read the full object. As a temporary workaround,
use the Finisher to handle CoR read requests.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Wed, 21 Jan 2015 22:23:00 +0000 (17:23 -0500)]
librbd: schedule header refresh after watch error
If a librados watch error occurs, it is possible that one
or more events were missed. Therefore, flag the header as
dirty so that it will be reloaded after the next operation.
Fixes: #4092
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Sage Weil [Wed, 21 Jan 2015 21:03:57 +0000 (13:03 -0800)]
vstart.sh: pull default CEPH_PORT from .ceph_port
This lets you put a unique port in .ceph_port in your working dir and
vstart.sh instances will avoid each other without having to pass
CEPH_PORT=... to each one each time.
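The lookup order described above can be sketched as follows. This is an illustration rather than the vstart.sh logic itself: the default_ceph_port function and the FALLBACK_PORT value are assumptions, as is the rule that an explicit CEPH_PORT wins over the file.

```python
import os
from pathlib import Path

FALLBACK_PORT = 6789  # assumed fallback when neither source provides a port


def default_ceph_port(workdir=".", environ=None):
    """Pick a port: CEPH_PORT from the environment, else .ceph_port, else fallback."""
    environ = os.environ if environ is None else environ
    if environ.get("CEPH_PORT"):
        return int(environ["CEPH_PORT"])
    port_file = Path(workdir) / ".ceph_port"
    if port_file.is_file():
        return int(port_file.read_text().strip())
    return FALLBACK_PORT
```

Each working directory can then carry its own .ceph_port file, and instances started from different directories pick different ports without extra arguments.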
Samuel Just [Thu, 22 Jan 2015 00:25:47 +0000 (16:25 -0800)]
PGBackend: fix and clarify be_select_auth_object
Previously, auth would end up containing every object without a
self-evident defect -- even if they did not match each other. Instead
of filtering out the non-matching items there, be_select_auth_object now
returns one valid object and be_compare_scrubmaps gathers the others
which match it.
We can be smarter by doing this in be_select_auth_object and selecting
the largest matching set, but for now this is simpler.
Fixes: 10524
Signed-off-by: Samuel Just <sjust@redhat.com>
Loic Dachary [Tue, 20 Jan 2015 17:53:52 +0000 (18:53 +0100)]
ceph-disk: do not reuse partition if encryption required
If encryption is required, an existing journal partition must not be
reused. If an existing partition that was not prepared with ceph-disk is
found and reused, the caller will assume it is encrypted although it is
not.
mon: PGMonitor: available size 0 if no osds on pool's ruleset
get_rule_avail() may return a negative value, but we were using the
result blindly, assuming it would always be unsigned. We would end up
with weird values if the ruleset had no osds.
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
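The failure mode and the fix can be illustrated with a small sketch. The helper names are hypothetical and the 64-bit width is an assumption; the commit message only says that a negative return value was treated as unsigned.

```python
def as_unsigned64(value):
    """Reinterpret a signed integer as an unsigned 64-bit value (assumed width)."""
    return value & 0xFFFFFFFFFFFFFFFF


def available_size(rule_avail):
    """Report 0 available bytes when the rule yields a negative result."""
    return max(rule_avail, 0)
```

Here as_unsigned64(-1) yields 18446744073709551615, the kind of weird value described above, while available_size(-1) yields 0.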