Jason Dillaman [Tue, 24 Feb 2015 01:09:56 +0000 (20:09 -0500)]
tests: fix potential race conditions in test_ImageWatcher
The tests were sending invalid responses back to ImageWatchers
(missing the result code), which had the potential to allow the
lock to be acquired sooner than the test was expecting since
ImageWatcher would assume the lack of a response code meant no
clients owned the exclusive lock and would retry as fast as
possible.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Mon, 23 Feb 2015 17:16:39 +0000 (12:16 -0500)]
librbd: fixed snap create race conditions
Since the post-snap create header update runs asynchronously
in a finalizer callback, it's possible that the snapshot
is not immediately visible. Also, if a proxied snap create
message is replayed, it's possible for the client to receive
an EEXIST error.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
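A minimal sketch of how a caller might tolerate a replayed snap create request, assuming it can check whether the snapshot already exists. `finish_snap_create` and its parameters are hypothetical names for illustration and are not the librbd API.

```cpp
#include <cerrno>
#include <set>
#include <string>

// Hypothetical helper: map the result of a (possibly replayed) snap create
// request to the result reported to the caller. r is 0 or a negative errno.
int finish_snap_create(int r, const std::set<std::string>& snaps,
                       const std::string& snap_name, bool replayed) {
  // A replayed request can find the snapshot created by the first attempt.
  // Assumption: in that case -EEXIST means the original request succeeded.
  if (r == -EEXIST && replayed && snaps.count(snap_name) > 0) {
    return 0;
  }
  return r;
}
```

Only the replayed case is treated as success here; a first-time -EEXIST is still reported to the caller.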
Added a unique client id to announcement messages so that duplicate
lock release / acquired / requested messages can be detected and
ignored by the client. Also fixed an issue processing the result
code for async operations.
Fixes: #10898
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
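A rough sketch of how a per-client id can make duplicate announcements detectable. The `ClientId` fields and the `LockOwnerTracker` class are assumptions made for illustration; the actual message layout and handling in librbd may differ.

```cpp
#include <cstdint>

// Hypothetical client identifier carried in each announcement message.
struct ClientId {
  uint64_t gid;     // assumed: unique per client instance
  uint64_t handle;  // assumed: unique per watch
  bool operator==(const ClientId& o) const {
    return gid == o.gid && handle == o.handle;
  }
};

// Remembers the announced lock owner. Each handler returns true only when
// the message changes the recorded state, so duplicates can be ignored.
class LockOwnerTracker {
 public:
  bool handle_acquired(const ClientId& id) {
    if (has_owner_ && owner_ == id) {
      return false;  // duplicate "acquired" from the same client
    }
    owner_ = id;
    has_owner_ = true;
    return true;
  }

  bool handle_released(const ClientId& id) {
    if (!has_owner_ || !(owner_ == id)) {
      return false;  // duplicate or stale "released"
    }
    has_owner_ = false;
    return true;
  }

 private:
  ClientId owner_{};
  bool has_owner_ = false;
};
```

Without an id in the message, the receiver could not tell a replayed "acquired" from a new owner taking the lock.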
Jason Dillaman [Fri, 20 Feb 2015 17:50:26 +0000 (12:50 -0500)]
rbd: disable RBD exclusive locking by default
Utilize the existing rbd_default_features config option to
control whether or not to enable RBD exclusive locking and
object map features by default. Also added a new option to
the rbd cli to specify the image features when creating images.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
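rbd_default_features is interpreted as a bitmask, so enabling or disabling a feature by default amounts to setting or clearing one bit. A small sketch follows; the constant names are local to this example, and the bit positions are believed to match librbd's feature definitions but should be checked against the tree.

```cpp
#include <cstdint>

// Local names for this example; bit positions assumed to match librbd.
constexpr uint64_t FEATURE_LAYERING       = 1ULL << 0;
constexpr uint64_t FEATURE_STRIPINGV2     = 1ULL << 1;
constexpr uint64_t FEATURE_EXCLUSIVE_LOCK = 1ULL << 2;
constexpr uint64_t FEATURE_OBJECT_MAP     = 1ULL << 3;

// Returns true when every bit in `feature` is set in the features mask.
bool has_feature(uint64_t features, uint64_t feature) {
  return (features & feature) == feature;
}
```

For example, a mask of 3 enables layering and striping but neither exclusive locking nor the object map, while 13 enables layering, exclusive locking and the object map.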
Jason Dillaman [Thu, 19 Feb 2015 20:38:32 +0000 (15:38 -0500)]
osdc: pass fadvise op flags to WritebackHandler read requests
librbd was previously attempting to cast the provided Context to
retrieve the fadvise flags. To eliminate the unsafe cast, now
the fadvise flags are directly passed to the WritebackHandler::read
callback.
Fixes: #10914
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
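The shape of the change can be sketched with simplified interfaces. Everything below is hypothetical and trimmed down; the real WritebackHandler::read takes more parameters. The point is only that the flags arrive as an explicit argument instead of being recovered by casting the completion Context.

```cpp
#include <cstdint>

// Simplified stand-in for the completion callback type.
struct Context {
  virtual ~Context() {}
  virtual void finish(int r) = 0;
};

// Simplified stand-in for the handler interface: op_flags is passed
// directly, so implementations never need to inspect on_finish.
struct WritebackHandler {
  virtual ~WritebackHandler() {}
  virtual void read(uint64_t off, uint64_t len, int op_flags,
                    Context* on_finish) = 0;
};

// Example implementation that records the flags it was given.
struct RecordingHandler : WritebackHandler {
  int last_op_flags = 0;
  void read(uint64_t /*off*/, uint64_t /*len*/, int op_flags,
            Context* on_finish) override {
    last_op_flags = op_flags;
    on_finish->finish(0);
  }
};

// Example completion that stores the result code.
struct ResultHolder : Context {
  int result = -1;
  void finish(int r) override { result = r; }
};
```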
Sage Weil [Thu, 12 Feb 2015 22:16:53 +0000 (14:16 -0800)]
osd/OSDMap: include pg_temp count in summary
It is useful to know how big the pg_temp map is. Strictly speaking
this is part of the OSDMap so I'm including it here. It looks like
this:
osdmap e25: 3 osds: 3 up, 3 in; 1 remapped pgs
It might be more user-friendly to put it in a line with the pgmap
somewhere (where other pg counts are included), but it doesn't quite
fit there either. So sticking with where it lives in the data
structure!
Samuel Just [Tue, 17 Feb 2015 18:08:01 +0000 (10:08 -0800)]
PG: compensate for bug 10780 on older peers
Previously, there was a harmless bug where we didn't fill in the
last_epoch_started field for a peer which we are resetting the
last_backfill line for. It's no longer harmless since we use that
as the activation epoch, so if the peer is missing the MIN_SIZE
feature bit, we fill in the last_epoch_started it meant to fill in.
It was possible for ImageWatcher to attempt to re-acquire held locks
via context callbacks. This issue affected resizing/flattening when
no work was required and rescheduling a watch upon two successive
failures.
Fixes: #10899
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Boris Ranto [Wed, 7 Jan 2015 09:00:21 +0000 (10:00 +0100)]
ceph.spec: split ceph-devel to appropriate *-devel packages
ceph-devel contains various header files/bindings for several
libraries, this patch creates *-devel packages for all the libraries
separately and provides the compatibility layer for the split.
http://tracker.ceph.com/issues/10884 Refs: #10884
Signed-off-by: Boris Ranto <branto@redhat.com>
Amended by Ken Dreyer <kdreyer@redhat.com> to add version numbers to the
Obsoletes, add Obsoletes to the libradosstriper1-devel and
libcephfs_jni1-devel subpackages, adjust the librados documentation, and
add the Redmine issue number to this commit log.
Jason Dillaman [Sat, 14 Feb 2015 06:24:44 +0000 (01:24 -0500)]
librbd: enforce write ordering with snapshot
The md_lock is now held for reading when scheduling write/discards.
Since snap_create now holds the lock for writing and flushes all
pending IO, write/discard operations will now be consistent for a
given request across objects.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Sat, 7 Feb 2015 14:13:10 +0000 (09:13 -0500)]
librbd: use separate files for snapshot object maps
Instead of relying on the built-in object snapshot support,
create a separate object map object for each image snapshot.
This will allow a future repair utility to rebuild the object
map for an image's snapshots.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Mapped IoCtx::write_full to existing test method used by the
ObjectWriteOperation::write_full API method. Also added missing
cls_log implementation for debugging.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Wed, 4 Feb 2015 07:44:50 +0000 (02:44 -0500)]
cls_rbd: added CRC validation to object map
Added a footer to the object map which stores a header CRC and
data CRCs for each 4KB chunk. Updates to the object map only
require recomputing the CRC for the affected 4KB chunk.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
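A minimal sketch of the per-chunk idea, assuming a plain bitwise CRC-32 as a stand-in for whatever checksum cls_rbd actually uses. `ChecksummedMap` and its members are hypothetical names, and the header CRC is left out for brevity.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320). Stand-in checksum only.
uint32_t crc32_bitwise(const uint8_t* data, size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit) {
      if (crc & 1u) {
        crc = (crc >> 1) ^ 0xEDB88320u;
      } else {
        crc >>= 1;
      }
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

constexpr size_t kChunkSize = 4096;

// Hypothetical container: one CRC per 4KB chunk of the data.
struct ChecksummedMap {
  std::vector<uint8_t> data;
  std::vector<uint32_t> chunk_crcs;

  explicit ChecksummedMap(size_t size) : data(size, 0) {
    chunk_crcs.resize((size + kChunkSize - 1) / kChunkSize);
    for (size_t c = 0; c < chunk_crcs.size(); ++c) {
      recompute_chunk(c);
    }
  }

  uint32_t compute_chunk(size_t chunk) const {
    size_t off = chunk * kChunkSize;
    size_t len = std::min(kChunkSize, data.size() - off);
    return crc32_bitwise(data.data() + off, len);
  }

  void recompute_chunk(size_t chunk) { chunk_crcs[chunk] = compute_chunk(chunk); }

  // An update touches one byte, so only that byte's chunk CRC is recomputed.
  void update(size_t index, uint8_t value) {
    data[index] = value;
    recompute_chunk(index / kChunkSize);
  }

  // Returns true when every stored chunk CRC matches the data.
  bool verify() const {
    for (size_t c = 0; c < chunk_crcs.size(); ++c) {
      if (chunk_crcs[c] != compute_chunk(c)) {
        return false;
      }
    }
    return true;
  }
};
```

The cost of an update is bounded by the chunk size rather than the size of the whole map, which is the benefit the commit describes.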
t-miyamae [Thu, 12 Feb 2015 06:45:02 +0000 (15:45 +0900)]
tests: remove tests for when init() is not called in shec (#10839)
init2_1, init2_2 and init2_3 are equivalent to init_1, so they are also removed.
encode_6, decode_6 and create_ruleset_3 are null argument tests,
but the arguments are C++ references, so they are also removed.
Guang Yang [Fri, 13 Feb 2015 09:19:30 +0000 (09:19 +0000)]
osd: number of degraded objects in EC pool is wrong when there is OSD down(in)
With an EC pool (crush rule choose indep), the size of the 'acting' list does not change when an OSD is down, because CRUSH_ITEM_NONE is used to replace the down OSD. In this case, 'actingset' should be used to calculate the degraded objects.
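A simplified illustration of why the two containers give different counts. The helper names and the degraded formula are assumptions for this sketch (the real accounting in the OSD is more involved); CRUSH_ITEM_NONE is given the value used by CRUSH for a missing item.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Placeholder that fills the slot of a down OSD in an EC acting list.
constexpr int CRUSH_ITEM_NONE = 0x7fffffff;

// Hypothetical helper: the set of OSDs actually present in `acting`.
std::set<int> make_actingset(const std::vector<int>& acting) {
  std::set<int> result;
  for (int osd : acting) {
    if (osd != CRUSH_ITEM_NONE) {
      result.insert(osd);
    }
  }
  return result;
}

// Simplified formula: every object is degraded once per missing shard.
uint64_t degraded_objects(uint64_t num_objects, size_t pool_size,
                          size_t present) {
  return present < pool_size ? num_objects * (pool_size - present) : 0;
}
```

With acting = [0, NONE, 2] and a pool size of 3, acting.size() is still 3 and yields zero degraded objects, while the actingset has 2 members and yields one degraded copy per object.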
Implement check_experimental_feature_enabled so that it returns the
message instead of unconditionally displaying it via derr. It allows the
caller to display it in another context.
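A sketch of the "return the message, let the caller print it" pattern. The function name comes from the text above, but the parameter list and the message wording are assumptions made for illustration and are not the signature in the Ceph tree.

```cpp
#include <set>
#include <string>

// Hypothetical signature: reports whether `feature` is enabled and writes a
// human-readable message into *message instead of logging it directly.
bool check_experimental_feature_enabled(const std::set<std::string>& enabled,
                                        const std::string& feature,
                                        std::string* message) {
  if (enabled.count(feature) > 0) {
    *message = "WARNING: experimental feature '" + feature + "' is enabled";
    return true;
  }
  *message = "experimental feature '" + feature + "' is not enabled";
  return false;
}
```

The caller can then decide whether the message goes to a log, to stderr, or into a command reply.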
mon: MonCap: take EntityName instead when expanding profiles
entity_name_t is tightly coupled to the messenger, while EntityName is
tied to auth. When expanding profiles we want to tie the profile
expansion to the entity that was authenticated. Otherwise we may run
into weird behavior, such as caps validation failing because a given
client messenger inst does not match the auth entity it used.
has entity_name_t 'client.12345' and EntityName 'osd.0'. Using
entity_name_t during profile expansion would not allow the client access
to daemon-private/osd.X/foo (client.12345 != osd.X).
Fixes: #10844
Backport: firefly,giant
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
Yan, Zheng [Thu, 12 Feb 2015 12:24:45 +0000 (20:24 +0800)]
mds: fix decoding of InodeStore::oldest_snap
There is no ENCODE_START/FINISH block when encoding an inode that is
embedded in a dentry, so we can't use the encoding version to check
whether the buffer contains InodeStore::oldest_snap. Instead, we check
whether the buffer iterator has reached the end of the buffer.
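A self-contained sketch of the "decode the trailing field only if bytes remain" technique. `Cursor`, `Inode`, `kNoSnap` and the helper functions are hypothetical stand-ins, not Ceph's bufferlist API, and values are copied in host byte order for simplicity.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical read cursor over an encoded buffer.
struct Cursor {
  const std::vector<uint8_t>& buf;
  size_t pos;
  bool end() const { return pos >= buf.size(); }
};

void encode_u64(std::vector<uint8_t>& buf, uint64_t v) {
  uint8_t raw[sizeof(v)];
  std::memcpy(raw, &v, sizeof(v));
  buf.insert(buf.end(), raw, raw + sizeof(v));
}

bool decode_u64(Cursor& c, uint64_t* out) {
  if (c.buf.size() - c.pos < sizeof(*out)) {
    return false;  // not enough bytes left
  }
  std::memcpy(out, c.buf.data() + c.pos, sizeof(*out));
  c.pos += sizeof(*out);
  return true;
}

constexpr uint64_t kNoSnap = ~0ULL;  // hypothetical "not present" default

struct Inode {
  uint64_t ino;
  uint64_t oldest_snap;
};

// With no version header available, the optional field is detected by
// checking whether the cursor has reached the end of the buffer.
bool decode_inode(Cursor& c, Inode* in) {
  if (!decode_u64(c, &in->ino)) {
    return false;
  }
  in->oldest_snap = kNoSnap;
  if (!c.end()) {
    return decode_u64(c, &in->oldest_snap);
  }
  return true;
}
```

This only works when the optional field is the last thing in the buffer, which is the situation the commit describes.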
Loic Dachary [Wed, 11 Feb 2015 11:20:18 +0000 (12:20 +0100)]
tests: update docker helper documentation
The tags for the centos repository changed from centos6, centos7 to 6
and 7 which is consistent with the other distribution
repositories. Update the documentation accordingly.
Loic Dachary [Wed, 11 Feb 2015 11:15:02 +0000 (12:15 +0100)]
tests: one Dockerfile per repository:tag
There cannot be a common Dockerfile for all repository:tag combinations
of a given operating system. The only way to customize a Dockerfile is
via variable substitution and it cannot conveniently address all
differences between versions.
Create one Dockerfile per operating system version instead, i.e. one
Dockerfile for centos:7, one for centos:6, etc.
Kefu Chai [Thu, 12 Feb 2015 05:02:45 +0000 (13:02 +0800)]
osd: fix OSDCap parser on old boost/spirit
* On boost 1.41, the ascii::space skipper fails to skip the spaces at the
beginning of the parsed string, so as a workaround we replace the `lit(' ')`
in the grammar spec with `ascii::blank`. This also simplifies the grammar
a little bit.
Samuel Just [Thu, 12 Feb 2015 01:07:05 +0000 (17:07 -0800)]
osd/: include version_t in extra_reqids with promote
Otherwise, we can't return the correct user version on a dup request.
Note: This patch does not handle compatibility with the variant which
does not include the version_t field. Since it's been less than 2 weeks
and we haven't had a release, I think that's ok since handling
compatibility would require some overhead in the encode/decode
methods.