Sage Weil [Mon, 12 Jan 2015 22:00:21 +0000 (14:00 -0800)]
osd: enable filestore_extsize by default
Note that this will only get used if the kernel is new enough; if it is
older than 3.5 the option will get disabled and extsize will not be used
even if the option is set to true.
Sage Weil [Mon, 12 Jan 2015 21:59:39 +0000 (13:59 -0800)]
os/FileStore: verify kernel is new enough before using extsize ioctl
Old kernels have an XFS bug that exposes uninitialized data when the
extsize hint is set and only partially written. This is fixed by Linux
commit aff3a9edb7080f69f07fe76a8bd089b3dfa4cb5d, documented in XFS bug
http://oss.sgi.com/bugzilla/show_bug.cgi?id=874, and tested by XFS
test xfs/229 to prevent regressions.
Notably the original bug affects kernel 3.2, which is widely deployed with
ubuntu precise 12.04.
Backport: giant, firefly
Signed-off-by: Sage Weil <sage@redhat.com>
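The version gate described in the two commits above can be sketched as follows. This is only an illustration in Python, not the FileStore code: the helper names `parse_kernel_release` and `extsize_allowed` are hypothetical, and the only fact taken from the log is the 3.5 threshold.

```python
import re

# Threshold stated in the commit message: kernels older than 3.5 are unsafe.
MIN_EXTSIZE_KERNEL = (3, 5)

def parse_kernel_release(release):
    """Return (major, minor) from a release string such as '3.2.0-23-generic'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(m.group(1)), int(m.group(2))

def extsize_allowed(release, configured=True):
    """Hypothetical helper: honor the option only on a new-enough kernel."""
    if not configured:
        return False
    return parse_kernel_release(release) >= MIN_EXTSIZE_KERNEL
```

On a live system the release string could be taken from `os.uname().release`.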
Jianpeng Ma [Mon, 5 Jan 2015 12:51:21 +0000 (20:51 +0800)]
test/bufferlist: for root, skip the permission operation in the read_file case
When run as root, the test fails with these errors:
test/bufferlist.cc:1880: Failure
Value of: bl.read_file("testfile", &error)
Actual: 0
Expected: -13
test/bufferlist.cc:1884: Failure
Value of: bl.length()
Actual: 8
Expected: (unsigned)4
Which is: 4
test/bufferlist.cc:1886: Failure
Value of: actual
Actual: "ABC
ABC
"
Expected: "ABC\n"
Which is: "ABC
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
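The failure above follows from root bypassing file permission checks: a read that returns -13 (EACCES) for an ordinary user succeeds for root. A rough Python sketch of that behaviour, assuming a Unix-like system; the helper names are hypothetical and this is not the bufferlist test itself.

```python
import errno
import os
import tempfile

def read_file_rc(path):
    """Return 0 on success or a negative errno, like the return codes in the log."""
    try:
        with open(path, "rb") as f:
            f.read()
        return 0
    except OSError as e:
        return -e.errno

def expected_rc_for_unreadable_file():
    """root is not subject to mode bits, so only non-root expects -EACCES."""
    return 0 if os.geteuid() == 0 else -errno.EACCES

def check_unreadable_file():
    """Create a file, drop all permissions, and try to read it back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"ABC\n")
        os.close(fd)
        os.chmod(path, 0)
        return read_file_rc(path), expected_rc_for_unreadable_file()
    finally:
        os.chmod(path, 0o600)
        os.remove(path)
```

A test written this way would compare the two returned values instead of hard-coding -13.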
Ken Dreyer [Fri, 2 Jan 2015 17:32:26 +0000 (10:32 -0700)]
doc: rm reference to old Ubuntu release
Remove the reference to "Ubuntu 12.10" since this is EOL.
Clarify that we only recommend Ubuntu LTS releases.
(Since this information has a tendency to become stale, perhaps this
whole paragraph should be removed here and we should simply point at the
main OS Recommendations page.)
Sage Weil [Mon, 29 Dec 2014 23:47:28 +0000 (15:47 -0800)]
client: fix quota signed/unsigned warning
client/Client.cc: In member function 'bool Client::is_quota_bytes_exceeded(Inode*, uint64_t)':
client/Client.cc:10393:66: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (quota->max_bytes && (rstat->rbytes + new_bytes) > quota->max_bytes)
Ning Yao [Fri, 26 Dec 2014 04:20:35 +0000 (04:20 +0000)]
PG::filter_snapc: return immediately if no snapc needs trimming
We can return immediately if no snapc needs trimming, instead of iterating over the snapc vector and doing extra checks and operations.
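A minimal sketch of the early return, assuming the "nothing to trim" condition means that either the snap list or the set of removed snaps is empty. That condition is an assumption made for illustration, and this Python function is only a stand-in for the C++ method.

```python
def filter_snapc(snaps, removed_snaps):
    """Drop removed snaps from a snap context (hypothetical stand-in)."""
    # Early return: nothing can be filtered, so skip the scan entirely.
    if not snaps or not removed_snaps:
        return snaps
    # Otherwise keep only the snaps that have not been removed.
    return [s for s in snaps if s not in removed_snaps]
```

In the early-return case the input list is handed back unchanged, so no copy is made.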
Sage Weil [Tue, 23 Dec 2014 20:39:08 +0000 (12:39 -0800)]
mon: provide encoded canonical full OSDMap from primary
Currently we make each monitor apply the incremental and encode the full
map locally. The original motivation was to save bandwidth, but the
savings are minimal to modest and the complexity associated with doing this
is huge.
This strategy also causes problems now that we have OSDMap crc's and old
mons/clusters may have diverging full OSDMaps due to mixed version
clusters. See #10422
Instead, include the encoded full map in the paxos transaction. We will
still apply the incremental and check the crc, but if it fails and we have
the correct version, reload it from disk and move on. If we don't have
it, we continue as before, since that means the primary mon does not
support crc's yet. Once it does, we will start verifying and/or get our
full map back into sync.
Fixes: #10422
Signed-off-by: Sage Weil <sage@redhat.com>
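The verify-then-fall-back flow described above can be illustrated with a toy model. Everything here is an assumption made for the sketch: the "map" is a plain dict, the encoding is a sorted `repr`, and `zlib.crc32` stands in for the real checksum; none of it reflects the actual OSDMap encoding.

```python
import zlib

def apply_incremental(full, inc):
    """Toy incremental: a dict of updates applied on top of a dict 'map'."""
    new = dict(full)
    new.update(inc)
    return new

def encode(osdmap):
    """Toy deterministic encoding (not the real wire format)."""
    return repr(sorted(osdmap.items())).encode()

def next_full_map(local_full, inc, expected_crc, canonical_encoded=None):
    """Return (encoded_map, source) for the next epoch."""
    candidate = encode(apply_incremental(local_full, inc))
    if expected_crc is None:
        # The primary does not supply a crc: behave as before.
        return candidate, "unverified"
    if zlib.crc32(candidate) == expected_crc:
        return candidate, "local"
    if canonical_encoded is not None and zlib.crc32(canonical_encoded) == expected_crc:
        # Local result diverged: take the primary's encoded full map.
        return canonical_encoded, "canonical"
    return candidate, "unverified"
```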
Mykola Golub [Tue, 23 Dec 2014 11:39:33 +0000 (13:39 +0200)]
10132: osd: tries to set ioprio when the config option is blank
According to the documentation, the ioprio params will only be used if both
osd disk thread ioprio class and osd disk thread ioprio priority are
set to a non-default value.
So, add a proper check and do not generate the "set_disk_tp_priority(22)
Invalid argument" warning for the default settings.
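The check amounts to a guard on both options. A minimal sketch, assuming the defaults are an empty string for the class and a negative number for the priority; those default values are assumptions, and `should_set_ioprio` is a hypothetical name.

```python
def should_set_ioprio(ioprio_class, ioprio_priority):
    """Apply ioprio only when both options were moved off their defaults.

    Assumed defaults: class "" (blank) and priority -1.
    """
    return ioprio_class != "" and ioprio_priority >= 0
```

With such a guard, the default configuration never reaches the call that would emit the warning.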
Haomai Wang [Fri, 19 Dec 2014 14:28:54 +0000 (22:28 +0800)]
test_msgr: Avoid deadlock between send_message and dispatch
If a connection holds the Connection's lock and tries to acquire
FakeDispatcher's lock while the gtest thread calls send_message holding
FakeDispatcher's lock and tries to acquire the Connection's lock,
the two threads deadlock.
Also, AsyncConnection::_stop may now spend some time deleting time events,
so an accepting connection can pick up this stopping connection
because the unregister call has not happened yet.
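The lock-order inversion above is commonly avoided by dropping one lock before calling into code that takes the other. The sketch below illustrates that general approach in Python; the class bodies and `run_demo` are hypothetical, and the commit's actual change may differ.

```python
import threading

class Connection:
    def __init__(self):
        self.lock = threading.Lock()
        self.dispatcher = None
        self.sent = []

    def send_message(self, msg):
        with self.lock:
            self.sent.append(msg)

    def deliver(self, msg):
        # The connection lock is held while calling into the dispatcher.
        with self.lock:
            self.dispatcher.dispatch(msg)

class FakeDispatcher:
    def __init__(self, conn):
        self.lock = threading.Lock()
        self.conn = conn
        self.pending = []
        self.received = []

    def dispatch(self, msg):
        with self.lock:
            self.received.append(msg)

    def send_pending(self):
        # Collect work under the dispatcher lock, then release it before
        # send_message so the two locks are never held in the opposite order.
        with self.lock:
            batch, self.pending = self.pending, []
        for msg in batch:
            self.conn.send_message(msg)

def run_demo(rounds=200):
    conn = Connection()
    disp = FakeDispatcher(conn)
    conn.dispatcher = disp

    def sender():
        for i in range(rounds):
            with disp.lock:
                disp.pending.append(i)
            disp.send_pending()

    def receiver():
        for i in range(rounds):
            conn.deliver(i)

    threads = [threading.Thread(target=f, daemon=True) for f in (sender, receiver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    finished = all(not t.is_alive() for t in threads)
    return finished, len(conn.sent), len(disp.received)
```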
Sage Weil [Fri, 19 Dec 2014 19:48:27 +0000 (11:48 -0800)]
librados: add rados_watch_flush() call
Add a call so that callers can make sure all queued callbacks have
completed before shutting down the ioctx. This avoids a segv triggered
by the LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify2Timeout/1
test due to the ioctx being destroyed when the in-progress callback
does a notify_ack.
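The ordering that a flush call makes possible (drain queued callbacks first, destroy the context second) can be sketched generically. This Python example does not use the librados API; `CallbackQueue` and `IoCtx` are hypothetical stand-ins.

```python
import queue
import threading

class CallbackQueue:
    """Runs queued callbacks on a worker thread."""
    def __init__(self):
        self._q = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            cb = self._q.get()
            try:
                cb()
            finally:
                self._q.task_done()

    def queue_callback(self, cb):
        self._q.put(cb)

    def flush(self):
        # Block until every queued callback has finished running.
        self._q.join()

class IoCtx:
    """Hypothetical context that must not be used after destroy()."""
    def __init__(self):
        self.open = True
        self.acks = 0

    def notify_ack(self):
        if not self.open:
            raise RuntimeError("ioctx used after destroy")
        self.acks += 1

    def destroy(self):
        self.open = False

def shutdown_cleanly(n=5):
    callbacks = CallbackQueue()
    ioctx = IoCtx()
    for _ in range(n):
        callbacks.queue_callback(ioctx.notify_ack)
    callbacks.flush()   # wait for in-progress callbacks first
    ioctx.destroy()     # only then tear the context down
    return ioctx.acks
```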
Sage Weil [Fri, 19 Dec 2014 16:37:00 +0000 (08:37 -0800)]
osdc/Objecter: do notify completion callback in fast-dispatch context
The notify completion has exactly one user, the librados caller which
does nothing but take a local (inner) lock and signal a Cond. Do this
in the fast-dispatch context for simplicity.
Notably, this makes the notify completion (and timeout) trigger a
notify2() return (with ETIMEDOUT) even when the finisher queue that
normally delivers notify is busy, for example with a notify that is
being very slow. In our case, the unit test does a sleep(3) to
test timeouts, but that also prevented the ETIMEDOUT notification from
being delivered to the caller. This patch resolves that.
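The difference between the two delivery paths can be shown with a toy model: a completion that only takes a local lock and signals a condition can be run directly, whereas one queued behind a stalled finisher never reaches the waiter. All names below are hypothetical, and the finisher here is simply a list that is never drained.

```python
import errno
import threading

class NotifyWaiter:
    """Caller side: waits on a condition until a result code arrives."""
    def __init__(self):
        self._cond = threading.Condition()
        self.rc = None

    def complete(self, rc):
        # Cheap completion: take the local lock and signal the waiter.
        with self._cond:
            self.rc = rc
            self._cond.notify_all()

    def wait(self, timeout=None):
        with self._cond:
            self._cond.wait_for(lambda: self.rc is not None, timeout)
            return self.rc

class BusyFinisher:
    """Stand-in for a finisher queue that is stuck on a slow callback."""
    def __init__(self):
        self.pending = []

    def queue(self, fn):
        self.pending.append(fn)  # never drained in this sketch

def deliver_timeout(waiter, finisher=None):
    rc = -errno.ETIMEDOUT
    if finisher is None:
        waiter.complete(rc)                          # direct delivery
    else:
        finisher.queue(lambda: waiter.complete(rc))  # waits behind the queue
```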
The code moved from be_select_auth_object to be_compare_scrubmaps in
74bd8708dfbfd3c8e7ba3f41d8534609dcbc1237, but the j iterator is used
differently although it has the same type. Use map.begin() as a
fallback instead.
Sage Weil [Mon, 22 Dec 2014 15:32:36 +0000 (07:32 -0800)]
osd: scrub: only assume shard digest == oi digest for replicated pools
For an EC object, the digest we get from scrub is for the *shard*, and that
is not the same as the *object* digest in the object_info_t. Skip these
checks; we already have the per-shard digest that is verified in the EC
backend.
Fixes: #10409
Signed-off-by: Sage Weil <sage@redhat.com>
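The rule in this commit reduces to a guard on the pool type. A minimal sketch with a hypothetical function name and plain integers standing in for digests:

```python
def digest_mismatch(pool_is_replicated, shard_digest, oi_digest):
    """Compare scrub and object_info digests only where they are comparable."""
    if not pool_is_replicated:
        # For EC pools the scrub digest covers a shard, not the whole
        # object, so comparing it with the object digest is meaningless.
        return False
    return shard_digest != oi_digest
```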