Greg Farnum [Thu, 23 Oct 2014 00:16:31 +0000 (17:16 -0700)]
client: cast m->get_client_tid() to compare to 16-bit Inode::flushing_cap_tid
m->get_client_tid() is 64 bits (as it should be), but Inode::flushing_cap_tid
is only 16 bits. 16 bits should be plenty to let the cap flush updates
pipeline appropriately, but we need to cast in the proper direction when
comparing these differently-sized versions. So downcast the 64-bit one
to 16 bits.
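A minimal sketch of the comparison described above. `InodeSketch` and both helper functions are hypothetical names for illustration, not the actual client code; only the 64-bit and 16-bit widths come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an inode holding a 16-bit flushing_cap_tid.
struct InodeSketch {
  uint16_t flushing_cap_tid;
};

// Truncate the 64-bit tid to 16 bits so both sides wrap at the same point.
inline bool tid_matches(uint64_t client_tid, const InodeSketch& in) {
  return static_cast<uint16_t>(client_tid) == in.flushing_cap_tid;
}

// Problematic shape: the 16-bit side is widened instead, so any client tid
// above 65535 can never compare equal.
inline bool tid_matches_widened(uint64_t client_tid, const InodeSketch& in) {
  return client_tid == in.flushing_cap_tid;
}
```

With a stored tid of 5, a 64-bit tid of 65541 (65536 + 5) matches only in the truncating version.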
Jason Dillaman [Tue, 21 Oct 2014 07:42:13 +0000 (03:42 -0400)]
rbd: Correct readahead divide by zero exception
When readahead is used on old-format RBD images, a divide
by zero signal will be thrown. This was caused by initializing
the readahead alignments prior to initializing the stripe layout
of old-format RBD images.
Fixes: 9857 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Adam Crume [Wed, 8 Oct 2014 00:45:53 +0000 (17:45 -0700)]
Fix read performance regression in ObjectCacher
The regression was introduced in commit 4fc9fffc494abedac0a9b1ce44706343f18466f1. The problem is that the cache
thinks it's full (when it's not), so it defers the read. This change
frees up cache space if necessary and only defers the read if enough
space cannot be freed.
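The "free space first, defer only if that is not enough" behaviour can be sketched as follows. `CacheSketch`, its fields, and `try_reserve` are hypothetical and do not mirror the ObjectCacher interface; the sketch assumes `evictable` never exceeds `used`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical byte-counting cache model, not the ObjectCacher API.
struct CacheSketch {
  uint64_t capacity;   // total bytes the cache may hold
  uint64_t used;       // bytes currently held
  uint64_t evictable;  // portion of `used` that can be dropped right away

  // Returns true if the read may proceed now, false if it must be deferred.
  bool try_reserve(uint64_t need) {
    if (used + need > capacity) {
      // Free only as much as is needed, limited by what can be evicted.
      uint64_t over = used + need - capacity;
      uint64_t freed = std::min(over, evictable);
      used -= freed;
      evictable -= freed;
    }
    if (used + need > capacity)
      return false;  // still no room: defer the read
    used += need;
    return true;
  }
};
```

Note that in this sketch a deferred read still leaves the evicted space free, which is one possible design choice among several.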
qa/workunits: cephtool: don't remove self's key on auth tests
Suites run with CEPH_TEST_CLI_DUP_COMMAND=1, which will send a duplicate
command for every command issued with the 'ceph' tool. Behavior is to
get a reply from the command and then send a duplicate, looking for the
same outcome (guaranteeing idempotency of the operations). However, it
so happens that if you remove the entity's own key from the keyring and
you happen to be unlucky enough so that the client's connection gets
failed (we also run tests with connection failure injections), the
'ceph' tool won't be able to reconnect to the cluster to send the
duplicate command (as its entity no longer exists in the cluster's
keyring).
We rewrite the test instead of resorting to ugly hacks to work around
this behavior, simply having a new 'role-definer' added by the existing
'role-definer' (which we weren't testing anyway, so bonus points for
that) and then have one removing the other (to test the procedure) and
finally using 'client.admin' to remove the last 'role-definer'.
Fixes: #9820 Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
mon: MDSMonitor: wait for osdmon to be writable when requesting proposal
Otherwise we may end up requesting the osdmon to propose while it is
mid-proposal. We can't simply return EAGAIN to the user either because
then we would have to expect the user to be able to successfully race
with the whole cluster in finding a window in which 'mds fs new' command
would succeed -- which is not a realistic expectation. Having the
command wait on osdmon()->wait_for_writable() guarantees that the command
will be added to a queue and that we will, eventually, tend to it.
Fixes: #9794 Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
mon: MDSMonitor: don't return -EINVAL if function is bool
Returning -EINVAL on a function that expects bool and the error code to
be in a variable 'r' can only achieve one thing: if this path is ever
touched, instead of returning an error as it was supposed to, we're
returning 'true' with 'r = 0' and, for no apparent reason, the user will
think everything went smoothly but with no new fs created.
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
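The pitfall is easy to reproduce in isolation. The two functions below are hypothetical and only assume the convention the message describes: a bool return value plus an error code carried in a separate variable `r`.

```cpp
#include <cassert>
#include <cerrno>

// Buggy shape: -EINVAL is nonzero, so it converts to `true`, and `r`
// keeps its "no error" value.
inline bool create_fs_buggy(bool valid, int& r) {
  r = 0;
  if (!valid)
    return -EINVAL;  // implicitly converted to true
  return true;
}

// Fixed shape (an assumption about the intended convention): put the
// error in `r` and return false.
inline bool create_fs_fixed(bool valid, int& r) {
  r = 0;
  if (!valid) {
    r = -EINVAL;
    return false;
  }
  return true;
}
```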
mon: MDSMonitor: check all conditions are met *before* osdmon proposal
We should not allow ourselves to request the osdmon to propose before we
know for sure that we meet the required conditions to go through with
our own state change. Even if we still can't guarantee that our
proposal is going to be committed, we shouldn't just change the osdmon's
state just because we can. This way, at least, we make sure that our
checks hold up before doing anything with side-effects.
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
We were just setting return code to -EINVAL, while allowing the logic to
continue regardless. If we are to return error, then we should abort
the operation as well and let the user know it went wrong instead of
continuing as if nothing had happened.
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
Jianpeng Ma [Fri, 17 Oct 2014 06:04:40 +0000 (14:04 +0800)]
test: fix compile warning in bufferlist.cc
test/bufferlist.cc: In member function ‘virtual void
Buffer_constructors_Test::TestBody()’:
test/bufferlist.cc:154:36: warning: ignoring return value of ‘int
system(const char*)’, declared with attribute warn_unused_result
[-Wunused-result]
::system("echo ABC > testfile");
^
test/bufferlist.cc: In member function ‘virtual void
TestRawPipe::SetUp()’:
test/bufferlist.cc:182:36: warning: ignoring return value of ‘int
system(const char*)’, declared with attribute warn_unused_result
[-Wunused-result]
::system("echo ABC > testfile");
^
test/bufferlist.cc: In member function ‘virtual void
BufferList_read_file_Test::TestBody()’:
test/bufferlist.cc:1768:53: warning: ignoring return value of ‘int
system(const char*)’, declared with attribute warn_unused_result
[-Wunused-result]
::system("echo ABC > testfile ; chmod 0 testfile");
^
test/bufferlist.cc:1770:32: warning: ignoring return value of ‘int
system(const char*)’, declared with attribute warn_unused_result
[-Wunused-result]
::system("chmod +r testfile");
^
test/bufferlist.cc: In member function ‘virtual void
BufferList_read_fd_Test::TestBody()’:
test/bufferlist.cc:1781:34: warning: ignoring return value of ‘int
system(const char*)’, declared with attribute warn_unused_result
[-Wunused-result]
::system("echo ABC > testfile");
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Jianpeng Ma [Fri, 17 Oct 2014 05:19:59 +0000 (13:19 +0800)]
librbd: fix compile warning in librbd/internal.cc.
librbd/internal.cc: In function 'void
librbd::readahead(librbd::ImageCtx*, const std::vector<std::pair<long
unsigned int, long unsigned int> >&, const md_config_t*)':
librbd/internal.cc:3150:38: warning: comparison between signed and
unsigned integer expressions [-Wsign-compare]
ictx->total_bytes_read > conf->rbd_readahead_disable_after_bytes;
^
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
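One common way to silence a -Wsign-compare warning of this kind is to handle the negative case explicitly and then compare unsigned values. The warning text does not say which operand is signed; the sketch below assumes the configured limit is the signed one, and `past_limit` is a hypothetical helper, not the change made in librbd.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: compares an unsigned byte counter with a signed limit.
inline bool past_limit(uint64_t total_bytes_read, int64_t disable_after_bytes) {
  if (disable_after_bytes < 0)
    return false;  // assumption: a negative limit is never exceeded
  return total_bytes_read > static_cast<uint64_t>(disable_after_bytes);
}
```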
Gregory Farnum [Thu, 16 Oct 2014 13:57:34 +0000 (06:57 -0700)]
Merge pull request #2628 from ceph/wip-client-flock
Wip client flock
Add support for file locking to the userspace client, and improve blocked-lock cancellation so that it doesn't remove locks that succeeded when racing.
Loic Dachary [Thu, 16 Oct 2014 00:14:53 +0000 (17:14 -0700)]
mon: add the osd crush rename-bucket command
The synopsis is:
osd crush rename-bucket name1 name2
It is made idempotent by interpreting -EALREADY, as returned by
CrushWrapper::rename_bucket, as success.
The crush_rename_bucket method first checks for errors with
CrushWrapper::can_rename_bucket if there is no pending crush so that it
can return early and avoid the creation of a pending crush map.
If renaming is possible, CrushWrapper::rename_bucket is called on the
pending crush map (and creates it indirectly if it does not already
exist).
Loic Dachary [Thu, 16 Oct 2014 00:06:12 +0000 (17:06 -0700)]
crush: add CrushWrapper::rename_item and can_rename_item
The can_rename_item is a const method checking if renaming an item could
succeed. If not it returns a unique -errno code and a human readable
message.
Trying to rename a nonexistent item into an existent item returns
-EALREADY which can be treated as success if renaming is to be
idempotent.
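A minimal sketch of the idempotent rename described above. `NameTable` is a hypothetical stand-in, not CrushWrapper; only the -EALREADY case comes from the commit message, and the other error codes (-ENOENT, -EEXIST) are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <set>
#include <string>

// Hypothetical name table, not the CrushWrapper API.
struct NameTable {
  std::set<std::string> names;

  // Const check: 0 if the rename can proceed, otherwise a distinct -errno.
  int can_rename(const std::string& src, const std::string& dst) const {
    const bool has_src = names.count(src) > 0;
    const bool has_dst = names.count(dst) > 0;
    if (!has_src && has_dst) return -EALREADY;  // looks already renamed
    if (!has_src) return -ENOENT;               // assumption
    if (has_dst) return -EEXIST;                // assumption
    return 0;
  }

  int rename(const std::string& src, const std::string& dst) {
    const int r = can_rename(src, dst);
    if (r < 0) return r;
    names.erase(src);
    names.insert(dst);
    return 0;
  }
};

// Command-level wrapper: treating -EALREADY as success makes a repeated
// command a no-op instead of an error.
inline int rename_idempotent(NameTable& t, const std::string& src,
                             const std::string& dst) {
  const int r = t.rename(src, dst);
  return r == -EALREADY ? 0 : r;
}
```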
Loic Dachary [Thu, 16 Oct 2014 00:02:58 +0000 (17:02 -0700)]
crush: improve constness of CrushWrapper methods
A number of CrushWrapper get methods or predicates were not const
because they need to transparently maintain the rmaps. Make the rmaps
mutable and update the constness of the methods to match what the caller
would expect.
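The pattern can be sketched as a lazily built reverse map that is marked `mutable`, so lookups stay logically const. `NameMapSketch` and its members are hypothetical names, not the CrushWrapper members.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical class illustrating a mutable reverse-map cache.
class NameMapSketch {
 public:
  void set_name(int id, const std::string& name) {
    names_[id] = name;
    have_rmap_ = false;  // invalidate the cached reverse map
  }

  // Logically const lookups; the cache is rebuilt on demand.
  bool name_exists(const std::string& name) const {
    build_rmap();
    return rmap_.count(name) > 0;
  }

  int get_id(const std::string& name) const {
    build_rmap();
    auto it = rmap_.find(name);
    return it == rmap_.end() ? 0 : it->second;  // 0 means "not found" here
  }

 private:
  void build_rmap() const {
    if (have_rmap_) return;
    rmap_.clear();
    for (const auto& p : names_) rmap_[p.second] = p.first;
    have_rmap_ = true;
  }

  std::map<int, std::string> names_;
  mutable std::map<std::string, int> rmap_;  // cache, may change in const methods
  mutable bool have_rmap_ = false;
};
```

Callers holding only a const reference can then use the lookups without a const_cast.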
Jason Dillaman [Tue, 14 Oct 2014 15:09:09 +0000 (11:09 -0400)]
librbdpy: Added missing method docstrings
Several methods were missing docstrings, preventing the methods
from appearing in the generated documentation. Ensured all methods
now have appropriate docstrings.
Fixes: 5977 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Yan, Zheng [Mon, 13 Oct 2014 03:34:18 +0000 (11:34 +0800)]
client: use finisher to abort MDS request
When a request is interrupted, libfuse first locks an internal mutex,
then calls the interrupt callback. libfuse needs to lock the same mutex
when unregistering the interrupt callback. We unregister the interrupt
callback while client_lock is locked, so we can't acquire the client_lock
in the interrupt callback.
This commit introduces two new types of setfilelock request. Unlike the
setfilelock (UNLOCK) request, these two new types do not drop locks that
have already been acquired; they only interrupt blocked setfilelock
requests.
Yan, Zheng [Thu, 9 Oct 2014 01:42:08 +0000 (09:42 +0800)]
client: register callback for fuse interrupt
libfuse allows a program to register a callback for interrupts. When a file
system operation is interrupted, the fuse kernel driver sends an interrupt
request to libfuse. libfuse calls the interrupt callback when it receives
an interrupt request.
Jianpeng Ma [Mon, 13 Oct 2014 05:33:38 +0000 (13:33 +0800)]
FileStore:Round offset of fiemap down aligned with CEPH_PAGE_SIZE.
There is a bug in xfs's fiemap handling. If the offset is unaligned, the
result of fiemap will leak some data.
Kernel commit eedf32bfcace7d8e20cc66757d74fc68f3439ff7 fixes this bug.
To avoid this bug on kernels that lack this commit, in ceph we
round the offset down to a multiple of CEPH_PAGE_SIZE.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
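Rounding an offset down to a page boundary can be sketched as below. `kPageSize` is a hypothetical 4096-byte stand-in for CEPH_PAGE_SIZE, and extending the length to keep the end of the range unchanged is an assumption about the intended behaviour, not a quote of the FileStore change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical page size; must be a power of two for the mask to work.
constexpr uint64_t kPageSize = 4096;

struct Range {
  uint64_t offset;
  uint64_t length;
};

// Round the offset down to a page boundary and grow the length by the same
// amount, so the range still ends at the original offset + length.
inline Range align_down(uint64_t offset, uint64_t length) {
  const uint64_t aligned = offset & ~(kPageSize - 1);
  return Range{aligned, length + (offset - aligned)};
}
```

For example, a request for 100 bytes at offset 5000 becomes 1004 bytes at offset 4096.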
Guang Yang [Mon, 13 Oct 2014 04:18:45 +0000 (04:18 +0000)]
The fix for issue 9614 was incomplete. As a result, erasure-coded PGs with one OSD down were wrongly marked as active+clean+degraded. This patch makes sure the clean flag is not set for such PGs.
Signed-off-by: Guang Yang <yguang@yahoo-inc.com>
BJ Lougee [Sat, 11 Oct 2014 07:44:17 +0000 (02:44 -0500)]
libcephfs.h libcephfs.cc : Defined error codes for the mount function
Used the new error codes from libcephfs.h to replace the magic numbers in the mount function found in libcephfs.cc.