John Spray [Thu, 7 May 2015 17:42:01 +0000 (18:42 +0100)]
client: fix error handling in check_pool_perm
Previously, on an error such as a pool not existing,
the caller doing the check would error out, but
anyone waiting on waiting_for_pool_perm would
block indefinitely (symptom was that reads on a
file with a bogus layout would block forever).
Fix by triggering the wait list on errors and
clear the CHECKING state so that the other callers
also perform the check and find the error.
Additionally, don't return the RADOS error code
up to filesystem users, because it can be
misleading. For example, nonexistent pool is
ENOENT, but we shouldn't give ENOENT on IO
to a file which does exist, we should give EIO.
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit e08cf25cafef5752877439c18cc584b0a75eca08) Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Loic Dachary [Wed, 6 May 2015 18:14:37 +0000 (20:14 +0200)]
tests: ceph-helpers kill_daemons fails when kill fails
Instead of silently leaving the daemons running, it returns failure so
the caller can decide what to do with this situation. The timeout is
also extended to minutes instead of seconds to gracefully handle the
rare situations when a machine is extra slow for some reason.
Ken Dreyer [Thu, 30 Apr 2015 21:53:22 +0000 (15:53 -0600)]
packaging: mv ceph-objectstore-tool to main ceph pkg
This change ensures that the ceph-objectstore-tool utility is present on
all OSDs. This makes it easier for users to run this tool to do manual
debugging/recovery in some scenarios.
http://tracker.ceph.com/issues/11376 Refs: #11376
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
(cherry picked from commit 61cf5da0b51e2d9578c7b4bca85184317e30f4ca)
Conflicts:
debian/control
because file layout changes from ceph-test and ceph << 0.94.1-46
Loic Dachary [Thu, 7 May 2015 17:03:16 +0000 (19:03 +0200)]
Merge pull request #4559 from dachary/wip-11429-hammer
OSD::load_pgs: we need to handle the case where an upgrade from earlier versions which ignored non-existent pgs resurrects a pg with a prehistoric osdmap
Jason Dillaman [Mon, 27 Apr 2015 05:00:38 +0000 (01:00 -0400)]
librbd: allow snapshots to be created when snapshot is active
The librbd API previously permitted the creation of snapshots while
the image context was associated to another snapshot. A recent code
cleanup broke that ability, so this re-introduces it. The code change
also allows minor cleanup with rebuild_object_map.
Jason Dillaman [Tue, 21 Apr 2015 16:59:33 +0000 (12:59 -0400)]
librbd: better handling for duplicate flatten requests
A proxied flatten request could be replayed, resulting in a
-EINVAL error code being generated on the second attempt. Filter
out that error if it is known the parent did exist before the
op started.
Jason Dillaman [Wed, 18 Mar 2015 15:51:47 +0000 (11:51 -0400)]
librbd: use generic helper for issuing async requests
resize, flatten, and rebuild object map now use the same
bootstrap code for sending the request to the remote lock owner
or executing the request locally.
Samuel Just [Tue, 21 Apr 2015 06:45:57 +0000 (23:45 -0700)]
OSD: handle the case where we resurrected an old, deleted pg
Prior to giant, we would skip pgs in load_pgs which were not present in
the current osdmap. Those pgs would eventually refer to very old
osdmaps, which we no longer have causing the assertion failure in 11429
once the osd is finally upgraded to a version which does not skip the
pgs. Instead, if we do not have the map for the pg epoch, complain to
the osd log and skip the pg.
Conflicts:
src/rgw/rgw_op.cc
discard the whitespace modification hunk that were creating
conflict and ignore the conflict due to an unrelated cast
modification in the context
Dmytro Iurchenko [Mon, 16 Feb 2015 16:47:59 +0000 (18:47 +0200)]
rgw: Swift API. Complement the response to "show container details"
OpenStack Object Storage API v1 states that X-Container-Object-Count, X-Container-Bytes-Used and user-defined metadata headers should be included in a response.
Jason Dillaman [Mon, 27 Apr 2015 07:42:24 +0000 (03:42 -0400)]
librbd: update ref count when queueing AioCompletion
If the client releases the AioCompletion while librbd is waiting
to acquire the exclusive lock, the memory associated with the
completion will be freed too early.
Jason Dillaman [Fri, 10 Apr 2015 16:37:05 +0000 (12:37 -0400)]
librbd: failure to update the object map should always return success
If an object map update fails, the object map will be flagged as
invalid. However, if a subsequent update failure occurs, the error
code will propagate back to the caller.
Jason Dillaman [Fri, 6 Mar 2015 20:40:48 +0000 (15:40 -0500)]
tests: librados_test_stub reads should deep-copy
If a client of librados_test_stub modified a bufferlist
retrieved via a read call, the client will actually be
changing the contents of the file. Therefore, read calls
should deep-copy the contents of the buffer::ptrs.
Owen Synge [Tue, 17 Mar 2015 14:41:33 +0000 (15:41 +0100)]
Fix "disk zap" sgdisk invocation
Fixes #11143
If the metadata on the disk is truly invalid, sgdisk would fail to zero
it in one go, because --mbrtogpt apparently tried to operate on the
metadata it read before executing --zap-all.
Splitting this up into two separate invocations to first zap everything
and then clear it properly fixes this issue.
Based on patch by Lars Marowsky-Bree <lmb@suse.com> in ceph-deploy.
Created by Vincent Untz <vuntz@suse.com>
Yehuda Sadeh [Fri, 27 Mar 2015 23:32:48 +0000 (16:32 -0700)]
rgw: generate new tag for object when setting object attrs
Fixes: #11256
Backport: firefly, hammer
Beforehand we were reusing the object's tag, which is problematic as
this tag is used for bucket index updates, and we might be clobbering a
racing update (like object removal).
Objects that start with underscore need to have an object locator,
this is due to an old behavior that we need to retain. Some objects
might have been created without the locator. This tool creates a new
rados object with the appropriate locator.
max_req_id was moved to RGWRados and changed to atomic64_t.
The same request id resulted in gc giving the same idtag to all objects
resulting in a leakage of rados objects. It only kept the last deleted object in
it's queue, the previous objects were never freed.
Boris Ranto [Mon, 13 Apr 2015 13:07:03 +0000 (15:07 +0200)]
Rework mds/Makefile.am to support a dencoder client build
The patch adds all the mds sources to DENCODER_SOURCES to allow a
dencoder client build. The patch also splits the Makefile.am file to
better accomodate the change.
Haomai Wang [Fri, 17 Apr 2015 14:07:00 +0000 (22:07 +0800)]
Fix clear_pipe after reaping progress
In pipe.cc:1353 we stop this connection and we will let reader and write threads stop. If now reader and writer quit ASAP and we call queue_reap to trigger the reap progress. Now we haven't call "connection_state->clear_pipe(this)" in pipe.cc:1379, so we may assert failure here.
Guang Yang [Fri, 3 Apr 2015 12:27:04 +0000 (12:27 +0000)]
rgw : Issue AIO for next chunk first before flush the (cached) data.
When handling GET request for large object (with multiple chunks), currently it will first flush the
cached data, and then issue AIO request for next chunk, this has the potential issue to make the retriving
from OSD and sending to client serialized. This patch switch the two operations.
Dencoder is built if ENABLE_CLIENT is set. However, the rgw/Makefile.am
populated DENCODER_SOURCES only if WITH_RADOSGW was set. The patch fixes
this and populates DENCODER_SOURES if ENABLE_CLIENT is set.
Loic Dachary [Sun, 8 Mar 2015 14:15:35 +0000 (15:15 +0100)]
ceph-disk: more robust parted output parser
In some cases, depending on the implementation or the operating system,
parted --machine -- /dev/sdh print
may contain empty lines. The current parsing code is fragile and highly
depends on output details. Replace it with code that basically does the
same sanity checks (output not empty, existence of units, existence of
the dev entry) but handles the entire output instead of checking line by
line.
Jianpeng Ma [Fri, 6 Mar 2015 03:26:31 +0000 (11:26 +0800)]
osdc: add epoch_t last_force_resend in Op/LingerOp.
Using this field record the pg_poo_t::last_force_op_resend to avoid op
endless when osd reply with redirect.
Fixes: #11026 Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com> Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit def4fc4ae51174ae92ac1fb606427f4f6f00743e)
Jason Dillaman [Tue, 7 Apr 2015 19:39:13 +0000 (15:39 -0400)]
librbd: moved snap_create header update notification to initiator
When handling a proxied snap_create operation, the client which
invoked the snap_create should send the header update notification
to avoid a possible race condition where snap_create completes but
the client doesn't see the new snapshot (since it didn't yet receive
the notification).
Jason Dillaman [Wed, 22 Apr 2015 15:27:35 +0000 (11:27 -0400)]
librbd: updated cache max objects calculation
The previous calculation was based upon the image's object size.
Since the cache stores smaller bufferheads, the object size is not
a good indicator of cache usage and was resulting in objects being
evicted from the cache too often. Instead, base the max number of
objects on the memory load required to store the extra metadata
for the objects.