Sage Weil [Sat, 26 Oct 2013 00:45:06 +0000 (17:45 -0700)]
mon/OSDMonitor: make racing dup pool rename behave
If we get dup pool rename requests that are racing, make sure the second
one comes back with 'success' if the rename entry already exists in the
pending_inc map.
mon: OSDMonitor: Make 'osd pool rename' idempotent
'ceph osd pool rename' takes two arguments: source pool and dest pool.
If by chance 'source pool' does not exist and 'destination pool' does,
then, to keep the command idempotent, we assume that 'source pool' no
longer exists because it was already renamed.
However, while we will return success in such a case, we want to make sure
to let the user know that we made that assumption. Mostly to warn the
user of such a thing in case of a mistake on the user's part (say, the
user didn't notice that the source pool didn't exist, while the dest did),
but also to make sure that the user is not surprised by the command
returning success if the user expected an ENOENT or EEXIST.
Fixes: #6635
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
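The idempotent behavior described above can be sketched roughly as follows. This is an illustrative model, not the OSDMonitor code: the rename_pool() helper, the use of a plain set of pool names, the message wording, and the choice of EEXIST when both pools exist are all assumptions.

```python
import errno


def rename_pool(pools, src, dst):
    """Rename src to dst in `pools` and return (retcode, message).

    Hypothetical model: `pools` is a plain set of pool names standing in
    for the monitor's pool map.
    """
    if src in pools and dst in pools:
        # Assumed behavior: refuse to clobber an existing destination.
        return -errno.EEXIST, "pool '%s' already exists" % dst
    if src in pools:
        pools.remove(src)
        pools.add(dst)
        return 0, "pool '%s' renamed to '%s'" % (src, dst)
    if dst in pools:
        # Source is gone but destination exists: assume an earlier rename
        # already succeeded, and tell the user about that assumption.
        return 0, ("pool '%s' does not exist; pool '%s' does; "
                   "assuming successful rename" % (src, dst))
    return -errno.ENOENT, "unrecognized pool '%s'" % src
```

Calling rename_pool() twice with the same arguments returns 0 both times, but the second message states that a prior rename was assumed.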
Adam Twardowski [Thu, 24 Oct 2013 16:24:11 +0000 (12:24 -0400)]
Update init-rbdmap
Add a chkconfig line for RHEL based distros to make chkconfig start rbdmap earlier on boot and stop later on shutdown. This will help prevent shutdown/reboot from hanging your system forever in the event that some daemon has a file held open on an rbd mounted filesystem.
Signed-off-by: Adam Twardowski <adam.twardowski@gmail.com>
(cherry picked from commit 80384a1a24e681fff11c8715804b7f8cc4a2189a)
Josh Durgin [Thu, 24 Oct 2013 15:42:48 +0000 (08:42 -0700)]
rgw: escape bucket and object names in StreamReadRequests
This fixes copy operations for objects that contain unsafe characters,
like a newline, which would return a 403 otherwise, since the GET to
the source rgw would be unable to verify the signature on a partially
valid bucket name.
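As a rough illustration of why escaping matters here, the sketch below percent-encodes each path component before building the source URL. It uses Python's standard urllib rather than rgw's url_encode(), and the source_path() helper and the decision to also escape "/" are assumptions made for the example.

```python
from urllib.parse import quote, unquote


def source_path(bucket, obj):
    """Build a request path with each component percent-encoded.

    Hypothetical helper; safe="" means "/" inside a name is escaped too.
    """
    return "/" + quote(bucket, safe="") + "/" + quote(obj, safe="")


# An object name containing a newline survives the round trip intact.
path = source_path("my bucket", "line1\nline2.txt")
decoded = [unquote(part) for part in path.split("/")[1:]]
```

Without the escaping, the raw newline would end up in the request line and the receiving side would see only part of the name.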
Josh Durgin [Thu, 24 Oct 2013 15:37:25 +0000 (08:37 -0700)]
rgw: move url escaping to a common place
This is useful outside of the s3 interface. Rename url_escape() to
url_encode() for consistency with the existing common url_decode()
function. This is in preparation for the next commit, which needs
to escape url-unsafe characters in another place.
Josh Durgin [Thu, 24 Oct 2013 15:34:24 +0000 (08:34 -0700)]
rgw: update metadata log list to match data log list
Send the last marker and whether the log is truncated in the same format
as data log list, so clients don't have more needless complexity
handling the difference. Keep bucket index logs the same, since they
contain the marker already, and are not used in exactly the same way
metadata and data logs are.
Josh Durgin [Thu, 24 Oct 2013 15:26:19 +0000 (08:26 -0700)]
rgw: include marker and truncated flag in data log list api
Consumers of this api need to know their position in the log. It's
readily available when fetching the log, so return it. Without the
marker in this call, a client could not easily or efficiently figure
out its position in the log, since it would require getting the global
last marker in the log, and then reading all the log entries.
This would be slow for large logs, and would be subject to races that
would cause potentially very expensive duplicate work.
Returning this atomically while fetching the log entries simplifies
all of this.
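A client loop built on a reply that carries both the marker and the truncated flag might look like the sketch below. The list_log() function is a hypothetical in-memory stand-in for the log list call, and the reply field names are assumptions, not the actual rgw wire format.

```python
def list_log(entries, marker=0, max_entries=2):
    """Hypothetical stand-in for a log list call.

    Returns a chunk of entries together with the position reached and
    whether more entries remain, all in one reply.
    """
    chunk = entries[marker:marker + max_entries]
    new_marker = marker + len(chunk)
    return {
        "entries": chunk,
        "marker": new_marker,
        "truncated": new_marker < len(entries),
    }


def consume(entries):
    """Read the whole log, resuming from the marker in each reply."""
    marker, seen = 0, []
    while True:
        reply = list_log(entries, marker)
        seen.extend(reply["entries"])
        marker = reply["marker"]
        if not reply["truncated"]:
            return seen, marker
```

Because the marker arrives with the entries, the client never has to ask for the global last marker separately and re-read the log to find its place.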
Josh Durgin [Thu, 24 Oct 2013 15:18:19 +0000 (08:18 -0700)]
cls_log: always return final marker from log_list
There's no reason to restrict returning the marker to the case where
less than the whole log is returned, since there's already a truncated
flag to tell the client what happened.
Giving the client the last marker makes it easy to consume when the
log entries do not contain their own marker. If the last marker is not
returned, the client cannot get the last marker without racing with
updates to the log.
mon: MonClient: ping monitors without authenticating
* add support on the monitor to reply to MPing messages with the contents of
'mon_status' and 'health', regardless of a client having authenticated beforehand.
* add support on the MonClient to send a MPing message to a randomly picked
monitor (it was easier this way, '-m ip:port' allows for targeted ping) and block
waiting for a reply.
* add support on librados, pybind/rados.py and the 'ceph' tool to send pings to
monitors.
Resolves: #5984
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Sage Weil [Tue, 22 Oct 2013 19:54:09 +0000 (12:54 -0700)]
ceph: catch exceptions thrown during the rados handle init
In my case, making ceph.conf unreadable triggers an exception here:
Traceback (most recent call last):
File "./ceph", line 802, in <module>
sys.exit(main())
File "./ceph", line 575, in main
conf_defaults=conf_defaults, conffile=conffile)
File "/home/sage/src/ceph/src/pybind/rados.py", line 221, in __init__
self.conf_read_file(conffile)
File "/home/sage/src/ceph/src/pybind/rados.py", line 272, in conf_read_file
raise make_ex(ret, "error calling conf_read_file")
rados.Error: error calling conf_read_file: errno EACCES
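The general shape of the fix, catching the error at handle creation and reporting it instead of letting a traceback escape, can be sketched as below. Everything here is a stand-in: the Error class substitutes for rados.Error, and open_handle() and init_cluster() are hypothetical names, not functions from the ceph tool.

```python
class Error(Exception):
    """Stand-in for rados.Error in this sketch."""


def open_handle(conffile, readable):
    # Hypothetical stand-in for constructing the rados handle; it fails
    # the same way the traceback above does when the conf is unreadable.
    if not readable:
        raise Error("error calling conf_read_file: errno EACCES")
    return object()


def init_cluster(conffile, readable=True):
    """Return (exit_code, handle, message) instead of raising."""
    try:
        return 0, open_handle(conffile, readable), ""
    except Error as e:
        # Turn the exception into a short message and a nonzero exit code.
        return 1, None, "Error initializing cluster client: {0}".format(e)
```

The caller can then print the message and exit with the returned code.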
Yehuda Sadeh [Mon, 21 Oct 2013 21:17:12 +0000 (14:17 -0700)]
rgw: turn swift COPY into PUT
Fixes: #6606
The swift COPY operation is unique in the sense that it's a write
operation that has its destination not set by the URI target, but by a
different HTTP header. This is problematic as there are some hidden
assumptions in the code that the specified bucket/object in the URI is
the operation target. E.g., certain initialization functions, quota,
etc. Instead of adding specialized code everywhere for this case,
just turn it into a regular copy operation, that is, a PUT with
a specified copy source.
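At the HTTP level the rewrite can be pictured as follows. This sketch models the request transformation only, using the Swift API's Destination and X-Copy-From headers; the swift_copy_to_put() function and its argument layout are hypothetical and do not reflect rgw's internal request handling.

```python
def swift_copy_to_put(account_prefix, src, headers):
    """Rewrite a Swift COPY of `src` as an equivalent PUT request.

    account_prefix: e.g. "/v1/AUTH_test" (hypothetical value)
    src:            "container/object" taken from the COPY request URI
    headers:        COPY request headers, including "Destination"
    Returns (method, uri, headers) for the PUT.
    """
    dest = headers["Destination"].lstrip("/")
    new_headers = {k: v for k, v in headers.items() if k != "Destination"}
    # The original URI target becomes the copy source header.
    new_headers["X-Copy-From"] = "/" + src.lstrip("/")
    # The destination header becomes the URI target of the PUT.
    return "PUT", account_prefix + "/" + dest, new_headers
```

After the rewrite the URI names the object being written, so code that assumes the URI is the operation target keeps working.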
Greg Farnum [Fri, 18 Oct 2013 23:34:11 +0000 (16:34 -0700)]
ReplicatedPG: copy: conditionally requeue copy ops when cancelled
We may need to requeue copy ops which are cancelled as part of an acting
set change but don't change the primary. To support this, add a
"requeue" flag to cancel_copy_ops() and copy_ops(), as well as to
CopyResults. The CopyCallback is then responsible for requeuing (the
higher layers can't do so as they can't know which request actually
triggered the copy).
Sage Weil [Thu, 17 Oct 2013 19:06:26 +0000 (12:06 -0700)]
Makefile: fix /sbin vs /usr/sbin behavior
Instead of telling configure to put things in /sbin, explicitly put the
two important items (mkcephfs and mount.fuse.ceph) in /sbin via an
automake rule. This unbreaks FreeBSD 9.1 and probably others.
Based on patches originally from Alan Somers <asomers@gmail.com>, modified
for the current Makefile structure and applied to the specfile too.
Fixes: #6456
Signed-off-by: Sage Weil <sage@inktank.com>
Tested-by: Alan Somers <asomers@gmail.com>
Gregory Farnum [Wed, 16 Oct 2013 18:13:35 +0000 (11:13 -0700)]
Merge pull request #709 from ceph/wip-filerecover
This patch prevents us from inadvertently reducing sparse file sizes during recovery.
We also reduce some code duplication by using eval() directly in do_file_recover()
instead of reproducing the parts we care about.
Sage Weil [Fri, 27 Sep 2013 22:29:13 +0000 (15:29 -0700)]
common/buffer: add crc caching performance test
On my old box:
- matching cached values is a big win (free), obviously
- the adjustment is the same speed as redoing the calculation. this
is probably because the data is already in L1/L2 cache; we still
save memory bandwidth.
Sage Weil [Wed, 16 Oct 2013 00:55:32 +0000 (17:55 -0700)]
cls_rbd: do not make noise in osd log on rbd removal
ubuntu@burnupi06:~$ tail -f /var/log/ceph/ceph-osd.1.log
2013-02-07 17:00:30.565749 7fdb09e6b700 0 <cls> cls/rbd/cls_rbd.cc:1615: error reading id for name 'sds': -2
2013-02-07 17:00:30.566301 7fdb0a66c700 0 <cls> cls/rbd/cls_rbd.cc:1521: error reading name to id mapping: -2
2013-02-07 17:03:54.085700 7fdb0a66c700 0 <cls> cls/rbd/cls_rbd.cc:1615: error reading id for name 'sdfsd': -2
2013-02-07 17:03:54.086143 7fdb09e6b700 0 <cls> cls/rbd/cls_rbd.cc:1521: error reading name to id mapping: -2
Fixes: #4047
Signed-off-by: Sage Weil <sage@inktank.com>