Ken Dreyer [Mon, 12 Jan 2015 19:23:56 +0000 (12:23 -0700)]
packaging: rm unnecessary RPM macro definitions
Remove the extraneous macro definitions, %{name}, %{version},
%{unmangled_version}, %{release}. RPM automatically sets these when we
use the Name, Version, and Release tags.
Discussed %{unmangled_version} in #ceph-devel today; not sure where that
one came from.
Josh Durgin [Tue, 23 Dec 2014 06:16:36 +0000 (01:16 -0500)]
client: work around socket leak
Recreate the boto connection object after every 512 requests in an
attempt to work around a socket leak when the remote end closes the
connection. Boto does not seem to cause httplib to close() the
sockets appropriately in some cases. I'm not sure exactly where
the leak is occurring, but forcing boto to reinitialize its
connection pool like this avoids it.
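A minimal sketch of the recycling idea, not the agent's actual code: the class name, the factory argument, and the counter are all hypothetical, and the factory stands in for whatever builds the boto connection.

```python
class RecyclingConnection(object):
    """Hand out a connection and rebuild it after a fixed number of requests."""

    def __init__(self, factory, max_requests=512):
        self._factory = factory            # hypothetical callable returning a new connection
        self._max_requests = max_requests  # 512 matches the limit described above
        self._count = 0
        self._conn = factory()

    def get(self):
        # Once the request budget is used up, build a fresh connection so the
        # old connection object (and its pool) is no longer referenced.
        if self._count >= self._max_requests:
            self._conn = self._factory()
            self._count = 0
        self._count += 1
        return self._conn
```

Callers would ask for the connection through get() before each request instead of holding on to one object for the life of the process.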
Josh Durgin [Thu, 18 Dec 2014 05:49:35 +0000 (21:49 -0800)]
worker: check op state for progress on any HTTP error
A 500 can be caused by an fcgi timeout even when the operation still
ends up succeeding. Other error codes may likewise not indicate actual
failure of the copy, so check whether the op state is in progress on
any HTTP error. This doesn't hurt, since wait_for_object fails
immediately if the op state is not in progress.
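Roughly, the control flow looks like the sketch below. HttpError and sync_object are hypothetical names, and copy and wait_for_object are passed in as plain callables so the example stands alone; only the name wait_for_object comes from the message above.

```python
class HttpError(Exception):
    """Hypothetical stand-in for a client error carrying an HTTP status."""

    def __init__(self, status):
        Exception.__init__(self, "HTTP %d" % status)
        self.status = status


def sync_object(copy, wait_for_object):
    """Run copy(); on any HTTP error, defer to wait_for_object()."""
    try:
        copy()
    except HttpError:
        # No filtering on the status code: wait_for_object() is assumed to
        # raise right away when no operation is in progress, so a genuine
        # failure still surfaces to the caller.
        wait_for_object()
    return True
```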
Josh Durgin [Thu, 18 Dec 2014 05:28:38 +0000 (21:28 -0800)]
sync: use ' ' as the default shard so retries can be stored
The replica log API doesn't allow empty markers, but we may need to
store retries in the replica log even if there are no log entries. Use
' ' as a marker instead, since it sorts before all possible markers.
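A small illustration of the ordering argument, assuming real markers are non-empty strings of printable, non-whitespace ASCII characters (an assumption, not something the replica log API is known to guarantee). DEFAULT_MARKER and marker_or_default are hypothetical names.

```python
import string

DEFAULT_MARKER = ' '  # the single-space marker described above


def marker_or_default(marker):
    """Return marker, or the single-space default when marker is empty."""
    return marker if marker else DEFAULT_MARKER


# Space (0x20) compares lower than every printable, non-whitespace ASCII
# character, so it sorts before any marker built from such characters.
visible = string.ascii_letters + string.digits + string.punctuation
assert all(DEFAULT_MARKER < ch for ch in visible)
```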
Josh Durgin [Thu, 18 Dec 2014 05:20:21 +0000 (21:20 -0800)]
full data sync: treat 404 from bucket list as success
Since results are paged, the exception may not occur immediately, so
add a new type to distinguish it from other client exceptions, and
catch it at any point during full sync of a bucket.
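The sketch below shows why the exception has to be caught around the whole loop rather than only around the first call. All names are hypothetical, and a page value of None simulates the server answering 404 on a later page.

```python
class ClientException(Exception):
    """Hypothetical base class for client errors."""


class BucketNotFound(ClientException):
    """Hypothetical dedicated type for a 404 from the bucket listing."""


def list_bucket(pages):
    """Yield keys page by page; a None page simulates a 404 response."""
    for page in pages:
        if page is None:
            raise BucketNotFound()
        for key in page:
            yield key


def full_sync_bucket(listing, sync_one):
    """Sync every listed key; a vanished bucket counts as success."""
    try:
        # The listing is lazy, so the 404 can surface on any iteration.
        for key in listing:
            sync_one(key)
    except BucketNotFound:
        pass
    return True
```

Other ClientException subclasses still propagate, which is the point of giving the 404 its own type.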
Josh Durgin [Tue, 25 Nov 2014 00:26:27 +0000 (16:26 -0800)]
Immediately fail when op state is not found
There's a very narrow window in which our connection sending the copy
request could fail before radosgw records the op state, and we query
the op state in that time. Failing immediately and retrying later is
fine at that point, and avoids a long timeout if there is a bug causing
us to reach this point without the op state ever being recorded. This
race hasn't been observed in practice.
Josh Durgin [Mon, 10 Nov 2014 23:39:22 +0000 (15:39 -0800)]
worker: process full sync for buckets with no log
This way we catch objects that were uploaded before zones or data logs
were set up. If there are actually no objects in the bucket, the
bucket listing will tell us that.
Josh Durgin [Mon, 10 Nov 2014 22:14:11 +0000 (14:14 -0800)]
client: don't hide boto return values
The decorator for translating boto exceptions needs to return
explicitly, or it will lose the return value of the original function,
and cause full sync to fail for objects that aren't in the bucket
index log.
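A minimal sketch of the pattern, not the agent's actual decorator: translate_errors and ClientException are hypothetical names, and ValueError stands in for the boto exception being translated.

```python
import functools


class ClientException(Exception):
    """Hypothetical client-side exception type."""


def translate_errors(func):
    """Re-raise ValueError (stand-in for a boto error) as ClientException."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            # The explicit 'return' is the fix: without it the wrapper
            # falls off the end and every caller receives None.
            return func(*args, **kwargs)
        except ValueError as e:
            raise ClientException(str(e))
    return wrapper


@translate_errors
def get_value():
    return 42
```

With the return in place, get_value() yields the wrapped function's result rather than None.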
Josh Durgin [Sat, 22 Mar 2014 12:10:28 +0000 (05:10 -0700)]
packaging: fix up spec file for non-python files
Correct the installation of the binary, logrotate conf, and init script.
Stop using INSTALLED_FILES. Instead explicitly list files and glob the
python package, as recommended by Fedora's python packaging guide.
Josh Durgin [Wed, 19 Mar 2014 11:31:29 +0000 (04:31 -0700)]
cli: add a default log file
Now that the packages include /var/log/ceph/radosgw-agent, default to
storing logs there, named after the configuration file, or
radosgw-agent.log if no config file was used.
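One way the default could be derived is sketched below. The function name is hypothetical, and so is the exact naming rule for the config-file case (base name of the config file plus ".log"); only the directory and the radosgw-agent.log fallback come from the message above.

```python
import os

DEFAULT_LOG_DIR = '/var/log/ceph/radosgw-agent'


def default_log_file(conf_path=None):
    """Pick a log file name from the config file name, if one was given."""
    if conf_path:
        # Assumption: use the config file's base name and append '.log'.
        name = os.path.basename(conf_path)
    else:
        name = 'radosgw-agent'
    return os.path.join(DEFAULT_LOG_DIR, name + '.log')
```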