git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
11 years agoqa/run_xfstests.sh: use old xfstests until we adapt to new org 646/head
Sage Weil [Thu, 26 Sep 2013 22:02:18 +0000 (15:02 -0700)]
qa/run_xfstests.sh: use old xfstests until we adapt to new org

Tests were rearranged upstream; use an old version for the time being
until we can refactor run_xfstests.sh to cope.  See #6385

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #645 from liewegas/wip-6346
Gregory Farnum [Thu, 26 Sep 2013 20:12:37 +0000 (13:12 -0700)]
Merge pull request #645 from liewegas/wip-6346

Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agoosd/ReplicatedPG: fix bl resize on write vs truncate race 645/head
Sage Weil [Thu, 26 Sep 2013 17:38:23 +0000 (10:38 -0700)]
osd/ReplicatedPG: fix bl resize on write vs truncate race

If we resize the write due to the funky truncate behavior, we need to
resize the bufferlist to match.

Fixes: #6346
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomon: OSDMonitor: do not write full_latest during trim
Joao Eduardo Luis [Wed, 25 Sep 2013 21:08:24 +0000 (22:08 +0100)]
mon: OSDMonitor: do not write full_latest during trim

In commit 81983bab we patched OSDMonitor::update_from_paxos() so that we
write the latest full map version to 'full_latest' each time the latest
full map is built from the incremental versions.

This change, however, clashed with OSDMonitor::encode_trim_extra(), which
also wrote to 'full_latest' on each trim, writing instead the version of
the *oldest* full map.  This duality of behaviors could lead the store
to an inconsistent state across the monitors (although there's no sign of
it actually causing any issues besides rebuilding already existing full
maps on some monitors).

We now stop OSDMonitor::encode_trim_extra() from writing to 'full_latest'.
This function will still write out the oldest full map it has in the store,
but it will no longer write to full_latest, instead leaving it up to
OSDMonitor::update_from_paxos() to figure it out -- and it already does.

Fixes: #6378
Backport: dumpling

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agocommon/crc32c_intel_fast: fix compile-time #ifdef
Sage Weil [Wed, 4 Sep 2013 20:14:14 +0000 (13:14 -0700)]
common/crc32c_intel_fast: fix compile-time #ifdef

This wasn't getting built in!

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 3233336cc3b6c2c1e89fe6c6d21d42e0f2cce142)

11 years agoqa/workunits/mon/crush_ops.sh: fix test
Sage Weil [Wed, 25 Sep 2013 17:10:21 +0000 (10:10 -0700)]
qa/workunits/mon/crush_ops.sh: fix test

Fix root.

Fixes: #6392
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomon/OSDMonitor: fix 'ceph osd crush reweight ...'
Sage Weil [Tue, 24 Sep 2013 22:26:03 +0000 (15:26 -0700)]
mon/OSDMonitor: fix 'ceph osd crush reweight ...'

The adjust method returns a count of adjusted items.

Add a test.

Fixes: #6382
Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
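A minimal sketch of the return-value convention described in the 'ceph osd crush reweight' fix above. The names adjust_item_weight and reweight_command are hypothetical and are not Ceph's actual API; the sketch assumes that a negative return is an errno-style error and that any non-negative return is the number of adjusted items.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <string>

// Hypothetical stand-in for a CRUSH-style adjust method: returns the number
// of items whose weight was changed, or a negative errno-style value.
int adjust_item_weight(std::map<std::string, int>& weights,
                       const std::string& name, int new_weight) {
  assert(!name.empty());  // callers are expected to validate the name first
  if (new_weight < 0)
    return -EINVAL;
  auto it = weights.find(name);
  if (it == weights.end())
    return 0;  // nothing was adjusted
  it->second = new_weight;
  return 1;    // one item was adjusted
}

// Hypothetical caller. Only negative values are errors; a positive count
// means success. Treating every non-zero return as a failure would reject
// valid reweight requests.
int reweight_command(std::map<std::string, int>& weights,
                     const std::string& name, int new_weight) {
  int r = adjust_item_weight(weights, name, new_weight);
  if (r < 0)
    return r;        // real error from the adjust method
  if (r == 0)
    return -ENOENT;  // assumption: "no item adjusted" is reported as ENOENT
  return 0;          // success
}
```

The point of the sketch is the caller's comparison: success is tested with r < 0 and r == 0 handled separately, rather than assuming that 0 is the only successful return.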
11 years agoosd: change warn_interval_multiplier to uint32_t
Loic Dachary [Tue, 24 Sep 2013 17:04:23 +0000 (19:04 +0200)]
osd: change warn_interval_multiplier to uint32_t

to prevent overflow in OpTracker::check_ops_in_flight when
multiplying warn_interval_multiplier *= 2

Backport: cuttlefish, dumpling

http://tracker.ceph.com/issues/6370 fixes #6370

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 1bce1f009bffd3e28025a08775fec189907a81db)
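The overflow mentioned in the warn_interval_multiplier commit above can be illustrated with a small sketch. The commit does not state the previous type of the field, so the narrow type used for comparison here (uint8_t) is an assumption chosen only to make the wraparound visible; double_repeatedly is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Double a multiplier `times` times, as a warning back-off might do.
// For unsigned T the result of each step is reduced modulo 2^bits, so a
// narrow type wraps around to 0 after only a few doublings.
template <typename T>
T double_repeatedly(T start, int times) {
  assert(times >= 0);
  T m = start;
  for (int i = 0; i < times; ++i)
    m *= 2;
  return m;
}
```

With an 8-bit counter, eight doublings starting from 1 wrap to 0, while a uint32_t still holds 256 at that point.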

11 years agoarch/intel: fix old comment
Sage Weil [Tue, 24 Sep 2013 17:18:44 +0000 (10:18 -0700)]
arch/intel: fix old comment

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoarch/intel: use intel probe instructions for x86_64 only
Sage Weil [Tue, 24 Sep 2013 17:17:37 +0000 (10:17 -0700)]
arch/intel: use intel probe instructions for x86_64 only

Not LP64, which includes ppc64 and clearly does not build.

Fixes: #6283
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
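A minimal sketch of the distinction drawn in the arch/intel commit above, assuming the GCC/Clang predefined macros __x86_64__ (architecture) and __LP64__ (data model). The function name intel_probe_path and the returned labels are hypothetical.

```cpp
#include <cassert>
#include <string>

// Pick a probe strategy at compile time. Keying on the architecture macro
// (__x86_64__) rather than the data-model macro (__LP64__) keeps Intel-only
// code out of other 64-bit builds such as ppc64, where __LP64__ is also set.
std::string intel_probe_path() {
#if defined(__x86_64__)
  return "cpuid";    // x86_64-only probe code can be compiled here
#else
  return "generic";  // any other architecture, 64-bit or not
#endif
}
```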
11 years agoosd: revert 'osd max xattr size' limit
Sage Weil [Mon, 23 Sep 2013 23:23:33 +0000 (16:23 -0700)]
osd: revert 'osd max xattr size' limit

Set it to 0 (unlimited) for now.

Backport: dumpling

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoMerge pull request #599 from ceph/wip-6323
Sage Weil [Mon, 23 Sep 2013 21:12:33 +0000 (14:12 -0700)]
Merge pull request #599 from ceph/wip-6323

mon: OSDMonitor: fix #6322 and #6323

Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agoMerge pull request #612 from ceph/wip-6361
João Eduardo Luís [Sat, 21 Sep 2013 11:40:55 +0000 (04:40 -0700)]
Merge pull request #612 from ceph/wip-6361

perfglue/heapprofiler: expect cmd name when handling command instead of 'heap <cmd>'

This was broken by the cli rework.

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agomon: fix wrong arg to "instructed to" status message
Dan Mick [Thu, 19 Sep 2013 23:04:16 +0000 (16:04 -0700)]
mon: fix wrong arg to "instructed to" status message

Fixes: #6293
Signed-off-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 16ebb25f7cdb8e92c618a333c505c16edb16c95c)

11 years agorgw: destroy get_obj handle in copy_obj()
Yehuda Sadeh [Mon, 16 Sep 2013 21:35:25 +0000 (14:35 -0700)]
rgw: destroy get_obj handle in copy_obj()

Fixes: #6176
Backport: dumpling
We take different code paths in copy_obj, so make sure we close the handle
when we exit the function. Move the call to finish_get_obj() out of
copy_obj_data(), as we don't create the handle there; that should make the
code less confusing and less prone to errors.
Also, note that RGWRados::get_obj() also calls finish_get_obj(). For
everything to work in concert we need to pass a pointer to the handle
and not the handle itself. Therefore we also needed to change the call
to copy_obj_data().

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoqa: workunits: cephtool: check if 'heap' commands are parseable 612/head
Joao Eduardo Luis [Fri, 20 Sep 2013 16:06:30 +0000 (17:06 +0100)]
qa: workunits: cephtool: check if 'heap' commands are parseable

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
11 years agoosd: OSD: add 'heap' command to known osd commands array
Joao Eduardo Luis [Fri, 20 Sep 2013 16:50:27 +0000 (17:50 +0100)]
osd: OSD: add 'heap' command to known osd commands array

Must have been forgotten during the cli rework.

Backport: dumpling

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
11 years agomds: MDS: pass only heap profiler commands instead of the whole cmd vector
Joao Eduardo Luis [Fri, 20 Sep 2013 15:43:27 +0000 (16:43 +0100)]
mds: MDS: pass only heap profiler commands instead of the whole cmd vector

The heap profiler doesn't care, nor should it, what our command name is.
It only cares about the commands it handles.

Backport: dumpling

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
11 years agoperfglue/heap_profiler.cc: expect args as first element on cmd vector
Joao Eduardo Luis [Fri, 20 Sep 2013 15:41:14 +0000 (16:41 +0100)]
perfglue/heap_profiler.cc: expect args as first element on cmd vector

We used to pass 'heap' as the first element of the cmd vector when
handling commands.  We haven't been doing so for a while now, so we
needed to fix this.

Not expecting 'heap' also makes sense, considering that what we need to
know when we reach this function is what command we should handle, and
we should not care what the caller calls us when handling his business.

Fixes: #6361
Backport: dumpling

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
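The calling convention described in the heap profiler commits above can be sketched as follows. Both function names and the reply strings are hypothetical, and the sub-command names are used only for illustration; the point is that the profiler handler looks at cmd[0] as its own sub-command and never expects the literal word 'heap'.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical profiler handler: cmd[0] is the heap sub-command itself,
// not the name the caller was invoked with.
std::string handle_heap_command(const std::vector<std::string>& cmd) {
  if (cmd.empty())
    return "error: missing sub-command";
  const std::string& sub = cmd[0];
  if (sub == "dump")
    return "dumping heap profile";
  if (sub == "start_profiler")
    return "profiler started";
  if (sub == "stop_profiler")
    return "profiler stopped";
  return "error: unknown sub-command '" + sub + "'";
}

// Hypothetical daemon-side caller: strips its own command name and passes
// only the profiler arguments on.
std::string handle_daemon_command(const std::vector<std::string>& args) {
  assert(!args.empty());
  if (args[0] != "heap")
    return "error: not a heap command";
  std::vector<std::string> rest(args.begin() + 1, args.end());
  return handle_heap_command(rest);
}
```

Passing the whole vector, including 'heap', straight to the handler would make it report 'heap' as an unknown sub-command, which is the kind of mismatch these commits remove.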
11 years agolru_map: don't use list::size()
Yehuda Sadeh [Thu, 12 Sep 2013 21:32:17 +0000 (14:32 -0700)]
lru_map: don't use list::size()

replace list::size() with map::size(), which should have
a constant time complexity.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
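A minimal sketch of the idea in the lru_map commit above: keep a list for recency order and a map for lookup, and ask the map for the element count. This LruMap class is hypothetical and is not Ceph's lru_map. The motivation assumed here is that std::list::size() was allowed to be linear-time in pre-C++11 standard libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>

class LruMap {
  struct Entry {
    int value;
    std::list<std::string>::iterator lru_pos;  // position in lru_
  };
  std::map<std::string, Entry> entries_;
  std::list<std::string> lru_;  // front = most recently used
  std::size_t max_;

 public:
  explicit LruMap(std::size_t max) : max_(max) { assert(max > 0); }

  void add(const std::string& key, int value) {
    auto it = entries_.find(key);
    if (it != entries_.end()) {
      lru_.erase(it->second.lru_pos);
      entries_.erase(it);
    }
    lru_.push_front(key);
    entries_[key] = Entry{value, lru_.begin()};
    // The size check uses the map, not the list.
    while (entries_.size() > max_) {
      entries_.erase(lru_.back());
      lru_.pop_back();
    }
  }

  bool find(const std::string& key, int& value) {
    auto it = entries_.find(key);
    if (it == entries_.end())
      return false;
    // Move the key to the front; the stored iterator stays valid.
    lru_.splice(lru_.begin(), lru_, it->second.lru_pos);
    value = it->second.value;
    return true;
  }

  std::size_t size() const { return entries_.size(); }
};
```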
11 years agocommon/lru_map: rename tokens to entries
Yehuda Sadeh [Thu, 12 Sep 2013 21:30:19 +0000 (14:30 -0700)]
common/lru_map: rename tokens to entries

This code was originally used in a token cache; now that it serves
as generic infrastructure, rename the token fields.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agobufferlist: don't use list::size()
Yehuda Sadeh [Thu, 12 Sep 2013 19:26:41 +0000 (12:26 -0700)]
bufferlist: don't use list::size()

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: use bufferlist::append() instead of bufferlist::push_back()
Yehuda Sadeh [Wed, 18 Sep 2013 17:37:21 +0000 (10:37 -0700)]
rgw: use bufferlist::append() instead of bufferlist::push_back()

push_back() expects a char *, whereas append() can append a single char.
Passing a NULL char to push_back() casts it to a NULL pointer, which is
bad.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
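The pitfall described in the commit above can be reproduced with a toy type. ToyBuffer is hypothetical and is not Ceph's bufferlist; it only mirrors the shape of the problem, where a pointer-taking method silently accepts a literal zero as a null pointer.

```cpp
#include <cassert>
#include <cstring>
#include <string>

struct ToyBuffer {
  std::string data;

  // Pointer-taking method: a literal 0 converts to a null pointer here.
  // This sketch detects that case; real code might dereference it instead.
  bool push_back(const char* s) {
    if (s == nullptr)
      return false;
    data.append(s, std::strlen(s));
    return true;
  }

  // Single-character append: the right call for adding a terminating NUL.
  void append(char c) { data.push_back(c); }
};
```

Calling push_back(0) compiles without complaint and hands the method a null pointer, whereas append('\0') adds exactly one zero byte.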
11 years agov0.69 v0.69
Gary Lowell [Wed, 18 Sep 2013 01:40:51 +0000 (01:40 +0000)]
v0.69

11 years agomon: OSDMonitor: multiple rebuilt full maps per transaction 599/head
Joao Eduardo Luis [Tue, 17 Sep 2013 16:58:20 +0000 (17:58 +0100)]
mon: OSDMonitor: multiple rebuilt full maps per transaction

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: OSDMonitor: update latest_full while rebuilding full maps
Joao Eduardo Luis [Sun, 15 Sep 2013 20:03:50 +0000 (21:03 +0100)]
mon: OSDMonitor: update latest_full while rebuilding full maps

Not doing so will make the monitor rebuild the osdmap full versions, even
though they may have been rebuilt before, every time the monitor starts.

This mostly happens when the cluster is left in an unhealthy state for
a long period of time and incremental versions build up.  Even though we
build the full maps on update_from_paxos(), not updating 'full_latest'
leads to the situation initially described.

Fixes: #6322
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: OSDMonitor: smaller transactions when rebuilding full versions
Joao Eduardo Luis [Sun, 15 Sep 2013 20:00:55 +0000 (21:00 +0100)]
mon: OSDMonitor: smaller transactions when rebuilding full versions

Otherwise, for considerably sized rebuilds, the monitor will not only
consume vast amounts of memory, but it will also have trouble committing
the transaction.  Anyway, it's also a good idea to adjust transactions to
the granularity we want, and to be fair we care that each rebuilt full map
gets to disk, even if subsequent full maps don't (those can be rebuilt
later).

Fixes: #6323
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agorgw: try to create log pool if doesn't exist
Yehuda Sadeh [Fri, 13 Sep 2013 21:43:54 +0000 (14:43 -0700)]
rgw: try to create log pool if doesn't exist

When using the replica log, if the log pool doesn't exist all operations
are going to fail. Try to create it if it doesn't exist.

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: NULL terminate buffer before parsing it
Yehuda Sadeh [Wed, 11 Sep 2013 20:46:31 +0000 (13:46 -0700)]
rgw: NULL terminate buffer before parsing it

Fixes: #6175
Backport: dumpling
We get a buffer off the remote gateway which might
not be NULL terminated. The JSON parser needs the
buffer to be NULL terminated even though we provide
a buffer length as it calls strlen().

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
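A minimal sketch of the fix described in the commit above, assuming a parser that ignores the supplied length and calls strlen() internally. Both parse_len (a stand-in for such a parser) and nul_terminated_copy are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for a parser that relies on strlen() and therefore needs a
// NUL-terminated buffer.
std::size_t parse_len(const char* buf) { return std::strlen(buf); }

// Copy a length-delimited buffer and append a terminating NUL so that it
// can be handed safely to strlen()-based code.
std::vector<char> nul_terminated_copy(const char* data, std::size_t len) {
  assert(data != nullptr || len == 0);
  std::vector<char> out(data, data + len);
  out.push_back('\0');
  return out;
}
```

Without the added terminator, strlen() would read past the end of the received data until it happened to find a zero byte.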
11 years agorgw: don't call list::size() in ObjectCache
Yehuda Sadeh [Thu, 12 Sep 2013 05:30:12 +0000 (22:30 -0700)]
rgw: don't call list::size() in ObjectCache

Fixes: #6286
Use an external counter instead of calling list::size()

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: drain pending requests before completing write
Yehuda Sadeh [Tue, 10 Sep 2013 19:18:55 +0000 (12:18 -0700)]
rgw: drain pending requests before completing write

Fixes: #6268
When doing an aio write of objects (either regular objects or multipart
parts) we need to drain pending aio requests before completing the write.
Otherwise, if the gateway goes down, the object might end up corrupted.

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: when failing read from client, return correct error
Yehuda Sadeh [Tue, 3 Sep 2013 20:27:21 +0000 (13:27 -0700)]
rgw: when failing read from client, return correct error

Fixes: #6214
When a read from the client failed while putting an object, we
returned the wrong value (always 0), which in the chunked-upload
case led us to assume that the write had completed
successfully.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
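The error-propagation rule from the commit above can be sketched as a read loop. ReadFn and put_object are hypothetical names; the sketch assumes a reader that returns the number of bytes read, 0 at the end of the input, or a negative errno-style value on failure.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>
#include <string>

// Reader contract (assumed): >0 bytes read, 0 at end of input, <0 on error.
using ReadFn = std::function<int(std::string&)>;

int put_object(const ReadFn& read_chunk, std::string& stored) {
  assert(static_cast<bool>(read_chunk));
  for (;;) {
    std::string chunk;
    int r = read_chunk(chunk);
    if (r < 0)
      return r;  // propagate the failure; returning 0 would look like success
    if (r == 0)
      return 0;  // clean end of the upload
    stored += chunk;
  }
}
```

In a chunked upload there is no declared content length to check against, so the return value of the read is the only signal that the upload was cut short.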
11 years agoMerge pull request #583 from ceph/wip-6230-workunit
Sage Weil [Mon, 9 Sep 2013 23:31:28 +0000 (16:31 -0700)]
Merge pull request #583 from ceph/wip-6230-workunit

qa: workunits: mon: crush_ops: test 'ceph osd crush move'

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoqa: workunits: mon: crush_ops: test 'ceph osd crush move' 583/head
Joao Eduardo Luis [Mon, 9 Sep 2013 23:20:41 +0000 (00:20 +0100)]
qa: workunits: mon: crush_ops: test 'ceph osd crush move'

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoMerge pull request #582 from ceph/wip-6230
Sage Weil [Mon, 9 Sep 2013 23:01:06 +0000 (16:01 -0700)]
Merge pull request #582 from ceph/wip-6230

mon: MonCommands: expect a CephString as 1st arg for 'osd crush move'

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomon: MonCommands: expect a CephString as 1st arg for 'osd crush move' 582/head
Joao Eduardo Luis [Mon, 9 Sep 2013 22:14:11 +0000 (23:14 +0100)]
mon: MonCommands: expect a CephString as 1st arg for 'osd crush move'

Fixes: #6230
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoMerge pull request #576 from ceph/wip-6078-2
Sage Weil [Sat, 7 Sep 2013 21:09:23 +0000 (14:09 -0700)]
Merge pull request #576 from ceph/wip-6078-2

rgw: fix get cors, delete cors

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #571 from dalgaaf/fix-da-init-radosgw
Sage Weil [Sat, 7 Sep 2013 15:42:59 +0000 (08:42 -0700)]
Merge pull request #571 from dalgaaf/fix-da-init-radosgw

init-radosgw*: fix status return value if radosgw isn't running

Backport: dumpling
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoinit-radosgw*: fix status return value if radosgw isn't running 571/head
Danny Al-Gaaf [Sat, 7 Sep 2013 09:30:15 +0000 (11:30 +0200)]
init-radosgw*: fix status return value if radosgw isn't running

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
11 years agorgw: fix get cors, delete cors 576/head
Yehuda Sadeh [Sat, 7 Sep 2013 05:33:38 +0000 (22:33 -0700)]
rgw: fix get cors, delete cors

Remove a couple of variables that overrode class members. It's not
really clear how it was working before; might have been a bad
merge / rebase.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoosd/ReplicatedPG: set reply versions for pg ops (PGLS)
Sage Weil [Wed, 4 Sep 2013 05:41:17 +0000 (22:41 -0700)]
osd/ReplicatedPG: set reply versions for pg ops (PGLS)

Returning the current version for the pgid and last_user_version makes
some sense here.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit b05f7ea5199fc190a3be887fac4d74417461e1ce)

11 years agoosd/ReplicatedPG: set reply versions on dup op ACK
Sage Weil [Wed, 4 Sep 2013 05:40:42 +0000 (22:40 -0700)]
osd/ReplicatedPG: set reply versions on dup op ACK

All other MOSDOpReply creators do this, with the exception of the pg
op.

Fixes: #6222
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit 5148aac73d50593217455619bef95b8e1b296e10)

11 years agorgw: flush pending data when completing multipart part upload
Yehuda Sadeh [Fri, 23 Aug 2013 22:39:20 +0000 (15:39 -0700)]
rgw: flush pending data when completing multipart part upload

Fixes: #6111
Backport: dumpling
When completing the part upload we need to flush any data that we
aggregated and didn't flush yet. Earlier code didn't have to deal with
this, as for multipart upload we didn't have any pending data.
What we do now is call the regular atomic data completion
function, which takes care of it.

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: check object name after rebuilding it in S3 POST
Yehuda Sadeh [Tue, 27 Aug 2013 02:46:43 +0000 (19:46 -0700)]
rgw: check object name after rebuilding it in S3 POST

Fixes: #6088
Backport: bobtail, cuttlefish, dumpling

When posting an object it is possible to provide a key
name that refers to the original filename, however we
need to verify that in the end we don't end up with an
empty object name.

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
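A minimal sketch of the check described in the S3 POST commit above. It assumes that the key refers to the original filename through a "${filename}" placeholder, as in S3 browser-based uploads; rebuild_object_name is a hypothetical helper, and the choice of -EINVAL as the error is also an assumption.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>

// Replace every "${filename}" in the key with the uploaded file's name,
// then verify that the resulting object name is not empty.
int rebuild_object_name(std::string key, const std::string& filename,
                        std::string& out) {
  static const std::string var = "${filename}";
  std::size_t pos = 0;
  while ((pos = key.find(var, pos)) != std::string::npos) {
    key.replace(pos, var.size(), filename);
    pos += filename.size();  // skip past the inserted text
  }
  if (key.empty())
    return -EINVAL;  // the check must happen after the substitution
  out = key;
  return 0;
}
```

Validating the key only before the substitution is not enough: a key consisting solely of the placeholder is non-empty on input but becomes empty when the filename is empty.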
11 years agoMerge branch 'wip-6078' into next
Yehuda Sadeh [Wed, 4 Sep 2013 23:18:38 +0000 (16:18 -0700)]
Merge branch 'wip-6078' into next

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agorgw: fix certain return status cases in CORS
Yehuda Sadeh [Thu, 29 Aug 2013 04:25:20 +0000 (21:25 -0700)]
rgw: fix certain return status cases in CORS

Change return values in certain cases, reorder
checks, etc.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: add COPY method to be handled by CORS
Yehuda Sadeh [Thu, 29 Aug 2013 04:24:36 +0000 (21:24 -0700)]
rgw: add COPY method to be handled by CORS

Was missing this http method.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: fix CORS rule check
Yehuda Sadeh [Wed, 28 Aug 2013 02:38:45 +0000 (19:38 -0700)]
rgw: fix CORS rule check

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: don't handle CORS if rule not found (is NULL)
Yehuda Sadeh [Wed, 28 Aug 2013 02:38:18 +0000 (19:38 -0700)]
rgw: don't handle CORS if rule not found (is NULL)

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: tie CORS header response to all relevant operations
Yehuda Sadeh [Thu, 22 Aug 2013 20:38:55 +0000 (13:38 -0700)]
rgw: tie CORS header response to all relevant operations

Have the CORS responses on all relevant operations. Also add headers
on failure cases.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: add a generic CORS response handling
Yehuda Sadeh [Thu, 22 Aug 2013 17:00:53 +0000 (10:00 -0700)]
rgw: add a generic CORS response handling

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: OPTIONS request doesn't need to read object info
Yehuda Sadeh [Thu, 22 Aug 2013 00:22:46 +0000 (17:22 -0700)]
rgw: OPTIONS request doesn't need to read object info

This is a bucket-only operation, so we shouldn't look at the
object. The object may not exist, and we might then respond with a Not
Exists response, which is not what we want.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: remove use of s->bucket_cors
Yehuda Sadeh [Wed, 21 Aug 2013 21:43:28 +0000 (14:43 -0700)]
rgw: remove use of s->bucket_cors

Some old code still tried to use s->bucket_cors, which was
abandoned during cleanup work.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoMerge branch 'next'
Gary Lowell [Wed, 4 Sep 2013 08:37:41 +0000 (01:37 -0700)]
Merge branch 'next'

11 years agov0.68 v0.68
Gary Lowell [Tue, 3 Sep 2013 23:10:31 +0000 (16:10 -0700)]
v0.68

11 years agoMerge branch 'wip-copyfrom'
Sage Weil [Tue, 3 Sep 2013 23:00:28 +0000 (16:00 -0700)]
Merge branch 'wip-copyfrom'

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agodoc: Fix repo URL for Ceph cloning (dev/generatedocs)
Dan Mick [Tue, 3 Sep 2013 22:56:53 +0000 (15:56 -0700)]
doc: Fix repo URL for Ceph cloning (dev/generatedocs)

Signed-off-by: Dan Mick <dan.mick@inktank.com>
11 years agoceph_test_rados: test COPY_FROM 563/head
Sage Weil [Tue, 3 Sep 2013 20:51:31 +0000 (13:51 -0700)]
ceph_test_rados: test COPY_FROM

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: initial COPY_FROM (not viable for large objects)
Sage Weil [Wed, 28 Aug 2013 22:04:16 +0000 (15:04 -0700)]
osd: initial COPY_FROM (not viable for large objects)

Initial pass at COPY_FROM implementation.  This uses COPY_GET to read an
object from another OSD and write it locally.  It chunks the read but
accumulates it all in-memory and commits it at once, so it is only suitable
for smaller objects.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoobjecter, librados: add COPY_FROM operation
Sage Weil [Mon, 26 Aug 2013 23:24:16 +0000 (16:24 -0700)]
objecter, librados: add COPY_FROM operation

This operation will copy an entire object (data, attrs, omap)
atomically.  If the src_version does not match the source object, or
the source object is updated while the copy is in progress, we will
fail with a suitable error code.  By atomic we mean that it will either
successfully copy the entire object in its entirety or it will fail (and
require no cleanup).

Add to C++ librados API only for now.

Signed-off-by: Sage Weil <sage@inktank.com>
Conflicts:

src/include/ceph_strings.cc
src/include/rados.h
src/osd/osd_types.cc
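The semantics described in the COPY_FROM commit above (copy data, attrs and omap together, and fail if the source version does not match) can be sketched with an in-memory store. Object, Store and copy_from are hypothetical and do not reflect the librados API; the error codes are arbitrary choices for illustration, and the check for a source updated while the copy is in progress is not modeled.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

struct Object {
  std::uint64_t version = 0;
  std::string data;
  std::map<std::string, std::string> attrs;
  std::map<std::string, std::string> omap;
};

using Store = std::map<std::string, Object>;

// Copy src to dst only if src is at the expected version. The destination
// is written in a single step, so a failure leaves nothing to clean up.
int copy_from(Store& store, const std::string& src, const std::string& dst,
              std::uint64_t src_version) {
  auto it = store.find(src);
  if (it == store.end())
    return -ENOENT;
  if (it->second.version != src_version)
    return -ERANGE;              // assumption: version mismatch error code
  Object copy = it->second;      // build the complete copy first
  store[dst] = std::move(copy);  // then publish data, attrs and omap at once
  return 0;
}
```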

11 years agodoc: Updated manual install to include sync agent, ARM packages, and DNS configuration.
John Wilkins [Tue, 3 Sep 2013 21:20:59 +0000 (14:20 -0700)]
doc: Updated manual install to include sync agent, ARM packages, and DNS configuration.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agoceph_test_rados: add missing kick for rollback
Sage Weil [Tue, 3 Sep 2013 20:58:03 +0000 (13:58 -0700)]
ceph_test_rados: add missing kick for rollback

This was broken by the refactor in 96aaa5e3a371ade8b91ad9ab991d996eaef2cea5
and can make us hang.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agorgw: change watch init ordering, don't distribute if can't
Yehuda Sadeh [Thu, 29 Aug 2013 20:06:33 +0000 (13:06 -0700)]
rgw: change watch init ordering, don't distribute if can't

Backport: dumpling

Moving back the watch initialization after the zone init,
as the zone info holds the control pool name. Since zone
init might need to create a new system object (that needs
to distribute cache), don't try to distribute cache if
watch is not yet initialized.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoosd: provide better version bounds for cls_current_version and ENOENT replies
Greg Farnum [Tue, 3 Sep 2013 17:54:23 +0000 (10:54 -0700)]
osd: provide better version bounds for cls_current_version and ENOENT replies

Following the changes to when we set or increase the user_version, we
want to continue to return the best lower bound we can on the version
of any newly-created object. For ENOENT replies that means returning
info.last_user_version instead of the (potentially-zero) ctx->user_at_version.

Similarly, for cls_current_version we want to return the last version on
the PG rather than the last update to the object in order to provide
sensible version ordering across object deletes and creates.

Update the versions doc so it continues to be precise.

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge remote-tracking branch 'gh/wip-6179'
Sage Weil [Tue, 3 Sep 2013 17:07:26 +0000 (10:07 -0700)]
Merge remote-tracking branch 'gh/wip-6179'

Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agoMerge pull request #562 from kri5/wip-4365
Yehuda Sadeh [Tue, 3 Sep 2013 15:49:32 +0000 (08:49 -0700)]
Merge pull request #562 from kri5/wip-4365

rgw: Allow wildcard in supported keystone roles.

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: Allow wildcard in supported keystone roles. 455/head 562/head
Christophe Courtaut [Mon, 22 Jul 2013 13:15:38 +0000 (15:15 +0200)]
rgw: Allow wildcard in supported keystone roles.

http://tracker.ceph.com/issues/4365 fixes #4365

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
11 years agoosd/ReplicatedPG: set user_version in waiting_for_commit replies
Sage Weil [Tue, 3 Sep 2013 00:08:04 +0000 (17:08 -0700)]
osd/ReplicatedPG: set user_version in waiting_for_commit replies

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: do not set ctx->user_at_version unless ctx->user_modify
Sage Weil [Sat, 31 Aug 2013 00:20:54 +0000 (17:20 -0700)]
osd/ReplicatedPG: do not set ctx->user_at_version unless ctx->user_modify

Leave ctx->user_at_version set to the previous oi.user_version unless/until
we find that ctx->user_modify is true.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: do not log a user_version on the snapdir object
Sage Weil [Sat, 31 Aug 2013 00:19:07 +0000 (17:19 -0700)]
osd/ReplicatedPG: do not log a user_version on the snapdir object

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: log previous user_version on clone
Sage Weil [Sat, 31 Aug 2013 00:18:21 +0000 (17:18 -0700)]
osd/ReplicatedPG: log previous user_version on clone

Nothing relies on this, but it makes sense to me.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: do not log user_version on deletion events
Sage Weil [Sat, 31 Aug 2013 00:22:26 +0000 (17:22 -0700)]
osd/ReplicatedPG: do not log user_version on deletion events

Or snap trim events where we are adjusting the head's snapdir attr.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/PG: only raise PG's last_user_version if entry is >
Sage Weil [Sat, 31 Aug 2013 00:15:56 +0000 (17:15 -0700)]
osd/PG: only raise PG's last_user_version if entry is >

We may have pg entries that do not increase the user_version at all (i.e.,
they may be 0).  Do not update the last_user_version in that case as we
need it to remain an upper bound.

Signed-off-by: Sage Weil <sage@inktank.com>
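The rule in the commit above amounts to a guarded maximum. A minimal sketch, with LogEntry and apply_entries as hypothetical names:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct LogEntry {
  std::uint64_t user_version;  // may be 0 for entries that do not bump it
};

// Raise last_user_version only when an entry carries a larger value, so
// entries with user_version == 0 never pull the bound back down.
std::uint64_t apply_entries(std::uint64_t last_user_version,
                            const std::vector<LogEntry>& entries) {
  for (const LogEntry& e : entries) {
    if (e.user_version > last_user_version)
      last_user_version = e.user_version;
  }
  return last_user_version;
}
```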
11 years agoosd: debug user_versions a bit
Sage Weil [Sat, 31 Aug 2013 00:06:30 +0000 (17:06 -0700)]
osd: debug user_versions a bit

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdc/Objecter: fix dereference of NULL pg_pool_t
Sage Weil [Sun, 1 Sep 2013 15:42:23 +0000 (08:42 -0700)]
osdc/Objecter: fix dereference of NULL pg_pool_t

Make sure we don't dereference a NULL pointer.  Note that we check a
bit further down if the target pool does not exist and return the proper
error.

Bug was reliably reproduced by

 ./ceph_test_rados_api_watch_notify --gtest_filter=LibRadosWatchNotify.WatchNotifyTimeoutTestPP

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoValidate S3 tokens against Keystone
Roald J. van Loon [Fri, 9 Aug 2013 11:31:10 +0000 (13:31 +0200)]
Validate S3 tokens against Keystone

- Added config option to allow S3 to use Keystone auth
- Implemented JSONDecoder for KeystoneToken
- RGW_Auth_S3::authorize now uses rgw_store_user_info on keystone auth
- Minor fix in get_canon_resource; dout is now after the assignment

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Roald J. van Loon <roaldvanloon@gmail.com>
11 years agoMerge pull request #561 from ceph/wip-6178
Sage Weil [Sat, 31 Aug 2013 23:46:52 +0000 (16:46 -0700)]
Merge pull request #561 from ceph/wip-6178

os: LevelDBStore: ignore ENOENT files when estimating store size

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge branch 'next'
Sage Weil [Sat, 31 Aug 2013 17:31:31 +0000 (10:31 -0700)]
Merge branch 'next'

11 years agomon: fix uninitialized Op field
Roald J. van Loon [Sat, 31 Aug 2013 17:30:14 +0000 (10:30 -0700)]
mon: fix uninitialized Op field

- Uninitialized field in MonitorLevelDB::Op causes random build errors.

Signed-off-by: Roald J. van Loon <roaldvanloon@gmail.com>
11 years agoautomake cleanup: uninitialized version_t
Roald J. van Loon [Fri, 30 Aug 2013 21:05:52 +0000 (23:05 +0200)]
automake cleanup: uninitialized version_t

This sometimes gives a completely random uint64_t value, because it is
potentially used uninitialized.

Signed-off-by: Roald J. van Loon <roaldvanloon@gmail.com>
11 years agoMerge pull request #541 from ceph/wip-6036
Sage Weil [Sat, 31 Aug 2013 00:02:49 +0000 (17:02 -0700)]
Merge pull request #541 from ceph/wip-6036

osd objecter; copy-get

Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoosd/ReplicatedPG: do not requeue if not primary 541/head
Sage Weil [Tue, 27 Aug 2013 22:01:02 +0000 (15:01 -0700)]
osd/ReplicatedPG: do not requeue if not primary

This saves us a bit of work, since we will discard the op anyway if
we aren't primary (or even if we become primary again before we get to
it).

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: COPY_GET operation
Sage Weil [Tue, 27 Aug 2013 22:25:50 +0000 (15:25 -0700)]
osd: COPY_GET operation

Add new rados operation to copy all user-visible content for an object
in a simple, safe way.  Use a new object_copy_cursor_t to keep track of
our position.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: factor {execute,reply}_ctx() out of do_op()
Sage Weil [Sun, 25 Aug 2013 04:58:11 +0000 (21:58 -0700)]
osd/ReplicatedPG: factor {execute,reply}_ctx() out of do_op()

Separate the processing of an OpContext from the preamble and
allocation, so that we can delay the execution for some ops (like the
COPYFROM operation we're about to add).

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: feed OSDMaps to the Objecter
Sage Weil [Sat, 17 Aug 2013 06:33:06 +0000 (23:33 -0700)]
osd: feed OSDMaps to the Objecter

Feed every map message we see (that isn't discarded for some other
reason) to the Objecter.  It has the same continuity requirements that
the OSD has, so it should be satisfied with what we get.  It can also
request maps via our MonClient.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: add an Objecter instance
Sage Weil [Sat, 17 Aug 2013 06:17:03 +0000 (23:17 -0700)]
osd: add an Objecter instance

It gets its own lock, timer, and osdmap.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: discriminate based on connection messenger, not peer type
Sage Weil [Mon, 26 Aug 2013 20:58:47 +0000 (13:58 -0700)]
osd: discriminate based on connection messenger, not peer type

Replace ->get_source().is_osd() checks and instead see if it is the
cluster_messenger so that we do not confuse ourselves when we get
legit requests from other OSDs on our public interface.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph-osd: rename msgr vars
Sage Weil [Sat, 17 Aug 2013 23:23:24 +0000 (16:23 -0700)]
ceph-osd: rename msgr vars

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: add a separate messenger for the Objecter
Sage Weil [Sat, 17 Aug 2013 06:03:26 +0000 (23:03 -0700)]
osd: add a separate messenger for the Objecter

We will give the OSD's Objecter its own messenger so that it does not
interfere with the OSD when it marks things up or down.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: add whitespace
Sage Weil [Mon, 19 Aug 2013 04:26:44 +0000 (21:26 -0700)]
osd/ReplicatedPG: add whitespace

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: less whitespace
Sage Weil [Sat, 17 Aug 2013 06:45:14 +0000 (23:45 -0700)]
osd: less whitespace

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdc/Objecter: allow ops to be canceled
Sage Weil [Sun, 18 Aug 2013 00:01:53 +0000 (17:01 -0700)]
osdc/Objecter: allow ops to be canceled

This is useful in general, and specifically will be useful for the
rados COPY operation.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdc/Objecter: only request map on startup if epoch == 0
Sage Weil [Sat, 17 Aug 2013 06:27:39 +0000 (23:27 -0700)]
osdc/Objecter: only request map on startup if epoch == 0

Normal clients have no map and need one to get started.  If we are the
OSD, we will already have one and will get fed maps as they come in.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd, objecter: clean up assert_ver()
Sage Weil [Sun, 25 Aug 2013 18:12:44 +0000 (11:12 -0700)]
osd, objecter: clean up assert_ver()

Create a separate union in the args and clean up the code a bit so that
this doesn't reuse the (unrelated) watch helpers.  No change in
protocol.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: drop src_obc.clear() calls
Sage Weil [Sat, 24 Aug 2013 04:34:28 +0000 (21:34 -0700)]
osd/ReplicatedPG: drop src_obc.clear() calls

These are all about to go out of scope; no need to clear them
explicitly.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoos/ObjectStore: add bufferlist variant of setattrs
Sage Weil [Wed, 21 Aug 2013 05:23:54 +0000 (22:23 -0700)]
os/ObjectStore: add bufferlist variant of setattrs

And hopefully we can kill the bufferptr ones someday!

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agounittest_lfnindex testing older HASH_INDEX_TAG
David Zafman [Fri, 30 Aug 2013 23:17:16 +0000 (16:17 -0700)]
unittest_lfnindex testing older HASH_INDEX_TAG

Switch to work with new HOBJECT_WITH_POOL

fixes: #6196

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agodoc/rados/operations/pools: remove experimental note about pg splitting
Sage Weil [Fri, 30 Aug 2013 22:41:02 +0000 (15:41 -0700)]
doc/rados/operations/pools: remove experimental note about pg splitting

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #560 from ceph/wip-6032-cache-objecter
Sage Weil [Fri, 30 Aug 2013 22:24:41 +0000 (15:24 -0700)]
Merge pull request #560 from ceph/wip-6032-cache-objecter

Wip 6032 cache objecter

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #554 from ceph/wip-tier-interface
Gregory Farnum [Fri, 30 Aug 2013 21:13:25 +0000 (14:13 -0700)]
Merge pull request #554 from ceph/wip-tier-interface

Specify a user and pg_pool_t interface for tiering/caching specifications

Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>