git.apps.os.sepia.ceph.com Git - ceph.git/log
12 years agoPG::build_scrub_map: detect race with peering via last_peering_reset
Samuel Just [Mon, 25 Feb 2013 20:40:06 +0000 (12:40 -0800)]
PG::build_scrub_map: detect race with peering via last_peering_reset

Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 67225339dc3d62d7fe5a32eec65d51e53e8d35bb)

12 years agoReplicatedPG::C_OSD_CommittedPushedObject: use intrusive_ptr for pg
Samuel Just [Mon, 25 Feb 2013 20:36:29 +0000 (12:36 -0800)]
ReplicatedPG::C_OSD_CommittedPushedObject: use intrusive_ptr for pg

Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 04ee8f478bbd587a711d0668c471cfc5c1cab06c)

12 years agoReplicatedPG::C_OSD_CommittedPushedObject take epoch submitted
Samuel Just [Mon, 25 Feb 2013 20:35:26 +0000 (12:35 -0800)]
ReplicatedPG::C_OSD_CommittedPushedObject take epoch submitted

What we really care about is that the epoch in which the Context
was submitted is at complete() time >= last_peering_reset.

Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit a01dea6af9aacf0614570ebb5fa161d9dde9b6b6)
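
A minimal sketch of the check described in this message, not the ReplicatedPG code itself. The names PGState, CommittedCallback and still_valid are hypothetical; only the comparison against last_peering_reset comes from the commit text.

```cpp
#include <cstdint>

using epoch_t = uint32_t;

// Hypothetical stand-in for the PG state the callback consults.
struct PGState {
  epoch_t last_peering_reset = 0;
};

// Hypothetical completion callback that remembers its submission epoch.
struct CommittedCallback {
  epoch_t epoch_submitted;

  explicit CommittedCallback(epoch_t e) : epoch_submitted(e) {}

  // True if the PG has not been reset by peering since submission,
  // i.e. the submission epoch is >= last_peering_reset at complete() time.
  bool still_valid(const PGState& pg) const {
    return epoch_submitted >= pg.last_peering_reset;
  }
};
```

A callback submitted in epoch 9 would be ignored once last_peering_reset reaches 10, while one submitted in epoch 12 would still be acted on.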

12 years agojounal: disable aio
Sage Weil [Mon, 4 Mar 2013 18:08:49 +0000 (10:08 -0800)]
jounal: disable aio

There is a deadlock issue in the aio code, see #4079.  Disable for the time
being.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomsgr: drop messages on cons with CLOSED Pipes
Sage Weil [Thu, 28 Feb 2013 20:46:00 +0000 (12:46 -0800)]
msgr: drop messages on cons with CLOSED Pipes

Back in commit 6339c5d43974f4b495f15d199e01a141e74235f5, we tried to make
this deal with a race between a faulting pipe and new messages being
queued.  The sequence is

- fault starts on pipe
- fault drops pipe_lock to unregister the pipe
- user (objecter) queues new message on the con
- submit_message reopens a Pipe (due to this bug)
- the message managed to make it out over the wire
- fault finishes faulting, calls ms_reset
- user (objecter) closes the con
- user (objecter) resends everything

It appears as though the previous patch *meant* to drop *m on the floor in
this case, which is what this patch does.  And that fixes the crash I am
hitting; see #4271.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoFileJournal::wrap_read_bl: adjust pos before returning
Samuel Just [Thu, 28 Feb 2013 00:58:45 +0000 (16:58 -0800)]
FileJournal::wrap_read_bl: adjust pos before returning

Otherwise, we may feed an offset past the end of the journal to
check_header in read_entry and incorrectly determine that the entry is
corrupt.

Fixes: 4296
Backport: bobtail
Backport: argonaut
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
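
As an illustration only: assuming the journal data lives in a ring spanning [top, max_size), adjusting the position after a read that wraps could look like the sketch below. The function advance_wrapped and its parameters are hypothetical and are not taken from FileJournal.

```cpp
#include <cstdint>

// Advance a read position by len bytes inside the ring [top, max_size),
// wrapping back to top instead of running past the end.
// Precondition (assumed): top <= pos < max_size.
uint64_t advance_wrapped(uint64_t pos, uint64_t len,
                         uint64_t top, uint64_t max_size) {
  const uint64_t span = max_size - top;
  return top + (pos - top + len) % span;
}
```

With top = 10 and max_size = 100, a 10-byte read starting at offset 95 leaves the position at 15 rather than at 105, which would be past the end of the journal.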
12 years agolibrbd: fix rollback size
Josh Durgin [Tue, 26 Feb 2013 21:20:08 +0000 (13:20 -0800)]
librbd: fix rollback size

The duplicate calls to get_image_size() and get_snap_size() replaced
by 5806226cf0743bb44eaf7bc815897c6846d43233 uncovered this. The first
call was using the currently set snap_id instead of the snapshot being
rolled back to.

Fixes: #4272
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomsg: fix entity_addr_t::is_same_host() for IPv6
Sage Weil [Tue, 26 Feb 2013 22:07:12 +0000 (14:07 -0800)]
msg: fix entity_addr_t::is_same_host() for IPv6

We weren't checking the memcmp return value properly!  Aie...

Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
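
The diff is not shown here, so the following is only a sketch of the general pitfall: memcmp returns 0 when the two ranges are equal, so an equality test has to compare the result with 0. Ipv6Bytes and same_host_v6 are hypothetical names, not part of entity_addr_t.

```cpp
#include <array>
#include <cstring>

// 16 raw bytes of an IPv6 address (illustrative type).
using Ipv6Bytes = std::array<unsigned char, 16>;

// Two addresses name the same host when all 16 bytes match.
// memcmp() yields 0 on equality, so the result is compared with 0
// rather than being used directly as a boolean.
bool same_host_v6(const Ipv6Bytes& a, const Ipv6Bytes& b) {
  return std::memcmp(a.data(), b.data(), a.size()) == 0;
}
```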
12 years agoceph_common.sh: tolerate missing mds, mon, osds in conf
Sage Weil [Tue, 26 Feb 2013 19:10:44 +0000 (11:10 -0800)]
ceph_common.sh: tolerate missing mds, mon, osds in conf

With set -e this seems to fail (at least on some machines) if, say, there
is no MDS in the conf file.  This fixes it.

Tested-by: Mark Nelson <mark.nelson@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip-4249' into next
Sage Weil [Tue, 26 Feb 2013 01:48:07 +0000 (17:48 -0800)]
Merge remote-tracking branch 'gh/wip-4249' into next

12 years agosystest: restrict list error acceptance
Josh Durgin [Mon, 25 Feb 2013 23:02:50 +0000 (15:02 -0800)]
systest: restrict list error acceptance

Only ignore errors after the midway point if the midway_sem_post is
defined.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
(cherry picked from commit 5b24a68b6e7d57bac688021b822fb2f73494c3e9)

12 years agosystest: fix race with pool deletion
Josh Durgin [Mon, 25 Feb 2013 22:55:34 +0000 (14:55 -0800)]
systest: fix race with pool deletion

The second test had pool deletion and object listing wait on the same
semaphore to connect and start. This led to errors sometimes when the
pool was deleted before it could be opened by the listing process. Add
another semaphore so the pool deletion happens only after the listing
has begun.

Fixes: #4147
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
(cherry picked from commit b0271e390564119e998e18189282252d54f75eb6)

12 years agolibrbd: drop snap_lock before invalidating cache
Josh Durgin [Mon, 25 Feb 2013 19:33:48 +0000 (11:33 -0800)]
librbd: drop snap_lock before invalidating cache

Writeback will take the snap_lock, so read everything we need under it
before invalidating the cache. This avoids a recursive lock when writeback
uses snap_lock while snap_rollback() was holding it.

Remove a not-very-useful debugging message that depended on snap_lock being held.

Fixes: #4249
Backport: bobtail
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
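
A rough sketch of the locking pattern described above, using std::mutex as a stand-in (the lock type librbd actually uses is not stated in this message). The Image struct and its members are hypothetical.

```cpp
#include <cstdint>
#include <mutex>

struct Image {
  std::mutex snap_lock;
  uint64_t snap_size = 0;
  bool cache_valid = true;

  // Stand-in for cache invalidation whose writeback path takes snap_lock.
  void invalidate_cache() {
    std::lock_guard<std::mutex> l(snap_lock);
    cache_valid = false;
  }

  uint64_t rollback() {
    uint64_t size;
    {
      // Read everything needed while holding the lock...
      std::lock_guard<std::mutex> l(snap_lock);
      size = snap_size;
    }  // ...and release it before invalidating.
    // Calling this with snap_lock still held would lock it recursively.
    invalidate_cache();
    return size;
  }
};
```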
12 years agomds: reencode MDSMap in MMDSMap if MDSENC feature is not present
Sage Weil [Sun, 24 Feb 2013 00:36:36 +0000 (16:36 -0800)]
mds: reencode MDSMap in MMDSMap if MDSENC feature is not present

In some cases the MMDSMap message from mon -> client passes from leader ->
peon -> client, and the leader doesn't encode with the correct feature
bits.  As with MMOSDMap, we reencode the nested MDSMap based on the
features if relevant bits are not present.

We forgot to include this with the mds encoding changes.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoqa/run_xfstests.sh: use $TESTDIR instead of /tmp/cephtest
Sage Weil [Sat, 23 Feb 2013 16:38:10 +0000 (08:38 -0800)]
qa/run_xfstests.sh: use $TESTDIR instead of /tmp/cephtest

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: an interval can't go readwrite if its acting is empty
Sage Weil [Thu, 21 Feb 2013 19:15:58 +0000 (11:15 -0800)]
osd: an interval can't go readwrite if its acting is empty

Let's not forget that min_size can be zero.

Fixes: #4159
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4277265d99647c9fe950ba627e5d86234cfd70a9)
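
To see why min_size == 0 matters, consider a simplified predicate. The real check in the OSD involves more conditions; may_go_readwrite is a hypothetical name used only for this sketch.

```cpp
#include <vector>

// An interval with no acting OSDs can never serve writes, even when
// min_size is 0 and the size comparison alone would pass.
bool may_go_readwrite(const std::vector<int>& acting, unsigned min_size) {
  return !acting.empty() && acting.size() >= min_size;
}
```

Without the explicit empty() test, an empty acting set would satisfy acting.size() >= 0.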

12 years agoconfiguration parsing: give better error for missing =
Dan Mick [Fri, 22 Feb 2013 05:41:25 +0000 (21:41 -0800)]
configuration parsing: give better error for missing =

A ceph.conf line with "key" and no "= value" currently shows
"unexpected character while parsing putative key value,
at char N line M".  There's no reason it can't be clearer.

Fixes: #4229
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
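
The idea can be sketched as follows. This is not the ceph.conf parser; check_key_value_line is a hypothetical helper and the message wording is invented for illustration.

```cpp
#include <string>

// Returns an empty string if the line looks like "key = value",
// otherwise a message that names the actual problem.
std::string check_key_value_line(const std::string& line, int line_no) {
  const std::string where = "line " + std::to_string(line_no) + ": ";
  const std::string::size_type eq = line.find('=');
  if (eq == std::string::npos) {
    return where + "expected '=' after key \"" + line + "\"";
  }
  if (eq == 0) {
    return where + "missing key before '='";
  }
  return "";
}
```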
12 years agoceph_common.sh: fix iteration of items in ceph.conf
Sage Weil [Fri, 22 Feb 2013 01:29:58 +0000 (17:29 -0800)]
ceph_common.sh: fix iteration of items in ceph.conf

This broke in c8f528a4070dd3aa0b25c435c6234032aee39b21.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomds: use inode_t::layout for dir layout policy
Greg Farnum [Thu, 21 Feb 2013 17:22:00 +0000 (09:22 -0800)]
mds: use inode_t::layout for dir layout policy

This cherry-pick is going in the reverse direction of normal. That's
because this direction makes for the minimal change -- this patchset
is required to fix the loss of directory layouts we were previously
seeing, but fixing it requires changing the encoding versions. So we
wrote it on top of Bobtail and let it update the struct_v's as they existed
then. Note that we here change a few encoding versions in ways which are
NOT COMPATIBLE with previous development code (but not any releases). In
particular, development code introduced and this removes the
file_layout_policy_t, and some of the CInode and EMetaBlob encoding
struct_v values were used in development code to mean one thing, but
mean something different due to the Bobtail patch.

Remove the default_file_layout struct, which was just a ceph_file_layout,
and store it in the inode_t.  Rip out all the annoying code that put this
on the heap.

To aid in this usage, add a clear_layout() function to inode_t.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 36ed407e0f939a9bca57c3ffc0ee5608d50ab7ed)
Conflicts:

src/mds/CInode.cc
src/mds/CInode.h
src/mds/MDCache.cc
src/mds/Server.cc
src/mds/events/EMetaBlob.h
Cherry-pick-
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomds: parse ceph.*.layout vxattr key/value content
Sage Weil [Mon, 21 Jan 2013 05:53:37 +0000 (21:53 -0800)]
mds: parse ceph.*.layout vxattr key/value content

Use qi to parse a strictly formatted set of key/value pairs.  Be picky
about whitespace.  Any subset of recognized keys is allowed.  Parse the
same set of keys as the ceph.*.layout.* vxattrs.

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 5551aa5b3b5c2e9e7006476b9cd8cc181d2c9a04)

12 years agoFix failing > 4MB range requests through radosgw S3 API.
Jan Harkes [Thu, 21 Feb 2013 20:17:38 +0000 (15:17 -0500)]
Fix failing > 4MB range requests through radosgw S3 API.

When a range request is made for more than rgw_get_obj_max_req_size
bytes the first returned chunk sets 'ret' to STATUS_PARTIAL_CONTENT and
all remaining chunks behave as if there is an error state and only
return a minimal header.

Fix this by passing STATUS_PARTIAL_CONTENT to set_req_state_err, but
leave the 'ret' member variable untouched.

Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit c83a01d4e8dcd26eec24c020c5b79fcfa4ae44a3)

12 years agoosd: clear recovery state on pg removal
Sage Weil [Thu, 21 Feb 2013 18:30:08 +0000 (10:30 -0800)]
osd: clear recovery state on pg removal

This ensures we release our in-progress recovery counters, which prevents
recovery from getting blocked indefinitely when a pool removal races with
recovery ops.

Fixes: #4217
Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agorgw: refactor header grants
Yehuda Sadeh [Wed, 20 Feb 2013 20:39:37 +0000 (12:39 -0800)]
rgw: refactor header grants

Move definition to a static array.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw_acl: Support ACL grants in headers.
caleb miles [Tue, 19 Feb 2013 17:15:30 +0000 (12:15 -0500)]
rgw_acl: Support ACL grants in headers.

Issue 3669: Support S3 ACL grants specified in request headers. Allow
requests, excluding POST object, to specify ACL grants in HTTP headers.

Signed-off-by: caleb miles <caleb.miles@inktank.com>
Conflicts:
src/rgw/rgw_acl_s3.cc
src/rgw/rgw_acl_s3.h
src/rgw/rgw_rest_s3.cc
src/rgw/rgw_rest_s3.h

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoosd: lock pg in build_past_intervals_parallel()
Sage Weil [Wed, 20 Feb 2013 06:20:47 +0000 (22:20 -0800)]
osd: lock pg in build_past_intervals_parallel()

Methods called by write_if_dirty() (get_osdmap()) assert that the pg
is locked.

Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agorgw: fix multipart uploads listing
Yehuda Sadeh [Mon, 18 Feb 2013 17:10:43 +0000 (09:10 -0800)]
rgw: fix multipart uploads listing

Fixes: #4177
Backport: bobtail
Listing multipart uploads had a typo, and was requiring the
wrong resource (uploadId instead of uploads).

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agorgw: don't copy object when it's copied into itself
Yehuda Sadeh [Fri, 15 Feb 2013 18:22:54 +0000 (10:22 -0800)]
rgw: don't copy object when it's copied into itself

Fixes: #4150
Backport: bobtail

When an object is copied into itself, it is not fully copied: the tail
reference count stays the same and the head part is rewritten.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agotest/bufferlist: fix warning
Sage Weil [Tue, 19 Feb 2013 23:33:20 +0000 (15:33 -0800)]
test/bufferlist: fix warning

In file included from test/bufferlist.cc:31:0:
../src/gtest/include/gtest/gtest.h: In function ‘testing::AssertionResult testing::internal::CmpHelperEQ(const char*, const char*, const T1&, const T2&) [with T1 = unsigned int, T2 = int]’:
../src/gtest/include/gtest/gtest.h:1300:30: instantiated from ‘static testing::AssertionResult testing::internal::EqHelper::Compare(const char*, const char*, const T1&, const T2&) [with T1 = unsigned int, T2 = int, bool lhs_is_null_literal = false]’
test/bufferlist.cc:1604:227: instantiated from here
warning: ../src/gtest/include/gtest/gtest.h:1263:3: comparison between signed and unsigned integer expressions [-Wsign-compare]

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'master' of https://github.com/ceph/ceph
Gary Lowell [Tue, 19 Feb 2013 22:55:14 +0000 (14:55 -0800)]
Merge branch 'master' of https://github.com/ceph/ceph

12 years agoMerge branch 'next'
Gary Lowell [Tue, 19 Feb 2013 22:53:54 +0000 (14:53 -0800)]
Merge branch 'next'

12 years agotesting: updating hadoop-internal test
Joe Buck [Tue, 19 Feb 2013 18:37:49 +0000 (10:37 -0800)]
testing: updating hadoop-internal test

Small tweaks to the hadoop-internal test
to better use existing environment variables.

Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
12 years agoqa: sample test for new replication tests
Noah Watkins [Tue, 12 Feb 2013 23:21:14 +0000 (15:21 -0800)]
qa: sample test for new replication tests

Signed-off-by: Joe Buck <jbbuck@gmail.com>
12 years agodoc/release-notes: v0.57
Sage Weil [Tue, 19 Feb 2013 21:50:18 +0000 (13:50 -0800)]
doc/release-notes: v0.57

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoPG: remove weirdness log for last_complete < log.tail
Samuel Just [Tue, 19 Feb 2013 18:49:33 +0000 (10:49 -0800)]
PG: remove weirdness log for last_complete < log.tail

In the case of a divergent object prior to log.tail,
last_complete may end up before log.tail.

Backport: bobtail
Fixes #4174
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoos/FileStore: check replay guard on src for collection rename
Sage Weil [Tue, 19 Feb 2013 01:39:46 +0000 (17:39 -0800)]
os/FileStore: check replay guard on src for collection rename

This avoids a problematic sequence like:

     - rename A/ -> B/
     - remove B/1...100
     - destroy B/
     - create A/
     - write A/101...
     <crash>
     - replay A/ -> B/
     - remove B/1...100  (fails but tolerated)
     - destroy B/        (fails with ENOTEMPTY)

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoosd: requeue pg waiters at the front of the finished queue
Sage Weil [Mon, 18 Feb 2013 06:35:50 +0000 (22:35 -0800)]
osd: requeue pg waiters at the front of the finished queue

We could have a sequence like:

- op1
- notify
- op2

in the finished queue.  Op1 gets put on waiting_for_pg, the notify
creates the pg and requeues op1 (at the end), op2 is handled, and
finally op1 is handled.  That breaks ordering; see #2947.

Instead, when we wake up a pg, queue the waiting messages at the front
of the dispatch queue.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
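
The ordering fix can be pictured with a std::deque. This is a sketch under the assumption that the finished queue behaves like a double-ended queue; requeue_front is a hypothetical name.

```cpp
#include <deque>
#include <string>
#include <vector>

// Put woken-up waiters back at the *front* of the queue, keeping their
// relative order, so they run before anything queued after them.
void requeue_front(std::deque<std::string>& finished,
                   const std::vector<std::string>& waiters) {
  finished.insert(finished.begin(), waiters.begin(), waiters.end());
}
```

After the notify has been handled the queue holds only op2; requeueing op1 at the front restores the order op1, op2.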
12 years agoosd: pull requeued requests off one at a time
Sage Weil [Mon, 18 Feb 2013 04:49:52 +0000 (20:49 -0800)]
osd: pull requeued requests off one at a time

Pull items off the finished queue one at a time.  In certain cases, an
event may result in new items getting added to the finished queue that
will be put at the *front* instead of the back.  See latest incarnation
of #2947.

Note that this is a significant change in behavior in that we can
theoretically starve if an event keeps resulting in new events getting
generated.  Beware!

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
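
A sketch of such a drain loop, assuming a deque-like finished queue. drain_one_at_a_time and the handler signature are hypothetical.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <vector>

using Queue = std::deque<std::string>;

// Take one item at a time so that anything a handler pushes onto the
// front of the queue is seen before the items that were already waiting.
// Returns the order in which items were handled.
std::vector<std::string> drain_one_at_a_time(
    Queue& finished,
    const std::function<void(const std::string&, Queue&)>& handle) {
  std::vector<std::string> order;
  while (!finished.empty()) {
    std::string item = finished.front();
    finished.pop_front();
    order.push_back(item);
    handle(item, finished);  // may push new items at the front
  }
  return order;
}
```

The loop only ends when the queue is empty, so a handler that always pushes a new item keeps it running forever; that is the starvation risk the message warns about.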
12 years agov0.57 v0.57
Gary Lowell [Tue, 19 Feb 2013 18:07:42 +0000 (10:07 -0800)]
v0.57

12 years agoosd: fix printf warning on pg_log_entry_t::get_key_name
Sage Weil [Tue, 19 Feb 2013 17:12:52 +0000 (09:12 -0800)]
osd: fix printf warning on pg_log_entry_t::get_key_name

warning: osd/osd_types.cc:1716:76: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'version_t {aka long long unsigned int}' [-Wformat]
warning: osd/osd_types.cc:1716:76: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'version_t {aka long long unsigned int}' [-Wformat]

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoqa: test_mon_workloadgen: use default config file path
Sage Weil [Tue, 19 Feb 2013 17:08:57 +0000 (09:08 -0800)]
qa: test_mon_workloadgen: use default config file path

I'm not sure why we wouldn't.  Also, this makes this test work without
annoying plumbing to pass the explicit path through.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoqa: mon/workloadgen.sh: drop TEST_CEPH_CONF code
Sage Weil [Tue, 19 Feb 2013 17:02:14 +0000 (09:02 -0800)]
qa: mon/workloadgen.sh: drop TEST_CEPH_CONF code

The binaries already pick up on CEPH_CONF, which will be set as needed.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agorbd: udevadm settle before unmap
Sage Weil [Tue, 19 Feb 2013 04:36:56 +0000 (20:36 -0800)]
rbd: udevadm settle before unmap

udev runs blkid on device close, and other such nonsense that can
make unmap fail with EBUSY.  Settle before we unmap to avoid this if
possible.  See #4183.

Closes: #4186
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
12 years agotest: correcting hadoop-internal tests
Joe Buck [Thu, 14 Feb 2013 23:13:39 +0000 (15:13 -0800)]
test: correcting hadoop-internal tests

Changing the hadoop-internal tests to use the
newly added $TESTDIR environment variable.
Also, removed unneeded variables.

Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
12 years agotesting: adding a Hadoop wordcount test
Joe Buck [Mon, 18 Feb 2013 23:46:20 +0000 (15:46 -0800)]
testing: adding a Hadoop wordcount test

Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
12 years agoqa: rbd map-snapshot-io: udevadm settle
Sage Weil [Tue, 19 Feb 2013 04:32:41 +0000 (20:32 -0800)]
qa: rbd map-snapshot-io: udevadm settle

Udev runs blkid on device close, thwarting any rbd unmap that
immediately follows use of the device.  Explicitly settle for now.

See #4183.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodebian: allow extra args to get passed to ./configure via the environment
Sage Weil [Tue, 19 Feb 2013 01:07:55 +0000 (17:07 -0800)]
debian: allow extra args to get passed to ./configure via the environment

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoqa: rbd/map-snapshot-io: remove image when done
Sage Weil [Mon, 18 Feb 2013 19:24:46 +0000 (11:24 -0800)]
qa: rbd/map-snapshot-io: remove image when done

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoqa: fix quoting of wget URLs
Sage Weil [Mon, 18 Feb 2013 18:58:10 +0000 (10:58 -0800)]
qa: fix quoting of wget URLs

Broke this in ae0c2bbb50ab04467b5223a4f61bfca4b0830142.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: log weirdness if caller_ops hash gets bigger than the log
Sage Weil [Mon, 18 Feb 2013 18:25:03 +0000 (10:25 -0800)]
osd: log weirdness if caller_ops hash gets bigger than the log

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #65 from javacruft/wip-ocf-rbd
Sage Weil [Mon, 18 Feb 2013 17:18:54 +0000 (09:18 -0800)]
Merge pull request #65 from javacruft/wip-ocf-rbd

Strip any trailing whitespace from rbd showmapped

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoStrip any trailing whitespace from rbd showmapped 65/head
James Page [Mon, 18 Feb 2013 16:24:54 +0000 (16:24 +0000)]
Strip any trailing whitespace from rbd showmapped

More recent versions of ceph append a bit of whitespace to the line
after the name of the /dev/rbdX device; this causes the monitor check
to fail as it can't find the device name due to the whitespace.

This fix excludes any characters after the /dev/rbdN match.

12 years agobuffer: drop large malloc tests
Sage Weil [Mon, 18 Feb 2013 05:47:30 +0000 (21:47 -0800)]
buffer: drop large malloc tests

These succeed on my machine and eat unseemly amounts of RAM.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agobuffer: put big buffer on heap, not stack
Sage Weil [Mon, 18 Feb 2013 05:47:07 +0000 (21:47 -0800)]
buffer: put big buffer on heap, not stack

This fixes a segfault on my x86_64 wheezy box.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agounit tests for src/common/buffer.{cc,h}
Loic Dachary [Sun, 17 Feb 2013 19:38:52 +0000 (20:38 +0100)]
unit tests for src/common/buffer.{cc,h}

Implement unit tests covering most lines of code ( > 92% ) and all
methods as shown by the output of make check-coverage :
http://dachary.org/wp-uploads/2013/03/ceph-lcov/ .

The following static constructors are implemented by opaque classes
defined in buffer.cc ( buffer::raw_char, buffer::raw_posix_aligned
etc. ). Testing the implementation of these classes is done by
variations of the calls to the static constructors.

    copy(const char *c, unsigned len);
    create(unsigned len);
    claim_char(unsigned len, char *buf);
    create_malloc(unsigned len);
    claim_malloc(unsigned len, char *buf);
    create_static(unsigned len, char *buf);
    create_page_aligned(unsigned len);

The raw_mmap_pages class cannot be tested because it is commented out in
raw_posix_aligned. The raw_hack_aligned class is only tested under Cygwin.
The raw_posix_aligned class is not tested under Cygwin.

The unittest_bufferlist.sh script calls unittest_bufferlist with the
CEPH_BUFFER_TRACK=true environment variable to enable the code
tracking the memory usage. It cannot be done within the bufferlist.cc
file itself because it relies on the initialization of a global
variable  ( buffer_track_alloc ).

When raw_posix_aligned is called on DARWIN, the data is not aligned
on CEPH_PAGE_SIZE because it calls valloc(size) which is the equivalent of
memalign(sysconf(_SC_PAGESIZE),size) and not memalign(CEPH_PAGE_SIZE,size).
For this reason the alignment test is de-activated on DARWIN.

The tests are grouped in

TEST(BufferPtr, ... ) for buffer::ptr
TEST(BufferListIterator, ...) for buffer::list::iterator
TEST(BufferList, ...) for buffer::list
TEST(BufferHash, ...) for buffer::hash

and each method ( and all variations of the prototype ) are
included into a single TEST() function.

Although most aspects of the methods are tested, including exceptions
and border cases, inconsistencies are not highlighted. For
instance

    buffer::list::iterator i;
    i.advance(1);

would dereference a buffer::raw NULL pointer although

    buffer::ptr p;
    p.wasted()

asserts instead of dereferencing the buffer::raw NULL pointer. It
would be better to always assert in case a NULL pointer is about to be
used. But this is a minor inconsistency that is probably not worth a
test.

The following buffer::list methods

    ssize_t read_fd(int fd, size_t len);
    int write_fd(int fd) const;

are not fully tested because the border cases cannot be reliably
reproduced. Going through a pointer indirection when calling the ::writev
or safe_read functions would allow the test to create mockups to synthesize
the conditions for border cases.

tracker.ceph.com/issues/4066 refs #4066

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agomon: fix pgmap stat smoothing
Sage Weil [Mon, 18 Feb 2013 05:21:35 +0000 (21:21 -0800)]
mon: fix pgmap stat smoothing

<facepalm>

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc/release-notes: add note about upgrade to v0.56.3
Sage Weil [Mon, 18 Feb 2013 05:10:52 +0000 (21:10 -0800)]
doc/release-notes: add note about upgrade to v0.56.3

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph_common: fix check for defined/undefined entities in conf
Sage Weil [Sun, 17 Feb 2013 18:39:44 +0000 (10:39 -0800)]
ceph_common: fix check for defined/undefined entities in conf

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agobuffer::ptr::cmp only compares up to the smallest length
Loic Dachary [Sun, 17 Feb 2013 08:41:09 +0000 (09:41 +0100)]
buffer::ptr::cmp only compares up to the smallest length

When running

  bufferptr a("A", 1);
  bufferptr ab("AB", 2);
  a.cmp(ab);

it returned zero because cmp only compared up to the length of the
smaller buffer and returned as soon as that prefix was identical. The
function is modified to compare the lengths of the buffers instead of
returning.

http://tracker.ceph.com/issues/4170 refs #4170

Signed-off-by: Loic Dachary <loic@dachary.org>
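
The intended behaviour, sketched on raw byte ranges rather than on bufferptr. cmp_bytes is a hypothetical helper, not the buffer::ptr::cmp implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Compare the common prefix first; if it is identical, the shorter
// range sorts first. Returns -1, 0 or 1.
int cmp_bytes(const char* a, std::size_t alen,
              const char* b, std::size_t blen) {
  const std::size_t n = std::min(alen, blen);
  const int r = std::memcmp(a, b, n);
  if (r != 0) return r < 0 ? -1 : 1;
  if (alen < blen) return -1;
  if (alen > blen) return 1;
  return 0;
}
```

With this rule, comparing "A" with "AB" yields -1 instead of 0.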
12 years agoceph-disk-prepare: -f for mkfs.xfs only
Sage Weil [Sun, 17 Feb 2013 04:55:03 +0000 (20:55 -0800)]
ceph-disk-prepare: -f for mkfs.xfs only

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodebian: fix start of ceph-all
Sage Weil [Sun, 17 Feb 2013 00:49:50 +0000 (16:49 -0800)]
debian: fix start of ceph-all

Tolerate failure, and do ceph-all, not ceph-osd-all.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agofix operator>=(bufferlist& l, bufferlist& r)
Loic Dachary [Sat, 16 Feb 2013 09:11:15 +0000 (10:11 +0100)]
fix operator>=(bufferlist& l, bufferlist& r)

  bufferlist a;
  a.append("A");
  bufferlist ab;
  ab.append("AB");

a >= ab failed, throwing an instance of 'ceph::buffer::end_of_buffer'
because it tried to access a[1]. All comparison operators should be
tested using a lexicographic sort like strcmp or memcmp (-1, 0, 1).
In the meantime, the missing test is added:

  if (l.length() == p && r.length() > p) return false;

A set of unit tests demonstrating the problem and covering all comparison
operators are added to show that the proposed fix works as expected.

http://tracker.ceph.com/issues/4157 refs #4157

Signed-off-by: Loic Dachary <loic@dachary.org>
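
To show where the added test fits, here is a sketch of a lexicographic >= written over std::string instead of bufferlist. The function ge is hypothetical; only the "l is exhausted while r still has bytes" check is taken from the message above.

```cpp
#include <cstddef>
#include <string>

// Lexicographic l >= r, walking both strings one position at a time.
bool ge(const std::string& l, const std::string& r) {
  for (std::size_t p = 0;; ++p) {
    // The added test: l ran out first, so l is a strict prefix of r.
    if (l.length() == p && r.length() > p) return false;
    // r ran out (possibly together with l): l >= r.
    if (r.length() == p) return true;
    const unsigned char lc = static_cast<unsigned char>(l[p]);
    const unsigned char rc = static_cast<unsigned char>(r[p]);
    if (lc > rc) return true;
    if (lc < rc) return false;
  }
}
```

Without the first check, comparing "A" with "AB" would go on to read index 1 of the shorter operand.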
12 years agoMerge remote-tracking branch 'gh/wip-deploy'
Sage Weil [Sat, 16 Feb 2013 17:39:34 +0000 (09:39 -0800)]
Merge remote-tracking branch 'gh/wip-deploy'

Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
12 years agoMerge branch 'next'
Sage Weil [Sat, 16 Feb 2013 01:21:42 +0000 (17:21 -0800)]
Merge branch 'next'

12 years agoMerge remote-tracking branch 'upstream/wip-4075'
Samuel Just [Sat, 16 Feb 2013 00:53:40 +0000 (16:53 -0800)]
Merge remote-tracking branch 'upstream/wip-4075'

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge pull request #62 from dalgaaf/wip-da-sca-cppcheck-performance-3
Sage Weil [Sat, 16 Feb 2013 00:26:58 +0000 (16:26 -0800)]
Merge pull request #62 from dalgaaf/wip-da-sca-cppcheck-performance-3

More performance patches

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoqa: rbd/map-snapshat-io: unmap image when done
Sage Weil [Sat, 16 Feb 2013 00:26:16 +0000 (16:26 -0800)]
qa: rbd/map-snapshat-io: unmap image when done

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk-prepare: always force mkfs.xfs
Alexandre Marangone [Fri, 15 Feb 2013 20:24:01 +0000 (12:24 -0800)]
ceph-disk-prepare: always force mkfs.xfs

Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
12 years agoudev: trigger on dmcrypted osd partitions
Sage Weil [Thu, 14 Feb 2013 02:22:45 +0000 (18:22 -0800)]
udev: trigger on dmcrypted osd partitions

Automatically map encrypted journal partitions.

For encrypted OSD partitions, map them, wait for the mapped device to
appear, and then ceph-disk-activate.

This is much simpler than doing the work in ceph-disk-activate.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk-prepare: add initial support for dm-crypt
Sage Weil [Wed, 13 Feb 2013 05:35:56 +0000 (21:35 -0800)]
ceph-disk-prepare: add initial support for dm-crypt

Keep keys in /etc/ceph/dmcrypt-keys.

Identify partition instances by the partition UUID.  Identify encrypted
partitions by a parallel set of type UUIDs.

Signed-off-by: Alexandre Marangone <alexandre.maragone@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk-activate: pull mount options from ceph.conf
Alexandre Marangone [Fri, 15 Feb 2013 20:22:33 +0000 (12:22 -0800)]
ceph-disk-activate: pull mount options from ceph.conf

Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
12 years agoceph-disk-activate: use full paths for everything
Sage Weil [Fri, 15 Feb 2013 01:05:32 +0000 (17:05 -0800)]
ceph-disk-activate: use full paths for everything

We are run from udev, which doesn't get a decent PATH.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk-prepare: do partprobe after setting final partition type
Sage Weil [Fri, 15 Feb 2013 01:04:55 +0000 (17:04 -0800)]
ceph-disk-prepare: do partprobe after setting final partition type

This is necessary to kick udev into processing the updated partition and
running its rules.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoReplicatedPG: handle missing set even for old format
Samuel Just [Fri, 15 Feb 2013 21:52:37 +0000 (13:52 -0800)]
ReplicatedPG: handle missing set even for old format

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoqa: rbd/map-snapshot-io.sh: chown rbd device stuff
Sage Weil [Fri, 15 Feb 2013 21:57:47 +0000 (13:57 -0800)]
qa: rbd/map-snapshot-io.sh: chown rbd device stuff

This is fugly, but sudo -E doesn't work.  Fix this after we are installing
debs and the path doesn't matter anymore!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocls/lock/cls_lock.cc: use !lockers.empty() instead of size() 62/head
Danny Al-Gaaf [Tue, 12 Feb 2013 17:51:51 +0000 (18:51 +0100)]
cls/lock/cls_lock.cc: use !lockers.empty() instead of size()

Use empty() since it should be preferred as it has, following the
standard, a constant time complexity regardless of the container
type. The same is not guaranteed for size().

warning from cppchecker was:
[src/cls/lock/cls_lock.cc:209]: (performance) Possible inefficient
  checking for 'lockers' emptiness.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
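
A small illustration of the pattern cppcheck is asking for. has_entries and has_lockers are hypothetical names, and std::list<int> is only a placeholder container type.

```cpp
#include <list>

// Emptiness check that is constant time for every standard container.
template <typename Container>
bool has_entries(const Container& c) {
  return !c.empty();
}

// Before C++11, std::list::size() was allowed to walk the whole list,
// so "size() > 0" could cost O(n) where "!empty()" costs O(1).
bool has_lockers(const std::list<int>& lockers) {
  return has_entries(lockers);
}
```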
12 years agosrc/client/SyntheticClient.cc: use !subdirs.empty() instead of size()
Danny Al-Gaaf [Tue, 12 Feb 2013 17:48:50 +0000 (18:48 +0100)]
src/client/SyntheticClient.cc: use !subdirs.empty() instead of size()

Use empty() since it should be preferred as it has, following the
standard, a constant time complexity regardless of the container
type. The same is not guaranteed for size().

warning from cppchecker was:
[src/client/SyntheticClient.cc:2706]: (performance) Possible
  inefficient checking for 'subdirs' emptiness

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoclient/Client.cc: use empty() instead of size()
Danny Al-Gaaf [Tue, 12 Feb 2013 17:44:04 +0000 (18:44 +0100)]
client/Client.cc: use empty() instead of size()

Use empty() since it should be preferred as it has, following the
standard, a constant time complexity regardless of the container
type. The same is not guaranteed for size().

warning from cppchecker was:
[src/client/Client.cc:3649]: (performance) Possible inefficient
  checking for 'mds_sessions' emptiness.
[src/client/Client.cc:7489]: (performance) Possible inefficient
  checking for 'osds' emptiness.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agomds/MDSMap.h: use up.empty() instead of up.size()
Danny Al-Gaaf [Tue, 12 Feb 2013 16:41:04 +0000 (17:41 +0100)]
mds/MDSMap.h: use up.empty() instead of up.size()

Use empty() since it is preferred: following the standard, it has
constant time complexity regardless of the container type. The same
is not guaranteed for size().

warning from cppchecker was:
[src/mds/MDSMap.h:448]: (performance) Possible inefficient
  checking for 'up' emptiness.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agomds/CDentry.h: use projected.empty() instead of projected.size()
Danny Al-Gaaf [Tue, 12 Feb 2013 16:25:00 +0000 (17:25 +0100)]
mds/CDentry.h: use projected.empty() instead of projected.size()

Use empty() since it is preferred: following the standard, it has
constant time complexity regardless of the container type. The same
is not guaranteed for size().

warning from cppchecker was:
[src/mds/CDentry.h:234]: (performance) Possible inefficient
  checking for 'projected' emptiness.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph_authtool.cc: use empty() instead of size()
Danny Al-Gaaf [Tue, 12 Feb 2013 16:20:19 +0000 (17:20 +0100)]
ceph_authtool.cc: use empty() instead of size()

Use empty() since it is preferred: following the standard, it has
constant time complexity regardless of the container type. The same
is not guaranteed for size().

warning from cppchecker was:
[src/ceph_authtool.cc:124]: (performance) Possible inefficient
  checking for 'caps' emptiness.
[src/ceph_authtool.cc:237]: (performance) Possible inefficient
  checking for 'caps' emptiness.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoOSD: add leveldblog compatibility flag for OSD
Samuel Just [Fri, 15 Feb 2013 19:07:40 +0000 (11:07 -0800)]
OSD: add leveldblog compatibility flag for OSD

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: verify log versions during read_log
Samuel Just [Fri, 15 Feb 2013 01:45:35 +0000 (17:45 -0800)]
PG: verify log versions during read_log

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: write_log if we read an old-format log
Samuel Just [Fri, 15 Feb 2013 01:29:46 +0000 (17:29 -0800)]
PG: write_log if we read an old-format log

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd: move pg log into leveldb
David Zafman [Fri, 8 Feb 2013 22:43:50 +0000 (14:43 -0800)]
osd: move pg log into leveldb

log from wip-pginfo
Fix bugs in PG::read_log() handling
Eliminate compiler warnings

Feature #4075: osd: move pg log into leveldb

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoqa: pull qa stuff from ceph.com ceph.git mirror
Sage Weil [Fri, 15 Feb 2013 17:20:14 +0000 (09:20 -0800)]
qa: pull qa stuff from ceph.com ceph.git mirror

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc: radosgw: document config without 100-continue and custom fastcgi
Sage Weil [Fri, 15 Feb 2013 17:02:12 +0000 (09:02 -0800)]
doc: radosgw: document config without 100-continue and custom fastcgi

Reported-by: carsonoid on github
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoconfig: Add small note about default number of PGs
Wido den Hollander [Sat, 9 Feb 2013 19:55:48 +0000 (20:55 +0100)]
config: Add small note about default number of PGs

It's still not clear to end users that this should go into the
mon or global section of ceph.conf.

Until this gets resolved, document it here as well for the people
who look up their settings in the source code.

Signed-off-by: Wido den Hollander <wido@42on.com>
12 years agoMerge remote-tracking branch 'origin/wip-hadoop-docs'
Noah Watkins [Fri, 15 Feb 2013 16:55:28 +0000 (08:55 -0800)]
Merge remote-tracking branch 'origin/wip-hadoop-docs'

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agoMerge pull request #59 from dalgaaf/wip-da-add-errorhandling
Sage Weil [Fri, 15 Feb 2013 16:54:57 +0000 (08:54 -0800)]
Merge pull request #59 from dalgaaf/wip-da-add-errorhandling

add error handling to test_sync_*.c

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Fri, 15 Feb 2013 16:48:37 +0000 (08:48 -0800)]
Merge remote-tracking branch 'gh/next'

12 years agotest_sync_io.c: add error handling 59/head
Danny Al-Gaaf [Tue, 12 Feb 2013 15:48:15 +0000 (16:48 +0100)]
test_sync_io.c: add error handling

Add error handling for open(), posix_memalign() and malloc().
Reuse code for read_* and write_* functions.
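
A rough sketch of what checked wrappers for open() and posix_memalign()
can look like, written in C++ against the POSIX calls named above. The
helper names open_checked and alloc_aligned are hypothetical and are not
taken from test_sync_io.c. Note that posix_memalign() returns the error
number directly rather than setting errno.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: open a file and report failures.
// Returns the file descriptor on success, or -errno on failure.
static int open_checked(const char* path, int flags, mode_t mode) {
  int fd = open(path, flags, mode);
  if (fd < 0) {
    int err = errno;  // save errno before any other library call
    fprintf(stderr, "open(%s) failed: %s\n", path, strerror(err));
    return -err;
  }
  return fd;
}

// Hypothetical helper: aligned allocation with error reporting.
// Returns NULL on failure; the caller releases the buffer with free().
static void* alloc_aligned(size_t alignment, size_t size) {
  void* buf = NULL;
  int r = posix_memalign(&buf, alignment, size);  // r is the error number
  if (r != 0) {
    fprintf(stderr, "posix_memalign failed: %s\n", strerror(r));
    return NULL;
  }
  return buf;
}
```

A caller would check the return value of each helper and bail out early,
instead of passing an invalid descriptor or a null buffer on to read() or
write().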

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agotest_short_dio_read.c: add error handling
Danny Al-Gaaf [Tue, 12 Feb 2013 16:01:49 +0000 (17:01 +0100)]
test_short_dio_read.c: add error handling

Add error handling for open() calls.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agodoc: update rados troubleshooting for slow requests
Sage Weil [Fri, 15 Feb 2013 01:33:22 +0000 (17:33 -0800)]
doc: update rados troubleshooting for slow requests

The example was out of date.  Adding a note about how to look at the request
queue on the OSD.

Reported-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd/OSDCap: add unit test for parsing pools/objects with _ and -
Sage Weil [Thu, 14 Feb 2013 19:37:57 +0000 (11:37 -0800)]
osd/OSDCap: add unit test for parsing pools/objects with _ and -

Hunting #4122, where a user saw

2013-02-13 19:39:25.467916 7f766fdb4700 10 osd.0 10  session 0x2c8cc60 client.libvirt has caps osdcap[grant(object_prefix rbd^@children  class-read),grant(pool libvirt^@pool^@test rwx)] 'allow class-read object_prefix rbd_children, allow pool libvirt-pool-test rwx'

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 2ce28ef1d7f95e71e1043912dfa269ea3b0d1599)

12 years agoosd/OSDCap: tweak unquoted_word parsing in osd caps
Sage Weil [Thu, 14 Feb 2013 23:39:43 +0000 (15:39 -0800)]
osd/OSDCap: tweak unquoted_word parsing in osd caps

Newer versions of spirit (1.49.0-3.1ubuntu1.1 in quantal, in particular)
dislike the construct with alnum and replace the - and _ with '\0' in the
resulting string.

Fixes: #4122
Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoOSD: always activate_map in advance_pgs, only send messages if up
Samuel Just [Thu, 14 Feb 2013 22:03:56 +0000 (14:03 -0800)]
OSD: always activate_map in advance_pgs, only send messages if up

We should always handle_activate_map() after handle_advance_map() in
order to kick the pg into a valid peering state for processing requests
prior to dropping the lock.

Additionally, we would prefer to avoid sending irrelevant messages
during boot, so only send if we are up according to the current service
osdmap.

Fixes: #4064
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoosd/OSDCap: add unit test for parsing pools/objects with _ and -
Sage Weil [Thu, 14 Feb 2013 19:37:57 +0000 (11:37 -0800)]
osd/OSDCap: add unit test for parsing pools/objects with _ and -

Hunting #4122, where a user saw

2013-02-13 19:39:25.467916 7f766fdb4700 10 osd.0 10  session 0x2c8cc60 client.libvirt has caps osdcap[grant(object_prefix rbd^@children  class-read),grant(pool libvirt^@pool^@test rwx)] 'allow class-read object_prefix rbd_children, allow pool libvirt-pool-test rwx'

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agorgw: user can specify 'rgw port' to listen on a tcp port.
Guilhem Lettron [Wed, 6 Feb 2013 21:38:52 +0000 (22:38 +0100)]
rgw: user can specify 'rgw port' to listen on a tcp port.

'rgw socket path' overrides 'rgw port'.
An 'rgw host' can be set to listen on a specific ip (default is 0.0.0.0)

Signed-off-by: Guilhem Lettron <guilhem.lettron@youscribe.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agotest/filestore/chain_xattr: remove testfile; disable LOGFILE
Sage Weil [Thu, 14 Feb 2013 20:36:12 +0000 (12:36 -0800)]
test/filestore/chain_xattr: remove testfile; disable LOGFILE

These make the gitbuilder checks unhappy.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'master' of github.com:ceph/ceph
Greg Farnum [Thu, 14 Feb 2013 19:39:13 +0000 (11:39 -0800)]
Merge branch 'master' of github.com:ceph/ceph