git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
12 years ago rgw: swift list containers can return 204
Yehuda Sadeh [Fri, 26 Apr 2013 04:58:02 +0000 (21:58 -0700)]
rgw: swift list containers can return 204

In order to keep compatibility with swift, if a plain formatter
is being used, we should return 204 when there are no containers.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
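
A minimal sketch of the status selection this commit describes, written in Python rather than the C++ of rgw. The function name and the `response_format` parameter are hypothetical, and returning 200 for every other case is an assumption:

```python
def list_containers_status(container_names, response_format="plain"):
    """Pick an HTTP status for a Swift-style container listing (illustration only)."""
    # Assumption: only the plain formatter needs the special case, because
    # structured formats can still emit an empty document.
    if response_format == "plain" and not container_names:
        return 204  # No Content
    return 200


print(list_containers_status([]))
print(list_containers_status(["photos"]))
```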
12 years ago rgw: fix plain formatter flush
Yehuda Sadeh [Fri, 26 Apr 2013 04:30:30 +0000 (21:30 -0700)]
rgw: fix plain formatter flush

The plain formatter flush needs to append eol if needed, and
not to clear the sections stack.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years ago rgw: fix bucket count when stating account
Yehuda Sadeh [Fri, 26 Apr 2013 04:28:55 +0000 (21:28 -0700)]
rgw: fix bucket count when stating account

We need to add up the num of buckets and not just set it
as we don't read the entire list of buckets in one operation.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
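
The difference between adding up and setting the count can be sketched as follows. This is a Python model, not the rgw code; `read_chunk`, `make_reader`, and the marker-based paging are assumptions made for the example:

```python
def count_buckets(read_chunk, chunk_size=2):
    """Add up bucket counts over a listing that arrives in chunks."""
    total = 0
    marker = None
    while True:
        chunk = read_chunk(marker, chunk_size)
        if not chunk:
            break
        # Accumulate. Writing `total = len(chunk)` here would keep only
        # the size of the last chunk that was read.
        total += len(chunk)
        marker = chunk[-1]
    return total


def make_reader(names):
    """Hypothetical backend that returns at most `limit` names after `marker`."""
    ordered = sorted(names)

    def read_chunk(marker, limit):
        start = 0 if marker is None else ordered.index(marker) + 1
        return ordered[start:start + limit]

    return read_chunk


print(count_buckets(make_reader(["a", "b", "c", "d", "e"])))
```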
12 years ago rgw: trivial cleanups post code review
Yehuda Sadeh [Fri, 26 Apr 2013 02:23:12 +0000 (19:23 -0700)]
rgw: trivial cleanups post code review

Following code review of #4760.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years ago debian/rules: use multiline search to look for Build-Depends
Dan Mick [Fri, 26 Apr 2013 07:04:13 +0000 (00:04 -0700)]
debian/rules: use multiline search to look for Build-Depends

When Build-Depends was split into multiple lines (in commit
8f5c665744e58d6d51a1e86de55c1399f51cc1c3), the grep for
libgoogle-perftools-dev broke.  Replace grep with perl for multiline
matching.

Fixes: #4818
Signed-off-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 89692e099f20424a5effcefcd33df154ebc5de39)
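
To illustrate why a line-oriented search stops working once a field is folded over several lines, here is a sketch in Python (the commit itself uses perl). The control-file text, the regular expressions, and the function name are all assumptions for the example, not the contents of debian/rules:

```python
import re

# Hypothetical control file excerpt with a folded Build-Depends field.
CONTROL = """Source: ceph
Build-Depends: debhelper (>= 6.0.7~),
 libgoogle-perftools-dev [i386 amd64],
 libtool
Standards-Version: 3.9.3
"""


def build_depends_mentions(control_text, package):
    """Return True if `package` is listed in a possibly folded Build-Depends field."""
    # A field continues on following lines that start with a space or tab.
    match = re.search(r"^Build-Depends:(.*(?:\n[ \t].*)*)", control_text, re.MULTILINE)
    if match is None:
        return False
    names = [entry.split()[0] for entry in match.group(1).split(",") if entry.strip()]
    return package in names


# A single-line pattern, similar in spirit to a plain grep, misses the package.
single_line = re.search(r"^Build-Depends:.*libgoogle-perftools-dev", CONTROL, re.MULTILINE)
print(single_line is None)
print(build_depends_mentions(CONTROL, "libgoogle-perftools-dev"))
```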

12 years ago client: re-fix cap releases
Sage Weil [Fri, 26 Apr 2013 17:12:37 +0000 (10:12 -0700)]
client: re-fix cap releases

Encode cap releases if NOT replay.  <facepalm>  Thanks, Greg!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago client: don't embed cap releases in clientreplay
Sam Lang [Thu, 25 Apr 2013 23:52:06 +0000 (18:52 -0500)]
client: don't embed cap releases in clientreplay

If the client is sending replay requests, avoid sending embedded caps,
since the mds already has the client's caps from the reconnect.
This matches the behavior of the kernel client.

Fixes #4742.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago PG: clear want_acting when we leave Primary
Samuel Just [Thu, 25 Apr 2013 21:08:57 +0000 (14:08 -0700)]
PG: clear want_acting when we leave Primary

This is somewhat annoying actually.  Intuitively we want to
clear_primary_state when we leave primary, but when we restart
peering due to a change in prior set status, we can't afford
to forget most of our peering state.  want_acting, on the
other hand, should never persist across peering attempts.
In fact, in the future, want_acting should be pulled into
the Primary state structure.

Fixes: #3904
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
12 years ago mon: get own entity_inst_t via messenger, not monmap
Sage Weil [Thu, 25 Apr 2013 22:18:42 +0000 (15:18 -0700)]
mon: get own entity_inst_t via messenger, not monmap

There are intervals during bootstrap(*) during which we are part of the
monmap, but our name (mon->name) does not match the monmap's.  This means
that calling monmap->get_inst(mon->name) is not a safe way to get our own
entity_inst_t.

Instead, use messenger->get_myinst().  This includes our addr (obviously)
and an up-to-date entity_name_t, too: in bootstrap we adjust the messenger
name at the same time as mon->rank, based on the contents of the monmap.

monmap->get_inst(mon->rank) would work too.

* During mkfs, the monmap may have noname-foo instead of the name if it was
  generated from the mon_host lines or dns or whatever by
  MonMap::build_initial().  This was the case for #4811.

Fixes: #4811
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago Merge pull request #239 from ceph/wip-4760
Sage Weil [Thu, 25 Apr 2013 20:11:59 +0000 (13:11 -0700)]
Merge pull request #239 from ceph/wip-4760

#4760

Second patch Reviewed-by: Sage Weil <sage@inktank.com>

12 years ago Merge pull request #246 from ceph/wip-4793
Sage Weil [Thu, 25 Apr 2013 18:52:30 +0000 (11:52 -0700)]
Merge pull request #246 from ceph/wip-4793

#4793

Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago radosgw: receiving unexpected error code while accessing a non-existing object by... 245/head
Li Wang [Thu, 25 Apr 2013 15:36:56 +0000 (23:36 +0800)]
radosgw: receiving unexpected error code while accessing a non-existing object by an authorized non-owner user

This patch fixes a bug in the radosgw swift compatibility code:
if an authorized user who is not the owner accesses a non-existing
object in a container, they will receive an unexpected error code.
To reproduce this bug, do the following steps:

1 User1 creates a container, and grants the read/write permission to user2

curl -X PUT -i -k -H "X-Auth-Token: $user1_token" $url/$container
curl -X POST -i -k -H "X-Auth-Token: $user1_token" -H "X-Container-Read: $user2" -H "X-Container-Write: $user2" $url/$container

2 User2 queries the object 'obj' in the newly created container
with a HEAD request; note that the container is currently empty

curl -X HEAD -i -k -H "X-Auth-Token: $user2_token" $url/$container/obj

3 The response received by user2 is '401 Authorization Required'
rather than the expected '404 Not Found'. The details are as follows:

HTTP/1.1 401 Authorization Required
Date: Tue, 16 Apr 2013 01:52:49 GMT
Server: Apache/2.2.22 (Ubuntu)
Accept-Ranges: bytes
Content-Length: 12
Vary: Accept-Encoding

Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12 years ago init-ceph: use remote config when starting daemons on remote nodes (-a)
Sage Weil [Thu, 25 Apr 2013 18:13:33 +0000 (11:13 -0700)]
init-ceph: use remote config when starting daemons on remote nodes (-a)

If you use -a to start a remote daemon, assume the remote config is present
instead of pushing the local config.  This makes more sense and simplifies
things.

Note that this means that -a in concert with -c foo means that foo must
also be present on the remote node in the same path.  That, however, is a
use case that I don't particularly care about right now.  :)

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years ago Merge branch 'wip-4748-b' into next
Sage Weil [Thu, 25 Apr 2013 17:21:11 +0000 (10:21 -0700)]
Merge branch 'wip-4748-b' into next

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago Merge branch 'wip-4778' into next
David Zafman [Thu, 25 Apr 2013 00:33:00 +0000 (17:33 -0700)]
Merge branch 'wip-4778' into next

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years ago scrub clears inconsistent flag set by deep scrub
David Zafman [Tue, 23 Apr 2013 00:06:52 +0000 (17:06 -0700)]
scrub clears inconsistent flag set by deep scrub

Add new num_deep_scrub_errors and num_shallow_scrub_errors to object_stat_sum_t
Show deep-scrub error count when outputting regular scrub errors
Set invalid size in case of a stat error which sets read_error
For now do deep-scrub after repair (see #4783)

fixes: #4778
Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years ago Merge pull request #242 from ceph/wip-objectcacher-enoent
Josh Durgin [Wed, 24 Apr 2013 23:20:59 +0000 (16:20 -0700)]
Merge pull request #242 from ceph/wip-objectcacher-enoent

Reviewed-by: Sage Weil <sage.weil@inktank.com>
12 years ago ObjectCacher: remove all buffers from a non-existent object 242/head
Josh Durgin [Wed, 24 Apr 2013 22:06:50 +0000 (15:06 -0700)]
ObjectCacher: remove all buffers from a non-existent object

Once we're sure an object doesn't exist, we retry all the waiters in
order, and they return -ENOENT immediately. If there were a bunch of
BufferHeads waiting for data (rx state), they would be left behind
while the reads that triggered them were complete from the cache
user's perspective. These extra rx BufferHeads would pin the object in
the lru, so they wouldn't be removed by release_set(). This meant that
the assert during shutdown of the cache would be triggered.

To fix this, remove any BufferHeads in this state immediately when we
find out the object doesn't exist. Use the same condition as readx for
determining whether this is safe - if we got -ENOENT and all
BufferHeads for the object are clean or rx.

Fixes: #3664
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
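
The safety condition in the second paragraph can be sketched as a small predicate. This is a Python model only; the state names as strings and the function name are assumptions, and the real check lives in the C++ ObjectCacher:

```python
import errno


def can_drop_buffers(read_result, buffer_states):
    """True if every buffer of a missing object may be removed right away."""
    if read_result != -errno.ENOENT:
        return False
    # Any other state (for example dirty data) means the cache still holds
    # something it must not discard, so only clean or rx buffers qualify.
    return all(state in ("clean", "rx") for state in buffer_states)


print(can_drop_buffers(-errno.ENOENT, ["rx", "clean", "rx"]))
print(can_drop_buffers(-errno.ENOENT, ["rx", "dirty"]))
```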
12 years ago mon: when electing, be sure acked leaders have new enough stores to lead 246/head
Greg Farnum [Wed, 24 Apr 2013 22:36:41 +0000 (15:36 -0700)]
mon: when electing, be sure acked leaders have new enough stores to lead

In general anybody participating in an election should be new enough to
lead thanks to the bootstrap process, but we've observed situations in
which a monitor is leader but gets so busy that it gets booted out
without noticing for a while, then processes the election messages
which were spawned, responds to them, and the other monitors kick those
up to a new election epoch. Then the old and behind monitor gets
elected as the new leader, which does bad things to our sync.

To deal with this, add the paxos first and last committed versions
to the MMonElection messages, and consider those values when deciding
whether to defer to a peer. Only defer to them if their newest value
is newer than our oldest, but also *do* defer to them if their oldest
value is newer than our newest even if we out-rank them otherwise.

Signed-off-by: Greg Farnum <greg@inktank.com>
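
One possible reading of the deferral rule above, as a Python sketch. The function and parameter names are hypothetical, the strict comparisons follow the wording of the message rather than the actual code, and "lower rank outranks higher rank" is stated here as an assumption:

```python
def should_defer(my_rank, my_first, my_last, peer_rank, peer_first, peer_last):
    """Decide whether to defer to a peer in an election (illustration only).

    *_first and *_last stand for the paxos first and last committed
    versions carried in the election messages.
    """
    # The peer's oldest value is newer than our newest: defer even if
    # we out-rank the peer.
    if peer_first > my_last:
        return True
    # Otherwise defer only to a peer that out-ranks us and whose newest
    # value is newer than our oldest.
    peer_outranks_us = peer_rank < my_rank  # assumption: lower rank wins
    peer_new_enough = peer_last > my_first
    return peer_outranks_us and peer_new_enough


# A higher-priority peer with a stale store is not deferred to.
print(should_defer(my_rank=1, my_first=100, my_last=200,
                   peer_rank=0, peer_first=10, peer_last=50))
```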
12 years ago mon: be more careful about making sure we're up-to-date on sync check
Greg Farnum [Wed, 24 Apr 2013 22:27:23 +0000 (15:27 -0700)]
mon: be more careful about making sure we're up-to-date on sync check

We were looking at our own paxos_max_join_drift and using that to
calculate whether we were new enough to join without syncing, but
if those numbers don't match across monitors they might have trimmed. Use
the number they provide us as their first version and compare to that
as well.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years ago rgw: fix i386 compile error 239/head
Sage Weil [Wed, 24 Apr 2013 22:07:28 +0000 (15:07 -0700)]
rgw: fix i386 compile error

error: rgw/rgw_op.cc:665:63: no matching function for call to ‘min(uint64_t, size_t&)’

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago FileStore::_split_collection: src or dest may be removed on replay
Samuel Just [Wed, 24 Apr 2013 21:23:45 +0000 (14:23 -0700)]
FileStore::_split_collection: src or dest may be removed on replay

If the collection is subsequently removed, the _split_collection
might get replayed and find either src or dest removed.

Fixes: #4806
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago librados: fix calc_snap_set_diff interval calculation
Sage Weil [Wed, 24 Apr 2013 20:48:40 +0000 (13:48 -0700)]
librados: fix calc_snap_set_diff interval calculation

When calculating the [a,b] interval over which a given clone is valid, do
not assume that b == the clone id; that is *not* true when the original
end snap has been deleted/trimmed.

While we are here, make the code a bit cleaner to read.

Fixes: #4785
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years ago Merge remote-tracking branch 'upstream/wip_2476' into next
Samuel Just [Wed, 24 Apr 2013 21:03:38 +0000 (14:03 -0700)]
Merge remote-tracking branch 'upstream/wip_2476' into next

Fixes: #2476
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago PG: call check_recovery_sources in remove_down_peer_info
Samuel Just [Wed, 24 Apr 2013 19:20:17 +0000 (12:20 -0700)]
PG: call check_recovery_sources in remove_down_peer_info

If we transition out of peering due to affected
prior set, we won't trigger start_peering_interval
and check_recovery_sources won't get called.  This
will leave an entry in missing_loc_sources without
a matching missing set.  We always want to
check_recovery_sources with remove_down_peer_info.

Fixes: #4805
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago mon: send clients away while synchronizing
Sage Weil [Wed, 24 Apr 2013 19:26:37 +0000 (12:26 -0700)]
mon: send clients away while synchronizing

When we are out of quorum, we waitlist client messages or (eventually)
send them elsewhere.  If we are synchronizing, do the same.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mkcephfs: give mon. key 'allow *' mon caps
Sage Weil [Wed, 24 Apr 2013 17:13:40 +0000 (10:13 -0700)]
mkcephfs: give mon. key 'allow *' mon caps

This will ease the transition from mkcephfs to ceph-deploy by allowing
ceph-create-keys to use the mon. keyring file in $mon_data and get the
caps it needs.

Fixes: #4756
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago PendingReleaseNotes: note about rbd resize --allow-shrink
Josh Durgin [Wed, 24 Apr 2013 17:16:03 +0000 (10:16 -0700)]
PendingReleaseNotes: note about rbd resize --allow-shrink

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
12 years ago rgw: list container only shows stats if needed
Yehuda Sadeh [Tue, 23 Apr 2013 19:31:31 +0000 (12:31 -0700)]
rgw: list container only shows stats if needed

Fixes: #4759
Add a new request param 'stats' for the swift list containers
request. If set to 'false' it disables stats retrieval, which
makes it go faster. Also, don't dump stats if format is plain,
as they're not going to be dumped.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years ago rbd: fix cli-integration tests for striping change
Sage Weil [Wed, 24 Apr 2013 15:35:15 +0000 (08:35 -0700)]
rbd: fix cli-integration tests for striping change

We don't set the striping feature when we are using backward-compatible
(default) striping now; fix the test accordingly.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago 95-ceph-osd-alt.rules: Fix missing parent parameter
Gary Lowell [Wed, 24 Apr 2013 15:22:04 +0000 (08:22 -0700)]
95-ceph-osd-alt.rules:  Fix missing parent parameter

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years ago ReplicatedPG: timeout watches based on last_became_active
Samuel Just [Mon, 1 Apr 2013 22:44:32 +0000 (15:44 -0700)]
ReplicatedPG: timeout watches based on last_became_active

This way a notify on an object with a single defunct watcher
won't necessarily have to wait the full timeout if the pg
has been active for a while.

Signed-off-by: Samuel Just <sam.just@inktank.com>
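
One way to read this change is that time the PG has already spent active counts against the watch timeout. The following Python sketch models that reading only; the function name, the parameters, and the exact arithmetic are assumptions and are not taken from ReplicatedPG:

```python
def remaining_watch_timeout(now, last_became_active, timeout):
    """Seconds still to wait for a defunct watcher (hypothetical model)."""
    active_for = now - last_became_active
    # Never wait a negative amount; once the PG has been active for at
    # least `timeout` seconds there is nothing left to wait for.
    return max(0, timeout - active_for)


print(remaining_watch_timeout(now=1000, last_became_active=990, timeout=30))
print(remaining_watch_timeout(now=1000, last_became_active=900, timeout=30))
```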
12 years ago osd_types: add last_became_active to pg_stats
Samuel Just [Mon, 1 Apr 2013 22:42:34 +0000 (15:42 -0700)]
osd_types: add last_became_active to pg_stats

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years ago Merge branch 'wip_4552' into next
Samuel Just [Wed, 24 Apr 2013 03:49:57 +0000 (20:49 -0700)]
Merge branch 'wip_4552' into next

Fixes: #4552
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago OSD: don't report peers down if hbclient_messenger is backed up
Samuel Just [Mon, 22 Apr 2013 21:50:09 +0000 (14:50 -0700)]
OSD: don't report peers down if hbclient_messenger is backed up

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years ago Messenger: add interface to get oldest queued message arrival time
Samuel Just [Mon, 22 Apr 2013 21:06:22 +0000 (14:06 -0700)]
Messenger: add interface to get oldest queued message arrival time

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years ago DispatchQueue: track queued message arrival times and expose oldest
Samuel Just [Mon, 22 Apr 2013 21:06:05 +0000 (14:06 -0700)]
DispatchQueue: track queued message arrival times and expose oldest

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years ago Merge pull request #237 from ceph/wip-4794
Sage Weil [Wed, 24 Apr 2013 00:23:32 +0000 (17:23 -0700)]
Merge pull request #237 from ceph/wip-4794

init-ceph: fix (and simplify) pushing ceph.conf to remote unique name

12 years ago Merge pull request #241 from ceph/wip-4798
Sage Weil [Wed, 24 Apr 2013 00:17:02 +0000 (17:17 -0700)]
Merge pull request #241 from ceph/wip-4798

#4798

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago mon: revert part of PaxosService::is_readable() change
Sage Weil [Wed, 24 Apr 2013 00:16:31 +0000 (17:16 -0700)]
mon: revert part of PaxosService::is_readable() change

In 98e23980f4ab7ba289303f72da06721c84767293 is_readable() was changed to
call is_active(), but that has a check for is_bootstrapping(), so there is
a semantic change.

As a result, we may fail PaxosService::is_readable() (due to bootstrapping)
and then try to call Paxos::wait_for_readable().  That will assert that
Paxos::is_readable() is false, but it will be true and we will crash.

Revert that part of the change, since the semantic change was not
intentional.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago librbd: fix i386 build
Sage Weil [Tue, 23 Apr 2013 23:18:53 +0000 (16:18 -0700)]
librbd: fix i386 build

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago Merge pull request #240 from ceph/wip-4665
Josh Durgin [Tue, 23 Apr 2013 23:11:44 +0000 (16:11 -0700)]
Merge pull request #240 from ceph/wip-4665

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years ago librbd: add read_iterate2 call with fixed argument type 240/head
Sage Weil [Tue, 23 Apr 2013 21:58:55 +0000 (14:58 -0700)]
librbd: add read_iterate2 call with fixed argument type

The existing read_iterate takes a size_t for the length, which is only 4GB
on 32-bit machines.  Instead, take a uint64_t length for the new
read_iterate2().

Return 0 instead of the number of bytes read; this makes the user-facing
API a bit simpler.

Fixes: #4665
Signed-off-by: Sage Weil <sage@inktank.com>
keep bytes return from internal method
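
The 4GB limit follows from the width of the type: a 32-bit unsigned size_t holds values up to 2^32 - 1, and larger lengths wrap around. A small Python model of that arithmetic (the helper name is hypothetical; Python integers do not overflow, so the wrap is applied explicitly):

```python
GIB = 1024 ** 3


def as_32bit_size_t(length):
    """Model passing `length` through a 32-bit unsigned size_t."""
    return length % (1 << 32)


# A 5 GiB length no longer fits and silently becomes a much smaller value.
wrapped = as_32bit_size_t(5 * GIB)
print(wrapped // GIB)
```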

12 years ago librbd: implement read not in terms of read_iterate
Sage Weil [Tue, 23 Apr 2013 22:44:42 +0000 (15:44 -0700)]
librbd: implement read not in terms of read_iterate

The read() method returns the bytes read, trimmed to the end of the image;
use the other read() variant to do this (which uses aio_read()) instead of
read_iterate().

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: drop forwarded requests after an election 241/head
Sage Weil [Tue, 23 Apr 2013 21:06:41 +0000 (14:06 -0700)]
mon: drop forwarded requests after an election

On each election, we resend routed requests to the new leader (or
requeue for ourselves).  Therefore, if we receive a forwarded request,
we should drop it on the floor if there is a new election.  Add a field
in the PaxosServiceMessage struct to track which election epoch we
received the request in, and drop it in PaxosService::dispatch() if
that is in the past.

Signed-off-by: Sage Weil <sage@inktank.com>
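
A minimal sketch of the epoch check described above, in Python. The class, field, and function names are hypothetical stand-ins for the PaxosServiceMessage field and PaxosService::dispatch(); only the comparison itself comes from the commit message:

```python
from dataclasses import dataclass


@dataclass
class ForwardedRequest:
    payload: str
    rx_election_epoch: int  # election epoch in which the request was received


def dispatch(request, current_epoch, handle):
    """Handle the request unless it was received in a past election epoch."""
    if request.rx_election_epoch < current_epoch:
        # Dropped: the forwarding monitor resends routed requests to the
        # new leader after each election.
        return False
    handle(request.payload)
    return True


handled = []
dispatch(ForwardedRequest("osd pool create", rx_election_epoch=4), 6, handled.append)
dispatch(ForwardedRequest("osd pool delete", rx_election_epoch=6), 6, handled.append)
print(handled)
```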
12 years ago mon: requeue routed_requests for self if elected leader
Sage Weil [Tue, 23 Apr 2013 20:45:59 +0000 (13:45 -0700)]
mon: requeue routed_requests for self if elected leader

If we have requests that we have forwarded, and are elected leader,
requeue those requests for ourselves, queue them normally, and clear out
the routed_requests map.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: track original Connection* for forwarded requests
Sage Weil [Tue, 23 Apr 2013 20:40:27 +0000 (13:40 -0700)]
mon: track original Connection* for forwarded requests

Keep a reference to the source Connection* for forwarded requests.  This
makes the reply path slightly cleaner, and will help us later.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago Merge pull request #222 from ceph/wip-3495
Gregory Farnum [Tue, 23 Apr 2013 19:44:05 +0000 (12:44 -0700)]
Merge pull request #222 from ceph/wip-3495

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago test_filejournal: adjust corrupt entry tests to force header write
Samuel Just [Tue, 23 Apr 2013 19:08:14 +0000 (12:08 -0700)]
test_filejournal: adjust corrupt entry tests to force header write

The journal no longer assumes corruption if it finds a valid entry
after an invalid entry.  Instead, these tests will exercise the
corruption detection via the header committed_up_to member.

Fixes: #4792
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago rgw: stream list buckets (containers) request
Yehuda Sadeh [Fri, 19 Apr 2013 21:03:27 +0000 (14:03 -0700)]
rgw: stream list buckets (containers) request

Fixes: #4760
Instead of retrieving the entire list of buckets in one
chunk, stream it. This makes it so that if the request
takes too long, the client isn't going to time out before
getting any data.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years ago init-ceph: fix (and simplify) pushing ceph.conf to remote unique name 237/head
Sage Weil [Tue, 23 Apr 2013 17:00:38 +0000 (10:00 -0700)]
init-ceph: fix (and simplify) pushing ceph.conf to remote unique name

The old code would only do the push once per remote node (due to the
list in $pushed_to) but would reset $unique on each attempt.  This would
break if a remote host was processed twice.

Fix by just skipping the $pushed_to optimization entirely.

Fixes: #4794
Reported-by: Andreas Friedrich <andreas.friedrich@ts.fujitsu.com>
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago ceph-disk: OSD hotplug fixes for Centos
Gary Lowell [Thu, 11 Apr 2013 16:42:13 +0000 (09:42 -0700)]
ceph-disk:  OSD hotplug fixes for Centos

Two fixes for Centos 6.3 and other systems with udev versions
prior to 172.  The disk persistent name using the GPT UUID does
not exist, so use the by_path persistent name instead for the
journal symlink.

The gpt label fields are not available for use in udev rules. Add
ceph-disk-udev wrapper script that extracts the partition
type guid from the label and calls ceph-disk-activate if it is
a ceph guid type. (Bug #4632)

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years ago doc: Usage requires --num_osds.
John Wilkins [Tue, 23 Apr 2013 04:03:15 +0000 (21:03 -0700)]
doc: Usage requires --num_osds.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago doc: Added some detail. Calculating PGs, maps; reorganized a bit.
John Wilkins [Tue, 23 Apr 2013 04:02:45 +0000 (21:02 -0700)]
doc: Added some detail. Calculating PGs, maps; reorganized a bit.

fixes: #2968

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago mon: [MDS]Monitor: remove 'stop_cluster' and 'do_stop()' 222/head
Joao Eduardo Luis [Mon, 22 Apr 2013 22:25:27 +0000 (23:25 +0100)]
mon: [MDS]Monitor: remove 'stop_cluster' and 'do_stop()'

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago mon: MDSMonitor: tighter leash on cross-proposals to the osdmon
Joao Eduardo Luis [Mon, 22 Apr 2013 22:23:16 +0000 (23:23 +0100)]
mon: MDSMonitor: tighter leash on cross-proposals to the osdmon

Fixes: #3495
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago Merge pull request #234 from ceph/wip-4758
Gregory Farnum [Mon, 22 Apr 2013 22:22:04 +0000 (15:22 -0700)]
Merge pull request #234 from ceph/wip-4758

Fixes #4758.

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago mon: PaxosService: add request_proposal() to perform cross-proposals
Joao Eduardo Luis [Tue, 16 Apr 2013 15:41:57 +0000 (16:41 +0100)]
mon: PaxosService: add request_proposal() to perform cross-proposals

Instead of allowing services to directly use 'propose_pending()' on
other services, we instead add two new functions:

  - request_proposal() to request 'this' service to propose its
    pending value; and
  - request_proposal(PaxosService *other) so that 'this' service
    can request a proposal to 'other'

These functions should allow us to enforce a greater set of
constraints at the time of a cross-proposal, either by making sure a
service will (e.g.) hold off its own proposals until said proposal
is performed, or even that the other service will enforce a tighter
set of constraints that wouldn't otherwise be enforced by using
'propose_pending()' directly.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago mon: PaxosService: is_writeable() depends on being ready to be written to
Joao Eduardo Luis [Tue, 16 Apr 2013 15:41:18 +0000 (16:41 +0100)]
mon: PaxosService: is_writeable() depends on being ready to be written to

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago mon: PaxosService: is_readable/writeable() depending on is_active()
Joao Eduardo Luis [Fri, 19 Apr 2013 11:56:51 +0000 (12:56 +0100)]
mon: PaxosService: is_readable/writeable() depending on is_active()

Instead of depending on individual conditions.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago mon: PaxosService: consider is_recovering() on is_writeable()
Joao Eduardo Luis [Mon, 15 Apr 2013 12:38:04 +0000 (13:38 +0100)]
mon: PaxosService: consider is_recovering() on is_writeable()

A service is never writeable while it's recovering.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago mon: commit LogSummary on every message 234/head
Sage Weil [Mon, 22 Apr 2013 22:01:09 +0000 (15:01 -0700)]
mon: commit LogSummary on every message

This moves our version pointer up so that we don't re-log (by re-consuming)
log messages to /var/log/ceph/ceph.log on ceph-mon restart.  OTOH, it means
we rewrite the summary of the last 50 messages, but we consider that to be
relatively cheap (and something we *always* did in bobtail and
earlier anyway).

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: set threshold to periodically stash_full
Sage Weil [Mon, 22 Apr 2013 21:58:09 +0000 (14:58 -0700)]
mon: set threshold to periodically stash_full

Set an interval to periodically write a full copy of the map that is lower
than the trim point (which is generally a very large number of commits).

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago Merge pull request #230 from ceph/wip-mon-paxos-fixes
Sage Weil [Mon, 22 Apr 2013 22:11:46 +0000 (15:11 -0700)]
Merge pull request #230 from ceph/wip-mon-paxos-fixes

Wip mon paxos fixes

Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago Merge pull request #225 from ceph/wip-4543
Gregory Farnum [Mon, 22 Apr 2013 22:05:14 +0000 (15:05 -0700)]
Merge pull request #225 from ceph/wip-4543

Fixes #4543

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago ceph-mon: Attempt to obtain monmap from several possible sources 225/head
Joao Eduardo Luis [Fri, 19 Apr 2013 16:28:37 +0000 (17:28 +0100)]
ceph-mon: Attempt to obtain monmap from several possible sources

In order of interest/priority:

  - our latest monmap version
  - a backup monmap version created during sync start, if the store
    appears to be in a post-aborted sync state
  - a mkfs monmap version

If none of these are found, we should go ahead and try to build a
monmap from ceph.conf to join an existing cluster.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
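
The priority order listed in this commit can be sketched as a simple fallback chain. This is a Python illustration; the function, its parameters, and the use of None for "not found" are assumptions, not the ceph-mon implementation:

```python
def obtain_monmap(latest, backup, mkfs, post_aborted_sync, build_from_conf):
    """Return the first available monmap, in the order given in the commit."""
    if latest is not None:
        return latest
    # The backup is only considered when the store looks like it is in a
    # post-aborted sync state.
    if post_aborted_sync and backup is not None:
        return backup
    if mkfs is not None:
        return mkfs
    # Nothing found in the store: build one from ceph.conf to join an
    # existing cluster.
    return build_from_conf()


print(obtain_monmap(None, "backup-map", "mkfs-map", True, lambda: "conf-map"))
print(obtain_monmap(None, "backup-map", None, False, lambda: "conf-map"))
```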
12 years ago mon: Monitor: backup monmap prior to starting a store sync
Joao Eduardo Luis [Fri, 19 Apr 2013 16:28:06 +0000 (17:28 +0100)]
mon: Monitor: backup monmap prior to starting a store sync

If by fate we end up attempting a store sync after failing at
least one before, we might not have a monmap to read from the
store to backup.  Therefore, in that case, we shall backup the
current monmap being used by the monitor.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago rgw: don't send tail to gc if copying object to itself
Yehuda Sadeh [Mon, 22 Apr 2013 19:48:56 +0000 (12:48 -0700)]
rgw: don't send tail to gc if copying object to itself

Fixes: #4776
Backport: bobtail
Need to make sure that when copying an object into itself we don't
send the tail to the garbage collection.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years ago Merge pull request #232 from ceph/wip-4710
Josh Durgin [Mon, 22 Apr 2013 20:36:38 +0000 (13:36 -0700)]
Merge pull request #232 from ceph/wip-4710

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years ago Merge pull request #233 from ceph/wip-mon-idempotent
Sage Weil [Mon, 22 Apr 2013 19:58:08 +0000 (12:58 -0700)]
Merge pull request #233 from ceph/wip-mon-idempotent

Wip mon idempotent

Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years ago mon: make 'osd pool rmsnap ...' idempotent 233/head
Sage Weil [Mon, 22 Apr 2013 19:50:09 +0000 (12:50 -0700)]
mon: make 'osd pool rmsnap ...' idempotent

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: make 'osd pool mksnap ...' idempotent
Sage Weil [Mon, 22 Apr 2013 19:49:58 +0000 (12:49 -0700)]
mon: make 'osd pool mksnap ...' idempotent

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: make 'osd blacklist rm ...' idempotent
Sage Weil [Mon, 22 Apr 2013 19:48:49 +0000 (12:48 -0700)]
mon: make 'osd blacklist rm ...' idempotent

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago rbd: only set STRIPINGV2 feature when needed 232/head
Sage Weil [Mon, 22 Apr 2013 19:41:49 +0000 (12:41 -0700)]
rbd: only set STRIPINGV2 feature when needed

Only set the STRIPINGV2 feature if the striping parameters are non-default.
Specifically, fix the case where the passed-in size and count are == 0.

Fixes: #4710
Signed-off-by: Sage Weil <sage@inktank.com>
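
A sketch of the decision in Python. It assumes that "default" striping means a stripe unit equal to the object size with a stripe count of 1, and that passing 0 for both means "use the defaults"; the function name and parameters are hypothetical:

```python
MIB = 1024 * 1024


def needs_stripingv2(stripe_unit, stripe_count, object_size):
    """True only when the striping parameters are non-default (illustration)."""
    # Size and count both 0: the caller did not ask for custom striping.
    if stripe_unit == 0 and stripe_count == 0:
        return False
    # Assumption: default striping is one object-sized stripe unit.
    return stripe_unit != object_size or stripe_count != 1


print(needs_stripingv2(0, 0, 4 * MIB))
print(needs_stripingv2(64 * 1024, 16, 4 * MIB))
```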
12 years ago rbd: fix feature display for --info
Sage Weil [Mon, 22 Apr 2013 19:38:11 +0000 (12:38 -0700)]
rbd: fix feature display for --info

Only include the feature if it is set!

Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago rbd: avoid clobbering return value with udevadm settle
Sage Weil [Mon, 22 Apr 2013 18:41:02 +0000 (11:41 -0700)]
rbd: avoid clobbering return value with udevadm settle

Fixes: #4707
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago FileJournal: a valid entry after invalid entry =/=> corrupt
Samuel Just [Mon, 22 Apr 2013 18:27:50 +0000 (11:27 -0700)]
FileJournal: a valid entry after invalid entry =/=> corrupt

Out of order journal entry writes using aio may cause entry
n+2 to be written prior to n.  This does not indicate
corruption.

Fixes: #4736
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years ago radosgw: Fix duplicate 'Content-Type' when using 'response-content-type'
Sylvain Munaut [Thu, 14 Feb 2013 13:48:16 +0000 (14:48 +0100)]
radosgw: Fix duplicate 'Content-Type' when using 'response-content-type'

Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12 years ago mon: MonmapMonitor: add function to obtain latest monmap
Joao Eduardo Luis [Mon, 22 Apr 2013 15:20:37 +0000 (16:20 +0100)]
mon: MonmapMonitor: add function to obtain latest monmap

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago mon: PaxosService: add 'exists_key/version' helper functions
Joao Eduardo Luis [Mon, 22 Apr 2013 15:13:33 +0000 (16:13 +0100)]
mon: PaxosService: add 'exists_key/version' helper functions

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago ceph-create-keys: Don't wait if permission denied
Gary Lowell [Fri, 19 Apr 2013 18:19:05 +0000 (11:19 -0700)]
ceph-create-keys:  Don't wait if permission denied

If get or create keys returns permission denied, exit
gracefully instead of retrying.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
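
The retry behaviour can be sketched like this. The helper name, the injected `attempt` and `sleep` callables, and the use of errno.EACCES as the "permission denied" return code are assumptions for the example, not the actual ceph-create-keys code:

```python
import errno
import time


def get_key_with_retry(attempt, max_tries=5, sleep=time.sleep):
    """Retry `attempt` until it succeeds, but stop at once on permission denied."""
    for _ in range(max_tries):
        returncode = attempt()
        if returncode == 0:
            return "ok"
        if returncode == errno.EACCES:
            # Waiting will not grant the missing permission, so exit
            # gracefully instead of retrying.
            return "permission denied"
        sleep(1)
    return "gave up"


# Simulated return codes: one transient failure, then permission denied.
codes = iter([1, errno.EACCES, 0])
print(get_key_with_retry(lambda: next(codes), sleep=lambda seconds: None))
```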
12 years ago doc: Aesthetic improvements. Removed unnecessary graphic and overrode margin for...
John Wilkins [Sat, 20 Apr 2013 18:10:51 +0000 (11:10 -0700)]
doc: Aesthetic improvements. Removed unnecessary graphic and overrode margin for h3 tag.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago doc: Added a scenario to PG troubleshooting.
John Wilkins [Sat, 20 Apr 2013 18:08:08 +0000 (11:08 -0700)]
doc: Added a scenario to PG troubleshooting.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Changed usage to "bucket-name". Description was okay.
John Wilkins [Sat, 20 Apr 2013 18:06:44 +0000 (11:06 -0700)]
doc: Changed usage to "bucket-name". Description was okay.

fixes: #4102

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge branch 'wip-4201' into next
David Zafman [Sat, 20 Apr 2013 01:14:28 +0000 (18:14 -0700)]
Merge branch 'wip-4201' into next

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agotools/ceph-filestore-dump: Implement remove, export and import 223/head
David Zafman [Thu, 21 Mar 2013 05:08:08 +0000 (22:08 -0700)]
tools/ceph-filestore-dump: Implement remove, export and import

Change local names to be clearer
Break real_log() into common function get_log()
Move infos_oid, biginfo_oid and log_oid to globals for general use

Feature: #4201 (osd: data loss: pg export/import/remove)

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoMerge branch 'wip_4662_clean' into next
Samuel Just [Sat, 20 Apr 2013 00:11:29 +0000 (17:11 -0700)]
Merge branch 'wip_4662_clean' into next

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoReplicatedPG::_finish_mark_all_unfound_lost: only requeue if !deleting
Samuel Just [Fri, 19 Apr 2013 17:54:11 +0000 (10:54 -0700)]
ReplicatedPG::_finish_mark_all_unfound_lost: only requeue if !deleting

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG::_applied_recovered_object*: don't queue scrub if deleting
Samuel Just [Fri, 19 Apr 2013 17:52:30 +0000 (10:52 -0700)]
ReplicatedPG::_applied_recovered_object*: don't queue scrub if deleting

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: check for pg change in ~FlushState
Samuel Just [Fri, 19 Apr 2013 17:51:08 +0000 (10:51 -0700)]
PG: check for pg change in ~FlushState

Fixes: #4662
Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: bail if deleting in _finish_recovery
Samuel Just [Fri, 19 Apr 2013 17:50:43 +0000 (10:50 -0700)]
PG: bail if deleting in _finish_recovery

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoAsyncReserver: delete context in cancel_reservation
Samuel Just [Fri, 19 Apr 2013 02:38:01 +0000 (19:38 -0700)]
AsyncReserver: delete context in cancel_reservation

Fixes: #4662
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agotools/ceph-filestore-dump: Error messages lost because stderr is closed
David Zafman [Wed, 20 Mar 2013 06:12:35 +0000 (23:12 -0700)]
tools/ceph-filestore-dump: Error messages lost because stderr is closed

Use cout instead of cerr for command errors
Use cerr in debug mode, where stderr is available
Output map_epoch in debug mode
Fix a message and print it only in debug mode

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoosd: Make clear_temp() public for use by remove
David Zafman [Thu, 18 Apr 2013 18:14:46 +0000 (11:14 -0700)]
osd: Make clear_temp() public for use by remove

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoosd: Add flag to force version write in _write_info()
David Zafman [Tue, 16 Apr 2013 06:40:13 +0000 (23:40 -0700)]
osd: Add flag to force version write in _write_info()

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoosd: Create static PG::_write_log() function
David Zafman [Sat, 6 Apr 2013 04:39:34 +0000 (21:39 -0700)]
osd: Create static PG::_write_log() function

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoOSDMonitor: pg split is no longer experimental
Samuel Just [Fri, 19 Apr 2013 20:21:01 +0000 (13:21 -0700)]
OSDMonitor: pg split is no longer experimental

Fixes: #4711
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoMerge pull request #228 from alram/next
Sage Weil [Fri, 19 Apr 2013 22:16:41 +0000 (15:16 -0700)]
Merge pull request #228 from alram/next

Fix journal partition creation

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoFix journal partition creation 228/head
Alexandre Marangone [Fri, 19 Apr 2013 22:09:28 +0000 (15:09 -0700)]
Fix journal partition creation

With an OSD sharing data and journal, the previous code created the
journal partition from the end of the device. sgdisk uses a uint32_t
to get the last sector; with a large HD, uint32_t is too small.
As a result, the journal partition is created backwards from
a sector in the middle of the disk, leaving space before
and after it. The data partition will use whichever of
these spaces is greater. The remaining space will not be used.

This patch creates the journal partition from the start as a workaround.
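
The arithmetic behind the overflow can be sketched as follows. The 512-byte sector size and the 4 TB disk are assumptions chosen for illustration, and the masking step only models ordinary unsigned 32-bit wrap-around; it is not taken from sgdisk itself.

```python
# Illustrative arithmetic only; sector size and disk size are assumptions.
SECTOR_SIZE = 512          # bytes per sector (assumed)
UINT32_MAX = 2**32 - 1     # largest value a uint32_t can hold


def last_sector(disk_bytes, sector_size=SECTOR_SIZE):
    """Index of the last sector on a disk of the given size."""
    return disk_bytes // sector_size - 1


def truncate_to_uint32(value):
    """Model storing a value in a uint32_t: keep only the low 32 bits."""
    return value & UINT32_MAX


disk_bytes = 4 * 10**12                        # hypothetical 4 TB disk
true_last = last_sector(disk_bytes)            # does not fit in 32 bits
wrapped_last = truncate_to_uint32(true_last)   # lands mid-disk

# Largest disk whose sectors can all be indexed by a uint32_t.
max_addressable_bytes = (UINT32_MAX + 1) * SECTOR_SIZE  # 2 TiB

print(true_last, wrapped_last, max_addressable_bytes)
```

Under these assumptions the wrapped value points at a sector a little under halfway through the disk, which matches the symptom of unused space on both sides of the journal partition.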

Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
12 years agorbd: fix qa tests to use --allow-shrink
Sage Weil [Fri, 19 Apr 2013 21:08:51 +0000 (14:08 -0700)]
rbd: fix qa tests to use --allow-shrink

Fixes: #4763
Signed-off-by: Sage Weil <sage@inktank.com>