git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
12 years agomon: fix crush_ops.sh tests
Sage Weil [Wed, 27 Mar 2013 05:47:11 +0000 (22:47 -0700)]
mon: fix crush_ops.sh tests

Make it work.  Also, make note that these aren't handled idempotently by
the mon currently.  Doh!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Wed, 27 Mar 2013 01:19:27 +0000 (18:19 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agoReplicatedPG: send entire stats on OP_BACKFILL_FINISH
Samuel Just [Tue, 26 Mar 2013 22:10:37 +0000 (15:10 -0700)]
ReplicatedPG: send entire stats on OP_BACKFILL_FINISH

Otherwise, we update the stat.stat structure, but not the
stat.invalid_stats part.  This will result in a recently
split primary propagating the invalid stats but not the
invalid marker.  Sending the whole pg_stat_t structure
also mirrors MOSDSubOp.

Fixes: #4557
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Wed, 27 Mar 2013 00:05:48 +0000 (17:05 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agotesting: fix hadoop-internal-test
Joe Buck [Tue, 26 Mar 2013 21:17:14 +0000 (14:17 -0700)]
testing: fix hadoop-internal-test

Remove now superfluous directory changes
that are causing tests to fail.
This code should have been removed when we transitioned
from running tests with Ant to using Java to run the tests.

Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
12 years agoMerge pull request #149 from ceph/wip-4530
Sage Weil [Tue, 26 Mar 2013 22:13:24 +0000 (15:13 -0700)]
Merge pull request #149 from ceph/wip-4530

#4530

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #139 from ceph/wip-topo-java
Joe Buck [Tue, 26 Mar 2013 22:09:19 +0000 (15:09 -0700)]
Merge pull request #139 from ceph/wip-topo-java

Merging in Noah's branch for adding topology calls. This passes existing libcephfs, libcephfs-java and hadoop tests.

12 years agoclient: Don't signal requests already handled 149/head
Sam Lang [Mon, 25 Mar 2013 19:55:20 +0000 (14:55 -0500)]
client: Don't signal requests already handled

The assertion failure reported in #4530 is triggered
by the following:

1. client sends request
2. mds sends unsafe reply
3. before request gets journaled, mds is killed
4. mds restarts
5. client receives session close (from close request before restart)
6. session close does kick_requests()
7. kick_requests tries to signal caller that doesn't exist.

This fix avoids signaling a caller if the unsafe reply
has been received and the make_request() function has completed.
We do this by setting the caller_cond to null once the caller
is woken up, and only signal the caller in kick_requests if
caller_cond is non-null.  This avoids trying to resend requests that are
listed in mds_requests but have already received unsafe replies.
The unsafe requests are handled by resend_unsafe_requests() code,
so skipping those requests is allowable.

Fixes #4530.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
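
The guard described in this commit message can be sketched roughly as
follows. This is a simplified Python model, not the C++ client code:
MetaRequest, caller_cond and kick_requests echo names from the message,
while wake_caller and the use of threading.Condition are assumptions made
purely for illustration.

```python
import threading


class MetaRequest:
    """Toy stand-in for the client's request object (hypothetical fields)."""

    def __init__(self, tid):
        self.tid = tid
        # Condition the blocked caller waits on; None once the caller was woken.
        self.caller_cond = threading.Condition()


def wake_caller(request):
    """Wake the caller once, then clear the condition so it is never signaled again."""
    cond = request.caller_cond
    if cond is None:
        return False
    with cond:
        cond.notify_all()
    request.caller_cond = None
    return True


def kick_requests(requests):
    """Signal only requests whose caller is still waiting; return the kicked tids."""
    kicked = []
    for request in requests:
        if request.caller_cond is not None:
            with request.caller_cond:
                request.caller_cond.notify_all()
            kicked.append(request.tid)
    return kicked
```

A request whose caller has already returned is skipped by kick_requests,
which is the case that used to trip the assertion.
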
12 years agoMerge pull request #144 from dalgaaf/wip-da-ceph-disk
Sage Weil [Tue, 26 Mar 2013 19:06:41 +0000 (12:06 -0700)]
Merge pull request #144 from dalgaaf/wip-da-ceph-disk

Fix some issues in ceph-disk

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #143 from ceph/wip-mds-health
Sage Weil [Tue, 26 Mar 2013 18:44:29 +0000 (11:44 -0700)]
Merge pull request #143 from ceph/wip-mds-health

improve mds health checks

Reviewed-by: Sam Lang <sam.lang@inktank.com>
12 years agoceph-disk: udevadm settle before partprobe
Gary Lowell [Tue, 26 Mar 2013 18:31:16 +0000 (11:31 -0700)]
ceph-disk:  udevadm settle before partprobe

After changing the partition table, allow the udev event to be
processed before calling partprobe.  This helps prevent partprobe
from getting a resource busy error on some platforms.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
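
The ordering described above might look like the sketch below. The
function name reread_partition_table and the injectable run parameter are
hypothetical; only the two commands and their order come from the commit
message.

```python
import subprocess


def reread_partition_table(device, run=subprocess.check_call):
    """Let udev drain its event queue, then ask the kernel to re-read partitions."""
    commands = [
        ["udevadm", "settle"],   # wait for pending udev events to be processed
        ["partprobe", device],   # only then re-read the partition table
    ]
    for command in commands:
        run(command)
    return commands
```

Passing a fake run callable (for example a list's append method) lets the
ordering be checked without touching a real device.
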
12 years agoMerge pull request #147 from ceph/wip-4537
Sage Weil [Tue, 26 Mar 2013 16:29:42 +0000 (09:29 -0700)]
Merge pull request #147 from ceph/wip-4537

mds: CInode::build_backtrace() always incr iter

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomds: CInode::build_backtrace() always incr iter 147/head
Sam Lang [Tue, 26 Mar 2013 13:55:40 +0000 (08:55 -0500)]
mds: CInode::build_backtrace() always incr iter

Always increment the iterator when adding old pools
to the backtrace.  This fixes a bug on files where
the layout had been set to a different pool and then
back to the same pool, causing continuous looping in
the build_backtrace() function.

Fixes #4537.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
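
The log does not show the loop itself, so the following is only a guess at
its shape: assume the code walks a list of old pools and skips entries
equal to the file's current pool. The name collect_old_pools and the list
representation are hypothetical.

```python
def collect_old_pools(old_pools, current_pool):
    """Return the old pools to record, skipping the pool the file lives in now."""
    result = []
    i = 0
    while i < len(old_pools):
        pool = old_pools[i]
        if pool != current_pool:
            result.append(pool)
        # Advance unconditionally. Advancing only inside the branch above
        # would spin forever on an entry equal to current_pool.
        i += 1
    return result
```
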
12 years agojava: fix test name typo 139/head
Noah Watkins [Tue, 26 Mar 2013 16:06:14 +0000 (09:06 -0700)]
java: fix test name typo

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agoMerge pull request #145 from dalgaaf/wip-da-c_str
Sage Weil [Tue, 26 Mar 2013 15:15:53 +0000 (08:15 -0700)]
Merge pull request #145 from dalgaaf/wip-da-c_str

CrushWrapper.cc: remove some std::string::c_str() calls

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoclient: Cleanup request signaling
Sam Lang [Mon, 25 Mar 2013 18:13:28 +0000 (13:13 -0500)]
client: Cleanup request signaling

Split up the conditionals handling unsafe reply
and signaling the caller to improve readability.
The overall behavior of the code remains the same.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
12 years agoclient: Handle duplicate safe replies
Sam Lang [Mon, 25 Mar 2013 17:58:13 +0000 (12:58 -0500)]
client: Handle duplicate safe replies

If the mds sends a duplicate safe reply, the mds_requests
map won't contain a matching request id (tid).  Instead of
assert failing, we log a message that we saw a reply without
a matching request.

Also remove redundant mds_requests->erase(tid) line.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
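
A minimal sketch of the behavior described above, with the mds_requests
map modeled as a plain dict keyed by tid. The function name
handle_safe_reply and the log wording are assumptions, not the client's
actual code.

```python
import logging

log = logging.getLogger("client")


def handle_safe_reply(mds_requests, tid):
    """Remove and return the request for tid; log and return None if it is unknown."""
    request = mds_requests.pop(tid, None)
    if request is None:
        # A duplicate safe reply lands here instead of failing an assertion.
        log.warning("got reply for tid %s with no matching request", tid)
        return None
    return request
```
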
12 years agoclient: Always cleanup request after safe
Sam Lang [Mon, 25 Mar 2013 16:43:54 +0000 (11:43 -0500)]
client: Always cleanup request after safe

The client MetaRequest should always be cleaned up
and removed from the mds_requests map once the client
gets a safe reply.  This patch avoids a leak where the
mds does not send back an unsafe reply and the request
is never cleaned up.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
12 years agoclient: Remove got_safe from MetaRequest
Sam Lang [Mon, 25 Mar 2013 16:39:19 +0000 (11:39 -0500)]
client: Remove got_safe from MetaRequest

Once a safe reply is received, we remove the
request from the mds_requests map, so checking that
it might be a duplicate won't succeed.  This patch
removes the got_safe checks in the reply handling code
and the got_safe field on the MetaRequest to avoid confusion.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
12 years agoCrushWrapper.cc: remove some std::string::c_str() calls 145/head
Danny Al-Gaaf [Tue, 26 Mar 2013 11:46:46 +0000 (12:46 +0100)]
CrushWrapper.cc: remove some std::string::c_str() calls

Passing the result of c_str() to a function that takes
std::string as argument is slow and redundant.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoMerge remote-tracking branch 'gh/wip-crush'
Sage Weil [Mon, 25 Mar 2013 23:29:56 +0000 (16:29 -0700)]
Merge remote-tracking branch 'gh/wip-crush'

The non-crush bits
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agodoc/release-notes: extra note for v0.56.4
Sage Weil [Mon, 25 Mar 2013 23:24:48 +0000 (16:24 -0700)]
doc/release-notes: extra note for v0.56.4

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc/release-notes: v0.56.4
Sage Weil [Mon, 25 Mar 2013 23:09:24 +0000 (16:09 -0700)]
doc/release-notes: v0.56.4

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoImprove test by getting cloneid from my_snaps vector
David Zafman [Sat, 23 Mar 2013 01:14:10 +0000 (18:14 -0700)]
Improve test by getting cloneid from my_snaps vector

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
12 years agorgw: bucket index ops on system buckets shouldn't do anything
Yehuda Sadeh [Mon, 25 Mar 2013 16:50:33 +0000 (09:50 -0700)]
rgw: bucket index ops on system buckets shouldn't do anything

Fixes: #4508
Backport: bobtail
On certain bucket index operations we didn't check whether
the bucket was a system bucket, which caused the operations
to fail. This triggered an error message on bucket removal
operations.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoceph-disk: rename some local variables in list_*partitions 144/head
Danny Al-Gaaf [Mon, 25 Mar 2013 16:45:32 +0000 (17:45 +0100)]
ceph-disk: rename some local variables in list_*partitions

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: fix naming of a local variable in find_cluster_by_uuid
Danny Al-Gaaf [Mon, 25 Mar 2013 15:24:00 +0000 (16:24 +0100)]
ceph-disk: fix naming of a local variable in find_cluster_by_uuid

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: rename some constants to upper case variable names
Danny Al-Gaaf [Mon, 25 Mar 2013 15:18:17 +0000 (16:18 +0100)]
ceph-disk: rename some constants to upper case variable names

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: add some more docstrings
Danny Al-Gaaf [Mon, 25 Mar 2013 15:15:29 +0000 (16:15 +0100)]
ceph-disk: add some more docstrings

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: print subprocess.CalledProcessError on error
Danny Al-Gaaf [Mon, 25 Mar 2013 13:36:41 +0000 (14:36 +0100)]
ceph-disk: print subprocess.CalledProcessError on error

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: fix indentation
Danny Al-Gaaf [Mon, 25 Mar 2013 12:55:56 +0000 (13:55 +0100)]
ceph-disk: fix indentation

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agojava: pretty print Ceph extent
Noah Watkins [Sun, 24 Mar 2013 20:03:56 +0000 (13:03 -0700)]
java: pretty print Ceph extent

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: support ceph_get_osd_addr
Noah Watkins [Fri, 22 Mar 2013 19:42:47 +0000 (12:42 -0700)]
java: support ceph_get_osd_addr

Adds a few JNI utilities from the Android project (license: Apache) to
help with IP address conversions. These functions are also updated to
work in our environment (use Ceph exception utilities, edit header
paths).

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: support ceph_get_osd_crush_location
Noah Watkins [Thu, 21 Mar 2013 22:42:34 +0000 (15:42 -0700)]
java: support ceph_get_osd_crush_location

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: support ceph_get_file_extent_osds
Noah Watkins [Thu, 21 Mar 2013 22:41:10 +0000 (15:41 -0700)]
java: support ceph_get_file_extent_osds

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agoMerge branch 'next'
Sage Weil [Sun, 24 Mar 2013 00:31:29 +0000 (17:31 -0700)]
Merge branch 'next'

12 years agoclient: don't set other if lookup fails in rename
Sam Lang [Sat, 23 Mar 2013 19:07:59 +0000 (14:07 -0500)]
client: don't set other if lookup fails in rename

On rename, only set the other inode if the
lookup for the destination succeeds, otherwise we hit
a segv in set_other_inode().

Fixes #4517.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Tested-by: Noah Watkins <jayhawk@cs.ucsc.edu>
12 years agoMerge branch 'next'
Sage Weil [Sat, 23 Mar 2013 18:03:58 +0000 (11:03 -0700)]
Merge branch 'next'

12 years agotest/libcephfs: Test rename error cases
Sam Lang [Fri, 22 Mar 2013 20:23:21 +0000 (15:23 -0500)]
test/libcephfs: Test rename error cases

Make sure that rename fails with ENOENT
if the source path doesn't exist.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
12 years agoclient: Fix rename returning ENOENT for dest
Sam Lang [Fri, 22 Mar 2013 20:02:58 +0000 (15:02 -0500)]
client: Fix rename returning ENOENT for dest

Introduced by fc80c1dc6ee315ae5e039986602ffadba46cb43b,
the client should _not_ fail if the lookup for the
destination path on rename returns ENOENT.

The previous code also did not check that the lookup
returned ENOENT or success.  We add the check and fail
if we get any other errors.

Fixes #4517.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
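
The three-way check on the destination lookup can be sketched as follows.
The helper name resolve_rename_target and the (rc, inode) return shape are
hypothetical; the rule that success and ENOENT are both acceptable while
any other error fails the rename comes from the message above.

```python
import errno


def resolve_rename_target(lookup, dest_path):
    """Return (rc, inode); rc is nonzero only for lookup errors other than ENOENT."""
    rc, inode = lookup(dest_path)
    if rc == 0:
        return 0, inode      # destination exists; it may be set as the other inode
    if rc == -errno.ENOENT:
        return 0, None       # a missing destination is fine for rename
    return rc, None          # any other error aborts the rename
```
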
12 years agoMDSMap: improve health check 143/head
Sage Weil [Sat, 23 Mar 2013 04:04:43 +0000 (21:04 -0700)]
MDSMap: improve health check

Note if the cluster is degraded.  If so, indicate specifically which MDSs
are degraded and what state they are in.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMDSMap:: constify a bunch of methods
Sage Weil [Sat, 23 Mar 2013 01:22:21 +0000 (18:22 -0700)]
MDSMap:: constify a bunch of methods

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agopreserve /var/lib/ceph on deb/rpm purge
Sage Weil [Fri, 22 Mar 2013 22:24:39 +0000 (15:24 -0700)]
preserve /var/lib/ceph on deb/rpm purge

We should clobber configuration and log data, but *not* user data.  Leave
/var/lib/ceph alone.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
12 years agomon: factor out _get_pending_crush() helper 129/head
Sage Weil [Fri, 22 Mar 2013 21:27:21 +0000 (14:27 -0700)]
mon: factor out _get_pending_crush() helper

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon, crush: add some tests to build a DAG via the cli
Sage Weil [Fri, 22 Mar 2013 19:32:47 +0000 (12:32 -0700)]
mon, crush: add some tests to build a DAG via the cli

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocrush, mon: unlink vs remove
Sage Weil [Fri, 22 Mar 2013 19:32:15 +0000 (12:32 -0700)]
crush, mon: unlink vs remove

Make an 'unlink' mode of remove that will remove a link to a bucket but
not remove the bucket itself.  This refactors remove_item[_under] and moves
some of the checks into common helpers where they are not duplicated.  Fix
callers to pass the extra arg.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocrush: fix remove_item on bucket removal
Sage Weil [Thu, 21 Mar 2013 18:15:30 +0000 (11:15 -0700)]
crush: fix remove_item on bucket removal

Remove the bucket if there are no references left.

Remove the name from the map even if it is a bucket (not sure why that
condition was there in the first place!).

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: add 'osd crush add-bucket <name> <type>'
Sage Weil [Thu, 21 Mar 2013 18:04:59 +0000 (11:04 -0700)]
mon: add 'osd crush add-bucket <name> <type>'

This is (I think) the last missing piece to let you construct an entire
map via the CLI.  The add/set commands will construct intervening ancestor
nodes provided there is an existing ancestor to stick them under, but this
is needed to create the initial root node.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: allow removal of buckets via 'osd crush rm ...'
Sage Weil [Thu, 21 Mar 2013 18:03:55 +0000 (11:03 -0700)]
mon: allow removal of buckets via 'osd crush rm ...'

No reason to limit this to leaves.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocrush: change find_roots(); add find_takes()
Sage Weil [Fri, 22 Mar 2013 21:23:37 +0000 (14:23 -0700)]
crush: change find_roots(); add find_takes()

The find_roots() was looking for nodes referenced by 'take', but those
aren't necessarily roots, which is what the callers actually want.

Rename to find_takes() and add a real find_roots().  Not very efficient,
but we don't care.

Signed-off-by: Sage Weil <sage@inktank.com>
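
The distinction between the two helpers can be illustrated with a toy
model. Here the hierarchy is a dict mapping each bucket to its children
and a rule is a list of step tuples; both representations are assumptions
for illustration and do not reflect CrushWrapper's real data structures.

```python
def find_takes(rules):
    """Collect every node named by a 'take' step in any rule."""
    return {step[1] for rule in rules for step in rule if step[0] == "take"}


def find_roots(children):
    """Return buckets that no other bucket references as a child."""
    referenced = {child for kids in children.values() for child in kids}
    return {node for node in children if node not in referenced}
```

With children = {"default": ["rack1"], "rack1": ["osd.0"]} and a rule that
takes "rack1", find_takes reports "rack1" even though the only root is
"default".
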
12 years agomon: add optional ancestor arg to 'ceph osd crush rm <item> [ancestor]'
Sage Weil [Wed, 20 Mar 2013 15:40:37 +0000 (08:40 -0700)]
mon: add optional ancestor arg to 'ceph osd crush rm <item> [ancestor]'

Remove only instances of the item underneath a particular ancestor.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocrush: add remove_item_under()
Sage Weil [Wed, 20 Mar 2013 15:40:09 +0000 (08:40 -0700)]
crush: add remove_item_under()

Remove only instances of item nested beneath a particular ancestor.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: 'ceph osd crush link ...' to add a link to an existing bucket
Sage Weil [Wed, 20 Mar 2013 15:00:12 +0000 (08:00 -0700)]
mon: 'ceph osd crush link ...' to add a link to an existing bucket

Allow a second reference to an existing bucket to be added.  This lets
you create a DAG instead of a tree using the CLI.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocrush: prevent formation of a loop
Sage Weil [Wed, 20 Mar 2013 14:59:26 +0000 (07:59 -0700)]
crush: prevent formation of a loop

If we are adding an item, ensure it cannot form a loop in the tree/map/
DAG.

Signed-off-by: Sage Weil <sage@inktank.com>
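
One way to perform such a check is a reachability walk: linking item under
parent creates a loop exactly when parent is item itself or is reachable
from item. The sketch below assumes a dict of bucket-to-children lists;
the name would_create_loop is hypothetical and this is not necessarily how
the crush code implements it.

```python
def would_create_loop(children, parent, item):
    """True if linking item under parent would make the hierarchy cyclic."""
    stack = [item]
    seen = set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True          # parent sits at or below item
        if node in seen:
            continue             # shared subtrees in a DAG are visited once
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False
```
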
12 years agocrush: add link_bucket()
Sage Weil [Wed, 20 Mar 2013 14:59:03 +0000 (07:59 -0700)]
crush: add link_bucket()

Allow an existing bucket to get linked from a new position in the tree.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: 'ceph osd crush add ...' to add a second link to an item
Sage Weil [Wed, 20 Mar 2013 14:02:20 +0000 (07:02 -0700)]
mon: 'ceph osd crush add ...' to add a second link to an item

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'upstream/wip_4435'
Samuel Just [Fri, 22 Mar 2013 21:15:31 +0000 (14:15 -0700)]
Merge remote-tracking branch 'upstream/wip_4435'

Fixes: #4435
Reviewed-by: David Zafman <david.zafman@inktank.com>
12 years agoPG::GetMissing: need to check need_up_thru in MLogRec handler
Samuel Just [Fri, 22 Mar 2013 20:51:14 +0000 (13:51 -0700)]
PG::GetMissing: need to check need_up_thru in MLogRec handler

Backport: bobtail
Fixes: #4534
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoPG,osd_types: improve check_new_interval debugging
Samuel Just [Fri, 22 Mar 2013 20:48:49 +0000 (13:48 -0700)]
PG,osd_types: improve check_new_interval debugging

Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agocommon/MemoryModel: remove logging to /tmp/memlog
Sage Weil [Fri, 22 Mar 2013 20:25:49 +0000 (13:25 -0700)]
common/MemoryModel: remove logging to /tmp/memlog

This was a hack for dev purposes ages ago; remove it.  The predictable
filename is a security issue.

CVE-2013-1882

Reported-by: Michael Scherer <misc@zarb.org>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoinit-ceph: clean up temp ceph.conf filename on exit
Sage Weil [Fri, 22 Mar 2013 20:25:43 +0000 (13:25 -0700)]
init-ceph: clean up temp ceph.conf filename on exit

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoinit-ceph: push temp conf file to a unique location on remote host
Sage Weil [Fri, 22 Mar 2013 20:25:33 +0000 (13:25 -0700)]
init-ceph: push temp conf file to a unique location on remote host

The predictable file name is a security problem.

CVE-2013-1882

Reported-by: Michael Scherer <misc@zarb.org>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agomkcephfs: make remote temp directory name unique
Sage Weil [Fri, 22 Mar 2013 20:25:23 +0000 (13:25 -0700)]
mkcephfs: make remote temp directory name unique

The predictable file name is a security problem.

CVE-2013-1882

Reported-by: Michael Scherer <misc@zarb.org>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
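
The scripts touched by the CVE-2013-1882 commits are shell, and their
exact changes are not shown in this log. As a language-neutral
illustration of the idea, the Python sketch below creates a temporary
directory whose name cannot be predicted in advance; the helper name and
the prefix are hypothetical.

```python
import os
import tempfile


def make_private_tmpdir(prefix="mkcephfs."):
    """Create a directory with an unpredictable name and return its path."""
    # mkdtemp picks a random suffix and creates the directory atomically,
    # so another user cannot pre-create or guess the path.
    return tempfile.mkdtemp(prefix=prefix)
```

The caller is responsible for removing the directory when done, for
example with os.rmdir or shutil.rmtree.
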
12 years agoMerge pull request #130 from ceph/wip-fs-rename
Sage Weil [Fri, 22 Mar 2013 20:07:41 +0000 (13:07 -0700)]
Merge pull request #130 from ceph/wip-fs-rename

test: add ceph_rename test

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agodoc: Added {id} argument to OSD lost.
John Wilkins [Fri, 22 Mar 2013 18:52:12 +0000 (11:52 -0700)]
doc: Added {id} argument to OSD lost.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoceph-disk: re-add python 2.7 dependency comment
Sage Weil [Fri, 22 Mar 2013 17:09:55 +0000 (10:09 -0700)]
ceph-disk: re-add python 2.7 dependency comment

FIXME!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #117 from ceph/wip-ceph-disk
Sage Weil [Fri, 22 Mar 2013 17:06:13 +0000 (10:06 -0700)]
Merge pull request #117 from ceph/wip-ceph-disk

ceph-disk-* refactor

12 years agoMerge branch 'next'
Sage Weil [Fri, 22 Mar 2013 16:15:52 +0000 (09:15 -0700)]
Merge branch 'next'

12 years agoosd: reenable 'journal aio = true'
Sage Weil [Tue, 19 Mar 2013 21:01:08 +0000 (14:01 -0700)]
osd: reenable 'journal aio = true'

Now that #4079 is resolved.  Reverts 1cfc3ae0.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoos/FileJournal: fix aio self-throttling deadlock
Sage Weil [Tue, 19 Mar 2013 21:26:16 +0000 (14:26 -0700)]
os/FileJournal: fix aio self-throttling deadlock

This block of code tries to limit the number of aios in flight by waiting
for the amount of data to be written to grow relative to a function of the
number of aios.  Strictly speaking, the condition we are waiting for is a
function of both aio_num and the write queue, but we are only woken by
changes in aio_num, and were (in rare cases) waiting when aio_num == 0 and
there was no possibility of being woken.

Fix this by verifying that aio_num > 0, and restructuring the loop to
recheck that condition on each wakeup.

Fixes: #4079
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
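
The shape of the fix can be modeled as below. This is a toy Python
version, not FileJournal's code: the class AioThrottle, the field
queued_bytes and the growth function in _threshold are all assumptions.
What it tries to capture from the message is that the wait loop rechecks
aio_num > 0 on every wakeup and never waits when no aio is in flight.

```python
import threading


class AioThrottle:
    """Toy model of waiting for queued bytes to grow relative to in-flight aios."""

    def __init__(self):
        self.cond = threading.Condition()
        self.aio_num = 0        # aios currently in flight
        self.queued_bytes = 0   # bytes waiting to be written

    def _threshold(self):
        # Hypothetical growth function; the real one is not given in the log.
        return 1 << min(self.aio_num * 2, 24)

    def wait_for_batch(self, timeout=None):
        """Return True once it is fine to proceed, False if the wait timed out."""
        with self.cond:
            # With aio_num == 0 nothing will ever notify us, so the loop
            # condition must fail immediately instead of blocking.
            while self.aio_num > 0 and self.queued_bytes < self._threshold():
                if not self.cond.wait(timeout):
                    return False
            return True

    def aio_completed(self):
        """Record one finished aio and wake any waiter so it rechecks."""
        with self.cond:
            self.aio_num -= 1
            self.cond.notify_all()
```
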
12 years agoMerge pull request #137 from dalgaaf/wip-da-cleanup-includes
Sage Weil [Fri, 22 Mar 2013 15:46:31 +0000 (08:46 -0700)]
Merge pull request #137 from dalgaaf/wip-da-cleanup-includes

Clean up some twice-included headers

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agotest/test_snap_mapper.cc: remove twice included <tr1/memory> 137/head
Danny Al-Gaaf [Fri, 22 Mar 2013 15:03:22 +0000 (16:03 +0100)]
test/test_snap_mapper.cc: remove twice included <tr1/memory>

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agomon/MDSMonitor.cc: remove twice included MonitorDBStore.h
Danny Al-Gaaf [Fri, 22 Mar 2013 15:02:55 +0000 (16:02 +0100)]
mon/MDSMonitor.cc: remove twice included MonitorDBStore.h

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agomon/LogMonitor.cc: remove twice included <sstream>
Danny Al-Gaaf [Fri, 22 Mar 2013 15:02:23 +0000 (16:02 +0100)]
mon/LogMonitor.cc: remove twice included <sstream>

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agomon/AuthMonitor.cc: remove twice included <sstream>
Danny Al-Gaaf [Fri, 22 Mar 2013 15:01:53 +0000 (16:01 +0100)]
mon/AuthMonitor.cc: remove twice included <sstream>

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agocommon/Formatter.h: remove twice included <list>
Danny Al-Gaaf [Fri, 22 Mar 2013 15:01:15 +0000 (16:01 +0100)]
common/Formatter.h: remove twice included <list>

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoReplicatedPG: add debug flag to skip full check at reservation 136/head
Samuel Just [Fri, 22 Mar 2013 01:06:59 +0000 (18:06 -0700)]
ReplicatedPG: add debug flag to skip full check at reservation

This will make it easier to test the check in do_scan.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: replica should post BackfillTooFull in do_scan if full
Samuel Just [Thu, 21 Mar 2013 20:43:03 +0000 (13:43 -0700)]
ReplicatedPG: replica should post BackfillTooFull in do_scan if full

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: halt backfill on RemoteReservationRejected in Backfilling
Samuel Just [Thu, 21 Mar 2013 20:37:58 +0000 (13:37 -0700)]
PG: halt backfill on RemoteReservationRejected in Backfilling

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: add helper for adding a timer event to retry backfill
Samuel Just [Thu, 21 Mar 2013 20:37:13 +0000 (13:37 -0700)]
PG: add helper for adding a timer event to retry backfill

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: add BackfillTooFull event for RepRecovering
Samuel Just [Thu, 21 Mar 2013 20:19:51 +0000 (13:19 -0700)]
PG: add BackfillTooFull event for RepRecovering

Replica will use this to notify Primary to stop backfilling.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: add helper for rejecting backfill reservation
Samuel Just [Thu, 21 Mar 2013 20:18:41 +0000 (13:18 -0700)]
PG: add helper for rejecting backfill reservation

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: use OSDService::too_full_for_backfill in RepWaitBackfillReserved
Samuel Just [Thu, 21 Mar 2013 19:08:50 +0000 (12:08 -0700)]
PG: use OSDService::too_full_for_backfill in RepWaitBackfillReserved

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSDService: add too_full_for_backfill
Samuel Just [Thu, 21 Mar 2013 18:37:24 +0000 (11:37 -0700)]
OSDService: add too_full_for_backfill

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge remote-tracking branch 'upstream/wip_osd_shutdown_notification'
Samuel Just [Fri, 22 Mar 2013 01:46:50 +0000 (18:46 -0700)]
Merge remote-tracking branch 'upstream/wip_osd_shutdown_notification'

Fixes: #1857
Fixes: #4267
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMakefile: add MOSDMarkMeDown 131/head
Samuel Just [Thu, 21 Mar 2013 19:16:43 +0000 (12:16 -0700)]
Makefile: add MOSDMarkMeDown

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSD: notify mon prior to shutdown
Samuel Just [Thu, 21 Mar 2013 18:19:45 +0000 (11:19 -0700)]
OSD: notify mon prior to shutdown

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoMonitor: add MOSDMarkMeDown support
Samuel Just [Wed, 20 Mar 2013 21:30:49 +0000 (14:30 -0700)]
Monitor: add MOSDMarkMeDown support

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSDMonitor: factor out check_source helper
Samuel Just [Wed, 20 Mar 2013 21:30:29 +0000 (14:30 -0700)]
OSDMonitor: factor out check_source helper

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agomessages: add MOSDMarkMeDown
Samuel Just [Wed, 20 Mar 2013 20:43:31 +0000 (13:43 -0700)]
messages: add MOSDMarkMeDown

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSD: clear OpHistory on shutdown
Samuel Just [Wed, 20 Mar 2013 18:49:29 +0000 (11:49 -0700)]
OSD: clear OpHistory on shutdown

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOpRequest: use OpRequestRef for OpHistory
Samuel Just [Wed, 20 Mar 2013 18:49:03 +0000 (11:49 -0700)]
OpRequest: use OpRequestRef for OpHistory

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileStore::stat: valgrind: don't read *st on error
Samuel Just [Wed, 20 Mar 2013 18:06:59 +0000 (11:06 -0700)]
FileStore::stat: valgrind: don't read *st on error

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoshared_cache: clear lru in destructor
Samuel Just [Tue, 19 Mar 2013 21:46:20 +0000 (14:46 -0700)]
shared_cache: clear lru in destructor

Otherwise, the live references will attempt to extricate
themselves from a dissolving SharedLRU instance as the
member destructors run.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoceph_osd: clear client_throttler prior to putting g_ceph_context
Samuel Just [Tue, 19 Mar 2013 21:44:59 +0000 (14:44 -0700)]
ceph_osd: clear client_throttler prior to putting g_ceph_context

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSD: reorder OSD::shutdown
Samuel Just [Thu, 21 Mar 2013 18:19:33 +0000 (11:19 -0700)]
OSD: reorder OSD::shutdown

Reorder teardown:
- pgs
- queues/threadpools
- persist superblock
- filestore
- timers
- messengers

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSD: check for is_stopping after locking osd_lock or heartbeat_lock
Samuel Just [Tue, 19 Mar 2013 16:55:19 +0000 (09:55 -0700)]
OSD: check for is_stopping after locking osd_lock or heartbeat_lock

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSD: lookup_lock_raw_pg is dead
Samuel Just [Mon, 18 Mar 2013 23:42:57 +0000 (16:42 -0700)]
OSD: lookup_lock_raw_pg is dead

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSD: rename timer to tick_timer
Samuel Just [Mon, 18 Mar 2013 23:17:29 +0000 (16:17 -0700)]
OSD: rename timer to tick_timer

Only used for scheduling ticks - we should keep it
that way.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSD: move backfill_request_timer cleanup to OSDService::shutdown
Samuel Just [Mon, 18 Mar 2013 23:14:35 +0000 (16:14 -0700)]
OSD: move backfill_request_timer cleanup to OSDService::shutdown

Signed-off-by: Samuel Just <sam.just@inktank.com>