git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 years agomon/MDSMonitor: add confirm flag to fs reset 3336/head
John Spray [Mon, 12 Jan 2015 14:52:43 +0000 (14:52 +0000)]
mon/MDSMonitor: add confirm flag to fs reset

This was already in the command map but was not
being checked.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoqa: add `fs reset` to cephtool tests
John Spray [Mon, 12 Jan 2015 13:54:52 +0000 (13:54 +0000)]
qa: add `fs reset` to cephtool tests

This is just a superficial "I can call it" test;
its actual behaviour is checked elsewhere.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomon: implement `fs reset`
John Spray [Mon, 5 Jan 2015 19:34:57 +0000 (19:34 +0000)]
mon: implement `fs reset`

This is for use in CephFS disaster recovery.  When
the metadata pool has been forcibly reset to a single-MDS
metadata tree, we would like to reset the MDSMap to match.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoMerge pull request #2948 from ceph/wip-promote
Sage Weil [Sun, 11 Jan 2015 15:55:08 +0000 (07:55 -0800)]
Merge pull request #2948 from ceph/wip-promote

osd: promote_object separation; proxy read

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoceph_test_rados: add some debug output 2948/head
Sage Weil [Tue, 6 Jan 2015 21:01:45 +0000 (13:01 -0800)]
ceph_test_rados: add some debug output

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: improve proxy read cancelation
Sage Weil [Sun, 7 Dec 2014 01:45:28 +0000 (17:45 -0800)]
osd/ReplicatedPG: improve proxy read cancelation

Avoid taking the PG lock for a canceled read op (if we are lucky).  Recheck
after the lock is taken for good measure.

Signed-off-by: Sage Weil <sage@redhat.com>
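The check-then-recheck pattern described in the commit above can be sketched roughly as follows. This is only an illustration under assumptions: `ProxyReadSketch`, `finish_proxy_read` and the `completed` counter are hypothetical names, and a plain `std::mutex` stands in for the PG lock. It is not the ReplicatedPG code.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for an in-flight proxy read.
struct ProxyReadSketch {
  std::atomic<bool> canceled{false};
};

// Returns true if the completion work ran, false if the op was canceled.
bool finish_proxy_read(ProxyReadSketch& op, std::mutex& pg_lock, int& completed) {
  // Fast path: if we are lucky, a canceled op never touches the lock.
  if (op.canceled.load()) {
    return false;
  }
  std::lock_guard<std::mutex> guard(pg_lock);
  // Recheck: the op may have been canceled while we waited for the lock.
  if (op.canceled.load()) {
    return false;
  }
  ++completed;  // placeholder for the real completion work
  return true;
}
```

The first check is purely an optimization; correctness rests on the second check made while the lock is held.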
10 years agoosd/ReplicatedPG: put proxy read completion on finisher
Sage Weil [Sun, 7 Dec 2014 01:42:51 +0000 (17:42 -0800)]
osd/ReplicatedPG: put proxy read completion on finisher

We can't use the synchronous completion callbacks (in fast dispatch
context) to do the proxy read completion work.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: tiering: avoid duplicate promotion on proxy read
Zhiqiang Wang [Fri, 28 Nov 2014 08:30:20 +0000 (16:30 +0800)]
osd: tiering: avoid duplicate promotion on proxy read

Do not promote in maybe_handle_cache if a promotion is already under way.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: proxy instead of redirect read in writeback mode when the
Zhiqiang Wang [Wed, 26 Nov 2014 01:57:03 +0000 (09:57 +0800)]
osd: tiering: proxy instead of redirect read in writeback mode when the
cache pool is full

To preserve read op order

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: cancel and requeue proxy read when needed
Zhiqiang Wang [Fri, 21 Nov 2014 06:01:24 +0000 (14:01 +0800)]
osd: tiering: cancel and requeue proxy read when needed

Cancel and requeue proxy read in the following cases:
1) on_shutdown
2) on_change
3) background promotion is done

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
Conflicts:
src/osd/ReplicatedPG.cc
src/osd/ReplicatedPG.h

10 years agoosd/ReplicatedPG: allow reads to proxy etc even if blocked
Sage Weil [Tue, 9 Dec 2014 01:57:13 +0000 (17:57 -0800)]
osd/ReplicatedPG: allow reads to proxy etc even if blocked

If we are not write ordered, continue with cache checks so that we can
(among other things) proxy reads while promoting.

Note that this may reorder reads for clients, but we've decided that's okay.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agotest: add proxy read test
Zhiqiang Wang [Wed, 19 Nov 2014 03:14:46 +0000 (11:14 +0800)]
test: add proxy read test

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: proxy reads during promote
Zhiqiang Wang [Tue, 18 Nov 2014 23:47:32 +0000 (15:47 -0800)]
osd: tiering: proxy reads during promote

wip 9980. Do proxy read and async promotion for writeback.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: add cache mode READPROXY
Zhiqiang Wang [Tue, 18 Nov 2014 08:10:00 +0000 (16:10 +0800)]
osd: tiering: add cache mode READPROXY

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: add proxy read support
Zhiqiang Wang [Tue, 18 Nov 2014 07:54:47 +0000 (15:54 +0800)]
osd: tiering: add proxy read support

wip 9979

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd/ReplicatedPG: separate promotion from the triggering op
Sage Weil [Mon, 17 Nov 2014 22:02:39 +0000 (14:02 -0800)]
osd/ReplicatedPG: separate promotion from the triggering op

Remove the triggering op from the internal promote machinery.

We keep the optional op arg to promote_object() only because we may
block on an object other than the original obc.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: pass promote error to all blocked operations
Sage Weil [Mon, 17 Nov 2014 21:06:29 +0000 (13:06 -0800)]
osd/ReplicatedPG: pass promote error to all blocked operations

This isn't the most elegant strategy, but it is the best we can do
right now.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: drop unnecessary cache_mode checks
Sage Weil [Mon, 17 Nov 2014 20:46:51 +0000 (12:46 -0800)]
osd/ReplicatedPG: drop unnecessary cache_mode checks

This currently enumerates all cache modes except none, and we don't
arrive in this function when caching is disabled.  And creating a whiteout
is not cache_mode dependent.  Simplify!

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: adjust braces (no semantic change)
Sage Weil [Thu, 23 Oct 2014 23:53:14 +0000 (16:53 -0700)]
osd/ReplicatedPG: adjust braces (no semantic change)

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: factor out must_promote case from all cache modes
Sage Weil [Thu, 23 Oct 2014 23:52:22 +0000 (16:52 -0700)]
osd/ReplicatedPG: factor out must_promote case from all cache modes

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: factor out common exists case from all cache modes
Sage Weil [Thu, 23 Oct 2014 23:51:03 +0000 (16:51 -0700)]
osd/ReplicatedPG: factor out common exists case from all cache modes

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: make op argument to promote_object optional
Sage Weil [Thu, 23 Oct 2014 21:34:36 +0000 (14:34 -0700)]
osd/ReplicatedPG: make op argument to promote_object optional

For now, we still always pass it.  In preparation, however, we modify
promote_object() so that it will work when op is null.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3309 from trociny/wip-9483 3484/head
Josh Durgin [Sat, 10 Jan 2015 23:25:10 +0000 (15:25 -0800)]
Merge pull request #3309 from trociny/wip-9483

OSD: add a get_latest_osdmap command to the admin socket

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoOSD: add a get_latest_osdmap command to the admin socket 3309/head
Mykola Golub [Wed, 7 Jan 2015 11:39:33 +0000 (13:39 +0200)]
OSD: add a get_latest_osdmap command to the admin socket

The command blocks and ensures we have the latest map from the
mon. This is useful in testing and to "unstick" clusters in some
odd situations.

Fixes: #9483, #9484 (maybe)
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
10 years agodoc: Fix PHP librados documentation
Wido den Hollander [Sat, 10 Jan 2015 13:21:27 +0000 (14:21 +0100)]
doc: Fix PHP librados documentation

10 years agoMerge pull request #3348 from ceph/wip-mon-wishlist
Loic Dachary [Sat, 10 Jan 2015 12:54:56 +0000 (13:54 +0100)]
Merge pull request #3348 from ceph/wip-mon-wishlist

doc: mon janitorial list is now a wishlist

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agodoc: mon janitorial list is now a wishlist 3348/head
Joao Eduardo Luis [Sat, 10 Jan 2015 12:08:22 +0000 (12:08 +0000)]
doc: mon janitorial list is now a wishlist

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Sat, 10 Jan 2015 05:43:49 +0000 (21:43 -0800)]
Merge remote-tracking branch 'gh/next'

10 years agoMerge pull request #3327 from ceph/wip-peeringqueue
Sage Weil [Sat, 10 Jan 2015 05:43:04 +0000 (21:43 -0800)]
Merge pull request #3327 from ceph/wip-peeringqueue

osd: fix peering queue bug

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agoMerge pull request #3344 from ceph/wip-librbd-snap-unprotect
Josh Durgin [Sat, 10 Jan 2015 00:56:39 +0000 (16:56 -0800)]
Merge pull request #3344 from ceph/wip-librbd-snap-unprotect

librbd: shadow variable in snap_unprotect

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agorgw: return InvalidAccessKeyId instead of AccessDenied
Yehuda Sadeh [Tue, 16 Dec 2014 20:27:54 +0000 (12:27 -0800)]
rgw: return InvalidAccessKeyId instead of AccessDenied

Fixes: #10334
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 56af795b1046a4c1bfba59e1fefde272bb0e5c1e)

10 years agorgw: return SignatureDoesNotMatch instead of AccessDenied
Yehuda Sadeh [Tue, 16 Dec 2014 17:11:20 +0000 (09:11 -0800)]
rgw: return SignatureDoesNotMatch instead of AccessDenied

Fixes: #10329
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit ef75d720f289ce2e18c0047380a16b7688864560)

10 years agodoc: Clean up pool usage.
John Wilkins [Fri, 9 Jan 2015 22:54:30 +0000 (14:54 -0800)]
doc: Clean up pool usage.

Signed-off-by: John Wilkins <jowilkin@redhat.com>
10 years agodoc: Cleanup RGW pool usage.
John Wilkins [Fri, 9 Jan 2015 22:54:06 +0000 (14:54 -0800)]
doc: Cleanup RGW pool usage.

Signed-off-by: John Wilkins <jowilkin@redhat.com>
10 years agoMerge pull request #3341 from liewegas/wip-10504
Gregory Farnum [Fri, 9 Jan 2015 22:52:19 +0000 (14:52 -0800)]
Merge pull request #3341 from liewegas/wip-10504

client: add ceph version to metadata

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoclient: include ceph and git version in client metadata 3341/head
Sage Weil [Fri, 9 Jan 2015 22:41:34 +0000 (14:41 -0800)]
client: include ceph and git version in client metadata

Fixes: #10504
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3325 from ceph/wip-nits
Josh Durgin [Fri, 9 Jan 2015 22:30:44 +0000 (14:30 -0800)]
Merge pull request #3325 from ceph/wip-nits

allow 'ops' instead of 'dump_ops_in_flight'

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3250 from ceph/wip-10372
Josh Durgin [Fri, 9 Jan 2015 22:12:22 +0000 (14:12 -0800)]
Merge pull request #3250 from ceph/wip-10372

osdc/Objecter: improve pool deletion detection

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoRevert "rgw: switch to new watch/notify API"
Yehuda Sadeh [Fri, 9 Jan 2015 22:02:04 +0000 (14:02 -0800)]
Revert "rgw: switch to new watch/notify API"

This reverts commit dc67cd69604ec4e4df846b818ec739dc7b09a537.

Conflicts:
src/rgw/rgw_rados.cc

10 years agoMerge pull request #3313 from ceph/wip-asok-get-subtrees 3339/head
Gregory Farnum [Fri, 9 Jan 2015 21:45:26 +0000 (13:45 -0800)]
Merge pull request #3313 from ceph/wip-asok-get-subtrees

Add MDS "get subtrees" asok command

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
10 years agoMerge pull request #3312 from ceph/wip-mdscacheobject-const
Gregory Farnum [Fri, 9 Jan 2015 21:29:59 +0000 (13:29 -0800)]
Merge pull request #3312 from ceph/wip-mdscacheobject-const

mds: support constness in MDSCacheObjects

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoMerge pull request #3248 from dachary/wip-table-formatter
Loic Dachary [Fri, 9 Jan 2015 18:34:35 +0000 (19:34 +0100)]
Merge pull request #3248 from dachary/wip-table-formatter

add table to Formatter

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agodoc: Added section to install priorities/preferences.
John Wilkins [Fri, 9 Jan 2015 18:26:51 +0000 (10:26 -0800)]
doc: Added section to install priorities/preferences.

Signed-off-by: John Wilkins <jowilkin@redhat.com>
10 years agoMerge pull request #3324 from mattrichards/rados_translate_op_flag
Josh Durgin [Fri, 9 Jan 2015 17:13:09 +0000 (09:13 -0800)]
Merge pull request #3324 from mattrichards/rados_translate_op_flag

librados: Translate operation flags from C APIs

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3328 from xiaoxichen/memstore_size_u64
Sage Weil [Fri, 9 Jan 2015 15:52:20 +0000 (07:52 -0800)]
Merge pull request #3328 from xiaoxichen/memstore_size_u64

Bump memstore_device_bytes from U32 to U64

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3337 from dachary/wip-10494-disable-unittest-msgr
Haomai Wang [Fri, 9 Jan 2015 15:42:05 +0000 (23:42 +0800)]
Merge pull request #3337 from dachary/wip-10494-disable-unittest-msgr

tests: temporarily disable unittest_msgr

10 years agotests: temporarily disable unittest_msgr 3337/head
Loic Dachary [Fri, 9 Jan 2015 15:17:53 +0000 (16:17 +0100)]
tests: temporarily disable unittest_msgr

http://tracker.ceph.com/issues/10494 Refs: #10494

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoMerge pull request #3331 from dachary/wip-10493-async-port
Haomai Wang [Fri, 9 Jan 2015 14:48:19 +0000 (22:48 +0800)]
Merge pull request #3331 from dachary/wip-10493-async-port

msg: initialize AsyncConnection::port

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
10 years agomds: add asok command for getting subtreemap 3313/head
John Spray [Mon, 5 Jan 2015 22:32:55 +0000 (22:32 +0000)]
mds: add asok command for getting subtreemap

For when we want to inspect this from a test or
during debugging.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: give CDir a dump() method for JSON output
John Spray [Tue, 6 Jan 2015 18:13:28 +0000 (18:13 +0000)]
mds: give CDir a dump() method for JSON output

Useful when listing subtrees via admin socket.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: support constness in MDSCacheObjects 3312/head
John Spray [Mon, 5 Jan 2015 23:53:18 +0000 (23:53 +0000)]
mds: support constness in MDSCacheObjects

So that one can have const CInode and CDir references
from time to time.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agolibrbd: shadow variable in snap_unprotect and list_children 3344/head
Jason Dillaman [Fri, 9 Jan 2015 13:19:43 +0000 (08:19 -0500)]
librbd: shadow variable in snap_unprotect and list_children

The shadow variable prevented snap_unprotect from returning the
correct error return code.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
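A minimal sketch of how a shadowed variable can swallow an error code, which is the kind of bug the commit above describes. The function names and the `-EBUSY` choice are hypothetical and exist only for illustration; this is not the librbd code.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical step that can fail; not a librbd function.
static int check_children(bool has_children) {
  return has_children ? -EBUSY : 0;
}

// Buggy shape: the inner `r` shadows the outer one, so the error is lost.
int unprotect_shadowed(bool has_children) {
  int r = 0;
  {
    int r = check_children(has_children);  // shadows the outer r
    (void)r;                               // the error never leaves this scope
  }
  return r;  // always 0
}

// Fixed shape: assign to the outer variable instead of redeclaring it.
int unprotect_fixed(bool has_children) {
  int r = 0;
  {
    r = check_children(has_children);
  }
  return r;
}
```

Compiling with a warning such as `-Wshadow` in GCC or Clang flags the first form.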
10 years agodoc: Add Librados PHP documentation
Wido den Hollander [Fri, 9 Jan 2015 13:15:32 +0000 (14:15 +0100)]
doc: Add Librados PHP documentation

10 years agoMerge pull request #3332 from dachary/wip-9570-filejournal
Loic Dachary [Fri, 9 Jan 2015 12:57:11 +0000 (13:57 +0100)]
Merge pull request #3332 from dachary/wip-9570-filejournal

journal related cleanups

Reviewed-by: Jianpeng Ma <jianpeng.ma@intel.com>
10 years agocommon: Formatter: cosmetic re-indent 3248/head
Andreas Peters [Wed, 15 Oct 2014 09:30:35 +0000 (11:30 +0200)]
common: Formatter: cosmetic re-indent

Signed-off-by: Andreas Peters <andreas.joachim.peters@cern.ch>
10 years agocommon: Formatter: add TableFormatter class
Andreas-Joachim Peters [Wed, 8 Oct 2014 14:18:32 +0000 (16:18 +0200)]
common: Formatter: add TableFormatter class

For more human readable and shell parsable output.

Signed-off-by: Andreas Peters <andreas.joachim.peters@cern.ch>
10 years agoos: fix confusing indentation in FileJournal::corrupt 3332/head
Loic Dachary [Tue, 30 Sep 2014 12:49:45 +0000 (14:49 +0200)]
os: fix confusing indentation in FileJournal::corrupt

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
10 years agoos: remove debug message leftover in FileJournal
Loic Dachary [Tue, 30 Sep 2014 12:31:05 +0000 (14:31 +0200)]
os: remove debug message leftover in FileJournal

The len of the buffer shows in the message above anyway.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
10 years agomsg: initialize AsyncConnection::port 3331/head
Loic Dachary [Fri, 9 Jan 2015 10:57:58 +0000 (11:57 +0100)]
msg: initialize AsyncConnection::port

http://tracker.ceph.com/issues/10493 Fixes: #10493

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoBump memstore_device_bytes from U32 to U64 3328/head
Xiaoxi Chen [Fri, 9 Jan 2015 08:15:06 +0000 (16:15 +0800)]
Bump memstore_device_bytes from U32 to U64

A U32 limits the maximum size of memstore to a few GB, which
blocks our tests of memstore performance (as a prototype).

Bumping it to U64 makes it suitable for wider use.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
10 years agoosd: assert there is a peering event 3326/head 3327/head
Sage Weil [Thu, 8 Jan 2015 19:10:45 +0000 (11:10 -0800)]
osd: assert there is a peering event

This became conditional way back in 12e22b3d44eba51a70d8babebc2684f0c46575a7
for unclear reasons.  It probably predates the in_use checks.  In any case,
at this point, we should only arrive here if the PG was queued, implying
that there will always be an event to process.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: requeue PG when we skip handling a peering event
Sage Weil [Thu, 8 Jan 2015 21:34:52 +0000 (13:34 -0800)]
osd: requeue PG when we skip handling a peering event

If we don't handle the event, we need to put the PG back into the peering
queue or else the event won't get processed until the next event is
queued, at which point we'll be processing events with a delay.

The queue_null is not necessary (and is a waste of effort) because the
event is still in pg->peering_queue and the PG is queued.

Note that this only triggers when we exceed osd_map_max_advance, usually
when there is a lot of peering and recovery activity going on.  A
workaround is to increase that value, but if you exceed osd_map_cache_size
you expose yourself to cache thrashing by the peering work queue, which
can cause serious problems with heavily degraded clusters and has bitten
lots of people on dumpling.

Backport: giant, firefly
Fixes: #10431
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agolibrados: Translate operation flags from C APIs 3324/head
Matt Richards [Thu, 8 Jan 2015 21:16:17 +0000 (13:16 -0800)]
librados: Translate operation flags from C APIs

The operation flags in the public C API are a distinct enum
and need to be translated to Ceph OSD flags, as happens in
the C++ API. It seems that the C enum and the C++ enum deliberately
use the same values, so I reused the C++ translation function.

Signed-off-by: Matthew Richards <mattjrichards@gmail.com>
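The idea of translating public API flags into internal flags can be sketched as below. All enum names and bit values here are hypothetical, chosen only for illustration; they are not the librados or OSD constants.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical public-API flags (illustrative values only).
enum PublicOpFlag : uint32_t {
  PUBLIC_BALANCE_READS = 1u << 0,
  PUBLIC_LOCALIZE_READS = 1u << 1,
};

// Hypothetical internal flags, deliberately using different bit positions.
enum InternalOpFlag : uint32_t {
  INTERNAL_BALANCE_READS = 1u << 8,
  INTERNAL_LOCALIZE_READS = 1u << 9,
};

// Map each public bit to its internal counterpart; unknown bits are dropped.
uint32_t translate_flags(uint32_t public_flags) {
  uint32_t out = 0;
  if (public_flags & PUBLIC_BALANCE_READS) {
    out |= INTERNAL_BALANCE_READS;
  }
  if (public_flags & PUBLIC_LOCALIZE_READS) {
    out |= INTERNAL_LOCALIZE_READS;
  }
  return out;
}
```

Passing the public value through untranslated would set unrelated internal bits, which is why a single shared translation function for both API surfaces is attractive.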
10 years agoMerge pull request #3321 from cernceph/wip-nobarrier-doc 2952/head
Sage Weil [Thu, 8 Jan 2015 21:06:07 +0000 (13:06 -0800)]
Merge pull request #3321 from cernceph/wip-nobarrier-doc

doc: don't suggest mounting xfs with nobarrier

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3322 from dachary/wip-10426-test-directories
Sage Weil [Thu, 8 Jan 2015 20:57:53 +0000 (12:57 -0800)]
Merge pull request #3322 from dachary/wip-10426-test-directories

tests: group clusters in a single directory

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3306 from ceph/wip-10041
Gregory Farnum [Thu, 8 Jan 2015 18:39:31 +0000 (10:39 -0800)]
Merge pull request #3306 from ceph/wip-10041

client: fix mount timeout

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agomds: allow 'ops' as shorthand for 'dump_ops_in_flight' 3325/head
Sage Weil [Thu, 8 Jan 2015 18:36:22 +0000 (10:36 -0800)]
mds: allow 'ops' as shorthand for 'dump_ops_in_flight'

This is an extremely annoying thing to type when working with a
production cluster.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: allow 'ops' as shorthand for 'dump_ops_in_flight'
Sage Weil [Thu, 8 Jan 2015 18:36:15 +0000 (10:36 -0800)]
osd: allow 'ops' as shorthand for 'dump_ops_in_flight'

This is an extremely annoying thing to type when working with a
production cluster.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agotests: group clusters in a single directory 3322/head
Loic Dachary [Thu, 8 Jan 2015 10:11:07 +0000 (11:11 +0100)]
tests: group clusters in a single directory

Group all test directories used for mini clusters into a single
sub-directory (testdir). This is easier to cleanup manually and less
error prone.

http://tracker.ceph.com/issues/10426 Fixes: #10426

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoMerge pull request #3215 from dachary/wip-10384-ceph-test-helper-races
Loic Dachary [Thu, 8 Jan 2015 11:02:32 +0000 (12:02 +0100)]
Merge pull request #3215 from dachary/wip-10384-ceph-test-helper-races

tests: resolve ceph-helpers races

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agodoc: don't suggest mounting xfs with nobarrier 3321/head
Dan van der Ster [Thu, 8 Jan 2015 08:49:10 +0000 (09:49 +0100)]
doc: don't suggest mounting xfs with nobarrier

nobarrier is dangerous, so stop suggesting it in the example
`osd mount options xfs` setting.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
10 years agoMerge pull request #3310 from ceph/wip-mdsmonitor-fixes
Yan, Zheng [Thu, 8 Jan 2015 01:55:01 +0000 (09:55 +0800)]
Merge pull request #3310 from ceph/wip-mdsmonitor-fixes

MDSMonitor fixes

10 years agoMerge pull request #3167 from ceph/wip-10307
Josh Durgin [Wed, 7 Jan 2015 21:23:44 +0000 (13:23 -0800)]
Merge pull request #3167 from ceph/wip-10307

rgw: use s->bucket_attrs instead of trying to read obj attrs

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3036 from dachary/wip-make-check
Loic Dachary [Wed, 7 Jan 2015 15:47:26 +0000 (16:47 +0100)]
Merge pull request #3036 from dachary/wip-make-check

packages: add python-virtualenv and xmlstarlet

Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
10 years agoMerge pull request #3269 from ceph/wip-10387
John Spray [Wed, 7 Jan 2015 14:30:02 +0000 (14:30 +0000)]
Merge pull request #3269 from ceph/wip-10387

client: close dirfrag when rmdir

Reviewed-by: John Spray <john.spray@redhat.com>
10 years agomon/MDSMonitor: fix `mds fail` for standby MDSs 3310/head
John Spray [Wed, 7 Jan 2015 12:37:40 +0000 (12:37 +0000)]
mon/MDSMonitor: fix `mds fail` for standby MDSs

This command takes a gid, rank or name, but
in the name case it would previously only work if
the named daemon had a rank assigned (mds_info->rank >= 0),
otherwise it would fail silently.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomon/MDSMonitor: respect MDSMAP_DOWN when promoting standbys
John Spray [Wed, 7 Jan 2015 11:47:34 +0000 (11:47 +0000)]
mon/MDSMonitor: respect MDSMAP_DOWN when promoting standbys

Previously, a standby could become active even if 'cluster_down'
had been run.  This was awkward, because it would get you a
"laggy or crashed" mds for the standby that was actually
up and running, just being ignored because of cluster_down.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoinit-ceph: stop returns before daemons are dead 3215/head
Loic Dachary [Fri, 19 Dec 2014 14:54:33 +0000 (15:54 +0100)]
init-ceph: stop returns before daemons are dead

The existence of the pidfile must be checked outside of the loop to send
a signal to the daemon. Otherwise the daemon will remove the pidfile and
stop can return before the process is dead because it only checks
/proc/$pid if the pidfile exists.

http://tracker.ceph.com/issues/10389 Fixes: #10389

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoclient: fix mount timeout 3306/head
Yan, Zheng [Wed, 7 Jan 2015 08:22:29 +0000 (16:22 +0800)]
client: fix mount timeout

implement a simple timeout mechanism for make_request()

Fixes: #10041
Signed-off-by: Yan, Zheng <zyan@redhat.com>
10 years agoMerge remote-tracking branch 'origin/wip-10270' into master
Josh Durgin [Tue, 6 Jan 2015 23:23:21 +0000 (15:23 -0800)]
Merge remote-tracking branch 'origin/wip-10270' into master

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Conflicts:
src/librados/IoCtxImpl.cc
src/librados/IoCtxImpl.h

10 years agoMerge branch 'wip-fast-txn'
Sage Weil [Tue, 6 Jan 2015 21:48:37 +0000 (13:48 -0800)]
Merge branch 'wip-fast-txn'

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agoosd/ECBackend: make sure localt uses tbl if ec txn does
Sage Weil [Wed, 24 Dec 2014 05:02:51 +0000 (21:02 -0800)]
osd/ECBackend: make sure localt uses tbl if ec txn does

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoceph-object-corpus: drop compat with old ObjectStore::Transaction
Sage Weil [Fri, 12 Dec 2014 01:22:16 +0000 (17:22 -0800)]
ceph-object-corpus: drop compat with old ObjectStore::Transaction

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: fix Transaction::get_data_offset bug when map layout used
Dong Yuan [Wed, 10 Dec 2014 16:09:01 +0000 (16:09 +0000)]
osd: fix Transaction::get_data_offset bug when map layout used

add following offset:
  sizeof(__u8) +      // encode struct_v
  sizeof(__u8) +      // encode compat_v
  sizeof(__u32);      // encode len

Change-Id: I5b6662eb42aeeae64baa8699da6ce65e0b1d58c3
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
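The arithmetic in the commit above can be checked with a short sketch. The three field sizes come from the commit message; the names `kEnvelopeBytes` and `adjusted_data_offset` are hypothetical, and `uint8_t`/`uint32_t` stand in for `__u8`/`__u32`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes written ahead of the payload, per the commit message:
// struct_v (1 byte) + compat_v (1 byte) + len (4 bytes).
constexpr std::size_t kEnvelopeBytes =
    sizeof(uint8_t) + sizeof(uint8_t) + sizeof(uint32_t);

static_assert(kEnvelopeBytes == 6, "envelope is expected to be 6 bytes");

// Hypothetical helper: shift an offset measured inside the payload so that
// it is measured from the start of the encoded buffer instead.
std::size_t adjusted_data_offset(std::size_t offset_in_payload) {
  return kEnvelopeBytes + offset_in_payload;
}
```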
10 years agoosd: add feature CEPH_FEATURE_OSD_TRANSACTION_MAY_LAYOUT (1ULL<<47)
Dong Yuan [Wed, 10 Dec 2014 11:56:49 +0000 (11:56 +0000)]
osd: add feature CEPH_FEATURE_OSD_TRANSACTION_MAY_LAYOUT (1ULL<<47)

This feature determines whether we use the tbl encoding for transactions
or the new map layout.

The primary uses peer_features to determine whether a transaction should
use tbl, while the replica just follows the primary.

Change-Id: I92ca6e5b59bd1acde6007ad0dffc085be17accab
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
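A rough sketch of a feature-bit decision like the one described above. It assumes that `peer_features` is the bitwise intersection of every peer's feature mask, which the commit message does not state. The bit position (47) is taken from the commit subject; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bit position taken from the commit subject; the constant name is shortened.
constexpr uint64_t FEATURE_NEW_LAYOUT = 1ULL << 47;

// Assumption: the primary keeps the intersection of all peers' features.
uint64_t intersect_features(const std::vector<uint64_t>& peers) {
  uint64_t common = ~0ULL;
  for (uint64_t features : peers) {
    common &= features;
  }
  return common;
}

// Fall back to the old tbl encoding unless every peer supports the new layout.
bool should_use_tbl(uint64_t peer_features) {
  return (peer_features & FEATURE_NEW_LAYOUT) == 0;
}
```

Under this assumption, a single peer without the feature bit is enough to keep the whole transaction on the old encoding.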
10 years agoosd: build fields for Transaction::iterator when tbl is used
Dong Yuan [Wed, 10 Dec 2014 10:02:51 +0000 (10:02 +0000)]
osd: build fields for Transaction::iterator when tbl is used

When tbl is used (for compatibility), the Transaction::begin method needs
to build all fields used by the iterator. That includes coll_index,
object_index, data_bl, op_bl, etc.

Change-Id: I48ea74fec8d052f50da254a726a9c0dffead19bc
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
10 years agoceph_perf_objectstore: fix warning
Sage Weil [Tue, 16 Dec 2014 19:17:04 +0000 (11:17 -0800)]
ceph_perf_objectstore: fix warning

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: Transaction::append & Transaction::swap
Dong Yuan [Tue, 2 Dec 2014 17:08:44 +0000 (17:08 +0000)]
osd: Transaction::append & Transaction::swap

Finish append and swap for the new Transaction encode/decode layout.

Since append now modifies op_bl, we changed the order of append
and swap in ReplicatedBackend::sub_op_modify and
ReplicatedBackend::submit_transaction to avoid calling append on op_t, so
that op_t can be encoded in the message.

Change-Id: I6fb421e0defdb092fb9732eef818e90291b039f5
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
10 years agoosd: new Transaction::iterator interface
Dong Yuan [Mon, 1 Dec 2014 10:58:56 +0000 (10:58 +0000)]
osd: new Transaction::iterator interface

This patch adds a new Transaction::iterator interface that follows the new
encode/decode layout. The new iterator returns the whole Op struct from a
single decode_op method.

All ObjectStore implementations (FileStore/MemStore/KeyValueStore) are also
changed to use the new interface.

Change-Id: I1900a6ec302890df2c4357b071e4966c26d7f037
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
10 years agoosd: add encode/decode impl for new layout
Dong Yuan [Thu, 27 Nov 2014 14:52:36 +0000 (14:52 +0000)]
osd: add encode/decode impl for new layout

When use_tbl is true, Transaction::encode gives the same result as
before; when use_tbl is false, Transaction::encode uses the new
fields and logic, and all related methods such as
get_encoded_bytes and get_data_offset do the same.

Change-Id: Ia5864e489d47f37cf496fe3fb825b21977d2d938
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
10 years agoosd: new format for Transaction encode/decode
Dong Yuan [Wed, 26 Nov 2014 17:58:50 +0000 (17:58 +0000)]
osd: new format for Transaction encode/decode

This patch adds a new fixed-size struct, Transaction::Op, to represent
all actions.

All coll and ghobject values used by the transaction are kept in two maps:
  coll:   map<coll_t, __le32> coll_index;
  object: map<ghobject_t, __le32> object_index;

The Op struct uses the map value (__le32) to refer to a coll or object,
so each coll and object only needs to be encoded once in the transaction.

Other variable-size fields (key/value/data) are encoded in the bufferlist
data_bl.

Change-Id: I52b2fcd3217a6cb35de7b309a6dd74a99478feb2
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
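The indexing scheme described above can be sketched as follows. This is an illustration under assumptions: `std::string` stands in for `coll_t` and `ghobject_t`, `uint32_t` for `__le32`, and `TxnSketch`, `OpSketch` and `intern` are hypothetical names rather than the ObjectStore types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Fixed-size op record: refers to its collection and object by small ids.
struct OpSketch {
  uint32_t op;
  uint32_t cid;
  uint32_t oid;
};

class TxnSketch {
 public:
  void add(uint32_t opcode, const std::string& coll, const std::string& obj) {
    ops_.push_back(OpSketch{opcode, intern(coll_index_, coll),
                            intern(object_index_, obj)});
  }
  std::size_t collections() const { return coll_index_.size(); }
  std::size_t objects() const { return object_index_.size(); }
  const std::vector<OpSketch>& ops() const { return ops_; }

 private:
  // Return the existing id for key, or assign the next free one.
  static uint32_t intern(std::map<std::string, uint32_t>& index,
                         const std::string& key) {
    auto it = index.find(key);
    if (it != index.end()) {
      return it->second;
    }
    uint32_t id = static_cast<uint32_t>(index.size());
    index.emplace(key, id);
    return id;
  }

  std::map<std::string, uint32_t> coll_index_;
  std::map<std::string, uint32_t> object_index_;
  std::vector<OpSketch> ops_;
};
```

With this shape, a transaction that touches the same object many times stores the object's name once and repeats only a 4-byte id in each op.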
10 years agoosd: Add Transaction::TransactionData for fast encode/decode
Dong Yuan [Tue, 4 Nov 2014 09:10:05 +0000 (09:10 +0000)]
osd: Add Transaction::TransactionData for fast encode/decode

TransactionData wraps the following fields:
      __le64 ops;
      __le32 largest_data_len;
      __le32 largest_data_off;
      __le32 largest_data_off_in_tbl;
      __le32 pad; //make TransactionData multiple of uint64_t

This struct can be encoded/decoded with a single memcpy instead of many
encode/decode operations.

Change-Id: I56df78def43bd2b80b77be0825756e133434a6e6
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
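A minimal sketch of the single-memcpy idea from the commit above. The field list mirrors the commit message, but host-endian `uint64_t`/`uint32_t` are used here in place of `__le64`/`__le32`, which is an assumption made to keep the example portable. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Mirrors the fields listed in the commit message (host-endian stand-ins).
struct TransactionDataSketch {
  uint64_t ops;
  uint32_t largest_data_len;
  uint32_t largest_data_off;
  uint32_t largest_data_off_in_tbl;
  uint32_t pad;  // keeps the struct a multiple of 8 bytes
};

static_assert(sizeof(TransactionDataSketch) == 24, "no hidden padding expected");
static_assert(std::is_trivially_copyable<TransactionDataSketch>::value,
              "memcpy is only valid for trivially copyable types");

// Encode: one memcpy of the whole struct into the output buffer.
void encode_sketch(const TransactionDataSketch& td, unsigned char* out) {
  std::memcpy(out, &td, sizeof(td));
}

// Decode: one memcpy back out of the buffer.
TransactionDataSketch decode_sketch(const unsigned char* in) {
  TransactionDataSketch td;
  std::memcpy(&td, in, sizeof(td));
  return td;
}
```

The explicit `pad` field is what makes the layout predictable; without it the compiler would insert the same four bytes implicitly, and their contents would be unspecified.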
10 years agoosd: remove unused Transaction fields
Dong Yuan [Mon, 8 Dec 2014 14:40:39 +0000 (14:40 +0000)]
osd: remove unused Transaction fields

We don't need sobject and pool_override anymore since we don't need to
support anything older than dumpling.

Change-Id: I22c01d4b5c6bf99765bf6bc13aecadc997d6750c
Signed-off-by: Dong Yuan <yuandong1222@gmail.com>
10 years agoMerge pull request #3272 from trociny/tell-mon-version
Loic Dachary [Tue, 6 Jan 2015 18:35:36 +0000 (19:35 +0100)]
Merge pull request #3272 from trociny/tell-mon-version

cli: make ceph tell mon.* version work

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agoMerge pull request #3196 from tchaikov/shadow-cct
Loic Dachary [Tue, 6 Jan 2015 18:34:20 +0000 (19:34 +0100)]
Merge pull request #3196 from tchaikov/shadow-cct

client, librados, osdc: do not shadow Dispatcher::cct

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agoMerge pull request #3298 from majianpeng/fix5
Loic Dachary [Tue, 6 Jan 2015 18:32:27 +0000 (19:32 +0100)]
Merge pull request #3298 from majianpeng/fix5

TestLFNIndex.cc: For root, don't do permission operations.

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agotests: resolve ceph-helpers races
Loic Dachary [Fri, 19 Dec 2014 11:26:37 +0000 (12:26 +0100)]
tests: resolve ceph-helpers races

Some tests were racing against the monitor. On a fast machine it worked,
but on slower machines (or sometimes when running in parallel) the monitor
lags behind. Use wait_for_clean to make sure the monitor is in the
desired state for the test to succeed.

http://tracker.ceph.com/issues/10384 Fixes: #10384

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoMerge pull request #3283 from dachary/wip-make-debs
Loic Dachary [Tue, 6 Jan 2015 18:07:04 +0000 (19:07 +0100)]
Merge pull request #3283 from dachary/wip-make-debs

debian: create a repository from sources

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3301 from ktdreyer/rm-tiobench
Sage Weil [Tue, 6 Jan 2015 16:26:04 +0000 (08:26 -0800)]
Merge pull request #3301 from ktdreyer/rm-tiobench

#10152: qa: drop tiobench suite

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3300 from ceph/wip-10412
Sage Weil [Tue, 6 Jan 2015 16:20:19 +0000 (08:20 -0800)]
Merge pull request #3300 from ceph/wip-10412

client: fix use-after-free bug in unmount()

Reviewed-by: Sage Weil <sage@redhat.com>