git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 years agoceph_test_rados_api_misc: do not assert rbd feature match 3411/head
Sage Weil [Tue, 20 Jan 2015 02:28:20 +0000 (18:28 -0800)]
ceph_test_rados_api_misc: do not assert rbd feature match

This test fails on upgrades when we (or the server) have new
features.  Make it less fragile.

Fixes: #10576
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3334 from dachary/wip-10216-jerasure-sync 3416/head
Sage Weil [Mon, 19 Jan 2015 20:39:36 +0000 (12:39 -0800)]
Merge pull request #3334 from dachary/wip-10216-jerasure-sync

erasure-code: update jerasure/gf-complete submodules

10 years agoMerge pull request #3375 from XinzeChi/wip-journal-seq
Sage Weil [Mon, 19 Jan 2015 20:39:04 +0000 (12:39 -0800)]
Merge pull request #3375 from XinzeChi/wip-journal-seq

osd: fix journal header.committed_up_to

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3320 from wonzhq/lfn_open
Sage Weil [Mon, 19 Jan 2015 20:38:28 +0000 (12:38 -0800)]
Merge pull request #3320 from wonzhq/lfn_open

FileStore: return error if get_index fails in lfn_open

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoMerge pull request #3211 from yuyuyu101/wip-10172
Sage Weil [Mon, 19 Jan 2015 20:38:01 +0000 (12:38 -0800)]
Merge pull request #3211 from yuyuyu101/wip-10172

AsyncMessenger: Bind thread to core, use buffer read and fix some bugs

10 years agoMerge pull request #3221 from ceph/wip-9440
Sage Weil [Mon, 19 Jan 2015 20:36:26 +0000 (12:36 -0800)]
Merge pull request #3221 from ceph/wip-9440

mon: log health changes to clog

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3382 from xinxinsh/wip-fix
Sage Weil [Mon, 19 Jan 2015 20:35:58 +0000 (12:35 -0800)]
Merge pull request #3382 from xinxinsh/wip-fix

fix command 'ceph pg dump_stuck degraded'

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoceph_test_objectstore: fix keyvaluestore name
Sage Weil [Mon, 19 Jan 2015 20:33:20 +0000 (12:33 -0800)]
ceph_test_objectstore: fix keyvaluestore name

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3338 from ceph/wip-recover-dentries
Gregory Farnum [Mon, 19 Jan 2015 18:50:56 +0000 (10:50 -0800)]
Merge pull request #3338 from ceph/wip-recover-dentries

#9883 tools/cephfs: add recover_dentries to journaltool

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoMerge pull request #3400 from kylinstorage/fix-rbd-watch
Josh Durgin [Mon, 19 Jan 2015 17:39:51 +0000 (09:39 -0800)]
Merge pull request #3400 from kylinstorage/fix-rbd-watch

fix rbd watch command for v2 image

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge remote-tracking branch 'origin/wip-bi-sharding-3'
Yehuda Sadeh [Mon, 19 Jan 2015 17:33:46 +0000 (09:33 -0800)]
Merge remote-tracking branch 'origin/wip-bi-sharding-3'

10 years agoMerge pull request #3396 from leseb/doc-openstack-fix-glance
Josh Durgin [Mon, 19 Jan 2015 16:38:37 +0000 (08:38 -0800)]
Merge pull request #3396 from leseb/doc-openstack-fix-glance

doc: Fix OpenStack Glance configuration

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3401 from FrankYu/master
Sage Weil [Mon, 19 Jan 2015 16:34:32 +0000 (08:34 -0800)]
Merge pull request #3401 from FrankYu/master

Doc: rbd-snapshot: Fix the typo

10 years agoMerge pull request #3374 from dachary/wip-mailmap
Loic Dachary [Mon, 19 Jan 2015 16:06:46 +0000 (17:06 +0100)]
Merge pull request #3374 from dachary/wip-mailmap

mailmap updates

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agotools: output per-event errors from recover dentries 3338/head
John Spray [Mon, 19 Jan 2015 15:16:46 +0000 (15:16 +0000)]
tools: output per-event errors from recover dentries

10 years agotools: handle hardlinks in recover_dentries
John Spray [Fri, 16 Jan 2015 10:53:57 +0000 (10:53 +0000)]
tools: handle hardlinks in recover_dentries

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agotools: recover_dentries efficiency
John Spray [Thu, 15 Jan 2015 12:13:24 +0000 (12:13 +0000)]
tools: recover_dentries efficiency

Avoid a redundant stat, and gather updates to a frag
into a single OMAP get/set.

Still could be heaps more efficient in the case of
many updates to the same dirs by adding in a little
cache and batching the updates.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agotweak comment wording in recover_dentries
John Spray [Tue, 13 Jan 2015 18:15:53 +0000 (18:15 +0000)]
tweak comment wording in recover_dentries

10 years agofixup some oversized lines
John Spray [Tue, 13 Jan 2015 16:38:55 +0000 (16:38 +0000)]
fixup some oversized lines

10 years agofix handling of io.read retval
John Spray [Tue, 13 Jan 2015 16:07:26 +0000 (16:07 +0000)]
fix handling of io.read retval

(it returns length read, which was falling through as
a spurious nonzero "error")
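
A minimal sketch of the distinction this fix relies on, assuming a read-style call that returns a negative errno on failure and the number of bytes read on success. The helper name read_status is hypothetical and is not taken from the tool's code.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper: converts a read-style return value into a status code.
// Assumption: r < 0 is a negative errno, r >= 0 is the number of bytes read.
int read_status(int r) {
  if (r < 0) {
    return r;  // genuine failure: propagate the negative errno unchanged
  }
  return 0;    // a byte count (including 0) is success, not an error
}
```

Treating any nonzero return as an error would misreport every successful non-empty read, which is the spurious "error" described above.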

10 years agotools: remove duplicated InoTable encoding
John Spray [Tue, 13 Jan 2015 16:06:32 +0000 (16:06 +0000)]
tools: remove duplicated InoTable encoding

...and add a method to InoTable so that we can
artificially acquire inodes.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoJournalTool: handle corrupt fnodes
John Spray [Tue, 13 Jan 2015 15:43:14 +0000 (15:43 +0000)]
JournalTool: handle corrupt fnodes

10 years agotools/cephfs: add recover_dentries to journaltool
John Spray [Wed, 17 Dec 2014 14:06:53 +0000 (14:06 +0000)]
tools/cephfs: add recover_dentries to journaltool

This is intended as a comparatively safe recovery
operation, where we compare the versions
of journalled dentries with backing store dentries,
and write into the backing store only when the
existing contents are older than the journal
or invalid.

Fixes: #9883
Signed-off-by: John Spray <john.spray@redhat.com>
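
The write rule described above can be sketched as a single predicate. This is only an illustration under stated assumptions: BackingDentry and should_write_from_journal are hypothetical names, and "invalid" is modelled as one boolean covering a missing or undecodable backing-store entry.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view of a dentry as found in the backing store.
struct BackingDentry {
  bool valid;             // false if the entry is missing or could not be decoded
  std::uint64_t version;  // only meaningful when valid is true
};

// Returns true when the journalled dentry should be written to the backing
// store: the existing contents are invalid, or older than the journal.
bool should_write_from_journal(std::uint64_t journal_version,
                               const BackingDentry& backing) {
  if (!backing.valid) {
    return true;
  }
  return backing.version < journal_version;
}
```

An equal or newer backing-store version is left untouched, which is what makes the operation comparatively safe.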
10 years agoDoc: rbd-snapshot: Fix the typo 3401/head
Frank Yu [Mon, 19 Jan 2015 12:19:25 +0000 (20:19 +0800)]
Doc: rbd-snapshot: Fix the typo

Signed-off-by: Frank Yu <flyxiaoyu@gmail.com>
10 years agorbd: fix bug about rbd watch command 3400/head
Yunchuan Wen [Mon, 19 Jan 2015 08:51:58 +0000 (08:51 +0000)]
rbd: fix bug about rbd watch command

the header oid should be prefix+image_id, rather than prefix+image_name

Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
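
A small sketch of the naming rule stated in this commit. The prefix value "rbd_header." and the function name v2_header_oid are assumptions made for illustration; the commit itself only says the oid is prefix+image_id.

```cpp
#include <cassert>
#include <string>

// Assumed prefix for format 2 header objects (illustrative value).
const std::string kHeaderPrefix = "rbd_header.";

// Builds the header object name from the image id, not the image name.
std::string v2_header_oid(const std::string& image_id) {
  return kHeaderPrefix + image_id;
}
```

Passing the user-visible image name instead of the id would produce the name of an object that does not exist, so a watch on it would never see the real header.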
10 years agoMerge pull request #3397 from liewegas/wip-prealloc
Sage Weil [Mon, 19 Jan 2015 04:46:31 +0000 (20:46 -0800)]
Merge pull request #3397 from liewegas/wip-prealloc

mon: fix globalid when prealloc value is larger than max

10 years agomon: handle case where mon_globalid_prealloc > max_global_id 3397/head
Sage Weil [Mon, 19 Jan 2015 00:49:20 +0000 (16:49 -0800)]
mon: handle case where mon_globalid_prealloc > max_global_id

This triggers with the new larger mon_globalid_prealloc value.  It didn't
trigger on the existing cluster I tested on because it already had a very
large max.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agodoc: Fix OpenStack Glance configuration 3396/head
Sébastien Han [Sun, 18 Jan 2015 21:55:57 +0000 (22:55 +0100)]
doc: Fix OpenStack Glance configuration

Glance has not completely moved to 'store' yet so we need to configure
the store in the DEFAULT section as well.

Fixes: #10478
Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
10 years agoMerge pull request #3361 from wonzhq/watch-notify
Sage Weil [Sun, 18 Jan 2015 18:44:35 +0000 (10:44 -0800)]
Merge pull request #3361 from wonzhq/watch-notify

osd/ReplicatedPG: force promotion for watch/notify ops

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3395 from liewegas/wip-cache-none
Sage Weil [Sun, 18 Jan 2015 18:43:58 +0000 (10:43 -0800)]
Merge pull request #3395 from liewegas/wip-cache-none

osd: skip all of maybe_handle_cache if cachemode is none

10 years agoMerge pull request #3315 from majianpeng/fix6
Sage Weil [Sun, 18 Jan 2015 18:42:31 +0000 (10:42 -0800)]
Merge pull request #3315 from majianpeng/fix6

bug fix

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3385 from majianpeng/misc
Sage Weil [Sun, 18 Jan 2015 18:41:55 +0000 (10:41 -0800)]
Merge pull request #3385 from majianpeng/misc

Misc

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3194 from dachary/wip-10350-erasure-code-choose-tries
Sage Weil [Sun, 18 Jan 2015 18:41:31 +0000 (10:41 -0800)]
Merge pull request #3194 from dachary/wip-10350-erasure-code-choose-tries

resolve and document most common erasure coded pool pain points

Documentation-Reviewed-by: Italo Santos <okdokk@gmail.com>
10 years agomon: change mon_globalid_prealloc to 10000 (from 100)
Sage Weil [Sun, 18 Jan 2015 18:39:25 +0000 (10:39 -0800)]
mon: change mon_globalid_prealloc to 10000 (from 100)

100 ids (i.e. 100 session authentications) can be consumed quite quickly if
the monitor is being queried by the CLI via scripts or on a large cluster,
especially if the propose interval is long (many seconds).  These live in
a 64-bit value and are only "lost" if we have a mon election before they
are consumed, so there's no real risk here.

Backport: giant, firefly
Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomon: silently ignore mark_down, mark_disposable on AnonConnection
Sage Weil [Sun, 18 Jan 2015 18:18:13 +0000 (10:18 -0800)]
mon: silently ignore mark_down, mark_disposable on AnonConnection

This mirrors 0a49db8e6fa141a36ca964e68017d02b81ae7a3c but was not captured
by 9fff0c53bdc7bb332df1a710da3de71e3c41bec7.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge remote-tracking branch 'gh/wip-xio'
Sage Weil [Sun, 18 Jan 2015 18:37:19 +0000 (10:37 -0800)]
Merge remote-tracking branch 'gh/wip-xio'

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3391 from liewegas/wip-pool-delete
Sage Weil [Sun, 18 Jan 2015 18:34:59 +0000 (10:34 -0800)]
Merge pull request #3391 from liewegas/wip-pool-delete

mon: global option to prevent pool deletion

Reviewed-by: John Spray <john.spray@redhat.com>
10 years agoosd/ReplicatedPG: skip all of maybe_handle_cache if caching is off 3395/head
Sage Weil [Sat, 17 Jan 2015 18:30:47 +0000 (10:30 -0800)]
osd/ReplicatedPG: skip all of maybe_handle_cache if caching is off

Return quickly and avoid all of the checks.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoAsyncConnection: Fix memory leak for AsyncConnection 3211/head
Haomai Wang [Thu, 15 Jan 2015 07:04:48 +0000 (15:04 +0800)]
AsyncConnection: Fix memory leak for AsyncConnection

Each *_handler stores a reference to the AsyncConnection, so the reference
needs to be reset explicitly.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoMerge remote-tracking branch 'origin/next'
Josh Durgin [Fri, 16 Jan 2015 22:40:27 +0000 (14:40 -0800)]
Merge remote-tracking branch 'origin/next'

10 years agoMerge remote-tracking branch 'origin/wip-10271' into next
Josh Durgin [Fri, 16 Jan 2015 22:33:59 +0000 (14:33 -0800)]
Merge remote-tracking branch 'origin/wip-10271' into next

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3292 from kylinstorage/rbd-merge-diff-v2
Josh Durgin [Fri, 16 Jan 2015 20:08:02 +0000 (12:08 -0800)]
Merge pull request #3292 from kylinstorage/rbd-merge-diff-v2

rbd: merge diff files

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3318 from XinzeChi/wip-scrub
David Zafman [Fri, 16 Jan 2015 18:36:41 +0000 (10:36 -0800)]
Merge pull request #3318 from XinzeChi/wip-scrub

osd: support schedule scrub between some time defined by users

Reviewed-by: David Zafman <dzafman@redhat.com>

10 years agoMerge pull request #3090 from ceph/wip-mon-fixes
João Eduardo Luís [Fri, 16 Jan 2015 18:32:54 +0000 (18:32 +0000)]
Merge pull request #3090 from ceph/wip-mon-fixes

mon: fix issues with mixed-version monitors features

Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agorgw: bilog marker related fixes
Yehuda Sadeh [Fri, 16 Jan 2015 01:30:24 +0000 (17:30 -0800)]
rgw: bilog marker related fixes

Fix the way we parse the marker. Instead of specifying whether it's a
sharded or not sharded bucket, we pass a shard_id. If the string itself
points to a single shard, we'll use the passed shard_id, otherwise we'll
parse the string and determine the shard id by that. In this way when
referencing a single shard we can get the marker with either shard id
specified or not. This works with the non-shard case too.
Adjust the bilog listing function, set it to work with the new
interface. It was broken before, and there are multiple fixes to it.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
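
The parsing rule above can be sketched as follows. The marker layout is an assumption: this sketch supposes a shard id may be embedded as "<shard_id>#<marker>", and the names ShardMarker and parse_marker are hypothetical rather than the rgw implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: which shard the marker belongs to, and the marker.
struct ShardMarker {
  int shard_id;
  std::string marker;
};

// Assumption: an embedded shard id is separated from the marker by '#'.
// If no shard id is embedded, fall back to the shard_id passed by the caller.
ShardMarker parse_marker(const std::string& s, int passed_shard_id) {
  const std::string::size_type pos = s.find('#');
  if (pos == std::string::npos) {
    return ShardMarker{passed_shard_id, s};
  }
  // std::stoi throws if the prefix is not a number; a real parser would
  // report that as an error instead.
  return ShardMarker{std::stoi(s.substr(0, pos)), s.substr(pos + 1)};
}
```

With this shape, a caller referencing a single shard can supply the shard id either inside the string or as the separate argument, and a non-sharded bucket simply never embeds one.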
10 years agoMerge pull request #3390 from ceph/wip-librbd-coverity
Josh Durgin [Fri, 16 Jan 2015 16:51:51 +0000 (08:51 -0800)]
Merge pull request #3390 from ceph/wip-librbd-coverity

librbd: fix coverity false-positives for tests

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agomon: Monitor: return 'required_features' on get_required_features() 3090/head
Joao Eduardo Luis [Thu, 4 Dec 2014 18:09:40 +0000 (18:09 +0000)]
mon: Monitor: return 'required_features' on get_required_features()

We were returning 'quorum_features' instead.  This would lead to funny
and weird behavior.  I hate funny.

Backport: emperor,firefly,giant

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agomon: Elector: output features in handle_propose()
Joao Eduardo Luis [Thu, 4 Dec 2014 18:08:56 +0000 (18:08 +0000)]
mon: Elector: output features in handle_propose()

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agomon: Elector: put dangling message reference
Joao Eduardo Luis [Thu, 4 Dec 2014 18:07:23 +0000 (18:07 +0000)]
mon: Elector: put dangling message reference

Backport: emperor,firefly,giant

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agomon: mkfs compatset may be different from runtime compatset
Joao Eduardo Luis [Thu, 4 Dec 2014 18:34:23 +0000 (18:34 +0000)]
mon: mkfs compatset may be different from runtime compatset

When we create a monitor we set a given number of compat features on
disk to clearly state the features a given monitor supports -- mostly to
break backward compatibility when such compatibility cannot be
guaranteed.

However, we may wish to toggle some features during runtime; e.g., wait
for all the monitors in the quorum to support a given feature before
flipping a switch and state that all monitors now require feature X.

We are already flipping those switches during runtime, but we weren't
allowing the monitor to set a subset of those features during mkfs.
While the initial approach worked fine with clusters being upgraded and
fresh clusters, it could become weird in a mixed-version environment.

Backport: emperor,firefly,giant

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agomon/OSDMonitor: require mon_allow_pool_delete = true to remove pools 3391/head
Sage Weil [Fri, 16 Jan 2015 15:54:22 +0000 (07:54 -0800)]
mon/OSDMonitor: require mon_allow_pool_delete = true to remove pools

This is a simple safety check.  Since we default to true it is currently
opt-in.

Backport: giant, firefly
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3384 from liewegas/wip-crush-tests
Loic Dachary [Fri, 16 Jan 2015 10:49:02 +0000 (11:49 +0100)]
Merge pull request #3384 from liewegas/wip-crush-tests

crush: minor reorg of crush unit tests

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agoosd: fix journal header.committed_up_to 3375/head
Xinze Chi [Fri, 16 Jan 2015 08:49:09 +0000 (08:49 +0000)]
osd: fix journal header.committed_up_to

Signed-off-by: Xinze Chi <xmdxcxz@gmail.com>
10 years agotest: add test for osd scrub 3318/head
Xinze Chi [Fri, 16 Jan 2015 08:31:16 +0000 (08:31 +0000)]
test: add test for osd scrub

Signed-off-by: Xinze Chi <xmdxcxz@gmail.com>
10 years agoosd: support schedule scrub between some time defined by users
Xinze Chi [Fri, 16 Jan 2015 08:30:55 +0000 (08:30 +0000)]
osd: support schedule scrub between some time defined by users

Signed-off-by: Xinze Chi <xmdxcxz@gmail.com>
10 years agotest: Using different filename for different test case. 3385/head
Jianpeng Ma [Fri, 16 Jan 2015 08:14:17 +0000 (16:14 +0800)]
test: Using different filename for different test case.

Some test cases use a tmp file, but they all used the same file, created
in the same directory. Running them in parallel causes errors, so each
test case now uses its own tmp file.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
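
One way to give each test case its own file is sketched below with POSIX mkstemp. This is an illustration rather than the approach taken in the commit, and the helper name make_test_tmpfile and the /tmp location are assumptions.

```cpp
#include <cassert>
#include <stdlib.h>   // mkstemp (POSIX)
#include <string>
#include <unistd.h>   // close, unlink
#include <vector>

// Hypothetical helper: creates a uniquely named, empty temp file for one
// test case and returns its path, or an empty string on failure.
std::string make_test_tmpfile(const std::string& test_name) {
  const std::string tmpl = "/tmp/" + test_name + ".XXXXXX";
  std::vector<char> buf(tmpl.begin(), tmpl.end());
  buf.push_back('\0');             // mkstemp needs a writable C string
  const int fd = mkstemp(buf.data());
  if (fd < 0) {
    return std::string();
  }
  close(fd);                       // the caller reopens the file by path
  return std::string(buf.data());
}
```

Because mkstemp creates the file atomically under a fresh name, two test cases running in parallel cannot end up sharing a path even if they pass the same test_name.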
10 years agoStriep: s/OSDExtent/ObjectExtent
Jianpeng Ma [Thu, 15 Jan 2015 06:20:38 +0000 (14:20 +0800)]
Striep: s/OSDExtent/ObjectExtent

OSDExtent already removed.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
10 years agolibrados: clean up code.
Jianpeng Ma [Mon, 12 Jan 2015 08:34:55 +0000 (16:34 +0800)]
librados: clean up code.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
10 years agolibrbd: clean up code.
Jianpeng Ma [Mon, 12 Jan 2015 06:53:08 +0000 (14:53 +0800)]
librbd: clean up code.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
10 years agoMerge pull request #3335 from ceph/wip-cephfs-tabletool
Gregory Farnum [Fri, 16 Jan 2015 05:58:17 +0000 (21:58 -0800)]
Merge pull request #3335 from ceph/wip-cephfs-tabletool

Create cephfs-table-tool

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoMerge pull request #3383 from ceph/wip-10552
Gregory Farnum [Fri, 16 Jan 2015 05:54:56 +0000 (21:54 -0800)]
Merge pull request #3383 from ceph/wip-10552

client: fix getting zero-length xattr

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoMerge pull request #3358 from ceph/wip-mon-propose
Sage Weil [Fri, 16 Jan 2015 05:47:42 +0000 (21:47 -0800)]
Merge pull request #3358 from ceph/wip-mon-propose

mon: improve paxos proposals

Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoMerge pull request #3342 from ceph/wip-10311
Sage Weil [Fri, 16 Jan 2015 05:45:56 +0000 (21:45 -0800)]
Merge pull request #3342 from ceph/wip-10311

rgw: only keep track for cleanup of rados objects that were written

Reviewed-by: Ray Lv <xiangyulv@gmail.com>
10 years agoMerge pull request #3362 from FrankYu/master
Sage Weil [Fri, 16 Jan 2015 05:24:02 +0000 (21:24 -0800)]
Merge pull request #3362 from FrankYu/master

Doc: Fix the indentation in doc/rbd/rbd-snapshot.rst

10 years agoMerge pull request #3346 from timfreund/update-radosgw-python-swift-example
Sage Weil [Fri, 16 Jan 2015 05:15:06 +0000 (21:15 -0800)]
Merge pull request #3346 from timfreund/update-radosgw-python-swift-example

doc: Replace cloudfiles with swiftclient in Python Swift example

10 years agoMerge pull request #3359 from ceph/wip-mon-converter
Sage Weil [Fri, 16 Jan 2015 05:13:34 +0000 (21:13 -0800)]
Merge pull request #3359 from ceph/wip-mon-converter

drop ceph_mon_store_converter

Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoMerge pull request #3373 from jdurgin/wip-rados-ls-dups
Sage Weil [Fri, 16 Jan 2015 05:13:06 +0000 (21:13 -0800)]
Merge pull request #3373 from jdurgin/wip-rados-ls-dups

qa: ignore duplicates in rados ls

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agocrush: move two crush tests over 3384/head
Sage Weil [Fri, 16 Jan 2015 05:10:31 +0000 (21:10 -0800)]
crush: move two crush tests over

CrushWrapper handles map manipulation, crush.cc tests the placement.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agocrush: rename unit tests
Sage Weil [Fri, 16 Jan 2015 05:06:57 +0000 (21:06 -0800)]
crush: rename unit tests

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoclient: fix getting zero-length xattr 3383/head
Yan, Zheng [Fri, 16 Jan 2015 02:16:44 +0000 (10:16 +0800)]
client: fix getting zero-length xattr

Fixes: #10552
Signed-off-by: Yan, Zheng <zyan@redhat.com>
10 years agoMerge pull request #3378 from xinxinsh/wip-cleanup
Sage Weil [Fri, 16 Jan 2015 01:26:01 +0000 (17:26 -0800)]
Merge pull request #3378 from xinxinsh/wip-cleanup

cleanup unused variables

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agotools: create cephfs-table-tool 3335/head
John Spray [Fri, 2 Jan 2015 17:48:25 +0000 (17:48 +0000)]
tools: create cephfs-table-tool

It was unnatural to shoehorn resetting tables
into the journaltool.  This new tool initially
can simply dump or reset the session/snap/ino
tables, and would also be a place for any
more complex operations in future.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: give MDSTables a `rank` attribute
John Spray [Fri, 16 Jan 2015 00:02:00 +0000 (00:02 +0000)]
mds: give MDSTables a `rank` attribute

...so that they (like the new SessionMapStore)
can be used outside of a live MDS in tool code.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: abstract SessionMapStore from SessionMap
John Spray [Fri, 16 Jan 2015 00:00:56 +0000 (00:00 +0000)]
mds: abstract SessionMapStore from SessionMap

This is similar to what I did for InodeStore a while back:
introduce a logical separation between the persisted attributes
(and their encoding) and the live/runtime behavioural code.  This
results in a handy SessionMapStore class that can be used for
encode/decode from tools.

Also give it a reset_state method so that it matches the
prototype of the MDSTable subclasses for the benefit of
cephfs-table-tool.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoerasure-code: tests use different pool/profile names 3194/head
Loic Dachary [Tue, 6 Jan 2015 20:55:25 +0000 (21:55 +0100)]
erasure-code: tests use different pool/profile names

Use different erasure coded pool names and profiles to avoid deletion /
creation races. The more expensive alternative is to run a different
cluster for each test.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agodocumentation: add troubleshooting erasure coded PGs section
Loic Dachary [Wed, 17 Dec 2014 15:08:29 +0000 (16:08 +0100)]
documentation: add troubleshooting erasure coded PGs section

Add a new section to the PG troubleshooting section that covers the most
common problems reported when an erasure coded pool fails to properly
map PGs to enough OSDs.

http://tracker.ceph.com/issues/10350 Fixes: #10350

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoerasure-code: set max_size to chunk_count() instead of 20
Loic Dachary [Thu, 18 Dec 2014 00:25:54 +0000 (01:25 +0100)]
erasure-code: set max_size to chunk_count() instead of 20

The ruleset created for an erasure coded pool has max_size set to a
fixed value of 20, which may be incorrect when more than 20 chunks are
needed and lead to obscure errors. Set it to the number of chunks,
i.e. k+m most of the time.

In a cluster with few OSDs (9 for instance), setting max_size to 20
causes performance problems when injecting a new crushmap. The monitor
will call CrushTester::test which tries 1024 mappings for all sizes
ranging from min_size to max_size. Each attempt to map more OSDs than
available will exhaust all retries (50 by default) and it takes a
significant amount of time. In a cluster with 9 OSDs, testing one such
ruleset can take up to 5 seconds.

Since the test blocks the monitor leader, a few erasure coded rulesets
will block the monitor long enough to exceed the timeouts and trigger an
election.

http://tracker.ceph.com/issues/10363 Fixes: #10363

Signed-off-by: Loic Dachary <ldachary@redhat.com>
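
The cost described above can be made concrete with a little arithmetic. The sketch below counts the mapping attempts that cannot succeed because they ask for more OSDs than exist. The function name doomed_mappings is hypothetical, and the min_size of 3 used in the note afterwards is an assumed value, not one stated in the commit.

```cpp
#include <algorithm>
#include <cassert>

// Counts test mappings that request more OSDs than the cluster has, given
// that every size from min_size to max_size is tried mappings_per_size times.
long doomed_mappings(int min_size, int max_size, int num_osds,
                     long mappings_per_size = 1024) {
  assert(min_size <= max_size);
  const int first_doomed = std::max(min_size, num_osds + 1);
  if (first_doomed > max_size) {
    return 0;  // every tested size fits in the cluster
  }
  return static_cast<long>(max_size - first_doomed + 1) * mappings_per_size;
}
```

With 9 OSDs, an assumed min_size of 3 and max_size of 20, sizes 10 through 20 cannot be satisfied, which gives 11 * 1024 = 11264 attempts that each exhaust all retries. With max_size set to k+m = 9 the count drops to zero.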
10 years agocrush: set_choose_tries = 100 for erasure code rulesets
Loic Dachary [Wed, 17 Dec 2014 15:06:55 +0000 (16:06 +0100)]
crush: set_choose_tries = 100 for erasure code rulesets

It is common for people to try to map 9 OSDs in a Ceph cluster that has only
9 OSDs in total. The default tries (50) will frequently lead to bad mappings for
this use case. Changing it to 100 makes no significant CPU performance
difference, as tested manually by running crushtool on one million
mappings.

http://tracker.ceph.com/issues/10353 Fixes: #10353

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agocrush: update tries statistics for indep rules
Loic Dachary [Wed, 17 Dec 2014 12:43:41 +0000 (13:43 +0100)]
crush: update tries statistics for indep rules

http://tracker.ceph.com/issues/10349 Fixes: #10349

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoerasure-code: update jerasure/gf-complete submodules 3334/head
Loic Dachary [Fri, 9 Jan 2015 12:39:24 +0000 (13:39 +0100)]
erasure-code: update jerasure/gf-complete submodules

jerasure:

git log --no-merges --pretty=%s \
  8fe20c6608385d6a1f38db89aec5cba85ccf04ac..02731df4c1eae1819c4453c9d3ab6d408cadd085
use assert(0) instead of exit(1)

gf-complete:

git log --no-merges --pretty=%s \
  39768c55bb00917691364f6f9f7bf688235aedf8..d384952c68a64d93ac7af6341d5519ea5d2958b9
gitignore: add src/.dirstamp
use assert(0) instead of exit(1)

http://tracker.ceph.com/issues/10216 Fixes: #10216

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoMerge pull request #3380 from trhoden/doc_cephextras
Alfredo Deza [Thu, 15 Jan 2015 19:46:50 +0000 (14:46 -0500)]
Merge pull request #3380 from trhoden/doc_cephextras

doc: add cases where ceph-extras is not needed

Reviewed-by: Alfredo Deza <adeza@redhat.com>
10 years agodoc: add cases where ceph-extras is not needed 3380/head
Travis Rhoden [Thu, 15 Jan 2015 19:39:01 +0000 (14:39 -0500)]
doc: add cases where ceph-extras is not needed

The Ceph Extras repo is not needed on EL7 distributions or
Fedora

http://tracker.ceph.com/issues/9793 Refs: #9793

Signed-off-by: Travis Rhoden <trhoden@redhat.com>
10 years agoMerge pull request #3379 from ceph/wip-mon-drop-conversion
Sage Weil [Thu, 15 Jan 2015 19:22:16 +0000 (11:22 -0800)]
Merge pull request #3379 from ceph/wip-mon-drop-conversion

mon: drop store conversion code

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3377 from ceph/wip-fail-idempotent
Gregory Farnum [Thu, 15 Jan 2015 19:21:18 +0000 (11:21 -0800)]
Merge pull request #3377 from ceph/wip-fail-idempotent

mon/MDSMonitor: make 'mds fail' idempotent for IDs

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoos/FileJournal: Fix journal write fail, align for direct io
Sage Weil [Thu, 15 Jan 2015 19:20:18 +0000 (11:20 -0800)]
os/FileJournal: Fix journal write fail, align for direct io

When journal_zero_on_create is set to true, osd mkfs fails while zeroing the journal.
The journal is opened with O_DIRECT, so the buffer must be aligned to the block size.

Backport: giant, firefly, dumpling
Signed-off-by: Xie Rui <875016668@qq.com>
Reviewed-by: Sage Weil <sage@redhat.com>
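
A minimal sketch of the alignment requirement, assuming a power-of-two block size. It uses POSIX posix_memalign to obtain a zero-filled buffer whose address and length are both multiples of the block size. The helper names are hypothetical and this is not the FileJournal code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdlib.h>  // posix_memalign, free (POSIX)

// Rounds len up to the next multiple of block_size.
std::size_t round_up(std::size_t len, std::size_t block_size) {
  return (len + block_size - 1) / block_size * block_size;
}

// Returns a zeroed buffer suitable for O_DIRECT writes, or nullptr on
// failure. The caller releases it with free().
void* alloc_zeroed_aligned(std::size_t len, std::size_t block_size) {
  // posix_memalign requires a power-of-two alignment.
  assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
  void* buf = nullptr;
  const std::size_t padded = round_up(len, block_size);
  if (posix_memalign(&buf, block_size, padded) != 0) {
    return nullptr;
  }
  std::memset(buf, 0, padded);
  return buf;
}
```

A buffer from plain malloc or from the stack carries no such guarantee, and writes from it through an O_DIRECT descriptor can fail, which matches the mkfs failure described above.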
10 years agomon: encode stashed monmap with all features
Jerry7X [Wed, 7 Jan 2015 06:29:02 +0000 (14:29 +0800)]
mon: encode stashed monmap with all features

The latest_monmap that we stash is only used locally; the encoded bl is never shared, which means we should just use CEPH_FEATURES_ALL all of the time.

Fixes: #5203
Backport: giant, firefly
Signed-off-by: Xie Rui <875016668@qq.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoAsyncConnection: Fix deadlock if socket failed when replacing
Haomai Wang [Wed, 14 Jan 2015 18:32:25 +0000 (02:32 +0800)]
AsyncConnection: Fix deadlock if socket failed when replacing

If a client reconnects to an endpoint that was already marked down, the
server side detects a remote reset and resets the existing connection.
Meanwhile, the client-side connection receives the retry tag and tries to
reconnect, sending connect_msg with connect_seq(1). That meets the
server-side connection's connect_seq(0), which makes the server side reply
with the reset tag. The connection then loops between the reset and retry
tags.

One solution is to close the server-side connection if connect_seq == 0 and
there is no message in the queue. But that triggers another problem:
1. the client tries to connect to an already mark_down endpoint
2. client->send_message
3. the server side accepts the new socket, replaces the old one and replies
with the retry tag
4. the client increments connect_seq, but a socket failure happens
5. the server-side connection detects the failure and closes because
connect_seq == 0 and there is no message
6. the client reconnects; the server side has no existing connection and
meets "connect.connect_seq > 0", so it replies with the RESET tag
7. the client discards all messages in its queue, so a message that was
never delivered is lost

This solution adds a new "once_session_reset" flag to indicate whether the
"existing" connection has been reset. The server side's connect_seq is 0
only when it has never connected successfully or when the session has been
reset, so we only need to reply with the RESET tag if the session has been
reset.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoEvent: Fix typo
Haomai Wang [Wed, 14 Jan 2015 15:17:23 +0000 (23:17 +0800)]
Event: Fix typo

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoAsyncConnection: Don't increment connect_seq if connect failed
Haomai Wang [Wed, 14 Jan 2015 14:51:58 +0000 (22:51 +0800)]
AsyncConnection: Don't increment connect_seq if connect failed

If a connection sent many messages that were not acked and was then marked
down, the next new connection issues a connect_msg with connect_seq=0. The
server side needs to detect "connect_seq==0 && existing->connect_seq >0" so
that it resets out_q and detects the remote reset. But if the client side
failed before sending connect_msg, it now issues a connect_msg with a
non-zero connect_seq, so the server side cannot detect the remote reset.
The server side then replies with a non-zero in_seq, which crashes the client.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoasync: adjust test_msgr and normalize log output format
Haomai Wang [Wed, 14 Jan 2015 07:01:37 +0000 (15:01 +0800)]
async: adjust test_msgr and normalize log output format

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoAsyncConnection: Fix replacing cause original state lossy
Haomai Wang [Wed, 14 Jan 2015 03:14:16 +0000 (11:14 +0800)]
AsyncConnection: Fix replacing cause original state lossy

Because AsyncConnection does not enter the "open" tag from the "replace" tag,
the code that sets reply_tag is not used when entering the "open" tag.
This causes the server side to discard out_q and lose state.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoAsyncConnection: Don't discard out_q and unregister when replacing
Haomai Wang [Tue, 13 Jan 2015 15:52:27 +0000 (23:52 +0800)]
AsyncConnection: Don't discard out_q and unregister when replacing

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agotest_msgr: Add SyntheticInjectTest
Haomai Wang [Tue, 13 Jan 2015 15:26:10 +0000 (23:26 +0800)]
test_msgr: Add SyntheticInjectTest

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoAsyncConnection: Add ms_inject_* to AsyncConnection
Haomai Wang [Tue, 13 Jan 2015 14:18:02 +0000 (22:18 +0800)]
AsyncConnection: Add ms_inject_* to AsyncConnection

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoAsyncConnection: Enhance replace process
Haomai Wang [Tue, 13 Jan 2015 03:54:54 +0000 (11:54 +0800)]
AsyncConnection: Enhance replace process

Make handle_connect_msg follow the lock rule: release any held lock before
acquiring the messenger's lock. Otherwise, a deadlock can happen.

Tighten the condition check after locking, because the connection's state may
change between unlocking itself and locking again.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
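
The lock rule and the re-check can be sketched with standard mutexes. Everything here is a simplified illustration: the Messenger and Connection structs, the State enum and register_if_still_connecting are hypothetical stand-ins, not the AsyncMessenger code.

```cpp
#include <cassert>
#include <mutex>

enum class State { Connecting, Open, Closed };

struct Messenger {
  std::mutex lock;
  int registered = 0;
};

struct Connection {
  std::mutex lock;
  State state = State::Connecting;
};

// Registers conn with msgr if it is still connecting. Lock order is always
// messenger first, then connection, so the connection lock is dropped before
// the messenger lock is taken.
bool register_if_still_connecting(Messenger& msgr, Connection& conn) {
  std::unique_lock<std::mutex> conn_lock(conn.lock);
  if (conn.state != State::Connecting) {
    return false;
  }
  conn_lock.unlock();                              // release our own lock first
  std::lock_guard<std::mutex> msgr_lock(msgr.lock);
  conn_lock.lock();                                // re-acquire in the safe order
  if (conn.state != State::Connecting) {
    return false;                                  // state changed while unlocked
  }
  ++msgr.registered;
  conn.state = State::Open;
  return true;
}
```

The second state check matters because another thread may have closed or replaced the connection during the window in which no lock was held.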
10 years agoAsyncConnection: set state_offset=0 in case of reuse this connection
Haomai Wang [Mon, 12 Jan 2015 14:34:59 +0000 (22:34 +0800)]
AsyncConnection: set state_offset=0 in case of reuse this connection

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoEvent: Fix incorrect memset
Haomai Wang [Mon, 12 Jan 2015 14:34:38 +0000 (22:34 +0800)]
Event: Fix incorrect memset

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agotest_msgr: Add SyntheticWorkload to do message measurement
Haomai Wang [Mon, 12 Jan 2015 04:14:39 +0000 (12:14 +0800)]
test_msgr: Add SyntheticWorkload to do message measurement

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agoAsyncConnection: Don't alloc buffer when reenter "READ_FRONT" state
Haomai Wang [Sun, 11 Jan 2015 11:33:51 +0000 (19:33 +0800)]
AsyncConnection: Don't alloc buffer when reenter "READ_FRONT" state

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
10 years agotest_msgr: Add test for a message with large payload
Haomai Wang [Sat, 10 Jan 2015 14:07:05 +0000 (22:07 +0800)]
test_msgr: Add test for a message with large payload

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>