git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 years agoMerge pull request #3290 from ceph/wip-da-SCA-20150102
Samuel Just [Tue, 13 Jan 2015 18:54:45 +0000 (10:54 -0800)]
Merge pull request #3290 from ceph/wip-da-SCA-20150102

Coverity and SCA fixes

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3302 from ceph/wip-9956
Samuel Just [Tue, 13 Jan 2015 18:54:21 +0000 (10:54 -0800)]
Merge pull request #3302 from ceph/wip-9956

os/FileStore: verify kernel is new enough before using extsize ioctl

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agoMerge pull request #3305 from majianpeng/fix5
Samuel Just [Tue, 13 Jan 2015 18:53:34 +0000 (10:53 -0800)]
Merge pull request #3305 from majianpeng/fix5

fix bugs about sync_filesystem

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3364 from ceph/wip-quota-test
Gregory Farnum [Tue, 13 Jan 2015 15:08:30 +0000 (07:08 -0800)]
Merge pull request #3364 from ceph/wip-quota-test

qa: set -e explicitly in quota test

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoqa: set -e explicitly in quota test 3364/head
John Spray [Tue, 13 Jan 2015 14:58:57 +0000 (14:58 +0000)]
qa: set -e explicitly in quota test

Previously -e was set only in the hashbang, which meant
that "./quota.sh" was OK, but "sh ./quota.sh" would
just run through, ignoring errors.

Signed-off-by: John Spray <john.spray@redhat.com>
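The difference described in the commit message above can be reproduced with a small sketch. It assumes a POSIX `sh` is available on PATH; the two script bodies are hypothetical stand-ins, not the contents of the real quota.sh.

```python
import os
import subprocess
import tempfile

# Hypothetical stand-ins for the test script: one relies on the hashbang
# for -e, the other sets it explicitly in the script body.
HASHBANG_ONLY = "#!/bin/sh -e\nfalse\necho reached-end\n"
EXPLICIT_SET = "#!/bin/sh\nset -e\nfalse\necho reached-end\n"


def run_with_sh(script_text):
    """Run the script as `sh ./script`, so the hashbang line is only a comment."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.sh")
        with open(path, "w") as f:
            f.write(script_text)
        return subprocess.run(["sh", path], capture_output=True, text=True)


hashbang_result = run_with_sh(HASHBANG_ONLY)
explicit_result = run_with_sh(EXPLICIT_SET)
```

When the interpreter is named on the command line, the hashbang is never consulted, so only the explicit `set -e` stops the script at the failing command.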
10 years agoMerge pull request #3336 from ceph/wip-fs-reset
Gregory Farnum [Tue, 13 Jan 2015 14:47:04 +0000 (06:47 -0800)]
Merge pull request #3336 from ceph/wip-fs-reset

mon: implement `fs reset`

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoMerge pull request #3343 from dachary/wip-10505-centos-parted
Loic Dachary [Tue, 13 Jan 2015 10:07:55 +0000 (11:07 +0100)]
Merge pull request #3343 from dachary/wip-10505-centos-parted

tests: install parted in centos Dockerfile

Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoosd: enable filestore_extsize by default 3302/head
Sage Weil [Mon, 12 Jan 2015 22:00:21 +0000 (14:00 -0800)]
osd: enable filestore_extsize by default

Note that this will only get used if the kernel is new enough; if it is
older than 3.5 the option will get disabled and extsize will not be used
even if the option is set to true.

This partially reverts 01cd3cdc726a3e838bce05b355a021778b4e5db1.

Fixes: #9956
Signed-off-by: Sage Weil <sage@redhat.com>
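The gating rule described above (the option only takes effect on kernels at least as new as 3.5) can be sketched as follows. This is an illustration in Python, not the FileStore code; the function names and the release-string parsing are assumptions.

```python
import re

# Minimum kernel version, taken from the commit message above.
MIN_EXTSIZE_KERNEL = (3, 5)


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '3.2.0-23-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def effective_extsize(option_enabled, release):
    """Hypothetical helper: extsize is used only if the option is on AND the kernel is new enough."""
    return option_enabled and parse_kernel_release(release) >= MIN_EXTSIZE_KERNEL
```

Comparing `(major, minor)` tuples avoids the string-comparison trap where "3.13" would sort before "3.5".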
10 years agoos/FileStore: verify kernel is new enough before using extsize ioctl
Sage Weil [Mon, 12 Jan 2015 21:59:39 +0000 (13:59 -0800)]
os/FileStore: verify kernel is new enough before using extsize ioctl

Old kernels have an XFS bug that exposes uninitialized data when the
extsize hint is set and only partially written.  This is fixed by Linux
commit aff3a9edb7080f69f07fe76a8bd089b3dfa4cb5d, documented in XFS bug
http://oss.sgi.com/bugzilla/show_bug.cgi?id=874, and tested by XFS
test xfs/229 to prevent regressions.

Notably the original bug affects kernel 3.2, which is widely deployed with
ubuntu precise 12.04.

Backport: giant, firefly
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3352 from kylinstorage/fix-10503
Gregory Farnum [Mon, 12 Jan 2015 19:33:02 +0000 (11:33 -0800)]
Merge pull request #3352 from kylinstorage/fix-10503

Fix bug 10503: http://tracker.ceph.com/issues/10503

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoMerge pull request #3203 from majianpeng/fix1
Samuel Just [Mon, 12 Jan 2015 16:39:48 +0000 (08:39 -0800)]
Merge pull request #3203 from majianpeng/fix1

avoid memcopy from librados to caller buffer

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agoMerge pull request #3034 from dachary/wip-10017-erasure-code-repair
Samuel Just [Mon, 12 Jan 2015 16:26:08 +0000 (08:26 -0800)]
Merge pull request #3034 from dachary/wip-10017-erasure-code-repair

erasure code repair when there are two failures

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agoMerge pull request #3148 from mslovy/optimazation_wbthrottle
Samuel Just [Mon, 12 Jan 2015 16:23:26 +0000 (08:23 -0800)]
Merge pull request #3148 from mslovy/optimazation_wbthrottle

os: WBThrottle: optimize the map to unordered_map

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agomon/MDSMonitor: add confirm flag to fs reset 3336/head
John Spray [Mon, 12 Jan 2015 14:52:43 +0000 (14:52 +0000)]
mon/MDSMonitor: add confirm flag to fs reset

This was already in the command map but was not
being checked.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoqa: add `fs reset` to cephtool tests
John Spray [Mon, 12 Jan 2015 13:54:52 +0000 (13:54 +0000)]
qa: add `fs reset` to cephtool tests

This is just a superficial "I can call it" test;
its actual behaviour is checked elsewhere.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomon: implement `fs reset`
John Spray [Mon, 5 Jan 2015 19:34:57 +0000 (19:34 +0000)]
mon: implement `fs reset`

This is for use in CephFS disaster recovery.  When
the metadata pool has been forcibly reset to a single-MDS
metadata tree, we would like to reset the MDSMap to match.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoFix bug 10503: http://tracker.ceph.com/issues/10503 3352/head
Yunchuan Wen [Mon, 12 Jan 2015 05:49:32 +0000 (05:49 +0000)]
Fix bug 10503: http://tracker.ceph.com/issues/10503
ceph-fuse: quota code is not 32-bit safe for vxattr output

Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
10 years agoMerge pull request #2948 from ceph/wip-promote
Sage Weil [Sun, 11 Jan 2015 15:55:08 +0000 (07:55 -0800)]
Merge pull request #2948 from ceph/wip-promote

osd: promote_object separation; proxy read

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoceph_test_rados: add some debug output 2948/head
Sage Weil [Tue, 6 Jan 2015 21:01:45 +0000 (13:01 -0800)]
ceph_test_rados: add some debug output

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: improve proxy read cancelation
Sage Weil [Sun, 7 Dec 2014 01:45:28 +0000 (17:45 -0800)]
osd/ReplicatedPG: improve proxy read cancelation

Avoid taking the PG lock for a canceled read op (if we are lucky).  Recheck
after the lock is taken for good measure.

Signed-off-by: Sage Weil <sage@redhat.com>
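The check-then-recheck pattern described above can be sketched like this. The class and function names are hypothetical, and a plain `threading.Lock` stands in for the PG lock; this is not the ReplicatedPG code.

```python
import threading


class ProxyReadOp:
    """Hypothetical stand-in for an in-flight proxy read."""

    def __init__(self):
        self.canceled = False


def finish_proxy_read(op, pg_lock, on_complete):
    # Cheap unlocked check: skip taking the lock if the op is already canceled.
    if op.canceled:
        return False
    with pg_lock:
        # Recheck under the lock: a cancel may have raced with the check above.
        if op.canceled:
            return False
        on_complete(op)
        return True
```

The first check is only an optimization ("if we are lucky"); correctness comes from the second check made while the lock is held.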
10 years agoosd/ReplicatedPG: put proxy read completion on finisher
Sage Weil [Sun, 7 Dec 2014 01:42:51 +0000 (17:42 -0800)]
osd/ReplicatedPG: put proxy read completion on finisher

We can't use the synchronous completion callbacks (in fast dispatch
context) to do the proxy read completion work.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: tiering: avoid duplicate promotion on proxy read
Zhiqiang Wang [Fri, 28 Nov 2014 08:30:20 +0000 (16:30 +0800)]
osd: tiering: avoid duplicate promotion on proxy read

Do not promote in maybe_handle_cache if a promotion is already under way.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: proxy instead of redirect read in writeback mode when the
Zhiqiang Wang [Wed, 26 Nov 2014 01:57:03 +0000 (09:57 +0800)]
osd: tiering: proxy instead of redirect read in writeback mode when the
cache pool is full

To preserve read op order

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: cancel and requeue proxy read when needed
Zhiqiang Wang [Fri, 21 Nov 2014 06:01:24 +0000 (14:01 +0800)]
osd: tiering: cancel and requeue proxy read when needed

Cancel and requeue proxy read on the following cases:
1) on_shutdown
2) on_change
3) background promotion is done

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
Conflicts:
src/osd/ReplicatedPG.cc
src/osd/ReplicatedPG.h

10 years agoosd/ReplicatedPG: allow reads to proxy etc even if blocked
Sage Weil [Tue, 9 Dec 2014 01:57:13 +0000 (17:57 -0800)]
osd/ReplicatedPG: allow reads to proxy etc even if blocked

If we are not write ordered, continue with cache checks so that we can
(among other things) proxy reads while promoting.

Note that this may reorder reads for clients, but we've decided that's okay.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agotest: add proxy read test
Zhiqiang Wang [Wed, 19 Nov 2014 03:14:46 +0000 (11:14 +0800)]
test: add proxy read test

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: proxy reads during promote
Zhiqiang Wang [Tue, 18 Nov 2014 23:47:32 +0000 (15:47 -0800)]
osd: tiering: proxy reads during promote

wip 9980. Do proxy read and async promotion for writeback.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: add cache mode READPROXY
Zhiqiang Wang [Tue, 18 Nov 2014 08:10:00 +0000 (16:10 +0800)]
osd: tiering: add cache mode READPROXY

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd: tiering: add proxy read support
Zhiqiang Wang [Tue, 18 Nov 2014 07:54:47 +0000 (15:54 +0800)]
osd: tiering: add proxy read support

wip 9979

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years agoosd/ReplicatedPG: separate promotion from the triggering op
Sage Weil [Mon, 17 Nov 2014 22:02:39 +0000 (14:02 -0800)]
osd/ReplicatedPG: separate promotion from the triggering op

Remove the triggering op from the internal promote machinery.

We keep the optional op arg to promote_object() only because we may
block on an object other than the original obc.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: pass promote error to all blocked operations
Sage Weil [Mon, 17 Nov 2014 21:06:29 +0000 (13:06 -0800)]
osd/ReplicatedPG: pass promote error to all blocked operations

This isn't the most elegant strategy, but it is the best we can do
right now.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: drop unnecessary cache_mode checks
Sage Weil [Mon, 17 Nov 2014 20:46:51 +0000 (12:46 -0800)]
osd/ReplicatedPG: drop unnecessary cache_mode checks

This currently enumerates all cache modes except none, and we don't
arrive in this function when caching is disabled.  And creating a whiteout
is not cache_mode dependent.  Simplify!

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: adjust braces (no semantic change)
Sage Weil [Thu, 23 Oct 2014 23:53:14 +0000 (16:53 -0700)]
osd/ReplicatedPG: adjust braces (no semantic change)

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: factor out must_promote case from all cache modes
Sage Weil [Thu, 23 Oct 2014 23:52:22 +0000 (16:52 -0700)]
osd/ReplicatedPG: factor out must_promote case from all cache modes

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: factor out common exists case from all cache modes
Sage Weil [Thu, 23 Oct 2014 23:51:03 +0000 (16:51 -0700)]
osd/ReplicatedPG: factor out common exists case from all cache modes

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd/ReplicatedPG: make op argument to promote_object optional
Sage Weil [Thu, 23 Oct 2014 21:34:36 +0000 (14:34 -0700)]
osd/ReplicatedPG: make op argument to promote_object optional

For now, we still always pass it.  In preparation, however, we modify
promote_object() so that it will work when op is null.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3309 from trociny/wip-9483 3484/head
Josh Durgin [Sat, 10 Jan 2015 23:25:10 +0000 (15:25 -0800)]
Merge pull request #3309 from trociny/wip-9483

OSD: add a get_latest_osdmap command to the admin socket

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoOSD: add a get_latest_osdmap command to the admin socket 3309/head
Mykola Golub [Wed, 7 Jan 2015 11:39:33 +0000 (13:39 +0200)]
OSD: add a get_latest_osdmap command to the admin socket

The command blocks and ensures we have the latest map from the
mon. This is useful in testing and to "unstick" clusters in some
odd situations.

Fixes: #9483, #9484 (maybe)
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
10 years agodoc: Fix PHP librados documentation
Wido den Hollander [Sat, 10 Jan 2015 13:21:27 +0000 (14:21 +0100)]
doc: Fix PHP librados documentation

10 years agoMerge pull request #3348 from ceph/wip-mon-wishlist
Loic Dachary [Sat, 10 Jan 2015 12:54:56 +0000 (13:54 +0100)]
Merge pull request #3348 from ceph/wip-mon-wishlist

doc: mon janitorial list is now a wishlist

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agodoc: mon janitorial list is now a wishlist 3348/head
Joao Eduardo Luis [Sat, 10 Jan 2015 12:08:22 +0000 (12:08 +0000)]
doc: mon janitorial list is now a wishlist

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Sat, 10 Jan 2015 05:43:49 +0000 (21:43 -0800)]
Merge remote-tracking branch 'gh/next'

10 years agoMerge pull request #3327 from ceph/wip-peeringqueue
Sage Weil [Sat, 10 Jan 2015 05:43:04 +0000 (21:43 -0800)]
Merge pull request #3327 from ceph/wip-peeringqueue

osd: fix peering queue bug

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agoMerge pull request #3344 from ceph/wip-librbd-snap-unprotect
Josh Durgin [Sat, 10 Jan 2015 00:56:39 +0000 (16:56 -0800)]
Merge pull request #3344 from ceph/wip-librbd-snap-unprotect

librbd: shadow variable in snap_unprotect

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agorgw: return InvalidAccessKeyId instead of AccessDenied
Yehuda Sadeh [Tue, 16 Dec 2014 20:27:54 +0000 (12:27 -0800)]
rgw: return InvalidAccessKeyId instead of AccessDenied

Fixes: #10334
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 56af795b1046a4c1bfba59e1fefde272bb0e5c1e)

10 years agorgw: return SignatureDoesNotMatch instead of AccessDenied
Yehuda Sadeh [Tue, 16 Dec 2014 17:11:20 +0000 (09:11 -0800)]
rgw: return SignatureDoesNotMatch instead of AccessDenied

Fixes: #10329
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit ef75d720f289ce2e18c0047380a16b7688864560)

10 years agotests: install parted in centos Dockerfile 3343/head
Loic Dachary [Fri, 9 Jan 2015 23:05:53 +0000 (00:05 +0100)]
tests: install parted in centos Dockerfile

It is needed to run root ceph-disk tests.

http://tracker.ceph.com/issues/10505 Fixes: #10505

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agodoc: Clean up pool usage.
John Wilkins [Fri, 9 Jan 2015 22:54:30 +0000 (14:54 -0800)]
doc: Clean up pool usage.

Signed-off-by: John Wilkins <jowilkin@redhat.com>
10 years agodoc: Cleanup RGW pool usage.
John Wilkins [Fri, 9 Jan 2015 22:54:06 +0000 (14:54 -0800)]
doc: Cleanup RGW pool usage.

Signed-off-by: John Wilkins <jowilkin@redhat.com>
10 years agoMerge pull request #3341 from liewegas/wip-10504
Gregory Farnum [Fri, 9 Jan 2015 22:52:19 +0000 (14:52 -0800)]
Merge pull request #3341 from liewegas/wip-10504

client: add ceph version to metadata

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoclient: include ceph and git version in client metadata 3341/head
Sage Weil [Fri, 9 Jan 2015 22:41:34 +0000 (14:41 -0800)]
client: include ceph and git version in client metadata

Fixes: #10504
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3325 from ceph/wip-nits
Josh Durgin [Fri, 9 Jan 2015 22:30:44 +0000 (14:30 -0800)]
Merge pull request #3325 from ceph/wip-nits

allow 'ops' instead of 'dump_ops_in_flight'

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3250 from ceph/wip-10372
Josh Durgin [Fri, 9 Jan 2015 22:12:22 +0000 (14:12 -0800)]
Merge pull request #3250 from ceph/wip-10372

osdc/Objecter: improve pool deletion detection

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoRevert "rgw: switch to new watch/notify API"
Yehuda Sadeh [Fri, 9 Jan 2015 22:02:04 +0000 (14:02 -0800)]
Revert "rgw: switch to new watch/notify API"

This reverts commit dc67cd69604ec4e4df846b818ec739dc7b09a537.

Conflicts:
src/rgw/rgw_rados.cc

10 years agoMerge pull request #3313 from ceph/wip-asok-get-subtrees 3339/head
Gregory Farnum [Fri, 9 Jan 2015 21:45:26 +0000 (13:45 -0800)]
Merge pull request #3313 from ceph/wip-asok-get-subtrees

Add MDS "get subtrees" asok command

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
10 years agoMerge pull request #3312 from ceph/wip-mdscacheobject-const
Gregory Farnum [Fri, 9 Jan 2015 21:29:59 +0000 (13:29 -0800)]
Merge pull request #3312 from ceph/wip-mdscacheobject-const

mds: support constness in MDSCacheObjects

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoMerge pull request #3248 from dachary/wip-table-formatter
Loic Dachary [Fri, 9 Jan 2015 18:34:35 +0000 (19:34 +0100)]
Merge pull request #3248 from dachary/wip-table-formatter

add table to Formatter

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agodoc: Added section to install priorities/preferences.
John Wilkins [Fri, 9 Jan 2015 18:26:51 +0000 (10:26 -0800)]
doc: Added section to install priorities/preferences.

Signed-off-by: John Wilkins <jowilkin@redhat.com>
10 years agoMerge pull request #3324 from mattrichards/rados_translate_op_flag
Josh Durgin [Fri, 9 Jan 2015 17:13:09 +0000 (09:13 -0800)]
Merge pull request #3324 from mattrichards/rados_translate_op_flag

librados: Translate operation flags from C APIs

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3328 from xiaoxichen/memstore_size_u64
Sage Weil [Fri, 9 Jan 2015 15:52:20 +0000 (07:52 -0800)]
Merge pull request #3328 from xiaoxichen/memstore_size_u64

Bump memstore_device_bytes from U32 to U64

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3337 from dachary/wip-10494-disable-unittest-msgr
Haomai Wang [Fri, 9 Jan 2015 15:42:05 +0000 (23:42 +0800)]
Merge pull request #3337 from dachary/wip-10494-disable-unittest-msgr

tests: temporarily disable unittest_msgr

10 years agotests: temporarily disable unittest_msgr 3337/head
Loic Dachary [Fri, 9 Jan 2015 15:17:53 +0000 (16:17 +0100)]
tests: temporarily disable unittest_msgr

http://tracker.ceph.com/issues/10494 Refs: #10494

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoMerge pull request #3331 from dachary/wip-10493-async-port
Haomai Wang [Fri, 9 Jan 2015 14:48:19 +0000 (22:48 +0800)]
Merge pull request #3331 from dachary/wip-10493-async-port

msg: initialize AsyncConnection::port

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
10 years agomds: add asok command for getting subtreemap 3313/head
John Spray [Mon, 5 Jan 2015 22:32:55 +0000 (22:32 +0000)]
mds: add asok command for getting subtreemap

For when we want to inspect this from a test or
during debugging.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: give CDir a dump() method for JSON output
John Spray [Tue, 6 Jan 2015 18:13:28 +0000 (18:13 +0000)]
mds: give CDir a dump() method for JSON output

Useful when listing subtrees via admin socket.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: support constness in MDSCacheObjects 3312/head
John Spray [Mon, 5 Jan 2015 23:53:18 +0000 (23:53 +0000)]
mds: support constness in MDSCacheObjects

So that one can have const CInode and CDir references
from time to time.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agolibrbd: shadow variable in snap_unprotect and list_children 3344/head
Jason Dillaman [Fri, 9 Jan 2015 13:19:43 +0000 (08:19 -0500)]
librbd: shadow variable in snap_unprotect and list_children

The shadow variable prevented snap_unprotect from returning the
correct error return code.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agodoc: Add Librados PHP documentation
Wido den Hollander [Fri, 9 Jan 2015 13:15:32 +0000 (14:15 +0100)]
doc: Add Librados PHP documentation

10 years agoMerge pull request #3332 from dachary/wip-9570-filejournal
Loic Dachary [Fri, 9 Jan 2015 12:57:11 +0000 (13:57 +0100)]
Merge pull request #3332 from dachary/wip-9570-filejournal

journal related cleanups

Reviewed-by: Jianpeng Ma <jianpeng.ma@intel.com>
10 years agocommon: Formatter: cosmetic re-indent 3248/head
Andreas Peters [Wed, 15 Oct 2014 09:30:35 +0000 (11:30 +0200)]
common: Formatter: cosmetic re-indent

Signed-off-by: Andreas Peters <andreas.joachim.peters@cern.ch>
10 years agocommon: Formatter: add TableFormatter class
Andreas-Joachim Peters [Wed, 8 Oct 2014 14:18:32 +0000 (16:18 +0200)]
common: Formatter: add TableFormatter class

For more human readable and shell parsable output.

Signed-off-by: Andreas Peters <andreas.joachim.peters@cern.ch>
10 years agoerasure-code: test repair when file is removed 3034/head
Loic Dachary [Thu, 4 Dec 2014 11:15:30 +0000 (12:15 +0100)]
erasure-code: test repair when file is removed

Add tests for when files disappear from the file system:

  * file is missing from the primary OSD
  * file is missing from an OSD that is not the primary
  * files are missing from two OSDs that are not the primary
  * files are missing from two OSDs, one of which is the primary

http://tracker.ceph.com/issues/10017 Refs: #10017

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoosd: accumulate authoritative peers during recovery
Loic Dachary [Thu, 4 Dec 2014 10:44:34 +0000 (11:44 +0100)]
osd: accumulate authoritative peers during recovery

When PGBackend::be_compare_scrubmaps finds multiple good peers, it only
keeps the last one. This is fine for replication but erasure coding
needs to know all good peers for recovery.

PGBackend::be_compare_scrubmaps is modified to accumulate all good peers
and return them to PGBackend::be_compare_scrubmaps and indirectly to
PG::scrub_compare_maps.

PG::scrub_compare_maps will dispatch the good peers to authmap and
good_peers. In the case of authmap, the data structure is not modified
and only the last good peer is set. ReplicatedPG::_scrub uses
authmap in a non-trivial way and should probably be modified to use
information from multiple good peers instead of just the last one. This
could be the focus of another change.

The scrubber.authoritative data structure is changed to include a list
of pair<ScrubMap::object, pg_shard_t> instead of a single
pair<ScrubMap::object, pg_shard_t> to pass to PG::repair_object and
allow it to add all good peers to the missing_loc locations if the
primary has a missing object. It could be just a list of pg_shard_t
instead, because the ScrubMap::object is not used, but it makes more sense
to keep both and it will presumably be useful when / if the logic changes.

http://tracker.ceph.com/issues/10017 Fixes: #10017

Signed-off-by: Loic Dachary <ldachary@redhat.com>
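The change from "keep the last good peer" to "accumulate all good peers" described above can be sketched as follows. The function names and the `(shard, is_good)` input shape are hypothetical simplifications, not the PGBackend data structures.

```python
def last_good_peer(scrub_results):
    """Old behaviour as described: each good peer overwrites the previous one."""
    auth = None
    for shard, is_good in scrub_results:
        if is_good:
            auth = shard
    return auth


def all_good_peers(scrub_results):
    """New behaviour as described: every good peer is kept, in order."""
    return [shard for shard, is_good in scrub_results if is_good]
```

For replication a single good copy is enough to recover from, which is why the old form sufficed there; erasure coding needs several shards, so the full list has to survive.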
10 years agoos: fix confusing indentation in FileJournal::corrupt 3332/head
Loic Dachary [Tue, 30 Sep 2014 12:49:45 +0000 (14:49 +0200)]
os: fix confusing indentation in FileJournal::corrupt

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
10 years agoos: remove debug message leftover in FileJournal
Loic Dachary [Tue, 30 Sep 2014 12:31:05 +0000 (14:31 +0200)]
os: remove debug message leftover in FileJournal

The len of the buffer shows in the message above anyway.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
10 years agomsg: initialize AsyncConnection::port 3331/head
Loic Dachary [Fri, 9 Jan 2015 10:57:58 +0000 (11:57 +0100)]
msg: initialize AsyncConnection::port

http://tracker.ceph.com/issues/10493 Fixes: #10493

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoBump memstore_device_bytes from U32 to U64 3328/head
Xiaoxi Chen [Fri, 9 Jan 2015 08:15:06 +0000 (16:15 +0800)]
Bump memstore_device_bytes from U32 to U64

U32 limits the max size of memstore to a few GB, which
blocks our tests of memstore performance (as a prototype).

Bumping it to U64 suits wider usage.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
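The "few GB" ceiling mentioned above follows directly from the width of the integer. A quick arithmetic check, using only the standard sizes of unsigned 32-bit and 64-bit integers:

```python
# Largest values representable in unsigned 32-bit and 64-bit integers.
U32_MAX = 2**32 - 1
U64_MAX = 2**64 - 1
GIB = 2**30

# Number of GiB addressable when the byte count is stored in each type.
u32_limit_gib = (U32_MAX + 1) // GIB
u64_limit_gib = (U64_MAX + 1) // GIB
```

A byte count held in a U32 therefore tops out just under 4 GiB, while a U64 raises the ceiling by a factor of 2**32.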
10 years agoosd: assert there is a peering event 3326/head 3327/head
Sage Weil [Thu, 8 Jan 2015 19:10:45 +0000 (11:10 -0800)]
osd: assert there is a peering event

This became conditional way back in 12e22b3d44eba51a70d8babebc2684f0c46575a7
for unclear reasons.  It probably predates the in_use checks.  In any case,
at this point, we should only arrive here if the PG was queued, implying
that there will always be an event to process.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: requeue PG when we skip handling a peering event
Sage Weil [Thu, 8 Jan 2015 21:34:52 +0000 (13:34 -0800)]
osd: requeue PG when we skip handling a peering event

If we don't handle the event, we need to put the PG back into the peering
queue or else the event won't get processed until the next event is
queued, at which point we'll be processing events with a delay.

The queue_null is not necessary (and is a waste of effort) because the
event is still in pg->peering_queue and the PG is queued.

Note that this only triggers when we exceed osd_map_max_advance, usually
when there is a lot of peering and recovery activity going on.  A
workaround is to increase that value, but if you exceed osd_map_cache_size
you expose yourself to cache thrashing by the peering work queue, which
can cause serious problems with heavily degraded clusters and bit lots of
people on dumpling.

Backport: giant, firefly
Fixes: #10431
Signed-off-by: Sage Weil <sage@redhat.com>
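The requeue behaviour described above can be sketched with two queues: a work queue of PGs and a per-PG queue of events. All names here are hypothetical stand-ins, and `can_handle` abstracts the osd_map_max_advance check; this is not the OSD code.

```python
from collections import deque


class PG:
    """Hypothetical stand-in: a PG with its own queue of pending peering events."""

    def __init__(self, name):
        self.name = name
        self.peering_queue = deque()


def process_one(work_queue, can_handle, handled):
    """Dequeue one PG; handle its next event or put the PG back on the queue."""
    pg = work_queue.popleft()
    if not can_handle(pg):
        # The event stays in pg.peering_queue; requeue the PG so it is
        # retried without waiting for an unrelated new event to arrive.
        work_queue.append(pg)
        return False
    handled.append(pg.peering_queue.popleft())
    return True
```

Without the `work_queue.append(pg)` line, a skipped PG would sit off the work queue with an event still pending, which is the delayed-processing problem the commit describes.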
10 years agolibrados: Translate operation flags from C APIs 3324/head
Matt Richards [Thu, 8 Jan 2015 21:16:17 +0000 (13:16 -0800)]
librados: Translate operation flags from C APIs

The operation flags in the public C API are a distinct enum
and need to be translated to Ceph OSD flags, as happens in
the C++ API. It seems like the C enum and the C++ enum consciously
use the same values, so I reused the C++ translation function.

Signed-off-by: Matthew Richards <mattjrichards@gmail.com>
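The idea of translating a public flag enum into internal flags can be sketched as a bit-by-bit mapping. Every name and value below is hypothetical and chosen so that the two sets visibly differ; they are not the librados or OSD constants.

```python
# Hypothetical public API flags (illustration only).
PUBLIC_FLAG_EXCL = 0x1
PUBLIC_FLAG_FAILOK = 0x2

# Hypothetical internal flags with deliberately different values.
INTERNAL_FLAG_EXCL = 0x10
INTERNAL_FLAG_FAILOK = 0x20

_TRANSLATION = {
    PUBLIC_FLAG_EXCL: INTERNAL_FLAG_EXCL,
    PUBLIC_FLAG_FAILOK: INTERNAL_FLAG_FAILOK,
}


def translate_flags(public_flags):
    """Map each set public bit to its internal counterpart."""
    internal = 0
    for public_bit, internal_bit in _TRANSLATION.items():
        if public_flags & public_bit:
            internal |= internal_bit
    return internal
```

Passing the public value through untranslated only works by accident when both enums happen to share bit positions, which is why a single shared translation step is the safer design.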
10 years agoMerge pull request #3321 from cernceph/wip-nobarrier-doc 2952/head
Sage Weil [Thu, 8 Jan 2015 21:06:07 +0000 (13:06 -0800)]
Merge pull request #3321 from cernceph/wip-nobarrier-doc

doc: don't suggest mounting xfs with nobarrier

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3322 from dachary/wip-10426-test-directories
Sage Weil [Thu, 8 Jan 2015 20:57:53 +0000 (12:57 -0800)]
Merge pull request #3322 from dachary/wip-10426-test-directories

tests: group clusters in a single directory

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3306 from ceph/wip-10041
Gregory Farnum [Thu, 8 Jan 2015 18:39:31 +0000 (10:39 -0800)]
Merge pull request #3306 from ceph/wip-10041

client: fix mount timeout

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agomds: allow 'ops' as shorthand for 'dump_ops_in_flight' 3325/head
Sage Weil [Thu, 8 Jan 2015 18:36:22 +0000 (10:36 -0800)]
mds: allow 'ops' as shorthand for 'dump_ops_in_flight'

This is an extremely annoying thing to type when working with a
production cluster.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: allow 'ops' as shorthand for 'dump_ops_in_flight'
Sage Weil [Thu, 8 Jan 2015 18:36:15 +0000 (10:36 -0800)]
osd: allow 'ops' as shorthand for 'dump_ops_in_flight'

This is an extremely annoying thing to type when working with a
production cluster.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agotests: group clusters in a single directory 3322/head
Loic Dachary [Thu, 8 Jan 2015 10:11:07 +0000 (11:11 +0100)]
tests: group clusters in a single directory

Group all test directories used for mini clusters into a single
sub-directory (testdir). This is easier to cleanup manually and less
error prone.

http://tracker.ceph.com/issues/10426 Fixes: #10426

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agoMerge pull request #3215 from dachary/wip-10384-ceph-test-helper-races
Loic Dachary [Thu, 8 Jan 2015 11:02:32 +0000 (12:02 +0100)]
Merge pull request #3215 from dachary/wip-10384-ceph-test-helper-races

tests: resolve ceph-helpers races

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agodoc: don't suggest mounting xfs with nobarrier 3321/head
Dan van der Ster [Thu, 8 Jan 2015 08:49:10 +0000 (09:49 +0100)]
doc: don't suggest mounting xfs with nobarrier

nobarrier is dangerous, so stop suggesting it in the example
"osd mount options xfs" setting.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
10 years agomon, os: check the result of sync_filesystem. 3305/head
Jianpeng Ma [Thu, 8 Jan 2015 02:29:37 +0000 (10:29 +0800)]
mon, os: check the result of sync_filesystem.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
10 years agoMerge pull request #3310 from ceph/wip-mdsmonitor-fixes
Yan, Zheng [Thu, 8 Jan 2015 01:55:01 +0000 (09:55 +0800)]
Merge pull request #3310 from ceph/wip-mdsmonitor-fixes

MDSMonitor fixes

10 years agoMerge pull request #3167 from ceph/wip-10307
Josh Durgin [Wed, 7 Jan 2015 21:23:44 +0000 (13:23 -0800)]
Merge pull request #3167 from ceph/wip-10307

rgw: use s->bucket_attrs instead of trying to read obj attrs

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3036 from dachary/wip-make-check
Loic Dachary [Wed, 7 Jan 2015 15:47:26 +0000 (16:47 +0100)]
Merge pull request #3036 from dachary/wip-make-check

packages: add python-virtualenv and xmlstarlet

Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
10 years agoMerge pull request #3269 from ceph/wip-10387
John Spray [Wed, 7 Jan 2015 14:30:02 +0000 (14:30 +0000)]
Merge pull request #3269 from ceph/wip-10387

client: close dirfrag when rmdir

Reviewed-by: John Spray <john.spray@redhat.com>
10 years agomon/MDSMonitor: fix `mds fail` for standby MDSs 3310/head
John Spray [Wed, 7 Jan 2015 12:37:40 +0000 (12:37 +0000)]
mon/MDSMonitor: fix `mds fail` for standby MDSs

This command takes a gid, rank or name, but
in the name case it would previously only work if
the named daemon had a rank assigned (mds_info->rank >= 0),
otherwise it would fail silently.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomon/MDSMonitor: respect MDSMAP_DOWN when promoting standbys
John Spray [Wed, 7 Jan 2015 11:47:34 +0000 (11:47 +0000)]
mon/MDSMonitor: respect MDSMAP_DOWN when promoting standbys

Previously, a standby could become active even if 'cluster_down'
had been run.  This was awkward, because it would get you a
"laggy or crashed" mds for the standby that was actually
up and running, just being ignored because of cluster_down.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoinit-ceph: stop returns before daemons are dead 3215/head
Loic Dachary [Fri, 19 Dec 2014 14:54:33 +0000 (15:54 +0100)]
init-ceph: stop returns before daemons are dead

The existence of the pidfile must be checked outside of the loop to send
a signal to the daemon. Otherwise the daemon will remove the pidfile and
stop can return before the process is dead because it only checks
/proc/$pid if the pidfile exists.

http://tracker.ceph.com/issues/10389 Fixes: #10389

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years agomsg/async/AsyncConnection.cc: reduce scope of variable 3290/head
Danny Al-Gaaf [Wed, 7 Jan 2015 09:12:24 +0000 (10:12 +0100)]
msg/async/AsyncConnection.cc: reduce scope of variable

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
10 years agoosd/ClassHandler.cc: move stat into error handling
Danny Al-Gaaf [Wed, 7 Jan 2015 08:34:07 +0000 (09:34 +0100)]
osd/ClassHandler.cc: move stat into error handling

There is no security advantage in checking whether the class file
exists before opening it, and the file could be removed or
exchanged between the stat and the open. Instead, open it directly
and fail. Check afterwards whether the file was missing,
for debug messages and error codes.

Make sure cls->status is set if the class open call fails.

To solve Coverity issue:

CID 743419 (#1 of 1): Time of check time of use (TOCTOU)
 fs_check_call: Calling function stat to perform check on fname.

743419 Time of check time of use
 An attacker could change the filename's file association or
 other attributes between the check and use.

 In ClassHandler::_load_class(ClassHandler::ClassData *): A check
 occurs on a file's attributes before the file is used in a
 privileged operation, but things may have changed (CWE-367)

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
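The open-first approach described above, as opposed to stat-then-open, can be sketched as follows. This is a Python illustration of the general pattern with a hypothetical function name and return convention, not the ClassHandler code.

```python
import errno


def load_class_file(path):
    """Open directly; classify the failure afterwards instead of stat()-ing first."""
    try:
        with open(path, "rb") as f:
            return f.read(), 0
    except FileNotFoundError:
        # A missing file is detected from the failed open itself, so there
        # is no window between a separate check and the use.
        return None, -errno.ENOENT
    except OSError as e:
        # Any other failure still yields a definite (negative) status.
        return None, -(e.errno or errno.EIO)
```

Because the only filesystem access is the open itself, nothing can change the file between a check and its use; the error code is derived from what the open actually reported.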
10 years agocrush/crush.c: prevent DIVIDE_BY_ZERO
Danny Al-Gaaf [Fri, 2 Jan 2015 21:30:24 +0000 (22:30 +0100)]
crush/crush.c: prevent DIVIDE_BY_ZERO

Fix for:

CID 1219471 (#1 of 1): Division or modulo by zero (DIVIDE_BY_ZERO)
 divide_by_zero: In function call crush_make_uniform_bucket,
 division by expression item_weight which may be zero has undefined
 behavior.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
10 years agosrc/common/obj_bencher: fix some UNINIT issues
Danny Al-Gaaf [Fri, 2 Jan 2015 21:12:31 +0000 (22:12 +0100)]
src/common/obj_bencher: fix some UNINIT issues

Make sure concurrentios is always >= 0 to fix these coverity
issues and to prevent bad_alloc/negative array sizes:

CID 1128404 (#1 of 1): Uninitialized scalar variable (UNINIT)
 uninit_use: Using uninitialized value index[slot].

CID 1128405 (#1 of 1): Uninitialized pointer read (UNINIT)
 uninit_use: Using uninitialized value contents[slot].

CID 1219644 (#1 of 1): Uninitialized pointer read (UNINIT)
 uninit_use: Using uninitialized value contents[slot].

CID 1219645 (#1 of 1): Uninitialized pointer read (UNINIT)
 uninit_use: Using uninitialized value contents[slot].

CID 1219646 (#1 of 1): Uninitialized scalar variable (UNINIT)
 uninit_use: Using uninitialized value index[slot].

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>