git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
11 years agomsgr: fix rebind() race 1016/head
Xihui He [Mon, 30 Dec 2013 04:04:10 +0000 (12:04 +0800)]
msgr: fix rebind() race
stop the accepter and mark all pipes down before rebind to avoid race

Fixes: #6992
Signed-off-by: Xihui He <xihuihe@gmail.com>
11 years agoMerge pull request #1007 from ceph/wip-misc-fixes
Sage Weil [Sun, 29 Dec 2013 20:04:09 +0000 (12:04 -0800)]
Merge pull request #1007 from ceph/wip-misc-fixes

misc fixes

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1004 from ceph/wip-snaps
Loic Dachary [Sun, 29 Dec 2013 08:31:39 +0000 (00:31 -0800)]
Merge pull request #1004 from ceph/wip-snaps

make ceph_test_rados read from snaps; resulting bugs found

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoqa/workunits/rest/test.py: rbd pool ruleset is now 0 1007/head
Sage Weil [Sat, 28 Dec 2013 18:36:27 +0000 (10:36 -0800)]
qa/workunits/rest/test.py: rbd pool ruleset is now 0

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados_api_tier: retry EBUSY race checks
Sage Weil [Sat, 28 Dec 2013 18:34:56 +0000 (10:34 -0800)]
ceph_test_rados_api_tier: retry EBUSY race checks

...or else these will occasionally fail against a thrashing cluster.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agolibcephfs: get osd location on -1 should return EINVAL
Sage Weil [Sat, 28 Dec 2013 18:25:00 +0000 (10:25 -0800)]
libcephfs: get osd location on -1 should return EINVAL

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoqa/workunits/mon/crush_ops.sh: fix in-use rule rm test
Sage Weil [Sat, 28 Dec 2013 18:22:18 +0000 (10:22 -0800)]
qa/workunits/mon/crush_ops.sh: fix in-use rule rm test

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: fix get_full_location_ordered
Sage Weil [Sat, 28 Dec 2013 16:55:02 +0000 (08:55 -0800)]
crush: fix get_full_location_ordered

This should return -ENOENT when an id is not present.  Broken by
746069ee62c74ecf04ed45988029d5c3382a38d2.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1005 from ceph/wip-rgw-leak
Yehuda Sadeh [Sat, 28 Dec 2013 00:52:30 +0000 (16:52 -0800)]
Merge pull request #1005 from ceph/wip-rgw-leak

rgw: fix leak of RGWProcess

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agorgw: fix leak of RGWProcess 1005/head
Sage Weil [Sat, 28 Dec 2013 00:36:02 +0000 (16:36 -0800)]
rgw: fix leak of RGWProcess

Introduced by a3e50b09a1fa22b80dea014d4b7bd96c23904f22.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: preserve user_version in snaps/clones 1004/head
Sage Weil [Sat, 28 Dec 2013 00:29:11 +0000 (16:29 -0800)]
osd: preserve user_version in snaps/clones

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados: test read from snapshots
Sage Weil [Sat, 28 Dec 2013 00:28:58 +0000 (16:28 -0800)]
ceph_test_rados: test read from snapshots

This was disabled back in 2011, c54aa7db3bc6e4c763e3b08d2ae98f89afe5a246.
Whoops!

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/OSDMap: observe 'osd crush chooseleaf type' option for initial rules
Sage Weil [Fri, 27 Dec 2013 21:45:34 +0000 (13:45 -0800)]
osd/OSDMap: observe 'osd crush chooseleaf type' option for initial rules

This option was dropped by 2a7fcc35b8ceeff1e07da28b10ced4a2a4ed09ec.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge branch 'rbd-map-options'
Josh Durgin [Fri, 27 Dec 2013 17:56:55 +0000 (09:56 -0800)]
Merge branch 'rbd-map-options'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agorbd: expose options available to rbd map
Ilya Dryomov [Fri, 27 Dec 2013 17:40:59 +0000 (19:40 +0200)]
rbd: expose options available to rbd map

Add a -o / --options option, which would allow users to specify
rbd-specific and generic ceph client and osd options available at
mapping time in a comma separated list (similar to mount(8) mount
options).

Exposed options are:

- fsid=%s
- ip=%s
- share
- noshare
- crc
- nocrc
- osdkeepalive=%d
- osd_idle_ttl=%d
- rw
- ro (equivalent to existing --read-only flag)

The rw/ro < 3.7 kernels compatibility kludge added in commit
fb0f1986449b is preserved.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
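
The -o / --options string described in the commit above is a comma-separated list of bare flags and key=value pairs, much like mount(8) options. A minimal Python sketch of how such a string could be split follows. The option names come from the list in the commit message; the function name `parse_map_options` and the validation rules are assumptions made for illustration, not the rbd tool's actual parser.

```python
# Hypothetical sketch: split an "rbd map -o" style option string.
# Option names are taken from the commit message; the parsing rules
# (what is rejected, how values are typed) are assumptions.

FLAG_OPTIONS = {"share", "noshare", "crc", "nocrc", "rw", "ro"}
STRING_OPTIONS = {"fsid", "ip"}
INT_OPTIONS = {"osdkeepalive", "osd_idle_ttl"}


def parse_map_options(spec):
    """Return a dict mapping option name to True, str or int."""
    parsed = {}
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate "a,,b" and trailing commas
        name, sep, value = token.partition("=")
        if name in FLAG_OPTIONS and not sep:
            parsed[name] = True
        elif name in STRING_OPTIONS and sep and value:
            parsed[name] = value
        elif name in INT_OPTIONS and sep:
            parsed[name] = int(value)  # raises ValueError on junk
        else:
            raise ValueError("unknown or malformed option: %r" % token)
    return parsed


if __name__ == "__main__":
    print(parse_map_options("ro,nocrc,osdkeepalive=10"))
```

Rejecting unknown names up front, rather than passing them through, is one possible design choice; the real tool may behave differently.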
11 years agoMerge pull request #1001 from dachary/wip-forward-tid
Sage Weil [Fri, 27 Dec 2013 15:59:43 +0000 (07:59 -0800)]
Merge pull request #1001 from dachary/wip-forward-tid

messages: add tid to string form of MForward

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1002 from yuyuyu101/wip-7062
Loic Dachary [Fri, 27 Dec 2013 12:28:59 +0000 (04:28 -0800)]
Merge pull request #1002 from yuyuyu101/wip-7062

Lack of "start" member function declare in WBThrottle.h
make check runs ok

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoLack of "start" member function declare in WBThrottle.h 1002/head
Haomai Wang [Fri, 27 Dec 2013 10:11:09 +0000 (18:11 +0800)]
Lack of "start" member function declare in WBThrottle.h

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
11 years agomessages: add tid to string form of MForward 1001/head
Loic Dachary [Fri, 27 Dec 2013 06:30:04 +0000 (07:30 +0100)]
messages: add tid to string form of MForward

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #837 from ceph/port/fallocate
Sage Weil [Fri, 27 Dec 2013 05:33:39 +0000 (21:33 -0800)]
Merge pull request #837 from ceph/port/fallocate

FileJournal: zero-fill in-lieu of posix_fallocate

We may want to change that to a #warning later...

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #982 from dachary/wip-default-crush-rule
Sage Weil [Fri, 27 Dec 2013 05:29:36 +0000 (21:29 -0800)]
Merge pull request #982 from dachary/wip-default-crush-rule

osd: add default crush rule for erasure pools

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #974 from dachary/wip-build-depends
Sage Weil [Fri, 27 Dec 2013 05:26:02 +0000 (21:26 -0800)]
Merge pull request #974 from dachary/wip-build-depends

packaging: make check needs argparse and uuidgen

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #994 from yuyuyu101/wip-7062
Sage Weil [Fri, 27 Dec 2013 05:25:08 +0000 (21:25 -0800)]
Merge pull request #994 from yuyuyu101/wip-7062

Fix WBThrottle thread disappear problem

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agopackaging: make check needs argparse and uuidgen 974/head
Loic Dachary [Thu, 19 Dec 2013 14:48:46 +0000 (15:48 +0100)]
packaging: make check needs argparse and uuidgen

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1000 from ceph/wip-rbd-tinc-5426
Josh Durgin [Fri, 27 Dec 2013 02:53:02 +0000 (18:53 -0800)]
Merge pull request #1000 from ceph/wip-rbd-tinc-5426

fix #5426 race in librbd

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agolibrbd: call user completion after incrementing perfcounters 1000/head
Josh Durgin [Fri, 27 Dec 2013 01:38:52 +0000 (17:38 -0800)]
librbd: call user completion after incrementing perfcounters

The perfcounters (and the ictx) are only valid while the image is
still open.  If the librbd user gets the callback for its last I/O,
then closes the image, the ictx and its perfcounters will be
invalid. If the AioCompletion object has not run the rest of its
complete() method yet, it will access these now-invalid addresses,
possibly leading to a crash.

The AioCompletion object is independent of the ictx and does not
access it again after incrementing perfcounters, so avoid this race by
calling the user's callback after this step. The AioCompletion object
will be cleaned up by the rest of complete_request(), independent of
the ImageCtx.

Fixes: #5426
Backport: dumpling, emperor
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
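
The ordering problem described in the commit above can be illustrated with a small Python model. The names `ImageCtx`, `AioCompletion` and the perfcounters mirror the commit message, but the bodies are a hypothetical simplification and not librbd code: closing the image invalidates the counters, and the user's callback is allowed to close the image.

```python
class ImageCtx:
    """Toy stand-in for an open image and its perfcounters."""

    def __init__(self):
        self.open = True
        self.completed_ops = 0

    def inc_perfcounter(self):
        if not self.open:
            # In C++ this would be an invalid memory access,
            # not a clean exception.
            raise RuntimeError("perfcounters used after image close")
        self.completed_ops += 1

    def close(self):
        self.open = False


class AioCompletion:
    def __init__(self, ictx, user_callback):
        self.ictx = ictx
        self.user_callback = user_callback

    def complete_racy(self):
        # Callback first: it may close the image before the counter
        # is touched.
        self.user_callback()
        self.ictx.inc_perfcounter()

    def complete_fixed(self):
        # Counter first: the last access to ictx happens before the
        # user gets a chance to close the image.
        self.ictx.inc_perfcounter()
        self.user_callback()


def run(method_name):
    """Complete one I/O whose callback closes the image."""
    ictx = ImageCtx()
    completion = AioCompletion(ictx, user_callback=ictx.close)
    getattr(completion, method_name)()
    return ictx.completed_ops


if __name__ == "__main__":
    print(run("complete_fixed"))
    try:
        run("complete_racy")
    except RuntimeError as err:
        print("racy order failed:", err)
```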
11 years agoosd: create default ruleset for erasure pools 982/head
Loic Dachary [Thu, 26 Dec 2013 11:23:50 +0000 (12:23 +0100)]
osd: create default ruleset for erasure pools

The ruleset --osd_pool_default_crush_erasure_ruleset is created to be
suitable for erasure coded pools when OSDMap::build_simple is required
to build the default OSD map of a new cluster.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: implement --osd-pool-default-crush-erasure-ruleset
Loic Dachary [Thu, 26 Dec 2013 08:59:18 +0000 (09:59 +0100)]
mon: implement --osd-pool-default-crush-erasure-ruleset

It must be different from the replicated default.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: implement --osd-pool-default-crush-replicated-ruleset
Loic Dachary [Thu, 26 Dec 2013 11:03:57 +0000 (12:03 +0100)]
mon: implement --osd-pool-default-crush-replicated-ruleset

--osd-pool-default-crush-replicated-ruleset replaces
--osd-pool-default-crush-rule

If --osd-pool-default-crush-rule is set it takes precedence over
--osd-pool-default-crush-replicated-ruleset and a deprecation warning is
displayed.

The CrushWrapper::get_osd_pool_default_crush_replicated_ruleset helper is
used to implement this behaviour.

Signed-off-by: Loic Dachary <loic@dachary.org>
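
The precedence rule in the commit above (the old option wins when set, with a deprecation warning) could be sketched in Python as follows. The config is modelled as a plain dict, an unset option as a missing key, and the fallback value 0 is an assumption; none of this is the signature of the CrushWrapper helper named in the commit.

```python
import warnings


def replicated_ruleset(conf):
    """Pick the default replicated ruleset from a config dict.

    `conf` is a hypothetical plain dict; an unset option is
    represented by a missing key (an assumption of this sketch).
    """
    old = conf.get("osd_pool_default_crush_rule")
    if old is not None:
        # The deprecated option still takes precedence when set.
        warnings.warn(
            "osd_pool_default_crush_rule is deprecated, use "
            "osd_pool_default_crush_replicated_ruleset instead",
            DeprecationWarning,
        )
        return old
    # Fallback of 0 is assumed here for illustration.
    return conf.get("osd_pool_default_crush_replicated_ruleset", 0)
```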
11 years agoosd: use CrushWrapper::add_simple_ruleset
Loic Dachary [Thu, 26 Dec 2013 10:20:41 +0000 (11:20 +0100)]
osd: use CrushWrapper::add_simple_ruleset

Replace the manually crafted ruleset in OSDMap::build_simple_crush_map*
with calls to add_simple_ruleset. The generated rulesets do not have the
same behavior, but that presumably does not cause any backward
compatibility problem because they are only created when a new cluster
is being initialized.

The prototypes of OSDMap::build_simple* are modified to allow for a
return code and display of a human readable error message.

The --osd-min-rep and --osd-max-rep configuration options are removed:
they were only used in the code that was removed.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: build_simple creates a single rule
Loic Dachary [Wed, 25 Dec 2013 12:19:56 +0000 (13:19 +0100)]
osd: build_simple creates a single rule

The three rules created by build_simple are identical. They are replaced
by a single rule named replicated_rule which is set to be used by the
data, rbd and metadata pools.

Instead of hardcoding the ruleset number to zero, it is read from
osd_pool_default_crush_ruleset which defaults to zero.

The CEPH_DEFAULT_CRUSH_REPLICATED_RULESET enum is moved from osd_type.h to
config.h because it may be needed when osd_type.h is not included.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: set min_rep and max_rep depending on mode
Loic Dachary [Thu, 26 Dec 2013 23:10:55 +0000 (00:10 +0100)]
crush: set min_rep and max_rep depending on mode

Assuming firstn is for replica and indep is for erasure. This is a
strong constraint but it is unlikely to make the resulting ruleset unfit
to be used in most cases.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: add rule_type argument to add_simple_ruleset
Loic Dachary [Thu, 26 Dec 2013 18:50:37 +0000 (19:50 +0100)]
crush: add rule_type argument to  add_simple_ruleset

Instead of hardcoded pg_pool_t::TYPE_REPLICATED

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agopartially rename rule to ruleset
Loic Dachary [Thu, 26 Dec 2013 08:49:02 +0000 (09:49 +0100)]
partially rename rule to ruleset

Where code is changed, take the opportunity to rename rule to ruleset to
improve naming consistency.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge branch 'leseb-doc-rbd-havana'
Josh Durgin [Thu, 26 Dec 2013 17:55:17 +0000 (09:55 -0800)]
Merge branch 'leseb-doc-rbd-havana'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agodoc: Add OpenStack Havana documentation
Sébastien Han [Fri, 6 Dec 2013 14:43:37 +0000 (15:43 +0100)]
doc: Add OpenStack Havana documentation

New features appeared during the Havana cycle.
This patch offers a general update of the doc.

Signed-off-by: Sébastien Han <sebastien.han@enovance.com>
11 years agoosd: factorize build_simple and build_simple_from_conf
Loic Dachary [Wed, 25 Dec 2013 11:59:00 +0000 (12:59 +0100)]
osd: factorize build_simple and build_simple_from_conf

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoqa: remove osd pool create erasure tests
Loic Dachary [Thu, 26 Dec 2013 07:39:52 +0000 (08:39 +0100)]
qa: remove osd pool create erasure tests

Creating an erasure pool will crash the OSD because OSD::_make_pg
asserts if the type is not replicated. The tests related to erasure
coded pool creation are removed from qa/workunits/cephtool/test.sh.

The osd-create-pool.sh unit test covers the cases removed from test.sh
more extensively. The intent is to check the interactions with the MON
only, therefore it does not run an OSD and the absence of erasure code
placement group backend implementation is not an issue.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: osd-pool-create must not loop forever on kill
Loic Dachary [Wed, 25 Dec 2013 20:36:13 +0000 (21:36 +0100)]
mon: osd-pool-create must not loop forever on kill

Looping forever on kill does not serve any useful purpose.
Reduce the verbosity of the exit trap to help diagnose error
conditions.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoclient: SyntheticClient uses the first available pool
Loic Dachary [Wed, 25 Dec 2013 13:01:48 +0000 (14:01 +0100)]
client: SyntheticClient uses the first available pool

It is unrelated to CEPH_DATA_RULE which is replaced by
SYNCLIENT_FIRST_POOL.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: MDS data and metadata pool numbers are hardcoded
Loic Dachary [Wed, 25 Dec 2013 12:30:34 +0000 (13:30 +0100)]
mon: MDS data and metadata pool numbers are hardcoded

The MDS assumes pool 0 and 1 are suitable for data and metadata
respectively. Instead of relying on the CEPH_DATA_RULE and
CEPH_METADATA_RULE constants that only match by chance, set a hardcoded
value specific to MDS to reduce the fragility of the hardcoded
assumption.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoFix WBThrottle thread disappear problem 994/head
Haomai Wang [Thu, 26 Dec 2013 03:20:52 +0000 (11:20 +0800)]
Fix WBThrottle thread disappear problem

The new ceph_osd.cc code does the ObjectStore init work before
global_init_daemonize(), and the WBThrottle thread is created when the
ObjectStore is constructed. After daemon(), the WBThrottle thread therefore
does not exist in the new process, which results in a deadlock.

When "cur_ios", a member of WBThrottle, hits the hard limit, there are two
ways to decrease it. The first is the WBThrottle thread, which is dead after
daemonizing; the other is SyncThread. SyncThread blocks at op_tp.pause()
because the threads in op_tp (the thread pool) block at
wbthrottle.throttle(FileStore::doop). No thread can continue processing jobs
in the filestore layer, and all threads are left waiting.

Fix #7062 (http://tracker.ceph.com/issues/7062)

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
11 years agoMerge pull request #995 from dachary/wip-deprecated
Sage Weil [Thu, 26 Dec 2013 00:09:31 +0000 (16:09 -0800)]
Merge pull request #995 from dachary/wip-deprecated

rados: deprecated attribute has no argument

11 years agoceph_argparse: kill _daemon versions of argparse calls 996/head
Ilya Dryomov [Wed, 25 Dec 2013 19:41:16 +0000 (21:41 +0200)]
ceph_argparse: kill _daemon versions of argparse calls

Commit c76bbc2e6df1, which introduced _daemon versions of some of the
argparse calls, also changed the behaviour of non-_daemon versions.
The change resulted in incorrect error messages, e.g.

  $ ./rbd create b0 --size
  rbd: extraneous parameter --size

instead of what should have been

  $ ./rbd create b0 --size
  Option --size requires an argument.

The users of _daemon versions were added in commit be801f6c506d and
removed in commit f26bd55e57f1, so just kill the _daemon versions and
restore the old behaviour.  (This effectively reverts commit
c76bbc2e6df1.)

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agorados: deprecated attribute has no argument 995/head
Loic Dachary [Wed, 25 Dec 2013 09:44:07 +0000 (10:44 +0100)]
rados: deprecated attribute has no argument

The deprecated attribute argument was introduced in gcc 4.5
http://gcc.gnu.org/gcc-4.5/changes.html and centos6 has a lower version.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #988 from ceph/wip-crush-location
Loic Dachary [Wed, 25 Dec 2013 09:07:04 +0000 (01:07 -0800)]
Merge pull request #988 from ceph/wip-crush-location

add 'crush location' config option

make check is ok

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #993 from ceph/wip-librados-lock
Sage Weil [Tue, 24 Dec 2013 18:51:01 +0000 (10:51 -0800)]
Merge pull request #993 from ceph/wip-librados-lock

Wip librados lock

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agolibrados: lockless get_instance_id() 993/head
Yehuda Sadeh [Thu, 5 Dec 2013 07:33:42 +0000 (23:33 -0800)]
librados: lockless get_instance_id()

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoobjecter, librados: create Objecter::Op in two phases
Yehuda Sadeh [Sat, 23 Nov 2013 01:21:57 +0000 (17:21 -0800)]
objecter, librados: create Objecter::Op in two phases

(currently only in some librados operations)
First create the op, only then lock and submit so that we reduce lock
contention.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
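
The two-phase idea in the commit above (build the op first, then take the lock only to submit it) can be sketched in Python. The class below is a toy model with hypothetical method names `prepare_op` and `submit_op`; it is not the Objecter API.

```python
import threading


class Objecter:
    """Toy model: one lock guards the list of in-flight ops."""

    def __init__(self):
        self.lock = threading.Lock()
        self.inflight = []

    def prepare_op(self, oid, payload):
        # Phase 1: build the op without holding the lock, so the
        # copying and encoding work does not block other submitters.
        return {"oid": oid, "data": bytes(payload)}

    def submit_op(self, op):
        # Phase 2: hold the lock only for the short enqueue step.
        with self.lock:
            op["tid"] = len(self.inflight) + 1
            self.inflight.append(op)
        return op["tid"]


def write(objecter, oid, payload):
    """Create the op, then lock and submit it."""
    op = objecter.prepare_op(oid, payload)
    return objecter.submit_op(op)
```

The shorter the critical section, the less time concurrent callers spend waiting on the lock, which is the contention reduction the commit message refers to.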
11 years agocrush/CrushWrapper: note about get_immediate_parent() 988/head
Sage Weil [Tue, 24 Dec 2013 16:01:15 +0000 (08:01 -0800)]
crush/CrushWrapper: note about get_immediate_parent()

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agolibrados: mark old get_version() as deprecated
Sage Weil [Mon, 23 Dec 2013 21:14:43 +0000 (13:14 -0800)]
librados: mark old get_version() as deprecated

Use the newly-discovered (for me) deprecated attribute to mark the old
get_version() method and point users toward get_version64().  And fix a
couple of users in the kvstore code!

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agolibrados: deprecate aio_operate() read variant that takes snapid
Sage Weil [Mon, 23 Dec 2013 21:13:06 +0000 (13:13 -0800)]
librados: deprecate aio_operate() read variant that takes snapid

The argument was ignored.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agolibrbd: localize or distribute parent (snap) reads
Sage Weil [Thu, 31 Oct 2013 00:21:05 +0000 (17:21 -0700)]
librbd: localize or distribute parent (snap) reads

The parent is always a snapshot.  We may want to treat it differently
than other snaps by virtue of it (likely) being a more highly-shared
image.

By default, localize parent reads.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdc/Objecter: use crush location and distance for LOCALIZE_READS
Sage Weil [Wed, 30 Oct 2013 15:59:48 +0000 (08:59 -0700)]
osdc/Objecter: use crush location and distance for LOCALIZE_READS

Use the hierarchy in the CRUSH map to determine what the closest
replica is.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdc/Objecter: maintain crush_location multimap
Sage Weil [Mon, 23 Dec 2013 23:18:07 +0000 (15:18 -0800)]
osdc/Objecter: maintain crush_location multimap

Observe and parse the 'crush location' config option.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushWrapper: simplify get_full_location_ordered()
Sage Weil [Wed, 30 Oct 2013 16:00:52 +0000 (09:00 -0700)]
crush/CrushWrapper: simplify get_full_location_ordered()

Just ascend the hierarchy; it is much less complicated.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushWrapper: add get_common_ancestor_distance()
Sage Weil [Wed, 30 Oct 2013 15:59:00 +0000 (08:59 -0700)]
crush/CrushWrapper: add get_common_ancestor_distance()

Calculate closest common ancestor (type) in the hierarchy.

Signed-off-by: Sage Weil <sage@inktank.com>
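
To make "closest common ancestor (type)" concrete, here is a hedged Python sketch. Each location is modelled as a dict from CRUSH type name to bucket name, and the type ids in `TYPE_IDS` are hypothetical; the real CrushWrapper method has its own signature and data structures, which this does not attempt to reproduce. The second function shows how a caller that wants to localize reads might use the result to pick a replica.

```python
# Hypothetical CRUSH type ids, leaf to root; a lower id is closer.
TYPE_IDS = {"host": 1, "rack": 2, "root": 3}


def common_ancestor_type(loc_a, loc_b):
    """Return the lowest type id at which both locations name the
    same bucket, or None if they share no ancestor."""
    shared = [
        TYPE_IDS[t]
        for t, name in loc_a.items()
        if t in TYPE_IDS and loc_b.get(t) == name
    ]
    return min(shared) if shared else None


def closest_replica(client_loc, replica_locs):
    """Pick the replica whose common ancestor with the client is
    lowest in the hierarchy. Ties keep the earlier replica."""
    best, best_type = None, None
    for osd, loc in replica_locs.items():
        t = common_ancestor_type(client_loc, loc)
        if t is not None and (best_type is None or t < best_type):
            best, best_type = osd, t
    return best


if __name__ == "__main__":
    client = {"host": "a", "rack": "r1", "root": "default"}
    replicas = {
        "osd.0": {"host": "b", "rack": "r1", "root": "default"},
        "osd.1": {"host": "c", "rack": "r2", "root": "default"},
    }
    print(closest_replica(client, replicas))
```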
11 years agoMerge pull request #990 from ceph/wip-fix-mon-fwd
Sage Weil [Tue, 24 Dec 2013 01:02:11 +0000 (17:02 -0800)]
Merge pull request #990 from ceph/wip-fix-mon-fwd

mon: fix forwarded request features when requests are resent

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoMerge pull request #989 from ceph/wip-7056
Sage Weil [Mon, 23 Dec 2013 23:53:42 +0000 (15:53 -0800)]
Merge pull request #989 from ceph/wip-7056

osd/ReplicatedPG: include omap header in copy-get

This now passes rados/thrash tests without failures.

11 years agomon/OSDMonitor: use generic CrushWrapper::parse_loc_map helper
Sage Weil [Tue, 29 Oct 2013 23:37:59 +0000 (16:37 -0700)]
mon/OSDMonitor: use generic CrushWrapper::parse_loc_map helper

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushWrapper: add parse_loc_[multi]map helpers
Sage Weil [Tue, 29 Oct 2013 23:37:42 +0000 (16:37 -0700)]
crush/CrushWrapper: add parse_loc_[multi]map helpers

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #991 from dachary/wip-stop
Sage Weil [Mon, 23 Dec 2013 21:12:14 +0000 (13:12 -0800)]
Merge pull request #991 from dachary/wip-stop

vstart/stop: do not loop forever on kill

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: fix copy-get iteration of omap keys 989/head
Sage Weil [Mon, 23 Dec 2013 20:52:34 +0000 (12:52 -0800)]
osd/ReplicatedPG: fix copy-get iteration of omap keys

We need to call upper_bound() before checking if the iterator is valid!

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados: s/tmap/omap/
Sage Weil [Mon, 23 Dec 2013 19:37:53 +0000 (11:37 -0800)]
ceph_test_rados: s/tmap/omap/

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agovstart/stop: do not loop forever on kill 991/head
Loic Dachary [Mon, 23 Dec 2013 20:44:38 +0000 (21:44 +0100)]
vstart/stop: do not loop forever on kill

stop.sh may be unable to stop a process for reasons unrelated to
vstart.sh, for instance because apache runs independently. Instead of
trying forever, try twice in a row (which should be enough in 99% of
cases), then try three more times, sleeping one second between tries.

Signed-off-by: Loic Dachary <loic@dachary.org>
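
The bounded retry policy described in the commit above (two immediate attempts, then three more spaced one second apart) can be sketched in Python. The real change is to a shell script; `stop_process` and the `try_kill` callable are hypothetical names used only to show the control flow.

```python
import time


def stop_process(try_kill, sleep=time.sleep):
    """Return True if the process went away, False if we gave up.

    `try_kill` is a hypothetical callable that sends the signal and
    returns True once the process is gone.
    """
    # Two immediate attempts, expected to cover the common case.
    for _ in range(2):
        if try_kill():
            return True
    # Three more attempts, one second apart, then give up.
    for _ in range(3):
        sleep(1)
        if try_kill():
            return True
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.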
11 years agoconfig: add 'crush location' option
Sage Weil [Tue, 29 Oct 2013 23:19:37 +0000 (16:19 -0700)]
config: add 'crush location' option

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc: Fix caps documentation for Admin API
Wido den Hollander [Mon, 23 Dec 2013 20:10:59 +0000 (21:10 +0100)]
doc: Fix caps documentation for Admin API

The correct caps is users instead of user

11 years agomon: fix forwarded request features when requests are resent 990/head
Sage Weil [Mon, 23 Dec 2013 18:59:14 +0000 (10:59 -0800)]
mon: fix forwarded request features when requests are resent

Pass the features in explicitly so that we can use messages we've just
decoded in resend_routed_requests().

Keep the features in struct RoutedRequest.

Renamed conn_features -> con_features while we are here.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: include omap header in copy-get
Sage Weil [Mon, 23 Dec 2013 18:21:44 +0000 (10:21 -0800)]
osd/ReplicatedPG: include omap header in copy-get

Missed this the first time around.  Thank you, ceph_test_rados!

Fixes: #7056
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #984 from ceph/wip-7051
Sage Weil [Mon, 23 Dec 2013 17:52:02 +0000 (09:52 -0800)]
Merge pull request #984 from ceph/wip-7051

#7051: forward connection features alongside with message

Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Mon, 23 Dec 2013 17:28:29 +0000 (09:28 -0800)]
Merge remote-tracking branch 'gh/next'

11 years agoMerge remote-tracking branch 'gh/wip-cache'
Sage Weil [Mon, 23 Dec 2013 17:22:36 +0000 (09:22 -0800)]
Merge remote-tracking branch 'gh/wip-cache'

11 years agoMerge pull request #987 from ceph/wip-crush-shrink-diff
Sage Weil [Mon, 23 Dec 2013 17:19:11 +0000 (09:19 -0800)]
Merge pull request #987 from ceph/wip-crush-shrink-diff

crush: shrink diff with kernel implementation

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agocrush: misc formatting and whitespace fixes 987/head
Ilya Dryomov [Mon, 23 Dec 2013 16:12:56 +0000 (18:12 +0200)]
crush: misc formatting and whitespace fixes

- whitespace in crush.h

- format is_out() definition and call site to 80 columns

- whitespace around local_fallback_tries in crush_choose_firstn()

All of this is to shrink the diff with the kernel implementation.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agocrush: use kernel-doc consistently
Ilya Dryomov [Mon, 23 Dec 2013 16:12:56 +0000 (18:12 +0200)]
crush: use kernel-doc consistently

kernel-doc syntax is "@arg: desc", not "@param arg desc".  In addition,
these comments are usually placed around function definitions instead
of function declarations.  Follow these guidelines to shrink the diff.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agocrush/mapper: unsigned -> unsigned int
Ilya Dryomov [Mon, 23 Dec 2013 16:12:56 +0000 (18:12 +0200)]
crush/mapper: unsigned -> unsigned int

Kernel implementation is located in net/, and use of "unsigned int" is
preferred to bare "unsigned" in net tree (as proven by several net/
cleanups).  Follow this guideline to shrink the diff.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agoMerge pull request #985 from dachary/wip-erasure-code-defaults
João Eduardo Luís [Mon, 23 Dec 2013 12:47:41 +0000 (04:47 -0800)]
Merge pull request #985 from dachary/wip-erasure-code-defaults

mon: use kill instead of pkill in osd-pool-create

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: use kill instead of pkill in osd-pool-create 985/head
Loic Dachary [Mon, 23 Dec 2013 12:10:18 +0000 (13:10 +0100)]
mon: use kill instead of pkill in osd-pool-create

The --pidfile option of pkill is not supported by all versions. Use kill
instead for compatibility. Instead of looping on ":", loop on "sleep 1" so
that an infinite loop is slower at filling the disk.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: OSDMap: dump osd_xinfo_t::features as an int 984/head
Joao Eduardo Luis [Mon, 23 Dec 2013 01:29:23 +0000 (17:29 -0800)]
osd: OSDMap: dump osd_xinfo_t::features as an int

Instead of dumping the list in a string-list format, which in
retrospect wasn't very useful.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: Monitor: Forward connection features
Joao Eduardo Luis [Mon, 23 Dec 2013 01:26:59 +0000 (17:26 -0800)]
mon: Monitor: Forward connection features

We are relying on connection features to track OSD supported
features.  However, we were not forwarding connection features
when we forwarded a message from a peon to the leader.  That
was breaking the OSD feature tracking.

Fixes: #7051
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoMerge remote-tracking branch 'gh/master' into wip-cache
Sage Weil [Sun, 22 Dec 2013 23:33:59 +0000 (15:33 -0800)]
Merge remote-tracking branch 'gh/master' into wip-cache

Conflicts:
src/osdc/Objecter.h
src/vstart.sh

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #976 from dachary/wip-erasure-code-defaults
Sage Weil [Sun, 22 Dec 2013 23:30:43 +0000 (15:30 -0800)]
Merge pull request #976 from dachary/wip-erasure-code-defaults

provide sensible defaults when creating an erasure coded pool

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomon: unit test for osd pool create 976/head
Loic Dachary [Fri, 20 Dec 2013 19:39:21 +0000 (20:39 +0100)]
mon: unit test for osd pool create

It is inconvenient to run such tests in the
qa/workunits/cephtool/test.sh because they require that the mon is
restarted to test errors in the format of the default erasure code
properties and check the appropriate error message is output.

osd-pool-create.sh runs a single mon from sources using command
line options and a temporary directory, the same way vstart.sh does but
lightweight.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: erasure code pool properties defaults
Loic Dachary [Sun, 22 Dec 2013 22:37:08 +0000 (23:37 +0100)]
mon: erasure code pool properties defaults

If no properties are set when creating an erasure coded pool, default to
using the jerasure plugin with the cauchy_good technique which is the
fastest.

The defaults are set with osd_pool_default_erasure_code_properties.

The erasure code plugins are loaded from the directory specified in the
erasure-code-directory property. Contrary to the other properties it
will most commonly be the same throughout the cluster. The default is
set to /usr/lib/ceph/erasure-code with
osd_pool_default_erasure_code_directory

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: add error message argument to prepare_new_pool
Loic Dachary [Fri, 20 Dec 2013 16:23:16 +0000 (17:23 +0100)]
mon: add error message argument to prepare_new_pool

Add a stringstream argument to prepare_new_pool for the purpose of
recording human readable error message.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: do not include = in pool properties values
Loic Dachary [Sat, 21 Dec 2013 13:52:17 +0000 (14:52 +0100)]
mon: do not include = in pool properties values

foo=bar was parsed as {"foo":"=bar"} instead of {"foo":"bar"} because of
the missing equal++

Signed-off-by: Loic Dachary <loic@dachary.org>
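
The off-by-one in the commit above is easy to reproduce. The Python sketch below is an analogy to the C++ fix, with hypothetical function names, and assumes the input always contains an "=": the buggy version slices the value starting at the "=" itself, while the fixed version steps past it first.

```python
def split_buggy(pair):
    equal = pair.find("=")
    # Bug: the value slice starts at the '=' itself.
    return pair[:equal], pair[equal:]


def split_fixed(pair):
    equal = pair.find("=")
    key = pair[:equal]
    equal += 1  # step past '=' (the missing "equal++")
    return key, pair[equal:]
```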
11 years agocommon: implement get_str_map to parse key/values
Loic Dachary [Sat, 21 Dec 2013 12:58:44 +0000 (13:58 +0100)]
common: implement get_str_map to parse key/values

It is capable of parsing json or key=value pairs. The prototype is made
to look like get_str_list. The implementation lives in common and include
and uses .h. It will probably be moved to common and use .hpp instead,
along with str_list.{cc,h}.

Signed-off-by: Loic Dachary <loic@dachary.org>
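
A rough Python sketch of the "json or key=value" behaviour described in the commit above follows. It reuses the name `get_str_map` for readability, but the signature, the whitespace delimiter and the handling of bare keys are assumptions of this sketch rather than the C++ function's documented behaviour.

```python
import json


def get_str_map(text):
    """Parse `text` as a JSON object, else as key=value pairs."""
    try:
        parsed = json.loads(text)
    except ValueError:
        parsed = None
    if isinstance(parsed, dict):
        # Normalise every key and value to a string.
        return {str(k): str(v) for k, v in parsed.items()}
    result = {}
    # Assumption: pairs are separated by whitespace.
    for token in text.split():
        key, sep, value = token.partition("=")
        result[key] = value  # a bare key maps to ""
    return result
```

Trying JSON first and falling back to key=value keeps a single entry point for both input styles.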
11 years agoosd: pool properties are not an array
Loic Dachary [Sat, 21 Dec 2013 13:48:27 +0000 (14:48 +0100)]
osd: pool properties are not an array

They must be dumped with open_object_section instead of
open_array_section otherwise only the values are displayed.

Signed-off-by: Loic Dachary <loic@dachary.org>
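
The difference between the two section types in the commit above shows up directly in the dumped output. In the Python sketch below, the property names and the two helper functions are hypothetical; the point is only that an array drops the keys while an object keeps them.

```python
import json


def dump_as_array(properties):
    # Array section: each entry is emitted without its key.
    return json.dumps(list(properties.values()))


def dump_as_object(properties):
    # Object section: keys and values are both preserved.
    return json.dumps(properties, sort_keys=True)


if __name__ == "__main__":
    props = {"erasure-code-k": "2", "erasure-code-m": "1"}
    print(dump_as_array(props))
    print(dump_as_object(props))
```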
11 years agomon: osd create pool must fail on incompatible type
Loic Dachary [Sat, 21 Dec 2013 14:49:19 +0000 (15:49 +0100)]
mon: osd create pool must fail on incompatible type

When osd create pool is called twice on the same pool, it will succeed
because the pool already exists. However, if a different type is
specified, it must fail.

Signed-off-by: Loic Dachary <loic@dachary.org>
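
The rule in the commit above (re-creating an existing pool succeeds unless the requested type differs) can be sketched in Python. The `pools` dict, the function name and the choice of EINVAL as the error code are assumptions for illustration; the commit message does not say which error is returned.

```python
import errno


def create_pool(pools, name, pool_type):
    """Create `name` in the `pools` dict (name -> type).

    Returns 0 on success or a negative errno. The specific error
    code for a type mismatch is an assumption of this sketch.
    """
    existing = pools.get(name)
    if existing is None:
        pools[name] = pool_type
        return 0
    if existing == pool_type:
        return 0  # idempotent: the pool is already there
    return -errno.EINVAL
```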
11 years agopackaging: erasure-code plugins go in /usr/lib/ceph
Loic Dachary [Fri, 20 Dec 2013 16:05:45 +0000 (17:05 +0100)]
packaging: erasure-code plugins go in /usr/lib/ceph

Install the plugins in /usr/lib/ceph/erasure-code instead of
/usr/lib/erasure-code to comply with FHS : "Applications may use a
single subdirectory under /usr/lib."

http://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html

The debian package is modified to install the plugins as part of the
ceph package which also ships rados-classes.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #983 from dachary/wip-rep-replicated
Sage Weil [Sun, 22 Dec 2013 20:39:08 +0000 (12:39 -0800)]
Merge pull request #983 from dachary/wip-rep-replicated

mon: s/rep/replicated/ in pool create prototype

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomon: s/rep/replicated/ in pool create prototype 983/head
Loic Dachary [Sun, 22 Dec 2013 17:26:42 +0000 (18:26 +0100)]
mon: s/rep/replicated/ in pool create prototype

The test is updated to remove unnecessary asserts. Since all combinations
of properties and pool type are allowed, there is no way to statically
check the validity of the arguments.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoceph_test_rados: update in-memory user_version on RemoveAttrsOp
Sage Weil [Sun, 22 Dec 2013 07:32:24 +0000 (23:32 -0800)]
ceph_test_rados: update in-memory user_version on RemoveAttrsOp

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: clear whiteout on successful copy-from
Sage Weil [Sun, 22 Dec 2013 07:01:56 +0000 (23:01 -0800)]
osd/ReplicatedPG: clear whiteout on successful copy-from

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados: check existence on is_dirty completion
Sage Weil [Sun, 22 Dec 2013 06:52:28 +0000 (22:52 -0800)]
ceph_test_rados: check existence on is_dirty completion

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomon/OSDMonitor: propagate snap updates to tier pools on update
Sage Weil [Thu, 19 Dec 2013 23:01:26 +0000 (15:01 -0800)]
mon/OSDMonitor: propagate snap updates to tier pools on update

For any pg_pool_t update, verify that any changes to the pool snapshot
metadata are propagated to the tiers.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/OSDMap: implement propagate_snaps_to_tiers()
Sage Weil [Thu, 19 Dec 2013 22:59:45 +0000 (14:59 -0800)]
osd/OSDMap: implement propagate_snaps_to_tiers()

Tier pools mirror the base pool's snapshot metadata.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agorgw: add -ldl for mongoose
Sage Weil [Sun, 22 Dec 2013 17:00:43 +0000 (09:00 -0800)]
rgw: add -ldl for mongoose

/usr/bin/ld: mongoose/mongoose.o: undefined reference to symbol 'dlsym@@GLIBC_2.2.5'
/lib/x86_64-linux-gnu/libdl.so.2: error adding symbols: DSO missing from command line
error: collect2: ld returned 1 exit status

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #980 from ceph/port/misc
Sage Weil [Sun, 22 Dec 2013 17:34:12 +0000 (09:34 -0800)]
Merge pull request #980 from ceph/port/misc

Misc portability patches

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #981 from dachary/wip-rep-replicated
Sage Weil [Sun, 22 Dec 2013 07:43:38 +0000 (23:43 -0800)]
Merge pull request #981 from dachary/wip-rep-replicated

replace pool type REP with REPLICATED

Reviewed-by: Sage Weil <sage@inktank.com>