git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
11 years agoman: update man/ from doc/man/8 920/head
Loic Dachary [Sat, 7 Dec 2013 21:07:38 +0000 (22:07 +0100)]
man: update man/ from doc/man/8

As explained in admin/manpage-howto.txt

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoman: Ceph is also an object store
Loic Dachary [Sat, 7 Dec 2013 20:52:16 +0000 (21:52 +0100)]
man: Ceph is also an object store

Replace

   Ceph distributed file system

with

   Ceph distributed storage system

to help reduce the idea that Ceph is just a file system.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #923 from dachary/wip-crush-test
Sage Weil [Tue, 10 Dec 2013 17:06:31 +0000 (09:06 -0800)]
Merge pull request #923 from dachary/wip-crush-test

CrushTester patches and documentation

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoos/MemStore: do on_apply_sync callback synchronously
Sage Weil [Tue, 10 Dec 2013 16:56:35 +0000 (08:56 -0800)]
os/MemStore: do on_apply_sync callback synchronously

We can easily deadlock if we put this in the Finisher thread behind other
work; do it synchronously!

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agocrush: implement --show-bad-mappings for indep 923/head
Loic Dachary [Mon, 9 Dec 2013 13:35:00 +0000 (14:35 +0100)]
crush: implement --show-bad-mappings for indep

Support the presence of ITEM_NONE device numbers in the indep mapping as
proof of a bad mapping. Implement the associated unit tests.

Signed-off-by: Loic Dachary <loic@dachary.org>
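The check described in this commit can be sketched as follows. This is a Python illustration, not the C++ code from CrushTester. The constant's value and the extra length check for short mappings are assumptions made for the example.

```python
# Hypothetical stand-in for the CRUSH "no item" marker; the value is assumed.
ITEM_NONE = 0x7FFFFFFF


def is_bad_mapping(mapping, num_rep):
    """Return True if a mapping should be reported as bad.

    A result with fewer or more entries than requested is treated as bad
    (an assumption for this sketch). So is any slot holding ITEM_NONE,
    which is how an indep rule marks a position it could not fill.
    """
    return len(mapping) != num_rep or ITEM_NONE in mapping
```

With indep the result keeps its width and holes are marked in place, so the presence of the marker is what reveals the failure.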
11 years agocrush: add unittest for crushtool --show-bad-mappings
Loic Dachary [Mon, 9 Dec 2013 13:08:14 +0000 (14:08 +0100)]
crush: add unittest for crushtool --show-bad-mappings

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: remove scary message string
Loic Dachary [Sun, 8 Dec 2013 21:39:18 +0000 (22:39 +0100)]
crush: remove scary message string

The string is no longer used and can be removed.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: document the --test mode of operations
Loic Dachary [Sun, 8 Dec 2013 21:03:33 +0000 (22:03 +0100)]
crush: document the --test mode of operations

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #918 from ceph/port/misc
Sage Weil [Mon, 9 Dec 2013 19:16:49 +0000 (11:16 -0800)]
Merge pull request #918 from ceph/port/misc

Misc portability patches

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #922 from dachary/wip-crush-choose-tries
Sage Weil [Mon, 9 Dec 2013 16:28:43 +0000 (08:28 -0800)]
Merge pull request #922 from dachary/wip-crush-choose-tries

crush: fix map->choose_tries boundary test

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agocrush: --show-utilization* implies --show-statistics
Loic Dachary [Sun, 8 Dec 2013 18:45:28 +0000 (19:45 +0100)]
crush: --show-utilization* implies --show-statistics

--show-utilization* outputs only if --show-statistics is set, which is
confusing. Instead of failing, set --show-statistics to avoid the
confusion.

Signed-off-by: Loic Dachary <loic@dachary.org>
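The "implies" behaviour amounts to normalizing the options before they are used. A minimal sketch, assuming the options are held in a dictionary with hypothetical key names (the real tool parses them in C++):

```python
def normalize_show_options(opts):
    """Return a copy of opts in which any enabled show_utilization* flag
    also enables show_statistics, instead of being silently ignored."""
    result = dict(opts)
    if any(value for key, value in opts.items()
           if key.startswith("show_utilization")):
        result["show_statistics"] = True
    return result
```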
11 years agocrush: add CrushTester accessors
Loic Dachary [Sun, 8 Dec 2013 18:39:16 +0000 (19:39 +0100)]
crush: add CrushTester accessors

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: output --show-bad-mappings on err
Loic Dachary [Sun, 8 Dec 2013 16:57:25 +0000 (17:57 +0100)]
crush: output --show-bad-mappings on err

Write it to stderr instead of stdout so that it displays well when used in
conjunction with --show-statistics.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: fix map->choose_tries boundary test 922/head
Loic Dachary [Sun, 8 Dec 2013 13:38:59 +0000 (14:38 +0100)]
crush: fix map->choose_tries boundary test

CrushWrapper::start_choose_profile allocates map->choose_tries with
choose_total_tries elements. When crush_choose_firstn sets a value, it
tests against map->choose_local_tries which could lead to memory
corruption if map->choose_total_tries is smaller than
map->choose_local_tries.

Another undesirable but non-fatal side effect is that the output of crushtool
--show-choose-tries will be truncated to choose_local_tries, which is
set to a lower value than choose_total_tries by the default tunables.

Signed-off-by: Loic Dachary <loic@dachary.org>
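The bug class described here is a guard that tests an index against a different limit than the one used to size the array. A minimal Python sketch of the safe pattern follows. The function name is hypothetical and the list stands in for map->choose_tries.

```python
def record_tries(choose_tries, ftotal):
    """Count one placement that needed ftotal retries.

    The bound comes from the histogram itself, so the guard can never
    disagree with the allocated size. Returns False when the sample
    falls outside the histogram and is dropped.
    """
    if 0 <= ftotal < len(choose_tries):
        choose_tries[ftotal] += 1
        return True
    return False
```

In C the out-of-range write would corrupt memory silently, which is why the limit used in the test has to match the limit used at allocation time.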
11 years agoMerge pull request #869 from ceph/wip-crush
Sage Weil [Sun, 8 Dec 2013 04:59:22 +0000 (20:59 -0800)]
Merge pull request #869 from ceph/wip-crush

crush changes for erasure coding

Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agolibrbd: remove unused private variable 918/head
Noah Watkins [Sat, 7 Dec 2013 17:58:43 +0000 (09:58 -0800)]
librbd: remove unused private variable

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agoTrackedOp: remove unused private variable
Noah Watkins [Sat, 7 Dec 2013 17:54:53 +0000 (09:54 -0800)]
TrackedOp: remove unused private variable

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agolibrbd: rename howmany to avoid conflict
Noah Watkins [Sat, 7 Dec 2013 17:59:13 +0000 (09:59 -0800)]
librbd: rename howmany to avoid conflict

A howmany macro exists on some platforms in standard headers, but there
really isn't any sort of standard that I've found. We just avoid the
conflict entirely this way.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agoMerge pull request #917 from ceph/port/compat
Sage Weil [Sat, 7 Dec 2013 22:01:14 +0000 (14:01 -0800)]
Merge pull request #917 from ceph/port/compat

compat: define replacement TEMP_FAILURE_RETRY

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #919 from ceph/port/fdatasync
Sage Weil [Sat, 7 Dec 2013 22:00:40 +0000 (14:00 -0800)]
Merge pull request #919 from ceph/port/fdatasync

wbthrottle: use feature check for fdatasync

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agowbthrottle: use feature check for fdatasync 919/head
Noah Watkins [Sun, 29 Sep 2013 18:32:29 +0000 (11:32 -0700)]
wbthrottle: use feature check for fdatasync

Checking for fdatasync uses the same approach as the qemu configure
script. The relevant commit is d1722a27f552a22561104210e0afad4577878e53.
Here is a copy of the commit message which explains the check:

Under Darwin, a symbol exists for the fdatasync() function, so that our
link test succeeds. However _POSIX_SYNCHRONIZED_IO is set to '-1'.

According to POSIX:2008, a value of -1 means the feature is not
supported.
A value of 0 means supported at compilation time, and a value greater than 0
means supported at both compilation and run time.

Enable fdatasync() only if _POSIX_SYNCHRONIZED_IO is '>0'.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
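The decision rule quoted above can be sketched in Python. The fallback to fsync() when fdatasync() is not enabled is an assumption for illustration; the commit message does not say what the alternative path is.

```python
import os


def fdatasync_enabled(posix_synchronized_io):
    """Apply the rule from the commit message: enable only for values > 0.

    -1 means unsupported and 0 means compile-time support only, so neither
    is sufficient even if the fdatasync symbol links.
    """
    return posix_synchronized_io > 0


def sync_data(fd, posix_synchronized_io):
    """Flush fd with fdatasync when enabled, otherwise fall back to fsync
    (the fallback is assumed). Returns the name of the call that was used."""
    if fdatasync_enabled(posix_synchronized_io) and hasattr(os, "fdatasync"):
        os.fdatasync(fd)
        return "fdatasync"
    os.fsync(fd)
    return "fsync"
```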
11 years agorados_sync: fix mismatched tag warning
Noah Watkins [Sat, 7 Dec 2013 17:59:39 +0000 (09:59 -0800)]
rados_sync: fix mismatched tag warning

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agorados_sync: remove unused private variable
Noah Watkins [Sat, 7 Dec 2013 18:01:30 +0000 (10:01 -0800)]
rados_sync: remove unused private variable

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agomon: check for sys/vfs.h existence
Noah Watkins [Fri, 27 Sep 2013 14:38:11 +0000 (07:38 -0700)]
mon: check for sys/vfs.h existence

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agomake: increase maximum template recursion depth
Noah Watkins [Tue, 29 Oct 2013 15:54:01 +0000 (08:54 -0700)]
make: increase maximum template recursion depth

With clang on OSX spirit blows up without this.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agocompat: define replacement TEMP_FAILURE_RETRY 917/head
Noah Watkins [Sun, 22 Sep 2013 18:02:34 +0000 (11:02 -0700)]
compat: define replacement TEMP_FAILURE_RETRY

Not all platforms have it.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
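For context, glibc's TEMP_FAILURE_RETRY re-evaluates an expression for as long as it fails with EINTR. The same idea can be sketched in Python, where an interrupted call surfaces as InterruptedError. The helper name is hypothetical and this is not the replacement macro added by the commit.

```python
def temp_failure_retry(call):
    """Invoke call() again for as long as it is interrupted by a signal,
    then return its result. Any other exception propagates unchanged."""
    while True:
        try:
            return call()
        except InterruptedError:
            continue
```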
11 years agoMerge remote-tracking branch 'gh/wip-fix-3x'
Sage Weil [Sat, 7 Dec 2013 00:56:10 +0000 (16:56 -0800)]
Merge remote-tracking branch 'gh/wip-fix-3x'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoMerge remote-tracking branch 'gh/wip-fix-tunables'
Sage Weil [Sat, 7 Dec 2013 00:55:54 +0000 (16:55 -0800)]
Merge remote-tracking branch 'gh/wip-fix-tunables'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agocrush/CrushCompiler: make current set of tunables 'safe'
Sage Weil [Sat, 7 Dec 2013 00:03:21 +0000 (16:03 -0800)]
crush/CrushCompiler: make current set of tunables 'safe'

We can reenable this error the next time we add new tunables.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrushtool: remove scary tunables messages
Sage Weil [Sat, 7 Dec 2013 00:20:23 +0000 (16:20 -0800)]
crushtool: remove scary tunables messages

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushCompiler: start with legacy tunables when compiling
Sage Weil [Sat, 7 Dec 2013 00:18:04 +0000 (16:18 -0800)]
crush/CrushCompiler: start with legacy tunables when compiling

Ensure that a crush file is always compiled deterministically, even though
the default values for *new* maps have changed.

Signed-off-by: Sage Weil <sage@inktank.com>
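A sketch of why a fixed starting point makes compilation deterministic: the compiler begins from the legacy values and only overrides what the text names explicitly. The legacy numbers below are recalled from CRUSH defaults and should be treated as assumptions. The parser is a toy that only understands "tunable NAME VALUE" lines.

```python
# Assumed legacy tunable values; verify against the CRUSH sources before use.
LEGACY_TUNABLES = {
    "choose_local_tries": 2,
    "choose_local_fallback_tries": 5,
    "choose_total_tries": 19,
    "chooseleaf_descend_once": 0,
}


def compile_tunables(crush_text):
    """Return the tunables for a crush map text.

    Starts from the legacy set rather than the current defaults for new
    maps, so the same text always yields the same result.
    """
    tunables = dict(LEGACY_TUNABLES)
    for line in crush_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "tunable":
            tunables[fields[1]] = int(fields[2])
    return tunables
```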
11 years agocrush: add indep data set to cli tests 869/head
Sage Weil [Sat, 7 Dec 2013 00:04:55 +0000 (16:04 -0800)]
crush: add indep data set to cli tests

This will help us catch things if we break the mapping.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdmaptool: fix cli tests for 3x
Sage Weil [Sat, 7 Dec 2013 00:13:50 +0000 (16:13 -0800)]
osdmaptool: fix cli tests for 3x

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: default to 3x replication
Sage Weil [Fri, 6 Dec 2013 18:35:45 +0000 (10:35 -0800)]
osd: default to 3x replication

3x is the recommendation; it should be the default too.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #913 from dachary/wip-crush-unittest
Sage Weil [Sat, 7 Dec 2013 00:10:00 +0000 (16:10 -0800)]
Merge pull request #913 from dachary/wip-crush-unittest

CrushWrapper::move_bucket unittest and minor fixes

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoRevert "osd: default to 3x replication"
Sage Weil [Fri, 6 Dec 2013 23:48:39 +0000 (15:48 -0800)]
Revert "osd: default to 3x replication"

This reverts commit cb26fbde52f31b449af60acce3ced34e593d6e1e.

Fix unit tests and do integration tests first; this may have unexpected
consequences.

11 years agocrush: detach_bucket must test item >= 0 not > 0 913/head
Loic Dachary [Fri, 6 Dec 2013 23:31:54 +0000 (00:31 +0100)]
crush: detach_bucket must test item >= 0 not > 0

Since detach_bucket is a private helper solely used by move_bucket which
contains another ( correct ) safeguard, the code cannot be reached and
the problem can never happen. If another function uses detach_bucket,
it may happen.

Signed-off-by: Loic Dachary <loic@dachary.org>
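The off-by-one matters because CRUSH device ids start at 0 while bucket ids are negative. A minimal sketch of the corrected guard follows. The function name and the choice of -EINVAL as the error are assumptions for illustration.

```python
import errno


def detach_bucket_guard(item):
    """Return 0 if item is a bucket id, or -EINVAL if it is a device id.

    Device ids start at 0, so the test must be item >= 0; testing
    item > 0 would let device 0 through as if it were a bucket.
    """
    if item >= 0:
        return -errno.EINVAL
    return 0
```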
11 years agocrush: remove obsolete comments from link_bucket
Loic Dachary [Fri, 6 Dec 2013 23:27:09 +0000 (00:27 +0100)]
crush: remove obsolete comments from link_bucket

Probably copy/pasted from move_bucket.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: remove redundant code from move_bucket
Loic Dachary [Fri, 6 Dec 2013 23:21:16 +0000 (00:21 +0100)]
crush: remove redundant code from move_bucket

The following was introduced in 2012 by a2d0cff1b071bed84ac439e4fcf9ddfb936f89c8

  // un-set the device name so we can use add_item later
  build_rmap(name_map, name_rmap);
  name_map.erase(id);
  name_rmap.erase(id_name);

when insert_item refused to move a bucket for which a name already
exists. It was changed in 2013 by
4e2557a038dc1e8c68993ad8571d74e2eb8ea90a and now supports it. The
TestCrushWrapper unittest for move_bucket passes.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: unittest CrushWrapper::move_bucket
Loic Dachary [Fri, 6 Dec 2013 23:19:50 +0000 (00:19 +0100)]
crush: unittest CrushWrapper::move_bucket

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #888 from ceph/wip-crush-tunables
Sage Weil [Fri, 6 Dec 2013 22:45:57 +0000 (14:45 -0800)]
Merge pull request #888 from ceph/wip-crush-tunables

default to bobtail-era crush tunables.

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoMerge pull request #903 from ceph/wip-memstore
Sage Weil [Fri, 6 Dec 2013 22:38:15 +0000 (14:38 -0800)]
Merge pull request #903 from ceph/wip-memstore

memstore: reference ObjectStore backend

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #907 from ceph/wip-3x
Sage Weil [Fri, 6 Dec 2013 22:25:38 +0000 (14:25 -0800)]
Merge pull request #907 from ceph/wip-3x

osd: default to 3x replication

11 years agocrush/mapper: dump indep partial progression for debugging
Sage Weil [Wed, 4 Dec 2013 00:46:49 +0000 (16:46 -0800)]
crush/mapper: dump indep partial progression for debugging

...if DEBUG_INDEP is #defined.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoPendingReleaseNotes: note change of CRUSH indep mode in release notes
Sage Weil [Tue, 3 Dec 2013 22:46:46 +0000 (14:46 -0800)]
PendingReleaseNotes: note change of CRUSH indep mode in release notes

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: add feature CRUSH_V2 for new indep mode and SET_*_TRIES rule steps
Sage Weil [Fri, 6 Dec 2013 21:58:51 +0000 (13:58 -0800)]
crush: add feature CRUSH_V2 for new indep mode and SET_*_TRIES rule steps

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: CHOOSE_LEAF -> CHOOSELEAF throughout
Sage Weil [Tue, 3 Dec 2013 21:40:47 +0000 (13:40 -0800)]
crush: CHOOSE_LEAF -> CHOOSELEAF throughout

This aligns the internal identifier names with the user-visible names in
the decompiled crush map language.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/OSDMap: fix feature calculation for CACHEPOOL
Sage Weil [Tue, 3 Dec 2013 18:59:29 +0000 (10:59 -0800)]
osd/OSDMap: fix feature calculation for CACHEPOOL

We need to include the feature in the mask.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushCompiler: [de]compile set_choose[leaf]_tries rule step
Sage Weil [Tue, 3 Dec 2013 01:50:44 +0000 (17:50 -0800)]
crush/CrushCompiler: [de]compile set_choose[leaf]_tries rule step

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushWrapper: set chooseleaf_tries to 5 for 'simple' indep rules
Sage Weil [Tue, 3 Dec 2013 16:49:15 +0000 (08:49 -0800)]
crush/CrushWrapper: set chooseleaf_tries to 5 for 'simple' indep rules

When making a generic indep rule, set the recursive retry to 5.  This gives
better overall results.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/mapper: add SET_CHOOSE_TRIES rule step
Sage Weil [Tue, 3 Dec 2013 16:34:39 +0000 (08:34 -0800)]
crush/mapper: add SET_CHOOSE_TRIES rule step

Since we can specify the recursive retries in a rule, we may as well also
specify the non-recursive tries too for completeness.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/mapper: apply chooseleaf_tries to firstn mode too
Sage Weil [Tue, 3 Dec 2013 16:33:55 +0000 (08:33 -0800)]
crush/mapper: apply chooseleaf_tries to firstn mode too

Parameterize the attempts for the _firstn choose method, and apply the
rule-specified tries count to firstn mode as well.  Note that we have
slightly different behavior here than with indep:

 If the tries value is not specified for firstn, we pass through the
 normal attempt count.  This maintains compatibility with legacy behavior.
 Note that this is usually *not* actually N^2 work, though, because of the
 descend_once tunable.  However, descend_once is unfortunately *not* the
 same thing as 1 chooseleaf try because it is only checked on a reject but
 not on a collision.  Sigh.

 In contrast, for indep, if tries is not specified we default to 1
 recursive attempt, because that is simply more sane, and we have the
 option to do so.  The descend_once tunable has no effect for indep.

Signed-off-by: Sage Weil <sage@inktank.com>
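The two defaults described above can be summarized in a small Python sketch. The function and parameter names are hypothetical, and the descend_once tunable is deliberately not modelled.

```python
def recursive_tries(mode, chooseleaf_tries, choose_tries):
    """Pick the retry count for the recursive (chooseleaf) call.

    chooseleaf_tries is the value set by the rule, or 0 if unset.
    choose_tries is the normal attempt count for the outer call.
    """
    if chooseleaf_tries:
        # The rule set an explicit value; both modes honour it.
        return chooseleaf_tries
    if mode == "firstn":
        # Legacy-compatible behaviour: pass through the normal count.
        return choose_tries
    if mode == "indep":
        # No legacy to preserve, so default to a single recursive attempt.
        return 1
    raise ValueError("unknown mode: %r" % (mode,))
```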
11 years agocrush/mapper: fix up the indep tests
Sage Weil [Tue, 3 Dec 2013 01:39:15 +0000 (17:39 -0800)]
crush/mapper: fix up the indep tests

Fix indentation.
Simplify+fix the changed vs moved calculation.
Use the new SET_CHOOSE_LEAF_TRIES command.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #909 from dachary/wip-crush-unittest
Sage Weil [Fri, 6 Dec 2013 20:35:52 +0000 (12:35 -0800)]
Merge pull request #909 from dachary/wip-crush-unittest

more CrushWrapper unittest

11 years agocrush: unittest CrushWrapper::get_immediate_parent 909/head
Loic Dachary [Fri, 6 Dec 2013 18:33:49 +0000 (19:33 +0100)]
crush: unittest CrushWrapper::get_immediate_parent

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: unittest CrushWrapper::update_item
Loic Dachary [Fri, 6 Dec 2013 14:44:03 +0000 (15:44 +0100)]
crush: unittest CrushWrapper::update_item

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: unittest s/std::string/string/
Loic Dachary [Fri, 6 Dec 2013 14:43:23 +0000 (15:43 +0100)]
crush: unittest s/std::string/string/

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: unittest use const instead of define
Loic Dachary [Fri, 6 Dec 2013 13:39:10 +0000 (14:39 +0100)]
crush: unittest use const instead of define

And reduce the depth of the hierarchy because three levels of buckets
capture the same cases as four levels.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: unittest CrushWrapper::check_item_loc
Loic Dachary [Fri, 6 Dec 2013 12:32:31 +0000 (13:32 +0100)]
crush: unittest CrushWrapper::check_item_loc

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: unittest remove useless c->create()
Loic Dachary [Fri, 6 Dec 2013 12:31:22 +0000 (13:31 +0100)]
crush: unittest remove useless c->create()

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge remote-tracking branch 'origin/next'
Yehuda Sadeh [Fri, 6 Dec 2013 19:24:06 +0000 (11:24 -0800)]
Merge remote-tracking branch 'origin/next'

11 years agoosd: default to 3x replication 907/head
Sage Weil [Fri, 6 Dec 2013 18:35:45 +0000 (10:35 -0800)]
osd: default to 3x replication

3x is the recommendation; it should be the default too.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #901 from dachary/wip-crush-unittest
Sage Weil [Fri, 6 Dec 2013 16:29:01 +0000 (08:29 -0800)]
Merge pull request #901 from dachary/wip-crush-unittest

crush: check for invalid names in loc[]

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agocrush: check for invalid names in loc[] 901/head
Loic Dachary [Thu, 5 Dec 2013 18:41:50 +0000 (19:41 +0100)]
crush: check for invalid names in loc[]

Add the is_valid_crush_loc helper to test for invalid crush names in
insert_item and update_item, before performing any side
effect. Implement the associated unit tests.

Signed-off-by: Loic Dachary <loic@dachary.org>
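The pattern here is "validate everything, then mutate". A toy Python sketch follows. The set of allowed characters and the decision to validate both the type and the name in each loc entry are assumptions, and insert_item is reduced to a dictionary update.

```python
import string

# Assumed set of characters allowed in a crush name.
_VALID_CHARS = set(string.ascii_letters + string.digits + "-_.")


def is_valid_crush_name(name):
    """A name is valid if it is non-empty and uses only allowed characters."""
    return bool(name) and all(c in _VALID_CHARS for c in name)


def is_valid_crush_loc(loc):
    """Check every type and name in a location map such as {"host": "a"}."""
    return all(is_valid_crush_name(k) and is_valid_crush_name(v)
               for k, v in loc.items())


def insert_item(names, item_id, item_name, loc):
    """Toy insert: reject bad input before touching any state."""
    if not is_valid_crush_name(item_name) or not is_valid_crush_loc(loc):
        return False
    names[item_id] = item_name
    return True
```

Because the check runs first, a rejected call leaves the map exactly as it was.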
11 years agoosd: queue pg deletion after on_removal txn 903/head
Sage Weil [Fri, 6 Dec 2013 06:11:41 +0000 (22:11 -0800)]
osd: queue pg deletion after on_removal txn

The removal is normally so slow that these don't really race, but they
could.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoos/MemStore: implement reference 'memstore' backend
Sage Weil [Fri, 6 Dec 2013 00:58:06 +0000 (16:58 -0800)]
os/MemStore: implement reference 'memstore' backend

This is (as near to) a trivial ObjectStore backend for the OSD as we can
get at the moment.  Everything is stored in memory.  We are slightly
tricky with the locking, but not overly so.

On umount we dump everything out to disk, and on mount we load it all in
again, so we have some very coarse persistence/durability... just enough
to make this usable in a non-failure environment.

Signed-off-by: Sage Weil <sage@inktank.com>
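The coarse persistence model (load everything on mount, dump everything on umount) can be sketched in a few lines of Python. The class, its methods and the JSON file format are hypothetical and are not the MemStore implementation.

```python
import json


class ToyMemStore:
    """Keeps all objects in a dict; touches the disk only on mount/umount."""

    def __init__(self, path):
        self.path = path
        self.objects = {}

    def mount(self):
        """Load the whole store from disk, or start empty if no dump exists."""
        try:
            with open(self.path) as f:
                self.objects = json.load(f)
        except FileNotFoundError:
            self.objects = {}

    def write(self, name, data):
        """Store data in memory only; nothing is durable until umount."""
        self.objects[name] = data

    def umount(self):
        """Dump everything to disk and drop the in-memory copy."""
        with open(self.path, "w") as f:
            json.dump(self.objects, f)
        self.objects = {}
```

Anything written after the last umount is lost on a crash, which matches the "non-failure environment" caveat in the commit message.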
11 years agoMerge pull request #900 from ceph/wip-mon-mds-trim
João Eduardo Luís [Fri, 6 Dec 2013 02:15:21 +0000 (18:15 -0800)]
Merge pull request #900 from ceph/wip-mon-mds-trim

mon: MDSMonitor: trim versions and let PaxosService decide whether to propose

We were not trimming mdsmap versions and were generating a new map every time
we modified the pending value.

Now we not only make sure that MDSMonitor will trim old maps (configurable
option allowing us to set the maximum number of maps to keep, defaulting to 500,
much like other services do) but we also delegate to PaxosService the decision on
whether to propose our pending value.

We also perform several modifications to 'ceph-kvstore-tool', allowing one to obtain
the contents of a given prefix:key and have them outputted to a file instead of stdout,
and also add support for getting the size of a given prefix:key's value.

'ceph report' was also modified so that we always output the first and last
committed versions for all services; up until this point, we would only output the
first committed version on all services, and only a few were also outputting the
last committed version.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomon: ceph-kvstore-tool: get size of value for prefix/key 900/head
Joao Eduardo Luis [Thu, 5 Dec 2013 17:05:33 +0000 (17:05 +0000)]
mon: ceph-kvstore-tool: get size of value for prefix/key

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agotools: ceph-kvstore-tool: output value contents to file on 'get'
Joao Eduardo Luis [Thu, 5 Dec 2013 12:08:35 +0000 (12:08 +0000)]
tools: ceph-kvstore-tool: output value contents to file on 'get'

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: Have 'ceph report' print last committed versions
Joao Eduardo Luis [Thu, 5 Dec 2013 17:39:50 +0000 (17:39 +0000)]
mon: Have 'ceph report' print last committed versions

Only for those services that weren't doing it.

Backport: dumpling
Backport: emperor

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: MDSMonitor: let PaxosService decide on whether to propose
Joao Eduardo Luis [Thu, 5 Dec 2013 17:26:47 +0000 (17:26 +0000)]
mon: MDSMonitor: let PaxosService decide on whether to propose

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoos/ObjectStore: make getattrs() pure virtual
Sage Weil [Thu, 5 Dec 2013 23:33:20 +0000 (15:33 -0800)]
os/ObjectStore: make getattrs() pure virtual

It is required.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agos/true/1 and s/false/0
tamil [Thu, 5 Dec 2013 21:05:12 +0000 (13:05 -0800)]
s/true/1 and s/false/0

Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
11 years agomon: MDSMonitor: implement 'get_trim_to()' to let the mon trim mdsmaps
Joao Eduardo Luis [Wed, 4 Dec 2013 17:49:10 +0000 (17:49 +0000)]
mon: MDSMonitor: implement 'get_trim_to()' to let the mon trim mdsmaps

This commit also adds two options to the MDSMonitor:

  - mon_max_mdsmap_epochs: the maximum amount of maps we'll keep (def: 500)
  - mon_mds_force_trim: the version we want to trim to

This results in 'get_trim_to()' returning the possible values:

  - if we have set mon_mds_force_trim, and this value is greater than the
    last committed version, trim to mon_mds_force_trim
  - if we hold more than the max number of maps, trim to last - max
  - if we have set mon_mds_force_trim and if we hold more than the max
    number of maps, and mon_mds_force_trim is lower than last - max,
    then trim to last - max

Backport: dumpling
Backport: emperor

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: MDSMonitor: print map on encode_pending() iff debug mon = 30+
Joao Eduardo Luis [Wed, 4 Dec 2013 00:41:13 +0000 (00:41 +0000)]
mon: MDSMonitor: print map on encode_pending() iff debug mon = 30+

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: MDSMonitor: consider 'debug level' parameter on 'print_map()'
Joao Eduardo Luis [Wed, 4 Dec 2013 00:40:37 +0000 (00:40 +0000)]
mon: MDSMonitor: consider 'debug level' parameter on 'print_map()'

The parameter was there, just not used.  It does default to 7, so
existing callers are okay.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: MDSMonitor: remove reference to no-longer-used encode_trim()
Joao Eduardo Luis [Wed, 4 Dec 2013 00:34:38 +0000 (00:34 +0000)]
mon: MDSMonitor: remove reference to no-longer-used encode_trim()

We weren't using it and it's no longer used by anyone anyway.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoMerge pull request #899 from dachary/wip-crush-unittest
Sage Weil [Thu, 5 Dec 2013 17:18:50 +0000 (09:18 -0800)]
Merge pull request #899 from dachary/wip-crush-unittest

CrushWrapper::insert_item unittest and minor fixes

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agocrush: CrushWrapper unit tests 899/head
Loic Dachary [Thu, 5 Dec 2013 13:09:16 +0000 (14:09 +0100)]
crush: CrushWrapper unit tests

Covers all cases for the following methods. All but insert_item are trivial.

* insert_item
* set_item_name
* name_exists
* item_exists
* get_item_id
* get_item_name
* get_num_type_names
* get_type_id
* get_type_name
* is_valid_crush_name

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: remove redundant test in insert_item
Loic Dachary [Thu, 5 Dec 2013 12:01:00 +0000 (13:01 +0100)]
crush: remove redundant test in insert_item

A year after the last modification of the test that checks whether an item
was added twice to the same bucket, the subtree_contains test was added a
few lines above it, making it redundant.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: insert_item returns on error if bucket name is invalid
Loic Dachary [Thu, 5 Dec 2013 08:54:37 +0000 (09:54 +0100)]
crush: insert_item returns on error if bucket name is invalid

A bucket name may be created as a side effect of insert_item. All names
in the loc argument are checked for validity at the beginning of the
method and an error is returned immediately if an invalid one is found. This
makes it unnecessary to check for errors when setting the name of an item later on.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoos/ObjectStore: prevent copying
Sage Weil [Wed, 4 Dec 2013 22:46:49 +0000 (14:46 -0800)]
os/ObjectStore: prevent copying

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoos/ObjectStore: pass cct to ctor
Sage Weil [Wed, 4 Dec 2013 22:46:40 +0000 (14:46 -0800)]
os/ObjectStore: pass cct to ctor

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #892 from jpds/ceph-disk-journal-mbrtogpt
Loic Dachary [Wed, 4 Dec 2013 19:42:30 +0000 (11:42 -0800)]
Merge pull request #892 from jpds/ceph-disk-journal-mbrtogpt

Call --mbrtogpt on journal run of sgdisk should the drive require a GPT ...

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #782 from danchai/master
Sage Weil [Wed, 4 Dec 2013 15:42:48 +0000 (07:42 -0800)]
Merge pull request #782 from danchai/master

ObjBencher: add rand_read_bench to support rand test in rados-bench

11 years agoCall --mbrtogpt on journal run of sgdisk should the drive require a GPT table. 892/head
Jonathan Davies [Tue, 3 Dec 2013 21:26:43 +0000 (21:26 +0000)]
Call --mbrtogpt on journal run of sgdisk should the drive require a GPT table.

Signed-off-by: Jonathan Davies <jonathan.davies@canonical.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoObjBencher: add rand_read_bench functions to support rand test in rados-bench 782/head
danchai [Tue, 29 Oct 2013 10:10:48 +0000 (18:10 +0800)]
ObjBencher: add rand_read_bench functions to support rand test in rados-bench

Signed-off-by: Tengwei Cai <tengweicai@gmail.com>
11 years agodoc/rados/operations/crush: fix more
Sage Weil [Wed, 4 Dec 2013 06:46:37 +0000 (22:46 -0800)]
doc/rados/operations/crush: fix more

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc/rados/operations/crush: fix rst
Sage Weil [Wed, 4 Dec 2013 06:18:41 +0000 (22:18 -0800)]
doc/rados/operations/crush: fix rst

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #893 from jdurgin/wip-init-highlander
Sage Weil [Wed, 4 Dec 2013 00:39:38 +0000 (16:39 -0800)]
Merge pull request #893 from jdurgin/wip-init-highlander

init, upstart: prevent daemons being started by both

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoinit, upstart: prevent daemons being started by both 893/head
Josh Durgin [Mon, 25 Nov 2013 21:43:43 +0000 (13:43 -0800)]
init, upstart: prevent daemons being started by both

There can be only one init system starting a daemon. If there is a
host entry in ceph.conf for a daemon, sysvinit would try to start it
even if the daemon's directory did not include a sysvinit file. This
preserves backwards compatibility with older installs using sysvinit,
but if an upstart file is present in the daemon's directory, upstart
will try to start them, regardless of host entries in ceph.conf.

If there's an upstart file in a daemon's directory and a host entry
for that daemon in ceph.conf, both sysvinit and upstart would attempt
to manage it.

Fix this by only starting daemons if the marker file for the other
init system is not present. This maintains backwards compatibility
with older installs using neither sysvinit nor upstart marker files,
and does not break any valid configurations. The only configuration
that would break is one with both sysvinit and upstart files present
for the same daemon.

Backport: emperor, dumpling
Reported-by: Tim Spriggs <tims@uahirise.org>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
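The rule in the last paragraph can be sketched in Python. The marker file names "sysvinit" and "upstart" inside the daemon's directory are assumed from the wording above, and the function is hypothetical; the real logic lives in the init script and the upstart jobs.

```python
import os


def should_start(daemon_dir, init_system):
    """Return True if init_system may start the daemon in daemon_dir.

    The daemon is started only when the marker file of the *other* init
    system is absent, so at most one init system manages it.
    """
    other = {"sysvinit": "upstart", "upstart": "sysvinit"}[init_system]
    return not os.path.exists(os.path.join(daemon_dir, other))
```

With no marker files at all, both checks pass, which is what keeps older installs working. With both markers present, neither init system starts the daemon.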
11 years agocrush/mapper: new SET_CHOOSE_LEAF_TRIES command
Sage Weil [Tue, 3 Dec 2013 16:16:41 +0000 (08:16 -0800)]
crush/mapper: new SET_CHOOSE_LEAF_TRIES command

Explicitly control the number of sample attempts, and allow the number of
tries in the recursive call to be explicitly controlled via the rule. This
is important because the amount of time we want to spend looking for a
solution may be rule dependent (e.g., higher for the wide indep pool than
the rep pools).

(We should do the same for the other tunables, by the way!)

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/mapper: pass parent r value for indep call
Sage Weil [Tue, 3 Dec 2013 01:17:13 +0000 (17:17 -0800)]
crush/mapper: pass parent r value for indep call

Pass down the parent's 'r' value so that we will sample different values in
the recursive call when the parent tries multiple times.  This avoids doing
useless work (calling multiple times and trying the same values).

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/mapper: clarify numrep vs endpos
Sage Weil [Tue, 3 Dec 2013 01:15:56 +0000 (17:15 -0800)]
crush/mapper: clarify numrep vs endpos

Pass numrep (the width of the result) separately from the number of results
we want *this* iteration.  This makes things less awkward when we do a
recursive call (for chooseleaf) and want only one item.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/osd_types: pg_pool_t: fix /// -> ///< comments
Sage Weil [Mon, 4 Nov 2013 11:24:49 +0000 (03:24 -0800)]
osd/osd_types: pg_pool_t: fix /// -> ///< comments

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomon, crush: add mode to "osd crush rule create-simple ..."
Sage Weil [Mon, 4 Nov 2013 11:12:45 +0000 (03:12 -0800)]
mon, crush: add mode to "osd crush rule create-simple ..."

Add a mode (firstn or indep) to the create-simple command.  Make it
optional and default to firstn (for compatibility and simplicity).

Note: a "default=..." option for mon commands would be easier.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/OSDMap: do not shift result when removing nonexistent osds
Sage Weil [Sat, 2 Nov 2013 23:07:05 +0000 (16:07 -0700)]
osd/OSDMap: do not shift result when removing nonexistent osds

If it is a replicated pool, remove and shift to the left.  For erasure
pools, replace nonexistent items with CRUSH_ITEM_NONE.

Signed-off-by: Sage Weil <sage@inktank.com>
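The difference between the two pool types can be sketched in Python. The function name, the exists callback and the numeric value of CRUSH_ITEM_NONE are assumptions for illustration.

```python
# Assumed value of the placeholder that marks an empty slot.
CRUSH_ITEM_NONE = 0x7FFFFFFF


def remove_nonexistent(raw, exists, replicated):
    """Drop OSDs that do not exist from a raw mapping.

    Replicated pools remove the entry and shift the rest to the left.
    Erasure pools keep every position and put CRUSH_ITEM_NONE in the hole,
    because the position of each OSD in the result is significant.
    """
    if replicated:
        return [osd for osd in raw if exists(osd)]
    return [osd if exists(osd) else CRUSH_ITEM_NONE for osd in raw]
```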
11 years agoosd, crush: add 'erasure' pool/pg type
Sage Weil [Sat, 2 Nov 2013 23:02:41 +0000 (16:02 -0700)]
osd, crush: add 'erasure' pool/pg type

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/mapper: strip firstn conditionals out of crush_choose, rename
Sage Weil [Sat, 2 Nov 2013 18:54:09 +0000 (11:54 -0700)]
crush/mapper: strip firstn conditionals out of crush_choose, rename

Now that indep is handled by crush_choose_indep, rename crush_choose to
crush_choose_firstn and remove all the conditionals.  This ends up
stripping out *lots* of code.

Note that it *also* makes it obvious that the shenanigans we were playing
with r' for uniform buckets were broken for firstn mode.  This appears to
have happened waaaay back in commit dae8bec9 (or earlier)... 2007.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: add a few unit tests for INDEP mode
Sage Weil [Sat, 2 Nov 2013 17:58:30 +0000 (10:58 -0700)]
crush: add a few unit tests for INDEP mode

Signed-off-by: Sage Weil <sage@inktank.com>