git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
11 years agomon/OSDMonitor: fix legacy tunables warning
Sage Weil [Wed, 12 Feb 2014 21:18:04 +0000 (13:18 -0800)]
mon/OSDMonitor: fix legacy tunables warning

Warn on legacy tunables, not on non-optimal tunables.  Optimal is a moving
target, but it is really the legacy defaults that we want to push people
off of.

Fixes: #7399
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1224 from kdreyer-inktank/packaging-libdir
Sage Weil [Wed, 12 Feb 2014 18:35:07 +0000 (10:35 -0800)]
Merge pull request #1224 from kdreyer-inktank/packaging-libdir

packaging: do not package libdir/ceph recursively

11 years agopackaging: do not package libdir/ceph recursively 1224/head
Alexandre Oliva [Wed, 12 Feb 2014 17:46:50 +0000 (15:46 -0200)]
packaging: do not package libdir/ceph recursively

Package libdir/ceph non-recursively, to avoid duplicates, and
package libdir/ceph/ceph_common.sh explicitly.

Signed-off-by: Alexandre Oliva <oliva@gnu.org>
Signed-off-by: Ken Dreyer <ken.dreyer@inktank.com>
11 years agoMerge pull request #1215 from ceph/wip-7385
Sage Weil [Wed, 12 Feb 2014 17:54:30 +0000 (09:54 -0800)]
Merge pull request #1215 from ceph/wip-7385

Remove the max cached objects restriction for librbd

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1199 from ceph/wip-doc-librados-intro
Loic Dachary [Wed, 12 Feb 2014 16:25:42 +0000 (17:25 +0100)]
Merge pull request #1199 from ceph/wip-doc-librados-intro

Wip doc librados intro

Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1221 from dachary/wip-filestore
Loic Dachary [Wed, 12 Feb 2014 11:52:02 +0000 (12:52 +0100)]
Merge pull request #1221 from dachary/wip-filestore

tests: fix packaging for s/filestore/objectstore/

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
11 years agotests: fix packaging for s/filestore/objectstore/ 1221/head
Loic Dachary [Wed, 12 Feb 2014 11:44:21 +0000 (12:44 +0100)]
tests: fix packaging for s/filestore/objectstore/

The binary file names have changed and need to be updated in the
packaging files for deb and rpm. Fix a few leftovers as well.

Fixing 1a588f18ba0e57df64f8a48c1393a4bc65019571

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1220 from dachary/wip-filestore
Kri5 [Wed, 12 Feb 2014 11:26:28 +0000 (12:26 +0100)]
Merge pull request #1220 from dachary/wip-filestore

tests: fix objectstore tests

11 years agotests: fix objectstore tests 1220/head
Loic Dachary [Wed, 12 Feb 2014 10:52:37 +0000 (11:52 +0100)]
tests: fix objectstore tests

The objectstore test from 1a588f18ba0e57df64f8a48c1393a4bc65019571 was
missing a few changes.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1212 from ywang19/master
Loic Dachary [Wed, 12 Feb 2014 08:06:31 +0000 (09:06 +0100)]
Merge pull request #1212 from ywang19/master

correct one command line at building packages section

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1168 from yuyuyu101/wip-refactor-objectstore-test
Sage Weil [Wed, 12 Feb 2014 05:17:05 +0000 (21:17 -0800)]
Merge pull request #1168 from yuyuyu101/wip-refactor-objectstore-test

Rename test/filestore to test/objectstore

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1218 from yuyuyu101/wip-misc-fix
Sage Weil [Wed, 12 Feb 2014 05:12:09 +0000 (21:12 -0800)]
Merge pull request #1218 from yuyuyu101/wip-misc-fix

Fix bad dealloctor

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoFix bad dealloctor 1218/head
Haomai Wang [Wed, 12 Feb 2014 04:04:30 +0000 (12:04 +0800)]
Fix bad dealloctor

Memory allocated by malloc() should be deallocated by free(), not 'delete'

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
11 years agocorrect one command line at building packages section 1212/head
ywang19 [Tue, 11 Feb 2014 05:14:32 +0000 (13:14 +0800)]
correct one command line at building packages section

Signed-off-by: Wang, Yaguang <yaguang.wang@intel.com>
11 years agoosdmaptool: fix cli test
Sage Weil [Wed, 12 Feb 2014 02:38:16 +0000 (18:38 -0800)]
osdmaptool: fix cli test

Encoding the extra tunable byte threw off the output here.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agotset_bufferlist: fix signed/unsigned comparison
Sage Weil [Tue, 11 Feb 2014 18:12:34 +0000 (10:12 -0800)]
tset_bufferlist: fix signed/unsigned comparison

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1185 from ceph/wip-crush
Loic Dachary [Tue, 11 Feb 2014 22:42:53 +0000 (23:42 +0100)]
Merge pull request #1185 from ceph/wip-crush

crush: "vary_r" tunable

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agodoc: Incorporated feed back from Loic and Dan. 1199/head
John Wilkins [Tue, 11 Feb 2014 21:28:33 +0000 (13:28 -0800)]
doc: Incorporated feed back from Loic and Dan.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agodoc: Adds additional terms for use with librados.
John Wilkins [Tue, 11 Feb 2014 21:28:04 +0000 (13:28 -0800)]
doc: Adds additional terms for use with librados.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agoObjectCacher: remove unused target/max setters 1215/head
Josh Durgin [Tue, 11 Feb 2014 19:47:48 +0000 (11:47 -0800)]
ObjectCacher: remove unused target/max setters

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agolibrbd: remove limit on number of objects in the cache
Josh Durgin [Tue, 11 Feb 2014 18:14:36 +0000 (10:14 -0800)]
librbd: remove limit on number of objects in the cache

The number of objects is not a significant indicator of when data
should be written out for rbd. Use the highest possible value for
number of objects and just rely on the dirty data limits to trigger
flushing. When the number of objects is low, and many start being
flushed before they accumulate many requests, it hurts average request
size and performance for many concurrent sequential writes.

Fixes: #7385
Backport: emperor, dumpling
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoObjectCacher: use uint64_t for target and max values
Josh Durgin [Tue, 11 Feb 2014 19:53:00 +0000 (11:53 -0800)]
ObjectCacher: use uint64_t for target and max values

All the options are uint64_t, but the ObjectCacher was converting them
to int64_t. There's never any reason for these to be negative, so
change the type.

Adjust a few conditionals so that they only convert known-positive
signed values to uint64_t before comparing with the target and max
values. Leave the actual stats accounting as loff_t for now, since
bugs in accounting will have bad effects if negative values wrap
around.
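
The wrap-around concern above can be illustrated with a minimal sketch.
It is not the ObjectCacher code: Python integers do not overflow, so the
C-style cast is simulated with a 64-bit mask, and all function names
here are hypothetical.

```python
# Simulate a C cast from int64_t to uint64_t with a 64-bit mask.
U64_MASK = (1 << 64) - 1


def as_uint64(value):
    """Reinterpret a signed 64-bit value as unsigned, as a C cast would."""
    return value & U64_MASK


def over_limit_naive(stat, max_value):
    # Converts unconditionally: a negative stat wraps to a huge value.
    return as_uint64(stat) > max_value


def over_limit_checked(stat, max_value):
    # Converts only known-positive values, as the commit message describes.
    return stat > 0 and as_uint64(stat) > max_value
```

With a stat that went negative through an accounting bug, the naive
comparison reports the limit as exceeded while the checked one does not.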

Backport: emperor, dumpling
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoObjectCacher: remove max_bytes and max_ob arguments to trim()
Josh Durgin [Tue, 11 Feb 2014 18:35:14 +0000 (10:35 -0800)]
ObjectCacher: remove max_bytes and max_ob arguments to trim()

These are never passed, so replace them with the defaults.

Backport: emperor, dumpling
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agomon: allow firefly crush tunables to be selected 1185/head
Sage Weil [Tue, 11 Feb 2014 16:47:15 +0000 (08:47 -0800)]
mon: allow firefly crush tunables to be selected

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc/rados/operations/crush: describe new vary_r tunable
Sage Weil [Tue, 11 Feb 2014 16:45:18 +0000 (08:45 -0800)]
doc/rados/operations/crush: describe new vary_r tunable

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: add firefly tunables baseline test
Sage Weil [Wed, 5 Feb 2014 00:05:20 +0000 (16:05 -0800)]
crush: add firefly tunables baseline test

This is a user's map that gives different results when the vary_r tunable
is adjusted.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrushtool: new cli tests for the vary-r tunable
Sage Weil [Tue, 4 Feb 2014 23:59:22 +0000 (15:59 -0800)]
crushtool: new cli tests for the vary-r tunable

These illustrate the variation in mapping results as the vary_r tunable
is adjusted.  Note:

1- For the vary_r=0 case, we have several inputs that map to only a single
output:

      rule 3 (delltestrule) num_rep 4 result size == 1:\t27/1024 (esc)
      rule 3 (delltestrule) num_rep 4 result size == 2:\t997/1024 (esc)

This is the behavior we are fixing.  For all of the other values of
vary_r, we get 2 outputs for all inputs.

2- If we use vary_r 1, which is likely the most efficient computation,
we get lots of inputs that change.  By setting larger values of vary_r,
we can trade a bit of extra computation to get a mapping that is more
similar to the legacy behavior. This is useful for legacy clusters:

    $ for f in `seq 1 4` ; do diff -u test-map-vary-r-0.t test-map-vary-r-$f.t | grep -c -- +  ; done
    3030
    1629
    645
    228

The crushmap here comes from a user who was seeing a bad mapping for certain
pgs after some OSDs were reweighted by utilization.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: add infrastructure around SET_CHOOSELEAF_VARY_R rule step/command
Sage Weil [Tue, 4 Feb 2014 23:33:08 +0000 (15:33 -0800)]
crush: add infrastructure around SET_CHOOSELEAF_VARY_R rule step/command

This will let you vary the vary_r tunable on a per-rule basis.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: add SET_CHOOSELEAF_VARY_R step
Sage Weil [Tue, 4 Feb 2014 21:40:49 +0000 (13:40 -0800)]
crush: add SET_CHOOSELEAF_VARY_R step

This lets you adjust the vary_r tunable on a per-rule basis.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: add infrastructure around new chooseleaf_vary_r tunable
Sage Weil [Tue, 4 Feb 2014 23:31:40 +0000 (15:31 -0800)]
crush: add infrastructure around new chooseleaf_vary_r tunable

- encoding
- feature bit
- decompile/compile

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1207 from dachary/wip-7378
Sage Weil [Tue, 11 Feb 2014 16:30:31 +0000 (08:30 -0800)]
Merge pull request #1207 from dachary/wip-7378

common: admin socket fallback to json-pretty format

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1198 from dachary/wip-mailmap
Loic Dachary [Tue, 11 Feb 2014 15:14:49 +0000 (16:14 +0100)]
Merge pull request #1198 from dachary/wip-mailmap

mailmap updates

Reviewed-By: Christophe Courtaut <christophe.courtaut@gmail.com>
11 years agomailmap: Derek Yarnell is with University of Mississippi 1198/head
Loic Dachary [Fri, 7 Feb 2014 16:44:05 +0000 (17:44 +0100)]
mailmap: Derek Yarnell is with University of Mississippi

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomailmap: Dmitry Smirnov is with Debian GNU/Linux
Loic Dachary [Fri, 7 Feb 2014 16:43:02 +0000 (17:43 +0100)]
mailmap: Dmitry Smirnov is with Debian GNU/Linux

Reviewed-by: Dmitry Smirnov <onlyjob@member.fsf.org>
Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomailmap: Eric Mourgaya is with Credit Mutuel Arkea
Loic Dachary [Mon, 10 Feb 2014 13:46:03 +0000 (14:46 +0100)]
mailmap: Eric Mourgaya is with Credit Mutuel Arkea

and name normalization

Reviewed-by: Eric Mourgaya <eric.mourgaya@arkea.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocommon: admin socket fallback to json-pretty format 1207/head
Loic Dachary [Mon, 10 Feb 2014 22:42:38 +0000 (23:42 +0100)]
common: admin socket fallback to json-pretty format

If the format argument to a command sent to the admin socket is not
among the supported formats ( json, json-pretty, xml, xml-pretty ) the
new_formatter function will return null and the AdminSocketHook::call
function must fall back to a sensible default.

The CephContextHook::call and HelpHook::call failed to do that and a
malformed format argument would cause the mon to crash. A check is added
to each of them so that they fall back to json-pretty if the format is
not recognized.

To further protect AdminSocketHook::call implementations from similar
problems the format argument is checked immediately after accepting the
command in AdminSocket::do_accept and replaced with json-pretty if it is
not known.

A test case is added for both CephContextHook::call and HelpHook::call
to demonstrate the problem exists and is fixed by the patch.

Three other instances of unsafe calls to new_formatter were found and
a fallback to json-pretty was added. All other calls have been audited
and appear to be safe.
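
The fallback pattern described above can be sketched as follows. This is
an illustration only, not the Ceph implementation: new_formatter here is
a Python stand-in that returns the format name instead of a formatter
object, and formatter_or_default is a hypothetical helper name.

```python
# Formats listed in the commit message as supported.
SUPPORTED_FORMATS = ("json", "json-pretty", "xml", "xml-pretty")


def new_formatter(fmt):
    """Stand-in factory: returns None for an unknown format."""
    return fmt if fmt in SUPPORTED_FORMATS else None


def formatter_or_default(fmt, default="json-pretty"):
    """Never hand a null formatter to the caller."""
    formatter = new_formatter(fmt)
    if formatter is None:
        # Unknown or malformed format: fall back to a sensible default
        # instead of dereferencing a null formatter later on.
        formatter = new_formatter(default)
    return formatter
```

Doing the check once at the point where the command is accepted means
individual hooks no longer need to guard against a null formatter.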

http://tracker.ceph.com/issues/7378 fixes #7378

Backport: emperor, dumpling
Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1209 from fghaas/master
John Wilkins [Mon, 10 Feb 2014 23:52:31 +0000 (15:52 -0800)]
Merge pull request #1209 from fghaas/master

doc: highlight that "raw" is the only useful RBD format for QEMU

11 years agodoc: highlight that "raw" is the only useful RBD format for QEMU 1209/head
Florian Haas [Mon, 10 Feb 2014 23:04:06 +0000 (00:04 +0100)]
doc: highlight that "raw" is the only useful RBD format for QEMU

Explain why people should be using the "raw" image format for RBD
volumes created for use by QEMU: using any other format adds only
overhead, but no extra value (since RBDs are also CoW and
thin-provisioned), plus the Qcow2 storage driver is not migration safe
when caching is enabled, whereas the RBD driver is.

Also, fix a minor glitch in the example qemu-img commands ("-f rbd"
and "-O rbd" should really be "-f raw" and "-O raw").

Finally, drop the "-f" option altogether on qemu-img commands where it
makes no sense (info and resize).

Signed-off-by: Florian Haas <florian@hastexo.com>
11 years agoMerge branch wip-librados-timeout
Josh Durgin [Mon, 10 Feb 2014 21:49:23 +0000 (13:49 -0800)]
Merge branch wip-librados-timeout

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1205 from ceph/wip-7334
Josh Durgin [Mon, 10 Feb 2014 21:09:40 +0000 (13:09 -0800)]
Merge pull request #1205 from ceph/wip-7334

use `partx` for CentOS/RHEL instead of `partprobe`
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoMerge pull request #1204 from ceph/wip-fsetpipesz-fix
Josh Durgin [Mon, 10 Feb 2014 20:59:03 +0000 (12:59 -0800)]
Merge pull request #1204 from ceph/wip-fsetpipesz-fix

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoqa: add script for testing rados client timeout options
Josh Durgin [Thu, 6 Feb 2014 01:26:02 +0000 (17:26 -0800)]
qa: add script for testing rados client timeout options

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agorados: check return values for commands that can now fail
Josh Durgin [Thu, 6 Feb 2014 01:25:24 +0000 (17:25 -0800)]
rados: check return values for commands that can now fail

A few places were not checking the return values of commands, since
they could not fail before timeouts were added.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agolibrados: check and return on error so timeouts work
Josh Durgin [Thu, 6 Feb 2014 01:24:16 +0000 (17:24 -0800)]
librados: check and return on error so timeouts work

Some functions could not previously return errors, but they had an
int return value, which can now receive ETIMEDOUT.
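
A minimal sketch of the difference between ignoring and propagating such
a return value. The function names are hypothetical and only mimic the
negative-errno convention; they are not librados calls.

```python
import errno


def wait_for_reply(timed_out):
    """Stand-in for an operation that can now fail with a timeout."""
    return -errno.ETIMEDOUT if timed_out else 0


def command_unchecked(timed_out):
    # Old pattern: the call could not fail, so its result was dropped.
    wait_for_reply(timed_out)
    return 0


def command_checked(timed_out):
    # New pattern: check the result and return on error.
    ret = wait_for_reply(timed_out)
    if ret < 0:
        return ret
    return 0
```

The unchecked variant reports success even when the operation timed out,
which is the behavior this series of commits removes.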

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agomsg/Pipe: add option to restrict delay injection to specific msg type
Josh Durgin [Thu, 6 Feb 2014 01:22:14 +0000 (17:22 -0800)]
msg/Pipe: add option to restrict delay injection to specific msg type

This makes it possible to test timeouts reliably by delaying certain
messages effectively forever, but still being able to e.g. connect and
authenticate to the monitors.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoMonClient: add a timeout on commands for librados
Josh Durgin [Tue, 4 Feb 2014 02:30:00 +0000 (18:30 -0800)]
MonClient: add a timeout on commands for librados

Just use the conf option directly, since librados is the only caller.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoObjecter: implement mon and osd operation timeouts
Josh Durgin [Tue, 4 Feb 2014 01:59:21 +0000 (17:59 -0800)]
Objecter: implement mon and osd operation timeouts

This captures almost all operations from librados other than mon_commands().

Get the values for the timeouts from the Objecter constructor, so only
librados uses them.

Add C_Cancel_*_Op, finish_*_op(), and *_op_cancel() for each type of
operation, to mirror those for Op. Create a callback and schedule it
in the existing timer thread if the timeouts are specified.
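
The callback-on-timer idea can be sketched roughly as below. This is an
assumption-laden illustration, not the Objecter code: it uses a separate
threading.Timer per operation rather than a shared timer thread, and the
class and method names are hypothetical.

```python
import errno
import threading


class Op:
    """An in-flight operation that is cancelled if it outlives its timeout."""

    def __init__(self, timeout=None):
        self.done = threading.Event()
        self.result = None
        self._lock = threading.Lock()
        self._timer = None
        if timeout:
            # Only schedule the cancel callback when a timeout is specified.
            self._timer = threading.Timer(
                timeout, self.cancel, args=(-errno.ETIMEDOUT,))
            self._timer.daemon = True
            self._timer.start()

    def _complete(self, result):
        # The first completion wins; later calls are ignored.
        with self._lock:
            if self.done.is_set():
                return False
            self.result = result
            self.done.set()
            return True

    def finish(self, result=0):
        """Normal completion: stop the pending timeout callback."""
        if self._timer is not None:
            self._timer.cancel()
        return self._complete(result)

    def cancel(self, result):
        """Timeout (or explicit cancel) path."""
        return self._complete(result)
```

An operation that never receives a reply ends with -ETIMEDOUT, while one
that finishes first keeps its own result and ignores the late cancel.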

Fixes: #6507
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoalert the user about error messages from partx 1205/head
Alfredo Deza [Mon, 10 Feb 2014 20:07:55 +0000 (15:07 -0500)]
alert the user about error messages from partx

Signed-off-by: Alfredo Deza <alfredo@deza.pe>
11 years agouse partx for red hat or centos instead of partprobe
Alfredo Deza [Fri, 7 Feb 2014 16:55:01 +0000 (11:55 -0500)]
use partx for red hat or centos instead of partprobe

Signed-off-by: Alfredo Deza <alfredo@deza.pe>
11 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Mon, 10 Feb 2014 18:19:55 +0000 (10:19 -0800)]
Merge remote-tracking branch 'gh/next'

11 years agocommon/buffer: fix build breakage for CEPH_HAVE_SETPIPE_SZ 1204/head
Ilya Dryomov [Mon, 10 Feb 2014 17:34:44 +0000 (19:34 +0200)]
common/buffer: fix build breakage for CEPH_HAVE_SETPIPE_SZ

common/buffer.cc fails to build if CEPH_HAVE_SETPIPE_SZ is defined.
Fix it.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agoconfigure: fix F_SETPIPE_SZ detection
Ilya Dryomov [Mon, 10 Feb 2014 17:34:44 +0000 (19:34 +0200)]
configure: fix F_SETPIPE_SZ detection

Currently CEPH_HAVE_SETPIPE_SZ is not set even if F_SETPIPE_SZ is
available, because AC_COMPILE_IFELSE test program as written always
fails to compile.  F_SETPIPE_SZ is a macro, so use AC_EGREP_CPP which
works on the preprocessor output instead of trying to compile.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agoconfigure: don't check for arpa/nameser_compat.h twice
Ilya Dryomov [Mon, 10 Feb 2014 17:34:44 +0000 (19:34 +0200)]
configure: don't check for arpa/nameser_compat.h twice

Nuke redundant check and move the real one into the common
AC_CHECK_HEADERS stanza.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agomailmap: Moritz Möller is with Bigpoint.com
Loic Dachary [Fri, 7 Feb 2014 16:40:13 +0000 (17:40 +0100)]
mailmap: Moritz Möller is with Bigpoint.com

Reviewed-by: Moritz Möller <mm@mxs.de>
Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge remote-tracking branch 'gh/wip-7329' into next
Sage Weil [Sun, 9 Feb 2014 18:37:47 +0000 (10:37 -0800)]
Merge remote-tracking branch 'gh/wip-7329' into next

11 years agoceph_test_rados_api_tier: try harder to trigger the flush vs try-flush race
Sage Weil [Sun, 9 Feb 2014 04:20:21 +0000 (20:20 -0800)]
ceph_test_rados_api_tier: try harder to trigger the flush vs try-flush race

It seems to be reasonably easy to complete a flush before the next client
request is processed.  Crazy...

Same with the flush vs write race.

Fixes: #7329
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1201 from ceph/wip-7370
Loic Dachary [Sun, 9 Feb 2014 00:12:46 +0000 (01:12 +0100)]
Merge pull request #1201 from ceph/wip-7370

crush: fix tries/retries bug that was recently introduced

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1115 from jcsp/tell_cleanup
Loic Dachary [Sat, 8 Feb 2014 23:58:39 +0000 (00:58 +0100)]
Merge pull request #1115 from jcsp/tell_cleanup

Remove some almost-duplicate COMMAND definitions

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1127 from dmsimard/log_links
Loic Dachary [Sat, 8 Feb 2014 23:33:42 +0000 (00:33 +0100)]
Merge pull request #1127 from dmsimard/log_links

Doc: Fix 404 broken links to logging and debug configuration

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agocrush: add chooseleaf_vary_r tunable
Sage Weil [Tue, 4 Feb 2014 21:38:29 +0000 (13:38 -0800)]
crush: add chooseleaf_vary_r tunable

The current crush_choose_firstn code will re-use the same 'r' value for
the recursive call.  That means that if we are hitting a collision or
rejection for some reason (say, an OSD that is marked out) and need to
retry, we will keep making the same (bad) choice in that recursive
selection.

Introduce a tunable that fixes that behavior by incorporating the parent
'r' value into the recursive starting point, so that a different path
will be taken in subsequent placement attempts.

Note that this was done from the get-go for the new crush_choose_indep
algorithm.

This was exposed by a user who was seeing PGs stuck in active+remapped
after reweight-by-utilization because the up set mapped to a single OSD.
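
A toy model of the problem, under loud assumptions: this is not the
CRUSH hash or the crush_choose_firstn code. The leaf choice is a plain
modulo over a single host's OSD list, and the derivation of the
recursive starting point (sub_r = r >> (vary_r - 1), or 0 for the legacy
behavior) is an assumption made for illustration.

```python
def choose_leaf(osds, x, sub_r):
    """Toy leaf selection: deterministic in the input x and sub_r."""
    return osds[(x + sub_r) % len(osds)]


def place(osds, out, x, vary_r, max_tries=5):
    """Retry placement, optionally varying the recursive starting point."""
    for r in range(max_tries):
        # Legacy (vary_r == 0): the recursive call always starts from the
        # same value, so every retry makes the same choice.
        sub_r = (r >> (vary_r - 1)) if vary_r else 0
        leaf = choose_leaf(osds, x, sub_r)
        if leaf not in out:
            return leaf
    return None  # every attempt hit a rejected OSD
```

For osds = [0, 1, 2, 3] with OSD 2 marked out and x = 6, the legacy
setting keeps selecting OSD 2 and never finds a mapping, while vary_r = 1
moves on to OSD 3 on the first retry. Larger vary_r values change the
starting point less often, which is consistent with the note elsewhere in
this log that they stay closer to the legacy mapping.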

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: allow crush rules to set (re)tries counts to 0 1201/head
Sage Weil [Sat, 8 Feb 2014 20:23:05 +0000 (12:23 -0800)]
crush: allow crush rules to set (re)tries counts to 0

These two fields are misnomers; they are *retry* counts.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: fix off-by-one errors in total_tries refactor
Sage Weil [Tue, 4 Feb 2014 20:14:14 +0000 (12:14 -0800)]
crush: fix off-by-one errors in total_tries refactor

Back in 27f4d1f6bc32c2ed7b2c5080cbd58b14df622607 we refactored the CRUSH
code to allow adjustment of the retry counts on a per-pool basis.  That
commit had an off-by-one bug: the previous "tries" counter was a *retry*
count, not a *try* count, but the new code was passing in 1 meaning
there should be no retries.

Fix the ftotal vs tries comparison to use < instead of <= to fix the
problem.  Note that the original code used <= here, which means the
global "choose_total_tries" tunable is actually counting retries.
Compensate for that by adding 1 in crush_do_rule when we pull the tunable
into the local variable.
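
The off-by-one can be reproduced with a small counting sketch. It is a
simplified model of the retry loop in which every attempt fails, not the
CRUSH code itself, and the function name is hypothetical.

```python
def attempts(limit, inclusive):
    """Count attempts made before giving up when every attempt fails."""
    ftotal = 0
    count = 0
    while True:
        count += 1   # one placement attempt, which fails in this sketch
        ftotal += 1
        retry = ftotal <= limit if inclusive else ftotal < limit
        if not retry:
            return count
```

With the original `<=` comparison, a limit of 1 yields two attempts, so
the value behaves as a retry count. With `<`, a limit of 1 yields a
single attempt. Adding 1 to the global tunable before comparing with `<`
keeps its old meaning: attempts(50, True) and attempts(50 + 1, False)
both make 51 attempts.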

This was noticed looking at output from a user provided osdmap.
Unfortunately the map doesn't illustrate the change in mapping behavior
and I haven't managed to construct one yet that does.  Inspection of the
crush debug output now aligns with prior versions, though.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrushtool: add cli test for off-by-one tries vs retries bug
Sage Weil [Sat, 8 Feb 2014 20:21:26 +0000 (12:21 -0800)]
crushtool: add cli test for off-by-one tries vs retries bug

See bug #7370.  This passes on dumpling and breaks prior to the #7370 fix.

Backport: emperor, dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Sat, 8 Feb 2014 16:23:12 +0000 (08:23 -0800)]
Merge remote-tracking branch 'gh/next'

11 years agoqa/workunits/rest: use larger max_file_size
Sage Weil [Sat, 8 Feb 2014 16:22:29 +0000 (08:22 -0800)]
qa/workunits/rest: use larger max_file_size

64k is the min.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomailmap: Somnath Roy is with SanDisk
Loic Dachary [Fri, 7 Feb 2014 16:36:13 +0000 (17:36 +0100)]
mailmap: Somnath Roy is with SanDisk

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomailmap: Yan, Zheng name normalization
Loic Dachary [Fri, 7 Feb 2014 16:33:39 +0000 (17:33 +0100)]
mailmap: Yan, Zheng name normalization

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomailmap: Ray Lv is with Yahoo!
Loic Dachary [Fri, 7 Feb 2014 16:30:59 +0000 (17:30 +0100)]
mailmap: Ray Lv is with Yahoo!

Reviewed-by: Ray Lv <xiangyulv@gmail.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoRename test/filestore to test/objectstore 1168/head
Haomai Wang [Sat, 8 Feb 2014 07:41:52 +0000 (15:41 +0800)]
Rename test/filestore to test/objectstore

ObjectStore now supports three backend types, so the backends need to
share unit tests to avoid duplicated code.

This patch mainly makes workload_generator work with objectstore.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
11 years agoscript to test rgw multi part uploads using s3 interface
tamil [Sat, 8 Feb 2014 06:15:11 +0000 (22:15 -0800)]
script to test rgw multi part uploads using s3 interface

Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
(cherry picked from commit 5d59dd9cd67834d991b038323bcbc3e8f8612229)

11 years agoscript to test rgw multi part uploads using s3 interface
tamil [Sat, 8 Feb 2014 06:15:11 +0000 (22:15 -0800)]
script to test rgw multi part uploads using s3 interface

Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
11 years agoMerge branch 'next' of github.com:ceph/ceph into next
tamil [Sat, 8 Feb 2014 01:10:10 +0000 (17:10 -0800)]
Merge branch 'next' of github.com:ceph/ceph into next

11 years agoadded script to test rgw user quota
tamil [Sat, 8 Feb 2014 01:09:30 +0000 (17:09 -0800)]
added script to test rgw user quota

Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
11 years agodoc: Added Python doc.
John Wilkins [Fri, 7 Feb 2014 23:49:00 +0000 (15:49 -0800)]
doc: Added Python doc.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agodoc: Added inline literal tag.
John Wilkins [Fri, 7 Feb 2014 23:48:45 +0000 (15:48 -0800)]
doc: Added inline literal tag.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agodoc: Adds Python to index and sets maxdepth to 2.
John Wilkins [Fri, 7 Feb 2014 23:47:40 +0000 (15:47 -0800)]
doc: Adds Python to index and sets maxdepth to 2.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agoscript to test rgw user quota functionality
tamil [Fri, 7 Feb 2014 23:34:05 +0000 (15:34 -0800)]
script to test rgw user quota functionality

Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
11 years agoMerge pull request #1197 from ceph/wip-osdmap-primary
Gregory Farnum [Fri, 7 Feb 2014 18:21:40 +0000 (10:21 -0800)]
Merge pull request #1197 from ceph/wip-osdmap-primary

osd/OSDMap: populate *primary when pool dne

Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agoosd/OSDMap: populate *primary when pool dne 1197/head
Sage Weil [Fri, 7 Feb 2014 17:38:37 +0000 (09:38 -0800)]
osd/OSDMap: populate *primary when pool dne

This fixes a valgrind error from OSD::handle_osd_map where primary is not
initialized and is compared after the call to pg_to_acting_osds().

We are still not distinguishing "no mapping" from "pool doesn't exist,
no mapping".  That is a somewhat larger change, though.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1173 from ceph/wip-7109
Gregory Farnum [Fri, 7 Feb 2014 17:26:38 +0000 (09:26 -0800)]
Merge pull request #1173 from ceph/wip-7109

Fix #7109: Prevent removal of default data pool

Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agorgw: initialize variable before call
Yehuda Sadeh [Wed, 5 Feb 2014 23:19:51 +0000 (15:19 -0800)]
rgw: initialize variable before call

Need to initialize the truncated variable, as we sometimes ignore an error
response (e.g., with ENOENT), and in such cases we can't expect it to be
set.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit 9ecf3467a3efc071954c4f69a23b51e677742c1f)

11 years agoMerge pull request #1191 from ceph/wip-rgw-vg
Sage Weil [Fri, 7 Feb 2014 16:43:21 +0000 (08:43 -0800)]
Merge pull request #1191 from ceph/wip-rgw-vg

rgw: initialize variable before call

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1194 from dachary/wip-erasure-code-directory
Loic Dachary [Fri, 7 Feb 2014 16:10:29 +0000 (17:10 +0100)]
Merge pull request #1194 from dachary/wip-erasure-code-directory

erasure code directory

Reviewed-By: Christophe Courtaut <christophe.courtaut@gmail.com>
11 years agoMerge pull request #1178 from ceph/wip-osdmaptool
Sage Weil [Fri, 7 Feb 2014 15:12:11 +0000 (07:12 -0800)]
Merge pull request #1178 from ceph/wip-osdmaptool

osdmaptool: add --test-map-pgs mode

Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoosdmaptool: tests for --test-map-pgs 1178/head
Loic Dachary [Fri, 7 Feb 2014 11:13:27 +0000 (12:13 +0100)]
osdmaptool: tests for --test-map-pgs

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosdmaptool: test --import/export-crush
Loic Dachary [Fri, 7 Feb 2014 11:10:47 +0000 (12:10 +0100)]
osdmaptool: test --import/export-crush

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosdmaptool: s/simple.t/missing-argument.t/
Loic Dachary [Fri, 7 Feb 2014 08:36:07 +0000 (09:36 +0100)]
osdmaptool: s/simple.t/missing-argument.t/

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosdmaptool: fix cli tests
Sage Weil [Tue, 4 Feb 2014 01:47:18 +0000 (17:47 -0800)]
osdmaptool: fix cli tests

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdmaptool: allow a completely random placement
Sage Weil [Tue, 4 Feb 2014 00:32:48 +0000 (16:32 -0800)]
osdmaptool: allow a completely random placement

This is useful for comparison purposes and sanity-checking the results.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdmaptool: add --test-map-pgs mode
Sage Weil [Tue, 4 Feb 2014 00:19:07 +0000 (16:19 -0800)]
osdmaptool: add --test-map-pgs mode

This command will map all pgs from all pools (or just one pool) to osds
and summarize the placement and calculate the actual standard deviation and
the expected value.
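
The kind of summary described above can be sketched as follows. This is
not the osdmaptool implementation: the function name is hypothetical, and
the "expected" figure is assumed here to be the standard deviation of a
uniformly random (binomial) placement, which the commit message does not
spell out.

```python
import math


def summarize(pg_counts):
    """Summarize per-OSD PG counts: average, actual and expected deviation."""
    n = len(pg_counts)
    total = sum(pg_counts)
    avg = total / n
    # Population standard deviation of the observed per-OSD counts.
    stddev = math.sqrt(sum((c - avg) ** 2 for c in pg_counts) / n)
    # Assumed baseline: each PG lands on a given OSD with probability 1/n.
    p = 1.0 / n
    expected = math.sqrt(total * p * (1.0 - p))
    return avg, stddev, expected
```

For per-OSD counts of 10, 12, 8 and 10 the average is 10 and the actual
deviation is the square root of 2, below the random baseline of the
square root of 7.5, which would suggest a placement more even than chance.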

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agorest/test.py: use larger max_file_size for mds set test
Sage Weil [Fri, 7 Feb 2014 14:11:57 +0000 (06:11 -0800)]
rest/test.py: use larger max_file_size for mds set test

Current min is 64k.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoRevert test case of "mon: OSDMonitor: do not allow changing an erasure-coded pool...
David Zafman [Fri, 7 Feb 2014 03:13:20 +0000 (19:13 -0800)]
Revert test case of "mon: OSDMonitor: do not allow changing an erasure-coded pool's size"

This reverts part of commit c8c4cc6e81816069886af6bff968712993554759.

Fixes: #7355
Signed-off-by: David Zafman <david.zafman@inktank.com>
11 years agoSome suggested changes, both errors and rewordings
Dan Mick [Fri, 7 Feb 2014 00:54:47 +0000 (16:54 -0800)]
Some suggested changes, both errors and rewordings

Python and C code examples tweaked a bit

Signed-off-by: Dan Mick <dan.mick@inktank.com>
11 years agoerasure-code: move test files to a dedicated directory 1194/head
Loic Dachary [Thu, 6 Feb 2014 10:28:21 +0000 (11:28 +0100)]
erasure-code: move test files to a dedicated directory

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoerasure-code: move source files to a dedicated directory
Loic Dachary [Thu, 6 Feb 2014 10:00:33 +0000 (11:00 +0100)]
erasure-code: move source files to a dedicated directory

The src/erasure-code directory contains the erasure-code plugin system
and the jerasure plugin. It is moved out of OSD because it now belongs
to a convenience library ( LIBERASURE_CODE ) which is used both by OSDs
and MONs.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1188 from dachary/wip-7339
Loic Dachary [Thu, 6 Feb 2014 00:38:42 +0000 (01:38 +0100)]
Merge pull request #1188 from dachary/wip-7339

DNM: define pg_pool_t::stripe_width and set from erasure-code plugin on erasure pool creation

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agorgw: initialize variable before call 1191/head
Yehuda Sadeh [Wed, 5 Feb 2014 23:19:51 +0000 (15:19 -0800)]
rgw: initialize variable before call

Need to initialize the truncated variable, as we sometimes ignore an error
response (e.g., with ENOENT), and in such cases we can't expect it to be
set.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoMerge pull request #1190 from ceph/wip-snaptest-next
Sage Weil [Wed, 5 Feb 2014 21:04:44 +0000 (13:04 -0800)]
Merge pull request #1190 from ceph/wip-snaptest-next

qa/workunits/snaps: New allow_new_snaps syntax

11 years agoqa/workunits/snaps: New allow_new_snaps syntax 1190/head
John Spray [Wed, 5 Feb 2014 18:44:40 +0000 (18:44 +0000)]
qa/workunits/snaps: New allow_new_snaps syntax

These were probably just obscuring other failures.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: John Spray <john.spray@inktank.com>
11 years agomon: test osd pool create pg_pool_t::stripe_width behavior 1188/head
Loic Dachary [Wed, 5 Feb 2014 19:44:32 +0000 (20:44 +0100)]
mon: test osd pool create pg_pool_t::stripe_width behavior

* Check that the default from the configuration options is found in the
  output of osd dump
* Check that specifying an undersized osd_pool_erasure_code_stripe_width
  value is taken into account and padded.

Signed-off-by: Loic Dachary <loic@dachary.org>