git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
11 years agoceph_argparse: kill _daemon versions of argparse calls 996/head
Ilya Dryomov [Wed, 25 Dec 2013 19:41:16 +0000 (21:41 +0200)]
ceph_argparse: kill _daemon versions of argparse calls

Commit c76bbc2e6df1, which introduced _daemon versions of some of the
argparse calls, also changed the behaviour of non-_daemon versions.
The change resulted in incorrect error messages, e.g.

  $ ./rbd create b0 --size
  rbd: extraneous parameter --size

instead of what should have been

  $ ./rbd create b0 --size
  Option --size requires an argument.

The users of _daemon versions were added in commit be801f6c506d and
removed in commit f26bd55e57f1, so just kill the _daemon versions and
restore the old behaviour.  (This effectively reverts commit
c76bbc2e6df1.)
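
For illustration only, a minimal sketch of the restored behaviour. The
helper parse_option_value is hypothetical and is not the ceph_argparse
API; it only shows the idea that a flag which takes a value and appears
as the last token should be reported as missing its argument rather than
being passed on as an extraneous parameter.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the real ceph_argparse API.
// Looks for `flag` in args and, if found, stores the next token in *value.
// Returns an empty string on success (or when the flag is absent) and an
// error message when the flag is present without a value.
std::string parse_option_value(const std::vector<std::string>& args,
                               const std::string& flag,
                               std::string* value) {
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (args[i] != flag)
      continue;
    if (i + 1 >= args.size())
      return "Option " + flag + " requires an argument.";
    *value = args[i + 1];
    return "";
  }
  return "";
}
```

Called with the tokens create, b0, --size, the sketch returns the
"requires an argument" message instead of leaving --size for the caller
to reject as extraneous.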

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agoMerge pull request #988 from ceph/wip-crush-location
Loic Dachary [Wed, 25 Dec 2013 09:07:04 +0000 (01:07 -0800)]
Merge pull request #988 from ceph/wip-crush-location

add 'crush location' config option

make check is ok

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #993 from ceph/wip-librados-lock
Sage Weil [Tue, 24 Dec 2013 18:51:01 +0000 (10:51 -0800)]
Merge pull request #993 from ceph/wip-librados-lock

Wip librados lock

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agolibrados: lockless get_instance_id() 993/head
Yehuda Sadeh [Thu, 5 Dec 2013 07:33:42 +0000 (23:33 -0800)]
librados: lockless get_instance_id()

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoobjecter, librados: create Objecter::Op in two phases
Yehuda Sadeh [Sat, 23 Nov 2013 01:21:57 +0000 (17:21 -0800)]
objecter, librados: create Objecter::Op in two phases

(currently only in some librados operations)
First create the op, only then lock and submit so that we reduce lock
contention.
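
A rough sketch of the two-phase idea, assuming hypothetical stand-ins
(Op, MiniObjecter, prepare_op, submit_op) rather than the real Objecter
classes: the op is fully built without the lock, and the lock is held
only for the short queueing step.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Objecter::Op.
struct Op {
  std::string oid;
  std::string payload;
};

// Hypothetical stand-in for Objecter.
class MiniObjecter {
 public:
  // Phase 1: build the op. No shared state is touched, so no lock is taken.
  static std::unique_ptr<Op> prepare_op(const std::string& oid,
                                        const std::string& payload) {
    std::unique_ptr<Op> op(new Op);
    op->oid = oid;
    op->payload = payload;
    return op;
  }

  // Phase 2: hold the lock only while the op is queued.
  void submit_op(std::unique_ptr<Op> op) {
    std::lock_guard<std::mutex> guard(lock_);
    queue_.push_back(std::move(op));
  }

  // Number of ops queued so far.
  std::size_t queued() {
    std::lock_guard<std::mutex> guard(lock_);
    return queue_.size();
  }

 private:
  std::mutex lock_;
  std::vector<std::unique_ptr<Op>> queue_;
};
```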

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agocrush/CrushWrapper: note about get_immediate_parent() 988/head
Sage Weil [Tue, 24 Dec 2013 16:01:15 +0000 (08:01 -0800)]
crush/CrushWrapper: note about get_immediate_parent()

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agolibrados: mark old get_version() as deprecated
Sage Weil [Mon, 23 Dec 2013 21:14:43 +0000 (13:14 -0800)]
librados: mark old get_version() as deprecated

Use the newly-discovered (for me) deprecated attribute to mark the old
get_version() method and point users toward get_version64().  And fix a
couple of users in the kvstore code!
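
A small sketch of the pattern, using the standard C++14 [[deprecated]]
spelling. The class name CompletionSketch and the narrower return type of
the old method are assumptions made for the example; only the two method
names come from this commit message.

```cpp
#include <cstdint>

// Hypothetical class used only to show the attribute.
class CompletionSketch {
 public:
  explicit CompletionSketch(uint64_t v) : version_(v) {}

  // Preferred accessor: returns the full 64-bit version.
  uint64_t get_version64() const { return version_; }

  // Old accessor kept for compatibility; any caller gets a compile-time
  // warning that points at the replacement.
  [[deprecated("use get_version64() instead")]]
  uint32_t get_version() const { return static_cast<uint32_t>(version_); }

 private:
  uint64_t version_;
};
```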

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agolibrados: deprecate aio_operate() read variant that takes snapid
Sage Weil [Mon, 23 Dec 2013 21:13:06 +0000 (13:13 -0800)]
librados: deprecate aio_operate() read variant that takes snapid

The argument was ignored.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agolibrbd: localize or distribute parent (snap) reads
Sage Weil [Thu, 31 Oct 2013 00:21:05 +0000 (17:21 -0700)]
librbd: localize or distribute parent (snap) reads

The parent is always a snapshot.  We may want to treat it differently
than other snaps by virtue of it (likely) being a more highly-shared
image.

By default, localize parent reads.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdc/Objecter: use crush location and distance for LOCALIZE_READS
Sage Weil [Wed, 30 Oct 2013 15:59:48 +0000 (08:59 -0700)]
osdc/Objecter: use crush location and distance for LOCALIZE_READS

Use the hierarchy in the CRUSH map to determine what the closest
replica is.
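
A simplified sketch of the idea, not the Objecter or CrushWrapper code.
It assumes each location is a list of bucket names ordered from the
lowest level up to the root and that all locations have the same number
of levels; the names Location, common_ancestor_distance and
closest_replica are made up for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Bucket names from the lowest level up to the root,
// e.g. {"host-a", "rack-1", "default"}.
typedef std::vector<std::string> Location;

// Number of levels to ascend before both locations share a bucket;
// -1 when they have no common ancestor.
int common_ancestor_distance(const Location& a, const Location& b) {
  std::size_t n = a.size() < b.size() ? a.size() : b.size();
  for (std::size_t level = 0; level < n; ++level) {
    if (a[level] == b[level])
      return static_cast<int>(level);
  }
  return -1;
}

// Index of the replica closest to the client, or -1 if none shares an
// ancestor with it. Ties go to the earlier replica.
int closest_replica(const Location& client,
                    const std::vector<Location>& replicas) {
  int best = -1;
  int best_distance = -1;
  for (std::size_t i = 0; i < replicas.size(); ++i) {
    int d = common_ancestor_distance(client, replicas[i]);
    if (d >= 0 && (best < 0 || d < best_distance)) {
      best = static_cast<int>(i);
      best_distance = d;
    }
  }
  return best;
}
```

With a client on host-a in rack-1, a replica in the same rack is
preferred over one that only shares the root.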

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdc/Objecter: maintain crush_location multimap
Sage Weil [Mon, 23 Dec 2013 23:18:07 +0000 (15:18 -0800)]
osdc/Objecter: maintain crush_location multimap

Observe and parse the 'crush location' config option.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushWrapper: simplify get_full_location_ordered()
Sage Weil [Wed, 30 Oct 2013 16:00:52 +0000 (09:00 -0700)]
crush/CrushWrapper: simplify get_full_location_ordered()

Just ascend the hierarchy; it is much less complicated.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushWrapper: add get_common_ancestor_distance()
Sage Weil [Wed, 30 Oct 2013 15:59:00 +0000 (08:59 -0700)]
crush/CrushWrapper: add get_common_ancestor_distance()

Calculate closest common ancestor (type) in the hierarchy.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #990 from ceph/wip-fix-mon-fwd
Sage Weil [Tue, 24 Dec 2013 01:02:11 +0000 (17:02 -0800)]
Merge pull request #990 from ceph/wip-fix-mon-fwd

mon: fix forwarded request features when requests are resent

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoMerge pull request #989 from ceph/wip-7056
Sage Weil [Mon, 23 Dec 2013 23:53:42 +0000 (15:53 -0800)]
Merge pull request #989 from ceph/wip-7056

osd/ReplicatedPG: include omap header in copy-get

This now passes rados/thrash tests without failures.

11 years agomon/OSDMonitor: use generic CrushWrapper::parse_loc_map helper
Sage Weil [Tue, 29 Oct 2013 23:37:59 +0000 (16:37 -0700)]
mon/OSDMonitor: use generic CrushWrapper::parse_loc_map helper

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushWrapper: add parse_loc_[multi]map helpers
Sage Weil [Tue, 29 Oct 2013 23:37:42 +0000 (16:37 -0700)]
crush/CrushWrapper: add parse_loc_[multi]map helpers

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #991 from dachary/wip-stop
Sage Weil [Mon, 23 Dec 2013 21:12:14 +0000 (13:12 -0800)]
Merge pull request #991 from dachary/wip-stop

vstart/stop: do not loop forever on kill

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: fix copy-get iteration of omap keys 989/head
Sage Weil [Mon, 23 Dec 2013 20:52:34 +0000 (12:52 -0800)]
osd/ReplicatedPG: fix copy-get iteration of omap keys

We need to call upper_bound() before checking if the iterator is valid!
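
The ordering matters because the validity check is only meaningful once
the iterator has been positioned. A sketch of the corrected order using
std::map as a stand-in for the omap iterator (list_keys_after is a
hypothetical helper, not the ReplicatedPG code):

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Returns up to `max` keys that sort strictly after `start_after`.
std::vector<std::string> list_keys_after(
    const std::map<std::string, std::string>& omap,
    const std::string& start_after, std::size_t max) {
  std::vector<std::string> out;
  // Position the iterator first; only then test it against end().
  std::map<std::string, std::string>::const_iterator it =
      omap.upper_bound(start_after);
  for (; it != omap.end() && out.size() < max; ++it)
    out.push_back(it->first);
  return out;
}
```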

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados: s/tmap/omap/
Sage Weil [Mon, 23 Dec 2013 19:37:53 +0000 (11:37 -0800)]
ceph_test_rados: s/tmap/omap/

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agovstart/stop: do not loop forever on kill 991/head
Loic Dachary [Mon, 23 Dec 2013 20:44:38 +0000 (21:44 +0100)]
vstart/stop: do not loop forever on kill

It may be the case that stop.sh can't stop a process for reasons
unrelated to vstart.sh, for instance because apache runs independently.
Instead of trying forever, try twice in a row (should be enough 99% of
the time), then try three more times, sleeping one second between each
try, which should be more than enough.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoconfig: add 'crush location' option
Sage Weil [Tue, 29 Oct 2013 23:19:37 +0000 (16:19 -0700)]
config: add 'crush location' option

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc: Fix caps documentation for Admin API
Wido den Hollander [Mon, 23 Dec 2013 20:10:59 +0000 (21:10 +0100)]
doc: Fix caps documentation for Admin API

The correct caps value is users, not user.

11 years agomon: fix forwarded request features when requests are resent 990/head
Sage Weil [Mon, 23 Dec 2013 18:59:14 +0000 (10:59 -0800)]
mon: fix forwarded request features when requests are resent

Pass the features in explicitly so that we can use messages we've just
decoded in resend_routed_requests().

Keep the features in struct RoutedRequest.

Renamed conn_features -> con_features while we are here.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: include omap header in copy-get
Sage Weil [Mon, 23 Dec 2013 18:21:44 +0000 (10:21 -0800)]
osd/ReplicatedPG: include omap header in copy-get

Missed this the first time around.  Thank you, ceph_test_rados!

Fixes: #7056
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #984 from ceph/wip-7051
Sage Weil [Mon, 23 Dec 2013 17:52:02 +0000 (09:52 -0800)]
Merge pull request #984 from ceph/wip-7051

#7051: forward connection features along with the message

Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Mon, 23 Dec 2013 17:28:29 +0000 (09:28 -0800)]
Merge remote-tracking branch 'gh/next'

11 years agoMerge remote-tracking branch 'gh/wip-cache'
Sage Weil [Mon, 23 Dec 2013 17:22:36 +0000 (09:22 -0800)]
Merge remote-tracking branch 'gh/wip-cache'

11 years agoMerge pull request #987 from ceph/wip-crush-shrink-diff
Sage Weil [Mon, 23 Dec 2013 17:19:11 +0000 (09:19 -0800)]
Merge pull request #987 from ceph/wip-crush-shrink-diff

crush: shrink diff with kernel implementation

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agocrush: misc formatting and whitespace fixes 987/head
Ilya Dryomov [Mon, 23 Dec 2013 16:12:56 +0000 (18:12 +0200)]
crush: misc formatting and whitespace fixes

- whitespace in crush.h

- format is_out() definition and call site to 80 columns

- whitespace around local_fallback_tries in crush_choose_firstn()

All of this is to shrink the diff with the kernel implementation.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agocrush: use kernel-doc consistently
Ilya Dryomov [Mon, 23 Dec 2013 16:12:56 +0000 (18:12 +0200)]
crush: use kernel-doc consistently

kernel-doc syntax is "@arg: desc", not "@param arg desc".  In addition,
these comments are usually placed around function definitions instead
of function declarations.  Follow these guidelines to shrink the diff.
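
For reference, a comment in the kernel-doc layout described above,
placed at the definition. The function clamp_weight is hypothetical and
exists only to carry the comment.

```cpp
/**
 * clamp_weight() - limit a weight to an upper bound
 * @weight: the weight to clamp
 * @max: the largest value allowed
 *
 * Hypothetical function, used only to show the comment layout:
 * "@arg: desc" lines, with the comment next to the definition.
 */
unsigned int clamp_weight(unsigned int weight, unsigned int max)
{
	return weight > max ? max : weight;
}
```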

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agocrush/mapper: unsigned -> unsigned int
Ilya Dryomov [Mon, 23 Dec 2013 16:12:56 +0000 (18:12 +0200)]
crush/mapper: unsigned -> unsigned int

Kernel implementation is located in net/, and use of "unsigned int" is
preferred to bare "unsigned" in net tree (as proven by several net/
cleanups).  Follow this guideline to shrink the diff.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agoMerge pull request #985 from dachary/wip-erasure-code-defaults
João Eduardo Luís [Mon, 23 Dec 2013 12:47:41 +0000 (04:47 -0800)]
Merge pull request #985 from dachary/wip-erasure-code-defaults

mon: use kill instead of pkill in osd-pool-create

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: use kill instead of pkill in osd-pool-create 985/head
Loic Dachary [Mon, 23 Dec 2013 12:10:18 +0000 (13:10 +0100)]
mon: use kill instead of pkill in osd-pool-create

The --pidfile option of pkill is not supported by all versions. Use kill
instead for compatibility. Instead of looping on ':', loop on 'sleep 1' so
an infinite loop is slower at filling the disk.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: OSDMap: dump osd_xinfo_t::features as an int 984/head
Joao Eduardo Luis [Mon, 23 Dec 2013 01:29:23 +0000 (17:29 -0800)]
osd: OSDMap: dump osd_xinfo_t::features as an int

Instead of dumping the list in a string-list format, which in
retrospect wasn't very useful.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: Monitor: Forward connection features
Joao Eduardo Luis [Mon, 23 Dec 2013 01:26:59 +0000 (17:26 -0800)]
mon: Monitor: Forward connection features

We are relying on connection features to track OSD supported
features.  However, we were not forwarding connection features
when we forwarded a message from a peon to the leader.  That
was breaking the OSD feature tracking.

Fixes: 7051
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoMerge remote-tracking branch 'gh/master' into wip-cache
Sage Weil [Sun, 22 Dec 2013 23:33:59 +0000 (15:33 -0800)]
Merge remote-tracking branch 'gh/master' into wip-cache

Conflicts:
src/osdc/Objecter.h
src/vstart.sh

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #976 from dachary/wip-erasure-code-defaults
Sage Weil [Sun, 22 Dec 2013 23:30:43 +0000 (15:30 -0800)]
Merge pull request #976 from dachary/wip-erasure-code-defaults

provide sensible defaults when creating an erasure coded pool

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomon: unit test for osd pool create 976/head
Loic Dachary [Fri, 20 Dec 2013 19:39:21 +0000 (20:39 +0100)]
mon: unit test for osd pool create

It is inconvenient to run such tests in the
qa/workunits/cephtool/test.sh because they require that the mon is
restarted to test errors in the format of the default erasure code
properties and check the appropriate error message is output.

osd-pool-create.sh runs a single mon from sources using command
line options and a temporary directory, the same way vstart.sh does but
lightweight.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: erasure code pool properties defaults
Loic Dachary [Sun, 22 Dec 2013 22:37:08 +0000 (23:37 +0100)]
mon: erasure code pool properties defaults

If no properties are set when creating an erasure coded pool, default to
using the jerasure plugin with the cauchy_good technique which is the
fastest.

The defaults are set with osd_pool_default_erasure_code_properties.

The erasure code plugins are loaded from the directory specified in the
erasure-code-directory property. Contrary to the other properties it
will most commonly be the same throughout the cluster. The default is
set to /usr/lib/ceph/erasure-code with
osd_pool_default_erasure_code_directory

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: add error message argument to prepare_new_pool
Loic Dachary [Fri, 20 Dec 2013 16:23:16 +0000 (17:23 +0100)]
mon: add error message argument to prepare_new_pool

Add a stringstream argument to prepare_new_pool for the purpose of
recording a human-readable error message.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: do not include = in pool properties values
Loic Dachary [Sat, 21 Dec 2013 13:52:17 +0000 (14:52 +0100)]
mon: do not include = in pool properties values

foo=bar was parsed as {"foo":"=bar"} instead of {"foo":"bar"} because of
the missing equal++
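
A sketch of the off-by-one, assuming a hypothetical helper
split_key_value rather than the monitor's actual parsing code: the
position of '=' has to be advanced by one before the value is taken.

```cpp
#include <string>
#include <utility>

// Splits "key=value" at the first '='. A string without '=' yields an
// empty value.
std::pair<std::string, std::string> split_key_value(const std::string& s) {
  std::string::size_type equal = s.find('=');
  if (equal == std::string::npos)
    return std::make_pair(s, std::string());
  std::string key = s.substr(0, equal);
  ++equal;  // step past '=' so it does not end up in the value
  return std::make_pair(key, s.substr(equal));
}
```

Without the increment, the value for foo=bar would start at the '='
character, which is the "=bar" result described above.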

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocommon: implement get_str_map to parse key/values
Loic Dachary [Sat, 21 Dec 2013 12:58:44 +0000 (13:58 +0100)]
common: implement get_str_map to parse key/values

It is capable of parsing json or key=value pairs. The prototype is made
to look like get_str_list. The implementation is in common + include and
uses .h. It will probably be moved to common and use .hpp instead, along
with str_list.{cc,h}.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: pool properties are not an array
Loic Dachary [Sat, 21 Dec 2013 13:48:27 +0000 (14:48 +0100)]
osd: pool properties are not an array

They must be dumped with open_object_section instead of
open_array_section otherwise only the values are displayed.
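
To make the difference concrete, a toy sketch of the two output shapes.
It does not use the Formatter class; dump_as_array and dump_as_object
are hypothetical helpers, and string escaping is left out for brevity.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> Properties;

// Array form: only the values survive, e.g. ["bar"].
std::string dump_as_array(const Properties& props) {
  std::string out = "[";
  for (Properties::const_iterator p = props.begin(); p != props.end(); ++p) {
    if (p != props.begin())
      out += ",";
    out += "\"" + p->second + "\"";
  }
  return out + "]";
}

// Object form: keys and values are both kept, e.g. {"foo":"bar"}.
std::string dump_as_object(const Properties& props) {
  std::string out = "{";
  for (Properties::const_iterator p = props.begin(); p != props.end(); ++p) {
    if (p != props.begin())
      out += ",";
    out += "\"" + p->first + "\":\"" + p->second + "\"";
  }
  return out + "}";
}
```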

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agomon: osd create pool must fail on incompatible type
Loic Dachary [Sat, 21 Dec 2013 14:49:19 +0000 (15:49 +0100)]
mon: osd create pool must fail on incompatible type

When osd create pool is called twice on the same pool, it will succeed
because the pool already exists. However, if a different type is
specified, it must fail.
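
The intended rule, sketched with a plain map instead of the monitor's
data structures. The names create_pool and PoolType are hypothetical,
and -EINVAL is an assumed error code chosen for the example.

```cpp
#include <cerrno>
#include <map>
#include <string>

enum PoolType { POOL_REPLICATED, POOL_ERASURE };

// Returns 0 when the pool is created or already exists with the same
// type, and -EINVAL (assumed) when it exists with a different type.
int create_pool(std::map<std::string, PoolType>& pools,
                const std::string& name, PoolType type) {
  std::map<std::string, PoolType>::const_iterator it = pools.find(name);
  if (it == pools.end()) {
    pools[name] = type;
    return 0;
  }
  return it->second == type ? 0 : -EINVAL;
}
```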

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agopackaging: erasure-code plugins go in /usr/lib/ceph
Loic Dachary [Fri, 20 Dec 2013 16:05:45 +0000 (17:05 +0100)]
packaging: erasure-code plugins go in /usr/lib/ceph

Install the plugins in /usr/lib/ceph/erasure-code instead of
/usr/lib/erasure-code to comply with FHS : "Applications may use a
single subdirectory under /usr/lib."

http://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.html

The debian package is modified to install the plugins as part of the
ceph package which also ships rados-classes.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #983 from dachary/wip-rep-replicated
Sage Weil [Sun, 22 Dec 2013 20:39:08 +0000 (12:39 -0800)]
Merge pull request #983 from dachary/wip-rep-replicated

mon: s/rep/replicated/ in pool create prototype

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomon: s/rep/replicated/ in pool create prototype 983/head
Loic Dachary [Sun, 22 Dec 2013 17:26:42 +0000 (18:26 +0100)]
mon: s/rep/replicated/ in pool create prototype

The test is updated to remove unnecessary asserts. Since all combinations
of properties and pool type are allowed, there is no way to statically
check the validity of the arguments.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoceph_test_rados: update in-memory user_version on RemoveAttrsOp
Sage Weil [Sun, 22 Dec 2013 07:32:24 +0000 (23:32 -0800)]
ceph_test_rados: update in-memory user_version on RemoveAttrsOp

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: clear whiteout on successful copy-from
Sage Weil [Sun, 22 Dec 2013 07:01:56 +0000 (23:01 -0800)]
osd/ReplicatedPG: clear whiteout on successful copy-from

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados: check existence on is_dirty completion
Sage Weil [Sun, 22 Dec 2013 06:52:28 +0000 (22:52 -0800)]
ceph_test_rados: check existence on is_dirty completion

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomon/OSDMonitor: propagate snap updates to tier pools on update
Sage Weil [Thu, 19 Dec 2013 23:01:26 +0000 (15:01 -0800)]
mon/OSDMonitor: propagate snap updates to tier pools on update

For any pg_pool_t update, verify that any changes to the pool snapshot
metadata are propagated to the tiers.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/OSDMap: implement propagate_snaps_to_tiers()
Sage Weil [Thu, 19 Dec 2013 22:59:45 +0000 (14:59 -0800)]
osd/OSDMap: implement propagate_snaps_to_tiers()

Tier pools mirror the base pool's snapshot metadata.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agorgw: add -ldl for mongoose
Sage Weil [Sun, 22 Dec 2013 17:00:43 +0000 (09:00 -0800)]
rgw: add -ldl for mongoose

/usr/bin/ld: mongoose/mongoose.o: undefined reference to symbol 'dlsym@@GLIBC_2.2.5'
/lib/x86_64-linux-gnu/libdl.so.2: error adding symbols: DSO missing from command line
error: collect2: ld returned 1 exit status

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #980 from ceph/port/misc
Sage Weil [Sun, 22 Dec 2013 17:34:12 +0000 (09:34 -0800)]
Merge pull request #980 from ceph/port/misc

Misc portability patches

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #981 from dachary/wip-rep-replicated
Sage Weil [Sun, 22 Dec 2013 07:43:38 +0000 (23:43 -0800)]
Merge pull request #981 from dachary/wip-rep-replicated

replace pool type REP with REPLICATED

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados_api_tier: more grace for HitSetTrim
Sage Weil [Sun, 22 Dec 2013 07:40:14 +0000 (23:40 -0800)]
ceph_test_rados_api_tier: more grace for HitSetTrim

Saw this test fail due to ill-timed thrashing:

 /a/teuthology-2013-12-20_23:00:02-rados-master-testing-basic-plana/10941

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados: update in-memory user_version on RemoveAttrsOp
Sage Weil [Sun, 22 Dec 2013 07:32:24 +0000 (23:32 -0800)]
ceph_test_rados: update in-memory user_version on RemoveAttrsOp

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoreplace pool type REP with REPLICATED 981/head
Loic Dachary [Sun, 22 Dec 2013 06:04:36 +0000 (07:04 +0100)]
replace pool type REP with REPLICATED

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agodoc/release-notes: missed a name
Sage Weil [Sun, 22 Dec 2013 05:42:43 +0000 (21:42 -0800)]
doc/release-notes: missed a name

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc/release-notes: v0.72.2
Sage Weil [Sun, 22 Dec 2013 05:33:23 +0000 (21:33 -0800)]
doc/release-notes: v0.72.2

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agopipe: add compat for TEMP_FAILURE_RETRY symbol 980/head
Noah Watkins [Sat, 21 Dec 2013 19:12:10 +0000 (13:12 -0600)]
pipe: add compat for TEMP_FAILURE_RETRY symbol

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agolinux_version: build on all platforms
Noah Watkins [Sat, 21 Dec 2013 19:08:59 +0000 (13:08 -0600)]
linux_version: build on all platforms

This linux version check is used in FileJournal to check write
caching behavior. This is a temporary fix that will result in the
failure path and a warning about write caching being turned on until
methods for OSX/FreeBSD/Windows can be found to obtain the same
information.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agomake: add libcommon for missing symbols
Noah Watkins [Sat, 21 Dec 2013 19:03:05 +0000 (13:03 -0600)]
make: add libcommon for missing symbols

On OSX without linking in libcommon at the end of these make targets
there is a missing reference to pipe_cloexec, even though the dependency
is present indirectly through libglobal.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agobuffer: remove darwin portability cruft
Noah Watkins [Sun, 29 Sep 2013 18:34:54 +0000 (11:34 -0700)]
buffer: remove darwin portability cruft

valloc conflicts with an existing call, and none of these macros are
actually used in buffer.h. The DARWIN check isn't valid either since
this is an installed header and that depends on acconfig.h

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agostatfs: include headers for statfs structs
Noah Watkins [Sun, 29 Sep 2013 19:07:55 +0000 (12:07 -0700)]
statfs: include headers for statfs structs

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agocompat: enable lseek64 alias
Noah Watkins [Sun, 29 Sep 2013 16:04:37 +0000 (09:04 -0700)]
compat: enable lseek64 alias

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agolibcephfs: ignore missing offset64 definition
Noah Watkins [Tue, 24 Sep 2013 16:09:28 +0000 (09:09 -0700)]
libcephfs: ignore missing offset64 definition

on apple.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agoMerge pull request #977 from ceph/wip-kill-raid4
Sage Weil [Fri, 20 Dec 2013 23:20:23 +0000 (15:20 -0800)]
Merge pull request #977 from ceph/wip-kill-raid4

osd: remove remaining instances of raid4 pool types (never implemented)

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #975 from BCLibCoop/bclibcoop/rgw_cors
Yehuda Sadeh [Fri, 20 Dec 2013 22:26:41 +0000 (14:26 -0800)]
Merge pull request #975 from BCLibCoop/bclibcoop/rgw_cors

RGW: CORS use the correct headers for checking, and validate headers as lowercase where needed

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoMerge pull request #979 from dachary/wip-wrapped-vstart-errors
Sage Weil [Fri, 20 Dec 2013 22:19:05 +0000 (14:19 -0800)]
Merge pull request #979 from dachary/wip-wrapped-vstart-errors

vstart_wrapped_tests must fail if one test fail

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #818 from ceph/wip-rgw-standalone-2
Sage Weil [Fri, 20 Dec 2013 21:41:58 +0000 (13:41 -0800)]
Merge pull request #818 from ceph/wip-rgw-standalone-2

Wip rgw standalone 2

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoRevert "Enable libs3 support for debian packages"
Sage Weil [Fri, 20 Dec 2013 21:14:08 +0000 (13:14 -0800)]
Revert "Enable libs3 support for debian packages"

This reverts commit 8814265f0888f8091a7d83a900ffd6b65ae77f34.

Or not!  This adds a build-time dependency which none of the gitbuilders
have, so scrap it.

11 years agomon: pool create will not fail if the type differs 979/head
Loic Dachary [Fri, 20 Dec 2013 21:00:31 +0000 (22:00 +0100)]
mon: pool create will not fail if the type differs

It looked like it worked because the wrapper hid the error. The failing
tests are commented out so that the other tests can be used.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agodoc/release-notes: v0.67.5
Sage Weil [Fri, 20 Dec 2013 20:53:03 +0000 (12:53 -0800)]
doc/release-notes: v0.67.5

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agounittests: fail if one test fail
Loic Dachary [Fri, 20 Dec 2013 20:13:00 +0000 (21:13 +0100)]
unittests: fail if one test fail

vstart_wrapped_tests must return an error if one of the tests
fails.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #460 from toabctl/build-depends
Sage Weil [Fri, 20 Dec 2013 18:48:11 +0000 (10:48 -0800)]
Merge pull request #460 from toabctl/build-depends

Enable libs3 support for debian packages

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #916 from ceph/port/buffer
Sage Weil [Fri, 20 Dec 2013 18:29:03 +0000 (10:29 -0800)]
Merge pull request #916 from ceph/port/buffer

buffer: use int64_t instead of loff_t

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agobuffer: use int64_t instead of loff_t 916/head
Noah Watkins [Fri, 6 Dec 2013 19:09:51 +0000 (11:09 -0800)]
buffer: use int64_t instead of loff_t

Because portability.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agoMerge pull request #933 from dachary/wip-erasure-code-benchmark
Loic Dachary [Fri, 20 Dec 2013 11:24:10 +0000 (03:24 -0800)]
Merge pull request #933 from dachary/wip-erasure-code-benchmark

osd: erasure code benchmark tool

Reviewed-by: Andreas Peters <andreas.joachim.peters@cern.ch>
Reviewed-by: Christophe Courtaut <christophe.courtaut@gmail.com>
11 years agoosd: git ignore erasure code benchmark binary 933/head
Loic Dachary [Thu, 19 Dec 2013 11:14:38 +0000 (12:14 +0100)]
osd: git ignore erasure code benchmark binary

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: erasure code benchmark is installed as part of ceph-test
Loic Dachary [Thu, 19 Dec 2013 11:12:41 +0000 (12:12 +0100)]
osd: erasure code benchmark is installed as part of ceph-test

Add to the packaging for RPMs and DEBs

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: erasure code benchmark workunit
Loic Dachary [Thu, 12 Dec 2013 22:14:02 +0000 (23:14 +0100)]
osd: erasure code benchmark workunit

Display benchmark results for the default erasure code plugins in a
tab-separated file. The first two columns contain the elapsed time in
seconds and the amount of KB that were coded or decoded, for the
combination of parameters displayed in the following fields.

seconds  KB  plugin   k  m  work.   iter.  size  eras.
1.2      10  example  2  1  encode  10     1024  0
0.5      10  example  2  1  decode  10     1024  1

It can be used as input for a human readable report. It is also intended
to be used to show if a given version of an erasure code plugin performs
better than another.

The last column (not shown above for brevity) is the exact command
that was run to produce the result, so it can be copied and pasted to
reproduce it or to profile.

Only the jerasure techniques mentioned in
https://www.usenix.org/legacy/events/fast09/tech/full_papers/plank/plank_html/
are benchmarked, the others are assumed to be less interesting.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: erasure code benchmark tool
Loic Dachary [Wed, 11 Dec 2013 23:34:35 +0000 (00:34 +0100)]
osd: erasure code benchmark tool

Implement the ceph_erasure_code_benchmark utility to:

* load an erasure code plugin

* loop over the encode/decode function using the parameters from the
  command line

* print the number of bytes encoded/decoded and the time to process

When decoding, random chunks ( as set with --erasures ) are lost on each
run.

For instance:

    $ ceph_erasure_code_benchmark \
       --plugin jerasure \
       --parameter erasure-code-directory=.libs \
       --parameter erasure-code-technique=reed_sol_van \
       --parameter erasure-code-k=2 \
       --parameter erasure-code-m=2 \
       --workload decode \
       --erasures 2 \
       --iterations 1000
    0.964759 1048576

shows that 1GB is decoded in about 1 second.
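
The arithmetic behind that reading, as a sketch. It assumes the two
output fields are seconds and KB with 1 KB = 1024 bytes, so that
1048576 KB is 1 GB as stated above; gib_per_second is a hypothetical
helper and not part of the tool.

```cpp
// Converts the two output fields (seconds, KB) into GB per second,
// taking 1 GB as 1024 * 1024 KB.
double gib_per_second(double seconds, double kilobytes) {
  const double kb_per_gb = 1024.0 * 1024.0;
  return (kilobytes / kb_per_gb) / seconds;
}
```

For the run above, gib_per_second(0.964759, 1048576) comes out slightly
above 1, i.e. roughly 1GB per second.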

It is intended to be used by other scripts to present a human readable
output or detect performance regressions.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: set erasure code packet size default to 2048
Loic Dachary [Fri, 13 Dec 2013 23:41:03 +0000 (00:41 +0100)]
osd: set erasure code packet size default to 2048

As shown in
https://www.usenix.org/legacy/events/fast09/tech/full_papers/plank/plank_html/
under "Impact of the Packet Size", the optimal packet size is on the order of 1k
rather than the current default of 8. Benchmarks are required to find
the actual optimum.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: better performances for the erasure code example
Loic Dachary [Fri, 13 Dec 2013 13:07:37 +0000 (14:07 +0100)]
osd: better performances for the erasure code example

The XOR based example is ten times slower than it could be because it uses
the buffer::ptr[] operator. Use a temporary char * instead. It performs
as well as jerasure Reed Solomon when decoding with a single erasure:

$ ceph_erasure_code_benchmark \
   --plugin example  --parameter erasure-code-directory=.libs \
   --parameter erasure-code-technique=example \
   --parameter erasure-code-k=2 --parameter erasure-code-m=1 \
   --erasure 1 --workload decode --iterations 5000
8.095007 5GB

$ ceph_erasure_code_benchmark \
   --plugin jerasure  --parameter erasure-code-directory=.libs \
   --parameter erasure-code-technique=reed_sol_van \
   --parameter erasure-code-k=10 --parameter erasure-code-m=6 \
   --erasure 1 --workload decode --iterations 5000
7.870990 5GB
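
A sketch of the raw pointer approach for k=2, m=1, using std::vector in
place of the buffer classes (xor_chunks is a hypothetical helper, not
the plugin code): the pointers are fetched once, so the inner loop does
no per-byte accessor call.

```cpp
#include <cstddef>
#include <vector>

// Computes dst[i] = a[i] ^ b[i] over the common length of a and b.
void xor_chunks(const std::vector<char>& a, const std::vector<char>& b,
                std::vector<char>& dst) {
  std::size_t n = a.size() < b.size() ? a.size() : b.size();
  dst.resize(n);
  // Fetch the raw pointers once, outside the loop.
  const char* pa = a.data();
  const char* pb = b.data();
  char* pd = dst.data();
  for (std::size_t i = 0; i < n; ++i)
    pd[i] = static_cast<char>(pa[i] ^ pb[i]);
}
```

Encoding is parity = d0 ^ d1; a lost d0 is recovered as parity ^ d1
with the same function.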

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: conditionally disable dlclose of erasure code plugins
Loic Dachary [Thu, 12 Dec 2013 13:03:26 +0000 (14:03 +0100)]
osd: conditionally disable dlclose of erasure code plugins

When profiling, tools such as valgrind --tool=callgrind require that the
dynamically loaded libraries are not dlclosed so they can collect usage
information.

The public ErasureCodePluginRegistry::disable_dlclose boolean is introduced
for this purpose.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoosd: Fix assert which doesn't apply when compat_mode on
David Zafman [Thu, 19 Dec 2013 22:37:28 +0000 (14:37 -0800)]
osd: Fix assert which doesn't apply when compat_mode on

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit edaec9a8361396bd4c12814c16610669694b5b6c)

11 years agoAdd backward compatible acting set until all OSDs updated
David Zafman [Tue, 17 Dec 2013 06:08:07 +0000 (22:08 -0800)]
Add backward compatible acting set until all OSDs updated

Add configuration variable to override compatible acting set handling.
Later we'll check in the osdmap that all OSDs are updated to use new acting sets.

Fixes: #6990
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 19cff890eb6083eefdb7b709773313b2c8acbcea)

11 years agoosd/ReplicatedPG: fix promote cancellation
Sage Weil [Thu, 19 Dec 2013 21:14:18 +0000 (13:14 -0800)]
osd/ReplicatedPG: fix promote cancellation

The canceling caller cleans up the blocked objects for us; we simply need
to bail out early.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: drop RepGather::ondone callback
Sage Weil [Thu, 19 Dec 2013 21:12:20 +0000 (13:12 -0800)]
osd/ReplicatedPG: drop RepGather::ondone callback

We kick the blocked contexts in the completion path of process_copy_chunk(),
after we have taken the RWWRITE obc lock.  There is no need to delay the
unblocking until the RepGather finishes.

This also fixes a leak: the ondone wasn't getting cleaned up if a peering
interval change happens and the repgather is applied early in on_change().

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agovstart.sh: go faster
Sage Weil [Wed, 18 Dec 2013 20:56:40 +0000 (12:56 -0800)]
vstart.sh: go faster

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: fix undirty on clean object
Sage Weil [Tue, 17 Dec 2013 20:33:54 +0000 (12:33 -0800)]
osd/ReplicatedPG: fix undirty on clean object

Return success, but do not screw up the stats.
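
The rule in sketch form, with hypothetical types (ObjectState, Stats)
standing in for the object info and the pg stats: undirty always
succeeds, but the counter only moves when the object really was dirty.

```cpp
// Hypothetical stand-ins for the object info and the pg stats.
struct ObjectState {
  bool dirty;
};

struct Stats {
  long num_objects_dirty;
};

// Clears the dirty flag. Always returns 0 (success), but only adjusts
// the counter when the object was dirty to begin with.
int undirty(ObjectState& obj, Stats& stats) {
  if (obj.dirty) {
    obj.dirty = false;
    --stats.num_objects_dirty;
  }
  return 0;
}
```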

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: track dirty, whiteout stat counts
Sage Weil [Tue, 17 Dec 2013 01:18:48 +0000 (17:18 -0800)]
osd/ReplicatedPG: track dirty, whiteout stat counts

These counts will be useful (even necessary!) for the cache agent, and are
generally interesting to the admin as well.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/osd_types: include num_objects_dirty, num_whiteouts in object_stat_sum_t
Sage Weil [Tue, 17 Dec 2013 00:03:56 +0000 (16:03 -0800)]
osd/osd_types: include num_objects_dirty, num_whiteouts in object_stat_sum_t

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: EBUSY on cache-evict when watchers are present
Sage Weil [Sat, 14 Dec 2013 00:39:02 +0000 (16:39 -0800)]
osd/ReplicatedPG: EBUSY on cache-evict when watchers are present

Linger operations will follow the object to the cache pool when the pool
overlay process is set.  If we evict the object, the object_info_t will
go away along with the watch state and confusing things will happen.
Prevent that from happening by returning EBUSY when you try to evict a
watched object.

Note that you *can* flush a watched object, and the dirty flag will be
cleared.  But you still can't evict it.
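
A sketch of the two rules, with a hypothetical CachedObject struct in
place of the real object context. Any other condition that may block an
eviction is deliberately left out.

```cpp
#include <cerrno>

// Hypothetical stand-in for a cached object and its watch state.
struct CachedObject {
  bool present;
  bool dirty;
  int watchers;
};

// Flushing is allowed even with watchers; it only clears the dirty flag.
int cache_flush(CachedObject& o) {
  o.dirty = false;
  return 0;
}

// Eviction refuses to drop an object that still has watchers.
int cache_evict(CachedObject& o) {
  if (o.watchers > 0)
    return -EBUSY;
  o.present = false;
  return 0;
}
```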

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados: test cache_flush, cache_try_flush, cache_evict
Sage Weil [Thu, 12 Dec 2013 21:21:31 +0000 (13:21 -0800)]
ceph_test_rados: test cache_flush, cache_try_flush, cache_evict

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph_test_rados_api_tier: fix HitSet* test names
Sage Weil [Tue, 17 Dec 2013 18:32:07 +0000 (10:32 -0800)]
ceph_test_rados_api_tier: fix HitSet* test names

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/osd_types: debug: include size in object_info_t operator<<
Sage Weil [Fri, 13 Dec 2013 21:40:01 +0000 (13:40 -0800)]
osd/osd_types: debug: include size in object_info_t operator<<

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: debug: clean up oi printout
Sage Weil [Fri, 13 Dec 2013 21:38:13 +0000 (13:38 -0800)]
osd/ReplicatedPG: debug: clean up oi printout

Signed-off-by: Sage Weil <sage@inktank.com>