git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
11 years agocommon: fix rare race condition in Throttle unit tests 944/head
Loic Dachary [Sun, 15 Dec 2013 13:31:27 +0000 (14:31 +0100)]
common: fix rare race condition in Throttle unit tests

The thread created to test Throttle race conditions updates a value (
throttle.get_current() ) that is tested by the main gtest thread but is
not protected by a lock. Instead of adding a lock, the main thread tests
the value after pthread_join() on the child thread.

http://tracker.ceph.com/issues/6679 fixes #6679

Signed-off-by: Loic Dachary <loic@dachary.org>
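
The join-before-read idea described in the message above can be sketched as follows. This is a minimal illustration and not the actual Throttle unit test: `Counter` and `run_and_read` are hypothetical names, and `std::thread` stands in for the raw pthread calls.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for the object under test; not Ceph's Throttle.
struct Counter {
  long current = 0;                 // deliberately not protected by a lock
  void take(long n) { current += n; }
};

// Lets the worker finish before the calling thread looks at the value.
long run_and_read(long amount) {
  Counter counter;
  std::thread worker([&counter, amount] { counter.take(amount); });
  worker.join();                    // completion of the worker happens-before this returns
  assert(!worker.joinable());       // the worker is gone, so no concurrent writer remains
  return counter.current;           // safe to read without a lock
}
```

Reading `counter.current` before `join()` would be a data race; reading it afterwards needs no lock, which is the trade-off the commit chose.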
11 years agocommon: format Throttle test to 80 columns
Loic Dachary [Sun, 15 Dec 2013 13:30:38 +0000 (14:30 +0100)]
common: format Throttle test to 80 columns

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #929 from kazhang/add-pkg-config
Loic Dachary [Sun, 15 Dec 2013 11:26:21 +0000 (03:26 -0800)]
Merge pull request #929 from kazhang/add-pkg-config

add apt-get install pkg-config for ubuntu server

Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agodoc: Added additional comments on placement targets and default placement.
John Wilkins [Sat, 14 Dec 2013 00:09:35 +0000 (16:09 -0800)]
doc: Added additional comments on placement targets and default placement.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agodoc: Updates to federated config.
John Wilkins [Sat, 14 Dec 2013 00:08:37 +0000 (16:08 -0800)]
doc: Updates to federated config.

Reverted Emperor versionadded to Dumpling as it gets backported.
Added default index and bucket pools to pool creation
Added default default_placement setting
Added placement_pools key val pair examples.
Added comments for re-running the procedure for the secondary region.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agoMerge remote-tracking branch 'gh/wip-objecter-full-2'
Sage Weil [Fri, 13 Dec 2013 18:49:10 +0000 (10:49 -0800)]
Merge remote-tracking branch 'gh/wip-objecter-full-2'

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #936 from ceph/wip-rbd-single-major
Josh Durgin [Fri, 13 Dec 2013 18:40:11 +0000 (10:40 -0800)]
Merge pull request #936 from ceph/wip-rbd-single-major

rbd: support for single-major device number allocation scheme

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoMerge pull request #932 from ceph/wip-6979
Sage Weil [Fri, 13 Dec 2013 18:03:43 +0000 (10:03 -0800)]
Merge pull request #932 from ceph/wip-6979

replace sgdisk subprocess calls with a helper

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Fri, 13 Dec 2013 17:58:10 +0000 (09:58 -0800)]
Merge remote-tracking branch 'gh/next'

11 years agotest/libcephfs: release resources before umount
Yan, Zheng [Tue, 10 Dec 2013 23:38:18 +0000 (07:38 +0800)]
test/libcephfs: release resources before umount

Fixes: #6742
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agouse the new get_command helper in check_call 932/head
Alfredo Deza [Fri, 13 Dec 2013 17:06:25 +0000 (12:06 -0500)]
use the new get_command helper in check_call

Signed-off-by: Alfredo Deza <alfredo@deza.pe>
11 years agorbd: modprobe with single_major=Y on newer kernels 936/head
Ilya Dryomov [Fri, 13 Dec 2013 15:40:52 +0000 (17:40 +0200)]
rbd: modprobe with single_major=Y on newer kernels

On kernels that support it, and if 'rbd map' is given a chance to
modprobe, turn on single-major device number allocation scheme.  For
users who for some reason don't want it, the workaround is to insert
the rbd module manually before executing the first 'rbd map' command.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agorbd: add support for single-major device number allocation scheme
Ilya Dryomov [Fri, 13 Dec 2013 15:40:52 +0000 (17:40 +0200)]
rbd: add support for single-major device number allocation scheme

With the preparatory commits ("rbd: match against wholedisk device
numbers on unmap" and "rbd: match against both major and minor on unmap
on kernels >= 3.14") in, this amounts to choosing to work with new rbd
bus interfaces (/sys/bus/rbd/{add,remove}_single_major) if they are
available, instead of the old ones (/sys/bus/rbd/{add,remove}).

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
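
A minimal sketch of the interface selection described above, assuming that the presence of the `add_single_major` file is what signals kernel support. `pick_add_path` and its `bus_dir` parameter are hypothetical; the parameter exists only so the function can be tried against a scratch directory instead of /sys/bus/rbd.

```cpp
#include <cassert>
#include <string>
#include <unistd.h>

// Returns the sysfs file that a map request would be written to.
// Prefers the single-major interface when the kernel exposes it.
std::string pick_add_path(const std::string& bus_dir) {
  assert(!bus_dir.empty());
  const std::string single_major = bus_dir + "/add_single_major";
  if (access(single_major.c_str(), F_OK) == 0)
    return single_major;            // newer kernel: use the new interface
  return bus_dir + "/add";          // fall back to the old interface
}
```

On a real system the call would be `pick_add_path("/sys/bus/rbd")`; the remove path could be chosen the same way.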
11 years agorbd: match against both major and minor on unmap on newer kernels
Ilya Dryomov [Fri, 13 Dec 2013 15:40:52 +0000 (17:40 +0200)]
rbd: match against both major and minor on unmap on newer kernels

As described in commit "rbd: match against wholedisk device numbers on
unmap", currently we only match against major numbers.  In preparation
for support for single-major device number allocation scheme, start
matching against minor numbers also, which newer kernels provide in
a /sys/bus/rbd/devices/<id>/minor sysfs attribute.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
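
Why the minor number matters can be seen in a small sketch. The `Mapping` record and `find_mapping` are hypothetical and only model the values read from /sys/bus/rbd/devices/<id>/{major,minor}; they are not the rbd tool's own data structures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record for one rbd mapping.
struct Mapping {
  int id;
  int major;
  int minor;
};

// Returns the id of the mapping whose major AND minor match, or -1 if none does.
int find_mapping(const std::vector<Mapping>& maps, int major, int minor) {
  assert(major >= 0 && minor >= 0);
  for (const Mapping& m : maps)
    if (m.major == major && m.minor == minor)
      return m.id;
  return -1;
}
```

With a single shared major, matching on the major alone would always return the first mapping, so the minor is what tells the devices apart.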
11 years agorbd: match against whole disks on unmap
Ilya Dryomov [Fri, 13 Dec 2013 15:40:52 +0000 (17:40 +0200)]
rbd: match against whole disks on unmap

Currently the way 'rbd unmap' translates a user-provided block device
into an rbd id is it matches the major number of the specified device
against /sys/bus/rbd/devices/<id>/major for each rbd mapping and
declares success on the first match.  This works for both entire disks
and partitions, because under the current device number allocation
scheme, each mapping means a new major number.

In preparation for support for single-major device number allocation
scheme, which would require matching both major and minor numbers, make
sure to always match against entire disk device numbers, by converting
the specified device major:minor pair into wholedisk major:minor pair.
To achieve that, use the libblkid library, which accomplishes this goal
by walking stable sysfs structures.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agorbd: switch to strict_strtol for major parsing
Ilya Dryomov [Fri, 13 Dec 2013 15:40:52 +0000 (17:40 +0200)]
rbd: switch to strict_strtol for major parsing

Use common/strict_strtol, which actually parses integers in a proper
way, instead of atoi for parsing /sys/bus/rbd/devices/<id>/major.  This
is important, because the kernel apparently can write things like
"(none)" into that file, and in general is more bulletproof.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
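
The difference from atoi can be illustrated with a strict parser built on `strtol`. This is a sketch of the general technique, not Ceph's common/strict_strtol helper; `parse_int_strict` is a hypothetical name, and a caller reading a sysfs file would be expected to strip the trailing newline first.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Parses `text` as a base-10 int. On failure returns 0 and describes the
// problem in *err; on success *err is left empty.
int parse_int_strict(const char* text, std::string* err) {
  assert(text != nullptr && err != nullptr);
  err->clear();
  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(text, &end, 10);
  if (end == text) {                 // nothing was consumed, e.g. "(none)"
    *err = "no digits found";
    return 0;
  }
  if (*end != '\0') {                // digits followed by something else
    *err = "trailing characters";
    return 0;
  }
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
    *err = "out of range";
    return 0;
  }
  return static_cast<int>(value);
}
```

`atoi("(none)")` also yields 0 but reports nothing, so the caller cannot distinguish it from a genuine major number of 0.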
11 years agoMerge pull request #934 from cernceph/wip-rgw-ulimit
Sage Weil [Thu, 12 Dec 2013 17:42:21 +0000 (09:42 -0800)]
Merge pull request #934 from cernceph/wip-rgw-ulimit

radosgw: increase nofiles ulimit on sysvinit machines

11 years agoMerge pull request #935 from ceph/wip-vstart-memstore
Sage Weil [Thu, 12 Dec 2013 17:41:40 +0000 (09:41 -0800)]
Merge pull request #935 from ceph/wip-vstart-memstore

vstart.sh: add --memstore option

11 years agovstart.sh: add --memstore option 935/head
Yehuda Sadeh [Thu, 12 Dec 2013 17:31:53 +0000 (09:31 -0800)]
vstart.sh: add --memstore option

for setting memstore backed osds

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agouse the absolute path for executables if found
Alfredo Deza [Thu, 12 Dec 2013 16:16:38 +0000 (11:16 -0500)]
use the absolute path for executables if found

Signed-off-by: Alfredo Deza <alfredo@deza.pe>
11 years agoremove trailing semicolon
Alfredo Deza [Thu, 12 Dec 2013 15:26:05 +0000 (10:26 -0500)]
remove trailing semicolon

Signed-off-by: Alfredo Deza <alfredo@deza.pe>
11 years agoradosgw: increase nofiles ulimit on sysvinit machines 934/head
Dan van der Ster [Thu, 12 Dec 2013 13:53:13 +0000 (14:53 +0100)]
radosgw: increase nofiles ulimit on sysvinit machines

Clusters with many OSDs require a higher nofiles ulimit than the RHEL default. Increase it.

Tested-by: Dan van der Ster <daniel.vanderster@cern.ch>
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
11 years agodoc/release-notes: sort
Sage Weil [Thu, 12 Dec 2013 00:13:51 +0000 (16:13 -0800)]
doc/release-notes: sort

meh

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc/release-notes: fix indentation; sigh
Sage Weil [Thu, 12 Dec 2013 00:11:00 +0000 (16:11 -0800)]
doc/release-notes: fix indentation; sigh

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc/release-notes: v0.73
Sage Weil [Wed, 11 Dec 2013 23:59:45 +0000 (15:59 -0800)]
doc/release-notes: v0.73

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoPendingReleaseNotes: note CRUSH and hashpspool default changes
Sage Weil [Wed, 11 Dec 2013 23:39:37 +0000 (15:39 -0800)]
PendingReleaseNotes: note CRUSH and hashpspool default changes

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #930 from ceph/wip-hashpspool
Sage Weil [Wed, 11 Dec 2013 23:37:46 +0000 (15:37 -0800)]
Merge pull request #930 from ceph/wip-hashpspool

enable hashpspool by default

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoRevert "Partial revert "mon: osd pool set syntax relaxed, modify unit tests""
Greg Farnum [Wed, 11 Dec 2013 22:17:25 +0000 (14:17 -0800)]
Revert "Partial revert "mon: osd pool set syntax relaxed, modify unit tests""

This reverts commit e80ab94bf44e102fcd87d16dc11e38ca4c0eeadb.

We accept non-CephInt arguments again, now that we've got the monitors
handling differing APIs intelligently.

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon/OSDMonitor: take 'osd pool set ...' value as a string again
Sage Weil [Wed, 4 Dec 2013 05:39:03 +0000 (21:39 -0800)]
mon/OSDMonitor: take 'osd pool set ...' value as a string again

We ran into problems before when we made this a string because a mixed
cluster of mons might forward a client request with the wrong schema.
To make this work, we make the new code understand both the new and
old schema, and also backport a change to emperor and dumpling to
handle the new schema.

For the previous attempt to do this, see:
 337195f04653eed8e8f153a5b074f3bd48408998
 2fe0d0d97af95c22db80800f5b9da51f672d9407

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agoMerge pull request #925 from ceph/wip-mon-api
Gregory Farnum [Wed, 11 Dec 2013 21:27:03 +0000 (13:27 -0800)]
Merge pull request #925 from ceph/wip-mon-api

Merge in changes to unify the API presented by the monitors and handle changes gracefully.

(Upgrade tests) Tested-by: Tamil Muthamizhan <tamil.muthamizhan@inktank.com>

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoreplace sgdisk subprocess calls with a helper
Alfredo Deza [Wed, 11 Dec 2013 20:41:45 +0000 (15:41 -0500)]
replace sgdisk subprocess calls with a helper

Signed-off-by: Alfredo Deza <alfredo@deza.pe>
11 years agoosd: enable HASHPSPOOL by default 930/head
Sage Weil [Wed, 11 Dec 2013 19:19:37 +0000 (11:19 -0800)]
osd: enable HASHPSPOOL by default

Much like the CRUSH tunables, this first appears in kernel v3.9.

Unlike the CRUSH tunables, it does not appear in Ceph until v0.64
(post cuttlefish, pre dumpling).

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomon: if we're the leader, don't validate command matching 925/head
Greg Farnum [Tue, 10 Dec 2013 19:33:51 +0000 (11:33 -0800)]
mon: if we're the leader, don't validate command matching

Classic-format commands never match our leader command set!

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agomon: by default, warn if some members of the quorum are "classic"
Greg Farnum [Tue, 10 Dec 2013 18:56:33 +0000 (10:56 -0800)]
mon: by default, warn if some members of the quorum are "classic"

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoadd apt-get install pkg-config for ubuntu server 929/head
Kai Zhang [Wed, 11 Dec 2013 00:25:48 +0000 (16:25 -0800)]
add apt-get install pkg-config for ubuntu server

Signed-off-by: Kai Zhang <kaizh.pub@gmail.com>
11 years agoMemStore: update for the new ObjectStore interface
Greg Farnum [Tue, 10 Dec 2013 23:51:39 +0000 (15:51 -0800)]
MemStore: update for the new ObjectStore interface

68fdcfa1cc249af859400a2ce4590fefbb2f525b changed the ObjectStore
interface in the 'next' branch, which was merged into master by
e5a02c33e23e4fbdc7bf0f16a5bbff61f4e37186. Unfortunately the
Memstore (added via the master branch) was not corrected for this
interface change.

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
11 years agoMerge branch 'next'
Gary Lowell [Tue, 10 Dec 2013 21:00:14 +0000 (21:00 +0000)]
Merge branch 'next'

11 years agoMerge pull request #927 from dachary/wip-crush-test
Gregory Farnum [Tue, 10 Dec 2013 20:25:07 +0000 (12:25 -0800)]
Merge pull request #927 from dachary/wip-crush-test

crush: remove crushtool test leftover

11 years agocrush: remove crushtool test leftover 927/head
Loic Dachary [Tue, 10 Dec 2013 19:35:34 +0000 (20:35 +0100)]
crush: remove crushtool test leftover

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #920 from dachary/wip-man
Sage Weil [Tue, 10 Dec 2013 19:10:41 +0000 (11:10 -0800)]
Merge pull request #920 from dachary/wip-man

man: Ceph is also an object store

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoElector: use monitor's encoded command sets instead of our own
Greg Farnum [Tue, 10 Dec 2013 18:23:03 +0000 (10:23 -0800)]
Elector: use monitor's encoded command sets instead of our own

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMerge pull request #865 from ceph/wip-doc-build-cluster
scuttlemonkey [Tue, 10 Dec 2013 18:14:59 +0000 (10:14 -0800)]
Merge pull request #865 from ceph/wip-doc-build-cluster

Wip doc build cluster

11 years agoMonitor: encode and expose mon command sets
Greg Farnum [Tue, 10 Dec 2013 18:06:36 +0000 (10:06 -0800)]
Monitor: encode and expose mon command sets

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoman: update man/ from doc/man/8 920/head
Loic Dachary [Sat, 7 Dec 2013 21:07:38 +0000 (22:07 +0100)]
man: update man/ from doc/man/8

As explained in admin/manpage-howto.txt

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoman: Ceph is also an object store
Loic Dachary [Sat, 7 Dec 2013 20:52:16 +0000 (21:52 +0100)]
man: Ceph is also an object store

Replace

   Ceph distributed file system

with

   Ceph distributed storage system

to help reduce the idea that Ceph is just a file system.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #923 from dachary/wip-crush-test
Sage Weil [Tue, 10 Dec 2013 17:06:31 +0000 (09:06 -0800)]
Merge pull request #923 from dachary/wip-crush-test

CrushTester patches and documentation

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoos/MemStore: do on_apply_sync callback synchronously
Sage Weil [Tue, 10 Dec 2013 16:56:35 +0000 (08:56 -0800)]
os/MemStore: do on_apply_sync callback synchronously

We can easily deadlock if we put this in the Finisher thread behind other
work; do it synchronously!

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agov0.73 v0.73
Gary Lowell [Tue, 10 Dec 2013 04:55:36 +0000 (04:55 +0000)]
v0.73

11 years agoElector: keep a list of classic mons instead of each mon's commands
Greg Farnum [Mon, 9 Dec 2013 23:30:57 +0000 (15:30 -0800)]
Elector: keep a list of classic mons instead of each mon's commands

We aren't actually using the sets, so don't bother keeping them.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agocrush: implement --show-bad-mappings for indep 923/head
Loic Dachary [Mon, 9 Dec 2013 13:35:00 +0000 (14:35 +0100)]
crush: implement --show-bad-mappings for indep

Support the presence of ITEM_NONE device numbers in the indep mapping as
proof of a bad mapping. Implement the associated unit tests.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: add unittest for crushtool --show-bad-mappings
Loic Dachary [Mon, 9 Dec 2013 13:08:14 +0000 (14:08 +0100)]
crush: add unittest for crushtool --show-bad-mappings

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: remove scary message string
Loic Dachary [Sun, 8 Dec 2013 21:39:18 +0000 (22:39 +0100)]
crush: remove scary message string

The string is no longer used and can be removed.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: document the --test mode of operations
Loic Dachary [Sun, 8 Dec 2013 21:03:33 +0000 (22:03 +0100)]
crush: document the --test mode of operations

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMonitor: Elector: share the classic command set if we have a classic mon
Greg Farnum [Mon, 9 Dec 2013 16:44:05 +0000 (08:44 -0800)]
Monitor: Elector: share the classic command set if we have a classic mon

The leader now checks to see if any monitors did not provide their
command set, and if so, shares the list of "classic" commands instead
of his own set. This will prevent users from seeing different commands
(depending on whether they connect to an old or new mon) while
performing upgrades, and will make it really obvious if they forgot
to upgrade one of the monitors!

Signed-off-by: Greg Farnum <greg@inktank.com>
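
The leader's decision can be sketched roughly as below. The types and the function name are hypothetical, and representing "did not provide a command set" as an empty set is an assumption made for the illustration, not something the commit states.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

using CommandSet = std::set<std::string>;

// Picks the set the leader shares with the quorum: the "classic" set if any
// member failed to report its commands, otherwise the leader's own set.
const CommandSet& commands_to_share(const CommandSet& leader_set,
                                    const CommandSet& classic_set,
                                    const std::vector<CommandSet>& reported) {
  assert(!leader_set.empty());
  for (const CommandSet& peer : reported)
    if (peer.empty())                // assumption: empty means "classic" monitor
      return classic_set;
  return leader_set;
}
```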
11 years agoElector: share local command set when deferring
Greg Farnum [Mon, 9 Dec 2013 16:41:54 +0000 (08:41 -0800)]
Elector: share local command set when deferring

We're about to use this at a basic level, to identify when we have
"classic" monitors in-quorum, but could also do something more
sophisticated like a set intersection on the commands.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMonitor: import MonCommands.h from original Dumpling and expose it
Greg Farnum [Mon, 9 Dec 2013 06:17:39 +0000 (22:17 -0800)]
Monitor: import MonCommands.h from original Dumpling and expose it

If the Elector doesn't receive a set of commands from the elected leader, it
assumes the monitor is "classic" and uses the Dumpling command set as
the leader set.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMonitor: validate incoming commands against the leader's set too
Greg Farnum [Sat, 7 Dec 2013 03:08:13 +0000 (19:08 -0800)]
Monitor: validate incoming commands against the leader's set too

Then check against our own, and forward if we don't recognize it
or for some reason don't match.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMonitor: disseminate leader's command set instead of our own
Greg Farnum [Fri, 6 Dec 2013 22:55:13 +0000 (14:55 -0800)]
Monitor: disseminate leader's command set instead of our own

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoElector: transmit local api on election win, accept leader's on loss
Greg Farnum [Fri, 6 Dec 2013 22:08:48 +0000 (14:08 -0800)]
Elector: transmit local api on election win, accept leader's on loss

If we're the leader, just point to our local set. Disseminating these
will let peons advertise the full command set supported by the leader.
INCOMPLETE: does not yet handle winning Electors who do not send a command set.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agomessages: make room for passing supported monitor commands in MMonElection
Greg Farnum [Fri, 6 Dec 2013 21:13:03 +0000 (13:13 -0800)]
messages: make room for passing supported monitor commands in MMonElection

We're going to use this space to let the leader tell everybody what
commands it supports.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMonitor: pull command mapping out of _allowed_command()
Greg Farnum [Sat, 7 Dec 2013 00:09:36 +0000 (16:09 -0800)]
Monitor: pull command mapping out of _allowed_command()

We want to be able to validate commands against both the leader and
local command sets, so make that functionality generic.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMerge pull request #918 from ceph/port/misc
Sage Weil [Mon, 9 Dec 2013 19:16:49 +0000 (11:16 -0800)]
Merge pull request #918 from ceph/port/misc

Misc portability patches

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #922 from dachary/wip-crush-choose-tries
Sage Weil [Mon, 9 Dec 2013 16:28:43 +0000 (08:28 -0800)]
Merge pull request #922 from dachary/wip-crush-choose-tries

crush: fix map->choose_tries boundary test

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agocrush: --show-utilization* implies --show-statistics
Loic Dachary [Sun, 8 Dec 2013 18:45:28 +0000 (19:45 +0100)]
crush: --show-utilization* implies --show-statistics

--show-utilization* outputs only if --show-statistics is set, which is
confusing. Instead of failing, set --show-statistics to avoid the
confusion.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMonitor: add a separate leader_supported_commands
Greg Farnum [Fri, 6 Dec 2013 21:55:38 +0000 (13:55 -0800)]
Monitor: add a separate leader_supported_commands

This isn't used yet, but will be shortly.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMonitor: expose local monitor commands to other compilation units
Greg Farnum [Fri, 6 Dec 2013 21:48:42 +0000 (13:48 -0800)]
Monitor: expose local monitor commands to other compilation units

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMonCommand: add operator== and operator!=
Greg Farnum [Sat, 7 Dec 2013 02:19:32 +0000 (18:19 -0800)]
MonCommand: add operator== and operator!=

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMonCommand: support encode/decode
Greg Farnum [Fri, 6 Dec 2013 21:51:51 +0000 (13:51 -0800)]
MonCommand: support encode/decode

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoencoding: fix [encode|decode]_array_nohead
Greg Farnum [Sat, 7 Dec 2013 02:19:13 +0000 (18:19 -0800)]
encoding: fix [encode|decode]_array_nohead

We want to actually encode each element and keep it, rather than
writing each one at the position after the array end!

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agocrush: add CrushTester accessors
Loic Dachary [Sun, 8 Dec 2013 18:39:16 +0000 (19:39 +0100)]
crush: add CrushTester accessors

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: output --show-bad-mappings on err
Loic Dachary [Sun, 8 Dec 2013 16:57:25 +0000 (17:57 +0100)]
crush: output --show-bad-mappings on err

Write to stderr instead of stdout so that the output displays well when
used in conjunction with --show-statistics

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: fix map->choose_tries boundary test 922/head
Loic Dachary [Sun, 8 Dec 2013 13:38:59 +0000 (14:38 +0100)]
crush: fix map->choose_tries boundary test

CrushWrapper::start_choose_profile allocates map->choose_tries with
choose_total_tries elements. When crush_choose_firstn sets a value, it
tests against map->choose_local_tries which could lead to memory
corruption if map->choose_total_tries is smaller than
map->choose_local_tries.

Another undesirable but non-fatal side effect is that the output of crushtool
--show-choose-tries will be truncated to choose_local_tries which is
set to a lower value than choose_total_tries by the default tunables.

Signed-off-by: Loic Dachary <loic@dachary.org>
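
The boundary problem comes down to checking an index against the size the array was actually allocated with. A minimal sketch, using a hypothetical `TryHistogram` in place of map->choose_tries:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical histogram of retry counts, allocated with total_tries slots.
struct TryHistogram {
  std::vector<unsigned> counts;

  explicit TryHistogram(std::size_t total_tries) : counts(total_tries, 0) {
    assert(total_tries > 0);
  }

  // Records one placement that needed `tries` retries. The bound is the
  // allocation size itself, never an unrelated limit such as local_tries.
  bool record(std::size_t tries) {
    if (tries >= counts.size())
      return false;                  // would write past the end of the array
    ++counts[tries];
    return true;
  }
};
```

Testing against a different limit than the allocation size is what allows an out-of-bounds write when that limit is the larger of the two.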
11 years agoMerge pull request #869 from ceph/wip-crush
Sage Weil [Sun, 8 Dec 2013 04:59:22 +0000 (20:59 -0800)]
Merge pull request #869 from ceph/wip-crush

crush changes for erasure coding

Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agolibrbd: remove unused private variable 918/head
Noah Watkins [Sat, 7 Dec 2013 17:58:43 +0000 (09:58 -0800)]
librbd: remove unused private variable

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agoTrackedOp: remove unused private variable
Noah Watkins [Sat, 7 Dec 2013 17:54:53 +0000 (09:54 -0800)]
TrackedOp: remove unused private variable

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agolibrbd: rename howmany to avoid conflict
Noah Watkins [Sat, 7 Dec 2013 17:59:13 +0000 (09:59 -0800)]
librbd: rename howmany to avoid conflict

A howmany macro exists on some platforms in standard headers, but there
really isn't any sort of standard that I've found. We just avoid the
conflict entirely this way.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agoMerge pull request #917 from ceph/port/compat
Sage Weil [Sat, 7 Dec 2013 22:01:14 +0000 (14:01 -0800)]
Merge pull request #917 from ceph/port/compat

compat: define replacement TEMP_FAILURE_RETRY

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #919 from ceph/port/fdatasync
Sage Weil [Sat, 7 Dec 2013 22:00:40 +0000 (14:00 -0800)]
Merge pull request #919 from ceph/port/fdatasync

wbthrottle: use feature check for fdatasync

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agowbthrottle: use feature check for fdatasync 919/head
Noah Watkins [Sun, 29 Sep 2013 18:32:29 +0000 (11:32 -0700)]
wbthrottle: use feature check for fdatasync

Checking for fdatasync uses the same approach as the qemu configure
script. The relevant commit is d1722a27f552a22561104210e0afad4577878e53.
Here is a copy of the commit message which explains the check:

Under Darwin, a symbol exists for the fdatasync() function, so that our
link test succeeds. However _POSIX_SYNCHRONIZED_IO is set to '-1'.

According to POSIX:2008, a value of -1 means the feature is not
supported.
A value of 0 means supported at compilation time, and a value greater than 0
means supported at both compilation and run time.

Enable fdatasync() only if _POSIX_SYNCHRONIZED_IO is '>0'.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
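
The rule quoted above can also be expressed as a preprocessor test. The commit itself performs the check at configure time; the sketch below applies the same condition directly in code, and `sync_data` and `write_durably` are hypothetical helpers.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Flushes file data, using fdatasync() only when POSIX reports it as usable.
int sync_data(int fd) {
#if defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
  return fdatasync(fd);
#else
  return fsync(fd);                  // e.g. where the macro is -1 or undefined
#endif
}

// Writes `data` to `path` and flushes it. Returns 0 on success, -1 on error.
// Short writes are treated as errors to keep the sketch small.
int write_durably(const char* path, const std::string& data) {
  assert(path != nullptr);
  const int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
  if (fd < 0)
    return -1;
  const ssize_t n = write(fd, data.data(), data.size());
  int rc = (n == static_cast<ssize_t>(data.size())) ? sync_data(fd) : -1;
  if (close(fd) != 0)
    rc = -1;
  return rc;
}
```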
11 years agorados_sync: fix mismatched tag warning
Noah Watkins [Sat, 7 Dec 2013 17:59:39 +0000 (09:59 -0800)]
rados_sync: fix mismatched tag warning

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agorados_sync: remove unused private variable
Noah Watkins [Sat, 7 Dec 2013 18:01:30 +0000 (10:01 -0800)]
rados_sync: remove unused private variable

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agomon: check for sys/vfs.h existence
Noah Watkins [Fri, 27 Sep 2013 14:38:11 +0000 (07:38 -0700)]
mon: check for sys/vfs.h existence

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agomake: increase maximum template recursion depth
Noah Watkins [Tue, 29 Oct 2013 15:54:01 +0000 (08:54 -0700)]
make: increase maximum template recursion depth

With clang on OSX spirit blows up without this.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years agocompat: define replacement TEMP_FAILURE_RETRY 917/head
Noah Watkins [Sun, 22 Sep 2013 18:02:34 +0000 (11:02 -0700)]
compat: define replacement TEMP_FAILURE_RETRY

Not all platforms have it.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
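
A replacement of this kind might look roughly like the following. It is a sketch, not the definition added by the commit: it relies on the GNU statement-expression extension (accepted by GCC and Clang), and `flaky_call` and `call_with_retry` are hypothetical helpers that simulate an interrupted call.

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

#ifndef TEMP_FAILURE_RETRY
// Re-evaluates the expression while it fails with EINTR.
#define TEMP_FAILURE_RETRY(expression) ({              \
  long retry_result_;                                  \
  do {                                                 \
    retry_result_ = (long)(expression);                \
  } while (retry_result_ == -1 && errno == EINTR);     \
  retry_result_; })
#endif

// Hypothetical call that is "interrupted" a set number of times, then succeeds.
static int pending_interrupts = 0;

long flaky_call() {
  if (pending_interrupts > 0) {
    --pending_interrupts;
    errno = EINTR;
    return -1;
  }
  return 7;
}

long call_with_retry(int interrupts) {
  assert(interrupts >= 0);
  pending_interrupts = interrupts;
  return TEMP_FAILURE_RETRY(flaky_call());
}
```

The `#ifndef` guard leaves the platform's own definition in place where one exists.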
11 years agoMerge remote-tracking branch 'gh/wip-fix-3x'
Sage Weil [Sat, 7 Dec 2013 00:56:10 +0000 (16:56 -0800)]
Merge remote-tracking branch 'gh/wip-fix-3x'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoMerge remote-tracking branch 'gh/wip-fix-tunables'
Sage Weil [Sat, 7 Dec 2013 00:55:54 +0000 (16:55 -0800)]
Merge remote-tracking branch 'gh/wip-fix-tunables'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agocrush/CrushCompiler: make current set of tunables 'safe'
Sage Weil [Sat, 7 Dec 2013 00:03:21 +0000 (16:03 -0800)]
crush/CrushCompiler: make current set of tunables 'safe'

We can reenable this error the next time we add new tunables.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrushtool: remove scary tunables messages
Sage Weil [Sat, 7 Dec 2013 00:20:23 +0000 (16:20 -0800)]
crushtool: remove scary tunables messages

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush/CrushCompiler: start with legacy tunables when compiling
Sage Weil [Sat, 7 Dec 2013 00:18:04 +0000 (16:18 -0800)]
crush/CrushCompiler: start with legacy tunables when compiling

Ensure that a crush file is always compiled deterministically, even though
the default values for *new* maps have changed.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agocrush: add indep data set to cli tests 869/head
Sage Weil [Sat, 7 Dec 2013 00:04:55 +0000 (16:04 -0800)]
crush: add indep data set to cli tests

This will help us catch things if we break the mapping.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdmaptool: fix cli tests for 3x
Sage Weil [Sat, 7 Dec 2013 00:13:50 +0000 (16:13 -0800)]
osdmaptool: fix cli tests for 3x

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: default to 3x replication
Sage Weil [Fri, 6 Dec 2013 18:35:45 +0000 (10:35 -0800)]
osd: default to 3x replication

3x is the recommendation; it should be the default too.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #913 from dachary/wip-crush-unittest
Sage Weil [Sat, 7 Dec 2013 00:10:00 +0000 (16:10 -0800)]
Merge pull request #913 from dachary/wip-crush-unittest

CrushWrapper::move_bucket unittest and minor fixes

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoobjecter: don't take extra throttle budget for resent ops 914/head
Josh Durgin [Sat, 7 Dec 2013 00:03:20 +0000 (16:03 -0800)]
objecter: don't take extra throttle budget for resent ops

These ops have already taken their budget in the original op_submit().
It will be returned via put_op_budget() when they complete.
If there were many localized reads of missing objects from replicas,
or cache pool redirects, this would cause the objecter to use up all
of its op throttle budget and hang.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
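
The accounting problem can be modelled with a toy throttle. Everything here is hypothetical (`Op`, `OpThrottle`, returning false where a real client would block); it only illustrates taking budget once per op and returning it on completion.

```cpp
#include <cassert>

// Hypothetical in-flight operation.
struct Op {
  int budget = 0;
  bool budgeted = false;
};

class OpThrottle {
 public:
  explicit OpThrottle(int max) : max_(max) {}

  // Takes budget on the first submission only; a resend reuses it.
  bool submit(Op& op, int cost) {
    if (op.budgeted)
      return true;                   // resent op: budget was already taken
    if (in_flight_ + cost > max_)
      return false;                  // a real client would wait here
    in_flight_ += cost;
    op.budget = cost;
    op.budgeted = true;
    return true;
  }

  // Returns the op's budget when it completes.
  void complete(Op& op) {
    assert(op.budgeted);
    in_flight_ -= op.budget;
    op.budget = 0;
    op.budgeted = false;
  }

  int in_flight() const { return in_flight_; }

 private:
  int max_;
  int in_flight_ = 0;
};
```

If each resend took fresh budget while completion returned it only once, repeated resends would eventually exhaust the throttle, which matches the hang described in the message.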
11 years agoRevert "osd: default to 3x replication"
Sage Weil [Fri, 6 Dec 2013 23:48:39 +0000 (15:48 -0800)]
Revert "osd: default to 3x replication"

This reverts commit cb26fbde52f31b449af60acce3ced34e593d6e1e.

Fix unit tests and do integration tests first; this may have unexpected
consequences.

11 years agocrush: detach_bucket must test item >= 0 not > 0 913/head
Loic Dachary [Fri, 6 Dec 2013 23:31:54 +0000 (00:31 +0100)]
crush: detach_bucket must test item >= 0 not > 0

Since detach_bucket is a private helper solely used by move_bucket which
contains another ( correct ) safeguard, the code cannot be reached and
the problem can never happen. If another function uses detach_bucket,
it may happen.

Signed-off-by: Loic Dachary <loic@dachary.org>
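
The off-by-one matters because CRUSH device ids start at 0 while bucket ids are negative. A minimal sketch of the guard, with hypothetical function names:

```cpp
#include <cassert>
#include <cerrno>

// Array index for a bucket id: bucket -1 is slot 0, bucket -2 is slot 1, ...
int bucket_index(int id) {
  assert(id < 0);
  return -1 - id;
}

// Returns -EINVAL for anything that is not a bucket, else the bucket's index.
int detach_bucket_index(int item) {
  if (item >= 0)                     // `> 0` would let device 0 slip through
    return -EINVAL;
  return bucket_index(item);
}
```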
11 years agocrush: remove obsolete comments from link_bucket
Loic Dachary [Fri, 6 Dec 2013 23:27:09 +0000 (00:27 +0100)]
crush: remove obsolete comments from link_bucket

Probably copy/pasted from move_bucket.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: remove redundant code from move_bucket
Loic Dachary [Fri, 6 Dec 2013 23:21:16 +0000 (00:21 +0100)]
crush: remove redundant code from move_bucket

The following was introduced in 2012 by a2d0cff1b071bed84ac439e4fcf9ddfb936f89c8

  // un-set the device name so we can use add_item later
  build_rmap(name_map, name_rmap);
  name_map.erase(id);
  name_rmap.erase(id_name);

when insert_item refused to move a bucket for which a name already
exists. It was changed in 2013 by
4e2557a038dc1e8c68993ad8571d74e2eb8ea90a and now supports it. The
TestCrushWrapper unittest for move_bucket passes.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agocrush: unittest CrushWrapper::move_bucket
Loic Dachary [Fri, 6 Dec 2013 23:19:50 +0000 (00:19 +0100)]
crush: unittest CrushWrapper::move_bucket

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #888 from ceph/wip-crush-tunables
Sage Weil [Fri, 6 Dec 2013 22:45:57 +0000 (14:45 -0800)]
Merge pull request #888 from ceph/wip-crush-tunables

default to bobtail-era crush tunables.

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>