git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log


commit | commitdiff | tree

Sage Weil [Wed, 28 Aug 2013 16:50:11 +0000 (09:50 -0700)]

mon: discover mon addrs, names during election state too

Currently we only detect new mon addrs and names during the probing phase.
For non-trivial clusters, this means we can get into a sticky spot when
we discover enough peers to form a quorum, but not all of them, and the
undiscovered ones are enough to break the mon ranks and prevent an
election.

One way to work around this is to continue addr and name discovery during
the election. We should also consider making the ranks less sensitive to
the undefined addrs; that is a separate change.

Fixes: #4924
Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Tested-by: Bernhard Glomm <bernhard.glomm@ecologic.eu>

commit | commitdiff | tree

athanatos [Tue, 27 Aug 2013 17:56:49 +0000 (10:56 -0700)]

Merge pull request #545 from dachary/wip-6117

SharedPtrRegistry: get_next must not delete while holding the lock

Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

John Wilkins [Tue, 27 Aug 2013 17:25:50 +0000 (10:25 -0700)]

doc: Updated to accurately reflect that upstart applies to a single node.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>

commit | commitdiff | tree

Gary Lowell [Tue, 27 Aug 2013 16:53:12 +0000 (09:53 -0700)]

ceph.spec.in: radosgw package doesn't require mod_fcgi

Fixes #5702

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>

commit | commitdiff | tree

Sage Weil [Tue, 27 Aug 2013 15:30:50 +0000 (08:30 -0700)]

librbd: fix debug print in aio_write

Reported-by: James Harper <james.harper@bendigoit.com.au>
Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Roald J. van Loon [Tue, 27 Aug 2013 15:17:19 +0000 (08:17 -0700)]

cleanup: removed last references to g_conf from auth

Trivial cleanup. There were still 3 references to g_conf in CephxKeyServer.
Replaced them in favor of cct->_conf.

Signed-off-by: Roald J. van Loon <roaldvanloon@gmail.com>

commit | commitdiff | tree

Loic Dachary [Tue, 27 Aug 2013 14:09:17 +0000 (16:09 +0200)]

SharedPtrRegistry: get_next must not delete while holding the lock

    bool get_next(const K &key, pair<K, VPtr> *next)

may indirectly delete the object pointed to by next->second when
doing:

    *next = make_pair(i->first, next_val);

and it will deadlock (EDEADLK) when

    void operator()(V *to_remove) {
      {
        Mutex::Locker l(parent->lock);

tries to acquire the lock because it is already held. The
Mutex::Locker is isolated in a block and the *next* parameter is set
outside of the block.

A test case demonstrating the problem is added to test_sharedptr_registry.cc

http://tracker.ceph.com/issues/6117 fixes #6117

Signed-off-by: Loic Dachary <loic@dachary.org>
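
The locking pattern described above can be sketched roughly as follows. This is a simplified, hypothetical registry (the names Registry, lock_ and contents_ are assumptions, and std::mutex stands in for Ceph's Mutex); it is not the actual SharedPtrRegistry code. The point it illustrates is that the caller's *next is assigned only after the lock has been released, so a deleter that takes the same lock can run safely.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical, simplified registry; not the Ceph implementation.
template <class K, class V>
class Registry {
 public:
  using VPtr = std::shared_ptr<V>;

  VPtr lookup_or_create(const K& key) {
    std::lock_guard<std::mutex> l(lock_);
    auto i = contents_.find(key);
    if (i != contents_.end()) {
      if (VPtr existing = i->second.lock()) return existing;
    }
    // The deleter re-acquires lock_ to unregister the key, so it must
    // never run while this thread already holds lock_.
    VPtr created(new V(), [this, key](V* to_remove) {
      {
        std::lock_guard<std::mutex> l(lock_);
        contents_.erase(key);
      }
      delete to_remove;
    });
    contents_[key] = created;
    return created;
  }

  // Finds the first live entry whose key is greater than `key`.
  bool get_next(const K& key, std::pair<K, VPtr>* next) {
    std::pair<K, VPtr> found;
    {
      std::lock_guard<std::mutex> l(lock_);
      auto i = contents_.upper_bound(key);
      while (i != contents_.end()) {
        found.second = i->second.lock();
        if (found.second) break;
        ++i;
      }
      if (i == contents_.end()) return false;
      found.first = i->first;
    }
    // This assignment may drop the last reference to the old next->second
    // and run its deleter; lock_ has already been released at this point.
    *next = std::move(found);
    return true;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> l(lock_);
    return contents_.size();
  }

 private:
  std::mutex lock_;
  std::map<K, std::weak_ptr<V>> contents_;
};
```

If the assignment were moved inside the inner block, the deleter would try to lock a mutex that the same thread already holds.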

commit | commitdiff | tree

Loic Dachary [Tue, 27 Aug 2013 11:58:33 +0000 (13:58 +0200)]

common: move SharedPtrRegistry test after t.join

The thread created to test SharedPtrRegistry race conditions updates a
value ( ptr ) that is tested by the main gtest thread but is not
protected by a lock. Instead of adding a lock, the main thread tests
the value after pthread_join() on the child thread.

http://tracker.ceph.com/issues/6130 fixes #6130

Signed-off-by: Loic Dachary <loic@dachary.org>
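
A minimal sketch of the same idea, using std::thread instead of pthreads. The Worker type and run_and_collect helper are hypothetical names for illustration only; the actual test is in test_sharedptr_registry.cc. The unlocked value is read only after join(), which orders the child's write before the parent's read.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for the test's worker: it writes its result
// without taking any lock.
struct Worker {
  int* ptr = nullptr;
  int value = 42;
  void run() { ptr = &value; }
};

// Starts the worker and reads its result only after join() has returned,
// so no lock is needed around `ptr`.
int* run_and_collect(Worker& w) {
  std::thread t(&Worker::run, &w);
  t.join();
  return w.ptr;
}
```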

commit | commitdiff | tree

Sage Weil [Tue, 27 Aug 2013 01:11:32 +0000 (18:11 -0700)]

Merge remote-tracking branch 'gh/next'

commit | commitdiff | tree

Sage Weil [Sat, 24 Aug 2013 21:04:09 +0000 (14:04 -0700)]

osd: install admin socket commands after signals

This lets us tell by the presence of the admin socket commands whether
a signal will make us shut down cleanly. See #5924.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

Sage Weil [Mon, 26 Aug 2013 20:19:27 +0000 (13:19 -0700)]

mon/DataHealthService: preserve compat of data stats dump

See 96621bdb004e539a0186fb592f44d51cf49f1c31.

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Mon, 26 Aug 2013 20:17:20 +0000 (13:17 -0700)]

Merge pull request #526 from ceph/wip-5909

mon: Early warning system for monitor stores growing over predefined threshold

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Mon, 26 Aug 2013 17:42:34 +0000 (10:42 -0700)]

Merge pull request #540 from ceph/wip-doc-update

List packages needed for RPM-based distros

commit | commitdiff | tree

Samuel Just [Thu, 22 Aug 2013 18:19:52 +0000 (11:19 -0700)]

WBThrottle: use fdatasync instead of fsync

Backport: dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Samuel Just [Thu, 22 Aug 2013 18:19:37 +0000 (11:19 -0700)]

FileStore: add config option to disable the wbthrottle

Backport: dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Alfredo Deza [Mon, 26 Aug 2013 16:48:56 +0000 (12:48 -0400)]

fix nss lib name

Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>

commit | commitdiff | tree

Alfredo Deza [Mon, 26 Aug 2013 16:05:00 +0000 (12:05 -0400)]

update the README with required RPM packages

Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>

commit | commitdiff | tree

Josh Durgin [Mon, 26 Aug 2013 00:12:42 +0000 (17:12 -0700)]

Merge branch 'sleinen'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

commit | commitdiff | tree

Simon Leinen [Sun, 4 Aug 2013 14:34:52 +0000 (14:34 +0000)]

Improve warning message when there are unfound objects, but probing
hasn't finished yet.

Signed-off-by: Simon Leinen <simon.leinen@switch.ch>

commit | commitdiff | tree

Sage Weil [Sat, 24 Aug 2013 21:12:44 +0000 (14:12 -0700)]

Merge remote-tracking branch 'gh/next'

commit | commitdiff | tree

Joao Eduardo Luis [Thu, 22 Aug 2013 15:08:22 +0000 (16:08 +0100)]

mon: DataHealthService: monitor backing store's size and report it

If the store's size grows beyond what we believe to be reasonable, we must
let the user know that something fishy may be going on. This is intended to
act as an early warning system for monitors suffering from leveldb
compaction issues. However, if the monitor's store is just growing a lot
due to normal cluster behaviour, we made sure that the warning threshold
is adjustable by tuning 'mon_leveldb_size_warn' (defaulting to 40GB).

Fixes: #5909
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
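
The threshold check described above might look roughly like this. It is a sketch under assumptions: the function and struct names are hypothetical, the 40GB default is taken as 40 GiB, and a strict "greater than" comparison is assumed; the commit message does not specify these details.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr uint64_t kGiB = 1024ULL * 1024ULL * 1024ULL;

// Hypothetical result type: whether to warn, plus a short message.
struct StoreHealth {
  bool warn;
  std::string detail;
};

// Compares the store size against a tunable threshold. The default
// mirrors the 40GB default mentioned for 'mon_leveldb_size_warn'.
StoreHealth check_store_size(uint64_t store_bytes,
                             uint64_t warn_bytes = 40 * kGiB) {
  if (store_bytes > warn_bytes) {
    return {true, "store is getting too big: " +
                      std::to_string(store_bytes) + " bytes"};
  }
  return {false, ""};
}
```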

commit | commitdiff | tree

Joao Eduardo Luis [Thu, 22 Aug 2013 15:05:17 +0000 (16:05 +0100)]

mon: mon_types: DataStats: add 'dump(Formatter*)' method

... and use it in DataHealthService.cc, instead of building our own
version of the class's formatted output.

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>

commit | commitdiff | tree

Joao Eduardo Luis [Thu, 22 Aug 2013 14:57:05 +0000 (15:57 +0100)]

mon: MonitorDBStore: rely on backing store to provide estimated store size

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>

commit | commitdiff | tree

Joao Eduardo Luis [Thu, 22 Aug 2013 15:17:12 +0000 (16:17 +0100)]

test: ceph_test_store_tool: output estimated store size on 'get-size'

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>

commit | commitdiff | tree

Sage Weil [Sat, 24 Aug 2013 04:21:57 +0000 (21:21 -0700)]

Merge pull request #514 from kri5/wip-clang-compilation

Do not use some compilation flag invalid for clang

Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Sat, 24 Aug 2013 04:18:44 +0000 (21:18 -0700)]

Merge pull request #522 from kri5/master

vstart.sh: Allow to run multiple cluster instances.

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 22:30:41 +0000 (15:30 -0700)]

Merge pull request #531 from dmick/wip-6099

ceph_rest_api.py: create own default for log_file

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 22:21:41 +0000 (15:21 -0700)]

rados-config: do not load ceph.conf

Fixes: #2901
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 22:11:49 +0000 (15:11 -0700)]

osd/ReplicatedPG: require write payload match length

Hopefully this won't break old clients; I can't think of any. We *should*
be picky about our requests.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
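
As an illustration only, a strict length check of this kind could be sketched as below. WriteOp and check_write are hypothetical names, and returning -EINVAL is an assumption; the commit message does not say which error the OSD reports.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <string>

// Hypothetical, reduced view of a write request header.
struct WriteOp {
  uint64_t offset;
  uint64_t length;
};

// Rejects a write whose declared length differs from the payload size.
// The error code is an assumption made for this sketch.
int check_write(const WriteOp& op, const std::string& payload) {
  if (op.length != payload.size()) return -EINVAL;
  return 0;
}
```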

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 22:02:00 +0000 (15:02 -0700)]

osd/ReplicatedPG: verify we have enough data for WRITE and WRITEFULL

Fixes: #2207
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

Dan Mick [Fri, 23 Aug 2013 00:30:24 +0000 (17:30 -0700)]

ceph_rest_api.py: create own default for log_file

common/config thinks the default log_file for non-daemons should be "".
Override that so that the default is
/var/log/ceph/{cluster}-{name}.{pid}.log
since ceph-rest-api is more of a daemon than a client.

Fixes: #6099
Backport: dumpling
Signed-off-by: Dan Mick <dan.mick@inktank.com>
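
For illustration, the default path pattern above expands as in this small sketch. The helper name is hypothetical and the code is C++ rather than the Python of ceph_rest_api.py; it only shows how the {cluster}, {name} and {pid} placeholders combine.

```cpp
#include <cassert>
#include <string>

// Builds /var/log/ceph/{cluster}-{name}.{pid}.log from its parts.
// Hypothetical helper, not part of ceph_rest_api.py.
std::string default_log_file(const std::string& cluster,
                             const std::string& name, long pid) {
  return "/var/log/ceph/" + cluster + "-" + name + "." +
         std::to_string(pid) + ".log";
}
```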

commit | commitdiff | tree

Samuel Just [Fri, 23 Aug 2013 21:50:42 +0000 (14:50 -0700)]

ReplicatedPG: mark stats invalid when marking unfound lost

Fixes: #3660
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Samuel Just [Fri, 23 Aug 2013 21:50:20 +0000 (14:50 -0700)]

ReplicatedPG: make watch timeout configurable

Fixes: #2354
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 21:56:46 +0000 (14:56 -0700)]

osd/OSDCap: allow . for unquoted strings

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 21:56:37 +0000 (14:56 -0700)]

mon/MonCap: allow . in unquoted string

Fixes: #5967
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 21:56:12 +0000 (14:56 -0700)]

librados: make safe and complete callback arguments separate

Fixes: #2914
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

David Disseldorp [Mon, 29 Jul 2013 15:05:44 +0000 (17:05 +0200)]

mds: remove waiting lock before merging with neighbours

CephFS currently deadlocks under CTDB's ping_pong POSIX locking test
when run concurrently on multiple nodes.
The deadlock is caused by failed removal of a waiting_locks entry when
the waiting lock is merged with an existing lock, e.g:

Initial MDS state (two clients, same file):
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
      start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Waiting lock entry 4116@1:1 fires:
handle_client_file_setlock: start: 1, length: 1,
    client: 4116, pid: 7899, type: 2

MDS state after lock is obtained:
held_locks -- start: 0, length: 2, client: 4116, pid: 7899, type: 2
      start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Note that the waiting 4116@1:1 lock entry is merged with the existing
4116@0:1 held lock to become a 4116@0:2 held lock. However, the now
handled 4116@1:1 waiting_locks entry remains.

When handling a lock request, the MDS calls adjust_locks() to merge
the new lock with available neighbours. If the new lock is merged,
then the waiting_locks entry is not located in the subsequent
remove_waiting() call because adjust_locks changed the new lock to
include the old locks.
This fix ensures that the waiting_locks entry is removed prior to
modification during merge.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Greg Farnum <greg@inktank.com>
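
The ordering fix can be sketched as follows, assuming a much-reduced lock model. Lock, LockState and the method names here are hypothetical, and details such as lock types, pids and zero-length locks are ignored. The waiting entry is removed while the lock still has its original range, and only then is the lock widened by merging.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, reduced lock record (no pid or type).
struct Lock {
  uint64_t start;
  uint64_t length;
  int client;
  bool operator==(const Lock& o) const {
    return start == o.start && length == o.length && client == o.client;
  }
};

struct LockState {
  std::vector<Lock> held;
  std::vector<Lock> waiting;

  // Looks the entry up by its exact range, so it must be called
  // before the lock is widened by a merge.
  bool remove_waiting(const Lock& l) {
    auto i = std::find(waiting.begin(), waiting.end(), l);
    if (i == waiting.end()) return false;
    waiting.erase(i);
    return true;
  }

  // Absorbs adjacent or overlapping held locks of the same client
  // into `l`, widening `l` in place.
  void merge_neighbours(Lock& l) {
    for (auto i = held.begin(); i != held.end();) {
      bool touches = i->start <= l.start + l.length &&
                     l.start <= i->start + i->length;
      if (i->client == l.client && touches) {
        uint64_t end = std::max(i->start + i->length, l.start + l.length);
        l.start = std::min(i->start, l.start);
        l.length = end - l.start;
        i = held.erase(i);
      } else {
        ++i;
      }
    }
  }

  // Grants a lock that was waiting: remove first, merge second.
  void grant(Lock l) {
    remove_waiting(l);
    merge_neighbours(l);
    held.push_back(l);
  }
};
```

With the two calls in grant() swapped, remove_waiting() would search for the widened 0:2 range and leave the 1:1 entry behind, which is the stale state shown in the commit message.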

commit | commitdiff | tree

John Wilkins [Fri, 23 Aug 2013 20:43:44 +0000 (13:43 -0700)]

doc: Fixed broken link by adding Transitioning to ceph-deploy to this doc.

fixes: 6107

Signed-off-by: John Wilkins <john.wilkins@inktank.com>

commit | commitdiff | tree

Yehuda Sadeh [Fri, 23 Aug 2013 20:16:16 +0000 (13:16 -0700)]

Merge pull request #495 from kri5/wip-5820

rgw: rgw-admin throw an error when invalid flag is passed

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 19:45:06 +0000 (12:45 -0700)]

Merge pull request #533 from ceph/wip-osd-healthy-tuanble

osd: add 'osd heartbeat min healthy ratio' tunable

Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

Yehuda Sadeh [Fri, 23 Aug 2013 19:00:30 +0000 (12:00 -0700)]

Merge pull request #535 from ceph/wip-readdir-r-sucks

Fix readdir_r invocation

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 18:45:35 +0000 (11:45 -0700)]

os: make readdir_r buffers larger

PATH_MAX isn't quite big enough.

Backport: dumpling, cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 18:45:08 +0000 (11:45 -0700)]

os: fix readdir_r buffer size

The buffer needs to be big or else we'll walk all over the stack.

Backport: dumpling, cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Joao Eduardo Luis [Thu, 22 Aug 2013 15:17:02 +0000 (16:17 +0100)]

os: KeyValueDB: expose interface to obtain estimated store size

On LevelDBStore, instead of using leveldb's GetApproximateSizes() function,
we will assess the store's raw size from the contents of
the store dir (this means .sst's, .log's, etc). The reason behind this
approach is that GetApproximateSizes() would expect us to provide a range
of keys for which to obtain an approximate size; on the other hand, what we
really want is to obtain the size of the store -- not the size of the
data (besides, with the compaction issues we've been seeing, we wonder
how reliable such approximation would be).

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>

commit | commitdiff | tree

Sage Weil [Thu, 22 Aug 2013 22:54:48 +0000 (15:54 -0700)]

mon/Paxos: fix another uncommitted value corner case

It is possible that we begin the paxos recovery with an uncommitted
value for, say, commit 100.  During last/collect we discover 100 has been
committed already.  But also, another node provides an uncommitted value
for 101 with the same pn.  Currently, we refuse to learn it, because the
pn is not strictly greater than our current uncommitted pn... even though it is
the next last_committed+1 value that we need.

There are two possible fixes here:

- make this a >= as we can accept newer values from the same pn.
- discard our uncommitted value metadata when we commit the value.

Let's do both!

Fixes: #6090
Signed-off-by: Sage Weil <sage@inktank.com>
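
Both fixes can be sketched with a reduced state model. PaxosState, should_learn and commit are hypothetical names for this illustration and do not mirror the real Paxos class; the sketch only shows the >= comparison and the clearing of uncommitted metadata on commit.

```cpp
#include <cassert>
#include <cstdint>

using version_t = uint64_t;

// Hypothetical, reduced recovery state.
struct PaxosState {
  version_t last_committed = 0;
  version_t uncommitted_v = 0;   // 0 means "no uncommitted value"
  version_t uncommitted_pn = 0;
};

// Fix 1: accept the next value when its pn is >= (not strictly >)
// the pn of the uncommitted value we already know about.
bool should_learn(const PaxosState& s, version_t v, version_t pn) {
  return v == s.last_committed + 1 && pn >= s.uncommitted_pn;
}

// Fix 2: committing a value discards its uncommitted metadata.
void commit(PaxosState& s, version_t v) {
  s.last_committed = v;
  if (s.uncommitted_v == v) {
    s.uncommitted_v = 0;
    s.uncommitted_pn = 0;
  }
}
```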

commit | commitdiff | tree

Yehuda Sadeh [Mon, 19 Aug 2013 23:56:27 +0000 (16:56 -0700)]

rgw: bucket meta remove don't overwrite entry point first

Fixes: #6056
When removing a bucket metadata entry we first unlink the bucket
and then we remove the bucket entrypoint object. Originally
when unlinking the bucket we first overwrote the bucket entrypoint
entry marking it as 'unlinked'. However, this is not really needed
as we're just about to remove it. The original version triggered
a bug, as we needed to propagate the new header version first (which
we didn't do, so the subsequent bucket removal failed).

Reviewed-by: Greg Farnum <greg@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>

commit | commitdiff | tree

Alfredo Deza [Fri, 23 Aug 2013 12:56:07 +0000 (08:56 -0400)]

ceph-disk: specify the filetype when mounting

Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 15:12:46 +0000 (08:12 -0700)]

doc/release-notes: v0.67.2

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Yehuda Sadeh [Fri, 23 Aug 2013 14:17:39 +0000 (07:17 -0700)]

Merge pull request #528 from kri5/wip-radosgw-admin-help

rgw: Adds --system option help to radosgw-admin

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>

commit | commitdiff | tree

Christophe Courtaut [Thu, 22 Aug 2013 15:54:08 +0000 (17:54 +0200)]

rgw: Adds --system option help to radosgw-admin

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 04:44:31 +0000 (21:44 -0700)]

osd: add 'osd heartbeat min healthy ratio' tunable

This was hard-coded to 1/3; make it tunable.

Signed-off-by: Sage Weil <sage@inktank.com>
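
A sketch of what making the ratio tunable amounts to. The function name, the handling of zero peers, and the exact comparison are assumptions made for illustration; only the former hard-coded value of 1/3 comes from the commit message.

```cpp
#include <cassert>

// Returns true when the fraction of healthy heartbeat peers is at least
// min_ratio. The default keeps the previously hard-coded 1/3.
bool has_enough_healthy_peers(int healthy, int total,
                              double min_ratio = 1.0 / 3.0) {
  if (total <= 0) return true;  // assumption: no peers means nothing to fail
  return static_cast<double>(healthy) / total >= min_ratio;
}
```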

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 04:34:57 +0000 (21:34 -0700)]

Merge pull request #532 from dmick/next

PGMonitor: pg dump_stuck should respect --format (plain works fine)

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sandon Van Ness [Fri, 23 Aug 2013 02:44:40 +0000 (19:44 -0700)]

QA: Compile fsstress if missing on machine.

Some distros lack ltp-kernel packages, and all we need is
fsstress. This modifies the shell script to download and compile
fsstress from source and copy it to the right location if it doesn't
currently exist where it is expected. It is a very small, quick
compile, and currently only SLES and Debian do not have it already.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sandon Van Ness <sandon@inktank.com>

commit | commitdiff | tree

Sandon Van Ness [Fri, 23 Aug 2013 02:44:40 +0000 (19:44 -0700)]

QA: Compile fsstress if missing on machine.

Some distros lack ltp-kernel packages, and all we need is
fsstress. This modifies the shell script to download and compile
fsstress from source and copy it to the right location if it doesn't
currently exist where it is expected. It is a very small, quick
compile, and currently only SLES and Debian do not have it already.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sandon Van Ness <sandon@inktank.com>

commit | commitdiff | tree

Dan Mick [Fri, 23 Aug 2013 01:53:13 +0000 (18:53 -0700)]

PGMonitor: pg dump_stuck should respect --format (plain works fine)

Signed-off-by: Dan Mick <dan.mick@inktank.com>

commit | commitdiff | tree

Sage Weil [Sat, 20 Jul 2013 16:02:40 +0000 (09:02 -0700)]

init-ceph: behave if incompletely installed

e.g., Debian 'removed, config remains' state

Fixes: #5695
Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 23 Aug 2013 00:23:09 +0000 (17:23 -0700)]

Merge remote-tracking branch 'gh/next'

commit | commitdiff | tree

Sage Weil [Thu, 22 Aug 2013 21:20:57 +0000 (14:20 -0700)]

yasm-wrapper: more futzing to behave on fedora 19

Some new arguments, and behave (return success) when the touch target isn't
specified.

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Yehuda Sadeh [Thu, 22 Aug 2013 17:53:12 +0000 (10:53 -0700)]

rgw: fix crash when creating new zone on init

Moving the watch/notify init before the zone init,
as we might need to send a notification.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>

commit | commitdiff | tree

Gary Lowell [Thu, 22 Aug 2013 20:29:32 +0000 (13:29 -0700)]

ceph.spec.in: remove trailing paren in previous commit

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>

commit | commitdiff | tree

Gary Lowell [Thu, 22 Aug 2013 18:07:16 +0000 (11:07 -0700)]

ceph.spec.in: Don't invoke debug_package macro on centos.

If the redhat-rpm-config package is installed, the debuginfo rpms will
be built by default. The build will fail when the package is installed
and the specfile also invokes the macro.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>

commit | commitdiff | tree

athanatos [Thu, 22 Aug 2013 17:24:52 +0000 (10:24 -0700)]

Merge pull request #414 from dachary/wip-5510

replace ObjectContext pointers with shared_ptr

Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

Sage Weil [Thu, 22 Aug 2013 16:17:16 +0000 (09:17 -0700)]

Merge pull request #527 from ceph/wip-mon-fix-verbose-output

mon: remove lingering debug output

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Thu, 22 Aug 2013 16:16:19 +0000 (09:16 -0700)]

Merge pull request #520 from ceph/wip-crc

This is better, faster intel optimized code.

Reviewed-by: Yehuda Sadeh <yehuda.sadeh@inktank.com>

commit | commitdiff | tree

Sage Weil [Wed, 21 Aug 2013 05:01:22 +0000 (22:01 -0700)]

Makefile: move all crc code into libcrc.la

This is simpler.

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Wed, 21 Aug 2013 04:56:34 +0000 (21:56 -0700)]

crc32c: add intel optimized crc32c implementation

This is from Intel's ISA-L library and licensed under BSD 3-clause.

It needs to build with yasm, which means we go through all sorts of pain
to make this work with libtool:

- strip out args it doesn't understand with yasm-wrapper
- detect whether it is recent enough during configure

The code is conditional on:

- build-time support (yasm)
- run-time support (sse4.2)

Signed-off-by: Sage Weil <sage@inktank.com>
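
The two conditions above suggest a dispatch of roughly this shape. This is a sketch, not the Ceph code: the function names are hypothetical, the table-driven routine merely stands in for the yasm-built Intel implementation, and the checksum uses the common CRC-32C convention (initial value and final XOR of 0xFFFFFFFF), which may differ from the seed-based interface Ceph exposes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using crc32c_fn = uint32_t (*)(const unsigned char*, std::size_t);

// Portable bit-at-a-time CRC-32C (reflected polynomial 0x82F63B78).
uint32_t crc32c_bitwise(const unsigned char* data, std::size_t len) {
  uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return ~crc;
}

// Table-driven variant; here it only stands in for the optimized path.
uint32_t crc32c_table(const unsigned char* data, std::size_t len) {
  static const std::array<uint32_t, 256> table = [] {
    std::array<uint32_t, 256> t{};
    for (uint32_t n = 0; n < 256; ++n) {
      uint32_t c = n;
      for (int bit = 0; bit < 8; ++bit)
        c = (c >> 1) ^ (0x82F63B78u & (0u - (c & 1u)));
      t[n] = c;
    }
    return t;
  }();
  uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; ++i)
    crc = table[(crc ^ data[i]) & 0xFFu] ^ (crc >> 8);
  return ~crc;
}

// Picks the fast path only when it was built in and the CPU supports it.
crc32c_fn choose_crc32c(bool built_with_fast_path, bool cpu_has_sse42) {
  return (built_with_fast_path && cpu_has_sse42) ? crc32c_table
                                                 : crc32c_bitwise;
}
```

Selecting the function once and calling through the pointer keeps the run-time probe out of the hot path.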

commit | commitdiff | tree

Sage Weil [Wed, 21 Aug 2013 04:51:16 +0000 (21:51 -0700)]

arch: add cpu probing

For now, just a check to see if we have SSE4.2.

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Tue, 20 Aug 2013 23:45:24 +0000 (16:45 -0700)]

yasm-wrapper: hide libtool insanity from yasm

libtool passes all kinds of crap to yasm that yasm does not understand.
Hide it with this ugly wrapper. Sigh.

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Thu, 22 Aug 2013 16:01:20 +0000 (09:01 -0700)]

Merge pull request #529 from dachary/master

doc: fix erasure code formatting warnings and errors

commit | commitdiff | tree

Joao Eduardo Luis [Thu, 22 Aug 2013 15:44:41 +0000 (16:44 +0100)]

mon: Monitor: remove lingering debug message from f087d84b

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>

commit | commitdiff | tree

Loic Dachary [Thu, 22 Aug 2013 15:45:39 +0000 (17:45 +0200)]

doc: fix erasure code formatting warnings and errors

http://tracker.ceph.com/issues/4929 refs #4929

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Sage Weil [Thu, 22 Aug 2013 15:34:03 +0000 (08:34 -0700)]

Merge pull request #525 from ksperis/rbdmap.init-fix

init-rbdmap: fix error on stop rbdmap

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Thu, 22 Aug 2013 15:17:56 +0000 (08:17 -0700)]

mon/Paxos: ignore do_refresh() return value

Makes coverity happy.

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Alexandre Oliva [Thu, 22 Aug 2013 06:40:22 +0000 (03:40 -0300)]

enable mds rejoin with active inodes' old parent xattrs

When the parent xattrs of active inodes that the mds attempts to open
during rejoin lack pool info (struct_v < 5), this field will be filled
in with -1, causing the mds to retry fetching a backtrace with a pool
number that matches the expected value, which fails and causes the
err==-ENOENT branch to be taken and retry pool 1, which succeeds, but
with pool -1, and so keeps on bouncing between the two retry cases
forever.

This patch arranges for the mds to go along with pool -1 instead of
insisting that it be refetched, enabling it to complete recovery
instead of eating cpu, network bandwidth and metadata osd's resources
like there's no tomorrow, in what AFAICT is an infinite and very busy
loop.

This is not a new problem: I've had it even before upgrading from
Cuttlefish to Dumpling, I'd just never managed to track it down, and
force-unmounting the filesystem and then restarting the mds was an
easier (if inconvenient) work-around, particularly because it always
hit when the filesystem was under active, heavy-ish use (or there
wouldn't be much reason for caps recovery ;-)

There are two issues not addressed in this patch, however.  One is
that nothing seems to proactively update the parent xattr when it is
found to be outdated, so it remains out of date forever.  Not even
renaming top-level directories causes the xattrs to be recursively
rewritten.  AFAICT that's a bug.

The other is that inodes that don't have a parent xattr (created by
even older versions of ceph) are reported as non-existing in the mds
rejoin message, because the absence of the parent xattr is signaled as
a missing inode ("failed to reconnect caps for missing inodes").  I
suppose this may cause more serious recovery problems.

I suppose a global pass over the filesystem tree updating parent
xattrs that are out-of-date would be desirable, if we find any parent
xattrs still lacking current information; it might make sense to
activate it as a background thread from the backtrace decoding
function, when it finds a parent xattr that's too out-of-date, or as a
separate client (ceph-fsck?).

Backport: dumpling, cuttlefish
Signed-off-by: Alexandre Oliva <oliva@gnu.org>
Reviewed-by: Zheng, Yan <zheng.z.yan@intel.com>

commit | commitdiff | tree

Laurent Barbe [Thu, 22 Aug 2013 10:12:49 +0000 (12:12 +0200)]

init-rbdmap: fix error on stop rbdmap

Avoid an error on stop service if many /dev/rbd* exist.

Signed-off-by: Laurent Barbe <laurent@ksperis.com>

commit | commitdiff | tree

Sage Weil [Wed, 21 Aug 2013 05:44:43 +0000 (22:44 -0700)]

ceph-monstore-tool: shut up coverity

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Yan, Zheng [Wed, 21 Aug 2013 05:26:50 +0000 (13:26 +0800)]

store: fix issues reported by coverity

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Loic Dachary [Tue, 13 Aug 2013 15:28:31 +0000 (17:28 +0200)]

ReplicatedPG: create ObjectContext with SharedPtrRegistry

All new ObjectContext are replaced with calls to
SharedPtrRegistry::lookup_or_create to ensure that they are all
registered. Because the constructor is invoked with no argument, care
is taken to always initialize the destructor_callback data member
immediately afterwards.

ReplicatedPG::get_object_context contains a redundant call to
get_snapset_context that is removed.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Loic Dachary [Tue, 13 Aug 2013 15:02:40 +0000 (17:02 +0200)]

ReplicatedPG: replace object_contexts.find with object_contexts.lookup

The std::map equivalent of find is SharedPtrRegistry::lookup

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Loic Dachary [Tue, 13 Aug 2013 14:52:18 +0000 (16:52 +0200)]

ReplicatedPG: add Context to cleanup the PG after an ObjectContext deletion

ReplicatedPG::C_PG_ObjectContext is added to encapsulate a
call to ReplicatedPG::object_context_destructor_callback method
which is responsible for

* manually de-allocating the SnapSetContext of the ObjectContext if
any. It will eventually be managed by a SharedPtrRegistry.

ReplicatedPG::C_PG_ObjectContext must be added to the destructor_callback
member of ObjectContext immediately after it is created.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Loic Dachary [Tue, 13 Aug 2013 14:40:06 +0000 (16:40 +0200)]

ReplicatedPG: replace map iterators with SharedPtrRegistry::get_next

SharedPtrRegistry does not provide an iterator equivalent to

    map<hobject_t, ObjectContext*>::iterator i

It is replaced with a thread safe get_next method roughly used
as follows:

    pair<hobject_t, ObjectContextRef> i;
    while (object_contexts.get_next(i.first, &i))

All occurrences of the iterator are replaced with get_next style
traversal.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Loic Dachary [Tue, 13 Aug 2013 14:13:19 +0000 (16:13 +0200)]

ReplicatedPG: remove lookup_object_context method

Both ReplicatedPG::lookup_object_context and
ReplicatedPG::_lookup_object_context methods are provided by
SharedPtrRegistry.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Loic Dachary [Mon, 12 Aug 2013 16:19:06 +0000 (18:19 +0200)]

ReplicatedPG: remove reference counting logic

ObjectContext manual reference counting and managing the
object_contexts object involves calls to

* obc->ref++ and obc->get()
* put_object_context and put_object_contexts
* register_object_context
* assertions on obc->registered

They are all removed because SharedPtrRegistry provides the
same service.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Loic Dachary [Mon, 12 Aug 2013 15:45:44 +0000 (17:45 +0200)]

ReplicatedPG: ObjectContext * becomes ObjectContextRef

The map of hobject_t to ObjectContext is made a
SharedPtrRegistry owned by ReplicatedPG

- map<hobject_t, ObjectContext*> object_contexts;
+ SharedPtrRegistry<hobject_t, ObjectContext> object_contexts;

All ObjectContext pointers are changed into ObjectContextRef, i.e.
shared_ptr.

In Watch.h std::tr1::shared_ptr<ObjectContext> is used instead
of ObjectContextRef because Watch.h is included before it is
defined.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Loic Dachary [Mon, 12 Aug 2013 14:47:42 +0000 (16:47 +0200)]

ReplicatedPG: ObjectContext is made compatible with SharedPtrRegistry

When creating a new object SharedPtrRegistry::lookup_or_create uses
the default ObjectContext constructor with no argument. The existing
ObjectContext constructor is modified to have no argument and the
initialization that was previously done within the constructor is done
by the caller (that only happens three times).

The ObjectContext::get method is removed: its only purpose is to
increment the ref.

The ObjectContext::registered data member is removed as well as all
the associated assert()

The ObjectContext::destructor_callback data member Context is added
and called by the destructor. It will allow the caller to perform
additional cleanup, if necessary.

All ObjectContext * data members are replaced with shared_ptr.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
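
The destructor callback idea can be sketched as below. ObjectContextSketch and make_object_context are hypothetical names, and std::function stands in for the Context type used by the real code. The object is default-constructed, as a registry would construct it, and the callback is attached immediately afterwards.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for ObjectContext: default-constructible, with a
// cleanup hook that runs when the last shared_ptr reference goes away.
struct ObjectContextSketch {
  std::function<void()> destructor_callback;
  ~ObjectContextSketch() {
    if (destructor_callback) destructor_callback();
  }
};

// Creates the object with no arguments, then sets the callback right
// away, mirroring the "initialize immediately afterwards" rule.
std::shared_ptr<ObjectContextSketch> make_object_context(
    std::function<void()> cleanup) {
  auto obc = std::make_shared<ObjectContextSketch>();
  obc->destructor_callback = std::move(cleanup);
  return obc;
}
```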

commit | commitdiff | tree

Loic Dachary [Thu, 15 Aug 2013 18:15:03 +0000 (20:15 +0200)]

ReplicatedPG: add Mutex to protect snapset_contexts

snapset_contexts_locks is added and locked in each function where
snapset_contexts or the SnapSetContext::ref data member needs to be
accessed or modified.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Loic Dachary [Thu, 15 Aug 2013 17:42:13 +0000 (19:42 +0200)]

PG: remove unused PG::_cond

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Loic Dachary [Mon, 12 Aug 2013 12:05:38 +0000 (14:05 +0200)]

sharedptr_registry: add a variant of get_next() and the empty() method

The SharedPtrRegistry::get_next() method with a value of type VPtr
instead of V is added because it is sometimes more convenient to not
copy the value when walking the registry. The
SharedPtrRegistry::empty() predicate method is added.

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Josh Durgin [Wed, 21 Aug 2013 23:29:29 +0000 (16:29 -0700)]

Merge branch 'next'

commit | commitdiff | tree

Josh Durgin [Wed, 21 Aug 2013 22:56:20 +0000 (15:56 -0700)]

objecter: fix keys of dump_linger_ops

The registering flag no longer exists, and registered was using the
wrong property due to a copy-paste error.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>

commit | commitdiff | tree

Josh Durgin [Wed, 21 Aug 2013 21:28:49 +0000 (14:28 -0700)]

objecter: resend unfinished lingers when osdmap is no longer paused

Plain Ops that haven't finished yet need to be resent if the osdmap
transitions from full or paused to unpaused. If these Ops are
triggered by LingerOps, they will be cancelled instead (since
should_resend = false), but the LingerOps that triggered them will not
be resent.

Fix this by checking the registered flag for all linger ops, and
resending any of them that aren't paused anymore.

Fixes: #6070
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>

commit | commitdiff | tree

Yehuda Sadeh [Mon, 19 Aug 2013 15:40:16 +0000 (08:40 -0700)]

rgw: change cache / watch-notify init sequence

Fixes: #6046
We were initializing the watch-notify (through the cache
init) before reading the zone info which was much too
early, as we didn't have the control pool name yet. Now
simplifying init/cleanup a bit, cache doesn't call watch/notify
init and cleanup directly, but rather states its need
through a virtual callback.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

John Wilkins [Wed, 21 Aug 2013 18:02:26 +0000 (11:02 -0700)]

Merge branch 'master' of https://github.com/ceph/ceph

commit | commitdiff | tree

John Wilkins [Wed, 21 Aug 2013 18:01:48 +0000 (11:01 -0700)]

doc: Clarified quorum requirements.

fixes: #5412

Signed-off-by: John Wilkins <john.wilkins@inktank.com>

commit | commitdiff | tree

Sage Weil [Wed, 21 Aug 2013 18:00:45 +0000 (11:00 -0700)]

Merge pull request #524 from ceph/wip-mon-delta

mon: add 'pg dump delta' to get just the rate info

Reviewed-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

John Wilkins [Wed, 21 Aug 2013 17:56:23 +0000 (10:56 -0700)]

doc: Fixed typo.

fixes: #5968

Signed-off-by: John Wilkins <john.wilkins@inktank.com>

commit | commitdiff | tree

Sage Weil [Wed, 21 Aug 2013 17:36:54 +0000 (10:36 -0700)]

Merge pull request #523 from dachary/master

doc: fix erasure code formatting warnings and errors

commit | commitdiff | tree

Loic Dachary [Wed, 21 Aug 2013 16:09:03 +0000 (18:09 +0200)]

doc: fix erasure code formatting warnings and errors

http://tracker.ceph.com/issues/4929 refs #4929

Signed-off-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Sage Weil [Tue, 20 Aug 2013 23:44:49 +0000 (16:44 -0700)]

build-depend on yasm

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Wed, 21 Aug 2013 04:00:14 +0000 (21:00 -0700)]

crc32c: note intel crc code copyrights

It's a BSD 3-clause.

Signed-off-by: Sage Weil <sage@inktank.com>
