git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
12 years agotest/librados/cmd.cc: tolerate thrashing on pg_command tests 542/head
Sage Weil [Mon, 26 Aug 2013 19:52:44 +0000 (12:52 -0700)]
test/librados/cmd.cc: tolerate thrashing on pg_command tests

We may get ENXIO (osd down) or ENOENT (pg dne (yet) on the target osd) if
there is thrashing going on.

Fixes: #6122
Signed-off-by: Sage Weil <sage@inktank.com>
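
The tolerance described in this commit can be sketched as a small predicate. This is only an illustration, not the code in test/librados/cmd.cc: the helper name `pg_command_result_tolerable` is hypothetical, and it assumes the call reports failures as negative errno values.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper (not from the Ceph tree): decide whether the return
// code of a pg command is acceptable while OSDs are being thrashed.
inline bool pg_command_result_tolerable(int r) {
  return r == 0         // the command succeeded
      || r == -ENXIO    // the target OSD is down
      || r == -ENOENT;  // the PG does not exist (yet) on the target OSD
}
```

A test would then assert on this predicate instead of asserting that the return code is exactly 0.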
12 years agorados-config: do not load ceph.conf
Sage Weil [Fri, 23 Aug 2013 22:21:41 +0000 (15:21 -0700)]
rados-config: do not load ceph.conf

Fixes: #2901
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoosd/ReplicatedPG: require write payload match length
Sage Weil [Fri, 23 Aug 2013 22:11:49 +0000 (15:11 -0700)]
osd/ReplicatedPG: require write payload match length

Hopefully this won't break old clients; I can't think of any.  We *should*
be picky about our requests.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
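
A minimal sketch of the kind of check this commit describes, assuming the op header carries a declared length and the payload arrives as a separate buffer. The function name and the choice of -EINVAL are assumptions for illustration, not taken from ReplicatedPG.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <string>

// Hypothetical check: the op declares `declared_len` bytes and the payload
// actually carries `payload.size()` bytes. Reject any mismatch.
inline int check_write_payload(uint64_t declared_len,
                               const std::string& payload) {
  if (declared_len != payload.size())
    return -EINVAL;  // assumed error code for a malformed request
  return 0;
}
```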
12 years agoosd/ReplicatedPG: verify we have enough data for WRITE and WRITEFULL
Sage Weil [Fri, 23 Aug 2013 22:02:00 +0000 (15:02 -0700)]
osd/ReplicatedPG: verify we have enough data for WRITE and WRITEFULL

Fixes: #2207
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: mark stats invalid when marking unfound lost
Samuel Just [Fri, 23 Aug 2013 21:50:42 +0000 (14:50 -0700)]
ReplicatedPG: mark stats invalid when marking unfound lost

Fixes: #3660
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoReplicatedPG: make watch timeout configurable
Samuel Just [Fri, 23 Aug 2013 21:50:20 +0000 (14:50 -0700)]
ReplicatedPG: make watch timeout configurable

Fixes: #2354
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoosd/OSDCap: allow . for unquoted strings
Sage Weil [Fri, 23 Aug 2013 21:56:46 +0000 (14:56 -0700)]
osd/OSDCap: allow . for unquoted strings

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agomon/MonCap: allow . in unquoted string
Sage Weil [Fri, 23 Aug 2013 21:56:37 +0000 (14:56 -0700)]
mon/MonCap: allow . in unquoted string

Fixes: #5967
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
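
To illustrate what allowing '.' in an unquoted string means, here is a rough sketch of a character-class check. The exact character set accepted by the MonCap and OSDCap grammars is an assumption here (alphanumerics, '_', '-' and '.'), and both function names are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Assumed character class for an unquoted word; '.' is the newly allowed one.
inline bool is_unquoted_char(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) ||
         c == '_' || c == '-' || c == '.';
}

// True if `s` could be written without quotes under the assumed rules.
inline bool is_valid_unquoted(const std::string& s) {
  if (s.empty())
    return false;
  for (char c : s) {
    if (!is_unquoted_char(c))
      return false;
  }
  return true;
}
```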
12 years agolibrados: make safe and complete callback arguments separate
Sage Weil [Fri, 23 Aug 2013 21:56:12 +0000 (14:56 -0700)]
librados: make safe and complete callback arguments separate

Fixes: #2914
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agomds: remove waiting lock before merging with neighbours
David Disseldorp [Mon, 29 Jul 2013 15:05:44 +0000 (17:05 +0200)]
mds: remove waiting lock before merging with neighbours

CephFS currently deadlocks under CTDB's ping_pong POSIX locking test
when run concurrently on multiple nodes.
The deadlock is caused by failed removal of a waiting_locks entry when
the waiting lock is merged with an existing lock, e.g.:

Initial MDS state (two clients, same file):
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
      start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Waiting lock entry 4116@1:1 fires:
handle_client_file_setlock: start: 1, length: 1,
    client: 4116, pid: 7899, type: 2

MDS state after lock is obtained:
held_locks -- start: 0, length: 2, client: 4116, pid: 7899, type: 2
      start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Note that the waiting 4116@1:1 lock entry is merged with the existing
4116@0:1 held lock to become a 4116@0:2 held lock. However, the now
handled 4116@1:1 waiting_locks entry remains.

When handling a lock request, the MDS calls adjust_locks() to merge
the new lock with available neighbours. If the new lock is merged,
then the waiting_locks entry is not located in the subsequent
remove_waiting() call because adjust_locks changed the new lock to
include the old locks.
This fix ensures that the waiting_locks entry is removed prior to
modification during merge.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Greg Farnum <greg@inktank.com>
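
The ordering issue described above can be reproduced with a toy model. The types and method names below (`Lock`, `ToyLockState`, `merge_neighbours`) are hypothetical stand-ins, not the MDS data structures; the point is only that the waiting entry has to be looked up with the original range, before the merge widens the lock.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

struct Lock {
  uint64_t start;
  uint64_t length;
  int64_t client;
};

struct ToyLockState {
  std::list<Lock> held;
  std::list<Lock> waiting;

  // Remove a waiting entry that exactly matches `l`; true if one was found.
  bool remove_waiting(const Lock& l) {
    for (auto it = waiting.begin(); it != waiting.end(); ++it) {
      if (it->start == l.start && it->length == l.length &&
          it->client == l.client) {
        waiting.erase(it);
        return true;
      }
    }
    return false;
  }

  // Merge `l` with adjacent held locks of the same client, widening `l`.
  void merge_neighbours(Lock& l) {
    for (auto it = held.begin(); it != held.end();) {
      bool left = it->start + it->length == l.start;
      bool right = l.start + l.length == it->start;
      if (it->client == l.client && (left || right)) {
        if (left)
          l.start = it->start;
        l.length += it->length;
        it = held.erase(it);
      } else {
        ++it;
      }
    }
  }

  void add_lock(Lock l) {
    remove_waiting(l);    // uses the original range, so the entry is found
    merge_neighbours(l);  // may change l.start and l.length
    held.push_back(l);
  }
};
```

Swapping the first two calls in `add_lock` reproduces the bug: after the merge the lock is 0:2, which no longer matches the 1:1 waiting entry.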
12 years agodoc: Fixed broken link by adding Transitioning to ceph-deploy to this doc.
John Wilkins [Fri, 23 Aug 2013 20:43:44 +0000 (13:43 -0700)]
doc: Fixed broken link by adding Transitioning to ceph-deploy to this doc.

fixes: 6107

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge pull request #495 from kri5/wip-5820
Yehuda Sadeh [Fri, 23 Aug 2013 20:16:16 +0000 (13:16 -0700)]
Merge pull request #495 from kri5/wip-5820

rgw: rgw-admin throw an error when invalid flag is passed

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoMerge pull request #533 from ceph/wip-osd-healthy-tuanble
Sage Weil [Fri, 23 Aug 2013 19:45:06 +0000 (12:45 -0700)]
Merge pull request #533 from ceph/wip-osd-healthy-tuanble

osd: add 'osd heartbeat min healthy ratio' tunable

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agodoc/release-notes: v0.67.2
Sage Weil [Fri, 23 Aug 2013 15:12:46 +0000 (08:12 -0700)]
doc/release-notes: v0.67.2

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #528 from kri5/wip-radosgw-admin-help
Yehuda Sadeh [Fri, 23 Aug 2013 14:17:39 +0000 (07:17 -0700)]
Merge pull request #528 from kri5/wip-radosgw-admin-help

rgw: Adds --system option help to radosgw-admin

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw: Adds --system option help to radosgw-admin 528/head
Christophe Courtaut [Thu, 22 Aug 2013 15:54:08 +0000 (17:54 +0200)]
rgw: Adds --system option help to radosgw-admin

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
12 years agoosd: add 'osd heartbeat min healthy ratio' tunable 533/head
Sage Weil [Fri, 23 Aug 2013 04:44:31 +0000 (21:44 -0700)]
osd: add 'osd heartbeat min healthy ratio' tunable

This was hard-coded to 1/3; make it tunable.

Signed-off-by: Sage Weil <sage@inktank.com>
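
As a rough illustration of turning the hard-coded 1/3 into a parameter, the sketch below compares the fraction of healthy heartbeat peers against a configurable ratio. The function name, its arguments and the treatment of the zero-peer case are assumptions, not the OSD's actual health check.

```cpp
#include <cassert>

// Hypothetical check: are at least `min_ratio` of our heartbeat peers healthy?
// The default mirrors the previously hard-coded 1/3.
inline bool enough_healthy_peers(int healthy, int total,
                                 double min_ratio = 1.0 / 3.0) {
  if (total == 0)
    return true;  // assumption: with no peers there is nothing to compare
  return healthy >= total * min_ratio;
}
```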
12 years agoQA: Compile fsstress if missing on machine.
Sandon Van Ness [Fri, 23 Aug 2013 02:44:40 +0000 (19:44 -0700)]
QA: Compile fsstress if missing on machine.

Some distros lack ltp-kernel packages, and all we need is
fsstress. This modifies the shell script to download and compile
fsstress from source and copy it to the right location if it doesn't
currently exist where it is expected. It is a very small, quick
compile, and currently only SLES and Debian do not have it already.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
12 years agoinit-ceph: behave if incompletely installed
Sage Weil [Sat, 20 Jul 2013 16:02:40 +0000 (09:02 -0700)]
init-ceph: behave if incompletely installed

e.g., Debian 'removed, config remains' state

Fixes: #5695
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Fri, 23 Aug 2013 00:23:09 +0000 (17:23 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agoyasm-wrapper: more futzing to behave on fedora 19
Sage Weil [Thu, 22 Aug 2013 21:20:57 +0000 (14:20 -0700)]
yasm-wrapper: more futzing to behave on fedora 19

Some new arguments, and behave (return success) when the touch target isn't
specified.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agorgw: fix crash when creating new zone on init
Yehuda Sadeh [Thu, 22 Aug 2013 17:53:12 +0000 (10:53 -0700)]
rgw: fix crash when creating new zone on init

Moving the watch/notify init before the zone init,
as we might need to send a notification.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoceph.spec.in: remove trailing paren in previous commit
Gary Lowell [Thu, 22 Aug 2013 20:29:32 +0000 (13:29 -0700)]
ceph.spec.in:  remove trailing paren in previous commit

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agoceph.spec.in: Don't invoke debug_package macro on centos.
Gary Lowell [Thu, 22 Aug 2013 18:07:16 +0000 (11:07 -0700)]
ceph.spec.in:  Don't invoke debug_package macro on centos.

If the redhat-rpm-config package is installed, the debuginfo rpms will
be built by default.   The build will fail when the package is installed
and the specfile also invokes the macro.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agoMerge pull request #414 from dachary/wip-5510
athanatos [Thu, 22 Aug 2013 17:24:52 +0000 (10:24 -0700)]
Merge pull request #414 from dachary/wip-5510

replace ObjectContext pointers with shared_ptr

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge pull request #527 from ceph/wip-mon-fix-verbose-output
Sage Weil [Thu, 22 Aug 2013 16:17:16 +0000 (09:17 -0700)]
Merge pull request #527 from ceph/wip-mon-fix-verbose-output

mon: remove lingering debug output

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #520 from ceph/wip-crc
Sage Weil [Thu, 22 Aug 2013 16:16:19 +0000 (09:16 -0700)]
Merge pull request #520 from ceph/wip-crc

This is better, faster intel optimized code.

Reviewed-by: Yehuda Sadeh <yehuda.sadeh@inktank.com>
12 years agoMakefile: move all crc code into libcrc.la 520/head
Sage Weil [Wed, 21 Aug 2013 05:01:22 +0000 (22:01 -0700)]
Makefile: move all crc code into libcrc.la

This is simpler.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocrc32c: add intel optimized crc32c implementation
Sage Weil [Wed, 21 Aug 2013 04:56:34 +0000 (21:56 -0700)]
crc32c: add intel optimized crc32c implementation

This is from Intel's ISA-L library and licensed under BSD 3-clause.

It needs to build with yasm, which means we go through all sorts of pain
to make this work with libtool:

 - strip out args it doesn't understand with yasm-wrapper
 - detect whether it is recent enough during configure

The code is conditional on:

 - build-time support (yasm)
 - run-time support (sse4.2)

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoarch: add cpu probing
Sage Weil [Wed, 21 Aug 2013 04:51:16 +0000 (21:51 -0700)]
arch: add cpu probing

For now, just a check to see if we have SSE4.2.

Signed-off-by: Sage Weil <sage@inktank.com>
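
A run-time SSE4.2 probe can be sketched with a compiler builtin. This swaps in GCC/Clang's `__builtin_cpu_supports` rather than whatever probing the arch code actually performs, and the function names are hypothetical; on other compilers or architectures the sketch simply reports no support.

```cpp
#include <cassert>

// Ask the CPU (via the compiler builtin) whether SSE4.2 is available.
inline bool probe_sse42() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("sse4.2") != 0;
#else
  return false;  // unknown platform: assume the feature is absent
#endif
}

// Probe once and cache the answer for later callers.
inline bool have_sse42() {
  static const bool cached = probe_sse42();
  return cached;
}
```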
12 years agoyasm-wrapper: hide libtool insanity from yasm
Sage Weil [Tue, 20 Aug 2013 23:45:24 +0000 (16:45 -0700)]
yasm-wrapper: hide libtool insanity from yasm

libtool passes all kinds of crap to yasm that yasm does not understand.
Hide it with this ugly wrapper.  Sigh.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #529 from dachary/master
Sage Weil [Thu, 22 Aug 2013 16:01:20 +0000 (09:01 -0700)]
Merge pull request #529 from dachary/master

doc: fix erasure code formatting warnings and errors

12 years agomon: Monitor: remove lingering debug message from f087d84b 527/head
Joao Eduardo Luis [Thu, 22 Aug 2013 15:44:41 +0000 (16:44 +0100)]
mon: Monitor: remove lingering debug message from f087d84b

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
12 years agodoc: fix erasure code formatting warnings and errors 529/head
Loic Dachary [Thu, 22 Aug 2013 15:45:39 +0000 (17:45 +0200)]
doc: fix erasure code formatting warnings and errors

http://tracker.ceph.com/issues/4929 refs #4929

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoMerge pull request #525 from ksperis/rbdmap.init-fix
Sage Weil [Thu, 22 Aug 2013 15:34:03 +0000 (08:34 -0700)]
Merge pull request #525 from ksperis/rbdmap.init-fix

init-rbdmap: fix error on stop rbdmap

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomon/Paxos: ignore do_refresh() return value
Sage Weil [Thu, 22 Aug 2013 15:17:56 +0000 (08:17 -0700)]
mon/Paxos: ignore do_refresh() return value

Makes coverity happy.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoenable mds rejoin with active inodes' old parent xattrs
Alexandre Oliva [Thu, 22 Aug 2013 06:40:22 +0000 (03:40 -0300)]
enable mds rejoin with active inodes' old parent xattrs

When the parent xattrs of active inodes that the mds attempts to open
during rejoin lack pool info (struct_v < 5), this field will be filled
in with -1, causing the mds to retry fetching a backtrace with a pool
number that matches the expected value, which fails and causes the
err==-ENOENT branch to be taken and retry pool 1, which succeeds, but
with pool -1, and so keeps on bouncing between the two retry cases
forever.

This patch arranges for the mds to go along with pool -1 instead of
insisting that it be refetched, enabling it to complete recovery
instead of eating cpu, network bandwidth and metadata osd's resources
like there's no tomorrow, in what AFAICT is an infinite and very busy
loop.

This is not a new problem: I've had it even before upgrading from
Cuttlefish to Dumpling, I'd just never managed to track it down, and
force-unmounting the filesystem and then restarting the mds was an
easier (if inconvenient) work-around, particularly because it always
hit when the filesystem was under active, heavy-ish use (or there
wouldn't be much reason for caps recovery ;-)

There are two issues not addressed in this patch, however.  One is
that nothing seems to proactively update the parent xattr when it is
found to be outdated, so it remains out of date forever.  Not even
renaming top-level directories causes the xattrs to be recursively
rewritten.  AFAICT that's a bug.

The other is that inodes that don't have a parent xattr (created by
even older versions of ceph) are reported as non-existing in the mds
rejoin message, because the absence of the parent xattr is signaled as
a missing inode ("failed to reconnect caps for missing inodes").  I
suppose this may cause more serious recovery problems.

I suppose a global pass over the filesystem tree updating parent
xattrs that are out-of-date would be desirable, if we find any parent
xattrs still lacking current information; it might make sense to
activate it as a background thread from the backtrace decoding
function, when it finds a parent xattr that's too out-of-date, or as a
separate client (ceph-fsck?).

Backport: dumpling, cuttlefish
Signed-off-by: Alexandre Oliva <oliva@gnu.org>
Reviewed-by: Zheng, Yan <zheng.z.yan@intel.com>
12 years agoinit-rbdmap: fix error on stop rbdmap 525/head
Laurent Barbe [Thu, 22 Aug 2013 10:12:49 +0000 (12:12 +0200)]
init-rbdmap: fix error on stop rbdmap

Avoid an error on stop service if many /dev/rbd* exist.

Signed-off-by: Laurent Barbe <laurent@ksperis.com>
12 years agoceph-monstore-tool: shut up coverity
Sage Weil [Wed, 21 Aug 2013 05:44:43 +0000 (22:44 -0700)]
ceph-monstore-tool: shut up coverity

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agostore: fix issues reported by coverity
Yan, Zheng [Wed, 21 Aug 2013 05:26:50 +0000 (13:26 +0800)]
store: fix issues reported by coverity

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoReplicatedPG: create ObjectContext with SharedPtrRegistry 414/head
Loic Dachary [Tue, 13 Aug 2013 15:28:31 +0000 (17:28 +0200)]
ReplicatedPG: create ObjectContext with SharedPtrRegistry

All uses of new ObjectContext are replaced with calls to
SharedPtrRegistry::lookup_or_create to ensure that they are all
registered. Because the constructor is invoked with no arguments, care
is taken to always initialize the destructor_callback data member
immediately afterwards.

ReplicatedPG::get_object_context contains a redundant call to
get_snapset_context that is removed.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
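
For readers unfamiliar with the pattern, here is a much simplified registry built on std::weak_ptr. It is a sketch under stated assumptions, not Ceph's SharedPtrRegistry: `ToyRegistry` and `ToyContext` are hypothetical names, and expired entries are left in the map instead of being removed by a custom deleter.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>

template <class K, class V>
class ToyRegistry {
 public:
  typedef std::shared_ptr<V> VPtr;

  // Return the live object for `key`, or an empty pointer.
  VPtr lookup(const K& key) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = contents_.find(key);
    if (it == contents_.end())
      return VPtr();
    return it->second.lock();
  }

  // Return the live object for `key`, default-constructing it if needed.
  VPtr lookup_or_create(const K& key) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = contents_.find(key);
    if (it != contents_.end()) {
      if (VPtr existing = it->second.lock())
        return existing;
    }
    VPtr created = std::make_shared<V>();  // no-argument constructor
    contents_[key] = created;
    return created;
  }

 private:
  std::mutex lock_;
  std::map<K, std::weak_ptr<V>> contents_;
};

struct ToyContext {
  int value = 0;  // the caller initializes fields after creation
};
```

Because creation goes through the default constructor, any per-object setup has to happen right after `lookup_or_create` returns, which is the care the commit message refers to.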
12 years agoReplicatedPG: replace object_contexts.find with object_contexts.lookup
Loic Dachary [Tue, 13 Aug 2013 15:02:40 +0000 (17:02 +0200)]
ReplicatedPG: replace object_contexts.find with object_contexts.lookup

SharedPtrRegistry::lookup is the equivalent of std::map::find.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoReplicatedPG: add Context to cleanup the PG after an ObjectContext deletion
Loic Dachary [Tue, 13 Aug 2013 14:52:18 +0000 (16:52 +0200)]
ReplicatedPG: add Context to cleanup the PG after an ObjectContext deletion

ReplicatedPG::C_PG_ObjectContext is added to encapsulate a
call to ReplicatedPG::object_context_destructor_callback method
which is responsible for

  * manually de-allocating the SnapSetContext of the ObjectContext if
    any. It will eventually be managed by a SharedPtrRegistry.

ReplicatedPG::C_PG_ObjectContext must be added to the destructor_callback
member of ObjectContext immediately after it is created.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoReplicatedPG: replace map iterators with SharedPtrRegistry::get_next
Loic Dachary [Tue, 13 Aug 2013 14:40:06 +0000 (16:40 +0200)]
ReplicatedPG: replace map iterators with SharedPtrRegistry::get_next

SharedPtrRegistry does not provide an iterator equivalent to

    map<hobject_t, ObjectContext*>::iterator i

It is replaced with a thread safe get_next method roughly used
as follows:

    pair<hobject_t, ObjectContextRef> i;
    while (object_contexts.get_next(i.first, &i))

All occurrences of the iterator are replaced with get_next style
traversal.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
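
The get_next style of traversal can be sketched on a plain std::map guarded by a mutex. `ToyOrderedRegistry` is a hypothetical stand-in, and the assumption that get_next returns the first entry strictly after the given key is part of the sketch rather than a statement about the real class.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <utility>

template <class K, class V>
class ToyOrderedRegistry {
 public:
  void insert(const K& k, const V& v) {
    std::lock_guard<std::mutex> l(lock_);
    contents_[k] = v;
  }

  // Copy the first entry whose key is strictly greater than `key` into
  // *next. The lock is held only for this single step, so no iterator
  // escapes to the caller.
  bool get_next(const K& key, std::pair<K, V>* next) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = contents_.upper_bound(key);
    if (it == contents_.end())
      return false;
    *next = *it;
    return true;
  }

 private:
  std::mutex lock_;
  std::map<K, V> contents_;
};
```

One consequence of this design is that a traversal seeded with a default-constructed key never visits an entry stored under exactly that key.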
12 years agoReplicatedPG: remove lookup_object_context method
Loic Dachary [Tue, 13 Aug 2013 14:13:19 +0000 (16:13 +0200)]
ReplicatedPG: remove lookup_object_context method

Both ReplicatedPG::lookup_object_context and
ReplicatedPG::_lookup_object_context methods are provided by
SharedPtrRegistry.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoReplicatedPG: remove reference counting logic
Loic Dachary [Mon, 12 Aug 2013 16:19:06 +0000 (18:19 +0200)]
ReplicatedPG: remove reference counting logic

ObjectContext manual reference counting and managing the
object_contexts object involves calls to

* obc->ref++ and obc->get()
* put_object_context and put_object_contexts
* register_object_context
* assertions on obc->registered

They are all removed because SharedPtrRegistry provides the
same service.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoReplicatedPG: ObjectContext * becomes ObjectContextRef
Loic Dachary [Mon, 12 Aug 2013 15:45:44 +0000 (17:45 +0200)]
ReplicatedPG: ObjectContext * becomes ObjectContextRef

The map of hobject_t to ObjectContext is made a
SharedPtrRegistry owned by ReplicatedPG

    -  map<hobject_t, ObjectContext*> object_contexts;
    +  SharedPtrRegistry<hobject_t, ObjectContext> object_contexts;

All ObjectContext pointers are changed into ObjectContextRef, i.e.
shared_ptr.

In Watch.h std::tr1::shared_ptr<ObjectContext> is used instead
of ObjectContextRef because Watch.h is included before it is
defined.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoReplicatedPG: ObjectContext is made compatible with SharedPtrRegistry
Loic Dachary [Mon, 12 Aug 2013 14:47:42 +0000 (16:47 +0200)]
ReplicatedPG: ObjectContext is made compatible with SharedPtrRegistry

When creating a new object SharedPtrRegistry::lookup_or_create uses
the default ObjectContext constructor with no argument. The existing
ObjectContext constructor is modified to have no argument and the
initialization that was previously done within the constructor is done
by the caller (that only happens three times).

The ObjectContext::get method is removed: its only purpose is to
increment the ref.

The ObjectContext::registered data member is removed as well as all
the associated assert()

The ObjectContext::destructor_callback data member Context is added
and called by the destructor. It will allow the caller to perform
additional cleanup, if necessary.

All ObjectContext * data members are replaced with shared_ptr.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
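
The destructor_callback idea can be illustrated with a small stand-in type. The commit describes a Context data member; this sketch substitutes std::function for brevity, and `ToyObjectContext` is a hypothetical name.

```cpp
#include <cassert>
#include <functional>
#include <memory>

struct ToyObjectContext {
  // Set by whoever creates the object, right after construction.
  std::function<void()> destructor_callback;

  ~ToyObjectContext() {
    if (destructor_callback)
      destructor_callback();  // lets the owner perform extra cleanup
  }
};
```

With shared_ptr ownership the callback fires exactly once, when the last reference goes away, which is what replaces the manual put/unregister calls.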
12 years agoReplicatedPG: add Mutex to protect snapset_contexts
Loic Dachary [Thu, 15 Aug 2013 18:15:03 +0000 (20:15 +0200)]
ReplicatedPG: add Mutex to protect snapset_contexts

snapset_contexts_locks is added and locked in each function where
snapset_contexts or the SnapSetContext::ref data member needs to be
accessed or modified.

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoPG: remove unused PG::_cond
Loic Dachary [Thu, 15 Aug 2013 17:42:13 +0000 (19:42 +0200)]
PG: remove unused PG::_cond

http://tracker.ceph.com/issues/5510 refs #5510

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agosharedptr_registry: add a variant of get_next() and the empty() method
Loic Dachary [Mon, 12 Aug 2013 12:05:38 +0000 (14:05 +0200)]
sharedptr_registry: add a variant of get_next() and the empty() method

The SharedPtrRegistry::get_next() method with a value of type VPtr
instead of V is added because it is sometimes more convenient to not
copy the value when walking the registry. The
SharedPtrRegistry::empty() predicate method is added.

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoMerge branch 'next'
Josh Durgin [Wed, 21 Aug 2013 23:29:29 +0000 (16:29 -0700)]
Merge branch 'next'

12 years agoobjecter: fix keys of dump_linger_ops
Josh Durgin [Wed, 21 Aug 2013 22:56:20 +0000 (15:56 -0700)]
objecter: fix keys of dump_linger_ops

The registering flag no longer exists, and registered was using the
wrong property due to a copy-paste error.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
12 years agoobjecter: resend unfinished lingers when osdmap is no longer paused
Josh Durgin [Wed, 21 Aug 2013 21:28:49 +0000 (14:28 -0700)]
objecter: resend unfinished lingers when osdmap is no longer paused

Plain Ops that haven't finished yet need to be resent if the osdmap
transitions from full or paused to unpaused.  If these Ops are
triggered by LingerOps, they will be cancelled instead (since
should_resend = false), but the LingerOps that triggered them will not
be resent.

Fix this by checking the registered flag for all linger ops, and
resending any of them that aren't paused anymore.

Fixes: #6070
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
12 years agorgw: change cache / watch-notify init sequence
Yehuda Sadeh [Mon, 19 Aug 2013 15:40:16 +0000 (08:40 -0700)]
rgw: change cache / watch-notify init sequence

Fixes: #6046
We were initializing the watch-notify (through the cache
init) before reading the zone info which was much too
early, as we didn't have the control pool name yet. Now
simplifying init/cleanup a bit, cache doesn't call watch/notify
init and cleanup directly, but rather states its need
through a virtual callback.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'master' of https://github.com/ceph/ceph
John Wilkins [Wed, 21 Aug 2013 18:02:26 +0000 (11:02 -0700)]
Merge branch 'master' of https://github.com/ceph/ceph

12 years agodoc: Clarified quorum requirements.
John Wilkins [Wed, 21 Aug 2013 18:01:48 +0000 (11:01 -0700)]
doc: Clarified quorum requirements.

fixes: #5412

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge pull request #524 from ceph/wip-mon-delta
Sage Weil [Wed, 21 Aug 2013 18:00:45 +0000 (11:00 -0700)]
Merge pull request #524 from ceph/wip-mon-delta

mon: add 'pg dump delta' to get just the rate info

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agodoc: Fixed typo.
John Wilkins [Wed, 21 Aug 2013 17:56:23 +0000 (10:56 -0700)]
doc: Fixed typo.

fixes: #5968

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge pull request #523 from dachary/master
Sage Weil [Wed, 21 Aug 2013 17:36:54 +0000 (10:36 -0700)]
Merge pull request #523 from dachary/master

doc: fix erasure code formatting warnings and errors

12 years agodoc: fix erasure code formatting warnings and errors 523/head
Loic Dachary [Wed, 21 Aug 2013 16:09:03 +0000 (18:09 +0200)]
doc: fix erasure code formatting warnings and errors

http://tracker.ceph.com/issues/4929 refs #4929

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agobuild-depend on yasm
Sage Weil [Tue, 20 Aug 2013 23:44:49 +0000 (16:44 -0700)]
build-depend on yasm

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocrc32c: note intel crc code copyrights
Sage Weil [Wed, 21 Aug 2013 04:00:14 +0000 (21:00 -0700)]
crc32c: note intel crc code copyrights

It's a BSD 3-clause.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocrc32c: add intel baseline algorithm
Sage Weil [Wed, 21 Aug 2013 03:59:56 +0000 (20:59 -0700)]
crc32c: add intel baseline algorithm

This is than the sctp code but probably slower.  We'll add it anyway
just as a reference and to have a baseline for comparing performance.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Wed, 21 Aug 2013 05:40:13 +0000 (22:40 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agoceph-disk: partprobe after creating journal partition
Sage Weil [Wed, 21 Aug 2013 05:39:09 +0000 (22:39 -0700)]
ceph-disk: partprobe after creating journal partition

At least one user reports that a partprobe is needed after creating the
journal partition.  It is not clear why sgdisk is not doing it, but this
fixes ceph-disk for them, and should be harmless for other users.

Fixes: #5599
Tested-by: lurbs in #ceph
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip-6004' into next
Sage Weil [Tue, 20 Aug 2013 23:57:46 +0000 (16:57 -0700)]
Merge remote-tracking branch 'gh/wip-6004' into next

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years ago.gitignore: ignore test-driver
Sage Weil [Fri, 9 Aug 2013 19:49:57 +0000 (12:49 -0700)]
.gitignore: ignore test-driver

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agofuse: fix warning when compiled against old fuse versions
Sage Weil [Fri, 9 Aug 2013 19:42:49 +0000 (12:42 -0700)]
fuse: fix warning when compiled against old fuse versions

client/fuse_ll.cc: In function 'void invalidate_cb(void*, vinodeno_t, int64_t, int64_t)':
warning: client/fuse_ll.cc:540: unused variable 'fino'

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agojson_spirit: remove unused typedef
Sage Weil [Fri, 9 Aug 2013 19:40:34 +0000 (12:40 -0700)]
json_spirit: remove unused typedef

In file included from json_spirit/json_spirit_writer.cpp:7:0:
json_spirit/json_spirit_writer_template.h: In function 'String_type json_spirit::non_printable_to_string(unsigned int)':
json_spirit/json_spirit_writer_template.h:37:50: warning: typedef 'Char_type' locally defined but not used [-Wunused-local-typedefs]
         typedef typename String_type::value_type Char_type;

(Also, ha ha, this file uses \r\n.)

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agogtest: add build-aux/test-driver to .gitignore
Sage Weil [Fri, 9 Aug 2013 19:31:41 +0000 (12:31 -0700)]
gtest: add build-aux/test-driver to .gitignore

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocrc32c: remove old intel implementation
Sage Weil [Tue, 20 Aug 2013 18:56:06 +0000 (11:56 -0700)]
crc32c: remove old intel implementation

The license is not LGPL compatible.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocommon/crc32c: refactor a bit
Sage Weil [Tue, 20 Aug 2013 18:55:10 +0000 (11:55 -0700)]
common/crc32c: refactor a bit

- the generic function without the _le suffix (useless)
- use a static global so that detection only happens once
- make the structure a bit cleaner to plug in new implementations

Signed-off-by: Sage Weil <sage@inktank.com>
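
The "detect once, then dispatch" structure can be sketched as follows. The bit-at-a-time CRC-32C routine is a generic reference implementation, not Ceph's sctp or Intel code, and `choose_crc32c` is a hypothetical selection hook that here always returns the portable routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef uint32_t (*crc32c_fn)(uint32_t crc, const unsigned char* data,
                              size_t len);

// Portable CRC-32C update, one bit at a time (reflected poly 0x82F63B78).
inline uint32_t crc32c_bitwise(uint32_t crc, const unsigned char* data,
                               size_t len) {
  for (size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int k = 0; k < 8; ++k)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return crc;
}

// Hypothetical hook: a real build would return an optimized routine here
// when the CPU supports it. This sketch has only the portable one.
inline crc32c_fn choose_crc32c() {
  return crc32c_bitwise;
}

// Public entry point: the implementation is selected on first use only.
inline uint32_t crc32c(uint32_t crc, const unsigned char* data, size_t len) {
  static const crc32c_fn impl = choose_crc32c();
  return impl(crc, data, len);
}
```

Adding another implementation then only means teaching `choose_crc32c` about it; callers keep using the single entry point.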
12 years agoMerge pull request #517 from dmick/wip-6049
Dan Mick [Tue, 20 Aug 2013 19:18:43 +0000 (12:18 -0700)]
Merge pull request #517 from dmick/wip-6049

mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous)

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon/Paxos: always refresh after any store_state
Sage Weil [Tue, 20 Aug 2013 18:27:23 +0000 (11:27 -0700)]
mon/Paxos: always refresh after any store_state

If we store any new state, we need to refresh the services, even if we
are still in the midst of Paxos recovery.  This is because the
subscription path will share any committed state even when paxos is
still recovering.  This prevents a race like:

 - we have maps 10..20
 - we drop out of quorum
 - we are elected leader, paxos recovery starts
 - we get one LAST with committed states that trim maps 10..15
 - we get a subscribe for map 10..20
   - we crash because 10 is no longer on disk because the PaxosService
     is out of sync with the on-disk state.

Fixes: #6045
Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
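
The race in the list above can be modelled with a toy monitor whose service keeps a cached view of what is on disk. Everything here (`ToyMonitor`, its fields, the shape of `handle_last`) is a hypothetical simplification and not the Paxos code; it only shows why a refresh must follow any store that changed the disk.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

struct ToyMonitor {
  std::map<uint64_t, std::string> disk;  // version -> committed value
  uint64_t cached_first = 0;             // the service's in-memory view
  uint64_t cached_last = 0;

  // Apply newly learned commits and trims; report whether disk changed.
  bool store_state(const std::map<uint64_t, std::string>& commits,
                   uint64_t trim_to) {
    bool changed = false;
    for (const auto& kv : commits) {
      if (disk.insert(kv).second)
        changed = true;
    }
    while (!disk.empty() && disk.begin()->first < trim_to) {
      disk.erase(disk.begin());
      changed = true;
    }
    return changed;
  }

  // Bring the cached view back in line with the disk.
  void refresh() {
    cached_first = disk.empty() ? 0 : disk.begin()->first;
    cached_last = disk.empty() ? 0 : disk.rbegin()->first;
  }

  // Refresh whenever anything was stored, even mid-recovery.
  void handle_last(const std::map<uint64_t, std::string>& commits,
                   uint64_t trim_to) {
    if (store_state(commits, trim_to))
      refresh();
  }

  bool can_serve(uint64_t v) const {
    return cached_first != 0 && v >= cached_first && v <= cached_last;
  }
};
```

Without the refresh in `handle_last`, `can_serve(10)` would still return true after maps 10..15 were trimmed, and a subscriber would be sent to a version that is no longer on disk.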
12 years agomon/Paxos: return whether store_state stored anything
Sage Weil [Tue, 20 Aug 2013 18:27:09 +0000 (11:27 -0700)]
mon/Paxos: return whether store_state stored anything

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon/Paxos: cleanup: use do_refresh from handle_commit
Sage Weil [Tue, 20 Aug 2013 18:26:57 +0000 (11:26 -0700)]
mon/Paxos: cleanup: use do_refresh from handle_commit

This avoids duplicated code by using the helper created exactly for this
purpose.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agopybind: fix Rados.conf_parse_env test
Sage Weil [Tue, 20 Aug 2013 18:23:46 +0000 (11:23 -0700)]
pybind: fix Rados.conf_parse_env test

This happens after we connect, which means we get ENOSYS always.
Instead, parse_env inside the normal setup method, which has the added
benefit of being able to debug these tests.

Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous) 517/head
Dan Mick [Tue, 20 Aug 2013 18:10:42 +0000 (11:10 -0700)]
mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous)

Fixes: #6049
Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoMerge pull request #516 from dachary/master
athanatos [Tue, 20 Aug 2013 17:34:32 +0000 (10:34 -0700)]
Merge pull request #516 from dachary/master

erasure code : plugin, interface and glossary documentation updates

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoerasure code : plugin, interface and glossary documentation updates 516/head
Loic Dachary [Tue, 20 Aug 2013 14:17:10 +0000 (16:17 +0200)]
erasure code : plugin, interface and glossary documentation updates

* replace the erasure code plugin abstract interface with a doxygen link
  that will be populated when the header shows in master
* update the plugin documentation to reflect the current draft implementation
* fix broken link to PGBackend-h
* add a glossary to define chunk, stripe, shard and strip with a drawing

http://tracker.ceph.com/issues/4929 refs #4929

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Tue, 20 Aug 2013 05:53:28 +0000 (22:53 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agoPG: remove old log when we upgrade log version
Samuel Just [Tue, 20 Aug 2013 00:23:44 +0000 (17:23 -0700)]
PG: remove old log when we upgrade log version

Otherwise the log_oid will be non-empty and the next
boot will cause us to try to upgrade again.

Fixes: #6057
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'wip-fallocate'
Sage Weil [Tue, 20 Aug 2013 05:50:11 +0000 (22:50 -0700)]
Merge branch 'wip-fallocate'

Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago ceph-fuse: fallocate appears in fuse 2.9.1, not 2.9
Sage Weil [Tue, 20 Aug 2013 04:46:29 +0000 (21:46 -0700)]
ceph-fuse: fallocate appears in fuse 2.9.1, not 2.9

There is no macro to differentiate 2.9 from 2.9.1, so we have to wait
to use this until 3.0.  :(

Signed-off-by: Sage Weil <sage@inktank.com>
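
A note on why no macro can tell 2.9 from 2.9.1: in the FUSE 2.x headers the compile-time version is, as far as I recall, built as major * 10 + minor, with no slot for a patch level. The sketch below mirrors that arithmetic in Python purely for illustration. The function names are hypothetical and are not part of ceph-fuse or libfuse.

```python
def fuse_make_version(major: int, minor: int) -> int:
    """Mirror of the FUSE 2.x FUSE_MAKE_VERSION(maj, min) arithmetic (assumed: maj * 10 + min)."""
    return major * 10 + minor


# 2.9 and 2.9.1 share major=2 and minor=9, so they encode to the same integer.
v_2_9 = fuse_make_version(2, 9)
v_2_9_1 = fuse_make_version(2, 9)  # the ".1" patch level has nowhere to go
v_3_0 = fuse_make_version(3, 0)


def can_use_fallocate(fuse_version: int) -> bool:
    """Hypothetical guard: only enable fallocate from the first distinguishable version."""
    return fuse_version >= fuse_make_version(3, 0)
```

Because both releases collapse to the same value, the only safe compile-time test is against the next version that changes the major or minor number.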
12 years ago client: do not mark_caps_dirty for generic fallocate
Sage Weil [Fri, 16 Aug 2013 06:05:17 +0000 (23:05 -0700)]
client: do not mark_caps_dirty for generic fallocate

A normal fallocate in which the size is not changed is still a no-op.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago client: guard fallocate with #ifdefs
Sage Weil [Fri, 16 Aug 2013 06:01:59 +0000 (23:01 -0700)]
client: guard fallocate with #ifdefs

Only include the Linux header if we are building on Linux.  Only implement
the fallocate method if FALLOC_FL_PUNCH_HOLE is defined.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago Ceph-fuse: Fallocate and punch hole support
Li Wang [Thu, 15 Aug 2013 04:04:03 +0000 (12:04 +0800)]
Ceph-fuse: Fallocate and punch hole support

This patch implements fallocate and punch hole support for the Ceph FUSE client.

Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago mon: add 'pg dump delta' to get just the rate info 524/head
Sage Weil [Tue, 20 Aug 2013 04:37:00 +0000 (21:37 -0700)]
mon: add 'pg dump delta' to get just the rate info

Still include it in the basic 'pg dump summary' info.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago PGLog: add a config to disable PGLog::check()
Samuel Just [Mon, 19 Aug 2013 07:02:24 +0000 (00:02 -0700)]
PGLog: add a config to disable PGLog::check()

This is a debug check which may be causing excessive
CPU usage.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years ago doc: Title change.
John Wilkins [Tue, 20 Aug 2013 00:27:10 +0000 (17:27 -0700)]
doc: Title change.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago osd/ReplicatedPG: remove broken AccessMode logic
Sage Weil [Mon, 19 Aug 2013 05:34:24 +0000 (22:34 -0700)]
osd/ReplicatedPG: remove broken AccessMode logic

The original intent here was to handle reads in two modes.  For
workloads with read/modify/write ops, the RMW mode would:

 - queue writes for local store and replicas immediately
 - block reads until the write commits to all replicas

For mixed read/write workloads without read/modify/write ops, the
DELAYED mode would:

 - queue writes for replicas
 - allow local reads
 - once replicas commit, queue write locally
 - block local reads until local write completes

In reality, we never use the DELAYED mode.  It's untested and possibly
broken, and it is unlikely we will see a workload where it is important
in the near to mid term.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years ago Merge pull request #508 from ceph/wip-5905
Gregory Farnum [Mon, 19 Aug 2013 22:14:40 +0000 (15:14 -0700)]
Merge pull request #508 from ceph/wip-5905

examples: add a librados/hello_world program

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years ago examples: add a librados/hello_world program 508/head
Greg Farnum [Thu, 15 Aug 2013 23:16:37 +0000 (16:16 -0700)]
examples: add a librados/hello_world program

This is a simple program with lots of explanatory comments that people
can use as a model for using librados.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years ago ceph: parse CEPH_ARGS environment variable
Sage Weil [Mon, 19 Aug 2013 19:48:50 +0000 (12:48 -0700)]
ceph: parse CEPH_ARGS environment variable

Fixes: #6052
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
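
For readers unfamiliar with the idea, the following is a minimal sketch of what honoring an environment variable such as CEPH_ARGS can look like. It is not the code from this commit: the helper name args_with_env is hypothetical, and both the shell-style splitting and the choice to place the environment's options ahead of the command-line arguments are assumptions made for illustration.

```python
import os
import shlex


def args_with_env(argv, environ=None, var="CEPH_ARGS"):
    """Return argv with options from the environment variable prepended.

    Assumption: the variable holds shell-style words, so quoting is honored.
    """
    environ = os.environ if environ is None else environ
    extra = shlex.split(environ.get(var, ""))
    # Environment options come first, so explicit command-line arguments appear later.
    return extra + list(argv)


# Example with an explicit, fake environment rather than the real one.
example = args_with_env(
    ["health"],
    {"CEPH_ARGS": "--id admin --conf '/etc/my ceph.conf'"},
)
```

Whether a later command-line option overrides an earlier one from the environment depends on the argument parser in use, so that behavior should be checked rather than assumed.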
12 years ago rados pybind: add conf_parse_env()
Sage Weil [Mon, 19 Aug 2013 19:48:40 +0000 (12:48 -0700)]
rados pybind: add conf_parse_env()

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years ago Merge remote-tracking branch 'gh/next'
Sage Weil [Mon, 19 Aug 2013 19:41:54 +0000 (12:41 -0700)]
Merge remote-tracking branch 'gh/next'

12 years ago doc/release-notes: v0.61.8
Sage Weil [Mon, 19 Aug 2013 19:41:26 +0000 (12:41 -0700)]
doc/release-notes: v0.61.8

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago Merge pull request #513 from dalgaaf/fix/wip-da-documentation
Sage Weil [Mon, 19 Aug 2013 19:32:30 +0000 (12:32 -0700)]
Merge pull request #513 from dalgaaf/fix/wip-da-documentation

Fix documentation issues

12 years ago filestore-config-ref.rst: mark some filestore keys as deprecated 513/head
Danny Al-Gaaf [Mon, 19 Aug 2013 18:56:48 +0000 (20:56 +0200)]
filestore-config-ref.rst: mark some filestore keys as deprecated

Marked the following keys as deprecated since v0.65:
- filestore flusher
- filestore flusher max fds
- filestore sync flush

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>