git.apps.os.sepia.ceph.com Git - ceph.git/log
11 years agolibrbd: fix zero length request handling 1632/head
Josh Durgin [Wed, 9 Apr 2014 00:38:50 +0000 (17:38 -0700)]
librbd: fix zero length request handling

Zero-length writes would hang because the completion was never
called. Reads would hit an assert about zero length in
Striper::file_to_extents().

Fix all of these cases by skipping zero-length extents. The completion
is created and finished when finish_adding_requests() is called. This
is slightly different from usual completions since it comes from the
same thread as the one scheduling the request, but zero-length aio
requests should never happen from things that might care about this,
like QEMU.

Writes and discards have had this bug since the beginning of
librbd. Reads might have avoided it until stripingv2 was added.
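
A minimal sketch of the idea described above, not the librbd code. `Extent`, `Completion`, and `submit_extents` are hypothetical names chosen for illustration; only `finish_adding_requests()` is taken from the message. It shows why a request made up solely of zero-length extents still completes: nothing is queued, so the completion fires as soon as the caller signals that it has finished adding requests.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; not the librbd types.
struct Extent {
  uint64_t offset;
  uint64_t length;
};

struct Completion {
  int pending = 0;       // sub-requests still outstanding
  bool building = true;  // true until finish_adding_requests() is called
  bool done = false;     // set once the user callback would fire

  void add_request() { ++pending; }
  void complete_request() {
    --pending;
    maybe_finish();
  }
  void finish_adding_requests() {
    building = false;
    maybe_finish();
  }

 private:
  void maybe_finish() {
    if (!building && pending == 0) done = true;
  }
};

// Queue one sub-request per non-empty extent; return how many were queued.
int submit_extents(const std::vector<Extent>& extents, Completion& c) {
  int queued = 0;
  for (const Extent& e : extents) {
    if (e.length == 0) continue;  // skip zero-length extents entirely
    c.add_request();
    ++queued;
  }
  c.finish_adding_requests();  // completes at once if nothing was queued
  return queued;
}
```

In this sketch the completion for an all-empty request finishes on the submitting thread, which mirrors the caveat in the message about the completion coming from the same thread that scheduled the request.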

Fixes: #5469
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoMerge pull request #1628 from ceph/wip-5835
Josh Durgin [Tue, 8 Apr 2014 21:47:21 +0000 (14:47 -0700)]
Merge pull request #1628 from ceph/wip-5835

update package descriptions

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agodebian: update ceph description 1628/head
Sage Weil [Tue, 8 Apr 2014 21:19:38 +0000 (14:19 -0700)]
debian: update ceph description

Fixes: #5835
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoceph.spec: update ceph description
Sage Weil [Tue, 8 Apr 2014 21:18:44 +0000 (14:18 -0700)]
ceph.spec: update ceph description

Fixes: #5835
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1625 from ceph/wip-8019
Samuel Just [Tue, 8 Apr 2014 19:45:28 +0000 (12:45 -0700)]
Merge pull request #1625 from ceph/wip-8019

osd: fix journal umount/mount weirdness

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoos/FileStore: reset journal state on umount 1625/head
Sage Weil [Tue, 8 Apr 2014 17:52:43 +0000 (10:52 -0700)]
os/FileStore: reset journal state on umount

We observed a sequence like:

 - replay journal
   - sets JournalingObjectStore applied_op_seq
 - umount
 - mount
   - initiate commit with previous applied_op_seq
 - replay journal
   - commit finishes
   - on replay commit, we fail assert op > committed_seq

Although strictly speaking the assert failure is harmless here, in general
we should not let state leak through from a previous mount into this
mount, or else assertions become more difficult to reason about.
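
The sequence above can be modelled with a toy example. This is only a sketch under simplifying assumptions: `JournalState`, `reset()`, and `remount_replay_ok()` are hypothetical names, and the real FileStore and JournalingObjectStore track considerably more state than two counters.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily simplified per-mount journal bookkeeping.
struct JournalState {
  uint64_t applied_seq = 0;
  uint64_t committed_seq = 0;

  // Clear every per-mount counter so nothing leaks into the next mount.
  void reset() {
    applied_seq = 0;
    committed_seq = 0;
  }
};

// Simulate: a first replay reaches `last_op`, then umount (optionally
// resetting), then mount starts a commit from the current applied_seq and
// that commit finishes. Returns whether replaying `first_replayed_op`
// afterwards would satisfy the check "op > committed_seq".
bool remount_replay_ok(bool reset_on_umount, uint64_t last_op,
                       uint64_t first_replayed_op) {
  JournalState s;
  s.applied_seq = last_op;              // first replay sets applied_seq
  if (reset_on_umount) s.reset();       // umount
  uint64_t committing = s.applied_seq;  // mount: commit uses current value
  s.committed_seq = committing;         // commit finishes during replay
  return first_replayed_op > s.committed_seq;
}
```

Without the reset, the stale applied_seq becomes the committed sequence and an older replayed op fails the check; with the reset, the second mount starts from zero.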

Fixes: #8019
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agovstart.sh: make crush location match up with what init-ceph does
Sage Weil [Tue, 8 Apr 2014 17:58:53 +0000 (10:58 -0700)]
vstart.sh: make crush location match up with what init-ceph does

This makes it so that ./init-ceph restart osd.0 won't modify the CRUSH
tree.  And in any case, the localhost/localrack thing we were doing before
was pretty useless.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1623 from ceph/wip-8026
Gregory Farnum [Tue, 8 Apr 2014 17:43:14 +0000 (10:43 -0700)]
Merge pull request #1623 from ceph/wip-8026

mds: fix shared_ptr MDRequest bugs

Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agoMerge pull request #1621 from dachary/wip-7914
Sage Weil [Tue, 8 Apr 2014 17:14:46 +0000 (10:14 -0700)]
Merge pull request #1621 from dachary/wip-7914

erasure-code: thread-safe initialization of gf-complete

This looks like a good interim solution until gf-complete exposes a simpler init function
that hides this.

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomds: fix shared_ptr MDRequest bugs 1623/head
Yan, Zheng [Tue, 8 Apr 2014 08:11:03 +0000 (16:11 +0800)]
mds: fix shared_ptr MDRequest bugs

The main change is to use shared_ptr instead of weak_ptr to define the
active request map. The reason is that a slave request needs to be
preserved until the master explicitly finishes it.
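
A small sketch of the difference, assuming simplified stand-in types (`Request`, `WeakMap`, and `OwningMap` are hypothetical and not the MDS classes). A map of weak_ptr only observes a request, so the request disappears when its last outside owner drops it. A map of shared_ptr keeps it alive until it is explicitly erased.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>

struct Request {
  uint64_t id;
};
using RequestRef = std::shared_ptr<Request>;

// Observing map: entries expire when the last outside owner goes away.
struct WeakMap {
  std::map<uint64_t, std::weak_ptr<Request>> active;
  void add(const RequestRef& r) { active[r->id] = r; }
  RequestRef find(uint64_t id) {
    auto it = active.find(id);
    return it == active.end() ? RequestRef() : it->second.lock();
  }
};

// Owning map: a request stays alive until finish() removes it.
struct OwningMap {
  std::map<uint64_t, RequestRef> active;
  void add(const RequestRef& r) { active[r->id] = r; }
  RequestRef find(uint64_t id) {
    auto it = active.find(id);
    return it == active.end() ? RequestRef() : it->second;
  }
  void finish(uint64_t id) { active.erase(id); }  // explicit release
};
```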

Fixes: #8026
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoerasure-code: thread-safe initialization of gf-complete 1621/head
Loic Dachary [Mon, 7 Apr 2014 22:20:29 +0000 (00:20 +0200)]
erasure-code: thread-safe initialization of gf-complete

Instead of relying on an implicit initialization happening during
encoding/decoding with galois.c:galois_init_default_field, call
gf.c:gf_init_easy for each w value when the plugin is loaded.

Loading the plugin is protected against race conditions by a lock.

It does not cover all possible uses of gf-complete but it is enough for
the ceph jerasure plugin.
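
The pattern can be sketched as follows. This is an illustration only: `PluginLoader` and `init_field()` are hypothetical, the widths passed in are arbitrary example values, and no gf-complete call is made. The point is that all per-w setup happens eagerly inside the locked load path, so later encode/decode calls never trigger lazy initialization.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <vector>

// Hypothetical plugin loader; init_field() stands in for per-w table setup.
class PluginLoader {
 public:
  // Load the plugin once; every field width is initialized under the lock.
  void load(const std::vector<int>& widths) {
    std::lock_guard<std::mutex> guard(lock_);
    if (loaded_) return;  // a second caller finds the work already done
    for (int w : widths) init_field(w);
    loaded_ = true;
  }

  // Read-only accessors, meant to be called after load() has returned.
  int init_calls() const { return init_calls_; }
  bool ready(int w) const { return fields_.count(w) != 0; }

 private:
  void init_field(int w) {
    fields_.insert(w);
    ++init_calls_;
  }

  std::mutex lock_;
  bool loaded_ = false;
  std::set<int> fields_;
  int init_calls_ = 0;
};
```

Concurrent callers of load() serialize on the mutex, so each width is set up exactly once in this sketch.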

http://tracker.ceph.com/issues/7914 fixes #7914

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years agoMerge pull request #1610 from ceph/wip-4354-shared_ptr
Sage Weil [Tue, 8 Apr 2014 04:27:50 +0000 (21:27 -0700)]
Merge pull request #1610 from ceph/wip-4354-shared_ptr

Use shared pointers for Mutations/OpRequests in the MDS

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1594 from ceph/wip-7958
Sage Weil [Tue, 8 Apr 2014 04:27:04 +0000 (21:27 -0700)]
Merge pull request #1594 from ceph/wip-7958

wip 7958

Passed sage-2014-04-07_07:04:02-fs-wip-7958-testing-basic-plana.

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoosd_types: fix pg_stat_t::encode, object_stat_sum_t::decode version
Samuel Just [Mon, 7 Apr 2014 23:40:09 +0000 (16:40 -0700)]
osd_types: fix pg_stat_t::encode, object_stat_sum_t::decode version

Introduced in a130a4452e4fb159dc62fb417077d98dc9ebd621
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoSimpleLock: Switch MutationRef& for MutationRef in get_xlock() 1610/head
Greg Farnum [Wed, 12 Mar 2014 20:14:56 +0000 (13:14 -0700)]
SimpleLock: Switch MutationRef& for MutationRef in get_xlock()

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMDCache: use raw MutationImpl* instead of MutationRef in a few places
Greg Farnum [Thu, 13 Mar 2014 03:50:19 +0000 (20:50 -0700)]
MDCache: use raw MutationImpl* instead of MutationRef in a few places

Avoid the atomic ops necessary when copying a shared_ptr.
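
A short sketch of the cost being avoided, using a simplified stand-in for MutationImpl (the struct body and the two helper functions are hypothetical). Passing a shared_ptr by value bumps the reference count, which is typically an atomic operation, while a raw pointer only borrows the object.

```cpp
#include <cassert>
#include <memory>

struct MutationImpl {
  int ops = 0;
};  // simplified stand-in, not the MDS class
using MutationRef = std::shared_ptr<MutationImpl>;

// Taking the ref by value copies it: one increment on entry and one
// decrement on exit. Returns the reference count seen inside the call.
long by_value(MutationRef mut) {
  ++mut->ops;
  return mut.use_count();
}

// A raw pointer borrows the object without touching the reference count.
// The caller must keep its own reference alive for the whole call.
void by_raw(MutationImpl* mut) { ++mut->ops; }
```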

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoLocker: use raw MutationImpl* instead of MutationRef in several places
Greg Farnum [Wed, 12 Mar 2014 20:03:26 +0000 (13:03 -0700)]
Locker: use raw MutationImpl* instead of MutationRef in several places

Sadly, you can't implicitly convert non-const references to shared pointers, so avoid the atomic ops necessary when copying a shared_ptr.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoLocker: use a null_ref instead of NULL
Greg Farnum [Wed, 12 Mar 2014 21:20:52 +0000 (14:20 -0700)]
Locker: use a null_ref instead of NULL

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoLocker: Use MutationRef instead of raw pointers
Greg Farnum [Wed, 12 Mar 2014 17:53:16 +0000 (10:53 -0700)]
Locker: Use MutationRef instead of raw pointers

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoLocker: remove Mutation param from xlock_import
Greg Farnum [Mon, 7 Apr 2014 23:05:49 +0000 (16:05 -0700)]
Locker: remove Mutation param from xlock_import

It's not used.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMDCache: fix users of active_requests for use of shared_ptr
Greg Farnum [Thu, 13 Mar 2014 03:20:08 +0000 (20:20 -0700)]
MDCache: fix users of active_requests for use of shared_ptr

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMDCache: use a null_ref instead of NULL in a few places
Greg Farnum [Wed, 12 Mar 2014 20:43:20 +0000 (13:43 -0700)]
MDCache: use a null_ref instead of NULL in a few places

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMDCache: use MutationRef instead of raw pointers
Greg Farnum [Wed, 12 Mar 2014 17:33:57 +0000 (10:33 -0700)]
MDCache: use MutationRef instead of raw pointers

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoServer: use MutationRef instead of raw pointer
Greg Farnum [Wed, 12 Mar 2014 16:48:04 +0000 (09:48 -0700)]
Server: use MutationRef instead of raw pointer

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMDS: switch cache object classes to use MutationRef instead of raw pointers
Greg Farnum [Wed, 12 Mar 2014 16:42:45 +0000 (09:42 -0700)]
MDS: switch cache object classes to use MutationRef instead of raw pointers

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoceph_test_rados_api_misc: print osd_max_attr_size
Sage Weil [Mon, 7 Apr 2014 23:31:16 +0000 (16:31 -0700)]
ceph_test_rados_api_misc: print osd_max_attr_size

Very confusing results from this test in bug #8009.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1612 from ceph/wip-7919
Sage Weil [Mon, 7 Apr 2014 23:11:51 +0000 (16:11 -0700)]
Merge pull request #1612 from ceph/wip-7919

mon: MonCommands: have all 'auth' commands require 'execute' caps

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1620 from ceph/wip-8003
Sage Weil [Mon, 7 Apr 2014 23:09:40 +0000 (16:09 -0700)]
Merge pull request #1620 from ceph/wip-8003

Wip 8003

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1611 from ceph/wip-7975
Sage Weil [Mon, 7 Apr 2014 22:59:37 +0000 (15:59 -0700)]
Merge pull request #1611 from ceph/wip-7975

osd: disable agent when stats_invalid (post-split)

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agodoc: Removed --stable arg and replaced with --release arg for ceph-deploy.
John Wilkins [Mon, 7 Apr 2014 22:49:09 +0000 (15:49 -0700)]
doc: Removed --stable arg and replaced with --release arg for ceph-deploy.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agoosd/ReplicatedPG: warn if invalid stats prevent us from activating agent 1611/head
Sage Weil [Mon, 7 Apr 2014 22:39:59 +0000 (15:39 -0700)]
osd/ReplicatedPG: warn if invalid stats prevent us from activating agent

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: dump agent state on pg query
Sage Weil [Mon, 7 Apr 2014 22:34:53 +0000 (15:34 -0700)]
osd/ReplicatedPG: dump agent state on pg query

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: kickstart the agent if scrub stats become valid
Sage Weil [Mon, 7 Apr 2014 22:21:01 +0000 (15:21 -0700)]
osd/ReplicatedPG: kickstart the agent if scrub stats become valid

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge tag 'v0.79' into firefly
Sage Weil [Mon, 7 Apr 2014 22:04:18 +0000 (15:04 -0700)]
Merge tag 'v0.79' into firefly

v0.79

11 years agoMerge pull request #1619 from ceph/wip-7659
Samuel Just [Mon, 7 Apr 2014 21:47:40 +0000 (14:47 -0700)]
Merge pull request #1619 from ceph/wip-7659

Wip 7659

Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
11 years agoReplicatedPG: do not evict head while clone is being promoted 1620/head
Samuel Just [Sun, 6 Apr 2014 20:38:52 +0000 (13:38 -0700)]
ReplicatedPG: do not evict head while clone is being promoted

Fixes: #8003
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::trim_object: account evicted prev clone for stats
Samuel Just [Mon, 7 Apr 2014 00:49:20 +0000 (17:49 -0700)]
ReplicatedPG::trim_object: account evicted prev clone for stats

If the previous clone is evicted, we shouldn't adjust the stats to
account for its new clone_overlap value.

Fixes: #7964
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::make_writeable: check for evicted clone before adjusting for clone_overlap
Samuel Just [Sun, 6 Apr 2014 23:30:25 +0000 (16:30 -0700)]
ReplicatedPG::make_writeable: check for evicted clone before adjusting for clone_overlap

Fixes: #7964
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1617 from ceph/wip-7904
Sage Weil [Mon, 7 Apr 2014 21:02:58 +0000 (14:02 -0700)]
Merge pull request #1617 from ceph/wip-7904

Wip 7904

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1614 from ceph/wip-7964
Sage Weil [Mon, 7 Apr 2014 21:01:58 +0000 (14:01 -0700)]
Merge pull request #1614 from ceph/wip-7964

Wip 7964

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1616 from ceph/wip-7916
Sage Weil [Mon, 7 Apr 2014 20:59:22 +0000 (13:59 -0700)]
Merge pull request #1616 from ceph/wip-7916

ReplicatedPG: improve get_object_context debugging

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoReplicatedPG: use get_clone_bytes on evict/promote
Samuel Just [Sun, 6 Apr 2014 19:29:56 +0000 (12:29 -0700)]
ReplicatedPG: use get_clone_bytes on evict/promote

Fixes: #7964
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::_scrub: account for clone_overlap on each clone
Samuel Just [Sun, 6 Apr 2014 19:23:52 +0000 (12:23 -0700)]
ReplicatedPG::_scrub: account for clone_overlap on each clone

Otherwise, we end up subtracting off clone_overlap for evicted clones
whose sizes we did not add in.

Fixes: #7964
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::find_object_context: check obs.exists on clone obc before checking...
Samuel Just [Sun, 6 Apr 2014 18:22:04 +0000 (11:22 -0700)]
ReplicatedPG::find_object_context: check obs.exists on clone obc before checking snaps

Fixes: #7858
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::finish_promote: add debugging assert for clone_size
Samuel Just [Fri, 4 Apr 2014 20:53:22 +0000 (13:53 -0700)]
ReplicatedPG::finish_promote: add debugging assert for clone_size

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1613 from ceph/wip-7994
Sage Weil [Mon, 7 Apr 2014 17:57:33 +0000 (10:57 -0700)]
Merge pull request #1613 from ceph/wip-7994

OSD: _share_map_outgoing whenever sending a message to a peer

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoqa: workunits: mon: auth_caps.sh: test 'auth' caps requirements 1612/head
Joao Eduardo Luis [Mon, 7 Apr 2014 17:30:56 +0000 (18:30 +0100)]
qa: workunits: mon: auth_caps.sh: test 'auth' caps requirements

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: MonCommands: have all 'auth' commands require 'execute' caps
Joao Eduardo Luis [Mon, 7 Apr 2014 17:17:54 +0000 (18:17 +0100)]
mon: MonCommands: have all 'auth' commands require 'execute' caps

An earlier patch already made the entity require 'execute' caps for
read-only commands.  This patch introduces the same requirement for *all*
auth commands, read-only and read-write alike.

While the rationale behind the earlier patch for leaving read-write
operations out of this requirement still holds, we now enforce this to
match compatibility with what was happening back on Dumpling with regard
to the 'execute' cap being required for auth commands.  However, it should
be noted that back on Dumpling we were only requiring the 'execute' cap
for auth commands, regardless of read-only or read-write, and no other
caps were required.

Fixes: #7919
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago0.79 v0.79
Jenkins [Mon, 7 Apr 2014 16:48:36 +0000 (16:48 +0000)]
0.79

11 years agomds: fix uninit val in MMDSSlaveRequest
Sage Weil [Mon, 7 Apr 2014 03:26:39 +0000 (20:26 -0700)]
mds: fix uninit val in MMDSSlaveRequest

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1607 from ceph/wip-7997
Sage Weil [Mon, 7 Apr 2014 15:11:00 +0000 (08:11 -0700)]
Merge pull request #1607 from ceph/wip-7997

mon: wait for quorum for MMonGetVersion

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoMerge pull request #1609 from ceph/wip-7739
Sage Weil [Mon, 7 Apr 2014 00:56:05 +0000 (17:56 -0700)]
Merge pull request #1609 from ceph/wip-7739

mds: fix some uninitialized message fields

Reviewed-by: Zheng Yan <zheng.z.yan@intel.com>
11 years agomds: fix uninit MMDSSlaveRequest lock_type 1609/head
Sage Weil [Mon, 7 Apr 2014 00:36:38 +0000 (17:36 -0700)]
mds: fix uninit MMDSSlaveRequest lock_type

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1608 from ceph/wip-8002
Samuel Just [Sun, 6 Apr 2014 23:32:38 +0000 (16:32 -0700)]
Merge pull request #1608 from ceph/wip-8002

osd: fix osd map subscribe on YOU_DIED osd_ping

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoosd: fix map subscription in YOU_DIED osd_ping handler 1608/head
Sage Weil [Sun, 6 Apr 2014 23:03:50 +0000 (16:03 -0700)]
osd: fix map subscription in YOU_DIED osd_ping handler

If we have epoch X and find out we died as of epoch Y, we still want to
request X+1.  Among other things, this fixes a 'stall' if Y happens to be
the most recent map published and no new maps are generated because we will
never get anything back from our subscription.

This makes this osdmap_subscribe() caller match every other caller by
passing in current epoch + 1.
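
A toy model of the reasoning, assuming that a subscription only yields maps at or after the requested epoch. Both helpers are hypothetical and are not the OSD's osdmap_subscribe().

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = uint32_t;

// Hypothetical helper: always ask for the map after the one we hold,
// regardless of the epoch at which we were told we died.
epoch_t next_epoch_to_request(epoch_t have) { return have + 1; }

// Toy subscription model: the monitor has something to send only if the
// requested start epoch is not past the newest published map.
bool subscription_delivers(epoch_t start, epoch_t latest) {
  return start <= latest;
}
```

With X = 10 held and Y = 12 the newest published map, requesting X+1 = 11 delivers maps, whereas a request starting at 13 would wait until a new map is generated.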

Fixes: #8002
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomsgr: add ms_dump_on_send option
Sage Weil [Wed, 2 Apr 2014 15:49:33 +0000 (08:49 -0700)]
msgr: add ms_dump_on_send option

This is useful only for debugging.  The encoded contents of a message are
dumped to the log on message send.  This is useful when valgrind is
triggering warnings about uninitialized memory in messages because the
call chain will indicate which message type is to blame, whereas the
usual writer thread context does not tell us any useful information.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomds: fix uninitialized fields in MDiscover
Sage Weil [Sun, 6 Apr 2014 20:18:40 +0000 (13:18 -0700)]
mds: fix uninitialized fields in MDiscover

Fixes: #7739
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomon: wait for quorum for MMonGetVersion 1607/head
Sage Weil [Sat, 5 Apr 2014 23:58:55 +0000 (16:58 -0700)]
mon: wait for quorum for MMonGetVersion

We should not respond to checks for map versions when we are in the
probing or electing states or else clients will get incorrect results when
they ask what the latest map version is.

Fixes: #7997
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoclient: release clean pages if no open file want RDCACHE 1594/head
Yan, Zheng [Fri, 4 Apr 2014 17:06:29 +0000 (01:06 +0800)]
client: release clean pages if no open file want RDCACHE

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoosd: disable agent when stats_invalid (post-split)
Sage Weil [Sat, 5 Apr 2014 01:15:04 +0000 (18:15 -0700)]
osd: disable agent when stats_invalid (post-split)

After a split the pg stats are approximate but not precisely correct.  Any
inaccuracy can be problematic for the agent because it determines the
level of effort and potentially full/blocking behavior based on that.

We could conceivably do some estimation here that is "safe" in that we
don't commit to too much effort (or back off later if it isn't paying off)
and never block, but that is error-prone.

Instead, just disable the agent until a scrub makes the stats reliable
again.

We should document that a scrub after split is recommended (in any case)
and especially important on cache tiers, but there are currently *no*
user docs about PG splitting.

Fixes: #7975
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1605 from ceph/wip-7993
Sage Weil [Sat, 5 Apr 2014 01:07:52 +0000 (18:07 -0700)]
Merge pull request #1605 from ceph/wip-7993

ceph-post-file: use getopt for multiple options, add longopts to help

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoOSD: _share_map_outgoing whenever sending a message to a peer 1613/head
Greg Farnum [Fri, 4 Apr 2014 23:06:05 +0000 (16:06 -0700)]
OSD: _share_map_outgoing whenever sending a message to a peer

This ensures that they get new maps before an op which requires them (that
they would then request from the monitor).

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoceph-post-file: use getopt for multiple options, add longopts to help 1605/head
Dan Mick [Fri, 4 Apr 2014 22:26:42 +0000 (15:26 -0700)]
ceph-post-file: use getopt for multiple options, add longopts to help

Fixes: #7993
Signed-off-by: Dan Mick <dan.mick@inktank.com>
11 years agoMerge pull request #1603 from ceph/wip-7983
Samuel Just [Fri, 4 Apr 2014 22:17:00 +0000 (15:17 -0700)]
Merge pull request #1603 from ceph/wip-7983

osd/ReplicatedPG: do not hit_set_persist while potentially backfilling hit_set_*

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1604 from ceph/wip-7992
Dan Mick [Fri, 4 Apr 2014 21:41:02 +0000 (14:41 -0700)]
Merge pull request #1604 from ceph/wip-7992

ceph-post-file: fix installation of ssh key files

11 years agoceph-post-file: fix installation of ssh key files 1604/head
Sage Weil [Fri, 4 Apr 2014 21:39:56 +0000 (14:39 -0700)]
ceph-post-file: fix installation of ssh key files

Fixes: #7992
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/ReplicatedPG: do not hit_set_persist while potentially backfilling hit_set_* 1603/head
Sage Weil [Fri, 4 Apr 2014 20:56:33 +0000 (13:56 -0700)]
osd/ReplicatedPG: do not hit_set_persist while potentially backfilling hit_set_*

The hit_set transactions may include both a modify of the new hit_set and
deletion of an old one, spanning the backfill boundary, and we may end up
sending a backfill target a blank transaction that does not correctly
remove the old object.  Later it will notice the stray object and
throw an assertion.

Fix this by skipping hit_set_persist() if any of the backfill targets are
still working on the very first hash value in the PG (which is where all
of the hit_set objects live).  This is coarse but simple.

Another solution would be to send separate ops for the trim/deletion and
new hit_set update, but that is a bit more complex and adds a bit more
runtime overhead (twice the messages).
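
The coarse check can be sketched like this. `BackfillTarget`, its field, and `can_persist_hit_set()` are hypothetical; the sketch assumes each target's backfill progress can be compared against the first hash value in the PG.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical: the hash position each backfill target has reached so far.
struct BackfillTarget {
  uint32_t last_backfill_hash;
};

// Skip persisting while any target is still working on the first hash
// value, which is where all hit_set objects are assumed to live.
bool can_persist_hit_set(const std::vector<BackfillTarget>& targets,
                         uint32_t first_hash) {
  for (const BackfillTarget& t : targets) {
    if (t.last_backfill_hash <= first_hash) return false;
  }
  return true;
}
```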

Fixes: #7983
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc/release-notes: note about emperor backport of mon auth fix
Sage Weil [Fri, 4 Apr 2014 19:59:41 +0000 (12:59 -0700)]
doc/release-notes: note about emperor backport of mon auth fix

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomon: MonCommands.h: have 'auth' read-only operations require 'x' cap
Joao Eduardo Luis [Thu, 3 Apr 2014 17:21:08 +0000 (18:21 +0100)]
mon: MonCommands.h: have 'auth' read-only operations require 'x' cap

This reintroduces the same semantics that were in place in dumpling prior
to the refactoring of the cap/command matching code.

We haven't added this requirement to auth read-write operations as that
would have the potential to break a lot of well-configured keyrings once
the users upgraded, without any significant gain -- we assume that if
they have set 'rw' caps on a given entity, they do indeed expect said
entity to be a sort-of-privileged entity with regard to monitor access.

Fixes: #7919
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMigrator: use a null ref instead of NULL when calling into path_traverse
Greg Farnum [Wed, 12 Mar 2014 20:43:05 +0000 (13:43 -0700)]
Migrator: use a null ref instead of NULL when calling into path_traverse

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMigrator: use MDRequestRef and MutationRef instead of raw pointers
Greg Farnum [Wed, 12 Mar 2014 16:38:15 +0000 (09:38 -0700)]
Migrator: use MDRequestRef and MutationRef instead of raw pointers

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoSimpleLock: use MutationRef instead of raw pointers
Greg Farnum [Wed, 12 Mar 2014 03:52:21 +0000 (20:52 -0700)]
SimpleLock: use MutationRef instead of raw pointers

While we're here, remove the non-const get_xlock_by() (because
we don't need it). Also note we return a full MutationRef
(instead of a ref to the stored one). It's necessary in case we
don't have a set-up more() object.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMutation: move self_ref into MutationImpl instead of MDRequestImpl
Greg Farnum [Wed, 12 Mar 2014 03:20:41 +0000 (20:20 -0700)]
Mutation: move self_ref into MutationImpl instead of MDRequestImpl

We keep an MDRequestImpl::set_self_ref(MDRequestRef&) function so
that we don't need to do the pointer conversion elsewhere.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMutation: rename to MutationImpl and define MutationRef
Greg Farnum [Wed, 12 Mar 2014 03:11:21 +0000 (20:11 -0700)]
Mutation: rename to MutationImpl and define MutationRef

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoLocker: use MDRequestRef instead of MDRequest*
Greg Farnum [Mon, 10 Mar 2014 23:08:11 +0000 (16:08 -0700)]
Locker: use MDRequestRef instead of MDRequest*

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMDCache: use MDRequestRef instead of MDRequest*
Greg Farnum [Mon, 10 Mar 2014 23:07:00 +0000 (16:07 -0700)]
MDCache: use MDRequestRef instead of MDRequest*

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoServer: Use MDRequestRef instead of raw pointers
Greg Farnum [Sat, 8 Mar 2014 00:37:25 +0000 (16:37 -0800)]
Server: Use MDRequestRef instead of raw pointers

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMDS: Convert the request_start* functions and their immediate callers
Greg Farnum [Sat, 8 Mar 2014 00:01:42 +0000 (16:01 -0800)]
MDS: Convert the request_start* functions and their immediate callers

Also, the active_requests mapping gets weak pointers.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agomds: MDRequest: rename to MDRequestImpl, and declare MDRequestRef
Greg Farnum [Fri, 7 Mar 2014 23:58:11 +0000 (15:58 -0800)]
mds: MDRequest: rename to MDRequestImpl, and declare MDRequestRef

We're switching the MDRequest to be used as a shared pointer. This is the
first step on the path to inserting an OpTracker into the MDS.
Give the MDRequestImpl a weak_ptr self_ref so that we can keep
using the elist for now.
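
A minimal sketch of the self_ref idea, with a hypothetical `RequestImpl` and `make_request()` rather than the MDS types. Code that only holds a raw pointer (for example from an intrusive list) can recover an owning reference through the weak self pointer, and the weak pointer does not keep the object alive by itself.

```cpp
#include <cassert>
#include <memory>

struct RequestImpl {
  std::weak_ptr<RequestImpl> self_ref;  // does not keep the object alive

  void set_self_ref(const std::shared_ptr<RequestImpl>& ref) {
    self_ref = ref;
  }

  // Recover an owning reference when only a raw pointer is at hand.
  std::shared_ptr<RequestImpl> get_ref() const { return self_ref.lock(); }
};

// Create a request and wire up its self reference in one place.
std::shared_ptr<RequestImpl> make_request() {
  std::shared_ptr<RequestImpl> r = std::make_shared<RequestImpl>();
  r->set_self_ref(r);
  return r;
}
```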

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoinclude/memory: add static_pointer_cast
Greg Farnum [Wed, 12 Mar 2014 20:56:42 +0000 (13:56 -0700)]
include/memory: add static_pointer_cast

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years agoMerge pull request #1602 from ceph/wip-cache-create-fix
Samuel Just [Fri, 4 Apr 2014 17:34:40 +0000 (10:34 -0700)]
Merge pull request #1602 from ceph/wip-cache-create-fix

ReplicatedPG: fix CEPH_OSD_OP_CREATE on cache pools

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoclient: fix null pointer dereference in Client::unlink
Yan, Zheng [Fri, 4 Apr 2014 12:50:41 +0000 (20:50 +0800)]
client: fix null pointer dereference in Client::unlink

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoObjectCacher: assert no waiter when remove buffer head
Yan, Zheng [Thu, 3 Apr 2014 13:08:03 +0000 (21:08 +0800)]
ObjectCacher: assert no waiter when remove buffer head

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoclient: cleanup Client::_invalidate_inode_cache()
Yan, Zheng [Fri, 4 Apr 2014 01:39:21 +0000 (09:39 +0800)]
client: cleanup Client::_invalidate_inode_cache()

drop parameter 'keep_caps'

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoclient: drop Fr cap before getattr CEPH_STAT_CAP_SIZE
Yan, Zheng [Fri, 4 Apr 2014 01:06:27 +0000 (09:06 +0800)]
client: drop Fr cap before getattr CEPH_STAT_CAP_SIZE

When the MDS receives the getattr request, the corresponding inode's
filelock can be in an unstable state that waits for the client's Fr cap.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoclient: properly retain used caps
Yan, Zheng [Fri, 4 Apr 2014 05:50:10 +0000 (13:50 +0800)]
client: properly retain used caps

Pass 'retain' properly to Client::send_cap() because it is used to
adjust cap->issued.

Also make Client::encode_inode_release() not release used/dirty caps.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoclient: assign implemented caps to caps field of MClientCaps
Yan, Zheng [Thu, 3 Apr 2014 23:50:55 +0000 (07:50 +0800)]
client: assign implemented caps to caps field of MClientCaps

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoclient: hold Fcr caps during readahead
Yan, Zheng [Thu, 3 Apr 2014 22:49:49 +0000 (06:49 +0800)]
client: hold Fcr caps during readahead

Fcr caps prevent the file from being truncated.

Fixes: #7958
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoclient: implement RDCACHE reference tracking
Yan, Zheng [Thu, 3 Apr 2014 12:01:04 +0000 (20:01 +0800)]
client: implement RDCACHE reference tracking

Make the code able to track Fc caps used by async buffer reads.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoReplicatedPG: fix CEPH_OSD_OP_CREATE on cache pools 1602/head
Ilya Dryomov [Fri, 4 Apr 2014 13:40:29 +0000 (17:40 +0400)]
ReplicatedPG: fix CEPH_OSD_OP_CREATE on cache pools

The following

./ceph osd pool create data-cache 8 8
./ceph osd tier add data data-cache
./ceph osd tier cache-mode data-cache writeback
./ceph osd tier set-overlay data data-cache

./rados -p data create foo
./rados -p data stat foo

results in

  error stat-ing data/foo: No such file or directory

even though foo exists in the data-cache pool, as it should.  STAT
checks for (exists && !is_whiteout()), but the whiteout flag isn't
cleared on CREATE as it is on WRITE and WRITEFULL.  The problem is
that, for newly created 0-sized cache pool objects, CREATE handler in
do_osd_ops() doesn't get a chance to queue OP_TOUCH, and so the logic
in prepare_transaction() considers CREATE to be a read and therefore
doesn't clear whiteout.  Fix it by allowing CREATE handler to queue
OP_TOUCH at all times, mimicking WRITE and WRITEFULL behaviour.
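
A toy model of the logic, under loudly stated assumptions: `ObjectState`, `stat_visible()`, and `apply_create()` are hypothetical, and the sketch assumes the old handler skipped the touch because an object record (the whiteout) already existed. It is not the ReplicatedPG code.

```cpp
#include <cassert>

// Hypothetical cache-pool object record.
struct ObjectState {
  bool exists = false;
  bool whiteout = false;
};

// STAT succeeds only for objects that exist and are not whiteouts.
bool stat_visible(const ObjectState& o) { return o.exists && !o.whiteout; }

// Toy CREATE: the op counts as a write only if it queued a touch, and
// only writes clear the whiteout flag.
void apply_create(ObjectState& o, bool always_queue_touch) {
  bool queued_touch = always_queue_touch || !o.exists;
  o.exists = true;
  if (queued_touch) o.whiteout = false;
}
```

Starting from a whiteout record, the first variant leaves the object invisible to STAT, while always queuing the touch clears the flag.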

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
11 years agoMerge pull request #1600 from ceph/wip-7922
Sage Weil [Fri, 4 Apr 2014 16:22:42 +0000 (09:22 -0700)]
Merge pull request #1600 from ceph/wip-7922

Wip 7922

Passes my manual testing and the new teuthology test case.

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoosd: Send REJECT to all previously acquired reservations 1600/head
David Zafman [Fri, 4 Apr 2014 05:13:17 +0000 (22:13 -0700)]
osd: Send REJECT to all previously acquired reservations

When getting a REJECT from a backfill target, tell already GRANTed targets to
go back to RepNotRecovering state by sending a REJECT to them.
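
A sketch of the bookkeeping this implies, with a hypothetical `ReservationTracker` keyed by OSD id. It only records which targets have granted and lists who must be told to back out; it does not model the actual peering state machine.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical tracker for backfill reservations held on remote targets.
class ReservationTracker {
 public:
  void granted(int osd) { granted_.insert(osd); }

  // On a REJECT from any target, give up every reservation already held
  // and return the peers that must receive a REJECT so they can go back
  // to RepNotRecovering.
  std::vector<int> rejected_by(int osd) {
    granted_.erase(osd);
    std::vector<int> notify(granted_.begin(), granted_.end());
    granted_.clear();
    return notify;
  }

  std::size_t held() const { return granted_.size(); }

 private:
  std::set<int> granted_;
};
```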

Fixes: #7922
Signed-off-by: David Zafman <david.zafman@inktank.com>
11 years agodoc/release-notes: v0.79 release notes
Sage Weil [Fri, 4 Apr 2014 01:28:12 +0000 (18:28 -0700)]
doc/release-notes: v0.79 release notes

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoFix byte-order dependency in calculation of initial challenge
Dan Mick [Thu, 3 Apr 2014 20:59:59 +0000 (13:59 -0700)]
Fix byte-order dependency in calculation of initial challenge

Fixes: #7977
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoReplicatedPG::_delete_oid: adjust num_object_clones 1614/head
Samuel Just [Thu, 3 Apr 2014 17:13:57 +0000 (10:13 -0700)]
ReplicatedPG::_delete_oid: adjust num_object_clones

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::agent_choose_mode: improve debugging
Samuel Just [Wed, 2 Apr 2014 22:53:00 +0000 (15:53 -0700)]
ReplicatedPG::agent_choose_mode: improve debugging

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1599 from ceph/wip-7978
Sage Weil [Fri, 4 Apr 2014 00:44:13 +0000 (17:44 -0700)]
Merge pull request #1599 from ceph/wip-7978

rgw: only look at next placement rule if we're not at the last rule

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agorgw: only look at next placement rule if we're not at the last rule 1599/head
Yehuda Sadeh [Thu, 3 Apr 2014 22:15:41 +0000 (15:15 -0700)]
rgw: only look at next placement rule if we're not at the last rule

Fixes: #7978
We tried to move to the next placement rule, but we were already at the
last one, so we ended up looping forever.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoReplicatedPG::agent_choose_mode: use num_user_objects for target_max_bytes calc
Samuel Just [Wed, 2 Apr 2014 22:36:35 +0000 (15:36 -0700)]
ReplicatedPG::agent_choose_mode: use num_user_objects for target_max_bytes calc

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::agent_choose_mode: exclude omap objects for ec base pool
Samuel Just [Wed, 2 Apr 2014 21:19:30 +0000 (14:19 -0700)]
ReplicatedPG::agent_choose_mode: exclude omap objects for ec base pool

Fixes: #7831
Signed-off-by: Samuel Just <sam.just@inktank.com>