git.apps.os.sepia.ceph.com Git - ceph.git/log

commit | commitdiff | tree

Sage Weil [Tue, 17 Jun 2014 00:00:51 +0000 (17:00 -0700)]

mon: ensure HealthService warning(s) include a summary

The low disk space check would change our status to HEALTH_WARN and include
a detail message, but no summary. We need both.

Backport: firefly
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 3ed7f2dd4345633ff34017b201082f4c261ef387)

commit | commitdiff | tree

Sage Weil [Mon, 16 Jun 2014 23:58:14 +0000 (16:58 -0700)]

mon: refactor check_health()

Refactor the get_health() methods to always take both a summary and detail.
Eliminate the return value and pull that directly from the summary, as we
already do with the PaxosServices.

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 82e47db8073b622183a5e33f6e0b999a3a144804)

commit | commitdiff | tree

Sage Weil [Mon, 16 Jun 2014 23:40:05 +0000 (16:40 -0700)]

mon: fix typos, punctuation for mon disk space warning(s)

Backport: firefly
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 98883f6308ce72f69a71feab29ef00e13f319cdb)

Conflicts:

src/mon/DataHealthService.cc

commit | commitdiff | tree

Sage Weil [Mon, 16 Jun 2014 23:27:05 +0000 (16:27 -0700)]

mon/OSDMonitor: make down osd count sensible

We currently log something like

1/10 in osds are down

in the health warning when there are down OSDs, but this is based on a
comparison of the number of up vs the number of in osds, and makes no sense
when there are up osds that are not in.

Instead, count only the number of OSDs that are both down and in (relative to
the total number of OSDs in) and warn about that. This means that, if a
disk fails, and we mark it out, and the cluster fully repairs itself, it
will go back to a HEALTH_OK state.

I think that is a good thing, and certainly preferable to the current
nonsense. If we want to distinguish between down+out OSDs that were failed
vs those that have been "acknowledged" by an admin to be dead, we will
need to add some additional state (possibly reusing the AUTOOUT flag?), but
that will require more discussion.

Backport: firefly (maybe)
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 55a97787088b79356c678a909b2410b3924e7f5b)
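
The counting rule described above can be sketched as follows. This is an illustration in Python, not the monitor's C++ code; the `Osd` record and the `down_in_summary` helper are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Osd:
    """Hypothetical stand-in for one OSD's state in the map."""
    up: bool
    in_cluster: bool


def down_in_summary(osds):
    """Return a health summary string, or None when no 'in' OSD is down."""
    num_in = sum(1 for o in osds if o.in_cluster)
    # Only OSDs that are both down and in count toward the warning.
    num_down_in = sum(1 for o in osds if o.in_cluster and not o.up)
    if num_down_in == 0:
        return None
    return f"{num_down_in}/{num_in} in osds are down"
```

With this rule, an OSD that has failed and been marked out no longer contributes to the warning, which matches the HEALTH_OK outcome the message describes.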

commit | commitdiff | tree

Sage Weil [Mon, 30 Jun 2014 14:05:04 +0000 (07:05 -0700)]

qa/workunits/suites/fsx.sh: don't use zero range

Zero range is not supported by cephfs.

Fixes: #8542
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 2dec8a810060f65d022c06e82090b4aa5ccec0cb)

commit | commitdiff | tree

Loic Dachary [Mon, 30 Jun 2014 15:01:03 +0000 (17:01 +0200)]

Merge pull request #1991 from dachary/wip-8307-erasure-code-profile-implicit-creation

erasure code profile implicit creation (firefly backport)

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>

commit | commitdiff | tree

Loic Dachary [Wed, 11 Jun 2014 20:44:57 +0000 (22:44 +0200)]

erasure-code: pool create must not create profiles

If a nonexistent profile is provided as an argument to osd pool create,
it must exit on error and not create the profile as a side effect.

http://tracker.ceph.com/issues/8307 refs: #8307

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit ff2eb234e63cd281b40405cb3397da5babda943f)

commit | commitdiff | tree

Loic Dachary [Wed, 11 Jun 2014 20:47:49 +0000 (22:47 +0200)]

erasure-code: OSDMonitor::get_erasure_code is a const

If it is not, the non const version of OSDMap::get_erasure_code_profile
is called and a profile is created as a side effect, which is not
intended.

http://tracker.ceph.com/issues/8307 refs: #8307

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 3c638111a4943758b6089c63a42aabbf281ac257)

commit | commitdiff | tree

Loic Dachary [Tue, 27 May 2014 08:06:46 +0000 (10:06 +0200)]

mon: fix set cache_target_full_ratio

It was a noop because it was incorrectly using the variable n. Add a
test to protect against regression.

http://tracker.ceph.com/issues/8440 Fixes: #8440

Reported-by: Geoffrey Hartz <hartz.geoffrey@gmail.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit c2225f874dcf37222d831b65b5a319d598d2fcd9)

commit | commitdiff | tree

Alfredo Deza [Fri, 20 Jun 2014 15:14:25 +0000 (11:14 -0400)]

log the command that is being run with subprocess

Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
(cherry picked from commit e189a668285f9ab73116bc19f9df1cc515473541)
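
As an illustration only, a wrapper of this kind could look like the sketch below. The name `logged_check_call` and the logger setup are assumptions made for the example, not the code from the commit.

```python
import logging
import subprocess

LOG = logging.getLogger(__name__)


def logged_check_call(arguments):
    """Log the full command line, then run it and raise on a non-zero exit."""
    LOG.info("Running command: %s", " ".join(arguments))
    return subprocess.check_call(arguments)
```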

commit | commitdiff | tree

Ilya Dryomov [Thu, 5 Jun 2014 06:08:42 +0000 (10:08 +0400)]

XfsFileStoreBackend: call ioctl(XFS_IOC_FSSETXATTR) less often

No need to call ioctl(XFS_IOC_FSSETXATTR) if extsize is already set to
the value we want or if any extents are allocated - XFS will refuse to
change extsize if that's the case.

Fixes: #8241
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
(cherry picked from commit bc3b30ed09b8f3eb86b61e3a05ccacfd928faa95)
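
The decision described above amounts to two early exits before the ioctl is issued. A minimal sketch in Python, assuming the caller has already read the current extsize and extent count; `should_set_extsize` and its parameters are hypothetical names, not FileStore code.

```python
def should_set_extsize(current_extsize, wanted_extsize, allocated_extents):
    """Decide whether issuing the ioctl is worthwhile."""
    if current_extsize == wanted_extsize:
        return False  # already at the desired value
    if allocated_extents > 0:
        return False  # XFS would refuse the change anyway
    return True
```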

commit | commitdiff | tree

John Spray [Tue, 20 May 2014 15:25:19 +0000 (16:25 +0100)]

mon: Fix default replicated pool ruleset choice

Specifically, in the case where the configured
default ruleset is CEPH_DEFAULT_CRUSH_REPLICATED_RULESET,
instead of assuming ruleset 0 exists, choose the lowest
numbered ruleset.

In the case where an explicit ruleset is passed to
OSDMonitor::prepare_pool_crush_ruleset, verify
that it really exists.

The idea is to eliminate cases where a pool could
exist with its crush ruleset set to something
other than a valid ruleset ID.

Fixes: #8373
Signed-off-by: John Spray <john.spray@inktank.com>
(cherry picked from commit 1d9e4ac2e2bedfd40ee2d91a4a6098150af9b5df)

Conflicts:

src/crush/CrushWrapper.h
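
The selection logic in this commit can be sketched roughly as below. This is a Python illustration under assumptions: `DEFAULT_REPLICATED_RULESET` stands in for CEPH_DEFAULT_CRUSH_REPLICATED_RULESET (its value here is made up for the example), and `choose_ruleset` is a hypothetical helper, not the signature of OSDMonitor::prepare_pool_crush_ruleset.

```python
# Hypothetical sentinel standing in for CEPH_DEFAULT_CRUSH_REPLICATED_RULESET.
DEFAULT_REPLICATED_RULESET = -1


def choose_ruleset(configured, existing_rulesets):
    """Pick a ruleset ID for a new replicated pool.

    Raises ValueError if no usable ruleset exists.
    """
    if not existing_rulesets:
        raise ValueError("no crush rulesets defined")
    if configured == DEFAULT_REPLICATED_RULESET:
        # Do not assume ruleset 0 exists; take the lowest one that does.
        return min(existing_rulesets)
    if configured not in existing_rulesets:
        raise ValueError(f"crush ruleset {configured} does not exist")
    return configured
```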

commit | commitdiff | tree

Sage Weil [Tue, 3 Jun 2014 18:45:20 +0000 (11:45 -0700)]

librados: simplify/fix rados_pool_list bounds checks

We were not breaking out of the loop when we filled up the buffer unless
we happened to do so on a pool name boundary.  This means that len would
roll over (it was unsigned).  In my case, I was not able to reproduce
anything particularly bad since (I think) the strncpy was interpreting the
large unsigned value as signed, but in any case this fixes it, simplifies
the arithmetic, and adds a simple test.

- use a single 'rl' value for the amount of buffer space we want to
  consume
- use this to check that there is room and also as the strncat length
- rely on the initial memset to ensure that the trailing 0 is in place.

Fixes: #8447
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 3ec32a6bb11d92e36a0e6381b40ce2fd1fbb016a)
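
The three bullet points above can be modelled in a few lines. The sketch below is Python rather than the C of librados, so there is no unsigned rollover to demonstrate; it only shows the shape of the simplified check. `pack_pool_names` and its return convention are assumptions made for the example.

```python
def pack_pool_names(names, buf_len):
    """Pack NUL-terminated names into a zero-filled buffer of buf_len bytes.

    Returns (buffer, needed), where needed is the size required to hold
    every name plus one final extra terminator.
    """
    buf = bytearray(buf_len)  # zero-filled, like the initial memset
    needed = sum(len(n.encode()) + 1 for n in names) + 1
    pos = 0
    remaining = buf_len
    for name in names:
        raw = name.encode()
        rl = len(raw) + 1  # bytes this entry consumes, including its NUL
        if rl > remaining:
            # Not enough room: stop here instead of letting the remaining
            # length go "negative" (which would wrap if it were unsigned).
            break
        buf[pos:pos + len(raw)] = raw  # the NUL is already there
        pos += rl
        remaining -= rl
    return bytes(buf), needed
```

Using one value (`rl`) both for the room check and for the amount consumed is what keeps the arithmetic simple.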

commit | commitdiff | tree

Sage Weil [Wed, 25 Jun 2014 19:42:11 +0000 (12:42 -0700)]

Merge pull request #1982 from accelazh/firefly-fix-issue-8256

Make <poolname> in "ceph osd tier --help" clearer (fix issue 8256).

Reviewed-by: Loic Dachary <loic@dachary.org>

commit | commitdiff | tree

Samuel Just [Tue, 3 Jun 2014 23:14:15 +0000 (16:14 -0700)]

OSD::calc_priors_during: handle CRUSH_ITEM_NONE correctly

Fixes: #8507
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 0bd6f6797c69af9aff851033c57c42121671c684)

Conflicts:
src/osd/OSD.cc

commit | commitdiff | tree

Samuel Just [Tue, 3 Jun 2014 23:11:32 +0000 (16:11 -0700)]

OSD::calc_priors_during: fix confusing for loop bracing (cosmetic)

Confusing lack of braces is confusing.

Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit d76936b92300be5cc403fd5a36616a2424c7877d)

Conflicts:
src/osd/OSD.cc

commit | commitdiff | tree

Samuel Just [Tue, 24 Jun 2014 17:11:21 +0000 (10:11 -0700)]

rados.cc: fix pool alignment check

Only check pool alignment if io_ctx is initialized.

Introduced in 304b08a23a3db57010078046955a786fe3589ef8
Fixes: #8652
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit d7350a3741bf4cdb270c6361e68090fe280cf36d)

Conflicts:
src/tools/rados/rados.cc

commit | commitdiff | tree

Sage Weil [Tue, 17 Jun 2014 20:33:14 +0000 (13:33 -0700)]

osd: fix filestore perf stats update

Update the struct we are about to send, not the (unlocked!) one we will
send the next time around.

Backport: firefly, dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4afffb4a10a0bbf7f2018ef3ed6b167c7921e46b)

commit | commitdiff | tree

Greg Farnum [Thu, 24 Apr 2014 22:34:24 +0000 (15:34 -0700)]

FileStore: set XATTR_NO_SPILL_OUT when creating new files.

Fixes: #8205
Backport: firefly

Signed-off-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit e3b995e1567f3ccc6d00ae27ab2aa99ca157228a)

commit | commitdiff | tree

Haomai Wang [Sat, 7 Jun 2014 06:32:23 +0000 (14:32 +0800)]

FileStore: make _clone() copy spill out marker

Previously we were not doing so, and that resulted in unpredictable loss
of xattrs from the client's perspective.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit 239476a92849159d2a8966d90ca055c116bee91e)

commit | commitdiff | tree

Loic Dachary [Wed, 18 Jun 2014 15:01:54 +0000 (17:01 +0200)]

erasure-code: verify that rados put enforces alignment

http://tracker.ceph.com/issues/8622 refs: #8622

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit b46c4056014dd6de5e3bd736f2c41f096ea708b4)

commit | commitdiff | tree

Lluis Pamies-Juarez [Wed, 18 Jun 2014 17:00:09 +0000 (10:00 -0700)]

enforce rados put alignment

Signed-off-by: Lluis Pamies-Juarez <lluis.pamies-juarez@hgst.com>
(cherry picked from commit 304b08a23a3db57010078046955a786fe3589ef8)

commit | commitdiff | tree

Sage Weil [Fri, 6 Jun 2014 20:31:29 +0000 (13:31 -0700)]

osd/OSDMap: do not require ERASURE_CODE feature of clients

Just because an EC pool exists in the cluster does not mean that the client
has to support the feature:

1) The way client IO is initiated is no different for EC pools than for
   replicated pools.
2) People may add an EC pool to an existing cluster with old clients and
   locking those old clients out is very rude when they are not using the
   new pool.
3) The only direct client user of EC pools right now is rgw, and the new
   versions already need to support various other features like CRUSH_V2
   in order to work.  These features are present in new kernels.

Fixes: #8556
Backport: firefly
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 3fe1699f9620280d0070cfe6f01cfeb2332e7470)

commit | commitdiff | tree

Sage Weil [Thu, 12 Jun 2014 23:44:53 +0000 (16:44 -0700)]

osd/OSDMap: make get_features() take an entity type

Make the helper that returns what features are required of the OSDMap take
an entity type argument, as the required features may vary between
components in the cluster.

Backport: firefly
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 250677c965365edf3ecd24ef73700fc6d992ea42)
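
Taken together, the two osd/OSDMap commits above make the required feature set depend on who is asking. A minimal sketch in Python, assuming string entity types and feature names chosen for the example; `required_features` is a hypothetical helper, not the real OSDMap::get_features() signature.

```python
def required_features(entity_type, has_erasure_coded_pool):
    """Return the set of feature names an entity must support for this map."""
    features = set()
    # Clients are not locked out just because an EC pool exists somewhere.
    if has_erasure_coded_pool and entity_type != "client":
        features.add("ERASURE_CODE")
    return features
```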

commit | commitdiff | tree

Haomai Wang [Wed, 21 May 2014 10:12:22 +0000 (18:12 +0800)]

Avoid extra check for clean object

There is no need to check a clean object via its buffer state; skip clean objects.

Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
(cherry picked from commit f51e33bd9c5a8e1cfc7065b30785696dc45918bc)

commit | commitdiff | tree

Accela Zhao [Wed, 18 Jun 2014 09:17:03 +0000 (17:17 +0800)]

Make <poolname> in "ceph osd tier --help" clearer.

The ceph osd tier --help info on the left always says <poolname>.
It is unclear which one corresponds to the <tierpool> on the right.

$ceph osd tier --help
osd tier add <poolname> <poolname> {--   add the tier <tierpool> to base pool
force-nonempty}                          <pool>
osd tier add-cache <poolname>            add a cache <tierpool> of size <size>
<poolname> <int[0-]>                     to existing pool <pool>
...

This patch modifies the description on the right to say which <poolname> is meant:

osd tier add <poolname> <poolname> {--   add the tier <tierpool> (the second
force-nonempty}                          one) to base pool <pool> (the first
                                           one)
...

Fix: http://tracker.ceph.com/issues/8256

Signed-off-by: Yilong Zhao <accelazh@gmail.com>

commit | commitdiff | tree

Sage Weil [Mon, 16 Jun 2014 16:25:32 +0000 (09:25 -0700)]

Merge pull request #1962 from dachary/wip-8599-ruleset-firefly

mon: pool set <pool> crush_ruleset must not use rule_exists (firefly)

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

John Spray [Tue, 20 May 2014 15:50:18 +0000 (16:50 +0100)]

mon: pool set <pool> crush_ruleset must not use rule_exists

Implement CrushWrapper::ruleset_exists that iterates over the existing
rulesets to find the one matching the ruleset argument.

ceph osd pool set <pool> crush_ruleset must not use
CrushWrapper::rule_exists, which checks for a *rule* existing, whereas
the value being set is a *ruleset*.

(cherry picked from commit fb504baed98d57dca8ec141bcc3fd021f99d82b0)

A test via ceph osd pool set data crush_ruleset verifies the ruleset
argument is accepted.

http://tracker.ceph.com/issues/8599 fixes: #8599

Backport: firefly, emperor, dumpling
Signed-off-by: John Spray <john.spray@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
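
The difference between the two checks is easy to see in a small model. In the sketch below (Python, for illustration), a CRUSH map is reduced to a list of rule slots, each either `None` or a dict carrying a `ruleset` field; that representation and both function bodies are assumptions, not the CrushWrapper implementation.

```python
def rule_exists(rules, rule_id):
    """True if the slot with this rule ID is occupied."""
    return 0 <= rule_id < len(rules) and rules[rule_id] is not None


def ruleset_exists(rules, ruleset):
    """True if any existing rule belongs to the given ruleset."""
    return any(r is not None and r["ruleset"] == ruleset for r in rules)
```

For `rules = [{"ruleset": 0}, None, {"ruleset": 7}]`, ruleset 7 exists even though there is no rule with ID 7, which is the case the old check rejected.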

commit | commitdiff | tree

Sage Weil [Mon, 9 Jun 2014 03:18:49 +0000 (20:18 -0700)]

init-ceph: continue after failure doing osd data mount

If we are starting many daemons and hit an error, we normally note it and
move on. Do the same when doing the pre-mount step.

Fixes: #8554
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 6a7e20147cc39ed4689809ca7d674d3d408f2a17)

commit | commitdiff | tree

Steve Taylor [Tue, 10 Jun 2014 18:42:55 +0000 (12:42 -0600)]

Fix for bug #6700

When preparing OSD disks with colocated journals, the initialization process
fails when using dmcrypt. The kernel fails to re-read the partition table after
the storage partition is created because the journal partition is already in use
by dmcrypt. This fix unmaps the journal partition from dmcrypt and allows the
partition table to be read.

Signed-off-by: Stephen F Taylor <steveftaylor@gmail.com>
(cherry picked from commit 673394702b725ff3f26d13b54d909208daa56d89)

commit | commitdiff | tree

John Wilkins [Thu, 5 Jun 2014 18:29:20 +0000 (11:29 -0700)]

doc: Added Disable requiretty commentary.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>

commit | commitdiff | tree

Samuel Just [Fri, 16 May 2014 23:56:33 +0000 (16:56 -0700)]

ReplicatedPG::start_flush: fix clone deletion case

dsnapc.snaps will be non-empty most of the time if there
have been snaps before prev_snapc. What we really want to
know is whether there are any snaps between oi.snaps.back()
and prev_snapc.

Fixes: 8334
Backport: firefly
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 29f615b7ac9e92f77cdef9927070727fee9d5e33)

commit | commitdiff | tree

Samuel Just [Mon, 12 May 2014 22:08:07 +0000 (15:08 -0700)]

ReplicatedPG::start_flush: send delete even if there are no snaps

Even if all snaps for the clone have been removed, we still have to
send the delete to ensure that when the object is recreated the
new snaps aren't included in the wrong clone.

Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 2ec2182745fa7c22526a7cf3dedb25bc314c9db4)

commit | commitdiff | tree

Samuel Just [Fri, 16 May 2014 03:53:27 +0000 (20:53 -0700)]

HashIndex: in cleanup, interpret missing dir as completed merge

If we stop between unlinking the empty subdir and removing the root
merge marker, we get ENOENT on the get_info. That's actually fine.

Backport: firefly
Fixes: 8332
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 5ff95dbdd2dbb533d344f37fea722ca4f140e670)

commit | commitdiff | tree

Alfredo Deza [Wed, 28 May 2014 15:48:12 +0000 (11:48 -0400)]

add backport of collections.Counter for python2.6

Using Raymond Hettinger's MIT backport

Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
(cherry picked from commit 23b75b550507438c79b3aa75e06721e5f7b134a4)

commit | commitdiff | tree

Ailing [Wed, 28 May 2014 19:37:48 +0000 (12:37 -0700)]

rest-api: key missing for per "rx" and "rwx"

Commits 85a1cf31e6 and db266a3fb2 introduced the new perms "rx" and "rwx", but the corresponding keys are missing from permmap.

Signed-off-by: Ailing Zhang <ailzhang@cisco.com>
(cherry picked from commit 0b5a67410793ec28cac47e6e44cbbcf5684d77e7)

commit | commitdiff | tree

Greg Farnum [Thu, 22 May 2014 04:41:23 +0000 (21:41 -0700)]

cephfs-java: build against older jni headers

Older versions of the JNI interface expected non-const parameters
to their memory move functions. It's unpleasant, but won't actually
change the memory in question, to do a const_cast in order to satisfy
those older headers. (And even if it *did* modify the memory, that
would be okay given our single user.)

Signed-off-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit 4d4b77e5b6b923507ec4a0ad9d5c7018e4542a3c)

commit | commitdiff | tree

Ilya Dryomov [Fri, 16 May 2014 15:03:13 +0000 (19:03 +0400)]

OSDMonitor: set next commit in mon primary-affinity reply

Commit 8c5c55c8b47e ("mon: set next commit in mon command replies")
fixed MMonCommand replies to include the right version, but the
primary-affinity handler was authored before that. Fix it.

Backport: firefly
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
(cherry picked from commit a78b14ec1769ef37bef82bfda6faabb581b4cd7d)

commit | commitdiff | tree

Dmitry Smirnov [Mon, 12 May 2014 04:08:44 +0000 (14:08 +1000)]

prioritise use of `javac` executable (gcj provides it through alternatives).

On Debian this fixes FTBFS when gcj-jdk and openjdk-7-jdk are installed at
the same time because build system will use default `javac` executable
provided by current JDK through `update-alternatives` instead of blindly
calling GCJ when it is present.

Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
(cherry picked from commit 8b682d167e4535df582f1c77542e2b1ea0981228)

commit | commitdiff | tree

Dmitry Smirnov [Mon, 12 May 2014 04:02:53 +0000 (14:02 +1000)]

pass '-classpath' option (gcj/javah ignores CLASSPATH environment variable).

This should not affect OpenJDK which understands '-classpath' as well.

With gcj-jdk we still get FTBFS later:

~~~~
    java/native/libcephfs_jni.cc:2878:55: error: invalid conversion from 'const jbyte* {aka const signed char*}' to 'jbyte* {aka signed char*}' [-fpermissive]
                 reinterpret_cast<const jbyte*>(rawAddress));
                                                           ^
    In file included from java/native/libcephfs_jni.cc:27:0:
    /usr/lib/gcc/x86_64-linux-gnu/4.8/include/jni.h:1471:8: error:   initializing argument 4 of 'void _Jv_JNIEnv::SetByteArrayRegion(jbyteArray, jsize, jsize, jbyte*)' [-fpermissive]
       void SetByteArrayRegion (jbyteArray val0, jsize val1, jsize val2, jbyte * val3)
            ^
    make[5] *** [java/native/libcephfs_jni_la-libcephfs_jni.lo] Error 1
~~~~

Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
(cherry picked from commit 89fe0353582bde7e2fbf32f1626d430a20002dd0)

commit | commitdiff | tree

Dmitry Smirnov [Mon, 12 May 2014 03:57:20 +0000 (13:57 +1000)]

look for "jni.h" in gcj-jdk path, needed to find "jni.h" with gcj-jdk_4.9.0

Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
(cherry picked from commit 0f4120c0115e7977ae7c03458addcc2b2916db07)

commit | commitdiff | tree

Sage Weil [Thu, 8 May 2014 15:52:51 +0000 (08:52 -0700)]

ceph-disk: partprobe before settle when preparing dev

Two users have reported this fixes a problem with using --dmcrypt.

Fixes: #6966
Tested-by: Eric Eastman <eric0e@aol.com>
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 0f196265f049d432e399197a3af3f90d2e916275)

commit | commitdiff | tree

Greg Farnum [Tue, 13 May 2014 20:15:28 +0000 (13:15 -0700)]

test: fix some templates to match new output code

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 00225d739cefa1415524a3de45fb9a5a2db53018)

commit | commitdiff | tree

Greg Farnum [Thu, 15 May 2014 23:50:43 +0000 (16:50 -0700)]

OSD: fix an osdmap_subscribe interface misuse

When calling osdmap_subscribe, you have to pass an epoch newer than the
current map's. _maybe_boot() was not doing this correctly -- we would
fail a check for being *in* the monitor's existing map range, and then
pass along the map prior to the monitor's range. But if we were exactly
one behind, that value would be our current epoch, and the request would
get dropped. So instead, make sure we are not *in contact* with the monitor's
existing map range.

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 290ac818696414758978b78517b137c226110bb4)
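
One way to read the fix is as a small change to which epoch gets requested. The sketch below is Python and purely illustrative: `epoch_to_request`, its parameters, and the exact comparison are assumptions derived from the message, not the code in OSD::_maybe_boot().

```python
def epoch_to_request(current_epoch, oldest_mon_epoch):
    """Choose an epoch to subscribe from; it must be newer than ours."""
    if current_epoch + 1 >= oldest_mon_epoch:
        # We are in contact with the monitor's range: ask for the next map.
        return current_epoch + 1
    # We are well behind: ask for the map just before the monitor's range.
    return oldest_mon_epoch - 1
```

When the OSD is exactly one epoch behind the monitor's oldest map, this returns `current_epoch + 1` rather than `current_epoch`, so the request is not dropped.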

commit | commitdiff | tree

Sage Weil [Mon, 19 May 2014 17:32:12 +0000 (10:32 -0700)]

osd: skip out of order op checks on tiered pools

When we send redirected ops, we do not assign a new tid, which means that
a given client's ops for a pool may not have strictly ordered tids. Skip
this check if the pool is tiered to avoid false positives.

Fixes: #8380
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit cf2b172c843da0599164901956b66c306a59e570)

commit | commitdiff | tree

Samuel Just [Tue, 6 May 2014 18:50:14 +0000 (11:50 -0700)]

ReplicatedPG: block scrub on blocked object contexts

Fixes: #8011
Backport: firefly
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 7411477153219d66625a74c5886530029c516036)

commit | commitdiff | tree

Guang Yang [Fri, 9 May 2014 09:21:23 +0000 (09:21 +0000)]

msg: Fix inconsistent message sequence negotiation during connection reset

Backport: firefly, emperor, dumpling

Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit bdee119076dd0eb65334840d141ccdf06091e3c9)

commit | commitdiff | tree

Samuel Just [Tue, 15 Apr 2014 19:55:47 +0000 (12:55 -0700)]

Objecter::_op_submit: only replace the tid if it's 0

Otherwise, redirected ops will suddenly have a different tid
and will become uncancelable.

Fixes: #7588
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 76568aa0db4e16ac1af8fe6405edade1e61cbc81)
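
A toy model shows why the tid has to survive a resubmit. The class below is a Python sketch with hypothetical names (`ToyObjecter`, dict-based ops); it is not the Objecter API.

```python
import itertools


class ToyObjecter:
    """Toy model: assigns tids and tracks in-flight ops by tid."""

    def __init__(self):
        self._next_tid = itertools.count(1)
        self.inflight = {}

    def op_submit(self, op):
        # Only assign a tid to brand-new ops; a redirected op keeps its tid
        # so that the handle the caller holds still refers to it.
        if op["tid"] == 0:
            op["tid"] = next(self._next_tid)
        self.inflight[op["tid"]] = op
        return op["tid"]

    def cancel(self, tid):
        """Cancel by tid; returns True if the op was found."""
        return self.inflight.pop(tid, None) is not None
```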

commit | commitdiff | tree

Sage Weil [Thu, 8 May 2014 17:42:42 +0000 (10:42 -0700)]

mon/OSDMonitor: force op resend when pool overlay changes

If a client is sending a sequence of ops (say, a, b, c, d) and partway
through that sequence it receives an OSDMap update that changes the
overlay, the ops will get sent to different pools, and the replies will
come back completely out of order.

To fix this, force a resend of all outstanding ops any time the overlay
changes.

Fixes: #8305
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 63d92ab0969357f78fdade749785136a509bc81b)

commit | commitdiff | tree

Sage Weil [Thu, 8 May 2014 17:50:51 +0000 (10:50 -0700)]

osd: discard client ops sent before last_force_op_resend

If an op is sent before last_force_op_resend, and the client's feature is
present, drop the op because we know they will resend.

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 45e79a17a932192995f8328ae9f6e8a2a6348d10)

commit | commitdiff | tree

Sage Weil [Thu, 8 May 2014 17:52:11 +0000 (10:52 -0700)]

osdc/Objecter: resend ops in the last_force_op_resend epoch

If we are a client, and process a map that sets last_force_op_resend to
the current epoch, force a resend of this op.

If the OSD expects us to do this, it will discard our previous op. If the
OSD is old, it will process the old one, this will appear as a dup, and we
are no worse off than before.

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit dd700bdf7115223cb3e517b851f462d75dd76a2b)
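
The OSD-side and client-side rules from this commit and the "osd: discard client ops sent before last_force_op_resend" commit fit in two predicates. The sketch is Python and the function names and parameters are hypothetical; only the conditions themselves come from the commit messages.

```python
def osd_should_discard(op_sent_epoch, last_force_op_resend, client_has_feature):
    """OSD side: drop an op we know the client is going to resend."""
    return client_has_feature and op_sent_epoch < last_force_op_resend


def client_should_resend(map_epoch, last_force_op_resend):
    """Client side: resend when the map being processed sets the marker."""
    return last_force_op_resend == map_epoch
```

An old OSD never discards, so the resent op shows up as a dup, which is the "no worse off than before" case in the message.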

commit | commitdiff | tree

Sage Weil [Thu, 8 May 2014 17:40:10 +0000 (10:40 -0700)]

osd/osd_types: add last_force_op_resend to pg_pool_t

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 3152faf79f498a723ae0fe44301ccb21b15a96ab)

commit | commitdiff | tree

Sage Weil [Fri, 9 May 2014 16:20:34 +0000 (09:20 -0700)]

osd: handle race between osdmap and prepare_to_stop

If we get a MOSDMarkMeDown message and set service.state == STOPPING, we
kick the prepare_to_stop() thread. Normally, it will wake up and then
set osd.state == STOPPING, and when we process the map message next we
will not warn. However, if dispatch() takes the lock instead and processes
the map, it will fail the preparing_to_stop check and issue a spurious
warning.

Fix by checking for either preparing_to_stop or stopping.

Fixes: #8319
Backport: firefly, emperor, dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 6b858be0676f937a99dbd51321497f30c3a0097f)

commit | commitdiff | tree

Sage Weil [Sat, 10 May 2014 17:29:11 +0000 (10:29 -0700)]

osd/ReplicatedPG: do not queue NULL dup_op

We call start_flush() with a NULL op in a couple different places. Do not
put a NULL pointer on the dup_ops list or we will crash later.

Fixes: #8328
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 0d67f9b0695765824bdc4a65fbed88edf8ea232e)

commit | commitdiff | tree

Jenkins [Mon, 12 May 2014 15:12:54 +0000 (15:12 +0000)]

0.80.1

commit | commitdiff | tree

Jenkins [Mon, 12 May 2014 15:11:33 +0000 (15:11 +0000)]

0.80.1

commit | commitdiff | tree

Jenkins [Mon, 12 May 2014 15:10:56 +0000 (15:10 +0000)]

0.80.1

commit | commitdiff | tree

Jenkins [Mon, 12 May 2014 15:09:01 +0000 (15:09 +0000)]

0.80.1

commit | commitdiff | tree

Samuel Just [Fri, 2 May 2014 23:21:26 +0000 (16:21 -0700)]

Revert "ReplicatedPG: block scrub on blocked object contexts"

This reverts commit e66f2e36c06ca00c1147f922d3513f56b122a5c0.
Reviewed-by: Sage Weil <sage@inktank.com>
0f3235d46c8fd6c537bd4aa8a3faec6c00f311a8 is the firefly commit
corresponding to e66f2e36c06ca00c1147f922d3513f56b122a5c0.

(cherry picked from commit 84728058dbb91b8ed062240b3373b18078f0c9ca)

commit | commitdiff | tree

Yehuda Sadeh [Tue, 6 May 2014 23:55:27 +0000 (16:55 -0700)]

rgw: fix stripe_size calculation

Fixes: #8299
Backport: firefly
The stripe size calculation was broken, specifically affecting cases
where we had a manifest that described multiple parts.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit 9968b938b5d47fdf3a67db134bd2ea6bf3a28086)

commit | commitdiff | tree

Yehuda Sadeh [Tue, 6 May 2014 18:06:29 +0000 (11:06 -0700)]

rgw: cut short object read if a chunk returns error

Fixes: #8289
Backport: firefly, dumpling
When reading an object, if we hit an error when trying to read one of
the rados objects then we should just stop. Otherwise we're just going
to continue reading the rest of the object, and since it can't be sent
back to the client (as we have a hole in the middle), we end up
accumulating everything in memory.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit 03b0d1cfb7bd30a77fedcf75eb06476b21b14e95)
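
The behaviour described above is an early return from the chunk loop. A minimal sketch in Python, assuming each chunk is fetched by a callable that returns `(ret, data)` with a negative `ret` on error; `read_object` and that calling convention are hypothetical, not the rgw code.

```python
def read_object(chunk_readers, send):
    """Read chunks in order, forwarding each one; stop at the first error."""
    for read_chunk in chunk_readers:
        ret, data = read_chunk()
        if ret < 0:
            return ret  # do not read (or buffer) the remaining chunks
        send(data)
    return 0
```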

commit | commitdiff | tree

Yehuda Sadeh [Mon, 21 Apr 2014 22:34:04 +0000 (15:34 -0700)]

rgw: send user manifest header field

Fixes: #8170
Backport: firefly
If user manifest header exists (swift) send it as part of the object
header data.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit 5cc5686039a882ad345681133c9c5a4a2c2fd86b)

commit | commitdiff | tree

Yan, Zheng [Fri, 11 Apr 2014 07:03:37 +0000 (15:03 +0800)]

client: add asok command to kick sessions that were remote reset

Fixes: #8021
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
(cherry picked from commit 09a1bc5a4601d356b9cc69be8541e6515d763861)

commit | commitdiff | tree

Sage Weil [Fri, 18 Apr 2014 20:50:11 +0000 (13:50 -0700)]

osd: throttle snap trimming with simple delay

This is not particularly smart, but it is *a* knob that lets you make
the snap trimmer slow down. It's a float and a simple delay, so it is
adjustable at runtime. Default is 0 (no change in behavior).

Partial solution for #6278.

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4413670d784efc2392359f0f22bca7c9056188f4)

commit | commitdiff | tree

Sage Weil [Fri, 2 May 2014 21:48:35 +0000 (14:48 -0700)]

mon/MonClient: remove stray _finish_hunting() calls

Calling _finish_hunting() clears out the bool hunting flag, which means we
don't retry by connecting to another mon periodically.  Instead, we send
keepalives every 10s.  But, since we aren't yet in state HAVE_SESSION, we
don't check that the keepalives are getting responses.  This means that an
ill-timed connection reset (say, after we get a MonMap, but before we
finish authenticating) can drop the monc into a black hole that does not
retry.

Instead, we should *only* call _finish_hunting() when we complete the
authentication handshake.

Fixes: #8278
Backport: firefly, dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit 77a6f0aefebebf057f02bfb95c088a30ed93c53f)
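
The failure mode is easier to follow with a toy state machine. The class below is a Python sketch with hypothetical names and string return values; it models only the hunting flag, not the real MonClient.

```python
class ToyMonClient:
    """Toy model of the hunting flag; not the real MonClient."""

    def __init__(self):
        self.hunting = True  # still looking for a usable monitor
        self.have_session = False

    def handle_monmap(self):
        # Receiving a MonMap must NOT end hunting: the connection may
        # still reset before authentication completes.
        pass

    def handle_auth_complete(self):
        # The only place where hunting ends in this model.
        self.have_session = True
        self.hunting = False

    def tick(self):
        """Periodic timer: retry another mon while hunting, else keepalive."""
        return "try_another_mon" if self.hunting else "send_keepalive"
```

If `handle_monmap()` cleared `hunting`, `tick()` would switch to keepalives before a session exists, and nothing would notice that they go unanswered.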

commit | commitdiff | tree

Sage Weil [Fri, 2 May 2014 23:41:26 +0000 (16:41 -0700)]

osd/ReplicatedPG: fix trim of in-flight hit_sets

We normally need to stat the hit_set to know how many bytes to adjust the
stats by. If the hit_set was just written, we will get ENOENT.

Get the obc instead, which will either get the in-memory copy (because the
repop is still in flight) or load it off of disk.

Fixes: #8283
Backport: firefly
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 72fdd557c35cb721d4b502c5a8f68c878f11a19c)

commit | commitdiff | tree

Sage Weil [Tue, 6 May 2014 18:01:27 +0000 (11:01 -0700)]

osd/ReplicatedPG: fix whiteouts for other cache mode

We were special casing WRITEBACK mode for handling whiteouts; this needs to
also include the FORWARD and READONLY modes. To avoid having to list
specific cache modes, though, just check != NONE.

Fixes: #8296
Backport: firefly
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 3e387d62ed95898db8a7d7163c2bacc363b8f617)

commit | commitdiff | tree

Sage Weil [Thu, 1 May 2014 23:53:17 +0000 (16:53 -0700)]

osd: Prevent divide by zero in agent_choose_mode()

Fixes: #8175
Backport: firefly

Signed-off-by: David Zafman <david.zafman@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit f47f867952e6b2a16a296c82bb9b585b21cde6c8)

commit | commitdiff | tree

David Zafman [Tue, 22 Apr 2014 06:52:04 +0000 (23:52 -0700)]

osd, common: If agent_work() finds no objs to work on delay 5 (default) secs

Add config osd_agent_delay_time of 5 seconds
Honor delay by ignoring agent_choose_mode() calls
Add tier_delay to logger
Treat restart after delay like we were previously idle

Fixes: #8113
Backport: firefly

Signed-off-by: David Zafman <david.zafman@inktank.com>
(cherry picked from commit b7d31e5f5952c631dd4172bcb825e77a13fc60bc)

commit | commitdiff | tree

David Zafman [Fri, 2 May 2014 01:54:30 +0000 (18:54 -0700)]

osd/ReplicatedPG: agent_work() fix next if finished early due to start_max

Backport: firefly

Signed-off-by: David Zafman <david.zafman@inktank.com>
(cherry picked from commit 9cf470cac8dd4d8f769e768f2de6b9eb67a3c3af)

commit | commitdiff | tree

Haomai Wang [Sat, 3 May 2014 04:53:06 +0000 (12:53 +0800)]

Fix clone problem

When a clone happens, the original header is also updated in GenericObjectMap,
so the new header wrapper (StripObjectHeader) should be updated too.

Fix #8282
Signed-off-by: Haomai Wang <haomaiwang@gmail.com>
(cherry picked from commit 3aee1e0ffe0583f74c02d9c9e86c7fb267f3515c)

commit | commitdiff | tree

Jenkins [Tue, 6 May 2014 14:03:28 +0000 (14:03 +0000)]

0.80

commit | commitdiff | tree

Sage Weil [Sat, 3 May 2014 22:11:58 +0000 (15:11 -0700)]

Merge pull request #1763 from ceph/wip-blacklist

Wip blacklist

Backport: firefly
Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Yan, Zheng [Sat, 3 May 2014 21:17:15 +0000 (05:17 +0800)]

osd: check blacklisted clients in ReplicatedPG::do_op()

The OSD checks whether a client is blacklisted only when it receives an OSD
request. It is possible for the request's sender to be blacklisted while the
request is sitting in a waiting list.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
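
A small sketch of the idea, assuming the blacklist can be modelled as a set of client addresses. The types and member names are hypothetical and do not reflect the real ReplicatedPG interface; the point is only that the check runs again when the op is executed, not just when it arrives.

```cpp
#include <set>
#include <string>

// Hypothetical request type carrying the sender's address.
struct OpSketch {
  std::string client_addr;
};

// Hypothetical PG that consults a shared blacklist at execution time.
struct PGSketch {
  const std::set<std::string>* blacklist;

  // Returns true if the op would be executed, false if it is dropped
  // because the sender was blacklisted while the op was queued.
  bool do_op(const OpSketch& op) const {
    if (blacklist->count(op.client_addr) > 0) {
      return false;
    }
    return true;
  }
};
```

An op that passed the check on receipt is therefore still rejected if its sender is added to the set before do_op() runs.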

commit | commitdiff | tree

Sage Weil [Sat, 3 May 2014 14:52:08 +0000 (07:52 -0700)]

ceph-object-corpus: v0.80-rc1-35-g4812150

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 2 May 2014 22:10:43 +0000 (15:10 -0700)]

mon/PGMonitor: set tid on no-op PGStatsAck

The OSD needs to know the tid. Both generally, and specifically because
the flush_pg_stats may be blocking on it.

Fixes: #8280
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit 5a6ae2a978dcaf96ef89de3aaa74fe951a64def6)

commit | commitdiff | tree

Sage Weil [Fri, 2 May 2014 22:00:11 +0000 (15:00 -0700)]

mon/OSDMonitor: share latest map with osd on dup boot message

If we get a dup boot message, share the newer maps with the osd so that
they know they are living in the past.

Fixes: #8279
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit 2e6b24868da0b203c2d70ac91071166d95d1d851)

commit | commitdiff | tree

Sage Weil [Fri, 2 May 2014 00:08:36 +0000 (17:08 -0700)]

Merge pull request #1751 from ceph/wip-mds-shutdown

mds: remove mdsdir in the final step of shutdown MDS

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Yan, Zheng [Thu, 1 May 2014 21:22:35 +0000 (05:22 +0800)]

mds: remove mdsdir in the final step of shutdown MDS

Otherwise we may get a bad subtree map if we restart the MDS before
the shutdown process finishes.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

commit | commitdiff | tree

Samuel Just [Fri, 18 Apr 2014 00:26:17 +0000 (17:26 -0700)]

ReplicatedPG: block scrub on blocked object contexts

Fixes: #8011
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit e66f2e36c06ca00c1147f922d3513f56b122a5c0)

commit | commitdiff | tree

Samuel Just [Tue, 1 Apr 2014 23:27:20 +0000 (16:27 -0700)]

rados.h,ReplicatedPG: add CEPH_OSD_FLAG_ENFORCE_SNAPC and use on flush

Even with pool snaps, we need to use the snapc provided so that the clones
are written back correctly.

Fixes: #7941
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 499adb1db1cd225c91acce31b5f48fad5145043b)

commit | commitdiff | tree

Samuel Just [Thu, 24 Apr 2014 19:48:44 +0000 (12:48 -0700)]

ECBackend::continue_recovery_op: handle a source shard going down

get_min_avail_to_read_shards might return an error if there are
no longer enough sources to reconstruct the missing shards.
This is possible if osds went down while we were writing the
previous chunk -- we already notice in check_recovery_sources
if a source goes down during a read.

Fixes: #8161
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 1885792c517670086332a8bab237c58558ee6dda)

commit | commitdiff | tree

Samuel Just [Fri, 25 Apr 2014 23:28:38 +0000 (16:28 -0700)]

ReplicatedPG: we can get EAGAIN on missing clone flush

Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 060105c313c5b4a777c55f17115eeb95ebb17117)

commit | commitdiff | tree

Samuel Just [Fri, 11 Apr 2014 01:15:30 +0000 (18:15 -0700)]

ReplicatedPG: do not preserve op context during flush

Any information stashed in the OpContext may be obsolete by the time we
actually mark the object clean.  Instead, let the start_flush caller
clean up its OpContext and in try_flush_mark_clean we'll create a new
one.  The primary reason to keep the OpContext would have been locking,
but we can set the obc as blocking without holding an OpContext, and
that would allow trimming to happen in the meantime (which is good
since trim_object does not respect rw locks since it doesn't change user
visible state).  In try_flush_mark_clean, we requeue the fop->op along
with (but ahead of) the fop->dup_ops.

Fixes: #8068
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit d83b8f58513e86c68aa6c25d4f909d9a3c4a103e)

commit | commitdiff | tree

Yehuda Sadeh [Fri, 25 Apr 2014 21:11:27 +0000 (14:11 -0700)]

rgw: fix url escaping

Fixes: #8202
This fixes the radosgw side of issue #8202. Needed to cast value
to unsigned char, otherwise it'd get padded.

Backport: dumpling

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit bcf92c496aba0dfde432290fc2df5620a2767313)
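
A sketch of why the cast matters, assuming the escaping formats each byte as "%XX" with a printf-style call. The function name is hypothetical and this is not the radosgw code itself. On platforms where plain char is signed, a byte such as 0xE9 is promoted to a negative int, and formatting it as hex then typically yields extra leading "FF" digits; casting to unsigned char first keeps the value in the range 0 to 255.

```cpp
#include <cstdio>
#include <string>

// Percent-encode a single byte as "%XX" (hypothetical helper).
std::string escape_byte(char c) {
  char buf[4];  // '%', two hex digits, terminating NUL
  // Cast through unsigned char so bytes >= 0x80 are not sign-extended.
  std::snprintf(buf, sizeof(buf), "%%%02X",
                static_cast<unsigned int>(static_cast<unsigned char>(c)));
  return std::string(buf);
}
```

For example, escape_byte(' ') gives "%20", and a byte with the high bit set still produces exactly two hex digits.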

commit | commitdiff | tree

Sage Weil [Mon, 28 Apr 2014 19:40:04 +0000 (12:40 -0700)]

doc/release-notes: v0.67.8 notes

Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Yan, Zheng [Sat, 26 Apr 2014 12:49:16 +0000 (20:49 +0800)]

Merge pull request #1729 from ceph/wip-7966

readlink result in respawn

Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>

commit | commitdiff | tree

Sage Weil [Sat, 26 Apr 2014 02:46:24 +0000 (19:46 -0700)]

mds: terminate readlink result in respawn

readlink(2) does not null terminate the buffer; we need to do that.

Fixes: #7966
Signed-off-by: Sage Weil <sage@inktank.com>
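
A minimal sketch of the pattern, not the MDS respawn code itself. The helper name is hypothetical. readlink(2) returns the number of bytes written and does not append a NUL, so the caller leaves room for one and writes it explicitly.

```cpp
#include <climits>
#include <string>
#include <unistd.h>

// Read a symlink target into a std::string (hypothetical helper).
// Returns an empty string on error.
std::string read_link_target(const char* path) {
  char buf[PATH_MAX];
  // Reserve one byte so there is always room for the terminator.
  ssize_t n = ::readlink(path, buf, sizeof(buf) - 1);
  if (n < 0) {
    return std::string();
  }
  buf[n] = '\0';  // readlink() does not do this for us
  return std::string(buf);
}
```

Without the explicit terminator, any code that treats the buffer as a C string would read past the link target into whatever bytes happen to follow it.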

commit | commitdiff | tree

wusui [Fri, 25 Apr 2014 23:20:20 +0000 (16:20 -0700)]

Merge pull request #1727 from ceph/wip-8193

ceph_test_rados_api_tier: increase HitSetTrim timeouts

commit | commitdiff | tree

Sage Weil [Fri, 25 Apr 2014 22:58:47 +0000 (15:58 -0700)]

Merge pull request #1725 from FlorentCoppint/master

Skipping '_netdev' Debian fstab option

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Fri, 25 Apr 2014 22:49:06 +0000 (15:49 -0700)]

ceph_test_rados_api_tier: increase HitSetTrim timeouts

...so that they pass when they get unlucky with thrashing.

This will vastly decrease the probability of failure, but failure will
always be possible when a timeout is in place.

Fixes: #8193
Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

FlorentCoppint [Fri, 25 Apr 2014 07:20:02 +0000 (09:20 +0200)]

Skipping '_netdev' Debian fstab option

Signed-off-by: Florent Bautista <florent@coppint.com>

commit | commitdiff | tree

Loic Dachary [Thu, 24 Apr 2014 20:42:09 +0000 (22:42 +0200)]

Merge pull request #1717 from dachary/wip-auid

mon: add ceph osd pool set <pool> auid

Reviewed-by: Greg Farnum <greg@inktank.com>

commit | commitdiff | tree

wusui [Thu, 24 Apr 2014 20:27:40 +0000 (13:27 -0700)]

Merge pull request #1724 from ceph/wip-uselocalgithubforqemu-wusui

Use new git mirror for qemu-iotests

commit | commitdiff | tree

Warren Usui [Thu, 24 Apr 2014 19:55:26 +0000 (12:55 -0700)]

Use new git mirror for qemu-iotests

Fixes: #8191
Signed-off-by: Warren Usui <warren.usui@inktank.com>

commit | commitdiff | tree

Sage Weil [Thu, 24 Apr 2014 01:00:59 +0000 (18:00 -0700)]

Merge remote-tracking branch 'gh/firefly'

commit | commitdiff | tree

Sage Weil [Thu, 24 Apr 2014 00:23:12 +0000 (17:23 -0700)]

Merge pull request #1720 from jdurgin/wip-list-children-test

test_rbd.py: ignore children in cache pools

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Sage Weil [Wed, 23 Apr 2014 23:07:02 +0000 (16:07 -0700)]

Merge pull request #1719 from ceph/wip-8168

Wip 8168

Reviewed-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Samuel Just [Tue, 22 Apr 2014 23:03:48 +0000 (16:03 -0700)]

ReplicatedPG::do_osd_ops: consider head whiteout in list-snaps

Signed-off-by: Samuel Just <sam.just@inktank.com>

commit | commitdiff | tree

Samuel Just [Tue, 22 Apr 2014 22:12:52 +0000 (15:12 -0700)]

ReplicatedPG::do_op: don't return ENOENT for whiteout on snapdir read

Signed-off-by: Samuel Just <sam.just@inktank.com>
