git.apps.os.sepia.ceph.com Git - ceph.git/log
10 years agolibrbd: complete all pending aio ops prior to closing image 3405/head
Jason Dillaman [Mon, 15 Dec 2014 15:53:53 +0000 (10:53 -0500)]
librbd: complete all pending aio ops prior to closing image

It was possible for an image to be closed while aio operations
were still outstanding.  Now all aio operations are tracked and
completed before the image is closed.

Fixes: #10299
Backport: giant, firefly, dumpling
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
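
The tracking described in this commit can be illustrated with a minimal Python sketch. librbd itself is C++, and the class and method names below are hypothetical, not librbd's API. The idea is to count in-flight operations and have the close path wait until the count reaches zero.

```python
import threading

class PendingOpTracker:
    """Hypothetical tracker: counts in-flight async ops so close can wait."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0

    def start_op(self):
        # Called when an async operation is issued.
        with self._cond:
            self._pending += 1

    def finish_op(self):
        # Called from the operation's completion callback.
        with self._cond:
            self._pending -= 1
            if self._pending == 0:
                self._cond.notify_all()

    def wait_for_all(self, timeout=None):
        """Block until no ops are pending; returns False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout)
```

A close routine would call wait_for_all() before releasing any image state.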
10 years agorgw: change multipart upload id magic
Yehuda Sadeh [Fri, 12 Dec 2014 13:24:01 +0000 (05:24 -0800)]
rgw: change multipart upload id magic

Fixes: #10271
Backport: firefly, giant

Some clients can't sign requests correctly with the original magic
prefix.

Reported-by: Georgios Dimitrakakis <giorgis@acmac.uoc.gr>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 5fc7a0be67a03ed63fcc8408f8d71a31a1841076)

10 years agorgw: url decode http query params correctly
Yehuda Sadeh [Thu, 11 Dec 2014 17:07:10 +0000 (09:07 -0800)]
rgw: url decode http query params correctly

Fixes: #10271
Backport: firefly

This got broken by the fix for #8702. Since we now only url_decode if
we're in query, we need to specify that we're in query when decoding
these args.

Reported-by: Georgios Dimitrakakis <giorgis@acmac.uoc.gr>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 21e07eb6abacb085f81b65acd706b46af29ffc03)

10 years agoqa: ignore duplicates in rados ls
Josh Durgin [Wed, 14 Jan 2015 23:01:38 +0000 (15:01 -0800)]
qa: ignore duplicates in rados ls

These can happen with split or with state changes due to reordering
results within the hash range requested. It's easy enough to filter
them out at this stage.

Backport: giant, firefly
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
(cherry picked from commit e7cc6117adf653a4915fb7a75fac68f8fa0239ec)
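
As an illustration only (the actual qa script may filter differently), duplicates in a listing can be dropped while keeping first-seen order. The function name is hypothetical.

```python
def unique_in_order(names):
    """Drop repeated object names while keeping first-seen order."""
    seen = set()
    result = []
    for name in names:
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result
```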

10 years agoosd: requeue PG when we skip handling a peering event
Sage Weil [Thu, 8 Jan 2015 21:34:52 +0000 (13:34 -0800)]
osd: requeue PG when we skip handling a peering event

If we don't handle the event, we need to put the PG back into the peering
queue or else the event won't get processed until the next event is
queued, at which point we'll be processing events with a delay.

The queue_null is not necessary (and is a waste of effort) because the
event is still in pg->peering_queue and the PG is queued.

Note that this only triggers when we exceed osd_map_max_advance, usually
when there is a lot of peering and recovery activity going on.  A
workaround is to increase that value, but if you exceed osd_map_cache_size
you expose yourself to cache thrashing by the peering work queue, which
can cause serious problems with heavily degraded clusters and bit lots of
people on dumpling.

Backport: giant, firefly
Fixes: #10431
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 492ccc900c3358f36b6b14a207beec071eb06707)

10 years agoIf trusty, use older version of qemu
Warren Usui [Fri, 19 Dec 2014 04:00:28 +0000 (20:00 -0800)]
If trusty, use older version of qemu

Fixes #10319
Signed-off-by: Warren Usui <warren.usui@inktank.com>
(cherry-picked from 46a1a4cb670d30397979cd89808a2e420cef2c11)

10 years agoMerge pull request #3266 from ceph/giant-10415
Sage Weil [Mon, 29 Dec 2014 18:55:22 +0000 (10:55 -0800)]
Merge pull request #3266 from ceph/giant-10415

libcephfs/test.cc: close fd before umount

10 years agolibcephfs/test.cc: close fd before umount 3266/head
Yan, Zheng [Tue, 23 Dec 2014 02:22:00 +0000 (10:22 +0800)]
libcephfs/test.cc: close fd before umount

Fixes: #10415
Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit d3fb563cee4c4cf08ff4ee01782e52a100462429)

10 years agoRemove sepia dependency (use fqdn)
Warren Usui [Wed, 17 Dec 2014 06:01:26 +0000 (22:01 -0800)]
Remove sepia dependency (use fqdn)

Fixes: #10255
Signed-off-by: Warren Usui <warren.usui@inktank.com>
(cherry picked from commit 19dafe164833705225e168a686696fb4e170aba7)

10 years agoMerge pull request #3159 from ceph/wip-10229-giant
Gregory Farnum [Fri, 12 Dec 2014 01:03:07 +0000 (17:03 -0800)]
Merge pull request #3159 from ceph/wip-10229-giant

osdc/Filer: use finisher to execute C_Probe and C_PurgeRange [giant backport]

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoosdc/Filer: use finisher to execute C_Probe and C_PurgeRange 3159/head
Yan, Zheng [Thu, 4 Dec 2014 04:18:47 +0000 (12:18 +0800)]
osdc/Filer: use finisher to execute C_Probe and C_PurgeRange

Currently contexts C_Probe/C_PurgeRange are executed while holding
OSDSession::completion_lock. C_Probe and C_PurgeRange may call
Objecter::stat() and Objecter::remove() respectively, which acquire
Objecter::rwlock. This can cause deadlock because there is an intermediate
dependency between Objecter::rwlock and OSDSession::completion_lock:

 Objecter::rwlock -> OSDSession::lock -> OSDSession::completion_lock

The fix is to execute C_Probe/C_PurgeRange in the finisher thread.

Fixes: #10229
Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit d3ee89ace660161df7796affbf9a70f3d0dedce1)
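
The pattern in this commit can be sketched in Python under stated assumptions: the Finisher class, the two plain locks, and the callback names are stand-ins for illustration, not Ceph's C++ classes. The completion path only enqueues the callback while holding its lock, and a separate thread runs the callback, which is then free to take the other lock.

```python
import queue
import threading

class Finisher:
    """Hypothetical stand-in for a finisher thread: runs queued callbacks
    on its own thread, so callers never run them while holding a lock."""

    def __init__(self):
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def queue_callback(self, callback):
        self._queue.put(callback)

    def _run(self):
        while True:
            callback = self._queue.get()
            if callback is None:      # sentinel: stop the thread
                break
            callback()

    def stop(self):
        self._queue.put(None)
        self._thread.join()


completion_lock = threading.Lock()   # stands in for OSDSession::completion_lock
rwlock = threading.Lock()            # stands in for Objecter::rwlock
finisher = Finisher()
events = []

def probe_callback():
    # Safe to take rwlock here: completion_lock is not held on this thread.
    with rwlock:
        events.append("probe ran")

def on_completion():
    # Only enqueue under completion_lock; do not run the callback inline.
    with completion_lock:
        finisher.queue_callback(probe_callback)

on_completion()
finisher.stop()
```

Because the callback never runs under completion_lock, the lock order completion_lock -> rwlock is never created.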

10 years agoMerge pull request #3151 from ceph/wip-10288-giant
Gregory Farnum [Thu, 11 Dec 2014 18:47:38 +0000 (10:47 -0800)]
Merge pull request #3151 from ceph/wip-10288-giant

mon: fix `fs ls` on peons [giant backport]

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agomon: fix `fs ls` on peons 3151/head
John Spray [Thu, 11 Dec 2014 14:00:57 +0000 (14:00 +0000)]
mon: fix `fs ls` on peons

This was incorrectly using pending_mdsmap instead
of mdsmap.  We didn't notice in test because of
single-mon configurations.

Fixes: #10288
Backport: giant

Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 5559e6aea9e9374ecdac0351777dfd6f5f5d1e67)

10 years agoMerge pull request #3010 from dachary/wip-10018-primary-erasure-code-hinfo-giant
Samuel Just [Mon, 8 Dec 2014 21:19:20 +0000 (13:19 -0800)]
Merge pull request #3010 from dachary/wip-10018-primary-erasure-code-hinfo-giant

osd: deep scrub must not abort if hinfo is missing (giant)

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years agoMerge pull request #3110 from ceph/giant-10263
Gregory Farnum [Mon, 8 Dec 2014 20:36:48 +0000 (12:36 -0800)]
Merge pull request #3110 from ceph/giant-10263

mds: store backtrace for straydir

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agomds: store backtrace for straydir 3110/head
Yan, Zheng [Fri, 7 Nov 2014 03:38:37 +0000 (11:38 +0800)]
mds: store backtrace for straydir

Backport: giant, firefly, emperor, dumpling
Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit 0d89db5d3e5ae5d552d4058a88a4e186748ab1d2)

10 years agoMerge pull request #3088 from dachary/wip-10063-hobject-shard-giant
Sage Weil [Sat, 6 Dec 2014 19:06:20 +0000 (11:06 -0800)]
Merge pull request #3088 from dachary/wip-10063-hobject-shard-giant

common: do not omit shard when ghobject NO_GEN is set (giant)

10 years agoMerge pull request #3095 from dachary/wip-9785-dmcrypt-keys-permissions-giant
Sage Weil [Sat, 6 Dec 2014 01:33:12 +0000 (17:33 -0800)]
Merge pull request #3095 from dachary/wip-9785-dmcrypt-keys-permissions-giant

ceph-disk: dmcrypt file permissions (giant)

10 years agoMerge pull request #3006 from dachary/wip-9420-erasure-code-non-regression-giant
Sage Weil [Sat, 6 Dec 2014 01:30:31 +0000 (17:30 -0800)]
Merge pull request #3006 from dachary/wip-9420-erasure-code-non-regression-giant

 erasure-code: store and compare encoded contents (giant)

10 years agoceph-disk: dmcrypt file permissions 3095/head
Loic Dachary [Thu, 4 Dec 2014 21:21:32 +0000 (22:21 +0100)]
ceph-disk: dmcrypt file permissions

The directory in which key files are stored for dmcrypt must be 700 and
the file 600.

http://tracker.ceph.com/issues/9785 Fixes: #9785

Signed-off-by: Loic Dachary <ldachary@redhat.com>
(cherry picked from commit 58682d1776ab1fd4daddd887d921ca9cc312bf50)
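
A minimal sketch of the permission rule above, assuming a POSIX system. The function name and paths are hypothetical and this is not the ceph-disk code itself.

```python
import os
import stat

def write_key_file(key_dir, name, key_bytes):
    """Create key_dir with mode 0700 and write the key file with mode 0600."""
    os.makedirs(key_dir, mode=0o700, exist_ok=True)
    # Enforce 0700 even if the directory already existed or umask interfered.
    os.chmod(key_dir, stat.S_IRWXU)
    path = os.path.join(key_dir, name)
    # Open with 0600 from the start so a new key file is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(key_bytes)
    # Enforce 0600 if the file already existed with looser permissions.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path
```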

10 years agoMerge pull request #3085 from dachary/wip-10125-radosgw-init-giant
Sage Weil [Fri, 5 Dec 2014 17:03:54 +0000 (09:03 -0800)]
Merge pull request #3085 from dachary/wip-10125-radosgw-init-giant

rgw: run radosgw as apache with systemd (giant)

10 years agocommon: do not omit shard when ghobject NO_GEN is set 3088/head
Loic Dachary [Fri, 14 Nov 2014 00:16:10 +0000 (01:16 +0100)]
common: do not omit shard when ghobject NO_GEN is set

Do not silence the display of shard_id when generation is NO_GEN.
Erasure coded objects JSON representation used by ceph_objectstore_tool
need the shard_id to find the file containing the chunk.

Minimal testing is added to ceph_objectstore_tool.py

http://tracker.ceph.com/issues/10063 Fixes: #10063

Signed-off-by: Loic Dachary <ldachary@redhat.com>
(cherry picked from commit dcf09aed121f566221f539106d10283a09f15cf5)

10 years agorgw: run radosgw as apache with systemd 3085/head
Loic Dachary [Tue, 2 Dec 2014 17:10:48 +0000 (18:10 +0100)]
rgw: run radosgw as apache with systemd

Same as sysv.

http://tracker.ceph.com/issues/10125 Fixes: #10125

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 7b621f4abf63456272dec3449aa108c89504a7a5)

Conflicts:
src/init-radosgw.sysv

10 years agoMerge pull request #3077 from ceph/wip-10030-giant
Josh Durgin [Thu, 4 Dec 2014 19:32:01 +0000 (11:32 -0800)]
Merge pull request #3077 from ceph/wip-10030-giant

librbd: don't close an already closed parent image upon failure

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #3062 from ceph/wip-10123-giant
Sage Weil [Thu, 4 Dec 2014 07:02:43 +0000 (23:02 -0800)]
Merge pull request #3062 from ceph/wip-10123-giant

librbd: protect list_children from invalid child pool IoCtxs

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #3055 from ceph/wip-10135-giant
Gregory Farnum [Wed, 3 Dec 2014 14:44:56 +0000 (06:44 -0800)]
Merge pull request #3055 from ceph/wip-10135-giant

mon: OSDMonitor: allow adding tiers to FS pools

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agomon: OSDMonitor: allow adding tiers to FS pools 3055/head
John Spray [Tue, 25 Nov 2014 16:54:42 +0000 (16:54 +0000)]
mon: OSDMonitor: allow adding tiers to FS pools

This was an overly-strict check.  In fact it is perfectly
fine to set an overlay on a pool that is already in use
as a filesystem data or metadata pool.

Fixes: #10135
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 17b5fc9a40440e76dd1fa64f7fc19577ae3b58ce)

10 years agolibrbd: don't close an already closed parent image upon failure 3077/head
Jason Dillaman [Thu, 6 Nov 2014 10:01:38 +0000 (05:01 -0500)]
librbd: don't close an already closed parent image upon failure

If librbd is not able to open a child's parent image, it will
incorrectly close the parent image twice, resulting in a crash.

Fixes: #10030
Backport: firefly, giant
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 61ebfebd59b61ffdc203dfeca01ee1a02315133e)

10 years agoMerge pull request #2990 from ceph/wip-10151-giant
John Spray [Tue, 2 Dec 2014 11:35:59 +0000 (11:35 +0000)]
Merge pull request #2990 from ceph/wip-10151-giant

mon: fix MDS health status from peons

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agolibrbd: protect list_children from invalid child pool IoCtxs 3062/head
Jason Dillaman [Tue, 18 Nov 2014 02:49:26 +0000 (21:49 -0500)]
librbd: protect list_children from invalid child pool IoCtxs

While listing child images, don't ignore error codes returned
from librados when creating an IoCtx. This will prevent seg
faults from occurring when an invalid IoCtx is used.

Fixes: #10123
Backport: giant, firefly, dumpling
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 0d350b6817d7905908a4e432cd359ca1d36bab50)

10 years agoMerge pull request #3047 from ceph/wip-10011-giant
Gregory Farnum [Tue, 2 Dec 2014 01:59:19 +0000 (17:59 -0800)]
Merge pull request #3047 from ceph/wip-10011-giant

osdc: fix Journaler write error handling [giant backport]

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoosdc: fix Journaler write error handling 3047/head
John Spray [Thu, 6 Nov 2014 11:46:29 +0000 (11:46 +0000)]
osdc: fix Journaler write error handling

Since we started wrapping the write error
handler in a finisher, multiple calls to
handle_write_error would hit the assert()
on the second call before the actual
handler had been called (at the other end
of the finisher) from the first call.

The symptom was that the MDS was intermittently
failing to respawn on blacklist, seen in #10011.

Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 762eda88a18ba707bd5410f38e21e95c4a6b3a46)

10 years agoMerge pull request #3005 from dachary/wip-9665-ceph-disk-partprobe-giant
Sage Weil [Wed, 26 Nov 2014 05:18:59 +0000 (21:18 -0800)]
Merge pull request #3005 from dachary/wip-9665-ceph-disk-partprobe-giant

ceph disk zap must call partprobe

10 years agoosd: deep scrub must not abort if hinfo is missing 3010/head
Loic Dachary [Thu, 6 Nov 2014 16:11:20 +0000 (17:11 +0100)]
osd: deep scrub must not abort if hinfo is missing

Instead it should set read_error.

http://tracker.ceph.com/issues/10018 Fixes: #10018

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 9d84d2e8309d26e39ca849a75166d2d7f2dec9ea)

10 years agoerasure-code: erasure_code_benchmark exhaustive erasure exploration 3006/head
Loic Dachary [Thu, 25 Sep 2014 12:46:07 +0000 (14:46 +0200)]
erasure-code: erasure_code_benchmark exhaustive erasure exploration

Add the --erasure-generation exhaustive flag to try all combinations of
erasures, not just one at random.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 2d7adb23bc52e7c0753f4571fecd8eefa209ef02)

Conflicts:
src/test/erasure-code/ceph_erasure_code_benchmark.h
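
The difference between trying one random set of erasures and trying all of them can be sketched as follows. This is an illustration in Python with a hypothetical function name, not the benchmark's C++ code.

```python
import itertools
import random

def erasure_sets(chunk_count, erasures, exhaustive):
    """Yield tuples of chunk indexes to erase.

    exhaustive=True yields every combination; otherwise one random choice.
    """
    if exhaustive:
        yield from itertools.combinations(range(chunk_count), erasures)
    else:
        yield tuple(sorted(random.sample(range(chunk_count), erasures)))
```

For k=2, m=2 (four chunks) and two erasures, the exhaustive mode visits six combinations instead of one.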

10 years agoerasure-code: add erasure_code_benchmark --verbose
Loic Dachary [Mon, 29 Sep 2014 09:17:13 +0000 (11:17 +0200)]
erasure-code: add erasure_code_benchmark --verbose

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 3ff2816b3eecfb7277295583387549dac5429628)

Conflicts:
src/test/erasure-code/ceph_erasure_code_benchmark.cc
src/test/erasure-code/ceph_erasure_code_benchmark.h

10 years agoerasure_code: implement ceph_erasure_code to assert the existence of a plugin
Loic Dachary [Tue, 23 Sep 2014 12:37:57 +0000 (14:37 +0200)]
erasure_code: implement ceph_erasure_code to assert the existence of a plugin

This is handy when scripting in the context of teuthology, for instance
to run tests for the isa plugin only conditionally.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit efe121d9f2028c312eef2650d32ccf0cbc828edb)

10 years agoerasure-code: ceph_erasure_code does not need to avoid dlclose
Loic Dachary [Tue, 23 Sep 2014 12:36:08 +0000 (14:36 +0200)]
erasure-code: ceph_erasure_code does not need to avoid dlclose

The only reason for not dlclosing plugins at exit is for callgrind but
ceph_erasure_code has no workload that would require callgrind.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 49613cb2aab6e73e3ea50fa164735b55e80121cd)

10 years agoerasure-code: add corpus verification to make check
Loic Dachary [Tue, 23 Sep 2014 09:38:09 +0000 (11:38 +0200)]
erasure-code: add corpus verification to make check

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 6fdbdff2ad1b55d4a37dcb95cfbb06c4454cdaf2)

10 years agoerasure-code: Makefile.am cosmetics
Loic Dachary [Sat, 13 Sep 2014 10:58:27 +0000 (12:58 +0200)]
erasure-code: Makefile.am cosmetics

Cluster benchmark related lines together.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 10c88c8f27080a8e25f128b7065cee5c2f68e91b)

10 years agoerasure-code: s/alignement/alignment/ typos in jerasure
Loic Dachary [Sat, 13 Sep 2014 10:55:26 +0000 (12:55 +0200)]
erasure-code: s/alignement/alignment/ typos in jerasure

The jerasure-per-chunk-alignment parameter was misspelled and, while
usable, that would lead to confusion.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 2c84d0b1db57d918840e669a17bbd8c5ddca9747)

10 years agoerasure-code: workunit to check for encoding regression
Loic Dachary [Sat, 13 Sep 2014 11:36:09 +0000 (13:36 +0200)]
erasure-code: workunit to check for encoding regression

Clone the archive of encoded objects and decode all archived objects, up
to and including the current ceph version.

http://tracker.ceph.com/issues/9420 Refs: #9420

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 7638b15f23976c3265cf766e16cf93af1a7e0091)

10 years agoerasure-code: store and compare encoded contents
Loic Dachary [Sat, 13 Sep 2014 08:16:31 +0000 (10:16 +0200)]
erasure-code: store and compare encoded contents

Introduce ceph_erasure_code_non_regression to check and compare how an
erasure code plugin encodes and decodes content with a given set of
parameters. For instance:

./ceph_erasure_code_non_regression \
      --plugin jerasure \
      --parameter technique=reed_sol_van \
      --parameter k=2 \
      --parameter m=2 \
      --stripe-width 3181 \
      --create \
      --check

Will create an encoded object (--create) and store it into a directory
along with the chunks, one chunk per file. The directory name is derived
from the parameters. The content of the object is a random pattern of 31
bytes repeated to fill the object size specified with --stripe-width.

The check function (--check) reads the object back from the file,
encodes it and compares the result with the content of the chunks read
from the files. It also attempts to recover from one or two erasures.

Chunks encoded by a given version of Ceph are expected to be encoded
exactly in the same way by all Ceph versions going forward.

http://tracker.ceph.com/issues/9420 Refs: #9420

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit f5901303dbf50e9d08f2f1e510a1936a20037909)
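
The object content described above (a random 31-byte pattern repeated to fill the stripe width) can be sketched in Python. The function name is hypothetical, and truncating the final repetition to hit the exact size is an assumption about how the tool fills the object.

```python
import os

def make_payload(stripe_width, pattern_len=31):
    """Repeat a random pattern of pattern_len bytes to fill stripe_width bytes."""
    pattern = os.urandom(pattern_len)
    repeats = -(-stripe_width // pattern_len)   # ceiling division
    # Assumption: the last repetition is cut off at stripe_width.
    return (pattern * repeats)[:stripe_width]
```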

10 years agoceph-disk: run partprobe after zap 3005/head
Loic Dachary [Thu, 9 Oct 2014 16:52:17 +0000 (18:52 +0200)]
ceph-disk: run partprobe after zap

Not running partprobe after zapping a device can lead to the following:

* ceph-disk prepare /dev/loop2
* links are created in /dev/disk/by-partuuid
* ceph-disk zap /dev/loop2
* links are not removed from /dev/disk/by-partuuid
* ceph-disk prepare /dev/loop2
* some links are not created in /dev/disk/by-partuuid

This is assuming there is a bug in the way udev events are handled by
the operating system.

http://tracker.ceph.com/issues/9665 Fixes: #9665

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit fed3b06c47a5ef22cb3514c7647544120086d1e7)

10 years agoceph-disk: use update_partition in prepare_dev and main_prepare
Loic Dachary [Fri, 10 Oct 2014 08:26:31 +0000 (10:26 +0200)]
ceph-disk: use update_partition in prepare_dev and main_prepare

In the case of prepare_dev the partx alternative was missing and is now
added because update_partition does it.

http://tracker.ceph.com/issues/9721 Fixes: #9721

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 23e71b1ee816c0ec8bd65891998657c46e364fbe)

10 years agoceph-disk: encapsulate partprobe / partx calls
Loic Dachary [Fri, 10 Oct 2014 08:23:34 +0000 (10:23 +0200)]
ceph-disk: encapsulate partprobe / partx calls

Add the update_partition function to reduce code duplication.
The action is made an argument, although it is currently always -a,
because it will be -d when deleting a partition.

Use the update_partition function in prepare_journal_dev.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 922a15ea6865ef915bbdec2597433da6792c1cb2)
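
A rough sketch of the kind of helper this commit describes. The function below is hypothetical: it only builds the command line rather than running it, and preferring partprobe over partx when it is installed is an assumption made for illustration.

```python
import shutil

def partition_update_command(action, dev, have_partprobe=None):
    """Return the argv that would refresh the kernel's view of dev.

    action is a partx flag such as '-a' (add) or '-d' (delete).
    Hypothetical helper: it builds the command but does not execute it.
    """
    if have_partprobe is None:
        have_partprobe = shutil.which("partprobe") is not None
    if have_partprobe:
        return ["partprobe", dev]
    return ["partx", action, dev]
```

Keeping the choice in one place means callers such as a prepare or zap step do not each repeat the partprobe / partx decision.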

10 years agomon: fix MDS health status from peons 2990/head
John Spray [Mon, 24 Nov 2014 11:00:25 +0000 (11:00 +0000)]
mon: fix MDS health status from peons

The health data was there, but we were attempting
to enumerate MDS GIDs from pending_mdsmap (empty on
peons) instead of mdsmap (populated from paxos updates)

Fixes: #10151
Backport: giant

Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 0c33930e3a90f3873b7c7b18ff70dec2894fce29)

Conflicts:
src/mon/MDSMonitor.cc

10 years agoMerge pull request #2975 from ceph/wip-9936-giant
Josh Durgin [Thu, 20 Nov 2014 21:13:33 +0000 (13:13 -0800)]
Merge pull request #2975 from ceph/wip-9936-giant

rbd: Fix the rbd export when image size more than 2G

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years agoMerge pull request #2963 from ceph/wip-10114-giant
Loic Dachary [Wed, 19 Nov 2014 01:40:47 +0000 (02:40 +0100)]
Merge pull request #2963 from ceph/wip-10114-giant

Wip 10114 giant

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agoMerge pull request #2958 from ceph/wip-10128-giant
David Zafman [Tue, 18 Nov 2014 23:48:16 +0000 (15:48 -0800)]
Merge pull request #2958 from ceph/wip-10128-giant

ceph_objectstore_tool: When exporting to stdout, don't cout messages

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years agoerasure-code isa-l: remove duplicated lines (fix warning) 2963/head
Dan Mick [Tue, 18 Nov 2014 23:21:30 +0000 (15:21 -0800)]
erasure-code isa-l: remove duplicated lines (fix warning)

06a245a added a section def to assembly files; I added it twice to
this file.  There's no damage, but a compiler warning (on machines with
yasm installed)

Signed-off-by: Dan Mick <dan.mick@redhat.com>
(cherry picked from commit 10f6ef185a9d09e396e94036ec90bfe8a0738ce9)

10 years agoAdd annotation to all assembly files to turn off stack-execute bit
Dan Mick [Sat, 15 Nov 2014 01:59:57 +0000 (17:59 -0800)]
Add annotation to all assembly files to turn off stack-execute bit

See discussion in http://tracker.ceph.com/issues/10114

Building with these changes allows output from readelf like this:

 $ readelf -lW src/.libs/librados.so.2 | grep GNU_STACK
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x8

(note the absence of 'X' in 'RW')

Fixes: #10114
Signed-off-by: Dan Mick <dan.mick@redhat.com>
(cherry picked from commit 06a245a9845c0c126fb3106b41b2fd2bc4bc4df3)

10 years agoceph_objectstore_tool: When exporting to stdout, don't cout messages 2957/head 2958/head
David Zafman [Tue, 18 Nov 2014 07:02:50 +0000 (23:02 -0800)]
ceph_objectstore_tool: When exporting to stdout, don't cout messages

Fixes: #10128
Caused by a2bd2aa7

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 0d5262ac2f69ed3996af76a72894b1722a27b37d)

10 years agorbd: Fix the rbd export when image size more than 2G 2975/head
Vicente Cheng [Wed, 29 Oct 2014 04:21:11 +0000 (12:21 +0800)]
rbd: Fix the rbd export when image size more than 2G

When using export <image-name> <path> on an image larger than 2G, the
previous version of finish() could not handle seeking to the offset in
the image and returned an error.

This was caused by an incorrect variable type. Use the correct variable
type to fix it.

A separate variable of type uint64_t is now used to check the result of
the seek, and the previous r is still used to return errors.

uint64_t is better suited than int for handling the result of lseek64().

Signed-off-by: Vicente Cheng <freeze.bilsted@gmail.com>
(cherry picked from commit 4b87a81c86db06f6fe2bee440c65fc05cd4c23ce)
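
The failure mode can be reproduced in Python with ctypes, which truncates values the way a C assignment would. The 3 GiB offset is an arbitrary example value.

```python
import ctypes

offset = 3 * 1024**3    # a 3 GiB offset, larger than a 32-bit int can hold

# Stored in a 32-bit signed int, the offset wraps and looks like an error (< 0).
as_int = ctypes.c_int32(offset).value

# Stored in an unsigned 64-bit integer, the offset is preserved.
as_uint64 = ctypes.c_uint64(offset).value
```

Here as_int is negative, so a check such as "r < 0 means failure" would misfire even though the seek succeeded.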

10 years agoosd/OSD: use OSDMap helper to determine if we are correct op target
Sage Weil [Thu, 13 Nov 2014 01:11:10 +0000 (17:11 -0800)]
osd/OSD: use OSDMap helper to determine if we are correct op target

Use the new helper.  This fixes our behavior for EC pools where targeting
a different shard is not correct, while for replicated pools it may be. In
the EC case, it leaves the op hanging indefinitely in the OpTracker because
the pgid exists but as a different shard.

Fixes: #9835
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 9e05ba086a36ae9a04b347153b685c2b8adac2c3)

10 years agoosd/OSDMap: add osd_is_valid_op_target()
Sage Weil [Thu, 13 Nov 2014 01:04:35 +0000 (17:04 -0800)]
osd/OSDMap: add osd_is_valid_op_target()

Helper to check whether an osd is a given op target for a pg.  This
assumes that for EC we always send ops to the primary, while for
replicated we may target any replica.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 89c02637914ac7332e9dbdbfefc2049b2b6c127d)
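
The rule stated in this commit can be modelled with a short Python sketch. The function signature and argument names are hypothetical and do not mirror the C++ helper.

```python
def is_valid_op_target(osd, acting, primary, is_erasure_coded):
    """Hypothetical model of the check described above.

    acting  -- list of OSD ids currently serving the PG
    primary -- OSD id of the PG's primary
    """
    if osd == primary:
        return True
    if is_erasure_coded:
        # EC pools: ops must go to the primary only.
        return False
    # Replicated pools: any replica in the acting set may be targeted.
    return osd in acting
```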

10 years agoqa: allow small allocation diffs for exported rbds
Josh Durgin [Wed, 12 Nov 2014 02:16:02 +0000 (18:16 -0800)]
qa: allow small allocation diffs for exported rbds

The local filesystem may behave slightly differently. This isn't
foolproof, but seems to be reliable enough on rhel7 rootfs, where
exact comparison was failing.

Fixes: #10002
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
(cherry picked from commit e94d3c11edb9c9cbcf108463fdff8404df79be33)

10 years agocommon: Add cctid meta variable
Adam Crume [Thu, 18 Sep 2014 23:57:27 +0000 (16:57 -0700)]
common: Add cctid meta variable

Fixes: #6228
Signed-off-by: Adam Crume <adamcrume@gmail.com>
(cherry picked from commit bb45621cb117131707a85154292a3b3cdd1c662a)

10 years agoMerge pull request #2804 from ceph/wip-9301-giant
Sage Weil [Tue, 11 Nov 2014 16:28:19 +0000 (08:28 -0800)]
Merge pull request #2804 from ceph/wip-9301-giant

mon: backport paxos off-by-one bug (9301) to giant

10 years agoMerge pull request #2887 from ceph/wip-9977-backport
Gregory Farnum [Tue, 11 Nov 2014 06:41:19 +0000 (22:41 -0800)]
Merge pull request #2887 from ceph/wip-9977-backport

tools: skip up to expire_pos in journal-tool

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoclient: trim unused inodes before reconnecting to recovering MDS
Yan, Zheng [Thu, 11 Sep 2014 01:36:44 +0000 (09:36 +0800)]
client: trim unused inodes before reconnecting to recovering MDS

So the recovering MDS does not need to fetch these unused inodes during
cache rejoin. This may reduce MDS recovery time.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit 2bd7ceeff53ad0f49d5825b6e7f378683616dffb)

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agoclient: allow xattr caps in inject_release_failure
John Spray [Mon, 27 Oct 2014 12:02:17 +0000 (12:02 +0000)]
client: allow xattr caps in inject_release_failure

Because some test environments generate spurious
rmxattr operations, allow the client to release
'X' caps.  Allows xattr operations to proceed
while still preventing client releasing other caps.

Fixes: #9800
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 5691c68a0a44eb2cdf0afb3f39a540f5d42a5c0c)

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agotools: skip up to expire_pos in journal-tool 2887/head
John Spray [Mon, 3 Nov 2014 19:19:45 +0000 (19:19 +0000)]
tools: skip up to expire_pos in journal-tool

Previously this only worked for journals starting from an
object boundary (i.e. freshly created filesystems).

Fixes: #9977
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 65c33503c83ff8d88781c5c3ae81d88d84c8b3e4)

Conflicts:
src/tools/cephfs/JournalScanner.cc

10 years agoMerge pull request #2876 from ceph/giant-readdir-fix
Gregory Farnum [Sat, 8 Nov 2014 00:26:54 +0000 (16:26 -0800)]
Merge pull request #2876 from ceph/giant-readdir-fix

Giant readdir fix

10 years agoMerge pull request #2879 from ceph/wip-10025-giant
Gregory Farnum [Fri, 7 Nov 2014 22:10:40 +0000 (14:10 -0800)]
Merge pull request #2879 from ceph/wip-10025-giant

#10025/giant -- tools: fix MDS journal import

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years agotools: fix MDS journal import 2879/head
John Spray [Fri, 7 Nov 2014 11:34:43 +0000 (11:34 +0000)]
tools: fix MDS journal import

Previously it only worked on fresh filesystems which
hadn't been trimmed yet, and resulted in an invalid
trimmed_pos when expire_pos wasn't on an object
boundary.

Fixes: #10025
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit fb29e71f9a97c12354045ad2e128156e503be696)

10 years agoclient: fix I_COMPLETE_ORDERED checking 2876/head
Yan, Zheng [Mon, 27 Oct 2014 20:57:16 +0000 (13:57 -0700)]
client: fix I_COMPLETE_ORDERED checking

Current code marks a directory inode as complete and ordered when readdir
finishes, but it does not check if the directory was modified in the middle
of readdir. This is wrong: a directory inode should not be marked as ordered
if it was modified during readdir.

The fix is to introduce a new counter in the inode data struct. We increase
the counter each time the directory is modified. When readdir finishes, we
check the counter to decide if the directory should be marked as ordered.

Fixes: #9894
Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit a4caed8a53d011b214ab516090676641f7c4699d)
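
The counter check can be sketched in Python as follows. The class, the attribute names, and the during hook that simulates a concurrent change are all hypothetical. The ordered attribute stands in for I_COMPLETE_ORDERED.

```python
class DirInode:
    """Hypothetical directory inode with a modification counter."""

    def __init__(self):
        self.change_count = 0     # bumped on every create/delete/rename
        self.ordered = False      # stands in for I_COMPLETE_ORDERED

    def modify(self):
        self.change_count += 1
        self.ordered = False

def readdir(inode, during=None):
    """List the directory; mark it ordered only if nothing changed meanwhile."""
    start = inode.change_count
    if during is not None:
        during()                  # simulates a modification in mid-readdir
    inode.ordered = (inode.change_count == start)
```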

10 years agoclient: preserve ordering of readdir result in cache
Yan, Zheng [Tue, 9 Sep 2014 09:34:46 +0000 (17:34 +0800)]
client: preserve ordering of readdir result in cache

Preserve ordering of readdir result in a list, so that the result of cached
readdir is consistent with uncached readdir.

As a side effect, this commit also removes the code that removes stale dentries.
This is OK because stale dentries do not have a valid lease; they will be
filtered out by the shared gen check in Client::_readdir_cache_cb().

Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit 346c06c1647658768e927a47768a0bc74de17b53)

10 years agoclient: introduce a new flag indicating if dentries in directory are sorted
Yan, Zheng [Tue, 9 Sep 2014 06:06:06 +0000 (14:06 +0800)]
client: introduce a new flag indicating if dentries in directory are sorted

When creating a file, Client::insert_dentry_inode() sets the dentry's offset
based on the directory's max offset. The offset does not reflect the real
position of the dentry in the directory. A later readdir reply from the MDS
may change the dentry's position/offset. This inconsistency can cause
missing/duplicate entries in the readdir result if readdir is partly
satisfied by dcache_readdir().

The fix is to introduce a new flag indicating whether dentries in the
directory are sorted. We use _readdir_cache_cb() to handle readdir only when
the flag is set, and clear the flag after creating/deleting/renaming a file.

Fixes: #9178
Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit 600af25493947871c38214aa370e2544a7fea399)

10 years agoqa: use sudo even more when rsyncing /usr
Greg Farnum [Fri, 7 Nov 2014 01:48:01 +0000 (17:48 -0800)]
qa: use sudo even more when rsyncing /usr

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 3aa7797741f9cff06053a2f31550fe6929039692)

10 years agoMerge pull request #2858 from ceph/wip-9909
Loic Dachary [Wed, 5 Nov 2014 07:51:18 +0000 (08:51 +0100)]
Merge pull request #2858 from ceph/wip-9909

tools: rados put /dev/null should write() and not create()

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
10 years agotools: rados put /dev/null should write() and not create() 2858/head
Loic Dachary [Thu, 2 Oct 2014 07:23:55 +0000 (09:23 +0200)]
tools: rados put /dev/null should write() and not create()

In the rados.cc special case that handles putting an empty object, use
write_full() instead of create().

A special case was introduced in 6843a0b81f10125842c90bc63eccc4fd873b58f2
to create() an object if the rados put file is empty. Prior to that fix,
an attempt to rados put an empty file was a noop. The problem with that
fix is that it is not idempotent: rados put of an empty file twice would
fail the second time, whereas rados put of a file with one byte would
succeed as expected.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 50e80407f3c2f74d77ba876d01e7313c3544ea4d)
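
The idempotency problem can be shown with a toy in-memory store. FakeStore is purely illustrative and is not the librados API. It assumes, as the message implies, that create() is exclusive while write_full() replaces the object.

```python
class FakeStore:
    """Toy in-memory object store; not the librados API."""

    def __init__(self):
        self.objects = {}

    def create(self, name):
        # Exclusive create: fails if the object already exists.
        if name in self.objects:
            raise FileExistsError(name)
        self.objects[name] = b""

    def write_full(self, name, data):
        # Replaces the whole object, creating it if needed.
        self.objects[name] = data
```

Putting an empty object twice succeeds with write_full() but raises on the second create().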

10 years agorgw: set length for keystone token validation request
Yehuda Sadeh [Thu, 9 Oct 2014 17:20:27 +0000 (10:20 -0700)]
rgw: set length for keystone token validation request

Fixes: #7796
Backport: giant, firefly
Need to set the content length on this request, as the server might not
handle a chunked request (even though we don't send anything).

Tested-by: Mark Kirkwood <mark.kirkwood@catalyst.net.nz>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 3dd4ccad7fe97fc16a3ee4130549b48600bc485c)

10 years agoMerge pull request #2846 from dachary/wip-9752-past-intervals-giant
Sage Weil [Fri, 31 Oct 2014 15:35:42 +0000 (08:35 -0700)]
Merge pull request #2846 from dachary/wip-9752-past-intervals-giant

osd: past_interval display bug on acting

10 years agoosd: past_interval display bug on acting 2846/head
Loic Dachary [Thu, 30 Oct 2014 23:49:21 +0000 (00:49 +0100)]
osd: past_interval display bug on acting

The acting array was incorrectly including the primary and up_primary.

http://tracker.ceph.com/issues/9752 Fixes: #9752

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit c5f8d6eded52da451fdd1d807bd4700221e4c41c)

10 years agoMerge pull request #2841 from ceph/giant-9869
Yan, Zheng [Fri, 31 Oct 2014 00:01:12 +0000 (17:01 -0700)]
Merge pull request #2841 from ceph/giant-9869

Backport "client: cast m->get_client_tid() to compare to 16-bit Inode::flushing_cap_tid"

10 years agoclient: cast m->get_client_tid() to compare to 16-bit Inode::flushing_cap_tid 2841/head
Greg Farnum [Thu, 23 Oct 2014 00:16:31 +0000 (17:16 -0700)]
client: cast m->get_client_tid() to compare to 16-bit Inode::flushing_cap_tid

m->get_client_tid() is 64 bits (as it should be), but Inode::flushing_cap_tid
is only 16 bits. 16 bits should be plenty to let the cap flush updates
pipeline appropriately, but we need to cast in the proper direction when
comparing these differently-sized versions. So downcast the 64-bit one
to 16 bits.

Fixes: #9869
Backport: giant, firefly, dumpling

Signed-off-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit a5184cf46a6e867287e24aeb731634828467cd98)
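
The direction of the cast matters once the 64-bit tid grows past 16 bits. A minimal sketch of the comparison, assuming the stored 16-bit value is simply the low 16 bits of the tid; short_tid_t and both helper names are hypothetical and do not appear in the Ceph source:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the 16-bit Inode::flushing_cap_tid field.
using short_tid_t = std::uint16_t;

// Wrong direction: the 16-bit value is promoted to 64 bits, so the two
// values can never match once client_tid exceeds 0xFFFF.
inline bool tid_matches_promoted(std::uint64_t client_tid,
                                 short_tid_t flushing_tid) {
  return client_tid == flushing_tid;
}

// Direction described in the commit message: downcast the 64-bit tid
// to 16 bits before comparing.
inline bool tid_matches_truncated(std::uint64_t client_tid,
                                  short_tid_t flushing_tid) {
  return static_cast<short_tid_t>(client_tid) == flushing_tid;
}

// Usage example with a tid that no longer fits in 16 bits.
inline void example_usage() {
  const std::uint64_t client_tid = 0x10005;
  const short_tid_t flushing_tid = static_cast<short_tid_t>(client_tid);
  assert(!tid_matches_promoted(client_tid, flushing_tid));
  assert(tid_matches_truncated(client_tid, flushing_tid));
}
```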

10 years agoMerge pull request #2838 from ceph/wip-9945-giant
Sage Weil [Thu, 30 Oct 2014 17:05:22 +0000 (10:05 -0700)]
Merge pull request #2838 from ceph/wip-9945-giant

messages: fix COMPAT_VERSION on MClientSession

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agomessages: fix COMPAT_VERSION on MClientSession 2838/head
John Spray [Thu, 30 Oct 2014 16:43:21 +0000 (16:43 +0000)]
messages: fix COMPAT_VERSION on MClientSession

This was incorrectly incremented to 2 by omission
of an explicit COMPAT_VERSION value.

Fixes: #9945
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 1eb9bcb1d36014293efc687b4331be8c4d208d8e)

10 years ago0.87 v0.87
Jenkins [Wed, 29 Oct 2014 18:03:55 +0000 (11:03 -0700)]
0.87

10 years agoMerge remote-tracking branch 'origin/wip-9806-giant' into giant
Josh Durgin [Tue, 28 Oct 2014 20:08:05 +0000 (13:08 -0700)]
Merge remote-tracking branch 'origin/wip-9806-giant' into giant

Reviewed-by: Samuel Just <sam.just@inktank.com>
10 years agoMerge pull request #2630 from ceph/wip-9545
Samuel Just [Mon, 27 Oct 2014 20:20:16 +0000 (13:20 -0700)]
Merge pull request #2630 from ceph/wip-9545

os/FileStore: do not loop in sync_entry on shutdown

Reviewed-by: Samuel Just <sam.just@inktank.com>
10 years agomon: re-bootstrap if we get probed by a mon that is way ahead 2804/head
Sage Weil [Thu, 18 Sep 2014 21:23:36 +0000 (14:23 -0700)]
mon: re-bootstrap if we get probed by a mon that is way ahead

During bootstrap we verify that our paxos commits overlap with the other
mons we will form a quorum with.  If they do not, we do a sync.

However, it is possible we pass those checks, then fail to join a quorum
before the quorum moves ahead in time such that we no longer overlap.
Currently nothing kicks us back into a probing state to discover we need
to sync... we will just keep trying to call or join an election instead.

Fix this by jumping back to bootstrap if we get a probe that is ahead of
us.  Only do this from non probe or sync states as these will be common;
it is only the active and electing states that matter (and probably just
electing!).

Fixes: #9301
Backport: giant, firefly
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit c421b55e8e15ef04ca8aeb47f7d090375eaa8573)

10 years agomon/Paxos: fix off-by-one in last_ vs first_committed check
Sage Weil [Thu, 18 Sep 2014 21:11:24 +0000 (14:11 -0700)]
mon/Paxos: fix off-by-one in last_ vs first_committed check

peon last_committed + 1 == leader first_committed is okay.  Note that the
other check (where I clean up whitespace) gets this correct.

Fixes: #9301 (partly)
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit d81cd7f86695185dce31df76c33c9a02123f0e4a)
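
The boundary condition stated above can be sketched as a pair of predicates. This is an illustration only: the function names are hypothetical, and the actual check in mon/Paxos is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

// Off-by-one variant: treats a peon whose last_committed is exactly one
// behind the leader's first_committed as having a gap.
inline bool has_gap_off_by_one(std::uint64_t peon_last_committed,
                               std::uint64_t leader_first_committed) {
  return peon_last_committed < leader_first_committed;
}

// Intended check: peon last_committed + 1 == leader first_committed is
// still contiguous, so only a strictly larger distance is a gap.
inline bool has_gap(std::uint64_t peon_last_committed,
                    std::uint64_t leader_first_committed) {
  return peon_last_committed + 1 < leader_first_committed;
}

// Usage example: leader holds versions starting at 10.
inline void example_usage() {
  assert(has_gap_off_by_one(9, 10));  // contiguous, but flagged anyway
  assert(!has_gap(9, 10));            // contiguous, accepted
  assert(has_gap(8, 10));             // version 9 is missing: real gap
}
```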

10 years agoMerge pull request #2800 from ceph/wip-enoent-race
João Eduardo Luís [Sun, 26 Oct 2014 18:58:50 +0000 (18:58 +0000)]
Merge pull request #2800 from ceph/wip-enoent-race

os/LevelDBStore, RocksDBStore: fix race handling for get store size

Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
10 years agoos/LevelDBStore, RocksDBStore: fix race handling for get store size 2800/head
Sage Weil [Sat, 25 Oct 2014 04:23:19 +0000 (21:23 -0700)]
os/LevelDBStore, RocksDBStore: fix race handling for get store size

If we get ENOENT, skip this file, instead of adding in undefined stat
values.

Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
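
A rough sketch of the pattern, using plain POSIX stat() rather than the store classes themselves. The function name total_file_size() is hypothetical, and returning -errno for errors other than ENOENT is an assumption of this example, not something the commit message specifies.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <string>
#include <vector>
#include <sys/stat.h>

// Sum the sizes of the given files. A file that disappears between
// listing and stat() (ENOENT) is skipped instead of contributing
// undefined stat values. Other errors return -errno (assumption).
inline std::int64_t total_file_size(const std::vector<std::string>& paths) {
  std::int64_t total = 0;
  for (const auto& path : paths) {
    struct stat st;
    if (::stat(path.c_str(), &st) < 0) {
      if (errno == ENOENT)
        continue;  // raced with a delete; ignore this file
      return -errno;
    }
    total += st.st_size;
  }
  return total;
}

// Usage example: a path that does not exist contributes nothing.
inline void example_usage() {
  std::vector<std::string> paths;
  paths.push_back("/nonexistent-dir-for-sketch/000001.sst");
  assert(total_file_size(paths) == 0);
}
```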
10 years agoMerge pull request #2799 from athanatos/wip-9480
Sage Weil [Fri, 24 Oct 2014 20:20:43 +0000 (13:20 -0700)]
Merge pull request #2799 from athanatos/wip-9480

Wip 9480

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2798 from athanatos/wip-9875
Sage Weil [Fri, 24 Oct 2014 19:55:05 +0000 (12:55 -0700)]
Merge pull request #2798 from athanatos/wip-9875

ReplicatedPG: writeout hit_set object with correct prior_version

Reviewed-by: Sage Weil <sage@redhat.com>
10 years ago.gitmodules: ignoring changes in rocksdb submodule
Federico Gimenez [Fri, 24 Oct 2014 06:46:50 +0000 (08:46 +0200)]
.gitmodules: ignoring changes in rocksdb submodule

Signed-off-by: Federico Gimenez <fgimenez@coit.es>
(cherry picked from commit 60eaeca4ddccc79b29b17ad433c6569cb2a89500)

10 years agoMerge pull request #2797 from ceph/wip-rbd-revert
Josh Durgin [Fri, 24 Oct 2014 18:12:29 +0000 (11:12 -0700)]
Merge pull request #2797 from ceph/wip-rbd-revert

rbd/objectcacher: revert recent changes for giant

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
10 years agoRevert "Enforce cache size on read requests" 2797/head
Sage Weil [Fri, 24 Oct 2014 18:06:16 +0000 (11:06 -0700)]
Revert "Enforce cache size on read requests"

This reverts commit 4fc9fffc494abedac0a9b1ce44706343f18466f1.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoRevert "rbd: ObjectCacher reads can hang when reading sparse files"
Sage Weil [Fri, 24 Oct 2014 18:06:08 +0000 (11:06 -0700)]
Revert "rbd: ObjectCacher reads can hang when reading sparse files"

This reverts commit cdb7675a21c9107e3596c90c2b1598def3c6899f.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoRevert "Fix read performance regression in ObjectCacher"
Sage Weil [Fri, 24 Oct 2014 18:05:53 +0000 (11:05 -0700)]
Revert "Fix read performance regression in ObjectCacher"

This reverts commit 65be257e9295619b960b49f6aa80ecdf8ea4d16a.

Too late for giant.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2795 from ceph/wip-9873
David Zafman [Fri, 24 Oct 2014 17:49:32 +0000 (10:49 -0700)]
Merge pull request #2795 from ceph/wip-9873

objecter: fix tick_event shutdown race (9873)

Reviewed-by: David Zafman <dzafman@redhat.com>
10 years agoosdc/Objecter: fix tick_event handling in shutdown vs tick race 2795/head
Sage Weil [Fri, 24 Oct 2014 16:32:20 +0000 (09:32 -0700)]
osdc/Objecter: fix tick_event handling in shutdown vs tick race

If we fail to cancel the tick_event, we rely on tick() itself to clear
tick_event.  I'm not quite sure how we got this wrong in the previous
commit, but this boils down to two cases:

1) shutdown() successfully cancels the event and clears tick_event.  tick()
   never runs.  tick_event == NULL when we finish.
2) shutdown() fails to cancel the event because it has already started.  In
   this case tick itself is blocking (or about to block) waiting on the
   rlock.  When it does run it will clear tick_event itself, then see
   initialized == 0 and exit without rescheduling.

Fixes: #9873
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agocommon/Timer: recheck stopping before sleep if we dropped the lock
Sage Weil [Fri, 24 Oct 2014 16:20:41 +0000 (09:20 -0700)]
common/Timer: recheck stopping before sleep if we dropped the lock

If we have safe_callbacks==false, the stopping flag may have changed while
we were doing our callback. Recheck it and exit to avoid a deadlock on
shutdown.

Signed-off-by: Sage Weil <sage@redhat.com>
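
The recheck can be sketched with a simplified timer loop built on the C++ standard library. TimerSketch and its members are hypothetical and much smaller than the real common/Timer code; the sketch only models the case where callbacks run with the lock dropped.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical, simplified timer loop; not the actual common/Timer code.
class TimerSketch {
 public:
  void add_event(std::function<void()> cb) {
    std::lock_guard<std::mutex> g(lock_);
    pending_.push_back(std::move(cb));
    cond_.notify_all();
  }

  void stop() {
    std::lock_guard<std::mutex> g(lock_);
    stopping_ = true;
    cond_.notify_all();
  }

  void run() {
    std::unique_lock<std::mutex> l(lock_);
    while (!stopping_) {
      while (!pending_.empty()) {
        auto cb = std::move(pending_.front());
        pending_.pop_front();
        l.unlock();  // callback runs without the lock held
        cb();
        l.lock();
      }
      // stop() may have run while the lock was dropped, and its
      // notification is already gone. Recheck before sleeping.
      if (stopping_)
        break;
      cond_.wait(l);
    }
  }

 private:
  std::mutex lock_;
  std::condition_variable cond_;
  std::deque<std::function<void()>> pending_;
  bool stopping_ = false;
};

// Usage example: the callback itself requests shutdown; run() returns
// instead of sleeping forever.
inline void example_usage() {
  TimerSketch timer;
  bool ran = false;
  timer.add_event([&] { ran = true; timer.stop(); });
  timer.run();
  assert(ran);
}
```

Without the recheck, the loop in this sketch would reach cond_.wait() after the callback returns and never wake up, since the wakeup from stop() was sent while nobody was waiting.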
10 years agoMerge pull request #2787 from ceph/fix-fstat-mode
Sage Weil [Fri, 24 Oct 2014 00:57:02 +0000 (17:57 -0700)]
Merge pull request #2787 from ceph/fix-fstat-mode

java: fill in stat structure correctly

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agojava: fill in stat structure correctly 2787/head
Noah Watkins [Thu, 23 Oct 2014 20:22:52 +0000 (13:22 -0700)]
java: fill in stat structure correctly

Added stat filling helper function but only stat and lstat were updated.
This patch makes fstat use it. Crucially the fstat wasn't updating the
mode flags.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
10 years agoObjecter: resend linger ops on any interval change
Josh Durgin [Mon, 20 Oct 2014 20:29:13 +0000 (13:29 -0700)]
Objecter: resend linger ops on any interval change

Watch/notify ops need to be resent after a pg split occurs, as well as
a few other circumstances that the existing objecter checks did not
catch.

Refactor the check the OSD uses for this to add a version taking the
more basic types instead of the whole OSD map, and stash the needed
info when an op is sent.

Fixes: #9806
Backport: giant, firefly, dumpling
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
10 years agoMerge pull request #2785 from athanatos/wip-9821
Sage Weil [Thu, 23 Oct 2014 20:45:26 +0000 (13:45 -0700)]
Merge pull request #2785 from athanatos/wip-9821

PG:: reset_interval_flush and in set_last_peering_reset

Reviewed-by: Sage Weil <sage@redhat.com>