git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 years ago rgw: finalize perfcounters after shutting down storage 3554/head
Yehuda Sadeh [Fri, 30 Jan 2015 22:34:32 +0000 (14:34 -0800)]
rgw: finalize perfcounters after shutting down storage

Fixes: #10572
Backport: giant, firefly
First disable the storage subsystem, then disable perfcounters.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago Merge pull request #3504 from ceph/wip-10553
Josh Durgin [Tue, 27 Jan 2015 20:46:08 +0000 (12:46 -0800)]
Merge pull request #3504 from ceph/wip-10553

rgw: fix partial GET in swift

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years ago rbd image_read.sh: disable exclusive locking
Jason Dillaman [Thu, 22 Jan 2015 05:44:08 +0000 (00:44 -0500)]
rbd image_read.sh: disable exclusive locking

Until the kernel supports RBD exclusive locking, this test
creates shared images (exclusive locking disabled).

Fixes: #10613
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years ago rbd: ensure aio_write buffer isn't invalidated during image import
Jason Dillaman [Wed, 21 Jan 2015 19:55:02 +0000 (14:55 -0500)]
rbd: ensure aio_write buffer isn't invalidated during image import

The buffer provided to aio_write shouldn't be invalidated until
after aio_write has indicated that the operation has completed.

Fixes: #10590
Backport: giant
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years ago Merge pull request #3426 from jdurgin/wip-10592
Jason Dillaman [Wed, 21 Jan 2015 19:59:57 +0000 (14:59 -0500)]
Merge pull request #3426 from jdurgin/wip-10592

qa: disable automatic locking for manual locking test

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
10 years ago Merge pull request #3427 from jdurgin/wip-cram
Sage Weil [Wed, 21 Jan 2015 03:28:51 +0000 (19:28 -0800)]
Merge pull request #3427 from jdurgin/wip-cram

test: fix rbd cli tests for new feature bit

Reviewed-by: Sage Weil <sage@redhat.com>
10 years ago test: fix rbd cli tests for new feature bit 3427/head
Josh Durgin [Wed, 21 Jan 2015 01:12:15 +0000 (17:12 -0800)]
test: fix rbd cli tests for new feature bit

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years ago qa: disable automatic locking for manual locking test 3426/head
Josh Durgin [Tue, 20 Jan 2015 23:55:11 +0000 (15:55 -0800)]
qa: disable automatic locking for manual locking test

Automatic locking hides ESHUTDOWN from the caller, and this test relies
on seeing that error to detect that blacklisting works.

Fixes: #10592
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
10 years ago Revert "Merge remote-tracking branch 'origin/wip-bi-sharding-3' into next"
Yehuda Sadeh [Mon, 19 Jan 2015 17:26:00 +0000 (09:26 -0800)]
Revert "Merge remote-tracking branch 'origin/wip-bi-sharding-3' into next"

This reverts commit f79d8f24e9c0bf0d0b37270eba2745a878f2caed, reversing
changes made to 896c8899ac28eb0403bfaa20454f3756f3705c51.

10 years ago Merge remote-tracking branch 'origin/wip-bi-sharding-3' into next
Yehuda Sadeh [Mon, 19 Jan 2015 17:14:32 +0000 (09:14 -0800)]
Merge remote-tracking branch 'origin/wip-bi-sharding-3' into next

10 years ago Merge remote-tracking branch 'origin/wip-10271' into next
Josh Durgin [Fri, 16 Jan 2015 22:33:59 +0000 (14:33 -0800)]
Merge remote-tracking branch 'origin/wip-10271' into next

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years ago rgw: bilog marker related fixes
Yehuda Sadeh [Fri, 16 Jan 2015 01:30:24 +0000 (17:30 -0800)]
rgw: bilog marker related fixes

Fix the way we parse the marker. Instead of specifying whether the
bucket is sharded or not, we pass a shard_id. If the string itself
points to a single shard, we use the passed shard_id; otherwise we
parse the string and determine the shard id from it. This way, when
referencing a single shard, we can get the marker whether or not the
shard id is specified. This works for the non-sharded case too.
Adjust the bilog listing function to work with the new interface. It
was broken before, and this includes multiple fixes to it.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
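
The marker handling described in "rgw: bilog marker related fixes" can be sketched roughly as follows. This is an illustrative Python sketch, not the rgw code: the `shard_id#marker` form appears elsewhere in this log, but the comma separator between shards, the function name `parse_bilog_marker`, and the dictionary return type are assumptions.

```python
from typing import Dict


def parse_bilog_marker(marker: str, shard_id: int = 0) -> Dict[int, str]:
    """Map shard ids to marker strings (hypothetical helper).

    A marker without '#' refers to a single shard and is assigned to the
    caller-supplied shard_id. Otherwise each 'shard_id#marker' element is
    parsed; the comma separator is an assumption.
    """
    if not marker:
        return {}
    if "#" not in marker:
        # Plain marker: single shard or non-sharded bucket.
        return {shard_id: marker}
    result: Dict[int, str] = {}
    for element in marker.split(","):
        sid, _, value = element.partition("#")
        result[int(sid)] = value
    return result
```

With this shape, a caller referencing one shard can pass either a bare marker plus a shard id, or a marker that already carries its shard id.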
10 years ago Merge pull request #3342 from ceph/wip-10311
Sage Weil [Fri, 16 Jan 2015 05:45:56 +0000 (21:45 -0800)]
Merge pull request #3342 from ceph/wip-10311

rgw: only keep track for cleanup of rados objects that were written

Reviewed-by: Ray Lv <xiangyulv@gmail.com>
10 years ago rgw: fix partial GET in swift 3504/head
Yehuda Sadeh [Fri, 16 Jan 2015 00:31:22 +0000 (16:31 -0800)]
rgw: fix partial GET in swift

Fixes: #10553
Backport: firefly, giant

Don't set the ret code to reflect partial download, just set the
response status when needed.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago os/FileJournal: Fix journal write fail, align for direct io
Sage Weil [Thu, 15 Jan 2015 19:20:18 +0000 (11:20 -0800)]
os/FileJournal: Fix journal write fail, align for direct io

When journal_zero_on_create is set to true, osd mkfs fails while zeroing the journal.
The journal is opened with O_DIRECT, so the buffer must be aligned to the block size.

Backport: giant, firefly, dumpling
Signed-off-by: Xie Rui <875016668@qq.com>
Reviewed-by: Sage Weil <sage@redhat.com>
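
The alignment requirement in "os/FileJournal: Fix journal write fail, align for direct io" can be illustrated with a short sketch. It is written in Python rather than taken from FileJournal; the helper names and the 4096-byte default block size are assumptions. It shows rounding a length up to a block multiple and obtaining a zero-filled buffer that starts on a page boundary.

```python
import mmap


def align_up(length: int, block_size: int) -> int:
    """Round length up to the next multiple of block_size (a power of two)."""
    return (length + block_size - 1) & ~(block_size - 1)


def aligned_zero_buffer(length: int, block_size: int = 4096) -> mmap.mmap:
    """Return a zero-filled buffer whose size is a multiple of block_size.

    An anonymous mmap starts on a page boundary, so its address is aligned
    to any block size that divides the page size.
    """
    return mmap.mmap(-1, align_up(length, block_size))
```

A buffer built this way satisfies both common O_DIRECT constraints (aligned address, block-multiple length) on systems where the block size divides the page size.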
10 years ago mon: encode stashed monmap with all features
Jerry7X [Wed, 7 Jan 2015 06:29:02 +0000 (14:29 +0800)]
mon: encode stashed monmap with all features

The latest_monmap that we stash is only used locally; the encoded bl is never shared, which means we should just use CEPH_FEATURES_ALL all of the time.

Fixes: #5203
Backport: giant, firefly
Signed-off-by: Xie Rui <875016668@qq.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
10 years ago rgw: wait for completion only if not completion available
Yehuda Sadeh [Wed, 14 Jan 2015 19:47:18 +0000 (11:47 -0800)]
rgw: wait for completion only if not completion available

In a bucket aio operation, wait for completions only if there are no
completions available. Otherwise we might wait forever, as everything
has already completed.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
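
The rule in "rgw: wait for completion only if not completion available" can be sketched as below. This is a Python illustration of the general pattern, not the rgw implementation; the class and method names are hypothetical.

```python
import threading
from collections import deque


class CompletionTracker:
    """Hypothetical tracker for finished asynchronous operations."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._ready = deque()

    def complete(self, op_id: int) -> None:
        """Record a finished operation and wake one waiter."""
        with self._cond:
            self._ready.append(op_id)
            self._cond.notify()

    def next_completion(self, timeout: float = None) -> int:
        """Return a finished op id, blocking only if none is available."""
        with self._cond:
            # If something already completed, no further notify may arrive,
            # so check the queue before waiting.
            while not self._ready:
                if not self._cond.wait(timeout):
                    raise TimeoutError("no completion arrived in time")
            return self._ready.popleft()
```

Checking the queue before waiting is what avoids the hang: a waiter that blocks unconditionally after everything has finished is never woken.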
10 years ago Merge remote-tracking branch 'gh/next'
Sage Weil [Wed, 14 Jan 2015 16:57:33 +0000 (08:57 -0800)]
Merge remote-tracking branch 'gh/next'

10 years ago rgw: bi list, update marker only if result not empty
Yehuda Sadeh [Tue, 13 Jan 2015 21:37:24 +0000 (13:37 -0800)]
rgw: bi list, update marker only if result not empty

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: fix memory leak
Yehuda Sadeh [Tue, 13 Jan 2015 21:31:39 +0000 (13:31 -0800)]
rgw: fix memory leak

We were iterating over both completion_objs and completions, assuming that
they follow each other. They don't. While at it, index completions
by id, so that we can update the completed objs correctly.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: initialize RGWBucketInfo::num_shards
Yehuda Sadeh [Mon, 12 Jan 2015 22:38:58 +0000 (14:38 -0800)]
rgw: initialize RGWBucketInfo::num_shards

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago cls_rgw: call ioctx->aio_operate() under lock
Yehuda Sadeh [Mon, 12 Jan 2015 16:55:35 +0000 (08:55 -0800)]
cls_rgw: call ioctx->aio_operate() under lock

close a race window

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: fix linkage following rebase
Yehuda Sadeh [Sat, 10 Jan 2015 17:36:00 +0000 (09:36 -0800)]
rgw: fix linkage following rebase

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: update calls to handle bucket sharding
Yehuda Sadeh [Fri, 9 Jan 2015 22:55:42 +0000 (14:55 -0800)]
rgw: update calls to handle bucket sharding

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: only keep track for cleanup of rados objects that were written
Yehuda Sadeh [Fri, 9 Jan 2015 18:23:35 +0000 (10:23 -0800)]
rgw: only keep track for cleanup of rados objects that were written

Fixes: #10311
We're keeping track of rados objects that we've written so that we can
clean them up if needed. Earlier we weren't too accurate about it and
were also tracking the head object that is yet to be written. Tracking now
only applies to the tail data, and the code is a bit clearer.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago test: fix test_cls_rgw
Yehuda Sadeh [Thu, 8 Jan 2015 18:28:05 +0000 (10:28 -0800)]
test: fix test_cls_rgw

Adjust to new api calls.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago cls_rgw: remove incorrect function declaration
Yehuda Sadeh [Thu, 8 Jan 2015 18:27:32 +0000 (10:27 -0800)]
cls_rgw: remove incorrect function declaration

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: max shards configuration is part of the zone config
Yehuda Sadeh [Wed, 7 Jan 2015 23:30:27 +0000 (15:30 -0800)]
rgw: max shards configuration is part of the zone config

The zone config params are set in the region configuration. Also,
there's a ceph.conf configurable (rgw_override_bucket_index_max_shards)
for overriding this per rgw.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
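
How the zone value and the per-rgw override from "rgw: max shards configuration is part of the zone config" might combine can be sketched as follows. This is a hedged Python illustration: the function name is hypothetical, and treating an override of zero as "not set" is an assumption rather than something stated in the commit message.

```python
def effective_max_shards(zone_max_shards: int, override_max_shards: int = 0) -> int:
    """Pick the bucket index shard count for new buckets.

    Assumption: a zero override means 'not set', so the zone value applies.
    The override stands in for rgw_override_bucket_index_max_shards.
    """
    if override_max_shards > 0:
        return override_max_shards
    return zone_max_shards
```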
10 years ago rgw: pass num shards on bucket initialization
Yehuda Sadeh [Tue, 9 Dec 2014 21:58:09 +0000 (13:58 -0800)]
rgw: pass num shards on bucket initialization

Need to pass the actual number of shards that will be used for this
specific bucket. The bucket may be created by applying metadata from a
different zone, so the number of shards might differ.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: write multi shard markers on replica_log appropriately
Yehuda Sadeh [Mon, 8 Dec 2014 23:44:09 +0000 (15:44 -0800)]
rgw: write multi shard markers on replica_log appropriately

When getting a list of shard_id#marker, iterate through the shards and
write each as needed.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago cls_rgw: extend shards marker api
Yehuda Sadeh [Mon, 8 Dec 2014 23:43:47 +0000 (15:43 -0800)]
cls_rgw: extend shards marker api

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw, cls_rgw: keep shard ids with oids
Yehuda Sadeh [Fri, 5 Dec 2014 23:52:26 +0000 (15:52 -0800)]
rgw, cls_rgw: keep shard ids with oids

Instead of just having the list of oids, keep the shard ids together, so
that we can know on which shard the operation happened.
Bucket markers are just using the shard numeric id, instead of the
bucket instance shard id. This makes it easier to parse the markers
appropriately.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago cls_rgw: clean up CLSRGWConcurrentIO
Yehuda Sadeh [Fri, 5 Dec 2014 22:10:50 +0000 (14:10 -0800)]
cls_rgw: clean up CLSRGWConcurrentIO

Class is no longer a template, and keeps a map of oids by shard_id. Call
issue_op() using both shard_id and oids. Shard id is used for mapping
the results in the derived classes.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
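
The idea in "cls_rgw: clean up CLSRGWConcurrentIO" of keeping a map of oids by shard_id and calling issue_op() with both can be sketched as below. This is Python for illustration only: the oid naming scheme and both function names are hypothetical, and the loop runs sequentially, whereas the real class issues the operations concurrently.

```python
from typing import Callable, Dict


def shard_oids(base_oid: str, num_shards: int) -> Dict[int, str]:
    """Map each shard id to an index object id (naming is hypothetical)."""
    if num_shards <= 0:
        # Non-sharded bucket: a single index object, keyed as shard 0.
        return {0: base_oid}
    return {sid: f"{base_oid}.{sid}" for sid in range(num_shards)}


def run_on_shards(oids: Dict[int, str],
                  issue_op: Callable[[int, str], object]) -> Dict[int, object]:
    """Call issue_op(shard_id, oid) per shard and key results by shard id."""
    return {sid: issue_op(sid, oid) for sid, oid in oids.items()}
```

Keying results by shard id is what lets a derived operation know on which shard each result was produced.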
10 years ago rgw: modify bucket instance shard marker ids
Yehuda Sadeh [Fri, 5 Dec 2014 01:15:15 +0000 (17:15 -0800)]
rgw: modify bucket instance shard marker ids

Instead of having the markers prefixed by the oids, use the bucket
instance id.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: bucket replica log, handle shard ids
Yehuda Sadeh [Wed, 3 Dec 2014 22:41:00 +0000 (14:41 -0800)]
rgw: bucket replica log, handle shard ids

bucket replica log can now save entries by the specified shard id

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago cls_rgw: list bi log should not return marker entry
Yehuda Sadeh [Wed, 3 Dec 2014 22:40:13 +0000 (14:40 -0800)]
cls_rgw: list bi log should not return marker entry

The marker should serve as a lower bound, but should not itself be
returned.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: bucket_index_shard_hash_type fixes
Yehuda Sadeh [Wed, 3 Dec 2014 19:31:26 +0000 (11:31 -0800)]
rgw: bucket_index_shard_hash_type fixes

Add initializations and JSON encode/decode.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: decode the req_state bucket instance id if needed
Yehuda Sadeh [Wed, 3 Dec 2014 19:30:44 +0000 (11:30 -0800)]
rgw: decode the req_state bucket instance id if needed

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: improve bucket sharding hashing
Yehuda Sadeh [Tue, 2 Dec 2014 22:17:14 +0000 (14:17 -0800)]
rgw: improve bucket sharding hashing

Amplify small source changes.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
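
One possible reading of "Amplify small source changes" in "rgw: improve bucket sharding hashing" is sketched below in Python. It is an assumption-laden illustration, not the rgw code: `zlib.crc32` is swapped in as the hash, and the mixing step and function name are hypothetical.

```python
import zlib


def shard_for_key(key: str, num_shards: int) -> int:
    """Return the shard index for an object key (0 if not sharded)."""
    if num_shards <= 0:
        return 0
    h = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    # Fold the low byte into the top byte so that a small change in the
    # hash also changes the high bits before the modulo is taken.
    mixed = h ^ ((h & 0xFF) << 24)
    return mixed % num_shards
```

The result is deterministic for a given key and shard count, which is what lets every caller locate the same index shard.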
10 years ago rgw: data changes log, log info by bucket shard id
Yehuda Sadeh [Tue, 2 Dec 2014 00:22:32 +0000 (16:22 -0800)]
rgw: data changes log, log info by bucket shard id

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: use new BucketShard structure for index manipulation calls
Yehuda Sadeh [Mon, 1 Dec 2014 23:34:37 +0000 (15:34 -0800)]
rgw: use new BucketShard structure for index manipulation calls

Instead of recalculating the hash on every call, do it once and pass this
structure around. It will also be used for logging changes into the data
log.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: bi log list/trim can get specific bucket shard
Yehuda Sadeh [Mon, 24 Nov 2014 22:30:20 +0000 (14:30 -0800)]
rgw: bi log list/trim can get specific bucket shard

The bucket shard can be specified in the bucket instance param, like
this: <bucket-instance>[:shard-id]

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
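
Parsing the `<bucket-instance>[:shard-id]` form from "rgw: bi log list/trim can get specific bucket shard" might look like the following Python sketch. It assumes the bucket instance itself has the form `<name>:<id>`, which is not stated in the commit message; the function name is hypothetical.

```python
from typing import Optional, Tuple


def parse_bucket_instance(spec: str) -> Tuple[str, Optional[int]]:
    """Split '<bucket-instance>[:shard-id]' into (instance, shard_id).

    Assumption: the bucket instance is '<name>:<id>', so a third
    colon-separated field, if present, is the shard id.
    """
    parts = spec.split(":")
    if len(parts) == 3:
        return ":".join(parts[:2]), int(parts[2])
    if len(parts) == 2:
        return spec, None
    raise ValueError(f"unexpected bucket instance spec: {spec!r}")
```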
10 years ago Fix the multipart uploads functional test failures due to bucket index sharding.
Guang Yang [Mon, 3 Nov 2014 12:56:06 +0000 (12:56 +0000)]
Fix the multipart uploads functional test failures due to bucket index sharding.

Signed-off-by: Guang Yang <yguang@yahoo-inc.com>
10 years ago Fix get_bucket_instance_info, only build the oid if it is empty.
Guang Yang [Fri, 31 Oct 2014 16:56:36 +0000 (16:56 +0000)]
Fix get_bucket_instance_info, only build the oid if it is empty.

Signed-off-by: Guang Yang <yguang@yahoo-inc.com>
10 years ago Adjust bi log trim implementation to work with multiple bucket shards.
Guang Yang [Wed, 24 Sep 2014 06:21:28 +0000 (06:21 +0000)]
Adjust bi log trim implementation to work with multiple bucket shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
10 years ago Adjust bi log listing to work with multiple bucket shards.
Guang Yang [Tue, 23 Sep 2014 23:14:24 +0000 (23:14 +0000)]
Adjust bi log listing to work with multiple bucket shards.
Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
10 years ago cls_rgw, rgw: switch different ops to new concurrent infrastructure
Yehuda Sadeh [Fri, 19 Sep 2014 22:34:54 +0000 (15:34 -0700)]
cls_rgw, rgw: switch different ops to new concurrent infrastructure

Make all the relevant ops use the CLSRGWConcurrentIO infrastructure,
which simplifies things.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago rgw: generalize container type for concurrent IO base class
Yehuda Sadeh [Fri, 19 Sep 2014 22:14:55 +0000 (15:14 -0700)]
rgw: generalize container type for concurrent IO base class

Turned the ConcurrentIO class into a template, so that we can use the
different kinds of containers that the different operations need.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago cls_rgw, rgw: create base class for common bucket shard operations
Yehuda Sadeh [Fri, 19 Sep 2014 21:55:12 +0000 (14:55 -0700)]
cls_rgw, rgw: create base class for common bucket shard operations

Instead of copy pasting the same code all over again, create a base
class for the needed concurrent IO operations.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago Adjust bucket stats/index checking/index rebuild/tag timeout implementation to work...
Guang Yang [Fri, 29 Aug 2014 10:22:50 +0000 (10:22 +0000)]
Adjust bucket stats/index checking/index rebuild/tag timeout implementation to work with multiple shards.

Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
10 years ago Adjust bucket listing to work with multiple shards.
Guang Yang [Mon, 18 Aug 2014 11:46:32 +0000 (11:46 +0000)]
Adjust bucket listing to work with multiple shards.

Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
10 years ago Adjust rgw bucket prepare/complete OP to work with multiple bucket index shards.
Guang Yang [Sat, 16 Aug 2014 09:04:28 +0000 (09:04 +0000)]
Adjust rgw bucket prepare/complete OP to work with multiple bucket index shards.

Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
10 years ago Implement sharding for bucket creation.
Guang Yang [Fri, 1 Aug 2014 04:54:13 +0000 (04:54 +0000)]
Implement sharding for bucket creation.

Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
10 years ago Add a new field to bucket info indicating the number of shards of this bucket and...
Guang Yang [Mon, 28 Jul 2014 07:40:26 +0000 (07:40 +0000)]
Add a new field to bucket info indicating the number of shards of this bucket and make it configurable.

Signed-off-by: Guang Yang (yguang@yahoo-inc.com)
10 years ago PendingReleaseNotes: make a note about librados flag changes
Sage Weil [Tue, 13 Jan 2015 20:23:37 +0000 (12:23 -0800)]
PendingReleaseNotes: make a note about librados flag changes

Signed-off-by: Sage Weil <sage@redhat.com>
10 years ago Merge pull request #3360 from mattrichards/bump_rados_version
Sage Weil [Tue, 13 Jan 2015 20:18:04 +0000 (12:18 -0800)]
Merge pull request #3360 from mattrichards/bump_rados_version

librados: bump rados version number

Reviewed-by: Sage Weil <sage@redhat.com>
10 years ago 0.91 v0.91
Jenkins [Tue, 13 Jan 2015 20:10:22 +0000 (12:10 -0800)]
0.91

10 years ago Merge pull request #2697 from ceph/wip-8900
Josh Durgin [Tue, 13 Jan 2015 19:17:29 +0000 (11:17 -0800)]
Merge pull request #2697 from ceph/wip-8900

RBD image watcher and new exclusive lock handling

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years ago Merge pull request #3254 from trociny/feature-10036
Samuel Just [Tue, 13 Jan 2015 18:56:29 +0000 (10:56 -0800)]
Merge pull request #3254 from trociny/feature-10036

osd: osd tree to show primary-affinity value

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years ago Merge pull request #3281 from ceph/wip-10441-b
Samuel Just [Tue, 13 Jan 2015 18:55:29 +0000 (10:55 -0800)]
Merge pull request #3281 from ceph/wip-10441-b

osd: fix watch ordering bug 10441 option b

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years ago Merge pull request #3290 from ceph/wip-da-SCA-20150102
Samuel Just [Tue, 13 Jan 2015 18:54:45 +0000 (10:54 -0800)]
Merge pull request #3290 from ceph/wip-da-SCA-20150102

Coverity and SCA fixes

Reviewed-by: Sage Weil <sage@redhat.com>
10 years ago Merge pull request #3302 from ceph/wip-9956
Samuel Just [Tue, 13 Jan 2015 18:54:21 +0000 (10:54 -0800)]
Merge pull request #3302 from ceph/wip-9956

os/FileStore: verify kernel is new enough before using extsize ioctl

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years ago Merge pull request #3305 from majianpeng/fix5
Samuel Just [Tue, 13 Jan 2015 18:53:34 +0000 (10:53 -0800)]
Merge pull request #3305 from majianpeng/fix5

fix bugs about sync_filesystem

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years ago Merge pull request #3364 from ceph/wip-quota-test
Gregory Farnum [Tue, 13 Jan 2015 15:08:30 +0000 (07:08 -0800)]
Merge pull request #3364 from ceph/wip-quota-test

qa: set -e explicitly in quota test

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years ago qa: set -e explicitly in quota test 3364/head
John Spray [Tue, 13 Jan 2015 14:58:57 +0000 (14:58 +0000)]
qa: set -e explicitly in quota test

Previously it was set in the hashbang line, which meant
that "./quota.sh" was OK, but "sh ./quota.sh" would
just run through, ignoring errors.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years ago Merge pull request #3336 from ceph/wip-fs-reset
Gregory Farnum [Tue, 13 Jan 2015 14:47:04 +0000 (06:47 -0800)]
Merge pull request #3336 from ceph/wip-fs-reset

mon: implement `fs reset`

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years ago Merge pull request #3343 from dachary/wip-10505-centos-parted
Loic Dachary [Tue, 13 Jan 2015 10:07:55 +0000 (11:07 +0100)]
Merge pull request #3343 from dachary/wip-10505-centos-parted

tests: install parted in centos Dockerfile

Reviewed-by: Joao Eduardo Luis <joao@redhat.com>
10 years ago librbd: flush pending AIO requests under all existing flush scenarios 2697/head
Jason Dillaman [Tue, 13 Jan 2015 04:17:50 +0000 (23:17 -0500)]
librbd: flush pending AIO requests under all existing flush scenarios

AIO requests that are waiting on the image lock should be flushed
during all existing RBD flush scenarios.  A few flush cases were
missed in the original implementation.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years ago librbd: AIO requests should retry lock requests
Jason Dillaman [Tue, 13 Jan 2015 04:14:11 +0000 (23:14 -0500)]
librbd: AIO requests should retry lock requests

Added a timer to support retrying AIO lock requests until
they are successful.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years ago librbd: differentiate between R/O vs R/W RBD features
Jason Dillaman [Mon, 3 Nov 2014 21:51:06 +0000 (16:51 -0500)]
librbd: differentiate between R/O vs R/W RBD features

The new RBD exclusive lock feature should be treated as a
feature that is only applied when the image is opened in
R/W mode.

Older clients will need to handle the updated
cls_rbd::get_features method in order to properly determine
the incompatible features for an image depending on the
current mode.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years ago librbd: Add internal unit test cases
Jason Dillaman [Tue, 21 Oct 2014 02:09:29 +0000 (22:09 -0400)]
librbd: Add internal unit test cases

The new unit tests cover the modifications made to integrate
the internal librbd functionality with the new ImageWatcher.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years ago librbd: Add ImageWatcher unit test cases
Jason Dillaman [Fri, 17 Oct 2014 13:05:22 +0000 (09:05 -0400)]
librbd: Add ImageWatcher unit test cases

Directly unit test the new ImageWatcher class to complement
the existing librbd integration tests of exclusive lock
handling.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years ago librbd: Add convenience library to support unit tests
Jason Dillaman [Sun, 16 Nov 2014 19:20:42 +0000 (14:20 -0500)]
librbd: Add convenience library to support unit tests

Unit tests need access to the private symbols of librbd that are no
longer exported from librbd.so.  A new librbd_internal
convenience library was created to allow access.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years ago rbd: Allow CLI to optionally create shared images
Jason Dillaman [Wed, 1 Oct 2014 20:12:21 +0000 (16:12 -0400)]
rbd: Allow CLI to optionally create shared images

Images that are flagged as shared cannot use the RBD
object map or RBD mirroring features.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years ago librbd: Integrate librbd with new exclusive lock feature
Jason Dillaman [Wed, 8 Oct 2014 12:41:53 +0000 (08:41 -0400)]
librbd: Integrate librbd with new exclusive lock feature

Operations that update the image now require the exclusive lock
if the feature is enabled.  AIO write and discard operations will
automatically request the exclusive lock from the current leader
to support live-migration.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years ago librados: bump rados version number 3360/head
Matt Richards [Tue, 13 Jan 2015 00:59:42 +0000 (16:59 -0800)]
librados: bump rados version number

As a follow-on to 49d114f1fff90e5c0f206725a5eb82c0ba329376,
increment the "extra" version field so clients can easily
determine if they have a version of librados that properly
translates C API operation flags.

Signed-off-by: Matthew Richards <mattjrichards@gmail.com>
10 years ago Merge pull request #3316 from ceph/wip-10471
Josh Durgin [Tue, 13 Jan 2015 00:20:28 +0000 (16:20 -0800)]
Merge pull request #3316 from ceph/wip-10471

rgw: index swift keys appropriately

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
10 years ago librbd: Create image exclusive lock watch/notify handler
Jason Dillaman [Wed, 8 Oct 2014 12:20:47 +0000 (08:20 -0400)]
librbd: Create image exclusive lock watch/notify handler

The new watch/notify handler replaces the existing header
update watch/notify handler and adds support for managing
image exclusive lock leadership.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years ago osd: enable filestore_extsize by default 3302/head
Sage Weil [Mon, 12 Jan 2015 22:00:21 +0000 (14:00 -0800)]
osd: enable filestore_extsize by default

Note that this will only get used if the kernel is new enough; if it is
older than 3.5 the option will get disabled and extsize will not be used
even if the option is set to true.

This partially reverts 01cd3cdc726a3e838bce05b355a021778b4e5db1.

Fixes: #9956
Signed-off-by: Sage Weil <sage@redhat.com>
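
The gating described in "osd: enable filestore_extsize by default" (the option only takes effect on kernels that are at least 3.5) can be sketched as below. This is a Python illustration, not the FileStore code; the function names are hypothetical and the release-string parsing is a simplification.

```python
import platform
import re
from typing import Optional, Tuple


def kernel_version(release: Optional[str] = None) -> Tuple[int, int]:
    """Extract (major, minor) from a release string like '3.2.0-23-generic'."""
    release = platform.release() if release is None else release
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"cannot parse kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def extsize_allowed(option_enabled: bool, release: Optional[str] = None) -> bool:
    """Use extsize only if the option is set and the kernel is at least 3.5."""
    return option_enabled and kernel_version(release) >= (3, 5)
```

For example, `extsize_allowed(True, "3.2.0-23-generic")` is false, matching the note that kernel 3.2 is affected by the XFS bug.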
10 years ago os/FileStore: verify kernel is new enough before using extsize ioctl
Sage Weil [Mon, 12 Jan 2015 21:59:39 +0000 (13:59 -0800)]
os/FileStore: verify kernel is new enough before using extsize ioctl

Old kernels have an XFS bug that exposes uninitialized data when the
extsize hint is set and only partially written.  This is fixed by Linux
commit aff3a9edb7080f69f07fe76a8bd089b3dfa4cb5d, documented in XFS bug
http://oss.sgi.com/bugzilla/show_bug.cgi?id=874, and tested by XFS
test xfs/229 to prevent regressions.

Notably, the original bug affects kernel 3.2, which is widely deployed with
Ubuntu 12.04 (precise).

Backport: giant, firefly
Signed-off-by: Sage Weil <sage@redhat.com>
10 years ago Merge pull request #3352 from kylinstorage/fix-10503
Gregory Farnum [Mon, 12 Jan 2015 19:33:02 +0000 (11:33 -0800)]
Merge pull request #3352 from kylinstorage/fix-10503

Fix bug 10503: http://tracker.ceph.com/issues/10503

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years ago Merge pull request #3203 from majianpeng/fix1
Samuel Just [Mon, 12 Jan 2015 16:39:48 +0000 (08:39 -0800)]
Merge pull request #3203 from majianpeng/fix1

avoid memcopy from librados to caller buffer

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
10 years ago Merge pull request #3034 from dachary/wip-10017-erasure-code-repair
Samuel Just [Mon, 12 Jan 2015 16:26:08 +0000 (08:26 -0800)]
Merge pull request #3034 from dachary/wip-10017-erasure-code-repair

erasure code repair when there are two failures

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years ago Merge pull request #3148 from mslovy/optimazation_wbthrottle
Samuel Just [Mon, 12 Jan 2015 16:23:26 +0000 (08:23 -0800)]
Merge pull request #3148 from mslovy/optimazation_wbthrottle

os: WBThrottle: optimize the map to unordered_map

Reviewed-by: Samuel Just <sjust@redhat.com>
10 years ago mon/MDSMonitor: add confirm flag to fs reset 3336/head
John Spray [Mon, 12 Jan 2015 14:52:43 +0000 (14:52 +0000)]
mon/MDSMonitor: add confirm flag to fs reset

This was already in the command map but was not
being checked.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years ago qa: add `fs reset` to cephtool tests
John Spray [Mon, 12 Jan 2015 13:54:52 +0000 (13:54 +0000)]
qa: add `fs reset` to cephtool tests

This is just a superficial "I can call it" test;
its actual behaviour is checked elsewhere.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years ago mon: implement `fs reset`
John Spray [Mon, 5 Jan 2015 19:34:57 +0000 (19:34 +0000)]
mon: implement `fs reset`

This is for use in CephFS disaster recovery.  When
the metadata pool has been forcibly reset to a single-MDS
metadata tree, we would like to reset the MDSMap to match.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years ago Fix bug 10503: http://tracker.ceph.com/issues/10503 3352/head
Yunchuan Wen [Mon, 12 Jan 2015 05:49:32 +0000 (05:49 +0000)]
Fix bug 10503: http://tracker.ceph.com/issues/10503
ceph-fuse: quota code is not 32-bit safe for vxattr output

Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
10 years ago Merge pull request #2948 from ceph/wip-promote
Sage Weil [Sun, 11 Jan 2015 15:55:08 +0000 (07:55 -0800)]
Merge pull request #2948 from ceph/wip-promote

osd: promote_object separation; proxy read

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years ago ceph_test_rados: add some debug output 2948/head
Sage Weil [Tue, 6 Jan 2015 21:01:45 +0000 (13:01 -0800)]
ceph_test_rados: add some debug output

Signed-off-by: Sage Weil <sage@redhat.com>
10 years ago osd/ReplicatedPG: improve proxy read cancelation
Sage Weil [Sun, 7 Dec 2014 01:45:28 +0000 (17:45 -0800)]
osd/ReplicatedPG: improve proxy read cancelation

Avoid taking the PG lock for a canceled read op (if we are lucky).  Recheck
after the lock is taken for good measure.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years ago osd/ReplicatedPG: put proxy read completion on finisher
Sage Weil [Sun, 7 Dec 2014 01:42:51 +0000 (17:42 -0800)]
osd/ReplicatedPG: put proxy read completion on finisher

We can't use the synchronous completion callbacks (in fast dispatch
context) to do the proxy read completion work.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years ago osd: tiering: avoid duplicate promotion on proxy read
Zhiqiang Wang [Fri, 28 Nov 2014 08:30:20 +0000 (16:30 +0800)]
osd: tiering: avoid duplicate promotion on proxy read

Do not promote in maybe_handle_cache if a promotion is already underway.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years ago osd: tiering: proxy instead of redirect read in writeback mode when the
Zhiqiang Wang [Wed, 26 Nov 2014 01:57:03 +0000 (09:57 +0800)]
osd: tiering: proxy instead of redirect read in writeback mode when the
cache pool is full

To preserve read op order

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years ago osd: tiering: cancel and requeue proxy read when needed
Zhiqiang Wang [Fri, 21 Nov 2014 06:01:24 +0000 (14:01 +0800)]
osd: tiering: cancel and requeue proxy read when needed

Cancel and requeue proxy read in the following cases:
1) on_shutdown
2) on_change
3) background promotion is done

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
Conflicts:
src/osd/ReplicatedPG.cc
src/osd/ReplicatedPG.h

10 years ago osd/ReplicatedPG: allow reads to proxy etc even if blocked
Sage Weil [Tue, 9 Dec 2014 01:57:13 +0000 (17:57 -0800)]
osd/ReplicatedPG: allow reads to proxy etc even if blocked

If we are not write ordered, continue with cache checks so that we can
(among other things) proxy reads while promoting.

Note that this may reorder reads for clients, but we've decided that's okay.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years ago test: add proxy read test
Zhiqiang Wang [Wed, 19 Nov 2014 03:14:46 +0000 (11:14 +0800)]
test: add proxy read test

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years ago osd: tiering: proxy reads during promote
Zhiqiang Wang [Tue, 18 Nov 2014 23:47:32 +0000 (15:47 -0800)]
osd: tiering: proxy reads during promote

wip 9980. Do proxy read and async promotion for writeback.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years ago osd: tiering: add cache mode READPROXY
Zhiqiang Wang [Tue, 18 Nov 2014 08:10:00 +0000 (16:10 +0800)]
osd: tiering: add cache mode READPROXY

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years ago osd: tiering: add proxy read support
Zhiqiang Wang [Tue, 18 Nov 2014 07:54:47 +0000 (15:54 +0800)]
osd: tiering: add proxy read support

wip 9979

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>