git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
12 years agomon: limit warnings about low mon disk space 192/head
Sage Weil [Wed, 3 Apr 2013 15:37:50 +0000 (08:37 -0700)]
mon: limit warnings about low mon disk space

Only warn once per percentage point per epoch.

Signed-off-by: Sage Weil <sage@inktank.com>
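
The rule in this commit message can be pictured with a minimal Python sketch (the monitor itself is C++). The class and method names below are hypothetical, and the sketch assumes that "once per percentage point per epoch" means remembering the last whole percentage warned about and forgetting it whenever the epoch changes.

```python
class DiskSpaceWarner:
    """Hypothetical illustration only; not the monitor's actual code."""

    def __init__(self):
        self.last_epoch = None       # epoch of the most recent check
        self.last_warned_pct = None  # whole percentage last warned about

    def should_warn(self, epoch, avail_pct):
        pct = int(avail_pct)  # collapse to a whole percentage point
        if epoch != self.last_epoch:
            # A new epoch resets the memory, so one warning is allowed again.
            self.last_epoch = epoch
            self.last_warned_pct = None
        if pct == self.last_warned_pct:
            return False  # already warned for this point in this epoch
        self.last_warned_pct = pct
        return True
```

Repeated checks at 25.7% and 25.2% in the same epoch would produce one warning under this reading, while a drop to 24.9% or a new epoch would produce another.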
12 years agoceph-disk: CalledProcessError has no output keyword on 2.6
Gary Lowell [Tue, 2 Apr 2013 19:11:10 +0000 (12:11 -0700)]
ceph-disk: CalledProcessError has no output keyword on 2.6

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
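
For context, `subprocess.CalledProcessError` only gained its `output` argument in Python 2.7; on 2.6 the constructor takes just `returncode` and `cmd`. The sketch below shows one possible way to stay compatible with both. The helper name `make_called_process_error` is hypothetical, and the log does not say how the commit itself solved the problem.

```python
import subprocess


def make_called_process_error(returncode, cmd, output=None):
    """Build a CalledProcessError on interpreters with or without 'output'.

    Hypothetical helper for illustration; not taken from ceph-disk.
    """
    try:
        return subprocess.CalledProcessError(returncode, cmd, output=output)
    except TypeError:
        # Python 2.6: the constructor rejects the 'output' keyword,
        # so attach the captured output as a plain attribute instead.
        err = subprocess.CalledProcessError(returncode, cmd)
        err.output = output
        return err
```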
12 years agoMerge pull request #185 from dalgaaf/wip-da-fix-misc-2
Sage Weil [Wed, 3 Apr 2013 00:41:50 +0000 (17:41 -0700)]
Merge pull request #185 from dalgaaf/wip-da-fix-misc-2

Bunch of fixes for issues from SCA

12 years agoMerge pull request #186 from dalgaaf/wip-da-pylint
Sage Weil [Wed, 3 Apr 2013 00:41:16 +0000 (17:41 -0700)]
Merge pull request #186 from dalgaaf/wip-da-pylint

Fix smaller python issues

12 years agoMerge pull request #187 from imjustmatthew/imjustmatthew_docs2
Sage Weil [Wed, 3 Apr 2013 00:40:50 +0000 (17:40 -0700)]
Merge pull request #187 from imjustmatthew/imjustmatthew_docs2

Adds "mds fail 0" command to operations command reference.

12 years agoMerge pull request #188 from dmick/wip-test-config-key
Dan Mick [Tue, 2 Apr 2013 23:38:12 +0000 (16:38 -0700)]
Merge pull request #188 from dmick/wip-test-config-key

test_mon_config_key.py: fix 'del' to clean up correctly internally

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agotest_mon_config_key.py: fix 'del' to clean up correctly internally 188/head
Dan Mick [Tue, 2 Apr 2013 22:03:17 +0000 (15:03 -0700)]
test_mon_config_key.py: fix 'del' to clean up correctly internally

12 years agoMerge remote-tracking branch 'origin/wip-4619'
Greg Farnum [Tue, 2 Apr 2013 21:38:44 +0000 (14:38 -0700)]
Merge remote-tracking branch 'origin/wip-4619'

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: initialize tableservers/clients on mds creation
Sage Weil [Tue, 2 Apr 2013 20:04:48 +0000 (13:04 -0700)]
mds: initialize tableservers/clients on mds creation

The handle_mds_recovery(who) path initializes the anchorclients by having
the server send a 'ready' message on recovery when the server is active
and a peer becomes active.  Similarly, recovery_done() does the same when
the server becomes active.  However, this misses the creation path.  Handle
that explicitly in boot_create.

Fixes: #4619
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoRevert "mds: trigger tableserver active/recovery hook even for self"
Sage Weil [Tue, 2 Apr 2013 20:05:46 +0000 (13:05 -0700)]
Revert "mds: trigger tableserver active/recovery hook even for self"

This reverts commit 968c6c0c9408b33904041e5ddbd9ea738e831713.

This will trigger the 'ready' message twice when we restart, because we
will trigger in both recovery_done() and handle_mds_recovery().

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoREADME: fix dependency lists
Dan Mick [Tue, 2 Apr 2013 20:01:04 +0000 (13:01 -0700)]
README: fix dependency lists

1) automake appeared twice
2) make apt-get command match the list

12 years agoAdds "mds fail 0" command to operations command reference. 187/head
Matthew Roy [Tue, 2 Apr 2013 17:57:53 +0000 (13:57 -0400)]
Adds "mds fail 0" command to operations command reference.
Partially fixes #2206, though better documentation will eventually be needed.

12 years agoMerge pull request #184 from dachary/wip-4617
Sage Weil [Tue, 2 Apr 2013 16:35:05 +0000 (09:35 -0700)]
Merge pull request #184 from dachary/wip-4617

explain what an inline xattr is and how it relates to omap

12 years agomds: trigger tableserver active/recovery hook even for self
Sage Weil [Tue, 2 Apr 2013 15:58:35 +0000 (08:58 -0700)]
mds: trigger tableserver active/recovery hook even for self

The tableserver now sends a READY message to clients when they go active;
we need to do this even for our own local tableclients, or else they do
not initialize and hang on first use after bringing up a fresh cluster.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoceph-disk: fix some (local) variable names 186/head
Danny Al-Gaaf [Tue, 2 Apr 2013 15:54:53 +0000 (17:54 +0200)]
ceph-disk: fix some (local) variable names

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: fix naming of local variable in is_mounted()
Danny Al-Gaaf [Tue, 2 Apr 2013 15:36:37 +0000 (17:36 +0200)]
ceph-disk: fix naming of local variable in is_mounted()

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: merge twice defined function is_mounted(dev)
Danny Al-Gaaf [Tue, 2 Apr 2013 15:33:08 +0000 (17:33 +0200)]
ceph-disk: merge twice defined function is_mounted(dev)

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: define exception type
Danny Al-Gaaf [Tue, 2 Apr 2013 15:26:12 +0000 (17:26 +0200)]
ceph-disk: define exception type

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: fix Redefining name 'uuid' from outer scope
Danny Al-Gaaf [Tue, 2 Apr 2013 15:17:38 +0000 (17:17 +0200)]
ceph-disk: fix Redefining name 'uuid' from outer scope

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph-disk: add missing space after comma
Danny Al-Gaaf [Tue, 2 Apr 2013 15:14:23 +0000 (17:14 +0200)]
ceph-disk: add missing space after comma

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_user.cc: reduce scope of variable 185/head
Danny Al-Gaaf [Tue, 2 Apr 2013 15:01:07 +0000 (17:01 +0200)]
rgw/rgw_user.cc: reduce scope of variable

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_user.h: remove twice initialized purge_keys
Danny Al-Gaaf [Tue, 2 Apr 2013 14:50:41 +0000 (16:50 +0200)]
rgw/rgw_user.h: remove twice initialized purge_keys

Remove twice initialized purge_keys from RGWUserAdminOpState();

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agotest_cors.cc: prefer prefix ++operator for iterator
Danny Al-Gaaf [Tue, 2 Apr 2013 14:36:17 +0000 (16:36 +0200)]
test_cors.cc: prefer prefix ++operator for iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agotest_cors.cc: use static_cast instead of C-Style cast
Danny Al-Gaaf [Tue, 2 Apr 2013 14:30:57 +0000 (16:30 +0200)]
test_cors.cc: use static_cast instead of C-Style cast

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agotest_cors.cc: use %u to format unsigned in sprintf()
Danny Al-Gaaf [Tue, 2 Apr 2013 14:25:22 +0000 (16:25 +0200)]
test_cors.cc: use %u to format unsigned in sprintf()

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_user.cc: use 'true' directly instead of variable
Danny Al-Gaaf [Tue, 2 Apr 2013 14:17:52 +0000 (16:17 +0200)]
rgw/rgw_user.cc: use 'true' directly instead of variable

Instead of passing 'true' via bool defer_user_update variable
in RGWUser::execute_modify() to keys.add() use it directly.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_user.cc: reduce scope of same_email in execute_modify()
Danny Al-Gaaf [Tue, 2 Apr 2013 14:15:27 +0000 (16:15 +0200)]
rgw/rgw_user.cc: reduce scope of same_email in execute_modify()

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_user.cc: remove some unused std::string variables
Danny Al-Gaaf [Tue, 2 Apr 2013 14:10:22 +0000 (16:10 +0200)]
rgw/rgw_user.cc: remove some unused std::string variables

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_cors_swift.h: prefer prefix ++operator for iterator
Danny Al-Gaaf [Tue, 2 Apr 2013 14:05:58 +0000 (16:05 +0200)]
rgw/rgw_cors_swift.h: prefer prefix ++operator for iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_op.cc: prefer prefix ++operator for iterator
Danny Al-Gaaf [Tue, 2 Apr 2013 14:03:50 +0000 (16:03 +0200)]
rgw/rgw_op.cc: prefer prefix ++operator for iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_op.cc: remove unused variable
Danny Al-Gaaf [Tue, 2 Apr 2013 14:03:10 +0000 (16:03 +0200)]
rgw/rgw_op.cc: remove unused variable

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_op.cc: use static_cast instead of C-Style cast
Danny Al-Gaaf [Tue, 2 Apr 2013 14:02:30 +0000 (16:02 +0200)]
rgw/rgw_op.cc: use static_cast instead of C-Style cast

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_cors_s3.cc: prefer prefix ++operator for iterator
Danny Al-Gaaf [Tue, 2 Apr 2013 14:00:12 +0000 (16:00 +0200)]
rgw/rgw_cors_s3.cc: prefer prefix ++operator for iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_cors_s3.cc: remove unused variable
Danny Al-Gaaf [Tue, 2 Apr 2013 13:57:37 +0000 (15:57 +0200)]
rgw/rgw_cors_s3.cc: remove unused variable

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_cors_s3.cc: use static_cast instead of C-Style cast
Danny Al-Gaaf [Tue, 2 Apr 2013 13:55:51 +0000 (15:55 +0200)]
rgw/rgw_cors_s3.cc: use static_cast instead of C-Style cast

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoexplain what an inline xattr is and how it relates to omap 184/head
Loic Dachary [Tue, 2 Apr 2013 13:54:57 +0000 (15:54 +0200)]
explain what an inline xattr is and how it relates to omap

The logic of the configuration flags related to xattr is clarified to define what an inline xattr is and when storing in the object map is preferred.

http://tracker.ceph.com/issues/4617 refs #4617

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agorgw/rgw_cors.cc: fix inefficient usage of string::find()
Danny Al-Gaaf [Tue, 2 Apr 2013 13:43:12 +0000 (15:43 +0200)]
rgw/rgw_cors.cc: fix inefficient usage of string::find()

Fix warning from cppcheck:
 [src/rgw/rgw_cors.cc:70]: (performance) Inefficient usage of
 string::find() in condition; string::compare() would be faster.

Instead of string::find() use boost::algorithm::starts_with().

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
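
The cppcheck warning is about using a general substring search to answer a prefix question. The commit is C++, but the same point can be shown with a rough Python analogue; the function names are hypothetical and this is not the code from rgw_cors.cc.

```python
def has_prefix_via_find(s, prefix):
    # find() may scan the whole string before reporting a miss,
    # even though only position 0 matters for a prefix test.
    return s.find(prefix) == 0


def has_prefix(s, prefix):
    # startswith() only compares the leading characters.
    return s.startswith(prefix)
```

Both functions return the same answer; the second states the intent directly and does no more work than the prefix length requires.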
12 years agorgw/rgw_cors.cc: prefer prefix ++operator for iterator
Danny Al-Gaaf [Tue, 2 Apr 2013 12:52:06 +0000 (14:52 +0200)]
rgw/rgw_cors.cc: prefer prefix ++operator for iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_cors.cc: use empty() instead of size() == 0
Danny Al-Gaaf [Tue, 2 Apr 2013 12:47:54 +0000 (14:47 +0200)]
rgw/rgw_cors.cc: use empty() instead of size() == 0

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_bucket.cc: prefer prefix ++operator for iterator
Danny Al-Gaaf [Tue, 2 Apr 2013 12:43:36 +0000 (14:43 +0200)]
rgw/rgw_bucket.cc: prefer prefix ++operator for iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_bucket.cc: remove unused variable
Danny Al-Gaaf [Tue, 2 Apr 2013 12:42:58 +0000 (14:42 +0200)]
rgw/rgw_bucket.cc: remove unused variable

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_user.h: initialize some bool variables in constructor
Danny Al-Gaaf [Tue, 2 Apr 2013 12:41:56 +0000 (14:41 +0200)]
rgw/rgw_user.h: initialize some bool variables in constructor

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw/rgw_user.h: move initialization in initialization list
Danny Al-Gaaf [Tue, 2 Apr 2013 12:39:24 +0000 (14:39 +0200)]
rgw/rgw_user.h: move initialization in initialization list

Move initialization of some variables from constructor body to
the initialization list.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorbd.cc: use static_cast instead of C-Style cast
Danny Al-Gaaf [Tue, 2 Apr 2013 12:24:28 +0000 (14:24 +0200)]
rbd.cc: use static_cast instead of C-Style cast

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agomds/Migrator.cc: prefer prefix ++operator for iterator
Danny Al-Gaaf [Tue, 2 Apr 2013 12:01:24 +0000 (14:01 +0200)]
mds/Migrator.cc: prefer prefix ++operator for iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agorgw: Create RESTful endpoint for user and bucket administration.
caleb miles [Mon, 25 Mar 2013 15:46:34 +0000 (11:46 -0400)]
rgw: Create RESTful endpoint for user and bucket administration.

Expose the following operations through a RESTful endpoint:
    user create
    user modify
    user remove
    subuser create
    subuser modify
    subuser remove
    key create
    key remove
    bucket list
    bucket stats
    bucket link
    bucket unlink
    bucket check
    bucket remove
    remove object

building on the existing /{admin} endpoint.

Signed-off-by: caleb miles <caleb.miles@inktank.com>

12 years agodoc/release-notes: v0.60
Sage Weil [Tue, 2 Apr 2013 01:17:27 +0000 (18:17 -0700)]
doc/release-notes: v0.60

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'next'
Gary Lowell [Tue, 2 Apr 2013 00:57:45 +0000 (17:57 -0700)]
Merge branch 'next'

12 years agoMerge pull request #181 from ceph/wip_4510
athanatos [Mon, 1 Apr 2013 23:32:34 +0000 (16:32 -0700)]
Merge pull request #181 from ceph/wip_4510

Scrub/repair should correctly handle truncation and EIO

Fixes #4510
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoPG::_scan_list: assert if error is neither -EIO nor -ENOENT 181/head
Samuel Just [Mon, 1 Apr 2013 23:20:13 +0000 (16:20 -0700)]
PG::_scan_list: assert if error is neither -EIO nor -ENOENT

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileStore: rename debug_delete_obj to debug_obj_on_delete
Samuel Just [Mon, 1 Apr 2013 23:16:13 +0000 (16:16 -0700)]
FileStore: rename debug_delete_obj to debug_obj_on_delete

This should make the method intent less confusing.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: _scan_list can now handle EIO on read, stat, get_omap_header
Samuel Just [Mon, 1 Apr 2013 23:11:44 +0000 (16:11 -0700)]
PG: _scan_list can now handle EIO on read, stat, get_omap_header

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoObjectStore: add allow_eio to read, stat, get_omap_header
Samuel Just [Mon, 1 Apr 2013 23:08:43 +0000 (16:08 -0700)]
ObjectStore: add allow_eio to read, stat, get_omap_header

This will allow enlightened callers to handle EIO.

Signed-off-by: Samuel Just <sam.just@inktank.com>
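
A minimal Python sketch of what an opt-in flag like `allow_eio` might look like from a caller's point of view. It assumes, without confirmation from this log, that a store which is not told otherwise treats EIO as fatal; `read_object` and `backend_read` are hypothetical names.

```python
import errno


def read_object(backend_read, allow_eio=False):
    """Return (rc, data) from a hypothetical backend read callable.

    Assumption for illustration: without allow_eio, an EIO result is
    treated as fatal; with it, the negative errno is passed back so the
    caller can react (for example, by scheduling a repair).
    """
    rc, data = backend_read()
    if rc == -errno.EIO and not allow_eio:
        raise AssertionError("unexpected EIO from object store")
    return rc, data
```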
12 years agoMerge pull request #183 from ceph/wip-4313-b
João Eduardo Luís [Mon, 1 Apr 2013 22:57:04 +0000 (15:57 -0700)]
Merge pull request #183 from ceph/wip-4313-b

qa: workunits: mon: test 'config-key' store

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agolibrados: test empty ObjectWriteOperation
Sage Weil [Mon, 1 Apr 2013 22:34:56 +0000 (15:34 -0700)]
librados: test empty ObjectWriteOperation

Tests that #2673 is fixed.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #182 from ceph/wip-no-cors-without-rgw
Sage Weil [Mon, 1 Apr 2013 21:56:30 +0000 (14:56 -0700)]
Merge pull request #182 from ceph/wip-no-cors-without-rgw

Makefile.am: disable building ceph_test_cors when radosgw is not enabled

12 years agoMakefile.am: disable building ceph_test_cors when radosgw is not enabled 182/head
Josh Durgin [Mon, 1 Apr 2013 21:04:02 +0000 (14:04 -0700)]
Makefile.am: disable building ceph_test_cors when radosgw is not enabled

This test depends on radosgw. Trying to build it without radosgw will
result in a compile error.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
12 years agov0.60 v0.60
Gary Lowell [Mon, 1 Apr 2013 19:22:53 +0000 (12:22 -0700)]
v0.60

12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Mon, 1 Apr 2013 18:52:46 +0000 (11:52 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agoMerge pull request #169 from ceph/wip-rbd-diff
Sage Weil [Mon, 1 Apr 2013 18:26:16 +0000 (11:26 -0700)]
Merge pull request #169 from ceph/wip-rbd-diff

rbd incremental backup/restore

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years agolibrados: don't use lockdep for AioCompletionImpl
Josh Durgin [Mon, 1 Apr 2013 18:09:52 +0000 (11:09 -0700)]
librados: don't use lockdep for AioCompletionImpl

This is a quick workaround for the next branch. A more complete fix
will be done for the master branch. This does not affect correctness,
just what qa runs with lockdep enabled do.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
12 years agotest: fix signed/unsigned comparison in test_cors
Greg Farnum [Mon, 1 Apr 2013 16:56:27 +0000 (09:56 -0700)]
test: fix signed/unsigned comparison in test_cors

Signed-off-by: Greg Farnum <greg@inktank.com>
Acked-by: Sage Weil <sage@inktank.com>
12 years agoPG: don't compare auth with itself
Samuel Just [Fri, 29 Mar 2013 23:30:14 +0000 (16:30 -0700)]
PG: don't compare auth with itself

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: pass authoritative scrub map to _scrub
Samuel Just [Sun, 31 Mar 2013 07:00:27 +0000 (00:00 -0700)]
PG: pass authoritative scrub map to _scrub

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: read_error should trigger a repair in _compare_scrub_objects
Samuel Just [Fri, 29 Mar 2013 00:03:28 +0000 (17:03 -0700)]
PG: read_error should trigger a repair in _compare_scrub_objects

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileStore,OSD: add mechanism for injecting EIO, truncating obj
Samuel Just [Tue, 26 Mar 2013 22:14:59 +0000 (15:14 -0700)]
FileStore,OSD: add mechanism for injecting EIO, truncating obj

This will be used in testing repair.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG::_select_auth_object: prefer a peer which did not hit a read error
Samuel Just [Tue, 26 Mar 2013 20:09:00 +0000 (13:09 -0700)]
PG::_select_auth_object: prefer a peer which did not hit a read error

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: make _select_auth_object smarter
Samuel Just [Tue, 26 Mar 2013 20:08:29 +0000 (13:08 -0700)]
PG: make _select_auth_object smarter

Previously, we just picked the first one to have the object in
question.  Now, we will attempt to choose one that satisfies as
many of the following as possible:
1) has the object (there must be one)
2) has an object_info attr
3) has a valid object_info attr
4) has an object_info whose size matches the scrubbed size

Signed-off-by: Samuel Just <sam.just@inktank.com>
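
The four criteria above can be read as a ranking. The Python sketch below is one hedged interpretation, not the C++ in PG.cc: `PeerCopy`, its field names, and `select_auth` are hypothetical, and the sketch assumes each criterion only counts when the ones before it hold.

```python
from dataclasses import dataclass


@dataclass
class PeerCopy:
    """Hypothetical summary of one peer's view of an object."""
    osd: int
    has_object: bool
    has_object_info: bool = False
    object_info_valid: bool = False
    size_matches: bool = False


def select_auth(peers):
    """Pick the peer satisfying the most criteria; None if nobody has it."""
    candidates = [p for p in peers if p.has_object]
    if not candidates:
        return None

    def score(p):
        s = 1  # criterion 1: has the object
        if p.has_object_info:
            s += 1  # criterion 2: has an object_info attr
            if p.object_info_valid:
                s += 1  # criterion 3: the attr is valid
                if p.size_matches:
                    s += 1  # criterion 4: size matches the scrubbed size
        return s

    # max() keeps the first of equally ranked peers, which mirrors the
    # old "first one to have the object" behaviour on ties.
    return max(candidates, key=score)
```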
12 years agoMerge branch 'wip-mds'
Greg Farnum [Mon, 1 Apr 2013 16:31:37 +0000 (09:31 -0700)]
Merge branch 'wip-mds'

12 years agomds: bump the protocol version.
Greg Farnum [Mon, 1 Apr 2013 16:27:27 +0000 (09:27 -0700)]
mds: bump the protocol version.

We've changed quite a lot of the restart behavior, as well as one
of the message encodings. This is cheaper and easier than using feature bits,
and CephFS is still a tech preview or whatever, so let's cover them using this.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agomds: don't roll back prepared table updates
Yan, Zheng [Sun, 31 Mar 2013 06:19:17 +0000 (14:19 +0800)]
mds: don't roll back prepared table updates

When the table server is recovering, it re-sends 'agree' messages for
prepared table updates. It is possible that the table client receives an
'agree' message before it commits the corresponding update. Don't
send a 'rollback' message back to the server in this case.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: clear scatter dirty if replica inode has no auth subtree
Yan, Zheng [Sun, 17 Mar 2013 03:13:38 +0000 (11:13 +0800)]
mds: clear scatter dirty if replica inode has no auth subtree

This avoids sending superfluous scatterlock state to the recovering MDS.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: don't replicate purging dentry
Yan, Zheng [Fri, 15 Mar 2013 05:09:34 +0000 (13:09 +0800)]
mds: don't replicate purging dentry

open_remote_ino is racy: someone may delete the inode's last linkage
while the MDS is discovering the inode.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: eval inodes with caps imported by cache rejoin message
Yan, Zheng [Sun, 17 Mar 2013 01:45:55 +0000 (09:45 +0800)]
mds: eval inodes with caps imported by cache rejoin message

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: try merging subtree after clear EXPORTBOUND
Yan, Zheng [Sat, 16 Mar 2013 13:43:17 +0000 (21:43 +0800)]
mds: try merging subtree after clear EXPORTBOUND

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: clear dirty inode rstat if import fails
Yan, Zheng [Sat, 16 Mar 2013 04:38:56 +0000 (12:38 +0800)]
mds: clear dirty inode rstat if import fails

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: don't open dirfrag while subtree is frozen
Yan, Zheng [Tue, 12 Mar 2013 12:51:43 +0000 (20:51 +0800)]
mds: don't open dirfrag while subtree is frozen

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: notify bystanders if export aborts
Yan, Zheng [Thu, 14 Mar 2013 03:57:16 +0000 (11:57 +0800)]
mds: notify bystanders if export aborts

So bystanders know the subtree is single auth earlier.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: fix export cancel notification
Yan, Zheng [Thu, 14 Mar 2013 04:24:54 +0000 (12:24 +0800)]
mds: fix export cancel notification

The comment says that if the importer is dead, bystanders think the
exporter is the only auth, as per mdcache->handle_mds_failure(). But
there is no such code in MDCache::handle_mds_failure().

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: unfreeze subtree if import aborts in PREPPED state
Yan, Zheng [Thu, 14 Mar 2013 04:01:08 +0000 (12:01 +0800)]
mds: unfreeze subtree if import aborts in PREPPED state

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: check MDS peer's state through mdsmap
Yan, Zheng [Thu, 14 Mar 2013 03:23:48 +0000 (11:23 +0800)]
mds: check MDS peer's state through mdsmap

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: avoid double auth pin for file recovery
Yan, Zheng [Thu, 14 Mar 2013 02:11:31 +0000 (10:11 +0800)]
mds: avoid double auth pin for file recovery

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: add dirty imported dirfrag to LogSegment
Yan, Zheng [Tue, 12 Mar 2013 08:11:13 +0000 (16:11 +0800)]
mds: add dirty imported dirfrag to LogSegment

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: send lock action message when auth MDS is in proper state.
Yan, Zheng [Tue, 12 Mar 2013 08:51:53 +0000 (16:51 +0800)]
mds: send lock action message when auth MDS is in proper state.

For a rejoining object, don't send the lock ACK message because lock states
are still uncertain. The lock ACK may confuse the object's auth MDS and
trigger an assertion.

If the object's auth MDS is not active, just skip sending NUDGE, REQRDLOCK
and REQSCATTER messages. MDCache::handle_mds_recovery() will take care
of them.

Also defer the caps release message until clientreplay or active.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: issue caps when lock state in replica becomes SYNC
Yan, Zheng [Tue, 12 Mar 2013 08:19:26 +0000 (16:19 +0800)]
mds: issue caps when lock state in replica becomes SYNC

This is needed because a client can request READ caps from a non-auth MDS.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: share inode max size after MDS recovers
Yan, Zheng [Tue, 12 Mar 2013 08:27:22 +0000 (16:27 +0800)]
mds: share inode max size after MDS recovers

The MDS may crash after journaling the new max size, but before sending
the new max size to the client. Later when the MDS recovers, the client
re-requests the new max size, but the MDS finds max size unchanged. So
the client waits for the new max size forever. This issue can be avoided
by checking the client cap's last_sent and sharing the inode max size if it
is zero (a reconnected cap's last_sent is zero).

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
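
The check described in this commit message can be reduced to a small predicate. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and the real logic lives in the C++ MDS code.

```python
def should_share_max_size(max_size_changed, cap_last_sent):
    """Decide whether to send the inode max size to a client.

    Hypothetical reduction of the commit's reasoning: a reconnected cap
    has last_sent == 0, so the client may never have seen the journaled
    max size and must be told even though the value looks unchanged.
    """
    return max_size_changed or cap_last_sent == 0
```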
12 years agomds: take object's versionlock when rejoining xlock
Yan, Zheng [Thu, 14 Mar 2013 12:56:27 +0000 (20:56 +0800)]
mds: take object's versionlock when rejoining xlock

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: reqid for rejoining authpin/wrlock needs to be a list
Yan, Zheng [Thu, 14 Mar 2013 12:29:53 +0000 (20:29 +0800)]
mds: reqid for rejoining authpin/wrlock needs to be a list

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: handle linkage mismatch during cache rejoin
Yan, Zheng [Thu, 14 Mar 2013 12:06:27 +0000 (20:06 +0800)]
mds: handle linkage mismatch during cache rejoin

In an MDS cluster, not all file system namespace operations that affect
multiple MDS use two-phase commit. Some operations use dentry link/unlink
messages to update the replica dentry's linkage after they are committed by
the master MDS. It's possible that the master MDS crashes after journaling an
operation, but before sending the dentry link/unlink messages. Later, when
the MDS recovers and receives cache rejoin messages from the surviving
MDS, it will find a linkage mismatch.

The original cache rejoin code does not properly handle the case in which
dentry unlink messages were missing. Unlinked inodes were linked to stray
dentries. So the cache rejoin ack message needs to push replicas of these
stray dentries to the surviving MDS.

This patch also adds code that handles cache expiration in the middle of
cache rejoining.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: encode dirfrag base in cache rejoin ack
Yan, Zheng [Wed, 13 Mar 2013 12:58:26 +0000 (20:58 +0800)]
mds: encode dirfrag base in cache rejoin ack

The cache rejoin ack message already encodes the inode base; make it also
encode the dirfrag base. This allows the message to replicate stray dentries
like the MDentryUnlink message. The function will be used by a later patch.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoMerge pull request #179 from ceph/wip-client-cond
Gregory Farnum [Mon, 1 Apr 2013 16:22:45 +0000 (09:22 -0700)]
Merge pull request #179 from ceph/wip-client-cond

client: always remove cond from list after waiting

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: include replica nonce in MMDSCacheRejoin::inode_strong
Yan, Zheng [Wed, 13 Mar 2013 12:47:11 +0000 (20:47 +0800)]
mds: include replica nonce in MMDSCacheRejoin::inode_strong

So the recovering MDS can properly handle cache expire messages.
Also increase the nonce value when sending the cache rejoin acks.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Also update the MMDSCacheRejoin encoding to the new format.
Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agomon: OSDMonitor: only output warn/err messages if quotas are set > 0
Joao Eduardo Luis [Mon, 1 Apr 2013 16:14:15 +0000 (17:14 +0100)]
mon: OSDMonitor: only output warn/err messages if quotas are set > 0

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomds: remove MDCache::rejoin_fetch_dirfrags()
Yan, Zheng [Wed, 13 Mar 2013 11:23:18 +0000 (19:23 +0800)]
mds: remove MDCache::rejoin_fetch_dirfrags()

In commit 77946dcdae (mds: fetch missing inodes from disk), I introduced
MDCache::rejoin_fetch_dirfrags(). But it basically duplicates the function
of MDCache::open_undef_dirfrags(), so just remove rejoin_fetch_dirfrags()
and make open_undef_dirfrags() also handle undefined inodes.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: fix MDS recovery involving cross authority rename
Yan, Zheng [Wed, 13 Mar 2013 10:56:27 +0000 (18:56 +0800)]
mds: fix MDS recovery involving cross authority rename

In an MDS cluster, a rename operation may involve multiple MDS. Suppose the
rename source's auth MDS crashes after some witness MDS have prepared
the rename but before the rename is committing. Later, when the MDS
recovers, its subtree map and linkages are different from the prepared
MDS'. This causes problems for both subtree resolve and cache rejoin.
The solution is, if the rename source's auth MDS fails, the prepared
witness MDS query the master MDS whether the operation is committing. If
it's not, roll back the rename, then send a resolve message to the
recovering MDS.

Another similar case is when a prepared witness MDS crashes while the
rename source's auth MDS has prepared or is preparing the operation.
When the witness recovers, the master just delays sending the resolve
ack message until it commits the operation.

This patch also updates Server::handle_client_rename(). Make preparing
the rename source's auth MDS be the final step before committing the
rename.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: send resolve acks after master updates are safely logged
Yan, Zheng [Wed, 13 Mar 2013 08:54:58 +0000 (16:54 +0800)]
mds: send resolve acks after master updates are safely logged

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: send cache rejoin messages after gathering all resolves
Yan, Zheng [Thu, 14 Mar 2013 07:06:45 +0000 (15:06 +0800)]
mds: send cache rejoin messages after gathering all resolves

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: don't send MDentry{Link,Unlink} before receiving cache rejoin
Yan, Zheng [Fri, 15 Mar 2013 02:34:09 +0000 (10:34 +0800)]
mds: don't send MDentry{Link,Unlink} before receiving cache rejoin

The active MDS calls MDCache::rejoin_scour_survivor_replicas() when it
receives the cache rejoin message. The function will remove the objects
replicated by MDentry{Link,Unlink} from replica map.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: set resolve/rejoin gather MDS set in advance
Yan, Zheng [Thu, 14 Mar 2013 16:08:39 +0000 (00:08 +0800)]
mds: set resolve/rejoin gather MDS set in advance

An active MDS may receive a resolve/rejoin message before receiving
the mdsmap message that claims the MDS cluster is in resolving/rejoining
state. So instead of setting the gather MDS set when receiving the mdsmap,
set them in advance when detecting an MDS failure.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: don't send resolve message between active MDS
Yan, Zheng [Thu, 14 Mar 2013 04:27:51 +0000 (12:27 +0800)]
mds: don't send resolve message between active MDS

When the MDS cluster is resolving, the current behavior is to send a subtree
resolve message to all other MDS and wait for all other MDS' resolve messages.
The problem is that active MDS can have different subtree maps due to rename.
Besides, gathering active MDS' resolve messages is also racy. The only
function of these messages is to disambiguate other MDS' imports. We can
replace it with an import finish notification.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>