git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
14 years agoimprove debug printing
Greg Farnum [Fri, 15 Apr 2011 22:49:46 +0000 (15:49 -0700)]
improve debug printing

14 years agomds: Unify migration-handling code in _commit_slave_rename.
Greg Farnum [Thu, 14 Apr 2011 22:53:09 +0000 (15:53 -0700)]
mds: Unify migration-handling code in _commit_slave_rename.

We need to handle locks and pins on exported inodes but we
were using a separate if block with its own (non-matching!) check
for no good reason.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agomds: _commit_slave_rename needs to drop auth_pins for exported xlocks.
Greg Farnum [Mon, 11 Apr 2011 23:55:09 +0000 (16:55 -0700)]
mds: _commit_slave_rename needs to drop auth_pins for exported xlocks.

Otherwise these pins are never dropped from the inode since we
don't go through our normal xlock teardown code. Now we do!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoMDS: Make _rename_apply inode import auth_pinning more intelligent.
Greg Farnum [Thu, 7 Apr 2011 00:05:26 +0000 (17:05 -0700)]
MDS: Make _rename_apply inode import auth_pinning more intelligent.

We don't want auth_pins on the locallocks (they're never auth_pinned)
and we only want new auth_pins that are for locks on the inode that we
imported -- not for each xlock that the mdr has everywhere (like,
say, on the srcdn)!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agomds: If we're a slave, clean up xlocks when we export an inode.
Greg Farnum [Thu, 31 Mar 2011 21:02:48 +0000 (14:02 -0700)]
mds: If we're a slave, clean up xlocks when we export an inode.

Because we can do an inode import during a rename that skips the usual
channels, we were getting into an odd state with the xlocks (which we
did as a slave for an inode that we exported away). Clean up the
record of these xlocks for inodes before we get into the request
cleanup (at which point we are labeled as no-longer-auth, and the
standard cleanup routines will break).

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agomds: properly drop imported xlocks.
Greg Farnum [Thu, 31 Mar 2011 00:10:05 +0000 (17:10 -0700)]
mds: properly drop imported xlocks.

Because we can do an inode import during a rename that skips the usual
channels, we were getting into an odd state with the xlocks (which
were formerly remote and are now local). Clean up the record of
those remote xlocks.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoMDS: Server takes auth_pins for xlocks on imported inodes.
Greg Farnum [Fri, 25 Mar 2011 23:41:49 +0000 (16:41 -0700)]
MDS: Server takes auth_pins for xlocks on imported inodes.

Should fix #934.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoosd: show "full" or "nearfull" in osdmap summary line
Sage Weil [Mon, 18 Apr 2011 16:57:55 +0000 (09:57 -0700)]
osd: show "full" or "nearfull" in osdmap summary line

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoMerge remote branch 'origin/stable'
Sage Weil [Mon, 18 Apr 2011 16:58:15 +0000 (09:58 -0700)]
Merge remote branch 'origin/stable'

Conflicts:
src/osdc/Journaler.cc

14 years agoMerge branch 'rgw_uid'
Yehuda Sadeh [Mon, 18 Apr 2011 16:56:08 +0000 (09:56 -0700)]
Merge branch 'rgw_uid'

14 years agorgw: remove get_user_info() and clean up
Yehuda Sadeh [Mon, 18 Apr 2011 15:56:52 +0000 (08:56 -0700)]
rgw: remove get_user_info() and clean up

rename all the get_uid_by_* to get_user_info_by_*, remove get_user_info()
and call the appropriate function instead (either the by_uid or by_access_key).

14 years agorgw: store user info on all indexes in the same format
Yehuda Sadeh [Mon, 18 Apr 2011 15:32:09 +0000 (08:32 -0700)]
rgw: store user info on all indexes in the same format

this breaks backward compatibility, we'll have to deal with that
later.

14 years agorgw_admin: can lookup user by access key
Yehuda Sadeh [Mon, 18 Apr 2011 15:15:11 +0000 (08:15 -0700)]
rgw_admin: can lookup user by access key

14 years agomount.ceph: behave when CONFIG_KEYS is not compiled in
Sage Weil [Mon, 18 Apr 2011 04:58:27 +0000 (21:58 -0700)]
mount.ceph: behave when CONFIG_KEYS is not compiled in

In that case we get ENOSYS.  This also implies an old version of the client
and that we should fall back.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
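The fallback described above can be sketched roughly as follows. This is an illustration only, not the mount.ceph code: `add_key` is a hypothetical callable standing in for the kernel keyring call, and the labels "keyring" and "legacy" are made up for the example.

```python
import errno

def choose_secret_path(add_key, secret):
    """Try the keyring first; fall back when the kernel lacks key support."""
    try:
        add_key(secret)  # hypothetical stand-in for the keyring call
    except OSError as e:
        if e.errno == errno.ENOSYS:
            # Key support is not compiled in, which also implies an old client.
            return "legacy"
        raise  # any other failure is a real error
    return "keyring"

def no_keys_support(secret):
    """Simulates a kernel built without CONFIG_KEYS."""
    raise OSError(errno.ENOSYS, "Function not implemented")
```

Only ENOSYS triggers the fallback; other errors are re-raised so that real failures are not silently masked.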
14 years agoradosgw_admin: Update manpage to new syntax
Wido den Hollander [Mon, 18 Apr 2011 00:40:46 +0000 (17:40 -0700)]
radosgw_admin: Update manpage to new syntax

Signed-off-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Colin McCabe <cmccabe@alumni.cmu.edu>
14 years agoMDS: Fix Locker::handle_reqrdlock for xlocked locks.
Greg Farnum [Wed, 13 Apr 2011 23:02:51 +0000 (16:02 -0700)]
MDS: Fix Locker::handle_reqrdlock for xlocked locks.

We previously dropped the request but that was inappropriate for that
one case because the replica has no way to trigger a resend.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: Always _open_parents when opening a new snaprealm
Sage Weil [Wed, 13 Apr 2011 20:57:49 +0000 (13:57 -0700)]
mds: Always _open_parents when opening a new snaprealm

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agomds: don't run all of try_subtree_merge on a rename across MDSes.
Greg Farnum [Mon, 11 Apr 2011 23:57:50 +0000 (16:57 -0700)]
mds: don't run all of try_subtree_merge on a rename across MDSes.

Previously we'd try and do the whole thing, which meant that
the replica got a lock twiddle before it had finished the export.
That broke things spectacularly, since we weren't respecting our
invariants about who gets remote locking messages.
Now we pass through a flag and respect our invariants.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: adjust LocalLock can_xlock_local().
Greg Farnum [Thu, 7 Apr 2011 00:03:12 +0000 (17:03 -0700)]
mds: adjust LocalLock can_xlock_local().

I don't remember why we needed can_xlock_local() to begin with, but
I can tell that adding this get_xlock_by() check won't stop anything
working that was ever working to begin with (really it's still not
strong enough a check).

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: Extend use of find_ino_peers.
Greg Farnum [Thu, 7 Apr 2011 00:01:53 +0000 (17:01 -0700)]
mds: Extend use of find_ino_peers.

Missed a few places that need it.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: Make use of find_ino_peers
Greg Farnum [Fri, 1 Apr 2011 00:25:52 +0000 (17:25 -0700)]
mds: Make use of find_ino_peers

Previously we just had to give up on ESTALE. Now
we can attempt to recover!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agorandom commenting
Greg Farnum [Thu, 24 Mar 2011 21:26:46 +0000 (14:26 -0700)]
random commenting

14 years agoMDS: Remove inappropriate assert from _logged_slave_rename.
Greg Farnum [Thu, 24 Mar 2011 21:11:06 +0000 (14:11 -0700)]
MDS: Remove inappropriate assert from _logged_slave_rename.

The slave also can hold some auth pins from locks which the
master has asked it to grab. It's possible we can intelligently
determine how many, but for now just drop the assert.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoMDS: Server::handle_slave_rename_prep now accounts for dir snaplock.
Greg Farnum [Thu, 24 Mar 2011 19:23:38 +0000 (12:23 -0700)]
MDS: Server::handle_slave_rename_prep now accounts for dir snaplock.

Previously it ignored the auth pin required to hold snap xlock, which
is currently always held for a rename on a dir. This would lead to
a permanent hang on the request. Now we account for it!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoMDS: Don't move inode to snaprealms if not primary inode.
Greg Farnum [Wed, 23 Mar 2011 18:50:43 +0000 (11:50 -0700)]
MDS: Don't move inode to snaprealms if not primary inode.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoMDCache: update assert to account for being a slave.
Greg Farnum [Wed, 23 Mar 2011 17:41:36 +0000 (10:41 -0700)]
MDCache: update assert to account for being a slave.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoServer: push_projected_linkage in _link_remote
Greg Farnum [Tue, 22 Mar 2011 22:27:21 +0000 (15:27 -0700)]
Server: push_projected_linkage in _link_remote

_link_remote_finish will pop the linkage if inc==true, so we'd
better push it to match!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoServer: ensure slave mdses have full dest tree
Greg Farnum [Tue, 22 Mar 2011 21:23:33 +0000 (14:23 -0700)]
Server: ensure slave mdses have full dest tree

We were already taking rdlocks on the source tree, to make
sure that each slave MDS could traverse to the source dentry. Now,
if there are slave MDSes, we take rdlocks on each destination
ancestor to make sure the slaves can also traverse there.
This fixes an fsstress bug.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agorgw: basic support for separate uid and access key
Yehuda Sadeh [Sat, 16 Apr 2011 00:20:44 +0000 (17:20 -0700)]
rgw: basic support for separate uid and access key

14 years agomds: fix null deref in debug
Sage Weil [Fri, 15 Apr 2011 23:32:45 +0000 (16:32 -0700)]
mds: fix null deref in debug

The *dir isn't always non-null (namely, during DISCOVERING state).

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: keep import/export subtree_map state in sync with journal
Sage Weil [Fri, 15 Apr 2011 22:51:50 +0000 (15:51 -0700)]
mds: keep import/export subtree_map state in sync with journal

We were being sloppy before with the ESubtreeMap vs import/export events.
Fix that by doing a few things:

 - add an ambig flag to the subtree map items, and set it for in-progress
   imports.  That means an ESubtreeMap followed by EImportFinish will do
   the right thing now.
 - adjust the dir_auth on EExport journaling (handle_export_dir_ack) so
   that our journaled subtree_map state is always in sync with what we
   see during replay.

Also document clearly what the dir_auth variations actually mean.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix export cancel during IMPORT_PREPPING
Sage Weil [Fri, 15 Apr 2011 20:53:54 +0000 (13:53 -0700)]
mds: fix export cancel during IMPORT_PREPPING

If we are in PREPPING, we need to drop the stickydirs() on the inodes, and
not the pins on the dirfrags.  Do this in the helper so we can keep the
call chains simple.

Also deal with the case where we get a cancel in PREPPED state.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: clean up trim_non_auth_subtree output
Sage Weil [Fri, 15 Apr 2011 17:05:50 +0000 (10:05 -0700)]
mds: clean up trim_non_auth_subtree output

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: cancel exports in PREPPING state on any failure
Sage Weil [Thu, 14 Apr 2011 01:36:33 +0000 (18:36 -0700)]
mds: cancel exports in PREPPING state on any failure

The prepping nodes may need to discover bounds from the failed node and
may hang indefinitely.  Meanwhile, we won't send out mds_resolve messages
until in-progress migrations complete.  Deadlock.

In certain cases the importing node can manufacture the replica.  If it
doesn't realize that right off, though, it will get hung up trying to
discover from the wrong node, get referred to the failed node, and block
waiting for recovery.  The replica forging is a bit suspect anyway, so
let's avoid the whole thing if we can!

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: use helpers for import_reverse
Sage Weil [Thu, 14 Apr 2011 01:34:55 +0000 (18:34 -0700)]
mds: use helpers for import_reverse

Use helpers for common code shared between handle_export_cancel and
handle_mds_failure_or_stop.

Also include handling for IMPORT_PREPPING state, even though we don't use
it yet.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: don't skip inodes in journal that may be trimmed during replay
Sage Weil [Fri, 15 Apr 2011 17:02:46 +0000 (10:02 -0700)]
mds: don't skip inodes in journal that may be trimmed during replay

During replay we trim non-auth inodes on EExport or EImportFinish abort.
Subtree trimming may be delayed, too.

Skip parents if the diri is in the same blob, or if it is journaled in the
current segment *and* it is in a subtree that is unambiguously auth.  We can't
easily be more precise than that because the actual event we care about on
replay is EExport, but the migrator doesn't twiddle auth bits to false until
later.

Also, reset last_journaled on import.

This fixes replay bugs like

2011-04-13 18:15:18.064029 7f65588ef710 mds1.journal EImportStart.replay 10000000015 bounds []
2011-04-13 18:15:18.064034 7f65588ef710 mds1.journal EMetaBlob.replay 2 dirlumps by unknown0
2011-04-13 18:15:18.064040 7f65588ef710 mds1.journal EMetaBlob.replay dir 10000000010
2011-04-13 18:15:18.064046 7f65588ef710 mds1.journal EMetaBlob.replay missing dir ino  10000000010
mds/journal.cc: In function 'void EMetaBlob::replay(MDS*, LogSegment*)', in thread '0x7f65588ef710'
mds/journal.cc: 407: FAILED assert(0)
 ceph version 0.25-683-g653580a (commit:653580ae84c471c34872f14a0308c78af71f7243)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x53) [0xa53d26]
 2: (EMetaBlob::replay(MDS*, LogSegment*)+0x7eb) [0x7a737d]

Fixes: #994
Signed-off-by: Sage Weil <sage@newdream.net>
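The skip condition stated in this message can be written as a single predicate. This is a sketch under assumptions: the function and its three boolean parameters are hypothetical names, not identifiers from the MDS source.

```python
def can_skip_parent(in_same_blob: bool,
                    journaled_in_current_segment: bool,
                    subtree_unambiguously_auth: bool) -> bool:
    """Return True when the parent dir inode need not be journaled again."""
    # Case 1: the diri is already in the same blob.
    if in_same_blob:
        return True
    # Case 2: journaled in the current segment *and* in a subtree that is
    # unambiguously auth. Both conditions must hold.
    return journaled_in_current_segment and subtree_unambiguously_auth
```

Note that being journaled in the current segment is not enough on its own: if the subtree's authority is ambiguous, the inode may be trimmed during replay, so the parent must be written out.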
14 years agoconfig: warn about old-style conf section names
Colin Patrick McCabe [Fri, 15 Apr 2011 22:21:36 +0000 (15:21 -0700)]
config: warn about old-style conf section names

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoman: Update cmds documentation.
Greg Farnum [Fri, 15 Apr 2011 22:54:12 +0000 (15:54 -0700)]
man: Update cmds documentation.

You always need to specify a rank if you do journal-check.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agovstart.sh: use new-style section names in config
Colin Patrick McCabe [Fri, 15 Apr 2011 21:45:56 +0000 (14:45 -0700)]
vstart.sh: use new-style section names in config

Use new-style section names in vstart.sh.
Also update sample.ceph.conf.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomon: don't check for old-style monitor section name
Colin Patrick McCabe [Fri, 15 Apr 2011 21:40:49 +0000 (14:40 -0700)]
mon: don't check for old-style monitor section name

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agocconf: update man page
Colin Patrick McCabe [Fri, 15 Apr 2011 21:34:39 +0000 (14:34 -0700)]
cconf: update man page

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomkcephfs, init-ceph: tolerate complete lack of a type
Sage Weil [Fri, 15 Apr 2011 21:03:20 +0000 (14:03 -0700)]
mkcephfs, init-ceph: tolerate complete lack of a type

We were bailing out of mkcephfs with a config with no mds's defined
(because we set -e and grep returns an error here).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoobjecter: log when we defer a write because of FULL osdmap flag
Sage Weil [Fri, 15 Apr 2011 21:03:40 +0000 (14:03 -0700)]
objecter: log when we defer a write because of FULL osdmap flag

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoconfig: do not accept old-style section names
Colin Patrick McCabe [Fri, 15 Apr 2011 20:59:57 +0000 (13:59 -0700)]
config: do not accept old-style section names

Stop accepting old-style section names of the form $type$id.  Instead,
we want section names of the form $type.$id.  So [osd0] will no longer
be a valid section name; instead, use [osd.0].

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
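For anyone migrating an existing ceph.conf, the renaming rule above could be sketched like this. The helper is hypothetical and not part of Ceph; it assumes the daemon types osd, mon and mds and purely numeric ids, which is narrower than what real configs may contain.

```python
import re

# Assumed daemon types, for illustration only.
KNOWN_TYPES = ("osd", "mon", "mds")

def to_new_style(section: str) -> str:
    """Rewrite an old-style section name ($type$id) as $type.$id."""
    for daemon_type in KNOWN_TYPES:
        match = re.fullmatch(rf"{daemon_type}(\d+)", section)
        if match:
            return f"{daemon_type}.{match.group(1)}"
    # Already new-style, or not a per-daemon section: leave untouched.
    return section
```

With this sketch, `to_new_style("osd0")` gives `"osd.0"`, while `"osd.0"` and `"global"` pass through unchanged.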
14 years agocconf: fix usage; clean up some code
Colin Patrick McCabe [Fri, 15 Apr 2011 20:49:55 +0000 (13:49 -0700)]
cconf: fix usage; clean up some code

cconf: fix obsolete usage message. Add --list-all-sections flag.
Use new ceph_argparse stuff. Update tests.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoconfig: normalize key names, cleanup
Colin Patrick McCabe [Fri, 15 Apr 2011 19:03:12 +0000 (12:03 -0700)]
config: normalize key names, cleanup

Normalize key names in md_config_t::get_val and md_config_t::set_val

Remove unused fields from struct config_option.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
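The commit does not spell out the normalization rule. As a minimal sketch, assume it means that spellings such as "log file" and "log_file" (both appear in this log) refer to the same key. `ToyConfig` and `normalize_key` are hypothetical names, not the md_config_t implementation.

```python
def normalize_key(key: str) -> str:
    """Collapse whitespace and join the words with underscores (assumed rule)."""
    return "_".join(key.split())

class ToyConfig:
    """Stand-in for a config object that normalizes keys on both paths."""

    def __init__(self):
        self._values = {}

    def set_val(self, key: str, value: str) -> None:
        self._values[normalize_key(key)] = value

    def get_val(self, key: str) -> str:
        return self._values[normalize_key(key)]
```

Normalizing in both the setter and the getter means a value stored under one spelling can be read back under the other.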
14 years agorgw: fix other err related issues
Yehuda Sadeh [Fri, 15 Apr 2011 18:15:11 +0000 (11:15 -0700)]
rgw: fix other err related issues

also remove the now redundant formatter->flush()

14 years agorgw: adjustments to error handling
Yehuda Sadeh [Fri, 15 Apr 2011 17:52:14 +0000 (10:52 -0700)]
rgw: adjustments to error handling

fixing mixup between s3 error code and s3 error message

14 years agolibceph: implement ceph_conf_set and ceph_conf_get
Colin Patrick McCabe [Fri, 15 Apr 2011 17:37:05 +0000 (10:37 -0700)]
libceph: implement ceph_conf_set and ceph_conf_get

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomds: init metablob MDLog* for EImportStart
Sage Weil [Thu, 14 Apr 2011 01:59:14 +0000 (18:59 -0700)]
mds: init metablob MDLog* for EImportStart

This will initialize metablob.my_offset, which makes the parent inode
journaling logic work properly.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoinit-ceph: no log_dir default
Sage Weil [Thu, 14 Apr 2011 01:06:29 +0000 (18:06 -0700)]
init-ceph: no log_dir default

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix journal offset types
Sage Weil [Thu, 14 Apr 2011 02:01:15 +0000 (19:01 -0700)]
mds: fix journal offset types

Always uint64_t!

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: show migration state names on cancel
Sage Weil [Thu, 14 Apr 2011 02:08:05 +0000 (19:08 -0700)]
mds: show migration state names on cancel

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agorgw: rework error handling a bit
Colin Patrick McCabe [Thu, 14 Apr 2011 23:14:48 +0000 (16:14 -0700)]
rgw: rework error handling a bit

Rados Gateway: get rid of RGWOp::err. We already have req_state::err and
that represents the same thing.

Standardize nomenclature for errors. 'errno' is our internal
representation of the error. 'code' is what is returned by S3.
'message' is the message at the end. Improve rgw_err.

dump_errno shouldn't modify req_state, but just dump the error.
A new function set_req_state_err sets the error based on an 'errno'.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
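The split between setting and dumping an error might look roughly like the sketch below. Everything here is an assumption made for illustration: the class names, the errno-to-S3 table, and the output format are not taken from the rgw source. Only the function names dump_errno and set_req_state_err come from the commit message.

```python
import errno
from dataclasses import dataclass, field

@dataclass
class ReqErr:
    http_status: int = 200
    code: str = ""      # what is returned by S3, e.g. "NoSuchKey"
    message: str = ""   # the free-form message at the end

@dataclass
class ReqState:
    err: ReqErr = field(default_factory=ReqErr)

# Assumed mapping from the internal errno to (HTTP status, S3 code).
ERRNO_TABLE = {
    errno.ENOENT: (404, "NoSuchKey"),
    errno.EACCES: (403, "AccessDenied"),
    errno.EEXIST: (409, "BucketAlreadyExists"),
}

def set_req_state_err(state: ReqState, err_no: int, message: str = "") -> None:
    """The only place that mutates state.err; accepts errno or -errno."""
    status, code = ERRNO_TABLE.get(abs(err_no), (500, "InternalError"))
    state.err = ReqErr(status, code, message)

def dump_errno(state: ReqState) -> str:
    """Read-only: formats the error without touching the request state."""
    return f"{state.err.http_status} {state.err.code}"
```

Keeping the mutation in one function makes it easier to see that 'code' and 'message' cannot be swapped by accident on the output path.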
14 years agoconfig: add test for override ordering, comment
Colin Patrick McCabe [Thu, 14 Apr 2011 20:45:13 +0000 (13:45 -0700)]
config: add test for override ordering, comment

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoconfig: de-globalize reading config file
Colin Patrick McCabe [Thu, 14 Apr 2011 22:26:20 +0000 (15:26 -0700)]
config: de-globalize reading config file

Reading a config file into any md_config_t structure except g_conf used
to be impossible. This is because the config_option code used to
contain explicit references to g_conf. Those have been removed, so now
any md_config_t should be able to read a configuration file.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoradosgw_admin: fix make check
Colin Patrick McCabe [Thu, 14 Apr 2011 22:18:31 +0000 (15:18 -0700)]
radosgw_admin: fix make check

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoconfig: make md_config_t.name a value, not ptr
Colin Patrick McCabe [Thu, 14 Apr 2011 22:10:08 +0000 (15:10 -0700)]
config: make md_config_t.name a value, not ptr

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agorgw: don't modify object owner when setting acls
Yehuda Sadeh [Thu, 14 Apr 2011 21:59:09 +0000 (14:59 -0700)]
rgw: don't modify object owner when setting acls

14 years agorgw: allow changing acl using canned acl
Yehuda Sadeh [Thu, 14 Apr 2011 21:41:13 +0000 (14:41 -0700)]
rgw: allow changing acl using canned acl

14 years agoradosgw_admin: add 'bucket unlink' option
Yehuda Sadeh [Thu, 14 Apr 2011 18:08:17 +0000 (11:08 -0700)]
radosgw_admin: add 'bucket unlink' option

14 years agomkcephfs: Actually do a mkfs.btrfs
Wido den Hollander [Thu, 31 Mar 2011 17:59:40 +0000 (19:59 +0200)]
mkcephfs: Actually do a mkfs.btrfs

Signed-off-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
14 years agoMonitorStore: use sync_filesystem when available
Colin Patrick McCabe [Thu, 14 Apr 2011 00:40:02 +0000 (17:40 -0700)]
MonitorStore: use sync_filesystem when available

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agodout: log_per_instance should work with log_file
Colin Patrick McCabe [Wed, 13 Apr 2011 21:43:21 +0000 (14:43 -0700)]
dout: log_per_instance should work with log_file

Now log_per_instance (the symlink dance) works with both log_file and
log_dir. This will facilitate gradually removing log_dir.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoRadosModel: error handling fixes
Samuel Just [Tue, 12 Apr 2011 19:02:22 +0000 (12:02 -0700)]
RadosModel: error handling fixes

ReadOp should read the receive length to prevent a buffer error.

Check error codes on WriteOp and ReadOp.

Signed-off-by: Samuel Just <rexludorum@gmail.com>
14 years agofilestore: fix do_getxattr check
Colin Patrick McCabe [Wed, 13 Apr 2011 21:49:48 +0000 (14:49 -0700)]
filestore: fix do_getxattr check

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoFileStore: give better error message about xattrs
Colin Patrick McCabe [Wed, 13 Apr 2011 20:52:58 +0000 (13:52 -0700)]
FileStore: give better error message about xattrs

Fixes #952.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomds: fix dn unlocking on export_reverse
Sage Weil [Wed, 13 Apr 2011 19:03:31 +0000 (12:03 -0700)]
mds: fix dn unlocking on export_reverse

Triggered by mds_kill_import_at 5.  We were clearing the export_locks
prior to calling export_unlock (der!).

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: during export list target second
Sage Weil [Wed, 13 Apr 2011 18:27:21 +0000 (11:27 -0700)]
mds: during export list target second

We need to maintain the invariant that (dir_auth.first==whoami) == is_auth.

Signed-off-by: Sage Weil <sage@newdream.net>
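A small sketch of why the ordering matters, assuming the exporting node is still authoritative while the export is in progress. `DirAuth`, `export_dir_auth` and `invariant_holds` are hypothetical names used only for this illustration.

```python
from typing import NamedTuple

class DirAuth(NamedTuple):
    first: int   # rank listed first
    second: int  # rank listed second

def export_dir_auth(whoami: int, target: int) -> DirAuth:
    """During export, list ourselves first and the target second."""
    return DirAuth(first=whoami, second=target)

def invariant_holds(dir_auth: DirAuth, whoami: int, is_auth: bool) -> bool:
    """Check (dir_auth.first == whoami) == is_auth for one node."""
    return (dir_auth.first == whoami) == is_auth
```

If the target were listed first instead, the exporter (still auth) would see `dir_auth.first != whoami` and the invariant would be violated.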
14 years agomds: do not start_new_segment on replay_start
Sage Weil [Wed, 13 Apr 2011 18:26:15 +0000 (11:26 -0700)]
mds: do not start_new_segment on replay_start

We do not need to start a new segment after replay.  And in fact must not
journal an ESubtreeMap prior to doing resolve!

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix bad import_state check on handle_export_discover
Sage Weil [Wed, 13 Apr 2011 17:17:24 +0000 (10:17 -0700)]
mds: fix bad import_state check on handle_export_discover

This populates import_state[] with a bad value and leads to crashes like

mds/Migrator.h: In function 'static const char* Migrator::get_import_statename(int)', in thread '0x7f5ea8c97710'
mds/Migrator.h: 112: FAILED assert(0)
 ceph version 0.25-670-g85bd67e (commit:85bd67e0ab58876ad807b44ab2154e84b90a4f30)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x53) [0xa53ad6]
 2: (Migrator::get_import_statename(int)+0x68) [0x91ea0f]
 3: (Migrator::show_importing()+0x174) [0x90f640]

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: queue rejoin_waiters on rejoin_ack survivor
Sage Weil [Wed, 13 Apr 2011 17:06:34 +0000 (10:06 -0700)]
mds: queue rejoin_waiters on rejoin_ack survivor

For recovering nodes, we eventually open_snap_parents and much later
requeue these waiters.  A surviving node wasn't requeueing them at all.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix resolve
Sage Weil [Wed, 13 Apr 2011 03:57:11 +0000 (20:57 -0700)]
mds: fix resolve

This was broken by a01fba175b646f6 when an ambiguous import was changed
from CDIR_AUTH_UNKNOWN to <whoami,whoami> and disambiguate_imports wasn't
updated accordingly.  The result was inconsistent results for subtree
ownership on different nodes.

This updates disambiguate_imports to match that EImportStart::replay
change.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: don't check_rstats on non-auth or frozen dirs
Sage Weil [Tue, 12 Apr 2011 23:08:46 +0000 (16:08 -0700)]
mds: don't check_rstats on non-auth or frozen dirs

If we are, say, auth but frozen (mid-import) the dir content isn't valid
and check_rstats will likely fail.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: fix _freeze_dir assert for refragment case
Sage Weil [Tue, 12 Apr 2011 22:32:17 +0000 (15:32 -0700)]
mds: fix _freeze_dir assert for refragment case

The is_freezeable_dir() is true at freeze time but not forever after over
the lifetime of the freeze.  We split later on and _freeze_dir on the new
fragments, so this assertion isn't necessarily true then.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: fix choose_lock_state() on xlocked object
Sage Weil [Tue, 12 Apr 2011 22:10:54 +0000 (15:10 -0700)]
mds: fix choose_lock_state() on xlocked object

This crops up on inodes during clientreplay when we reconnect the cap
on the newly created (and still xlocked) object.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: Use syncfs when available
Colin Patrick McCabe [Wed, 13 Apr 2011 20:28:52 +0000 (13:28 -0700)]
osd: Use syncfs when available

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agorgw: create bucket with empty name should return a valid error
Yehuda Sadeh [Wed, 13 Apr 2011 20:29:24 +0000 (13:29 -0700)]
rgw: create bucket with empty name should return a valid error

14 years agorgw: recreation of bucket returns success
Yehuda Sadeh [Wed, 13 Apr 2011 17:38:11 +0000 (10:38 -0700)]
rgw: recreation of bucket returns success

unless it was owned by a different user, in which case it
returns -EEXIST.
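
The rule above can be sketched with a plain dictionary standing in for the bucket index. This is an illustration only; `create_bucket` here is a hypothetical function, not the rgw one.

```python
import errno

def create_bucket(buckets: dict, name: str, owner: str) -> int:
    """Return 0 on success, or -EEXIST if another user owns the bucket."""
    existing_owner = buckets.get(name)
    if existing_owner is None:
        buckets[name] = owner  # first creation
        return 0
    if existing_owner == owner:
        return 0  # recreation by the same user counts as success
    return -errno.EEXIST
```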

14 years agomds: update rstats on stray dir when you rename over existing inode.
Greg Farnum [Wed, 13 Apr 2011 17:36:05 +0000 (10:36 -0700)]
mds: update rstats on stray dir when you rename over existing inode.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agosample.ceph.conf: add log file and pid file
Colin Patrick McCabe [Wed, 13 Apr 2011 17:02:25 +0000 (10:02 -0700)]
sample.ceph.conf: add log file and pid file

These really should be included in a sample...

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agovstart.sh: use "log file" instead of "log dir"
Colin Patrick McCabe [Wed, 13 Apr 2011 16:59:16 +0000 (09:59 -0700)]
vstart.sh: use "log file" instead of "log dir"

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agorgw: listing non existent bucket returns NoSuchBucket
Yehuda Sadeh [Wed, 13 Apr 2011 15:45:43 +0000 (08:45 -0700)]
rgw: listing non existent bucket returns NoSuchBucket

14 years agoMerge remote branch 'origin/mon_mds'
Sage Weil [Wed, 13 Apr 2011 02:57:18 +0000 (19:57 -0700)]
Merge remote branch 'origin/mon_mds'

14 years agoosd: move MAX_CEPH_OBJECT_NAME_LEN into object.h
Colin Patrick McCabe [Wed, 13 Apr 2011 00:20:21 +0000 (17:20 -0700)]
osd: move MAX_CEPH_OBJECT_NAME_LEN into object.h

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoosd: check obj name length to avoid ENAMETOOLONG
Colin Patrick McCabe [Tue, 12 Apr 2011 23:41:06 +0000 (16:41 -0700)]
osd: check obj name length to avoid ENAMETOOLONG

Since the object store is ultimately based on ext3, ext4, or btrfs, and
object names ultimately get translated into file names, we need to
impose a corresponding limit on the length of ceph object names.

Otherwise, the "writeback" thread in the FileStore gets ENAMETOOLONG,
and the transaction does not succeed, even though we journalled it.

Perhaps we will extend or eliminate MAX_CEPH_OBJECT_NAME_LEN at some
point by using prehashing or some other technique. Until then, we need
to be sure to check for this.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
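The point of the check is ordering: reject the name before anything is journaled. A minimal sketch, assuming a placeholder limit (the real value of MAX_CEPH_OBJECT_NAME_LEN is not given here) and a list standing in for the journal:

```python
import errno

# Placeholder value for illustration; not the real MAX_CEPH_OBJECT_NAME_LEN.
ASSUMED_MAX_NAME_LEN = 100

def check_object_name(name: str, max_len: int = ASSUMED_MAX_NAME_LEN) -> int:
    """Return 0 if the name fits, -ENAMETOOLONG otherwise."""
    # File name limits are in bytes, so measure the encoded length.
    if len(name.encode("utf-8")) > max_len:
        return -errno.ENAMETOOLONG
    return 0

def submit_write(journal: list, name: str, data: bytes) -> int:
    """Hypothetical write path: validate first, journal second."""
    r = check_object_name(name)
    if r < 0:
        return r  # rejected up front, nothing was journaled
    journal.append((name, data))
    return 0
```

Because the check runs first, a too-long name never reaches the journal, which avoids the journaled-but-failed transaction described above.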
14 years agolibrbd: don't write to stdout
Josh Durgin [Tue, 12 Apr 2011 23:39:48 +0000 (16:39 -0700)]
librbd: don't write to stdout

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoAdd test_mutate
Colin Patrick McCabe [Tue, 12 Apr 2011 21:05:21 +0000 (14:05 -0700)]
Add test_mutate

Add test_mutate, in an effort to track down an objecter bug.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomdsmap: initialize standby_for_rank
Sage Weil [Tue, 12 Apr 2011 21:00:24 +0000 (14:00 -0700)]
mdsmap: initialize standby_for_rank

This is initialized in MDSMonitor anyway; do so here for completeness.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomon: simplify mds follow checks
Sage Weil [Tue, 12 Apr 2011 21:13:56 +0000 (14:13 -0700)]
mon: simplify mds follow checks

Instead of assigning followers in the last_beacon laggy check loop, do it
at the end, the same way we let standby nodes take over.

This also fixes a bug where a non-standby node (say, up:replay) that used
to be up:standby-replay and has standby_for_rank set gets reset back to
up:standby-replay.

Fixes: #1001
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomon: simplify mds laggy check
Sage Weil [Tue, 12 Apr 2011 20:54:53 +0000 (13:54 -0700)]
mon: simplify mds laggy check

We should never have a laggy standby, so technically this doesn't change
any behavior, but it makes the flow less confusing.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomon: don't take over for a standby-replay
Sage Weil [Tue, 12 Apr 2011 20:33:32 +0000 (13:33 -0700)]
mon: don't take over for a standby-replay

If a standby-replay is laggy we shouldn't "take over" for them (they're
not part of the cluster yet).  They should be removed like a regular
standby.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agorados-tool: use init_with_config interface
Colin Patrick McCabe [Tue, 12 Apr 2011 21:04:06 +0000 (14:04 -0700)]
rados-tool: use init_with_config interface

Programs that use both librados and common_init should use
init_with_config.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomds: make _create_system_file dirty dentries properly
Sage Weil [Tue, 12 Apr 2011 18:08:39 +0000 (11:08 -0700)]
mds: make _create_system_file dirty dentries properly

Properly dirty the new dentries so they get written to the directory
objects later on.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: fix create_mydir_hierarchy to save dir
Sage Weil [Tue, 12 Apr 2011 18:07:54 +0000 (11:07 -0700)]
mds: fix create_mydir_hierarchy to save dir

Mark the dentries dirty so they get saved to disk (they're not journaled!).
This fixes rstat problems on startup, where populate_mydir was recreating
the entries and munging rstats accordingly.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: improve scatterlog debug msg
Sage Weil [Mon, 11 Apr 2011 21:54:59 +0000 (14:54 -0700)]
mds: improve scatterlog debug msg

14 years agomds: clear flush state on rejoin ack
Sage Weil [Wed, 6 Apr 2011 22:12:57 +0000 (15:12 -0700)]
mds: clear flush state on rejoin ack

If we sent scatterlock state during rejoin, the auth will send us an inode
base.  Clear scatterlock flush state if that happens.

Fixes: #637
Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: send any dirty scatterlock state on rejoin
Sage Weil [Wed, 6 Apr 2011 22:09:39 +0000 (15:09 -0700)]
mds: send any dirty scatterlock state on rejoin

Not just inodes for auth dirfrags, but for any inode with dirty scatterlock
state.  Include the root inode.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: reset dirty->flushing on rejoin scatterflush
Sage Weil [Wed, 6 Apr 2011 20:29:38 +0000 (13:29 -0700)]
mds: reset dirty->flushing on rejoin scatterflush

Reset dirty/flushing state during rejoin.

Signed-off-by: Sage Weil <sage@newdream.net>