git.apps.os.sepia.ceph.com Git - ceph.git/log
14 years ago mds: fix typo in EMetaBlob encoder
Sage Weil [Tue, 5 Oct 2010 18:05:55 +0000 (11:05 -0700)]
mds: fix typo in EMetaBlob encoder

This was wrongly setting the dir_layout_exists flag to true.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago osd: less chatty in log about caps
Sage Weil [Tue, 5 Oct 2010 17:43:29 +0000 (10:43 -0700)]
osd: less chatty in log about caps

14 years ago mds: zero inode layout for dirs
Sage Weil [Tue, 5 Oct 2010 17:21:38 +0000 (10:21 -0700)]
mds: zero inode layout for dirs

These aren't used for anything.

Also rename the default_dir_layout to _log_, since that's all that we now
use it for.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago dump backtrace when getting sigsegv and sigabrt
Yehuda Sadeh [Tue, 5 Oct 2010 19:04:07 +0000 (12:04 -0700)]
dump backtrace when getting sigsegv and sigabrt

14 years ago client: Fix truncate_seq/truncate_length initialization.
Greg Farnum [Tue, 5 Oct 2010 16:25:38 +0000 (09:25 -0700)]
client: Fix truncate_seq/truncate_length initialization.
Initializing to 0 was causing file_to_extents to get called on every inode
since the MDS initializes truncate_seq to 1 and truncate_length to -1.
This revealed itself as a crash on directory inodes, which have their
layouts zeroed since merging the file_layouts branch.
To make this clearer, assert that anything being truncated is a file inode.

14 years ago mds: fix LocalLock xlocking by replacing default
Greg Farnum [Tue, 5 Oct 2010 00:00:54 +0000 (17:00 -0700)]
mds: fix LocalLock xlocking by replacing default

14 years ago mds: fix ESession/ESessions event id type again
Sage Weil [Tue, 5 Oct 2010 17:12:59 +0000 (10:12 -0700)]
mds: fix ESession/ESessions event id type again

Not sure how many times we've screwed this one up!

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago filestore: drop unused parse_coll() declaration
Sage Weil [Mon, 4 Oct 2010 15:59:22 +0000 (08:59 -0700)]
filestore: drop unused parse_coll() declaration

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago Merge branch 'testing' into unstable
Sage Weil [Mon, 4 Oct 2010 18:21:51 +0000 (11:21 -0700)]
Merge branch 'testing' into unstable

Conflicts:
src/mds/Locker.cc

14 years ago Merge branch 'file_layouts' into unstable
Greg Farnum [Mon, 4 Oct 2010 18:08:14 +0000 (11:08 -0700)]
Merge branch 'file_layouts' into unstable

Conflicts:
src/mds/CInode.cc
src/mds/CInode.h
src/mds/MDCache.cc
src/mds/SimpleLock.h

14 years ago add set layout ops to ceph_strings
Greg Farnum [Thu, 30 Sep 2010 18:36:57 +0000 (11:36 -0700)]
add set layout ops to ceph_strings

14 years ago cephfs: Wrote and committed cephfs
Greg Farnum [Wed, 29 Sep 2010 23:06:35 +0000 (16:06 -0700)]
cephfs: Wrote and committed cephfs

14 years ago mds: Conditionally encode default dir layout.
Greg Farnum [Wed, 29 Sep 2010 21:18:23 +0000 (14:18 -0700)]
mds: Conditionally encode default dir layout.

Previously we unconditionally encoded the standard layout, which
on a directory inode is meaningless. So, use that spot to fill
in the default dir layout, if it exists. Otherwise, zero-fill.
This lets us display default directory layouts without changing
the protocol, which is good.

14 years ago client: update test_ioctls to test new stuff
Greg Farnum [Tue, 28 Sep 2010 21:14:41 +0000 (14:14 -0700)]
client: update test_ioctls to test new stuff

14 years ago always throw by value; always catch by const ref
Colin Patrick McCabe [Mon, 4 Oct 2010 17:47:30 +0000 (10:47 -0700)]
always throw by value; always catch by const ref

Always throw exceptions by value rather than as pointers. Always catch
exceptions as const references to avoid unnecessary copying. This fixes a
few minor memory leaks and should simplify handling exceptions in the
future.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
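
A minimal sketch of the throw-by-value, catch-by-const-reference convention this commit describes. The names `decode_error`, `parse_len` and `try_parse` are hypothetical and are not taken from the Ceph tree; the point is only that a value-thrown exception is owned by the runtime, so there is no pointer to delete and nothing to leak.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type, for illustration only.
struct decode_error : public std::runtime_error {
  explicit decode_error(const std::string& what) : std::runtime_error(what) {}
};

// Throw by value: the runtime owns the exception object, so nothing leaks.
inline int parse_len(int raw) {
  if (raw < 0)
    throw decode_error("negative length");
  return raw;
}

// Catch by const reference: no copy, no slicing, and no delete to forget.
inline std::string try_parse(int raw) {
  try {
    return std::to_string(parse_len(raw));
  } catch (const std::exception& e) {
    return std::string("error: ") + e.what();
  }
}
```

Catching the base class by reference still reaches the derived `what()` message, which would not be guaranteed if the handler caught by value and sliced the object.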
14 years ago client: import ioctl header from ceph-client
Greg Farnum [Tue, 28 Sep 2010 21:13:06 +0000 (14:13 -0700)]
client: import ioctl header from ceph-client

14 years ago mds: fix setlayout truncation check.
Greg Farnum [Fri, 24 Sep 2010 23:11:50 +0000 (16:11 -0700)]
mds: fix setlayout truncation check.

The trunc_seq is initialized to 1 in prepare_new_inode.

14 years ago mds: misc fixes for dir default layout projection
Greg Farnum [Fri, 24 Sep 2010 23:10:58 +0000 (16:10 -0700)]
mds: misc fixes for dir default layout projection

14 years ago mds: If a projected inode has a dir_layout, we now encode it to disk.
Greg Farnum [Fri, 24 Sep 2010 18:40:57 +0000 (11:40 -0700)]
mds: If a projected inode has a dir_layout, we now encode it to disk.

14 years ago mds: Implement op CEPH_MDS_OP_SETDIRLAYOUT.
Greg Farnum [Fri, 24 Sep 2010 00:13:05 +0000 (17:13 -0700)]
mds: Implement op CEPH_MDS_OP_SETDIRLAYOUT.

Implement handler functions, add to inode projection machinery, etc.

14 years ago mds: zero out the layout in handle_client_setlayout
Greg Farnum [Thu, 23 Sep 2010 23:36:37 +0000 (16:36 -0700)]
mds: zero out the layout in handle_client_setlayout

Could have led to an invalid layout by mistake.

14 years ago mds: Look for and make use of directory tree default layouts, if existent.
Greg Farnum [Thu, 23 Sep 2010 20:33:16 +0000 (13:33 -0700)]
mds: Look for and make use of directory tree default layouts, if existent.

14 years ago filestore: make list_collections() list all dirs
Sage Weil [Mon, 4 Oct 2010 15:50:31 +0000 (08:50 -0700)]
filestore: make list_collections() list all dirs

coll_t is now unstructured; list all dirs besides '.' and '..'.

The old coll_t::parse() was broken.  Remove it.  Fixes
a4138c905053cf79a03b50fa766c08ad718b8c58.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago osd: make load_pgs verbose
Sage Weil [Mon, 4 Oct 2010 15:44:38 +0000 (08:44 -0700)]
osd: make load_pgs verbose

Show what it's skipping and why.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: fix setlayout truncation check.
Greg Farnum [Fri, 24 Sep 2010 23:11:50 +0000 (16:11 -0700)]
mds: fix setlayout truncation check.

The trunc_seq is initialized to 1 in prepare_new_inode.

14 years ago mds: zero out the layout in handle_client_setlayout
Greg Farnum [Thu, 23 Sep 2010 23:36:37 +0000 (16:36 -0700)]
mds: zero out the layout in handle_client_setlayout

Could have led to an invalid layout by mistake.

14 years ago mds: remove unused CompatSet mds_features.
Greg Farnum [Wed, 22 Sep 2010 23:36:05 +0000 (16:36 -0700)]
mds: remove unused CompatSet mds_features.

All the MDS features are stored in the MDSMap::mdsmap_compat

14 years ago mon: add 'mds fail N' command
Sage Weil [Fri, 1 Oct 2010 22:54:56 +0000 (15:54 -0700)]
mon: add 'mds fail N' command

Manually mark an mds rank as failed.  The daemon should kill itself when
it finds out.

Note that this doesn't do any sanity checks, so it can also be used to
adjust state in an otherwise inconsistent mdsmap due to other bugs (one
where, say, an mds is up but has no info, or not up but not in the failed
set.)

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago buffer::list::copy: complain about invalid strings
Colin Patrick McCabe [Fri, 1 Oct 2010 18:09:30 +0000 (11:09 -0700)]
buffer::list::copy: complain about invalid strings

Raise an exception when someone feeds us a "string" that has embedded
NULL characters.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
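
The check described above can be sketched roughly as follows. `copy_checked` is a hypothetical free function, not the real `buffer::list::copy` signature, and the exception type is an assumption; the log does not say which exception Ceph raises.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: copy len bytes into a std::string, rejecting any
// input that contains a NUL byte inside the requested range.
inline std::string copy_checked(const char* data, std::size_t len) {
  // memchr returns non-null if a NUL byte occurs anywhere in [data, data+len).
  if (len > 0 && std::memchr(data, '\0', len) != nullptr)
    throw std::invalid_argument("string has embedded NUL");
  return std::string(data, len);
}
```

Rejecting the input up front avoids silently producing a `std::string` whose `c_str()` view is shorter than its `size()`.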
14 years ago mds: fix and use add_replica_stray() helper for handle_dentry_unlink
Sage Weil [Fri, 1 Oct 2010 19:48:59 +0000 (12:48 -0700)]
mds: fix and use add_replica_stray() helper for handle_dentry_unlink

Eliminate duplicate code by using (and fixing) the helper.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: fix stray replica push on _rename_prepare_witness()
Sage Weil [Fri, 1 Oct 2010 19:43:20 +0000 (12:43 -0700)]
mds: fix stray replica push on _rename_prepare_witness()

We need to push all parents of the straydn to the target.  This changed
a while back with the mdsdir stuff but this bit of code wasn't updated.
Updated to mirror send_dentry_unlink().

This fixes a crash like:
mds/MDCache.cc: In function 'void MDCache::adjust_subtree_auth(CDir*, std::pair<int, int>, bool)':
mds/MDCache.cc:644: FAILED assert(root)
 ceph version 0.22~rc (0e67718a365b42969e785f544ea3b4258bb2407f)
 1: (MDCache::add_replica_dir(ceph::buffer::list::iterator&, CInode*, int, std::list<Context*, std::allocator<Context*> >&)+0x1c1) [0x536a91]
 2: (MDCache::add_replica_stray(ceph::buffer::list&, int)+0xdb) [0x536fab]
 3: (Server::handle_slave_rename_prep(MDRequest*)+0x1113) [0x4d5c33]
 4: (Server::dispatch_slave_request(MDRequest*)+0x21b) [0x4de80b]
 5: (Server::handle_slave_request(MMDSSlaveRequest*)+0x145) [0x4e1955]
 6: (MDS::_dispatch(Message*)+0x2598) [0x49e038]
...

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago osd: revamp forgetting lost objects
Sage Weil [Fri, 1 Oct 2010 19:32:59 +0000 (12:32 -0700)]
osd: revamp forgetting lost objects

The old forget lost objects rewrote history in the PG log, which is asking
for all kinds of trouble.  Instead, add new log events to indicate that
an object is LOST (deleted) or LOST_REVERTed (reverted to an older
version).

The LOST_REVERT case means we may need to recover the old version from
another node and rewrite the version number.  This isn't implemented yet;
for now we just assert.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago osd: move PG::Info::coll to PG::coll
Colin Patrick McCabe [Fri, 1 Oct 2010 18:56:42 +0000 (11:56 -0700)]
osd: move PG::Info::coll to PG::coll

It's best not to have data members in PG::Info that are not serialized
and sent over the wire. Cache coll directly inside PG instead.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years ago osd: cache coll_t in PG
Colin P. McCabe [Fri, 1 Oct 2010 01:13:44 +0000 (18:13 -0700)]
osd: cache coll_t in PG

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years ago osd: fix recovery_primary loop on local clone
Sage Weil [Fri, 1 Oct 2010 05:00:06 +0000 (22:00 -0700)]
osd: fix recovery_primary loop on local clone

When we take the clone branch, we update the missing map.  This invalidates
our current iterator, which can cause badness.  Instead, increment the
iterator near the top of the loop so we don't have to worry about it.

Signed-off-by: Sage Weil <sage@newdream.net>
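
A rough illustration of the iterator pattern this commit describes, using a plain `std::map` as a hypothetical stand-in for the missing map; the function `drain` and its types are assumptions, not code from the OSD. The iterator is advanced near the top of the loop, before the map is modified.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the missing map: object name -> version needed.
inline std::vector<std::string> drain(std::map<std::string, int>& missing) {
  std::vector<std::string> recovered;
  auto p = missing.begin();
  while (p != missing.end()) {
    // Copy what we need and step past the entry first, so that erasing it
    // below cannot invalidate the iterator we continue with.
    const std::string name = p->first;
    ++p;
    missing.erase(name);  // only invalidates iterators to the erased entry
    recovered.push_back(name);
  }
  return recovered;
}
```

For `std::map`, erasing an element leaves iterators to all other elements valid, which is what makes incrementing first sufficient here.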
14 years ago gitignore: Ignore cscope and vim temporary files
Colin P. McCabe [Fri, 1 Oct 2010 01:13:44 +0000 (18:13 -0700)]
gitignore: Ignore cscope and vim temporary files

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years ago osd: generalize coll_t to a string
Colin Patrick McCabe [Thu, 30 Sep 2010 23:59:53 +0000 (16:59 -0700)]
osd: generalize coll_t to a string

coll_t is now a string. META_COLL and TEMP_COLL are just constants now.

Now there is a constructor that takes pgid_t and snapid_t, rather than
factory methods. It's clear what that constructor does, so wrapping it
in factory methods should be unnecessary.

Bump coll_t serialization version to 3. Implement decoding for the old
versions.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years ago mds: drop bad assert
Sage Weil [Thu, 30 Sep 2010 17:54:23 +0000 (10:54 -0700)]
mds: drop bad assert

Introduced by f1921c3a952726e025773979a7597de793897058.  Should probably
audit this code.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago Makefile: add missing include
Sage Weil [Wed, 29 Sep 2010 19:02:30 +0000 (12:02 -0700)]
Makefile: add missing include

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago interval_set: hide data members
Colin McCabe [Wed, 29 Sep 2010 02:00:28 +0000 (19:00 -0700)]
interval_set: hide data members

This change makes interval_set::m and interval_set::_size private data
members in interval_set, instead of public. This change also creates a
non-const iterator. Using this iterator, users can modify the length of
an interval. So now, all users can use the iterators rather than
interacting with the class internals directly.

14 years ago mon: Fix issue first addressed in 2c5a3d99aa3be5ce114072e84f73a0a6426e63fd.
Greg Farnum [Wed, 29 Sep 2010 17:39:49 +0000 (10:39 -0700)]
mon: Fix issue first addressed in 2c5a3d99aa3be5ce114072e84f73a0a6426e63fd.
We were properly falling out of the while loop when we reached end(), but
not checking for it in the following if-else. Now we do!
Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
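
The class of bug described here (leaving a loop at `end()` and then using the iterator anyway) can be sketched as below. `first_at_or_after` is a hypothetical function over a plain `std::map`, not the monitor code itself.

```cpp
#include <map>

// Hypothetical: return the value of the first key >= want, or -1 if none.
inline int first_at_or_after(const std::map<int, int>& m, int want) {
  auto p = m.begin();
  while (p != m.end() && p->first < want)
    ++p;
  // The loop may have exited because p == end(); dereferencing p in that
  // case is undefined behaviour, so test for it before the if-else below.
  if (p == m.end())
    return -1;
  return p->second;
}
```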
14 years ago osd: do not request backlog from peers with empty pg
Sage Weil [Wed, 29 Sep 2010 15:25:27 +0000 (08:25 -0700)]
osd: do not request backlog from peers with empty pg

This avoids stalling out peering, because the peer just responds with
another 'empty' PG::Info in response (which we already have).

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago osd: try to pull object from other replica(s) on EOF
Sage Weil [Wed, 29 Sep 2010 15:24:34 +0000 (08:24 -0700)]
osd: try to pull object from other replica(s) on EOF

If during recovery we are unable to pull from a replica due to reaching
EOF (e.g., zeroed out object), pull from the next available replica (if
any).

Eventually this should be extended to do the same when a checksum fails.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago Add the setup-chroot.sh script
Colin Patrick McCabe [Wed, 29 Sep 2010 02:22:50 +0000 (19:22 -0700)]
Add the setup-chroot.sh script

The setup-chroot.sh script is very handy for building the server in a
chroot environment. I thought I would share it here in case anyone else
finds it useful.

14 years ago osd: clarify comment in recovery code
Sage Weil [Mon, 27 Sep 2010 22:41:32 +0000 (15:41 -0700)]
osd: clarify comment in recovery code

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago msgr: Don't take over old pipes if they're lossy.
Greg Farnum [Tue, 28 Sep 2010 18:45:43 +0000 (11:45 -0700)]
msgr: Don't take over old pipes if they're lossy.
Fixes bug #443.

14 years ago Implement interval_set::const_iterator
Colin Patrick McCabe [Wed, 15 Sep 2010 22:23:15 +0000 (15:23 -0700)]
Implement interval_set::const_iterator

14 years ago Rename interval_set::begin and end
Colin Patrick McCabe [Mon, 27 Sep 2010 19:06:06 +0000 (12:06 -0700)]
Rename interval_set::begin and end

Rename interval_set::begin and end to interval_set::range_begin and
interval_set::range_end, respectively.

14 years ago rgw: send 100-continue response only if requested
Yehuda Sadeh [Mon, 27 Sep 2010 16:37:07 +0000 (09:37 -0700)]
rgw: send 100-continue response only if requested

14 years ago mds: set PREXLOCK next state to LOCK
Sage Weil [Mon, 27 Sep 2010 15:33:27 +0000 (08:33 -0700)]
mds: set PREXLOCK next state to LOCK

This really shouldn't happen (!), but if it does, at least avoid getting
the primary state out of sync with the replicas.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: don't block request on freezing if we're already auth_pinned.
Sage Weil [Mon, 27 Sep 2010 15:31:34 +0000 (08:31 -0700)]
mds: don't block request on freezing if we're already auth_pinned.

If we already auth_pinned, we're past the gates; don't stop on freezable.

This screws up xlock: the lock moves to PREXLOCK state, but the request
that would normally xlock it gets deferred because of a racing freezing
of the tree.  Then the PREXLOCK gather kicks in and badness happens.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago rgw: fix meta attr setting when doing copy operation
Yehuda Sadeh [Sun, 26 Sep 2010 01:12:59 +0000 (18:12 -0700)]
rgw: fix meta attr setting when doing copy operation

14 years ago mds: block request if freezing
Sage Weil [Sat, 25 Sep 2010 21:44:50 +0000 (14:44 -0700)]
mds: block request if freezing

This prevents a deadlock where:

 - client request releases caps
 - caps release deferred (freezing)
 - request proceeds (freezing)
 - can't revoke caps because they're released (but deferred)
 deadlock!

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago osd: add coll_t::is_pg() method
Sage Weil [Sat, 25 Sep 2010 03:10:08 +0000 (20:10 -0700)]
osd: add coll_t::is_pg() method

This makes the interface a bit more adaptable for a situation where it has
a simple string representation instead of the strict structure it has now.
Eventually this function can simply attempt a pg_t parse.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: fix ESessions event type
Sage Weil [Fri, 24 Sep 2010 23:20:30 +0000 (16:20 -0700)]
mds: fix ESessions event type

Using the singular event type meant trying to decode as an ESession (and
failing!).

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: fix xlock state asserts for LocalLock
Sage Weil [Fri, 24 Sep 2010 22:54:57 +0000 (15:54 -0700)]
mds: fix xlock state asserts for LocalLock

The LocalLock (versionlocks) allow xlocking but have only a single state
(LOCK_LOCK).

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: fix locallock rule (missing column)
Sage Weil [Fri, 24 Sep 2010 22:22:18 +0000 (15:22 -0700)]
mds: fix locallock rule (missing column)

The fwr column was missing, leading to a 0 for xlock, which broke slave
xlocks.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: add rename failure hooks
Sage Weil [Fri, 24 Sep 2010 22:15:12 +0000 (15:15 -0700)]
mds: add rename failure hooks

14 years ago osd: fix pull completion tests, again
Sage Weil [Fri, 24 Sep 2010 21:50:05 +0000 (14:50 -0700)]
osd: fix pull completion tests, again

op->complete==false is inconclusive.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago ceph: make version in backtrace look nice
Sage Weil [Fri, 24 Sep 2010 21:01:52 +0000 (14:01 -0700)]
ceph: make version in backtrace look nice

match debug log
include .h, not .c

14 years ago osd: clean out redundant (and wrong) complete calculation
Sage Weil [Fri, 24 Sep 2010 20:57:00 +0000 (13:57 -0700)]
osd: clean out redundant (and wrong) complete calculation

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: Create struct default_file_layout and encoder/decoder functions.
Greg Farnum [Thu, 23 Sep 2010 20:32:45 +0000 (13:32 -0700)]
mds: Create struct default_file_layout and encoder/decoder functions.
Also enable the state transfer when lock state changes.

Still to do: make anything actually create these.

14 years ago osd: make sparse data/clone push behave with partial object push
Sage Weil [Fri, 24 Sep 2010 18:43:37 +0000 (11:43 -0700)]
osd: make sparse data/clone push behave with partial object push

We can't error out if we don't get everything we want in one go now that
we support pushing objects in pieces.  Remove this check entirely, since
we don't have a good error handling case anyway.

14 years ago mds: defer MExportDirDiscover until we have root inode open
Sage Weil [Fri, 24 Sep 2010 18:10:52 +0000 (11:10 -0700)]
mds: defer MExportDirDiscover until we have root inode open

Otherwise we can't traverse or do anything useful.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: alloc auth xlock on versionlock/LocalLock
Sage Weil [Fri, 24 Sep 2010 17:19:25 +0000 (10:19 -0700)]
mds: alloc auth xlock on versionlock/LocalLock

This is done when we do a slave xlock in order to avoid pipelining updates
to the inode, making rollback of complex operations like rename/link
safe.

14 years ago mds: defer cap release and update consistently when frozen
Sage Weil [Fri, 24 Sep 2010 16:40:40 +0000 (09:40 -0700)]
mds: defer cap release and update consistently when frozen

We need to preserve the order of processing of cap release and writeback
messages across handle_client_caps() and process_request_cap_release().
Use a helper with the appropriate condition, and defer the release
processing as needed.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: refactor process_cap_update a bit
Sage Weil [Fri, 24 Sep 2010 15:56:20 +0000 (08:56 -0700)]
mds: refactor process_cap_update a bit

Fewer args

14 years ago mds: drop old/incorrect comment
Sage Weil [Fri, 24 Sep 2010 15:42:44 +0000 (08:42 -0700)]
mds: drop old/incorrect comment

14 years ago mds: always mark parent scatterlock when marking dirty rstat
Sage Weil [Fri, 24 Sep 2010 15:15:54 +0000 (08:15 -0700)]
mds: always mark parent scatterlock when marking dirty rstat

Note that this will let the parent nestlock 'dirty' state get out of
sync with the lock state, as the whole point of the dirty rstat lists is
that it can happen any time.  It does, however, queue us up.

14 years ago mds: mark dirty rstat inodes during recovery
Sage Weil [Fri, 24 Sep 2010 14:52:37 +0000 (07:52 -0700)]
mds: mark dirty rstat inodes during recovery

14 years ago mds: error to log when inode/dirfrag rbytes get out of sync
Sage Weil [Thu, 23 Sep 2010 23:20:00 +0000 (16:20 -0700)]
mds: error to log when inode/dirfrag rbytes get out of sync

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: stubs for printing projected fragstat/rstat
Sage Weil [Thu, 23 Sep 2010 23:17:13 +0000 (16:17 -0700)]
mds: stubs for printing projected fragstat/rstat

Disabled for now, since it is so freaking verbose.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: assimilate dirty rstat inodes during scatter_writeback
Sage Weil [Thu, 23 Sep 2010 23:16:20 +0000 (16:16 -0700)]
mds: assimilate dirty rstat inodes during scatter_writeback

We put some of the predirty_journal_parents() code that calls the
project_rstat_inode_to_frag() into a common helper and use that.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: maintain dirty_rstat list
Sage Weil [Thu, 23 Sep 2010 20:05:06 +0000 (13:05 -0700)]
mds: maintain dirty_rstat list

Add on fetch or import of dirty_rstat; clear on export of dirty_rstat.

14 years ago mds: add dirty_rstat CInode elist, state, pins
Sage Weil [Thu, 23 Sep 2010 20:04:00 +0000 (13:04 -0700)]
mds: add dirty_rstat CInode elist, state, pins

We need to track inodes with unpropagated rstat data on a per-dirfrag
basis so that we can propagate it when the nestlock becomes writeable.

14 years ago osd: remove assertion
Yehuda Sadeh [Fri, 24 Sep 2010 17:46:38 +0000 (10:46 -0700)]
osd: remove assertion

14 years ago qa: improved rgw tests
Yehuda Sadeh [Fri, 24 Sep 2010 17:13:14 +0000 (10:13 -0700)]
qa: improved rgw tests

14 years ago makefile: drop quotes on tcmalloc CXXFLAGS
Sage Weil [Fri, 24 Sep 2010 04:20:31 +0000 (21:20 -0700)]
makefile: drop quotes on tcmalloc CXXFLAGS

14 years ago mds: scatter pin frozen tree on importer too
Sage Weil [Thu, 23 Sep 2010 23:12:21 +0000 (16:12 -0700)]
mds: scatter pin frozen tree on importer too

The importer also needs to scatter pin.  This avoids scatterlock gather
races like so:

A: start exporting to B
A: freeze, scatter pin tree
C: initiate gather
A: delay reply to gather
B: reply to gather, do not include (non-auth) dirfrag
A,B: finish migration
A: reply to gather, do not include (now non-auth) dirfrag
C: gets no info about the dirfrag!

By pinning on the importer, we ensure that at least one MDS will respond
to the gather with auth dirfrag info.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: drop dead Renamer code
Sage Weil [Thu, 23 Sep 2010 19:26:42 +0000 (12:26 -0700)]
mds: drop dead Renamer code

14 years ago mds: clarify inode dirstat/rstat locking
Sage Weil [Thu, 23 Sep 2010 19:20:36 +0000 (12:20 -0700)]
mds: clarify inode dirstat/rstat locking

The accounted_rstat must always remain consistent with the parent dirfrag,
which in turn means it is governed by the parent's nestlock.

The rstat is protected by _this_ inode's nestlock, and is updated by
scatter_writebehind() or predirty_journal_parents().

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: fix bounding frag rstat/fragstat update during import
Sage Weil [Thu, 23 Sep 2010 17:00:07 +0000 (10:00 -0700)]
mds: fix bounding frag rstat/fragstat update during import

Be careful about when we update bounding dirfrag info during an import.  If
the lock is in a MIX state, we do NOT want to update, since the inode
auth doesn't know jack (unless they are also dirfrag auth, in which case
we'll find out when we unscatter anyway).

Fixes fix 9d81f9d6.

14 years ago mds: do not scatter_writebehind on nudge if replicated
Sage Weil [Thu, 23 Sep 2010 04:10:18 +0000 (21:10 -0700)]
mds: do not scatter_writebehind on nudge if replicated

This can cause the inode rstat etc to become out of sync with dirfrag
accounted_rstat when the scatterlock is not in a gathered state: the
local values will get updated but those on other nodes will not, and the
inode will drift out of sync with the dirfrags.

Other callers to scatter_writebehind() are all in contexts where we have
_just_ gathered dirfrag state, or there is no remote dirfrag state to
gather.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: use scatter pins for migration instead of rd/wrlocks
Sage Weil [Wed, 22 Sep 2010 22:42:52 +0000 (15:42 -0700)]
mds: use scatter pins for migration instead of rd/wrlocks

This is simpler (for the migrator), and wrlocks allow scatter_writebehind,
which is a no-no for a frozen tree.  By pinning the frozen dir's parent
inode, we prevent any scatter or unscatter operations from implicitly
updating metadata within the frozen root dirfrag.

14 years ago mds: add scatterpins
Sage Weil [Wed, 22 Sep 2010 22:41:24 +0000 (15:41 -0700)]
mds: add scatterpins

14 years ago backtrace: include ceph version
Greg Farnum [Thu, 23 Sep 2010 16:39:59 +0000 (09:39 -0700)]
backtrace: include ceph version

14 years ago mds: always pass pick_inode_snap the head
Sage Weil [Fri, 17 Sep 2010 16:10:46 +0000 (09:10 -0700)]
mds: always pass pick_inode_snap the head

This fixes a possible infinite loop in handle_client_caps().  We need to
_always_ pass the head inode in.

14 years ago qa: add simple rgw test
Yehuda Sadeh [Thu, 23 Sep 2010 05:32:40 +0000 (22:32 -0700)]
qa: add simple rgw test

14 years ago mds: remove unused CompatSet mds_features.
Greg Farnum [Wed, 22 Sep 2010 23:36:05 +0000 (16:36 -0700)]
mds: remove unused CompatSet mds_features.

All the MDS features are stored in the MDSMap::mdsmap_compat

14 years ago mds: add policylock to the inodes.
Greg Farnum [Wed, 22 Sep 2010 21:48:39 +0000 (14:48 -0700)]
mds: add policylock to the inodes.
This will be used to cover per-directory default file distribution
policies, and maybe other things that come up.

14 years ago mds: fix eval_gather() for non-auth inodes
Sage Weil [Wed, 22 Sep 2010 21:02:08 +0000 (14:02 -0700)]
mds: fix eval_gather() for non-auth inodes

For non-auth nodes, we want a can_* policy that's < AUTH, not <= AUTH.
Adjust macro accordingly.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago Merge branch 'testing' into unstable
Sage Weil [Wed, 22 Sep 2010 20:45:13 +0000 (13:45 -0700)]
Merge branch 'testing' into unstable

14 years ago mon: return errors (not 0) from MonitorStore::get_bl_ss()
Sage Weil [Wed, 22 Sep 2010 20:32:11 +0000 (13:32 -0700)]
mon: return errors (not 0) from MonitorStore::get_bl_ss()

Checked callers, should be fine.

14 years ago mon: move election start reset to starting_election() helper
Sage Weil [Wed, 22 Sep 2010 18:31:12 +0000 (11:31 -0700)]
mon: move election start reset to starting_election() helper

An election can start either because we call it, or because someone else
calls it.  Either way, we need to reset our state, so move that code into
the election_starting() callback, which is called by the elector's
start()/call_election() anyway.

This hopefully fixes a case where we see a timeout expire on the monitor
and fail the assertion

mon/Paxos.cc: In function 'void Paxos::lease_timeout()':
mon/Paxos.cc:684: FAILED assert(mon->is_peon())
 1: (SafeTimer::EventWrapper::finish(int)+0x259) [0x52da29]
 2: (Timer::timer_entry()+0x8e3) [0x52f523]
 3: (Timer::TimerThread::entry()+0xd) [0x46d45d]
 4: (Thread::_entry_func(void*)+0xa) [0x458aca]
 5: (()+0x6a3a) [0x7fe0bd6a4a3a]
 6: (clone()+0x6d) [0x7fe0bc8c277d]

The Paxos::election_starting() hook resets the timer, and will at least
close this possible cause.

Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
14 years ago mds: distribute flocklock properly!
Greg Farnum [Wed, 22 Sep 2010 18:40:23 +0000 (11:40 -0700)]
mds: distribute flocklock properly!

Previously we weren't handling it in a lot of our distributed system
areas, which would have broken stuff if it were being used.

14 years ago mds: distribute flocklock properly!
Greg Farnum [Wed, 22 Sep 2010 18:40:23 +0000 (11:40 -0700)]
mds: distribute flocklock properly!

Previously we weren't handling it in a lot of our distributed system
areas, which would have broken stuff if it were being used.

14 years ago mds: Make SimpleLock wait shift bits unique like they should be.
Greg Farnum [Wed, 22 Sep 2010 18:14:08 +0000 (11:14 -0700)]
mds: Make SimpleLock wait shift bits unique like they should be.

This wasn't actually breaking stuff before, but it did mean
we woke up stuff we didn't need to.
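
A small sketch of why unique wait shifts matter, assuming each lock type's wait bits are placed in one shared mask by shifting. All constants and function names below are hypothetical; the real SimpleLock values are not shown in this log.

```cpp
#include <cstdint>

// Hypothetical constants, for illustration only.
constexpr int WAIT_BITS = 4;                 // wait kinds reserved per lock type
constexpr std::uint64_t WAIT_RD = 1u << 0;   // waiting to read
constexpr std::uint64_t WAIT_WR = 1u << 1;   // waiting to write

// Each lock type gets its own shift, so its wait bits never collide.
constexpr int wait_shift(int lock_index) { return lock_index * WAIT_BITS; }

// Position a wait kind inside the shared mask for the given lock type.
constexpr std::uint64_t wait_mask(int lock_index, std::uint64_t kind) {
  return kind << wait_shift(lock_index);
}
```

If two lock types shared a shift, a wakeup for one would also match waiters on the other, which is the spurious wakeup the commit message mentions.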

14 years ago mds: Make SimpleLock wait shift bits unique like they should be.
Greg Farnum [Wed, 22 Sep 2010 18:14:08 +0000 (11:14 -0700)]
mds: Make SimpleLock wait shift bits unique like they should be.

This wasn't actually breaking stuff before, but it did mean
we woke up stuff we didn't need to.

14 years ago mon: Fix infinite looping, if failed_notes is empty.
Greg Farnum [Wed, 22 Sep 2010 16:49:58 +0000 (09:49 -0700)]
mon: Fix infinite looping, if failed_notes is empty.

Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
14 years ago mon: add debug output
Sage Weil [Wed, 22 Sep 2010 16:25:32 +0000 (09:25 -0700)]
mon: add debug output