Sage Weil [Wed, 20 Apr 2011 19:05:10 +0000 (12:05 -0700)]
testlibrbd: fix signed/unsigned comparisons
testlibrbd.c: In function 'write_test_data':
testlibrbd.c:191: warning: comparison between signed and unsigned integer expressions
testlibrbd.c: In function 'aio_read_test_data':
testlibrbd.c:207: warning: comparison between signed and unsigned integer expressions
testlibrbd.c: In function 'read_test_data':
testlibrbd.c:222: warning: comparison between signed and unsigned integer expressions
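The log does not show the actual change to testlibrbd.c. As a minimal sketch of the usual way to silence this class of warning, assuming a write-style call that returns a signed byte count (negative on error) and an unsigned expected length, the hypothetical helper wrote_all below checks the sign first and only then compares as unsigned:

```cpp
#include <cassert>
#include <cstddef>
#include <sys/types.h>  // ssize_t (POSIX)

// Hypothetical helper, not taken from testlibrbd.c.
// `ret` is signed because such calls commonly return a negative value on
// failure; `expected` is unsigned because it is a buffer length.
bool wrote_all(ssize_t ret, size_t expected) {
  if (ret < 0)
    return false;  // handle the error path before any unsigned comparison
  // Both operands are unsigned here, so the compiler has nothing to warn about.
  return static_cast<size_t>(ret) == expected;
}
```

Checking the sign before the cast matters: converting a negative return value straight to size_t would yield a very large number and hide the error.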
Objects can now register as configuration observers interested in a
subset of the configuration keys. The observers will be told exactly
which keys have changed.
The first user is dout, which no longer needs the infamous SIGHUP
hack to know when to reopen the config file.
librados: Remove rados_reopen_log, which was basically a means for the
library user to trigger the SIGHUP behavior.
Changes are accumulated and applied all at once by apply_changes. This
function is called as part of common_init, and after every call to
injectargs.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
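The observer mechanism described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Ceph's actual interface: ConfigObserver, Config, tracked_keys, handle_change, set and LogObserver are hypothetical names; only apply_changes is named in the commit message. Changes accumulate in a pending set, and each observer is told only about the changed keys it registered for.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical observer interface (illustrative names only).
struct ConfigObserver {
  virtual ~ConfigObserver() = default;
  // The subset of configuration keys this observer cares about.
  virtual std::set<std::string> tracked_keys() const = 0;
  // Called with exactly the tracked keys that changed.
  virtual void handle_change(const std::set<std::string>& changed) = 0;
};

class Config {
 public:
  void add_observer(ConfigObserver* obs) { observers_.push_back(obs); }

  // Record a new value; observers are not notified until apply_changes().
  void set(const std::string& key, const std::string& value) {
    auto it = values_.find(key);
    if (it != values_.end() && it->second == value)
      return;  // unchanged value: nothing to report
    values_[key] = value;
    pending_.insert(key);
  }

  // Deliver all accumulated changes at once, then clear the pending set.
  void apply_changes() {
    for (ConfigObserver* obs : observers_) {
      std::set<std::string> hits;
      for (const std::string& key : obs->tracked_keys())
        if (pending_.count(key))
          hits.insert(key);
      if (!hits.empty())
        obs->handle_change(hits);
    }
    pending_.clear();
  }

 private:
  std::map<std::string, std::string> values_;
  std::set<std::string> pending_;
  std::vector<ConfigObserver*> observers_;
};

// Example observer standing in for a logging subsystem (key names assumed).
struct LogObserver : ConfigObserver {
  int calls = 0;
  std::set<std::string> seen;
  std::set<std::string> tracked_keys() const override {
    return {"log_dir", "log_file"};
  }
  void handle_change(const std::set<std::string>& changed) override {
    ++calls;
    seen = changed;
  }
};
```

With this shape, a logging observer can reopen its output when a log-related key changes, with no signal needed to prompt it.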
Since logging options are per-config, logically DoutStreambuf instances
should also be per-config. This also allows us to eliminate the
"if (uninitialized)" checks at the beginning of every call to dout.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
pg_remove has been included for longer than we've had versions
in the struct, so this check for end is useless -- if pg_remove
wasn't encoded we're already broken by decoding the version at
the beginning.
Tommi Virtanen [Tue, 19 Apr 2011 18:20:24 +0000 (11:20 -0700)]
debian: Handle missing tcmalloc on Debian lenny.
lenny doesn't have a suitable libgoogle-perftools-dev, and
release.sh edits it out of build-deps. Detect that and tell
configure that not having tcmalloc is ok.
Sage Weil [Tue, 19 Apr 2011 18:33:34 +0000 (11:33 -0700)]
mon: remove class distribution infrastructure
This is now the admin's job. Removes a lot of code with limited testing
and coverage.
We rev the internal monitor protocol because the state machine ids changed.
This should not affect the on-disk format. Just stop and restart all the
monitors at once during the upgrade.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 19 Apr 2011 16:25:30 +0000 (09:25 -0700)]
mds: remove MDSlaveUpdate from list on deletion
These are added to the LogSegment list on the slaves, but also need to be
removed from that list when we replay a COMMIT|ROLLBACK or when the op's
fate is determined during the resolve stage.
This fixes a crash like:
./include/elist.h: In function 'elist<T>::item::~item() [with T = MDSlaveUpdate*]', in thread '0x7fb2004d5700'
./include/elist.h: 39: FAILED assert(!is_on_list())
ceph version 0.26 (commit:9981ff90968398da43c63106694d661f5e3d07d5)
1: (MDSlaveUpdate::~MDSlaveUpdate()+0x59) [0x4d9fe9]
2: (ESlaveUpdate::replay(MDS*)+0x422) [0x4d2772]
3: (MDLog::_replay_thread()+0xb90) [0x67f850]
4: (MDLog::ReplayThread::entry()+0xd) [0x4b89ed]
5: (()+0x7971) [0x7fb20564a971]
6: (clone()+0x6d) [0x7fb2042e692d]
Fixes: #1019
Signed-off-by: Sage Weil <sage@newdream.net>
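The failure mode above can be illustrated with a simplified stand-in. This is a sketch, not the real elist or MDSlaveUpdate code: Segment, Update, attach, detach and finish_update are hypothetical names, and std::list replaces the intrusive list. The destructor asserts that the object is no longer linked, so any path that deletes it must unlink it first.

```cpp
#include <cassert>
#include <list>

struct Update;

// Simplified stand-in for a log segment that tracks pending updates.
struct Segment {
  std::list<Update*> updates;
};

struct Update {
  Segment* owner = nullptr;

  void attach(Segment* s) {
    owner = s;
    s->updates.push_back(this);
  }

  void detach() {
    if (owner) {
      owner->updates.remove(this);
      owner = nullptr;
    }
  }

  bool is_on_list() const { return owner != nullptr; }

  // Mirrors the spirit of the assert in the backtrace: destroying an object
  // that is still linked would leave a dangling pointer in the list.
  ~Update() { assert(!is_on_list()); }
};

// Hypothetical cleanup path: once the update's fate is known (commit or
// rollback), unlink it before deleting it.
void finish_update(Update* u) {
  u->detach();
  delete u;
}
```

Deleting the object without the detach call would trip the assert, which is the same shape of failure as the crash reported in this commit.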
Sage Weil [Mon, 18 Apr 2011 22:06:25 +0000 (15:06 -0700)]
journaler: truncate/zero ahead of write position
Remove/zero objects N periods ahead of the journal write position. This
ensures that when we reprobe the journal length, we will always detect the
end position as the correct write_pos, even when there is weird data
"ahead" of us that we may bump up against.
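As a rough sketch of the arithmetic involved, the byte range to clear could be computed as below. Everything here is an assumption for illustration: the commit message does not say where the cleared range begins, so this version starts at the first period boundary at or after the write position, and zero_ahead_range and its parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Returns the half-open byte range [begin, end) to remove or zero.
// Assumption: clearing starts at the first period boundary at or after
// write_pos and covers n whole periods.
std::pair<uint64_t, uint64_t> zero_ahead_range(uint64_t write_pos,
                                               uint64_t period,
                                               uint64_t n) {
  assert(period > 0);
  // Round write_pos up to the next multiple of period.
  uint64_t begin = ((write_pos + period - 1) / period) * period;
  uint64_t end = begin + n * period;
  return {begin, end};
}
```

Keeping this range clear as the write position advances means a later probe for the journal end finds no stale data immediately past the true write position.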
MDS: Make _rename_apply inode import auth_pinning more intelligent.
We don't want auth_pins on the locallocks (they're never auth_pinned)
and we only want new auth_pins that are for locks on the inode that we
imported -- not for each xlock that the mdr has everywhere (like,
say, on the srcdn)!
Greg Farnum [Thu, 31 Mar 2011 21:02:48 +0000 (14:02 -0700)]
mds: If we're a slave, clean up xlocks when we export an inode.
Because we can do an inode import during a rename that skips the usual
channels, we were getting into an odd state with the xlocks (which we
did as a slave for an inode that we exported away). Clean up the
record of these xlocks for inodes before we get into the request
cleanup (at which point we are labeled as no-longer-auth, and the
standard cleanup routines will break).
Greg Farnum [Thu, 31 Mar 2011 00:10:05 +0000 (17:10 -0700)]
mds: properly drop imported xlocks.
Because we can do an inode import during a rename that skips the usual
channels, we were getting into an odd state with the xlocks (which
were formerly remote and are now local). Clean up the record of
those remote xlocks.
rename all the get_uid_by_* to get_user_info_by_*, remove get_user_info()
and call the appropriate function instead (either the by_uid or by_access_key).
mds: don't run all of try_subtree_merge on a rename across MDSes.
Previously we'd try to do the whole thing, which meant that
the replica got a lock twiddle before it had finished the export.
That broke things spectacularly, since we weren't respecting our
invariants about who gets remote locking messages.
Now we pass through a flag and respect our invariants.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
I don't remember why we needed can_xlock_local() to begin with, but
I can tell that adding this get_xlock_by() check won't break anything
that was previously working (really, it's still not a strong enough
check).
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Greg Farnum [Thu, 24 Mar 2011 21:11:06 +0000 (14:11 -0700)]
MDS: Remove inappropriate assert from _logged_slave_rename.
The slave also can hold some auth pins from locks which the
master has asked it to grab. It's possible we can intelligently
determine how many, but for now just drop the assert.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Greg Farnum [Thu, 24 Mar 2011 19:23:38 +0000 (12:23 -0700)]
MDS: Server::handle_slave_rename_prep now accounts for dir snaplock.
Previously it ignored the auth pin required to hold snap xlock, which
is currently always held for a rename on a dir. This would lead to
a permanent hang on the request. Now we account for it!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Greg Farnum [Tue, 22 Mar 2011 21:23:33 +0000 (14:23 -0700)]
Server: ensure slave mdses have full dest tree
We were already taking rdlocks on the source tree, to make
sure that each slave MDS could traverse to the source dentry. Now,
if there are slave MDSes, we take rdlocks on each destination
ancestor to make sure the slaves can also traverse there.
This fixes an fsstress bug.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Fri, 15 Apr 2011 22:51:50 +0000 (15:51 -0700)]
mds: keep import/export subtree_map state in sync with journal
We were being sloppy before with the ESubtreeMap vs import/export events.
Fix that by doing a few things:
- add an ambig flag to the subtree map items, and set it for in-progress
imports. That means an ESubtreeMap followed by EImportFinish will do
the right thing now.
- adjust the dir_auth on EExport journaling (handle_export_dir_ack) so
that our journaled subtree_map state is always in sync with what we
see during replay.
Also document clearly what the dir_auth variations actually mean.