Thread: remove globals. Thread create must succeed
Remove the references to global variables from Thread.h. They are really
unnecessary. In every case, the printout is followed by an assert which
will deliver the exact same information.
Assert that thread creation succeeds. Nobody was checking the return
value of Thread::create() previously. Added a new function,
Thread::try_create(), which programmers can use if they do want to check
the return value of thread creation and handle failure appropriately.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
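The create()/try_create() split described in the Thread commit above can be sketched roughly as follows. This is a minimal illustration built directly on pthreads, not the actual Thread.h code: the class names SketchThread and Counter, the join() helper, and the member layout are all assumptions made for the example.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical stand-in for the Thread class; names and layout are assumed.
class SketchThread {
public:
  virtual ~SketchThread() {}

  // Returns 0 on success, or the pthread_create() error code on failure,
  // so that callers who care can handle the error themselves.
  int try_create() {
    return pthread_create(&thread_id, nullptr,
                          &SketchThread::entry_wrapper, this);
  }

  // For callers that cannot recover from failure: assert that creation
  // succeeded. (Note that assert compiles away when NDEBUG is defined.)
  void create() {
    int r = try_create();
    assert(r == 0);
    (void)r;  // avoid an unused-variable warning in NDEBUG builds
  }

  // Wait for the thread to finish; returns the pthread_join() result.
  int join() { return pthread_join(thread_id, nullptr); }

protected:
  // Subclasses put the thread body here.
  virtual void *entry() = 0;

private:
  // Trampoline from the C callback into the virtual entry() method.
  static void *entry_wrapper(void *arg) {
    return static_cast<SketchThread *>(arg)->entry();
  }

  pthread_t thread_id{};
};

// Hypothetical example subclass that just records a value.
class Counter : public SketchThread {
public:
  int value = 0;

protected:
  void *entry() override {
    value = 42;
    return nullptr;
  }
};
```

A caller that wants to handle failure would test the result of try_create() directly, while a caller that treats failure as fatal would simply call create().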
Sometimes we create a Monitor without a Messenger. So we can't pull the
CephContext out of the Messenger, because it may be NULL. Just specify
it explicitly in the Monitor constructor.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
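The idea in the Monitor commit above, passing the context explicitly instead of reaching through a possibly-NULL Messenger, might look roughly like this. The types SketchContext, SketchMessenger and SketchMonitor are hypothetical stand-ins for illustration only and do not reflect the real constructor signature.

```cpp
#include <cassert>

// Hypothetical stand-in for CephContext.
struct SketchContext {
  int id;
};

// Hypothetical stand-in for Messenger, which also knows its context.
struct SketchMessenger {
  SketchContext *cct;
};

// Hypothetical stand-in for Monitor.
class SketchMonitor {
public:
  // The context is passed explicitly, so the constructor never has to
  // dereference the messenger, which is allowed to be null.
  SketchMonitor(SketchContext *cct_, SketchMessenger *messenger_)
    : cct(cct_), messenger(messenger_) {}

  SketchContext *cct;
  SketchMessenger *messenger;  // may be null
};
```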
Greg Farnum [Mon, 6 Jun 2011 20:43:37 +0000 (13:43 -0700)]
mds: xlock_finish should only do_issue in certain cases.
We accidentally (we think) initialized this variable to true when
we want it to be false: we should only do_issue if there aren't
any remaining locks, not in all cases.
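As a rough sketch of the intended logic in the xlock_finish commit above: the flag starts out false and is only set when nothing else is still locked. The function name and the remaining_locks parameter are hypothetical; the real code decides this from MDS lock state, which is not shown here.

```cpp
#include <cassert>

// Hypothetical helper: decide whether to issue caps after an xlock is
// released. remaining_locks is an assumed count of locks still held.
bool should_issue_after_xlock_finish(int remaining_locks) {
  bool do_issue = false;  // default to false, rather than true
  if (remaining_locks == 0) {
    do_issue = true;      // only issue once no locks remain
  }
  return do_issue;
}
```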
De-globalize CephToolContext. It's important to do this now because the
constructor for CephToolContext references the configuration (via
CephContext.)
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Greg Farnum [Wed, 8 Jun 2011 21:13:28 +0000 (14:13 -0700)]
mds: rename: remove illicit assert.
We actually do want witnesses who aren't auth for anything
to do journaling in some cases, so kill the assert.
That also negates the need for the not_journaling check.
Sage Weil [Wed, 8 Jun 2011 20:29:21 +0000 (13:29 -0700)]
mds: try_trim_non_auth_subtree if we rename a dir away from a non-auth subtree
It's possible we have non-auth metadata only because we have a subtree
nested beneath. If we rename a directory out of a non-auth subtree, we
should try to trim any non-auth content from that subtree that may now
be possible due to the child subtrees being linked elsewhere.
Fixes: #1146
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 8 Jun 2011 20:18:07 +0000 (13:18 -0700)]
mds: remove unlinked metadata from cache on replay
If we replay a metablob that unlinks something, throw it out immediately.
Recursively. This comes up when:
- We rename a file from one mds to another, and we replay the event on
the source mds. The inode gets thrown out.
- We rename a directory from one mds to another, and when journaled, the
source mds had no nested metadata. Same thing: we throw it out. We
may have something in our cache nested beneath that, though, that was
since committed and such, but the fact that we didn't journal it being
reattached elsewhere implies that it was clean and gone when our event
was journaled, and we can throw it all out, recursively.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
libcommon uses symbols from the crypto libraries, so they must appear on
the link line whenever libcommon appears. Later, we may want to revisit
this dependency; however, right now, having unit tests that build
consistently is pretty important.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Sage Weil [Wed, 8 Jun 2011 03:48:52 +0000 (20:48 -0700)]
mds: open renamed import child frags during journal replay
Open up any child frags of the imported renamed inode that are noted in
the journal event. (Note we blindly open up that list here; it's up to the
journaler to only populate it when appropriate.) If the listed frags are
not already open, open them up and set the dir_auth to unknown; presumably
they belong to the rename source/exporter. If we already had them open,
then the adjust_subtree_after_rename call above will have caught them and
already done the necessary subtree adjustment.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 8 Jun 2011 03:46:42 +0000 (20:46 -0700)]
mds: journal open srci frags on srci import (master)
If we are importing the renamed inode, and it is a directory, journal a
list of all open dirfrags (currently, this is actually all frags) so that
we can open them up during journal replay.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 8 Jun 2011 03:43:29 +0000 (20:43 -0700)]
mds: journal renames on witnesses if we have nested subtrees
If a rename witness has any subtrees that are nested beneath the renamed
directory, we need to journal the rename event so that our cache is
properly updated on journal replay.
Further, if we are exporting srci, we also need to journal the dest
(even if we aren't auth for destdn) if we have any open dirfrags because
those will turn into nested subtrees shortly.
We still need to ensure that the cache is properly trimmed during replay.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 7 Jun 2011 16:41:56 +0000 (09:41 -0700)]
mds: fix/clean up xlock import/export
- create xlock import/export helpers
- fix/simplify checks: we want to export/import only xlocks on the inode
that is being migrated, unless they are locallock.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
MonClient should contain a KeyRing and a RotatingKeyRing. All the
MonClient users, except possibly csyn, don't want to manage those
objects themselves.
Don't chdir until after we have opened the KeyRing. If the KeyRing is at
a relative path, a chdir may make it inaccessible. Separate the chdir
function from the daemonize function.
Refactor the cmds argument parsing a little bit. Separate the special
actions from the normal operations of the daemon.
This should allow librados and libceph to support CephX finally! yay!
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
This commit is just erroneous. It adds checks on a pipe write
for the result and an abort if the write failed. But that's broken
in the desired case where we succeed, block on ceph_fuse_ll_main(),
and the parent process is long-gone by the time we get to this code!
Greg Farnum [Fri, 3 Jun 2011 18:53:10 +0000 (11:53 -0700)]
rados_bencher: re-add written objects constraint to read benchmark.
Somehow, in the last major change, the constraints that kept the
bencher from trying to read non-existent objects got removed. Put
a check back in the main bench loop to fix that.
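The constraint restored by the rados_bencher commit above can be illustrated with a small sketch in which the read loop is bounded by the number of objects that were actually written. The function plan_reads and its parameters are hypothetical and are not taken from the bencher code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical planner: returns the object indices a read benchmark may
// request. It never goes past objects_written, so the benchmark cannot
// try to read an object that does not exist.
std::vector<int> plan_reads(int objects_written, int reads_requested) {
  std::vector<int> plan;
  for (int i = 0; i < reads_requested && i < objects_written; ++i) {
    plan.push_back(i);
  }
  return plan;
}
```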
Greg Farnum [Fri, 3 Jun 2011 16:53:20 +0000 (09:53 -0700)]
mds: Clean up _rename_prepare journaling
This has been broken for a while in terms of journaling
things the MDS isn't auth for. This patch should fix that, and
adds a few asserts to that effect.
Also adds a new not_journaling flag to _rename_prepare
for those cases which call the function and then discard
the bufferlist results.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Thu, 2 Jun 2011 21:27:23 +0000 (14:27 -0700)]
uclient: reset flushing_caps on (mds) cap import.
Previously, we could get stuck thinking that we'd flushed caps
(that went to the original MDS, waited on freeze for export,
and then were dropped) without ever telling the auth MDS that we
wanted to do so. This caused hung shutdowns:
1) during shutdown we drop all our caps
2) we get stuck and notice that we have a flushing cap
3) we send cap flush
4) MDS ignores it (I think because actual data already got updated?
and now we don't have the proper caps either)
Greg Farnum [Thu, 2 Jun 2011 18:43:05 +0000 (11:43 -0700)]
uclient: don't use racy check for uncommitted data.
Previously we checked whether there were CEPH_CAP_FILE_BUFFER refs,
but that was racy if we had other threads (they could hold caps for
sync writes or something). Instead, see if we have any in-flight
writes or uncommitted objects.