Tommi Virtanen [Fri, 10 Jun 2011 23:14:13 +0000 (16:14 -0700)]
pybind: Open shared libs by their major version.
The *.so files are only in the -dev packages, and normal
operation should not require those. The major version
numbers represent incompatible API/ABI changes anyway.
The debian dependencies were already correctly including
the major version.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
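As a minimal sketch of the idea in this commit, the snippet below loads a shared library through ctypes by its major-versioned name rather than the bare *.so name. The helper names (versioned_soname, load_library) are hypothetical, and the name librados.so.2 is an assumption inferred from the librados2 package mentioned in this log; it is not copied from the actual bindings.

```python
import ctypes


def versioned_soname(name: str, major: int) -> str:
    """Build a Linux-style versioned library name, e.g. librados.so.2."""
    return f"lib{name}.so.{major}"


def load_library(name: str, major: int) -> ctypes.CDLL:
    """Load a shared library by its major-versioned name.

    Raises OSError if the runtime package providing it is not installed.
    The unversioned lib<name>.so symlink (shipped in -dev packages) is
    deliberately not used.
    """
    return ctypes.CDLL(versioned_soname(name, major))


if __name__ == "__main__":
    try:
        # Assumed library name and major version, for illustration only.
        lib = load_library("rados", 2)
        print("loaded", lib)
    except OSError as exc:
        print("library not available:", exc)
```

Pinning the major version also means that an incompatible ABI bump makes the load fail, instead of silently binding to a different interface.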
Tommi Virtanen [Fri, 10 Jun 2011 22:51:33 +0000 (15:51 -0700)]
debian: Properly package the python bindings.
Build-depend on python-support. Add binary package
python-ceph, making it contain all the ceph python
packages, regardless of their name; the modules are
too small to deserve their own debs.
Make python-ceph depend only on librados2 for now.
librgw is not packaged yet.
Drop the unnecessary build-dep on python-dev; it is
only needed for compiling C extensions, and we're using
ctypes.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Thread: remove globals. Thread create must succeed
Remove the references to global variables from Thread.h. They are really
unnecessary. In every case, the printout is followed by an assert which
will deliver the exact same information.
Assert that thread creation succeeds. Nobody was checking the return
value of Thread::create() previously. Added a new function,
Thread::try_create(), which programmers can use if they do want to check
the value of Thread::create() and handle it appropriately.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
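The split between a create() that must succeed and a try_create() that reports failure can be modeled roughly as follows. This is a Python toy model of the pattern, not the C++ Thread class from the commit; the spawn hook and the return code 1 are hypothetical and exist only so that a failure can be simulated.

```python
import threading


class Thread:
    """Toy model: try_create() reports failure, create() asserts success."""

    def __init__(self, target, spawn=None):
        self._target = target
        # `spawn` is a hypothetical hook used to simulate creation failures.
        self._spawn = spawn or self._default_spawn
        self.handle = None

    @staticmethod
    def _default_spawn(target):
        t = threading.Thread(target=target)
        t.start()  # raises RuntimeError if no new thread can be started
        return t

    def try_create(self) -> int:
        """Return 0 on success, non-zero on failure; the caller must check."""
        try:
            self.handle = self._spawn(self._target)
        except RuntimeError:
            return 1
        return 0

    def create(self) -> None:
        """Creation is required to succeed; a failure trips the assertion."""
        r = self.try_create()
        assert r == 0, f"thread creation failed with code {r}"

    def join(self) -> None:
        if self.handle is not None:
            self.handle.join()
```

Callers that can recover from a failed creation use try_create() and inspect the result; everyone else calls create() and gets an immediate, loud failure instead of an unchecked return value.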
Sometimes we create a Monitor without a Messenger. So we can't pull the
CephContext out of the Messenger, because it may be NULL. Just specify
it explicitly in the Monitor constructor.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Sage Weil [Thu, 9 Jun 2011 21:52:55 +0000 (14:52 -0700)]
mds: fix xlock_finish do_issue checks
Should default to false, and only get set to true if there are caps for
this lock. Among other things this means we don't set it for dentry
locks (which have no caps).
Sage Weil [Thu, 9 Jun 2011 21:21:12 +0000 (14:21 -0700)]
mds: fix xlock_finish issue flag check
We were sometimes setting do_issue but not *pneed_issue. Simplify by
setting do_issue internally to the function and then either issuing or
setting *pneed_issue at the end.
Also fix bug with second argument to eval_gather().
Sage Weil [Thu, 9 Jun 2011 18:37:18 +0000 (11:37 -0700)]
mds: set or issue caps on lock state changes
Set pneed_issue (or issue ourselves) whenever we jump directly to the
target lock state. Make sure we only do it if there are caps (cap shift)
for this particular lock.
Sage Weil [Thu, 9 Jun 2011 17:43:13 +0000 (10:43 -0700)]
mds: make issue_caps from file_update_finish smarter
We do one funky thing in file_update_finish that only issues caps on a
single cap when max_size changes. This is what we more commonly see. However,
if a lock changes state and we need to issue on the whole inode (for all
clients), avoid doing the cap-specific issue by checking the issue set.
Sage Weil [Thu, 9 Jun 2011 17:41:53 +0000 (10:41 -0700)]
mds: issue caps from drop_locks
In drop_locks, build a set of inodes we need to issue caps on. Then do it
all at once. This does two things:
- it fixes the fact that currently a dropped lock leading to an eval and
lock state change will not issue caps _at_all_
- it ensures we only issue_caps once for each inode, even when we are
dropping multiple locks on it.
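The two points above can be sketched with a small Python model. All names here (Lock, lock_finish, issue_caps, the issued list) are hypothetical stand-ins rather than the MDS code, and the sketch assumes the deferred "need issue" signal can be modeled as a set of inodes passed to each lock_finish call.

```python
class Lock:
    def __init__(self, inode, has_caps):
        self.inode = inode        # identifier of the inode the lock belongs to
        self.has_caps = has_caps  # whether any caps are tied to this lock


issued = []  # records every issue_caps call, for demonstration only


def issue_caps(inode):
    issued.append(inode)


def lock_finish(lock, need_issue=None):
    """Release a lock; issue caps now, or tell the caller it must do so."""
    do_issue = lock.has_caps  # stays False for locks with no caps
    if not do_issue:
        return
    if need_issue is not None:
        need_issue.add(lock.inode)  # defer: the caller issues later
    else:
        issue_caps(lock.inode)      # no collector passed: issue immediately


def drop_locks(locks):
    """Release all locks, then issue caps exactly once per affected inode."""
    need_issue = set()
    for lock in locks:
        lock_finish(lock, need_issue)
    for inode in sorted(need_issue):
        issue_caps(inode)
```

Collecting into a set is what guarantees a single issue_caps per inode, even when several locks on the same inode are dropped together.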
Sage Weil [Thu, 9 Jun 2011 17:03:08 +0000 (10:03 -0700)]
mds: pass pissue_caps through *lock_finish()
This allows *lock_finish() callers to handle the issue_caps themselves.
None of them do yet (this arg is still optional) so this patch has no
functional change (yet!).
Greg Farnum [Mon, 6 Jun 2011 20:43:37 +0000 (13:43 -0700)]
mds: xlock_finish should only do_issue in certain cases.
We accidentally (we think) initialized this variable to true when
we want it to be false: we should only do_issue if there aren't
any remaining locks, not in all cases.
De-globalize CephToolContext. It's important to do this now because the
constructor for CephToolContext references the configuration (via
CephContext.)
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Greg Farnum [Wed, 8 Jun 2011 21:13:28 +0000 (14:13 -0700)]
mds: rename: remove illicit assert.
We actually do want witnesses who aren't auth for anything
to do journaling in some cases, so kill the assert.
That also negates the need for the not_journaling check.
Sage Weil [Wed, 8 Jun 2011 20:29:21 +0000 (13:29 -0700)]
mds: try_trim_non_auth_subtree if we rename a dir away from a non-auth subtree
It's possible we have non-auth metadata only because we have a subtree
nested beneath. If we rename a directory out of a non-auth subtree, we
should try to trim any non-auth content from that subtree that may now
be possible due to the child subtrees being linked elsewhere.
Fixes: #1146
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 8 Jun 2011 20:18:07 +0000 (13:18 -0700)]
mds: remove unlinked metadata from cache on replay
If we replay a metablob that unlinks something, throw it out immediately.
Recursively. This comes up when:
- we rename a file from one mds to another, and we replay the event on
the source mds. the inode gets thrown out.
- we rename a directory from one mds to another, and when journaled, the
source mds had no nested metadata. same thing: we throw it out. we
may have something in our cache nested beneath that, though, that was
since committed and such, but the fact that we didn't journal it being
reattached elsewhere implies that it was clean and gone when our event
was journaled, and we can throw it all out. recursively.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
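A rough illustration of "throw it out, recursively": the sketch below removes an unlinked node and everything nested beneath it from a toy cache. The Node class, the set-based cache, and the function names are assumptions made for the example and do not reflect the MDS cache structures.

```python
class Node:
    """Toy cache entry with named children (stands in for inodes/dentries)."""

    def __init__(self, name):
        self.name = name
        self.children = {}

    def add(self, child):
        self.children[child.name] = child
        return child


def remove_recursive(cache, node):
    """Drop `node` and everything nested beneath it from the cache."""
    for child in list(node.children.values()):
        remove_recursive(cache, child)
    node.children.clear()
    cache.discard(node.name)


def replay_unlink(cache, parent, name):
    """Replay an unlink of `name` under `parent`, trimming the whole subtree."""
    node = parent.children.pop(name, None)
    if node is not None:
        remove_recursive(cache, node)
```

For example, unlinking a directory that still has a cached file beneath it removes both entries, while siblings of the directory stay in the cache.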
libcommon uses symbols from the crypto libraries, so they must appear on
the link line whenever libcommon appears. Later, we may want to revisit
this dependency; however, right now, having unit tests that build
consistently is pretty important.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Sam Lang [Wed, 8 Jun 2011 17:59:11 +0000 (12:59 -0500)]
Fix segfault caused by invalid argument string.
This patchset includes minor fixes to the crushtool utility. If an invalid
bucket type is specified on the command line, the code was iterating through
bucket_types for the length of the static array, but the last entry in that
array has null (0) values, which was causing a segfault. This patch just
checks that bucket_types[i].name is non-null instead. Also, if the wrong
bucket type or algorithm is specified, print the usage string on exit.
Signed-off-by: Sam Lang <samlang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
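The fix described here amounts to stopping at the sentinel entry instead of walking the full array length. The following Python model shows that lookup pattern; the table contents and the function name are illustrative assumptions, not the crushtool source, and None stands in for the null name of the C sentinel entry.

```python
# Table terminated by a sentinel entry, like a C array ending in {0, 0}.
# The names and ids are illustrative placeholders.
BUCKET_TYPES = [
    ("uniform", 1),
    ("list", 2),
    ("tree", 3),
    ("straw", 4),
    (None, 0),  # sentinel: must never be compared against user input
]


def find_bucket_type(name):
    """Return the id for `name`, or None if the name is not in the table."""
    for entry_name, type_id in BUCKET_TYPES:
        if entry_name is None:  # reached the sentinel: stop before comparing
            break
        if entry_name == name:
            return type_id
    return None
```

A caller that gets None back can then print the usage string and exit, rather than crashing on the sentinel.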
Sage Weil [Wed, 8 Jun 2011 03:48:52 +0000 (20:48 -0700)]
mds: open renamed import child frags during journal replay
Open up any child frags of the imported renamed inode that are noted in
the journal event. (Note we blindly open up that list here; it's up to the
journaler to only populate it when appropriate.) If the listed frags are
not already open, open them up and set the dir_auth to unknown; presumably
they belong to the rename source/exporter. If we already had them open,
then the adjust_subtree_after_rename call above will have caught them and
already done the necessary subtree adjustment.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 8 Jun 2011 03:46:42 +0000 (20:46 -0700)]
mds: journal open srci frags on srci import (master)
If we are importing the renamed inode, and it is a directory, journal a
list of all open dirfrags (currently, this is actually all frags) so that
we can open them up during journal replay.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 8 Jun 2011 03:43:29 +0000 (20:43 -0700)]
mds: journal renames on witnesses if we have nested subtrees
If a rename witness has any subtrees that are nested beneath the renamed
directory, we need to journal the rename event so that our cache is
properly updated on journal replay.
Further, if we are exporting srci, we also need to journal the dest
(even if we aren't auth for destdn) if we have any open dirfrags because
those will turn into nested subtrees shortly.
We still need to ensure that the cache is properly trimmed during replay.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>