Alex Elder [Thu, 1 Nov 2012 18:30:11 +0000 (13:30 -0500)]
run_xfstests.sh: add optional iteration count
This adds a "-c <count>" option to the run_xfstests.sh script so
the full set of tests can be repeated more than once without having
to go through the setup process each time.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
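The effect of the "-c <count>" option can be sketched as follows. This is only an illustration in Python, not the shell script itself; parse_args, run_iterations, and the setup/run_tests/teardown callables are hypothetical names, and the default count of 1 is an assumption.

```python
import argparse


def parse_args(argv):
    # Only the "-c <count>" option is modelled here; the default of 1 is
    # an assumption made for this sketch.
    parser = argparse.ArgumentParser(prog="run_tests_sketch")
    parser.add_argument("-c", dest="count", type=int, default=1,
                        help="number of times to repeat the full test set")
    return parser.parse_args(argv)


def run_iterations(count, setup, run_tests, teardown):
    # Set up once, repeat the test set `count` times, then tear down once.
    setup()
    results = []
    try:
        for iteration in range(1, count + 1):
            results.append(run_tests(iteration))
    finally:
        teardown()
    return results
```

The point of the design is that the setup cost is paid once regardless of how many times the tests are repeated.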
Sage Weil [Tue, 30 Oct 2012 21:17:56 +0000 (14:17 -0700)]
ceph-disk-activate: avoid duplicating mounts if already activated
If the given device is already mounted at the target location, do not
mount --move it again and create a bunch of dup entries in the /etc/mtab
and kernel mount table.
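A minimal sketch of such a check, assuming the mount table is read as text in the /proc/mounts layout (device, mount point, and further fields separated by whitespace). The function names is_mounted_at and plan_activation are hypothetical and are not taken from ceph-disk-activate.

```python
def is_mounted_at(device, target, mounts_text):
    """Return True if `device` is already listed as mounted at `target`."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if fields[0] == device and fields[1] == target:
            return True
    return False


def plan_activation(device, target, mounts_text):
    # "skip" means the mount is already in place; "move" means the caller
    # would still need to mount --move the device to the target.
    if is_mounted_at(device, target, mounts_text):
        return "skip"
    return "move"
```

In real use, mounts_text would come from reading /proc/mounts; it is passed in here so the logic can be exercised without touching the system.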
Samuel Just [Mon, 15 Oct 2012 22:39:55 +0000 (15:39 -0700)]
FileJournal: break writeq locking from queue_lock
This prevents the relatively long process of queueing
finishers from preventing op submission.
In submit_entry, we no longer check for full before placing
the write in the writeq; committed_thru should work anyway,
and we don't want to grab the required lock.
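The idea of giving the writeq its own lock can be illustrated with a small Python sketch. This is not the FileJournal code: the class, finisher_lock, and queue_finishers are hypothetical names, and only writeq and submit_entry come from the commit message.

```python
import threading
from collections import deque


class JournalSketch:
    def __init__(self):
        # Separate locks: submitting a write never waits on finisher queueing.
        self.writeq_lock = threading.Lock()
        self.finisher_lock = threading.Lock()  # hypothetical name
        self.writeq = deque()
        self.completions = deque()

    def submit_entry(self, seq, data):
        # Only the writeq lock is taken; no "journal full" check happens here.
        with self.writeq_lock:
            self.writeq.append((seq, data))

    def queue_finishers(self, seqs):
        # Potentially slow work is done under its own lock.
        with self.finisher_lock:
            self.completions.extend(seqs)
```

Because submit_entry touches only writeq_lock, a thread that holds finisher_lock for a long time does not delay op submission.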
Samuel Just [Sat, 6 Oct 2012 00:33:36 +0000 (17:33 -0700)]
JournalingFileStore: move apply/commit sequencing to apply_manager
Syncing the filestore requires a stable commit point (i.e., all ops
up to applied_seq must have been applied). Previously, we used
journal_lock to atomically block new applies while waiting for
the remaining ones to finish. This creates unnecessary contention.
We now use apply_manager to manage that state atomically with its
own lock.
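One way to picture this is a small manager that counts in-flight applies under its own lock and lets a commit block new applies until the open ones drain. The Python below is a sketch under that assumption; the class and its method and attribute names are hypothetical and do not reproduce the C++ apply_manager.

```python
import threading


class ApplyManagerSketch:
    def __init__(self):
        self._cond = threading.Condition()  # the manager's own lock
        self._open_ops = 0
        self._blocked = False
        self.max_applied_seq = 0

    def op_apply_start(self, seq):
        with self._cond:
            # New applies wait while a commit is establishing its stable point.
            while self._blocked:
                self._cond.wait()
            self._open_ops += 1
        return seq

    def op_apply_finish(self, seq):
        with self._cond:
            self._open_ops -= 1
            self.max_applied_seq = max(self.max_applied_seq, seq)
            self._cond.notify_all()

    def commit_start(self):
        with self._cond:
            # Block new applies, then wait for in-flight applies to finish.
            self._blocked = True
            while self._open_ops > 0:
                self._cond.wait()
            return self.max_applied_seq

    def commit_finish(self):
        with self._cond:
            self._blocked = False
            self._cond.notify_all()
```

The value returned by commit_start is a stable commit point in the sense described above: no apply is in flight when it is read.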
Samuel Just [Fri, 5 Oct 2012 20:46:13 +0000 (13:46 -0700)]
JournalingFileStore: create submit_manager to order op submission
Previously, we ensured op ordering by queueing for journal and
the op queue under the journal lock. All that is required is
that obtaining an op sequence, queueing for journal, and
(for parallel) queueing for application to the fs are done
atomically. To that end, submit_manager now handles op submission.
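The atomicity requirement can be sketched in Python as a single lock that covers sequence allocation and both queueing steps. The class, its attributes, and the parallel flag are hypothetical illustrations rather than the actual submit_manager interface.

```python
import threading
from collections import deque


class SubmitManagerSketch:
    def __init__(self):
        self._lock = threading.Lock()
        self._next_seq = 0
        self.journal_q = deque()
        self.apply_q = deque()

    def submit(self, op, parallel=False):
        # Sequence allocation and queueing happen under one lock, so the
        # queues always hold ops in sequence order.
        with self._lock:
            self._next_seq += 1
            seq = self._next_seq
            self.journal_q.append((seq, op))
            if parallel:
                # In parallel mode the op is also queued for application.
                self.apply_q.append((seq, op))
        return seq
```

Nothing broader than this one lock is needed to keep ordering, which is the observation the commit message makes about the journal lock.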
Samuel Just [Tue, 31 Jul 2012 16:04:40 +0000 (09:04 -0700)]
JournalingFileStore: pass -1 as the alignment if unimportant
Previously, data_align began at 0 and remained that way if no
transaction contained a large data segment. This 0 was propagated
to prepare_single_write, which padded out most of a page to ensure
that the bl started with 0 alignment. Passing -1 will ensure that
we don't prepad these small segments.
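A simplified model of the padding arithmetic shows why the sentinel matters. The function prepad_bytes, the header_len parameter, and the 4096-byte page size are assumptions made for this sketch; the real computation in prepare_single_write may differ in detail.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def prepad_bytes(header_len, data_align):
    """Bytes of padding inserted so the data lands at `data_align` in a page.

    A negative `data_align` means alignment is unimportant, so nothing
    is padded.
    """
    if data_align < 0:
        return 0
    return (data_align - header_len) % PAGE_SIZE
```

With a 100-byte header, an alignment of 0 costs 3996 bytes of padding, which is most of a page, while an alignment of -1 costs none.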
Sage Weil [Tue, 30 Oct 2012 20:19:30 +0000 (13:19 -0700)]
msg/SimpleMessenger: start accepter in ready()
Start the accepter thread when the first dispatcher is ready. This ensures
that there will be someone around to verify authorizers for incoming
connections, and means we have a bit less failure noise on the monitors
as a result.
Sage Weil [Fri, 26 Oct 2012 04:21:18 +0000 (21:21 -0700)]
ceph-disk-prepare: poke kernel into refreshing partition tables
Prod the kernel to refresh the partition table after we create one. The
partprobe program is packaged with parted, which we already use, so this
introduces no new dependency.
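A minimal sketch of such a call from Python, assuming partprobe is on the PATH and is given the device as its argument. The helper names are hypothetical, and the runner parameter exists only so the command can be inspected without touching a real disk.

```python
import subprocess


def partprobe_command(device):
    # partprobe asks the kernel to re-read the partition table of `device`.
    return ["partprobe", device]


def refresh_partition_table(device, runner=subprocess.check_call):
    # By default this executes partprobe and raises if it exits non-zero.
    return runner(partprobe_command(device))
```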
Sage Weil [Tue, 30 Oct 2012 17:00:42 +0000 (10:00 -0700)]
msg/Pipe: only randomize start seq #'s if MSG_AUTH feature is present
The kernel client expects seq #'s to start at 1 or else it is unhappy.
So, only randomize these values if the MSG_AUTH feature is present--that is
the only time it matters anyway.
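The rule can be sketched as a feature-gated choice of the first sequence number. The bit value of FEATURE_MSG_AUTH, the random range, and the function name first_seq are assumptions for illustration only.

```python
import random

FEATURE_MSG_AUTH = 1 << 0  # hypothetical bit position, for illustration only


def first_seq(peer_features, rng=random):
    """Pick the first message sequence number for a connection."""
    if peer_features & FEATURE_MSG_AUTH:
        # Peers that negotiated MSG_AUTH can handle a random starting point.
        return rng.randrange(1, 2 ** 31)
    # Other peers, such as the kernel client, expect numbering to start at 1.
    return 1
```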
Sage Weil [Mon, 29 Oct 2012 18:03:46 +0000 (11:03 -0700)]
osd: make pool_snap_info_t encoding backward compatible
Way back in fc869dee1e8a1c90c93cb7e678563772fb1c51fb (v0.42) when we redid
the osd type encoding we forgot to make this conditionally encode the old
format for old clients. In particular, this means that kernel clients
will fail to decode the osdmap if there is a rados pool with a pool-level
snapshot defined.
Fixes: #3290
Signed-off-by: Sage Weil <sage@inktank.com>
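Conditional encoding of this kind usually keys off the peer's feature bits. The Python below is only a sketch of the pattern: the feature bit, both byte layouts, and the function name are hypothetical and do not reproduce the actual pool_snap_info_t wire format.

```python
import struct

FEATURE_NEW_ENCODING = 1 << 0  # hypothetical feature bit


def encode_snap_info(snapid, stamp, name, peer_features):
    raw = name.encode("utf-8")
    # Body: 64-bit snap id, 32-bit timestamp, length-prefixed name.
    body = struct.pack("<QII", snapid, stamp, len(raw)) + raw
    if peer_features & FEATURE_NEW_ENCODING:
        # Newer peers: version byte, compat byte, body length, then body.
        return struct.pack("<BBI", 2, 2, len(body)) + body
    # Older peers: a single version byte followed directly by the body.
    return struct.pack("<B", 1) + body
```

An encoder that always emits the newer layout is what breaks old decoders; branching on the peer's features keeps both sides working.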
Sam Lang [Mon, 29 Oct 2012 15:30:01 +0000 (10:30 -0500)]
client: Fix ref counting double free with hardlink
Performing a hard link through the libcephfs interface causes
a double free on shutdown, because the Client::link call decrements
the reference count on the inode of the target's parent directory.
This fix removes the put_inode(dir) call, to match the behavior of
Client::ll_link.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Dan Mick [Tue, 23 Oct 2012 04:15:51 +0000 (21:15 -0700)]
librbd: clip requests past end-of-image.
Rename check_io to clip_io, which can modify the passed-in length
to clamp it to the device size. This is expected behavior for
block-device emulation.
Call clip_io in rbd_write(); need to return clipped length there,
even though aio_write() is calling clip_io() as well (for the
direct path).
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
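The clamping behavior described in this commit can be sketched in Python as below. The signature is hypothetical; in particular, treating zero-length requests as always valid and rejecting a request that starts at or past the end of the image are assumptions of this sketch, not statements about librbd.

```python
def clip_io(image_size, offset, length):
    """Return `length` clamped so the request ends at or before `image_size`."""
    if length == 0:
        return 0  # assumption: empty requests are always valid
    if offset >= image_size:
        # Assumption: a request starting past the end is an error.
        raise ValueError("request starts past end of image")
    return min(length, image_size - offset)
```

For a 100-byte image, a 20-byte request at offset 90 is clipped to 10 bytes, and that clipped value is what a synchronous write would report back to the caller.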
Yan, Zheng [Fri, 26 Oct 2012 06:26:35 +0000 (14:26 +0800)]
mds: Fix SnapRealm differ check in CInode::encode_inodestat()
When checking whether the inode's SnapRealm differs from the readdir
SnapRealm, we should use find_snaprealm() to get the inode's SnapRealm.
Without this fix, I got lots of "ceph_add_cap: couldn't find snap
realm 100" from kernel client.
Sage Weil [Fri, 26 Oct 2012 22:48:52 +0000 (15:48 -0700)]
mds: allow try_eval to eval replica locks
Allow try_eval(MDSCacheObject*, int mask) to eval locks on replica objects
so that they don't get stuck in an unstable state. The eval(CInode*, mask)
handles the non-auth case already. For the dentry case, call eval_any(), which
handles the non-auth case, instead of directly calling simple_eval(), which
does not.
Reported-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Yan, Zheng [Thu, 25 Oct 2012 12:26:50 +0000 (20:26 +0800)]
mds: Fix stray check in Migrator::export_dir()
Commit f8110c (Allow export subtrees in other MDS' stray directory)
makes the "directory in stray" check always return false. This is
because the directory in question is a grandchild of mdsdir.
Yan, Zheng [Thu, 25 Oct 2012 12:26:49 +0000 (20:26 +0800)]
mds: fix stray migration/reintegration check in handle_client_rename
The stray migration/reintegration generates a source path that will
be rooted in a (possibly remote) MDS's MDSDIR; adjust the check in
handle_client_rename() accordingly.