git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
12 years agoworkload_generator.cc: remove twice included "common/debug.h"
Danny Al-Gaaf [Mon, 4 Feb 2013 16:54:04 +0000 (17:54 +0100)]
workload_generator.cc: remove twice included "common/debug.h"

Cleanup includes, remove twice included "common/debug.h"

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agotest_idempotent.cc: remove twice included "os/FileStore.h"
Danny Al-Gaaf [Mon, 4 Feb 2013 16:54:03 +0000 (17:54 +0100)]
test_idempotent.cc: remove twice included "os/FileStore.h"

Cleanup includes, remove twice included "os/FileStore.h".

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agotp_bench.cc: remove twice included <iostream>
Danny Al-Gaaf [Mon, 4 Feb 2013 16:54:02 +0000 (17:54 +0100)]
tp_bench.cc: remove twice included <iostream>

Cleanup includes, remove twice included <iostream>.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agosmall_io_bench*.cc: remove twice included <iostream>
Danny Al-Gaaf [Mon, 4 Feb 2013 16:54:01 +0000 (17:54 +0100)]
small_io_bench*.cc: remove twice included <iostream>

Cleanup includes, remove twice included <iostream>.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoMDS.cc: remove twice included common/errno.h
Danny Al-Gaaf [Mon, 4 Feb 2013 16:54:00 +0000 (17:54 +0100)]
MDS.cc: remove twice included common/errno.h

Cleanup includes, remove twice included common/errno.h.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agomon: enforce reweight be between 0..1
Sage Weil [Mon, 4 Feb 2013 17:14:39 +0000 (09:14 -0800)]
mon: enforce reweight be between 0..1

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Luis <joao.luis@inktank.com>
12 years agoqa: smalliobenchrbd workunit
Sage Weil [Sun, 3 Feb 2013 17:28:22 +0000 (09:28 -0800)]
qa: smalliobenchrbd workunit

Run a bunch of parallel smalliobenchrbd processes.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip-rbd-bench'
Sage Weil [Sun, 3 Feb 2013 16:59:48 +0000 (08:59 -0800)]
Merge remote-tracking branch 'gh/wip-rbd-bench'

Conflicts:
ceph.spec.in
debian/ceph-test.install
src/.gitignore

12 years agoMerge branch 'wip-rpm-update3'
Gary Lowell [Sat, 2 Feb 2013 07:26:21 +0000 (23:26 -0800)]
Merge branch 'wip-rpm-update3'

Patches to ceph.spec.in and addition of rbd-fuse package.

12 years agoMerge branch 'master' of https://github.com/ceph/ceph
John Wilkins [Fri, 1 Feb 2013 19:31:10 +0000 (11:31 -0800)]
Merge branch 'master' of https://github.com/ceph/ceph

12 years agodoc: Minor edits.
John Wilkins [Fri, 1 Feb 2013 19:30:30 +0000 (11:30 -0800)]
doc: Minor edits.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agorgw: key indexes are only link to user info
Yehuda Sadeh [Thu, 13 Dec 2012 23:52:34 +0000 (15:52 -0800)]
rgw: key indexes are only link to user info

Instead of keeping multiple copies of the user info,
we just treat the key index as a pointer to the actual
user info (indexed by uid). This helps with two issues:
first, it scales better as we don't need to update the
entire set of keys whenever we make any change. Second,
it helps with the uid index atomicity.
One point to keep in mind is that both the links and the
info can be cached, so effect on performance is minimal.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: caleb miles <caleb.miles@inktank.com>
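
The indirection described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the radosgw implementation: the `UserStore` class, its method names, and the dictionary layout are all hypothetical. Each access key stores only a link (the uid), and the user info exists once, indexed by uid.

```python
# Hypothetical sketch of "key index as a pointer to user info".
# None of these names come from the rgw code base.
class UserStore:
    def __init__(self):
        self.info_by_uid = {}        # uid -> the single copy of the user info
        self.uid_by_access_key = {}  # access key -> uid (the "link")

    def add_user(self, uid, display_name, access_keys):
        self.info_by_uid[uid] = {"uid": uid, "display_name": display_name}
        for key in access_keys:
            # Store only a link, never a second copy of the info.
            self.uid_by_access_key[key] = uid

    def lookup_by_access_key(self, key):
        # Two steps: resolve the link, then read the info by uid.
        return self.info_by_uid[self.uid_by_access_key[key]]

    def set_display_name(self, uid, display_name):
        # One write, regardless of how many keys point at this user.
        self.info_by_uid[uid]["display_name"] = display_name
```

With this layout, a change to the user info touches one record, and every key lookup sees it immediately because the keys never held a copy.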
12 years agoBuild: Add -n to files and description for rbd-fuse in ceph.spec.in
Gary Lowell [Fri, 1 Feb 2013 05:51:44 +0000 (21:51 -0800)]
Build:  Add -n to files and description for rbd-fuse in ceph.spec.in

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agoMakefile: Install new rbd-fuse.8 man page
Gary Lowell [Fri, 1 Feb 2013 05:04:49 +0000 (21:04 -0800)]
Makefile:  Install new rbd-fuse.8 man page

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agobuild: Add new rbd-fuse package
Gary Lowell [Fri, 1 Feb 2013 04:35:26 +0000 (20:35 -0800)]
build:  Add new rbd-fuse package

rbd-fuse is a new facility to map ceph rbd images to files.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agoRevert "Don't install rbd-fuse binary"
Danny Al-Gaaf [Wed, 30 Jan 2013 18:00:40 +0000 (19:00 +0100)]
Revert "Don't install rbd-fuse binary"

This reverts commit 35e5d74e5c5786bc91df5dc10b5c08c77305df4e.

-> fix build instead

12 years agorbd-fuse: quick and dirty manpage
Dan Mick [Fri, 1 Feb 2013 02:43:29 +0000 (18:43 -0800)]
rbd-fuse: quick and dirty manpage

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agorbd-fuse: quick and dirty manpage
Dan Mick [Fri, 1 Feb 2013 02:43:29 +0000 (18:43 -0800)]
rbd-fuse: quick and dirty manpage

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph-filestore-dump.cc: don't use po::value<string>()->required()
Danny Al-Gaaf [Thu, 31 Jan 2013 14:41:19 +0000 (15:41 +0100)]
ceph-filestore-dump.cc: don't use po::value<string>()->required()

Don't use po::value<string>()->required() since this breaks the build on
RHEL/CentOS 6. Instead, check whether the options are set, as other parts
of ceph do.

Move some checks up in the code to validate options as soon
as possible. Remove printing 'help' twice, and check it first.

Fix type description.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agodoc: Added more detail to SSD section. Links to performance blogs.
John Wilkins [Fri, 1 Feb 2013 00:34:02 +0000 (16:34 -0800)]
doc: Added more detail to SSD section. Links to performance blogs.

fixes: #3960

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge pull request #37 from alram/master
Yehuda Sadeh [Fri, 1 Feb 2013 00:19:28 +0000 (16:19 -0800)]
Merge pull request #37 from alram/master

Add important note in doc/radosgw/config.rst

12 years agoAdd important note in doc/radosgw/config.rst
Alexandre Marangone [Thu, 31 Jan 2013 23:58:15 +0000 (15:58 -0800)]
Add important note in doc/radosgw/config.rst

For CentOS and similar, FastCgiWrapper is turned on by default.
This causes Apache to spawn radosgw processes.

12 years agoceph-filestore-dump.cc: don't use po::value<string>()->required()
Danny Al-Gaaf [Thu, 31 Jan 2013 14:41:19 +0000 (15:41 +0100)]
ceph-filestore-dump.cc: don't use po::value<string>()->required()

Don't use po::value<string>()->required() since this breaks the build on
RHEL/CentOS 6. Instead, check whether the options are set, as other parts
of ceph do.

Move some checks up in the code to validate options as soon
as possible. Remove printing 'help' twice, and check it first.

Fix type description.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoceph.spec.in: fix file section for ceph-resource-agents
Danny Al-Gaaf [Wed, 30 Jan 2013 18:00:45 +0000 (19:00 +0100)]
ceph.spec.in: fix file section for ceph-resource-agents

Create needed dirs (/usr/lib/ocf/resource.d/ceph) for the ceph-resource-agents
subpackage.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph.spec.in: extend fix for libedit-devel on special SUSE versions
Danny Al-Gaaf [Wed, 30 Jan 2013 18:00:44 +0000 (19:00 +0100)]
ceph.spec.in: extend fix for libedit-devel on special SUSE versions

Extend fix for libedit-devel on special SUSE versions, use ncurses
also on src/ocf/Makefile and src/java/Makefile

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph.spec.in: don't move libcephfs_jni files around
Danny Al-Gaaf [Wed, 30 Jan 2013 18:00:43 +0000 (19:00 +0100)]
ceph.spec.in: don't move libcephfs_jni files around

Don't move libcephfs_jni files around from %{_libdir} to /usr/lib/jni/
in the buildroot. They should be placed in %{_libdir} as all libs.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoceph.spec.in: move libcephfs_jni.so to ceph-devel
Danny Al-Gaaf [Wed, 30 Jan 2013 18:00:42 +0000 (19:00 +0100)]
ceph.spec.in: move libcephfs_jni.so to ceph-devel

Move libcephfs_jni.so to the ceph-devel package since so-files
shouldn't be part of the library package.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoValidate format strings for CLS_ERR/CLS_LOG
Dan Mick [Thu, 31 Jan 2013 01:33:09 +0000 (17:33 -0800)]
Validate format strings for CLS_ERR/CLS_LOG

cls_log needed __attribute__((format(printf, ...))) to allow the compiler
to crosscheck format strings and arguments.  After adding that, there
needed to be a bunch of fixups for %ll, and a few changes for missing
arguments, etc. uncovered by the checking.

Fixes: #3970
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoqa: update the rbd/concurrent.sh workunit
Alex Elder [Thu, 31 Jan 2013 12:47:59 +0000 (06:47 -0600)]
qa: update the rbd/concurrent.sh workunit

A few changes, now that a few rbd problems have been fixed.
First, the more substantive changes:
    - Generate a source file, and compare what's read back from rbd
      devices with the content of that file.
    - Write to the rbd device such that the written data spans
      an (assumed 4 MB) rbd object boundary, as well as starting
      and ending on non-page-aligned offsets.
    - Perform multiple reads on rbd devices: entirely within a range
      before any written data; beginning before but ending within
      written data; the exact written data (and validating what's
      read); beginning within written data but ending after it;
      reading after written data but within a written rbd object;
      and reading from an unwritten rbd object.
    - Have the sleep between iterations provide a non-integer value
      to avoid zero (or quantized) delays.

Also, some changes that are a little less substantive (but possibly informative):
    - Don't run with "set -x".  It produces a ton of noise that is
      not useful for this test.  This is an exerciser, looking
      really for system crashes during concurrent activity, and
      knowing which commands were (concurrently) active isn't going
      to help much in diagnosis.
    - Create two more directories, used to track the degree of
      concurrency (more or less) and the highest rbd id consumed.
      Files whose names are numbers are touched in each, and the
      highest at the end is the highest during the run.  This gets
      around issues passing environment info from sub-shells to the
      top-level shell.  As a bonus, it offers a better chance of
      avoiding problems due to concurrent update.
    - NAMESDIR is renamed NAMES_DIR, and it (and the others) is
      set up in the setup() function.
    - Increase the concurrency and iteration counts.
    - Move the default definitions before the ceph secrets stuff

Signed-off-by: Alex Elder <elder@inktank.com>
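
The "touch a file named after the number" trick from the commit above can be sketched as follows. The workunit itself is a shell script; this Python version is only an illustration, and the function names `record` and `highest` are hypothetical. Each worker records a value by touching a file whose name is that number, and the parent reads the maximum from the directory listing afterwards, so nothing has to be passed back from sub-shells.

```python
import os
import tempfile

def record(track_dir, value):
    """Touch a file named after `value`; repeated touches are harmless."""
    open(os.path.join(track_dir, str(value)), "a").close()

def highest(track_dir):
    """Return the largest number recorded in `track_dir`, or 0 if none."""
    numbers = [int(name) for name in os.listdir(track_dir) if name.isdigit()]
    return max(numbers, default=0)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as track_dir:
        for rbd_id in (3, 7, 5, 7):   # e.g. ids seen by concurrent workers
            record(track_dir, rbd_id)
        print(highest(track_dir))
```

Because every worker only creates files and never rewrites a shared counter, concurrent updates cannot overwrite each other's values.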
12 years agoAdd ceph-filestore-dump to the packaging
David Zafman [Thu, 31 Jan 2013 02:50:07 +0000 (18:50 -0800)]
Add ceph-filestore-dump to the packaging

Feature: #3890

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agodoc: v0.56.2 release notes
Sage Weil [Wed, 30 Jan 2013 23:41:39 +0000 (15:41 -0800)]
doc: v0.56.2 release notes

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: create tool to extract pg info and pg log from filestore
David Zafman [Wed, 30 Jan 2013 02:21:51 +0000 (18:21 -0800)]
osd: create tool to extract pg info and pg log from filestore

New application ceph-filestore-dump created that mounts the filestore
and can dump info or log in JSON when an OSD is not running.

Feature: #3890

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoMove read_log() function to prep for next commit
David Zafman [Wed, 30 Jan 2013 01:59:45 +0000 (17:59 -0800)]
Move read_log() function to prep for next commit

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoPGMap: fix -Wsign-compare warning
Danny Al-Gaaf [Wed, 30 Jan 2013 17:52:24 +0000 (18:52 +0100)]
PGMap: fix -Wsign-compare warning

Fix -Wsign-compare compiler warning:

mon/PGMap.cc: In member function 'void PGMap::apply_incremental
 (CephContext*, const PGMap::Incremental&)':
mon/PGMap.cc:247:30: warning: comparison between signed and
 unsigned integer expressions [-Wsign-compare]

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agotest_libcephfs: fix xattr test
Sage Weil [Wed, 30 Jan 2013 19:32:23 +0000 (11:32 -0800)]
test_libcephfs: fix xattr test

Ignore the ceph.*.layout xattrs.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoqa: add test for rbd map and snapshots
Sage Weil [Wed, 30 Jan 2013 09:06:03 +0000 (01:06 -0800)]
qa: add test for rbd map and snapshots

This tests for the behavior reported in #3964.  It passes on the current
code, but fails on 3.2 in squeeze (and 32-bit?).

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Wed, 30 Jan 2013 09:05:07 +0000 (01:05 -0800)]
Merge remote-tracking branch 'gh/next'

12 years agocls_rbd, cls_rgw: use PRI*64 when printing/logging 64-bit values
Dan Mick [Wed, 30 Jan 2013 07:05:49 +0000 (23:05 -0800)]
cls_rbd, cls_rgw: use PRI*64 when printing/logging 64-bit values

caused segfaults in 32-bit build

Fixes: #3961
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomds: move lexical_cast and assert re-#include to the top
Sage Weil [Wed, 30 Jan 2013 03:48:25 +0000 (19:48 -0800)]
mds: move lexical_cast and assert re-#include to the top

We should keep the re-#includes immediately following the offender, and
documented.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoDon't install rbd-fuse binary
Dan Mick [Wed, 30 Jan 2013 03:00:27 +0000 (19:00 -0800)]
Don't install rbd-fuse binary

fixes packaging warnings

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agomds/Server.cc: fix warring assert.h's
Dan Mick [Wed, 30 Jan 2013 02:41:20 +0000 (18:41 -0800)]
mds/Server.cc: fix warring assert.h's

New include boost/lexical_cast.hpp apparently drags in the system
assert.h on quantal and squeeze at least, breaking our careful
assert.h; re-include our file to fix it back

Fixes: #3957
Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agomon: require name for 'auth add ...' command
Sage Weil [Wed, 30 Jan 2013 02:41:52 +0000 (18:41 -0800)]
mon: require name for 'auth add ...' command

Otherwise we interpret the empty string as 'unknown.'.

Fixes: #3956
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'origin/wip-fuse-create-fix'
Greg Farnum [Wed, 30 Jan 2013 01:07:49 +0000 (17:07 -0800)]
Merge remote-tracking branch 'origin/wip-fuse-create-fix'

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoinit-ceph: make ulimit -n be part of daemon command
Dan Mick [Tue, 29 Jan 2013 23:18:53 +0000 (15:18 -0800)]
init-ceph: make ulimit -n be part of daemon command

ulimit -n from 'max open files' was being set only on the machine
running /etc/init.d/ceph.  It needs to be added to the commands to
start the daemons, and run both locally and remotely.

Verified by examining /proc/<pid>/limits on local and remote hosts

Fixes: #3900
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Loïc Dachary <loic@dachary.org>
Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
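
The idea in the commit above, making the limit part of the command that starts the daemon rather than setting it only in the calling shell, can be sketched as below. init-ceph is a shell script; this Python version only illustrates the string composition. The helper names, the example daemon path, and the use of `ssh` for the remote case are assumptions for illustration.

```python
import shlex

def daemon_command(daemon_cmd, max_open_files=None):
    """Prefix the daemon command with `ulimit -n` so the limit applies
    in whichever shell finally runs the daemon."""
    if not max_open_files:
        return daemon_cmd
    return "ulimit -n {}; {}".format(int(max_open_files), daemon_cmd)

def remote_command(host, cmd):
    """Wrap a command for a remote host (assumes ssh; hypothetical)."""
    return "ssh {} {}".format(shlex.quote(host), shlex.quote(cmd))

if __name__ == "__main__":
    cmd = daemon_command("/usr/bin/some-daemon --foreground", 8192)
    print(cmd)                           # run locally
    print(remote_command("node1", cmd))  # same limit applied remotely
```

Since the `ulimit` call travels inside the command string, the remote shell applies it too, which a `ulimit` issued only on the machine running the init script cannot do.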
12 years agoMerge remote-tracking branch 'gh/wip-recovery-stats-b'
Sage Weil [Wed, 30 Jan 2013 00:34:21 +0000 (16:34 -0800)]
Merge remote-tracking branch 'gh/wip-recovery-stats-b'

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge branch 'wip-vxattr'
Sage Weil [Wed, 30 Jan 2013 00:26:57 +0000 (16:26 -0800)]
Merge branch 'wip-vxattr'

Reviewed-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoqa: add layout_vxattrs.sh test script
Sage Weil [Sat, 19 Jan 2013 19:33:04 +0000 (11:33 -0800)]
qa: add layout_vxattrs.sh test script

Test virtual xattrs for file and directory layouts.

TODO: create a data pool, add it to the fs, and make sure we can use it.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomds: allow dir layout/policy to be removed via removexattr on ceph.dir.layout
Sage Weil [Sat, 19 Jan 2013 18:11:18 +0000 (10:11 -0800)]
mds: allow dir layout/policy to be removed via removexattr on ceph.dir.layout

This lets a user remove a policy that was previously set on a dir.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomds: handle ceph.*.layout.* setxattr
Sage Weil [Sat, 19 Jan 2013 18:09:39 +0000 (10:09 -0800)]
mds: handle ceph.*.layout.* setxattr

Allow individual fields of file or dir layouts to be set via setxattr.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomds: fix client view of dir layout when layout is removed
Sage Weil [Sat, 19 Jan 2013 18:04:05 +0000 (10:04 -0800)]
mds: fix client view of dir layout when layout is removed

We weren't handling the case where the projected node has NULL for the
layout properly.  Fixes the client's view when we remove the dir layout.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoclient: note presence of dir layout in inode operator<<
Sage Weil [Sat, 19 Jan 2013 18:04:39 +0000 (10:04 -0800)]
client: note presence of dir layout in inode operator<<

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoclient: list only aggregate xattr, but allow setting subfield xattrs
Sage Weil [Sat, 19 Jan 2013 17:05:59 +0000 (09:05 -0800)]
client: list only aggregate xattr, but allow setting subfield xattrs

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoclient: implement ceph.file.* and ceph.dir.* vxattrs
Sage Weil [Sat, 19 Jan 2013 06:26:00 +0000 (22:26 -0800)]
client: implement ceph.file.* and ceph.dir.* vxattrs

Display ceph.file.* vxattrs on any regular file, and ceph.dir.* vxattrs
on any directory that has a policy set.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoclient: move xattr namespace enforcement into internal method
Sage Weil [Sat, 19 Jan 2013 01:21:37 +0000 (17:21 -0800)]
client: move xattr namespace enforcement into internal method

This captures libcephfs users now too.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoclient: allow ceph.* xattrs
Sage Weil [Sat, 19 Jan 2013 01:20:22 +0000 (17:20 -0800)]
client: allow ceph.* xattrs

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomds: open mydir after replay
Sage Weil [Fri, 18 Jan 2013 06:00:42 +0000 (22:00 -0800)]
mds: open mydir after replay

In certain cases, we may replay the journal and not end up with the
dirfrag for mydir open.  This is fine--we just need to open it up and
fetch it below.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoObjectCacher: fix flush_set when no flushing is needed
Josh Durgin [Tue, 29 Jan 2013 22:22:15 +0000 (14:22 -0800)]
ObjectCacher: fix flush_set when no flushing is needed

C_GatherBuilder takes ownership of the Context we pass it. Deleting it
in flush_set after constructing the C_GatherBuilder results in a
double delete.

Fixes: #3946
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sam Lang <sam.lang@inktank.com>
12 years agoqa: add rbd/concurrent workunit
Alex Elder [Tue, 29 Jan 2013 21:51:13 +0000 (15:51 -0600)]
qa: add rbd/concurrent workunit

This defines a new workunit shell script that performs a bunch of
rbd operations concurrently in order to exercise code paths and
catch reference count and bad pointer problems.

Signed-off-by: Alex Elder <elder@inktank.com>
12 years agomds: Send created ino in journaled_reply
Sam Lang [Tue, 29 Jan 2013 17:28:00 +0000 (11:28 -0600)]
mds: Send created ino in journaled_reply

The MDS avoids sending an early reply if a request
triggered inode allocation (no preallocated inodes yet).
For create, this prevented the created ino from being
sent back to the client, which is used to indicate
creation (as opposed to already existing) of the file.
This commit fixes the issue by adding the created ino
to the journaled (safe) reply.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
12 years agoclient: Don't use geteuid/gid for fuse ll_create
Sam Lang [Tue, 29 Jan 2013 16:18:29 +0000 (10:18 -0600)]
client: Don't use geteuid/gid for fuse ll_create

Fixes a bug in ll_create where files that already exist at the MDS
don't get the created flag set on reply.  This causes a permissions
check, which fails because geteuid/getegid are 0/0 for ll_create.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
12 years agoceph.spec.in: package rbd udev rule
Gary Lowell [Tue, 29 Jan 2013 06:49:45 +0000 (22:49 -0800)]
ceph.spec.in: package rbd udev rule

Package udev/50-rbd.rules per bug 3930.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agomon: smooth pg stat rates over last N pgmaps
Sage Weil [Tue, 29 Jan 2013 03:46:33 +0000 (19:46 -0800)]
mon: smooth pg stat rates over last N pgmaps

This smooths the recovery and throughput stats over the last N pgmaps,
defaulting to 2.

Signed-off-by: Sage Weil <sage@inktank.com>
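
One way to smooth a rate over the last N samples is sketched below. This is not the PGMap code: the class name `SmoothedRate`, the choice to keep the last N deltas between consecutive samples, and the time-weighted average are all assumptions made for illustration.

```python
from collections import deque

class SmoothedRate:
    """Average a counter's rate over the last `window` deltas (hypothetical)."""

    def __init__(self, window=2):
        # Each entry is (delta_count, delta_seconds) between two samples.
        self.deltas = deque(maxlen=window)
        self.last = None  # (timestamp, count) of the previous sample

    def update(self, timestamp, count):
        if self.last is not None:
            self.deltas.append((count - self.last[1], timestamp - self.last[0]))
        self.last = (timestamp, count)

    def rate(self):
        total_seconds = sum(dt for _, dt in self.deltas)
        if total_seconds <= 0:
            return 0.0
        return sum(dc for dc, _ in self.deltas) / total_seconds
```

Dividing the summed deltas by the summed intervals weights each sample by its duration, so one short, bursty interval does not dominate the reported rate.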
12 years agoMerge remote-tracking branch 'yan/wip-mds'
Sage Weil [Tue, 29 Jan 2013 03:17:48 +0000 (19:17 -0800)]
Merge remote-tracking branch 'yan/wip-mds'

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agodoc: fix overly-big fixed-width text in Firefox
Ross Turk [Tue, 29 Jan 2013 03:03:56 +0000 (19:03 -0800)]
doc: fix overly-big fixed-width text in Firefox

Changed font size for <pre> elements to be 15pt instead of 1.5em - Firefox seems to render 1.1em a bit bigger than other browsers.

Signed-off-by: Ross Turk <ross@inktank.com>
12 years agomon/PGMap: report IO rates
Sage Weil [Sat, 26 Jan 2013 03:51:40 +0000 (19:51 -0800)]
mon/PGMap: report IO rates

This does not appear to be very accurate; probably the stat values we're
displaying are not being calculated correctly.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/PGMap: report recovery rates
Sage Weil [Sat, 26 Jan 2013 03:51:14 +0000 (19:51 -0800)]
mon/PGMap: report recovery rates

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/PGMap: include timestamp
Sage Weil [Sat, 26 Jan 2013 03:50:45 +0000 (19:50 -0800)]
mon/PGMap: include timestamp

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: track recovery ops in stats
Sage Weil [Sat, 26 Jan 2013 03:49:16 +0000 (19:49 -0800)]
osd: track recovery ops in stats

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd_types: add recovery counts to object_sum_stats_t
Sage Weil [Sat, 26 Jan 2013 03:06:52 +0000 (19:06 -0800)]
osd_types: add recovery counts to object_sum_stats_t

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agorbd-fuse: fix warning
Sage Weil [Tue, 29 Jan 2013 02:27:53 +0000 (18:27 -0800)]
rbd-fuse: fix warning

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc: Removed indep, and clarified explanation.
John Wilkins [Tue, 29 Jan 2013 02:44:07 +0000 (18:44 -0800)]
doc: Removed indep, and clarified explanation.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agomds: clear inode dirty when slave rename finishes.
Yan, Zheng [Sun, 27 Jan 2013 07:31:47 +0000 (15:31 +0800)]
mds: clear inode dirty when slave rename finishes.

The inode is linked to a non-auth directory, so remove it from LogSegment's
dirty inode list.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: mark export bounds for cross authority directory rename
Yan, Zheng [Sun, 27 Jan 2013 07:22:46 +0000 (15:22 +0800)]
mds: mark export bounds for cross authority directory rename

this guarantees that the importing MDS gets directory fragment's
up-to-date fragstat/rstat.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: allow handling slave request in the clientreplay stage
Yan, Zheng [Sun, 27 Jan 2013 07:16:19 +0000 (15:16 +0800)]
mds: allow handling slave request in the clientreplay stage

Replaying a client request may need to create a slave request, and the slave
MDS can also be in the clientreplay stage.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix 'discover' handling in the rejoin stage
Yan, Zheng [Sun, 27 Jan 2013 07:14:55 +0000 (15:14 +0800)]
mds: fix 'discover' handling in the rejoin stage

If the MDS is in the resolve stage, the current MDCache::handle_discover() only
handles 'discover' from an MDS from which it has already received a rejoin
acknowledgement. This can cause a circular wait because
MDCache::rejoin_gather_finish() fetches reconnected inodes before sending rejoin
acknowledgements, and fetching a reconnected inode may trigger 'discover'. The
fix is to not delay handling 'discover' from MDS that are also in the rejoin stage.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: add projected rename's subtree bounds to ESubtreeMap
Yan, Zheng [Sun, 27 Jan 2013 06:41:50 +0000 (14:41 +0800)]
mds: add projected rename's subtree bounds to ESubtreeMap

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fetch missing inodes from disk
Yan, Zheng [Sat, 19 Jan 2013 01:24:12 +0000 (09:24 +0800)]
mds: fetch missing inodes from disk

The problem with fetching missing inodes from replicas is that replicated inodes
do not have up-to-date rstat and fragstat. So just fetch missing inodes from
disk.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: rejoin remote wrlocks and frozen auth pin
Yan, Zheng [Sat, 19 Jan 2013 01:17:22 +0000 (09:17 +0800)]
mds: rejoin remote wrlocks and frozen auth pin

Includes remote wrlocks and frozen authpin in cache rejoin strong message

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: move variables special to rename into MDRequest::more
Yan, Zheng [Fri, 18 Jan 2013 14:54:02 +0000 (22:54 +0800)]
mds: move variables special to rename into MDRequest::more

My previous patches add two pointers (ambiguous_auth_inode and
auth_pin_freeze) to class Mutation. They are both used by cross
authority rename, and both point to the renamed inode. Later patches
need to add more rename-specific state to MDRequest, so just move them
into MDRequest::more.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: properly clear CDir::STATE_COMPLETE when replaying EImportStart
Yan, Zheng [Mon, 21 Jan 2013 14:05:42 +0000 (22:05 +0800)]
mds: properly clear CDir::STATE_COMPLETE when replaying EImportStart

When replaying EImportStart, we should set/clear the directory's COMPLETE
flag according to the flag in the journal entry.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: don't journal opened non-auth inode
Yan, Zheng [Mon, 21 Jan 2013 02:04:03 +0000 (10:04 +0800)]
mds: don't journal opened non-auth inode

If we journal opened non-auth inode, during journal replay, the corresponding
entry will add non-auth objects to the cache. But the MDS does not journal all
subsequent modifications (rmdir,rename) to these non-auth objects, so the code
that manages cache and subtree may get confused. Besides non-auth objects will
be trimmed at the resolve stage.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: journal inode's projected parent when doing link rollback
Yan, Zheng [Wed, 16 Jan 2013 12:25:30 +0000 (20:25 +0800)]
mds: journal inode's projected parent when doing link rollback

Otherwise the journal entry will revert the effect of any on-going
rename operation for the inode.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix for MDCache::disambiguate_imports
Yan, Zheng [Wed, 16 Jan 2013 12:22:03 +0000 (20:22 +0800)]
mds: fix for MDCache::disambiguate_imports

In the resolve stage, if no MDS claims another MDS's ambiguous subtree
import, the subtree's dir_auth is undefined.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix for MDCache::adjust_bounded_subtree_auth
Yan, Zheng [Wed, 16 Jan 2013 12:17:23 +0000 (20:17 +0800)]
mds: fix for MDCache::adjust_bounded_subtree_auth

After swallowing extra subtrees, subtree bounds may change, so it
should re-check.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: don't replace existing slave request
Yan, Zheng [Wed, 16 Jan 2013 11:58:49 +0000 (19:58 +0800)]
mds: don't replace existing slave request

The MDS may receive a client request, but find there is an existing
slave request. This means another MDS is handling the same request, so
we should not replace the slave request with a new client request;
just forward the request.

The client request may include embedded cap releases; we need to process
them even if the request is forwarded.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: always use {push,pop}_projected_linkage to change linkage
Yan, Zheng [Wed, 16 Jan 2013 11:38:38 +0000 (19:38 +0800)]
mds: always use {push,pop}_projected_linkage to change linkage

Current code skips using {push,pop}_projected_linkage to modify a replica
dentry's linkage. This confuses EMetaBlob::add_dir_context() and makes
it record an out-of-date path when TO_ROOT mode is used. This patch changes
the code to always use {push,pop}_projected_linkage to modify a dentry's
linkage. It makes sure MDCache::create_subtree_map() records a correct and
up-to-date subtree map.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: send resolve messages after all MDS reach resolve stage
Yan, Zheng [Sat, 19 Jan 2013 01:49:04 +0000 (09:49 +0800)]
mds: send resolve messages after all MDS reach resolve stage

Current code sends resolve messages when resolving MDS set changes.
There is no need to send resolve messages when some MDS leave the
resolve stage. Sending message while some MDS are replaying is also
not very useful.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: split resolve into two sub-stages
Yan, Zheng [Fri, 18 Jan 2013 11:41:48 +0000 (19:41 +0800)]
mds: split resolve into two sub-stages

The resolve stage serves to disambiguate the fate of uncommitted slave
updates and to resolve subtree authority. The MDS sends resolve messages
that claim subtree authority immediately when the resolve stage is entered.
When receiving a resolve message, the MDS also processes it immediately.
This may cause problems if there are uncommitted slave renames and some of
them need rollback later, because slave rename rollback may modify the
subtree map.

The fix is to split resolve into two sub-stages. The first sub-stage serves
to disambiguate slave updates and do slave commit or rollback. After the
first sub-stage finishes, the MDS sends resolve messages that claim
subtree authority to other MDS and processes received resolve messages.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix slave rename rollback
Yan, Zheng [Sat, 19 Jan 2013 05:00:29 +0000 (13:00 +0800)]
mds: fix slave rename rollback

The main issue of old slave rename rollback code is that it assumes
all affected objects are in the cache. The assumption is not true
when MDS does rollback in the resolve stage. This patch removes the
assumption and makes Server::do_rename_rollback() check individual
object and roll back change.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: preserve non-auth/unlinked objects until slave commit
Yan, Zheng [Sat, 19 Jan 2013 04:57:31 +0000 (12:57 +0800)]
mds: preserve non-auth/unlinked objects until slave commit

The MDS should not trim objects in non-auth subtree immediately after
replaying a slave rename. Because the slave rename may require rollback
later and these objects are needed for rollback.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: don't journal non-auth rename source directory
Yan, Zheng [Sun, 20 Jan 2013 11:23:38 +0000 (19:23 +0800)]
mds: don't journal non-auth rename source directory

After replaying a slave rename, non-auth directory that we rename out of will
be trimmed. So there is no need to journal it.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: force journal straydn for rename if necessary
Yan, Zheng [Fri, 18 Jan 2013 06:08:45 +0000 (14:08 +0800)]
mds: force journal straydn for rename if necessary

A rename may overwrite an empty directory inode and move it into the stray
directory. An MDS that has an auth subtree beneath the overwritten directory
needs to journal the stray dentry when handling the rename slave request.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: splits rename force journal check into separate function
Yan, Zheng [Sat, 19 Jan 2013 11:03:01 +0000 (19:03 +0800)]
mds: splits rename force journal check into separate function

the function will be used by later patch that fixes rename rollback

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix "had dentry linked to wrong inode" warning
Yan, Zheng [Fri, 18 Jan 2013 02:47:21 +0000 (10:47 +0800)]
mds: fix "had dentry linked to wrong inode" warning

The reason for the "had dentry linked to wrong inode" warning is that
Server::_rename_prepare() adds the destdir to the EMetaBlob before
adding the straydir. So during MDS recovery, the destdir is replayed
first, and the old inode is directly replaced by the source inode.
We can avoid the warning by adding the straydir first.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: don't set xlocks on dentries done when early reply rename
Yan, Zheng [Sat, 19 Jan 2013 00:30:23 +0000 (08:30 +0800)]
mds: don't set xlocks on dentries done when early reply rename

_rename_finish() does not send dentry link/unlink message to replicas.
We should prevent dentries that are modified by the rename operation
from getting new replicas while the rename operation is committing.
So don't set xlocks on dentries "done".

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Tue, 29 Jan 2013 02:15:35 +0000 (18:15 -0800)]
Merge remote-tracking branch 'gh/next'

12 years agoMerge branch 'master' of https://github.com/ceph/ceph
John Wilkins [Tue, 29 Jan 2013 01:51:20 +0000 (17:51 -0800)]
Merge branch 'master' of https://github.com/ceph/ceph

12 years agodoc: Updated to add indep and firstn to chooseleaf. Num only used with firstn.
John Wilkins [Tue, 29 Jan 2013 01:50:47 +0000 (17:50 -0800)]
doc: Updated to add indep and firstn to chooseleaf. Num only used with firstn.

fixes: #3711

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agorgw: fix crash when missing content-type in POST object
Yehuda Sadeh [Tue, 29 Jan 2013 01:13:23 +0000 (17:13 -0800)]
rgw: fix crash when missing content-type in POST object

Fixes: #3941
This fixes a crash when handling S3 POST request and content type
is not provided.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoMerge branch 'wip-pool-delete'
Josh Durgin [Tue, 29 Jan 2013 00:53:41 +0000 (16:53 -0800)]
Merge branch 'wip-pool-delete'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>