git.apps.os.sepia.ceph.com Git - ceph.git/log
12 years agomds: move variables special to rename into MDRequest::more
Yan, Zheng [Fri, 18 Jan 2013 14:54:02 +0000 (22:54 +0800)]
mds: move variables special to rename into MDRequest::more

My previous patches added two pointers (ambiguous_auth_inode and
auth_pin_freeze) to class Mutation. Both are used by cross-authority
rename, and both point to the renamed inode. Later patches need to
add more rename-specific state to MDRequest, so just move them into
MDRequest::more.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: properly clear CDir::STATE_COMPLETE when replaying EImportStart
Yan, Zheng [Mon, 21 Jan 2013 14:05:42 +0000 (22:05 +0800)]
mds: properly clear CDir::STATE_COMPLETE when replaying EImportStart

When replaying EImportStart, we should set or clear the directory's
COMPLETE flag according to the flag in the journal entry.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: don't journal opened non-auth inode
Yan, Zheng [Mon, 21 Jan 2013 02:04:03 +0000 (10:04 +0800)]
mds: don't journal opened non-auth inode

If we journal an opened non-auth inode, then during journal replay the
corresponding entry will add non-auth objects to the cache. But the MDS does
not journal all subsequent modifications (rmdir, rename) to these non-auth
objects, so the code that manages the cache and subtrees may get confused.
Besides, non-auth objects will be trimmed at the resolve stage.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: journal inode's projected parent when doing link rollback
Yan, Zheng [Wed, 16 Jan 2013 12:25:30 +0000 (20:25 +0800)]
mds: journal inode's projected parent when doing link rollback

Otherwise the journal entry will revert the effect of any on-going
rename operation for the inode.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix for MDCache::disambiguate_imports
Yan, Zheng [Wed, 16 Jan 2013 12:22:03 +0000 (20:22 +0800)]
mds: fix for MDCache::disambiguate_imports

In the resolve stage, if no MDS claims another MDS's ambiguous subtree
import, the subtree's dir_auth is undefined.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix for MDCache::adjust_bounded_subtree_auth
Yan, Zheng [Wed, 16 Jan 2013 12:17:23 +0000 (20:17 +0800)]
mds: fix for MDCache::adjust_bounded_subtree_auth

After swallowing extra subtrees, the subtree bounds may change, so they
should be re-checked.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: don't replace existing slave request
Yan, Zheng [Wed, 16 Jan 2013 11:58:49 +0000 (19:58 +0800)]
mds: don't replace existing slave request

The MDS may receive a client request but find that there is an existing
slave request. This means another MDS is handling the same request, so
we should not replace the slave request with a new client request;
just forward the request.

The client request may include embedded cap releases; we need to process
them even if the request is forwarded.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: always use {push,pop}_projected_linkage to change linkage
Yan, Zheng [Wed, 16 Jan 2013 11:38:38 +0000 (19:38 +0800)]
mds: always use {push,pop}_projected_linkage to change linkage

The current code skips {push,pop}_projected_linkage when modifying a
replica dentry's linkage. This confuses EMetaBlob::add_dir_context() and
makes it record an out-of-date path when TO_ROOT mode is used. This patch
changes the code to always use {push,pop}_projected_linkage to modify a
dentry's linkage. It makes sure MDCache::create_subtree_map() records a
correct and up-to-date subtree map.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: send resolve messages after all MDS reach resolve stage
Yan, Zheng [Sat, 19 Jan 2013 01:49:04 +0000 (09:49 +0800)]
mds: send resolve messages after all MDS reach resolve stage

The current code sends resolve messages whenever the set of resolving
MDSes changes. There is no need to send resolve messages when some MDSes
leave the resolve stage. Sending messages while some MDSes are still
replaying is also not very useful.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: split reslove into two sub-stages
Yan, Zheng [Fri, 18 Jan 2013 11:41:48 +0000 (19:41 +0800)]
mds: split reslove into two sub-stages

The resolve stage serves to disambiguate the fate of uncommitted slave
updates and to resolve subtree authority. The MDS sends resolve messages
that claim subtree authority immediately when the resolve stage is
entered. When receiving a resolve message, the MDS also processes it
immediately. This may cause problems if there are uncommitted slave
renames and some of them need rollback later, because slave rename
rollback may modify the subtree map.

The fix is to split resolve into two sub-stages. The first sub-stage
serves to disambiguate slave updates and to do slave commit or rollback.
After the first sub-stage finishes, the MDS sends resolve messages that
claim subtree authority to other MDSes and processes received resolve
messages.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix slave rename rollback
Yan, Zheng [Sat, 19 Jan 2013 05:00:29 +0000 (13:00 +0800)]
mds: fix slave rename rollback

The main issue with the old slave rename rollback code is that it
assumes all affected objects are in the cache. That assumption does not
hold when the MDS does rollback in the resolve stage. This patch removes
the assumption and makes Server::do_rename_rollback() check each object
individually and roll back its changes.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: preserve non-auth/unlinked objects until slave commit
Yan, Zheng [Sat, 19 Jan 2013 04:57:31 +0000 (12:57 +0800)]
mds: preserve non-auth/unlinked objects until slave commit

The MDS should not trim objects in a non-auth subtree immediately after
replaying a slave rename, because the slave rename may require rollback
later and these objects are needed for rollback.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: don't journal non-auth rename source directory
Yan, Zheng [Sun, 20 Jan 2013 11:23:38 +0000 (19:23 +0800)]
mds: don't journal non-auth rename source directory

After replaying a slave rename, the non-auth directory that we renamed out
of will be trimmed, so there is no need to journal it.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: force journal straydn for rename if necessary
Yan, Zheng [Fri, 18 Jan 2013 06:08:45 +0000 (14:08 +0800)]
mds: force journal straydn for rename if necessary

A rename may overwrite an empty directory inode and move it into the stray
directory. An MDS that has an auth subtree beneath the overwritten directory
needs to journal the stray dentry when handling the rename slave request.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: splits rename force journal check into separate function
Yan, Zheng [Sat, 19 Jan 2013 11:03:01 +0000 (19:03 +0800)]
mds: splits rename force journal check into separate function

The function will be used by a later patch that fixes rename rollback.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix "had dentry linked to wrong inode" warning
Yan, Zheng [Fri, 18 Jan 2013 02:47:21 +0000 (10:47 +0800)]
mds: fix "had dentry linked to wrong inode" warning

The reason for the "had dentry linked to wrong inode" warning is that
Server::_rename_prepare() adds the destdir to the EMetaBlob before
adding the straydir. So when the MDS recovers, the destdir is replayed
first, and the old inode is directly replaced by the source inode.
We can avoid the warning by adding the straydir first.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: don't set xlocks on dentries done when early reply rename
Yan, Zheng [Sat, 19 Jan 2013 00:30:23 +0000 (08:30 +0800)]
mds: don't set xlocks on dentries done when early reply rename

_rename_finish() does not send dentry link/unlink message to replicas.
We should prevent dentries that are modified by the rename operation
from getting new replicas while the rename operation is committing.
So don't set xlocks on dentries "done".

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: properly set error_dentry for discover reply
Yan, Zheng [Sun, 6 Jan 2013 01:15:55 +0000 (09:15 +0800)]
mds: properly set error_dentry for discover reply

If MDCache::handle_discover() receives a 'discover path' request but
cannot find the base inode, it should properly set the 'error_dentry'
to make sure MDCache::handle_discover_reply() checks the correct object's
wait queue.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: introduce XSYN to SYNC lock state transition
Yan, Zheng [Fri, 4 Jan 2013 02:36:50 +0000 (10:36 +0800)]
mds: introduce XSYN to SYNC lock state transition

If a lock is in the XSYN state, Locker::simple_sync() first tries to
change the lock state to EXCL. If it fails to change the lock state to
EXCL, it just returns. So Locker::simple_sync() does not guarantee that
the lock state eventually changes to SYNC. This issue can cause a replica
that requests a read lock to hang. The fix is to introduce an intermediate
state for the XSYN to SYNC transition.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: allow journaling multiple root inodes in EMetaBlob
Yan, Zheng [Thu, 17 Jan 2013 07:29:21 +0000 (15:29 +0800)]
mds: allow journaling multiple root inodes in EMetaBlob

In some cases (rename, rmdir, subtree map), we may need to journal multiple
root inodes (/, mdsdir) in one EMetaBlob. This patch modifies the EMetaBlob
format to support journaling multiple root inodes.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: lock remote inode's primary dentry during rename
Yan, Zheng [Sat, 5 Jan 2013 02:07:11 +0000 (10:07 +0800)]
mds: lock remote inode's primary dentry during rename

Commit 1203cd2110 (mds: allow open_remote_ino() to open xlocked dentry)
makes Server::handle_client_rename() xlock the remote inodes' primary
dentries so that the witness MDS can open the xlocked dentry. But I added
the remote inodes' projected primary dentries to the xlock list. This is
wrong because projected dentries are invisible to path traversal.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: check deleted directory in Server::rdlock_path_xlock_dentry
Yan, Zheng [Thu, 17 Jan 2013 06:50:44 +0000 (14:50 +0800)]
mds: check deleted directory in Server::rdlock_path_xlock_dentry

Commit b03eab22e4 (mds: forbid creating file in deleted directory)
is not complete: mknod, mkdir and symlink were missed. Moving the check
into Server::rdlock_path_xlock_dentry() fixes the issue.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agomds: fix end check in Server::handle_client_readdir()
Yan, Zheng [Fri, 11 Jan 2013 07:46:59 +0000 (15:46 +0800)]
mds: fix end check in Server::handle_client_readdir()

Commit 1174dd3188 (don't retry readdir request after issuing caps)
introduced a bug that wrongly marks 'end' in the readdir reply.
The code that touches existing dentries re-uses an iterator, and that
iterator is also used for checking whether the readdir has reached the end.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agoconfigure: remove -m4_include(m4/acx_pthread.m4)
Danny Al-Gaaf [Wed, 23 Jan 2013 17:57:47 +0000 (18:57 +0100)]
configure: remove -m4_include(m4/acx_pthread.m4)

Since we already use AC_CONFIG_MACRO_DIR, there is no need to include
m4/acx_pthread.m4 separately.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoconfigure: fix RPM_RELEASE
Danny Al-Gaaf [Wed, 23 Jan 2013 17:57:46 +0000 (18:57 +0100)]
configure: fix RPM_RELEASE

Use git to get RPM_RELEASE only if this is a git repo
clone and if the git command is available on the system.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoosdmaptool: fix clitests
Sage Weil [Sun, 27 Jan 2013 04:49:47 +0000 (20:49 -0800)]
osdmaptool: fix clitests

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: dump/display pool min_size
Sage Weil [Sun, 27 Jan 2013 03:33:20 +0000 (19:33 -0800)]
osd: dump/display pool min_size

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agorbd-fuse: add simple RBD FUSE client
Dan Mick [Tue, 30 Oct 2012 21:02:53 +0000 (14:02 -0700)]
rbd-fuse: add simple RBD FUSE client

Currently written in C on the FUSE high-level interfaces, so error reporting
could be better.  No serious work done for performance.  But it's
usable as it stands.

Specify -c <conf> and a mountpoint, and images show up as files in
that mountpoint.  You can create new images; they'll be created
with attributes stored in xattrs:

user.rbdfuse.imagesize: default 1GB
user.rbdfuse.imageorder: default 22
user.rbdfuse.imagefeatures: default 1 (layering)

Images may be truncated or extended by rewriting.  Currently
once an image is opened, it's not closed, so it can't be deleted
or changed outside of the fuse path.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agorbd-fuse: Original code from Andreas Bluemle
Andreas Bluemle [Wed, 21 Nov 2012 07:25:48 +0000 (23:25 -0800)]
rbd-fuse: Original code from Andreas Bluemle

Signed-off-by: Andreas Bluemle <andreas.bluemle@itxperts.de>
12 years agos3/php: update to 1.5? version of API
Dan Mick [Sat, 26 Jan 2013 05:22:45 +0000 (21:22 -0800)]
s3/php: update to 1.5? version of API

Something like v1.5 of the Amazon PHP library requires the AmazonS3
constructor to be given an array of parameters rather than using
the globals.  More research needs to happen, and particularly
about the v2 API, but this might solve someone's problem with
v1.5 while we do that research.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoworkunit for iogen
tamil [Sat, 26 Jan 2013 01:59:38 +0000 (17:59 -0800)]
workunit for iogen

Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
12 years agoMerge branch 'wip-osd-msgr'
Sage Weil [Sat, 26 Jan 2013 01:59:19 +0000 (17:59 -0800)]
Merge branch 'wip-osd-msgr'

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agomon: Monitor: timecheck: only output report to dout once
Joao Eduardo Luis [Fri, 25 Jan 2013 02:48:07 +0000 (02:48 +0000)]
mon: Monitor: timecheck: only output report to dout once

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomon: Monitor: track timecheck round state and report on health
Joao Eduardo Luis [Wed, 23 Jan 2013 21:41:25 +0000 (21:41 +0000)]
mon: Monitor: track timecheck round state and report on health

Fixes: #3854
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agodoc: Added new, more comprehensive OSD/PG monitoring doc.
John Wilkins [Sat, 26 Jan 2013 00:16:28 +0000 (16:16 -0800)]
doc: Added new, more comprehensive OSD/PG monitoring doc.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Trimmed some detail and added a x-ref to detailed osd/pg monitoring doc.
John Wilkins [Sat, 26 Jan 2013 00:15:52 +0000 (16:15 -0800)]
doc: Trimmed some detail and added a x-ref to detailed osd/pg monitoring doc.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added osd/pg monitoring section to the index.
John Wilkins [Sat, 26 Jan 2013 00:14:38 +0000 (16:14 -0800)]
doc: Added osd/pg monitoring section to the index.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added x-ref links.
John Wilkins [Sat, 26 Jan 2013 00:14:12 +0000 (16:14 -0800)]
doc: Added x-ref links.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge branch 'master' of https://github.com/ceph/ceph
John Wilkins [Fri, 25 Jan 2013 22:25:06 +0000 (14:25 -0800)]
Merge branch 'master' of https://github.com/ceph/ceph

12 years agodoc: fixed description for pg in control section.
John Wilkins [Fri, 25 Jan 2013 22:24:37 +0000 (14:24 -0800)]
doc: fixed description for pg in control section.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: wider sidebar, larger font, cleaned tip CSS
Ross Turk [Fri, 25 Jan 2013 20:48:31 +0000 (12:48 -0800)]
doc: wider sidebar, larger font, cleaned tip CSS

The sidebar is now about a hundred pixels wider and the fonts
are larger throughout.  This works a lot better when you get
deep into the doc structure - it used to wrap horribly.

I also fixed how literals look inside .tip and .important.

Signed-off-by: Ross Turk <ross@inktank.com>
12 years agosharedptr_registry: remove extaneous Mutex::Locker declaration
Samuel Just [Fri, 25 Jan 2013 19:31:29 +0000 (11:31 -0800)]
sharedptr_registry: remove extaneous Mutex::Locker declaration

For some reason, the lookup() retry loop (for when we happened to
race with a removal and grabbed an invalid WeakPtr) locked
the lock again.  This causes the #3836 crash since the lock
is already locked.  It's rare since it requires a lookup between
invalidation of the WeakPtr and removal of the WeakPtr entry.

Fixes: #3836
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
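Illustration of the pattern described in the message above, as a minimal
Python sketch rather than the actual sharedptr_registry code. The names
WeakRegistry and Value are hypothetical. The point it tries to show: with a
non-reentrant lock, the retry loop has to run under the lock that lookup()
already holds instead of acquiring it a second time.

```python
import threading
import weakref


class Value:
    """Hypothetical payload type (plain ints cannot be weakly referenced)."""

    def __init__(self, payload):
        self.payload = payload


class WeakRegistry:
    """Hypothetical registry of weak references guarded by one plain lock."""

    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant, like a plain mutex
        self._entries = {}             # key -> weakref.ref(Value)

    def insert(self, key, value):
        with self._lock:
            self._entries[key] = weakref.ref(value)
        return value

    def lookup(self, key):
        with self._lock:  # acquired exactly once for the whole retry loop
            while True:
                ref = self._entries.get(key)
                if ref is None:
                    return None
                value = ref()
                if value is not None:
                    return value
                # Stale entry: the object is gone but its entry is still
                # present. Drop it and retry, still under the same lock.
                del self._entries[key]
```

Re-acquiring `self._lock` inside the loop would block forever here, because
`threading.Lock` cannot be taken twice by the same thread.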
12 years agodoc: Added Subdomain section.
John Wilkins [Fri, 25 Jan 2013 18:54:07 +0000 (10:54 -0800)]
doc: Added Subdomain section.

fixes: #3778

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoosd/PG: include map epoch in query results
Sage Weil [Fri, 25 Jan 2013 17:40:07 +0000 (09:40 -0800)]
osd/PG: include map epoch in query results

Currently you can only infer it from the info.history.* fields.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: kill unused addr-based send_map()
Sage Weil [Fri, 25 Jan 2013 17:30:00 +0000 (09:30 -0800)]
osd: kill unused addr-based send_map()

Not used, old API, bad.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: share incoming maps via Connection*, not addrs
Sage Weil [Fri, 25 Jan 2013 17:29:37 +0000 (09:29 -0800)]
osd: share incoming maps via Connection*, not addrs

Kill a set of parallel methods that are using the old addr/inst-based
msgr APIs, and instead use Connection handles.  This is much safer and gets
us closer to killing the old msgr API.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: pass new maps to dead osds via existing Connection
Sage Weil [Fri, 25 Jan 2013 17:27:00 +0000 (09:27 -0800)]
osd: pass new maps to dead osds via existing Connection

Previously we were sending these maps to dead osds via their old addrs
using a new outgoing connection and setting the flags so that the msgr
would clean up.  That mechanism is possibly buggy and fragile, and we can
avoid it entirely if we just reuse the existing heartbeat Connection.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: requeue osdmaps on heartbeat connections for cluster connection
Sage Weil [Fri, 25 Jan 2013 17:25:28 +0000 (09:25 -0800)]
osd: requeue osdmaps on heartbeat connections for cluster connection

If we receive an OSDMap on the cluster connection, requeue it for the
cluster messenger, and process it there where we normally do.  This avoids
any concerns about locking and ordering rules.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomsgr: add get_loopback_connection() method
Sage Weil [Fri, 25 Jan 2013 17:23:23 +0000 (09:23 -0800)]
msgr: add get_loopback_connection() method

Return the Connection* for ourselves, so we can queue messages for
ourselves.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocommon: fix cli tests on usage
Sage Weil [Fri, 25 Jan 2013 05:48:26 +0000 (21:48 -0800)]
common: fix cli tests on usage

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoRevert "filestore: disable extra committing queue allowance"
Sage Weil [Thu, 24 Jan 2013 06:16:50 +0000 (22:16 -0800)]
Revert "filestore: disable extra committing queue allowance"

This reverts commit 44dca5c8c5058acf9bc391303dc77893793ce0be.

The allowance is now only added for btrfs as of commit
e639254a0c5f8e3528fa8f2b2b451296653556bc, which makes us happy
for both non-btrfs (lower latency) and btrfs (better small io
throughput, no big stall during commit).

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoos/FileStore: only adjust up op queue for btrfs
Sage Weil [Thu, 24 Jan 2013 06:16:49 +0000 (22:16 -0800)]
os/FileStore: only adjust up op queue for btrfs

We only need to adjust up the op queue limits during commit for btrfs,
because the snapshot initiation (async create) is currently
high-latency and the op queue is quiesced during that period.

This lets us revert 44dca5c, which disabled the extra allowance because
it is generally bad for non-btrfs writeahead mode.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoadminops.rst: revert changes for as-yet-unimplemented features
Dan Mick [Fri, 25 Jan 2013 04:52:35 +0000 (20:52 -0800)]
adminops.rst: revert changes for as-yet-unimplemented features

See wip-admin-api for the new specification

Fixes: #3724
Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agorados: remove unused "check_stdio" parameter
Dan Mick [Thu, 24 Jan 2013 21:38:25 +0000 (13:38 -0800)]
rados: remove unused "check_stdio" parameter

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agorados: obey op_size for 'get'
Sage Weil [Thu, 24 Jan 2013 05:31:11 +0000 (21:31 -0800)]
rados: obey op_size for 'get'

Otherwise we try to read the whole object in one go, which doesn't bode
well for large objects (either non-optimal or simply broken).

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
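Illustration of the chunked read described above, as a minimal Python
sketch. It is not the rados tool's code: read_in_chunks and the read_at
callback are hypothetical names, and the only assumption is that a read
shorter than op_size signals the end of the object.

```python
def read_in_chunks(read_at, op_size):
    """Yield an object's data in pieces of at most op_size bytes.

    read_at(offset, length) is a hypothetical callback that returns up to
    `length` bytes starting at `offset`; a short read means end of object.
    """
    if op_size <= 0:
        raise ValueError("op_size must be positive")
    offset = 0
    while True:
        chunk = read_at(offset, op_size)
        if chunk:
            yield chunk
        offset += len(chunk)
        if len(chunk) < op_size:
            break  # short (or empty) read: nothing left to fetch
```

Each request is bounded by op_size, so a large object is never pulled into
memory in a single read.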
12 years agoFileStore: ping TPHandle after each operation in _do_transactions
Samuel Just [Thu, 24 Jan 2013 20:02:09 +0000 (12:02 -0800)]
FileStore: ping TPHandle after each operation in _do_transactions

Each completed operation in the transaction proves thread
liveness; a stuck thread should still trigger the timeouts.

Fixes: #3928
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
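Illustration of the idea in the message above, as a minimal Python sketch
and not the TPHandle implementation. TimeoutHandle, run_ops and the
injectable clock are hypothetical. Each finished operation pushes the
deadline out, so a long transaction stays healthy while a single stuck
operation still runs past its grace period.

```python
import time


class TimeoutHandle:
    """Hypothetical liveness handle: healthy until `grace` seconds pass
    without a ping."""

    def __init__(self, grace, clock=time.monotonic):
        self._grace = grace
        self._clock = clock
        self._deadline = clock() + grace

    def ping(self):
        # Called after each completed unit of work to prove liveness.
        self._deadline = self._clock() + self._grace

    def is_healthy(self):
        return self._clock() <= self._deadline


def run_ops(ops, handle):
    """Run each operation in order, pinging the handle after each one."""
    for op in ops:
        op()
        handle.ping()
```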
12 years agoOSD: use TPHandle in peering_wq
Samuel Just [Thu, 24 Jan 2013 19:07:37 +0000 (11:07 -0800)]
OSD: use TPHandle in peering_wq

Implement a _process overload with a TPHandle argument and use
that to ping the hb map between pgs and between map epochs
when advancing a pg.  The thread will still time out if
genuinely stuck at any point.

Fixes: #3905
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoWorkQueue: add TPHandle to allow _process to ping the hb map
Samuel Just [Thu, 24 Jan 2013 19:04:04 +0000 (11:04 -0800)]
WorkQueue: add TPHandle to allow _process to ping the hb map

Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agolibcephfs-java test: use provided environment
Sage Weil [Thu, 24 Jan 2013 23:13:37 +0000 (15:13 -0800)]
libcephfs-java test: use provided environment

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocommon: only show -d, -f options for daemons
Sage Weil [Thu, 24 Jan 2013 21:29:03 +0000 (13:29 -0800)]
common: only show -d, -f options for daemons

Fixes: #3073
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc: Syntax fixes.
John Wilkins [Thu, 24 Jan 2013 21:13:03 +0000 (13:13 -0800)]
doc: Syntax fixes.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Updated usage for Bobtail.
John Wilkins [Thu, 24 Jan 2013 20:58:29 +0000 (12:58 -0800)]
doc: Updated usage for Bobtail.

fixes: #3831

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Updated usage for Bobtail.
John Wilkins [Thu, 24 Jan 2013 20:57:14 +0000 (12:57 -0800)]
doc: Updated usage for Bobtail.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge branch 'master' of https://github.com/ceph/ceph
John Wilkins [Thu, 24 Jan 2013 20:47:58 +0000 (12:47 -0800)]
Merge branch 'master' of https://github.com/ceph/ceph

12 years agodoc: Added example of ext4 user_xattr mount option.
John Wilkins [Thu, 24 Jan 2013 20:46:49 +0000 (12:46 -0800)]
doc: Added example of ext4 user_xattr mount option.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agorgw_rest: Make fallback uri configurable.
caleb miles [Mon, 14 Jan 2013 17:16:12 +0000 (12:16 -0500)]
rgw_rest: Make fallback uri configurable.

Some HTTP servers, notably lighttpd, do not set SCRIPT_URI, so make the
fallback string configurable.

Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agocommon/HeartbeatMap: fix uninitialized variable
Sage Weil [Thu, 24 Jan 2013 18:52:46 +0000 (10:52 -0800)]
common/HeartbeatMap: fix uninitialized variable

Introduced by me in 132045ce085e8584a3e177af552ee7a5205b13d8.  Thank you,
valgrind!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agolibcephfs-java test: jar files are in /usr/local/share/java, it seems
Sage Weil [Thu, 24 Jan 2013 18:41:34 +0000 (10:41 -0800)]
libcephfs-java test: jar files are in /usr/local/share/java, it seems

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agowireshark: fix indention
Danny Al-Gaaf [Thu, 24 Jan 2013 17:21:21 +0000 (18:21 +0100)]
wireshark: fix indention

Fix indention.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agowireshark: fix guint64 print format handling
Danny Al-Gaaf [Thu, 24 Jan 2013 17:21:20 +0000 (18:21 +0100)]
wireshark: fix guint64 print format handling

Use G_GUINT64_FORMAT to handle print format of guint64 correctly.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoPendingReleaseNotes: pool removal cli changes
Sage Weil [Thu, 24 Jan 2013 02:50:57 +0000 (18:50 -0800)]
PendingReleaseNotes: pool removal cli changes

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip-rm-pool'
Sage Weil [Thu, 24 Jan 2013 02:49:05 +0000 (18:49 -0800)]
Merge remote-tracking branch 'gh/wip-rm-pool'

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip-3832-oc-flushrange'
Sage Weil [Thu, 24 Jan 2013 02:47:25 +0000 (18:47 -0800)]
Merge remote-tracking branch 'gh/wip-3832-oc-flushrange'

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'wip-osd-hb'
Sage Weil [Thu, 24 Jan 2013 02:40:49 +0000 (18:40 -0800)]
Merge branch 'wip-osd-hb'

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge remote-tracking branch 'upstream/wip_push_after_complete'
Samuel Just [Thu, 24 Jan 2013 00:55:33 +0000 (16:55 -0800)]
Merge remote-tracking branch 'upstream/wip_push_after_complete'

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoReplicatedPG: handle omap > max_recovery_chunk
Samuel Just [Wed, 23 Jan 2013 20:49:04 +0000 (12:49 -0800)]
ReplicatedPG: handle omap > max_recovery_chunk

span_of fails if len == 0.

Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: correctly handle omap key larger than max chunk
Samuel Just [Wed, 23 Jan 2013 20:18:31 +0000 (12:18 -0800)]
ReplicatedPG: correctly handle omap key larger than max chunk

Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: start scanning omap at omap_recovered_to
Samuel Just [Wed, 23 Jan 2013 20:15:10 +0000 (12:15 -0800)]
ReplicatedPG: start scanning omap at omap_recovered_to

Previously, we started scanning omap after omap_recovered_to.
This is a problem since the break in the loop implies that
omap_recovered_to is the first key not recovered.

Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
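Illustration of the off-by-one described above, as a minimal Python sketch
over a sorted key list rather than the ReplicatedPG code. recover_chunk is
a hypothetical helper. Because recovered_to names the first key that has
not been recovered, the scan must start at that key (bisect_left), not
after it (bisect_right).

```python
import bisect


def recover_chunk(sorted_keys, recovered_to, max_keys):
    """Return (chunk, next_recovered_to) for one recovery pass.

    recovered_to is the first key NOT yet recovered, so scanning starts
    at it. next_recovered_to is None once everything has been recovered.
    """
    start = bisect.bisect_left(sorted_keys, recovered_to)
    end = start + max_keys
    chunk = sorted_keys[start:end]
    next_recovered_to = sorted_keys[end] if end < len(sorted_keys) else None
    return chunk, next_recovered_to
```

Starting with `bisect.bisect_right` instead would skip the key equal to
recovered_to on every pass after the first.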
12 years agoReplicatedPG: don't finish_recovery_op until the transaction completes
Samuel Just [Wed, 23 Jan 2013 19:50:13 +0000 (11:50 -0800)]
ReplicatedPG: don't finish_recovery_op until the transaction completes

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: ack push only after transaction has completed
Samuel Just [Wed, 23 Jan 2013 19:35:47 +0000 (11:35 -0800)]
ReplicatedPG: ack push only after transaction has completed

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoObjectStore: add queue_transactions with oncomplete
Samuel Just [Wed, 23 Jan 2013 19:13:28 +0000 (11:13 -0800)]
ObjectStore: add queue_transactions with oncomplete

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agorados: safety interlock on 'rmpool' command
Sage Weil [Wed, 23 Jan 2013 16:49:06 +0000 (08:49 -0800)]
rados: safety interlock on 'rmpool' command

This is a very easy way for a user to do a lot of damage with no way back.
Make sure they mean it.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: implement safety interlock for deleting pools
Sage Weil [Wed, 23 Jan 2013 16:40:13 +0000 (08:40 -0800)]
mon: implement safety interlock for deleting pools

This is a very easy way for users to accidentally do a *lot* of damage.
Make it an annoying manual process to actually do this.

Signed-off-by: Sage Weil <sage@inktank.com>
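Illustration of what such an interlock can look like, as a minimal Python
sketch. The commit message above does not spell out the interface, so the
shape used here (pool name given twice plus an explicit confirmation flag)
and the names may_delete_pool and REQUIRED_FLAG are assumptions for
illustration only.

```python
# Assumed confirmation flag; not taken from the commit message above.
REQUIRED_FLAG = "--yes-i-really-really-mean-it"


def may_delete_pool(pool, pool_again, flag):
    """Allow deletion only if the name is repeated and the flag is given."""
    if not pool or pool != pool_again:
        return False  # name must be typed twice and match exactly
    return flag == REQUIRED_FLAG
```

The check is deliberately tedious: a typo or a missing flag refuses the
request instead of guessing what the user meant.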
12 years agocommon/HeartbeatMap: inject unhealthy heartbeat for N seconds
Sage Weil [Wed, 23 Jan 2013 05:18:45 +0000 (21:18 -0800)]
common/HeartbeatMap: inject unhealthy heartbeat for N seconds

This lets us test code that is triggered by an unhealthy heartbeat in a
generic way.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoos/FileStore: add stall injection into filestore op queue
Sage Weil [Wed, 23 Jan 2013 02:08:22 +0000 (18:08 -0800)]
os/FileStore: add stall injection into filestore op queue

Allow the admin to artificially induce a stall in the op queue.  Forces the
thread(s) to sleep for N seconds.  We pause in 1-second increments and
recheck the value so that a previously stalled thread can be unwedged by
reinjecting a lower value (or 0).  To stall indefinitely, just inject a
very large number.

Signed-off-by: Sage Weil <sage@inktank.com>
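Illustration of the sleep-and-recheck loop described above, as a minimal
Python sketch and not the FileStore code. injected_stall and the
get_stall_seconds callback are hypothetical; the sleep function is
injectable only so the loop can be exercised without waiting.

```python
import time


def injected_stall(get_stall_seconds, sleep=time.sleep):
    """Sleep in 1-second steps while the injected stall value allows it.

    get_stall_seconds() is re-read before every step, so lowering the
    value (or setting it to 0) releases an already stalled thread.
    Returns the number of seconds actually slept.
    """
    slept = 0
    while slept < get_stall_seconds():
        sleep(1)
        slept += 1
    return slept
```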
12 years agoosd: do not join cluster if not healthy
Sage Weil [Wed, 23 Jan 2013 02:03:10 +0000 (18:03 -0800)]
osd: do not join cluster if not healthy

If our internal heartbeats are failing, do not send a boot message and try
to join the cluster.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: hold lock while calling start_boot on startup
Sage Weil [Wed, 23 Jan 2013 02:01:07 +0000 (18:01 -0800)]
osd: hold lock while calling start_boot on startup

This probably doesn't strictly matter because start_boot doesn't need the
lock (currently) and few other threads should be running, but it is
better to be consistent.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: do not reply to ping if internal heartbeat is not healthy
Sage Weil [Wed, 23 Jan 2013 01:56:32 +0000 (17:56 -0800)]
osd: do not reply to ping if internal heartbeat is not healthy

If we find that our internal threads are stalled, do not reply to ping
requests.  If we do this long enough, peers will mark us down.  If we are
only transiently unhealthy, we will reply to the next ping and they will
be satisfied.  If we are unhealthy and marked down, and eventually recover,
we will mark ourselves back up.

Signed-off-by: Sage Weil <sage@inktank.com>
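Illustration of the behaviour described above, as a minimal Python sketch
and not the OSD code. handle_ping, peer_considers_down and the message
layout are hypothetical. An unhealthy daemon stays silent, and a peer
only gives up after a full grace period without replies, so a transient
stall is forgiven by the next answered ping.

```python
def handle_ping(internally_healthy, stamp):
    """Return a reply for a ping, or None to stay silent when unhealthy."""
    if not internally_healthy:
        return None
    return {"type": "ping_reply", "stamp": stamp}


def peer_considers_down(last_reply_at, now, grace):
    """A peer marks us down once no reply has arrived for `grace` seconds."""
    return (now - last_reply_at) > grace
```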
12 years agoosd: reduce op thread heartbeat default 30 -> 15 seconds
Sage Weil [Wed, 23 Jan 2013 01:53:40 +0000 (17:53 -0800)]
osd: reduce op thread heartbeat default 30 -> 15 seconds

If the thread stalls for 15 seconds, let our internal heartbeat fail.
This will let us internally respond more quickly to a stalled or failing
disk.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #35 from cholcombe973/master
Yehuda Sadeh [Wed, 23 Jan 2013 00:54:39 +0000 (16:54 -0800)]
Merge pull request #35 from cholcombe973/master

Making the usage details a little better.

12 years agoMerge remote-tracking branch 'gh/wip-3833-b'
Sage Weil [Wed, 23 Jan 2013 00:13:14 +0000 (16:13 -0800)]
Merge remote-tracking branch 'gh/wip-3833-b'

Conflicts:
src/osd/OSD.cc
src/osd/OSD.h

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoUpdate src/rgw/rgw_admin.cc 35/head
cholcombe973 [Wed, 23 Jan 2013 00:07:27 +0000 (19:07 -0500)]
Update src/rgw/rgw_admin.cc

Improved the usage message.

12 years agoMerge branch 'wip-3651'
David Zafman [Tue, 22 Jan 2013 23:58:44 +0000 (15:58 -0800)]
Merge branch 'wip-3651'

12 years agoosd: debug support for omap deep-scrub
David Zafman [Tue, 15 Jan 2013 00:37:09 +0000 (16:37 -0800)]
osd: debug support for omap deep-scrub

Deep-scrub test support through admin socket

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoosd: Add digest of omap for deep-scrub
David Zafman [Wed, 9 Jan 2013 03:24:13 +0000 (19:24 -0800)]
osd: Add digest of omap for deep-scrub

Add ScrubMap encode/decode v4 message with omap digest.
Compute digest of header and key/value.  Use bufferlist
to reflect structure and compute as we go, clearing
bufferlist to reduce memory usage.

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
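Illustration of computing a digest incrementally, as a minimal Python
sketch and not the OSD code. omap_digest and encode_pair are hypothetical,
zlib.crc32 stands in for whatever checksum the OSD uses, and the
length-prefixed encoding is an assumption. Only one key/value pair is
buffered at a time, which is the memory-saving idea in the message above.

```python
import zlib


def encode_pair(key, value):
    """Hypothetical length-prefixed encoding of one key/value pair."""
    return (len(key).to_bytes(4, "little") + key
            + len(value).to_bytes(4, "little") + value)


def omap_digest(header, items):
    """Fold the header and each key/value pair into a running CRC.

    `items` may be any iterable of (key, value) byte strings; each pair
    is encoded, folded into the checksum and then discarded.
    """
    crc = zlib.crc32(header)
    for key, value in items:
        buf = encode_pair(key, value)
        crc = zlib.crc32(buf, crc)  # continue from the running value
        del buf                     # nothing accumulates between pairs
    return crc
```

The result matches a checksum over the whole concatenated encoding, so two
replicas can compare one number instead of shipping every key and value.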
12 years agoosd: Add missing unregister_command() in OSD::shutdown()
David Zafman [Fri, 18 Jan 2013 17:31:00 +0000 (09:31 -0800)]
osd: Add missing unregister_command() in OSD::shutdown()

Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoconfig: helper to identify internal fields we should be quiet about
Sage Weil [Tue, 22 Jan 2013 22:59:30 +0000 (14:59 -0800)]
config: helper to identify internal fields we should be quiet about

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocommon/Throttle: fix modeline, whitespace
Sage Weil [Tue, 22 Jan 2013 22:56:36 +0000 (14:56 -0800)]
common/Throttle: fix modeline, whitespace

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc: Modified usage for upgrade.
John Wilkins [Tue, 22 Jan 2013 22:55:19 +0000 (14:55 -0800)]
doc: Modified usage for upgrade.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoosd: improve sub_op flag points
Sage Weil [Tue, 22 Jan 2013 05:02:01 +0000 (21:02 -0800)]
osd: improve sub_op flag points

Signed-off-by: Sage Weil <sage@inktank.com>