git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
14 years ago mds: rip out rename linkmerge support
Sage Weil [Thu, 3 Mar 2011 00:13:54 +0000 (16:13 -0800)]
mds: rip out rename linkmerge support

It turns out POSIX says rename(a,b) is a no-op when a and b link to the
same inode.  This is super weird but good news because it means we can
rip out a bunch of poorly tested code.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
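
The POSIX behavior this commit relies on can be checked outside Ceph. What follows is a minimal sketch using only the Python standard library on a local POSIX filesystem that supports hard links; the helper name rename_between_hard_links is made up for illustration and nothing here touches the MDS code.

```python
import os
import tempfile

def rename_between_hard_links():
    """Report what rename(a, b) does when a and b are links to one inode."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        with open(a, "w") as f:
            f.write("data")
        os.link(a, b)  # b becomes a second name for the same inode
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        os.rename(a, b)  # expected to succeed without changing anything
        return same_inode, os.path.exists(a), os.path.exists(b)

same_inode, a_exists, b_exists = rename_between_hard_links()
print(same_inode, a_exists, b_exists)
```

If both names still exist after the call, the rename was a no-op, which is the case the removed linkmerge code tried to handle.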
14 years ago dout: Reopen dout after parsing all config opts
Colin Patrick McCabe [Wed, 2 Mar 2011 15:28:10 +0000 (07:28 -0800)]
dout: Reopen dout after parsing all config opts

Reopen the dout stream only after we parse all configuration options.
Specifying --log-file on the command line now works as expected.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago dout: remove g_conf.log_to_file
Colin Patrick McCabe [Wed, 2 Mar 2011 15:12:03 +0000 (07:12 -0800)]
dout: remove g_conf.log_to_file

Remove the log_to_file configuration option. Instead, only log to a file
if either log_file or log_dir is set.

This way, command-line options like --log-file=/tmp/foo work as
expected.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago logging: default to foreground logging
Colin Patrick McCabe [Wed, 2 Mar 2011 12:22:20 +0000 (04:22 -0800)]
logging: default to foreground logging

At global constructor time: default to logging everything to stderr.

During common_init: set appropriate logging defaults based on the type
of program (daemon or other).

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago cmds/cosd: Fix IsHeapProfilerRunning implicit return type cast.
Alexandre Oliva [Wed, 2 Mar 2011 21:39:09 +0000 (13:39 -0800)]
cmds/cosd: Fix IsHeapProfilerRunning implicit return type cast.

G++ complains about the difference between the return type of tcmalloc's
IsHeapProfilerRunning (int) and the return type of the function that
g_conf.profiler_running is supposed to point to (bool). We could
probably get away with a type-cast, but as a compiler developer and
former C++ language lawyer, I'd rather not take the risk of destroying
the universe by invoking undefined behavior ;-)

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years ago mds: drop some dead code
Sage Weil [Wed, 2 Mar 2011 17:50:44 +0000 (09:50 -0800)]
mds: drop some dead code

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: fix one rename dentry linkage projection case
Sage Weil [Wed, 2 Mar 2011 17:41:20 +0000 (09:41 -0800)]
mds: fix one rename dentry linkage projection case

There are more.  :(

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago osd: trigger discover_all_missing after replay delay
Sage Weil [Tue, 1 Mar 2011 00:05:08 +0000 (16:05 -0800)]
osd: trigger discover_all_missing after replay delay

We were calling discover_all_missing only when we went immediately active,
not after we were in the replay state (which triggers from a timer event
that calls OSD::activate_pg()).  Move the call into PG::activate() so that
we catch both callers.

This requires passing in a query_map from the caller.  While we're at it,
clean up some other instances where we are defining a new query_map
deep within the call tree.

Fixes: #847 (I hope)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago osd: handle osd_ping (and ack requests) while !active
Sage Weil [Mon, 28 Feb 2011 22:15:14 +0000 (14:15 -0800)]
osd: handle osd_ping (and ack requests) while !active

In particular, we may start getting ping requests before getting (or while
processing) our first map that makes us go active.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago Revert "move g_default_file_layout into ceph_fs.cc"
Sage Weil [Mon, 28 Feb 2011 20:57:56 +0000 (12:57 -0800)]
Revert "move g_default_file_layout into ceph_fs.cc"

This reverts commit 1dc12e3e1de1ee6aeb3ef11bb3faafa4757b1a65.

The headers and ceph_fs.cc are written such that they can be shared
verbatim between the kernel and userspace code.  Omitting the headers
was deliberate, because they differ depending on the build environment.

The default file layout seems fine in config.cc, since it is declared
in config.h, and is a bunch of tunables we generally try to keep in
config.cc.

14 years ago cconf: fix clitest
Colin Patrick McCabe [Mon, 28 Feb 2011 11:51:01 +0000 (03:51 -0800)]
cconf: fix clitest

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago unittests: remember to use AM_LDFLAGS
Colin Patrick McCabe [Mon, 28 Feb 2011 11:00:38 +0000 (03:00 -0800)]
unittests: remember to use AM_LDFLAGS

Remember to use AM_LDFLAGS when setting _LDFLAGS. Otherwise, the global
flags will be lost.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago Merge branch 'librados_api' into next
Colin Patrick McCabe [Mon, 28 Feb 2011 09:57:58 +0000 (01:57 -0800)]
Merge branch 'librados_api' into next

14 years ago Rename PoolHandle to IoContext: part 2
Colin Patrick McCabe [Sat, 26 Feb 2011 00:38:33 +0000 (16:38 -0800)]
Rename PoolHandle to IoContext: part 2

The previous change changed all PoolHandle uses to IoContext. This
change also renames the variables.

Also fix a few API functions whose names weren't quite right after the
previous change. rados_pool_list really does just list pools; it has
nothing to do with ioctxes.

rados_ioctx_change_auid should be rados_ioctx_pool_set_auid. Although it
takes an ioctx as an argument, it operates on the pool.

rados_ioctx_close should just return void. APIs where the close
operation can fail are broken. What is the user supposed to do if
closing doesn't work?

Also, fix a few test programs that got overlooked earlier.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago testlibrbd: call rados_connect
Josh Durgin [Sat, 26 Feb 2011 02:04:21 +0000 (18:04 -0800)]
testlibrbd: call rados_connect

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago FileStore.h: reorder queue operations in _journaled_ahead
Samuel Just [Thu, 24 Feb 2011 20:31:58 +0000 (12:31 -0800)]
FileStore.h: reorder queue operations in _journaled_ahead

In writeahead mode, an op could disappear from jq without immediately
reappearing in q.  Thus, q can be empty before seq is requeued and
finished.  _journaled_ahead will now enqueue the op in q before removing
from jq.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
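
The ordering issue described in this commit can be illustrated with a toy model. This is only a sketch under assumptions: ToySequencer, looks_idle and idle_seen_mid_move are hypothetical names, the two deques stand in for jq and q, and the real FileStore code involves locking and sequence numbers that are not modeled here.

```python
from collections import deque

class ToySequencer:
    """Toy stand-in with a journal queue (jq) and an apply queue (q)."""

    def __init__(self, ops):
        self.jq = deque(ops)  # ops waiting on the journal
        self.q = deque()      # ops waiting to be applied

    def looks_idle(self):
        # What a flush-style check would conclude if it ran right now.
        return not self.jq and not self.q

    def journaled_ahead(self, enqueue_first, observe):
        op = self.jq[0]
        if enqueue_first:
            self.q.append(op)   # op becomes visible in q ...
            observe(self)       # ... before it leaves jq
            self.jq.popleft()
        else:
            self.jq.popleft()   # op has left jq ...
            observe(self)       # ... but is not yet in q
            self.q.append(op)

def idle_seen_mid_move(enqueue_first):
    """Return what an observer sees halfway through moving one op."""
    seen = []
    seq = ToySequencer(["op1"])
    seq.journaled_ahead(enqueue_first, lambda s: seen.append(s.looks_idle()))
    return seen[0]

print(idle_seen_mid_move(enqueue_first=False))
print(idle_seen_mid_move(enqueue_first=True))
```

With the dequeue-first order an observer can find both queues empty while an op is still in flight; with the enqueue-first order the op is always in at least one queue.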
14 years ago Revert "FileStore: fix OpSequencer::flush error"
Samuel Just [Thu, 24 Feb 2011 20:42:16 +0000 (12:42 -0800)]
Revert "FileStore: fix OpSequencer::flush error"

This reverts commit c78b29a47d7211a4b8b1585112ac22b8435a82c7.

This commit introduced an error in parallel journaling mode.
OpSequencer::flush is only meant to ensure that the ops have become
readable, not necessarily journalled.

14 years ago librados: Rename rados_pool_t to rados_ioctx_t
Colin Patrick McCabe [Fri, 25 Feb 2011 18:23:12 +0000 (10:23 -0800)]
librados: Rename rados_pool_t to rados_ioctx_t

rados_pool_t -> rados_ioctx_t

class PoolCtx -> class IoCtxImpl

class PoolHandle -> class IoCtx

PoolHandle::name() -> IoCtx::get_pool_name()

Replace rados_pool_destroy, PoolHandle::destroy with rados_pool_delete
and Rados::pool_delete.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago testradospp: update for new librados API
Josh Durgin [Sat, 26 Feb 2011 00:26:51 +0000 (16:26 -0800)]
testradospp: update for new librados API

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago testlibrbdpp: convert to new APIs
Josh Durgin [Sat, 26 Feb 2011 00:05:00 +0000 (16:05 -0800)]
testlibrbdpp: convert to new APIs

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago test_common.sh: should rm objects before adding
Colin Patrick McCabe [Fri, 25 Feb 2011 17:08:00 +0000 (09:08 -0800)]
test_common.sh: should rm objects before adding

rados_write doesn't replace the whole object, but that's what we want in
these old tests. So just rm it first.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
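
The difference between writing at an offset and replacing a whole object has a close analogue in local file I/O. The sketch below uses os.pwrite on a temporary file as a stand-in; it is an analogy only and does not call librados, and the helper name overwrite_prefix is made up for illustration.

```python
import os
import tempfile

def overwrite_prefix(original: bytes, new: bytes) -> bytes:
    """Write `new` at offset 0 over `original` and return the file contents."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, original)
        os.pwrite(fd, new, 0)  # positional write: bytes past len(new) are kept
        return os.pread(fd, len(original) + len(new), 0)
    finally:
        os.close(fd)
        os.unlink(path)

print(overwrite_prefix(b"old-contents", b"new"))
```

The tail of the earlier data survives a short positional write, which is why removing the object first gives the old tests the full replacement they expect.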
14 years ago rbd: de-globalize rbd, rados, Image
Colin Patrick McCabe [Fri, 25 Feb 2011 16:27:29 +0000 (08:27 -0800)]
rbd: de-globalize rbd, rados, Image

Use RAII for rbd, rados, and Image. Their destructors will be called
when main exits, thus doing the cleanup for us. Use auto_ptr for Image.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago librbd, librados: fix my last commits to use the new librados API
Josh Durgin [Fri, 25 Feb 2011 23:45:49 +0000 (15:45 -0800)]
librbd, librados: fix my last commits to use the new librados API

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago librados: add snap_get_stamp to C API
Josh Durgin [Fri, 25 Feb 2011 23:28:30 +0000 (15:28 -0800)]
librados: add snap_get_stamp to C API

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago testlibrbdpp: initialize pointers
Josh Durgin [Fri, 25 Feb 2011 22:19:51 +0000 (14:19 -0800)]
testlibrbdpp: initialize pointers

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago librados, librbd: remove selfmanaged_snap_rollback_object
Josh Durgin [Fri, 25 Feb 2011 22:00:05 +0000 (14:00 -0800)]
librados, librbd: remove selfmanaged_snap_rollback_object

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago Merge branch 'librados_api_cpp' into librados_api
Colin Patrick McCabe [Fri, 25 Feb 2011 16:15:11 +0000 (08:15 -0800)]
Merge branch 'librados_api_cpp' into librados_api

Conflicts:
src/include/rbd/librbd.hpp
src/librbd.cc
src/rbd.cc

14 years ago testlibrbdpp: use new librbd api
Josh Durgin [Fri, 25 Feb 2011 21:39:35 +0000 (13:39 -0800)]
testlibrbdpp: use new librbd api

14 years ago rbd: update for librbd api changes
Josh Durgin [Fri, 25 Feb 2011 18:57:27 +0000 (10:57 -0800)]
rbd: update for librbd api changes

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago librbd: tweak C++ API
Josh Durgin [Fri, 25 Feb 2011 18:54:30 +0000 (10:54 -0800)]
librbd: tweak C++ API

- rename image_open to open and make it return an int
- remove Image::close, replace with destructor
- make Image constructor private

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago librados: C++ API rework
Colin Patrick McCabe [Thu, 24 Feb 2011 16:14:20 +0000 (08:14 -0800)]
librados: C++ API rework

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago rollback rename
Colin Patrick McCabe [Thu, 24 Feb 2011 18:13:09 +0000 (10:13 -0800)]
rollback rename

14 years ago rbd: use new librbd C++ api
Josh Durgin [Fri, 25 Feb 2011 01:20:33 +0000 (17:20 -0800)]
rbd: use new librbd C++ api

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago librbd: make C++ api nicer
Josh Durgin [Thu, 24 Feb 2011 21:41:55 +0000 (13:41 -0800)]
librbd: make C++ api nicer

Adds Image class and replaces aio_create_completion with a constructor.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago filejournal: fix type punning warning, drop unneeded cast
Sage Weil [Thu, 24 Feb 2011 14:12:02 +0000 (06:12 -0800)]
filejournal: fix type punning warning, drop unneeded cast

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago Some tweaks for the librados C API
Colin Patrick McCabe [Thu, 24 Feb 2011 11:30:23 +0000 (03:30 -0800)]
Some tweaks for the librados C API

rados_reopen_log: should take a cluster parameter.

Add rados_pool_list, rados_pool_list_free.

rados_snap_set_read -> rados_pool_snap_set_read

rados_snap_set_write_context -> rados_pool_selfmanaged_snap_set_write_ctx

write/write_full/etc: re-arrange parameter order to be the same as
pwrite(2).

Change interface of rados_pool_list a bit

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago librbd, rbd: fill in the rest of image_info_t
Josh Durgin [Thu, 24 Feb 2011 19:23:51 +0000 (11:23 -0800)]
librbd, rbd: fill in the rest of image_info_t

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago Makefile: fix libatomic_ops linking
Sage Weil [Thu, 24 Feb 2011 13:49:29 +0000 (05:49 -0800)]
Makefile: fix libatomic_ops linking

LDADD seems to have no effect on the final link command.  Switching this
back to AM_LDFLAGS.  This was changed in 1c7d8f1ac2c, although it's not
clear that the change was intentional...

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: remove "N stopped" from short mdsmap summary
Sage Weil [Thu, 24 Feb 2011 08:34:37 +0000 (00:34 -0800)]
mds: remove "N stopped" from short mdsmap summary

It's confusing because it sounds like we're talking about daemons, when we
really just mean there are some ranks that created some ondisk state but
aren't currently part of the running cluster.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mon: include mds gid in logs
Sage Weil [Thu, 24 Feb 2011 08:31:55 +0000 (00:31 -0800)]
mon: include mds gid in logs

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds,osd: print 'starting ...' message to stdout
Sage Weil [Thu, 24 Feb 2011 08:20:06 +0000 (00:20 -0800)]
mds,osd: print 'starting ...' message to stdout

The timestamp/threadid prefix is unnecessary, and stdout seems more
appropriate.  Now matches cmon.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago common: only print version to stdout for daemons
Sage Weil [Thu, 24 Feb 2011 08:19:23 +0000 (00:19 -0800)]
common: only print version to stdout for daemons

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: add 'exit' command
Sage Weil [Thu, 24 Feb 2011 15:50:58 +0000 (07:50 -0800)]
mds: add 'exit' command

Tell a cmds process to suicide/exit immediately.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: fix frag string rendering
Sage Weil [Thu, 24 Feb 2011 15:50:17 +0000 (07:50 -0800)]
mds: fix frag string rendering

Was mostly gibberish from df7c7bd79237d2a8b691f4e59433b0b39a9721a2

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: strengthen assertions in rejoin ack
Sage Weil [Wed, 23 Feb 2011 23:08:58 +0000 (15:08 -0800)]
mds: strengthen assertions in rejoin ack

The ACK only contains items we asked for with a WEAK request.  Assert as
much.  (The old continue bits were from ~2007, when this was originally
written.)

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: fix gratuitous map lookup
Sage Weil [Wed, 23 Feb 2011 23:10:45 +0000 (15:10 -0800)]
mds: fix gratuitous map lookup

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: mark_down connections to any failed peers
Sage Weil [Wed, 23 Feb 2011 22:40:38 +0000 (14:40 -0800)]
mds: mark_down connections to any failed peers

This cleans up messenger state, prevents log spam, and saves a small amount
of memory.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: fix export cancellation vs nested freezes
Sage Weil [Wed, 23 Feb 2011 22:25:06 +0000 (14:25 -0800)]
mds: fix export cancellation vs nested freezes

Prevent freezes from completing while we are canceling exports.  Otherwise
if we are freezing /a/b and /a, and cancel /a/b, we may inadvertently
complete the freeze on /a (synchronously) and confuse ourselves.  Pin
all freezes beforehand so that when we cancel each one we do not cause
any others to prematurely complete.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago FileStore: fix OpSequencer::flush error
Samuel Just [Wed, 23 Feb 2011 21:55:43 +0000 (13:55 -0800)]
FileStore: fix OpSequencer::flush error

In writeahead mode, an op will disappear from jq without immediately
reappearing in q.  Thus, q can be empty before seq is requeued and
finished.  last_thru_q and last_thru_jq will now be tracked explicitly.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years ago mds: print waiter tag in hex
Sage Weil [Wed, 23 Feb 2011 21:45:23 +0000 (13:45 -0800)]
mds: print waiter tag in hex

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: make frag string rendering simpler
Sage Weil [Wed, 23 Feb 2011 21:36:17 +0000 (13:36 -0800)]
mds: make frag string rendering simpler

Show actual bit prefix when rendering a frag_t.  That is,

$value/$numbits -> bits*

So,

0/0      -> *
000000/1 -> 0*
800000/1 -> 1*
800000/3 -> 100*

etc.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
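
The mapping in this commit message can be written out as a short function. This is a sketch, not the frag_t code itself: it assumes the value is a 24-bit quantity printed in hex whose most significant bits form the prefix, which is inferred from the four examples above, and the name frag_to_str is hypothetical.

```python
FRAG_VALUE_BITS = 24  # assumption: inferred from the six hex digits in the examples

def frag_to_str(value: int, bits: int) -> str:
    """Render value/bits as its leading bit prefix followed by '*'."""
    if not 0 <= bits <= FRAG_VALUE_BITS:
        raise ValueError("bits out of range")
    prefix = "".join(
        "1" if (value >> (FRAG_VALUE_BITS - 1 - i)) & 1 else "0"
        for i in range(bits)
    )
    return prefix + "*"

for value, bits in [(0x000000, 0), (0x000000, 1), (0x800000, 1), (0x800000, 3)]:
    print(f"{value:06x}/{bits} -> {frag_to_str(value, bits)}")
```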
14 years ago mon: fix dup mds takeover
Sage Weil [Wed, 23 Feb 2011 21:34:01 +0000 (13:34 -0800)]
mon: fix dup mds takeover

Allow a standby to take over for a single MDS only by consistently looking
at the pending_mdsmap and not mdsmap.  Mixing the two leads to all kinds
of confusion.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: print msg when fragtree updates from journal
Sage Weil [Wed, 23 Feb 2011 21:18:59 +0000 (13:18 -0800)]
mds: print msg when fragtree updates from journal

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: verify frags in more appropriate places
Sage Weil [Wed, 23 Feb 2011 21:17:32 +0000 (13:17 -0800)]
mds: verify frags in more appropriate places

Not in inner helpers, which may be called on multiple frags to get things
in sync.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: refragment dirs when inode dirfragtree updates from journal
Sage Weil [Wed, 23 Feb 2011 21:01:08 +0000 (13:01 -0800)]
mds: refragment dirs when inode dirfragtree updates from journal

Force dir fragmentation specified by dirfragtree when replayed from
the journal.

Example:
 mds0 is auth for /foo, mds1 is auth for /foo/bar.
 mds1 fragments /foo/bar.  journals etc.
 mds0 gets fragment notify and the in-memory inode's dirfragtree changes.
 mds0 journals the /foo/bar inode for some random reason.
 mds0 imports /foo/bar.

On replay, mds0 refragments upon first mention of the new fragtree in the
journal, so that the dirfragtree <-> dir frags always match.  Confusion is
avoided when we, say, import /foo/bar.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago mds: fix CDir::take_waiting() on dentry waiters
Sage Weil [Wed, 23 Feb 2011 19:55:06 +0000 (11:55 -0800)]
mds: fix CDir::take_waiting() on dentry waiters

Using take_dentry_waiting() means we double-put the DNWAITER pin.  It's
also way slower.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago More fixes, additions for config API
Colin Patrick McCabe [Wed, 23 Feb 2011 16:22:47 +0000 (08:22 -0800)]
More fixes, additions for config API

Add test of the librados configuration API to testrados.c

rados_reopen_log should return void since it can't encounter errors.

Create new rados_conf_get_alloc function that allocates memory, but has
a simpler interface.

Create rados_set_conf_defaults, a place to put librados-specific
configuration defaults.

Implement rados_conf_read_file by re-calling common_init. This isn't the
best way to do it, but it will get the function implemented for now.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago Update testrados, rename functions in librados.cc
Colin Patrick McCabe [Wed, 23 Feb 2011 15:29:41 +0000 (07:29 -0800)]
Update testrados, rename functions in librados.cc

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago rados_create: add id parameter
Colin Patrick McCabe [Wed, 23 Feb 2011 13:59:22 +0000 (05:59 -0800)]
rados_create: add id parameter

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago Fold common_set_defaults into common_init
Colin Patrick McCabe [Wed, 23 Feb 2011 13:50:10 +0000 (05:50 -0800)]
Fold common_set_defaults into common_init

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago Split rados_init into rados_create + rados_connect
Colin Patrick McCabe [Wed, 23 Feb 2011 12:43:12 +0000 (04:43 -0800)]
Split rados_init into rados_create + rados_connect

Split rados_init into rados_create and rados_connect.  The pattern will
be for users to call create, set configuration, and then connect. Rename
rados_release to rados_destroy, to be more symmetrical with
rados_create. You can't reconnect after calling destroy.

Don't create the messenger inside the RadosClient constructor. Instead,
wait until RadosClient::connect().

Rename rados_conf_apply to rados_reopen_log. Add comment about SIGHUP.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
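
The create, configure, connect, destroy sequence described in this commit can be pictured as a small lifecycle. The sketch below is a toy model only: ToyCluster and its methods are hypothetical and are not the librados API, and the only rules it encodes are the ones stated in the message (the messenger is created at connect time, and a destroyed handle cannot reconnect).

```python
class ToyCluster:
    """Hypothetical cluster handle used to illustrate the call order."""

    def __init__(self, client_id=None):
        # "create": record settings only, no messenger yet
        self.client_id = client_id
        self.conf = {}
        self.messenger = None
        self.state = "created"

    def _check_alive(self):
        if self.state == "destroyed":
            raise RuntimeError("handle was destroyed; create a new one")

    def conf_set(self, key, value):
        self._check_alive()
        self.conf[key] = value

    def connect(self):
        self._check_alive()
        self.messenger = object()  # stand-in: created here, not in __init__
        self.state = "connected"

    def destroy(self):
        self.messenger = None
        self.state = "destroyed"

cluster = ToyCluster(client_id="admin")
cluster.conf_set("log_file", "/tmp/foo")
cluster.connect()
cluster.destroy()
try:
    cluster.connect()
except RuntimeError as err:
    print("reconnect refused:", err)
```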
14 years ago Add rados_conf_apply, comments
Colin Patrick McCabe [Wed, 23 Feb 2011 00:38:41 +0000 (16:38 -0800)]
Add rados_conf_apply, comments

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago md_config_t::set_val/get_val
Colin Patrick McCabe [Tue, 22 Feb 2011 20:27:07 +0000 (12:27 -0800)]
md_config_t::set_val/get_val

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago common: more include and copyright fixes
Colin Patrick McCabe [Tue, 22 Feb 2011 19:32:55 +0000 (11:32 -0800)]
common: more include and copyright fixes

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago common: Fix some missing includes, copyrights
Colin Patrick McCabe [Tue, 22 Feb 2011 19:23:31 +0000 (11:23 -0800)]
common: Fix some missing includes, copyrights

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago cconf: remove second argument to cconf --lookup
Colin Patrick McCabe [Tue, 22 Feb 2011 17:00:21 +0000 (09:00 -0800)]
cconf: remove second argument to cconf --lookup

Everyone uses get_conf to get configuration values. So the logic for
defaulting to some value if we can't find the requested key should live
there. Also fix a case in cconf where we could encounter a usage error
and keep on going.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago ceph_common.sh: remove get_val, get_val_bool
Colin Patrick McCabe [Tue, 22 Feb 2011 16:48:53 +0000 (08:48 -0800)]
ceph_common.sh: remove get_val, get_val_bool

get_val and get_val_bool are unused.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago Rename config.h -> common/config.h
Colin Patrick McCabe [Tue, 22 Feb 2011 16:38:45 +0000 (08:38 -0800)]
Rename config.h -> common/config.h

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago config.cc: doesn't depend on ceph_ver.h
Colin Patrick McCabe [Tue, 22 Feb 2011 16:23:30 +0000 (08:23 -0800)]
config.cc: doesn't depend on ceph_ver.h

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago testlibrbdpp: update for new librados and librbd APIs
Josh Durgin [Wed, 23 Feb 2011 23:32:57 +0000 (15:32 -0800)]
testlibrbdpp: update for new librados and librbd APIs

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago rbd: update for new librados and librbd APIs
Josh Durgin [Wed, 23 Feb 2011 23:23:59 +0000 (15:23 -0800)]
rbd: update for new librados and librbd APIs

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago librbd: implement stacking on top of librados
Josh Durgin [Wed, 23 Feb 2011 22:46:04 +0000 (14:46 -0800)]
librbd: implement stacking on top of librados

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago librados: add constructor to allow client re-use
Josh Durgin [Wed, 23 Feb 2011 22:07:46 +0000 (14:07 -0800)]
librados: add constructor to allow client re-use

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago librados: switch to noun_verb function names
Josh Durgin [Wed, 23 Feb 2011 22:00:34 +0000 (14:00 -0800)]
librados: switch to noun_verb function names

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years ago librbd: stack on top of librados
Sage Weil [Tue, 22 Feb 2011 21:15:23 +0000 (13:15 -0800)]
librbd: stack on top of librados

14 years ago ReplicatedPG: snap_trimmer should bail out while finalizing_scrub
Samuel Just [Thu, 10 Feb 2011 23:45:22 +0000 (15:45 -0800)]
ReplicatedPG: snap_trimmer should bail out while finalizing_scrub

Check to make sure !finalizing_scrub when relocking.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years ago OSD,PG: fix race between processing scrub and dequeueing scrub
Samuel Just [Tue, 15 Feb 2011 18:02:19 +0000 (10:02 -0800)]
OSD,PG: fix race between processing scrub and dequeueing scrub

Previously, a second scrub could be scheduled between the time the first
was dequeued and the time it was processed, resulting in two scrubs of
the pg running concurrently.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years ago osd: fix recovery pointer when pulling head before snapid
Sage Weil [Tue, 22 Feb 2011 20:45:21 +0000 (12:45 -0800)]
osd: fix recovery pointer when pulling head before snapid

If recovery wants to pull a snapped object and needs the head first, pull()
does that, but the caller doesn't ++skipped and incorrectly bumps the
recovery pointer, preventing us from going back and re-pulling the snapped
object later.

Return a tristate enum from pull so we can tell what it did and update our
recovery state appropriately.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
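
The idea of a tristate return value can be sketched with a toy recovery loop. Everything below is illustrative: PullResult, pull and recover are hypothetical names, objects are plain strings where "obj@snap" denotes a snapped object that needs its head "obj" first, and the real OSD recovery state is far richer than a pointer and a counter.

```python
from enum import Enum

class PullResult(Enum):
    NONE = 0   # nothing was pulled
    OTHER = 1  # a different object (the head) was pulled first
    YES = 2    # the requested object was pulled

def pull(name, local, remote):
    """Toy pull: a snapped object 'obj@snap' needs its head 'obj' locally first."""
    head = name.split("@")[0]
    if name != head and head not in local:
        local.add(head)
        return PullResult.OTHER
    if name not in remote:
        return PullResult.NONE
    local.add(name)
    return PullResult.YES

def recover(missing, local, remote):
    """Walk `missing` in order; return (recovery pointer, number skipped)."""
    pointer = 0
    skipped = 0
    for index, name in enumerate(missing):
        if pull(name, local, remote) is PullResult.YES:
            if skipped == 0:
                pointer = index + 1  # safe to advance: nothing left behind
        else:
            skipped += 1             # must come back to this object later
    return pointer, skipped

remote = {"foo", "foo@1", "bar"}
local = set()
first_pass = recover(["foo@1", "bar"], local, remote)
second_pass = recover(["foo@1"], local, remote)
print(first_pass, second_pass, sorted(local))
```

Because the caller can tell OTHER apart from YES, the first pass counts the snapped object as skipped and leaves the pointer in place, so the second pass can still pull it.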
14 years ago osd: verify object version during push
Sage Weil [Tue, 22 Feb 2011 20:20:40 +0000 (12:20 -0800)]
osd: verify object version during push

Fail to push if the ondisk version doesn't match the version we want to
send.

This isn't supposed to happen. If it does it means we have a bug somewhere
else.  Log something to the error log and don't push.  This is better than
the current behavior, which goes into a loop (repeatedly pulling the object
and retrying when it's not the right version).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago osd: improve up_thru request behavior
Sage Weil [Tue, 22 Feb 2011 17:40:47 +0000 (09:40 -0800)]
osd: improve up_thru request behavior

There is some epoch the OSD wants for up_thru, based on when the PG mapping
last changed.  However, once the monitor gets to the point where it must
update the map, it should set up_thru to the most recent epoch the OSD has
seen (i.e. the epoch it is known to be "up thru"!).  This will hopefully/
frequently avoid any subsequent up_thru requests.

MOSDAlive already has a separate field (in PaxosServiceMessage) to hold the
latest epoch; just fix the constructor to set it properly, and make the
monitor use it.  No protocol change, yay!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago pybind: update rados python bindings for new API
Colin Patrick McCabe [Tue, 22 Feb 2011 17:27:10 +0000 (09:27 -0800)]
pybind: update rados python bindings for new API

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago osd: set correct epoch for up_thru osd->mon request
Sage Weil [Tue, 22 Feb 2011 17:06:05 +0000 (09:06 -0800)]
osd: set correct epoch for up_thru osd->mon request

Put the epoch we need for up_thru in the request.  Putting the most recent
epoch causes incorrect osdmap churn.

Fixes: #824
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago PGMap: make osd_full and nearfull ratios configurable.
Greg Farnum [Tue, 22 Feb 2011 16:00:15 +0000 (08:00 -0800)]
PGMap: make osd_full and nearfull ratios configurable.

These were previously set by #defines. Pretty stupid
when we have a nice config system already!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years ago librados: more API cleanup; rados_conf_ stubs
Sage Weil [Sat, 12 Feb 2011 22:20:16 +0000 (14:20 -0800)]
librados: more API cleanup; rados_conf_ stubs

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago move g_default_file_layout into ceph_fs.cc
Colin Patrick McCabe [Tue, 22 Feb 2011 13:12:34 +0000 (05:12 -0800)]
move g_default_file_layout into ceph_fs.cc

It's defined in ceph_fs.h.

Fix a bunch of headers that use types without including the headers that
define those types.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago librados: add cluster handle to C API
Sage Weil [Fri, 11 Feb 2011 23:38:15 +0000 (15:38 -0800)]
librados: add cluster handle to C API

Had to add a layer of indirection to the list context handles.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago Makefile: include ceph_argsparse.h in dist tarball
Sage Weil [Mon, 21 Feb 2011 05:00:57 +0000 (21:00 -0800)]
Makefile: include ceph_argsparse.h in dist tarball

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago filestore: fix clone_range
Sage Weil [Mon, 21 Feb 2011 04:55:49 +0000 (20:55 -0800)]
filestore: fix clone_range

This was broken by the safe_write() switchover; the success return value
is now 0, not the number of bytes written.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
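
The return-value convention behind this fix can be shown with a small write-all helper. This is a sketch in Python rather than the Ceph safe_write itself: it assumes the convention stated above (0 on success) plus a negative errno on failure, which is an assumption made for the example.

```python
import errno
import os

def safe_write(fd, data):
    """Write all of `data` to `fd`. Return 0 on success, -errno on failure."""
    view = memoryview(data)
    while len(view) > 0:
        try:
            written = os.write(fd, view)
        except InterruptedError:
            continue               # retry if interrupted by a signal
        except OSError as err:
            return -err.errno
        view = view[written:]      # short write: keep going with the rest
    return 0

read_end, write_end = os.pipe()
payload = b"hello"
result = safe_write(write_end, payload)
ok_new_convention = (result == 0)
ok_old_convention = (result == len(payload))  # the kind of check that breaks
received = os.read(read_end, len(payload))
os.close(read_end)
os.close(write_end)
bad_fd_result = safe_write(-1, b"x")
print(ok_new_convention, ok_old_convention, received, bad_fd_result)
```

A caller that still compares the result against the byte count treats every successful write as a failure, which matches the breakage described in the commit message.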
14 years ago common: Split argument parsing into ceph_argparse
Colin Patrick McCabe [Sun, 20 Feb 2011 17:18:03 +0000 (09:18 -0800)]
common: Split argument parsing into ceph_argparse

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago keyring_init: don't print error when explicit key/keyfile is specified
Sage Weil [Sun, 20 Feb 2011 21:54:20 +0000 (13:54 -0800)]
keyring_init: don't print error when explicit key/keyfile is specified

e.g. when I am non-root and specify a key explicitly, no need to complain
about not being able to read root's /etc/ceph/keyring.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years ago Revert "keyring_init: g_conf.keyring is not a list"
Sage Weil [Sun, 20 Feb 2011 21:52:15 +0000 (13:52 -0800)]
Revert "keyring_init: g_conf.keyring is not a list"

This reverts commit 2fb6036aa53f5eb3173b80fd17b7240bd3daf156.

14 years ago keyring_init: g_conf.keyring is not a list
Colin Patrick McCabe [Fri, 18 Feb 2011 17:51:32 +0000 (09:51 -0800)]
keyring_init: g_conf.keyring is not a list

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago Revert "Makefile.am: remove unused libs from linking with librbd tests and rbd"
Josh Durgin [Sat, 19 Feb 2011 00:01:10 +0000 (16:01 -0800)]
Revert "Makefile.am: remove unused libs from linking with librbd tests and rbd"

Same problem as 38f38a99149e88f18072fcbdbee316ac21f6f30f.

This reverts commit e5db46cea0997f3f959b2ae896c980585f079ac0.

14 years ago Clock: remove unused mutex
Colin Patrick McCabe [Fri, 18 Feb 2011 16:38:52 +0000 (08:38 -0800)]
Clock: remove unused mutex

We don't use a mutex in g_clock any more, so let's not construct one any
more.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years ago Merge branch 'pool_memory'
Greg Farnum [Fri, 18 Feb 2011 23:42:16 +0000 (15:42 -0800)]
Merge branch 'pool_memory'

14 years ago test: Add new memory tests, move to own subdir.
Greg Farnum [Wed, 16 Feb 2011 21:01:34 +0000 (13:01 -0800)]
test: Add new memory tests, move to own subdir.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years ago vstart: remove directories, too.
Greg Farnum [Wed, 16 Feb 2011 19:23:11 +0000 (11:23 -0800)]
vstart: remove directories, too.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years ago OSD: convert waiting_for_pg from hash_map to map.
Greg Farnum [Mon, 14 Feb 2011 21:24:40 +0000 (13:24 -0800)]
OSD: convert waiting_for_pg from hash_map to map.

This doesn't need to be a hash_map; there will only be an entry
for each PG that gets a message request while it's not active.
There shouldn't be too many PGs that this happens to, right?

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years ago PG: remove the object locking stubs and some dead code.
Greg Farnum [Mon, 14 Feb 2011 21:23:42 +0000 (13:23 -0800)]
PG: remove the object locking stubs and some dead code.

These are unused (#if 0'd, so no way to use them!) and require
a memory-hogging hash_map. Goodbye!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years ago PG: convert hash_maps to maps, remove unused.
Greg Farnum [Sat, 12 Feb 2011 01:25:14 +0000 (17:25 -0800)]
PG: convert hash_maps to maps, remove unused.

waiting_for_[missing|degraded]_object don't need to be
hash_maps, and we don't use stat_object_temp_rd at all.
Swap to map and remove to reduce per-PG memory consumption!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>