git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
14 years agomds: mark_down connections to any failed peers
Sage Weil [Wed, 23 Feb 2011 22:40:38 +0000 (14:40 -0800)]
mds: mark_down connections to any failed peers

This cleans up messenger state, prevents log spam, and saves a small amount
of memory.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: fix export cancellation vs nested freezes
Sage Weil [Wed, 23 Feb 2011 22:25:06 +0000 (14:25 -0800)]
mds: fix export cancellation vs nested freezes

Prevent freezes from completing while we are canceling exports.  Otherwise
if we are freezing /a/b and /a, and cancel /a/b, we may inadvertently
complete the freeze on /a (synchronously) and confuse ourselves.  Pin
all freezes beforehand so that when we cancel each one we do not cause
any others to prematurely complete.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoFileStore: fix OpSequencer::flush error
Samuel Just [Wed, 23 Feb 2011 21:55:43 +0000 (13:55 -0800)]
FileStore: fix OpSequencer::flush error

In writeahead mode, an op will disappear from jq without immediately
reappearing in q.  Thus, q can be empty before seq is requeued and
finished.  last_thru_q and last_thru_jq will now be tracked explicitly.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agomds: print waiter tag in hex
Sage Weil [Wed, 23 Feb 2011 21:45:23 +0000 (13:45 -0800)]
mds: print waiter tag in hex

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: make frag string rendering simpler
Sage Weil [Wed, 23 Feb 2011 21:36:17 +0000 (13:36 -0800)]
mds: make frag string rendering simpler

Show actual bit prefix when rendering a frag_t.  That is,

$value/$numbits -> bits*

So,

0/0      -> *
000000/1 -> 0*
800000/1 -> 1*
800000/3 -> 100*

etc.
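
The rendering rule above can be sketched in a few lines of Python. This is only an illustration, not the frag_t code itself: the name `render_frag` is hypothetical, and the 24-bit value width is an assumption inferred from the six-hex-digit values in the examples.

```python
FRAG_VALUE_BITS = 24  # assumption: inferred from the six-hex-digit values above


def render_frag(value: int, numbits: int) -> str:
    """Render the top `numbits` bits of `value` as a binary prefix plus '*'."""
    bits = "".join(
        "1" if value & (1 << (FRAG_VALUE_BITS - 1 - i)) else "0"
        for i in range(numbits)
    )
    return bits + "*"


if __name__ == "__main__":
    for value, numbits in [(0x000000, 0), (0x000000, 1), (0x800000, 1), (0x800000, 3)]:
        print(f"{value:06x}/{numbits} -> {render_frag(value, numbits)}")
```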

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomon: fix dup mds takeover
Sage Weil [Wed, 23 Feb 2011 21:34:01 +0000 (13:34 -0800)]
mon: fix dup mds takeover

Allow a standby to take over for a single MDS only by consistently looking
at the pending_mdsmap and not mdsmap.  Mixing the two leads to all kinds
of confusion.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: print msg when fragtree updates from journal
Sage Weil [Wed, 23 Feb 2011 21:18:59 +0000 (13:18 -0800)]
mds: print msg when fragtree updates from journal

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: verify frags in more appropriate places
Sage Weil [Wed, 23 Feb 2011 21:17:32 +0000 (13:17 -0800)]
mds: verify frags in more appropriate places

Not in inner helpers, which may be called on multiple frags to get things
in sync.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: refragment dirs when inode dirfragtree updates from journal
Sage Weil [Wed, 23 Feb 2011 21:01:08 +0000 (13:01 -0800)]
mds: refragment dirs when inode dirfragtree updates from journal

Force dir fragmentation specified by dirfragtree when replayed from
the journal.

Example:
 mds0 is auth for /foo, mds1 is auth for /foo/bar.
 mds1 fragments /foo/bar.  journals etc.
 mds0 gets fragment notify and the in-memory inode's dirfragtree changes.
 mds0 journals the /foo/bar inode for some random reason.
 mds0 imports /foo/bar.

On replay, mds0 refragments upon first mention of the new fragtree in the
journal, so that the dirfragtree <-> dir frags always match.  Confusion is
avoided when we, say, import /foo/bar.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomds: fix CDir::take_waiting() on dentry waiters
Sage Weil [Wed, 23 Feb 2011 19:55:06 +0000 (11:55 -0800)]
mds: fix CDir::take_waiting() on dentry waiters

Using take_dentry_waiting() means we double-put the DNWAITER pin.  It's
also way slower.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoReplicatedPG: snap_trimmer should bail out while finalizing_scrub
Samuel Just [Thu, 10 Feb 2011 23:45:22 +0000 (15:45 -0800)]
ReplicatedPG: snap_trimmer should bail out while finalizing_scrub

Check to make sure !finalizing_scrub when relocking.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agoOSD,PG: fix race between processing scrub and dequeueing scrub
Samuel Just [Tue, 15 Feb 2011 18:02:19 +0000 (10:02 -0800)]
OSD,PG: fix race between processing scrub and dequeueing scrub

Previously, a second scrub could be scheduled between when the first is
dequeued and processed resulting in two scrubs of the pg running
concurrently.
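
One way to close this kind of window, sketched as a toy Python model: a single marker covers a PG from the moment it is queued until processing finishes, so dequeueing alone does not make it schedulable again. The class and method names are hypothetical and this is not the OSD code.

```python
from collections import deque


class ScrubScheduler:
    """Toy model: `busy` covers the whole queued -> processing -> done window."""

    def __init__(self):
        self.queue = deque()
        self.busy = set()  # PGs that are queued OR currently being scrubbed

    def schedule(self, pg) -> bool:
        """Queue a scrub unless one is already queued or running for this PG."""
        if pg in self.busy:
            return False
        self.busy.add(pg)
        self.queue.append(pg)
        return True

    def dequeue(self):
        # The PG deliberately stays in `busy` after it leaves the queue.
        return self.queue.popleft()

    def finish(self, pg) -> None:
        """Only now may the PG be scheduled again."""
        self.busy.discard(pg)
```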

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agoosd: fix recovery pointer when pulling head before snapid
Sage Weil [Tue, 22 Feb 2011 20:45:21 +0000 (12:45 -0800)]
osd: fix recovery pointer when pulling head before snapid

If recovery wants to pull a snapped object and needs the head first, pull()
does that, but the caller doesn't ++skipped and incorrectly bumps the
recovery pointer, preventing us from going back and re-pulling the snapped
object later.

Return a tristate enum from pull so we can tell what it did and update our
recovery state appropriately.
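
A minimal Python sketch of the tristate idea, assuming hypothetical names (`PullResult`, `pull`, `recover`) and a toy notion of "needs the head first". It only shows why the caller needs three outcomes: pulling something else must count as a skip and must not advance the recovery pointer.

```python
from enum import Enum


class PullResult(Enum):  # hypothetical names, not taken from the source
    NONE = 0   # nothing was pulled (not produced by this toy pull)
    OTHER = 1  # a different object (e.g. the head) was pulled first
    YES = 2    # the requested object itself was pulled


def pull(obj: dict, have_head: bool) -> PullResult:
    """Toy pull: a snapped object needs its head to be present first."""
    if obj["is_snap"] and not have_head:
        return PullResult.OTHER
    return PullResult.YES


def recover(objects: list, have_head: bool = False) -> tuple:
    """Return (recovery_pointer, skipped) after one pass over `objects`."""
    pointer = 0
    skipped = 0
    for index, obj in enumerate(objects):
        result = pull(obj, have_head)
        if result is not PullResult.YES:
            skipped += 1         # we must come back to this object later
        elif skipped == 0:
            pointer = index + 1  # safe to advance: nothing behind us was skipped
    return pointer, skipped
```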

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: verify object version during push
Sage Weil [Tue, 22 Feb 2011 20:20:40 +0000 (12:20 -0800)]
osd: verify object version during push

Fail to push if the ondisk version doesn't match the version we want to
send.

This isn't supposed to happen. If it does it means we have a bug somewhere
else.  Log something to the error log and don't push.  This is better than
the current behavior, which goes into a loop (repeatedly pulling the object
and retrying when it's not the right version).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: improve up_thru request behavior
Sage Weil [Tue, 22 Feb 2011 17:40:47 +0000 (09:40 -0800)]
osd: improve up_thru request behavior

There is some epoch the OSD wants for up_thru, based on when the PG mapping
last changed.  However, once the monitor gets to the point where it must
update the map, it should set up_thru to the most recent epoch the OSD has
seen (i.e. the epoch it is known to be "up thru"!).  This will hopefully/
frequently avoid any subsequent up_thru requests.

MOSDAlive already has a separate field (in PaxosServiceMessage) to hold the
latest epoch; just fix the constructor to set it properly, and make the
monitor use it.  No protocol change, yay!
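
The monitor-side decision described above might look roughly like this. The function name and parameters are hypothetical; the sketch only captures the rule "if the map has to change anyway, record the latest epoch the OSD has seen rather than the minimum it asked for".

```python
def handle_alive(current_up_thru: int, want: int, latest_seen: int) -> tuple:
    """Return (new_up_thru, map_changed) for one up_thru request.

    want:        epoch the OSD needs, based on the last PG mapping change
    latest_seen: most recent epoch the OSD reports having seen
    """
    if current_up_thru >= want:
        return current_up_thru, False   # already satisfied, no map update
    # An update is unavoidable, so take the newest epoch to avoid later requests.
    return max(want, latest_seen), True
```

With this rule, a follow-up request for any epoch up to `latest_seen` is already satisfied and causes no further map change.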

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: set correct epoch for up_thru osd->mon request
Sage Weil [Tue, 22 Feb 2011 17:06:05 +0000 (09:06 -0800)]
osd: set correct epoch for up_thru osd->mon request

Put the epoch we need for up_thru in the request.  Putting the most recent
epoch causes incorrect osdmap churn.

Fixes: #824
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoPGMap: make osd_full and nearfull ratios configurable.
Greg Farnum [Tue, 22 Feb 2011 16:00:15 +0000 (08:00 -0800)]
PGMap: make osd_full and nearfull ratios configurable.

These were previously set by #defines. Pretty stupid
when we have a nice config system already!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoMakefile: include ceph_argsparse.h in dist tarball
Sage Weil [Mon, 21 Feb 2011 05:00:57 +0000 (21:00 -0800)]
Makefile: include ceph_argsparse.h in dist tarball

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agofilestore: fix clone_range
Sage Weil [Mon, 21 Feb 2011 04:55:49 +0000 (20:55 -0800)]
filestore: fix clone_range

This was broken by the safe_write() switchover; the success return value
is now 0, not the number of bytes written.
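
The return-value convention behind this bug can be illustrated in Python. This `safe_write` is a stand-in written for the example, not the Ceph helper: it loops until everything is written and returns 0 on success or a negative errno, so a caller must test for 0 rather than compare against the byte count.

```python
import os


def safe_write(fd: int, data: bytes) -> int:
    """Write all of `data` to `fd`; return 0 on success or a negative errno."""
    view = memoryview(data)
    while view:
        try:
            written = os.write(fd, view)
        except InterruptedError:
            continue
        except OSError as err:
            return -err.errno
        view = view[written:]
    return 0


def write_chunk(fd: int, data: bytes) -> bool:
    """Caller-side check: success is 0, not len(data)."""
    return safe_write(fd, data) == 0
```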

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agocommon: Split argument parsing into ceph_argparse
Colin Patrick McCabe [Sun, 20 Feb 2011 17:18:03 +0000 (09:18 -0800)]
common: Split argument parsing into ceph_argparse

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agokeyring_init: don't print error when explicit key/keyfile is specified
Sage Weil [Sun, 20 Feb 2011 21:54:20 +0000 (13:54 -0800)]
keyring_init: don't print error when explicit key/keyfile is specified

e.g. when I am non-root and specify a key explicitly, no need to complain
about not being able to read root's /etc/ceph/keyring.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoRevert "keyring_init: g_conf.keyring is not a list"
Sage Weil [Sun, 20 Feb 2011 21:52:15 +0000 (13:52 -0800)]
Revert "keyring_init: g_conf.keyring is not a list"

This reverts commit 2fb6036aa53f5eb3173b80fd17b7240bd3daf156.

14 years agokeyring_init: g_conf.keyring is not a list
Colin Patrick McCabe [Fri, 18 Feb 2011 17:51:32 +0000 (09:51 -0800)]
keyring_init: g_conf.keyring is not a list

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoRevert "Makefile.am: remove unused libs from linking with librbd tests and rbd"
Josh Durgin [Sat, 19 Feb 2011 00:01:10 +0000 (16:01 -0800)]
Revert "Makefile.am: remove unused libs from linking with librbd tests and rbd"

Same problem as 38f38a99149e88f18072fcbdbee316ac21f6f30f.

This reverts commit e5db46cea0997f3f959b2ae896c980585f079ac0.

14 years agoClock: remove unused mutex
Colin Patrick McCabe [Fri, 18 Feb 2011 16:38:52 +0000 (08:38 -0800)]
Clock: remove unused mutex

We don't use a mutex in g_clock any more, so let's not construct one any
more.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoMerge branch 'pool_memory'
Greg Farnum [Fri, 18 Feb 2011 23:42:16 +0000 (15:42 -0800)]
Merge branch 'pool_memory'

14 years agotest: Add new memory tests, move to own subdir.
Greg Farnum [Wed, 16 Feb 2011 21:01:34 +0000 (13:01 -0800)]
test: Add new memory tests, move to own subdir.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agovstart: remove directories, too.
Greg Farnum [Wed, 16 Feb 2011 19:23:11 +0000 (11:23 -0800)]
vstart: remove directories, too.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoOSD: convert waiting_for_pg from hash_map to map.
Greg Farnum [Mon, 14 Feb 2011 21:24:40 +0000 (13:24 -0800)]
OSD: convert waiting_for_pg from hash_map to map.

This doesn't need to be a hash_map; there will only be an entry
for each PG that gets a message request while it's not active.
There shouldn't be too many PGs that this happens to, right?

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoPG: remove the object locking stubs and some dead code.
Greg Farnum [Mon, 14 Feb 2011 21:23:42 +0000 (13:23 -0800)]
PG: remove the object locking stubs and some dead code.

These are unused (#if 0'd, so no way to use them!) and require
a memory-hogging hash_map. Goodbye!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoPG: convert hash_maps to maps, remove unused.
Greg Farnum [Sat, 12 Feb 2011 01:25:14 +0000 (17:25 -0800)]
PG: convert hash_maps to maps, remove unused.

waiting_for_[missing|degraded]_object don't need to be
hash_maps, and we don't use stat_object_temp_rd at all.
Swap to map and remove to reduce per-PG memory consumption!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agodebug.h: cleanup includes
Colin Patrick McCabe [Fri, 18 Feb 2011 15:35:23 +0000 (07:35 -0800)]
debug.h: cleanup includes

Shouldn't need to include DoutStreambuf.h; that's all implementation.
Don't include Mutex.h, since we don't use it.
*Do* include config.h, since we need it.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agocommon: Move hex dump functions into hex.h
Colin Patrick McCabe [Fri, 18 Feb 2011 14:35:47 +0000 (06:35 -0800)]
common: Move hex dump functions into hex.h

Move hex dump functions into hex.h. Remove unnecessary includes from
debug.cc

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoMakefile: version.cc should depend on ceph_ver
Colin Patrick McCabe [Fri, 18 Feb 2011 14:28:39 +0000 (06:28 -0800)]
Makefile: version.cc should depend on ceph_ver

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agodebug.h: move Ceph version stuff into version.h
Colin Patrick McCabe [Fri, 18 Feb 2011 21:04:51 +0000 (13:04 -0800)]
debug.h: move Ceph version stuff into version.h

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoRevert "Makefile.am: remove unused libs from linking with librbd"
Josh Durgin [Fri, 18 Feb 2011 20:57:55 +0000 (12:57 -0800)]
Revert "Makefile.am: remove unused libs from linking with librbd"

librados doesn't export ceph::buffer_total_alloc

This reverts commit 9bbd6c32a59ce0a2e4cc21a498e0b04bcd4781ed.

14 years agotestlibrbdpp: fix off by one error in read test
Josh Durgin [Fri, 18 Feb 2011 19:17:17 +0000 (11:17 -0800)]
testlibrbdpp: fix off by one error in read test

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoMakefile.am: remove unused libs from linking with librbd tests and rbd
Josh Durgin [Fri, 18 Feb 2011 19:12:48 +0000 (11:12 -0800)]
Makefile.am: remove unused libs from linking with librbd tests and rbd

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoMakefile.am: remove unused libs from linking with librbd
Josh Durgin [Fri, 18 Feb 2011 18:31:25 +0000 (10:31 -0800)]
Makefile.am: remove unused libs from linking with librbd

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agoMerge remote branch 'origin/max_commit_size'
Sage Weil [Fri, 18 Feb 2011 07:06:13 +0000 (23:06 -0800)]
Merge remote branch 'origin/max_commit_size'

14 years agopybind/rados: write_full: remove silly extra param
Colin Patrick McCabe [Thu, 17 Feb 2011 18:53:00 +0000 (10:53 -0800)]
pybind/rados: write_full: remove silly extra param

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agopybind/rados: implement Pool.write_full
Colin Patrick McCabe [Thu, 17 Feb 2011 18:47:41 +0000 (10:47 -0800)]
pybind/rados: implement Pool.write_full

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agolibrbd: hold image context lock minimally
Josh Durgin [Fri, 18 Feb 2011 01:30:19 +0000 (17:30 -0800)]
librbd: hold image context lock minimally

Holding the image context lock during snapshot removal prevented the
client from responding to a notify, causing a deadlock. This could be
triggered by removing a snapshot while concurrently adding more to the
same image.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agopybind/rados: implement Pool::change_auid
Colin Patrick McCabe [Thu, 17 Feb 2011 18:24:22 +0000 (10:24 -0800)]
pybind/rados: implement Pool::change_auid

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agopybind/rados: add rados.version
Colin Patrick McCabe [Thu, 17 Feb 2011 18:05:10 +0000 (10:05 -0800)]
pybind/rados: add rados.version

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agopybind/rados: Add Rados.pool_exists
Colin Patrick McCabe [Thu, 17 Feb 2011 17:58:20 +0000 (09:58 -0800)]
pybind/rados: Add Rados.pool_exists

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agopybind/rados: Snap.name should be a py string
Colin Patrick McCabe [Thu, 17 Feb 2011 17:46:34 +0000 (09:46 -0800)]
pybind/rados: Snap.name should be a py string

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agopybind/rados: add snapshots
Colin Patrick McCabe [Thu, 17 Feb 2011 17:38:11 +0000 (09:38 -0800)]
pybind/rados: add snapshots

Add snapshot lookup, iteration, creation, destruction interface.

Add test.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoAdd Pool::list_objects
Colin Patrick McCabe [Thu, 17 Feb 2011 14:46:08 +0000 (06:46 -0800)]
Add Pool::list_objects

Add a Pool::list_objects method. Add a test for this to pybind-test.py

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoAdd pybind-test
Colin Patrick McCabe [Thu, 17 Feb 2011 12:41:15 +0000 (04:41 -0800)]
Add pybind-test

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agorados-python bindings: Fix pool deletion a bit
Colin Patrick McCabe [Thu, 17 Feb 2011 12:11:08 +0000 (04:11 -0800)]
rados-python bindings: Fix pool deletion a bit

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomkcephfs: fix premature tmp directory deletion
Samuel Just [Thu, 17 Feb 2011 19:43:58 +0000 (11:43 -0800)]
mkcephfs: fix premature tmp directory deletion

Previously, the temp directory would be deleted after the first daemon
on a host was started leaving the second one to fail.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years ago.gitignore: ignore testsnaps
Samuel Just [Thu, 17 Feb 2011 19:23:37 +0000 (11:23 -0800)]
.gitignore: ignore testsnaps

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years ago.gitignore: ignore debian packaging outputs
Josh Durgin [Thu, 17 Feb 2011 19:02:59 +0000 (11:02 -0800)]
.gitignore: ignore debian packaging outputs

14 years agomonmaptool: fix command-line output
Colin Patrick McCabe [Wed, 16 Feb 2011 18:28:48 +0000 (10:28 -0800)]
monmaptool: fix command-line output

Don't check errno if it isn't set.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agodout: don't print version when forcing fg logging.
Colin Patrick McCabe [Wed, 16 Feb 2011 17:21:23 +0000 (09:21 -0800)]
dout: don't print version when forcing fg logging.

Fix tests that were assuming us to spew errors about /var/log, which we
no longer do.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agodout initialization: remove unnecessary flush()
Colin Patrick McCabe [Wed, 16 Feb 2011 16:57:49 +0000 (08:57 -0800)]
dout initialization: remove unnecessary flush()

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agodout: properly output ceph version on opening dout
Colin Patrick McCabe [Wed, 16 Feb 2011 16:22:05 +0000 (08:22 -0800)]
dout: properly output ceph version on opening dout

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomonmaptool: set_foreground_logging
Colin Patrick McCabe [Wed, 16 Feb 2011 16:04:29 +0000 (08:04 -0800)]
monmaptool: set_foreground_logging

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agosimplemessenger: Fix num_threads bug printout.
Greg Farnum [Wed, 16 Feb 2011 23:07:01 +0000 (15:07 -0800)]
simplemessenger: Fix num_threads bug printout.

Also add documentation to get_num_threads since its contract
changed significantly.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agomsgr: complain if there are > 1 threads, not 1
Colin Patrick McCabe [Wed, 16 Feb 2011 15:15:38 +0000 (07:15 -0800)]
msgr: complain if there are > 1 threads, not 1

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoinit-ceph: use do_cmd for pid_file dir creation
Sage Weil [Wed, 16 Feb 2011 17:19:42 +0000 (09:19 -0800)]
init-ceph: use do_cmd for pid_file dir creation

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: fix population on unconnected_watchers on obc load
Sage Weil [Wed, 16 Feb 2011 03:32:35 +0000 (19:32 -0800)]
osd: fix population on unconnected_watchers on obc load

Fixes: #807
Signed-off-by: Sage Weil <sage@newdream.net>
14 years agocommon: thread: get number of threads from /proc
Colin Patrick McCabe [Tue, 15 Feb 2011 19:02:03 +0000 (11:02 -0800)]
common: thread: get number of threads from /proc

The kernel knows how many threads we have; just ask it. One less atomic
variable to carry around.

We will eventually have to avoid doing this check for non-daemon code,
but that's a separate issue.
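
For illustration, asking the kernel for the thread count on Linux can be done by reading the `Threads:` field of `/proc/self/status`. Whether the commit reads this file or another /proc entry is not stated here, so treat the path choice and the name `count_threads` as assumptions.

```python
def count_threads(status_path: str = "/proc/self/status"):
    """Return the kernel's thread count for this process, or None if unavailable."""
    try:
        with open(status_path) as status:
            for line in status:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except OSError:
        return None  # no /proc (non-Linux) or unreadable file
    return None


if __name__ == "__main__":
    print("threads:", count_threads())
```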

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoRemove ExportControl (we have better auth now)
Colin Patrick McCabe [Tue, 15 Feb 2011 17:51:35 +0000 (09:51 -0800)]
Remove ExportControl (we have better auth now)

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoinit-ceph: status: use daemon_is_running
Colin Patrick McCabe [Tue, 15 Feb 2011 17:16:35 +0000 (09:16 -0800)]
init-ceph: status: use daemon_is_running

daemon_is_running does some nice things like check /proc/$pid/cmdline.
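
The /proc/$pid/cmdline check can be sketched as follows. The real daemon_is_running is part of the init script; this Python version, including its signature and the `proc_root` parameter, is a hypothetical illustration of the idea that a pid file alone is not enough and the process name should match too.

```python
import os


def daemon_is_running(pid: int, name: str, proc_root: str = "/proc") -> bool:
    """True if `pid` exists and its argv[0] basename equals `name`."""
    try:
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as cmdline:
            argv = cmdline.read().split(b"\0")  # arguments are NUL-separated
    except OSError:
        return False  # no such process, or not readable
    if not argv or not argv[0]:
        return False
    return os.path.basename(argv[0]).decode(errors="replace") == name
```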

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoinit-ceph: fix status for multi-node clusters
Colin Patrick McCabe [Tue, 15 Feb 2011 16:39:07 +0000 (08:39 -0800)]
init-ceph: fix status for multi-node clusters

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agotest/osd: Fix indentation on RadosModel.h and TestSnaps.cc
Samuel Just [Tue, 15 Feb 2011 23:24:16 +0000 (15:24 -0800)]
test/osd: Fix indentation on RadosModel.h and TestSnaps.cc

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agotestsnaps: add snapshot test
Samuel Just [Tue, 15 Feb 2011 21:48:57 +0000 (13:48 -0800)]
testsnaps: add snapshot test

Uses RadosModel.h to check the results of a randomized sequence of
writes, reads, snapshots, snapshot removals, and rollbacks for errors.
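
The general shape of such a model-checked randomized test, as a toy Python sketch. `SnapStore` stands in for the system under test and the plain variables are the model; none of this reflects the actual RadosModel.h interface.

```python
import random


class SnapStore:
    """Toy system under test: one object value plus named snapshots."""

    def __init__(self):
        self.value = None
        self.snaps = {}

    def write(self, value):
        self.value = value

    def read(self):
        return self.value

    def snap_create(self, name):
        self.snaps[name] = self.value

    def snap_remove(self, name):
        del self.snaps[name]

    def rollback(self, name):
        self.value = self.snaps[name]


def run_random_ops(seed: int, steps: int = 200) -> bool:
    """Apply random ops to the store and a model; return False on any mismatch."""
    rng = random.Random(seed)
    store = SnapStore()
    model_value, model_snaps, next_snap = None, {}, 0
    for _ in range(steps):
        op = rng.choice(["write", "read", "snap", "rmsnap", "rollback"])
        if op == "write":
            model_value = rng.randrange(1000)
            store.write(model_value)
        elif op == "read":
            if store.read() != model_value:
                return False
        elif op == "snap":
            name = f"s{next_snap}"
            next_snap += 1
            store.snap_create(name)
            model_snaps[name] = model_value
        elif model_snaps:  # rmsnap / rollback need an existing snapshot
            name = rng.choice(sorted(model_snaps))
            if op == "rmsnap":
                store.snap_remove(name)
                del model_snaps[name]
            else:
                store.rollback(name)
                model_value = model_snaps[name]
    return store.read() == model_value
```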

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agoMerge branch 'dout2'
Colin Patrick McCabe [Tue, 15 Feb 2011 13:49:50 +0000 (05:49 -0800)]
Merge branch 'dout2'

14 years agoJournaler: add some checks for expire_pos.
Greg Farnum [Tue, 15 Feb 2011 17:00:06 +0000 (09:00 -0800)]
Journaler: add some checks for expire_pos.

I don't think these are necessary checks, but the expire_pos >= trim_pos
invariant got broken somehow by johnl, and these checks won't hurt!

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agoJournaler: call set_layout after init_headers.
Greg Farnum [Tue, 15 Feb 2011 16:58:48 +0000 (08:58 -0800)]
Journaler: call set_layout after init_headers.

set_layout modifies last_committed, but then init_headers
uses operator= and overwrites those changes. In this case
it doesn't matter as they're both writing the same changes,
but make the ordering explicit for the future.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agoOSD: ignore osd_max_write_size if it's set to 0.
Greg Farnum [Mon, 14 Feb 2011 20:05:05 +0000 (12:05 -0800)]
OSD: ignore osd_max_write_size if it's set to 0.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agocommon/lockdep.cc: don't use dout unlocked
Colin Patrick McCabe [Mon, 14 Feb 2011 15:31:27 +0000 (07:31 -0800)]
common/lockdep.cc: don't use dout unlocked

Lockdep should use the regular dout() interfaces, rather than going
around them. In particular, we shouldn't output to dout without taking
dout_lock.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoassert: allow assertions inside calls to dout()
Colin Patrick McCabe [Mon, 14 Feb 2011 15:20:57 +0000 (07:20 -0800)]
assert: allow assertions inside calls to dout()

We should handle the situation where we assert() while already holding
the dout() lock. At the same time, we want to get the dout lock if we
can, because it makes the logs look nicer. pthread_mutex_trylock solves
the dilemma.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agodout: Convert _dout_lock to plain pthread_mutex_t
Colin Patrick McCabe [Mon, 14 Feb 2011 13:56:52 +0000 (05:56 -0800)]
dout: Convert _dout_lock to plain pthread_mutex_t

Convert _dout_lock to plain pthread_mutex_t. This way, we don't have to
depend on the order of global constructor initialization. It should also
be slightly more efficient. The dout_lock was never subject to lockdep
anyway, so that's not an issue.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoassert.cc: some cleanup
Colin Patrick McCabe [Mon, 14 Feb 2011 13:47:35 +0000 (05:47 -0800)]
assert.cc: some cleanup

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agocommon/debug.h: use std::string rather than string
Colin Patrick McCabe [Mon, 14 Feb 2011 13:47:00 +0000 (05:47 -0800)]
common/debug.h: use std::string rather than string

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agocommon: Remove common/tls.cc
Colin Patrick McCabe [Mon, 14 Feb 2011 13:14:36 +0000 (05:14 -0800)]
common: Remove common/tls.cc

Using ELF TLS via the __thread keyword is much faster than using
pthread_getspecific and pthread_setspecific. It's also much nicer
looking syntactically. Finally, the __thread keyword is going to be
standardized in C++0x. So there's no reason to have an infrastructure
dependent on pthread_getspecific.

There were no users so this shouldn't affect anything negatively.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agodout: use DoutLocker rather than Mutex::Locker
Colin Patrick McCabe [Mon, 14 Feb 2011 12:10:17 +0000 (04:10 -0800)]
dout: use DoutLocker rather than Mutex::Locker

Use DoutLocker rather than Mutex::Locker, in preparation for making the
dout_lock a plain old pthread_mutex_t.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoos/FileStore: use derr/dendl for dout locking
Colin Patrick McCabe [Mon, 14 Feb 2011 12:06:54 +0000 (04:06 -0800)]
os/FileStore: use derr/dendl for dout locking

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoos/FileStore: use ceph_abort rather than abort
Colin Patrick McCabe [Mon, 14 Feb 2011 12:05:47 +0000 (04:05 -0800)]
os/FileStore: use ceph_abort rather than abort

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agolockdep: balance dout and dendl, fix whitespace
Colin Patrick McCabe [Mon, 14 Feb 2011 11:30:55 +0000 (03:30 -0800)]
lockdep: balance dout and dendl, fix whitespace

Make lockdep use dendl the same way as the other code. This is in
preparation for making lockdep use normal dout() rather than an unlocked
version.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoJournaler: fix bad assert.
Greg Farnum [Sun, 13 Feb 2011 22:16:36 +0000 (14:16 -0800)]
Journaler: fix bad assert.

We can call reread_head during normal replay under
certain circumstances. So add the REREAD_HEAD state
as allowed.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agotestlibrbd: fix printf args
Sage Weil [Sat, 12 Feb 2011 21:23:20 +0000 (13:23 -0800)]
testlibrbd: fix printf args

Stupid me!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agocommon: create dout_emergency interface and use it
Colin Patrick McCabe [Sat, 12 Feb 2011 17:56:03 +0000 (09:56 -0800)]
common: create dout_emergency interface and use it

Create the dout_emergency interface, which is safe to call from a signal
handler or from inside dout itself.
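
As a rough analog of what "safe to call from a signal handler" tends to mean: write straight to a file descriptor with no locks and no buffered stream in between. The function below is a hypothetical Python illustration of that pattern, not the dout_emergency implementation, and Python signal handlers do not run in a true async-signal context.

```python
import os


def emergency_log(msg: str, fd: int = 2) -> None:
    """Write `msg` directly to `fd` (stderr by default): no locks, no buffering."""
    data = msg.encode("utf-8", errors="replace")
    while data:
        try:
            written = os.write(fd, data)
        except OSError:
            return  # nothing more can safely be done here
        data = data[written:]
```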

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agocfuse: use safe_read and check return value
Sage Weil [Sat, 12 Feb 2011 06:57:29 +0000 (22:57 -0800)]
cfuse: use safe_read and check return value

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agotestlibrbd: check return values
Sage Weil [Sat, 12 Feb 2011 07:06:07 +0000 (23:06 -0800)]
testlibrbd: check return values

Stupid printf!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agodebian: add python, python-dev build-deps
Sage Weil [Sat, 12 Feb 2011 06:47:51 +0000 (22:47 -0800)]
debian: add python, python-dev build-deps

Might be overkill?  The error I see from pbuilder is

checking for a Python interpreter with version >= 2.4... none
error: configure: in `/tmp/buildd/ceph-0.24.3-676-gcde53e9':
error: configure: Failed to find Python 2.4 or newer

...but I'm guessing python-dev is needed too?

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomsgr: clean up Pipe::queue_received locking
Sage Weil [Fri, 11 Feb 2011 05:09:42 +0000 (21:09 -0800)]
msgr: clean up Pipe::queue_received locking

Ensure we maintain the invariant that a pipe has a non-empty queue IFF
the pipe is queued.

Prompted by #798.
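
The invariant can be shown with a single-threaded toy in Python. The class names mirror the commit's vocabulary but the code is hypothetical, and the real messenger additionally has to hold the right locks around these steps.

```python
from collections import deque


class Pipe:
    def __init__(self):
        self.in_q = deque()
        self.queued = False  # is this pipe on the dispatcher's ready list?


class Dispatcher:
    def __init__(self):
        self.ready = deque()

    def queue_received(self, pipe: Pipe, msg) -> None:
        pipe.in_q.append(msg)
        if not pipe.queued:
            pipe.queued = True
            self.ready.append(pipe)
        self._check(pipe)

    def dispatch_one(self):
        pipe = self.ready.popleft()
        msg = pipe.in_q.popleft()
        if pipe.in_q:
            self.ready.append(pipe)  # more work pending: stays queued
        else:
            pipe.queued = False
        self._check(pipe)
        return msg

    @staticmethod
    def _check(pipe: Pipe) -> None:
        # Invariant: non-empty queue if and only if the pipe is queued.
        assert bool(pipe.in_q) == pipe.queued
```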

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoMDCache: switch CDir::_commit so that it can limit max write size.
Greg Farnum [Sat, 12 Feb 2011 00:58:18 +0000 (16:58 -0800)]
MDCache: switch CDir::_commit so that it can limit max write size.

This should fix #777.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoMDCache: add max_dir_commit_size.
Greg Farnum [Fri, 11 Feb 2011 23:54:18 +0000 (15:54 -0800)]
MDCache: add max_dir_commit_size.

Configured by setting mds_dir_max_commit_size in conf, or else
by looking at osd_max_write_size. This should lead to sane
max commits even if the user doesn't specify anything.

This will be used in the next commit or two by CDir.
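
A sketch of the fallback logic, with loudly stated assumptions: both settings are taken to be in megabytes, 0 is taken to mean "unset", and the 90% safety margin is invented for the example. The message only says the default is derived "by looking at osd_max_write_size".

```python
MB = 1024 * 1024


def max_dir_commit_size(mds_dir_max_commit_size_mb: int,
                        osd_max_write_size_mb: int,
                        percent_of_osd_limit: int = 90) -> int:
    """Return the directory commit size limit in bytes.

    percent_of_osd_limit is an assumed margin, not a value from the source.
    """
    if mds_dir_max_commit_size_mb:  # explicit setting wins
        return mds_dir_max_commit_size_mb * MB
    # Otherwise stay safely below the largest write the OSD will accept.
    return osd_max_write_size_mb * MB * percent_of_osd_limit // 100
```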

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoMDS: Don't always _commit_full just because we have a complete dir.
Greg Farnum [Thu, 10 Feb 2011 21:23:44 +0000 (13:23 -0800)]
MDS: Don't always _commit_full just because we have a complete dir.

Instead, commit if a certain percentage of the dentries are dirty.
Configurable via mds_dir_commit_ratio!
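
The decision can be sketched as a small predicate. The function name, the use of `>=`, and the requirement that the directory be complete are assumptions made for the example; only the idea of comparing the dirty fraction against mds_dir_commit_ratio comes from the message.

```python
def should_commit_full(num_dirty: int, num_total: int,
                       commit_ratio: float, dir_complete: bool) -> bool:
    """Decide between a full commit and a partial one.

    A full commit is chosen only when the directory is complete AND the
    fraction of dirty dentries reaches `commit_ratio` (assumed threshold rule).
    """
    if not dir_complete or num_total == 0:
        return False
    return num_dirty / num_total >= commit_ratio
```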

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoMerge remote branch 'origin/librbd'
Sage Weil [Fri, 11 Feb 2011 21:29:32 +0000 (13:29 -0800)]
Merge remote branch 'origin/librbd'

14 years agoobjecter: set linger op target pg when a linger is resent
Josh Durgin [Fri, 11 Feb 2011 21:21:05 +0000 (13:21 -0800)]
objecter: set linger op target pg when a linger is resent

send_linger always creates a new Op, but op_submit does not fill in
the target pg if an existing session is passed in, so when a linger
was resent, it had the wrong pg set.

This caused a crash in cosd with debugging turned on when running
testlibrbd twice. This occurred because the object context for the
linger in the wrong pg had no object name set.

14 years agoDisable lockdep for ExportControl, ConfFile locks
Colin Patrick McCabe [Fri, 11 Feb 2011 18:58:03 +0000 (10:58 -0800)]
Disable lockdep for ExportControl, ConfFile locks

Currently, we haven't read the configuration at the time we initialize
these locks. So we can't know whether lockdep has been enabled, or what
verbosity it is supposed to have. So just disable it on these locks.

Potentially ExportControl's initialization could be moved to after
g_conf.lockdep and g_conf.debug_lockdep have been read from the
configuration, if lockdep is needed for this component.

ConfFile probably doesn't need a lock at all, but that's another story.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoMDCache: switch CDir::_commit so that it can limit max write size.
Greg Farnum [Sat, 12 Feb 2011 00:58:18 +0000 (16:58 -0800)]
MDCache: switch CDir::_commit so that it can limit max write size.

This should fix #777.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoMDCache: add max_dir_commit_size.
Greg Farnum [Fri, 11 Feb 2011 23:54:18 +0000 (15:54 -0800)]
MDCache: add max_dir_commit_size.

Configured by setting mds_dir_max_commit_size in conf, or else
by looking at osd_max_write_size. This should lead to sane
max commits even if the user doesn't specify anything.

This will be used in the next commit or two by CDir.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years agoMDS: Don't always _commit_full just because we have a complete dir.
Greg Farnum [Thu, 10 Feb 2011 21:23:44 +0000 (13:23 -0800)]
MDS: Don't always _commit_full just because we have a complete dir.

Instead, commit if a certain percentage of the dentries are dirty.
Configurable via mds_dir_commit_ratio!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
14 years ago.gitignore: py-compile
Sage Weil [Fri, 11 Feb 2011 23:38:40 +0000 (15:38 -0800)]
.gitignore: py-compile

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>