git.apps.os.sepia.ceph.com Git - ceph.git/log
14 years agoTeach run-cli-tests about build dirs.
Tommi Virtanen [Thu, 13 Jan 2011 20:59:11 +0000 (12:59 -0800)]
Teach run-cli-tests about build dirs.

An optional argument can tell it where to put
generated files (in this case, virtualenv). Provide
the argument in Makefile.am.

Options are still passed to cram, so you can say
"./src/test/run-cli-tests -i".

14 years agoRename variable in run-cli-tests.
Tommi Virtanen [Thu, 13 Jan 2011 20:52:26 +0000 (12:52 -0800)]
Rename variable in run-cli-tests.

Emphasize the fact that the path is the source dir, not the build dir.

14 years agorun-cli-tests is in srcdir not in build dir.
Tommi Virtanen [Thu, 13 Jan 2011 20:50:29 +0000 (12:50 -0800)]
run-cli-tests is in srcdir not in build dir.

Found by "make distcheck".

14 years agoInclude run-cli-tests in release tarball.
Tommi Virtanen [Thu, 13 Jan 2011 20:49:55 +0000 (12:49 -0800)]
Include run-cli-tests in release tarball.

Found by "make distcheck".

14 years agoMerge branch 'tests-broken' into unstable
Tommi Virtanen [Fri, 14 Jan 2011 23:22:01 +0000 (15:22 -0800)]
Merge branch 'tests-broken' into unstable

14 years agoFix clitests for cconf usage change.
Tommi Virtanen [Fri, 14 Jan 2011 23:21:46 +0000 (15:21 -0800)]
Fix clitests for cconf usage change.

14 years agoMerge branch 'tests-broken' into unstable
Tommi Virtanen [Fri, 14 Jan 2011 23:07:41 +0000 (15:07 -0800)]
Merge branch 'tests-broken' into unstable

14 years agoFix clitests for cauthtool usage change.
Tommi Virtanen [Fri, 14 Jan 2011 23:06:35 +0000 (15:06 -0800)]
Fix clitests for cauthtool usage change.

14 years agoMerge commit 'cfae10b8f8b0d91f37dc6eb72f3b3f8285bb15e7' into tests-broken-2
Tommi Virtanen [Fri, 14 Jan 2011 23:04:16 +0000 (15:04 -0800)]
Merge commit 'cfae10b8f8b0d91f37dc6eb72f3b3f8285bb15e7' into tests-broken-2

14 years agoPlaintext keyring format is supposed to be user-friendly, so test it.
Tommi Virtanen [Fri, 14 Jan 2011 23:01:11 +0000 (15:01 -0800)]
Plaintext keyring format is supposed to be user-friendly, so test it.

14 years agoNow that cauthtool has two kinds of keyrings, test them both.
Tommi Virtanen [Fri, 14 Jan 2011 23:00:47 +0000 (15:00 -0800)]
Now that cauthtool has two kinds of keyrings, test them both.

14 years agoFix a bug where "cauthtool --create-keyring" (no --bin) wrote garbage.
Tommi Virtanen [Fri, 14 Jan 2011 22:24:50 +0000 (14:24 -0800)]
Fix a bug where "cauthtool --create-keyring" (no --bin) wrote garbage.

This only triggered when running without --gen-key or --add-key.

14 years agounit tests: do standard ceph init before tests
Colin Patrick McCabe [Fri, 14 Jan 2011 13:57:36 +0000 (05:57 -0800)]
unit tests: do standard ceph init before tests

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoqa: Add tiobench test
Colin Patrick McCabe [Fri, 14 Jan 2011 12:38:13 +0000 (04:38 -0800)]
qa: Add tiobench test

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agovstart.sh: don't depend on /usr/bin/host
Colin Patrick McCabe [Fri, 14 Jan 2011 11:12:38 +0000 (03:12 -0800)]
vstart.sh: don't depend on /usr/bin/host

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agomds: use common helper to journal a client session close
Sage Weil [Fri, 14 Jan 2011 06:08:56 +0000 (22:08 -0800)]
mds: use common helper to journal a client session close

We saw a bug where an ESession close was followed by an EMetaBlob on that
session (see 6d0dc4bf64b2792d6fc007268c5a42ae4e2e583c).  My best guess is
that a session timeout raced with a request waiting on locks (only the
explicit client close path was calling request_kill).  To avoid that,
introduce a helper to journal client close so that the common work (killing
any pending requests AND releasing prealloc inos) happens in all cases.

Fixes #708 (I hope!).

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agocconf: fix usage parsing, add --resolve search
Yehuda Sadeh [Fri, 14 Jan 2011 00:09:16 +0000 (16:09 -0800)]
cconf: fix usage parsing, add --resolve search

--resolve-search is used to resolve a search path result

14 years agokeyring: default keyring file name is 'keyring'
Yehuda Sadeh [Fri, 14 Jan 2011 00:10:27 +0000 (16:10 -0800)]
keyring: default keyring file name is 'keyring'

Update the tools, scripts, and man page accordingly.

14 years agocauthtool: default keyring format is plaintext, add --bin
Yehuda Sadeh [Wed, 12 Jan 2011 22:51:12 +0000 (14:51 -0800)]
cauthtool: default keyring format is plaintext, add --bin

14 years agoconfig: keyring uses a search path again
Yehuda Sadeh [Fri, 14 Jan 2011 00:08:22 +0000 (16:08 -0800)]
config: keyring uses a search path again

14 years agocommon: fix buffer::list::decode_base64
Colin Patrick McCabe [Thu, 13 Jan 2011 19:19:27 +0000 (11:19 -0800)]
common: fix buffer::list::decode_base64

buffer::list::decode_base64 needs to check for decode failures.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
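To illustrate the kind of check this commit message describes, here is a minimal Python sketch. It is not Ceph's C++ code: `decode_base64_checked` is a hypothetical name, and the point is only that malformed input should surface as an error rather than pass silently.

```python
import base64
import binascii


def decode_base64_checked(text: str) -> bytes:
    """Decode base64 text, raising ValueError on malformed input."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(text, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"invalid base64 input: {exc}") from exc
```

A caller would then handle `ValueError` explicitly instead of trusting whatever bytes came back.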
14 years agoqa: add xattr check
Sage Weil [Thu, 13 Jan 2011 23:47:30 +0000 (15:47 -0800)]
qa: add xattr check

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoMerge branch 'testing' into unstable
Sage Weil [Thu, 13 Jan 2011 21:24:52 +0000 (13:24 -0800)]
Merge branch 'testing' into unstable

Conflicts:
configure.ac

14 years agofilejournal: rewrite completion handling, fix ordering on full->notfull
Sage Weil [Thu, 13 Jan 2011 21:14:24 +0000 (13:14 -0800)]
filejournal: rewrite completion handling, fix ordering on full->notfull

Rewrite the completion handling to be simpler and clearer, so that it is
easier to maintain a strict completion ordering invariant.

This also fixes an ordering bug: When restarting journal, we defer
initially until we get a committed_thru from the previous commit and then
do all those completions.  That same logic needs to also apply to new items
submitted during that commit interval.  This was broken before, but the
simpler structure fixes it.  Fixes #666.

Tested-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Sage Weil <sage@newdream.net>
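The ordering invariant described above can be sketched as a toy model in Python. This is an assumption-laden illustration, not the FileJournal implementation: the class `CompletionQueue` and its method names are hypothetical. It shows completions being deferred until the previous commit reports in, with items submitted during that interval queued behind older ones.

```python
from collections import deque


class CompletionQueue:
    """Toy model of strictly ordered journal completions (hypothetical)."""

    def __init__(self):
        self._queue = deque()    # (seq, callback) in submission order
        self._written = set()    # seqs whose journal write has finished
        self._deferring = True   # True until the previous commit reports in

    def submit(self, seq, callback):
        # Items submitted while deferring queue up behind older ones.
        self._queue.append((seq, callback))

    def mark_written(self, seq):
        self._written.add(seq)
        self._drain()

    def on_committed_thru(self):
        # The previous commit has finished; stop deferring completions.
        self._deferring = False
        self._drain()

    def _drain(self):
        if self._deferring:
            return
        # Only ever complete the head of the queue, so order is preserved.
        while self._queue and self._queue[0][0] in self._written:
            seq, callback = self._queue.popleft()
            self._written.discard(seq)
            callback()
```

Because only the head of the queue is ever completed, an entry whose write finishes early still waits for every older entry.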
14 years agoPG: activate should not enqueue snap_trimmer on a replica
Samuel Just [Thu, 13 Jan 2011 20:18:17 +0000 (12:18 -0800)]
PG: activate should not enqueue snap_trimmer on a replica

Previously, activate would queue_snap_trim() for replicas if snap_trimq
ended up non-empty, guaranteeing a crash for any replica starting up
while purged_snaps lagged behind pool->cached_removed_snaps.

This should fix #702.

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agoFix confusing comment about gtest automake hookup.
Tommi Virtanen [Thu, 13 Jan 2011 19:32:16 +0000 (11:32 -0800)]
Fix confusing comment about gtest automake hookup.

14 years agounit: add IncorrectBase64Encoding test
Colin Patrick McCabe [Thu, 13 Jan 2011 18:34:35 +0000 (10:34 -0800)]
unit: add IncorrectBase64Encoding test

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agounit: Add test/base64.cc
Colin Patrick McCabe [Thu, 13 Jan 2011 18:23:49 +0000 (10:23 -0800)]
unit: Add test/base64.cc

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoAdd a target to build but not run unittests.
Tommi Virtanen [Thu, 13 Jan 2011 17:50:46 +0000 (09:50 -0800)]
Add a target to build but not run unittests.

Use with "make -C src unittests".

14 years agoReplicatedPG: Fix oi.size bug in _rollback_to
Samuel Just [Wed, 12 Jan 2011 23:09:51 +0000 (15:09 -0800)]
ReplicatedPG: Fix oi.size bug in _rollback_to

_rollback_to calls _delete_head before cloning the clone into place.
_delete_head sets the object info size to 0.  _rollback_to now resets
the size to match the rolled back object.  Previously, this bug
manifested as a failed assert in scrub when checking the object sizes.

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agoReplicatedPG: register_object_context and register_snapset_context cleanup
Samuel Just [Wed, 12 Jan 2011 21:51:55 +0000 (13:51 -0800)]
ReplicatedPG: register_object_context and register_snapset_context cleanup

Previously, get_object_context and get_snapset_context did not register
the resulting objects.  In some cases, these objects would not get
registered and multiple copies would end up being created.  This caused a bug
in find_object_context where get_snapset_context could return an object
distinct from the one referenced by the object returned from
get_object_context.

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agoFix src/test/run-cli-tests to work with any cwd.
Tommi Virtanen [Wed, 12 Jan 2011 21:28:11 +0000 (13:28 -0800)]
Fix src/test/run-cli-tests to work with any cwd.

14 years agoReplicatedPG: snap_trimmer work around
Samuel Just [Wed, 12 Jan 2011 20:07:44 +0000 (12:07 -0800)]
ReplicatedPG: snap_trimmer work around

Currently, an OSD bug is causing snap_trimq to contain some snaps
already in purged_snaps.  This workaround should let kvmtest
come back up.  A real fix is still needed.

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agoMerge branch 'clitests-3' into unstable
Sage Weil [Wed, 12 Jan 2011 20:04:16 +0000 (12:04 -0800)]
Merge branch 'clitests-3' into unstable

14 years agoAdjust clitests after cauthtool changes.
Tommi Virtanen [Wed, 12 Jan 2011 19:10:24 +0000 (11:10 -0800)]
Adjust clitests after cauthtool changes.

14 years agoMerge commit '735eb400dc617c599f8cb42af91bab00931eeaff' into clitests-z
Tommi Virtanen [Wed, 12 Jan 2011 18:58:52 +0000 (10:58 -0800)]
Merge commit '735eb400dc617c599f8cb42af91bab00931eeaff' into clitests-z

14 years agoAdjust clitests after cauthtool changes.
Tommi Virtanen [Wed, 12 Jan 2011 18:52:46 +0000 (10:52 -0800)]
Adjust clitests after cauthtool changes.

14 years agoMerge commit 'e9a70f15029d397ebf0414e5f16fda321af5f55b' into clitests-4
Tommi Virtanen [Wed, 12 Jan 2011 18:49:27 +0000 (10:49 -0800)]
Merge commit 'e9a70f15029d397ebf0414e5f16fda321af5f55b' into clitests-4

14 years agoFix osdmaptool error reporting.
Tommi Virtanen [Wed, 12 Jan 2011 18:24:08 +0000 (10:24 -0800)]
Fix osdmaptool error reporting.

14 years agouclient: Switch how inodes link to dentries a bit.
Greg Farnum [Tue, 4 Jan 2011 21:32:47 +0000 (13:32 -0800)]
uclient: Switch how inodes link to dentries a bit.

Inodes now have a set of parent dentries, rather than a single
pointer. This allows the cache to accurately represent multiple
hard links.
Various minor adjustments were made so that this change in
format works and is error checked.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agoauth: change the plaintext keyring format
Yehuda Sadeh [Tue, 11 Jan 2011 22:51:19 +0000 (14:51 -0800)]
auth: change the plaintext keyring format

14 years agoRevert "client: Remove the I_COMPLETE flag from the parent directory in relink_inode."
Greg Farnum [Tue, 4 Jan 2011 21:34:52 +0000 (13:34 -0800)]
Revert "client: Remove the I_COMPLETE flag from the parent directory in relink_inode."

This reverts commit c43455cee4b7b45de6bd04454a40bc7016f2d6d1. We don't
need this fix any more since we now handle hard links properly!

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agoconf: ConfFile can parse bufferlists
Yehuda Sadeh [Tue, 11 Jan 2011 22:06:33 +0000 (14:06 -0800)]
conf: ConfFile can parse bufferlists

14 years agoosd: avoid creating some temporary coll_t objects
Colin Patrick McCabe [Mon, 3 Jan 2011 05:11:07 +0000 (21:11 -0800)]
osd: avoid creating some temporary coll_t objects

PG::coll caches the value of coll_t(this->info.pgid). So use PG::coll
when appropriate rather than constructing a new object.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: de-globalize PG::oldest_update
Colin Patrick McCabe [Tue, 28 Dec 2010 23:55:24 +0000 (15:55 -0800)]
osd: de-globalize PG::oldest_update

Making oldest_update a class variable complicates log merging and wastes
space in the PG struct. Even though memory is big, cachelines are still
small. Just calculate it when we need it.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: clean up loop in proc_replica_log
Colin Patrick McCabe [Tue, 28 Dec 2010 23:48:53 +0000 (15:48 -0800)]
osd: clean up loop in proc_replica_log

We don't need to update lu on (almost) every iteration, only on the
final one. Use a const iterator.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: remove unused PG global
Colin Patrick McCabe [Tue, 28 Dec 2010 22:27:25 +0000 (14:27 -0800)]
osd: remove unused PG global

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: call prior_set_affected only if we have one
Colin Patrick McCabe [Mon, 27 Dec 2010 21:53:52 +0000 (13:53 -0800)]
osd: call prior_set_affected only if we have one

Don't call prior_set_affected if the prior set hasn't been built. This
will be the case unless we're a primary doing peering.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: prevent PG objects from being copied
Colin Patrick McCabe [Mon, 27 Dec 2010 20:39:23 +0000 (12:39 -0800)]
osd: prevent PG objects from being copied

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: Put all prior_set fields into a struct
Colin Patrick McCabe [Mon, 27 Dec 2010 19:44:29 +0000 (11:44 -0800)]
osd: Put all prior_set fields into a struct

Keep all the prior set stuff together.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoRemove outdated TODO note.
Tommi Virtanen [Wed, 12 Jan 2011 17:18:55 +0000 (09:18 -0800)]
Remove outdated TODO note.

The test originally used cat <<EOF, which made
the mon$id in the config file get expanded at
that time.

14 years agoAdd CLI tests for osdmaptool and friends.
Tommi Virtanen [Wed, 12 Jan 2011 00:43:46 +0000 (16:43 -0800)]
Add CLI tests for osdmaptool and friends.

Uses a python package "cram" as test runner.
Requires PIP (python-pip.deb) installed on the
build machine, to actually run these tests.

The cram application itself is included as a
tarball that gets installed in a virtualenv
when the tests are run. cram is GPL.

14 years agoGit ignored files cleanup.
Tommi Virtanen [Tue, 11 Jan 2011 22:02:16 +0000 (14:02 -0800)]
Git ignored files cleanup.

Make gitignore entries not match recursively.

I wanted to introduce a directory "osdmaptool" to contain cli tests
for that tool, but all the files there were ignored because of these
rules.  Better be explicit about what you want ignored.

Move all ignores for generated binaries to be together.

Fixed "testecph" typo.

Added ignores for: testdout_streambuf testsignal_handlers testtimers.

14 years agoosd: OSD::queue_pg_for_deletion: avoid double del
Colin Patrick McCabe [Tue, 11 Jan 2011 18:15:02 +0000 (10:15 -0800)]
osd: OSD::queue_pg_for_deletion: avoid double del

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agomds: avoid double-pinning stray inodes
Sage Weil [Tue, 11 Jan 2011 17:50:20 +0000 (09:50 -0800)]
mds: avoid double-pinning stray inodes

We make multiple iterations through populate_mydir().  Only pin each stray
once.  Fixes #689 and crashes like

mds/CInode.h: In function 'virtual void CInode::bad_get(int)':
mds/CInode.h:1088: FAILED assert(ref_set.count(by) == 0)
ceph version 0.24 (180a4176035521940390f4ce24ee3eb7aa290632)
1: (CInode::bad_put(int)+0) [0x827b090]
2: (MDSCacheObject::get(int)+0x153) [0x813e463]
3: (MDCache::populate_mydir()+0x8a) [0x81a7e5a]
4: (MDCache::_create_system_file_finish(Mutation*, CDentry*,
Context*)+0x181) [0x819f501]
5: (C_MDC_CreateSystemFile::finish(int)+0x29) [0x81d6c29]
6: (finish_contexts(std::list<Context*, std::allocator<Context*> >&,
int)+0x6b) [0x81d663b]
7: (Journaler::_finish_flush(int, long long, utime_t, bool)+0x983) [0x82f2f53]
8: (Journaler::C_Flush::finish(int)+0x3f) [0x82fb24f]
9: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x801) [0x82d8e31]
10: (MDS::_dispatch(Message*)+0x2ae5) [0x80eaa15]
11: (MDS::ms_dispatch(Message*)+0x62) [0x80eb142]
12: (SimpleMessenger::dispatch_entry()+0x899) [0x80b8649]
13: (SimpleMessenger::DispatchThread::entry()+0x22) [0x80b30f2]

Signed-off-by: Sage Weil <sage@newdream.net>
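The idea of pinning each stray only once across repeated passes can be sketched in Python. This is a simplified model under stated assumptions: `StrayInode`, `PIN_STRAY`, and this `populate_mydir` are hypothetical stand-ins, and the real fix may track the pin differently.

```python
class StrayInode:
    PIN_STRAY = "stray"  # hypothetical pin label

    def __init__(self, name):
        self.name = name
        self.ref_set = set()

    def get(self, by):
        # Mirrors the failed assert in the trace: a given pin is taken once.
        assert by not in self.ref_set, f"{self.name} already pinned by {by}"
        self.ref_set.add(by)


def populate_mydir(strays):
    """May run several times; pin each stray only on the first pass."""
    for inode in strays:
        if StrayInode.PIN_STRAY not in inode.ref_set:
            inode.get(StrayInode.PIN_STRAY)
```

Calling `populate_mydir` twice on the same list leaves exactly one pin per inode instead of tripping the assert.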
14 years agodebian: make update_pbuilder.sh a bit smarter
Sage Weil [Sat, 8 Jan 2011 23:41:20 +0000 (15:41 -0800)]
debian: make update_pbuilder.sh a bit smarter

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agokeyring: can parse plain text keyring files
Yehuda Sadeh [Mon, 10 Jan 2011 23:50:26 +0000 (15:50 -0800)]
keyring: can parse plain text keyring files

14 years agoReplicatedPG: Fix bug in rollback
Samuel Just [Mon, 10 Jan 2011 22:45:06 +0000 (14:45 -0800)]
ReplicatedPG: Fix bug in rollback

Previously, _rollback_to assumed that the rollback was a noop if
ctx->clone_obc was set and its prior version matched head's version.
However, this broke in sequences like:

Write "snap1 contents" to oid "blah"
create snapshot "snap1"
Write "snap2 contents" to oid "blah"
create snapshot "snap2"
rollback oid "blah" to snapshot "snap1"

In this case, make_writeable would have just cloned head to the snap2
clone, but the relevant clone is actually "snap1".  _rollback_to now
verifies that the most recent clone is the correct one before assuming
that head is already correct.

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agoPort encoding tests over to gtest.
Tommi Virtanen [Mon, 10 Jan 2011 19:00:15 +0000 (11:00 -0800)]
Port encoding tests over to gtest.

14 years agoUse Google Test framework for unit tests.
Tommi Virtanen [Fri, 7 Jan 2011 21:15:40 +0000 (13:15 -0800)]
Use Google Test framework for unit tests.

Use ``make check`` to run the tests.

The src/gtest directory comes from ``svn export
http://googletest.googlecode.com/svn/tags/release-1.5.0 src/gtest``
and running "git add -f src/gtest".

gtest is licensed under the New BSD license, see src/gtest/COPYING.
For more on Google Test, see http://code.google.com/p/googletest/

Changed autogen.sh to regenerate gtest automake files too. Make sure to
run ``./autogen.sh && ./configure`` after merging this commit, or
incremental builds may fail. The automake integration is inspired
heavily by the protobuf project, and may still be problematic.

Make git ignore files generated by gtest compilation.

Currently putting in just one new-style unit test; refactoring old
tests to fit will come in separate commits.

Note: if you are starting daemons, listening on TCP ports, using
multiple machines, mounting filesystems, etc, it's not a unit test
and does not belong in this setup. A framework for system/integration
tests will be provided later.

14 years agoMake git ignore generated files.
Tommi Virtanen [Mon, 10 Jan 2011 18:48:20 +0000 (10:48 -0800)]
Make git ignore generated files.

14 years agoos: don't crash on no-journal case
Colin Patrick McCabe [Sun, 9 Jan 2011 21:34:40 +0000 (13:34 -0800)]
os: don't crash on no-journal case

JournalingObjectStore::commit_start should handle the case where journal is
null. This will occur if the user doesn't configure a journal.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agov0.24.1 v0.24.1
Sage Weil [Sat, 8 Jan 2011 00:50:15 +0000 (16:50 -0800)]
v0.24.1

14 years agotest_split.sh: add many_pools test
Colin Patrick McCabe [Fri, 7 Jan 2011 23:01:42 +0000 (15:01 -0800)]
test_split.sh: add many_pools test

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoReplicatedPG: get_object_context ssc refcount leak
Samuel Just [Fri, 7 Jan 2011 22:23:04 +0000 (14:23 -0800)]
ReplicatedPG: get_object_context ssc refcount leak

If obc->obs.ssc is non-null, the second get_snapset_context ends up
leaking a snapset reference.

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agomds: fix _dout_lock recursion recursion
Sage Weil [Fri, 7 Jan 2011 22:17:21 +0000 (14:17 -0800)]
mds: fix _dout_lock recursion recursion

The get_snaps() method also writes something to dout.  We need to take care to
not do that as part of the ostream operator<< chain.  Fixes #684.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: take rdlocks on bounding dftlocks; clean up migrator lock code
Sage Weil [Thu, 6 Jan 2011 22:28:01 +0000 (14:28 -0800)]
mds: take rdlocks on bounding dftlocks; clean up migrator lock code

We need to take an rdlock on bounding dirfrags during migration for a
rather irritating reason: when we export the bound inode, we need to send
scatterlock state for the dirfrags as well, so that the new auth also gets
the correct info.  If we race with a refragment, this info is useless, as
we can't redivvy it up.  And it's needed for the scatterlocks to work
properly: when the auth is in a sync/lock state it keeps each dirfrag's
portion in the local (auth OR replica) dirfrag.

So: take a rdlock on the bounding dirfrags to avoid this.  Clean up the
Locker bulk rdlock interface while we're at it to be more general and
useful.

Also, while we're here, do an rdlock_try at this point.  Note that we still
are going to fail more often than before, since dftlocks will frequently
be scattered if there has been a recent fragmentation.  There is some
inevitable conflict here between refragmentation (which wants dftlock
in MIX) and exports (which want it SYNC).  TODO.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: make thrash_exports select random frags
Sage Weil [Thu, 6 Jan 2011 21:42:29 +0000 (13:42 -0800)]
mds: make thrash_exports select random frags

We were always picking the first frag.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: force dirfrag fragmentation when replaying metablob
Sage Weil [Thu, 6 Jan 2011 19:49:56 +0000 (11:49 -0800)]
mds: force dirfrag fragmentation when replaying metablob

We can have non-auth (and thus ambiguously fragmented) dirs in our cache.
When those get replayed, adjust our fragmentation as needed.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoReplicatedPG: register_object_context and register_snapset_context cleanup
Samuel Just [Fri, 7 Jan 2011 20:21:09 +0000 (12:21 -0800)]
ReplicatedPG: register_object_context and register_snapset_context cleanup

Previously, get_object_context and get_snapset_context did not register
the resulting objects.  In some cases, these objects would not get
registered and multiple copies would end up being created.  This caused a bug
in find_object_context where get_snapset_context could return an object
distinct from the one referenced by the object returned from
get_object_context.

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agolibrados: check for initialization before doing certain operations
Yehuda Sadeh [Fri, 7 Jan 2011 20:40:40 +0000 (12:40 -0800)]
librados: check for initialization before doing certain operations

14 years agolibrados: fix api declaration
Yehuda Sadeh [Fri, 7 Jan 2011 19:22:58 +0000 (11:22 -0800)]
librados: fix api declaration

14 years agolibrados: add rados->version, include librados.h from .hpp
Yehuda Sadeh [Fri, 7 Jan 2011 18:50:42 +0000 (10:50 -0800)]
librados: add rados->version, include librados.h from .hpp

14 years agolibrados-config: add man page
Yehuda Sadeh [Thu, 6 Jan 2011 23:11:34 +0000 (15:11 -0800)]
librados-config: add man page

14 years agolibrados-config: added a command line tool to dump librados version
Yehuda Sadeh [Thu, 6 Jan 2011 23:04:07 +0000 (15:04 -0800)]
librados-config: added a command line tool to dump librados version

14 years agoReplicatedPG: clone_overlap should contain one entry per clone
Samuel Just [Thu, 6 Jan 2011 23:48:13 +0000 (15:48 -0800)]
ReplicatedPG: clone_overlap should contain one entry per clone

Previously, writefull and _delete_head would remove the last
entry from snapset.clone_overlap.  Now, the last entry becomes
an empty interval_set.  clone_overlap should contain one entry
per clone.

The missing entries previously caused a bug in _rollback_to where
iter would be clone_overlap.end().

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agoosd: Create all_osds_die test
Colin Patrick McCabe [Wed, 5 Jan 2011 00:37:37 +0000 (16:37 -0800)]
osd: Create all_osds_die test

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agomon: mark osds down for not sending MOSDPGStat
Colin Patrick McCabe [Thu, 6 Jan 2011 21:33:07 +0000 (13:33 -0800)]
mon: mark osds down for not sending MOSDPGStat

PGMonitor::prepare_pg_stats should check to see if the stats in the
MOSDPgStats message are the same as the ones we already have. If so, no
need to create an incremental; just send an ACK and return false.

The leading Monitor now marks osds as down if they haven't sent a
MOSDPGStat message in the last 15 minutes.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
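The timeout rule in this commit message can be sketched in Python. The function name `find_stale_osds` and the dictionary of last-report times are assumptions made for illustration; only the 15-minute threshold comes from the message.

```python
from typing import Dict, List

PG_STAT_TIMEOUT = 15 * 60  # seconds; the commit message mentions 15 minutes


def find_stale_osds(last_report: Dict[int, float], now: float,
                    timeout: float = PG_STAT_TIMEOUT) -> List[int]:
    """Return ids of OSDs whose last PG stat report is older than timeout."""
    return sorted(osd for osd, seen in last_report.items()
                  if now - seen > timeout)
```

The leader would run a check like this periodically and mark the returned OSDs down.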
14 years agomon: Always forward the PGStats to the leader
Colin Patrick McCabe [Tue, 4 Jan 2011 19:31:52 +0000 (11:31 -0800)]
mon: Always forward the PGStats to the leader

Always forward the PGStats to the leader, even if they are the same as
the old PGStats. The leader will mark down any osds that haven't sent
PGStats for a few minutes.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: Introduce osd_mon_report_interval_max
Colin Patrick McCabe [Tue, 4 Jan 2011 00:04:38 +0000 (16:04 -0800)]
osd: Introduce osd_mon_report_interval_max

After every g_conf.osd_mon_report_interval_max seconds, we send out a PG
stat update even if nothing has changed. This is to let the monitors
know that we're alive.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
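The reporting rule can be sketched as a small Python predicate. This is a hedged illustration: `should_send_pg_stats` is a hypothetical name, and the `interval_min` behaviour (rate-limiting reports when stats have changed) is an assumption, not something this commit message states.

```python
def should_send_pg_stats(now, last_sent, stats_changed,
                         interval_min, interval_max):
    """Decide whether the OSD should send a PG stat update now."""
    elapsed = now - last_sent
    if elapsed >= interval_max:
        # Keepalive: report even when nothing changed, so the monitors
        # know this OSD is still alive.
        return True
    # Assumed behaviour: changed stats are sent no more often than
    # once per interval_min seconds.
    return stats_changed and elapsed >= interval_min
```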
14 years agomon: don't allow Monitor to be copied
Colin Patrick McCabe [Tue, 4 Jan 2011 17:54:30 +0000 (09:54 -0800)]
mon: don't allow Monitor to be copied

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agomon: remove obsolete comment
Colin Patrick McCabe [Tue, 4 Jan 2011 17:38:06 +0000 (09:38 -0800)]
mon: remove obsolete comment

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: Rename osd_mon_report_interval
Colin Patrick McCabe [Thu, 6 Jan 2011 02:29:09 +0000 (18:29 -0800)]
osd: Rename osd_mon_report_interval

Rename osd_mon_report_interval to osd_mon_report_interval_min.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agomon: Introduce Monitor::leader_since
Colin Patrick McCabe [Mon, 3 Jan 2011 23:02:15 +0000 (15:02 -0800)]
mon: Introduce Monitor::leader_since

Introduce Monitor::leader_since to keep track of when the current
monitor became the leader.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoMerge branch 'standby_replay' into unstable
Greg Farnum [Thu, 6 Jan 2011 23:39:14 +0000 (15:39 -0800)]
Merge branch 'standby_replay' into unstable

14 years agomds: Add is_any_replay() method and fill it in as appropriate.
Greg Farnum [Thu, 6 Jan 2011 23:37:59 +0000 (15:37 -0800)]
mds: Add is_any_replay() method and fill it in as appropriate.

This way we don't need to remember to call all three of is_replay(),
is_standby_replay(), is_oneshot_replay().

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
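A helper of this shape can be sketched in Python. The enum `MdsState` and its members are hypothetical stand-ins for the three replay variants named in the message; the real MDS code is C++ and may represent state differently.

```python
from enum import Enum, auto


class MdsState(Enum):
    ACTIVE = auto()
    REPLAY = auto()
    STANDBY_REPLAY = auto()
    ONESHOT_REPLAY = auto()


_REPLAY_STATES = frozenset({
    MdsState.REPLAY,
    MdsState.STANDBY_REPLAY,
    MdsState.ONESHOT_REPLAY,
})


def is_any_replay(state: MdsState) -> bool:
    """True for any of the three replay variants."""
    return state in _REPLAY_STATES
```

Callers then make one check instead of remembering all three predicates.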
14 years agoMerge remote branch 'origin/unstable' into standby_replay
Greg Farnum [Thu, 6 Jan 2011 22:50:35 +0000 (14:50 -0800)]
Merge remote branch 'origin/unstable' into standby_replay

Conflicts:
src/cmds.cc
src/mds/MDS.cc
src/mds/MDS.h

14 years agolibrados: add library api versioning
Yehuda Sadeh [Thu, 6 Jan 2011 22:43:31 +0000 (14:43 -0800)]
librados: add library api versioning

14 years agojournaler: delete Contexts on finish() in new functions.
Greg Farnum [Mon, 20 Dec 2010 22:35:23 +0000 (14:35 -0800)]
journaler: delete Contexts on finish() in new functions.

Previously we weren't, and leaked memory.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agomdcache: change replay trimming a bit.
Greg Farnum [Mon, 20 Dec 2010 21:32:43 +0000 (13:32 -0800)]
mdcache: change replay trimming a bit.

Previously we were re-inserting dentries on the open list. But if
there weren't any other available dentries to trim, this could
have led to an infinite loop!
Now, we save them in a list and pop them back in once the trim
is done.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
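The set-aside approach described above can be sketched in Python. This is a simplified model, not the MDCache code: `trim`, the `lru` deque, and the `is_open` predicate are hypothetical. Entries that must stay alive are parked in a side list during the loop, so the loop can never pop the same entry twice.

```python
from collections import deque


def trim(lru, is_open, target_size):
    """Trim `lru` toward target_size, keeping entries that are still open."""
    set_aside = []
    while len(lru) > target_size:
        dentry = lru.popleft()
        if is_open(dentry):
            # Do not re-insert yet: putting it straight back could make
            # the loop spin forever when nothing else is trimmable.
            set_aside.append(dentry)
        # Otherwise the dentry is simply dropped (trimmed).
    # Pop the kept entries back in once the trim is done.
    lru.extend(set_aside)
    return lru
```

Each iteration removes one entry from `lru`, so the loop terminates even when every entry is open.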
14 years agoMDS: rename replay Contexts -- they were ambiguous at best.
Greg Farnum [Mon, 20 Dec 2010 21:10:44 +0000 (13:10 -0800)]
MDS: rename replay Contexts -- they were ambiguous at best.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agoMDS: add gids to the logger file names.
Greg Farnum [Fri, 17 Dec 2010 23:56:44 +0000 (15:56 -0800)]
MDS: add gids to the logger file names.

This just makes it easier to differentiate the standby's files
from the rest.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agomdlog: return EAGAIN if replay falls off the tail of the journal.
Greg Farnum [Fri, 17 Dec 2010 21:25:04 +0000 (13:25 -0800)]
mdlog: return EAGAIN if replay falls off the tail of the journal.

This can happen when we're following an active journal, and
would previously cause the MDS to shut down. Now we return EAGAIN,
so the MDS can recover as it likes.
Currently, that recovery is a simple respawn, as when we discover
we've fallen behind via probing.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agojournaler: Add init_headers function, call when reading head off disk.
Greg Farnum [Fri, 17 Dec 2010 00:47:30 +0000 (16:47 -0800)]
journaler: Add init_headers function, call when reading head off disk.

Uninitialized headers were causing a failed assert during replay,
and there's no good reason to leave them set at their defaults just
because the *current* incarnation of this MDS has never written to
disk!

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agomds: After probing the journal, reset if we've fallen behind.
Greg Farnum [Thu, 16 Dec 2010 19:53:38 +0000 (11:53 -0800)]
mds: After probing the journal, reset if we've fallen behind.

Previously, if the journal got trimmed and we missed log entries,
we failed out in the journaling step and stopped.
This is still possible and needs to be fixed, but pre-emptively checking
that we're still in the live part of the journal narrows the race range.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agoMDS: make standby_trim_segments functional. Hurray, hot standbys work!
Greg Farnum [Wed, 15 Dec 2010 00:45:50 +0000 (16:45 -0800)]
MDS: make standby_trim_segments functional. Hurray, hot standbys work!

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agomdlog: Add some helper functions for accessing segments map data.
Greg Farnum [Wed, 15 Dec 2010 00:45:21 +0000 (16:45 -0800)]
mdlog: Add some helper functions for accessing segments map data.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agomdcache: adjust trim() to handle running during standby-replay.
Greg Farnum [Wed, 15 Dec 2010 00:44:55 +0000 (16:44 -0800)]
mdcache: adjust trim() to handle running during standby-replay.

This just means it needs to handle files on the open list and not
trim them. Add a check for that with an assert, and keep them alive.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agoelist: add a clear_list function.
Greg Farnum [Wed, 15 Dec 2010 00:43:45 +0000 (16:43 -0800)]
elist: add a clear_list function.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>