git.apps.os.sepia.ceph.com Git - ceph.git/log
13 years ago filestore: init filestore_kill_at in ctor
Sage Weil [Thu, 12 Apr 2012 17:51:14 +0000 (10:51 -0700)]
filestore: init filestore_kill_at in ctor

Otherwise we don't get the option for FileStore instances created after
common_init_finish() (which does md_config_t::call_all_observers()).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago test_idempotent_sequence: Add commands and lose a couple of optional args.
Joao Eduardo Luis [Thu, 12 Apr 2012 19:05:28 +0000 (20:05 +0100)]
test_idempotent_sequence: Add commands and lose a couple of optional args.

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
13 years ago filestore: name internally
Sage Weil [Thu, 12 Apr 2012 17:40:08 +0000 (10:40 -0700)]
filestore: name internally

We need to allow the perfcounter name to be controlled so that we can have
two instances of FileStore in the same process that don't step on each
other.  Default to 'filestore'.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago test_idempotent_sequence: no need to reinject value that is already there
Sage Weil [Thu, 12 Apr 2012 17:00:39 +0000 (10:00 -0700)]
test_idempotent_sequence: no need to reinject value that is already there

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago deterministicopseq: add collection_rename() support
Joao Eduardo Luis [Thu, 12 Apr 2012 13:53:50 +0000 (14:53 +0100)]
deterministicopseq: add collection_rename() support

13 years ago test_idempotent_sequence: Generate a reproducible sequence of txs.
Joao Eduardo Luis [Wed, 11 Apr 2012 20:50:45 +0000 (21:50 +0100)]
test_idempotent_sequence: Generate a reproducible sequence of txs.

With this test we aim at reproducing the same sequence of transactions
as long as we are provided with the same seed between runs.

We also allow failures to be injected onto the filestore if the
--filestore-kill-at <VAL> argument is passed, and we provide verification
when --test-verify-at <VAL> is provided.

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
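
The seed-based reproducibility described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `Op` enum and `generate_ops()` are hypothetical names, and the operation set is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical operation kinds; the real test's operation set may differ.
enum class Op { Touch, Write, Clone, CollectionAdd, CollectionRename };

// Build a sequence of n operations that depends only on the seed.
// std::mt19937 is fully specified, so equal seeds yield equal sequences.
inline std::vector<Op> generate_ops(std::uint32_t seed, std::size_t n) {
  std::mt19937 rng(seed);
  std::vector<Op> ops;
  ops.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    ops.push_back(static_cast<Op>(rng() % 5));  // pick one of the 5 kinds
  }
  return ops;
}
```

Two runs given the same seed produce the same sequence, which is what lets a second store be driven to the state a crashed store should have reached.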
13 years ago VerifyFileStore: Check if two FileStores match after applying a set of operations.
Joao Eduardo Luis [Wed, 11 Apr 2012 20:49:09 +0000 (21:49 +0100)]
VerifyFileStore: Check if two FileStores match after applying a set of operations.

With DeterministicOpSequence we are able to reproduce exactly the same
sequence of operations, over and over. However, if the filestore fails
(e.g., because we injected a failure), we want to check if it is kept
consistent after replaying its journal.

With VerifyFileStore, which extends DeterministicOpSequence, we are able
to bring a brand new filestore to the state the failed filestore would
reach had it not failed. We can then compare to check if the failure
introduced inconsistencies after replaying the journal.

(This is still work in progress and not fully functional)

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
13 years ago DeterministicOpSequence: Generate a reproducible sequence of operations.
Joao Eduardo Luis [Wed, 11 Apr 2012 20:42:16 +0000 (21:42 +0100)]
DeterministicOpSequence: Generate a reproducible sequence of operations.

Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
13 years ago TestFileStoreState: Represent a FileStore's state to be used by tests.
Joao Eduardo Luis [Mon, 9 Apr 2012 14:59:23 +0000 (15:59 +0100)]
TestFileStoreState: Represent a FileStore's state to be used by tests.

Instead of having each test create the same representation of a
FileStore's state, with a map/set of collections and objects, as well as
multiple per-test init() functions that are all similar in nature,
provide this in a single class that can be inherited by test
classes.

13 years ago filestore: two-phase guard
Sage Weil [Sat, 14 Apr 2012 00:11:54 +0000 (17:11 -0700)]
filestore: two-phase guard

For certain operations (collection_add) we need a two-phase guard, and an
"in-progress" state.

 - before exposing an object in a new location, we need to mark it so that
   old operations affecting the target name don't touch the new object.
 - can't just set the guard before starting or else we can't distinguish
   between a collection_add that was in-progress and one that happened a
   long time ago.

We may need the same for collection_rename().

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
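
As a rough model of the two-phase idea, the guard can carry an "in-progress" flag next to the sequence position of the operation that set it. The sketch below is an assumption-laden illustration, not FileStore's code: `GuardTable`, `Replay`, and the in-memory map are hypothetical stand-ins for whatever the store persists with the object.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical outcome of a guard check during journal replay.
enum class Replay { Apply, InProgress, Skip };

struct Guard {
  std::uint64_t spos;  // sequence position of the op that set the guard
  bool in_progress;    // true between phase 1 and phase 2
};

class GuardTable {
 public:
  // Phase 1: mark the object before exposing it under its new name.
  void begin(const std::string& obj, std::uint64_t spos) {
    guards_[obj] = {spos, true};
  }
  // Phase 2: the operation completed; older ops must not touch the object.
  void close(const std::string& obj, std::uint64_t spos) {
    guards_[obj] = {spos, false};
  }
  // Decide what replay should do with the op at position spos.
  Replay check(const std::string& obj, std::uint64_t spos) const {
    auto it = guards_.find(obj);
    if (it == guards_.end() || spos > it->second.spos) return Replay::Apply;
    if (spos == it->second.spos && it->second.in_progress)
      return Replay::InProgress;  // crashed mid-operation: finish it
    return Replay::Skip;          // already applied, or an older op
  }

 private:
  std::map<std::string, Guard> guards_;
};
```

The third state is what distinguishes an operation that crashed halfway from one that completed long ago, which a single set-once guard cannot express.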
13 years ago filestore: simple failure injections via --filestore-kill-at <n>
Sage Weil [Sat, 14 Apr 2012 00:17:36 +0000 (17:17 -0700)]
filestore: simple failure injections via --filestore-kill-at <n>

This will make filestore suicide (_exit(1)) on the n'th potential failure
call site.  We can potentially fail:

     - before a transaction
     - between each op
     - at the end

Additionally, we instrument the guards:

     - before/after/inside _set_replay_guard
     - between significant steps of callers of _set_replay_guard

All instrumentation points are inside _do_transactions(), so if everything
is done in a single sequencer (or from a single thread) the failure
point is deterministic.

That said, use an atomic so we will still reliably fail (at some point)
when there are multiple filestore threads in action.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
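
The countdown behind such an injection point might look like the sketch below. It is only an illustration under stated assumptions: `KillAtCounter` and `should_die()` are hypothetical names, and the method returns a flag where a real injector would call _exit(1).

```cpp
#include <atomic>

// Hypothetical countdown for "fail at the n'th call site".
class KillAtCounter {
 public:
  // kill_at <= 0 disables injection.
  explicit KillAtCounter(int kill_at) : remaining_(kill_at) {}

  // Call at every potential failure site. Returns true exactly once,
  // on the n'th call overall, even if several threads call concurrently.
  bool should_die() {
    if (remaining_.load() <= 0) return false;  // disabled or already fired
    return remaining_.fetch_sub(1) == 1;       // only one caller observes 1
  }

 private:
  std::atomic<int> remaining_;
};
```

With a single caller the firing site is deterministic; with several threads the atomic decrement still guarantees that exactly one call fires, though which site it is depends on scheduling.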
13 years ago filestore: fix collection_add argument names
Sage Weil [Fri, 13 Apr 2012 21:43:57 +0000 (14:43 -0700)]
filestore: fix collection_add argument names

No functional changes, just fixing and clarifying argument names so that it
is less confusing/wrong.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago filestore: replay collection_move using add+remove
Sage Weil [Fri, 13 Apr 2012 18:30:49 +0000 (11:30 -0700)]
filestore: replay collection_move using add+remove

This approximates the buggy collection_move.  It is still buggy.  It is
only there to replay old journals.

Rip out buggy (and now unused) collection_move code.

For the record, the problem there is that a crash between setting the guard
and unlinking the old name will not remove the old name on replay because
the guard for the link stage is indistinguishable from that for the unlink
stage.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago filestore: implement collection_move() as add + remove
Sage Weil [Fri, 13 Apr 2012 16:56:04 +0000 (09:56 -0700)]
filestore: implement collection_move() as add + remove

This ensures we get add and remove steps with different spos values, which
makes the guard work.  The collection_move implementation breaks on replay
because those values match, so the just-set guard prevents unlink from
happening.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago filestore: cleanup: flip sense of replay guard check
Sage Weil [Tue, 10 Apr 2012 22:30:47 +0000 (15:30 -0700)]
filestore: cleanup: flip sense of replay guard check

The others are all if (_check_replay_guard(..)) do_it;.  Make this one
match.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago filestore: fix fd leak on collection_rename
Sage Weil [Tue, 10 Apr 2012 22:30:03 +0000 (15:30 -0700)]
filestore: fix fd leak on collection_rename

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago filestore: fix collection_rename guard
Sage Weil [Tue, 10 Apr 2012 22:29:49 +0000 (15:29 -0700)]
filestore: fix collection_rename guard

If we crash between the rename and setting the guard, we can get EEXIST
or ENOTEMPTY on rename.  Tolerate that.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago filestore: fix collection_add guard
Sage Weil [Tue, 10 Apr 2012 22:31:21 +0000 (15:31 -0700)]
filestore: fix collection_add guard

If we crash between the link() and setting the guard, we will get
EEXIST.  Tolerate that.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago filestore: fix collection_move guard
Sage Weil [Tue, 10 Apr 2012 22:24:49 +0000 (15:24 -0700)]
filestore: fix collection_move guard

We had a sequence like:

 1- write A block 1
 2- write A block 2
 3- write A block 3
 4- write A block 4
 5- move A -> B
     - link B
     - unlink A
     - set guard on B   <crash>
  - replay 3, 4, 5

with the result being B with only half of its content.  The problem is that
we destroyed the old link _and_ didn't guard the new content.  Instead,
set the guard before the link, and replay the unlink step here
unconditionally.

Fixes: #2178
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago configure: --with-system-leveldb
Sage Weil [Tue, 10 Apr 2012 03:30:42 +0000 (20:30 -0700)]
configure: --with-system-leveldb

Default to bundled leveldb.  Optionally check.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago filestore: fix leveldb includes
Sage Weil [Mon, 9 Apr 2012 19:21:56 +0000 (12:21 -0700)]
filestore: fix leveldb includes

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago cephfs: fix uninit var warning
Sage Weil [Tue, 10 Apr 2012 03:23:24 +0000 (20:23 -0700)]
cephfs: fix uninit var warning

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago encoding: fix iterator use for struct_len copy_in
Sage Weil [Mon, 9 Apr 2012 18:25:41 +0000 (11:25 -0700)]
encoding: fix iterator use for struct_len copy_in

The end() iterator position does not record an offset when the list is
modified.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago buffer: allow advance() to move an iterator backward
Sage Weil [Mon, 9 Apr 2012 18:26:34 +0000 (11:26 -0700)]
buffer: allow advance() to move an iterator backward

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago Merge remote branch 'gh/stable' into next
Sage Weil [Mon, 9 Apr 2012 03:59:33 +0000 (20:59 -0700)]
Merge remote branch 'gh/stable' into next

13 years ago configure: HAVE_FALLOCATE -> CEPH_HAVE_FALLOCATE
Sage Weil [Mon, 9 Apr 2012 03:58:59 +0000 (20:58 -0700)]
configure: HAVE_FALLOCATE -> CEPH_HAVE_FALLOCATE

/usr/include/linux/fs.h defines this on CentOS 5, even though it does not
in fact compile.  This stupid workaround avoids the problem.

Reported-by: Nick Couchman <Nick.Couchman@seakr.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago encoding: use iterator to copy_in encoded length
Sage Weil [Fri, 6 Apr 2012 16:33:43 +0000 (09:33 -0700)]
encoding: use iterator to copy_in encoded length

This gives us a pointer to the position in the list where the final
length value will be copied.  Previously we used bl.copy_in(), which takes
a byte offset and needs to iterate over the bufferlist to seek to the
correct position, resulting in O(n^2) encoding time for large structures.

Fixes: #2161
Reported-by: Jim Schutt <jaschut@sandia.gov>
Diagnosed-by: Ake van der Meer <petrabbit@xs4all.nl>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
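
To make the cost difference concrete, here is a toy segmented buffer. It is a sketch only: `SegBuf` and `encode_with_len()` are hypothetical, a std::list of strings stands in for a bufferlist, and the offset-based `copy_in()` assumes the write fits inside one segment.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <list>
#include <string>

// Toy segmented buffer; not Ceph's bufferlist.
struct SegBuf {
  std::list<std::string> segs;

  // Append bytes as a new segment and return a pointer to their start.
  // The pointer stays valid because list nodes never move and the
  // segment is never resized afterwards.
  char* append(const void* p, std::size_t n) {
    segs.emplace_back(static_cast<const char*>(p), n);
    return &segs.back()[0];
  }

  // Offset-based overwrite: walks segments from the front, so each call
  // costs O(number of segments).
  void copy_in(std::size_t off, const void* p, std::size_t n) {
    for (auto& s : segs) {
      if (off < s.size()) {
        std::memcpy(&s[off], p, n);
        return;
      }
      off -= s.size();
    }
  }
};

// Encode a length-prefixed payload, patching the length through the
// saved position in O(1) instead of seeking by offset.
inline void encode_with_len(SegBuf& bl, const std::string& payload) {
  std::uint32_t len = 0;
  char* len_pos = bl.append(&len, sizeof(len));  // placeholder for length
  bl.append(payload.data(), payload.size());
  len = static_cast<std::uint32_t>(payload.size());
  std::memcpy(len_pos, &len, sizeof(len));
}
```

When every nested structure patches its own length, an offset seek per patch makes total work grow quadratically with the number of segments, while the saved position keeps each patch constant-time.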
13 years ago v0.44.2 v0.44.2
Sage Weil [Thu, 5 Apr 2012 21:55:04 +0000 (14:55 -0700)]
v0.44.2

13 years ago FileStore: do not check dbobjectmap without option set
Samuel Just [Thu, 5 Apr 2012 21:58:55 +0000 (14:58 -0700)]
FileStore: do not check dbobjectmap without option set

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago test_rewrite_latency: check return value
Sage Weil [Tue, 3 Apr 2012 22:35:26 +0000 (15:35 -0700)]
test_rewrite_latency: check return value

Fixes warning

warning: test/test_rewrite_latency.cc:27:36: ignoring return value of ‘ssize_t pwrite(int, const void*, size_t, __off64_t)’, declared with attribute warn_unused_result [-Wunused-result]

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Makefile: add missing header
Sage Weil [Tue, 3 Apr 2012 22:28:26 +0000 (15:28 -0700)]
Makefile: add missing header

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago rgw: throttle at num_threads * 2
Sage Weil [Tue, 3 Apr 2012 21:21:53 +0000 (14:21 -0700)]
rgw: throttle at num_threads * 2

If we throttle at num_threads, then nothing gets into the workqueue until
a worker thread is idle, which means you pay the latency of setting it up
and queueing it.  This way we keep some requests ready to go.

Signed-off-by: Sage Weil <sage@newdream.net>
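
The effect of the larger limit can be illustrated with a single-threaded model. This is a hedged sketch, not rgw's throttle: `Throttle` and `queued_when_busy()` are hypothetical, and real admission control would need locking or atomics.

```cpp
#include <cstddef>

// Hypothetical admission throttle (single-threaded model).
class Throttle {
 public:
  explicit Throttle(std::size_t max) : max_(max) {}

  // Admit one request if the limit allows it.
  bool try_get() {
    if (in_flight_ >= max_) return false;
    ++in_flight_;
    return true;
  }
  // Release a slot when a request completes.
  void put() {
    if (in_flight_ > 0) --in_flight_;
  }

 private:
  std::size_t max_;
  std::size_t in_flight_ = 0;
};

// How many admitted requests sit in the work queue, ready to run,
// while every worker thread is busy.
inline std::size_t queued_when_busy(std::size_t num_threads,
                                    std::size_t throttle_max) {
  Throttle t(throttle_max);
  std::size_t admitted = 0;
  while (t.try_get()) ++admitted;
  return admitted > num_threads ? admitted - num_threads : 0;
}
```

With the limit equal to num_threads nothing is ever waiting in the queue; doubling it keeps one ready request per worker.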
13 years ago Merge remote-tracking branch 'gh/msgr-api-changes'
Sage Weil [Tue, 3 Apr 2012 20:44:29 +0000 (13:44 -0700)]
Merge remote-tracking branch 'gh/msgr-api-changes'

Reviewed-by: Sage Weil <sage@newdream.net>
13 years ago filestore: print Sequencer name in debug output
Sage Weil [Tue, 3 Apr 2012 20:00:13 +0000 (13:00 -0700)]
filestore: print Sequencer name in debug output

And clean it up just a bit.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago msgr: clean up Pipe::do_sendmsg.
Greg Farnum [Wed, 28 Mar 2012 22:06:32 +0000 (15:06 -0700)]
msgr: clean up Pipe::do_sendmsg.

Document it as with the tcp stuff, remove an if(0)'d debugging block,
and remove the useless "sd" parameter since it's always the same as
the Pipe's sd member.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago msgr: write minimal documentation for the tcp functions.
Greg Farnum [Wed, 28 Mar 2012 21:32:03 +0000 (14:32 -0700)]
msgr: write minimal documentation for the tcp functions.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago msgr: make a bunch of stuff private.
Greg Farnum [Tue, 27 Mar 2012 19:57:14 +0000 (12:57 -0700)]
msgr: make a bunch of stuff private.

Why were all these data members public? They're accessed by Pipes
and the Accepter and stuff, so maybe that's why...but that's all
internal interface stuff.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago msg: update the Dispatcher and Messenger documentation
Greg Farnum [Tue, 27 Mar 2012 17:46:20 +0000 (10:46 -0700)]
msg: update the Dispatcher and Messenger documentation

Clarify what mark_down() and mark_down_on_empty() actually do.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago dispatcher: fix documentation for ms_handle_reset
Greg Farnum [Mon, 26 Mar 2012 21:19:40 +0000 (14:19 -0700)]
dispatcher: fix documentation for ms_handle_reset

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago msgr: rename set_ip() -> set_addr_unknowns()
Greg Farnum [Fri, 23 Mar 2012 20:32:46 +0000 (13:32 -0700)]
msgr: rename set_ip() -> set_addr_unknowns()

The generic interface shouldn't reference specifics like that.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago msgr: Remove _my_name and ms_addr, replace with direct access to my_inst.
Greg Farnum [Fri, 23 Mar 2012 20:28:07 +0000 (13:28 -0700)]
msgr: Remove _my_name and ms_addr, replace with direct access to my_inst.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago msgr: store the entity_inst_t in the Messenger.
Greg Farnum [Tue, 3 Apr 2012 20:13:20 +0000 (13:13 -0700)]
msgr: store the entity_inst_t in the Messenger.

Convert ms_addr and _my_name to be references to their fields in
the entity_inst_t my_inst.
This way we can use const references for accessing all of them,
instead of the bizarre distinction we had before for get_myinst().

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago buffer: implement a contents_equal function on bufferlists
Greg Farnum [Thu, 22 Mar 2012 00:27:09 +0000 (17:27 -0700)]
buffer: implement a contents_equal function on bufferlists

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago msgr: change the signature of get_myaddr()
Greg Farnum [Mon, 19 Mar 2012 20:12:14 +0000 (13:12 -0700)]
msgr: change the signature of get_myaddr()

Return a const reference to the actual address, instead of copying it.
All current users are happy with this, and I can't see a good reason
to copy it instead.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago msgr: get_connection() is required to establish a connection if none exists.
Greg Farnum [Thu, 8 Mar 2012 00:43:04 +0000 (16:43 -0800)]
msgr: get_connection() is required to establish a connection if none exists.

Making an allowance for lossy server connections is silly. Just don't
ask for the Connection in that case. (There aren't any users who
rely on the previous behavior.)
Document that requirement in Messenger.h!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago test: fix monmaptool help text
Greg Farnum [Tue, 3 Apr 2012 20:10:23 +0000 (13:10 -0700)]
test: fix monmaptool help text

Broken by commit:15f0a3270fdcf09acce554313f2d0c0814a511e4

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago cls_rgw: guard decode
Yehuda Sadeh [Tue, 3 Apr 2012 18:32:44 +0000 (11:32 -0700)]
cls_rgw: guard decode

There were a few cases where decode wasn't guarded.

Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
13 years ago cls_rgw: reset return code in some cases
Yehuda Sadeh [Tue, 3 Apr 2012 18:30:57 +0000 (11:30 -0700)]
cls_rgw: reset return code in some cases

Beforehand the return code was ignored, so fixed the cases
where we erroneously return error instead of success.

Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
13 years ago librados: fix exec test
Sage Weil [Tue, 3 Apr 2012 17:12:01 +0000 (10:12 -0700)]
librados: fix exec test

The return value for read operations is now returned correctly.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago doc: disable broken 'doxygenclass' class in librados c++ doc
Sage Weil [Tue, 3 Apr 2012 16:06:37 +0000 (09:06 -0700)]
doc: disable broken 'doxygenclass' class in librados c++ doc

This is the last remaining gitbuilder error.  Add it back when the C++
docs actually build.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge remote-tracking branch 'gh/stable'
Sage Weil [Tue, 3 Apr 2012 15:58:13 +0000 (08:58 -0700)]
Merge remote-tracking branch 'gh/stable'

13 years ago test_workload_gen: fix Sequencer ctor
Sage Weil [Tue, 3 Apr 2012 15:44:46 +0000 (08:44 -0700)]
test_workload_gen: fix Sequencer ctor

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge remote-tracking branch 'gh/wip-name-sequencers'
Sage Weil [Tue, 3 Apr 2012 05:04:04 +0000 (22:04 -0700)]
Merge remote-tracking branch 'gh/wip-name-sequencers'

13 years ago Merge remote-tracking branch 'gh/wip-2087'
Sage Weil [Tue, 3 Apr 2012 05:03:55 +0000 (22:03 -0700)]
Merge remote-tracking branch 'gh/wip-2087'

13 years ago rgw: check for subuser existence
Yehuda Sadeh [Mon, 2 Apr 2012 20:11:01 +0000 (13:11 -0700)]
rgw: check for subuser existence

This fixes #1856: looking up a subuser that doesn't exist returned the
user as long as the subuser prefix named an existing user.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years ago filestore: fix ZERO fallback write
Sage Weil [Mon, 2 Apr 2012 00:04:58 +0000 (17:04 -0700)]
filestore: fix ZERO fallback write

It helps if we write zeros!

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago qa: test_rewrite_latency
Sage Weil [Sun, 1 Apr 2012 23:24:39 +0000 (16:24 -0700)]
qa: test_rewrite_latency

Tool to measure latency of overwriting a single block.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge remote branch 'gh/wip-mon_setup'
Sage Weil [Sat, 31 Mar 2012 03:31:30 +0000 (20:31 -0700)]
Merge remote branch 'gh/wip-mon_setup'

Reviewed-by: Sage Weil <sage@newdream.net>
13 years ago osd: fix error code return from class methods
Sage Weil [Sat, 31 Mar 2012 03:18:42 +0000 (20:18 -0700)]
osd: fix error code return from class methods

Don't shadow the result at function scope.

Fixes: #2148
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
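
The commit text does not show the code involved, so the following is only a generic reduction of the shadowing pattern it names. Both functions are hypothetical; `method_rc` stands for whatever a class method returned.

```cpp
// Buggy shape: the inner declaration shadows the function-scope variable,
// so the method's error code never reaches the caller.
inline int call_method_buggy(int method_rc) {
  int result = 0;
  {
    int result = method_rc;  // new variable, hides the outer one
    (void)result;            // silence unused-variable warnings
  }
  return result;  // always 0: the error is lost
}

// Fixed shape: assign to the function-scope variable instead.
inline int call_method_fixed(int method_rc) {
  int result = 0;
  {
    result = method_rc;  // updates the variable that is returned
  }
  return result;
}
```

Compilers accept the buggy form silently unless a shadowing warning such as GCC's -Wshadow is enabled.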
13 years ago monmaptool: make clear you can set the fsid when making a new map.
Greg Farnum [Sat, 31 Mar 2012 00:22:57 +0000 (17:22 -0700)]
monmaptool: make clear you can set the fsid when making a new map.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago ceph_mon: fix fsid parsing.
Greg Farnum [Sat, 31 Mar 2012 00:07:19 +0000 (17:07 -0700)]
ceph_mon: fix fsid parsing.

fsid is a field in the CephContext _conf structure and is parsed by
the standard options parsing library before it gets to the ceph_mon
custom parsing.
Instead do the standard parsing, and check that member directly
to decide if we want to (over)write the monmap's fsid.

Fixes one part of #2221.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago osd: update_stats() on reads too
Sage Weil [Fri, 30 Mar 2012 23:14:05 +0000 (16:14 -0700)]
osd: update_stats() on reads too

Update pg stats on any op completion (read or write), not just writes.  Do
the calls with log_op_stats() for consistency's sake.  Skip if the request
was an error.

Fixes: #2209
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago Merge remote branch 'gh/wip-osd-hb'
Sage Weil [Fri, 30 Mar 2012 23:00:29 +0000 (16:00 -0700)]
Merge remote branch 'gh/wip-osd-hb'

13 years ago osd: fix typo in debug message
Sage Weil [Fri, 30 Mar 2012 22:37:34 +0000 (15:37 -0700)]
osd: fix typo in debug message

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago Merge remote branch 'gh/wip-osd-recovery-sources'
Sage Weil [Fri, 30 Mar 2012 21:57:57 +0000 (14:57 -0700)]
Merge remote branch 'gh/wip-osd-recovery-sources'

13 years ago objectstore: name Sequencers
Sage Weil [Sun, 4 Mar 2012 05:07:05 +0000 (21:07 -0800)]
objectstore: name Sequencers

Assign a (unique) name to each Sequencer.  This will aid in debugging, and
can be useful when dumping traces of FileStore workloads.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago ceph_common.sh: Remove dead code.
Tommi Virtanen [Fri, 3 Jun 2011 19:55:31 +0000 (12:55 -0700)]
ceph_common.sh: Remove dead code.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago man: Oops, update ceph-mon(8) for real. Sorry about that.
Tommi Virtanen [Fri, 30 Mar 2012 18:27:47 +0000 (11:27 -0700)]
man: Oops, update ceph-mon(8) for real. Sorry about that.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago man: Update ceph-mon(8) after reStructuredText syntax fixes.
Tommi Virtanen [Fri, 30 Mar 2012 18:26:19 +0000 (11:26 -0700)]
man: Update ceph-mon(8) after reStructuredText syntax fixes.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago doc: Remove duplicate anchor from (unused) overview doc.
Tommi Virtanen [Fri, 30 Mar 2012 18:16:57 +0000 (11:16 -0700)]
doc: Remove duplicate anchor from (unused) overview doc.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago doc: Convert the mailing list mention to not be a section heading.
Tommi Virtanen [Wed, 28 Mar 2012 20:55:01 +0000 (13:55 -0700)]
doc: Convert the mailing list mention to not be a section heading.

If toctree is inside a section, the subtree is inside the section too.
We don't want all of dev/* to be under "Mailing list".

I have not found a decent workaround for this. The toplevel toctree
avoids this purely by the fact that it is the topmost toctree. Right
now that means you should 1) avoid having more than a few paragraphs of
text before the toctree for that subtree (put most of the content after
the toctree; clumsy if the toctree is long), or 2) put the toctree
immediately after the document title, make it :hidden:, and let the
reader use links in the text or the ToC in the sidebar to navigate.
See start/index for an example of this.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago doc: Fix reStructuredText syntax errors.
Tommi Virtanen [Fri, 30 Mar 2012 18:11:12 +0000 (11:11 -0700)]
doc: Fix reStructuredText syntax errors.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago add include/stringify.h
Sage Weil [Sun, 4 Mar 2012 05:06:12 +0000 (21:06 -0800)]
add include/stringify.h

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago FileJournal: check pwrite return value when zeroing journal
Samuel Just [Fri, 30 Mar 2012 16:59:24 +0000 (09:59 -0700)]
FileJournal: check pwrite return value when zeroing journal

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago filestore: set guard on collection_move
Sage Weil [Fri, 30 Mar 2012 16:51:45 +0000 (09:51 -0700)]
filestore: set guard on collection_move

During recovery we submit transactions like:

 - delete a/foo
 - move tmp/foo to a/foo

This prevents the EEXIST check in collection_move from doing any good,
since the destination never exists.  We need to do that remove at least
sometimes, because we may be overwriting an existing/older version of the
object.

So,
 - set the guard after we do the move, so that
 - the delete won't be repeated, and
 - the EEXIST check will work

Also check the guard for good measure (although that doesn't do anything
specifically useful in this scenario).

Fixes: #2164
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago osd: clear RECOVERING on start_peering_interval
Sage Weil [Wed, 28 Mar 2012 16:50:00 +0000 (09:50 -0700)]
osd: clear RECOVERING on start_peering_interval

This prevents us from, say, getting into a recovering+stray state.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: more heartbeat debug
Sage Weil [Fri, 30 Mar 2012 15:45:52 +0000 (08:45 -0700)]
osd: more heartbeat debug

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: don't fail new heartbeat peers
Sage Weil [Fri, 30 Mar 2012 03:54:25 +0000 (20:54 -0700)]
osd: don't fail new heartbeat peers

last_tx may be 0 because we just added this peer; don't mark them down
yet!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: ignore peer epoch of 0 on ping reply
Sage Weil [Fri, 30 Mar 2012 03:34:55 +0000 (20:34 -0700)]
osd: ignore peer epoch of 0 on ping reply

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: discard heartbeat_peer in note_down_osd
Sage Weil [Thu, 29 Mar 2012 05:32:30 +0000 (22:32 -0700)]
osd: discard heartbeat_peer in note_down_osd

Discard the heartbeat_peer as soon as we find out, along with queued
failures, or else the heartbeat_check may come along (without map_lock)
and requeue a failure.  And then later, when we try to report it, we'll
osdmap->get_inst() on a now-down OSD and fail miserably.

Reported-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: rename hbin -> hbclient, hbout -> hbserver
Sage Weil [Mon, 26 Mar 2012 16:53:50 +0000 (09:53 -0700)]
osd: rename hbin -> hbclient, hbout -> hbserver

This is way less confusing.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: send pings from hbin
Sage Weil [Mon, 26 Mar 2012 16:50:51 +0000 (09:50 -0700)]
osd: send pings from hbin

Fixes: #2212
Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: simplify heartbeat logic
Sage Weil [Thu, 22 Mar 2012 14:50:44 +0000 (07:50 -0700)]
osd: simplify heartbeat logic

Simplify heartbeats to use a simple request/reply model.

 - avoid any weirdness with map update timing
 - no from/to distinction
 - lossy client/server model

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago test: test_workload_gen: Add callback for collection destruction.
Joao Eduardo Luis [Fri, 30 Mar 2012 14:32:12 +0000 (15:32 +0100)]
test: test_workload_gen: Add callback for collection destruction.

When we remove a collection, we must clean up the coll_entry_t we
once had in the available collections set. For some reason, we weren't
doing this.

This commit adds a new callback, which inherits from the 'OnReadable'
callback on the WorkloadGenerator class, that will be responsible for
deleting the coll_entry_t once we know the transaction destroying the
collection has finished.

13 years ago ceph: --concise by default, add --verbose option
Sage Weil [Fri, 30 Mar 2012 04:31:20 +0000 (21:31 -0700)]
ceph: --concise by default, add --verbose option

It's time.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago PG,ReplicatedPG: update missing_loc_sources with missing_loc
Samuel Just [Fri, 30 Mar 2012 01:02:16 +0000 (18:02 -0700)]
PG,ReplicatedPG: update missing_loc_sources with missing_loc

In some cases missing_loc was updated without missing_loc_sources

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago ReplicatedPG: fix loop in check_recovery_sources
Samuel Just [Fri, 30 Mar 2012 01:00:50 +0000 (18:00 -0700)]
ReplicatedPG: fix loop in check_recovery_sources

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago Merge remote branch 'upstream/wip_latency'
Samuel Just [Thu, 29 Mar 2012 20:15:39 +0000 (13:15 -0700)]
Merge remote branch 'upstream/wip_latency'

13 years ago test: test_workload_gen: Fixing a memleak.
Joao Eduardo Luis [Thu, 29 Mar 2012 14:34:01 +0000 (15:34 +0100)]
test: test_workload_gen: Fixing a memleak.

Apparently, the FileStore does not clean up after transactions once they
are applied, which may lead to huge memory leaks.

In this commit we simply 'delete m_tx' in the transaction's callback
class.

13 years ago ReplicatedPG: ctx might not contain an OpRequest
Samuel Just [Wed, 28 Mar 2012 22:54:57 +0000 (15:54 -0700)]
ReplicatedPG: ctx might not contain an OpRequest

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago FileJournal: optionally zero journal on create
Samuel Just [Tue, 27 Mar 2012 16:32:01 +0000 (09:32 -0700)]
FileJournal: optionally zero journal on create

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago FileJournal: use DSYNC for directio path
Samuel Just [Mon, 26 Mar 2012 20:48:23 +0000 (13:48 -0700)]
FileJournal: use DSYNC for directio path

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago FileStore: Pass OpRequestRef into filestore in queue_transaction
Samuel Just [Sat, 24 Mar 2012 05:54:41 +0000 (22:54 -0700)]
FileStore: Pass OpRequestRef into filestore in queue_transaction

This allows us to track op progress through the filestore.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago osd/: OpRequest implements TrackedOp for passing into filestore
Samuel Just [Thu, 29 Mar 2012 00:10:29 +0000 (17:10 -0700)]
osd/: OpRequest implements TrackedOp for passing into filestore

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago test: test_workload_gen: Change CLI option and add '--help' usage.
Joao Eduardo Luis [Wed, 28 Mar 2012 16:02:15 +0000 (17:02 +0100)]
test: test_workload_gen: Change CLI option and add '--help' usage.

With this commit, we support the following options (and old ones are no
longer available):

--test-num-colls VAL                Set the number of collections
--test-num-objs-per-coll VAL        Set the number of objects per
                                    collection
--test-destroy-coll-per-N-trans VAL Set how many transactions to run
                                    before destroying a collection.

And --help will show the program's usage description.

13 years ago rgw: replace dout with ldout
Yehuda Sadeh [Wed, 28 Mar 2012 15:34:11 +0000 (08:34 -0700)]
rgw: replace dout with ldout

librgw can't use g_ceph_context

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years ago test: test_workload_gen: Default arguments, and minor changes.
Joao Eduardo Luis [Wed, 28 Mar 2012 13:59:32 +0000 (14:59 +0100)]
test: test_workload_gen: Default arguments, and minor changes.

Besides adding support for default arguments, passed on to global_init(),
this commit fixes a conflict in Makefile.am and a missing lib
dependency. Also, we previously ignored the return values from
store->mkfs() and store->mount(); we now check them.

13 years ago Merge branch 'stable'
Sage Weil [Wed, 28 Mar 2012 02:58:54 +0000 (19:58 -0700)]
Merge branch 'stable'

13 years ago test: test_workload_gen: Destroy collections.
Joao Eduardo Luis [Wed, 28 Mar 2012 00:11:37 +0000 (01:11 +0100)]
test: test_workload_gen: Destroy collections.

13 years ago test: test_workload_gen: CodeStyle compliance and cleanup.
Joao Eduardo Luis [Sun, 25 Mar 2012 16:38:40 +0000 (17:38 +0100)]
test: test_workload_gen: CodeStyle compliance and cleanup.

This commit brings the code into compliance with Ceph's CodeStyle and
cleans up some lingering unused code.

Also, now we allow changing the default OSD data and journal
locations, as well as the OSD journal size, by providing the
options '--osd-data <PATH>', '--osd-journal <PATH>' and
'--osd-journal-size <VAL>' on the CLI arguments. If not provided,
these will default to 'workload_gen_dir', 'workload_gen_journal'
and '400', respectively.