src: get rid of the Observers throughout the code base.
This is a big patch that will remove all references to the observers
throughout the code, including a complete removal of the Observer-related
messages' source files.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
We reworked the code a bit to accommodate the introduction of the log
monitor's publish/subscribe mechanisms. With this patch we no longer
depend on the observers, and use instead the much broader approach of
subscribing to events. In our case, we will subscribe to log levels.
If the '-w'/'--watch' flag is defined, the tool will be subscribed to the
'log-info' level by default, unless one of the following flags is defined
(in which case the level will be changed accordingly): '--watch-debug',
'--watch-info', '--watch-sec', '--watch-warn' and '--watch-error'.
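
As an illustration only, the flag-to-level mapping described above could be
sketched as follows. This is not the ceph tool's argument parser; the helper
name watch_subscription is hypothetical, and the handling of a level flag
given without '-w', or of several level flags at once, is an assumption.

```python
# Hypothetical sketch of the '--watch*' flag handling; not the actual ceph tool code.
LEVEL_FLAGS = {
    "--watch-debug": "log-debug",
    "--watch-info": "log-info",
    "--watch-sec": "log-sec",
    "--watch-warn": "log-warn",
    "--watch-error": "log-error",
}


def watch_subscription(args):
    """Return the subscription name implied by args, or None if not watching."""
    level = None
    for arg in args:
        if arg in ("-w", "--watch"):
            # Plain watch flag: fall back to 'log-info' unless a level is already set.
            level = level or "log-info"
        elif arg in LEVEL_FLAGS:
            # Assumption: a level flag implies watching, and the last one wins.
            level = LEVEL_FLAGS[arg]
    return level
```

For example, watch_subscription(["-w", "--watch-warn"]) yields 'log-warn',
while watch_subscription(["-w"]) yields the default 'log-info'.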
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
mon: Add publish/subscribe capabilities to the log monitor and status cmd.
This patch allows us to steer away from the monitor's observer mechanism,
by using instead the already existing publish/subscribe mechanism.
We follow the log levels used by the log monitor, and will recognize any
one of the following subscriptions: 'log-error' (highest priority),
'log-warn', 'log-sec', 'log-info' and 'log-debug' (lowest priority).
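
A minimal sketch of the priority ordering listed above. The rule that a
subscriber receives entries at its own level and at every higher priority is
an assumption made for illustration, and should_deliver is a hypothetical
name, not a function in the monitor.

```python
# Subscription names from lowest to highest priority, as listed in the message.
PRIORITY = ["log-debug", "log-info", "log-sec", "log-warn", "log-error"]


def should_deliver(subscription, entry_level):
    """Assumed rule: deliver entries at the subscribed level or any higher one."""
    return PRIORITY.index(entry_level) >= PRIORITY.index(subscription)
```

Under that assumption a 'log-info' subscriber would see 'log-warn' entries
but not 'log-debug' ones.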
Also, add a new 'status' command to the monitor, which may be invoked by
any client (such as the ceph tool), and which shall return the status of
the various cluster components (osdmap, pgmap, ...).
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
workloadgen: forcing the user to specify a data and journal.
These default arguments, although handy when we just want to run the test,
just mess things up when we don't actually need them. If we don't specify
them on the CLI, we'll end up using the default ones, and that is just
annoying.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
workloadgen: Allow finer control over what the generator does.
Give the user more control over:
- the sizes of the data being written by the operations;
- which operations are suppressed from execution;
- whether the throughput is shown;
- the periodicity of throughput output.
For the CLI options, '--help' should suffice.
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
Sage Weil [Sun, 6 May 2012 21:18:22 +0000 (14:18 -0700)]
osd: reset last_peering_interval on replica activate
There was a silent bug in the activate 'acks' that go from the replica back
to the primary. Prior to 86aa07d7a91ac23074e76551c3a6db3a5736cffa, we
were passing same_interval_since to the callback, which meant that
sometimes _activate_committed() would ignore it and we wouldn't update
last_epoch_started. This was mostly invisible; the next peering event would
just, in some cases, look at more past intervals than it needed to.
In 86aa07d7a91ac23074e76551c3a6db3a5736cffa we fixed this so that the check
is correct. (We noticed because now we aren't setting the pg CLEAN flag
until after last_epoch_started is updated.) That, in turn, revealed a
similar bug that we're fixing here: the replica's last_peering_reset could
be lower than the primary's, such that the activate 'ack' info is ignored.
To fix this, simply set last_peering_reset to the current epoch when the
replica activates; this will always be greater than the primary's.
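
The epoch comparison can be pictured with a toy model. This is not the OSD
code: both function names are hypothetical, and it assumes the ack is stamped
with the replica's last_peering_reset and dropped when that stamp is older
than the primary's value.

```python
def replica_ack_epoch(replica_last_peering_reset, current_epoch, fixed):
    """Epoch the replica stamps on its activate ack (toy model)."""
    if fixed:
        # The fix: reset last_peering_reset to the current epoch on activate.
        replica_last_peering_reset = current_epoch
    return replica_last_peering_reset


def ack_is_ignored(ack_epoch, primary_last_peering_reset):
    """Assumed check: the primary drops an ack older than its last peering reset."""
    return ack_epoch < primary_last_peering_reset
```

With a replica value of 10, a primary value of 12 and a current epoch of 15,
the unfixed ack is stamped 10 and dropped, while the fixed ack is stamped 15
and accepted.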
Sage Weil [Sat, 5 May 2012 18:24:57 +0000 (11:24 -0700)]
osd: do not mark pg clean until active is durable
Do not mark a PG CLEAN or set last_epoch_clean until after the PG activate
is stable on all replicas.
This effectively means that last_epoch_clean will never fall in an interval
that follows last_epoch_started's interval. It *can* be >
last_epoch_started when it falls within the same interval.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 5 May 2012 20:07:06 +0000 (13:07 -0700)]
osd: check against last_peering_reset in _activate_committed
We are checking against last_peering_reset in _activate_committed(), so we
need to pass in that value to compare against; last_peering_reset may be
greater than same_interval_since, e.g. on a replica that learns about the
PG after the initial creation epoch.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Apparently S3_put_object() and S3_get_object() need to
run on the same thread as S3_runall_request_context() (at least
per context). So we now call them in the workqueue thread.
There was a bug when doing a read with multiple threads, when
one of the threads was left behind; by the time it returned, the
data string being compared might have been clobbered by newer
strings that were longer.
Sage Weil [Fri, 4 May 2012 22:26:33 +0000 (15:26 -0700)]
librados: call safe callback on read operation
This avoids confusion for the user who isn't sure if they should wait for
complete or safe on a read aio. It also means that you can always wait
for safe for both reads and writes, which can simplify some code.
Dup the roundtrip functional tests to verify this works.
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Yehuda Sadeh <yehuda.sadeh@inktank.com>
Sage Weil [Fri, 4 May 2012 18:05:34 +0000 (11:05 -0700)]
crush: comment and clean up checks for check_item_loc and insert_item
- drop useless cur for check_item_loc
- comment the checks we're doing so the code is understandable
- use name_exists instead of broken get_item_id != 0 check
Sage Weil [Fri, 4 May 2012 01:50:42 +0000 (18:50 -0700)]
global_init: do not count threads before daemonize()
We were verifying that there was only 1 thread (presumably main()) when
we call daemonize. However, with the new logging code, we stop a thread
right before the check, and /proc apparently updates asynchronously such
that our attempt to count running threads gives us a bad answer.
Just remove this kludgey check; we'll have to catch this class of bugs
the hard way.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Tommi Virtanen [Thu, 3 May 2012 17:10:29 +0000 (10:10 -0700)]
doc: Rename to use dashes not underscores in URLs.
This makes the-separate-words in the URL match as separate words in
searches, where this_way only matches an explicit "this_way" search.
http://www.mattcutts.com/blog/dashes-vs-underscores/
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
John Wilkins [Thu, 3 May 2012 03:31:35 +0000 (20:31 -0700)]
Removed "Ceph Development Status" per Bryan
Modified title syntax per Tommi
Modified paragraph width to 80-chars per Dan
Moved "Build from Source" out of Install
Renamed create_cluster to config-cluster
Added config-ref with configuration reference tables
Added a toc ref for man/1/obsync per Dan
Removed redundant sections from Ops
Deleted "Why use Ceph" and "Introduction to Storage Clusters"
Signed-off-by: John Wilkins <john.wilkins@dreamhost.com>