git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
12 years agoPG, OSD: reject backfills when an OSD is nearly full
Mike Ryan [Fri, 14 Sep 2012 17:31:42 +0000 (10:31 -0700)]
PG, OSD: reject backfills when an OSD is nearly full

Reject backfills when an OSD reaches a configurable full ratio. Retry
backfilling periodically in the hopes that the OSD has become less full.

This changeset introduces two configuration options for dealing with
this: osd_refuse_backfill_full_ratio and osd_backfill_retry_interval.

We also introduce two new state transitions in the PG's Active state.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
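
The admission check this commit describes can be sketched roughly as follows. Only the two option names come from the commit message; the `BackfillConfig` struct, the `check_backfill` helper, and the default values are illustrative assumptions, not Ceph's code.

```cpp
#include <cassert>

// Hypothetical holder for the two options named in the commit.
// The numeric defaults are placeholders, not Ceph's real defaults.
struct BackfillConfig {
  double osd_refuse_backfill_full_ratio = 0.85;  // refuse at or above this fraction
  double osd_backfill_retry_interval = 10.0;     // seconds before asking again
};

enum class BackfillDecision { Accept, RejectAndRetry };

// Decide whether a backfill target should accept a new backfill, given
// how full it currently is. A rejected request is meant to be retried
// after osd_backfill_retry_interval seconds.
inline BackfillDecision check_backfill(const BackfillConfig& conf,
                                       double used_bytes, double total_bytes) {
  assert(total_bytes > 0);
  const double full_ratio = used_bytes / total_bytes;
  return full_ratio >= conf.osd_refuse_backfill_full_ratio
             ? BackfillDecision::RejectAndRetry
             : BackfillDecision::Accept;
}
```

A caller that receives `RejectAndRetry` would schedule another attempt rather than fail the backfill outright.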
12 years agotimer: add unsafe callbacks option
Mike Ryan [Fri, 14 Sep 2012 17:30:17 +0000 (10:30 -0700)]
timer: add unsafe callbacks option

Using unsafe callbacks drops the lock between invocations of event
callbacks. It is useful under some circumstances, but the user must take
care. See the comment in Timer.h for full details.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
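
To illustrate what "drops the lock between invocations" can mean, here is a minimal sketch of a dispatch loop. `MiniTimer` and its methods are hypothetical and much simpler than the timer in Timer.h; the only idea taken from the commit is that the unsafe mode runs callbacks without the lock held.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical miniature of a timer's dispatch loop; not Ceph's timer.
class MiniTimer {
 public:
  explicit MiniTimer(bool safe_callbacks) : safe_callbacks_(safe_callbacks) {}

  void add_event(std::function<void()> cb) { events_.push_back(std::move(cb)); }

  // The caller must hold the lock. Runs every queued callback once.
  void run_pending(std::unique_lock<std::mutex>& held) {
    assert(held.owns_lock());
    std::vector<std::function<void()>> batch;
    batch.swap(events_);
    for (auto& cb : batch) {
      if (!safe_callbacks_) held.unlock();  // unsafe mode: callback runs unlocked
      cb();
      if (!safe_callbacks_) held.lock();    // retake the lock before continuing
    }
  }

 private:
  bool safe_callbacks_;
  std::vector<std::function<void()>> events_;
};
```

In the unsafe mode a callback may take the same lock itself, but any state protected by that lock can change while the callback runs, which is why the commit asks for caution.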
12 years agoPG,osd_types,PGMonitor: make backfill state names more descriptive
Samuel Just [Mon, 10 Sep 2012 16:25:07 +0000 (09:25 -0700)]
PG,osd_types,PGMonitor: make backfill state names more descriptive

PG_STATE_BACKFILL->PG_STATE_BACKFILL_WAIT
and
PG_STATE_BACKFILLING->PG_STATE_BACKFILL

backfill -> wait_backfill
backfill+backfilling -> backfill

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: add CEPH_FEATURE for backfill reservation
Samuel Just [Fri, 7 Sep 2012 19:10:24 +0000 (12:10 -0700)]
PG: add CEPH_FEATURE for backfill reservation

Also adds backwards compatibility by just post_event-ing
the RemoteBackfillReserved() rather than sending the
message to an older replica.

Signed-off-by: Samuel Just <sam.just@inktank.com>
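
The compatibility path can be pictured with a small sketch. The feature bit value, `Replica`, and `Primary` are assumptions made for illustration; only the names MBackfillReserve and RemoteBackfillReserved appear in the commit messages on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical feature bit; the real CEPH_FEATURE value is not shown here.
constexpr uint64_t FEATURE_BACKFILL_RESERVATION = 1ull << 0;

struct Replica {
  uint64_t features;  // feature bits advertised by the peer
};

// Records what the primary did, purely for illustration.
struct Primary {
  std::vector<std::string> sent_messages;
  std::vector<std::string> local_events;

  void request_remote_reservation(const Replica& r) {
    if (r.features & FEATURE_BACKFILL_RESERVATION) {
      // Newer replica: ask it for a reservation over the wire.
      sent_messages.push_back("MBackfillReserve");
    } else {
      // Older replica: act as if the reservation had been granted.
      local_events.push_back("RemoteBackfillReserved");
    }
  }
};
```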
12 years agodoc/: added documentation for backfill_reservation
Samuel Just [Fri, 7 Sep 2012 16:22:10 +0000 (09:22 -0700)]
doc/: added documentation for backfill_reservation

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd/: add PG_STATE_BACKFILLING
Samuel Just [Fri, 7 Sep 2012 01:02:08 +0000 (18:02 -0700)]
osd/: add PG_STATE_BACKFILLING

PG_STATE_BACKFILLING is set when the pg enters the Backfilling state.
That is, +backfilling indicates that the pg has obtained its
reservations and is now actively backfilling.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd/: add backfill reservations
Samuel Just [Thu, 6 Sep 2012 22:11:57 +0000 (15:11 -0700)]
osd/: add backfill reservations

Previously, a new osd would be bombarded by backfills from many osds
simultaneously, resulting in excessively high load.  Instead, we
want to limit the number of backfills coming into and going out
from a single osd.

To that end, each OSDService now has two AsyncReserver instances: one
for backfills going from the osd (local_reserver) and one for backfills
going to the osd (remote_reserver).  For a primary to initiate a
backfill, it must first obtain a reservation from its own
local_reserver.  Then, it must obtain a reservation from the backfill
target's remote_reserver via a MBackfillReserve message. This process is
managed by substates of Active and ReplicaActive (see the changes in
PG.h).  The reservations are dropped either on the Backfilled event
(which is sent on the primary before calling recovery_complete and on the
replica on receipt of the BackfillComplete progress message), or upon
leaving Active or ReplicaActive.

It's important that we always grab the local reservation before the
remote reservation in order to prevent a circular dependency.

Signed-off-by: Samuel Just <sam.just@inktank.com>
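
A rough sketch of the local-then-remote ordering follows. `MiniReserver` is a hypothetical, single-threaded stand-in for AsyncReserver with an invented interface; it only shows a bounded number of granted slots with a FIFO queue for the rest.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <utility>

// Simplified, hypothetical reserver: grants at most max_allowed
// reservations at once and queues the remaining requests in FIFO order.
class MiniReserver {
 public:
  explicit MiniReserver(std::size_t max_allowed) : max_allowed_(max_allowed) {}

  // Ask for a slot; on_granted runs once a slot is available.
  void request(const std::string& pg, std::function<void()> on_granted) {
    queue_.emplace_back(pg, std::move(on_granted));
    grant_waiting();
  }

  // Drop a held or still-queued reservation for pg.
  void release(const std::string& pg) {
    for (auto it = queue_.begin(); it != queue_.end(); ++it) {
      if (it->first == pg) { queue_.erase(it); break; }
    }
    in_progress_.erase(pg);
    grant_waiting();
  }

  bool holds(const std::string& pg) const { return in_progress_.count(pg) > 0; }

 private:
  void grant_waiting() {
    while (in_progress_.size() < max_allowed_ && !queue_.empty()) {
      auto next = std::move(queue_.front());
      queue_.pop_front();
      in_progress_.insert(next.first);
      next.second();  // notify the requester that it now holds a slot
    }
  }

  std::size_t max_allowed_;
  std::set<std::string> in_progress_;
  std::deque<std::pair<std::string, std::function<void()>>> queue_;
};
```

A primary would call `request` on its local reserver first and, only from that callback, request the target's remote slot. Taking the two in a fixed order is what avoids the circular wait the commit mentions.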
12 years agoOSD: add init and shutdown for OSDService
Samuel Just [Mon, 24 Sep 2012 18:37:07 +0000 (11:37 -0700)]
OSD: add init and shutdown for OSDService

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: create macro for simple events
Samuel Just [Wed, 5 Sep 2012 00:18:50 +0000 (17:18 -0700)]
PG: create macro for simple events

This should make defining no-information events a bit simpler.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years ago:doc: Fixed broken hyperlinks.
John Wilkins [Fri, 7 Sep 2012 04:00:29 +0000 (21:00 -0700)]
:doc: Fixed broken hyperlinks.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Addresses Documentation #3096. Also added new information.
John Wilkins [Fri, 7 Sep 2012 03:31:46 +0000 (20:31 -0700)]
:doc: Addresses Documentation #3096. Also added new information.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agogitignore: Remove "nbproject", no idea what that even is.
Tommi Virtanen [Thu, 6 Sep 2012 23:11:39 +0000 (16:11 -0700)]
gitignore: Remove "nbproject", no idea what that even is.

Looks like this line was accidentally introduced in commit
af4d8db55f7268ab68ee5a7e17ac58c993528566.

Signed-off-by: Tommi Virtanen <tv@inktank.com>
12 years agorgw: fix usage
Yehuda Sadeh [Thu, 6 Sep 2012 17:15:54 +0000 (10:15 -0700)]
rgw: fix usage

Fixes: #3085
usage was showing 'bucket info' command that never
existed.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years ago:doc: Minor tweak to heading text.
John Wilkins [Thu, 6 Sep 2012 00:33:45 +0000 (17:33 -0700)]
:doc: Minor tweak to heading text.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge branch 'master' of github.com:ceph/ceph
John Wilkins [Thu, 6 Sep 2012 00:26:39 +0000 (17:26 -0700)]
Merge branch 'master' of github.com:ceph/ceph

12 years ago:doc: Modified the index page to point to the new cluster-ops section.
John Wilkins [Thu, 6 Sep 2012 00:25:34 +0000 (17:25 -0700)]
:doc: Modified the index page to point to the new cluster-ops section.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Made minor changes to restructuredText headers.
John Wilkins [Thu, 6 Sep 2012 00:24:54 +0000 (17:24 -0700)]
:doc: Made minor changes to restructuredText headers.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Added comment redirecting editors to new page location.
John Wilkins [Thu, 6 Sep 2012 00:23:03 +0000 (17:23 -0700)]
:doc: Added comment redirecting editors to new page location.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Added index/toctree page for cluster ops.
John Wilkins [Thu, 6 Sep 2012 00:22:32 +0000 (17:22 -0700)]
:doc: Added index/toctree page for cluster ops.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Added new auth settings to reference doc.
John Wilkins [Thu, 6 Sep 2012 00:22:02 +0000 (17:22 -0700)]
:doc: Added new auth settings to reference doc.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Refactored and moved control.rst page.
John Wilkins [Thu, 6 Sep 2012 00:21:31 +0000 (17:21 -0700)]
:doc: Refactored and moved control.rst page.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Consolidated file system recommendations.
John Wilkins [Thu, 6 Sep 2012 00:21:04 +0000 (17:21 -0700)]
:doc: Consolidated file system recommendations.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Minor syntax update.
John Wilkins [Thu, 6 Sep 2012 00:20:30 +0000 (17:20 -0700)]
:doc: Minor syntax update.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: restructuredText syntax corrections.
John Wilkins [Thu, 6 Sep 2012 00:19:51 +0000 (17:19 -0700)]
:doc: restructuredText syntax corrections.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Added index page. This will be refactored again soon.
John Wilkins [Thu, 6 Sep 2012 00:17:57 +0000 (17:17 -0700)]
:doc: Added index page. This will be refactored again soon.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Refactored and moved to ceph/docs/cluster-ops/pools.rst
John Wilkins [Thu, 6 Sep 2012 00:17:20 +0000 (17:17 -0700)]
:doc: Refactored and moved to ceph/docs/cluster-ops/pools.rst

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Removed. New section is in ceph/doc/cluster-ops/authentication.rst
John Wilkins [Thu, 6 Sep 2012 00:16:29 +0000 (17:16 -0700)]
:doc: Removed. New section is in ceph/doc/cluster-ops/authentication.rst

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Fixed heading syntax.
John Wilkins [Thu, 6 Sep 2012 00:15:27 +0000 (17:15 -0700)]
:doc: Fixed heading syntax.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoPG: clear want_acting in choose_acting if want == acting
Samuel Just [Wed, 5 Sep 2012 22:56:25 +0000 (15:56 -0700)]
PG: clear want_acting in choose_acting if want == acting

Otherwise, a pg_temp from a previous peering sequence
(but not a different peering_interval) might leak through
into Active and incorrectly trip the
Active::react(AdvMap&) asserts regarding want_acting.
Those asserts assume that want_acting is either empty or is
a result of recovery completion.  In the latter case, the
want_acting set must consist only of elements of up and
acting.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip_deep_scrub_final'
Sage Weil [Wed, 5 Sep 2012 21:23:49 +0000 (14:23 -0700)]
Merge remote-tracking branch 'gh/wip_deep_scrub_final'

12 years agoosd: initialize pg_log_entry_t::invalid_pool in default ctor
Sage Weil [Fri, 31 Aug 2012 22:40:16 +0000 (15:40 -0700)]
osd: initialize pg_log_entry_t::invalid_pool in default ctor

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc: explain what scrub, deep-scrub, and repair actually do
Mike Ryan [Tue, 7 Aug 2012 23:32:59 +0000 (16:32 -0700)]
doc: explain what scrub, deep-scrub, and repair actually do

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoosd: deep scrub, read file contents from disk and compare digest
Mike Ryan [Mon, 27 Aug 2012 18:16:17 +0000 (11:16 -0700)]
osd: deep scrub, read file contents from disk and compare digest

Deep scrub reads the contents of every file from the store and computes
a crc32 digest. The primary compares the digest of all replicas and will
mark the PG inconsistent if any don't match.

OSDs that do not support deep scrub simply perform an ordinary chunky
scrub. Any subset of OSDs that do support deep scrub will have their
digests compared.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
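
The comparison step might look like the sketch below. It assumes the digests have already been collected per OSD and that the primary's own digest is the reference value; both the `DigestMap` type and that choice of reference are assumptions, since the commit only says that mismatching digests mark the PG inconsistent.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical per-object scrub result: OSD id -> digest.
// An OSD that does not support deep scrub simply has no entry.
using DigestMap = std::map<int, uint32_t>;

// Return the OSDs whose digest differs from the primary's. An empty
// result means every reporting OSD agrees.
inline std::vector<int> mismatched_osds(const DigestMap& digests, int primary) {
  std::vector<int> bad;
  auto p = digests.find(primary);
  if (p == digests.end()) return bad;  // primary has no digest: nothing to compare
  for (const auto& kv : digests) {
    if (kv.second != p->second) bad.push_back(kv.first);
  }
  return bad;
}
```

A non-empty result would correspond to marking the PG inconsistent.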
12 years agobuffer: class for efficiently calculating CRC32 of >= 1 bufferlist
Mike Ryan [Tue, 31 Jul 2012 21:21:43 +0000 (14:21 -0700)]
buffer: class for efficiently calculating CRC32 of >= 1 bufferlist

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
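
As an illustration of computing one checksum across several buffers, here is a small running CRC. `RunningCrc32` is a hypothetical class, not the one added by this commit, and it uses the common reflected CRC-32 polynomial in a slow bitwise form; which CRC variant Ceph uses is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320), written for clarity
// rather than speed. State carries over between update() calls, so data
// split across many buffers yields the same digest as one big buffer.
class RunningCrc32 {
 public:
  void update(const void* data, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < len; ++i) {
      state_ ^= p[i];
      for (int bit = 0; bit < 8; ++bit) {
        // Shift right; XOR in the polynomial when the low bit was set.
        state_ = (state_ >> 1) ^ (0xEDB88320u & (0u - (state_ & 1u)));
      }
    }
  }

  void update(const std::string& s) { update(s.data(), s.size()); }

  uint32_t digest() const { return state_ ^ 0xFFFFFFFFu; }

 private:
  uint32_t state_ = 0xFFFFFFFFu;
};
```

Feeding "1234" and then "56789" gives the same digest as feeding "123456789" in one call.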
12 years agopg: store scrubber state in its own object
Mike Ryan [Tue, 4 Sep 2012 23:37:54 +0000 (16:37 -0700)]
pg: store scrubber state in its own object

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoosd: chunky scrub, scrub PGs a chunk of objects at a time
Mike Ryan [Mon, 16 Jul 2012 22:58:26 +0000 (15:58 -0700)]
osd: chunky scrub, scrub PGs a chunk of objects at a time

Chunky scrub is a more efficient scrub. It blocks writes on a subset of
objects and scrubs those, allowing writes through to the rest of the PG.

The scrub takes longer to complete than a classic scrub, but improves
overall write throughput.

This feature is backward-compatible with classic scrub. If the primary
detects that any replica does not have the chunky scrub feature, it
falls back to the less efficient classic scrub.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
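
The chunk-at-a-time idea can be sketched as below. `ChunkyScrubber` is a hypothetical helper over a sorted list of object names; the actual scrub work and the replica messaging are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical chunk-at-a-time scrubber over a sorted list of object names.
class ChunkyScrubber {
 public:
  ChunkyScrubber(std::vector<std::string> sorted_objects, std::size_t chunk_size)
      : objects_(std::move(sorted_objects)), chunk_size_(chunk_size) {}

  // Advance to the next chunk; returns false once every object was covered.
  bool next_chunk() {
    begin_ = end_;
    end_ = std::min(objects_.size(), begin_ + chunk_size_);
    return begin_ < end_;
  }

  // A write is blocked only while its object is inside the current chunk.
  bool write_blocked(const std::string& object) const {
    for (std::size_t i = begin_; i < end_; ++i) {
      if (objects_[i] == object) return true;
    }
    return false;
  }

 private:
  std::vector<std::string> objects_;
  std::size_t chunk_size_;
  std::size_t begin_ = 0;
  std::size_t end_ = 0;
};
```

Writes to objects outside the current chunk pass straight through, which is where the throughput gain over a whole-PG scrub comes from.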
12 years agocrush: change default type from 'pool' to 'root'
Sage Weil [Wed, 5 Sep 2012 20:05:59 +0000 (13:05 -0700)]
crush: change default type from 'pool' to 'root'

The 'pool=default' in the default crush maps is confusing wrt rados pools.
'root' makes more sense given that we are talking about hierarchies/trees.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'upstream/wip-kvstore'
Samuel Just [Wed, 5 Sep 2012 17:50:50 +0000 (10:50 -0700)]
Merge remote-tracking branch 'upstream/wip-kvstore'

12 years agoFileStore: get objects whose names fall within a range
Mike Ryan [Tue, 10 Jul 2012 23:22:58 +0000 (16:22 -0700)]
FileStore: get objects whose names fall within a range

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agopg: change _scrub() to take out parameters as pointers
Mike Ryan [Tue, 3 Jul 2012 22:37:12 +0000 (15:37 -0700)]
pg: change _scrub() to take out parameters as pointers

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoMonitor.cc: Added include for limits.h.
Gary Lowell [Wed, 5 Sep 2012 03:31:00 +0000 (20:31 -0700)]
Monitor.cc:  Added include for limits.h.

This include is needed on CentOS.  It seems to be included implicitly
on other platforms.

12 years agoMerge branch 'master' of github.com:ceph/ceph
John Wilkins [Wed, 5 Sep 2012 00:17:07 +0000 (17:17 -0700)]
Merge branch 'master' of github.com:ceph/ceph

12 years ago:doc: Added a section for adding and removing monitors. Significantly re-factored.
John Wilkins [Wed, 5 Sep 2012 00:16:23 +0000 (17:16 -0700)]
:doc: Added a section for adding and removing monitors. Significantly re-factored.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago:doc: Incorporated Joao's feedback into the reference material.
John Wilkins [Wed, 5 Sep 2012 00:15:35 +0000 (17:15 -0700)]
:doc: Incorporated Joao's feedback into the reference material.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodocs: Add CloudStack documentation
Wido den Hollander [Wed, 8 Aug 2012 21:51:10 +0000 (23:51 +0200)]
docs: Add CloudStack documentation

The basic documentation about how you can use RBD with CloudStack

Signed-off-by: Wido den Hollander <wido@widodh.nl>
12 years ago:doc: Added recovering from OSD failures. Will be re-factored again soon.
John Wilkins [Tue, 4 Sep 2012 23:34:32 +0000 (16:34 -0700)]
:doc: Added recovering from OSD failures. Will be re-factored again soon.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added monitor failure recovery. Will be re-factored again soon.
John Wilkins [Tue, 4 Sep 2012 23:33:47 +0000 (16:33 -0700)]
doc: Added monitor failure recovery. Will be re-factored again soon.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Re-factored adding an OSD.
John Wilkins [Tue, 4 Sep 2012 23:19:33 +0000 (16:19 -0700)]
doc: Re-factored adding an OSD.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Minor updates.
John Wilkins [Tue, 4 Sep 2012 23:18:54 +0000 (16:18 -0700)]
doc: Minor updates.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge branch 'master' of github.com:ceph/ceph
John Wilkins [Tue, 4 Sep 2012 23:17:46 +0000 (16:17 -0700)]
Merge branch 'master' of github.com:ceph/ceph

12 years agodoc: Added admonishments for Ceph FS per http://tracker.newdream.net/issues/3077
John Wilkins [Tue, 4 Sep 2012 23:17:29 +0000 (16:17 -0700)]
doc: Added admonishments for Ceph FS per http://tracker.newdream.net/issues/3077

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Updated to incorporate Sage's changes.
John Wilkins [Tue, 4 Sep 2012 23:08:35 +0000 (16:08 -0700)]
doc: Updated to incorporate Sage's changes.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added anchor references.
John Wilkins [Tue, 4 Sep 2012 23:03:34 +0000 (16:03 -0700)]
doc: Added anchor references.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Refactored the debug section to point back to reference.
John Wilkins [Tue, 4 Sep 2012 23:02:37 +0000 (16:02 -0700)]
doc: Refactored the debug section to point back to reference.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added QA reference to --valgrind option.
John Wilkins [Tue, 4 Sep 2012 23:01:48 +0000 (16:01 -0700)]
doc: Added QA reference to --valgrind option.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoRevert "ReplicatedPG: fill in user log entry last after snapdir tran"
Sage Weil [Tue, 4 Sep 2012 22:21:45 +0000 (15:21 -0700)]
Revert "ReplicatedPG: fill in user log entry last after snapdir tran"

This reverts commit 0aad5462eb79be0427004f2442903bb56c2057c1.

This gives us two events with the same version, and crashes like so:

osd/PG.cc: In function 'void PG::add_log_entry(pg_log_entry_t&, ceph::bufferlist&)' thread 7fd21b187700 time 2012-09-04 15:10:39.475385
osd/PG.cc: 2181: FAILED assert(e.version > info.last_update)

 ceph version 0.51-411-g40fd6ba (commit:40fd6ba8ed9ba70c8d20a79936f53f10f2dfe839)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x1025139]
 2: (PG::add_log_entry(pg_log_entry_t&, ceph::buffer::list&)+0xb0) [0xe47552]
 3: (PG::append_log(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, eversion_t, ObjectStore::Transaction&)+0x1cd) [0xe47939]
 4: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x41f2) [0xcb5c84]
 5: (PG::do_request(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0xe42329]
 6: (OSD::dequeue_op(PG*)+0x286) [0xd9a214]
 7: (OSD::OpWQ::_process(PG*)+0x27) [0xda20e7]
 8: (ThreadPool::WorkQueue<PG>::_void_process(void*)+0x2e) [0xdf1ab4]
 9: (ThreadPool::worker()+0x4ce) [0x101c762]
 10: (ThreadPool::WorkThread::entry()+0x1c) [0xda049e]
 11: (Thread::_entry_func(void*)+0x23) [0x1016a49]
 12: (()+0x7e9a) [0x7fd22b7fce9a]
 13: (clone()+0x6d) [0x7fd229db14bd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

12 years agoOSD::handle_pg_stats_ack: grab pg refcount while processing pg
Samuel Just [Tue, 4 Sep 2012 20:55:09 +0000 (13:55 -0700)]
OSD::handle_pg_stats_ack: grab pg refcount while processing pg

If the queue refcount is the last one for the pg, the pg->put()
in the loop will destroy the pg while the lock is still held
leading to #3071.  Thus, grab refcount in case we need to drop
it.

Signed-off-by: Samuel Just <sam.just@inktank.com>
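
The pattern of taking an extra reference before dropping the queue's can be sketched like this. `RefCounted` and `process_queued` are hypothetical and only mimic the get()/put() style; they are not the OSD's code.

```cpp
#include <cassert>

// Hypothetical intrusive refcount in the spirit of pg->get() / pg->put().
struct RefCounted {
  int refs = 1;     // starts with the queue's reference
  bool* destroyed;  // set to true when the object deletes itself
  explicit RefCounted(bool* d) : destroyed(d) {}
  void get() { ++refs; }
  void put() {
    if (--refs == 0) {
      *destroyed = true;
      delete this;
    }
  }
};

// Process an item whose only remaining reference may be the queue's.
// Returns whether the item was still alive while work was in progress.
inline bool process_queued(RefCounted* item) {
  bool* destroyed = item->destroyed;
  item->get();  // our own reference keeps the object alive
  item->put();  // drop the queue's reference; cannot reach zero here
  const bool alive_during_work = !*destroyed;  // e.g. its lock is still valid
  item->put();  // release our reference; the object may be destroyed now
  return alive_during_work;
}
```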
12 years agoReplicatedPG: fill in user log entry last after snapdir tran
Samuel Just [Tue, 4 Sep 2012 20:32:58 +0000 (13:32 -0700)]
ReplicatedPG: fill in user log entry last after snapdir tran

The user log entry contains the request id, which will be used
by replay ops to put themselves in the correct place in the
waiting_for_commit/ack maps.  Thus, the repop needs to be tagged
with the same version as the log entry with the request id.
Thus, the request id bearing log entry should be the last in
the log entry vector.

This should fix #3072, wherein a replay which should wait on
the repop tagged as version '36 will instead wait on '35.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoPG: In Active, don't transition to WaitActingChange
Samuel Just [Thu, 23 Aug 2012 18:10:25 +0000 (11:10 -0700)]
PG: In Active, don't transition to WaitActingChange

want_acting is filled in during recovery completion in
order to move the newly backfilled osd into its correct
place.  In this case, however, want_acting must contain
only members of acting and up.  Thus, we can be sure that
if any of them go down, we would restart peering anyway.
Thus, we need not transition to WaitActingChange, which
does not reflect that we continue to serve client operations
in the interim.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge branch 'wip-msgr'
Sage Weil [Tue, 4 Sep 2012 19:17:13 +0000 (12:17 -0700)]
Merge branch 'wip-msgr'

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomsg/Pipe: kill useless onconnect arg
Sage Weil [Tue, 4 Sep 2012 18:54:44 +0000 (11:54 -0700)]
msg/Pipe: kill useless onconnect arg

This reduces debug output but nothing else, for no discernible reason.
Drop it.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'master' of github.com:ceph/ceph
John Wilkins [Tue, 4 Sep 2012 18:40:51 +0000 (11:40 -0700)]
Merge branch 'master' of github.com:ceph/ceph

12 years agodoc: Added PG states.
John Wilkins [Tue, 4 Sep 2012 18:40:25 +0000 (11:40 -0700)]
doc: Added PG states.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Promoting PG concepts into mainline docs. Redundant version still in Internals.
John Wilkins [Tue, 4 Sep 2012 18:39:16 +0000 (11:39 -0700)]
doc: Promoting PG concepts into mainline docs. Redundant version still in Internals.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: New section on placement groups.
John Wilkins [Tue, 4 Sep 2012 18:38:27 +0000 (11:38 -0700)]
doc: New section on placement groups.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Moved from configuration to operations. Updated with new info.
John Wilkins [Tue, 4 Sep 2012 18:37:52 +0000 (11:37 -0700)]
doc: Moved from configuration to operations. Updated with new info.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Created a more robust doc for monitoring a cluster.
John Wilkins [Tue, 4 Sep 2012 18:37:13 +0000 (11:37 -0700)]
doc: Created a more robust doc for monitoring a cluster.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Consolidated start and stop.
John Wilkins [Tue, 4 Sep 2012 18:36:33 +0000 (11:36 -0700)]
doc: Consolidated start and stop.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added a new "Data Placement" overview section for added continuity.
John Wilkins [Tue, 4 Sep 2012 18:35:33 +0000 (11:35 -0700)]
doc: Added a new "Data Placement" overview section for added continuity.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added a new CRUSH map section. Will need to incorporate new tunables info.
John Wilkins [Tue, 4 Sep 2012 18:34:58 +0000 (11:34 -0700)]
doc: Added a new CRUSH map section. Will need to incorporate new tunables info.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Moving new auth section from configuration to operations.
John Wilkins [Tue, 4 Sep 2012 18:33:37 +0000 (11:33 -0700)]
doc: Moving new auth section from configuration to operations.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoobjecter: fix osdmap wait
Sage Weil [Tue, 4 Sep 2012 18:29:21 +0000 (11:29 -0700)]
objecter: fix osdmap wait

When we get a pool_op_reply, we find out which osdmap we need to wait for.
The wait_for_new_map() code was feeding that epoch into
maybe_request_map(), which was feeding it to the monitor with the subscribe
request.  However, that epoch is the *start* epoch, not what we want.  Fix
this code to always subscribe to what we have (+1), and ensure we keep
asking for more until we catch up to what we know we should eventually
get.

Bug: #3075
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
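
The fixed behaviour (subscribe from the epoch after the one we hold, and keep asking until we reach the epoch we need) can be sketched as follows. `MapWaiter` and its members are simplified, hypothetical stand-ins; the `maybe_request_map` here only borrows the name from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using epoch_t = uint32_t;

// Hypothetical client-side view: the epoch we hold and the epoch we need.
struct MapWaiter {
  epoch_t have;
  epoch_t want;
  std::vector<epoch_t> subscribe_requests;  // start epochs sent to the monitor

  // Subscribe starting just past what we already have, never at `want`.
  bool maybe_request_map() {
    if (have >= want) return false;
    subscribe_requests.push_back(have + 1);
    return true;
  }

  // Called when maps up to `latest` arrive; keeps asking until caught up.
  void handle_osd_map(epoch_t latest) {
    if (latest > have) have = latest;
    maybe_request_map();
  }
};
```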
12 years agodoc: Fix leftover "localhost" mention.
Tommi Virtanen [Tue, 4 Sep 2012 15:20:57 +0000 (08:20 -0700)]
doc: Fix leftover "localhost" mention.

Commit dd011aba90831bade3b67e99268429be10635dce changed
the conf file sample to say {hostname}, but changed the
prose only from ``localhost`` to ``{localhost}``.

Signed-off-by: Tommi Virtanen <tv@inktank.com>
12 years agodoc: Added debug ref to toctree. Trimmed title names a bit.
John Wilkins [Mon, 3 Sep 2012 21:06:44 +0000 (14:06 -0700)]
doc: Added debug ref to toctree. Trimmed title names a bit.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added "how to" for debug/logging config. Trimmed titles too.
John Wilkins [Mon, 3 Sep 2012 21:05:51 +0000 (14:05 -0700)]
doc: Added "how to" for debug/logging config. Trimmed titles too.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added logging reference under configuration section.
John Wilkins [Mon, 3 Sep 2012 21:04:11 +0000 (14:04 -0700)]
doc: Added logging reference under configuration section.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agomsg/Pipe: do not special-case failure during connect
Sage Weil [Mon, 3 Sep 2012 21:00:09 +0000 (14:00 -0700)]
msg/Pipe: do not special-case failure during connect

Do not special case failure during connect.  In particular, we may be
reconnecting and experience a second fault, and wipe out our session
(e.g., between the fs client and the mds) and destroy important session
state.

This logic dates back to the original patch in '08 when the standby
state was introduced.

Bug: #3070
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc: Added runtime configuration example.
John Wilkins [Mon, 3 Sep 2012 20:35:49 +0000 (13:35 -0700)]
doc: Added runtime configuration example.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agotest, key_value_store: added distributed flat btree key-value store
Eleanor Cawthon [Fri, 8 Jun 2012 18:05:20 +0000 (11:05 -0700)]
test, key_value_store: added distributed flat btree key-value store

Uses one index object and many sub objects to store key-value pairs. The pairs
are stored in the omaps of librados objects. The index contains keys
corresponding to the highest key in an object, and values that contain the
name of the object where the key range is stored. The tree guarantees that
the number of pairs in an object will be > k and < 2k for a user-specified k.
KvStoreBench contains benchmarking tests.

Signed-off-by: Eleanor Cawthon <eleanor.cawthon@inktank.com>
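
A lookup through such an index can be sketched with an ordered map. The `Index` alias, `locate_object`, and `size_ok` are assumptions for illustration; the real store keeps the index in a librados object's omap rather than in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical index: highest key stored in a sub-object -> sub-object name.
using Index = std::map<std::string, std::string>;

// Find the sub-object whose key range covers `key`: the first index entry
// whose highest key is >= key. Returns "" when key lies past every range.
inline std::string locate_object(const Index& index, const std::string& key) {
  auto it = index.lower_bound(key);
  return it == index.end() ? std::string() : it->second;
}

// The size invariant from the commit message: more than k and fewer than
// 2k pairs per sub-object, for a user-specified k.
inline bool size_ok(std::size_t pairs, std::size_t k) {
  return pairs > k && pairs < 2 * k;
}
```

With index entries for highest keys "g", "p" and "z", a lookup of "h" lands in the sub-object whose highest key is "p".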
12 years agovstart.sh: -r to start radosgw
Sage Weil [Sat, 1 Sep 2012 21:39:28 +0000 (14:39 -0700)]
vstart.sh: -r to start radosgw

Uses a fixed access/secret key for easier testing.  Starts a standalone
apache2 process with basic config (based on the teuthology one).

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip-coverity'
Sage Weil [Sat, 1 Sep 2012 21:50:34 +0000 (14:50 -0700)]
Merge remote-tracking branch 'gh/wip-coverity'

12 years agoMerge branch 'wip-osd-flags'
Sage Weil [Sat, 1 Sep 2012 00:06:16 +0000 (17:06 -0700)]
Merge branch 'wip-osd-flags'

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoosd: defer backfill when NOBACKFILL osdmap flag is set
Sage Weil [Fri, 31 Aug 2012 23:31:01 +0000 (16:31 -0700)]
osd: defer backfill when NOBACKFILL osdmap flag is set

If we encounter nobackfill, let ourselves fall out of the recovery
queue.  If we encounter a map that does not have the flag set and we
are not clean, requeue ourselves.  This is a big hammer, but simple.

Signed-off-by: Sage Weil <sage@inktank.com>
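
The rule in this commit message reduces to a small decision function. `RecoveryAction` and `on_osdmap` are hypothetical names; the sketch only restates the two conditions given in the message.

```cpp
#include <cassert>

enum class RecoveryAction { None, LeaveQueue, Requeue };

// Decide what a PG does when it sees a new osdmap.
inline RecoveryAction on_osdmap(bool nobackfill_flag, bool pg_clean) {
  if (nobackfill_flag) return RecoveryAction::LeaveQueue;  // fall out of the queue
  if (!pg_clean) return RecoveryAction::Requeue;           // flag cleared, work remains
  return RecoveryAction::None;                             // clean and unblocked
}
```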
12 years agoClarify CodingStyle with respect to tab compression of space runs
Dan Mick [Fri, 31 Aug 2012 22:18:53 +0000 (15:18 -0700)]
Clarify CodingStyle with respect to tab compression of space runs
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Greg Farnum <gregory.farnum@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoFix rados put from '-' (stdin)
Dan Mick [Fri, 31 Aug 2012 21:41:29 +0000 (14:41 -0700)]
Fix rados put from '-' (stdin)

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Mike Ryan <mike.ryan@inktank.com>
Reviewed-by: Greg Farnum <gregory.farnum@inktank.com>
Fixes: #3068
12 years agoosd: pause/unpause recovery based on NORECOVER osdmap flag
Sage Weil [Fri, 24 Aug 2012 01:16:58 +0000 (18:16 -0700)]
osd: pause/unpause recovery based on NORECOVER osdmap flag

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdmap: add NORECOVER flag
Sage Weil [Fri, 24 Aug 2012 01:12:28 +0000 (18:12 -0700)]
osdmap: add NORECOVER flag

This will stop recovery via log catch-up and via backfill both.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdmap: add NOBACKFILL flag
Sage Weil [Fri, 24 Aug 2012 01:00:57 +0000 (18:00 -0700)]
osdmap: add NOBACKFILL flag

This will tell the OSDs to please not initiate any backfill operations.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoPG: do not update stats in ReplicaActive from info
Samuel Just [Fri, 31 Aug 2012 21:01:47 +0000 (14:01 -0700)]
PG: do not update stats in ReplicaActive from info

Bug #2954

Consider the following case:

1) Primary calls share_pg_info()
2) Primary processes client op and sends off sub_op to replica
3) Replica processes sub_op
4) Replica processes info, reverting stat to before 2)

Similarly:

1) Primary processes client op
2) Primary calls share_pg_info()
3) Replica processes info
[4) Replica processes sub_op]

If 4) is interrupted by a map change, we can end up in a case where
the replica's info has a stat which reflects a log entry which
is not there.  If that log ends up authoritative, the most recent
op will be replayed and end up double counted in the log.

There should actually be no cases where the stats change after the
replica goes active except as part of a sub_op_modify.  Thus,
ReplicaActive::MInfoRec should not update the stats.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agocrushtool: Miscellaneous cleanup.
caleb miles [Tue, 28 Aug 2012 21:42:30 +0000 (14:42 -0700)]
crushtool: Miscellaneous cleanup.

Clean up the output messages; add some function documentation and some
unit tests.

Signed-off-by: caleb miles <caleb.miles@inktank.com>
12 years agoosd/osd_types.h: fix pg_history_t::merge copy paste error
Samuel Just [Thu, 30 Aug 2012 00:00:43 +0000 (17:00 -0700)]
osd/osd_types.h: fix pg_history_t::merge copy paste error

CID 716882: Copy-paste error (COPY_PASTE_ERROR)
At (2): "last_epoch_started" in "other.last_epoch_started" looks like a
copy-paste error. Should it say "last_epoch_split" instead?

From what I can tell, this really should be checking other.last_epoch_split
rather than other.last_epoch_started.

Signed-off-by: Samuel Just <sam.just@inktank.com>
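
The corrected comparison can be illustrated with a cut-down struct. `MiniHistory` is a hypothetical stand-in that keeps only the two fields named in the Coverity report; the real pg_history_t is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = uint32_t;

// Simplified stand-in for pg_history_t with just two fields.
struct MiniHistory {
  epoch_t last_epoch_started = 0;
  epoch_t last_epoch_split = 0;

  // Take the newer value of each field; returns true if anything changed.
  bool merge(const MiniHistory& other) {
    bool modified = false;
    if (last_epoch_started < other.last_epoch_started) {
      last_epoch_started = other.last_epoch_started;
      modified = true;
    }
    // Each field is compared against the same field in `other`; the
    // copy-paste slip compared against other.last_epoch_started here.
    if (last_epoch_split < other.last_epoch_split) {
      last_epoch_split = other.last_epoch_split;
      modified = true;
    }
    return modified;
  }
};
```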
12 years agoosd/Watch.h: uninit var in ctor Watch
Samuel Just [Wed, 29 Aug 2012 23:43:02 +0000 (16:43 -0700)]
osd/Watch.h: uninit var in ctor Watch

CID 717345: Uninitialized pointer field (UNINIT_CTOR)
At (8): Non-static class member "obc" is not initialized in this
constructor nor in any functions that it calls.
At (2): Non-static class member "id" is not initialized in this
constructor nor in any functions that it calls.
At (4): Non-static class member "reply" is not initialized in this
constructor nor in any functions that it calls.
At (6): Non-static class member "timeout" is not initialized in this
constructor nor in any functions that it calls.

Signed-off-by: Samuel Just <sam.just@inktank.com>
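
The usual fix for this class of report is to initialize every member in the constructor's initializer list. The sketch below uses the four member names from the report, but `MiniWatch` and the member types are assumptions, not the real Watch class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a class flagged by UNINIT_CTOR. The member
// names follow the report; their types are guesses for illustration.
struct MiniWatch {
  void* obc;
  uint64_t id;
  void* reply;
  void* timeout;

  // Every member gets a defined value, listed in declaration order.
  MiniWatch() : obc(nullptr), id(0), reply(nullptr), timeout(nullptr) {}
};
```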
12 years agoosd/ReplicatedPG.h: uninit var in ctor RepModify
Samuel Just [Wed, 29 Aug 2012 23:39:25 +0000 (16:39 -0700)]
osd/ReplicatedPG.h: uninit var in ctor RepModify

CID 717344: Uninitialized scalar field (UNINIT_CTOR)
At (2): Non-static class member "epoch_started" is not initialized in this
constructor nor in any functions that it calls.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd/ReplicatedPG.h: uninit var in ctor OpContext
Samuel Just [Wed, 29 Aug 2012 23:37:50 +0000 (16:37 -0700)]
osd/ReplicatedPG.h: uninit var in ctor OpContext

CID 717343: Uninitialized pointer field (UNINIT_CTOR)
At (3): Non-static class member "snapset" is not initialized in this
constructor nor in any functions that it calls.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd/ReplicatedPG: pass PGPool to ReplicatedPG ctor by ref
Samuel Just [Wed, 29 Aug 2012 23:35:01 +0000 (16:35 -0700)]
osd/ReplicatedPG: pass PGPool to ReplicatedPG ctor by ref

CID 717057: Big parameter passed by value (PASS_BY_VALUE)
At (1): Passing parameter _pool of type PGPool (size 336 bytes) by value.

Signed-off-by: Samuel Just <sam.just@inktank.com>
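
The cost that PASS_BY_VALUE points at can be made visible with a copy counter. `BigPool` is a hypothetical 336-byte type chosen to match the size in the report; it is not PGPool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical 336-byte type that counts how often it is copied.
struct BigPool {
  char payload[336];
  static int copies;
  BigPool() : payload() {}
  BigPool(const BigPool& other) {
    std::memcpy(payload, other.payload, sizeof payload);
    ++copies;
  }
};
int BigPool::copies = 0;

static_assert(sizeof(BigPool) == 336, "matches the size in the report");

// Passing by value copies all 336 bytes on every call.
inline std::size_t use_by_value(BigPool pool) { return sizeof pool.payload; }

// Passing by const reference copies nothing.
inline std::size_t use_by_ref(const BigPool& pool) { return sizeof pool.payload; }
```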
12 years agoosd/PG.h: uninit var in ctor NamedState
Samuel Just [Wed, 29 Aug 2012 23:32:28 +0000 (16:32 -0700)]
osd/PG.h: uninit var in ctor NamedState

CID 717340: Uninitialized pointer field (UNINIT_CTOR)
At (2): Non-static class member "state_name" is not initialized in this
constructor nor in any functions that it calls.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd/PG.h: uninit var in ctor OndiskLog
Samuel Just [Wed, 29 Aug 2012 23:31:20 +0000 (16:31 -0700)]
osd/PG.h: uninit var in ctor OndiskLog

CID 717342: Uninitialized scalar field (UNINIT_CTOR)
At (2): Non-static class member "has_checksums" is not initialized in this
constructor nor in any functions that it calls.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd/PG.h: uninit var in ctor IndexedLog
Samuel Just [Wed, 29 Aug 2012 23:30:07 +0000 (16:30 -0700)]
osd/PG.h: uninit var in ctor IndexedLog

CID 717339: Uninitialized scalar field (UNINIT_CTOR)
At (2): Non-static class member "last_requested" is not initialized in this
constructor nor in any functions that it calls.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd/PG.cc: PG constructor pass PGPool by reference
Samuel Just [Wed, 29 Aug 2012 23:28:20 +0000 (16:28 -0700)]
osd/PG.cc: PG constructor pass PGPool by reference

CID 717053: Big parameter passed by value (PASS_BY_VALUE)
At (1): Passing parameter _pool of type PGPool (size 336 bytes) by value.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd/OpRequest.h: uninit vars in ctor OpRequest
Samuel Just [Wed, 29 Aug 2012 23:25:20 +0000 (16:25 -0700)]
osd/OpRequest.h: uninit vars in ctor OpRequest

CID 717338: Uninitialized scalar field (UNINIT_CTOR)
At (2): Non-static class member "hit_flag_points" is not initialized in this
constructor nor in any functions that it calls.
At (4): Non-static class member "latest_flag_point" is not initialized in this
constructor nor in any functions that it calls.

Signed-off-by: Samuel Just <sam.just@inktank.com>