git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years agoosd: don't complete recovery if unfound
Sage Weil [Thu, 23 Feb 2012 17:39:50 +0000 (09:39 -0800)]
osd: don't complete recovery if unfound

Otherwise we fail the !needs_recovery() assert, because we aren't
actually recovered.  For example,

2012-02-21 16:16:13.104665 1685c700 osd.5 1217 pg[0.16( v 1215'337 lc 19'2 (0'0,1215'337] n=25 ec=1 les/c 0/1061 1210/1210/1210) [5,3] r=0 lpr=1210 mlcod 0'0 active m=23 u=23 snaptrimq=[1~99,9b~e,aa~72,11d~3d,15b~e,16a~f,17a~5,180~4,185~1a,1a0~a,1ac~10,1bd~4,1c2~8,1cb~1,1cd~1,1cf~1a,1ea~10,1fb~6,202~2,205~2,209~2,20c~8,215~2,218~5,21e~1,220~1,222~9,22c~4,231~3,235~2,238~3,23e~2,241~4,246~1,248~1,24b~1,24d~9,257~6,25e~1,263~1,265~2,268~3,26e~1,273~1,275~5,27e~1,280~2]] needs_recovery osd.3 has 23 missing
osd/PG.cc: In function 'boost::statechart::result PG::RecoveryState::Active::react(const PG::RecoveryState::RecoveryComplete&)' thread 1685c700 time 2012-02-21 16:16:13.108923
osd/PG.cc: 4070: FAILED assert(!pg->needs_recovery())
 ceph version 0.42-70-g0e4367a (commit:0e4367aaac88b99c36386b6ce5e8d816fdd4ada0)
 1: (PG::RecoveryState::Active::react(PG::RecoveryState::RecoveryComplete const&)+0x1b3) [0x6a1173]
 2: (boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x121) [0x6c7301]
 3: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x6bfc6b]
 4: (PG::RecoveryState::handle_recovery_complete(PG::RecoveryCtx*)+0x10c) [0x67c03c]
 5: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*)+0x241) [0x4f83c1]
 6: (OSD::do_recovery(PG*)+0x345) [0x54b3e5]
 7: (ThreadPool::worker()+0xa26) [0x619e66]
 8: (ThreadPool::WorkThread::entry()+0xd) [0x57ad5d]
 9: (()+0x7971) [0x5037971]
 10: (clone()+0x6d) [0x679f92d]

Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMakefile: include encoding check scripts in dist tarball
Sage Weil [Wed, 22 Feb 2012 01:11:02 +0000 (17:11 -0800)]
Makefile: include encoding check scripts in dist tarball

This makes 'make distcheck' happy.  Well, more happy at least; it's still
cranky but I can't tell why.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: make object_info_t::dump using hobject_t and object_locator_t dumpers
Sage Weil [Tue, 21 Feb 2012 23:08:26 +0000 (15:08 -0800)]
osd: make object_info_t::dump using hobject_t and object_locator_t dumpers

Makes the output more readable.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote-tracking branch 'gh/wip-dump-stuck-pgs'
Sage Weil [Tue, 21 Feb 2012 22:46:00 +0000 (14:46 -0800)]
Merge remote-tracking branch 'gh/wip-dump-stuck-pgs'

Reviewed-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote-tracking branch 'gh/wip-osd-write'
Sage Weil [Tue, 21 Feb 2012 22:44:44 +0000 (14:44 -0800)]
Merge remote-tracking branch 'gh/wip-osd-write'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoosdmap: dump embedded crush map in Incremental::dump()
Sage Weil [Tue, 21 Feb 2012 22:43:23 +0000 (14:43 -0800)]
osdmap: dump embedded crush map in Incremental::dump()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'wip-crush'
Sage Weil [Tue, 21 Feb 2012 22:39:16 +0000 (14:39 -0800)]
Merge branch 'wip-crush'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agocrush: write CrushWrapper:dump()
Sage Weil [Tue, 21 Feb 2012 22:37:50 +0000 (14:37 -0800)]
crush: write CrushWrapper:dump()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agotest/rados-api/misc: fix LibRadosMisc.Operate1PP test
Sage Weil [Tue, 21 Feb 2012 05:12:21 +0000 (21:12 -0800)]
test/rados-api/misc: fix LibRadosMisc.Operate1PP test

It's a mutation, so we get a result of 0 (or error).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: refuse to return data payload if request wrote anything
Sage Weil [Tue, 21 Feb 2012 05:11:46 +0000 (21:11 -0800)]
osd: refuse to return data payload if request wrote anything

Write operations aren't allowed to return a data payload because
we can't do so reliably. If the client has to resend the request
and it has already been applied, we will return 0 with no
payload.  Non-deterministic behavior is no good.

See #1765.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'wip-osdmap'
Sage Weil [Tue, 21 Feb 2012 21:51:27 +0000 (13:51 -0800)]
Merge branch 'wip-osdmap'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoosdmap: dump fullmap from dump()
Sage Weil [Tue, 21 Feb 2012 21:50:34 +0000 (13:50 -0800)]
osdmap: dump fullmap from dump()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'wip-1821'
Sage Weil [Tue, 21 Feb 2012 21:43:36 +0000 (13:43 -0800)]
Merge branch 'wip-1821'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agorgw: accepted access key chars should be url safe
Yehuda Sadeh [Tue, 21 Feb 2012 20:11:26 +0000 (12:11 -0800)]
rgw: accepted access key chars should be url safe

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoMakefile: fix misplaced unit tests
Sage Weil [Tue, 21 Feb 2012 00:01:34 +0000 (16:01 -0800)]
Makefile: fix misplaced unit tests

These weren't run on make check because they were defined in the wrong
spot.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agohobject_t: remove unused back_up_to_bounding_key()
Sage Weil [Mon, 20 Feb 2012 19:25:37 +0000 (11:25 -0800)]
hobject_t: remove unused back_up_to_bounding_key()

This was a path not taken in the backfill code.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: sched_scrub() outside of map_lock
Sage Weil [Mon, 20 Feb 2012 22:41:28 +0000 (14:41 -0800)]
osd: sched_scrub() outside of map_lock

Inside sched_scrub() we call _lookup_lock_pg(), which takes
map_lock.get_read().  That's technically okay because RWLock read side is
recursive, but lockdep doesn't know that, and we don't need map_lock
because we hold osd_lock.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoglobal: resurrect lockdep
Sage Weil [Mon, 20 Feb 2012 22:38:20 +0000 (14:38 -0800)]
global: resurrect lockdep

Add 'lockdep' config option, and initialize g_lockdep from that in
global_init().

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: disable pg_num adjustment
Sage Weil [Mon, 20 Feb 2012 21:00:14 +0000 (13:00 -0800)]
mon: disable pg_num adjustment

Until #1515 is fixed/reimplemented.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: use encode function for new Incremental
Sage Weil [Mon, 20 Feb 2012 19:02:49 +0000 (11:02 -0800)]
mon: use encode function for new Incremental

When we encode an Incremental, use the encode wrapper function, so that
we can capture the encoded struct when building with ENCODE_DUMP.  Set
all features (the default when encode() is called directly).

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosdmap: successfully decode short map
Sage Weil [Mon, 20 Feb 2012 18:56:25 +0000 (10:56 -0800)]
osdmap: successfully decode short map

When we send (old) maps to the kclient, we omit the extended section.  Let's
decode those (old, abbreviated maps) successfully, too.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosdmap: use FEATURE encoder macro
Sage Weil [Mon, 20 Feb 2012 18:41:52 +0000 (10:41 -0800)]
osdmap: use FEATURE encoder macro

This generates encode/decode functions that pass feature bits into the
encoder, allowing us to encode old formats.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoqa/btrfs/test_rmdir_async_snap
Sage Weil [Mon, 20 Feb 2012 18:31:55 +0000 (10:31 -0800)]
qa/btrfs/test_rmdir_async_snap

Attempt to reproduce btrfs bug when rmdirs race with an async snap.
Unsuccessful.  Best guess is that we need multiple threads to trigger.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph-dencoder: add OSDMap::Incremental
Sage Weil [Mon, 20 Feb 2012 18:26:41 +0000 (10:26 -0800)]
ceph-dencoder: add OSDMap::Incremental

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosdmap: add Incremental::dump()
Sage Weil [Mon, 20 Feb 2012 18:26:30 +0000 (10:26 -0800)]
osdmap: add Incremental::dump()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: don't count SNAPDIR as a clone during backfill
Sage Weil [Mon, 20 Feb 2012 17:40:03 +0000 (09:40 -0800)]
osd: don't count SNAPDIR as a clone during backfill

When we are backfilling, we add in objects as we push them.  Do not count
the snapdir object as a clone, or else we'll screw up the count.

Fixes: #2080
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agocrush: fix CrushCompiler warning
Sage Weil [Mon, 20 Feb 2012 14:40:44 +0000 (06:40 -0800)]
crush: fix CrushCompiler warning

warning: crush/CrushCompiler.cc:595: ‘r’ may be used uninitialized in this function

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agotest/encoding/readable.sh: sh, not dash
Sage Weil [Mon, 20 Feb 2012 14:27:47 +0000 (06:27 -0800)]
test/encoding/readable.sh: sh, not dash

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agocrushtool: fix clitests
Sage Weil [Mon, 20 Feb 2012 14:27:39 +0000 (06:27 -0800)]
crushtool: fix clitests

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'stable'
Sage Weil [Mon, 20 Feb 2012 03:36:00 +0000 (19:36 -0800)]
Merge branch 'stable'

13 years agomsgr: fix shutdown race again
Sage Weil [Mon, 20 Feb 2012 03:37:13 +0000 (19:37 -0800)]
msgr: fix shutdown race again

Only unlock once.  Sigh.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agov0.42 v0.42
Sage Weil [Sun, 19 Feb 2012 23:30:37 +0000 (15:30 -0800)]
v0.42

13 years agomsgr: fix accept shutdown race fault
Sage Weil [Sun, 19 Feb 2012 22:52:41 +0000 (14:52 -0800)]
msgr: fix accept shutdown race fault

Need to hold pipe_lock.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: test injected crush map
Sage Weil [Sun, 19 Feb 2012 22:50:15 +0000 (14:50 -0800)]
mon: test injected crush map

Run a bunch of inputs through an injected crush map to make sure it isn't
broken.

Fixes: #1932
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agocrush: move crushtool --test into CrushTester
Sage Weil [Sun, 19 Feb 2012 22:48:05 +0000 (14:48 -0800)]
crush: move crushtool --test into CrushTester

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agocrush: move (de)compile into CrushCompiler class
Sage Weil [Sun, 19 Feb 2012 22:16:23 +0000 (14:16 -0800)]
crush: move (de)compile into CrushCompiler class

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: fix message discard on shutdown
Sage Weil [Sun, 19 Feb 2012 20:44:58 +0000 (12:44 -0800)]
mon: fix message discard on shutdown

Return true, so the messenger is happy, and drop the message reference.

Avoids an assert like

2012-02-19T12:36:05.102 INFO:teuthology.task.ceph.mon.2.err:ms_deliver_dispatch: fatal error: unhandled message 0x1b7b280 paxos(auth lease_ack lc 8 fc 1 pn 0 opn 0) v1 from mon.2 10.3.14.197:6789/0msg/Messenger.h: In function 'void Messenger::ms_deliver_dispatch(Message*)' thread 7fd7fe360700 time 2012-02-19 12:36:05.094713
2012-02-19T12:36:05.102 INFO:teuthology.task.ceph.mon.2.err:msg/Messenger.h: 143: FAILED assert(0)

Signed-off-by: Sage Weil <sage@newdream.net>
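
The shutdown handling described in the commit above (report the message as handled and drop the reference) can be sketched roughly as follows. This is an illustration only: `MiniMonitor`, `Message`, and the use of `std::shared_ptr` in place of Ceph's own reference counting are assumptions, not the actual monitor code.

```cpp
#include <memory>

// Hypothetical stand-in for a messenger message; not Ceph's Message class.
struct Message {
  int type;
};

// Hypothetical dispatcher that discards messages once shutdown has begun.
class MiniMonitor {
 public:
  void shutdown() { state_ = State::Shutdown; }

  // Returns true when the message has been consumed. During shutdown the
  // message is still reported as consumed, so the caller never sees an
  // "unhandled message", but the reference is dropped without processing.
  bool ms_dispatch(std::shared_ptr<Message> m) {
    if (state_ == State::Shutdown) {
      m.reset();  // drop our reference; nothing else is done with it
      return true;
    }
    ++handled_;  // placeholder for real message processing
    return true;
  }

  int handled() const { return handled_; }

 private:
  enum class State { Active, Shutdown };
  State state_ = State::Active;
  int handled_ = 0;
};
```

The key point of the sketch is that both paths return true; only the active path does any work.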
13 years agocrush: uninline encode/decode
Sage Weil [Sun, 19 Feb 2012 20:08:11 +0000 (12:08 -0800)]
crush: uninline encode/decode

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agocrush: cleanup: use temp var for curstep
Sage Weil [Sun, 19 Feb 2012 19:59:11 +0000 (11:59 -0800)]
crush: cleanup: use temp var for curstep

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomds: use want_state to indicate shutdown
Sage Weil [Sun, 19 Feb 2012 15:41:47 +0000 (07:41 -0800)]
mds: use want_state to indicate shutdown

State gets DNE when we receive the first map.  And want_ makes more sense
anyway.  Fixes MDS startup.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: fix up argument to PG::init()
Sage Weil [Sun, 19 Feb 2012 06:17:35 +0000 (22:17 -0800)]
osd: fix up argument to PG::init()

Commit cefa55b288b40e17ade9875493dd94de52ac22bf moved PG initialization
into init(), but passed acting for both up and acting args.  This led to
confusion between primary and replica.

Also fix debug print so that the output is useful.

Fixes: #2075, #2070
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoSimpleMessenger: drop unused sigint()
Sage Weil [Sun, 19 Feb 2012 05:49:35 +0000 (21:49 -0800)]
SimpleMessenger: drop unused sigint()

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomsgr: promote SimpleMessenger::Policy to Messenger::Policy
Sage Weil [Sun, 19 Feb 2012 05:48:50 +0000 (21:48 -0800)]
msgr: promote SimpleMessenger::Policy to Messenger::Policy

This is part of the generic interface, not specific to the implementation.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomds: ignore all msgr callbacks on shutdown, not just dispatch
Sage Weil [Sun, 19 Feb 2012 05:43:18 +0000 (21:43 -0800)]
mds: ignore all msgr callbacks on shutdown, not just dispatch

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: discard messages while shutting down
Sage Weil [Sun, 19 Feb 2012 05:37:09 +0000 (21:37 -0800)]
mon: discard messages while shutting down

Add SHUTDOWN state.  Ignore any msgr callbacks if set.

Fixes crash like

2012-02-18T21:57:58.912 INFO:teuthology.task.ceph:Shutting down mon daemons...
2012-02-18T21:57:58.912 DEBUG:teuthology.task.ceph.mon.a:waiting for process to exit
2012-02-18T21:57:58.913 INFO:teuthology.task.ceph.mon.a.err:2012-02-18 21:57:58.927759 7fe98dfa1700 mon.a@1(peon) e1 *** Got Signal Terminated ***
2012-02-18T21:57:59.014 INFO:teuthology.task.ceph.mon.a.err:*** Caught signal (Segmentation fault) **
2012-02-18T21:57:59.014 INFO:teuthology.task.ceph.mon.a.err: in thread 7fe98d7a0700
2012-02-18T21:57:59.014 INFO:teuthology.task.ceph.mon.a.err: ceph version 0.41-382-gc1db900 (commit:c1db9009c2cde9dc7ab8857b0d28a1b6d931e98a)
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x5b0871]
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 2: (()+0xfb40) [0x7fe991a1eb40]
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 3: (PerfCounters::set(int, unsigned long)+0x1a) [0x52008a]
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 4: (PGMonitor::update_logger()+0x96) [0x4d4bf6]
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 5: (PGMonitor::update_from_paxos()+0xa70) [0x4e0980]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 6: (Monitor::_ms_dispatch(Message*)+0x143b) [0x47bd6b]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 7: (Monitor::ms_dispatch(Message*)+0x90) [0x489210]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 8: (SimpleMessenger::dispatch_entry()+0x89a) [0x53959a]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 9: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x46358c]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 10: (()+0x7971) [0x7fe991a16971]
2012-02-18T21:57:59.017 INFO:teuthology.task.ceph.mon.a.err: 11: (clone()+0x6d) [0x7fe9902a592d]

which is analogous to #2014.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomsgr: fix shutdown vs accept race
Sage Weil [Sat, 18 Feb 2012 21:45:37 +0000 (13:45 -0800)]
msgr: fix shutdown vs accept race

This is a kludge.  The real fix is to rewrite SimpleMessenger as a state
machine.

Fixes: #2073
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomds: drop all messages during suicide
Sage Weil [Sat, 18 Feb 2012 21:36:24 +0000 (13:36 -0800)]
mds: drop all messages during suicide

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge remote branch 'gh/wip-pg-states'
Sage Weil [Sat, 18 Feb 2012 22:00:50 +0000 (14:00 -0800)]
Merge remote branch 'gh/wip-pg-states'

13 years agomon: fix STUCK_STALE check
Sage Weil [Sat, 18 Feb 2012 00:34:49 +0000 (16:34 -0800)]
mon: fix STUCK_STALE check

Look at last_unstale if STALE bit is not set.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: add dump_stuck command
Josh Durgin [Wed, 15 Feb 2012 01:52:36 +0000 (17:52 -0800)]
mon: add dump_stuck command

This will help monitoring transient pg states at a coarse level.

Fixes: #2005
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agomon: constify functions needed to use dout from a const function
Josh Durgin [Wed, 15 Feb 2012 01:53:28 +0000 (17:53 -0800)]
mon: constify functions needed to use dout from a const function

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoPGMap: extract method for outputting plain pg stats
Josh Durgin [Fri, 10 Feb 2012 19:53:54 +0000 (11:53 -0800)]
PGMap: extract method for outputting plain pg stats

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoPGMap: fix else indentation
Josh Durgin [Fri, 10 Feb 2012 21:03:22 +0000 (13:03 -0800)]
PGMap: fix else indentation

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoosd: update_stats() in GetInfo state start
Sage Weil [Fri, 17 Feb 2012 23:26:37 +0000 (15:26 -0800)]
osd: update_stats() in GetInfo state start

This is the first stage of peering.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: don't update_stats() on prec_replica_info
Sage Weil [Fri, 17 Feb 2012 23:26:06 +0000 (15:26 -0800)]
osd: don't update_stats() on prec_replica_info

Nothing changes here...

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: hold journal_lock during
Sage Weil [Fri, 17 Feb 2012 21:59:08 +0000 (13:59 -0800)]
filestore: hold journal_lock during

Hold journal_lock during replay so that we don't stomp on variables like
op_seq and open_ops that the commit thread cares about.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: only complete/deregister repop once
Sage Weil [Sat, 18 Feb 2012 00:23:50 +0000 (16:23 -0800)]
osd: only complete/deregister repop once

It's now possible to send the ack and deregister the repop before the
op_applied() happens.  And when that happens, we'll call eval_repop() once
more.  Don't do anything in that case.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
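
The "complete only once" guard described in the commit above might look something like the sketch below. `RepOp`, `RepOpTracker`, and the `done` flag are hypothetical names chosen for illustration; the real ReplicatedPG code tracks considerably more state.

```cpp
// Hypothetical, heavily simplified replicated-op record.
struct RepOp {
  bool acked = false;    // ack has been sent to the client
  bool applied = false;  // local op_applied() has fired
  bool done = false;     // already completed and deregistered
};

// Hypothetical tracker counting the repops that are still registered.
class RepOpTracker {
 public:
  void register_repop(RepOp&) { ++registered_; }

  // May be called again after completion (for example when op_applied()
  // arrives late); the second call must not deregister a second time.
  void eval_repop(RepOp& op) {
    if (op.done) {
      return;  // already completed: do nothing
    }
    if (op.acked) {
      op.done = true;
      --registered_;
    }
  }

  int registered() const { return registered_; }

 private:
  int registered_ = 0;
};
```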
13 years agoMerge branch 'next'
Josh Durgin [Fri, 17 Feb 2012 22:31:44 +0000 (14:31 -0800)]
Merge branch 'next'

13 years agoman: regenerate man pages
Josh Durgin [Fri, 17 Feb 2012 22:11:18 +0000 (14:11 -0800)]
man: regenerate man pages

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoman: move man page fixes to rst
Josh Durgin [Fri, 17 Feb 2012 22:09:55 +0000 (14:09 -0800)]
man: move man page fixes to rst

83cf1b62fde525d068bc292c4a1ccc42199657ae and
e5f49104ab62ba7bc42cf6ecf41c9257b46585f7 updated the nroff output
but not the rst source.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: fix snapshot creation/deletion syntax in rbd man page (trivial)
Florian Haas [Fri, 17 Feb 2012 20:15:15 +0000 (21:15 +0100)]
doc: fix snapshot creation/deletion syntax in rbd man page (trivial)

Creating a snapshot requires using "rbd snap create",
as opposed to just "rbd create". Also for purposes of
clarification, add note that removing a snapshot similarly
requires "rbd snap rm".

Thanks to Josh Durgin for the explanation on IRC.

Signed-off-by: Florian Haas <florian@hastexo.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoPGMap: add indent settings header
Josh Durgin [Fri, 10 Feb 2012 02:15:01 +0000 (18:15 -0800)]
PGMap: add indent settings header

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoPGMap: add last_state_change to dump output
Josh Durgin [Fri, 10 Feb 2012 02:14:35 +0000 (18:14 -0800)]
PGMap: add last_state_change to dump output

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoPGMap: fix dump header fields
Josh Durgin [Fri, 10 Feb 2012 01:52:07 +0000 (17:52 -0800)]
PGMap: fix dump header fields

kilobytes were removed from the output by
625b0b0291543baf424fb3bae4c7a36d280df91e, and last_scrub_stamp was
added by 988e350d35bba9591cd0ca5b58ce9ecb8f8ddd80.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoosd: make op_commit imply op_applied for purposes of repop completion
Sage Weil [Fri, 17 Feb 2012 21:48:02 +0000 (13:48 -0800)]
osd: make op_commit imply op_applied for purposes of repop completion

For repop completion, we want waitfor_ack and _commit to be empty.  For
replicas, a commit reply implies ack, so ack is always a subset of commit.
But for the local write, we wait for applied separately, so we can have
repops open where we sent the reply to the client but still have it open
and consuming memory.  And generating 'old request' warnings in the logs
(when the filestore is taking a long time to apply to the fs).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: add REMAPPED state
Sage Weil [Fri, 17 Feb 2012 21:46:11 +0000 (13:46 -0800)]
osd: add REMAPPED state

Set this bit whenever up != acting.  This tells you that the OSDMap is
explicitly remapping the PG to different nodes (than what CRUSH specified).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
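
As a rough illustration of the rule in the commit above, the REMAPPED condition is just a comparison of the two OSD sets. The function name and the use of plain `std::vector<int>` for the up and acting sets are assumptions made for this sketch.

```cpp
#include <vector>

// Hypothetical helper: a PG counts as remapped whenever the acting set
// differs from the up set that CRUSH computed (order matters here).
bool is_remapped(const std::vector<int>& up, const std::vector<int>& acting) {
  return up != acting;
}
```

For example, with up = [5] and acting = [5, 3] the helper returns true, while identical sets return false.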
13 years agoosd: refactor recovery completion
Sage Weil [Fri, 17 Feb 2012 21:19:57 +0000 (13:19 -0800)]
osd: refactor recovery completion

- rename is_all_update() -> needs_recovery(), reverse logic.
- drop up != acting check; that has nothing to do with
  recovery itself
- drop trigger in Active::react(const ActMap&)... it's nonsensical
- CompleteRecovery always leads to finish_recovery (or acting set change)

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: introduce RECOVERING pg state
Sage Weil [Fri, 17 Feb 2012 18:55:12 +0000 (10:55 -0800)]
osd: introduce RECOVERING pg state

Since clean now means not degraded, we need some other indication that
recovery has completed and we are "done" (given the current up/down state
of the OSDs).

Adding a 'recovering' state also makes it clearer to users that work is
being done, as opposed to the current situation, where they look for the
absence of 'clean'.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agopaxos: fix is_consistent() check
Sage Weil [Fri, 17 Feb 2012 18:23:12 +0000 (10:23 -0800)]
paxos: fix is_consistent() check

If our last_committed == 1, we don't need a separate stash.  This is the
logic that slurp() follows, so fix is_consistent() to match.

Fixes: #2077
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: change nested iterator name
Tom Callaway [Fri, 17 Feb 2012 17:14:16 +0000 (09:14 -0800)]
osd: change nested iterator name

Don't shadow the iterator variable.

Signed-off-by: Tom Callaway <spot@redhat.com>
Signed-off-by: David Nalley <david@gnsa.us>
13 years agoadd missing #includes to build on gcc 4.7
Tom Callaway [Fri, 17 Feb 2012 17:14:56 +0000 (09:14 -0800)]
add missing #includes to build on gcc 4.7

Signed-off-by: Tom Callaway <spot@redhat.com>
Signed-off-by: David Nalley <david@gnsa.us>
13 years agomds: comment out unused code in mds dump_pop_map
Tom Callaway [Fri, 17 Feb 2012 16:58:40 +0000 (08:58 -0800)]
mds: comment out unused code in mds dump_pop_map

Signed-off-by: Tom Callaway <spot@redhat.com>
Signed-off-by: David Nalley <david@gnsa.us>
13 years agoMerge branch 'next'
Sage Weil [Fri, 17 Feb 2012 05:00:49 +0000 (21:00 -0800)]
Merge branch 'next'

13 years agoosd: fix _activate_committed replica->primary message
Sage Weil [Wed, 15 Feb 2012 21:16:25 +0000 (13:16 -0800)]
osd: fix _activate_committed replica->primary message

Normally we take a fresh map reference in PG::lock().  However,
_activate_committed needs to make sure the map hasn't changed significantly
before acting.  In the case of #2068, the OSD map has moved forward and
the mapping has changed, but the PG hasn't processed that yet, and thus
mis-tags the MOSDPGInfo message.

Tag the message with the e epoch, and also pass down the primary's address
to send the message to the right location.

Fixes: #2068
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: skip threadpool pause on shutdown when blackholed
Sage Weil [Thu, 16 Feb 2012 23:18:58 +0000 (15:18 -0800)]
osd: skip threadpool pause on shutdown when blackholed

We can't pause the threadpools if they're blocked on a blackholed
filestore.  Instead, just call _exit().

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: fix _activate_committed replica->primary message
Sage Weil [Wed, 15 Feb 2012 21:16:25 +0000 (13:16 -0800)]
osd: fix _activate_committed replica->primary message

Normally we take a fresh map reference in PG::lock().  However,
_activate_committed needs to make sure the map hasn't changed significantly
before acting.  In the case of #2068, the OSD map has moved forward and
the mapping has changed, but the PG hasn't processed that yet, and thus
mis-tags the MOSDPGInfo message.

Tag the message with the e epoch, and also pass down the primary's address
to send the message to the right location.

Fixes: #2068
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: fix do not always clear DEGRADED/set CLEAN on recovery finish
Sage Weil [Wed, 15 Feb 2012 23:20:35 +0000 (15:20 -0800)]
osd: fix do not always clear DEGRADED/set CLEAN on recovery finish

Clean means we have exactly the right number of replicas and recovery is
complete.  Degraded means we do not have enough replicas, either because
recovery is in progress, or because acting is too small.

A consequence is that if we have a PG with len(up) == 1 but a pg_temp
mapping so that len(acting) == 2, it will be active and not clean.

Fixes: #2060
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoinit: Only check if auto start is disabled when the issued command is "start"
Wido den Hollander [Wed, 15 Feb 2012 15:20:16 +0000 (16:20 +0100)]
init: Only check if auto start is disabled when the issued command is "start"

This still makes sure daemons don't start on boot.

When auto start was disabled it would also prevent logrotate from doing its job.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph.spec.in: Move libcls_*.so from -devel to base package
Holger Macht [Wed, 15 Feb 2012 16:29:09 +0000 (17:29 +0100)]
ceph.spec.in: Move libcls_*.so from -devel to base package

OSDs (src/osd/ClassHandler.cc) specifically look for libcls_*.so in
/usr/$libdir/rados-classes, so libcls_rbd.so and libcls_rgw.so need to
be shipped along with the base package.

Signed-off-by: Holger Macht <hmacht@suse.de>
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjclass: add debug_objclass knob, default to off
Sage Weil [Wed, 15 Feb 2012 17:04:22 +0000 (09:04 -0800)]
objclass: add debug_objclass knob, default to off

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: reduce watch/notify debug noise
Sage Weil [Wed, 15 Feb 2012 17:03:28 +0000 (09:03 -0800)]
osd: reduce watch/notify debug noise

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomsgr: mark_all_down on shutdown
Sage Weil [Wed, 15 Feb 2012 16:09:32 +0000 (08:09 -0800)]
msgr: mark_all_down on shutdown

This ensures we destroy all the Pipes and discard their messages.  Among
other things, this can avoid

2012-02-15 03:16:46.385242 7fe712b9a700 mon.f@5(peon) e1 *** Got Signal Terminated ***
2012-02-15 03:16:46.470227 7fe712b9a700 mon.f@5(peon) e1 shutdown
msg/SimpleMessenger.h: In function 'virtual SimpleMessenger::Pipe::~Pipe()' thread 7fe716a37780 time 2012-02-15 03:16:46.471005
msg/SimpleMessenger.h: 234: FAILED assert(!i->second->is_on_list())
 ceph version 0.41-362-g40802ae (commit:40802ae883a94d205a8716065b80ad5d7ff57d12)
 1: (SimpleMessenger::Pipe::~Pipe()+0x199) [0x4669d9]
 2: (SimpleMessenger::~SimpleMessenger()+0x31) [0x552231]
 3: (main()+0x3026) [0x4614a6]
 4: (__libc_start_main()+0xfe) [0x7fe714dd6d8e]
 5: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e219]

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: do not sync_and_flush if blackholed
Sage Weil [Wed, 15 Feb 2012 16:21:02 +0000 (08:21 -0800)]
osd: do not sync_and_flush if blackholed

If we have blackholed, this will block forever.  In that case, don't bother.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoworkqueue: make pause/unpause count
Sage Weil [Wed, 15 Feb 2012 16:20:32 +0000 (08:20 -0800)]
workqueue: make pause/unpause count

We can pause() multiple times, and we need as many unpause()s to actually
resume work.

This resolves problems where we have two actors interested in pausing a
queue, both want to stop work, and they aren't interacting/coordinating.

Signed-off-by: Sage Weil <sage@newdream.net>
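
The counted pause/unpause behavior described in the commit above can be sketched as below. `PauseCounter` and its method names are hypothetical, and the sketch uses the C++ standard library rather than Ceph's own Mutex and Cond types.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical counted pause gate: work may run only when every pause()
// has been matched by an unpause().
class PauseCounter {
 public:
  void pause() {
    std::lock_guard<std::mutex> l(lock_);
    ++paused_;
  }

  void unpause() {
    std::lock_guard<std::mutex> l(lock_);
    // Ignore unbalanced calls; wake workers only when the count hits zero.
    if (paused_ > 0 && --paused_ == 0) {
      cond_.notify_all();
    }
  }

  bool is_paused() const {
    std::lock_guard<std::mutex> l(lock_);
    return paused_ > 0;
  }

  // A worker thread would call this before picking up the next item.
  void wait_until_runnable() {
    std::unique_lock<std::mutex> l(lock_);
    cond_.wait(l, [this] { return paused_ == 0; });
  }

 private:
  mutable std::mutex lock_;
  std::condition_variable cond_;
  int paused_ = 0;
};
```

With a counter instead of a boolean, two independent actors can each pause and later unpause the queue without one of them resuming work the other still wants stopped.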
13 years agoosd: exit code 0 on SIGINT/SIGTERM
Sage Weil [Wed, 15 Feb 2012 06:05:36 +0000 (22:05 -0800)]
osd: exit code 0 on SIGINT/SIGTERM

This makes daemon-handler happy...

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agosignals: check write(2) return values
Sage Weil [Tue, 14 Feb 2012 17:09:39 +0000 (09:09 -0800)]
signals: check write(2) return values

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: semi-clean shutdown on signal
Sage Weil [Sun, 12 Feb 2012 22:35:03 +0000 (14:35 -0800)]
osd: semi-clean shutdown on signal

Make some effort to stop work in progress, remove pid file, and exit with
informative error code.

Note that this is much simpler than the shutdown() exit path; I'm not sure
whether a complete teardown is useful.  It's also difficult to maintain
and get right with everything else going on, and it's not clear that it's
worth the effort right now.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomds: remove some cruft
Sage Weil [Sun, 12 Feb 2012 22:12:44 +0000 (14:12 -0800)]
mds: remove some cruft

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomds: remove pidfile
Sage Weil [Sun, 12 Feb 2012 00:39:27 +0000 (16:39 -0800)]
mds: remove pidfile

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: do a clean shutdown on SIGINT/SIGTERM
Sage Weil [Sun, 12 Feb 2012 22:43:13 +0000 (14:43 -0800)]
mon: do a clean shutdown on SIGINT/SIGTERM

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: install async signal handlers for SIG{HUP,INT,TERM}
Sage Weil [Sun, 12 Feb 2012 00:38:06 +0000 (16:38 -0800)]
mon: install async signal handlers for SIG{HUP,INT,TERM}

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: install async signal handlers for SIG{HUP,INT,TERM}
Sage Weil [Sun, 12 Feb 2012 00:36:33 +0000 (16:36 -0800)]
osd: install async signal handlers for SIG{HUP,INT,TERM}

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomds: install async signal handlers for SIG{HUP,INT,TERM}
Sage Weil [Sun, 12 Feb 2012 00:33:51 +0000 (16:33 -0800)]
mds: install async signal handlers for SIG{HUP,INT,TERM}

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agosignal: remove unused/obsolete handle_shutdown_signal
Sage Weil [Sun, 12 Feb 2012 00:39:48 +0000 (16:39 -0800)]
signal: remove unused/obsolete handle_shutdown_signal

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agosignals: do not install default SIGHUP, SIGINT, SIGTERM handlers
Sage Weil [Sun, 12 Feb 2012 00:30:26 +0000 (16:30 -0800)]
signals: do not install default SIGHUP, SIGINT, SIGTERM handlers

These should be app specific and async.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agosignals: implement safe async signal handler framework
Sage Weil [Sat, 11 Feb 2012 17:45:06 +0000 (09:45 -0800)]
signals: implement safe async signal handler framework

Based on http://evbergen.home.xs4all.nl/unix-signals.html.

Instead of his design, though, we write single bytes, and create a pipe per
signal we have handlers registered for.

Signed-off-by: Sage Weil <sage@newdream.net>
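
The design described in the commit above (one pipe per registered signal, a single byte written from the handler) is a variant of the self-pipe technique. A minimal sketch follows, assuming a POSIX system; all names in the `sketch` namespace are hypothetical and this is not the Ceph implementation.

```cpp
#include <cerrno>
#include <csignal>
#include <fcntl.h>
#include <unistd.h>

namespace sketch {

constexpr int kMaxSignal = 64;        // assumed upper bound for this sketch
int g_pipes[kMaxSignal + 1][2];       // [signum][0] = read end, [1] = write end
bool g_registered[kMaxSignal + 1];    // which signals have a pipe

// The handler only calls write(2), which is async-signal-safe.
extern "C" void on_signal(int signum) {
  int saved_errno = errno;
  unsigned char byte = 1;
  ssize_t r = write(g_pipes[signum][1], &byte, 1);
  (void)r;  // a full pipe already means "signal pending"
  errno = saved_errno;
}

// Create the pipe for signum and install the handler.
bool register_signal(int signum) {
  if (signum <= 0 || signum > kMaxSignal) return false;
  if (pipe(g_pipes[signum]) != 0) return false;
  for (int fd : g_pipes[signum]) {
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
  }
  g_registered[signum] = true;
  struct sigaction sa {};
  sa.sa_handler = on_signal;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  return sigaction(signum, &sa, nullptr) == 0;
}

// Called from ordinary thread context: returns true if one delivery of
// signum was pending, consuming its byte from the pipe.
bool consume_pending(int signum) {
  if (signum <= 0 || signum > kMaxSignal || !g_registered[signum]) return false;
  unsigned char byte;
  return read(g_pipes[signum][0], &byte, 1) == 1;
}

}  // namespace sketch
```

In a real daemon a dedicated thread would typically poll the read ends and run the application's handlers outside signal context, where locking and allocation are safe.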
13 years agolibradospp: add config_t typedef
Sage Weil [Wed, 15 Feb 2012 01:03:54 +0000 (17:03 -0800)]
libradospp: add config_t typedef

Don't expose internal CephContext type name.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agolibrados: use rados_config_t typedef instead of CephContext
Sage Weil [Wed, 15 Feb 2012 01:03:00 +0000 (17:03 -0800)]
librados: use rados_config_t typedef instead of CephContext

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agodoc: Balance backticks.
Tommi Virtanen [Tue, 14 Feb 2012 23:52:55 +0000 (15:52 -0800)]
doc: Balance backticks.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoMerge branch 'wip-osd-hb'
Sage Weil [Tue, 14 Feb 2012 22:01:22 +0000 (14:01 -0800)]
Merge branch 'wip-osd-hb'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>