git.apps.os.sepia.ceph.com Git - ceph.git/log
Sage Weil [Thu, 23 Feb 2012 17:39:50 +0000 (09:39 -0800)]
osd: don't complete recovery if unfound
Otherwise we fail the !needs_recovery() assert, because we aren't
recovered. For example:
2012-02-21 16:16:13.104665 1685c700 osd.5 1217 pg[0.16( v 1215'337 lc 19'2 (0'0,1215'337] n=25 ec=1 les/c 0/1061 1210/1210/1210) [5,3] r=0 lpr=1210 mlcod 0'0 active m=23 u=23 snaptrimq=[1~99,9b~e,aa~72,11d~3d,15b~e,16a~f,17a~5,180~4,185~1a,1a0~a,1ac~10,1bd~4,1c2~8,1cb~1,1cd~1,1cf~1a,1ea~10,1fb~6,202~2,205~2,209~2,20c~8,215~2,218~5,21e~1,220~1,222~9,22c~4,231~3,235~2,238~3,23e~2,241~4,246~1,248~1,24b~1,24d~9,257~6,25e~1,263~1,265~2,268~3,26e~1,273~1,275~5,27e~1,280~2]] needs_recovery osd.3 has 23 missing
osd/PG.cc: In function 'boost::statechart::result PG::RecoveryState::Active::react(const PG::RecoveryState::RecoveryComplete&)' thread 1685c700 time 2012-02-21 16:16:13.108923
osd/PG.cc: 4070: FAILED assert(!pg->needs_recovery())
ceph version 0.42-70-g0e4367a (commit:0e4367aaac88b99c36386b6ce5e8d816fdd4ada0)
1: (PG::RecoveryState::Active::react(PG::RecoveryState::RecoveryComplete const&)+0x1b3) [0x6a1173]
2: (boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x121) [0x6c7301]
3: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x6bfc6b]
4: (PG::RecoveryState::handle_recovery_complete(PG::RecoveryCtx*)+0x10c) [0x67c03c]
5: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*)+0x241) [0x4f83c1]
6: (OSD::do_recovery(PG*)+0x345) [0x54b3e5]
7: (ThreadPool::worker()+0xa26) [0x619e66]
8: (ThreadPool::WorkThread::entry()+0xd) [0x57ad5d]
9: (()+0x7971) [0x5037971]
10: (clone()+0x6d) [0x679f92d]
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Wed, 22 Feb 2012 01:11:02 +0000 (17:11 -0800)]
Makefile: include encoding check scripts in dist tarball
This makes 'make distcheck' happy. Well, more happy at least; it's still
cranky but I can't tell why.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 21 Feb 2012 23:08:26 +0000 (15:08 -0800)]
osd: make object_info_t::dump using hobject_t and object_locator_t dumpers
Makes the output more readable.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 21 Feb 2012 22:46:00 +0000 (14:46 -0800)]
Merge remote-tracking branch 'gh/wip-dump-stuck-pgs'
Reviewed-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 21 Feb 2012 22:44:44 +0000 (14:44 -0800)]
Merge remote-tracking branch 'gh/wip-osd-write'
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Tue, 21 Feb 2012 22:43:23 +0000 (14:43 -0800)]
osdmap: dump embedded crush map in Incremental::dump()
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 21 Feb 2012 22:39:16 +0000 (14:39 -0800)]
Merge branch 'wip-crush'
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Tue, 21 Feb 2012 22:37:50 +0000 (14:37 -0800)]
crush: write CrushWrapper::dump()
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 21 Feb 2012 05:12:21 +0000 (21:12 -0800)]
test/rados-api/misc: fix LibRadosMisc.Operate1PP test
It's a mutation, so we get a result of 0 (or error).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 21 Feb 2012 05:11:46 +0000 (21:11 -0800)]
osd: refuse to return data payload if request wrote anything
Write operations aren't allowed to return a data payload because
we can't do so reliably. If the client has to resend the request
and it has already been applied, we will return 0 with no
payload. Non-deterministic behavior is no good.
See #1765.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 21 Feb 2012 21:51:27 +0000 (13:51 -0800)]
Merge branch 'wip-osdmap'
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Tue, 21 Feb 2012 21:50:34 +0000 (13:50 -0800)]
osdmap: dump fullmap from dump()
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 21 Feb 2012 21:43:36 +0000 (13:43 -0800)]
Merge branch 'wip-1821'
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Yehuda Sadeh [Tue, 21 Feb 2012 20:11:26 +0000 (12:11 -0800)]
rgw: accepted access key chars should be url safe
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Sage Weil [Tue, 21 Feb 2012 00:01:34 +0000 (16:01 -0800)]
Makefile: fix misplaced unit tests
These weren't run on make check because they were defined in the wrong
spot.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 19:25:37 +0000 (11:25 -0800)]
hobject_t: remove unused back_up_to_bounding_key()
This was a path not taken in the backfill code.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 22:41:28 +0000 (14:41 -0800)]
osd: sched_scrub() outside of map_lock
Inside sched_scrub() we call _lookup_lock_pg(), which takes
map_lock.get_read(). That's technically okay because RWLock read side is
recursive, but lockdep doesn't know that, and we don't need map_lock
because we hold osd_lock.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 22:38:20 +0000 (14:38 -0800)]
global: resurrect lockdep
Add 'lockdep' config option, and initialize g_lockdep from that in
global_init().
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 21:00:14 +0000 (13:00 -0800)]
mon: disable pg_num adjustment
Until #1515 is fixed/reimplemented.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 20 Feb 2012 19:02:49 +0000 (11:02 -0800)]
mon: use encode function for new Incremental
When we encode an Incremental, use the encode wrapper function, so that
we can capture the encoded struct when building with ENCODE_DUMP. Set
all features (the default when encode() is called directly).
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 18:56:25 +0000 (10:56 -0800)]
osdmap: successfully decode short map
When we send (old) maps to the kclient, we omit the extended section. Let's
decode those (old, abbreviated maps) successfully, too.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 18:41:52 +0000 (10:41 -0800)]
osdmap: use FEATURE encoder macro
This generates encode/decode functions that pass feature bits into the
encoder, allowing us to encode old formats.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 18:31:55 +0000 (10:31 -0800)]
qa/btrfs/test_rmdir_async_snap
Attempt to reproduce btrfs bug when rmdirs race with an async snap.
Unsuccessful. Best guess is that we need multiple threads to trigger.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 18:26:41 +0000 (10:26 -0800)]
ceph-dencoder: add OSDMap::Incremental
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 18:26:30 +0000 (10:26 -0800)]
osdmap: add Incremental::dump()
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 17:40:03 +0000 (09:40 -0800)]
osd: don't count SNAPDIR as a clone during backfill
When we are backfilling, we add in objects as we push them. Do not count
the snapdir object as a clone, or else we'll screw up the count.
Fixes: #2080
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 20 Feb 2012 14:40:44 +0000 (06:40 -0800)]
crush: fix CrushCompiler warning
warning: crush/CrushCompiler.cc:595: ‘r’ may be used uninitialized in this function
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 20 Feb 2012 14:27:47 +0000 (06:27 -0800)]
test/encoding/readable.sh: sh, not dash
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 20 Feb 2012 14:27:39 +0000 (06:27 -0800)]
crushtool: fix clitests
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 20 Feb 2012 03:36:00 +0000 (19:36 -0800)]
Merge branch 'stable'
Sage Weil [Mon, 20 Feb 2012 03:37:13 +0000 (19:37 -0800)]
msgr: fix shutdown race again
Only unlock once. Sigh.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sun, 19 Feb 2012 23:30:37 +0000 (15:30 -0800)]
v0.42
Sage Weil [Sun, 19 Feb 2012 22:52:41 +0000 (14:52 -0800)]
msgr: fix accept shutdown race fault
Need to hold pipe_lock.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sun, 19 Feb 2012 22:50:15 +0000 (14:50 -0800)]
mon: test injected crush map
Run a bunch of inputs through an injected crush map to make sure it isn't
broken.
Fixes: #1932
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 22:48:05 +0000 (14:48 -0800)]
crush: move crushtool --test into CrushTester
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 22:16:23 +0000 (14:16 -0800)]
crush: move (de)compile into CrushCompiler class
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 20:44:58 +0000 (12:44 -0800)]
mon: fix message discard on shutdown
Return true, so the messenger is happy, and drop the message reference.
Avoids an assert like
2012-02-19T12:36:05.102 INFO:teuthology.task.ceph.mon.2.err:ms_deliver_dispatch: fatal error: unhandled message 0x1b7b280 paxos(auth lease_ack lc 8 fc 1 pn 0 opn 0) v1 from mon.2 10.3.14.197:6789/0
msg/Messenger.h: In function 'void Messenger::ms_deliver_dispatch(Message*)' thread 7fd7fe360700 time 2012-02-19 12:36:05.094713
2012-02-19T12:36:05.102 INFO:teuthology.task.ceph.mon.2.err:msg/Messenger.h: 143: FAILED assert(0)
Signed-off-by: Sage Weil <sage@newdream.net>
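The discard-on-shutdown behaviour described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the Monitor code: MonSketch, Message, and handled() are hypothetical names, and a std::unique_ptr stands in for Ceph's reference-counted messages, so "dropping the reference" is modelled by letting the pointer go out of scope.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a message; the real one is reference counted.
struct Message {
  int type = 0;
};

// Hypothetical dispatcher sketch, assuming a single SHUTDOWN flag is enough.
class MonSketch {
public:
  enum State { STATE_ACTIVE, STATE_SHUTDOWN };

  void shutdown() { state_ = STATE_SHUTDOWN; }

  // Returns true when the message was consumed. During shutdown we still
  // return true so the messenger does not treat the message as unhandled;
  // the unique_ptr frees the message when it goes out of scope.
  bool ms_dispatch(std::unique_ptr<Message> m) {
    if (state_ == STATE_SHUTDOWN)
      return true;  // discard silently
    ++handled_;     // placeholder for real message handling
    return true;
  }

  int handled() const { return handled_; }

private:
  State state_ = STATE_ACTIVE;
  int handled_ = 0;
};
```

The point of returning true in the shutdown branch is that the caller sees the message as handled in both cases; only the side effects differ.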
Sage Weil [Sun, 19 Feb 2012 20:08:11 +0000 (12:08 -0800)]
crush: uninline encode/decode
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 19:59:11 +0000 (11:59 -0800)]
crush: cleanup: use temp var for curstep
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 15:41:47 +0000 (07:41 -0800)]
mds: use want_state to indicate shutdown
State gets DNE when we receive the first map. And want_ makes more sense
anyway. Fixes MDS startup.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 06:17:35 +0000 (22:17 -0800)]
osd: fix up argument to PG::init()
Commit cefa55b288b40e17ade9875493dd94de52ac22bf moved PG initialization
into init(), but passed acting for both up and acting args. This led to
confusion between primary and replica.
Also fix debug print so that the output is useful.
Fixes: #2075, #2070
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 05:49:35 +0000 (21:49 -0800)]
SimpleMessenger: drop unused sigint()
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 05:48:50 +0000 (21:48 -0800)]
msgr: promote SimpleMessenger::Policy to Messenger::Policy
This is part of the generic interface, not specific to the implementation.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 05:43:18 +0000 (21:43 -0800)]
mds: ignore all msgr callbacks on shutdown, not just dispatch
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 19 Feb 2012 05:37:09 +0000 (21:37 -0800)]
mon: discard messages while shutting down
Add SHUTDOWN state. Ignore any msgr callbacks if set.
Fixes crash like
2012-02-18T21:57:58.912 INFO:teuthology.task.ceph:Shutting down mon daemons...
2012-02-18T21:57:58.912 DEBUG:teuthology.task.ceph.mon.a:waiting for process to exit
2012-02-18T21:57:58.913 INFO:teuthology.task.ceph.mon.a.err:2012-02-18 21:57:58.927759 7fe98dfa1700 mon.a@1(peon) e1 *** Got Signal Terminated ***
2012-02-18T21:57:59.014 INFO:teuthology.task.ceph.mon.a.err:*** Caught signal (Segmentation fault) **
2012-02-18T21:57:59.014 INFO:teuthology.task.ceph.mon.a.err: in thread 7fe98d7a0700
2012-02-18T21:57:59.014 INFO:teuthology.task.ceph.mon.a.err: ceph version 0.41-382-gc1db900 (commit:c1db9009c2cde9dc7ab8857b0d28a1b6d931e98a)
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x5b0871]
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 2: (()+0xfb40) [0x7fe991a1eb40]
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 3: (PerfCounters::set(int, unsigned long)+0x1a) [0x52008a]
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 4: (PGMonitor::update_logger()+0x96) [0x4d4bf6]
2012-02-18T21:57:59.015 INFO:teuthology.task.ceph.mon.a.err: 5: (PGMonitor::update_from_paxos()+0xa70) [0x4e0980]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 6: (Monitor::_ms_dispatch(Message*)+0x143b) [0x47bd6b]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 7: (Monitor::ms_dispatch(Message*)+0x90) [0x489210]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 8: (SimpleMessenger::dispatch_entry()+0x89a) [0x53959a]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 9: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x46358c]
2012-02-18T21:57:59.016 INFO:teuthology.task.ceph.mon.a.err: 10: (()+0x7971) [0x7fe991a16971]
2012-02-18T21:57:59.017 INFO:teuthology.task.ceph.mon.a.err: 11: (clone()+0x6d) [0x7fe9902a592d]
which is analogous to #2014.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 18 Feb 2012 21:45:37 +0000 (13:45 -0800)]
msgr: fix shutdown vs accept race
This is a kludge. The real fix is to rewrite SimpleMessenger as a state
machine.
Fixes: #2073
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 18 Feb 2012 21:36:24 +0000 (13:36 -0800)]
mds: drop all messages during suicide
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 18 Feb 2012 22:00:50 +0000 (14:00 -0800)]
Merge remote branch 'gh/wip-pg-states'
Sage Weil [Sat, 18 Feb 2012 00:34:49 +0000 (16:34 -0800)]
mon: fix STUCK_STALE check
Look at last_unstale if STALE bit is not set.
Signed-off-by: Sage Weil <sage@newdream.net>
Josh Durgin [Wed, 15 Feb 2012 01:52:36 +0000 (17:52 -0800)]
mon: add dump_stuck command
This will help monitoring transient pg states at a coarse level.
Fixes: #2005
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Wed, 15 Feb 2012 01:53:28 +0000 (17:53 -0800)]
mon: constify functions needed to use dout from a const function
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Fri, 10 Feb 2012 19:53:54 +0000 (11:53 -0800)]
PGMap: extract method for outputting plain pg stats
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Fri, 10 Feb 2012 21:03:22 +0000 (13:03 -0800)]
PGMap: fix else indentation
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Fri, 17 Feb 2012 23:26:37 +0000 (15:26 -0800)]
osd: update_stats() in GetInfo state start
This is the first stage of peering.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 17 Feb 2012 23:26:06 +0000 (15:26 -0800)]
osd: don't update_stats() on prec_replica_info
Nothing changes here...
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 17 Feb 2012 21:59:08 +0000 (13:59 -0800)]
filestore: hold journal_lock during replay
Hold journal_lock during replay so that we don't stomp on variables like
op_seq and open_ops that the commit thread cares about.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 18 Feb 2012 00:23:50 +0000 (16:23 -0800)]
osd: only complete/deregister repop once
It's now possible to send the ack and deregister the repop before the
op_applied() happens. And when that happens, we'll call eval_repop() once
more. Don't do anything in that case.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Josh Durgin [Fri, 17 Feb 2012 22:31:44 +0000 (14:31 -0800)]
Merge branch 'next'
Josh Durgin [Fri, 17 Feb 2012 22:11:18 +0000 (14:11 -0800)]
man: regenerate man pages
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Fri, 17 Feb 2012 22:09:55 +0000 (14:09 -0800)]
man: move man page fixes to rst
83cf1b62fde525d068bc292c4a1ccc42199657ae and
e5f49104ab62ba7bc42cf6ecf41c9257b46585f7 updated the nroff output
but not the rst source.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Florian Haas [Fri, 17 Feb 2012 20:15:15 +0000 (21:15 +0100)]
doc: fix snapshot creation/deletion syntax in rbd man page (trivial)
Creating a snapshot requires using "rbd snap create",
as opposed to just "rbd create". Also for purposes of
clarification, add note that removing a snapshot similarly
requires "rbd snap rm".
Thanks to Josh Durgin for the explanation on IRC.
Signed-off-by: Florian Haas <florian@hastexo.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Fri, 10 Feb 2012 02:15:01 +0000 (18:15 -0800)]
PGMap: add indent settings header
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Fri, 10 Feb 2012 02:14:35 +0000 (18:14 -0800)]
PGMap: add last_state_change to dump output
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Fri, 10 Feb 2012 01:52:07 +0000 (17:52 -0800)]
PGMap: fix dump header fields
kilobytes were removed from the output by
625b0b0291543baf424fb3bae4c7a36d280df91e, and last_scrub_stamp was
added by 988e350d35bba9591cd0ca5b58ce9ecb8f8ddd80.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Fri, 17 Feb 2012 21:48:02 +0000 (13:48 -0800)]
osd: make op_commit imply op_applied for purposes of repop completion
For repop completion, we want waitfor_ack and _commit to be empty. For
replicas, a commit reply implies ack, so ack is always a subset of commit.
But for the local write we wait for applied separately, so a repop can
stay open, consuming memory, after we have already sent the reply to the
client. It also generates 'old request' warnings in the logs (when the
filestore is taking a long time to apply to the fs).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
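A rough sketch of the completion rule described above, assuming the two wait sets are keyed by OSD id. RepOpSketch and its methods are hypothetical; only the names waitfor_ack and waitfor_commit come from the commit message, and the real repop bookkeeping is more involved.

```cpp
#include <cassert>
#include <set>

// Hypothetical model of a replicated op's outstanding-reply bookkeeping.
struct RepOpSketch {
  std::set<int> waitfor_ack;     // OSDs that have not yet acked/applied
  std::set<int> waitfor_commit;  // OSDs that have not yet committed

  // The repop is complete only when both sets are empty.
  bool done() const { return waitfor_ack.empty() && waitfor_commit.empty(); }

  // An ack (or a local apply) clears only the ack set.
  void on_ack(int osd) { waitfor_ack.erase(osd); }

  // A commit implies the ack, for replicas and (after this change) for the
  // local write as well, so it clears both sets.
  void on_commit(int osd) {
    waitfor_commit.erase(osd);
    waitfor_ack.erase(osd);
  }
};
```

With this rule a commit from every OSD is enough to finish the repop, even if the separate applied notification for the local write has not arrived yet.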
Sage Weil [Fri, 17 Feb 2012 21:46:11 +0000 (13:46 -0800)]
osd: add REMAPPED state
Set this bit whenever up != acting. This tells you that the OSDMap is
explicitly remapping the PG to different nodes (than what CRUSH specified).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
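As an illustration of the rule "set this bit whenever up != acting", here is a minimal sketch. The bit values and the update_remapped() helper are assumptions made for the example and do not match Ceph's actual PG state constants.

```cpp
#include <cassert>
#include <vector>

// Hypothetical bit values, chosen only for this sketch.
constexpr unsigned STATE_ACTIVE = 1u << 0;
constexpr unsigned STATE_REMAPPED = 1u << 1;

// Set REMAPPED when the up set and the acting set differ, clear it otherwise.
// Other state bits are left untouched.
unsigned update_remapped(unsigned state,
                         const std::vector<int>& up,
                         const std::vector<int>& acting) {
  if (up != acting)
    state |= STATE_REMAPPED;
  else
    state &= ~STATE_REMAPPED;
  return state;
}
```

For example, up = [5] with acting = [5,3] yields a state with REMAPPED set, while identical vectors clear the bit again.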
Sage Weil [Fri, 17 Feb 2012 21:19:57 +0000 (13:19 -0800)]
osd: refactor recovery completion
- rename is_all_update() -> needs_recovery(), reverse logic.
- drop up != acting check; that has nothing to do with
recovery itself
- drop trigger in Active::react(const ActMap&)... it's nonsensical
- CompleteRecovery always leads to finish_recovery (or acting set change)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 17 Feb 2012 18:55:12 +0000 (10:55 -0800)]
osd: introduce RECOVERING pg state
Since clean now means not degraded, we need some other indication that
recovery has completed and we are "done" (given the current up/down state
of the OSDs).
Adding a 'recovering' state also makes it clearer to users that work is
being done, as opposed to the current situation, where they look for the
absence of 'clean'.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 17 Feb 2012 18:23:12 +0000 (10:23 -0800)]
paxos: fix is_consistent() check
If our last_committed == 1, we don't need a separate stash. This is the
logic that slurp() follows, so fix is_consistent() to match.
Fixes: #2077
Signed-off-by: Sage Weil <sage@newdream.net>
Tom Callaway [Fri, 17 Feb 2012 17:14:16 +0000 (09:14 -0800)]
osd: change nested iterator name
Don't shadow the iterator variable.
Signed-off-by: Tom Callaway <spot@redhat.com>
Signed-off-by: David Nalley <david@gnsa.us>
Tom Callaway [Fri, 17 Feb 2012 17:14:56 +0000 (09:14 -0800)]
add missing #includes to build on gcc 4.7
Signed-off-by: Tom Callaway <spot@redhat.com>
Signed-off-by: David Nalley <david@gnsa.us>
Tom Callaway [Fri, 17 Feb 2012 16:58:40 +0000 (08:58 -0800)]
mds: comment out unused code in mds dump_pop_map
Signed-off-by: Tom Callaway <spot@redhat.com>
Signed-off-by: David Nalley <david@gnsa.us>
Sage Weil [Fri, 17 Feb 2012 05:00:49 +0000 (21:00 -0800)]
Merge branch 'next'
Sage Weil [Wed, 15 Feb 2012 21:16:25 +0000 (13:16 -0800)]
osd: fix _activate_committed replica->primary message
Normally we take a fresh map reference in PG::lock(). However,
_activate_committed needs to make sure the map hasn't changed significantly
before acting. In the case of #2068, the OSD map has moved forward and
the mapping has changed, but the PG hasn't processed that yet, and thus
mis-tags the MOSDPGInfo message.
Tag the message with the e epoch, and also pass down the primary's address
to send the message to the right location.
Fixes: #2068
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 16 Feb 2012 23:18:58 +0000 (15:18 -0800)]
osd: skip threadpool pause on shutdown when blackholed
We can't pause the threadpools if they're blocked on a blackholed
filestore. Instead, just call _exit().
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 15 Feb 2012 21:16:25 +0000 (13:16 -0800)]
osd: fix _activate_committed replica->primary message
Normally we take a fresh map reference in PG::lock(). However,
_activate_committed needs to make sure the map hasn't changed significantly
before acting. In the case of #2068, the OSD map has moved forward and
the mapping has changed, but the PG hasn't processed that yet, and thus
mis-tags the MOSDPGInfo message.
Tag the message with the e epoch, and also pass down the primary's address
to send the message to the right location.
Fixes: #2068
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 15 Feb 2012 23:20:35 +0000 (15:20 -0800)]
osd: do not always clear DEGRADED/set CLEAN on recovery finish
Clean means we have exactly the right number of replicas and recovery is
complete. Degraded means we do not have enough replicas, either because
recovery is in progress, or because acting is too small.
A consequence is that if we have a PG with len(up) == 1 but a pg_temp
mapping so that len(acting) == 2, it will be active and not clean.
Fixes: #2060
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Wido den Hollander [Wed, 15 Feb 2012 15:20:16 +0000 (16:20 +0100)]
init: Only check if auto start is disabled when the issued command is "start"
This still makes sure daemons don't start on boot.
When auto start was disabled it would also prevent logrotate from doing its job.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Sage Weil <sage@newdream.net>
Holger Macht [Wed, 15 Feb 2012 16:29:09 +0000 (17:29 +0100)]
ceph.spec.in: Move libcls_*.so from -devel to base package
OSDs (src/osd/ClassHandler.cc) specifically look for libcls_*.so in
/usr/$libdir/rados-classes, so libcls_rbd.so and libcls_rgw.so need to
be shipped along with the base package.
Signed-off-by: Holger Macht <hmacht@suse.de>
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 15 Feb 2012 17:04:22 +0000 (09:04 -0800)]
objclass: add debug_objclass knob, default to off
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 15 Feb 2012 17:03:28 +0000 (09:03 -0800)]
osd: reduce watch/notify debug noise
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 15 Feb 2012 16:09:32 +0000 (08:09 -0800)]
msgr: mark_all_down on shutdown
This ensures we destroy all the Pipes and discard their messages. Among
other things, this can avoid
2012-02-15 03:16:46.385242 7fe712b9a700 mon.f@5(peon) e1 *** Got Signal Terminated ***
2012-02-15 03:16:46.470227 7fe712b9a700 mon.f@5(peon) e1 shutdown
msg/SimpleMessenger.h: In function 'virtual SimpleMessenger::Pipe::~Pipe()' thread 7fe716a37780 time 2012-02-15 03:16:46.471005
msg/SimpleMessenger.h: 234: FAILED assert(!i->second->is_on_list())
ceph version 0.41-362-g40802ae (commit:40802ae883a94d205a8716065b80ad5d7ff57d12)
1: (SimpleMessenger::Pipe::~Pipe()+0x199) [0x4669d9]
2: (SimpleMessenger::~SimpleMessenger()+0x31) [0x552231]
3: (main()+0x3026) [0x4614a6]
4: (__libc_start_main()+0xfe) [0x7fe714dd6d8e]
5: /tmp/cephtest/binary/usr/local/bin/ceph-mon() [0x45e219]
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 15 Feb 2012 16:21:02 +0000 (08:21 -0800)]
osd: do not sync_and_flush if blackholed
If we have blackholed, this will block forever. In that case don't bother.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 15 Feb 2012 16:20:32 +0000 (08:20 -0800)]
workqueue: make pause/unpause count
We can pause() multiple times, and we need as many unpause()s to actually
resume work.
This resolves problems where we have two actors interested in pausing a
queue, both want to stop work, and they aren't interacting/coordinating.
Signed-off-by: Sage Weil <sage@newdream.net>
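The counted pause described in the commit above might look something like this. PauseCounter is a hypothetical class written for illustration; it models only the counting, not the draining of in-flight work that a real thread pool would also need.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical sketch of pause/unpause counting; not Ceph's ThreadPool.
class PauseCounter {
public:
  // Each caller that wants work stopped bumps the count.
  void pause() {
    std::lock_guard<std::mutex> l(lock_);
    ++paused_;
  }

  // Work resumes only after every pause() has been matched by an unpause().
  void unpause() {
    std::lock_guard<std::mutex> l(lock_);
    assert(paused_ > 0);  // unbalanced unpause() is a caller bug
    --paused_;
  }

  // Workers would check this before taking the next item off the queue.
  bool is_paused() const {
    std::lock_guard<std::mutex> l(lock_);
    return paused_ > 0;
  }

private:
  mutable std::mutex lock_;
  int paused_ = 0;
};
```

Two independent actors can each call pause() and unpause() without knowing about the other; the queue stays paused until both have released it.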
Sage Weil [Wed, 15 Feb 2012 06:05:36 +0000 (22:05 -0800)]
osd: exit code 0 on SIGINT/SIGTERM
This makes daemon-handler happy...
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 14 Feb 2012 17:09:39 +0000 (09:09 -0800)]
signals: check write(2) return values
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sun, 12 Feb 2012 22:35:03 +0000 (14:35 -0800)]
osd: semi-clean shutdown on signal
Make some effort to stop work in progress, remove pid file, and exit with
informative error code.
Note that this is much simpler than the shutdown() exit path; I'm not sure
whether a complete teardown is useful. It's also difficult to maintain
and get right with everything else going on, and it's not clear that it's
worth the effort right now.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 22:12:44 +0000 (14:12 -0800)]
mds: remove some cruft
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:39:27 +0000 (16:39 -0800)]
mds: remove pidfile
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 22:43:13 +0000 (14:43 -0800)]
mon: do a clean shutdown on SIGINT/SIGTERM
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:38:06 +0000 (16:38 -0800)]
mon: install async signal handlers for SIG{HUP,INT,TERM}
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:36:33 +0000 (16:36 -0800)]
osd: install async signal handlers for SIG{HUP,INT,TERM}
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:33:51 +0000 (16:33 -0800)]
mds: install async signal handlers for SIG{HUP,INT,TERM}
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:39:48 +0000 (16:39 -0800)]
signal: remove unused/obsolete handle_shutdown_signal
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:30:26 +0000 (16:30 -0800)]
signals: do not install default SIGHUP, SIGINT, SIGTERM handlers
These should be app specific and async.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 11 Feb 2012 17:45:06 +0000 (09:45 -0800)]
signals: implement safe async signal handler framework
Based on http://evbergen.home.xs4all.nl/unix-signals.html.
Instead of his design, though, we write single bytes, and create a pipe per
signal we have handlers registered for.
Signed-off-by: Sage Weil <sage@newdream.net>
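The design described above (a pipe per signal, one byte written per delivery) is a variant of the self-pipe technique. The following is a minimal sketch for a single signal on a POSIX system; install_pipe_handler() and drain_pending_signals() are hypothetical names, and the actual framework is not reproduced here.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

// One pipe for the single signal handled in this sketch; a fuller version
// would keep one pipe per registered signal, as the commit describes.
static int g_pipe_fds[2] = {-1, -1};

// Runs in signal context, so it only calls write(2), which POSIX lists as
// async-signal-safe, and it preserves errno for the interrupted code.
static void on_signal(int) {
  int saved_errno = errno;
  unsigned char byte = 0;
  ssize_t r = write(g_pipe_fds[1], &byte, 1);
  (void)r;  // nothing safe to do here on failure
  errno = saved_errno;
}

// Create the pipe, make both ends non-blocking, and install the handler.
bool install_pipe_handler(int signum) {
  if (pipe(g_pipe_fds) != 0)
    return false;
  for (int fd : g_pipe_fds) {
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
      return false;
  }
  struct sigaction sa = {};
  sa.sa_handler = on_signal;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  return sigaction(signum, &sa, nullptr) == 0;
}

// Called from ordinary (non-signal) context: consume and count pending bytes.
// The real work for each signal would be done here, outside the handler.
int drain_pending_signals() {
  int count = 0;
  unsigned char byte;
  while (read(g_pipe_fds[0], &byte, 1) == 1)
    ++count;
  return count;
}
```

A dedicated thread would normally block in poll() or select() on the read end and run the application-specific handler when a byte arrives; the non-blocking drain above just keeps the sketch short.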
Sage Weil [Wed, 15 Feb 2012 01:03:54 +0000 (17:03 -0800)]
libradospp: add config_t typedef
Don't expose internal CephContext type name.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 15 Feb 2012 01:03:00 +0000 (17:03 -0800)]
librados: use rados_config_t typedef instead of CephContext
Signed-off-by: Sage Weil <sage@newdream.net>
Tommi Virtanen [Tue, 14 Feb 2012 23:52:55 +0000 (15:52 -0800)]
doc: Balance backticks.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Sage Weil [Tue, 14 Feb 2012 22:01:22 +0000 (14:01 -0800)]
Merge branch 'wip-osd-hb'
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>