git.apps.os.sepia.ceph.com Git - ceph.git/commit
mon: scale heartbeat grace based on laggy probability, interval
author Sage Weil <sage@inktank.com>
Tue, 4 Sep 2012 20:39:23 +0000 (13:39 -0700)
committer Sage Weil <sage@inktank.com>
Tue, 18 Sep 2012 21:39:00 +0000 (14:39 -0700)
commit adf0fe6a10ece6c2e48ecf6c66e849dfddf95656
tree 94f4f61e2fae02b05d3553441fd8c59daa3175fa
parent 3f51d31639eb5af4e907fa316f1643b02ddb8f27

If, based on historical behavior, an observed OSD failure is likely to be
due to unresponsiveness rather than the daemon having stopped, scale the
heartbeat grace period accordingly:

 grace' = grace + laggy_probability * laggy_interval

This avoids needlessly marking OSDs down and generating additional
map update overhead when the cluster is overloaded and may already be
struggling to keep up with map updates. See #3045.
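
The scaling rule above can be sketched as a small standalone function. This
is only an illustration of the formula, not the code in
src/mon/OSDMonitor.cc: the function name scaled_grace, its parameter types,
and the range checks are assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical helper illustrating the formula from the commit message.
// Names, types, and the range checks are assumptions, not Ceph's actual code.
//
//   base_grace:        configured heartbeat grace, in seconds
//   laggy_probability: estimated likelihood (0..1) that a reported failure
//                      is caused by unresponsiveness rather than a stopped
//                      daemon
//   laggy_interval:    how long, in seconds, the OSD has historically stayed
//                      unresponsive before recovering
double scaled_grace(double base_grace, double laggy_probability,
                    double laggy_interval) {
  assert(laggy_probability >= 0.0 && laggy_probability <= 1.0);
  assert(laggy_interval >= 0.0);
  // grace' = grace + laggy_probability * laggy_interval
  return base_grace + laggy_probability * laggy_interval;
}
```

With illustrative inputs of a 20-second grace, a laggy probability of 0.5,
and a laggy interval of 60 seconds, the function returns 50 seconds. An OSD
with no history of lagging (probability 0) keeps the unscaled grace.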

Signed-off-by: Sage Weil <sage@inktank.com>
src/mon/OSDMonitor.cc