Instead of looking at the current time we process the message, look at the
receive time. This gives us a more real failure time given that messages
may be requeued.
It doesn't solve the problem when messages are forwarded between monitors
due to an election, but that's ok; this is still a net improvement.
Signed-off-by: Sage Weil <sage@inktank.com>
// calculate failure time
utime_t now = ceph_clock_now(g_ceph_context);
- utime_t failed_since = now - utime_t(m->failed_for ? m->failed_for : g_conf->osd_heartbeat_grace, 0);
+ utime_t failed_since = m->get_recv_stamp() - utime_t(m->failed_for ? m->failed_for : g_conf->osd_heartbeat_grace, 0);
if (m->if_osd_failed()) {
// add a report