The sharded OpWQ opportunistically waits for more work when it
processes an empty queue. While it waits, the work queue's default
heartbeat grace and suicide_grace values are overridden: the
`threadpool_default_timeout` grace is applied and suicide_grace is
disabled.

The same overridden values were applied again once the wait found a
work item. If that op hung, the heartbeat watchdog would not trigger an
OSD suicide recovery.

Re-apply the work queue's default grace and suicide_grace values after
finding work. This keeps the heartbeat timeouts consistent with the
values applied on _process() entry.
Fixes: https://tracker.ceph.com/issues/45076
Signed-off-by: Dan Hill <daniel.hill@canonical.com>
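
The effect of the change can be illustrated with a small, self-contained sketch. This is not Ceph's HeartbeatMap: every `toy_*` name, the `Verdict` enum, the integer clock, and the values 15, 150, and 60 are assumptions made for illustration. The sketch only models the rule described above, that a suicide grace of 0 disables the hard deadline, and compares the verdict for a hung op with and without re-applying the work queue timeouts after the wait.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a heartbeat handle. Hypothetical; not Ceph's API.
struct ToyHandle {
  std::int64_t timeout = 0;          // soft deadline; 0 means unset
  std::int64_t suicide_timeout = 0;  // hard deadline; 0 means disabled
};

enum class Verdict { Healthy, Unhealthy, Suicide };

// Arm both deadlines relative to `now`.
// A suicide_grace of 0 disables the hard deadline.
void toy_reset_timeout(ToyHandle& h, std::int64_t now,
                       std::int64_t grace, std::int64_t suicide_grace) {
  assert(grace > 0);
  h.timeout = now + grace;
  h.suicide_timeout = suicide_grace > 0 ? now + suicide_grace : 0;
}

// What a watchdog would conclude about the handle at time `now`.
Verdict toy_check(const ToyHandle& h, std::int64_t now) {
  if (h.suicide_timeout != 0 && now > h.suicide_timeout)
    return Verdict::Suicide;
  if (h.timeout != 0 && now > h.timeout)
    return Verdict::Unhealthy;
  return Verdict::Healthy;
}

// The worker enters at t=0, waits on an empty queue, finds work at t=5,
// and the op then hangs. The watchdog looks at the handle at t=1000.
Verdict toy_hung_op_verdict(bool reapply_after_wait) {
  const std::int64_t timeout_interval = 15;   // illustrative value
  const std::int64_t suicide_interval = 150;  // illustrative value
  const std::int64_t default_timeout = 60;    // illustrative value

  ToyHandle hb;
  // Values applied on entry.
  toy_reset_timeout(hb, 0, timeout_interval, suicide_interval);
  // Values applied while waiting for more work: suicide disabled.
  toy_reset_timeout(hb, 0, default_timeout, 0);

  if (reapply_after_wait) {
    // New behavior: restore the work queue values once work is found.
    toy_reset_timeout(hb, 5, timeout_interval, suicide_interval);
  } else {
    // Old behavior: keep the waiting values, suicide still disabled.
    toy_reset_timeout(hb, 5, default_timeout, 0);
  }
  return toy_check(hb, 1000);
}
```

In this model, `toy_hung_op_verdict(false)` can only ever report the handle as unhealthy, while `toy_hung_op_verdict(true)` reaches the hard deadline, which is the case the watchdog needs in order to act on a hung op.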
       sdata->shard_lock.unlock();
       return;
     }
+    // found a work item; reapply default wq timeouts
     osd->cct->get_heartbeat_map()->reset_timeout(hb,
-      osd->cct->_conf->threadpool_default_timeout, 0);
+      timeout_interval, suicide_interval);
   } else {
     dout(20) << __func__ << " need return immediately" << dendl;
     wait_lock.unlock();