We are running mysql on top of rbd. sysbench qps occasionally drops to zero
with the INSERT benchmark.
Debug code captured >2s latency between PG::queue_op() and OSD::dequeue_op().
We finally found out that the latency came from below code in OSD::ShardedOpWQ::_process(),
sdata->sdata_cond.WaitInterval(sdata->sdata_lock,
utime_t(osd->cct->_conf->threadpool_empty_queue_max_wait, 0));
"threadpool_empty_queue_max_wait" is 2s by default.
Normally, it should not sleep for 2s since the comming IO requests will wakeup it.
But there is a small timing window that it missed the wakeup signal actually.
For example,
msgr-worker-0 thread tp_osd_tp thread
OSD::ShardedOpWQ::_enqueue OSD::ShardedOpWQ::_process
--------------------------- ---------------------------
T1: sdata_op_ordering_lock.Lock()
T2: sdata_op_ordering_lock.Lock()
"queue empty"
sdata_op_ordering_lock.Unlock()
"insert op"
sdata_op_ordering_lock.Unlock()
T3: sdata_lock.Lock()
T4: sdata_lock.Lock()
"send wakeup signal"
sdata_lock.Unock()
// here the wakeup signal has no effect actually
// becuase it has not slept yet.
// then, it sleeps.
WaitInterval(2s)
This patch unlocks sdata_op_ordering_lock with sdata_lock hold in OSD::ShardedOpWQ::_process(),
then the timeline becomes,
msgr-worker-0 thread tp_osd_tp thread
OSD::ShardedOpWQ::_enqueue OSD::ShardedOpWQ::_process
--------------------------- ---------------------------
T1: sdata_op_ordering_lock.Lock()
T2: sdata_op_ordering_lock.Lock()
"queue empty"
sdata_lock.Lock()
T3: sdata_op_ordering_lock.Unlock()
"insert op"
sdata_op_ordering_lock.Unlock()
sdata_lock.Lock()
T4: WaitInterval(2s) -> it actually unlocks sdata_lock
"send wakeup signal"
sdata_lock.Unock()
//got signal, wakeup immediately
With this one line change, we can avoid occasional high latency.
Signed-off-by: Ming Lin <ming.lin@alibaba-inc.com>
if (sdata->pqueue->empty()) {
dout(20) << __func__ << " empty q, waiting" << dendl;
// optimistically sleep a moment; maybe another work item will come along.
- sdata->sdata_op_ordering_lock.Unlock();
osd->cct->get_heartbeat_map()->reset_timeout(hb,
osd->cct->_conf->threadpool_default_timeout, 0);
sdata->sdata_lock.Lock();
+ sdata->sdata_op_ordering_lock.Unlock();
sdata->sdata_cond.WaitInterval(sdata->sdata_lock,
utime_t(osd->cct->_conf->threadpool_empty_queue_max_wait, 0));
sdata->sdata_lock.Unlock();