git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/commit
osd: unlock sdata_op_ordering_lock with sdata_lock held to avoid missing wakeup signal (15891/head)
author Ming Lin <ming.lin@alibaba-inc.com>
Fri, 23 Jun 2017 17:28:19 +0000 (10:28 -0700)
committer Ming Lin <ming.lin@alibaba-inc.com>
Fri, 23 Jun 2017 17:30:35 +0000 (10:30 -0700)
commit bc683385819146f3f6f096ceec97e1226a3cd237
tree 2e8b4eb975ce04bd3a8da1a4ac15df3f23f6e477
parent 9d1f4b68a35e1cb6183863bc9e83369075f742f0
osd: unlock sdata_op_ordering_lock with sdata_lock held to avoid missing wakeup signal

We are running MySQL on top of RBD. With the sysbench INSERT benchmark,
QPS occasionally drops to zero.

Debug code captured >2s latency between PG::queue_op() and OSD::dequeue_op().
We traced the latency to the following code in OSD::ShardedOpWQ::_process():

sdata->sdata_cond.WaitInterval(sdata->sdata_lock,
      utime_t(osd->cct->_conf->threadpool_empty_queue_max_wait, 0));

"threadpool_empty_queue_max_wait" is 2s by default.

Normally the thread should not sleep for the full 2s, since incoming IO requests wake it up.
But there is a small timing window in which it misses the wakeup signal.
For example,

     msgr-worker-0 thread                    tp_osd_tp thread
     OSD::ShardedOpWQ::_enqueue              OSD::ShardedOpWQ::_process
     ---------------------------             ---------------------------
T1:                                          sdata_op_ordering_lock.Lock()
T2:  sdata_op_ordering_lock.Lock()
                                             "queue empty"
                                             sdata_op_ordering_lock.Unlock()
     "insert op"
     sdata_op_ordering_lock.Unlock()

T3:  sdata_lock.Lock()
T4:                                          sdata_lock.Lock()
     "send wakeup signal"
     sdata_lock.Unlock()
                                             // here the wakeup signal has no effect
                                             // because the thread is not waiting yet.

                                             // then, it sleeps.
                                             WaitInterval(2s)
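
The interleaving above can be replayed outside of Ceph. The following is a minimal sketch, assuming std::mutex and std::condition_variable as stand-ins for Ceph's Mutex and Cond. RacyShard and old_ordering_misses_wakeup are hypothetical names, and the promises exist only to force the T1..T4 ordering; they are not part of the OSD code. A spurious wakeup could in principle end the wait early, so treat the result as illustrative.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for one OSD shard; field names mirror the commit message.
struct RacyShard {
  std::mutex sdata_op_ordering_lock;   // protects the queue
  std::mutex sdata_lock;               // paired with sdata_cond
  std::condition_variable sdata_cond;
  std::deque<int> queue;
};

// Replays the pre-patch interleaving on purpose. Returns true when the
// worker's timed wait expired even though an op was already queued.
bool old_ordering_misses_wakeup(std::chrono::milliseconds max_wait) {
  RacyShard s;
  std::promise<void> saw_empty;        // worker -> enqueuer: "queue empty" seen
  std::promise<void> signal_sent;      // enqueuer -> worker: notify already done
  std::future<void> saw_empty_f = saw_empty.get_future();
  std::future<void> signal_sent_f = signal_sent.get_future();

  std::thread enqueuer([&] {
    saw_empty_f.wait();                // run only after the empty check
    {
      std::lock_guard<std::mutex> l(s.sdata_op_ordering_lock);
      s.queue.push_back(1);            // "insert op"
    }
    {
      std::lock_guard<std::mutex> l(s.sdata_lock);
      s.sdata_cond.notify_one();       // "send wakeup signal": nobody waits yet
    }
    signal_sent.set_value();
  });

  bool timed_out = false;
  {
    std::unique_lock<std::mutex> ordering(s.sdata_op_ordering_lock);
    if (s.queue.empty()) {
      ordering.unlock();               // old code: drop the ordering lock first
      saw_empty.set_value();
      signal_sent_f.wait();            // hold the race window open
      std::unique_lock<std::mutex> l(s.sdata_lock);
      // The signal was sent before this wait started, so it is lost.
      timed_out = s.sdata_cond.wait_for(l, max_wait) == std::cv_status::timeout;
    }
  }
  enqueuer.join();
  std::lock_guard<std::mutex> l(s.sdata_op_ordering_lock);
  return timed_out && !s.queue.empty();
}
```

Calling old_ordering_misses_wakeup(std::chrono::milliseconds(50)) is expected to sleep for the whole 50ms with one op sitting in the queue, which is the same stall as the 2s one described above, only shorter.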

This patch unlocks sdata_op_ordering_lock while holding sdata_lock in OSD::ShardedOpWQ::_process(),
so the timeline becomes,

     msgr-worker-0 thread                    tp_osd_tp thread
     OSD::ShardedOpWQ::_enqueue              OSD::ShardedOpWQ::_process
     ---------------------------             ---------------------------
T1:                                          sdata_op_ordering_lock.Lock()
T2:  sdata_op_ordering_lock.Lock()
                                             "queue empty"
                                             sdata_lock.Lock()
T3:                                          sdata_op_ordering_lock.Unlock()
     "insert op"
     sdata_op_ordering_lock.Unlock()
     sdata_lock.Lock()

T4:                                          WaitInterval(2s) -> releases sdata_lock while waiting
     "send wakeup signal"
     sdata_lock.Unlock()
                                             // got the signal, wakes up immediately
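
The patched ordering can be sketched the same way. This is again a minimal illustration, assuming std::mutex and std::condition_variable in place of Ceph's Mutex and Cond; FixedShard and new_ordering_catches_wakeup are hypothetical names and the promise only forces the enqueuer to start after the empty check. Because the worker takes sdata_lock before it releases sdata_op_ordering_lock, the enqueuer cannot send the signal until the worker is inside the timed wait.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for one OSD shard; field names mirror the commit message.
struct FixedShard {
  std::mutex sdata_op_ordering_lock;   // protects the queue
  std::mutex sdata_lock;               // paired with sdata_cond
  std::condition_variable sdata_cond;
  std::deque<int> queue;
};

// Replays the post-patch interleaving. Returns true when the worker was
// woken by the enqueuer's signal instead of running into the timeout.
bool new_ordering_catches_wakeup(std::chrono::milliseconds max_wait) {
  FixedShard s;
  std::promise<void> saw_empty;        // worker -> enqueuer: "queue empty" seen
  std::future<void> saw_empty_f = saw_empty.get_future();

  std::thread enqueuer([&] {
    saw_empty_f.wait();
    {
      // Blocks until the worker releases the ordering lock (T3).
      std::lock_guard<std::mutex> l(s.sdata_op_ordering_lock);
      s.queue.push_back(1);            // "insert op"
    }
    // Blocks until the worker is inside wait_for, which releases sdata_lock.
    std::lock_guard<std::mutex> l(s.sdata_lock);
    s.sdata_cond.notify_one();         // "send wakeup signal"
  });

  bool woken = false;
  {
    std::unique_lock<std::mutex> ordering(s.sdata_op_ordering_lock);
    if (s.queue.empty()) {
      saw_empty.set_value();
      std::unique_lock<std::mutex> l(s.sdata_lock);  // take sdata_lock first
      ordering.unlock();                             // then drop the ordering lock
      woken = s.sdata_cond.wait_for(l, max_wait) == std::cv_status::no_timeout;
    }
  }
  enqueuer.join();
  return woken;
}
```

With a 2s limit, new_ordering_catches_wakeup(std::chrono::milliseconds(2000)) is expected to return almost immediately, since the wait is already in progress when the signal arrives.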

With this one-line change, we can avoid the occasional high latency.

Signed-off-by: Ming Lin <ming.lin@alibaba-inc.com>
src/osd/OSD.cc