When the worker thread with the smallest thread index waits for future
work items from the mClock queue, it calls the oncommit callbacks. After
the callbacks run, the thread must continue waiting instead of returning
to the ShardedThreadPool::shardedthreadpool_worker() loop. Returning
causes the smallest-index thread of every shard to busy loop, which
results in very high CPU utilization.

The fix reacquires the shard_lock and waits on sdata_cond until notified
or until the wait period lapses. After the wait, the thread with the
smallest index repopulates the oncommits list from the context_queue
if there were any additions.
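
The wait-then-repopulate pattern described above can be sketched in
isolation as follows. This is a simplified illustration, not the OSD
code: the Shard struct, the work_ready flag, and wait_after_oncommits()
are hypothetical names, and a single mutex stands in for the separate
shard_lock and sdata_wait_lock used by the real code.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <vector>

using Callback = std::function<void()>;

// Hypothetical stand-in for one OSD shard (not the real Ceph type).
struct Shard {
  std::mutex shard_lock;
  std::condition_variable sdata_cond;
  std::vector<Callback> context_queue;  // callbacks queued by other threads
  bool work_ready = false;              // set when a work item becomes ready
};

// Must be called with `lock` held on shard.shard_lock. Runs the pending
// callbacks with the lock dropped, reacquires the lock, then waits until
// notified or until `wait_for` elapses. Returns true if work became ready
// before the timeout. On return the lock is held again and `oncommits`
// contains any callbacks that were queued in the meantime.
bool wait_after_oncommits(Shard& shard,
                          std::unique_lock<std::mutex>& lock,
                          std::vector<Callback>& oncommits,
                          std::chrono::milliseconds wait_for) {
  assert(lock.owns_lock());

  // Callbacks may take the shard lock themselves, so drop it first.
  lock.unlock();
  for (auto& cb : oncommits) {
    cb();
  }
  oncommits.clear();
  // Reacquire and keep waiting instead of returning to the caller's loop;
  // returning here is what would make the caller spin.
  lock.lock();

  const bool ready = shard.sdata_cond.wait_for(
      lock, wait_for, [&shard] { return shard.work_ready; });

  // Pick up callbacks that were added while we were busy or waiting.
  // `oncommits` is empty here, so the swap also empties context_queue.
  oncommits.swap(shard.context_queue);
  return ready;
}
```

Because the thread blocks on the condition variable with a deadline, it
wakes only when notified or when the deadline passes, rather than
re-entering the outer worker loop immediately.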
Fixes: https://tracker.ceph.com/issues/56530
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 180a5a7bffd4d96c472cc39447717958dd51bbd9)
if (is_smallest_thread_index) {
sdata->shard_lock.unlock();
handle_oncommits(oncommits);
- return;
+ sdata->shard_lock.lock();
}
std::unique_lock wait_lock{sdata->sdata_wait_lock};
auto future_time = ceph::real_clock::from_double(*when_ready);
// Reapply default wq timeouts
osd->cct->get_heartbeat_map()->reset_timeout(hb,
timeout_interval, suicide_interval);
+ // Populate the oncommits list if there were any additions
+ // to the context_queue while we were waiting
+ if (is_smallest_thread_index) {
+ sdata->context_queue.move_to(oncommits);
+ }
}
} // while