While analyzing a log from Mike Dawson I saw a long stall while librbd's
objectcacher was starting lots (many hundreds) of IOs. Limit the amount of
time we spend doing this at a time to allow IO replies to be processed so
that the cache remains responsive.
I'm not sure this warrants a tunable (which we would need to add for both
libcephfs and librbd).
Signed-off-by: Sage Weil <sage@inktank.com>
#include "include/assert.h"
+#define MAX_FLUSH_UNDER_LOCK 20 ///< max bh's we start writeback on while holding the lock
+
/*** ObjectCacher::BufferHead ***/
utime_t cutoff = ceph_clock_now(cct);
cutoff -= max_dirty_age;
BufferHead *bh = 0;
+ int max = MAX_FLUSH_UNDER_LOCK;
while ((bh = static_cast<BufferHead*>(bh_lru_dirty.lru_get_next_expire())) != 0 &&
- bh->last_write < cutoff) {
+ bh->last_write < cutoff &&
+ --max > 0) {
ldout(cct, 10) << "flusher flushing aged dirty bh " << *bh << dendl;
bh_write(bh);
}