From: Shai Fultheim
Date: Thu, 28 May 2026 07:24:05 +0000 (+0300)
Subject: crimson/os/seastore: wake blocked IO on BackgroundProcess wakeup
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=8f3a16656982dfdfd55a6229c90884de365036e1;p=ceph.git

crimson/os/seastore: wake blocked IO on BackgroundProcess wakeup

BackgroundProcess::run() blocks on blocking_background when
background_should_run() returns false. When the loop wakes up
(typically because arm_blocking_io_and_wake() kicked it), the loop
re-checks background_should_run(). If that still returns false (e.g.
the cleaner has no pending work), the loop goes straight back to sleep.

But the wake may also have been triggered by a change in the
space-availability condition (should_block_io), and there is no path
that re-evaluates blocked user IO when background_should_run() is
false. The blocked IO has no future trigger to re-check, and stays
stuck until the next unrelated wake-up happens to come at the right
moment, or never.

Add a maybe_wake_blocked_io() call right after the wake. It's a no-op
when nothing is blocked, and unblocks IO that the space condition has
already cleared otherwise.

Observed in sustained random-write benches that wedged with IO blocked
indefinitely while the cleaner sat idle (should_run=false).

Signed-off-by: Shai Fultheim
---

diff --git a/src/crimson/os/seastore/extent_placement_manager.cc b/src/crimson/os/seastore/extent_placement_manager.cc
index 024dce94028..c6110a28de3 100644
--- a/src/crimson/os/seastore/extent_placement_manager.cc
+++ b/src/crimson/os/seastore/extent_placement_manager.cc
@@ -841,6 +841,12 @@ ExtentPlacementManager::BackgroundProcess::run()
       assert(!blocking_background);
       blocking_background = seastar::promise<>();
       co_await blocking_background->get_future();
+      // After waking (typically because arm_blocking_io_and_wake() kicked us),
+      // give any blocked user IO a chance to proceed.  Without this call the
+      // loop would go straight back to sleep if background_should_run() is
+      // still false, but the space condition (should_block_io) may already be
+      // satisfied, leaving blocked IO stuck with no future trigger to re-check.
+      maybe_wake_blocked_io();
     }
   }
   log_state("run(exit)");
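
The stall described in the commit message can be illustrated with a small
single-threaded toy model. This is only a sketch under stated assumptions:
it does not use Seastar, and ToyBackground, space_blocked, recheck_on_wake,
wake_while_idle() and stuck_after_wake() are hypothetical names invented
for illustration, not SeaStore code. Only maybe_wake_blocked_io() borrows
its name from the patch, and its body here is a guess at the idea (release
waiters once the space condition has cleared), not the real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the background process; not SeaStore's type.
struct ToyBackground {
  bool space_blocked = true;     // assumed analogue of should_block_io
  bool recheck_on_wake = false;  // true models the behaviour after the patch
  std::vector<int> blocked_io;   // ids of user IOs waiting for space
  std::vector<int> released_io;  // ids of user IOs that were released

  // No-op when nothing is blocked or space is still short; otherwise
  // releases every waiter.
  void maybe_wake_blocked_io() {
    if (space_blocked || blocked_io.empty()) {
      return;
    }
    released_io.insert(released_io.end(), blocked_io.begin(), blocked_io.end());
    blocked_io.clear();
  }

  // Models the point right after the awaited wake-up when there is still
  // no background work: the loop is about to go back to sleep.
  void wake_while_idle() {
    if (recheck_on_wake) {
      maybe_wake_blocked_io();
    }
    // Without the re-check nothing looks at blocked_io before sleeping.
  }
};

// Returns how many IOs remain stuck after one idle wake-up that follows
// the space condition clearing.
std::size_t stuck_after_wake(bool recheck_on_wake) {
  ToyBackground bg;
  bg.recheck_on_wake = recheck_on_wake;
  bg.blocked_io = {1, 2, 3};
  bg.space_blocked = false;  // space became available before the wake
  bg.wake_while_idle();
  assert(bg.blocked_io.size() + bg.released_io.size() == 3);
  return bg.blocked_io.size();
}
```

In this model, stuck_after_wake(false) leaves all three IOs blocked, while
stuck_after_wake(true) releases them, which mirrors the effect the patch
aims for in the idle path of run().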