crimson/seastore: make RecordSubmitter::wait_available() idempotent 69121/head
author Kefu Chai <tchaikov@gmail.com>
Tue, 26 May 2026 14:01:41 +0000 (22:01 +0800)
committer Kefu Chai <k.chai@proxmox.com>
Wed, 27 May 2026 07:47:36 +0000 (15:47 +0800)
commit dc83077a2b4d515b591df76d8cbba35b2ccf9cb9
tree a0636cbf0e45d2669727fc40f176146a6535c41d
parent ab49b3aab85290987eac287a0c7bd89c8d287839
crimson/seastore: make RecordSubmitter::wait_available() idempotent

Under sustained 4K randwrite workloads that roll journal segments
frequently, crimson-osd hits
```
    crimson/os/seastore/journal/record_submitter.cc:198:
    FAILED ceph_assert(!is_available())
```
and, in release builds without assertions, a downstream
`boost::throw_exception<std::length_error>` from
`seastar::shared_promise::get_shared_future()` called on a
disengaged `std::optional` in the same code path.

`RecordSubmitter::roll_segment()` arms `wait_available_promise` on entry,
then chains `journal_allocator.roll().safe_then(...)` whose continuation
sets the promise's value and resets the optional. The background
continuation can resolve before the subsequent `wait_available()` call
is entered -- the optional gets reset, `is_available()` becomes true
again, and `wait_available()`'s `ceph_assert(!is_available())` fires. The
brittle invariant being assumed

> .safe_then's continuation will not run before its outer call returns

is not part of seastar's contract.
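
The ordering problem can be illustrated with a toy model in plain C++.
This is only a sketch: `ToyState` and `precondition_holds()` are
hypothetical names, the optional holds an `int` instead of a
`seastar::shared_promise`, and no Seastar scheduling is involved. It
shows just that the pre-patch precondition depends on whether the roll
continuation has already run.

```cpp
#include <optional>

// Hypothetical stand-in for RecordSubmitter's state: the submitter is
// "available" exactly when the optional promise is disengaged.
struct ToyState {
  std::optional<int> wait_available_promise;
  bool is_available() const { return !wait_available_promise.has_value(); }
};

// Returns true if the pre-patch precondition (!is_available()) still
// holds at the point where wait_available() would be entered.
bool precondition_holds(bool continuation_runs_early) {
  ToyState s;
  s.wait_available_promise.emplace(0);  // roll_segment() arms the promise
  if (continuation_runs_early) {
    // The roll continuation has already resolved and reset the optional.
    s.wait_available_promise.reset();
  }
  return !s.is_available();
}
```

With `continuation_runs_early == true` the function returns false,
which corresponds to the case where the assertion would fire.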

Honour the documented contract instead. `record_submitter.h` says:

> wait for available if cannot submit, should check
> is_available() again when the future is resolved.

The postcondition is "available when resolved"; the precondition
"unavailable when called" was incidental.  Make `wait_available()`
idempotent: if `is_available()` is already true on entry, return a
ready future immediately. All three external callers
- `RecordSubmitter::roll_segment`
- `CircularBoundedJournal::submit_record`
- `SegmentedOolWriter::do_write`

re-check `is_available()` on the next iteration or in the chained
continuation and dispatch correctly.
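
The idempotent behaviour described above can be sketched as follows.
This is a toy model under stated assumptions, not the actual patch:
`ToyPromise` and `ToySubmitter` are hypothetical, callbacks stand in for
Seastar futures, and `begin_roll()` / `finish_roll()` model the arming
and the continuation of `roll_segment()`.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for seastar::shared_promise<>: stores waiter
// callbacks and runs them when the value is set.
class ToyPromise {
public:
  void on_resolved(std::function<void()> cb) { waiters_.push_back(std::move(cb)); }
  void set_value() {
    for (auto& cb : waiters_) cb();
    waiters_.clear();
  }
private:
  std::vector<std::function<void()>> waiters_;
};

class ToySubmitter {
public:
  bool is_available() const { return !wait_available_promise_.has_value(); }

  // Models roll_segment() arming the promise on entry.
  void begin_roll() { wait_available_promise_.emplace(); }

  // Models the roll continuation. The optional is reset before waiters
  // run, so every waiter observes is_available() == true (an assumption
  // of this toy, since callbacks here run inline).
  void finish_roll() {
    assert(wait_available_promise_.has_value());
    ToyPromise p = std::move(*wait_available_promise_);
    wait_available_promise_.reset();
    p.set_value();
  }

  // Idempotent wait: if already available, complete immediately (the
  // analogue of returning a ready future); otherwise park the waiter.
  // Returns true when the callback ran immediately.
  bool wait_available(std::function<void()> on_ready) {
    if (is_available()) {
      on_ready();
      return true;
    }
    wait_available_promise_->on_resolved(std::move(on_ready));
    return false;
  }

private:
  std::optional<ToyPromise> wait_available_promise_;
};
```

In this model, calling `wait_available()` after `finish_roll()` has
already run completes at once instead of tripping an assertion, and a
caller that re-checks `is_available()` afterwards sees a consistent
state either way.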

Validated by running a 96-job fio randwrite bench to confirm
the fix in operation; pre-patch, the assert fires within ~30 min
and kills the OSD.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
src/crimson/os/seastore/journal/record_submitter.cc