osd/PeeringState: clear LAGGY and WAIT states on exiting Started 31864/head
author Sage Weil <sage@redhat.com>
Mon, 25 Nov 2019 19:15:24 +0000 (13:15 -0600)
committer Sage Weil <sage@redhat.com>
Mon, 25 Nov 2019 19:15:24 +0000 (13:15 -0600)
commit 7bbc724d99e998bf6e06c3d32dc68348ab6aa45a
tree 4dd572d8102bae4528753f8a0c86d4f13294e69c
parent 67c96be4f6fca0ff48e8d152452d9ddb8fc8a042
osd/PeeringState: clear LAGGY and WAIT states on exiting Started

These flags were not getting cleared except in recheck_readable(), which
meant that a flag from a prior interval could bleed into a new interval.
More dangerously, in a mixed-version cluster, one interval might include
all octopus+ OSDs while the next might include a pre-octopus OSD, bypassing
most of the laggy recheck code.  This could lead to a stalled request
and/or requeue ordering bug when release_object_locks() looked at
is_laggy() and put a lock waiter on the waiting_for_readable list.
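
The flag leak described above can be illustrated with a small standalone model. This is only a sketch, not the code in src/osd/PeeringState.cc: the ToyPeeringState type, the TOY_STATE_* values, and the exit_started() helper are hypothetical names invented for illustration, and the clear_on_exit switch exists only to contrast the old and new behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values for illustration only; they are not Ceph's
// real PG_STATE_* constants.
constexpr uint64_t TOY_STATE_ACTIVE = 1ull << 0;
constexpr uint64_t TOY_STATE_LAGGY  = 1ull << 1;
constexpr uint64_t TOY_STATE_WAIT   = 1ull << 2;

// Toy stand-in for the per-PG state bitmask.
struct ToyPeeringState {
  uint64_t state = 0;

  void state_set(uint64_t mask)   { state |= mask; }
  void state_clear(uint64_t mask) { state &= ~mask; }

  bool is_laggy() const { return (state & TOY_STATE_LAGGY) != 0; }
  bool is_wait() const  { return (state & TOY_STATE_WAIT) != 0; }
};

// Models leaving the Started state when the interval changes.
// With clear_on_exit == false, stale flags survive (the old behavior
// described in the commit message); with true, they are dropped.
void exit_started(ToyPeeringState& ps, bool clear_on_exit) {
  if (clear_on_exit) {
    ps.state_clear(TOY_STATE_LAGGY | TOY_STATE_WAIT);
  }
}

// Returns true if a LAGGY flag set during one interval is still
// visible after the PG has left Started for the next interval.
bool laggy_leaks_into_next_interval(bool clear_on_exit) {
  ToyPeeringState ps;
  ps.state_set(TOY_STATE_ACTIVE | TOY_STATE_LAGGY);
  assert(ps.is_laggy());  // sanity check: flag is set in the prior interval
  exit_started(ps, clear_on_exit);
  return ps.is_laggy();
}
```

In this model, laggy_leaks_into_next_interval(false) reports the stale flag, while laggy_leaks_into_next_interval(true) does not, so a later is_laggy() check in the new interval would no longer see state left over from the old one.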

Fixes: https://tracker.ceph.com/issues/42978
Signed-off-by: Sage Weil <sage@redhat.com>
src/osd/PeeringState.cc