author	Sage Weil <sage@redhat.com>
	Mon, 25 Nov 2019 19:15:24 +0000 (13:15 -0600)
committer	Sage Weil <sage@redhat.com>
	Mon, 25 Nov 2019 19:15:24 +0000 (13:15 -0600)
commit	7bbc724d99e998bf6e06c3d32dc68348ab6aa45a
tree	4dd572d8102bae4528753f8a0c86d4f13294e69c
parent	67c96be4f6fca0ff48e8d152452d9ddb8fc8a042

osd/PeeringState: clear LAGGY and WAIT states on exiting Started

These flags were cleared only in recheck_readable(), which meant that a
flag set in a prior interval could bleed into a new interval. More
dangerously, in a mixed-version cluster, one interval might include only
octopus+ OSDs while the next might include a pre-octopus OSD, in which
case most of the laggy recheck code is bypassed. This could lead to a
stalled request and/or a requeue ordering bug when release_object_locks()
looked at is_laggy() and put a lock waiter on the waiting_for_readable
list.
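
As a rough illustration of the failure mode and the fix, here is a minimal
standalone sketch. It is not the Ceph code: ToyPeeringState, the flag
values, exit_started() and flags_leak_into_next_interval() are all
hypothetical stand-ins, and the clear_on_exit switch exists only to
compare the old and new behavior side by side.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits for illustration; not the real PG state values.
constexpr std::uint64_t TOY_STATE_LAGGY = 1ull << 0;
constexpr std::uint64_t TOY_STATE_WAIT  = 1ull << 1;

// Toy stand-in for a peering state object holding a flag bitmask.
struct ToyPeeringState {
  std::uint64_t state = 0;

  void state_set(std::uint64_t mask)   { state |= mask; }
  void state_clear(std::uint64_t mask) { state &= ~mask; }
  bool is_laggy() const { return (state & TOY_STATE_LAGGY) != 0; }
  bool is_wait()  const { return (state & TOY_STATE_WAIT) != 0; }

  // Called when the toy "Started" state is left at the end of an interval.
  // With clear_on_exit == true, both flags are dropped so that the next
  // interval begins clean; with false, they are left as-is (old behavior).
  void exit_started(bool clear_on_exit) {
    if (clear_on_exit) {
      state_clear(TOY_STATE_LAGGY | TOY_STATE_WAIT);
    }
  }
};

// Simulates one interval that sets both flags, then an interval change.
// Returns true if either flag is still set when the next interval starts.
bool flags_leak_into_next_interval(bool clear_on_exit) {
  ToyPeeringState ps;
  ps.state_set(TOY_STATE_LAGGY | TOY_STATE_WAIT);  // set during interval 1
  ps.exit_started(clear_on_exit);                  // interval 1 ends
  // Interval 2 starts here. Assume nothing in it rechecks readability,
  // as in the mixed-version case described above.
  return ps.is_laggy() || ps.is_wait();
}
```

In this sketch, flags_leak_into_next_interval(false) reports a leak and
flags_leak_into_next_interval(true) does not, since any code consulting
is_laggy() in the second interval only sees the flag when it was not
cleared on exit.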

Fixes: https://tracker.ceph.com/issues/42978
Signed-off-by: Sage Weil <sage@redhat.com>