author	Sage Weil <sage@redhat.com>
	Wed, 25 Oct 2017 03:32:18 +0000 (22:32 -0500)
committer	Sage Weil <sage@redhat.com>
	Thu, 26 Oct 2017 02:52:12 +0000 (21:52 -0500)
commit	efd1a7714767ce80ec0b20059cfb66f8b18be12d
tree	8c305c830c30dae43a7b1fcb9938c585342bdbed
parent	ebb4093c2c8ac10ddba92866634d77882975511f

osd/PG: make scan recovery op cancellation match up reliably

Previously, there was only one way to end up in this region of code:
when the backfill was rejected by the peer.  At that point we apparently
always had an outstanding SCAN request, since the code would
unconditionally cancel the MAX recovery op and clear waiting_on_backfill.

See 624aaf2a4ea9950153a89ff921e2adce683a6f51 for when this code appeared.

Now we have several similar paths, and we don't always have an outstanding
scan call (I don't think!).  Regardless, move most of these three cases into
a common helper and make the finish_recovery_op completion conditional
on whether there is an outstanding SCAN.  This fixes a leak of a recovery
op when we defer while a scan is outstanding (this bug was recently
introduced by e708410542b0a52fbb29e14b76f49c94adbc0a59 and then
duplicated by 2463c6463d1ed38a2e15a0960ed1530a47851489).
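
As a rough illustration of the conditional completion described above, here
is a minimal standalone sketch.  It is not the code in src/osd/PG.cc: the
struct BackfillSketch, its members active_ops, start_scan and cancel_scan
are hypothetical names, and it assumes a single MAX op is registered per
scan round.  Only waiting_on_backfill is a name taken from this message.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the PG's backfill bookkeeping (not Ceph's types).
struct BackfillSketch {
  std::set<int> waiting_on_backfill;  // peers we still expect a SCAN reply from
  int active_ops = 0;                 // recovery ops currently registered

  // Send a SCAN to the given peers and register one MAX op for the round.
  void start_scan(const std::set<int>& peers) {
    assert(waiting_on_backfill.empty());
    waiting_on_backfill = peers;
    ++active_ops;
  }

  // Common helper for the reject/defer/cancel paths.  The MAX op is finished
  // only if a SCAN is actually outstanding, so the counter neither leaks nor
  // underflows.  Returns true if an op was released.
  bool cancel_scan() {
    if (waiting_on_backfill.empty())
      return false;                   // no SCAN outstanding, nothing to finish
    waiting_on_backfill.clear();
    --active_ops;
    return true;
  }
};
```

In this sketch, calling cancel_scan() while a scan is outstanding releases
the op, and calling it with nothing outstanding is a no-op, which is the
pairing the helper is meant to guarantee.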

Note that there is still one other time we register MAX ops: when we are
finishing backfill.  There, we start one per target.  But we will always
get back our reply and process it in the normal way (that old commit
did not change the timing for these).

Signed-off-by: Sage Weil <sage@redhat.com>

src/osd/PG.cc
src/osd/PG.h