git.apps.os.sepia.ceph.com Git

author	Samuel Just <sam.just@inktank.com>
	Tue, 11 Mar 2014 17:31:55 +0000 (10:31 -0700)
committer	Samuel Just <sam.just@inktank.com>
	Wed, 12 Mar 2014 17:38:17 +0000 (10:38 -0700)
commit	a576eb320463ee79feff7b0a973cad117cd98cf9
tree	b8ede8885d28ef60bb2618978cf7406cb434cd3b
parent	83731a75d7f29778dafff5e08a3ebc5da1498665

PG: do not serve requests until replicas have activated

There are two problems:
1) We choose the min last_update among peers with the max local-les
value as an upper bound on requests which could have been reported to
the client as committed.  We then, for ec pools, roll back to that point
to ensure that we don't inadvertently commit to an update which fewer
than K replicas actually saw.  If the primary sets local-les, accepts an
update from a client, and there is a new interval before any of the
replicas have been activated, we will end up being forced to use that
update which no other replica has seen as the new last_update.  This
will cause the object to become unfound.  We don't have this problem as
long as all active replicas agree on last_update before we accept IO.
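
As a rough illustration of the selection rule in 1), and not the actual
PG code: PeerInfo and committed_upper_bound are hypothetical names, and
versions are flattened to plain integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified peer summary; not the real Ceph info struct.
struct PeerInfo {
    uint32_t local_les;    // last epoch started, as recorded by this peer
    uint64_t last_update;  // newest log entry, flattened to an integer
};

// Returns the min last_update among the peers that share the max
// local_les, i.e. the upper bound on what may have been reported to the
// client as committed.
uint64_t committed_upper_bound(const std::vector<PeerInfo>& peers) {
    assert(!peers.empty());
    uint32_t max_les = 0;
    for (const auto& p : peers)
        max_les = std::max(max_les, p.local_les);
    uint64_t bound = UINT64_MAX;
    for (const auto& p : peers)
        if (p.local_les == max_les)
            bound = std::min(bound, p.last_update);
    return bound;
}
```

With peers {les 5, last_update 11}, {4, 10}, {4, 10}, only the primary
carries the max local-les, so the bound is its own last_update of 11,
an update no replica has seen.  Once the replicas have also recorded
les 5 at last_update 10, the bound drops to 10.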

2) Even for replicated pools, we would then immediately respond to the
request which created the primary-only update with a commit since it is
in the log and we have no outstanding repops.  If we then lose that
primary before any of the replicas in the new interval record the new
log, we will not only lose the object, but also the log entry recording
it, which will result in a lost write.

For these reasons, it seems we need to wait for the replicas to
activate before we can process new requests, because whatever update
we select as last_update is effectively regarded as committed as soon
as we accept IO.
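
A minimal sketch of that gating idea, assuming a hypothetical
ActivationGate helper that parks requests until every replica has
reported activation.  None of these names come from PG.cc or
ReplicatedPG.cc, and the real change may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: defers client requests until all replicas in the
// acting set have acknowledged activation.
class ActivationGate {
public:
    explicit ActivationGate(std::set<int> replicas)
        : waiting_on_(std::move(replicas)) {}

    // Process the request only if no activation is outstanding;
    // otherwise park it in arrival order.
    void submit(const std::string& op) {
        if (waiting_on_.empty())
            processed_.push_back(op);
        else
            deferred_.push_back(op);
    }

    // Called when replica `osd` reports that it has activated.
    void replica_activated(int osd) {
        waiting_on_.erase(osd);
        if (!waiting_on_.empty())
            return;
        // Every replica is active: replay parked requests in order.
        while (!deferred_.empty()) {
            processed_.push_back(deferred_.front());
            deferred_.pop_front();
        }
        assert(deferred_.empty());
    }

    const std::vector<std::string>& processed() const { return processed_; }
    std::size_t deferred_count() const { return deferred_.size(); }

private:
    std::set<int> waiting_on_;           // replicas not yet activated
    std::deque<std::string> deferred_;   // requests parked until then
    std::vector<std::string> processed_; // requests actually served
};
```

A request submitted while any replica is still pending stays deferred;
it is served, in arrival order, only after the last replica_activated()
call empties the waiting set.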

Fixes: #7649
Signed-off-by: Samuel Just <sam.just@inktank.com>

src/osd/PG.cc
src/osd/ReplicatedPG.cc