osd: Optimized EC choose_async_recovery_ec must use auth_shard
Optimized EC pools modify how GetLog and choose_acting work.
If the auth_shard is a non-primary shard and the (new) primary
is behind the auth_shard, then we cannot just get the log from
the non-primary shard because it will be missing entries for
partial writes. Instead we need to get the log from a shard
that has the full log first and then repeat GetLog to get
the log from the auth_shard.
choose_acting was modifying auth_shard in the case where
we need to get the log from another shard first. This is
wrong - the remainder of the logic in choose_acting and
in particular choose_async_recovery_ec needs to use the
auth_shard to calculate what the acting set will be.
Using a different shard can occasionally cause a
different acting set to be selected (because of
thresholds on how many log entries behind a shard
needs to be before it performs async recovery) and
this can lead to two shards flip/flopping with
different opinions about what the acting set should be.
The fix is to separate the shard that will be returned
to GetLog from the auth_shard that will be used
for acting set calculations.
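
A minimal sketch of why the two shards must be kept separate.
This is a simplified model, not the Ceph code: Choice, choose,
async_candidates and the numeric threshold are hypothetical
names and values chosen for illustration only.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Hypothetical, simplified types; not the Ceph data structures.
using shard_t = int;
using version_t = std::uint64_t;

struct Choice {
  shard_t auth_shard;     // reference for acting set calculations
  shard_t get_log_shard;  // shard GetLog should fetch from next
};

// Keep auth_shard untouched; only the GetLog target changes when
// the auth shard holds a partial log (assumed flag for this sketch).
Choice choose(shard_t auth, bool auth_has_partial_log,
              shard_t full_log_shard)
{
  return Choice{auth,
                auth_has_partial_log ? full_log_shard : auth};
}

// Shards more than `threshold` entries behind `ref` are treated
// as async recovery candidates in this model.
std::set<shard_t> async_candidates(
    const std::map<shard_t, version_t>& last_update,
    shard_t ref, version_t threshold)
{
  std::set<shard_t> out;
  const version_t ref_v = last_update.at(ref);
  for (const auto& kv : last_update) {
    if (ref_v > kv.second && ref_v - kv.second > threshold) {
      out.insert(kv.first);
    }
  }
  return out;
}
```

With shards at versions {0: 100, 1: 90, 2: 40} and a threshold
of 55, measuring from auth shard 0 marks shard 2 for async
recovery, while measuring from shard 1 marks nothing. That
difference is the kind of disagreement described above.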
Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit 3c2161ee7350a05e0d81a23ce24cd0712dfef5fb)