When connecting to a new monitor, we resend everything,
including the osd failure reports that were previously sent.
To do this, we first call requeue_failures() to move the in-flight
failure reports from failure_pending back to failure_queue, and
then call send_failures() to actually deliver them.
The problem is that send_failures() never resends a failure
report if it finds that the failed osd is already in the
failure_pending set. That check is necessary in the normal case, where
we don't want to report the same osd failure to the monitor twice.
This PR solves the problem by erasing each record from the failure_pending
set as it is requeued in requeue_failures(), so that the
subsequent call to send_failures() can resend the failure reports correctly.
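
The interaction described above can be illustrated with a minimal,
standalone sketch. It is not the actual OSD code: FailureTracker, the
sent vector, and the plain int timestamps and monitor ids are
hypothetical stand-ins (the real maps hold utime_t and entity_inst_t),
and send_failures() is modeled only on the behaviour described in this
message.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified model of the two containers discussed above.
struct FailureTracker {
  std::map<int, int> failure_queue;                    // osd -> failed-since, not yet reported
  std::map<int, std::pair<int, int>> failure_pending;  // osd -> (failed-since, monitor), reported
  std::vector<int> sent;                               // osds reported so far, in order

  // Move every in-flight report back to the queue and erase it from
  // failure_pending, so that send_failures() will not skip it.
  void requeue_failures() {
    for (auto p = failure_pending.begin(); p != failure_pending.end(); ) {
      failure_queue[p->first] = p->second.first;
      p = failure_pending.erase(p);  // C++11 form; erase(p++) is the C++98 equivalent
    }
  }

  // Report each queued osd, skipping any osd that is still pending.
  void send_failures(int monitor_id) {
    while (!failure_queue.empty()) {
      auto it = failure_queue.begin();
      int osd = it->first;
      if (failure_pending.count(osd) == 0) {
        sent.push_back(osd);  // stands in for sending the report to the monitor
        failure_pending[osd] = std::make_pair(it->second, monitor_id);
      }
      failure_queue.erase(it);
    }
  }
};
```

In this model, calling send_failures(), then requeue_failures(), then
send_failures() again reports the osd a second time. If the erase in
requeue_failures() is removed, the second call finds the osd still in
failure_pending and silently drops it.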
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
unsigned old_pending = failure_pending.size();
for (map<int,pair<utime_t,entity_inst_t> >::iterator p =
failure_pending.begin();
- p != failure_pending.end();
- ++p) {
+ p != failure_pending.end(); ) {
failure_queue[p->first] = p->second.first;
+ failure_pending.erase(p++);
}
dout(10) << __func__ << " " << old_queue << " + " << old_pending << " -> "
<< failure_queue.size() << dendl;