If we come back up on the same address, there is a possible race. Other
nodes will mark_down our address when they see us go down. If we come
back up first, queue some messages, and _then_ they see that we're down
and mark_down, the messages we queued will get lost. Because sessions on
the cluster backend are stateful, we need to introduce an ordering so
that closing out the _old_ session doesn't break the new session. We do
this by binding to a new address (just a different port, actually)
before marking ourselves back up.

Fixes #592.

Signed-off-by: Sage Weil <sage@newdream.net>

     state = STATE_BOOTING;
     up_epoch = 0;
+    int r = cluster_messenger->rebind();
+    if (r != 0)
+      do_shutdown = true;  // FIXME: do_restart?
+
     reset_heartbeat_peers();
   }
 }
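
For illustration only, the ordering problem described in the commit message can be reproduced with a toy model. This is a sketch under loud assumptions: ToyPeer, deliver, pending, and surviving_messages are hypothetical names, not part of the Ceph messenger, and the model collapses the real session handling into a single map keyed by address. It only shows why a late mark_down for the old session is harmless once the restarted node uses a different port.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model only: a peer keeps one stateful session per address.
// None of these names come from Ceph; they are hypothetical.
struct ToyPeer {
  std::map<std::string, std::vector<std::string>> sessions;

  // A message from `addr` is queued on the session for that address.
  void deliver(const std::string& addr, const std::string& msg) {
    sessions[addr].push_back(msg);
  }

  // Closing out a session discards everything queued for that address.
  void mark_down(const std::string& addr) { sessions.erase(addr); }

  // Number of messages still queued for `addr`.
  std::size_t pending(const std::string& addr) const {
    auto it = sessions.find(addr);
    return it == sessions.end() ? 0 : it->second.size();
  }
};

// Replays the racy ordering: the restarted node sends from `new_addr`
// first, and only afterwards does the peer mark_down `old_addr`.
std::size_t surviving_messages(const std::string& old_addr,
                               const std::string& new_addr) {
  ToyPeer peer;
  peer.deliver(new_addr, "msg-1");
  peer.deliver(new_addr, "msg-2");
  peer.mark_down(old_addr);  // late close of the old session
  return peer.pending(new_addr);
}
```

Calling surviving_messages with the same address for both arguments models a restart without rebinding, where the late mark_down wipes the new session's queue. Passing a different port as new_addr models the rebind, and the queued messages survive.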