Bug #3142 appears to be caused by the following sequence:
- object X missing on primary and replica
- [assert-ver,watch], notify, and unwatch requests come in and get deferred
- object is recovered on primary, !missing, create_object_context
- populate_obc_watchers() does nothing, since the object is still degraded
- notify happens now (odd but ok?)
- replica recovered, !degraded
- watch is skipped because of the bad assert
- unwatch trips up on an assert because populate_obc_watchers() never
  ran
Fix this by populating the obc watchers when !missing, not when
!degraded. This conditional dates back to Sam's original watch/notify
cleanup in October 2011.
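
The difference between the two guards can be sketched in isolation. This is
a minimal model, not the ReplicatedPG code: ObjectState and both helper
functions are hypothetical names introduced only for illustration, and the
three flags stand in for is_active(), is_missing_object() and
is_degraded_object().

```cpp
#include <cassert>

// Hypothetical stand-in for the state the guard consults (not a Ceph type).
struct ObjectState {
  bool active;    // models is_active()
  bool missing;   // models is_missing_object(): not yet recovered on the primary
  bool degraded;  // models is_degraded_object(): not yet recovered on a replica
};

// Guard before the fix: watchers are populated only once the object is
// neither missing nor degraded.
bool old_guard_populates(const ObjectState& s) {
  return s.active && !s.degraded && !s.missing;
}

// Guard after the fix: watchers are populated as soon as the object is
// no longer missing on the primary, even if it is still degraded.
bool new_guard_populates(const ObjectState& s) {
  return s.active && !s.missing;
}
```

For the state reached right after primary recovery in the sequence above
(active, !missing, still degraded), old_guard_populates() returns false and
new_guard_populates() returns true, so under this model the watchers are in
place before the deferred unwatch runs.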
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
 void ReplicatedPG::populate_obc_watchers(ObjectContext *obc)
 {
-  if (!is_active() || is_degraded_object(obc->obs.oi.soid) ||
-      is_missing_object(obc->obs.oi.soid))
+  if (!is_active() ||
+      is_missing_object(obc->obs.oi.soid)) {
+    dout(10) << "populate_obc_watchers " << obc->obs.oi.soid << " !active or missing, waiting" << dendl;
     return;
+  }
+  dout(10) << "populate_obc_watchers " << obc->obs.oi.soid << dendl;
   if (!obc->obs.oi.watchers.empty()) {
     Mutex::Locker l(osd->watch_lock);
     assert(obc->unconnected_watchers.size() == 0);