RGWCoroutinesManager::run() was setting ret = -ECANCELED to break out of
the loop when it saw going_down. Coroutines that themselves failed with
-ECANCELED produced the same ret value, which confused this logic and
led to the coroutine deadlock assertions below. When we hit the
going_down case, set a 'canceled' flag, and check that flag instead of
ret when deciding whether to break out of the loop.

Fixes: http://tracker.ceph.com/issues/17465
Signed-off-by: Casey Bodley <cbodley@redhat.com>
int ret = 0;
int blocked_count = 0;
int interval_wait_count = 0;
+ bool canceled = false; // set on going_down
RGWCoroutinesEnv env;
uint64_t run_context = run_context_count.inc();
if (going_down.read() > 0) {
ldout(cct, 5) << __func__ << "(): was stopped, exiting" << dendl;
ret = -ECANCELED;
+ canceled = true;
break;
}
handle_unblocked_stack(context_stacks, scheduled_stacks, blocked_stack, &blocked_count);
iter = scheduled_stacks.begin();
}
- if (ret == -ECANCELED) {
+ if (canceled) {
break;
}
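
The ambiguity described in the commit message can be reproduced in isolation. The following is a minimal sketch, not the actual RGWCoroutinesManager code: `RunResult`, `run_with_sentinel`, `run_with_flag`, and the `rounds` input are hypothetical names invented for illustration, and each inner vector simply stands in for the return codes of the operations scheduled in one pass of the loop.

```cpp
#include <cerrno>
#include <cstddef>
#include <vector>

// Hypothetical summary of one run of the scheduling loop.
struct RunResult {
  std::size_t ops_run = 0;     // operations executed before the loop exited
  bool stopped_early = false;  // true if the outer loop broke out early
};

// Old behaviour (sketch): ret == -ECANCELED doubles as the shutdown signal.
RunResult run_with_sentinel(const std::vector<std::vector<int>>& rounds,
                            bool going_down) {
  RunResult out;
  int ret = 0;
  for (const auto& round : rounds) {
    for (int op_ret : round) {
      if (going_down) {
        ret = -ECANCELED;  // sentinel meaning "we were stopped"
        break;
      }
      ret = op_ret;        // an operation may also legitimately return -ECANCELED
      ++out.ops_run;
    }
    if (ret == -ECANCELED) {  // cannot tell shutdown from an operation failure
      out.stopped_early = true;
      break;
    }
  }
  return out;
}

// New behaviour (sketch): a dedicated flag records the shutdown request.
RunResult run_with_flag(const std::vector<std::vector<int>>& rounds,
                        bool going_down) {
  RunResult out;
  int ret = 0;
  bool canceled = false;  // set only on going_down
  for (const auto& round : rounds) {
    for (int op_ret : round) {
      if (going_down) {
        ret = -ECANCELED;
        canceled = true;
        break;
      }
      ret = op_ret;
      ++out.ops_run;
    }
    if (canceled) {  // only a real shutdown ends the loop
      out.stopped_early = true;
      break;
    }
  }
  (void)ret;  // ret is still tracked, but no longer drives the exit decision
  return out;
}
```

With `rounds = {{0, -ECANCELED}, {0, 0}}` and `going_down == false`, the sentinel version stops after the first round even though nothing asked it to, while the flag version runs all four operations. With `going_down == true`, both versions stop before running anything.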