For a synchronous read we don't need this counter: we never aio_wait for
sync reads. For an aio_read, the read is already included in the
aio_running count, so the counter serves no purpose there either.
I'm a bit unsure about the NVME use of this counter. I switched it to use
num_running (I'm pretty sure we aren't mixing reads and writes on a single
IOContext), *but* it might make more sense to switch to a private counter.