Apply the prefetch policy based on the residual length (the bytes still
needed after the cache has been drained) instead of the originally
requested length.
For example, suppose recv_max_prefetch is 4K and the read sequence is 1K, 5K, 2K.
**Before this change:**
- read 1K: as 1K < recv_max_prefetch, we prefetch 4K, and the 1K read
itself is served from the cache once the prefetch is done, leaving 3K
in the cache.
- read 5K: the first 3K is served from the cache, which is now empty.
As 5K > recv_max_prefetch, we don't prefetch and issue a direct 2K read
instead.
- read 2K: the cache is empty. As 2K < recv_max_prefetch, we trigger
another prefetch and get 2K from the cache once the prefetch is done.
**After this change:**
- read 1K: as 1K < recv_max_prefetch, we prefetch 4K, and the 1K read
itself is served from the cache once the prefetch is done, leaving 3K
in the cache.
- read 5K: the first 3K is served from the cache, which is now empty,
and 5K - 3K = 2K remains to be read. As 2K < recv_max_prefetch, we
prefetch again and get 2K from the cache once the prefetch is done,
leaving 2K of data in the cache.
- read 2K: served entirely from the cache.
In the above example, we now need exactly 2 (prefetch) reads instead of
the 3 reads needed before.
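
The read counts in the example can be reproduced with a small simulation. This is only a sketch of the policy as described in this message, not the actual implementation. The function name `count_socket_reads` and the `use_residual` flag are hypothetical. It also assumes that every prefetch returns exactly `max_prefetch` bytes, and that a prefetch happens when the compared length is strictly less than `max_prefetch`.

```python
K = 1024


def count_socket_reads(requests, max_prefetch, use_residual):
    """Count the socket reads needed to serve `requests` in order.

    Hypothetical model: use_residual=False compares the originally
    requested length against max_prefetch (old behavior);
    use_residual=True compares the residual length (new behavior).
    """
    cached = 0        # bytes currently available in the prefetch cache
    socket_reads = 0  # prefetches plus direct reads

    for want in requests:
        # Serve as much as possible from the cache first.
        from_cache = min(want, cached)
        cached -= from_cache
        residual = want - from_cache
        if residual == 0:
            continue  # fully served from the cache, no socket read

        policy_len = residual if use_residual else want
        socket_reads += 1
        if policy_len < max_prefetch:
            # Prefetch max_prefetch bytes (assumed to succeed in full);
            # the residual is served from them, the rest stays cached.
            cached = max_prefetch - residual
        # Otherwise: direct read of exactly `residual` bytes, and the
        # cache stays empty.

    return socket_reads


reads = [1 * K, 5 * K, 2 * K]
print(count_socket_reads(reads, 4 * K, use_residual=False))  # 3
print(count_socket_reads(reads, 4 * K, use_residual=True))   # 2
```

In this model the two policies differ only in which length is compared against `max_prefetch`. The saving comes from the second read, where the residual 2K qualifies for a prefetch even though the requested 5K does not.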