common/crc32c_intel_fast: avoid reading partial trailing word
The optimized intel code reads in word-sized chunks, knowing that the
allocator will only hand out memory in word-sized increments. This makes
valgrind unhappy. Whitelisting doesn't work because for some reason there
is no caller context (probably because of some interaction with yasm?).
Instead, just use the baseline code for the last few bytes. This should
not be significant.
Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>