Use __builtin_clz() supported by GCC and Clang to figure out
how many bits we should shift instead of shifting by a bit
in a loop until the value gets normalized. Improves performance
of this function by up to 3x in worst-case scenario and overall
straw2 performance by ~10%.
Signed-off-by: Piotr Dałek <git@predictor.org.pl>
/* normalize input */
iexpon = 15;
- while (!(x & 0x18000)) {
- x <<= 1;
- iexpon--;
+
+ // figure out number of bits we need to shift and
+ // do it in one step instead of iteratively
+ if (!(x & 0x18000)) {
+ int bits = __builtin_clz(x & 0x1FFFF) - 16;
+ x <<= bits;
+ iexpon = 15 - bits;
}
index1 = (x >> 8) << 1;