Summary:
While the yield instruction sounds conceptually correct, on most platforms it is
a simple nop that doesn't delay execution anywhere close to what an x86
pause instruction does. In other projects with spin-wait loops, an isb has been
observed to be much closer to the x86 behavior.
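For context, a spin-wait loop of the kind this hint is meant for might look
like the sketch below. It is illustrative only and is not RocksDB code: the
names CpuRelax, SpinLock and RunContended are hypothetical. The hint sits in
the inner read-only loop, so a longer delay per hint means fewer cache-line
probes while another thread holds the lock.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: the same per-architecture hint as in this change.
// On any other architecture it compiles to nothing.
static inline void CpuRelax() {
#if defined(__i386__) || defined(__x86_64__)
  asm volatile("pause");
#elif defined(__aarch64__)
  asm volatile("isb");
#elif defined(__powerpc64__)
  asm volatile("or 27,27,27");
#endif
}

// Minimal test-and-test-and-set spin lock (illustrative, not RocksDB's).
class SpinLock {
 public:
  void lock() {
    for (;;) {
      // Try to take the lock; exchange returns the previous value.
      if (!locked_.exchange(true, std::memory_order_acquire)) return;
      // Lock is held: spin on plain loads and back off with the hint.
      while (locked_.load(std::memory_order_relaxed)) CpuRelax();
    }
  }
  void unlock() { locked_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> locked_{false};
};

// Runs `threads` workers, each incrementing a shared counter `iters` times
// under the lock, and returns the final counter value.
long RunContended(int threads, int iters) {
  SpinLock lock;
  long counter = 0;
  std::vector<std::thread> workers;
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([&] {
      for (int i = 0; i < iters; ++i) {
        lock.lock();
        ++counter;
        lock.unlock();
      }
    });
  }
  for (auto& w : workers) w.join();
  return counter;
}
```
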
On a Graviton3 system, the following test improves by 2x on average over
20 runs with this change:
```
./db_bench -benchmarks=fillrandom -threads=64 -batch_size=1 \
  -memtablerep=skip_list -value_size=100 --num=100000 \
  -level0_slowdown_writes_trigger=9999 -level0_stop_writes_trigger=9999 \
  -disable_auto_compactions --max_write_buffer_number=8 -max_background_flushes=8 \
  --disable_wal --write_buffer_size=160000000 --block_size=16384 \
  --allow_concurrent_memtable_write -compression_type none
```
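The per-hint delay can also be looked at in isolation from db_bench. The
following is a rough sketch under stated assumptions, not a rigorous
benchmark: CpuRelax and NanosPerHint are hypothetical names, and the figure it
returns includes loop overhead and depends on the core and clock frequency.
Swapping "isb" for "yield" in the aarch64 branch and rebuilding would give a
number for the old behavior to compare against.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: the same per-architecture hint as in this change.
static inline void CpuRelax() {
#if defined(__i386__) || defined(__x86_64__)
  asm volatile("pause");
#elif defined(__aarch64__)
  asm volatile("isb");
#elif defined(__powerpc64__)
  asm volatile("or 27,27,27");
#endif
}

// Returns the average wall-clock nanoseconds per hint over `iters`
// back-to-back executions (loop overhead included).
double NanosPerHint(std::uint64_t iters) {
  const auto start = std::chrono::steady_clock::now();
  for (std::uint64_t i = 0; i < iters; ++i) CpuRelax();
  const auto end = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::nano> elapsed = end - start;
  return elapsed.count() / static_cast<double>(iters);
}
```
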
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10118
Reviewed By: jay-zhuang
Differential Revision: D37120578
fbshipit-source-id: c20bde4298222edfab7ff7cb6d42497e7012400d
#if defined(__i386__) || defined(__x86_64__)
asm volatile("pause");
#elif defined(__aarch64__)
- asm volatile("yield");
+ asm volatile("isb");
#elif defined(__powerpc64__)
asm volatile("or 27,27,27");
#endif