rgw: s/std::map/boost::container::flat_map/ in cls_bucket_list_ordered
(RGWRados and CLS).
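Probably faster, since flat_map keeps its entries in one contiguous sorted
array and so allocates less than the node-based std::map. Not measurably
slower in the runs below: wall-clock time is essentially unchanged
(4m48.991s before, 4m51.488s after), while user and sys time both drop.
In the radosgw profile the std::_Rb_tree symbols drop out of the top
entries and tcmalloc::CentralFreeList::FetchFromOneSpans falls from 7.06%
to 3.14%. Note that BEFORE is a 2nd run and AFTER a 1st run, so the
comparison is indicative only.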
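To illustrate why the flat layout allocates less, here is a minimal sketch
of a sorted-vector map. It is a std-only stand-in for the idea behind
boost::container::flat_map, not the code in this change: DirEntry,
FlatDirMap and demo_listing are hypothetical names, and the real change
simply swaps the container type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bucket directory entry
// (not the real rgw_bucket_dir_entry).
struct DirEntry {
  std::string name;
  std::uint64_t size;
};

// Minimal sorted-vector map: all entries live in one contiguous buffer,
// so there is no per-node allocation as there is with std::map.
class FlatDirMap {
public:
  using value_type = std::pair<std::string, DirEntry>;
  using const_iterator = std::vector<value_type>::const_iterator;

  // One up-front allocation when the expected entry count is known.
  void reserve(std::size_t n) { entries_.reserve(n); }

  // Insert while keeping keys sorted; returns false if the key exists.
  bool emplace(std::string key, DirEntry e) {
    const_iterator pos = lower_bound(key);
    if (pos != entries_.end() && pos->first == key) {
      return false;
    }
    entries_.emplace(pos, std::move(key), std::move(e));
    return true;
  }

  // Binary search over the sorted buffer.
  const_iterator find(const std::string& key) const {
    const_iterator pos = lower_bound(key);
    return (pos != entries_.end() && pos->first == key) ? pos : entries_.end();
  }

  const_iterator begin() const { return entries_.begin(); }
  const_iterator end() const { return entries_.end(); }
  std::size_t size() const { return entries_.size(); }

private:
  const_iterator lower_bound(const std::string& key) const {
    return std::lower_bound(
        entries_.begin(), entries_.end(), key,
        [](const value_type& v, const std::string& k) { return v.first < k; });
  }

  std::vector<value_type> entries_;
};

// Usage sketch: entries inserted out of order are iterated in key order,
// matching what an ordered listing needs from std::map.
inline std::vector<std::string> demo_listing() {
  FlatDirMap m;
  m.reserve(3);
  m.emplace("b.csv", DirEntry{"b.csv", 2});
  m.emplace("a.csv", DirEntry{"a.csv", 1});
  m.emplace("c.csv", DirEntry{"c.csv", 3});
  bool duplicate_inserted = m.emplace("a.csv", DirEntry{"a.csv", 9});
  assert(!duplicate_inserted);
  (void)duplicate_inserted;

  std::vector<std::string> names;
  for (const auto& kv : m) {
    names.push_back(kv.first);
  }
  return names;
}
```

The trade-off is that inserting into the middle shifts every later
element, so this layout suits maps that are filled once and then iterated
or searched.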
Examples are from a single-OSD vstart.sh cluster, with Ceph built at -O2.
BEFORE
[mbenjamin@lemon python]$ time s3cmd -c s3cfg_userx ls s3://DOCREQUEST_750/CSV/SUB1/ > /dev/null
real 4m48.991s
user 3m45.260s
sys 0m7.174s
(2nd run)
radosgw
Samples: 81K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 3189324729
Overhead Shared Object Symbol
7.06% libtcmalloc.so.4.5.1 [.] tcmalloc::CentralFreeList::FetchFromOneSpans
6.85% libstdc++.so.6.0.25 [.] std::__ostream_insert<char, std::char_traits<char> >
6.15% librados.so.2.0.0 [.] ceph::buffer::v14_2_0::list::iterator_impl<true>::copy
4.12% librados.so.2.0.0 [.] ceph::buffer::v14_2_0::ptr::copy_out
4.11% libstdc++.so.6.0.25 [.] std::basic_streambuf<char, std::char_traits<char> >::xsputn
3.49% libc-2.27.so [.] __memmove_avx_unaligned_erms
3.33% libtcmalloc.so.4.5.1 [.] tc_deletearray_aligned_nothrow
3.04% radosgw [.] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pa
2.46% radosgw [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare
2.45% libtcmalloc.so.4.5.1 [.] operator new[]
2.39% libstdc++.so.6.0.25 [.] std::ostream::sentry::sentry
2.36% radosgw [.] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pa
2.07% librados.so.2.0.0 [.] ceph::buffer::v14_2_0::list::iterator_impl<true>::advance
1.94% libc-2.27.so [.] __memcmp_avx2_movbe
1.93% radosgw [.] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pa
1.85% librados.so.2.0.0 [.] ceph::buffer::v14_2_0::list::iterator_impl<true>::copy
1.79% radosgw [.] rgw_bucket_dir::decode
1.76% libceph-common.so.0 [.] operator<<
1.42% radosgw [.] ceph::decode
1.35% libstdc++.so.6.0.25 [.] std::_Rb_tree_insert_and_rebalance
1.33% libtcmalloc.so.4.5.1 [.] tcmalloc::CentralFreeList::ReleaseToSpans
1.31% librados.so.2.0.0 [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate
1.24% [kernel] [k] copy_user_enhanced_fast_string
For a higher level overview, try: perf top --sort comm,dso
osd
Samples: 23K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 5059851086
Overhead Shared Object Symbol
8.57% libc-2.27.so [.] vfprintf
4.90% ceph-osd [.] ceph::logging::Log::_flush
4.66% libc-2.27.so [.] _IO_default_xsputn
3.49% libtcmalloc.so.4.5.1 [.] operator new[]
2.93% ceph-osd [.] StackStringBuf<4096ul>::xsputn
2.70% libstdc++.so.6.0.25 [.] std::__ostream_insert<char, std::char_traits<char> >
2.11% libpthread-2.27.so [.] __pthread_mutex_unlock_usercnt
1.91% libc-2.27.so [.] __memmove_avx_unaligned_erms
1.75% libstdc++.so.6.0.25 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned l
1.64% libc-2.27.so [.] _itoa_word
1.58% libtcmalloc.so.4.5.1 [.] tc_deletearray_aligned_nothrow
1.39% libstdc++.so.6.0.25 [.] std::ostream::sentry::sentry
1.33% [kernel] [k] stackleak_erase
1.27% [kernel] [k] copy_user_enhanced_fast_string
1.22% libc-2.27.so [.] __strlen_avx2
1.15% ceph-osd [.] ceph::buffer::v14_2_0::list::append
1.12% libstdc++.so.6.0.25 [.] std::ostream::_M_insert<unsigned long>
1.10% ceph-osd [.] ceph::buffer::v14_2_0::ptr::append
1.04% libpthread-2.27.so [.] __pthread_mutex_lock
1.01% libc-2.27.so [.] __tz_convert
0.94% [kernel] [k] entry_SYSCALL_64
0.94% ceph-osd [.] ceph::buffer::v14_2_0::list::iterator_impl<true>::copy
0.93% ceph-osd [.] ceph::logging::Log::submit_entry
AFTER
[mbenjamin@lemon python]$ time s3cmd -c s3cfg_userx ls s3://DOCREQUEST_750/CSV/SUB1/ > /dev/null
real 4m51.488s
user 3m36.785s
sys 0m5.689s
(1st run)
radosgw
Samples: 52K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 4426952205
Overhead Shared Object Symbol
6.11% librados.so.2.0.0 [.] ceph::buffer::v14_2_0::list::iterator_impl<true>::copy
5.83% libstdc++.so.6.0.25 [.] std::__ostream_insert<char, std::char_traits<char> >
3.89% radosgw [.] rgw_bucket_dir::decode
3.73% radosgw [.] rgw_bucket_dir_entry::rgw_bucket_dir_entry
3.68% librados.so.2.0.0 [.] ceph::buffer::v14_2_0::ptr::copy_out
3.37% libstdc++.so.6.0.25 [.] std::basic_streambuf<char, std::char_traits<char> >::xsputn
3.14% libtcmalloc.so.4.5.1 [.] tcmalloc::CentralFreeList::FetchFromOneSpans
2.55% libc-2.27.so [.] __memmove_avx_unaligned_erms
2.22% libtcmalloc.so.4.5.1 [.] tc_deletearray_aligned_nothrow
2.02% libstdc++.so.6.0.25 [.] std::ostream::sentry::sentry
1.79% librados.so.2.0.0 [.] ceph::buffer::v14_2_0::list::iterator_impl<true>::advance
1.77% [kernel] [k] n_tty_write
1.74% libc-2.27.so [.] vfprintf
1.71% libtcmalloc.so.4.5.1 [.] operator new[]
1.65% librados.so.2.0.0 [.] ceph::buffer::v14_2_0::list::iterator_impl<true>::copy
1.58% radosgw [.] ceph::decode
1.38% librados.so.2.0.0 [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate
1.22% [kernel] [k] stackleak_erase
1.12% libceph-common.so.0 [.] operator<<
1.09% libc-2.27.so [.] _IO_default_xsputn
1.06% libc-2.27.so [.] __memcmp_avx2_movbe
1.04% [kernel] [k] copy_user_enhanced_fast_string
0.93% librados.so.2.0.0 [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append
osd
Samples: 134K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 26819176020
Overhead Shared Object Symbol
8.82% libc-2.27.so [.] vfprintf
6.88% ceph-osd [.] ceph::logging::Log::_flush
4.80% libc-2.27.so [.] _IO_default_xsputn
4.15% libpthread-2.27.so [.] __pthread_mutex_unlock_usercnt
2.69% ceph-osd [.] StackStringBuf<4096ul>::xsputn
2.54% libstdc++.so.6.0.25 [.] std::__ostream_insert<char, std::char_traits<char> >
2.40% libtcmalloc.so.4.5.1 [.] operator new[]
2.39% libc-2.27.so [.] __memmove_avx_unaligned_erms
2.13% libcls_rgw.so.1.0.0 [.] boost::container::dtl::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<ch
1.61% libtcmalloc.so.4.5.1 [.] tc_deletearray_aligned_nothrow
1.54% libc-2.27.so [.] _itoa_word
1.49% libstdc++.so.6.0.25 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned l
1.41% libpthread-2.27.so [.] __pthread_mutex_lock
1.41% ceph-osd [.] ceph::logging::Log::submit_entry
1.23% libstdc++.so.6.0.25 [.] std::ostream::sentry::sentry
1.19% libc-2.27.so [.] __tz_convert
1.08% libc-2.27.so [.] __strlen_avx2
1.08% [kernel] [k] stackleak_erase
1.06% libcls_rgw.so.1.0.0 [.] rgw_bucket_list
1.02% ceph-osd [.] ceph::buffer::v14_2_0::list::append
0.99% ceph-osd [.] ceph::buffer::v14_2_0::ptr::append
0.96% [kernel] [k] copy_user_enhanced_fast_string
0.93% [kernel] [k] entry_SYSCALL_64
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>