log: avoid heap allocations for most log entries
Each log Entry now exists on the stack and uses a large (4k) buffer for its log
stream. This Entry is std::move'd to the queues (std::vector and
boost::circular_buffer) in the Log, involving only memory copies in the general
case. There are two memory copies (std::move) for any given Entry, once in
Log::submit_entry and again in Log::_flush
In practice, this eliminates 100% of allocations outside of startup
allocations
I've run a simple experiment with the MDS that copies /usr/bin to CephFS. I got
measurements for the number of allocations from the heap profiler and the
profile of CPU usage in the MDS.
** Before this patch **
== Heap profile: ==
$ google-pprof --alloc_objects --text bin/ceph-mds out/mds.a.profile.0001.heap
Total:
1105048 objects
433329 39.2% 39.2% 433329 39.2% ceph::logging::Log::create_entry
209311 18.9% 58.2% 209311 18.9% __gnu_cxx::__aligned_membuf::_M_addr (inline)
192963 17.5% 75.6% 192963 17.5% __gnu_cxx::new_allocator::allocate (inline)
61774 5.6% 81.2% 61774 5.6% std::__cxx11::basic_string::_M_mutate
37689 3.4% 84.6% 37689 3.4% ceph::buffer::raw_combined::create (inline)
22773 2.1% 86.7% 22773 2.1% mempool::pool_allocator::allocate (inline)
17761 1.6% 88.3% 20523 1.9% std::pair::pair (inline)
15795 1.4% 89.7% 15797 1.4% std::swap (inline)
11011 1.0% 90.7% 130061 11.8% std::__cxx11::list::_M_insert (inline)
10822 1.0% 91.7% 10822 1.0% std::__cxx11::basic_string::reserve
9108 0.8% 92.5% 32721 3.0% __gnu_cxx::new_allocator::construct (inline)
8608 0.8% 93.3% 8610 0.8% std::_Deque_base::_M_initialize_map (inline)
7694 0.7% 94.0% 7694 0.7% std::__cxx11::basic_string::_M_capacity (inline)
6160 0.6% 94.5% 6160 0.6% Journaler::wrap_finisher
6084 0.6% 95.1% 70892 6.4% std::map::operator[] (inline)
5347 0.5% 95.6% 5347 0.5% MutationImpl::add_projected_fnode
4381 0.4% 96.0% 7706 0.7% mempool::pool_allocator::construct (inline)
3588 0.3% 96.3% 182966 16.6% Locker::_do_cap_update
3049 0.3% 96.6% 5280 0.5% std::__shared_count::__shared_count (inline)
3043 0.3% 96.9% 3043 0.3% MDSLogContextBase::MDSLogContextBase (inline)
3038 0.3% 97.1% 14763 1.3% std::__uninitialized_copy::__uninit_copy (inline)
So approximately 430k heap allocations for Entry were created! The
basic_string::_M_mutate is also another allocation via PrebufferedStreambuf
== Profile data ==
Selecting interesting functions
Samples: 798K of event 'cycles:pp', Event count (approx
Children Self Com Shared Object Symbol
+ 1.04% 1.04% log libceph-common.so.0 [.] ceph::logging::Log::_flush
+ 0.05% 0.05% log libceph-common.so.0 [.] ceph::logging::Log::flush
+ 0.00% 0.00% log libceph-common.so.0 [.] ceph::logging::Log::_log_safe_write
+ 0.00% 0.00% log libceph-common.so.0 [.] ceph::logging::Log::entry
+ 0.00% 0.00% log libceph-common.so.0 [.] ceph::logging::Log::_flush_logbuf
...
Children Self Command Shared Object Symbol
+ 3.69% 0.00% safe_timer libceph-common.so.0 [.] CachedPrebufferedStreambuf::~CachedPrebufferedStreambuf
+ 0.53% 0.00% ms_dispatch libceph-common.so.0 [.] CachedPrebufferedStreambuf::~CachedPrebufferedStreambuf
+ 0.13% 0.00% fn_anonymous libceph-common.so.0 [.] CachedPrebufferedStreambuf::~CachedPrebufferedStreambuf
+ 0.00% 0.00% ms_dispatch libceph-common.so.0 [.] CachedPrebufferedStreambuf::create
+ 0.00% 0.00% fn_anonymous libceph-common.so.0 [.] CachedPrebufferedStreambuf::create
Children Self Command Shared Object Symbol
+ 0.07% 0.07% fn_anonymous libceph-common.so.0 [.] ceph::logging::Log::create_entry
+ 0.00% 0.00% ms_dispatch libceph-common.so.0 [.] ceph::logging::Log::create_entry
+ 0.00% 0.00% ms_dispatch ceph-mds [.] _ZN4ceph7logging3Log12create_entryEiiPm@plt
+ 0.00% 0.00% md_submit libceph-common.so.0 [.] ceph::logging::Log::create_entry
+ 0.00% 0.00% fn_anonymous ceph-mds [.] _ZN4ceph7logging3Log12create_entryEiiPm@plt
+ 0.00% 0.00% safe_timer libceph-common.so.0 [.] ceph::logging::Log::create_entry
+ 0.00% 0.00% mds_rank_progr libceph-common.so.0 [.] ceph::logging::Log::create_entry
+ 0.00% 0.00% mds_rank_progr ceph-mds [.] _ZN4ceph7logging3Log12create_entryEiiPm@plt
+ 0.00% 0.00% msgr-worker-0 libceph-common.so.0 [.] ceph::logging::Log::create_entry
+ 0.00% 0.00% msgr-worker-2 libceph-common.so.0 [.] ceph::logging::Log::create_entry
+ 0.00% 0.00% md_submit ceph-mds [.] _ZN4ceph7logging3Log12create_entryEiiPm@plt
+ 0.00% 0.00% msgr-worker-1 libceph-common.so.0 [.] ceph::logging::Log::create_entry
Children Self Command Shared Object Symbol
+ 8.29% 0.00% ms_dispatch libstdc++.so.6.0.24 [.] virtual thunk to std::basic_ostream<char, std::char_traits<char> >::~basic_ostream
+ 7.55% 1.46% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::_M_insert<long>
+ 3.87% 0.00% fn_anonymous libstdc++.so.6.0.24 [.] std::basic_ostream<char, std::char_traits<char> >::~basic_ostream
+ 2.92% 0.00% md_submit libstdc++.so.6.0.24 [.] virtual thunk to std::basic_ostream<char, std::char_traits<char> >::~basic_ostream
+ 2.38% 0.00% fn_anonymous libstdc++.so.6.0.24 [.] virtual thunk to std::basic_ostream<char, std::char_traits<char> >::~basic_ostream
+ 2.22% 2.22% fn_anonymous libstdc++.so.6.0.24 [.] std::ostream::sentry::sentry
+ 1.89% 0.13% fn_anonymous libstdc++.so.6.0.24 [.] std::__ostream_insert<char, std::char_traits<char> >
+ 0.71% 0.00% ms_dispatch libstdc++.so.6.0.24 [.] std::basic_ostream<char, std::char_traits<char> >::~basic_ostream
+ 0.39% 0.06% fn_anonymous libstdc++.so.6.0.24 [.] std::ostream::_M_insert<long>
+ 0.29% 0.21% ms_dispatch libstdc++.so.6.0.24 [.] std::__ostream_insert<char, std::char_traits<char> >
+ 0.27% 0.27% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::sentry::~sentry
+ 0.27% 0.27% fn_anonymous libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long>
+ 0.22% 0.22% ms_dispatch libstdc++.so.6.0.24 [.] std::basic_streambuf<char, std::char_traits<char> >::xsputn
+ 0.20% 0.20% ms_dispatch libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long>
+ 0.15% 0.15% fn_anonymous libstdc++.so.6.0.24 [.] std::ostream::sentry::~sentry
+ 0.14% 0.14% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::sentry::sentry
+ 0.13% 0.00% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::_M_insert<unsigned long>
+ 0.13% 0.13% fn_anonymous libstdc++.so.6.0.24 [.] std::basic_streambuf<char, std::char_traits<char> >::xsputn
+ 0.00% 0.00% fn_anonymous libstdc++.so.6.0.24 [.] std::ostream::_M_insert<unsigned long>
+ 0.00% 0.00% ms_dispatch libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long>
And the unittest_log time:
$ bin/unittest_log
[==========] Running 15 tests from 1 test case
[----------] Global test environment set-up
[----------] 15 tests from Log
[ RUN ] Log.Simple
[ OK ] Log.Simple (0 ms)
[ RUN ] Log.ReuseBad
[ OK ] Log.ReuseBad (1 ms)
[ RUN ] Log.ManyNoGather
[ OK ] Log.ManyNoGather (0 ms)
[ RUN ] Log.ManyGatherLog
[ OK ] Log.ManyGatherLog (12 ms)
[ RUN ] Log.ManyGatherLogStringAssign
[ OK ] Log.ManyGatherLogStringAssign (27 ms)
[ RUN ] Log.ManyGatherLogStringAssignWithReserve
[ OK ] Log.ManyGatherLogStringAssignWithReserve (27 ms)
[ RUN ] Log.ManyGatherLogPrebuf
[ OK ] Log.ManyGatherLogPrebuf (15 ms)
[ RUN ] Log.ManyGatherLogPrebufOverflow
[ OK ] Log.ManyGatherLogPrebufOverflow (15 ms)
[ RUN ] Log.ManyGather
[ OK ] Log.ManyGather (8 ms)
[ RUN ] Log.InternalSegv
[WARNING] /home/pdonnell/cephfs-shell/src/googletest/googletest/src/gtest-death-test.cc:836:: Death tests use fork(), which is unsafe particularly in a threaded context. For this test, Google Test detected 3 threads
[ OK ] Log.InternalSegv (8 ms)
[ RUN ] Log.LargeLog
[ OK ] Log.LargeLog (43 ms)
[ RUN ] Log.TimeSwitch
[ OK ] Log.TimeSwitch (1 ms)
[ RUN ] Log.TimeFormat
[ OK ] Log.TimeFormat (0 ms)
[ RUN ] Log.Speed_gather
[ OK ] Log.Speed_gather (1779 ms)
[ RUN ] Log.Speed_nogather
[ OK ] Log.Speed_nogather (64 ms)
[----------] 15 tests from Log (2000 ms total)
[----------] Global test environment tear-down
[==========] 15 tests from 1 test case ran. (2000 ms total)
[ PASSED ] 15 tests
** After Patch **
The StackStreamBuf uses 4k for its default buffer. This appears to be more than
reasonable for preventing allocations for logging
== Heap profile: ==
$ google-pprof --alloc_objects --text bin/ceph-mds out/mds.a.profile.0001.heap
Total:
1052127 objects
384957 36.6% 36.6% 384957 36.6% __gnu_cxx::new_allocator::allocate (inline)
274720 26.1% 62.7% 274720 26.1% __gnu_cxx::__aligned_membuf::_M_addr (inline)
68496 6.5% 69.2% 68496 6.5% std::__cxx11::basic_string::_M_mutate
44140 4.2% 73.4% 51828 4.9% std::pair::pair (inline)
43091 4.1% 77.5% 43091 4.1% ceph::buffer::raw_combined::create (inline)
27706 2.6% 80.1% 236407 22.5% std::__cxx11::list::_M_insert (inline)
25699 2.4% 82.6% 25699 2.4% std::__cxx11::basic_string::reserve
23183 2.2% 84.8% 23183 2.2% mempool::pool_allocator::allocate (inline)
19466 1.9% 86.6% 81716 7.8% __gnu_cxx::new_allocator::construct (inline)
17606 1.7% 88.3% 17606 1.7% std::__cxx11::basic_string::_M_capacity (inline)
16879 1.6% 89.9% 16881 1.6% std::swap (inline)
8572 0.8% 90.7% 8574 0.8% std::_Deque_base::_M_initialize_map (inline)
8477 0.8% 91.5% 11808 1.1% mempool::pool_allocator::construct (inline)
6166 0.6% 92.1% 6166 0.6% Journaler::wrap_finisher
6080 0.6% 92.7% 134975 12.8% std::map::operator[] (inline)
6079 0.6% 93.3% 6079 0.6% MutationImpl::add_projected_fnode
== Profile data ==
Samples: 62K of event 'cycles:u', Event count (approx.)
Overhead Command Shared Object Symbol
+ 5.91% log libc-2.23.so [.] vfprintf
+ 5.75% ms_dispatch libstdc++.so.6.0.24 [.] std::__ostream_insert<char, std::char_traits<char> >
+ 4.59% ms_dispatch ceph-mds [.] StackStringBuf<4096ul>::xsputn
+ 4.26% ms_dispatch libc-2.23.so [.] __memmove_ssse3_back
+ 4.07% log libceph-common.so.0 [.] ceph::logging::Log::_flush
+ 2.48% ms_dispatch libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<l
+ 2.13% fn_anonymous libstdc++.so.6.0.24 [.] std::__ostream_insert<char, std::char_traits<char> >
+ 2.09% ms_dispatch ceph-mds [.] CDir::check_rstats
+ 2.06% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::sentry::sentry
+ 1.98% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::sentry::~sentry
+ 1.87% log libc-2.23.so [.] __strcpy_sse2_unaligned
+ 1.60% fn_anonymous ceph-mds [.] StackStringBuf<4096ul>::xsputn
+ 1.46% log libc-2.23.so [.] _IO_default_xsputn
+ 1.45% log libc-2.23.so [.] _itoa_word
+ 1.43% fn_anonymous libc-2.23.so [.] __memmove_ssse3_back
+ 1.40% ms_dispatch libstdc++.so.6.0.24 [.] std::ostream::_M_insert<long>
+ 0.98% fn_anonymous libstdc++.so.6.0.24 [.] std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<l
+ 0.89% ms_dispatch libstdc++.so.6.0.24 [.] 0x
+ 0.88% ms_dispatch libstdc++.so.6.0.24 [.] std::_Rb_tree_increment
And the unittest_log time:
$ bin/unittest_log
[==========] Running 13 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 13 tests from Log
[ RUN ] Log.Simple
[ OK ] Log.Simple (1 ms)
[ RUN ] Log.ReuseBad
[ OK ] Log.ReuseBad (0 ms)
[ RUN ] Log.ManyNoGather
[ OK ] Log.ManyNoGather (0 ms)
[ RUN ] Log.ManyGatherLog
[ OK ] Log.ManyGatherLog (83 ms)
[ RUN ] Log.ManyGatherLogStringAssign
[ OK ] Log.ManyGatherLogStringAssign (79 ms)
[ RUN ] Log.ManyGatherLogStackSpillover
[ OK ] Log.ManyGatherLogStackSpillover (81 ms)
[ RUN ] Log.ManyGather
[ OK ] Log.ManyGather (80 ms)
[ RUN ] Log.InternalSegv
[WARNING] /home/pdonnell/ceph/src/googletest/googletest/src/gtest-death-test.cc:836:: Death tests use fork(), which is unsafe particularly in a threaded context. For this test, Google Test detected 3 threads.
[ OK ] Log.InternalSegv (7 ms)
[ RUN ] Log.LargeLog
[ OK ] Log.LargeLog (55 ms)
[ RUN ] Log.TimeSwitch
[ OK ] Log.TimeSwitch (4 ms)
[ RUN ] Log.TimeFormat
[ OK ] Log.TimeFormat (1 ms)
[ RUN ] Log.Speed_gather
[ OK ] Log.Speed_gather (1441 ms)
[ RUN ] Log.Speed_nogather
[ OK ] Log.Speed_nogather (63 ms)
[----------] 13 tests from Log (1895 ms total)
[----------] Global test environment tear-down
[==========] 13 tests from 1 test case ran. (1895 ms total)
[ PASSED ] 13 tests.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>