msgr: more conservative locking, thread join asserts
We caught a bunch of crashes like this:
10.02.11 17:01:01.600660 7f87070c3950 -- 10.3.14.134:6800/8203 >> 10.3.14.130:6800/18914 pipe(0x7fc2be2cebe0 sd=36 pgs=2409 cs=1 l=0).do_sendmsg error Broken pipe
10.02.11 17:01:01.600700 7f87070c3950 -- 10.3.14.134:6800/8203 >> 10.3.14.130:6800/18914 pipe(0x7fc2be2cebe0 sd=36 pgs=2409 cs=1 l=0).writer error sending 0x7fc27da1c570, 32: Broken pipe
10.02.11 17:01:01.600796 7f87070c3950 -- 10.3.14.134:6800/8203 >> 10.3.14.130:6800/18914 pipe(0x7fc2be2cebe0 sd=-1 pgs=2409 cs=1 l=0).fault initiating reconnect
...
./common/Thread.h: In function 'int Thread::join(void**)':
./common/Thread.h:66: FAILED assert(0)
1: (Thread::join(void**)+0x73) [0x64fcd3]
2: (SimpleMessenger::Pipe::join_reader()+0x68) [0x6555a2]
3: (SimpleMessenger::Pipe::connect()+0xf5) [0x645be9]
4: (SimpleMessenger::Pipe::writer()+0x157) [0x64793d]
5: (SimpleMessenger::Pipe::Writer::entry()+0x19) [0x63e107]
6: (Thread::_entry_func(void*)+0x20) [0x64e816]
7: /lib/libpthread.so.0 [0x7fc2c3bbdfc7]
8: (clone()+0x6d) [0x7fc2c2e005ad]
These look like multiple threads racing into join_reader(). Add an
assert to catch that if it happens again, and wrap thread starts in
pipe_lock so the _running flags stay in sync with reality. Add a few
other sanity checks too.
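
For illustration only, a minimal sketch of the pattern described above,
written against std::thread and std::mutex rather than the real
SimpleMessenger code. PipeSketch, reader_joining, is_reader_running()
and the trivial reader body are hypothetical names made up for this
example. The idea is that the running flag only changes under
pipe_lock, and a second thread entering join_reader() while a join is
in flight trips an assert instead of calling join on the same thread
twice.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a pipe; NOT the SimpleMessenger::Pipe code.
class PipeSketch {
  std::mutex pipe_lock;
  std::thread reader_thread;
  bool reader_running = false;   // only read or written under pipe_lock
  bool reader_joining = false;   // set while some thread is inside join()
  std::atomic<int> reads{0};

  void reader() { ++reads; }     // placeholder for the real read loop

public:
  ~PipeSketch() { join_reader(); }

  // Start the thread and set the flag under the same lock, so the
  // flag can never disagree with whether a thread actually exists.
  void start_reader() {
    std::lock_guard<std::mutex> l(pipe_lock);
    assert(!reader_running);
    reader_thread = std::thread([this] { reader(); });
    reader_running = true;
  }

  void join_reader() {
    std::unique_lock<std::mutex> l(pipe_lock);
    if (!reader_running)
      return;                    // nothing to join
    assert(!reader_joining);     // catches two threads racing in here
    reader_joining = true;
    l.unlock();                  // do not hold the lock across join()
    reader_thread.join();
    l.lock();
    assert(reader_joining);
    reader_joining = false;
    reader_running = false;
  }

  bool is_reader_running() {
    std::lock_guard<std::mutex> l(pipe_lock);
    return reader_running;
  }
};
```

In this sketch a join_reader() call with no reader running is a no-op,
so a reconnect path can call it unconditionally. The assert only fires
when a second caller arrives while the first is still blocked in join().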