osd: init local_connection for fast_dispatch in _send_boot()
We were not properly setting up Sessions on the local_connection for
fast_dispatch'ed Messages if the cluster_addr was set explicitly: the OSD
was not in the dispatch list at bind() time (in ceph_osd.cc), and nothing
called ms_handle_fast_connect() on the local_connection later on. This
issue was missed in testing because Inktank only uses unified NICs.
That led to errors like the following:
When doing an EC read, I hit a bug that reproduces 100% of the time. The
messages are:
2014-07-14 10:03:07.318681 7f7654f6e700 -1 osd/OSD.cc: In function
'virtual void OSD::ms_fast_dispatch(Message*)' thread 7f7654f6e700
time 2014-07-14 10:03:07.316782
osd/OSD.cc: 5019: FAILED assert(session)
ceph version 0.82-585-g79f3f67 (79f3f6749122ce2944baa70541949d7ca75525e6)
1: (OSD::ms_fast_dispatch(Message*)+0x286) [0x6544b6]
2: (DispatchQueue::fast_dispatch(Message*)+0x56) [0xb059d6]
3: (DispatchQueue::run_local_delivery()+0x6b) [0xb08e0b]
4: (DispatchQueue::LocalDeliveryThread::entry()+0xd) [0xa4a5fd]
5: (()+0x8182) [0x7f7665670182]
6: (clone()+0x6d) [0x7f7663a1130d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
To resolve this, we have the OSD invoke ms_handle_fast_connect() explicitly
in _send_boot(). It's not really an appropriate location, but we're already
doing a bunch of messenger twiddling there, so it's acceptable for now.
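
The ordering problem and the workaround can be illustrated with a small
standalone model. This is only a sketch: Session, Connection, Dispatcher,
Messenger, send_boot() and session_ready() below are simplified,
hypothetical stand-ins and not the real Ceph classes or signatures. It
assumes that bind() notifies only the dispatchers registered at that
moment, and that fast dispatch expects a Session to be attached already.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for the real Ceph classes; only the parts needed
// to show the ordering problem are modelled.
struct Session {};

struct Connection {
  std::shared_ptr<Session> priv;  // per-connection state, empty until set up
};

struct Dispatcher {
  // Called when a connection becomes usable; attaches a Session if missing.
  void ms_handle_fast_connect(Connection& con) {
    if (!con.priv)
      con.priv = std::make_shared<Session>();
  }
  // Fast dispatch relies on the Session being present. The real code
  // asserts here; this model just reports whether the Session exists.
  bool ms_fast_dispatch(const Connection& con) const {
    return con.priv != nullptr;
  }
};

struct Messenger {
  Connection local_connection;
  std::vector<Dispatcher*> dispatchers;

  void add_dispatcher(Dispatcher* d) { dispatchers.push_back(d); }

  // Assumption: bind() only notifies dispatchers registered at this moment.
  void bind() {
    for (Dispatcher* d : dispatchers)
      d->ms_handle_fast_connect(local_connection);
  }
};

// Sketch of the workaround: set up the local connection at boot time if
// nothing did so earlier.
void send_boot(Messenger& cluster_messenger, Dispatcher& osd) {
  if (!cluster_messenger.local_connection.priv)
    osd.ms_handle_fast_connect(cluster_messenger.local_connection);
  // The real function would go on to send the boot message here.
}

// Reproduces the ordering described above: bind first, register later.
bool session_ready(bool call_send_boot) {
  Messenger cluster_messenger;
  Dispatcher osd;
  cluster_messenger.bind();                // OSD not in the dispatch list yet
  cluster_messenger.add_dispatcher(&osd);  // too late for the notification
  if (call_send_boot)
    send_boot(cluster_messenger, osd);
  return osd.ms_fast_dispatch(cluster_messenger.local_connection);
}
```

In this model, session_ready(false) corresponds to the failing case (no
Session on the local connection when fast dispatch runs), while
session_ready(true) corresponds to the behaviour with the explicit call
in place.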
Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>