From bc2bfef7fe0b3ba5ae604bee803f4e61fe41f7c1 Mon Sep 17 00:00:00 2001
From: Greg Farnum
Date: Wed, 14 Jun 2017 13:27:24 -0700
Subject: [PATCH] doc: describe mark_events logging available via the OSD's OpTracker

Signed-off-by: Greg Farnum
---
 .../troubleshooting/troubleshooting-osd.rst | 52 ++++++++++++++++++-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/doc/rados/troubleshooting/troubleshooting-osd.rst b/doc/rados/troubleshooting/troubleshooting-osd.rst
index fe29f4767f9..f72c6a4adc1 100644
--- a/doc/rados/troubleshooting/troubleshooting-osd.rst
+++ b/doc/rados/troubleshooting/troubleshooting-osd.rst
@@ -417,7 +417,57 @@ Possible solutions
 - Upgrade Ceph
 - Restart OSDs
 
-
+Debugging Slow Requests
+-----------------------
+
+If you run "ceph daemon osd. dump_historic_ops" or "dump_ops_in_flight",
+you will see a set of operations and a list of events each operation went
+through. These are briefly described below.
+
+Events from the Messenger layer:
+
+- header_read: when the messenger first started reading the message off the wire
+- throttled: when the messenger tried to acquire memory throttle space to read
+  the message into memory
+- all_read: when the messenger finished reading the message off the wire
+- dispatched: when the messenger gave the message to the OSD
+- Initiated: : the primary marks this when it
+  hears about the above, but for a particular replica
+- commit_sent: we sent a reply back to the client (or primary OSD, for sub ops)
+
+Many of these events are seemingly redundant, but cross important boundaries in
+the internal code (such as passing data across locks into new threads).
 
 Flapping OSDs
 =============
-- 
2.39.5