From: Samuel Just
Date: Wed, 18 Mar 2015 19:02:04 +0000 (-0700)
Subject: doc: add last_epoch_started.rst
X-Git-Tag: v0.94~28^2~1
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=2956ae278daa8744e52e7c69fe5d5416267b84a4;p=ceph.git

doc: add last_epoch_started.rst

Signed-off-by: Samuel Just
---

diff --git a/doc/dev/osd_internals/last_epoch_started.rst b/doc/dev/osd_internals/last_epoch_started.rst
new file mode 100644
index 000000000000..fcb930f48b62
--- /dev/null
+++ b/doc/dev/osd_internals/last_epoch_started.rst
@@ -0,0 +1,39 @@
+======================
+last_epoch_started
+======================
+
+info.last_epoch_started records an activation epoch e for interval i
+such that all writes committed in i or earlier are reflected in the
+local info/log and no writes after i are reflected in the local
+info/log. Since no committed write is ever divergent, even if we
+get an authoritative log/info with an older info.last_epoch_started,
+we can leave our info.last_epoch_started alone since no writes could
+have committed in any intervening interval (see PG::proc_master_log).
+
+info.history.last_epoch_started records a lower bound on the most
+recent interval in which the pg as a whole went active and accepted
+writes. On a particular osd, it is also an upper bound on the
+activation epoch of intervals in which writes in the local pg log
+occurred (we update it before accepting writes). Because all
+committed writes are committed by all acting set osds, any
+non-divergent writes ensure that history.last_epoch_started was
+recorded by all acting set members in the interval. Once peering has
+queried one osd from each interval back to some seen
+history.last_epoch_started, it follows that no interval after the max
+history.last_epoch_started can have reported writes as committed
+(since we record it before recording client writes in an interval).
+Thus, the minimum last_update across all infos with
+info.last_epoch_started >= MAX(history.last_epoch_started) must be an
+upper bound on writes reported as committed to the client.
+
+We update info.last_epoch_started with the initial activation message,
+but we only update history.last_epoch_started after the new
+info.last_epoch_started is persisted (possibly along with the first
+write). This ensures that we do not require an osd with the most
+recent info.last_epoch_started until all acting set osds have recorded
+it. In find_best_info, we do include info.last_epoch_started values
+when calculating the max_last_epoch_started_found because we want to
+avoid designating a log entry divergent which in a prior interval
+would have been non-divergent. In activate(), we use the peer's
+last_epoch_started value as a bound on how far back divergent log
+entries can be found.
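
The bound stated in the second paragraph of the new file (the minimum last_update across all infos with info.last_epoch_started >= MAX(history.last_epoch_started)) can be illustrated with a minimal Python sketch. This is a simplified model, not Ceph code: `PGInfo`, `committed_upper_bound`, and the sample values are hypothetical, and last_update is reduced to a single integer, whereas in Ceph it is an (epoch, version) pair.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class PGInfo:
    """Hypothetical, simplified stand-in for one OSD's info for a PG."""
    osd: int
    last_update: int                 # assumption: a plain integer version
    last_epoch_started: int          # models info.last_epoch_started
    history_last_epoch_started: int  # models info.history.last_epoch_started


def committed_upper_bound(infos: Iterable[PGInfo]) -> Tuple[int, Optional[int]]:
    """Return (max history.last_epoch_started, upper bound on committed writes).

    The bound is the minimum last_update among infos whose
    info.last_epoch_started is at least the maximum
    history.last_epoch_started seen across all infos. It is None if no
    info qualifies.
    """
    infos = list(infos)
    if not infos:
        raise ValueError("need at least one info")

    # MAX(history.last_epoch_started) across every queried info.
    max_history_les = max(i.history_last_epoch_started for i in infos)

    # Only infos activated in that interval or later take part in the bound.
    candidates = [i for i in infos if i.last_epoch_started >= max_history_les]
    if not candidates:
        return max_history_les, None

    return max_history_les, min(i.last_update for i in candidates)


# Made-up sample: osd 2 was last activated in an older interval (epoch 20),
# so its last_update does not constrain the bound.
infos = [
    PGInfo(osd=0, last_update=120, last_epoch_started=30, history_last_epoch_started=30),
    PGInfo(osd=1, last_update=115, last_epoch_started=30, history_last_epoch_started=30),
    PGInfo(osd=2, last_update=90, last_epoch_started=20, history_last_epoch_started=20),
]

result = committed_upper_bound(infos)  # (30, 115) for the sample values above
```

In this sample, osd 2 holds the smallest last_update but is excluded because its info.last_epoch_started (20) is below the maximum history.last_epoch_started (30). The bound therefore comes from osd 1.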