doc/dev: improve EC glossary

author Zac Dover <zac.dover@gmail.com>

Mon, 31 Oct 2022 03:17:45 +0000 (13:17 +1000)

committer Zac Dover <zac.dover@gmail.com>

Mon, 31 Oct 2022 03:32:24 +0000 (13:32 +1000)
diff --git a/doc/dev/osd_internals/erasure_coding.rst b/doc/dev/osd_internals/erasure_coding.rst
index 7263cc35b67152194dcde112d1143f3467d1b20c..68eb433dc002dfb9df5258263b6610ca6d6a5b49 100644
--- a/doc/dev/osd_internals/erasure_coding.rst
+++ b/doc/dev/osd_internals/erasure_coding.rst
@@ -6,47 +6,50 @@ Glossary
  --------
  
  *chunk* 
-   when the encoding function is called, it returns chunks of the same
-   size. Data chunks which can be concatenated to reconstruct the original
-   object and coding chunks which can be used to rebuild a lost chunk.
+   when the encoding function is called, it returns chunks of the same size.
+   There are two kinds of chunks: (1) data chunks, which can be concatenated to
+   reconstruct the original object, and (2) coding chunks, which can be used to
+   rebuild a lost chunk.
  
  *chunk rank*
-   the index of a chunk when returned by the encoding function. The
-   rank of the first chunk is 0, the rank of the second chunk is 1
-   etc.
+   the index of a chunk, as determined by the encoding function. The
+   rank of the first chunk is 0, the rank of the second chunk is 1,
+   and so on.
  
  *stripe* 
-   when an object is too large to be encoded with a single call,
-   each set of chunks created by a call to the encoding function is
-   called a stripe.
+   if an object is so large that encoding it requires more than one call to the
+   encoding function, each of these calls will create a set of chunks called a
+   *stripe*.
  
-*shard|strip*
+*shard* (also called *strip*)
     an ordered sequence of chunks of the same rank from the same
-   object.  For a given placement group, each OSD contains shards of
+   object. For a given placement group, each OSD contains shards of
     the same rank. When dealing with objects that are encoded with a
-   single operation, *chunk* is sometime used instead of *shard*
+   single operation, *chunk* is sometimes used instead of *shard*
     because the shard is made of a single chunk. The *chunks* in a
     *shard* are ordered according to the rank of the stripe they belong
     to.
  
  *K*
-   the number of data *chunks*, i.e. the number of *chunks* in which the
-   original object is divided. For instance if *K* = 2 a 10KB object
-   will be divided into *K* objects of 5KB each.
+   the number of "data *chunks*" into which an object is divided. For example, 
+   if *K* = 2, then a 10KB object is divided into two objects of 5KB each.
  
  *M* 
-   the number of coding *chunks*, i.e. the number of additional *chunks*
-   computed by the encoding functions. If there are 2 coding *chunks*, 
-   it means 2 OSDs can be out without losing data.
+   the number of coding *chunks* (the number of chunks in addition to the "data
+   chunks") computed by the encoding functions. *M* is equal to the number of
+   OSDs that can be lost from the cluster without the cluster suffering data
+   loss. For example, if there are two coding *chunks*, then two OSDs can be
+   down without data loss.
  
  *N*
-   the number of data *chunks* plus the number of coding *chunks*, 
-   i.e. *K+M*.
+   the number of data *chunks* plus the number of coding *chunks*. *K* + *M*.
  
  *rate*
-   the proportion of the *chunks* that contains useful information, i.e. *K/N*.
-   For instance, for *K* = 9 and *M* = 3 (i.e. *K+M* = *N* = 12) the rate is 
-   *K* = 9 / *N* = 12 = 0.75, i.e. 75% of the chunks contain useful information.
+   the proportion of the *chunks* containing useful information: that is, *K*
+   divided by *N*. For example, suppose that *K* = 9 and *M* = 3. This would
+   mean that *N* = 12 (because *K* + *M* = 9 + 3). Therefore, the rate (*K* /
+   *N*) is 9 / 12 = 0.75. In other words, 75% of the chunks contain useful
+   information.
  
  The definitions are illustrated as follows (PG stands for placement group):
  ::
@@ -71,8 +74,8 @@ The definitions are illustrated as follows (PG stands for placement group):
         |         ...             | |         ...             |
         +-------------------------+ +-------------------------+
  
-Table of content
-----------------
+Table of contents
+-----------------
  
  .. toctree::
     :maxdepth: 1
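
The arithmetic in the revised glossary entries for *K*, *M*, *N*, and *rate* can be checked with a minimal sketch. This is an illustration only, not Ceph code: the function name `ec_parameters` and its arguments are hypothetical, and rounding the chunk size up (so that every data chunk has the same size) is an assumption made here for objects that do not divide evenly by *K*.

```python
def ec_parameters(k, m, object_size):
    """Return (n, rate, chunk_size) for k data chunks and m coding chunks.

    Hypothetical helper for illustration; not part of any Ceph API.
    """
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m must be non-negative")

    n = k + m       # N: data chunks plus coding chunks
    rate = k / n    # proportion of chunks that hold useful information

    # Assumption: chunks are equally sized, so round up when the object
    # size is not an exact multiple of k.
    chunk_size = -(-object_size // k)
    return n, rate, chunk_size


if __name__ == "__main__":
    # Glossary example for K: a 10KB object with K = 2 gives 5KB data chunks.
    print(ec_parameters(2, 2, 10 * 1024))
    # Glossary example for rate: K = 9 and M = 3 give N = 12 and rate 0.75.
    print(ec_parameters(9, 3, 9 * 1024))
```

With *K* = 9 and *M* = 3 the sketch reproduces the glossary's figures: *N* = 12 and a rate of 0.75, with up to three OSDs able to be down without data loss.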