From 6d6de42b6f8b38460a38ca24214a04a6e19698e4 Mon Sep 17 00:00:00 2001
From: Zac Dover
Date: Wed, 2 Nov 2022 13:45:48 +1000
Subject: [PATCH] doc/dev: refine erasure_coding.rst

Improve the readability and clarity of erasure_coding.rst.

Co-author: Cole Mitchell
Signed-off-by: Zac Dover
---
 doc/dev/osd_internals/erasure_coding.rst | 58 ++++++++++++------------
 1 file changed, 30 insertions(+), 28 deletions(-)

diff --git a/doc/dev/osd_internals/erasure_coding.rst b/doc/dev/osd_internals/erasure_coding.rst
index c1764e8a0bb..40064961bba 100644
--- a/doc/dev/osd_internals/erasure_coding.rst
+++ b/doc/dev/osd_internals/erasure_coding.rst
@@ -6,50 +6,52 @@ Glossary
 --------
 
 *chunk*
-   when the encoding function is called, it returns chunks of the same size.
-   There are two kinds of chunks: (1) data chunks, which can be concatenated to
-   reconstruct the original object, and (2) coding chunks, which can be used to
-   rebuild a lost chunk.
+   When the encoding function is called, it returns chunks of the same
+   size as each other. There are two kinds of chunks: (1) *data
+   chunks*, which can be concatenated to reconstruct the original
+   object, and (2) *coding chunks*, which can be used to rebuild a
+   lost chunk.
 
 *chunk rank*
-   the index of a chunk, as determined by the encoding function. The
+   The index of a chunk, as determined by the encoding function. The
    rank of the first chunk is 0, the rank of the second chunk is 1,
    and so on.
 
 *K*
-   the number of "data *chunks*" into which an object is divided. For example,
-   if *K* = 2, then a 10KB object is divided into two objects of 5KB each.
+   The number of data chunks into which an object is divided. For
+   example, if *K* = 2, then a 10KB object is divided into two objects
+   of 5KB each.
 
 *M*
-   the number of coding *chunks* (the number of chunks in addition to the "data
-   chunks") computed by the encoding functions. *M* is equal to the number of
-   OSDs that can be lost from the cluster without the cluster suffering data
-   loss. For example, if there are two coding *chunks*, then two OSDs can be
-   down without data loss.
+   The number of coding chunks computed by the encoding function. *M*
+   is equal to the number of OSDs that can be missing from the cluster
+   without the cluster suffering data loss. For example, if there are
+   two coding chunks, then two OSDs can be missing without data loss.
 
 *N*
-   the number of data *chunks* plus the number of coding *chunks*. *K* + *M*.
+   The number of data chunks plus the number of coding chunks: that
+   is, *K* + *M*.
 
 *rate*
-   the proportion of the *chunks* containing useful information: that is, *K*
-   divided by *N*. For example, suppose that *K* = 9 and *M* = 3. This would
-   mean that *N* = 12 (because *K* + *M* = 9 + 3). Therefore, the rate (*K* /
-   *N*) is 9 / 12 = 0.75. In other words, 75% of the chunks contain useful
-   information.
+   The proportion of the total chunks containing useful information:
+   that is, *K* divided by *N*. For example, suppose that *K* = 9 and
+   *M* = 3. This would mean that *N* = 12 (because *K* + *M* = 9 + 3).
+   Therefore, the *rate* (*K* / *N*) would be 9 / 12 = 0.75. In other
+   words, 75% of the chunks would contain useful information.
 
 *shard* (also called *strip*)
-   an ordered sequence of chunks of the same rank from the same
-   object. For a given placement group, each OSD contains shards of
-   the same rank. When dealing with objects that are encoded with a
-   single operation, *chunk* is sometimes used instead of *shard*
-   because the shard is made of a single chunk. The *chunks* in a
-   *shard* are ordered according to the rank of the stripe they belong
-   to.
+   An ordered sequence of chunks of the same rank from the same object. For a
+   given placement group, each OSD contains shards of the same rank. In the
+   special case in which an object is encoded with only one call to the
+   encoding function, the term *chunk* may be used instead of *shard* because
+   the shard is made of a single chunk. The chunks in a shard are ordered
+   according to the rank of the stripe (see *stripe* below) they belong to.
+
 
 *stripe*
-   if an object is so large that encoding it requires more than one call to the
-   encoding function, each of these calls will create a set of chunks called a
-   *stripe*.
+   If an object is so large that encoding it requires more than one
+   call to the encoding function, each of these calls creates a set of
+   chunks called a *stripe*.
 
 The definitions are illustrated as follows (PG stands for placement group):
 ::
-- 
2.39.5
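
The arithmetic behind the *K*, *M*, *N*, and *rate* entries in the patch above can be checked with a short Python sketch. This is an illustration only, not Ceph code: the function names `ec_layout` and `split_data_chunks` are hypothetical, and zero-padding the object so that all chunks are the same size is an assumption made here for simplicity. No coding chunks are computed; only the data-chunk split and the parameter arithmetic are shown.

```python
def ec_layout(k: int, m: int) -> tuple[int, float]:
    """Return (N, rate) for K data chunks and M coding chunks.

    Hypothetical helper: N = K + M and rate = K / N, as in the glossary.
    """
    if k <= 0 or m < 0:
        raise ValueError("K must be positive and M must be non-negative")
    n = k + m
    return n, k / n


def split_data_chunks(obj: bytes, k: int) -> list[bytes]:
    """Split obj into K data chunks of equal size.

    Assumption (not taken from the patch): the object is zero-padded
    up to a multiple of K so that every chunk has the same length.
    """
    if k <= 0:
        raise ValueError("K must be positive")
    chunk_size = -(-len(obj) // k)  # ceiling division
    padded = obj.ljust(chunk_size * k, b"\0")
    # The list index of each chunk corresponds to its chunk rank (0, 1, ...).
    return [padded[i * chunk_size:(i + 1) * chunk_size] for i in range(k)]


# Glossary example for *rate*: K = 9, M = 3.
n, rate = ec_layout(9, 3)

# Glossary example for *K*: a 10KB object with K = 2.
chunks = split_data_chunks(bytes(10 * 1024), 2)
```

With these inputs, `n` is 12 and `rate` is 0.75, and each of the two data chunks is 5120 bytes long. Concatenating the data chunks in rank order gives back the original object, plus any padding the sketch added.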