From 3a831292835c57318c613284fdf4af3da9622c04 Mon Sep 17 00:00:00 2001 From: Loic Dachary Date: Tue, 20 Aug 2013 16:17:10 +0200 Subject: [PATCH] erasure code : plugin, interface and glossary documentation updates * replace the erasure code plugin abstract interface with a doxygen link that will be populated when the header shows in master * update the plugin documentation to reflect the current draft implementation * fix broken link to PGBackend-h * add a glossary to define chunk, stripe, shard and strip with a drawing http://tracker.ceph.com/issues/4929 refs #4929 Signed-off-by: Loic Dachary --- doc/dev/osd_internals/erasure_coding.rst | 39 ++++++ .../erasure_coding/developer_notes.rst | 111 ++++++------------ .../erasure_coding/pgbackend.rst | 2 +- 3 files changed, 77 insertions(+), 75 deletions(-) diff --git a/doc/dev/osd_internals/erasure_coding.rst b/doc/dev/osd_internals/erasure_coding.rst index deb91aca9db2..bfc425251a84 100644 --- a/doc/dev/osd_internals/erasure_coding.rst +++ b/doc/dev/osd_internals/erasure_coding.rst @@ -10,6 +10,45 @@ architectural changes `_, up to the point where it becomes a reference of the erasure coding implementation itself. +Glossary +-------- + +*chunk* + when the encoding function is called, it returns chunks of the + same size. + +*stripe* + when an object is too large to be encoded with a single call, + each set of chunks created by a call to the encoding function is + called a stripe. + +*shard|strip* + the file that holds all chunks of the same rank for a given object. + +Example: +:: + OSD 40 OSD 33 + +-------------------------+ +-------------------------+ + | shard 0 - PG 10 | | shard 1 - PG 10 | + |+------ object O -------+| |+------ object O -------+| + ||+---------------------+|| ||+---------------------+|| + stripe||| chunk 0 ||| ||| chunk 1 ||| ...
+ 0 ||| [0,+N) ||| ||| [0,+N) ||| + ||+---------------------+|| ||+---------------------+|| + ||+---------------------+|| ||+---------------------+|| + stripe||| chunk 0 ||| ||| chunk 1 ||| ... + 1 ||| [N,+N) ||| ||| [N,+N) ||| + ||+---------------------+|| ||+---------------------+|| + ||+---------------------+|| ||+---------------------+|| + stripe||| chunk 0 [N*2,+len) ||| ||| chunk 1 [N*2,+len) ||| ... + 2 ||+---------------------+|| ||+---------------------+|| + |+-----------------------+| |+-----------------------+| + | ... | | ... | + +-------------------------+ +-------------------------+ + +Table of contents +----------------- + .. toctree:: :maxdepth: 1 diff --git a/doc/dev/osd_internals/erasure_coding/developer_notes.rst b/doc/dev/osd_internals/erasure_coding/developer_notes.rst index 40616ae271c3..496a4a99f760 100644 --- a/doc/dev/osd_internals/erasure_coding/developer_notes.rst +++ b/doc/dev/osd_internals/erasure_coding/developer_notes.rst @@ -9,17 +9,6 @@ An erasure coded pool only supports full writes, appends and read. It does not support snapshots or clone. An ErasureCodedPGBackend is derived from PGBackend. - -Glossary --------- - -* Stripe - -* Data chunk and parity chunk - -* Shard - - Reading and writing encoded chunks from and to OSDs --------------------------------------------------- An erasure coded pool stores each object as M+K chunks. It is divided @@ -352,7 +341,7 @@ If *OSD 1* goes down while *S2D2* is still in flight, the payload is partially a The log entry *1,2* found on *OSD 3* is divergent from the new authoritative log provided by *OSD 4* : it is discarded and the file containing the *S2P1* chunk is truncated to the nearest multiple of the stripe size. -Erasure code library +`Erasure code library `_ -------------------- Using `Reed-Solomon `_, @@ -385,83 +374,57 @@ the encoding functions: smaller buffers will mean more calls and more overhead.
Although Reed-Solomon is provided as a default, Ceph uses it via an -abastract API designed to allow each pool to chose the plugin that +abstract API designed to allow each pool to choose the plugin that implements it. :: - ceph osd pool set-erasure-code plugin-dir - ceph osd pool set-erasure-code plugin + ceph osd pool create \ + erasure-code-directory= \ + erasure-code-plugin= The ** is dynamically loaded from ** (defaults to -*/usr/lib/ceph/erasure-code-plugins* ) and expected to implement the -*create_erasure_code_context* function - -* erasure_coding_t \*create_erasure_code_context(g_conf) +*/usr/lib/ceph/erasure-code* ) and expected to implement the +*void __erasure_code_init(char *plugin_name)* function +which is responsible for registering an object derived from +*ErasureCodePlugin* in the registry singleton : +:: + registry.plugins[plugin_name] = new ErasureCodePluginExample(); - return an object configured to encode and decode according to a - given algorithm and a given set of parameters as specified in - g_conf. Parameters must be prefixed with erasure-code to avoid name - collisions - :: - ceph osd pool set-erasure-code m 10 - ceph osd pool set-erasure-code k 3 - ceph osd pool set-erasure-code algorithm Reed-Solomon +The *ErasureCodePlugin* derived object must provide a factory method +from which the concrete implementation of the *ErasureCodeInterface* +object can be generated: +:: + virtual int factory(ErasureCodeInterfaceRef *erasure_code, + const map &parameters) { + *erasure_code = ErasureCodeInterfaceRef(new ErasureCodeExample(parameters)); + return 0; + } + +The *parameters* is the list of *key=value* pairs that were set when the pool +was created.
Each *key* must be prefixed with erasure-code to avoid name collisions +:: + ceph osd pool create \ + erasure-code-directory= \ # mandatory + erasure-code-plugin=jerasure \ # mandatory + erasure-code-m=10 \ # optional and plugin dependent + erasure-code-k=3 \ # optional and plugin dependent + erasure-code-algorithm=Reed-Solomon \ # optional and plugin dependent Erasure code library abstract API --------------------------------- -The following are methods of the abstract class erasure_coding_t. - -* set minimum_to_decode(const set &want_to_read, const set &available_chunks); - - returns the smallest subset of *available_chunks* that needs to be retrieved in order - to successfully decode *want_to_read* chunks. - -* set minimum_to_decode_with_cost(const set &want_to_read, const map &available) - - returns the minimum cost set required to read the specified - chunks given a mapping of available chunks to costs. The costs might - allow to consider the difference between reading local chunks vs - remote chunks. - -* map encode(const set &want_to_encode, const buffer &in) - - encode the content of *in* and return a map associating the chunk - number with its encoded content. The map only contains the chunks - contained in the *want_to_encode* set. For instance, in the simplest - case M=2,K=1 for a buffer containing AB, calling - :: - encode([1,2,3], 'AB') - => { 1 => 'A', 2 => 'B', 3 => 'Z' } - - If only the parity chunk is of interest, calling - :: - encode([3], 'AB') - => { 3 => 'Z' } - - -* map decode(const set &want_to_read, const map &chunks) - - decode *chunks* to read the content of the *want_to_read* chunks and - return a map associating the chunk number with its decoded - content.
For instance, in the simplest case M=2,K=1 for an - encoded payload of data A and B with parity Z, calling - :: - decode([1,2], { 1 => 'A', 2 => 'B', 3 => 'Z' }) - => { 1 => 'A', 2 => 'B' } - - If however, the chunk B is to be read but is missing it will be: - :: - decode([2], { 1 => 'A', 3 => 'Z' }) - => { 2 => 'B' } + .. doxygenfile:: ErasureCodeInterface.h Erasure code jerasure plugin ---------------------------- The parameters interpreted by the jerasure plugin are: :: - ceph osd pool set-erasure-code m (defaults 10) - ceph osd pool set-erasure-code k (default 3) - ceph osd pool set-erasure-code algorithm (default Reed-Solomon) + ceph osd pool create \ + erasure-code-directory= \ # plugin directory absolute path + erasure-code-plugin=jerasure \ # plugin name (only jerasure) + erasure-code-m= \ # data chunks (default 10) + erasure-code-k= \ # parity chunks (default 3) + erasure-code-algorithm=Reed-Solomon \ # algorithm (only Reed-Solomon) Scrubbing diff --git a/doc/dev/osd_internals/erasure_coding/pgbackend.rst b/doc/dev/osd_internals/erasure_coding/pgbackend.rst index 662351e9d779..9e3fcb2bf86c 100644 --- a/doc/dev/osd_internals/erasure_coding/pgbackend.rst +++ b/doc/dev/osd_internals/erasure_coding/pgbackend.rst @@ -2,7 +2,7 @@ PG Backend Proposal =================== -See also `PGBackend.h `_ +See also `PGBackend.h <../PGBackend-h>`_ Motivation ---------- -- 2.47.3
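
Reviewer note on the plugin registration described in developer_notes.rst: the registry and factory steps can be sketched as a small standalone C++ program. This is a minimal illustration under stated assumptions, not the Ceph implementation. The class names follow the patch, but their bodies are hypothetical: the ``parameter`` accessor, the ``ErasureCodePluginRegistry`` class, its ``instance`` and ``factory`` methods, and the ``example_erasure_code_init`` entry point (standing in for ``__erasure_code_init``) are invented for illustration, and dynamic loading of the shared library is left out entirely.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <memory>
#include <string>

typedef std::map<std::string, std::string> ParameterMap;

// Hypothetical stand-in for ErasureCodeInterface.h: the real interface
// declares encode/decode methods, omitted here to keep the sketch short.
class ErasureCodeInterface {
public:
  virtual ~ErasureCodeInterface() {}
  // Illustration-only accessor: returns the value stored for key, or fallback.
  virtual std::string parameter(const std::string &key,
                                const std::string &fallback) const = 0;
};
typedef std::shared_ptr<ErasureCodeInterface> ErasureCodeInterfaceRef;

// Concrete implementation that only remembers its key=value parameters.
class ErasureCodeExample : public ErasureCodeInterface {
  ParameterMap parameters;
public:
  explicit ErasureCodeExample(const ParameterMap &p) : parameters(p) {}
  std::string parameter(const std::string &key,
                        const std::string &fallback) const override {
    ParameterMap::const_iterator it = parameters.find(key);
    return it == parameters.end() ? fallback : it->second;
  }
};

// A plugin is a factory for ErasureCodeInterface objects.
class ErasureCodePlugin {
public:
  virtual ~ErasureCodePlugin() {}
  virtual int factory(ErasureCodeInterfaceRef *erasure_code,
                      const ParameterMap &parameters) = 0;
};

class ErasureCodePluginExample : public ErasureCodePlugin {
public:
  int factory(ErasureCodeInterfaceRef *erasure_code,
              const ParameterMap &parameters) override {
    assert(erasure_code != nullptr);
    *erasure_code = ErasureCodeInterfaceRef(new ErasureCodeExample(parameters));
    return 0;
  }
};

// Assumed shape of the registry singleton: a map from plugin name to plugin.
class ErasureCodePluginRegistry {
public:
  std::map<std::string, std::unique_ptr<ErasureCodePlugin> > plugins;

  static ErasureCodePluginRegistry &instance() {
    static ErasureCodePluginRegistry registry;
    return registry;
  }

  // Look up a registered plugin and ask it to build an implementation.
  int factory(const std::string &plugin_name, const ParameterMap &parameters,
              ErasureCodeInterfaceRef *erasure_code) {
    auto it = plugins.find(plugin_name);
    if (it == plugins.end())
      return -ENOENT;  // no plugin registered under that name
    return it->second->factory(erasure_code, parameters);
  }
};

// Hypothetical equivalent of the plugin entry point: registers the plugin.
void example_erasure_code_init(const char *plugin_name) {
  ErasureCodePluginRegistry::instance().plugins[plugin_name] =
      std::unique_ptr<ErasureCodePlugin>(new ErasureCodePluginExample());
}
```

A caller would first run ``example_erasure_code_init("example")``, then ask the registry for an implementation with the pool's *key=value* pairs (for instance ``erasure-code-k=3``). Holding the plugins in ``std::unique_ptr`` is a choice made for this sketch so the registry owns them; the patch itself shows a raw ``new``.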