From 472f495e40318a6849512bd90736e730362fd747 Mon Sep 17 00:00:00 2001 From: Loic Dachary Date: Sun, 8 Dec 2013 22:03:33 +0100 Subject: [PATCH] crush: document the --test mode of operations Signed-off-by: Loic Dachary --- doc/man/8/crushtool.rst | 123 ++++++++++++++++++- man/crushtool.8 | 256 +++++++++++++++++++++++++++++++++++++++- 2 files changed, 367 insertions(+), 12 deletions(-) diff --git a/doc/man/8/crushtool.rst b/doc/man/8/crushtool.rst index 97303cc4bfacf..ea4691fec12d9 100644 --- a/doc/man/8/crushtool.rst +++ b/doc/man/8/crushtool.rst @@ -8,7 +8,7 @@ Synopsis ======== | **crushtool** ( -d *map* | -c *map.txt* | --build --num_osds *numosds* - *layer1* *...* ) [ -o *outfile* ] + *layer1* *...* | --test ) [ -o *outfile* ] Description @@ -47,14 +47,125 @@ The tool has four modes of operation. names, see crushtool --help for more information. -Options -======= +Running tests +============= + +The test mode will use the input crush map ( as specified with **-i +map** ) and perform a dry run of CRUSH mapping or random placement ( +if **--simulate** is set ). On completion, two kinds of reports can be +created. The **--show-...** options output human-readable information +on stderr. The **--output-csv** option creates CSV files that are +documented by the **--help-output** option. + +.. option:: --show-statistics + + for each rule display the mapping of each object. For instance:: + + CRUSH rule 1 x 24 [11,6] + + shows that object **24** is mapped to devices **[11,6]** by rule + **1**. At the end of the mapping details, a summary of the + distribution is displayed. For instance:: + + rule 1 (metadata) num_rep 5 result size == 5: 1024/1024 + + shows that rule **1** which is named **metadata** successfully + mapped **1024** objects to **result size == 5** devices when trying + to map them to **num_rep 5** replicas. 
When it fails to provide the + required mapping, presumably because the number of **tries** must + be increased, a breakdown of the failures is displayed. For instance:: + + rule 1 (metadata) num_rep 10 result size == 8: 4/1024 + rule 1 (metadata) num_rep 10 result size == 9: 93/1024 + rule 1 (metadata) num_rep 10 result size == 10: 927/1024 + + shows that although **num_rep 10** replicas were required, **4** + out of **1024** objects ( **4/1024** ) were mapped to **result size + == 8** devices only. + +.. option:: --show-bad-mappings + + display which objects failed to be mapped to the required number of + devices. For instance:: + + bad mapping rule 1 x 781 num_rep 7 result [8,10,2,11,6,9] + + shows that when rule **1** was required to map **7** devices, it + could only map six: **[8,10,2,11,6,9]**. + +.. option:: --show-utilization + + display the expected and actual utilisation for each device, for + each number of replicas. For instance:: + + device 0: stored : 951 expected : 853.333 + device 1: stored : 963 expected : 853.333 + ... + + shows that device **0** stored **951** objects and was expected to store **853**. + Implies **--show-statistics**. + +.. option:: --show-utilization-all + + displays the same as **--show-utilization** but does not suppress + output when the weight of a device is zero. + Implies **--show-statistics**. + +.. option:: --show-choose-tries + + display how many attempts were needed to find a device mapping. + For instance:: + + 0: 95224 + 1: 3745 + 2: 2225 + .. + + shows that **95224** mappings succeeded without retries, **3745** + mappings succeeded with one retry, etc. There are as many rows + as the value of the **--set-choose-total-tries** option. + +.. option:: --output-csv + + create CSV files (in the current directory) containing information + documented by **--help-output**. The files are named after the rule + used when collecting the statistics. 
For instance, if the rule + metadata is used, the CSV files will be:: + + metadata-absolute_weights.csv + metadata-device_utilization.csv + ... + + The first line of the file briefly explains the column layout. For + instance:: + + metadata-absolute_weights.csv + Device ID, Absolute Weight + 0,1 + ... + +.. option:: --output-name NAME + + prepend **NAME** to the file names generated when **--output-csv** + is specified. For instance **--output-name FOO** will create + files:: + + FOO-metadata-absolute_weights.csv + FOO-metadata-device_utilization.csv + ... + +The **--set-...** options can be used to modify the tunables of the +input crush map. The input crush map is modified in +memory. For example:: -.. option:: -o outfile + $ crushtool -i mymap --test --show-bad-mappings + bad mapping rule 1 x 781 num_rep 7 result [8,10,2,11,6,9] - will specify the output file. +could be fixed by increasing the **choose-total-tries** as follows:: - + $ crushtool -i mymap --test \ + --show-bad-mappings \ + --set-choose-total-tries 500 Building a map ============== diff --git a/man/crushtool.8 b/man/crushtool.8 index 90332016741f4..8653cb2992b4c 100644 --- a/man/crushtool.8 +++ b/man/crushtool.8 @@ -1,4 +1,6 @@ -.TH "CRUSHTOOL" "8" "November 18, 2013" "dev" "Ceph" +.\" Man page generated from reStructuredText. +. +.TH "CRUSHTOOL" "8" "December 09, 2013" "dev" "Ceph" .SH NAME crushtool \- CRUSH map manipulation tool . @@ -28,12 +30,37 @@ level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. -.\" Man page generated from reStructuredText. . +.nr rst2man-indent-level 0 +. +.de1 rstReportMargin +\\$1 \\n[an-margin] +level \\n[rst2man-indent-level] +level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] +- +\\n[rst2man-indent0] +\\n[rst2man-indent1] +\\n[rst2man-indent2] +.. +.de1 INDENT +.\" .rstReportMargin pre: +. RS \\$1 +. 
nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] +. nr rst2man-indent-level +1 +.\" .rstReportMargin post: +.. +.de UNINDENT +. RE +.\" indent \\n[an-margin] +.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] +.nr rst2man-indent-level -1 +.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] +.in \\n[rst2man-indent\\n[rst2man-indent-level]]u +.. .SH SYNOPSIS .nf \fBcrushtool\fP ( \-d \fImap\fP | \-c \fImap.txt\fP | \-\-build \-\-num_osds \fInumosds\fP -\fIlayer1\fP \fI...\fP ) [ \-o \fIoutfile\fP ] +\fIlayer1\fP \fI\&...\fP | \-\-test ) [ \-o \fIoutfile\fP ] .fi .sp .SH DESCRIPTION @@ -76,11 +103,216 @@ structure. See below for examples. will perform a dry run of a CRUSH mapping for a range of input object names, see crushtool \-\-help for more information. .UNINDENT -.SH OPTIONS +.SH RUNNING TESTS +.sp +The test mode will use the input crush map ( as specified with \fB\-i +map\fP ) and perform a dry run of CRUSH mapping or random placement ( +if \fB\-\-simulate\fP is set ). On completion, two kinds of reports can be +created. The \fB\-\-show\-...\fP options output human\-readable information +on stderr. The \fB\-\-output\-csv\fP option creates CSV files that are +documented by the \fB\-\-help\-output\fP option. .INDENT 0.0 .TP -.B \-o outfile -will specify the output file. +.B \-\-show\-statistics +for each rule display the mapping of each object. For instance: +.INDENT 7.0 +.INDENT 3.5 +.sp +.nf +.ft C +CRUSH rule 1 x 24 [11,6] +.ft P +.fi +.UNINDENT +.UNINDENT +.sp +shows that object \fB24\fP is mapped to devices \fB[11,6]\fP by rule +\fB1\fP\&. At the end of the mapping details, a summary of the +distribution is displayed. 
For instance: +.INDENT 7.0 +.INDENT 3.5 +.sp +.nf +.ft C +rule 1 (metadata) num_rep 5 result size == 5: 1024/1024 +.ft P +.fi +.UNINDENT +.UNINDENT +.sp +shows that rule \fB1\fP which is named \fBmetadata\fP successfully +mapped \fB1024\fP objects to \fBresult size == 5\fP devices when trying +to map them to \fBnum_rep 5\fP replicas. When it fails to provide the +required mapping, presumably because the number of \fBtries\fP must +be increased, a breakdown of the failures is displayed. For instance: +.INDENT 7.0 +.INDENT 3.5 +.sp +.nf +.ft C +rule 1 (metadata) num_rep 10 result size == 8: 4/1024 +rule 1 (metadata) num_rep 10 result size == 9: 93/1024 +rule 1 (metadata) num_rep 10 result size == 10: 927/1024 +.ft P +.fi +.UNINDENT +.UNINDENT +.sp +shows that although \fBnum_rep 10\fP replicas were required, \fB4\fP +out of \fB1024\fP objects ( \fB4/1024\fP ) were mapped to \fBresult size +== 8\fP devices only. +.UNINDENT +.INDENT 0.0 +.TP +.B \-\-show\-bad\-mappings +display which objects failed to be mapped to the required number of +devices. For instance: +.INDENT 7.0 +.INDENT 3.5 +.sp +.nf +.ft C +bad mapping rule 1 x 781 num_rep 7 result [8,10,2,11,6,9] +.ft P +.fi +.UNINDENT +.UNINDENT +.sp +shows that when rule \fB1\fP was required to map \fB7\fP devices, it +could only map six: \fB[8,10,2,11,6,9]\fP\&. +.UNINDENT +.INDENT 0.0 +.TP +.B \-\-show\-utilization +display the expected and actual utilisation for each device, for +each number of replicas. For instance: +.INDENT 7.0 +.INDENT 3.5 +.sp +.nf +.ft C +device 0: stored : 951 expected : 853.333 +device 1: stored : 963 expected : 853.333 +\&... +.ft P +.fi +.UNINDENT +.UNINDENT +.sp +shows that device \fB0\fP stored \fB951\fP objects and was expected to store \fB853\fP\&. +Implies \fB\-\-show\-statistics\fP\&. +.UNINDENT +.INDENT 0.0 +.TP +.B \-\-show\-utilization\-all +displays the same as \fB\-\-show\-utilization\fP but does not suppress +output when the weight of a device is zero. 
+Implies \fB\-\-show\-statistics\fP\&. +.UNINDENT +.INDENT 0.0 +.TP +.B \-\-show\-choose\-tries +display how many attempts were needed to find a device mapping. +For instance: +.INDENT 7.0 +.INDENT 3.5 +.sp +.nf +.ft C +0: 95224 +1: 3745 +2: 2225 +\&.. +.ft P +.fi +.UNINDENT +.UNINDENT +.sp +shows that \fB95224\fP mappings succeeded without retries, \fB3745\fP +mappings succeeded with one retry, etc. There are as many rows +as the value of the \fB\-\-set\-choose\-total\-tries\fP option. +.UNINDENT +.INDENT 0.0 +.TP +.B \-\-output\-csv +create CSV files (in the current directory) containing information +documented by \fB\-\-help\-output\fP\&. The files are named after the rule +used when collecting the statistics. For instance, if the rule +metadata is used, the CSV files will be: +.INDENT 7.0 +.INDENT 3.5 +.sp +.nf +.ft C +metadata\-absolute_weights.csv +metadata\-device_utilization.csv +\&... +.ft P +.fi +.UNINDENT +.UNINDENT +.sp +The first line of the file briefly explains the column layout. For +instance: +.INDENT 7.0 +.INDENT 3.5 +.sp +.nf +.ft C +metadata\-absolute_weights.csv +Device ID, Absolute Weight +0,1 +\&... +.ft P +.fi +.UNINDENT +.UNINDENT +.UNINDENT +.INDENT 0.0 +.TP +.B \-\-output\-name NAME +prepend \fBNAME\fP to the file names generated when \fB\-\-output\-csv\fP +is specified. For instance \fB\-\-output\-name FOO\fP will create +files: +.INDENT 7.0 +.INDENT 3.5 +.sp +.nf +.ft C +FOO\-metadata\-absolute_weights.csv +FOO\-metadata\-device_utilization.csv +\&... +.ft P +.fi +.UNINDENT +.UNINDENT +.UNINDENT +.sp +The \fB\-\-set\-...\fP options can be used to modify the tunables of the +input crush map, provided the \fB\-\-enable\-unsafe\-tunables\fP option is +also set to disable the safeguard. The input crush map is modified in +memory. 
For example: +.INDENT 0.0 +.INDENT 3.5 +.sp +.nf +.ft C +$ crushtool \-i mymap \-\-test \-\-show\-bad\-mappings +bad mapping rule 1 x 781 num_rep 7 result [8,10,2,11,6,9] +.ft P +.fi +.UNINDENT +.UNINDENT +.sp +could be fixed by increasing the \fBchoose\-total\-tries\fP as follows: +.INDENT 0.0 +.INDENT 3.5 +.INDENT 0.0 +.TP +.B $ crushtool \-i mymap \-\-test +\-\-show\-bad\-mappings \-\-enable\-unsafe\-tunables \-\-set\-choose\-total\-tries 500 +.UNINDENT +.UNINDENT .UNINDENT .SH BUILDING A MAP .sp @@ -90,12 +322,16 @@ CRUSH hierarchy. Each layer describes how the layer (or raw devices) preceding it should be grouped. .sp Each layer consists of: +.INDENT 0.0 +.INDENT 3.5 .sp .nf .ft C name ( uniform | list | tree | straw ) size .ft P .fi +.UNINDENT +.UNINDENT .sp The first element is the name for the elements in the layer (e.g. "rack"). Each element\(aqs name will be append a number to the @@ -115,14 +351,20 @@ leaving an extra 2U for a rack switch. .sp To reflect our hierarchy of devices, nodes, racks and rows, we would execute the following: +.INDENT 0.0 +.INDENT 3.5 .sp .nf .ft C crushtool \-o crushmap \-\-build \-\-num_osds 320 node straw 4 rack straw 20 row straw 2 .ft P .fi +.UNINDENT +.UNINDENT .sp To adjust the default (generic) mapping rules, we can run: +.INDENT 0.0 +.INDENT 3.5 .sp .nf .ft C @@ -136,6 +378,8 @@ vi map.txt crushtool \-c map.txt \-o crushmap .ft P .fi +.UNINDENT +.UNINDENT .SH AVAILABILITY .sp \fBcrushtool\fP is part of the Ceph distributed file system. Please -- 2.39.5
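Reviewer note, not part of the patch: the summary lines that the patch documents for --show-statistics are regular enough to be checked by a script. The sketch below is only an illustration under stated assumptions. It assumes the report text has already been captured from stderr into a string, and that the summary lines look exactly like the samples quoted in the patch. The helper name short_mappings and the regular expression are hypothetical and are not part of crushtool.

```python
import re

# Assumed line format, copied from the examples in the patch:
#   rule 1 (metadata) num_rep 10 result size == 8: 4/1024
SUMMARY = re.compile(
    r"rule (\d+) \((.+?)\) num_rep (\d+) result size == (\d+):\s*(\d+)/(\d+)"
)


def short_mappings(report):
    """Return (rule, name, num_rep, size, count, total) tuples for every
    summary line whose result size is smaller than num_rep."""
    found = []
    for line in report.splitlines():
        match = SUMMARY.search(line)
        if match is None:
            continue  # not a summary line, e.g. a per-object mapping line
        rule, name, num_rep, size, count, total = match.groups()
        if int(size) < int(num_rep):
            found.append(
                (int(rule), name, int(num_rep), int(size), int(count), int(total))
            )
    return found


# Sample text taken from the --show-statistics example in the patch.
sample = """\
rule 1 (metadata) num_rep 10 result size == 8: 4/1024
rule 1 (metadata) num_rep 10 result size == 9: 93/1024
rule 1 (metadata) num_rep 10 result size == 10: 927/1024
"""

for rule, name, num_rep, size, count, total in short_mappings(sample):
    print(f"rule {rule} ({name}): {count}/{total} objects got {size} of {num_rep} devices")
```

A non-empty result would suggest rerunning the test with a larger --set-choose-total-tries value, as the patch describes.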