six storage devices. For a detailed discussion of CRUSH rules, see **Section 3.2**
of `CRUSH - Controlled, Scalable, Decentralized Placement of Replicated Data`_.
-A rule takes the following form::
+A normal CRUSH rule takes the following form::
rule <rulename> {
step emit
}
+CRUSH MSR (Multi-Step Retry) rules are a distinct type of CRUSH rule which
+supports retrying steps and provides better support for configurations that
+require multiple OSDs within each failure domain. MSR rules take the following
+form::
+
+ rule <rulename> {
+
+ id [a unique integer ID]
+ type [msr_indep|msr_firstn]
+ step take <bucket-name> [class <device-class>]
+ step choosemsr <N> type <bucket-type>
+ step emit
+ }
``id``
:Description: A unique integer that identifies the rule.
``type``
:Description: Denotes the type of replication strategy to be enforced by the
- rule.
+ rule. ``msr_firstn`` and ``msr_indep`` use a distinct descent
+ algorithm that supports retrying steps within the rule and
+ therefore supports multiple OSDs per failure domain.
:Purpose: A component of the rule mask.
:Type: String
:Required: Yes
:Default: ``replicated``
- :Valid Values: ``replicated`` or ``erasure``
+ :Valid Values: ``replicated``, ``erasure``, ``msr_firstn``, ``msr_indep``
``step take <bucket-name> [class <device-class>]``
final CRUSH mapping transformation is therefore 1, 2, 3, 4, 5
→ 1, 2, 6, 4, 5.
+``step choosemsr {num} type {bucket-type}``
+ :Description: Selects ``{num}`` buckets of type ``{bucket-type}``. Rules of type
+ ``msr_firstn`` or ``msr_indep`` must use ``choosemsr`` rather than ``choose`` or ``chooseleaf``.
+
+ - If ``{num} == 0``, choose ``pool-num-replicas`` buckets (as many buckets as are available).
+ - If ``pool-num-replicas > {num} > 0``, choose that many buckets.
+ :Purpose: Choose step required for msr_firstn and msr_indep rules.
+ :Prerequisite: Follows ``step take`` and precedes ``step emit``
+ :Example: ``step choosemsr 3 type host``
+
.. _crush-reclassify:
Migrating from a legacy SSD rule to device classes
[default: ``default``].
* **crush-failure-domain**: the CRUSH bucket type used in the distribution of
erasure-coded shards [default: ``host``].
+ * **crush-osds-per-failure-domain**: the maximum number of OSDs to place in
+ each failure domain [default: ``1``]. A value greater than one causes a
+ CRUSH MSR rule to be created (see below). Must be specified if
+ ``crush-num-failure-domains`` is specified.
+ * **crush-num-failure-domains**: the number of failure domains to map. Must
+ be specified if ``crush-osds-per-failure-domain`` is specified. Results in
+ a CRUSH MSR rule being created.
* **crush-device-class**: the device class on which to place data [default:
none, which means that all devices are used].
* **k** and **m** (and, for the ``lrc`` plugin, **l**): these determine the
argument is omitted, then Ceph will create the CRUSH rule automatically.
+CRUSH MSR Rules
+---------------
+
+Creating an erasure-code profile with a ``crush-osds-per-failure-domain``
+value greater than one causes a CRUSH MSR rule to be created instead of
+a normal CRUSH rule. Normal CRUSH rules cannot retry prior steps when an
+``out`` OSD is encountered; they rely on ``chooseleaf`` steps to permit
+moving OSDs to new hosts. However, ``chooseleaf`` rules do not support
+more than a single OSD per failure domain. MSR rules, new in Squid,
+support multiple OSDs per failure domain by retrying all prior steps
+when an ``out`` OSD is encountered. Using MSR rules requires that OSDs
+and clients support the ``CRUSH_MSR`` feature bit (Squid or newer).
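+
+As an illustration only, a profile that places two OSDs in each of four
+``host`` failure domains might be created as shown here. The profile name
+``msr-example`` and the ``k`` and ``m`` values are hypothetical and were
+chosen so that ``k + m`` equals the eight OSDs that are mapped::
+
+   ceph osd erasure-code-profile set msr-example k=6 m=2 crush-failure-domain=host crush-osds-per-failure-domain=2 crush-num-failure-domains=4
+
+A rule created from such a profile would be expected to resemble the
+following sketch, which uses the MSR rule form described above. The rule
+name and ID are hypothetical, the use of two consecutive ``choosemsr`` steps
+(hosts first, then OSDs within each host) is an assumption about the
+generated output, and the exact rule that Ceph emits may differ::
+
+   rule msr-example-rule {
+       id 1
+       type msr_indep
+       step take default
+       step choosemsr 4 type host
+       step choosemsr 2 type osd
+       step emit
+   }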
+
+
Deleting rules
--------------