partitions for storing objects.
Ceph Clients retrieve a `Cluster Map`_ from a Ceph Monitor, and write objects to
-pools. The pool's ``size`` or number of replicas, the CRUSH ruleset and the
+pools. The pool's ``size`` or number of replicas, the CRUSH rule and the
number of placement groups determine how Ceph will place the data.
.. ditaa::
| To
v
+--------+ +---------------+
- | Pool |---------->| CRUSH Ruleset |
+ | Pool |---------->| CRUSH Rule |
+--------+ Selects +---------------+
- Ownership/Access to Objects
- The Number of Placement Groups, and
-- The CRUSH Ruleset to Use.
+- The CRUSH Rule to Use.
See `Set Pool Values`_ for details.
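For a quick, hedged illustration of how these values are adjusted on an
existing pool (the pool name ``mypool`` and the values shown here are
assumptions, not recommendations)::

   ceph osd pool set mypool size 3
   ceph osd pool set mypool pg_num 128
   ceph osd pool set mypool crush_rule 1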
::
ceph fs flag set enable_multiple true --yes-i-really-mean-it
- ceph osd pool create recovery <pg-num> replicated <crush-ruleset-name>
+ ceph osd pool create recovery <pg-num> replicated <crush-rule-name>
ceph fs new recovery-fs recovery <data pool> --allow-dangerous-metadata-overlay
cephfs-data-scan init --force-init --filesystem recovery-fs --alternate-pool recovery
ceph fs reset recovery-fs --yes-i-really-mean-it
The output should resemble::
- pool 3 'hadoop1' rep size 1 min_size 1 crush_ruleset 0...
+ pool 3 'hadoop1' rep size 1 min_size 1 crush_rule 0...
where ``3`` is the pool id. Next we will use the pool id reference to register
the pool as a data pool for storing file system data. ::
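   # hypothetical follow-up: register pool id 3 as a data pool of an existing
   # file system (the file system name "cephfs" is an assumption)
   ceph fs add_data_pool cephfs 3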
replicated pool to the erasure-coded pool) if they have not been
accessed in a week.
-The erasure-coded pool crush ruleset targets hardware designed for
+The erasure-coded pool CRUSH rule targets hardware designed for
cold storage with high latency and slow access time. The replicated
-pool crush ruleset targets faster hardware to provide better response
+pool CRUSH rule targets faster hardware to provide better response
times.
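One hedged way to express this split, assuming the CRUSH map already contains
a ``cold-root`` bucket of slow drives and a ``fast-root`` bucket of fast
drives (both bucket names, the profile name and the rule name are assumptions)::

   $ ceph osd erasure-code-profile set cold-profile crush-root=cold-root
   $ ceph osd crush rule create-simple hot-rule fast-root host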
Cheap multidatacenter storage
datacenter contains the same amount of storage with no power-supply
backup and no air-cooling system.
-An erasure-coded pool is created with a crush map ruleset that will
+An erasure-coded pool is created with a CRUSH rule that will
ensure no data loss if at most three datacenters fail
simultaneously. The overhead is 50% with erasure code configured to
split data in six (k=6) and create three coding chunks (m=3). With
$ ceph osd pool create ecpool 12 12 erasure
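A profile matching the multidatacenter scenario sketched above could look
roughly like this, assuming a ``datacenter`` bucket type exists in the CRUSH
map (the profile and pool names are assumptions)::

   $ ceph osd erasure-code-profile set multidc-profile \
        k=6 m=3 \
        crush-failure-domain=datacenter
   $ ceph osd pool create multidcpool 12 12 erasure multidc-profile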
-Set up an erasure-coded pool and the associated crush ruleset::
+Set up an erasure-coded pool and the associated CRUSH rule ``ecrule``::
- $ ceph osd crush rule create-erasure ecruleset
+ $ ceph osd crush rule create-erasure ecrule
$ ceph osd pool create ecpool 12 12 erasure \
- default ecruleset
+ default ecrule
-Set the ruleset failure domain to osd (instead of the host which is the default)::
+Set the CRUSH failure domain to osd (instead of host, which is the default)::
$ ceph osd erasure-code-profile set myprofile \
crush-failure-domain=osd
$ ceph osd erasure-code-profile ls
default
-Set the ruleset to take ssd (instead of default)::
+Set the CRUSH root of the rule to ssd (instead of default)::
$ ceph osd erasure-code-profile set myprofile \
crush-root=ssd
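A pool created from the profile picks up these CRUSH settings; a minimal
sketch (the pool name and placement group counts are assumptions)::

   $ ceph osd pool create ecpool 12 12 erasure myprofile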
Controlled Replication Under Scalable Hashing. It is the algorithm
Ceph uses to compute object storage locations.
- ruleset
- A set of CRUSH data placement rules that applies to a particular pool(s).
+ CRUSH rule
+ The CRUSH data placement rule that applies to a particular pool or set of pools.
Pool
Pools
radosgw-admin period update --commit
.. note:: Mapping the index pool (for each zone, if applicable) to a CRUSH
- ruleset of SSD-based OSDs may also help with bucket index performance.
+ rule of SSD-based OSDs may also help with bucket index performance.
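As a rough sketch, the index pool can be pointed at such a rule with something
like the following (the pool and rule names are assumptions; depending on the
release the value may be a rule name or its numeric ID)::

   ceph osd pool set default.rgw.buckets.index crush_rule ssd-rule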
Add Wildcard to DNS
-------------------
Usage::
ceph osd pool create <poolname> <int[0-]> {<int[0-]>} {replicated|erasure}
- {<erasure_code_profile>} {<ruleset>} {<int>}
+ {<erasure_code_profile>} {<rule>} {<int>}
Subcommand ``delete`` deletes pool.
Usage::
ceph osd pool get <poolname> size|min_size|pg_num|
- pgp_num|crush_ruleset|auid|write_fadvise_dontneed
+ pgp_num|crush_rule|auid|write_fadvise_dontneed
Only for tiered pools::
Usage::
ceph osd pool set <poolname> size|min_size|pg_num|
- pgp_num|crush_ruleset|hashpspool|nodelete|nopgchange|nosizechange|
+ pgp_num|crush_rule|hashpspool|nodelete|nopgchange|nosizechange|
hit_set_type|hit_set_period|hit_set_count|hit_set_fpp|debug_fake_ec_pool|
target_max_bytes|target_max_objects|cache_target_dirty_ratio|
cache_target_dirty_high_ratio|
5 1 osd.5 1
...
-CRUSH rulesets are created so the generated crushmap can be
-tested. They are the same rulesets as the one created by default when
+CRUSH rules are created so the generated crushmap can be
+tested. They are the same rules as the ones created by default when
creating a new Ceph cluster. They can be further edited with::
# decompile
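   crushtool -d crushmap -o crushmap.txt
   # edit crushmap.txt as needed, then recompile (file names are assumptions)
   crushtool -c crushmap.txt -o crushmap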
``librados`` and connect to a Ceph Monitor. Once connected, ``librados``
retrieves the :term:`Cluster Map` from the Ceph Monitor. When the client app
wants to read or write data, it creates an I/O context and binds to a
-:term:`pool`. The pool has an associated :term:`ruleset` that defines how it
+:term:`pool`. The pool has an associated :term:`CRUSH Rule` that defines how it
will place data in the storage cluster. Via the I/O context, the client
provides the object name to ``librados``, which takes the object name
and the cluster map (i.e., the topology of the cluster) and `computes`_ the
| To
v
+--------+ +---------------+
- | Pool |---------->| CRUSH Ruleset |
+ | Pool |---------->| CRUSH Rule |
+--------+ Selects +---------------+
With Ceph, you can run multiple Ceph Storage Clusters on the same hardware.
Running multiple clusters provides a higher level of isolation compared to
-using different pools on the same cluster with different CRUSH rulesets. A
+using different pools on the same cluster with different CRUSH rules. A
separate cluster will have separate monitor, OSD and metadata server processes.
When running Ceph with default settings, the default cluster name is ``ceph``,
which means you would save your Ceph configuration file with the file name
See `Weighting Bucket Items`_ for details.
-``osd pool default crush replicated ruleset``
+``osd pool default crush rule``
-:Description: The default CRUSH ruleset to use when creating a replicated pool.
+:Description: The default CRUSH rule to use when creating a replicated pool.
:Type: 8-bit Integer
-:Default: ``CEPH_DEFAULT_CRUSH_REPLICATED_RULESET``, which means "pick
- a ruleset with the lowest numerical ID and use that". This is to
- make pool creation work in the absence of ruleset 0.
+:Default: ``-1``, which means "pick the rule with the lowest numerical ID and
+ use that". This is to make pool creation work in the absence of rule 0.
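A hedged ``ceph.conf`` sketch (the rule ID ``1`` is an assumption and must
refer to a rule that actually exists)::

   [global]
   # assumption: new replicated pools should use the CRUSH rule with ID 1
   osd pool default crush rule = 1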
``osd pool erasure code stripe unit``
- **Erasure Coding:** In this scenario, the pool uses erasure coding to
store data much more efficiently with a small performance tradeoff.
-In the standard storage scenario, you can setup a CRUSH ruleset to establish
+In the standard storage scenario, you can setup a CRUSH rule to establish
the failure domain (e.g., osd, host, chassis, rack, row, etc.). Ceph OSD
-Daemons perform optimally when all storage drives in the ruleset are of the
+Daemons perform optimally when all storage drives in the rule are of the
same size, speed (both RPMs and throughput) and type. See `CRUSH Maps`_
-for details on creating a ruleset. Once you have created a ruleset, create
+for details on creating a rule. Once you have created a rule, create
a backing storage pool.
In the erasure coding scenario, the pool creation arguments will generate the
-appropriate ruleset automatically. See `Create a Pool`_ for details.
+appropriate rule automatically. See `Create a Pool`_ for details.
In subsequent examples, we will refer to the backing storage pool
as ``cold-storage``.
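As a hedged example, the erasure-coded backing pool might be created along
these lines (the placement group counts are assumptions, and the default
erasure code profile is used)::

   ceph osd pool create cold-storage 128 128 erasure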
Setting up a cache pool follows the same procedure as the standard storage
scenario, but with this difference: the drives for the cache tier are typically
high performance drives that reside in their own servers and have their own
-ruleset. When setting up a ruleset, it should take account of the hosts that
-have the high performance drives while omitting the hosts that don't. See
+CRUSH rule. When setting up such a rule, it should include the hosts that
+have the high performance drives while omitting the hosts that don't. See
`Placing Different Pools on Different OSDs`_ for details.
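A minimal sketch of wiring the two pools together, assuming the cache pool
has been created as ``hot-storage`` (the name is an assumption)::

   ceph osd tier add cold-storage hot-storage
   ceph osd tier cache-mode hot-storage writeback
   ceph osd tier set-overlay cold-storage hot-storage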
* ``size``: Sets the number of copies of data in the pool.
* ``pg_num``: The placement group number.
* ``pgp_num``: Effective number when calculating pg placement.
- * ``crush_ruleset``: rule number for mapping placement.
+ * ``crush_rule``: rule number for mapping placement.
Get the value of a pool setting. ::
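   ceph osd pool get {pool-name} {key}

For example, to check which CRUSH rule a pool uses (the pool name is an
assumption)::

   ceph osd pool get mypool crush_rule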
#. `Recompile`_ the CRUSH map.
#. `Set the CRUSH map`_.
-To activate CRUSH map rules for a specific pool, identify the common ruleset
-number for those rules and specify that ruleset number for the pool. See `Set
-Pool Values`_ for details.
+For details on setting the CRUSH map rule for a specific pool, see `Set
+Pool Values`_.
.. _Get the CRUSH map: #getcrushmap
.. _Decompile: #decompilecrushmap
---------------
CRUSH maps support the notion of 'CRUSH rules', which are the rules that
-determine data placement for a pool. For large clusters, you will likely create
-many pools where each pool may have its own CRUSH ruleset and rules. The default
-CRUSH map has a rule for each pool, and one ruleset assigned to each of the
-default pools.
+determine data placement for a pool. The default CRUSH map has a rule for each
+pool. For large clusters, you will likely create many pools where each pool may
+have its own non-default CRUSH rule.
-.. note:: In most cases, you will not need to modify the default rules. When
- you create a new pool, its default ruleset is ``0``.
+.. note:: In most cases, you will not need to modify the default rule. When
+ you create a new pool, by default the rule will be set to ``0``.
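To see which rules exist and how a given rule is defined, something like the
following can be used (``replicated_rule`` is the usual default rule name and
may differ on your cluster)::

   ceph osd crush rule ls
   ceph osd crush rule dump replicated_rule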
CRUSH rules define placement and replication strategies or distribution policies
``ruleset``
-:Description: A means of classifying a rule as belonging to a set of rules.
- Activated by `setting the ruleset in a pool`_.
+:Description: A unique whole number for identifying the rule. The name ``ruleset``
+ is a carry-over from the past, when it was possible to have multiple
+ CRUSH rules per pool.
:Purpose: A component of the rule mask.
:Type: Integer
:Required: Yes
:Default: 0
-.. _setting the ruleset in a pool: ../pools#setpoolvalues
-
``type``
:Prerequisite: Follows ``step choose``.
:Example: ``step emit``
-.. important:: To activate one or more rules with a common ruleset number to a
- pool, set the ruleset number of the pool.
+.. important:: A given CRUSH rule may be assigned to multiple pools, but it
+ is not possible for a single pool to have multiple CRUSH rules.
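For reference, a decompiled CRUSH map typically contains a default rule that
looks roughly like the following (exact values vary between releases)::

   rule replicated_rule {
           ruleset 0
           type replicated
           min_size 1
           max_size 10
           step take default
           step chooseleaf firstn 0 type host
           step emit
   }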
Placing Different Pools on Different OSDS:
- **Pools:** Ceph stores data within pools, which are logical groups for storing
objects. Pools manage the number of placement groups, the number of replicas,
- and the ruleset for the pool. To store data in a pool, you must have
+ and the CRUSH rule for the pool. To store data in a pool, you must have
an authenticated user with permissions for the pool. Ceph can snapshot pools.
See `Pools`_ for additional details.
``crush-root={root}``
:Description: The name of the crush bucket used for the first step of
- the ruleset. For intance **step take default**.
+ the CRUSH rule. For instance, **step take default**.
:Type: String
:Required: No.
:Description: Ensure that no two chunks are in a bucket with the same
failure domain. For instance, if the failure domain is
**host** no two chunks will be stored on the same
- host. It is used to create a ruleset step such as **step
+ host. It is used to create a CRUSH rule step such as **step
chooseleaf host**.
:Type: String
``crush-root={root}``
:Description: The name of the crush bucket used for the first step of
- the ruleset. For intance **step take default**.
+ the CRUSH rule. For instance, **step take default**.
:Type: String
:Required: No.
:Description: Ensure that no two chunks are in a bucket with the same
failure domain. For instance, if the failure domain is
**host** no two chunks will be stored on the same
- host. It is used to create a ruleset step such as **step
+ host. It is used to create a CRUSH rule step such as **step
chooseleaf host**.
:Type: String
``crush-root={root}``
:Description: The name of the crush bucket used for the first step of
- the ruleset. For intance **step take default**.
+ the CRUSH rule. For instance, **step take default**.
:Type: String
:Required: No.
defined by **l** will be stored. For instance, if it is
set to **rack**, each group of **l** chunks will be
placed in a different rack. It is used to create a
- ruleset step such as **step choose rack**. If it is not
+ CRUSH rule step such as **step choose rack**. If it is not
set, no such grouping is done.
:Type: String
:Description: Ensure that no two chunks are in a bucket with the same
failure domain. For instance, if the failure domain is
**host** no two chunks will be stored on the same
- host. It is used to create a ruleset step such as **step
+ host. It is used to create a CRUSH rule step such as **step
chooseleaf host**.
:Type: String
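Putting these crush-* options together, a hedged LRC-style profile might look
like this (all values shown are illustrative assumptions)::

   $ ceph osd erasure-code-profile set LRCprofile \
        plugin=lrc \
        k=4 m=2 l=3 \
        crush-failure-domain=host \
        crush-locality=rack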
step 2 cDDD____
step 3 ____cDDD
-Controlling crush placement
+Controlling CRUSH placement
===========================
-The default crush ruleset provides OSDs that are on different hosts. For instance::
+The default CRUSH rule provides OSDs that are on different hosts. For instance::
chunk nr 01234567
crush-steps='[ [ "choose", "rack", 2 ], [ "chooseleaf", "host", 4 ] ]'
-will create a ruleset that will select two crush buckets of type
+will create a rule that will select two crush buckets of type
*rack* and for each of them choose four OSDs, each of them located in
different buckets of type *host*.
-The ruleset can also be manually crafted for finer control.
+The CRUSH rule can also be manually crafted for finer control.
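A hand-written rule equivalent to the two-step placement above might contain
steps roughly like the following (the root bucket and the ``indep`` mode are
assumptions)::

   step take default
   step choose indep 2 type rack
   step chooseleaf indep 4 type host
   step emit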
=====================
Erasure code is defined by a **profile** and is used when creating an
-erasure coded pool and the associated crush ruleset.
+erasure coded pool and the associated CRUSH rule.
The **default** erasure code profile (which is created when the Ceph
cluster is initialized) provides the same level of redundancy as two
``crush-root={root}``
:Description: The name of the crush bucket used for the first step of
- the ruleset. For intance **step take default**.
+ the CRUSH rule. For instance, **step take default**.
:Type: String
:Required: No.
:Description: Ensure that no two chunks are in a bucket with the same
failure domain. For instance, if the failure domain is
**host** no two chunks will be stored on the same
- host. It is used to create a ruleset step such as **step
+ host. It is used to create a CRUSH rule step such as **step
chooseleaf host**.
:Type: String
The *NYAN* object will be divided in three (*K=3*) and two additional
*chunks* will be created (*M=2*). The value of *M* defines how many
OSD can be lost simultaneously without losing any data. The
-*crush-failure-domain=rack* will create a CRUSH ruleset that ensures
+*crush-failure-domain=rack* will create a CRUSH rule that ensures
no two *chunks* are stored in the same rack.
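A hedged sketch of the profile and pool behind this example (the profile and
pool names are assumptions)::

   $ ceph osd erasure-code-profile set myprofile \
        k=3 m=2 \
        crush-failure-domain=rack
   $ ceph osd pool create ecpool 12 12 erasure myprofile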
.. ditaa::
setting up multiple pools, be careful to ensure you set a reasonable number of
placement groups for both the pool and the cluster as a whole.
-- **CRUSH Rules**: When you store data in a pool, a CRUSH ruleset mapped to the
- pool enables CRUSH to identify a rule for the placement of the object
- and its replicas (or chunks for erasure coded pools) in your cluster.
- You can create a custom CRUSH rule for your pool.
+- **CRUSH Rules**: When you store data in a pool, placement of the object
+ and its replicas (or chunks for erasure coded pools) in your cluster is governed
+ by CRUSH rules. You can create a custom CRUSH rule for your pool if the default
+ rule is not appropriate for your use case.
- **Snapshots**: When you create snapshots with ``ceph osd pool mksnap``,
you effectively take a snapshot of a particular pool.
:Type: String
:Required: No.
-:Default: For **replicated** pools it is the ruleset specified by the ``osd
- pool default crush replicated ruleset`` config variable. This
- ruleset must exist.
+:Default: For **replicated** pools it is the rule specified by the ``osd
+ pool default crush rule`` config variable. This rule must exist.
For **erasure** pools it is ``erasure-code`` if the ``default``
`erasure code profile`_ is used or ``{pool-name}`` otherwise. This
- ruleset will be created implicitly if it doesn't exist already.
+ rule will be created implicitly if it doesn't exist already.
``[erasure-code-profile=profile]``
.. _Monitor Configuration: ../../configuration/mon-config-ref
-If you created your own rulesets and rules for a pool you created, you should
-consider removing them when you no longer need your pool::
+If you created your own rules for a pool you created, you should consider
+removing them when you no longer need your pool::
- ceph osd pool get {pool-name} crush_ruleset
+ ceph osd pool get {pool-name} crush_rule
-If the ruleset was "123", for example, you can check the other pools like so::
+If the rule was "123", for example, you can check the other pools like so::
- ceph osd dump | grep "^pool" | grep "crush_ruleset 123"
+ ceph osd dump | grep "^pool" | grep "crush_rule 123"
-If no other pools use that custom ruleset, then it's safe to delete that
-ruleset from the cluster.
+If no other pools use that custom rule, then it's safe to delete that
+rule from the cluster.
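For example (the rule name is an assumption)::

   ceph osd crush rule rm my-custom-rule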
If you created users with permissions strictly for a pool that no longer
exists, you should consider deleting those users too::
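   ceph auth del {user}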
:Type: Integer
:Valid Range: Equal to or less than ``pg_num``.
-.. _crush_ruleset:
+.. _crush_rule:
-``crush_ruleset``
+``crush_rule``
-:Description: The ruleset to use for mapping object placement in the cluster.
+:Description: The rule to use for mapping object placement in the cluster.
:Type: Integer
.. _allow_ec_overwrites:
:Valid Range: Equal to or less than ``pg_num``.
-``crush_ruleset``
+``crush_rule``
-:Description: see crush_ruleset_
+:Description: see crush_rule_
``hit_set_type``
CRUSH constraints cannot be satisfied
-------------------------------------
-If the cluster has enough OSDs, it is possible that the CRUSH ruleset
+If the cluster has enough OSDs, it is possible that the CRUSH rule
imposes constraints that cannot be satisfied. If there are 10 OSDs on
-two hosts and the CRUSH rulesets require that no two OSDs from the
+two hosts and the CRUSH rule requires that no two OSDs from the
same host are used in the same PG, the mapping may fail because only
-two OSD will be found. You can check the constraint by displaying the
-ruleset::
+two OSDs will be found. You can check the constraint by displaying ("dumping")
+the rule::
$ ceph osd crush rule ls
[
- "replicated_ruleset",
+ "replicated_rule",
"erasurepool"]
$ ceph osd crush rule dump erasurepool
{ "rule_id": 1,
* adding more OSDs to the cluster (that does not require the erasure
coded pool to be modified, it will become clean automatically)
-* use a hand made CRUSH ruleset that tries more times to find a good
- mapping. It can be done by setting ``set_choose_tries`` to a value
+* use a handmade CRUSH rule that tries more times to find a good
+ mapping. This can be done by setting ``set_choose_tries`` to a value
greater than the default.
You should first verify the problem with ``crushtool`` after
bad mapping rule 8 x 79 num_rep 9 result [6,0,2,1,4,7,2147483647,5,8]
bad mapping rule 8 x 173 num_rep 9 result [0,4,6,8,2,1,3,7,2147483647]
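Output like the above would typically come from a ``crushtool`` test run along
these lines (the file name, rule number and x range are assumptions)::

   $ ceph osd getcrushmap -o crush.map
   $ crushtool -i crush.map --test --show-bad-mappings \
        --rule 1 --num-rep 9 --min-x 1 --max-x 1000000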
-Where ``--num-rep`` is the number of OSDs the erasure code crush
-ruleset needs, ``--rule`` is the value of the ``ruleset`` field
+Where ``--num-rep`` is the number of OSDs the erasure code CRUSH
+rule needs, ``--rule`` is the value of the ``ruleset`` field
displayed by ``ceph osd crush rule dump``. The test will try mapping
one million values (i.e. the range defined by ``[--min-x,--max-x]``)
and must display at least one bad mapping. If it outputs nothing, it
means all mappings are successful and you can stop right there: the
problem is elsewhere.
-The crush ruleset can be edited by decompiling the crush map::
+The CRUSH rule can be edited by decompiling the crush map::
$ crushtool --decompile crush.map > crush.txt
-and adding the following line to the ruleset::
+and adding the following line to the rule::
step set_choose_tries 100
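After editing, the map can be recompiled and injected back into the cluster
(file names are assumptions)::

   $ crushtool --compile crush.txt -o better-crush.map
   $ ceph osd setcrushmap -i better-crush.map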