From 2d84e018091a9e5421e8f6c7613ec3c4b9458940 Mon Sep 17 00:00:00 2001
From: Zac Dover
Date: Sat, 20 May 2023 02:26:45 +1000
Subject: [PATCH] doc/rados: edit data-placement.rst

Edit doc/rados/data-placement.rst.

Co-authored-by: Cole Mitchell
Signed-off-by: Zac Dover
(cherry picked from commit 32600c27c4dca6b9d5fae9892c0a1660b672781c)
---
 doc/rados/operations/data-placement.rst | 63 +++++++++++++------------
 1 file changed, 34 insertions(+), 29 deletions(-)

diff --git a/doc/rados/operations/data-placement.rst b/doc/rados/operations/data-placement.rst
index 8576af8a0e533..3d3be65ec0874 100644
--- a/doc/rados/operations/data-placement.rst
+++ b/doc/rados/operations/data-placement.rst
@@ -2,40 +2,45 @@
 Data Placement Overview
 =========================
 
-Ceph stores, replicates and rebalances data objects across a RADOS cluster
-dynamically. With many different users storing objects in different pools for
-different purposes on countless OSDs, Ceph operations require some data
-placement planning. The main data placement planning concepts in Ceph include:
+Ceph stores, replicates, and rebalances data objects across a RADOS cluster
+dynamically. Because different users store objects in different pools for
+different purposes on many OSDs, Ceph operations require a certain amount of
+data-placement planning. The main data-placement planning concepts in Ceph
+include:
 
-- **Pools:** Ceph stores data within pools, which are logical groups for storing
-  objects. Pools manage the number of placement groups, the number of replicas,
-  and the CRUSH rule for the pool. To store data in a pool, you must have
-  an authenticated user with permissions for the pool. Ceph can snapshot pools.
-  See `Pools`_ for additional details.
+- **Pools:** Ceph stores data within pools, which are logical groups used for
+  storing objects. Pools manage the number of placement groups, the number of
+  replicas, and the CRUSH rule for the pool. To store data in a pool, it is
+  necessary to be an authenticated user with permissions for the pool. Ceph is
+  able to make snapshots of pools. For additional details, see `Pools`_.
 
-- **Placement Groups:** Ceph maps objects to placement groups (PGs).
-  Placement groups (PGs) are shards or fragments of a logical object pool
-  that place objects as a group into OSDs. Placement groups reduce the amount
-  of per-object metadata when Ceph stores the data in OSDs. A larger number of
-  placement groups (e.g., 100 per OSD) leads to better balancing. See
-  :ref:`placement groups` for additional details.
+- **Placement Groups:** Ceph maps objects to placement groups. Placement
+  groups (PGs) are shards or fragments of a logical object pool that place
+  objects as a group into OSDs. Placement groups reduce the amount of
+  per-object metadata that is necessary for Ceph to store the data in OSDs. A
+  greater number of placement groups (for example, 100 PGs per OSD as compared
+  with 50 PGs per OSD) leads to better balancing. For additional details, see
+  :ref:`placement groups`.
 
-- **CRUSH Maps:** CRUSH is a big part of what allows Ceph to scale without
-  performance bottlenecks, without limitations to scalability, and without a
-  single point of failure. CRUSH maps provide the physical topology of the
-  cluster to the CRUSH algorithm to determine where the data for an object
-  and its replicas should be stored, and how to do so across failure domains
-  for added data safety among other things. See `CRUSH Maps`_ for additional
-  details.
+- **CRUSH Maps:** CRUSH plays a major role in allowing Ceph to scale while
+  avoiding certain pitfalls, such as performance bottlenecks, limitations to
+  scalability, and single points of failure. CRUSH maps provide the physical
+  topology of the cluster to the CRUSH algorithm, so that it can determine both
+  (1) where the data for an object and its replicas should be stored and (2)
+  how to store that data across failure domains so as to improve data safety.
+  For additional details, see `CRUSH Maps`_.
 
-- **Balancer:** The balancer is a feature that will automatically optimize the
-  distribution of PGs across devices to achieve a balanced data distribution,
-  maximizing the amount of data that can be stored in the cluster and evenly
-  distributing the workload across OSDs.
+- **Balancer:** The balancer is a feature that automatically optimizes the
+  distribution of placement groups across devices in order to achieve a
+  balanced data distribution, in order to maximize the amount of data that can
+  be stored in the cluster, and in order to evenly distribute the workload
+  across OSDs.
 
-When you initially set up a test cluster, you can use the default values. Once
-you begin planning for a large Ceph cluster, refer to pools, placement groups
-and CRUSH for data placement operations.
+It is possible to use the default values for each of the above components.
+Default values are recommended for a test cluster's initial setup. However,
+when planning a large Ceph cluster, values should be customized for
+data-placement operations with reference to the different roles played by
+pools, placement groups, and CRUSH.
 
 .. _Pools: ../pools
 .. _CRUSH Maps: ../crush-map
-- 
2.39.5
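
A note for readers of the page edited above: the concepts in the rewritten
text correspond to a handful of standard Ceph commands. The following is only
a minimal sketch; the pool name "mypool", the PG count of 128, the replica
count of 3, and the client name "client.example" are hypothetical values
chosen for illustration, not values taken from this patch.

    # Create a pool with an explicit number of placement groups, then set the
    # number of replicas kept for each object in that pool.
    ceph osd pool create mypool 128
    ceph osd pool set mypool size 3

    # Create an authenticated user whose access is limited to that pool.
    ceph auth get-or-create client.example mon 'allow r' osd 'allow rw pool=mypool'

    # Extract and decompile the CRUSH map to inspect the cluster topology
    # that governs where the pool's data is placed.
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # Turn on the balancer so that PGs are distributed evenly across OSDs.
    ceph balancer on
    ceph balancer status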