From cb300ee6b8e00285cd086d9637aa448fcc5035eb Mon Sep 17 00:00:00 2001
From: Jonas Jelten
Date: Sun, 20 Sep 2020 23:05:33 +0200
Subject: [PATCH] doc: fix and improve the explainations of up and acting osd sets

acting = where clients requests are processed
up = where pgs were planned to be by crush+upmap

Fixes: https://tracker.ceph.com/issues/48718
Signed-off-by: Jonas Jelten
---
 doc/rados/operations/monitoring-osd-pg.rst | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/doc/rados/operations/monitoring-osd-pg.rst b/doc/rados/operations/monitoring-osd-pg.rst
index 08b70dd4d51..86b85e7b48d 100644
--- a/doc/rados/operations/monitoring-osd-pg.rst
+++ b/doc/rados/operations/monitoring-osd-pg.rst
@@ -109,9 +109,15 @@ requires three replicas of a placement group, CRUSH may assign them to
 ``osd.1``, ``osd.2`` and ``osd.3`` respectively. CRUSH actually seeks a
 pseudo-random placement that will take into account failure domains you set in
 your `CRUSH map`_, so you will rarely see placement groups assigned to nearest
-neighbor OSDs in a large cluster. We refer to the set of OSDs that should
-contain the replicas of a particular placement group as the **Acting Set**. In
-some cases, an OSD in the Acting Set is ``down`` or otherwise not able to
+neighbor OSDs in a large cluster.
+
+Ceph processes a client request using the **Acting Set**, which is the set of
+OSDs that will actually handle the requests since they have a full and working
+version of a placement group shard. The set of OSDs that should contain a shard
+of a particular placement group as the **Up Set**, i.e. where data is
+moved/copied to (or planned to be).
+
+In some cases, an OSD in the Acting Set is ``down`` or otherwise not able to
 service requests for objects in the placement group. When these situations
 arise, don't panic. Common examples include:
 
@@ -122,12 +128,10 @@ arise, don't panic. Common examples include:
 - An OSD in the Acting Set is ``down`` or unable to service requests,
   and another OSD has temporarily assumed its duties.
 
-Ceph processes a client request using the **Up Set**, which is the set of OSDs
-that will actually handle the requests. In most cases, the Up Set and the Acting
-Set are virtually identical. When they are not, it may indicate that Ceph is
-migrating data, an OSD is recovering, or that there is a problem (i.e., Ceph
-usually echoes a "HEALTH WARN" state with a "stuck stale" message in such
-scenarios).
+In most cases, the Up Set and the Acting Set are identical. When they are not,
+it may indicate that Ceph is migrating the PG (it's remapped), an OSD is
+recovering, or that there is a problem (i.e., Ceph usually echoes a "HEALTH
+WARN" state with a "stuck stale" message in such scenarios).
 
 To retrieve a list of placement groups, execute::
 
-- 
2.39.5
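
Reviewer note: the distinction this patch draws (acting = where client requests are processed, up = where CRUSH and upmap plan the PG to be) can be checked mechanically by comparing the two sets. The following is a minimal sketch, not part of the patch. It assumes a status line shaped like `... -> up [1,0] acting [1,0]`; the helper names `parse_sets` and `sets_differ` and the sample lines in the usage note are hypothetical.

```python
import re

# Assumed line shape: "... up [<ids>] acting [<ids>]". Adjust the pattern
# if the output of your Ceph release looks different.
PG_SETS_RE = re.compile(r"up \[([\d,]*)\] acting \[([\d,]*)\]")


def parse_sets(line):
    """Return (up, acting) as lists of OSD ids parsed from one status line."""
    match = PG_SETS_RE.search(line)
    if match is None:
        raise ValueError(f"no up/acting sets found in: {line!r}")
    up = [int(x) for x in match.group(1).split(",") if x]
    acting = [int(x) for x in match.group(2).split(",") if x]
    return up, acting


def sets_differ(up, acting):
    """True when the planned placement (up) is not where requests are served (acting).

    Lists are compared in order, so a change of the first OSD also counts.
    """
    return up != acting
```

For example, `parse_sets("osdmap e13 pg 1.6c (1.6c) -> up [1,0] acting [1,0]")` yields two equal lists, so `sets_differ` is false. With `up [2,0] acting [1,0]` it is true, which per the patched text may indicate that the PG is being migrated (remapped), that an OSD is recovering, or that there is a problem.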