From a2937fab78e64f61e6126c630c20c2184e9b14f0 Mon Sep 17 00:00:00 2001
From: Shraddha Agrawal
Date: Wed, 14 May 2025 18:06:38 +0530
Subject: [PATCH] doc: address review comments

Signed-off-by: Shraddha Agrawal
---
 PendingReleaseNotes                 |  2 +-
 doc/rados/operations/monitoring.rst | 16 ++++++----------
 2 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/PendingReleaseNotes b/PendingReleaseNotes
index b238937eb81..8380f6c06e0 100644
--- a/PendingReleaseNotes
+++ b/PendingReleaseNotes
@@ -151,7 +151,7 @@
   users to view the availability score for each pool in a cluster. A pool is considered
   unavailable if any PG in the pool is not in active state or if there are unfound
   objects. Otherwise the pool is considered available. The score is updated every
-  5 seconds.
+  5 seconds. This feature is in tech preview.
   Related trackers:
    - https://tracker.ceph.com/issues/67777
 
diff --git a/doc/rados/operations/monitoring.rst b/doc/rados/operations/monitoring.rst
index 6e879defdaa..176661d8bc9 100644
--- a/doc/rados/operations/monitoring.rst
+++ b/doc/rados/operations/monitoring.rst
@@ -751,8 +751,7 @@ the following command can be invoked:
 
    ceph osd pool availability-status
 
-If the cluster has 4 pools, this is what the ``availability-status``
-will report:
+Example output:
 
 .. prompt:: bash $
 
@@ -762,15 +761,12 @@ will report:
    cephfs.a.meta 77s 0s 0 0s 0s 1 1
    cephfs.a.data 76s 0s 0 0s 0s 1 1
 
-We consider a pool unavailable if there is potentially any data loss.
-This means, if there are any PG in the pool not in
-active state or if there are unfound objects, some data might be
-either unreachable or lost. In such cases, we mark the pool as
-unavailable. Otherwise the pool is considered available.
-For example: A pool will be marked available even if an OSD is down
-as long as PG replication ensures there is no data loss.
+A pool is considered ``unavailable`` when at least one PG in the pool
+becomes inactive or there is at least one unfound object in the pool.
+Otherwise the pool is considered ``available``.
 
 We first calculate the Mean Time Between Failures (MTBF) and
-Mean Time To Recover (MTTR) and arrive at the availability score
+Mean Time To Recover (MTTR) from the uptime and downtime recorded
+for each pool and arrive at the availability score
 by finding ratio of MTBF to total time (ie MTTR + MTBF). The score
 is updated every 5 seconds.
\ No newline at end of file
-- 
2.39.5
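
Note on the score described in the monitoring.rst hunk: the patch text says the availability score is the ratio of MTBF to total time (MTTR + MTBF), with both values derived from each pool's recorded uptime and downtime. The Python sketch below is only an illustration of that arithmetic, not the Ceph implementation. The function name `availability_score`, its parameters, the per-failure averaging used for MTBF and MTTR, and the choice to return 1.0 when no failures are recorded are all assumptions made for this example.

```python
def availability_score(uptime_s: float, downtime_s: float, num_failures: int) -> float:
    """Hypothetical helper: score = MTBF / (MTBF + MTTR).

    Assumes MTBF and MTTR are simple per-failure averages of the
    recorded uptime and downtime; Ceph's actual bookkeeping may differ.
    """
    if num_failures <= 0:
        # Assumption: a pool with no recorded failures is fully available.
        return 1.0

    mtbf = uptime_s / num_failures    # mean time between failures
    mttr = downtime_s / num_failures  # mean time to recover
    total = mtbf + mttr
    if total == 0:
        # No time recorded yet; avoid dividing by zero.
        return 1.0
    return mtbf / total


if __name__ == "__main__":
    # A pool that has never failed, such as 77s of uptime and no downtime.
    print(availability_score(77, 0, 0))
    # A made-up pool with 90s up, 10s down, across 2 failures.
    print(availability_score(90, 10, 2))
```

Under these assumptions the failure count cancels out of the ratio, so the score reduces to uptime divided by uptime plus downtime. The count only matters for the intermediate MTBF and MTTR values.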