pybind/mgr/pg_autoscaler: change overlapping roots to warning 46242/head
author    Kamoltat <ksirivad@redhat.com>
Thu, 12 May 2022 12:22:13 +0000 (12:22 +0000)
committer Kamoltat <ksirivad@redhat.com>
Mon, 25 Jul 2022 14:17:03 +0000 (14:17 +0000)
Change the log level of overlapping roots
from ``Error`` to ``Warning``.

Point the user to documentation that
explains the overlapping roots.

Added more information regarding overlapping roots
to the autoscaler documentation, such as
the steps for getting rid of the warning.

Fixes: https://tracker.ceph.com/issues/55611
Signed-off-by: Kamoltat <ksirivad@redhat.com>
doc/rados/operations/placement-groups.rst
src/pybind/mgr/pg_autoscaler/module.py

index d8d1a532bb659dfb753349ed4754197c2753b1fa..c471ff8bcd94e726ce61c4b8d0b01ec0154bf03d 100644 (file)
@@ -143,16 +143,21 @@ example, a pool that maps to OSDs of class `ssd` and a pool that maps
 to OSDs of class `hdd` will each have optimal PG counts that depend on
 the number of those respective device types.
 
+In the case where a pool uses OSDs under two or more CRUSH roots (e.g., shadow
+trees with both `ssd` and `hdd` devices), the autoscaler will
+issue a warning in the manager log stating the name of the pool
+and the set of roots that overlap each other. The autoscaler will not
+scale any pools with overlapping roots because this can cause problems
+with the scaling process. We recommend making each pool belong to only
+one root (one OSD class) to get rid of the warning and ensure a successful
+scaling process.
+
 The autoscaler uses the `bulk` flag to determine which pool
 should start out with a full complement of PGs and only
 scales down when the usage ratio across the pool is not even.
 However, if the pool doesn't have the `bulk` flag, the pool will
 start out with minimal PGs and gain more PGs only when there is more usage in the pool.
 
-The autoscaler identifies any overlapping roots and prevents the pools
-with such roots from scaling because overlapping roots can cause problems
-with the scaling process.
-
 To create a pool with the `bulk` flag::
 
   ceph osd pool create <pool-name> --bulk
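
As a hedged illustration of the "one root per pool" recommendation added above
(assuming a replicated pool; the rule name `replicated_hdd`, failure domain
`host`, and device class `hdd` are placeholders, not part of this change), a
pool can be restricted to a single OSD class by creating a
device-class-specific CRUSH rule and pointing the pool at it::

  ceph osd crush rule create-replicated replicated_hdd default host hdd
  ceph osd pool set <pool-name> crush_rule replicated_hdd

Once the pool maps to OSDs under a single root, the overlapping-roots warning
for that pool should no longer be emitted on the next autoscaler run.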
index 92a149ae15645444135f8e4062c7436f45fc39d2..d4758c75ab4462449134a5844d7afb71210960bb 100644 (file)
@@ -364,8 +364,11 @@ class PgAutoscaler(MgrModule):
                     if prev_root_id != root_id:
                         overlapped_roots.add(prev_root_id)
                         overlapped_roots.add(root_id)
-                        self.log.error('pool %d has overlapping roots: %s',
-                                       pool_id, overlapped_roots)
+                        self.log.warning("pool %s won't scale due to overlapping roots: %s",
+                                         pool['pool_name'], overlapped_roots)
+                        self.log.warning("Please see: https://docs.ceph.com/en/"
+                                         "latest/rados/operations/placement-groups"
+                                         "/#automated-scaling")
                     break
             if not s:
                 s = CrushSubtreeResourceStatus()
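
For readers comparing the old and new behaviour, here is a minimal,
self-contained Python sketch of the idea, not the pg_autoscaler module's
actual code: the pool-to-roots mapping is a hypothetical input, whereas the
real module derives it from the osdmap and CRUSH map:

    # Illustrative sketch only -- not the pg_autoscaler module's code.
    # Pools whose CRUSH rule spans more than one root are reported with a
    # warning (instead of the old error) and excluded from scaling.
    import logging
    from typing import Dict, List, Set

    log = logging.getLogger("pg_autoscaler.sketch")

    DOCS_URL = ("https://docs.ceph.com/en/latest/rados/operations/"
                "placement-groups/#automated-scaling")

    def find_overlapped_roots(pool_roots: Dict[str, List[int]]) -> Set[int]:
        """Return root ids shared by pools whose CRUSH rule reaches several roots.

        pool_roots maps a pool name to the root ids its rule reaches; this is
        a hypothetical input, the real module computes it from the osdmap.
        """
        overlapped: Set[int] = set()
        for pool_name, roots in pool_roots.items():
            if len(set(roots)) > 1:
                overlapped.update(roots)
                log.warning("pool %s won't scale due to overlapping roots: %s",
                            pool_name, set(roots))
                log.warning("Please see: %s", DOCS_URL)
        return overlapped

Pools that contributed to the returned set would then simply be skipped by the
scaling pass, which mirrors the intent of the change above.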