From: Kamoltat
Date: Thu, 12 May 2022 12:22:13 +0000 (+0000)
Subject: pybind/mgr/pg_autoscaler: change overlapping roots to warning
X-Git-Tag: v18.0.0~310^2
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=e8490dae9fb9596c68fb4dc05ac8b0f6adb305b8;p=ceph.git

pybind/mgr/pg_autoscaler: change overlapping roots to warning

Change the log level of overlapping roots from ``Error`` to ``Warning``.
Point the user to documentation that explains overlapping roots. Add more
information about overlapping roots to the autoscaler documentation,
including the steps needed to get rid of the warning.

Fixes: https://tracker.ceph.com/issues/55611
Signed-off-by: Kamoltat
---

diff --git a/doc/rados/operations/placement-groups.rst b/doc/rados/operations/placement-groups.rst
index d8d1a532bb659..c471ff8bcd94e 100644
--- a/doc/rados/operations/placement-groups.rst
+++ b/doc/rados/operations/placement-groups.rst
@@ -143,16 +143,21 @@
 example, a pool that maps to OSDs of class `ssd` and a pool that maps to OSDs
 of class `hdd` will each have optimal PG counts that depend on the number of
 those respective device types.
 
+In the case where a pool uses OSDs under two or more CRUSH roots, e.g., (shadow
+trees with both `ssd` and `hdd` devices), the autoscaler will
+issue a warning to the user in the manager log stating the name of the pool
+and the set of roots that overlap each other. The autoscaler will not
+scale any pools with overlapping roots because this can cause problems
+with the scaling process. We recommend making each pool belong to only
+one root (one OSD class) to get rid of the warning and ensure a successful
+scaling process.
+
 The autoscaler uses the `bulk` flag to determine which pool
 should start out with a full complement of PGs and only
 scales down when the usage ratio across the pool is not even. However, if
 the pool doesn't have the `bulk` flag, the pool will start out with
 minimal PGs and only when there is more usage in the pool.
-The autoscaler identifies any overlapping roots and prevents the pools
-with such roots from scaling because overlapping roots can cause problems
-with the scaling process.
-
 To create pool with `bulk` flag::
 
   ceph osd pool create <pool-name> --bulk

diff --git a/src/pybind/mgr/pg_autoscaler/module.py b/src/pybind/mgr/pg_autoscaler/module.py
index 92a149ae15645..d4758c75ab446 100644
--- a/src/pybind/mgr/pg_autoscaler/module.py
+++ b/src/pybind/mgr/pg_autoscaler/module.py
@@ -364,8 +364,11 @@ class PgAutoscaler(MgrModule):
                 if prev_root_id != root_id:
                     overlapped_roots.add(prev_root_id)
                     overlapped_roots.add(root_id)
-                    self.log.error('pool %d has overlapping roots: %s',
-                                   pool_id, overlapped_roots)
+                    self.log.warning("pool %s won't scale due to overlapping roots: %s",
+                                     pool['pool_name'], overlapped_roots)
+                    self.log.warning("Please See: https://docs.ceph.com/en/"
+                                     "latest/rados/operations/placement-groups"
+                                     "/#automated-scaling")
                     break
         if not s:
             s = CrushSubtreeResourceStatus()
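For illustration, the behavior this patch documents and downgrades to a warning can be sketched in standalone form. This is a simplified model, not the actual `module.py` logic: the function names (`find_overlapping_roots`, `scalable_pools`) and the dict-based inputs are hypothetical, and real root ids come from the CRUSH map rather than hand-built sets.

```python
import logging

logging.basicConfig(level=logging.WARNING, format="%(levelname)s %(message)s")
log = logging.getLogger("pg_autoscaler.sketch")


def find_overlapping_roots(root_osds):
    """root_osds: dict mapping CRUSH root id -> set of OSD ids under it.
    Two roots overlap when they share at least one OSD (e.g. a device-class
    shadow tree overlapping the default root)."""
    overlapped = set()
    roots = list(root_osds)
    for i, a in enumerate(roots):
        for b in roots[i + 1:]:
            if root_osds[a] & root_osds[b]:
                overlapped.update((a, b))
    return overlapped


def scalable_pools(pools, root_osds):
    """pools: dict mapping pool name -> root id its CRUSH rule resolves to.
    Pools whose root overlaps another root are skipped with a warning,
    mirroring the autoscaler behavior described above."""
    overlapped = find_overlapping_roots(root_osds)
    result = []
    for name, root in pools.items():
        if root in overlapped:
            log.warning("pool %s won't scale due to overlapping roots: %s",
                        name, overlapped)
            continue
        result.append(name)
    return result
```

In this model, pinning every pool's CRUSH rule to a single device class (so each pool resolves to exactly one non-shared root) empties the overlap set, which is the remedy the documentation change recommends.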