git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/commit
pybind/mgr/pg_autoscaler: Use bytes_used for actual_raw_used 51921/head
author Kamoltat <ksirivad@redhat.com>
Fri, 2 Jun 2023 20:06:52 +0000 (20:06 +0000)
committer Kamoltat <ksirivad@redhat.com>
Mon, 5 Jun 2023 13:12:52 +0000 (13:12 +0000)
commit 3d8ac80f61cd332b53aea1fa5799f8ccd3b01b66
tree ea2827d73697744003f8f81a79ec09e4c522723b
parent c5fb47a03c06f1b341e1aaee70fcb19acb7007ba

Problem

We realized that `store` is not
the correct value to represent `actual_raw_used`
for pools with `compression` enabled.

https://github.com/ceph/ceph/pull/29986
is the PR that introduced the issue. It changed
`bytes_used` to `store` because its authors wanted
a per-pool value of bytes_used without factoring in
replication. However, they did not realize that
doing so also caused pools with compression to
inherit an incorrect value for `actual_raw_used`.

This also produced an incorrect `capacity_ratio`, which
matters because the autoscaler scales PGs according to the
`capacity_ratio` of each pool. With the existing issue,
pools with compression get a higher `capacity_ratio`
even though their actual utilization is lower than that of
non-compressed pools, assuming the same I/O workload is
applied evenly to each pool.
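
A rough illustration of the effect, using made-up numbers. The
3x replication factor, the 50% compression saving, the cluster
capacity, and the helper name `capacity_ratio` are all assumptions
for this sketch and are not taken from module.py. It also assumes
the `store`-based path multiplies the logical value by the
replication rate to approximate raw usage.

```python
# Hypothetical numbers; names follow the commit message, not module.py.
GIB = 1 << 30


def capacity_ratio(pool_raw_used: float, capacity: int) -> float:
    """Fraction of the cluster's raw capacity attributed to one pool."""
    return pool_raw_used / capacity


capacity = 1000 * GIB      # assumed raw capacity of the cluster
raw_used_rate = 3.0        # assumed 3x replication
store = 100 * GIB          # logical bytes written by clients

# Assumed: compression halves the on-disk footprint, so the raw bytes
# actually consumed are half of what the logical size would suggest.
bytes_used = int(store * raw_used_rate * 0.5)

# Deriving raw usage from the logical value ignores compression.
ratio_from_store = capacity_ratio(store * raw_used_rate, capacity)
# Using the raw on-disk value reflects the real utilization.
ratio_from_bytes_used = capacity_ratio(bytes_used, capacity)

print(f"from store:      {ratio_from_store:.2f}")
print(f"from bytes_used: {ratio_from_bytes_used:.2f}")
```

In this sketch the compressed pool appears to use twice the
capacity it really does when the ratio is derived from `store`.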

Solution

Use `bytes_used` instead of `store` when fetching
`actual_raw_used`, and when calculating `pool_raw_used`
take `max(actual_raw_used, target_bytes * raw_used_rate)`.
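
A minimal sketch of the calculation described above. The function
name `compute_pool_raw_used`, the shape of the `pool_stats` dict,
and the sample values are hypothetical; only the `max(...)`
expression and the variable names come from this commit message.

```python
def compute_pool_raw_used(pool_stats: dict, target_bytes: int,
                          raw_used_rate: float) -> float:
    """Return the raw usage the autoscaler should plan for.

    pool_stats is a hypothetical per-pool stats dict; the real
    structure in module.py may differ.
    """
    # Raw bytes on disk, which already reflect compression.
    actual_raw_used = pool_stats["bytes_used"]
    # Plan for whichever is larger: current usage or the raw
    # footprint implied by the configured target size.
    return max(actual_raw_used, target_bytes * raw_used_rate)


# Hypothetical compressed pool: 150 raw bytes used, 3x replication.
stats = {"bytes_used": 150}
no_target = compute_pool_raw_used(stats, target_bytes=0, raw_used_rate=3.0)
with_target = compute_pool_raw_used(stats, target_bytes=100, raw_used_rate=3.0)
print(no_target, with_target)
```

With no target set, the current raw usage wins; with a target of
100 bytes at 3x replication, the planned raw footprint of 300
bytes is larger and is used instead.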

Fixes:

https://tracker.ceph.com/issues/54136

Signed-off-by: Kamoltat <ksirivad@redhat.com>
src/pybind/mgr/pg_autoscaler/module.py