author	Kamoltat <ksirivad@redhat.com>
	Fri, 2 Jun 2023 20:06:52 +0000 (20:06 +0000)
committer	Kamoltat <ksirivad@redhat.com>
	Tue, 19 Sep 2023 20:11:01 +0000 (20:11 +0000)
commit	c3cc098236b43be9fc4f217ab79c726bcbb7f623
tree	77b3564bdbe6497ffbae5fd7a0ddde504cd8fe71
parent	733a975b9bd6cb569eab259a848f717e2ce1998a

pybind/mgr/pg_autoscaler: Use bytes_used for actual_raw_used

Problem

`store` is not the correct value to represent
`actual_raw_used` for pools that have
compression enabled.

https://github.com/ceph/ceph/pull/29986
is the PR that introduced the issue. It changed
`bytes_used` to `store` in order to get a per-pool
value of bytes used without factoring in replication.
However, it did not account for the side effect that
pools with compression would then inherit an incorrect
value for `actual_raw_used`.

This in turn produced an incorrect `capacity_ratio`,
which matters because the autoscaler scales PGs according
to the `capacity_ratio` of each pool. As a result, pools
with compression get a higher `capacity_ratio` than they
should: in reality their actual utilization is lower than
that of non-compressed pools, assuming the same I/O
workload is applied evenly to every pool.
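
A rough numeric sketch of the effect, not the autoscaler code
itself. All figures and names here are hypothetical: a 3x
replicated pool, a made-up 50% compression ratio, and the
simplifying assumption that the logical value is multiplied by
the replication rate to estimate raw usage.

```python
# Illustrative only: hypothetical numbers, not real cluster stats.
GiB = 2**30

capacity = 100 * GiB      # total raw capacity available to the pool's root
raw_used_rate = 3.0       # 3x replication
stored = 10 * GiB         # logical bytes written by clients
compression_ratio = 0.5   # assumed: on-disk size is half the logical size

# Raw usage estimated from the logical value; this ignores compression.
raw_from_logical = stored * raw_used_rate

# Raw bytes actually occupied on disk, after replication and compression.
bytes_used = stored * raw_used_rate * compression_ratio

ratio_from_logical = raw_from_logical / capacity
ratio_from_bytes_used = bytes_used / capacity

print(f"ratio from logical value: {ratio_from_logical:.2f}")
print(f"ratio from bytes_used:    {ratio_from_bytes_used:.2f}")
```

With these assumed figures, the ratio derived from the logical
value is twice the ratio derived from the bytes that are really
on disk, so the compressed pool looks fuller than it is.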

Solution

Use `bytes_used` instead of `store` when fetching
`actual_raw_used`, and calculate `pool_raw_used` as
`max(actual_raw_used, target_bytes * raw_used_rate)`.
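
The calculation described above can be sketched as a small
standalone function. This is a minimal illustration, not the
module's actual code: the function name and signature are
hypothetical, and only the `max(...)` expression comes from
this commit message.

```python
def compute_pool_raw_used(bytes_used: float,
                          target_bytes: float,
                          raw_used_rate: float) -> float:
    """Hypothetical helper mirroring the formula in this commit message."""
    # bytes_used is already a raw figure, so it is used as-is.
    actual_raw_used = bytes_used
    # target_bytes is multiplied by raw_used_rate before the comparison,
    # so that both arguments to max() are raw figures.
    return max(actual_raw_used, target_bytes * raw_used_rate)


# No target set: the measured raw usage wins.
print(compute_pool_raw_used(bytes_used=15.0, target_bytes=0.0, raw_used_rate=3.0))

# Target larger than current usage: the scaled target wins.
print(compute_pool_raw_used(bytes_used=15.0, target_bytes=10.0, raw_used_rate=3.0))
```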

Fixes: https://tracker.ceph.com/issues/54136

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 3d8ac80f61cd332b53aea1fa5799f8ccd3b01b66)