
author	Kamoltat <ksirivad@redhat.com>
	Wed, 18 Oct 2023 22:52:20 +0000 (22:52 +0000)
committer	Konstantin Shalygin <k0ste@k0ste.ru>
	Sat, 17 Aug 2024 11:01:39 +0000 (18:01 +0700)
commit	603cbf6e6da23524439fdf2ba17a7e38d96553bc
tree	1e5faf1dbb1a2a70d39ca8ccd5dfb45969ec38de
parent	3bf8d0ff30cc46e52e42024a812768d0a2b50a05

qa/tasks/ceph_manager.py: Rewrite test_pool_min_size

Problem:

The test failed in the EC pool configuration because PGs did not
reach active+clean. This was a flaw in the test itself: it thrashed
too many OSDs and checked the wrong condition.
Because we thrashed below min_size in an EC pool, the acting set did
not have enough shards for the PG to go active, so the
wait_for_recovery check failed.
Moreover, when we revived OSDs, we did not add them back into the
cluster, which threw off the live_osds count in the test.
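
To make the min_size failure mode concrete, here is a minimal sketch.
It is not code from ceph_manager.py: `pg_is_active` and
`max_safe_kills` are hypothetical helpers, and `min_size = k + 1` is
an assumed setting for the EC pool.

```python
def pg_is_active(live_shards: int, min_size: int) -> bool:
    """Toy model: a PG can serve I/O only with at least min_size live shards."""
    return live_shards >= min_size


def max_safe_kills(pool_size: int, min_size: int) -> int:
    """How many OSDs of the acting set can be down while the PG stays active."""
    return max(pool_size - min_size, 0)


# Assumed EC profile for illustration: k data shards, m coding shards.
k, m = 2, 1
pool_size = k + m   # one shard per OSD in the acting set
min_size = k + 1    # assumption: min_size is k + 1 for this pool

# With k=2, m=1 there is no headroom: killing a single OSD drops the
# PG below min_size, so it goes inactive and recovery never completes.
headroom = max_safe_kills(pool_size, min_size)
active_after_one_kill = pg_is_active(pool_size - 1, min_size)
```

Under these assumptions, any thrashing that ignores min_size leaves
the PG inactive, which matches the wait_for_recovery failure above.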

Solution:

Instead of randomly choosing OSDs to thrash,
we randomly select a PG from each pool and
thrash the OSDs in the PG's acting set until
we reach min_size, then we check to see if the
PG is still active. After that we revive all
the OSDs to see if the PG recovered cleanly.
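
The flow above can be sketched against a toy in-memory cluster. This
is only an illustration, not the code in this commit: the pool
dictionary layout and the names `thrash_pg_to_min_size` and
`check_pools` are hypothetical, and "active" is modeled simply as
having at least min_size live OSDs.

```python
import random


def thrash_pg_to_min_size(acting_set, min_size, rng):
    """Kill random OSDs from one PG's acting set until min_size remain."""
    live = list(acting_set)
    killed = []
    while len(live) > min_size:
        victim = rng.choice(live)
        live.remove(victim)
        killed.append(victim)
    return live, killed


def check_pools(pools, rng):
    """For each pool: pick one PG, thrash it to min_size, check, then revive."""
    results = {}
    for name, pool in pools.items():
        pgid = rng.choice(sorted(pool["pgs"]))      # one random PG per pool
        acting = pool["pgs"][pgid]
        live, killed = thrash_pg_to_min_size(acting, pool["min_size"], rng)
        still_active = len(live) >= pool["min_size"]  # toy "active" check
        # Revive every killed OSD and put it back in the cluster, so the
        # live set matches the original acting set again.
        live.extend(killed)
        recovered = sorted(live) == sorted(acting)
        results[name] = {"pg": pgid, "killed": killed,
                         "active": still_active, "recovered": recovered}
    return results


# Hypothetical pools: a size-3 replicated pool and a 4-shard EC pool.
pools = {
    "rep": {"min_size": 2, "pgs": {"1.0": [0, 1, 2], "1.1": [2, 3, 4]}},
    "ec": {"min_size": 3, "pgs": {"2.0": [3, 4, 5, 6]}},
}
results = check_pools(pools, random.Random(0))
```

Because the loop stops at min_size rather than going below it, every
selected PG is expected to stay active in this model.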

We removed some unnecessary parts, such as
`min_dead`, `min_live`, and `min_out`.

Also, we refactored the part where we assign
k and m for the EC pools to improve code
readability.

Fixes: https://tracker.ceph.com/issues/59172
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit 8c4768ecb3ec38c8fce209eae9fe931e974d0495)