The OSD thrasher has a race hazard: it tests whether a pool exists
and then queries the PGs in that pool. A pool can exist (it has
been added to the OSDMap) before its PGs have been created by the
OSDs, in which case the query returns no PGs. Add a sleep/retry
to mitigate the race.

Fixes: https://tracker.ceph.com/issues/70818
Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
         have the option to specify which pool you
         want the PG from.
         """
-        pgs = self.ceph_manager.get_pg_stats()
+        with safe_while(sleep=5, tries=3, action="get_pg_stats") as proceed:
+            while proceed():
+                pgs = self.ceph_manager.get_pg_stats()
+                if pgs:
+                    break
+                # If pool has just been created it might not have PGs yet
+                self.log('No pgs; trying again')
         if not pgs:
             self.log('No pgs; doing nothing')
             return
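
The sleep/retry idea in this change can be sketched on its own, without the test framework. The block below is a minimal illustration, not the code in the patch: `retry_until_nonempty`, its `fetch`, `log` and `sleeper` parameters, and the simulated responses are all hypothetical names introduced here, and the sketch returns an empty list when the attempts run out rather than relying on any particular behaviour of `safe_while`.

```python
import time
from typing import Callable, List, Optional


def retry_until_nonempty(fetch: Callable[[], List[dict]],
                         tries: int = 3,
                         sleep: float = 5.0,
                         log: Optional[Callable[[str], None]] = None,
                         sleeper: Callable[[float], None] = time.sleep) -> List[dict]:
    """Call fetch() up to `tries` times and return the first non-empty result.

    Hypothetical helper: returns an empty list if every attempt is empty.
    """
    result: List[dict] = []
    for attempt in range(1, tries + 1):
        result = fetch()
        if result:
            return result
        # A freshly created pool may not have any PGs yet.
        if log is not None:
            log('No pgs; trying again (attempt %d of %d)' % (attempt, tries))
        # Only sleep when another attempt will follow.
        if attempt < tries:
            sleeper(sleep)
    return result


# Simulated pool whose PGs only show up on the third query.
# The no-op sleeper keeps the example from actually waiting.
_responses = iter([[], [], [{'pgid': '1.0'}]])
pgs = retry_until_nonempty(lambda: next(_responses), sleeper=lambda s: None)
print(pgs)
```

With the defaults of three tries and a five second sleep, a caller waits at most ten seconds before falling through to the existing "No pgs; doing nothing" path.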