PriorityQueueBase::do_clean() shouldn't remove ClientRec instances that
still have queued requests. Otherwise, very low priority clients can
have their requests silently lost, which should never be possible.
In the OSD, this resulted in PGRecovery items being lost if queued with
background_best_effort while expanding a cluster. Such items can
legitimately sit in the queue for a long period of time as they
represent background data migration which is allowed to be starved by an
aggressive client workload. Dropping the items broke an assumption in
the OSD that every enqueued item would eventually be dequeued, which
resulted in resources being leaked.
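
The guard can be illustrated in isolation with a minimal sketch. This is
not the dmclock implementation: SketchClientRec, SketchClientMap,
sketch_clean() and sketch_demo_map() are hypothetical names, the heaps and
idle handling are omitted, and only has_request(), last_tick, erase_point,
erased_num and erase_max mirror identifiers from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified stand-in for dmclock's ClientRec.
struct SketchClientRec {
  uint64_t last_tick = 0;            // tick of the client's last activity
  std::deque<std::string> requests;  // requests still waiting to be dequeued

  bool has_request() const { return !requests.empty(); }
};

using SketchClientMap = std::map<int, std::shared_ptr<SketchClientRec>>;

// Erase up to erase_max clients whose last_tick is at or before
// erase_point, skipping any client that still has queued requests.
// Returns the number of clients erased.
inline std::size_t sketch_clean(SketchClientMap& client_map,
                                uint64_t erase_point,
                                std::size_t erase_max) {
  std::size_t erased_num = 0;
  for (auto i = client_map.begin(); i != client_map.end(); /* empty */) {
    auto i2 = i++;  // advance first so erasing i2 does not invalidate i
    if (!i2->second->has_request() &&  // the check this patch adds
        erase_point &&
        erased_num < erase_max &&
        i2->second->last_tick <= erase_point) {
      client_map.erase(i2);
      ++erased_num;
    }
  }
  return erased_num;
}

// Builds three clients: 1 is old and empty, 2 is old but still has a
// queued item, 3 is recent and empty.
inline SketchClientMap sketch_demo_map() {
  SketchClientMap m;
  m[1] = std::make_shared<SketchClientRec>();
  m[1]->last_tick = 5;
  m[2] = std::make_shared<SketchClientRec>();
  m[2]->last_tick = 5;
  m[2]->requests.push_back("background recovery item");
  m[3] = std::make_shared<SketchClientRec>();
  m[3]->last_tick = 50;
  return m;
}
```

With erase_point set to 10, only client 1 is removed. Client 2 is just as
old but survives because its queued item has not been dequeued yet, which
is the behaviour the OSD relies on.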
Fixes: https://tracker.ceph.com/issues/61594
Signed-off-by: Samuel Just <sjust@redhat.com>
     if (erase_point > 0 || idle_point > 0) {
       for (auto i = client_map.begin(); i != client_map.end(); /* empty */) {
         auto i2 = i++;
-        if (erase_point &&
+        if (!(i2->second->has_request()) &&
+            erase_point &&
             erased_num < erase_max &&
             i2->second->last_tick <= erase_point) {
           delete_from_heaps(i2->second);