From: Sage Weil
Date: Tue, 9 Apr 2019 22:12:37 +0000 (-0500)
Subject: mgr/DaemonServer: prevent pgp_num reductions from outpacing pg_num merges
X-Git-Tag: v14.2.1~23^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=refs%2Fpull%2F27547%2Fhead;p=ceph.git

mgr/DaemonServer: prevent pgp_num reductions from outpacing pg_num merges

If we are merging lots of pgs down to a much smaller number of pgs, and
the pgs are able to move quickly (faster than the merges happen), we can
end up with too many pgs on a small number of osds, triggering the max
pgs per osd limits.

Avoid this by preventing the pgp_num reductions from getting too far out
in front of the merges themselves. Basically, cap the delta between
pgp_num and pg_num to the max_misplaced ratio.

We are already limiting the movement caused by pgp_num by max_misplaced;
this effectively just makes sure that the actual merging (and pg_num
reductions) are keeping up.

Fixes: http://tracker.ceph.com/issues/38786
Signed-off-by: Sage Weil
(cherry picked from commit 76503a1438fa1f166d2c230c73ca8d7b67e6468d)
---

diff --git a/src/mgr/DaemonServer.cc b/src/mgr/DaemonServer.cc
index cf1d79bf5ef9..38143ecc227f 100644
--- a/src/mgr/DaemonServer.cc
+++ b/src/mgr/DaemonServer.cc
@@ -2601,6 +2601,22 @@ void DaemonServer::adjust_pgs()
           }
           dout(20) << " room " << room << " estmax " << estmax
                    << " delta " << delta << " next " << next << dendl;
+          if (p.get_pgp_num_target() == p.get_pg_num_target()) {
+            // since pgp_num is tracking pg_num, ceph is handling
+            // pgp_num. so, be responsible: don't let pgp_num get
+            // too far out ahead of merges (if we are merging).
+            // this avoids moving lots of unmerged pgs onto a
+            // small number of OSDs where we might blow out the
+            // per-osd pg max.
+            unsigned max_outpace_merges =
+              std::max<unsigned>(8, p.get_pg_num() * max_misplaced);
+            if (next + max_outpace_merges < p.get_pg_num()) {
+              next = p.get_pg_num() - max_outpace_merges;
+              dout(10) << " using next " << next
+                       << " to avoid outpacing merges (max_outpace_merges "
+                       << max_outpace_merges << ")" << dendl;
+            }
+          }
         }
         dout(10) << "pool " << i.first
                  << " pgp_num_target " << p.get_pgp_num_target()
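
Note (illustrative only, not part of the patch): the cap described in the
commit message can be sketched as a standalone function. The name
clamp_next_pgp_num and its parameters are hypothetical stand-ins for the
values adjust_pgs() reads from the pool (p.get_pg_num()) and from the
max_misplaced setting. The sketch assumes the product is truncated to
whole PGs and that the caller has already established that pgp_num is
tracking pg_num.

```cpp
#include <algorithm>

// Hypothetical helper, not present in DaemonServer.cc: returns the next
// pgp_num value, raised if necessary so that it trails pg_num by at most
// max(8, pg_num * max_misplaced) PGs.
unsigned clamp_next_pgp_num(unsigned next, unsigned pg_num, double max_misplaced)
{
  // Allowed gap between pg_num and pgp_num, truncated to whole PGs,
  // with a floor of 8 so that small pools can still make progress.
  unsigned max_outpace_merges =
    std::max<unsigned>(8, static_cast<unsigned>(pg_num * max_misplaced));

  // If the proposed step would leave pgp_num too far below pg_num,
  // pull it back up to the edge of the allowed gap.
  if (next + max_outpace_merges < pg_num) {
    return pg_num - max_outpace_merges;
  }
  return next;
}
```

For example, with pg_num = 1024 and max_misplaced = 0.05 the allowed gap
is 51 PGs, so a proposed next of 64 is raised to 973, while a proposed
next of 1000 is left unchanged.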