From: xie xingguo
Date: Tue, 26 Apr 2016 03:13:32 +0000 (+0800)
Subject: mon/OSDMonitor: improve reweight_by_utilization() logic
X-Git-Tag: v10.2.1~50
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=d9851351aeb6d45a2df1c107b23e77c992926d0a;p=ceph.git

mon/OSDMonitor: improve reweight_by_utilization() logic

By calling the reweight_by_utilization() method, we aim for a more even
utilization across all osds. To achieve this, we decrease the weights
of osds that are currently overloaded and, where possible, increase
the weights of osds that are currently underloaded.

However, we can't do this all at once, because that would trigger
massive pg migrations between osds. Thus we introduce a max_osds limit
to smooth the process.

The problem here is that we sort the utilization of all osds in
descending order and always try to decrease the weights of the most
overloaded osds first, since they are the most likely to hit a
nearfull/full transition soon. But we won't increase the weights of
the most underloaded (least utilized) osds at the same time, which
I think is not quite reasonable.

Actually, the best thing would probably be to iterate over the low and
high osds in parallel, and do the ones that are furthest from the
average first.

Signed-off-by: xie xingguo
(cherry picked from commit e7a32534ebc9e27f955ff2d7a8d1db511383301e)
---

diff --git a/src/mon/OSDMonitor.cc b/src/mon/OSDMonitor.cc
index 43e529afba7..3c97fce0d8b 100644
--- a/src/mon/OSDMonitor.cc
+++ b/src/mon/OSDMonitor.cc
@@ -599,10 +599,13 @@ int OSDMonitor::reweight_by_utilization(int oload,
     util_by_osd.push_back(osd_util);
   }
 
-  // sort and iterate from most to least utilized
-  std::sort(util_by_osd.begin(), util_by_osd.end(), [](std::pair l, std::pair r) {
-    return l.second > r.second;
-  });
+  // sort by absolute deviation from the mean utilization,
+  // in descending order.
+  std::sort(util_by_osd.begin(), util_by_osd.end(),
+    [average_util](std::pair l, std::pair r) {
+      return abs(l.second - average_util) > abs(r.second - average_util);
+    }
+  );
 
   OSDMap::Incremental newinc;
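
The ordering this change introduces can be tried outside the monitor with a small standalone sketch. It is not the code from OSDMonitor.cc: the OsdUtil alias, the pair element types (int id, float utilization), and the helper names mean_util() and sort_by_deviation() are assumptions made for illustration. It uses std::abs from <cmath> so that the floating-point overload is selected explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical stand-in for the entries of util_by_osd: (osd id, utilization).
// The element types are assumed; the patch text does not show them.
using OsdUtil = std::pair<int, float>;

// Mean utilization over all entries. The caller must pass a non-empty vector.
float mean_util(const std::vector<OsdUtil>& v) {
  assert(!v.empty());
  float sum = 0.0f;
  for (const auto& e : v) {
    sum += e.second;
  }
  return sum / static_cast<float>(v.size());
}

// Sort in descending order of absolute deviation from average_util, so the
// entries furthest from the mean (overloaded or underloaded) come first.
void sort_by_deviation(std::vector<OsdUtil>& v, float average_util) {
  std::sort(v.begin(), v.end(),
    [average_util](const OsdUtil& l, const OsdUtil& r) {
      // std::abs from <cmath> resolves to the float overload here.
      return std::abs(l.second - average_util) >
             std::abs(r.second - average_util);
    });
}
```

For example, with utilizations 0.50, 0.90, 0.25 and 0.75 for osds 0 through 3, the mean is 0.60 and the resulting order is osd 2, 1, 3, 0. The most underloaded osd now comes ahead of the most overloaded one, whereas the old descending sort would have put osd 2 last.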