From: Sage Weil Date: Fri, 15 Sep 2017 20:52:28 +0000 (-0400) Subject: Revert "osd/OSDMap: allow bidirectional swap of pg-upmap-items" X-Git-Tag: v13.0.1~864^2 X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=refs%2Fpull%2F17760%2Fhead;p=ceph.git Revert "osd/OSDMap: allow bidirectional swap of pg-upmap-items" This reverts commit 09af9b8afb40cc8aa629501582a75e03edf0bf2e. We need to prevent duplicates in the final result. For example, we can currently take [1,2,3] and apply [(1,2)] and get [2,2,3] or [1,2,3] and apply [(3,2)] and get [1,2,2] The rest of the system is not prepared to handle duplicates in the result set like this. The reverted commit was intended to allow [1,2,3] and [(1,2),(2,1)] to get [2,1,3] to reorder primaries. First, this bidirectional swap is hard to implement in a way that also prevents dups. For example, [1,2,3] and [(1,4),(2,3),(3,4)] would give [4,3,4] but would we just drop the last step we'd have [4,3,3] which is also invalid, etc. Simpler to just not handle bidirectional swaps. In practice, they are not needed: if you just want to choose a different primary then use primary_affinity, or pg_upmap (not pg_upmap_items). Fixes: http://tracker.ceph.com/issues/21410 Signed-off-by: Sage Weil --- diff --git a/src/osd/OSDMap.cc b/src/osd/OSDMap.cc index 9c3875b4fe99..0bd2fbe3c373 100644 --- a/src/osd/OSDMap.cc +++ b/src/osd/OSDMap.cc @@ -1923,17 +1923,28 @@ void OSDMap::_apply_upmap(const pg_pool_t& pi, pg_t raw_pg, vector *raw) co auto q = pg_upmap_items.find(pg); if (q != pg_upmap_items.end()) { - for (auto& i : *raw) { - for (auto& r : q->second) { - if (r.first != i) { - continue; - } - if (!(r.second != CRUSH_ITEM_NONE && - r.second < max_osd && - osd_weight[r.second] == 0)) { - i = r.second; - } - break; + // NOTE: this approach does not allow a bidirectional swap, + // e.g., [[1,2],[2,1]] applied to [0,1,2] -> [0,2,1]. + for (auto& r : q->second) { + // make sure the replacement value doesn't already appear + bool exists = false; + ssize_t pos = -1; + for (unsigned i = 0; i < raw->size(); ++i) { + int osd = (*raw)[i]; + if (osd == r.second) { + exists = true; + break; + } + // ignore mapping if target is marked out (or invalid osd id) + if (osd == r.first && + pos < 0 && + !(r.second != CRUSH_ITEM_NONE && r.second < max_osd && + osd_weight[r.second] == 0)) { + pos = i; + } + } + if (!exists && pos >= 0) { + (*raw)[pos] = r.second; } } }