This reverts commit
09af9b8afb40cc8aa629501582a75e03edf0bf2e.
We need to prevent duplicates in the final result. For example, we
can currently take
[1,2,3] and apply [(1,2)] and get [2,2,3]
or
[1,2,3] and apply [(3,2)] and get [1,2,2]
The rest of the system is not prepared to handle duplicates in the
result set like this.
The reverted commit was intended to allow
[1,2,3] and [(1,2),(2,1)] to get [2,1,3]
to reorder primaries. First, this bidirectional swap is hard to implement
in a way that also prevents dups. For example,
[1,2,3] and [(1,4),(2,3),(3,4)] would give [4,3,4]
but would we just drop the last step we'd have [4,3,3] which
is also invalid, etc. Simpler to just not handle bidirectional
swaps. In practice, they are not needed: if you just want to choose
a different primary then use primary_affinity, or pg_upmap
(not pg_upmap_items).
Fixes: http://tracker.ceph.com/issues/21410
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
9c73305e3ad11177d58632eba6ece5d2c0e701da)
auto q = pg_upmap_items.find(pg);
if (q != pg_upmap_items.end()) {
- for (auto& i : *raw) {
- for (auto& r : q->second) {
- if (r.first != i) {
- continue;
- }
- if (!(r.second != CRUSH_ITEM_NONE &&
- r.second < max_osd &&
- osd_weight[r.second] == 0)) {
- i = r.second;
- }
- break;
+ // NOTE: this approach does not allow a bidirectional swap,
+ // e.g., [[1,2],[2,1]] applied to [0,1,2] -> [0,2,1].
+ for (auto& r : q->second) {
+ // make sure the replacement value doesn't already appear
+ bool exists = false;
+ ssize_t pos = -1;
+ for (unsigned i = 0; i < raw->size(); ++i) {
+ int osd = (*raw)[i];
+ if (osd == r.second) {
+ exists = true;
+ break;
+ }
+ // ignore mapping if target is marked out (or invalid osd id)
+ if (osd == r.first &&
+ pos < 0 &&
+ !(r.second != CRUSH_ITEM_NONE && r.second < max_osd &&
+ osd_weight[r.second] == 0)) {
+ pos = i;
+ }
+ }
+ if (!exists && pos >= 0) {
+ (*raw)[pos] = r.second;
}
}
}