Erasure pools do not support reading from replicas, so we should drop
any request that arrives at an OSD whose rank in the acting set is > 0.
This fixes a bug where an erasure pool maps to [1,2,3], temporarily maps
to [-1,2,3], sends a request to osd.2, and then remaps back to [1,2,3].
Because the 0 shard never appears on osd.2, the request sits in the
waiting_for_pg map indefinitely and causes slow request warnings.
This problem does not come up on replicated pools because all instances of
the PG are created equal.
Fix by treating only role == 0 as a correct mapping for erasure pools;
replicated pools still accept any role >= 0.
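The check applied at both call sites can be summarized by a small
predicate. This is only an illustrative sketch: the helper name
is_valid_op_target and its bool parameter are hypothetical and do not
appear in the patch, which inlines the condition using
pi->is_erasure().

```cpp
// Hypothetical helper, not part of the patch: decides whether an OSD
// holding the given role in the acting set should keep (or queue) an op.
//   role < 0   this OSD is not in the acting set
//   role == 0  this OSD is the primary (shard 0 on an erasure pool)
//   role > 0   this OSD is a replica (or a non-zero shard)
static bool is_valid_op_target(int role, bool pool_is_erasure) {
  if (role == 0)
    return true;  // the primary is always a valid target
  // Non-primary roles are valid only on replicated pools.
  return role > 0 && !pool_is_erasure;
}
```

With this predicate, osd.2 in the [-1,2,3] example above has role 1 on an
erasure pool, so the op is discarded instead of being parked in
waiting_for_pg.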
Fixes: #9835
Signed-off-by: Sage Weil <sage@redhat.com>
vector<int> acting;
int nrep = osdmap->pg_to_acting_osds(pgid.pgid, acting);
int role = osdmap->calc_pg_role(whoami, acting, nrep);
- if (role >= 0) {
+ const pg_pool_t *pi = osdmap->get_pg_pool(pgid.pool());
+ if (role == 0 || (role > 0 && !pi->is_erasure())) {
++p; // still me
} else {
dout(10) << " discarding waiting ops for " << pgid << dendl;
if (!pg) {
dout(7) << "hit non-existent pg " << pgid << dendl;
- if (osdmap->get_pg_acting_role(pgid.pgid, whoami) >= 0) {
+ const pg_pool_t *pi = osdmap->get_pg_pool(pgid.pool());
+ int role = osdmap->get_pg_acting_role(pgid.pgid, whoami);
+ if (role == 0 || (role > 0 && !pi->is_erasure())) {
dout(7) << "we are valid target for op, waiting" << dendl;
waiting_for_pg[pgid].push_back(op);
op->mark_delayed("waiting for pg to exist locally");