git.apps.os.sepia.ceph.com Git - ceph-ci.git/commitdiff
osd: Do not send PDWs if read count > k
author Alex Ainscow <aainscow@uk.ibm.com>
Fri, 1 Aug 2025 14:09:58 +0000 (15:09 +0100)
committer Alex Ainscow <aainscow@uk.ibm.com>
Wed, 17 Sep 2025 08:43:26 +0000 (09:43 +0100)
The main point of a parity delta write (PDW), as currently implemented, is
to reduce the amount of reading performed by the primary when preparing for
a read-modify-write (RMW).

Previously, the code assumed that if a conventional RMW required any
recovery, then a PDW was always better. That assumption is incorrect: a
conventional RMW performs at most k reads for any plugin which supports
PDW. The logic now falls back to a conventional RMW whenever the PDW would
read k or more shards.
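
The decision order described above can be sketched in isolation as follows.
This is a minimal illustration, not the code in ECTransaction.cc: the function
name choose_pdw, the ShardSet alias (a stand-in for shard_id_set) and the
fallback parameter are hypothetical, and the reading of pdw_write_mode (0 =
automatic, 2 = force PDW, any other non-zero value = force conventional RMW)
is inferred from the hunk below. The real constructor continues with further
checks that this hunk does not show; they are represented here by fallback.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical stand-in for Ceph's shard_id_set.
using ShardSet = std::set<int>;

// Returns true if a parity delta write (PDW) should be attempted.
// `fallback` stands for whatever the later, unshown checks would decide.
bool choose_pdw(int pdw_write_mode,
                const ShardSet& pdw_read_shards,
                const ShardSet& readable_shards,
                std::size_t k,
                bool fallback) {
  assert(k > 0);

  // A non-zero mode is an explicit override; 2 forces PDW.
  if (pdw_write_mode != 0) {
    return pdw_write_mode == 2;
  }

  // A conventional RMW reads at most k shards, so a PDW that would read
  // k or more shards cannot save any reads.
  if (pdw_read_shards.size() >= k) {
    return false;
  }

  // If any shard the PDW needs is unreadable, it would require a
  // reconstruct, so prefer the conventional RMW.
  std::vector<int> unreadable;
  std::set_difference(pdw_read_shards.begin(), pdw_read_shards.end(),
                      readable_shards.begin(), readable_shards.end(),
                      std::back_inserter(unreadable));
  if (!unreadable.empty()) {
    return false;
  }

  return fallback;
}
```

With k = 4, a PDW that would read four shards is rejected even when all of
them are readable, whereas one that reads two readable shards passes through
to the later checks.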

This should improve performance in some minor areas.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit cffd10f3cc82e0aef29209e6e823b92bdb0291ce)

src/osd/ECTransaction.cc

index 33e4f063ec8950446f1a0ce8ed470b385aeaff84..b35687941247880be5c37e2063bfd7b0d4702350 100644 (file)
@@ -213,6 +213,10 @@ ECTransaction::WritePlanObj::WritePlanObj(
 
         if (pdw_write_mode != 0) {
           do_parity_delta_write = (pdw_write_mode == 2);
+        } else if (pdw_read_shards.size() >= sinfo.get_k()) {
+          // Even if recovery is required for a conventional RMW, PDW is not
+          // more efficient.
+          do_parity_delta_write = false;
         } else if (!shard_id_set::difference(pdw_read_shards, readable_shards).empty()) {
           // Some kind of reconstruct would be needed for PDW, so don't bother.
           do_parity_delta_write = false;