From: Sridhar Seshasayee
Date: Thu, 2 Feb 2023 12:16:27 +0000 (+0530)
Subject: osd/osd_types: use appropriate cost value for PushReplyOp
X-Git-Tag: v17.2.7~418^2~29
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=2e98d162f9dc7af9bf28d544e2fce27914f73213;p=ceph.git

osd/osd_types: use appropriate cost value for PushReplyOp

See included comments -- previous values did not account for object size.
This causes problems for mclock which is much more strict in how it
interprets costs.

Fixes: https://tracker.ceph.com/issues/58529
Signed-off-by: Samuel Just
Signed-off-by: Sridhar Seshasayee
---

diff --git a/src/osd/osd_types.cc b/src/osd/osd_types.cc
index 3a92591409f4..bd8cac4ddcc5 100644
--- a/src/osd/osd_types.cc
+++ b/src/osd/osd_types.cc
@@ -6688,9 +6688,34 @@ ostream& operator<<(ostream& out, const PushReplyOp &op)
 
 uint64_t PushReplyOp::cost(CephContext *cct) const
 {
-
-  return cct->_conf->osd_push_per_object_cost +
-    cct->_conf->osd_recovery_max_chunk;
+  if (cct->_conf->osd_op_queue == "mclock_scheduler") {
+    /* In general, we really never want to throttle PushReplyOp messages.
+     * As long as the object is smaller than osd_recovery_max_chunk (8M at
+     * time of writing this comment, so this is basically always true),
+     * processing the PushReplyOp does not cost any further IO and simply
+     * permits the object once more to be written to.
+     *
+     * In the unlikely event that the object is larger than
+     * osd_recovery_max_chunk (again, 8M at the moment, so never for common
+     * configurations of rbd and virtually never for cephfs and rgw),
+     * we *still* want to push out the next portion immediately so that we can
+     * release the object for IO.
+     *
+     * The throttling for this operation on the primary occurs at the point
+     * where we queue the PGRecoveryContext which calls into recover_missing
+     * and recover_backfill to initiate pushes.
+     * See OSD::queue_recovery_context.
+     */
+    return 1;
+  } else {
+    /* We retain this legacy behavior for WeightedPriorityQueue. It seems to
+     * require very large costs for several messages in order to do any
+     * meaningful amount of throttling. This branch should be removed after
+     * Reef.
+     */
+    return cct->_conf->osd_push_per_object_cost +
+      cct->_conf->osd_recovery_max_chunk;
+  }
 }
 
 // -- PullOp --
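
Illustration (not part of the patch): the cost selection above can be tried
outside the OSD with a minimal standalone sketch. RecoveryConfig,
push_reply_cost and check_examples are hypothetical names standing in for the
options read through cct->_conf and for PushReplyOp::cost. The value 1000 for
osd_push_per_object_cost is an assumed illustrative value; the 8M chunk size
is taken from the comment in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the three options the patch reads via cct->_conf.
struct RecoveryConfig {
  std::string osd_op_queue = "mclock_scheduler";
  uint64_t osd_push_per_object_cost = 1000;      // assumed illustrative value
  uint64_t osd_recovery_max_chunk = 8ull << 20;  // 8M, per the patch comment
};

// Mirrors the branch structure of PushReplyOp::cost() in the patch.
uint64_t push_reply_cost(const RecoveryConfig& conf) {
  if (conf.osd_op_queue == "mclock_scheduler") {
    // mclock interprets costs strictly, so the reply gets a minimal cost
    // and is effectively never throttled.
    return 1;
  }
  // Legacy behavior kept for the weighted priority queue.
  return conf.osd_push_per_object_cost + conf.osd_recovery_max_chunk;
}

// Small usage example: compare the two schedulers.
void check_examples() {
  RecoveryConfig mclock;  // defaults select the mclock branch
  assert(push_reply_cost(mclock) == 1);

  RecoveryConfig wpq;
  wpq.osd_op_queue = "wpq";
  assert(push_reply_cost(wpq) == 1000 + (8ull << 20));
}
```

With these assumed values the legacy branch yields a cost of 8389608, several
orders of magnitude above the mclock branch. That gap is why a single
expression could not serve both schedulers.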