Git - ceph.git/commit
mds: use regular dispatch for processing metrics
author Patrick Donnelly <pdonnell@redhat.com>
Wed, 24 Apr 2024 19:35:14 +0000 (15:35 -0400)
committer Patrick Donnelly <pdonnell@redhat.com>
Thu, 23 May 2024 19:38:09 +0000 (15:38 -0400)
commit be91b9effdedc596a4a567354bde0ff594537217
tree 7d6f7a63e5631b01c39566d98c7bd1a09812dd1c
parent f1882d8093fb4b60c1d1e52444565787ca4bd6cc

There have been cases where the MDS does an undesirable failover because it
misses heartbeat resets after a long recovery in up:replay.  It was observed
that the MDS was processing a flood of metrics messages from all reconnecting
clients. This likely caused undesirable MetricAggregator::lock contention in
the messenger threads while fast dispatching client metrics.

Instead, use normal dispatch, where acquiring locks is acceptable.

See-also: linux.git/f7c2f4f6ce16fb58f7d024f3e1b40023c4b43ff9
Fixes: https://tracker.ceph.com/issues/65658
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ed1fe9909338bc1bc0a29df22666e9ba11fa52fe)
src/mds/MetricAggregator.cc
src/mds/MetricAggregator.h
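
The idea in the commit message can be illustrated with a toy model. This is
only a sketch under stated assumptions, not the actual change to
MetricAggregator: ToyMetricAggregator, Message, deliver(), drain() and
total_for() are hypothetical names invented for illustration and are not
Ceph's messenger API. It shows a dispatcher that declines fast dispatch, so
the delivering (messenger) thread only enqueues the message, and the
aggregator lock is taken later by the thread that drains the queue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <mutex>

// Hypothetical message model; not Ceph's real message classes.
enum class MsgType { ClientMetrics, Other };

struct Message {
  MsgType type;
  int client_id;
  std::uint64_t value;  // stand-in for a metrics payload
};

// Toy dispatcher (assumption: illustrative only, not MetricAggregator).
class ToyMetricAggregator {
public:
  // Declining fast dispatch for every message means the delivering
  // thread never runs handle() inline.
  bool can_fast_dispatch(const Message&) const { return false; }

  // Called from a "messenger thread". With fast dispatch declined, this
  // only takes the short-lived queue lock, never the aggregator lock.
  void deliver(const Message& m) {
    if (can_fast_dispatch(m)) {
      handle(m);  // fast path: would contend on agg_lock
      return;
    }
    std::lock_guard<std::mutex> g(queue_lock);
    queue.push_back(m);
  }

  // Called from the regular "dispatch thread": drains the queue and
  // processes each message. Returns the number of messages processed.
  std::size_t drain() {
    std::deque<Message> batch;
    {
      std::lock_guard<std::mutex> g(queue_lock);
      batch.swap(queue);
    }
    for (const Message& m : batch) {
      handle(m);
    }
    assert(batch.size() <= batch.max_size());
    return batch.size();
  }

  // Sum of metric values seen so far for one client (0 if unknown).
  std::uint64_t total_for(int client_id) const {
    std::lock_guard<std::mutex> g(agg_lock);
    auto it = totals.find(client_id);
    return it == totals.end() ? 0 : it->second;
  }

private:
  // Processing that needs the aggregator lock.
  void handle(const Message& m) {
    if (m.type != MsgType::ClientMetrics) {
      return;  // this toy ignores other message types
    }
    std::lock_guard<std::mutex> g(agg_lock);
    totals[m.client_id] += m.value;
  }

  mutable std::mutex agg_lock;  // analogous in role to an aggregator lock
  std::mutex queue_lock;        // protects only the pending queue
  std::deque<Message> queue;
  std::map<int, std::uint64_t> totals;
};
```

In this model, calling deliver() twice for the same client leaves
total_for() unchanged until drain() runs on the dispatch thread, so a burst
of metrics messages costs the delivering threads only a queue push.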