From: Kamoltat Sirivadhna
Date: Fri, 23 Aug 2024 20:24:36 +0000 (+0000)
Subject: doc/rados/operations/health-checks: Add MON_NETSPLIT Warning
X-Git-Tag: v20.3.0~20^2
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=a5248f5c4cc964ffbf5028a85556f8ea533288f6;p=ceph.git

doc/rados/operations/health-checks: Add MON_NETSPLIT Warning

Fixes: https://tracker.ceph.com/issues/67371

Signed-off-by: Kamoltat Sirivadhna
---

diff --git a/doc/rados/operations/health-checks.rst b/doc/rados/operations/health-checks.rst
index 134a1d469d627..5bd0ca070bf36 100644
--- a/doc/rados/operations/health-checks.rst
+++ b/doc/rados/operations/health-checks.rst
@@ -153,6 +153,24 @@ To adjust the warning threshold, run the following command:
 
    ceph config set global mon_data_size_warn
 
+MON_NETSPLIT
+____________
+
+A network partition has occurred among Ceph Monitors. This health check is
+raised when one or more monitors detect that at least two Ceph Monitors have
+lost connectivity or reachability, based on their individual connection scores,
+which are frequently updated. This warning only appears when
+the cluster is provisioned with at least three Ceph Monitors and are using the
+``connectivity`` election strategy.
+
+Network partitions are reported in two ways:
+- As location-level netsplits (e.g., "Netsplit detected between dc1 and dc2") when
+  all monitors in one location cannot communicate with all monitors in another location
+- As individual monitor netsplits (e.g., "Netsplit detected between mon.a and mon.d")
+  when only specific monitors are disconnected across locations
+
+The system prioritizes reporting at the highest topology level (``datacenter``, ``rack``, etc.)
+when possible, to better help operators identify infrastructure-level network issues.
 AUTH_INSECURE_GLOBAL_ID_RECLAIM
 _______________________________
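
The two reporting modes described in the added MON_NETSPLIT text can be illustrated with a small Python sketch. This is not the Ceph Monitor implementation. The function name `report_netsplits`, its inputs (a monitor-to-location mapping and a list of disconnected monitor pairs), and the choice to report same-location pairs individually are all assumptions made for illustration. The sketch reports a location-level netsplit only when every monitor pair across two locations is disconnected, and falls back to individual monitor reports otherwise.

```python
from itertools import combinations


def report_netsplits(mon_location, disconnected):
    """Hypothetical helper: summarise disconnected monitor pairs.

    mon_location: dict mapping monitor name -> location name (e.g. "a" -> "dc1").
    disconnected: iterable of (mon, mon) pairs that cannot reach each other.
    """
    # Normalise each pair so that (a, b) and (b, a) compare equal.
    broken = {frozenset(pair) for pair in disconnected}

    # Group monitors by their location.
    by_location = {}
    for mon, loc in mon_location.items():
        by_location.setdefault(loc, []).append(mon)

    messages = []
    covered = set()
    for loc_a, loc_b in combinations(sorted(by_location), 2):
        # Every monitor pair that spans the two locations.
        cross = {
            frozenset((a, b))
            for a in by_location[loc_a]
            for b in by_location[loc_b]
        }
        # Location-level report only if ALL cross pairs are disconnected.
        if cross <= broken:
            messages.append(f"Netsplit detected between {loc_a} and {loc_b}")
            covered |= cross

    # Remaining pairs are reported per monitor (assumption: this also
    # includes pairs inside a single location).
    for first, second in sorted(tuple(sorted(p)) for p in broken - covered):
        messages.append(f"Netsplit detected between mon.{first} and mon.{second}")
    return messages


# Example topology (hypothetical): two monitors in each of dc1 and dc2, one in dc3.
topology = {"a": "dc1", "b": "dc1", "c": "dc2", "d": "dc2", "e": "dc3"}

# All dc1 <-> dc2 pairs are down, so one location-level message is produced.
print(report_netsplits(topology, [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]))

# Only one cross-location pair is down, so an individual message is produced.
print(report_netsplits(topology, [("a", "d")]))
```

Collapsing a fully severed location pair into a single message mirrors the stated preference for reporting at the highest topology level: an operator sees one infrastructure-level line instead of one line per monitor pair.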