From 739c23da1b7bad1d6443530fcce8f167581c76a6 Mon Sep 17 00:00:00 2001
From: Sage Weil <sage@redhat.com>
Date: Tue, 3 Mar 2020 10:32:41 -0600
Subject: [PATCH] doc/cephadm/monitoring: document process to set up monitoring
 with cephadm

Signed-off-by: Sage Weil <sage@redhat.com>
---
 doc/cephadm/index.rst      |  1 +
 doc/cephadm/monitoring.rst | 72 ++++++++++++++++++++++++++++++++++++++
 doc/mgr/dashboard.rst      |  2 ++
 3 files changed, 75 insertions(+)
 create mode 100644 doc/cephadm/monitoring.rst
diff --git a/doc/cephadm/index.rst b/doc/cephadm/index.rst
index 898010b2040..a6e76d3467c 100644
--- a/doc/cephadm/index.rst
+++ b/doc/cephadm/index.rst
@@ -225,6 +225,7 @@ Further Reading
     :maxdepth: 2
 
     Cephadm administration <administration>
+    Cephadm monitoring <monitoring>
     Cephadm CLI <../mgr/orchestrator>
     DriveGroups <drivegroups>
     OS recommendations <../start/os-recommendations>
diff --git a/doc/cephadm/monitoring.rst b/doc/cephadm/monitoring.rst
new file mode 100644
index 00000000000..38d0aee91ab
--- /dev/null
+++ b/doc/cephadm/monitoring.rst
@@ -0,0 +1,72 @@
+Monitoring Stack with Cephadm
+=============================
+
+The Ceph dashboard uses Prometheus, Grafana, and related tools to store
+and visualize detailed metrics on cluster utilization and performance.
+Ceph users have three options:
+
+#. Have cephadm deploy and configure these services.
+#. Deploy and configure these services manually.  This is recommended for users
+   who already run Prometheus in their environment, and for cases where
+   Ceph is running in Kubernetes with Rook.
+#. Skip the monitoring stack completely.  Some Ceph dashboard graphs will
+   not be available.
+
+Deploying monitoring with cephadm
+---------------------------------
+
+To deploy a basic monitoring stack:
+
+#. Enable the prometheus module in the ceph-mgr daemon.  This exposes the internal Ceph metrics so that Prometheus can scrape them::
+
+     ceph mgr module enable prometheus
+
+#. Deploy a node-exporter service on every node of the cluster.  The node-exporter provides host-level metrics such as CPU and memory utilization::
+
+     ceph orch apply node-exporter all:true
+
+#. Deploy alertmanager::
+
+     ceph orch apply alertmanager 1
+
+#. Deploy prometheus.  A single Prometheus instance is sufficient, but
+   for high availability (HA) you may want to deploy two::
+
+     ceph orch apply prometheus 1    # or 2
+
+#. Deploy grafana::
+
+     ceph orch apply grafana 1
+
+Cephadm generates the Prometheus, Grafana, and Alertmanager
+configurations automatically.
+
+It may take a minute or two for the services to be deployed.  Once deployment
+completes, ``ceph orch ls`` should show output similar to the following::
+
+  $ ceph orch ls
+  NAME           RUNNING  REFRESHED  IMAGE NAME                                      IMAGE ID        SPEC
+  alertmanager       1/1  6s ago     docker.io/prom/alertmanager:latest              0881eb8f169f  present
+  crash              2/2  6s ago     docker.io/ceph/daemon-base:latest-master-devel  mix           present
+  grafana            1/1  0s ago     docker.io/pcuzner/ceph-grafana-el8:latest       f77afcf0bcf6   absent
+  node-exporter      2/2  6s ago     docker.io/prom/node-exporter:latest             e5a616e4b9cf  present
+  prometheus         1/1  6s ago     docker.io/prom/prometheus:latest                e935122ab143  present
+
+
+Deploying monitoring manually
+-----------------------------
+
+If you have an existing Prometheus monitoring infrastructure, or would like
+to manage the monitoring stack yourself, you need to configure it to
+integrate with your Ceph cluster.
+
+* Enable the prometheus module in the ceph-mgr daemon::
+
+     ceph mgr module enable prometheus
+
+  By default, ceph-mgr exposes Prometheus metrics on port 9283 on each host
+  running a ceph-mgr daemon.  Configure Prometheus to scrape these endpoints.
+
+* To enable the dashboard's Prometheus-based alerting, see :ref:`dashboard-alerting`.
+
+* To enable dashboard integration with Grafana, see :ref:`dashboard-grafana`.
diff --git a/doc/mgr/dashboard.rst b/doc/mgr/dashboard.rst
index 1070910d37c..0a645f16ce5 100644
--- a/doc/mgr/dashboard.rst
+++ b/doc/mgr/dashboard.rst
@@ -503,6 +503,8 @@ To enable SSO::
 
   $ ceph dashboard sso enable saml2
 
+.. _dashboard-alerting:
+
 Enabling Prometheus Alerting
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-- 
2.39.5