From 739c23da1b7bad1d6443530fcce8f167581c76a6 Mon Sep 17 00:00:00 2001
From: Sage Weil
Date: Tue, 3 Mar 2020 10:32:41 -0600
Subject: [PATCH] doc/cephadm/monitoring: document process to set up monitoring with cephadm

Signed-off-by: Sage Weil
---
 doc/cephadm/index.rst      |  1 +
 doc/cephadm/monitoring.rst | 72 ++++++++++++++++++++++++++++++++++++++
 doc/mgr/dashboard.rst      |  2 ++
 3 files changed, 75 insertions(+)
 create mode 100644 doc/cephadm/monitoring.rst

diff --git a/doc/cephadm/index.rst b/doc/cephadm/index.rst
index 898010b2040..a6e76d3467c 100644
--- a/doc/cephadm/index.rst
+++ b/doc/cephadm/index.rst
@@ -225,6 +225,7 @@ Further Reading
     :maxdepth: 2
 
     Cephadm administration
+    Cephadm monitoring
     Cephadm CLI <../mgr/orchestrator>
     DriveGroups
     OS recommendations <../start/os-recommendations>
diff --git a/doc/cephadm/monitoring.rst b/doc/cephadm/monitoring.rst
new file mode 100644
index 00000000000..38d0aee91ab
--- /dev/null
+++ b/doc/cephadm/monitoring.rst
@@ -0,0 +1,72 @@
+Monitoring Stack with Cephadm
+=============================
+
+The Ceph dashboard makes use of prometheus, grafana, and related tools
+to store and visualize detailed metrics on cluster utilization and
+performance. Ceph users have three options:
+
+#. Have cephadm deploy and configure these services.
+#. Deploy and configure these services manually. This is recommended for users
+   with existing prometheus services in their environment (and in cases where
+   Ceph is running in Kubernetes with Rook).
+#. Skip the monitoring stack completely. Some Ceph dashboard graphs will
+   not be available.
+
+Deploying monitoring with cephadm
+---------------------------------
+
+To deploy a basic monitoring stack:
+
+#. Enable the prometheus module in the ceph-mgr daemon. This exposes the internal Ceph metrics so that prometheus can scrape them.::
+
+     ceph mgr module enable prometheus
+
+#. Deploy a node-exporter service on every node of the cluster. The node-exporter provides host-level metrics like CPU and memory utilization.::
+
+     ceph orch apply node-exporter all:true
+
+#. Deploy alertmanager::
+
+     ceph orch apply alertmanager 1
+
+#. Deploy prometheus. A single prometheus instance is sufficient, but
+   for HA you may want to deploy two.::
+
+     ceph orch apply prometheus 1    # or 2
+
+#. Deploy grafana::
+
+     ceph orch apply grafana 1
+
+Cephadm handles the prometheus, grafana, and alertmanager
+configurations automatically.
+
+It may take a minute or two for services to be deployed. Once
+completed, you should see something like this from ``ceph orch ls``::
+
+  $ ceph orch ls
+  NAME           RUNNING  REFRESHED  IMAGE NAME                                      IMAGE ID      SPEC
+  alertmanager   1/1      6s ago     docker.io/prom/alertmanager:latest              0881eb8f169f  present
+  crash          2/2      6s ago     docker.io/ceph/daemon-base:latest-master-devel  mix           present
+  grafana        1/1      0s ago     docker.io/pcuzner/ceph-grafana-el8:latest       f77afcf0bcf6  absent
+  node-exporter  2/2      6s ago     docker.io/prom/node-exporter:latest             e5a616e4b9cf  present
+  prometheus     1/1      6s ago     docker.io/prom/prometheus:latest                e935122ab143  present
+
+
+Deploying monitoring manually
+-----------------------------
+
+If you have an existing prometheus monitoring infrastructure, or would like
+to manage it yourself, you need to configure it to integrate with your Ceph
+cluster.
+
+* Enable the prometheus module in the ceph-mgr daemon::
+
+     ceph mgr module enable prometheus
+
+  By default, ceph-mgr presents prometheus metrics on port 9283 on each host
+  running a ceph-mgr daemon. Configure prometheus to scrape these.
+
+* To enable the dashboard's prometheus-based alerting, see :ref:`dashboard-alerting`.
+
+* To enable dashboard integration with Grafana, see :ref:`dashboard-grafana`.
diff --git a/doc/mgr/dashboard.rst b/doc/mgr/dashboard.rst
index 1070910d37c..0a645f16ce5 100644
--- a/doc/mgr/dashboard.rst
+++ b/doc/mgr/dashboard.rst
@@ -503,6 +503,8 @@ To enable SSO::
 
    $ ceph dashboard sso enable saml2
 
+.. _dashboard-alerting:
+
 Enabling Prometheus Alerting
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-- 
2.39.5
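
Note on the "Deploying monitoring manually" section: the patch says to configure prometheus to scrape port 9283 on each host running a ceph-mgr daemon, but does not show what that configuration might look like. The following is a minimal sketch, not part of the patch, that renders a ``prometheus.yml`` fragment with one static target per ceph-mgr host. The function name ``ceph_scrape_config``, the job name ``ceph``, and the example hostnames are assumptions made for illustration; only the port number comes from the text above.

```python
from typing import Iterable

# Default port of the ceph-mgr prometheus module, as stated in the patch.
MGR_METRICS_PORT = 9283


def ceph_scrape_config(mgr_hosts: Iterable[str], job_name: str = "ceph") -> str:
    """Render a prometheus.yml fragment that scrapes every given ceph-mgr host.

    The job name and the host list are supplied by the caller; nothing here
    queries the cluster.
    """
    hosts = list(mgr_hosts)
    if not hosts:
        raise ValueError("at least one ceph-mgr host is required")

    lines = [
        "scrape_configs:",
        f"  - job_name: '{job_name}'",
        "    static_configs:",
        "      - targets:",
    ]
    # One target entry per host running a ceph-mgr daemon.
    for host in hosts:
        lines.append(f"          - '{host}:{MGR_METRICS_PORT}'")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical hostnames; replace with the hosts running ceph-mgr.
    print(ceph_scrape_config(["mgr-a.example.com", "mgr-b.example.com"]), end="")
```

The rendered fragment would be merged into the existing prometheus configuration by hand. Listing every ceph-mgr host, rather than only the active one, follows the wording of the patch ("each host running a ceph-mgr daemon").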