Prometheus Module
=================
- Provides a Prometheus exporter to pass on Ceph performance counters
- from the collection point in ceph-mgr. Ceph-mgr receives MMgrReport
- messages from all MgrClient processes (mons and OSDs, for instance)
- with performance counter schema data and actual counter data, and keeps
- a circular buffer of the last N samples. This module creates an HTTP
- endpoint (like all Prometheus exporters) and retrieves the latest sample
- of every counter when polled (or "scraped" in Prometheus terminology).
- The HTTP path and query parameters are ignored; all extant counters
- for all reporting entities are returned in text exposition format.
- (See the Prometheus `documentation <https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details>`_.)
-
- Enabling prometheus output
+ The Manager ``prometheus`` module implements a Prometheus exporter to expose
+ Ceph performance counters from the collection point in the Manager. The
+ Manager receives ``MMgrReport`` messages from all ``MgrClient`` processes
+ (including mons and OSDs) with performance counter schema data and counter
+ data, and maintains a circular buffer of the latest samples. This module
+ listens on an HTTP endpoint and retrieves the latest sample of every counter
+ when scraped. The HTTP path and query parameters are ignored. All extant
+ counters for all reporting entities are returned in the Prometheus exposition
+ format. (See the Prometheus `documentation
+ <https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details>`_.)
+
+ Enabling Prometheus output
==========================
- The *prometheus* module is enabled with:
+ Enable the ``prometheus`` module by running the following command:
-.. prompt:: bash $
+.. prompt:: bash #
ceph mgr module enable prometheus
code (service unavailable). You can set other options using the ``ceph config
set`` commands.
- To tell the module to respond with possibly stale data, set it to ``return``:
+ To configure the module to respond with possibly stale data, set
+ the cache strategy to ``return``:
-.. prompt:: bash $
+.. prompt:: bash #
- ceph config set mgr mgr/prometheus/stale_cache_strategy return
+ ceph config set mgr mgr/prometheus/stale_cache_strategy return
- To tell the module to respond with "service unavailable", set it to ``fail``:
+ To configure the module to respond with "service unavailable", set the cache strategy to ``fail``:
-.. prompt:: bash $
+.. prompt:: bash #
ceph config set mgr mgr/prometheus/stale_cache_strategy fail
ceph config set mgr mgr/prometheus/cache false
- If you are using the prometheus module behind some kind of reverse proxy or
- loadbalancer, you can simplify discovering the active instance by switching
+ If you are using the ``prometheus`` module behind a reverse proxy or
+ load balancer, you can simplify discovery of the active instance by switching
to ``error``-mode:
-.. prompt:: bash $
+.. prompt:: bash #
ceph config set mgr mgr/prometheus/standby_behaviour error
ceph_health_detail{name="OSD_DOWN",severity="HEALTH_WARN"} 1.0
ceph_health_detail{name="PG_DEGRADED",severity="HEALTH_WARN"} 1.0
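As a rough illustration of how the ``ceph_health_detail`` samples above might be consumed, the following Python sketch parses such lines from scraped text. The helper name ``parse_health_detail`` is hypothetical, and the regular expression assumes that only the ``name`` and ``severity`` labels appear, in the order shown above. It is not part of the module.

```python
import re

# Matches lines such as:
#   ceph_health_detail{name="OSD_DOWN",severity="HEALTH_WARN"} 1.0
# Assumption: only the name and severity labels appear, in this order.
_HEALTH_LINE = re.compile(
    r'^ceph_health_detail\{name="(?P<name>[^"]+)",'
    r'severity="(?P<severity>[^"]+)"\}\s+(?P<value>\S+)$'
)


def parse_health_detail(text):
    """Return {check name: (severity, value)} for each matching line."""
    checks = {}
    for line in text.splitlines():
        match = _HEALTH_LINE.match(line.strip())
        if match:
            checks[match["name"]] = (match["severity"], float(match["value"]))
    return checks


sample = """\
ceph_health_detail{name="OSD_DOWN",severity="HEALTH_WARN"} 1.0
ceph_health_detail{name="PG_DEGRADED",severity="HEALTH_WARN"} 1.0
"""

# Collect the names of health checks whose sample value is 1.0.
active = [
    name
    for name, (_severity, value) in parse_health_detail(sample).items()
    if value == 1.0
]
print(active)
```

A real consumer would more likely query these series through Prometheus itself; the sketch only shows the shape of the data.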
- The health check history is made available through the following commands;
+ The health check history may be retrieved and cleared by running the following commands:
-::
+.. prompt:: bash #
+
+ ceph healthcheck history ls [--format {plain|json|json-pretty}]
+ ceph healthcheck history clear
- The ``ls`` command provides an overview of the health checks that the cluster has
- encountered, or since the last ``clear`` command was issued. The example below;
- ceph healthcheck history ls [--format {plain|json|json-pretty}]
- ceph healthcheck history clear
+
+ The ``ceph healthcheck history ls`` command provides an overview of the health checks that the cluster has
+ encountered since the last ``clear`` command was issued:
+
+.. prompt:: bash #
+
+ ceph healthcheck history ls
+
::
- [ceph: root@c8-node1 /]# ceph healthcheck history ls
Healthcheck Name First Seen (UTC) Last seen (UTC) Count Active
OSDMAP_FLAGS 2021/09/16 03:17:47 2021/09/16 22:07:40 2 No
OSD_DOWN 2021/09/17 00:11:59 2021/09/17 00:11:59 1 Yes
RBD IO statistics
-----------------
- The module can optionally collect RBD per-image IO statistics by enabling
- dynamic OSD performance counters. The statistics are gathered for all images
- in the pools that are specified in the ``mgr/prometheus/rbd_stats_pools``
+ The ``prometheus`` module can optionally collect RBD per-image IO statistics by enabling
+ dynamic OSD performance counters. Statistics are gathered for all images
+ in the pools that are specified by the ``mgr/prometheus/rbd_stats_pools``
configuration parameter. The parameter is a comma or space separated list
- of ``pool[/namespace]`` entries. If the namespace is not specified the
+ of ``pool[/namespace]`` entries. If the RBD namespace is not specified,
statistics are collected for all namespaces in the pool.
- Example to activate the RBD-enabled pools ``pool1``, ``pool2`` and ``poolN``:
+ To enable collection of stats for RBD pools named ``pool1``, ``pool2`` and ``poolN``:
-.. prompt:: bash $
+.. prompt:: bash #
ceph config set mgr mgr/prometheus/rbd_stats_pools "pool1,pool2,poolN"
- The wildcard can be used to indicate all pools or namespaces:
+ A wildcard can be used to indicate all pools or namespaces:
-.. prompt:: bash $
+.. prompt:: bash #
ceph config set mgr mgr/prometheus/rbd_stats_pools "*"
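To make the ``pool[/namespace]`` list format concrete, here is a minimal Python sketch of how such a value could be split into entries. The function name ``parse_rbd_stats_pools`` is hypothetical and this is not the module's own parser; it assumes only the rules described above (comma or space separators, optional namespace).

```python
import re


def parse_rbd_stats_pools(value):
    """Split a comma- or space-separated list of pool[/namespace] entries."""
    entries = []
    for item in re.split(r"[\s,]+", value.strip()):
        if not item:
            continue
        pool, _, namespace = item.partition("/")
        # In this sketch, None stands for "all namespaces in the pool".
        entries.append((pool, namespace or None))
    return entries


print(parse_rbd_stats_pools("pool1,pool2/ns1 poolN"))
print(parse_rbd_stats_pools("*"))
```

The wildcard is passed through unchanged here; how ``*`` is expanded to actual pools is left to the module.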
force refresh earlier if it detects statistics from a previously unknown
RBD image.
- Example to turn up the sync interval to 10 minutes:
+ To set the sync interval to 10 minutes, run the following command:
-.. prompt:: bash $
+.. prompt:: bash #
ceph config set mgr mgr/prometheus/rbd_stats_pools_refresh_interval 600
Ceph daemon performance counters metrics
-----------------------------------------
- With the introduction of ``ceph-exporter`` daemon, the prometheus module will no longer export Ceph daemon
- perf counters as prometheus metrics by default. However, one may re-enable exporting these metrics by setting
+ With the introduction of the ``ceph-exporter`` daemon, the ``prometheus`` module will no longer export Ceph daemon
+ perf counters as Prometheus metrics by default. However, one may re-enable exporting these metrics by setting
the module option ``exclude_perf_counters`` to ``false``:
-.. prompt:: bash $
+.. prompt:: bash #
ceph config set mgr mgr/prometheus/exclude_perf_counters false