Prometheus Module
=================
-Provides a Prometheus exporter to pass on Ceph performance counters
-from the collection point in ceph-mgr. Ceph-mgr receives MMgrReport
-messages from all MgrClient processes (mons and OSDs, for instance)
-with performance counter schema data and actual counter data, and keeps
-a circular buffer of the last N samples. This module creates an HTTP
-endpoint (like all Prometheus exporters) and retrieves the latest sample
-of every counter when polled (or "scraped" in Prometheus terminology).
-The HTTP path and query parameters are ignored; all extant counters
-for all reporting entities are returned in text exposition format.
-(See the Prometheus `documentation <https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details>`_.)
-
-Enabling prometheus output
+The Manager ``prometheus`` module implements a Prometheus exporter to expose
+Ceph performance counters from the collection point in the Manager. The
+Manager receives ``MMgrReport`` messages from all ``MgrClient`` processes
+(including mons and OSDs) with performance counter schema data and counter
+data, and maintains a circular buffer of the latest samples. This module
+listens on an HTTP endpoint and retrieves the latest sample of every counter
+when scraped. The HTTP path and query parameters are ignored. All extant
+counters for all reporting entities are returned in the Prometheus exposition
+format. (See the Prometheus `documentation
+<https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details>`_.)
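+
+For a quick look at what the exporter serves, the endpoint can be scraped
+manually. This is a minimal sketch that assumes the module is already enabled
+and listening on its default port (9283) on the active Manager host; adjust
+the host and port for your environment:
+
+ .. prompt:: bash $
+
+    curl http://localhost:9283/metrics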
+
+Enabling Prometheus output
==========================
-The *prometheus* module is enabled with:
+Enable the ``prometheus`` module by running the following command:
- .. prompt:: bash #
+ .. prompt:: bash $
ceph mgr module enable prometheus
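+
+To confirm that the module is now enabled, list the Manager modules:
+
+ .. prompt:: bash $
+
+    ceph mgr module ls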
code (service unavailable). You can set other options using the ``ceph config
set`` commands.
-To tell the module to respond with possibly stale data, set it to ``return``:
+To configure the module to respond with possibly stale data, set
+the cache strategy to ``return``:
- .. prompt:: bash #
+ .. prompt:: bash $
ceph config set mgr mgr/prometheus/stale_cache_strategy return
-To tell the module to respond with "service unavailable", set it to ``fail``:
+To configure the module to respond with "service unavailable", set it to ``fail``:
- .. prompt:: bash #
+ .. prompt:: bash $
ceph config set mgr mgr/prometheus/stale_cache_strategy fail
ceph config set mgr mgr/prometheus/cache false
-If you are using the prometheus module behind some kind of reverse proxy or
-loadbalancer, you can simplify discovering the active instance by switching
+If you are using the ``prometheus`` module behind a reverse proxy or
+load balancer, you can simplify discovery of the active instance by switching
to ``error``-mode:
- .. prompt:: bash #
+ .. prompt:: bash $
ceph config set mgr mgr/prometheus/standby_behaviour error
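+
+In ``error``-mode a standby Manager instance answers scrapes with an HTTP
+error status instead of a normal response, so a load balancer health check
+pointed at the endpoint will route traffic only to the active instance. As a
+manual spot check (a sketch; ``standby-mgr-host`` is a placeholder hostname
+and 9283 is the module's default port):
+
+ .. prompt:: bash $
+
+    curl -i http://standby-mgr-host:9283/metrics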
ceph_health_detail{name="OSD_DOWN",severity="HEALTH_WARN"} 1.0
ceph_health_detail{name="PG_DEGRADED",severity="HEALTH_WARN"} 1.0
-The health check history is made available through the following commands;
+The health check history may be retrieved and cleared by running the following commands:
- .. prompt:: bash #
+ ::
- healthcheck history ls [--format {plain|json|json-pretty}]
- healthcheck history clear
+ ceph healthcheck history ls [--format {plain|json|json-pretty}]
+ ceph healthcheck history clear
-The ``ls`` command provides an overview of the health checks that the cluster has
-encountered, or since the last ``clear`` command was issued. The example below;
+The ``ceph healthcheck history ls`` command provides an overview of the health checks that the
+cluster has encountered, or of those encountered since the last ``clear`` command was issued:
- .. prompt:: bash #
-
- ceph healthcheck history ls
-
::
+ [ceph: root@c8-node1 /]# ceph healthcheck history ls
Healthcheck Name First Seen (UTC) Last seen (UTC) Count Active
OSDMAP_FLAGS 2021/09/16 03:17:47 2021/09/16 22:07:40 2 No
OSD_DOWN 2021/09/17 00:11:59 2021/09/17 00:11:59 1 Yes
RBD IO statistics
-----------------
-The module can optionally collect RBD per-image IO statistics by enabling
-dynamic OSD performance counters. The statistics are gathered for all images
-in the pools that are specified in the ``mgr/prometheus/rbd_stats_pools``
+The ``prometheus`` module can optionally collect RBD per-image IO statistics by enabling
+dynamic OSD performance counters. Statistics are gathered for all images
+in the pools that are specified by the ``mgr/prometheus/rbd_stats_pools``
configuration parameter. The parameter is a comma or space separated list
-of ``pool[/namespace]`` entries. If the namespace is not specified the
+of ``pool[/namespace]`` entries. If the RBD namespace is not specified,
statistics are collected for all namespaces in the pool.
-Example to activate the RBD-enabled pools ``pool1``, ``pool2`` and ``poolN``:
+To enable collection of stats for the RBD pools ``pool1``, ``pool2``, and
+``poolN``, run the following command:
- .. prompt:: bash #
+ .. prompt:: bash $
ceph config set mgr mgr/prometheus/rbd_stats_pools "pool1,pool2,poolN"
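+
+A single RBD namespace can be selected using the ``pool[/namespace]`` form
+described above (``pool1`` and ``ns1`` are placeholder names):
+
+ .. prompt:: bash $
+
+    ceph config set mgr mgr/prometheus/rbd_stats_pools "pool1/ns1"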
-The wildcard can be used to indicate all pools or namespaces:
+A wildcard can be used to indicate all pools or namespaces:
- .. prompt:: bash #
+ .. prompt:: bash $
ceph config set mgr mgr/prometheus/rbd_stats_pools "*"
force refresh earlier if it detects statistics from a previously unknown
RBD image.
-Example to turn up the sync interval to 10 minutes:
+To set the sync interval to 10 minutes, run the following command:
- .. prompt:: bash #
+ .. prompt:: bash $
ceph config set mgr mgr/prometheus/rbd_stats_pools_refresh_interval 600
Ceph daemon performance counters metrics
-----------------------------------------
-With the introduction of ``ceph-exporter`` daemon, the prometheus module will no longer export Ceph daemon
-perf counters as prometheus metrics by default. However, one may re-enable exporting these metrics by setting
+With the introduction of the ``ceph-exporter`` daemon, the ``prometheus`` module no longer exports Ceph daemon
+perf counters as Prometheus metrics by default. However, you can re-enable exporting these metrics by setting
the module option ``exclude_perf_counters`` to ``false``:
- .. prompt:: bash #
+ .. prompt:: bash $
ceph config set mgr mgr/prometheus/exclude_perf_counters false
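+
+After changing this option, daemon perf counter metrics are included in the
+scrape output again. One rough way to see the effect (a sketch; 9283 is the
+module's default port and the module's metrics are prefixed with ``ceph_``)
+is to count the exported samples before and after the change:
+
+ .. prompt:: bash $
+
+    curl -s http://localhost:9283/metrics | grep -c '^ceph_'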