From: Zac Dover
Date: Wed, 23 Apr 2025 09:15:19 +0000 (+1000)
Subject: Merge branch 'main' into mgr-prom
X-Git-Tag: v20.3.0~34^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=4843cd7f522882ec021f2777a220692f424c4df9;p=ceph.git

Merge branch 'main' into mgr-prom

Signed-off-by: Zac Dover
---

4843cd7f522882ec021f2777a220692f424c4df9
diff --cc doc/mgr/prometheus.rst
index 9c2d341b753f,a3ea3f1e81ce..e61b7beb4c33
--- a/doc/mgr/prometheus.rst
+++ b/doc/mgr/prometheus.rst
@@@ -4,23 -4,23 +4,23 @@@
  Prometheus Module
  =================
  
 -Provides a Prometheus exporter to pass on Ceph performance counters
 -from the collection point in ceph-mgr. Ceph-mgr receives MMgrReport
 -messages from all MgrClient processes (mons and OSDs, for instance)
 -with performance counter schema data and actual counter data, and keeps
 -a circular buffer of the last N samples. This module creates an HTTP
 -endpoint (like all Prometheus exporters) and retrieves the latest sample
 -of every counter when polled (or "scraped" in Prometheus terminology).
 -The HTTP path and query parameters are ignored; all extant counters
 -for all reporting entities are returned in text exposition format.
 -(See the Prometheus `documentation `_.)
 -
 -Enabling prometheus output
 +The Manager ``prometheus`` module implements a Prometheus exporter to expose
 +Ceph performance counters from the collection point in the Manager. The
 +Manager receives ``MMgrReport`` messages from all ``MgrClient`` processes
 +(including mons and OSDs) with performance counter schema data and counter
 +data, and maintains a circular buffer of the latest samples. This module
 +listens on an HTTP endpoint and retrieves the latest sample of every counter
 +when scraped. The HTTP path and query parameters are ignored. All extant
 +counters for all reporting entities are returned in the Prometheus exposition
 +format. (See the Prometheus `documentation
 +`_.)
 +
 +Enabling Prometheus output
  ==========================
  
 -The *prometheus* module is enabled with:
 +Enable the ``prometheus`` module by running the following command:
  
- .. prompt:: bash #
+ .. prompt:: bash $
  
     ceph mgr module enable prometheus
  
@@@ -90,16 -91,15 +90,16 @@@ This behavior can be configured. By def
  code (service unavailable). You can set other options using the
  ``ceph config set`` commands.
  
 -To tell the module to respond with possibly stale data, set it to ``return``:
 +To configure the module to respond with possibly stale data, set
 +the cache strategy to ``return``:
  
- .. prompt:: bash #
+ .. prompt:: bash $
  
     ceph config set mgr mgr/prometheus/stale_cache_strategy return
  
 -To tell the module to respond with "service unavailable", set it to ``fail``:
 +To configure the module to respond with "service unavailable", set it to ``fail``:
  
- .. prompt:: bash #
+ .. prompt:: bash $
  
     ceph config set mgr mgr/prometheus/stale_cache_strategy fail
  
@@@ -109,11 -109,11 +109,11 @@@ If you are confident that you don't req
  
     ceph config set mgr mgr/prometheus/cache false
  
 -If you are using the prometheus module behind some kind of reverse proxy or
 -loadbalancer, you can simplify discovering the active instance by switching
 +If you are using the ``prometheus`` module behind a reverse proxy or
 +load balancer, you can simplify discovery of the active instance by switching
  to ``error``-mode:
  
- .. prompt:: bash #
+ .. prompt:: bash $
  
     ceph config set mgr mgr/prometheus/standby_behaviour error
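A quick way to confirm which manager is currently serving metrics is to scrape
the endpoint by hand and check the HTTP status. This is only a sketch: it
assumes the module's default port of 9283 and a placeholder host name
``mgr-host``. With ``error`` mode set, the active manager returns the metric
payload while a standby answers with an error status::

   # Fetch the response headers and the start of the body from one manager.
   # The exporter ignores the HTTP path, so /metrics is just the conventional choice.
   curl -si http://mgr-host:9283/metrics | head -n 20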
@@@ -152,22 -152,19 +152,19 @@@ The metrics take the following form
      ceph_health_detail{name="OSD_DOWN",severity="HEALTH_WARN"} 1.0
      ceph_health_detail{name="PG_DEGRADED",severity="HEALTH_WARN"} 1.0
  
 -The health check history is made available through the following commands;
 +The health check history may be retrieved and cleared by running the following commands:
  
- .. prompt:: bash #
+ ::
  
-    healthcheck history ls [--format {plain|json|json-pretty}]
-    healthcheck history clear
+    ceph healthcheck history ls [--format {plain|json|json-pretty}]
+    ceph healthcheck history clear
  
 -The ``ls`` command provides an overview of the health checks that the cluster has
 -encountered, or since the last ``clear`` command was issued. The example below;
 +The ``ceph healthcheck history ls`` command provides an overview of the health checks that the cluster has
 +encountered since the last ``clear`` command was issued:
  
- .. prompt:: bash #
- 
-    ceph healthcheck history ls
- 
  ::
  
+    [ceph: root@c8-node1 /]# ceph healthcheck history ls
     Healthcheck Name          First Seen (UTC)      Last seen (UTC)      Count  Active
     OSDMAP_FLAGS              2021/09/16 03:17:47   2021/09/16 22:07:40  2      No
     OSD_DOWN                  2021/09/17 00:11:59   2021/09/17 00:11:59  1      Yes
  
@@@ -178,22 -175,22 +175,22 @@@ RBD IO statistics
  -----------------
  
 -The module can optionally collect RBD per-image IO statistics by enabling
 -dynamic OSD performance counters. The statistics are gathered for all images
 -in the pools that are specified in the ``mgr/prometheus/rbd_stats_pools``
 +The ``prometheus`` module can optionally collect RBD per-image IO statistics by enabling
 +dynamic OSD performance counters. Statistics are gathered for all images
 +in the pools that are specified by the ``mgr/prometheus/rbd_stats_pools``
  configuration parameter. The parameter is a comma or space separated list
 -of ``pool[/namespace]`` entries. If the namespace is not specified the
 +of ``pool[/namespace]`` entries. If the RBD namespace is not specified,
  statistics are collected for all namespaces in the pool.
  
 -Example to activate the RBD-enabled pools ``pool1``, ``pool2`` and ``poolN``:
 +To enable collection of stats for RBD pools named ``pool1``, ``pool2`` and ``poolN``:
  
- .. prompt:: bash #
+ .. prompt:: bash $
  
     ceph config set mgr mgr/prometheus/rbd_stats_pools "pool1,pool2,poolN"
  
 -The wildcard can be used to indicate all pools or namespaces:
 +A wildcard can be used to indicate all pools or namespaces:
  
- .. prompt:: bash #
+ .. prompt:: bash $
  
     ceph config set mgr mgr/prometheus/rbd_stats_pools "*"
  
@@@ -204,20 -201,20 +201,20 @@@ parameter, which defaults to 300 second
  force refresh earlier if it detects statistics from a previously unknown
  RBD image.
  
 -Example to turn up the sync interval to 10 minutes:
 +To set the sync interval to 10 minutes, run the following command:
  
- .. prompt:: bash #
+ .. prompt:: bash $
  
     ceph config set mgr mgr/prometheus/rbd_stats_pools_refresh_interval 600
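Returning to the ``mgr/prometheus/rbd_stats_pools`` parameter described above:
a ``pool/namespace`` entry restricts collection to a single RBD namespace, and
entries can be mixed in one list. A sketch with placeholder names, where
``rbd`` is a pool, ``ns1`` is a namespace inside it, and ``images`` is a second
pool collected in full:

.. prompt:: bash $

   ceph config set mgr mgr/prometheus/rbd_stats_pools "rbd/ns1 images"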
  
  Ceph daemon performance counters metrics
  -----------------------------------------
  
 -With the introduction of ``ceph-exporter`` daemon, the prometheus module will no longer export Ceph daemon
 -perf counters as prometheus metrics by default. However, one may re-enable exporting these metrics by setting
 +With the introduction of the ``ceph-exporter`` daemon, the ``prometheus`` module will no longer export Ceph daemon
 +perf counters as Prometheus metrics by default. However, you can re-enable exporting these metrics by setting
  the module option ``exclude_perf_counters`` to ``false``:
  
- .. prompt:: bash #
+ .. prompt:: bash $
  
     ceph config set mgr mgr/prometheus/exclude_perf_counters false
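Once the option is set to ``false``, daemon perf counters should appear in the
scrape output again. A rough check, sketched under the same assumptions as the
earlier ``curl`` example (default port 9283, placeholder host ``mgr-host``),
with ``ceph_osd_op`` taken only as one example of a perf-counter-derived
metric family::

   # Count exported lines whose metric name starts with ceph_osd_op;
   # a non-zero count indicates daemon perf counters are being exported.
   curl -s http://mgr-host:9283/metrics | grep -c '^ceph_osd_op'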