From ff57d7eb67076b005408d3cf45106eb1aa818932 Mon Sep 17 00:00:00 2001
From: Anthony D'Atri
Date: Mon, 26 May 2025 20:06:18 -0400
Subject: [PATCH] src/common/options: Clarify scope of scrub intervals in osd.yaml.in

Signed-off-by: Anthony D'Atri
(cherry picked from commit 1a2290d85af6840000e97cfd0a11c574da419c15)
---
 doc/cephadm/services/snmp-gateway.rst      | 30 +++++++++---------
 doc/rados/configuration/osd-config-ref.rst |  4 ++-
 src/common/options/osd.yaml.in             | 37 ++++++++++++++++------
 3 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/doc/cephadm/services/snmp-gateway.rst b/doc/cephadm/services/snmp-gateway.rst
index f927fdfd0a3..6487042408f 100644
--- a/doc/cephadm/services/snmp-gateway.rst
+++ b/doc/cephadm/services/snmp-gateway.rst
@@ -3,9 +3,9 @@ SNMP Gateway Service
 ====================
 
 SNMP_ is still a widely used protocol, to monitor distributed systems and devices across a variety of hardware
-and software platforms. Ceph's SNMP integration focuses on forwarding alerts from it's Prometheus Alertmanager
-cluster to a gateway daemon. The gateway daemon, transforms the alert into an SNMP Notification and sends
-it on to a designated SNMP management platform. The gateway daemon is from the snmp_notifier_ project,
+and software platforms. Ceph's SNMP integration focuses on forwarding alerts from its Prometheus Alertmanager
+cluster to a gateway daemon. The gateway daemon transforms the alert into an SNMP Notification and sends
+it on to a designated SNMP management platform. The gateway daemon is from the ``snmp_notifier``_ project,
 which provides SNMP V2c and V3 support (authentication and encryption).
 
 Ceph's SNMP gateway service deploys one instance of the gateway by default. You may increase this
@@ -22,7 +22,7 @@ The table below shows the SNMP versions that are supported by the gateway implem
 ================ =========== ===============================================
  SNMP Version     Supported   Notes
 ================ =========== ===============================================
- V1                  ❌        Not supported by snmp_notifier
+ V1                  ❌        Not supported by ``snmp_notifier``
  V2c                 ✔
  V3 authNoPriv       ✔        uses username/password authentication, without
                               encryption (NoPriv = no privacy)
@@ -35,8 +35,8 @@ Deploying an SNMP Gateway
 =========================
 Both SNMP V2c and V3 provide credentials support. In the case of V2c, this is just the community string - but for V3
 environments you must provide additional authentication information. These credentials are not supported on the command
-line when deploying the service. Instead, you must create the service using a credentials file (in yaml format), or
-specify the complete service definition in a yaml file.
+line when deploying the service. Instead, you must create the service using a credentials file (in YAML format), or
+specify the complete service definition in a YAML file.
 
 Command format
 --------------
@@ -99,13 +99,13 @@ with the file containing the following configuration
 
 SNMP V3 (authNoPriv)
 --------------------
-Deploying an snmp-gateway service supporting SNMP V3 with authentication only, would look like this;
+Deploying an snmp-gateway service supporting SNMP V3 with authentication only would look like this:
 
 .. prompt:: bash #
 
    ceph orch apply snmp-gateway --snmp-version=V3 --engine-id=800C53F000000 --destination=192.168.122.1:162 -i ./snmpv3_creds.yml
 
-with a credentials file as;
+with a credentials file of the following form:
 
 .. code-block:: yaml
 
@@ -113,7 +113,7 @@ with a credentials file as;
     snmp_v3_auth_username: myuser
     snmp_v3_auth_password: mypassword
 
-or as a service configuration file
+Alternately a ``ceph orch`` service configuration file of the following form:
 
 .. code-block:: yaml
 
@@ -134,13 +134,13 @@ or as a service configuration file
 
 SNMP V3 (authPriv)
 ------------------
-Defining an SNMP V3 gateway service that implements authentication and privacy (encryption), requires two additional values
+To define an SNMP V3 gateway service that implements authentication and privacy (encryption), supply two additional values:
 
 .. prompt:: bash #
 
    ceph orch apply snmp-gateway --snmp-version=V3 --engine-id=800C53F000000 --destination=192.168.122.1:162 --privacy-protocol=AES -i ./snmpv3_creds.yml
 
-with a credentials file as;
+with a credentials file of the following form:
 
 .. code-block:: yaml
 
@@ -152,7 +152,7 @@ with a credentials file as;
 
 .. note::
 
-   The credentials are stored on the host, restricted to the root user and passed to the snmp_notifier daemon as
+   The credentials are stored on the host, restricted to the ``root`` user and passed to the ``snmp_notifier`` daemon as
    an environment file (``--env-file``), to limit exposure.
 
 
@@ -165,7 +165,7 @@ alert that has an OID_ label to the SNMP gateway daemon for processing.
 
 Implementing the MIB
 ======================
-To make sense of the SNMP Notification/Trap, you'll need to apply the MIB to your SNMP management platform. The MIB (CEPH-MIB.txt) can
-downloaded from the main Ceph repo_
+To make sense of SNMP notifications and traps, you'll need to apply the MIB to your SNMP management platform. The MIB (``CEPH-MIB.txt``) can
+downloaded from the main Ceph GitHub repository_
 
-.. _repo: https://github.com/ceph/ceph/tree/master/monitoring/snmp
+.. _repository: https://github.com/ceph/ceph/tree/master/monitoring/snmp
diff --git a/doc/rados/configuration/osd-config-ref.rst b/doc/rados/configuration/osd-config-ref.rst
index f43f3727a53..0dc8be8ce12 100644
--- a/doc/rados/configuration/osd-config-ref.rst
+++ b/doc/rados/configuration/osd-config-ref.rst
@@ -50,7 +50,9 @@ automatically.
 When using Filestore, the journal size should be at least twice the product of the expected drive
 speed multiplied by ``filestore_max_sync_interval``. However, the most common
 practice is to partition the journal drive (often an SSD), and mount it such
-that Ceph uses the entire partition for the journal.
+that Ceph uses the entire partition for the journal. Note that Filestore has been
+deprecated for several releases and any legacy Filestore OSDs should be migrated
+to BlueStore.
 
 .. confval:: osd_uuid
 .. confval:: osd_data
diff --git a/src/common/options/osd.yaml.in b/src/common/options/osd.yaml.in
index 009a2ac7558..b3e2a96396d 100644
--- a/src/common/options/osd.yaml.in
+++ b/src/common/options/osd.yaml.in
@@ -342,9 +342,9 @@ options:
 - name: osd_scrub_min_interval
   type: float
   level: advanced
-  desc: Scrub each PG no more often than this interval
-  fmt_desc: The minimal interval in seconds for scrubbing the Ceph OSD Daemon
-    when the Ceph Storage Cluster load is low.
+  desc: The desired interval between scrubs of a specific PG. Note that this option
+    must be set at ``global`` scope, or for both ``mgr`` and``osd``.
+  fmt_desc: The desired interval in seconds between scrubs of a specific PG.
   default: 1_day
   see_also:
   - osd_scrub_max_interval
@@ -353,9 +353,9 @@ options:
 - name: osd_scrub_max_interval
   type: float
   level: advanced
-  desc: Scrub each PG no less often than this interval
-  fmt_desc: The maximum interval in seconds for scrubbing the Ceph OSD Daemon
-    irrespective of cluster load.
+  desc: Scrub each PG no less often than this interval. Note that this option
+    must be set at ``global`` scope, or for both ``mgr`` and``osd``.
+  fmt_desc: The maximum interval in seconds for scrubbing each PG.
   default: 7_day
   see_also:
   - osd_scrub_min_interval
@@ -492,11 +492,30 @@ options:
 - name: osd_deep_scrub_interval
   type: float
   level: advanced
-  desc: Deep scrub each PG (i.e., verify data checksums) at least this often
-  fmt_desc: The interval for "deep" scrubbing (fully reading all data). The
-    ``osd_scrub_load_threshold`` does not affect this setting.
+  desc: Deep scrub each PG (i.e., verify data checksums) at least this often. Note that this option
+    must be set at ``global`` scope, or for both ``mgr`` and``osd``.
+  fmt_desc: The interval for "deep" scrubbing (fully reading all data).
   default: 7_day
   with_legacy: true
+- name: osd_deep_scrub_interval_cv
+  type: float
+  level: advanced
+  desc: determining the amount of variation in the deep scrub interval
+  long_desc: deep scrub intervals are varied by a random amount to prevent
+    stampedes. This parameter determines the amount of variation.
+    Technically ``osd_deep_scrub_interval_cv`` is the coefficient of variation for
+    the deep scrub interval.
+  fmt_desc: The coefficient of variation for the deep scrub interval, specified as a
+    ratio. On average, the next deep scrub for a PG is scheduled osd_deep_scrub_interval
+    after the last deep scrub . The actual time is randomized to a normal distribution
+    with a standard deviation of osd_deep_scrub_interval * osd_deep_scrub_interval_cv
+    (clamped to within 2 standard deviations).
+    The default value guarantees that 95% of deep scrubs will be scheduled in the range
+    [0.8 * osd_deep_scrub_interval, 1.2 * osd_deep_scrub_interval].
+  min: 0
+  max: 0.4
+  default: 0.2
+  with_legacy: false
 - name: osd_deep_scrub_randomize_ratio
   type: float
   level: advanced
-- 
2.39.5