====================
SNMP_ is still a widely used protocol, to monitor distributed systems and devices across a variety of hardware
-and software platforms. Ceph's SNMP integration focuses on forwarding alerts from it's Prometheus Alertmanager
-cluster to a gateway daemon. The gateway daemon, transforms the alert into an SNMP Notification and sends
-it on to a designated SNMP management platform. The gateway daemon is from the snmp_notifier_ project,
+and software platforms. Ceph's SNMP integration focuses on forwarding alerts from its Prometheus Alertmanager
+cluster to a gateway daemon. The gateway daemon transforms the alert into an SNMP Notification and sends
+it on to a designated SNMP management platform. The gateway daemon is from the `snmp_notifier`_ project,
which provides SNMP V2c and V3 support (authentication and encryption).
Ceph's SNMP gateway service deploys one instance of the gateway by default. You may increase this
 ================ =========== ===============================================
 SNMP Version     Supported   Notes
 ================ =========== ===============================================
-V1               ❌           Not supported by snmp_notifier
+V1               ❌           Not supported by ``snmp_notifier``
 V2c              ✔
 V3 authNoPriv    ✔           uses username/password authentication, without
                              encryption (NoPriv = no privacy)
=========================
Both SNMP V2c and V3 provide credentials support. In the case of V2c, this is just the community string - but for V3
environments you must provide additional authentication information. These credentials are not supported on the command
-line when deploying the service. Instead, you must create the service using a credentials file (in yaml format), or
-specify the complete service definition in a yaml file.
+line when deploying the service. Instead, you must create the service using a credentials file (in YAML format), or
+specify the complete service definition in a YAML file.
Command format
--------------
SNMP V3 (authNoPriv)
--------------------
-Deploying an snmp-gateway service supporting SNMP V3 with authentication only, would look like this;
+Deploying an snmp-gateway service supporting SNMP V3 with authentication only would look like this:
.. prompt:: bash #
ceph orch apply snmp-gateway --snmp-version=V3 --engine-id=800C53F000000 --destination=192.168.122.1:162 -i ./snmpv3_creds.yml
-with a credentials file as;
+with a credentials file of the following form:
.. code-block:: yaml
snmp_v3_auth_username: myuser
snmp_v3_auth_password: mypassword
-or as a service configuration file
+Alternatively, use a ``ceph orch`` service configuration file of the following form:
.. code-block:: yaml
SNMP V3 (authPriv)
------------------
-Defining an SNMP V3 gateway service that implements authentication and privacy (encryption), requires two additional values
+To define an SNMP V3 gateway service that implements authentication and privacy (encryption), supply two additional values:
.. prompt:: bash #
ceph orch apply snmp-gateway --snmp-version=V3 --engine-id=800C53F000000 --destination=192.168.122.1:162 --privacy-protocol=AES -i ./snmpv3_creds.yml
-with a credentials file as;
+with a credentials file of the following form:
.. code-block:: yaml
.. note::
- The credentials are stored on the host, restricted to the root user and passed to the snmp_notifier daemon as
+ The credentials are stored on the host, restricted to the ``root`` user and passed to the ``snmp_notifier`` daemon as
an environment file (``--env-file``), to limit exposure.
Implementing the MIB
======================
-To make sense of the SNMP Notification/Trap, you'll need to apply the MIB to your SNMP management platform. The MIB (CEPH-MIB.txt) can
-downloaded from the main Ceph repo_
+To make sense of SNMP notifications and traps, you'll need to apply the MIB to your SNMP management platform. The MIB (``CEPH-MIB.txt``) can be
+downloaded from the main Ceph GitHub repository_.
-.. _repo: https://github.com/ceph/ceph/tree/master/monitoring/snmp
+.. _repository: https://github.com/ceph/ceph/tree/master/monitoring/snmp
- name: osd_scrub_min_interval
type: float
level: advanced
- desc: Scrub each PG no more often than this interval
- fmt_desc: The minimal interval in seconds for scrubbing the Ceph OSD Daemon
- when the Ceph Storage Cluster load is low.
+ desc: The desired interval between scrubs of a specific PG. Note that this option
+ must be set at ``global`` scope, or for both ``mgr`` and ``osd``.
+ fmt_desc: The desired interval in seconds between scrubs of a specific PG.
default: 1_day
see_also:
- osd_scrub_max_interval
- name: osd_scrub_max_interval
type: float
level: advanced
- desc: Scrub each PG no less often than this interval
- fmt_desc: The maximum interval in seconds for scrubbing the Ceph OSD Daemon
- irrespective of cluster load.
+ desc: Scrub each PG no less often than this interval. Note that this option
+ must be set at ``global`` scope, or for both ``mgr`` and ``osd``.
+ fmt_desc: The maximum interval in seconds for scrubbing each PG.
default: 7_day
see_also:
- osd_scrub_min_interval
- name: osd_deep_scrub_interval
type: float
level: advanced
- desc: Deep scrub each PG (i.e., verify data checksums) at least this often
- fmt_desc: The interval for "deep" scrubbing (fully reading all data). The
- ``osd_scrub_load_threshold`` does not affect this setting.
+ desc: Deep scrub each PG (i.e., verify data checksums) at least this often. Note that this option
+ must be set at ``global`` scope, or for both ``mgr`` and ``osd``.
+ fmt_desc: The interval for "deep" scrubbing (fully reading all data).
default: 7_day
with_legacy: true
+- name: osd_deep_scrub_interval_cv
+ type: float
+ level: advanced
+ desc: Determines the amount of variation in the deep scrub interval
+ long_desc: Deep scrub intervals are varied by a random amount to prevent
+ stampedes. This parameter determines the amount of variation.
+ Technically ``osd_deep_scrub_interval_cv`` is the coefficient of variation for
+ the deep scrub interval.
+ fmt_desc: The coefficient of variation for the deep scrub interval, specified as a
+ ratio. On average, the next deep scrub for a PG is scheduled ``osd_deep_scrub_interval``
+ after the last deep scrub. The actual time is randomized to a normal distribution
+ with a standard deviation of ``osd_deep_scrub_interval * osd_deep_scrub_interval_cv``
+ (clamped to within 2 standard deviations).
+ The default value guarantees that 95% of deep scrubs will be scheduled in the range
+ ``[0.8 * osd_deep_scrub_interval, 1.2 * osd_deep_scrub_interval]``.
+ min: 0
+ max: 0.4
+ default: 0.2
+ with_legacy: false
- name: osd_deep_scrub_randomize_ratio
type: float
level: advanced