with 'backfilling', which allows Ceph to set backfill operations to a lower
priority than requests to read or write data.
+.. note:: Some of these settings are automatically reset if the `mClock`_
+ scheduler is active. See `mClock backfill`_.
.. confval:: osd_max_backfills
.. confval:: osd_backfill_scan_min
the number recovery requests, threads and object chunk sizes which allows Ceph
perform well in a degraded state.
+.. note:: Some of these settings are automatically reset if the `mClock`_
+ scheduler is active. See `mClock backfill`_.
+
.. confval:: osd_recovery_delay_start
.. confval:: osd_recovery_max_active
.. confval:: osd_recovery_max_active_hdd
.. _pool: ../../operations/pools
.. _Configuring Monitor/OSD Interaction: ../mon-osd-interaction
.. _Monitoring OSDs and PGs: ../../operations/monitoring-osd-pg#peering
+.. _mClock: ../mclock-config-ref
+.. _mClock backfill: ../mclock-config-ref#recovery-backfill-options
.. _Pool & PG Config Reference: ../pool-pg-config-ref
.. _Journal Config Reference: ../journal-ref
.. _cache target dirty high ratio: ../../operations/pools#cache-target-dirty-high-ratio
Ceph provides a number of settings to manage the load spike associated with the
reassignment of PGs to an OSD (especially a new OSD). The ``osd_max_backfills``
setting specifies the maximum number of concurrent backfills to and from an OSD
-(default: 1). The ``backfill_full_ratio`` setting allows an OSD to refuse a
+(default: 1). If the `mClock`_ scheduler is active, this value cannot be
+changed unless ``osd_mclock_override_recovery_settings`` is set to ``true``;
+see `mClock backfill`_.
+The ``backfill_full_ratio`` setting allows an OSD to refuse a
backfill request if the OSD is approaching its full ratio (default: 90%). This
setting can be changed with the ``ceph osd set-backfillfull-ratio`` command. If
an OSD refuses a backfill request, the ``osd_backfill_retry_interval`` setting
.. _data placement: ../data-placement
.. _pool: ../pools
.. _placement group: ../placement-groups
+.. _mClock: ../../configuration/mclock-config-ref
+.. _mClock backfill: ../../configuration/mclock-config-ref#recovery-backfill-options
.. _Architecture: ../../../architecture
.. _OSD Not Running: ../../troubleshooting/troubleshooting-osd#osd-not-running
.. _Troubleshooting PG Errors: ../../troubleshooting/troubleshooting-pg#troubleshooting-pg-errors
in recovery and 1 shard of another recovering PG.
fmt_desc: The maximum number of backfills allowed to or from a single OSD.
Note that this is applied separately for read and write operations.
+ This setting is automatically reset when the mClock scheduler is used.
default: 1
+ see_also:
+ - osd_mclock_override_recovery_settings
flags:
- runtime
with_legacy: true
fmt_desc: Time in seconds to sleep before the next recovery or backfill op.
Increasing this value will slow down recovery operation while
client operations will be less impacted.
+ note: This setting is ignored when the mClock scheduler is used.
default: 0
flags:
- runtime
desc: Time in seconds to sleep before next recovery or backfill op for HDDs
fmt_desc: Time in seconds to sleep before next recovery or backfill op
for HDDs.
+ note: This setting is ignored when the mClock scheduler is used.
default: 0.1
flags:
- runtime
desc: Time in seconds to sleep before next recovery or backfill op for SSDs
fmt_desc: Time in seconds to sleep before the next recovery or backfill op
for SSDs.
+ note: This setting is ignored when the mClock scheduler is used.
default: 0
see_also:
- osd_recovery_sleep
on HDD and journal is on SSD
fmt_desc: Time in seconds to sleep before the next recovery or backfill op
when OSD data is on HDD and OSD journal / WAL+DB is on SSD.
+ note: This setting is ignored when the mClock scheduler is used.
default: 0.025
see_also:
- osd_recovery_sleep
fmt_desc: Time in seconds to sleep before next snap trim op.
Increasing this value will slow down snap trimming.
This option overrides backend specific variants.
+ note: This setting is ignored when the mClock scheduler is used.
default: 0
flags:
- runtime
type: float
level: advanced
desc: Time in seconds to sleep before next snap trim for HDDs
+ note: This setting is ignored when the mClock scheduler is used.
default: 5
flags:
- runtime
desc: Time in seconds to sleep before next snap trim for SSDs
fmt_desc: Time in seconds to sleep before next snap trim op
for SSD OSDs (including NVMe).
+ note: This setting is ignored when the mClock scheduler is used.
default: 0
flags:
- runtime
is on SSD
fmt_desc: Time in seconds to sleep before next snap trim op
when OSD data is on an HDD and the OSD journal or WAL+DB is on an SSD.
+ note: This setting is ignored when the mClock scheduler is used.
default: 2
flags:
- runtime
desc: Maximum concurrent scrubs on a single OSD
fmt_desc: The maximum number of simultaneous scrub operations for
a Ceph OSD Daemon.
+ note: This setting is ignored when the mClock scheduler is used.
default: 3
with_legacy: true
- name: osd_scrub_during_recovery
- name: osd_scrub_sleep
type: float
level: advanced
- desc: Duration to inject a delay during scrubbing
- fmt_desc: Time to sleep before scrubbing the next group of chunks. Increasing this value will slow
- down the overall rate of scrubbing so that client operations will be less impacted.
+ desc: Duration (in seconds) of delay injected between chunks when scrubbing
+ fmt_desc: Sleep time in seconds before scrubbing the next group of objects (the next chunk).
+ Increasing this value will slow down the overall rate of scrubbing, reducing scrub
+ impact on client operations.
+ note: This setting is ignored when the mClock scheduler is used.
default: 0
flags:
- runtime
- name: osd_scrub_extended_sleep
type: float
level: advanced
- desc: Duration to inject a delay during scrubbing out of scrubbing hours
+ desc: Duration (in seconds) of delay injected between chunks when scrubbing
+ outside of scrubbing hours
+ fmt_desc: Sleep time in seconds before scrubbing the next group of objects (the next chunk).
+ This setting applies when scrubbing runs outside of the configured scrubbing hours.
+ Increasing this value will slow down the overall rate of scrubbing, reducing scrub
+ impact on client operations.
+ note: This setting is ignored when the mClock scheduler is used.
default: 0
see_also:
- osd_scrub_begin_hour
is ``0``, which means that the ``hdd`` or ``ssd`` values
(below) are used, depending on the type of the primary
device backing the OSD.
+ This setting is automatically reset when the mClock scheduler is used.
default: 0
see_also:
- osd_recovery_max_active_hdd
- osd_recovery_max_active_ssd
+ - osd_mclock_override_recovery_settings
flags:
- runtime
with_legacy: true
devices)
fmt_desc: The number of active recovery requests per OSD at one time, if the
primary device is rotational.
+ note: This setting is automatically reset when the mClock scheduler is used.
default: 3
see_also:
- osd_recovery_max_active
- osd_recovery_max_active_ssd
+ - osd_mclock_override_recovery_settings
flags:
- runtime
with_legacy: true
solid state devices)
fmt_desc: The number of active recovery requests per OSD at one time, if the
primary device is non-rotational (i.e., an SSD).
+ note: This setting is automatically reset when the mClock scheduler is used.
default: 10
see_also:
- osd_recovery_max_active
- osd_recovery_max_active_hdd
+ - osd_mclock_override_recovery_settings
flags:
- runtime
with_legacy: true
below)
fmt_desc: Time in seconds to sleep before the next removal transaction. This
throttles the PG deletion process.
+ note: This setting is ignored when the mClock scheduler is used.
default: 0
flags:
- runtime
- name: osd_delete_sleep_hdd
type: float
level: advanced
- desc: Time in seconds to sleep before next removal transaction for HDDs
+ desc: Time in seconds to sleep before next removal transaction for HDDs.
+ note: This setting is ignored when the mClock scheduler is used.
default: 5
flags:
- runtime
type: float
level: advanced
desc: Time in seconds to sleep before next removal transaction for SSDs
+ note: This setting is ignored when the mClock scheduler is used.
default: 1
flags:
- runtime
level: advanced
desc: Time in seconds to sleep before next removal transaction when OSD data is on HDD
and OSD journal or WAL+DB is on SSD
+ note: This setting is ignored when the mClock scheduler is used.
default: 1
flags:
- runtime