or delete some existing data to reduce utilization.
+
Data health (pools & placement groups)
---------------------------------------
PG_AVAILABILITY
_______________
+Data availability is reduced, meaning that the cluster is unable to
+service potential read or write requests for some data in the cluster.
+Specifically, one or more PGs is in a state that does not allow IO
+requests to be serviced. Problematic PG states include *peering*,
+*stale*, *incomplete*, and the lack of *active* (if those conditions do not clear
+quickly).
+
+Detailed information about which PGs are affected is available from::
+
+ ceph health detail
+
+In most cases the root cause is that one or more OSDs is currently
+down; see the discussion for ``OSD_DOWN`` above.
+
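+For example, any OSDs that are currently down can be spotted by
+filtering the OSD tree listing::
+
+ ceph osd tree | grep down
+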
+The state of specific problematic PGs can be queried with::
+
+ ceph tell <pgid> query
+
PG_DEGRADED
___________
+Data redundancy is reduced for some data, meaning the cluster does not
+have the desired number of replicas for all data (for replicated
+pools) or erasure code fragments (for erasure coded pools).
+Specifically, one or more PGs:
+
+* has the *degraded* or *undersized* flag set, meaning there are not
+ enough instances of that placement group in the cluster;
+* has not had the *clean* flag set for some time.
+
+Detailed information about which PGs are affected is available from::
+
+ ceph health detail
+
+In most cases the root cause is that one or more OSDs is currently
+down; see the discussion for ``OSD_DOWN`` above.
+
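+PGs that are stuck in these states can also be listed directly with,
+for example::
+
+ ceph pg dump_stuck degraded undersized
+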
+The state of specific problematic PGs can be queried with::
+
+ ceph tell <pgid> query
+
PG_DEGRADED_FULL
________________
+Data redundancy may be reduced or at risk for some data due to a lack
+of free space in the cluster. Specifically, one or more PGs has the
+*backfill_toofull* or *recovery_toofull* flag set, meaning that the
+cluster is unable to migrate or recover data because one or more OSDs
+is above the *backfillfull* threshold.
+
+See the discussion for *OSD_BACKFILLFULL* or *OSD_FULL* above for
+steps to resolve this condition.
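+
+Which OSDs are above the *backfillfull* threshold can be checked via
+per-OSD utilization, for example with::
+
+ ceph osd df
+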
PG_DAMAGED
__________
+Data scrubbing has discovered some problems with data consistency in
+the cluster. Specifically, one or more PGs has the *inconsistent* or
+*snaptrim_error* flag set, indicating that an earlier scrub operation
+found a problem, or that the *repair* flag is set, meaning a repair
+for such an inconsistency is currently in progress.
+
+See :doc:`pg-repair` for more information.
+
OSD_SCRUB_ERRORS
________________
+Recent OSD scrubs have uncovered inconsistencies. This error is generally
+paired with *PG_DAMAGED* (see above).
+
+See :doc:`pg-repair` for more information.
+
CACHE_POOL_NEAR_FULL
____________________
+A cache tier pool is nearly full. Full in this context is determined
+by the ``target_max_bytes`` and ``target_max_objects`` properties on
+the cache pool. Once the pool reaches the target threshold, write
+requests to the pool may block while data is flushed and evicted
+from the cache, a state that normally leads to very high latencies and
+poor performance.
+
+The cache pool target size can be adjusted with::
+
+ ceph osd pool set <cache-pool-name> target_max_bytes <bytes>
+ ceph osd pool set <cache-pool-name> target_max_objects <objects>
+
+Normal cache flush and evict activity may also be throttled due to reduced
+availability or performance of the base tier, or overall cluster load.
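+
+The current target values for the cache pool can be inspected with,
+for example::
+
+ ceph osd pool get <cache-pool-name> target_max_bytes
+ ceph osd pool get <cache-pool-name> target_max_objects
+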
TOO_FEW_PGS
___________
+The number of PGs in use in the cluster is below the configurable
+threshold of ``mon_pg_warn_min_per_osd`` PGs per OSD. This can lead
+to suboptimal distribution and balance of data across the OSDs in
+the cluster, and similarly reduce overall performance.
+
+This may be an expected condition if data pools have not yet been
+created.
+
+The PG count for existing pools can be increased or new pools can be
+created. Please refer to
+:doc:`placement-groups#Choosing-the-number-of-Placement-Groups` for
+more information.
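+
+For example, the PG count of an existing pool can be increased with
+(the new value is a placeholder; choose it based on the guidance
+referenced above)::
+
+ ceph osd pool set <pool> pg_num <value>
+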
TOO_MANY_PGS
____________
+The number of PGs in use in the cluster is above the configurable
+threshold of ``mon_pg_warn_max_per_osd`` PGs per OSD. This can lead
+to higher memory utilization for OSD daemons, slower peering after
+cluster state changes (like OSD restarts, additions, or removals), and
+higher load on the Manager and Monitor daemons.
+
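+The number of PGs currently hosted by each OSD can be seen in the
+``PGS`` column of, for example::
+
+ ceph osd df
+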
+The ``pg_num`` value for existing pools cannot currently be reduced.
+However, the ``pgp_num`` value can be reduced, which effectively
+collocates some PGs on the same sets of OSDs, mitigating some of the
+negative impacts described above. The ``pgp_num`` value can be
+adjusted with::
+
+ ceph osd pool set <pool> pgp_num <value>
+
+Please refer to
+:doc:`placement-groups#Choosing-the-number-of-Placement-Groups` for
+more information.
+
SMALLER_PGP_NUM
_______________
+One or more pools has a ``pgp_num`` value less than ``pg_num``. This
+is normally an indication that the PG count was increased without
+also increasing ``pgp_num``, which controls data placement.
+
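+The two values for a given pool can be compared with, for example::
+
+ ceph osd pool get <pool> pg_num
+ ceph osd pool get <pool> pgp_num
+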
+This is sometimes done deliberately to separate out the *split* step
+when the PG count is adjusted from the data migration that is needed
+when ``pgp_num`` is changed.
+
+This is normally resolved by setting ``pgp_num`` to match ``pg_num``,
+triggering the data migration, with::
+
+ ceph osd pool set <pool> pgp_num <pg-num-value>
+
MANY_OBJECTS_PER_PG
___________________
+One or more pools has an average number of objects per PG that is
+significantly higher than the overall cluster average. The specific
+threshold is controlled by the ``mon_pg_warn_max_object_skew``
+configuration value.
+
+This is usually an indication that the pool(s) containing most of the
+data in the cluster have too few PGs, and/or that other pools that do
+not contain as much data have too many PGs. See the discussion of
+*TOO_MANY_PGS* above.
+
+The threshold can be raised to silence the health warning by adjusting
+the ``mon_pg_warn_max_object_skew`` config option on the monitors.
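+
+For example, to raise the threshold at runtime (the value ``20`` here
+is only illustrative, and a change made via ``injectargs`` does not
+persist across monitor restarts)::
+
+ ceph tell mon.* injectargs '--mon_pg_warn_max_object_skew 20'
+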
POOL_FULL
_________
+One or more pools has reached (or is very close to reaching) its
+quota. The threshold to trigger this error condition is controlled by
+the ``mon_pool_quota_crit_threshold`` configuration option.
+
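+Current per-pool usage and any quotas can be reviewed with, for
+example::
+
+ ceph df detail
+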
+Pool quotas can be adjusted up or down (or removed) with::
+
+ ceph osd pool set-quota <pool> max_bytes <bytes>
+ ceph osd pool set-quota <pool> max_objects <objects>
+
+Setting the quota value to 0 will disable the quota.
+
POOL_NEAR_FULL
______________
+One or more pools is approaching its quota. The threshold to trigger
+this warning condition is controlled by the
+``mon_pool_quota_warn_threshold`` configuration option.
+
+Pool quotas can be adjusted up or down (or removed) with::
+
+ ceph osd pool set-quota <pool> max_bytes <bytes>
+ ceph osd pool set-quota <pool> max_objects <objects>
+
+Setting the quota value to 0 will disable the quota.
+
OBJECT_MISPLACED
________________
+One or more objects in the cluster is not stored on the node where
+the cluster would like it to be stored. This is an indication that
+data migration due to some recent cluster change has not yet completed.
+
+Misplaced data is not a dangerous condition in and of itself; data
+consistency is never at risk, and old copies of objects are never
+removed until the desired number of new copies (in the desired
+locations) are present.
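+
+The fraction of objects that are misplaced, and the progress of the
+associated data migration, can be monitored with, for example::
+
+ ceph -s
+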
OBJECT_UNFOUND
______________
+One or more objects in the cluster cannot be found. Specifically, the
+OSDs know that a new or updated copy of an object should exist, but a
+copy of that version of the object has not been found on OSDs that are
+currently online.
+
+Read or write requests to unfound objects will block.
+
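+The unfound objects themselves can be listed with, for example::
+
+ ceph pg <pgid> list_missing
+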
+Ideally, a down OSD that has the more recent copy of the unfound
+object can be brought back online. Candidate OSDs can be identified
+from the peering state for the PG(s) responsible for the unfound
+object::
+
+ ceph tell <pgid> query
+
+If the latest copy of the object is not available, the cluster can be
+told to roll back to a previous version of the object. See
+:doc:`troubleshooting-pg#Unfound-objects` for more information.
+
REQUEST_SLOW
____________
+One or more OSD requests is taking a long time to process. This can
+be an indication of extreme load, a slow storage device, or a software
+bug.
+
+The request queue on the OSD(s) in question can be queried with the
+following command, executed from the OSD host::
+
+ ceph daemon osd.<id> ops
+
+A summary of the slowest recent requests can be seen with::
+
+ ceph daemon osd.<id> dump_historic_ops
+
+The location of an OSD can be found with::
+
+ ceph osd find osd.<id>
+
REQUEST_STUCK
_____________
+One or more OSD requests has been blocked for an extremely long time.
+This is an indication that either the cluster has been unhealthy for
+an extended period of time (e.g., not enough running OSDs) or there is
+some internal problem with the OSD. See the discussion of
+*REQUEST_SLOW* above.
+
PG_NOT_SCRUBBED
_______________
+One or more PGs has not been scrubbed recently. PGs are normally
+scrubbed every ``mon_scrub_interval`` seconds, and this warning
+triggers when ``mon_warn_not_scrubbed`` such intervals have elapsed
+without a scrub.
+
+PGs will not scrub if they are not flagged as *clean*, which may
+happen if they are misplaced or degraded (see *PG_AVAILABILITY* and
+*PG_DEGRADED* above).
+
+You can manually initiate a scrub of a clean PG with::
+
+ ceph pg scrub <pgid>
+
PG_NOT_DEEP_SCRUBBED
____________________
+One or more PGs has not been deep scrubbed recently. PGs are normally
+deep scrubbed every ``osd_deep_scrub_interval`` seconds, and this
+warning triggers when ``mon_warn_not_deep_scrubbed`` such intervals
+have elapsed without a deep scrub.
+
+PGs will not (deep) scrub if they are not flagged as *clean*, which may
+happen if they are misplaced or degraded (see *PG_AVAILABILITY* and
+*PG_DEGRADED* above).
+
+You can manually initiate a deep scrub of a clean PG with::
+
+ ceph pg deep-scrub <pgid>
+
CephFS
------