on a small initial cluster footprint.
There is an :confval:`osd_memory_target` setting for BlueStore OSDs that
-defaults to 4GB. Factor in a prudent margin for the operating system and
+defaults to 4 GiB. Factor in a prudent margin for the operating system and
administrative tasks (like monitoring and metrics) as well as increased
-consumption during recovery: provisioning ~8GB *per BlueStore OSD* is thus
-advised.
+consumption during recovery. We recommend ensuring that total server RAM
+is greater than (number of OSDs * :confval:`osd_memory_target` * 2), which
+allows for usage by the OS and by other Ceph daemons. A 1U server with
+8-10 OSDs is thus well-provisioned with 128 GB of physical memory. Enabling
+:confval:`osd_memory_target_autotune` can help avoid out-of-memory (OOM)
+conditions under heavy load or when
+non-OSD daemons migrate onto a node. An effective :confval:`osd_memory_target` of
+at least 6 GiB can help mitigate slow requests on HDD OSDs.
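+
+As a minimal sketch (values are illustrative; the 6 GiB figure mirrors the HDD
+guidance above, and per the formula a host with ten OSDs at the default 4 GiB
+target needs more than 80 GiB of RAM), the target can be raised with the
+``ceph config`` CLI:
+
+.. code-block:: bash
+
+   # Raise the per-OSD memory budget to 6 GiB (value given in bytes: 6 * 1024^3)
+   ceph config set osd osd_memory_target 6442450944
+
+   # Confirm the configured value
+   ceph config get osd osd_memory_target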
+
Monitors and Managers (ceph-mon and ceph-mgr)
---------------------------------------------
-Monitor and manager daemon memory usage scales with the size of the
+Monitor and Manager memory usage scales with the size of the
cluster. Note that at boot-time and during topology changes and recovery these
daemons will need more RAM than they do during steady-state operation, so plan
for peak usage. For very small clusters, 32 GB suffices. For clusters of up to,
Memory
======
-Bluestore uses its own memory to cache data rather than relying on the
-operating system's page cache. In Bluestore you can adjust the amount of memory
+BlueStore uses its own memory to cache data rather than relying on the
+operating system's page cache. When using the BlueStore OSD back end, you can adjust the amount of memory
that the OSD attempts to consume by changing the :confval:`osd_memory_target`
configuration option.
may result in lower performance, and your Ceph cluster may well be
happier with a daemon that crashes vs one that slows to a crawl.
-When using the legacy FileStore back end, the OS page cache was used for caching
-data, so tuning was not normally needed. When using the legacy FileStore backend,
-the OSD memory consumption was related to the number of PGs per daemon in the
-system.
+When using the legacy Filestore back end, the OS page cache was used for caching
+data, so tuning was not normally needed. An OSD's memory consumption is related
+to its workload and the number of PGs that it serves. BlueStore OSDs do not use
+the page cache, so the autotuner is recommended to ensure that RAM is used
+fully but prudently.
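+
+A hedged sketch of enabling the autotuner and inspecting the target in effect
+on a running OSD (``osd.0`` is only an example daemon ID):
+
+.. code-block:: bash
+
+   # Let the autotuner derive each OSD's memory target from available host RAM
+   ceph config set osd osd_memory_target_autotune true
+
+   # Show the osd_memory_target currently in effect for one OSD
+   ceph config show osd.0 osd_memory_target
+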
Data Storage
For more
information on how to effectively use a mix of fast drives and slow drives in
your Ceph cluster, see the :ref:`block and block.db <bluestore-mixed-device-config>`
-section of the Bluestore Configuration Reference.
+section of the BlueStore Configuration Reference.
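+
+As an illustrative sketch only (device paths are hypothetical), ``ceph-volume``
+can place an OSD's RocksDB metadata (``block.db``) on a faster device while the
+bulk data lives on a slower one:
+
+.. code-block:: bash
+
+   # HDD for data, NVMe partition for the block.db metadata volume
+   ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
+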
Hard Disk Drives
----------------
Baseboard Management Controller (BMC)
-------------------------------------
-Your server chassis should have a Baseboard Management Controller (BMC).
+Your server chassis likely has a Baseboard Management Controller (BMC).
Well-known examples are iDRAC (Dell), CIMC (Cisco UCS), and iLO (HPE).
Administration and deployment tools may also use BMCs extensively, especially
via IPMI or Redfish, so consider the cost/benefit tradeoff of an out-of-band
-network for security and administration. Hypervisor SSH access, VM image uploads,
+network for security and administration. Hypervisor SSH access, VM image uploads,
OS image installs, management sockets, etc. can impose significant loads on a network.
Running multiple networks may seem like overkill, but each traffic path represents
a potential capacity, throughput and/or performance bottleneck that you should
carefully consider before deploying a large scale data cluster.
-Additionally BMCs as of 2023 rarely sport network connections faster than 1 Gb/s,
+Additionally, BMCs as of 2025 rarely offer network connections faster than 1 Gb/s,
so dedicated and inexpensive 1 Gb/s switches for BMC administrative traffic
-may reduce costs by wasting fewer expenive ports on faster host switches.
+may reduce costs by wasting fewer expensive ports on faster host switches.
Failure Domains