From: John Wilkins
Date: Fri, 29 Aug 2014 00:25:07 +0000 (-0700)
Subject: doc: Added sysctl max thread count discussion.
X-Git-Tag: v0.86~178
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=7948e13b9d12f3da676e93347b3f18151c424488;p=ceph.git

doc: Added sysctl max thread count discussion.

Fixes: #6142

Signed-off-by: John Wilkins
---

diff --git a/doc/rados/troubleshooting/troubleshooting-osd.rst b/doc/rados/troubleshooting/troubleshooting-osd.rst
index 8fe25f40aebc5..e67038c68755c 100644
--- a/doc/rados/troubleshooting/troubleshooting-osd.rst
+++ b/doc/rados/troubleshooting/troubleshooting-osd.rst
@@ -134,6 +134,20 @@ If you start your cluster and an OSD won't start, check the following:
   actual mounts, you may have trouble starting OSDs. If you want to store the
   journal on a block device, you should partition your journal disk and assign
   one partition per OSD.
+
+- **Check Max Threadcount:** If you have a node with a lot of OSDs, you may be
+  hitting the default maximum number of threads (e.g., usually 32k), especially
+  during recovery. You can increase the number of threads using ``sysctl`` to
+  see if increasing the maximum number of threads to the maximum possible
+  number of threads allowed (i.e., 4194303) will help. For example::
+
+	sysctl -w kernel.pid_max=4194303
+
+  If increasing the maximum thread count resolves the issue, you can make it
+  permanent by including a ``kernel.pid_max`` setting in the
+  ``/etc/sysctl.conf`` file. For example::
+
+	kernel.pid_max = 4194303
 
 - **Kernel Version:** Identify the kernel version and distribution you
   are using. Ceph uses some third party tools by default, which may be
@@ -145,6 +159,8 @@ If you start your cluster and an OSD won't start, check the following:
   (if it isn't already), and try again. If it segment faults again,
   contact the ceph-devel email list and provide your Ceph configuration
   file, your monitor output and the contents of your log file(s).
 
+
+
 If you cannot resolve the issue and the email list isn't helpful, you may
 contact `Inktank`_ for support.
diff --git a/doc/start/hardware-recommendations.rst b/doc/start/hardware-recommendations.rst
index ffbc37a58900f..da91af75faddd 100644
--- a/doc/start/hardware-recommendations.rst
+++ b/doc/start/hardware-recommendations.rst
@@ -192,6 +192,16 @@ is up to date. See `OS Recommendations`_ for notes on ``glibc`` and
 ``syncfs(2)`` to ensure that your hardware performs as expected when running
 multiple OSDs per host.
 
+Hosts with high numbers of OSDs (e.g., > 20) may spawn a lot of threads,
+especially during recovery and rebalancing. Many Linux kernels default to
+a relatively small maximum number of threads (e.g., 32k). If you encounter
+problems starting up OSDs on hosts with a high number of OSDs, consider
+setting ``kernel.pid_max`` to a higher number of threads. The theoretical
+maximum is 4,194,303 threads. For example, you could add the following to
+the ``/etc/sysctl.conf`` file::
+
+	kernel.pid_max = 4194303
+
 Networks
 ========
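
Before raising ``kernel.pid_max`` as the added text suggests, it can help to see how close a host actually is to the limit. The following is a minimal sketch, not part of the patch, assuming a Linux host with ``/proc`` mounted. The function names and the 80% warning threshold are hypothetical choices made for illustration. The count is a rough estimate: it sums the entries under each ``/proc/<pid>/task`` directory, since every thread consumes an ID from the same space that ``kernel.pid_max`` bounds.

```python
import os
from pathlib import Path


def read_pid_max(path="/proc/sys/kernel/pid_max"):
    """Return the current kernel.pid_max value as an integer."""
    return int(Path(path).read_text().strip())


def count_threads(proc_root="/proc"):
    """Roughly count live threads by listing /proc/<pid>/task for each PID."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        task_dir = os.path.join(proc_root, entry, "task")
        try:
            total += len(os.listdir(task_dir))
        except OSError:
            # The process exited, or is not readable, between the two listings.
            continue
    return total


def usage_ratio(threads, pid_max):
    """Fraction of the ID space currently in use."""
    return threads / pid_max


if __name__ == "__main__":
    WARN_AT = 0.8  # hypothetical threshold, chosen only for this example
    if os.path.exists("/proc/sys/kernel/pid_max"):
        limit = read_pid_max()
        threads = count_threads()
        ratio = usage_ratio(threads, limit)
        print(f"threads={threads} pid_max={limit} usage={ratio:.1%}")
        if ratio >= WARN_AT:
            print("Consider raising kernel.pid_max (see the sysctl example above).")
    else:
        print("kernel.pid_max is not exposed here; a Linux /proc is expected.")
```

Running this on an OSD host during recovery, when thread counts peak, gives a more representative reading than running it on an idle cluster.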