From: Sage Weil
Date: Wed, 3 Jun 2015 18:57:34 +0000 (-0400)
Subject: upstart: limit respawn to 3 in 30 mins (instead of 5 in 30s)
X-Git-Tag: v0.94.4~23^2
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=b3822f113e07547194b844f647bcb7d45513b25f;p=ceph.git

upstart: limit respawn to 3 in 30 mins (instead of 5 in 30s)

It may take tens of seconds to restart each time, so 5 in 30s does not stop
the crash on startup respawn loop in many cases.  In particular, we'd like
to catch the case where the internal heartbeats fail.

This should be enough for all but the most sluggish of OSDs and capture
many cases of failure shortly after startup.

Fixes: #11798
Signed-off-by: Sage Weil
(cherry picked from commit eaff6cb24ef052c54dfa2131811758e335f19939)
---

diff --git a/src/upstart/ceph-mds.conf b/src/upstart/ceph-mds.conf
index 77841cdccd736..4063d9116ebce 100644
--- a/src/upstart/ceph-mds.conf
+++ b/src/upstart/ceph-mds.conf
@@ -4,7 +4,7 @@ start on ceph-mds
 stop on runlevel [!2345] or stopping ceph-mds-all
 
 respawn
-respawn limit 5 30
+respawn limit 3 1800
 
 limit nofile 16384 16384
 
diff --git a/src/upstart/ceph-mon.conf b/src/upstart/ceph-mon.conf
index 0279f15c5a8bf..83c98583c5d69 100644
--- a/src/upstart/ceph-mon.conf
+++ b/src/upstart/ceph-mon.conf
@@ -4,7 +4,7 @@ start on ceph-mon
 stop on runlevel [!2345] or stopping ceph-mon-all
 
 respawn
-respawn limit 5 30
+respawn limit 3 1800
 
 limit nofile 16384 16384
 
diff --git a/src/upstart/ceph-osd.conf b/src/upstart/ceph-osd.conf
index d0205eec6bfa5..2438c206f292b 100644
--- a/src/upstart/ceph-osd.conf
+++ b/src/upstart/ceph-osd.conf
@@ -4,7 +4,7 @@ start on ceph-osd
 stop on runlevel [!2345] or stopping ceph-osd-all
 
 respawn
-respawn limit 5 30
+respawn limit 3 1800
 
 limit nofile 327680 327680
 
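
Illustration of the reasoning in the commit message (not part of the patch). The sketch below is a simplified model, not Upstart's implementation: it assumes that `respawn limit COUNT INTERVAL` counts respawns inside a fixed window that opens at the first respawn, stops the job once the count exceeds COUNT, and starts a fresh window when a respawn arrives after INTERVAL seconds have passed. The function name `respawns_before_stop` and the `cycle` parameter (seconds from one crash to the next, including restart time) are hypothetical and chosen only for this example.

```python
def respawns_before_stop(limit, interval, cycle, max_respawns=1000):
    """Return the respawn number at which the job is stopped, or None.

    Assumed model (a simplification, not Upstart's code):
      - the daemon crashes every `cycle` seconds;
      - the first respawn opens a window of `interval` seconds;
      - each respawn inside the window increments a counter;
      - the job is stopped when the counter exceeds `limit`;
      - a respawn after the window has expired opens a new window.
    """
    window_start = None
    count = 0
    for n in range(1, max_respawns + 1):
        now = n * cycle  # time of the n-th respawn
        if window_start is not None and now - window_start < interval:
            count += 1
            if count > limit:
                return n  # respawn loop detected, job stopped
        else:
            # Window expired (or first respawn): start counting again.
            window_start = now
            count = 1
    return None  # limit never tripped within max_respawns


if __name__ == "__main__":
    # A daemon that takes 20 seconds per crash/restart cycle.
    print(respawns_before_stop(limit=5, interval=30, cycle=20))
    print(respawns_before_stop(limit=3, interval=1800, cycle=20))
```

Under this model, a daemon with a 20-second crash cycle never trips `respawn limit 5 30`, because at most two respawns fit in any 30-second window, while `respawn limit 3 1800` stops it at the fourth respawn.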