From: Dmitry Smirnov
Date: Sat, 29 Mar 2014 00:59:24 +0000 (+1100)
Subject: init: fix OSD startup issue
X-Git-Tag: v0.79~60^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=44afc2332e3ff6e64d8aa8fb0d160aeb60dbcc85;p=ceph.git

init: fix OSD startup issue

On machines that host both a MON and OSDs, the OSDs start shortly after
the MON at boot, but the MON needs time to become operational. With the
short timeout the OSDs do not have enough time to establish communication
with the cluster, so they fail to start.

This is even more likely to happen when other monitors are down, which is
not unusual when servers are rebooting after a power failure.

Increasing the timeout from 10 to 30 seconds significantly improves the
chances of a successful OSD start.

Signed-off-by: Dmitry Smirnov
---

diff --git a/src/init-ceph.in b/src/init-ceph.in
index c27ca341aeff..846bd573cef7 100644
--- a/src/init-ceph.in
+++ b/src/init-ceph.in
@@ -327,7 +327,7 @@ for name in $what; do
             get_conf osd_weight "" "osd crush initial weight"
             defaultweight="$(df -P -k $osd_data/. | tail -1 | awk '{ print sprintf("%.2f",$2/1073741824) }')"
             get_conf osd_keyring "$osd_data/keyring" "keyring"
-            do_cmd "timeout 10 $BINDIR/ceph -c $conf --name=osd.$id --keyring=$osd_keyring osd crush create-or-move -- $id ${osd_weight:-${defaultweight:-1}} $osd_location"
+            do_cmd "timeout 30 $BINDIR/ceph -c $conf --name=osd.$id --keyring=$osd_keyring osd crush create-or-move -- $id ${osd_weight:-${defaultweight:-1}} $osd_location"
         fi
     fi
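
Note on the timeout behaviour: the change relies on `timeout` to bound how long the `ceph ... osd crush create-or-move` call may block while the MON is still coming up. The sketch below illustrates what happens when that limit expires. It is not part of init-ceph: `run_with_limit` is a hypothetical helper, `sleep` stands in for the slow `ceph` call, and the exit status 124 assumes GNU coreutils `timeout` run without `--preserve-status`.

```shell
#!/bin/sh
# Hypothetical helper (not in init-ceph): run a command under a time limit
# and report whether it finished on its own or was stopped by the limit.
# Assumption: GNU coreutils `timeout`, which exits with status 124 when the
# limit expires before the command completes.
run_with_limit() {
    limit="$1"
    shift
    status=0
    # Capture the exit status without aborting under `set -e`.
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s"
    else
        echo "finished with status $status"
    fi
    return "$status"
}
```

For example, `run_with_limit 1 sleep 3` reports `timed out after 1s`, while `run_with_limit 5 true` reports `finished with status 0`. A command that needs longer than the limit is stopped in the same way, which is how a 10 second limit could end the `ceph` call before the MON was ready to answer.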