From: Sage Weil
Date: Mon, 28 Oct 2013 22:56:15 +0000 (-0700)
Subject: init-ceph: make crush update on osd start time out
X-Git-Tag: v0.72-rc1~8^2~1
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=177e2ab1cad325b875249a514bc1774ff32e0074;p=ceph.git

init-ceph: make crush update on osd start time out

If the monitor is not currently available, this crush update would block
forever, preventing the OSD and (potentially) the rest of the system from
starting up.  Instead, make it time out after 10 seconds and then abort
startup.

This prevents startup of an OSD if we failed to update the CRUSH position
for some reason.  In fact, do not start up the OSD if the CRUSH update
fails for any reason--not just a timeout!

Works-around: #5612
Signed-off-by: Sage Weil
---

diff --git a/src/init-ceph.in b/src/init-ceph.in
index 46877d755587..1a80a42b03ec 100644
--- a/src/init-ceph.in
+++ b/src/init-ceph.in
@@ -324,7 +324,7 @@ for name in $what; do
 		    get_conf osd_weight "" "osd crush initial weight"
 		    defaultweight="$(do_cmd "df $osd_data/. | tail -1 | awk '{ d= \$2/1073741824 ; r = sprintf(\"%.2f\", d); print r }'")"
 		    get_conf osd_keyring "$osd_data/keyring" "keyring"
-		    do_cmd "$BINDIR/ceph \
+		    do_cmd "timeout 10 $BINDIR/ceph \
 			--name=osd.$id \
 			--keyring=$osd_keyring \
 			osd crush create-or-move \
@@ -333,8 +333,7 @@ for name in $what; do
 			${osd_weight:-${defaultweight:-1}} \
 			root=default \
 			host=$host \
-			$osd_location \
-			|| :"
+			$osd_location"
 		fi
 	    fi
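
For reviewers who want to see the two behaviours this patch combines, here is a minimal sketch. It is not part of the commit and does not call ceph: `sleep 5` is a hypothetical stand-in for a `ceph osd crush create-or-move` call that blocks because no monitor is reachable, and the limit is shortened to 1 second so the demo finishes quickly. It assumes the `timeout` utility from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical stand-in: "sleep 5" plays the role of a blocked ceph command.
# With GNU coreutils, timeout kills the command and exits non-zero
# (status 124) when the limit expires.
timeout 1 sleep 5
status=$?
echo "timed-out command exit status: $status"

# Old behaviour: the trailing "|| :" discarded any failure, so the caller
# always saw success and startup continued.
timeout 1 sleep 5 || :
masked=$?
echo "exit status with '|| :' appended: $masked"

# New behaviour: with "|| :" removed, the failure reaches the caller,
# which can then refuse to start the OSD.
if timeout 1 sleep 5; then
    result="start osd"
else
    result="abort startup"
fi
echo "$result"
```

The same propagation applies to any non-zero exit from the wrapped command, which matches the commit message's note that the OSD is not started if the CRUSH update fails for any reason, not only on a timeout.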