A timeout is needed to prevent ceph-disk from hanging forever, but
there is no good reason to set it to less than a few hours.

OSD activations must happen in sequence rather than in parallel, which
is why there is a global activation lock. An activation may therefore
spend much of its timeout waiting for that lock.

When an OSD uses a device that is not used by any other OSD (i.e. they
do not share an SSD journal device etc.), it would be possible to run
all activations in parallel. That would however require a more
extensive modification of ceph-disk to avoid any chance of races.
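
To illustrate why a short timeout is a problem when activations are
serialized, here is a rough sketch. The helper names and the numbers
(60 OSDs, 30 seconds per activation) are hypothetical and only meant to
show the arithmetic; they are not measurements and not part of
ceph-disk. The sketch assumes every activation takes the same time and
that the time spent waiting for the lock counts against the timeout.

```shell
#!/bin/sh
# Hypothetical helper: worst-case seconds until the last of N serialized
# activations finishes, assuming each one takes the same amount of time.
worst_case_wait() {
    count=$1
    per_osd=$2
    echo $((count * per_osd))
}

# Hypothetical helper: succeeds (exit 0) if the last activation would
# finish within the given timeout, fails otherwise.
fits_timeout() {
    [ "$(worst_case_wait "$1" "$2")" -le "$3" ]
}

# Illustrative numbers only: 60 OSDs, 30 seconds per activation.
if fits_timeout 60 30 300; then
    echo "300s: ok"
else
    echo "300s: timed out"
fi

if fits_timeout 60 30 10000; then
    echo "10000s: ok"
else
    echo "10000s: timed out"
fi
```

Under these assumed numbers the last activation finishes after 1800
seconds, which exceeds the old 300 second value but fits comfortably
within 10000 seconds.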
Fixes: http://tracker.ceph.com/issues/20229
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit a9eb52e0a4c06a80e5dbfaac394aac940edf4c68)
[Service]
Type=oneshot
KillMode=none
-Environment=CEPH_DISK_TIMEOUT=300
+Environment=CEPH_DISK_TIMEOUT=10000
ExecStart=/bin/sh -c 'timeout $CEPH_DISK_TIMEOUT flock /var/lock/ceph-disk-$(basename %f) /usr/sbin/ceph-disk --verbose --log-stdout trigger --sync %f'
TimeoutSec=0