From a0d935966c4cd8bfe93a3eaff1ea82510eada20d Mon Sep 17 00:00:00 2001
From: Kefu Chai
Date: Thu, 13 Jul 2017 16:02:24 +0800
Subject: [PATCH] doc: add instructions for replacing an OSD

* 8/ceph.rst: `rm` subcommand removes osd from osdmap, not the cluster.
  the latter is more ambiguous in different contexts.
* rados/operations/add-or-rm-osds.rst: add a subsection of "Replacing an
  OSD". update the subsection of "Removing the OSD" with
  "ceph osd purge" command
* release-notes.rst: link from it to the new subsection in
  add-or-rm-osds.rst

Signed-off-by: Kefu Chai
---
 doc/man/8/ceph.rst                      |  3 +-
 doc/rados/operations/add-or-rm-osds.rst | 70 +++++++++++++++++++------
 doc/release-notes.rst                   |  3 +-
 3 files changed, 57 insertions(+), 19 deletions(-)

diff --git a/doc/man/8/ceph.rst b/doc/man/8/ceph.rst
index 339d5c3d192..80f3d9fb337 100644
--- a/doc/man/8/ceph.rst
+++ b/doc/man/8/ceph.rst
@@ -969,7 +969,8 @@ Usage::
     ceph osd reweight-by-utilization {}
     {--no-increasing}
 
-Subcommand ``rm`` removes osd(s) [...] in the cluster.
+Subcommand ``rm`` removes osd(s) [...] from the OSD map.
+
 
 Usage::
 
diff --git a/doc/rados/operations/add-or-rm-osds.rst b/doc/rados/operations/add-or-rm-osds.rst
index d82add3b90c..6ee6d797141 100644
--- a/doc/rados/operations/add-or-rm-osds.rst
+++ b/doc/rados/operations/add-or-rm-osds.rst
@@ -165,6 +165,32 @@ weight).
    subsequent releases.
 
 
+Replacing an OSD
+----------------
+
+When disks fail, or when an administrator wants to reprovision OSDs with a new
+backend (for instance, to switch from FileStore to BlueStore), OSDs need to
+be replaced. Unlike `Removing the OSD`_, a replaced OSD's id and CRUSH map entry
+need to be kept intact after the OSD is destroyed for replacement.
+
+#. Destroy the OSD first::
+
+     ceph osd destroy {id} --yes-i-really-mean-it
+
+#. Zap a disk for the new OSD, if the disk was used before for other purposes.
+   It's not necessary for a new disk::
+
+     ceph-disk zap /dev/sdX
+
+#. Prepare the disk for replacement by using the previously destroyed OSD id::
+
+     ceph-disk prepare --bluestore /dev/sdX --osd-id {id} --osd-uuid `uuidgen`
+
+#. Activate the OSD::
+
+     ceph-disk activate /dev/sdX1
+
+
 Starting the OSD
 ----------------
 
@@ -287,6 +313,32 @@ key, removes the OSD from the OSD map, and removes the OSD from the
 ``ceph.conf`` file. If your host has multiple drives, you may need to remove an
 OSD for each drive by repeating this procedure.
 
+#. Let the cluster forget the OSD first. This step removes the OSD from the CRUSH
+   map, removes its authentication key, and removes it from the OSD map as
+   well. Note that the `purge subcommand`_ was introduced in Luminous; for older
+   versions, see below ::
+
+     ceph osd purge {id} --yes-i-really-mean-it
+
+#. Navigate to the host where you keep the master copy of the cluster's
+   ``ceph.conf`` file. ::
+
+     ssh {admin-host}
+     cd /etc/ceph
+     vim ceph.conf
+
+#. Remove the OSD entry from your ``ceph.conf`` file (if it exists). ::
+
+     [osd.1]
+         host = {hostname}
+
+#. From the host where you keep the master copy of the cluster's ``ceph.conf`` file,
+   copy the updated ``ceph.conf`` file to the ``/etc/ceph`` directory of other
+   hosts in your cluster.
+
+If your Ceph cluster is older than Luminous, instead of using ``ceph osd purge``,
+you need to perform this step manually:
+
 
 #. Remove the OSD from the CRUSH map so that it no longer receives data. You may
    also decompile the CRUSH map, remove the OSD from the device list, remove the
@@ -308,23 +360,7 @@ OSD for each drive by repeating this procedure.
      ceph osd rm {osd-num}
      #for example
      ceph osd rm 1
-
-#. Navigate to the host where you keep the master copy of the cluster's
-   ``ceph.conf`` file. ::
 
-     ssh {admin-host}
-     cd /etc/ceph
-     vim ceph.conf
-
-#. Remove the OSD entry from your ``ceph.conf`` file (if it exists). ::
-
-     [osd.1]
-         host = {hostname}
-
-#. From the host where you keep the master copy of the cluster's ``ceph.conf`` file,
-   copy the updated ``ceph.conf`` file to the ``/etc/ceph`` directory of other
-   hosts in your cluster.
-
-
 
 .. _Remove an OSD: ../crush-map#removeosd
+.. _purge subcommand: ../../../man/ceph#osd
diff --git a/doc/release-notes.rst b/doc/release-notes.rst
index 9b40925f672..64b186b8ec5 100644
--- a/doc/release-notes.rst
+++ b/doc/release-notes.rst
@@ -67,7 +67,7 @@ Major Changes from Kraken
   * There is now a *backoff* mechanism that prevents OSDs from being
     overloaded by requests to objects or PGs that are not currently able to
     process IO.
-  * There is a *simplified OSD replacement process* that is more robust. FIXME DOCS
+  * There is a `simplified OSD replacement process`_ that is more robust.
   * You can query the supported features and (apparent) releases of
     all connected daemons and clients with ``ceph features``. FIXME DOCS
   * You can configure the oldest Ceph client version you wish to allow to
@@ -238,6 +238,7 @@ Major Changes from Kraken
   - ``ceph tell help`` will now return a usage summary.
 
 .. _Read more about EC overwrites: ../rados/operations/erasure-code/#erasure-coding-with-overwrites
+.. _simplified OSD replacement process: ../rados/operations/add-or-rm-osds/#replacing-an-osd
 
 Major Changes from Jewel
 ------------------------
-- 
2.39.5