doc: Re-factored adding an OSD.

author John Wilkins <john.wilkins@inktank.com>

Tue, 4 Sep 2012 23:19:33 +0000 (16:19 -0700)

committer John Wilkins <john.wilkins@inktank.com>

Tue, 4 Sep 2012 23:19:33 +0000 (16:19 -0700)
diff --git a/doc/cluster-ops/add-or-rm-osds.rst b/doc/cluster-ops/add-or-rm-osds.rst

new file mode 100644 (file)

index 0000000..9114856
--- /dev/null
+++ b/doc/cluster-ops/add-or-rm-osds.rst
@@ -0,0 +1,313 @@
+======================
+ Adding/Removing OSDs
+======================
+
+When you have a cluster up and running, you may add OSDs or remove OSDs
+from the cluster at runtime. 
+
+Adding OSDs
+===========
+
+When you want to expand a cluster, you may add an OSD at runtime. With Ceph, an
+OSD is generally one Ceph ``ceph-osd`` daemon for one storage disk within a host
+machine. If your host has multiple storage disks, you may map one ``ceph-osd``
+daemon for each disk.
+
+Generally, it's a good idea to check the capacity of your cluster to see if you
+are reaching the upper end of its capacity. As your cluster reaches its ``near
+full`` ratio, you should add one or more OSDs to expand your cluster's capacity.
+
+.. warning:: Do not let your cluster reach its ``full ratio`` before
+   adding an OSD. OSD failures that occur after the cluster reaches 
+   its ``near full`` ratio may cause the cluster to exceed its
+   ``full ratio``.
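As a rough illustration of the warning above, the sketch below estimates the average utilization after one OSD fails and its data is spread over the remaining OSDs. It assumes equally sized OSDs and evenly distributed data. The function names are hypothetical, and 0.85 and 0.95 are assumed values for the ``near full`` and ``full`` ratios; check the values your cluster actually uses.

```shell
# Hypothetical helper: estimate average utilization after one of N equally
# sized OSDs fails and its data is re-replicated onto the remaining N-1.
utilization_after_failure() {
    # $1 = current utilization (0..1), $2 = number of OSDs
    awk -v u="$1" -v n="$2" 'BEGIN { printf "%.3f\n", u * n / (n - 1) }'
}

# Succeeds (exit status 0) if the projected utilization exceeds the full ratio.
exceeds_full_ratio() {
    # $1 = utilization, $2 = number of OSDs, $3 = full ratio (assumed 0.95)
    projected=$(utilization_after_failure "$1" "$2")
    awk -v p="$projected" -v f="${3:-0.95}" 'BEGIN { exit !(p > f) }'
}

# Ten OSDs at the assumed near full ratio of 0.85
utilization_after_failure 0.85 10

# Eight OSDs at the same utilization leave less headroom
if exceeds_full_ratio 0.85 8; then
    echo "8 OSDs at 85%: one failure would exceed the full ratio"
fi
```

The smaller the cluster, the larger the share of data each surviving OSD has to absorb, which is why small clusters need more headroom below the ``near full`` ratio.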
+
+Deploy your Hardware
+--------------------
+
+If you are adding a new host along with the new OSD,
+see `Hardware Recommendations`_ for details on minimum recommendations
+for OSD hardware. To add an OSD host to your cluster, first make sure you have
+an up-to-date version of Linux installed (typically Ubuntu 12.04 precise),
+and that you have made some initial preparations for your storage disks.
+See `Filesystem Recommendations`_ for details.
+
+Add your OSD host to a rack in your cluster, connect it to the network
+and ensure that it has network connectivity.
+
+.. _Hardware Recommendations: ../../install/hardware-recommendations
+.. _Filesystem Recommendations: ../../configure/file-system-recommendations
+
+Install the Required Software
+-----------------------------
+
+For manually deployed clusters, you must install Ceph packages
+manually. See `Installing Debian/Ubuntu Packages`_ for details.
+You should configure SSH to a user with password-less authentication
+and root permissions.
+
+.. _Installing Debian/Ubuntu Packages: ../../install/debian
+
+For clusters deployed with Chef, create a `chef user`_, `configure
+SSH keys`_, `install Ruby`_ and `install the Chef client`_ on your host. See 
+`Installing Chef`_ for details.
+
+.. _chef user: ../../install/chef#createuser
+.. _configure SSH keys: ../../install/chef#genkeys
+.. _install the Chef client: ../../install/chef#installchef
+.. _Installing Chef: ../../install/chef
+.. _Install Ruby: ../../install/chef#installruby
+
+Adding an OSD (Manual)
+----------------------
+
+This procedure sets up a ``ceph-osd`` daemon, configures it to use one disk,
+and configures the cluster to distribute data to the OSD. If your host has
+multiple disks, you may add an OSD for each disk by repeating this procedure.
+
+To add an OSD, create a data directory for it, mount a disk to that directory,
+add the OSD to your configuration file, add the OSD to the cluster, and then
+add it to the CRUSH map.
+
+When you add the OSD to the CRUSH map, consider the weight you give to the new
+OSD. Hard disk capacity grows by roughly 40% per year, so newer OSD hosts may have
+larger hard disks than older hosts in the cluster (i.e., they may have greater weight).
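One way to keep weights proportional to capacity is to derive them from disk size. The sketch below assumes a convention of weight 1.00 per 1 TB (taken as 1000 GB here); both the convention and the function name ``crush_weight_for_gb`` are assumptions for illustration, not something this procedure prescribes.

```shell
# Hypothetical helper: derive a CRUSH weight from disk capacity in GB,
# assuming a convention of weight 1.00 per 1 TB (1000 GB).
crush_weight_for_gb() {
    awk -v gb="$1" 'BEGIN { printf "%.2f\n", gb / 1000 }'
}

crush_weight_for_gb 1000   # an older 1 TB disk
crush_weight_for_gb 3000   # a newer 3 TB disk
```

Whatever convention you choose, apply it consistently, since CRUSH weights only matter relative to one another.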
+
+#. Create the default directory on your new OSD. :: 
+
+       ssh {new-osd-host}
+       sudo mkdir /var/lib/ceph/osd/ceph-{osd-number}
+       
+
+#. If the OSD is for a disk other than the OS disk, prepare it 
+   for use with Ceph, and mount it to the directory you just created:: 
+
+       ssh {new-osd-host}
+       sudo mkfs -t {fstype} /dev/{disk}
+       sudo mount -o user_xattr /dev/{disk} /var/lib/ceph/osd/ceph-{osd-number}
+
+
+#. Navigate to the host where you keep the master copy of the cluster's 
+   ``ceph.conf`` file. :: 
+
+       ssh {admin-host}
+       cd /etc/ceph
+       vim ceph.conf
+
+#. Add the new OSD to your ``ceph.conf`` file.
+
+   .. code-block:: ini
+
+       [osd.123]
+               host = {hostname}
+ 
+#. From the host where you keep the master copy of the cluster's 
+   ``ceph.conf`` file, copy the updated ``ceph.conf`` file to your 
+   new OSD's ``/etc/ceph`` directory and to other hosts in your cluster. :: 
+
+       ssh {new-osd} sudo tee /etc/ceph/ceph.conf < /etc/ceph/ceph.conf
+
+#. Create the OSD. ::
+
+       ceph osd create {osd-num}
+       ceph osd create 123 #for example
+       
+#. Initialize the OSD data directory. :: 
+
+       ssh {new-osd-host}
+       ceph-osd -i {osd-num} --mkfs --mkkey
+       
+   The directory must be empty before you can run ``ceph-osd``.
+
+#. Register the OSD authentication key. The ``ceph`` in ``ceph-{osd-num}``
+   in the path is the cluster name (the directory is named ``$cluster-$id``).
+   If your cluster name differs from ``ceph``, use your cluster name instead. ::
+
+       ceph auth add osd.{osd-num} osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-{osd-num}/keyring
+
+#. Add the OSD to the CRUSH map so that it can begin receiving data. Alternatively,
+   you may decompile the CRUSH map, add the OSD to the device list, add the host as a
+   bucket (if it's not already in the CRUSH map), add the device as an item in the
+   host, assign it a weight, recompile it and set it. See `Add/Move an OSD`_ for
+   details. ::
+
+       ceph osd crush set {id} {name} {weight}  [{bucket-type}={bucket-name}, ...]
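The host-side and cluster-side commands from the steps above can be collected into a dry-run helper that only prints them for review. This is a sketch: ``print_add_osd_commands`` is a hypothetical name, the example values (``osd-host-1``, ``sdb``, ``ext4``) are placeholders, and the ``ceph.conf`` edit and the CRUSH placement are left out because they depend on your cluster.

```shell
# Hypothetical dry-run helper: prints the commands, does not execute them.
print_add_osd_commands() {
    host="$1"; num="$2"; disk="$3"; fstype="$4"
    dir="/var/lib/ceph/osd/ceph-${num}"
    cat <<EOF
# on ${host}:
sudo mkdir ${dir}
sudo mkfs -t ${fstype} /dev/${disk}
sudo mount -o user_xattr /dev/${disk} ${dir}
# after updating and distributing ceph.conf:
ceph osd create ${num}
# on ${host}, where the data directory and keyring live:
ceph-osd -i ${num} --mkfs --mkkey
ceph auth add osd.${num} osd 'allow *' mon 'allow rwx' -i ${dir}/keyring
EOF
}

print_add_osd_commands osd-host-1 123 sdb ext4
```

Printing the commands first makes it easy to confirm that the OSD number and the device name are the ones you intend before ``mkfs`` touches a disk.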
+
+
+Adding an OSD (Chef)
+--------------------
+
+This procedure configures your OSD using ``chef-client``. If your host has
+multiple disks, you may need to execute the procedure for preparing an OSD disk
+for each data disk on your host.
+
+When you add the OSD to the CRUSH map, consider the weight you give to the new
+OSD. Hard disk capacity grows by roughly 40% per year, so newer OSD hosts may have
+larger hard disks than older hosts in the cluster.
+
+#. Execute ``chef-client`` to register it with Chef as a Chef node.
+
+#. Edit the node. See `Configure Nodes`_ for details.
+   Change its environment to your Chef environment.
+   Add ``"role[ceph-osd]"`` to the run list.
+
+#. Execute `Prepare OSD Disks`_ for each disk.
+
+#. Execute ``chef-client`` to invoke the run list.
+
+#. Add the OSD to the CRUSH map so that it can begin receiving data. Alternatively,
+   you may decompile the CRUSH map, edit the file, recompile it and set it. See
+   `Add/Move an OSD`_ for details. ::
+
+       ceph osd crush set {id} {name} {weight} pool={pool-name}  [{bucket-type}={bucket-name}, ...]
+
+
+Starting the OSD
+----------------
+
+After you add an OSD to Ceph, the OSD is in your configuration. However, 
+it is not yet running. The OSD is ``down`` and ``out``. You must start 
+your new OSD before it can begin receiving data. You may use
+``service ceph`` from your admin host or start the OSD from its host
+machine::
+
+       service ceph -a start osd.{osd-num}
+       # or alternatively
+       ssh {new-osd-host}
+       sudo /etc/init.d/ceph start osd.{osd-num}
+
+
+Once you start your OSD, it is ``up``.
+
+Put the OSD ``in`` the Cluster
+------------------------------
+
+After you start your OSD, it is ``up`` and ``out``.  You need to put it into
+the cluster so that Ceph can begin writing data to it. ::
+
+       ceph osd in {osd-num}
+
+
+Observe the Data Migration
+--------------------------
+
+Once you have added your new OSD to the CRUSH map, Ceph will begin rebalancing
+the cluster by migrating placement groups to your new OSD. You can observe this
+process with the `ceph`_ tool. ::
+
+       ceph -w
+
+You should see the placement group states change from ``active+clean`` to
+``active, some degraded objects``, and finally ``active+clean`` when migration
+completes. Press Ctrl-C to exit.
+
+
+.. _Add/Move an OSD: ../crush-map#addosd
+.. _Configure Nodes: ../../config-cluster#confignodes
+.. _Prepare OSD Disks: ../../config-cluster#prepdisks
+.. _ceph: ../monitoring
+
+
+
+Removing OSDs
+=============
+
+When you want to reduce the size of a cluster or replace hardware, you may
+remove an OSD at runtime. With Ceph, an OSD is generally one Ceph ``ceph-osd``
+daemon for one storage disk within a host machine. If your host has multiple
+storage disks, you may need to remove one ``ceph-osd`` daemon for each disk.
+Generally, it's a good idea to check the capacity of your cluster to see if you
+are reaching the upper end of its capacity. When you remove an OSD, ensure that
+your cluster is not at its ``near full`` ratio.
+
+.. warning:: Do not let your cluster reach its ``full ratio`` when
+   removing an OSD. Removing OSDs could cause the cluster to reach 
+   or exceed its ``full ratio``.
+   
+
+Take the OSD ``out`` of the Cluster
+-----------------------------------
+
+Before you remove an OSD, it is usually ``up`` and ``in``.  You need to take it
+out of the cluster so that Ceph can begin rebalancing and copying its data to
+other OSDs. :: 
+
+       ceph osd out {osd-num}
+
+
+Observe the Data Migration
+--------------------------
+
+Once you have taken your OSD ``out`` of the cluster, Ceph will begin
+rebalancing the cluster by migrating placement groups off the OSD you
+took out. You can observe this process with the `ceph`_ tool. ::
+
+       ceph -w
+
+You should see the placement group states change from ``active+clean`` to
+``active, some degraded objects``, and finally ``active+clean`` when migration
+completes. Press Ctrl-C to exit.
+
+
+Stopping the OSD
+----------------
+
+After you take an OSD out of the cluster, it may still be running. 
+That is, the OSD may be ``up`` and ``out``. You must stop 
+your OSD before you remove it from the configuration. :: 
+
+       ssh {osd-host}
+       sudo /etc/init.d/ceph stop osd.{osd-num}
+
+Once you stop your OSD, it is ``down``. 
+
+
+Removing an OSD (Manual)
+------------------------
+
+This procedure removes an OSD from the CRUSH map, removes its authentication
+key, removes the OSD from the OSD map, and removes the OSD from the
+``ceph.conf`` file. If your host has multiple disks, you may need to remove an
+OSD for each disk by repeating this procedure.
+
+
+#. Remove the OSD from the CRUSH map so that it no longer receives data. Alternatively,
+   you may decompile the CRUSH map, remove the OSD from the device list, remove the
+   device as an item in the host bucket or remove the host bucket (if it's in the
+   CRUSH map and you intend to remove the host), recompile the map and set it.
+   See `Remove an OSD`_ for details. ::
+
+       ceph osd crush remove {name}
+       
+#. Remove the OSD authentication key. ::
+
+       ceph auth del osd.{osd-num}
+       
+   The OSD's data directory is named ``$cluster-$id``, for example ``ceph-{osd-num}``.
+   If your cluster name differs from ``ceph``, use your cluster name instead.
+       
+#. Remove the OSD. ::
+
+       ceph osd rm {osd-num}
+       #for example
+       ceph osd rm 123
+       
+#. Navigate to the host where you keep the master copy of the cluster's 
+   ``ceph.conf`` file. ::
+
+       ssh {admin-host}
+       cd /etc/ceph
+       vim ceph.conf
+
+#. Remove the OSD entry from your ``ceph.conf`` file. ::
+
+       [osd.123]
+               host = {hostname}
+ 
+#. From the host where you keep the master copy of the cluster's ``ceph.conf`` file, 
+   copy the updated ``ceph.conf`` file to the ``/etc/ceph`` directory of other 
+   hosts in your cluster. :: 
+
+       ssh {osd} sudo tee /etc/ceph/ceph.conf < /etc/ceph/ceph.conf            
+       
+.. _Remove an OSD: ../crush-map#removeosd
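The removal steps span several sections, so the order is easy to get wrong. The sketch below prints the commands in sequence without running them. ``print_remove_osd_commands`` is a hypothetical name, and the CRUSH item name passed as the third argument (``osd.123`` in the example) is an assumption; use the name your CRUSH map actually contains. The ``ceph.conf`` cleanup is left out.

```shell
# Hypothetical dry-run helper: prints the removal commands in order.
print_remove_osd_commands() {
    host="$1"; num="$2"; name="$3"
    cat <<EOF
ceph osd out ${num}
# wait until 'ceph -w' shows active+clean again, then on ${host}:
sudo /etc/init.d/ceph stop osd.${num}
ceph osd crush remove ${name}
ceph auth del osd.${num}
ceph osd rm ${num}
EOF
}

print_remove_osd_commands osd-host-1 123 osd.123
```

Taking the OSD ``out`` and waiting for migration to finish before stopping the daemon gives Ceph the chance to copy the data elsewhere while the OSD can still serve it.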