From 9f1a5627399e2589fe706b158a32a9fb8642ac23 Mon Sep 17 00:00:00 2001
From: Alfredo Deza
Date: Fri, 20 Oct 2017 11:51:55 -0400
Subject: [PATCH] doc/ceph-volume update prepare with bluestore workflow

Signed-off-by: Alfredo Deza
---
 doc/ceph-volume/lvm/prepare.rst | 100 ++++++++++++++++++++++++++++----
 1 file changed, 89 insertions(+), 11 deletions(-)

diff --git a/doc/ceph-volume/lvm/prepare.rst b/doc/ceph-volume/lvm/prepare.rst
index 0fefd03787d93..cc2851b86c2a0 100644
--- a/doc/ceph-volume/lvm/prepare.rst
+++ b/doc/ceph-volume/lvm/prepare.rst
@@ -163,32 +163,110 @@ later be started (for detailed metadata description see :ref:`ceph-volume-lvm-ta
 ``bluestore``
 -------------
-This subcommand is planned but not currently implemented.
+The :term:`bluestore` objectstore is the default for new OSDs. It offers a bit
+more flexibility for devices. Bluestore supports the following configurations:
+
+* A block device, a block.wal, and a block.db device
+* A block device and a block.wal device
+* A block device and a block.db device
+* A single block device
+
+It can accept a whole device (not a partition, otherwise it will raise an
+error) or a logical volume for ``block``. If a physical device is provided it
+will then be turned into a logical volume. This allows a simpler approach to
+using LVM but at the cost of flexibility: there are no options or
+configurations to change how the LV is created.
+
+The ``block`` is specified with the ``--data`` flag, and in its simplest use
+case it looks like::
+
+    ceph-volume lvm prepare --bluestore --data vg/lv
+
+A raw device can be specified in the same way::
+
+    ceph-volume lvm prepare --bluestore --data /path/to/device
+
+
+If a ``block.db`` or a ``block.wal`` is needed (they are optional for
+bluestore) they can be specified with ``--block.db`` and ``--block.wal``
+respectively. These can be a physical device (they **must** be a partition) or
+a logical volume.
+
+For both ``block.db`` and ``block.wal``, partitions aren't made into logical
+volumes because they can be used as-is. Logical Volumes are also allowed.
+
+While creating the OSD directory, the process will use a ``tmpfs`` mount to
+place all the files needed for the OSD. These files are initially created by
+``ceph-osd --mkfs`` and are fully ephemeral.
+
+A symlink is always created for the ``block`` device, and optionally for
+``block.db`` and ``block.wal``. For a cluster with a default name, and an OSD
+id of 0, the directory could look like::
+
+    # ls -l /var/lib/ceph/osd/ceph-0
+    lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block -> /dev/ceph-be2b6fbd-bcf2-4c51-b35d-a35a162a02f0/osd-block-25cf0a05-2bc6-44ef-9137-79d65bd7ad62
+    lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.db -> /dev/sda1
+    lrwxrwxrwx. 1 ceph ceph 93 Oct 20 13:05 block.wal -> /dev/ceph/osd-wal-0
+    -rw-------. 1 ceph ceph 37 Oct 20 13:05 ceph_fsid
+    -rw-------. 1 ceph ceph 37 Oct 20 13:05 fsid
+    -rw-------. 1 ceph ceph 55 Oct 20 13:05 keyring
+    -rw-------. 1 ceph ceph  6 Oct 20 13:05 ready
+    -rw-------. 1 ceph ceph 10 Oct 20 13:05 type
+    -rw-------. 1 ceph ceph  2 Oct 20 13:05 whoami
+
+In the above case, a device was used for ``block`` so ``ceph-volume`` created
+a volume group and a logical volume using the following convention:
+
+* volume group name: ``ceph-{cluster fsid}`` or if the vg exists already
+  ``ceph-{random uuid}``
+
+* logical volume name: ``osd-block-{osd_fsid}``
 
 Storing metadata
 ----------------
-The following tags will get applied as part of the prepartion process
-regardless of the type of volume (journal or data) and also regardless of the
-OSD backend:
+The following tags will get applied as part of the preparation process
+regardless of the type of volume (journal or data) or OSD objectstore:
 
 * ``cluster_fsid``
-* ``data_device``
-* ``journal_device``
 * ``encrypted``
 * ``osd_fsid``
 * ``osd_id``
-* ``block``
-* ``db``
-* ``wal``
-* ``lockbox_device``
+
+For :term:`filestore` these tags will be added:
+
+* ``journal_device``
+* ``journal_uuid``
+
+For :term:`bluestore` these tags will be added:
+
+* ``block_device``
+* ``block_uuid``
+* ``db_device``
+* ``db_uuid``
+* ``wal_device``
+* ``wal_uuid``
 
 .. note:: For the complete lvm tag conventions see :ref:`ceph-volume-lvm-tag-api`
 
 Summary
 -------
-To recap the ``prepare`` process:
+To recap the ``prepare`` process for :term:`bluestore`:
+
+#. Accept a logical volume for block or a raw device (that will get converted
+   to an lv)
+#. Accept partitions or logical volumes for ``block.wal`` or ``block.db``
+#. Generate a UUID for the OSD
+#. Ask the monitor to get an OSD ID, reusing the generated UUID
+#. OSD data directory is created on a tmpfs mount
+#. ``block``, ``block.wal``, and ``block.db`` are symlinked if defined
+#. monmap is fetched for activation
+#. Data directory is populated by ``ceph-osd``
+#. Logical Volumes are assigned all the Ceph metadata using lvm tags
+
+
+And the ``prepare`` process for :term:`filestore`:
 
 #. Accept only logical volumes for data and journal (both required)
 #. Generate a UUID for the OSD
-- 
2.39.5
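
The bluestore workflow documented in the patch above can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual ``ceph-volume`` implementation: the function names (``prepare_command``, ``volume_group_name``, ``logical_volume_name``, ``block_symlink_target``) are hypothetical. Only the ``--bluestore``, ``--data``, ``--block.db`` and ``--block.wal`` flags and the naming conventions come from the documentation text, and the ``/dev/{vg}/{lv}`` path layout is inferred from the example directory listing.

```python
import uuid


def prepare_command(data, block_db=None, block_wal=None):
    """Build the argument list for a bluestore prepare call (hypothetical helper).

    data      -- "vg/lv" or a whole raw device path, passed to --data
    block_db  -- optional partition or logical volume for --block.db
    block_wal -- optional partition or logical volume for --block.wal
    """
    cmd = ["ceph-volume", "lvm", "prepare", "--bluestore", "--data", data]
    if block_db is not None:
        cmd += ["--block.db", block_db]
    if block_wal is not None:
        cmd += ["--block.wal", block_wal]
    return cmd


def volume_group_name(cluster_fsid, existing_vgs):
    """Return ceph-{cluster fsid}, or ceph-{random uuid} if that name exists."""
    candidate = "ceph-{}".format(cluster_fsid)
    if candidate in existing_vgs:
        candidate = "ceph-{}".format(uuid.uuid4())
    return candidate


def logical_volume_name(osd_fsid):
    """Return the documented logical volume name, osd-block-{osd_fsid}."""
    return "osd-block-{}".format(osd_fsid)


def block_symlink_target(vg_name, lv_name):
    """Assumed device path for the block symlink, as seen in the listing."""
    return "/dev/{}/{}".format(vg_name, lv_name)
```

Calling ``prepare_command("vg/lv")`` reproduces the simplest documented invocation, and passing ``block_db`` and ``block_wal`` covers the other three supported configurations. The naming helpers only mirror the documented convention; they do not create any volumes.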