From: Sebastian Wagner Date: Thu, 18 Feb 2021 13:48:33 +0000 (+0100) Subject: doc/cephadm: group OSD mgmt sections into one chapter X-Git-Tag: v17.1.0~2825^2~12 X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=74bcd7c5f6fe1c6ddb1b01bae8e964d71b57d5a1;p=ceph.git doc/cephadm: group OSD mgmt sections into one chapter Signed-off-by: Sebastian Wagner --- diff --git a/doc/cephadm/drivegroups.rst b/doc/cephadm/drivegroups.rst deleted file mode 100644 index aded823dd5852..0000000000000 --- a/doc/cephadm/drivegroups.rst +++ /dev/null @@ -1,452 +0,0 @@ -.. _drivegroups: - -========================= -OSD Service Specification -========================= - -:ref:`orchestrator-cli-service-spec` of type ``osd`` are a way to describe a cluster layout using the properties of disks. -It gives the user an abstract way tell ceph which disks should turn into an OSD -with which configuration without knowing the specifics of device names and paths. - -Instead of doing this - -.. prompt:: bash [monitor.1]# - - ceph orch daemon add osd **:** - -for each device and each host, we can define a yaml|json file that allows us to describe -the layout. Here's the most basic example. - -Create a file called i.e. osd_spec.yml - -.. code-block:: yaml - - service_type: osd - service_id: default_drive_group <- name of the drive_group (name can be custom) - placement: - host_pattern: '*' <- which hosts to target, currently only supports globs - data_devices: <- the type of devices you are applying specs to - all: true <- a filter, check below for a full list - -This would translate to: - -Turn any available(ceph-volume decides what 'available' is) into an OSD on all hosts that match -the glob pattern '*'. (The glob pattern matches against the registered hosts from `host ls`) -There will be a more detailed section on host_pattern down below. - -and pass it to `osd create` like so - -.. prompt:: bash [monitor.1]# - - ceph orch apply osd -i /path/to/osd_spec.yml - -This will go out on all the matching hosts and deploy these OSDs. - -Since we want to have more complex setups, there are more filters than just the 'all' filter. - -Also, there is a `--dry-run` flag that can be passed to the `apply osd` command, which gives you a synopsis -of the proposed layout. - -Example - -.. prompt:: bash [monitor.1]# - - [monitor.1]# ceph orch apply osd -i /path/to/osd_spec.yml --dry-run - - - -Filters -======= - -.. note:: - Filters are applied using a `AND` gate by default. This essentially means that a drive needs to fulfill all filter - criteria in order to get selected. - If you wish to change this behavior you can adjust this behavior by setting - - `filter_logic: OR` # valid arguments are `AND`, `OR` - - in the OSD Specification. - -You can assign disks to certain groups by their attributes using filters. - -The attributes are based off of ceph-volume's disk query. You can retrieve the information -with - -.. code-block:: bash - - ceph-volume inventory - -Vendor or Model: -------------------- - -You can target specific disks by their Vendor or by their Model - -.. code-block:: yaml - - model: disk_model_name - -or - -.. code-block:: yaml - - vendor: disk_vendor_name - - -Size: --------------- - -You can also match by disk `Size`. - -.. code-block:: yaml - - size: size_spec - -Size specs: -___________ - -Size specification of format can be of form: - -* LOW:HIGH -* :HIGH -* LOW: -* EXACT - -Concrete examples: - -Includes disks of an exact size - -.. 
code-block:: yaml - - size: '10G' - -Includes disks which size is within the range - -.. code-block:: yaml - - size: '10G:40G' - -Includes disks less than or equal to 10G in size - -.. code-block:: yaml - - size: ':10G' - - -Includes disks equal to or greater than 40G in size - -.. code-block:: yaml - - size: '40G:' - -Sizes don't have to be exclusively in Gigabyte(G). - -Supported units are Megabyte(M), Gigabyte(G) and Terrabyte(T). Also appending the (B) for byte is supported. MB, GB, TB - - -Rotational: ------------ - -This operates on the 'rotational' attribute of the disk. - -.. code-block:: yaml - - rotational: 0 | 1 - -`1` to match all disks that are rotational - -`0` to match all disks that are non-rotational (SSD, NVME etc) - - -All: ------------- - -This will take all disks that are 'available' - -Note: This is exclusive for the data_devices section. - -.. code-block:: yaml - - all: true - - -Limiter: --------- - -When you specified valid filters but want to limit the amount of matching disks you can use the 'limit' directive. - -.. code-block:: yaml - - limit: 2 - -For example, if you used `vendor` to match all disks that are from `VendorA` but only want to use the first two -you could use `limit`. - -.. code-block:: yaml - - data_devices: - vendor: VendorA - limit: 2 - -Note: Be aware that `limit` is really just a last resort and shouldn't be used if it can be avoided. - - -Additional Options -=================== - -There are multiple optional settings you can use to change the way OSDs are deployed. -You can add these options to the base level of a DriveGroup for it to take effect. - -This example would deploy all OSDs with encryption enabled. - -.. code-block:: yaml - - service_type: osd - service_id: example_osd_spec - placement: - host_pattern: '*' - data_devices: - all: true - encrypted: true - -See a full list in the DriveGroupSpecs - -.. py:currentmodule:: ceph.deployment.drive_group - -.. autoclass:: DriveGroupSpec - :members: - :exclude-members: from_json - -Examples -======== - -The simple case ---------------- - -All nodes with the same setup - -.. code-block:: none - - 20 HDDs - Vendor: VendorA - Model: HDD-123-foo - Size: 4TB - - 2 SSDs - Vendor: VendorB - Model: MC-55-44-ZX - Size: 512GB - -This is a common setup and can be described quite easily: - -.. code-block:: yaml - - service_type: osd - service_id: osd_spec_default - placement: - host_pattern: '*' - data_devices: - model: HDD-123-foo <- note that HDD-123 would also be valid - db_devices: - model: MC-55-44-XZ <- same here, MC-55-44 is valid - -However, we can improve it by reducing the filters on core properties of the drives: - -.. code-block:: yaml - - service_type: osd - service_id: osd_spec_default - placement: - host_pattern: '*' - data_devices: - rotational: 1 - db_devices: - rotational: 0 - -Now, we enforce all rotating devices to be declared as 'data devices' and all non-rotating devices will be used as shared_devices (wal, db) - -If you know that drives with more than 2TB will always be the slower data devices, you can also filter by size: - -.. code-block:: yaml - - service_type: osd - service_id: osd_spec_default - placement: - host_pattern: '*' - data_devices: - size: '2TB:' - db_devices: - size: ':2TB' - -Note: All of the above DriveGroups are equally valid. Which of those you want to use depends on taste and on how much you expect your node layout to change. - - -The advanced case ------------------ - -Here we have two distinct setups - -.. 
code-block:: none - - 20 HDDs - Vendor: VendorA - Model: HDD-123-foo - Size: 4TB - - 12 SSDs - Vendor: VendorB - Model: MC-55-44-ZX - Size: 512GB - - 2 NVMEs - Vendor: VendorC - Model: NVME-QQQQ-987 - Size: 256GB - - -* 20 HDDs should share 2 SSDs -* 10 SSDs should share 2 NVMes - -This can be described with two layouts. - -.. code-block:: yaml - - service_type: osd - service_id: osd_spec_hdd - placement: - host_pattern: '*' - data_devices: - rotational: 0 - db_devices: - model: MC-55-44-XZ - limit: 2 (db_slots is actually to be favoured here, but it's not implemented yet) - --- - service_type: osd - service_id: osd_spec_ssd - placement: - host_pattern: '*' - data_devices: - model: MC-55-44-XZ - db_devices: - vendor: VendorC - -This would create the desired layout by using all HDDs as data_devices with two SSD assigned as dedicated db/wal devices. -The remaining SSDs(8) will be data_devices that have the 'VendorC' NVMEs assigned as dedicated db/wal devices. - -The advanced case (with non-uniform nodes) ------------------------------------------- - -The examples above assumed that all nodes have the same drives. That's however not always the case. - -Node1-5 - -.. code-block:: none - - 20 HDDs - Vendor: Intel - Model: SSD-123-foo - Size: 4TB - 2 SSDs - Vendor: VendorA - Model: MC-55-44-ZX - Size: 512GB - -Node6-10 - -.. code-block:: none - - 5 NVMEs - Vendor: Intel - Model: SSD-123-foo - Size: 4TB - 20 SSDs - Vendor: VendorA - Model: MC-55-44-ZX - Size: 512GB - -You can use the 'host_pattern' key in the layout to target certain nodes. Salt target notation helps to keep things easy. - - -.. code-block:: yaml - - service_type: osd - service_id: osd_spec_node_one_to_five - placement: - host_pattern: 'node[1-5]' - data_devices: - rotational: 1 - db_devices: - rotational: 0 - --- - service_type: osd - service_id: osd_spec_six_to_ten - placement: - host_pattern: 'node[6-10]' - data_devices: - model: MC-55-44-XZ - db_devices: - model: SSD-123-foo - -This applies different OSD specs to different hosts depending on the `host_pattern` key. - -Dedicated wal + db ------------------- - -All previous cases co-located the WALs with the DBs. -It's however possible to deploy the WAL on a dedicated device as well, if it makes sense. - -.. code-block:: none - - 20 HDDs - Vendor: VendorA - Model: SSD-123-foo - Size: 4TB - - 2 SSDs - Vendor: VendorB - Model: MC-55-44-ZX - Size: 512GB - - 2 NVMEs - Vendor: VendorC - Model: NVME-QQQQ-987 - Size: 256GB - - -The OSD spec for this case would look like the following (using the `model` filter): - -.. code-block:: yaml - - service_type: osd - service_id: osd_spec_default - placement: - host_pattern: '*' - data_devices: - model: MC-55-44-XZ - db_devices: - model: SSD-123-foo - wal_devices: - model: NVME-QQQQ-987 - - -It is also possible to specify directly device paths in specific hosts like the following: - -.. code-block:: yaml - - service_type: osd - service_id: osd_using_paths - placement: - hosts: - - Node01 - - Node02 - data_devices: - paths: - - /dev/sdb - db_devices: - paths: - - /dev/sdc - wal_devices: - paths: - - /dev/sdd - - -This can easily be done with other filters, like `size` or `vendor` as well. diff --git a/doc/cephadm/index.rst b/doc/cephadm/index.rst index ceeb94877599a..61bed0b82e3b9 100644 --- a/doc/cephadm/index.rst +++ b/doc/cephadm/index.rst @@ -31,13 +31,12 @@ versions of Ceph. 
install adoption host-management - + osd upgrade Cephadm operations Cephadm monitoring Cephadm CLI <../mgr/orchestrator> Client Setup - DriveGroups troubleshooting concepts Cephadm Feature Planning <../dev/cephadm/index> diff --git a/doc/cephadm/install.rst b/doc/cephadm/install.rst index 904e15ece5616..d950ac1cccd2a 100644 --- a/doc/cephadm/install.rst +++ b/doc/cephadm/install.rst @@ -377,55 +377,17 @@ hosts to the cluster. No further steps are necessary. - host2 - host3 +Adding Storage +============== -Deploy OSDs -=========== - -An inventory of storage devices on all cluster hosts can be displayed with: - -.. prompt:: bash # - - ceph orch device ls - -A storage device is considered *available* if all of the following -conditions are met: - -* The device must have no partitions. -* The device must not have any LVM state. -* The device must not be mounted. -* The device must not contain a file system. -* The device must not contain a Ceph BlueStore OSD. -* The device must be larger than 5 GB. - -Ceph refuses to provision an OSD on a device that is not available. - -There are a few ways to create new OSDs: - -* Tell Ceph to consume any available and unused storage device: +To add storage to the cluster, either tell Ceph to consume any +available and unused device: .. prompt:: bash # ceph orch apply osd --all-available-devices -* Create an OSD from a specific device on a specific host: - - .. prompt:: bash # - - ceph orch daemon add osd **:** - - For example: - - .. prompt:: bash # - - ceph orch daemon add osd host1:/dev/sdb - -* Use :ref:`drivegroups` to describe device(s) to consume - based on their properties, such device type (SSD or HDD), device - model names, size, or the hosts on which the devices exist: - - .. prompt:: bash # - - ceph orch apply osd -i spec.yml +Or See :ref:`cephadm-deploy-osds` for more detailed instructions. Deploy CephFS diff --git a/doc/cephadm/osd.rst b/doc/cephadm/osd.rst new file mode 100644 index 0000000000000..2d3e52ad03705 --- /dev/null +++ b/doc/cephadm/osd.rst @@ -0,0 +1,670 @@ +*********** +OSD Service +*********** + +List Devices +============ + +Print a list of discovered devices, grouped by host and optionally +filtered to a particular host: + +.. prompt:: bash # + + ceph orch device ls [--host=...] [--refresh] + +Example:: + + HOST PATH TYPE SIZE DEVICE AVAIL REJECT REASONS + master /dev/vda hdd 42.0G False locked + node1 /dev/vda hdd 42.0G False locked + node1 /dev/vdb hdd 8192M 387836 False locked, LVM detected, Insufficient space (<5GB) on vgs + node1 /dev/vdc hdd 8192M 450575 False locked, LVM detected, Insufficient space (<5GB) on vgs + node3 /dev/vda hdd 42.0G False locked + node3 /dev/vdb hdd 8192M 395145 False LVM detected, locked, Insufficient space (<5GB) on vgs + node3 /dev/vdc hdd 8192M 165562 False LVM detected, locked, Insufficient space (<5GB) on vgs + node2 /dev/vda hdd 42.0G False locked + node2 /dev/vdb hdd 8192M 672147 False LVM detected, Insufficient space (<5GB) on vgs, locked + node2 /dev/vdc hdd 8192M 228094 False LVM detected, Insufficient space (<5GB) on vgs, locked + + +.. _cephadm-deploy-osds: + +Deploy OSDs +=========== + +An inventory of storage devices on all cluster hosts can be displayed with: + +.. prompt:: bash # + + ceph orch device ls + +A storage device is considered *available* if all of the following +conditions are met: + +* The device must have no partitions. +* The device must not have any LVM state. +* The device must not be mounted. +* The device must not contain a file system. 
+* The device must not contain a Ceph BlueStore OSD. +* The device must be larger than 5 GB. + +Ceph refuses to provision an OSD on a device that is not available. + +There are a few ways to create new OSDs: + +* Tell Ceph to consume any available and unused storage device: + + .. prompt:: bash # + + ceph orch apply osd --all-available-devices + +* Create an OSD from a specific device on a specific host: + + .. prompt:: bash # + + ceph orch daemon add osd **:** + + For example: + + .. prompt:: bash # + + ceph orch daemon add osd host1:/dev/sdb + +* Use :ref:`drivegroups` to describe device(s) to consume + based on their properties, such device type (SSD or HDD), device + model names, size, or the hosts on which the devices exist: + + .. prompt:: bash # + + ceph orch apply -i spec.yml + +Dry Run +------- + +``--dry-run`` will cause the orchestrator to present a preview of what will happen +without actually creating the OSDs. + +Example:: + + # ceph orch apply osd --all-available-devices --dry-run + NAME HOST DATA DB WAL + all-available-devices node1 /dev/vdb - - + all-available-devices node2 /dev/vdc - - + all-available-devices node3 /dev/vdd - - + +.. _cephadm-osd-declarative: + +Declarative State +----------------- + +Note that the effect of ``ceph orch apply`` is persistent; that is, drives which are added to the system +or become available (say, by zapping) after the command is complete will be automatically found and added to the cluster. + +That is, after using:: + + ceph orch apply osd --all-available-devices + +* If you add new disks to the cluster they will automatically be used to create new OSDs. +* A new OSD will be created automatically if you remove an OSD and clean the LVM physical volume. + +If you want to avoid this behavior (disable automatic creation of OSD on available devices), use the ``unmanaged`` parameter:: + + ceph orch apply osd --all-available-devices --unmanaged=true + +* For cephadm, see also :ref:`cephadm-spec-unmanaged`. + + +Remove an OSD +============= +:: + + ceph orch osd rm [--replace] [--force] + +Evacuates PGs from an OSD and removes it from the cluster. + +Example:: + + # ceph orch osd rm 0 + Scheduled OSD(s) for removal + + +OSDs that are not safe-to-destroy will be rejected. + +You can query the state of the operation with:: + + # ceph orch osd rm status + OSD_ID HOST STATE PG_COUNT REPLACE FORCE STARTED_AT + 2 cephadm-dev done, waiting for purge 0 True False 2020-07-17 13:01:43.147684 + 3 cephadm-dev draining 17 False True 2020-07-17 13:01:45.162158 + 4 cephadm-dev started 42 False True 2020-07-17 13:01:45.162158 + + +When no PGs are left on the OSD, it will be decommissioned and removed from the cluster. + +.. note:: + After removing an OSD, if you wipe the LVM physical volume in the device used by the removed OSD, a new OSD will be created. + Read information about the ``unmanaged`` parameter in :ref:`cephadm-osd-declarative`. + +Stopping OSD Removal +-------------------- + +You can stop the queued OSD removal operation with + +:: + + ceph orch osd rm stop + +Example:: + + # ceph orch osd rm stop 4 + Stopped OSD(s) removal + +This will reset the initial state of the OSD and take it off the removal queue. + + +Replace an OSD +------------------- +:: + + orch osd rm --replace [--force] + +Example:: + + # ceph orch osd rm 4 --replace + Scheduled OSD(s) for replacement + + +This follows the same procedure as the "Remove OSD" part with the exception that the OSD is not permanently removed +from the CRUSH hierarchy, but is assigned a 'destroyed' flag. 
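+
+To verify this, you can check the CRUSH tree while the replacement is pending;
+the OSD should still be listed, but with a ``destroyed`` status. This is only a
+sketch: the OSD id and the exact columns shown below are illustrative::
+
+    # ceph osd tree | grep osd.4
+    4    hdd    0.90970    osd.4    destroyed    0    1.00000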
+ +**Preserving the OSD ID** + +The previously-set 'destroyed' flag is used to determine OSD ids that will be reused in the next OSD deployment. + +If you use OSDSpecs for OSD deployment, your newly added disks will be assigned the OSD ids of their replaced +counterparts, assuming the new disks still match the OSDSpecs. + +For assistance in this process you can use the '--dry-run' feature. + +Tip: The name of your OSDSpec can be retrieved from **ceph orch ls** + +Alternatively, you can use your OSDSpec file:: + + ceph orch apply osd -i --dry-run + NAME HOST DATA DB WAL + node1 /dev/vdb - - + + +If this matches your anticipated behavior, just omit the --dry-run flag to execute the deployment. + + +Erase Devices (Zap Devices) +--------------------------- + +Erase (zap) a device so that it can be reused. ``zap`` calls ``ceph-volume zap`` on the remote host. + +:: + + orch device zap + +Example command:: + + ceph orch device zap my_hostname /dev/sdx + +.. note:: + Cephadm orchestrator will automatically deploy drives that match the DriveGroup in your OSDSpec if the unmanaged flag is unset. + For example, if you use the ``all-available-devices`` option when creating OSDs, when you ``zap`` a device the cephadm orchestrator will automatically create a new OSD in the device . + To disable this behavior, see :ref:`cephadm-osd-declarative`. + + +.. _drivegroups: + +Advanced OSD Service Specifications +=================================== + +:ref:`orchestrator-cli-service-spec` of type ``osd`` are a way to describe a cluster layout using the properties of disks. +It gives the user an abstract way tell ceph which disks should turn into an OSD +with which configuration without knowing the specifics of device names and paths. + +Instead of doing this + +.. prompt:: bash [monitor.1]# + + ceph orch daemon add osd **:** + +for each device and each host, we can define a yaml|json file that allows us to describe +the layout. Here's the most basic example. + +Create a file called i.e. osd_spec.yml + +.. code-block:: yaml + + service_type: osd + service_id: default_drive_group <- name of the drive_group (name can be custom) + placement: + host_pattern: '*' <- which hosts to target, currently only supports globs + data_devices: <- the type of devices you are applying specs to + all: true <- a filter, check below for a full list + +This would translate to: + +Turn any available(ceph-volume decides what 'available' is) into an OSD on all hosts that match +the glob pattern '*'. (The glob pattern matches against the registered hosts from `host ls`) +There will be a more detailed section on host_pattern down below. + +and pass it to `osd create` like so + +.. prompt:: bash [monitor.1]# + + ceph orch apply osd -i /path/to/osd_spec.yml + +This will go out on all the matching hosts and deploy these OSDs. + +Since we want to have more complex setups, there are more filters than just the 'all' filter. + +Also, there is a `--dry-run` flag that can be passed to the `apply osd` command, which gives you a synopsis +of the proposed layout. + +Example + +.. prompt:: bash [monitor.1]# + + [monitor.1]# ceph orch apply osd -i /path/to/osd_spec.yml --dry-run + + + +Filters +------- + +.. note:: + Filters are applied using a `AND` gate by default. This essentially means that a drive needs to fulfill all filter + criteria in order to get selected. + If you wish to change this behavior you can adjust this behavior by setting + + `filter_logic: OR` # valid arguments are `AND`, `OR` + + in the OSD Specification. 
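+
+As a sketch of how that looks in a complete specification (the service id, vendor
+name and size range below are made up for illustration), the following spec would
+select a drive that matches *either* filter instead of requiring both:
+
+.. code-block:: yaml
+
+    service_type: osd
+    service_id: osd_spec_or_example
+    placement:
+      host_pattern: '*'
+    filter_logic: OR
+    data_devices:
+      vendor: VendorA
+      size: '2TB:'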
+ +You can assign disks to certain groups by their attributes using filters. + +The attributes are based off of ceph-volume's disk query. You can retrieve the information +with + +.. code-block:: bash + + ceph-volume inventory + +Vendor or Model: +^^^^^^^^^^^^^^^^ + +You can target specific disks by their Vendor or by their Model + +.. code-block:: yaml + + model: disk_model_name + +or + +.. code-block:: yaml + + vendor: disk_vendor_name + + +Size: +^^^^^ + +You can also match by disk `Size`. + +.. code-block:: yaml + + size: size_spec + +Size specs: +___________ + +Size specification of format can be of form: + +* LOW:HIGH +* :HIGH +* LOW: +* EXACT + +Concrete examples: + +Includes disks of an exact size + +.. code-block:: yaml + + size: '10G' + +Includes disks which size is within the range + +.. code-block:: yaml + + size: '10G:40G' + +Includes disks less than or equal to 10G in size + +.. code-block:: yaml + + size: ':10G' + + +Includes disks equal to or greater than 40G in size + +.. code-block:: yaml + + size: '40G:' + +Sizes don't have to be exclusively in Gigabyte(G). + +Supported units are Megabyte(M), Gigabyte(G) and Terrabyte(T). Also appending the (B) for byte is supported. MB, GB, TB + + +Rotational: +^^^^^^^^^^^ + +This operates on the 'rotational' attribute of the disk. + +.. code-block:: yaml + + rotational: 0 | 1 + +`1` to match all disks that are rotational + +`0` to match all disks that are non-rotational (SSD, NVME etc) + + +All: +^^^^ + +This will take all disks that are 'available' + +Note: This is exclusive for the data_devices section. + +.. code-block:: yaml + + all: true + + +Limiter: +^^^^^^^^ + +When you specified valid filters but want to limit the amount of matching disks you can use the 'limit' directive. + +.. code-block:: yaml + + limit: 2 + +For example, if you used `vendor` to match all disks that are from `VendorA` but only want to use the first two +you could use `limit`. + +.. code-block:: yaml + + data_devices: + vendor: VendorA + limit: 2 + +Note: Be aware that `limit` is really just a last resort and shouldn't be used if it can be avoided. + + +Additional Options +------------------ + +There are multiple optional settings you can use to change the way OSDs are deployed. +You can add these options to the base level of a DriveGroup for it to take effect. + +This example would deploy all OSDs with encryption enabled. + +.. code-block:: yaml + + service_type: osd + service_id: example_osd_spec + placement: + host_pattern: '*' + data_devices: + all: true + encrypted: true + +See a full list in the DriveGroupSpecs + +.. py:currentmodule:: ceph.deployment.drive_group + +.. autoclass:: DriveGroupSpec + :members: + :exclude-members: from_json + +Examples +-------- + +The simple case +^^^^^^^^^^^^^^^ + +All nodes with the same setup + +.. code-block:: none + + 20 HDDs + Vendor: VendorA + Model: HDD-123-foo + Size: 4TB + + 2 SSDs + Vendor: VendorB + Model: MC-55-44-ZX + Size: 512GB + +This is a common setup and can be described quite easily: + +.. code-block:: yaml + + service_type: osd + service_id: osd_spec_default + placement: + host_pattern: '*' + data_devices: + model: HDD-123-foo <- note that HDD-123 would also be valid + db_devices: + model: MC-55-44-XZ <- same here, MC-55-44 is valid + +However, we can improve it by reducing the filters on core properties of the drives: + +.. 
code-block:: yaml + + service_type: osd + service_id: osd_spec_default + placement: + host_pattern: '*' + data_devices: + rotational: 1 + db_devices: + rotational: 0 + +Now, we enforce all rotating devices to be declared as 'data devices' and all non-rotating devices will be used as shared_devices (wal, db) + +If you know that drives with more than 2TB will always be the slower data devices, you can also filter by size: + +.. code-block:: yaml + + service_type: osd + service_id: osd_spec_default + placement: + host_pattern: '*' + data_devices: + size: '2TB:' + db_devices: + size: ':2TB' + +Note: All of the above DriveGroups are equally valid. Which of those you want to use depends on taste and on how much you expect your node layout to change. + + +The advanced case +^^^^^^^^^^^^^^^^^ + +Here we have two distinct setups + +.. code-block:: none + + 20 HDDs + Vendor: VendorA + Model: HDD-123-foo + Size: 4TB + + 12 SSDs + Vendor: VendorB + Model: MC-55-44-ZX + Size: 512GB + + 2 NVMEs + Vendor: VendorC + Model: NVME-QQQQ-987 + Size: 256GB + + +* 20 HDDs should share 2 SSDs +* 10 SSDs should share 2 NVMes + +This can be described with two layouts. + +.. code-block:: yaml + + service_type: osd + service_id: osd_spec_hdd + placement: + host_pattern: '*' + data_devices: + rotational: 0 + db_devices: + model: MC-55-44-XZ + limit: 2 (db_slots is actually to be favoured here, but it's not implemented yet) + --- + service_type: osd + service_id: osd_spec_ssd + placement: + host_pattern: '*' + data_devices: + model: MC-55-44-XZ + db_devices: + vendor: VendorC + +This would create the desired layout by using all HDDs as data_devices with two SSD assigned as dedicated db/wal devices. +The remaining SSDs(8) will be data_devices that have the 'VendorC' NVMEs assigned as dedicated db/wal devices. + +The advanced case (with non-uniform nodes) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The examples above assumed that all nodes have the same drives. That's however not always the case. + +Node1-5 + +.. code-block:: none + + 20 HDDs + Vendor: Intel + Model: SSD-123-foo + Size: 4TB + 2 SSDs + Vendor: VendorA + Model: MC-55-44-ZX + Size: 512GB + +Node6-10 + +.. code-block:: none + + 5 NVMEs + Vendor: Intel + Model: SSD-123-foo + Size: 4TB + 20 SSDs + Vendor: VendorA + Model: MC-55-44-ZX + Size: 512GB + +You can use the 'host_pattern' key in the layout to target certain nodes. Salt target notation helps to keep things easy. + + +.. code-block:: yaml + + service_type: osd + service_id: osd_spec_node_one_to_five + placement: + host_pattern: 'node[1-5]' + data_devices: + rotational: 1 + db_devices: + rotational: 0 + --- + service_type: osd + service_id: osd_spec_six_to_ten + placement: + host_pattern: 'node[6-10]' + data_devices: + model: MC-55-44-XZ + db_devices: + model: SSD-123-foo + +This applies different OSD specs to different hosts depending on the `host_pattern` key. + +Dedicated wal + db +^^^^^^^^^^^^^^^^^^ + +All previous cases co-located the WALs with the DBs. +It's however possible to deploy the WAL on a dedicated device as well, if it makes sense. + +.. code-block:: none + + 20 HDDs + Vendor: VendorA + Model: SSD-123-foo + Size: 4TB + + 2 SSDs + Vendor: VendorB + Model: MC-55-44-ZX + Size: 512GB + + 2 NVMEs + Vendor: VendorC + Model: NVME-QQQQ-987 + Size: 256GB + + +The OSD spec for this case would look like the following (using the `model` filter): + +.. 
code-block:: yaml + + service_type: osd + service_id: osd_spec_default + placement: + host_pattern: '*' + data_devices: + model: MC-55-44-XZ + db_devices: + model: SSD-123-foo + wal_devices: + model: NVME-QQQQ-987 + + +It is also possible to specify directly device paths in specific hosts like the following: + +.. code-block:: yaml + + service_type: osd + service_id: osd_using_paths + placement: + hosts: + - Node01 + - Node02 + data_devices: + paths: + - /dev/sdb + db_devices: + paths: + - /dev/sdc + wal_devices: + paths: + - /dev/sdd + + +This can easily be done with other filters, like `size` or `vendor` as well. diff --git a/doc/mgr/orchestrator.rst b/doc/mgr/orchestrator.rst index 22b1d9b56e2b5..64ff9bdf7f822 100644 --- a/doc/mgr/orchestrator.rst +++ b/doc/mgr/orchestrator.rst @@ -78,182 +78,6 @@ To remove a label, run:: ceph orch host label rm my_hostname my_label -OSD Management -============== - -List Devices ------------- - -Print a list of discovered devices, grouped by host and optionally -filtered to a particular host: - -:: - - ceph orch device ls [--host=...] [--refresh] - -Example:: - - HOST PATH TYPE SIZE DEVICE AVAIL REJECT REASONS - master /dev/vda hdd 42.0G False locked - node1 /dev/vda hdd 42.0G False locked - node1 /dev/vdb hdd 8192M 387836 False locked, LVM detected, Insufficient space (<5GB) on vgs - node1 /dev/vdc hdd 8192M 450575 False locked, LVM detected, Insufficient space (<5GB) on vgs - node3 /dev/vda hdd 42.0G False locked - node3 /dev/vdb hdd 8192M 395145 False LVM detected, locked, Insufficient space (<5GB) on vgs - node3 /dev/vdc hdd 8192M 165562 False LVM detected, locked, Insufficient space (<5GB) on vgs - node2 /dev/vda hdd 42.0G False locked - node2 /dev/vdb hdd 8192M 672147 False LVM detected, Insufficient space (<5GB) on vgs, locked - node2 /dev/vdc hdd 8192M 228094 False LVM detected, Insufficient space (<5GB) on vgs, locked - - - - -Erase Devices (Zap Devices) ---------------------------- - -Erase (zap) a device so that it can be reused. ``zap`` calls ``ceph-volume zap`` on the remote host. - -:: - - orch device zap - -Example command:: - - ceph orch device zap my_hostname /dev/sdx - -.. note:: - Cephadm orchestrator will automatically deploy drives that match the DriveGroup in your OSDSpec if the unmanaged flag is unset. - For example, if you use the ``all-available-devices`` option when creating OSDs, when you ``zap`` a device the cephadm orchestrator will automatically create a new OSD in the device . - To disable this behavior, see :ref:`orchestrator-cli-create-osds`. - -.. _orchestrator-cli-create-osds: - -Create OSDs ------------ - -Create OSDs on a set of devices on a single host:: - - ceph orch daemon add osd :device1,device2 - -Another way of doing it is using ``apply`` interface:: - - ceph orch apply osd -i [--dry-run] - -where the ``json_file/yaml_file`` is a DriveGroup specification. -For a more in-depth guide to DriveGroups please refer to :ref:`drivegroups` - -``dry-run`` will cause the orchestrator to present a preview of what will happen -without actually creating the OSDs. - -Example:: - - # ceph orch apply osd --all-available-devices --dry-run - NAME HOST DATA DB WAL - all-available-devices node1 /dev/vdb - - - all-available-devices node2 /dev/vdc - - - all-available-devices node3 /dev/vdd - - - -When the parameter ``all-available-devices`` or a DriveGroup specification is used, a cephadm service is created. -This service guarantees that all available devices or devices included in the DriveGroup will be used for OSDs. 
-Note that the effect of ``--all-available-devices`` is persistent; that is, drives which are added to the system -or become available (say, by zapping) after the command is complete will be automatically found and added to the cluster. - -That is, after using:: - - ceph orch apply osd --all-available-devices - -* If you add new disks to the cluster they will automatically be used to create new OSDs. -* A new OSD will be created automatically if you remove an OSD and clean the LVM physical volume. - -If you want to avoid this behavior (disable automatic creation of OSD on available devices), use the ``unmanaged`` parameter:: - - ceph orch apply osd --all-available-devices --unmanaged=true - -* For cephadm, see also :ref:`cephadm-spec-unmanaged`. - -Remove an OSD -------------- -:: - - ceph orch osd rm [--replace] [--force] - -Evacuates PGs from an OSD and removes it from the cluster. - -Example:: - - # ceph orch osd rm 0 - Scheduled OSD(s) for removal - - -OSDs that are not safe-to-destroy will be rejected. - -You can query the state of the operation with:: - - # ceph orch osd rm status - OSD_ID HOST STATE PG_COUNT REPLACE FORCE STARTED_AT - 2 cephadm-dev done, waiting for purge 0 True False 2020-07-17 13:01:43.147684 - 3 cephadm-dev draining 17 False True 2020-07-17 13:01:45.162158 - 4 cephadm-dev started 42 False True 2020-07-17 13:01:45.162158 - - -When no PGs are left on the OSD, it will be decommissioned and removed from the cluster. - -.. note:: - After removing an OSD, if you wipe the LVM physical volume in the device used by the removed OSD, a new OSD will be created. - Read information about the ``unmanaged`` parameter in :ref:`orchestrator-cli-create-osds`. - -Stopping OSD Removal --------------------- - -You can stop the queued OSD removal operation with - -:: - - ceph orch osd rm stop - -Example:: - - # ceph orch osd rm stop 4 - Stopped OSD(s) removal - -This will reset the initial state of the OSD and take it off the removal queue. - - -Replace an OSD -------------------- -:: - - orch osd rm --replace [--force] - -Example:: - - # ceph orch osd rm 4 --replace - Scheduled OSD(s) for replacement - - -This follows the same procedure as the "Remove OSD" part with the exception that the OSD is not permanently removed -from the CRUSH hierarchy, but is assigned a 'destroyed' flag. - -**Preserving the OSD ID** - -The previously-set 'destroyed' flag is used to determine OSD ids that will be reused in the next OSD deployment. - -If you use OSDSpecs for OSD deployment, your newly added disks will be assigned the OSD ids of their replaced -counterparts, assuming the new disks still match the OSDSpecs. - -For assistance in this process you can use the '--dry-run' feature. - -Tip: The name of your OSDSpec can be retrieved from **ceph orch ls** - -Alternatively, you can use your OSDSpec file:: - - ceph orch apply osd -i --dry-run - NAME HOST DATA DB WAL - node1 /dev/vdb - - - - -If this matches your anticipated behavior, just omit the --dry-run flag to execute the deployment. - .. Turn On Device Lights