From 044b400f23195ba9e060176a15b9c7c3c76592ee Mon Sep 17 00:00:00 2001
From: Joshua Schmid
Date: Mon, 3 Feb 2020 16:22:02 +0100
Subject: [PATCH] doc/drivegroups: add initial write-up

Signed-off-by: Joshua Schmid
---
 doc/cephadm/drivegroups.rst      | 367 +++++++++++++++++++++++++++++++
 doc/cephadm/index.rst            |  16 +-
 doc/mgr/orchestrator_modules.rst |  11 -
 3 files changed, 381 insertions(+), 13 deletions(-)
 create mode 100644 doc/cephadm/drivegroups.rst

diff --git a/doc/cephadm/drivegroups.rst b/doc/cephadm/drivegroups.rst
new file mode 100644
index 0000000000000..4ccdd0a87fe9f
--- /dev/null
+++ b/doc/cephadm/drivegroups.rst
@@ -0,0 +1,367 @@
DriveGroups
===========

DriveGroups are a way to describe a cluster layout using the properties of disks.
They give the user an abstract way to tell Ceph which disks should turn into OSDs
with which configuration, without knowing the specifics of device names and paths.

Instead of running::

    [monitor 1] # ceph orchestrator osd create **:**

for each device and each host, we can define a YAML or JSON file that allows us to
describe the layout. Here's the most basic example.

Create a file called, for example, drivegroups.yml:

.. code-block:: yaml

   default_drive_group:    # name of the drive_group (name can be custom)
     host_pattern: '*'     # which hosts to target; currently only globs are supported
     data_devices:         # the type of devices you are applying specs to
       all: true           # a filter, check below for a full list

This would translate to:

Turn any available device (ceph-volume decides what 'available' means) into an OSD
on all hosts that match the glob pattern '*'. (The glob pattern matches against the
registered hosts from `host ls`.) There is a more detailed section on host_pattern
down below.

Then pass the file to `osd create` like so::

    [monitor 1] # ceph orchestrator osd create -i /path/to/drivegroup.yml

This will go out to all matching hosts and deploy these OSDs.

Since we usually want more complex setups, there are more filters than just the
'all' filter.


Filters
=======

You can assign disks to certain groups by their attributes using filters.

The attributes are based on ceph-volume's disk query. You can retrieve the
information with::

    ceph-volume inventory

Vendor or Model
---------------

You can target specific disks by their Vendor or by their Model:

.. code-block:: yaml

   model: disk_model_name

or

.. code-block:: yaml

   vendor: disk_vendor_name


Size
----

You can also match by disk `Size`:

.. code-block:: yaml

   size: size_spec

Size specs
__________

Size specifications can take one of the following forms:

* LOW:HIGH
* :HIGH
* LOW:
* EXACT

Concrete examples:

Includes disks of an exact size::

    size: '10G'

Includes disks whose size is within the range::

    size: '10G:40G'

Includes disks less than or equal to 10G in size::

    size: ':10G'

Includes disks equal to or greater than 40G in size::

    size: '40G:'

Sizes don't have to be specified exclusively in Gigabytes (G). Supported units are
Megabytes (M), Gigabytes (G) and Terabytes (T). Appending a (B) for byte is also
supported: MB, GB, TB.
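
As with the other filters, a size spec is used inside one of the device sections
of a drive group. A minimal sketch (the group name and the size bound here are
invented purely for illustration):

.. code-block:: yaml

   size_based_drive_group:
     host_pattern: '*'
     data_devices:
       size: '1TB:'    # only disks of 1TB or larger become data devices
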
Rotational
----------

This operates on the 'rotational' attribute of the disk:

.. code-block:: yaml

   rotational: 0 | 1

`1` to match all disks that are rotational

`0` to match all disks that are non-rotational (SSDs, NVMEs, etc.)


All
---

This will take all disks that are 'available'.

Note: This is exclusive to the data_devices section.

.. code-block:: yaml

   all: true


Limiter
-------

If you have specified valid filters but want to limit the number of matching
disks, you can use the 'limit' directive:

.. code-block:: yaml

   limit: 2

For example, if you used `vendor` to match all disks that are from `VendorA` but
only want to use the first two, you could use `limit`:

.. code-block:: yaml

   data_devices:
     vendor: VendorA
     limit: 2

Note: Be aware that `limit` is really just a last resort and shouldn't be used if
it can be avoided.


Additional Options
==================

There are multiple optional settings you can use to change the way OSDs are
deployed. You can add these options to the base level of a DriveGroup for them
to take effect.

This example would deploy all OSDs with encryption enabled:

.. code-block:: yaml

   example_drive_group:
     data_devices:
       all: true
     encrypted: true

See the full list of options in the DriveGroupSpec:

.. py:currentmodule:: ceph.deployment.drive_group

.. autoclass:: DriveGroupSpec
   :members:
   :exclude-members: from_json

Examples
========

The simple case
---------------

All nodes with the same setup::

    20 HDDs
    Vendor: VendorA
    Model: HDD-123-foo
    Size: 4TB

    2 SSDs
    Vendor: VendorB
    Model: MC-55-44-ZX
    Size: 512GB

This is a common setup and can be described quite easily:

.. code-block:: yaml

   drive_group_default:
     host_pattern: '*'
     data_devices:
       model: HDD-123-foo    # note that HDD-123 would also be valid
     db_devices:
       model: MC-55-44-ZX    # same here, MC-55-44 is valid

However, we can improve the spec by filtering on core properties of the drives:

.. code-block:: yaml

   drive_group_default:
     host_pattern: '*'
     data_devices:
       rotational: 1
     db_devices:
       rotational: 0

Now all rotating devices are declared as 'data devices', and all non-rotating
devices will be used as shared devices (wal, db).

If you know that drives larger than 2TB will always be the slower data devices,
you can also filter by size:

.. code-block:: yaml

   drive_group_default:
     host_pattern: '*'
     data_devices:
       size: '2TB:'
     db_devices:
       size: ':2TB'

Note: All of the above DriveGroups are equally valid. Which of them you want to
use depends on taste and on how much you expect your node layout to change.


The advanced case
-----------------

Here we have two distinct setups::

    20 HDDs
    Vendor: VendorA
    Model: HDD-123-foo
    Size: 4TB

    12 SSDs
    Vendor: VendorB
    Model: MC-55-44-ZX
    Size: 512GB

    2 NVMEs
    Vendor: VendorC
    Model: NVME-QQQQ-987
    Size: 256GB


* 20 HDDs should share 2 SSDs
* 10 SSDs should share 2 NVMes

This can be described with two layouts:

.. code-block:: yaml

   drive_group_hdd:
     host_pattern: '*'
     data_devices:
       rotational: 1
     db_devices:
       model: MC-55-44-ZX
       limit: 2    # db_slots is actually to be favoured here, but it's not implemented yet

   drive_group_ssd:
     host_pattern: '*'
     data_devices:
       model: MC-55-44-ZX
     db_devices:
       vendor: VendorC

This would create the desired layout by using all HDDs as data_devices, with two
SSDs assigned as dedicated db/wal devices. The remaining SSDs (10) will be
data_devices that have the 'VendorC' NVMEs assigned as dedicated db/wal devices.
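
If you would rather not depend on exact model names, the same two groups can also
be sketched with `rotational` and `size` filters, based on the inventory above.
The size bounds below are chosen purely for illustration and are not part of the
original example:

.. code-block:: yaml

   drive_group_hdd:
     host_pattern: '*'
     data_devices:
       rotational: 1      # the 4TB HDDs
     db_devices:
       size: '400G:1T'    # matches the 512GB SSDs
       limit: 2

   drive_group_ssd:
     host_pattern: '*'
     data_devices:
       size: '400G:1T'    # the remaining SSDs
     db_devices:
       size: ':300G'      # matches the 256GB NVMEs
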
The advanced case (with non-uniform nodes)
------------------------------------------

The examples above assumed that all nodes have the same drives. That's however
not always the case.

Node1-5::

    20 HDDs
    Vendor: Intel
    Model: SSD-123-foo
    Size: 4TB
    2 SSDs
    Vendor: VendorA
    Model: MC-55-44-ZX
    Size: 512GB

Node6-10::

    5 NVMEs
    Vendor: Intel
    Model: SSD-123-foo
    Size: 4TB
    20 SSDs
    Vendor: VendorA
    Model: MC-55-44-ZX
    Size: 512GB

You can use the 'host_pattern' key in the layout to target certain nodes.
Salt-like target notation helps to keep things easy:

.. code-block:: yaml

   drive_group_node_one_to_five:
     host_pattern: 'node[1-5]'
     data_devices:
       rotational: 1
     db_devices:
       rotational: 0

   drive_group_six_to_ten:
     host_pattern: 'node[6-10]'
     data_devices:
       model: MC-55-44-ZX
     db_devices:
       model: SSD-123-foo

This will apply different drive groups to different hosts, depending on the
`host_pattern` key.

Dedicated wal + db
------------------

All previous cases co-located the WALs with the DBs. It's however possible to
deploy the WAL on a dedicated device as well, if it makes sense::

    20 HDDs
    Vendor: VendorA
    Model: SSD-123-foo
    Size: 4TB

    2 SSDs
    Vendor: VendorB
    Model: MC-55-44-ZX
    Size: 512GB

    2 NVMEs
    Vendor: VendorC
    Model: NVME-QQQQ-987
    Size: 256GB


The drivegroup for this case would look like this (using the `model` filter):

.. code-block:: yaml

   drive_group_default:
     host_pattern: '*'
     data_devices:
       model: SSD-123-foo
     db_devices:
       model: MC-55-44-ZX
     wal_devices:
       model: NVME-QQQQ-987


This can easily be done with other filters, like `size` or `vendor`, as well.
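
For instance, keyed on the vendors from the listing above, a sketch of the same
layout could look like this (purely illustrative, mirroring the model-based
group above):

.. code-block:: yaml

   drive_group_default:
     host_pattern: '*'
     data_devices:
       vendor: VendorA    # the 20 data disks
     db_devices:
       vendor: VendorB    # the 512GB SSDs
     wal_devices:
       vendor: VendorC    # the 256GB NVMEs
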
diff --git a/doc/cephadm/index.rst b/doc/cephadm/index.rst
index fdd07b3764920..4e0ce8a0289d1 100644
--- a/doc/cephadm/index.rst
+++ b/doc/cephadm/index.rst
@@ -138,8 +138,9 @@ address in network ``10.1.2.0/24``,::
 Deploying OSDs
 ==============
 
-To add an OSD to the cluster, you need to know the device name for the
-block device (hard disk or SSD) that will be used. Then,::
+To add OSDs to the cluster, you have two options:
+1) You need to know the device name for the block device (hard disk or SSD)
+that will be used. Then,::
 
   [monitor 1] # ceph orch osd create **:**
 
@@ -147,6 +148,16 @@ For example, to deploy an OSD on host *newhost*'s SSD,::
 
   [monitor 1] # ceph orch osd create newhost:/dev/disk/by-id/ata-WDC_WDS200T2B0A-00SM50_182294800028
 
+
+2) You need to describe your disk setup by its properties (DriveGroups)
+
+Link to the DriveGroup docs::
+
+  [monitor 1] # ceph orchestrator osd create -i my_drivegroups.yml
+
+
+.. _drivegroups: drivegroups
+
 Deploying manager daemons
 =========================
 
@@ -193,5 +204,6 @@ Further Reading
 
     Cephadm administration
     Cephadm CLI <../mgr/orchestrator>
+    DriveGroups <drivegroups>
     OS recommendations <../start/os-recommendations>
 
diff --git a/doc/mgr/orchestrator_modules.rst b/doc/mgr/orchestrator_modules.rst
index 43160708a3cbc..acb242eed172e 100644
--- a/doc/mgr/orchestrator_modules.rst
+++ b/doc/mgr/orchestrator_modules.rst
@@ -263,17 +263,6 @@ OSD management
 
 .. automethod:: Orchestrator.create_osds
 
-.. py:currentmodule:: ceph.deployment.drive_group
-
-.. autoclass:: DeviceSelection
-   :members:
-
-.. autoclass:: DriveGroupSpec
-   :members:
-   :exclude-members: from_json
-
-.. py:currentmodule:: orchestrator
-
 .. automethod:: Orchestrator.blink_device_light
 
 .. autoclass:: DeviceLightLoc
-- 
2.39.5