From: Sidharth Anupkrishnan
Date: Wed, 11 Sep 2019 14:19:54 +0000 (+0530)
Subject: doc: Add a new document on Dynamic Metadata Management in CephFS
X-Git-Tag: v15.1.0~1451^2
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=05bdceb1f78273fc56fe838113bff6c8819fb3ba;p=ceph-ci.git

doc: Add a new document on Dynamic Metadata Management in CephFS

Signed-off-by: Sidharth Anupkrishnan
---

diff --git a/doc/cephfs/dynamic-metadata-management.rst b/doc/cephfs/dynamic-metadata-management.rst
new file mode 100644
index 00000000000..4232de6da93
--- /dev/null
+++ b/doc/cephfs/dynamic-metadata-management.rst
@@ -0,0 +1,34 @@
+==================================
+CephFS Dynamic Metadata Management
+==================================
+Metadata operations usually take up more than 50 percent of all
+file system operations. Also the metadata scales in a more complex
+fashion when compared to scaling storage (which in turn scales I/O
+throughput linearly). This is due to the hierarchical and
+interdependent nature of the file system metadata. So in CephFS,
+the metadata workload is decoupled from data workload so as to
+avoid placing unnecessary strain on the RADOS cluster. The metadata
+is hence handled by a cluster of Metadata Servers (MDSs).
+CephFS distributes metadata across MDSs via `Dynamic Subtree Partitioning `__.
+
+Dynamic Subtree Partitioning
+----------------------------
+In traditional subtree partitioning, subtrees of the file system
+hierarchy are assigned to individual MDSs. This metadata distribution
+strategy provides good hierarchical locality, linear growth of
+cache and horizontal scaling across MDSs and a fairly good distribution
+of metadata across MDSs.
+
+.. image:: subtree-partitioning.svg
+
+The problem with traditional subtree partitioning is that the workload
+growth by depth (across a single MDS) leads to a hotspot of activity.
+This results in lack of vertical scaling and wastage of non-busy resources/MDSs.
+
+This led to the adoption of a more dynamic way of handling
+metadata: Dynamic Subtree Partitioning, where load intensive portions
+of the directory hierarchy from busy MDSs are migrated to non busy MDSs.
+
+This strategy ensures that activity hotspots are relieved as they
+appear and so leads to vertical scaling of the metadata workload in
+addition to horizontal scaling.
diff --git a/doc/cephfs/index.rst b/doc/cephfs/index.rst
index aed20343afa..0f36ca1cb4f 100644
--- a/doc/cephfs/index.rst
+++ b/doc/cephfs/index.rst
@@ -121,6 +121,7 @@ authentication keyring.
     LazyIO
     Distributed Metadata Cache
     FS volume and subvolumes
+    Dynamic Metadata Management in CephFS
 
 .. toctree::
     :hidden:
diff --git a/doc/cephfs/subtree-partitioning.svg b/doc/cephfs/subtree-partitioning.svg
new file mode 100644
index 00000000000..20c60de9299
--- /dev/null
+++ b/doc/cephfs/subtree-partitioning.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
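
The migration idea described in the new document (load-intensive portions of the hierarchy move from busy MDSs to non-busy ones) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the CephFS balancer: the `MDS` class, the `rebalance` function, the `tolerance` parameter, and the per-subtree load numbers are all hypothetical names and values chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MDS:
    """Hypothetical metadata server: maps subtree path -> load it generates."""
    name: str
    subtrees: dict[str, float] = field(default_factory=dict)

    @property
    def load(self) -> float:
        return sum(self.subtrees.values())


def rebalance(cluster: list[MDS], tolerance: float = 0.2) -> list[tuple[str, str, str]]:
    """Toy dynamic subtree partitioning (assumed policy, not Ceph's).

    Repeatedly moves the hottest subtree that fits into half the load gap
    from the busiest MDS to the least busy one, until the gap is within
    `tolerance` times the mean load. Returns (path, source, destination).
    """
    migrations = []
    # Bound the loop by the number of subtrees so it always terminates.
    for _ in range(sum(len(m.subtrees) for m in cluster)):
        busiest = max(cluster, key=lambda m: m.load)
        idlest = min(cluster, key=lambda m: m.load)
        gap = busiest.load - idlest.load
        mean = sum(m.load for m in cluster) / len(cluster)
        if mean == 0 or gap <= tolerance * mean:
            break
        # Only consider subtrees that do not overshoot the balance point.
        candidates = {p: l for p, l in busiest.subtrees.items() if 0 < l <= gap / 2}
        if not candidates:
            break
        path = max(candidates, key=candidates.get)
        idlest.subtrees[path] = busiest.subtrees.pop(path)
        migrations.append((path, busiest.name, idlest.name))
    return migrations
```

In this model a hotspot on one server is relieved by handing a hot subtree to an idle server, which is the behaviour the document contrasts with static subtree assignment. A real balancer would also have to weigh migration cost and cache locality, which this sketch ignores.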