From: Igor Golikov
Date: Sun, 13 Jul 2025 11:14:21 +0000 (+0000)
Subject: doc: update documentation
X-Git-Tag: testing/wip-jcollin-testing-20250912.051015-main~1^2
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=refs%2Fheads%2Figolikov-subvolume-68929-3;p=ceph-ci.git

doc: update documentation

Fixes: https://tracker.ceph.com/issues/68931
Signed-off-by: Igor Golikov
---

diff --git a/doc/cephfs/mds-config-ref.rst b/doc/cephfs/mds-config-ref.rst
index 49f4f19fc92..f0fd891d7bf 100644
--- a/doc/cephfs/mds-config-ref.rst
+++ b/doc/cephfs/mds-config-ref.rst
@@ -1,3 +1,4 @@
+.. _MDS Config Reference:
 ======================
  MDS Config Reference
 ======================
@@ -65,3 +66,4 @@
 .. confval:: mds_min_caps_per_client
 .. confval:: mds_symlink_recovery
 .. confval:: mds_extraordinary_events_dump_interval
+.. confval:: subv_metrics_window_interval
diff --git a/doc/cephfs/metrics.rst b/doc/cephfs/metrics.rst
index 1befec0c4ae..17de0854eed 100644
--- a/doc/cephfs/metrics.rst
+++ b/doc/cephfs/metrics.rst
@@ -66,6 +66,40 @@ CephFS exports client metrics as :ref:`Labeled Perf Counters`, which could be us
      - Gauge
      - Number of bytes written in input/output operations generated by all processes
 
+Subvolume Metrics
+-----------------
+
+CephFS exports subvolume metrics as :ref:`Labeled Perf Counters`, which could be used to monitor the subvolume performance. CephFS exports the below subvolume metrics.
+Subvolume metrics are aggregated within sliding window of 30 seconds (default value, configurable via the ``subv_metrics_window_interval`` parameter, see :ref:`MDS config reference`).
+In large Ceph clusters with tens of thousands of subvolumes, this parameter also helps clean up stale metrics.
+When a subvolume’s sliding window becomes empty, it's metrics are removed and not reported as “zero” values, reducing memory usage and computational overhead.
+
+.. list-table:: Subvolume Metrics
+   :widths: 25 25 75
+   :header-rows: 1
+
+   * - Name
+     - Type
+     - Description
+   * - ``avg_read_iops``
+     - Gauge
+     - Average read IOPS (input/output operations per second) over the sliding window.
+   * - ``avg_read_tp_Bps``
+     - Gauge
+     - Average read throughput in bytes per second.
+   * - ``avg_read_lat_msec``
+     - Gauge
+     - Average read latency in milliseconds.
+   * - ``avg_write_iops``
+     - Gauge
+     - Average write IOPS over the sliding window.
+   * - ``avg_write_tp_Bps``
+     - Gauge
+     - Average write throughput in bytes per second.
+   * - ``avg_write_lat_msec``
+     - Gauge
+     - Average write latency in milliseconds.
+
 Getting Metrics
 ===============
 
@@ -130,3 +164,21 @@ The metrics could be scraped from the MDS admin socket as well as using the tell
             }
         }
     ]
+
+The subvolume metrics are dumped as a part of the same command. The ``mds_subvolume_metrics`` section in the output of ``counter dump`` command displays the metrics for each client as shown below::
+
+    "mds_subvolume_metrics": [
+        {
+            "labels": {
+                "fs_name": "a",
+                "subvolume_path": "/volumes/_nogroup/test_subvolume"
+            },
+            "counters": {
+                "avg_read_iops": 0,
+                "avg_read_tp_Bps": 11,
+                "avg_read_lat_msec": 0,
+                "avg_write_iops": 1564,
+                "avg_write_tp_Bps": 6408316,
+                "avg_write_lat_msec": 338
+            }
+        }
\ No newline at end of file
diff --git a/src/common/options/mds.yaml.in b/src/common/options/mds.yaml.in
index 54f3458012f..b7ec143ab6f 100644
--- a/src/common/options/mds.yaml.in
+++ b/src/common/options/mds.yaml.in
@@ -1807,8 +1807,8 @@ options:
   type: secs
   level: dev
   desc: subvolume metrics sliding window interval, seconds
-  long_desc: interval in seconds to hold values in sliding window for subvolume metrics, in the metrics aggregator
-  default: 60
+  long_desc: interval in seconds to hold values in sliding window for subvolume metrics
+  default: 30
   min: 30
   services:
   - mds
\ No newline at end of file
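
Note on consuming the new section: the patch documents an ``mds_subvolume_metrics`` list in the ``counter dump`` output, with one entry per subvolume keyed by its ``fs_name`` and ``subvolume_path`` labels. A minimal sketch of reading it follows. It assumes the section is a top-level key of the dumped JSON object (the patch shows only an excerpt), and ``SAMPLE_DUMP`` and ``subvolume_metrics`` are hypothetical names used for illustration only.

```python
import json

# Hypothetical sample shaped like the "mds_subvolume_metrics" excerpt in the patch.
# Assumption: the section is a top-level key of the counter dump JSON object.
SAMPLE_DUMP = """
{
  "mds_subvolume_metrics": [
    {
      "labels": {
        "fs_name": "a",
        "subvolume_path": "/volumes/_nogroup/test_subvolume"
      },
      "counters": {
        "avg_read_iops": 0,
        "avg_read_tp_Bps": 11,
        "avg_read_lat_msec": 0,
        "avg_write_iops": 1564,
        "avg_write_tp_Bps": 6408316,
        "avg_write_lat_msec": 338
      }
    }
  ]
}
"""


def subvolume_metrics(dump_text):
    """Return a dict mapping (fs_name, subvolume_path) to its counters dict."""
    dump = json.loads(dump_text)
    result = {}
    # Per the patch, a subvolume whose sliding window is empty is not reported
    # at all, so a missing section or entry is treated as "no data", not zero.
    for entry in dump.get("mds_subvolume_metrics", []):
        labels = entry["labels"]
        result[(labels["fs_name"], labels["subvolume_path"])] = entry["counters"]
    return result


metrics = subvolume_metrics(SAMPLE_DUMP)
for (fs_name, path), counters in metrics.items():
    print(fs_name, path, counters["avg_write_iops"], counters["avg_write_tp_Bps"])
```

Keying on both labels keeps subvolumes from different file systems apart, and using ``dict.get`` with an empty default avoids treating an absent section as an error.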