tools/cephfs: add new cephfs-tool
This patch introduces `cephfs-tool`, a new standalone C++ utility that
interacts directly with `libcephfs`.
While the tool is architected to support various subcommands in the
future, the initial implementation focuses on a `bench` command for
measuring library performance. This lets developers and administrators
benchmark the userspace library in isolation from FUSE or kernel client
overheads.
Key features include:
* Multi-threaded Read/Write throughput benchmarking.
* Configurable block sizes, file counts, and fsync intervals.
* Detailed statistical reporting (Mean, Std Dev, Min/Max) for throughput and IOPS.
* Support for specific CephFS user/group impersonation (UID/GID) via `ceph_mount_perms_set`.
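The statistical reporting can be sketched roughly as below. This is an
illustrative sketch, not the tool's code; the figures in the sample runs
that follow are consistent with a population (divide-by-n) standard
deviation, which the sketch assumes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Stats {
  double mean, stddev, min, max;
};

// Compute mean, population standard deviation, min, and max over the
// per-iteration results (assumes a non-empty sample set).
Stats compute_stats(const std::vector<double>& samples) {
  double sum = 0.0;
  for (double s : samples) sum += s;
  double mean = sum / samples.size();
  double sq = 0.0;
  for (double s : samples) sq += (s - mean) * (s - mean);
  // Population (divide-by-n) standard deviation, matching the report below.
  double stddev = std::sqrt(sq / samples.size());
  auto [mn, mx] = std::minmax_element(samples.begin(), samples.end());
  return {mean, stddev, *mn, *mx};
}
```

Feeding the three write-throughput samples from the first run below into
this sketch reproduces the reported mean (2727.06 MB/s) and std dev
(~26.30 MB/s).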
As an example test on a "trial" sepia machine against the new LRC, I
used a command like:
pdonnell@trial154:~$ env CEPH_ARGS="--log-to-stderr=false --log-to-file=false --log-file=/tmp/bench.log" ./cephfs-tool -c ~/ceph.conf -k ~/keyring -i scratch --filesystem scratch bench --root-path=/pdonnell --files 256 --size=$(( 128 * 2 ** 20 )) --threads=8 --iterations 3
Benchmark Configuration:
Threads: 8 | Iterations: 3
Files: 256 | Size: 134217728
Filesystem: scratch
Root: /pdonnell
Subdirectory: bench_run_d942
UID: -1
GID: -1
--- Iteration 1 of 3 ---
Starting Write Phase...
Write: 2761.97 MB/s, 21.5779 files/s (11.864s)
Starting Read Phase...
Read: 2684.36 MB/s, 20.9716 files/s (12.207s)
--- Iteration 2 of 3 ---
Starting Write Phase...
Write: 2698.51 MB/s, 21.0821 files/s (12.143s)
Starting Read Phase...
Read: 2682.16 MB/s, 20.9544 files/s (12.217s)
--- Iteration 3 of 3 ---
Starting Write Phase...
Write: 2720.69 MB/s, 21.2554 files/s (12.044s)
Starting Read Phase...
Read: 2695.18 MB/s, 21.0561 files/s (12.158s)
*** Final Report ***
Write Throughput Statistics (3 runs):
Mean: 2727.06 MB/s
Std Dev: 26.2954 MB/s
Min: 2698.51 MB/s
Max: 2761.97 MB/s
Read Throughput Statistics (3 runs):
Mean: 2687.24 MB/s
Std Dev: 5.68904 MB/s
Min: 2682.16 MB/s
Max: 2695.18 MB/s
File Creates Statistics (3 runs):
Mean: 21.3051 files/s
Std Dev: 0.205433 files/s
Min: 21.0821 files/s
Max: 21.5779 files/s
File Reads (Opens) Statistics (3 runs):
Mean: 20.994 files/s
Std Dev: 0.0444456 files/s
Min: 20.9544 files/s
Max: 21.0561 files/s
Cleaning up...
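As a rough sanity check, the reported MB/s can be converted to NIC line
rate. This sketch assumes the tool's "MB" is 2^20 bytes (MiB); if it is
instead 10^6 bytes, the mean write figure works out to ~21.8 Gb/s.

```cpp
// Convert a reported throughput figure to line rate in Gb/s.
// Assumption (not confirmed by the tool's output): "MB" = 2^20 bytes.
inline double mb_per_s_to_gbps(double mb_per_s) {
  return mb_per_s * (1 << 20) * 8.0 / 1e9;  // bytes/s -> bits/s -> Gb/s
}
// e.g. mb_per_s_to_gbps(2727.06) is ~22.9 Gb/s against a 25 Gb/s sticker rate
```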
For a 25GbE NIC, this just about saturates the sticker bandwidth with
a single shared mount and 8 threads. With a per-thread mount:
pdonnell@trial154:~$ env CEPH_ARGS="--log-to-stderr=false --log-to-file=false --log-file=/tmp/bench.log" ./cephfs-tool -c ~/ceph.conf -k ~/keyring -i scratch --filesystem scratch bench --root-path=/pdonnell --files 256 --size=$(( 128 * 2 ** 20 )) --threads=8 --iterations 3 --per-thread-mount
Benchmark Configuration:
Threads: 8 | Iterations: 3
Files: 256 | Size: 134217728
Filesystem: scratch
Root: /pdonnell
Subdirectory: bench_run_9d1c
UID: -1
GID: -1
--- Iteration 1 of 3 ---
Starting Write Phase...
Write: 2691.2 MB/s, 21.025 files/s (12.176s)
Starting Read Phase...
Read: 2486.76 MB/s, 19.4278 files/s (13.177s)
--- Iteration 2 of 3 ---
Starting Write Phase...
Write: 2688.77 MB/s, 21.006 files/s (12.187s)
Starting Read Phase...
Read: 2496.42 MB/s, 19.5033 files/s (13.126s)
--- Iteration 3 of 3 ---
Starting Write Phase...
Write: 2692.08 MB/s, 21.0319 files/s (12.172s)
Starting Read Phase...
Read: 2488.27 MB/s, 19.4396 files/s (13.169s)
*** Final Report ***
Write Throughput Statistics (3 runs):
Mean: 2690.68 MB/s
Std Dev: 1.40086 MB/s
Min: 2688.77 MB/s
Max: 2692.08 MB/s
Read Throughput Statistics (3 runs):
Mean: 2490.48 MB/s
Std Dev: 4.24374 MB/s
Min: 2486.76 MB/s
Max: 2496.42 MB/s
File Creates Statistics (3 runs):
Mean: 21.0209 files/s
Std Dev: 0.0109442 files/s
Min: 21.006 files/s
Max: 21.0319 files/s
File Reads (Opens) Statistics (3 runs):
Mean: 19.4569 files/s
Std Dev: 0.0331542 files/s
Min: 19.4278 files/s
Max: 19.5033 files/s
Cleaning up...
Or to measure file create performance:
pdonnell@trial154:~$ env CEPH_ARGS="--log-to-stderr=false --log-to-file=false --log-file=/tmp/bench.log" ./cephfs-tool -c ~/ceph.conf -k ~/keyring -i scratch --filesystem scratch bench --root-path=/pdonnell --files=$(( 2 ** 16 )) --size=$(( 0 * 2 ** 20 )) --threads=8 --iterations 3
Benchmark Configuration:
Threads: 8 | Iterations: 3
Files: 65536 | Size: 0
Filesystem: scratch
Root: /pdonnell
Subdirectory: bench_run_d435
UID: -1
GID: -1
--- Iteration 1 of 3 ---
Starting Write Phase...
Write: 3974.77 files/s (16.488s)
Starting Read Phase...
Read: 14537.7 files/s (4.508s)
Cleaning up for next iteration...
--- Iteration 2 of 3 ---
Starting Write Phase...
Write: 4167.1 files/s (15.727s)
Starting Read Phase...
Read: 13636.3 files/s (4.806s)
Cleaning up for next iteration...
--- Iteration 3 of 3 ---
Starting Write Phase...
Write: 3863.7 files/s (16.962s)
Starting Read Phase...
Read: 14972.8 files/s (4.377s)
*** Final Report ***
File Creates Statistics (3 runs):
Mean: 4001.86 files/s
Std Dev: 125.337 files/s
Min: 3863.7 files/s
Max: 4167.1 files/s
File Reads (Opens) Statistics (3 runs):
Mean: 14382.3 files/s
Std Dev: 556.594 files/s
Min: 13636.3 files/s
Max: 14972.8 files/s
Cleaning up...
Here is the current help text:
Usage: cephfs-bench [general-options] <command> [command-options]
Commands:
bench Run IO benchmark
Allowed options:
General Options:
-h [ --help ] Produce help message
-c [ --conf ] arg Ceph config file path
-i [ --id ] arg (=admin) Client ID
-k [ --keyring ] arg Path to keyring file
--filesystem arg CephFS filesystem name to mount
--uid arg (=-1) User ID to mount as
--gid arg (=-1) Group ID to mount as
Benchmark Options (used with 'bench' command):
--threads arg (=1) Number of threads
--iterations arg (=1) Number of iterations
--files arg (=100) Total number of files
--size arg (=4MB) File size (e.g. 4MB, 0 for creates only)
--block-size arg (=4MB) IO block size (e.g. 1MB)
--fsync-every arg (=0) Call fsync every N bytes
--prefix arg (=benchmark_) Filename prefix
--dir-prefix arg (=bench_run_) Directory prefix
--root-path arg (=/) Root path in CephFS
--per-thread-mount Use separate mount per thread
--no-cleanup Disable cleanup of files
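As a rough illustration of the `--fsync-every` semantics (a sketch, not
the tool's code; POSIX `write`/`fsync` stand in here for the corresponding
libcephfs calls, `ceph_write`/`ceph_fsync`):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// Write `file_size` bytes in `block_size` chunks, syncing whenever at
// least `fsync_every` bytes have been written since the last sync.
// Passing fsync_every == 0 disables periodic syncing, matching the
// option's default above.
void write_file(int fd, size_t file_size, size_t block_size, size_t fsync_every) {
  std::vector<char> block(block_size, 'x');
  size_t written = 0, since_sync = 0;
  while (written < file_size) {
    size_t n = std::min(block_size, file_size - written);
    ssize_t r = write(fd, block.data(), n);
    assert(r == (ssize_t)n);
    written += n;
    since_sync += n;
    if (fsync_every > 0 && since_sync >= fsync_every) {
      fsync(fd);
      since_sync = 0;
    }
  }
}
```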
AI-Assisted: significant portions of this code were AI-generated through dozens of iterative prompts.
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>