From: chunmei liu
Date: Tue, 3 Feb 2026 23:04:40 +0000 (-0800)
Subject: doc/dev/seastore.rst: add design implementation for osd shards change
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=926411ba12bdf183c1a9d986079e5cd051cbb841;p=ceph-ci.git

doc/dev/seastore.rst: add design implementation for osd shards change

Signed-off-by: chunmei liu
---

diff --git a/doc/dev/crimson/seastore.rst b/doc/dev/crimson/seastore.rst
index 8c1c11357c0..2913d523471 100644
--- a/doc/dev/crimson/seastore.rst
+++ b/doc/dev/crimson/seastore.rst
@@ -614,6 +614,136 @@ ExtentPlacementManager is responsible for:
 The number of tiers is set based on the configured ``seastore_hot_tier_generations``
 and ``seastore_cold_tier_generations``. Each generation maps its own segment and
 has its own dedicated ``ExtentOolWriter`` writer (See ``generation_to_writer``).
+
+.. _multishardstores:
+
+MultiShardStores
+----------------
+
+MultiShardStores removes the restriction that the number of store shards
+(``SeaStore::Shard``) must equal the number of reactor threads
+(``seastar::smp::count``) allocated to the OSD.
+With MultiShardStores, the number of OSD shards can change after ``mkfs``: if
+the reactor count is changed and the OSD restarts, the store shards created at
+``mkfs`` time are remapped onto the new set of reactors.
+Each reactor thread can host multiple store shards, and several reactor
+threads can share the same store shard. A store shard can forward its I/O
+requests to a store shard running on a different reactor thread.
+
+For example, consider an OSD that had 3 reactor threads
+(``seastar::smp::count = 3``) configured at ``mkfs`` time.
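As a rough model of this remapping, the following Python sketch may help; it is illustrative only (Crimson's actual implementation is C++), and the modulo-based round-robin assignment is an assumption inferred from the example layouts shown in this section:

```python
# Illustrative model of MultiShardStores remapping (NOT the real Crimson code).
# Assumption: reactors and store shards are paired round-robin (modulo),
# matching the example layouts in this section.

def reactor_to_store(reactor_id: int, store_shard_count: int) -> int:
    """Store shard that a given reactor thread directs its I/O to."""
    return reactor_id % store_shard_count

def stores_hosted_by(reactor_id: int, reactor_count: int,
                     store_shard_count: int) -> list[int]:
    """Store shards hosted (mounted) on a given reactor thread."""
    return [s for s in range(store_shard_count)
            if s % reactor_count == reactor_id]

# mkfs created 3 store shards; restart with 5 reactors:
# reactors 3 and 4 forward to store shards 0 and 1 respectively.
assert [reactor_to_store(r, 3) for r in range(5)] == [0, 1, 2, 0, 1]

# Restart with 2 reactors: reactor 0 hosts store shards 0 and 2,
# reactor 1 hosts store shard 1.
assert stores_hosted_by(0, 2, 3) == [0, 2]
assert stores_hosted_by(1, 2, 3) == [1]
```

Under this model, when the reactor count shrinks a reactor simply hosts several store shards, and when it grows the extra reactors forward to shards hosted elsewhere, which matches the layouts shown in the scenarios that follow.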
+After changing the reactor thread count to 5 and restarting the cluster, the
+mounted store shards will look like::
+
+  Reactor thread 0 -> Store Shard: 0
+  Reactor thread 1 -> Store Shard: 1
+  Reactor thread 2 -> Store Shard: 2
+  Reactor thread 3 -> Store Shard: 0 (forwarded)
+  Reactor thread 4 -> Store Shard: 1 (forwarded)
+
+When changing to ``seastar::smp::count = 2``::
+
+  Reactor thread 0 -> Store Shard: 0, 2
+  Reactor thread 1 -> Store Shard: 1
+
+Use ``./bin/ceph daemon osd.0 dump_store_shards`` to check the store shard
+assignment. The following example outputs come from running
+``dump_store_shards`` in the above scenarios.
+
+**First start with 3 reactors**::
+
+  ./bin/ceph daemon osd.0 dump_store_shards
+  *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
+  {
+      "this shard id": 0,
+      "osd shard nums": 3,
+      "store_shard_nums": 3,
+      "core_pgs": {
+          "core": 0,
+          "num_pgs": 43
+      },
+      "core_pgs": {
+          "core": 1,
+          "num_pgs": 43
+      },
+      "core_pgs": {
+          "core": 2,
+          "num_pgs": 43
+      }
+  }
+
+**Second restart with 2 reactors**::
+
+  ./bin/ceph daemon osd.0 dump_store_shards
+  *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
+  {
+      "this shard id": 0,
+      "osd shard nums": 2,
+      "store_shard_nums": 3,
+      "core_pgs": {
+          "core": 0,
+          "num_pgs": 86
+      },
+      "core_pgs": {
+          "core": 1,
+          "num_pgs": 43
+      },
+      "core_store": {
+          "core": 0,
+          "store": {
+              "store_index": 0,
+              "num_pgs": 43
+          },
+          "store": {
+              "store_index": 1,
+              "num_pgs": 43
+          }
+      },
+      "core_store": {
+          "core": 1,
+          "store": {
+              "store_index": 0,
+              "num_pgs": 43
+          }
+      }
+  }
+
+**Third restart with 5 reactors**::
+
+  ./bin/ceph daemon osd.0 dump_store_shards
+  *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
+  {
+      "this shard id": 0,
+      "osd shard nums": 5,
+      "store_shard_nums": 3,
+      "core_pgs": {
+          "core": 0,
+          "num_pgs": 43
+      },
+      "core_pgs": {
+          "core": 1,
+          "num_pgs": 43
+      },
+      "core_pgs": {
+          "core": 2,
+          "num_pgs": 43
+      },
+      "core_alien": {
+          "core": 0,
+          "alien_core": {
+              "alien_core_id": 0,
+              "num_pgs": 22
+          },
+          "alien_core": {
+              "alien_core_id": 3,
+              "num_pgs": 21
+          }
+      },
+      "core_alien": {
+          "core": 1,
+          "alien_core": {
+              "alien_core_id": 1,
+              "num_pgs": 22
+          },
+          "alien_core": {
+              "alien_core_id": 4,
+              "num_pgs": 21
+          }
+      },
+      "core_alien": {
+          "core": 2,
+          "alien_core": {
+              "alien_core_id": 2,
+              "num_pgs": 43
+          }
+      }
+  }
+
 Next Steps
 ==========