Remove last bits of support for 'mds_cache_size'.
'mds_cache_memory_limit' is preferred.
Fixes: https://tracker.ceph.com/issues/41951
Signed-off-by: Ramana Raja <rraja@redhat.com>
* The format of MDSs in `ceph fs dump` has changed.
+* The ``mds_cache_size`` config option is completely removed. Since luminous,
+  the ``mds_cache_memory_limit`` config option has been the preferred way to
+  configure the MDS's cache limits.
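+  If ``mds_cache_size`` appears in your existing configuration, replace it
+  with ``mds_cache_memory_limit``; for example (the value shown is
+  illustrative, not a recommendation)::
+
+    ceph config set mds mds_cache_memory_limit 4G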
+
* The ``pg_autoscale_mode`` is now set to ``on`` by default for newly
created pools, which means that Ceph will automatically manage the
number of PGs. To change this behavior, or to learn more about PG
that cache.
If your workload has more files than fit in your cache (configured using
-``mds_cache_memory_limit`` or ``mds_cache_size`` settings), then
-make sure you test it appropriately: don't test your system with a small
-number of files and then expect equivalent performance when you move
-to a much larger number of files.
+``mds_cache_memory_limit`` setting), then make sure you test it
+appropriately: don't test your system with a small number of files and then
+expect equivalent performance when you move to a much larger number of files.
Do you need a file system?
--------------------------
You can limit the size of the Metadata Server (MDS) cache by:
-* *A memory limit*: A new behavior introduced in the Luminous release. Use the `mds_cache_memory_limit` parameters. We recommend to use memory limits instead of inode count limits.
-* *Inode count*: Use the `mds_cache_size` parameter. By default, limiting the MDS cache by inode count is disabled.
+* *A memory limit*: A new behavior introduced in the Luminous release. Use the `mds_cache_memory_limit` parameter.
-In addition, you can specify a cache reservation by using the `mds_cache_reservation` parameter for MDS operations. The cache reservation is limited as a percentage of the memory or inode limit and is set to 5% by default. The intent of this parameter is to have the MDS maintain an extra reserve of memory for its cache for new metadata operations to use. As a consequence, the MDS should in general operate below its memory limit because it will recall old state from clients in order to drop unused metadata in its cache.
+In addition, you can specify a cache reservation by using the `mds_cache_reservation` parameter for MDS operations. The cache reservation is expressed as a percentage of the memory limit and is set to 5% by default. The intent of this parameter is to have the MDS maintain an extra reserve of memory for its cache for new metadata operations to use. As a consequence, the MDS should in general operate below its memory limit because it will recall old state from clients in order to drop unused metadata in its cache.
The `mds_cache_reservation` parameter replaces the `mds_health_cache_threshold` in all situations except when an MDS node sends a health alert to the Monitors indicating the cache is too large. By default, `mds_health_cache_threshold` is 150% of the maximum cache size.
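For example, to set a 4 GiB cache limit and keep the default 5% reservation
(the values here are illustrative, not recommendations)::

    ceph config set mds mds_cache_memory_limit 4G
    ceph config set mds mds_cache_reservation 0.05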
Code: MDS_HEALTH_CLIENT_RECALL, MDS_HEALTH_CLIENT_RECALL_MANY
Description: Clients maintain a metadata cache. Items (such as inodes) in the
client cache are also pinned in the MDS cache, so when the MDS needs to shrink
-its cache (to stay within ``mds_cache_size`` or ``mds_cache_memory_limit``), it
-sends messages to clients to shrink their caches too. If the client is
-unresponsive or buggy, this can prevent the MDS from properly staying within
-its cache limits and it may eventually run out of memory and crash. This
-message appears if a client has failed to release more than
+its cache (to stay within ``mds_cache_memory_limit``), it sends messages to
+clients to shrink their caches too. If the client is unresponsive or buggy,
+this can prevent the MDS from properly staying within its cache limits and it
+may eventually run out of memory and crash. This message appears if a client
+has failed to release more than
``mds_recall_warning_threshold`` capabilities (decaying with a half-life of
``mds_recall_max_decay_rate``) within the last
``mds_recall_warning_decay_rate`` seconds.
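If clients legitimately hold very large working sets, one way to quiet this
warning is to raise the threshold; a sketch (the value is illustrative, not a
tuned recommendation)::

    ceph config set mds mds_recall_warning_threshold 65536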
Description: The MDS is not succeeding in trimming its cache to comply with the
limit set by the administrator. If the MDS cache becomes too large, the daemon
may exhaust available memory and crash. By default, this message appears if
-the actual cache size (in inodes or memory) is at least 50% greater than
-``mds_cache_size`` (default 100000) or ``mds_cache_memory_limit`` (default
-1GB). Modify ``mds_health_cache_threshold`` to set the warning ratio.
+the actual cache size (in memory) is at least 50% greater than
+``mds_cache_memory_limit`` (default 1GB). Modify ``mds_health_cache_threshold``
+to set the warning ratio.
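For example, with the default 1GB ``mds_cache_memory_limit`` and the default
150% threshold, the warning fires once the cache grows past roughly 1.5GB. To
adjust the warning ratio (illustrative value)::

    ceph config set mds mds_health_cache_threshold 2.0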
``mds cache memory limit``
:Description: The memory limit the MDS should enforce for its cache.
- Administrators should use this instead of ``mds cache size``.
:Type: 64-bit Integer Unsigned
-:Default: ``1073741824``
+:Default: ``1G``
``mds cache reservation``
:Type: Float
:Default: ``0.05``
-``mds cache size``
-
-:Description: The number of inodes to cache. A value of 0 indicates an
- unlimited number. It is recommended to use
- ``mds_cache_memory_limit`` to limit the amount of memory the MDS
- cache uses.
-:Type: 32-bit Integer
-:Default: ``0``
``mds cache mid``
Generally it will be the result of
-#. Overloading the system (if you have extra RAM, increase the "mds cache size"
- config from its default 100000; having a larger active file set than your MDS
- cache is the #1 cause of this!).
+#. Overloading the system (if you have extra RAM, increase the
+   "mds cache memory limit" config from its default 1GiB, as in the example
+   after this list; having a larger active file set than your MDS cache is
+   the #1 cause of this!).
#. Running an older (misbehaving) client.
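For example, doubling the cache from its 1GiB default (illustrative; make
sure the host has the RAM to back it)::

    ceph config set mds mds_cache_memory_limit 2G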
the Ceph Storage Cluster, and override the same setting in
``global``.
-:Example: ``mds_cache_size = 10G``
+:Example: ``mds_cache_memory_limit = 10G``
``client``
:param use_subdir: whether to put test files in a subdir or use root
"""
- cache_size = open_files/2
+ # Set the MDS cache memory limit to a low value that will make the MDS
+ # ask the client to trim the caps.
+ cache_memory_limit = "1M"
- self.set_conf('mds', 'mds cache size', cache_size)
+ self.set_conf('mds', 'mds_cache_memory_limit', cache_memory_limit)
self.set_conf('mds', 'mds_recall_max_caps', open_files/2)
self.set_conf('mds', 'mds_recall_warning_threshold', open_files)
self.fs.mds_fail_restart()
self.fs.wait_for_daemons()
mds_min_caps_per_client = int(self.fs.get_config("mds_min_caps_per_client"))
+ mds_max_caps_per_client = int(self.fs.get_config("mds_max_caps_per_client"))
mds_recall_warning_decay_rate = self.fs.get_config("mds_recall_warning_decay_rate")
self.assertTrue(open_files >= mds_min_caps_per_client)
num_caps = self.get_session(mount_a_client_id)['num_caps']
if num_caps <= mds_min_caps_per_client:
return True
- elif num_caps < cache_size:
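+ # Caps were recalled to within the per-client maximum
+ # (mds_max_caps_per_client), which also counts as success.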
+ elif num_caps <= mds_max_caps_per_client:
return True
else:
return False
.set_description("interval in seconds between heap releases")
.set_flag(Option::FLAG_RUNTIME),
- Option("mds_cache_size", Option::TYPE_INT, Option::LEVEL_ADVANCED)
- .set_default(0)
- .set_description("maximum number of inodes in MDS cache (<=0 is unlimited)")
- .set_long_description("This tunable is no longer recommended. Use mds_cache_memory_limit."),
-
Option("mds_cache_memory_limit", Option::TYPE_SIZE, Option::LEVEL_BASIC)
.set_default(1*(1LL<<30))
.set_description("target maximum memory usage of MDS cache")
(g_conf()->mds_dir_max_commit_size << 20) :
(0.9 *(g_conf()->osd_max_write_size << 20));
- cache_inode_limit = g_conf().get_val<int64_t>("mds_cache_size");
cache_memory_limit = g_conf().get_val<Option::size_t>("mds_cache_memory_limit");
cache_reservation = g_conf().get_val<double>("mds_cache_reservation");
cache_health_threshold = g_conf().get_val<double>("mds_health_cache_threshold");
void MDCache::handle_conf_change(const std::set<std::string>& changed, const MDSMap& mdsmap)
{
- if (changed.count("mds_cache_size"))
- cache_inode_limit = g_conf().get_val<int64_t>("mds_cache_size");
if (changed.count("mds_cache_memory_limit"))
cache_memory_limit = g_conf().get_val<Option::size_t>("mds_cache_memory_limit");
if (changed.count("mds_cache_reservation"))
void MDCache::log_stat()
{
- mds->logger->set(l_mds_inode_max, cache_inode_limit ? : INT_MAX);
mds->logger->set(l_mds_inodes, lru.lru_get_size());
mds->logger->set(l_mds_inodes_pinned, lru.lru_get_num_pinned());
mds->logger->set(l_mds_inodes_top, lru.lru_get_top());
explicit MDCache(MDSRank *m, PurgeQueue &purge_queue_);
~MDCache();
- uint64_t cache_limit_inodes(void) {
- return cache_inode_limit;
- }
uint64_t cache_limit_memory(void) {
return cache_memory_limit;
}
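+  // Fraction by which memory usage exceeds the reserve boundary,
+  // i.e. mds_cache_memory_limit scaled down by mds_cache_reservation;
+  // 0.0 means the cache still fits within the reserve.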
double cache_toofull_ratio(void) const {
- double inode_reserve = cache_inode_limit*(1.0-cache_reservation);
double memory_reserve = cache_memory_limit*(1.0-cache_reservation);
- return fmax(0.0, fmax((cache_size()-memory_reserve)/memory_reserve, cache_inode_limit == 0 ? 0.0 : (CInode::count()-inode_reserve)/inode_reserve));
+ return fmax(0.0, (cache_size()-memory_reserve)/memory_reserve);
}
bool cache_toofull(void) const {
return cache_toofull_ratio() > 0.0;
return mempool::get_pool(mempool::mds_co::id).allocated_bytes();
}
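+  // True once memory usage exceeds mds_cache_memory_limit scaled by
+  // mds_health_cache_threshold.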
bool cache_overfull(void) const {
- return (cache_inode_limit > 0 && CInode::count() > cache_inode_limit*cache_health_threshold) || (cache_size() > cache_memory_limit*cache_health_threshold);
+ return cache_size() > cache_memory_limit*cache_health_threshold;
}
void advance_stray() {
void finish_uncommitted_fragment(dirfrag_t basedirfrag, int op);
void rollback_uncommitted_fragment(dirfrag_t basedirfrag, frag_vec_t&& old_frags);
- uint64_t cache_inode_limit;
uint64_t cache_memory_limit;
double cache_reservation;
double cache_health_threshold;
mds_plb.add_u64_counter(l_mds_dir_commit, "dir_commit", "Directory commit");
mds_plb.add_u64_counter(l_mds_dir_split, "dir_split", "Directory split");
mds_plb.add_u64_counter(l_mds_dir_merge, "dir_merge", "Directory merge");
- mds_plb.add_u64(l_mds_inode_max, "inode_max", "Max inodes, cache size");
mds_plb.add_u64(l_mds_inodes_pinned, "inodes_pinned", "Inodes pinned");
mds_plb.add_u64(l_mds_inodes_expired, "inodes_expired", "Inodes expired");
mds_plb.add_u64(l_mds_inodes_with_caps, "inodes_with_caps",
"mds_cache_memory_limit",
"mds_cache_mid",
"mds_cache_reservation",
- "mds_cache_size",
"mds_cache_trim_decay_rate",
"mds_cap_revoke_eviction_timeout",
"mds_dump_cache_threshold_file",
l_mds_dir_commit,
l_mds_dir_split,
l_mds_dir_merge,
- l_mds_inode_max,
l_mds_inodes,
l_mds_inodes_top,
l_mds_inodes_bottom,
;debug mds = 20
;debug journaler = 20
- # The number of inodes to cache.
- # Type: 32-bit Integer
- # (Default: 100000)
- ;mds cache size = 250000
+ # The memory limit the MDS should enforce for its cache.
+ # (Default: 1G)
+ ;mds cache memory limit = 2G
;[mds.alpha]
; host = alpha
usage=$usage"\t--valgrind[_{osd,mds,mon,rgw}] 'toolname args...'\n"
usage=$usage"\t--nodaemon: use ceph-run as wrapper for mon/osd/mds\n"
usage=$usage"\t--redirect-output: only useful with nodaemon, directs output to log file\n"
-usage=$usage"\t--smallmds: limit mds cache size\n"
+usage=$usage"\t--smallmds: limit mds cache memory limit\n"
usage=$usage"\t-m ip:port\t\tspecify monitor address\n"
usage=$usage"\t-k keep old configuration files\n"
usage=$usage"\t-x enable cephx (on by default)\n"
wconf <<EOF
[mds]
mds log max segments = 2
- mds cache size = 10000
+ # The default 'mds cache memory limit' is 1GiB; here we set it to 100MiB.
+ mds cache memory limit = 100M
EOF
fi