Remove last bits of support for 'mds_cache_size'.
'mds_cache_memory_limit' is preferred.
Fixes: https://tracker.ceph.com/issues/41951
Signed-off-by: Ramana Raja <rraja@redhat.com>
* The format of MDSs in `ceph fs dump` has changed.
+* The ``mds_cache_size`` config option is completely removed. Since luminous,
+  the ``mds_cache_memory_limit`` config option has been the preferred way to
+  configure the MDS's cache limits.
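+  If ``mds_cache_size`` appears in your existing configuration, replace it
+  with ``mds_cache_memory_limit``; for example (the value shown is
+  illustrative, not a recommendation)::
+
+    ceph config set mds mds_cache_memory_limit 4G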
+
* The ``pg_autoscale_mode`` is now set to ``on`` by default for newly
created pools, which means that Ceph will automatically manage the
number of PGs. To change this behavior, or to learn more about PG
that cache.
If your workload has more files than fit in your cache (configured using
-``mds_cache_memory_limit`` or ``mds_cache_size`` settings), then
-make sure you test it appropriately: don't test your system with a small
-number of files and then expect equivalent performance when you move
-to a much larger number of files.
+``mds_cache_memory_limit`` setting), then make sure you test it
+appropriately: don't test your system with a small number of files and then
+expect equivalent performance when you move to a much larger number of files.
Do you need a file system?
--------------------------
You can limit the size of the Metadata Server (MDS) cache by:
-* *A memory limit*: A new behavior introduced in the Luminous release. Use the `mds_cache_memory_limit` parameters. We recommend to use memory limits instead of inode count limits.
-* *Inode count*: Use the `mds_cache_size` parameter. By default, limiting the MDS cache by inode count is disabled.
+* *A memory limit*: A new behavior introduced in the Luminous release. Use the `mds_cache_memory_limit` parameter.
-In addition, you can specify a cache reservation by using the `mds_cache_reservation` parameter for MDS operations. The cache reservation is limited as a percentage of the memory or inode limit and is set to 5% by default. The intent of this parameter is to have the MDS maintain an extra reserve of memory for its cache for new metadata operations to use. As a consequence, the MDS should in general operate below its memory limit because it will recall old state from clients in order to drop unused metadata in its cache.
+In addition, you can specify a cache reservation by using the `mds_cache_reservation` parameter for MDS operations. The cache reservation is expressed as a percentage of the memory limit and is set to 5% by default. The intent of this parameter is to have the MDS maintain an extra reserve of memory for its cache for new metadata operations to use. As a consequence, the MDS should in general operate below its memory limit because it will recall old state from clients in order to drop unused metadata in its cache.
The `mds_cache_reservation` parameter replaces the `mds_health_cache_threshold` in all situations except when an MDS node sends a health alert to the Monitors indicating the cache is too large. By default, `mds_health_cache_threshold` is 150% of the maximum cache size.
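For example, to set a 4 GiB cache limit and keep the default 5% reservation
(the values here are illustrative, not recommendations)::

    ceph config set mds mds_cache_memory_limit 4G
    ceph config set mds mds_cache_reservation 0.05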
Code: MDS_HEALTH_CLIENT_RECALL, MDS_HEALTH_CLIENT_RECALL_MANY
Description: Clients maintain a metadata cache. Items (such as inodes) in the
client cache are also pinned in the MDS cache, so when the MDS needs to shrink
-its cache (to stay within ``mds_cache_size`` or ``mds_cache_memory_limit``), it
-sends messages to clients to shrink their caches too. If the client is
-unresponsive or buggy, this can prevent the MDS from properly staying within
-its cache limits and it may eventually run out of memory and crash. This
-message appears if a client has failed to release more than
+its cache (to stay within ``mds_cache_memory_limit``), it sends messages to
+clients to shrink their caches too. If the client is unresponsive or buggy,
+this can prevent the MDS from properly staying within its cache limits and it
+may eventually run out of memory and crash. This message appears if a client
+has failed to release more than
``mds_recall_warning_threshold`` capabilities (decaying with a half-life of
``mds_recall_max_decay_rate``) within the last
``mds_recall_warning_decay_rate`` seconds.
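If clients legitimately hold very large working sets, one way to quiet this
warning is to raise the threshold; a sketch (the value is illustrative, not a
tuned recommendation)::

    ceph config set mds mds_recall_warning_threshold 65536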
Description: The MDS is not succeeding in trimming its cache to comply with the
limit set by the administrator. If the MDS cache becomes too large, the daemon
may exhaust available memory and crash. By default, this message appears if
-the actual cache size (in inodes or memory) is at least 50% greater than
-``mds_cache_size`` (default 100000) or ``mds_cache_memory_limit`` (default
-1GB). Modify ``mds_health_cache_threshold`` to set the warning ratio.
+the actual cache size (in memory) is at least 50% greater than
+``mds_cache_memory_limit`` (default 1GB). Modify ``mds_health_cache_threshold``
+to set the warning ratio.
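For example, with the default 1GB ``mds_cache_memory_limit`` and the default
150% threshold, the warning fires once the cache grows past roughly 1.5GB. To
adjust the warning ratio (illustrative value)::

    ceph config set mds mds_health_cache_threshold 2.0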
``mds cache memory limit``
:Description: The memory limit the MDS should enforce for its cache.
- Administrators should use this instead of ``mds cache size``.
:Type: 64-bit Integer Unsigned
-:Default: ``1073741824``
+:Default: ``1G``
``mds cache reservation``
:Type: Float
:Default: ``0.05``
-``mds cache size``
-
-:Description: The number of inodes to cache. A value of 0 indicates an
- unlimited number. It is recommended to use
- ``mds_cache_memory_limit`` to limit the amount of memory the MDS
- cache uses.
-:Type: 32-bit Integer
-:Default: ``0``
``mds cache mid``
Generally it will be the result of
-#. Overloading the system (if you have extra RAM, increase the "mds cache size"
- config from its default 100000; having a larger active file set than your MDS
- cache is the #1 cause of this!).
+#. Overloading the system (if you have extra RAM, increase the
+   "mds cache memory limit" config from its default 1GiB, as in the example
+   after this list; having a larger active file set than your MDS cache is
+   the #1 cause of this!).
#. Running an older (misbehaving) client.
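For example, doubling the cache from its 1GiB default (illustrative; make
sure the host has the RAM to back it)::

    ceph config set mds mds_cache_memory_limit 2G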
the Ceph Storage Cluster, and override the same setting in
``global``.
-:Example: ``mds_cache_size = 10G``
+:Example: ``mds_cache_memory_limit = 10G``
``client``
:param use_subdir: whether to put test files in a subdir or use root
"""
- cache_size = open_files/2
+ # Set the MDS cache memory limit to a low value that will make the MDS
+ # ask the client to trim the caps.
+ cache_memory_limit = "1M"
- self.set_conf('mds', 'mds cache size', cache_size)
+ self.set_conf('mds', 'mds_cache_memory_limit', cache_memory_limit)
self.set_conf('mds', 'mds_recall_max_caps', open_files/2)
self.set_conf('mds', 'mds_recall_warning_threshold', open_files)
self.fs.mds_fail_restart()
self.fs.wait_for_daemons()
mds_min_caps_per_client = int(self.fs.get_config("mds_min_caps_per_client"))
+ mds_max_caps_per_client = int(self.fs.get_config("mds_max_caps_per_client"))
mds_recall_warning_decay_rate = self.fs.get_config("mds_recall_warning_decay_rate")
self.assertTrue(open_files >= mds_min_caps_per_client)
num_caps = self.get_session(mount_a_client_id)['num_caps']
if num_caps <= mds_min_caps_per_client:
return True
- elif num_caps < cache_size:
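+ # Caps were recalled to within the per-client maximum
+ # (mds_max_caps_per_client), which also counts as success.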
+ elif num_caps <= mds_max_caps_per_client:
return True
else:
return False
.set_description("interval in seconds between heap releases")
.set_flag(Option::FLAG_RUNTIME),
- Option("mds_cache_size", Option::TYPE_INT, Option::LEVEL_ADVANCED)
- .set_default(0)
- .set_description("maximum number of inodes in MDS cache (<=0 is unlimited)")
- .set_long_description("This tunable is no longer recommended. Use mds_cache_memory_limit."),
-
Option("mds_cache_memory_limit", Option::TYPE_SIZE, Option::LEVEL_BASIC)
.set_default(1*(1LL<<30))
.set_description("target maximum memory usage of MDS cache")
(g_conf()->mds_dir_max_commit_size << 20) :
(0.9 *(g_conf()->osd_max_write_size << 20));
- cache_inode_limit = g_conf().get_val<int64_t>("mds_cache_size");
cache_memory_limit = g_conf().get_val<Option::size_t>("mds_cache_memory_limit");
cache_reservation = g_conf().get_val<double>("mds_cache_reservation");
cache_health_threshold = g_conf().get_val<double>("mds_health_cache_threshold");
void MDCache::handle_conf_change(const std::set<std::string>& changed, const MDSMap& mdsmap)
{
- if (changed.count("mds_cache_size"))
- cache_inode_limit = g_conf().get_val<int64_t>("mds_cache_size");
if (changed.count("mds_cache_memory_limit"))
cache_memory_limit = g_conf().get_val<Option::size_t>("mds_cache_memory_limit");
if (changed.count("mds_cache_reservation"))
void MDCache::log_stat()
{
- mds->logger->set(l_mds_inode_max, cache_inode_limit ? : INT_MAX);
mds->logger->set(l_mds_inodes, lru.lru_get_size());
mds->logger->set(l_mds_inodes_pinned, lru.lru_get_num_pinned());
mds->logger->set(l_mds_inodes_top, lru.lru_get_top());
explicit MDCache(MDSRank *m, PurgeQueue &purge_queue_);
~MDCache();
- uint64_t cache_limit_inodes(void) {
- return cache_inode_limit;
- }
uint64_t cache_limit_memory(void) {
return cache_memory_limit;
}
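+  // Fraction by which memory usage exceeds the reserve boundary,
+  // i.e. mds_cache_memory_limit scaled down by mds_cache_reservation;
+  // 0.0 means the cache still fits within the reserve.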
double cache_toofull_ratio(void) const {
- double inode_reserve = cache_inode_limit*(1.0-cache_reservation);
double memory_reserve = cache_memory_limit*(1.0-cache_reservation);
- return fmax(0.0, fmax((cache_size()-memory_reserve)/memory_reserve, cache_inode_limit == 0 ? 0.0 : (CInode::count()-inode_reserve)/inode_reserve));
+ return fmax(0.0, (cache_size()-memory_reserve)/memory_reserve);
}
bool cache_toofull(void) const {
return cache_toofull_ratio() > 0.0;
return mempool::get_pool(mempool::mds_co::id).allocated_bytes();
}
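+  // True once memory usage exceeds mds_cache_memory_limit scaled by
+  // mds_health_cache_threshold.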
bool cache_overfull(void) const {
- return (cache_inode_limit > 0 && CInode::count() > cache_inode_limit*cache_health_threshold) || (cache_size() > cache_memory_limit*cache_health_threshold);
+ return cache_size() > cache_memory_limit*cache_health_threshold;
}
void advance_stray() {
void finish_uncommitted_fragment(dirfrag_t basedirfrag, int op);
void rollback_uncommitted_fragment(dirfrag_t basedirfrag, frag_vec_t&& old_frags);
- uint64_t cache_inode_limit;
uint64_t cache_memory_limit;
double cache_reservation;
double cache_health_threshold;
mds_plb.add_u64_counter(l_mds_dir_commit, "dir_commit", "Directory commit");
mds_plb.add_u64_counter(l_mds_dir_split, "dir_split", "Directory split");
mds_plb.add_u64_counter(l_mds_dir_merge, "dir_merge", "Directory merge");
- mds_plb.add_u64(l_mds_inode_max, "inode_max", "Max inodes, cache size");
mds_plb.add_u64(l_mds_inodes_pinned, "inodes_pinned", "Inodes pinned");
mds_plb.add_u64(l_mds_inodes_expired, "inodes_expired", "Inodes expired");
mds_plb.add_u64(l_mds_inodes_with_caps, "inodes_with_caps",
"mds_cache_memory_limit",
"mds_cache_mid",
"mds_cache_reservation",
- "mds_cache_size",
"mds_cache_trim_decay_rate",
"mds_cap_revoke_eviction_timeout",
"mds_dump_cache_threshold_file",
l_mds_dir_commit,
l_mds_dir_split,
l_mds_dir_merge,
- l_mds_inode_max,
l_mds_inodes,
l_mds_inodes_top,
l_mds_inodes_bottom,
;debug mds = 20
;debug journaler = 20
- # The number of inodes to cache.
- # Type: 32-bit Integer
- # (Default: 100000)
- ;mds cache size = 250000
+ # The memory limit the MDS should enforce for its cache.
+ # (Default: 1G)
+ ;mds cache memory limit = 2G
;[mds.alpha]
; host = alpha
usage=$usage"\t--valgrind[_{osd,mds,mon,rgw}] 'toolname args...'\n"
usage=$usage"\t--nodaemon: use ceph-run as wrapper for mon/osd/mds\n"
usage=$usage"\t--redirect-output: only useful with nodaemon, directs output to log file\n"
-usage=$usage"\t--smallmds: limit mds cache size\n"
+usage=$usage"\t--smallmds: limit mds cache memory limit\n"
usage=$usage"\t-m ip:port\t\tspecify monitor address\n"
usage=$usage"\t-k keep old configuration files\n"
usage=$usage"\t-x enable cephx (on by default)\n"
wconf <<EOF
[mds]
mds log max segments = 2
- mds cache size = 10000
+ # The default 'mds cache memory limit' is 1GiB; here we set it to 100MiB.
+ mds cache memory limit = 100M
EOF
fi