Patrick Donnelly [Mon, 11 Sep 2017 22:21:52 +0000 (15:21 -0700)]
mds: support limiting cache by memory
This introduces two config parameters:
mds_cache_memory_limit: Sets the soft maximum of the cache to the given
byte count. (Like mds_cache_size, this doesn't actually limit the maximum
size of the cache. It just dictates the steady-state size.)
mds_cache_reservation: This replaces mds_health_cache_threshold everywhere
except the Beacon heartbeat sent to the mons. The idea here is to specify a
reservation of memory (5% by default) for operations and the MDS tries to
always maintain that reservation. So, the MDS will recall caps from clients
when it begins dipping into its reservation of memory.
mds_cache_size still limits the cache by inode count but now defaults to 0
(i.e. unlimited). The new preferred way of specifying cache limits is by memory
size. The default is 1GB.
Fixes: http://tracker.ceph.com/issues/20594
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1464976
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
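A rough sketch of how the two settings described above might interact. Only the option names and the defaults (1GB limit, 5% reservation) come from the message; the helper names and the arithmetic are assumptions for illustration and are not the MDS implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not Ceph code: the cache usage (in bytes) above which
// the reservation starts being consumed, assuming the reservation is a
// fraction of mds_cache_memory_limit.
std::uint64_t reservation_threshold(std::uint64_t memory_limit, double reservation) {
  assert(reservation >= 0.0 && reservation <= 1.0);
  return static_cast<std::uint64_t>(memory_limit * (1.0 - reservation));
}

// Hypothetical check: true once usage has dipped into the reservation, which
// is the point where the message says the MDS recalls caps from clients.
bool dipping_into_reservation(std::uint64_t used, std::uint64_t memory_limit,
                              double reservation) {
  return used > reservation_threshold(memory_limit, reservation);
}
```

Under these assumptions, a 1GB limit with a 5% reservation would start recalling caps at roughly 973MB of cache usage.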
Avoids an unnecessary "max" size of the LRU which was used to calculate the
midpoint. Instead, just dynamically move the LRUObjects between top and bottom
on-the-fly.
This change is necessary for a cache which does not limit by the number
of objects but by some other metric. (In this case, memory.)
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
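A minimal sketch of the idea above: keep the top/bottom split at a fixed ratio of the current size, moving entries between the two segments on every change instead of deriving the midpoint from a configured "max". The class name, its methods, and the rebalancing rule are hypothetical and do not mirror the Ceph LRU class.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical two-segment LRU, for illustration only.
class TwoSegmentLRU {
public:
  explicit TwoSegmentLRU(double midpoint) : midpoint_(midpoint) {}

  // New items enter at the hot end of the top segment.
  void insert(int item) {
    top_.push_front(item);
    rebalance();
  }

  // Remove and return the coldest item.
  int expire() {
    assert(size() > 0);
    std::list<int>& from = bottom_.empty() ? top_ : bottom_;
    int v = from.back();
    from.pop_back();
    rebalance();
    return v;
  }

  std::size_t size() const { return top_.size() + bottom_.size(); }
  std::size_t top_size() const { return top_.size(); }

private:
  // Keep top_ at midpoint_ * size() entries, computed from the current
  // size rather than from a fixed maximum.
  void rebalance() {
    std::size_t target = static_cast<std::size_t>(midpoint_ * size());
    while (top_.size() > target) {
      bottom_.push_front(top_.back());
      top_.pop_back();
    }
    while (top_.size() < target && !bottom_.empty()) {
      top_.push_back(bottom_.front());
      bottom_.pop_front();
    }
  }

  double midpoint_;
  std::list<int> top_;     // hotter entries
  std::list<int> bottom_;  // colder entries, expired first
};
```

Because the target is recomputed from size(), the structure works the same whether the owner trims it by object count or by memory.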
Patrick Donnelly [Tue, 12 Sep 2017 21:29:49 +0000 (14:29 -0700)]
mds: go back to compact_map for replicas
Zheng observed that an alloc_ptr doesn't really work in this case since any
call to get_replicas() will cause the map to be allocated, nullifying the
benefit. Use a compact_map until a better solution can be written. (This means
that the map will be allocated outside the mempool.)
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Fri, 28 Jul 2017 00:21:54 +0000 (17:21 -0700)]
mds: use mempool for cache objects
The purpose of this is to allow us to track memory usage by cached objects so
we can limit cache size based on memory available/allocated to the MDS.
This commit is a first step: it adds CInode, CDir, and CDentry to the mempool
but not all of the containers in these classes (e.g. std::map). However,
MDSCacheObject has been changed to allocate its containers through the mempool
by converting compact_* containers to the std versions offered through mempool
via the new alloc_ptr.
(A compact_* class simply wraps a pointer to the std:: version to reduce memory
usage of an object when the container is only occasionally used. The alloc_ptr
allows us to achieve the same thing explicitly with only a little handholding:
when all entries in the wrapped container are deleted, the caller must call
alloc_ptr.release().)
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Thu, 27 Jul 2017 19:10:14 +0000 (12:10 -0700)]
common: add alloc_ptr smart pointer
This ptr is like a unique_ptr except it allocates the underlying object on
access. The idea being that we can save memory if the object is only needed
sometimes.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
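A minimal sketch of the allocate-on-access idea, assuming a wrapper around std::unique_ptr. The lazy_ptr name and its interface are hypothetical stand-ins and are not the actual alloc_ptr API.

```cpp
#include <memory>

// Hypothetical lazy_ptr: owns at most one T and allocates it on first access.
template <typename T>
class lazy_ptr {
public:
  T& operator*() { return *get(); }
  T* operator->() { return get(); }

  // True if the underlying object currently exists.
  bool allocated() const { return static_cast<bool>(ptr_); }

  // Free the underlying object; the next access allocates a fresh one.
  void release() { ptr_.reset(); }

private:
  T* get() {
    if (!ptr_) {
      ptr_.reset(new T());  // value-initialized on first use
    }
    return ptr_.get();
  }

  std::unique_ptr<T> ptr_;
};
```

Until the first dereference the wrapper costs only one pointer, which is where the memory saving would come from.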
* refs/remotes/upstream/pull/17340/head:
mds: avoid sending cap import message when inode is frozen
client: fix message order check in handle_cap_export()
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/remotes/upstream/pull/16778/head:
mds: fix return value of MDCache::dump_cache
mds: new cap message flags indicate if there is pending capsnap
mds: properly do null snapflush part2
mds: track snap inodes through sorted map
mds: properly drop wrlock when finishing snapflush
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/remotes/upstream/pull/17095/head:
client: reset unmounting flag to false when starting a new mount
client: add mountedness check inside client_lock
client: rework Client::get_local_osd() return codes
client: remove misleading comment in get_cap_ref
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Douglas Fuller <dfuller@redhat.com>
Sage Weil [Wed, 6 Sep 2017 02:25:03 +0000 (22:25 -0400)]
crush: fix fast rule lookup when uniform
Older clients will search for the first rule with a matching ruleset,
type, and size. The has_uniform_rules bool is only set if we have rule
ids and rulesets that line up, but we must also verify that the rest of the
mask matches. Otherwise we can get a different CRUSH mapping result: the
mask might not match, so old clients will fail to find a rule while we
find one. We also can't just check the ruleset, as legacy clients find the
*first* (of potentially many) matching rules; hence we only do the fast
check if all rulesets == rule id.
_should_compact_log uses new_log != nullptr to tell whether compaction is
already in progress, but we don't set it until we are midway through the
process. Set it at the top of the method to prevent reentry.
The idea is to allow scripts to look up the contributor name/email by GitHub
username. This is particularly useful for adding the appropriate "Reviewed-by"
line for each GitHub-style "review".
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Refactor OSD::build_initial_pg_history() so we update the info only if a
new interval is identified. This is also consistent with
OSD::build_past_intervals_parallel().
huanwen ren [Wed, 30 Aug 2017 08:47:24 +0000 (16:47 +0800)]
mgr: add the ip addr of standbys
We need to track the IP address of mgr daemons in the "standbys" state,
because the hostname/gid is insufficient to locate the standby node.
Add the IP of the mgr standby to the metadata.
Danny Al-Gaaf [Tue, 30 May 2017 09:29:42 +0000 (11:29 +0200)]
common/Timer.h: ~SafeTimer needs to be virtual
Fix for:
CID 1396232 (#1 of 1): Non-virtual destructor (VIRTUAL_DTOR)
nonvirtual_dtor: Class librbd::<unnamed>::SafeTimerSingleton has a
destructor and a pointer to it is upcast to class SafeTimer which
doesn't have a virtual destructor.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
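To illustrate the defect class reported above: deleting a derived object through a base pointer only runs the derived destructor reliably when the base destructor is virtual. The types below are hypothetical stand-ins, not SafeTimer or SafeTimerSingleton.

```cpp
// Hypothetical base class with the fix applied: a virtual destructor.
struct TimerBase {
  virtual ~TimerBase() {}
};

// Hypothetical derived class that records when its destructor runs.
struct TimerSingleton : TimerBase {
  explicit TimerSingleton(bool* destroyed) : destroyed_(destroyed) {}
  ~TimerSingleton() override { *destroyed_ = true; }
  bool* destroyed_;
};

// Delete a derived object through an upcast pointer and report whether
// the derived destructor ran.
bool derived_dtor_runs_via_base_pointer() {
  bool destroyed = false;
  TimerBase* p = new TimerSingleton(&destroyed);
  delete p;  // well-defined only because ~TimerBase is virtual
  return destroyed;
}
```

Without the virtual keyword the delete through TimerBase* would be undefined behavior, which is what VIRTUAL_DTOR flags.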
Danny Al-Gaaf [Tue, 30 May 2017 08:56:23 +0000 (10:56 +0200)]
test_ipaddr.cc: memset with 0 and not '0'
Fix for:
CID 1405070 (#1 of 1): Memset fill value of '0' (NO_EFFECT)
CID 1405071 (#1 of 1): Memset fill value of '0' (NO_EFFECT)
CID 1405073 (#1 of 1): Memset fill value of '0' (NO_EFFECT)
CID 1405074 (#1 of 1): Memset fill value of '0' (NO_EFFECT)
CID 1405075 (#1 of 1): Memset fill value of '0' (NO_EFFECT)
CID 1405077 (#1 of 1): Memset fill value of '0' (NO_EFFECT)
CID 1405083 (#1 of 1): Memset fill value of '0' (NO_EFFECT)
CID 1405086 (#1 of 1): Memset fill value of '0' (NO_EFFECT)
CID 1405087 (#1 of 1): Memset fill value of '0' (NO_EFFECT)
bad_memset: "memset" with fill value "'0'" (the zero character).
memset(&net, 48, 28UL). (CWE-665)
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
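A small sketch of why the fill value matters: '0' is the character with value 48 (as the Coverity output above shows), so the buffer is not zeroed. The struct and function names are hypothetical; the 28-byte size is taken from the report.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the address structs in the test; not Ceph code.
struct FakeAddr {
  unsigned char bytes[28];
};

// Fill the struct with the given value and report whether every byte is zero.
bool all_bytes_zero_after_fill(int fill) {
  FakeAddr a;
  std::memset(&a, fill, sizeof(a));
  for (std::size_t i = 0; i < sizeof(a.bytes); ++i) {
    if (a.bytes[i] != 0) {
      return false;
    }
  }
  return true;
}
```

Calling it with 0 yields a zeroed struct, while calling it with '0' leaves every byte set to 48.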
Danny Al-Gaaf [Thu, 11 May 2017 14:37:26 +0000 (16:37 +0200)]
client/MetaRequest.h: fix UNINIT_CTOR
Fix for:
CID 717207 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR)
uninit_member: Non-static class member dirp is not initialized
in this constructor nor in any functions that it calls.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
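The usual remedy for this kind of UNINIT_CTOR report is to initialize the pointer in the constructor. The sketch below uses hypothetical types; only the member name dirp comes from the report, and the actual change to MetaRequest may differ.

```cpp
// Hypothetical placeholder for whatever dirp points to.
struct FakeDirResult {};

// Hypothetical request type with the pointer member initialized explicitly.
struct FakeRequest {
  FakeDirResult* dirp;
  FakeRequest() : dirp(nullptr) {}  // no indeterminate value after construction
};

// Report whether a freshly constructed request has a null dirp.
bool dirp_starts_null() {
  FakeRequest r;
  return r.dirp == nullptr;
}
```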
Danny Al-Gaaf [Thu, 11 May 2017 14:34:48 +0000 (16:34 +0200)]
client/Client.cc: fix UNINIT_CTOR
Fix for:
CID 1406088 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR)
uninit_member: Non-static class member root_ancestor is not
initialized in this constructor nor in any functions that it calls.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf [Wed, 10 May 2017 14:52:54 +0000 (16:52 +0200)]
tools/rbd/Utils.cc: yank features_set_specified and related logic since it isn't used
Fix for:
CID 1394854 (#1 of 1): 'Constant' variable guards dead code (DEADCODE)
dead_error_line: Execution cannot reach this statement:
opts->set(RBD_IMAGE_OPTION_....
Local variable features_set_specified is assigned only once, to a
constant value, making it effectively constant throughout its scope.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>