cmake: error out if rocksdb is incompatible w/ tcmalloc
the commit d406f228 in gperftools implements a C11 feature used by a
recent change in rocksdb: 16e03882, which uses aligned_alloc().
16e03882 in rocksdb was merged after v5.7 was tagged, while d406f228 in
gperftools was merged after v2.6.1 was tagged.
because aligned_alloc() is not implemented by tcmalloc until the
not-yet-released 2.6.2, if we call aligned_alloc() in an application
linked against tcmalloc, what gets called will be glibc's
aligned_alloc(). but if we free() the memory chunk allocated by
aligned_alloc(), tcmalloc's implementation kicks in, then
InvalidFree() is called, because the memory chunk being freed was
not allocated by tcmalloc. in short, "mixing allocators", to quote
Dan Mick.
in rocksdb, aligned_alloc() is used if _ISOC11_SOURCE is defined. this
makes sense, because aligned_alloc() is a C11 function. we could avoid
using it by not defining _ISOC11_SOURCE, but as long as _GNU_SOURCE is
defined, glibc defines _ISOC11_SOURCE, and libstdc++ requires
_GNU_SOURCE because it uses a fair amount of GNU extensions.
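The failure described above comes from pairing one allocator's aligned_alloc() with another allocator's free(). As a minimal sketch (the helper names are hypothetical and not taken from rocksdb or ceph), the pattern in question looks like this. It is only safe when both calls resolve to the same allocator:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>

// Hypothetical helper: true if ptr is a multiple of `alignment`.
static bool is_aligned(const void* ptr, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Hypothetical helper: allocate with C11 aligned_alloc() and release with
// free(). If the two symbols come from different allocators (for example
// glibc and an older tcmalloc), the free() call is the one that fails.
bool aligned_roundtrip(std::size_t alignment, std::size_t size) {
  void* p = ::aligned_alloc(alignment, size);  // size is a multiple of alignment
  if (p == nullptr) return false;
  const bool ok = is_aligned(p, alignment);
  ::free(p);
  return ok;
}
```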
The fast dispatch refactor in 3cc48278bf0ee5c9535d04b60a661f988c50063b
eliminated the osdmap subscription in the ms_fast_dispatch path, which
meant ops could reach a PG without having the latest map. In a cluster
with few osdmap updates, where the monitor fails to send a new map to
an osd (it tries one random osd), this can result in indefinitely
blocked requests.
Fix this by adding an OSDService mechanism for scheduling a new osdmap
subscription request.
client: set client_try_dentry_invalidate to false by default
By default, ceph-fuse uses a side effect of 'dentry invalidation' to
trim the kernel dcache if it runs on kernel < 3.18. The implementation of
the kernel function d_invalidate() changed in the 3.18 kernel, so the
method no longer works for upstream kernels >= 3.18.
The RHEL 3.10 kernel includes a backport of the patches that change the
implementation of d_invalidate(). So checking the kernel version to decide
whether the 'dentry invalidation' method works is unreliable.
we've updated the rocksdb wrapper on the ceph side to be compatible with
the latest version of rocksdb upstream, so ceph is no longer compatible
with older versions of rocksdb.
ceph: do link/rename semantic checks after srcdn is readable
For a hard link, the source inode must not be a directory. For rename,
the types of the source and destination inodes must match. If srcdn is a
replica and we do these checks while it's not readable, it's possible
that the wrong source inode is used in these checks.
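The two semantic checks can be sketched roughly as below. This is an illustration only: the type and function names are hypothetical, and it assumes "types must match" means directory versus non-directory and applies only when the destination already exists.

```cpp
// Hypothetical inode classification for this sketch.
enum class InodeKind { File, Directory, Symlink };

// A hard link may not point at a directory.
bool link_source_ok(InodeKind src) {
  return src != InodeKind::Directory;
}

// Assumption: a rename over an existing destination requires both sides to
// be directories or both to be non-directories.
bool rename_types_ok(InodeKind src, bool dest_exists, InodeKind dest) {
  if (!dest_exists) return true;
  return (src == InodeKind::Directory) == (dest == InodeKind::Directory);
}
```

The point of the commit is ordering: checks like these are only meaningful once srcdn is readable, because before that the source inode they inspect may be the wrong one.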
** 1409700 Uninitialized scalar field
CID 1409700 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
2. uninit_member: Non-static class member alignment is not initialized
in this constructor nor in any functions that it calls.
** 1409702 Uninitialized scalar field
CID 1409702 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
2. uninit_member: Non-static class member alignment is not initialized
in this constructor nor in any functions that it calls.
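One common way to address an UNINIT_CTOR report like the ones above is a default member initializer, so every constructor starts from a defined value. The struct below is a hypothetical stand-in, not the class Coverity flagged:

```cpp
#include <cstddef>

// Hypothetical class with an `alignment` member, mirroring the report.
struct AlignedChunk {
  std::size_t alignment = 0;  // default member initializer covers all ctors
  std::size_t length = 0;

  AlignedChunk() {}
  explicit AlignedChunk(std::size_t len) : length(len) {}
};
```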
Jeff Layton [Thu, 14 Sep 2017 13:28:34 +0000 (09:28 -0400)]
lockdep: fix races with concurrent lockdep teardown
If the cct is unregistered while other threads are flogging mutexes,
then we can hit all sorts of bugs. Ensure that we handle that
situation sanely, by checking that g_lockdep is still set after
we take the lockdep_mutex.
Also, remove an assertion from lockdep_unregister, and just turn it into
an immediate return. It's possible to have a call to
lockdep_unregister_ceph_context, and then a call to
lockdep_register_ceph_context while a mutex is being held by another
task.
In that case, it's possible the lock does not exist in the map
when we go to unregister it. That's not a bug though, just a natural
consequence of that series of actions.
Tracker: http://tracker.ceph.com/issues/20988
Signed-off-by: Jeff Layton <jlayton@redhat.com>
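The pattern described in this commit (re-check the global state after taking the mutex, and treat a missing entry as a normal outcome) can be sketched as follows. All names here are hypothetical stand-ins; `lockdep_enabled` plays the role of g_lockdep, and this is not the actual lockdep code:

```cpp
#include <map>
#include <mutex>
#include <string>

namespace sketch {

std::mutex lockdep_mutex;
bool lockdep_enabled = false;            // stand-in for g_lockdep
std::map<std::string, int> lock_ids;     // registered lock names
int next_id = 0;

void enable() {
  std::lock_guard<std::mutex> l(lockdep_mutex);
  lockdep_enabled = true;
}

void teardown() {
  std::lock_guard<std::mutex> l(lockdep_mutex);
  lockdep_enabled = false;
  lock_ids.clear();
}

// Returns the lock id, or -1 if lockdep was torn down in the meantime.
int register_lock(const std::string& name) {
  std::lock_guard<std::mutex> l(lockdep_mutex);
  if (!lockdep_enabled) return -1;       // re-check after taking the mutex
  auto it = lock_ids.find(name);
  if (it != lock_ids.end()) return it->second;
  lock_ids[name] = next_id;
  return next_id++;
}

// Returns false instead of asserting when the entry is gone.
bool unregister_lock(const std::string& name) {
  std::lock_guard<std::mutex> l(lockdep_mutex);
  if (!lockdep_enabled) return false;
  auto it = lock_ids.find(name);
  if (it == lock_ids.end()) return false;  // dropped by teardown + re-register
  lock_ids.erase(it);
  return true;
}

}  // namespace sketch
```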
src/client/Client.cc: In member function ‘void Client::trim_caps(MetaSession*, int)’:
src/client/Client.cc:4121:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (s->caps.size() > max)
~~~~~~~~~~~~~~~^~~~~
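A -Wsign-compare warning of this shape is usually silenced by making the signedness explicit. The helper below is a hypothetical sketch, not the change made in Client.cc; it treats a negative limit as always exceeded, which is an assumption about the intended semantics:

```cpp
#include <cstddef>

// Hypothetical helper: compare an unsigned count against a signed limit
// without relying on implicit signed-to-unsigned conversion.
bool over_limit(std::size_t count, int max) {
  if (max < 0) return true;  // assumption: any count exceeds a negative limit
  return count > static_cast<std::size_t>(max);
}
```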
https://github.com/ceph/ceph/pull/17371 introduces support for a
per-pool space-full flag, which turns out to set both the
full and full_no_quota flags now if a pool is currently running out
of quota.
Actually this test is fragile as long as we keep appending new flags
at pool granularity, but let's not bother with that complexity now.
improve the interoperability between freebsd/osx and GNU/Linux, because
their layouts of sockaddr_storage are different, and we use the
linux one as the wire format. so we need to convert it on the
freebsd/osx side.
clang on osx emits functions with a leading underscore, but the isa-l
assembly's functions have no leading underscore. we could label the
function declaration like `int foo asm("foo")` to remove the leading
underscore, but isa-l is a git submodule, so let's do this later. in the
meantime, disable this plugin on osx.
crc32: label assembler functions without leading underscore
clang on osx adds a leading underscore to function names to be
ABI compatible, but the assembly code does not do so. so we need to
control the name using a gcc/clang extension. see
https://gcc.gnu.org/onlinedocs/gcc-4.4.0/gcc/Asm-Labels.html#Asm-Labels
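As a small illustration of the asm-label extension referenced above: the declaration pins the symbol name, so the compiler does not add a platform-specific prefix. The function name, the label, and the C++ stand-in body are all hypothetical; in the real case the definition would live in an assembly file.

```cpp
// The asm label fixes the linker-visible name to exactly "crc_sketch_plain".
extern "C" unsigned crc_sketch(unsigned seed) asm("crc_sketch_plain");

// Stand-in definition so the sketch links on its own; it inherits the
// symbol name from the labeled declaration above.
extern "C" unsigned crc_sketch(unsigned seed) {
  return seed ^ 0xffffffffu;
}
```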
compat: consolidate definitions of osx and freebsd
on osx, ENODATA = 96, so we need to fix it. also define
CLOCK_MONOTONIC_COARSE and CLOCK_REALTIME_COARSE for osx, ceph_time.h
defines this also, but i don't want to include compat.h in ceph_time.h
at this moment.
and silence the warning of
#warning ENODATA already defined to a value different from 87 (ENOATRR), refining to fix
because it is fired everywhere on osx when "compat.h" is included.
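A compatibility shim for the coarse clock ids might look like the sketch below. The commit message does not say which values are used, so mapping the coarse clocks onto the regular ones is purely an assumption for illustration:

```cpp
#include <time.h>

// Assumption for this sketch: where the coarse clock ids are missing,
// fall back to the regular clocks.
#ifndef CLOCK_MONOTONIC_COARSE
#define CLOCK_MONOTONIC_COARSE CLOCK_MONOTONIC
#endif
#ifndef CLOCK_REALTIME_COARSE
#define CLOCK_REALTIME_COARSE CLOCK_REALTIME
#endif

// Hypothetical helper: true if both ids can be read with clock_gettime().
bool coarse_clocks_usable() {
  struct timespec ts;
  return clock_gettime(CLOCK_MONOTONIC_COARSE, &ts) == 0 &&
         clock_gettime(CLOCK_REALTIME_COARSE, &ts) == 0;
}
```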
msg/msg_types: fix the denc of sockaddr_storage on freebsd/osx
the layouts of sockaddr_in and sockaddr_in6 are different on
GNU/Linux and FreeBSD/OSX:
- on GNU/Linux, sockaddr does not have sa_len,
- on GNU/Linux, sockaddr* use a 16 bit integer for sa_family, but
on FreeBSD, an 8 bit integer is used.
so we need to be more careful when we memcpy() between sockaddr_storage
and ceph_sockaddr_storage.
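A rough sketch of such a conversion is shown below. It is not the ceph encoder: the function names are hypothetical, and it assumes the wire format starts with a 16-bit family in network byte order and that the address payload begins at byte offset 2 on both kinds of platform.

```cpp
#include <arpa/inet.h>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>

constexpr std::size_t kWireSize = sizeof(sockaddr_storage);

// Copy the whole struct, then rewrite the two leading bytes as a 16-bit
// family (assumed big-endian on the wire), whatever the native header is.
void to_wire(const sockaddr_storage& ss, unsigned char* out) {
  std::memcpy(out, &ss, kWireSize);
  const std::uint16_t family = ss.ss_family;
  out[0] = static_cast<unsigned char>(family >> 8);
  out[1] = static_cast<unsigned char>(family & 0xff);
}

// Reverse direction: restore the native header fields after the raw copy.
void from_wire(const unsigned char* in, sockaddr_storage& ss) {
  std::memcpy(&ss, in, kWireSize);
  ss.ss_family = static_cast<sa_family_t>((in[0] << 8) | in[1]);
#if defined(__FreeBSD__) || defined(__APPLE__)
  ss.ss_len = sizeof(ss);  // BSD-style layouts also carry a length field
#endif
}
```

Rewriting only the header bytes keeps the rest of the copy a plain memcpy(), which is the part the commit message says needs care.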
* use mach_absolute_time() for monotonic time
mach_absolute_time() is faster and monotonic, see
https://developer.apple.com/library/content/qa/qa1398/_index.html
for its implementation, see
https://opensource.apple.com/source/xnu/xnu-3248.60.10/libsyscall/wrappers/mach_absolute_time.s
it's using rdtsc.
* and remove unnecessary headers from ceph_time.h
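For illustration, a monotonic nanosecond reader built on mach_absolute_time() could look like the sketch below. The function name is hypothetical and this is not the ceph_time.h code; on non-Apple platforms it falls back to std::chrono::steady_clock so the sketch builds everywhere.

```cpp
#include <chrono>
#include <cstdint>
#if defined(__APPLE__)
#include <mach/mach_time.h>
#endif

// Hypothetical helper: monotonic time in nanoseconds.
std::uint64_t monotonic_ns() {
#if defined(__APPLE__)
  // Ticks are converted to nanoseconds with the timebase ratio.
  static const mach_timebase_info_data_t tb = [] {
    mach_timebase_info_data_t t;
    mach_timebase_info(&t);
    return t;
  }();
  return mach_absolute_time() * tb.numer / tb.denom;
#else
  using namespace std::chrono;
  return static_cast<std::uint64_t>(
      duration_cast<nanoseconds>(steady_clock::now().time_since_epoch())
          .count());
#endif
}
```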