Sage Weil [Thu, 26 Jan 2017 19:22:53 +0000 (14:22 -0500)]
os/bluestore: fix statfs to not include DB partition in free space
If we report the DB space as available, ceph thinks the OSD can store more
data and will not mark the cluster as full as easily. And in reality, we
can't actually store data in this space--only metadata. Avoid the problem
by not reporting it as available.
Fixes: http://tracker.ceph.com/issues/18599
Signed-off-by: Sage Weil <sage@redhat.com>
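The idea in the commit above can be illustrated with a minimal sketch. The struct and function names here are hypothetical and are not taken from the BlueStore source; the sketch only assumes that free space on the DB partition is tracked separately from free space on the main device.

```cpp
#include <cstdint>

// Hypothetical, simplified view of the space behind one OSD.
struct DeviceSpace {
  uint64_t main_free;  // unallocated bytes on the main (data) device
  uint64_t db_free;    // unallocated bytes on the separate DB partition
};

// Naive accounting: counts DB space as if object data could be stored there.
uint64_t available_including_db(const DeviceSpace& s) {
  return s.main_free + s.db_free;
}

// Accounting in the spirit of the fix: only space that can hold object
// data is reported, so the DB partition is left out.
uint64_t available_for_data(const DeviceSpace& s) {
  return s.main_free;
}
```

With the second function, an OSD whose main device is nearly full reports little available space even when the DB partition is mostly empty.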
Amir Vadai [Wed, 25 Jan 2017 08:36:00 +0000 (10:36 +0200)]
msg/RDMA: Fix broken compilation due to new argument in net.connect()
Fixes: 6e4ed291afc3 ("msg: add ms_bind_before_connect to bind before connect")
Change-Id: Ia45f215b5d59dfc8545017518e5162404059829e
Signed-off-by: Amir Vadai <amir@vadai.me>
Matt Benjamin [Sat, 31 Dec 2016 04:30:16 +0000 (23:30 -0500)]
rgw_file: interned RGWFileHandle objects need parent refs
RGW NFS fhcache/RGWFileHandle operators assume existence of the
full chain of parents from any object to its fs_root--this is
a consequence of the weakly-connected namespace design goal, and
not a defect.
This change ensures the invariant by taking a parent ref when
objects are interned (when a parent ref is guaranteed). Parent
refs are returned when objects are destroyed--essentially by the
invariant, such a ref must exist.
The extra ref is omitted when parent->is_root(), as that node is
not in the LRU cache.
Fixes: http://tracker.ceph.com/issues/18650
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
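The parent-ref invariant described above can be sketched roughly as follows. This is not the RGWFileHandle code: the type, the fields, and the functions intern() and destroy() are hypothetical, and the sketch assumes a plain integer reference count and that the root is the only node without a parent.

```cpp
// Hypothetical stand-in for a cached file handle.
struct Handle {
  Handle* parent = nullptr;
  int refcnt = 0;
  bool holds_parent_ref = false;

  // Assumption for this sketch: the root is the node with no parent.
  bool is_root() const { return parent == nullptr; }
};

// Take a ref on the parent when the handle is interned, except when the
// parent is the root, which is not in the LRU cache.
void intern(Handle& h) {
  if (h.parent != nullptr && !h.parent->is_root()) {
    ++h.parent->refcnt;
    h.holds_parent_ref = true;
  }
}

// Return the parent ref when the handle is destroyed.
void destroy(Handle& h) {
  if (h.holds_parent_ref) {
    --h.parent->refcnt;
    h.holds_parent_ref = false;
  }
}
```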
Matt Benjamin [Thu, 19 Jan 2017 23:14:30 +0000 (18:14 -0500)]
rgw_file: add timed namespace invalidation
With this change, librgw/rgw_file consumers can provide an invalidation
callback, which is used by the library to invalidate directories
whose contents should be forgotten.
The existing RGWLib GC mechanism is being used to drive this. New
configuration params have been added. The main configurable is
rgw_nfs_namespace_expire_secs, the expire timeout.
Updated post Yehuda review.
Fixes: http://tracker.ceph.com/issues/18651
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Sage Weil [Fri, 20 Jan 2017 18:59:56 +0000 (13:59 -0500)]
os/bluestore/BlueFS: increase size threshold before we flush (and generate io)
Having this too high means you might be more bursty. In practice,
though, the commit path is doing explicit syncs on small chunks
anyway. And compaction work should probably stay reasonably chunky.
Hongtong Liu [Sun, 22 Jan 2017 09:25:04 +0000 (17:25 +0800)]
os/bluestore: fix NVMEDevice::open failure if serial number ends with a number
buf in effect is the serial number in ceph.conf and
the serial number consists of 16 hexadecimal characters.
1. In order to avoid ignoring the numbers, scan buf
with isxdigit.
2. In order to ignore all the potential garbage,
scan buf from the beginning.
Signed-off-by: Hongtong Liu <hongtong.liu@istuary.com>
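The two-step scan described above might look roughly like this. The function name extract_serial is hypothetical and the code is not the NVMEDevice implementation; it only illustrates scanning from the beginning of the buffer and accepting characters with isxdigit, so that trailing digits are kept and anything after the hex run is dropped.

```cpp
#include <cctype>
#include <string>

// Collect the leading run of hexadecimal characters in buf. Scanning
// starts at the beginning, and the first non-hex character (newline,
// padding, other garbage) ends the serial number.
std::string extract_serial(const std::string& buf) {
  std::string serial;
  for (unsigned char c : buf) {
    if (!std::isxdigit(c)) {
      break;
    }
    serial.push_back(static_cast<char>(c));
  }
  return serial;
}
```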
Kefu Chai [Thu, 19 Jan 2017 04:36:06 +0000 (12:36 +0800)]
common/BackTrace: demangle on FreeBSD also
the output on FreeBSD/clang looks like:
1: 0x44bfb3 <_Z3foov+0x413> at /usr/srcs/Ceph/work/ceph/build/bin/unittest_back_trace
2: 0x44c23e <_ZN20BackTrace_Basic_Test8TestBodyEv+0x1e> at /usr/srcs/Ceph/work/ceph/build/bin/unittest_back_trace
3: 0x4d068a <_ZN7testing8internal38HandleSehExceptionsInMethodIfSupportedINS_4TestEvEET0_PT_MS4_FS3_vEPKc+0x7a> at /usr/srcs/Ceph/work/ceph/build/bin/unittest_back_trace
4: 0x4b5977 <_ZN7testing8internal35HandleExceptionsInMethodIfSupportedINS_4TestEvEET0_PT_MS4_FS3_vEPKc+0x77> at /usr/srcs/Ceph/work/ceph/build/bin/unittest_back_trace
...
and update the test accordingly, as FreeBSD/clang uses '<>' to enclose
the mangled function and offset.
also, only demangle the C++ mangled names. those names always start with
"_Z". on FreeBSD, after demangling, "main" is turned into "unsigned
long", which does not make sense.
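A minimal sketch of the "only demangle names starting with _Z" rule follows. The helper name demangle_if_cxx is hypothetical and is not the function used in common/BackTrace; the sketch assumes a GCC or Clang toolchain that provides abi::__cxa_demangle in <cxxabi.h>.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <cstring>
#include <string>

// Demangle only names that carry the C++ mangling prefix "_Z".
// Plain C symbols such as "main" are returned unchanged.
std::string demangle_if_cxx(const char* name) {
  if (std::strncmp(name, "_Z", 2) != 0) {
    return name;
  }
  int status = 0;
  // With a null output buffer, the result is allocated with malloc()
  // and must be released with free().
  char* out = abi::__cxa_demangle(name, nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) {
    return name;  // fall back to the mangled name on failure
  }
  std::string result(out);
  std::free(out);
  return result;
}
```

Checking the prefix first avoids handing non-C++ symbols to the demangler at all, which is how "main" ended up as "unsigned long" in the case described above.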
Jason Dillaman [Fri, 20 Jan 2017 19:26:43 +0000 (14:26 -0500)]
journal: don't hold future lock during assignment
It's possible for the future to race with its owner and reach
a zero reference count. This was resulting in the future being
destructed while its lock was still held.
Fixes: http://tracker.ceph.com/issues/18618
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Venky Shankar [Thu, 1 Dec 2016 04:57:30 +0000 (10:27 +0530)]
librbd: Create few empty objects during copyup
This is based on Doug's (@fullerdj) work (PR #9329)
as an attempt to avoid creating empty objects when
flattening an image and otherwise whenever unnecessary.
This gives good optimization benefit when a parent image
is sparsely populated. Moreover, this change is required
for correct behavior when checking disk usage of a clone
(which used to report a fully allocated image because all objects,
including empty ones, were created during flatten).
Signed-off-by: Douglas Fuller dfuller@redhat.com
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Venky Shankar [Mon, 5 Dec 2016 09:20:06 +0000 (14:50 +0530)]
librbd: make has_parent() prone to callers from copyup
This is required when CopyupRequest would need to invoke
pre_object_map_update() as part of upcoming changes to
create fewer child image objects whenever possible.
CopyupRequest constructor accepts image extents as an
rvalue forcing the caller to transfer ownership to it
and leaving the original variable in an unspecified
state, making has_parent() return incorrect state when
invoked from CopyupRequest. Therefore, introduce a
private tracking state that can be used in place of
checking emptiness of parent image extents.
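The moved-from problem and the tracking flag described above can be sketched as follows. All class and member names are hypothetical stand-ins, not the librbd types; the sketch only assumes that the extents are a vector handed to the copyup object by rvalue reference.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using Extents = std::vector<std::pair<uint64_t, uint64_t>>;

// Hypothetical stand-in for CopyupRequest: takes ownership of the extents.
struct CopyupSketch {
  explicit CopyupSketch(Extents&& e) : extents(std::move(e)) {}
  Extents extents;
};

// Hypothetical stand-in for the object request that owns the extents.
class RequestSketch {
 public:
  explicit RequestSketch(Extents parent_extents)
      : m_parent_extents(std::move(parent_extents)),
        m_has_parent(!m_parent_extents.empty()) {}

  // Answers from the private flag rather than from the (possibly
  // moved-from) extents container.
  bool has_parent() const { return m_has_parent; }

  // Hands the extents over; m_parent_extents must not be inspected
  // afterwards, since its state is unspecified.
  CopyupSketch start_copyup() {
    return CopyupSketch(std::move(m_parent_extents));
  }

 private:
  Extents m_parent_extents;  // declared first, so initialized first
  bool m_has_parent;
};
```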
Sage Weil [Fri, 20 Jan 2017 03:09:35 +0000 (21:09 -0600)]
osdc/Objecter: infer ptruncated on old OSDs via max_entries
If we do not get an explicit 'more' value from the OSD, infer it by
checking whether we got the max requested entries. On old OSDs, which
don't enforce a limit, this will work. On new OSDs, we will get the
explicit result.
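The inference rule above can be written down as a small sketch. The struct and field names are hypothetical and do not mirror the Objecter code; the sketch assumes the reply either carries an explicit 'more' flag or omits it.

```cpp
#include <cstddef>

// Hypothetical summary of a listing reply.
struct ListReply {
  bool has_explicit_more;  // true if the OSD sent a 'more' value
  bool more;               // meaningful only when has_explicit_more is true
  std::size_t num_entries; // entries actually returned
};

// Decide whether the listing was truncated. New OSDs say so explicitly;
// for old OSDs, receiving at least the requested maximum is taken as a
// sign that more entries may remain.
bool listing_truncated(const ListReply& r, std::size_t max_entries) {
  if (r.has_explicit_more) {
    return r.more;
  }
  return r.num_entries >= max_entries;
}
```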
common/pick_address.cc: Copy public_netw to cluster_netw if cluster empty
- When the public network is set but the cluster network is not, the
cluster bindings would be on 0.0.0.0, which could be unexpected.
In this commit we copy the public network into the cluster network
to make sure that the cluster backend is not bound to 0.0.0.0,
which could be considered an insecure, or unexpected, action.
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
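The fallback described above amounts to a small defaulting step, sketched here. The struct and function names are hypothetical and are not the pick_address.cc code; the networks are treated as opaque strings and the example value is made up for illustration.

```cpp
#include <string>

// Hypothetical holder for the two network settings.
struct NetConfig {
  std::string public_network;
  std::string cluster_network;
};

// If only the public network is configured, reuse it for the cluster
// network so that cluster-side binding does not fall back to 0.0.0.0.
void default_cluster_network(NetConfig& conf) {
  if (conf.cluster_network.empty() && !conf.public_network.empty()) {
    conf.cluster_network = conf.public_network;
  }
}
```

An explicitly configured cluster network is left untouched, and nothing changes when neither network is set.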
Kefu Chai [Fri, 13 Jan 2017 07:41:15 +0000 (15:41 +0800)]
cmake: link ceph-{mds,mgr,mon,osd} against libcommon
add a static library named global-static, which does not link with
libceph-common. so the executables which do not link against
lib{rados,cephfs,rbd} can be linked against global-static instead if
they want to access the symbols previously available from libglobal.
and libglobal is now linked against libceph-common. and it is supposed
to be used by executables packaged by ceph-test. these executables can
safely depend on libceph-common offered by package of "librados2".