git.apps.os.sepia.ceph.com Git - ceph.git/log
Pan Liu [Tue, 28 Mar 2017 08:48:21 +0000 (16:48 +0800)]
rbd-nbd: polish the output info before and after ioctl NBD_DISCONNECT.
Signed-off-by: Pan Liu <liupan1111@gmail.com>
Pan Liu [Tue, 28 Mar 2017 08:33:25 +0000 (16:33 +0800)]
rbd-nbd: support signal handle for SIGHUP, SIGINT, and SIGTERM.
Fixes: http://tracker.ceph.com/issues/19349
Signed-off-by: Pan Liu <liupan1111@gmail.com>
Orit Wasserman [Sun, 26 Mar 2017 07:33:28 +0000 (10:33 +0300)]
Merge pull request #10121 from theanalyst/wip-16357
rgw: cls_user don't clobber existing bucket stats when creating bucket
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Mykola Golub [Sat, 25 Mar 2017 20:26:23 +0000 (22:26 +0200)]
Merge pull request #14134 from wangzhengyong/doc
doc: add some undocumented options to rbd-nbd
Reviewed-by: Pan Liu <liupan1111@gmail.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Jason Dillaman [Sat, 25 Mar 2017 20:21:00 +0000 (16:21 -0400)]
Merge pull request #14091 from trociny/wip-prepare_async_request
librbd: potential use of uninitialised value in ImageWatcher
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Haomai Wang [Sat, 25 Mar 2017 19:38:35 +0000 (03:38 +0800)]
Merge pull request #13740 from Adirl/forksafe
msg/async/rdma: Add fork safe on RDMA
Reviewed-by: Haomai Wang <haomai@xsky.com>
Sage Weil [Sat, 25 Mar 2017 18:08:33 +0000 (13:08 -0500)]
Merge pull request #13965 from liewegas/wip-bluestore-pc
os/bluestore: fix perf counters
Sage Weil [Sat, 25 Mar 2017 18:07:37 +0000 (13:07 -0500)]
Merge pull request #13962 from Liuchang0812/wip-add-override-in-osd-headers
osd: add override in headers files
Reviewed-by: Sage Weil <sage@redhat.com>
wangzhengyong [Sat, 25 Mar 2017 07:09:01 +0000 (15:09 +0800)]
doc: add some undocumented options to rbd-nbd
Signed-off-by: wangzhengyong@cmss.chinamobile.com
liuchang0812 [Wed, 22 Feb 2017 11:43:11 +0000 (19:43 +0800)]
osd: add override in headers files
Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
Kefu Chai [Sat, 25 Mar 2017 04:13:17 +0000 (12:13 +0800)]
Merge pull request #14114 from dmick/wip-boost-j
debian/rules, ceph.spec.in: invoke cmake with -DBOOST_J
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Fri, 24 Mar 2017 21:41:37 +0000 (16:41 -0500)]
Merge pull request #13889 from liewegas/wip-denc-nullptr
include/denc: remove nullptr runtime magic boundedness check
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Fri, 24 Mar 2017 21:41:18 +0000 (16:41 -0500)]
Merge pull request #14096 from baiyanchun/remove_useless_parameter
common: remove useless parameter
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Pan Liu <liupan1111@gmail.com>
Sage Weil [Fri, 24 Mar 2017 20:28:27 +0000 (15:28 -0500)]
Merge pull request #14131 from liewegas/wip-crush-encode
crush: only encode class info if SERVER_LUMINOUS
Reviewed-by: Loic Dachary <ldachary@redhat.com>
Sage Weil [Fri, 24 Mar 2017 18:17:39 +0000 (13:17 -0500)]
Merge pull request #13960 from wangzhengyong/kstore
os/kstore: some error handling
Reviewed-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 24 Mar 2017 18:16:58 +0000 (13:16 -0500)]
Merge pull request #13973 from shinobu-x/wp-sk-primarylogpg-null-nullptr
osd/PrimaryLogPG: nullptr not NULL
Reviewed-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 24 Mar 2017 18:13:39 +0000 (13:13 -0500)]
Merge pull request #13995 from liuhongtong/wip-config
common/config: set rocksdb_cache_size to OPT_U64
Reviewed-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 24 Mar 2017 18:12:16 +0000 (13:12 -0500)]
Merge pull request #14013 from ShiqiCooperation/newshiqi
test/unittest_bluefs: check whether add_block_device success
Reviewed-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 24 Mar 2017 17:59:34 +0000 (13:59 -0400)]
crush: only encode class info if SERVER_LUMINOUS
This fixes OSDMap reencode crc mismatches on jewel to
luminous upgrades.
Fixes: http://tracker.ceph.com/issues/19361
Signed-off-by: Sage Weil <sage@redhat.com>
Dan Mick [Fri, 24 Mar 2017 02:35:08 +0000 (19:35 -0700)]
ceph.spec.in: derive _smp_ncpus and use it for -DBOOST_J
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Dan Mick [Fri, 24 Mar 2017 02:34:28 +0000 (19:34 -0700)]
ceph.spec.in: move lowmem_build setting of _smp_mflags
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Dan Mick [Thu, 23 Mar 2017 23:36:53 +0000 (16:36 -0700)]
debian/rules: invoke cmake with -DBOOST_J
Allow the boost build run during the top-level cmake in a Debian package
build to benefit from multiple processors. This should speed up the build
considerably on many-processor machines (say, arm64). Use the argument
passed to debhelper.
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Casey Bodley [Fri, 24 Mar 2017 15:15:05 +0000 (11:15 -0400)]
Merge pull request #14082 from idealguo/update-bucket-acl
rgw: enable to update acl of bucket created in slave zonegroup
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Casey Bodley [Fri, 24 Mar 2017 15:11:50 +0000 (11:11 -0400)]
Merge pull request #14043 from zhangsw/fix-rgw-deletebucket
rgw: delete non-empty buckets in slave zonegroup works not well
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Casey Bodley [Fri, 24 Mar 2017 15:10:28 +0000 (11:10 -0400)]
Merge pull request #13991 from Liuchang0812/wip-rgw-optimization
rgw: avoid listing user buckets for rgw_delete_user
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Casey Bodley [Fri, 24 Mar 2017 15:08:18 +0000 (11:08 -0400)]
Merge pull request #13504 from rzarzynski/wip-rgw-chunkingfilter-cleanup
rgw: clean up the unneeded rgw::io::ChunkingFilter::has_content_length.
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Kefu Chai [Fri, 24 Mar 2017 14:44:15 +0000 (22:44 +0800)]
Merge pull request #13847 from wjwithagen/wip-wjw-ceph-disk-tests-2
ceph-disk/tests/test_main.py: FreeBSD does not do multipath
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Mar 2017 13:44:56 +0000 (21:44 +0800)]
Merge pull request #13974 from tchaikov/wip-vstart-start-mgr
vstart: do not start mgr if not start_all
Reviewed-by: Sage Weil <sage@redhat.com>
Kefu Chai [Fri, 24 Mar 2017 07:53:17 +0000 (15:53 +0800)]
Merge pull request #13197 from asheplyakov/master-18740
systemd/ceph-disk: make it possible to customize timeout
Reviewed-by: Loic Dachary <ldachary@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Mar 2017 06:43:48 +0000 (14:43 +0800)]
Merge pull request #14103 from tchaikov/wip-https-github
script: ceph-release-notes: use https instead of http
Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
Sage Weil [Fri, 24 Mar 2017 01:47:45 +0000 (20:47 -0500)]
Merge pull request #14085 from wjwithagen/wip-wjw-bluestore-fixture
test/objectstore/store_test_fixture.cc: Exclude bluestore code if required.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Fri, 24 Mar 2017 01:47:12 +0000 (20:47 -0500)]
Merge pull request #13931 from wangzhengyong/extent
os/bluestore: fix bug for calc extent_avg in reshard function
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Igor Fedotov <ifedotov@mirantis.com>
Sage Weil [Fri, 24 Mar 2017 01:44:59 +0000 (20:44 -0500)]
Merge pull request #14073 from liewegas/wip-bluestore-nullptr
os/bluestore: avoid nullptr in bluestore_extent_ref_map_t::bound_encode
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Fri, 24 Mar 2017 01:44:35 +0000 (20:44 -0500)]
Merge pull request #13577 from yonghengdexin735/wip-zzz-openalloc
os/bluestore: fix bug in _open_alloc()
Reviewed-by: Varada Kari <varada.kari@sandisk.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Loic Dachary [Thu, 23 Mar 2017 20:48:00 +0000 (21:48 +0100)]
Merge pull request #14110 from dachary/wip-crush-cleanup
crush: builder: clean the arguments of crush_reweight* methods
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Loic Dachary <ldachary@redhat.com>
Kefu Chai [Wed, 22 Mar 2017 05:04:06 +0000 (13:04 +0800)]
vstart.sh: do not init fsmap if "$new == 0"
We can no longer create a new cephfs using a non-empty pool without the
'--force' option, so the "ceph fs new" command fails with "vstart.sh -k".
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 22 Mar 2017 15:33:30 +0000 (23:33 +0800)]
tests: remove mds,osd,mon args passed to vstart.sh
Signed-off-by: Kefu Chai <kchai@redhat.com>
Sahid Orentino Ferdjaoui [Mon, 13 Mar 2017 16:36:16 +0000 (12:36 -0400)]
crush: builder: clean the arguments of crush_reweight* methods
This commit is just a cleanup to make the arguments of the
crush_reweight* methods consistent with each other.
Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@redhat.com>
Kefu Chai [Wed, 22 Mar 2017 03:48:40 +0000 (11:48 +0800)]
vstart.sh: remove start_*
So there are only two ways to override the number of daemons to start:
- using the env var CEPH_NUM_{MON|OSD|MGR|MDS} or {MON|OSD|MGR|MDS}
- command line options: --{mon,osd,mds}_num
To prevent a daemon from running, set the corresponding env var to 0.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Thu, 23 Mar 2017 15:47:55 +0000 (08:47 -0700)]
Merge pull request #14050 from ovh/bp-dump-ops-by-duration
common/TrackedOp: allow dumping historic ops sorted by duration
Reviewed-by: Sage Weil <sage@redhat.com>
Yuri Weinstein [Thu, 23 Mar 2017 15:46:36 +0000 (08:46 -0700)]
Merge pull request #14060 from LiumxNL/wip-170321
osd: combine unstable stats with info.stats when publish stats to osd
Reviewed-by: Sage Weil <sage@redhat.com>
Yuri Weinstein [Thu, 23 Mar 2017 15:45:58 +0000 (08:45 -0700)]
Merge pull request #13293 from Liuchang0812/cleanup-coverity
test, osd: fix some coverity issues
Reviewed-by: Kefu Chai <kchai@redhat.com>
Casey Bodley [Thu, 23 Mar 2017 13:54:47 +0000 (09:54 -0400)]
Merge pull request #14014 from Liuchang0812/wip-fix-seg-fault
rgw: fix memory leak in RGWGetObjLayout
Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Sage Weil [Thu, 23 Mar 2017 13:21:39 +0000 (08:21 -0500)]
os/bluestore: avoid nullptr in bluestore_extent_ref_map_t::bound_encode
Signed-off-by: Sage Weil <sage@redhat.com>
Haomai Wang [Thu, 23 Mar 2017 11:23:34 +0000 (19:23 +0800)]
Merge pull request #14094 from optimistyzy/322
bluestore, NVMeDevice: use task' own lock for (random) read
Reviewed-by: Haomai Wang <haomai@xsky.com>
Kefu Chai [Thu, 23 Mar 2017 11:13:41 +0000 (19:13 +0800)]
script: ceph-release-notes: use https instead of http
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 23 Mar 2017 08:09:34 +0000 (16:09 +0800)]
Merge pull request #14004 from liewegas/wip-osd-full-failsafe
osd: fall back to failsafe threshold if osdmap doesn't set [near]full
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 23 Mar 2017 08:08:22 +0000 (16:08 +0800)]
Merge pull request #13903 from wjwithagen/wip-wjw-run-classes-sed
test: sed on FreeBSD requires "-i extension", so use gsed
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 23 Mar 2017 08:04:52 +0000 (16:04 +0800)]
Merge pull request #9940 from aclamk/common-recursive-mutex-fix
common: fix lockdep vs recursive mutexes
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
baiyanchun [Thu, 23 Mar 2017 02:38:15 +0000 (10:38 +0800)]
common: remove useless parameter
Signed-off-by: baiyanchun <yanchun.bai@istuary.com>
Ziye Yang [Wed, 22 Mar 2017 03:41:00 +0000 (11:41 +0800)]
bluestore, NVMeDevice: use task' own lock for (random) read
The reason is that ioc may be reaped in the _aio_thread function
by the following statements:
    for (auto &&it : registered_devices)
      it->reap_ioc();
So if we still use ioc's lock for (random) read, it will cause
a core dump.
Signed-off-by: optimistyzy <optimistyzy@gmail.com>
Guo Zhandong [Wed, 22 Mar 2017 10:00:37 +0000 (18:00 +0800)]
rgw: enable to update acl of bucket created in slave zonegroup
Fixes: http://tracker.ceph.com/issues/16888
Signed-off-by: Guo Zhandong <guozhandong@cmss.chinamobile.com>
Mykola Golub [Wed, 22 Mar 2017 20:03:34 +0000 (21:03 +0100)]
librbd: potential use of uninitialised value in ImageWatcher
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Loic Dachary [Wed, 22 Mar 2017 18:43:37 +0000 (19:43 +0100)]
Merge pull request #14080 from ceph/evelu-ceph-disk
ceph-disk: Reporting /sys directory in get_partition_dev()
Reviewed-by: Loic Dachary <ldachary@redhat.com>
Kefu Chai [Wed, 22 Mar 2017 15:57:13 +0000 (23:57 +0800)]
Merge pull request #13942 from xiexingguo/wip-cleanup-proc-repinfo
osd/PG: conditionally retry on receiving pg-notify when Primary is Incomplete
Reviewed-by: Sage Weil <sage@redhat.com>
Kefu Chai [Wed, 22 Mar 2017 15:56:27 +0000 (23:56 +0800)]
Merge pull request #14061 from tchaikov/wip-19312
tests: ceph_test_rados_api_watch_notify: test timeout using rados_wat…
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Casey Bodley [Wed, 22 Mar 2017 15:46:33 +0000 (11:46 -0400)]
Merge pull request #12449 from cbodley/wip-rgw-test-multi-vers-acl
test/rgw: add bucket acl and versioning tests to test_multi.py
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Kefu Chai [Wed, 22 Mar 2017 14:43:41 +0000 (22:43 +0800)]
Merge pull request #14059 from vumrao/wip-vumrao-19318
common/config_opts.h: Remove deprecated osd_compact_leveldb_on_mount option
Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Mark Nelson [Wed, 22 Mar 2017 14:12:41 +0000 (09:12 -0500)]
Merge pull request #14076 from liewegas/wip-bluestore-min-alloc-size
os/bluestore: default 16KB min_alloc_size on ssd
Willem Jan Withagen [Wed, 22 Mar 2017 14:03:32 +0000 (15:03 +0100)]
test/objectstore/store_test_fixture.cc: Exclude bluestore code if required.
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Haomai Wang [Wed, 22 Mar 2017 13:16:48 +0000 (21:16 +0800)]
Merge pull request #14068 from optimistyzy/321_new
Bluestore, NVMEDevice: add the spdk core mask check
Reviewed-by: Haomai Wang <haomai@xsky.com>
Piotr Dałek [Mon, 20 Mar 2017 12:51:25 +0000 (13:51 +0100)]
TrackedOp: allow dumping historic ops sorted by duration
Currently dump_historic_ops dumps ops sorted by their initiation time,
which may have no relation to how long they took, and sorting the output
of that command by op duration is neither fast nor convenient.
A new asok command ("dump_historic_ops_by_duration") outputs the same
op list, but ordered by duration (longest first).
Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
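The two orderings described in this commit message can be illustrated with a small Python sketch. The `Op` record and its field names are hypothetical stand-ins, not Ceph's TrackedOp types; only the orderings themselves (by initiation time, and by duration with the longest first) come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Op:
    # Hypothetical stand-in for a tracked op; not Ceph's actual structure.
    description: str
    initiated_at: float  # when the op started, in seconds
    duration: float      # how long the op took, in seconds


def by_initiation(ops):
    """Order ops by when they started (the existing dump order)."""
    return sorted(ops, key=lambda op: op.initiated_at)


def by_duration(ops):
    """Order ops by how long they took, longest first."""
    return sorted(ops, key=lambda op: op.duration, reverse=True)


ops = [
    Op("write A", initiated_at=1.0, duration=0.2),
    Op("read B", initiated_at=2.0, duration=1.5),
    Op("write C", initiated_at=3.0, duration=0.7),
]
```

Here `by_duration(ops)` puts "read B" first even though it was not the first op to start, which is the case the new command is meant to surface.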
optimistyzy [Tue, 21 Mar 2017 11:00:15 +0000 (19:00 +0800)]
Bluestore, NVMEDevice: add the spdk core mask check
This patch adds the spdk core mask check and also
sets the master core for starting DPDK.
Signed-off-by: optimistyzy <optimistyzy@gmail.com>
liuchang0812 [Wed, 22 Mar 2017 09:27:20 +0000 (17:27 +0800)]
rgw/rgw_op: fix memory leak in RGWGetObjLayout
Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
Erwan Velu [Wed, 22 Mar 2017 09:11:44 +0000 (10:11 +0100)]
ceph-disk: Reporting /sys directory in get_partition_dev()
When get_partition_dev() fails, it reports the following message:
ceph_disk.main.Error: Error: partition 2 for /dev/sdb does not appear to exist
The code searches for a directory inside /sys/block/get_dev_name(os.path.realpath(dev)).
The error message does not report that path, although the path may be involved in the failure.
This patch reports where the code was looking when it tried to determine whether the partition was available.
Signed-off-by: Erwan Velu <erwan@redhat.com>
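A minimal sketch of the idea in this commit, assuming a much simplified lookup: the error raised on failure names the directory that was searched. `PartitionLookupError`, `find_partition` and the name-matching rule are hypothetical and are not the actual ceph-disk code.

```python
import os


class PartitionLookupError(Exception):
    """Hypothetical error type for this sketch (not ceph-disk's Error class)."""


def find_partition(sys_block_dir, pnum):
    """Return the entry for partition `pnum` under `sys_block_dir`.

    Simplified assumption: the partition entry is named
    <device name><partition number>, e.g. "sdb2" under ".../sdb".
    """
    wanted = "%s%d" % (os.path.basename(sys_block_dir), pnum)
    try:
        entries = os.listdir(sys_block_dir)
    except OSError:
        entries = []
    if wanted in entries:
        return os.path.join(sys_block_dir, wanted)
    # Include the searched directory so the failure can be diagnosed.
    raise PartitionLookupError(
        "partition %d does not appear to exist in %s" % (pnum, sys_block_dir)
    )
```

With the directory in the message, a reader can check directly whether the path exists and what it contains.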
Kefu Chai [Wed, 22 Mar 2017 03:34:21 +0000 (11:34 +0800)]
vstart.sh: do nothing if $CEPH_NUM_* is 0
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 15 Mar 2017 07:28:09 +0000 (15:28 +0800)]
vstart.sh: extract start_{osd,mon,mgr,mds} into functions
Signed-off-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Wed, 22 Mar 2017 02:27:23 +0000 (21:27 -0500)]
os/bluestore: default 16KB min_alloc_size on ssd
Signed-off-by: Sage Weil <sage@redhat.com>
Orit Wasserman [Tue, 21 Mar 2017 21:44:22 +0000 (23:44 +0200)]
Merge pull request #13963 from cbodley/wip-18725
rgw-admin: remove deprecated regionmap commands
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Sage Weil [Tue, 14 Mar 2017 21:09:18 +0000 (17:09 -0400)]
os/bluestore/BlueFS: measure used bytes, not free bytes
This is more useful info for humans.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 20:59:26 +0000 (16:59 -0400)]
os/bluestore: fix many perfcounter types
Most of these are counters, not gauges.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 20:14:06 +0000 (16:14 -0400)]
os/bluestore: surface key metrics for daemonperf
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 20:13:47 +0000 (16:13 -0400)]
os/bluestore/BlueFS: log key bluefs metrics
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 20:13:34 +0000 (16:13 -0400)]
os/bluestore: log kv latencies
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 20:48:53 +0000 (16:48 -0400)]
osd: exclude 'objecter' perfcounters from daemonperf
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 20:48:37 +0000 (16:48 -0400)]
common/perf_counters: allow perfcounters to be excluded from daemonperf
By omitting the 'nick' we exclude a whole group of metrics from the
daemonperf results.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 20:12:37 +0000 (16:12 -0400)]
ceph: daemonperf: order metrics to match asok json dump
The daemons report this in a particular order; match that in the
daemonperf output. This corresponds to the numeric value of the l_*
enum.
Signed-off-by: Sage Weil <sage@redhat.com>
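As an illustration of ordering metrics by the numeric value of an enum, here is a short Python sketch. The `Counter` enum and its member names are hypothetical stand-ins for a daemon's l_* enum, and the input dictionary is made up for the example.

```python
from enum import IntEnum


class Counter(IntEnum):
    # Hypothetical counter ids standing in for a daemon's l_* enum.
    L_COMMIT_LAT = 1
    L_KV_LAT = 2
    L_READ_LAT = 3


def order_like_daemon(values_by_name):
    """Return (name, value) pairs ordered by the enum's numeric value."""
    return sorted(values_by_name.items(), key=lambda kv: Counter[kv[0]].value)


# Metrics arriving in arbitrary order.
reported = {"L_READ_LAT": 0.4, "L_COMMIT_LAT": 1.2, "L_KV_LAT": 0.9}
```

Sorting on the enum value rather than on the name keeps the displayed columns in the order in which the counters were declared.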
Sage Weil [Tue, 21 Mar 2017 20:05:56 +0000 (15:05 -0500)]
Merge pull request #13888 from liewegas/wip-bluestore-dw
os/bluestore: fix deferred writes; improve flush
Reviewed-by: Igor Fedotov <ifedotov@mirantis.com>
Casey Bodley [Tue, 21 Mar 2017 19:43:48 +0000 (15:43 -0400)]
Merge pull request #13902 from Wilhelmshaven/rm_redundant_code
rgw: remove redundant codes in rgw_cache.h
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Sage Weil [Sat, 18 Mar 2017 17:51:08 +0000 (13:51 -0400)]
os/bluestore: handle zombie OpSequencers
It's possible for the Sequencer to go away while the OpSequencer still has
txcs in flight. We were handling the case where the osr was on the
deferred_queue, but it may be off the deferred_queue but waiting for the
commit to happen, and we still need to wait for that.
Fix this by introducing a 'zombie' state for the osr, in which we keep the
osr in the osr_set.
Clean up the OpSequencer methods and a few other method names.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 17 Mar 2017 21:52:56 +0000 (17:52 -0400)]
os/bluestore: clean up flush_all()
Add assertions if we fail to flush everything.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 17 Mar 2017 14:13:22 +0000 (10:13 -0400)]
os/bluestore: move cached items around on collection split
We've been avoiding doing this for a while and it has finally caught up
with us: the SharedBlob may outlive the split due to deferred IO, and
a read on the child collection may load a competing Blob and SharedBlob
and read from the on-disk blocks that haven't been written yet.
Fix by preserving the one-SharedBlob-instance invariant by moving cache
items to the new Collection and cache shard like we should have from the
beginning.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 17 Mar 2017 17:54:20 +0000 (13:54 -0400)]
os/bluestore: simplify flush() wake-up condition
Clearer, and fewer wakeups.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 17 Mar 2017 14:12:02 +0000 (10:12 -0400)]
ceph_test_objectstore: set bluestore cache shards to 5
Better test coverage!
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Thu, 16 Mar 2017 20:33:53 +0000 (16:33 -0400)]
unittest_bluestore_types: fix Collection using tests
We can't use a bare Collection since we get/put refs, the last put will
delete it, and the dtor asserts nref == 0 (no faking a ref and deliberately
leaking!).
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Thu, 16 Mar 2017 16:24:51 +0000 (12:24 -0400)]
os/bluestore/KernelDevice: drop unused flush_lock
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Thu, 16 Mar 2017 16:19:30 +0000 (12:19 -0400)]
os/bluestore: better debugging around collections
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Thu, 16 Mar 2017 15:30:59 +0000 (11:30 -0400)]
os/bluestore: nicer Onode dout prefix
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Thu, 16 Mar 2017 15:30:37 +0000 (11:30 -0400)]
os/bluestore: flush_cache on umount, fsck finish, etc.
Otherwise cache items survive beyond umount into the next mount cycle!
Also, ensure that we flush_cache *before* clearing coll_map, as some cache
items have references back to the Collection.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Wed, 15 Mar 2017 19:01:52 +0000 (15:01 -0400)]
os/bluestore: take Collection ref from SharedBlob
These can survive as long as the txc, which can be longer than the
Collection. Make sure we have a valid ref as both finish_write and
~SharedBlob use coll for the SharedBlobSet (and coll->store->cct for
debug).
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 20:47:48 +0000 (16:47 -0400)]
os/bluestore: fix perfcounters for deferred io
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 20:47:40 +0000 (16:47 -0400)]
os/bluestore: remove dead _do_deferred_op code
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 18:17:20 +0000 (14:17 -0400)]
os/bluestore: make throttles tunable online
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Mon, 13 Mar 2017 11:43:57 +0000 (07:43 -0400)]
os/bluestore: prevent throttle deadlock due to deferred writes
Kick off deferred IOs if we pass the throttle midpoint or if we would
block during submission.
Signed-off-by: Sage Weil <sage@redhat.com>
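The rule stated in this commit message can be sketched as a single predicate. This is an assumption-laden illustration: the function name, the unit accounting, and the exact definition of "midpoint" (more than half the capacity already held) are hypothetical and are not taken from BlueStore.

```python
def should_kick_deferred(held, cost, capacity):
    """Return True when queued deferred IOs should be submitted now.

    held: throttle units currently taken (hypothetical accounting)
    cost: units the incoming transaction would take
    capacity: total units the throttle allows
    """
    # Assumed reading of "pass the throttle midpoint".
    past_midpoint = held > capacity / 2
    # Assumed reading of "would block during submission".
    would_block = held + cost > capacity
    return past_midpoint or would_block
```

The point of the rule is that deferred writes hold throttle units until they are submitted, so a submitter waiting on the throttle could wait indefinitely unless something starts them.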
Sage Weil [Fri, 10 Mar 2017 15:27:52 +0000 (10:27 -0500)]
ceph_test_objectstore: fix Synthetic to never modify bufferlists
We were modifying bufferlists in place, and kludging around it by making
full copies elsewhere. Instead, never modify a buffer.
This fixes issues where the buffer we submit to ObjectStore ends up in
the cache and we modify in place later, corrupting the implementation's
copy. (This was affecting BlueStore.)
Rearrange the data methods to be next to each other and clean them up a
bit too.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 10 Mar 2017 15:20:22 +0000 (10:20 -0500)]
os/bluestore: drop obsolete comment
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Thu, 9 Mar 2017 22:28:58 +0000 (17:28 -0500)]
os/bluestore: avoid extra dev flush on single device when all io is deferred
If we have no non-deferred IO to flush, and we are running bluefs on a
single shared device, then we can rely on the bluefs flush to make our
current batch of deferred ios stable.
Separate deferred into a "done" and "stable" list. If we do sync, put
everything from "done" onto "stable". Otherwise, after we do our kv
commit via bluefs, move "done" to "stable" then.
Signed-off-by: Sage Weil <sage@redhat.com>
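The "done" and "stable" lists described in this commit message can be modelled with a toy Python class. `DeferredTracker`, its method names and the `events` log are hypothetical; the sketch only mirrors the ordering in the message: with an explicit device flush, "done" moves to "stable" at the flush, otherwise it moves after the kv commit.

```python
class DeferredTracker:
    """Toy model of the 'done' and 'stable' lists from the commit message."""

    def __init__(self):
        self.done = []    # deferred IOs whose writes have completed
        self.stable = []  # deferred IOs known to be durable

    def io_finished(self, io):
        self.done.append(io)

    def _promote(self, events):
        # Move everything from "done" onto "stable".
        self.stable.extend(self.done)
        self.done.clear()
        events.append("done -> stable")

    def sync_cycle(self, explicit_flush, events):
        """Run one commit cycle, recording the order of steps in `events`."""
        if explicit_flush:
            events.append("device flush")
            self._promote(events)
            events.append("kv commit")
        else:
            # No separate flush: rely on the kv commit to make data stable.
            events.append("kv commit")
            self._promote(events)
```

Calling `sync_cycle(False, events)` records the kv commit before the promotion, which is the single-device case the commit optimizes.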
Sage Weil [Tue, 14 Mar 2017 14:33:23 +0000 (10:33 -0400)]
os/bluestore: debug alloc release
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 14 Mar 2017 14:33:17 +0000 (10:33 -0400)]
os/bluestore: flush old/discarded OpSequencers too
When the Sequencer goes away it gets deregistered. If there are still
deferred IOs in flight, we need to wait for those too.
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Thu, 9 Mar 2017 19:17:47 +0000 (14:17 -0500)]
os/bluestore: batch up to bluestore_deferred_batch_ops before submitting
Allow several deferred writes to accumulate before we submit them. In
general we have no time pressure, and on HDD (and perhaps sometimes SSD)
it is beneficial to accumulate and batch these so that they result in
fewer seeks. On HDD, this is particularly true of seeks away from the
journal. And on sequential workloads this can avoid seeks. It may even
allow the block layer or SSD firmware to merge IOs and perform fewer
writes.
Signed-off-by: Sage Weil <sage@redhat.com>
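Accumulating writes and submitting them in batches can be sketched as follows. `DeferredBatcher` and its methods are hypothetical; `batch_ops` plays the role of the bluestore_deferred_batch_ops option named in the commit subject, and `submit` is any callable that accepts a list of writes.

```python
class DeferredBatcher:
    """Toy batcher: hold writes until `batch_ops` of them have accumulated."""

    def __init__(self, batch_ops, submit):
        self.batch_ops = batch_ops  # stands in for bluestore_deferred_batch_ops
        self.submit = submit        # callable receiving one list per batch
        self.pending = []

    def queue(self, write):
        self.pending.append(write)
        if len(self.pending) >= self.batch_ops:
            self.flush()

    def flush(self):
        """Submit whatever is pending as one batch, if anything."""
        if self.pending:
            batch, self.pending = self.pending, []
            self.submit(batch)
```

For example, with `batch_ops=3`, queuing seven writes submits two batches of three and leaves one write pending until `flush()` is called.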