git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
9 years agotools:remove the local file when get map failed. 5920/head
Bo Cai [Mon, 14 Sep 2015 11:19:05 +0000 (19:19 +0800)]
tools:remove the local file when get map failed.

Signed-off-by: Bo Cai <cai.bo@h3c.com>
9 years agodebian: package radosgw-object-expirer in radosgw deb
Sage Weil [Thu, 3 Sep 2015 22:41:52 +0000 (18:41 -0400)]
debian: package radosgw-object-expirer in radosgw deb

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoceph.spec: package new rgw files
Sage Weil [Thu, 3 Sep 2015 22:41:26 +0000 (18:41 -0400)]
ceph.spec: package new rgw files

   /usr/bin/radosgw-object-expirer
   /usr/lib64/rados-classes/libcls_timeindex.so

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #5799 from liewegas/wip-cmake
Sage Weil [Thu, 3 Sep 2015 19:44:43 +0000 (15:44 -0400)]
Merge pull request #5799 from liewegas/wip-cmake

cmake: fix build (newstore issues)

Reviewed-by: Casey Bodley <cbodley@redhat.com>
9 years agoCMakeLists.txt: add newstore files 5799/head
Sage Weil [Thu, 3 Sep 2015 19:27:45 +0000 (15:27 -0400)]
CMakeLists.txt: add newstore files

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago.gitignore: ignore build (usually used by cmake)
Sage Weil [Thu, 3 Sep 2015 19:27:34 +0000 (15:27 -0400)]
.gitignore: ignore build (usually used by cmake)

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoMerge remote-tracking branch 'gh/master' into infernalis
Sage Weil [Thu, 3 Sep 2015 19:15:01 +0000 (15:15 -0400)]
Merge remote-tracking branch 'gh/master' into infernalis

9 years agorgw/Makefile.am: ship rgw_object_expirer_core.h
Sage Weil [Thu, 3 Sep 2015 19:13:40 +0000 (15:13 -0400)]
rgw/Makefile.am: ship rgw_object_expirer_core.h

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #5744 from ceph/wip-12909
Sage Weil [Thu, 3 Sep 2015 19:09:49 +0000 (15:09 -0400)]
Merge pull request #5744 from ceph/wip-12909

cmake: update FUSE_INCLUDE_DIRS to match autoconf

9 years agoMerge pull request #5610 from ceph/wip-cmake
Sage Weil [Thu, 3 Sep 2015 19:08:36 +0000 (15:08 -0400)]
Merge pull request #5610 from ceph/wip-cmake

cmake: make check

9 years agoceph.spec: build requires cmake 5610/head
Sage Weil [Thu, 3 Sep 2015 19:01:53 +0000 (15:01 -0400)]
ceph.spec: build requires cmake

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agodebian/control: build requires cmake
Sage Weil [Thu, 3 Sep 2015 19:01:34 +0000 (15:01 -0400)]
debian/control: build requires cmake

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agodebian/control: build-requires libboost-regex-dev
Sage Weil [Thu, 3 Sep 2015 18:59:37 +0000 (14:59 -0400)]
debian/control: build-requires libboost-regex-dev

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #5692 from ceph/wip-rgw-swift-expiration
Yehuda Sadeh [Thu, 3 Sep 2015 17:23:00 +0000 (10:23 -0700)]
Merge pull request #5692 from ceph/wip-rgw-swift-expiration

Wip rgw swift expiration

Reviewed-by: Radoslaw Zarzynski <rzarzynski@mirantis.com>
Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
9 years agocmake: install crushtool to destdir/bin
Matt Benjamin [Tue, 25 Aug 2015 17:49:25 +0000 (13:49 -0400)]
cmake: install crushtool to destdir/bin

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
9 years agocmake: add blkid as dependency to libcommon
Casey Bodley [Tue, 1 Sep 2015 19:33:31 +0000 (15:33 -0400)]
cmake: add blkid as dependency to libcommon

Signed-off-by: Casey Bodley <cbodley@redhat.com>
9 years agocmake: Changed name of crc32 target to crc32c
Ali Maredia [Tue, 25 Aug 2015 17:49:23 +0000 (13:49 -0400)]
cmake: Changed name of crc32 target to crc32c

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agocmake: Added shell script tests
Ali Maredia [Mon, 24 Aug 2015 22:01:09 +0000 (18:01 -0400)]
cmake: Added shell script tests

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agocmake: Fixed HAVE_BETTER_YASM_ELF64 variable
Ali Maredia [Mon, 24 Aug 2015 19:32:53 +0000 (15:32 -0400)]
cmake: Fixed HAVE_BETTER_YASM_ELF64 variable

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agocmake: Removed trailing spaces from isa .s files
Ali Maredia [Mon, 24 Aug 2015 18:11:01 +0000 (14:11 -0400)]
cmake: Removed trailing spaces from isa .s files

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agocmake: Uncommented erasure-code/shec conditional
Ali Maredia [Fri, 21 Aug 2015 17:33:05 +0000 (13:33 -0400)]
cmake: Uncommented erasure-code/shec conditional

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agocmake: Removed traces of CDS, minor cmake fixes
Ali Maredia [Wed, 19 Aug 2015 20:15:46 +0000 (16:15 -0400)]
cmake: Removed traces of CDS, minor cmake fixes

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agocmake: Fixed rbd_replay build issue
Ali Maredia [Tue, 18 Aug 2015 21:03:58 +0000 (17:03 -0400)]
cmake: Fixed rbd_replay build issue

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agocmake: Removed scripts, check_PROGRAMS included
Ali Maredia [Tue, 18 Aug 2015 19:44:36 +0000 (15:44 -0400)]
cmake: Removed scripts, check_PROGRAMS included

Removed the unittest scripts for the time being.
Built unittests included in check_PROGRAMS target.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agocmake: Cleaned up syntax for make check targets
Ali Maredia [Tue, 18 Aug 2015 18:34:54 +0000 (14:34 -0400)]
cmake: Cleaned up syntax for make check targets

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agocmake: check_TESTPROGRAMS tests running
Ali Maredia [Mon, 17 Aug 2015 20:26:47 +0000 (16:26 -0400)]
cmake: check_TESTPROGRAMS tests running

Make check is working, except for the rocksdb tests. Cleanup is coming.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agoREADME.md: Add basic CMake instructions
Ali Maredia [Tue, 4 Aug 2015 16:45:52 +0000 (12:45 -0400)]
README.md: Add basic CMake instructions

README.md: Fixed spacing, trimmed cmake section

Signed-off-by: Ali Maredia <amaredia@redhat.com>
9 years agoMerge pull request #5792 from ceph/wip-vstart-rgw
Sage Weil [Thu, 3 Sep 2015 15:03:26 +0000 (11:03 -0400)]
Merge pull request #5792 from ceph/wip-vstart-rgw

vstart: add -c argument to radosgw-admin commands

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agovstart: add -c argument to radosgw-admin commands 5792/head
Casey Bodley [Wed, 2 Sep 2015 14:54:44 +0000 (10:54 -0400)]
vstart: add -c argument to radosgw-admin commands

Signed-off-by: Casey Bodley <cbodley@redhat.com>
9 years agoMerge pull request #5590 from majianpeng/mds
John Spray [Thu, 3 Sep 2015 09:35:06 +0000 (10:35 +0100)]
Merge pull request #5590 from majianpeng/mds

Mds: add osdmap epoch for setxattr.

Reviewed-by: John Spray <john.spray@redhat.com>
9 years agorgw: don't copy delete_at attr, unless it's intra region copy 5692/head
Yehuda Sadeh [Thu, 3 Sep 2015 00:56:07 +0000 (17:56 -0700)]
rgw: don't copy delete_at attr, unless it's intra region copy

We don't want to keep the expiration value of a copied object, unless
we're doing a copy within the same zone group.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
9 years agorgw: objexp shards index by key
Yehuda Sadeh [Thu, 27 Aug 2015 23:38:04 +0000 (16:38 -0700)]
rgw: objexp shards index by key

Not by time. This should provide better concurrency.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
9 years agorgw: delete-at and delete-after also on obj put / copy
Yehuda Sadeh [Thu, 27 Aug 2015 23:02:44 +0000 (16:02 -0700)]
rgw: delete-at and delete-after also on obj put / copy

And potentially later we could also use the S3 API, so it
could work with multipart upload and POST obj.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
9 years agoMerge pull request #5775 from dachary/wip-do-autogen
Sage Weil [Wed, 2 Sep 2015 15:38:25 +0000 (11:38 -0400)]
Merge pull request #5775 from dachary/wip-do-autogen

tools: fix do_autogen.sh -R

9 years agoMerge pull request #5712 from yuyuyu101/wip-12801
Sage Weil [Wed, 2 Sep 2015 14:15:44 +0000 (10:15 -0400)]
Merge pull request #5712 from yuyuyu101/wip-12801

Mon: Make ceph osd metadata support dump all osds

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
9 years agotools: fix do_autogen.sh -R 5775/head
Loic Dachary [Wed, 2 Sep 2015 14:00:10 +0000 (16:00 +0200)]
tools: fix do_autogen.sh -R

The R letter was missing from the getopts flags. Also sort the flags
alphabetically to make it easier to spot that kind of lossage.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5752 from wonzhq/doc-write-recency
Loic Dachary [Wed, 2 Sep 2015 10:05:52 +0000 (12:05 +0200)]
Merge pull request #5752 from wonzhq/doc-write-recency

doc: add the doc for min_write_recency_for_promote

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5739 from ceph/wip-12776
Yan, Zheng [Wed, 2 Sep 2015 06:38:30 +0000 (14:38 +0800)]
Merge pull request #5739 from ceph/wip-12776

mds: fix shutdown while in standby

9 years agodoc: add the doc for min_write_recency_for_promote 5752/head
Zhiqiang Wang [Wed, 2 Sep 2015 06:00:40 +0000 (14:00 +0800)]
doc: add the doc for min_write_recency_for_promote

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
9 years agoMerge pull request #5736 from tianshan/wip-12864
Loic Dachary [Wed, 2 Sep 2015 05:19:19 +0000 (07:19 +0200)]
Merge pull request #5736 from tianshan/wip-12864

rados: make 'rados bench' support json format output

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5748 from liewegas/wip-warnings
Loic Dachary [Wed, 2 Sep 2015 05:16:48 +0000 (07:16 +0200)]
Merge pull request #5748 from liewegas/wip-warnings

fix newstore warning

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoceph-osd-prestart.sh: fix osd data dir ownership check
Sage Weil [Wed, 2 Sep 2015 01:43:04 +0000 (21:43 -0400)]
ceph-osd-prestart.sh: fix osd data dir ownership check

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #5749 from liewegas/wip-async-ms
Haomai Wang [Wed, 2 Sep 2015 01:34:32 +0000 (09:34 +0800)]
Merge pull request #5749 from liewegas/wip-async-ms

msg/async: log tx/rx at level 1

Reviewed-by: Haomai Wang <haomaiwang@gmail.com>
9 years agovstart.sh: enable all experimental features for vstart 5749/head
Sage Weil [Wed, 2 Sep 2015 01:15:20 +0000 (21:15 -0400)]
vstart.sh: enable all experimental features for vstart

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoms/async: log message tx/rx at level 1
Sage Weil [Wed, 2 Sep 2015 01:15:07 +0000 (21:15 -0400)]
ms/async: log message tx/rx at level 1

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #5722 from cxwshawn/vs-fix
Loic Dachary [Wed, 2 Sep 2015 00:11:20 +0000 (02:11 +0200)]
Merge pull request #5722 from cxwshawn/vs-fix

vstart.sh: add --mon_num --osd_num --mds_num --rgw_port option

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5693 from tchaikov/wip-12730
Loic Dachary [Wed, 2 Sep 2015 00:05:27 +0000 (02:05 +0200)]
Merge pull request #5693 from tchaikov/wip-12730

common/SubProcess: silence compiler warnings

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5643 from dreamhost/wip-make-check-makeopt
Loic Dachary [Tue, 1 Sep 2015 23:56:01 +0000 (01:56 +0200)]
Merge pull request #5643 from dreamhost/wip-make-check-makeopt

make-check: support MAKEOPTS overrides.

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5299 from hjwsm1989/pgmonitor-const
Loic Dachary [Tue, 1 Sep 2015 23:03:14 +0000 (01:03 +0200)]
Merge pull request #5299 from hjwsm1989/pgmonitor-const

mon: added const to dump_* functions in PGMonitor

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5156 from rubenk/fix-indentation
Loic Dachary [Tue, 1 Sep 2015 22:51:58 +0000 (00:51 +0200)]
Merge pull request #5156 from rubenk/fix-indentation

Fix indentation

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5275 from tchaikov/wip-12287
Loic Dachary [Tue, 1 Sep 2015 22:49:23 +0000 (00:49 +0200)]
Merge pull request #5275 from tchaikov/wip-12287

pybind/ceph_argparse: do not choke on non-ascii prefix

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5702 from Sandy4999/wip-doc-sandy
Loic Dachary [Tue, 1 Sep 2015 21:57:53 +0000 (23:57 +0200)]
Merge pull request #5702 from Sandy4999/wip-doc-sandy

doc:radosgw: correct typos of the command removing a subuser

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
9 years agoMerge pull request #5747 from ceph/wip-user
Sage Weil [Tue, 1 Sep 2015 20:11:21 +0000 (16:11 -0400)]
Merge pull request #5747 from ceph/wip-user

fix ceph-disk

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #5742 from dachary/wip-user 5747/head
Loic Dachary [Tue, 1 Sep 2015 19:15:21 +0000 (21:15 +0200)]
Merge pull request #5742 from dachary/wip-user

tests: ceph-disk: dmcrypt simplification

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #5746 from ceph/wip-fix-doc-build
Loic Dachary [Tue, 1 Sep 2015 19:04:38 +0000 (21:04 +0200)]
Merge pull request #5746 from ceph/wip-fix-doc-build

doc: fix the code-block in ruby.rst

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agodoc: fix the code-block in ruby.rst 5746/head
Kefu Chai [Tue, 1 Sep 2015 17:41:55 +0000 (01:41 +0800)]
doc: fix the code-block in ruby.rst

* and add the link to library homepage in the section titles

Signed-off-by: Kefu Chai <kchai@redhat.com>
9 years agocmake: update FUSE_INCLUDE_DIRS to match autoconf 5744/head
Casey Bodley [Tue, 1 Sep 2015 15:35:42 +0000 (11:35 -0400)]
cmake: update FUSE_INCLUDE_DIRS to match autoconf

client/fuse_ll.cc now includes <fuse.h> and <fuse_lowlevel.h>
instead of <fuse/fuse.h> and <fuse/fuse_lowlevel.h>, so we need to add
the fuse directory to the FUSE_INCLUDE_DIRS variable.

Using find_path() with just fuse.h was finding /usr/include/fuse.h
instead of the one in /usr/include/fuse/. Looking for fuse_common.h and
fuse_lowlevel.h first causes it to generate the correct
FUSE_INCLUDE_DIRS=/usr/include/fuse

Fixes: #12909
Signed-off-by: Casey Bodley <cbodley@redhat.com>
9 years agoos/newstore: fix warning 5748/head
Sage Weil [Tue, 1 Sep 2015 17:59:40 +0000 (13:59 -0400)]
os/newstore: fix warning

os/newstore/NewStore.cc: In member function 'int NewStore::_zero(NewStore::TransContext*, NewStore::CollectionRef&, const ghobject_t&, uint64_t, size_t)':
os/newstore/NewStore.cc:3693:32: warning: ignoring return value of 'int ftruncate(int, __off64_t)', declared with attribute warn_unused_result [-Wunused-result]
       ::ftruncate(fd, f.length);
                                ^

Signed-off-by: Sage Weil <sage@redhat.com>
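
The -Wunused-result warning quoted above fires whenever the return value of ftruncate() is dropped. A minimal sketch of one way to consume it follows; truncate_checked is a hypothetical helper made up for illustration, not the change that was actually committed to NewStore.cc:

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not taken from NewStore): truncate fd to `length`
// and report failure as a negative errno instead of discarding the result.
int truncate_checked(int fd, off_t length) {
  assert(length >= 0);              // a negative length is a caller bug
  int r = ::ftruncate(fd, length);  // return value is used, so no warning
  if (r < 0) {
    return -errno;                  // propagate the failure to the caller
  }
  return 0;
}
```

A caller would then write something like `int r = truncate_checked(fd, f.length);` and handle a negative r, rather than calling ::ftruncate() as a bare statement.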
9 years agoMerge pull request #5578 from ceph/wip-newstore
Sage Weil [Tue, 1 Sep 2015 17:48:06 +0000 (13:48 -0400)]
Merge pull request #5578 from ceph/wip-newstore

osd: newstore (experimental)

9 years agoceph_test_keyvaluedb: add simple commit latency benchmark 5578/head
Sage Weil [Tue, 1 Sep 2015 17:14:03 +0000 (13:14 -0400)]
ceph_test_keyvaluedb: add simple commit latency benchmark

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: update todo
Sage Weil [Tue, 1 Sep 2015 17:13:47 +0000 (13:13 -0400)]
os/newstore: update todo

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agodo_autogen.sh: build static rocksdb by default
Sage Weil [Thu, 27 Aug 2015 18:21:23 +0000 (14:21 -0400)]
do_autogen.sh: build static rocksdb by default

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agorocksdb: update alt dist rule
Sage Weil [Thu, 27 Aug 2015 15:45:58 +0000 (11:45 -0400)]
rocksdb: update alt dist rule

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoceph_test_objectstore: make OMapIterator test work with FileStore
Sage Weil [Wed, 26 Aug 2015 19:41:50 +0000 (15:41 -0400)]
ceph_test_objectstore: make OMapIterator test work with FileStore

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoceph_test_objectstore: enable newstore tests
Sage Weil [Tue, 1 Sep 2015 17:22:02 +0000 (13:22 -0400)]
ceph_test_objectstore: enable newstore tests

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agorocksdb: update to 3.11.2
Sage Weil [Wed, 26 Aug 2015 18:57:28 +0000 (14:57 -0400)]
rocksdb: update to 3.11.2

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/RocksDBStore: make other rmkey match
Sage Weil [Wed, 26 Aug 2015 18:54:00 +0000 (14:54 -0400)]
os/RocksDBStore: make other rmkey match

No need for Slice() here; it can take a string.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/RocksDBStore: fix rmkey()
Sage Weil [Wed, 26 Aug 2015 18:52:56 +0000 (14:52 -0400)]
os/RocksDBStore: fix rmkey()

This took way too long to debug!

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoceph_test_keyvaluedb: some simple KeyValueDB unit tests
Sage Weil [Wed, 26 Aug 2015 17:55:45 +0000 (13:55 -0400)]
ceph_test_keyvaluedb: some simple KeyValueDB unit tests

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: fix end bound on collection_list
Sage Weil [Mon, 24 Aug 2015 21:59:34 +0000 (17:59 -0400)]
os/newstore: fix end bound on collection_list

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: flush object before doing omap reads
Sage Weil [Sat, 22 Aug 2015 14:33:40 +0000 (10:33 -0400)]
os/newstore: flush object before doing omap reads

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: add 'newstore backend options' to pass options to e.g. rocksdb
Sage Weil [Tue, 18 Aug 2015 21:22:32 +0000 (17:22 -0400)]
os/newstore: add 'newstore backend options' to pass options to e.g. rocksdb

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: change escaping chars
Sage Weil [Tue, 18 Aug 2015 19:33:39 +0000 (15:33 -0400)]
os/newstore: change escaping chars

# is lowest besides space and !, except for " (which would be too
confusing).

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: trim overlay when zeroing extent
Sage Weil [Tue, 18 Aug 2015 19:09:21 +0000 (15:09 -0400)]
os/newstore: trim overlay when zeroing extent

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: tolerate null pnext to collection_list()
Sage Weil [Tue, 18 Aug 2015 19:08:55 +0000 (15:08 -0400)]
os/newstore: tolerate null pnext to collection_list()

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: fix collection range for temp objects
Sage Weil [Tue, 18 Aug 2015 18:57:47 +0000 (14:57 -0400)]
os/newstore: fix collection range for temp objects

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: Implement fiemap
Xiaoxi Chen [Thu, 7 May 2015 07:41:20 +0000 (15:41 +0800)]
os/newstore: Implement fiemap

For simplicity we ignore holes inside a fragment for now.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
9 years agoos/newstore: make sync/async submit_transaction optional
Sage Weil [Mon, 4 May 2015 18:05:27 +0000 (11:05 -0700)]
os/newstore: make sync/async submit_transaction optional

It seems doing this synchronously may be better for SSDs?

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: renamed TransContext::fds -> sync_items
Sage Weil [Sat, 2 May 2015 23:29:24 +0000 (16:29 -0700)]
os/newstore: renamed TransContext::fds -> sync_items

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: queue kv transactions in kv_sync_thread
Sage Weil [Sat, 2 May 2015 00:22:57 +0000 (17:22 -0700)]
os/newstore: queue kv transactions in kv_sync_thread

It appears that db->submit_transaction() will block if there is a sync
commit that is in progress instead of simply queueing the new txn for
later.  To work around this, submit these to the backend in the
kv_sync_thread prior to the synchronous submit_transaction_sync().

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: process multiple aio completions at a time
Sage Weil [Sat, 2 May 2015 00:21:23 +0000 (17:21 -0700)]
os/newstore: process multiple aio completions at a time

This isn't affecting things for a slow disk, but it will matter for faster
backends.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: clean up kv commit debug output
Sage Weil [Wed, 29 Apr 2015 22:00:46 +0000 (15:00 -0700)]
os/newstore: clean up kv commit debug output

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: only ftruncate if i_size is incorrect
Sage Weil [Wed, 29 Apr 2015 21:51:00 +0000 (14:51 -0700)]
os/newstore: only ftruncate if i_size is incorrect

Even a no-op ftruncate can block in the kernel.  Prior to this change I
could frequently see ftruncate wait for an aio completion on the same
file.

Signed-off-by: Sage Weil <sage@redhat.com>
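
The idea in the commit above (skip the ftruncate when the size is already right) can be sketched as follows. This is an illustration under assumptions, not the NewStore code: truncate_if_needed and demo_truncate_if_needed are hypothetical names, and the size is read with fstat() here, whereas the real code may already know it.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns 1 if ftruncate() was issued, 0 if the file
// already had the wanted size (syscall skipped), negative errno on failure.
int truncate_if_needed(int fd, off_t want) {
  assert(want >= 0);
  struct stat st;
  if (::fstat(fd, &st) < 0) {
    return -errno;
  }
  if (st.st_size == want) {
    return 0;                       // i_size is already correct: do nothing
  }
  if (::ftruncate(fd, want) < 0) {
    return -errno;
  }
  return 1;
}

// Small self-check on a scratch file; the /tmp template is illustrative.
bool demo_truncate_if_needed() {
  char path[] = "/tmp/truncate_demo_XXXXXX";
  int fd = ::mkstemp(path);
  if (fd < 0) {
    return false;
  }
  bool ok = truncate_if_needed(fd, 4096) == 1   // 0 -> 4096: syscall issued
         && truncate_if_needed(fd, 4096) == 0   // already 4096: skipped
         && truncate_if_needed(fd, 0) == 1;     // 4096 -> 0: syscall issued
  ::close(fd);
  ::unlink(path);
  return ok;
}
```

The extra fstat() is only worthwhile if it is cheaper than the ftruncate it avoids; the commit message suggests that it was in the case observed, where ftruncate blocked behind an aio completion.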
9 years agoRevert "os/newstore: avoid sync append for small ios"
Sage Weil [Wed, 29 Apr 2015 20:57:40 +0000 (13:57 -0700)]
Revert "os/newstore: avoid sync append for small ios"

This reverts commit 69baab2f7eaca7688ce1d45802a82fc3539cd906.

This is slower.  :(

9 years agoos/newstore: avoid sync append for small ios
Sage Weil [Wed, 29 Apr 2015 18:52:55 +0000 (11:52 -0700)]
os/newstore: avoid sync append for small ios

An append is expensive in terms of latency (write, fdatasync, kv commit),
while a wal write is just the kv commit and the write and fdatasync are
async.  For small IOs doing the wal may improve performance.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agorocksdb: fallocate_with_keep_size = false
Sage Weil [Tue, 28 Apr 2015 23:11:05 +0000 (16:11 -0700)]
rocksdb: fallocate_with_keep_size = false

This improves my 4k random writes on hdd by about 25%.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoRevert "os/NewStore: data_map shouldn't be empty when writing all overlays"
Sage Weil [Thu, 23 Apr 2015 23:10:32 +0000 (16:10 -0700)]
Revert "os/NewStore: data_map shouldn't be empty when writing all overlays"

This reverts commit 0d9cce462fec61f754ddcd17cf9a3cf69581d7c5.

We may want to write an overlay if the object is new and the write is small, to defer the cost
of the fsync.

9 years agoos/NewStore: delay the read of all the overlays until wal applying
Zhiqiang Wang [Wed, 29 Apr 2015 06:32:25 +0000 (14:32 +0800)]
os/NewStore: delay the read of all the overlays until wal applying

The read of all the overlays can be delayed until applying the wal. If
we are doing async wal apply, this can reduce write op latency by
eliminating unnecessary reads in the write code path.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
9 years agoos/newstore: fix deadlock when newstore_sync_transaction=true
Xiaoxi Chen [Wed, 29 Apr 2015 05:59:16 +0000 (13:59 +0800)]
os/newstore: fix deadlock when newstore_sync_transaction=true

There is a deadlock in Newstore when newstore_sync_transaction = true.
With sync_transaction set to true, the txc state machine goes all the way
from STATE_IO_DONE to STATE_FINISHING in the same thread while holding osr->qlock.
The deadlock occurs in _txc_finish and _osr_reap_done when they try to
lock osr->qlock again.

Since _txc_finish can be called with (in sync transaction mode) or without
(in async transaction mode) the qlock held, fix this by making the qlock
PTHREAD_MUTEX_RECURSIVE, so that it can be acquired recursively.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
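
The re-entrant locking pattern described in the commit above can be sketched like this. The sketch uses std::recursive_mutex as a stand-in for a pthread mutex created with PTHREAD_MUTEX_RECURSIVE; Sequencer, finish and submit_sync are hypothetical names, not the NewStore types.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for an op sequencer whose queue is guarded by qlock.
struct Sequencer {
  std::recursive_mutex qlock;  // recursive: the owning thread may lock it again
  std::vector<int> done;

  // May be called with or without qlock already held by the calling thread.
  void finish(int txc_id) {
    std::lock_guard<std::recursive_mutex> l(qlock);
    done.push_back(txc_id);
  }

  // Synchronous path: takes qlock and runs straight through to finish().
  void submit_sync(int txc_id) {
    std::lock_guard<std::recursive_mutex> l(qlock);
    finish(txc_id);  // second acquisition by the same thread is allowed
  }
};
```

With a plain std::mutex, the nested acquisition in submit_sync() would be undefined behavior and in practice usually a self-deadlock. A recursive lock is the smallest change; the alternative would be to split finish() into locked and unlocked variants.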
9 years agoos/NewStore: fix the append of the later overlays when doing combination
Zhiqiang Wang [Wed, 29 Apr 2015 06:10:51 +0000 (14:10 +0800)]
os/NewStore: fix the append of the later overlays when doing combination

The data of the later contiguous overlays should be appended (claim_append)
to 'op->data' instead of 'bl'.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
9 years agoos/Newstore: flush_commit return true on STATE_KV_DONE
Xiaoxi Chen [Wed, 29 Apr 2015 05:45:52 +0000 (13:45 +0800)]
os/Newstore: flush_commit return true on STATE_KV_DONE

There is a race condition here: if the flush_commit() call
happens after _txc_finish_kv and before the next state, the context
is pushed to on_commits but no one will handle it, since
we have already passed _txc_finish_kv. The bug is easy to reproduce
by putting a sleep(5) after _txc_finish_kv and triggering it with
ceph-osd -i 0 --mkfs.

Fix this by returning true directly when state >= STATE_KV_DONE (instead
of > in the previous code). The data is already persisted in STATE_KV_DONE, so
this is safe.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
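
A simplified sketch of the comparison discussed in the commit above. The state names come from the commit messages in this log, but the enum ordering, the TransContext layout and the flush_commit signature are assumptions made for illustration only.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Assumed ordering of states; only the names appear in the commit messages.
enum State {
  STATE_PREPARE,
  STATE_IO_DONE,
  STATE_KV_DONE,
  STATE_FINISHING
};

// Hypothetical, stripped-down transaction context.
struct TransContext {
  State state = STATE_PREPARE;
  std::vector<std::function<void()>> on_commits;
};

// Returns true if the transaction is already committed. Otherwise the
// callback is queued to run at commit time and false is returned.
bool flush_commit(TransContext& txc, std::function<void()> cb) {
  if (txc.state >= STATE_KV_DONE) {  // '>' would miss STATE_KV_DONE itself
    return true;
  }
  txc.on_commits.push_back(std::move(cb));
  return false;
}
```

With `>` instead of `>=`, a caller arriving exactly in STATE_KV_DONE would queue a callback after the queue had already been drained, so the callback would never run.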
9 years agoos/NewStore: avoid dup the data of the overlays in the WAL
Zhiqiang Wang [Tue, 28 Apr 2015 08:24:16 +0000 (16:24 +0800)]
os/NewStore: avoid dup the data of the overlays in the WAL

When writing all the overlays, there is no need to duplicate the data in the WAL.
Instead, we can reference the overlays in the WAL and remove these
overlays after committing them to the fs. When replaying, we can get
the data from the referenced overlays. This way, we save a
write and a deletion for each of the overlay data in the db.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
9 years agoos/newstore: fix multiple aio case
Sage Weil [Tue, 28 Apr 2015 16:47:09 +0000 (09:47 -0700)]
os/newstore: fix multiple aio case

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore: more conservative default for aio queue depth
Sage Weil [Tue, 28 Apr 2015 16:28:13 +0000 (09:28 -0700)]
os/newstore: more conservative default for aio queue depth

There appears to be a kernel aio bug when the queue depth is small.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoos/newstore:close fd after writing with O_DIRECT
Xiaoxi Chen [Tue, 28 Apr 2015 12:56:13 +0000 (20:56 +0800)]
os/newstore:close fd after writing with O_DIRECT

fix bug in 2b4c60e0a521ad10b94bbc82865b49f2d28c2ac9

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
9 years agoos/NewStore: need to increase the wal op length when combining overlays
Zhiqiang Wang [Tue, 28 Apr 2015 08:41:39 +0000 (16:41 +0800)]
os/NewStore: need to increase the wal op length when combining overlays

The length of the combined overlays needs to be added to the length of
the wal op.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
9 years agoos/Newstore:Fix collection_list_range
Xiaoxi Chen [Fri, 17 Apr 2015 08:14:41 +0000 (16:14 +0800)]
os/Newstore:Fix collection_list_range

We need to rule out hobject_t::max before calling get_object_key
(which calls get_filestore_key_u32 and hits an assert failure).

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
9 years agoos/newstore: fix race in _txc_aio_submit
Sage Weil [Mon, 27 Apr 2015 21:42:55 +0000 (14:42 -0700)]
os/newstore: fix race in _txc_aio_submit

We cannot rely on the iterator pointers being valid after we submit the
aio because we are racing with the completion.  Make our loop decision
before submitting and avoid dereferencing txc after that point.

Signed-off-by: Sage Weil <sage@redhat.com>
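
The loop shape described in the commit above (decide whether to continue before submitting, then touch nothing shared afterwards) might look roughly like this. Aio and submit_all are hypothetical; the sketch assumes the pending list stays valid until its last element has been submitted, and that the completion of that last submit may destroy it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>

struct Aio { int id; };  // hypothetical placeholder for an aio request

// Submits every pending aio and returns how many were submitted. `submit`
// may complete the final request immediately, after which `pending` must be
// treated as gone, so the loop condition uses only local state.
std::size_t submit_all(std::list<Aio>& pending,
                       const std::function<void(Aio&)>& submit) {
  std::size_t n = 0;
  auto p = pending.begin();
  bool done = (p == pending.end());
  while (!done) {
    Aio& cur = *p;
    ++p;                           // advance before submitting
    done = (p == pending.end());   // make the loop decision before submitting
    submit(cur);                   // after the last submit, only locals are used
    ++n;
  }
  return n;
}
```

The point of the ordering is that the only things read after the final submit() are the locals done and n, never the iterator or the container.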
9 years agoos/newstore : Do not need to call fdatasync if using direct.
Xiaoxi Chen [Mon, 27 Apr 2015 08:28:33 +0000 (16:28 +0800)]
os/newstore : Do not need to call fdatasync if using direct.

Skip ::fdatasync when in direct mode.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
9 years agoosd/NewStore: fix for skipping the overlay in _do_overlay_trim
Zhiqiang Wang [Mon, 27 Apr 2015 08:27:21 +0000 (16:27 +0800)]
osd/NewStore: fix for skipping the overlay in _do_overlay_trim

When the offset of the write starts at the end of the overlay, that is,
p->first + p->second.length == offset, the overlay could be skipped as
well.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
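
The boundary case in the commit above can be illustrated with a small predicate. Overlay and overlay_ends_before_write are hypothetical names, and the sketch assumes overlays are kept in a map keyed by start offset, as the p->first / p->second.length notation suggests.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct Overlay { std::uint64_t length; };  // hypothetical overlay record

// True when the overlay [p->first, p->first + length) ends at or before the
// start of a write at `offset`, so a trim pass can skip it.
bool overlay_ends_before_write(
    std::map<std::uint64_t, Overlay>::const_iterator p,
    std::uint64_t offset) {
  // '<=' rather than '<': an overlay that ends exactly at `offset`
  // shares no bytes with the write.
  return p->first + p->second.length <= offset;
}
```

For an overlay covering bytes 0 to 4095, a write starting at offset 4096 does not touch it, while a write starting at 4095 overlaps its last byte.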