git.apps.os.sepia.ceph.com Git - ceph.git/log
9 years ago ceph_test_objectstore: add omap iterator test
Sage Weil [Wed, 23 Dec 2015 14:20:49 +0000 (09:20 -0500)]
ceph_test_objectstore: add omap iterator test

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph-objectstore-tool: add fsck command
Sage Weil [Tue, 22 Dec 2015 22:45:29 +0000 (17:45 -0500)]
ceph-objectstore-tool: add fsck command

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/ObjectStore: add fsck to interface
Sage Weil [Tue, 22 Dec 2015 22:30:48 +0000 (17:30 -0500)]
os/ObjectStore: add fsck to interface

Only bluestore and kstore implement this currently.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/kstore: clear onode on _do_remove
Sage Weil [Tue, 22 Dec 2015 22:27:17 +0000 (17:27 -0500)]
os/kstore: clear onode on _do_remove

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: clear onode in _do_remove
Sage Weil [Tue, 22 Dec 2015 22:27:10 +0000 (17:27 -0500)]
os/bluestore: clear onode in _do_remove

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago osd/PGBackend: fix omap digest error message
Sage Weil [Tue, 22 Dec 2015 22:19:45 +0000 (17:19 -0500)]
osd/PGBackend: fix omap digest error message

Print the *omap* digest, not *data* digest.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os: remove {get,set}_allow_sharded_objects from interface
Sage Weil [Tue, 22 Dec 2015 22:13:15 +0000 (17:13 -0500)]
os: remove {get,set}_allow_sharded_objects from interface

We've already forced everyone to upgrade through hammer, so everyone
supports this.  Just unconditionally set the feature if it is not set
(for consistency's sake).

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueFS: do not flush metadata on flush()
Sage Weil [Tue, 22 Dec 2015 21:38:34 +0000 (16:38 -0500)]
os/bluestore/BlueFS: do not flush metadata on flush()

That's what fsync is for.  Moreover, this can lead to some squirreliness
if we trigger this from _flush_log().

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlockDevice: helpful error when aio cannot init
Sage Weil [Tue, 22 Dec 2015 21:36:59 +0000 (16:36 -0500)]
os/bluestore/BlockDevice: helpful error when aio cannot init

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/FreelistManager: audit
Sage Weil [Tue, 22 Dec 2015 21:36:42 +0000 (16:36 -0500)]
os/bluestore/FreelistManager: audit

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueFS: implement invalidate_cache
Sage Weil [Tue, 22 Dec 2015 21:11:48 +0000 (16:11 -0500)]
os/bluestore/BlueFS: implement invalidate_cache

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlockDevice: add invalidate_cache
Sage Weil [Tue, 22 Dec 2015 21:11:32 +0000 (16:11 -0500)]
os/bluestore/BlockDevice: add invalidate_cache

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: simplify _do_remove
Sage Weil [Tue, 22 Dec 2015 21:11:00 +0000 (16:11 -0500)]
os/bluestore: simplify _do_remove

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: debug msg on statfs
Sage Weil [Tue, 22 Dec 2015 19:53:13 +0000 (14:53 -0500)]
os/bluestore: debug msg on statfs

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: optimize _dump_onode slightly
Sage Weil [Tue, 22 Dec 2015 19:26:09 +0000 (14:26 -0500)]
os/bluestore: optimize _dump_onode slightly

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: enable bluestore debug options
Sage Weil [Tue, 22 Dec 2015 19:02:33 +0000 (14:02 -0500)]
ceph_test_objectstore: enable bluestore debug options

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: fsck_on_umount
Sage Weil [Tue, 22 Dec 2015 19:02:05 +0000 (14:02 -0500)]
os/bluestore/BlueStore: fsck_on_umount

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/StupidAllocator: bluestore_debug_small_allocations
Sage Weil [Tue, 22 Dec 2015 19:05:35 +0000 (14:05 -0500)]
os/bluestore/StupidAllocator: bluestore_debug_small_allocations

Force small allocations for debugging purposes.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: fix _zero when previous extent partially unwritten
Sage Weil [Tue, 22 Dec 2015 19:33:02 +0000 (14:33 -0500)]
os/bluestore/BlueStore: fix _zero when previous extent partially unwritten

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: support copy-on-write clones
Sage Weil [Tue, 22 Dec 2015 19:32:40 +0000 (14:32 -0500)]
os/bluestore: support copy-on-write clones

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: only allow clone if hash matches
Sage Weil [Fri, 18 Dec 2015 22:41:01 +0000 (17:41 -0500)]
os/bluestore/BlueStore: only allow clone if hash matches

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: Enode infrastructure
Sage Weil [Mon, 14 Dec 2015 21:57:10 +0000 (16:57 -0500)]
os/bluestore: Enode infrastructure

Enodes will track extent ref counts for any extent
that is marked shared.  There will be an enode for
any unique hash value that has any refs.  We will keep
in-memory copies of only those Enodes that are
referenced by in-memory Onodes, and only if the enode
is requested (e.g., the enode won't be loaded as a
result of an object read because we never need to
call get_enode).

Signed-off-by: Sage Weil <sage@redhat.com>
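The enode behaviour described in the "Enode infrastructure" commit above can be pictured with a small sketch. This is not BlueStore code: `EnodeCache`, `load_count`, `read_object`, and the dict standing in for the key-value store are hypothetical names, and ref counts are keyed by extent offset only for simplicity. It shows one enode per hash value, loaded into memory only when `get_enode` is called.

```python
class Enode:
    """Ref counts for shared extents of all objects with one hash value."""
    def __init__(self, hash_value, ref_counts=None):
        self.hash_value = hash_value
        self.ref_counts = dict(ref_counts or {})  # extent offset -> ref count


class EnodeCache:
    """Keeps in-memory enodes only for hashes that were explicitly requested."""
    def __init__(self, store):
        self.store = store      # hypothetical kv store: hash -> {offset: refs}
        self.loaded = {}        # hash -> Enode currently in memory
        self.load_count = 0     # number of loads from the store

    def get_enode(self, hash_value):
        enode = self.loaded.get(hash_value)
        if enode is None:
            self.load_count += 1
            enode = Enode(hash_value, self.store.get(hash_value))
            self.loaded[hash_value] = enode
        return enode

    def read_object(self, hash_value):
        # A plain read never needs ref counts, so it does not touch the enode.
        return f"data for hash {hash_value:#x}"


store = {0xABC: {0: 2, 65536: 1}}
cache = EnodeCache(store)
cache.read_object(0xABC)
reads_loaded = cache.load_count    # reads alone do not load the enode
shared = cache.get_enode(0xABC)    # an operation that needs ref counts asks for it
again = cache.get_enode(0xABC)     # second request is served from memory
```

In this toy model the read path leaves `load_count` untouched, and repeated `get_enode` calls for the same hash return the same in-memory object.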
9 years ago os/bluestore/bluestore_types: add extent FLAG_COW_{HEAD,TAIL}
Sage Weil [Tue, 15 Dec 2015 20:13:08 +0000 (15:13 -0500)]
os/bluestore/bluestore_types: add extent FLAG_COW_{HEAD,TAIL}

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago unittest_bluefs, unittest_bluestore_types
Sage Weil [Thu, 17 Dec 2015 19:14:44 +0000 (14:14 -0500)]
unittest_bluefs, unittest_bluestore_types

These should run during make check.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/bluestore_types: add contains(), clear(), empty() to extent_ref_map
Sage Weil [Mon, 14 Dec 2015 21:58:15 +0000 (16:58 -0500)]
os/bluestore/bluestore_types: add contains(), clear(), empty() to extent_ref_map

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: wal_op_t::OP_COPY
Sage Weil [Mon, 14 Dec 2015 20:53:41 +0000 (15:53 -0500)]
os/bluestore/BlueStore: wal_op_t::OP_COPY

Assume block-aligned.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlockDevice: fix waiter wakeup use-after-free race
Sage Weil [Fri, 18 Dec 2015 22:33:41 +0000 (17:33 -0500)]
os/bluestore/BlockDevice: fix waiter wakeup use-after-free race

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: add bluestore_debug_no_reuse_blocks
Sage Weil [Tue, 22 Dec 2015 19:03:33 +0000 (14:03 -0500)]
os/bluestore: add bluestore_debug_no_reuse_blocks

This makes debugging a bit easier because we never use the same
extent of the disk twice, leaving useful evidence behind.

Signed-off-by: Sage Weil <sage@redhat.com>
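A rough illustration of why never reusing blocks leaves evidence behind, written as a toy allocator. `BumpAllocator` and its `no_reuse` flag are hypothetical names chosen for this sketch; the real option is wired into BlueStore's allocator, which is not reproduced here.

```python
class BumpAllocator:
    """Toy allocator; `no_reuse` mimics the idea of the debug option."""
    def __init__(self, size, no_reuse=False):
        self.size = size
        self.no_reuse = no_reuse
        self.next_offset = 0    # high-water mark of never-used space
        self.free_list = []     # released (offset, length) extents

    def allocate(self, length):
        if not self.no_reuse:
            for i, (off, ln) in enumerate(self.free_list):
                if ln >= length:
                    # Reuse a released extent, keeping any remainder free.
                    if ln > length:
                        self.free_list[i] = (off + length, ln - length)
                    else:
                        del self.free_list[i]
                    return off
        if self.next_offset + length > self.size:
            raise MemoryError("out of space")
        off = self.next_offset
        self.next_offset += length
        return off

    def release(self, offset, length):
        self.free_list.append((offset, length))


normal = BumpAllocator(1 << 20)
a = normal.allocate(4096)
normal.release(a, 4096)
reused = normal.allocate(4096)    # same offset: old contents get overwritten

debug = BumpAllocator(1 << 20, no_reuse=True)
b = debug.allocate(4096)
debug.release(b, 4096)
fresh = debug.allocate(4096)      # new offset: old contents stay on disk
```

With `no_reuse` set, a released extent is never handed out again, so whatever was written there remains available for inspection.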
9 years ago ceph_test_objectstore: do Synthetic tests over larger objects
Sage Weil [Fri, 18 Dec 2015 22:45:27 +0000 (17:45 -0500)]
ceph_test_objectstore: do Synthetic tests over larger objects

400k for objects.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: use a few hash values for objects; clone between them
Sage Weil [Thu, 17 Dec 2015 19:15:48 +0000 (14:15 -0500)]
ceph_test_objectstore: use a few hash values for objects; clone between them

We only guarantee support for clone between objects with the same hash.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: dump actual vs expected on read data mismatch
Sage Weil [Thu, 17 Dec 2015 19:17:35 +0000 (14:17 -0500)]
ceph_test_objectstore: dump actual vs expected on read data mismatch

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: add many clone tests
Sage Weil [Thu, 17 Dec 2015 19:15:33 +0000 (14:15 -0500)]
ceph_test_objectstore: add many clone tests

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: validate full object contents after writes
Sage Weil [Thu, 17 Dec 2015 18:59:36 +0000 (13:59 -0500)]
ceph_test_objectstore: validate full object contents after writes

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: debug enter/exit points
Sage Weil [Thu, 17 Dec 2015 16:28:33 +0000 (11:28 -0500)]
ceph_test_objectstore: debug enter/exit points

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: save map lookups for a few ops
Sage Weil [Thu, 17 Dec 2015 16:28:14 +0000 (11:28 -0500)]
ceph_test_objectstore: save map lookups for a few ops

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: fix locking for a few ops
Sage Weil [Thu, 17 Dec 2015 16:27:54 +0000 (11:27 -0500)]
ceph_test_objectstore: fix locking for a few ops

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: fix clone
Sage Weil [Thu, 17 Dec 2015 16:27:03 +0000 (11:27 -0500)]
ceph_test_objectstore: fix clone

Copy the buffer, in case other threads modify it in place.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: simplify object name generation
Sage Weil [Wed, 16 Dec 2015 18:18:53 +0000 (13:18 -0500)]
ceph_test_objectstore: simplify object name generation

The long names don't exercise useful code paths, and having
consistent naming makes it easier to grep through logs.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: clone non-empty objects, not empty ones
Sage Weil [Wed, 16 Dec 2015 14:16:37 +0000 (09:16 -0500)]
ceph_test_objectstore: clone non-empty objects, not empty ones

This condition was backwards.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: clone objects with same hash
Sage Weil [Mon, 14 Dec 2015 20:00:37 +0000 (15:00 -0500)]
ceph_test_objectstore: clone objects with same hash

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: add some slow debug path
Sage Weil [Tue, 22 Dec 2015 18:31:07 +0000 (13:31 -0500)]
os/bluestore: add some slow debug path

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: clean up comments a bit
Sage Weil [Tue, 22 Dec 2015 18:10:43 +0000 (13:10 -0500)]
os/bluestore: clean up comments a bit

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: note wal releases in fsck
Sage Weil [Tue, 22 Dec 2015 17:45:58 +0000 (12:45 -0500)]
os/bluestore/BlueStore: note wal releases in fsck

Include these in used_blocks (they are about to be released but
not reflected in the onode).

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: fix read bug when there is a hole
Sage Weil [Fri, 18 Dec 2015 22:40:17 +0000 (17:40 -0500)]
os/bluestore/BlueStore: fix read bug when there is a hole

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/kstore: fix rename
Sage Weil [Tue, 22 Dec 2015 18:40:56 +0000 (13:40 -0500)]
os/kstore: fix rename

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: fix rename
Sage Weil [Tue, 22 Dec 2015 16:40:23 +0000 (11:40 -0500)]
os/bluestore/BlueStore: fix rename

Install a negative onode entry at the old name position.
Otherwise, a simple transaction like

 rename a -> b
 touch b

will re-read the old b onode key on the second op, and chaos will
ensue (e.g., because it'll reference the same extents from a
different object).

Signed-off-by: Sage Weil <sage@redhat.com>
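The general hazard behind the rename fix above, a cache miss falling through to a stale committed key, can be sketched as follows. `OnodeCache`, its `negative_entry` switch, and the dicts standing in for onodes and the key-value store are all hypothetical; the sketch looks up the old name after a rename and does not reproduce the exact transaction from the commit message.

```python
class OnodeCache:
    """Cache in front of a committed key-value store (both are toy dicts)."""
    def __init__(self, committed):
        self.committed = committed    # name -> onode, state already on disk
        self.cache = {}               # name -> onode, or None (negative entry)

    def lookup(self, name):
        if name in self.cache:
            return self.cache[name]         # may be None: known not to exist
        onode = self.committed.get(name)    # cache miss falls through to disk
        self.cache[name] = onode
        return onode

    def rename(self, old, new, negative_entry=True):
        onode = self.lookup(old)
        self.cache[new] = onode
        if negative_entry:
            self.cache[old] = None          # remember that `old` is gone
        else:
            del self.cache[old]             # forget it; next lookup hits disk


buggy = OnodeCache({"a": {"extents": [0]}})
buggy.rename("a", "b", negative_entry=False)
stale = buggy.lookup("a")    # re-reads the not-yet-deleted key from disk

fixed = OnodeCache({"a": {"extents": [0]}})
fixed.rename("a", "b")
gone = fixed.lookup("a")     # negative entry: the old name no longer exists
```

Without the negative entry, two names end up pointing at the same onode and therefore the same extents; with it, the old name resolves to "does not exist" until the transaction commits.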
9 years ago os/bluestore/BlueStore: remove unused OnodeMap::remove
Sage Weil [Tue, 22 Dec 2015 16:34:07 +0000 (11:34 -0500)]
os/bluestore/BlueStore: remove unused OnodeMap::remove

We install negative entries instead.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlockDevice: adjust debug output
Sage Weil [Tue, 22 Dec 2015 15:49:18 +0000 (10:49 -0500)]
os/bluestore/BlockDevice: adjust debug output

5 helpful (read/write offsets)
10 more, with aio completions
20 everything
30 fire hose
40 data hexdumps

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlockDevice: fix path
Sage Weil [Tue, 22 Dec 2015 15:48:14 +0000 (10:48 -0500)]
os/bluestore/BlockDevice: fix path

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: do WAL ops buffered to avoid RMW issues
Sage Weil [Tue, 22 Dec 2015 15:31:08 +0000 (10:31 -0500)]
os/bluestore/BlueStore: do WAL ops buffered to avoid RMW issues

We may have multiple WAL ops that do read/modify/write covering
the same blocks.  To avoid the complexity of identifying those
situations and ensuring that we, say, wait for writes to complete
before reading them back again, just make the IO buffered and let
the page cache handle that for us.

This fixes the failure of LibRadosAio.RoundTripWriteFull.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago rocksdb: debug log writes/reads
Sage Weil [Mon, 21 Dec 2015 21:57:12 +0000 (16:57 -0500)]
rocksdb: debug log writes/reads

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: handle both buffered and direct+async IO
Sage Weil [Mon, 21 Dec 2015 21:56:42 +0000 (16:56 -0500)]
os/bluestore: handle both buffered and direct+async IO

Prefer aio unless explicitly directed otherwise.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlockDevice: rename bdev options
Sage Weil [Mon, 21 Dec 2015 21:45:02 +0000 (16:45 -0500)]
os/bluestore/BlockDevice: rename bdev options

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: use BlueFS::get_usage()
Sage Weil [Mon, 21 Dec 2015 21:18:05 +0000 (16:18 -0500)]
os/bluestore/BlueStore: use BlueFS::get_usage()

...just so we log bdev utilization in the log.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueFS: get_usage()
Sage Weil [Mon, 21 Dec 2015 21:17:47 +0000 (16:17 -0500)]
os/bluestore/BlueFS: get_usage()

Return (and log) usage for all bdevs.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueFS: do not dirty file when overwriting bytes
Sage Weil [Mon, 21 Dec 2015 20:33:38 +0000 (15:33 -0500)]
os/bluestore/BlueFS: do not dirty file when overwriting bytes

The rocksdb log recycle option allows us to overwrite previously
allocated space in an old log file to avoid updating the file
metadata on normal file systems.  Take advantage of that here by
implementing what is effectively O_NOCMTIME semantics: we do
not dirty the file metadata just because mtime is updated.
Instead, we dirty the file only if we allocate new space or if
the size has to be increased.

Note that in a single-threaded rados bench test on my NVMe drive, we
jump from 30MB/sec to 50MB/sec for 128KB writes as soon as we start
recycling previous logs (about 40 seconds into the run).

Signed-off-by: Sage Weil <sage@redhat.com>
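The rule in the commit above (dirty the file only on new allocation or size growth) can be sketched as below. The `File` class, the `write` helper, and the 4096-byte `BLOCK` unit are assumptions made for illustration, not the BlueFS data structures.

```python
BLOCK = 4096  # assumed allocation unit


class File:
    def __init__(self):
        self.size = 0         # logical size in bytes
        self.allocated = 0    # bytes of space already allocated
        self.dirty = False    # True when metadata must be journaled


def write(f, offset, length):
    """Write `length` bytes at `offset`; dirty metadata only when needed."""
    end = offset + length
    if end > f.allocated:
        # Round the allocation up to a whole number of blocks.
        f.allocated = -(-end // BLOCK) * BLOCK
        f.dirty = True
    if end > f.size:
        f.size = end
        f.dirty = True
    # A pure overwrite changes neither field, so mtime alone dirties nothing.


log = File()
write(log, 0, 8192)        # first pass: allocates and grows the file
first_pass_dirty = log.dirty
log.dirty = False          # pretend the metadata was flushed
write(log, 0, 4096)        # recycled log: overwrite inside existing space
```

Once the log file is being recycled, every write lands inside already-allocated space below the current size, so no metadata update is queued.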
9 years ago os/bluestore/BlueFS: ignore flush when buffer is small
Sage Weil [Mon, 21 Dec 2015 20:07:44 +0000 (15:07 -0500)]
os/bluestore/BlueFS: ignore flush when buffer is small

Rocksdb does a flush after every append, each of which is often
less than a full block.  This is very inefficient when our
_flush() will send that to disk (and block).

Avoid this most of the time by ignoring small flush requests
entirely, unless the force flag is set (e.g., by fsync).

Signed-off-by: Sage Weil <sage@redhat.com>
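A minimal sketch of the flush policy described above, assuming a hypothetical `BufferedWriter` with a 4096-byte threshold (the threshold and all names here are illustrative, not Ceph defaults or APIs).

```python
class BufferedWriter:
    def __init__(self, block_size=4096):
        self.block_size = block_size    # assumed threshold, not a Ceph default
        self.buffer = bytearray()
        self.disk_writes = []           # sizes of writes that reached "disk"

    def append(self, data):
        self.buffer += data

    def flush(self, force=False):
        # Skip small flushes unless the caller insists (as fsync would).
        too_small = len(self.buffer) < self.block_size
        if not self.buffer or (too_small and not force):
            return False
        self.disk_writes.append(len(self.buffer))
        self.buffer.clear()
        return True

    def fsync(self):
        return self.flush(force=True)


w = BufferedWriter()
for _ in range(3):
    w.append(b"x" * 100)
    w.flush()                   # ignored: less than one block is buffered
skipped = list(w.disk_writes)
w.fsync()                       # forces the buffered bytes out
```

Three append-then-flush cycles produce no I/O at all; the single forced flush writes everything in one go.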
9 years ago os/bluestore: update freelist in individual transactions
Sage Weil [Mon, 21 Dec 2015 19:45:00 +0000 (14:45 -0500)]
os/bluestore: update freelist in individual transactions

We submit each operation's transaction individually to rocksdb,
and then a final transaction to flush them all.  However,
they may not commit atomically (all together), which means we
need to leave the individual freelist updates within each
transaction.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: better debugging on fsck alloc errors
Sage Weil [Mon, 21 Dec 2015 19:22:58 +0000 (14:22 -0500)]
os/bluestore: better debugging on fsck alloc errors

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago script/crash_bdev: simple script to inject bdev failures
Sage Weil [Mon, 21 Dec 2015 18:54:35 +0000 (13:54 -0500)]
script/crash_bdev: simple script to inject bdev failures

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: fail mount of fsck finds errors
Sage Weil [Mon, 21 Dec 2015 18:53:34 +0000 (13:53 -0500)]
os/bluestore: fail mount of fsck finds errors

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/fs/FS.h: fix aio_t::pread
Sage Weil [Mon, 21 Dec 2015 14:49:05 +0000 (09:49 -0500)]
os/fs/FS.h: fix aio_t::pread

Allocate aligned buffer.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: better error msg for bdev label check
Sage Weil [Mon, 21 Dec 2015 14:39:56 +0000 (09:39 -0500)]
os/bluestore/BlueStore: better error msg for bdev label check

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: don't create block.{db,wal} by default
Sage Weil [Mon, 21 Dec 2015 14:00:17 +0000 (09:00 -0500)]
os/bluestore: don't create block.{db,wal} by default

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago vstart.sh: less noisy debug
Sage Weil [Mon, 21 Dec 2015 13:58:43 +0000 (08:58 -0500)]
vstart.sh: less noisy debug

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: fix fsck contains vs intersects
Sage Weil [Mon, 21 Dec 2015 13:57:18 +0000 (08:57 -0500)]
os/bluestore: fix fsck contains vs intersects

Any overlap is an error.

Signed-off-by: Sage Weil <sage@redhat.com>
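The difference between the two checks named in the commit above is easy to see on plain (offset, length) extents. The helper functions and the `used_blocks` list below are a hypothetical stand-in for whatever interval structure fsck uses.

```python
def contains(used, offset, length):
    """True only if [offset, offset+length) lies inside one used extent."""
    return any(o <= offset and offset + length <= o + l for o, l in used)


def intersects(used, offset, length):
    """True if [offset, offset+length) overlaps any used extent at all."""
    return any(offset < o + l and o < offset + length for o, l in used)


used_blocks = [(0, 8192)]     # extents already claimed by other objects
partial = (4096, 8192)        # overlaps only the tail of the used extent

missed_by_contains = not contains(used_blocks, *partial)
caught_by_intersects = intersects(used_blocks, *partial)
```

A `contains` test lets a partial overlap slip through, while `intersects` flags it, which matches the commit's statement that any overlap is an error.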
9 years ago os/bluestore: bluestore bluefs = true
Sage Weil [Sat, 19 Dec 2015 19:06:00 +0000 (14:06 -0500)]
os/bluestore: bluestore bluefs = true

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago rpm, debian: package ceph-bluefs-tool
Sage Weil [Fri, 18 Dec 2015 20:40:45 +0000 (15:40 -0500)]
rpm, debian: package ceph-bluefs-tool

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: fix error path if label set fails
Sage Weil [Thu, 17 Dec 2015 19:11:07 +0000 (14:11 -0500)]
os/bluestore/BlueStore: fix error path if label set fails

Reported-by: David Zafman <dzafman@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago rocksdb: fix recycle replay
Sage Weil [Thu, 17 Dec 2015 19:12:36 +0000 (14:12 -0500)]
rocksdb: fix recycle replay

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago Makefile-rocksdb.am: update
Sage Weil [Thu, 17 Dec 2015 14:06:48 +0000 (09:06 -0500)]
Makefile-rocksdb.am: update

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: default to 64k min_alloc_size
Sage Weil [Mon, 14 Dec 2015 21:33:38 +0000 (16:33 -0500)]
os/bluestore: default to 64k min_alloc_size

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: fix _open_bdev() failure path
Sage Weil [Mon, 14 Dec 2015 21:28:22 +0000 (16:28 -0500)]
os/bluestore/BlueStore: fix _open_bdev() failure path

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago kv/RocksDBStore: behave if options string is empty
Sage Weil [Mon, 14 Dec 2015 21:27:17 +0000 (16:27 -0500)]
kv/RocksDBStore: behave if options string is empty

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: clear coll_map on umount, fsck finish
Sage Weil [Mon, 14 Dec 2015 20:56:33 +0000 (15:56 -0500)]
os/bluestore: clear coll_map on umount, fsck finish

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/kstore/KStore: fix object key decode with key
Sage Weil [Mon, 14 Dec 2015 20:55:09 +0000 (15:55 -0500)]
os/kstore/KStore: fix object key decode with key

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: fix object key decode with key
Sage Weil [Mon, 14 Dec 2015 19:59:17 +0000 (14:59 -0500)]
os/bluestore/BlueStore: fix object key decode with key

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_objectstore_test: fix warning
Sage Weil [Wed, 9 Dec 2015 21:19:58 +0000 (16:19 -0500)]
ceph_objectstore_test: fix warning

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/KeyValueStore: drop kinetic #include
Sage Weil [Wed, 9 Dec 2015 21:19:07 +0000 (16:19 -0500)]
os/KeyValueStore: drop kinetic #include

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/kstore: add new KStore backend
Sage Weil [Thu, 10 Dec 2015 21:04:32 +0000 (16:04 -0500)]
os/kstore: add new KStore backend

This is based on BlueStore, but with all of the block-related code
and complexity ripped out, and a simple striping strategy added
in its place.

Signed-off-by: Sage Weil <sage@redhat.com>
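The commit above does not spell out the striping strategy, so the following is only one plausible reading: object data is cut into fixed-size stripes, each stored under its own key. The function name and the 65536-byte stripe size are assumptions for illustration.

```python
def stripe_ranges(offset, length, stripe_size=65536):
    """Split a byte range into (stripe_index, offset_in_stripe, length)."""
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // stripe_size
        within = offset % stripe_size
        n = min(stripe_size - within, end - offset)
        pieces.append((index, within, n))
        offset += n
    return pieces


# A 10000-byte write starting at byte 60000 straddles stripes 0 and 1.
pieces = stripe_ranges(60000, 10000)
```

Each tuple names the stripe to read or write and the sub-range within it, so a store built this way only needs per-stripe key-value operations and no block allocator.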
9 years ago os/bluestore/bluestore_types: localize types
Sage Weil [Thu, 10 Dec 2015 21:03:59 +0000 (16:03 -0500)]
os/bluestore/bluestore_types: localize types

Prefix with bluestore_

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: add extent_ref_map_t
Sage Weil [Thu, 10 Dec 2015 22:27:04 +0000 (17:27 -0500)]
os/bluestore: add extent_ref_map_t

This will be used to refcount extents for some subset
of the store (objects with same name or hash value?).

Signed-off-by: Sage Weil <sage@redhat.com>
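To make the idea of refcounting extents concrete, here is a deliberately simplified sketch that counts references per fixed-size block rather than per variable-length extent. `ExtentRefMap`, its `get`/`put` methods, and the 4096-byte block size are hypothetical and should not be read as the interface of `extent_ref_map_t`.

```python
from collections import Counter


class ExtentRefMap:
    """Per-block reference counts; a simplification of an extent-based map."""
    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.refs = Counter()    # block index -> reference count

    def _blocks(self, offset, length):
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        return range(first, last + 1)

    def contains(self, offset, length):
        return all(self.refs[b] > 0 for b in self._blocks(offset, length))

    def get(self, offset, length):
        """Take one reference on every block in the range."""
        for b in self._blocks(offset, length):
            self.refs[b] += 1

    def put(self, offset, length):
        """Drop one reference; return offsets of blocks that became free."""
        if not self.contains(offset, length):
            raise KeyError("range is not fully referenced")
        released = []
        for b in self._blocks(offset, length):
            self.refs[b] -= 1
            if self.refs[b] == 0:
                del self.refs[b]
                released.append(b * self.block_size)
        return released

    def empty(self):
        return not self.refs


m = ExtentRefMap()
m.get(0, 8192)                  # original object references two blocks
m.get(0, 4096)                  # a clone shares the first block
freed_first = m.put(0, 8192)    # original goes away: only block 1 is free
freed_second = m.put(0, 4096)   # clone goes away: block 0 is free too
```

A block is handed back for release only when its last reference is dropped, which is the property a copy-on-write clone relies on.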
9 years ago os/bluestore/FreelistManager: drop unused db ref
Sage Weil [Thu, 10 Dec 2015 21:03:41 +0000 (16:03 -0500)]
os/bluestore/FreelistManager: drop unused db ref

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: record kv backend
Sage Weil [Thu, 10 Dec 2015 21:03:23 +0000 (16:03 -0500)]
os/bluestore: record kv backend

Record kv backend at mkfs time instead of relying on current value
of config option.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: statfs
Sage Weil [Thu, 10 Dec 2015 21:02:45 +0000 (16:02 -0500)]
os/bluestore: statfs

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlockDevice: inject block failures
Sage Weil [Fri, 4 Dec 2015 01:03:10 +0000 (20:03 -0500)]
os/bluestore/BlockDevice: inject block failures

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago ceph_test_objectstore: clean up synthetic collections
Sage Weil [Thu, 3 Dec 2015 21:33:37 +0000 (16:33 -0500)]
ceph_test_objectstore: clean up synthetic collections

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: block.db support
Sage Weil [Thu, 10 Dec 2015 22:17:45 +0000 (17:17 -0500)]
os/bluestore: block.db support

Support a mid- to fast device that will preferentially
store the rocksdb data (and wal, if block.wal is not
present).

Signed-off-by: Sage Weil <sage@redhat.com>
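The placement preference described above can be written as a small lookup. `pick_device` and the `"data"` kind are hypothetical, and the final fallback to the main `block` device when neither optional device exists is an assumption of this sketch rather than something the commit states.

```python
def pick_device(kind, present):
    """Choose a device for `kind` ("wal", "db" or "data") from those present."""
    preference = {
        "wal": ["block.wal", "block.db", "block"],
        "db": ["block.db", "block"],
        "data": ["block"],
    }
    for dev in preference[kind]:
        if dev in present:
            return dev
    raise ValueError(f"no device available for {kind}")


# Without block.wal, the rocksdb WAL lands on block.db alongside the data.
wal_without_wal_dev = pick_device("wal", {"block", "block.db"})
wal_with_wal_dev = pick_device("wal", {"block", "block.db", "block.wal"})
```

Listing devices from most to least preferred keeps the fallback order in one place instead of spreading it across conditionals.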
9 years ago os/bluestore: less debug noise
Sage Weil [Thu, 10 Dec 2015 22:17:10 +0000 (17:17 -0500)]
os/bluestore: less debug noise

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueFS: all overwrites on open_for_write
Sage Weil [Thu, 10 Dec 2015 22:21:03 +0000 (17:21 -0500)]
os/bluestore/BlueFS: all overwrites on open_for_write

rocksdb will occasionally overwrite an existing file
if it is not present/valid in the manifest.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: drop internal EnvMirror
Sage Weil [Wed, 25 Nov 2015 19:27:28 +0000 (14:27 -0500)]
os/bluestore/BlueStore: drop internal EnvMirror

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago rocksdb: pull up to master, include EnvMirror
Sage Weil [Fri, 11 Dec 2015 14:32:30 +0000 (09:32 -0500)]
rocksdb: pull up to master, include EnvMirror

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: label all block devices
Sage Weil [Thu, 10 Dec 2015 22:20:25 +0000 (17:20 -0500)]
os/bluestore: label all block devices

Label all of our block devices with a simple label
that includes the osd_uuid.  Wire this into the
ObjectStore and OSD probe mechanism.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueFS: flush log if needed
Sage Weil [Thu, 10 Dec 2015 22:19:29 +0000 (17:19 -0500)]
os/bluestore/BlueFS: flush log if needed

If a file has dirty metadata (but no dirty data), we
still need to flush the log when it is flushed.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueFS: fix replay of unlink
Sage Weil [Thu, 10 Dec 2015 22:18:57 +0000 (17:18 -0500)]
os/bluestore/BlueFS: fix replay of unlink

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: support second block.wal device
Sage Weil [Thu, 10 Dec 2015 22:15:57 +0000 (17:15 -0500)]
os/bluestore: support second block.wal device

Use this device for the bluefs log.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueStore: fix zero gap bug
Sage Weil [Thu, 10 Dec 2015 22:15:33 +0000 (17:15 -0500)]
os/bluestore/BlueStore: fix zero gap bug

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore: disable overlay for now
Sage Weil [Thu, 10 Dec 2015 22:15:14 +0000 (17:15 -0500)]
os/bluestore: disable overlay for now

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlockDevice: restructure interface
Sage Weil [Fri, 27 Nov 2015 16:07:46 +0000 (11:07 -0500)]
os/bluestore/BlockDevice: restructure interface

Use atomics; do not track in-flight extents or magically cope
with racing IOs (that is the user's responsibility).

Signed-off-by: Sage Weil <sage@redhat.com>
9 years ago os/bluestore/BlueFS: fix overwrite
Sage Weil [Thu, 10 Dec 2015 21:49:56 +0000 (16:49 -0500)]
os/bluestore/BlueFS: fix overwrite

Signed-off-by: Sage Weil <sage@redhat.com>