git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
12 years agodoc: fix os-recommendations table
Sage Weil [Thu, 1 Nov 2012 04:27:51 +0000 (21:27 -0700)]
doc: fix os-recommendations table

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc: Index entry for OS Recommendations
John Wilkins [Tue, 30 Oct 2012 20:00:17 +0000 (13:00 -0700)]
doc: Index entry for OS Recommendations

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: update os-recommendations
Sage Weil [Thu, 1 Nov 2012 04:24:56 +0000 (21:24 -0700)]
doc: update os-recommendations

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'next'
Sage Weil [Thu, 1 Nov 2012 00:14:44 +0000 (17:14 -0700)]
Merge branch 'next'

12 years agoceph-disk-activate: avoid duplicating mounts if already activated
Sage Weil [Tue, 30 Oct 2012 21:17:56 +0000 (14:17 -0700)]
ceph-disk-activate: avoid duplicating mounts if already activated

If the given device is already mounted at the target location, do not
mount --move it again and create a bunch of dup entries in the /etc/mtab
and kernel mount table.

Signed-off-by: Sage Weil <sage@inktank.com>
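
A minimal sketch of the already-mounted check described in the commit above, run against mount-table text in the /proc/mounts format. The function name, the sample device, and the sample paths are illustrative assumptions and are not taken from ceph-disk-activate itself.

```python
def is_mounted_at(mounts_text, device, target):
    """Return True if `device` already appears mounted at `target`.

    `mounts_text` is text in the /proc/mounts layout:
    device, mount point, fs type, options, dump, pass.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device and fields[1] == target:
            return True
    return False


# Hypothetical mount table used only for illustration.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw 0 0
/dev/sdb1 /var/lib/ceph/osd/ceph-0 xfs rw,noatime 0 0
"""

already = is_mounted_at(SAMPLE_MOUNTS, "/dev/sdb1", "/var/lib/ceph/osd/ceph-0")
elsewhere = is_mounted_at(SAMPLE_MOUNTS, "/dev/sdb1", "/mnt")
```

A caller would skip the mount step when `already` is true, so no duplicate entry is created.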
12 years agoPG: requeue snap_trimmer after scrub finishes
Mike Ryan [Wed, 31 Oct 2012 18:36:49 +0000 (11:36 -0700)]
PG: requeue snap_trimmer after scrub finishes

Previously the snap_trimmer would continuously requeue itself until the
end of scrub. This degrades performance and fills up logs for No Good
Reason.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
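
The behavior change described in the commit above can be modeled roughly as follows. This is a toy model, not the PG code: the class, its fields, and the string used as a work item are all hypothetical. It only shows the idea of remembering one deferred trim request and requeueing it once when the scrub ends.

```python
class PGSketch:
    """Toy model: defer snap trimming while a scrub is running."""

    def __init__(self):
        self.scrubbing = False
        self.trim_deferred = False
        self.work_queue = []    # stand-in for the real work queue
        self.queue_count = 0    # how many times trimming was actually queued

    def queue_snap_trim(self):
        if self.scrubbing:
            # Remember the request instead of requeueing over and over.
            self.trim_deferred = True
            return
        self.work_queue.append("snap_trim")
        self.queue_count += 1

    def scrub_start(self):
        self.scrubbing = True

    def scrub_finish(self):
        self.scrubbing = False
        if self.trim_deferred:
            self.trim_deferred = False
            self.queue_snap_trim()


pg = PGSketch()
pg.scrub_start()
for _ in range(1000):
    pg.queue_snap_trim()   # all of these are absorbed while scrubbing
pg.scrub_finish()          # a single requeue happens here
```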
12 years agodoc: tiny syntax fix.
John Wilkins [Wed, 31 Oct 2012 21:12:21 +0000 (14:12 -0700)]
doc: tiny syntax fix.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added internal anchor references.
John Wilkins [Wed, 31 Oct 2012 21:11:50 +0000 (14:11 -0700)]
doc: Added internal anchor references.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: using remote copy
John Wilkins [Wed, 31 Oct 2012 21:11:12 +0000 (14:11 -0700)]
doc: using remote copy

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge remote-tracking branch 'upstream/wip_dep_fix'
Samuel Just [Wed, 31 Oct 2012 18:37:06 +0000 (11:37 -0700)]
Merge remote-tracking branch 'upstream/wip_dep_fix'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoREADME: add libboost-program-options-dev
Samuel Just [Wed, 31 Oct 2012 18:34:13 +0000 (11:34 -0700)]
README: add libboost-program-options-dev

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoconfigure.ac: add program_options header check
Samuel Just [Wed, 31 Oct 2012 17:27:33 +0000 (10:27 -0700)]
configure.ac: add program_options header check

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge branch 'wip_journal_perf'
Samuel Just [Tue, 30 Oct 2012 20:31:45 +0000 (13:31 -0700)]
Merge branch 'wip_journal_perf'

12 years agoReplicatedPG: actually delay op for backfill_pos
Samuel Just [Mon, 22 Oct 2012 21:25:27 +0000 (14:25 -0700)]
ReplicatedPG: actually delay op for backfill_pos

3f952afe5da644b30015fead8e3d42a129b59989 neglected to
actually delay the op in ReplicatedPG::do_op.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFinisher: add perf counter for queue len
Samuel Just [Mon, 22 Oct 2012 18:09:18 +0000 (11:09 -0700)]
Finisher: add perf counter for queue len

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileJournal: rename queue_lock to finisher_lock
Samuel Just [Tue, 16 Oct 2012 16:33:01 +0000 (09:33 -0700)]
FileJournal: rename queue_lock to finisher_lock

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileJournal: write_cond is not used
Samuel Just [Tue, 16 Oct 2012 16:25:07 +0000 (09:25 -0700)]
FileJournal: write_cond is not used

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileJournal: break writeq locking from queue_lock
Samuel Just [Mon, 15 Oct 2012 22:39:55 +0000 (15:39 -0700)]
FileJournal: break writeq locking from queue_lock

This prevents the relatively long process of queueing
finishers from preventing op submission.

In submit_entry, we no longer check for full before placing
the write in the writeq; committed_thru should work anyway,
and we don't want to grab the required lock.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoThrottle: reduce lock hold periods
Samuel Just [Tue, 16 Oct 2012 19:32:20 +0000 (12:32 -0700)]
Throttle: reduce lock hold periods

Previously, we tended to dump a lot of log output under
the Throttle lock.  The log level for most log statements
has been reduced to 10.

Additionally, count and max are now atomic_t and can be
read without the Throttle lock.

Finally, most of the perf counter manipulations have been
moved outside of the lock.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoos: instrument submit lock, apply lock, queue_lock, write_lock
Samuel Just [Thu, 11 Oct 2012 01:21:13 +0000 (18:21 -0700)]
os: instrument submit lock, apply lock, queue_lock, write_lock

Adds Mutex perfcounter tracking to mutexes of interest.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileStore: add op_throttle_lock
Samuel Just [Wed, 10 Oct 2012 16:44:32 +0000 (09:44 -0700)]
FileStore: add op_throttle_lock

Avoid using op_tp lock for the op throttle.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileStore: don't lock op_tp in queue_op
Samuel Just [Wed, 10 Oct 2012 16:43:57 +0000 (09:43 -0700)]
FileStore: don't lock op_tp in queue_op

Neither caller of queue_op can race.
1) in queue_transactions, already under submit lock
2) in _journaled_ahead, journal finisher is single threaded

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoperf_counters: add dec()
Samuel Just [Mon, 22 Oct 2012 17:46:57 +0000 (10:46 -0700)]
perf_counters: add dec()

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoJournalingFileStore: move apply/commit sequencing to apply_manager
Samuel Just [Sat, 6 Oct 2012 00:33:36 +0000 (17:33 -0700)]
JournalingFileStore: move apply/commit sequencing to apply_manager

Syncing the filestore requires a stable commit point (i.e., all ops
up to applied_seq must have been applied).  Previously, we used
journal_lock to atomically block new applies while waiting for
the remaining ones to finish.  This creates unnecessary contention.
We now use apply_manager to manage that state atomically with its
own lock.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoJournalingFileStore: create submit_manager to order op submission
Samuel Just [Fri, 5 Oct 2012 20:46:13 +0000 (13:46 -0700)]
JournalingFileStore: create submit_manager to order op submission

Previously, we ensured op ordering by queueing for journal and
the op queue under the journal lock.  All that is required is
that obtaining an op sequence, queueing for journal, and
(for parallel) queueing for application to the fs are done
atomically.  To that end, submit_manager now handles op submission.

Signed-off-by: Samuel Just <sam.just@inktank.com>
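
As a rough illustration of the requirement stated in the commit above (obtaining a sequence number and queueing must happen atomically), here is a small sketch. The class and attribute names are hypothetical and only one queue is modeled; it is not the submit_manager implementation.

```python
import threading


class SubmitManagerSketch:
    """Assign a sequence number and enqueue the op under one small lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_seq = 0
        self.journal_queue = []   # stand-in for the journal queue

    def submit(self, op):
        with self._lock:
            # Both steps happen under the same lock, so queue order
            # always matches sequence order.
            self._next_seq += 1
            seq = self._next_seq
            self.journal_queue.append((seq, op))
        return seq


def worker(manager, count):
    for _ in range(count):
        manager.submit("op")


mgr = SubmitManagerSketch()
threads = [threading.Thread(target=worker, args=(mgr, 200)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

seqs = [seq for seq, _ in mgr.journal_queue]
```

The point of a dedicated lock is that nothing else (such as syncing) needs to be held while ordering submissions.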
12 years agoJournalingObjectStore: remove force_commit, no longer needed
Samuel Just [Fri, 5 Oct 2012 23:26:35 +0000 (16:26 -0700)]
JournalingObjectStore: remove force_commit, no longer needed

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoJournalingObjectStore: whitespace fix
Samuel Just [Fri, 5 Oct 2012 23:12:36 +0000 (16:12 -0700)]
JournalingObjectStore: whitespace fix

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileStore: remove trigger_commit
Samuel Just [Thu, 2 Aug 2012 16:39:08 +0000 (09:39 -0700)]
FileStore: remove trigger_commit

This is no longer used.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoJournalingFileStore: pass -1 as the alignment if unimportant
Samuel Just [Tue, 31 Jul 2012 16:04:40 +0000 (09:04 -0700)]
JournalingFileStore: pass -1 as the alignment if unimportant

Previously, data_align began at 0 and remained that way if no
transaction contained a large data segment.  This 0 was propagated
to prepare_single_write, which padded out most of a page to ensure
that the bl started with 0 alignment.  Passing -1 will ensure that
we don't prepad these small segments.

Signed-off-by: Samuel Just <sam.just@inktank.com>
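
The padding arithmetic behind the commit above can be sketched as follows. The 4096-byte page size, the 40-byte header length, and the function name are assumptions chosen for illustration; the real layout in prepare_single_write may differ.

```python
PAGE_SIZE = 4096  # assumed page size


def pre_pad(header_len, data_align, page_size=PAGE_SIZE):
    """Padding bytes needed after a header so the data begins at
    offset `data_align` within a page.

    A negative `data_align` means the caller does not care about
    alignment, so no padding is added.
    """
    if data_align < 0:
        return 0
    return (data_align - header_len) % page_size


# With alignment 0, a small write behind a 40-byte header is padded
# out by most of a page; with -1 it is not padded at all.
padded = pre_pad(40, 0)
unpadded = pre_pad(40, -1)
```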
12 years agoFileStore: next_finish is not used
Samuel Just [Wed, 17 Oct 2012 20:06:51 +0000 (13:06 -0700)]
FileStore: next_finish is not used

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agotest/bench: add tp bench
Samuel Just [Tue, 23 Oct 2012 05:04:40 +0000 (22:04 -0700)]
test/bench: add tp bench

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agotest/bench: small io benchmarker
Samuel Just [Sat, 6 Oct 2012 20:58:37 +0000 (13:58 -0700)]
test/bench: small io benchmarker

Precreates objects and does writes to random offsets within
random objects.

Includes rados, filestore, and vanilla fs variants.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoMutex: Instrument Mutex with perfcounter for Lock() wait
Samuel Just [Thu, 11 Oct 2012 01:20:31 +0000 (18:20 -0700)]
Mutex: Instrument Mutex with perfcounter for Lock() wait

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agomsg/SimpleMessenger: start accepter in ready()
Sage Weil [Tue, 30 Oct 2012 20:19:30 +0000 (13:19 -0700)]
msg/SimpleMessenger: start accepter in ready()

Start the accepter thread when the first dispatcher is ready.  This ensures
that there will be someone around to verify authorizers for incoming
connections, and means we have a bit less failure noise on the monitors
as a result.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: separate pre- and post-fork init
Sage Weil [Tue, 30 Oct 2012 20:16:57 +0000 (13:16 -0700)]
mon: separate pre- and post-fork init

Do most init pre-fork, then do the last little bit (start up messenger,
bootstrap) post-fork.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomsg/Pipe: fix seq # fix
Sage Weil [Tue, 30 Oct 2012 20:08:57 +0000 (13:08 -0700)]
msg/Pipe: fix seq # fix

02f6262f47f72178a78d410f4facab7bbc97b098 got this all wrong (though it
worked by accident).

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: verify authorizers for heartbeat dispatcher
Sage Weil [Tue, 30 Oct 2012 19:49:53 +0000 (12:49 -0700)]
osd: verify authorizers for heartbeat dispatcher

This broke when 100fcca3cb54c97c4332328aad67d4b796f33ec2 fixed the
messenger's behavior for dispatchers with missing verify_authorizer methods.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc: fix typo in cinder upstart config name
Josh Durgin [Tue, 30 Oct 2012 19:34:19 +0000 (12:34 -0700)]
doc: fix typo in cinder upstart config name

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
12 years agodoc: Added syntax fixes to Peter's session authentication doc.
John Wilkins [Tue, 30 Oct 2012 18:20:51 +0000 (11:20 -0700)]
doc: Added syntax fixes to Peter's session authentication doc.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoceph-disk-prepare: poke kernel into refreshing partition tables
Sage Weil [Fri, 26 Oct 2012 04:21:18 +0000 (21:21 -0700)]
ceph-disk-prepare: poke kernel into refreshing partition tables

Prod the kernel to refresh the partition table after we create one.  The
partprobe program is packaged with parted, which we already use, so this
introduces no new dependency.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk-prepare: fix journal partition creation
Sage Weil [Fri, 26 Oct 2012 04:20:21 +0000 (21:20 -0700)]
ceph-disk-prepare: fix journal partition creation

The end value needs to have + to indicate it is relative to wherever the
start is.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk-prepare: assume parted failure means no partition table
Sage Weil [Fri, 26 Oct 2012 01:14:47 +0000 (18:14 -0700)]
ceph-disk-prepare: assume parted failure means no partition table

If the disk has no valid label we get an error like

  Error: /dev/sdi: unrecognised disk label

Assume any error we get is that and go with an id label of 1.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomsg/Pipe: whitespace cleanup
Sage Weil [Tue, 30 Oct 2012 17:00:54 +0000 (10:00 -0700)]
msg/Pipe: whitespace cleanup

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomsg/Pipe: only randomize start seq #'s if MSG_AUTH feature is present
Sage Weil [Tue, 30 Oct 2012 17:00:42 +0000 (10:00 -0700)]
msg/Pipe: only randomize start seq #'s if MSG_AUTH feature is present

The kernel client expects seq #'s to start at 1 or else it is unhappy.
So, only randomize these values if the MSG_AUTH feature is present--that is
the only time it matters anyway.

Signed-off-by: Sage Weil <sage@inktank.com>
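
The rule in the commit above can be expressed as a small sketch. The feature bit value, the function name, and the upper bound on the random range are hypothetical; only the decision logic (randomize only when the peer has MSG_AUTH, otherwise start at 1) comes from the commit message.

```python
import random

FEATURE_MSG_AUTH = 1 << 0   # hypothetical bit; the real value is defined by Ceph


def initial_seq(peer_features, rng=random):
    """Pick the starting message sequence number for a connection."""
    if peer_features & FEATURE_MSG_AUTH:
        # Peers that sign messages can cope with a random starting point.
        return rng.randrange(1, 2**31)
    # Older peers, such as the kernel client, expect numbering from 1.
    return 1


legacy_start = initial_seq(0)
signed_start = initial_seq(FEATURE_MSG_AUTH)
```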
12 years agodoc: update fs recommendations
Sage Weil [Mon, 29 Oct 2012 20:01:06 +0000 (13:01 -0700)]
doc: update fs recommendations

More forceful about recommending XFS.  More warning about using btrfs in
production deployments.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocephx: don't check signature if MSG_AUTH feature isn't present
Sage Weil [Mon, 29 Oct 2012 22:48:15 +0000 (15:48 -0700)]
cephx: don't check signature if MSG_AUTH feature isn't present

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoauth: include features in cephx SessionHandler
Sage Weil [Mon, 29 Oct 2012 22:47:45 +0000 (15:47 -0700)]
auth: include features in cephx SessionHandler

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoFixed problem with checking authorizer in accept().
Peter Reiher [Mon, 29 Oct 2012 21:36:08 +0000 (14:36 -0700)]
Fixed problem with checking authorizer in accept().

Signed-off-by: Peter Reiher <reiher@inktank.com>
12 years agolibrbd: Fix 32-bit compilation errors
Dan Mick [Mon, 29 Oct 2012 18:03:15 +0000 (11:03 -0700)]
librbd: Fix 32-bit compilation errors

Switch size_t in clip_io to uint64_t; it's just easier, and the
alternative would be to limit 32-bit builds to sizes <= 4GB.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
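
To see why a 32-bit size_t is a problem for the commit above, the following sketch uses ctypes fixed-width integers to mimic storing a length in a 32-bit versus a 64-bit unsigned type. The 5 GiB request size is an arbitrary example value.

```python
import ctypes

GiB = 1 << 30
requested = 5 * GiB

# ctypes integer types do no overflow checking, so the value wraps
# the same way it would in a 32-bit unsigned variable.
as_32bit = ctypes.c_uint32(requested).value
as_64bit = ctypes.c_uint64(requested).value
```

Here `as_32bit` silently becomes 1 GiB, while `as_64bit` keeps the full 5 GiB.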
12 years agoMerge branch 'master' of github.com:ceph/ceph
Peter Reiher [Mon, 29 Oct 2012 19:47:18 +0000 (12:47 -0700)]
Merge branch 'master' of github.com:ceph/ceph

12 years agoTemporary patch to a problem in Pipe related to monitor initialization.
Peter Reiher [Mon, 29 Oct 2012 19:42:29 +0000 (12:42 -0700)]
Temporary patch to a problem in Pipe related to monitor initialization.

Signed-off-by: Peter Reiher <reiher@inktank.com>
12 years agoMerge branch 'wip-oc-neg'
Sage Weil [Mon, 29 Oct 2012 19:37:08 +0000 (12:37 -0700)]
Merge branch 'wip-oc-neg'

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoosd: make pool_snap_info_t encoding backward compatible
Sage Weil [Mon, 29 Oct 2012 18:03:46 +0000 (11:03 -0700)]
osd: make pool_snap_info_t encoding backward compatible

Way back in fc869dee1e8a1c90c93cb7e678563772fb1c51fb (v0.42) when we redid
the osd type encoding we forgot to make this conditionally encode the old
format for old clients.  In particular, this means that kernel clients
will fail to decode the osdmap if there is a rados pool with a pool-level
snapshot defined.

Fixes: #3290
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodep-report.sh: ceph package dependency report.
Gary Lowell [Mon, 29 Oct 2012 16:55:33 +0000 (09:55 -0700)]
dep-report.sh: ceph package dependency report.

This script searches the ceph build area for dependent header files
and libraries to attempt to identify ceph package dependencies.

12 years agoclient: Fix ref counting double free with hardlink
Sam Lang [Mon, 29 Oct 2012 15:30:01 +0000 (10:30 -0500)]
client: Fix ref counting double free with hardlink

Performing a hard link through the libcephfs interface causes
a double free on shutdown, due to the Client::link call decrementing
the reference count on the parent (of the target) directory's inode.
This fix removes the put_inode(dir) call, to match the behavior of
Client::ll_link.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agotest: Functional test for hardlink/unmount pattern
Sam Lang [Fri, 19 Oct 2012 16:38:33 +0000 (11:38 -0500)]
test: Functional test for hardlink/unmount pattern

This test currently breaks on libcephfs as reported
in #3367.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
12 years agoosdc/ObjectCacher: remove dead locking code
Sage Weil [Sat, 27 Oct 2012 20:56:24 +0000 (13:56 -0700)]
osdc/ObjectCacher: remove dead locking code

This is unused, and mostly broken in that there is no cleanup when there
is a failure.  Also, the support in the OSD has been largely removed.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agolibrbd: clip requests past end-of-image.
Dan Mick [Tue, 23 Oct 2012 04:15:51 +0000 (21:15 -0700)]
librbd: clip requests past end-of-image.

Rename check_io to clip_io, which can modify the passed-in length
to clamp it to the device size.  This is expected behavior for
block-device emulation.

Call clip_io in rbd_write(); need to return clipped length there,
even though aio_write() is calling clip_io() as well (for the
direct path).

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
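
The clamping described in the commit above can be sketched as below. This is not the librbd function: the signature, the use of an exception, and the choice to reject offsets beyond the end of the image are assumptions made for the example.

```python
def clip_io(image_size, offset, length):
    """Return `length` clamped so the request ends at or before the
    end of the image.

    Raises ValueError for negative values or an offset past the end
    (an assumed policy for this sketch).
    """
    if offset < 0 or length < 0 or offset > image_size:
        raise ValueError("request starts outside the image")
    return min(length, image_size - offset)


inside = clip_io(100, 0, 50)      # fits entirely, unchanged
clipped = clip_io(100, 90, 20)    # only 10 bytes remain before the end
at_end = clip_io(100, 100, 5)     # nothing left to read or write
```

A synchronous write path would then report the clipped length back to the caller, as the commit message notes for rbd_write().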
12 years agolibrbd: size max objects based on actual image object order size
Sage Weil [Sat, 27 Oct 2012 00:12:44 +0000 (17:12 -0700)]
librbd: size max objects based on actual image object order size

This has to happen after we open the image.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agorgw_cache: change call signature to overwrite rgw_rados put_obj_meta()
caleb miles [Fri, 26 Oct 2012 19:17:05 +0000 (15:17 -0400)]
rgw_cache: change call signature to overwrite rgw_rados put_obj_meta()

Signed-off-by: caleb miles <caleb.miles@inktank.com>
12 years agoMerge branch 'master' of github.com:ceph/ceph
Peter Reiher [Fri, 26 Oct 2012 22:32:48 +0000 (15:32 -0700)]
Merge branch 'master' of github.com:ceph/ceph

12 years agomon: fix leading error string from 'ceph report'
Sage Weil [Fri, 26 Oct 2012 21:55:31 +0000 (14:55 -0700)]
mon: fix leading error string from 'ceph report'

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'master' of https://github.com/ceph/ceph
John Wilkins [Fri, 26 Oct 2012 21:49:00 +0000 (14:49 -0700)]
Merge branch 'master' of https://github.com/ceph/ceph

12 years agodoc: updated front page graphic.
John Wilkins [Fri, 26 Oct 2012 21:45:08 +0000 (14:45 -0700)]
doc: updated front page graphic.

fixes: #3412

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge branch 'wip-java-cephfs'
Noah Watkins [Fri, 26 Oct 2012 21:37:25 +0000 (14:37 -0700)]
Merge branch 'wip-java-cephfs'

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Joe Buck <joe.buck@inktank.com>
12 years agoPG: Do not discard op data too early
Jim Schutt [Thu, 27 Sep 2012 21:56:15 +0000 (15:56 -0600)]
PG: Do not discard op data too early

Under a sustained cephfs write load where the offered load is higher
than the storage cluster write throughput, a backlog of replication ops
that arrive via the cluster messenger builds up.  The client message
policy throttler, which should be limiting the total write workload
accepted by the storage cluster, is unable to prevent it, for any
value of osd_client_message_size_cap, under such an overload condition.

The root cause is that op data is released too early, in op_applied().

If instead the op data is released at op deletion, then the limit
imposed by the client policy throttler applies over the entire
lifetime of the op, including commits of replication ops.  That
makes the policy throttler an effective means for an OSD to
protect itself from a sustained high offered load, because it can
effectively limit the total, cluster-wide resources needed to process
in-progress write ops.

Signed-off-by: Jim Schutt <jaschut@sandia.gov>
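
A simplified model of the argument in the commit above: if the throttle budget taken by an op is only returned when the op is deleted, new client writes stay blocked for as long as the op is still committing. All class and method names here are hypothetical, and the byte counts are arbitrary.

```python
class ThrottleSketch:
    """Byte budget that admits a request only if it fits."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def try_take(self, n):
        if self.current + n > self.limit:
            return False
        self.current += n
        return True

    def put(self, n):
        self.current -= n


class OpSketch:
    def __init__(self, throttle, size):
        self.throttle = throttle
        self.size = size

    def on_applied(self):
        # The budget is deliberately NOT returned here any more.
        pass

    def on_delete(self):
        # Budget is returned only at the end of the op's lifetime.
        self.throttle.put(self.size)


throttle = ThrottleSketch(limit=100)
took = throttle.try_take(80)
op = OpSketch(throttle, 80)

op.on_applied()
admitted_while_committing = throttle.try_take(80)   # still over budget

op.on_delete()
admitted_after_delete = throttle.try_take(80)       # budget is free again
```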
12 years agojava: use unique directory in test
Noah Watkins [Fri, 26 Oct 2012 20:28:52 +0000 (13:28 -0700)]
java: use unique directory in test

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: add tests for double mounting
Noah Watkins [Thu, 25 Oct 2012 22:10:17 +0000 (15:10 -0700)]
java: add tests for double mounting

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: add AlreadyMounted exception
Noah Watkins [Thu, 25 Oct 2012 22:09:54 +0000 (15:09 -0700)]
java: add AlreadyMounted exception

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: remove deprecated ceph_shutdown
Noah Watkins [Thu, 25 Oct 2012 21:42:27 +0000 (14:42 -0700)]
java: remove deprecated ceph_shutdown

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: clean-up in finalize()
Noah Watkins [Thu, 25 Oct 2012 21:43:09 +0000 (14:43 -0700)]
java: clean-up in finalize()

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: enable ceph_release
Noah Watkins [Thu, 25 Oct 2012 21:23:18 +0000 (14:23 -0700)]
java: enable ceph_release

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: enable ceph_unmount
Noah Watkins [Thu, 25 Oct 2012 21:10:24 +0000 (14:10 -0700)]
java: enable ceph_unmount

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: mkdirs returns IOException
Noah Watkins [Sat, 20 Oct 2012 17:58:23 +0000 (10:58 -0700)]
java: mkdirs returns IOException

For example, CephFileAlreadyExistsException may be returned if mkdirs is
called to create a directory already present.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: log listdir contents in java client
Noah Watkins [Thu, 25 Oct 2012 15:51:33 +0000 (08:51 -0700)]
java: log listdir contents in java client

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: remove tabs to fix formatting
Noah Watkins [Fri, 19 Oct 2012 19:22:05 +0000 (12:22 -0700)]
java: remove tabs to fix formatting

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: add O_WRONLY open flag
Noah Watkins [Fri, 19 Oct 2012 19:20:40 +0000 (12:20 -0700)]
java: add O_WRONLY open flag

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agojava: add FileAlreadyExists exception
Noah Watkins [Fri, 19 Oct 2012 19:10:25 +0000 (12:10 -0700)]
java: add FileAlreadyExists exception

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
12 years agoosdc/ObjectCacher: handle zero bufferheads on read
Sage Weil [Fri, 26 Oct 2012 18:55:34 +0000 (11:55 -0700)]
osdc/ObjectCacher: handle zero bufferheads on read

Interpret a zero bufferhead as zeros in _readx().

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: add ZERO bufferheads from map_read()
Sage Weil [Fri, 26 Oct 2012 18:54:50 +0000 (11:54 -0700)]
osdc/ObjectCacher: add ZERO bufferheads from map_read()

When we add a bufferhead with zeros to the Object data map, use the new
zero type instead of allocating actual zeros.

Signed-off-by: Sage Weil <sage@inktank.com>
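
The idea of a zero bufferhead, as described in the commits above, is to record that a range reads as zeros without holding a buffer of zeros. A minimal sketch, with hypothetical state names and fields:

```python
from dataclasses import dataclass
from typing import Optional

STATE_CLEAN = "clean"
STATE_ZERO = "zero"


@dataclass
class BufferHeadSketch:
    start: int
    length: int
    state: str
    data: Optional[bytes] = None   # stays None for zero bufferheads

    def read(self, offset, n):
        """Read up to `n` bytes at `offset` relative to `start`."""
        n = max(0, min(n, self.length - offset))
        if self.state == STATE_ZERO:
            # Zeros are produced on demand, sized to the request only.
            return bytes(n)
        return self.data[offset:offset + n]


hole = BufferHeadSketch(start=0, length=1 << 20, state=STATE_ZERO)
real = BufferHeadSketch(start=0, length=4, state=STATE_CLEAN, data=b"abcd")

hole_bytes = hole.read(10, 4)
real_bytes = real.read(1, 2)
```

The 1 MiB hole costs no data memory until someone reads from it.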
12 years agoosdc/ObjectCacher: add zero bufferhead state
Sage Weil [Fri, 26 Oct 2012 18:48:51 +0000 (11:48 -0700)]
osdc/ObjectCacher: add zero bufferhead state

Wired up, but not yet used.

Treat these as clean.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agotest_librbd_fsx: sleep before exit
Sage Weil [Fri, 26 Oct 2012 18:33:31 +0000 (11:33 -0700)]
test_librbd_fsx: sleep before exit

This gives the log time to flush to disk.  Kludgey!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: some extra debugging
Sage Weil [Fri, 26 Oct 2012 18:32:44 +0000 (11:32 -0700)]
osdc/ObjectCacher: some extra debugging

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: fill in zero buffers in map_read() on miss if complete
Sage Weil [Wed, 24 Oct 2012 21:42:50 +0000 (14:42 -0700)]
osdc/ObjectCacher: fill in zero buffers in map_read() on miss if complete

If we know we have the complete object in cache, fill in zero buffers
when we miss.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: improve debug output for readx()
Sage Weil [Wed, 24 Oct 2012 21:43:03 +0000 (14:43 -0700)]
osdc/ObjectCacher: improve debug output for readx()

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: set complete flag when we observe ENOENT
Sage Weil [Wed, 24 Oct 2012 21:41:38 +0000 (14:41 -0700)]
osdc/ObjectCacher: set complete flag when we observe ENOENT

If we observe an ENOENT on a read, set the complete flag.  Any dirty
buffers we have will still be in memory, even if the writes are in flight,
because the TX state remains pinned until the writes commit.  Writes cannot
proceed faster than reads, even though reads may proceed faster than
writes.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: clear complete on trim, release
Sage Weil [Wed, 24 Oct 2012 21:36:05 +0000 (14:36 -0700)]
osdc/ObjectCacher: clear complete on trim, release

Clear the complete flag when we are discarding buffers.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: add complete flag
Sage Weil [Wed, 24 Oct 2012 21:35:24 +0000 (14:35 -0700)]
osdc/ObjectCacher: add complete flag

This is set when we know we have *all* the data for this object.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: refresh iterator in read apply loop
Sage Weil [Wed, 24 Oct 2012 19:48:02 +0000 (12:48 -0700)]
osdc/ObjectCacher: refresh iterator in read apply loop

The p iterator points to the next bh, but try_merge_bh() at the end of the
loop might merge that into our result and invalidate the iterator.  Fix
this by repeating the lookup on each pass through the loop.

Signed-off-by: Sage Weil <sage@inktank.com>
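
The fix described in the commit above (look the next entry up again on every pass rather than holding a position across a merge) can be illustrated with a small extent map. The dict layout and the function name are assumptions; re-sorting on every pass is done for clarity, not efficiency.

```python
import bisect


def merge_adjacent(extents):
    """Merge touching extents in `extents` (offset -> length) in place.

    The next extent is looked up again from `pos` on every pass, so a
    merge that removes an entry cannot leave a stale position behind.
    """
    pos = 0
    while True:
        keys = sorted(extents)                 # fresh view each pass
        i = bisect.bisect_left(keys, pos)
        if i == len(keys):
            break
        start = keys[i]
        end = start + extents[start]
        if end in extents:
            # Absorb the neighbour, then redo the lookup from `pos`.
            extents[start] += extents.pop(end)
            continue
        pos = end
    return extents


merged = merge_adjacent({0: 4, 4: 4, 10: 2})
```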
12 years agoosdc/ObjectCacher: do read completions after assimilating read result
Sage Weil [Wed, 24 Oct 2012 19:44:25 +0000 (12:44 -0700)]
osdc/ObjectCacher: do read completions after assimilating read result

Wait until we have applied the entire read result to the cache before we
trigger any read completion events.  This is a cleaner and safer approach
since we can be sure that the callback won't get blocked again on data we
have but haven't applied yet.  It also fixes a crash I just observed where
the completion did a read, called trim(), and invalidated/destroyed the
iterator/bh p was referencing.

Signed-off-by: Sage Weil <sage@inktank.com>
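
A sketch of the ordering described in the commit above: apply the whole read result first, collect the waiters that became ready, and only then run them. The dict-based cache, the waiter map, and the function name are hypothetical stand-ins.

```python
def apply_read_result(cache, waiters, result):
    """Apply `result` (offset -> bytes) to `cache`, then run waiters.

    `waiters` maps an offset to a list of callbacks waiting on it.
    """
    ready = []
    for offset, data in result.items():
        cache[offset] = data
        ready.extend(waiters.pop(offset, []))
    # The loop over the result has finished, so a callback that reads
    # or trims the cache cannot disturb it.
    for callback in ready:
        callback(cache)


seen = []
cache = {}
waiters = {0: [lambda c: seen.append(sorted(c))]}
apply_read_result(cache, waiters, {0: b"a", 4096: b"b"})
```

The waiter on offset 0 already sees offset 4096 in the cache, because it runs only after the full result has been applied.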
12 years agoosdc/ObjectCacher: do not close objects explicitly
Sage Weil [Tue, 23 Oct 2012 16:20:53 +0000 (09:20 -0700)]
osdc/ObjectCacher: do not close objects explicitly

Let the trimmer do that.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: make trim() trim Objects
Sage Weil [Tue, 23 Oct 2012 16:20:35 +0000 (09:20 -0700)]
osdc/ObjectCacher: make trim() trim Objects

Pull unpinned objects off the LRU in trim().  This never happens currently
due to all the explicit calls to close_object()...

Signed-off-by: Sage Weil <sage@inktank.com>
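
Trimming unpinned objects off an LRU, as described in the commit above, might look roughly like this. The class, the pin counter, and the size limit are hypothetical and stand in for the real Object refcounting.

```python
from collections import OrderedDict


class ObjectLRUSketch:
    """LRU of objects where pinned entries are never trimmed."""

    def __init__(self, max_objects):
        self.max_objects = max_objects
        self.objects = OrderedDict()   # name -> pin count, coldest first

    def touch(self, name):
        pins = self.objects.pop(name, 0)
        self.objects[name] = pins      # reinsert at the hot end

    def pin(self, name):
        self.objects[name] += 1

    def unpin(self, name):
        self.objects[name] -= 1

    def trim(self):
        # Walk from the cold end, dropping unpinned objects until the
        # cache is back under its limit.
        for name in list(self.objects):
            if len(self.objects) <= self.max_objects:
                break
            if self.objects[name] == 0:
                del self.objects[name]


lru = ObjectLRUSketch(max_objects=2)
for name in ("a", "b", "c", "d"):
    lru.touch(name)
lru.pin("a")        # e.g. the object still has buffers
lru.trim()
remaining = list(lru.objects)
```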
12 years agoosdc/ObjectCacher: check lru_is_expireable() in can_close()
Sage Weil [Tue, 23 Oct 2012 16:18:04 +0000 (09:18 -0700)]
osdc/ObjectCacher: check lru_is_expireable() in can_close()

We assert that if can_close(), the Object isn't pinned in the LRU.  This
assumes we did our get/put refcounting properly, such that the pins are
at least as restrictive as can_close().

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: add LRU for Object
Sage Weil [Tue, 23 Oct 2012 12:58:27 +0000 (05:58 -0700)]
osdc/ObjectCacher: add LRU for Object

Incomplete; we aren't trimming yet.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: take Object ref for bh writes
Sage Weil [Tue, 23 Oct 2012 13:04:08 +0000 (06:04 -0700)]
osdc/ObjectCacher: take Object ref for bh writes

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: take refs for inflight lock ops
Sage Weil [Tue, 23 Oct 2012 13:03:09 +0000 (06:03 -0700)]
osdc/ObjectCacher: take refs for inflight lock ops

These are all dead/unused; should probably just rip out this code!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: take Object ref when there are buffers
Sage Weil [Tue, 23 Oct 2012 12:55:50 +0000 (05:55 -0700)]
osdc/ObjectCacher: take Object ref when there are buffers

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: add ref count to Object
Sage Weil [Tue, 23 Oct 2012 12:55:23 +0000 (05:55 -0700)]
osdc/ObjectCacher: add ref count to Object

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosdc/ObjectCacher: rename lru_* -> bh_lru_*
Sage Weil [Tue, 23 Oct 2012 12:42:37 +0000 (05:42 -0700)]
osdc/ObjectCacher: rename lru_* -> bh_lru_*

We'll be adding LRUs for objects, too.

Signed-off-by: Sage Weil <sage@inktank.com>