git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years agoFileStore: do not check dbobjectmap without option set
Samuel Just [Thu, 5 Apr 2012 21:58:55 +0000 (14:58 -0700)]
FileStore: do not check dbobjectmap without option set

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agofilestore: set guard on collection_move
Sage Weil [Fri, 30 Mar 2012 16:51:45 +0000 (09:51 -0700)]
filestore: set guard on collection_move

During recovery we submit transactions like:

 - delete a/foo
 - move tmp/foo to a/foo

This prevents the EEXIST check in collection_move from doing any good,
since the destination never exists.  We need to do that remove at least
sometimes, because we may be overwriting an existing/older version of the
object.

So,
 - set the guard after we do the move, so that
 - the delete won't be repeated, and
 - the EEXIST check will work

Also check the guard for good measure (although that doesn't do anything
specifically useful in this scenario).

Fixes: #2164
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
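
The ordering argument in this message can be sketched with a toy in-memory store. This is only an illustration under stated assumptions, not FileStore's code: `ToyStore`, `Pos`, and the return codes are hypothetical, and the guard is modeled as a per-object position instead of an on-disk attribute.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical stand-in for an <op_seq, trans_num, op_num> position.
struct Pos {
  unsigned long op_seq;
  unsigned trans;
  unsigned op;
  bool operator<=(const Pos& o) const {
    return std::tie(op_seq, trans, op) <= std::tie(o.op_seq, o.trans, o.op);
  }
};

struct Obj {
  std::string data;
  bool guarded = false;
  Pos guard{};  // position of the last guarded op applied to this object
};

// Toy object store keyed by "collection/name" (assumption for illustration).
struct ToyStore {
  std::map<std::string, Obj> objs;

  // True if the op at position p may still be applied to key.
  bool check_guard(const std::string& key, const Pos& p) const {
    auto it = objs.find(key);
    if (it == objs.end() || !it->second.guarded) return true;
    return !(p <= it->second.guard);
  }

  void remove(const std::string& key, const Pos& p) {
    if (!check_guard(key, p)) return;  // superseded by a later guarded op
    objs.erase(key);
  }

  // Returns 0 on success, -1 if src is missing, 1 if skipped by the guard.
  int move(const std::string& src, const std::string& dst, const Pos& p) {
    if (!check_guard(dst, p)) {
      objs.erase(src);  // already moved: just drop any leftover source
      return 1;
    }
    auto it = objs.find(src);
    if (it == objs.end()) return -1;
    Obj o = it->second;
    o.guarded = true;
    o.guard = p;  // set the guard after the move
    objs[dst] = o;
    objs.erase(src);
    return 0;
  }
};
```

Replaying "delete a/foo, move tmp/foo to a/foo" against this store after both ops were applied leaves a/foo intact: the delete is skipped because its position is not newer than the guard left on the destination by the move.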
13 years agov0.44.1 v0.44.1
Sage Weil [Tue, 27 Mar 2012 20:02:09 +0000 (13:02 -0700)]
v0.44.1

13 years agodon't override CFLAGS
Alexandre Oliva [Thu, 22 Mar 2012 19:23:02 +0000 (16:23 -0300)]
don't override CFLAGS

leveldb adds -I flags to CFLAGS and CXXFLAGS, but if these macros are
overridden in the make command line, the flags are dropped, and the
build fails.  leveldb should probably use AM_CFLAGS instead, but the
spec file can specify the preferred CFLAGS in the configure command
line, and then everything will work as expected.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMakefile: fix modules that cannot find pk11pub.h when compiling with NSS on RHEL6
Jim Schutt [Wed, 21 Mar 2012 16:09:09 +0000 (10:09 -0600)]
Makefile: fix modules that cannot find pk11pub.h when compiling with NSS on RHEL6

Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoObjecter: resend linger_ops on any change
Samuel Just [Wed, 21 Mar 2012 00:04:59 +0000 (17:04 -0700)]
Objecter: resend linger_ops on any change

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoObjectStore: Add collection_move to generate_instances
Samuel Just [Wed, 21 Mar 2012 17:58:20 +0000 (10:58 -0700)]
ObjectStore: Add collection_move to generate_instances

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoFileStore: remove src on EEXIST during collection_move replay
Samuel Just [Wed, 21 Mar 2012 17:36:21 +0000 (10:36 -0700)]
FileStore: remove src on EEXIST during collection_move replay

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoFileStore: whitelist COLLECTION_MOVE on replay
Samuel Just [Wed, 21 Mar 2012 03:09:28 +0000 (20:09 -0700)]
FileStore: whitelist COLLECTION_MOVE on replay

Signed-off-by: Samuel Just <rexludorum@gmail.com>
13 years agoObjectStore: add COLLECTION_MOVE to dump
Samuel Just [Wed, 21 Mar 2012 03:08:17 +0000 (20:08 -0700)]
ObjectStore: add COLLECTION_MOVE to dump

Signed-off-by: Samuel Just <rexludorum@gmail.com>
13 years agov0.44 v0.44
Sage Weil [Sun, 18 Mar 2012 19:03:45 +0000 (12:03 -0700)]
v0.44

13 years agorgw: process default alt args before processing conf file
Yehuda Sadeh [Tue, 20 Mar 2012 17:52:14 +0000 (10:52 -0700)]
rgw: process default alt args before processing conf file

this fixes #2189

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoosd: fix object_info.size mismatch file due to truncate_seq on new object
Sage Weil [Sun, 18 Mar 2012 16:08:15 +0000 (09:08 -0700)]
osd: fix object_info.size mismatch file due to truncate_seq on new object

If the first write that creates an object includes a truncate_seq and
truncate_size, we were taking the truncate path and doing a truncate op
in our transaction prior to the write, and then setting the object_info
size appropriately.  However, if the object doesn't exist, the truncate
op fails even though the oi.size gets set.

Later, this turns up as a scrub error (see #2080).

Fix this by skipping the truncate if it is a new object.  Instead, we
should just initialize our truncate_{seq,size} metadata so that we're all
up to date for any later writes.

Alternatively, we could touch the object and then truncate it (up) to the
large size, but this is sort of a waste; data beyond a short object eof is
defined to be zeros, so all we would accomplish is making recovery work
harder by copying zeros around.

Fixes: #2080
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
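
A simplified sketch of the fix described above: skip the truncate op for a new object, but still record the truncate metadata. The types and the `plan_write` helper are hypothetical, and the condition for truncating an existing object is an assumption made for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct ObjectInfo {
  bool exists;
  uint64_t size;
  uint64_t truncate_seq;
  uint64_t truncate_size;
};

struct WriteOp {
  uint64_t offset;
  uint64_t length;
  uint64_t truncate_seq;
  uint64_t truncate_size;
};

// Returns the names of the store ops that would be queued for this write
// and updates oi to match what the store will contain afterwards.
std::vector<std::string> plan_write(ObjectInfo& oi, const WriteOp& w) {
  std::vector<std::string> ops;
  if (w.truncate_seq > oi.truncate_seq) {
    // Only an existing, longer object needs a real truncate op.
    if (oi.exists && oi.size > w.truncate_size) {
      ops.push_back("truncate");
      oi.size = w.truncate_size;
    }
    // New or existing: keep the truncate metadata up to date.
    oi.truncate_seq = w.truncate_seq;
    oi.truncate_size = w.truncate_size;
  }
  ops.push_back("write");
  oi.exists = true;
  if (w.offset + w.length > oi.size) oi.size = w.offset + w.length;
  return ops;
}
```

With this shape, the recorded size of a newly created object always comes from the write itself, so it cannot disagree with the file on disk because of a truncate that never took effect.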
13 years agoosd: remove special handling for head recovery from clone
Sage Weil [Fri, 16 Mar 2012 21:36:38 +0000 (14:36 -0700)]
osd: remove special handling for head recovery from clone

This breaks because:

 - we don't have the head or current snapset
 - get_object_context() creates a new snapset, which is wrong

We probably can only do this if we are certain we can construct/modify
the old snapset and end up with the correct one.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: explicitly create new object,snap contexts on push
Sage Weil [Fri, 16 Mar 2012 20:07:25 +0000 (13:07 -0700)]
osd: explicitly create new object,snap contexts on push

We specifically want to use this during recovery to avoid loading the obc
or ssc for a previous version of the object and populating the watchers.
We know we won't have any existing obc here because it is missing (old or
does not exist).

For the snapset context, we provide it explicitly when we recover the head
or snapset object (which we always do first).  For clones, we re-use the
existing get_snapset_context(), which will either have the ssc open or
can load it from the head/snapset object.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: re-use create_object_context() in get_object_context()
Sage Weil [Fri, 16 Mar 2012 19:14:44 +0000 (12:14 -0700)]
osd: re-use create_object_context() in get_object_context()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: ReplicatedPG::create_object_context()
Sage Weil [Fri, 16 Mar 2012 20:05:54 +0000 (13:05 -0700)]
osd: ReplicatedPG::create_object_context()

New helper that creates a new object context.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: create_snapset_context()
Sage Weil [Fri, 16 Mar 2012 20:03:42 +0000 (13:03 -0700)]
osd: create_snapset_context()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: ensure we don't clobber other *contexts when registering new ones
Sage Weil [Fri, 16 Mar 2012 19:09:44 +0000 (12:09 -0700)]
osd: ensure we don't clobber other *contexts when registering new ones

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoReplicatedPG: populate_object_context during handle_pull_response
Samuel Just [Fri, 16 Mar 2012 17:01:03 +0000 (10:01 -0700)]
ReplicatedPG: populate_object_context during handle_pull_response

A cached objectcontext should always have its watchers populated.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoosd: maybe clear DEGRADED on recovery completion
Sage Weil [Thu, 15 Mar 2012 17:35:40 +0000 (10:35 -0700)]
osd: maybe clear DEGRADED on recovery completion

We set degraded if we don't have enough "active" replicas, which excludes
the backfill target.  We need to recheck that when we finish recovery and
the backfill target is now complete.

Fixes: #2160
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agointroduce CEPH_FEATURE_OMAP
Sage Weil [Wed, 14 Mar 2012 19:57:49 +0000 (12:57 -0700)]
introduce CEPH_FEATURE_OMAP

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: rev cluster internal protocol
Sage Weil [Wed, 14 Mar 2012 19:14:20 +0000 (12:14 -0700)]
osd: rev cluster internal protocol

This covers:

- the push/pull changes in 0.43 (which we forgot to protect against; see
  #2132)
- the new omap stuff for 0.44

Maybe we could make this finer grained so that ceph-osd would fail only
when mismatched versions are talking _and_ there is actual omap data in
play, but it's not worth the effort at this point.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoceph-fuse: make big_writes optional via 'fuse big writes'
Sage Weil [Wed, 14 Mar 2012 16:36:27 +0000 (09:36 -0700)]
ceph-fuse: make big_writes optional via 'fuse big writes'

Fixes: #2159
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agorgw: add more meaningful tests instances of encoded objects
Yehuda Sadeh [Tue, 13 Mar 2012 00:02:53 +0000 (17:02 -0700)]
rgw: add more meaningful tests instances of encoded objects

this completes #2140

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agobuild-doc: use alternate virtualenv dir, if specified
Sage Weil [Mon, 12 Mar 2012 23:46:31 +0000 (16:46 -0700)]
build-doc: use alternate virtualenv dir, if specified

The docs gitbuilder will use this to avoid rebuilding the virtualenv on
every build.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agotest_idempotent: fix global_init call
Sage Weil [Mon, 12 Mar 2012 22:12:55 +0000 (15:12 -0700)]
test_idempotent: fix global_init call

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoqa: kclient/file_layout.sh poking
Sage Weil [Mon, 12 Mar 2012 21:58:19 +0000 (14:58 -0700)]
qa: kclient/file_layout.sh poking

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agorgw: modify bucket instance for encoding test
Yehuda Sadeh [Mon, 12 Mar 2012 21:57:09 +0000 (14:57 -0700)]
rgw: modify bucket instance for encoding test

This makes 'make check' happy; otherwise we would need to create
a bucket name that starts with a period. This version is better.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agotest: add test_filestore_idempotent2
Samuel Just [Wed, 7 Mar 2012 19:29:52 +0000 (11:29 -0800)]
test: add test_filestore_idempotent2

Signed-off-by: Samuel Just <rexludorum@gmail.com>
13 years agoFileStore: ignore ERANGE and ENOENT on replay
Samuel Just [Mon, 12 Mar 2012 20:33:55 +0000 (13:33 -0700)]
FileStore: ignore ERANGE and ENOENT on replay

The source object may either not exist or be the wrong size
during replay if the destination object was deleted in a future
already-applied operation.  This should not impact correctness
of the replay.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
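
The rule stated above amounts to a small error filter. The helper below is a hypothetical sketch of that rule, not the FileStore function; it assumes errors are reported as negative errno values.

```cpp
#include <cerrno>

// Decide whether the result r of an operation is acceptable.
// During replay, a missing (ENOENT) or wrongly sized (ERANGE) source is
// tolerated, since a later already-applied op may have removed the target.
bool replay_result_ok(int r, bool replaying) {
  if (r >= 0) return true;
  if (!replaying) return false;
  return r == -ENOENT || r == -ERANGE;
}
```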
13 years agoFileStore: clarify debug/error output
Samuel Just [Mon, 12 Mar 2012 20:39:13 +0000 (13:39 -0700)]
FileStore: clarify debug/error output

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMakefile.am, rgw: remove fcgi dependency where not needed
Yehuda Sadeh [Mon, 12 Mar 2012 21:41:24 +0000 (14:41 -0700)]
Makefile.am, rgw: remove fcgi dependency where not needed

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: add more meaningful test instances of some encoded objects
Yehuda Sadeh [Mon, 12 Mar 2012 21:22:53 +0000 (14:22 -0700)]
rgw: add more meaningful test instances of some encoded objects

still need to add tests for other objects

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: tone down some log messages
Yehuda Sadeh [Mon, 12 Mar 2012 20:22:49 +0000 (13:22 -0700)]
rgw: tone down some log messages

dout(0) -> dout(1)

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: provide different default values for 'debug rgw'
Yehuda Sadeh [Mon, 12 Mar 2012 20:18:39 +0000 (13:18 -0700)]
rgw: provide different default values for 'debug rgw'

Currently rgw and radosgw-admin require different chattiness
defaults.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoconfig: alternative config options for global_init()
Yehuda Sadeh [Mon, 12 Mar 2012 20:15:50 +0000 (13:15 -0700)]
config: alternative config options for global_init()

We want to be able to provide alternative default config values to
the ones we set in common/config_opts.h. This can be useful when we
want different defaults for different modules (e.g., rgw, rgw-admin).
Just passing them on the command line won't do, because then we'd override
any config set by the user, so we need to process them before the regular
parsing (but after initializing the config context).

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
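
The precedence described above can be sketched as layered maps applied in order. This is an illustrative model only: `resolve_config` and the `Config` alias are hypothetical, and the real parsing in global_init() is more involved.

```cpp
#include <initializer_list>
#include <map>
#include <string>

using Config = std::map<std::string, std::string>;

// Later layers override earlier ones: built-in defaults, then the
// module's alternative defaults, then the conf file, then the command line.
Config resolve_config(const Config& builtin_defaults,
                      const Config& module_defaults,
                      const Config& conf_file,
                      const Config& cmdline) {
  Config out = builtin_defaults;
  for (const Config* layer : {&module_defaults, &conf_file, &cmdline}) {
    for (const auto& kv : *layer) out[kv.first] = kv.second;
  }
  return out;
}
```

Because the module defaults are applied before the conf file, a value set by the user still wins, which is the property the message asks for.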
13 years agoqa: use recent kernel for kernel_untar_build.sh
Sage Weil [Mon, 12 Mar 2012 19:01:21 +0000 (12:01 -0700)]
qa: use recent kernel for kernel_untar_build.sh

Happier on oneiric!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agorgw: switch ops log flag to use ceph config
Yehuda Sadeh [Mon, 12 Mar 2012 18:39:58 +0000 (11:39 -0700)]
rgw: switch ops log flag to use ceph config

It's turned on by default. So now we're using the
'rgw enable ops log' config param in ceph.conf, instead
of RGW_SHOULD_LOG_DEFAULT in the apache conf.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agofilestore: fix op_num offset/labels
Sage Weil [Mon, 12 Mar 2012 18:21:48 +0000 (11:21 -0700)]
filestore: fix op_num offset/labels

Start at 0, not 1.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoconfig: tmap to omap upgrade, true by default
Yehuda Sadeh [Mon, 12 Mar 2012 18:20:08 +0000 (11:20 -0700)]
config: tmap to omap upgrade, true by default

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoMakefile: link libfcgi to librgw
Sage Weil [Mon, 12 Mar 2012 04:11:37 +0000 (21:11 -0700)]
Makefile: link libfcgi to librgw

Need this to make a linker error go away on my squeeze dev box.  We
probably need to make sure librgw doesn't touch fcgi, once that is
revisited down the line.  Opened #2166.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoqa/workunits/kclient/file_layout: escape *
Sage Weil [Mon, 12 Mar 2012 03:36:47 +0000 (20:36 -0700)]
qa/workunits/kclient/file_layout: escape *

Escape * so that it is expanded as root.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilejournal: less log noise
Sage Weil [Sun, 11 Mar 2012 19:31:17 +0000 (12:31 -0700)]
filejournal: less log noise

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: remove unused bool idempotent
Sage Weil [Sat, 10 Mar 2012 04:54:59 +0000 (20:54 -0800)]
filestore: remove unused bool idempotent

This was from the old broken mechanism.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: fix arguments
Sage Weil [Sat, 10 Mar 2012 01:07:02 +0000 (17:07 -0800)]
filestore: fix arguments

From a change that was rebased out; missed this caller.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote branch 'gh/wip-2098'
Sage Weil [Sat, 10 Mar 2012 00:42:15 +0000 (16:42 -0800)]
Merge remote branch 'gh/wip-2098'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agofilestore: sync object_map on _set_replay_guard()
Sage Weil [Sat, 10 Mar 2012 00:34:55 +0000 (16:34 -0800)]
filestore: sync object_map on _set_replay_guard()

We need to sync the object_map too.  We can _almost_ check to see if there
are keys for the object and only do it then, except that they may have
existed previously and then been deleted.

So, always sync.  leveldb is reasonably nice about this... it should just
be another fsync.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoomap: add sync method to ObjectMap
Sage Weil [Thu, 8 Mar 2012 00:38:29 +0000 (16:38 -0800)]
omap: add sync method to ObjectMap

Signed-off-by: Samuel Just <rexludorum@gmail.com>
13 years agofilestore: remove old post-idempotent transaction trigger_commit
Sage Weil [Thu, 8 Mar 2012 04:58:27 +0000 (20:58 -0800)]
filestore: remove old post-idempotent transaction trigger_commit

The old strategy was to initiate a commit after any non-idempotent
transaction.  This only worked if the transaction was idempotent with
respect to itself, or could be replayed partially without problems,
and in reality that isn't the case.  For example:

 - clone A -> B
 - write to A
 - <sync>

If we crash before the sync, and replay the clone A->B, we corrupt B with
the new A data.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
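
The clone example can be reproduced with a toy store. The `Store` alias and helper names below are hypothetical; the point is only that applying the same two ops twice does not give the same result as applying them once.

```cpp
#include <map>
#include <string>

using Store = std::map<std::string, std::string>;

void clone_obj(Store& s, const std::string& src, const std::string& dst) {
  s[dst] = s[src];
}

void write_obj(Store& s, const std::string& name, const std::string& data) {
  s[name] = data;
}

// Applies "clone A -> B, write to A" and returns B's contents afterwards.
std::string apply_txn(Store& s) {
  clone_obj(s, "A", "B");
  write_obj(s, "A", "new");
  return s["B"];
}
```

Calling `apply_txn` once on a store where A holds "old" leaves B as "old"; calling it a second time, as a replay would, leaves B as "new".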
13 years agofilestore: guard collection_remove replay
Sage Weil [Thu, 8 Mar 2012 04:55:27 +0000 (20:55 -0800)]
filestore: guard collection_remove replay

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: guard replay of collection_add
Sage Weil [Thu, 8 Mar 2012 04:55:16 +0000 (20:55 -0800)]
filestore: guard replay of collection_add

- set guard on apply
- check guard on replay

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: guard replay of basic collection ops
Sage Weil [Thu, 8 Mar 2012 04:54:22 +0000 (20:54 -0800)]
filestore: guard replay of basic collection ops

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: guard collection_rename replay
Sage Weil [Thu, 8 Mar 2012 04:53:51 +0000 (20:53 -0800)]
filestore: guard collection_rename replay

- check guard on replay
- set guard on apply

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agofilestore: fix collection_rename error code
Sage Weil [Thu, 8 Mar 2012 04:53:27 +0000 (20:53 -0800)]
filestore: fix collection_rename error code

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: guard clone replay
Sage Weil [Thu, 8 Mar 2012 04:52:57 +0000 (20:52 -0800)]
filestore: guard clone replay

- set guard xattr on clone, clone_range
- check before applying/replaying

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: implement _set_replay_guard, _check_replay_guard
Sage Weil [Thu, 8 Mar 2012 00:37:32 +0000 (16:37 -0800)]
filestore: implement _set_replay_guard, _check_replay_guard

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: maintain SequencerPosition during _do_transaction
Sage Weil [Wed, 7 Mar 2012 05:51:35 +0000 (21:51 -0800)]
filestore: maintain SequencerPosition during _do_transaction

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: fgetxattr helpers/wrappers
Sage Weil [Wed, 7 Mar 2012 18:11:58 +0000 (10:11 -0800)]
filestore: fgetxattr helpers/wrappers

Also, do the getxattr using fgetxattr, to avoid duplicating code.  This is
probably slightly slower because we open a file handle, but if we care we
should really clean up the code to use lfn_open instead of lfn_find and
avoid the repeated path traversal too.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoadd SequencerPosition type
Sage Weil [Sun, 4 Mar 2012 21:43:18 +0000 (13:43 -0800)]
add SequencerPosition type

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: pass trans_num into _do_transaction
Sage Weil [Wed, 7 Mar 2012 05:16:06 +0000 (21:16 -0800)]
filestore: pass trans_num into _do_transaction

This gives us the <op_seq, trans_num, op_num> triple to identify every
constituent operation.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: use proper type for readdir_r tmp
Sage Weil [Sun, 4 Mar 2012 21:21:11 +0000 (13:21 -0800)]
filestore: use proper type for readdir_r tmp

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoleveldb: fix commit
Sage Weil [Fri, 9 Mar 2012 22:24:14 +0000 (14:24 -0800)]
leveldb: fix commit

This got reverted back to the old commit, somehow.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'master' of github.com:ceph/ceph
Sage Weil [Fri, 9 Mar 2012 22:13:03 +0000 (14:13 -0800)]
Merge branch 'master' of github.com:ceph/ceph

13 years agoRadosModel: fix omap_clear case in RemoveAttrsOp
Samuel Just [Fri, 9 Mar 2012 22:10:18 +0000 (14:10 -0800)]
RadosModel: fix omap_clear case in RemoveAttrsOp

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMerge branch 'wip-rgw-encode'
Sage Weil [Fri, 9 Mar 2012 22:03:15 +0000 (14:03 -0800)]
Merge branch 'wip-rgw-encode'

Conflicts:
src/rgw/rgw_cls_api.h

Reviewed-by: Sage Weil <sage@newdream.net>
13 years agorgw: fix rgw_cls_list_ret ctor
Sage Weil [Fri, 9 Mar 2012 21:55:49 +0000 (13:55 -0800)]
rgw: fix rgw_cls_list_ret ctor

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoceph-object-corpus: added 0.43
Yehuda Sadeh [Fri, 9 Mar 2012 21:32:36 +0000 (13:32 -0800)]
ceph-object-corpus: added 0.43

Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
13 years agotest/encoding/import.sh: fix target directory
Yehuda Sadeh [Fri, 9 Mar 2012 21:32:18 +0000 (13:32 -0800)]
test/encoding/import.sh: fix target directory

Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
13 years agoMakefile.am: update link dependencies for some unit tests
Yehuda Sadeh [Fri, 9 Mar 2012 21:29:59 +0000 (13:29 -0800)]
Makefile.am: update link dependencies for some unit tests

13 years agorgw: various encoding related fixes
Yehuda Sadeh [Fri, 9 Mar 2012 22:01:12 +0000 (14:01 -0800)]
rgw: various encoding related fixes

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoosd: fix watch_lock vs map_lock ordering
Sage Weil [Fri, 9 Mar 2012 21:34:55 +0000 (13:34 -0800)]
osd: fix watch_lock vs map_lock ordering

watch_lock is inside map_lock (and pg->lock), which means we need to
drop it to take pg->lock here.  That means verifying in
handle_watch_timeout that we haven't raced with another thread canceling
the timeout event, which would be indicated by

 - the entity not appearing in unconnected_watchers
 - the entity having a different (presumably newer) expire time

Fixes: #2103
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
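
The two race indicators listed above translate into a check made after the locks have been retaken. This is a sketch with hypothetical types (`WatchState`, an integer `Time`), not the OSD's data structures.

```cpp
#include <map>
#include <string>

using Time = long;  // toy timestamp type, assumed for illustration

struct WatchState {
  // entity name -> expire time of its pending timeout
  std::map<std::string, Time> unconnected_watchers;
};

// True only if the timeout that fired still applies to this entity.
bool timeout_still_valid(const WatchState& w, const std::string& entity,
                         Time expire) {
  auto it = w.unconnected_watchers.find(entity);
  if (it == w.unconnected_watchers.end()) return false;  // was canceled
  return it->second == expire;  // a different expire time means we raced
}
```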
13 years agoosd: update_heartbeat_peers as needed
Sage Weil [Fri, 9 Mar 2012 20:26:22 +0000 (12:26 -0800)]
osd: update_heartbeat_peers as needed

Before, we were being very careful about updating the heartbeat peers if
new PGs were created or when certain types of messages were received.
However, the PG can change its peers in lots of cases (e.g., when
recovery completes), but the OSD doesn't re-aggregate.

Instead, set a flag when each PG updates its set, and check that flag in
the OSD code periodically or in likely places.  A call in tick() acts as
a catch-all.

The num_created counts can probably be cleaned out now...

Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
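
The flag-and-recheck pattern described above might look like the following. `ToyOSD` and `ToyPG` are hypothetical stand-ins, and locking is reduced to a single atomic flag for brevity.

```cpp
#include <atomic>
#include <set>
#include <vector>

struct ToyPG {
  std::set<int> peers;  // OSD ids this PG currently talks to
};

class ToyOSD {
  std::atomic<bool> need_update{false};
  std::set<int> hb_peers;

public:
  // Called by a PG whenever its peer set changes.
  void pg_changed_peers() { need_update = true; }

  // Called from likely places and from tick() as a catch-all.
  // Returns true if the aggregate peer set was rebuilt.
  bool maybe_update_heartbeat_peers(const std::vector<ToyPG>& pgs) {
    if (!need_update.exchange(false)) return false;
    hb_peers.clear();
    for (const auto& pg : pgs)
      hb_peers.insert(pg.peers.begin(), pg.peers.end());
    return true;
  }

  const std::set<int>& heartbeat_peers() const { return hb_peers; }
};
```

The re-aggregation cost is paid only when some PG has actually reported a change, however often the check is called.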
13 years agorgw: implement dump() for encoders
Yehuda Sadeh [Fri, 9 Mar 2012 08:06:34 +0000 (00:06 -0800)]
rgw: implement dump() for encoders

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: add stubs for dencoder test
Yehuda Sadeh [Fri, 9 Mar 2012 00:58:00 +0000 (16:58 -0800)]
rgw: add stubs for dencoder test

still need to add some content to the dump methods

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoMerge remote branch 'origin/wip-2139'
Yehuda Sadeh [Fri, 9 Mar 2012 00:15:18 +0000 (16:15 -0800)]
Merge remote branch 'origin/wip-2139'

Conflicts:
src/cls_rgw.cc
src/rgw/rgw_rados.cc
src/rgw/rgw_rados.h

Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
13 years agoMerge branch 'master' of ssh://github.com/ceph/ceph
Yehuda Sadeh [Thu, 8 Mar 2012 23:54:14 +0000 (15:54 -0800)]
Merge branch 'master' of ssh://github.com/ceph/ceph

13 years agoceph: document the way files are laid out
Alex Elder [Thu, 8 Mar 2012 23:16:45 +0000 (15:16 -0800)]
ceph: document the way files are laid out

This adds a document that I wrote about how Ceph client file data
is striped across Ceph objects to the repository.  It's a text
document.  Someone with better document preparation skills than I
should use the content below as a basis for something prettier if
that's appropriate.

[Made a few edits... -sage]

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agolibrados: fix unit test for omap_get_vals_by_key rename
Sage Weil [Thu, 8 Mar 2012 23:09:30 +0000 (15:09 -0800)]
librados: fix unit test for omap_get_vals_by_key rename

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: format time nicely in ops_in_flight output
Sage Weil [Thu, 8 Mar 2012 23:06:39 +0000 (15:06 -0800)]
osd: format time nicely in ops_in_flight output

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agolibrados: fix map -> std::map in header, string -> std::string
Sage Weil [Thu, 8 Mar 2012 23:06:19 +0000 (15:06 -0800)]
librados: fix map -> std::map in header, string -> std::string

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agodoc: example of diagnosing radosgw hang
Sage Weil [Thu, 8 Mar 2012 23:02:02 +0000 (15:02 -0800)]
doc: example of diagnosing radosgw hang

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agodoc: explain how unfound objects happen
Sage Weil [Thu, 8 Mar 2012 22:55:21 +0000 (14:55 -0800)]
doc: explain how unfound objects happen

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agodoc: make osd failure example include >3 osds
Sage Weil [Thu, 8 Mar 2012 22:55:08 +0000 (14:55 -0800)]
doc: make osd failure example include >3 osds

More realistic.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agotestrados: fix omap_get_vals_by_keys call
Sage Weil [Thu, 8 Mar 2012 22:46:56 +0000 (14:46 -0800)]
testrados: fix omap_get_vals_by_keys call

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: add zero_to field to PG::OndiskLog; track zeroed region of pg log
Sage Weil [Thu, 8 Mar 2012 22:29:42 +0000 (14:29 -0800)]
osd: add zero_to field to PG::OndiskLog; track zeroed region of pg log

Track which region of the log has been zeroed on disk.  This may be
different from tail if 'osd preserve trimmed log = false' in the config.

Only zero the portion of the log we need to.  This avoids rezeroing regions
or missing bits when 'osd preserve trimmed log' was off and is then turned
on.

Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
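
One way to picture the zero_to bookkeeping is below. This is a sketch under assumptions: `ToyOndiskLog` and `trim_to` are hypothetical, and whether zeroing happens on a given trim is passed in as a plain flag instead of being read from the config.

```cpp
#include <cstdint>
#include <utility>

struct ToyOndiskLog {
  uint64_t tail = 0;     // logical start of the live log
  uint64_t zero_to = 0;  // everything before this offset is zeroed on disk
};

// Trims the log to new_tail. Returns (offset, length) of the region that
// must be zeroed now; a length of 0 means there is nothing to zero.
std::pair<uint64_t, uint64_t> trim_to(ToyOndiskLog& log, uint64_t new_tail,
                                      bool zero_trimmed) {
  log.tail = new_tail;
  if (!zero_trimmed || new_tail <= log.zero_to)
    return std::make_pair(log.zero_to, uint64_t(0));
  std::pair<uint64_t, uint64_t> region(log.zero_to, new_tail - log.zero_to);
  log.zero_to = new_tail;  // remember how far zeroing has progressed
  return region;
}
```

Because the zeroed range always starts at zero_to, a trim that follows a period without zeroing catches up on the skipped bytes, and a trim that follows a zeroing trim does not touch them again.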
13 years agofilestore: use FALLOC_FL_PUNCH_HOLE to zero, when available
Sage Weil [Thu, 8 Mar 2012 22:30:06 +0000 (14:30 -0800)]
filestore: use FALLOC_FL_PUNCH_HOLE to zero, when available

First try the FALLOC_FL_PUNCH_HOLE fallocate() flag.  If we get EOPNOTSUPP,
fall back to writing zeros.

Check for fallocate(2) with configure.  Also, avoid this if we are not on
Linux, since I'm not sure about the hard-coded FALLOC_FL_PUNCH_HOLE being
correct on other platforms.

Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
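
A minimal sketch of the try-then-fall-back approach, assuming Linux with glibc. `zero_range` is a hypothetical helper, not the FileStore function; treating ENOSYS like EOPNOTSUPP and the 64 KiB buffer size are assumptions added here.

```cpp
#include <fcntl.h>   // fallocate(), FALLOC_FL_* on Linux/glibc
#include <unistd.h>  // pwrite()
#include <algorithm>
#include <cerrno>
#include <vector>

// Zero len bytes of fd starting at off. Returns 0 or a negative errno.
int zero_range(int fd, off_t off, off_t len) {
#if defined(__linux__) && defined(FALLOC_FL_PUNCH_HOLE)
  // Punch a hole without changing the file size; reads return zeros.
  if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, off, len) == 0)
    return 0;
  // ENOSYS is tolerated as an assumption, for kernels without the call.
  if (errno != EOPNOTSUPP && errno != ENOSYS) return -errno;
#endif
  // Fallback: overwrite the range with explicit zeros.
  std::vector<char> zeros(std::min<off_t>(len, 65536), 0);
  while (len > 0) {
    ssize_t n = pwrite(fd, zeros.data(),
                       std::min<off_t>(len, zeros.size()), off);
    if (n < 0) return -errno;
    off += n;
    len -= n;
  }
  return 0;
}
```

The two paths differ past end of file: the hole punch keeps the file size unchanged, while the fallback write extends the file. A caller that zeroes beyond EOF would need to account for that.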
13 years agoosd: fix op_wq vs pg->lock ordering
Sage Weil [Thu, 8 Mar 2012 22:16:59 +0000 (14:16 -0800)]
osd: fix op_wq vs pg->lock ordering

map_lock
 -> pg->lock
   -> op_wq

Fixes: #2153
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMerge branch 'master' of ssh://skinny/home/yehudasa/ceph
Yehuda Sadeh [Thu, 8 Mar 2012 06:58:42 +0000 (22:58 -0800)]
Merge branch 'master' of ssh://skinny/home/yehudasa/ceph

13 years agoMerge branch 'wip-rgw-new-atomic'
Yehuda Sadeh [Thu, 8 Mar 2012 06:53:32 +0000 (22:53 -0800)]
Merge branch 'wip-rgw-new-atomic'

13 years agorgw: append the currect bucket marker when removing bucket
Yehuda Sadeh [Thu, 8 Mar 2012 06:52:24 +0000 (22:52 -0800)]
rgw: append the currect bucket marker when removing bucket

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoMerge branch 'master' of ssh://skinny/home/yehudasa/ceph
Yehuda Sadeh [Thu, 8 Mar 2012 06:39:46 +0000 (22:39 -0800)]
Merge branch 'master' of ssh://skinny/home/yehudasa/ceph

13 years agoMerge branch 'wip-rgw-omap'
Yehuda Sadeh [Thu, 8 Mar 2012 06:35:40 +0000 (22:35 -0800)]
Merge branch 'wip-rgw-omap'

13 years agocls_rgw: fix rgw_bucket_init_index
Yehuda Sadeh [Thu, 8 Mar 2012 06:25:47 +0000 (22:25 -0800)]
cls_rgw: fix rgw_bucket_init_index

was failing to error in case header already existed

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: remove extra unused params from omap_get()
Yehuda Sadeh [Thu, 8 Mar 2012 06:19:25 +0000 (22:19 -0800)]
rgw: remove extra unused params from omap_get()

and also rename it to omap_get_all()

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: add cls_cxx_map_clear
Yehuda Sadeh [Thu, 8 Mar 2012 06:18:57 +0000 (22:18 -0800)]
rgw: add cls_cxx_map_clear

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoleveldb: drop compaction unit test
Samuel Just [Thu, 8 Mar 2012 05:59:30 +0000 (21:59 -0800)]
leveldb: drop compaction unit test

Signed-off-by: Samuel Just <rexludorum@gmail.com>
13 years agoReplicatedPG,librados: add filter_prefix to omap_get_vals
Samuel Just [Wed, 7 Mar 2012 21:08:36 +0000 (13:08 -0800)]
ReplicatedPG,librados: add filter_prefix to omap_get_vals

Signed-off-by: Samuel Just <rexludorum@gmail.com>
13 years agorgw: use prefix filter for bucket listing
Yehuda Sadeh [Thu, 8 Mar 2012 01:10:18 +0000 (17:10 -0800)]
rgw: use prefix filter for bucket listing

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoobjclass, cls_rgw: add prefix to omap_get_vals()
Yehuda Sadeh [Thu, 8 Mar 2012 01:03:45 +0000 (17:03 -0800)]
objclass, cls_rgw: add prefix to omap_get_vals()