git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
12 years agorgw: store cluster params in a special object
Yehuda Sadeh [Wed, 29 Aug 2012 22:34:17 +0000 (15:34 -0700)]
rgw: store cluster params in a special object

We now have a cluster root pool that should hold the
cluster params. The cluster params are now read from
this object on startup, if object does not exist we
set its defaults and write it.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw: cleanup create_bucket
Yehuda Sadeh [Wed, 29 Aug 2012 21:43:52 +0000 (14:43 -0700)]
rgw: cleanup create_bucket

Pool creation is now being done through an explicit
method, get rid of the unused user param when creating
a pool, and remove auid.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw: remove static store object
Yehuda Sadeh [Fri, 10 Aug 2012 23:27:21 +0000 (16:27 -0700)]
rgw: remove static store object

We used to instantiate a single RGWRados object.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw: pool names are not global
Yehuda Sadeh [Fri, 10 Aug 2012 22:59:44 +0000 (15:59 -0700)]
rgw: pool names are not global

Move all hard coded pool names outside of the global
namespace.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw: RGWRados holds domain root info
Yehuda Sadeh [Fri, 10 Aug 2012 21:16:42 +0000 (14:16 -0700)]
rgw: RGWRados holds domain root info

Continuing cleanup work.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw: rgwstore is not global
Yehuda Sadeh [Thu, 9 Aug 2012 23:58:40 +0000 (16:58 -0700)]
rgw: rgwstore is not global

Instead of using a global rgwstore param, just pass it around.
We now do it almost all around, except for in rgw_admin, where
we can still have the global one.
This is part of a cleanup that will allow setting flexible
pool names.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw: remove unused block of code
Yehuda Sadeh [Wed, 29 Aug 2012 21:20:38 +0000 (14:20 -0700)]
rgw: remove unused block of code

We were reading bucket info, but that wasn't necessary.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw_admin.cc: Prevent clobbering the index when linking a bucket.
caleb miles [Tue, 28 Aug 2012 17:41:42 +0000 (10:41 -0700)]
rgw_admin.cc: Prevent clobbering the index when linking a bucket.

Prevent the 'bucket link' command from overwriting the index of an
existing bucket. Corrects bug 2935:

http://tracker.newdream.net/issues/2935

Signed-off-by: caleb miles <caleb.miles@inktank.com>
12 years agorgw: clear usage map before reading usage
Yehuda Sadeh [Tue, 28 Aug 2012 23:17:21 +0000 (16:17 -0700)]
rgw: clear usage map before reading usage

Fixes: #3057
Since we read usage in chunks we need to clear the
usage map before reading the next chunk, otherwise
we're going to aggregate the old data as well.

Backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
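
The chunked-read bug described above can be sketched as follows. This is a minimal Python illustration, not the rgw code: the function name `read_usage_chunks` and the `(user, nbytes)` record shape are assumptions made for the example.

```python
def read_usage_chunks(chunks):
    """Aggregate usage per user from a sequence of chunks.

    `chunks` is an iterable of lists of (user, nbytes) pairs (hypothetical shape).
    """
    totals = {}
    usage = {}  # scratch map reused for every chunk
    for chunk in chunks:
        # Without this clear(), entries left over from earlier chunks
        # would be added to the totals a second time.
        usage.clear()
        for user, nbytes in chunk:
            usage[user] = usage.get(user, 0) + nbytes
        for user, nbytes in usage.items():
            totals[user] = totals.get(user, 0) + nbytes
    return totals
```

Dropping the `usage.clear()` call reproduces the double counting the commit message describes.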
12 years agoMerge branch 'next'
Sage Weil [Tue, 28 Aug 2012 22:15:08 +0000 (15:15 -0700)]
Merge branch 'next'

12 years agoosd: fix waiting_for_disk assertion
Sage Weil [Tue, 28 Aug 2012 22:14:41 +0000 (15:14 -0700)]
osd: fix waiting_for_disk assertion

If requeue is false, we won't have cleared out waiting_for_ondisk; adjust
assert placement as appropriate.  Also, make sure we handle the requeue
and !op case properly (although I'm not sure offhand if/when it would
come up).

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agorados_bench: wait for completion callbacks before returning
Mike Ryan [Tue, 28 Aug 2012 18:57:03 +0000 (11:57 -0700)]
rados_bench: wait for completion callbacks before returning

If we don't wait for the callback, the finisher may cleanup the callback
context before the callback is actually invoked, causing a
use-after-free error.

This fixes #3048.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
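
The ordering rule in this fix (do not return until every completion callback has run) can be sketched in Python. Python has no use-after-free, so this only illustrates the waiting pattern; `Completion` and `run_bench` are hypothetical names, and the threads stand in for the finisher.

```python
import threading


class Completion:
    """Hypothetical completion object whose callback runs on another thread."""

    def __init__(self):
        self._done = threading.Event()
        self.result = None

    def callback(self, result):
        self.result = result
        self._done.set()  # signal that the callback has actually run

    def wait(self, timeout=None):
        return self._done.wait(timeout)


def run_bench(n_ops):
    completions = [Completion() for _ in range(n_ops)]
    workers = [
        threading.Thread(target=c.callback, args=(i,))
        for i, c in enumerate(completions)
    ]
    for w in workers:
        w.start()
    # Wait for every callback before returning, so no callback can fire
    # after the caller has torn down its context.
    for c in completions:
        c.wait()
    for w in workers:
        w.join()
    return [c.result for c in completions]
```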
12 years agoMakefile.am: add missing .h
Yehuda Sadeh [Tue, 28 Aug 2012 21:13:53 +0000 (14:13 -0700)]
Makefile.am: add missing .h

Was missing rgw_html_errors.h

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoMerge remote-tracking branch 'origin/wip-multi-delete'
Yehuda Sadeh [Tue, 28 Aug 2012 20:36:35 +0000 (13:36 -0700)]
Merge remote-tracking branch 'origin/wip-multi-delete'

Conflicts:
src/Makefile.am

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorados_bench: wait for completion callbacks before returning
Mike Ryan [Tue, 28 Aug 2012 18:57:03 +0000 (11:57 -0700)]
rados_bench: wait for completion callbacks before returning

If we don't wait for the callback, the finisher may cleanup the callback
context before the callback is actually invoked, causing a
use-after-free error.

This fixes #3048.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agodoc: Completed and reviewed RGW config reference.
John Wilkins [Tue, 28 Aug 2012 20:25:44 +0000 (13:25 -0700)]
doc: Completed and reviewed RGW config reference.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: added admonishment. Updated header syntax, copy semantics and x-ref.
John Wilkins [Tue, 28 Aug 2012 20:24:29 +0000 (13:24 -0700)]
doc: added admonishment. Updated header syntax, copy semantics and x-ref.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agorgw: fix usage log read
Yehuda Sadeh [Tue, 28 Aug 2012 19:51:55 +0000 (12:51 -0700)]
rgw: fix usage log read

The usage log read got broken in a recent cleanup work.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agodoc: Added internal references. Clarified language in disk prepare.
John Wilkins [Tue, 28 Aug 2012 18:41:59 +0000 (11:41 -0700)]
doc: Added internal references. Clarified language in disk prepare.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: added sudo for hdparm command.
John Wilkins [Tue, 28 Aug 2012 18:02:13 +0000 (11:02 -0700)]
doc: added sudo for hdparm command.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: added internal hyperlink targets.
John Wilkins [Tue, 28 Aug 2012 17:55:04 +0000 (10:55 -0700)]
doc: added internal hyperlink targets.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Clean up quick start to ensure nobody uses "localhost".
John Wilkins [Tue, 28 Aug 2012 17:01:20 +0000 (10:01 -0700)]
doc: Clean up quick start to ensure nobody uses "localhost".

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Cleaned up syntax errors, and converted table to list.
John Wilkins [Tue, 28 Aug 2012 16:24:22 +0000 (09:24 -0700)]
doc: Cleaned up syntax errors, and converted table to list.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge branch 'next'
Sage Weil [Tue, 28 Aug 2012 00:44:53 +0000 (17:44 -0700)]
Merge branch 'next'

12 years agoMerge branch 'wip-objecter' into next
Sage Weil [Tue, 28 Aug 2012 00:26:13 +0000 (17:26 -0700)]
Merge branch 'wip-objecter' into next

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoobjecter: fix skipped map handling
Sage Weil [Mon, 27 Aug 2012 15:24:08 +0000 (08:24 -0700)]
objecter: fix skipped map handling

If we skip a map, we want to translate NO_ACTION to NEED_RESEND, but leave
POOL_DNE alone.

Signed-off-by: Sage Weil <sage@inktank.com>
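
The rule stated in this message is small enough to sketch directly. The result names come from the commit message; the function `adjust_for_skipped_map` and the string constants are assumptions for illustration only.

```python
NO_ACTION = "NO_ACTION"
NEED_RESEND = "NEED_RESEND"
POOL_DNE = "POOL_DNE"


def adjust_for_skipped_map(result, skipped_map):
    """Translate NO_ACTION to NEED_RESEND when a map was skipped.

    POOL_DNE (and any other result) is passed through unchanged.
    """
    if skipped_map and result == NO_ACTION:
        return NEED_RESEND
    return result
```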
12 years agoobjecter: send queued requests when we get first osdmap
Sage Weil [Mon, 27 Aug 2012 14:38:34 +0000 (07:38 -0700)]
objecter: send queued requests when we get first osdmap

If we get our first osdmap and already have requests queued, send them.

Fixes: #3050
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoobjecter: fix is_latest_map() retry on mon session restart
Sage Weil [Mon, 27 Aug 2012 04:21:44 +0000 (21:21 -0700)]
objecter: fix is_latest_map() retry on mon session restart

If the mon session drops, we get an EAGAIN callback, which we already
correctly ignored.  (Clean this up and comment so it's clearer what is
going on.)

Fix ms_handle_connect() to resubmit those requests.

Noticed while fixing #3049.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomonclient: pass EAGAIN to is_latest_map() callers
Sage Weil [Mon, 27 Aug 2012 04:17:05 +0000 (21:17 -0700)]
monclient: pass EAGAIN to is_latest_map() callers

If our map get_version check needs to be retried, tell the
is_latest_map() callers instead of returning 0 ("no").

Fixes: #3049
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomonclient: document get_version(), and fix return value
Sage Weil [Tue, 28 Aug 2012 00:25:54 +0000 (17:25 -0700)]
monclient: document get_version(), and fix return value

Return -EAGAIN instead of -1, since that's more meaningful, and
document it.

Signed-off-by: Sage Weil <sage@inktank.com>
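
The two monclient changes above amount to propagating a "try again" result instead of a misleading "no". A rough synchronous sketch follows, assuming a hypothetical `get_version` callable that returns a `(return_code, newest_epoch)` pair; the real interface is not shown in this log and may differ.

```python
import errno


def is_latest_map(get_version, my_epoch):
    """Return 1 if my_epoch is current, 0 if not, -EAGAIN if the check must be retried."""
    r, newest = get_version()
    if r == -errno.EAGAIN:
        # Tell the caller to retry rather than answering 0 ("no").
        return -errno.EAGAIN
    return 1 if my_epoch >= newest else 0
```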
12 years agoImplement multi-object delete.
caleb miles [Tue, 28 Aug 2012 00:08:44 +0000 (17:08 -0700)]
Implement multi-object delete.

An implementation of multi-object delete described in
the latest Amazon S3 API provided at

http://docs.amazonwebservices.com/AmazonS3/latest/API

This commit is in response to tracker issue 2797

http://tracker.newdream.net/issues/2797

Signed-off-by: caleb miles <caleb.miles@inktank.com>
12 years agoosd: requeue dup ops inline with in-progress ops
Sage Weil [Mon, 27 Aug 2012 21:31:32 +0000 (14:31 -0700)]
osd: requeue dup ops inline with in-progress ops

We should requeue the dups along with the originals.  This avoids
situations where, after requeue, the dups are reordered with respect to
each other.  For example:

 - client sends A, B, C
 - osd receives A
 - connection drops
 - client sends A', B', C'
 - osd puts A' in waiting_for_ondisk, starts B' and C'
 - on_change() requeues everything

Final queue order (before this patch) is
    A, B', C', A'

After this patch, the resulting queue order is
    A, A', B', C'

Or somewhat more generally, it might be:

    A, A', B, B', B'', C', C'', D'', ....

Fixes (another source of): #2947
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
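
The queue orders in the example above can be reproduced with a small model. This is a sketch under simplifying assumptions: ops are plain strings, `dups` maps an in-progress op to its waiting duplicates, and both function names are hypothetical.

```python
def requeue_before_patch(in_progress, dups):
    """Old behavior: in-progress ops first, duplicates appended afterwards."""
    queue = list(in_progress)
    for op in in_progress:
        queue.extend(dups.get(op, []))
    return queue


def requeue_after_patch(in_progress, dups):
    """New behavior: each duplicate is requeued right after its original."""
    queue = []
    for op in in_progress:
        queue.append(op)
        queue.extend(dups.get(op, []))
    return queue
```

With `in_progress = ["A", "B'", "C'"]` and `dups = {"A": ["A'"]}`, the first function yields the pre-patch order and the second yields the post-patch order given in the message.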
12 years agoMerge remote-tracking branch 'gh/wip-mon-intparsing'
Sage Weil [Mon, 27 Aug 2012 22:10:35 +0000 (15:10 -0700)]
Merge remote-tracking branch 'gh/wip-mon-intparsing'

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agoosd: include notif pointer in notify debug output
Sage Weil [Fri, 24 Aug 2012 00:08:20 +0000 (17:08 -0700)]
osd: include notif pointer in notify debug output

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoconfig: add 'fatal signal handlers' option
Sage Weil [Thu, 23 Aug 2012 16:37:37 +0000 (09:37 -0700)]
config: add 'fatal signal handlers' option

This will let us disable the sighandlers for SEGV, etc.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocls_rgw_client: fix #include path
Sage Weil [Mon, 27 Aug 2012 21:41:01 +0000 (14:41 -0700)]
cls_rgw_client: fix #include path

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'origin/master' into wip-gc2
Yehuda Sadeh [Mon, 27 Aug 2012 19:43:27 +0000 (12:43 -0700)]
Merge remote-tracking branch 'origin/master' into wip-gc2

12 years agocls_rgw: add cls_rgw unittest, test gc api
Yehuda Sadeh [Mon, 27 Aug 2012 18:19:27 +0000 (11:19 -0700)]
cls_rgw: add cls_rgw unittest, test gc api

Test various cls_rgw gc related functionality.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw-admin: get rid of lazy remove option, other fixes
Yehuda Sadeh [Tue, 21 Aug 2012 22:44:30 +0000 (15:44 -0700)]
rgw-admin: get rid of lazy remove option, other fixes

was mishandling parsing of binary flag arguments.
also, fix argument parsing and update radosgw-admin
cli test reference.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw: implement garbage collector
Yehuda Sadeh [Tue, 21 Aug 2012 22:05:38 +0000 (15:05 -0700)]
rgw: implement garbage collector

Add a garbage collector thread that is responsible for clean
up of clutter. When removing an object, store info about the
leftovers in a special gc map (via rgw objclass). A new
radosgw-admin commands to list objects in gc, and to run the
gc process manually. Also, gc processors can run in parallel,
however, each will handle a single gc shard (synchronized
using lock objclass).

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agomon: make parse_pos_long() error message more helpful
Sage Weil [Mon, 27 Aug 2012 15:36:41 +0000 (08:36 -0700)]
mon: make parse_pos_long() error message more helpful

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: turn off lockdep during shutdown signal handler
Sage Weil [Sun, 26 Aug 2012 15:42:06 +0000 (08:42 -0700)]
osd: turn off lockdep during shutdown signal handler

We don't shut down all threads, and the surviving ones fight with
exit()'s teardown.  Kludge until we have a clean shutdown process.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge tag 'v0.51'
Sage Weil [Sun, 26 Aug 2012 15:18:45 +0000 (08:18 -0700)]
Merge tag 'v0.51'

v0.51

12 years agov0.51 v0.51
Sage Weil [Sat, 25 Aug 2012 22:58:39 +0000 (15:58 -0700)]
v0.51

12 years agomon: require --id
Sage Weil [Mon, 20 Aug 2012 20:12:26 +0000 (13:12 -0700)]
mon: require --id

Fixes: #2997
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: fix int parsing in monmon
Sage Weil [Fri, 24 Aug 2012 23:05:07 +0000 (16:05 -0700)]
mon: fix int parsing in monmon

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: check for int parsing errors in mdsmon
Sage Weil [Fri, 24 Aug 2012 23:03:02 +0000 (16:03 -0700)]
mon: check for int parsing errors in mdsmon

Fixes: #3014
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: check for int parsing errors in osdmon
Sage Weil [Fri, 24 Aug 2012 23:02:02 +0000 (16:02 -0700)]
mon: check for int parsing errors in osdmon

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agointerval_set: predeclare const_iterator
Sage Weil [Fri, 24 Aug 2012 21:55:12 +0000 (14:55 -0700)]
interval_set: predeclare const_iterator

This makes the coverity build happier.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
12 years agoMakefile: update coverity rules
Sage Weil [Fri, 24 Aug 2012 21:54:51 +0000 (14:54 -0700)]
Makefile: update coverity rules

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
12 years agolibrbd-dev.install: package new rbd/features.h header file.
Gary Lowell [Fri, 24 Aug 2012 22:16:05 +0000 (15:16 -0700)]
librbd-dev.install: package new rbd/features.h header file.

12 years agomon: describe how pgs are stuck in 'health detail'
Sage Weil [Fri, 24 Aug 2012 21:43:56 +0000 (14:43 -0700)]
mon: describe how pgs are stuck in 'health detail'

Showing the current state and saying it is stuck doesn't tell you how it
is stuck (e.g. stuck unclean, stuck inactive, etc.).  Also include the
stuck duration.

Fixes: #2876
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'next'
Sage Weil [Fri, 24 Aug 2012 21:38:58 +0000 (14:38 -0700)]
Merge branch 'next'

12 years agoosd: fix use-after-free in handle_notify_timeout
Sage Weil [Fri, 24 Aug 2012 18:16:01 +0000 (11:16 -0700)]
osd: fix use-after-free in handle_notify_timeout

Valgrind turned this up.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph.spec.in: package new rados library.
Gary Lowell [Fri, 24 Aug 2012 04:35:21 +0000 (21:35 -0700)]
ceph.spec.in: package new rados library.

12 years agoMerge remote-tracking branch 'gh/wip-mon-report'
Sage Weil [Thu, 23 Aug 2012 23:11:58 +0000 (16:11 -0700)]
Merge remote-tracking branch 'gh/wip-mon-report'

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip_rados_bench_really_final'
Sage Weil [Thu, 23 Aug 2012 23:07:32 +0000 (16:07 -0700)]
Merge remote-tracking branch 'gh/wip_rados_bench_really_final'

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoobj_bencher: use async remove during slow remove-by-prefix
Mike Ryan [Thu, 21 Jun 2012 18:03:15 +0000 (11:03 -0700)]
obj_bencher: use async remove during slow remove-by-prefix

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoobj_bencher: remove all benchmark files matching a prefix
Mike Ryan [Tue, 24 Jul 2012 03:45:31 +0000 (20:45 -0700)]
obj_bencher: remove all benchmark files matching a prefix

This is a fallback for when a user wishes to delete ALL benchmark files
matching a particular prefix. In the fast case, a metadata file tells us
enough to quickly delete the files in parallel. This is the slow case,
where each file's name must be checked against the prefix.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoobj_bencher: cleanup files in parallel using aio
Mike Ryan [Thu, 23 Aug 2012 18:52:51 +0000 (11:52 -0700)]
obj_bencher: cleanup files in parallel using aio

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoobj_bencher: remove benchmark objects by prefix
Mike Ryan [Thu, 21 Jun 2012 17:08:53 +0000 (10:08 -0700)]
obj_bencher: remove benchmark objects by prefix

This intelligently removes objects from a rados or rest benchmark run by
using parameters from the metadata file.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoobj_bencher: store per-benchmark metadata
Mike Ryan [Wed, 20 Jun 2012 21:50:04 +0000 (14:50 -0700)]
obj_bencher: store per-benchmark metadata

Store metadata for each benchmark run so that the objects can be
efficiently removed at a later point.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoobj_bencher: clean up objects after a write benchmark
Mike Ryan [Wed, 20 Jun 2012 21:47:46 +0000 (14:47 -0700)]
obj_bencher: clean up objects after a write benchmark

Per #2477, objects created during rados or rest write benchmark are
automatically cleaned up after the test. They can optionally be left in
place.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoobj_bencher: announce prefix during write benchmark
Mike Ryan [Tue, 19 Jun 2012 20:54:40 +0000 (13:54 -0700)]
obj_bencher: announce prefix during write benchmark

Per #2477 this can be used during a post-benchmark cleanup in rest and
rados bench.

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agoDon't package crush header files.
Gary Lowell [Thu, 23 Aug 2012 18:48:50 +0000 (11:48 -0700)]
Don't package crush header files.

12 years agoceph.spec.in: package new rbd header and rados library.
Gary Lowell [Thu, 23 Aug 2012 20:40:18 +0000 (13:40 -0700)]
ceph.spec.in: package new rbd header and rados library.

12 years agoMerge branch 'wip-msgr'
Sage Weil [Thu, 23 Aug 2012 20:29:10 +0000 (13:29 -0700)]
Merge branch 'wip-msgr'

12 years agomsg/Pipe: conditionally detect session reset
Sage Weil [Thu, 23 Aug 2012 20:26:32 +0000 (13:26 -0700)]
msg/Pipe: conditionally detect session reset

Lossless peers (osd<->osd, mds<->mds, mon<->mon) never reset sessions
to each other.  In the osd and mds cases, there is no need to check for
session resets.  More significantly, these checks can trigger with an
unfortunately sequence of socket failures.  In particular,

 - A sends connect request to B
 - B accepts, increments connect_seq, then has a socket failure
   before telling A
 - A reconnects, still with connect_seq == 0
 - B sees connect_seq == 0 and thinks there was a reset

This warrants a closer look in the fs client <-> mds case, but for now,
in the cluster-internal communications, it is moot, since reset
detection is unnecessary.

In the monitor case: we do need to check for resets because the peers
reuse the same entity_addr_t's (nonce==0), which means that a daemon
restart is effectively a reset.  In that case, use a different policy
that continues to check for resets.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
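
The conditional check described above can be sketched as a single predicate. The parameter name `policy_resetcheck` and the function name are assumptions for illustration; only `connect_seq` is named in the commit message.

```python
def looks_like_reset(policy_resetcheck, existing_connect_seq, incoming_connect_seq):
    """Decide whether an incoming connect should be treated as a session reset.

    When the policy disables reset checks (e.g. osd<->osd), a peer that
    reconnects with connect_seq == 0 is never treated as having reset.
    """
    if not policy_resetcheck:
        return False
    return incoming_connect_seq == 0 and existing_connect_seq > 0
```

In the failure sequence listed in the message, B holds a nonzero `connect_seq` while A retries with 0; with the check disabled the predicate returns `False` and no reset is inferred.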
12 years agoosd: prefer acting osds in calc_acting()
Sage Weil [Thu, 23 Aug 2012 20:27:26 +0000 (13:27 -0700)]
osd: prefer acting osds in calc_acting()

We currently prefer up osds, and then pull sequentially from peer_info
(strays we know about at the time).  This adds an additional preference
for the current acting, which means we can avoid changes to acting when
they are largely useless.

In particular, I observed that we chose [5,3] and later (when recovery
completed) chose [5,1] because we had since heard about an eligible stray
on 1.  That switch was basically a waste...

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
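
The preference order described above (up osds, then current acting, then known strays) can be modeled in a few lines. This sketch ignores every other eligibility criterion the real calc_acting() applies, and the signature is hypothetical.

```python
def calc_acting(up, acting, strays, size, prefer_acting=True):
    """Pick `size` osds, preferring up, then current acting, then strays."""
    candidates = list(up)
    if prefer_acting:
        candidates += acting
    candidates += strays
    chosen = []
    for osd in candidates:
        if osd not in chosen:
            chosen.append(osd)
        if len(chosen) == size:
            break
    return chosen
```

Using the numbers from the message, with `up=[5]`, `acting=[5, 3]` and strays heard in the order `[1, 3]`, the old ordering switches to `[5, 1]` while the new ordering keeps `[5, 3]`.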
12 years agolibrados: implement aio_remove
Mike Ryan [Tue, 19 Jun 2012 23:56:40 +0000 (16:56 -0700)]
librados: implement aio_remove

Signed-off-by: Mike Ryan <mike.ryan@inktank.com>
12 years agorbd: force all exiting paths through main()/return
Dan Mick [Mon, 20 Aug 2012 22:02:57 +0000 (15:02 -0700)]
rbd: force all exiting paths through main()/return
This properly destroys objects.  In the process, remove usage_exit();
also kill error-handling in set_conf_param (never relevant for rbd.cc,
and if you call it with both pointers NULL, well...)
Also switch to EXIT_FAILURE for consistency.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Fixes: #2948
12 years agoMerge branch 'wip-mon-mkfs'
Sage Weil [Thu, 23 Aug 2012 19:59:28 +0000 (12:59 -0700)]
Merge branch 'wip-mon-mkfs'

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: name cluster uuid file 'cluster_uuid'
Sage Weil [Thu, 23 Aug 2012 19:46:40 +0000 (12:46 -0700)]
mon: name cluster uuid file 'cluster_uuid'

Begin the transition.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoobjecter: use ordered map<> for tracking tids to preserve order on resend
Sage Weil [Wed, 22 Aug 2012 04:12:33 +0000 (21:12 -0700)]
objecter: use ordered map<> for tracking tids to preserve order on resend

We are using a hash_map<> to map tids to Op*'s.  In handle_osd_map(),
we will recalc_op_target() on each Op in a random (hash) order.  These
will get put in a temp map<tid,Op*> to ensure they are resent in the
correct order, but their order on the session->ops list will be random.

Then later, if we reset an OSD connection, we will resend everything for
that session in ops order, which is incorrect.

Fix this by explicitly reordering the requests to resend in
kick_requests(), much like we do in handle_osd_map().  This lets us
continue to use a hash_map<>, which is faster for reasonable numbers of
requests.  A simpler but slower fix would be to just use map<> instead.

This is one of many bugs contributing to #2947.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
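
The fix described above keeps the unordered container and sorts only at resend time. A minimal sketch, assuming ops are tracked in a dict keyed by tid (the function name is hypothetical):

```python
def ops_to_resend(session_ops):
    """Return ops in ascending tid order, whatever order they were stored in.

    `session_ops` maps tid -> op; its iteration order is treated as arbitrary,
    standing in for the hash order of the real container.
    """
    return [session_ops[tid] for tid in sorted(session_ops)]
```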
12 years agoDon't package crush header files.
Gary Lowell [Thu, 23 Aug 2012 18:48:50 +0000 (11:48 -0700)]
Don't package crush header files.

12 years agomon: create cluster_fsid on startup if not present
Sage Weil [Mon, 20 Aug 2012 17:56:37 +0000 (10:56 -0700)]
mon: create cluster_fsid on startup if not present

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: create, verify cluster_fsid file in mon_data dir on mkfs
Sage Weil [Mon, 20 Aug 2012 17:56:14 +0000 (10:56 -0700)]
mon: create, verify cluster_fsid file in mon_data dir on mkfs

Having this present is convenient for external tools.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Thu, 23 Aug 2012 03:23:02 +0000 (20:23 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agocephfs: add 'map' command to dump file mapping onto objects, osds
Sage Weil [Tue, 21 Aug 2012 16:18:53 +0000 (09:18 -0700)]
cephfs: add 'map' command to dump file mapping onto objects, osds

Closes: #3010
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoperf-watch: initial version
Sage Weil [Thu, 23 Aug 2012 00:22:05 +0000 (17:22 -0700)]
perf-watch: initial version

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoobjecter: use ordered map<> for tracking tids to preserve order on resend
Sage Weil [Wed, 22 Aug 2012 04:12:33 +0000 (21:12 -0700)]
objecter: use ordered map<> for tracking tids to preserve order on resend

We are using a hash_map<> to map tids to Op*'s.  In handle_osd_map(),
we will recalc_op_target() on each Op in a random (hash) order.  These
will get put in a temp map<tid,Op*> to ensure they are resent in the
correct order, but their order on the session->ops list will be random.

Then later, if we reset an OSD connection, we will resend everything for
that session in ops order, which is incorrect.

Fix this by explicitly reordering the requests to resend in
kick_requests(), much like we do in handle_osd_map().  This lets us
continue to use a hash_map<>, which is faster for reasonable numbers of
requests.  A simpler but slower fix would be to just use map<> instead.

This is one of many bugs contributing to #2947.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agodoc: Either use a backslash and a newline, or neither.
Tommi Virtanen [Wed, 22 Aug 2012 17:50:22 +0000 (10:50 -0700)]
doc: Either use a backslash and a newline, or neither.

Signed-off-by: Tommi Virtanen <tv@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip-crypto'
Sage Weil [Tue, 21 Aug 2012 22:47:57 +0000 (15:47 -0700)]
Merge remote-tracking branch 'gh/wip-crypto'

12 years agocls_rgw: add gc commands handling
Yehuda Sadeh [Tue, 21 Aug 2012 22:03:35 +0000 (15:03 -0700)]
cls_rgw: add gc commands handling

add the various functionality required for the gc: set entry,
defer entry, list

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoconfig_opts: add gc configurables
Yehuda Sadeh [Tue, 21 Aug 2012 22:01:28 +0000 (15:01 -0700)]
config_opts: add gc configurables

rgw_gc_max_objs: number of objects to use for gc shards
rgw_gc_obj_min_wait: min time for an object to become visible to gc
rgw_gc_processor_max_time: max time for a single gc processor cycle
rgw_gc_processor_period: period between processor starts

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agocls_lock: specify librados namespace explicitly
Yehuda Sadeh [Tue, 21 Aug 2012 21:56:43 +0000 (14:56 -0700)]
cls_lock: specify librados namespace explicitly

librados namespace was not specified, hence required including
source files to add using namespace. This fixes it.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agocls_rgw: cleanups
Yehuda Sadeh [Wed, 25 Jul 2012 23:10:10 +0000 (16:10 -0700)]
cls_rgw: cleanups

move stuff to cls/rgw, create needed helpers.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agomon: implement 'ceph report <tag ...>' command
Sage Weil [Tue, 21 Aug 2012 21:22:20 +0000 (14:22 -0700)]
mon: implement 'ceph report <tag ...>' command

Generate a simple "signed" report of the current cluster status.  Include
a simple crc so that the report is vaguely verifiable.

This is part of #2829.

Signed-off-by: Sage Weil <sage@inktank.com>
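
The idea of attaching a simple crc so a report is "vaguely verifiable" can be sketched as follows. The report layout, the function names, and the use of `zlib.crc32` are all assumptions for illustration; the actual report format and checksum used by the monitor are not described in this log.

```python
import json
import zlib


def make_report(status, tags):
    """Build a report body and a crc over it (hypothetical format)."""
    body = json.dumps({"tag": " ".join(tags), "status": status}, sort_keys=True)
    crc = zlib.crc32(body.encode("utf-8")) & 0xFFFFFFFF
    return body, crc


def verify_report(body, crc):
    """Check that the body still matches the crc it was issued with."""
    return (zlib.crc32(body.encode("utf-8")) & 0xFFFFFFFF) == crc
```

A crc only detects accidental or casual modification; it is not a cryptographic signature, which matches the hedged wording of the commit message.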
12 years agoconfig: remove dead osd options
Sage Weil [Tue, 21 Aug 2012 20:24:55 +0000 (13:24 -0700)]
config: remove dead osd options

The read balancing/shedding stuff is old.  Same goes for class timeouts and
the raid options.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoFix compilation warnings on squeeze; can't printf() snapid_t directly
Dan Mick [Tue, 21 Aug 2012 18:32:45 +0000 (11:32 -0700)]
Fix compilation warnings on squeeze; can't printf() snapid_t directly

12 years agorgw: use sizeof() for snprintf
Sage Weil [Tue, 21 Aug 2012 18:01:11 +0000 (11:01 -0700)]
rgw: use sizeof() for snprintf

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'next'
Sage Weil [Tue, 21 Aug 2012 17:51:54 +0000 (10:51 -0700)]
Merge branch 'next'

12 years agoosd: fix requeue order for waiting_for_ondisk
Sage Weil [Tue, 21 Aug 2012 17:35:37 +0000 (10:35 -0700)]
osd: fix requeue order for waiting_for_ondisk

We are calling requeue_ops() on each individual op, which means we need
to requeue in reverse order (newest first, oldest last).

Fixes: #2947
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
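
Why reverse order is needed can be seen with a front-inserting queue. This sketch assumes that requeueing a single op pushes it onto the front of the queue, which is what the message implies; `requeue_each` is a hypothetical name.

```python
from collections import deque


def requeue_each(queue, ops):
    """Requeue ops one at a time at the front, preserving their original order.

    `ops` is ordered oldest first. Because each op is pushed to the front,
    the newest must go first and the oldest last.
    """
    for op in reversed(ops):
        queue.appendleft(op)
    return queue
```

Iterating in forward order instead would leave the ops at the head of the queue reversed.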
12 years agorgw: dump content_range using 64 bit formatters
Yehuda Sadeh [Sat, 18 Aug 2012 00:34:23 +0000 (17:34 -0700)]
rgw: dump content_range using 64 bit formatters

Fixes: #2961
Also make sure that size is 64 bit.

backport: argonaut
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoRevert "rgw: dump content_range using 64 bit formatters"
Sage Weil [Tue, 21 Aug 2012 17:48:12 +0000 (10:48 -0700)]
Revert "rgw: dump content_range using 64 bit formatters"

This reverts commit cc435e99802f77b3d4b21abe022665ac9df259cf.

Wrong fix; fcgi doesn't do %lld

12 years agomon: fix monitor cluster contraction race
Sage Weil [Tue, 21 Aug 2012 00:04:58 +0000 (17:04 -0700)]
mon: fix monitor cluster contraction race

If we contract to 1 monitor, we win_standalone_election() without bumping
the election epoch.  Racing paxos updates can then reach us without being
ignored and trigger an assert:

mon/Paxos.cc: In function 'void Paxos::handle_accept(MMonPaxos*)' thread 7f85eae05700 time 2012-08-20 16:01:00.843937
mon/Paxos.cc: 468: FAILED assert(state == STATE_UPDATING)

Fixes: #3003
Reported-by: John Wilkins <john.wilkins@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoAdd manpage sections for flatten, snap {un}protect
Dan Mick [Tue, 21 Aug 2012 01:00:46 +0000 (18:00 -0700)]
Add manpage sections for flatten, snap {un}protect

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: John Wilkins <john.wilkins@inktank.com>
12 years agomkcephfs, init-ceph: Warn if hostname "localhost" is seen in ceph.conf.
Tommi Virtanen [Tue, 21 Aug 2012 00:06:09 +0000 (17:06 -0700)]
mkcephfs, init-ceph: Warn if hostname "localhost" is seen in ceph.conf.

Given a ceph.conf that looks like

  [osd.42]
  host = localhost

mkcephfs used to exit with an obscure error message:

  cat: /tmp/mkcephfs.MCBIHvn4Ru/key.*: No such file or directory

"localhost" was never intended to be a valid hostname to use there.
Warn if we see it, and skip the entry. You should use the proper short
hostname of the box.

As init-ceph and mkcephfs share this library, this change affects the
sysvinit scripts too. The behavior *shouldn't* change there (localhost
entries were ignored earlier, too), but you may see this extra
warning. Which is good.

Closes: #3001
Signed-off-by: Tommi Virtanen <tv@inktank.com>
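
The warn-and-skip behavior described above can be sketched in Python. The real change lives in a shell library shared by mkcephfs and init-ceph; this illustration uses `configparser` instead, and the function name and warning text are assumptions.

```python
import configparser


def hosts_to_use(conf_text):
    """Return ({section: host}, warnings), skipping entries with host = localhost."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    kept = {}
    warnings = []
    for section in parser.sections():
        host = parser.get(section, "host", fallback=None)
        if host is None:
            continue
        if host == "localhost":
            # Warn and skip: "localhost" is not a valid hostname here.
            warnings.append(
                "%s: host = localhost is not valid, use the short hostname" % section
            )
            continue
        kept[section] = host
    return kept, warnings
```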
12 years ago"Removed 274 from xfstests"
tamil [Mon, 20 Aug 2012 23:53:18 +0000 (16:53 -0700)]
"Removed 274 from xfstests"

Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
12 years agotest_rbd.py: remove clone before image it depends on
Dan Mick [Mon, 20 Aug 2012 22:59:33 +0000 (15:59 -0700)]
test_rbd.py: remove clone before image it depends on

Signed-off-by: Dan Mick <dan.mick@inktank.com>