git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years agocrushtool: add --enable-unsafe-tunables option
Sage Weil [Fri, 8 Jun 2012 17:57:28 +0000 (10:57 -0700)]
crushtool: add --enable-unsafe-tunables option

This is required to adjust tunables.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agocrushtool: --show-* instead of --output-*
Sage Weil [Fri, 8 Jun 2012 15:52:43 +0000 (08:52 -0700)]
crushtool: --show-* instead of --output-*

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoCrushTester: clean up output interface
Sage Weil [Fri, 8 Jun 2012 02:33:14 +0000 (19:33 -0700)]
CrushTester: clean up output interface

Multiple accessors.  Init in ctor.  Avoid temp vars in crushtool.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoCrushTester: dump histogram of choose attempts
Sage Weil [Fri, 8 Jun 2012 02:21:51 +0000 (19:21 -0700)]
CrushTester: dump histogram of choose attempts

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agocrush: generate histogram of choose tries
Sage Weil [Fri, 8 Jun 2012 02:21:36 +0000 (19:21 -0700)]
crush: generate histogram of choose tries

Optionally populate a histogram of choose descent attempts.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agocrush: fix leaf recursion if we already collided
Sage Weil [Thu, 7 Jun 2012 23:52:57 +0000 (16:52 -0700)]
crush: fix leaf recursion if we already collided

This just saves us some cycles, but does not affect placement results at
all.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoCrushTester: optionally output bad mappings
Sage Weil [Thu, 7 Jun 2012 23:34:11 +0000 (16:34 -0700)]
CrushTester: optionally output bad mappings

Optionally dump bad inputs to stdout.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agocrushtool: arguments to adjust tunables
Sage Weil [Thu, 7 Jun 2012 23:08:23 +0000 (16:08 -0700)]
crushtool: arguments to adjust tunables

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agocrush: make magic numbers tunable
Sage Weil [Thu, 7 Jun 2012 22:57:09 +0000 (15:57 -0700)]
crush: make magic numbers tunable

We have three magic numbers in crush_choose that are now tunable.  The
first two control the local retry behavior, including fallback to a
permutation.  The last is the total map descent attempts.

We can avoid a drastic incompatibility by making these tunable and encoded
in the map.  That means users can enable/disable local retry, for example,
without changing the code.  As long as the clients understand the tunables,
they can be adjusted.

This patch doesn't address the compatibility and feature bit issue.  We may
want to roll that into a larger revision with more drastic changes, once
we know what those changes will look like.  However, a careful user can
use the new code and modify the behavior.

Signed-off-by: Sage Weil <sage@inktank.com>
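
A toy sketch of how three such tunables might bound a selection loop.  This is
not the crush_choose code: the field names, the default values, the hash, and
the rotation used as a "permutation" are all assumptions made for illustration.

```python
from dataclasses import dataclass
import hashlib


@dataclass
class Tunables:
    # Hypothetical names and values; the commit only says three numbers became tunable.
    local_tries: int = 2           # retries within the same bucket
    local_fallback_tries: int = 5  # further tries walking a permutation of the bucket
    total_tries: int = 19          # full descents from the root before giving up


def _hash(*parts) -> int:
    digest = hashlib.sha256(repr(parts).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def choose(x, bucket, taken, t):
    """Pick one item of `bucket` not in `taken`; return (item, descents_used)."""
    for descent in range(t.total_tries):
        # First attempt plus local retries: rehash within the same bucket.
        for r in range(t.local_tries + 1):
            item = bucket[_hash(x, descent, r) % len(bucket)]
            if item not in taken:
                return item, descent + 1
        # Fallback: walk a permutation of the bucket (toy version: a rotation).
        start = _hash(x, descent, "perm") % len(bucket)
        for i in range(min(t.local_fallback_tries, len(bucket))):
            item = bucket[(start + i) % len(bucket)]
            if item not in taken:
                return item, descent + 1
    return None, t.total_tries
```

Setting both local values to 0 in this sketch disables local retry, so every
collision costs a full descent; that is the kind of behavior change the commit
says can be made without changing the code.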
13 years agomon: use mode 0600 throughout
Sage Weil [Thu, 7 Jun 2012 20:57:10 +0000 (13:57 -0700)]
mon: use mode 0600 throughout

Fixes: #2526
Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoMerge remote-tracking branch 'gh/mon-auth'
Sage Weil [Thu, 7 Jun 2012 19:22:47 +0000 (12:22 -0700)]
Merge remote-tracking branch 'gh/mon-auth'

Reviewed-by: Greg Farnum <greg@inktank.com>
13 years agodoc: Added mount cephfs with fstab.
John Wilkins [Thu, 7 Jun 2012 18:35:37 +0000 (11:35 -0700)]
doc: Added mount cephfs with fstab.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agoosd: include past_intervals in pg query results
Sage Weil [Thu, 7 Jun 2012 18:17:12 +0000 (11:17 -0700)]
osd: include past_intervals in pg query results

This will help us figure out *why* nodes are in the prior set.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
13 years agoOSD: _have_pg should return NULL if pg is not in map
Samuel Just [Mon, 14 May 2012 20:12:18 +0000 (13:12 -0700)]
OSD: _have_pg should return NULL if pg is not in map

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoMerge remote-tracking branch 'gh/wip-assert2'
Sage Weil [Thu, 7 Jun 2012 18:21:39 +0000 (11:21 -0700)]
Merge remote-tracking branch 'gh/wip-assert2'

"So be it"

Reviewed-by: Sam Just <sam.just@dreamhost.com>
13 years agodeliberately break encoding macros when wrong assert is present
Sage Weil [Thu, 7 Jun 2012 17:19:09 +0000 (10:19 -0700)]
deliberately break encoding macros when wrong assert is present

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomisc assert #include cleanup, hackery
Sage Weil [Thu, 7 Jun 2012 17:18:56 +0000 (10:18 -0700)]
misc assert #include cleanup, hackery

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoreinclude assert.h after json_spirit
Sage Weil [Thu, 7 Jun 2012 17:18:38 +0000 (10:18 -0700)]
reinclude assert.h after json_spirit

json_spirit clobbers it!

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agodoc: Incorporated Sam's comments.
John Wilkins [Thu, 7 Jun 2012 17:08:16 +0000 (10:08 -0700)]
doc: Incorporated Sam's comments.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agoMerge remote-tracking branch 'gh/wip-assert'
Sage Weil [Thu, 7 Jun 2012 16:41:14 +0000 (09:41 -0700)]
Merge remote-tracking branch 'gh/wip-assert'

Reviewed-by: Sam Just <sam.just@inktank.com>
13 years agodoc: Typo fix.
John Wilkins [Thu, 7 Jun 2012 14:38:36 +0000 (07:38 -0700)]
doc: Typo fix.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agomon: set policy for client, mds before throttler
Sage Weil [Thu, 7 Jun 2012 02:19:59 +0000 (19:19 -0700)]
mon: set policy for client, mds before throttler

Otherwise we fail the assert in Messenger::set_policy_throttler()!

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoDBObjectMap: fix some warnings
Sage Weil [Thu, 7 Jun 2012 02:05:46 +0000 (19:05 -0700)]
DBObjectMap: fix some warnings

os/DBObjectMap.cc:197: warning: suggest a space before ';' or explicit braces around empty body in 'for' statement

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomake everyone use our assert #include and macro
Sage Weil [Wed, 6 Jun 2012 23:57:31 +0000 (16:57 -0700)]
make everyone use our assert #include and macro

...as detected by the previous patch.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoassert: detect when /usr/include/assert.h clobbers us
Sage Weil [Wed, 6 Jun 2012 23:06:28 +0000 (16:06 -0700)]
assert: detect when /usr/include/assert.h clobbers us

The normal assert.h is very rude in that it clobbers any existing assert
define and replaces it with its own.  And sadly, lots of things we include
include the generic version.

Be extra rude in response.  Clobber any existing assert #define, and also
#define _ASSERT_H to be a magic value that our commonly-used dendl #define
depends on.  This way we get a compile error if the system version replaces
our own.

This is imperfect, since we will only detect their rudeness when we use
the debug macros.  I'm not coming up with something that is more widely
used that would work better, however.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoMerge remote-tracking branch 'gh/wip-msgr-interface'
Sage Weil [Wed, 6 Jun 2012 23:01:19 +0000 (16:01 -0700)]
Merge remote-tracking branch 'gh/wip-msgr-interface'

Reviewed-by: Sage Weil <sage@inktank.com>
13 years agokeyserver: also authenticate against mon keyring
Sage Weil [Wed, 6 Jun 2012 22:30:36 +0000 (15:30 -0700)]
keyserver: also authenticate against mon keyring

If we don't have a secret, also check in the extra_secrets keyring.

This means we can also authenticate as any users that appear in the mon
keyring, and get the caps defined there.  This lets us bootstrap the
client.admin key with mon. key, provided mon 'allow *' caps appear in the
mon keyring.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agokeyring: implement get_caps()
Sage Weil [Wed, 6 Jun 2012 22:26:53 +0000 (15:26 -0700)]
keyring: implement get_caps()

Simple accessor, mirrors KeyServerData.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomon: share mon keyring with KeyServer
Sage Weil [Wed, 6 Jun 2012 22:26:28 +0000 (15:26 -0700)]
mon: share mon keyring with KeyServer

This will let us authenticate against items in the mon keyring, like the
mon. key itself.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoMerge remote-tracking branch 'gh/wip_hobject_wpool'
Sage Weil [Wed, 6 Jun 2012 21:11:24 +0000 (14:11 -0700)]
Merge remote-tracking branch 'gh/wip_hobject_wpool'

Reviewed-by: Sage Weil <sage@inktank.com>
13 years agomon: put cluster log at /var/log/ceph/$cluster.log and/or send to syslog
Sage Weil [Wed, 6 Jun 2012 21:09:22 +0000 (14:09 -0700)]
mon: put cluster log at /var/log/ceph/$cluster.log and/or send to syslog

Also, stop breaking it down by event severity on disk.  If you want that,
use syslog.

Fixes: #2497
Backport: dho
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
13 years agoMerge branch 'wip-crushtool'
Sage Weil [Wed, 6 Jun 2012 18:29:41 +0000 (11:29 -0700)]
Merge branch 'wip-crushtool'

Reviewed-by: Sage Weil <sage@inktank.com>
13 years agomonclient: be paranoid/defensive about send_log vs log_client==NULL
Sage Weil [Wed, 6 Jun 2012 16:13:14 +0000 (09:13 -0700)]
monclient: be paranoid/defensive about send_log vs log_client==NULL

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agocrushtool: fix cli tests given new less-chatty output, help
Sage Weil [Wed, 6 Jun 2012 18:05:57 +0000 (11:05 -0700)]
crushtool: fix cli tests given new less-chatty output, help

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agocrushtool: allow user to select output reporting in blocks
caleb miles [Tue, 5 Jun 2012 22:50:54 +0000 (15:50 -0700)]
crushtool: allow user to select output reporting in blocks

Signed-off-by: caleb miles <caleb.miles@inktank.com>
13 years agodoc: Added mount cephfs and included it in quick start.
John Wilkins [Wed, 6 Jun 2012 17:45:26 +0000 (10:45 -0700)]
doc: Added mount cephfs and included it in quick start.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agologclient: fix warning
Sage Weil [Wed, 6 Jun 2012 04:06:01 +0000 (21:06 -0700)]
logclient: fix warning

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomon: include pg acting in health detail
Sage Weil [Wed, 6 Jun 2012 04:05:55 +0000 (21:05 -0700)]
mon: include pg acting in health detail

Backport: dho
Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomon: include all types of stuck pgs in health detail
Sage Weil [Wed, 6 Jun 2012 04:04:24 +0000 (21:04 -0700)]
mon: include all types of stuck pgs in health detail

We were just including the last one, which isn't as helpful.

Backport: dho
Signed-off-by: Sage Weil <sage@inktank.com>
13 years agotest/cli/ceph-authtool: keyring.bin -> keyring
Sage Weil [Wed, 6 Jun 2012 03:16:19 +0000 (20:16 -0700)]
test/cli/ceph-authtool: keyring.bin -> keyring

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agodoc: keyring.bin -> keyring everywhere
Sage Weil [Wed, 6 Jun 2012 03:16:04 +0000 (20:16 -0700)]
doc: keyring.bin -> keyring everywhere

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agotest/: Made omap_bench compatible with teuthology
Eleanor Cawthon [Tue, 5 Jun 2012 22:34:41 +0000 (15:34 -0700)]
test/: Made omap_bench compatible with teuthology

added --name parsing, made histogram better, made rados_id
configurable, changed object names to use configurable prefix.

Signed-off-by: Eleanor Cawthon <eleanor.cawthon@inktank.com>
13 years agodoc: Added the root discussion to deploy with mkcephfs.
John Wilkins [Wed, 6 Jun 2012 00:09:59 +0000 (17:09 -0700)]
doc: Added the root discussion to deploy with mkcephfs.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agodoc: Added chmod for keyring, and moved client.admin user higher.
John Wilkins [Wed, 6 Jun 2012 00:08:45 +0000 (17:08 -0700)]
doc: Added chmod for keyring, and moved client.admin user higher.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agologclient: fix crashes, fix which entries are sent
Sage Weil [Tue, 5 Jun 2012 23:15:43 +0000 (16:15 -0700)]
logclient: fix crashes, fix which entries are sent

I was seeing crashes when the monitor tried to send log entries.

* Send log entries that haven't already been sent.
* Don't try to be tricky with the deque; I'm paranoid about the stability
  of the iterator.
* various asserts
* better variable names

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomonclient: send more log entries when first set is acked
Sage Weil [Tue, 5 Jun 2012 21:52:42 +0000 (14:52 -0700)]
monclient: send more log entries when first set is acked

Immediately send more log messages if we had more when the first set was
sent.  Otherwise, wait until the next tick to check.  This semi-throttles
logging based on how much the monitor can handle.

Signed-off-by: Sage Weil <sage@inktank.com>
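
A minimal sketch of the batching behavior described in this commit and in the
logclient commits around it: send only entries that have not been sent, cap
each message, and send again right away on ack only if more were waiting.  The
class, its method names, and the limit of 3 are hypothetical, not the LogClient
interface.

```python
from collections import deque


class LogQueue:
    """Toy model: send at most `max_per_message` unsent entries at a time."""

    def __init__(self, max_per_message=3):  # limit value is an assumption
        self.max_per_message = max_per_message
        self.entries = deque()   # (seq, text), oldest first
        self.next_seq = 1
        self.last_sent = 0       # highest seq handed to the monitor
        self.more_pending = False

    def log(self, text):
        self.entries.append((self.next_seq, text))
        self.next_seq += 1

    def get_message(self):
        """Build the next batch from entries not already sent, or None."""
        unsent = [e for e in self.entries if e[0] > self.last_sent]
        batch = unsent[: self.max_per_message]
        if not batch:
            return None
        self.last_sent = batch[-1][0]
        self.more_pending = len(unsent) > len(batch)
        return batch

    def handle_ack(self, acked_seq):
        """Drop acked entries; send again at once only if more were waiting."""
        while self.entries and self.entries[0][0] <= acked_seq:
            self.entries.popleft()
        if self.more_pending:
            return self.get_message()
        return None   # otherwise wait for the next tick
```

Because a new batch goes out only after the previous one is acked, the sender's
rate follows what the monitor can absorb.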
13 years agologclient: not a dispatcher
Sage Weil [Tue, 5 Jun 2012 21:51:17 +0000 (14:51 -0700)]
logclient: not a dispatcher

Let MonClient and Monitor handle delivery of messages.  This puts them in
control and lets them trigger sending of more messages when we have a
bunch queued.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agologclient: limit messages per MLog message
Sage Weil [Tue, 5 Jun 2012 21:21:33 +0000 (14:21 -0700)]
logclient: limit messages per MLog message

This will avoid sending huge chunks of entries to the monitor and making
its life difficult.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
13 years agomon: limit size of each logm paxos event
Sage Weil [Tue, 5 Jun 2012 20:54:57 +0000 (13:54 -0700)]
mon: limit size of each logm paxos event

Limit the number of log events we cram into a single paxos event.

Fixes: #2518
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
13 years agoconfig_opts: filestore_update_to defaults to 1000
Samuel Just [Tue, 5 Jun 2012 23:33:47 +0000 (16:33 -0700)]
config_opts: filestore_update_to defaults to 1000

This way, filestores will be auto-upgraded by default.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD: do not convert an entire collection in one transaction
Samuel Just [Tue, 5 Jun 2012 16:58:38 +0000 (09:58 -0700)]
OSD: do not convert an entire collection in one transaction

Previously, we atomically moved the collection out of the way, created a
new collection, moved the contents of the old collection to the new
collection and removed the old collection.  For large collections, this
could result in unacceptably long transactions.  Now, we create a temp
collection, link all objects into it, atomically swap them, and then
remove the old one.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoFileStore,DBObjectMap: add SequencerPosition argument to ObjectMap
Samuel Just [Fri, 1 Jun 2012 05:29:21 +0000 (22:29 -0700)]
FileStore,DBObjectMap: add SequencerPosition argument to ObjectMap

Previously, sequences like:

1. touch (c1, a)
2. link (c1, c2, a)
3. rm (c1, a)
4. setattr (c2, a)
5. clone (c2, a, b)

could result in the omap entries for a being removed once ops 2-3 are
replayed.  Calls to ObjectMap::sync will include a sequencer position
and an hobject_t to mark.  Updates to the object map will now also check
the SequencerPosition entry on the map header preventing replay of
earlier ops.

Signed-off-by: Samuel Just <sam.just@inktank.com>
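
A minimal sketch of the replay guard described above, assuming a position is an
ordered (seq, trans, op) triple stored on the map header.  The field names and
the apply_omap_op helper are hypothetical, not the DBObjectMap interface.

```python
from dataclasses import dataclass, field


@dataclass(order=True, frozen=True)
class SequencerPosition:
    # Field names are assumptions; comparison is lexicographic.
    seq: int = 0
    trans: int = 0
    op: int = 0


@dataclass
class Header:
    omap: dict = field(default_factory=dict)
    # Position of the last op applied to this header.
    spos: SequencerPosition = field(default_factory=SequencerPosition)


def apply_omap_op(header, spos, fn):
    """Apply fn to the header's omap unless an op at or past spos already ran."""
    if spos <= header.spos:
        return False          # replayed op: skip it
    fn(header.omap)
    header.spos = spos
    return True
```

On journal replay, an earlier op arrives with a position at or below the one
recorded on the header and is skipped, so it cannot undo a later update.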
13 years agoReplicatedPG: push_start, don't insert empty extent into data_subset
Samuel Just [Fri, 1 Jun 2012 18:07:32 +0000 (11:07 -0700)]
ReplicatedPG: push_start, don't insert empty extent into data_subset

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agotest_filestore_idempotent_sequence: add omap
Samuel Just [Fri, 1 Jun 2012 00:48:39 +0000 (17:48 -0700)]
test_filestore_idempotent_sequence: add omap

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoosd_types,PG: handle new hobject format in object_info,pg_log
Samuel Just [Thu, 31 May 2012 05:48:19 +0000 (22:48 -0700)]
osd_types,PG: handle new hobject format in object_info,pg_log

There are also legacy hobject encodings in the pg log and in object_info
attributes on objects.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agotest_object_map: remove DBObjectMapv0
Samuel Just [Wed, 30 May 2012 23:11:27 +0000 (16:11 -0700)]
test_object_map: remove DBObjectMapv0

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoOSD,FileStore: clean up filestore conversion
Samuel Just [Wed, 30 May 2012 23:01:38 +0000 (16:01 -0700)]
OSD,FileStore: clean up filestore conversion

Previously, we messed with the filestore_update_collections config
option to enable upgrades in the filestore.  We now pass that in as a
parameter to the FileStore,IndexManager constructors.

Further, the user must now specify the version to which to update in
order to prevent accidental updates.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoDBObjectMap,FileStore: Remove IndexedPath parameters from ObjectMap
Samuel Just [Wed, 30 May 2012 04:47:50 +0000 (21:47 -0700)]
DBObjectMap,FileStore: Remove IndexedPath parameters from ObjectMap

IndexedPath parameters are no longer needed for getting the object
collections or for supporting the TMAP implementation.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoDBObjectMap: update header comments for new structure
Samuel Just [Wed, 30 May 2012 04:45:08 +0000 (21:45 -0700)]
DBObjectMap: update header comments for new structure

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoFileStore: skip omap during list_collections
Samuel Just [Wed, 30 May 2012 03:09:20 +0000 (20:09 -0700)]
FileStore: skip omap during list_collections

Signed-off-by: Samuel Just <sam.just@dreamhost.com>
13 years agoOSD: exit(0) once filestore is converted
Samuel Just [Wed, 30 May 2012 02:57:37 +0000 (19:57 -0700)]
OSD: exit(0) once filestore is converted

Also, do not upgrade filestore automatically

Signed-off-by: Samuel Just <sam.just@dreamhost.com>
13 years agoReplicatedPG: adjust missing at push_start
Samuel Just [Wed, 30 May 2012 00:08:45 +0000 (17:08 -0700)]
ReplicatedPG: adjust missing at push_start

When we start receiving an object, we remove the old copy.  This will
prevent the primary from using that old copy after that point.
We do the same on the pushee.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoReplicatedPG: remove_object_with_snap_hardlinks before creating temp obj
Samuel Just [Tue, 29 May 2012 22:36:00 +0000 (15:36 -0700)]
ReplicatedPG: remove_object_with_snap_hardlinks before creating temp obj

hobject_ts must be unique in the filestore.  Thus, when we create the
new temp object, the old one must have been deleted already.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoDBObjectMap::init: initialize seq and v to correct values
Samuel Just [Tue, 29 May 2012 20:03:02 +0000 (13:03 -0700)]
DBObjectMap::init: initialize seq and v to correct values

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoDBObjectMap: update check() for new format
Samuel Just [Sat, 26 May 2012 21:15:27 +0000 (14:15 -0700)]
DBObjectMap: update check() for new format

Signed-off-by: Samuel Just <sam.just@dreamhost.com>
13 years agoDBObjectMap: Implement upgrade from previous format
Samuel Just [Sat, 26 May 2012 05:42:50 +0000 (22:42 -0700)]
DBObjectMap: Implement upgrade from previous format

Also includes tests in test_object_map.cc

Signed-off-by: Samuel Just <sam.just@dreamhost.com>
13 years agoDBObjectMap: restructure for unique hobject_t's
Samuel Just [Sat, 26 May 2012 02:18:41 +0000 (19:18 -0700)]
DBObjectMap: restructure for unique hobject_t's

Previously, the ObjectStore operated in terms of (coll_t,hobject_t)
tuples.  Now that hobject_t's are globally unique within the
ObjectStore, it is no longer necessary to support multiple names for the
same DBObjectMap node.

Signed-off-by: Samuel Just <sam.just@dreamhost.com>
13 years agoFileStore,DBObjectMap: remove ObjectMap link method
Samuel Just [Fri, 25 May 2012 22:06:55 +0000 (15:06 -0700)]
FileStore,DBObjectMap: remove ObjectMap link method

hobject_t's are now globally unique in filestore.  Essentially, there is
a 1-to-1 mapping from inodes to hobject_t's.  The entry in the
DBObjectMap is now tied to the inode/hobject_t.  Thus, links needn't be
tracked.  Rather, we delete the ObjectMap entry when nlink == 0.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoDBObjectMap: version bump for new format
Samuel Just [Fri, 25 May 2012 21:07:20 +0000 (14:07 -0700)]
DBObjectMap: version bump for new format

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoDBObjectMap: add parse method for old encoding
Samuel Just [Fri, 25 May 2012 20:59:57 +0000 (13:59 -0700)]
DBObjectMap: add parse method for old encoding

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agoos/: update CollectionIndex filename encodings
Samuel Just [Thu, 24 May 2012 18:28:47 +0000 (11:28 -0700)]
os/: update CollectionIndex filename encodings

filename encodings now include namespace and pool.

Signed-off-by: Samuel Just <sam.just@inktank.com>
13 years agotest/ObjectMap: Copy current DBObjectMap implementation
Samuel Just [Sat, 26 May 2012 04:30:37 +0000 (21:30 -0700)]
test/ObjectMap: Copy current DBObjectMap implementation

This implementation will be used to test the upgrade process.

Signed-off-by: Samuel Just <sam.just@dreamhost.com>
13 years agosrc/: Add namespace and pool fields to hobject_t
Samuel Just [Thu, 24 May 2012 17:57:22 +0000 (10:57 -0700)]
src/: Add namespace and pool fields to hobject_t

From this point, hobjects in the ObjectStore will be globally unique.  This
will allow us to avoid including the collection in the ObjectMap key encoding
and thereby enable efficient collection renames and, eventually, collection
splits.

Signed-off-by: Samuel Just <sam.just@inktank.com>
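
A small sketch of why a globally unique object identity helps: if the omap is
keyed by the object alone, renaming a collection touches no per-object key.  The
HObject fields and the ObjectMapSketch class are hypothetical and much simpler
than hobject_t and the real ObjectMap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HObject:
    # Hypothetical field set; the commit only names the pool and namespace additions.
    pool: int
    nspace: str
    name: str


class ObjectMapSketch:
    def __init__(self):
        self.omap = {}          # keyed by object only, not by (collection, object)
        self.collections = {}   # collection name -> set of HObject

    def set_key(self, coll, obj, key, value):
        self.collections.setdefault(coll, set()).add(obj)
        self.omap.setdefault(obj, {})[key] = value

    def rename_collection(self, old, new):
        # Only the collection entry moves; no per-object omap key is rewritten.
        self.collections[new] = self.collections.pop(old)
```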
13 years agomon: clear osd_stat on osd creation/destruction
Sage Weil [Mon, 4 Jun 2012 03:09:27 +0000 (20:09 -0700)]
mon: clear osd_stat on osd creation/destruction

Reported-by: Vladimir Bashkirtsev <vladimir@bashkirtsev.com>
Signed-off-by: Sage Weil <sage@inktank.com>
13 years agodoc: Added S3 examples to the toctree.
John Wilkins [Tue, 5 Jun 2012 18:09:44 +0000 (11:09 -0700)]
doc: Added S3 examples to the toctree.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agodoc: adding code samples for S3 API usage (thanks, DH!)
Emily Popper [Tue, 5 Jun 2012 17:55:08 +0000 (10:55 -0700)]
doc: adding code samples for S3 API usage (thanks, DH!)

Signed-off-by: Ross Turk <ross@inktank.com>
13 years agoMakefile.am: explicitly mention that -Wl,--as-needed is location-sensitive.
Tommi Virtanen [Wed, 9 Mar 2011 18:17:20 +0000 (10:17 -0800)]
Makefile.am: explicitly mention that -Wl,--as-needed is location-sensitive.

13 years agodoc: Added ${lsb_release -sc} based on Sam's feedback.
John Wilkins [Tue, 5 Jun 2012 15:15:47 +0000 (08:15 -0700)]
doc: Added ${lsb_release -sc} based on Sam's feedback.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
13 years agorgw: shutdown init_timer
Yehuda Sadeh [Mon, 4 Jun 2012 23:15:11 +0000 (16:15 -0700)]
rgw: shutdown init_timer

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
13 years agorgw: try to create fcgi socket through open() first
Yehuda Sadeh [Mon, 4 Jun 2012 23:01:07 +0000 (16:01 -0700)]
rgw: try to create fcgi socket through open() first

FCGX_OpenSocket might just exit() without any warning if it fails
to create the socket.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
13 years agomsg: make clear_pipe work only on a given Pipe, rather than the current one.
Greg Farnum [Fri, 1 Jun 2012 23:46:39 +0000 (16:46 -0700)]
msg: make clear_pipe work only on a given Pipe, rather than the current one.

This way old Pipes that have been replaced can't clear the new Pipe
out of a Connection's link.
We might attempt to instead sever the link between CLOSED Pipes and
their Connections more completely (eg, when the Connection gets a
new Pipe), but that will require more work to handle all the
cases, and this works for now.

Signed-off-by: Greg Farnum <greg@inktank.com>
13 years agoCrushTester: allow build without boost stuff for chi^2 testing
Sage Weil [Sun, 3 Jun 2012 23:12:20 +0000 (16:12 -0700)]
CrushTester: allow build without boost stuff for chi^2 testing

With limited functionality.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agocrush: simulate using rng; use chi-squared to measure result
Caleb Miles [Sun, 3 Jun 2012 22:54:22 +0000 (15:54 -0700)]
crush: simulate using rng; use chi-squared to measure result

Signed-off-by: Caleb Miles <caleb.miles@inktank.com>
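
The commit message gives no detail, so the following is only a generic sketch of
the idea in the title: place inputs with a seeded RNG in proportion to device
weight, then score a set of observed counts with Pearson's chi-squared
statistic.  The function names are hypothetical and this is not the CrushTester
code.

```python
import random


def chi_squared(observed, weights):
    """Pearson's chi-squared statistic for counts against weight-proportional expectations."""
    total = sum(observed)
    wsum = sum(weights)
    stat = 0.0
    for obs, w in zip(observed, weights):
        expected = total * w / wsum
        stat += (obs - expected) ** 2 / expected
    return stat


def simulate(weights, n, seed=0):
    """Place n inputs on devices with a seeded RNG, proportionally to weight."""
    rng = random.Random(seed)
    counts = [0] * len(weights)
    for dev in rng.choices(range(len(weights)), weights=weights, k=n):
        counts[dev] += 1
    return counts
```

For two equally weighted devices, counts of 60 and 40 give (10^2)/50 + (10^2)/50
= 4.0, while a perfectly even split gives 0.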
13 years agocrush: check_item_present
Caleb Miles [Sun, 3 Jun 2012 22:25:09 +0000 (15:25 -0700)]
crush: check_item_present

True if id is present in the map.

Signed-off-by: Caleb Miles <caleb.miles@inktank.com>
13 years agoceph_argparse: with_float
Caleb Miles [Sun, 3 Jun 2012 22:21:06 +0000 (15:21 -0700)]
ceph_argparse: with_float

Signed-off-by: Caleb Miles <caleb.miles@inktank.com>
13 years agoadmin_socket: only init if path is defined
Sage Weil [Sun, 3 Jun 2012 20:51:34 +0000 (13:51 -0700)]
admin_socket: only init if path is defined

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoMerge remote branch 'gh/chef-3'
Sage Weil [Sat, 2 Jun 2012 21:49:02 +0000 (14:49 -0700)]
Merge remote branch 'gh/chef-3'

13 years agoupstart: simplify start; allow group stop via an abstract job
Sage Weil [Sat, 2 Jun 2012 22:19:28 +0000 (15:19 -0700)]
upstart: simplify start; allow group stop via an abstract job

Use a 'ceph-mds' or 'ceph-mon' event to start instances instead of
explicitly calling start.  This avoids the ugly is-this-already-running
check.  [Thanks Guilhem Lettron for that!]

Make the -all job abstract (which means it stays started and can be
stopped).  Trigger a helper task (-all-starter) to trigger instance
start.  Make instances stop with the -all task.  This allows you to do

 start ceph-mds-all
 stop ceph-mds-all
 start ceph-mds id=foo
 start ceph-mds-all
 stop ceph-mds id=bar
 stop ceph-mds-all

but not

 start ceph-mds id=foo
 stop ceph-mds-all

because ceph-mds-all isn't running.  Not quite as flexible as sysvinit in
that regard, but good enough for me.

Fixes: #2414
Signed-off-by: Sage Weil <sage@inktank.com>
13 years agopaxos: warn on extreme clock skew
Sage Weil [Sat, 2 Jun 2012 21:29:48 +0000 (14:29 -0700)]
paxos: warn on extreme clock skew

This would have helped us diagnose #2480.

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoomapbench: fix warning
Sage Weil [Sat, 2 Jun 2012 21:03:17 +0000 (14:03 -0700)]
omapbench: fix warning

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoomapbench: fix misc warnings
Sage Weil [Sat, 2 Jun 2012 20:18:01 +0000 (13:18 -0700)]
omapbench: fix misc warnings

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agoReplicatedPG: fix pgls listing, add max listing size
Samuel Just [Fri, 1 Jun 2012 22:39:41 +0000 (15:39 -0700)]
ReplicatedPG: fix pgls listing, add max listing size

Previously, a client requesting a large pgls could tie up the
osd for an unacceptable amount of time.  Also, it's possible
for the osd to return less than the requested number of
entries anyway, so we now return 1 when we have completed the
listing.

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
13 years agoobjecter: fix pgls
Sage Weil [Fri, 1 Jun 2012 22:44:51 +0000 (15:44 -0700)]
objecter: fix pgls

First problem: if the osd returns more entries than we ask for, max_entries
was going negative, and we were requesting (u64)(-small number) on the
next iteration, slamming the OSD when the PG was big.  We fix that by
finishing when response_size >= max_entries.

Second problem: AFAICS we were not requesting the second chunk on a large
PG at all, here, if the OSD returned less than what we wanted.  Fix this
by asking for more in that case.

That means we detect the end of a PG in two ways:

 * if the OSD sets the return value to 1 (instead of 0)
 * if we get 0 items in the response

Another patch will change the OSD behavior to return 1, and all will be
well.  If we run against an old OSD, we'll send an extra request for each
PG and get nothing back before we realize we've hit the end and move on.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
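
A toy model of the listing loop described above.  FakeOSD, list_pg and as_u64 are
hypothetical stand-ins, not the Objecter or OSD interfaces; the server-side cap
models the max listing size added on the OSD side.

```python
U64_MASK = (1 << 64) - 1


def as_u64(n):
    """What a negative remaining count turns into when cast to an unsigned 64-bit value."""
    return n & U64_MASK


class FakeOSD:
    """Stand-in for one PG; returns (ret, items) with ret == 1 at the end."""

    def __init__(self, objects, server_max, returns_one=True):
        self.objects = objects
        self.server_max = server_max    # server-side cap on one reply
        self.returns_one = returns_one  # False models an old OSD

    def pgls(self, cursor, want):
        n = min(want, self.server_max)
        items = self.objects[cursor:cursor + n]
        done = cursor + len(items) >= len(self.objects)
        return (1 if done and self.returns_one else 0), items


def list_pg(osd, max_entries):
    """Return (objects listed, requests sent, whether the PG end was reached)."""
    out, requests = [], 0
    # Stop once we have at least max_entries, so the remainder never goes negative.
    while len(out) < max_entries:
        ret, items = osd.pgls(len(out), max_entries - len(out))
        requests += 1
        out.extend(items)
        # End of PG: the OSD said so (ret == 1) or the reply was empty.
        if ret == 1 or not items:
            return out, requests, True
    return out, requests, False
```

With 10 objects and a server cap of 4, a new-style FakeOSD finishes in 3
requests; with returns_one=False the loop needs a fourth, empty reply before it
moves on, which matches the extra request the message predicts for old OSDs.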
13 years agoMerge remote-tracking branch 'gh/wip-admin'
Sage Weil [Fri, 1 Jun 2012 20:57:26 +0000 (13:57 -0700)]
Merge remote-tracking branch 'gh/wip-admin'

13 years agomon: fix slurp latest race
Sage Weil [Fri, 1 Jun 2012 20:54:28 +0000 (13:54 -0700)]
mon: fix slurp latest race

It is possible for the latest version to get out in front of the
last_committed version:

 a- start slurping
 a- slurp a bunch of states, through X
 a- get them back, write them out
 b- monitor commits many new states
 a- slurp latest, X+100 say, but only get some of those states due to the
    slurp per-message byte limit
 a- write latest + some (but not all) prior states
 a- call back into slurp(), update_from_paxos(), trigger assert

This fix ensures that we make note of the source's new latest, so that on
the next pass through slurp() we will grab any missing states.

We *also* explicitly require that we get everything up through what we have
stashed, in defense against some future kludging that might only require that
we be nearly (but not completely) in sync before finishing the slurp.

Fixes: #2379
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
13 years agoMakefile: include ceph-mds upstart bits in dist tarball
Sage Weil [Fri, 1 Jun 2012 20:46:42 +0000 (13:46 -0700)]
Makefile: include ceph-mds upstart bits in dist tarball

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agotest/: Added object map benchmarking tool
Eleanor Cawthon [Tue, 22 May 2012 00:17:51 +0000 (17:17 -0700)]
test/: Added object map benchmarking tool

omap_bench writes configurable objectmaps to a configurable number
of objects and generates latency statistics.

Signed-off-by: Eleanor Cawthon <eleanor.cawthon@inktank.com>
13 years agodoc: fix autobuild debian source line
Sage Weil [Fri, 1 Jun 2012 19:53:13 +0000 (12:53 -0700)]
doc: fix autobuild debian source line

Signed-off-by: Sage Weil <sage@inktank.com>
13 years agomon: throttle client msgr memory
Sage Weil [Fri, 1 Jun 2012 16:44:09 +0000 (09:44 -0700)]
mon: throttle client msgr memory

Limit the amount of memory that can be consumed by client messages, similar
to the OSD.  Do not limit inter-mon messages.

Fixes: #2495
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
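
A minimal sketch of a byte budget applied to client messages only.  ByteThrottle
and admit are hypothetical names; this version refuses instead of blocking, which
a real messenger throttle may well not do.

```python
class ByteThrottle:
    """Non-blocking byte budget for in-flight message memory."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.in_flight = 0

    def try_get(self, n):
        """Reserve n bytes if the budget allows it."""
        if self.in_flight + n > self.max_bytes:
            return False
        self.in_flight += n
        return True

    def put(self, n):
        """Release n bytes once the message has been handled."""
        self.in_flight -= n


def admit(msg_len, peer_type, client_throttle):
    """Apply the budget to client messages only; mon-to-mon traffic bypasses it."""
    if peer_type == "mon":
        return True
    return client_throttle.try_get(msg_len)
```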
13 years agoMerge remote-tracking branch 'origin/wip-2491'
Yehuda Sadeh [Fri, 1 Jun 2012 16:30:31 +0000 (09:30 -0700)]
Merge remote-tracking branch 'origin/wip-2491'