git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years agoSignificant updates to introduction, getting packages, building from source, installi...
John Wilkins [Tue, 1 May 2012 02:03:53 +0000 (19:03 -0700)]
Significant updates to introduction, getting packages, building from source, installing packages, and creating a cluster.

Signed-off-by: John Wilkins <john.wilkins@dreamhost.com>
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agodoc: document NOIN, NOOUT, NOUP, NODOWN flags and flapping
Sage Weil [Thu, 26 Apr 2012 19:07:11 +0000 (12:07 -0700)]
doc: document NOIN, NOOUT, NOUP, NODOWN flags and flapping

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoRemoved this. As part of restructuring of install to include admin host.
John Wilkins [Wed, 25 Apr 2012 21:52:34 +0000 (14:52 -0700)]
Removed this. As part of restructuring of install to include admin host.

Signed-off-by: John Wilkins <john.wilkins@dreamhost.com>
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoMinor edits. Still WIP.
John Wilkins [Wed, 25 Apr 2012 21:50:15 +0000 (14:50 -0700)]
Minor edits. Still WIP.

Signed-off-by: John Wilkins <john.wilkins@dreamhost.com>
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoThe new files contain S3 APIs and a build from source doc.
John Wilkins [Wed, 25 Apr 2012 21:46:51 +0000 (14:46 -0700)]
The new files contain S3 APIs and a build from source doc.

Signed-off-by: John Wilkins <john.wilkins@dreamhost.com>
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoMinor cleanup.
John Wilkins [Thu, 12 Apr 2012 18:35:45 +0000 (11:35 -0700)]
Minor cleanup.

Signed off by: John Wilkins <john.wilkins@dreamhost.com>

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoAdded a new landing page graphic, and made some minor edits on the landing page.
John Wilkins [Thu, 12 Apr 2012 00:13:21 +0000 (17:13 -0700)]
Added a new landing page graphic, and made some minor edits on the landing page.

Submitted by: John Wilkins <john.wilkins@dreamhost.com>

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoRemoved some files for reorg.
John Wilkins [Wed, 11 Apr 2012 18:26:36 +0000 (11:26 -0700)]
Removed some files for reorg.

Submitted by: John Wilkins <john.wilkins@dreamhost.com>

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoAdded a plug for commercial support. ;)
John Wilkins [Wed, 11 Apr 2012 18:22:59 +0000 (11:22 -0700)]
Added a plug for commercial support. ;)

Submitted by: John Wilkins <john.wilkins@dreamhost.com>

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoBuilding out information architecture. Modified getting involved, why use ceph, etc.
John Wilkins [Wed, 11 Apr 2012 18:21:43 +0000 (11:21 -0700)]
Building out information architecture. Modified getting involved, why use ceph, etc.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoRemove reference to Introduction to RADOS OSDs
John Wilkins [Tue, 10 Apr 2012 19:53:15 +0000 (12:53 -0700)]
Remove reference to Introduction to RADOS OSDs

Submitted by: John Wilkins <john.wilkins@dreamhost.com>

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoAdded introduction to clustered storage and deleted older files that have been moved.
John Wilkins [Tue, 10 Apr 2012 19:46:07 +0000 (12:46 -0700)]
Added introduction to clustered storage and deleted older files that have been moved.

Submitted by: John Wilkins <john.wilkins@dreamhost.com>

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoRestructuring documentation.
John Wilkins [Tue, 10 Apr 2012 17:32:22 +0000 (10:32 -0700)]
Restructuring documentation.

Submitted by: John Wilkins <john.wilkins@dreamhost.com>

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoInitial cut of introduction, getting started, and installing. More to do on installat...
John Wilkins [Wed, 14 Mar 2012 18:58:27 +0000 (11:58 -0700)]
Initial cut of introduction, getting started, and installing. More to do on installation. RADOS gateway to follow.

Signed-off-by: John Wilkins <john.wilkins@dreamhost.com>
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agodoc: move documentation build instructions to doc/dev section
John Wilkins [Tue, 13 Mar 2012 23:48:45 +0000 (16:48 -0700)]
doc: move documentation build instructions to doc/dev section

Signed-off-by: John Wilkins <john.wilkins@dreamhost.com>
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoTreat rename across pools as an error
Dan Mick [Tue, 1 May 2012 22:33:19 +0000 (15:33 -0700)]
Treat rename across pools as an error
Fixes: #2370
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agodoc: add warning about multiple monitors on one machine.
Greg Farnum [Tue, 1 May 2012 23:40:46 +0000 (16:40 -0700)]
doc: add warning about multiple monitors on one machine.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agorgw: normalize bucket/obj before updating cache
Yehuda Sadeh [Tue, 1 May 2012 23:47:32 +0000 (16:47 -0700)]
rgw: normalize bucket/obj before updating cache

Fixes bug #2369. The problem was that sometimes we send the
notification with the un-normalized bucket/obj pair. We
should make sure that we use the canonical name before doing
any cache update.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoPG: Add probe set to HB peers during GetInfo
Samuel Just [Tue, 1 May 2012 00:31:09 +0000 (17:31 -0700)]
PG: Add probe set to HB peers during GetInfo

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoPG: check_new_interval now handles adding new maps to past intervals
Samuel Just [Mon, 30 Apr 2012 22:09:23 +0000 (15:09 -0700)]
PG: check_new_interval now handles adding new maps to past intervals

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agofilestore: allow flusher+sync_flush enable/disable via injectargs
Sage Weil [Tue, 1 May 2012 19:37:20 +0000 (12:37 -0700)]
filestore: allow flusher+sync_flush enable/disable via injectargs

This only affects the decision to queue or do things inline, so it is safe
to change while the filestore is up and running.

Also adjust the #ifdef so that we share a single path through the
code when sync_file_range() is missing.

Fixes: #2368
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agofilestore: fix op_queue_{len,bytes} instrumentation
Sage Weil [Tue, 1 May 2012 18:30:53 +0000 (11:30 -0700)]
filestore: fix op_queue_{len,bytes} instrumentation

(re)set these in logger when they actually change.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'wip-throttle'
Sage Weil [Tue, 1 May 2012 17:49:36 +0000 (10:49 -0700)]
Merge branch 'wip-throttle'

Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agothrottle: count get_or_fail success/failure explicitly
Sage Weil [Tue, 1 May 2012 17:47:05 +0000 (10:47 -0700)]
throttle: count get_or_fail success/failure explicitly

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: pg creation calc_priors_during() should count primary as up
Sage Weil [Tue, 1 May 2012 17:39:09 +0000 (10:39 -0700)]
osd: pg creation calc_priors_during() should count primary as up

We only want to include down osds if *all* of the prior acting osds are
down.  If osd->whoami is one of them, then we're okay.

For example, if osd.13 is down, then the below should be satisfied that
osd.14 (osd->whoami) is alive:

2012-04-27 10:46:38.746681 7f5258a63700 15 osd.14 27 calc_priors_during 6.5 [9,25)
2012-04-27 10:46:38.746688 7f5258a63700 20 osd.14 27   6.5 in epoch 9 was [13,14]
2012-04-27 10:46:38.746695 7f5258a63700 20 osd.14 27   6.5 in epoch 10 was [13,14]
2012-04-27 10:46:38.746701 7f5258a63700 20 osd.14 27   6.5 in epoch 11 was [13,14]
2012-04-27 10:46:38.746709 7f5258a63700 20 osd.14 27   6.5 in epoch 12 was [13,14]
2012-04-27 10:46:38.746715 7f5258a63700 20 osd.14 27   6.5 in epoch 13 was [13,14]
2012-04-27 10:46:38.746722 7f5258a63700 20 osd.14 27   6.5 in epoch 14 was [13,14]
2012-04-27 10:46:38.746729 7f5258a63700 20 osd.14 27   6.5 in epoch 15 was [14]
2012-04-27 10:46:38.746735 7f5258a63700 20 osd.14 27   6.5 in epoch 16 was [14]
2012-04-27 10:46:38.746742 7f5258a63700 20 osd.14 27   6.5 in epoch 17 was [14]
2012-04-27 10:46:38.746748 7f5258a63700 20 osd.14 27   6.5 in epoch 18 was [13,14]
2012-04-27 10:46:38.746755 7f5258a63700 20 osd.14 27   6.5 in epoch 19 was [13,14]
2012-04-27 10:46:38.746762 7f5258a63700 20 osd.14 27   6.5 in epoch 20 was [13,14]
2012-04-27 10:46:38.746768 7f5258a63700 20 osd.14 27   6.5 in epoch 21 was [13,14]
2012-04-27 10:46:38.746775 7f5258a63700 20 osd.14 27   6.5 in epoch 22 was [14]
2012-04-27 10:46:38.746781 7f5258a63700 20 osd.14 27   6.5 in epoch 23 was [14]
2012-04-27 10:46:38.746788 7f5258a63700 20 osd.14 27   6.5 in epoch 24 was [14]
2012-04-27 10:46:38.746790 7f5258a63700 10 osd.14 27 calc_priors_during 6.5 [9,25) = 13

In that case, it wasn't, and the pg creation was blocked.
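
A minimal sketch of the rule described above, in Python rather than the OSD's
C++. The function name mirrors the log output, but the signature, the dict of
acting sets, and the up_osds set are assumptions made for illustration; this is
not the actual Ceph code:

```python
def calc_priors_during(acting_by_epoch, up_osds, whoami):
    """Return the set of OSDs that must be consulted before creating the PG.

    acting_by_epoch: dict mapping epoch -> list of acting OSD ids (hypothetical input)
    up_osds: set of OSD ids currently marked up (hypothetical input)
    whoami: id of the OSD running this code; it always counts as up
    """
    priors = set()
    for epoch in sorted(acting_by_epoch):
        acting = acting_by_epoch[epoch]
        # The local OSD is alive by definition, whatever the map says.
        up = [osd for osd in acting if osd == whoami or osd in up_osds]
        # Up peers (other than ourselves) are worth querying.
        priors.update(osd for osd in up if osd != whoami)
        # Only when *all* prior acting OSDs are down do they block creation.
        if acting and not up:
            priors.update(acting)
    return priors


# Acting sets from the log above: osd.13 is down, osd.14 is whoami.
acting = {e: [13, 14] for e in list(range(9, 15)) + list(range(18, 22))}
acting.update({e: [14] for e in list(range(15, 18)) + list(range(22, 25))})
result = calc_priors_during(acting, up_osds=set(), whoami=14)
```

With the acting sets from the log, the sketch returns an empty set, so pg
creation would not be blocked on osd.13.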

Fixes: #2355
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agothrottle: note current value and max in perfcounters
Sage Weil [Tue, 1 May 2012 16:10:52 +0000 (09:10 -0700)]
throttle: note current value and max in perfcounters

This exposes a snapshot of the current Throttle value and limit.
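
A rough sketch of what such a snapshot could look like, combined with explicit
get_or_fail success/failure counts. The class layout and the counter names are
hypothetical and do not claim to match Ceph's actual perfcounter keys:

```python
class ThrottleSketch:
    """Toy throttle that tracks its current value, limit, and outcomes."""

    def __init__(self, name, maximum):
        self.name = name
        self.maximum = maximum
        self.current = 0
        # Hypothetical counter names, chosen for this example only.
        self.counters = {"get_or_fail_success": 0, "get_or_fail_fail": 0}

    def get_or_fail(self, count=1):
        """Take `count` units if they fit under the limit, else fail at once."""
        if self.current + count > self.maximum:
            self.counters["get_or_fail_fail"] += 1
            return False
        self.current += count
        self.counters["get_or_fail_success"] += 1
        return True

    def put(self, count=1):
        """Return `count` units to the throttle."""
        self.current -= count

    def snapshot(self):
        """Expose the current value and limit alongside the counters."""
        return {"val": self.current, "max": self.maximum, **self.counters}


t = ThrottleSketch("dispatch_throttler-client", maximum=2)
first = t.get_or_fail()
second = t.get_or_fail()
third = t.get_or_fail()  # over the limit, so this one fails
stats = t.snapshot()
```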

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago.gitignore: ceph-kdump-copy
Sage Weil [Tue, 1 May 2012 02:16:28 +0000 (19:16 -0700)]
.gitignore: ceph-kdump-copy

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge remote-tracking branch 'gh/wip-ceph-kdump-copy'
Sage Weil [Tue, 1 May 2012 00:27:47 +0000 (17:27 -0700)]
Merge remote-tracking branch 'gh/wip-ceph-kdump-copy'

13 years agoosd: add is_unmanaged_snaps_mode() to pg_pool_t; use more consistently
Greg Farnum [Wed, 25 Apr 2012 20:07:42 +0000 (13:07 -0700)]
osd: add is_unmanaged_snaps_mode() to pg_pool_t; use more consistently

Create an is_unmanaged_snaps_mode() function to parallel
is_pool_snaps_mode(), and replace all the checks directly referencing
removed_snaps or snaps with calls to these functions.
Fixes #2345.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agopick_address: don't bother checking struct ifaddrs which have a null ifa_addr
Greg Farnum [Thu, 26 Apr 2012 01:17:53 +0000 (18:17 -0700)]
pick_address: don't bother checking struct ifaddrs which have a null ifa_addr

I assume that's the localhost interface or similar.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoMerge remote-tracking branch 'gh/wip-2352'
Sage Weil [Tue, 1 May 2012 00:22:52 +0000 (17:22 -0700)]
Merge remote-tracking branch 'gh/wip-2352'

Reviewed-by: Sage Weil <sage@newdream.net>
13 years agomsgr: include msgr name in dispatch_throttler name
Sage Weil [Mon, 30 Apr 2012 23:33:08 +0000 (16:33 -0700)]
msgr: include msgr name in dispatch_throttler name

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomsgr: name messengers
Sage Weil [Mon, 30 Apr 2012 23:31:21 +0000 (16:31 -0700)]
msgr: name messengers

Give each Messenger a logical name describing its role.  For instance, the
OSD will have client, cluster, and heartbeat messengers.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agothrottle: report stats via perfcounter
Sage Weil [Mon, 30 Apr 2012 23:25:46 +0000 (16:25 -0700)]
throttle: report stats via perfcounter

Fixes: #2358
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoglobal_init: don't fail out if there is no default config.
Greg Farnum [Mon, 30 Apr 2012 22:06:17 +0000 (15:06 -0700)]
global_init: don't fail out if there is no default config.

There are plenty of scenarios where the user doesn't need a config file.
Instead, just print a warning and let things move on.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoglobal: fix incorrect CINIT flag.
Greg Farnum [Mon, 30 Apr 2012 21:10:35 +0000 (14:10 -0700)]
global: fix incorrect CINIT flag.

There is nobody responding to CLOSE_STDERR, but this block sure looks
like it should be doing so. Fix that!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agothrottle: feed cct, name, and add logging
Sage Weil [Mon, 30 Apr 2012 17:42:39 +0000 (10:42 -0700)]
throttle: feed cct, name, and add logging

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoMerge tag 'v0.46'
Sage Weil [Mon, 30 Apr 2012 20:54:29 +0000 (13:54 -0700)]
Merge tag 'v0.46'

v0.46

13 years agoosdmap: do not dereference NULL entity_addr_t pointer in addr accessors
Sage Weil [Mon, 30 Apr 2012 20:36:37 +0000 (13:36 -0700)]
osdmap: do not dereference NULL entity_addr_t pointer in addr accessors

These may be NULL if we expand the addr vectors but haven't ever stored an
address yet.  Check for NULL and return a reference to a blank
entity_addr_t as needed.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoOSD: add different config options for map bl caches
Samuel Just [Mon, 30 Apr 2012 17:58:32 +0000 (10:58 -0700)]
OSD: add different config options for map bl caches

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agomon: fix nion -> noin typo
Sage Weil [Mon, 30 Apr 2012 18:10:58 +0000 (11:10 -0700)]
mon: fix nion -> noin typo

Thanks Greg!

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'wip-pi'
Sage Weil [Mon, 30 Apr 2012 18:12:26 +0000 (11:12 -0700)]
Merge branch 'wip-pi'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agov0.46 v0.46
Sage Weil [Mon, 30 Apr 2012 04:21:11 +0000 (21:21 -0700)]
v0.46

13 years agolibrbd: use unique error code for image removal failures
Josh Durgin [Fri, 27 Apr 2012 18:20:59 +0000 (11:20 -0700)]
librbd: use unique error code for image removal failures

This allows the rbd tool to provide a useful error message, instead of
compounding more possible causes into one error code.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agorun_xfstests.sh: drop #62
Sage Weil [Mon, 30 Apr 2012 16:48:27 +0000 (09:48 -0700)]
run_xfstests.sh: drop #62

Until #2359 is resolved.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosdmap: fix identify_osd() and find_osd_on_ip()
Sage Weil [Sun, 29 Apr 2012 17:10:41 +0000 (10:10 -0700)]
osdmap: fix identify_osd() and find_osd_on_ip()

In 313c1566d3b649ef81fcdc722678d77dccfa888f we switched to using the
get_addr() accessor methods, which assert that the osd exists.  Check that
before calling.

Fixes: #2361
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: keep pgs locked during handle_osd_map dance
Sage Weil [Sun, 29 Apr 2012 15:17:06 +0000 (08:17 -0700)]
osd: keep pgs locked during handle_osd_map dance

Currently we drop and retake locks during handle_osd_map calls to
advance_map and activate_map.  Instead, take them all once, and hold them.
This avoids leaving dirty in-core state in the PG without the lock held.

This will clearly go away as soon as the map threading stuff is redone.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: drop obsolete osd/PG.h #includes
Sage Weil [Sun, 29 Apr 2012 16:00:09 +0000 (09:00 -0700)]
mon: drop obsolete osd/PG.h #includes

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: set dirty flags on rewind_divergent_log
Sage Weil [Sun, 29 Apr 2012 15:11:06 +0000 (08:11 -0700)]
osd: set dirty flags on rewind_divergent_log

Make sure we record any rewind_divergent_log.  In the activate case, this
will happen anyway, but mark it dirty here for correctness/completeness.

The merge_log case might be a bug.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: use dirty flags in activate(), merge_log()
Sage Weil [Sun, 29 Apr 2012 15:03:12 +0000 (08:03 -0700)]
osd: use dirty flags in activate(), merge_log()

These are all called from within the state machine, so we can simply set
the dirty flags.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: fix nested transaction in all_activated_and_committed()
Sage Weil [Sun, 29 Apr 2012 14:59:44 +0000 (07:59 -0700)]
osd: fix nested transaction in all_activated_and_committed()

all_activated_and_committed() is called from _activate_committed(), called
from an objectstore completion, and also from the state machine, which is
part of a larger transaction.

Instead, set dirty_info, and build/apply a transaction in the caller
(the completion) as needed.  Fixes part of #2360.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: use PG::write_if_dirty() helper
Sage Weil [Sun, 29 Apr 2012 14:57:10 +0000 (07:57 -0700)]
osd: use PG::write_if_dirty() helper

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: do not merge history on query
Sage Weil [Sun, 29 Apr 2012 05:32:08 +0000 (22:32 -0700)]
osd: do not merge history on query

We shouldn't modify the local notion of the history without recording it to
disk.  And we (probably) also don't need to do that at all on query.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: dirty_info if history.merge updated anything
Sage Weil [Sun, 29 Apr 2012 05:31:23 +0000 (22:31 -0700)]
osd: dirty_info if history.merge updated anything

In proc_replica_info and proc_primary_info, we may or may not update
the pg_info_t.  If we do, set dirty_info, so that it will be recorded.
Same goes for when the primary pushes out updated stats to us.

Also, do not write a purged_snaps() update directly; rely on the caller
to write out dirty info.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: write dirty info on handle info, notify, log
Sage Weil [Sun, 29 Apr 2012 05:17:06 +0000 (22:17 -0700)]
osd: write dirty info on handle info, notify, log

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: skip scrub scheduling if we aren't up
Sage Weil [Sun, 29 Apr 2012 03:58:29 +0000 (20:58 -0700)]
osd: skip scrub scheduling if we aren't up

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: fix dirty_info check for advance/activate paths
Sage Weil [Sun, 29 Apr 2012 03:57:02 +0000 (20:57 -0700)]
osd: fix dirty_info check for advance/activate paths

Previously we would check and write dirty_info *without the pg lock* after
doing the advance and activate map calls.  This was unlikely to race with
anything because the queues were drained, but definitely not right.

Instead, do the write in activate_map, or explicitly if activate_map is
not called (so that we record our progress after handling maps when we are
not up).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'next'
Sage Weil [Sun, 29 Apr 2012 01:18:41 +0000 (18:18 -0700)]
Merge branch 'next'

13 years agorun_seed_to.sh: clean out merge cruft
Sage Weil [Sun, 29 Apr 2012 01:12:20 +0000 (18:12 -0700)]
run_seed_to.sh: clean out merge cruft

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agolog: do not set on_exit() callback for libraries
Sage Weil [Sun, 22 Apr 2012 21:35:39 +0000 (14:35 -0700)]
log: do not set on_exit() callback for libraries

Set this up in either global_init() or common_init_finish(), both opportune
times that occur after config parsing has happened and the user has the
option to modify this behavior.  The exception would be libraries like
librados, which can't use rados_conf_* to enable this.  Arguably flush
functionality should be exposed through the librados API directly, instead
of futzing with on_exit().

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge remote branch 'gh/wip-filestore-misc'
Sage Weil [Sat, 28 Apr 2012 23:25:31 +0000 (16:25 -0700)]
Merge remote branch 'gh/wip-filestore-misc'

Conflicts:
src/test/filestore/run_seed_to.sh

13 years agoMerge remote branch 'gh/wip-2353'
Sage Weil [Sat, 28 Apr 2012 22:53:35 +0000 (15:53 -0700)]
Merge remote branch 'gh/wip-2353'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoosd: always share past_intervals
Sage Weil [Sat, 28 Apr 2012 22:49:40 +0000 (15:49 -0700)]
osd: always share past_intervals

Share past intervals when starting up new replicas.  This can happen via
an MOSDPGInfo or an MOSDPGLog message.

Fix up get_or_create_pg() so the past_intervals arg is required (and a ref,
like the other args). Fix doxygen comment.

Now the only time generate_past_intervals() should do any work is when
upgrading old clusters, during pg creation, and (possibly) during pg
split (when that is fully implemented).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'wip-osdmap'
Sage Weil [Sat, 28 Apr 2012 22:25:20 +0000 (15:25 -0700)]
Merge branch 'wip-osdmap'

Conflicts:
src/mon/PGMonitor.cc
src/osd/OSDMap.h

13 years agofix file_layout.sh layouts test
Sage Weil [Sat, 28 Apr 2012 21:52:56 +0000 (14:52 -0700)]
fix file_layout.sh layouts test

preferred_osd is now gone.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'wip-mon'
Sage Weil [Sat, 28 Apr 2012 21:48:51 +0000 (14:48 -0700)]
Merge branch 'wip-mon'

Reviewed-by: Gregory Farnum <gregory.farnum@dreamhost.com>
13 years agomon: 'osd [un]set noin'
Sage Weil [Sat, 28 Apr 2012 21:48:26 +0000 (14:48 -0700)]
mon: 'osd [un]set noin'

Missed this one.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'next'
Sage Weil [Sat, 28 Apr 2012 21:47:53 +0000 (14:47 -0700)]
Merge branch 'next'

13 years agoosd: set dirty_info in generate_past_intervals
Sage Weil [Sat, 28 Apr 2012 14:37:49 +0000 (07:37 -0700)]
osd: set dirty_info in generate_past_intervals

This ensures that we save our work.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: fill in past intervals during advance_map
Sage Weil [Sat, 28 Apr 2012 14:37:15 +0000 (07:37 -0700)]
osd: fill in past intervals during advance_map

If ceph-osd is way behind, we will advance through past maps before we
mark ourselves up.  This avoids the slow recalculation once we are up, and
the ensuing badness.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: drop useless PG::fulfill_info()
Sage Weil [Sat, 28 Apr 2012 05:09:00 +0000 (22:09 -0700)]
osd: drop useless PG::fulfill_info()

There is a nice symmetry there with fulfill_log(), but it is a short
function with a single caller that mostly just forces us to copy a bunch
of data structures around unnecessarily.  Drop it.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: share past intervals with notifies
Sage Weil [Sat, 28 Apr 2012 05:08:03 +0000 (22:08 -0700)]
osd: share past intervals with notifies

Send past_intervals along with pg_info_t on every notify.  The reasoning
here is as follows:

 - we already have the state in memory
 - if we don't send it, and the primary doesn't have it, it will
   recalculate it by reading/decoding many previous maps from disk
 - for a highly-tortured cluster, I see past_intervals on the order of
   ~6 KB, times 600 pgs means ~2.5 MB sent for every activate_map(). for
   comparison, the same cluster would need to read and decode ~1 GB of
   maps to recalculate the same info.
 - for healthy clusters, the data is small, and costs little.
 - for unhealthy clusters, the data is large, but most useful.

In theory we could set a threshold so that we don't send it if it is
large, but allow the primary to query it explicitly.  I doubt it's worth
the complexity.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: only generate missing intervals in generate_past_intervals
Sage Weil [Fri, 27 Apr 2012 23:01:43 +0000 (16:01 -0700)]
osd: only generate missing intervals in generate_past_intervals

We can (currently) get into a situation where we don't have the full
history back to last_epoch_clean because non-primaries record past
intervals but don't initially have the full history, resulting in a partial
recent history.

If this happens, only fill in what's missing; no need to rebuild the recent
parts too.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: include past_intervals in pg debug printout
Sage Weil [Fri, 27 Apr 2012 22:16:23 +0000 (15:16 -0700)]
osd: include past_intervals in pg debug printout

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: fix check for whether to recalculate past_intervals
Sage Weil [Fri, 27 Apr 2012 21:30:17 +0000 (14:30 -0700)]
osd: fix check for whether to recalculate past_intervals

We may not recalculate all the way back to last_interval_clean due to
the oldest_map floor.  Figure out what we want and could calculate before
deciding whether what we have is insufficient.

Also, print something if we discard and recalculate so it is clear what is
happening and why.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: PG::Interval -> pg_interval_t
Sage Weil [Fri, 27 Apr 2012 16:51:30 +0000 (09:51 -0700)]
osd: PG::Interval -> pg_interval_t

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'next' into t
Sage Weil [Sat, 28 Apr 2012 14:46:23 +0000 (07:46 -0700)]
Merge branch 'next' into t

13 years agoStop rebuild of libcommon.la on "make dist"
Dan Mick [Sat, 28 Apr 2012 01:04:34 +0000 (18:04 -0700)]
Stop rebuild of libcommon.la on "make dist"

Fixes: 2356
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agomon: limit size of MOSDMap message sent as reply
Sage Weil [Fri, 27 Apr 2012 04:29:53 +0000 (21:29 -0700)]
mon: limit size of MOSDMap message sent as reply

We may send an MOSDMap as a reply to various requests, including

 - a failure report
 - a boot message
 - a pg_temp message
 - an up_thru message

In these cases, send a single MOSDMap message, but limit how big it gets.
All recipients here are osds, which are smart enough to request more maps
based on the MOSDMap::newest_map field.
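
The exchange described above can be sketched roughly as follows. This is an
illustration in Python, not the monitor's code: the function names, the dict
used as a message, and the max_maps parameter are all hypothetical; only the
newest_map field comes from the commit message.

```python
def build_osdmap_reply(first, newest, max_maps):
    """Build one reply carrying at most max_maps epochs, starting at `first`."""
    last = min(newest, first + max_maps - 1)
    return {
        "epochs": list(range(first, last + 1)),  # maps included in this message
        "newest_map": newest,                    # newest epoch the monitor has
    }


def catch_up(have, mon_newest, max_maps):
    """Simulate an OSD requesting maps until it reaches newest_map."""
    rounds = 0
    while True:
        reply = build_osdmap_reply(have + 1, mon_newest, max_maps)
        if reply["epochs"]:
            have = reply["epochs"][-1]
        rounds += 1
        # The receiver compares what it has against newest_map and asks again.
        if have >= reply["newest_map"]:
            return have, rounds


final_epoch, rounds_needed = catch_up(have=10, mon_newest=35, max_maps=10)
```

Capping each reply keeps a single message bounded, at the cost of extra round
trips for an OSD that is far behind.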

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoceph-object-corpus: revert rewind
Sage Weil [Sat, 28 Apr 2012 14:45:24 +0000 (07:45 -0700)]
ceph-object-corpus: revert rewind

From 92becb696bde7f0aa9687b2fe7505ed1ac9f493b

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosdmap: fix addr dedup check
Sage Weil [Sat, 28 Apr 2012 03:54:50 +0000 (20:54 -0700)]
osdmap: fix addr dedup check

Compare *every* address for a match, or else note that it is (or might be)
different.  Previously, we falsely took diff==0 to mean that all addrs
were definitely equal, which was not necessarily the case.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: fix bad map debug messages
Sage Weil [Sat, 28 Apr 2012 04:48:31 +0000 (21:48 -0700)]
osd: fix bad map debug messages

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoStop rebuild of libcommon.la on "make dist"
Dan Mick [Sat, 28 Apr 2012 01:04:34 +0000 (18:04 -0700)]
Stop rebuild of libcommon.la on "make dist"

Fixes: 2356
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agofilestore: fix error message
Yehuda Sadeh [Fri, 27 Apr 2012 23:05:36 +0000 (16:05 -0700)]
filestore: fix error message

error message was misleading, fixing it.

Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
13 years agofilestore: first lock osd mount point, next detect fs type
Yehuda Sadeh [Fri, 27 Apr 2012 22:46:49 +0000 (15:46 -0700)]
filestore: first lock osd mount point, next detect fs type

Fixes #2353. Problem was that there were (at least) two osd processes
that were racing for the fs detection, which triggered some errors
in the btrfs create/remove snapshot.

Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
13 years agoOSD: use map bl cache pinning during handle_osd_map
Samuel Just [Fri, 27 Apr 2012 17:00:36 +0000 (10:00 -0700)]
OSD: use map bl cache pinning during handle_osd_map

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agosimple_cache.hpp: add pinning
Samuel Just [Fri, 27 Apr 2012 17:00:08 +0000 (10:00 -0700)]
simple_cache.hpp: add pinning

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMerge branch 'next'
Samuel Just [Fri, 27 Apr 2012 21:00:09 +0000 (14:00 -0700)]
Merge branch 'next'

13 years agoFileJournal: simply flush by waiting for completions to empty
Samuel Just [Fri, 27 Apr 2012 04:29:45 +0000 (21:29 -0700)]
FileJournal: simply flush by waiting for completions to empty

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoPG: in GetInfo Notify handler, fix peer_info_requested filter
Samuel Just [Fri, 27 Apr 2012 18:25:19 +0000 (11:25 -0700)]
PG: in GetInfo Notify handler, fix peer_info_requested filter

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMerge branch 'wip-lpg'
Sage Weil [Fri, 27 Apr 2012 04:57:23 +0000 (21:57 -0700)]
Merge branch 'wip-lpg'

Conflicts:
src/osd/OSDMap.h

13 years agoMerge branch 'next'
Sage Weil [Fri, 27 Apr 2012 04:53:36 +0000 (21:53 -0700)]
Merge branch 'next'

13 years agolibrados: test get/set of debug levels
Sage Weil [Fri, 27 Apr 2012 04:51:55 +0000 (21:51 -0700)]
librados: test get/set of debug levels

Also do some sanity checks on the subsystem log level settings.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoconfig: allow {get,set}_val on subsystem debug levels
Sage Weil [Fri, 27 Apr 2012 04:51:23 +0000 (21:51 -0700)]
config: allow {get,set}_val on subsystem debug levels

This allows you to get and set subsystem debug levels via the
normal config access methods.  Among other things, this allows librados
users to set debug levels.

Fixes: #2350
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoOSD.cc: track osdmap refs using an LRU
Samuel Just [Fri, 27 Apr 2012 00:58:59 +0000 (17:58 -0700)]
OSD.cc: track osdmap refs using an LRU

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agocommon/: added templated simple lru implementations
Samuel Just [Wed, 25 Apr 2012 23:58:33 +0000 (16:58 -0700)]
common/: added templated simple lru implementations

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoosdmap: dedup pg_temp
Sage Weil [Thu, 26 Apr 2012 18:12:11 +0000 (11:12 -0700)]
osdmap: dedup pg_temp

We only deal with the case where the entire map is identical, since the
individual items are too small to make the pointer overhead worthwhile.
Too bad.  An in-memory btree-like structure would work better for this.
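
A small sketch of whole-map dedup, assuming a toy map class. Python object
references stand in for shared_ptr<>, and the class and function names are
hypothetical rather than taken from the Ceph source:

```python
class OSDMapSketch:
    """Toy stand-in for an OSDMap; only the pg_temp mapping is modelled."""

    def __init__(self, epoch, pg_temp):
        self.epoch = epoch
        self.pg_temp = pg_temp  # dict: pg id -> list of OSD ids


def dedup_pg_temp(cached, new):
    """Share cached.pg_temp with `new` only if the entire mapping is identical."""
    if new.pg_temp is not cached.pg_temp and new.pg_temp == cached.pg_temp:
        new.pg_temp = cached.pg_temp  # drop the duplicate copy
        return True
    return False


old = OSDMapSketch(100, {"6.5": [13, 14]})
same = OSDMapSketch(101, {"6.5": [13, 14]})
changed = OSDMapSketch(102, {"6.5": [14]})
shared_same = dedup_pg_temp(old, same)
shared_changed = dedup_pg_temp(old, changed)
```

Sharing per-item entries instead would need one pointer per small vector, which
is the overhead the message above says is not worthwhile.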

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosdmap: use shared_ptr<> for pg_temp
Sage Weil [Thu, 26 Apr 2012 18:01:06 +0000 (11:01 -0700)]
osdmap: use shared_ptr<> for pg_temp

This will let us dedup later.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: make map dedup optional
Sage Weil [Thu, 26 Apr 2012 22:50:27 +0000 (15:50 -0700)]
osd: make map dedup optional

On by default.  This trades CPU for memory.  Some might have unlimited RAM
and not care.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: dedup osdmaps when added to the in-memory cache
Sage Weil [Wed, 25 Apr 2012 23:40:11 +0000 (16:40 -0700)]
osd: dedup osdmaps when added to the in-memory cache

When we add an OSDMap to our in-memory cache, dedup against an existing map
at a nearby epoch.

Signed-off-by: Sage Weil <sage@newdream.net>