git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
12 years agoceph.spec.in: add missing %{_libdir}/rados-classes/libcls_* files 419/head
Danny Al-Gaaf [Wed, 10 Jul 2013 16:12:05 +0000 (18:12 +0200)]
ceph.spec.in: add missing %{_libdir}/rados-classes/libcls_* files

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoMerge pull request #400 from ceph/wip-mon-newsync
Sage Weil [Wed, 10 Jul 2013 00:12:28 +0000 (17:12 -0700)]
Merge pull request #400 from ceph/wip-mon-newsync

simpler mon sync

Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agoMerge branch 'wip-4982-4983-oloc-rebase'
David Zafman [Tue, 9 Jul 2013 23:24:22 +0000 (16:24 -0700)]
Merge branch 'wip-4982-4983-oloc-rebase'

fixes: #4982
fixes: #4983

Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
12 years agoMerge pull request #415 from ceph/rgw-next
Sage Weil [Tue, 9 Jul 2013 22:34:05 +0000 (15:34 -0700)]
Merge pull request #415 from ceph/rgw-next

12 years agomon: do not scrub if scrub is in progress
Sage Weil [Tue, 9 Jul 2013 21:12:15 +0000 (14:12 -0700)]
mon: do not scrub if scrub is in progress

This prevents an assert from unexpected scrub results from the previous
scrub on the leader.

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agounittest_pglog: fix unittest
Sage Weil [Tue, 9 Jul 2013 21:11:37 +0000 (14:11 -0700)]
unittest_pglog: fix unittest

This was broken by the pg_stat_t::reported cleanup.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'master' into wip-4982-4983-oloc-rebase 394/head
David Zafman [Tue, 9 Jul 2013 21:10:42 +0000 (14:10 -0700)]
Merge branch 'master' into wip-4982-4983-oloc-rebase

12 years agolibrados/misc.cc: reverse offset and length on write call
Samuel Just [Tue, 9 Jul 2013 17:35:09 +0000 (10:35 -0700)]
librados/misc.cc: reverse offset and length on write call

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoosd: Add namespace to dump_watchers output
David Zafman [Tue, 9 Jul 2013 06:56:56 +0000 (23:56 -0700)]
osd: Add namespace to dump_watchers output

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoosd: Clean-up redundant use of object_locator_t
David Zafman [Tue, 9 Jul 2013 01:58:12 +0000 (18:58 -0700)]
osd: Clean-up redundant use of object_locator_t

Remove locator arg from get_object_context()/find_object_context()
Remove locator from object_info_t but retain encode format
Remove locator from object_info_t dump output
Remove OLOC_BLANK

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoosd: Add the ability to set capabilities on namespaces
David Zafman [Fri, 28 Jun 2013 22:45:27 +0000 (15:45 -0700)]
osd: Add the ability to set capabilities on namespaces

Parse namespace spec in osd caps and use in is_match()
Add test cases to unit test

feature: #4983 (OSD: namespaces pt 2 (caps))

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoosd: Prepare for nspace match with simpler is_match_all()
David Zafman [Fri, 28 Jun 2013 21:20:23 +0000 (14:20 -0700)]
osd: Prepare for nspace match with simpler is_match_all()

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agotest: Add namespace list test cases to librados test
David Zafman [Tue, 25 Jun 2013 22:08:55 +0000 (15:08 -0700)]
test: Add namespace list test cases to librados test

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agotest: Add namespace test cases to librados tests
David Zafman [Tue, 25 Jun 2013 05:37:50 +0000 (22:37 -0700)]
test: Add namespace test cases to librados tests

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agorados: Add namespace arg (--namespace, -N) to rados command
David Zafman [Wed, 3 Jul 2013 19:32:17 +0000 (12:32 -0700)]
rados: Add namespace arg (--namespace, -N) to rados command

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agolibrados: Add operate()/operate_read() log messages
David Zafman [Fri, 21 Jun 2013 22:19:41 +0000 (15:19 -0700)]
librados: Add operate()/operate_read() log messages

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agolibrados, os, osd, osdc, test: Add support for client specified namespaces
David Zafman [Tue, 11 Jun 2013 01:18:59 +0000 (18:18 -0700)]
librados, os, osd, osdc, test: Add support for client specified namespaces

Add rados_ioctx_namespace_set_key() and librados::IoCtx::namespace_set_key()
Add namespace to admin-daemon operations
Support namespace in osd map command
Add namespace to object_locator_t and hobject_t
Add random namespaces to psim program

Feature: #4982 (OSD: namespaces pt 1 (librados/osd, not caps))

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoMerge branch 'wip-mon-osdmap-trim'
Sage Weil [Tue, 9 Jul 2013 20:43:25 +0000 (13:43 -0700)]
Merge branch 'wip-mon-osdmap-trim'

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoosd: change pg_stat_t::reported from eversion_t to a pair of fields
Sage Weil [Mon, 8 Jul 2013 21:31:29 +0000 (14:31 -0700)]
osd: change pg_stat_t::reported from eversion_t to a pair of fields

This rarely represents an actual eversion_t as the epoch and seq values are
bumped semi-independently to ensure it is always unique.  Break it into
two separate fields to avoid confusion.

Drop now-unused and slightly curious inc() method.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: be smarter about calculating last_epoch_clean lower bound
Sage Weil [Mon, 8 Jul 2013 22:57:48 +0000 (15:57 -0700)]
mon: be smarter about calculating last_epoch_clean lower bound

We need to take PGs whose mapping has not changed in a long time into
account.  For them, the pg state will indicate it was clean at the time of
the report, in which case we can use that as a lower-bound on their actual
latest epoch clean.  If they are not currently clean (at report time), use
the last_epoch_clean value.

Fixes: #5519
Signed-off-by: Sage Weil <sage@inktank.com>
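
A minimal sketch of the lower-bound rule described in the commit above, assuming the monitor then takes the minimum across all PGs. The struct and function names are hypothetical and are not taken from the Ceph tree.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using epoch_t = uint32_t;

// Hypothetical, simplified view of one PG's stat report.
struct PgReport {
  bool clean_at_report;      // was the PG clean when it sent the report?
  epoch_t reported_epoch;    // map epoch at the time of the report
  epoch_t last_epoch_clean;  // last epoch at which the PG was known clean
};

// Per-PG lower bound: a PG that was clean at report time was clean at
// least as recently as the report; otherwise fall back to last_epoch_clean.
inline epoch_t pg_clean_floor(const PgReport& pg) {
  return pg.clean_at_report ? pg.reported_epoch : pg.last_epoch_clean;
}

// Assumed cluster-wide floor: the minimum over all PGs.
// Returns `fallback` when no PG has reported.
inline epoch_t min_last_epoch_clean(const std::vector<PgReport>& pgs,
                                    epoch_t fallback) {
  if (pgs.empty()) return fallback;
  epoch_t floor = pg_clean_floor(pgs.front());
  for (const PgReport& pg : pgs) floor = std::min(floor, pg_clean_floor(pg));
  return floor;
}
```

With this rule, a long-idle but clean PG no longer pins the floor at a stale last_epoch_clean value.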
12 years agoosd: report pg stats to mon at least every N (=500) epochs
Sage Weil [Mon, 8 Jul 2013 20:27:58 +0000 (13:27 -0700)]
osd: report pg stats to mon at least every N (=500) epochs

The mon needs a moderately accurate last_epoch_clean value in order to trim
old osdmaps.  To prevent a PG that hasn't peered or received IO in forever
from preventing this, send pg stats at some minimum frequency.  This will
increase the pg stat report workload for the mon over an idle pool, but
should be no worse than a cluster that is getting actual IO and sees these
updates from normal stat updates.

This makes the reported update a bit more aggressive/useful in that the epoch
is the last map epoch processed by this PG and not just one that is >= the
current interval.  Note that the semantics of this field are pretty useless
at this point.

See #5519

Signed-off-by: Sage Weil <sage@inktank.com>
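
The "at least every N epochs" rule from the commit above could look roughly like the following. This is a sketch only: the constant name and the function are hypothetical, and only the value 500 comes from the commit subject.

```cpp
#include <cstdint>

using epoch_t = uint32_t;

// Hypothetical constant standing in for the "N (=500)" in the commit subject.
constexpr epoch_t kMaxStatReportEpochs = 500;

// True when a PG should publish stats even though nothing else changed,
// because too many map epochs have passed since its last report.
inline bool stats_report_overdue(epoch_t current_epoch,
                                 epoch_t last_reported_epoch,
                                 epoch_t max_gap = kMaxStatReportEpochs) {
  if (current_epoch < last_reported_epoch) return false;  // avoid underflow
  return current_epoch - last_reported_epoch >= max_gap;
}
```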
12 years agomon/OSDMonitor: allow osdmap trimming to be forced via a config option
Sage Weil [Mon, 8 Jul 2013 21:55:44 +0000 (14:55 -0700)]
mon/OSDMonitor: allow osdmap trimming to be forced via a config option

In certain cases the admin may know that it is safe to trim old osdmaps but
a bug or other issue is preventing the Monitor from deciding on its own.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/OSDMonitor: make 'osd crush rm ...' slightly more idempotent
Sage Weil [Tue, 9 Jul 2013 04:09:09 +0000 (21:09 -0700)]
mon/OSDMonitor: make 'osd crush rm ...' slightly more idempotent

This particular failure is easily triggered by the crush_ops.sh
workunit.  Make it a bit less likely to fail.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agodoc/release-notes: v0.66
Sage Weil [Tue, 9 Jul 2013 18:45:34 +0000 (11:45 -0700)]
doc/release-notes: v0.66

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'wip-mon-trim'
Sage Weil [Tue, 9 Jul 2013 18:10:30 +0000 (11:10 -0700)]
Merge branch 'wip-mon-trim'

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon/PaxosService: update docs a bit
Sage Weil [Tue, 9 Jul 2013 05:06:31 +0000 (22:06 -0700)]
mon/PaxosService: update docs a bit

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/PaxosService: inline trim()
Sage Weil [Tue, 9 Jul 2013 05:04:10 +0000 (22:04 -0700)]
mon/PaxosService: inline trim()

This is now trivial; pull it into the caller.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/PaxosService: move paxos_service_trim_max into caller, clean up
Sage Weil [Tue, 9 Jul 2013 05:02:00 +0000 (22:02 -0700)]
mon/PaxosService: move paxos_service_trim_max into caller, clean up

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/PaxosService: simplify paxos_service_trim_min check
Sage Weil [Tue, 9 Jul 2013 04:58:13 +0000 (21:58 -0700)]
mon/PaxosService: simplify paxos_service_trim_min check

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: make service trim_to stateless
Sage Weil [Tue, 9 Jul 2013 04:54:53 +0000 (21:54 -0700)]
mon: make service trim_to stateless

Call get_trim_to() when we need to know how much to trim (if any), and
calculate it then.  No need to keep this in a hidden trim_version
variable and remember to update it.  This drops several helpers and
accessors and makes get_trim_to() a single method that services need to
override.

Signed-off-by: Sage Weil <sage@inktank.com>
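
The stateless design described in the commit above can be illustrated as follows. This is a simplified sketch, not the actual PaxosService class: only get_trim_to() is named in the commit, and every other name here is hypothetical.

```cpp
#include <cstdint>

using version_t = uint64_t;

// Hypothetical base class; the real PaxosService has many more members.
class ServiceSketch {
 public:
  virtual ~ServiceSketch() = default;

  // Services override this to say how far they are willing to trim.
  // Returning 0 means "nothing to trim right now".
  virtual version_t get_trim_to() const { return 0; }

  // Computes the trim target on demand instead of caching it in a member.
  // Returns the number of versions trimmed.
  version_t maybe_trim() {
    version_t trim_to = get_trim_to();
    if (trim_to <= first_committed) return 0;
    version_t trimmed = trim_to - first_committed;
    first_committed = trim_to;
    return trimmed;
  }

  version_t first_committed = 1;
};

// Hypothetical service that derives its target from other state it owns.
class OsdmapServiceSketch : public ServiceSketch {
 public:
  version_t min_last_epoch_clean = 0;
  version_t get_trim_to() const override { return min_last_epoch_clean; }
};
```

Because the target is recomputed on every call, there is no cached value that can go stale when the underlying state changes.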
12 years agomon/PaxosService: pass trim target into encode_trim()
Sage Weil [Tue, 9 Jul 2013 18:09:44 +0000 (11:09 -0700)]
mon/PaxosService: pass trim target into encode_trim()

This will help us in a few patches...

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: preserve last_committed_floor across sync 400/head
Sage Weil [Tue, 9 Jul 2013 17:55:05 +0000 (10:55 -0700)]
mon: preserve last_committed_floor across sync

Add a paranoid check to prevent us from forgetting how far ahead our
last_committed was when we sync.  This prevents an ill-timed forced-sync
from allowing paxos to warp back in time.

This should never happen unless there is a perfect storm of bad admin
decisions and/or bugs, but we guard against it anyway.

See: #5256
Signed-off-by: Sage Weil <sage@inktank.com>
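
One way to picture the paranoid check from the commit above, assuming the floor is recorded before the local store is cleared for sync. The struct and helper names are hypothetical; only last_committed and last_committed_floor appear in the commit text.

```cpp
#include <algorithm>
#include <cstdint>

using version_t = uint64_t;

// Hypothetical stand-in for the monitor's persistent paxos state.
struct MonStoreSketch {
  version_t last_committed = 0;
  version_t last_committed_floor = 0;
};

// Before a sync clears local state, remember how far ahead we were.
// The floor only ever moves forward.
inline void prepare_for_sync(MonStoreSketch& s) {
  s.last_committed_floor = std::max(s.last_committed_floor, s.last_committed);
  s.last_committed = 0;
}

// After the sync, a last_committed below the floor means paxos would
// warp back in time, so the result must not be accepted.
inline bool sync_went_back_in_time(const MonStoreSketch& s) {
  return s.last_committed < s.last_committed_floor;
}
```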
12 years agomon: no need to reset sync state on restart
Sage Weil [Tue, 9 Jul 2013 17:52:48 +0000 (10:52 -0700)]
mon: no need to reset sync state on restart

If we are in or forcing a sync, we can leave these there until the sync
completes.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: drop single-use is_sync_on_going() check
Sage Weil [Fri, 5 Jul 2013 23:46:38 +0000 (16:46 -0700)]
mon: drop single-use is_sync_on_going() check

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: rev the internal mon protocol
Sage Weil [Tue, 9 Jul 2013 00:51:18 +0000 (17:51 -0700)]
mon: rev the internal mon protocol

This captures the new sync.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/MonitorDBStore: drop unused single prefix synchronizer
Sage Weil [Fri, 5 Jul 2013 19:29:36 +0000 (12:29 -0700)]
mon/MonitorDBStore: drop unused single prefix synchronizer

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: add --force-sync startup option
Sage Weil [Fri, 5 Jul 2013 19:14:13 +0000 (12:14 -0700)]
mon: add --force-sync startup option

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/Paxos: move consistent check into Paxos::init()
Sage Weil [Fri, 5 Jul 2013 19:11:17 +0000 (12:11 -0700)]
mon/Paxos: move consistent check into Paxos::init()

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/Paxos: remove unnecessary trim enable/disable
Sage Weil [Fri, 5 Jul 2013 17:36:54 +0000 (10:36 -0700)]
mon/Paxos: remove unnecessary trim enable/disable

The sync no longer cares if we trim Paxos versions as we go, as long as we
don't trim so fast that we fall behind between GET_CHUNK messages, which
we can consider a tuning problem.

Remove this extra complexity!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/Paxos: config min paxos txns to keep separately
Sage Weil [Fri, 5 Jul 2013 17:34:46 +0000 (10:34 -0700)]
mon/Paxos: config min paxos txns to keep separately

We were using paxos_max_join_drift to control the minimum number of
paxos transactions to keep around.  Instead, make this explicit, and
separate from the join drift.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: implement a simpler sync
Sage Weil [Tue, 9 Jul 2013 01:13:31 +0000 (18:13 -0700)]
mon: implement a simpler sync

The previous sync implementation was highly stateful and very complex.
This made it very hard to understand and to debug, and there were bugs
still lurking in the timeout code (at least).

Replace it with something much simpler:

 - sync providers are almost stateless.  they keep an iterator, identified
   by a unique cookie, that times out in a simple way.
 - sync requesters sync from whomever they fancy.  namely anyone with newer
   committed paxos state.

There are a few extra fields that might allow sync continuation later, but
this is complex and not necessary at this point.

Signed-off-by: Sage Weil <sage@inktank.com>
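
The "almost stateless" provider described in the commit above can be sketched like this. It is an illustration under stated assumptions, not the monitor's code: the class, its methods, the in-memory key list, and the renew-on-request timeout policy are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sync provider: the only per-requester state is one
// iterator (cursor) per cookie, and each cursor simply times out.
class SyncProviderSketch {
 public:
  SyncProviderSketch(std::vector<std::string> keys, double timeout_sec)
      : keys_(std::move(keys)), timeout_sec_(timeout_sec) {}

  // Begin a sync and hand back the cookie that identifies its iterator.
  uint64_t start(double now) {
    uint64_t cookie = next_cookie_++;
    cursors_.emplace(cookie, Cursor{0, now + timeout_sec_});
    return cookie;
  }

  // Fetch the next chunk. Returns false if the cookie is unknown or has
  // expired; the requester then starts over, possibly with another peer.
  bool get_chunk(uint64_t cookie, double now, std::size_t max_keys,
                 std::vector<std::string>* out, bool* last) {
    expire(now);
    auto it = cursors_.find(cookie);
    if (it == cursors_.end()) return false;
    Cursor& c = it->second;
    out->clear();
    while (c.next < keys_.size() && out->size() < max_keys)
      out->push_back(keys_[c.next++]);
    *last = (c.next == keys_.size());
    if (*last)
      cursors_.erase(it);                 // finished: forget the iterator
    else
      c.expires_at = now + timeout_sec_;  // assumed: each request renews it
    return true;
  }

 private:
  struct Cursor {
    std::size_t next;    // index of the next key to send
    double expires_at;   // absolute time at which the cursor is dropped
  };

  // Drop every cursor whose deadline has passed.
  void expire(double now) {
    for (auto it = cursors_.begin(); it != cursors_.end();) {
      if (it->second.expires_at <= now)
        it = cursors_.erase(it);
      else
        ++it;
    }
  }

  std::vector<std::string> keys_;  // stand-in for the monitor's key/value store
  double timeout_sec_;
  uint64_t next_cookie_ = 1;
  std::map<uint64_t, Cursor> cursors_;
};
```

A requester that loses its cookie has nothing to clean up on the provider side: it restarts the sync against any peer with newer committed state.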
12 years agomon/PGMonitor: cleanup: use const strings for pgmap prefixes
Sage Weil [Tue, 9 Jul 2013 00:01:06 +0000 (17:01 -0700)]
mon/PGMonitor: cleanup: use const strings for pgmap prefixes

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agorgw: warn on the lack of curl_multi_wait() 415/head
Yehuda Sadeh [Tue, 9 Jul 2013 16:19:16 +0000 (09:19 -0700)]
rgw: warn on the lack of curl_multi_wait()

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agorgw: fix args parsing
Yehuda Sadeh [Tue, 9 Jul 2013 07:35:00 +0000 (00:35 -0700)]
rgw: fix args parsing

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoos: Add missing pool to hobject_t::dump() and hobject_t::decode()
David Zafman [Mon, 8 Jul 2013 22:53:38 +0000 (15:53 -0700)]
os: Add missing pool to hobject_t::dump() and hobject_t::decode()

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoos: Remove unused hobject_t::set_filestore_key()
David Zafman [Wed, 3 Jul 2013 18:27:11 +0000 (11:27 -0700)]
os: Remove unused hobject_t::set_filestore_key()

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agolibrados, osdc: Refactor IoCtxImpl to use operate()/operate_read()
David Zafman [Thu, 13 Jun 2013 03:51:09 +0000 (20:51 -0700)]
librados, osdc: Refactor IoCtxImpl to use operate()/operate_read()

Add ObjectOperation::write() that includes len instead of using bufferlist length
Have selfmanaged_snap_rollback_object() use mutate()

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoTestRados: Output error for improper usage instead of Floating Point Exception
David Zafman [Wed, 3 Jul 2013 22:24:45 +0000 (15:24 -0700)]
TestRados: Output error for improper usage instead of Floating Point Exception

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoosd: Fix object_locator_t::get_pool() return type
David Zafman [Mon, 10 Jun 2013 21:55:04 +0000 (14:55 -0700)]
osd: Fix object_locator_t::get_pool() return type

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agolibrados: Fix lock names
David Zafman [Tue, 25 Jun 2013 06:15:45 +0000 (23:15 -0700)]
librados: Fix lock names

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agopsim.cc: Fix comment on how to create .ceph_osdmap
David Zafman [Mon, 10 Jun 2013 22:58:44 +0000 (15:58 -0700)]
psim.cc: Fix comment on how to create .ceph_osdmap

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agoos: Code conformance in LFNIndex.cc
David Zafman [Fri, 7 Jun 2013 20:52:04 +0000 (13:52 -0700)]
os: Code conformance in LFNIndex.cc

Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agorgw: call appropriate curl calls for waiting on sockets
Yehuda Sadeh [Tue, 9 Jul 2013 01:55:19 +0000 (18:55 -0700)]
rgw: call appropriate curl calls for waiting on sockets

If libcurl supports curl_multi_wait() then use it, otherwise
use select() and force a timeout, even if it has been disabled.
Otherwise we may wait forever for events that we can't wait for
as select() only uses fds < 1024.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
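
The select() fallback described in the commit above can be sketched with POSIX calls only. This is not the rgw code and it does not call libcurl; the function name and the forced-timeout constant are hypothetical. The point it illustrates is that a descriptor at or above FD_SETSIZE cannot be placed in an fd_set, so a finite timeout is what keeps the caller from blocking forever.

```cpp
#include <sys/select.h>
#include <sys/time.h>

// Hypothetical forced timeout used when the caller asked for "no timeout".
constexpr long kForcedTimeoutMs = 100;

// Wait until `fd` is readable or the timeout expires.
// Returns 1 if readable, 0 on timeout, -1 on select() error.
// A descriptor that select() cannot watch is simply not added to the set,
// so the call degrades to a bounded sleep and the caller polls again.
inline int wait_with_forced_timeout(int fd, long timeout_ms) {
  if (timeout_ms <= 0) timeout_ms = kForcedTimeoutMs;  // never wait forever

  fd_set rfds;
  FD_ZERO(&rfds);
  int nfds = 0;
  if (fd >= 0 && fd < FD_SETSIZE) {
    FD_SET(fd, &rfds);
    nfds = fd + 1;
  }

  struct timeval tv;
  tv.tv_sec = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;

  int r = select(nfds, &rfds, nullptr, nullptr, &tv);
  if (r < 0) return -1;
  return r > 0 ? 1 : 0;
}
```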
12 years agoconfigure.ac: detect whether libcurl supports curl_multi_wait()
Yehuda Sadeh [Tue, 9 Jul 2013 01:54:23 +0000 (18:54 -0700)]
configure.ac: detect whether libcurl supports curl_multi_wait()

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoMerge branch 'next'
Gary Lowell [Tue, 9 Jul 2013 06:17:50 +0000 (23:17 -0700)]
Merge branch 'next'

12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Tue, 9 Jul 2013 05:17:51 +0000 (22:17 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agomon/PaxosService: prevent reads until initial service commit is done
Sage Weil [Mon, 8 Jul 2013 17:49:28 +0000 (10:49 -0700)]
mon/PaxosService: prevent reads until initial service commit is done

Do not process reads (or, by PaxosService::dispatch() implication, writes)
until we have committed the initial service state.  This avoids things like
EPERM due to missing keys when we race with mon creation, triggered by
teuthology tests doing their health check after startup.

Fixes: #5515
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
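
A rough sketch of the gating idea in the commit above, assuming that operations arriving too early are queued and replayed after the first commit. The commit does not say how waiters are handled, so the queueing, the class, and its method names are all hypothetical.

```cpp
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

using version_t = uint64_t;

// Hypothetical service front end that holds back operations until the
// initial service state has been committed.
class ServiceGateSketch {
 public:
  // Readable only once something has been committed.
  bool is_readable() const { return last_committed_ > 0; }

  // Run the operation now if possible, otherwise park it.
  void dispatch(std::function<void()> op) {
    if (!is_readable()) {
      waiting_.push_back(std::move(op));
      return;
    }
    op();
  }

  // Record a commit and replay anything that was parked.
  void on_commit(version_t v) {
    last_committed_ = v;
    std::deque<std::function<void()>> ready;
    ready.swap(waiting_);
    for (auto& op : ready) op();
  }

 private:
  version_t last_committed_ = 0;
  std::deque<std::function<void()>> waiting_;
};
```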
12 years agomon/PaxosService: unwind should_trim()
Sage Weil [Tue, 9 Jul 2013 04:44:05 +0000 (21:44 -0700)]
mon/PaxosService: unwind should_trim()

Inline the single-caller helper.  This will help us in a moment...

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/PaxosService: unwind service_should_trim() helper
Sage Weil [Tue, 9 Jul 2013 04:41:55 +0000 (21:41 -0700)]
mon/PaxosService: unwind service_should_trim() helper

Nobody overloads it; put it inline in should_trim().

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/MDSMonitor: remove unnecessary service_should_trim()
Sage Weil [Tue, 9 Jul 2013 04:41:34 +0000 (21:41 -0700)]
mon/MDSMonitor: remove unnecessary service_should_trim()

We never set_trim_to(), so this is unnecessary.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/OSDMonitor: remove dup service_should_trim() implementation
Sage Weil [Tue, 9 Jul 2013 04:40:36 +0000 (21:40 -0700)]
mon/OSDMonitor: remove dup service_should_trim() implementation

This matches the parent.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/PaxosService: trim periodically instead of via propose_pending
Sage Weil [Tue, 9 Jul 2013 04:38:11 +0000 (21:38 -0700)]
mon/PaxosService: trim periodically instead of via propose_pending

We want to trim old states even if there is no update activity.  For
example, if a long-running rebalance finishes all osdmap updates will
stop and we won't trim out old maps to free space.

Instead, trim at the same time as tick().  Remove the trim during
propose_pending() to force all trims through this path and avoid
introducing a new and rarely-exercised behavior.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon/PaxosService: reorder definitions
Sage Weil [Tue, 9 Jul 2013 04:33:37 +0000 (21:33 -0700)]
mon/PaxosService: reorder definitions

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/PaxosService: uninline should_trim()
Sage Weil [Tue, 9 Jul 2013 04:33:22 +0000 (21:33 -0700)]
mon/PaxosService: uninline should_trim()

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'master' of https://github.com/ceph/ceph
John Wilkins [Tue, 9 Jul 2013 01:11:57 +0000 (18:11 -0700)]
Merge branch 'master' of https://github.com/ceph/ceph

12 years agodoc: Added Ceph Object Storage installation instructions for CentOS/RHEL 6.
John Wilkins [Tue, 9 Jul 2013 01:11:25 +0000 (18:11 -0700)]
doc: Added Ceph Object Storage installation instructions for CentOS/RHEL 6.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agomon: sync all service prefixes, including pgmap_*
Sage Weil [Fri, 5 Jul 2013 18:58:29 +0000 (11:58 -0700)]
mon: sync all service prefixes, including pgmap_*

This was just recently broken with the merge of the pgmap changes.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/MonitorDBStore: expose get_chunk_tx()
Sage Weil [Thu, 4 Jul 2013 19:17:28 +0000 (12:17 -0700)]
mon/MonitorDBStore: expose get_chunk_tx()

Allow users to get the transaction unencoded.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon/OSDMonitor: fix base case for loading full osdmap
Sage Weil [Tue, 9 Jul 2013 00:46:40 +0000 (17:46 -0700)]
mon/OSDMonitor: fix base case for loading full osdmap

Right after cluster creation, first_committed is 1 and latest stashed is 0,
but we don't have the initial full map yet.  Thereafter, we do (because we
write it with trim).  Fixes afd6c7d8247075003e5be439ad59976c3d123218.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agoMerge branch 'wip-small-object-recovery'
Samuel Just [Mon, 8 Jul 2013 23:46:31 +0000 (16:46 -0700)]
Merge branch 'wip-small-object-recovery'

Conflicts:
src/include/ceph_features.h

Reviewed-by: Sage Weil <sage@inktank.com>
Fixes: #5278
12 years agoReplicatedPG: send compound messages to enlightened peers
Samuel Just [Wed, 19 Jun 2013 20:26:50 +0000 (13:26 -0700)]
ReplicatedPG: send compound messages to enlightened peers

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: add handlers for MOSDPG(Push|Pull|PushReply)
Samuel Just [Mon, 17 Jun 2013 23:26:31 +0000 (16:26 -0700)]
ReplicatedPG: add handlers for MOSDPG(Push|Pull|PushReply)

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoOSD: add handlers for MOSDPG(Push|PushReply|Pull)
Samuel Just [Mon, 17 Jun 2013 22:59:19 +0000 (15:59 -0700)]
OSD: add handlers for MOSDPG(Push|PushReply|Pull)

MOSDPG(Push|PushReply|Pull|SubOp|SubOpReply) need the
same thing checked prior to queueing the op, so they
share a templated handler.

Signed-off-by: Samuel Just <sam.just@inktank.com>
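
The shared templated handler mentioned in the commit above might be shaped like this. The commit does not state which precondition is checked, so the map-epoch comparison below is a placeholder assumption, and all type and member names are hypothetical.

```cpp
#include <cstdint>
#include <deque>
#include <string>

using epoch_t = uint32_t;

// Hypothetical message types; the real MOSDPG* messages carry far more.
struct PushMsgSketch {
  epoch_t map_epoch;
  static const char* name() { return "push"; }
};
struct PullMsgSketch {
  epoch_t map_epoch;
  static const char* name() { return "pull"; }
};

struct OsdSketch {
  epoch_t osdmap_epoch = 0;
  std::deque<std::string> queued;  // names of ops accepted for processing

  // One handler for every message type that needs the same pre-queue check.
  // Returns true if the op was queued.
  template <typename Msg>
  bool handle_replica_op(const Msg& m) {
    // Placeholder precondition: refuse ops from a map we have not seen yet.
    if (m.map_epoch > osdmap_epoch) return false;
    queued.push_back(Msg::name());
    return true;
  }
};
```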
12 years agomessages/,osd_types: add messages for Push, PushReply, Pull
Samuel Just [Mon, 17 Jun 2013 22:41:36 +0000 (15:41 -0700)]
messages/,osd_types: add messages for Push, PushReply, Pull

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: split handle_pull out of sub_op_pull
Samuel Just [Fri, 14 Jun 2013 23:06:16 +0000 (16:06 -0700)]
ReplicatedPG: split handle_pull out of sub_op_pull

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: split handle_push_reply out of sub_op_push_reply
Samuel Just [Fri, 14 Jun 2013 22:35:55 +0000 (15:35 -0700)]
ReplicatedPG: split handle_push_reply out of sub_op_push_reply

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: send pulls en masse in recover_primary
Samuel Just [Fri, 14 Jun 2013 21:58:39 +0000 (14:58 -0700)]
ReplicatedPG: send pulls en masse in recover_primary

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: send pushes en masse in recover_replicas, recover_backfill
Samuel Just [Fri, 14 Jun 2013 20:44:34 +0000 (13:44 -0700)]
ReplicatedPG: send pushes en masse in recover_replicas, recover_backfill

This way, the pushes might be later merged into a smaller number of
messages.

Signed-off-by: Samuel Just <sam.just@inktank.com>
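
The batching idea from the commit above, collecting pushes first so that they can later be merged into fewer messages, can be sketched as grouping by destination peer. The types and helpers below are hypothetical and only illustrate the grouping step.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical push description; the real PushOp carries object data.
struct PushOpSketch {
  std::string object;
};

// Pending pushes, grouped by destination peer id.
using PushBatches = std::map<int, std::vector<PushOpSketch>>;

// Collect a push instead of sending it immediately.
inline void queue_push(PushBatches& batches, int peer, PushOpSketch op) {
  batches[peer].push_back(std::move(op));
}

// Send one message per peer containing all of that peer's pushes.
// Returns the number of messages sent and leaves `batches` empty.
template <typename SendFn>
std::size_t flush_pushes(PushBatches& batches, SendFn send) {
  std::size_t sent = 0;
  for (auto& entry : batches) {
    if (entry.second.empty()) continue;
    send(entry.first, entry.second);
    ++sent;
  }
  batches.clear();
  return sent;
}
```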
12 years agoOSD: convert handle_push to use PushOp
Samuel Just [Wed, 12 Jun 2013 22:10:59 +0000 (15:10 -0700)]
OSD: convert handle_push to use PushOp

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: pass a PushOp into handle_pull_response
Samuel Just [Wed, 12 Jun 2013 20:28:15 +0000 (13:28 -0700)]
ReplicatedPG: pass a PushOp into handle_pull_response

This is the first step toward packaging multiple
pushes/pulls into a single message.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: split send_push into build_push_op and send_push_op
Samuel Just [Wed, 12 Jun 2013 22:50:55 +0000 (15:50 -0700)]
ReplicatedPG: split send_push into build_push_op and send_push_op

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: _committed_pushed_object don't pass op
Samuel Just [Wed, 12 Jun 2013 20:27:12 +0000 (13:27 -0700)]
ReplicatedPG: _committed_pushed_object don't pass op

Add a separate callback to handle marking the event and
the stats.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoReplicatedPG: submit_push_data must take recovery_info as non-const
Samuel Just [Mon, 8 Jul 2013 21:34:50 +0000 (14:34 -0700)]
ReplicatedPG: submit_push_data must take recovery_info as non-const

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agov0.66 v0.66
Gary Lowell [Mon, 8 Jul 2013 22:45:00 +0000 (15:45 -0700)]
v0.66

12 years agomon: implement simple 'scrub' command
Sage Weil [Mon, 8 Jul 2013 22:07:57 +0000 (15:07 -0700)]
mon: implement simple 'scrub' command

Compare all keys within the sync'ed prefixes across members of the quorum
and compare the key counts and CRC for inconsistencies.

Currently this is a one-shot inefficient hammer.  We'll want to make this
work in chunks before it is usable in production environments.

Protect with a feature bit to avoid sending MMonScrub to mons who can't
decode it.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
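
The comparison step described in the commit above can be illustrated as follows, assuming each quorum member reports a key count and a CRC per synced prefix and the results are compared against one reference result. The struct layout and function name are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-monitor scrub summary: key count and CRC per prefix.
struct ScrubResultSketch {
  std::map<std::string, uint32_t> prefix_keys;
  std::map<std::string, uint32_t> prefix_crc;

  bool operator==(const ScrubResultSketch& o) const {
    return prefix_keys == o.prefix_keys && prefix_crc == o.prefix_crc;
  }
  bool operator!=(const ScrubResultSketch& o) const { return !(*this == o); }
};

// Return the ranks whose result differs from the reference result
// (for example, the leader's own).
inline std::vector<int> find_inconsistent(
    const ScrubResultSketch& reference,
    const std::map<int, ScrubResultSketch>& by_rank) {
  std::vector<int> bad;
  for (const auto& entry : by_rank) {
    if (entry.second != reference) bad.push_back(entry.first);
  }
  return bad;
}
```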
12 years agomon: fix osdmap stash, trim to retain complete history of full maps
Sage Weil [Mon, 8 Jul 2013 22:04:59 +0000 (15:04 -0700)]
mon: fix osdmap stash, trim to retain complete history of full maps

The current interaction between sync and stashing full osdmaps only on
active mons means that a sync can result in an incomplete osdmap_full
history:

 - mon.c starts a full sync
 - during sync, active osdmap service should_stash_full() is true and
   includes a full in the txn
 - mon.c sync finishes
 - mon.c update_from_paxos gets "latest" stashed that it got from the
   paxos txn
 - mon.c does *not* walk to previous inc maps to complete its collection
   of full maps.

To fix this, we disable the periodic/random stash of full maps by the
osdmap service.

This introduces a new problem: we must have at least one full map (the first
one) in order for a mon that just synced to build its full collection.
Extend the encode_trim() process to allow the osdmap service to include
the oldest full map with the trim txn.  This is more complex than just
writing the full maps in the txn, but cheaper--we only write the full
map at trim time.

This *might* be related to previous bugs where the full osdmap was
missing, or cases where leveldb keys seemed to 'disappear'.

Fixes: #5512
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoRevert "Makefile: fix ceph_sbindir"
Gary Lowell [Mon, 8 Jul 2013 21:49:16 +0000 (14:49 -0700)]
Revert "Makefile: fix ceph_sbindir"

This reverts commit 352f362567bf270d0896fb7573df4ae5139a56fb.

Reverting this commit because it causes problems with the debian build, and
reopening #5492.  The root problem appears to be lack of support by GNU
autotools for installing into both /sbin and /usr/sbin using the standard
location variables.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agorgw: fix bucket link
Yehuda Sadeh [Mon, 8 Jul 2013 21:01:13 +0000 (14:01 -0700)]
rgw: fix bucket link

Bucket link was assuming the bucket head object was holding the
bucket acl, which is not true anymore.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoMerge pull request #395 from kri5/wip-vstart-documentation
Sage Weil [Mon, 8 Jul 2013 20:35:46 +0000 (13:35 -0700)]
Merge pull request #395 from kri5/wip-vstart-documentation

doc: Add a page to document vstart.sh script

12 years agodoc: Fix env variables in vstart.sh documentation 395/head
Christophe Courtaut [Mon, 8 Jul 2013 20:19:21 +0000 (22:19 +0200)]
doc: Fix env variables in vstart.sh documentation

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
12 years agoosd/osd_types: fix pg_stat_t::dump for last_epoch_clean
Sage Weil [Mon, 8 Jul 2013 19:55:20 +0000 (12:55 -0700)]
osd/osd_types: fix pg_stat_t::dump for last_epoch_clean

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #403 from ceph/wip-olazy
Gregory Farnum [Mon, 8 Jul 2013 19:45:45 +0000 (12:45 -0700)]
Merge pull request #403 from ceph/wip-olazy

merge: O_LAZY flag removal

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoMerge pull request #397 from kri5/wip-5478
Yehuda Sadeh [Mon, 8 Jul 2013 19:23:36 +0000 (12:23 -0700)]
Merge pull request #397 from kri5/wip-5478

rgw: Add explicit messages in radosgw init script

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoMerge pull request #406 from kri5/wip-3074
Yehuda Sadeh [Mon, 8 Jul 2013 18:43:13 +0000 (11:43 -0700)]
Merge pull request #406 from kri5/wip-3074

rgw: Add --help support to radosgw

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12 years agoclient: remove O_LAZY 403/head
Sage Weil [Mon, 8 Jul 2013 18:24:48 +0000 (11:24 -0700)]
client: remove O_LAZY

The once-upon-a-time unique O_LAZY value I chose forever ago is now
O_NOATIME, which means that some clients are choosing relaxed
consistency without meaning to.

It is highly unlikely that a real O_LAZY will ever exist, and we can
select it in the ceph case with the ioctl or libcephfs call, so drop
any support for doing this via open(2) flags.

Update doc/lazy_posix.txt file re: lazy io.

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agocommon/crc32c: skip cpu detection incantation on not x86_64
Sage Weil [Sat, 6 Jul 2013 00:21:06 +0000 (17:21 -0700)]
common/crc32c: skip cpu detection incantation on not x86_64

On i386 this fails to build with

common/crc32c-intel.c: In function 'ceph_have_crc32c_intel':
error: common/crc32c-intel.c:79:9: PIC register clobbered by 'ebx' in 'asm'

ARM had more to complain about.

Not sure where this test came from, but it is clearly not meant for
anything other than x86_64.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
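
A sketch of the kind of guard the commit above describes: run the CPU feature probe only on x86_64 and report "not available" everywhere else. This is not the code in common/crc32c-intel.c. It swaps the hand-written inline asm for the compiler-provided __get_cpuid() helper from <cpuid.h> (GCC and Clang), and the function and macro names are hypothetical.

```cpp
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#endif

// Returns 1 if the CPU advertises SSE4.2 (which provides the crc32
// instruction), 0 otherwise. On anything other than x86_64 the probe is
// skipped entirely and 0 is returned.
inline int sketch_have_crc32c_intel() {
#ifdef SKETCH_HAVE_CPUID
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
  return (ecx & (1u << 20)) ? 1 : 0;  // CPUID leaf 1, ECX bit 20: SSE4.2
#else
  return 0;
#endif
}
```

Keeping the probe behind the preprocessor guard means no x86-specific code is even compiled on i386 or ARM, which is where the original build errors came from.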
12 years agoMerge pull request #407 from dachary/wip-5487
athanatos [Mon, 8 Jul 2013 17:44:43 +0000 (10:44 -0700)]
Merge pull request #407 from dachary/wip-5487

unit tests for ObjectContext read/write locks

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoqa/workunits/rbd/simple_big.sh: don't ENOSPC every time
Sage Weil [Mon, 8 Jul 2013 17:14:08 +0000 (10:14 -0700)]
qa/workunits/rbd/simple_big.sh: don't ENOSPC every time

Set the count on the initial dd so we don't always ENOSPC.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoqa/workunits/rbd/kernel.sh: move modprobe up
Sage Weil [Mon, 8 Jul 2013 16:58:16 +0000 (09:58 -0700)]
qa/workunits/rbd/kernel.sh: move modprobe up

Needs to happen before cleanup.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoqa/workunits/fs/test_o_trunc.sh: fix .sh to match new bin location
Sage Weil [Mon, 8 Jul 2013 16:56:29 +0000 (09:56 -0700)]
qa/workunits/fs/test_o_trunc.sh: fix .sh to match new bin location

To match 83f308962c53eec10db9e496987a9e4be7c87e9b.

Signed-off-by: Sage Weil <sage@inktank.com>