git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
Sage Weil [Sun, 12 Feb 2012 22:12:44 +0000 (14:12 -0800)]
mds: remove some cruft

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:39:27 +0000 (16:39 -0800)]
mds: remove pidfile

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 22:43:13 +0000 (14:43 -0800)]
mon: do a clean shutdown on SIGINT/SIGTERM

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:38:06 +0000 (16:38 -0800)]
mon: install async signal handlers for SIG{HUP,INT,TERM}

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:36:33 +0000 (16:36 -0800)]
osd: install async signal handlers for SIG{HUP,INT,TERM}

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:33:51 +0000 (16:33 -0800)]
mds: install async signal handlers for SIG{HUP,INT,TERM}

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:39:48 +0000 (16:39 -0800)]
signal: remove unused/obsolete handle_shutdown_signal

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sun, 12 Feb 2012 00:30:26 +0000 (16:30 -0800)]
signals: do not install default SIGHUP, SIGINT, SIGTERM handlers

These should be app specific and async.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 11 Feb 2012 17:45:06 +0000 (09:45 -0800)]
signals: implement safe async signal handler framework

Based on http://evbergen.home.xs4all.nl/unix-signals.html.

Instead of his design, though, we write single bytes, and create a pipe per
signal we have handlers registered for.

Signed-off-by: Sage Weil <sage@newdream.net>
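The design described in this commit (one pipe per registered signal, a single byte written per delivery) can be sketched as follows. This is a minimal illustration in Python rather than Ceph's C++; the class name `AsyncSignalQueue` and its methods are hypothetical and do not come from the Ceph tree.

```python
import os
import select
import signal


class AsyncSignalQueue:
    """One pipe per registered signal; the handler only writes one byte."""

    def __init__(self):
        self._pipes = {}      # signum -> (read_fd, write_fd)
        self._callbacks = {}  # signum -> callable run outside signal context

    def register(self, signum, callback):
        read_fd, write_fd = os.pipe()
        os.set_blocking(read_fd, False)
        os.set_blocking(write_fd, False)
        self._pipes[signum] = (read_fd, write_fd)
        self._callbacks[signum] = callback
        signal.signal(signum, self._notify)

    def _notify(self, signum, frame):
        # Keep the handler minimal: a single byte marks the signal as pending.
        try:
            os.write(self._pipes[signum][1], b"\0")
        except BlockingIOError:
            pass  # pipe already full, so a wakeup is pending anyway

    def dispatch(self, timeout=0.0):
        """Run callbacks for signals that arrived; return their numbers."""
        by_fd = {fds[0]: signum for signum, fds in self._pipes.items()}
        ready, _, _ = select.select(list(by_fd), [], [], timeout)
        handled = []
        for read_fd in ready:
            os.read(read_fd, 4096)  # drain: repeated signals coalesce
            signum = by_fd[read_fd]
            self._callbacks[signum](signum)
            handled.append(signum)
        return handled
```

The real work happens in `dispatch()`, in ordinary program context, so callbacks may take locks or allocate memory, which a raw signal handler must not do.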
Tommi Virtanen [Tue, 14 Feb 2012 23:52:55 +0000 (15:52 -0800)]
doc: Balance backticks.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Sage Weil [Tue, 14 Feb 2012 22:01:22 +0000 (14:01 -0800)]
Merge branch 'wip-osd-hb'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Tue, 14 Feb 2012 21:41:29 +0000 (13:41 -0800)]
mds: use new tmap_get pbl argument

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 14 Feb 2012 21:39:46 +0000 (13:39 -0800)]
librados: need prval for tmap_get

Signed-off-by: Sage Weil <sage@newdream.net>
Samuel Just [Tue, 7 Feb 2012 16:51:01 +0000 (08:51 -0800)]
librados: add aio_operate for reads and tmap_get for ObjectWriteOp

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Tue, 14 Feb 2012 21:35:04 +0000 (13:35 -0800)]
osd: remove unused need_size

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Samuel Just [Tue, 14 Feb 2012 21:02:44 +0000 (13:02 -0800)]
Merge branch 'wip_push_refactor'

Reviewed-by: Sage Weil <sage@newdream.net>
Samuel Just [Tue, 14 Feb 2012 20:56:32 +0000 (12:56 -0800)]
ReplicatedPG: pull() should return PULL_NONE, not false

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Tue, 14 Feb 2012 20:55:43 +0000 (12:55 -0800)]
ReplicatedPG: clean up push/pull

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Tue, 14 Feb 2012 20:52:59 +0000 (12:52 -0800)]
osd_types.h: Add constructors for ObjectRecovery*

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Tue, 14 Feb 2012 19:53:05 +0000 (11:53 -0800)]
test_filestore_idempotent: fix test to create initial object

Filestore now properly fails to clone a non-existent object, which means
we should create one.

Fixes: #2062
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 14 Feb 2012 17:06:21 +0000 (09:06 -0800)]
libcephfs: define CEPH_SETATTR_*

These are also defined internally in ceph_fs.h, so use a guard.  Annoying,
but gives us consistent naming (ceph_*/CEPH_*, not LIBCEPHFS_SETATTR_*).

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 13 Feb 2012 22:43:18 +0000 (14:43 -0800)]
test/encoding/readable.sh: drop bashisms

=, not ==!

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 13 Feb 2012 22:35:01 +0000 (14:35 -0800)]
filejournal: drop unused variable

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 13 Feb 2012 22:32:07 +0000 (14:32 -0800)]
filejournal: aio off by default

For now, until we have a better handle on the ext4 bug, and demonstrate
that it is a clear performance win with the full stack.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 13 Feb 2012 22:31:17 +0000 (14:31 -0800)]
Merge remote-tracking branch 'gh/wip-journal-aio-rebased'

Sage Weil [Mon, 13 Feb 2012 22:09:04 +0000 (14:09 -0800)]
Merge remote-tracking branch 'gh/wip-osd'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Mon, 13 Feb 2012 20:40:33 +0000 (12:40 -0800)]
test/encoding/readable.sh: skip old version with known incompatibilities

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 13 Feb 2012 20:13:18 +0000 (12:13 -0800)]
ceph-dencoder: add osd_peer_stat_t

Signed-off-by: Sage Weil <sage@newdream.net>
Yehuda Sadeh [Mon, 13 Feb 2012 20:07:17 +0000 (12:07 -0800)]
rgw: remove extra useless info in bucket entry encoding

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Samuel Just [Mon, 13 Feb 2012 19:49:42 +0000 (11:49 -0800)]
ReplicatedPG: refactor push and pull

Now, push progress is represented by ObjectRecoveryProgress.  In
particular, rather than tracking data_subset_*ing, we track the furthest
offset before which the data will be consistent once cloning is complete.
sub_op_push now separates the pull response implementation from the
replica push implementation.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Mon, 13 Feb 2012 19:27:11 +0000 (11:27 -0800)]
add CEPH_FEATURE_OSDENC

Require it for osd <-> osd and osd <-> mon communication.

This covers all the new encoding changes, except hobject_t, which is used
between the rados command line tool and the OSD for an object listing
position marker.  We can't distinguish between specific types of clients,
though, and we don't want to introduce any incompatibility with other
clients, so we'll just have to make do here.  :(

Signed-off-by: Sage Weil <sage@newdream.net>
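The requirement described in this commit (the feature is mandatory for osd <-> osd and osd <-> mon links, but not for clients) amounts to a per-peer-type feature mask. A minimal sketch in Python follows; the bit position and the helper names are hypothetical, since this log does not give the real value of CEPH_FEATURE_OSDENC or the code that enforces it.

```python
# Hypothetical bit position; the real value is not stated in this log.
FEATURE_OSDENC = 1 << 0

# Peer-type pairs that must both understand the new OSD encodings.
REQUIRED = {
    frozenset({"osd"}): FEATURE_OSDENC,         # osd <-> osd
    frozenset({"osd", "mon"}): FEATURE_OSDENC,  # osd <-> mon
}


def required_features(local_type, peer_type):
    """Feature bits a peer must advertise for this kind of connection."""
    return REQUIRED.get(frozenset({local_type, peer_type}), 0)


def may_connect(local_type, peer_type, peer_features):
    """True if the peer advertises every required feature bit."""
    need = required_features(local_type, peer_type)
    return (peer_features & need) == need
```

Clients fall through to a mask of 0, which mirrors the commit's decision not to introduce an incompatibility with other clients.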
Samuel Just [Sun, 12 Feb 2012 01:50:49 +0000 (17:50 -0800)]
ReplicatedPG: consider backfill_pos to be degraded

A write may trigger via make_writeable the creation of a clone which
sorts before the object being written.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Samuel Just [Sun, 12 Feb 2012 01:52:13 +0000 (17:52 -0800)]
ReplicatedPG: add debugging for in flight backfill ops

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Sun, 12 Feb 2012 01:53:47 +0000 (17:53 -0800)]
ReplicatedPG: is_degraded may return true for backfill

If is_degraded returns true for backfill, the object may not be
in any replica's missing set.  Only call start_recovery_op if
we actually started an op.  This bug could cause a
stuck-in-backfill error.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Samuel Just [Wed, 8 Feb 2012 00:35:04 +0000 (16:35 -0800)]
MOSDSubOp: Add new object recovery state

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Sun, 12 Feb 2012 01:50:49 +0000 (17:50 -0800)]
ReplicatedPG: consider backfill_pos to be degraded

A write may trigger via make_writeable the creation of a clone which
sorts before the object being written.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Sun, 12 Feb 2012 01:52:13 +0000 (17:52 -0800)]
ReplicatedPG: add debugging for in flight backfill ops

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Sun, 12 Feb 2012 01:53:47 +0000 (17:53 -0800)]
ReplicatedPG: is_degraded may return true for backfill

If is_degraded returns true for backfill, the object may not be
in any replica's missing set.  Only call start_recovery_op if
we actually started an op.  This bug could cause a
stuck-in-backfill error.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Mon, 13 Feb 2012 19:06:34 +0000 (11:06 -0800)]
osd: remove peer_stat from MOSDOp entirely

We haven't used this feature for years and years, and don't plan to.  It
was there to facilitate "read shedding", where the primary OSD would
forward a read request to a replica.  However, replicas can't reply back
to the client in that case because OSDs don't initiate connections (they
used to).

Rip this out for now, especially since osd_peer_stat_t just changed.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 13 Feb 2012 18:01:32 +0000 (10:01 -0800)]
Merge remote-tracking branch 'gh/wip-mon-lag'

Reviewed-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 13 Feb 2012 17:42:37 +0000 (09:42 -0800)]
osd: new osd_peer_stat_t shell type

We weren't using this, and it had broken (raw) encoding.  The constructor
also didn't initialize fields properly.

Clear out the struct and use the new encoding scheme, so we can cleanly
add fields moving forward.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 13 Feb 2012 17:35:17 +0000 (09:35 -0800)]
qa/btrfs/.gitignore: ignore targets

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 13 Feb 2012 04:47:02 +0000 (20:47 -0800)]
osd: use single helper for pg creation

Take a bool so that we initialize the last_epoch_started properly on
newly created PGs.  This gives us a single code path for all new PGs.

We drop the clear_primary_state(), which has no effect, given that this is
a newly constructed PG.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 13 Feb 2012 04:35:14 +0000 (20:35 -0800)]
osd: use PG::init() for newly local (but not created) PGs

Use the helper for PGs that are newly instantiated on the local OSD.

This fixes the initialization of pg->info.stats.{up,acting,mapping_epoch}.

We also get rid of a premature (and useless) write_info/log, which has
bad information (and is soon after followed by the real/good one).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 13 Feb 2012 04:32:25 +0000 (20:32 -0800)]
osd: move new pg initialization into PG::init()

Move initialization of misc elements of the new pg from OSD.cc to a PG
method.  No change in functionality.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 13 Feb 2012 02:08:34 +0000 (18:08 -0800)]
osd: protect per-pg heartbeat peers with inner lock

Currently we update the overall heartbeat peers by looking directly at
per-pg state.  This is potentially problematic now (#2033), and definitely
so in the future when we push more peering operations into the work queues.

Create a per-pg set of peers, protected by an inner lock, and update it
using PG::update_heartbeat_peers() when appropriate under pg->lock.  Then
aggregate it into the osd peer list in OSD::update_heartbeat_peers() under
osd_lock and the inner lock.

We could probably have re-used osd->heartbeat_lock instead of adding a
new pg->heartbeat_peer_lock, but the finer locking can't hurt.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
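The locking scheme in this commit can be sketched as below, in Python rather than Ceph's C++. The lock and method names follow the commit message; how a PG derives its peers (here, the acting set minus the local OSD) is an assumption made only for illustration.

```python
import threading


class PG:
    def __init__(self):
        self.lock = threading.Lock()                 # stands in for pg->lock
        self.heartbeat_peer_lock = threading.Lock()  # the new inner lock
        self.heartbeat_peers = set()
        self.acting = []

    def update_heartbeat_peers(self, whoami):
        # Caller is expected to hold self.lock (per-pg state is read here).
        peers = {osd for osd in self.acting if osd != whoami}  # assumed rule
        with self.heartbeat_peer_lock:
            self.heartbeat_peers = peers


class OSD:
    def __init__(self, whoami):
        self.whoami = whoami
        self.osd_lock = threading.Lock()
        self.pgs = []
        self.heartbeat_peers = set()

    def update_heartbeat_peers(self):
        # Aggregate under osd_lock plus each inner lock, never pg.lock.
        with self.osd_lock:
            merged = set()
            for pg in self.pgs:
                with pg.heartbeat_peer_lock:
                    merged |= pg.heartbeat_peers
            self.heartbeat_peers = merged
```

Because the OSD reads only the small per-pg set behind the inner lock, it no longer has to look at per-pg state that peering work may be changing at the same time.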
Yehuda Sadeh [Sun, 12 Feb 2012 06:43:35 +0000 (22:43 -0800)]
rgw: don't use SCRIPT_NAME and QUERY_STRING vars

REQUEST_URI holds everything we need, and it's encoded correctly.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Sage Weil [Sun, 12 Feb 2012 05:47:42 +0000 (21:47 -0800)]
osd: flush pg on activate _after_ we queue our transaction

We recently added a flush on activate, but we are still building the
transaction (the caller queues it), so calling osr.flush() here is totally
useless.

Instead, set a flag 'need_flush', and do the flush the next time we receive
some work.

This has the added benefit of doing the flush in the worker thread, outside
of osd_lock.

Signed-off-by: Sage Weil <sage@newdream.net>
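The deferred flush described in this commit can be sketched as follows. This is an illustrative Python model, not the Ceph code: `Sequencer` is a hypothetical stand-in for `osr`, and `do_request` stands for "the next time we receive some work".

```python
class Sequencer:
    """Hypothetical stand-in for osr: queued transactions become flushed."""

    def __init__(self):
        self.queued = []
        self.flushed = []

    def queue(self, txn):
        self.queued.append(txn)

    def flush(self):
        self.flushed.extend(self.queued)
        self.queued.clear()


class PG:
    def __init__(self, osr):
        self.osr = osr
        self.need_flush = False

    def activate(self, txn):
        # The caller queues txn later, so flushing here would do nothing.
        txn.append("activate")
        self.need_flush = True

    def do_request(self, op):
        # First piece of work after activate: the transaction is queued now.
        if self.need_flush:
            self.osr.flush()
            self.need_flush = False
        return "handled " + op
```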
Sage Weil [Sun, 12 Feb 2012 05:46:17 +0000 (21:46 -0800)]
osd: do OpRequest dispatch into PG::do_request

This simplifies the external PG interface, and gives us a single path into
the PG...

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sun, 12 Feb 2012 05:24:54 +0000 (21:24 -0800)]
filestore: make flush() block forever if blackholed

If we are blackholing the disk, we need to make flush() wait forever, or
else the flush() logic will return (the IO wasn't queued!) and higher
layers will continue and (eventually) misbehave.

Signed-off-by: Sage Weil <sage@newdream.net>
Yehuda Sadeh [Sun, 12 Feb 2012 05:16:50 +0000 (21:16 -0800)]
Revert "rgw: don't treat plus as a space in url decode"

This reverts commit a6d7629c177fbab722a7a0c7f861caf91ff92deb.

Sage Weil [Sun, 12 Feb 2012 05:15:11 +0000 (21:15 -0800)]
osd: emit useful scrub error on missing clone

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sun, 12 Feb 2012 05:14:53 +0000 (21:14 -0800)]
filestore: return error from CLONE

Aie!

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sat, 11 Feb 2012 22:55:06 +0000 (14:55 -0800)]
osd: filter trimming|purged snaps out of op SnapContext

We can receive an op with an old SnapContext that includes snaps that we've
already trimmed or are in the process of trimming.  Filter them out!
Otherwise we will recreate and add links into collections we've already
marked as removed, and we'll get things like ENOTEMPTY when we try to
remove them.  Or just leave them lying around.

Fixes: #1949
Signed-off-by: Sage Weil <sage@newdream.net>
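The filtering step in this commit can be sketched as below. The function name and the use of plain Python sets are assumptions for illustration; the log does not show how Ceph represents the trimming and purged snap sets.

```python
def filter_snapc(snaps, trimming, purged):
    """Drop snaps that are purged or being trimmed, preserving order."""
    dead = set(trimming) | set(purged)
    return [snap for snap in snaps if snap not in dead]
```

For example, an op arriving with snaps `[9, 7, 4, 2]` while snap 4 is being trimmed and snap 2 is already purged would proceed with `[9, 7]` only, so no links are added into collections already marked as removed.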
Sage Weil [Sat, 11 Feb 2012 22:32:46 +0000 (14:32 -0800)]
mon: add {mon,quorum}_status admin socket commands

These dump some json with the current monitor/quorum status.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 11 Feb 2012 22:29:52 +0000 (14:29 -0800)]
mon: move quorum_status into helper

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 11 Feb 2012 22:10:53 +0000 (14:10 -0800)]
mon: move mon_status into a helper

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 11 Feb 2012 21:43:23 +0000 (13:43 -0800)]
init-ceph, mkcephfs: try 'btrfs device scan' before 'btrfsctl -a'

Fixes: #2023
Reported-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 11 Feb 2012 19:56:51 +0000 (11:56 -0800)]
osd: fix MOSDPGCreate version setting

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sat, 11 Feb 2012 19:48:29 +0000 (11:48 -0800)]
Merge remote branch 'gh/wip-osd-encoding'

Sage Weil [Sat, 11 Feb 2012 17:28:14 +0000 (09:28 -0800)]
osd: queue pg removal under pg's epoch

The PG may be doing work relative to a different epoch than what the osd
has.  Make sure the PG removal message is queued under that epoch to avoid
confusing/crashing the recipient like so:

2012-02-10 23:26:35.691793 7f387281f700 osd.3 514 queue_pg_for_deletion: 0.0
osd/OSD.cc: In function 'void OSD::handle_pg_remove(OpRequest*)' thread 7f387281f700 time 2012-02-10 23:26:35.691820
osd/OSD.cc: 4860: FAILED assert(pg->get_primary() == m->get_source().num())

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sat, 11 Feb 2012 00:47:33 +0000 (16:47 -0800)]
osd: check for valid snapc _before_ doing op work

Check this early to avoid wasting effort, or causing side-effects from
do_osd_op_effects().

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sat, 11 Feb 2012 00:46:33 +0000 (16:46 -0800)]
osd: some cleanup

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sat, 11 Feb 2012 17:49:42 +0000 (09:49 -0800)]
mon: validate osdmap input

And clean up some error return paths while we're here.

Fixes: #1493
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Yehuda Sadeh [Sat, 11 Feb 2012 00:47:54 +0000 (16:47 -0800)]
rgw: objects can contain '%'

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Sage Weil [Fri, 10 Feb 2012 23:17:23 +0000 (15:17 -0800)]
mon: fix MMonElection encoding version

Signed-off-by: Sage Weil <sage@newdream.net>
Greg Farnum [Fri, 10 Feb 2012 23:07:10 +0000 (15:07 -0800)]
mon: remove the last_consumed setting in Paxos

This was only ever used while initializing the Paxos machine, and it
doesn't need to be. Its existence is just an invitation to have races
between updating the stashed data and the stashed version.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Yehuda Sadeh [Fri, 10 Feb 2012 22:45:27 +0000 (14:45 -0800)]
objecter: LingerOp is refcounted

This should fix bug #2050, where a linger op was used after being freed.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Greg Farnum [Fri, 10 Feb 2012 23:02:03 +0000 (15:02 -0800)]
mon: handle inconsistent disk states on startup.

This lets us recover from an interrupted slurp while still noticing
other corruption issues. Rather than running init() and then
update_from_paxos() on each instance, we run init() and check
consistency. If it is consistent, we update_from_paxos as before. If
it is not, we do nothing and detect the slurping state
in handle_probe_reply(). (This assumes the disk was in a slurping state. If not, the
daemon crashes because something else went horribly wrong.)

While we're at it, remove unnecessary sets of first_committed. These
are done in the call to pax->trim_to().

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
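The startup flow described in this commit can be sketched as below. It is a Python model under stated assumptions: `PaxosState` is a hypothetical stand-in whose `is_consistent()` simply returns a stored flag (the real checks are not described in this log), and the `slurping` attribute models the flag introduced by the "mon: add a slurping flag to the Paxos state" commit in this log.

```python
class PaxosState:
    """Hypothetical stand-in for one Paxos instance's on-disk state."""

    def __init__(self, consistent, slurping):
        self._consistent = consistent
        self.slurping = slurping
        self.updated = False

    def init(self):
        pass  # would load on-disk state

    def is_consistent(self):
        return self._consistent

    def update_from_paxos(self):
        self.updated = True


def startup(instances):
    """Init every instance; return those left alone as inconsistent."""
    pending = []
    for paxos in instances:
        paxos.init()
        if paxos.is_consistent():
            paxos.update_from_paxos()
        else:
            pending.append(paxos)  # do nothing for now
    return pending


def handle_probe_reply(pending):
    """Inconsistency is tolerated only if a slurp was interrupted."""
    for paxos in pending:
        if not paxos.slurping:
            raise RuntimeError("inconsistent disk state without slurping flag")
    return pending  # these instances would resume slurping
```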
Sage Weil [Fri, 10 Feb 2012 22:39:44 +0000 (14:39 -0800)]
Merge branch 'wip-encoding'

Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Fri, 10 Feb 2012 22:03:44 +0000 (14:03 -0800)]
qa/btrfs/create_async_snap

Stupid tool to call the async snap ioctl.  Until the btrfs tool does it.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 10 Feb 2012 22:38:13 +0000 (14:38 -0800)]
messages: populate header.version in constructor

Define a HEAD_VERSION and COMPAT_VERSION for any versioned message.  Pass
to Message constructor so that it is always initialized, even from the
default constructor.  That's needed because we use that to check
decoding compatibility when receiving/decoding messages.

If we are conditionally encoding an old version, explicitly set
header.version in encode_payload().

We also set compat_version to demonstrate what will happen for future
revisions.  In this case, it's moot, because no old code understands
compat_version yet: nobody with old decode code will see these values
anyway.  But use this opportunity to demonstrate how it would be used in
the future.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
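The pattern in this commit (per-message HEAD_VERSION and COMPAT_VERSION constants passed to the base constructor, with an explicit override when an older version is encoded) can be sketched in Python as follows. `MExample`, its version numbers, and the `peer_understands_v3` flag are hypothetical and serve only to illustrate the idea.

```python
class Header:
    def __init__(self, version, compat_version):
        self.version = version
        self.compat_version = compat_version


class Message:
    HEAD_VERSION = 1
    COMPAT_VERSION = 1

    def __init__(self):
        # Always initialized, even for a default-constructed message.
        self.header = Header(self.HEAD_VERSION, self.COMPAT_VERSION)


class MExample(Message):  # hypothetical versioned message
    HEAD_VERSION = 3
    COMPAT_VERSION = 2

    def __init__(self, peer_understands_v3=True):
        super().__init__()
        self.peer_understands_v3 = peer_understands_v3

    def encode_payload(self):
        if not self.peer_understands_v3:
            # Conditionally encoding an old version: say so in the header.
            self.header.version = 2
            return b"v2-payload"
        return b"v3-payload"
```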
Greg Farnum [Fri, 10 Feb 2012 18:42:24 +0000 (10:42 -0800)]
mon: add a slurping flag to the Paxos state

Set it before we start slurping, and clear it when we end slurping.
This allows us to differentiate between deliberately inconsistent
disk states, and broken disk states. Run simple checks in a new
is_consistent() call.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Samuel Just [Fri, 10 Feb 2012 17:55:44 +0000 (09:55 -0800)]
ReplicatedPG: don't put the op on -EAGAIN

EAGAIN indicates that the op is
waiting_for_missing or waiting_for_degraded

Reviewed-by: Greg Farnum <greg.farnum@dreamhost.com>
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Greg Farnum [Fri, 10 Feb 2012 17:16:58 +0000 (09:16 -0800)]
mon: initialize paxos state in constructor

These should all be initialized in init() anyway
(except accepted_pn_from, which is set in collect and handle_collect),
but initializing them to safe defaults in the constructor provides
a safety net.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 23:02:41 +0000 (15:02 -0800)]
msg: check compat_version before decoding

If the newly constructed message's version is older than the
compat_version, don't even try to decode; just fail.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 22:30:30 +0000 (14:30 -0800)]
msg: populate compat_version for encoded messages

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 22:29:59 +0000 (14:29 -0800)]
msg: include compat_version in version header

header.version is the version we encoded.
header.compat_version is the oldest version of code that can decode it.

If the value is 0, we don't know anything about backward compatibility.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
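On the receiving side, the two header fields combine into a simple gate before decoding, as described in the "msg: check compat_version before decoding" commit in this log. A minimal Python sketch follows; the function names are hypothetical, and treating a compat_version of 0 as "go ahead and try" is an assumption that falls out of the comparison rather than something this log states.

```python
def should_decode(local_version, wire_compat_version):
    """True if code at local_version may attempt to decode the message.

    wire_compat_version is the oldest code version able to decode it;
    0 means unknown, and the comparison then always allows an attempt.
    """
    return local_version >= wire_compat_version


def receive(local_version, wire_compat_version, payload):
    if not should_decode(local_version, wire_compat_version):
        raise ValueError("message requires newer decoding code")
    return payload  # a real receiver would decode the payload here
```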
Sage Weil [Fri, 10 Feb 2012 06:01:27 +0000 (22:01 -0800)]
new encoding for Log{Entry,Summary}

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 10 Feb 2012 05:56:18 +0000 (21:56 -0800)]
os: new encoding for hobject_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 10 Feb 2012 05:54:34 +0000 (21:54 -0800)]
osd: new encoding for pg_create_t

There was no version encoding previously, so this is an incompatible
change.  Fortunately this type is only used in one place, MOSDPGCreate,
so we'll rev that encoding and compensate there.  All is well!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
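The compatibility trick in this commit (an unversioned struct gains a version header, and the version of the enclosing message tells the decoder which layout to expect) can be sketched as follows. Everything concrete here is an assumption for illustration: the two fields `created` and `split_bits`, the 6-byte version/compat/length prefix, and the rule that message version 2 or later carries the new layout.

```python
import struct

RAW = "<Ii"       # hypothetical unversioned layout: created, split_bits
PREFIX = "<BBI"   # simplified prefix: struct_v, compat_v, body length


def encode_raw(created, split_bits):
    return struct.pack(RAW, created, split_bits)


def encode_versioned(created, split_bits, struct_v=1, compat_v=1):
    body = struct.pack(RAW, created, split_bits)
    return struct.pack(PREFIX, struct_v, compat_v, len(body)) + body


def decode_pg_create(buf, message_version):
    """Pick the layout from the enclosing message's version."""
    if message_version < 2:
        return struct.unpack(RAW, buf)  # old peers: no version header
    _struct_v, _compat_v, length = struct.unpack_from(PREFIX, buf, 0)
    start = struct.calcsize(PREFIX)
    return struct.unpack(RAW, buf[start:start + length])
```

The struct itself cannot signal which layout it uses, so the decision has to come from the one message that carries it.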
Sage Weil [Fri, 10 Feb 2012 05:53:45 +0000 (21:53 -0800)]
osd: new encoding for osd_stat_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 10 Feb 2012 05:53:35 +0000 (21:53 -0800)]
osd: new encoding for object_locator_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 10 Feb 2012 05:53:22 +0000 (21:53 -0800)]
osd: new encoding for osd_reqid_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 7 Feb 2012 04:19:33 +0000 (20:19 -0800)]
osd: new ScrubMap::object encoding

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 3 Feb 2012 18:05:41 +0000 (10:05 -0800)]
mon: set last_changed when creating new pgs

This will help us identify PGs that are stuck in creating state.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 3 Feb 2012 18:07:46 +0000 (10:07 -0800)]
mon: set last_unstale when marking PGs stale

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 3 Feb 2012 18:07:30 +0000 (10:07 -0800)]
osd: include state timestamps, mapping_epoch in pg_stat_t

Track the time when the pg state last changed (or was refreshed) in
interesting ways.

Also track the epoch when the mapping last changed (same_interval_since).

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 2 Feb 2012 21:15:55 +0000 (13:15 -0800)]
osd: new encoding for PG::Interval

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 21:15:29 +0000 (13:15 -0800)]
osd: new encoding for PG::OndiskLog

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 21:15:14 +0000 (13:15 -0800)]
objectstore: new encoding for Transaction

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 20:44:13 +0000 (12:44 -0800)]
osd: new encoding for ScrubMap

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 20:44:04 +0000 (12:44 -0800)]
osd: new encoding for object_info_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 20:43:38 +0000 (12:43 -0800)]
osd: new encoding for watch_info_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 20:43:26 +0000 (12:43 -0800)]
osd: new encoding for SnapSet

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 20:42:59 +0000 (12:42 -0800)]
osd: new encoding for pg_missing_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 20:42:47 +0000 (12:42 -0800)]
osd: new encoding for pg_log_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 20:42:30 +0000 (12:42 -0800)]
osd: new encoding for pg_log_entry_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 20:42:07 +0000 (12:42 -0800)]
osd: new encoding for pg_history_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 2 Feb 2012 20:41:41 +0000 (12:41 -0800)]
osd: new encoding for pg_history_t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>