git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
16 years ago msgr: be noisier about mark_down calls
Sage Weil [Tue, 9 Dec 2008 19:56:51 +0000 (11:56 -0800)]
msgr: be noisier about mark_down calls

16 years ago osd: avoid needless calls to peer(), build_prior()
Sage Weil [Tue, 9 Dec 2008 19:00:04 +0000 (11:00 -0800)]
osd: avoid needless calls to peer(), build_prior()

Introduces PEERING pg state.  Also is smarter about when build_prior and
peer are actually called.

16 years ago osd: make prior_set_affected() slightly smarter
Sage Weil [Tue, 9 Dec 2008 18:40:01 +0000 (10:40 -0800)]
osd: make prior_set_affected() slightly smarter

Only return true if an osd goes down that we didn't already know was
down (prior_set may contain down osds if the PG is marked DOWN).
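
As a rough illustration of the check described above (not the actual Ceph code), the sketch below returns true only when an OSD in the prior set is down and was not already recorded as down. The names `known_down` and `is_up` are hypothetical stand-ins for the PG's own bookkeeping and the new OSD map.

```python
def prior_set_affected(prior_set, known_down, is_up):
    """Return True only if an OSD in prior_set has newly gone down.

    prior_set  -- iterable of OSD ids the PG depends on
    known_down -- set of OSD ids we already knew were down (hypothetical)
    is_up      -- callable(osd_id) -> bool, reflecting the new map (hypothetical)
    """
    for osd in prior_set:
        if osd in known_down:
            # Already known to be down, so this is not a new event.
            continue
        if not is_up(osd):
            return True
    return False
```

With this shape, a PG marked DOWN whose prior set already contains down OSDs is not treated as affected again on every map change.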

16 years ago cobserver: cleanups
Sage Weil [Tue, 9 Dec 2008 22:57:48 +0000 (14:57 -0800)]
cobserver: cleanups

16 years ago mon: use 'latest' for latest osd, mds maps
Sage Weil [Tue, 9 Dec 2008 22:55:00 +0000 (14:55 -0800)]
mon: use 'latest' for latest osd, mds maps

Mainly for benefit of PaxosObserver, but it also cleans things up
a bit.

16 years ago cobserver: cleanup; print map summaries w/ each new state
Sage Weil [Tue, 9 Dec 2008 22:44:58 +0000 (14:44 -0800)]
cobserver: cleanup; print map summaries w/ each new state

16 years ago mon: refactor map print_summary/operator<< methods
Sage Weil [Tue, 9 Dec 2008 22:44:41 +0000 (14:44 -0800)]
mon: refactor map print_summary/operator<< methods

16 years ago cobserver: accidentally removed a line
Yehuda Sadeh [Tue, 9 Dec 2008 22:23:38 +0000 (14:23 -0800)]
cobserver: accidentally removed a line

16 years ago kclient: missing files
Yehuda Sadeh [Tue, 9 Dec 2008 22:20:44 +0000 (14:20 -0800)]
kclient: missing files

16 years ago whitespaces
Yehuda Sadeh [Tue, 9 Dec 2008 22:19:27 +0000 (14:19 -0800)]
whitespaces

16 years ago mon: factor ClientMap class out
Yehuda Sadeh [Tue, 9 Dec 2008 22:11:09 +0000 (14:11 -0800)]
mon: factor ClientMap class out

16 years ago cobserver: utility, observe changes in different maps
Yehuda Sadeh [Tue, 9 Dec 2008 21:39:10 +0000 (13:39 -0800)]
cobserver: utility, observe changes in different maps

16 years ago osd: use push() to push clone op
Sage Weil [Tue, 9 Dec 2008 19:47:06 +0000 (11:47 -0800)]
osd: use push() to push clone op

Also fixes missing updates to peer_missing[peer] and pushing
map.

16 years ago mon: factor out osdmap print, print_summary
Sage Weil [Tue, 9 Dec 2008 17:58:16 +0000 (09:58 -0800)]
mon: factor out osdmap print, print_summary

16 years ago mon: factor out mds print, print_summary
Sage Weil [Tue, 9 Dec 2008 17:58:08 +0000 (09:58 -0800)]
mon: factor out mds print, print_summary

16 years ago mds: stay loner if client has B and no other reason to switch state
Sage Weil [Mon, 8 Dec 2008 21:50:46 +0000 (13:50 -0800)]
mds: stay loner if client has B and no other reason to switch state

If the client has dirty data, and there is no other reason to
toggle the lock state, leave it as LONER.  The client will write
out at its leisure, and we'll avoid an unstable lock state that
is waiting on a potentially slow writeout.

16 years ago osd: missing last_mon_heartbeat declaration
Sage Weil [Tue, 9 Dec 2008 17:50:26 +0000 (09:50 -0800)]
osd: missing last_mon_heartbeat declaration

16 years ago msgr: make sure nonce matches too when connecting to peer
Sage Weil [Tue, 9 Dec 2008 16:48:03 +0000 (08:48 -0800)]
msgr: make sure nonce matches too when connecting to peer

Otherwise the predictable port numbers cause problems.
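
A minimal sketch of why the nonce matters, assuming a peer address is an (ip, port, nonce) triple. The `EntityAddr` type and `is_same_peer` helper are illustrative names, not the messenger's real API. With predictable ports, a restarted daemon can come back on the same ip and port, so only the nonce tells the old instance from the new one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityAddr:
    """Hypothetical peer address: ip and port plus a per-instance nonce."""
    ip: str
    port: int
    nonce: int


def is_same_peer(expected, advertised):
    """Accept the connection only if ip, port AND nonce all match."""
    return (
        expected.ip == advertised.ip
        and expected.port == advertised.port
        and expected.nonce == advertised.nonce
    )
```

For example, `is_same_peer(EntityAddr("10.0.0.1", 6800, 111), EntityAddr("10.0.0.1", 6800, 222))` is false even though the ip and port are identical.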

16 years ago msgr: print error when message type is unrecognized
Sage Weil [Tue, 9 Dec 2008 16:43:38 +0000 (08:43 -0800)]
msgr: print error when message type is unrecognized

16 years ago osd: ping mon less frequently when peerless
Sage Weil [Tue, 9 Dec 2008 16:42:28 +0000 (08:42 -0800)]
osd: ping mon less frequently when peerless

Every second is too much.  Make it tunable.
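
A minimal sketch of a tunable ping interval, assuming a hypothetical `mon_ping_interval` setting in seconds. The commit message does not give the real option name or its default, so the value 30.0 below is only a placeholder.

```python
def should_ping_mon(now, last_ping, num_peers, mon_ping_interval=30.0):
    """Ping the monitor only when peerless and the interval has elapsed.

    All parameters are illustrative; times are in seconds.
    """
    if num_peers > 0:
        # We have peers, so the peerless ping does not apply.
        return False
    return now - last_ping >= mon_ping_interval
```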

16 years ago mon: typo in pg dump output
Sage Weil [Mon, 8 Dec 2008 22:03:46 +0000 (14:03 -0800)]
mon: typo in pg dump output

16 years ago ceph: new default mon port; try to bind to port in known range
Sage Weil [Mon, 8 Dec 2008 19:44:21 +0000 (11:44 -0800)]
ceph: new default mon port; try to bind to port in known range

New monitor port in unused region (according to nmap-services).

Try to bind to a port in a known range, so that tools can easily
identify the protocol in use.

Remove some old .sh cruft.
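
The "bind to a port in a known range" idea can be sketched as follows. This is an illustration using Python's standard `socket` module, not the Ceph messenger code; the function name and the range bounds passed in are assumptions, since the commit message does not state the actual range.

```python
import socket


def bind_in_range(host, first_port, last_port):
    """Try each port in [first_port, last_port] and return (socket, port).

    Raises OSError if every port in the range is already taken.
    """
    for port in range(first_port, last_port + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            return sock, port
        except OSError:
            # Port in use (or otherwise unavailable): try the next one.
            sock.close()
    raise OSError("no free port in range %d-%d" % (first_port, last_port))
```

Because every daemon lands inside the same range, a tool that sees traffic on one of those ports can reasonably guess which protocol is in use.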

16 years ago mon: 'pg map <pgid>' command
Sage Weil [Mon, 8 Dec 2008 19:15:10 +0000 (11:15 -0800)]
mon: 'pg map <pgid>' command

To see current pg -> osd mapping

16 years ago mon: 'osd dump' command; refactor sstream->bufferlist code a bit
Sage Weil [Mon, 8 Dec 2008 18:11:12 +0000 (10:11 -0800)]
mon: 'osd dump' command; refactor sstream->bufferlist code a bit

16 years ago osdmap: use print method from osdmaptool
Sage Weil [Mon, 8 Dec 2008 18:01:37 +0000 (10:01 -0800)]
osdmap: use print method from osdmaptool

16 years ago osd: pause scrub wq async
Sage Weil [Mon, 8 Dec 2008 17:54:26 +0000 (09:54 -0800)]
osd: pause scrub wq async

The scrub _process() worker may be waiting on a message from a replica, so
we can't pause it synchronously.  Instead, pause_new() to just prevent
new workers from starting.
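
The difference between a synchronous pause and `pause_new()` can be sketched with a toy work queue. This is a simplified model under stated assumptions, not the real work queue class: `try_start`, `finish` and `unpause` are hypothetical names for what a worker and the owner would call.

```python
import threading


class WorkQueue:
    """Toy work queue contrasting blocking pause() with pause_new()."""

    def __init__(self):
        self._cond = threading.Condition()
        self._paused = False
        self._in_progress = 0

    def try_start(self):
        """Called by a worker before processing; False means 'paused'."""
        with self._cond:
            if self._paused:
                return False
            self._in_progress += 1
            return True

    def finish(self):
        """Called by a worker when it is done with an item."""
        with self._cond:
            self._in_progress -= 1
            self._cond.notify_all()

    def pause(self):
        """Synchronous: block until all running workers have finished."""
        with self._cond:
            self._paused = True
            while self._in_progress > 0:
                self._cond.wait()

    def pause_new(self):
        """Asynchronous: stop new workers from starting, return at once."""
        with self._cond:
            self._paused = True

    def unpause(self):
        with self._cond:
            self._paused = False
```

If a running worker is itself waiting on a reply from a replica, `pause()` would block for as long as that wait lasts, while `pause_new()` returns immediately and simply keeps further work from starting.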

16 years ago osd: lock pg before calling on_shutdown
Sage Weil [Fri, 5 Dec 2008 23:57:43 +0000 (15:57 -0800)]
osd: lock pg before calling on_shutdown

16 years ago osd: fix degraded figure calculation typo
Sage Weil [Fri, 5 Dec 2008 23:57:28 +0000 (15:57 -0800)]
osd: fix degraded figure calculation typo

16 years ago cmonctl: resend command if monitor is not responsive
Sage Weil [Fri, 5 Dec 2008 23:45:08 +0000 (15:45 -0800)]
cmonctl: resend command if monitor is not responsive

16 years ago cmonctl: interactive mode using libedit
Sage Weil [Fri, 5 Dec 2008 23:35:22 +0000 (15:35 -0800)]
cmonctl: interactive mode using libedit

16 years ago osd: log scrub ok
Sage Weil [Fri, 5 Dec 2008 22:22:17 +0000 (14:22 -0800)]
osd: log scrub ok

16 years ago mon: rename out to log, log.type files
Sage Weil [Fri, 5 Dec 2008 22:22:01 +0000 (14:22 -0800)]
mon: rename out to log, log.type files

16 years ago mon: notify PaxosService of any paxos state changes
Sage Weil [Fri, 5 Dec 2008 22:00:59 +0000 (14:00 -0800)]
mon: notify PaxosService of any paxos state changes

16 years ago mon: dump full pgmap on each state change (for debugging)
Sage Weil [Fri, 5 Dec 2008 22:29:17 +0000 (14:29 -0800)]
mon: dump full pgmap on each state change (for debugging)

16 years ago osd: don't die on stray sub op acks
Sage Weil [Fri, 5 Dec 2008 22:27:25 +0000 (14:27 -0800)]
osd: don't die on stray sub op acks

If a replica drops out of the pg, we force an ack in on_change(), but
may still get it later.  Don't freak out.

16 years ago osd: generate_backlog sanity check
Sage Weil [Fri, 5 Dec 2008 19:48:27 +0000 (11:48 -0800)]
osd: generate_backlog sanity check

If item is on disk and log, then log entry shouldn't be a delete.

16 years ago osd: fix merge_log divergent item detection
Sage Weil [Fri, 5 Dec 2008 19:46:55 +0000 (11:46 -0800)]
osd: fix merge_log divergent item detection

An item in our log isn't divergent if it is below the bottom of
olog.  Using the last_kept item isn't helpful here because
last_kept is in olog, and may be below that log's bottom.

16 years ago osd: clean up ondisklog a bit
Sage Weil [Fri, 5 Dec 2008 19:45:05 +0000 (11:45 -0800)]
osd: clean up ondisklog a bit

Express as extent, not interval.

16 years ago osd: make read_log output a bit more informative
Sage Weil [Fri, 5 Dec 2008 19:18:25 +0000 (11:18 -0800)]
osd: make read_log output a bit more informative

16 years ago vstart: only sudo if -e dev/sudo
Sage Weil [Fri, 5 Dec 2008 19:08:53 +0000 (11:08 -0800)]
vstart: only sudo if -e dev/sudo

16 years ago osd: revise missing map adjustment
Sage Weil [Fri, 5 Dec 2008 19:00:47 +0000 (11:00 -0800)]
osd: revise missing map adjustment

Rewrite helpers in terms of how they are actually used.

16 years ago osd: mark backlog events as BACKLOG
Sage Weil [Fri, 5 Dec 2008 18:59:23 +0000 (10:59 -0800)]
osd: mark backlog events as BACKLOG

This is purely to make the logs easier to read.

16 years ago osd: generate_backlog fixes
Sage Weil [Fri, 5 Dec 2008 18:01:28 +0000 (10:01 -0800)]
osd: generate_backlog fixes

Generate backlog records even if the object appears in the log if
the existing entry's prior_version is non-zero and isn't also
in the log.  This allows us to accurately generate the .have field
when we are building the missing map.

16 years ago crush: add include
Sage Weil [Fri, 5 Dec 2008 04:49:11 +0000 (20:49 -0800)]
crush: add include

16 years ago kclient: reduce stack usage
Yehuda Sadeh [Fri, 5 Dec 2008 00:50:50 +0000 (16:50 -0800)]
kclient: reduce stack usage

16 years ago mon: 'osd scrub *' to scrub all osds
Sage Weil [Fri, 5 Dec 2008 00:32:53 +0000 (16:32 -0800)]
mon: 'osd scrub *' to scrub all osds

16 years ago filestore: fix up listxattr buffer management a bit
Sage Weil [Fri, 5 Dec 2008 00:28:33 +0000 (16:28 -0800)]
filestore: fix up listxattr buffer management a bit

16 years ago dstart: put debug output on local disk
Sage Weil [Fri, 5 Dec 2008 00:27:28 +0000 (16:27 -0800)]
dstart: put debug output on local disk

16 years ago debug: allow output and output symlinks to go in different directories
Sage Weil [Fri, 5 Dec 2008 00:27:09 +0000 (16:27 -0800)]
debug: allow output and output symlinks to go in different directories

16 years ago logmonitor: append all notifications in a single file
Yehuda Sadeh [Fri, 5 Dec 2008 00:14:55 +0000 (16:14 -0800)]
logmonitor: append all notifications in a single file

16 years ago set/check subprotocol versions
Sage Weil [Thu, 4 Dec 2008 22:58:58 +0000 (14:58 -0800)]
set/check subprotocol versions

16 years ago mon: clean up paxos service registration a bit. rev disk format.
Sage Weil [Thu, 4 Dec 2008 22:30:11 +0000 (14:30 -0800)]
mon: clean up paxos service registration a bit.  rev disk format.

16 years ago cleanup, whitespace
Yehuda Sadeh [Thu, 4 Dec 2008 22:08:39 +0000 (14:08 -0800)]
cleanup, whitespace

16 years ago log: use of cascading dispatcher for log messages
Yehuda Sadeh [Thu, 4 Dec 2008 21:59:34 +0000 (13:59 -0800)]
log: use of cascading dispatcher for log messages

16 years ago dispatcher: cascading dispatch infrastructure
Yehuda Sadeh [Thu, 4 Dec 2008 21:31:00 +0000 (13:31 -0800)]
dispatcher: cascading dispatch infrastructure

16 years ago mon: keep pgmap consistent
Sage Weil [Thu, 4 Dec 2008 21:46:19 +0000 (13:46 -0800)]
mon: keep pgmap consistent

We were cutting corners and updating the live map before it
committed to paxos, since pg stats aren't system critical.  This
can lead to problems due to the way "latest" is saved out, though,
and it can be confusing to see things jump backward in time.
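
The fix described above amounts to staging changes and applying them to the live map only after the commit. A minimal sketch, assuming a hypothetical `PGMapService` with `stage`, `on_commit` and `discard_pending` methods (none of these names come from the Ceph source):

```python
class PGMapService:
    """Toy model: pg stats become visible only once the update commits."""

    def __init__(self):
        self.live = {}      # pgid -> stats, as seen by readers
        self.version = 0    # number of committed updates
        self.pending = {}   # staged changes, not yet committed

    def stage(self, pgid, stats):
        """Record an incoming stat update without touching the live map."""
        self.pending[pgid] = stats

    def on_commit(self):
        """Called once the pending update has been committed."""
        self.live.update(self.pending)
        self.pending.clear()
        self.version += 1

    def discard_pending(self):
        """Drop staged changes that never committed."""
        self.pending.clear()
```

Readers of `live` then never see a value that later disappears, which avoids the "jump backward in time" effect mentioned in the commit message.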

16 years ago osd: make replica scrub_map generation a subop
Sage Weil [Thu, 4 Dec 2008 21:41:29 +0000 (13:41 -0800)]
osd: make replica scrub_map generation a subop

This puts build_scrub_map in a worker thread, _and_ ensures it is
serialized wrt any in-progress writes.

16 years ago logclient: always print log messages to debug output
Sage Weil [Thu, 4 Dec 2008 21:40:30 +0000 (13:40 -0800)]
logclient: always print log messages to debug output

16 years ago osd: fix up scrub error log formatting
Sage Weil [Thu, 4 Dec 2008 21:01:28 +0000 (13:01 -0800)]
osd: fix up scrub error log formatting

16 years ago logclient: optionally take a stringstream
Sage Weil [Thu, 4 Dec 2008 21:01:19 +0000 (13:01 -0800)]
logclient: optionally take a stringstream

16 years ago osd: some scrub fixes
Sage Weil [Thu, 4 Dec 2008 20:18:07 +0000 (12:18 -0800)]
osd: some scrub fixes

Don't drop locks just yet; atm this leaves the dout() prefix
exposed to concurrent modifications of pg state.

Don't requeue for scrub if already scrubbing.

Fix missing object detection bugs.

16 years ago osd: fix pg_stats.reported value
Sage Weil [Thu, 4 Dec 2008 19:57:18 +0000 (11:57 -0800)]
osd: fix pg_stats.reported value

16 years ago osd: drop lock during most of scrub; only disallow concurrent writes
Sage Weil [Thu, 4 Dec 2008 19:17:58 +0000 (11:17 -0800)]
osd: drop lock during most of scrub; only disallow concurrent writes

Make the PG go read-only during a scrub.  Only take the pg lock
when absolutely necessary.  Wait for any pending writes to
complete before starting the scrub.

16 years ago osd: ignore dup scrub maps
Sage Weil [Thu, 4 Dec 2008 19:16:25 +0000 (11:16 -0800)]
osd: ignore dup scrub maps

16 years ago osd: take pg ref on scrub_wq
Sage Weil [Thu, 4 Dec 2008 19:15:46 +0000 (11:15 -0800)]
osd: take pg ref on scrub_wq

16 years ago osd: make scrub verify replica object attrs match
Sage Weil [Thu, 4 Dec 2008 18:58:23 +0000 (10:58 -0800)]
osd: make scrub verify replica object attrs match

16 years ago osd: fix pg stat acking in osd
Sage Weil [Thu, 4 Dec 2008 18:57:13 +0000 (10:57 -0800)]
osd: fix pg stat acking in osd

16 years ago log: logclient uses log types instead of log level
Yehuda Sadeh [Thu, 4 Dec 2008 18:47:35 +0000 (10:47 -0800)]
log: logclient uses log types instead of log level

16 years ago osd: check for missing clones in pick_read_snap
Sage Weil [Thu, 4 Dec 2008 18:08:22 +0000 (10:08 -0800)]
osd: check for missing clones in pick_read_snap

We may need to wait from op_read if we are missing a specific
clone.

16 years ago osd: clear waiting_for_head when we pull the head; set skipped if we do
Sage Weil [Thu, 4 Dec 2008 18:07:43 +0000 (10:07 -0800)]
osd: clear waiting_for_head when we pull the head; set skipped if we do

Need to ++skipped if we skip because we're waiting for the head
or else we'll incorrectly advance requested_to.

Clear waiting_for_head entry when we pull a head we're waiting for.

16 years ago osd: fix missing.add_event
Sage Weil [Thu, 4 Dec 2008 18:06:09 +0000 (10:06 -0800)]
osd: fix missing.add_event

We should only set have to prior_version if we aren't missing the
prior_version too!
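
A minimal sketch of the rule above, assuming the missing map stores a `(need, have)` pair per object. The `Missing` class here is a simplified illustration and not the real data structure: if the object is already in the missing map, then the prior version is missing too, so the existing `have` is kept rather than overwritten.

```python
class Missing:
    """Toy missing map: oid -> (need, have)."""

    def __init__(self):
        self.items = {}

    def add_event(self, oid, version, prior_version):
        if oid in self.items:
            # Already missing: we never had prior_version either,
            # so only bump 'need' and keep the older 'have'.
            _, have = self.items[oid]
            self.items[oid] = (version, have)
        else:
            # Not missing before this event, so we do have prior_version.
            self.items[oid] = (version, prior_version)
```

For example, after `add_event("a", 5, 4)` followed by `add_event("a", 6, 5)`, the entry for `"a"` is `(6, 4)`: version 6 is needed and version 4 is what is actually on hand.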

16 years ago kclient: fix NULL dereferencing oops
Yehuda Sadeh [Thu, 4 Dec 2008 17:46:24 +0000 (09:46 -0800)]
kclient: fix NULL dereferencing oops

16 years ago osd: version pg_stats_t with <epoch,version> pair; clean up pgmon a bit
Sage Weil [Thu, 4 Dec 2008 05:12:11 +0000 (21:12 -0800)]
osd: version pg_stats_t with <epoch,version> pair; clean up pgmon a bit

16 years ago osd: keep projected info on in-progress object modifications in memory
Sage Weil [Thu, 4 Dec 2008 03:46:08 +0000 (19:46 -0800)]
osd: keep projected info on in-progress object modifications in memory

Since the primary delays its writes until after replicas ack, we need to
keep projected object info in memory for the duration, because the
semantics very much depend on whether the object exists and what its size
is (well, mainly the pg_stats do).

This can avoid re-parsing SnapSet et al for certain workloads hitting
the same objects repeatedly (e.g., mds journal objects).

16 years ago osd: fix problems with propagation of info.stats during recovery
Sage Weil [Thu, 4 Dec 2008 00:42:38 +0000 (16:42 -0800)]
osd: fix problems with propagation of info.stats during recovery

merge_log() is called on replicas, so don't use peer_info (which
is primary-only)!

16 years ago osd: keep tabs on total object copies vs missing/degraded
Sage Weil [Thu, 4 Dec 2008 00:41:53 +0000 (16:41 -0800)]
osd: keep tabs on total object copies vs missing/degraded

Define degraded as an object copy that is not present in the
proper location.

16 years ago osd: fix uninitialized var use
Sage Weil [Thu, 4 Dec 2008 00:26:18 +0000 (16:26 -0800)]
osd: fix uninitialized var use

16 years ago add missing header file declaration
Yehuda Sadeh [Thu, 4 Dec 2008 00:53:38 +0000 (16:53 -0800)]
add missing header file declaration

16 years ago osd: move logging messages to a common infrastructure
Yehuda Sadeh [Thu, 4 Dec 2008 00:36:19 +0000 (16:36 -0800)]
osd: move logging messages to a common infrastructure

16 years ago mon: clean up pg dump
Sage Weil [Wed, 3 Dec 2008 23:42:02 +0000 (15:42 -0800)]
mon: clean up pg dump

16 years ago osd: update stats on primary pull
Sage Weil [Wed, 3 Dec 2008 23:41:56 +0000 (15:41 -0800)]
osd: update stats on primary pull

16 years ago osd: fix stupid no-op typo
Sage Weil [Wed, 3 Dec 2008 23:32:44 +0000 (15:32 -0800)]
osd: fix stupid no-op typo

Everything was showing up as a no-op.

16 years ago osd: log scrub errors to central log
Sage Weil [Wed, 3 Dec 2008 23:14:41 +0000 (15:14 -0800)]
osd: log scrub errors to central log

16 years ago Merge branch 'diskformat' into unstable
Sage Weil [Wed, 3 Dec 2008 23:06:25 +0000 (15:06 -0800)]
Merge branch 'diskformat' into unstable

16 years ago mon: include last_scrub info in pg dump
Sage Weil [Wed, 3 Dec 2008 22:57:23 +0000 (14:57 -0800)]
mon: include last_scrub info in pg dump

16 years ago osd: remove pg from recovery_wq with clear_primary_state
Sage Weil [Wed, 3 Dec 2008 22:57:09 +0000 (14:57 -0800)]
osd: remove pg from recovery_wq with clear_primary_state

16 years ago osd: make pg refcounting vs work queues consistent
Sage Weil [Wed, 3 Dec 2008 22:51:26 +0000 (14:51 -0800)]
osd: make pg refcounting vs work queues consistent

Either refcount items in queue, or don't.

16 years ago osd: do clone scrub based on our generated scrub map
Sage Weil [Wed, 3 Dec 2008 22:37:10 +0000 (14:37 -0800)]
osd: do clone scrub based on our generated scrub map

16 years ago osd: scrub info in pg_stat_t. scrub states.
Sage Weil [Wed, 3 Dec 2008 22:17:54 +0000 (14:17 -0800)]
osd: scrub info in pg_stat_t.  scrub states.

16 years ago osd: fix small quirk read_log missing generation
Sage Weil [Wed, 3 Dec 2008 21:55:48 +0000 (13:55 -0800)]
osd: fix small quirk read_log missing generation

The missing entry .have field was probably wrong due to the use
of missing.add_event (which assumes missing is up to date wrt
the previous log entry).  Use the prior version we just pulled off
disk instead.

Also, be a bit more verbose.

16 years ago mon: always discard pending on election completion
Sage Weil [Wed, 3 Dec 2008 21:41:12 +0000 (13:41 -0800)]
mon: always discard pending on election completion

Previously we tried to save the pending if we were still the
leader.  The problem is that while we were not leader, we may have
missed out on some updates, in which case the pending may no longer
be based on the current state.

In the future, we could make the commit waiters smart about callback
return codes so that they try to reapply.  For now, don't worry
about it.

16 years ago mds: update segment on ETableServer replay
Sage Weil [Wed, 3 Dec 2008 20:26:24 +0000 (12:26 -0800)]
mds: update segment on ETableServer replay

Otherwise we may forget to flush table changes to disk before
trimming.

Also, clean up code a bit to use update_segment() whenever
possible (instead of duplicating the specific LogSegment update).

16 years ago mds: print table version loaded during log replay
Sage Weil [Wed, 3 Dec 2008 20:20:35 +0000 (12:20 -0800)]
mds: print table version loaded during log replay

16 years ago osd: default 2x pg only for now
Sage Weil [Wed, 3 Dec 2008 20:20:22 +0000 (12:20 -0800)]
osd: default 2x pg only for now

16 years ago osd: distributed scrub compares primary vs replica contents
Sage Weil [Wed, 3 Dec 2008 20:19:59 +0000 (12:19 -0800)]
osd: distributed scrub compares primary vs replica contents

The checks are still pretty trivial at this point.

16 years ago osd: do not clear ops vector to indicate noop (protocol change)
Sage Weil [Wed, 3 Dec 2008 18:35:10 +0000 (10:35 -0800)]
osd: do not clear ops vector to indicate noop (protocol change)

The reply needs to include the full ops vector.  Use a separate
flag to indicate a noop.
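
The protocol change can be pictured as a reply that always carries the full ops list plus an explicit flag. The `OpReply` type and `make_reply` helper below are hypothetical, chosen only to illustrate the idea; they are not the real message classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OpReply:
    """Hypothetical reply: the ops list stays intact, noop is a flag."""
    ops: List[str] = field(default_factory=list)
    noop: bool = False


def make_reply(ops, is_noop):
    """Build a reply without emptying ops to signal a no-op."""
    return OpReply(ops=list(ops), noop=is_noop)
```

Under the old convention an empty `ops` list doubled as the no-op signal, so a no-op reply could not also carry the ops; with a separate flag both pieces of information survive.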

16 years ago osd: rewrite pg_stats queueing
Sage Weil [Wed, 3 Dec 2008 00:02:43 +0000 (16:02 -0800)]
osd: rewrite pg_stats queueing

Use an xlist instead of a separate map.  Avoid inefficient
requeueing and external map overhead.

16 years ago osd: remove useless raid4pg from build
Sage Weil [Tue, 2 Dec 2008 22:38:12 +0000 (14:38 -0800)]
osd: remove useless raid4pg from build

16 years ago kclient: fix oops in case written size doesn't match request
Yehuda Sadeh [Wed, 3 Dec 2008 18:10:55 +0000 (10:10 -0800)]
kclient: fix oops in case written size doesn't match request

16 years ago kclient: some logs revision
Yehuda Sadeh [Wed, 3 Dec 2008 00:16:41 +0000 (16:16 -0800)]
kclient: some logs revision