git.apps.os.sepia.ceph.com Git - ceph.git/log
16 years ago osd: fix pg stat reporting
Sage Weil [Tue, 16 Dec 2008 21:37:11 +0000 (13:37 -0800)]
osd: fix pg stat reporting

We were skipping pgs that weren't active instead of pgs for which we
aren't the primary.

16 years ago osd: do delayed activation after replay via a queue, not timer event
Sage Weil [Tue, 16 Dec 2008 21:28:10 +0000 (13:28 -0800)]
osd: do delayed activation after replay via a queue, not timer event

This avoids the osd_lock dependency that comes with using osd->timer.

16 years ago osd: take osd_lock in generate_backlog before peer()
Sage Weil [Tue, 16 Dec 2008 21:14:30 +0000 (13:14 -0800)]
osd: take osd_lock in generate_backlog before peer()

peer() uses the osd.timer to schedule the replay interval, and the
timer needs osd_lock.

16 years ago mds cleanup
Sage Weil [Tue, 16 Dec 2008 19:32:44 +0000 (11:32 -0800)]
mds cleanup

16 years ago mds: add logclient
Sage Weil [Tue, 16 Dec 2008 19:25:21 +0000 (11:25 -0800)]
mds: add logclient

16 years ago logclient: adjust link_dispatcher; add unlink_dispatcher
Sage Weil [Tue, 16 Dec 2008 19:25:14 +0000 (11:25 -0800)]
logclient: adjust link_dispatcher; add unlink_dispatcher

16 years ago todos
Sage Weil [Tue, 16 Dec 2008 18:56:53 +0000 (10:56 -0800)]
todos

16 years ago mds: remove follows==0 special cases
Sage Weil [Tue, 16 Dec 2008 00:40:08 +0000 (16:40 -0800)]
mds: remove follows==0 special cases

A follows==0 shouldn't have any special meaning anymore.  See also
1876ca5ad4b92f0794c91d15502c16ad747dbf8b.

16 years ago mds: when recovering size, don't munge up projected->size; use new_size
Sage Weil [Mon, 15 Dec 2008 23:57:36 +0000 (15:57 -0800)]
mds: when recovering size, don't munge up projected->size; use new_size

16 years ago mds: update rbytes with size on truncate, etc.
Sage Weil [Mon, 15 Dec 2008 23:56:51 +0000 (15:56 -0800)]
mds: update rbytes with size on truncate, etc.

16 years ago mon: send mdsmap on beacon from mds not in the map
Sage Weil [Mon, 15 Dec 2008 22:33:00 +0000 (14:33 -0800)]
mon: send mdsmap on beacon from mds not in the map

This happens when an mds goes laggy and is marked down by the
monitor.

16 years ago mon: clean up mds failure output
Sage Weil [Mon, 15 Dec 2008 22:17:26 +0000 (14:17 -0800)]
mon: clean up mds failure output

iterator p isn't valid; use temp values.

16 years ago mon: immediately propose after 'osd setmap'
Sage Weil [Mon, 15 Dec 2008 21:44:52 +0000 (13:44 -0800)]
mon: immediately propose after 'osd setmap'

Any subsequent osdmap changes will be ignored anyway.

Note that this still throws out changes _prior_ to the setmap.  In
theory, that shouldn't matter, since we're replacing the map
anyway.

16 years ago update debian, spec files to reflect cmonctl->ceph rename
Sage Weil [Mon, 15 Dec 2008 20:33:10 +0000 (12:33 -0800)]
update debian, spec files to reflect cmonctl->ceph rename

16 years ago ceph: allow > and < to redirect command input/output
Sage Weil [Mon, 15 Dec 2008 20:00:17 +0000 (12:00 -0800)]
ceph: allow > and < to redirect command input/output

16 years ago ceph: fold cobserver into ceph
Sage Weil [Mon, 15 Dec 2008 19:45:10 +0000 (11:45 -0800)]
ceph: fold cobserver into ceph

16 years ago todo
Sage Weil [Mon, 15 Dec 2008 19:34:11 +0000 (11:34 -0800)]
todo

16 years ago mds: mark new directories new in journal; add to new list on replay
Sage Weil [Mon, 15 Dec 2008 19:32:32 +0000 (11:32 -0800)]
mds: mark new directories new in journal; add to new list on replay

This ensures the dir is written when the logseg is eventually
expired.

16 years ago cobserver: usage
Sage Weil [Mon, 15 Dec 2008 19:17:36 +0000 (11:17 -0800)]
cobserver: usage

16 years ago rename cmonctl -> ceph
Sage Weil [Mon, 15 Dec 2008 19:15:55 +0000 (11:15 -0800)]
rename cmonctl -> ceph

16 years ago cobserver: retry when no response on startup
Yehuda Sadeh [Mon, 15 Dec 2008 19:21:33 +0000 (11:21 -0800)]
cobserver: retry when no response on startup

16 years ago vstart.sh: can specify mon address
Yehuda Sadeh [Mon, 15 Dec 2008 18:45:08 +0000 (10:45 -0800)]
vstart.sh: can specify mon address

16 years ago osd: add missing declaration
Yehuda Sadeh [Mon, 15 Dec 2008 17:54:50 +0000 (09:54 -0800)]
osd: add missing declaration

16 years ago vstart: debug pg reporting
Sage Weil [Mon, 15 Dec 2008 05:40:21 +0000 (21:40 -0800)]
vstart: debug pg reporting

16 years ago osd: generate_backlog asynchronously in a work queue; simplify peering a bit
Sage Weil [Mon, 15 Dec 2008 05:39:54 +0000 (21:39 -0800)]
osd: generate_backlog asynchronously in a work queue; simplify peering a bit

We do all backlog creation in a thread pool.  Break it down into the
disk scan and log integration steps, and drop PG lock as much as possible.
We only worry about pg acting changes; backlogs are only generated when the
pg is inactive.

We also simplify the activation code a bit by observing that replicas only
generate backlogs when their logs are discontiguous with the primary; in
such cases, we pull the backlog during peering and no generate_backlog
(equivalent) is needed for activation.

16 years ago osd: half-finished backlog_wq
Sage Weil [Fri, 12 Dec 2008 05:12:42 +0000 (21:12 -0800)]
osd: half-finished backlog_wq

16 years ago crush: don't recurse to leaf unless item is a bucket
Sage Weil [Fri, 12 Dec 2008 21:46:06 +0000 (13:46 -0800)]
crush: don't recurse to leaf unless item is a bucket

This avoids choking on 'chooseleaf indep 0 item device' (it's
equivalent to 'choose indep 0 item device').

16 years ago osd: shift generate_backlog out of merge_log
Sage Weil [Sat, 13 Dec 2008 04:21:34 +0000 (20:21 -0800)]
osd: shift generate_backlog out of merge_log

...in preparation for shifting it off to a worker thread.

16 years ago osd: for remaining peers, pull either log or backlog, but not both.
Sage Weil [Sat, 13 Dec 2008 04:01:14 +0000 (20:01 -0800)]
osd: for remaining peers, pull either log or backlog, but not both.

Pull as far back as peer's last_epoch_started (if they have that much).
This ensures we will pull any divergent entries, if there are any, so
that we can update our peer_missing map accordingly.

16 years ago osd: comment clean up
Sage Weil [Fri, 12 Dec 2008 23:12:42 +0000 (15:12 -0800)]
osd: comment clean up

16 years ago dstart: 3x replication
Sage Weil [Fri, 12 Dec 2008 23:12:31 +0000 (15:12 -0800)]
dstart: 3x replication

16 years ago osd: simplify peer code a bit
Sage Weil [Fri, 12 Dec 2008 23:02:55 +0000 (15:02 -0800)]
osd: simplify peer code a bit

Combine the two loops.

16 years ago osd: simplify master log recreation; fix up Log::copy_after
Sage Weil [Fri, 12 Dec 2008 23:00:34 +0000 (15:00 -0800)]
osd: simplify master log recreation; fix up Log::copy_after

Pull log from a given point from peer with the largest last_update.  Do
not worry about divergence on the peer; that is handled by the new
primary.  Simplifies PG::Query struct.

Fix copy_after to set an accurate .bottom, and to behave if the split
point given is divergent (i.e. doesn't actually appear in the log).

16 years ago vstart.sh/stop.sh can start and stop specific modules
Yehuda Sadeh [Fri, 12 Dec 2008 22:51:40 +0000 (14:51 -0800)]
vstart.sh/stop.sh can start and stop specific modules

16 years ago dstop: kill crun too
Sage Weil [Fri, 12 Dec 2008 22:07:49 +0000 (14:07 -0800)]
dstop: kill crun too

16 years ago osd: move max_rep back to 3x
Sage Weil [Fri, 12 Dec 2008 22:07:40 +0000 (14:07 -0800)]
osd: move max_rep back to 3x

16 years ago osd: rewrite proc_replica_log
Sage Weil [Fri, 12 Dec 2008 05:10:43 +0000 (21:10 -0800)]
osd: rewrite proc_replica_log

After we have the master log, our only real purpose with other peer/stray
logs is to update replica missing maps and to find any missing objects.
Rewrite the log handling to clearly do that, with some comments.

16 years ago osd: fix merge_old_entry bug
Sage Weil [Fri, 12 Dec 2008 05:11:38 +0000 (21:11 -0800)]
osd: fix merge_old_entry bug

We want to revise_need to the _new_ entry's version, not the old one
(which is what missing already refers to).

16 years ago osd: small peer cleanup
Sage Weil [Fri, 12 Dec 2008 00:06:16 +0000 (16:06 -0800)]
osd: small peer cleanup

Make sure we check peer_log_requested and peer_summary_requested
independently, depending on which we want.  Move 'since'
calculation to where it is needed.

16 years ago todos
Sage Weil [Fri, 12 Dec 2008 05:13:02 +0000 (21:13 -0800)]
todos

16 years ago cobserver: print all log entries in each state
Sage Weil [Thu, 11 Dec 2008 22:06:57 +0000 (14:06 -0800)]
cobserver: print all log entries in each state

16 years ago osd: initialize all MOSDSubOp fields
Sage Weil [Thu, 11 Dec 2008 22:03:18 +0000 (14:03 -0800)]
osd: initialize all MOSDSubOp fields

16 years ago mon: fix up MLog constructor
Sage Weil [Thu, 11 Dec 2008 22:03:07 +0000 (14:03 -0800)]
mon: fix up MLog constructor

Initialize 'last'.  More idiot proof.

16 years ago filestore: fix buffer overruns, mismatched delete[], small buffer
Sage Weil [Thu, 11 Dec 2008 21:48:02 +0000 (13:48 -0800)]
filestore: fix buffer overruns, mismatched delete[], small buffer

16 years ago osd: pad eversion_t and zero remainder
Sage Weil [Thu, 11 Dec 2008 21:44:56 +0000 (13:44 -0800)]
osd: pad eversion_t and zero remainder

16 years ago mon: mkfs log msg as error
Sage Weil [Thu, 11 Dec 2008 21:26:37 +0000 (13:26 -0800)]
mon: mkfs log msg as error

Just because it'll then create log.error, log.warn, etc.

16 years ago osd: clear out pg_stat_queue on shutdown
Sage Weil [Thu, 11 Dec 2008 19:14:46 +0000 (11:14 -0800)]
osd: clear out pg_stat_queue on shutdown

16 years ago filestore: sort objects by ino
Sage Weil [Thu, 11 Dec 2008 18:26:57 +0000 (10:26 -0800)]
filestore: sort objects by ino

This will greatly increase the speed at which we can stat() them, since
btrfs sorts them by ino in the btree.

16 years ago workqueue: include types.h
Sage Weil [Thu, 11 Dec 2008 18:05:55 +0000 (10:05 -0800)]
workqueue: include types.h

16 years ago dstart.sh fix broken commit
Yehuda Sadeh [Thu, 11 Dec 2008 01:11:25 +0000 (17:11 -0800)]
dstart.sh fix broken commit

16 years ago dstart.sh uses crun instead of -d (for gprof)
Yehuda Sadeh [Thu, 11 Dec 2008 01:09:44 +0000 (17:09 -0800)]
dstart.sh uses crun instead of -d (for gprof)

16 years ago osd: don't clear pg_stats_valid on send
Sage Weil [Wed, 10 Dec 2008 23:54:05 +0000 (15:54 -0800)]
osd: don't clear pg_stats_valid on send

16 years ago todo
Sage Weil [Wed, 10 Dec 2008 23:31:25 +0000 (15:31 -0800)]
todo

16 years ago osd: small cleanup
Sage Weil [Wed, 10 Dec 2008 23:07:57 +0000 (15:07 -0800)]
osd: small cleanup

16 years ago osd: call peer() if we need up_thru to activate
Sage Weil [Wed, 10 Dec 2008 22:50:16 +0000 (14:50 -0800)]
osd: call peer() if we need up_thru to activate

16 years ago osd: remove/fix waiting_for_head primary recovery logic
Sage Weil [Wed, 10 Dec 2008 22:25:44 +0000 (14:25 -0800)]
osd: remove/fix waiting_for_head primary recovery logic

Pulling map has the info we need.  Simplify.

16 years ago mon: observer cleanup
Sage Weil [Wed, 10 Dec 2008 21:46:40 +0000 (13:46 -0800)]
mon: observer cleanup

Simplify observer struct, some other stuff.

Call update_observers() when cmon is a single monitor (no cluster), and
also immediately after registering a new observer.

Make message in terms of latest summary vs state (Paxos class has no real
notion of 'incremental', just states and 'latest').

16 years ago cmonctl: fix compile error
Sage Weil [Wed, 10 Dec 2008 21:22:28 +0000 (13:22 -0800)]
cmonctl: fix compile error

16 years ago workqueue: drain
Sage Weil [Wed, 10 Dec 2008 21:04:46 +0000 (13:04 -0800)]
workqueue: drain

16 years ago workqueue: virtual destructor
Sage Weil [Wed, 10 Dec 2008 20:51:59 +0000 (12:51 -0800)]
workqueue: virtual destructor

16 years ago makefile: missing headers
Sage Weil [Wed, 10 Dec 2008 20:19:15 +0000 (12:19 -0800)]
makefile: missing headers

16 years ago osd: cleanup
Sage Weil [Wed, 10 Dec 2008 20:16:46 +0000 (12:16 -0800)]
osd: cleanup

16 years ago workqueue: non-inline worker, control methods; debugging
Sage Weil [Wed, 10 Dec 2008 20:15:00 +0000 (12:15 -0800)]
workqueue: non-inline worker, control methods; debugging

16 years ago mon: fix use after free
Sage Weil [Wed, 10 Dec 2008 19:54:59 +0000 (11:54 -0800)]
mon: fix use after free

16 years ago osd: use new workqueue in osd for ops
Sage Weil [Wed, 10 Dec 2008 19:35:04 +0000 (11:35 -0800)]
osd: use new workqueue in osd for ops

16 years ago osd: shared threadpool for multiple work queues
Sage Weil [Wed, 10 Dec 2008 19:13:00 +0000 (11:13 -0800)]
osd: shared threadpool for multiple work queues

16 years ago osd: fix uninit value in scrub message
Sage Weil [Wed, 10 Dec 2008 19:12:44 +0000 (11:12 -0800)]
osd: fix uninit value in scrub message

16 years ago todos
Sage Weil [Wed, 10 Dec 2008 00:47:40 +0000 (16:47 -0800)]
todos

16 years ago mon: mark unresponsive mds laggy instead of failed until we can replace it
Sage Weil [Wed, 10 Dec 2008 00:34:54 +0000 (16:34 -0800)]
mon: mark unresponsive mds laggy instead of failed until we can replace it

This way we flag laggy mds's, but hold out until they come back
online or we have a standby cmds to replace them.  Should make
things much more tolerable.

16 years ago cobserver: simplify headers
Sage Weil [Tue, 9 Dec 2008 23:06:48 +0000 (15:06 -0800)]
cobserver: simplify headers

16 years ago osd: make sure hb peers get marked down
Sage Weil [Wed, 10 Dec 2008 00:00:27 +0000 (16:00 -0800)]
osd: make sure hb peers get marked down

We mark_down on osdmap update when we see an osd has gone down, but the
heartbeats are sent in a different thread without map_lock using
heartbeat_inst.  So, make sure heartbeat_inst entries are removed.

Also, we add hb peers at peers' request.  When removing such entries in
update_heartbeat_peers, mark_down then, too.  (We may mark_down a failed
peer, and then receive the hb request late.  So we mark that down next
time we update the heartbeat maps.)

16 years ago osd: update_stat during recover_replicas()
Sage Weil [Tue, 9 Dec 2008 23:06:54 +0000 (15:06 -0800)]
osd: update_stat during recover_replicas()

16 years ago dstart: --nostop option
Sage Weil [Tue, 9 Dec 2008 22:57:43 +0000 (14:57 -0800)]
dstart: --nostop option

to avoid ./dstop.sh

16 years ago osd: drive primary recovery via missing map, not log
Sage Weil [Tue, 9 Dec 2008 22:57:19 +0000 (14:57 -0800)]
osd: drive primary recovery via missing map, not log

16 years ago mon: osdmon cleanup
Sage Weil [Tue, 9 Dec 2008 22:57:01 +0000 (14:57 -0800)]
mon: osdmon cleanup

16 years ago dstart: keep old cosd binaries around for a bit
Sage Weil [Tue, 9 Dec 2008 21:34:27 +0000 (13:34 -0800)]
dstart: keep old cosd binaries around for a bit

16 years ago osd: 'pg repair <pgid>' to repair an inconsistent pg using replicas
Sage Weil [Tue, 9 Dec 2008 21:33:33 +0000 (13:33 -0800)]
osd: 'pg repair <pgid>' to repair an inconsistent pg using replicas

16 years ago osd: don't read file content during _scrub
Sage Weil [Tue, 9 Dec 2008 20:27:31 +0000 (12:27 -0800)]
osd: don't read file content during _scrub

16 years ago msgr: be noisier about mark_down calls
Sage Weil [Tue, 9 Dec 2008 19:56:51 +0000 (11:56 -0800)]
msgr: be noisier about mark_down calls

16 years ago osd: avoid needless calls to peer(), build_prior()
Sage Weil [Tue, 9 Dec 2008 19:00:04 +0000 (11:00 -0800)]
osd: avoid needless calls to peer(), build_prior()

Introduces a PEERING pg state.  Also makes us smarter about when
build_prior() and peer() are actually called.

16 years ago osd: make prior_set_affected() slightly smarter
Sage Weil [Tue, 9 Dec 2008 18:40:01 +0000 (10:40 -0800)]
osd: make prior_set_affected() slightly smarter

Only return true if an osd goes down that we didn't already know was
down (prior_set may contain down osds if the PG is marked DOWN).

16 years ago cobserver: cleanups
Sage Weil [Tue, 9 Dec 2008 22:57:48 +0000 (14:57 -0800)]
cobserver: cleanups

16 years ago mon: use 'latest' for latest osd, mds maps
Sage Weil [Tue, 9 Dec 2008 22:55:00 +0000 (14:55 -0800)]
mon: use 'latest' for latest osd, mds maps

Mainly for benefit of PaxosObserver, but it also cleans things up
a bit.

16 years ago cobserver: cleanup; print map summaries w/ each new state
Sage Weil [Tue, 9 Dec 2008 22:44:58 +0000 (14:44 -0800)]
cobserver: cleanup; print map summaries w/ each new state

16 years ago mon: refactor map print_summary/operator<< methods
Sage Weil [Tue, 9 Dec 2008 22:44:41 +0000 (14:44 -0800)]
mon: refactor map print_summary/operator<< methods

16 years ago cobserver: accidentally removed a line
Yehuda Sadeh [Tue, 9 Dec 2008 22:23:38 +0000 (14:23 -0800)]
cobserver: accidentally removed a line

16 years ago kclient: missing files
Yehuda Sadeh [Tue, 9 Dec 2008 22:20:44 +0000 (14:20 -0800)]
kclient: missing files

16 years ago whitespaces
Yehuda Sadeh [Tue, 9 Dec 2008 22:19:27 +0000 (14:19 -0800)]
whitespaces

16 years ago mon: factor ClientMap class out
Yehuda Sadeh [Tue, 9 Dec 2008 22:11:09 +0000 (14:11 -0800)]
mon: factor ClientMap class out

16 years ago cobserver: utility, observe changes in different maps
Yehuda Sadeh [Tue, 9 Dec 2008 21:39:10 +0000 (13:39 -0800)]
cobserver: utility, observe changes in different maps

16 years ago osd: use push() to push clone op
Sage Weil [Tue, 9 Dec 2008 19:47:06 +0000 (11:47 -0800)]
osd: use push() to push clone op

Also fixes missing updates to peer_missing[peer] and pushing
map.

16 years ago mon: factor out osdmap print, print_summary
Sage Weil [Tue, 9 Dec 2008 17:58:16 +0000 (09:58 -0800)]
mon: factor out osdmap print, print_summary

16 years ago mon: factor out mds print, print_summary
Sage Weil [Tue, 9 Dec 2008 17:58:08 +0000 (09:58 -0800)]
mon: factor out mds print, print_summary

16 years ago mds: stay loner if client has B and no other reason to switch state
Sage Weil [Mon, 8 Dec 2008 21:50:46 +0000 (13:50 -0800)]
mds: stay loner if client has B and no other reason to switch state

If the client has dirty data, and there is no other reason to
toggle the lock state, leave it as LONER.  The client will write
out at its leisure, and we'll avoid an unstable lock state that
is waiting on a potentially slow writeout.

16 years ago osd: missing last_mon_heartbeat declaration
Sage Weil [Tue, 9 Dec 2008 17:50:26 +0000 (09:50 -0800)]
osd: missing last_mon_heartbeat declaration

16 years ago msgr: make sure nonce matches too when connecting to peer
Sage Weil [Tue, 9 Dec 2008 16:48:03 +0000 (08:48 -0800)]
msgr: make sure nonce matches too when connecting to peer

Otherwise the predictable port numbers cause problems.

16 years ago msgr: print error when message type is unrecognized
Sage Weil [Tue, 9 Dec 2008 16:43:38 +0000 (08:43 -0800)]
msgr: print error when message type is unrecognized

16 years ago osd: ping mon less frequently when peerless
Sage Weil [Tue, 9 Dec 2008 16:42:28 +0000 (08:42 -0800)]
osd: ping mon less frequently when peerless

Every second is too much.  Make it tunable.

16 years ago mon: typo in pg dump output
Sage Weil [Mon, 8 Dec 2008 22:03:46 +0000 (14:03 -0800)]
mon: typo in pg dump output

16 years ago ceph: new default mon port; try to bind to port in known range
Sage Weil [Mon, 8 Dec 2008 19:44:21 +0000 (11:44 -0800)]
ceph: new default mon port; try to bind to port in known range

New monitor port in unused region (according to nmap-services).

Try to bind to a port in a known range, so that tools can easily
identify the protocol in use.

Remove some old .sh cruft.