git.apps.os.sepia.ceph.com Git - ceph.git/log
Sage Weil [Fri, 7 Nov 2008 22:02:12 +0000 (14:02 -0800)]
osd: track recovery sources independently of missing list
Fixes pull() to choose an osd that isn't down.
Sage Weil [Fri, 7 Nov 2008 21:36:52 +0000 (13:36 -0800)]
mds: debug session ref in EMetaBlob during replay
Sage Weil [Fri, 7 Nov 2008 21:26:41 +0000 (13:26 -0800)]
mds: match last snap exactly on replay, add_*_dentry
In general, we add new snapped dentries and THEN the new live dentry
to the metablob. That means that during replay, we see [2,2] followed
by [3,head], replacing [2,head]. The [2,2] dentry should be added
anew, without paying heed to [2,head], and then the [3,head] should
replace/update [2,head].
It was mainly just the assertions in add_*_dentry that were getting
in the way, but the lookup_exact_snap is also slightly faster.
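A minimal sketch of the replay order described in this message, assuming dentries are indexed by (name, last snap). The `Dir` and `Dentry` types and `replay_dentry` are hypothetical stand-ins rather than the MDS classes; only the name `lookup_exact_snap` comes from the message, and its real signature is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <utility>

using snapid_t = std::uint64_t;
// Stand-in for "head": the live, unsnapped end of a dentry's range.
constexpr snapid_t SNAP_HEAD = std::numeric_limits<snapid_t>::max();

struct Dentry {
  snapid_t first;
  snapid_t last;
};

// Hypothetical directory: dentries keyed by (name, last snap).
struct Dir {
  std::map<std::pair<std::string, snapid_t>, Dentry> items;

  // Range lookup: the dentry whose [first,last] covers `snap`, if any.
  Dentry* lookup(const std::string& name, snapid_t snap) {
    auto it = items.lower_bound(std::make_pair(name, snap));
    if (it == items.end() || it->first.first != name ||
        it->second.first > snap)
      return nullptr;
    return &it->second;
  }

  // Exact lookup: only matches a dentry whose `last` equals the argument.
  Dentry* lookup_exact_snap(const std::string& name, snapid_t last) {
    auto it = items.find(std::make_pair(name, last));
    return it == items.end() ? nullptr : &it->second;
  }

  // Replay one journaled dentry [first,last]: update the exact match if
  // there is one, otherwise add it anew.
  void replay_dentry(const std::string& name, snapid_t first, snapid_t last) {
    if (Dentry* dn = lookup_exact_snap(name, last))
      dn->first = first;
    else
      items.emplace(std::make_pair(name, last), Dentry{first, last});
  }
};
```

Starting from [2,head], replaying [2,2] finds no exact match and is added as a new dentry; replaying [3,head] then matches the existing head dentry and moves its `first` to 3. A range lookup for snap 2 at the first step would instead have returned [2,head], which is the wrong item.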
Yehuda Sadeh [Fri, 7 Nov 2008 21:11:08 +0000 (13:11 -0800)]
Merge branch 'unstable' of ssh://ceph.newdream.net/git/ceph into unstable
Yehuda Sadeh [Fri, 7 Nov 2008 21:10:58 +0000 (13:10 -0800)]
kclient: when going down, release caps anyway
Sage Weil [Fri, 7 Nov 2008 21:07:28 +0000 (13:07 -0800)]
mon: avoid updating pg_map when osd_stat is unchanged
Sage Weil [Fri, 7 Nov 2008 20:44:26 +0000 (12:44 -0800)]
cmonctl: -w or --watch to watch (and print) mds/osd/pg stat changes
Sage Weil [Fri, 7 Nov 2008 18:56:25 +0000 (10:56 -0800)]
mds: don't cow a null dentry
Yehuda Sadeh [Fri, 7 Nov 2008 18:23:00 +0000 (10:23 -0800)]
kclient: sparse warnings
Yehuda Sadeh [Fri, 7 Nov 2008 18:21:19 +0000 (10:21 -0800)]
vstart.sh: add usage of $CEPH_BIN
Sage Weil [Fri, 7 Nov 2008 17:47:59 +0000 (09:47 -0800)]
osd: don't repeer an active pg just because the prior_set was affected
We only want to restart peering due to prior_set changes if it hasn't completed
yet.
Sage Weil [Fri, 7 Nov 2008 00:28:17 +0000 (16:28 -0800)]
mds: check dn->last when finding existing dentries during replay
We can't simply search for an existing dentry based on the name and end
snap, as that may turn up the wrong item. For example, if we have
[2,head] and the replaying operations cowed that to [2,2] and [3,head], then
if we replay the [2,2] item first we'll find [2,head] (the _wrong_ dentry)
and throw an assertion.
So just check for dn->last != p->dnlast.
Sage Weil [Fri, 7 Nov 2008 00:14:18 +0000 (16:14 -0800)]
todos
Sage Weil [Fri, 7 Nov 2008 03:27:51 +0000 (19:27 -0800)]
ebofs: another recursive lock bug
Sage Weil [Fri, 7 Nov 2008 03:15:50 +0000 (19:15 -0800)]
osd: turn up debug on any shutdown, not just SIGINT/SIGTERM, for now
Sage Weil [Fri, 7 Nov 2008 03:15:32 +0000 (19:15 -0800)]
msgr: fix problem with forced stop of pipe
Sage Weil [Fri, 7 Nov 2008 03:15:10 +0000 (19:15 -0800)]
ebofs: fix lock recursion
Sage Weil [Thu, 6 Nov 2008 22:26:10 +0000 (14:26 -0800)]
mon: handle invalid commands to pgmon
Sage Weil [Thu, 6 Nov 2008 23:32:32 +0000 (15:32 -0800)]
osd: add degraded pg state bit
Sage Weil [Thu, 6 Nov 2008 23:03:49 +0000 (15:03 -0800)]
osd: improve build_prior logic
If, during some interval since the pg last went active, we may have gone
rw, but none of the osds survived, then we include all of those osds
in the prior_set (even though they're down), because they may have written data
that we want.
The prior logic appears to have been broken. It was only looking at the
primary osd.
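The rule in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the OSD code: `Interval`, `PriorSet`, and this `build_prior` signature are hypothetical, and the choice to include currently-up osds from every past interval is an assumption the message does not spell out.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical record of one past interval since the pg last went active.
struct Interval {
  std::vector<int> acting;  // osds in the acting set during the interval
  bool maybe_went_rw;       // could writes have been accepted?
};

struct PriorSet {
  std::set<int> osds;  // osds whose info we want during peering
  std::set<int> down;  // members that are included although they are down
};

PriorSet build_prior(const std::vector<Interval>& past,
                     const std::set<int>& up_now) {
  PriorSet prior;
  for (const Interval& iv : past) {
    // Look at every osd in the interval, not only the primary.
    bool any_survived = false;
    for (int o : iv.acting) {
      if (up_now.count(o)) {
        prior.osds.insert(o);
        any_survived = true;
      }
    }
    // The interval may have gone rw and nobody from it is up: keep all of
    // its osds anyway, since they may hold writes that we want.
    if (iv.maybe_went_rw && !any_survived) {
      for (int o : iv.acting) {
        prior.osds.insert(o);
        prior.down.insert(o);
      }
    }
  }
  return prior;
}
```

With past intervals {0,1} (rw), {2,3} (rw), and {4} (not rw), and only osds 0 and 4 up, the result contains 0, 2, 3, and 4, with 2 and 3 marked as down members.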
Sage Weil [Thu, 6 Nov 2008 22:11:13 +0000 (14:11 -0800)]
osd: turn up debugging on SIGINT/TERM
Sage Weil [Thu, 6 Nov 2008 21:58:10 +0000 (13:58 -0800)]
osd: fix osd_lock recursion in wake_snap_trimmer
Yehuda Sadeh [Thu, 6 Nov 2008 21:47:08 +0000 (13:47 -0800)]
kclient: bookkeeper detects buffer overrun
Yehuda Sadeh [Thu, 6 Nov 2008 21:26:08 +0000 (13:26 -0800)]
kclient: frag_make_child fix (sage)
Sage Weil [Thu, 6 Nov 2008 17:43:59 +0000 (09:43 -0800)]
osd: don't pull if source osd is down
Sage Weil [Thu, 6 Nov 2008 18:57:10 +0000 (10:57 -0800)]
kclient: ran checkpatch
Sage Weil [Thu, 6 Nov 2008 18:56:51 +0000 (10:56 -0800)]
todos
Sage Weil [Thu, 6 Nov 2008 00:54:52 +0000 (16:54 -0800)]
synclient: fix debug prefix
Sage Weil [Wed, 5 Nov 2008 23:07:39 +0000 (15:07 -0800)]
cfuse: fix symlink call
Sage Weil [Wed, 5 Nov 2008 22:54:51 +0000 (14:54 -0800)]
vstart.sh
Sage Weil [Wed, 5 Nov 2008 22:54:13 +0000 (14:54 -0800)]
fix env parsing
Sage Weil [Wed, 5 Nov 2008 22:53:53 +0000 (14:53 -0800)]
streamtest: fix debug
Sage Weil [Wed, 5 Nov 2008 22:39:02 +0000 (14:39 -0800)]
vstartnew.sh: clean out gmon
Sage Weil [Wed, 5 Nov 2008 22:38:45 +0000 (14:38 -0800)]
journal: debugging journal full
Sage Weil [Wed, 5 Nov 2008 22:38:19 +0000 (14:38 -0800)]
dstart.sh: -d flag
Sage Weil [Wed, 5 Nov 2008 22:31:44 +0000 (14:31 -0800)]
config: parse CEPH_ARGS env var too
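One way such parsing could look, as a sketch only. The helper names `split_args` and `collect_args` are hypothetical, splitting on whitespace without quoting is a simplification, and placing the environment arguments before the command-line ones is an assumption; the log does not say how the real parser orders or tokenizes them.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Split a whitespace-separated option string into tokens.
// A null pointer (variable not set) yields an empty list.
std::vector<std::string> split_args(const char* s) {
  std::vector<std::string> out;
  if (!s) return out;
  std::istringstream in(s);
  std::string tok;
  while (in >> tok) out.push_back(tok);
  return out;
}

// Build the effective argument list: CEPH_ARGS first, then argv[1..],
// so that (by assumption) command-line options are seen last.
std::vector<std::string> collect_args(int argc, const char* const* argv) {
  std::vector<std::string> args = split_args(std::getenv("CEPH_ARGS"));
  for (int i = 1; i < argc; ++i) args.push_back(argv[i]);
  return args;
}
```

For example, with `CEPH_ARGS="--lockdep 1"` in the environment and `-d` on the command line, `collect_args` would return `--lockdep`, `1`, `-d`.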
Sage Weil [Wed, 5 Nov 2008 22:15:21 +0000 (14:15 -0800)]
client: fix client_lock recursion
Sage Weil [Wed, 5 Nov 2008 22:09:04 +0000 (14:09 -0800)]
rewrite debug macros, infrastructure
Sage Weil [Tue, 4 Nov 2008 22:50:21 +0000 (14:50 -0800)]
try to chdir on exit to avoid clobbering ./gmon.out
Sage Weil [Tue, 4 Nov 2008 22:43:48 +0000 (14:43 -0800)]
osd: fix prior_set_up_thru condition
If an OSD's up_thru affects the membership of the prior_set, take note.
Then, if the osd's up_thru changes later, we know to rebuild it.
Sage Weil [Tue, 4 Nov 2008 22:19:26 +0000 (14:19 -0800)]
osd: fix PG::Info::History::same_since adjustment in advance_map
...now that we may reach this code even when the acting set is unchanged.
Sage Weil [Tue, 4 Nov 2008 21:49:18 +0000 (13:49 -0800)]
osd: repeer osd if prior set may be affected
Previously we only repeered if the active set changed. However, changes
in the up/down state of the prior set (or prior set candidates) or the
primary osd's up_thru can also affect the prior set and peering.
This fixes the problem where PGs get stuck in a "crashed" state without
moving to "crashed+replay". We sit and wait for info from a peer who
we thought was up but is now down, or vice-versa.
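The repeering condition described here might be sketched as below. All types and function names (`OsdMapView`, `PgState`, `prior_set_affected`, `should_repeer`) are hypothetical and only model the three triggers the message names: a changed acting set, a change in up/down state of a prior set member or candidate, and a change in the primary's up_thru when that value influenced the prior set.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical, reduced view of an osdmap epoch.
struct OsdMapView {
  std::set<int> up;                 // osds currently up
  std::map<int, unsigned> up_thru;  // osd -> up_thru epoch
};

// Hypothetical per-pg peering state.
struct PgState {
  std::vector<int> acting;         // current acting set
  std::set<int> prior_candidates;  // prior set members and candidates
  int primary;
  bool up_thru_matters;  // did the primary's up_thru shape the prior set?
};

static unsigned up_thru_of(const OsdMapView& m, int osd) {
  auto it = m.up_thru.find(osd);
  return it == m.up_thru.end() ? 0u : it->second;
}

bool prior_set_affected(const PgState& pg, const OsdMapView& oldm,
                        const OsdMapView& newm) {
  // Any up/down flip among prior set members or candidates counts.
  for (int o : pg.prior_candidates)
    if (oldm.up.count(o) != newm.up.count(o)) return true;
  // A changed up_thru only counts if we noted that it affects membership.
  if (pg.up_thru_matters &&
      up_thru_of(oldm, pg.primary) != up_thru_of(newm, pg.primary))
    return true;
  return false;
}

bool should_repeer(const PgState& pg, const std::vector<int>& new_acting,
                   const OsdMapView& oldm, const OsdMapView& newm) {
  return pg.acting != new_acting || prior_set_affected(pg, oldm, newm);
}
```

In this model a pg with an unchanged acting set still repeers when one of its prior set candidates goes down, which is the case that previously left the primary waiting on a peer it wrongly believed was up.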
Sage Weil [Tue, 4 Nov 2008 21:01:08 +0000 (13:01 -0800)]
osd: shutdown cleanly on SIGINT, too
Sage Weil [Tue, 4 Nov 2008 21:01:01 +0000 (13:01 -0800)]
osd: clean up shutdown sequence
Sage Weil [Tue, 4 Nov 2008 21:00:24 +0000 (13:00 -0800)]
osd: shutdown cleanly on SIGTERM
Sage Weil [Tue, 4 Nov 2008 20:08:15 +0000 (12:08 -0800)]
filestore: lock fsid file to avoid multiple users
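A sketch of the general technique, assuming a POSIX system. The log does not say which locking primitive the filestore uses; this example picks `flock()` with a non-blocking exclusive lock, and `lock_fsid_file` is a hypothetical helper name.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <sys/file.h>
#include <unistd.h>

// Try to take an exclusive, non-blocking lock on the given file.
// Returns a descriptor that holds the lock, or -1 if the file cannot be
// opened or another open file description already holds the lock.
int lock_fsid_file(const std::string& path) {
  int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0644);
  if (fd < 0) return -1;
  if (::flock(fd, LOCK_EX | LOCK_NB) < 0) {
    ::close(fd);
    return -1;
  }
  return fd;  // keep this open for as long as the store is in use
}
```

The lock is released when the returned descriptor is closed or the process exits, so a second user of the same store directory fails fast instead of sharing it.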
Sage Weil [Tue, 4 Nov 2008 19:45:53 +0000 (11:45 -0800)]
journal: fix recursive locking when queueing commit callback; simplify
Sage Weil [Tue, 4 Nov 2008 19:45:22 +0000 (11:45 -0800)]
lockdep: separate from Mutex; include checks for RWLock
Sage Weil [Tue, 4 Nov 2008 19:25:06 +0000 (11:25 -0800)]
osd: remove odd divergent log assertion
The divergent log handling is still broken in the face of backlogs, as we
can't really know if an item is really divergent or if it was deleted.
Since we can only diverge with administrator intervention, this is at least
not something we need to worry about _too_ much for now...
Sage Weil [Tue, 4 Nov 2008 19:19:35 +0000 (11:19 -0800)]
osd: put pg logs in collection 0, not the pg itself
This avoids having to special case the log object when generating backlog,
etc.
Sage Weil [Tue, 4 Nov 2008 19:18:53 +0000 (11:18 -0800)]
osd: fix recovery deferral
Sage Weil [Tue, 4 Nov 2008 00:54:13 +0000 (16:54 -0800)]
dstart.sh: debug journal
Sage Weil [Tue, 4 Nov 2008 00:53:45 +0000 (16:53 -0800)]
osd: mention pgs that do not change during advance_map()
I was seeing a missed clear_primary_state() for some pgs... not sure why
advance_map() missed them. Having trouble reproducing.
Sage Weil [Tue, 4 Nov 2008 00:52:45 +0000 (16:52 -0800)]
journal: ensure we see a clean sequence of entries on read/replay
Only lightly tested, but so far so good.
Sage Weil [Tue, 4 Nov 2008 00:42:22 +0000 (16:42 -0800)]
msgr: reorder locking in mark_down()
There was a strange series of crashes when retaking Pipe::lock inside
stop(). Not exactly sure why, but this simplifies locking slightly, and
behaves.
Sage Weil [Tue, 4 Nov 2008 00:07:45 +0000 (16:07 -0800)]
dstart.sh: enable lockdep
Sage Weil [Tue, 4 Nov 2008 00:05:45 +0000 (16:05 -0800)]
osd: fix recursive lock on remove_list_lock
queue_for_removal() takes the lock inside the loop.
Sage Weil [Tue, 4 Nov 2008 00:05:12 +0000 (16:05 -0800)]
msgr: fix recursive locking in mark_down()
pipe::stop() takes the rank lock when it needs it.
Sage Weil [Mon, 3 Nov 2008 23:55:59 +0000 (15:55 -0800)]
osd: avoid locking multiple pgs at once
This is just to satisfy lockdep.
Sage Weil [Mon, 3 Nov 2008 23:55:14 +0000 (15:55 -0800)]
lockdep: error out on recursive locks
There is no checking between instances here; this currently just
assumes that taking two locks of the same type is bad.
(In practice, the caller could do this safely with some care.)
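The type-based check described in this message can be illustrated with a small single-threaded model. `LockdepSketch` and its methods are hypothetical names for this sketch; the real lockdep tracks held locks per thread and also records ordering dependencies, both of which are left out here.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model: one thread, one set of held lock types.
class LockdepSketch {
  std::map<std::string, int> ids_;  // lock type name -> numeric id
  std::set<int> held_;              // type ids currently held

 public:
  // All instances that share a type name share one id.
  int register_type(const std::string& name) {
    auto it = ids_.find(name);
    if (it != ids_.end()) return it->second;
    int id = static_cast<int>(ids_.size());
    ids_.emplace(name, id);
    return id;
  }

  // Returns false if a lock of the same type is already held. Instances
  // are not distinguished, so two different mutexes of one type trip this.
  bool will_lock(int id) { return held_.insert(id).second; }

  void will_unlock(int id) { held_.erase(id); }
};
```

Because the check works on types, locking two distinct objects whose mutexes carry the same name is reported just like a true recursive lock on one mutex.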
Sage Weil [Mon, 3 Nov 2008 21:34:11 +0000 (13:34 -0800)]
mutex: non-recursive by default
Sage Weil [Mon, 3 Nov 2008 21:24:01 +0000 (13:24 -0800)]
mutex: remove nlock assertions
These do not work when we cond.Wait(lock), because the lock drop via the
Cond wait does not decrement nlock. Just remove them, they're obvious
anyway.
Sage Weil [Mon, 3 Nov 2008 20:38:46 +0000 (12:38 -0800)]
/bin/bash, not /bin/sh
Sage Weil [Mon, 3 Nov 2008 20:37:24 +0000 (12:37 -0800)]
lockdep: faster
Sage Weil [Mon, 3 Nov 2008 20:35:40 +0000 (12:35 -0800)]
crun: no let
Sage Weil [Mon, 3 Nov 2008 19:25:32 +0000 (11:25 -0800)]
lockdep: use static array for dependency map
Sage Weil [Mon, 3 Nov 2008 18:48:09 +0000 (10:48 -0800)]
fakemsgr: missing mutex annotation
Sage Weil [Mon, 3 Nov 2008 17:51:23 +0000 (09:51 -0800)]
lockdep: assign numeric ids to each lock type
Sage Weil [Mon, 3 Nov 2008 15:31:04 +0000 (07:31 -0800)]
lockdep: only track/show held lock backtraces if --lockdep 2
Yehuda Sadeh [Mon, 3 Nov 2008 19:27:33 +0000 (11:27 -0800)]
reopen log files on userspace daemons when getting a HUP signal
Sage Weil [Mon, 3 Nov 2008 03:50:07 +0000 (19:50 -0800)]
lockdep: BackTrace.h
Sage Weil [Mon, 3 Nov 2008 03:50:00 +0000 (19:50 -0800)]
vstartnew.sh: enable lockdep
Sage Weil [Mon, 3 Nov 2008 03:49:20 +0000 (19:49 -0800)]
msgr: fix lock ordering on accept()
Sage Weil [Mon, 3 Nov 2008 03:46:55 +0000 (19:46 -0800)]
lockdep: fix include
Sage Weil [Mon, 3 Nov 2008 03:46:00 +0000 (19:46 -0800)]
ebofs: avoid taking mutex recursively
Sage Weil [Sat, 1 Nov 2008 00:13:13 +0000 (17:13 -0700)]
lockdep: disable on _dout_lock
Sage Weil [Sat, 1 Nov 2008 00:10:06 +0000 (17:10 -0700)]
lockdep: include Mutex.cc
Sage Weil [Sat, 1 Nov 2008 00:05:08 +0000 (17:05 -0700)]
lockdep: disable on per-mutex basis (and do so for atomic_t)
You should disable it if you _know_ you are an inner mutex, and
will never try to acquire another lock while you are held.
Sage Weil [Fri, 31 Oct 2008 23:46:39 +0000 (16:46 -0700)]
lockdep: enable with '--lockdep 1', off by default.
Sage Weil [Fri, 31 Oct 2008 23:17:01 +0000 (16:17 -0700)]
lockdep: make it work
Sage Weil [Fri, 31 Oct 2008 22:01:09 +0000 (15:01 -0700)]
lockdep: annotate Mutex declarations
Sage Weil [Fri, 31 Oct 2008 23:47:23 +0000 (16:47 -0700)]
msgr: set lossy flag on connect attempt
Sage Weil [Fri, 31 Oct 2008 20:12:51 +0000 (13:12 -0700)]
kclient: style, tabs
Sage Weil [Fri, 31 Oct 2008 23:33:54 +0000 (16:33 -0700)]
crush: no debug output
Sage Weil [Fri, 31 Oct 2008 21:57:32 +0000 (14:57 -0700)]
crush: dprintk lameness
Sage Weil [Fri, 31 Oct 2008 21:09:51 +0000 (14:09 -0700)]
crush todos
Sage Weil [Fri, 31 Oct 2008 21:09:22 +0000 (14:09 -0700)]
crush: fall back to a linear search if pseudorandom mapping isn't finding anything
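A sketch of the fallback idea, not of CRUSH itself. The `mix` function is a simple stand-in hash (CRUSH uses its own), and `choose`, `usable`, and `max_tries` are hypothetical names: a bounded number of pseudorandom draws is tried first, and a linear scan runs only if none of them yields an acceptable item.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Simple integer mixer used as a stand-in; NOT the hash CRUSH uses.
static std::uint32_t mix(std::uint32_t x, std::uint32_t r) {
  std::uint32_t h = (x * 2654435761u) ^ (r + 0x9e3779b9u);
  h ^= h >> 16;
  h *= 0x85ebca6bu;
  h ^= h >> 13;
  return h;
}

// Pick an item for input x. Returns the chosen item, or -1 if no item
// is usable at all.
int choose(std::uint32_t x, const std::vector<int>& items,
           const std::function<bool(int)>& usable, int max_tries) {
  if (items.empty()) return -1;
  // Pseudorandom attempts: each retry r gives a different candidate.
  for (int r = 0; r < max_tries; ++r) {
    int cand = items[mix(x, static_cast<std::uint32_t>(r)) % items.size()];
    if (usable(cand)) return cand;
  }
  // Fallback: scan linearly so a usable item is found if one exists.
  for (int cand : items)
    if (usable(cand)) return cand;
  return -1;
}
```

The fallback trades an even distribution for a guaranteed result, which only matters in the rare case where most items are unusable and the pseudorandom draws keep missing the remaining ones.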
Sage Weil [Fri, 31 Oct 2008 19:49:34 +0000 (12:49 -0700)]
crush todo
Sage Weil [Fri, 31 Oct 2008 19:48:51 +0000 (12:48 -0700)]
dstart.sh: use chooseleaf for data, cas crush rules
Sage Weil [Fri, 31 Oct 2008 19:48:30 +0000 (12:48 -0700)]
crush: fix list bucket, chooseleaf behavior
Sage Weil [Fri, 31 Oct 2008 18:53:17 +0000 (11:53 -0700)]
osdmaptool: test pg mapping
Sage Weil [Fri, 31 Oct 2008 18:52:25 +0000 (11:52 -0700)]
makefile: make --with-debug work, fix build errors
Sage Weil [Fri, 31 Oct 2008 17:43:48 +0000 (10:43 -0700)]
osd: report pg osds, osd peers to pgmonitor; include in pg dump
Yehuda Sadeh [Fri, 31 Oct 2008 18:07:23 +0000 (11:07 -0700)]
kclient: keep a pointer to the current snap context in the inode
Yehuda Sadeh [Fri, 31 Oct 2008 17:18:00 +0000 (10:18 -0700)]
kclient: use current snap context if not found
Yehuda Sadeh [Thu, 30 Oct 2008 23:35:23 +0000 (16:35 -0700)]
kclient: don't register a new bdi for the same client
Sage Weil [Thu, 30 Oct 2008 23:21:33 +0000 (16:21 -0700)]
osd: do all recovery operations in dedicated recovery thread
Sage Weil [Thu, 30 Oct 2008 22:50:46 +0000 (15:50 -0700)]
dstart.sh: 2x rep only
Sage Weil [Thu, 30 Oct 2008 22:00:43 +0000 (15:00 -0700)]
dstop.sh: stop crun too
Yehuda Sadeh [Thu, 30 Oct 2008 22:57:44 +0000 (15:57 -0700)]
kclient: override rdcache invalidation time when going down