Sage Weil [Thu, 13 Nov 2008 21:30:26 +0000 (13:30 -0800)]
mds: treat open requests as non-idempotent
The problem is that the reply contains a capability, and as such
is stateful and can't be lost. Forwarding by the MDS on behalf of
the client, however, introduces the possibility of multiple copies
of a request being in flight if one of the MDSs fails, and the
client will drop any duplicate replies it receives.
Alternatively, the client could _also_ parse duplicate responses
(i.e. call fill_trace). I'm not sure if that's a good idea. In
any case, MDS forwarded requests are only really important for
dealing with flash flood scenarios on extremely large clusters,
so let's just set this aside for now.
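A rough illustration of why a duplicated open reply is a problem (the names below are hypothetical, not the actual client code): a client that filters replies by request tid drops the second copy, and with it the capability that copy carried.

    #include <set>
    #include <cstdint>

    struct Reply {
      uint64_t tid;
      bool carries_cap;   // open replies carry a capability
    };

    struct ClientSketch {
      std::set<uint64_t> handled;   // tids we've already answered

      void handle_reply(const Reply &r) {
        if (!handled.insert(r.tid).second) {
          // Duplicate reply: dropped, along with any capability it
          // carried.  That lost cap is why open can't be treated as an
          // idempotent request that is safe to have in flight twice.
          return;
        }
        // normal path: fill_trace(), install the cap, wake the waiter
      }
    };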
Sage Weil [Wed, 12 Nov 2008 22:15:22 +0000 (14:15 -0800)]
mds: multiversion inodes with multiple links, too
We may have remote links that get snapped. They need to be able to
find the (single) anchored multiversion inode to get the correct
version. (The anchor table isn't versioned.)
We'll need to go a step further than this and create snaprealms for
some of these too in order to handle inodes linked into multiple
realms. But that needs backpointers first...
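A minimal sketch of the rule this implies, with made-up types (not the MDS cache code): an inode with more than one link stays a single, anchored, multiversion inode when it is cowed, because remote links resolve through the unversioned anchor table.

    // Hypothetical, simplified inode
    struct Inode {
      int nlink;           // primary + remote links
      bool multiversion;   // old versions kept inside the one inode
    };

    void cow_for_snapshot(Inode &in) {
      if (in.nlink > 1) {
        // Remote links can only find one anchored location, so that
        // location has to carry every version itself.
        in.multiversion = true;
      }
      // else: the normal cow path may clone the inode into the snap
    }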
Sage Weil [Wed, 12 Nov 2008 19:23:01 +0000 (11:23 -0800)]
mds: remove cap _after_ journaling update, at the same time we send the msg
There was an ordering problem that could come up when we prepared
a release message and removed the cap, but then didn't send it to
the client until after the update was journaled. This could cause
us to remove the _next_ instance of the capability (from a
subsequent open) in certain circumstances.
Instead, wait until after we journal the update before removing
the client cap and sending the ack. Since time has passed,
re-verify that the release request's seq is still >= the last_open
at that time. Introduce a helper to avoid duplicating code for the
case where no journaling is necessary and the cap is immediately
released in _do_cap_update.
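A sketch of the new ordering with hypothetical helpers (not the actual Locker code): journal first, and only in the commit callback re-check the release against last_open before removing the cap and sending the ack.

    #include <cstdint>
    #include <functional>

    struct Capability {
      uint64_t last_open;   // seq of the most recent open on this cap
    };

    void journal_update(std::function<void()> on_commit);   // hypothetical
    void remove_cap_and_ack(Capability *cap);               // hypothetical

    void handle_cap_release(Capability *cap, uint64_t release_seq) {
      journal_update([=] {
        // Time has passed while the update was being journaled; a
        // subsequent open may have reissued the cap.  Only remove it if
        // the release still covers the latest open.
        if (release_seq >= cap->last_open)
          remove_cap_and_ack(cap);
      });
    }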
Sage Weil [Wed, 12 Nov 2008 18:23:15 +0000 (10:23 -0800)]
mds: use null snap context for purge if no realm
The inode may be unlinked, e.g. when we are replaying a journaled
purge_inode EUpdate. The snapc is not really important, as the
OSD will use the newer snapc it has for the object. And we only
really care when we're purging the HEAD anyway.
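Roughly the fallback being described, in sketch form (placeholder types, not the real SnapContext/SnapRealm):

    struct SnapContext { /* seq + snaps; empty by default */ };
    struct SnapRealm   { SnapContext snapc; };

    SnapContext snapc_for_purge(const SnapRealm *realm) {
      if (realm)
        return realm->snapc;
      // No realm (e.g. unlinked inode during replay of a journaled
      // purge_inode EUpdate): use an empty context.  The OSD will apply
      // the newer context it already has for the object.
      return SnapContext();
    }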
Sage Weil [Mon, 10 Nov 2008 23:53:26 +0000 (15:53 -0800)]
mds: fix replay lookup of snapshotted null dentries
Look up the replayed dentry by dn->last, not dn->first, as we do
with primary and remote dentries, because that is how we identify
dentry instances in the timeline.
Sage Weil [Mon, 10 Nov 2008 23:25:24 +0000 (15:25 -0800)]
osd: ignore logs i don't expect without freaking out
We may get a log we didn't think we requested if the prior_set gets rebuilt
or our peering is restarted for some other reason. Just ignore it, instead
of asserting.
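The shape of the change, sketched with hypothetical fields (not the real PG peering code): drop an unexpected log instead of asserting.

    #include <set>

    struct PeeringSketch {
      std::set<int> log_requested;   // osds we asked for logs this round

      void handle_log(int from) {
        if (!log_requested.count(from)) {
          // We didn't (or no longer) expect this log -- the prior_set
          // was rebuilt or peering restarted.  Ignore it rather than
          // asserting.
          return;
        }
        log_requested.erase(from);
        // ... merge the log and continue peering ...
      }
    };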
Sage Weil [Sun, 9 Nov 2008 16:43:14 +0000 (08:43 -0800)]
client: adjust objecter locking
We want to unlock client_lock before calling into the objecter, mainly because the
callbacks rely on SafeCond, which takes a lock to signal a Cond, and that gets awkward
without mutex recursion (see _write()'s sync case).
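A simplified stand-in for the pattern (std::mutex in place of Ceph's Mutex/Cond, hypothetical objecter call), showing why the caller drops client_lock before calling in:

    #include <mutex>
    #include <condition_variable>

    struct SafeCond {                // completion that signals a Cond
      std::mutex &lock;
      std::condition_variable &cond;
      bool &done;
      void finish() {
        std::lock_guard<std::mutex> l(lock);   // takes the lock to signal
        done = true;
        cond.notify_all();
      }
    };

    std::mutex client_lock;
    std::condition_variable cond;

    // Stand-in for an objecter call whose completion may run in the
    // calling thread.
    void objecter_call(SafeCond *onfinish) { onfinish->finish(); }

    void sync_op() {                 // roughly _write()'s sync case
      std::unique_lock<std::mutex> cl(client_lock);
      bool done = false;
      SafeCond onfinish{client_lock, cond, done};

      cl.unlock();                   // client_lock isn't recursive, so drop
      objecter_call(&onfinish);      // it before the completion can fire

      cl.lock();
      cond.wait(cl, [&] { return done; });   // wait for the completion
    }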
Sage Weil [Fri, 7 Nov 2008 21:26:41 +0000 (13:26 -0800)]
mds: match last snap exactly on replay, add_*_dentry
In general, we add new snapped dentries and THEN the new live dentry
to the metablob. That means that during replay, we see [2,2] followed
by [3,head], replacing [2,head]. The [2,2] dentry should be added
anew, without paying heed to [2,head], and then the [3,head] should
replace/update [2,head].
It was mainly just the assertions in add_*_dentry that were getting
in the way, but the lookup_exact_snap is also slightly faster.
Sage Weil [Fri, 7 Nov 2008 00:28:17 +0000 (16:28 -0800)]
mds: check dn->last when finding existing dentries during replay
We can't simply search for an existing dentry based on the name and end
snap, as that may turn up the wrong item. For example, if we have
[2,head] and the operations being replayed cowed it to [2,2] and [3,head], then
if we replay the [2,2] item first we'll find [2,head] (the _wrong_ dentry)
and throw an assertion.
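A sketch of the exact-match lookup this requires (hypothetical containers; the cache effectively keys dentries by name and last snapid): only a dentry whose last matches is a hit, so replaying [2,2] can never pick up [2,head].

    #include <map>
    #include <string>
    #include <utility>
    #include <cstdint>

    typedef uint64_t snapid_t;

    struct Dentry { snapid_t first, last; /* ... */ };

    // dentries keyed by (name, last)
    typedef std::map<std::pair<std::string, snapid_t>, Dentry> dentry_map;

    Dentry *lookup_exact_snap(dentry_map &dir,
                              const std::string &name, snapid_t last) {
      dentry_map::iterator p = dir.find(std::make_pair(name, last));
      if (p == dir.end())
        return 0;   // e.g. replaying [2,2]: do not fall back to [2,head]
      return &p->second;
    }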
Sage Weil [Thu, 6 Nov 2008 23:03:49 +0000 (15:03 -0800)]
osd: improve build_prior logic
If, during some interval since the pg last went active, we may have gone
rw but none of the osds from that interval survived, then we include all of
those osds in the prior_set (even though they're down), because they may
have written data that we want.
The prior logic appears to have been broken: it was only looking at the
primary osd.
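A rough sketch of the intended logic with made-up types (not PG::build_prior itself): for each interval that may have gone rw, take the surviving osds, and if none survived take all of them, down or not.

    #include <set>
    #include <vector>

    struct Interval {
      std::vector<int> acting;   // osds acting during the interval
      bool maybe_went_rw;        // could writes have happened then?
    };

    std::set<int> build_prior(const std::vector<Interval> &past_intervals,
                              const std::set<int> &up_osds) {
      std::set<int> prior;
      for (size_t i = 0; i < past_intervals.size(); i++) {
        const Interval &in = past_intervals[i];
        if (!in.maybe_went_rw)
          continue;
        bool any_survived = false;
        for (size_t j = 0; j < in.acting.size(); j++) {
          if (up_osds.count(in.acting[j])) {
            prior.insert(in.acting[j]);   // survivors always count
            any_survived = true;
          }
        }
        if (!any_survived) {
          // Nobody from that interval survived, so any of them -- not
          // just the primary -- may hold writes we need.  Include them
          // all, even though they're down.
          prior.insert(in.acting.begin(), in.acting.end());
        }
      }
      return prior;
    }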