git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
16 years agojournal: detect size of raw block devices properly v0.5
Sage Weil [Fri, 14 Nov 2008 00:48:15 +0000 (16:48 -0800)]
journal: detect size of raw block devices properly

16 years agoosd: only trim pg log if pg contains complete set of osds
Sage Weil [Fri, 14 Nov 2008 00:37:56 +0000 (16:37 -0800)]
osd: only trim pg log if pg contains complete set of osds

Eventually we may want to also impose some maximum pg log size.  At
some point the cost of the long log will approach the cost of
building a backlog...

16 years agoosdmap: fix type conversions
Sage Weil [Thu, 13 Nov 2008 23:45:36 +0000 (15:45 -0800)]
osdmap: fix type conversions

16 years agocrush: mention license. minor cleanup
Sage Weil [Thu, 13 Nov 2008 23:22:24 +0000 (15:22 -0800)]
crush: mention license.  minor cleanup

16 years agobe quiet
Sage Weil [Thu, 13 Nov 2008 22:38:08 +0000 (14:38 -0800)]
be quiet

16 years agoMerge branch 'unstable' of ssh://ceph.newdream.net/git/ceph into unstable
Yehuda Sadeh [Thu, 13 Nov 2008 22:56:43 +0000 (14:56 -0800)]
Merge branch 'unstable' of ssh://ceph.newdream.net/git/ceph into unstable

16 years agokclient: fix small resource leak when mds is down on mount
Yehuda Sadeh [Thu, 13 Nov 2008 22:56:00 +0000 (14:56 -0800)]
kclient: fix small resource leak when mds is down on mount

16 years agocsyn: fix msgr startup
Sage Weil [Thu, 13 Nov 2008 22:36:48 +0000 (14:36 -0800)]
csyn: fix msgr startup

16 years agokclient: whitespace
Sage Weil [Thu, 13 Nov 2008 22:31:57 +0000 (14:31 -0800)]
kclient: whitespace

16 years agomon: standardize storage of monmap revisions
Sage Weil [Thu, 13 Nov 2008 22:26:53 +0000 (14:26 -0800)]
mon: standardize storage of monmap revisions

16 years agomon: indicate last_consumed state after writing "latest" full maps
Sage Weil [Thu, 13 Nov 2008 22:19:53 +0000 (14:19 -0800)]
mon: indicate last_consumed state after writing "latest" full maps

Until then, we may need old incremental states.  This way paxos
won't discard them.

16 years agopaxos: only trim old states if they've been "consumed" by PaxosService
Sage Weil [Thu, 13 Nov 2008 22:19:10 +0000 (14:19 -0800)]
paxos: only trim old states if they've been "consumed" by PaxosService

Higher level state may depend on these items.  Only remove them
when it's clear they are unneeded.
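
The trimming rule described in this commit and the mon commit above it could be sketched roughly as follows. This is an illustrative Python sketch, not Ceph's code: `trim_states`, `max_kept`, and the dict-based store are hypothetical, and the retention window is an assumption added for the example.

```python
def trim_states(states, last_committed, last_consumed, max_kept):
    """Remove old versions from `states` (a dict keyed by version number).

    Hypothetical helper: a version is removed only if it is both outside
    the retention window (at or below last_committed - max_kept) and
    already consumed by the higher-level service (at or below last_consumed).
    """
    cutoff = min(last_committed - max_kept, last_consumed)
    removed = [v for v in sorted(states) if v <= cutoff]
    for v in removed:
        del states[v]
    return removed
```

With this shape, a service that has not yet advanced `last_consumed` keeps its old incremental states no matter how far `last_committed` has moved.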

16 years agomon: fix mon injectargs
Sage Weil [Thu, 13 Nov 2008 22:13:04 +0000 (14:13 -0800)]
mon: fix mon injectargs

16 years agocmonctl: seed random number generator so we pick a truly random mds
Sage Weil [Thu, 13 Nov 2008 21:33:05 +0000 (13:33 -0800)]
cmonctl: seed random number generator so we pick a truly random mds

16 years agomds: treat open requests as non-idempotent
Sage Weil [Thu, 13 Nov 2008 21:30:26 +0000 (13:30 -0800)]
mds: treat open requests as non-idempotent

The problem is that the reply contains a capability, and as such
is stateful and can't be lost.  Forwards by the MDS on behalf of
the client, however, introduce the possibility of multiple copies
of a request in flight if one of the MDSs fails, and the client
will drop any duplicate replies it receives.

Alternatively, the client could _also_ parse duplicate responses
(i.e. call fill_trace).  I'm not sure if that's a good idea.  In
any case, MDS forwarded requests are only really important for
dealing with flash flood scenarios on extremely large clusters,
so let's just set this aside for now.

16 years agomds todo
Sage Weil [Thu, 13 Nov 2008 20:45:18 +0000 (12:45 -0800)]
mds todo

16 years agokclient: only kick requests when they may make progress
Sage Weil [Thu, 13 Nov 2008 20:43:03 +0000 (12:43 -0800)]
kclient: only kick requests when they may make progress

Kick requests if the mds they are waiting on
 - failed.  it may be possible to send the request elsewhere.
 - became active.

The rest of the time we are just spinning our wheels.
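
A minimal Python sketch of the kick rule above, under stated assumptions: `MdsChange`, `requests_to_kick`, and the `pending` mapping are hypothetical names for illustration and do not come from the kernel client.

```python
from enum import Enum


class MdsChange(Enum):
    FAILED = "failed"
    BECAME_ACTIVE = "became_active"
    OTHER = "other"


def requests_to_kick(pending, mds, change):
    """Return ids of pending requests worth resending after `mds` changed state.

    `pending` maps request id -> rank of the mds the request is waiting on.
    Only a failure or a transition to active can let a request make progress;
    any other change returns an empty list.
    """
    if change not in (MdsChange.FAILED, MdsChange.BECAME_ACTIVE):
        return []
    return sorted(tid for tid, waiting_on in pending.items() if waiting_on == mds)
```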

16 years agokclient: only submit mds request if mds is active
Sage Weil [Thu, 13 Nov 2008 20:42:00 +0000 (12:42 -0800)]
kclient: only submit mds request if mds is active

Wait until we get a new map instead.

16 years agoosd: always pick new mon during boot
Sage Weil [Thu, 13 Nov 2008 19:32:34 +0000 (11:32 -0800)]
osd: always pick new mon during boot

16 years agokclient: fix connect_seq on connect-side after connect
Sage Weil [Thu, 13 Nov 2008 19:29:33 +0000 (11:29 -0800)]
kclient: fix connect_seq on connect-side after connect

16 years agomds: keep inode multiversion if it has snapped old_inodes
Sage Weil [Thu, 13 Nov 2008 19:06:03 +0000 (11:06 -0800)]
mds: keep inode multiversion if it has snapped old_inodes

Once old_inodes gets cleaned out (snaps deleted), we can return
to normalcy.

16 years agomon: pave way for more per-client mount info in monitor
Sage Weil [Thu, 13 Nov 2008 18:56:30 +0000 (10:56 -0800)]
mon: pave way for more per-client mount info in monitor

Eventually we'll need some security stuff in here

16 years agomon: client mon stat, dump commands; add to cmonctl -w
Sage Weil [Thu, 13 Nov 2008 18:48:25 +0000 (10:48 -0800)]
mon: client mon stat, dump commands; add to cmonctl -w

16 years agomsgr: include entity type in negotiation
Sage Weil [Thu, 13 Nov 2008 18:40:11 +0000 (10:40 -0800)]
msgr: include entity type in negotiation

This allows the accepting end to determine the policy it will use
on this connection, and return the correct LOSSY flag to the peer.

Also, some kclient simplification, cleanup.

And a connection ref count fix in process_accept().

16 years agomsgr: lotsa cleanup, protocol change, fixes, etc.
Sage Weil [Thu, 13 Nov 2008 18:01:54 +0000 (10:01 -0800)]
msgr: lotsa cleanup, protocol change, fixes, etc.

16 years agomsgr: various locking fixes
Sage Weil [Thu, 13 Nov 2008 01:05:51 +0000 (17:05 -0800)]
msgr: various locking fixes

Fixes some deadlock problems.

We also avoid the use of newsd, and do a sync join() on the reader
thread when killing a pipe, to avoid leaking sockets.

Also fix a 'bad seq #' error due to the reader not rechecking the
pipe state after retaking the lock after reading some data.

16 years agomsgr: zero msgr header
Sage Weil [Thu, 13 Nov 2008 01:03:42 +0000 (17:03 -0800)]
msgr: zero msgr header

Clears up some valgrind warnings

16 years agothread: complain on bad join() calls
Sage Weil [Thu, 13 Nov 2008 01:03:24 +0000 (17:03 -0800)]
thread: complain on bad join() calls

16 years agotestmsgr: messenger tester
Sage Weil [Thu, 13 Nov 2008 01:03:11 +0000 (17:03 -0800)]
testmsgr: messenger tester

16 years agomds: multiversion inodes with multiple links, too
Sage Weil [Wed, 12 Nov 2008 22:15:22 +0000 (14:15 -0800)]
mds: multiversion inodes with multiple links, too

We may have remote links that get snapped.  They need to be able to
find the (single) anchored multiversion inode to get the correct
version.  (The anchor table isn't versioned.)

We'll need to go a step further than this and create snaprealms for
some of these too in order to handle inodes linked into multiple
realms.  But that needs backpointers first...

16 years agokclient: grab reference on inode before async operation
Yehuda Sadeh [Thu, 13 Nov 2008 19:19:14 +0000 (11:19 -0800)]
kclient: grab reference on inode before async operation

16 years agoclient: use separate locks for SafeCond completions
Sage Weil [Wed, 12 Nov 2008 22:31:22 +0000 (14:31 -0800)]
client: use separate locks for SafeCond completions

We can't reuse client_lock because it is already held when the completion
Context is called.
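
The locking constraint can be illustrated with a small Python sketch. It is an analogy, not the Ceph implementation: this `SafeCond` class and `run_completion` are hypothetical, and Python's `threading.Lock` stands in for a non-recursive mutex.

```python
import threading


class SafeCond:
    """Completion flag with its own private lock (hypothetical sketch)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.cond = threading.Condition(self.lock)
        self.done = False

    def complete(self):
        # Takes only the private lock, so it is safe to call while the
        # caller already holds client_lock.
        with self.lock:
            self.done = True
            self.cond.notify_all()

    def wait(self):
        with self.lock:
            while not self.done:
                self.cond.wait()


client_lock = threading.Lock()  # non-recursive, like a plain mutex


def run_completion(safe_cond):
    # The completion runs with client_lock held. If complete() tried to
    # take client_lock again, this thread would deadlock on itself.
    with client_lock:
        safe_cond.complete()
```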

16 years agoRevert "client: adjust objecter locking"
Sage Weil [Wed, 12 Nov 2008 22:18:04 +0000 (14:18 -0800)]
Revert "client: adjust objecter locking"

This reverts commit 1c70b1d8f62ad8d9eeef3a86ef36d8e6933c5702.

Actually, we need client_lock to protect access to the objecter
and cache structures, as they have no internal locking.

Fix the safecond separately...

16 years agomds: cap may already be released in file_update_finish
Sage Weil [Wed, 12 Nov 2008 22:05:31 +0000 (14:05 -0800)]
mds: cap may already be released in file_update_finish

There may be multiple release requests in the pipeline.  Not ideal,
but whatever.

Also, drop locks later.

16 years agomds: remove cap _after_ journaling update, at the same time we send the msg
Sage Weil [Wed, 12 Nov 2008 19:23:01 +0000 (11:23 -0800)]
mds: remove cap _after_ journaling update, at the same time we send the msg

There was an ordering problem that could come up when we prepared
a release message and removed the cap, but then didn't send it to
the client until after the update was journaled.  This could cause
us to remove the _next_ instance of the capability (from a
subsequent open) in certain circumstances.

Instead, wait until after we journal the update before removing
the client cap and sending the ack.  Since time has passed,
reverify the release request seq is still >= the last_open at
that time.  Introduce a helper to avoid duplicating code for the
case where no journaling is necessary and the cap is immediately
released in _do_cap_update.
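
The re-verification step might look roughly like the Python sketch below. `finish_cap_release` and the `caps` mapping are hypothetical; the real MDS tracks considerably more state per capability.

```python
def finish_cap_release(caps, client, release_seq):
    """Run after the update is journaled; returns True if the cap was removed.

    `caps` maps client id -> last_open seq of the cap the client holds now.
    The release is honoured only if its seq is still >= last_open, so a
    newer cap from a subsequent open is never removed by a stale release.
    """
    last_open = caps.get(client)
    if last_open is None:
        return False  # cap already released by an earlier request
    if release_seq < last_open:
        return False  # cap was reopened while the update was being journaled
    del caps[client]
    return True
```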

16 years agomds: use null snap context for purge if no realm
Sage Weil [Wed, 12 Nov 2008 18:23:15 +0000 (10:23 -0800)]
mds: use null snap context for purge if no realm

The inode may be unlinked, e.g. when we are replaying a journaled
purge_inode EUpdate.  The snapc is not really important, as the
OSD will use the newer snapc it has for the object.  And we only
really care when we're purging the HEAD anyway.

16 years agoosd: keep head_exists accurate
Sage Weil [Wed, 12 Nov 2008 20:25:33 +0000 (12:25 -0800)]
osd: keep head_exists accurate

Also, create object on setattr if it doesn't yet exist.

And don't munge ZERO -> DELETE, at least for now.  What is the use case
for that, anyway?

16 years agofilestore: implement touch
Sage Weil [Wed, 12 Nov 2008 20:22:21 +0000 (12:22 -0800)]
filestore: implement touch

16 years agoebofs: implement touch
Sage Weil [Wed, 12 Nov 2008 20:22:12 +0000 (12:22 -0800)]
ebofs: implement touch

16 years agoobjectstore: introduce touch operation
Sage Weil [Wed, 12 Nov 2008 20:21:57 +0000 (12:21 -0800)]
objectstore: introduce touch operation

Create an object if it doesn't exist.
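
As a rough illustration of the semantics only, with a plain dict standing in for an object store (`touch` and `store` here are hypothetical and are not the ObjectStore interface):

```python
def touch(store, oid):
    """Create an empty object under `oid` if none exists; leave existing data alone."""
    store.setdefault(oid, b"")
```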

16 years agokclient: some osd endianity fixes
Yehuda Sadeh [Wed, 12 Nov 2008 20:26:18 +0000 (12:26 -0800)]
kclient: some osd endianity fixes

16 years agokclient: fix symbol overshadowing
Yehuda Sadeh [Wed, 12 Nov 2008 20:13:29 +0000 (12:13 -0800)]
kclient: fix symbol overshadowing

16 years agoprotocol, disk format change
Sage Weil [Wed, 12 Nov 2008 00:00:59 +0000 (16:00 -0800)]
protocol, disk format change

16 years agoosdmap: move offload from crush map into osdmap as osd_weight
Sage Weil [Tue, 11 Nov 2008 23:52:17 +0000 (15:52 -0800)]
osdmap: move offload from crush map into osdmap as osd_weight

16 years agodstart.sh: larger cluster
Sage Weil [Tue, 11 Nov 2008 23:10:46 +0000 (15:10 -0800)]
dstart.sh: larger cluster

16 years agomds: fix replay lookup of snapshotted null dentries
Sage Weil [Mon, 10 Nov 2008 23:53:26 +0000 (15:53 -0800)]
mds: fix replay lookup of snapshotted null dentries

Look up replayed dentry by dnLAST, not dnfirst, as we do with
primary and remote dentries, because that is how we identify
dentry instances in the timeline.

16 years agoobjecter: fix read scatter/gather
Sage Weil [Mon, 10 Nov 2008 23:46:15 +0000 (15:46 -0800)]
objecter: fix read scatter/gather

16 years agoosd: pass at_version by reference, so that cloning works
Sage Weil [Mon, 10 Nov 2008 23:46:03 +0000 (15:46 -0800)]
osd: pass at_version by reference, so that cloning works

16 years agomds: remove spurious warning
Sage Weil [Mon, 10 Nov 2008 23:45:46 +0000 (15:45 -0800)]
mds: remove spurious warning

16 years agoosd: ignore logs i don't expect without freaking out
Sage Weil [Mon, 10 Nov 2008 23:25:24 +0000 (15:25 -0800)]
osd: ignore logs i don't expect without freaking out

We may get a log we didn't think we requested if the prior_set gets rebuilt
or our peering is restarted for some other reason.  Just ignore it, instead
of asserting.

16 years agoosd: assert length on write, zero
Sage Weil [Mon, 10 Nov 2008 22:43:17 +0000 (14:43 -0800)]
osd: assert length on write, zero

16 years agoobjecter: whoops, do DELETE, not ZERO
Sage Weil [Mon, 10 Nov 2008 22:40:45 +0000 (14:40 -0800)]
objecter: whoops, do DELETE, not ZERO

16 years agoosd: fix typo/bug when picking osd to pull missing object from
Sage Weil [Mon, 10 Nov 2008 21:42:14 +0000 (13:42 -0800)]
osd: fix typo/bug when picking osd to pull missing object from

16 years agoosd: fix missing vs lost counting idiocy
Sage Weil [Mon, 10 Nov 2008 21:41:48 +0000 (13:41 -0800)]
osd: fix missing vs lost counting idiocy

16 years agofilestore: cope with zero-length attribute values
Sage Weil [Mon, 10 Nov 2008 21:40:59 +0000 (13:40 -0800)]
filestore: cope with zero-length attribute values

16 years agoebofs: cope with zero-length attribute values
Sage Weil [Mon, 10 Nov 2008 21:40:50 +0000 (13:40 -0800)]
ebofs: cope with zero-length attribute values

16 years agoosd: use modify flag to decide whether to take read fast path
Sage Weil [Mon, 10 Nov 2008 18:55:10 +0000 (10:55 -0800)]
osd: use modify flag to decide whether to take read fast path

16 years agokclient: fix up writes, reads for new op structure
Sage Weil [Mon, 10 Nov 2008 18:37:41 +0000 (10:37 -0800)]
kclient: fix up writes, reads for new op structure

Make sure osdc_readpages still returns bytes read, even though the
overall message result does not (pull it from the op.length).

Set MODIFY flag.

16 years agoosd: MODIFY is a flag; fix up op_read
Sage Weil [Mon, 10 Nov 2008 18:36:24 +0000 (10:36 -0800)]
osd: MODIFY is a flag; fix up op_read

16 years agoosd: simple higher-order append mutation
Sage Weil [Mon, 10 Nov 2008 17:46:53 +0000 (09:46 -0800)]
osd: simple higher-order append mutation

16 years agokclient: fix osd reply handler sanity check
Sage Weil [Mon, 10 Nov 2008 04:21:43 +0000 (20:21 -0800)]
kclient: fix osd reply handler sanity check

16 years agoobjecter: destructively take ops[], bufferlist passed to read(), modify().
Sage Weil [Mon, 10 Nov 2008 04:15:57 +0000 (20:15 -0800)]
objecter: destructively take ops[], bufferlist passed to read(), modify().

Small optimization.

16 years agomds: set path attr on directory objects
Sage Weil [Mon, 10 Nov 2008 04:12:20 +0000 (20:12 -0800)]
mds: set path attr on directory objects

16 years agoosd: object attr operations
Sage Weil [Mon, 10 Nov 2008 04:11:57 +0000 (20:11 -0800)]
osd: object attr operations

16 years agoobjecter: tweaking interface
Sage Weil [Mon, 10 Nov 2008 01:43:34 +0000 (17:43 -0800)]
objecter: tweaking interface

16 years agoobjecter: simplify objecter (no scatter/gather, generic ops vector)
Sage Weil [Sun, 9 Nov 2008 23:52:02 +0000 (15:52 -0800)]
objecter: simplify objecter (no scatter/gather, generic ops vector)

16 years agoosdc: avoid using objecter readx/writex
Sage Weil [Sun, 9 Nov 2008 23:05:37 +0000 (15:05 -0800)]
osdc: avoid using objecter readx/writex

16 years agoosd: compound osd operations
Sage Weil [Fri, 7 Nov 2008 04:48:55 +0000 (20:48 -0800)]
osd: compound osd operations

16 years agoclient: adjust objecter locking
Sage Weil [Sun, 9 Nov 2008 16:43:14 +0000 (08:43 -0800)]
client: adjust objecter locking

We want to unlock client_lock before calling into objecter, mainly because the callbacks
rely on SafeCond that take a lock to signal a Cond and that gets awkward without mutex
recursion (see _write()'s sync case).

16 years agoosd: track recovery sources independently of missing list
Sage Weil [Fri, 7 Nov 2008 22:02:12 +0000 (14:02 -0800)]
osd: track recovery sources independently of missing list

Fixes pull() to choose an osd that isn't down.

16 years agomds: debug session ref in EMetaBlob during replay
Sage Weil [Fri, 7 Nov 2008 21:36:52 +0000 (13:36 -0800)]
mds: debug session ref in EMetaBlob during replay

16 years agomds: match last snap exactly on replay, add_*_dentry
Sage Weil [Fri, 7 Nov 2008 21:26:41 +0000 (13:26 -0800)]
mds: match last snap exactly on replay, add_*_dentry

In general, we add new snapped dentries and THEN the new live dentry
to the metablob.  That means that during replay, we see [2,2] followed
by [3,head], replacing [2,head].  The [2,2] dentry should be added
anew, without paying heed to [2,head], and then the [3,head] should
replace/update [2,head].

It was mainly just the assertions in add_*_dentry that were getting
in the way, but the lookup_exact_snap is also slightly faster.
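
The replay order described above can be sketched in Python as follows. This is an assumption-laden illustration: `replay_dentry`, the `HEAD` sentinel, and the table keyed by `(name, last)` are hypothetical stand-ins for the MDS structures, chosen so that a lookup matches the last snap exactly.

```python
HEAD = "head"  # hypothetical sentinel for the live (head) version


def replay_dentry(table, name, first, last):
    """Replay one journaled dentry into `table`, keyed by (name, last).

    An exact match on `last` means a snapped [2,2] entry never collides
    with the live [2,head] entry; it is simply added. A later [3,head]
    entry matches the existing head entry and updates its `first`.
    """
    key = (name, last)
    existing = table.get(key)
    if existing is None:
        table[key] = {"first": first, "last": last}
        return "added"
    existing["first"] = first
    return "updated"
```

Starting from a table holding only [2,head], replaying [2,2] adds a new entry and replaying [3,head] then updates the head entry in place.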

16 years agoMerge branch 'unstable' of ssh://ceph.newdream.net/git/ceph into unstable
Yehuda Sadeh [Fri, 7 Nov 2008 21:11:08 +0000 (13:11 -0800)]
Merge branch 'unstable' of ssh://ceph.newdream.net/git/ceph into unstable

16 years agokclient: when going down, release caps anyway
Yehuda Sadeh [Fri, 7 Nov 2008 21:10:58 +0000 (13:10 -0800)]
kclient: when going down, release caps anyway

16 years agomon: avoid updating pg_map when osd_stat is unchanged
Sage Weil [Fri, 7 Nov 2008 21:07:28 +0000 (13:07 -0800)]
mon: avoid updating pg_map when osd_stat is unchanged

16 years agocmonctl: -w or --watch to watch (and print) mds/osd/pg stat changes
Sage Weil [Fri, 7 Nov 2008 20:44:26 +0000 (12:44 -0800)]
cmonctl: -w or --watch to watch (and print) mds/osd/pg stat changes

16 years agomds: don't cow a null dentry
Sage Weil [Fri, 7 Nov 2008 18:56:25 +0000 (10:56 -0800)]
mds: don't cow a null dentry

16 years agokclient: sparse warnings
Yehuda Sadeh [Fri, 7 Nov 2008 18:23:00 +0000 (10:23 -0800)]
kclient: sparse warnings

16 years agovstart.sh: add usage of $CEPH_BIN
Yehuda Sadeh [Fri, 7 Nov 2008 18:21:19 +0000 (10:21 -0800)]
vstart.sh: add usage of $CEPH_BIN

16 years agoosd: don't repeer an active pg just because the prior_set was affected
Sage Weil [Fri, 7 Nov 2008 17:47:59 +0000 (09:47 -0800)]
osd: don't repeer an active pg just because the prior_set was affected

We only want to restart peering due to prior_set changes if it hasn't completed
yet.

16 years agomds: check dn->last when finding existing dentries during replay
Sage Weil [Fri, 7 Nov 2008 00:28:17 +0000 (16:28 -0800)]
mds: check dn->last when finding existing dentries during replay

We can't simply search for an existing dentry based on the name and end
snap, as that may turn up the wrong item.  For example, if we have
[2,head] and the replaying operations cowed that to [2,2] and [3,head], then
if we replay the [2,2] item first we'll find [2,head] (the _wrong_ dentry)
and throw an assertion.

So just check for dn->last != p->dnlast.

16 years agotodos
Sage Weil [Fri, 7 Nov 2008 00:14:18 +0000 (16:14 -0800)]
todos

16 years agoebofs: another recursive lock bug
Sage Weil [Fri, 7 Nov 2008 03:27:51 +0000 (19:27 -0800)]
ebofs: another recursive lock bug

16 years agoosd: turn up debug on any shutdown, not just SIGINT/SIGTERM, for now
Sage Weil [Fri, 7 Nov 2008 03:15:50 +0000 (19:15 -0800)]
osd: turn up debug on any shutdown, not just SIGINT/SIGTERM, for now

16 years agomsgr: fix problem with forced stop of pipe
Sage Weil [Fri, 7 Nov 2008 03:15:32 +0000 (19:15 -0800)]
msgr: fix problem with forced stop of pipe

16 years agoebofs: fix lock recursion
Sage Weil [Fri, 7 Nov 2008 03:15:10 +0000 (19:15 -0800)]
ebofs: fix lock recursion

16 years agomon: handle invalid commands to pgmon
Sage Weil [Thu, 6 Nov 2008 22:26:10 +0000 (14:26 -0800)]
mon: handle invalid commands to pgmon

16 years agoosd: add degraded pg state bit
Sage Weil [Thu, 6 Nov 2008 23:32:32 +0000 (15:32 -0800)]
osd: add degraded pg state bit

16 years agoosd: improve build_prior logic
Sage Weil [Thu, 6 Nov 2008 23:03:49 +0000 (15:03 -0800)]
osd: improve build_prior logic

If, during some interval since the pg last went active, we may have gone
rw, but none of the osds survived, then we include all of those osds
in the prior_set (even though they're down), because they may have written data
that we want.

The prior logic appears to have been broken.  It was only looking at the
primary osd.
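
One possible reading of this rule, as a Python sketch. `Interval`, `build_prior`, and `up_osds` are hypothetical names; the branch that adds surviving osds from an interval is an assumption, since the commit message only spells out the case where none survived.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    osds: tuple          # osds that served the pg during this interval
    maybe_went_rw: bool  # whether the pg may have accepted writes


def build_prior(intervals, up_osds):
    """Collect the osds whose logs we may need from past intervals."""
    prior = set()
    for iv in intervals:
        survivors = [o for o in iv.osds if o in up_osds]
        if survivors:
            # Assumption: a surviving osd from the interval is enough.
            prior.update(survivors)
        elif iv.maybe_went_rw:
            # Nobody survived but writes may have happened, so include
            # every osd from the interval even though all are down.
            prior.update(iv.osds)
    return prior
```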

16 years agoosd: turn up debugging on SIGINT/TERM
Sage Weil [Thu, 6 Nov 2008 22:11:13 +0000 (14:11 -0800)]
osd: turn up debugging on SIGINT/TERM

16 years agoosd: fix osd_lock recursion in wake_snap_trimmer
Sage Weil [Thu, 6 Nov 2008 21:58:10 +0000 (13:58 -0800)]
osd: fix osd_lock recursion in wake_snap_trimmer

16 years agokclient: bookkeeper detects buffer overrun
Yehuda Sadeh [Thu, 6 Nov 2008 21:47:08 +0000 (13:47 -0800)]
kclient: bookkeeper detects buffer overrun

16 years agokclient: frag_make_child fix (sage)
Yehuda Sadeh [Thu, 6 Nov 2008 21:26:08 +0000 (13:26 -0800)]
kclient: frag_make_child fix (sage)

16 years agoosd: don't pull if source osd is down
Sage Weil [Thu, 6 Nov 2008 17:43:59 +0000 (09:43 -0800)]
osd: don't pull if source osd is down

16 years agokclient: ran checkpatch
Sage Weil [Thu, 6 Nov 2008 18:57:10 +0000 (10:57 -0800)]
kclient: ran checkpatch

16 years agotodos
Sage Weil [Thu, 6 Nov 2008 18:56:51 +0000 (10:56 -0800)]
todos

16 years agosynclient: fix debug prefix
Sage Weil [Thu, 6 Nov 2008 00:54:52 +0000 (16:54 -0800)]
synclient: fix debug prefix

16 years agocfuse: fix symlink call
Sage Weil [Wed, 5 Nov 2008 23:07:39 +0000 (15:07 -0800)]
cfuse: fix symlink call

16 years agovstart.sh
Sage Weil [Wed, 5 Nov 2008 22:54:51 +0000 (14:54 -0800)]
vstart.sh

16 years agofix env parsing
Sage Weil [Wed, 5 Nov 2008 22:54:13 +0000 (14:54 -0800)]
fix env parsing