git.apps.os.sepia.ceph.com Git

mds: no rdlock in filelock LOCK state

Otherwise we get wrlocks AND rdlocks at the same time, which is clearly
problematic. Der.

Also fix _rdlock_kick to not simple_lock, since that won't help us.

Sage Weil [Thu, 14 May 2009 23:10:27 +0000 (16:10 -0700)]

mds: drain wrlocks before going from LOCK->SYNC in file_eval

This avoids excessive waits for the journal to flush on lock->sync
when a client request holds a wrlock. There's no reason to hurry: if
someone needs it sync we can do the transition then; otherwise, it'll
happen when the wrlock is dropped.

Yehuda Sadeh [Thu, 14 May 2009 22:55:31 +0000 (15:55 -0700)]

Sage Weil [Thu, 14 May 2009 22:51:22 +0000 (15:51 -0700)]

rados: define rdcall, wrcall on arbitrary class, method

Sage Weil [Thu, 14 May 2009 22:29:31 +0000 (15:29 -0700)]

osd: make classhandler requests async

Handle deferred request queues in OSD, class loading and state
in ClassHandler.

Sage Weil [Thu, 14 May 2009 21:34:21 +0000 (14:34 -0700)]

Sage Weil [Thu, 14 May 2009 21:33:06 +0000 (14:33 -0700)]

Sage Weil [Thu, 14 May 2009 21:18:32 +0000 (14:18 -0700)]

rados: only export rados_* from librados.so

Sage Weil [Thu, 14 May 2009 21:23:40 +0000 (14:23 -0700)]

.gitignore update

Sage Weil [Thu, 14 May 2009 20:57:25 +0000 (13:57 -0700)]

mds: break CAP_RDCACHE into CAP_SHARED, CAP_CACHE

FILE_CAP_RDCACHE was being used to mean both read access to the
file attributes (size, mtime) and permission to retain cached
data. That led to an incorrect definition of the filelock in the
mds, and in turn bugs with multiple client access. These are now
CAP_*_SHARED (all locks) and CAP_FILE_CACHE (filelock only).

The main observed symptom was a client creating files in a
directory and a second client not seeing them, due to RDCACHE not
being revoked and rdcache_gen thus not incrementing, allowing a
dcache readdir to proceed.

Sage Weil [Thu, 14 May 2009 20:47:31 +0000 (13:47 -0700)]

kclient: don't skip EXPIREABLE caps

EXPIREABLE is obsolete. We need to reconnect _all_ caps!

Yehuda Sadeh [Thu, 14 May 2009 19:55:45 +0000 (12:55 -0700)]

class: can use ceph utility to add classes

e.g. ./ceph class add foo 1 --in-data=foo.so

Sage Weil [Thu, 14 May 2009 00:02:44 +0000 (17:02 -0700)]

kclient: fix crush decoding for recent changes

Sage Weil [Wed, 13 May 2009 23:49:05 +0000 (16:49 -0700)]

rev osd protocol, disk format to reflect crush changes

Sage Weil [Wed, 13 May 2009 23:46:20 +0000 (16:46 -0700)]

crush: fix crush_perm_choose; optimize r=0 case.

This was misbehaving for x=0, among other things.

Avoid filling in perm array for the initial (p)r=0 call. We only
need to do a full permutation for subsequent r.
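
A minimal sketch of the idea, not the actual crush_perm_choose code: a Fisher-Yates shuffle that is extended only as far as the requested slot, so the r=0 call does a single swap and never materializes the whole array. The `LazyPermutation` class, its seeded `random.Random` source, and the dict-based storage are illustrative assumptions; the real code derives its randomness from a hash of the input rather than from a generator.

```python
import random


class LazyPermutation:
    """Pseudo-random permutation of range(size), built one slot at a time."""

    def __init__(self, size, seed):
        self.size = size
        self.rng = random.Random(seed)  # assumption: stand-in for a hash of x
        self.moved = {}                 # only slots that differ from identity
        self.filled = 0                 # slots [0, filled) are final

    def _get(self, i):
        # Untouched slots still hold their identity value.
        return self.moved.get(i, i)

    def choose(self, r):
        r %= self.size
        # Extend the partial shuffle just far enough to fix slot r.
        while self.filled <= r:
            i = self.filled
            j = self.rng.randrange(i, self.size)
            vi, vj = self._get(i), self._get(j)
            self.moved[i], self.moved[j] = vj, vi
            self.filled += 1
        return self._get(r)
```

Calling `choose(0)` fixes one slot only; later calls with larger r fill in the rest on demand, and slots that are already fixed never change.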

Yehuda Sadeh [Wed, 13 May 2009 23:39:17 +0000 (16:39 -0700)]

Yehuda Sadeh [Wed, 13 May 2009 22:11:39 +0000 (15:11 -0700)]

Yehuda Sadeh [Wed, 13 May 2009 21:52:52 +0000 (14:52 -0700)]

Sage Weil [Wed, 13 May 2009 21:56:28 +0000 (14:56 -0700)]

psim: count result set sizes

Sage Weil [Wed, 13 May 2009 21:55:07 +0000 (14:55 -0700)]

crush: fall back to exhaustive bucket search for any bucket type

If we don't get a bucket-specific choice in 5 tries, do an
exhaustive search (based on a random permutation). Only then give
up on the bucket and retry descent.

Note that the search-based fallback does not honor weighting at
all.
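
The fallback described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the crush code: `choose_item`, `is_acceptable`, and the use of `random.choices` for the weighted attempt are hypothetical stand-ins for the bucket-specific choice.

```python
import random


def choose_item(items, weights, is_acceptable, seed, tries=5):
    """Try a weighted choice a few times, then search exhaustively."""
    rng = random.Random(seed)

    # Bucket-specific choice: honors weights, but may keep hitting
    # items that the caller rejects.
    for _ in range(tries):
        candidate = rng.choices(items, weights=weights, k=1)[0]
        if is_acceptable(candidate):
            return candidate

    # Exhaustive search over a random permutation: ignores weights,
    # but finds an acceptable item if one exists.
    order = list(items)
    rng.shuffle(order)
    for candidate in order:
        if is_acceptable(candidate):
            return candidate

    # Only now give up on this bucket so the caller can retry descent.
    return None
```

With weights `[1, 1, 0]` and only the third item acceptable, the weighted attempts can never succeed, yet the search still returns that item, which is what "does not honor weighting" means in practice.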

Sage Weil [Wed, 13 May 2009 21:07:45 +0000 (14:07 -0700)]

crush: ditch prime number theorem; generate random permutation on the fly

Sage Weil [Wed, 13 May 2009 19:02:45 +0000 (12:02 -0700)]

crush: improve uniform selection a bit

Shift to a new prime (and thus permutation) every few r.

Sage Weil [Tue, 12 May 2009 15:07:51 +0000 (08:07 -0700)]

Merge branch 'c3' into rados

Sage Weil [Tue, 12 May 2009 15:07:00 +0000 (08:07 -0700)]

testrados: C, not C++

Sage Weil [Tue, 12 May 2009 04:16:02 +0000 (21:16 -0700)]

librados: fix up #includes; use C for testrados

Sage Weil [Tue, 12 May 2009 04:06:14 +0000 (21:06 -0700)]

rados: build librados, libcrush using libtool

Finally?

Sage Weil [Mon, 11 May 2009 23:23:33 +0000 (16:23 -0700)]

librados: drop librados/ dir.

Sage Weil [Mon, 11 May 2009 23:22:08 +0000 (16:22 -0700)]

librados: build the .so

Sage Weil [Mon, 11 May 2009 22:48:00 +0000 (15:48 -0700)]

librados: rename c3 -> librados

Yehuda Sadeh [Mon, 11 May 2009 22:40:22 +0000 (15:40 -0700)]

c3: remove unnecessary include

Sage Weil [Mon, 11 May 2009 22:38:03 +0000 (15:38 -0700)]

Yehuda Sadeh [Mon, 11 May 2009 20:36:59 +0000 (13:36 -0700)]

Yehuda Sadeh [Mon, 11 May 2009 17:47:50 +0000 (10:47 -0700)]

Yehuda Sadeh [Mon, 11 May 2009 17:30:51 +0000 (10:30 -0700)]

Yehuda Sadeh [Fri, 8 May 2009 23:33:30 +0000 (16:33 -0700)]

Yehuda Sadeh [Thu, 7 May 2009 22:55:29 +0000 (15:55 -0700)]

c3: exec op string

Yehuda Sadeh [Thu, 7 May 2009 22:47:50 +0000 (15:47 -0700)]

c3: handle unhandled message

Yehuda Sadeh [Thu, 7 May 2009 22:33:33 +0000 (15:33 -0700)]

c3: rados merges issues

Yehuda Sadeh [Wed, 6 May 2009 05:45:00 +0000 (22:45 -0700)]

c3: implement exec poc

Yehuda Sadeh [Tue, 5 May 2009 18:56:13 +0000 (11:56 -0700)]

c3: fix rank

Yehuda Sadeh [Mon, 4 May 2009 23:40:29 +0000 (16:40 -0700)]

c3: fix rank static allocation

Yehuda Sadeh [Mon, 4 May 2009 22:53:41 +0000 (15:53 -0700)]

osd: add an exec op

Yehuda Sadeh [Fri, 1 May 2009 22:42:16 +0000 (15:42 -0700)]

c3: create a very simple interface

Yehuda Sadeh [Fri, 1 May 2009 00:08:00 +0000 (17:08 -0700)]

c3: mount through MonClient

Sage Weil [Mon, 11 May 2009 22:29:11 +0000 (15:29 -0700)]

c3: add mount/umount to mon client

Yehuda Sadeh [Thu, 30 Apr 2009 20:21:22 +0000 (13:21 -0700)]

s3: add read to unittest

Sage Weil [Mon, 11 May 2009 22:28:56 +0000 (15:28 -0700)]

c3: ceph simple interface

Sage Weil [Mon, 11 May 2009 21:33:17 +0000 (14:33 -0700)]

mds todo

Sage Weil [Mon, 11 May 2009 21:00:52 +0000 (14:00 -0700)]

kclient: fail connection when s_addr==0 and port/nonce don't match

Even if s_addr is 0, the port and nonce should match. We were
previously going ahead with the connection when we shouldn't have
been.
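
A rough sketch of the check, assuming a zero s_addr means "IP not known yet" and is treated as a wildcard for the IP field only. The `EntityAddr` dataclass and `addr_matches` helper are hypothetical names for illustration and do not mirror the kernel client's structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityAddr:
    s_addr: int  # IPv4 address as an integer; 0 means "unknown" (assumption)
    port: int
    nonce: int


def addr_matches(expected: EntityAddr, actual: EntityAddr) -> bool:
    """Return True only if the peer is the instance we meant to reach."""
    # Port and nonce must always agree, even when the IP is a wildcard.
    if expected.port != actual.port or expected.nonce != actual.nonce:
        return False
    # A zero s_addr on either side skips the IP comparison only.
    return (
        expected.s_addr == 0
        or actual.s_addr == 0
        or expected.s_addr == actual.s_addr
    )
```

The earlier behavior corresponds to returning True as soon as s_addr was 0, which let a connection proceed to a peer with a different port or nonce.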

Sage Weil [Mon, 11 May 2009 20:54:03 +0000 (13:54 -0700)]

mds: maintain capid across mds restart and client reconnect

Sage Weil [Mon, 11 May 2009 20:21:26 +0000 (13:21 -0700)]

mon: don't replace standby mds

Sage Weil [Mon, 11 May 2009 20:05:38 +0000 (13:05 -0700)]

mds: handle MMonMap

Sage Weil [Mon, 11 May 2009 18:51:21 +0000 (11:51 -0700)]

journaler: fix replay

Broken by commit 497ade3b90.

Use objecter to write header (not just to read it).

Also reset the prefetch values each time the layout is set.

Sage Weil [Sun, 10 May 2009 05:13:37 +0000 (22:13 -0700)]

osd: base reported eversion in pg_stat_t on same_primary_since

This ensures the value increases when the primary changes.

Sage Weil [Sun, 10 May 2009 05:12:50 +0000 (22:12 -0700)]

osd: skip initial bit of peering if already have_master_log

Once we have settled on the master log, we want to skip that
step of peer(). This matters because peer() can be called on an
active PG if an osd shows up with stray content. We still want
to peer() then, in case there are missing objects to be
found.

Sage Weil [Mon, 11 May 2009 18:34:37 +0000 (11:34 -0700)]

mds: fix loner drops

This was broken way back by commit 0c3becdf.

Only drop loner in eval().

Sage Weil [Mon, 11 May 2009 18:15:44 +0000 (11:15 -0700)]

rev protocols

Just for good measure, since I missed a few changes earlier.

Sage Weil [Mon, 11 May 2009 18:01:50 +0000 (11:01 -0700)]

debian: new rules file. don't strip.

Sage Weil [Mon, 11 May 2009 17:44:20 +0000 (10:44 -0700)]

osd: fix accounting .snap thing

Sage Weil [Mon, 11 May 2009 17:35:56 +0000 (10:35 -0700)]

todo

Sage Weil [Sat, 9 May 2009 18:25:27 +0000 (11:25 -0700)]

conf: improved sample

Sage Weil [Fri, 8 May 2009 21:05:31 +0000 (14:05 -0700)]

Merge branch 'unstable' into rados

Sage Weil [Fri, 8 May 2009 20:37:06 +0000 (13:37 -0700)]

mon: check for osd exists before up/down

Sage Weil [Fri, 8 May 2009 21:04:03 +0000 (14:04 -0700)]

osd: maintain up_epoch AND boot_epoch; revise OSDSuperblock accordingly

In order to make the superblock clean interval meaningful after we
are marked down and then up again (over the life of a single
cosd process instance), we track both boot_epoch and up_epoch,
and keep [boot_epoch,clean_thru] in the superblock.

This avoids seeing crashed pgs when an osd is wrongly marked down
and the osd marks itself up again.

Sage Weil [Fri, 8 May 2009 04:45:53 +0000 (21:45 -0700)]

osd: add back in support for unversioned sobject_t (.snap=0)

Sage Weil [Fri, 8 May 2009 04:00:04 +0000 (21:00 -0700)]

osd: use / in sobject_t output

Sage Weil [Fri, 8 May 2009 19:55:37 +0000 (12:55 -0700)]

osd: adjust heartbeat peer lock

Need it to protect heartbeat_from_stamp.

Sage Weil [Fri, 8 May 2009 19:55:16 +0000 (12:55 -0700)]

osd: reset heartbeat peer set on osd down

This clears out the timers.

Sage Weil [Fri, 8 May 2009 19:20:13 +0000 (12:20 -0700)]

buffer: add malloc raw buffer type

Sage Weil [Fri, 8 May 2009 04:42:19 +0000 (21:42 -0700)]

osd: adjust snap collection memberships during pg split

Sage Weil [Fri, 8 May 2009 04:38:26 +0000 (21:38 -0700)]

osd: adjust pg_stats during pg split

Sage Weil [Fri, 8 May 2009 04:32:53 +0000 (21:32 -0700)]

osd: generate correct child pg when doing pg split

Sage Weil [Fri, 8 May 2009 04:13:39 +0000 (21:13 -0700)]

ceph: don't choke on unexpected MMonMap

Sage Weil [Fri, 8 May 2009 03:49:51 +0000 (20:49 -0700)]

osd: factor out clear_recovery_state from {cancel,finish}_recovery

Also kill OSD::num_pulling counter, which is wrong anyway,
lacking locking, and probably not needed anyway with the more
general recovery_op accounting.

Sage Weil [Fri, 8 May 2009 03:47:31 +0000 (20:47 -0700)]

osd: make sure _finish_recovery only completes when it's supposed to

Because the finish_recovery does a sync, the final cleanup is
deferred, and we have to make sure we are still done (and we
haven't, say, repeered or something).

In this case, we thus make sure we don't clear out pg
recovery_ops when we actually have ops in progress.

Sage Weil [Fri, 8 May 2009 00:38:46 +0000 (17:38 -0700)]

filestore: fix object filename parsing

Sage Weil [Thu, 7 May 2009 23:32:11 +0000 (16:32 -0700)]

cosd: debug filestore

Sage Weil [Fri, 8 May 2009 00:38:11 +0000 (17:38 -0700)]

mon: don't send mdsmap on client mount

Client can request it on first mds op.

Sage Weil [Fri, 8 May 2009 00:33:28 +0000 (17:33 -0700)]

kclient: don't wait for mdsmap on mount

We just need the monmap. And osdmap, since osd_client doesn't like a
null pointer atm.

Sage Weil [Thu, 7 May 2009 23:06:50 +0000 (16:06 -0700)]

osd: set snapid in read requests

Sage Weil [Thu, 7 May 2009 21:21:16 +0000 (14:21 -0700)]

uclient: use MonClient for mount + unmount

Sage Weil [Wed, 6 May 2009 23:28:42 +0000 (16:28 -0700)]

monc: add mount/umount to mon client

Sage Weil [Wed, 6 May 2009 23:27:37 +0000 (16:27 -0700)]

objecter: fix osdmap requesting

Sage Weil [Wed, 6 May 2009 22:23:12 +0000 (15:23 -0700)]

osd: fix pg splits vs lockdep

PG splits create+lock the child while the parent is still locked.
Disable lockdep in that case only so that we don't crash and burn.

Sage Weil [Wed, 6 May 2009 21:41:42 +0000 (14:41 -0700)]

mon: 'osd pool create foo'

Sage Weil [Wed, 6 May 2009 20:51:33 +0000 (13:51 -0700)]

todo

Sage Weil [Wed, 6 May 2009 20:12:29 +0000 (13:12 -0700)]

osd: move .snap out of object_t

This makes the snap versioning completely orthogonal to the logical
object name (object_t). This is key since eventually object_t
won't be structured. And the old way made for an awkward interface
anyway.

Also killed the .snap = 0 special casing, which AFAICS was
useless.

Sage Weil [Wed, 6 May 2009 18:56:08 +0000 (11:56 -0700)]

osd: do not use ebofs

Don't compile or use ebofs.

Sage Weil [Fri, 1 May 2009 13:43:53 +0000 (06:43 -0700)]

ceph: break up ceph_fs.h header into msgr.h, rados.h

Sage Weil [Thu, 7 May 2009 21:39:47 +0000 (14:39 -0700)]

kclient: recalculate pgid each time request is sent

The pg calculation depends on osdmap parameters that are transient. In
contrast, the rest of calc_layout is concerned with file striping, which
is fixed (at least over the lifetime of the request).
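
As a simplified illustration of why the placement group has to be recomputed at send time: the object name is fixed when the request is built, but the pg id depends on a value taken from the current osdmap. Everything here is an assumption for the sketch, including the `Request` type, the dict-shaped osdmap with a `pg_num` key, and the CRC32-modulo mapping, which stands in for Ceph's own hash and placement logic.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    object_name: str            # from file striping; fixed for the request
    pgid: Optional[int] = None  # derived from the osdmap; can go stale


def calc_pgid(object_name: str, pg_num: int) -> int:
    # Stand-in mapping: hash the name and reduce it by the current pg count.
    return zlib.crc32(object_name.encode()) % pg_num


def send_request(req: Request, osdmap: dict) -> int:
    # Recompute on every send (including resends) against the latest map,
    # instead of trusting a pgid cached when the request was created.
    req.pgid = calc_pgid(req.object_name, osdmap["pg_num"])
    return req.pgid
```

If `pg_num` changes between the first send and a resend, the second call picks up the new value while `object_name` stays untouched.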
