git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
Sage Weil [Tue, 17 Nov 2009 23:05:41 +0000 (15:05 -0800)]
sepia: valgrind for a while

Sage Weil [Tue, 17 Nov 2009 22:31:19 +0000 (14:31 -0800)]
objecter: be more verbose about laggy requests

Sage Weil [Tue, 17 Nov 2009 22:11:24 +0000 (14:11 -0800)]
msgr: use discard_queue, kill drop_msgs

Sage Weil [Tue, 17 Nov 2009 22:06:07 +0000 (14:06 -0800)]
msgr: don't initiate connect if we keepalive on unconnected peer

Sage Weil [Wed, 18 Nov 2009 00:30:36 +0000 (16:30 -0800)]
msgr: fix possible use-after-free by taking pipe lock during reap

This ensures that whoever called stop() (mark_down, in particular) finished
with the pipe (unlocked it) before we go and free it.  Otherwise we might
call p->lock.Unlock() after the reaper deleted the Pipe, mucking up some
other memory.

I don't think this was actually triggered, though, since we would have seen
the assert(nlock == 0) in ~Mutex???

Sage Weil [Wed, 18 Nov 2009 00:30:12 +0000 (16:30 -0800)]
msgr: use common helper for reader/writer thread stop and reap queueing; fix locking

We weren't holding the lock during ::close(sd).

Sage Weil [Tue, 17 Nov 2009 22:03:36 +0000 (14:03 -0800)]
msgr: rename report_failures -> drop_msgs

Name was from legacy behavior

Sage Weil [Tue, 17 Nov 2009 22:02:57 +0000 (14:02 -0800)]
msgr: small cleanups

Sage Weil [Tue, 17 Nov 2009 22:02:09 +0000 (14:02 -0800)]
monc: small sanity check

Sage Weil [Mon, 16 Nov 2009 19:46:03 +0000 (11:46 -0800)]
msgr: get rid of harmless valgrind error

==7781== Source and destination overlap in memcpy(0x5B97EA8, 0x5B97EA8, 136)

Sage Weil [Mon, 16 Nov 2009 19:45:25 +0000 (11:45 -0800)]
mon: fail on write error

Don't rename badly written file.  Also assert success, for now.
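
A minimal sketch of the write-then-rename pattern this commit describes, in Python rather than the monitor's own code. The helper name write_then_rename and the temporary file in the target directory are assumptions for illustration; the commit only says that a badly written file must not be renamed into place.

```python
import os
import tempfile


def write_then_rename(path: str, data: bytes) -> None:
    """Write data to a temporary file; rename it over `path` only on success."""
    directory = os.path.dirname(path) or "."
    # Hypothetical choice: keep the temp file in the same directory so the
    # final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
    except OSError:
        # The write failed: drop the partial file and never rename it.
        os.unlink(tmp_path)
        raise
    os.replace(tmp_path, path)
```

A failed write surfaces as an exception to the caller, which is the rough equivalent of asserting success.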

Sage Weil [Fri, 13 Nov 2009 23:06:31 +0000 (15:06 -0800)]
mds: journal open_files based on is_any_caps_wanted(), not is_any_caps()

Actually we're a bit conservative in a few places since the wanted check
is a bit more expensive.  We always do a full check in try_to_expire, so
much of the time we can do the quick check only.

Sage Weil [Fri, 13 Nov 2009 22:57:39 +0000 (14:57 -0800)]
mds: don't rejournal files with caps that are unwanted

If they're unwanted, it's no biggie to fail to reconnect the cap.  And
Locker::adjust_cap_wanted() already adjusts the open_file logseg lists in
this way, so let's just be totally consistent.

Sage Weil [Fri, 13 Nov 2009 22:44:22 +0000 (14:44 -0800)]
mds: don't croak on open_files without caps

We can get a capless inode here if we replay an open file, the client
fails to reconnect it, but does REPLAY an open request (that adds it
to the logseg).  AFAICS it's ok for the client to replay an open on a
file it doesn't have in its cache anymore.

Greg Farnum [Fri, 13 Nov 2009 19:00:55 +0000 (11:00 -0800)]
rados: status printouts are now threaded

Sage Weil [Fri, 13 Nov 2009 18:09:31 +0000 (10:09 -0800)]
mds: don't try to send mdsmap to clients that need to reconnect

It won't work, clients must connect to us.

Also stop sending the map on recovery finish; they'll also (currently, at
least) get that from the monitor.

Sage Weil [Fri, 13 Nov 2009 18:09:12 +0000 (10:09 -0800)]
mds: ignore/warn on late client reconnect attempts

In the future maybe we can do better (i.e. a best-effort attempt to reconnect
after the normal interval has passed).

Greg Farnum [Fri, 13 Nov 2009 00:45:05 +0000 (16:45 -0800)]
Merge branch 'unstable' of ceph.newdream.net:/git/ceph into unstable

Greg Farnum [Fri, 13 Nov 2009 00:44:35 +0000 (16:44 -0800)]
qa: prepare for a hierarchical test script system

Sage Weil [Thu, 12 Nov 2009 23:12:18 +0000 (15:12 -0800)]
todo

Sage Weil [Thu, 12 Nov 2009 22:55:26 +0000 (14:55 -0800)]
Merge branch 'unstable' of ssh://ceph.newdream.net/git/ceph into unstable

Sage Weil [Thu, 12 Nov 2009 22:17:18 +0000 (14:17 -0800)]
put testing ceph.conf's in git

Sage Weil [Thu, 12 Nov 2009 22:16:31 +0000 (14:16 -0800)]
osd: print useful mode wait message

Sage Weil [Thu, 12 Nov 2009 22:16:17 +0000 (14:16 -0800)]
filestore: skip sync_file_range if the full commit has started

Presumably btrfs commit_transaction is more effective than our per-file
flushing.  And we don't want to fight each other.

Sage Weil [Thu, 12 Nov 2009 22:54:22 +0000 (14:54 -0800)]
mds: recommit after commit if waiting for newer version

If there are waiters for a later version of the dir to hit
disk, then we need to recommit as soon as the prior commit
completes.  We auth_pin on adding the first waiter, and do
not unpin until removing the last waiter, so this doesn't
break auth_pin rules.

Previously we could stall because we didn't finish the
waiter (on the later version) but also never started the
commit.  Sometimes we would get lucky and someone else would
ask for a commit, but sometimes not.  We would then see old
LogSegments that would never get fully expired.
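
The stall and its fix can be sketched with a toy model, in Python rather than the MDS's own code. The class DirSketch and all of its fields are hypothetical; auth_pin handling is left out entirely, and callers are assumed to wait only for versions that already exist in memory.

```python
class DirSketch:
    """Toy model of a directory whose versions are committed one at a time."""

    def __init__(self):
        self.version = 0           # latest in-memory version
        self.committing = None     # version being written right now, if any
        self.committed = 0         # newest version known to be on disk
        self.waiters = {}          # wanted version -> list of callbacks
        self.commits_started = []  # record of commits, for illustration only

    def commit(self, want, callback):
        """Ask to be called back once version `want` is on disk."""
        self.waiters.setdefault(want, []).append(callback)
        if self.committing is None:
            self._start_commit()

    def _start_commit(self):
        self.committing = self.version
        self.commits_started.append(self.version)

    def commit_finished(self):
        """Called when the in-flight write completes."""
        self.committed = self.committing
        self.committing = None
        for want in sorted(self.waiters):
            if want <= self.committed:
                for callback in self.waiters.pop(want):
                    callback(want)
        # The fix: if anyone still waits for a newer version, recommit now
        # instead of hoping some later caller asks for a commit.
        if self.waiters:
            self._start_commit()
```

Without the final check in commit_finished, a waiter that arrived while an older version was being written would sit in waiters until some unrelated caller happened to start another commit.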

Greg Farnum [Thu, 12 Nov 2009 20:06:57 +0000 (12:06 -0800)]
rados: fix usage() and -p option checking

Sage Weil [Thu, 12 Nov 2009 18:45:54 +0000 (10:45 -0800)]
mds: warn, don't crash, on rfiles/rsubdirs underflow

This doesn't fix the bug, but lets the mds at least start up.

Sage Weil [Thu, 12 Nov 2009 00:10:14 +0000 (16:10 -0800)]
uclient: increase cache size

Sage Weil [Thu, 12 Nov 2009 00:09:52 +0000 (16:09 -0800)]
filestore: flusher thread; commit snaps (disabled)

Sage Weil [Wed, 11 Nov 2009 18:20:12 +0000 (10:20 -0800)]
osd: avoid truncate, remove ops we know will fail

Now that we check return codes, these cause problems.

Sage Weil [Wed, 11 Nov 2009 18:19:28 +0000 (10:19 -0800)]
osd: don't requeue pg removal if already removing

Sage Weil [Wed, 11 Nov 2009 23:52:10 +0000 (15:52 -0800)]
todo

Sage Weil [Wed, 11 Nov 2009 23:47:28 +0000 (15:47 -0800)]
mds: force rdlock on any snapped inodes

When the client has an excl lock on an inode, and it's
stating a snapped version of it, we can't expect it to
put 2 and 2 together and look at its head metadata.  If
the cap does not follow the snapid we're trying to stat, do
the full rdlock to force the snapped values back to the
mds so we can do the cow.

If there is nothing to cow, the cap will get reissued with an
accurate follows value, and we won't have to do this again.

Greg Farnum [Wed, 11 Nov 2009 00:44:07 +0000 (16:44 -0800)]
sessionmap is an object, not a pointer

Sage Weil [Wed, 11 Nov 2009 00:32:01 +0000 (16:32 -0800)]
mds: fix typo; also only suicide if we have clients

If we are the last MDS and have no clients we should be
able to stop cleanly....

Sage Weil [Wed, 11 Nov 2009 00:26:44 +0000 (16:26 -0800)]
mds: underwater is function of _loaded_ version, not in core version

We may load a dir version off disk that is older than the
in-core version (because we got newer data from the
journal, say).  When marking underwater items clean, do
so based on the _loaded_ version, not our in-core version.

Sage Weil [Mon, 9 Nov 2009 21:32:07 +0000 (13:32 -0800)]
todo

Greg Farnum [Wed, 11 Nov 2009 00:02:04 +0000 (16:02 -0800)]
mds: If last MDS, suicide on stop rather than entering infinite requeue loop

Sage Weil [Tue, 10 Nov 2009 22:51:20 +0000 (14:51 -0800)]
osd: do not apply_transaction in finish_recovery

finish_recovery needs to set up a callback for when the current set of
changes commit to disk (to kickstart cleanup of stray replicas etc).  We
can't call apply_transaction this deep inside the call chain without
causing problems.  So, pass a list of completion contexts all the way down
so that we can set up the completion callback.

Sage Weil [Tue, 10 Nov 2009 22:51:33 +0000 (14:51 -0800)]
filestore: don't croak on 0 op usertrans error

Sage Weil [Tue, 10 Nov 2009 21:12:33 +0000 (13:12 -0800)]
filestore: check return values

Sage Weil [Tue, 10 Nov 2009 16:23:38 +0000 (08:23 -0800)]
test_trans

Sage Weil [Tue, 10 Nov 2009 16:09:18 +0000 (08:09 -0800)]
filestore: fix usertrans setxattr, print it out nicely

Sage Weil [Tue, 10 Nov 2009 15:51:44 +0000 (07:51 -0800)]
filestore: clean up btrfs ioctls; use actual btrfs ioctl.h

Sage Weil [Tue, 10 Nov 2009 15:37:56 +0000 (07:37 -0800)]
sample.ceph.conf: include usertrans flag

Sage Weil [Tue, 10 Nov 2009 15:37:45 +0000 (07:37 -0800)]
filestore: make FileStore btrfs ioctl tests more readable

Sage Weil [Mon, 9 Nov 2009 21:17:29 +0000 (13:17 -0800)]
osd: log misdirected ops; reply with -ENXIO

This is more helpful than assert(0).  It's still bad (it means the client
and osd calculated different pg mappings), but this makes it easier
to identify and fix.
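
A rough sketch of the behavior, in Python rather than the OSD's own code. The function handle_op and its parameters are hypothetical; the only details taken from the commit are that a misdirected op is logged and answered with -ENXIO instead of tripping an assert.

```python
import errno
import logging

log = logging.getLogger("osd_sketch")


def handle_op(my_osd: int, client_pg: str, osd_pg: str) -> int:
    """Return 0 if the op is accepted, or -ENXIO if it was misdirected.

    client_pg is the pg the client addressed; osd_pg is the pg this osd
    computed for the same object (both hypothetical string ids).
    """
    if client_pg != osd_pg:
        # Log enough detail to see which side computed which mapping.
        log.warning("osd%d: misdirected op, client sent pg %s, osd computed pg %s",
                    my_osd, client_pg, osd_pg)
        return -errno.ENXIO
    return 0
```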

Sage Weil [Mon, 9 Nov 2009 21:01:08 +0000 (13:01 -0800)]
sample.ceph.conf: fix btrfs mountoptions

Sage Weil [Sun, 8 Nov 2009 17:21:16 +0000 (09:21 -0800)]
osdmap: clear out old hash distribution code

This screws up linkage because not everything that #includes osdmap.h
links crush.

Sage Weil [Sun, 8 Nov 2009 05:06:27 +0000 (21:06 -0800)]
osd: make pgids sort on (pool, preferred, ps)

This makes pg dump output easier to read, mainly.
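
As an illustration only, the ordering can be sketched in Python with a tuple key. The PgId type and the use of -1 for "no preferred osd" are assumptions; the field order (pool, preferred, ps) comes from the commit title.

```python
from typing import List, NamedTuple


class PgId(NamedTuple):
    """Hypothetical stand-in for a placement group id."""
    pool: int
    preferred: int  # assumed: -1 means no preferred osd
    ps: int         # placement seed within the pool


def sort_pgids(pgids: List[PgId]) -> List[PgId]:
    """Order pgids by pool first, then preferred osd, then placement seed."""
    return sorted(pgids, key=lambda pg: (pg.pool, pg.preferred, pg.ps))
```

Sorting this way groups all pgs of one pool together in a dump, which is the readability gain the commit mentions.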

Sage Weil [Sun, 8 Nov 2009 05:05:32 +0000 (21:05 -0800)]
crushtool: small fix

Sage Weil [Sun, 8 Nov 2009 04:49:37 +0000 (20:49 -0800)]
Revert "crush: use spirit classic headers"

This reverts commit 28ac4441b87907713ddaf1fe1dee62350f947cf3.

Sage Weil [Sat, 7 Nov 2009 23:27:08 +0000 (15:27 -0800)]
hash: small cleanup

Sage Weil [Sun, 8 Nov 2009 04:12:21 +0000 (20:12 -0800)]
crush: hrm fix up builder too

fix

Sage Weil [Sun, 8 Nov 2009 03:59:42 +0000 (19:59 -0800)]
crush: use spirit classic headers

This makes the 'deprecated' warnings go away.

Sage Weil [Sun, 8 Nov 2009 03:59:05 +0000 (19:59 -0800)]
crush: make hash function selectable

Sage Weil [Sat, 7 Nov 2009 05:36:43 +0000 (21:36 -0800)]
osd: make object hash a pg_pool parameter

Sage Weil [Sat, 7 Nov 2009 00:48:38 +0000 (16:48 -0800)]
osdmaptool: test-map-object

Sage Weil [Sat, 7 Nov 2009 00:43:47 +0000 (16:43 -0800)]
osd: use stronger hash function for mapping objects -> pgs

The old hash (from linux dcache) was very weak, such that
least sig bits may not change and you could get lots of
consecutive objects on the same osds (because lsbits of the
pg weren't changing).

This is Robert Jenkins' hash and is quite strong.  Public
domain.

Rev osd disk format, protocol, since we're totally changing
object placement here.
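
To show why well-mixed low bits matter for object -> pg mapping, here is a small Python sketch. It uses Jenkins' one-at-a-time hash purely as a stand-in; the commit does not say which of Jenkins' hashes was adopted, and the function names and the modulo mapping are assumptions.

```python
def jenkins_one_at_a_time(data: bytes) -> int:
    """Bob Jenkins' one-at-a-time hash, kept to 32 bits."""
    h = 0
    for byte in data:
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    # Final avalanche so every input bit can affect the low output bits.
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def object_to_pg(name: str, pg_num: int) -> int:
    """Hypothetical mapping: pick a pg from the hash of the object name."""
    return jenkins_one_at_a_time(name.encode("utf-8")) % pg_num
```

With a weak hash, consecutive names such as obj.1, obj.2, ... can share their low bits and pile onto the same pgs; a well-mixed hash spreads them out.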

Sage Weil [Fri, 6 Nov 2009 23:37:01 +0000 (15:37 -0800)]
crush: no more static inline

Greg Farnum [Fri, 6 Nov 2009 23:27:23 +0000 (15:27 -0800)]
osd: This logic is slightly less confusing without the always-true 'full' param

Sage Weil [Fri, 6 Nov 2009 21:54:53 +0000 (13:54 -0800)]
mon: don't delete stats when sending osdmap incremental

send_latest will delete m, and/or wait.  Instead call send_incremental
directly only when we know paxos is_readable.

Sage Weil [Fri, 6 Nov 2009 20:12:24 +0000 (12:12 -0800)]
qa: test subdir mounts

Sage Weil [Fri, 6 Nov 2009 06:22:27 +0000 (22:22 -0800)]
init-ceph: tell user we're mounting btrfs

Sage Weil [Fri, 6 Nov 2009 21:36:32 +0000 (13:36 -0800)]
osd: send single osd_boot on startup

Sage Weil [Fri, 6 Nov 2009 21:36:25 +0000 (13:36 -0800)]
mon: don't log dup osd boot msgs

Sage Weil [Fri, 6 Nov 2009 21:35:08 +0000 (13:35 -0800)]
msgr: leave off ss_family when printing

Sage Weil [Fri, 6 Nov 2009 06:21:22 +0000 (22:21 -0800)]
todo

Sage Weil [Fri, 6 Nov 2009 21:09:58 +0000 (13:09 -0800)]
ppc: do not copy_in unencoded __u32

Fixes big endian bugs.

Sage Weil [Fri, 6 Nov 2009 21:09:26 +0000 (13:09 -0800)]
buffer: only define _XOPEN_SOURCE ifndef

Sage Weil [Fri, 6 Nov 2009 07:03:59 +0000 (23:03 -0800)]
todo

Sage Weil [Fri, 6 Nov 2009 06:25:30 +0000 (22:25 -0800)]
mon: make initial monmap epoch match paxos version

Sage Weil [Fri, 6 Nov 2009 06:24:19 +0000 (22:24 -0800)]
mkmonfs: use common_init and parse regular args

Sage Weil [Fri, 6 Nov 2009 05:53:15 +0000 (21:53 -0800)]
don't forget -standalone.git in release checklist

Greg Farnum [Wed, 4 Nov 2009 00:43:53 +0000 (16:43 -0800)]
uclient: set NAME_MAX = PAGE_SIZE

Greg Farnum [Thu, 5 Nov 2009 20:58:29 +0000 (12:58 -0800)]
Hadoop: Update JavaDoc and put in new patch file

Greg Farnum [Wed, 4 Nov 2009 22:14:41 +0000 (14:14 -0800)]
Hadoop: Numerous fixes.
Set bufUsed=0 on a flush to avoid bad rewrites of data.
Downgrade a warning in IOStream, since Hadoop apparently checks for EOF by reading until a read returns -1.
Remove some leftover if checks that don't do anything.
TODO: Remove something Sage did already.

Yehuda Sadeh [Thu, 5 Nov 2009 18:00:19 +0000 (10:00 -0800)]
mount: fix hint initialization for getaddrinfo

Sage Weil [Thu, 5 Nov 2009 05:43:47 +0000 (21:43 -0800)]
filestore: oops less noisy

Sage Weil [Thu, 5 Nov 2009 00:21:20 +0000 (16:21 -0800)]
osdmap: fix pps calc for preferred pgs

Sage Weil [Thu, 5 Nov 2009 00:09:32 +0000 (16:09 -0800)]
osd: nicer debug output on misdirected requests

Sage Weil [Wed, 4 Nov 2009 23:22:54 +0000 (15:22 -0800)]
mds: mark scatterlocks with dirty rstat/fragstat during replay

This ensures we propagate this info back toward the root
after we've replayed and gone active.

Sage Weil [Wed, 4 Nov 2009 23:22:11 +0000 (15:22 -0800)]
mds: use fnode when replaying journal

We weren't actually pulling the fnode from the journal. I'm
surprised anything worked.

This fixes a crash when untarring, restarting mds, and then
rm -r'ing the dir.  Previously the rstat would roll
negative and assert.

Sage Weil [Wed, 4 Nov 2009 21:27:06 +0000 (13:27 -0800)]
debian: gracefully replace lib packages prior to '1' suffix

Sage Weil [Wed, 4 Nov 2009 20:24:39 +0000 (12:24 -0800)]
osd: kill int <-> pg_t conversions

These are messy and asking for trouble.  And a cleaner
coll_t paves the way for real named coll_t's down the
line.

Sage Weil [Wed, 4 Nov 2009 19:37:41 +0000 (11:37 -0800)]
osd: convert ceph_pg union to struct

This simplifies/fixes endian conversions.

Sage Weil [Tue, 3 Nov 2009 23:15:22 +0000 (15:15 -0800)]
msgr: encode sockaddr.ss_family big endian in ceph_entity_addr

The ss_family field is normally host endianness, but we
want to exchange ceph_entity_addr across the wire and store
it on disk.  So, encode ss_family in big endian (to match
the other sockaddr field endianness).

Rev disk and wire protocols to match.
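
A minimal Python sketch of the idea, not the messenger's actual encoder. The helper names are hypothetical, and a 16-bit family field is assumed; the point is only that a fixed big-endian encoding yields the same bytes on every host.

```python
import socket
import struct


def encode_family(family: int) -> bytes:
    """Encode the address family as a 16-bit big-endian value."""
    return struct.pack(">H", family)


def decode_family(raw: bytes) -> int:
    """Decode a 16-bit big-endian address family back to a host integer."""
    (family,) = struct.unpack(">H", raw)
    return family
```

For example, encode_family(socket.AF_INET) produces the same two bytes on little-endian and big-endian machines, so the value can be stored on disk or sent to a peer with a different byte order.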

Sage Weil [Tue, 3 Nov 2009 20:31:40 +0000 (12:31 -0800)]
filestore: close btrfs transactions cleanly

Don't just close the fd.  We may add a mount option that makes a messy
transaction close wedge the fs instead of committing partial results.

Sage Weil [Tue, 27 Oct 2009 15:39:02 +0000 (08:39 -0700)]
default pid file /var/run/ceph/$type.$id.pid

Greg Farnum [Tue, 3 Nov 2009 00:37:38 +0000 (16:37 -0800)]
Hadoop: Don't use the local pg by default.

Greg Farnum [Mon, 2 Nov 2009 04:18:29 +0000 (20:18 -0800)]
Now passes all unit tests!
Changes made to rename and between error codes/exception throwing.

Greg Farnum [Sat, 31 Oct 2009 01:39:48 +0000 (18:39 -0700)]
Hadoop: Don't throw IOExceptions on extra calls to close

Greg Farnum [Fri, 30 Oct 2009 22:37:06 +0000 (15:37 -0700)]
Hadoop: f5 e3

Greg Farnum [Fri, 30 Oct 2009 20:53:54 +0000 (13:53 -0700)]
Hadoop: Behavioral fixes to CephFileSystem

Greg Farnum [Fri, 30 Oct 2009 20:50:30 +0000 (13:50 -0700)]
Hadoop: Various changes to CephFaker; not completed.

Greg Farnum [Thu, 29 Oct 2009 22:06:12 +0000 (15:06 -0700)]
Hadoop: CephFaker now properly sanitizes input filenames.

Greg Farnum [Thu, 29 Oct 2009 01:36:22 +0000 (18:36 -0700)]
Hadoop: Update constructor/initialize usage a bit

Greg Farnum [Wed, 28 Oct 2009 21:04:58 +0000 (14:04 -0700)]
Simplify CephInputStream so that it actually works

Greg Farnum [Tue, 27 Oct 2009 20:57:29 +0000 (13:57 -0700)]
Hadoop: Simplify CephOutputStream so that it actually works.

Greg Farnum [Tue, 27 Oct 2009 20:02:30 +0000 (13:02 -0700)]
Hadoop: Reorder a few things for better safety and fix compile bugs