git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
15 years agobuilddebs.sh: suppress lintian as root warning
Sage Weil [Tue, 2 Mar 2010 18:02:36 +0000 (10:02 -0800)]
builddebs.sh: suppress lintian as root warning

15 years agoauth: Add an auth_uid to AuthTicket. Still to do: copy it around
Greg Farnum [Tue, 2 Mar 2010 17:51:19 +0000 (09:51 -0800)]
auth: Add an auth_uid to AuthTicket. Still to do: copy it around

15 years agoauth: misc printing and initialization fixes
Greg Farnum [Sat, 27 Feb 2010 01:20:43 +0000 (17:20 -0800)]
auth: misc printing and initialization fixes

15 years agovstart:/mkcephfs: set client.admin auth_uid to 0
Greg Farnum [Sat, 27 Feb 2010 01:20:18 +0000 (17:20 -0800)]
vstart:/mkcephfs: set client.admin auth_uid to 0

15 years agoauthtool: set generated key to specific uid if one is given
Greg Farnum [Fri, 26 Feb 2010 23:34:50 +0000 (15:34 -0800)]
authtool: set generated key to specific uid if one is given

15 years agomsgr: Remove the type CEPH_ENTITY_TYPE_ADMIN.
Greg Farnum [Fri, 26 Feb 2010 19:28:25 +0000 (11:28 -0800)]
msgr: Remove the type CEPH_ENTITY_TYPE_ADMIN.
It *looks* like this won't break EntityName's isAdmin() function as that
depends on a set name, and client.admin will satisfy it. I think.

15 years agoceph_fs: Split CEPH_FEATURE_{SUPPORTED, REQUIRED} flags into service-based flags
Greg Farnum [Fri, 26 Feb 2010 18:57:37 +0000 (10:57 -0800)]
ceph_fs: Split CEPH_FEATURE_{SUPPORTED, REQUIRED} flags into service-based flags
msgr: New get_{required,supported}_bits methods which calculate required bits
based on type of self and of peer. Replace all hard-coded flag uses with these.

15 years agoOSD: If an auth_uid owns a pool and has no explicit permissions,
Greg Farnum [Fri, 26 Feb 2010 00:37:07 +0000 (16:37 -0800)]
OSD: If an auth_uid owns a pool and has no explicit permissions,
grant it full privileges.

15 years agoosd: Add auth_uid to OSDCaps, and fill in.
Greg Farnum [Fri, 26 Feb 2010 00:28:46 +0000 (16:28 -0800)]
osd: Add auth_uid to OSDCaps, and fill in.

15 years agoauth: constant for unknown/unsecured user
Greg Farnum [Fri, 26 Feb 2010 00:30:39 +0000 (16:30 -0800)]
auth: constant for unknown/unsecured user

15 years agofilestore: fix ondisk vs onreadable_sync deadlock
Sage Weil [Tue, 2 Mar 2010 17:57:25 +0000 (09:57 -0800)]
filestore: fix ondisk vs onreadable_sync deadlock

Do ondisk completion async in journaled_ahead completion to avoid
any onreadable_sync getting held up behind an ondisk completion.

Reuse the op_finisher finisher thread for this.

15 years agotodo
Sage Weil [Tue, 2 Mar 2010 17:45:58 +0000 (09:45 -0800)]
todo

15 years agobuilddebs.sh: keep base.tgz outside ~/debian
Sage Weil [Tue, 2 Mar 2010 17:45:55 +0000 (09:45 -0800)]
builddebs.sh: keep base.tgz outside ~/debian

15 years agomds: put forced open sessions in OPENING then OPEN
Sage Weil [Tue, 2 Mar 2010 00:34:17 +0000 (16:34 -0800)]
mds: put forced open sessions in OPENING then OPEN

We use OPENING state to indicate sessions that are being
imported.  Fix get_or_add_open_session() to NOT set the session
state (except to STATE_NEW if new) so that the caller can do
the right thing.  Otherwise, the prepare_force_open_sessions()
can't tell if it just forced open a session (and needs it to be
OPENING) or if it was already open.  Subsequently cap migrations
weren't working if the client didn't already have a session
open.

There is still a bug: if the import aborts, we have an OPENING
session with no actual open client_session message queued.  Maybe
we should have a different state instead of OPENING... IMPORTING?

15 years agotodo
Sage Weil [Mon, 1 Mar 2010 18:41:13 +0000 (10:41 -0800)]
todo

15 years agodebian: new release, push, build, publish scripts
Sage Weil [Mon, 1 Mar 2010 04:19:04 +0000 (20:19 -0800)]
debian: new release, push, build, publish scripts

15 years agoMakefile: fix /sbin hack
Sage Weil [Sun, 28 Feb 2010 22:39:55 +0000 (14:39 -0800)]
Makefile: fix /sbin hack

misplaced ;'s

15 years agoMakefile: include debian/
Sage Weil [Sun, 28 Feb 2010 22:39:45 +0000 (14:39 -0800)]
Makefile: include debian/

15 years agomds: fix (new) sessionmap decoding
Sage Weil [Fri, 26 Feb 2010 23:42:43 +0000 (15:42 -0800)]
mds: fix (new) sessionmap decoding

der!

15 years agomds: fix mds_export_targets message parent
Sage Weil [Fri, 26 Feb 2010 23:41:04 +0000 (15:41 -0800)]
mds: fix mds_export_targets message parent

Needs to be a PaxosServiceMessage.  The monitor was assuming
as much, casting, and getting a gibberish prior paxos version.

15 years agoqa: download linux tarball from ceph.newdream.net for kernel compile test
Sage Weil [Fri, 26 Feb 2010 22:50:54 +0000 (14:50 -0800)]
qa: download linux tarball from ceph.newdream.net for kernel compile test

15 years agocauthtool: validate arguments better, cleanup
Sage Weil [Fri, 26 Feb 2010 21:29:17 +0000 (13:29 -0800)]
cauthtool: validate arguments better, cleanup

Sanity check arguments up front.  Clean up code a bit.

15 years agoosd: use onreadable_sync finishers to drop ondisk locks
Sage Weil [Fri, 26 Feb 2010 20:30:07 +0000 (12:30 -0800)]
osd: use onreadable_sync finishers to drop ondisk locks

This fixes a deadlock where we are holding pg->lock, block
waiting for the ondisk lock, but the unlock completion is stuck
behind something else waiting on pg->lock in the finisher queue.

15 years agofilestore: add onreadable_sync callback
Sage Weil [Fri, 26 Feb 2010 20:29:17 +0000 (12:29 -0800)]
filestore: add onreadable_sync callback

Add an additional completion context that gets called
synchronously when the operation completes, instead of getting
shunted to the async finisher thread.  This allows us to make
sure certain completion events happen without getting 'stuck
in line' behind other completions with conflicting locks.

15 years agocontext: minor finish_contexts() cleanup
Sage Weil [Fri, 26 Feb 2010 20:09:17 +0000 (12:09 -0800)]
context: minor finish_contexts() cleanup

15 years agofinisher: generic C_OnFinisher context
Sage Weil [Fri, 26 Feb 2010 19:16:04 +0000 (11:16 -0800)]
finisher: generic C_OnFinisher context

Queue given context on given finisher.
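
A minimal sketch of the wrapper idea, assuming a single-threaded finisher that runs queued callbacks when drained. Context, Finisher and C_Record here are simplified, hypothetical stand-ins and not the actual Ceph classes; only the name C_OnFinisher comes from the commit.

```cpp
#include <cassert>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical stand-in for a completion callback.
struct Context {
  virtual ~Context() = default;
  virtual void finish(int r) = 0;
  // Run the callback once, then free it.
  void complete(int r) { finish(r); delete this; }
};

// Simplified finisher: callbacks wait in a queue until drain() runs them.
class Finisher {
  std::deque<std::pair<Context*, int>> q;
public:
  void queue(Context* c, int r) { q.emplace_back(c, r); }
  void drain() {
    while (!q.empty()) {
      Context* c = q.front().first;
      int r = q.front().second;
      q.pop_front();
      c->complete(r);
    }
  }
  ~Finisher() { for (auto& p : q) delete p.first; }
};

// Completing this wrapper does not run `inner`; it hands it to the finisher.
struct C_OnFinisher : Context {
  Context* inner;
  Finisher* fin;
  C_OnFinisher(Context* c, Finisher* f) : inner(c), fin(f) { assert(c && f); }
  void finish(int r) override { fin->queue(inner, r); }
};

// Example callback that records the result codes it sees.
struct C_Record : Context {
  std::vector<int>* out;
  explicit C_Record(std::vector<int>* o) : out(o) {}
  void finish(int r) override { out->push_back(r); }
};
```

Completing a C_OnFinisher only moves the inner callback onto the finisher's queue; the inner callback runs later, when the finisher gets to it.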

15 years agoobjectstore: conflate onjournal and ondisk
Sage Weil [Fri, 26 Feb 2010 19:10:29 +0000 (11:10 -0800)]
objectstore: conflate onjournal and ondisk

No callers actually make this distinction, and the ObjectStore
hides the details of if/whether/how things go to the journal
or disk or in what order, so simplify things all around.

15 years agomds: revise mds sessionmap encoding [disk format change]
Sage Weil [Thu, 25 Feb 2010 23:50:09 +0000 (15:50 -0800)]
mds: revise mds sessionmap encoding [disk format change]

Encode session name before session itself, so that we can use
an existing session instead of allocating a new one.  This lets
us keep eagerly reconnecting clients that connect before we
load the sessionmap.

Add proper struct_v.  Drop useless/incorrect 'n' value.

Continue to read old format, of course.  Some minor hackery
because we didn't have a struct_v before.

15 years agomds: initialize session state, even if we had it already
Sage Weil [Thu, 25 Feb 2010 23:48:00 +0000 (15:48 -0800)]
mds: initialize session state, even if we had it already

This ensures that eagerly reconnecting clients get set to OPEN
when we replay their session from the journal.

15 years agomds: assert client_must_resend on forward.
Sage Weil [Thu, 25 Feb 2010 23:38:42 +0000 (15:38 -0800)]
mds: assert client_must_resend on forward.

15 years agomds: print session states as string; nicer dump
Sage Weil [Thu, 25 Feb 2010 23:25:03 +0000 (15:25 -0800)]
mds: print session states as string; nicer dump

15 years agoencoding: make bool encoder explicitly u8
Sage Weil [Thu, 25 Feb 2010 23:24:15 +0000 (15:24 -0800)]
encoding: make bool encoder explicitly u8

15 years agoauth: auth_uid needs to be in AuthCapsInfo as well.
Greg Farnum [Thu, 25 Feb 2010 21:34:47 +0000 (13:34 -0800)]
auth: auth_uid needs to be in AuthCapsInfo as well.

Conflicts:

src/auth/Auth.h

15 years agoauthmon: reminder --> remainder, for less confusion!
Greg Farnum [Thu, 25 Feb 2010 18:46:36 +0000 (10:46 -0800)]
authmon: reminder --> remainder, for less confusion!

15 years agomds: if we have no subtrees on rejoin_done, leave cluster
Sage Weil [Thu, 25 Feb 2010 19:48:35 +0000 (11:48 -0800)]
mds: if we have no subtrees on rejoin_done, leave cluster

15 years agomds: fix trim_non_auth empty lru case
Sage Weil [Thu, 25 Feb 2010 19:48:14 +0000 (11:48 -0800)]
mds: fix trim_non_auth empty lru case

If our lru is empty, make sure we clean things out _after_
unpinning the subtrees!  This came up after an mds leaving the
cluster crashed before it finished, and on replay/rejoin had
no auth subtrees.

15 years agoqa: specify logdir on command line (or assume rundir)
Sage Weil [Thu, 25 Feb 2010 18:53:15 +0000 (10:53 -0800)]
qa: specify logdir on command line (or assume rundir)

Don't just put logs with the qa source.  Among other things that
means we can't run runallonce.sh twice.

15 years agoosd: do not activate pg if lost osds and no acting has gone active
Sage Weil [Thu, 25 Feb 2010 18:52:20 +0000 (10:52 -0800)]
osd: do not activate pg if lost osds and no acting has gone active

If no acting osd has gone active since it most recently
joined the pg, then we may not have up-to-date pg state (log,
etc.).  It may even be empty.

If so, then do not activate even if an osd is marked lost.

15 years agoosd: detect permanently lost objects, and continue
Sage Weil [Thu, 25 Feb 2010 18:40:25 +0000 (10:40 -0800)]
osd: detect permanently lost objects, and continue

If we mark an osd lost, and subsequently there are some objects
that are permanently lost, recover.  Adjust the missing map to
no longer expect those new revisions.  (FIXME: pg stats are not
correctly adjusted; a repair will be needed.)

15 years agoosd: send log events to monitor
Sage Weil [Thu, 25 Feb 2010 18:39:01 +0000 (10:39 -0800)]
osd: send log events to monitor

15 years agoosd: print lost_at
Sage Weil [Wed, 24 Feb 2010 19:07:07 +0000 (11:07 -0800)]
osd: print lost_at

15 years agomds: fix sessionmap decoding
Sage Weil [Wed, 24 Feb 2010 05:31:15 +0000 (21:31 -0800)]
mds: fix sessionmap decoding

There's a session count value encoded, but it's a meaningless
upper bound.  Stop decoding when we hit the end of the buffer
instead.
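
A rough sketch of decoding until the buffer runs out rather than trusting an encoded count. The wire format here (a u32 count followed by length-prefixed names, host byte order) and the helper names are assumptions for illustration, not the real sessionmap encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Append a u32 to the buffer (hypothetical helper).
void put_u32(std::string& buf, uint32_t v) {
  buf.append(reinterpret_cast<const char*>(&v), sizeof(v));
}

// Read a u32 at `off`, advancing it; returns false if too few bytes remain.
bool get_u32(const std::string& buf, size_t& off, uint32_t& v) {
  if (buf.size() - off < sizeof(v)) return false;
  std::memcpy(&v, buf.data() + off, sizeof(v));
  off += sizeof(v);
  return true;
}

// Decode records until the buffer is exhausted; the count is read and ignored.
std::vector<std::string> decode_names(const std::string& buf) {
  std::vector<std::string> names;
  size_t off = 0;
  uint32_t count_hint = 0;
  if (!get_u32(buf, off, count_hint)) return names;
  (void)count_hint;  // only an upper bound, so it does not drive the loop
  while (off < buf.size()) {
    uint32_t len = 0;
    if (!get_u32(buf, off, len) || buf.size() - off < len) break;  // truncated
    names.emplace_back(buf, off, len);
    off += len;
    assert(off <= buf.size());
  }
  return names;
}
```

With a count of 5 but only two records present, the loop above returns the two records and stops at the end of the buffer.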

15 years agorados: revert indentation so it matches kernel. Oops.
Greg Farnum [Wed, 24 Feb 2010 23:44:03 +0000 (15:44 -0800)]
rados: revert indentation so it matches kernel. Oops.

15 years agomsgr: Set features in ceph_msg_connect_reply
Greg Farnum [Wed, 24 Feb 2010 23:26:27 +0000 (15:26 -0800)]
msgr: Set features in ceph_msg_connect_reply

15 years agoauth: Add a uid field to EntityAuth; make it a required feature
Greg Farnum [Wed, 24 Feb 2010 22:41:11 +0000 (14:41 -0800)]
auth: Add a uid field to EntityAuth; make it a required feature

15 years agorados: fix indentation
Greg Farnum [Tue, 23 Feb 2010 20:35:55 +0000 (12:35 -0800)]
rados: fix indentation

15 years agofiler: remove -> purge_range, and scale to large ranges
Sage Weil [Wed, 24 Feb 2010 18:48:31 +0000 (10:48 -0800)]
filer: remove -> purge_range, and scale to large ranges

Redefine remove interface to operate over a range of object
numbers, not a byte range, since we are removing objects.  It is
the caller's responsibility to ensure they have the proper
range (by mapping from the ceph_file_layout).

And behave when the range is large by only allowing a few in
flight remove requests at once.

Eventually the objecter probably needs a more generalized request
throttling mechanism, but this will do for now.
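
One way to picture the in-flight limit, as a sketch only. RangePurger and its `send` callback are hypothetical names; `send` stands in for issuing an asynchronous remove and is assumed not to complete synchronously.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Remove objects [first, first+num) with a cap on outstanding requests.
class RangePurger {
  uint64_t next, end;
  unsigned in_flight = 0;
  unsigned max_in_flight;
  std::function<void(uint64_t)> send;
public:
  RangePurger(uint64_t first, uint64_t num, unsigned max,
              std::function<void(uint64_t)> s)
    : next(first), end(first + num), max_in_flight(max), send(std::move(s)) {
    assert(max > 0);
  }
  // Issue requests until the window is full or the range is exhausted.
  void start() {
    while (in_flight < max_in_flight && next < end) {
      ++in_flight;
      send(next++);
    }
  }
  // Called when one remove completes; refills the window.
  void on_complete() {
    assert(in_flight > 0);
    --in_flight;
    start();
  }
  unsigned outstanding() const { return in_flight; }
  bool done() const { return next == end && in_flight == 0; }
};
```

The caller kicks things off with start() and calls on_complete() from each remove's completion, so at most `max` requests are ever outstanding regardless of how large the range is.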

15 years agomds: make scatter_nudge actually nudge when replica asks
Sage Weil [Wed, 24 Feb 2010 05:08:56 +0000 (21:08 -0800)]
mds: make scatter_nudge actually nudge when replica asks

If we're not replicated, there is no need to twiddle the
lockstate.. we can just write out any dirty data, as when we
have delayed rstat propagation.  If we are replicated, though,
and a replica asks to nudge the lock, we had better nudge the
lock state!

15 years agomds: correctly set root inode_auth during recovery
Sage Weil [Wed, 24 Feb 2010 05:01:32 +0000 (21:01 -0800)]
mds: correctly set root inode_auth during recovery

Set to root node id as indicated by mdsmap.  Setting the auth
bit alone isn't sufficient.

15 years agomds: show nref=%d if MDS_REF_SET is not defined
Sage Weil [Tue, 23 Feb 2010 23:52:01 +0000 (15:52 -0800)]
mds: show nref=%d if MDS_REF_SET is not defined

So we can see the ref count, at least!

15 years agomds: fix file purge race
Sage Weil [Tue, 23 Feb 2010 23:51:39 +0000 (15:51 -0800)]
mds: fix file purge race

Handle the case where a new inode ref appears while we are
purging an inode.  If so, we just truncate it to 0, so that next
time we go through purge_stray() we don't have to do the work
over again.

This can happen if a client goes snooping in the stray dir (or
who knows what else!).

15 years agofilestore: explicitly parse args for _touch
Sage Weil [Tue, 23 Feb 2010 21:28:18 +0000 (13:28 -0800)]
filestore: explicitly parse args for _touch

15 years agoinit-ceph: don't barf on dash when no command
Sage Weil [Tue, 23 Feb 2010 00:10:50 +0000 (16:10 -0800)]
init-ceph: don't barf on dash when no command

15 years agoautomake: fix mount sbin dir when configured with prefix
Yehuda Sadeh [Tue, 23 Feb 2010 23:56:33 +0000 (15:56 -0800)]
automake: fix mount sbin dir when configured with prefix

15 years agoosd: clean up WRITE, TRUNCATE, TRIMTRUNC
Sage Weil [Tue, 23 Feb 2010 23:20:43 +0000 (15:20 -0800)]
osd: clean up WRITE, TRUNCATE, TRIMTRUNC

- consolidate TRUNCATE and TRIMTRUNC.
- truncate on WRITE.
- handy debug prints.

15 years agocauthtool: --caps fn alone is a command
Sage Weil [Mon, 22 Feb 2010 20:23:42 +0000 (12:23 -0800)]
cauthtool: --caps fn alone is a command

15 years agotodo
Sage Weil [Tue, 23 Feb 2010 00:00:18 +0000 (16:00 -0800)]
todo

15 years agodebian: mount.ceph in /sbin, not /usr/sbin
Sage Weil [Sat, 20 Feb 2010 05:19:09 +0000 (21:19 -0800)]
debian: mount.ceph in /sbin, not /usr/sbin

15 years agoobjectstore: simpler transaction encoding
Sage Weil [Thu, 18 Feb 2010 23:05:33 +0000 (15:05 -0800)]
objectstore: simpler transaction encoding

Just concatenate operations to a bufferlist as we go.  No
distinct decoding step is needed; we parse the transaction as it
is replayed/applied.  This avoids the old decoded intermediate
representation overhead.

Since we still decode the old version, that code is still there,
but not used for anything new.
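
A small sketch of the append-as-you-go idea. The opcode values, the std::string used in place of a bufferlist, and the set used as a store are all assumptions for illustration, not the real ObjectStore::Transaction format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <set>
#include <string>

// Hypothetical opcodes.
enum Op : uint8_t { OP_TOUCH = 1, OP_REMOVE = 2 };

class Transaction {
  std::string bl;  // stand-in for a bufferlist
  void put_str(const std::string& s) {
    assert(s.size() <= UINT32_MAX);
    uint32_t len = static_cast<uint32_t>(s.size());
    bl.append(reinterpret_cast<const char*>(&len), sizeof(len));
    bl.append(s);
  }
public:
  // Each call appends the encoded op; no intermediate representation is kept.
  void touch(const std::string& oid)  { bl.push_back(OP_TOUCH);  put_str(oid); }
  void remove(const std::string& oid) { bl.push_back(OP_REMOVE); put_str(oid); }
  const std::string& encoded() const { return bl; }
};

// Parse ops straight out of the buffer while applying them to `store`.
// Returns false on a truncated or unknown op.
bool apply(const std::string& bl, std::set<std::string>& store) {
  size_t off = 0;
  while (off < bl.size()) {
    uint8_t op = static_cast<uint8_t>(bl[off++]);
    uint32_t len = 0;
    if (bl.size() - off < sizeof(len)) return false;
    std::memcpy(&len, bl.data() + off, sizeof(len));
    off += sizeof(len);
    if (bl.size() - off < len) return false;
    std::string oid(bl, off, len);
    off += len;
    if (op == OP_TOUCH) store.insert(oid);
    else if (op == OP_REMOVE) store.erase(oid);
    else return false;
  }
  return true;
}
```

The encoded buffer is the transaction; replaying it is a single pass that decodes each op just before applying it.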

15 years agovstart: default to 3 mds
Sage Weil [Thu, 18 Feb 2010 20:36:04 +0000 (12:36 -0800)]
vstart: default to 3 mds

15 years agouclient: do not retain caps being revoked
Sage Weil [Thu, 18 Feb 2010 19:50:49 +0000 (11:50 -0800)]
uclient: do not retain caps being revoked

Matches kclient commit 68c28323.

15 years agodebug: fix warnings, use larger path buffers
Sage Weil [Thu, 18 Feb 2010 17:57:34 +0000 (09:57 -0800)]
debug: fix warnings, use larger path buffers

15 years agologger: fix warning
Sage Weil [Thu, 18 Feb 2010 17:57:15 +0000 (09:57 -0800)]
logger: fix warning

15 years agoworkqueue: behave when multiple threads call drain()
Sage Weil [Thu, 18 Feb 2010 05:47:18 +0000 (21:47 -0800)]
workqueue: behave when multiple threads call drain()

Use a counter, not a bool.
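
A sketch of why a counter helps, assuming the counter tracks how many threads are currently inside drain(). DrainState is a hypothetical name, and the locking and condition variables a real work queue needs are left out.

```cpp
#include <cassert>

// Bookkeeping only: how many callers are draining right now.
class DrainState {
  int draining = 0;
public:
  void begin_drain() { ++draining; }
  void end_drain() { assert(draining > 0); --draining; }
  // Workers consult this; it stays true until every drainer has finished.
  bool is_draining() const { return draining > 0; }
};
```

With a bool, the first drainer to finish would clear the flag while a second drainer was still waiting; with a counter the state only clears when the last one leaves.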

15 years agomds: add support for directory sticky bit
Sage Weil [Thu, 18 Feb 2010 05:24:50 +0000 (21:24 -0800)]
mds: add support for directory sticky bit

Take an rdlock on the directory authlock, so that we can reliably set the
new inode's gid if the directory mode has SGID bit set.

15 years agofilestore: only do btrfs_snap if btrfs
Sage Weil [Thu, 18 Feb 2010 05:11:30 +0000 (21:11 -0800)]
filestore: only do btrfs_snap if btrfs

15 years agoMerge commit 'origin/filestore' into unstable
Sage Weil [Wed, 17 Feb 2010 22:57:31 +0000 (14:57 -0800)]
Merge commit 'origin/filestore' into unstable

Conflicts:

src/os/FileStore.cc
src/os/FileStore.h

15 years agoupdate release checklist
Sage Weil [Wed, 17 Feb 2010 22:55:40 +0000 (14:55 -0800)]
update release checklist

15 years agov0.19 v0.19
Sage Weil [Wed, 17 Feb 2010 21:53:06 +0000 (13:53 -0800)]
v0.19

15 years agomon: disable 'osd setmap'
Sage Weil [Wed, 17 Feb 2010 17:18:16 +0000 (09:18 -0800)]
mon: disable 'osd setmap'

This is dangerous, since it doesn't preserve old pool ids or pool_max, and
will confuse osds and generally wreak havoc.

15 years agoosdmap: fix uninit var warning
Sage Weil [Wed, 17 Feb 2010 03:53:51 +0000 (19:53 -0800)]
osdmap: fix uninit var warning

Harmless, but this shuts it up.

15 years agomon: add 'auth export [name]' to export a full or partial keyring
Sage Weil [Wed, 17 Feb 2010 00:37:59 +0000 (16:37 -0800)]
mon: add 'auth export [name]' to export a full or partial keyring

15 years agoqa: fix snaptest1.sh
Sage Weil [Mon, 15 Feb 2010 23:28:02 +0000 (15:28 -0800)]
qa: fix snaptest1.sh

15 years agoosdmap: decode old osdmaps prior to pool_max stuff
Sage Weil [Tue, 16 Feb 2010 23:59:09 +0000 (15:59 -0800)]
osdmap: decode old osdmaps prior to pool_max stuff

15 years agoosdmap: get rid of useless max_pools
Sage Weil [Tue, 16 Feb 2010 23:49:16 +0000 (15:49 -0800)]
osdmap: get rid of useless max_pools

15 years agoosd: pool cleanups
Sage Weil [Tue, 16 Feb 2010 23:07:35 +0000 (15:07 -0800)]
osd: pool cleanups

missed this before:

 - no need to initialize in create_pending(), constructor does that
 - int32_t, not int
 - pool_max while we're at it
 - initialize pool_max in OSDMap constructor

15 years agotodo
Sage Weil [Tue, 16 Feb 2010 22:33:02 +0000 (14:33 -0800)]
todo

15 years agomds: ignore session RENEWCAPS if state not open|stale
Sage Weil [Tue, 16 Feb 2010 22:32:43 +0000 (14:32 -0800)]
mds: ignore session RENEWCAPS if state not open|stale

This avoids breakage where a renewcaps races with a session
being purged, for example.

15 years agoosdmap/mon: Be more defensive about highest_pool_num usage
Greg Farnum [Tue, 16 Feb 2010 22:15:12 +0000 (14:15 -0800)]
osdmap/mon: Be more defensive about highest_pool_num usage

15 years agorados tool: mkpool/rmpool commands now available
Greg Farnum [Tue, 16 Feb 2010 20:39:46 +0000 (12:39 -0800)]
rados tool: mkpool/rmpool commands now available

15 years agomon: can now delete pools via 'ceph osd pool delete foo'
Greg Farnum [Tue, 16 Feb 2010 17:22:32 +0000 (09:22 -0800)]
mon: can now delete pools via 'ceph osd pool delete foo'

15 years agorgw: actually delete pools when using rados!
Greg Farnum [Fri, 12 Feb 2010 22:54:56 +0000 (14:54 -0800)]
rgw: actually delete pools when using rados!

15 years agorados/objecter: can now delete pools!
Greg Farnum [Fri, 12 Feb 2010 22:54:37 +0000 (14:54 -0800)]
rados/objecter: can now delete pools!

15 years agomon/msg: MPoolOp can carry POOL_OP_DELETE; OSDMon puts pool in incre old_pools
Greg Farnum [Fri, 12 Feb 2010 22:25:57 +0000 (14:25 -0800)]
mon/msg: MPoolOp can carry POOL_OP_DELETE; OSDMon puts pool in incre old_pools

15 years agolibrados: init PoolCtx properly -- was always setting snap_seq to CEPH_NOSNAP
Greg Farnum [Fri, 12 Feb 2010 22:12:22 +0000 (14:12 -0800)]
librados: init PoolCtx properly -- was always setting snap_seq to CEPH_NOSNAP

15 years agoosd: Deal with pools being removed from OSDMap.
Greg Farnum [Fri, 12 Feb 2010 21:21:22 +0000 (13:21 -0800)]
osd: Deal with pools being removed from OSDMap.

This potentially has issues, since pools are not removed from the map
until after all the PGs are removed (which is threaded, not inline with
map delivery). But Sage thinks it's okay and the system keeps working
even if you delete a pool while benchmarking on it with rados.

15 years agoOSDMap: get_pg_pool now returns a pointer
Greg Farnum [Fri, 12 Feb 2010 00:57:23 +0000 (16:57 -0800)]
OSDMap: get_pg_pool now returns a pointer
This lets us return NULL if the pool isn't in the map, which is
needed functionality for pool deletion. Meanwhile, code which
expects the pool to exist will continue to cause a crash if it doesn't.
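
A reduced sketch of the pointer-returning lookup. The types below are heavily simplified stand-ins that borrow the OSDMap and pg_pool_t names; add_pool, remove_pool and pool_size_or_die are hypothetical helpers for the example.

```cpp
#include <cassert>
#include <map>

struct pg_pool_t { int size = 0; };

class OSDMap {
  std::map<int, pg_pool_t> pools;
public:
  void add_pool(int id, int size) { pools[id].size = size; }
  void remove_pool(int id) { pools.erase(id); }
  // Returns nullptr when the pool is not in the map.
  const pg_pool_t* get_pg_pool(int id) const {
    auto it = pools.find(id);
    return it == pools.end() ? nullptr : &it->second;
  }
};

// A caller that requires the pool to exist fails loudly if it does not.
int pool_size_or_die(const OSDMap& m, int id) {
  const pg_pool_t* p = m.get_pg_pool(id);
  assert(p);
  return p->size;
}
```

Code that can tolerate a deleted pool checks for nullptr; code that cannot still stops at the point of use.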

15 years agorados: fix seg fault on cleanup of a failed pool open
Greg Farnum [Tue, 16 Feb 2010 17:21:32 +0000 (09:21 -0800)]
rados: fix seg fault on cleanup of a failed pool open

15 years agomds: infer 'follows' in journal_dirty_inode on non-head inodes
Sage Weil [Mon, 15 Feb 2010 21:47:41 +0000 (13:47 -0800)]
mds: infer 'follows' in journal_dirty_inode on non-head inodes

There are lots of callers to journal_dirty_inode that may
unwittingly be dealing with a non-head inode (e.g.
check_file_max).  If the provided inode is snapped, infer an
appropriate follows value so as not to cow_inode() again.

15 years agomds: clear cap->issued on flushsnap
Sage Weil [Mon, 15 Feb 2010 21:27:01 +0000 (13:27 -0800)]
mds: clear cap->issued on flushsnap

This allows _do_cap_update to clear out the client_range.

Kill (now) unused/unnecessary 'wanted' arg to _do_cap_update.

Also delay cap removal until after _do_cap_update (which takes
a Capability*).  This probably needs further cleanup.

15 years agomds: don't croak on null dentries in cache during reconnect/rejoin
Sage Weil [Mon, 15 Feb 2010 19:40:20 +0000 (11:40 -0800)]
mds: don't croak on null dentries in cache during reconnect/rejoin

They're created when we replay unlink events from the log.

15 years agoobjectcacher: use trimtrunc read/write ops
Yehuda Sadeh [Fri, 12 Feb 2010 22:32:11 +0000 (14:32 -0800)]
objectcacher: use trimtrunc read/write ops

15 years agoosdc: clean up some mess
Yehuda Sadeh [Fri, 12 Feb 2010 22:23:57 +0000 (14:23 -0800)]
osdc: clean up some mess

15 years agoobjecter: add read_trunc, write_trunc
Yehuda Sadeh [Fri, 12 Feb 2010 22:05:42 +0000 (14:05 -0800)]
objecter: add read_trunc, write_trunc

15 years agomkmonfs: rm -rf, so that we kill 0600 admin_keyring.bin
Sage Weil [Fri, 12 Feb 2010 22:54:01 +0000 (14:54 -0800)]
mkmonfs: rm -rf, so that we kill 0600 admin_keyring.bin

15 years agoosd: fix recovery requeue race
Sage Weil [Fri, 12 Feb 2010 22:45:02 +0000 (14:45 -0800)]
osd: fix recovery requeue race

If a recovery op finished right as another recovery op was
being started, we could get into start_recovery_ops() and get
max = 0 and not start anything.  Since the PG wasn't being
requeued for later, it would never recover.  So, requeue if we
race and get max == 0.
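
A sketch of the requeue-on-zero-budget behaviour, under the assumption that PGs wait in a queue and a fixed number of recovery ops may be active. RecoveryQueue and its members are hypothetical; only start_recovery_ops() and `max` echo the commit.

```cpp
#include <cassert>
#include <deque>

struct RecoveryQueue {
  int max_active;
  int active = 0;
  std::deque<int> pending;  // ids of PGs waiting for recovery

  explicit RecoveryQueue(int m) : max_active(m) {}

  // Try to start recovery for the PG at the head of the queue.
  // Returns the number of ops started.
  int start_recovery_ops() {
    if (pending.empty()) return 0;
    int pg = pending.front();
    pending.pop_front();
    int max = max_active - active;  // budget left right now
    if (max <= 0) {
      // Lost the race for a slot: put the PG back instead of dropping it.
      pending.push_back(pg);
      return 0;
    }
    active += 1;
    return 1;
  }
  // A finished op frees a slot; the PG is picked up on the next attempt.
  void finish_op() { assert(active > 0); --active; }
};
```

Without the push_back in the zero-budget branch, the PG would leave the queue without starting anything and never be revisited.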

15 years agoinit-ceph: print 'already started' instead of failing to start
Sage Weil [Fri, 12 Feb 2010 22:20:02 +0000 (14:20 -0800)]
init-ceph: print 'already started' instead of failing to start

15 years agomsgr: more conservative locking, thread join asserts
Sage Weil [Fri, 12 Feb 2010 21:38:38 +0000 (13:38 -0800)]
msgr: more conservative locking, thread join asserts

We caught a bunch of crashes like this:

10.02.11 17:01:01.600660 7f87070c3950 -- 10.3.14.134:6800/8203 >> 10.3.14.130:6800/18914 pipe(0x7fc2be2cebe0 sd=36 pgs=2409 cs=1 l=0).do_sendmsg error Broken pipe
10.02.11 17:01:01.600700 7f87070c3950 -- 10.3.14.134:6800/8203 >> 10.3.14.130:6800/18914 pipe(0x7fc2be2cebe0 sd=36 pgs=2409 cs=1 l=0).writer error sending 0x7fc27da1c570, 32: Broken pipe
10.02.11 17:01:01.600796 7f87070c3950 -- 10.3.14.134:6800/8203 >> 10.3.14.130:6800/18914 pipe(0x7fc2be2cebe0 sd=-1 pgs=2409 cs=1 l=0).fault initiating reconnect
...
./common/Thread.h: In function 'int Thread::join(void**)':
./common/Thread.h:66: FAILED assert(0)
 1: (Thread::join(void**)+0x73) [0x64fcd3]
 2: (SimpleMessenger::Pipe::join_reader()+0x68) [0x6555a2]
 3: (SimpleMessenger::Pipe::connect()+0xf5) [0x645be9]
 4: (SimpleMessenger::Pipe::writer()+0x157) [0x64793d]
 5: (SimpleMessenger::Pipe::Writer::entry()+0x19) [0x63e107]
 6: (Thread::_entry_func(void*)+0x20) [0x64e816]
 7: /lib/libpthread.so.0 [0x7fc2c3bbdfc7]
 8: (clone()+0x6d) [0x7fc2c2e005ad]

that look a bit like multiple procs were racing into
join_reader().  Add an assert to catch that if it happens again,
and also wrap thread starts in pipe_lock to ensure we keep the
_running flags in sync with reality.  Add in a few other
sanity checks too.

15 years agomon: note mds beacon times more carefully
Sage Weil [Fri, 12 Feb 2010 21:35:57 +0000 (13:35 -0800)]
mon: note mds beacon times more carefully

We need to update the beacon timestamp even when we are updating
the mds state.  Otherwise we can get caught in a busy loop
between marking an mds laggy and !laggy because the beacon stamp
never updates.

So even if we are updating, and the reply will be slow, update
our timestamp, so we don't mark the mds laggy.
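
A sketch of the timestamp handling, assuming times are plain integer seconds. MdsBeaconTracker and its fields are hypothetical names, not the monitor's actual structures.

```cpp
#include <cassert>

struct MdsBeaconTracker {
  long last_beacon = 0;
  bool laggy = false;
  bool update_pending = false;  // a state change is still being committed

  void handle_beacon(long now) {
    last_beacon = now;           // always record it, even while updating
    if (update_pending) return;  // the reply is deferred, the stamp is not
    laggy = false;
  }
  // Periodic check: mark laggy once the last beacon is older than `grace`.
  void tick(long now, long grace) {
    assert(now >= last_beacon);
    if (now - last_beacon > grace) laggy = true;
  }
};
```

Because the stamp is refreshed before the early return, a slow update cannot by itself make the mds look laggy on the next tick.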

15 years agoosd: bail out of interval loop completely
Sage Weil [Fri, 12 Feb 2010 21:27:49 +0000 (13:27 -0800)]
osd: bail out of interval loop completely

We're going backwards, so once this test fails, it always fails,
and we can break instead of continue.  Any skipped intervals will
be pruned shortly anyway.
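
A sketch of the break-instead-of-continue reasoning, assuming intervals are stored oldest first and the test is "interval ends at or after some cutoff". Interval, count_examined and the cutoff test are hypothetical, chosen only to show a monotonic condition.

```cpp
#include <cassert>
#include <vector>

struct Interval { int first, last; };

// Walk newest to oldest; returns how many intervals were examined.
int count_examined(const std::vector<Interval>& past, int cutoff) {
  int examined = 0;
  for (auto it = past.rbegin(); it != past.rend(); ++it) {
    assert(it->first <= it->last);  // well-formed interval
    ++examined;
    if (it->last < cutoff)
      break;  // everything older fails too, so `continue` would waste work
  }
  return examined;
}
```

Since older intervals can only end earlier, the first failure while walking backwards settles the outcome for everything that remains.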