git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
14 years agocrushtool: fix --add-item weight being zero when parent bucket(s) created
Sage Weil [Thu, 26 May 2011 20:12:04 +0000 (13:12 -0700)]
crushtool: fix --add-item weight being zero when parent bucket(s) created

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoMerge branch 'stable'
Sage Weil [Thu, 26 May 2011 18:04:03 +0000 (11:04 -0700)]
Merge branch 'stable'

14 years agomkcephfs: set rdir for local mon setup
Sage Weil [Thu, 26 May 2011 17:19:04 +0000 (10:19 -0700)]
mkcephfs: set rdir for local mon setup

Fixes: #1113
Reported-by: Bernard Grymonpon <bernard@openminds.be>
Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoinit-ceph: ssh
Sage Weil [Thu, 26 May 2011 16:55:37 +0000 (09:55 -0700)]
init-ceph: ssh

Another bell/whistle.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix canceled lock attempt
Sage Weil [Wed, 25 May 2011 21:54:15 +0000 (14:54 -0700)]
mds: fix canceled lock attempt

If a client tries to lock a file, has to wait, and then cancels the attempt,
the client will send an unlock request to unwind its state.

 - the unlock now removes the waiting lock attempt from the wait list
 - when the lock request retries and finds it is no longer on the wait
   list it will fail.

Signed-off-by: Sage Weil <sage@newdream.net>
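
The cancel/retry interaction described in the mds commit above can be sketched with a toy model. This is an illustration only: `LockState`, its method names, and the string results are hypothetical and are not taken from the MDS code. It assumes a single exclusive lock per file and a simple wait list.

```python
class LockState:
    """Toy model of one file lock with a wait list (hypothetical names)."""

    def __init__(self):
        self.holder = None
        self.waiting = []

    def lock(self, client, retry=False):
        # A retried request that is no longer on the wait list was cancelled.
        if retry and client not in self.waiting:
            return "failed"
        if self.holder is None:
            if client in self.waiting:
                self.waiting.remove(client)
            self.holder = client
            return "granted"
        if client not in self.waiting:
            self.waiting.append(client)
        return "waiting"

    def unlock(self, client):
        # Unlock also unwinds a pending (waiting) attempt.
        if client in self.waiting:
            self.waiting.remove(client)
        elif self.holder == client:
            self.holder = None


state = LockState()
state.lock("A")               # A holds the lock
state.lock("B")               # B has to wait
state.unlock("B")             # B cancels: removed from the wait list
state.unlock("A")             # lock becomes free
result = state.lock("B", retry=True)   # the stale retry fails
```

Without the removal in `unlock`, the retried request from B would be granted even though the client had already given up on it.
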
14 years agorgw: return EACCES if acl xattr doesn't exist
Yehuda Sadeh [Wed, 25 May 2011 19:32:50 +0000 (12:32 -0700)]
rgw: return EACCES if acl xattr doesn't exist

14 years agoPG: fix race in _activate_committed
Samuel Just [Wed, 25 May 2011 17:54:27 +0000 (10:54 -0700)]
PG: fix race in _activate_committed

Previously, _activate_committed would access the osdmap epoch racing
with handle_osd_map's osdmap update.  This would allow a message to be
sent from a replica to the primary tagged with the same epoch as
last_warm_restart, though the event actually occurred before
last_warm_restart.  Thus the primary would fail to ignore the event and
transition to crashed.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agomds: do not shift to EXCL or MIX while rdlocked
Sage Weil [Wed, 18 May 2011 04:29:33 +0000 (21:29 -0700)]
mds: do not shift to EXCL or MIX while rdlocked

There was an old change in file_eval() that was allowing us to switch from
SYNC to MIX or EXCL while there were rdlocks, which either caused lots of
lock thrashing or could (I think) hang things up completely.  This was
from ea10a672, an ancient fix for something related that appears to have
taken out the rdlocked check by accident.

In my tests (one writer, one stat-er), this took things from long stalls
(up to 20 seconds) to very responsive stats.  Yay!

Fixes: #791
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agocrushtool: clean up add-item a bit; don't add item to same bucket twice
Sage Weil [Wed, 25 May 2011 04:14:59 +0000 (21:14 -0700)]
crushtool: clean up add-item a bit; don't add item to same bucket twice

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agocrushtool: fix remove-item
Sage Weil [Wed, 25 May 2011 04:05:47 +0000 (21:05 -0700)]
crushtool: fix remove-item

Scan all buckets instead of doing a tree traverse.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoradosgw_admin: update clitest
Sage Weil [Wed, 25 May 2011 03:30:38 +0000 (20:30 -0700)]
radosgw_admin: update clitest

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomkcephfs.in: print out usage if no actions given
Colin Patrick McCabe [Wed, 25 May 2011 01:16:08 +0000 (18:16 -0700)]
mkcephfs.in: print out usage if no actions given

If the user didn't specify any actions, print out a usage message rather
than silently exiting.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agorgw: Fix RGWAccess::init_storage_provider
Colin Patrick McCabe [Wed, 25 May 2011 00:50:24 +0000 (17:50 -0700)]
rgw: Fix RGWAccess::init_storage_provider

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agomkcephfs: error out on bad usage
Sage Weil [Wed, 25 May 2011 00:05:30 +0000 (17:05 -0700)]
mkcephfs: error out on bad usage

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomake: fix build for rgw
Yehuda Sadeh [Tue, 24 May 2011 23:40:12 +0000 (16:40 -0700)]
make: fix build for rgw

14 years agorgw_admin: clean warning
Yehuda Sadeh [Tue, 24 May 2011 23:33:11 +0000 (16:33 -0700)]
rgw_admin: clean warning

14 years agoMerge commit 'origin/master' into rgw-multiuser
Yehuda Sadeh [Tue, 24 May 2011 22:30:17 +0000 (15:30 -0700)]
Merge commit 'origin/master' into rgw-multiuser

14 years agorgw_admin: add key create
Yehuda Sadeh [Tue, 24 May 2011 21:29:50 +0000 (14:29 -0700)]
rgw_admin: add key create

14 years agorgw_admin: subuser and key removal
Yehuda Sadeh [Tue, 24 May 2011 21:17:59 +0000 (14:17 -0700)]
rgw_admin: subuser and key removal

14 years agojournaler: tolerate ENOENT when prezeroing
Sage Weil [Wed, 11 May 2011 04:35:50 +0000 (21:35 -0700)]
journaler: tolerate ENOENT when prezeroing

ENOENT is okay and expected.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agotest_common.sh: skip rm before put
Colin Patrick McCabe [Tue, 24 May 2011 19:36:07 +0000 (12:36 -0700)]
test_common.sh: skip rm before put

The rm before the put is unnecessary and actually incorrect now.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoradostool: rados put should use write_full
Colin Patrick McCabe [Tue, 24 May 2011 19:34:56 +0000 (12:34 -0700)]
radostool: rados put should use write_full

If "rados put" uses write instead of write_full, the resulting object on
the server may be a mishmash of old and new objects, if the old object
was longer than the new one. This is fairly counterintuitive behavior
for radostool, so remove it.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
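
The failure mode described in the radostool commit above can be shown with a small in-memory model. This is a sketch only: `write` and `write_full` here are hypothetical stand-ins that operate on a Python `bytearray`, not the librados calls. It assumes a plain write overwrites only the bytes it covers, while a full write replaces the whole object.

```python
def write(obj: bytearray, data: bytes, offset: int = 0) -> None:
    """Overwrite bytes in place, extending the object if needed; never truncate."""
    end = offset + len(data)
    if end > len(obj):
        obj.extend(b"\0" * (end - len(obj)))
    obj[offset:end] = data


def write_full(obj: bytearray, data: bytes) -> None:
    """Replace the entire object with data."""
    obj[:] = data


old = b"AAAAAAAAAA"      # existing 10-byte object
new = b"bbbb"            # shorter replacement

partial = bytearray(old)
write(partial, new)      # the tail of the old object survives

full = bytearray(old)
write_full(full, new)    # the object is exactly the new data
```

In this model `partial` ends up as new data followed by leftover old bytes, which is the mix a user of "rados put" would not expect.
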
14 years agoMerge branch 'wip_ceph_context'
Colin Patrick McCabe [Tue, 24 May 2011 19:22:30 +0000 (12:22 -0700)]
Merge branch 'wip_ceph_context'

14 years agoCreate a libcommon service thread
Colin Patrick McCabe [Mon, 23 May 2011 23:25:57 +0000 (16:25 -0700)]
Create a libcommon service thread

Create a libcommon service thread. Use it to handle SIGHUP.

Handle it by means of a flag that gets set. Using a queue would raise
the complicated question of what to do when the queue was full.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
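
The flag-versus-queue trade-off mentioned above can be sketched as follows. This is a simplified, single-threaded model with hypothetical names (`SighupFlag`, `on_signal`, `service_once`); it does not install a real signal handler or start a thread, and only illustrates that a flag coalesces repeated signals and can never be "full".

```python
class SighupFlag:
    """Toy model: the signal side sets a flag, the service side consumes it."""

    def __init__(self):
        self.pending = False
        self.reopens = 0

    def on_signal(self):
        # What a signal handler would do: record the event and return.
        self.pending = True

    def service_once(self):
        # One pass of the service loop: act once if the flag was set.
        if not self.pending:
            return False
        self.pending = False
        self.reopens += 1   # stands in for reopening the log files
        return True


flag = SighupFlag()
for _ in range(3):
    flag.on_signal()        # three SIGHUPs arrive before the service runs
handled = flag.service_once()
```

Three signals result in a single reopen, which is acceptable for this kind of request because the action is idempotent.
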
14 years agolibrados: len should be size_t
Sage Weil [Tue, 24 May 2011 17:00:23 +0000 (10:00 -0700)]
librados: len should be size_t

Unsigned, and size_t because it's a buffer size.

Fixes signedness warning in testrados.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: add ability to explicitly mark unfound as lost
Sage Weil [Tue, 24 May 2011 16:47:06 +0000 (09:47 -0700)]
osd: add ability to explicitly mark unfound as lost

Instead of automatically marking unfound objects lost (once we've tried
every location we can think of), do it when the administrator explicitly
says to.  This avoids marking things lost incorrectly when there are
peering issues, and also allows the administrator to decide whether there
may be offline osds that are worth bringing online.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: make automatically marking of unfound as lost optional
Sage Weil [Tue, 24 May 2011 16:42:39 +0000 (09:42 -0700)]
osd: make automatically marking of unfound as lost optional

We may not want to do this automatically until we have more confidence in
the recovery code.  Even then, possibly not.  In particular, the OSDs may
believe they have contacted all possible homes for the data even though there
is some long-lost OSD that has the data on disk that is offline.

For now, we make the marking process explicit so that the administrator can
make the call.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: clean up get_or_create_stray
Sage Weil [Tue, 24 May 2011 16:26:40 +0000 (09:26 -0700)]
mds: clean up get_or_create_stray

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: initialize stray_index on startup
Sage Weil [Tue, 24 May 2011 16:24:42 +0000 (09:24 -0700)]
mds: initialize stray_index on startup

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoMerge branch 'stable'
Sage Weil [Tue, 24 May 2011 16:17:24 +0000 (09:17 -0700)]
Merge branch 'stable'

14 years agov0.28.1 v0.28.1
Sage Weil [Tue, 24 May 2011 04:11:44 +0000 (21:11 -0700)]
v0.28.1

14 years agolibrados, libceph: store CephContext
Colin Patrick McCabe [Mon, 23 May 2011 21:02:15 +0000 (14:02 -0700)]
librados, libceph: store CephContext

Don't use the global g_ceph_context. Instead, store the CephContext in
the structures provided by the library user.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoAdd CephContext
Colin Patrick McCabe [Mon, 23 May 2011 17:11:15 +0000 (10:11 -0700)]
Add CephContext

A CephContext represents the context held by a single library user.
There can be multiple CephContexts in the same process.

For daemons and utility programs, there will be only one CephContext.
The CephContext contains the configuration, the dout object, and
anything else that you might want to pass to libcommon with every
function call.

Move some non-config things out of md_config_t and into CephContext.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
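
The idea of a per-user context instead of a process-wide global can be sketched like this. The classes and the `debug_level` key are hypothetical and only illustrate the shape of the change; they are not the CephContext interface.

```python
class Context:
    """Toy per-library-user context holding its own configuration."""

    def __init__(self, name, conf):
        self.name = name
        self.conf = dict(conf)   # private copy, not shared between users


class Client:
    """A library handle that stores its context rather than using a global."""

    def __init__(self, cct):
        self.cct = cct

    def debug_level(self):
        return self.cct.conf.get("debug_level", 0)


# Two independent library users in the same process.
quiet = Client(Context("app-a", {"debug_level": 0}))
noisy = Client(Context("app-b", {"debug_level": 20}))
```

Because each handle carries its own context, changing the configuration of one user does not affect the other.
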
14 years agoSplit common_init_daemonize from common_init_finish
Colin Patrick McCabe [Mon, 23 May 2011 23:29:49 +0000 (16:29 -0700)]
Split common_init_daemonize from common_init_finish

Split off common_init_daemonize from common_init_finish. cfuse is a
daemon that calls common_init_finish, but handles daemonization itself.
This fixes cfuse.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agorgw_admin: make interface a bit more explicit
Yehuda Sadeh [Mon, 23 May 2011 23:52:59 +0000 (16:52 -0700)]
rgw_admin: make interface a bit more explicit

14 years agorgw: subuser permissions
Yehuda Sadeh [Mon, 23 May 2011 22:12:48 +0000 (15:12 -0700)]
rgw: subuser permissions

14 years agomon: verify that crush max does not exceed osd max
Sage Weil [Mon, 23 May 2011 21:58:26 +0000 (14:58 -0700)]
mon: verify that crush max does not exceed osd max

- when injecting a new crushmap
- when adjusting osdmap max_osd

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agocrushtool: add --reweight-item <name> <weight>
Sage Weil [Sun, 22 May 2011 23:25:35 +0000 (16:25 -0700)]
crushtool: add --reweight-item <name> <weight>

Reweight an individual item via crushtool.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosdmaptool: fail --import-crush if crush max_devices > osdmap max_osd
Sage Weil [Sat, 21 May 2011 19:55:16 +0000 (12:55 -0700)]
osdmaptool: fail --import-crush if crush max_devices > osdmap max_osd

Crush will spew non-deterministic badness if it walks off the end of
the osd_weight vector.

Signed-off-by: Sage Weil <sage@newdream.net>
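
The check described above amounts to a simple bound comparison. A minimal sketch, assuming both limits are available as integers; the function name and the error text are hypothetical, not the osdmaptool output.

```python
def check_crush_fits_osdmap(crush_max_devices: int, osdmap_max_osd: int) -> None:
    """Reject a CRUSH map that names more devices than the OSD map can hold."""
    if crush_max_devices > osdmap_max_osd:
        raise ValueError(
            f"crush max_devices {crush_max_devices} > osdmap max_osd {osdmap_max_osd}"
        )


check_crush_fits_osdmap(4, 4)        # equal limits are fine
try:
    check_crush_fits_osdmap(5, 4)    # one device too many
    rejected = False
except ValueError:
    rejected = True
```

Refusing the import up front keeps later per-device weight lookups inside the bounds of the weight vector.
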
14 years agocommon_init: don't init crypto until after fork
Colin Patrick McCabe [Fri, 20 May 2011 23:35:52 +0000 (16:35 -0700)]
common_init: don't init crypto until after fork

Get rid of the initialize-then-shutdown-crypto hack. We just initialize
crypto once, after it is safe to do so. There is now a single callback,
common_init_finish, which does the final stage of initialization,
including starting crypto and daemonization (if required.)

common_init_finish needs to be done before messenger::start().

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoceph_crypto: add assert_init
Colin Patrick McCabe [Fri, 20 May 2011 22:12:49 +0000 (15:12 -0700)]
ceph_crypto: add assert_init

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoconfig: delete after new
Sage Weil [Sat, 21 May 2011 01:16:49 +0000 (18:16 -0700)]
config: delete after new

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agocrush: fix signedness warnings
Sage Weil [Sat, 21 May 2011 00:10:15 +0000 (17:10 -0700)]
crush: fix signedness warnings

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agorgw_admin: able to create multiple keys/subusers
Yehuda Sadeh [Fri, 20 May 2011 23:46:14 +0000 (16:46 -0700)]
rgw_admin: able to create multiple keys/subusers

14 years agocrushtool: --remove-item name
Sage Weil [Fri, 20 May 2011 23:45:57 +0000 (16:45 -0700)]
crushtool: --remove-item name

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agocrush: fix tree bucket encoding
Sage Weil [Fri, 20 May 2011 23:41:16 +0000 (16:41 -0700)]
crush: fix tree bucket encoding

I wonder how long this has been broken!

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agocrush: fix tree weight accessor, decompile
Sage Weil [Fri, 20 May 2011 23:40:36 +0000 (16:40 -0700)]
crush: fix tree weight accessor, decompile

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agocrushtool: default to hash 0 (rjenkins1)
Sage Weil [Fri, 20 May 2011 22:44:15 +0000 (15:44 -0700)]
crushtool: default to hash 0 (rjenkins1)

Otherwise we get 255 which is undefined and get bad results!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agorgw: user info structure supports multiple subusers and keys
Yehuda Sadeh [Fri, 20 May 2011 22:15:48 +0000 (15:15 -0700)]
rgw: user info structure supports multiple subusers and keys

14 years agoosd: update last_epoch_clean in PG::Info::History::merge()
Sage Weil [Fri, 20 May 2011 22:08:06 +0000 (15:08 -0700)]
osd: update last_epoch_clean in PG::Info::History::merge()

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: small cleanup
Sage Weil [Fri, 20 May 2011 22:04:57 +0000 (15:04 -0700)]
osd: small cleanup

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: merge history when primary sends replica new pg info
Sage Weil [Fri, 20 May 2011 22:04:09 +0000 (15:04 -0700)]
osd: merge history when primary sends replica new pg info

This, among other things, lets us update last_epoch_started and
last_epoch_clean.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: more heartbeat rework
Sage Weil [Fri, 20 May 2011 21:45:36 +0000 (14:45 -0700)]
osd: more heartbeat rework

A few things:
 - track Connection* instead of entity_inst_t for hb peers
 - we can only send maps over the cluster_messenger
   - if peer is still alive, do that
   - if peer is not, send dying MOSDPing ping with YOU_DIED flag

14 years agomsgr: don't close close_on_empty until outgoing messages are acked
Sage Weil [Fri, 20 May 2011 21:43:57 +0000 (14:43 -0700)]
msgr: don't close close_on_empty until outgoing messages are acked

Otherwise, if we close the socket, we may lose in-flight data.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: only forget peer epochs if they are down AND no longer heartbeat peers
Sage Weil [Fri, 20 May 2011 20:25:22 +0000 (13:25 -0700)]
osd: only forget peer epochs if they are down AND no longer heartbeat peers

If we forget the peer epoch when we see them go down, we won't share the
map later in update_heartbeat_peers() to tell them they're down.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: show last_epoch_clean in PG::Info::History printer
Sage Weil [Fri, 20 May 2011 20:01:22 +0000 (13:01 -0700)]
osd: show last_epoch_clean in PG::Info::History printer

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: rework peer map epoch caching
Sage Weil [Fri, 20 May 2011 19:55:29 +0000 (12:55 -0700)]
osd: rework peer map epoch caching

We try to keep track of which epochs our peers have so that we can be
semi-intelligent about which map incrementals we send preceding any
messages.  Since this is useful from the heartbeat and cluster channels/
threads, protect the data with an inner lock and clean up the callers.

Be smarter about when we forget.

Make note of peer epoch when we receive a ping.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
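
A rough sketch of a peer epoch cache guarded by its own lock follows. The class and method names are hypothetical. The forget rule (only when the peer is down and no longer a heartbeat peer) follows the "only forget peer epochs" entry earlier in this log; keeping the highest epoch seen is an assumption of this sketch.

```python
import threading


class PeerEpochCache:
    """Toy cache of the newest map epoch each peer is known to have."""

    def __init__(self):
        self._lock = threading.Lock()   # inner lock, independent of callers
        self._epochs = {}

    def note(self, peer, epoch):
        # Called e.g. when a ping arrives; never move an epoch backwards.
        with self._lock:
            if epoch > self._epochs.get(peer, 0):
                self._epochs[peer] = epoch
            return self._epochs.get(peer, 0)

    def get(self, peer):
        with self._lock:
            return self._epochs.get(peer, 0)

    def forget_if_unneeded(self, peer, is_down, is_heartbeat_peer):
        # Keep the epoch while we may still need to share a map with the peer.
        with self._lock:
            if is_down and not is_heartbeat_peer:
                self._epochs.pop(peer, None)


cache = PeerEpochCache()
cache.note("osd.3", 70)
cache.note("osd.3", 50)   # older information does not overwrite newer
cache.forget_if_unneeded("osd.3", is_down=True, is_heartbeat_peer=True)
```
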
14 years agomon: fix parsing of 'osd foo N ...' commands with multiple ids
Sage Weil [Fri, 20 May 2011 19:22:22 +0000 (12:22 -0700)]
mon: fix parsing of 'osd foo N ...' commands with multiple ids

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agodout: reopen log files on SIGHUP
Colin Patrick McCabe [Fri, 20 May 2011 21:23:10 +0000 (14:23 -0700)]
dout: reopen log files on SIGHUP

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agodout: reopen log files on SIGHUP
Colin Patrick McCabe [Fri, 20 May 2011 21:23:10 +0000 (14:23 -0700)]
dout: reopen log files on SIGHUP

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoAdd SignalSafeQueue
Colin Patrick McCabe [Fri, 20 May 2011 18:35:19 +0000 (11:35 -0700)]
Add SignalSafeQueue

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoosd: clean up old _from target cleanup; fix one case; share map
Sage Weil [Fri, 20 May 2011 18:29:05 +0000 (11:29 -0700)]
osd: clean up old _from target cleanup; fix one case; share map

Clean up the code to mirror the _to case.

Previously we would not mark down an old _from that is still a _to but with
a new address.  Now we do.

Share a map while we're at it, just to be nice!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: mark down old _to targets
Sage Weil [Fri, 20 May 2011 18:25:27 +0000 (11:25 -0700)]
osd: mark down old _to targets

If a peer remains a _to target but their address changes, we still want
to mark down the old connection.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: share map with old _to peers
Sage Weil [Fri, 20 May 2011 18:20:20 +0000 (11:20 -0700)]
osd: share map with old _to peers

Use new msgr hooks to do this cleanly.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: clean up handle_osd_ping output
Sage Weil [Fri, 20 May 2011 18:17:19 +0000 (11:17 -0700)]
osd: clean up handle_osd_ping output

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: ignore stale requests for heartbeats
Sage Weil [Fri, 20 May 2011 17:54:46 +0000 (10:54 -0700)]
osd: ignore stale requests for heartbeats

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: don't prioritize heartbeat requests
Sage Weil [Fri, 20 May 2011 17:43:12 +0000 (10:43 -0700)]
osd: don't prioritize heartbeat requests

This could conceivably screw up ordering, and priority doesn't matter
anyway when this is the first message we send to this peer.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: do not clobber explicitly requested heartbeat_to target address
Sage Weil [Fri, 20 May 2011 17:42:16 +0000 (10:42 -0700)]
osd: do not clobber explicitly requested heartbeat_to target address

Consider peer P.

- P goes down in, say, epoch 60, and comes back up in epoch 70
- P requests a heartbeat, as_of 70
- We update to map 50, and coincidentally add the same peer as a target
- We set the heartbeat_to[P] = 50 and start sending to the _old_ address
- P marks us down because we stop sending to the new addr
- We eventually get map 70, but it's too late!

Make sure we preserve any _to targets _and_ their epoch+inst.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
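
The scenario listed above can be replayed with a small model. This is a sketch under assumptions: `heartbeat_to` is modelled as a dict from peer to `(epoch, address)`, and the rule "keep the existing entry when it is at least as new as the map being processed" is one possible way to preserve a requested target, not necessarily what the OSD code does.

```python
def update_heartbeat_to(heartbeat_to, peer, map_epoch, map_addr):
    """Add or refresh a _to target without clobbering a newer entry."""
    current = heartbeat_to.get(peer)
    if current is not None and current[0] >= map_epoch:
        return                      # preserve the existing epoch + address
    heartbeat_to[peer] = (map_epoch, map_addr)


heartbeat_to = {}
# P explicitly requests a heartbeat as_of epoch 70, from its new address.
heartbeat_to["P"] = (70, "new-addr")
# We then process the older map 50, which still lists P at the old address.
update_heartbeat_to(heartbeat_to, "P", 50, "old-addr")
```

With the guard in place we keep sending to the new address, so P has no reason to mark us down while we catch up to map 70.
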
14 years agoosd: request proper log extent for missing
Sage Weil [Fri, 20 May 2011 16:29:10 +0000 (09:29 -0700)]
osd: request proper log extent for missing

We can't blindly ask for everything since last_epoch_started because that
may mean we get some fragment of a backlog.  Look at the peer's log
ranges and request the correct thing.  Also, in fulfill_log, infer what
the primary should have asked for if they make a bad request.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: fix log bounds check
Sage Weil [Fri, 20 May 2011 15:44:42 +0000 (08:44 -0700)]
osd: fix log bounds check

We weren't accounting for the case where we have

 (foo,foo]+backlog

i.e., everything is backlog, and rbegin().version != log.head.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: osd# is in log entry header/prefix
Sage Weil [Fri, 20 May 2011 15:35:44 +0000 (08:35 -0700)]
osd: osd# is in log entry header/prefix

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: log broken pg state to monitor on startup, activate
Sage Weil [Fri, 20 May 2011 15:33:07 +0000 (08:33 -0700)]
osd: log broken pg state to monitor on startup, activate

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: fix proc_replica_log when peer log is empty
Sage Weil [Fri, 20 May 2011 15:09:11 +0000 (08:09 -0700)]
osd: fix proc_replica_log when peer log is empty

If the peer log is empty, and we break out of the loop on the first pass,
then clearly last_update has not been adjusted.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: encode keyring as plaintext after --mkkey
Sage Weil [Fri, 20 May 2011 14:25:24 +0000 (07:25 -0700)]
osd: encode keyring as plaintext after --mkkey

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agokeyring: make encode_plaintext method
Sage Weil [Fri, 20 May 2011 14:25:16 +0000 (07:25 -0700)]
keyring: make encode_plaintext method

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoMerge branch 'wip_choose_acting' into stable
Sage Weil [Fri, 20 May 2011 07:41:31 +0000 (00:41 -0700)]
Merge branch 'wip_choose_acting' into stable

14 years agoosd: take remote log when it is clearly superior
Sage Weil [Fri, 20 May 2011 07:27:00 +0000 (00:27 -0700)]
osd: take remote log when it is clearly superior

I'm hitting a case where the primary is compensating for a replica's
last_complete < log.tail by sending a log+backlog, but the replica
isn't smart enough to take advantage.  In this case,

      replica: log(781'26629,781'26631]
 from primary: log(781'26629,781'26631]+backlog
       result: log(781'26629,781'26631]

Doh!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: fix compensation for bad last_complete
Sage Weil [Fri, 20 May 2011 07:14:24 +0000 (00:14 -0700)]
osd: fix compensation for bad last_complete

If the peer has a last_complete below their tail, we can get by with our
log (without backlog) if our tail is _before_ their last_complete, not
after.  Otherwise, we need a backlog!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: remove some build_prior stringstream cruft
Sage Weil [Fri, 20 May 2011 06:48:53 +0000 (23:48 -0700)]
osd: remove some build_prior stringstream cruft

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: remove useless debug print
Sage Weil [Fri, 20 May 2011 06:46:19 +0000 (23:46 -0700)]
osd: remove useless debug print

We dump this (and more) at the end of the PgPriorSet constructor.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: include past acting osds if they were up
Sage Weil [Fri, 20 May 2011 06:40:12 +0000 (23:40 -0700)]
osd: include past acting osds if they were up

This fixes a bug where we were excluding up (but not acting) nodes from
past intervals, which in turn was triggering a nasty choose_acting loop
(because we _do_ already include acting but !up from the current
interval).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: do not exclude me during build_prior
Sage Weil [Fri, 20 May 2011 06:38:25 +0000 (23:38 -0700)]
osd: do not exclude me during build_prior

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoosd: show final build_prior result
Sage Weil [Fri, 20 May 2011 06:25:32 +0000 (23:25 -0700)]
osd: show final build_prior result

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agomon: log mkfs as INFO with fs
Sage Weil [Fri, 20 May 2011 03:45:48 +0000 (20:45 -0700)]
mon: log mkfs as INFO with fs

The [ERR] log level is misleading.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoOSD, PG: ignore peering messages from before the last peering restart
Josh Durgin [Fri, 20 May 2011 00:19:59 +0000 (17:19 -0700)]
OSD, PG: ignore peering messages from before the last peering restart

Check them before entering the state machine so we can
safely enter the Crashed state on unexpected messages
from the current interval.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
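
The pre-filtering step described above can be sketched as an epoch comparison made before a message reaches the state machine. The names `PeeringMessage`, `is_stale` and `deliver` are hypothetical, and the strict less-than comparison is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class PeeringMessage:
    sender: int
    epoch: int      # map epoch the message was tagged with


def is_stale(msg: PeeringMessage, last_peering_reset: int) -> bool:
    """True if the message predates the last peering restart."""
    return msg.epoch < last_peering_reset


def deliver(msgs, last_peering_reset):
    """Return only the messages the state machine should see."""
    return [m for m in msgs if not is_stale(m, last_peering_reset)]


inbox = [PeeringMessage(1, 9), PeeringMessage(2, 10), PeeringMessage(3, 11)]
accepted = deliver(inbox, last_peering_reset=10)
```

Once stale messages are dropped this way, anything unexpected that does reach the state machine belongs to the current interval, so treating it as a hard error is safe.
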
14 years agoOSD: decrement message refcount before returning
Josh Durgin [Fri, 20 May 2011 00:46:40 +0000 (17:46 -0700)]
OSD: decrement message refcount before returning

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
14 years agomds: kick linklock on revoke_stale_caps
Sage Weil [Fri, 20 May 2011 00:20:18 +0000 (17:20 -0700)]
mds: kick linklock on revoke_stale_caps

Also use the eval() method and issue caps instead of calling the individual
eval methods.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agodebian: no shlibs:Depends for obsync either
Sage Weil [Thu, 19 May 2011 23:15:59 +0000 (16:15 -0700)]
debian: no shlibs:Depends for obsync either

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agodebian: no shlibs:Depends for -dev packages
Sage Weil [Thu, 19 May 2011 23:15:26 +0000 (16:15 -0700)]
debian: no shlibs:Depends for -dev packages

So says dpkg-gencontrol, at least:

warning: dpkg-gencontrol: Depends field of package librados-dev: unknown substitution variable ${shlibs:Depends}
...

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agolibrbd: don't need to link against crypto libs
Sage Weil [Thu, 19 May 2011 23:13:34 +0000 (16:13 -0700)]
librbd: don't need to link against crypto libs

All that is done by librados.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
14 years agoPG: add_event, add_next_event: ignore prior_version on backlog events
Samuel Just [Thu, 19 May 2011 21:13:56 +0000 (14:13 -0700)]
PG: add_event, add_next_event: ignore prior_version on backlog events

We would not have the previous version if we are merging backlog events.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
14 years agoexpanding testceph to test open/readdir/telldir
Brian Chrisman [Thu, 19 May 2011 20:22:33 +0000 (13:22 -0700)]
expanding testceph to test open/readdir/telldir

Signed-off-by: Brian Chrisman <brchrisman@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoadd ceph_readdir() to libceph
Brian Chrisman [Thu, 19 May 2011 20:22:32 +0000 (13:22 -0700)]
add ceph_readdir() to libceph

Signed-off-by: Brian Chrisman <brchrisman@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
14 years agolibrados: add python bindings for getxattrs
Colin Patrick McCabe [Thu, 19 May 2011 21:27:19 +0000 (14:27 -0700)]
librados: add python bindings for getxattrs

Add python bindings for getxattrs. Test getxattr, getxattrs, and
setxattr.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
14 years agoclient: hold FILE_BUFFER ref while waiting for dirty throttle
Sage Weil [Thu, 19 May 2011 22:03:13 +0000 (15:03 -0700)]
client: hold FILE_BUFFER ref while waiting for dirty throttle

We may block in the write path because we've reached our dirty data limit.
Hold a reference to the FILE_BUFFER cap during that interval so we don't
lose the cap and put new dirty buffers into the objectcacher out of turn.

(We could also recheck our ability to take the ref after blocking, but I
think this is cleaner.)

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoclient: clean up _flush callers
Sage Weil [Thu, 19 May 2011 22:01:50 +0000 (15:01 -0700)]
client: clean up _flush callers

Have _flush return true if there are no dirty buffers.  Clean up some
redundant conditionals in the callers.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoclient: assert(in) on _flush
Sage Weil [Thu, 19 May 2011 22:00:34 +0000 (15:00 -0700)]
client: assert(in) on _flush

We should never arrive in _flush() and not have a reference to the inode
in question, because the presence of dirty buffers pins the inode.  This
condition was introduced forever ago; clean it out.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoclient: be more careful with FILE_BUFFER cap refs
Sage Weil [Thu, 19 May 2011 21:50:41 +0000 (14:50 -0700)]
client: be more careful with FILE_BUFFER cap refs

We should either hold a ref or not; whether we release one can't depend on
whether one is held because we can't assume the ref belongs to us.

This changes the fix in cf6b1de4 so that the ObjectCacher just calls the
flush callback if it happens to trim all dirty buffers.

We also drop the (bogus) assert about the number of refs held.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoclient: _flush should no-op if nothing to flush
Sage Weil [Thu, 19 May 2011 19:21:51 +0000 (12:21 -0700)]
client: _flush should no-op if nothing to flush

If there are no FILE_BUFFER cap_refs, then we can bail out early.
Otherwise we will end up dropping refs we don't have.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoMerge remote branch 'origin/stable'
Sage Weil [Thu, 19 May 2011 22:04:19 +0000 (15:04 -0700)]
Merge remote branch 'origin/stable'