git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
Josh Durgin [Tue, 2 Aug 2011 19:17:42 +0000 (12:17 -0700)]
osd: document remove_watchers race avoidance

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Tue, 2 Aug 2011 19:19:59 +0000 (12:19 -0700)]
pg: remove do_complete_notify

This method has no dependence on the pg.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Tue, 2 Aug 2011 19:14:06 +0000 (12:14 -0700)]
osd: put_object_context: tolerate pgs being deleted

PGs that are queued for deletion won't be in the osdmap,
and may not be in the pg_map, but if they are, it's safe to
put object_context. Otherwise, the pg is being deleted and
will clean up the object contexts itself.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Fri, 29 Jul 2011 19:28:26 +0000 (12:28 -0700)]
osd, pg: clean up watchers on pg deletion and shutdown

Watchers and their object contexts need to be cleaned up so
they aren't used after the pg is gone. This happened if the
pool was deleted and the connection to the watcher was reset.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Tue, 2 Aug 2011 18:39:35 +0000 (11:39 -0700)]
mds: detach replay thread

Since we don't join it.

This fixes a leak of per-thread state: namely, an ~8MB chunk of virtual
memory (and a handful of real pages).

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 2 Aug 2011 18:30:53 +0000 (11:30 -0700)]
thread: detach()

Signed-off-by: Sage Weil <sage@newdream.net>
Colin Patrick McCabe [Tue, 2 Aug 2011 18:08:52 +0000 (11:08 -0700)]
common/Formatter: add unit test

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Greg Farnum [Tue, 2 Aug 2011 00:58:53 +0000 (17:58 -0700)]
mdsmon: send commands to all MDSes, not just the in&up ones.

Now we can send messages to standbys via broadcast, even if we
can't yet single them out.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Colin Patrick McCabe [Mon, 1 Aug 2011 22:51:19 +0000 (15:51 -0700)]
cephtool: accept semicolons in commandline args

Semicolons can now be used to give multiple arguments to cephtool in one
invocation. So a command like this is now possible:

$ ./ceph osd dump --format=json \; pg dump --format=json -o -

The backslash is to prevent the shell from consuming the semicolon.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
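
A minimal sketch of the idea in the commit above, assuming the tool sees each ";" as its own argument after the shell has processed the backslash. The helper name split_on_semicolons is hypothetical and is not taken from the cephtool source; it only illustrates grouping one argument list into several sub-commands.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from cephtool): split an argument list into
// sub-commands at each standalone ";" token. Empty groups, such as the
// one produced by ";;", are dropped.
std::vector<std::vector<std::string>> split_on_semicolons(
    const std::vector<std::string>& args) {
  std::vector<std::vector<std::string>> commands;
  std::vector<std::string> current;
  for (const std::string& arg : args) {
    if (arg == ";") {
      if (!current.empty()) {
        commands.push_back(current);
        current.clear();
      }
    } else {
      current.push_back(arg);
    }
  }
  if (!current.empty())
    commands.push_back(current);
  return commands;
}
```

With the arguments from the example invocation, this yields two groups: "osd dump --format=json" and "pg dump --format=json -o -".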
Sage Weil [Mon, 1 Aug 2011 23:17:41 +0000 (16:17 -0700)]
testradospp: add version tests

get_last_version
assert_version
assert_src_version

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 1 Aug 2011 22:20:00 +0000 (15:20 -0700)]
librados: add assert_src_version

Like set_assert_version, this is an IoCtx operation that affects the next
(and only the next) operation we perform.

Factor out the current assert version check code into a helper that also
handles the src asserts.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
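
The "next (and only the next) operation" behaviour described above can be sketched as a one-shot setting that is consumed by whichever operation runs next. The class below is a hypothetical stand-in, not the librados IoCtx API; its names and return values are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for an I/O context; NOT the librados IoCtx API.
class OneShotAssertCtx {
 public:
  // Arm a version assertion that applies to the next operation only.
  void set_assert_version(uint64_t ver) {
    armed_ = true;
    want_ = ver;
  }

  // Run an operation against an object whose current version is `actual`.
  // Returns 0 on success, -1 if an armed assertion does not match.
  // The armed assertion is consumed whether or not it matched.
  int operate(uint64_t actual) {
    bool armed = armed_;
    armed_ = false;
    if (armed && want_ != actual)
      return -1;
    return 0;
  }

 private:
  bool armed_ = false;
  uint64_t want_ = 0;
};
```

After a failed assertion, a second operate() call succeeds, because the assertion no longer applies.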
Sage Weil [Mon, 1 Aug 2011 22:16:29 +0000 (15:16 -0700)]
objecter: add assert_src_version

as an ObjectOperation

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 1 Aug 2011 21:43:35 +0000 (14:43 -0700)]
osd: add ASSERT_SRC_VERSION operation

Assert a src object has a particular version.  This is analogous to the
ASSERT_VER operation, but operates on a src_oid instead of the target.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 1 Aug 2011 23:16:38 +0000 (16:16 -0700)]
Merge branch 'stable'

Sage Weil [Mon, 1 Aug 2011 23:16:22 +0000 (16:16 -0700)]
osd: set reply_version for read operations

This was probably broken by the OSD prepare_transaction refactor a few
months ago.  Or it never worked.  Adding test to testradospp.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 1 Aug 2011 20:35:23 +0000 (13:35 -0700)]
Merge remote branch 'origin/objecter-latest-map'

Yehuda Sadeh [Mon, 1 Aug 2011 20:25:14 +0000 (13:25 -0700)]
rgw: quiet down some log messages

Colin Patrick McCabe [Mon, 1 Aug 2011 18:42:22 +0000 (11:42 -0700)]
escape_json_attr: don't escape single quotes

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe [Mon, 1 Aug 2011 18:14:55 +0000 (11:14 -0700)]
Formatter.cc: use common/escape.h

* Rename rgw/rgw_escape.h to common/escape.h

* Use escape.h in common/Formatter.cc

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe [Mon, 1 Aug 2011 18:13:53 +0000 (11:13 -0700)]
Makefile.am: always #define __STDC_FORMAT_MACROS

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Sage Weil [Mon, 1 Aug 2011 16:50:05 +0000 (09:50 -0700)]
Merge branch 'stable'

Sage Weil [Sat, 30 Jul 2011 04:41:34 +0000 (21:41 -0700)]
mds: request attempt comes from fwd count, not retry flag

Sage Weil [Sat, 30 Jul 2011 05:10:17 +0000 (22:10 -0700)]
mds: fix create_subtree_map for new dirs

Currently mkdir foo ; rmdir foo fails because we can't get_subtree_map()
on a new directory that isn't linked in the committed plane.  Since we are
journaling the projected subtree, it makes sense to use
get_projected_subtree_map() here.

It's easiest to keep both the old and new directories in the rename
project map instead of looking at the next-to-most-recent parent for the
inode.  The committed version is irrelevant (could conceivably be multiple
renames behind) and the current projected parent is just newdir; we need
olddir too, and we don't project for cross-mds rename anyway.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 30 Jul 2011 04:40:39 +0000 (21:40 -0700)]
vstart: static mapping of names to ranks

Mapping a to rank 0, b to rank 1, etc. makes multi-mds debugging much easier.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sat, 30 Jul 2011 04:44:23 +0000 (21:44 -0700)]
v0.32

Yehuda Sadeh [Fri, 29 Jul 2011 23:26:08 +0000 (16:26 -0700)]
rgw: don't silently ignore bad user/group when setting acl

Josh Durgin [Thu, 28 Jul 2011 19:23:19 +0000 (12:23 -0700)]
objecter: rename POOL_DISAPPEARED to POOL_DNE

The pool may never have existed.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Tue, 26 Jul 2011 23:10:15 +0000 (16:10 -0700)]
objecter: check for updated osdmap when requesting a non-existent pool

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Mon, 25 Jul 2011 18:51:28 +0000 (11:51 -0700)]
objecter: fix error check - error return code is negative

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Thu, 28 Jul 2011 17:30:05 +0000 (10:30 -0700)]
monclient: add method to retrieve the latest version of a map

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin [Tue, 26 Jul 2011 22:59:07 +0000 (15:59 -0700)]
mon: add GetVersion message

This allows clients to determine whether they have the latest
mds, mon, or osd map. This is useful for figuring out if a pool
does not exist, or if the osdmap with it simply hasn't been
received yet.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
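
The decision this message enables can be sketched as follows, assuming the client compares the epoch of its cached osdmap with the latest epoch reported by the monitor. The enum and function names are hypothetical and do not come from the Objecter or MonClient code.

```cpp
#include <cstdint>

enum class PoolLookup { Exists, DoesNotExist, WaitForNewerMap };

// Hypothetical decision helper. `pool_in_local_map` says whether the cached
// osdmap contains the pool, `local_epoch` is that map's epoch, and
// `latest_epoch` is the newest epoch the monitor reported.
PoolLookup classify_pool(bool pool_in_local_map, uint64_t local_epoch,
                         uint64_t latest_epoch) {
  if (pool_in_local_map)
    return PoolLookup::Exists;
  if (local_epoch < latest_epoch)
    return PoolLookup::WaitForNewerMap;  // the pool may appear in a newer map
  return PoolLookup::DoesNotExist;       // we are current and the pool is absent
}
```

Only when the client is known to hold the latest map can a missing pool be reported as nonexistent; otherwise it waits for the newer map first.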
Sage Weil [Fri, 29 Jul 2011 22:13:57 +0000 (15:13 -0700)]
Makefile: include HeartbeatMap.h in dist

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 29 Jul 2011 21:25:32 +0000 (14:25 -0700)]
mds: fix validation of (slave) request attempts

Verify that slave requests received are not stale.

Verify that slave replies match the currently processing request.

Clean up the code a bit.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 29 Jul 2011 20:44:24 +0000 (13:44 -0700)]
mds: identify slave requests with reqid + attempt number

We need to distinguish between different attempts to process a request, or
else we can get annoying races in the slave request handling code.  E.g.,

- request sent to mds A
- A authpins items on B, B registered slave_request
- A forwards request to C, sends slave finish to B
- C receives request, sends authpin slave request to B
- B receives C's authpin request, discards (*)
- B receives A's finish, closes slave request

First we just add tracking of the attempt number.

Signed-off-by: Sage Weil <sage@newdream.net>
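
A rough sketch of how an attempt number lets a slave tell such messages apart, assuming both ids refer to the same reqid and that the attempt number grows each time the request is forwarded. The struct and function names are hypothetical; the real MDS types differ.

```cpp
#include <cstdint>

// Hypothetical identifier pair; the real MDS types differ.
struct SlaveRequestId {
  uint64_t reqid;
  uint32_t attempt;
};

enum class Verdict { Match, Stale, Newer };

// Compare an incoming message's id with the slave request currently
// registered for the same reqid (the caller is assumed to have matched reqid).
Verdict check_attempt(const SlaveRequestId& registered,
                      const SlaveRequestId& incoming) {
  if (incoming.attempt < registered.attempt)
    return Verdict::Stale;  // left over from an earlier attempt
  if (incoming.attempt > registered.attempt)
    return Verdict::Newer;  // from a later attempt than the one registered
  return Verdict::Match;
}
```

In the race listed above, C's authpin request would carry a higher attempt number than the slave request B registered for A, so B can recognise it as a different attempt rather than treating it as a duplicate.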
Greg Farnum [Fri, 29 Jul 2011 21:55:31 +0000 (14:55 -0700)]
scatterlock: fix flag assignments.

Want |= to set a flag, not &=!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
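
For reference, the usual bit-flag idioms look like this. The flag names and values are hypothetical and are not the ScatterLock definitions.

```cpp
#include <cstdint>

// Hypothetical flag values for illustration only.
constexpr uint32_t FLAG_SCATTER_WANTED = 1u << 0;
constexpr uint32_t FLAG_DIRTY = 1u << 1;

// Setting a flag ORs its bit in and leaves the other bits untouched.
inline void set_flag(uint32_t& flags, uint32_t f) { flags |= f; }

// Clearing a flag ANDs with the complement of its bit.
inline void clear_flag(uint32_t& flags, uint32_t f) { flags &= ~f; }

inline bool has_flag(uint32_t flags, uint32_t f) { return (flags & f) != 0; }
```

Writing `flags &= FLAG_SCATTER_WANTED` instead of `|=` would clear every other bit and would never set the flag if it was not already set.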
Colin Patrick McCabe [Fri, 29 Jul 2011 18:05:05 +0000 (11:05 -0700)]
osdmap: in json dump, dump out/in, up/down status

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Yehuda Sadeh [Fri, 29 Jul 2011 21:37:07 +0000 (14:37 -0700)]
rgw: get current utc epoch differently

Previously, tm.tm_isdst was returning random results, which happened
to work correctly most of the time since we're currently in DST.

Yehuda Sadeh [Fri, 29 Jul 2011 20:17:38 +0000 (13:17 -0700)]
rgw: init correctly req_state->{bucket, object}

Yehuda Sadeh [Fri, 29 Jul 2011 18:47:59 +0000 (11:47 -0700)]
rgw: fix total time reporting in rgw_admin

Yehuda Sadeh [Thu, 28 Jul 2011 22:18:17 +0000 (15:18 -0700)]
rgw: tweak content-md5 handling

Greg Farnum [Fri, 29 Jul 2011 16:20:23 +0000 (09:20 -0700)]
heartbeatmap: fix/clarify the commenting

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Fri, 29 Jul 2011 00:12:28 +0000 (17:12 -0700)]
scatterlock: compress boolean flags into a set of state flags

While we're at it, unify the naming structure a bit and remove
the unused stale flag.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Thu, 28 Jul 2011 20:25:03 +0000 (13:25 -0700)]
scatterlock: rename scatter_flags -> state_flags

We want to use this for all the bools, not just the scatter ones.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Thu, 28 Jul 2011 23:42:55 +0000 (16:42 -0700)]
Makefile: remove from libglobal

Signed-off-by: Sage Weil <sage@newdream.net>
Colin Patrick McCabe [Thu, 28 Jul 2011 23:30:44 +0000 (16:30 -0700)]
Add -lrt to libcommon

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Sage Weil [Thu, 28 Jul 2011 23:28:27 +0000 (16:28 -0700)]
Makefile: -lrt for libglobal.la only

Debugging linking is a pita.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 23:26:58 +0000 (16:26 -0700)]
unittest_bufferlist: change include order

fixes a build error (int type conflicts) for me on fatty.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 22:24:23 +0000 (15:24 -0700)]
mds: fix log trimming races

trim() would iterate over segments.  It would take the *p segment, ++p,
then call try_expire().  But the _expired() function would also clean up
and (if possible) retire subsequent segments on the list if they were on
the expired list, invalidating the p iterator.

Untangle the mess by making expired segment trimming (i.e. removing from
segment list) a separate operation performed only by trim() (probably a
good idea anyway).  This keeps the iterator safe/stable.

Signed-off-by: Sage Weil <sage@newdream.net>
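
The separation described above can be sketched with a simplified model in which marking a segment expired never modifies the segment list, and removal happens in a distinct pass. Everything here (the LogSketch type, int segments, trimming only from the front of the list) is an assumption for illustration and not the MDLog implementation.

```cpp
#include <list>
#include <set>

// Hypothetical, simplified model: segments are ints, kept oldest-first.
struct LogSketch {
  std::list<int> segments;
  std::set<int> expired;

  // Only records that a segment is expired. It never touches `segments`,
  // so it is safe to call while another function iterates over the list.
  void mark_expired(int seg) { expired.insert(seg); }

  // Separate pass: pop expired segments off the front of the list and
  // stop at the first segment that has not expired yet.
  void trim_expired() {
    while (!segments.empty() && expired.count(segments.front())) {
      expired.erase(segments.front());
      segments.pop_front();
    }
  }

  // Walk the list, expire whatever can_expire() accepts, then trim once.
  template <typename Pred>
  void trim(Pred can_expire) {
    for (std::list<int>::iterator p = segments.begin();
         p != segments.end(); ++p) {
      if (can_expire(*p))
        mark_expired(*p);
    }
    trim_expired();
  }
};
```

Because the loop body never erases from the list, the iterator p stays valid for the whole walk.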
Sage Weil [Thu, 28 Jul 2011 21:51:06 +0000 (14:51 -0700)]
mds: separate type for gratuitous debug ESubtreeMaps

Give these a different type so they are not interpreted as subtree
boundaries during replay.  Otherwise we break the truncate_finish code,
which references the truncate_start logsegment by offset.  Probably other
stuff too.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 21:08:08 +0000 (14:08 -0700)]
mon: 'ceph mon dump [--format=json]'

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 20:31:50 +0000 (13:31 -0700)]
heartbeatmap: unit test

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 20:24:51 +0000 (13:24 -0700)]
heartbeatmap: we don't care about pthread_t

Workers don't have to be threads.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 18:27:25 +0000 (11:27 -0700)]
client: open session with all mds targets

If we have an open session with an mds, we need to have an open session
with each of its targets as well.

The problem arises in a sequence like this:

- client has old mdsmap
- mds A adds B as target in mdsmap
- send request to mds A
- A exports to B
- we get the EXPORT, but B isn't listed as a target for A in client map
- client gets updated map

At the time we receive the map we need to open the session to B.   We can't
really do it when we get the EXPORT because we don't know the target MDS.

We can either track which exports are pending to do it, or just blindly
open sessions with targets for any MDSs we have caps with.  Which is
basically every session we have open.  That's simplest for now.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 22:53:03 +0000 (15:53 -0700)]
Makefile: fix unittest_ceph_argparse build

Signed-off-by: Sage Weil <sage@newdream.net>
Colin Patrick McCabe [Thu, 28 Jul 2011 22:17:38 +0000 (15:17 -0700)]
injectargs: complain about unparsed args

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe [Thu, 28 Jul 2011 21:48:31 +0000 (14:48 -0700)]
injectargs: print out what is changing

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Yehuda Sadeh [Thu, 28 Jul 2011 21:40:02 +0000 (14:40 -0700)]
rgw: fix base64 check

Yehuda Sadeh [Thu, 28 Jul 2011 21:29:52 +0000 (14:29 -0700)]
rgw: check content md5 validity when doing auth

Yehuda Sadeh [Thu, 28 Jul 2011 21:03:20 +0000 (14:03 -0700)]
rgw: fix date checks

Yehuda Sadeh [Thu, 28 Jul 2011 19:34:30 +0000 (12:34 -0700)]
rgw: fix authentication

Greg Farnum [Thu, 28 Jul 2011 19:42:17 +0000 (12:42 -0700)]
scatterlock: convert [un]scatter_wanted to a bitfield

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Thu, 28 Jul 2011 19:33:25 +0000 (12:33 -0700)]
mds: Handle unscatter_wanted in try_eval(lock, need_issue)

commit:dac1dc83ee5598ca97c29cd5d0b12150685cd05b added handling
for scatter_wanted, but we need to handle unscatter_wanted here too.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Thu, 28 Jul 2011 18:34:09 +0000 (11:34 -0700)]
mds: Split the CInode::scatter_wanted field in two

We use this field to indicate we want a scatter or an unscatter. Make
that distinction explicit.
Also, clear the unscatter_wanted in simple_lock when we start a gather!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Thu, 28 Jul 2011 17:09:58 +0000 (10:09 -0700)]
heartbeatmap: fix mode

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 17:10:51 +0000 (10:10 -0700)]
heartbeatmap: warn if previous deadline is missed

This will generate missed deadline noise in the log that may otherwise be
missed by an infrequent heartbeat_interval.  We generally want to know if
deadlines are missed, but we don't necessarily need to touch the heartbeat
file every second.  This gets us both.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:50:53 +0000 (09:50 -0700)]
ceph_context: only wake up periodically if heartbeat_interval is set

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:47:46 +0000 (09:47 -0700)]
osd: no need to explicitly check health

The service thread does it now.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:47:13 +0000 (09:47 -0700)]
vstart: set heartbeat file

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:39:15 +0000 (09:39 -0700)]
ceph_context: check internal heartbeat in cct service thread

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:38:51 +0000 (09:38 -0700)]
heartbeatmap: config options, method to touch a file if healthy

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:15:27 +0000 (09:15 -0700)]
heartbeatmap: use atomic_t

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:15:07 +0000 (09:15 -0700)]
heartbeatmap: put in ceph namespace

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:10:17 +0000 (09:10 -0700)]
heartbeatmap: simplify api

reset_timeout() and clear_timeout() make more sense than "touch".

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:07:02 +0000 (09:07 -0700)]
heartbeatmap: fix stupid race

atomic_t is probably better here, actually... :/

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 16:02:07 +0000 (09:02 -0700)]
heartbeatmap: use a list<> instead of map<>

Don't need a map<> here.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 28 Jul 2011 05:49:12 +0000 (22:49 -0700)]
workqueue: register and time out worker threads

Register and unregister worker threads.  Periodically touch heartbeat
when idle.  Set heartbeat timeout before processing a queue item.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 28 Jul 2011 05:47:41 +0000 (22:47 -0700)]
workqueue: provide op timeout to workqueue constructor

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 28 Jul 2011 05:38:43 +0000 (22:38 -0700)]
heartbeatmap: introduce heartbeat_map

Each thread registers and gets a private structure it can write a timeout
value to.  The timeout is time_t and always fits in a single word, so no
locking is used to update it.

Anyone can call is_healthy() to find out if any timeouts have expired.
Eventually some background thread will do this.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
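
A minimal sketch of this design, with two stated departures: it uses std::atomic for the timeout (the commit as described relies on a plain word-sized time_t), and registration here is not itself thread-safe. The type names are hypothetical and only loosely follow the commit message.

```cpp
#include <atomic>
#include <ctime>
#include <list>

// Hypothetical per-worker structure; 0 means "no deadline armed".
struct HeartbeatHandle {
  std::atomic<time_t> timeout{0};
};

class HeartbeatMapSketch {
 public:
  // Register a worker and hand back its private handle. A std::list keeps
  // the handle's address stable as more workers are added.
  HeartbeatHandle* add_worker() {
    workers_.emplace_back();
    return &workers_.back();
  }

  // Healthy means no armed deadline lies before `now`.
  bool is_healthy(time_t now) const {
    for (const HeartbeatHandle& h : workers_) {
      time_t t = h.timeout.load();
      if (t != 0 && t < now)
        return false;
    }
    return true;
  }

 private:
  std::list<HeartbeatHandle> workers_;
};
```

A worker would store a deadline in its handle before starting a unit of work and store 0 again when it finishes; any caller can poll is_healthy() with the current time.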
Sage Weil [Thu, 28 Jul 2011 04:44:44 +0000 (21:44 -0700)]
mds: mark ambig imports in ESubtreeMap during resolve

During resolve we may journal EImportFinish(true/false) as we resolve our
imports/exports.  And as a side-effect we may journal an ESubtreeMap.  We
need to properly mark ambig subtrees in that entry based on the
my_ambiguous_imports (resolve state), not just the migrator state (for the
active mds).

Note that the other Migrator::is_ambiguous_import() user
(send_resolve_now()) already does this correctly.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 28 Jul 2011 04:29:05 +0000 (21:29 -0700)]
mds: pin inodes on LogSegment::truncating_inodes list

For active MDS, pin when we add to the list, unpin when we finish
truncating.

For replay, pin when we replay a truncate start, unpin when we replay a
truncate finish.  Use a nice helper for both.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 28 Jul 2011 03:46:03 +0000 (20:46 -0700)]
mds: handle aborted slave rename while waiting for second prep

When we get the first prep, we may respond to the master with an expanded
list of witnesses for the rename before making any change (or rollback
plan).  If the master fails before sending the second prep attempt, we
may end up in the abort path of _commit_slave_rename() with an empty
rollback_bl.  That's fine; don't crash.  We still need to unfreeze the
srci, but can skip the do_rename_rollback since we didn't actually journal
a change.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 27 Jul 2011 21:38:38 +0000 (14:38 -0700)]
mds: honor scatter_wanted while freezing

- mds A authpins item on mds B
- mds B starts to freeze tree containing item
- mds A tries wrlock_start on A, sends REQSCATTER to B
- mds B lock is unstable, sets scatter_wanted
- mds B lock stabilizes, calls try_eval, defers because freezing.
-> deadlock

In general, we want to avoid the eval while freezing to prevent starvation.
However, in this case with the multi-mds locking, we need to honor
the scatter_wanted even so.

Insert this check in try_eval().  This will catch it on the first try_eval
call after the lock stabilizes.  The ambiguous auth will never catch us
while freezing, and the master holds an auth_pin to prevent a freeze, so
we will never defer the eval; no need to do the same logic in the other
eval method (eval(MDSCacheObject*, ...)) used for retry.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 27 Jul 2011 21:32:24 +0000 (14:32 -0700)]
mds: try_eval in many places

These are the obvious places where we drop locks and may need to defer the
eval until after unfreeze.  There are probably more; a full audit is in
order.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 27 Jul 2011 20:13:38 +0000 (13:13 -0700)]
mds: implement try_eval() on a single lock

We frequently call eval() on locks, usually after dropping an rd/wr/xlock.
At that point the eval() may do nothing because the object is now freezing
or frozen.  However, we still need to do the eval eventually.

These callers should eventually all switch to try_eval(), and retry as
needed.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
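
The "try now, otherwise retry after unfreeze" pattern can be sketched as below. The CacheObjectSketch type, its fields, and the waiter list are hypothetical simplifications and do not reflect the real MDS cache objects or Locker code.

```cpp
#include <functional>
#include <vector>

// Hypothetical, simplified object; the real MDS cache objects differ.
struct CacheObjectSketch {
  bool freezing_or_frozen = false;
  int evals_done = 0;
  std::vector<std::function<void()>> unfreeze_waiters;

  // Stand-in for the real lock evaluation.
  void eval() { ++evals_done; }

  // Evaluate now if possible; otherwise queue a retry for unfreeze time.
  // Returns true if the eval ran immediately.
  bool try_eval() {
    if (freezing_or_frozen) {
      unfreeze_waiters.push_back([this] { try_eval(); });
      return false;
    }
    eval();
    return true;
  }

  // Clear the frozen state and run every queued retry once.
  void unfreeze() {
    freezing_or_frozen = false;
    std::vector<std::function<void()>> waiters;
    waiters.swap(unfreeze_waiters);
    for (std::function<void()>& w : waiters)
      w();
  }
};
```

The point of the pattern is that a deferred eval is never lost: it runs as soon as the object is unfrozen.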
Sage Weil [Wed, 27 Jul 2011 20:08:57 +0000 (13:08 -0700)]
mds: better debugging for scatter_wanted flag

Signed-off-by: Sage Weil <sage@newdream.net>
Greg Farnum [Thu, 28 Jul 2011 00:42:24 +0000 (17:42 -0700)]
mdsmon: Fix handling of follow-by-name MDSes.

We were accidentally setting them to standby-for-rank -1 if their
leader MDS wasn't active on startup. Things worked out in the end
anyway since they would go from standby to active for the appropriate
rank, but we want them to be in proper standby-replay!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Thu, 28 Jul 2011 00:41:09 +0000 (17:41 -0700)]
vstart: use paired MDSes with a specified standby.

I think this is a bit cleaner than specifying ranks manually.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 27 Jul 2011 19:01:38 +0000 (12:01 -0700)]
PG: add an assert for negative entries in the scrub map

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Tue, 26 Jul 2011 20:48:12 +0000 (13:48 -0700)]
osd: label ReplicatedPG::_scrub as virtual.

It is virtual in the parent class PG, and the style guide says to
label them in all classes so people don't forget.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Tue, 26 Jul 2011 20:45:46 +0000 (13:45 -0700)]
osd: turn down debug level on repop commit message

We really don't need that to be the only thing sitting in logs.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Yehuda Sadeh [Thu, 28 Jul 2011 00:02:17 +0000 (17:02 -0700)]
rgw: parse date from http header

Yehuda Sadeh [Wed, 27 Jul 2011 23:04:36 +0000 (16:04 -0700)]
rgw: return required error when content length missing on PUT

Sage Weil [Wed, 27 Jul 2011 19:43:47 +0000 (12:43 -0700)]
Merge branch 'next'

Sage Weil [Wed, 27 Jul 2011 19:42:21 +0000 (12:42 -0700)]
mds: make two passes on scatter_nudge

It's possible for scatter_nudge on a scatterlock in LOCK with dirty set to
go to MIX immediately and remain stable.  Give two 'nudge' passes before
we stop to avoid looping.

This fixes an assert failure where a nudge from log trimming ended up in a
stable state and asserted (!c).  The second pass will trigger the dirty
writebehind.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 27 Jul 2011 19:40:35 +0000 (12:40 -0700)]
mds: honor scatter_wanted flag in scatter_eval()

We do in file_eval, but not here.

Signed-off-by: Sage Weil <sage@newdream.net>
Colin Patrick McCabe [Wed, 27 Jul 2011 19:26:42 +0000 (12:26 -0700)]
testrados_delete_pool_while_open: remove from make

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe [Wed, 27 Jul 2011 17:57:37 +0000 (10:57 -0700)]
remove testrados_delete_pool_while_open

This test duplicates the functionality of
testrados_delete_pools_parallel, but not as well.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Sage Weil [Tue, 26 Jul 2011 23:10:48 +0000 (16:10 -0700)]
mds: fix projected rename adjustment

- we may journal one (or _maybe_ both, probably not) of the subtree root
  addition OR the bound addition, depending on whether oldparent and
  newparent are auth.
- we can't rely on get_subtree_root() to move bounds since the projected
  subtree isn't a root in the real tree.  use CDir::contains() instead.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 26 Jul 2011 21:59:44 +0000 (14:59 -0700)]
mds: clean out rename subtree cruft

We used to force these subtrees for rename.  We don't anymore; this is
old weirdness.

Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 26 Jul 2011 21:59:22 +0000 (14:59 -0700)]
mds: simplify subtree map after adjusting for rename

Merge the subtree with the parent if appropriate.

Signed-off-by: Sage Weil <sage@newdream.net>