git.apps.os.sepia.ceph.com Git - ceph.git/log
14 years agofilestore: call lower-level do_transactions() during journal replay
Sage Weil [Wed, 1 Dec 2010 21:48:56 +0000 (13:48 -0800)]
filestore: call lower-level do_transactions() during journal replay

We used to call apply_transactions, which avoided rejournaling anything
because the journal wasn't writeable yet, but that uses all kinds of other
machinery that relies on threads and finishers and such that aren't
appropriate or necessary when we're just replaying journaled events.

Instead, call the lower-level do_transactions() directly.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agofilestore: do journal mode autodetect and sanity check _before_ replay
Sage Weil [Wed, 1 Dec 2010 21:46:30 +0000 (13:46 -0800)]
filestore: do journal mode autodetect and sanity check _before_ replay

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agofilestore: fix journal locking on trailing mode
Sage Weil [Wed, 1 Dec 2010 19:05:11 +0000 (11:05 -0800)]
filestore: fix journal locking on trailing mode

We're already holding journal_lock due to the surrounding
op_submit_{start,finish}.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoMerge branch 'testing' into rc
Sage Weil [Wed, 1 Dec 2010 18:20:43 +0000 (10:20 -0800)]
Merge branch 'testing' into rc

Conflicts:
configure.ac

14 years agorbd: use MIN instead of min()
Sage Weil [Wed, 1 Dec 2010 17:51:27 +0000 (09:51 -0800)]
rbd: use MIN instead of min()

Not even sure where min() was coming from, but it seems to be missing on
i386 lucid:

g++ -DHAVE_CONFIG_H -I.     -Wall -D__CEPH__ -D_FILE_OFFSET_BITS=64 -D_REENTRANT -D_THREAD_SAFE -rdynamic -g -O2 -MT rbd.o -MD -MP -MF .deps/rbd.Tpo -c -o rbd.o rbd.cc
rbd.cc: In function 'int do_import(void*, const char*, int, const char*)':
rbd.cc:837: error: no matching function for call to 'min(uint64_t&, off_t)'
make[3]: *** [rbd.o] Error 1

Reported-by: John Leach <john@johnleach.co.uk>
Signed-off-by: Sage Weil <sage@newdream.net>
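
The "no matching function" error above is what one would expect from std::min, which deduces a single type for both arguments and so rejects a uint64_t paired with an off_t. A minimal sketch of one way around it, assuming a hypothetical helper clamp_chunk() (not a function in rbd.cc, and not the MIN macro the commit actually adopted):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <sys/types.h>

// Hypothetical helper for illustration only: clamp a chunk length to the
// number of bytes remaining in the input.
//
// std::min(a, b) deduces one template type T from both arguments, so a call
// with (uint64_t, off_t) does not compile. Naming T explicitly and casting
// the signed side makes both arguments the same type before comparing.
inline uint64_t clamp_chunk(uint64_t chunk_len, off_t remaining) {
    assert(remaining >= 0);  // a negative remainder would wrap when cast
    return std::min<uint64_t>(chunk_len, static_cast<uint64_t>(remaining));
}
```

A macro such as MIN sidesteps template deduction entirely, which is presumably why it builds where the function call did not.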
14 years agoclient: connect to export targets on cap EXPORT
Sage Weil [Wed, 1 Dec 2010 17:44:58 +0000 (09:44 -0800)]
client: connect to export targets on cap EXPORT

Also unconditionally connect on reconnect, even when there aren't any
outstanding requests.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoceph v0.23.2 v0.23.2
Sage Weil [Wed, 1 Dec 2010 17:27:19 +0000 (09:27 -0800)]
ceph v0.23.2

14 years agofilestore: do not autodetect BTRFS_IOC_SNAP_CREATE_ASYNC until interface is finalized
Sage Weil [Wed, 1 Dec 2010 18:03:19 +0000 (10:03 -0800)]
filestore: do not autodetect BTRFS_IOC_SNAP_CREATE_ASYNC until interface is finalized

Li has proposed an alternative V2 ioctl that looks nicer, so wait until
that is finalized.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoclient: fix cap export handler
Sage Weil [Wed, 1 Dec 2010 17:44:26 +0000 (09:44 -0800)]
client: fix cap export handler

An EXPORT cap msg can race with a cap release; deal with that (realigning
this code with the kclient).

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoman: fix monmaptool man page
Laszlo Boszormenyi [Wed, 1 Dec 2010 17:24:45 +0000 (09:24 -0800)]
man: fix monmaptool man page

I've found the manpage problem that I've noted before. It's about
monmaptool, the CLI says its usage:
[--print] [--create [--clobber]] [--add name 1.2.3.4:567] [--rm name]
<mapfilename>
But the manpage states this as an example:
monmaptool --create --add 192.168.0.10:6789 --add 192.168.0.11:6789 --add
192.168.0.12:6789 --clobber monmap
This definitely misses 'name' after the 'add' switch, resulting in
"invalid ip:port '--add'" as an error message. The attached patch fixes this
inconsistency.

Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
14 years agoosd: simplify scrub sanity checks
Sage Weil [Wed, 1 Dec 2010 00:50:41 +0000 (16:50 -0800)]
osd: simplify scrub sanity checks

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: only adjust osd scrub_pending if pg was reserved
Sage Weil [Wed, 1 Dec 2010 00:50:25 +0000 (16:50 -0800)]
osd: only adjust osd scrub_pending if pg was reserved

If for some reason we enter scrub() without scrub_reserved == true, don't
adjust the osd->scrubs_pending or we'll screw up the accounting.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix import_reverse re-exporting of caps
Sage Weil [Wed, 1 Dec 2010 00:38:21 +0000 (16:38 -0800)]
mds: fix import_reverse re-exporting of caps

Make the import_reverse() set the pin/state before it clears them by using
the helper that sets them.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: turn off mds_bal_frag until resolve vs split/merge is fixed
Sage Weil [Wed, 1 Dec 2010 00:25:15 +0000 (16:25 -0800)]
mds: turn off mds_bal_frag until resolve vs split/merge is fixed

See #594

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoMerge remote branch 'origin/lost' into unstable
Sage Weil [Wed, 1 Dec 2010 00:11:20 +0000 (16:11 -0800)]
Merge remote branch 'origin/lost' into unstable

Conflicts:
src/osd/osd_types.h

14 years agoosd: refactor object_info_t constructor a bit
Colin Patrick McCabe [Tue, 30 Nov 2010 23:04:15 +0000 (15:04 -0800)]
osd: refactor object_info_t constructor a bit

Create a copy constructor for object_info_t, since we often want to copy
an object_info_t and would rather not try to remember all the fields.
Drop the lost parameter from one of the other constructors, because it's
not used that much.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: share_pg_log: update peer_missing
Colin Patrick McCabe [Tue, 30 Nov 2010 22:43:51 +0000 (14:43 -0800)]
osd: share_pg_log: update peer_missing

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: mark_obj_as_lost: fix oloc init, eversion
Colin Patrick McCabe [Tue, 30 Nov 2010 21:42:07 +0000 (13:42 -0800)]
osd: mark_obj_as_lost: fix oloc init, eversion

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: mark_all_unfound_as_lost: bugfix, refactor
Colin Patrick McCabe [Mon, 29 Nov 2010 20:01:55 +0000 (12:01 -0800)]
osd: mark_all_unfound_as_lost: bugfix, refactor

mark_all_unfound_as_lost: just delete items from the rmissing set as we
find them, rather than using a multi-pass system.

Update info.last_update as we go so that log printouts will look correct
(the log printout function checks info.last_update)

Don't remove from missing or missing_loc in mark_obj_as_lost.
PG::missing_loc should never have the soid, and PG::missing we handle
elsewhere.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
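
The "delete items as we find them" approach described above can be sketched with the usual erase-while-iterating idiom. The container types and the drop_unfound() name are assumptions for illustration, not the PG code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins: rmissing maps a version number to an object name,
// and `located` holds the names that some OSD is known to have.
using RMissing = std::map<uint64_t, std::string>;

// Single pass: erase each unfound entry as it is visited and return how many
// were removed. map::erase(iterator) returns the next valid iterator, so no
// second pass or side list of keys is needed.
std::size_t drop_unfound(RMissing& rmissing,
                         const std::set<std::string>& located) {
    std::size_t removed = 0;
    for (auto it = rmissing.begin(); it != rmissing.end();) {
        if (located.count(it->second) == 0) {
            it = rmissing.erase(it);
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```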
14 years agoosd: mark_obj_as_lost: don't assume we have obj
Colin Patrick McCabe [Mon, 29 Nov 2010 19:33:39 +0000 (11:33 -0800)]
osd: mark_obj_as_lost: don't assume we have obj

In PG::mark_obj_as_lost, we have to mark a missing object as lost. We
should not assume that we have an old version of the missing object in
the ObjectStore. If the object doesn't exist in the object store, we
have to create it so that recovery can function correctly.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: create lost2 test
Colin Patrick McCabe [Thu, 25 Nov 2010 05:15:00 +0000 (21:15 -0800)]
osd: create lost2 test

This one verifies:
1. Client asks for an unfound object and gets put to sleep
2. Object gets declared lost
3. Client wakes up

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: mark_all_unfound_as_lost: set lost attr
Colin Patrick McCabe [Thu, 25 Nov 2010 04:55:14 +0000 (20:55 -0800)]
osd: mark_all_unfound_as_lost: set lost attr

In mark_all_unfound_as_lost, we need to set the lost bit in the objects'
object_info_t.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoradostool: fix memleak in error path
Colin Patrick McCabe [Thu, 25 Nov 2010 01:26:35 +0000 (17:26 -0800)]
radostool: fix memleak in error path

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: mark_all_unfound_as_lost: wake waiters
Colin Patrick McCabe [Wed, 24 Nov 2010 06:04:53 +0000 (22:04 -0800)]
osd: mark_all_unfound_as_lost: wake waiters

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agotest_lost: add lost1 test
Colin Patrick McCabe [Wed, 24 Nov 2010 05:55:26 +0000 (21:55 -0800)]
test_lost: add lost1 test

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: ReplicatedPG::do_op: error on read-from-lost
Colin Patrick McCabe [Wed, 24 Nov 2010 05:45:47 +0000 (21:45 -0800)]
osd: ReplicatedPG::do_op: error on read-from-lost

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: don't mark objs as lost unless we're active
Colin Patrick McCabe [Tue, 23 Nov 2010 22:30:06 +0000 (14:30 -0800)]
osd: don't mark objs as lost unless we're active

We don't have enough information to mark objects as lost until we
activate the PG. might_have_unfound isn't even built until PG::activate.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agomds: fix resolve for surviving observers
Sage Weil [Tue, 30 Nov 2010 23:43:53 +0000 (15:43 -0800)]
mds: fix resolve for surviving observers

Make all survivors participate in resolve stage, so that survivors can
properly determine the outcome of migrations to the failed node that did
not complete.

The sequence (before):
 - A starts to export /foo to B
 - C has ambiguous auth (A,B) in its subtree map
 - B journals import_start
 - B fails
...
 - B restarts
 - B sends resolves to everyone
   - does not claim /foo
 - A sends resolve _only_ to B
   - does claim /foo
 - B knows its import did not complete
 - C doesn't know anything.  Also, the maybe_resolve_finish stuff was
   totally broken because the recovery_set wasn't initialized

See new (commented out) assert in Migrator.cc to reproduce the above.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agotest: dump_osd_store: sort dump output
Colin Patrick McCabe [Tue, 23 Nov 2010 18:55:59 +0000 (10:55 -0800)]
test: dump_osd_store: sort dump output

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: active replicas process logs from primaries
Colin Patrick McCabe [Tue, 23 Nov 2010 18:18:29 +0000 (10:18 -0800)]
osd: active replicas process logs from primaries

In _process_pg_info, if the primary sends us a PG::Log, a replica should
merge that log into its own.

mark_all_unfound_as_lost / share_pg_log: don't send the whole PG::Log.
Just send the new entries that were just added when marking the objects
as lost.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: object_info_t: add lost field
Colin Patrick McCabe [Mon, 22 Nov 2010 23:56:06 +0000 (15:56 -0800)]
osd: object_info_t: add lost field

We can now permanently mark objects as lost by setting the lost bit in
their object_info_t. Rev the object_info_t struct.

get_object_context: re-arrange this so that we're always setting the
lost bit. Also avoid some unnecessary steps.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoAdd ./ceph dump pg debug degraded_pgs_exist
Colin Patrick McCabe [Mon, 22 Nov 2010 19:32:38 +0000 (11:32 -0800)]
Add ./ceph dump pg debug degraded_pgs_exist

./ceph dump pg debug degraded_pgs_exist returns TRUE if some pgs are
degraded; false otherwise.

tests: move start_recovery into test_common.sh.
Create recovery1 test.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years ago(re)add mechanism for marking objects as lost
Colin Patrick McCabe [Sat, 20 Nov 2010 03:15:40 +0000 (19:15 -0800)]
(re)add mechanism for marking objects as lost

In activate_map, we now mark objects that we know are unfindable as
lost. This relies on the might_have_unfound set introduced earlier.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: fix object_info_t() initialization of oloc
Sage Weil [Tue, 30 Nov 2010 20:57:43 +0000 (12:57 -0800)]
osd: fix object_info_t() initialization of oloc

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: add debug output to make completions easier to track
Sage Weil [Tue, 30 Nov 2010 20:56:15 +0000 (12:56 -0800)]
mds: add debug output to make completions easier to track

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: fix misuses of OLOC_BLANK
Sage Weil [Tue, 30 Nov 2010 20:48:32 +0000 (12:48 -0800)]
osd: fix misuses of OLOC_BLANK

Commit 6e2b594b fixed a bunch of bad get_object_context() calls, but even
with the parameter fixed some were still broken.  Pass in a valid oloc in
those cases.  The only place where OLOC_BLANK _is_ still used is when we
know we have the object locally and will load a valid value off disk.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoRevert "mds: resolve cleanup"
Sage Weil [Tue, 30 Nov 2010 20:23:18 +0000 (12:23 -0800)]
Revert "mds: resolve cleanup"

This reverts commit cd53719f3ce712a060e4ac80cab934c597531a5e.

We need this on surviving nodes too to resolve ambiguous migrations to/from recovering
nodes.

14 years agoMerge branch 'testing' into unstable
Sage Weil [Tue, 30 Nov 2010 20:19:39 +0000 (12:19 -0800)]
Merge branch 'testing' into unstable

Conflicts:
src/os/FileJournal.cc

14 years agoosd: make recovery_oids debug list per-pg
Sage Weil [Tue, 30 Nov 2010 19:43:19 +0000 (11:43 -0800)]
osd: make recovery_oids debug list per-pg

Otherwise we hit bad asserts if an object of the same name in different
pools is getting recovered simultaneously.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoclient: Set the DirResult buffer to NULL when deleting it.
Greg Farnum [Tue, 30 Nov 2010 18:56:34 +0000 (10:56 -0800)]
client: Set the DirResult buffer to NULL when deleting it.

This should fix a crash exposed by our bonnie workunit. Previously
the client would keep trying to read out of the (deleted) buffer!
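
A minimal sketch of the pattern this fix describes, assuming a simplified DirResult with a single heap-allocated buffer (the real structure has more fields, and release_buffer()/has_buffer() are hypothetical names):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced shape of the structure for illustration.
struct DirResult {
    std::vector<std::string>* buffer = nullptr;
};

// Free the buffer and clear the pointer in one place, so later code sees
// "no buffer" instead of a dangling pointer to freed memory.
void release_buffer(DirResult& d) {
    delete d.buffer;     // deleting nullptr is a no-op, so repeat calls are safe
    d.buffer = nullptr;
}

// Readers check this before touching the buffer.
bool has_buffer(const DirResult& d) { return d.buffer != nullptr; }
```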

14 years agoceph.spec.in: include gui files
Sage Weil [Tue, 30 Nov 2010 17:22:42 +0000 (09:22 -0800)]
ceph.spec.in: include gui files

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agodebian: many many cleanups
Sage Weil [Tue, 30 Nov 2010 17:13:54 +0000 (09:13 -0800)]
debian: many many cleanups

Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
14 years agofilejournal: fix throttle vs FULL behavior
Sage Weil [Tue, 30 Nov 2010 16:55:29 +0000 (08:55 -0800)]
filejournal: fix throttle vs FULL behavior

We don't want to add to the throttler if we aren't going to queue the
write, or else we'll never take it off again.

Signed-off-by: Sage Weil <sage@newdream.net>
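
The accounting rule above (only charge the throttler for writes that really get queued) can be sketched as follows. This is a single-threaded illustration with hypothetical Throttle and Journal types, not the FileJournal implementation:

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for a byte throttler.
struct Throttle {
    uint64_t in_flight = 0;
    void take(uint64_t n) { in_flight += n; }
    void put(uint64_t n) {
        assert(n <= in_flight);
        in_flight -= n;
    }
};

// Hypothetical journal with a FULL flag and a queue of pending write sizes.
struct Journal {
    bool full = false;
    Throttle throttle;
    std::deque<uint64_t> writeq;

    // Charge the throttle only when the write is actually queued.
    bool submit(uint64_t bytes) {
        if (full)
            return false;  // nothing queued, so nothing to release later
        throttle.take(bytes);
        writeq.push_back(bytes);
        return true;
    }

    // Completing a queued write is the only place the throttle is released.
    void complete_one() {
        assert(!writeq.empty());
        throttle.put(writeq.front());
        writeq.pop_front();
    }
};
```

Because take() and put() are paired through the queue, a write rejected while the journal is full can never leave a stale charge behind.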
14 years agoMerge branch 'osd_journaling' into unstable
Sage Weil [Tue, 30 Nov 2010 16:32:55 +0000 (08:32 -0800)]
Merge branch 'osd_journaling' into unstable

14 years agofilestore: make sure blocked op_start's wake up in order
Sage Weil [Tue, 30 Nov 2010 16:30:57 +0000 (08:30 -0800)]
filestore: make sure blocked op_start's wake up in order

If they wake up out of order (which, theoretically, they could before) we
can screw up journal submitting order in writebehind mode, or apply order
in parallel and writeahead journal mode.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agofilestore: assert op_submit_finish is called in order
Sage Weil [Tue, 30 Nov 2010 16:24:57 +0000 (08:24 -0800)]
filestore: assert op_submit_finish is called in order

Verify/assert that we aren't screwing up the submission pipeline ordering.
Namely, we want to make sure that if op_apply_start() blocks, we wake up
in the proper order and don't screw up the journaling.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agofilejournal: rework journal FULL behavior and fix throttling
Sage Weil [Tue, 30 Nov 2010 15:54:42 +0000 (07:54 -0800)]
filejournal: rework journal FULL behavior and fix throttling

Keep distinct states for FULL, WAIT, and NOTFULL.

The old code was more or less correct at one point, but assumed the seq
changed on each commit, not each operation; in its prior state it was
totally broken.

Also fix throttling (we were leaking items in the throttler that were
submitted while the journal was full).

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agofilestore: refactor op_queue/journal locking
Sage Weil [Tue, 30 Nov 2010 15:51:16 +0000 (07:51 -0800)]
filestore: refactor op_queue/journal locking

- Combine journal_lock and lock.
- Move throttling outside of the lock (this fixes potential deadlock in
  parallel journal mode)
- Make interface nomenclature a bit more helpful

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agofilestore: do not throttle op_queue in queue_op()
Sage Weil [Tue, 30 Nov 2010 15:22:37 +0000 (07:22 -0800)]
filestore: do not throttle op_queue in queue_op()

In parallel mode, queue_op is called while holding the journal lock, so it
is not okay to throttle there.  Instead, throttle in the caller.

The throttling still needs improvement, but this at least fixes the locking
problem.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoMakefile: add bloom_filter.hpp to noinst_HEADERs
Colin Patrick McCabe [Tue, 30 Nov 2010 02:49:53 +0000 (18:49 -0800)]
Makefile: add bloom_filter.hpp to noinst_HEADERs

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoMakefile: Fix VPATH builds
Colin Patrick McCabe [Tue, 30 Nov 2010 01:16:06 +0000 (17:16 -0800)]
Makefile: Fix VPATH builds

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: osd_types.h: const cleanup
Colin Patrick McCabe [Tue, 30 Nov 2010 00:38:18 +0000 (16:38 -0800)]
osd: osd_types.h: const cleanup

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: don't try to load a PG in a nonexistent pool
Colin Patrick McCabe [Tue, 30 Nov 2010 00:29:39 +0000 (16:29 -0800)]
osd: don't try to load a PG in a nonexistent pool

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agofilestore: simplify apply_transactions
Sage Weil [Tue, 30 Nov 2010 00:38:55 +0000 (16:38 -0800)]
filestore: simplify apply_transactions

Always use queue_transactions, even in no-journal case.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: PG::trim: fix inverted conditional in assert
Colin Patrick McCabe [Mon, 29 Nov 2010 23:51:26 +0000 (15:51 -0800)]
osd: PG::trim: fix inverted conditional in assert

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agocommon: prevent infinite recursion on SIGSEGV
Colin Patrick McCabe [Mon, 29 Nov 2010 22:46:08 +0000 (14:46 -0800)]
common: prevent infinite recursion on SIGSEGV

Install SIGSEGV / SIGABRT handlers with sigaction using SA_RESETHAND.
This will ensure that if the signal handler itself encounters another
fault, the default signal handler (usually dump core) will be what is
used. Also, flush the log before dumping core.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
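
A minimal sketch of installing a one-shot handler with sigaction and SA_RESETHAND, as the message above describes. The function names are hypothetical, and the log flush is left as a comment because a real handler must restrict itself to async-signal-safe calls:

```cpp
#include <cassert>
#include <signal.h>

// Hypothetical handler for illustration.
extern "C" void fatal_signal_handler(int signum) {
    // (A real handler would flush the log here.)
    // Because of SA_RESETHAND the disposition is already SIG_DFL at this
    // point, so the re-raised signal gets the default action, typically
    // terminating the process with a core dump.
    raise(signum);
}

// Install the handler for one signal; returns true on success.
bool install_fatal_handler(int signum) {
    struct sigaction sa = {};
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;  // reset to the default disposition on entry
    sa.sa_handler = fatal_signal_handler;
    return sigaction(signum, &sa, nullptr) == 0;
}
```

A second fault inside the handler then falls through to the default action instead of re-entering the handler.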
14 years agoosd: Create pg_split test
Colin Patrick McCabe [Mon, 29 Nov 2010 20:56:23 +0000 (12:56 -0800)]
osd: Create pg_split test

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agologger: Fix a crash when the MDS shuts down cleanly.
Greg Farnum [Mon, 29 Nov 2010 21:34:49 +0000 (13:34 -0800)]
logger: Fix a crash when the MDS shuts down cleanly.

We weren't holding the lock on the logger_timer before calling shutdown.

14 years agoTimer: add some asserts to catch certain errors.
Greg Farnum [Mon, 29 Nov 2010 21:33:47 +0000 (13:33 -0800)]
Timer: add some asserts to catch certain errors.

14 years agoMakefile: Add --as-needed to LDFLAGS
Colin Patrick McCabe [Mon, 29 Nov 2010 20:18:04 +0000 (12:18 -0800)]
Makefile: Add --as-needed to LDFLAGS

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agovstart.sh: don't specify journaling mode
Sage Weil [Mon, 29 Nov 2010 19:51:13 +0000 (11:51 -0800)]
vstart.sh: don't specify journaling mode

Let the autodetection kick in, or let the dev specify via -o '...'.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: PG::trim: add assert
Colin Patrick McCabe [Mon, 29 Nov 2010 19:15:45 +0000 (11:15 -0800)]
osd: PG::trim: add assert

Assert that we're not trimming the PG log past last_complete.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: _process_pg_info: add assert for replicas
Colin Patrick McCabe [Mon, 29 Nov 2010 17:29:17 +0000 (09:29 -0800)]
osd: _process_pg_info: add assert for replicas

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: dump_missing: also dump missing_loc
Colin Patrick McCabe [Thu, 25 Nov 2010 07:36:14 +0000 (23:36 -0800)]
osd: dump_missing: also dump missing_loc

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: discover_all_missing fix
Colin Patrick McCabe [Thu, 25 Nov 2010 07:13:43 +0000 (23:13 -0800)]
osd: discover_all_missing fix

Don't request information from an OSD unless it is up and part of the
might_have_unfound set. Add more logging.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agogui: some cleanup
Colin Patrick McCabe [Wed, 24 Nov 2010 00:37:20 +0000 (16:37 -0800)]
gui: some cleanup

Rather than vectors of pointers, use vectors of NodeInfo structures.
This avoids the problem of freeing the NodeInfo structures.

GuiMonitor::gen_node_info_from_icons: initialize status.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
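
A short sketch of the value-semantics approach mentioned above. The NodeInfo fields and the gen_node_info() signature are assumptions for illustration, not the GuiMonitor code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical fields; status gets a default so it is never uninitialized.
struct NodeInfo {
    std::string name;
    int status = 0;
};

// Returning a vector of values means the vector owns its elements: there is
// nothing for the caller to delete, and copies are independent.
std::vector<NodeInfo> gen_node_info(const std::vector<std::string>& names) {
    std::vector<NodeInfo> out;
    out.reserve(names.size());
    for (const auto& n : names) {
        NodeInfo info;
        info.name = n;
        out.push_back(info);
    }
    return out;
}
```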
14 years agogui: more reindenting
Colin Patrick McCabe [Tue, 23 Nov 2010 23:39:53 +0000 (15:39 -0800)]
gui: more reindenting

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agogui: reindent a bunch of code
Colin Patrick McCabe [Tue, 23 Nov 2010 23:37:15 +0000 (15:37 -0800)]
gui: reindent a bunch of code

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agomdcache: in trim_non_auth, only print out path if it has a parent dentry.
Greg Farnum [Tue, 23 Nov 2010 22:40:54 +0000 (14:40 -0800)]
mdcache: in trim_non_auth, only print out path if it has a parent dentry.

This should only occur with the root inode, but caused a segfault for
anybody running more than one MDS who restarted.

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
14 years agomds: Reply checking_lock while reading filelock
Herb Shiu [Tue, 23 Nov 2010 07:31:50 +0000 (15:31 +0800)]
mds: Reply checking_lock while reading filelock

Use checking_lock to replace lock_state in the extra buffer list so that the client gets the correct file lock reply.

14 years agoclient: remove inode from flush_caps list when auth_cap changes
Sage Weil [Tue, 23 Nov 2010 18:25:39 +0000 (10:25 -0800)]
client: remove inode from flush_caps list when auth_cap changes

Avoid confusing other code (e.g. kick_flushing_caps) by staying on the mds
flushing_caps list when we don't even have an auth_cap with them anymore.
We'll need to re-flush to a new MDS later.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix set_state_rejoin auth_pin check
Sage Weil [Tue, 23 Nov 2010 18:08:18 +0000 (10:08 -0800)]
mds: fix set_state_rejoin auth_pin check

We carry an auth pin IFF !stable AND auth.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoinit-ceph: tolerate failure in cleanalllogs
Sage Weil [Tue, 23 Nov 2010 21:39:38 +0000 (13:39 -0800)]
init-ceph: tolerate failure in cleanalllogs

Otherwise /var/log/ceph/stat makes rm -f error out and we fail.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: fix recover_replicas() unfound check
Sage Weil [Tue, 23 Nov 2010 21:32:49 +0000 (13:32 -0800)]
osd: fix recover_replicas() unfound check

missing_loc.count(soid) == 0 only means unfound if it's not missing on the
primary.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: recover_primary() until primary has all found objects
Sage Weil [Tue, 23 Nov 2010 21:16:52 +0000 (13:16 -0800)]
osd: recover_primary() until primary has all found objects

The logic in that if was effectively reversed.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: only discover_all_missing if unfound
Sage Weil [Tue, 23 Nov 2010 21:16:20 +0000 (13:16 -0800)]
osd: only discover_all_missing if unfound

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: add get_num_unfound() helper
Sage Weil [Tue, 23 Nov 2010 21:15:48 +0000 (13:15 -0800)]
osd: add get_num_unfound() helper

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: only search_for_missing if there are unfound objects
Sage Weil [Tue, 23 Nov 2010 20:46:51 +0000 (12:46 -0800)]
osd: only search_for_missing if there are unfound objects

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: removing unused variable, fix warning
Sage Weil [Tue, 23 Nov 2010 20:33:00 +0000 (12:33 -0800)]
osd: removing unused variable, fix warning

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: fix is_all_uptodate()
Sage Weil [Tue, 23 Nov 2010 20:32:50 +0000 (12:32 -0800)]
osd: fix is_all_uptodate()

This should only return true when recovery is done, i.e., no more missing
objects.  Nothing to do with unfound.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: fix PG::is_all_uptodate
Colin Patrick McCabe [Tue, 23 Nov 2010 18:42:32 +0000 (10:42 -0800)]
osd: fix PG::is_all_uptodate

In PG::is_all_uptodate, don't try to look for peer_missing[osd->whoami].
The primary keeps that in PG::missing!

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: PG::read_log: don't be clever with lost xattr
Colin Patrick McCabe [Mon, 22 Nov 2010 23:55:42 +0000 (15:55 -0800)]
osd: PG::read_log: don't be clever with lost xattr

Formerly, we had a special case in read_log for dealing with objects
whose data was present on the disk, but not their attributes. This
conflicts with our plans to mark objects as lost by putting a bit in the
object attributes, since without those attributes, we'll never know if
the objects were formerly marked as lost.

This should almost never happen, and if it does, we just handle the
objects as missing in the normal way.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoRename peer_summary_requested to peer_backlog_req
Colin Patrick McCabe [Fri, 19 Nov 2010 23:25:00 +0000 (15:25 -0800)]
Rename peer_summary_requested to peer_backlog_req

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoBuild might_have_unfound set at activation
Colin Patrick McCabe [Fri, 19 Nov 2010 23:02:46 +0000 (15:02 -0800)]
Build might_have_unfound set at activation

The might_have_unfound set is used by the primary OSD during recovery.
This set tracks the OSDs which might have unfound objects that the
primary OSD needs. As we receive Missing from each OSD in
might_have_unfound, we will remove the OSD from the set.

When might_have_unfound is empty, we will mark objects as LOST if the
latest version of the object resided on an OSD marked as lost.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
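
The bookkeeping described above can be sketched as a set that shrinks as each OSD reports. UnfoundTracker and on_missing_received() are hypothetical names, and OSDs are reduced to integer ids:

```cpp
#include <cassert>
#include <set>

// Hypothetical tracker for illustration; not the PG class.
struct UnfoundTracker {
    std::set<int> might_have_unfound;  // OSD ids still to be heard from

    // Record that `osd` sent its Missing set. Returns true once every OSD in
    // the set has reported, which is the point at which the primary could go
    // on to consider marking objects as lost.
    bool on_missing_received(int osd) {
        might_have_unfound.erase(osd);  // erasing an absent id is harmless
        return might_have_unfound.empty();
    }
};
```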
14 years agomonmaptool: Return a non-zero error code and print a useful error
Samuel Just [Tue, 23 Nov 2010 20:25:11 +0000 (12:25 -0800)]
monmaptool: Return a non-zero error code and print a useful error
message if unable to read the monmap file.

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agomds: allow for old fs's with stray instead of stray0
Sage Weil [Tue, 23 Nov 2010 17:43:49 +0000 (09:43 -0800)]
mds: allow for old fs's with stray instead of stray0

New fs's get stray0, but we want to still behave with old ones.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoMerge branch 'testing' into unstable
Sage Weil [Tue, 23 Nov 2010 17:37:13 +0000 (09:37 -0800)]
Merge branch 'testing' into unstable

Conflicts:
configure.ac

14 years agov0.23.1 v0.23.1
Sage Weil [Sun, 21 Nov 2010 23:23:29 +0000 (15:23 -0800)]
v0.23.1

14 years agomon: always use send_reply for auth replies
Sage Weil [Tue, 23 Nov 2010 06:41:57 +0000 (22:41 -0800)]
mon: always use send_reply for auth replies

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomon: simplify send_reply code
Sage Weil [Tue, 23 Nov 2010 06:41:42 +0000 (22:41 -0800)]
mon: simplify send_reply code

No need to specify destination in send_reply, as we always have the request
for reference.

Simplify MRoute constructors (keep the ones we use) for tid and bcast
best-effort case.

Do NOT do a best-effort forward of a reply with a tid specified if the tid
is not in the routed-request map.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoosd: add assert to _process_pg_info
Colin Patrick McCabe [Tue, 23 Nov 2010 01:37:55 +0000 (17:37 -0800)]
osd: add assert to _process_pg_info

When activating an inactive replica, assert that we are doing so based
on a message from the primary.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agoosd: re-indent some code in _process_pg_info
Colin Patrick McCabe [Tue, 23 Nov 2010 01:31:50 +0000 (17:31 -0800)]
osd: re-indent some code in _process_pg_info

Re-indent the code and add a comment.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
14 years agomsgr: tolerate 0 bytes from tcp_read_nonblocking
Sage Weil [Tue, 23 Nov 2010 00:12:10 +0000 (16:12 -0800)]
msgr: tolerate 0 bytes from tcp_read_nonblocking

This can happen, I believe, when we get a signal or something.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agoinit-ceph: fix (and test!) cleanlogs and cleanalllogs
Sage Weil [Mon, 22 Nov 2010 00:24:51 +0000 (16:24 -0800)]
init-ceph: fix (and test!) cleanlogs and cleanalllogs

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agomds: fix rejoin_scour_survivor_replicas inode check
Sage Weil [Mon, 22 Nov 2010 23:43:31 +0000 (15:43 -0800)]
mds: fix rejoin_scour_survivor_replicas inode check

We want to remove replicas that we don't ack, but those don't appear in
the strong_inode map; they're appended to the base_inode bufferlist.  Make
a (temporary) set to track who those are so that we know who to get rid of.

Signed-off-by: Sage Weil <sage@newdream.net>
14 years agotypes: Allow inodeno_t structs to alias.
Greg Farnum [Mon, 22 Nov 2010 23:04:22 +0000 (15:04 -0800)]
types: Allow inodeno_t structs to alias.

This removes a compiler warning that appeared in a gcc upgrade and
is apparently erroneous, about its usage violating strict-aliasing rules
when the + operator is used.

14 years agomessenger: init rc to -1, removing compiler warning.
Greg Farnum [Mon, 22 Nov 2010 23:02:54 +0000 (15:02 -0800)]
messenger: init rc to -1, removing compiler warning.

This actually is initialized before all uses, but compilers tend to
have trouble with assignment in if-else branches, and -1 is considered
invalid so there's no danger of refactoring breaking anything.

14 years agoCauses the MDSes to switch among a set of stray directories when
Samuel Just [Tue, 16 Nov 2010 23:29:40 +0000 (15:29 -0800)]
Causes the MDSes to switch among a set of stray directories when
switching to a new journal segment.

MDSCache:
The stray member has been replaced with strays, an array of inodes
representing the set of available stray directories, as well as
stray_index indicating the index of the current stray directory.

get_stray() now returns a pointer to the current stray directory
inode.

advance_stray() advances stray_index to the next stray directory.

migrate_stray no longer takes a source argument, the source mds
is inferred from the parent of the dir entry.

stray dir entries are now stray<index> rather than stray.

scan_stray_dir now scans all stray directories.

MDSLog:
start_new_segment now calls advance_stray() on MDSCache to force a new
stray directory.

mdstypes:
NUM_STRAY indicates the number of stray directories to use per MDS

MDS_INO_STRAY now takes an index argument as well as the mds number

MDS_INO_STRAY_OWNER(i) returns the mds owner of the stray directory i

MDS_INO_STRAY_INDEX(i) returns the index of the stray directory i

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
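
One way to picture the numbering scheme described above is a flat range of inode numbers, NUM_STRAY per MDS, with a cursor that moves to the next directory on each new journal segment. The constants, the encoding, and the wrap-around in advance_stray() are all assumptions made for this sketch; the names only mirror the macros listed in the message:

```cpp
#include <cassert>
#include <cstdint>

// Assumed values for illustration only.
constexpr uint64_t kNumStray = 10;      // stray directories per MDS
constexpr uint64_t kStrayBase = 0x600;  // first inode number used for strays

// Analogue of MDS_INO_STRAY(mds, index): pack owner and index into one ino.
constexpr uint64_t stray_ino(uint64_t mds, uint64_t index) {
    return kStrayBase + mds * kNumStray + index;
}

// Analogues of the owner/index accessors: unpack the two components.
constexpr uint64_t stray_ino_owner(uint64_t ino) {
    return (ino - kStrayBase) / kNumStray;
}
constexpr uint64_t stray_ino_index(uint64_t ino) {
    return (ino - kStrayBase) % kNumStray;
}

// Cursor over the current stray directory; assumed to wrap around.
struct StrayCursor {
    uint64_t index = 0;
    void advance_stray() { index = (index + 1) % kNumStray; }
};
```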
14 years agoTimer must be initialized in Client::init and shutdown in
Samuel Just [Mon, 22 Nov 2010 18:53:55 +0000 (10:53 -0800)]
Timer must be initialized in Client::init and shutdown in
Client::shutdown.

Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
14 years agogenerate_past_intervals:generate back to lastclean
Colin Patrick McCabe [Mon, 22 Nov 2010 18:32:18 +0000 (10:32 -0800)]
generate_past_intervals:generate back to lastclean

PG::generate_past_intervals needs to generate all the intervals back to
history.last_epoch_clean, rather than just to
history.last_epoch_started. This is required by
PG::build_might_have_unfound, which needs to examine these intervals
when building the might_have_unfound set.

Move the check for whether past_intervals is up-to-date into
generate_past_intervals itself. Fix the check.

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>