Sage Weil [Thu, 11 Nov 2010 04:58:49 +0000 (20:58 -0800)]
mds: fix null_snapflush with multiple intervening snaps
The client is allowed to not send a snapflush if there is no dirty metadata
to write for a given snap. However, the mds can only look up inodes by
the last snapid in the interval. So, when doing a null_snapflush (filling
in for snapflushes the client didn't send), we have to walk forward through
intervening snaps until we find the right inode.
Note that this means we will call _do_snap_update multiple times on the
same inode, but with different snapids.
Sage Weil [Wed, 10 Nov 2010 23:33:31 +0000 (15:33 -0800)]
osd: call sched_scrub on reserve reply
Otherwise we have to wait until the next time the timer calls it. During
that window we hold a reservation locally, no other peer can reserve a
scrub from us, and nobody makes any progress.
PG::search_for_missing: when we find a previously unfound object, check
to see if there is an entry in waiting_for_missing_object representing a
client waiting for this object.
PG::repair_object: assert that waiting_for_missing_object is empty
before messing with missing_loc. It definitely should be during a scrub.
ReplicatedPG role change logic: always take_object_waiters on the wait
queues when the PG acting set changes.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
OSD::_process_pg_info: sometimes call search_for_missing
OSD::_process_pg_info: If we're the primary for this active PG, and we
have missing objects, call search_for_missing. This should ensure that
we know where to find our missing objects.
The reason why this wasn't there before is that previously, we kept the
PG from going active until all the missing objects were found.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Erase the code in PG::peer that used to keep us from becoming active
when objects were still unfound. Print out the number of missing and
unfound objects at the end of PG::peer.
Erase PG::check_for_lost_objects and PG::forget_lost_objects.
Sage Weil [Wed, 10 Nov 2010 17:03:37 +0000 (09:03 -0800)]
objecter: throttle before looking at lock protected state
The take_op_budget() may drop our lock if we are in keep_balanced_budget
mode, so we need to do that _before_ we take references to internal state
that may change out from under us during that time.
This fixes a crash like
./osd/OSDMap.h: In function 'entity_inst_t OSDMap::get_inst(int)':
./osd/OSDMap.h:460: FAILED assert(exists(osd) && is_up(osd))
ceph version 0.22.1 (commit:c6f403a6f441184956e00659ce713eaee7014279)
1: (Objecter::op_submit(Objecter::Op*)+0x6c2) [0x38658854c2]
2: /usr/lib64/librados.so.1() [0x3865855dc9]
3: (RadosClient::aio_write(RadosClient::PoolCtx&, object_t, long,
ceph::buffer::list const&, unsigned long,
RadosClient::AioCompletion*)+0x24b) [0x386585724b]
4: (rados_aio_write()+0x9a) [0x386585741a]
5: /usr/bin/qemu-kvm() [0x45a305]
6: /usr/bin/qemu-kvm() [0x45a430]
7: /usr/bin/qemu-kvm() [0x43bb73]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (ABRT) ***
ceph version 0.22.1 (commit:c6f403a6f441184956e00659ce713eaee7014279)
1: (sigabrt_handler(int)+0x91) [0x3865922b91]
2: /lib64/libc.so.6() [0x3c0c032a30]
3: (gsignal()+0x35) [0x3c0c0329b5]
4: (abort()+0x175) [0x3c0c034195]
5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x3c110beaad]
We need to ensure that buckets are output after their dependencies. The
best way to do this is a depth-first traversal of the bucket directed
acyclic graph. The previous solution was incorrect because in some
cases it didn't traverse the graph in the right order.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
All the callers of CrushWrapper::get_bucket() check for error codes, but
not for NULL returns. So if there is no bucket (i.e., a NULL pointer) at
crush->bucket[i], just return the error code ENOENT. This is consistent
with how we handle other out-of-bounds requests.
Also, don't allow the caller to get us to try to access negative indices
in crush->bucket.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Sage Weil [Tue, 9 Nov 2010 17:55:14 +0000 (09:55 -0800)]
mds: fix inode freeze auth pin allowance
When we're renaming across nodes, we need to freeze the inode. This
requires that we allow for the auth_pins that _we_ hold, which include
one because of the linklock xlock, and one by the MDRequest.
In crushtool, dump buckets in tree order. Buckets which reference other
buckets must be dumped after their dependencies, or else re-compilation
will fail.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Sage Weil [Sat, 6 Nov 2010 18:35:54 +0000 (11:35 -0700)]
mds: remove MIX_STALE
Yay, we don't need it!
If we can't update the frag on scatter, fine. The staleness of the frag
is implicit in the frag's scatter stat version not matching the inode's.
If/when we do want to update it, the frag will clearly be writable, and
we can bring it back in sync then.
Sage Weil [Sun, 7 Nov 2010 03:17:32 +0000 (20:17 -0700)]
mds: don't use helper for rename srcdn
The rdlock_path_xlock_dentry helper works for _auth_ dentries that we
create locally in an auth dirfrag. For the srcdn, we need to discover an
_existing_ dentry that is not necessarily auth.
Call path_traverse ourselves, but be careful to take the appropriate locks
on the resulting dn, dir, and ancestors.
Sage Weil [Sat, 6 Nov 2010 18:02:13 +0000 (11:02 -0700)]
mds: never complete a gather on a flushing lock
The scatter_writebehind() takes a wrlock, but that may still allow the lock
to complete a gather to LOCK and even move to say MIX before the data is
committed. Bad news!
Sage Weil [Sat, 6 Nov 2010 04:52:28 +0000 (21:52 -0700)]
mds: preserve stale state on import; some cleanup
Our new invariant is that MIX_STALE always implies is_stale(). And on
import, if is_stale(), MIX becomes MIX_STALE. This ensures that a replica
that we put into MIX_STALE doesn't turn back into MIX if we import it
and take the auth's state in CInode::decode_import().
Previously I changed the std::multimap decoder to minimize the number of
constructor invocations. However, it could be much more expensive to
copy an initialized (decoded) val_t than to copy an empty one. For
example, if we are decoding std::multimap < int, std::set <int> >. So
change the code to insert a non-decoded val_t again.
However, this still saves two constructor invocations over the original.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Samuel Just [Thu, 21 Oct 2010 23:54:01 +0000 (16:54 -0700)]
PG.cc: build_scrub_map now drops the PG lock while scanning the PG
build_inc_scrub_map scans all files modified since the given
version number and creates an incremental scrub map to
be merged with a scrub map created with build_scrub_map.
This scan is done while holding the pg lock.
ScrubMap.objects is now represented as a map rather than as
a vector.
PG.h: Added last_update_applied and finalizing_scrub members to
PG.
ReplicatedPG.cc:
calc_trim_to will not trim the log during a scrub (since
replicas need the log to construct incremental maps)
sub_op_modify_applied and op_applied maintain a
last_update_applied PG member to be used for determining
how far back a replica needs to go to construct an
incremental scrub map.
osd_types.h:
Added merge_incr method for combining a scrub map with
a subsequent incremental scrub map.
ScrubMap.objects is now a map from sobject_t to object.
PG scrubs will now drop the PG lock while initially scanning the PG
collection allowing writes to continue. The scrub map will be tagged
with the most recent version applied. After halting writes, the
primary will request an incremental map from any replicas whose map
versions do not match log.head.
Signed-off-by: Samuel Just <samuelj@hq.newdream.net>