git.apps.os.sepia.ceph.com Git - ceph.git/log

commit | commitdiff | tree

Sage Weil [Wed, 2 Feb 2011 05:07:45 +0000 (21:07 -0800)]

confutils: check return values

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

commit | commitdiff | tree

Samuel Just [Tue, 1 Feb 2011 21:48:39 +0000 (13:48 -0800)]

FileStore: fix double close

curr_fd is already closed if cp == cur_seq. This second close
occasionally ended up closing another thread's fd. The next open would
tend to grab that fd in op_fd or current_fd which would then get closed
by the other thread leaving op_fd or current_fd pointing to some random
file (or a closed descriptor).

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
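
The double close described above is commonly avoided by invalidating a descriptor as soon as it is closed. The following is a minimal sketch of that idea, not the actual FileStore change; `close_once` is a hypothetical helper name.

```cpp
#include <cassert>
#include <unistd.h>

// Hypothetical helper (not from the Ceph tree): close a descriptor at most
// once by marking it invalid right after the close.
// Returns 0 if fd was already invalid, otherwise the result of close().
int close_once(int &fd) {
  if (fd < 0)
    return 0;        // already closed; the number may now belong to another thread
  int r = ::close(fd);
  fd = -1;           // a later call on the same variable becomes a no-op
  return r;
}
```

With this shape, a second call on the same variable cannot close a descriptor number that another thread has since been handed by open().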

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 1 Feb 2011 18:54:35 +0000 (10:54 -0800)]

common: config.cc: use "admin" as the default id

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 1 Feb 2011 17:18:31 +0000 (09:18 -0800)]

common: move init_g_conf into md_config_t ctor

Make sure that g_conf is initialized with default values before anything
else happens.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
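
As a rough illustration of why defaults belong in the constructor: any code that obtains the object sees valid values without a separate init call. The struct and the `debug_level` field below are hypothetical; only the "admin" default id comes from this log.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a configuration struct. Defaults are set in the
// constructor, so no separate init function has to run first.
struct demo_config_t {
  std::string id;     // "admin" is the default id mentioned in this log
  int debug_level;    // illustrative field, not a real option name

  demo_config_t() : id("admin"), debug_level(0) {}
};
```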

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 1 Feb 2011 17:21:18 +0000 (09:21 -0800)]

common: config.cc: whitespace cleanup

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 1 Feb 2011 16:33:29 +0000 (08:33 -0800)]

common: config.cc: de-globalize g_fake_kill_after

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 1 Feb 2011 16:13:27 +0000 (08:13 -0800)]

common: config.cc: de-globalize show_config

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 1 Feb 2011 16:06:12 +0000 (08:06 -0800)]

common: clean up g_conf.id initialization a bit

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 1 Feb 2011 14:23:12 +0000 (06:23 -0800)]

common: remove ceph_set_default_id

ceph_set_default_id was only ever used to set the default ID to "admin",
which it already was.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 1 Feb 2011 11:30:04 +0000 (03:30 -0800)]

FileStore: fix error handling for mkfs, umount

In FileStore::umount: check if FDs are valid before closing them. Make
them invalid after closing them. Shut down FileStore::timer.

In FileStore::mkfs: always properly shutdown and free the filestore if
an error is encountered during mkfs. Check all functions that can fail.
Print out error messages on failures.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Tue, 1 Feb 2011 17:28:15 +0000 (09:28 -0800)]

mds: make --dump-journal preserve offset

Suggest user use tar -S to preserve sparseness.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Tue, 1 Feb 2011 05:10:18 +0000 (21:10 -0800)]

gitignore: ignore eclipse metadata

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Tue, 1 Feb 2011 05:09:15 +0000 (21:09 -0800)]

remove ancient active/ stuff

commit | commitdiff | tree

Sage Weil [Tue, 1 Feb 2011 04:58:47 +0000 (20:58 -0800)]

osd: don't leak fd on error

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Tue, 1 Feb 2011 04:55:51 +0000 (20:55 -0800)]

crypto: don't clobber errno

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Tue, 1 Feb 2011 00:24:12 +0000 (16:24 -0800)]

Merge remote branch 'origin/ostimeo'

commit | commitdiff | tree

Sage Weil [Sun, 30 Jan 2011 05:34:05 +0000 (21:34 -0800)]

Merge branch 'mds_reset'

Fixes: #602

commit | commitdiff | tree

Sage Weil [Sun, 30 Jan 2011 05:17:06 +0000 (21:17 -0800)]

Merge remote branch 'origin/stable'

Conflicts:
src/osd/OSD.cc

commit | commitdiff | tree

Samuel Just [Fri, 28 Jan 2011 22:07:47 +0000 (14:07 -0800)]

OSD: update_osd_stat take heartbeat_lock

Previously update_osd_stat had a race with code modifying heartbeat_from
causing the iterator increment to occasionally segfault.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
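
The race above is the classic case of iterating a container while another thread modifies it. A minimal sketch of the fix's general shape follows, assuming a map guarded by one mutex; `HeartbeatPeers` and its members are hypothetical stand-ins, not the OSD's real types.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a peer table shared between threads.
class HeartbeatPeers {
  mutable std::mutex lock;    // plays the role of heartbeat_lock
  std::map<int, int> from;    // peer id -> epoch, plays the role of heartbeat_from

public:
  void add(int peer, int epoch) {
    std::lock_guard<std::mutex> l(lock);
    from[peer] = epoch;
  }

  void remove(int peer) {
    std::lock_guard<std::mutex> l(lock);
    from.erase(peer);
  }

  // The reader takes the same lock as the writers, so no element can be
  // erased while the iterator is being advanced.
  std::vector<int> snapshot_peers() const {
    std::lock_guard<std::mutex> l(lock);
    std::vector<int> out;
    for (std::map<int, int>::const_iterator p = from.begin(); p != from.end(); ++p)
      out.push_back(p->first);
    return out;
  }
};
```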

commit | commitdiff | tree

Sage Weil [Sat, 29 Jan 2011 00:56:22 +0000 (16:56 -0800)]

mds: skip a few more inodes during journal reset

To be safe...

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Sat, 29 Jan 2011 00:25:31 +0000 (16:25 -0800)]

mds: implement journal reset

This basically works. Remaining issues:
- mydir and root inodes are recreated from scratch but need to be
reconciled with what's committed (outside the old journal)
- ?

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Sat, 29 Jan 2011 00:36:14 +0000 (16:36 -0800)]

mds: open mydir (along w/ root) inode from boot_start()

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Sat, 29 Jan 2011 00:44:50 +0000 (16:44 -0800)]

Locker: Drop loner correctly!

Our previous check for whether we want to drop the loner was incorrect.
Now it's fixed. Resolves a serious bug with inode write access.

Reported-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Thu, 27 Jan 2011 00:25:59 +0000 (16:25 -0800)]

librados: fix C interface const, too.

See 561224e95d6c66661d1bd6dce0e3d9da6f4a7e13

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Fri, 28 Jan 2011 20:35:38 +0000 (12:35 -0800)]

mds: defer sending resolves until mdsmap.failed.empty()

There is no point sending resolves while there are still failed nodes,
since we can't complete. We also trigger an assert if we try to send to
a failed node. Instead just wait until failed.empty() and then start.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Fri, 28 Jan 2011 17:45:33 +0000 (09:45 -0800)]

mds: standardize option parsing

- Use the standard macros.
- Simplify --hot-standby and --journal-check options (always specify rank).
- Update usage().

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Colin Patrick McCabe [Fri, 28 Jan 2011 17:05:08 +0000 (09:05 -0800)]

common: _dout_lock: initialize _dout_lock first

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Fri, 28 Jan 2011 23:23:41 +0000 (15:23 -0800)]

config: remove dead stringtable cruft

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Colin Patrick McCabe [Thu, 27 Jan 2011 17:05:24 +0000 (09:05 -0800)]

os: FileStore: Add commit timeout

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Fri, 28 Jan 2011 11:59:13 +0000 (03:59 -0800)]

rbd: Rados::init: clean up after failure

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>

commit | commitdiff | tree

Colin Patrick McCabe [Thu, 27 Jan 2011 16:34:53 +0000 (08:34 -0800)]

os: FileStore: ctor should init all class vars

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Thu, 27 Jan 2011 16:25:27 +0000 (08:25 -0800)]

os: FileStore: remove default param

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Thu, 27 Jan 2011 15:37:44 +0000 (07:37 -0800)]

os: FileStore: use std::string rather than huge bufs

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Fri, 28 Jan 2011 09:24:49 +0000 (01:24 -0800)]

osd: fix mutual exclusion for _dispatch

We want only one thread dispatching messages (either new or requeued), so
that we can preserve ordering. Previously we weren't doing so for all
callers of do_waiters (tick() and the first in ms_dispatch()).

This fixes osd_sub_op(_reply) ordering problems that trigger the
now-famous repop queue assert.

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Fri, 28 Jan 2011 05:33:28 +0000 (21:33 -0800)]

Merge remote branch 'origin/health2' into unstable

commit | commitdiff | tree

Colin Patrick McCabe [Thu, 27 Jan 2011 12:45:29 +0000 (04:45 -0800)]

units: add signals unit test

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Thu, 27 Jan 2011 16:47:48 +0000 (08:47 -0800)]

mds: cluster_fail instead of reset_cluster

Mark all cluster members as failed, and blacklist. Do not force up/failed
ranks to stopped, as that requires the admin to do other trickery. This
keeps the cluster fail orthogonal to any journal discard/reset.

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Thu, 27 Jan 2011 16:34:16 +0000 (08:34 -0800)]

mon: add mdsmap DOWN flag to prevent mdsmap updates

This is intended to be set while doing critical cluster manipulation to
prevent cmds instances from starting up and getting added to the map.

Require it be set for cluster_reset.

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Thu, 27 Jan 2011 16:24:52 +0000 (08:24 -0800)]

mdsmap: add flags

Convert unused client_epoch field to flags to avoid a protocol change. It
is always 0 on current clusters. Lucky us!

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Thu, 27 Jan 2011 16:15:23 +0000 (08:15 -0800)]

mon: add 'mds reset_cluster' command

Reset an MDS cluster back to a single node. The idea is:

- wipe out mds journals
- maybe set recovery flag
- mds reset_cluster (this)

Then mds0 would only recover from an (empty) journal. Other MDS nodes would only
rejoin the cluster later.

See: #602
Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Thu, 27 Jan 2011 15:54:28 +0000 (07:54 -0800)]

.gitignore: vstart generated files

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Thu, 27 Jan 2011 15:53:44 +0000 (07:53 -0800)]

vstart: put tmp files in /tmp

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Wed, 26 Jan 2011 18:06:49 +0000 (10:06 -0800)]

osd: preserve ordering when ops are requeued

Requeue ops under osd_lock to preserve ordering wrt incoming messages.
Also drain the waiter queue when ms_dispatch takes the lock before calling
_dispatch(m).

Fixes: #743
Signed-off-by: Sage Weil <sage@newdream.net>
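
The ordering rule in this commit message can be sketched as follows: requeueing and dispatching share one lock, and the dispatcher drains the waiter queue before handling the incoming message. `DemoDispatcher` is a hypothetical class written for illustration, assuming string-labelled ops; it is not the OSD code.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical dispatcher showing "drain waiters first, under the same lock".
class DemoDispatcher {
  std::mutex lock;                   // plays the role of osd_lock
  std::deque<std::string> waiters;   // requeued ops, oldest first
  std::vector<std::string> handled;  // order in which ops were processed

  // Caller must hold lock.
  void drain_waiters() {
    while (!waiters.empty()) {
      handled.push_back(waiters.front());
      waiters.pop_front();
    }
  }

public:
  // Requeue under the lock so a concurrent dispatch cannot slip in between.
  void requeue(const std::string &op) {
    std::lock_guard<std::mutex> l(lock);
    waiters.push_back(op);
  }

  // Older requeued ops are processed before the newly arrived message.
  void dispatch(const std::string &m) {
    std::lock_guard<std::mutex> l(lock);
    drain_waiters();
    handled.push_back(m);
  }

  std::vector<std::string> order() {
    std::lock_guard<std::mutex> l(lock);
    return handled;
  }
};
```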

commit | commitdiff | tree

Sage Weil [Tue, 25 Jan 2011 23:28:49 +0000 (15:28 -0800)]

osd: restart if the osdmap client, heartbeat, OR cluster addrs don't match

If we somehow get ourselves into a situation where the OSDMap addresses do
not match our actual addresses, restart and try again. This is still
possible if multiple MOSDBoot messages end up in flight in the monitor,
say due to a monitor disconnect/reconnect, and we race with something that
marks us down in the map.

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Tue, 25 Jan 2011 23:04:06 +0000 (15:04 -0800)]

osd: avoid extraneous send_boot() calls

Only send_boot() on osdmap update if we are restarting.  Otherwise we can
end up with too many MOSDBoot messages in flight and the monitor may
apply an old one instead of a new one.  For example:

- cosd starts
- send_boot with address set A
- get an osdmap update
- send_boot again with address set A
- get an osdmap update; now we're up
- get an osdmap update; now we're marked down
- bind to address set B
- send_boot with address set B

and the monitor may apply the second MOSDBoot (with address set A).

This results in an online OSD using a cluster address that differs from
that in the OSDMap, which causes problems with peering, among other
things.

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Colin Patrick McCabe [Wed, 26 Jan 2011 17:29:39 +0000 (09:29 -0800)]

test_unfound.sh: kill cosds rather than mark out

For this test, we need to kill cosds rather than mark them as out.
Otherwise, we cannot force objects to become unfound.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 25 Jan 2011 11:00:29 +0000 (03:00 -0800)]

disable scrubs during test_unfound

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Wed, 26 Jan 2011 22:05:35 +0000 (14:05 -0800)]

librados: Remove rados_pool_t& usage, and pointless consts.

For some reason when I wrote this I passed rados_pool_t by reference
in some functions instead of by value. It's just a void*, so this is
silly.
Also silly, some of the passed-by-value rados_pool_ts were declared
to be const. WTF?

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Wed, 26 Jan 2011 14:11:01 +0000 (06:11 -0800)]

mon: implement PGMonitor::get_health

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Wed, 26 Jan 2011 13:54:10 +0000 (05:54 -0800)]

mon: OSDMonitor::get_health: const cleanup

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Wed, 26 Jan 2011 13:47:28 +0000 (05:47 -0800)]

mon: MonitorStore::mkfs: use run_cmd

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Wed, 26 Jan 2011 11:27:36 +0000 (03:27 -0800)]

os: FileStore: use run_cmd instead of system

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>

commit | commitdiff | tree

Colin Patrick McCabe [Wed, 26 Jan 2011 11:27:15 +0000 (03:27 -0800)]

common: Add run_cmd

Signed-off-by: Colin McCabe <colinm@hq.newdream.net>

commit | commitdiff | tree

Samuel Just [Tue, 25 Jan 2011 21:58:36 +0000 (13:58 -0800)]

ReplicatedPG: _rollback_to fix the just cloned condition

_rollback_to does not need to do anything in the case where head was just
cloned and that clone includes snapid. Previously, snapid would have to
match the snap on the clone, but the condition should be that snapid is
contained within the clone's snaps set.

This bug was introduced in e189222f06ee287eeb6fd7f46cff7a6727806dea

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
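
The difference between the two conditions can be shown in a few lines. This is a sketch with hypothetical helper names and a plain vector standing in for the clone's snaps set; the real ReplicatedPG types are not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Membership test: does the clone cover this snapid at all?
bool clone_covers_snap(const std::vector<uint64_t> &clone_snaps, uint64_t snapid) {
  return std::find(clone_snaps.begin(), clone_snaps.end(), snapid) != clone_snaps.end();
}

// The narrower check the message describes as the bug: compare against a
// single snap of the clone only.
bool clone_snap_equals(uint64_t clone_snap, uint64_t snapid) {
  return clone_snap == snapid;
}
```

A clone covering snaps 4, 5 and 6 satisfies the membership test for snapid 5, while an equality check against the clone's single snap 6 would reject it.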

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 25 Jan 2011 17:40:18 +0000 (09:40 -0800)]

mon: remove PGMap::pg_set

We don't need an additional data structure to hold the keys to pg_stat.
We can just look at the keys of pg_stat.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
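
The point about redundant bookkeeping can be illustrated with a generic helper: the key set is derived from the map on demand, so there is no second structure to keep in sync. `keys_of` is a hypothetical name, and the map below only stands in for pg_stat.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Derive the key set from the map itself instead of maintaining it separately.
template <typename K, typename V>
std::set<K> keys_of(const std::map<K, V> &m) {
  std::set<K> out;
  for (typename std::map<K, V>::const_iterator p = m.begin(); p != m.end(); ++p)
    out.insert(p->first);
  return out;
}
```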

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 25 Jan 2011 16:58:59 +0000 (08:58 -0800)]

mon: PGMap::apply_incremental must maintain pg_set

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 25 Jan 2011 15:09:06 +0000 (07:09 -0800)]

os: readdir_r: read into PATH_MAX-sized buf

Fix the readdir_r uses in FileStore.cc

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Tue, 25 Jan 2011 22:07:48 +0000 (14:07 -0800)]

dumper: rework slightly to prevent incorrect usage of g_conf.id.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Tue, 25 Jan 2011 22:07:02 +0000 (14:07 -0800)]

MDSMonitor: fix bugs with standby-replay assignment.

We were accidentally passing gid instead of rank into find_standby_for!
Also, if we got an MDS with rank -1 we went ahead and used it. Broke
up the if statement tests to make sure that doesn't happen again.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Tue, 25 Jan 2011 13:45:57 +0000 (05:45 -0800)]

os: FileStore::mkfs error handling fixes

Clean up all resources on every exit path. Don't allocate multiple
PATH_MAX buffers on the stack when one will do. Fix errno misuses.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Tue, 25 Jan 2011 17:05:38 +0000 (09:05 -0800)]

vstart: Add --standby_mds setting, for auto-creating standby-replays.

commit | commitdiff | tree

Greg Farnum [Tue, 25 Jan 2011 17:08:57 +0000 (09:08 -0800)]

Merge branch 'standby_replay' into unstable

commit | commitdiff | tree

Sage Weil [Tue, 25 Jan 2011 16:50:58 +0000 (08:50 -0800)]

Merge branch 'testing' into unstable

Conflicts:
configure.ac
src/Makefile.am
src/common/common_init.cc
src/common/debug.h
src/common/signal.cc
src/config.cc
src/mon/MDSMonitor.cc
src/msg/SimpleMessenger.cc
src/osd/OSD.cc
src/osd/ReplicatedPG.cc

commit | commitdiff | tree

Sage Weil [Tue, 25 Jan 2011 16:38:10 +0000 (08:38 -0800)]

debian: fix publish.sh for ubuntu

commit | commitdiff | tree

Sage Weil [Mon, 24 Jan 2011 20:53:22 +0000 (12:53 -0800)]

v0.24.2

commit | commitdiff | tree

Greg Farnum [Mon, 24 Jan 2011 19:06:46 +0000 (11:06 -0800)]

Merge branch 'unstable' into standby_replay

commit | commitdiff | tree

Sage Weil [Mon, 24 Jan 2011 18:59:21 +0000 (10:59 -0800)]

msgr: make connection pipe reset atomic

Close a small and unlikely race.

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Mon, 24 Jan 2011 18:58:42 +0000 (10:58 -0800)]

msgr: include con in debug output

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Fri, 21 Jan 2011 18:43:53 +0000 (10:43 -0800)]

filestore: don't wait min sync interval on explicit sync()

Also, if we do wait longer, wait on the same cond.

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Greg Farnum [Mon, 24 Jan 2011 18:56:39 +0000 (10:56 -0800)]

MDSMonitor: Don't create new map for standby-replay spam.

If an MDS is unable to get into the standby-replay state for some
reason (MDS it should be following doesn't exist yet, there aren't
any open MDSes, etc) it will spam the Monitor with beacons asking
to change state. These will always go to prepare_beacon since
they're asking for a state change, but can't be granted.
When this happens, return false, not true!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Mon, 24 Jan 2011 18:38:59 +0000 (10:38 -0800)]

MDSMonitor: remove unused code.

commit | commitdiff | tree

Greg Farnum [Mon, 24 Jan 2011 18:30:07 +0000 (10:30 -0800)]

MDSMonitor: be more conservative with use of pending_mdsmap.

Use the current mdsmap when looking for MDSes to standby-replay for,
as that way we know the other MDS is already up. Otherwise we could
try and come up together and potentially race.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Mon, 24 Jan 2011 18:28:22 +0000 (10:28 -0800)]

MDS: MDSMonitor: Make MDS set standby-replay preferences, not MDSMonitor.

The MDS has more information about its configuration than the MDSMonitor
does. Therefore, encode that information into the standby_for_rank,
and let the monitor just operate based off that. This reduces magic
numbers and should be more robust.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Thu, 20 Jan 2011 22:07:34 +0000 (14:07 -0800)]

man: Update cmds manual.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Sat, 22 Jan 2011 01:43:47 +0000 (17:43 -0800)]

MDSMap: Update/fix print function.

It previously didn't look at standby_for_name unless standby_for_rank
was set!

Also, we now let it print out standby_for_rank on any value that isn't
set to the default (-1), since -2 means something.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Thu, 20 Jan 2011 21:56:06 +0000 (13:56 -0800)]

MDSMonitor: On restarting MDSes; set to standby-replay if appropriate.

This way, if the primary MDS crashes and is replaced, but is supposed
to standby-replay its secondary on recovery, it will do so.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Thu, 20 Jan 2011 21:37:00 +0000 (13:37 -0800)]

MDSMonitor: Try to assign unassigned standby-replay MDSes during tick()

We can now specify an MDS as standby-replay and let the monitor
assign it to any MDS. The monitor will only assign it to an
MDS that doesn't already have a hot standby, though.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Thu, 20 Jan 2011 21:29:05 +0000 (13:29 -0800)]

MDSMap: split up find_standby_for into multiple functions.

Usage of this function is rapidly diverging, in terms of what
is desired.
We now have "find_standby_for", which selects a standby for a
particular MDS (preferring those specifically set to the MDS, then
those asking to standby for somebody); "find_unused_for" which
finds an unused MDS which does NOT want to be a standby-replay
or a standby for anybody in particular; and "find_replacement_for",
which takes any MDS it can get but prefers those which have been
shadowing the given MDS or standing by for it specifically.

Since we've demoted standby_for_name to a third-class check, don't use
it when considering whether an mds is unused.
This field is now used to pair MDSes, not to bind them to particular
ranks. So the "primary" of the pairing will have standby_for_name
set but still expects to be assigned a rank.

Also, of course, switch the MDSMonitor to call the correct function
in the correct places.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Thu, 20 Jan 2011 19:57:23 +0000 (11:57 -0800)]

MDSMonitor: Adjust handling of MDSes asking for standby-replay.

1) If the MDS does not specify an MDS to follow, we mark them as
standing-by for -2. MDSMap::find_standby_for() has been modified
to grab these MDSes.
2) If an MDS asks for standby-replay and specifies a name but
not a rank, fill in the rank if the named MDS is known to us. If it
is not known, do nothing.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Wed, 19 Jan 2011 23:59:46 +0000 (15:59 -0800)]

mds: Adjust replay state changes and options parsing.

The MDS used to interpret g_conf.id as a rank. It no longer does
so and requires that standby ranks/names be set via the g_conf options,
or else along with the replay command in the CLI. Remove the MDS versions
of standby_for_[rank|name] and just use the ones in g_conf for simplicity.

However, the MDS only looks at the rank when switching to standby;
making names usable will require an update to the MDSMonitor code to
plug in ranks from names.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Mon, 24 Jan 2011 17:45:54 +0000 (09:45 -0800)]

os: fix minor typo in function defs

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Mon, 24 Jan 2011 17:24:50 +0000 (09:24 -0800)]

os: fix some obvious error handling problems

Fix some errors, such as checking errno when it may not have been set, or
performing other operations that may change the value of errno before
checking it.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
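
Both errno pitfalls named above have the same remedy: read errno only after a call has reported failure, and copy it before doing anything else. A minimal sketch, assuming POSIX open(); `open_readonly` is a hypothetical wrapper, not a function from the os/ code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Returns a descriptor on success or a negative errno value on failure.
int open_readonly(const char *path) {
  int fd = ::open(path, O_RDONLY);
  if (fd < 0) {
    int err = errno;  // capture immediately; valid only because open() failed
    // Logging may itself change errno, so only the saved copy is used below.
    std::fprintf(stderr, "open(%s) failed: errno %d\n", path, err);
    return -err;
  }
  return fd;
}
```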

commit | commitdiff | tree

Colin Patrick McCabe [Mon, 24 Jan 2011 00:09:14 +0000 (16:09 -0800)]

Makefile: use new Spirit headers where available

Use new boost::spirit header files where available, to eliminate the
annoying compiler warning on newer systems.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Sun, 23 Jan 2011 23:52:41 +0000 (15:52 -0800)]

Makefile: remove unnecessary header check

We already check for libcrypto++ using PKG_CHECK_MODULES; we don't need
to fish for header files.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Sun, 23 Jan 2011 23:42:39 +0000 (15:42 -0800)]

Makefile: use CXXFLAGS more consistently

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Mon, 24 Jan 2011 15:34:40 +0000 (07:34 -0800)]

test: Add test_rw

Test reading and writing lots of objects from the object store.

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Sat, 22 Jan 2011 01:42:02 +0000 (17:42 -0800)]

messages: Let MMDSBeacon set_standby_for_name from a c-string.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Wed, 19 Jan 2011 22:53:32 +0000 (14:53 -0800)]

config: add new mds_standby options.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Fri, 21 Jan 2011 19:07:52 +0000 (11:07 -0800)]

mds: Keep journaler in readonly mode until replay completes.

Previously we were switching it off for the final non-standby replay
when a standby-replay got activated. This caused issues
since the states weren't quite correct!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

commit | commitdiff | tree

Samuel Just [Fri, 21 Jan 2011 21:01:20 +0000 (13:01 -0800)]

ReplicatedPG: fix snap_trimmer log version bug

Previously, ctx->at_version would be the same as ctx->obs->oi.version
leading to the log entry having prior_version == version.
This bug was introduced in d1b85e06fb5ce1cfd5bbc74ba639811b92033909.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>

commit | commitdiff | tree

Greg Farnum [Fri, 21 Jan 2011 22:20:16 +0000 (14:20 -0800)]

FileJournal: don't overflow the journal size.

Previously we were casting it to a uint64_t, but the left shift
occurs before the cast, so we were overflowing in some circumstances.
Split these up to prevent it.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
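
The cast-after-shift problem is easy to reproduce in isolation. The sketch below assumes a 32-bit unsigned size in megabytes so that the wraparound is well defined; the actual variable types in FileJournal are not shown in this log, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Buggy shape: the shift is evaluated in 32-bit arithmetic, and only the
// already-wrapped result is widened to 64 bits.
uint64_t mb_to_bytes_shift_then_cast(uint32_t mb) {
  return (uint64_t)(mb << 20);
}

// Fixed shape: widen first, then shift, as two separate steps.
uint64_t mb_to_bytes_cast_then_shift(uint32_t mb) {
  uint64_t bytes = mb;
  bytes <<= 20;
  return bytes;
}
```

For a 4096 MB journal the first version wraps to 0, while the second yields 4294967296 bytes; for small sizes the two agree, which is why the bug only appears in some circumstances.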

commit | commitdiff | tree

Colin Patrick McCabe [Fri, 21 Jan 2011 15:03:01 +0000 (07:03 -0800)]

msgr: don't need to reinstall signals after daemon

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Fri, 21 Jan 2011 14:45:40 +0000 (06:45 -0800)]

mds: respawn must unblock signals before exec

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Fri, 21 Jan 2011 14:27:55 +0000 (06:27 -0800)]

common: move signal blocking into signal.cc

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Colin Patrick McCabe [Fri, 21 Jan 2011 13:45:01 +0000 (05:45 -0800)]

common: add signal_mask_to_str

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>

commit | commitdiff | tree

Sage Weil [Fri, 21 Jan 2011 18:08:26 +0000 (10:08 -0800)]

msgr: always start reaper

If we didn't explicitly bind (i.e. are a client), then we don't start
the accepter. That's fine. But the reaper thread start was also
conditional, when it shouldn't be; otherwise the client can't clean up
old Pipes (and their sockets).

Fixes: #732
Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sage Weil [Fri, 21 Jan 2011 17:35:31 +0000 (09:35 -0800)]

monclient: fix locking

Hold lock in handle_* methods; assert lock held in all _* methods.

Fixes: #731
Signed-off-by: Sage Weil <sage@newdream.net>
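
The convention stated above (public handle_* methods take the lock, internal _* methods assert it is held) can be sketched like this. `DemoClient` and its members are hypothetical; the `lock_held` flag is an assumption made here because std::mutex offers no way to query ownership.

```cpp
#include <cassert>
#include <mutex>

class DemoClient {
  std::mutex lock;
  bool lock_held;   // debug-only tracking, written only while lock is held
  int renewals;

  // "_" prefix: the caller must already hold the lock.
  void _renew() {
    assert(lock_held);
    ++renewals;
  }

public:
  DemoClient() : lock_held(false), renewals(0) {}

  // handle_* entry point: takes the lock, then calls the _* helper.
  void handle_reply() {
    std::lock_guard<std::mutex> l(lock);
    lock_held = true;
    _renew();
    lock_held = false;
  }

  int get_renewals() {
    std::lock_guard<std::mutex> l(lock);
    return renewals;
  }
};
```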

commit | commitdiff | tree

Colin Patrick McCabe [Fri, 21 Jan 2011 14:45:40 +0000 (06:45 -0800)]

mds: respawn must unblock signals before exec

Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
