git.apps.os.sepia.ceph.com Git - ceph.git/log
13 years ago rbd: the showmapped command shouldn't connect to the cluster
Josh Durgin [Wed, 30 Nov 2011 18:26:22 +0000 (10:26 -0800)]
rbd: the showmapped command shouldn't connect to the cluster

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago ceph-rbdnamer: include snapshot name if present
Josh Durgin [Tue, 29 Nov 2011 01:24:25 +0000 (17:24 -0800)]
ceph-rbdnamer: include snapshot name if present

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago rbd, mount.ceph: use pre-stored secret if available
Josh Durgin [Tue, 29 Nov 2011 01:02:15 +0000 (17:02 -0800)]
rbd, mount.ceph: use pre-stored secret if available

If a secret is specified, store and use it, but otherwise
check for a pre-existing secret to use.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago secret: add is_kernel_secret function
Josh Durgin [Tue, 29 Nov 2011 00:54:27 +0000 (16:54 -0800)]
secret: add is_kernel_secret function

This will let us know whether we can add a key mount option
if no secret is specified.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago secret: fix error check
Josh Durgin [Tue, 29 Nov 2011 00:52:19 +0000 (16:52 -0800)]
secret: fix error check

add_key will return -1 when an error occurs, which should be handled at a higher level and not printed here.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago rbd: allow snapshots to be mapped
Josh Durgin [Wed, 23 Nov 2011 21:54:08 +0000 (13:54 -0800)]
rbd: allow snapshots to be mapped

unmap and showmapped already support snapshots. map should too.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago test_rados.py: clean up after EEXIST test
Josh Durgin [Tue, 6 Dec 2011 16:34:19 +0000 (08:34 -0800)]
test_rados.py: clean up after EEXIST test

This extra pool caused subsequent pool tests to fail.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago Merge remote branch 'gh/stable'
Sage Weil [Tue, 6 Dec 2011 01:33:57 +0000 (17:33 -0800)]
Merge remote branch 'gh/stable'

13 years ago doc: fix rst syntax
Sage Weil [Tue, 6 Dec 2011 00:16:35 +0000 (16:16 -0800)]
doc: fix rst syntax

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago doc: document monitor cluster expansion/contraction
Sage Weil [Mon, 5 Dec 2011 22:07:44 +0000 (14:07 -0800)]
doc: document monitor cluster expansion/contraction

Pretty sure my rst syntax is wrong.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago cephtool: fix shutdown
Sage Weil [Mon, 5 Dec 2011 21:38:02 +0000 (13:38 -0800)]
cephtool: fix shutdown

Fix 'ceph -w' brokenness from commit ad13d0b7.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago filejournal: make FileJournal::open() arg slightly less weird
Sage Weil [Mon, 5 Dec 2011 17:36:54 +0000 (09:36 -0800)]
filejournal: make FileJournal::open() arg slightly less weird

Pass in fs_op_seq (last_committed_seq), not the next expected seq, so we
can avoid subtracting and adding 1 in odd places.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge branch 'stable'
Sage Weil [Mon, 5 Dec 2011 19:21:08 +0000 (11:21 -0800)]
Merge branch 'stable'

13 years ago vstart.sh: .ceph_keyring -> keyring
Sage Weil [Mon, 5 Dec 2011 17:23:56 +0000 (09:23 -0800)]
vstart.sh: .ceph_keyring -> keyring

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago filejournal: remove bogus check in read_entry
Sage Weil [Mon, 5 Dec 2011 18:52:24 +0000 (10:52 -0800)]
filejournal: remove bogus check in read_entry

It is perfectly fine to read events that are older than the fs's seq from
the journal; open() will skip them when positioning the read pointer on
open.

Also, this code is nonsensical; it always failed the assertion.

Signed-off-by: Sage Weil <sage@newdream.net>
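
Editor's sketch for the filejournal commits around this point: a minimal Python model of skipping already-committed journal entries at replay time. This is not the FileJournal code. The function name `entries_to_replay` and the `(seq, payload)` tuple layout are assumptions made for illustration; only the idea that entries no newer than the filesystem's committed seq are skipped comes from the commit message.

```python
def entries_to_replay(journal_entries, fs_op_seq):
    """Return the journal entries that still need to be applied to the fs.

    journal_entries: iterable of (seq, payload) tuples in journal order
                     (hypothetical layout, for illustration only).
    fs_op_seq: last sequence number already committed to the filesystem.
    """
    # Entries at or below fs_op_seq are legal to read, but they are skipped
    # because the filesystem already contains their effects.
    return [(seq, payload) for seq, payload in journal_entries if seq > fs_op_seq]
```

Passing the last committed seq directly means the caller never has to add or subtract 1 to express "start after this entry".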
13 years ago filejournal: set last_committed_seq based on fs, not journal
Sage Weil [Mon, 5 Dec 2011 17:34:44 +0000 (09:34 -0800)]
filejournal: set last_committed_seq based on fs, not journal

last_committed_seq is the last seq committed to the fs, not the journal.
Set it when we begin replay with the fs provided value, not from the newest
entry in the journal.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: stub perfcounters for monitor, cluster
Sage Weil [Fri, 2 Dec 2011 23:35:38 +0000 (15:35 -0800)]
mon: stub perfcounters for monitor, cluster

The 'mon' perfcounter is for the local daemon and is always registered.

The 'cluster' perfcounter is for cluster state, and is only registered
(and thus only shows up via the admin socket) when the current daemon is
part of the cluster quorum.

No actual counters yet.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: safely requeue waiting_for_ondisk waiters on_role_change
Sage Weil [Fri, 2 Dec 2011 23:25:37 +0000 (15:25 -0800)]
osd: safely requeue waiting_for_ondisk waiters on_role_change

This could conceivably cause the reply ordering mismatch seen in bug
#1490.  Not sure why we didn't also fix this caller when we fixed that
bug last time :).

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: rename {take -> requeue}_object_waiters
Sage Weil [Fri, 2 Dec 2011 23:24:00 +0000 (15:24 -0800)]
osd: rename {take -> requeue}_object_waiters

It calls osd->requeue_ops(), so make naming more consistent and avoid
confusing people like me.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago rados.py: add list_pools method
Josh Durgin [Fri, 2 Dec 2011 21:17:34 +0000 (13:17 -0800)]
rados.py: add list_pools method

Signed-off-by: Eric Chen <Eric_YH_Chen@wistron.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago Merge branch 'stable'
Sage Weil [Fri, 2 Dec 2011 20:06:50 +0000 (12:06 -0800)]
Merge branch 'stable'

13 years ago Doc: add a conceptual overview of the peering process
Mark Kampe [Fri, 2 Dec 2011 19:26:20 +0000 (11:26 -0800)]
Doc: add a conceptual overview of the peering process

Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
13 years ago mds: remove obsolete doc
Sage Weil [Fri, 2 Dec 2011 19:19:21 +0000 (11:19 -0800)]
mds: remove obsolete doc

13 years ago crush: ignore forcefed input that doesn't exist
Sage Weil [Fri, 2 Dec 2011 17:58:45 +0000 (09:58 -0800)]
crush: ignore forcefed input that doesn't exist

This might happen if, e.g., the file_layout specifies an osd that later
is removed from the cluster entirely.  Just ignore it instead of making
upper layers duplicate this check.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
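
Editor's sketch: a toy Python model of "ignore a forced input that no longer exists". It is not CRUSH and does not reproduce its placement algorithm. `choose_osds`, `known_devices`, and the fill-from-candidates step are all hypothetical, chosen only to show where the existence check lives so that callers need not repeat it.

```python
def choose_osds(candidates, num_rep, forcefeed=-1, known_devices=frozenset()):
    """Toy placement: optionally put `forcefeed` first, then fill from candidates.

    All names here are illustrative assumptions, not CRUSH internals.
    """
    result = []
    # A forced device that is not (or no longer) in the map is silently
    # ignored here, so upper layers do not have to duplicate the check.
    if forcefeed >= 0 and forcefeed in known_devices:
        result.append(forcefeed)
    for osd in candidates:
        if len(result) >= num_rep:
            break
        if osd not in result:
            result.append(osd)
    return result
```

With `forcefeed=7` and a map that only knows devices 0 to 2, the call falls back to ordinary selection instead of returning an error.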
13 years ago Revert "CrushWrapper: ignore forcefeed if it does not exist"
Sage Weil [Fri, 2 Dec 2011 17:47:58 +0000 (09:47 -0800)]
Revert "CrushWrapper: ignore forcefeed if it does not exist"

This reverts commit 6fbab6da6942c238d40a6b4f1680a7e6da463289.

This fails a unit test.

And I change my mind.. I think this is most cleanly handled inside crush, so
we don't duplicate the same check that is generating the error with a different
data structure.

13 years ago v0.39 v0.39
Sage Weil [Fri, 2 Dec 2011 17:01:31 +0000 (09:01 -0800)]
v0.39

13 years ago OSDMap: build_simple_from_conf pg_num should not be 0 with one osd
Samuel Just [Fri, 2 Dec 2011 00:28:03 +0000 (16:28 -0800)]
OSDMap: build_simple_from_conf pg_num should not be 0 with one osd

Previously, pg_num would end up set to 0 if osd.0 is the only osd.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago objecter: initialize global_op_flags to zero
Sage Weil [Fri, 2 Dec 2011 04:36:13 +0000 (20:36 -0800)]
objecter: initialize global_op_flags to zero

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago Doc: delete gratuitous index.html
Mark Kampe [Thu, 1 Dec 2011 23:58:32 +0000 (15:58 -0800)]
Doc: delete gratuitous index.html

It was not an index, and seems to contain recommendations
for system configuration.  I have renamed it to confusing.txt
and will merge it in a future commit.

Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
13 years ago Doc: complete reversion of architecture.rst
Mark Kampe [Thu, 1 Dec 2011 23:34:57 +0000 (15:34 -0800)]
Doc: complete reversion of architecture.rst

(abandon in progress improvements until everything works)

Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
13 years ago Doc: deleted gratuitous PlanningImplementation.html,
Mark Kampe [Thu, 1 Dec 2011 23:29:00 +0000 (15:29 -0800)]
Doc: deleted gratuitous PlanningImplementation.html,

which was a copy of PlanningImplementation.txt
(and not html at all).

restored previous index.rst, which was overwritten with a copy
of PlanningImplementation.txt, but removed all of the recursively
included content from the document.

I will cherry-pick merge the new contents in a subsequent commit.

Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
13 years ago Doc: Restore the previous version of architecture.rst
Mark Kampe [Thu, 1 Dec 2011 23:22:15 +0000 (15:22 -0800)]
Doc: Restore the previous version of architecture.rst

it was accidentally overwritten with a version that
had a somewhat different audience/focus and a few sphinx
formatting errors.

I will cherry-pick the corrections in a subsequent commit.

Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
13 years ago doc: change state model from .svg to .png
Mark Kampe [Thu, 1 Dec 2011 22:59:24 +0000 (14:59 -0800)]
doc: change state model from .svg to .png

Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
13 years ago fixed ubuntu version typo
Steve MacGregor [Mon, 31 Oct 2011 19:31:41 +0000 (15:31 -0400)]
fixed ubuntu version typo

13 years ago CrushWrapper: ignore forcefeed if it does not exist
Samuel Just [Tue, 29 Nov 2011 01:30:37 +0000 (17:30 -0800)]
CrushWrapper: ignore forcefeed if it does not exist

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago librbd: report an error if rbd header does not match
Josh Durgin [Tue, 15 Nov 2011 22:27:53 +0000 (14:27 -0800)]
librbd: report an error if rbd header does not match

This will fail on future incompatible versions of the header format.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago Merge branch 'wip_local_reads'
Greg Farnum [Thu, 1 Dec 2011 19:15:44 +0000 (11:15 -0800)]
Merge branch 'wip_local_reads'

13 years ago hadoop: apache license.
Greg Farnum [Thu, 1 Dec 2011 19:14:34 +0000 (11:14 -0800)]
hadoop: apache license.

We haven't made explicit that the Hadoop Java code is under the Apache
License. Do so (with permission from the other contributors, thanks!).

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago mds: fix blocking in standby replay thread
Sage Weil [Thu, 1 Dec 2011 17:17:38 +0000 (09:17 -0800)]
mds: fix blocking in standby replay thread

We need to hold mylock before waiting on the cond or else we get

./common/Cond.h: In function 'int Cond::Wait(Mutex&)', in thread '7f37fe0c8700'
./common/Cond.h: 46: FAILED assert(mutex.is_locked())
 ceph version 0.38-2-g73f99a1 (commit:73f99a189f491866da2be88adcfe0bd512282755)
 1: (MDLog::_replay_thread()+0x2483) [0x6c4393]
 2: (MDLog::ReplayThread::entry()+0xd) [0x4decbd]
 3: (()+0x6d8c) [0x7f3803e8fd8c]
 4: (clone()+0x6d) [0x7f38028d504d]
 ceph version 0.38-2-g73f99a1 (commit:73f99a189f491866da2be88adcfe0bd512282755)
 1: (MDLog::_replay_thread()+0x2483) [0x6c4393]
 2: (MDLog::ReplayThread::entry()+0xd) [0x4decbd]
 3: (()+0x6d8c) [0x7f3803e8fd8c]
 4: (clone()+0x6d) [0x7f38028d504d]
*** Caught signal (Aborted) **
 in thread 7f37fe0c8700

Signed-off-by: Sage Weil <sage@newdream.net>
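
Editor's sketch: the same rule (hold the lock before waiting on a condition) shown with Python's `threading.Condition`, which rejects an unlocked wait much as the `assert(mutex.is_locked())` above does. This is an analogy, not the MDLog code. The names `cond`, `ready`, `wait_without_lock`, `wait_with_lock`, and `signal_ready` are assumptions for illustration.

```python
import threading

cond = threading.Condition()  # stands in for a cond paired with its lock
ready = False


def wait_without_lock():
    """Waiting without holding the lock is rejected; return the error text."""
    try:
        cond.wait(timeout=0.01)
    except RuntimeError as exc:
        return str(exc)
    return None


def wait_with_lock(timeout=1.0):
    """Correct pattern: take the lock, then wait on a predicate."""
    with cond:
        return cond.wait_for(lambda: ready, timeout=timeout)


def signal_ready():
    """Set the predicate and wake waiters, also under the lock."""
    global ready
    with cond:
        ready = True
        cond.notify_all()
```

Waiting on a predicate (`wait_for`) rather than a bare `wait` also protects against spurious or early wakeups.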
13 years ago global: make daemon banner print explicit
Sage Weil [Thu, 1 Dec 2011 17:17:00 +0000 (09:17 -0800)]
global: make daemon banner print explicit

This eliminates some flags and avoids annoying cases where the banner is
printed but we don't want to see it.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mds: fix usage text
Sage Weil [Thu, 1 Dec 2011 16:19:47 +0000 (08:19 -0800)]
mds: fix usage text

Filename is not optional.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mds: adjust flock lock state on export
Sage Weil [Wed, 30 Nov 2011 17:57:29 +0000 (09:57 -0800)]
mds: adjust flock lock state on export

Looks like this was missed when flocklock was added.  Did a quick grep and
it doesn't look like it is missing anywhere else.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Objecter: loop the right direction when searching for local replicas
Greg Farnum [Wed, 30 Nov 2011 02:14:29 +0000 (18:14 -0800)]
Objecter: loop the right direction when searching for local replicas

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago doc: Add peering state diagram
Samuel Just [Wed, 30 Nov 2011 00:24:35 +0000 (16:24 -0800)]
doc: Add peering state diagram

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago Makefile: ipaddr.h, pick_address.h
Sage Weil [Tue, 29 Nov 2011 23:36:07 +0000 (15:36 -0800)]
Makefile: ipaddr.h, pick_address.h

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Makefile: add missing uuid.h to tarball
Sage Weil [Tue, 29 Nov 2011 21:31:38 +0000 (13:31 -0800)]
Makefile: add missing uuid.h to tarball

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Objecter: fix local reads in recalc_op_target
Greg Farnum [Tue, 29 Nov 2011 21:30:45 +0000 (13:30 -0800)]
Objecter: fix local reads in recalc_op_target

We want to use the actual OSD, not the index into the array!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
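
Editor's sketch: the class of bug described here (returning the position in the array instead of the OSD id stored at that position), modelled in Python. `pick_local_osd`, `acting`, and `is_local` are hypothetical names, and the fallback to the first entry is an assumption; the Objecter's real selection logic is not shown in this log.

```python
def pick_local_osd(acting, is_local):
    """Return the id of the first local OSD in `acting`, else the first entry.

    acting: list of OSD ids (hypothetical example data).
    is_local: predicate telling whether an OSD id is on this host.
    """
    for index, osd_id in enumerate(acting):
        if is_local(osd_id):
            # Return the OSD id itself. Returning `index` here would be
            # the bug: a position in the list is not an OSD id.
            return osd_id
    return acting[0] if acting else -1
```

For `acting = [12, 5, 9]` with OSD 9 local, the correct answer is 9, whereas the index-based version would return 2.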
13 years ago osd: subscribe to next map if flagged FULL
Sage Weil [Tue, 29 Nov 2011 16:28:57 +0000 (08:28 -0800)]
osd: subscribe to next map if flagged FULL

This ensures the osd finds out when we become un-full in a timely manner.

Fixes: #1755
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago mds: encode truncate_pending in inode
Sage Weil [Tue, 29 Nov 2011 05:37:18 +0000 (21:37 -0800)]
mds: encode truncate_pending in inode

Otherwise we don't actually journal this value, and we get confused when
we replay a start_truncate and try to restart it.

Fixes: #1756
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago uclient: remove filer_flags and use Objecter::global_op_flags instead
Greg Farnum [Mon, 28 Nov 2011 20:45:09 +0000 (12:45 -0800)]
uclient: remove filer_flags and use Objecter::global_op_flags instead

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago Objecter: add a new global_op_flags that is passed to every Op constructor.
Greg Farnum [Mon, 28 Nov 2011 20:42:21 +0000 (12:42 -0800)]
Objecter: add a new global_op_flags that is passed to every Op constructor.

We can use this for a global use of LOCALIZE_READS (and are about
to do so!).

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago Objecter: remove unused variable in op_submit
Greg Farnum [Mon, 28 Nov 2011 20:30:46 +0000 (12:30 -0800)]
Objecter: remove unused variable in op_submit

These flags are probably relics from when the function got split;
they belong in send_op now.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago uclient: remove useless if-else based on snapid
Greg Farnum [Mon, 28 Nov 2011 18:32:07 +0000 (10:32 -0800)]
uclient: remove useless if-else based on snapid

These are the same command anyway!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago debian init: Do not stop or start daemons when installing or upgrading
Wido den Hollander [Wed, 16 Nov 2011 19:41:15 +0000 (20:41 +0100)]
debian init: Do not stop or start daemons when installing or upgrading

Signed-off-by: Wido den Hollander <wido@widodh.nl>
13 years ago mon: search for local ip during mkfs
Sage Weil [Mon, 28 Nov 2011 00:10:46 +0000 (16:10 -0800)]
mon: search for local ip during mkfs

If an address isn't explicitly specified during mkfs, look for an unnamed
monitor in the (generated) monmap and see if any of those addresses is
configured on the local machine.  If so, assume it's us, and name ourselves
in the seed monmap.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago pick_address: implement have_local_addr()
Sage Weil [Mon, 28 Nov 2011 00:07:20 +0000 (16:07 -0800)]
pick_address: implement have_local_addr()

Check for a local ip from within a list of addresses.

Signed-off-by: Sage Weil <sage@newdream.net>
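
Editor's sketch: the check "is any of these addresses configured locally?" written in Python. The function name is borrowed from the commit title, but the signature is an assumption: here the local addresses are passed in explicitly rather than discovered from the host's interfaces, so the example stays self-contained.

```python
import ipaddress


def have_local_addr(local_addrs, candidates):
    """Return True if any candidate address is one of the local addresses.

    local_addrs: addresses configured on this machine (supplied by the
                 caller in this sketch; hypothetical parameter).
    candidates:  addresses to look for, e.g. those of unnamed monitors.
    """
    # Parse first so that equivalent spellings of an address compare equal.
    local = {ipaddress.ip_address(a) for a in local_addrs}
    return any(ipaddress.ip_address(c) in local for c in candidates)
```

A monitor running mkfs could use a check like this to decide which unnamed monmap entry refers to itself.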
13 years ago monclient: name nameless monitors noname-<foo>
Sage Weil [Mon, 28 Nov 2011 00:04:52 +0000 (16:04 -0800)]
monclient: name nameless monitors noname-<foo>

This makes them easy to pick out as unnamed.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago pick_address: whitespace
Sage Weil [Sun, 27 Nov 2011 22:50:46 +0000 (14:50 -0800)]
pick_address: whitespace

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago corrected variable (con) to be consistent with prior examples (cluster)
Mark Kampe [Wed, 23 Nov 2011 23:56:52 +0000 (15:56 -0800)]
corrected variable (con) to be consistent with prior examples (cluster)

Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
13 years ago ReplicatedPG: Also count overlaps for snapsets on snapdirs
Samuel Just [Wed, 23 Nov 2011 22:05:29 +0000 (14:05 -0800)]
ReplicatedPG: Also count overlaps for snapsets on snapdirs

Previously, the overlaps for snapdirs would not be included in
cstat causing the computed total to be incorrect.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago ReplicatedPG: Account for clone space usage in make_writeable
Samuel Just [Tue, 22 Nov 2011 17:30:35 +0000 (09:30 -0800)]
ReplicatedPG: Account for clone space usage in make_writeable

Previously, we accounted for clone space usage inconsistently in
write_update_size_and_usage etc when walking through the operations.
make_writeable may change the most recent clone overlap, however, so we
can't handle it until then.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago Merge branch 'wip-mon'
Sage Weil [Wed, 23 Nov 2011 14:45:26 +0000 (06:45 -0800)]
Merge branch 'wip-mon'

13 years ago ceph: fix shutdown race
Sage Weil [Wed, 23 Nov 2011 15:02:41 +0000 (07:02 -0800)]
ceph: fix shutdown race

Shut down MonClient before messenger, to avoid race with MonClient::tick()
and MonClient::shutdown().

Fixes

#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007f44475e2849 in _L_lock_953 () from /lib/libpthread.so.0
#2  0x00007f44475e266b in __pthread_mutex_lock (mutex=0x14d8dc8) at pthread_mutex_lock.c:61
#3  0x00000000005ae090 in Mutex::Lock (this=0x14d8db8, no_lockdep=false) at ./common/Mutex.h:108
#4  0x000000000068440e in MonClient::shutdown (this=0x14d8c30) at mon/MonClient.cc:386
#5  0x00000000005b2653 in ceph_tool_common_shutdown (ctx=0x14d84c0) at tools/common.cc:661
#6  0x00000000005ada29 in main (argc=7, argv=0x7fff8a2394c8) at tools/ceph.cc:304

vs

#0  0x00007f44475e8a0b in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x00000000005eff6b in reraise_fatal (signum=11) at global/signal_handler.cc:59
#2  0x00000000005f0165 in handle_fatal_signal (signum=11) at global/signal_handler.cc:106
#3  <signal handler called>
#4  0x0000000000000000 in ?? ()
#5  0x000000000068661a in MonClient::tick (this=0x14d8c30) at mon/MonClient.cc:621
#6  0x0000000000689e3b in MonClient::C_Tick::finish(int) ()
#7  0x000000000061b3c5 in SafeTimer::timer_thread (this=0x14d8df8) at common/Timer.cc:102
#8  0x000000000061c6f0 in SafeTimerThread::entry() ()
#9  0x00000000005f1219 in Thread::_entry_func (arg=0x14e1a00) at common/Thread.cc:41
#10 0x00007f44475e0971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#11 0x00007f4445ead92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#12 0x0000000000000000 in ?? ()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago common/pick_address: Fix IP address stringification.
Tommi Virtanen [Wed, 23 Nov 2011 01:48:40 +0000 (17:48 -0800)]
common/pick_address: Fix IP address stringification.

Different sockaddr_* have the actual address (sin_addr, sin6_addr)
at different offsets, and sockaddr->sa_data just isn't enough.
inet_ntop conspires by taking a void*. I could figure out the right
offset with a switch (found->sa_family), but let's go for the
supposedly write-once-run-with-any-AF solution, getnameinfo.

Which, naturally, takes an extra length argument that is AF-specific,
and not provided anywhere nicely by getifaddrs. Huzzah!

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
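
Editor's sketch: address-family-independent stringification via `getnameinfo`, shown through Python's `socket.getnameinfo` wrapper. Python takes a socket address tuple and hides the AF-specific length argument the commit message complains about, so this only illustrates the numeric-host lookup, not the C call in pick_address. The helper name `stringify_addr` is hypothetical.

```python
import socket


def stringify_addr(sockaddr):
    """Return the numeric host string for a socket address tuple.

    sockaddr: (host, port) for IPv4 or (host, port, flowinfo, scope_id)
              for IPv6, as used by Python's socket module.
    """
    # NI_NUMERICHOST / NI_NUMERICSERV request numeric output and avoid
    # any reverse DNS or service-name lookup.
    host, _service = socket.getnameinfo(
        sockaddr, socket.NI_NUMERICHOST | socket.NI_NUMERICSERV
    )
    return host
```

The same call accepts an IPv6 tuple such as `("::1", 6789, 0, 0)`, which is the point of preferring `getnameinfo` over a per-family switch.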
13 years ago mon: pick_addresses before common_init_finish
Sage Weil [Wed, 23 Nov 2011 00:28:42 +0000 (16:28 -0800)]
mon: pick_addresses before common_init_finish

We can't modify g_conf->public_addr after that.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: set default port if not specified...
Sage Weil [Wed, 23 Nov 2011 00:22:07 +0000 (16:22 -0800)]
mon: set default port if not specified...

...when looking for self in monmap during mkfs.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: calculate rank by addr, not name
Sage Weil [Wed, 23 Nov 2011 00:02:28 +0000 (16:02 -0800)]
mon: calculate rank by addr, not name

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago monmap: assign rank by sorting addr, not name
Sage Weil [Tue, 22 Nov 2011 23:29:43 +0000 (15:29 -0800)]
monmap: assign rank by sorting addr, not name

This allows monitors to bootstrap knowing peer addrs but not their names,
as when we specify mon_host.

Signed-off-by: Sage Weil <sage@newdream.net>
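
Editor's sketch: assigning ranks by sorted address, so that a rank does not depend on what a monitor happens to be called. The helper `assign_ranks` and the sort key (IP version, numeric address, port) are assumptions for illustration; the exact address comparison Ceph uses is not shown in this log.

```python
import ipaddress


def assign_ranks(monitors):
    """Map monitor name -> rank, where rank is the position in address order.

    monitors: dict of name -> (ip, port); names may be placeholders.
    """
    def addr_key(item):
        ip, port = item[1]
        parsed = ipaddress.ip_address(ip)
        # Assumed ordering: by IP version, then numeric address, then port.
        return (parsed.version, int(parsed), port)

    ordered = sorted(monitors.items(), key=addr_key)
    return {name: rank for rank, (name, _addr) in enumerate(ordered)}
```

Because names never enter the sort key, a monitor known only as `noname-b` keeps the same rank after it learns its real name.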
13 years ago obsync: tear out rgw
Yehuda Sadeh [Tue, 22 Nov 2011 23:05:45 +0000 (15:05 -0800)]
obsync: tear out rgw

13 years ago mon: name self in monmap if --public-addr specified during mkfs
Sage Weil [Tue, 22 Nov 2011 22:53:45 +0000 (14:53 -0800)]
mon: name self in monmap if --public-addr specified during mkfs

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago rgw: don't remove tail of lru if that's what we touch
Yehuda Sadeh [Tue, 22 Nov 2011 18:31:25 +0000 (10:31 -0800)]
rgw: don't remove tail of lru if that's what we touch

13 years ago mon: mark down all connections when rank changes
Sage Weil [Tue, 22 Nov 2011 18:09:41 +0000 (10:09 -0800)]
mon: mark down all connections when rank changes

The election and some other stuff depend on msg->get_source().num() to get
the peer rank, and that is part of the connection state.  If it changes,
we need to close old connections and open new ones so that we aren't
taken for someone else (like mon.-1).

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: handle rank change in bootstrap
Sage Weil [Tue, 22 Nov 2011 18:08:48 +0000 (10:08 -0800)]
mon: handle rank change in bootstrap

The rank can change either because we probe and get a new monmap, or
because we get one via paxos.  Move the checks to bootstrap() to catch
both cases.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: pick an address when joining an existing cluster
Sage Weil [Tue, 22 Nov 2011 17:53:52 +0000 (09:53 -0800)]
mon: pick an address when joining an existing cluster

If we are joining an existing cluster, we can pick whatever address we
want (e.g., one specified by public_addr or public_network).

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: remove unused myaddr
Sage Weil [Tue, 22 Nov 2011 17:52:58 +0000 (09:52 -0800)]
mon: remove unused myaddr

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: simplify suicide when removed from map
Sage Weil [Tue, 22 Nov 2011 17:52:52 +0000 (09:52 -0800)]
mon: simplify suicide when removed from map

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago PG: it's not necessary to call build_inc_scrub_map in build_scrub_map
Samuel Just [Mon, 21 Nov 2011 23:06:35 +0000 (15:06 -0800)]
PG: it's not necessary to call build_inc_scrub_map in build_scrub_map

Because we have called osr.flush(), it's safe to tag map.valid_through
as last_update.   We will still have to catch up once we have stopped
writes and allowed the filestore to catch up anyway.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago Merge remote branch 'gh/subnet'
Sage Weil [Tue, 22 Nov 2011 00:17:21 +0000 (16:17 -0800)]
Merge remote branch 'gh/subnet'

13 years ago Merge remote branch 'gh/wip-mon'
Sage Weil [Tue, 22 Nov 2011 00:00:34 +0000 (16:00 -0800)]
Merge remote branch 'gh/wip-mon'

13 years ago mds, osd, synclient: Pick cluster_addr/public_addr based on *_network.
Tommi Virtanen [Mon, 21 Nov 2011 21:32:45 +0000 (13:32 -0800)]
mds, osd, synclient: Pick cluster_addr/public_addr based on *_network.

Instead of specifying an IP address in ceph.conf like

[global]
cluster_addr = 10.1.2.3

you can now avoid the node-specific configuration and just say

[global]
cluster_network = 10.1.2.0/24

The *_network variables can also take a whitespace-separated list of
networks, to be checked in that order:

[global]
cluster_network = 10.1.2.0/24 192.168.42.192/26
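
Editor's sketch: how a whitespace-separated list of networks, checked in order, can select a node's address. This is a Python model, not the pick_address implementation. The function name `pick_address` and the explicit `local_addrs` parameter are assumptions made so the example is self-contained.

```python
import ipaddress


def pick_address(networks, local_addrs):
    """Return the first local address found in the first matching network.

    networks:    whitespace-separated CIDR list, as in cluster_network.
    local_addrs: addresses configured on this host (supplied by the caller
                 in this sketch; hypothetical parameter).
    """
    for net_str in networks.split():
        net = ipaddress.ip_network(net_str)
        # Networks are tried in the order given; earlier entries win.
        for addr in local_addrs:
            if ipaddress.ip_address(addr) in net:
                return addr
    return None
```

With `cluster_network = 10.1.2.0/24 192.168.42.192/26`, a host that has both 192.168.42.200 and 10.1.2.3 would pick 10.1.2.3, because the first listed network takes priority.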

13 years ago common/pickaddr: Pick cluster_addr/public_addr based on *_network.
Tommi Virtanen [Sat, 19 Nov 2011 00:55:29 +0000 (16:55 -0800)]
common/pickaddr: Pick cluster_addr/public_addr based on *_network.

13 years ago common/ipaddr: Add utility function to parse ip/cidr style networks.
Tommi Virtanen [Sat, 19 Nov 2011 00:47:45 +0000 (16:47 -0800)]
common/ipaddr: Add utility function to parse ip/cidr style networks.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago common/ipaddr: Find a configured IP address in given subnet.
Tommi Virtanen [Wed, 16 Nov 2011 21:39:29 +0000 (13:39 -0800)]
common/ipaddr: Find a configured IP address in given subnet.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago msg: Move public_addr use outside ->bind()
Tommi Virtanen [Mon, 21 Nov 2011 18:12:29 +0000 (10:12 -0800)]
msg: Move public_addr use outside ->bind()

13 years ago common/str_list: Make unused return value void.
Tommi Virtanen [Wed, 16 Nov 2011 21:40:02 +0000 (13:40 -0800)]
common/str_list: Make unused return value void.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago osd: Remove unused variable.
Tommi Virtanen [Sat, 19 Nov 2011 00:55:34 +0000 (16:55 -0800)]
osd: Remove unused variable.

13 years ago osd: fix 'stop' command
Sage Weil [Mon, 21 Nov 2011 21:28:36 +0000 (13:28 -0800)]
osd: fix 'stop' command

Special case.  We can't join the command_tp thread from itself.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
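
Editor's sketch: why a thread cannot join itself, shown with Python's `threading` module, which raises `RuntimeError` in that case. This is an analogy to the command_tp special case, not the OSD code. `stop_worker`, `command_handler`, and `results` are hypothetical names.

```python
import threading

results = {}


def stop_worker(worker):
    """Join `worker` unless the caller is running on that very thread."""
    if worker is threading.current_thread():
        # Special case: joining ourselves could never complete.
        return False
    worker.join()
    return True


def command_handler():
    """Simulates a 'stop' command executing on the command thread itself."""
    me = threading.current_thread()
    try:
        me.join()  # what the unguarded code would attempt
    except RuntimeError as exc:
        results["self_join_error"] = str(exc)
    results["stopped_from_self"] = stop_worker(me)


worker = threading.Thread(target=command_handler)
worker.start()
results["stopped_from_main"] = stop_worker(worker)
```

The guard lets the same shutdown routine be called safely from both the command thread and any other thread.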
13 years ago osd: protect handle_osd_map requeueing with queue lock
Sage Weil [Mon, 21 Nov 2011 21:23:59 +0000 (13:23 -0800)]
osd: protect handle_osd_map requeueing with queue lock

pending_ops was protected by osd_lock, but it tracks something in the
queue, which has its own lock.  Messy.  Also, useless, since
wait_for_no_ops had a single caller in shutdown() that op_wq.drain() can
do for us.

Rip it out, and track queue size under the queue lock.

Fixes: #1727
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: lock pg when requeuing requests
Sage Weil [Mon, 21 Nov 2011 19:15:38 +0000 (11:15 -0800)]
osd: lock pg when requeuing requests

The op queue is shut down, so this is mostly safe, unless someone comes
through and does requeue_ops() from a callback or something.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago paxosservice: tolerate _active() call when not active
Sage Weil [Mon, 21 Nov 2011 18:33:53 +0000 (10:33 -0800)]
paxosservice: tolerate _active() call when not active

This can happen when multiple C_Active events are queued, and the first
does a propose_pending() (moving us into updating state).

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago objecter: simplify map request check
Sage Weil [Thu, 17 Nov 2011 20:08:40 +0000 (12:08 -0800)]
objecter: simplify map request check

We should request a missing/intervening map if it appears to exist.
Otherwise, skip it.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago objecter: cancel tick event on shutdown
Sage Weil [Mon, 21 Nov 2011 17:19:26 +0000 (09:19 -0800)]
objecter: cancel tick event on shutdown

Hopefully this is the root cause for

2011-11-20 23:57:41.555292 7f75dd743780 ceph version 0.38-205-g3b53b72
(commit:3b53b722b34b5284e6b8a5571a08d4b7ec276241), process ceph-fuse, pid
21223
 *  Caught signal (Segmentation fault) *
    in thread 7f75d9c6e700
    ceph version 0.38-205-g3b53b72
    (commit:3b53b722b34b5284e6b8a5571a08d4b7ec276241)
    1: /tmp/cephtest/binary/usr/local/bin/ceph-fuse() [0x6993a4]
    2: (()+0xfb40) [0x7f75dd0eeb40]
    3: (PerfCounters::set(int, unsigned long)+0x2a) [0x511bca]
    4: (Objecter::tick()+0x1f3) [0x653f43]
    5: (Objecter::C_Tick::finish(int)+0x15) [0x66aef5]
    6: (SafeTimer::timer_thread()+0x4b0) [0x5825c0]
    7: (SafeTimerThread::entry()+0x15) [0x586865]
    8: (Thread::_entry_func(void)+0x12) [0x52a832]
    9: (()+0x7971) [0x7f75dd0e6971]
    10: (clone()+0x6d) [0x7f75db97592d]

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
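
Editor's sketch: cancelling a periodic tick on shutdown so that no callback runs against torn-down state. This Python model uses `threading.Timer`; the `Ticker` class and its methods are hypothetical and are not the Objecter or SafeTimer API.

```python
import threading


class Ticker:
    """Runs on_tick() every `interval` seconds until shutdown() is called."""

    def __init__(self, interval, on_tick):
        self.interval = interval
        self.on_tick = on_tick
        self._lock = threading.Lock()
        self._timer = None
        self._stopped = False

    def start(self):
        with self._lock:
            self._schedule()

    def _schedule(self):
        # Caller must hold self._lock.
        self._timer = threading.Timer(self.interval, self._tick)
        self._timer.daemon = True
        self._timer.start()

    def _tick(self):
        with self._lock:
            if self._stopped:
                return  # shutdown won the race; do not touch anything
            self.on_tick()
            self._schedule()

    def shutdown(self):
        with self._lock:
            self._stopped = True
            if self._timer is not None:
                self._timer.cancel()  # drop the pending tick event
```

Because `on_tick` runs under the same lock that `shutdown` takes, no tick is in progress and none can start once `shutdown` returns.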
13 years ago paxos: fix sharing of learned commits during collect/last
Sage Weil [Sun, 20 Nov 2011 22:26:09 +0000 (14:26 -0800)]
paxos: fix sharing of learned commits during collect/last

We can learn either an uncommitted or committed value during the
collect/last recovery phase.  For the committed values, we need to remember
each peer's first/last_committed and share only at the end to avoid a
situation like:

 - mon.1 has same last_committed as us
 - mon.2 has newer last_committed, we save it
 - mon.3 has same last_committed as mon.1, we share new value
 - done... but mon.1 never got mon.2's newer commit.

Instead, save the commit sharing until the collect process completes, so
we know that any committed value learned from anyone is shared with
everyone who needs it.

This fixes a crash like

mon/Paxos.cc: In function 'void Paxos::handle_begin(MMonPaxos*)', in thread '7fd91192c700'
mon/Paxos.cc: 400: FAILED assert(begin->last_committed == last_committed)
 ceph version 0.38-208-g9aabd39 (commit:9aabd3982cceb7e8489412b4bfbb4c2387880de2)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x76) [0x72454e]
 2: (Paxos::handle_begin(MMonPaxos*)+0x363) [0x6499ef]
 3: (Paxos::dispatch(PaxosServiceMessage*)+0x2b4) [0x64db2c]
 4: (Monitor::_ms_dispatch(Message*)+0xdc6) [0x6205c2]
 5: (Monitor::ms_dispatch(Message*)+0x3a) [0x62831a]
 6: (Messenger::ms_deliver_dispatch(Message*)+0x63) [0x7d1f31]
 7: (SimpleMessenger::dispatch_entry()+0x7c2) [0x7bb786]
 8: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x6070fa]
 9: (Thread::_entry_func(void*)+0x23) [0x6f3f69]
 10: (()+0x7971) [0x7fd9153a1971]
 11: (clone()+0x6d) [0x7fd913c3092d]

Signed-off-by: Sage Weil <sage@newdream.net>
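
Editor's sketch: a simplified Python model of the scenario in this commit message, contrasting sharing as each reply arrives with sharing once collection is complete. It models only `last_committed` numbers, not real Paxos state. `share_eagerly` and `collect_then_share` are hypothetical names.

```python
def share_eagerly(leader_last_committed, replies):
    """Old behaviour (simplified): decide whom to update as replies arrive.

    replies: list of (peer, last_committed) in arrival order.
    Returns (newest_version, peers_that_were_sent_the_newest_value).
    """
    newest = leader_last_committed
    updated = []
    for peer, version in replies:
        if version > newest:
            newest = version          # learn the newer commit, save it
        elif version < newest:
            updated.append(peer)      # share with a peer that is behind
    return newest, updated


def collect_then_share(leader_last_committed, replies):
    """Fixed behaviour (simplified): remember every reply, share at the end."""
    newest = max([leader_last_committed] + [v for _peer, v in replies])
    behind = [peer for peer, version in replies if version < newest]
    return newest, behind
```

With the leader and mon.1 at version 10, mon.2 at 11, and mon.3 at 10, eager sharing updates only mon.3, while deferred sharing updates both mon.1 and mon.3.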
13 years ago rgw: support alternative date formatting
Yehuda Sadeh [Sun, 20 Nov 2011 21:17:04 +0000 (13:17 -0800)]
rgw: support alternative date formatting

being used by s3cmd

13 years ago paxosservice: consolidate _active and _commit
Sage Weil [Fri, 18 Nov 2011 18:35:44 +0000 (10:35 -0800)]
paxosservice: consolidate _active and _commit

Use the same callback for when paxos goes active and for when it commits
something.  The response in both cases is the same.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago paxosservice: remove unused committed() callback
Sage Weil [Fri, 18 Nov 2011 18:05:35 +0000 (10:05 -0800)]
paxosservice: remove unused committed() callback

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: mdsmon: tick() from on_active() instead of committed()
Sage Weil [Fri, 18 Nov 2011 18:01:30 +0000 (10:01 -0800)]
mon: mdsmon: tick() from on_active() instead of committed()

Same effect, and avoids useless committed().

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: share random osd map from update_from_paxos, not committed()
Sage Weil [Fri, 18 Nov 2011 17:56:10 +0000 (09:56 -0800)]
mon: share random osd map from update_from_paxos, not committed()

This will let us remove committed() entirely.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago config: support --no-<foo> for bool options
Sage Weil [Fri, 18 Nov 2011 19:04:24 +0000 (11:04 -0800)]
config: support --no-<foo> for bool options

Signed-off-by: Sage Weil <sage@newdream.net>
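
Editor's sketch: one way `--no-<foo>` negation for boolean options can work, written in Python. This is not Ceph's config parser. `parse_bool_flags` and the option names used in the test are hypothetical.

```python
def parse_bool_flags(argv, bool_options):
    """Return {option: bool} for --<foo> and --no-<foo> style arguments.

    argv:         list of command-line arguments.
    bool_options: set of known boolean option names, using underscores.
    """
    values = {}
    for arg in argv:
        if not arg.startswith("--"):
            continue
        name = arg[2:].replace("-", "_")
        if name in bool_options:
            values[name] = True
        elif name.startswith("no_") and name[3:] in bool_options:
            # --no-<foo> turns a known boolean option off.
            values[name[3:]] = False
    return values
```

Checking the plain name before the `no_` prefix means an option that is literally named `no_something` still parses as itself.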
13 years ago config: whitespace
Sage Weil [Fri, 18 Nov 2011 19:04:09 +0000 (11:04 -0800)]
config: whitespace

Signed-off-by: Sage Weil <sage@newdream.net>