git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years agov0.39 v0.39
Sage Weil [Fri, 2 Dec 2011 17:01:31 +0000 (09:01 -0800)]
v0.39

13 years agoOSDMap: build_simple_from_conf pg_num should not be 0 with one osd
Samuel Just [Fri, 2 Dec 2011 00:28:03 +0000 (16:28 -0800)]
OSDMap: build_simple_from_conf pg_num should not be 0 with one osd

Previously, pg_num would end up set to 0 if osd.0 is the only osd.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agolibrbd: report an error if rbd header does not match
Josh Durgin [Tue, 15 Nov 2011 22:27:53 +0000 (14:27 -0800)]
librbd: report an error if rbd header does not match

This will fail on future incompatible versions of the header format.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agomds: adjust flock lock state on export
Sage Weil [Wed, 30 Nov 2011 17:57:29 +0000 (09:57 -0800)]
mds: adjust flock lock state on export

Looks like this was missed when flocklock was added.  Did a quick grep and
it doesn't look like it is missing anywhere else.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agodoc: Add peering state diagram
Samuel Just [Wed, 30 Nov 2011 00:24:35 +0000 (16:24 -0800)]
doc: Add peering state diagram

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMakefile: ipaddr.h, pick_address.h
Sage Weil [Tue, 29 Nov 2011 23:36:07 +0000 (15:36 -0800)]
Makefile: ipaddr.h, pick_address.h

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMakefile: add missing uuid.h to tarball
Sage Weil [Tue, 29 Nov 2011 21:31:38 +0000 (13:31 -0800)]
Makefile: add missing uuid.h to tarball

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: subscribe to next map if flagged FULL
Sage Weil [Tue, 29 Nov 2011 16:28:57 +0000 (08:28 -0800)]
osd: subscribe to next map if flagged FULL

This ensures the osd finds out when we become un-full in a timely manner.

Fixes: #1755
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomds: encode truncate_pending in inode
Sage Weil [Tue, 29 Nov 2011 05:37:18 +0000 (21:37 -0800)]
mds: encode truncate_pending in inode

Otherwise we don't actually journal this value, and we get confused when
we replay a start_truncate and try to restart it.

Fixes: #1756
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agodebian init: Do not stop or start daemons when installing or upgrading
Wido den Hollander [Wed, 16 Nov 2011 19:41:15 +0000 (20:41 +0100)]
debian init: Do not stop or start daemons when installing or upgrading

Signed-off-by: Wido den Hollander <wido@widodh.nl>
13 years agomon: search for local ip during mkfs
Sage Weil [Mon, 28 Nov 2011 00:10:46 +0000 (16:10 -0800)]
mon: search for local ip during mkfs

If an address isn't explicitly specified during mkfs, look for an unnamed
monitor in the (generated) monmap and see if any of those addresses is
configured on the local machine.  If so, assume it's us, and name ourselves
in the seed monmap.
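
A rough sketch of the lookup described above, written in Python rather than Ceph's C++. The monmap is modeled as a plain dict of name to address, and the `noname-` prefix is taken from the monclient commit in this log. The helper names `find_self` and `name_self` and the sample addresses are hypothetical, not Ceph's API.

```python
def find_self(monmap, local_addrs, prefix="noname-"):
    """Return the placeholder name of the first unnamed monitor whose
    address is configured locally, or None if there is no such entry."""
    for name, addr in monmap.items():
        if name.startswith(prefix) and addr in local_addrs:
            return name
    return None


def name_self(monmap, local_addrs, my_name):
    """Rename the matching unnamed entry to my_name; return True on success."""
    placeholder = find_self(monmap, local_addrs)
    if placeholder is None:
        return False
    monmap[my_name] = monmap.pop(placeholder)
    return True


# Hypothetical seed monmap generated from a list of addresses only.
monmap = {"noname-a": "10.1.2.3:6789", "noname-b": "10.1.2.4:6789"}
named = name_self(monmap, {"10.1.2.4:6789"}, "mon.b")
```

If no unnamed entry matches a local address, the map is left untouched and the caller has to fall back to an explicitly specified address.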

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agopick_address: implement have_local_addr()
Sage Weil [Mon, 28 Nov 2011 00:07:20 +0000 (16:07 -0800)]
pick_address: implement have_local_addr()

Check for a local ip from within a list of addresses.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomonclient: name nameless monitors noname-<foo>
Sage Weil [Mon, 28 Nov 2011 00:04:52 +0000 (16:04 -0800)]
monclient: name nameless monitors noname-<foo>

This makes them easy to pick out as unnamed.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agopick_address: whitespace
Sage Weil [Sun, 27 Nov 2011 22:50:46 +0000 (14:50 -0800)]
pick_address: whitespace

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agocorrected variable (con) to be consistent with prior examples (cluster)
Mark Kampe [Wed, 23 Nov 2011 23:56:52 +0000 (15:56 -0800)]
corrected variable (con) to be consistent with prior examples (cluster)

Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
13 years agoReplicatedPG: Also count overlaps for snapsets on snapdirs
Samuel Just [Wed, 23 Nov 2011 22:05:29 +0000 (14:05 -0800)]
ReplicatedPG: Also count overlaps for snapsets on snapdirs

Previously, the overlaps for snapdirs would not be included in
cstat causing the computed total to be incorrect.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoReplicatedPG: Account for clone space usage in make_writeable
Samuel Just [Tue, 22 Nov 2011 17:30:35 +0000 (09:30 -0800)]
ReplicatedPG: Account for clone space usage in make_writeable

Previously, we accounted for clone space usage inconsistently in
write_update_size_and_usage etc when walking through the operations.
make_writeable may change the most recent clone overlap, however, so we
can't handle it until then.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMerge branch 'wip-mon'
Sage Weil [Wed, 23 Nov 2011 14:45:26 +0000 (06:45 -0800)]
Merge branch 'wip-mon'

13 years agoceph: fix shutdown race
Sage Weil [Wed, 23 Nov 2011 15:02:41 +0000 (07:02 -0800)]
ceph: fix shutdown race

Shut down MonClient before messenger, to avoid race with MonClient::tick()
and MonClient::shutdown().

Fixes

#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007f44475e2849 in _L_lock_953 () from /lib/libpthread.so.0
#2  0x00007f44475e266b in __pthread_mutex_lock (mutex=0x14d8dc8) at pthread_mutex_lock.c:61
#3  0x00000000005ae090 in Mutex::Lock (this=0x14d8db8, no_lockdep=false) at ./common/Mutex.h:108
#4  0x000000000068440e in MonClient::shutdown (this=0x14d8c30) at mon/MonClient.cc:386
#5  0x00000000005b2653 in ceph_tool_common_shutdown (ctx=0x14d84c0) at tools/common.cc:661
#6  0x00000000005ada29 in main (argc=7, argv=0x7fff8a2394c8) at tools/ceph.cc:304

vs

#0  0x00007f44475e8a0b in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x00000000005eff6b in reraise_fatal (signum=11) at global/signal_handler.cc:59
#2  0x00000000005f0165 in handle_fatal_signal (signum=11) at global/signal_handler.cc:106
#3  <signal handler called>
#4  0x0000000000000000 in ?? ()
#5  0x000000000068661a in MonClient::tick (this=0x14d8c30) at mon/MonClient.cc:621
#6  0x0000000000689e3b in MonClient::C_Tick::finish(int) ()
#7  0x000000000061b3c5 in SafeTimer::timer_thread (this=0x14d8df8) at common/Timer.cc:102
#8  0x000000000061c6f0 in SafeTimerThread::entry() ()
#9  0x00000000005f1219 in Thread::_entry_func (arg=0x14e1a00) at common/Thread.cc:41
#10 0x00007f44475e0971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#11 0x00007f4445ead92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#12 0x0000000000000000 in ?? ()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agocommon/pick_address: Fix IP address stringification.
Tommi Virtanen [Wed, 23 Nov 2011 01:48:40 +0000 (17:48 -0800)]
common/pick_address: Fix IP address stringification.

Different sockaddr_* have the actual address (sin_addr, sin6_addr)
at different offsets, and sockaddr->sa_data just isn't enough.
inet_ntop conspires by taking a void*. I could figure out the right
offset with a switch (found->sa_family), but let's go for the
supposedly write-once-run-with-any-AF solution, getnameinfo.

Which, naturally, takes an extra length argument that is AF-specific,
and not provided anywhere nicely by getifaddrs. Huzzah!
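
For illustration only, the same idea in Python, whose `socket.getnameinfo` wraps the libc call and works out the sockaddr length itself. The function name `addr_to_string` and the sample address are assumptions; the actual fix is in C++ and has to pass the AF-specific length by hand.

```python
import socket


def addr_to_string(sockaddr):
    """Return the numeric host string for an IPv4 or IPv6 sockaddr tuple."""
    # NI_NUMERICHOST / NI_NUMERICSERV ask for plain numbers, so no DNS
    # or service-name lookup is attempted.
    flags = socket.NI_NUMERICHOST | socket.NI_NUMERICSERV
    host, _port = socket.getnameinfo(sockaddr, flags)
    return host


ipv4_text = addr_to_string(("10.1.2.3", 6789))
```

An IPv6 address would be passed as a 4-tuple such as `("::1", 6789, 0, 0)`; the caller does not need a switch on the address family.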

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agomon: pick_addresses before common_init_finish
Sage Weil [Wed, 23 Nov 2011 00:28:42 +0000 (16:28 -0800)]
mon: pick_addresses before common_init_finish

We can't modify g_conf->public_addr after that.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: set default port if not specified...
Sage Weil [Wed, 23 Nov 2011 00:22:07 +0000 (16:22 -0800)]
mon: set default port if not specified...

...when looking for self in monmap during mkfs.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: calculate rank by addr, not name
Sage Weil [Wed, 23 Nov 2011 00:02:28 +0000 (16:02 -0800)]
mon: calculate rank by addr, not name

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomonmap: assign rank by sorting addr, not name
Sage Weil [Tue, 22 Nov 2011 23:29:43 +0000 (15:29 -0800)]
monmap: assign rank by sorting addr, not name

This allows monitors to bootstrap knowing peer addrs but not their names,
as when we specify mon_host.
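
A minimal sketch of rank assignment by address, in Python. The sort key used here (numeric IPv4 address, then port) is an assumption for the example; the real ordering is whatever Ceph's address comparison defines. `addr_key` and `assign_ranks` are hypothetical names.

```python
import ipaddress


def addr_key(addr):
    """Sort key for an IPv4 'host:port' string: (numeric address, port)."""
    host, port = addr.rsplit(":", 1)
    return (int(ipaddress.ip_address(host)), int(port))


def assign_ranks(addrs):
    """Map each address to its rank, i.e. its index in sorted order."""
    ordered = sorted(addrs, key=addr_key)
    return {addr: rank for rank, addr in enumerate(ordered)}


ranks = assign_ranks(["10.1.2.10:6789", "10.1.2.3:6789", "10.1.2.4:6789"])
```

Because only addresses enter the computation, every monitor that knows the same peer addresses derives the same ranks, whether or not it knows the peers' names.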

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobsync: tear out rgw
Yehuda Sadeh [Tue, 22 Nov 2011 23:05:45 +0000 (15:05 -0800)]
obsync: tear out rgw

13 years agomon: name self in monmap if --public-addr specified during mkfs
Sage Weil [Tue, 22 Nov 2011 22:53:45 +0000 (14:53 -0800)]
mon: name self in monmap if --public-addr specified during mkfs

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agorgw: don't remove tail of lru if that's what we touch
Yehuda Sadeh [Tue, 22 Nov 2011 18:31:25 +0000 (10:31 -0800)]
rgw: don't remove tail of lru if that's what we touch

13 years agomon: mark down all connections when rank changes
Sage Weil [Tue, 22 Nov 2011 18:09:41 +0000 (10:09 -0800)]
mon: mark down all connections when rank changes

The election and some other stuff depend on msg->get_source().num() to get
the peer rank, and that is part of the connection state.  If it changes,
we need to close old connections and open new ones so that we aren't
taken for someone else (like mon.-1).

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: handle rank change in bootstrap
Sage Weil [Tue, 22 Nov 2011 18:08:48 +0000 (10:08 -0800)]
mon: handle rank change in bootstrap

The rank can change either because we probe and get a new monmap, or
because we get one via paxos.  Move the checks to bootstrap() to catch
both cases.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: pick an address when joining an existing cluster
Sage Weil [Tue, 22 Nov 2011 17:53:52 +0000 (09:53 -0800)]
mon: pick an address when joining an existing cluster

If we are joining an existing cluster, we can pick whatever address we
want (e.g., one specified by public_addr or public_network).

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: remove unused myaddr
Sage Weil [Tue, 22 Nov 2011 17:52:58 +0000 (09:52 -0800)]
mon: remove unused myaddr

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: simplify suicide when removed from map
Sage Weil [Tue, 22 Nov 2011 17:52:52 +0000 (09:52 -0800)]
mon: simplify suicide when removed from map

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoPG: it's not necessary to call build_inc_scrub_map in build_scrub_map
Samuel Just [Mon, 21 Nov 2011 23:06:35 +0000 (15:06 -0800)]
PG: it's not necessary to call build_inc_scrub_map in build_scrub_map

Because we have called osr.flush(), it's safe to tag map.valid_through
as last_update.   We will still have to catch up once we have stopped
writes and allowed the filestore to catch up anyway.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMerge remote branch 'gh/subnet'
Sage Weil [Tue, 22 Nov 2011 00:17:21 +0000 (16:17 -0800)]
Merge remote branch 'gh/subnet'

13 years agoMerge remote branch 'gh/wip-mon'
Sage Weil [Tue, 22 Nov 2011 00:00:34 +0000 (16:00 -0800)]
Merge remote branch 'gh/wip-mon'

13 years agomds, osd, synclient: Pick cluster_addr/public_addr based on *_network.
Tommi Virtanen [Mon, 21 Nov 2011 21:32:45 +0000 (13:32 -0800)]
mds, osd, synclient: Pick cluster_addr/public_addr based on *_network.

Instead of specifying an IP address in ceph.conf like

[global]
cluster_addr = 10.1.2.3

you can now avoid the node-specific configuration and just say

[global]
cluster_network = 10.1.2.0/24

The *_network variables can also take a whitespace-separated list of
networks, to be checked in that order:

[global]
cluster_network = 10.1.2.0/24 192.168.42.192/26
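
The selection rule can be sketched in Python with the standard `ipaddress` module. This is not the Ceph implementation: `pick_address` here is a hypothetical helper, and the list of local addresses is passed in by hand instead of being read from the host's interfaces.

```python
import ipaddress


def pick_address(networks, local_addrs):
    """Return the first local address that falls inside a listed network.

    networks is a whitespace-separated string of CIDR networks, checked in
    the order given; local_addrs is an iterable of address strings.
    Returns None when nothing matches."""
    addrs = [ipaddress.ip_address(a) for a in local_addrs]
    for net_text in networks.split():
        net = ipaddress.ip_network(net_text)
        for addr in addrs:
            if addr in net:
                return str(addr)
    return None


# Hypothetical addresses configured on this node.
local = ["127.0.0.1", "192.168.42.200", "10.1.2.3"]
picked = pick_address("10.1.2.0/24 192.168.42.192/26", local)
```

With the networks in this order the node picks 10.1.2.3; swapping the two networks would make it pick 192.168.42.200 instead, which is what "checked in that order" implies.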

13 years agocommon/pickaddr: Pick cluster_addr/public_addr based on *_network.
Tommi Virtanen [Sat, 19 Nov 2011 00:55:29 +0000 (16:55 -0800)]
common/pickaddr: Pick cluster_addr/public_addr based on *_network.

13 years agocommon/ipaddr: Add utility function to parse ip/cidr style networks.
Tommi Virtanen [Sat, 19 Nov 2011 00:47:45 +0000 (16:47 -0800)]
common/ipaddr: Add utility function to parse ip/cidr style networks.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agocommon/ipaddr: Find a configured IP address in given subnet.
Tommi Virtanen [Wed, 16 Nov 2011 21:39:29 +0000 (13:39 -0800)]
common/ipaddr: Find a configured IP address in given subnet.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agomsg: Move public_addr use outside ->bind()
Tommi Virtanen [Mon, 21 Nov 2011 18:12:29 +0000 (10:12 -0800)]
msg: Move public_addr use outside ->bind()

13 years agocommon/str_list: Make unused return value void.
Tommi Virtanen [Wed, 16 Nov 2011 21:40:02 +0000 (13:40 -0800)]
common/str_list: Make unused return value void.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoosd: Remove unused variable.
Tommi Virtanen [Sat, 19 Nov 2011 00:55:34 +0000 (16:55 -0800)]
osd: Remove unused variable.

13 years agoosd: fix 'stop' command
Sage Weil [Mon, 21 Nov 2011 21:28:36 +0000 (13:28 -0800)]
osd: fix 'stop' command

Special case.  We can't join the command_tp thread from itself.
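
The underlying restriction is not specific to Ceph. As an analogy only, Python's `threading` module also refuses a self-join; the sketch below shows the failure a thread gets when it tries to join itself. The `worker` function and `result` dict are made up for the example.

```python
import threading

result = {}


def worker():
    """Try to join the thread we are running on and record the failure."""
    try:
        threading.current_thread().join()
    except RuntimeError as exc:
        # Python rejects the self-join instead of deadlocking.
        result["error"] = str(exc)


t = threading.Thread(target=worker)
t.start()
t.join()  # joining from a different thread is fine
```

A command handler that runs on the pool it wants to shut down therefore has to hand the join to some other thread.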

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: protect handle_osd_map requeueing with queue lock
Sage Weil [Mon, 21 Nov 2011 21:23:59 +0000 (13:23 -0800)]
osd: protect handle_osd_map requeueing with queue lock

pending_ops was protected by osd_lock, but it tracks something in the
queue, which has its own lock.  Messy.  Also, useless, since
wait_for_no_ops had a single caller in shutdown() that op_wq.drain() can
do for us.

Rip it out, and track queue size under the queue lock.

Fixes: #1727
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: lock pg when requeuing requests
Sage Weil [Mon, 21 Nov 2011 19:15:38 +0000 (11:15 -0800)]
osd: lock pg when requeuing requests

The op queue is shut down, so this is mostly safe, unless someone comes
through and does requeue_ops() from a callback or something.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agopaxosservice: tolerate _active() call when not active
Sage Weil [Mon, 21 Nov 2011 18:33:53 +0000 (10:33 -0800)]
paxosservice: tolerate _active() call when not active

This can happen when multiple C_Active events are queued, and the first
does a propose_pending() (moving us into updating state).

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjecter: simplify map request check
Sage Weil [Thu, 17 Nov 2011 20:08:40 +0000 (12:08 -0800)]
objecter: simplify map request check

We should request a missing/intervening map if it appears to exist.
Otherwise, skip it.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjecter: cancel tick event on shutdown
Sage Weil [Mon, 21 Nov 2011 17:19:26 +0000 (09:19 -0800)]
objecter: cancel tick event on shutdown

Hopefully this is the root cause for

2011-11-20 23:57:41.555292 7f75dd743780 ceph version 0.38-205-g3b53b72
(commit:3b53b722b34b5284e6b8a5571a08d4b7ec276241), process ceph-fuse, pid
21223
 *  Caught signal (Segmentation fault) *
    in thread 7f75d9c6e700
    ceph version 0.38-205-g3b53b72
    (commit:3b53b722b34b5284e6b8a5571a08d4b7ec276241)
    1: /tmp/cephtest/binary/usr/local/bin/ceph-fuse() [0x6993a4]
    2: (()+0xfb40) [0x7f75dd0eeb40]
    3: (PerfCounters::set(int, unsigned long)+0x2a) [0x511bca]
    4: (Objecter::tick()+0x1f3) [0x653f43]
    5: (Objecter::C_Tick::finish(int)+0x15) [0x66aef5]
    6: (SafeTimer::timer_thread()+0x4b0) [0x5825c0]
    7: (SafeTimerThread::entry()+0x15) [0x586865]
    8: (Thread::_entry_func(void)+0x12) [0x52a832]
    9: (()+0x7971) [0x7f75dd0e6971]
    10: (clone()+0x6d) [0x7f75db97592d]

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agopaxos: fix sharing of learned commits during collect/last
Sage Weil [Sun, 20 Nov 2011 22:26:09 +0000 (14:26 -0800)]
paxos: fix sharing of learned commits during collect/last

We can learn either an uncommitted or committed value during the
collect/last recovery phase.  For the committed values, we need to remember
each peer's first/last_committed and share only at the end to avoid a
situation like:

 - mon.1 has same last_committed as us
 - mon.2 has newer last_committed, we save it
 - mon.3 has same last_committed as mon.1, we share new value
 - done... but mon.1 never got mon.2's newer commit.

Instead, save the commit sharing until the collect process completes, so
we know that any committed value learned from anyone is shared with
everyone who needs it.
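
A toy model of the deferred sharing, in Python. It only tracks version numbers, not real Paxos state, and `collect_then_share` is a hypothetical name. The point it illustrates is that the set of peers to update is computed once, after every reply is in.

```python
def collect_then_share(my_last_committed, peer_last_committed):
    """Return (newest last_committed, peers that still need commits).

    peer_last_committed maps peer name -> last_committed from its reply."""
    newest = my_last_committed
    seen = {}
    for peer, last_committed in peer_last_committed.items():
        # During collect we only remember what each peer has.
        seen[peer] = last_committed
        newest = max(newest, last_committed)
    # Sharing is decided only now, against the newest value learned
    # from anyone, so early repliers are not left behind.
    behind = sorted(p for p, lc in seen.items() if lc < newest)
    return newest, behind


newest, behind = collect_then_share(5, {"mon.1": 5, "mon.2": 6, "mon.3": 5})
```

In the scenario from the list above, both mon.1 and mon.3 end up in `behind`, so mon.1 also receives mon.2's newer commit.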

This fixes a crash like

mon/Paxos.cc: In function 'void Paxos::handle_begin(MMonPaxos*)', in thread '7fd91192c700'
mon/Paxos.cc: 400: FAILED assert(begin->last_committed == last_committed)
 ceph version 0.38-208-g9aabd39 (commit:9aabd3982cceb7e8489412b4bfbb4c2387880de2)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x76) [0x72454e]
 2: (Paxos::handle_begin(MMonPaxos*)+0x363) [0x6499ef]
 3: (Paxos::dispatch(PaxosServiceMessage*)+0x2b4) [0x64db2c]
 4: (Monitor::_ms_dispatch(Message*)+0xdc6) [0x6205c2]
 5: (Monitor::ms_dispatch(Message*)+0x3a) [0x62831a]
 6: (Messenger::ms_deliver_dispatch(Message*)+0x63) [0x7d1f31]
 7: (SimpleMessenger::dispatch_entry()+0x7c2) [0x7bb786]
 8: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x6070fa]
 9: (Thread::_entry_func(void*)+0x23) [0x6f3f69]
 10: (()+0x7971) [0x7fd9153a1971]
 11: (clone()+0x6d) [0x7fd913c3092d]

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agorgw: support alternative date formatting
Yehuda Sadeh [Sun, 20 Nov 2011 21:17:04 +0000 (13:17 -0800)]
rgw: support alternative date formatting

being used by s3cmd

13 years agopaxosservice: consolidate _active and _commit
Sage Weil [Fri, 18 Nov 2011 18:35:44 +0000 (10:35 -0800)]
paxosservice: consolidate _active and _commit

Use the same callback for when paxos goes active and for when it commits
something.  The response in both cases is the same.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agopaxosservice: remove unused committed() callback
Sage Weil [Fri, 18 Nov 2011 18:05:35 +0000 (10:05 -0800)]
paxosservice: remove unused committed() callback

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: mdsmon: tick() from on_active() instead of committed()
Sage Weil [Fri, 18 Nov 2011 18:01:30 +0000 (10:01 -0800)]
mon: mdsmon: tick() from on_active() instead of committed()

Same effect, and avoids useless committed().

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: share random osd map from update_from_paxos, not committed()
Sage Weil [Fri, 18 Nov 2011 17:56:10 +0000 (09:56 -0800)]
mon: share random osd map from update_from_paxos, not committed()

This will let us remove committed() entirely.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoconfig: support --no-<foo> for bool options
Sage Weil [Fri, 18 Nov 2011 19:04:24 +0000 (11:04 -0800)]
config: support --no-<foo> for bool options
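
One plausible reading of the flag handling, sketched in Python. The option names `foo_bar` and `baz` are invented, and the details (dash to underscore mapping, ignoring unknown flags) are assumptions rather than Ceph's actual parser behavior.

```python
def parse_bool_flags(args, bool_options):
    """Parse --<foo> as True and --no-<foo> as False for known bool options."""
    values = {}
    for arg in args:
        if not arg.startswith("--"):
            continue
        name = arg[2:].replace("-", "_")
        if name in bool_options:
            values[name] = True
        elif name.startswith("no_") and name[3:] in bool_options:
            values[name[3:]] = False
    return values


flags = parse_bool_flags(["--foo-bar", "--no-baz", "--other"], {"foo_bar", "baz"})
```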

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoconfig: whitespace
Sage Weil [Fri, 18 Nov 2011 19:04:09 +0000 (11:04 -0800)]
config: whitespace

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosdmon: set the maps-to-keep floor to be at least epoch 0
Greg Farnum [Fri, 18 Nov 2011 23:56:35 +0000 (15:56 -0800)]
osdmon: set the maps-to-keep floor to be at least epoch 0

Looks like this conditional was just set backwards by mistake. There
have been a number of issues with OSDMap versions that are probably
related to this...
(Thanks to some smarts in trim_to, we at least did not trim ALL of
our maps. But on every tick prior to epoch 500 [that's the default]
the leader was trimming all old maps off the system.)
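
The intended clamp can be written out as a small Python sketch. The function name `trim_floor` is hypothetical; the default of 500 comes from the message above.

```python
def trim_floor(last_committed, maps_to_keep=500):
    """Oldest epoch to keep: last_committed - maps_to_keep, never below 0."""
    floor = last_committed - maps_to_keep
    if floor < 0:
        # Fewer than maps_to_keep epochs exist yet: keep everything.
        floor = 0
    return floor


early_floor = trim_floor(120)
later_floor = trim_floor(800)
```

With the comparison reversed, the clamp would either never apply or apply in the wrong case, which matches the over-trimming described for epochs below 500.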

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoRevert "osd: simplify finalizing scrub on replica"
Samuel Just [Fri, 18 Nov 2011 01:17:06 +0000 (17:17 -0800)]
Revert "osd: simplify finalizing scrub on replica"

This reverts commit dd5087fabb2a743741a96ee4610379afa8431f68.

Calling osr.flush() is not quite enough since the onreadable callbacks
may not have been called (thus, last_update_applied may still lag behind
the tail of the log).

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoFileStore.cc: onreadable callbacks in OpSequencer order is enough
Samuel Just [Thu, 17 Nov 2011 21:45:08 +0000 (13:45 -0800)]
FileStore.cc: onreadable callbacks in OpSequencer order is enough

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoosd: error responses should trigger all requested notifications.
Greg Farnum [Fri, 18 Nov 2011 16:49:35 +0000 (08:49 -0800)]
osd: error responses should trigger all requested notifications.

There's no good reason I can find to limit error code responses to
the ACK.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoobjecter: trigger oncommit acks if the request returns an error code.
Greg Farnum [Fri, 18 Nov 2011 16:47:09 +0000 (08:47 -0800)]
objecter: trigger oncommit acks if the request returns an error code.

Many users only set oncommit acks, so if they get an error code
(which comes only as a CEPH_OSD_OP_ACK right now) the request
disappears into the ether.
(And remove stupid debug statements while we're at it.)
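
A simplified Python model of the callback dispatch, under the assumption stated above that an error arrives only once, as an ACK. The dict-based `op` and the name `handle_reply` are hypothetical and much simpler than the real Objecter.

```python
def handle_reply(op, result, is_commit):
    """Fire the callbacks registered on op for one OSD reply.

    op is a dict that may hold "onack" and "oncommit" callables.
    Returns the names of the callbacks that fired."""
    if result < 0:
        # An error is delivered once, so complete every waiter now.
        names = ["onack", "oncommit"]
    else:
        names = ["oncommit"] if is_commit else ["onack"]
    fired = []
    for name in names:
        callback = op.pop(name, None)
        if callback is not None:
            callback(result)
            fired.append(name)
    return fired


seen = []
op = {"oncommit": seen.append}  # a caller that only asked for oncommit
fired = handle_reply(op, -2, is_commit=False)
```

Here the caller registered only `oncommit`, yet still sees the error code instead of waiting for a commit reply that never comes.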

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agopaxos: do not create_pending if !active
Sage Weil [Fri, 18 Nov 2011 17:49:03 +0000 (09:49 -0800)]
paxos: do not create_pending if !active

This avoids a scenario like:

- _active()
  - proposes value
- _commit()
  - creates new pending, even though in updating state

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoRevert "mon: don't propose new state from update_from_paxos"
Sage Weil [Fri, 18 Nov 2011 17:43:09 +0000 (09:43 -0800)]
Revert "mon: don't propose new state from update_from_paxos"

This reverts commit 66c628acc8be71a92e801179431e4b938b857b3d.

13 years agomon: don't propose new state from update_from_paxos
Sage Weil [Fri, 18 Nov 2011 04:45:54 +0000 (20:45 -0800)]
mon: don't propose new state from update_from_paxos

Proposing a new state from within update_from_paxos() confuses some callers,
like PaxosService::_active().  Instead, do it in the on_active() callback.
This also lets us collapse the check_osd_map() caller into on_active(),
and makes it happen on leaders and peons alike, which ought to avoid some
of the pg creation lag we see sometimes (presumably when the osds have
sessions with peons instead of the leader).

Fixes: #1708
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agorgw: if swift url is not set up, just use whatever client used
Yehuda Sadeh [Fri, 18 Nov 2011 00:54:51 +0000 (16:54 -0800)]
rgw: if swift url is not set up, just use whatever client used

13 years agofuse: fix readdir return code
Sage Weil [Thu, 17 Nov 2011 23:01:17 +0000 (15:01 -0800)]
fuse: fix readdir return code

Ignore ENOSPC generated by our own callback, as it is only used to
terminate the loop.
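
The pattern is roughly the following, shown as a Python sketch rather than the actual ceph-fuse code. The buffer model and the name `fill_buffer` are assumptions for the example.

```python
import errno


def fill_buffer(entries, capacity):
    """Copy entries into a bounded buffer; return (return code, buffer)."""
    buf = []

    def add(entry):
        # The callback signals "buffer full" with -ENOSPC.
        if len(buf) >= capacity:
            return -errno.ENOSPC
        buf.append(entry)
        return 0

    rc = 0
    for entry in entries:
        rc = add(entry)
        if rc < 0:
            break
    if rc == -errno.ENOSPC:
        # Our own stop signal, not a real error: report success.
        rc = 0
    return rc, buf


rc, buf = fill_buffer(["a", "b", "c"], capacity=2)
```

Any other negative code would still be passed through to the caller unchanged.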

Broken by commit cd90061239a598f6fca94326b6d2c32f325c96eb.

Fixes: #1728
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agopaxos: fix trimming when we skip over incrementals
Sage Weil [Thu, 17 Nov 2011 22:11:38 +0000 (14:11 -0800)]
paxos: fix trimming when we skip over incrementals

Remove open-coded trimming of old states and use our method (that also
removes additional per-state files).  Fixes old stray state files.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agopaxos: store stashed state _and_ incrementals
Sage Weil [Thu, 17 Nov 2011 22:10:34 +0000 (14:10 -0800)]
paxos: store stashed state _and_ incrementals

Paxos::share_state() may share a stashed state and incrementals that
follow; we need to store the same.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: elector: always start election via monitor
Sage Weil [Fri, 11 Nov 2011 05:58:53 +0000 (21:58 -0800)]
mon: elector: always start election via monitor

Don't go from active -> electing without passing (monitor) go.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agocommon: libraries should not log to stdout/stderr
Sage Weil [Thu, 17 Nov 2011 20:07:34 +0000 (12:07 -0800)]
common: libraries should not log to stdout/stderr

Certainly not by default.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjecter: set skipped_map if we skip a map
Sage Weil [Thu, 17 Nov 2011 19:56:37 +0000 (11:56 -0800)]
objecter: set skipped_map if we skip a map

This ensures that we resend _all_ requests, since we aren't sure which
may have mapped to a different primary and then back.  This was missed in
the original implementation in 4fe9cca5dd63a1924be2b5cb18f542fb4b97a768.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjecter: add is_locked() asserts
Sage Weil [Thu, 17 Nov 2011 19:39:55 +0000 (11:39 -0800)]
objecter: add is_locked() asserts

Sanity check.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjecter: send slow osd MPing via Connection*
Sage Weil [Thu, 17 Nov 2011 19:39:36 +0000 (11:39 -0800)]
objecter: send slow osd MPing via Connection*

This may address #1732 indirectly because we have a Connection* reference
here.  However, it's still not clear how we ended up with an OSDSession*
for an osd that doesn't exist.  :/

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: add pending_ops assert
Sage Weil [Wed, 16 Nov 2011 21:10:58 +0000 (13:10 -0800)]
osd: add pending_ops assert

Just a sanity check, hopefully helping us track down #1727.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: renamed get_latest* -> get_stashed*
Sage Weil [Wed, 16 Nov 2011 19:01:59 +0000 (11:01 -0800)]
mon: renamed get_latest* -> get_stashed*

This makes e.g. get_latest_version() vs get_last_committed() less
confusing.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: fix ver tracking for auth database
Sage Weil [Wed, 16 Nov 2011 18:57:23 +0000 (10:57 -0800)]
mon: fix ver tracking for auth database

Local variable keys_ver needs to be updated when we slurp up latest stashed
version.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: always load stashed version when version doesn't match
Sage Weil [Wed, 16 Nov 2011 18:54:59 +0000 (10:54 -0800)]
mon: always load stashed version when version doesn't match

The slurp process can happen after the monitor has started and has some
in-memory version of the state, and that process may wipe out old
incrementals and change the stashed version.  That means that in
update_from_paxos, we need to pull the stashed version if it doesn't
match what we currently have or else we may not have the incrementals we
need to get up to date.

This simplifies and cleans up that code a bit so it is not specific to
monitor startup.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agorgw: don't log entries with bad utf8
Yehuda Sadeh [Tue, 15 Nov 2011 01:02:08 +0000 (17:02 -0800)]
rgw: don't log entries with bad utf8

13 years agorgw: adjust error code in swift copy failures
Yehuda Sadeh [Mon, 14 Nov 2011 22:39:18 +0000 (14:39 -0800)]
rgw: adjust error code in swift copy failures

13 years agorgw: fix swift responses encoding
Yehuda Sadeh [Mon, 14 Nov 2011 21:55:09 +0000 (13:55 -0800)]
rgw: fix swift responses encoding

13 years agorgw: Fix some merge problems uncovered by gcc warnings:
Josh Pieper [Fri, 11 Nov 2011 13:19:55 +0000 (08:19 -0500)]
rgw: Fix some merge problems uncovered by gcc warnings:

 * a refactor in e2100bce left the mod_ptr and unmod_ptr members set
   incorrectly in RGWCopyObj::init_common
 * a fix in 6752babd aggregated error returns, but then failed to do
   anything with them

Signed-off-by: Josh Pieper <jjp@pobox.com>
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoResolve gcc warnings.
Josh Pieper [Fri, 11 Nov 2011 13:19:02 +0000 (08:19 -0500)]
Resolve gcc warnings.

These should have no functional changes:
 * Check errors from functions that currently cannot return any
 * Initialize variables that gcc can't determine will be initialized
   in a following function call
 * Remove unused variables

Signed-off-by: Josh Pieper <jjp@pobox.com>
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: remove dead osd_max_opq code
Sage Weil [Mon, 14 Nov 2011 20:15:14 +0000 (12:15 -0800)]
osd: remove dead osd_max_opq code

This is no longer used as of a while ago!

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoworkunits: rados python workunit should be executable
Josh Durgin [Mon, 14 Nov 2011 16:18:17 +0000 (08:18 -0800)]
workunits: rados python workunit should be executable

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agocrush: send debug output to dout, not stdout/err
Sage Weil [Sun, 13 Nov 2011 22:18:19 +0000 (14:18 -0800)]
crush: send debug output to dout, not stdout/err

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agotest/run_cmd: use mkstemp instead of mkstemps
Sage Weil [Sun, 13 Nov 2011 22:16:52 +0000 (14:16 -0800)]
test/run_cmd: use mkstemp instead of mkstemps

my box didn't have mkstemps

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoceph-authtool: fix clitests
Sage Weil [Sun, 13 Nov 2011 22:07:01 +0000 (14:07 -0800)]
ceph-authtool: fix clitests

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agotest_str_list: make sure ' ' and ', ' separators work for str lists
Sage Weil [Sat, 12 Nov 2011 23:17:29 +0000 (15:17 -0800)]
test_str_list: make sure ' ' and ', ' separators work for str lists

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoceph-authtool: make error msg more helpful
Sage Weil [Sat, 12 Nov 2011 22:55:39 +0000 (14:55 -0800)]
ceph-authtool: make error msg more helpful

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agokeyring: don't print auid if it is the default
Sage Weil [Sat, 12 Nov 2011 22:55:28 +0000 (14:55 -0800)]
keyring: don't print auid if it is the default

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: implement 'fsid' command
Sage Weil [Sat, 12 Nov 2011 22:55:06 +0000 (14:55 -0800)]
mon: implement 'fsid' command

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'stable'
Sage Weil [Sat, 12 Nov 2011 22:19:26 +0000 (14:19 -0800)]
Merge branch 'stable'

13 years agomon: fix 'osd crush add ..' weight
Sage Weil [Sat, 12 Nov 2011 22:04:15 +0000 (14:04 -0800)]
mon: fix 'osd crush add ..' weight

This was changed to floating point in commit 3f67893.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosdmap: build_simple with normal osd/host/rack/pool hierarchy
Sage Weil [Sat, 12 Nov 2011 22:05:15 +0000 (14:05 -0800)]
osdmap: build_simple with normal osd/host/rack/pool hierarchy

This will be useful in the general case where the cluster is created with
an empty map and useful crush hierarchy.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: fix 'osd crush add ..' weight
Sage Weil [Sat, 12 Nov 2011 22:04:15 +0000 (14:04 -0800)]
mon: fix 'osd crush add ..' weight

This was changed to floating point in commit 3f67893.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agovstart.sh: don't generate initial osdmap explicitly
Sage Weil [Sat, 12 Nov 2011 21:42:38 +0000 (13:42 -0800)]
vstart.sh: don't generate initial osdmap explicitly

This is simpler and exercises the monitor's ability to start with a generic
osdmap and build it out as new osds are added to the cluster.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: make initial osdmap optional
Sage Weil [Sat, 12 Nov 2011 21:41:59 +0000 (13:41 -0800)]
mon: make initial osdmap optional

If an initial osdmap is not provided, we generate an empty one.  The user
can add osds on their own after that.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosdmap: build_simple: create reasonable pools when numosd==0
Sage Weil [Sat, 12 Nov 2011 21:41:05 +0000 (13:41 -0800)]
osdmap: build_simple: create reasonable pools when numosd==0

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: add '--fsid foo' arg for setting generated monmap fsid
Sage Weil [Sat, 12 Nov 2011 21:16:30 +0000 (13:16 -0800)]
mon: add '--fsid foo' arg for setting generated monmap fsid

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: take '--fsid foo' arg with --mkfs
Sage Weil [Sat, 12 Nov 2011 05:02:23 +0000 (21:02 -0800)]
mon: take '--fsid foo' arg with --mkfs

This will set the seed monmap's fsid.  This is useful if the monmap is
dynamically generated (e.g., based on ceph.conf or --mon-host list).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>