git.apps.os.sepia.ceph.com Git - ceph.git/log
12 years ago os/FileStore: automatically enable 'filestore xattr use omap' as needed 386/head
Sage Weil [Sat, 29 Jun 2013 01:26:31 +0000 (18:26 -0700)]
os/FileStore: automatically enable 'filestore xattr use omap' as needed

Automatically enable the 'filestore xattr use omap' option if the fs
does not appear to handle large xattrs on its own.

This makes for a more pleasant user experience as they are not told to
enable something that we already know they must enable in order to
continue.

Fixes: #5137
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago librados: add test for large and many xattrs
Sage Weil [Sat, 29 Jun 2013 00:45:21 +0000 (17:45 -0700)]
librados: add test for large and many xattrs

Verify that we can set large attrs, and large numbers of attrs, on an object.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago osd: set maximum object attr size
Sage Weil [Sat, 29 Jun 2013 01:15:23 +0000 (18:15 -0700)]
osd: set maximum object attr size

Make a well-defined maximum size of an object attribute.  Since Linux has
a 64KB limit, and that is what we normally use to back this, use that as
the limit.  This means that even when leveldb is backing large xattrs
(as ext4 users must do) we will return EFBIG on >64KB setxattr attempts.
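
As a rough illustration of the size check described above (not the OSD's actual code), the sketch below rejects values larger than 64KB with EFBIG regardless of which backend would store them. The constant and function names are hypothetical.

```python
import errno

# Assumed limit taken from the commit message (64KB); the name is hypothetical.
MAX_ATTR_SIZE = 64 * 1024


def check_setxattr_size(value: bytes) -> int:
    """Return 0 if the value fits, or -EFBIG if it exceeds the limit."""
    if len(value) > MAX_ATTR_SIZE:
        return -errno.EFBIG
    return 0
```

A value of exactly 64KB is accepted in this sketch; only strictly larger values are refused.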

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mds: log before respawning when standby-replay falls behind
Sage Weil [Fri, 28 Jun 2013 21:20:28 +0000 (14:20 -0700)]
mds: log before respawning when standby-replay falls behind

Call into an MDS method so that we can write to the log.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago doc: Created an install page for Calxeda development packages.
John Wilkins [Thu, 27 Jun 2013 23:31:44 +0000 (16:31 -0700)]
doc: Created an install page for Calxeda development packages.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years ago Merge branch 'next'
Greg Farnum [Thu, 27 Jun 2013 22:23:00 +0000 (15:23 -0700)]
Merge branch 'next'

12 years ago ceph-disk: s/else if/elif/
Greg Farnum [Thu, 27 Jun 2013 21:58:14 +0000 (14:58 -0700)]
ceph-disk: s/else if/elif/

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Joao Luis <joao.luis@inktank.com>
(cherry picked from commit bd8255a750de08c1b8ee5e9c9a0a1b9b16171462)

12 years ago Merge pull request #372 from ceph/wip-mon-pgmap
João Eduardo Luís [Thu, 27 Jun 2013 22:09:50 +0000 (15:09 -0700)]
Merge pull request #372 from ceph/wip-mon-pgmap

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago Merge remote-tracking branch 'gh/next'
Sage Weil [Thu, 27 Jun 2013 05:19:32 +0000 (22:19 -0700)]
Merge remote-tracking branch 'gh/next'

12 years ago Merge pull request #378 from ceph/wip-init-rbd
Sage Weil [Thu, 27 Jun 2013 05:15:11 +0000 (22:15 -0700)]
Merge pull request #378 from ceph/wip-init-rbd

Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago qa/workunits/misc/multiple_rsync: put tee output in /tmp
Sage Weil [Thu, 27 Jun 2013 05:11:07 +0000 (22:11 -0700)]
qa/workunits/misc/multiple_rsync: put tee output in /tmp

2013-06-25T10:29:15.811 INFO:teuthology.task.workunit.client.0.err:+ rsync -auv --exclude local/ /usr/ usr.2
2013-06-25T10:29:15.811 INFO:teuthology.task.workunit.client.0.err:+ tee a
2013-06-25T10:29:15.902 INFO:teuthology.task.workunit.client.0.out:sending incremental file list
2013-06-25T10:29:48.738 INFO:teuthology.task.workunit.client.0.out:
2013-06-25T10:29:48.740 INFO:teuthology.task.workunit.client.0.out:sent 1449972 bytes  received 7477 bytes  43505.94 bytes/sec
2013-06-25T10:29:48.740 INFO:teuthology.task.workunit.client.0.out:total size is 3205268241  speedup is 2199.23
2013-06-25T10:29:48.740 INFO:teuthology.task.workunit.client.0.err:+ hexdump -C a
2013-06-25T10:29:48.741 INFO:teuthology.task.workunit.client.0.out:00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
2013-06-25T10:29:48.741 INFO:teuthology.task.workunit.client.0.out:00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 0a 73  |...............s|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000020  65 6e 74 20 31 34 34 39  39 37 32 20 62 79 74 65  |ent 1449972 byte|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000030  73 20 20 72 65 63 65 69  76 65 64 20 37 34 37 37  |s  received 7477|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000040  20 62 79 74 65 73 20 20  34 33 35 30 35 2e 39 34  | bytes  43505.94|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000050  20 62 79 74 65 73 2f 73  65 63 0a 74 6f 74 61 6c  | bytes/sec.total|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000060  20 73 69 7a 65 20 69 73  20 33 32 30 35 32 36 38  | size is 3205268|
2013-06-25T10:29:48.742 INFO:teuthology.task.workunit.client.0.out:00000070  32 34 31 20 20 73 70 65  65 64 75 70 20 69 73 20  |241  speedup is |
2013-06-25T10:29:48.743 INFO:teuthology.task.workunit.client.0.out:00000080  32 31 39 39 2e 32 33 0a                           |2199.23.|
2013-06-25T10:29:48.743 INFO:teuthology.task.workunit.client.0.out:00000088

This passes consistently when the output is in /tmp, but fails after a few
iterations when on cephfs+kclient.  Avoid the bug with this test.

See: #5453

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago rgw: fix radosgw-admin buckets list
Yehuda Sadeh [Wed, 26 Jun 2013 18:28:57 +0000 (11:28 -0700)]
rgw: fix radosgw-admin buckets list

Fixes: #5455
Backport: cuttlefish
This commit fixes a regression, where radosgw-admin buckets list
operation wasn't returning any data.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago Handle non-existent front interface in maps from older MONs
David Zafman [Thu, 27 Jun 2013 01:55:26 +0000 (18:55 -0700)]
Handle non-existent front interface in maps from older MONs

Fix OSDService::get_con_osd_hb() to not try to get_connection() without front interface
Fix OSD::handle_osd_map() to check for missing front interface

Fixes: #5460
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago qa/workunits/rbd/simple_1tb: add simple rbd read/write test on large image
Sage Weil [Thu, 27 Jun 2013 02:34:27 +0000 (19:34 -0700)]
qa/workunits/rbd/simple_1tb: add simple rbd read/write test on large image

Motivated by #5454.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago ceph-disk: do not mount over an osd directly in /var/lib/ceph/osd/$cluster-$id
Sage Weil [Thu, 27 Jun 2013 01:27:49 +0000 (18:27 -0700)]
ceph-disk: do not mount over an osd directly in /var/lib/ceph/osd/$cluster-$id

If we see a 'ready' file in the target OSD dir, do not mount our device
on top of it.

Among other things, this prevents ceph-disk activate on stray disks from
stepping on teuthology osds.
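
A minimal sketch of the guard described above, assuming only what the message states: a 'ready' file in the target directory means an OSD already lives there. The function name is hypothetical and this is not the ceph-disk implementation.

```python
import os
import tempfile


def safe_to_mount(osd_dir: str) -> bool:
    """Return False when osd_dir already holds an OSD (has a 'ready' file)."""
    return not os.path.exists(os.path.join(osd_dir, "ready"))


# Usage: an empty directory is safe; one holding a 'ready' file is not.
with tempfile.TemporaryDirectory() as osd_dir:
    empty_ok = safe_to_mount(osd_dir)
    open(os.path.join(osd_dir, "ready"), "w").close()
    ready_ok = safe_to_mount(osd_dir)
```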

Fixes: #5445
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PGMonitor: avoid duplicating map_pg_create() effort on same maps 372/head
Sage Weil [Thu, 27 Jun 2013 00:34:39 +0000 (17:34 -0700)]
mon/PGMonitor: avoid duplicating map_pg_create() effort on same maps

If we have an election and refresh, but the osdmap does not change, there
is no need to recalculate the pg create maps.  However, if we register new
creating pgs, we do... when the last_pg_scan update gets pulled out of
paxos (i.e., on both leader and peon mons).

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago cephtool/test.sh: add case for auth add with no caps
Dan Mick [Thu, 27 Jun 2013 00:07:48 +0000 (17:07 -0700)]
cephtool/test.sh: add case for auth add with no caps

Test case for failure in #5467.  Supplying new auth info overwrites.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago MonCommands.h: auth add doesn't require caps (it can use -i <file>)
Dan Mick [Wed, 26 Jun 2013 22:31:25 +0000 (15:31 -0700)]
MonCommands.h: auth add doesn't require caps (it can use -i <file>)

This was a regression from the old behavior introduced by the
CLI rewrite.

Fixes: #5467
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years ago Merge branch 'next'
Dan Mick [Wed, 26 Jun 2013 19:39:15 +0000 (12:39 -0700)]
Merge branch 'next'

12 years ago Makefile.am: fix libglobal.la race with ceph_test_cors
Dan Mick [Wed, 26 Jun 2013 01:23:22 +0000 (18:23 -0700)]
Makefile.am: fix libglobal.la race with ceph_test_cors

ceph_test_cors had libglobal.la in its _LDFLAGS macro definition;
it should have been in _LDADD.  Moreover, things using libglobal.la
ought to be using LIBGLOBAL_LDA to add it to _LDADD.  Fix them all.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years ago mon/PGMonitor: use post_paxos_update, not init, to refresh from osdmap
Sage Weil [Wed, 26 Jun 2013 13:53:08 +0000 (06:53 -0700)]
mon/PGMonitor: use post_paxos_update, not init, to refresh from osdmap

We do two things here:
 - make init a one-time unconditional init method, which is what the
   health service expects/needs.
 - switch PGMonitor::init to be post_paxos_update() which is called after
   the other services update, which is what PGMonitor really needs.

This is a new version of the fix originally in commit
a2fe0137946541e7b3b537698e1865fbce974ca6 (and those around it).  That is,
this re-fixes a problem where osds do not see pg creates from their
subscribe due to map_pg_creates() not getting called.

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: add post_paxos_update() hook
Sage Weil [Wed, 26 Jun 2013 13:52:01 +0000 (06:52 -0700)]
mon/PaxosService: add post_paxos_update() hook

Some services need to update internal state based on other services'
state, and thus need to be run after everyone has pulled their info out of
paxos.
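
The ordering this hook provides can be sketched as two passes: every service first refreshes its own state, and only then do the post-update hooks run. The Service class and refresh_all() below are hypothetical stand-ins, not the monitor's classes.

```python
class Service:
    """Hypothetical stand-in for a PaxosService."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def update_from_paxos(self):
        self.log.append(("update", self.name))

    def post_paxos_update(self):
        # Runs only after every service has refreshed its own state.
        self.log.append(("post", self.name))


def refresh_all(services):
    # First pass: each service pulls its own info.
    for svc in services:
        svc.update_from_paxos()
    # Second pass: hooks that may depend on other services' fresh state.
    for svc in services:
        svc.post_paxos_update()


log = []
refresh_all([Service("osdmap", log), Service("pgmap", log)])
```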

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: do not reopen MonitorDBStore during startup
Sage Weil [Wed, 26 Jun 2013 13:01:40 +0000 (06:01 -0700)]
mon: do not reopen MonitorDBStore during startup

leveldb doesn't seem to like this when it races with an internal compaction
attempt (see below).  Instead, let the store get opened by the ceph_mon
caller, and pull a bit of the logic into the caller to make the flow a
little easier to follow.

    -2> 2013-06-25 17:49:25.184490 7f4d439f8780 10 needs_conversion
    -1> 2013-06-25 17:49:25.184495 7f4d4065c700  5 asok(0x13b1460) entry start
     0> 2013-06-25 17:49:25.316908 7f4d3fe5b700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f4d3fe5b700

 ceph version 0.64-667-g089cba8 (089cba8fc0e8ae8aef9a3111cba7342ecd0f8314)
 1: ceph-mon() [0x649f0a]
 2: (()+0xfcb0) [0x7f4d435dccb0]
 3: (leveldb::Table::BlockReader(void*, leveldb::ReadOptions const&, leveldb::Slice const&)+0x154) [0x806e54]
 4: ceph-mon() [0x808840]
 5: ceph-mon() [0x808b39]
 6: ceph-mon() [0x806540]
 7: (leveldb::DBImpl::DoCompactionWork(leveldb::DBImpl::CompactionState*)+0xdd) [0x7f363d]
 8: (leveldb::DBImpl::BackgroundCompaction()+0x2c0) [0x7f4210]
 9: (leveldb::DBImpl::BackgroundCall()+0x68) [0x7f4cc8]
 10: ceph-mon() [0x80b3af]
 11: (()+0x7e9a) [0x7f4d435d4e9a]
 12: (clone()+0x6d) [0x7f4d4196bccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/Paxos: simplify trim()
Sage Weil [Tue, 25 Jun 2013 23:12:39 +0000 (16:12 -0700)]
mon/Paxos: simplify trim()

Collapse all the trim methods into a single simple method.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: rename scrub
Sage Weil [Wed, 26 Jun 2013 04:06:14 +0000 (21:06 -0700)]
mon/PaxosService: rename scrub

Make the name match the one in Paxos.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/Paxos: clean up removal of pre-conversion paxos states
Sage Weil [Tue, 25 Jun 2013 23:54:58 +0000 (16:54 -0700)]
mon/Paxos: clean up removal of pre-conversion paxos states

Use a helper, independent of trim machinery, and call on leader, too.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/Paxos: update first_committed only from paxos
Sage Weil [Tue, 25 Jun 2013 22:58:43 +0000 (15:58 -0700)]
mon/Paxos: update first_committed only from paxos

Do not touch the in-memory first_committed until the trim commits.  This
avoids any possible confusion due to races and keeps commit() as similar
to store_state() as possible.

Similarly, do not touch first_committed from store_state.  We should
*only* pull it out of the kv store.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/Paxos: set first_committed on first commit
Sage Weil [Tue, 25 Jun 2013 23:45:05 +0000 (16:45 -0700)]
mon/Paxos: set first_committed on first commit

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago doc: public network statement needed on new monitors.
Gary Lowell [Wed, 26 Jun 2013 13:27:17 +0000 (06:27 -0700)]
doc:  public network statement needed on new monitors.

When using ceph-deploy to create a new monitor on a host that is not
in the initial set of hosts defined by the ceph-deploy new command,
a "public network" statement needs to be added to the ceph.conf file.
Fixes #5195.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years ago mon/Paxos: never write first_committed except during trim
Sage Weil [Tue, 25 Jun 2013 22:43:33 +0000 (15:43 -0700)]
mon/Paxos: never write first_committed except during trim

The trimming is handled by proposing transactions.  Do not confuse matters
by writing (incorrect) first_committed values at any other point.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: enable leveldb cache by default
Sage Weil [Tue, 25 Jun 2013 22:22:05 +0000 (15:22 -0700)]
mon: enable leveldb cache by default

512 MB sounds reasonable to me.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/Paxos: assert that the store gives us back what we just wrote
Sage Weil [Tue, 25 Jun 2013 04:07:09 +0000 (21:07 -0700)]
mon/Paxos: assert that the store gives us back what we just wrote

In bug #5424 I observed leveldb failing internally and then returning
bad info.  We then hit a random/confusing assert.  Try to detect this
earlier by verifying that a get of a just-written last_committed gives
us back the right thing.
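
A sketch of the read-after-write check, using a hypothetical in-memory KVStore in place of the monitor's store; the real code path and key layout are not shown in this message and are assumed here.

```python
class KVStore:
    """Hypothetical in-memory stand-in for the monitor's key/value store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def commit_last_committed(store, version):
    store.put("last_committed", version)
    # Read the value straight back; a mismatch means the store is misbehaving,
    # and failing here is easier to diagnose than a later unrelated assert.
    readback = store.get("last_committed")
    assert readback == version, "store returned %r, expected %r" % (readback, version)
    return readback
```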

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago mon/Paxos: drop unnecessary last_committed loads
Sage Weil [Tue, 25 Jun 2013 18:58:22 +0000 (11:58 -0700)]
mon/Paxos: drop unnecessary last_committed loads

Drop (apparently) ad-hoc refreshes of last_committed from the store.
These are unnecessary and confusing.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: allow paxos service writes while paxos is updating
Sage Weil [Thu, 20 Jun 2013 22:39:23 +0000 (15:39 -0700)]
mon/PaxosService: allow paxos service writes while paxos is updating

In commit f985de28f86675e974ac7842a49922a35fe24c6c I mistakenly made
is_writeable() false while paxos was updating due to a misread of
Paxos::propose_new_value() (I didn't see that it would queue).
This is problematic because it narrows the window during which each service
is writeable for no reason.

Allow service to be writeable both when paxos is active and updating.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PGMonitor: store PGMap directly in store, bypassing PaxosService stash_full
Sage Weil [Tue, 25 Jun 2013 19:01:53 +0000 (12:01 -0700)]
mon/PGMonitor: store PGMap directly in store, bypassing PaxosService stash_full

Instead of encoding incrementals and periodically dumping the whole encoded
PGMap, store everything in a range of keys, and update them
between versions using transactions.  The per-version values are now
breadcrumbs indicating which keys were dirtied so they can be refreshed
via update_from_paxos().

This has several benefits:
 - we avoid ever encoding the entire PGMap
 - we avoid dumping that blob into leveldb keys
 - we limit the amount of data living in forward-moving keys, which leveldb
   has a hard time compacting away
 - pgmap data instead lives over a fixed range of keys, which leveldb
   excels at
 - we only keep the latest copy of the PGMap (which is all we care about)
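
The scheme above can be sketched with a plain dict as the key/value store: PG entries live under a fixed key range and are overwritten in place, while each version only records which keys it dirtied. All key names and helpers below are hypothetical, chosen for illustration only.

```python
def apply_update(store, version, changes):
    """Write changed PG entries in place and record which keys were dirtied."""
    for pgid, stat in changes.items():
        store["pgmap_pg/" + pgid] = stat           # fixed key range, overwritten
    store["pgmap/%d" % version] = sorted(changes)  # per-version breadcrumb
    store["pgmap/last_committed"] = version


def refresh(store, cache, from_version):
    """Bring an in-memory cache up to date by re-reading only dirtied keys."""
    last = store["pgmap/last_committed"]
    for v in range(from_version + 1, last + 1):
        for pgid in store["pgmap/%d" % v]:
            cache[pgid] = store["pgmap_pg/" + pgid]
    return last


store, cache = {}, {}
apply_update(store, 1, {"0.1": "creating", "0.2": "creating"})
apply_update(store, 2, {"0.1": "active+clean"})
version = refresh(store, cache, 0)
```

Only the latest value of each PG entry is kept; the breadcrumbs are small and say which entries to reload.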

Bump the internal monitor protocol version.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago doc/release-notes: v0.65
Sage Weil [Tue, 25 Jun 2013 21:14:39 +0000 (14:14 -0700)]
doc/release-notes: v0.65

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago Merge branch 'next'
Gary Lowell [Tue, 25 Jun 2013 20:45:22 +0000 (13:45 -0700)]
Merge branch 'next'

12 years ago Merge pull request #380 from dachary/wip-4907
Josh Durgin [Tue, 25 Jun 2013 17:57:41 +0000 (10:57 -0700)]
Merge pull request #380 from dachary/wip-4907

get_xattr() can return more than 4KB

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years ago Merge pull request #379 from dachary/wip-5312
Sage Weil [Tue, 25 Jun 2013 17:15:10 +0000 (10:15 -0700)]
Merge pull request #379 from dachary/wip-5312

skip TEST(EXT4StoreTest, _detect_fs) if DISK or MOUNTPOINT are undefined

Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago mon/AuthMonitor: start at format 1 (latest) for new clusters
Sage Weil [Fri, 21 Jun 2013 00:44:06 +0000 (17:44 -0700)]
mon/AuthMonitor: start at format 1 (latest) for new clusters

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: move upgrade_format() machinery into PaxosService
Sage Weil [Thu, 20 Jun 2013 21:12:16 +0000 (14:12 -0700)]
mon/PaxosService: move upgrade_format() machinery into PaxosService

We originally did this in AuthMonitor, but it is perfect for PGMonitor too,
so make it generic.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PGMonitor: drop some dead code
Sage Weil [Tue, 18 Jun 2013 22:22:35 +0000 (15:22 -0700)]
mon/PGMonitor: drop some dead code

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PGMap: make int type explicit
Sage Weil [Tue, 18 Jun 2013 22:20:00 +0000 (15:20 -0700)]
mon/PGMap: make int type explicit

We get away with this because int is 32 bits on both x86_64 and i386, but
we should be explicit anyway!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: s/get_version()/get_last_committed()/
Sage Weil [Tue, 18 Jun 2013 16:20:44 +0000 (09:20 -0700)]
mon/PaxosService: s/get_version()/get_last_committed()/

Avoid aliasing simple accessors; use a single name instead.  Also, function
name overloading will throw a wrench in the class inheritance later.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago v0.65 v0.65
Gary Lowell [Tue, 25 Jun 2013 16:19:32 +0000 (09:19 -0700)]
v0.65

12 years ago get_xattr() can return more than 4KB 380/head
Loic Dachary [Tue, 25 Jun 2013 14:10:02 +0000 (16:10 +0200)]
get_xattr() can return more than 4KB

Instead of failing if the attribute to be returned is larger than 4KB,
double the buffer size each time librados.rados_getxattr returns
-errno.ERANGE and try again.
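
The retry loop can be sketched as follows. The getxattr argument is a hypothetical callable standing in for the librados call: it takes a buffer size and returns the value, or a negative errno. The fake below exists only to exercise the loop.

```python
import errno


def read_xattr(getxattr, initial_size=4096):
    """Retry with a doubled buffer size while getxattr reports -ERANGE."""
    size = initial_size
    while True:
        ret = getxattr(size)
        if isinstance(ret, int) and ret == -errno.ERANGE:
            size *= 2  # buffer too small: double it and try again
            continue
        if isinstance(ret, int) and ret < 0:
            raise OSError(-ret, "getxattr failed")
        return ret


def make_fake_getxattr(value):
    """Test double: fails with -ERANGE when the buffer is too small."""
    def fake(size):
        return value if len(value) <= size else -errno.ERANGE
    return fake


big = b"a" * 10000
result = read_xattr(make_fake_getxattr(big))
```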

http://tracker.ceph.com/issues/4907 fixes #4907

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years ago skip TEST(EXT4StoreTest, _detect_fs) if DISK or MOUNTPOINT are undefined 379/head
Loic Dachary [Tue, 25 Jun 2013 13:04:34 +0000 (15:04 +0200)]
skip TEST(EXT4StoreTest, _detect_fs) if DISK or MOUNTPOINT are undefined

The TEST(EXT4StoreTest, _detect_fs) test is meant to be run from
qa/workunits/filestore/filestore.sh, after the ext4 file system was
created. If the DISK and MOUNTPOINT environment variables are not
defined, display a message explaining the expected environment and
silently skip the test. The tests in store_test.cc are not unit tests
because they depend on their environment.
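
The actual test is a C++ gtest; as a Python analogue of the same idea, the sketch below skips a test with an explanatory message when the required environment variables are missing. The helper and class names are hypothetical.

```python
import os
import unittest

REQUIRED = ("DISK", "MOUNTPOINT")


def missing_env(environ=os.environ):
    """Return the required variable names that are not defined."""
    return [name for name in REQUIRED if name not in environ]


class DetectFsTest(unittest.TestCase):
    def test_detect_fs(self):
        missing = missing_env()
        if missing:
            # Explain the expected environment, then skip instead of failing.
            self.skipTest("define %s (normally set up by "
                          "qa/workunits/filestore/filestore.sh)" % ", ".join(missing))
        self.assertTrue(os.path.isdir(os.environ["MOUNTPOINT"]))
```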

http://tracker.ceph.com/issues/5312 fixes #5312

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years ago Add rc script for rbd map/unmap 378/head
Laurent Barbe [Fri, 21 Jun 2013 15:17:09 +0000 (17:17 +0200)]
Add rc script for rbd map/unmap

Init script for mapping/unmapping rbd device on startup and shutdown.
On start, map rbd devices according to /etc/rbdmap, and force mount -a.
On stop, umount file systems that depend on rbd and unmap all rbd devices.
Since some distributions use a symlink for /etc/mtab, the user-space attribute _netdev is not enough to umount the file system before the rbd device.
(also concerns: #1790)

Signed-off-by: Laurent Barbe <laurent@ksperis.com>
12 years ago mon/PaxosService: drop unused last_accepted_name
Sage Weil [Tue, 18 Jun 2013 16:01:24 +0000 (09:01 -0700)]
mon/PaxosService: drop unused last_accepted_name

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: some whitespace
Sage Weil [Tue, 18 Jun 2013 01:03:30 +0000 (18:03 -0700)]
mon/PaxosService: some whitespace

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: drop unused {get,set,put}_version(prefix, a, bl)
Sage Weil [Tue, 18 Jun 2013 01:02:30 +0000 (18:02 -0700)]
mon/PaxosService: drop unused {get,set,put}_version(prefix, a, bl)

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/OSDMonitor: use provided get_version_full()
Sage Weil [Tue, 18 Jun 2013 01:00:31 +0000 (18:00 -0700)]
mon/OSDMonitor: use provided get_version_full()

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: simplify full helpers, drop single-use helper
Sage Weil [Tue, 18 Jun 2013 00:57:00 +0000 (17:57 -0700)]
mon/PaxosService: simplify full helpers, drop single-use helper

We are the only caller for get_version(prefix, name), so move it inline
and drop it.  Also rename full_version_name to full_prefix_name, which I
find slightly less confusing.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: remove mkfs helpers
Sage Weil [Tue, 18 Jun 2013 00:46:00 +0000 (17:46 -0700)]
mon/PaxosService: remove mkfs helpers

Keep it simple.  These are one-liners.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: fix mkfs monmap cleanup
Sage Weil [Tue, 18 Jun 2013 00:17:56 +0000 (17:17 -0700)]
mon: fix mkfs monmap cleanup

exists_key(a,b) was looking for "monmap/mkfs/monmap".

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: make PaxosService::get_value() int return type 64-bit
Sage Weil [Tue, 18 Jun 2013 00:39:20 +0000 (17:39 -0700)]
mon: make PaxosService::get_value() int return type 64-bit

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: drop unused helpers
Sage Weil [Tue, 18 Jun 2013 00:39:49 +0000 (17:39 -0700)]
mon/PaxosService: drop unused helpers

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/MonmapMonitor: avoid exists_version() helper
Sage Weil [Tue, 18 Jun 2013 00:26:41 +0000 (17:26 -0700)]
mon/MonmapMonitor: avoid exists_version() helper

We are the only user; open-code it.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/PaxosService: remove unused exists_version() variant
Sage Weil [Tue, 18 Jun 2013 00:08:55 +0000 (17:08 -0700)]
mon/PaxosService: remove unused exists_version() variant

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago Merge remote-tracking branch 'gh/next'
Sage Weil [Tue, 25 Jun 2013 03:41:15 +0000 (20:41 -0700)]
Merge remote-tracking branch 'gh/next'

12 years ago mon/Elector: cancel election timer if we bootstrap
Sage Weil [Tue, 25 Jun 2013 01:51:07 +0000 (18:51 -0700)]
mon/Elector: cancel election timer if we bootstrap

If we short-circuit and bootstrap, cancel our timer.  Otherwise it will
go off some time later when we are in who knows what state.
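
The pattern can be sketched with threading.Timer: whenever a short-circuit path abandons the pending work, it also cancels the timer so the callback cannot fire later in an unexpected state. The Elector class below is a hypothetical miniature, not the monitor's class.

```python
import threading


class Elector:
    """Hypothetical miniature of an elector with an expiry timer."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.timer = None
        self.expired = False

    def start(self):
        self.timer = threading.Timer(self.timeout, self._expire)
        self.timer.start()

    def _expire(self):
        self.expired = True

    def bootstrap(self):
        # Short-circuit path: cancel the pending timer so it cannot fire later.
        if self.timer is not None:
            self.timer.cancel()
            self.timer = None
```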

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago mon: cancel probe timeout on reset
Sage Weil [Tue, 25 Jun 2013 01:12:11 +0000 (18:12 -0700)]
mon: cancel probe timeout on reset

If we are probing and get (say) an election timeout that calls reset(),
cancel the timer.  Otherwise, we assert later with a splat like

2013-06-24 01:09:33.675882 7fb9627e7700  4 mon.b@0(leader) e1 probe_timeout 0x307a520
2013-06-24 01:09:33.676956 7fb9627e7700 -1 mon/Monitor.cc: In function 'void Monitor::probe_timeout(int)' thread 7fb9627e7700 time 2013-06-24 01:09:43.675904
mon/Monitor.cc: 1888: FAILED assert(is_probing() || is_synchronizing())

 ceph version 0.64-613-g134d08a (134d08a9654f66634b893d493e4a92f38acc63cf)
 1: (Monitor::probe_timeout(int)+0x161) [0x56f5c1]
 2: (Context::complete(int)+0xa) [0x574a2a]
 3: (SafeTimer::timer_thread()+0x425) [0x7059a5]
 4: (SafeTimerThread::entry()+0xd) [0x7065dd]
 5: (()+0x7e9a) [0x7fb966f62e9a]
 6: (clone()+0x6d) [0x7fb9652f9ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Fixes: #5438
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago mon/AuthMonitor: ensure initial rotating keys get encoded when create_initial called 2x
Sage Weil [Tue, 25 Jun 2013 00:58:48 +0000 (17:58 -0700)]
mon/AuthMonitor: ensure initial rotating keys get encoded when create_initial called 2x

The create_initial() method may get called multiple times; make sure it
will unconditionally generate new/initial rotating keys.  Move the block
up so that we can easily assert as much.

Broken by commit cd98eb0c651d9ee62e19c2cc92eadae9bed678cd.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12 years ago osd: tolerate racing threads starting recovery ops
Sage Weil [Mon, 24 Jun 2013 23:37:29 +0000 (16:37 -0700)]
osd: tolerate racing threads starting recovery ops

We sample the (max - active) recovery ops to know how many to start, but
do not hold the lock over the full duration, such that it is possible to
start too many ops.  This isn't problematic except that our condition
checks for being == max but not beyond it, and we will continue to start
recovery ops when we shouldn't.  Fix this by adjusting the conditional
to be <=.
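
The OSD's actual conditional is not shown in this message, so the following is only an illustration of the failure mode under assumed names: an equality test against the limit stops protecting once racing threads push the active count past it, while an ordered comparison still does.

```python
def can_start_recovery_exact(active, max_ops):
    # Fragile check: only refuses when active lands exactly on the limit.
    return active != max_ops


def can_start_recovery(active, max_ops):
    # Tolerant check: also refuses once racing threads have overshot the limit.
    return active < max_ops
```

With active = 6 and max_ops = 5, the fragile check still allows new ops while the tolerant one does not.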

Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
12 years ago init-radosgw.sysv: remove -x debug mode
Sage Weil [Tue, 25 Jun 2013 00:42:04 +0000 (17:42 -0700)]
init-radosgw.sysv: remove -x debug mode

Fixes: #5443
Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago common/pick_addresses: behave even after internal_safe_to_start_threads
Sage Weil [Mon, 24 Jun 2013 19:52:44 +0000 (12:52 -0700)]
common/pick_addresses: behave even after internal_safe_to_start_threads

ceph-mon recently started using Preforker to work around forking issues.
As a result, internal_safe_to_start_threads got set sooner and calls to
pick_addresses() which try to set string config values now fail because
there are no config observers for them.

Work around this by observing the change while we adjust the value.  We
assume pick_addresses() callers are smart enough to realize that their
result will be reflected by cct->_conf and not magically handled elsewhere.

Fixes: #5195, #5205
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years ago Add python-argparse to dependencies (for pre-2.7 systems)
Dan Mick [Mon, 24 Jun 2013 21:50:07 +0000 (14:50 -0700)]
Add python-argparse to dependencies (for pre-2.7 systems)

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years ago Merge pull request #376 from dalgaaf/wip-da-SCA-cppcheck-3
Sage Weil [Mon, 24 Jun 2013 20:47:50 +0000 (13:47 -0700)]
Merge pull request #376 from dalgaaf/wip-da-SCA-cppcheck-3

Reviewed-by: Sage Weil <sage@inktank.com>
12 years ago debian, rpm: remove python-lockfile dependency
Sage Weil [Mon, 24 Jun 2013 20:04:44 +0000 (13:04 -0700)]
debian, rpm: remove python-lockfile dependency

As of 2a4953b697a3464862fd3913336edfd7eede2487 ceph-disk no longer uses
this.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago Merge remote-tracking branch 'gh/next'
Sage Weil [Mon, 24 Jun 2013 19:25:58 +0000 (12:25 -0700)]
Merge remote-tracking branch 'gh/next'

12 years ago mds: do not assume segment list is non-empty in standby_trim_segments
Sage Weil [Fri, 21 Jun 2013 21:23:45 +0000 (14:23 -0700)]
mds: do not assume segment list is non-empty in standby_trim_segments

If we restart standby replay shortly after startup, before we actually have
any segments, we can trigger a segfault here:

 ceph version 0.64-441-gc39b99c (c39b99cdecceaca77f66eafbcc38387406826406)
 1: ceph-mds() [0x975caa]
 2: (()+0xfcb0) [0x7fc33b5a5cb0]
 3: (MDLog::standby_trim_segments()+0x192) [0x78a932]
 4: (MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x595f69]
 5: (Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190) [0x7917b0]
 6: (Filer::_probed(Filer::Probe*, object_t const&, unsigned long, utime_t)+0x558) [0x7c6b38]
 7: (Objecter::C_Stat::finish(int)+0xc0) [0x7c7930]
 8: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe48) [0x7b2c78]
 9: (MDS::handle_core_message(Message*)+0xae8) [0x589858]
 10: (MDS::_dispatch(Message*)+0x2f) [0x589a1f]
 11: (MDS::ms_dispatch(Message*)+0x1d3) [0x58b4a3]
 12: (DispatchQueue::entry()+0x3f1) [0x943861]
 13: (DispatchQueue::DispatchThread::entry()+0xd) [0x86e32d]

Fixes: #5333
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit abd0ff64e108b7670a062b3fa39baaf3d3e48fb3)

12 years ago Merge pull request #374 from ceph/wip-5427
Sage Weil [Mon, 24 Jun 2013 17:20:24 +0000 (10:20 -0700)]
Merge pull request #374 from ceph/wip-5427

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years ago test/librados/cmd.cc: use static_cast instead of C-Style cast 376/head
Danny Al-Gaaf [Mon, 24 Jun 2013 13:29:12 +0000 (15:29 +0200)]
test/librados/cmd.cc: use static_cast instead of C-Style cast

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years ago osdc/Objecter.cc: use static_cast instead of C-Style cast
Danny Al-Gaaf [Mon, 24 Jun 2013 13:24:00 +0000 (15:24 +0200)]
osdc/Objecter.cc: use static_cast instead of C-Style cast

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years ago mon/MonClient.cc: use static_cast instead of C-Style cast
Danny Al-Gaaf [Mon, 24 Jun 2013 13:19:41 +0000 (15:19 +0200)]
mon/MonClient.cc: use static_cast instead of C-Style cast

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years ago common/cmdparse.cc: reduce scope of local variable 'pos'
Danny Al-Gaaf [Mon, 24 Jun 2013 12:34:46 +0000 (14:34 +0200)]
common/cmdparse.cc: reduce scope of local variable 'pos'

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years ago common/cmdparse.cc: remove unused variable
Danny Al-Gaaf [Mon, 24 Jun 2013 12:29:50 +0000 (14:29 +0200)]
common/cmdparse.cc: remove unused variable

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years ago osd/OSD.cc: prefer prefix ++operator for non-trivial iterator
Danny Al-Gaaf [Mon, 24 Jun 2013 12:24:14 +0000 (14:24 +0200)]
osd/OSD.cc: prefer prefix ++operator for non-trivial iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years ago OSDMonitor.cc: prefer prefix ++operator for non-trivial iterator
Danny Al-Gaaf [Mon, 24 Jun 2013 12:18:52 +0000 (14:18 +0200)]
OSDMonitor.cc: prefer prefix ++operator for non-trivial iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years ago mon/MonCap.cc: use empty() instead of if(size())
Danny Al-Gaaf [Mon, 24 Jun 2013 12:14:38 +0000 (14:14 +0200)]
mon/MonCap.cc: use empty() instead of if(size())

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years ago common/cmdparse.cc: prefer prefix ++operator for non-trivial iterator
Danny Al-Gaaf [Mon, 24 Jun 2013 11:50:33 +0000 (13:50 +0200)]
common/cmdparse.cc: prefer prefix ++operator for non-trivial iterator

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years ago Merge pull request #375 from ceph/wip-msgr
Gregory Farnum [Mon, 24 Jun 2013 05:42:07 +0000 (22:42 -0700)]
Merge pull request #375 from ceph/wip-msgr

misc msgr fixes

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomsgr: clear_pipe+queue reset when replacing lossy connections 375/head
Sage Weil [Mon, 24 Jun 2013 01:09:55 +0000 (18:09 -0700)]
msgr: clear_pipe+queue reset when replacing lossy connections

We already handle the lossless replacement and lossy fault paths, but
not the lossy replacement.  This fixes an assert(!cleared) in the
reaper.  Adjust comments appropriately.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomsgr: reaper: make sure pipe has been cleared (under pipe_lock)
Sage Weil [Mon, 17 Jun 2013 20:32:38 +0000 (13:32 -0700)]
msgr: reaper: make sure pipe has been cleared (under pipe_lock)

All paths to pipe shutdown should have cleared the con->pipe reference
already.  Assert as much.

Also, do it under pipe_lock!

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomsg/Pipe: goto fail_unlocked on early failures in accept()
Sage Weil [Mon, 17 Jun 2013 21:14:02 +0000 (14:14 -0700)]
msg/Pipe: goto fail_unlocked on early failures in accept()

Instead of duplicating an incomplete cleanup sequence (that does not
clear_pipe()), goto fail_unlocked and do the cleanup in a generic way.
s/rc/r/ while we are here.

Signed-off-by: Sage Weil <sage@inktank.com>
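
The single-cleanup-label pattern described above can be sketched roughly as follows. FakePipe, accept_sketch, and the two boolean checks are hypothetical stand-ins; the real accept() in msg/Pipe does considerably more.

```cpp
#include <cassert>

// Hypothetical stand-in for connection state; not the real Pipe class.
struct FakePipe {
  bool pipe_cleared = false;
  bool socket_closed = false;
};

// Returns 0 on success, -1 on failure. Every early failure jumps to one
// cleanup label instead of repeating a partial cleanup inline.
int accept_sketch(FakePipe& p, bool banner_ok, bool addr_ok) {
  if (!banner_ok)
    goto fail_unlocked;
  if (!addr_ok)
    goto fail_unlocked;
  return 0;

fail_unlocked:
  // Generic cleanup, including the step the duplicated paths missed.
  p.pipe_cleared = true;
  p.socket_closed = true;
  return -1;
}
```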
12 years agomsgr: clear con->pipe inside pipe_lock on mark_down
Sage Weil [Mon, 17 Jun 2013 20:32:07 +0000 (13:32 -0700)]
msgr: clear con->pipe inside pipe_lock on mark_down

We need to do this under protection of the pipe_lock.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomsgr: clear_pipe inside pipe_lock on mark_down_all
Sage Weil [Mon, 17 Jun 2013 19:47:11 +0000 (12:47 -0700)]
msgr: clear_pipe inside pipe_lock on mark_down_all

Observed a segfault in rebind -> mark_down_all -> clear_pipe -> put that
may have been due to a racing thread clearing the connection_state pointer.
Do the clear_pipe() call under the protection of pipe_lock, as we do in
all other contexts.

Signed-off-by: Sage Weil <sage@inktank.com>
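
A rough sketch of the locking rule described above, assuming heavily simplified types. The Connection and Pipe structs and mark_down_sketch below are hypothetical; only the names pipe_lock and connection_state come from the commit messages.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

struct Pipe;

// Hypothetical simplified types; the real Ceph classes differ.
struct Connection {
  Pipe* pipe = nullptr;
};

struct Pipe {
  std::mutex pipe_lock;
  std::shared_ptr<Connection> connection_state;
};

// Detach the connection while holding pipe_lock, so a racing thread
// cannot reset connection_state between the check and the use.
// Returns false if another path already cleared it.
bool mark_down_sketch(Pipe& p) {
  std::lock_guard<std::mutex> guard(p.pipe_lock);
  if (!p.connection_state)
    return false;
  p.connection_state->pipe = nullptr;  // the clear_pipe() step
  p.connection_state.reset();          // drop the pipe's reference
  return true;
}
```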
12 years agomon/AuthMonitor: make initial auth include rotating keys 374/head
Sage Weil [Sun, 23 Jun 2013 16:25:55 +0000 (09:25 -0700)]
mon/AuthMonitor: make initial auth include rotating keys

This closes a very narrow race during mon creation where there are no
service keys.

Fixes: #5427
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: do not leak no_reply messages
Sage Weil [Sun, 23 Jun 2013 15:52:46 +0000 (08:52 -0700)]
mon: do not leak no_reply messages

I think I assumed no_reply() was releasing the references, but it is
not, which is better, since send_reply() doesn't either.  Fix the leaks
by dropping the message ref explicitly.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
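
The ownership rule in the commit above can be illustrated with a toy reference-counted message. Msg, its live counter, and handle are hypothetical; no_reply here is a stand-in that, as the commit message says of the real one, does not consume the caller's reference.

```cpp
#include <cassert>

// Hypothetical reference-counted message; not Ceph's Message class.
struct Msg {
  static int live;  // number of Msg objects currently allocated
  int nref;
  Msg() : nref(1) { ++live; }
  ~Msg() { --live; }
  // Drop one reference; the object deletes itself on the last one.
  void put() {
    assert(nref > 0);
    if (--nref == 0)
      delete this;
  }
};
int Msg::live = 0;

// Stand-in: notes that no reply will be sent. It does not take
// ownership of the caller's reference.
void no_reply(Msg* m) { (void)m; }

// The caller still owns its reference after no_reply() and must drop it.
void handle(Msg* m) {
  no_reply(m);
  m->put();  // the fix: release the ref explicitly
}
```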
12 years agomon: fix leak of MOSDFailure messages
Sage Weil [Sun, 23 Jun 2013 15:53:09 +0000 (08:53 -0700)]
mon: fix leak of MOSDFailure messages

We need to discard/cancel/free the failure report messages before we
cancel a report out.  Assert in the dtor to ensure we didn't forget.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
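
As a sketch of the "assert in the dtor" safeguard mentioned above, assuming a made-up container of pending reports. FailureInfo, add_report, and discard_reports are hypothetical names, not the monitor's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for per-OSD failure tracking.
struct FailureInfo {
  std::vector<std::string*> pending;  // heap-allocated report messages

  void add_report(const std::string& who) {
    pending.push_back(new std::string(who));
  }

  // Frees every held report and returns how many were released.
  std::size_t discard_reports() {
    std::size_t n = pending.size();
    for (std::string* m : pending)
      delete m;
    pending.clear();
    return n;
  }

  // Catches any path that destroys the object without discarding first.
  ~FailureInfo() { assert(pending.empty()); }
};
```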
12 years agodebian: ceph-common requires matching version of python-ceph
Sage Weil [Sat, 22 Jun 2013 17:28:16 +0000 (10:28 -0700)]
debian: ceph-common requires matching version of python-ceph

If the versions skew, the ceph_argparse.py module may be missing.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'next'
Dan Mick [Sat, 22 Jun 2013 01:46:08 +0000 (18:46 -0700)]
Merge branch 'next'

Conflicts:
src/ceph.in

12 years agoAdd header comments and Inktank copyrights to ceph.in/ceph_argparse.py
Dan Mick [Sat, 22 Jun 2013 01:39:59 +0000 (18:39 -0700)]
Add header comments and Inktank copyrights to ceph.in/ceph_argparse.py

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph.in: rip out reusable code to pybind/ceph_argparse.py
Dan Mick [Fri, 21 Jun 2013 23:10:35 +0000 (16:10 -0700)]
ceph.in: rip out reusable code to pybind/ceph_argparse.py

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Conflicts:
src/ceph.in

12 years agoceph: even shinier
Sage Weil [Fri, 21 Jun 2013 22:52:32 +0000 (15:52 -0700)]
ceph: even shinier

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph: do not busy-loop on ceph -w
Sage Weil [Fri, 21 Jun 2013 22:50:59 +0000 (15:50 -0700)]
ceph: do not busy-loop on ceph -w

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agolibrados: make cmd test tolerate NXIO for osd commands
Sage Weil [Fri, 21 Jun 2013 21:53:22 +0000 (14:53 -0700)]
librados: make cmd test tolerate NXIO for osd commands

The cluster may be thrashing underneath us; tolerate NXIO in case the OSD
is currently down.

Signed-off-by: Sage Weil <sage@inktank.com>
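
A test-side check along these lines might look like the following sketch. The helper name osd_cmd_result_ok is hypothetical; it assumes the usual convention of returning a negative errno value on failure.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical test helper: an OSD command result is acceptable if it
// succeeded, or if it failed with ENXIO because the target OSD was down.
bool osd_cmd_result_ok(int r) {
  return r == 0 || r == -ENXIO;
}
```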
12 years agoMerge remote-tracking branch 'gh/wip-mds'
Sage Weil [Fri, 21 Jun 2013 21:25:34 +0000 (14:25 -0700)]
Merge remote-tracking branch 'gh/wip-mds'

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomds: do not assume segment list is non-empty in standby_trim_segments
Sage Weil [Fri, 21 Jun 2013 21:23:45 +0000 (14:23 -0700)]
mds: do not assume segment list is non-empty in standby_trim_segments

If we restart standby replay shortly after startup, before we actually have
any segments, we can trigger a segfault here:

 ceph version 0.64-441-gc39b99c (c39b99cdecceaca77f66eafbcc38387406826406)
 1: ceph-mds() [0x975caa]
 2: (()+0xfcb0) [0x7fc33b5a5cb0]
 3: (MDLog::standby_trim_segments()+0x192) [0x78a932]
 4: (MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x595f69]
 5: (Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190) [0x7917b0]
 6: (Filer::_probed(Filer::Probe*, object_t const&, unsigned long, utime_t)+0x558) [0x7c6b38]
 7: (Objecter::C_Stat::finish(int)+0xc0) [0x7c7930]
 8: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe48) [0x7b2c78]
 9: (MDS::handle_core_message(Message*)+0xae8) [0x589858]
 10: (MDS::_dispatch(Message*)+0x2f) [0x589a1f]
 11: (MDS::ms_dispatch(Message*)+0x1d3) [0x58b4a3]
 12: (DispatchQueue::entry()+0x3f1) [0x943861]
 13: (DispatchQueue::DispatchThread::entry()+0xd) [0x86e32d]

Fixes: #5333
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
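
The class of bug fixed above, touching the ends of a segment list that may be empty, can be sketched as follows. The map layout (segment start offset to end offset), the trimming rule, and the name trim_expired are all assumptions for illustration, not the MDLog implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical simplification: segments keyed by start offset, with the
// end offset as the value. Removes the oldest segments that end at or
// before expire_pos and returns how many were removed.
int trim_expired(std::map<std::uint64_t, std::uint64_t>& segments,
                 std::uint64_t expire_pos) {
  int removed = 0;
  // Check empty() first: begin() on an empty map equals end() and must
  // not be dereferenced.
  while (!segments.empty() && segments.begin()->second <= expire_pos) {
    segments.erase(segments.begin());
    ++removed;
  }
  return removed;
}
```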
12 years agomds: rev protocol
Sage Weil [Fri, 21 Jun 2013 15:20:33 +0000 (08:20 -0700)]
mds: rev protocol

Commit 18b9e63b4df643e1f2fb8f17416089e5d970bf60 changed the OTW lock
encoding.

Signed-off-by: Sage Weil <sage@inktank.com>