git.apps.os.sepia.ceph.com Git - ceph.git/log
12 years agoAdd python-argparse to dependencies (for pre-2.7 systems)
Dan Mick [Mon, 24 Jun 2013 21:50:07 +0000 (14:50 -0700)]
Add python-argparse to dependencies (for pre-2.7 systems)

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agomds: do not assume segment list is non-empty in standby_trim_segments
Sage Weil [Fri, 21 Jun 2013 21:23:45 +0000 (14:23 -0700)]
mds: do not assume segment list is non-empty in standby_trim_segments

If we restart standby replay shortly after startup, before we actually have
any segments, we can trigger a segfault here:

 ceph version 0.64-441-gc39b99c (c39b99cdecceaca77f66eafbcc38387406826406)
 1: ceph-mds() [0x975caa]
 2: (()+0xfcb0) [0x7fc33b5a5cb0]
 3: (MDLog::standby_trim_segments()+0x192) [0x78a932]
 4: (MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x595f69]
 5: (Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190) [0x7917b0]
 6: (Filer::_probed(Filer::Probe*, object_t const&, unsigned long, utime_t)+0x558) [0x7c6b38]
 7: (Objecter::C_Stat::finish(int)+0xc0) [0x7c7930]
 8: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe48) [0x7b2c78]
 9: (MDS::handle_core_message(Message*)+0xae8) [0x589858]
 10: (MDS::_dispatch(Message*)+0x2f) [0x589a1f]
 11: (MDS::ms_dispatch(Message*)+0x1d3) [0x58b4a3]
 12: (DispatchQueue::entry()+0x3f1) [0x943861]
 13: (DispatchQueue::DispatchThread::entry()+0xd) [0x86e32d]

Fixes: #5333
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit abd0ff64e108b7670a062b3fa39baaf3d3e48fb3)

12 years agoMerge pull request #374 from ceph/wip-5427
Sage Weil [Mon, 24 Jun 2013 17:20:24 +0000 (10:20 -0700)]
Merge pull request #374 from ceph/wip-5427

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon/AuthMonitor: make initial auth include rotating keys 374/head
Sage Weil [Sun, 23 Jun 2013 16:25:55 +0000 (09:25 -0700)]
mon/AuthMonitor: make initial auth include rotating keys

This closes a very narrow race during mon creation where there are no
service keys.

Fixes: #5427
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: do not leak no_reply messages
Sage Weil [Sun, 23 Jun 2013 15:52:46 +0000 (08:52 -0700)]
mon: do not leak no_reply messages

I think I assumed no_reply() was releasing the references, but it is
not.  Which is better, since send_reply() doesn't either.  Fix the leaks
by dropping the message ref explicitly.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: fix leak of MOSDFailure messages
Sage Weil [Sun, 23 Jun 2013 15:53:09 +0000 (08:53 -0700)]
mon: fix leak of MOSDFailure messages

We need to discard/cancel/free the failure report messages before we
cancel a report out.  Assert in the dtor to ensure we didn't forget.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agodebian: ceph-common requires matching version of python-ceph
Sage Weil [Sat, 22 Jun 2013 17:28:16 +0000 (10:28 -0700)]
debian: ceph-common requires matching version of python-ceph

If they skew the ceph_argparse.py module may be missing.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoAdd header comments and Inktank copyrights to ceph.in/ceph_argparse.py
Dan Mick [Sat, 22 Jun 2013 01:39:59 +0000 (18:39 -0700)]
Add header comments and Inktank copyrights to ceph.in/ceph_argparse.py

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph.in: rip out reusable code to pybind/ceph_argparse.py
Dan Mick [Fri, 21 Jun 2013 23:10:35 +0000 (16:10 -0700)]
ceph.in: rip out reusable code to pybind/ceph_argparse.py

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Conflicts:
src/ceph.in

12 years agoceph: even shinier
Sage Weil [Fri, 21 Jun 2013 22:52:32 +0000 (15:52 -0700)]
ceph: even shinier

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph: do not busy-loop on ceph -w
Sage Weil [Fri, 21 Jun 2013 22:50:59 +0000 (15:50 -0700)]
ceph: do not busy-loop on ceph -w

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agolibrados: make cmd test tolerate NXIO for osd commands
Sage Weil [Fri, 21 Jun 2013 21:53:22 +0000 (14:53 -0700)]
librados: make cmd test tolerate NXIO for osd commands

The cluster may be thrashing underneath us; tolerate NXIO in case the OSD
is currently down.

Signed-off-by: Sage Weil <sage@inktank.com>
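
The tolerance described in the librados cmd test commit above can be illustrated with a small sketch. It assumes the command call returns 0 on success and a negative errno on failure, as librados calls generally do; the helper name `osd_command_result_ok` is hypothetical and is not taken from the test itself.

```python
import errno


def osd_command_result_ok(ret):
    """Accept success, or ENXIO when the target OSD happens to be down.

    Assumption: `ret` is 0 on success and a negative errno on failure.
    """
    return ret in (0, -errno.ENXIO)
```

Any other error code still counts as a test failure, so real problems are not masked while the cluster is thrashing.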
12 years agoceph.in: remove some TAB chars
Dan Mick [Thu, 20 Jun 2013 22:12:24 +0000 (15:12 -0700)]
ceph.in: remove some TAB chars

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph.in: fix ^C handling in watch (trap exception in while, too)
Dan Mick [Thu, 20 Jun 2013 22:11:03 +0000 (15:11 -0700)]
ceph.in: fix ^C handling in watch (trap exception in while, too)

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph: --version as well as -v
Sage Weil [Thu, 20 Jun 2013 22:04:51 +0000 (15:04 -0700)]
ceph: --version as well as -v

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoqa/workunits/misc/multiple_rsync.sh: wtf
Sage Weil [Sun, 16 Jun 2013 03:42:39 +0000 (20:42 -0700)]
qa/workunits/misc/multiple_rsync.sh: wtf

2013-06-15T12:55:29.808 INFO:teuthology.task.workunit.client.0.err:+ rsync -auv --exclude local/ /usr/ usr.1
2013-06-15T12:55:29.808 INFO:teuthology.task.workunit.client.0.err:+ tee a
2013-06-15T12:55:29.820 INFO:teuthology.task.workunit.client.0.out:sending incremental file list
2013-06-15T12:56:46.019 INFO:teuthology.task.workunit.client.0.out:
2013-06-15T12:56:46.020 INFO:teuthology.task.workunit.client.0.out:sent 1452634 bytes  received 7485 bytes  19086.52 bytes/sec
2013-06-15T12:56:46.020 INFO:teuthology.task.workunit.client.0.out:total size is 3205063225  speedup is 2195.07
2013-06-15T12:56:46.020 INFO:teuthology.task.workunit.client.0.err:+ wc -l a
2013-06-15T12:56:46.021 INFO:teuthology.task.workunit.client.0.out:4 a
2013-06-15T12:56:46.022 INFO:teuthology.task.workunit.client.0.err:+ wc -l a
2013-06-15T12:56:46.022 INFO:teuthology.task.workunit.client.0.err:+ grep 4
2013-06-15T12:56:46.023 INFO:teuthology.task.workunit.client.0.out:4 a
2013-06-15T12:56:46.024 INFO:teuthology.task.workunit.client.0.err:+ rsync -auv --exclude local/ /usr/ usr.2
2013-06-15T12:56:46.024 INFO:teuthology.task.workunit.client.0.err:+ tee a
2013-06-15T12:56:46.112 INFO:teuthology.task.workunit.client.0.out:sending incremental file list
2013-06-15T12:57:17.172 INFO:teuthology.task.workunit.client.0.out:
2013-06-15T12:57:17.174 INFO:teuthology.task.workunit.client.0.out:sent 1452634 bytes  received 7485 bytes  46352.98 bytes/sec
2013-06-15T12:57:17.174 INFO:teuthology.task.workunit.client.0.out:total size is 3205063225  speedup is 2195.07
2013-06-15T12:57:17.175 INFO:teuthology.task.workunit.client.0.err:+ wc -l a
2013-06-15T12:57:17.175 INFO:teuthology.task.workunit.client.0.out:3 a

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 21e85f90be3e4915376106dd384f6982086e2311)

12 years agoqa/workunits/cephtool/test.sh: fix and cleanup several tests
Sage Weil [Thu, 20 Jun 2013 18:28:26 +0000 (11:28 -0700)]
qa/workunits/cephtool/test.sh: fix and cleanup several tests

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: drop deprecated 'stop_cluster'
Sage Weil [Thu, 20 Jun 2013 18:23:38 +0000 (11:23 -0700)]
mon: drop deprecated 'stop_cluster'

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomds: make 'mds compat rm_*compat' idempotent
Sage Weil [Thu, 20 Jun 2013 18:23:11 +0000 (11:23 -0700)]
mds: make 'mds compat rm_*compat' idempotent

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: make 'log ...' command wait for commit before reply
Sage Weil [Thu, 20 Jun 2013 18:11:50 +0000 (11:11 -0700)]
mon: make 'log ...' command wait for commit before reply

Previously we would just dump the command argument to our local log client
and reply immediately, which could lose the message if we then restarted.
Instead, commit directly and wait before replying.

Also, log as the actual client, not as the monitor processing the message.

Fixes: #5409
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoa/workunits/cephtool/test.sh: --no-log-to-stderr when examining stderr
Sage Weil [Thu, 20 Jun 2013 18:04:26 +0000 (11:04 -0700)]
a/workunits/cephtool/test.sh: --no-log-to-stderr when examining stderr

We can get random messages to stderr from socket reconnects and such;
discard those if we are looking at stderr in the test.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: more fix dout use in sync_requester_abort()
Sage Weil [Thu, 20 Jun 2013 16:46:42 +0000 (09:46 -0700)]
mon: more fix dout use in sync_requester_abort()

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: fix raw use of *_dout in sync_requester_abort()
Sage Weil [Mon, 10 Jun 2013 18:48:25 +0000 (11:48 -0700)]
mon: fix raw use of *_dout in sync_requester_abort()

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoFileStore: handle observers in constructor/destructor
Samuel Just [Thu, 20 Jun 2013 02:46:06 +0000 (19:46 -0700)]
FileStore: handle observers in constructor/destructor

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoFileStore: apply changes after disabling m_filestore_replica_fadvise
Samuel Just [Thu, 20 Jun 2013 01:57:05 +0000 (18:57 -0700)]
FileStore: apply changes after disabling m_filestore_replica_fadvise

Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit ed8b0e65bde14d0a3a08bc233dee6a997e379dcc)

12 years agoceph-disk: make list_partition behave with unusual device names
Alexandre Maragone [Tue, 18 Jun 2013 23:18:01 +0000 (16:18 -0700)]
ceph-disk: make list_partition behave with unusual device names

When you get device names like sdaa you do not want to mistakenly conclude that
sdaa is a partition of sda.  Use /sys/block/$device/$partition existence
instead.

Fixes: #5211
Backport: cuttlefish
Signed-off-by: Alexandre Maragone <alexandre.maragone@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
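
The sysfs check described in the ceph-disk list_partition commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: the function names and the `sys_block` parameter (added so the logic can be exercised against a fake directory tree) are assumptions.

```python
import os


def is_partition_of(device, candidate, sys_block="/sys/block"):
    """Return True if `candidate` is a partition of `device`.

    A partition shows up as a subdirectory of its parent device
    (e.g. /sys/block/sda/sda1), whereas sdaa is a disk of its own
    with a separate /sys/block/sdaa entry.
    """
    return os.path.isdir(os.path.join(sys_block, device, candidate))


def list_partitions(device, sys_block="/sys/block"):
    """List partitions of `device` using sysfs layout, not just name prefixes."""
    base = os.path.join(sys_block, device)
    if not os.path.isdir(base):
        return []
    return sorted(
        name
        for name in os.listdir(base)
        if name.startswith(device) and is_partition_of(device, name, sys_block)
    )
```

With this approach `sdaa` is never reported as a partition of `sda`, because there is no `/sys/block/sda/sdaa` directory.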
12 years agoos/FileStore: disable fadvise on XFS
Sage Weil [Wed, 19 Jun 2013 04:44:15 +0000 (21:44 -0700)]
os/FileStore: disable fadvise on XFS

fadvise(DONTNEED) on XFS can break writeback ordering and zeroing; see

      http://oss.sgi.com/archives/xfs/2013-06/msg00066.html

If we detect XFS, turn this option off.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoRevert "client: fix warning"
Sage Weil [Wed, 19 Jun 2013 16:58:41 +0000 (09:58 -0700)]
Revert "client: fix warning"

This reverts commit 4a3127f48d75121745f81d1aba723cb7f867f790.

Wrong branch.

12 years agomon: Monitor: make sure we backup a monmap during sync start
Joao Eduardo Luis [Wed, 19 Jun 2013 01:50:45 +0000 (02:50 +0100)]
mon: Monitor: make sure we backup a monmap during sync start

First of all, we must find a monmap to back up: the newest version.

Secondly, we must make sure we back it up before clearing the store.

Finally, we must make sure that we don't remove said backup while
clearing the store; otherwise, we would be out of a backup monmap if the
sync happened to fail (and if the monitor happened to be killed before a
new sync had finished).

This patch makes sure these conditions are met.

Fixes: #5256 (partially)
Backport: cuttlefish

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomon: Monitor: obtain latest monmap on sync store init
Joao Eduardo Luis [Wed, 19 Jun 2013 01:36:44 +0000 (02:36 +0100)]
mon: Monitor: obtain latest monmap on sync store init

Always use the highest version amongst all the typically available
monmaps: whatever we have in memory, whatever we have under the
MonmapMonitor's store, and whatever we have backed up from a previous
sync.  This ensures we always use the newest version we have come
across.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
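
The selection rule in the commit above (take the highest version among the in-memory monmap, the MonmapMonitor store copy, and the sync backup) can be sketched as below. The dict representation with a `version` key and the name `latest_monmap` are hypothetical; the monitor itself is C++ and its data structures are not shown in this log.

```python
def latest_monmap(*candidates):
    """Return the candidate with the highest version.

    Each candidate is a dict with a 'version' key, or None when that
    source has no monmap. Returns None if no source has one.
    """
    available = [c for c in candidates if c is not None]
    if not available:
        return None
    return max(available, key=lambda c: c["version"])
```

For example, `latest_monmap(in_memory, from_store, from_backup)` returns the backup copy only when it is strictly newer than the other two.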
12 years agomon: Monitor: don't remove 'mon_sync' when clearing the store during abort
Joao Eduardo Luis [Wed, 19 Jun 2013 01:21:58 +0000 (02:21 +0100)]
mon: Monitor: don't remove 'mon_sync' when clearing the store during abort

Otherwise, we will end up losing the monmap we backed up when we started
the sync, and the monitor may be unable to start if it is killed or
crashes in-between the sync abort and finishing a new sync.

Fixes: #5256 (partially)
Backport: cuttlefish

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoAuthMonitor: auth export's status message to ss, not ds
Dan Mick [Tue, 18 Jun 2013 22:44:04 +0000 (15:44 -0700)]
AuthMonitor: auth export's status message to ss, not ds

This puts it on stderr, not stdout

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph.spec: create /var/run on package install
Sage Weil [Tue, 18 Jun 2013 21:51:08 +0000 (14:51 -0700)]
ceph.spec: create /var/run on package install

The %ghost %dir ... line will make this get cleaned up but won't install
it.

Reported-by: Derek Yarnell <derek@umiacs.umd.edu>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
12 years agotest_rados.py: add some tests for mon_command
Dan Mick [Tue, 18 Jun 2013 19:20:33 +0000 (12:20 -0700)]
test_rados.py: add some tests for mon_command

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agorados.py: wrap target in c_char_p()
Dan Mick [Tue, 18 Jun 2013 18:05:52 +0000 (11:05 -0700)]
rados.py: wrap target in c_char_p()

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agorados.py: return error strings even if ret != 0
Dan Mick [Tue, 18 Jun 2013 18:05:10 +0000 (11:05 -0700)]
rados.py: return error strings even if ret != 0

Key rados_free() off returned length, not ret

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoceph.in: pass parsed conffile to Rados constructor
Dan Mick [Tue, 18 Jun 2013 18:04:15 +0000 (11:04 -0700)]
ceph.in: pass parsed conffile to Rados constructor

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoceph.in: global var dontsplit should be capitalized
Dan Mick [Tue, 18 Jun 2013 18:03:24 +0000 (11:03 -0700)]
ceph.in: global var dontsplit should be capitalized

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoclient: fix warning
Sage Weil [Tue, 18 Jun 2013 21:09:18 +0000 (14:09 -0700)]
client: fix warning

signed/unsigned comparison

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocommon/Preforker: fix warning
Sage Weil [Tue, 18 Jun 2013 03:32:15 +0000 (20:32 -0700)]
common/Preforker: fix warning

common/Preforker.h: In member function ‘int Preforker::signal_exit(int)’:
warning: common/Preforker.h:82:45: ignoring return value of ‘ssize_t safe_write(int, const void*, size_t)’, declared with attribute warn_unused_result [-Wunused-result]

This is harder than it should be to fix.  :(
  http://stackoverflow.com/questions/3614691/casting-to-void-doesnt-remove-warn-unused-result-error

Whatever, I guess we can do something useful with this return value.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
12 years agoclient: fix warning
Sage Weil [Tue, 18 Jun 2013 03:28:24 +0000 (20:28 -0700)]
client: fix warning

client/Client.cc: In member function 'virtual void Client::ms_handle_remote_reset(Connection*)':
warning: client/Client.cc:7892:9: enumeration value 'STATE_NEW' not handled in switch [-Wswitch]
warning: client/Client.cc:7892:9: enumeration value 'STATE_OPEN' not handled in switch [-Wswitch]
warning: client/Client.cc:7892:9: enumeration value 'STATE_CLOSED' not handled in switch [-Wswitch]

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
12 years agoclient: handle reset during initial mds session open
Sage Weil [Mon, 17 Jun 2013 23:38:26 +0000 (16:38 -0700)]
client: handle reset during initial mds session open

If we get a reset during our attempt to open an MDS session, close out the
Connection* and retry to open the session, moving the waiters over.

Fixes: #5379
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomon: fix 'osd dump <epoch>'
Sage Weil [Mon, 17 Jun 2013 23:39:30 +0000 (16:39 -0700)]
mon: fix 'osd dump <epoch>'

The optional epoch argument was missing from the command spec.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoMerge branch 'wip-5194' into next
Sage Weil [Mon, 17 Jun 2013 22:46:47 +0000 (15:46 -0700)]
Merge branch 'wip-5194' into next

Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
12 years agoceph-disk: add some notes on wth we are up to
Sage Weil [Mon, 17 Jun 2013 22:43:40 +0000 (15:43 -0700)]
ceph-disk: add some notes on wth we are up to

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoOSD: we need to check pg ?.0 for resurrection
Samuel Just [Mon, 17 Jun 2013 20:09:21 +0000 (13:09 -0700)]
OSD: we need to check pg ?.0 for resurrection

Fixes: #5269
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoceph-disk: clear TERM to avoid libreadline hijinx
Sage Weil [Fri, 14 Jun 2013 23:29:10 +0000 (16:29 -0700)]
ceph-disk: clear TERM to avoid libreadline hijinx

The weird output from libreadline users is related to the TERM variable.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk-udev: set up by-partuuid, -typeuuid symlinks on ancient udev
Sage Weil [Mon, 17 Jun 2013 16:49:46 +0000 (09:49 -0700)]
ceph-disk-udev: set up by-partuuid, -typeuuid symlinks on ancient udev

Make the ancient-udev/blkid workaround script for RHEL/CentOS create the
symlinks for us too.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk: do not stop activate-all on first failure
Sage Weil [Sun, 16 Jun 2013 03:06:33 +0000 (20:06 -0700)]
ceph-disk: do not stop activate-all on first failure

Keep going even if we hit one activation error.  This avoids failing to
start some disks when only one of them won't start (e.g., because it
doesn't belong to the current cluster).

Signed-off-by: Sage Weil <sage@inktank.com>
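
The keep-going behavior described in the activate-all commit above can be sketched like this. It is an illustration under assumptions: `activate` stands in for whatever per-device activation routine is used, and both names are hypothetical rather than taken from ceph-disk.

```python
def activate_all(paths, activate):
    """Try to activate every path; record failures instead of stopping.

    `activate` is any callable that raises an exception on failure.
    Returns a list of (path, error message) pairs for the failed devices.
    """
    failures = []
    for path in paths:
        try:
            activate(path)
        except Exception as e:
            # One bad disk (e.g. one from another cluster) must not
            # prevent the remaining disks from being started.
            failures.append((path, str(e)))
    return failures
```

The caller can still report a non-zero exit status when `failures` is non-empty, after every device has been given a chance to start.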
12 years agoceph.spec: include partuuid rules in package
Sage Weil [Fri, 14 Jun 2013 23:30:24 +0000 (16:30 -0700)]
ceph.spec: include partuuid rules in package

Commit f3234c147e083f2904178994bc85de3d082e2836 missed this.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agorgw: escape prefix correctly when listing objects
Yehuda Sadeh [Fri, 14 Jun 2013 21:53:54 +0000 (14:53 -0700)]
rgw: escape prefix correctly when listing objects

Fixes: #5362
When listing objects, the prefix needs to be escaped correctly (the
same as the marker). Otherwise, listing objects with a prefix
that starts with an underscore doesn't work.
Backport: bobtail, cuttlefish

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoclient: fix ancient typo in caps revocation path
Sage Weil [Sat, 15 Jun 2013 15:48:37 +0000 (08:48 -0700)]
client: fix ancient typo in caps revocation path

If we have dropped all references to a revoked capability, send the ack
to the MDS.  This typo has been there since v0.7 (early 2009)!

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoceph.spec: install/uninstall init script
Sage Weil [Fri, 14 Jun 2013 22:01:14 +0000 (15:01 -0700)]
ceph.spec: install/uninstall init script

This was commented out almost years ago in commit 9baf5ef4 but it is not
clear to me that it was correct to do so.  In any case, we are not
installing the rc.d links for ceph, which means it does not start up after
a reboot.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agosysvinit, upstart: ceph-disk activate-all on start
Sage Weil [Fri, 14 Jun 2013 20:39:03 +0000 (13:39 -0700)]
sysvinit, upstart: ceph-disk activate-all on start

On 'service ceph start' or 'service ceph start osd' or start ceph-osd-all
we should activate any osd GPT partitions.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk: add 'activate-all'
Sage Weil [Fri, 14 Jun 2013 20:34:40 +0000 (13:34 -0700)]
ceph-disk: add 'activate-all'

Scan /dev/disk/by-parttypeuuid for ceph OSDs and activate them all.  This
is useful when activation didn't trigger on the initial udev event for
some reason.

Signed-off-by: Sage Weil <sage@inktank.com>
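
The scan described in the 'activate-all' commit above might look roughly like the sketch below, assuming directory entries are named `$type-$uuid` as in the udev rule from the same series. The function name is hypothetical, and the partition type UUID is passed in by the caller rather than hard-coded here.

```python
import os


def find_partitions_of_type(type_uuid, directory="/dev/disk/by-parttypeuuid"):
    """Return paths of entries whose name starts with '<type_uuid>-'.

    Entries are assumed to be named '<type uuid>-<partition uuid>'.
    A missing directory yields an empty list.
    """
    if not os.path.isdir(directory):
        return []
    prefix = type_uuid.lower() + "-"
    return [
        os.path.join(directory, name)
        for name in sorted(os.listdir(directory))
        if name.lower().startswith(prefix)
    ]
```

Each returned path can then be handed to the activation step, one device at a time.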
12 years agoudev: /dev/disk/by-parttypeuuid/$type-$uuid
Sage Weil [Fri, 14 Jun 2013 20:23:52 +0000 (13:23 -0700)]
udev: /dev/disk/by-parttypeuuid/$type-$uuid

We need this to help trigger OSD activations.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agomon: make mark_me_down asserts match check
Sage Weil [Mon, 17 Jun 2013 03:13:51 +0000 (20:13 -0700)]
mon: make mark_me_down asserts match check

The OSD may have sent a request where the message source does not match
the target in the message.  Verify that the target matches so that it
matches the assert.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph: remove space when prefix is blank
Sage Weil [Sun, 16 Jun 2013 23:49:05 +0000 (16:49 -0700)]
ceph: remove space when prefix is blank

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph: fix return code for multi-target commands
Sage Weil [Sun, 16 Jun 2013 23:48:41 +0000 (16:48 -0700)]
ceph: fix return code for multi-target commands

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph: error out properly when failing to get commands
Sage Weil [Sun, 16 Jun 2013 23:48:27 +0000 (16:48 -0700)]
ceph: error out properly when failing to get commands

If we make ret positive here we miss the failure check below.  Instead,
just set outs appropriately.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agotest/admin_socket/objecter_requests: fix test
Sage Weil [Sun, 16 Jun 2013 23:42:27 +0000 (16:42 -0700)]
test/admin_socket/objecter_requests: fix test

Commit 2bda9db1c24530cbaaa161b7ff0a80efa913aa78 added command_ops
to the result.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph: do not print status to output file when talking to old mons
Sage Weil [Sun, 16 Jun 2013 20:36:19 +0000 (13:36 -0700)]
ceph: do not print status to output file when talking to old mons

The old cli would send the status message to stdout instead of stderr;
we try to emulate that behavior when talking to old monitors because
they send some useful data to outs instead of the data payload.
However, when outputting to a *file*, the outs would still go to
stdout.  Maintain that so that, e.g.,

 ceph mon getmap -o /tmp/foo

doesn't prefix the monmap with 'got latest monmap\n'.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agocommon/Preforker: fix broken recursion on exit(3)
Sage Weil [Sat, 15 Jun 2013 15:14:40 +0000 (08:14 -0700)]
common/Preforker: fix broken recursion on exit(3)

If we exit via preforker, call exit(3) and not recursively back into
Preforker::exit(r).  Otherwise you get a hang with the child blocked
at:

Thread 1 (Thread 0x7fa08962e7c0 (LWP 5419)):
#0  0x000000309860e0cd in write () from /lib64/libpthread.so.0
#1  0x00000000005cc906 in Preforker::exit(int) ()
#2  0x00000000005c8dfb in main ()

and the parent at

#0  0x000000309860eba7 in waitpid () from /lib64/libpthread.so.0
#1  0x00000000005cc87a in Preforker::parent_wait() ()
#2  0x00000000005c75ae in main ()

Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd/OSDMap: fix is_blacklisted()
Sage Weil [Sat, 15 Jun 2013 16:10:46 +0000 (09:10 -0700)]
osd/OSDMap: fix is_blacklisted()

You can only call set_port() if is_ip() is true (there is an assert in
the accessor).

Fixes: #5366
Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph: pass --format=foo to old monitors
Sage Weil [Sat, 15 Jun 2013 00:30:02 +0000 (17:30 -0700)]
ceph: pass --format=foo to old monitors

And --threshold too, although.. really.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph: add newline when using old monitors
Sage Weil [Sat, 15 Jun 2013 00:30:44 +0000 (17:30 -0700)]
ceph: add newline when using old monitors

The old tool would print a newline after outs, e.g. from 'ceph osd create'.
Do the same when we are talking to old monitors.  Also, put outs at the
top, not the bottom!

Tweak the json code to not add the newline again if we already did so
above.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph.in: zero-arg invocation was broken (check array length)
Dan Mick [Fri, 14 Jun 2013 23:51:40 +0000 (16:51 -0700)]
ceph.in: zero-arg invocation was broken (check array length)

Also remove stray comment char

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agorules: Don't disable tcmalloc on ARM (and other non-intel)
Gary Lowell [Thu, 13 Jun 2013 23:38:26 +0000 (16:38 -0700)]
rules:  Don't disable tcmalloc on ARM (and other non-intel)

Fixes #5342

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agoudev: drop useless --mount argument to ceph-disk
Sage Weil [Fri, 14 Jun 2013 05:02:03 +0000 (22:02 -0700)]
udev: drop useless --mount argument to ceph-disk

It doesn't mean anything anymore; drop it.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk-udev: activate-journal
Sage Weil [Fri, 14 Jun 2013 05:01:34 +0000 (22:01 -0700)]
ceph-disk-udev: activate-journal

Trigger 'ceph-disk activate-journal' from the alt udev rules.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk: do not use mount --move (or --bind)
Sage Weil [Fri, 14 Jun 2013 04:56:23 +0000 (21:56 -0700)]
ceph-disk: do not use mount --move (or --bind)

The kernel does not let you mount --move when the parent mount is
shared (see, e.g., https://bugzilla.redhat.com/show_bug.cgi?id=917008
for another person who was also confused by this).  We can't use --bind either
since that (on RHEL at least) screws up /etc/mtab so that the final
result looks like

 /var/lib/ceph/tmp/mnt.HNHoXU /var/lib/ceph/osd/ceph-0 none rw,bind 0 0

Instead, mount the original dev in the final location and then umount
from the old location.

Signed-off-by: Sage Weil <sage@inktank.com>
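
The replacement sequence from the commit above (mount the device a second time at its final location, then unmount the temporary one) can be written out as a command list. This is only a sketch: the function name is hypothetical, and filesystem type and mount options are deliberately left out.

```python
def relocate_mount_commands(dev, tmp_path, final_path):
    """Return the commands that move a mount without --move or --bind.

    The device is mounted directly at `final_path` first, so it is never
    left unmounted, and only then is the temporary mount point released.
    """
    return [
        ["mount", dev, final_path],
        ["umount", tmp_path],
    ]
```

Because the second mount is an ordinary mount of the block device, /etc/mtab records the real device rather than a `none ... bind` entry like the one quoted above.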
12 years agoceph.spec: include by-partuuid udev workaround rules
Sage Weil [Fri, 14 Jun 2013 04:22:53 +0000 (21:22 -0700)]
ceph.spec: include by-partuuid udev workaround rules

These are needed for old or buggy udev.  Having them for new and unbroken
udev is harmless.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph.spec: add missing ceph_test_rados_api_cmd to package
Sage Weil [Fri, 14 Jun 2013 04:21:28 +0000 (21:21 -0700)]
ceph.spec: add missing ceph_test_rados_api_cmd to package

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph: flush stderr, stdout for sane output; add prefix
Sage Weil [Fri, 14 Jun 2013 19:35:46 +0000 (12:35 -0700)]
ceph: flush stderr, stdout for sane output; add prefix

Aie.

e.g., ceph tell mon.* injectargs '--debug-ms 1'

 mon.a: injectargs:debug_ms=1/1
 mon.b: injectargs:debug_ms=1/1
 mon.c: injectargs:debug_ms=1/1

or

 osd.0: debug_ms=1/1
 osd.1: debug_ms=1/1
 osd.2: Problem getting command descriptions from ('osd', '2'), ENXIO
 osd.3: Problem getting command descriptions from ('osd', '3'), ENXIO
 osd.4: Problem getting command descriptions from ('osd', '4'), ENXIO
 osd.5: Problem getting command descriptions from ('osd', '5'), ENXIO

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoceph-disk: work around buggy rhel/centos parted
Sage Weil [Fri, 14 Jun 2013 19:10:49 +0000 (12:10 -0700)]
ceph-disk: work around buggy rhel/centos parted

parted on RHEL/Centos prefixes the *machine readable output* with

 1b 5b 3f 31 30 33 34 68

Note that the same thing happens when you 'import readline' in python.

Work around it!

Signed-off-by: Sage Weil <sage@inktank.com>
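
The commit above does not say how the workaround is implemented. One possible approach, shown here purely as an assumption, is to strip the eight quoted bytes (ESC `[?1034h`) from the start of the captured output before parsing it. The names below are hypothetical.

```python
# The eight bytes quoted in the commit message: ESC [ ? 1 0 3 4 h
TERMINAL_PREFIX = bytes.fromhex("1b5b3f3130333468")


def strip_terminal_prefix(output):
    """Drop the escape sequence if it leads the output; otherwise return it unchanged."""
    if output.startswith(TERMINAL_PREFIX):
        return output[len(TERMINAL_PREFIX):]
    return output
```

Another option, in line with the later "clear TERM" commit in this log, is to keep the sequence from being emitted in the first place by adjusting the environment of the child process.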
12 years agomon: OSDMonitor: don't ignore apply_incremental()'s return on UfP [1]
Joao Eduardo Luis [Fri, 14 Jun 2013 16:11:43 +0000 (17:11 +0100)]
mon: OSDMonitor: don't ignore apply_incremental()'s return on UfP [1]

apply_incremental() may return -EINVAL.  Don't ignore it.

[1] UfP = Update from Paxos

Fixes: #5343
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoupstart: start ceph-all on runlevel [2345]
Sage Weil [Fri, 14 Jun 2013 18:21:25 +0000 (11:21 -0700)]
upstart: start ceph-all on runlevel [2345]

Starting when only one network interface has started breaks machines with
multiple nics in very problematic ways.

There may be an earlier trigger that we can use for cases where other
services on the local machine depend on ceph, but for now this is better
than the existing behavior.

See #5248

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph: fix mon.*
Sage Weil [Fri, 14 Jun 2013 18:00:46 +0000 (11:00 -0700)]
ceph: fix mon.*

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #362 from ceph/wip-4984
Dan Mick [Fri, 14 Jun 2013 02:37:37 +0000 (19:37 -0700)]
Merge pull request #362 from ceph/wip-4984

ceph-disk: udev/partprobe redo, zap command, activate-journal command

12 years agoceph-fuse: fix uninitialized variable
Sage Weil [Fri, 14 Jun 2013 01:13:34 +0000 (18:13 -0700)]
ceph-fuse: fix uninitialized variable

There is a delete call in the out_mc_start_failed path.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk: implement 'activate-journal' 362/head
Sage Weil [Thu, 13 Jun 2013 22:54:58 +0000 (15:54 -0700)]
ceph-disk: implement 'activate-journal'

Activate an osd via its journal device.  udev populates its symlinks and
triggers events in an order that is not related to whether the device is
an osd data partition or a journal.  That means that triggering
'ceph-disk activate' can happen before the journal (or journal symlink)
is present and then fail.

Similarly, it may be that they are on different disks that are hotplugged
with the journal second.

This can be wired up to the journal partition type to ensure that osds are
started when the journal appears second.

Include the udev rules to trigger this.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk: call partprobe outside of the prepare lock; drop udevadm settle
Sage Weil [Wed, 12 Jun 2013 01:35:01 +0000 (18:35 -0700)]
ceph-disk: call partprobe outside of the prepare lock; drop udevadm settle

After we change the final partition type, sgdisk may or may not trigger a
udev event, depending on how well udev is behaving (it varies between
distros, it seems).  The old code would often settle and wait for udev to
activate the device, and then partprobe would uselessly fail because it
was already mounted.

Call partprobe only at the very end, after prepare is done.  This ensures
that if partprobe calls udevadm settle (which it sometimes does) we do not
get stuck.

Drop the udevadm settle.  I'm not sure what this accomplishes; take it out,
at least until we determine we need it.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoceph-disk: add 'zap' command
Sage Weil [Thu, 13 Jun 2013 18:03:37 +0000 (11:03 -0700)]
ceph-disk: add 'zap' command

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agolibrados: add missing #include
Sage Weil [Fri, 14 Jun 2013 00:38:02 +0000 (17:38 -0700)]
librados: add missing #include

librados/librados.cc: In function 'int rados_mon_command_target(void*, const char*, const char**, size_t, const char*, size_t, char**, size_t*, char**, size_t*)':
error: librados/librados.cc:1877: 'LONG_MAX' was not declared in this scope
error: librados/librados.cc:1877: 'LONG_MIN' was not declared in this scope

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago librados: wait for osdmap for commands that need it
Sage Weil [Thu, 13 Jun 2013 23:39:30 +0000 (16:39 -0700)]
librados: wait for osdmap for commands that need it

In commit 7e1cf87b5158c870e2a118ed6d316be8cb9818ce we stopped waiting for
the osdmap on start because the Objecter will normally wait, but for some
commands we assume the osdmap is recent(ish).

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years ago Merge branch 'wip-objecter' into next
Sage Weil [Thu, 13 Jun 2013 23:15:44 +0000 (16:15 -0700)]
Merge branch 'wip-objecter' into next

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years ago osdc/Objecter: dump command ops
Sage Weil [Thu, 13 Jun 2013 23:01:31 +0000 (16:01 -0700)]
osdc/Objecter: dump command ops

Dump command_ops along with everything else.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago osdc/Objecter: ping osds for which we have pending commands
Sage Weil [Thu, 13 Jun 2013 22:57:57 +0000 (15:57 -0700)]
osdc/Objecter: ping osds for which we have pending commands

As with ops and linger_ops, this ensures we detect connection resets.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago ceph.in: refuse 'ceph <type> tell' commands; suggest 'ceph tell <type>'
Dan Mick [Thu, 13 Jun 2013 22:48:32 +0000 (15:48 -0700)]
ceph.in: refuse 'ceph <type> tell' commands; suggest 'ceph tell <type>'

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
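
The check described above can be sketched roughly as follows. This is an illustration only, not the code from ceph.in: the function name `check_tell_order`, the set of daemon types, and the wording of the message are all assumptions.

```python
# Hypothetical sketch: reject 'ceph <type> tell ...' and point at 'ceph tell <type>...'.
# DAEMON_TYPES and the message text are assumptions, not taken from ceph.in.
DAEMON_TYPES = ("mon", "osd", "mds")


def check_tell_order(args):
    """Return an error string for the old '<type> tell' word order, else None."""
    if len(args) >= 2 and args[0] in DAEMON_TYPES and args[1] == "tell":
        return "'{0} tell' is not supported; try 'tell {0}.<id>'".format(args[0])
    return None
```

A caller would print the returned string and exit with a non-zero status instead of forwarding the command.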
12 years ago ceph.in: argparsing cleanup: suppress --completion, add help
Dan Mick [Thu, 13 Jun 2013 22:30:38 +0000 (15:30 -0700)]
ceph.in: argparsing cleanup: suppress --completion, add help

Options -v, --verbose, --concise didn't have helpstrings
Option --completion doesn't quite work yet, and should be hidden anyway

Signed-off-by: Dan Mick <dan.mick@inktank.com>
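
A minimal sketch of hiding an option from argparse help output while still accepting it. The option names come from the message above; the help strings and the parser layout are assumptions, not the actual ceph.in code.

```python
import argparse


def build_parser():
    # Illustrative parser only; help strings here are made up for the example.
    parser = argparse.ArgumentParser(prog="ceph")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="make output verbose")
    parser.add_argument("--concise", action="store_true",
                        help="make output less verbose")
    # argparse.SUPPRESS keeps the option parseable but omits it from -h output.
    parser.add_argument("--completion", action="store_true",
                        help=argparse.SUPPRESS)
    return parser


help_text = build_parser().format_help()
```

Here `help_text` lists `--verbose` and `--concise` but not `--completion`, which is still accepted on the command line.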
12 years ago osdc/Objecter: kick command ops on osd con resets
Sage Weil [Thu, 13 Jun 2013 22:13:47 +0000 (15:13 -0700)]
osdc/Objecter: kick command ops on osd con resets

Resend osd/pg commands on the OSDSession, just as we do with other request
types.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago osdc/Objecter: add perfcounters for commands
Sage Weil [Thu, 13 Jun 2013 22:13:18 +0000 (15:13 -0700)]
osdc/Objecter: add perfcounters for commands

This matches the other counters we maintain for other kinds of ops.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon: fix idempotency of 'osd crush add'
Sage Weil [Thu, 13 Jun 2013 21:01:01 +0000 (14:01 -0700)]
mon: fix idempotency of 'osd crush add'

If we add an item that already exists in a particular position, we should
update instead of inserting it; the CrushWrapper methods are not
idempotent.

Signed-off-by: Sage Weil <sage@inktank.com>
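
The update-instead-of-insert pattern described above can be illustrated with a toy model. The `Bucket` class and `crush_add` function below are hypothetical stand-ins and do not reflect the real CrushWrapper interface.

```python
class Bucket:
    """Toy container whose insert() is deliberately not idempotent."""

    def __init__(self):
        self.items = []  # list of [item_id, weight] pairs

    def contains(self, item_id):
        return any(entry[0] == item_id for entry in self.items)

    def insert(self, item_id, weight):
        # Blindly appends, so calling it twice would create a duplicate.
        self.items.append([item_id, weight])

    def update(self, item_id, weight):
        for entry in self.items:
            if entry[0] == item_id:
                entry[1] = weight


def crush_add(bucket, item_id, weight):
    """Idempotent add: update the item if it is already present, else insert it."""
    if bucket.contains(item_id):
        bucket.update(item_id, weight)
    else:
        bucket.insert(item_id, weight)
```

Repeating the same `crush_add` call leaves the bucket unchanged, which is the property the caller has to provide when the underlying primitives do not.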
12 years ago librados: do not wait for osdmap on start
Sage Weil [Thu, 13 Jun 2013 21:42:03 +0000 (14:42 -0700)]
librados: do not wait for osdmap on start

If we abort while waiting, we clean up incorrectly (we switch the state
value incorrectly, and also fail to clean up the initialized objecter).

Instead, skip this wait; it's useless!

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years ago mon/MonmapMonitor: remove unused label
Sage Weil [Thu, 13 Jun 2013 18:27:49 +0000 (11:27 -0700)]
mon/MonmapMonitor: remove unused label

mon/MonmapMonitor.cc: In member function 'bool MonmapMonitor::preprocess_command(MMonCommand*)':
mon/MonmapMonitor.cc:273:2: warning: label 'out' defined but not used [-Wunused-label]

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago mon/MonCap: bootstrap-* need to subscribe to osdmap, monmap
Sage Weil [Thu, 13 Jun 2013 18:27:23 +0000 (11:27 -0700)]
mon/MonCap: bootstrap-* need to subscribe to osdmap, monmap

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago Merge branch 'wip-tell' into next
Sage Weil [Thu, 13 Jun 2013 16:27:15 +0000 (09:27 -0700)]
Merge branch 'wip-tell' into next

Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years ago mon: remove support for 'mon tell ...' and 'osd tell ...'
Sage Weil [Wed, 12 Jun 2013 23:56:45 +0000 (16:56 -0700)]
mon: remove support for 'mon tell ...' and 'osd tell ...'

It doesn't work.  The commands the ceph cli sends are vector<string>, and
the mon expects json.

Leave the MDS one in place since ceph-mds still takes strings.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago ceph: add support for 'tell mon.X ...'
Sage Weil [Wed, 12 Jun 2013 23:55:03 +0000 (16:55 -0700)]
ceph: add support for 'tell mon.X ...'

Signed-off-by: Sage Weil <sage@inktank.com>
12 years ago librados: new rados_mon_command_target to talk to a specific monitor
Sage Weil [Wed, 12 Jun 2013 23:36:39 +0000 (16:36 -0700)]
librados: new rados_mon_command_target to talk to a specific monitor

Signed-off-by: Sage Weil <sage@inktank.com>