Sage Weil [Sat, 15 Jun 2013 00:30:44 +0000 (17:30 -0700)]
ceph: add newline when using old monitors
The old tool would print a newline after outs, e.g. from 'ceph osd create'.
Do the same when we are talking to old monitors. Also, put outs at the
top, not the bottom!
Tweak the json code to not add the newline again if we already did so
above.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Sage Weil [Fri, 14 Jun 2013 04:56:23 +0000 (21:56 -0700)]
ceph-disk: do not use mount --move (or --bind)
The kernel does not let you mount --move when the parent mount is
shared (see, e.g., https://bugzilla.redhat.com/show_bug.cgi?id=917008
for another person who was also confused by this). We can't use --bind either
since that (on RHEL at least) screws up /etc/mtab so that the final
result looks like
osd.0: debug_ms=1/1
osd.1: debug_ms=1/1
osd.2: Problem getting command descriptions from ('osd', '2'), ENXIO
osd.3: Problem getting command descriptions from ('osd', '3'), ENXIO
osd.4: Problem getting command descriptions from ('osd', '4'), ENXIO
osd.5: Problem getting command descriptions from ('osd', '5'), ENXIO
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Sage Weil [Fri, 14 Jun 2013 18:21:25 +0000 (11:21 -0700)]
upstart: start ceph-all on runlevel [2345]
Starting when only one network interface has started breaks machines with
multiple nics in very problematic ways.
There may be an earlier trigger that we can use for cases where other
services on the local machine depend on ceph, but for now this is better
than the existing behavior.
David Zafman [Fri, 14 Jun 2013 01:15:39 +0000 (18:15 -0700)]
osd: EINVAL from truncate causes osd to crash
Maximum object size is 100GB, configurable with osd_max_object_size.
Return EFBIG on an attempt to WRITE/WRITEFULL/TRUNCATE beyond osd_max_object_size.
Return EINVAL if length < 1 for WRITE/WRITEFULL/ZERO.
Make ZERO beyond the existing size a no-op.
Fixes: #5252
Fixes: #5340
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
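The checks described in the osd truncate commit above can be sketched roughly as follows. This is an illustrative Python model, not the OSD's C++ code: the function name `validate_op`, the `(rc, apply)` return shape, the order of the checks, and the reading of "beyond existing size" as "offset at or past the current object size" are all assumptions made for the example. Only the error codes, the op names, and the 100GB default come from the commit message.

```python
import errno

# Stand-in for osd_max_object_size; 100GB is the default named in the commit.
OSD_MAX_OBJECT_SIZE = 100 * 1024 ** 3


def validate_op(op, offset, length, object_size,
                max_size=OSD_MAX_OBJECT_SIZE):
    """Hypothetical helper: return (rc, apply).

    rc is 0 or a negative errno; apply is True only if the op should
    actually modify the object.
    """
    # Zero-length (or negative-length) data ops are rejected.
    if op in ("WRITE", "WRITEFULL", "ZERO") and length < 1:
        return (-errno.EINVAL, False)
    # Ops that would grow the object past the limit are rejected.
    # For TRUNCATE the target size is carried in offset, with length 0.
    if op in ("WRITE", "WRITEFULL", "TRUNCATE") and offset + length > max_size:
        return (-errno.EFBIG, False)
    # Assumed interpretation: zeroing past the end succeeds but does nothing.
    if op == "ZERO" and offset >= object_size:
        return (0, False)
    return (0, True)
```

Under these assumptions, a TRUNCATE to a size just above the limit yields `-EFBIG` instead of reaching a lower layer that might return EINVAL.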
Sage Weil [Thu, 13 Jun 2013 22:54:58 +0000 (15:54 -0700)]
ceph-disk: implement 'activate-journal'
Activate an osd via its journal device. udev populates its symlinks and
triggers events in an order that is not related to whether the device is
an osd data partition or a journal. That means that triggering
'ceph-disk activate' can happen before the journal (or journal symlink)
is present and then fail.
Similarly, it may be that they are on different disks that are hotplugged
with the journal second.
This can be wired up to the journal partition type to ensure that osds are
started when the journal appears second.
Sage Weil [Wed, 12 Jun 2013 01:35:01 +0000 (18:35 -0700)]
ceph-disk: call partprobe outside of the prepare lock; drop udevadm settle
After we change the final partition type, sgdisk may or may not trigger a
udev event, depending on how well udev is behaving (it varies between
distros, it seems). The old code would often settle and wait for udev to
activate the device, and then partprobe would uselessly fail because it
was already mounted.
Call partprobe only at the very end, after prepare is done. This ensures
that if partprobe calls udevadm settle (which it sometimes does) we do not
get stuck.
Drop the udevadm settle. I'm not sure what this accomplishes; take it out,
at least until we determine we need it.
Sage Weil [Fri, 14 Jun 2013 00:38:02 +0000 (17:38 -0700)]
librados: add missing #include
librados/librados.cc: In function 'int rados_mon_command_target(void*, const char*, const char**, size_t, const char*, size_t, char**, size_t*, char**, size_t*)':
error: librados/librados.cc:1877: 'LONG_MAX' was not declared in this scope
error: librados/librados.cc:1877: 'LONG_MIN' was not declared in this scope
Sage Weil [Thu, 13 Jun 2013 23:39:30 +0000 (16:39 -0700)]
librados: wait for osdmap for commands that need it
In commit 7e1cf87b5158c870e2a118ed6d316be8cb9818ce we stopped waiting for
the osdmap on start because the Objecter will normally wait, but for some
commands we assume the osdmap is recent(ish).
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Sage Weil [Thu, 13 Jun 2013 18:27:49 +0000 (11:27 -0700)]
mon/MonmapMonitor: remove unused label
mon/MonmapMonitor.cc: In member function 'bool MonmapMonitor::preprocess_command(MMonCommand*)':
mon/MonmapMonitor.cc:273:2: warning: label 'out' defined but not used [-Wunused-label]
Sage Weil [Thu, 13 Jun 2013 14:39:02 +0000 (07:39 -0700)]
mon/MonClient: mark_down during get_monmap_privately() shutdown
We explicitly mark_down() and clear cur_con when shutting down; do the same
for get_monmap_privately() to ensure that the reset event doesn't make us
do something silly (like, in this case, call _reopen_session() again).
Sage Weil [Wed, 12 Jun 2013 02:27:01 +0000 (19:27 -0700)]
msg/DispatchQueue: do not discard queued events on stop
When the shutdown/stop flag is set, continue to work through the queue.
Process events, but discard messages. This avoids the loss of reset events
on shutdown that are necessary to clean up ref cycles.
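A minimal sketch of the drain-on-stop behavior described in the DispatchQueue commit above, written as a single-threaded Python model. The class name `DispatchQueueSketch`, the `"event"`/`"message"` kind strings, and the `delivered` list are hypothetical names for illustration; the real DispatchQueue is C++ and differs in structure.

```python
from collections import deque


class DispatchQueueSketch:
    """Toy queue: after stop, events are still processed, messages dropped."""

    def __init__(self):
        self.queue = deque()
        self.stop = False
        self.delivered = []  # what a dispatcher would have seen

    def enqueue(self, kind, payload):
        # kind is "event" (e.g. a connection reset) or "message".
        self.queue.append((kind, payload))

    def drain(self):
        # Keep working through the queue even when stop is set, so that
        # queued reset events are not lost.
        while self.queue:
            kind, payload = self.queue.popleft()
            if self.stop and kind == "message":
                continue  # discard messages once stopping
            self.delivered.append((kind, payload))
```

The point of the design is that the stop flag changes what is delivered, not whether the queue is emptied.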
Sage Weil [Tue, 11 Jun 2013 23:44:05 +0000 (16:44 -0700)]
msgr: queue reset exactly once on any connection
Use the atomic pipe link removal as a signal that we are the one failing
the con and use that to queue the reset event.
This fixes the case where we have an open, the session gets set up via the
handle_accept callback, and then we race with another connection and go into
wait + close, or just close. In that case, fault() needs to queue a reset
event to match the accept.
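The "exactly once" idea in the msgr commit above can be illustrated with a small Python model: whichever caller atomically removes the pipe link is the one that queues the reset. The names `PipeRegistrySketch`, `link`, `fault`, and `reset_events` are hypothetical, and a lock-protected dict stands in for whatever atomic mechanism the messenger actually uses.

```python
import threading


class PipeRegistrySketch:
    """Toy model: only the caller that unlinks the pipe queues the reset."""

    def __init__(self):
        self._lock = threading.Lock()
        self._links = {}        # connection id -> pipe
        self.reset_events = []  # stands in for the dispatch queue

    def link(self, con, pipe):
        with self._lock:
            self._links[con] = pipe

    def fault(self, con):
        """Return True if this call was the one that failed the connection."""
        with self._lock:
            # pop() either hands us the link or tells us someone else won.
            pipe = self._links.pop(con, None)
            if pipe is None:
                return False
            self.reset_events.append(con)
            return True
```

However many code paths call `fault()` for the same connection, only the first removal succeeds, so a single reset is queued to match the earlier accept.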
Sage Weil [Mon, 10 Jun 2013 03:21:49 +0000 (20:21 -0700)]
msgr: use ConnectionRef throughout
Make RefCountedObject a private parent of Connection so that users are
forced to use ConnectionRef whenever references are taken.
Many methods can still take a raw Connection* when they are using the
caller's reference but not taking their own; this is cheaper than
twiddling the reference count, and the lifetime is still well defined.
Local variables generally use ConnectionRef, though.
Loic Dachary [Thu, 13 Jun 2013 06:53:26 +0000 (08:53 +0200)]
add apt-get update to installation instructions
Without apt-get update the repository added to the sources.list is not taken into consideration and an older version of ceph-deploy is going to be installed.
Dan Mick [Thu, 13 Jun 2013 01:08:17 +0000 (18:08 -0700)]
ceph, mon/OSDMonitor: fix up osd crush commands for <osd.N> or <N>
The new parsing code had been trying to allow flexibility for the
'old form' commands (where id could be different from N in osd.N),
but also accept 'new form' commands. The new rule is that where
there's an OSD specified in the osd crush command, it is of type
CephOsdName, which can be an id *or* 'osd.<id>', but not both.
Pass CephOsdName as int64_t 'id' for convenience in mon code
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
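As a rough illustration of the CephOsdName rule in the osd crush commit above (a bare id or 'osd.<id>', normalized to an integer id), here is a small Python sketch. The function name `parse_osd_name` and the exact validation are assumptions for the example and are not taken from the ceph CLI's own argument parser.

```python
import re

# Accept either "N" or "osd.N", where N is a non-negative decimal integer.
_OSD_NAME_RE = re.compile(r"(?:osd\.)?([0-9]+)")


def parse_osd_name(value):
    """Hypothetical helper: return the integer OSD id for N or 'osd.N'."""
    match = _OSD_NAME_RE.fullmatch(str(value))
    if match is None:
        raise ValueError("expected <N> or osd.<N>, got %r" % (value,))
    return int(match.group(1))
```

With this normalization both spellings reach the monitor-side code as the same integer, which mirrors the commit's note about passing the name as an int64_t 'id'.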
Sage Weil [Wed, 12 Jun 2013 23:36:21 +0000 (16:36 -0700)]
mon/MonClient: send commands to a specific monitor
This implementation is limited: we direct our command by reopening
a session with the specific monitor. If there is more than one of these
queued we will fail to reach either.