Dan Mick [Fri, 16 Nov 2012 06:41:36 +0000 (22:41 -0800)]
rbd: fix import pool assumptions
import allows specifying one image, implicitly or explicitly the
"source" image, even though it's really the destination. Fix up
the reassignment of 'source' to 'dest', and check for and complain
about specifying two different pools or images for import.
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
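The check described in this commit can be sketched roughly as follows. This is an illustration in Python, not the rbd tool's C++ code; the function name, its parameters and the error texts are all hypothetical.

```python
def resolve_import_dest(pool=None, image=None, dest_pool=None, dest_image=None):
    """Return the (pool, image) that an import should write to.

    Hypothetical helper: 'pool'/'image' stand for the values parsed as the
    "source", 'dest_pool'/'dest_image' for an explicitly given destination.
    """
    # For import, whatever was parsed as the "source" is really the
    # destination, so the two spellings must agree when both are given.
    if pool and dest_pool and pool != dest_pool:
        raise ValueError("import: two different pools specified")
    if image and dest_image and image != dest_image:
        raise ValueError("import: two different images specified")
    return (dest_pool or pool, dest_image or image)
```

Specifying the same pool or image twice is accepted; only a genuine conflict is reported.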
New config key:
- 'osd mkfs options $fstype': file system specific options for mkfs
- 'osd mkfs type': to define the filesystem for mkfs and also mount
Replaced in mkcephfs: --mkbtrfs with --mkfs
Replaced in init-ceph:
- --btrfs with --fsmount
- --nobtrfs with --nofsmount
- --btrfsumount with --fsumount
NOTE: old options from mkcephfs and init-ceph will still work, but
may get removed from the scripts in the future.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
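One way to picture the backward compatibility noted above is a table that maps each old option to its replacement. The sketch below is in Python and is only an illustration: mkcephfs and init-ceph are shell scripts, the helper name is hypothetical, and the options of both scripts are merged into one table for brevity.

```python
# Old option -> replacement, taken from the lists in the commit message.
LEGACY_OPTIONS = {
    "--mkbtrfs": "--mkfs",          # mkcephfs
    "--btrfs": "--fsmount",         # init-ceph
    "--nobtrfs": "--nofsmount",     # init-ceph
    "--btrfsumount": "--fsumount",  # init-ceph
}


def normalize_option(opt):
    """Map a legacy option to its replacement; pass anything else through."""
    return LEGACY_OPTIONS.get(opt, opt)
```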
auth: cephx: increase log levels when logging secrets
We understand that logging secrets may be useful when debugging the root
causes for auth issues. However, logging secrets is far from a good idea.
Therefore, just increase the log levels to a high enough value so that
most other debug information can be obtained without even logging the secrets.
If one really wants to log the secrets, then setting --debug-auth 30 should
do the trick.
Fixes: #3361
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
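The effect of raising the level can be sketched as a simple threshold test. This is a Python illustration, not the cephx logging code; it assumes the secret-bearing messages sit at level 30 (suggested by the '--debug-auth 30' hint above), and the names are hypothetical.

```python
# Assumed level for messages that contain secrets (see --debug-auth 30 above).
SECRET_LOG_LEVEL = 30


def should_log(message_level, debug_auth):
    """A message is emitted only if the configured level reaches its level."""
    return debug_auth >= message_level
```

With 'debug_auth' set to 20, ordinary debug messages at level 10 or 20 still appear while messages at SECRET_LOG_LEVEL stay out of the log.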
crush: CrushWrapper: don't add item to a bucket with != type than wanted
We take little consideration about the type of the bucket we are adding
an item to. Although this works for the vast majority of cases, it also
leaves room for silly little mistakes to become problematic and
crash a monitor.
For instance, say that we ran:
'ceph osd crush set 0 osd.0 1 root=foo row=foo'
If root 'foo' exists, then this will work and 'row=foo' will be ignored.
However, if there is no bucket named 'foo', then we would (in order)
create a bucket for row 'foo', adding osd.0 to it, and would then add
osd.0 to bucket 'foo' again -- remember, little consideration regarding
the bucket type was given.
This would trigger a monitor crash due to the recursion done in
'adjust_item_weight'. A solution to this problem is to make sure that we
do not allow specifying multiple buckets with the same name when adding
an item to crush. This not only solves our crash problem, but will also
reject any command that specifies the wrong bucket type (say, using
'row=bar' when in fact 'bar' is a rack).
Fixes: #3515
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
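The "no two buckets with the same name" rule can be sketched as a check over the type=name pairs given on the command line. This is a Python illustration only; the function name and the dict representation are assumptions and do not reflect CrushWrapper's actual interface.

```python
from collections import Counter


def check_unique_bucket_names(loc):
    """Reject a location that names the same bucket under several types.

    'loc' maps bucket type to bucket name, e.g. {'root': 'foo', 'row': 'foo'}.
    """
    counts = Counter(loc.values())
    dups = sorted(name for name, n in counts.items() if n > 1)
    if dups:
        raise ValueError(
            "bucket name(s) given for more than one type: " + ", ".join(dups)
        )
    return loc
```

Under this check the example 'root=foo row=foo' is refused up front instead of reaching the code path that adds osd.0 to 'foo' twice.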
Sage Weil [Wed, 21 Nov 2012 00:11:00 +0000 (16:11 -0800)]
osdc/Striper: fix handling for sparse reads in add_partial_sparse_result()
If bl_map begins *after* the first item in buffer_extents, we want to
skip only the first buffer extent before doing 'continue' to loop to the
next one.
This fixes a crash caused by underflow with a pattern like:
2012-11-20 13:54:30.347861 7f9404ed6700 10 striper add_partial_sparse_result(0x1efa088) 192 covering {12288=192} (offset 2906) to [0,5286,38054,4288]
2012-11-20 13:54:30.347863 7f9404ed6700 20 striper t 0~5286 bl has 192 off 2906
2012-11-20 13:54:30.347866 7f9404ed6700 20 striper s gap 9382, skipping
2012-11-20 13:54:30.347867 7f9404ed6700 20 striper s has 192, copying
2012-11-20 13:54:30.347872 7f9404ed6700 20 striper t 9574~18446744073709547328 bl has 0 off 12480
2012-11-20 13:54:30.347874 7f9404ed6700 20 striper s at end
2012-11-20 13:54:30.347876 7f9404ed6700 20 striper t 38054~4288 bl has 0 off 12480
2012-11-20 13:54:30.347877 7f9404ed6700 20 striper s at end
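The length 18446744073709547328 in the 't 9574~...' line above equals 2^64 - 4288, which is what a small negative result looks like once stored in an unsigned 64-bit integer. The Python sketch below only reproduces that arithmetic; the helper name is hypothetical and the operands are chosen to match the logged number, not taken from the striper code.

```python
U64_MASK = (1 << 64) - 1


def u64_sub(a, b):
    """Subtract the way a 64-bit unsigned integer would, wrapping on underflow."""
    return (a - b) & U64_MASK


# A result that "should" be -4288 wraps around to the huge length seen in the log.
wrapped = u64_sub(0, 4288)
```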
Noah Watkins [Tue, 20 Nov 2012 21:44:47 +0000 (13:44 -0800)]
java: add Java exception for ENOTDIR
This specialization is useful in the Hadoop CephFS shim. An lstat may
return ENOENT or ENOTDIR or some other IOException without a
specialization. In Hadoop we convert ENOTDIR into ENOENT.
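The conversion mentioned above can be illustrated with Python's own errno-specific exceptions, which play a role similar to the Java specializations. This is an analogy only: the Java class names are not given here, and the function below is hypothetical.

```python
import errno


def lstat_error_for_hadoop(exc):
    """Fold 'not a directory' into 'no such file'; pass other errors through."""
    if isinstance(exc, NotADirectoryError):
        return FileNotFoundError(
            errno.ENOENT, "No such file or directory", exc.filename
        )
    return exc
```

Having a dedicated exception type for ENOTDIR is what makes the 'isinstance' test possible; without it the caller would only see a generic I/O error.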
Sage Weil [Sun, 18 Nov 2012 16:34:35 +0000 (08:34 -0800)]
mon: shutdown async signal handler sooner
Shut it down before the mon and, in particular, before lockdep.
#0 __pthread_mutex_lock (mutex=0x30) at pthread_mutex_lock.c:50
#1 0x0000000000816092 in ceph::log::Log::submit_entry (this=0x0, e=0x2f4a270) at log/Log.cc:138
#2 0x00000000007ee0f8 in handle_fatal_signal (signum=11) at global/signal_handler.cc:100
#3 <signal handler called>
#4 0x00000000008e1300 in lockdep_will_lock (name=0x959aa7 "SignalHandler::lock", id=17) at common/lockdep.cc:163
#5 0x00000000008867fc in Mutex::_will_lock (this=0x2f20428) at ./common/Mutex.h:56
#6 0x0000000000886605 in Mutex::Lock (this=0x2f20428, no_lockdep=false) at common/Mutex.cc:81
#7 0x00000000007eeb95 in SignalHandler::entry (this=0x2f20300) at global/signal_handler.cc:198
#8 0x00000000008b0bd1 in Thread::_entry_func (arg=0x2f20300) at common/Thread.cc:43
#9 0x00007f36fefd6b50 in start_thread (arg=<optimized out>) at pthread_create.c:304
#10 0x00007f36fd80b6dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#11 0x0000000000000000 in ?? ()
#0 0x00007f36fefd7e75 in pthread_join (threadid=139874129766144, thread_return=0x0) at pthread_join.c:89
#1 0x00000000008b11ec in Thread::join (this=0x2f20300, prval=0x0) at common/Thread.cc:130
#2 0x00000000007eeae7 in SignalHandler::shutdown (this=0x2f20300) at global/signal_handler.cc:186
#3 0x00000000007ee9cf in SignalHandler::~SignalHandler (this=0x2f20300, __in_chrg=<optimized out>) at global/signal_handler.cc:175
#4 0x00000000007eea58 in SignalHandler::~SignalHandler (this=0x2f20300, __in_chrg=<optimized out>) at global/signal_handler.cc:176
#5 0x00000000007ee643 in shutdown_async_signal_handler () at global/signal_handler.cc:324
#6 0x00000000006de9d2 in main (argc=7, argv=0x7fffbfb8a1e8) at ceph_mon.cc:439
Sage Weil [Sun, 4 Nov 2012 16:21:50 +0000 (08:21 -0800)]
mon/MonClient: use thread-safe RNG for picking monitors
Avoid using shared-state rand() when picking monitors. This way we don't
screw with library users like test_librbd_fsx that rely on srand() and
rand() being deterministic.
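The idea of keeping monitor selection away from the shared generator can be sketched in Python with a private 'random.Random' instance. This is an analogy to the C library situation described above, not the MonClient code, and the function name is hypothetical.

```python
import random

# Private generator: seeded independently, never touches the module-level state
# that random.seed() and random.random() share.
_mon_rng = random.Random()


def pick_monitor(monitors):
    """Pick one monitor at random without perturbing the shared generator."""
    return _mon_rng.choice(monitors)
```

A caller that seeds the shared generator and expects a reproducible sequence gets the same numbers whether or not 'pick_monitor' runs in between.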
Sage Weil [Sat, 17 Nov 2012 00:10:30 +0000 (16:10 -0800)]
msg/Accepter: only close socket if >= 0
It is possible for rebind() to fail, in which case the OSD will go through
its shutdown procedure and call stop(). Checking the socket before closing
it is simpler than trying to avoid calling stop() when rebind() fails.
Fixes: #3504
Signed-off-by: Sage Weil <sage@inktank.com>
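The guard can be sketched as follows, in Python rather than the Accepter's C++. The class is a toy stand-in with hypothetical names; it assumes -1 is used to mean "no socket open".

```python
import os


class Accepter:
    """Toy stand-in for a listener; -1 means 'no socket open'."""

    def __init__(self):
        self.listen_fd = -1

    def bind(self, fd):
        self.listen_fd = fd

    def stop(self):
        # Only close a descriptor we actually hold. stop() may run after a
        # failed rebind, when listen_fd is still -1.
        if self.listen_fd >= 0:
            os.close(self.listen_fd)
            self.listen_fd = -1
```

Resetting the descriptor to -1 after closing also makes a second stop() harmless.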
Josh Durgin [Fri, 16 Nov 2012 00:20:33 +0000 (16:20 -0800)]
ObjectCacher: fix off-by-one error in split
This error left a completion that should have been attached
to the right BufferHead on the left BufferHead, which would
result in the completion never being called unless the buffers
were merged before its original read completed. This would cause
a hang in any higher level waiting for a read to complete.
The existing loop went backwards (using a forward iterator),
but stopped when the iterator reached the beginning of the map,
or when a waiter belonged to the left BufferHead.
If the first list of waiters should have been moved to the right
BufferHead, it was skipped because at that point the iterator
was at the beginning of the map, which was the main condition
of the loop.
Restructure the waiters-moving loop to go forward in the map instead,
so it's harder to make an off-by-one error.
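A forward walk of this kind can be sketched in Python as below. The data layout is an assumption made for illustration: waiters are taken to be a map from offset to a list of callbacks, and every entry at or beyond the split offset is taken to belong to the right half. The function name is hypothetical.

```python
def split_waiters(waiters, split_off):
    """Split {offset: [callbacks]} at split_off into (left, right).

    Entries at offsets >= split_off go to the right half, including the
    very first entry of the map if it qualifies.
    """
    left, right = {}, {}
    for off in sorted(waiters):  # walk forward through the map
        target = right if off >= split_off else left
        target[off] = waiters[off]
    return left, right
```

Because the loop visits every key exactly once, the first entry cannot be skipped by a termination condition tied to reaching the beginning of the map.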
Josh Durgin [Fri, 16 Nov 2012 20:26:16 +0000 (12:26 -0800)]
ObjectCacher: retry reads when they are incomplete
Skipping these callbacks when there's a racing write or
a gap in the results causes the original reads they represent
to never be completed. If the read falls within the range
of a BufferHead, retry all waiters no matter what.
Yehuda Sadeh [Wed, 14 Nov 2012 01:02:02 +0000 (17:02 -0800)]
rgw: ops log can also go to socket
Adding a new ops log output (into a unix domain socket).
Configuration:
rgw_enable_usage_log : master switch for ops log
rgw ops log socket path : set socket path
rgw ops log rados : whether ops should be logged in the rados cluster
rgw ops log data backlog : max size in MB to be accumulated without flushing
Sage Weil [Fri, 16 Nov 2012 22:19:25 +0000 (14:19 -0800)]
common/ceph_argparse: fix malloc failure check
CID 743418 (#1 of 1): Dereference before null check (REVERSE_INULL)
Null-checking "argv" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.