Haomai Wang [Mon, 8 Dec 2014 03:41:54 +0000 (11:41 +0800)]
WBThrottle: make bytes/ios/inode_wb's perf counter effective
Since the sync thread causes unstable IOPS and latency curves, we may
want to make WBThread do more (or more moderate?) writeback and keep the
sync thread from flushing too much, which causes a long tail in journal I/O.
Via these counters, we can see how many objects or bytes are written back by
WBThread and how many bytes are flushed. Then we can tune
"*bytes_start_flusher", "*ios_start_flusher" and "*inodes_start_flusher".
What we want to see is in-memory data being written back to disk at a
moderate rate.
David Zafman [Thu, 4 Dec 2014 22:01:39 +0000 (14:01 -0800)]
ceph_objectstore_tool: Add --format and --pretty-format support
--pretty-format defaults true
Add --format so xml output can be requested
--op list defaults to single line of json per object
To override this more human readable output use --pretty-format=false
Add testing of --op list special handling
Loic Dachary [Wed, 26 Nov 2014 22:35:21 +0000 (23:35 +0100)]
objectstore_tool: filter --op list and explore all PGs
The positional object name is used to filter the output of --op list and
only show the objects with a matching name. If both the object name and
the pgid are omitted, all objects from all PGs are displayed.
Loic Dachary [Wed, 26 Nov 2014 22:34:22 +0000 (23:34 +0100)]
objectstore_tool: lookup objects by name
If the object is not a parsable JSON string, assume an object name and
look it up in all the PGs. If multiple objects have the same name, only
apply the command to one of them. It is primarily useful in a test
environment where the names of the test objects are known and only a
small number of objects exist. It replaces the following:
path='--data-path dev/osd0 --journal-path dev/osd0.journal'
for pgid in $(./ceph_objectstore_tool $path --op list-pgs) ; do
    object=$(./ceph_objectstore_tool $path --pgid $pgid --op list |
             grep '"oid":"NAME"')
    test -n "$object" && break
done
./ceph_objectstore_tool $path --pgid $pgid "$object" remove
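The fallback described in this message can be sketched in Python. This is an illustration only, not the tool's implementation: the function name resolve_object, the pgs mapping, and the shape of the object entries are hypothetical, and treating JSON that parses to something other than an object as a plain name is this sketch's own assumption.

```python
import json


def resolve_object(arg, pgs):
    """Return (pgid, object) for arg, a JSON object spec or a plain name.

    pgs is a hypothetical mapping {pgid: [dicts with an "oid" key]}.
    """
    try:
        spec = json.loads(arg)
    except ValueError:
        spec = None
    if isinstance(spec, dict):
        # Parsable JSON object: use it as given, no name lookup needed.
        return None, spec
    # Otherwise assume an object name and look it up in all the PGs.
    for pgid, objects in pgs.items():
        for obj in objects:
            if obj["oid"] == arg:
                # If several objects share the name, only the first is used.
                return pgid, obj
    raise KeyError(arg)
```

With pgs = {"0.1": [{"oid": "a"}], "0.2": [{"oid": "NAME"}]}, the call resolve_object("NAME", pgs) finds the object in PG "0.2" without the caller looping over PGs.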
Loic Dachary [Fri, 28 Nov 2014 17:47:58 +0000 (18:47 +0100)]
arch: add support for HW_CAP based neon runtime detection
Rename the files from neon to arm to reflect the fact that it's related
to arm processors and also because NEON was renamed ASIMD later. The
NEON and ASIMD features are mutually exclusive. 32-bit binaries will get
NEON and never ASIMD, whether they run on ARMv7 or ARMv8. 64-bit binaries
will only run on ARMv8 and get ASIMD and never NEON.
The flag remains with _neon and no other flag is introduced since there
is no risk of confusion. Besides, people who care usually know NEON but
are not yet aware of the ASIMD renaming. Keeping the _neon name probably
saves some questions.
Also modify aio_read test for wait: write an object, take its active set
down, try to aio_read; verify read doesn't complete until active set is
allowed back up
Fixes: #10104
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Yan, Zheng [Thu, 4 Dec 2014 04:18:47 +0000 (12:18 +0800)]
osdc/Filer: use finisher to execute C_Probe and C_PurgeRange
Currently contexts C_Probe/C_PurgeRange are executed while holding
OSDSession::completion_lock. C_Probe and C_PurgeRange may call
Objecter::stat() and Objecter::remove() respectively, which acquire
Objecter::rwlock. This can cause a deadlock because there is an intermediate
dependency between Objecter::rwlock and OSDSession::completion_lock:
Ken Dreyer [Tue, 2 Dec 2014 01:24:22 +0000 (18:24 -0700)]
heap_profiler: support new gperftools header locations
The google/ headers location has been deprecated as of gperftools 2.0.
As of gperftools 2.2rc, the google/ headers will now give deprecation
warnings, and they will probably disappear in a future gperftools
update.
Jianpeng Ma [Wed, 3 Dec 2014 02:26:26 +0000 (10:26 +0800)]
test/perf_counters: Replace perfcounters_dump to perf dump.
The commands perfcounters_dump and 'perf dump' do the same thing, but
the output of 'ceph --admin-daemon help' only lists 'perf dump', so use
that one instead.
For compatibility, perfcounters_dump is still kept in the code for
existing users.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Sage Weil [Mon, 24 Nov 2014 17:22:30 +0000 (09:22 -0800)]
osd: require SNAPMAPPER feature from peers
This was introduced before cuttlefish. We require users to upgrade first
to a newer release, so there is no need to support a mixed cluster with
such old code.
Ken Dreyer [Tue, 2 Dec 2014 22:52:58 +0000 (15:52 -0700)]
doc: clarify "B" flag in os recommendations page
We don't exactly do continuous builds on all the platforms marked with
"B", but we have published binary RPMs for them. Adjust the "B"
footnote definition to reflect this.
Loic Dachary [Tue, 2 Dec 2014 00:07:34 +0000 (01:07 +0100)]
erasure-code: enforce chunk size alignment
Suppose the ErasureCode::encode function is given a 4096-byte
bufferlist made of a 1249-byte bufferptr followed by a 2847-byte
bufferptr, both properly starting on a SIMD_ALIGN address. When
bufferlist::substr_of gets the second 2048-byte buffer, its address
starts 799 bytes after the beginning of the 2847-byte bufferptr and is
not SIMD_ALIGN'ed. As a result, the second 2048 bytes had to be
reallocated.
The ErasureCode::encode must enforce a size alignment based on the chunk
size in addition to the memory alignment required by SIMD operations,
using the bufferlist::rebuild_aligned_size_and_memory function instead of
bufferlist::rebuild_aligned.
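The arithmetic in the example above can be checked with a short Python sketch. It models a bufferlist as a list of segment lengths only; chunk_layout is a hypothetical helper, and SIMD_ALIGN = 32 is an assumed value used for illustration.

```python
SIMD_ALIGN = 32  # assumed alignment, for illustration only
CHUNK_SIZE = 2048


def chunk_layout(ptr_lengths, chunk_size):
    """For each chunk, return (segment index, offset inside that segment)
    of the chunk's first byte."""
    starts = []
    for begin in range(0, sum(ptr_lengths), chunk_size):
        remaining = begin
        for index, length in enumerate(ptr_lengths):
            if remaining < length:
                starts.append((index, remaining))
                break
            remaining -= length
    return starts


layout = chunk_layout([1249, 2847], CHUNK_SIZE)
# The second chunk starts 2048 - 1249 = 799 bytes into the second segment.
second_offset = layout[1][1]
# Each segment starts on an aligned address, so the chunk is aligned
# only if its offset inside the segment is a multiple of SIMD_ALIGN.
second_is_aligned = second_offset % SIMD_ALIGN == 0
```

Here second_offset is 799 and second_is_aligned is False, which is why a rebuild based on memory alignment alone is not enough.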
Loic Dachary [Tue, 2 Dec 2014 01:04:14 +0000 (02:04 +0100)]
common: allow size alignment that is not a power of two
Do not assume the alignment is a power of two in the is_n_align_sized()
predicate. When used in the context of erasure code it is common
for chunks to not be powers of two.
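To illustrate why the power-of-two assumption matters, the Python sketch below contrasts a bitmask test, a common shortcut that is only valid for power-of-two alignments, with a modulo test that works for any alignment. Whether the original predicate used exactly this shortcut is an assumption; the function names and the 1536-byte chunk size are hypothetical.

```python
def is_sized_mask(length, align):
    # Shortcut: only correct when align is a power of two.
    return (length & (align - 1)) == 0


def is_sized_mod(length, align):
    # Correct for any positive alignment.
    return length % align == 0


ALIGN = 1536  # hypothetical chunk size that is not a power of two

# 3072 is a multiple of 1536, but the mask test rejects it.
false_negative = (is_sized_mask(3072, ALIGN), is_sized_mod(3072, ALIGN))
# 2048 is not a multiple of 1536, but the mask test accepts it.
false_positive = (is_sized_mask(2048, ALIGN), is_sized_mod(2048, ALIGN))
```

For power-of-two alignments such as 32 or 2048 the two tests agree, so switching to the modulo form does not change existing behaviour.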
The function bufferlist::rebuild_aligned checks memory and size
alignment with the same variable. It is however useful to separate
memory alignment constraints from size alignment constraints. For
instance rebuild_aligned could be called to allocate an erasure coded
buffer where each 2048 bytes chunk needs to start on a memory address
aligned on 32 bytes.
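A minimal Python sketch of checking the two constraints separately, assuming each buffer segment is modeled as a hypothetical (address, length) pair; needs_rebuild is an illustrative name, not a Ceph function.

```python
def needs_rebuild(segments, align_size, align_memory):
    """Return True if any segment violates either constraint.

    segments: list of (address, length) pairs, a hypothetical model.
    align_size: every segment length must be a multiple of this.
    align_memory: every segment address must be a multiple of this.
    """
    for address, length in segments:
        if address % align_memory != 0 or length % align_size != 0:
            return True
    return False


# Two segments on 32-byte aligned addresses whose lengths are multiples
# of 32 but not of the 2048-byte chunk size.
segments = [(0x1000, 1024), (0x2000, 3072)]

same_variable = needs_rebuild(segments, 32, 32)     # one value for both
separate = needs_rebuild(segments, 2048, 32)        # size 2048, memory 32
```

With a single value of 32 the layout looks acceptable, while the separate check reports that the segments do not fall on 2048-byte chunk boundaries.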
Sage Weil [Tue, 2 Dec 2014 02:15:59 +0000 (18:15 -0800)]
osd: tolerate sessionless con in fast dispatch path
We can now get a session cleared from a Connection at any time. Change
the assert to an if in ms_fast_dispatch to cope. It's pretty rare, but it
can happen, especially with delay injection. In particular, a racing
thread can call mark_down() on us.
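The change from an assert to an if can be pictured with a toy Python model. The Connection class and the list used as a session are hypothetical stand-ins, and what happens to the op when the session is gone is not stated in this message; the sketch simply skips it.

```python
class Connection:
    """Hypothetical stand-in for a messenger connection."""

    def __init__(self, session=None):
        self.session = session

    def mark_down(self):
        # A racing thread may clear the session at any time.
        self.session = None


def fast_dispatch(con, op):
    """Queue op on the connection's session if it still has one."""
    session = con.session  # read once; it may be cleared concurrently
    if session is None:
        # Previously this case would have tripped an assert.
        return False
    session.append(op)
    return True
```

Reading con.session into a local once means the check and the later use refer to the same object even if mark_down() runs in between.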
Fixes: #10209
Backport: giant
Signed-off-by: Sage Weil <sage@redhat.com>