Sage Weil [Tue, 16 Dec 2014 01:04:32 +0000 (17:04 -0800)]
osd: handle no-op write with snapshot case
If we have a transaction that does something to the object but it !exists
both before and after, we will continue through the write path. If the
snapdir object already exists, and we try to create it again, we will
leak a snapdir obc and lock and later crash on an assert when the obc
is destroyed:
0> 2014-12-06 01:49:51.750163 7f08d6ade700 -1 osd/osd_types.h: In function 'ObjectContext::~ObjectContext()' thread 7f08d6ade700 time 2014-12-06 01:49:51.605411
osd/osd_types.h: 2944: FAILED assert(rwstate.empty())
Fix is to not recreate the snapdir if it already exists.
Fixes: #10262 Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 16 Dec 2014 00:11:05 +0000 (16:11 -0800)]
cls/refcount: ENOENT when put on non-existent object
If we get ENOENT, do not take that to mean an implicit reference count of
1. That means that if you put a non-existent object, we should get
ENOENT instead of doing a useless delete on the OSD.
Note that this changes the get behavior slightly, too: doing a get on a
non-existent object will now fail with ENOENT instead of implicitly
creating it.
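The semantics described above can be modeled with a toy in-memory store. This is only a sketch of the behavior stated in the message, not the cls/refcount code: the `RefcountStore` class, its methods, and the tag-set representation are hypothetical.

```python
import errno


class RefcountStore:
    """Toy model: object name -> set of reference tags (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def create(self, name, tag):
        # Explicitly create an object holding a single reference.
        self.objects[name] = {tag}

    def get(self, name, tag):
        # A get on a missing object fails instead of implicitly creating it.
        if name not in self.objects:
            return -errno.ENOENT
        self.objects[name].add(tag)
        return 0

    def put(self, name, tag):
        # A put on a missing object reports ENOENT; no delete is issued.
        if name not in self.objects:
            return -errno.ENOENT
        self.objects[name].discard(tag)
        if not self.objects[name]:
            del self.objects[name]  # last reference dropped
        return 0
```

In this model both `put` and `get` on an unknown name return `-ENOENT`, and an object only disappears once its last reference is put.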
David Anderson [Sun, 7 Dec 2014 05:14:27 +0000 (21:14 -0800)]
ceph: respect the PYTHON environment variable for dev mode.
On OSes where `python` is python3, dev mode's re-exec makes the
ceph tool fail. The standard way to fix this is by exporting
the PYTHON envvar pointing to the python2 interpreter.
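A minimal sketch of how a re-exec could honor the variable, assuming the tool builds its own argv before re-executing. The helper names `pick_interpreter` and `reexec_argv` are hypothetical and are not taken from the ceph tool.

```python
import os


def pick_interpreter(environ=None, default="python"):
    """Return the interpreter to re-exec with, honoring $PYTHON if set."""
    env = os.environ if environ is None else environ
    # An empty PYTHON value is treated as unset (an assumption of this sketch).
    return env.get("PYTHON") or default


def reexec_argv(script, args, environ=None):
    """Build the argv for a dev-mode re-exec (hypothetical helper)."""
    return [pick_interpreter(environ), script] + list(args)
```

With `PYTHON` pointing at a python2 interpreter, the re-exec uses it; otherwise it falls back to plain `python`.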
David Zafman [Thu, 4 Dec 2014 22:01:39 +0000 (14:01 -0800)]
ceph_objectstore_tool: Add --format and --pretty-format support
--pretty-format defaults true
Add --format so xml output can be requested
--op list defaults to single line of json per object
To override this more human readable output use --pretty-format=false
Add testing of --op list special handling
Loic Dachary [Wed, 26 Nov 2014 22:35:21 +0000 (23:35 +0100)]
objectstore_tool: filter --op list and explore all PGs
The positional object name is used to filter the output of --op list and
only show the objects with a matching name. If both the object name and
the pgid are omitted, all objects from all PGs are displayed.
Loic Dachary [Wed, 26 Nov 2014 22:34:22 +0000 (23:34 +0100)]
objectstore_tool: lookup objects by name
If the object is not a parsable JSON string, assume an object name and
look it up in all the PGs. If multiple objects have the same name, only
apply the command to one of them. It is primarily useful in a test
environment where the names of the test objects are known and only a
small number of objects exist. It replaces the following:
path='--data-path dev/osd0 --journal-path dev/osd0.journal'
for pgid in $(./ceph_objectstore_tool $path --op list-pgs) ; do
    object=$(./ceph_objectstore_tool $path --pgid $pgid --op list |
             grep '"oid":"NAME"')
    test -n "$object" && break
done
./ceph_objectstore_tool $path --pgid $pgid "$object" remove
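The name matching done by the `grep` above can be sketched in Python, assuming the listing has already been captured as one JSON document per line and that each document carries an `"oid"` field somewhere inside it. The exact line shape is an assumption, and `find_oid` and `match_by_name` are hypothetical helpers, not part of the tool.

```python
import json


def find_oid(value):
    """Return the first "oid" string found anywhere in a decoded JSON value."""
    if isinstance(value, dict):
        if isinstance(value.get("oid"), str):
            return value["oid"]
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        return None
    for child in children:
        found = find_oid(child)
        if found is not None:
            return found
    return None


def match_by_name(lines, name):
    """Keep the listing lines whose object name equals `name`."""
    matches = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            doc = json.loads(line)
        except ValueError:
            continue  # not a JSON line; ignore it
        if find_oid(doc) == name:
            matches.append(line)
    return matches
```

Unlike a substring `grep`, this compares the decoded name exactly, so `NAME` does not match an object called `NAME2`.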
Loic Dachary [Fri, 28 Nov 2014 17:47:58 +0000 (18:47 +0100)]
arch: add support for HW_CAP based neon runtime detection
Rename the files from neon to arm to reflect the fact that it's related
to arm processors and also because NEON was renamed ASIMD later. The
NEON and ASIMD features are mutually exclusive. 32-bit binaries will get
NEON and never ASIMD, whether they run on ARMv7 or ARMv8. 64-bit binaries
will only run on ARMv8 and get ASIMD and never NEON.
The flag remains with _neon and no other flag is introduced since there
is no risk of confusion. Besides people who care usually know NEON but
are not yet aware of the ASIMD renaming. Keeping the _neon name probably
saves some questions.
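As an illustration of HW_CAP based detection, the sketch below scans a raw auxiliary vector for the AT_HWCAP entry and tests the relevant bit. It is not the code from this commit. The constants are the Linux values as I understand them (AT_HWCAP is 16, the NEON bit is 1 << 12 on 32-bit ARM, the ASIMD bit is 1 << 1 on AArch64); the little-endian layout and the function names are assumptions of this sketch. On Linux the bytes could be read from /proc/self/auxv.

```python
import struct

AT_HWCAP = 16            # auxiliary vector key for hardware capabilities
HWCAP_NEON = 1 << 12     # 32-bit ARM: NEON bit
HWCAP_ASIMD = 1 << 1     # 64-bit ARM (AArch64): ASIMD bit


def hwcap_from_auxv(auxv, word_size):
    """Return the AT_HWCAP value from raw auxv bytes, or 0 if absent."""
    # Each entry is a (key, value) pair of native words; little-endian assumed.
    fmt = "<II" if word_size == 4 else "<QQ"
    entry = struct.calcsize(fmt)
    for off in range(0, len(auxv) - entry + 1, entry):
        key, value = struct.unpack_from(fmt, auxv, off)
        if key == 0:  # AT_NULL terminates the vector
            break
        if key == AT_HWCAP:
            return value
    return 0


def has_neon(auxv, word_size):
    """Single 'neon' answer: NEON for 32-bit, ASIMD for 64-bit binaries."""
    hwcap = hwcap_from_auxv(auxv, word_size)
    bit = HWCAP_NEON if word_size == 4 else HWCAP_ASIMD
    return bool(hwcap & bit)
```

A single `has_neon` result mirrors the decision to keep one `_neon` flag: the word size of the binary decides which of the two mutually exclusive bits is checked.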
Also modify aio_read test for wait: write an object, take its active set
down, try to aio_read; verify read doesn't complete until active set is
allowed back up
Fixes: #10104 Signed-off-by: Dan Mick <dan.mick@redhat.com>
Yan, Zheng [Thu, 4 Dec 2014 04:18:47 +0000 (12:18 +0800)]
osdc/Filer: use finisher to execute C_Probe and C_PurgeRange
Currently contexts C_Probe/C_PurgeRange are executed while holding
OSDSession::completion_lock. C_Probe and C_PurgeRange may call
Objecter::stat() and Objecter::remove() respectively, which acquire
Objecter::rwlock. This can cause a deadlock because there is an intermediate
dependency between Objecter::rwlock and OSDSession::completion_lock:
Ken Dreyer [Tue, 2 Dec 2014 01:24:22 +0000 (18:24 -0700)]
heap_profiler: support new gperftools header locations
The google/ headers location has been deprecated as of gperftools 2.0.
As of gperftools 2.2rc, the google/ headers will now give deprecation
warnings, and they will probably disappear in a future gperftools
update.
Jianpeng Ma [Wed, 3 Dec 2014 02:26:26 +0000 (10:26 +0800)]
test/perf_counters: Replace perfcounters_dump with perf dump.
The perfcounters_dump and 'perf dump' commands do the same thing, but
the output of 'ceph --admin-daemon help' only lists 'perf dump', so use
that in the test.
For compatibility, perfcounters_dump is still kept in the code for
existing users.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Sage Weil [Mon, 24 Nov 2014 17:22:30 +0000 (09:22 -0800)]
osd: require SNAPMAPPER feature from peers
This was introduced before cuttlefish. We require users to upgrade first
to a newer release, so there is no need to support a mixed cluster with
such old code.
Ken Dreyer [Tue, 2 Dec 2014 22:52:58 +0000 (15:52 -0700)]
doc: clarify "B" flag in os recommendations page
We don't exactly do continuous builds on all the platforms marked with
"B", but we have published binary RPMs for them. Adjust the "B"
footnote definition to reflect this.
Loic Dachary [Tue, 2 Dec 2014 00:07:34 +0000 (01:07 +0100)]
erasure-code: enforce chunk size alignment
Suppose the ErasureCode::encode function is given a 4096-byte
bufferlist made of a 1249-byte bufferptr followed by a 2847-byte
bufferptr, both properly starting on a SIMD_ALIGN address. When
bufferlist::substr_of extracts the second 2048-byte chunk, its address
starts 799 bytes after the beginning of the 2847-byte bufferptr and is
not SIMD_ALIGN'ed. As a result the second 2048-byte chunk has to be
reallocated.
The ErasureCode::encode must enforce a size alignment based on the chunk
size in addition to the memory alignment required by SIMD operations,
using the bufferlist::rebuild_aligned_size_and_memory function instead of
bufferlist::rebuild_aligned.
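The arithmetic in the example above can be checked with a short sketch. It assumes every segment starts on an aligned address, as stated in the message, and takes SIMD_ALIGN to be 32 bytes, which is an assumption here. The helper names are hypothetical.

```python
SIMD_ALIGN = 32  # assumed alignment in bytes


def chunk_starts(segment_sizes, chunk_size):
    """Return (chunk_index, segment_index, offset_in_segment) per chunk."""
    total = sum(segment_sizes)
    out = []
    for i, start in enumerate(range(0, total, chunk_size)):
        pos = start
        # Walk the segments until the one containing this chunk's first byte.
        for seg, size in enumerate(segment_sizes):
            if pos < size:
                out.append((i, seg, pos))
                break
            pos -= size
    return out


def misaligned_chunks(segment_sizes, chunk_size, align=SIMD_ALIGN):
    """Chunks whose start offset inside an aligned segment is not aligned."""
    return [c for c in chunk_starts(segment_sizes, chunk_size) if c[2] % align]
```

For segments of 1249 and 2847 bytes and a 2048-byte chunk, the second chunk starts at offset 2048 - 1249 = 799 in the second segment, and 799 is not a multiple of 32. Two 2048-byte segments would produce no misaligned chunk.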
Loic Dachary [Tue, 2 Dec 2014 01:04:14 +0000 (02:04 +0100)]
common: allow size alignment that is not a power of two
Do not assume the alignment is a power of two in the is_n_align_sized()
predicate. When used in the context of erasure code it is common
for chunks to not be powers of two.
The function bufferlist::rebuild_aligned checks memory and size
alignment with the same variable. It is however useful to separate
memory alignment constraints from size alignment constraints. For
instance rebuild_aligned could be called to allocate an erasure coded
buffer where each 2048-byte chunk needs to start on a memory address
aligned on 32 bytes.
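To illustrate why the power-of-two assumption matters, the sketch below compares a bitmask test, a common shortcut that is only valid for powers of two, with a modulo test that works for any positive alignment. Whether the original predicate used a bitmask is not stated here; the function names and the 1536-byte chunk size are hypothetical.

```python
def is_aligned_mask(n, align):
    """Bitmask test: only valid when `align` is a power of two."""
    return (n & (align - 1)) == 0


def is_aligned_mod(n, align):
    """Modulo test: valid for any positive `align`."""
    return n % align == 0


def is_power_of_two(n):
    """True when n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0
```

With a chunk size of 1536 the bitmask test is wrong in both directions: it rejects 3072, which is a multiple of 1536, and accepts 2048, which is not. For a power-of-two alignment such as 32 or 2048 the two tests agree.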