Erwan Velu [Mon, 21 Mar 2016 10:47:51 +0000 (11:47 +0100)]
tests: Making "objectstore" calls parallel in osd-scrub-repair.sh
osd-scrub-repair makes several similar objectstore calls sequentially,
although they could easily be parallelized.
Each objectstore call can take up to a dozen seconds, so running the
calls in parallel saves a lot of time while keeping the code pretty
simple.
This particular patch saves approx. 2 minutes on the current code on a recent
laptop. The overall running time of osd-scrub-repair drops from 9m33 to
7m37.
Signed-off-by: Erwan Velu <erwan@redhat.com>
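As a rough illustration of the approach (not the actual osd-scrub-repair.sh code), the sketch below starts several slow calls in the background and then waits for each of them, reporting failure if any call failed. The slow_call and run_in_parallel names are hypothetical; slow_call only stands in for an objectstore invocation.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for one slow objectstore call.
slow_call() {
    sleep 1
    echo "done $1"
}

# Start one background job per argument, then wait for all of them.
run_in_parallel() {
    local pids=() pid id status=0
    for id in "$@"; do
        slow_call "$id" &          # start each call in the background
        pids+=("$!")               # remember its PID
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || status=1    # collect each exit status
    done
    return "$status"
}

run_in_parallel 0 1 2
```

With three one-second calls, the whole run should take roughly one second instead of three. The order of the output lines is not guaranteed.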
Erwan Velu [Mon, 21 Mar 2016 10:30:13 +0000 (11:30 +0100)]
tests: Optimizing wait_for_clean()
wait_for_clean() is a very common call when running make check.
It waits for the cluster to be stable before continuing.
This script was making the same calls twice and could be optimized by
making the useful calls only once.
The is_clean() function was checking num_pgs and get_num_active_clean().
The main loop itself was also calling get_num_active_clean().
This patch inlines is_clean() inside this loop to benefit from a
single get_num_active_clean() call. This avoids a useless call of (ceph +
xmlstarlet).
This patch also moves all the 'timer reset' conditions into an else branch,
which avoids spawning another ceph+xmlstarlet call when we already know we
should reset the timer.
The last modification reduces the sleeping time, as the state of the
cluster changes very fast.
This patch may not look like a big win, but for a test
like test/osd/osd-scrub-repair.sh, we drop from 9m56 to 9m30 while
reducing the number of system calls.
At the scale of make check, that's a lot of savings.
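The loop structure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual helper: get_num_pgs and get_num_active_clean are stubbed here (the real ones query the cluster), the state file, the timeout value, and the wait_for_clean_sketch name are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical stubs: the stubbed cluster becomes clean after three polls.
state=$(mktemp)
echo 0 > "$state"
get_num_pgs() { echo 3; }
get_num_active_clean() {
    local n
    n=$(cat "$state")
    if [ "$n" -lt 3 ]; then n=$((n + 1)); fi
    echo "$n" > "$state"
    echo "$n"
}

wait_for_clean_sketch() {
    local timeout=10 timer=0 prev=-1 cur num_pgs
    num_pgs=$(get_num_pgs)
    while true; do
        cur=$(get_num_active_clean)      # the only query in each iteration
        if [ "$cur" -eq "$num_pgs" ]; then
            return 0                     # inlined is_clean() check
        elif [ "$cur" -ne "$prev" ]; then
            timer=0                      # progress was made: reset the timer
            prev=$cur
        elif [ "$timer" -ge "$timeout" ]; then
            return 1                     # no progress for too long
        fi
        sleep 1
        timer=$((timer + 1))
    done
}

wait_for_clean_sketch && echo "clean"
```

The point of the single if/elif chain is that the value returned by one get_num_active_clean() call is reused for both the cleanliness test and the progress test.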
Erwan Velu [Mon, 21 Mar 2016 10:16:12 +0000 (11:16 +0100)]
tests: Reducing commands in get_num_active_clean()
get_num_active_clean() is called very often but spawns one useless process.
The current "grep -v | wc -l" can easily be replaced by "grep -cv", which
does the same while spawning one process less.
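A small self-contained check of that equivalence, using made-up sample input and a blank-line pattern chosen only for illustration (the pattern used by the real function is not shown in this message):

```shell
#!/usr/bin/env bash
# Made-up sample: three non-blank lines separated by blank lines.
sample=$'active+clean\n\nactive+clean\n\nactive+clean'

# Two processes: grep filters the lines, wc counts them.
old=$(printf '%s\n' "$sample" | grep -v '^[[:space:]]*$' | wc -l)

# One process: grep counts the non-matching lines itself.
new=$(printf '%s\n' "$sample" | grep -cv '^[[:space:]]*$')

# $((old)) strips the padding some wc implementations add.
echo "old=$((old)) new=$new"
```

One difference to keep in mind: grep -c still prints 0 when nothing is selected, but it exits with status 1 in that case, whereas the pipeline's status comes from wc. This can matter in scripts that run with set -e.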
Erwan Velu [Wed, 16 Mar 2016 13:24:04 +0000 (14:24 +0100)]
tests: Adding parallelism for sequential ceph-dencoder calls
The current code was running two ceph-dencoder calls sequentially.
Each call executes pretty fast, but running them one after the other,
multiplied by the number of loop iterations, has a cost.
This patch just runs these two calls in parallel.
As a result, the test/encoding/readable.sh test runs in 4m50 instead of 6m.
The associated loadavg isn't impacted, as it stays at 6 while being run with
nproc=8.
This patch saves 1/6th of the build time without impacting the loadavg.
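A minimal sketch of running a pair of commands concurrently, assuming for illustration that each one writes its output to a temporary file and that the two outputs are compared afterwards. dump_a, dump_b and compare_dumps are hypothetical names, not the ceph-dencoder invocations used by the test.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the two calls that used to run back to back.
dump_a() { sleep 1; echo "payload $1"; }
dump_b() { sleep 1; echo "payload $1"; }

compare_dumps() {
    local tmp1 tmp2 pid1 pid2 rc=0
    tmp1=$(mktemp)
    tmp2=$(mktemp)
    dump_a "$1" > "$tmp1" &             # first call in the background
    pid1=$!
    dump_b "$1" > "$tmp2" &             # second call starts immediately
    pid2=$!
    wait "$pid1" || rc=1                # both must succeed
    wait "$pid2" || rc=1
    cmp -s "$tmp1" "$tmp2" || rc=1      # and produce identical output
    rm -f "$tmp1" "$tmp2"
    return "$rc"
}

compare_dumps sample && echo "outputs match"
```

Writing to separate files keeps the two outputs from interleaving, which is the main thing to watch for once the calls no longer run one after the other.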
Erwan Velu [Tue, 15 Mar 2016 15:00:17 +0000 (16:00 +0100)]
tests: Adding parallelism to encoding/readable.sh
When running make -j x check, we face a weird situation where the makefile
targets are spawned in parallel up to "x", but one of those targets is very
long and sequential.
The "readable.sh" test tries to run ~7.9K tests, of which 5.3K are actually
executed.
The current code takes 23 minutes on a recent laptop (Intel(R) Core(TM)
i7-4810MQ CPU @ 2.80GHz, 32GB of RAM & SSD).
This patch implements parallelism to speed up this process, which is neither
really CPU-bound nor I/O-bound.
By default, readable.sh now uses the number of logical processors to determine
the level of parallelism (by using nproc). If needed, defining the MAX_PARALLEL_JOBS
variable will override this default value.
On the same system, where nproc=8, the resulting execution time is 5m55:
4x faster than the original code.
The overall 'make check' is therefore faster too, dropping from 30 to
16 minutes: 2x faster than the original code.
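The default-with-override behaviour can be sketched as below. This is an illustration only: it runs jobs in simple batches rather than reproducing the scheduling used by readable.sh, run_one_test and run_all are hypothetical names, and the fallback to 1 when nproc is unavailable is an assumption added here.

```shell
#!/usr/bin/env bash
# Default to the number of logical processors; MAX_PARALLEL_JOBS overrides it.
# Falling back to 1 when nproc is missing is an assumption of this sketch.
max_jobs=${MAX_PARALLEL_JOBS:-$(nproc 2>/dev/null || echo 1)}

# Hypothetical stand-in for one test case.
run_one_test() { sleep 1; echo "ok $1"; }

run_all() {
    local running=0 t
    for t in "$@"; do
        run_one_test "$t" &
        running=$((running + 1))
        if [ "$running" -ge "$max_jobs" ]; then
            wait            # let the current batch finish before starting more
            running=0
        fi
    done
    wait                    # wait for the last, possibly partial, batch
}

run_all 1 2 3 4
```

Running it as MAX_PARALLEL_JOBS=2 caps each batch at two jobs regardless of the processor count.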
Sage Weil [Wed, 30 Mar 2016 15:55:55 +0000 (11:55 -0400)]
osdc/Objecter: use full hash value for pg[n]ls ops
Normal ops do this so they can behave when racing with split; pgnls ops
are no different.
In particular, this fixes a bug where we have an old OSDMap that doesn't
reflect a split, and the OSD replies with a 'next' value of the PG's new
max. If we resend the same value to that PG, it'll be out of bounds,
and BlueStore will notice.
Jason Dillaman [Fri, 1 Apr 2016 16:08:12 +0000 (12:08 -0400)]
librbd: avoid throwing error if mirroring is unsupported
Attempting to remove an image will remove the image from the mirroring
directory. However, if the OSD is older and doesn't support this
new feature, avoid throwing an error.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Casey Bodley [Thu, 31 Mar 2016 21:19:12 +0000 (17:19 -0400)]
rgw: add exclusive flag to set_as_default()
this dodges the race in RGWRealm::create() and RGWZoneParams::create()
that decides whether to set the new object as a default. by calling
set_as_default() with exclusive=true, it will fail with EEXIST if a
default is already set
it also fixes an issue with 'realm pull' on a secondary zone, where a
'default' zone may be created but never actually set_as_default()
Dan Mick [Fri, 1 Apr 2016 03:30:00 +0000 (20:30 -0700)]
debian/rules: put init-ceph in /etc/init.d/ceph, not ceph-base
When the package name changed from ceph to ceph-base, dh_installinit
started installing the init script into /etc/init.d/ceph-base. Fix
this by using --name ceph with dh_installinit, which requires
1) naming the .init file ceph-base.ceph.init, and
2) calling dh_installinit separately for each package
Fixes: http://tracker.ceph.com/issues/15329
Signed-off-by: Dan Mick <dan.mick@redhat.com>
Adam Kupczyk [Thu, 18 Feb 2016 09:47:56 +0000 (10:47 +0100)]
tools: Auto complete feature for CLI.
The completion logic has moved from bash to Python.
It is not bound to bash yet. Use it as 'ceph --comp osd ls'.
It can complete commands and print command-line help.
Signed-off-by: Adam Kupczyk <a.kupczyk@mirantis.com>
Dongsheng Yang [Wed, 30 Mar 2016 02:51:31 +0000 (22:51 -0400)]
os/filestore: fix a -Wunused-label warning in compiling.
os/filestore/FileStore.cc: In member function ‘int FileStore::_zero(const coll_t&, const ghobject_t&, uint64_t, size_t)’:
os/filestore/FileStore.cc:3328:2: warning: label ‘out’ defined but not used [-Wunused-label]
out:
^
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Dongsheng Yang [Wed, 30 Mar 2016 02:28:45 +0000 (22:28 -0400)]
test/system: fix a -Wsign-compare warning in compiling.
test/system/st_rados_create_pool.cc: In function ‘std::__cxx11::string get_temp_pool_name(const char*)’:
test/system/st_rados_create_pool.cc:128:9: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
assert(ret < sizeof(poolname));
^
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Samuel Just [Tue, 22 Mar 2016 21:20:12 +0000 (14:20 -0700)]
OSD::handle_pg_create: check same_primary_since
Rather than add a flag to handle_pg_peering_evt, check
same_primary_since here and pass the create event to
handle_pg_peering_evt as if it originated at the current epoch (the
project_pg_history checks in handle_pg_peering_evt will be noops).
Fixes: tracker.ceph.com/issues/15241
Signed-off-by: Samuel Just <sjust@redhat.com>