Casey Bodley [Fri, 22 May 2015 14:38:29 +0000 (10:38 -0400)]
cmake: skip man/CMakeLists.txt
man pages have to be preprocessed now, and can't be installed directly.
skip installing them until we add the cmake-fu to copy what man/Makefile.am
is doing
Casey Bodley [Wed, 13 May 2015 17:15:10 +0000 (13:15 -0400)]
xio: fix for xio_msg release after teardown
The xio_msg pointers to be freed in XioPortal::release_xio_rsp() are no
longer valid after a call to xio_connection_destroy(). We were already
avoiding the call to xio_release_msg() in this case, but were still
dereferencing the xio_msg for its user_context pointer. Moved the check
for is_connected() outside of the loop to avoid any access to msg.
Suggested-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Casey Bodley <casey@cohortfs.com>
Casey Bodley [Fri, 8 May 2015 16:16:20 +0000 (12:16 -0400)]
xio: use ceph clock for timestamps
accelio is using rdtsc to generate xio_msg.timestamp, which can't be
reliably converted to a timeval. we now use ceph_clock_now() to assign
the Message::recv_stamp and recv_complete_stamp
Vu Pham [Fri, 15 May 2015 16:52:20 +0000 (09:52 -0700)]
xio: save nonce for bind address
A missing nonce in the osd addrs was preventing the monitor from
detecting osd restarts. XioMessenger::bind() now sets the nonce in the
same way that SimpleMessenger and AsyncMessenger do
Signed-off-by: Casey Bodley <casey@cohortfs.com>
Signed-off-by: Vu Pham <vu@mellanox.com>
Vu Pham [Wed, 15 Apr 2015 23:33:38 +0000 (16:33 -0700)]
xio: better way to assign connections to specific lane
Assign connections to a specific lane of a portal in a better way,
avoiding lane competition/hogging.
This change resolves the slow ramp-up and spiky behavior seen while
clients start and run I/O.
Ken Dreyer [Tue, 14 Apr 2015 13:58:17 +0000 (07:58 -0600)]
debian: move ceph_argparse into ceph-common
Prior to this commit, if a user installed the "ceph-common" Debian
package without installing "ceph", then /usr/bin/ceph would crash
because it was missing the ceph_argparse library.
Ship the ceph_argparse library in "ceph-common" instead of "ceph". (This
was the intention of the original commit that moved argparse to "ceph",
2a23eac54957e596d99985bb9e187a668251a9ec)
http://tracker.ceph.com/issues/11388
Refs: #11388
Reported-by: Jens Rosenboom <j.rosenboom@x-ion.de>
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
Kefu Chai [Mon, 9 Mar 2015 08:42:34 +0000 (16:42 +0800)]
osd: randomize scrub times to avoid scrub wave
- to avoid a scrub wave when osd_scrub_max_interval is reached on a
high-load OSD, the scrub time is randomized.
- extract scrub_load_below_threshold() out of scrub_should_schedule()
- schedule an automatic scrub job at a time which is uniformly distributed
over [now+osd_scrub_min_interval,
now+osd_scrub_min_interval*(1+osd_scrub_time_limit)]. before
this change, such scrubs were performed once the hard interval
had ended or the system load was below the threshold. with this change, the
jobs are performed as long as the load is low or the interval of
the scheduled scrubs is longer than conf.osd_scrub_max_interval. all
automatic jobs should be performed in the configured time period, otherwise
they are postponed.
- a requested scrub job is now scheduled right away. before this change
it was queued with the timestamp of `now` and postponed until
osd_scrub_min_interval had elapsed.
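A minimal Python sketch of the randomized scheduling described above. It is
not the OSD code: the function names are hypothetical, the parameters simply
mirror the option names in this message, and the "due" check is one reading
of the rule that a job runs once its scheduled time has passed and either the
load is low or the max interval has been exceeded.

```python
import random


def schedule_scrub(now, min_interval, max_interval, time_limit, rng=random):
    """Pick a scrub time uniformly in
    [now + min_interval, now + min_interval * (1 + time_limit)].

    Returns (sched_time, deadline); the deadline models the hard limit
    given by osd_scrub_max_interval (an assumption of this sketch).
    """
    earliest = now + min_interval
    latest = now + min_interval * (1 + time_limit)
    sched_time = rng.uniform(earliest, latest)
    deadline = now + max_interval
    return sched_time, deadline


def scrub_due(now, sched_time, deadline, load_is_low):
    """A scheduled job runs once its time has come and either the load is
    low or the hard deadline has passed; otherwise it stays postponed."""
    if now < sched_time:
        return False
    return load_is_low or now >= deadline
```

Spreading the scheduled times over a window means OSDs that last scrubbed
at the same moment no longer all become eligible at the same instant.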
Douglas Fuller [Tue, 19 May 2015 00:37:00 +0000 (17:37 -0700)]
rbd: expunged xfstests generic/078
This tests RENAME_WHITEOUT, which was enabled for xfs in kernel commit 7dcf5c3e4527cfa2807567b00387cf2ed5e07f00. At first execution, it throws a BUG.
Subsequent executions appear to work correctly. This issue manifests for disks
and RBD instances.
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
John Spray [Mon, 18 May 2015 15:15:07 +0000 (16:15 +0100)]
mds: fix handling missing mydir dirfrag
This was broken by 96992466 aka "mds: handle missing mydir dirfrag"
The previous code was mistakenly treating a not-yet-loaded
dirfrag as a non-existent dirfrag, resulting in
inconsistent fragstats even when no objects had
actually been lost.
Fixes: #11641
Signed-off-by: John Spray <john.spray@redhat.com>
to check_SCRIPTS. Their output is captured in .log file when running
with a recent automake. This reduces the output of make check by an
order of magnitude.
Use ceph-helpers.sh instead of mon/mon-test-helpers.sh.
* modifying the .asok and .log names to match the ceph-helpers.sh
conventions
* use explicit ports 7300 and 7301 instead of +1 so that grep
will show that 7301 is used. This reduces the odds of a
port collision when looking for a port that's not already
used by an existing test.
Instead of using mon-test-helpers.sh, primarily because the kill_daemon
function implemented in mon-test-helpers.sh is not as good as
ceph-helpers.sh.
Instead of having tests that share the same monitor, each test now runs
on a fresh monitor. The test writer no longer has to worry about
re-using the pool or profile from a previous test. Such re-use causes
problems that are difficult to diagnose, and the overhead of running a
new monitor is not so high.
Loic Dachary [Sat, 16 May 2015 13:32:20 +0000 (15:32 +0200)]
tests: ceph-helpers.sh do not hardcode id a in run_mon
Fix hardcoding of id a in the run_mon function. The directory
in which the mon data is stored must be a sub-directory of the
directory given in argument.
If mon_initial_members is set, the rbd pool cannot be redefined, which
is ok because this is rare and it's only an optimization to reduce the
number of PGs.
Loic Dachary [Sat, 16 May 2015 09:12:16 +0000 (11:12 +0200)]
tests: ceph-helpers.sh implement wait_for_osd
The wait_for_osd function, which waits for an osd to go up or down, is
needed internally after running an osd. Move the inline snippet from
run_osd into a function so that it can be used by scripts as well.
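The real helper is a shell function in ceph-helpers.sh. The polling pattern
it factors out can be sketched in Python as below; the signature, the
is_osd_in_state callback and the default retry counts are all assumptions
made for illustration, not the helper's actual interface.

```python
import time


def wait_for_osd(state, osd_id, is_osd_in_state,
                 attempts=300, delay=1.0, sleep=time.sleep):
    """Poll until the osd reaches `state` ("up" or "down").

    `is_osd_in_state(osd_id, state)` is a hypothetical callback that
    reports the current status. Returns True on success and False if
    the osd never reached the state within `attempts` polls.
    """
    for _ in range(attempts):
        if is_osd_in_state(osd_id, state):
            return True
        sleep(delay)
    return False
```

Keeping the loop in one function lets run_osd and standalone test scripts
share the same timeout behaviour.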
Varada Kari [Fri, 15 May 2015 14:16:26 +0000 (19:46 +0530)]
KeyValueStore: Fix the prefix comparison to avoid object leaks.
Iterator becomes invalid due to a partial prefix comparison in
rmkeys_by_prefix, resulting in not deleting the objects from backend.
Modified the comparison to use the given prefix.
Signed-off-by: Varada Kari <varada.kari@sandisk.com>
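To illustrate the general pitfall (this is not the KeyValueStore code), here
is a small Python sketch of removing keys by prefix from a sorted key space.
Entries are modelled as (prefix, key) tuples and the function name is
hypothetical; the point is that the loop compares the whole prefix, so it
neither stops early nor runs into a neighbouring prefix such as "abc" when
asked for "ab".

```python
import bisect


def keys_to_remove(sorted_entries, prefix):
    """Return every (prefix, key) entry whose prefix equals `prefix`.

    `sorted_entries` must be a sorted list of (prefix, key) tuples.
    """
    # Seek to the first entry that could carry this prefix.
    start = bisect.bisect_left(sorted_entries, (prefix, ""))
    doomed = []
    for entry_prefix, key in sorted_entries[start:]:
        # Compare the full prefix, not just a leading part of it.
        if entry_prefix != prefix:
            break
        doomed.append((entry_prefix, key))
    return doomed
```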
John Spray [Tue, 12 May 2015 16:24:58 +0000 (17:24 +0100)]
mds: validate the state+rank in MDS map
Especially:
* once I have been assigned a rank, it
can't be taken away without restarting
the daemon.
* once I have entered standby, I can
only go upwards through the states.
Fixes: #11481
Signed-off-by: John Spray <john.spray@redhat.com>
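The two rules above can be sketched as a small validity check. This is a
Python illustration only: the state list and its ordering are simplified
assumptions, and transition_is_valid and RANK_NONE are hypothetical names
rather than the MDS implementation.

```python
# Illustrative, simplified ordering of daemon states (an assumption).
STATE_ORDER = ["boot", "standby", "replay", "reconnect", "rejoin", "active"]
RANK_NONE = -1  # hypothetical marker for "no rank assigned"


def transition_is_valid(old_rank, old_state, new_rank, new_state):
    """Check a (rank, state) change seen in a new map against the rules."""
    # Rule 1: an assigned rank cannot be taken away or changed
    # without restarting the daemon.
    if old_rank != RANK_NONE and new_rank != old_rank:
        return False
    # Rule 2: from standby onwards, the state may only move upwards.
    old_level = STATE_ORDER.index(old_state)
    new_level = STATE_ORDER.index(new_state)
    if old_level >= STATE_ORDER.index("standby") and new_level < old_level:
        return False
    return True
```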
Kefu Chai [Thu, 14 May 2015 10:51:22 +0000 (18:51 +0800)]
doc: use @name to define a group, not @group
we are able to output a specified group using the `doxygengroup`
directive in breathe. this directive prints out the description of
the group, but it's not realistic to enumerate all groups defined in
the source code in the rst files. the doxygen command @name also
helps to group functions together. the downside of this approach is
that we cannot add more items to a group later on, but that should be
fine for us since, in our case, all the grouped items live in a single
header file.
David Zafman [Tue, 12 May 2015 22:28:07 +0000 (15:28 -0700)]
test: Add config changes to all tests to avoid order dependency
ReplayCorrupt was crashing because journal_ignore_corruption wasn't set to true.
Improve ReplayCorrupt by checking the setting of bool corrupt and using ASSERT_FALSE()