Samuel Just [Wed, 23 Nov 2016 19:23:54 +0000 (11:23 -0800)]
PGTransaction,ReplicatedPG: clarify handling of noop operations
The offending transaction was [call rbd.copyup,delete] on a non-existent
object. PGTransaction incorrectly ended up with Create and delete_first
causing a transaction beginning with trying to collection_move_rename a
non-existent head object. In fact, if we delete an object which the
transaction currently claims to be creating, the transaction should show
as empty (for that object). Rather than going through the normal write
pipeline for that case, let's just record a 0 error code in the log and
call it a day. That way, the transaction generating code only needs to
worry about updates to actual objects.
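
A minimal sketch of the intended behaviour, assuming a toy per-object op map. ToyTransaction and its method names are hypothetical and are not the PGTransaction API; the point is only that deleting an object the transaction itself is creating leaves no entry for that object.

```python
class ToyTransaction:
    """Toy model of per-object pending ops (hypothetical, not PGTransaction)."""

    def __init__(self, existing):
        self.existing = set(existing)   # objects that exist before the transaction
        self.ops = {}                   # oid -> "create" | "modify" | "delete" | "replace"

    def write(self, oid):
        current = self.ops.get(oid)
        if current == "delete":
            # Delete followed by write: the old object is replaced.
            self.ops[oid] = "replace"
        elif current is None:
            self.ops[oid] = "modify" if oid in self.existing else "create"

    def remove(self, oid):
        if oid in self.existing:
            self.ops[oid] = "delete"
        else:
            # The object never existed: creating and then deleting it,
            # or deleting it outright, nets out to nothing for this object.
            self.ops.pop(oid, None)

    def is_noop(self):
        return not self.ops


# The [write, delete] pattern on a non-existent object ends up empty.
t = ToyTransaction(existing=[])
t.write("obj")
t.remove("obj")
print(t.is_noop())
```

In this sketch a caller that sees is_noop() return True would skip the write pipeline and only record the result, which mirrors the approach the commit message describes.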
Revert "rgw: temporarily use std::vector in place of static_vector of Boost."
This reverts commit bc23e0f7fa8491c44fa938eeb954197f6aad2367.
We're doing that because the reverted commit was a makeshift
solution to not fail Ceph compilation on platforms lacking
Boost modern enough to ship the container/static_vector.hpp.
As we got the in-tree Boost the commit isn't necessary anymore.
Loic Dachary [Mon, 21 Nov 2016 09:32:17 +0000 (10:32 +0100)]
tests: fix ceph-helpers.sh wait_for_clean delays
The TENTH_TIMEOUT was not declared as an int and failed to be set with
the correct number. The test of the function did not catch this.
Implement computing of the increasingly large sleep delays in a separate
function so that it can be tested more easily. Give up on sub-second
sleep because the function will not sleep at all if the cluster is
already clean. And if it is not already clean, it is very unlikely to
become clean within less than a second. The downside of having very
short sleep times is that they needlessly stress the machine and also
possibly spam the logs.
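
Computing the delays in a separate function could look roughly like the sketch below. It is written in Python for illustration, whereas ceph-helpers.sh is a shell script; backoff_delays, its parameters and the doubling rule are assumptions, not the helper's actual implementation.

```python
def backoff_delays(timeout, first_step=1):
    """Return whole-second delays that double each step and sum to `timeout`.

    Hypothetical helper: the doubling rule and the names are assumptions.
    """
    if first_step < 1:
        raise ValueError("first_step must be at least one second")
    delays = []
    step = first_step
    remaining = timeout
    while remaining > 0:
        delay = min(step, remaining)   # never sleep past the overall timeout
        delays.append(delay)
        remaining -= delay
        step *= 2
    return delays


print(backoff_delays(20))
```

Because the function is pure, a test can check the whole sequence without sleeping, which is the testability benefit the commit message is after.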
Venky Shankar [Wed, 9 Nov 2016 11:28:06 +0000 (16:58 +0530)]
librados: drop io_ctx_impl on ioctx_create/create2
close() was never called for the passed in IoCtx which
could probably result in an IoCtx leak if the original
IoCtx was a valid pool context allocated earlier.
It's kind of better to do it here rather than to leave
the destruction on the caller for better (or cleaner)
common case handling.
rgw: compilation of the ASIO front-end is enabled by default.
We're changing the default value because the previous one was
a makeshift solution to not fail Ceph compilation due to
the Beast's dependency on Boost >= 1.54 that wasn't available
on CentOS 7. As we got the in-tree Boost we can compile
the ASIO front-end by default.
Loic Dachary [Fri, 18 Nov 2016 07:06:02 +0000 (08:06 +0100)]
tests: save 9 characters for asok paths
For vstart.sh powered tests, save 9 characters in the path name
by replacing testdir/test- with td/t-
60 characters imposed by jenkins
9 characters for src/test
5 characters for td/t-
33 left (instead of 24) for the test to create asok such as out/client.admin.25327.asok
Moving these files outside of the build directory is a bad idea because
tests should only create/use files within the builddir and not write
outside of this directory. Doing so would make things more complicated
for cleanup in case the test fails and create other problems as a
consequence (filling out disk space, conflicting directories between
runs etc.).
For ceph-helpers.sh tests replace testdir with td, saving 5 characters.
This is not strictly necessary but keeps the directory names consistent:
if the developer wants to get rid of all the test leftovers, it is
enough to remove a single directory: td.
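
The character budget listed in this commit message can be checked with a few lines of arithmetic. Every figure below is taken from the message; the total of 107 is only inferred by summing them (it is consistent with a 108-byte socket path buffer that includes a terminator, but the commit does not say so).

```python
# Figures copied from the commit message.
JENKINS_PREFIX = 60   # characters imposed by jenkins
SRC_TEST = 9          # characters for src/test
NEW_TEST_DIR = 5      # characters for td/t-
SAVED = 9             # characters saved compared with testdir/test-
LEFT_AFTER = 33       # characters left for the asok path after the change

# Inferred, not stated in the commit: the overall path budget.
TOTAL = JENKINS_PREFIX + SRC_TEST + NEW_TEST_DIR + LEFT_AFTER


def left_for_asok(test_dir_chars):
    """Characters remaining for the asok path given the test directory prefix length."""
    return TOTAL - JENKINS_PREFIX - SRC_TEST - test_dir_chars


LEFT_BEFORE = left_for_asok(NEW_TEST_DIR + SAVED)

example = "out/client.admin.25327.asok"
fits_after = len(example) <= LEFT_AFTER
fits_before = len(example) <= LEFT_BEFORE
print(LEFT_BEFORE, fits_before, fits_after)
```

Under these figures the example asok path from the message fits only with the shortened prefix.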
David Disseldorp [Thu, 17 Nov 2016 16:55:26 +0000 (17:55 +0100)]
doc/cephfs: add note about deletion from OSD restricted pool
As described in http://tracker.ceph.com/issues/17937, a client with
restricted pool access can still delete files unless a corresponding
MDS path restriction is also in place.
Matt Benjamin [Tue, 15 Nov 2016 22:43:16 +0000 (17:43 -0500)]
cmake: produce civetweb.h, again
The recent change to do this logic with file copy (and in src/rgw)
resolved the build problem, but now updates to the civetweb
submodule were not reflected in the build.
Move the copy into a custom target which will always source the
current submodule version at build time.
Avoid using the BYPRODUCTS option, as it is not supported in many
older cmake versions (e.g., CentOS 7).
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Samuel Just [Fri, 21 Oct 2016 21:29:09 +0000 (14:29 -0700)]
osd/: cleanup the snap trimmer and deal with delayed repops
With the PGBackend changes, it's not necessarily the case that
calling simple_opc_submit synchronously updates the SnapMapper.
Thus, we can't rely on being able to just ask the snap mapper
for the next object immediately (we could well loop on the same
one if ECBackend is flushing the pipeline). Instead, update
SnapMapper and the SnapTrimmer to grab N at a time.
Additionally, we need to make sure we don't try this again until
all of the previously submitted repops are flushed (a good idea
anyway). To that end, this patch also refactors the SnapTrimmer
machine to be fully explicit about why it's blocked so we can be
sure that we don't queue an async work item unless we really
want to.
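
The batching and the explicit blocking reason can be illustrated with a toy model. ToySnapTrimmer and its methods are hypothetical and do not correspond to the actual SnapTrimmer state machine; the mapper list stands in for the SnapMapper, which in this model is only updated once a repop is flushed.

```python
class ToySnapTrimmer:
    """Toy model: fetch up to N objects, then wait until all repops are flushed."""

    def __init__(self, mapper, batch_size):
        self.mapper = list(mapper)     # stand-in for SnapMapper contents
        self.batch_size = batch_size
        self.in_flight = set()         # repops submitted but not yet flushed

    def blocked_reason(self):
        # Being explicit about why we are blocked avoids queueing pointless work.
        if self.in_flight:
            return "waiting for repops to flush"
        if not self.mapper:
            return "nothing left to trim"
        return None

    def maybe_submit_batch(self):
        if self.blocked_reason() is not None:
            return []
        # The mapper is NOT updated here; asking again before the flush
        # would return the same objects, hence the in_flight guard above.
        batch = self.mapper[:self.batch_size]
        self.in_flight.update(batch)
        return batch

    def repop_flushed(self, oid):
        self.in_flight.remove(oid)
        self.mapper.remove(oid)        # only now does the mapper change


trimmer = ToySnapTrimmer(["a", "b", "c"], batch_size=2)
print(trimmer.maybe_submit_batch())
print(trimmer.blocked_reason())
```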
Samuel Just [Wed, 19 Oct 2016 16:56:46 +0000 (09:56 -0700)]
osd/ECBackend: use an explicit backfill field on ECSubWrite
Previously, we used an empty transaction to indicate when we
were sending the op to a backfill peer which needs the logs,
but can't run the transaction. I'd like to be able to send
an empty transaction for the rollforward side effect without
it causing the peer to think it missed a backfill op, so
instead, use an explicit flag. Compatibility is handled by
interpreting an old version encoding with an empty transaction
as having the backfill field filled.
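
The compatibility rule can be sketched as a versioned decode. ToySubWrite, decode_sub_write and the version number are hypothetical; the actual ECSubWrite encoding is not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical: first encoding version that carries the explicit backfill field.
BACKFILL_FIELD_VERSION = 2


@dataclass
class ToySubWrite:
    transaction: list
    backfill: bool


def decode_sub_write(struct_v, transaction, backfill=None):
    """Decode a toy sub-write, inferring `backfill` for old encodings."""
    if struct_v >= BACKFILL_FIELD_VERSION:
        if backfill is None:
            raise ValueError("new encodings must carry the backfill field")
        return ToySubWrite(transaction, backfill)
    # Old encoding: an empty transaction used to mean "backfill peer".
    return ToySubWrite(transaction, backfill=(len(transaction) == 0))


old = decode_sub_write(1, [])
new = decode_sub_write(2, [], backfill=False)
print(old.backfill, new.backfill)
```

With the explicit field, a new sender can ship an empty transaction without the receiver mistaking it for a backfill op, while messages from old senders keep their previous meaning.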
Samuel Just [Tue, 18 Oct 2016 21:46:53 +0000 (14:46 -0700)]
ReplicatedPG::OpContext::start_async_reads: tolerate case sync callback call
If the read can be completed immediately, objects_read_async will call
the callback synchronously, which will result in ctx being cleaned up.
Clear pending_async_reads before the call.
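
The ordering issue can be shown with a toy model in which the backend always completes the read immediately. All names below are hypothetical stand-ins; in particular, the cleanup check models "ctx must not have pending reads when it is torn down" rather than the real object lifetime problem.

```python
class ToyOpContext:
    """Toy stand-in for an op context that owns a list of pending reads."""

    def __init__(self, pending):
        self.pending_async_reads = list(pending)
        self.cleaned_up = False

    def cleanup(self):
        # Modelled invariant: nothing may still be pending at cleanup time.
        if self.pending_async_reads:
            raise RuntimeError("cleaned up with pending reads")
        self.cleaned_up = True


def toy_objects_read_async(reads, on_complete):
    # This toy backend can always complete at once, so it calls back synchronously.
    on_complete(reads)


def start_async_reads(ctx):
    # Take the reads and clear the field BEFORE the call, because the
    # callback may run (and clean up ctx) before the call returns.
    reads, ctx.pending_async_reads = ctx.pending_async_reads, []
    toy_objects_read_async(reads, lambda done: ctx.cleanup())


ctx = ToyOpContext(["read-1", "read-2"])
start_async_reads(ctx)
print(ctx.cleaned_up)
```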
Samuel Just [Thu, 3 Nov 2016 00:38:13 +0000 (17:38 -0700)]
osd/: use PGBackend::call_write_ordered to submit log entries in commit order
Without this change, we might submit new log entries for marking objects
unfound in a way that causes replicas to process them out-of-order with
pending writes with lower version numbers. That would be bad. Instead,
add an interface to allow an arbitrary callback to be called after any
previously submitted transaction commit, but before any subsequently
submitted operations commit.
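
The ordering guarantee can be illustrated with a single FIFO queue shared by commits and callbacks. ToyOrderedBackend is a hypothetical model, not PGBackend; only the name call_write_ordered is taken from the commit.

```python
from collections import deque


class ToyOrderedBackend:
    """Toy model: commits and ordered callbacks share one FIFO queue."""

    def __init__(self):
        self.queue = deque()
        self.log = []

    def submit_transaction(self, name):
        # The commit of this transaction happens when its queue entry runs.
        self.queue.append(lambda: self.log.append("commit " + name))

    def call_write_ordered(self, callback):
        # Runs after everything queued so far, before anything queued later.
        self.queue.append(callback)

    def drain(self):
        while self.queue:
            self.queue.popleft()()


backend = ToyOrderedBackend()
backend.submit_transaction("w1")
backend.call_write_ordered(lambda: backend.log.append("mark unfound"))
backend.submit_transaction("w2")
backend.drain()
print(backend.log)
```

In this model the log entry for marking objects unfound cannot overtake w1 or be overtaken by w2, which is the property replicas need in order to process entries in version order.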
Samuel Just [Fri, 21 Oct 2016 21:33:08 +0000 (14:33 -0700)]
osd/: Update PGBackend users to project last_update and submit stat deltas
The RMW pipeline means that we don't start committing an update
immediately, so we can't update the log synchronously with
submit_transaction. Thus, in order to pipeline writes, PG/ReplicatedPG
will need to project last_update and abstain from updating info
directly (updating info.stats was the only offender).
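
A rough illustration of projecting last_update, assuming integer versions. ToyPG and its fields are hypothetical and much simpler than the real PG state; the stat delta is modelled as a plain number applied when the update lands.

```python
class ToyPG:
    """Toy model: versions are handed out ahead of the committed log."""

    def __init__(self):
        self.last_update = 0             # what the log actually contains
        self.projected_last_update = 0   # includes writes still in the pipeline
        self.num_objects = 0             # stand-in for a stat in info.stats

    def next_version(self):
        # Pipelined writes take their version from the projection,
        # since the log has not caught up yet.
        self.projected_last_update += 1
        return self.projected_last_update

    def on_log_updated(self, version, stat_delta):
        # Called later, when the backend really appends the entry.
        self.last_update = version
        self.num_objects += stat_delta   # apply the delta instead of writing stats directly


pg = ToyPG()
v1 = pg.next_version()
v2 = pg.next_version()
pg.on_log_updated(v1, stat_delta=1)
print(pg.last_update, pg.projected_last_update)
```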
Samuel Just [Tue, 15 Nov 2016 23:47:37 +0000 (15:47 -0800)]
osd/: refactor PGLog a bit and add support for rolling back extents
It was hard to reason about the validity of the IndexedLog internal
pointers and iterators during updates, so this patch cleans that up
a bunch. It also moves responsibility for doing rollbacks into
PGBackend. Finally, it adds support for the new log entry format.
Samuel Just [Sat, 27 Aug 2016 18:33:02 +0000 (11:33 -0700)]
osd/: 's/trim_rollback_to/roll_forward_to/g'
trim_rollback_to was a not terrible name before in that all
it ever did is (possibly) trim the stashed version of the
object. However, now, it's going to encompass, in general,
the roll_forward part of a tpc (which will still be to
delete the stashed object in cases where that is
appropriate).