Loic Dachary [Wed, 8 Jan 2014 11:41:06 +0000 (12:41 +0100)]
mon: shell test helpers to run MONs from sources
The intent is to make it more convenient to reproduce a specific mon
behavior and observe the result by grepping the logs. It can be handy
for bug diagnostic. The test could be included as a unit test to
be run on make check.
The setup function will prepare a directory and kill leftover from a
previous run. The teardown function cleans up on success. The run
function is expected to be provided by the calling script and can make
use of the run_mon function to mkfs + run a monitor.
Loic Dachary [Sun, 26 Jan 2014 12:32:57 +0000 (13:32 +0100)]
unittests: fail early when low on disk
Scripts from qa that are run as unittests via test/vstart_wrapper.sh may
fail because the partition on which it runs is low on space ( 95% full
). When it happens the cause of the problem may be unclear because it
is likely to show only as a client not being able to reach the mon.
A test is added in test/vstart_wrapper.sh to verify the disk space usage
using the same method as the mon would and fail with a detailed error if
it is the case.
Sage Weil [Thu, 23 Jan 2014 17:16:54 +0000 (09:16 -0800)]
osd/OSDMap: do not create erasure rule by default
If we do, we will require the v2 feature bit from clients.
We could only include feature bits for rules that are actually referenced
by pools, but for now making the user create the rule is simpler. There is
no need to create this rule ahead of time.
Samuel Just [Sun, 19 Jan 2014 09:17:49 +0000 (01:17 -0800)]
PG: drop messages from down peers
This overlaps with the existing old_peering_msg() mechanism
except in one case: pulls from a replica not in the acting
set. If such a replica gets marked down, we may resend
pulls to another replica without causing a new interval
to start. If we recieved, but didn't process, a push in
response to such a pull prior to processing the map marking
the peer down, we might process the push after having reset
the pull state for a different pull operation. We can
avoid this by discarding ops from down peers.
Samuel Just [Thu, 16 Jan 2014 20:04:01 +0000 (12:04 -0800)]
PG::calc_acting: consider newest_update_osd when choosing backfill peers
We must include newest_update_osd->second.log_tail when considering backfill
peers because in GetLog we will request logs back to the min last_update over
our acting_backfill set. This will result in our log being extended as far
backwards as necessary to pick up any peers which can be log recovered by the
union of newest_update_osd's log and that of the chosen primary.
Samuel Just [Sat, 7 Dec 2013 22:52:49 +0000 (14:52 -0800)]
ReplicatedPG: handle removing the old object in finish_copy_op
do_osd_ops will need to either copy the old version out of the
way or simply delete it depending on mod_desc. Thus, defer
handling filling that part in until we finish the copy op.
Samuel Just [Mon, 9 Dec 2013 03:36:51 +0000 (19:36 -0800)]
PGLog: create interface allowing interface user to cleanup/rollback
We need to be able to allow the PGLog interface user to provide
logic for rolling back and trimming log entries. To that end,
serveral PGLog methods now take a LogEntryHander.
In PGLog::merge_old_entry, if prior_version > info.log_tail and
the object is not missing, we must have rolled back the prior
log entry. Thus, we don't skip the entry.
To simplify the code, _merge_old_entry has been split out as
a const helper. This way, proc_replica_log can be reexpressed
as merging the divergent replica log entries with the fully
merged authoritative log.
Samuel Just [Thu, 10 Oct 2013 23:12:10 +0000 (16:12 -0700)]
ReplicatedBackend: implement RPGTransaction
RPGTransaction is essentially a wrapped ObjectStore::Transaction.
The coll_t argument is elided, tempness is instead encoded in the
hobject. RPGTransaction tracks which temp objects are created and
cleared so we can update the ReplicatedBackend tracking and possibly
create the temp collection as needed.
Samuel Just [Thu, 10 Oct 2013 23:10:36 +0000 (16:10 -0700)]
hobject_t/ReplicatedPG: tempness is now an hobject thing
PGBackend implmentations will have complete control over the temp
collection. Rather than specifying the collection when sending
ops into the PGBackend, hobjects themselves will be temp or not.
Samuel Just [Thu, 5 Dec 2013 00:06:17 +0000 (16:06 -0800)]
test/osd: restructure Object/RadosModel in prep for append
Attribute handling no longer has special support in ContentsGenerator.
The most recent operation information is now stored in a special
attr rather than at the beginning of the object. ObjectDesc layers
include their own ContentsGenerators to allow more flexibility.
Also, writes truncate to the new object size rather than simply
causing reads to stop at that object size.
Samuel Just [Fri, 22 Nov 2013 19:20:23 +0000 (11:20 -0800)]
OSDMonitor: add debug_fake_ec_pool
This flag will cause ReplicatedPG to act as though the
pool were actually an EC pool in that operations will
be restricted to operations which can be locally rolled
back thereby allowing us to test the ReplicatedPG local
log rollback mechanisms independent of EC. It will also
cause ReplicatedPG to use the async read mechanism on
the PGBackend implementation once it is implemented.
Samuel Just [Sat, 7 Dec 2013 21:19:49 +0000 (13:19 -0800)]
PGLog: don't move up log.tail
Moving up log.tail unnecessarily risks backfilling
a replica after a split. Also, it disrupts the
property that replicas from the most recent interval
which performed writes must have overlapping logs.
Ilya Dryomov [Wed, 22 Jan 2014 15:33:39 +0000 (17:33 +0200)]
MOSDMap: reencode maps if target doesn't have OSDMAP_ENC
Reencode both full and incremental maps if target doesn't know how to
decode OSDMAP_ENC maps (CEPH_FEATURE_OSDMAP_ENC bit is not set). This
fixes a compatibility bug that was introduced in 3d7c69fb0986 ("OSDMap:
add a CEPH_FEATURE_OSDMAP_ENC feature, and use new encoding").