Samuel Just [Sun, 19 Jan 2014 09:17:49 +0000 (01:17 -0800)]
PG: drop messages from down peers
This overlaps with the existing old_peering_msg() mechanism
except in one case: pulls from a replica not in the acting
set. If such a replica gets marked down, we may resend
pulls to another replica without causing a new interval
to start. If we received, but didn't process, a push in
response to such a pull prior to processing the map marking
the peer down, we might process the push after having reset
the pull state for a different pull operation. We can
avoid this by discarding ops from down peers.
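A minimal sketch of the check described above, not the actual PG code. SketchMap, SketchMsg and should_drop are hypothetical names invented for illustration; the real change consults the OSD map the PG is currently processing.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the OSD map: only the set of OSD ids marked up.
struct SketchMap {
  std::set<int> up_osds;
  bool is_up(int osd) const { return up_osds.count(osd) != 0; }
};

// Hypothetical message type: only the sending OSD matters for this check.
struct SketchMsg {
  int from_osd;
};

// Returns true when the message comes from a peer the map considers down,
// so a stale push cannot be applied against freshly reset pull state.
bool should_drop(const SketchMap& map, const SketchMsg& msg) {
  return !map.is_up(msg.from_osd);
}

// Small usage check: osd 1 is up, osd 3 is not.
void demo_should_drop() {
  SketchMap map;
  map.up_osds.insert(1);
  SketchMsg from_up = {1};
  SketchMsg from_down = {3};
  assert(!should_drop(map, from_up));
  assert(should_drop(map, from_down));
}
```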
Samuel Just [Thu, 16 Jan 2014 20:04:01 +0000 (12:04 -0800)]
PG::calc_acting: consider newest_update_osd when choosing backfill peers
We must include newest_update_osd->second.log_tail when considering backfill
peers because in GetLog we will request logs back to the min last_update over
our acting_backfill set. This will result in our log being extended as far
backwards as necessary to pick up any peers which can be log recovered by the
union of newest_update_osd's log and that of the chosen primary.
Samuel Just [Sat, 7 Dec 2013 22:52:49 +0000 (14:52 -0800)]
ReplicatedPG: handle removing the old object in finish_copy_op
do_osd_ops will need to either copy the old version out of the
way or simply delete it depending on mod_desc. Thus, defer
filling that part in until we finish the copy op.
Samuel Just [Mon, 9 Dec 2013 03:36:51 +0000 (19:36 -0800)]
PGLog: create interface allowing interface user to cleanup/rollback
We need to be able to allow the PGLog interface user to provide
logic for rolling back and trimming log entries. To that end,
several PGLog methods now take a LogEntryHandler.
In PGLog::merge_old_entry, if prior_version > info.log_tail and
the object is not missing, we must have rolled back the prior
log entry. Thus, we don't skip the entry.
To simplify the code, _merge_old_entry has been split out as
a const helper. This way, proc_replica_log can be reexpressed
as merging the divergent replica log entries with the fully
merged authoritative log.
Samuel Just [Thu, 10 Oct 2013 23:12:10 +0000 (16:12 -0700)]
ReplicatedBackend: implement RPGTransaction
RPGTransaction is essentially a wrapped ObjectStore::Transaction.
The coll_t argument is elided; tempness is instead encoded in the
hobject. RPGTransaction tracks which temp objects are created and
cleared so we can update the ReplicatedBackend tracking and possibly
create the temp collection as needed.
Samuel Just [Thu, 10 Oct 2013 23:10:36 +0000 (16:10 -0700)]
hobject_t/ReplicatedPG: tempness is now an hobject thing
PGBackend implementations will have complete control over the temp
collection. Rather than specifying the collection when sending
ops into the PGBackend, hobjects themselves will be temp or not.
Samuel Just [Thu, 5 Dec 2013 00:06:17 +0000 (16:06 -0800)]
test/osd: restructure Object/RadosModel in prep for append
Attribute handling no longer has special support in ContentsGenerator.
The most recent operation information is now stored in a special
attr rather than at the beginning of the object. ObjectDesc layers
include their own ContentsGenerators to allow more flexibility.
Also, writes truncate to the new object size rather than simply
causing reads to stop at that object size.
Samuel Just [Fri, 22 Nov 2013 19:20:23 +0000 (11:20 -0800)]
OSDMonitor: add debug_fake_ec_pool
This flag will cause ReplicatedPG to act as though the
pool were actually an EC pool in that operations will
be restricted to operations which can be locally rolled
back thereby allowing us to test the ReplicatedPG local
log rollback mechanisms independent of EC. It will also
cause ReplicatedPG to use the async read mechanism on
the PGBackend implementation once it is implemented.
Samuel Just [Sat, 7 Dec 2013 21:19:49 +0000 (13:19 -0800)]
PGLog: don't move up log.tail
Moving up log.tail unnecessarily risks backfilling
a replica after a split. Also, it disrupts the
property that replicas from the most recent interval
which performed writes must have overlapping logs.
Ilya Dryomov [Wed, 22 Jan 2014 15:33:39 +0000 (17:33 +0200)]
MOSDMap: reencode maps if target doesn't have OSDMAP_ENC
Reencode both full and incremental maps if target doesn't know how to
decode OSDMAP_ENC maps (CEPH_FEATURE_OSDMAP_ENC bit is not set). This
fixes a compatibility bug that was introduced in 3d7c69fb0986 ("OSDMap:
add a CEPH_FEATURE_OSDMAP_ENC feature, and use new encoding").
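The decision described above reduces to a feature-bit test on the target. A rough sketch follows; SKETCH_FEATURE_OSDMAP_ENC and needs_reencode are hypothetical names, and the bit position is an assumption for illustration, not the real value of CEPH_FEATURE_OSDMAP_ENC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bit; the real CEPH_FEATURE_OSDMAP_ENC value is not
// reproduced here.
const uint64_t SKETCH_FEATURE_OSDMAP_ENC = 1ULL << 0;

// Returns true when the target did not advertise the feature, meaning both
// full and incremental maps must be reencoded in the older format.
bool needs_reencode(uint64_t target_features) {
  return (target_features & SKETCH_FEATURE_OSDMAP_ENC) == 0;
}

// Small usage check: an old peer needs reencoding, a new one does not.
void demo_needs_reencode() {
  assert(needs_reencode(0));
  assert(!needs_reencode(SKETCH_FEATURE_OSDMAP_ENC));
}
```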
Kai Zhang [Sat, 18 Jan 2014 20:17:10 +0000 (12:17 -0800)]
Missing a key for perm 'w' in permmap (src/pybin/ceph_rest_api.py:277)
It leads to a 500 error when getting mds help info via rest api.
Changed "w" to "rw" in MonCommands.h
Fixes: #7180
Signed-off-by: Kai Zhang <kazhang2@cisco.com>
Noah Watkins [Fri, 20 Dec 2013 15:56:58 +0000 (09:56 -0600)]
libc++: fix null pointer comparison
This error is thrown when comparing a shared_ptr to NULL. To resolve
this we just use shared_ptr::operator bool that checks if the stored
pointer is null.
In C++11 the shared_ptr can be compared to nullptr, but as of yet I have
not come up with a good compatibility fix.
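For illustration, the pattern looks roughly like this. The sketch uses std::shared_ptr from C++11 rather than the ceph::shared_ptr alias the tree uses, and Widget and value_or_default are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type for the example.
struct Widget {
  int value;
};

// Tests the pointer through its boolean conversion instead of writing
// `p == NULL`, which is the comparison libc++ rejects.
int value_or_default(const std::shared_ptr<Widget>& p, int fallback) {
  if (!p)
    return fallback;
  return p->value;
}

// Small usage check: an empty pointer yields the fallback.
void demo_value_or_default() {
  std::shared_ptr<Widget> empty;
  assert(value_or_default(empty, 7) == 7);
}
```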
Noah Watkins [Tue, 29 Oct 2013 20:41:20 +0000 (13:41 -0700)]
libc++: avoid hash re-definitions
The definitions of hash<> for int64_t/uint64_t that were not available
on i386 in the __gnu_cxx namespace are available when we switch over to
std::tr1 namespace so we remove them to avoid the redefinition errors.
Noah Watkins [Tue, 29 Oct 2013 14:47:05 +0000 (07:47 -0700)]
libc++: use ceph::shared_ptr in installed header
librados.hpp uses std::tr1::shared_ptr which may not be available such
as in libc++. This switches the use to ceph::shared_ptr and as a result
also ships include/memory.h for the definition.
Greg Farnum [Sat, 18 Jan 2014 01:23:33 +0000 (17:23 -0800)]
OSDMap: Populate primary_temp values a little more carefully
In _get_temp_osds(), we populate temp_pg from the list in the OSDMap,
but we also skip anybody in the list who's down. We need to account
for those skips when setting the primary. It's easy enough to do -- just
look at the output pg_temp list instead of the OSDMap's starting one.
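A simplified sketch of the idea, not the real _get_temp_osds(): filter_and_pick_primary is a hypothetical helper, the down set stands in for the map's up/down state, and any explicit primary_temp override is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Copies the raw pg_temp list into out_temp, skipping down OSDs, then picks
// the primary from the filtered output rather than from the raw list.
// Returns -1 when every OSD in the list is down.
int filter_and_pick_primary(const std::vector<int>& raw_temp,
                            const std::set<int>& down,
                            std::vector<int>* out_temp) {
  out_temp->clear();
  for (std::size_t i = 0; i < raw_temp.size(); ++i) {
    if (down.count(raw_temp[i]) != 0)
      continue;  // skipped entries must not become the primary
    out_temp->push_back(raw_temp[i]);
  }
  return out_temp->empty() ? -1 : out_temp->front();
}

// Small usage check: osd 3 leads the raw list but is down, so osd 1 wins.
void demo_filter_and_pick_primary() {
  std::vector<int> raw;
  raw.push_back(3);
  raw.push_back(1);
  raw.push_back(2);
  std::set<int> down;
  down.insert(3);
  std::vector<int> out;
  assert(filter_and_pick_primary(raw, down, &out) == 1);
  assert(out.size() == 2);
}
```

Taking raw_temp.front() instead would name osd 3 as primary even though it was skipped, which is the case the commit guards against.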