Sage Weil [Fri, 15 Feb 2019 22:22:14 +0000 (16:22 -0600)]
Merge PR #26448 into master
* refs/pull/26448/head:
osd: do not send peers really old maps
osd: build_incremental_map_msg: recover if we are missing an incremental
osd: build_incremental_map_msg: behave if we have latest full but not incremental
Sage Weil [Fri, 15 Feb 2019 22:08:55 +0000 (16:08 -0600)]
Merge PR #26345 into master
* refs/pull/26345/head:
ceph-osd: be helpful about upgrade gate
mon: record 'min_mon_release' file and prevent startup if it's too old
mon/Monitor: do not join cluster that is >2 releases old
mon/Elector: respect monmap min_mon_release
mon: check min_mon_release during probe
mon/MonmapMonitor: increase min_mon_release when full quorum is upgraded
mon: maintain quorum_min_mon_release
mon/MonMapMonitor: set initial min_mon_release
monmaptool: add --set-min-mon-release
monmaptool: don't spew usage on any error
mon/MonMap: add min_mon_release field
Sage Weil [Fri, 15 Feb 2019 14:43:23 +0000 (08:43 -0600)]
osd: do not send peers really old maps
We may receive a message that sat in a queue for a while with a low
priority and is tagged with an older epoch. Don't send a bunch of old
maps that we have already sent the peer.
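A minimal sketch of the idea described above, not the actual OSD code: if a queued message carries an epoch older than what the peer has already been sent, start from the newer of the two. The function names and the notion of a per-peer "last sent epoch" are assumptions made for illustration.

```python
def first_epoch_to_send(msg_epoch, last_sent_epoch):
    # Hypothetical helper: never go back further than what the peer
    # was already sent, even if the message is tagged with an older epoch.
    return max(msg_epoch, last_sent_epoch) + 1


def epochs_to_send(msg_epoch, last_sent_epoch, current_epoch):
    # Epochs of the incremental maps that would still be worth sending.
    start = first_epoch_to_send(msg_epoch, last_sent_epoch)
    return list(range(start, current_epoch + 1))


# A stale message (epoch 10) arrives after the peer was already sent epoch 95.
print(epochs_to_send(msg_epoch=10, last_sent_epoch=95, current_epoch=100))
```

Without the `max()`, the stale message in this example would trigger resending epochs 11 through 100.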
Sage Weil [Fri, 15 Feb 2019 00:21:38 +0000 (18:21 -0600)]
mgr: hold GIL while generating a typed option value
Drop the GIL while looking up the module and option value. Retake
it before calling get_typed_option_value(), where we generate the
Python objects for the value.
Fixes: http://tracker.ceph.com/issues/38123
Fixes: http://tracker.ceph.com/issues/38204
Fixes: http://tracker.ceph.com/issues/38292
Signed-off-by: Sage Weil <sage@redhat.com>
xie xingguo [Tue, 12 Feb 2019 04:07:42 +0000 (20:07 -0800)]
osd: Fix the problem by rechecking the backfill status each time
a RemoteReservationRevoked message is received, so that we can
restart the backfill process in a safer, less race-prone way.
The original fix handled the race by accepting RecoveryDone
while in RepWaitBackfillReserved and going to RepNotRecovering.
Once we weren't crashing OSDs, none of the tests at the time
detected that backfill was actually hung.
alfonsomthd [Thu, 14 Feb 2019 15:15:58 +0000 (16:15 +0100)]
mgr/dashboard: fix: toast notifications hiding utility menu
* Fixed margin-top taking into account responsiveness.
* dropdown-menu class: set z-index to avoid notification
hiding dropdown menus when menu item clicked.
Fixes: https://tracker.ceph.com/issues/38313
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Boris Ranto [Thu, 14 Feb 2019 09:35:56 +0000 (10:35 +0100)]
restful: Fix regression when traversing leaf nodes
The commit 23b6c90 introduced a regression when traversing leaf nodes.
The issue is that it traverses the keys of a `dict` returned by
`nodes_by_id`, not the actual `items` of the node. That resulted in a
500 error because it tried to treat a `str` as a `dict` and failed.
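The failure mode can be reproduced in a few lines. This is a sketch under assumptions, not the restful module's code: the node layout and the function names below are hypothetical; only the keys-versus-`items` mistake comes from the commit message.

```python
# Hypothetical node, standing in for one value returned by `nodes_by_id`.
node = {
    "id": -1,
    "name": "default",
    "items": [{"id": 0, "weight": 1.0}, {"id": 1, "weight": 1.0}],
}


def leaf_ids_buggy(node):
    # Iterating a dict yields its keys (strings), so item["id"] indexes
    # a str with a str and raises TypeError.
    return [item["id"] for item in node]


def leaf_ids_fixed(node):
    # Iterate the node's actual `items` list instead.
    return [item["id"] for item in node["items"]]


print(leaf_ids_fixed(node))
try:
    leaf_ids_buggy(node)
except TypeError as exc:
    print("buggy traversal failed:", exc)
```

In a request handler, an uncaught `TypeError` like this one would typically surface as the 500 response mentioned above.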
Nathan Cutler [Tue, 15 Jan 2019 14:43:13 +0000 (15:43 +0100)]
rpm: move Python deps out of distro-conditional blocks
The python%{_python_buildid}-bcrypt and python%{_python_buildid}-requests RPMs
are identically named across all the RPM distros, so move them out of the
distro conditional blocks.
Yan, Zheng [Tue, 18 Dec 2018 08:22:21 +0000 (16:22 +0800)]
mds: update MClientReconnect encoding
The old encoding assumes that snaprealms are encoded at the tail of the
message payload, so it does not allow adding new fields to the message.
This patch introduces a new encoding for MClientReconnect that
allows us to extend the message.
The new encoding is not compatible with the old encoding. If the MDS
does not understand the new encoding, the client needs to use the old
encoding to encode MClientReconnect.
Sage Weil [Wed, 13 Feb 2019 22:22:40 +0000 (16:22 -0600)]
Merge PR #26389 into master
* refs/pull/26389/head:
message/MMonMgrReport: conditionally reencode PGMapDigest
qa/suites/upgrade/luminous-x/parallel: enable all classes
qa/suites/upgrade/luminous-x/parallel/5-final-workload/rados_mon_thrash: use x branch
qa/suites/upgrade/luminous-x: pglog_hardlimit succeeds now on luminous due to backport
mgr/DaemonServer: use a luminous-compatible 'mgr metadata' command
mgr/Mgr: print bad (non-object) json
qa/suites/upgrade/luminous-x/stress-split: mons on separate hosts, enable msgr2
qa/suites/upgrade/luminous-x/parallel: mon per host, msgr2
qa/suites/upgrade/luminous-x: whitelist 'slow request'
mon/HealthMonitor: add mon_warn_on_msgr2_not_enabled
Sage Weil [Fri, 8 Feb 2019 19:46:00 +0000 (13:46 -0600)]
qa/suites/upgrade/luminous-x/parallel: enable all classes
Otherwise it's annoying because the class list changes between luminous and nautilus,
and we don't want to futz around with changing this setting during the upgrade.
The problematic classes are 'cas' (added) and 'sdk' (not enabled by default but
included by the cls/ workunit).
Sage Weil [Fri, 8 Feb 2019 21:12:57 +0000 (15:12 -0600)]
mon/Monitor: do not join cluster that is >2 releases old
This enforces the N+2 upgrade rule from the mon's perspective.
Note that this safety check is not as safe as the one on the OSDs. Notably, we
start up our backend store (rocksdb) *before* we probe other monitors
and discover any newer monmap that tells us we shouldn't join. If there
is a *rocksdb* backward-compatibility problem it is too late by this
point. Unfortunately, I don't see an easy way to get this far before
rocksdb is read-write--not without a lot more code, at least!
However, we'll still protect against a whole class of other potential
problems by not getting involved in a cluster that is too old. :)
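As a rough illustration of the N+2 rule, the check can be thought of as a comparison of release numbers. This is a sketch under assumptions, not the monitor's code: the function name and the exact comparison are hypothetical, while the release numbers (luminous = 12, mimic = 13, nautilus = 14) are Ceph's own.

```python
# Ceph major release numbers.
RELEASES = {"kraken": 11, "luminous": 12, "mimic": 13, "nautilus": 14}


def can_join(my_release, cluster_min_mon_release, max_gap=2):
    # Hypothetical check: refuse to join when the cluster's recorded
    # min_mon_release is more than `max_gap` releases behind this monitor.
    gap = RELEASES[my_release] - RELEASES[cluster_min_mon_release]
    return gap <= max_gap


print(can_join("nautilus", "luminous"))  # gap of 2 releases
print(can_join("nautilus", "kraken"))    # gap of 3 releases
```

As the commit message notes, a check like this runs only after the backend store has been opened, so it cannot protect against a store-level compatibility problem.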
Kefu Chai [Thu, 7 Feb 2019 13:13:14 +0000 (21:13 +0800)]
msg/msg_types.h: do not cast `ceph_entity_name` to `entity_name_t` for printing
in GCC-9, `-Waddress-of-packed-member` is enabled, so we have warnings like:
src/msg/msg_types.h:142:41: warning: converting a packed 'const
ceph_entity_name' pointer (alignment 1) to a 'const entity_name_t'
pointer (alignment 8) may result in an unaligned pointer value
[-Waddress-of-packed-member]
142 | return out << *(const entity_name_t*)&addr;
| ^~~~
since the alignment of these two structures is different, we cannot
cast a pointer to a structure with an alignment of 1 to a pointer to a
structure with an alignment of 8: the code the compiler generates for
accessing 8-byte-aligned members is not guaranteed to work on members
with an alignment of 1. so we need to create a temporary structure for
printing it.
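The alignment difference and the "copy into a temporary" approach can be sketched with Python's ctypes. This is an illustration only: `PackedName` and `AlignedName` are hypothetical stand-ins for `ceph_entity_name` and `entity_name_t`, and their field layout is an assumption; the actual fix is in C++.

```python
import ctypes


class PackedName(ctypes.Structure):
    # Stand-in for the packed wire struct: no padding, alignment 1.
    _pack_ = 1
    _fields_ = [("type", ctypes.c_uint8), ("num", ctypes.c_int64)]


class AlignedName(ctypes.Structure):
    # Stand-in for the in-memory type: natural alignment of its members.
    _fields_ = [("type", ctypes.c_uint8), ("num", ctypes.c_int64)]


def to_aligned(packed):
    # Copy member by member into a fresh, properly aligned temporary
    # instead of reinterpreting the packed bytes in place.
    return AlignedName(type=packed.type, num=packed.num)


packed = PackedName(type=4, num=12)
tmp = to_aligned(packed)
print("packed: ", ctypes.alignment(PackedName), ctypes.sizeof(PackedName))
print("aligned:", ctypes.alignment(AlignedName), ctypes.sizeof(AlignedName))
print(tmp.type, tmp.num)
```

The packed variant reports an alignment of 1, while the alignment of the unpacked variant depends on the platform ABI, which is why the two types cannot safely alias each other.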
alfonsomthd [Wed, 13 Feb 2019 16:17:30 +0000 (17:17 +0100)]
mgr/dashboard: fix: tox not detecting deps changes
* tox.ini: replaced 'deps' setting by appropriate commands
due to this bug:
https://github.com/tox-dev/tox/issues/149
tox is not detecting changes in requirements.txt when using
'deps' setting in 'tox.ini'.
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Kefu Chai [Tue, 12 Feb 2019 09:13:12 +0000 (17:13 +0800)]
rpm: split ceph-mgr-dashboard plugin into its own package
making ceph-mgr-dashboard a separate package
- helps to reduce the repo size downstream: because
ceph-mgr-dashboard is an architecture-independent package,
splitting it out avoids needless duplication of
the same data in multiple .debs.
- gives users fine-grained control over what is installed.