Sage Weil [Fri, 15 Mar 2019 17:24:52 +0000 (12:24 -0500)]
osd/PG: fix pg merge check for rc clusters
If a cluster had a pg merge pending before last_pg_merge_meta was
introduced then the source_pgid will be pg_t(). If that's the case,
skip these new checks.
Likewise, if we decode a legacy pg_pool_t, put the old merge les/lec
values into the correct location.
Sage Weil [Fri, 15 Mar 2019 17:08:34 +0000 (12:08 -0500)]
Merge PR #26965 into nautilus
* refs/pull/26965/head:
ms/async/ProtocolV2: add ms_die_on_bug and assert rxbuf/txbuf don't get big
msg/async/ProtocolV2: do not reenable pre_auth buffering on from reset_recv_state
Sage Weil [Fri, 15 Mar 2019 03:50:29 +0000 (22:50 -0500)]
msg/async/ProtocolV2: do not reenable pre_auth buffering on from reset_recv_state
This is specifically bad because we call reset_recv_state from
reuse_connection, which turns buffering back on on an already-authenticated
session.
Instead, reenable it only when we set the state to START_CONNECT. (On
the accepting side, it is a fresh connection, so it starts out true.)
Also, we want to *disable* it on the connection we are reusing, which
might be in a pre-auth state, while we are in a post-auth state.
Fixes: http://tracker.ceph.com/issues/38746
Signed-off-by: Sage Weil <sage@redhat.com>
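The buffering rule described in this commit message can be illustrated with a toy model. This is a minimal sketch in Python, not the ProtocolV2 code: the class `Session`, the attribute `pre_auth_enabled`, and the method `finish_auth` as written here are hypothetical stand-ins, and only the names `reset_recv_state`, `reuse_connection`, and `START_CONNECT` come from the message.

```python
class Session:
    """Toy model of one side of a connection (hypothetical, for illustration)."""

    def __init__(self):
        self.state = "NONE"
        self.rx_buffer = b""
        # A fresh connection starts with pre-auth buffering turned on.
        self.pre_auth_enabled = True

    def reset_recv_state(self):
        # Clears receive-side state only. It deliberately does NOT touch
        # pre_auth_enabled, so an authenticated session stays unbuffered.
        self.rx_buffer = b""

    def start_connect(self):
        # Buffering is re-enabled only when a new connect attempt begins.
        self.state = "START_CONNECT"
        self.pre_auth_enabled = True

    def finish_auth(self):
        # Once authentication completes, pre-auth buffering is switched off.
        self.pre_auth_enabled = False

    def reuse_connection(self, existing):
        # The reused connection may still be pre-auth while we are post-auth,
        # so buffering is explicitly disabled on it.
        existing.reset_recv_state()
        existing.pre_auth_enabled = False
```

In this model, calling `reset_recv_state()` on an authenticated session leaves buffering off, which is the behavior the commit is after.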
Lenz Grimmer [Fri, 15 Mar 2019 09:38:00 +0000 (10:38 +0100)]
Merge pull request #26738 from votdev/fix_docs
mgr/dashboard: Fix issues in controllers/docs
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Patrick Nawracay <pnawracay@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Tina Kallio <tina.kallio@gmail.com>
Sage Weil [Fri, 15 Mar 2019 03:37:18 +0000 (22:37 -0500)]
Merge PR #26898 into nautilus
* refs/pull/26898/head:
osd/PG: invalidate PG if merging with unexpected version
osd,mon: include more pg merge metadata in pg_pool_t
qa/standalone/osd/pg-split-merge.sh: reproduce pg merge problem with empty pgs
osd: add osd_debug_no_{acting_change,purge_strays}
Sage Weil [Thu, 14 Mar 2019 15:04:14 +0000 (10:04 -0500)]
Merge PR #26875 into nautilus
* refs/pull/26875/head:
common: implement HMACs on top of OpenSSL.
msg/async, v2: switch the pre-auth mechanism to HMAC-SHA256.
include/types: beef sha_digest_t up with encode and compare.
auth: add hmac_sha256() to CryptoKey.
msg/async, v2: introduce pre_auth exchanges with CRC32.
msg/async, v2: introduce pre_auth buffers.
msg/async, v2: rectify the encapsulation of rx_segments_{desc,data}.
msg/async, v2: rework decoding of MessageFrame.
msg/async, v2: limit the num_segments to non-empty segments.
msg/async, v2: drop the bl onwire space optimization in ControlFrames.
msg/async, v2: clean up ret handling in ProtocolV2::write().
msg/async, v2: drop next_payload_len as we don't need anymore.
msg/async, v2: drop temp_buffer and limitations driven by it.
msg/async, v2: switch to rx_buffer_t entirely.
msg/async, v2: rx continuations use buffer::ptr_node.
msg/async, v2: use bptr continuation for segment reading.
msg/async: introduce bptr-carrying continuations.
msg/async: replace CONTINUATION_PARAM() with specialized types.
msg/async, v2: ::_banner_exchange() takes CtRef instead of CtPtr.
msg/async: avoid extra pointers in continuation definitions.
msg/async, v2: dissect setting stream handlers into ::finish_auth().
msg/async, v2: drop ceph_msg_header2 handling from ControlFrames.
msg/async, v2: drop the SignedEncryptedFrame entirely.
msg/async, v2: reintroduce segment aligment. It's compile-time now.
msg/async, v2: generalize Frame about number of segments.
msg/async, v2: rework and generalize Frame encryption.
msg/async, v2: rework the class hierarchy - introduce MessageFrame.
msg/async, v2: rework the class hierarchy - introduce ControlFrame.
msg/async/ProtocolV2: remove obsolete AuthFlags
Sage Weil [Thu, 14 Mar 2019 03:07:45 +0000 (22:07 -0500)]
Merge PR #26894 into nautilus
* refs/pull/26894/head:
qa/standalone/erasure-code/test-erasure-code: adjust test to avoid m=0
erasure-code: ensure m >= 1
mon/OSDMonitor: set ec min_size to k + min(1, m - 1)
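The min_size rule named in the last commit title above can be read as a small formula. A minimal sketch in Python, taking the title literally and relying on the m >= 1 requirement from the commit listed before it; the function name `ec_min_size` is hypothetical and is not Ceph's API.

```python
def ec_min_size(k: int, m: int) -> int:
    """Return k + min(1, m - 1) for a k+m erasure-coded pool.

    With m == 1 the result is k; with m >= 2 it is k + 1.
    """
    if k < 1 or m < 1:
        # m = 0 is rejected, matching the "ensure m >= 1" commit title.
        raise ValueError("k and m must both be at least 1")
    return k + min(1, m - 1)
```

For example, `ec_min_size(2, 1)` gives 2 and `ec_min_size(4, 2)` gives 5.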
Sage Weil [Wed, 13 Mar 2019 17:46:50 +0000 (12:46 -0500)]
qa/standalone/erasure-code/test-erasure-code: adjust test to avoid m=0
_DD is k=2 m=0, which we don't allow. Switch it to cDD.
I confess I don't fully understand why this was _DD to begin with, but
I'm pretty sure mapping is there to control the order of results so that
it can be mapped to the CRUSH rule output sanely, and the coding portion
is not relevant to the test.
Patrick Donnelly [Wed, 13 Mar 2019 16:13:02 +0000 (09:13 -0700)]
qa: extend MDS heartbeat grace for valgrind
Valgrind makes the MDS slowwwww. The newish mds_heartbeat_grace config allows
us to keep sending beacons to the mons even if the internal heartbeat is slow.
This avoids the laggy messages which are useful to grep for unrelated messaging
issues.
Fixes: http://tracker.ceph.com/issues/38723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The crush/builder.c crush_add_bucket method resizes the max_buckets array
by a power of 2 when it has to expand, but the code in CrushWrapper was
assuming that if the array grew the pos for the new bucket would be the
last position in the new array. This led to a situation where the
crush_choose_arg_map args array size didn't match max_buckets, and
eventually caused a crash.
Fixes: http://tracker.ceph.com/issues/38664
Signed-off-by: Sage Weil <sage@redhat.com>
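The mismatch described in this message can be reproduced with a toy model. This is a sketch in Python, not the crush/builder.c or CrushWrapper code: `add_bucket` and `resize_choose_args` are hypothetical helpers, and the first-free-slot search and the doubling growth are assumptions made for illustration.

```python
def add_bucket(buckets: list, bucket) -> int:
    """Store bucket in the first free slot, doubling the array when full.

    Returns the position actually used, which callers must not guess.
    """
    for pos, slot in enumerate(buckets):
        if slot is None:
            buckets[pos] = bucket
            return pos
    pos = len(buckets)
    new_size = max(1, 2 * len(buckets))  # assumed doubling growth
    buckets.extend([None] * (new_size - len(buckets)))
    buckets[pos] = bucket
    return pos


def resize_choose_args(args: list, max_buckets: int) -> None:
    """Grow the per-bucket args array to the same length as the bucket array."""
    args.extend([None] * (max_buckets - len(args)))
```

Starting from a full two-slot array, the new bucket lands at position 2 while the array grows to 4 slots, so "last position in the new array" would be wrong. Sizing the args array from the bucket array's length rather than from the returned position keeps the two in step.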
Sage Weil [Mon, 11 Mar 2019 22:35:27 +0000 (17:35 -0500)]
osd/PG: invalidate PG if merging with unexpected version
If the source or target PG version is 0'0, we may silently take the max
of the source and target and still leave the PG complete. This
specifically can happen with an empty PG, as seen with bug 38655. In
theory we could encounter one of the PGs with some other last_update
that doesn't match what we expect. If that ever happens, make sure the
result is incomplete so that backfill can clean up.
Additionally check that the pool metadata for the last merge matches the
PGs at all. This could mismatch if we have an osdmap gap and are forced
to do some merge without merge info at all... in which case we should
definitely invalidate: there should be newer copies of the PG(s), and we
have no idea whether the PGs we are merging are what we want. If this is
some disaster recovery situation, an operator is always free to use
ceph-objectstore-tool to re-mark a PG complete (at their own peril!).
Fixes: http://tracker.ceph.com/issues/38655
Signed-off-by: Sage Weil <sage@redhat.com>
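The check described in this message can be sketched roughly as follows. This is an illustrative Python model under stated assumptions, not the osd/PG code: `LastMergeMeta`, its fields, and `merge_result_is_complete` are hypothetical names, and versions are modeled as (epoch, version) tuples so that 0'0 is `(0, 0)`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, int]  # (epoch, version); 0'0 is (0, 0)


@dataclass
class LastMergeMeta:
    """Hypothetical stand-in for the pool's record of the last merge."""
    source_pgid: str
    source_version: Version
    target_version: Version


def merge_result_is_complete(
    meta: Optional[LastMergeMeta],
    source_pgid: str,
    source_last_update: Version,
    target_last_update: Version,
) -> bool:
    """Return True only if both PGs match what the pool metadata expects.

    Any mismatch leaves the merged PG incomplete so backfill can clean up.
    """
    if meta is None:
        # No merge info at all (e.g. an osdmap gap): cannot verify anything.
        return False
    if meta.source_pgid != source_pgid:
        # The recorded merge does not describe these PGs.
        return False
    return (
        source_last_update == meta.source_version
        and target_last_update == meta.target_version
    )
```

An empty source PG at 0'0 fails the comparison unless 0'0 is the recorded source version, so it no longer slips through as complete.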
Tim Serong [Tue, 12 Mar 2019 09:07:49 +0000 (20:07 +1100)]
mgr/deepsea: always use 'password' parameter for salt-api auth
Prior to https://github.com/saltstack/salt/commit/71d5601507, the
salt-api expected the password to be sent using the 'sharedsecret'
parameter if using shared secrets, and the 'password' parameter
for other authentication types. The above commit unifies this so
that we always only need to use the 'password' parameter.
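The effect on the login request can be sketched as below. This is a minimal Python illustration, not the mgr/deepsea code: the function name is hypothetical, and the `username` and `eauth` field names are assumed from salt-api's usual login form. Only the 'password' and 'sharedsecret' parameter names come from the message above.

```python
def salt_api_login_payload(username: str, password: str, eauth: str) -> dict:
    """Build a login payload that always carries the secret as 'password'.

    The same key is used whether eauth is 'sharedsecret' or another type.
    """
    return {
        "username": username,
        "password": password,  # never sent under a 'sharedsecret' key
        "eauth": eauth,
    }
```

With this shape the caller no longer branches on the authentication type when choosing the parameter name.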