xie xingguo [Sat, 26 Jan 2019 10:03:15 +0000 (18:03 +0800)]
osd/OSDMap: more improvements to upmap
- add ability of appending a 2nd, 3rd, etc... pair to existing upmaps
when possible, rather than just continuing to the next PG
- handle the underfull case: we can rm-pg-upmap-items if there exist
any upmaps which remapped a PG out from an underfull OSD
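The underfull case can be pictured with a small sketch. This is not the OSDMap code: the data layout (a dict mapping a PG id to its list of (from, to) OSD pairs) and the helper name `upmaps_to_remove` are assumptions made purely for illustration.

```python
def upmaps_to_remove(pg_upmap_items, underfull_osds):
    """Return the PG ids whose upmap items moved data off an underfull OSD.

    pg_upmap_items: hypothetical dict of pgid -> list of (from_osd, to_osd).
    underfull_osds: set of OSD ids currently considered underfull.
    """
    candidates = []
    for pgid, pairs in sorted(pg_upmap_items.items()):
        # Removing such an upmap lets the PG map back onto the underfull OSD.
        if any(src in underfull_osds for src, _dst in pairs):
            candidates.append(pgid)
    return candidates


example = {"1.a": [(3, 7)], "1.b": [(5, 2)], "1.c": [(4, 9), (3, 8)]}
removable = upmaps_to_remove(example, underfull_osds={3})
```

Each returned PG would be a candidate for rm-pg-upmap-items; the real balancer applies further checks that this sketch leaves out.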
get_all_versions() is documented as a lower-level api that doesn't
handle paging, and suggests list_versions() instead. also caches the
results to avoid listing each bucket twice
Sage Weil [Mon, 28 Jan 2019 11:50:59 +0000 (05:50 -0600)]
Merge PR #24546 into master
* refs/pull/24546/head:
msg/async/ProtocolV2: clear dispatch throttle on connection stop
msg/async: fix should_use_msgr2 behavior (including monc)
msg/async/AsyncMessenger: clear need_addr *after* we set our new addr
msg/async/ProtocolV2: fix handling for v2 client connection with v1 addr
ceph_test_msgr: do not connect_to on the client side
msg/async: do not connect from server
msg/async: do not use peer to addr detection; use getsockname()
msg/async/ProtocolV2: always send non-empty addrvec for self
msg/async: never fill out port in myaddr if we didn't bind
ceph_test_msgr: use v2 addrs for simplemessenger
msg/async: msgr2: don't force write event on every message received
msg/async/ProtocolV2: be forgiving in server identity check
msg/async/ProtocolV2: fault if we connect to the wrong peer
msg/async: msgr2: clean cookie if connection failed in ACCEPT_SESSION
msg/async/ProtocolV2: do not bump connect_seq for fault during ACCEPTING_SESSION
msg/async: msgr2: don't send SESSION_RETRY_GLOBAL in handle_existing_connection
msg/async: msgr2: organizing log messages
msg/async: msgr2: fix connection fault when replacing
msg/async: msgr2: fix replacing race handling
msg/async: msgr2: fix connection race when existing connection is newer
msg/async: msgr2: assign recv_stamp in handle_message
msg/async: msgr2: fix peer_addrs discovery
msg/async: msgr2: keep authorizer bufferlist across reconnects
msg/async: msgr2: fix connection secret problems for WITH_SEASTAR builds
msg/async: msgr2: send keepalive on connection race winner
msg/async: msgr2: fix client address learning
msg/async: msgr2: fix keepalive_ack message
msg/async: msgr2: do not force updating rotating keys inline
msg/async: msgr2: fix mark_down vs accept race
msg/async: msgr2: unregister con from accept vs mark_down race
auth/cephx/CephxSessionHandler: use connection_secret for encryption
msg,cephx: establish a unique connection_secret for every connection
msg/async: msgr2: use sha256_digest_t to print signature hex strings
types.h,rgw: merge sha*_digest_t definitions
msg/async: msgr2: close connection when no authorizer is given
msg/async: msgr2: formatting fixes
msg/async: msgr2: send client v2 address when only v1 address is defined
msg/async: msgr2: add payload length to banner
msg/async: msgr2: check protocol state after fast dispatch
msg/async: msgr2: reduce log level for sending messages event
msg/async: msgr2: call verify authorizer when CEPH_AUTH_NONE is used
msg/async: msgr2: store peer entity name in the protocol
msg/async: msgr2: apply sign/encrypt to messages data payload
msg/async: msgr2: encryption/decryption of frames
cephx: added encrypt/decrypt bufferlist method to session handler
msg/async: msgr2: refactored the frame structures
cephx: add sign bufferlist method
options: msgr2 enable/disable signing and encryption options
msg/async: msgr2: cephx authentication
msg/async: msgr2: implement reconnect
msg/async: msgr2: fault handling
msg/async: msgr2: message exchange phase
msg/async: msgr2: message flow handshake
msg/async: msgr2: authentication phase
msg/async: msgr2: exchange peer_type in banner phase
test/msgr: cloned test_msgr test for testing msgr2 protocol
msg/async: msgr2: banner exchange
msg/async: asyncconnection: update the source address info
msg/async: move base class Protocol to its own source file
Sage Weil [Sat, 26 Jan 2019 23:01:14 +0000 (17:01 -0600)]
msg/async/AsyncMessenger: clear need_addr *after* we set our new addr
We check need_addr at the top without a lock held, so we need to be sure
we finished our work before we clear it, or else when there are two racing
threads the first will get the lock and clear the value and the second
will do nothing and see the unlearned value before the first finishes.
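The ordering described above can be sketched as follows. The class and method names are hypothetical and only mirror the idea (an unlocked check of a flag, so the flag must be cleared only after the work is complete); this is not the AsyncMessenger implementation.

```python
import threading


class AddrLearner:
    """Illustrative only: a flag checked without the lock, cleared last."""

    def __init__(self):
        self.lock = threading.Lock()
        self.need_addr = True
        self.my_addr = None

    def learned_addr(self, addr):
        # Unlocked fast path: safe only if need_addr is cleared after
        # my_addr has been fully set.
        if not self.need_addr:
            return False
        with self.lock:
            if not self.need_addr:
                # Another thread finished the work while we waited.
                return False
            self.my_addr = addr      # do the work first
            self.need_addr = False   # clear the flag last
            return True


learner = AddrLearner()
first = learner.learned_addr("10.0.0.1")
second = learner.learned_addr("10.0.0.2")
```

If the flag were cleared before `my_addr` was set, a second thread taking the fast path could return and read the still-unset address.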
Sage Weil [Sat, 26 Jan 2019 07:28:13 +0000 (01:28 -0600)]
msg/async/ProtocolV2: fix handling for v2 client connection with v1 addr
Switch it to be v2. Reject the case where the client sends an addrvec, though;
that should only happen for clients that did_bind, and they should only connect to
v2 if they have a v2 bound addr.
Sage Weil [Fri, 25 Jan 2019 21:21:45 +0000 (15:21 -0600)]
msg/async: do not connect from server
We could have a fault on the server side of a non-lossy connection while
we have outgoing data queued. Since we are a server,
we cannot connect; we should just go into standby and wait for the other
end to reconnect, or for someone to mark us down.
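A minimal sketch of the decision described above. The function name, its parameters and the returned strings are hypothetical; the lossy and client branches are assumptions added only to make the example complete, and only the server branch reflects this commit message.

```python
def action_on_fault(is_server, lossy, has_queued_data):
    """Pick what a faulted connection does next (illustrative only)."""
    if lossy:
        # Assumption: lossy connections are simply torn down.
        return "close"
    if is_server:
        # A server never initiates a connection, even with data queued.
        return "standby"
    # Assumption: a client reconnects only when it has something to send.
    return "reconnect" if has_queued_data else "standby"


server_choice = action_on_fault(is_server=True, lossy=False, has_queued_data=True)
client_choice = action_on_fault(is_server=False, lossy=False, has_queued_data=True)
```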
This fixes a failure reproduced by Messenger/MessengerTest.SyntheticInjectTest/0
where it would assert(!policy.server) in the connect code.
Sage Weil [Fri, 25 Jan 2019 21:14:56 +0000 (15:14 -0600)]
msg/async: do not use peer to addr detection; use getsockname()
Instead of relying on the peer to tell us what address we are connecting from,
look at how our local socket is bound, and use that address.
This removes the possibility for error because we will infer our address
locally and that will be the one place it is decided; the server will just
use our value. As things were previously, we had to make the local and
remote inference match, which was fragile.
This does take away the client's ability to discover whether it is traversing
NAT to reach the server and to learn its public/external address. I
don't think anybody has ever tested this, so it probably didn't even work,
and I've never heard it come up as a requirement.
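To illustrate the idea with the standard sockets API (not the messenger code itself), the sketch below opens a loopback connection and reads the client's address from its own socket with getsockname(). The helper name `local_address_of` is made up for the example.

```python
import socket


def local_address_of(sock):
    """Return the (ip, port) that the local end of `sock` is bound to."""
    ip, port = sock.getsockname()[:2]
    return ip, port


# Loopback server on an ephemeral port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.create_connection(server.getsockname())
accepted, peer_view = server.accept()

# What the client infers locally, without asking the peer.
client_view = local_address_of(client)

for s in (accepted, client, server):
    s.close()
```

On loopback the two views agree; behind NAT the peer would see a different address, which is the discovery ability the message says is given up.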
tl;dr: this change addresses the failures of "make check" runs on arm64
builders when they try to build `mgr-dashboard-test-venv` target.
long story: without this change, we will fail to pull in
setuptools >= 36, and as a result pip will fail to import
`setuptools.build_meta` in `pip/_vendor/pep517/_in_process.py`, and
a `BackendUnavailable` exception will be thrown by `_call_hook()` in
`pip/_vendor/pep517/wrappers.py`. since the issue worked around by 30ce5e55
has been fixed since setuptools >= 36.0.1, we should be safe to
upgrade to the latest setuptools now.
Sage Weil [Fri, 25 Jan 2019 20:14:55 +0000 (14:14 -0600)]
msg/async/ProtocolV2: always send non-empty addrvec for self
If we don't know our address yet, send the peer a 0.0.0.0 or :: address with an empty
port and a populated nonce. That way the peer can infer our final addr the same way
we do from learned_addr.
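A rough sketch of the exchange described above. The `Addr` type and both helper functions are hypothetical stand-ins, not Ceph's entity_addr_t or any ProtocolV2 function; the sketch assumes the peer fills in the IP it observed on the socket and keeps the advertised port and nonce.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Addr:
    ip: str
    port: int
    nonce: int


def addr_to_advertise(known_addr, nonce, ipv6=False):
    """Send our real address if known, else a blank one carrying the nonce."""
    if known_addr is not None:
        return known_addr
    return Addr("::" if ipv6 else "0.0.0.0", 0, nonce)


def infer_peer_addr(advertised, observed_ip):
    """Peer side: replace a blank IP with the one seen on the socket."""
    if advertised.ip in ("0.0.0.0", "::"):
        return replace(advertised, ip=observed_ip)
    return advertised


sent = addr_to_advertise(None, nonce=42)
inferred = infer_peer_addr(sent, observed_ip="10.1.2.3")
```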
Sage Weil [Fri, 25 Jan 2019 20:12:27 +0000 (14:12 -0600)]
msg/async: never fill out port in myaddr if we didn't bind
If we are a client and didn't bind, then we should not fill in the port for our
address. The one the peer sent us is just the random port our outgoing connection
happened to land on!
Patrick Donnelly [Fri, 25 Jan 2019 18:52:47 +0000 (10:52 -0800)]
Merge PR #25254 into master
* refs/pull/25254/head:
mds: optimize resuming stale caps
client: avoid unnecessary wakeup when handling RENEWCAPS
client: set cap->wanted when adding new cap
client: don't wakeup cap waiters twice when mds recovered
mds: optimize revoking stale caps
mds: put notable caps at the front of session's caps list
mds: track if client has writeable range in Capability
mds: add session pointer to Capability
client: skip updating 'wanted' caps if caps are already issued
client: sync 'retain caps' logic from kernel client
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Yan, Zheng [Mon, 10 Dec 2018 03:37:32 +0000 (11:37 +0800)]
mds: optimize resuming stale caps
If client doesn't want any cap, there is no need to re-issue stale
caps.
A special case is that client wants some caps, but skipped updating
'wanted'. For this case, client needs to update 'wanted' when stale
session gets renewed.
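The two rules above, written as a minimal sketch. The function names and the bitmask representation of 'wanted' are assumptions for illustration only and do not correspond to MDS or client functions.

```python
def mds_should_reissue_stale_caps(wanted_caps):
    """MDS side: nothing to re-issue if the client wants no caps."""
    return wanted_caps != 0


def client_must_update_wanted(wants_caps, skipped_wanted_update):
    """Client side: after a stale session is renewed, send the 'wanted'
    update that was skipped earlier."""
    return wants_caps and skipped_wanted_update


no_reissue = mds_should_reissue_stale_caps(0)
needs_update = client_must_update_wanted(True, True)
```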
Sage Weil [Wed, 23 Jan 2019 23:54:00 +0000 (17:54 -0600)]
msg/async/ProtocolV2: do not bump connect_seq for fault during ACCEPTING_SESSION
If we have a connection race, and we lose, we may end up with outgoing
messages *and* be in ACCEPTING_SESSION. If we then fault, we want to
leave connect_seq at 0 to avoid triggering a reset.
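As a sketch of the rule above: the function below leaves connect_seq untouched when the fault happens in ACCEPTING_SESSION. The function name is hypothetical, and the assumption that every other state bumps the counter by one is a simplification for the example.

```python
def connect_seq_after_fault(state, connect_seq):
    """Return the connect_seq to carry forward after a fault."""
    if state == "ACCEPTING_SESSION":
        # We lost a connection race; keep the value (0 in that case) so
        # the peer does not interpret the next attempt as a reset.
        return connect_seq
    # Assumption: other states count the failed attempt.
    return connect_seq + 1


kept = connect_seq_after_fault("ACCEPTING_SESSION", 0)
bumped = connect_seq_after_fault("READY", 0)
```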
Yehuda Sadeh [Wed, 14 Nov 2018 21:44:25 +0000 (13:44 -0800)]
rgw: archive: rework bucket removal
Keep reference to the original bucket in the new bucket instance object,
so that if a new bucket is created with the same name, we can generate
a name off the original name. Things could be much simpler if we could
just point a new bucket entrypoint at the old bucket entrypoint, but
we can't have (at the moment) bucket entrypoint that points at a bucket
instance with a different bucket name.
Javier M. Mellid [Fri, 12 Oct 2018 12:21:41 +0000 (14:21 +0200)]
rgw: Add archive zone metadata manager
The current implementation supports only one metadata manager. The new
archive zone requires a new metadata manager to override functionality.
The archive zone metadata manager needs to be plugged in/out depending on the
zone tier type. This metadata manager switching is driven by the new
archive data sync module.
The metadata manager switch requires migrating the old metadata manager
state to the new one to avoid crashing the current operations.
Signed-off-by: Javier M. Mellid <jmunhoz@igalia.com>
David Zafman [Tue, 27 Nov 2018 00:48:52 +0000 (16:48 -0800)]
osd, test: Add test case with osd support for overdue PG scrubs and deep scrubs
Add trigger_deep_scrub osd command for testing
Publish stats when trigger_scrub/trigger_deep_scrub is used for testing
Add optional argument to trigger_scrub/trigger_deep_scrub
for amount of extra time to change last scrub stamps
David Zafman [Tue, 15 Jan 2019 20:12:39 +0000 (12:12 -0800)]
mon: Fix scrub health warning handling and change config to a ratio
Make this mon_warn code clearer since it involves 2 values
Code used mon scrub interval instead of pg scrub interval
Rename config values to include _pg_ and ratio to make it more clear
Fix scrub warning handling to use per-pool intervals when specified
Fixes: http://tracker.ceph.com/issues/37264
Signed-off-by: David Zafman <dzafman@redhat.com>
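A hedged sketch of how a ratio-based overdue check with a per-pool override could look. The function and parameter names are hypothetical, and the formula (warn once the interval plus ratio times the interval has elapsed) is an assumption about how the ratio is applied, not a statement of the monitor's exact logic.

```python
def scrub_overdue(now, last_scrub, default_interval, warn_ratio,
                  pool_interval=None):
    """Return True if a PG's scrub is overdue enough to warn about.

    All times are in seconds. A per-pool interval, when given,
    replaces the default PG scrub interval.
    """
    interval = pool_interval if pool_interval else default_interval
    # Assumption: the ratio is extra slack on top of the interval.
    threshold = interval * (1 + warn_ratio)
    return (now - last_scrub) > threshold


warn_default = scrub_overdue(now=200, last_scrub=0,
                             default_interval=100, warn_ratio=0.5)
warn_pool = scrub_overdue(now=200, last_scrub=0,
                          default_interval=100, warn_ratio=0.5,
                          pool_interval=200)
```

With the default interval the PG is 200 seconds past a 150-second threshold; with the longer per-pool interval the threshold is 300 seconds, so no warning is raised.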