J. Eric Ivancich [Thu, 31 Jan 2019 19:21:07 +0000 (14:21 -0500)]
rgw: `radosgw-admin bucket rm ... --purge-objects` can hang...
This command can hang (i.e., enter an infinite loop) due to
problematic bucket index entries left as a result of bug
https://tracker.ceph.com/issues/38007 .
The fix is to ignore the false bucket index entries -- since they do
not represent actual objects -- and remove all actual objects in the
bucket, so that the bucket itself can be removed.
This fixes both code paths, whether `--bypass-gc` is specified or
not.
Furthermore, to make these operations more efficient, the internal
listing of the bucket is done unordered. This should improve behavior
when removing buckets with a large number of objects.
Fixes: http://tracker.ceph.com/issues/38134
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
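The idea behind this fix can be illustrated with a minimal Python sketch. This is not the RGW code: `purge_bucket`, `index_entries`, and `delete_object` are hypothetical names, and a plain dict stands in for the object store. It only shows how skipping index entries that have no backing object lets the removal loop finish.

```python
def purge_bucket(index_entries, delete_object):
    """Remove every real object named in a bucket index listing.

    index_entries: iterable of object names from the index (hypothetical),
        in no particular order.
    delete_object: callable that deletes one object and raises
        FileNotFoundError when the index entry has no backing object.
    Returns a (removed, skipped) tuple of counts.
    """
    removed = skipped = 0
    for name in index_entries:
        try:
            delete_object(name)
            removed += 1
        except FileNotFoundError:
            # False index entry: nothing to delete, so move on instead of
            # retrying the same entry forever.
            skipped += 1
    return removed, skipped


# Usage: two real objects and one stale index entry.
store = {"a": b"1", "b": b"2"}


def delete_object(name):
    if name not in store:
        raise FileNotFoundError(name)
    del store[name]


result = purge_bucket(["a", "ghost", "b"], delete_object)
```

After the call the store is empty, so a subsequent bucket removal would no longer be blocked by leftover entries.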
Patrick Donnelly [Thu, 31 Jan 2019 20:08:26 +0000 (12:08 -0800)]
Merge PR #26038 into master
* refs/pull/26038/head:
mds: simplify recall warnings
mds: add extra details for cache drop output
qa: test mds_max_caps_per_client conf
mds: limit maximum number of caps held by session
mds: adapt drop cache for incremental recall
mds: recall caps incrementally
mds: adapt drop cache for incremental trim
mds: add throttle for trimming MDCache
mds: cleanup SessionMap init
mds: cleanup Session init
Patrick Donnelly [Mon, 28 Jan 2019 23:48:38 +0000 (15:48 -0800)]
mds: simplify recall warnings
Instead of a timeout and complicated decisions about whether the client is
releasing caps in an expeditious fashion, just use a DecayCounter that tracks
the number of caps we've recalled. This counter is decremented whenever the
client releases caps. If the counter passes a threshold, then we raise the
warning.
Similar reworking is done for the steady-state recall of client caps. Another
release DecayCounter is added so we can tell when the client is not releasing
any more caps.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
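A rough Python sketch of the decay-counter approach described above. This is an illustration only and not the MDS `DecayCounter` class: the method names, the half-life, and the threshold are assumptions, and time is passed in explicitly to keep the example deterministic.

```python
import math


class DecayCounter:
    """Value that decays exponentially with a fixed half-life (seconds)."""

    def __init__(self, half_life, now=0.0):
        self.k = math.log(0.5) / half_life  # negative decay constant
        self.value = 0.0
        self.last = now

    def _decay(self, now):
        elapsed = now - self.last
        if elapsed > 0:
            self.value *= math.exp(elapsed * self.k)
            self.last = now

    def hit(self, now, delta=1.0):
        """Decay to `now`, then add `delta` (negative for a release)."""
        self._decay(now)
        self.value = max(0.0, self.value + delta)
        return self.value

    def get(self, now):
        self._decay(now)
        return self.value


def should_warn(recall_counter, now, threshold):
    """Raise the warning only while the decayed recall count is too high."""
    return recall_counter.get(now) > threshold


# Usage with made-up numbers.
recalled = DecayCounter(half_life=60.0)
recalled.hit(now=0.0, delta=1000)     # 1000 caps recalled from the client
warn_before = should_warn(recalled, now=0.0, threshold=500)
recalled.hit(now=0.0, delta=-800)     # client releases 800 caps
warn_after = should_warn(recalled, now=0.0, threshold=500)
value_later = recalled.get(now=60.0)  # one half-life later
```

A single counter replaces the timeout bookkeeping: releases and the passage of time both pull the value down, and only sustained non-release keeps it above the threshold.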
Patrick Donnelly [Wed, 30 Jan 2019 23:52:06 +0000 (15:52 -0800)]
mds: move session setup to ms_handle_accept
Session setup in ms_handle_authentication is (historically) racy where multiple
connections from the same client can come in before one is finally accepted. A
session should only be created after ms_handle_accept. The MDS did some
backflips before this commit to ensure this.
Moreover, with the msgr2 changes, this is even more necessary, since the
address nonce is not set until just before ms_handle_accept is called.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Lenz Grimmer [Wed, 30 Jan 2019 14:49:06 +0000 (15:49 +0100)]
Merge pull request #26172 from rhcs-dashboard/fix-skipped-api-tests
mgr/dashboard: fix skipped backend API tests
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Patrick Nawracay <pnawracay@suse.com>
Reviewed-by: Ricardo Dias <rdias@suse.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
alfonsomthd [Wed, 30 Jan 2019 12:05:02 +0000 (13:05 +0100)]
mgr/dashboard: fix skipped backend API tests
* When the creation of the cluster is delegated to vstart_runner.py
(--create or --create-target-only), the number of MGRs required
is calculated by the script, so there are no more skipped tests
due to an insufficient number of MGRs.
* Additionally, this issue is not reproducible anymore: Fixes: https://tracker.ceph.com/issues/37964
* Fixed typo: TEUTHOLOFY_PY_REQS
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Sebastian Wagner [Wed, 30 Jan 2019 08:48:11 +0000 (09:48 +0100)]
Merge pull request #25893 from sebastian-philipp/orchestrator-current-status
doc/orchestrator: Aligned Documentation with specification
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
Patrick Donnelly [Thu, 24 Jan 2019 22:23:08 +0000 (14:23 -0800)]
mds: limit maximum number of caps held by session
This is to prevent unsustainable situations where a client has so many
outstanding caps that a linear traversal/operation on the session's caps takes
unacceptable amounts of time.
Fixes: http://tracker.ceph.com/issues/38022
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Sat, 19 Jan 2019 00:18:59 +0000 (16:18 -0800)]
mds: add throttle for trimming MDCache
This is necessary when the MDS cache size decreases by a significant amount.
For example, when stopping a large MDS or when the operator makes a large cache
size reduction.
Fixes: http://tracker.ceph.com/issues/37723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
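One way to picture such a throttle is a per-tick cap on evictions, sketched below in Python. This is an assumption about the general technique, not the MDCache implementation: `trim_step`, `target_size`, and `max_per_tick` are hypothetical names, and an `OrderedDict` stands in for the cache.

```python
from collections import OrderedDict


def trim_step(cache, target_size, max_per_tick):
    """Evict the oldest entries, but never more than max_per_tick per call.

    Returns the number of entries evicted in this call.
    """
    excess = max(0, len(cache) - target_size)
    to_trim = min(excess, max_per_tick)
    for _ in range(to_trim):
        cache.popitem(last=False)  # drop the oldest entry
    return to_trim


# Usage: the target shrinks from 10 to 4 entries, at most 4 evictions per tick.
cache = OrderedDict((i, None) for i in range(10))
steps = []
while len(cache) > 4:
    steps.append(trim_step(cache, target_size=4, max_per_tick=4))
```

Here the large reduction is spread over two ticks instead of being done in a single long pass.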
Casey Bodley [Tue, 29 Jan 2019 15:43:58 +0000 (10:43 -0500)]
rgw: only update last_trim marker on ENODATA
if cls_log_trim() returns 0, it may have stopped after 1000 entries
before trimming all the way to to_marker. only update last_trim on
ENODATA, so we continue trimming until done
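The marker logic can be sketched in Python as follows. This is a simplified model under stated assumptions: `trim_once` is a hypothetical stand-in for cls_log_trim(), the log is a plain list of integer markers, and the retries are shown as one loop rather than as separate trim passes.

```python
import errno

MAX_PER_CALL = 1000  # per-call limit mentioned in the commit message


def trim_once(log, to_marker):
    """Hypothetical stand-in for cls_log_trim().

    Removes up to MAX_PER_CALL entries with marker <= to_marker. Returns 0
    if anything was trimmed, -ENODATA if nothing was left to trim.
    """
    victims = [m for m in log if m <= to_marker][:MAX_PER_CALL]
    if not victims:
        return -errno.ENODATA
    for m in victims:
        log.remove(m)
    return 0


def trim_until_done(log, to_marker, last_trim):
    """Advance last_trim only once ENODATA confirms trimming is complete."""
    calls = 0
    while True:
        ret = trim_once(log, to_marker)
        calls += 1
        if ret == -errno.ENODATA:
            return to_marker, calls
        if ret < 0:
            return last_trim, calls  # other error: leave the marker unchanged


# Usage: 2200 entries fall at or below the marker, so three calls trim
# entries and a fourth returns ENODATA.
log = list(range(2500))
last_trim, calls = trim_until_done(log, to_marker=2199, last_trim=-1)
```

Treating a return of 0 as "done" would have moved the marker after the first 1000 entries and left the rest untrimmed.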
Sebastian Krah [Tue, 29 Jan 2019 13:38:31 +0000 (14:38 +0100)]
mgr/dashboard: Cleanup cluster and audit log
Applies the following changes:
- Makes the timestamp bold and not the message
- Colors the priority according to its level
- Indents the message correctly if it's longer than one line
- Displays a message when the log is empty
Fixes: https://tracker.ceph.com/issues/37916
Signed-off-by: Sebastian Krah <skrah@suse.com>
Sebastian Krah [Tue, 29 Jan 2019 13:32:52 +0000 (14:32 +0100)]
mgr/dashboard: Refactor log pipe
The pipe now returns a class name instead of an object. This has the advantage
that the layout can be modified directly in scss, which keeps the code separated
from the layout.
Fixes: https://tracker.ceph.com/issues/37916
Signed-off-by: Sebastian Krah <skrah@suse.com>
Patrick Nawracay [Tue, 29 Jan 2019 09:31:19 +0000 (09:31 +0000)]
mgr/dashboard: Fix reloading of pool listing
Remove broken functionality that prevents pools from being reloaded after
deletion. The code also introduced a different problem; when clicking on a tab
shortly after the page has been loaded, the code will restore the tab to the
wrong one.
Kefu Chai [Tue, 29 Jan 2019 09:18:21 +0000 (17:18 +0800)]
osd/HitSet: mark copy ctor of HitSet::Params noexcept
to be returned using seastar::future<...> the value type should satisfy
std::is_nothrow_constructible<T>.
with this change, pg_pool_t will be nothrow_constructible. and hence
can be returned using seastar::future<pg_pool_t>. otherwise
std::is_nothrow_constructible<pg_pool_t>::value would be false.
get_all_versions() is documented as a lower-level api that doesn't
handle paging, and suggests list_versions() instead. also caches the
results to avoid listing each bucket twice
Sage Weil [Mon, 28 Jan 2019 11:50:59 +0000 (05:50 -0600)]
Merge PR #24546 into master
* refs/pull/24546/head:
msg/async/ProtocolV2: clear dispatch throttle on connection stop
msg/async: fix should_use_msgr2 behavior (including monc)
msg/async/AsyncMessenger: clear need_addr *after* we set our new addr
msg/async/ProtocolV2: fix handling for v2 client connection with v1 addr
ceph_test_msgr: do not connect_to on the client side
msg/async: do not connect from server
msg/async: do not use peer to addr detection; use getsockname()
msg/async/ProtocolV2: always send non-empty addrvec for self
msg/async: never fill out port in myaddr if we didn't bind
ceph_test_msgr: use v2 addrs for simplemessenger
msg/async: msgr2: don't force write event on every message received
msg/async/ProtocolV2: be forgiving in server identity check
msg/async/ProtocolV2: fault if we connect to the wrong peer
msg/async: msgr2: clean cookie if connection failed in ACCEPT_SESSION
msg/async/ProtocolV2: do not bump connect_seq for fault during ACCEPTING_SESSION
msg/async: msgr2: don't send SESSION_RETRY_GLOBAL in handle_existing_connection
msg/async: msgr2: organizing log messages
msg/async: msgr2: fix connection fault when replacing
msg/async: msgr2: fix replacing race handling
msg/async: msgr2: fix connection race when existing connection is newer
msg/async: msgr2: assign recv_stamp in handle_message
msg/async: msgr2: fix peer_addrs discovery
msg/async: msgr2: keep authorizer bufferlist across reconnects
msg/async: msgr2: fix connection secret problems for WITH_SEASTAR builds
msg/async: msgr2: send keepalive on connection race winner
msg/async: msgr2: fix client address learning
msg/async: msgr2: fix keepalive_ack message
msg/async: msgr2: do not force updating rotating keys inline
msg/async: msgr2: fix mark_down vs accept race
msg/async: msgr2: unregister con from accept vs mark_down race
auth/cephx/CephxSessionHandler: use connection_secret for encryption
msg,cephx: establish a unique connection_secret for every connection
msg/async: msgr2: use sha256_digest_t to print signature hex strings
types.h,rgw: merge sha*_digest_t definitions
msg/async: msgr2: close connection when no authorizer is given
msg/async: msgr2: formatting fixes
msg/async: msgr2: send client v2 address when only v1 address is defined
msg/async: msgr2: add payload length to banner
msg/async: msgr2: check protocol state after fast dispatch
msg/async: msgr2: reduce log level for sending messages event
msg/async: msgr2: call verify authorizer when CEPH_AUTH_NONE is used
msg/async: msgr2: store peer entity name in the protocol
msg/async: msgr2: apply sign/encrypt to messages data payload
msg/async: msgr2: encryption/decryption of frames
cephx: added encrypt/decrypt bufferlist method to session handler
msg/async: msgr2: refactored the frame structures
cephx: add sign bufferlist method
options: msgr2 enable/disable signing and encrytion options
msg/async: msgr2: cephx authentication
msg/async: msgr2: implement reconnect
msg/async: msgr2: fault handling
msg/async: msgr2: messange exchange phase
msg/async: msgr2: message flow handshake
msg/async: msgr2: authentication phase
msg/async: msgr2: exchange peer_type in banner phase
test/msgr: cloned test_msgr test for testing msgr2 protocol
msg/async: msgr2: banner exchange
msg/async: asyncconnection: update the source address info
msg/async: move base class Protocol its own source file