git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
5 years agotest/msgr/perf_msgr_client.cc: fix misleading total op num 34103/head
Yang Honggang [Sat, 21 Mar 2020 11:39:06 +0000 (19:39 +0800)]
test/msgr/perf_msgr_client.cc: fix misleading total op num

Signed-off-by: Yang Honggang <yanghonggang@kuaishou.com>
5 years agomds: trim cache on regular schedule 29542/head
Patrick Donnelly [Wed, 7 Aug 2019 17:35:02 +0000 (10:35 -0700)]
mds: trim cache on regular schedule

Do this outside the standard tick interval as it needs to be driven more
frequently to keep up with client workloads that generate a lot of
capabilities.

Fixes: https://tracker.ceph.com/issues/41141
Fixes: https://tracker.ceph.com/issues/41140
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge pull request #29754 from xiexingguo/wip-inc-recovery-3
Xie Xingguo [Sat, 24 Aug 2019 05:51:21 +0000 (13:51 +0800)]
Merge pull request #29754 from xiexingguo/wip-inc-recovery-3

osd: misc inc-recovery compat fixes

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
5 years agoMerge pull request #28344 from iotcg/rdma
Kefu Chai [Sat, 24 Aug 2019 03:37:03 +0000 (11:37 +0800)]
Merge pull request #28344 from iotcg/rdma

check rdma configuration and fix some logic problem

Reviewed-by: Roman Penyaev <rpenyaev@suse.de>
Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge PR #28855 into master
Patrick Donnelly [Fri, 23 Aug 2019 23:16:16 +0000 (16:16 -0700)]
Merge PR #28855 into master

* refs/pull/28855/head:
doc: document scrub summary in ceph status output
test: extend scrub control test to validate mds task status
mds: send scrub state changes to cluster log.
mds: periodically sent mds scrub status to ceph manager
mgr, mon: allow normal ceph services to register with manager

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge PR #29167 into master
Patrick Donnelly [Fri, 23 Aug 2019 23:10:40 +0000 (16:10 -0700)]
Merge PR #29167 into master

* refs/pull/29167/head:
client: return -eio when sync file which unsafe reqs has been dropped

Reviewed-by: Zheng Yan <zyan@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge PR #29572 into master
Patrick Donnelly [Fri, 23 Aug 2019 23:06:51 +0000 (16:06 -0700)]
Merge PR #29572 into master

* refs/pull/29572/head:
mds: Reorganize class members in FSMap header

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge pull request #29796 from trociny/wip-journal-player-handle_cache_rebalanced2
Jason Dillaman [Fri, 23 Aug 2019 17:40:15 +0000 (13:40 -0400)]
Merge pull request #29796 from trociny/wip-journal-player-handle_cache_rebalanced2

journal: fix race between player shut down and cache rebalance

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
5 years agoMerge pull request #29775 from trociny/wip-41229
Jason Dillaman [Fri, 23 Aug 2019 17:39:37 +0000 (13:39 -0400)]
Merge pull request #29775 from trociny/wip-41229

librbd: always try to acquire exclusive lock when removing image

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
5 years agoMerge pull request #29459 from zy751713126/config_set
Jason Dillaman [Fri, 23 Aug 2019 17:39:07 +0000 (13:39 -0400)]
Merge pull request #29459 from zy751713126/config_set

pybind/rbd: add config_set/get/remove api in rbd.pyx

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
5 years agoMerge PR #29715 into master
Patrick Donnelly [Fri, 23 Aug 2019 17:09:17 +0000 (10:09 -0700)]
Merge PR #29715 into master

* refs/pull/29715/head:
qa: fix broken ceph.restart marking of OSDs down
qa: add debugging failed osd-release setting

Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge PR #29821 into master
Patrick Donnelly [Fri, 23 Aug 2019 17:00:52 +0000 (10:00 -0700)]
Merge PR #29821 into master

* refs/pull/29821/head:
qa: stop DaemonWatchdog for each cluster in daemon roles

Reviewed-by: Jos Collin <jcollin@redhat.com>
5 years agoMerge PR #29575 into master
Sage Weil [Fri, 23 Aug 2019 16:26:28 +0000 (11:26 -0500)]
Merge PR #29575 into master

* refs/pull/29575/head:
objclass, osd: improve const-correctness of PGLSFilter.
common: add bl::contents_equal() override for void* + size_t.
osd: refactor manufacturing of PGLSFilter.
osd: don't carry PGLSFilter between multiple ops in MOSDOp.

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge PR #28727 into master
Sage Weil [Fri, 23 Aug 2019 16:25:28 +0000 (11:25 -0500)]
Merge PR #28727 into master

* refs/pull/28727/head:
test/crimson: resolve name collision
test: switch to ldout; let users specify mon debug level
test: add new ElectionLogic unit test framework
elector: const-ify a bunch of functions
elector: swap order of parameters in ElectionLogic::receive_propose
elector: Update Elector and ElectionLogic function documentation
elector: persist the epoch in bump_epoch()
elector: make some more ElectionLogic members private
elector: fix privacy and restore dout in Elector
elector: don't clear peer_info in bump_epoch()
elector: split ElectionLogic into its own compilation unit
elector: move all the elector callouts into the Elector
elector: make ElectionLogic private to Elector; undo most public shenanigans
elector: create declare_standlone_victory in Elector/Logic for Monitor
elector: make ElectionLogic::declare_victory private
elector: route _bump_epoch through the interface-to-be
elector: rename handle_propose_logic -> receive_propose
elector: hoist handle_victory into ElectionLogic
elector: hoist handle_ack into ElectionLogic
elector: hoist victory into ElectionLogic
elector: hoist expire into ElectionLogic
elector: hoist start into ElectionLogic
elector: hoist participating into ElectionLogic
elector: hoist init into ElectionLogic
elector: hoist defer into ElectionLogic
elector: split handle_propose in two and hoist into ElectionLogic
elector: hoist bump_epoch into ElectionLogic
elector: store accessors for ElectionLogic
elector: hoist Elector data bits out into a new ElectionLogic class
mon: Rearrange Paxos::dispatch to be a little cleaner

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge PR #15183 into master
Sage Weil [Fri, 23 Aug 2019 15:46:33 +0000 (10:46 -0500)]
Merge PR #15183 into master

* refs/pull/15183/head:
kv/rocksdb: support rmrange unconditionally
cls/rgw: rgw_bi_log_trim() uses cls_cxx_map_remove_range()
cls/log: cls_log_trim() uses cls_cxx_map_remove_range()
test/cls: add cls_log.trim_by_marker test
test/cls: test_cls_log doesn't allocate ObjectOperations
test/cls: test_cls_log uses fixture for temporary pool
test/cls: add cls_rgw.bi_log_trim test
cls/rgw: expose cls_rgw_bilog_list/trim() for single shard
test/cls: test_cls_rgw uses cls_rgw_obj_key
test/cls: test_cls_rgw doesn't allocate ObjectOperations
test/cls: test_cls_rgw uses fixture for temporary pool
objclass: add cls_cxx_map_remove_range()
librados: add rados_write_op_omap_rm_range2()
osdc: add Objecter omap_rm_range()
osd: add CEPH_OSD_OP_OMAPRMKEYRANGE to do_osd_ops()
osd: add omap_rmkeyrange() to PGTransaction
os: add bufferlist overload for omap_rmkeyrange()
tracing: add do_osd_op_pre_omaprmkeyrange
rados: add CEPH_OSD_OP_OMAPRMKEYRANGE

Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge pull request #29778 from cbodley/wip-41212
Casey Bodley [Fri, 23 Aug 2019 14:24:41 +0000 (10:24 -0400)]
Merge pull request #29778 from cbodley/wip-41212

vstart: move [client.rgw] config into [client]

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
5 years agomgr/dashboard: User Management E2E tests (#29641)
Lenz Grimmer [Fri, 23 Aug 2019 14:00:16 +0000 (14:00 +0000)]
mgr/dashboard: User Management E2E tests (#29641)

mgr/dashboard: User Management E2E tests

Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
5 years agomgr/dashboard: run-backend-api-tests.sh CI improvements (#29504)
Lenz Grimmer [Fri, 23 Aug 2019 09:11:39 +0000 (09:11 +0000)]
mgr/dashboard: run-backend-api-tests.sh CI improvements (#29504)

mgr/dashboard: run-backend-api-tests.sh CI improvements

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
5 years agoMerge pull request #29590 from Aran85/fix_proc_replica_log
Kefu Chai [Fri, 23 Aug 2019 06:59:02 +0000 (14:59 +0800)]
Merge pull request #29590 from Aran85/fix_proc_replica_log

osd: merge replica log on primary need according to replica log's crt

Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agomsg/async/Stack: rename variable to improve readability 28344/head
Changcheng Liu [Wed, 31 Jul 2019 01:33:51 +0000 (09:33 +0800)]
msg/async/Stack: rename variable to improve readability

1. rename var i to be worker_id when creating Worker
"i" is assigned to be Worker::id, it means worker's id

2. rename EventCenter::idx to EventCenter::center_id
"idx" is EventCenter's index in global_centers obj.
rename it to be center_id.

3. rename EventCenter::init API's parameter n to be nevent
"n" is actually assigned to EventCenter::nevent. rename it
to be "nevent".

4. rename EventCenter::init API's parameter t to be type
"t" is corresponding to Epoll Driver's implementation's type.

5. rename EpollDriver::size to be EpollDriver::nevent
"size" is actually epoll events number, rename it to be "nevent"

6. use event_id as index name to get event instead of "j"

7. rename "nw" to be "nowait"

8. Processor::start unify variable name with Processor::accept & Processor::stop
==> auto &l to be auto &listen_socket

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: remove stack from RDMAWorker
Changcheng Liu [Wed, 7 Aug 2019 07:13:38 +0000 (15:13 +0800)]
msg/async/rdma: remove stack from RDMAWorker

There's no need to cache stack since RDMAWorker already has
Infiniband obj ib & RDMADispatcher obj dispatcher.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: use shared_ptr to manage RDMADispatcher obj
Changcheng Liu [Wed, 7 Aug 2019 07:08:15 +0000 (15:08 +0800)]
msg/async/rdma: use shared_ptr to manage RDMADispatcher obj

1. Don't use bare pointer to manage RDMADispatcher obj.

2. access RDMADispatcher obj directly instead of accessing it
from RDMAStack. This could avoid caching RDMAStack obj in
RDMAWorker & RDMADispatcher.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: remove stack from RDMADispatcher
Changcheng Liu [Wed, 7 Aug 2019 06:27:05 +0000 (14:27 +0800)]
msg/async/rdma: remove stack from RDMADispatcher

There's no need to cache stack since RDMADispatcher already has
Infiniband obj ib.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: use shared_ptr to manage Infiniband obj
Changcheng Liu [Wed, 7 Aug 2019 06:19:11 +0000 (14:19 +0800)]
msg/async/rdma: use shared_ptr to manage Infiniband obj

1. Don't use bare pointer to manage Infiniband obj.

2. access Infiniband obj directly instead of accessing it from
RDMAStack. This could avoid caching RDMAStack obj in RDMAWorker
& RDMADispatcher.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: implement function to prefetch buffers
Changcheng Liu [Fri, 2 Aug 2019 03:23:08 +0000 (11:23 +0800)]
msg/async/rdma: implement function to prefetch buffers

The original RDMAConnectedSocketImpl::read reads data from buffers and
prefetches data into buffers for the next round of reading. This makes the
logic a little complex and the code hard to read.
In this patch:
1) RDMAConnectedSocketImpl::buffer_prefetch private API is added to
prefetch data into buffers at the head of read_buffers.
2) Drop one call to notify() to reduce context switches.
There is no need to notify the upper layer to read data, since the current
read operation hasn't finished yet.
3) Simplify RDMAConnectedSocketImpl::read implementation.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: remove redundant code
Changcheng Liu [Wed, 19 Jun 2019 07:53:08 +0000 (15:53 +0800)]
msg/async/rdma: remove redundant code

1. The following three bits are meaningless in the pollfd::events field:
   POLLERR, POLLHUP, or POLLNVAL.
2. QueuePair::pd is initialized in the initializer list.
   There's no need to assign the same value to it again.
3. Remove the never used function Chunk::set_bound
4. Remove the never used function Chunk::set_offset
5. Remove the never used function QueuePair::is_error
6. Remove SimplePolicyMessenger used vars
7. remove socket_fd() interface since it's never used.
   All data write/read is based on ConnectedSocketImpl::fd.
   So, there's no need to expose socket_fd since it's never used.
8. Remove RDMAServerSocketImpl::get_fd which is not used.
   BTW, RDMAServerSocketImpl::fd has the same function as get_fd.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
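
A small illustration of item 1 above, offered as a sketch rather than code from the patch: POSIX poll() reports POLLERR, POLLHUP and POLLNVAL in revents whether or not they are set in events, so requesting them has no effect. The helper name revents_after_writer_close is hypothetical.

```cpp
#include <poll.h>
#include <unistd.h>

// Hypothetical helper: polls the read end of a pipe whose write end has been
// closed, asking only for `requested`, and returns the reported revents.
// Returns 0 if the pipe or the poll call fails.
short revents_after_writer_close(short requested) {
  int fds[2];
  if (pipe(fds) != 0) {
    return 0;
  }
  close(fds[1]);                 // no writer left: the reader sees a hang-up

  struct pollfd p;
  p.fd = fds[0];
  p.events = requested;          // POLLHUP is deliberately NOT requested
  p.revents = 0;

  int rc = poll(&p, 1, 0);       // timeout 0: return immediately
  close(fds[0]);
  return rc < 0 ? 0 : p.revents;
}
```

Calling the helper with POLLIN, or even with 0, still yields POLLHUP in the result on Linux, which is why setting those bits in events is redundant.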
5 years agomsg/async/rdma: show port state with string
Changcheng Liu [Mon, 19 Aug 2019 02:48:52 +0000 (10:48 +0800)]
msg/async/rdma: show port state with string

Showing the port state as a string is easier to read than the numeric
value.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: convert port_id from type uint8_t to int for output
Changcheng Liu [Wed, 14 Aug 2019 06:18:33 +0000 (14:18 +0800)]
msg/async/rdma: convert port_id from type uint8_t to int for output

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agoMerge pull request #29747 from liewegas/wip-39546
Kefu Chai [Fri, 23 Aug 2019 05:28:52 +0000 (13:28 +0800)]
Merge pull request #29747 from liewegas/wip-39546

osd/PeeringState: do not complain about past_intervals constrained by oldest epoch

Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #29624 from NancySu05/osdmonitor_markmedown
Kefu Chai [Fri, 23 Aug 2019 05:23:46 +0000 (13:23 +0800)]
Merge pull request #29624 from NancySu05/osdmonitor_markmedown

mon:C_AckMarkedDown has not handled the Callback Arguments

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #29738 from ifed01/wip-ifed-alloc-cleanup
Kefu Chai [Fri, 23 Aug 2019 05:22:52 +0000 (13:22 +0800)]
Merge pull request #29738 from ifed01/wip-ifed-alloc-cleanup

os/bluestore: minor improvements/cleanup around allocator

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
5 years agoMerge pull request #29614 from votdev/issue_41205
Kefu Chai [Fri, 23 Aug 2019 05:20:52 +0000 (13:20 +0800)]
Merge pull request #29614 from votdev/issue_41205

mgr/dashboard: Access control database does not restore disabled users correctly

Reviewed-by: Patrick Seidensal <pnawracay@suse.com>
5 years agoMerge pull request #29146 from badone/wip-tracker-40835-OSDCap.PoolClassRNS-abort
Kefu Chai [Fri, 23 Aug 2019 05:16:22 +0000 (13:16 +0800)]
Merge pull request #29146 from badone/wip-tracker-40835-OSDCap.PoolClassRNS-abort

osd/OSDCap: Check for empty namespace

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #25697 from Aran85/fix-onode-trim
Kefu Chai [Fri, 23 Aug 2019 05:15:27 +0000 (13:15 +0800)]
Merge pull request #25697 from Aran85/fix-onode-trim

os/bluestore: more aggressive deferred submit when onode trim skipping

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
5 years agoMerge pull request #28488 from liuchang0812/show-pool-id-in-pool-ls-cmd
Kefu Chai [Fri, 23 Aug 2019 05:13:32 +0000 (13:13 +0800)]
Merge pull request #28488 from liuchang0812/show-pool-id-in-pool-ls-cmd

mon: show pool id in pool ls command

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge pull request #24636 from rzarzynski/wip-denc-container_base
Kefu Chai [Fri, 23 Aug 2019 05:12:04 +0000 (13:12 +0800)]
Merge pull request #24636 from rzarzynski/wip-denc-container_base

denc: slightly optimize container_base::bound_encode

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #29756 from Aran85/fix-repair-object
Kefu Chai [Fri, 23 Aug 2019 05:08:49 +0000 (13:08 +0800)]
Merge pull request #29756 from Aran85/fix-repair-object

osd: clear PG_STATE_CLEAN when repair object

Reviewed-by: David Zafman <dzafman@redhat.com>
5 years agomsg/async/rdma: rename variable to improve readability
Changcheng Liu [Wed, 7 Aug 2019 05:33:37 +0000 (13:33 +0800)]
msg/async/rdma: rename variable to improve readability

Device::binding_port
1. port_id is more meaningful compared to i as variable name.
2. start port_id from 1 instead of 0.

PoolAllocator::malloc
1. make clear relationship among buffer/chunk/block/memory_region with new
variable name.
2. define the variable when it's first being used.

RDMAConnectedSocketImpl::submit
1. use "wait_copy_len" to replace "need_reserve_bytes" which stands for the memory
that is waiting to be copied into chunk.
2. use "copy_start" to replace "copy_it" which stands for the start iterator to be copied.
3. use "total_copied" to replace "total" which stands for the memory that has been copied.

allocate huge page
1. use "HUGE_PAGE_SIZE_2MB" for 2MB page alignment.
2. use "ALIGN_TO_PAGE_2MB" to align the request size to 2MB.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: make clear to get mem_info address
Changcheng Liu [Mon, 1 Jul 2019 02:41:18 +0000 (10:41 +0800)]
msg/async/rdma: make clear to get mem_info address

The parameter "block" points to the mem_info::chunks space. It is not
obvious what "reinterpret_cast<mem_info *>(block) - 1;" does.
Instead, take the mem_info::chunks address and subtract the member's offset
from the start of the struct to get the mem_info address.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
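
The address computation described in the commit above can be sketched as follows. This is a minimal illustration under assumptions: the field layout of mem_info and Chunk and the helper name mem_info_from_block are hypothetical, not copied from the Ceph source.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the allocator's types (field names are assumptions).
struct Chunk {
  uint32_t offset;
  uint32_t bound;
};

struct mem_info {
  void*    ctx;        // owning context, illustrative only
  unsigned nbufs;      // number of chunks in the block
  Chunk    chunks[1];  // first element of the chunk array
};

// Recover the enclosing mem_info from a pointer to its `chunks` member by
// subtracting the member's offset from the start of the struct.
inline mem_info* mem_info_from_block(void* block) {
  char* p = static_cast<char*>(block);
  return reinterpret_cast<mem_info*>(p - offsetof(mem_info, chunks));
}
```

Compared with casting the block pointer and subtracting 1, the offsetof form states the intent directly and stays correct if padding or extra members are placed before chunks.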
5 years agomsg/async/rdma: use different strategy to reset read/write chunk
Changcheng Liu [Mon, 1 Jul 2019 02:27:45 +0000 (10:27 +0800)]
msg/async/rdma: use different strategy to reset read/write chunk

When releasing read chunk to pool, the chunk::offset & chunk::bound
should be reset to zero. For write chunk, it's better to reset
chunk::offset to zero and chunk::bound to chunk length which means that
[offset, bound) is writable.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
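
A minimal sketch of the two reset strategies described above, assuming a simplified Chunk with a fixed capacity. The member and method names (bytes, reset_read_chunk, reset_write_chunk, writable) are illustrative assumptions.

```cpp
#include <cstdint>

// Simplified chunk: [offset, bound) is the currently usable window.
struct Chunk {
  uint32_t bytes;       // capacity of the chunk in bytes
  uint32_t offset = 0;
  uint32_t bound = 0;

  explicit Chunk(uint32_t len) : bytes(len) {}

  // A read chunk returned to the pool holds no received data yet.
  void reset_read_chunk() {
    offset = 0;
    bound = 0;
  }

  // A write chunk returned to the pool is writable over its whole length.
  void reset_write_chunk() {
    offset = 0;
    bound = bytes;
  }

  // Size of the [offset, bound) window.
  uint32_t writable() const { return bound - offset; }
};
```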
5 years agomsg/async/rdma: cosmetics initialize ibv_send_wr* var
Changcheng Liu [Thu, 27 Jun 2019 05:19:58 +0000 (13:19 +0800)]
msg/async/rdma: cosmetics initialize ibv_send_wr* var

API usage:
int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr)
Input Parameters:
   qp struct ibv_qp from ibv_create_qp
   wr first work request (WR)
Output Parameters:
   bad_wr pointer to first rejected WR
Return Value:
   0 on success, -1 on error.
   If the call fails, errno will be set to indicate the reason for the failure.
To avoid checking a stale output value, it's better to initialize the
ibv_send_wr* variable to nullptr.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: cosmetics RDMAWorker listen & connect & get_reged_mem
Changcheng Liu [Fri, 21 Jun 2019 02:57:08 +0000 (10:57 +0800)]
msg/async/rdma: cosmetics RDMAWorker listen & connect & get_reged_mem

1. There's no need to get stack & dispatcher from RDMAStack again
since RDMAWorker has stored the value.
2. cache the Infiniband object to be used in local scope.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: cosmetics RDMAConnectedSocketImpl::read_buffers
Changcheng Liu [Thu, 13 Jun 2019 11:20:28 +0000 (19:20 +0800)]
msg/async/rdma: cosmetics RDMAConnectedSocketImpl::read_buffers

After refactoring, there's no need for the check below:
    -  if (c != buffers.end() && (*c)->over())
    -    ++c;

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: cosmetics post_chunks_to_rq implementation
Changcheng Liu [Wed, 5 Jun 2019 02:33:20 +0000 (10:33 +0800)]
msg/async/rdma: cosmetics post_chunks_to_rq implementation

1. It's not proper to allocate a large array on the stack, e.g. rx_queue_len is 4096.
The patch allocates rx_work_request and isge on the heap instead.

2. Zero the whole rx_work_request and isge arrays at once, which avoids
zeroing each element one by one in the while loop.

3. Rename parameter "num" to "rq_wr_num" to improve readability;
rq_wr_num stands for receive-queue work-request number.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
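
One way to picture items 1 and 2 above, as a hedged sketch: the structs below are hypothetical stand-ins for the libibverbs work-request and scatter/gather types, and std::vector is used here in place of whatever heap allocation the patch actually performs. A vector constructed with a size value-initializes its elements, so both arrays start zeroed without a per-element loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the verbs structures (not the real definitions).
struct fake_sge {
  uint64_t addr;
  uint32_t length;
  uint32_t lkey;
};

struct fake_recv_wr {
  uint64_t      wr_id;
  fake_recv_wr* next;
  fake_sge*     sg_list;
  int           num_sge;
};

// Heap-backed, zero-initialized arrays sized by rq_wr_num.
struct RecvBatch {
  std::vector<fake_recv_wr> wrs;
  std::vector<fake_sge>     sges;

  explicit RecvBatch(std::size_t rq_wr_num)
      : wrs(rq_wr_num), sges(rq_wr_num) {   // value-initialized: all zero
    for (std::size_t i = 0; i < rq_wr_num; ++i) {
      wrs[i].sg_list = &sges[i];
      wrs[i].num_sge = 1;
      // Chain the requests; the last one keeps next == nullptr.
      wrs[i].next = (i + 1 < rq_wr_num) ? &wrs[i + 1] : nullptr;
    }
  }

  // Elements point at each other, so copying would leave dangling links.
  RecvBatch(const RecvBatch&) = delete;
  RecvBatch& operator=(const RecvBatch&) = delete;
};
```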
5 years agomsg/async/rdma: refine Chunk construction function
Changcheng Liu [Thu, 27 Jun 2019 03:03:44 +0000 (11:03 +0800)]
msg/async/rdma: refine Chunk construction function

1. all values are initialized in construction function
   In this way, it's easy to construct Chunk object in
   PoolAllocator::malloc function.
2. For read chunk, member bound is initialized to be 0.
3. For send chunk, member bound is initialized to be full space size.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: avoid long lambda function for readability
Changcheng Liu [Wed, 26 Jun 2019 06:31:29 +0000 (14:31 +0800)]
msg/async/rdma: avoid long lambda function for readability

Extract the long lambda function to improve readability.
The lambda offered no advantage, since the "this" pointer is also needed
in the original lambda function.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: define handle_rx_event to handle recv-comple-queue
Changcheng Liu [Thu, 20 Jun 2019 08:16:29 +0000 (16:16 +0800)]
msg/async/rdma: define handle_rx_event to handle recv-comple-queue

1. define handle_rx_event to let dispatch handle
receive-completion-queue
2. simplify RDMADispatcher::polling implementation

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: deal with all RDMA device async event
Changcheng Liu [Thu, 20 Jun 2019 03:20:27 +0000 (11:20 +0800)]
msg/async/rdma: deal with all RDMA device async event

1. List all asynchronous event of the RDMA device
2. Output the fatal error events to check RDMA device status

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/Event: simplify EventCenter::process_events implementation
Changcheng Liu [Wed, 31 Jul 2019 08:54:26 +0000 (16:54 +0800)]
msg/async/Event: simplify EventCenter::process_events implementation

The original implementation makes it hard to understand:
1) Whether timer event should be executed.
2) How long should epoll wait for timeout.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/Event: simplfy logical implementation
Changcheng Liu [Thu, 11 Jul 2019 09:21:07 +0000 (17:21 +0800)]
msg/async/Event: simplfy logical implementation

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: simplify RDMAConnectedSocketImpl::read implementation
Changcheng Liu [Mon, 1 Jul 2019 06:09:51 +0000 (14:09 +0800)]
msg/async/rdma: simplify RDMAConnectedSocketImpl::read implementation

After reading one chunk, the chunk could be pushed into buffer list if its
effective content size is not zero. In this case, it also means that the
caller has got the required read length. Then all the following chunks will
be pushed into buffer list since their effective content size is not zero.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: simplify Cluster::get_buffers implementation
Changcheng Liu [Thu, 20 Jun 2019 06:29:52 +0000 (14:29 +0800)]
msg/async/rdma: simplify Cluster::get_buffers implementation

Keep same logic:
1. If parameter block_size is zero, then allocate all the free chunks
to parameter std::vector<Chunk*> &chunks. i.e.
   chunk_buffer_number = free_chunks.size()
2. If parameter block_size is not zero, then allocate the requested or
all the free chunks to parameter std::vector<Chunk*> &chunks.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
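
The two cases listed above can be sketched as a free function. This is an illustration under stated assumptions, not the Cluster::get_buffers implementation: block_size is taken to be a byte count, chunk_size (assumed non-zero) is an extra parameter introduced for the example, and the free list is a plain std::vector.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Chunk {};  // placeholder; the real chunk carries buffer state

// Moves chunks from free_chunks into chunks and returns how many were moved.
// block_size == 0 : hand out every free chunk.
// block_size != 0 : hand out enough chunks to cover block_size bytes, or all
//                   free chunks if fewer are available.
std::size_t get_buffers(std::vector<Chunk*>& free_chunks,
                        std::vector<Chunk*>& chunks,
                        std::size_t block_size,
                        std::size_t chunk_size) {
  std::size_t chunk_buffer_number = free_chunks.size();
  if (block_size != 0) {
    // Round up so a partial chunk's worth of bytes still gets a chunk.
    std::size_t needed = (block_size + chunk_size - 1) / chunk_size;
    chunk_buffer_number = std::min(needed, free_chunks.size());
  }
  for (std::size_t i = 0; i < chunk_buffer_number; ++i) {
    chunks.push_back(free_chunks.back());
    free_chunks.pop_back();
  }
  return chunk_buffer_number;
}
```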
5 years agomsg/async/rdma: simplify chunk::write implementation
Changcheng Liu [Wed, 26 Jun 2019 06:46:03 +0000 (14:46 +0800)]
msg/async/rdma: simplify chunk::write implementation

Keep same logic to improve readability

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: simplify chunk::read implementation
Changcheng Liu [Mon, 1 Jul 2019 05:18:44 +0000 (13:18 +0800)]
msg/async/rdma: simplify chunk::read implementation

1. offload chunk::read without managing bound.
2. reset chunk::offset & chunk::bound before releasing to pool.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: use Chunk::get_size to get chunk size
Changcheng Liu [Thu, 13 Jun 2019 11:04:40 +0000 (19:04 +0800)]
msg/async/rdma: use Chunk::get_size to get chunk size

remove Chunk::over interface and add Chunk::get_size interface
1) The meaning of the function name "over" is not clear.
2) Some places need to know the current chunk block effective content size.
3) "Chunk::over()" could be replaced by "Chunk::get_size() == 0"

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: seperate Device construction if rdma_cm is used
Changcheng Liu [Thu, 13 Jun 2019 10:34:44 +0000 (18:34 +0800)]
msg/async/rdma: seperate Device construction if rdma_cm is used

If ms_async_rdma_cm is false, there's no need to call the api
rdma_get_device. If rdma_get_device is called, the devices remain
opened while librdmacm is loaded. This is not what we want when
ms_async_rdma_cm is false.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: operate event fd with event_{read,write}
Changcheng Liu [Mon, 10 Jun 2019 04:56:15 +0000 (12:56 +0800)]
msg/async/rdma: operate event fd with event_{read,write}

1. use wrapper function event_read & event_write to access
event file descriptor.
2. change event fd access value name to be event_val.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: fix error argument to get right qp state
Changcheng Liu [Fri, 28 Jun 2019 06:26:41 +0000 (14:26 +0800)]
msg/async/rdma: fix error argument to get right qp state

1. It's wrong to use "-1" as argument to query queue state.
In rdma library, ibv_query_qp will call ibv_cmd_query_qp to query
queue state. If "-1" is used as attr_mask, ibv_cmd_query_qp will
return error EOPNOTSUPP which means query failed.

2. In class QueuePair, is_error() could use member function get_state()
to get the queue pair state.

3. It's better to use qp_state as queue pair state according to
ibv_query_qp manual guide.
   struct ibv_qp_attr {
      enum ibv_qp_state       qp_state;            /* Current QP state */
      enum ibv_qp_state       cur_qp_state;        /* Current QP state - irrelevant for ibv_query_qp */
      ...

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: export RDMAV_HUGEPAGES_SAFE before ibv_fork_init
Changcheng Liu [Mon, 3 Jun 2019 05:31:09 +0000 (13:31 +0800)]
msg/async/rdma: export RDMAV_HUGEPAGES_SAFE before ibv_fork_init

In rdma-core library, ibv_fork_init will check environment variable
RDMAV_HUGEPAGES_SAFE to decide whether huge page is usable in system.
It doesn't make sense to export RDMAV_HUGEPAGES_SAFE env after
calling ibv_fork_init.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
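
The ordering constraint above can be sketched without linking against rdma-core. In this hypothetical helper, the fork_init callable stands in for ibv_fork_init; the only point shown is that the environment variable is exported before the initialization call runs, so the call can observe it.

```cpp
#include <stdlib.h>

// Hypothetical wrapper: export RDMAV_HUGEPAGES_SAFE first, then run the
// initialization routine (a stand-in for ibv_fork_init in this sketch).
// Returns -1 if the variable cannot be set, otherwise fork_init's result.
template <typename InitFn>
int init_with_hugepages_safe(InitFn&& fork_init) {
  if (setenv("RDMAV_HUGEPAGES_SAFE", "1", 1 /* overwrite */) != 0) {
    return -1;
  }
  return fork_init();
}
```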
5 years agomsg/async/rdma: use ibv_port_attr object type in Port class
Changcheng Liu [Mon, 3 Jun 2019 05:00:22 +0000 (13:00 +0800)]
msg/async/rdma: use ibv_port_attr object type in Port class

1. Avoid manual memory management by holding the object itself instead of
a pointer to allocated space; otherwise it could leak memory.
2. Since the member type has been changed in class Device, the member
access operator "." must be used to access the sub-members of the
object.
3. There's no need to consider experimental API of ibv_query_port.
So, merge ibv_query_port in the prolog.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: cosmetics by set member value in initialize list
Changcheng Liu [Fri, 21 Jun 2019 09:06:57 +0000 (17:06 +0800)]
msg/async/rdma: cosmetics by set member value in initialize list

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: define package sequence numbers macro
Changcheng Liu [Wed, 5 Jun 2019 03:13:07 +0000 (11:13 +0800)]
msg/async/rdma: define package sequence numbers macro

Refer to Doc: InfiniBand(TM) Architecture Specification Volume 1 Ver1.2.1
Section: 9.2 BASE TRANSPORT HEADER

bits  |31---------24 | 23-----------16 | 15----------8 | 7---------0 |
bytes |______________________________________________________________|
0 - 3 |____OpCode____|__|SE|M|Pad|Tver_|_________ Partition Key______|
4 - 7 |___Reserved___|______________Destination QP___________________|
8 -11 |A|Reserved 7__|________ PSN - Packet Sequence Number _________|

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
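
Based on the layout drawn above, the packet sequence number occupies the low 24 bits of the third 32-bit word of the Base Transport Header. A minimal sketch of the corresponding constants follows; the names PSN_LEN, PSN_MSK and psn_of are assumptions for illustration and may differ from the macros the patch defines.

```cpp
#include <cstdint>

// Width of the PSN field and the mask selecting it (bits 23..0 of bytes 8-11).
constexpr uint32_t PSN_LEN = 24;
constexpr uint32_t PSN_MSK = (1u << PSN_LEN) - 1;

// Extract the PSN from the third BTH word, given in host byte order.
// The top 8 bits (A flag plus 7 reserved bits) are discarded.
constexpr uint32_t psn_of(uint32_t bth_word2) {
  return bth_word2 & PSN_MSK;
}
```

Masking a randomly generated 32-bit value with PSN_MSK is the usual way to keep an initial sequence number within the 24-bit field.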
5 years agomsg/async/rdma: limit buffer size under rdma max memory region size
Changcheng Liu [Thu, 13 Jun 2019 10:20:39 +0000 (18:20 +0800)]
msg/async/rdma: limit buffer size under rdma max memory region size

The allocated buf size should be under hardware's max_mr_size. Otherwise
it'll trigger an out-of-bound access problem when calling ibv_reg_mr.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: check device_attr->max_srq is not zero
Changcheng Liu [Mon, 3 Jun 2019 09:53:37 +0000 (17:53 +0800)]
msg/async/rdma: check device_attr->max_srq is not zero

Some rdma devices don't support srq(shared receive queue).
Check hardware attribute if ceph is configured to use srq.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: check memory region size before tx buffer allocation
Changcheng Liu [Fri, 31 May 2019 10:32:04 +0000 (18:32 +0800)]
msg/async/rdma: check memory region size before tx buffer allocation

It'll trigger an out-of-bound access problem in kernel if the required
memory region size is bigger than ibv_device_attr.max_mr_size.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agomsg/async/rdma: correct receive queue length info
Changcheng Liu [Tue, 21 May 2019 01:54:36 +0000 (09:54 +0800)]
msg/async/rdma: correct receive queue length info

Without this patch, the following misleading log is emitted:
   Infiniband init requested receive queue length 4095 is too big. Setting 4095

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
5 years agoMerge PR #29806 into master
Sage Weil [Thu, 22 Aug 2019 19:07:49 +0000 (14:07 -0500)]
Merge PR #29806 into master

* refs/pull/29806/head:
mgr/BaseMgrModule: tolerate Int or Long for health 'count'

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge pull request #29298 from zhangsw/rgw-fix-bug-listobjv2-startafter
Ali Maredia [Thu, 22 Aug 2019 18:12:30 +0000 (14:12 -0400)]
Merge pull request #29298 from zhangsw/rgw-fix-bug-listobjv2-startafter

rgw: continuationToken or startAfter shouldn't be returned if not specified

5 years agoMerge PR #29780 into master
Sage Weil [Thu, 22 Aug 2019 17:52:16 +0000 (12:52 -0500)]
Merge PR #29780 into master

* refs/pull/29780/head:
osd/PeeringState: semi-colon after DECLARE_LOCALS
osd/PeeringState: on_new_interval on child PG after split

Reviewed-by: Samuel Just <sjust@redhat.com>
5 years agoMerge PR #29774 into master
Sage Weil [Thu, 22 Aug 2019 17:27:26 +0000 (12:27 -0500)]
Merge PR #29774 into master

* refs/pull/29774/head:
qa/standalone/scrub/osd-scrub-snaps: snapmapper omap is now 'm'

Reviewed-by: David Zafman <dzafman@redhat.com>
5 years agoMerge PR #29807 into master
Sage Weil [Thu, 22 Aug 2019 17:26:11 +0000 (12:26 -0500)]
Merge PR #29807 into master

* refs/pull/29807/head:
mgr/pg_autoscaler: fix race with pool deletion

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
5 years agoqa: stop DaemonWatchdog for each cluster in daemon roles 29821/head
Patrick Donnelly [Thu, 22 Aug 2019 15:59:43 +0000 (08:59 -0700)]
qa: stop DaemonWatchdog for each cluster in daemon roles

Fixes: https://tracker.ceph.com/issues/41398
Introduced-by: 08b99eef277b00a3ea423cbf085bd114a805813f
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoqa: fix broken ceph.restart marking of OSDs down 29715/head
Patrick Donnelly [Thu, 22 Aug 2019 04:13:37 +0000 (21:13 -0700)]
qa: fix broken ceph.restart marking of OSDs down

Sage noticed `osd down` was not being performed. Bug was that the role
format had changed so splitting no longer worked correctly.

Fixes: https://tracker.ceph.com/issues/40773
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
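
The failure mode described above can be sketched as follows. This is only an illustration: the helper name `split_role` and the two role formats (`osd.0` and a cluster-prefixed `ceph.osd.0`) are assumptions, not the actual qa task code.

```python
def split_role(role, default_cluster="ceph"):
    """Split a role string into (cluster, daemon_type, daemon_id).

    Hypothetical helper: accepts both 'osd.0' and 'ceph.osd.0'.
    """
    parts = role.split(".")
    if len(parts) == 2:
        # Old-style role without a cluster prefix, e.g. 'osd.0'.
        return (default_cluster, parts[0], parts[1])
    if len(parts) == 3:
        # Cluster-prefixed role, e.g. 'ceph.osd.0'.
        return (parts[0], parts[1], parts[2])
    raise ValueError("unrecognized role: %r" % role)
```

A naive `role.split('.')[1]` yields the daemon id for `osd.0` but the daemon type for `ceph.osd.0`, which is the kind of silent breakage a format change can cause.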
5 years agoMerge pull request #17719 from mikulely/fix-usage-stats
Casey Bodley [Thu, 22 Aug 2019 15:04:38 +0000 (11:04 -0400)]
Merge pull request #17719 from mikulely/fix-usage-stats

rgw: distinguish different get_usage for usage log

Reviewed-by: Robin H. Johnson <rjohnson@digitalocean.com>
5 years agoqa: add debugging failed osd-release setting
Patrick Donnelly [Fri, 16 Aug 2019 21:54:48 +0000 (14:54 -0700)]
qa: add debugging failed osd-release setting

See-also: https://tracker.ceph.com/issues/40773
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agomgr/dashboard: run-backend-api-tests.sh CI improvements 29504/head
alfonsomthd [Thu, 22 Aug 2019 13:33:02 +0000 (15:33 +0200)]
mgr/dashboard: run-backend-api-tests.sh CI improvements

As there is now a Jenkins job to run this script
(see https://github.com/ceph/ceph-build/pull/1351),
this refactoring adapts the script to run in a Jenkins job as well as locally.

Signed-off-by: alfonsomthd <almartin@redhat.com>
5 years agoMerge pull request #29544 from tchaikov/wip-doc-search-CSP
Kefu Chai [Thu, 22 Aug 2019 09:56:58 +0000 (17:56 +0800)]
Merge pull request #29544 from tchaikov/wip-doc-search-CSP

doc: always load resources via HTTPS

Reviewed-by: Tiago Melo <tmelo@suse.com>
5 years agodoc: always load resources via HTTPS 29544/head
Kefu Chai [Thu, 8 Aug 2019 10:40:47 +0000 (18:40 +0800)]
doc: always load resources via HTTPS

Signed-off-by: Tiago Melo <tmelo@suse.com>
5 years agopybind/rbd: add config_image_set/get/remove test case 29459/head
zhengyin [Fri, 2 Aug 2019 07:57:12 +0000 (03:57 -0400)]
pybind/rbd: add config_image_set/get/remove test case

Signed-off-by: Zheng Yin <zhengyin@cmss.chinamobile.com>
5 years agopybind/rbd: add config_image_set/get/remove api in rbd.pyx
zhengyin [Fri, 2 Aug 2019 07:26:20 +0000 (03:26 -0400)]
pybind/rbd: add config_image_set/get/remove api in rbd.pyx

Signed-off-by: Zheng Yin <zhengyin@cmss.chinamobile.com>
5 years agoMerge pull request #29755 from xiexingguo/wip-inc-recovery-4
Xie Xingguo [Thu, 22 Aug 2019 05:48:15 +0000 (13:48 +0800)]
Merge pull request #29755 from xiexingguo/wip-inc-recovery-4

osd: do not invalidate clear_regions of missing item at boot

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
5 years agoosd/osd_type: disable incremental recovery for legacy missing item 29754/head
xie xingguo [Wed, 21 Aug 2019 08:34:26 +0000 (16:34 +0800)]
osd/osd_type: disable incremental recovery for legacy missing item

This is important for talking with pre-Octopus OSDs and for
making sure the pg_missing_items created before Octopus can be
correctly (fully) recovered too.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years agoosd/PGLog: more verbose missing set log at boot
xie xingguo [Wed, 21 Aug 2019 06:02:57 +0000 (14:02 +0800)]
osd/PGLog: more verbose missing set log at boot

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years agoosd/PGLog: trigger full recovery for divergent missing objects
xie xingguo [Wed, 21 Aug 2019 02:33:42 +0000 (10:33 +0800)]
osd/PGLog: trigger full recovery for divergent missing objects

They might have a dirty/invalid log history (and hence an invalid
clean_regions as well), and there is no easy way to deduce the
complete clean_regions portion.

For simplicity (and correctness), disable potential incremental recovery
mode for these objects.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years agoosd/PGLog: disable incremental recovery for pre-kraken versions
xie xingguo [Tue, 20 Aug 2019 02:43:43 +0000 (10:43 +0800)]
osd/PGLog: disable incremental recovery for pre-kraken versions

Since kraken, we always persist the missing set explicitly
(see https://github.com/ceph/ceph/pull/10334), so manually
building the missing set is only needed for compatibility
with pre-kraken versions.

For safety, explicitly disable incremental recovery if we have
to completely re-build the missing set at boot up.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
5 years agomgr/BaseMgrModule: tolerate Int or Long for health 'count' 29806/head
Sage Weil [Wed, 21 Aug 2019 19:41:08 +0000 (14:41 -0500)]
mgr/BaseMgrModule: tolerate Int or Long for health 'count'

Signed-off-by: Sage Weil <sage@redhat.com>
5 years agoMerge pull request #29804 from alfredodeza/wip-rm41378
Andrew Schoen [Wed, 21 Aug 2019 21:59:18 +0000 (16:59 -0500)]
Merge pull request #29804 from alfredodeza/wip-rm41378

ceph-volume tests set the noninteractive flag for Debian

Reviewed-by: Andrew Schoen <aschoen@redhat.com>
5 years agoMerge PR #29744 into master
Sage Weil [Wed, 21 Aug 2019 20:02:27 +0000 (15:02 -0500)]
Merge PR #29744 into master

* refs/pull/29744/head:
qa/run-standalone.sh: fix python path
qa/standalone/mon/health-mute.sh: fix up ratchet test
qa/standalone/mon/health-mute.sh: s/kill daemons/kill_daemons/

Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Sebastian Wagner <swagner@suse.com>
5 years agoMerge PR #29749 into master
Sage Weil [Wed, 21 Aug 2019 20:02:14 +0000 (15:02 -0500)]
Merge PR #29749 into master

* refs/pull/29749/head:
mon/HealthMonitor: remove unused label

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge PR #29757 into master
Sage Weil [Wed, 21 Aug 2019 20:02:01 +0000 (15:02 -0500)]
Merge PR #29757 into master

* refs/pull/29757/head:
osd: always initialize local variable

Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge PR #29763 into master
Sage Weil [Wed, 21 Aug 2019 20:01:50 +0000 (15:01 -0500)]
Merge PR #29763 into master

* refs/pull/29763/head:
qa/suites/rados: whitelist POOL_APP_NOT_ENABLED warning

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agomgr/pg_autoscaler: fix race with pool deletion 29807/head
Sage Weil [Wed, 21 Aug 2019 19:56:43 +0000 (14:56 -0500)]
mgr/pg_autoscaler: fix race with pool deletion

The pool_stats map comes from a get('df') that may not include a pool
because it was just deleted.

Fixes: https://tracker.ceph.com/issues/41386
Signed-off-by: Sage Weil <sage@redhat.com>
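
A minimal sketch of tolerating a pool that is missing from the stats snapshot is shown below. The function name `pool_bytes_used`, the `bytes_used` key, and the shape of both arguments are assumptions for illustration, not the module's real data structures.

```python
def pool_bytes_used(pool_ids, pool_stats):
    """Return {pool_id: bytes_used} for pools present in the stats snapshot.

    pool_ids:   iterable of pool ids from the pool listing (assumed shape)
    pool_stats: dict of pool_id -> {'bytes_used': int}, built from a
                'df' snapshot that may lag behind the pool listing
    """
    result = {}
    for pool_id in pool_ids:
        stats = pool_stats.get(pool_id)
        if stats is None:
            # The pool has no entry in the snapshot (for example because
            # it was just deleted), so skip it rather than raise KeyError.
            continue
        result[pool_id] = stats.get("bytes_used", 0)
    return result
```

Indexing with `pool_stats[pool_id]` would raise `KeyError` in the same situation, which is the kind of failure a lookup with `.get()` avoids.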
5 years agoceph-volume tests set the noninteractive flag for Debian, to avoid prompts in apt 29804/head
Alfredo Deza [Wed, 21 Aug 2019 18:15:32 +0000 (14:15 -0400)]
ceph-volume tests set the noninteractive flag for Debian, to avoid prompts in apt

Signed-off-by: Alfredo Deza <adeza@redhat.com>
5 years agoMerge PR #28378 into master
Patrick Donnelly [Wed, 21 Aug 2019 17:57:15 +0000 (10:57 -0700)]
Merge PR #28378 into master

* refs/pull/28378/head:
qa/tasks: introduce Thrasher base class
qa/tasks: Fix typo
qa/tasks: manage thrashers
qa/tasks: start DaemonWatchdog when ceph starts
qa/tasks: make watch and bark handle more daemons
qa/tasks: move DaemonWatchdog to new file

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge pull request #29773 from dillaman/wip-41352
Mykola Golub [Wed, 21 Aug 2019 14:26:01 +0000 (17:26 +0300)]
Merge pull request #29773 from dillaman/wip-41352

pybind/mgr/rbd_support: fix missing variable in error path

Reviewed-by: Mykola Golub <mgolub@suse.com>
5 years agojournal: fix race between player shut down and cache rebalance 29796/head
Mykola Golub [Wed, 21 Aug 2019 14:04:47 +0000 (15:04 +0100)]
journal: fix race between player shut down and cache rebalance

25a23364 was supposed to fix this race, but it was not enough:
there was still a window between when `prefetch` is queued for
execution in handle_cache_rebalanced and when it is actually executed,
during which shut_down can be called and completed.

Signed-off-by: Mykola Golub <mgolub@suse.com>
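
One common way to close such a window is to have the queued callback re-check the shutdown state when it finally runs. The sketch below shows that general pattern only; it is not necessarily what this patch does, and the `Player` class, its attributes, and the list used as a work queue are all hypothetical.

```python
import threading


class Player(object):
    """Hypothetical player with a deferred prefetch and a shutdown flag."""

    def __init__(self):
        self._lock = threading.Lock()
        self._shut_down = False
        self.prefetch_count = 0

    def handle_cache_rebalanced(self, work_queue):
        # Queue the prefetch for later execution; it does not run yet.
        work_queue.append(self._prefetch)

    def _prefetch(self):
        with self._lock:
            if self._shut_down:
                # shut_down() completed while this callback was queued.
                return
            self.prefetch_count += 1

    def shut_down(self):
        with self._lock:
            self._shut_down = True
```

If `shut_down()` runs between queueing and execution, the deferred `_prefetch` becomes a no-op instead of touching a player that has already been torn down.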
5 years agoMerge pull request #29659 from jan--f/c-v-simple-functional-no-lvm-zap
Jan Fajerski [Wed, 21 Aug 2019 07:17:30 +0000 (09:17 +0200)]
Merge pull request #29659 from jan--f/c-v-simple-functional-no-lvm-zap

ceph-volume: don't try to test lvm zap on simple tests

5 years agoqa/tasks: introduce Thrasher base class 28378/head
Jos Collin [Mon, 5 Aug 2019 10:52:10 +0000 (16:22 +0530)]
qa/tasks: introduce Thrasher base class

* Introduced a Thrasher base class.
* Updated thrashers to inherit from Thrasher.
* Replaced the magic variable e with Thrasher.exception as per the discussion.
  The exception variable is now set by default because the thrashers inherit
  from the Thrasher class.

Fixes: https://github.com/ceph/ceph/pull/28378#discussion_r309337928
Fixes: https://tracker.ceph.com/issues/41133
Signed-off-by: Jos Collin <jcollin@redhat.com>
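
The shape of the change listed above might look roughly like the sketch below. Only the `Thrasher` name and its `exception` attribute come from the commit message; `ExampleThrasher`, its `work` argument, and the use of `threading.Thread` are assumptions for illustration.

```python
import threading


class Thrasher(object):
    """Base class giving every thrasher an 'exception' attribute."""

    def __init__(self):
        # None until the thrasher records a failure.
        self.exception = None


class ExampleThrasher(Thrasher, threading.Thread):
    """Hypothetical thrasher that runs a callable in a thread."""

    def __init__(self, work):
        Thrasher.__init__(self)
        threading.Thread.__init__(self)
        self._work = work

    def run(self):
        try:
            self._work()
        except Exception as e:
            # Record the failure on the shared attribute so that a
            # watcher can inspect it after the thread finishes.
            self.exception = e
```

A caller can then check `thrasher.exception` on any thrasher without relying on an ad hoc variable such as `e`.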
5 years agorgw: distinguish different get_usage for usage log 17719/head
Jiaying Ren [Thu, 14 Sep 2017 07:30:45 +0000 (15:30 +0800)]
rgw: distinguish different get_usage for usage log

get_usage ops via the s3 endpoint are not the same as get_usage ops
via the admin endpoint in the rgw usage log categories.

Signed-off-by: Jiaying Ren <jiaying.ren@umcloud.com>
5 years agomon/HealthMonitor: remove unused label 29749/head
Kefu Chai [Tue, 20 Aug 2019 02:05:09 +0000 (10:05 +0800)]
mon/HealthMonitor: remove unused label

move all the sanity checks into `HealthMonitor::preprocess_command()`.

this change silences the warning:

warning: label 'reply' defined but not used [-Wunused-label]

Signed-off-by: Kefu Chai <kchai@redhat.com>