git.apps.os.sepia.ceph.com Git - ceph-ci.git/log
ceph-ci.git
4 years ago mds: resolve SIGSEGV in waiting for uncommitted fragments
Patrick Donnelly [Thu, 30 Jul 2020 02:36:28 +0000 (19:36 -0700)]
mds: resolve SIGSEGV in waiting for uncommitted fragments

The MDSGatherBuilder was not correctly used / wired up.

Fixes: https://tracker.ceph.com/issues/46765
Fixes: 77eb368d2d35f2418875227fff9a34b5ef15a290
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago Merge pull request #36370 from tchaikov/wip-crimson-scrub2
Kefu Chai [Thu, 30 Jul 2020 15:24:11 +0000 (23:24 +0800)]
Merge pull request #36370 from tchaikov/wip-crimson-scrub2

crimson/osd: handle MOSDScrub2

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years ago Merge pull request #36286 from sebastian-philipp/cephadm-notify_config-ceph-conf...
Sebastian Wagner [Thu, 30 Jul 2020 14:20:44 +0000 (16:20 +0200)]
Merge pull request #36286 from sebastian-philipp/cephadm-notify_config-ceph-conf-race

mgr/cephadm: revamp ceph.conf distribution scheduling

Reviewed-by: Ricardo Marques <rimarques@suse.com>
4 years ago Merge pull request #36231 from tspmelo/wip-fix-overflow
Kiefer Chang [Thu, 30 Jul 2020 14:02:37 +0000 (22:02 +0800)]
Merge pull request #36231 from tspmelo/wip-fix-overflow

mgr/dashboard: Configure overflow of popover in health page

Reviewed-by: Ni-Feng Chang <kiefer.chang@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
4 years ago Merge pull request #36243 from Devp00l/wip-46660
Volker Theile [Thu, 30 Jul 2020 13:00:59 +0000 (15:00 +0200)]
Merge pull request #36243 from Devp00l/wip-46660

mgr/dashboard: Fix regression on table error handling

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
4 years ago Merge pull request #36324 from dillaman/wip-46737
Mykola Golub [Thu, 30 Jul 2020 12:04:10 +0000 (15:04 +0300)]
Merge pull request #36324 from dillaman/wip-46737

librbd: ensure image cannot be closed until in-flight IO callbacks complete

Reviewed-by: Mykola Golub <mgolub@suse.com>
4 years ago crimson/osd: handle MOSDScrub2
Kefu Chai [Thu, 30 Jul 2020 11:45:23 +0000 (19:45 +0800)]
crimson/osd: handle MOSDScrub2

MOSDScrub2 is sent from mgr for serving "ceph pg
{scrub|deep-scrub|repair}" commands when it's talking to a mimic or newer OSD.

the ceph task checks if all pgs are scrubbed by looking at the `last_scrub_stamp` fields
in the `ceph pg dump` output, and it requests a deep scrub of the
not-yet-scrubbed pgs to ensure they are scrubbed before the timeout.

in this change, crimson handles MOSDScrub2 by starting a remote peering
request, and the underlying peering_state will notify the corresponding
PG to start the scrub. to get the test to pass, a minimal implementation is
added to update the scrub timestamp to `now` upon request of
peering_state.

we will need to add the correct scrubbing support later, but this is
enough for passing the thrasher test and for preparing for more tests
which use the "ceph" task.

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #36281 from sebastian-philipp/mgr-tox-rm-cov
Sebastian Wagner [Thu, 30 Jul 2020 11:42:53 +0000 (13:42 +0200)]
Merge pull request #36281 from sebastian-philipp/mgr-tox-rm-cov

pybind/mgr: remove coverage from tox.ini

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
4 years ago Merge pull request #36335 from mgfritch/cephadm-vstart-mon-cidr
Sebastian Wagner [Thu, 30 Jul 2020 10:27:08 +0000 (12:27 +0200)]
Merge pull request #36335 from mgfritch/cephadm-vstart-mon-cidr

vstart: infer the mon public_network

Reviewed-by: Joshua Schmid <jschmid@suse.de>
Reviewed-by: Sage Weil <sage@redhat.com>
4 years ago Merge pull request #36334 from mgfritch/cephadm-event-multiline
Sebastian Wagner [Thu, 30 Jul 2020 10:13:05 +0000 (12:13 +0200)]
Merge pull request #36334 from mgfritch/cephadm-event-multiline

mgr/orch: allow for multiline OrchestratorEvent message

Reviewed-by: Joshua Schmid <jschmid@suse.de>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
4 years ago Merge pull request #36162 from Daniel-Pivonka/cephadm-43681
Sebastian Wagner [Thu, 30 Jul 2020 09:55:33 +0000 (11:55 +0200)]
Merge pull request #36162 from Daniel-Pivonka/cephadm-43681

mgr/cephadm: streamline rgw deployment

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
4 years ago Merge pull request #36362 from tchaikov/wip-crimson-read-write-file
Kefu Chai [Thu, 30 Jul 2020 09:16:14 +0000 (17:16 +0800)]
Merge pull request #36362 from tchaikov/wip-crimson-read-write-file

crimson: extract read_file()

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years ago crimson: move read_file() into common/buffer_io
Kefu Chai [Wed, 29 Jul 2020 08:01:20 +0000 (16:01 +0800)]
crimson: move read_file() into common/buffer_io

so it can be reused by other components in crimson

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years ago crimson/common: move write_file into crimson namespace
Kefu Chai [Wed, 29 Jul 2020 07:52:00 +0000 (15:52 +0800)]
crimson/common: move write_file into crimson namespace

it simply does not belong to ceph::buffer

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #36355 from batrick/ptl-tool-paging
Kefu Chai [Thu, 30 Jul 2020 04:17:36 +0000 (12:17 +0800)]
Merge pull request #36355 from batrick/ptl-tool-paging

script/ptl-tool: page through github response

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #36093 from sebastian-philipp/build-integration-branch-stable...
Kefu Chai [Thu, 30 Jul 2020 03:15:38 +0000 (11:15 +0800)]
Merge pull request #36093 from sebastian-philipp/build-integration-branch-stable-branch-name

build-integration-branch: Append stable branch name

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge PR #36288 into master
Patrick Donnelly [Thu, 30 Jul 2020 03:13:10 +0000 (20:13 -0700)]
Merge PR #36288 into master

* refs/pull/36288/head:
mds/CInode: Optimize only pinned by subtrees check

Reviewed-by: Zheng Yan <zyan@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago script/ptl-tool: page through github response
Patrick Donnelly [Tue, 14 Jul 2020 02:50:00 +0000 (19:50 -0700)]
script/ptl-tool: page through github response

This fixes the script to go through the response pages from GitHub.
Previously it would only look at the first page and potentially miss
some reviews/comments.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
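
The paging fix described in this commit can be illustrated with a minimal sketch. It is not the ptl-tool code: `iter_all_items`, `fetch_page`, `fake_fetch`, and the `FAKE_PAGES` data are hypothetical names introduced here, and the fetcher is assumed to return the items of one page plus the URL of the next page (or `None` on the last page).

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical page shape: (items on this page, URL of the next page or None).
Page = Tuple[List[dict], Optional[str]]


def iter_all_items(first_url: str, fetch_page: Callable[[str], Page]) -> Iterator[dict]:
    """Yield items from every page, following the 'next' URL until there is none."""
    url: Optional[str] = first_url
    while url is not None:
        items, url = fetch_page(url)
        yield from items


# Stand-in for a paginated API: three reviews split across two pages.
FAKE_PAGES = {
    "reviews?page=1": ([{"user": "alice"}, {"user": "bob"}], "reviews?page=2"),
    "reviews?page=2": ([{"user": "carol"}], None),
}


def fake_fetch(url: str) -> Page:
    """Look the page up in the in-memory table instead of doing a network call."""
    return FAKE_PAGES[url]


# Reading only the first page misses "carol"; walking all pages does not.
first_page_only = [item["user"] for item in fake_fetch("reviews?page=1")[0]]
reviewers = [item["user"] for item in iter_all_items("reviews?page=1", fake_fetch)]
```

Passing the fetcher in as a callable keeps the loop independent of any particular HTTP library, which is why no real GitHub call appears in the sketch.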
4 years ago Merge PR #36249 into master
Patrick Donnelly [Thu, 30 Jul 2020 03:06:14 +0000 (20:06 -0700)]
Merge PR #36249 into master

* refs/pull/36249/head:
client: expose ceph.quota.max_bytes xattr within snapshots

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago Merge PR #35977 into master
Patrick Donnelly [Thu, 30 Jul 2020 03:00:06 +0000 (20:00 -0700)]
Merge PR #35977 into master

* refs/pull/35977/head:
doc/cephfs: Update about cephfs-shell custom exit codes
cephfs-shell: Define cephfs-shell exit code

Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago Merge PR #36064 into master
Patrick Donnelly [Thu, 30 Jul 2020 02:58:16 +0000 (19:58 -0700)]
Merge PR #36064 into master

* refs/pull/36064/head:
mgr/volumes: Fix traceback of ops when volume doesn't exist

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago Merge PR #36200 into master
Patrick Donnelly [Thu, 30 Jul 2020 02:56:57 +0000 (19:56 -0700)]
Merge PR #36200 into master

* refs/pull/36200/head:
mds: fix mds peer request 'no_available_op_found'

Reviewed-by: Zheng Yan <zyan@redhat.com>
4 years ago Merge PR #36233 into master
Patrick Donnelly [Thu, 30 Jul 2020 02:55:28 +0000 (19:55 -0700)]
Merge PR #36233 into master

* refs/pull/36233/head:
client: fix extra open ref decrease

Reviewed-by: Jeff Layton <jlayton@redhat.com>
4 years ago mgr/cephadm: streamline rgw deployment
Daniel-Pivonka [Thu, 16 Jul 2020 12:24:47 +0000 (08:24 -0400)]
mgr/cephadm: streamline rgw deployment

cephadm will create realm, zonegroup, and zone if needed before creating rgw service

fixes: https://tracker.ceph.com/issues/43681
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
4 years ago Merge PR #24068 into master
Patrick Donnelly [Wed, 29 Jul 2020 18:05:02 +0000 (11:05 -0700)]
Merge PR #24068 into master

* refs/pull/24068/head:
mds: rename {CDir,Migrator}::cache to mdcache
mds: make MDSCacheObject::is_ambiguous_auth() virtual
mds: make sure rename old inode's parent dirfrag is projected.
mds: track projected inode/fnode in Mutation
mds: use smart pointer to manager CDir::fnode
mds: use smart pointer to manage CInode::{inode,xattrs,old_inodes}
osdc/Filer: make layout pointer const

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago mgr/orch: allow for multiline OrchestratorEvent message
Michael Fritch [Tue, 28 Jul 2020 19:57:17 +0000 (13:57 -0600)]
mgr/orch: allow for multiline OrchestratorEvent message

Signed-off-by: Michael Fritch <mfritch@suse.com>
4 years ago Merge pull request #36053 from tchaikov/wip-mkdir
Kefu Chai [Wed, 29 Jul 2020 14:48:04 +0000 (22:48 +0800)]
Merge pull request #36053 from tchaikov/wip-mkdir

kv: replace compat_mkdir with fs::create_directory

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years ago mgr/dashboard: Fix regression on table error handling
Stephan Müller [Wed, 22 Jul 2020 14:00:32 +0000 (16:00 +0200)]
mgr/dashboard: Fix regression on table error handling

The regression was introduced by #35290 through the use of the new tab module.
The pools and host listings got wrapped into the new usage; however, they needed
to use the table as a ViewChild, and the table was static before but is now
dynamic. This resulted in an empty variable that wasn't filled with the
right table object. Calling ".reset()" was not possible
during an error case and produced an error in the console when trying to access
"reset" of undefined. Without the "reset" call, the table gets stuck with a
rotating reload symbol.

Fixes: https://tracker.ceph.com/issues/46660
Signed-off-by: Stephan Müller <smueller@suse.com>
4 years ago mds: rename {CDir,Migrator}::cache to mdcache
Yan, Zheng [Tue, 23 Jul 2019 01:41:30 +0000 (09:41 +0800)]
mds: rename {CDir,Migrator}::cache to mdcache

make it consistent with CInode::mdcache

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
4 years ago mds: make MDSCacheObject::is_ambiguous_auth() virtual
Yan, Zheng [Tue, 23 Jul 2019 01:14:41 +0000 (09:14 +0800)]
mds: make MDSCacheObject::is_ambiguous_auth() virtual

CInode overrides is_ambiguous_auth(). Locker calls is_ambiguous_auth()
from base class.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
4 years ago mds: make sure rename old inode's parent dirfrag is projected.
Yan, Zheng [Thu, 7 May 2020 02:33:12 +0000 (10:33 +0800)]
mds: make sure rename old inode's parent dirfrag is projected.

if the rename dest dentry is a remote dentry, Server::_rename_prepare() only
pre-dirties the old inode, but does not project the fnode for the old inode's parent
dirfrag. This will trigger an assertion (introduced by the previous commit)
in CDir::mark_dirty().

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
4 years ago mds: track projected inode/fnode in Mutation
Yan, Zheng [Tue, 9 Jul 2019 10:15:35 +0000 (18:15 +0800)]
mds: track projected inode/fnode in Mutation

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
4 years ago mds: use smart pointer to manager CDir::fnode
Yan, Zheng [Sat, 14 Jul 2018 08:33:19 +0000 (16:33 +0800)]
mds: use smart pointer to manager CDir::fnode

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
4 years ago mds: use smart pointer to manage CInode::{inode,xattrs,old_inodes}
Yan, Zheng [Thu, 16 Jul 2020 03:19:10 +0000 (11:19 +0800)]
mds: use smart pointer to manage CInode::{inode,xattrs,old_inodes}

this avoids copying the whole inode_t and xattr map when journaling inodes.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
4 years ago Merge pull request #36307 from petrutlucian94/rocksdb_lz4
Kefu Chai [Wed, 29 Jul 2020 12:35:13 +0000 (20:35 +0800)]
Merge pull request #36307 from petrutlucian94/rocksdb_lz4

cmake: fix lz4 params when building rocksdb

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #36295 from majianpeng/bluefs-reduce-unnecessary-flush
Kefu Chai [Wed, 29 Jul 2020 12:30:11 +0000 (20:30 +0800)]
Merge pull request #36295 from majianpeng/bluefs-reduce-unnecessary-flush

os/bluestore/BlueFS: Don't flush unused device.

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
4 years ago Merge pull request #36232 from mgfritch/cephadm-ok-to-stop
Kefu Chai [Wed, 29 Jul 2020 12:29:01 +0000 (20:29 +0800)]
Merge pull request #36232 from mgfritch/cephadm-ok-to-stop

mgr/cephadm: add `orch ok-to-stop` commands

Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Ricardo Marques <rimarques@suse.com>
4 years ago Merge pull request #34848 from changchengx/protocolv2
Kefu Chai [Wed, 29 Jul 2020 12:25:56 +0000 (20:25 +0800)]
Merge pull request #34848 from changchengx/protocolv2

refine class member function implementation in ProtocolV2

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #36007 from votdev/issue_46448_hosts_unit_tests
Tatjana Dehler [Wed, 29 Jul 2020 11:33:14 +0000 (13:33 +0200)]
Merge pull request #36007 from votdev/issue_46448_hosts_unit_tests

mgr/dashboard: Add hosts page unit tests

Reviewed-by: Sebastian Krah <skrah@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
4 years ago librbd: potentially delay completion of image dispatcher spec
Jason Dillaman [Wed, 29 Jul 2020 11:30:28 +0000 (07:30 -0400)]
librbd: potentially delay completion of image dispatcher spec

If an AioCompletion is being completed for an external API user, ensure
that the completion of image dispatcher finalizer does not race with the
potential to close the image.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
4 years ago Merge pull request #36320 from s0nea/wip-dashboard-46573
Volker Theile [Wed, 29 Jul 2020 11:24:14 +0000 (13:24 +0200)]
Merge pull request #36320 from s0nea/wip-dashboard-46573

mgr/dashboard: wait longer for health status to be cleared

Reviewed-by: Ni-Feng Chang <kiefer.chang@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
4 years ago client: expose ceph.quota.max_bytes xattr within snapshots
Shyamsundar Ranganathan [Wed, 22 Jul 2020 19:21:50 +0000 (15:21 -0400)]
client: expose ceph.quota.max_bytes xattr within snapshots

For directories within snapshots, expose the ceph.quota.max_bytes
extended attribute information. This enables fetching the quota
information as of when the snapshot was taken and is particularly useful
when cloning subvolume snapshots, to enforce the quota on the
clone subvolume as well.

Fixes: https://tracker.ceph.com/issues/46278
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
4 years ago Merge pull request #36323 from tchaikov/wip-crimson-msgr-v1-v2
Kefu Chai [Wed, 29 Jul 2020 09:05:36 +0000 (17:05 +0800)]
Merge pull request #36323 from tchaikov/wip-crimson-msgr-v1-v2

crimson: picking peer addr of the compatible type

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years ago Merge pull request #35728 from jan--f/c-v-add-subcommand-parse-drive-groups
Jan Fajerski [Wed, 29 Jul 2020 08:56:57 +0000 (10:56 +0200)]
Merge pull request #35728 from jan--f/c-v-add-subcommand-parse-drive-groups

ceph-volume: add drive-group subcommand

4 years ago Merge pull request #36342 from tchaikov/wip-crimson-heartbeat-erase
Kefu Chai [Wed, 29 Jul 2020 08:48:23 +0000 (16:48 +0800)]
Merge pull request #36342 from tchaikov/wip-crimson-heartbeat-erase

crimson/osd: erase an element by iterator instead

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
4 years ago Merge pull request #36217 from sebastian-philipp/cephadm-common-mypy-ini
Sebastian Wagner [Wed, 29 Jul 2020 08:19:42 +0000 (10:19 +0200)]
Merge pull request #36217 from sebastian-philipp/cephadm-common-mypy-ini

cephadm: use src/mypy.ini instead

Reviewed-by: Joshua Schmid <jschmid@suse.de>
Reviewed-by: Michael Fritch <mfritch@suse.com>
4 years ago Merge pull request #36235 from matthewoliver/cephadm_iscsi_tcmu_runner
Sebastian Wagner [Wed, 29 Jul 2020 08:15:07 +0000 (10:15 +0200)]
Merge pull request #36235 from matthewoliver/cephadm_iscsi_tcmu_runner

cephadm: Add tcmu-runner container when deploying ceph-iscsi

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
4 years ago crimson/osd: implement cls_get_pool_stripe_width
Kefu Chai [Wed, 29 Jul 2020 07:33:59 +0000 (15:33 +0800)]
crimson/osd: implement cls_get_pool_stripe_width

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years ago crimson/osd: erase an element by iterator instead
Kefu Chai [Wed, 29 Jul 2020 06:37:57 +0000 (14:37 +0800)]
crimson/osd: erase an element by iterator instead

we should not remove an element while iterating over a map, as erasing
the element invalidates the iterator, which causes a segfault when we
advance it after erasing the dereferenced element.

in this change, an iterator is used for walking through the map. in
comparison with creating a to-be-removed list, this approach is more
efficient and more idiomatic.

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #36341 from tchaikov/wip-crimson-cls
Kefu Chai [Wed, 29 Jul 2020 06:18:56 +0000 (14:18 +0800)]
Merge pull request #36341 from tchaikov/wip-crimson-cls

crimson/osd: correct the function name of cls_cxx_map_get_vals_by_keys()

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years ago crimson/osd: correct the function name of cls_cxx_map_get_vals_by_keys()
Kefu Chai [Wed, 29 Jul 2020 04:32:26 +0000 (12:32 +0800)]
crimson/osd: correct the function name of cls_cxx_map_get_vals_by_keys()

it was an oversight in 7a4c6359e483f8c71276ece5cde16eb0771ac5d2

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #36079 from winndows/superfluous_break6
Kefu Chai [Wed, 29 Jul 2020 01:44:46 +0000 (09:44 +0800)]
Merge pull request #36079 from winndows/superfluous_break6

msg: Remove superfluous breaks

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago Merge pull request #36297 from dvanders/dvanders_46443
Neha Ojha [Wed, 29 Jul 2020 01:11:12 +0000 (18:11 -0700)]
Merge pull request #36297 from dvanders/dvanders_46443

osd: fix crash in _committed_osd_maps if incremental osdmap crc fails

Reviewed-by: Xiaoxi Chen <xiaoxchen@ebay.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years ago mgr/cephadm: add `orch host ok-to-stop` command
Michael Fritch [Tue, 21 Jul 2020 21:06:19 +0000 (15:06 -0600)]
mgr/cephadm: add `orch host ok-to-stop` command

$ ceph orch host ok-to-stop host1
It is presumed safe to stop host host1

Signed-off-by: Michael Fritch <mfritch@suse.com>
4 years ago mgr/cephadm: return HandleCommandResult from ok_to_stop
Michael Fritch [Tue, 21 Jul 2020 21:26:43 +0000 (15:26 -0600)]
mgr/cephadm: return HandleCommandResult from ok_to_stop

- return output from the result of the ok_to_stop command
- log ok-to-stop result during all invocations

Signed-off-by: Michael Fritch <mfritch@suse.com>
4 years ago mgr/orch: add errno to OrchestratorError
Michael Fritch [Wed, 22 Jul 2020 23:43:05 +0000 (17:43 -0600)]
mgr/orch: add errno to OrchestratorError

add errno to OrchestratorError and ServiceSpecValidationError exceptions

Signed-off-by: Michael Fritch <mfritch@suse.com>
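
As a rough illustration of what carrying an errno on an exception enables, here is a minimal sketch. It does not reproduce the real `OrchestratorError` class: `SketchOrchestratorError`, `run_command`, `stop_host`, and the error message are hypothetical, and the negative-errno return convention is an assumption made for the example.

```python
import errno


class SketchOrchestratorError(Exception):
    """Hypothetical stand-in for an orchestrator error that carries an errno."""

    def __init__(self, msg: str, errno_: int = -errno.EINVAL) -> None:
        super().__init__(msg)
        # Assumed convention: store the errno as a negative return code.
        self.errno = errno_


def run_command(fn):
    """Run fn and map a SketchOrchestratorError to a (retval, out, err) triple."""
    try:
        return 0, fn(), ""
    except SketchOrchestratorError as e:
        return e.errno, "", str(e)


def stop_host():
    """Hypothetical command body that fails with a specific errno."""
    raise SketchOrchestratorError("host1 is not safe to stop", errno_=-errno.EBUSY)


result = run_command(stop_host)
```

With the errno attached to the exception, the caller can return a specific code per failure instead of one generic value.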
4 years ago qa/suites/rados/thrash/crc-failures: randomly inject bad incremental osdmap crc
Neha Ojha [Tue, 28 Jul 2020 17:36:09 +0000 (10:36 -0700)]
qa/suites/rados/thrash/crc-failures: randomly inject bad incremental osdmap crc

Signed-off-by: Neha Ojha <nojha@redhat.com>
4 years ago osd: don't write transaction when inc crc failed
Dan van der Ster [Mon, 27 Jul 2020 15:40:27 +0000 (17:40 +0200)]
osd: don't write transaction when inc crc failed

80da5f9a987c6a48b93f25228fdac85890013520 exposed a flaw in how
handle_osd_map falls back to a full osdmap if the crc of an incremental
failed.

If the first message in a map message had a crc error, then the
loop would exit with last < start, which would then cause a null
dereference in _committed_osd_maps.

Fixes: https://tracker.ceph.com/issues/46443
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
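
The `last < start` condition described in this commit can be modelled in a few lines. This is a simplified sketch, not the OSD code: `apply_incrementals` and the `crc_ok` table are hypothetical, and the model only tracks which epochs were applied.

```python
from typing import Dict, Optional


def apply_incrementals(start: int, end: int, crc_ok: Dict[int, bool]) -> Optional[int]:
    """Apply epochs start..end in order; return the last applied epoch, or None.

    Hypothetical model: crc_ok[e] is False when incremental e fails its crc
    check; epochs missing from the table are treated as good.
    """
    last = start - 1  # nothing applied yet
    for epoch in range(start, end + 1):
        if not crc_ok.get(epoch, True):
            break  # stop at the first bad incremental
        last = epoch
    if last < start:
        # The very first incremental failed, so no map was applied.
        # Return early instead of running a commit step that expects one.
        return None
    return last
```

For example, `apply_incrementals(10, 12, {10: False})` returns `None`, which is the case the guard exists for, while a failure at epoch 11 still leaves epoch 10 applied.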
4 years ago qa/standalone/osd: add bad-inc-map.sh
Dan van der Ster [Mon, 27 Jul 2020 12:23:54 +0000 (14:23 +0200)]
qa/standalone/osd: add bad-inc-map.sh

Test that the osd doesn't crash when it gets a bad incremental osdmap.

Related-to: https://tracker.ceph.com/issues/46443
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
4 years ago vstart: infer the mon public_network
Michael Fritch [Tue, 28 Jul 2020 17:24:21 +0000 (11:24 -0600)]
vstart: infer the mon public_network

set the mon public_network when deploying with the cephadm flag

Signed-off-by: Michael Fritch <mfritch@suse.com>
4 years ago Merge pull request #36287 from dillaman/wip-librbd-close
Mykola Golub [Tue, 28 Jul 2020 18:30:07 +0000 (21:30 +0300)]
Merge pull request #36287 from dillaman/wip-librbd-close

librbd: use task finisher thread for image open/close callbacks

Reviewed-by: Mykola Golub <mgolub@suse.com>
4 years ago Merge pull request #36253 from changchengx/exclusive
Jason Dillaman [Tue, 28 Jul 2020 17:33:17 +0000 (13:33 -0400)]
Merge pull request #36253 from changchengx/exclusive

doc: specify RBD_LOCK_MODE_EXCLUSIVE for exclusive-lock

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
4 years ago Merge pull request #36328 from tchaikov/wip-crimson-cls_cxx_map_get_vals
Kefu Chai [Tue, 28 Jul 2020 15:22:14 +0000 (23:22 +0800)]
Merge pull request #36328 from tchaikov/wip-crimson-cls_cxx_map_get_vals

crimson/osd: implement cls_cxx_map_get_vals()

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years ago crimson/osd: implement cls_cxx_map_get_vals()
Kefu Chai [Tue, 28 Jul 2020 14:39:21 +0000 (22:39 +0800)]
crimson/osd: implement cls_cxx_map_get_vals()

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years ago doc: specify RBD_LOCK_MODE_EXCLUSIVE for exclusive-lock
Changcheng Liu [Thu, 23 Jul 2020 03:09:46 +0000 (11:09 +0800)]
doc: specify RBD_LOCK_MODE_EXCLUSIVE for exclusive-lock

The exclusive-lock can be transitioned transparently between clients
after a write operation finishes. To disable this "transparent" transition,
the lock needs to be acquired with RBD_LOCK_MODE_EXCLUSIVE.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
4 years ago librbd: ensure image cannot be closed until in-flight IO callbacks complete
Jason Dillaman [Tue, 28 Jul 2020 13:07:49 +0000 (09:07 -0400)]
librbd: ensure image cannot be closed until in-flight IO callbacks complete

If a librbd client attempts to close the image while it still has in-flight IO
pending, it's possible for the AsyncOperation tracker which prevents the image
from being closed to be completed before the actual AioCompletion callback
fires. This can result in the now destructed ImageCtx being de-referenced by
the AioCompletion.

Fixes: https://tracker.ceph.com/issues/46737
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
4 years ago Merge pull request #34928 from p-se/wip-pse-revise-monitoring-doc
Kefu Chai [Tue, 28 Jul 2020 12:49:58 +0000 (20:49 +0800)]
Merge pull request #34928 from p-se/wip-pse-revise-monitoring-doc

mgr/dashboard: revise monitoring documentation

Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
4 years ago crimson: use pick_addr() for picking peer addr
Kefu Chai [Tue, 28 Jul 2020 12:32:59 +0000 (20:32 +0800)]
crimson: use pick_addr() for picking peer addr

in teuthology tests, there is a good chance that we have ceph.conf
containing:

mon host = 172.21.15.122

which is translated to two monitors

- a: 172.21.15.122:3300
- a-legacy: 172.21.15.122:6789

both have a protocol type of "any". so, to enable crimson to use settings
like this, we should let crimson accept them, and drop the connection
if the peer claims to be using an incompatible protocol when they are
exchanging banners.

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years ago mgr/dashboard: wait longer for health status to be cleared
Tatjana Dehler [Tue, 28 Jul 2020 11:18:56 +0000 (13:18 +0200)]
mgr/dashboard: wait longer for health status to be cleared

For unspecified reasons, the cluster needs more time to recover from
HEALTH_WARN while changes are made by `test_pool_update_metadata`.
Let's wait several times for the cluster status to be HEALTH_OK
again.

Fixes: https://tracker.ceph.com/issues/46573
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
4 years ago crimson/mon: use mon with only v2 address
Kefu Chai [Tue, 28 Jul 2020 12:32:00 +0000 (20:32 +0800)]
crimson/mon: use mon with only v2 address

crimson msgr supports the v2 protocol now, so we can connect to a monitor
which only provides v2 addresses.

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years ago msg/msg_types.h: add pick_addr()
Kefu Chai [Tue, 28 Jul 2020 12:29:29 +0000 (20:29 +0800)]
msg/msg_types.h: add pick_addr()

for picking an addr from an entity_addrvec_t by given protocol type.
so:
  - v2 => v2, any
  - v1 => v1, any
  - any => any, v1, v2

and add a helper, `addr_of_type()`, to avoid repetition.

Signed-off-by: Kefu Chai <kchai@redhat.com>
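
The fallback order listed in this commit message can be sketched as a small lookup. This is an illustration only, not the C++ code in msg/msg_types.h: the `Addr` tuple, the `PREFERENCE` table, and the Python signatures of `addr_of_type` and `pick_addr` are assumptions made for the example.

```python
from typing import List, NamedTuple, Optional


class Addr(NamedTuple):
    """Hypothetical address record: protocol type plus host:port."""
    proto: str      # "v1", "v2" or "any"
    hostport: str


# Preference order per requested type, as listed in the commit message.
PREFERENCE = {
    "v2": ["v2", "any"],
    "v1": ["v1", "any"],
    "any": ["any", "v1", "v2"],
}


def addr_of_type(addrs: List[Addr], wanted: str) -> Optional[Addr]:
    """Return the first address whose type is exactly `wanted`, else None."""
    return next((a for a in addrs if a.proto == wanted), None)


def pick_addr(addrs: List[Addr], wanted: str) -> Optional[Addr]:
    """Try each acceptable type in preference order and return the first match."""
    for candidate in PREFERENCE[wanted]:
        found = addr_of_type(addrs, candidate)
        if found is not None:
            return found
    return None
```

A vector that holds only `any` addresses still yields a match for a `v2` request, while a vector that holds only a `v1` address yields `None` for the same request.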
4 years ago Merge pull request #36285 from sebastian-philipp/orch-completion-generic
Sebastian Wagner [Tue, 28 Jul 2020 12:46:20 +0000 (14:46 +0200)]
Merge pull request #36285 from sebastian-philipp/orch-completion-generic

mgr/orch: Add some more type annotations

Reviewed-by: Michael Fritch <mfritch@suse.com>
4 years ago Merge pull request #36258 from rhcs-dashboard/fix-cpu-stats
Volker Theile [Tue, 28 Jul 2020 12:00:25 +0000 (14:00 +0200)]
Merge pull request #36258 from rhcs-dashboard/fix-cpu-stats

mgr/dashboard: cpu stats incorrectly displayed

Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
4 years ago mgr/orch: Add some more type annotations
Sebastian Wagner [Fri, 24 Jul 2020 15:29:28 +0000 (17:29 +0200)]
mgr/orch: Add some more type annotations

Made `orch.Completion` a generic type

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
4 years ago mgr/cephadm: revamp ceph.conf distribution scheduling
Sebastian Wagner [Mon, 27 Jul 2020 12:27:12 +0000 (14:27 +0200)]
mgr/cephadm: revamp ceph.conf distribution scheduling

Having an in-memory list doesn't work properly: especially
when loading the mgr module, we didn't know if we should
deploy confs or not.

Now we only distribute ceph.confs if there is a new mon_map.
We also store that info in the config store now.

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
4 years ago mgr/cephadm: Add test verifying the initializaiton order
Sebastian Wagner [Mon, 27 Jul 2020 10:09:30 +0000 (12:09 +0200)]
mgr/cephadm: Add test verifying the initializaiton order

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
4 years ago Merge pull request #36012 from adk3798/cephadm_44886
Sebastian Wagner [Tue, 28 Jul 2020 09:54:12 +0000 (11:54 +0200)]
Merge pull request #36012 from adk3798/cephadm_44886

mgr/cephadm: allow use of authenticated registry

4 years ago Merge pull request #36262 from sebastian-philipp/orch-readd-apply_dg
Sebastian Wagner [Tue, 28 Jul 2020 09:52:53 +0000 (11:52 +0200)]
Merge pull request #36262 from sebastian-philipp/orch-readd-apply_dg

mgr/cephadm: re-add `apply_drivegroups()`

Reviewed-by: Joshua Schmid <jschmid@suse.de>
Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
4 years ago Merge pull request #36306 from smithfarm/wip-add-octopus-to-release-table
Nathan Cutler [Tue, 28 Jul 2020 09:33:30 +0000 (11:33 +0200)]
Merge pull request #36306 from smithfarm/wip-add-octopus-to-release-table

doc/releases: add "octopus" column to Release Timeline

Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years ago Merge pull request #36301 from sebastian-philipp/doc-cephadm-status-no-progress
Sebastian Wagner [Tue, 28 Jul 2020 09:29:01 +0000 (11:29 +0200)]
Merge pull request #36301 from sebastian-philipp/doc-cephadm-status-no-progress

doc/cephadm: `status` doesn't show a progress

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Zac Dover <zac.dover@gmail.com>
4 years ago cephadm: Add tcmu-runner container when deploying ceph-iscsi
Matthew Oliver [Wed, 22 Jul 2020 07:09:12 +0000 (17:09 +1000)]
cephadm: Add tcmu-runner container when deploying ceph-iscsi

Currently when we deploy ceph-iscsi via cephadm it doesn't include a
running tcmu-runner, which means initiators will be able to log in but
you won't see the LUNs on the initiator.

This patch deploys an additional tcmu-runner container alongside the
ceph-iscsi container that just runs the tcmu-runner service.

Fixes: https://tracker.ceph.com/issues/46540
Signed-off-by: Matthew Oliver <moliver@suse.com>
4 years ago mds/CInode: Optimize only pinned by subtrees check
Mark Nelson [Fri, 24 Jul 2020 05:29:15 +0000 (05:29 +0000)]
mds/CInode: Optimize only pinned by subtrees check

Fixes: https://tracker.ceph.com/issues/46727
Signed-off-by: Mark Nelson <mnelson@redhat.com>
4 years ago Merge pull request #36090 from inspur-wyq/wip-37532
Kefu Chai [Tue, 28 Jul 2020 02:21:24 +0000 (10:21 +0800)]
Merge pull request #36090 from inspur-wyq/wip-37532

mon: fix the 'Error ERANGE' message when conf "osd_objectstore" is filestore

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #36283 from rzarzynski/wip-bl-raw-privatization
Kefu Chai [Tue, 28 Jul 2020 02:20:00 +0000 (10:20 +0800)]
Merge pull request #36283 from rzarzynski/wip-bl-raw-privatization

common/bl: don't access raw::data nor raw::len directly. Use getters instead.

Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years ago cmake: fix lz4 params when building rocksdb
Lucian Petrut [Mon, 27 Jul 2020 13:57:59 +0000 (13:57 +0000)]
cmake: fix lz4 params when building rocksdb

Recent RocksDB versions use slightly different parameter names for
the LZ4 include/lib dirs, so we'll have to pass the right ones.

We'll also have to fix the "CMAKE_TOOLCHAIN_FILE" parameter,
which isn't passed properly.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
4 years ago doc/releases: add "octopus" column to Release Timeline
Nathan Cutler [Mon, 27 Jul 2020 15:40:58 +0000 (17:40 +0200)]
doc/releases: add "octopus" column to Release Timeline

Octopus has been out for a while. I suppose this should have been done
earlier, but "better late than never".

Signed-off-by: Nathan Cutler <ncutler@suse.com>
4 years ago Merge pull request #36245 from smithfarm/wip-mimic-is-eol
Nathan Cutler [Mon, 27 Jul 2020 15:39:22 +0000 (17:39 +0200)]
Merge pull request #36245 from smithfarm/wip-mimic-is-eol

doc/releases: Mimic is EOL

Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years ago Merge pull request #36279 from tchaikov/wip-crimson-msgr-v2.1
Kefu Chai [Mon, 27 Jul 2020 14:53:07 +0000 (22:53 +0800)]
Merge pull request #36279 from tchaikov/wip-crimson-msgr-v2.1

crimson/net: enable on-wire-encryt and v2.1 support

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years ago doc/cephadm: `status` doesn't show a progress
Sebastian Wagner [Mon, 27 Jul 2020 14:50:01 +0000 (16:50 +0200)]
doc/cephadm: `status` doesn't show a progress

Fixes: https://tracker.ceph.com/issues/45858
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
4 years ago Merge pull request #35852 from smithfarm/wip-opensuse-os-recommendations
Nathan Cutler [Mon, 27 Jul 2020 14:25:30 +0000 (16:25 +0200)]
Merge pull request #35852 from smithfarm/wip-opensuse-os-recommendations

doc/start/os-recommendations: current state of openSUSE

Reviewed-by: Tim Serong <tserong@suse.com>
4 years ago Merge pull request #36269 from dang/wip-dang-46692
Casey Bodley [Mon, 27 Jul 2020 13:54:46 +0000 (09:54 -0400)]
Merge pull request #36269 from dang/wip-dang-46692

RGW - fix bulkupload, broken by zipper

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years agocephadm: use src/mypy.ini instead
Sebastian Wagner [Mon, 20 Jul 2020 11:55:09 +0000 (13:55 +0200)]
cephadm: use src/mypy.ini instead

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
4 years agolibrbd: use task finisher thread for image open/close callbacks
Jason Dillaman [Fri, 24 Jul 2020 16:13:10 +0000 (12:13 -0400)]
librbd: use task finisher thread for image open/close callbacks

There was a potential race condition with utilizing the AsioEngine
to deliver asynchronous image open and close callbacks. This left
the potential for the io_context thread to attempt to destroy itself.

This commit changes the behavior of the image open and close callbacks
to always delete the ImageCtx (which now matches the synchronous API
behavior) and to always invoke the callback in the Finisher thread,
whose lifetime is tied to the CephContext.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
4 years agoMerge pull request #36219 from guits/guits-fix_zap_osdid_osdfsid
Jan Fajerski [Mon, 27 Jul 2020 08:29:21 +0000 (10:29 +0200)]
Merge pull request #36219 from guits/guits-fix_zap_osdid_osdfsid

ceph-volume: filter by osd-id or osd-fsid when zapping

4 years agoos/bluestore/BlueFS: Don't flush unused device.
Jianpeng Ma [Mon, 27 Jul 2020 06:59:08 +0000 (14:59 +0800)]
os/bluestore/BlueFS: Don't flush unused device.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
4 years agoceph-volume: filter by osd-id or osd-fsid when zapping
Guillaume Abrioux [Mon, 20 Jul 2020 13:43:38 +0000 (15:43 +0200)]
ceph-volume: filter by osd-id or osd-fsid when zapping

2f5c10c12c37e6865ce54bb4940d3779353cba4f introduced a bug:

The `ceph-volume lvm zap` command fails under certain conditions.

When `--osd-id` or `--osd-fsid` is passed to the `ceph-volume lvm zap`
command, it tries to zap additional devices that have nothing to do with
the osd being zapped.

When calling `api.get_lvs()` in `ensure_associated_lvs()` we have to
pass the osd-id/osd-fsid information so that only related devices are
returned by the `get_lvs()` method.

Closes: https://tracker.ceph.com/issues/46627
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
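
The filtering described in the ceph-volume commit above can be sketched as follows. This is a minimal illustration, not the actual ceph-volume code: the `Volume` class, the `filter_volumes()` helper and the tag keys `ceph.osd_id` / `ceph.osd_fsid` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Volume:
    """Hypothetical stand-in for a logical volume with its tags."""
    name: str
    tags: Dict[str, str] = field(default_factory=dict)


def filter_volumes(volumes: List[Volume],
                   osd_id: Optional[int] = None,
                   osd_fsid: Optional[str] = None) -> List[Volume]:
    """Return only the volumes whose tags match the given osd id/fsid."""
    wanted: Dict[str, str] = {}
    if osd_id is not None:
        wanted["ceph.osd_id"] = str(osd_id)    # assumed tag key
    if osd_fsid is not None:
        wanted["ceph.osd_fsid"] = osd_fsid     # assumed tag key
    # With no criteria, every volume is returned. Zapping that unfiltered
    # list is the kind of over-reach the commit message describes.
    return [v for v in volumes
            if all(v.tags.get(key) == value for key, value in wanted.items())]


volumes = [
    Volume("osd1-block", {"ceph.osd_id": "1", "ceph.osd_fsid": "aaaa"}),
    Volume("osd1-db", {"ceph.osd_id": "1", "ceph.osd_fsid": "aaaa"}),
    Volume("osd2-block", {"ceph.osd_id": "2", "ceph.osd_fsid": "bbbb"}),
]
related = filter_volumes(volumes, osd_id=1)
```

Here `related` holds only the two volumes tagged for osd 1, so the volume that belongs to osd 2 is never a zap candidate.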
4 years agomds: fix mds peer request 'no_available_op_found'
Yanhu Cao [Mon, 27 Jul 2020 02:23:01 +0000 (10:23 +0800)]
mds: fix mds peer request 'no_available_op_found'

Fixes: https://tracker.ceph.com/issues/46583
Signed-off-by: Yanhu Cao <gmayyyha@gmail.com>
4 years agomessages/MOSDBoot: pass OSDSuperblock by const ref
Kefu Chai [Sat, 25 Jul 2020 09:22:14 +0000 (17:22 +0800)]
messages/MOSDBoot: pass OSDSuperblock by const ref

MOSDBoot's ctor does not change the parameter, so let's pass by const
reference.

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agocrimson/os/alienstore: always use fsid in bluestore
Kefu Chai [Sat, 25 Jul 2020 09:13:41 +0000 (17:13 +0800)]
crimson/os/alienstore: always use fsid in bluestore

alienstore should not be stateful in this respect, it should proxy
all access to fsid to bluestore.

there are a couple of issues in the existing implementation:

* when mkfs, bluestore tries to generate a new osd_fsid if the specified
  one is empty. but we explicitly pass the given uuid down to
  AlienStore::mkfs() so the bluestore can use it. so we should pass it
  down instead of storing it locally.
* when persisting superblock in OSD::mkfs(), superblock.osd_fsid() is
  read from store->get_fsid(). if the user specifies an empty uuid, we
  should persist the generated uuid in the superblock.

in this change, all access to fsid is proxied to the underlying
bluestore.

Signed-off-by: Kefu Chai <kchai@redhat.com>
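
The proxying idea in the alienstore commit above can be illustrated with a small sketch. It is not the crimson code (which is C++): `BackingStore` and `ProxyStore` are hypothetical names, and `None` stands in for an empty uuid.

```python
import uuid
from typing import Optional


class BackingStore:
    """Hypothetical stand-in for the underlying store that owns the fsid."""

    def __init__(self) -> None:
        self._fsid: Optional[uuid.UUID] = None

    def mkfs(self, fsid: Optional[uuid.UUID]) -> None:
        # Generate a fresh fsid when the caller did not supply one.
        self._fsid = fsid if fsid is not None else uuid.uuid4()

    def get_fsid(self) -> uuid.UUID:
        if self._fsid is None:
            raise RuntimeError("mkfs() has not been called")
        return self._fsid


class ProxyStore:
    """Wrapper that keeps no fsid of its own and forwards every access."""

    def __init__(self, backing: BackingStore) -> None:
        self._backing = backing

    def mkfs(self, fsid: Optional[uuid.UUID]) -> None:
        # Pass the fsid down instead of caching it locally.
        self._backing.mkfs(fsid)

    def get_fsid(self) -> uuid.UUID:
        # Always read back from the backing store, so a generated fsid
        # is what callers see and persist.
        return self._backing.get_fsid()


backing = BackingStore()
proxy = ProxyStore(backing)
proxy.mkfs(None)                 # empty fsid: the backing store generates one
generated = proxy.get_fsid()
```

Because the wrapper holds no copy, a caller that persists `proxy.get_fsid()` after `mkfs()` records the generated value rather than the empty one it passed in.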
4 years agostop.sh: stop osd before mon
Kefu Chai [Sat, 25 Jul 2020 07:45:26 +0000 (15:45 +0800)]
stop.sh: stop osd before mon

osd sends a MOSDMarkMeDown message to monitor and waits for its ack
before timeout, so if we can stop osd before stopping mon, stop.sh can
return sooner without waiting until the timeout.

Signed-off-by: Kefu Chai <kchai@redhat.com>