Casey Bodley [Tue, 6 Oct 2020 21:59:24 +0000 (17:59 -0400)]
rgw: generalize error handling in RGWShardCollectCR
RGWShardCollectCR was hard-coded to ignore ENOENT errors and print a
'failed to fetch log status' error message. this moves that logic into a
handle_result() virtual function. it also exposes the member variables
'status' and 'max_concurrent' as protected, so they can be consulted or
modified by overrides of handle_result() and spawn_next()
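The refactoring described above has the shape of a template-method pattern. The sketch below shows that shape in Python rather than in the C++ of the actual class. Only the names handle_result, spawn_next, status and max_concurrent come from the commit message; the class names, signatures and the simplified run() driver are hypothetical.

```python
import errno


class ShardCollect:
    """Hypothetical stand-in for a shard-collecting coroutine (not the real RGW class)."""

    def __init__(self, max_concurrent):
        # Visible to subclasses, mirroring the 'protected' members in the commit message.
        self.max_concurrent = max_concurrent
        self.status = 0

    def spawn_next(self):
        """Return the next child result code, or None when nothing is left to spawn."""
        raise NotImplementedError

    def handle_result(self, r):
        """Return r if it should count as an error, or 0 to ignore it."""
        return r

    def run(self):
        # Simplified driver: handles results one at a time and keeps the first error.
        while (r := self.spawn_next()) is not None:
            r = self.handle_result(r)
            if r < 0 and self.status == 0:
                self.status = r
        return self.status


class LogStatusCollect(ShardCollect):
    """Hypothetical subclass that restores the previously hard-coded ENOENT behavior."""

    def __init__(self, results, max_concurrent=4):
        super().__init__(max_concurrent)
        self._results = iter(results)

    def spawn_next(self):
        return next(self._results, None)

    def handle_result(self, r):
        if r == -errno.ENOENT:
            return 0  # a missing shard is not treated as a failure
        if r < 0:
            print(f"failed to fetch log status: {r}")
        return r
```

With this split, the base class no longer needs to know which errors are benign; each subclass decides in its own handle_result().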
Sage Weil [Thu, 4 Feb 2021 17:19:25 +0000 (12:19 -0500)]
Merge PR #39147 into master
* refs/pull/39147/head:
qa/tasks/ceph_fuse: do not createfs
qa/tasks/cephfs/fuse_mount: pass admin_socket path
qa/suites/fs/cephadm/multivolume: add basic multivolume test
mgr/mds_autoscaler: some fixes and cleanup
mgr/volumes: deploy MDSs when creating fs
Jason Dillaman [Thu, 4 Feb 2021 14:00:23 +0000 (09:00 -0500)]
qa/suites/rbd: drop require-osd-release command
Teuthology now defaults to quincy, so trying to set require-osd-release
to pacific results in a failure. Additionally, drop the LUKS readbalance
test since it's unnecessary to duplicate that test.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
J. Eric Ivancich [Fri, 29 Jan 2021 17:03:50 +0000 (12:03 -0500)]
rgw: add rgw-gap-list-comparator tool
The rgw-gap-list tool can produce a number of false positives when the
cluster is being used during its run. One technique to minimize the
number of false positives is to run the tool twice and look for the
objects that appear in both lists. The rgw-gap-list-comparator tool is
designed to do this comparison.
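The comparison described above amounts to intersecting two lists. The following is a minimal sketch of that idea, not the tool's implementation: it assumes one object name per line, which the commit message does not specify, and the function name objects_in_both is hypothetical.

```python
def objects_in_both(first_run, second_run):
    """Return entries reported by both runs, in first-run order, without duplicates."""
    # Assumption: each input is an iterable of lines holding one object name each.
    seen_again = {line.strip() for line in second_run if line.strip()}
    result = []
    emitted = set()
    for line in first_run:
        name = line.strip()
        if name and name in seen_again and name not in emitted:
            emitted.add(name)
            result.append(name)
    return result
```

An object that shows up in only one run was probably in flux while the tool ran, so only the names present in both runs remain as candidates.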
J. Eric Ivancich [Thu, 17 Dec 2020 23:21:36 +0000 (18:21 -0500)]
rgw: add rgw-gap-list tool
Due to a prior bug (pr: 38228), tail rados objects of some RGW objects
could have been incorrectly deleted. This tool is designed to look for
such cases. It essentially does the opposite of rgw-orphan-list,
looking for rados objects that RGW expects to be there, but which are
not to be found.
IMPORTANT: This is very experimental at this point in time, and any
"results" produced should be verified by other means.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Signed-off-by: Michael Kidd <linuxkidd@gmail.com>
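The relationship between the gap check and the orphan check described in this commit can be pictured as two opposite set differences. This is only an illustration under the assumption that both inputs are plain lists of rados object names; the function names are hypothetical and the real tools gather their inputs from the cluster.

```python
def find_gaps(expected, present):
    """Names RGW expects to exist that are missing from the pool listing."""
    present_set = set(present)
    return sorted(name for name in set(expected) if name not in present_set)


def find_orphans(expected, present):
    """The opposite check: names in the pool listing that RGW does not expect."""
    expected_set = set(expected)
    return sorted(name for name in set(present) if name not in expected_set)
```

Because both lists are snapshots taken at different moments, any name returned here is only a candidate and still needs to be verified by other means, as the commit message warns.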
Casey Bodley [Wed, 3 Feb 2021 20:17:19 +0000 (15:17 -0500)]
cmake/rgw: forward spawn's compile options to rgw_common object library
since rgw_common is an OBJECT library, we can't use
target_link_libraries() for its dependency on spawn. we add its
include directories manually already with
$<TARGET_PROPERTY:spawn,INTERFACE_INCLUDE_DIRECTORIES>, but this didn't
pull in the compile definitions. this ultimately prevented the
WITH_BOOST_VALGRIND option from passing the BOOST_USE_VALGRIND
definition attached to boost::context
Sage Weil [Wed, 3 Feb 2021 15:38:49 +0000 (10:38 -0500)]
Merge PR #39069 into master
* refs/pull/39069/head:
mgr/cephadm/upgrade: tolerate pre-pacific upgrade state
mgr/cephadm/upgrade: scale down MDS cluster(s) for major version upgrades
mgr/cephadm: fix capitalization, level; drop ellipses of log msgs
mgr/cephadm/upgrade: match against any repo_digest, not image_id
cephadm: return repo_digests (plural) in pull/inspect output
mgr/cephadm: include container_image_digests in inventory
cephadm: include image_digests list in 'ls' output
vstart.sh: only extract first container digest
mgr/cephadm: move release -> major translation to helper
mgr/cephadm/upgrade: tolerate old upgrade_state.target_versoin
mgr/cephadm/upgrade: set require-osd-release when done with OSDs
mgr: add lookup_release_name(int) to mgr interface
mgr/cephadm: verify container image version after we pull it
mgr/cephadm: only save version portion of version string
cephadm: fix 'inspect' and 'pull'
mgr/cephadm/upgrade: implement N-2 version checks on upgrade start
Casey Bodley [Wed, 3 Feb 2021 14:46:33 +0000 (09:46 -0500)]
cmake: partial revert of BOOST_USE_VALGRIND when ALLOCATOR=libc
the WITH_SYSTEM_BOOST binaries are not built with BOOST_USE_VALGRIND, so
it probably isn't safe to define for the headers only
this flag is needed for teuthology testing, and the shaman builds use
WITH_SYSTEM_BOOST=OFF. so the better fix is to enable WITH_BOOST_VALGRIND
so BuildBoost.cmake will build the libraries with valgrind support and add
-DBOOST_USE_VALGRIND to the necessary targets
this change was merged in https://github.com/ceph/ceph-build/pull/1736
Zac Dover [Wed, 3 Feb 2021 14:06:13 +0000 (00:06 +1000)]
doc/dev: Remove workbench mentions
This PR removes the "running-tests-in-cloud.rst"
file, which explains how to use ceph-workbench.
ceph-workbench is now deprecated, and the new
Teuthology documentation supplants the information
in the ceph-workbench-related documentation.
This PR also alters the "index.rst" file to remove
a link to "running-tests-in-cloud.rst".
Lucian Petrut [Wed, 3 Feb 2021 08:59:24 +0000 (08:59 +0000)]
win32*.sh: move debug symbols to separate files
This patch simplifies releasing Windows binaries along with debug
symbols.
By default, we're going to provide minimum debug information (-g1).
The symbols are extracted from the binaries and placed in separate
files in the ".debug" folder, which is used by gdb implicitly.
This is more convenient than having separate versions of the binaries,
with or without debug symbols.
Kefu Chai [Fri, 29 Jan 2021 16:48:36 +0000 (00:48 +0800)]
pybind/mgr/hello: add typing annotation
also, use Option and CLIReadCommand to define options and cli
commands. this module serves as a "hello world" example for
developers of mgr modules. so it's important to use the more
convenient and safer way to implement the module
Lucian Petrut [Fri, 29 Jan 2021 11:03:20 +0000 (11:03 +0000)]
rbd: propagate WNBD start errors
This change will propagate the errors that WNBD may return when
spinning up the IO workers.
Also, we'll avoid removing the registry record for failed
non-persistent mappings. Those will be cleaned up when the service
restarts or when explicitly unmapped.
Lucian Petrut [Fri, 29 Jan 2021 09:54:10 +0000 (09:54 +0000)]
rbd: improve Windows remap failure handling
At the moment, if an image can't be remapped when the centralized
RBD service starts, the service will stop and already started
daemons will continue running.
This change adds a new option: "--remap-failure-fatal". If set,
when an image can't be remapped, the service stops AND cleans up
the running daemons. By default, an error will be logged and the
service will continue running.
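The two behaviors described above can be sketched as follows. This is a simplified Python illustration, not the Windows service code: remap_all, its callback parameters and the use of OSError are assumptions, and only the meaning of the "--remap-failure-fatal" option is taken from the commit message.

```python
def remap_all(images, remap, stop_daemon, remap_failure_fatal=False, log=print):
    """Remap each image; return the images whose daemons are left running."""
    started = []
    for image in images:
        try:
            remap(image)
            started.append(image)
        except OSError as exc:
            log(f"could not remap {image}: {exc}")
            if remap_failure_fatal:
                # Fatal mode: clean up daemons that were already started, then stop.
                for running in reversed(started):
                    stop_daemon(running)
                raise
            # Default mode: the error is logged and the remaining images are tried.
    return started
```

In the default mode a single bad mapping does not take down the mappings that did succeed; in fatal mode nothing is left running after a failure.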