Sage Weil [Wed, 14 Apr 2021 13:39:38 +0000 (09:39 -0400)]
Merge PR #40734 into master
* refs/pull/40734/head:
mgr/cephadm: make prometheus scrape ingress haproxy
doc/cephadm: remove big warning about stability
doc/cepham/compatibility: rgw-ha -> ingress; note possibility of breaking changes
mgr/cephadm: ingress: add optional virtual_interface_networks
doc/cephadm/rgw: clean up example spec
mgr/cephadm/services/ingress: less verbose about prepare_create
doc/cephadm/rgw: add note about which ethernet interface is used
cephadm: make keepalived unit fiddle sysctl settings
mgr/orchestrator: report external endpoints from 'orch ls'
mgr/orchestrator: drop - when no ports
doc/cephadm/rgw: update docs for ingress service
mgr/cephadm: use per_host_daemon feature in scheduler
mgr/cephadm/schedule: add per_host_daemon_type support
mgr/cephadm: HA_RGW -> Ingress
mgr/cephadm: include daemon_type in DaemonPlacement
mgr/cephadm: update list-networks to report interface names too
mgr/orchestrator: streamline 'orch ps' PORTS formatting
mgr/cephadm/schedule: handle multiple ports per daemon
mgr/cephadm/utils: resolve_ip(): prefer IPv4
tools/osdmaptool: mark unused variable [[maybe_unused]]
to silence the warning from GCC when performing a release build, like:
../src/tools/osdmaptool.cc: In function ‘int main(int, const char**)’:
../src/tools/osdmaptool.cc:472:9: warning: variable ‘r’ set but not used [-Wunused-but-set-variable]
472 | int r = clock_gettime(CLOCK_MONOTONIC, &round_start);
| ^
in the latest documentation generated by RtD, the spacing after `ul li p`
elements is set to 24px, the same as for plain `p` elements. but this makes
the lists more sparse and difficult to read.
in this change, the spacing is restored to 0, as it was in the old theme.css
in sphinx_rtd_theme.
mon: MMonProbe: direct MMonJoin messages to the leader, instead of the first mon
When monitors are joining a cluster, they may send an MMonJoin message to place
themselves correctly in the map in either handle_probe_reply() or
finish_election(). These messages must be sent to the leader -- monitors do not
forward each other's messages.
Unfortunately, this scenario was missed when converting the monitors to support
connectivity-based elections, and they're sending these messages to
quorum.begin(). Fix this by including an explicit leader in MMonProbe (that the
new monitor may reference in handle_probe_reply) and using the leader
value in both locations.
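A minimal sketch of why quorum.begin() is the wrong target, assuming the quorum is an ordered set of monitor ranks (so begin() yields the lowest rank) and that a connectivity-based election may pick a leader that is not the lowest rank. The function names are hypothetical and only illustrate the idea; this is not the monitor code.

```python
def old_join_target(quorum):
    # Mirrors sending to quorum.begin(): the lowest rank in the ordered set.
    return min(quorum)


def new_join_target(quorum, leader):
    # Mirrors the fix: use the leader that was reported explicitly
    # (for example, carried in the probe reply).
    if leader not in quorum:
        raise ValueError("reported leader %r is not in quorum" % (leader,))
    return leader


if __name__ == "__main__":
    quorum = {0, 1, 2}
    leader = 2  # assumed outcome of a connectivity-based election
    print("old target:", old_join_target(quorum))
    print("new target:", new_join_target(quorum, leader))
```

With these assumed values the old logic addresses rank 0 while the leader is rank 2, and since monitors do not forward each other's messages, the join request would never reach the leader.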
It may be that the virtual IP we want to use is not in the same network
as any existing IPs on the host. In that case, allow the spec to specify
a list of networks to match against existing IPs so that a match will
identify an ethernet interface to use.
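A rough sketch of the matching described above, in Python. The shape of `host_networks` (subnet -> interface -> list of IPs) and the function name are assumptions made for illustration, not the cephadm implementation.

```python
import ipaddress


def find_interface(virtual_ip, host_networks, virtual_interface_networks=()):
    """Return an interface name for virtual_ip, or None if nothing matches.

    host_networks is assumed to look like:
        {"10.0.0.0/24": {"eth0": ["10.0.0.5"]}}
    """
    vip = ipaddress.ip_address(virtual_ip.split("/")[0])

    # First choice: a subnet that already has an IP on this host and
    # contains the virtual IP.
    for subnet, ifaces in host_networks.items():
        if ifaces and vip in ipaddress.ip_network(subnet):
            return sorted(ifaces)[0]

    # Fallback: the virtual IP is in no local subnet, so match the host's
    # subnets against the networks listed in the spec instead.
    wanted = {ipaddress.ip_network(n) for n in virtual_interface_networks}
    for subnet, ifaces in host_networks.items():
        if ifaces and ipaddress.ip_network(subnet) in wanted:
            return sorted(ifaces)[0]

    return None


if __name__ == "__main__":
    nets = {
        "10.0.0.0/24": {"eth0": ["10.0.0.5"]},
        "192.168.1.0/24": {"eth1": ["192.168.1.5"]},
    }
    print(find_interface("172.16.0.10/24", nets, ["192.168.1.0/24"]))
```

Picking the first interface in sorted order is an arbitrary tie-break chosen for the sketch.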
extract the options in common/options.cc into separate .yaml.in
files, and preprocess them using CMake before translating them into .cc
files using a python script.
this change paves the road to render the options using sphinx, and
will allow us to further annotate the options to include more metadata.
also, this YAML file can be consumed by applications like the dashboard
and Sphinx, giving them a simpler way to read this metadata.
* use @variable-name@ for substituting the variables in .yaml.in file
* use cmake variable of `mgr_disabled_modules` instead of C macro
to define `mgr_disabled_modules` in global.yaml.in
* debian/control, ceph.spec.in, win32_deps_build.sh: add python3-yaml
as build dep
* add y2c.py (short for YAML to C++) to translate .yaml to .cc file
* common/options/*.yaml.in: extract and split options into .yaml.in
files; the substitution variables in them are then replaced with CMake
variables, and the results are written to the corresponding .yaml files
* include/config-h.in.cmake: remove MGR_DISABLED_MODULES, as it
is not a CMake variable.
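To illustrate the @variable-name@ substitution step, here is a small Python sketch. In the change itself the substitution is done by CMake; the function name, the placeholder pattern, and the sample variable value below are assumptions for illustration only.

```python
import re

# Matches placeholders such as @mgr_disabled_modules@.
_PLACEHOLDER = re.compile(r"@([A-Za-z_][A-Za-z0-9_-]*)@")


def substitute(template, variables):
    """Replace every @name@ in template with variables[name]."""
    def repl(match):
        name = match.group(1)
        if name not in variables:
            # Design choice of this sketch: fail loudly. CMake's
            # configure_file() would substitute an empty string instead.
            raise KeyError("undefined variable: " + name)
        return str(variables[name])

    return _PLACEHOLDER.sub(repl, template)


if __name__ == "__main__":
    text = "default: @mgr_disabled_modules@\n"
    print(substitute(text, {"mgr_disabled_modules": "restful"}), end="")
```

Failing on an undefined name makes a misspelled placeholder visible at build time rather than producing an option with an empty default.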
cmake: use the same name for macros and cmake variables
for two reasons,
* consolidate the namings
* pave the road to yamlize options where we will use cmake variables
to substitute the @<variable-name>@ in .in files instead of relying
on C/C++ macros
instead of checking "HAVE_NASM_X64_AVX2 OR HAVE_ARMV8_SIMD" everywhere,
use a single cached variable of WITH_EC_ISA_PLUGIN. so it's more
consistent when checking the availability of ec_isa plugin.
Sage Weil [Tue, 6 Apr 2021 14:41:09 +0000 (09:41 -0500)]
qa/suites/rados/objectstore: separate store_test tests
This takes 5 hours currently.
- Separate out filestore and memstore into separate task (~1 hr)
- Split bluestore into -a and -b (a tests exclude SyntheticMatrixC,
b tests include it)
Sage Weil [Mon, 12 Apr 2021 15:45:50 +0000 (11:45 -0400)]
Merge PR #40736 into master
* refs/pull/40736/head:
mgr/cephadm: rewrite/simplify describe_service
mgr/orchestrator: report osds as osd.unmanaged as appropriate
mgr/orchestrator: remove IMAGE ID from 'orch ls'
all the scripts except for test_cls_cas.sh under qa/workunits/cls
are executable. to be more consistent, add the executable bit to
test_cls_cas.sh as well.
also, these scripts are launched by src/script/gen-corpus.sh directly,
so it's convenient to be able to just call them.
if we happen to run this script on a host where /etc/ceph/ceph.conf is
available, the ceph CLI would use that file. so, point it to $PWD/ceph.conf
instead.
/home/kchai/ceph/src/include/denc.h: In member function ‘void DencDumper<T>::dump() const’:
/home/kchai/ceph/src/include/denc.h:121:60: error: ‘O_BINARY’ was not declared in this scope
int fd = ::open(fn, O_WRONLY|O_TRUNC|O_CREAT|O_CLOEXEC|O_BINARY, 0644);
^~~~~~~~
/home/kchai/ceph/src/include/denc.h:121:60: note: the macro ‘O_BINARY’ had not yet been defined
In file included from /home/kchai/ceph/src/include/statlite.h:14,
from /home/kchai/ceph/src/include/types.h:41,
from /home/kchai/ceph/src/auth/Crypto.h:19,
from /home/kchai/ceph/src/auth/Crypto.cc:21:
../src/mon/Monitor.cc: In member function ‘void Monitor::handle_command(MonOpRequestRef)’:
../src/mon/Monitor.cc:3703:55: warning: ‘osd’ may be used uninitialized in this function [-Wmaybe-uninitialized]
3703 | uint64_t seq = mgrstatmon()->get_last_osd_stat_seq(osd);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
otherwise it fails to build with gcc-toolset-10, like:
../src/common/Formatter.cc: In member function ‘virtual void ceph::XMLFormatter::close_section()’:
../src/common/Formatter.cc:449:8: error: ‘transform’ is not a member of ‘std’
449 | std::transform(section.begin(), section.end(), section.begin(),
| ^~~~~~~~~
Changes to the socket code now result in EINVAL being returned.
In the past ENOENT was returned, which is the FreeBSD error code
when a DNS lookup does not work.
The change is probably because somewhere in the code the
error code is not passed through verbatim from the system call, but is
rewritten during an extra evaluation step.
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
FreeBSD ceph-dencoder crashes in the exit() calls, due to
invalid pointer references during the release process of
the loaded libraries.
Often this is signaled by libc reporting:
__cxa_thread_call_dtors: dtr 0x47efc0 from unloaded dso, skipping
The cause for this is different behaviour between FreeBSD and Linux:
https://groups.google.com/g/bsdmailinglist/c/22ncTZAbDp4/m/Dii_pII5AwAJ
The FreeBSD implementation here looks racy. If one thread dlcloses an
object while another thread is exiting, we can end up calling a
function at an invalid memory address. It also looks as if it may
be possible to unload one library, load another at the same address,
and end up executing entirely the wrong code, which would have some
serious security implications.
The GNU/Linux equivalent of this function locks the DSO in memory
until all references to it have gone away. A call to dlclose() on
GNU/Linux will not actually unload the library until all threads
with destructors in that library have been unloaded. I believe
that this reuses the same reference counting mechanism that
allows the same library to be dlopened and dlclosed multiple times.
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Sage Weil [Fri, 9 Apr 2021 20:26:00 +0000 (16:26 -0400)]
mgr/cephadm: rewrite/simplify describe_service
The prior implementation first tried to fabricate services based on the
running daemons, and then filled in defined services on top. This led
to duplication and a range of small errors.
Instead, flip this around: start with the services that are defined,
and only fill in 'unmanaged' services where we need to.
Drop the osd kludges and instead rely on DaemonDescription.service_id to
return the right thing.
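The flipped approach can be sketched roughly as follows. The data shapes (a dict of defined service names to target sizes, and a list of (service_name, daemon_id) pairs) and the function name are hypothetical simplifications, not the actual describe_service code.

```python
from collections import Counter


def describe_services(specs, daemons):
    """specs: {service_name: target_size}
    daemons: list of (service_name, daemon_id) for running daemons.
    """
    running = Counter(name for name, _ in daemons)
    result = {}

    # Start with the services that are defined.
    for name, size in specs.items():
        result[name] = {
            "unmanaged": False,
            "size": size,
            "running": running.get(name, 0),
        }

    # Only then fill in 'unmanaged' entries for daemons that belong to
    # no defined service.
    for name, count in running.items():
        if name not in result:
            result[name] = {"unmanaged": True, "size": count, "running": count}

    return result


if __name__ == "__main__":
    specs = {"rgw.foo": 2}
    daemons = [("rgw.foo", "a"), ("osd", "0"), ("osd", "1")]
    for name, info in describe_services(specs, daemons).items():
        print(name, info)
```

Because every defined service is emitted exactly once before any daemon-derived entry is considered, a service cannot appear twice, which is the duplication the previous ordering was prone to.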
rgw: test `radosgw-admin radoslist` and incomplete multiparts better
Make sure there are more than 1000 incomplete multiparts and also make
sure one of the incomplete multiparts has at least 1000 parts. This
test is done indirectly through rgw-orphan-list, which invokes
`radosgw-admin radoslist`.
Also, clean up shell flags, so script output is less verbose.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>