Avan Thakkar [Tue, 4 May 2021 22:01:10 +0000 (03:31 +0530)]
mgr/dashboard: ingress service creation follow-up
Fixes: https://tracker.ceph.com/issues/50568 Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Pre-populate the service id (read-only) with the same value as the backend service.
Kefu Chai [Wed, 12 May 2021 03:50:49 +0000 (11:50 +0800)]
crimson/os/alienstore: create tuple in-place
there is no need to use make_tuple<> when constructing a future whose
value is already available, as future<> can be constructed by
perfect-forwarding the parameters to its state constructor.
also, wrap the lines whose length is over 80 chars.
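The difference between the two construction styles can be sketched with the standard library alone. `ReadyValue`, `with_make_tuple()` and `in_place()` are hypothetical names used only for illustration; this is not Seastar's `future<>` and not the code touched by the commit.

```cpp
#include <string>
#include <tuple>
#include <utility>

// Hypothetical stand-in for the "state" of an already-resolved future.
template <typename T>
class ReadyValue {
public:
  // Perfect-forward the arguments straight into T's constructor.
  template <typename... Args>
  explicit ReadyValue(Args&&... args) : value_(std::forward<Args>(args)...) {}

  const T& get() const { return value_; }

private:
  T value_;
};

using Result = std::tuple<int, std::string>;

// Before: build a temporary tuple first, then move it into the state.
inline ReadyValue<Result> with_make_tuple(int rc, std::string msg) {
  return ReadyValue<Result>(std::make_tuple(rc, std::move(msg)));
}

// After: pass the elements and let the tuple be constructed in place.
inline ReadyValue<Result> in_place(int rc, std::string msg) {
  return ReadyValue<Result>(rc, std::move(msg));
}
```

Both functions produce the same value; the second one skips the intermediate tuple and the extra move it implies.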
Avan Thakkar [Thu, 6 May 2021 11:05:38 +0000 (16:35 +0530)]
mgr/dashboard: add Services e2e tests
Fixes: https://tracker.ceph.com/issues/50567 Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Introducing e2e tests for service creation for Ingress and RGW service types.
The RGW bucket creation form has a Name field which has an async
validator. The validator fetches all the bucket names and checks whether
the entered name is unique. This happens on every keystroke, so if
there are 100 or more buckets, the async validation can be really
slow and cause misvalidations in different fields.
I changed the validation logic and did some cleanups to improve the
performance of the async validation.
Fixes: https://tracker.ceph.com/issues/50514 Signed-off-by: Nizamudeen A <nia@redhat.com>
Kefu Chai [Tue, 11 May 2021 09:55:32 +0000 (17:55 +0800)]
doc/_theme: show the menu button
because we have a top-nav bar, which is sitting on top of the bar
containing the menu button when the docs are displayed on a device with
a smaller width. in this change, the container of the menu button is moved
down a little bit, so it is visible again.
max_misplaced was replaced by target_max_misplaced_ratio in edbd592ee44e02a5328e1510879555c2f9dcfc9e, but the document was not
sync'ed. let's update it accordingly.
Zac Dover [Mon, 10 May 2021 23:19:10 +0000 (09:19 +1000)]
doc/cephadm: rewrite "config ssl/tls f. grafana"
This PR streamlines the grammar in the subsection
called "Configuring SSL/TLS for Grafana" in the
monitoring.rst file. It also corrects the prompt
rst.
Kefu Chai [Sat, 8 May 2021 08:43:55 +0000 (16:43 +0800)]
crimson/common: use string_view when appropriate
the typical use case of get_val() passes a literal string as the key.
in that case, there is no need to create a std::string, as
md_config_t::get_val() always accepts a string_view as the option name.
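A minimal sketch of why a `string_view` parameter helps when the key is a literal. `OptionTable` and the option name in the usage note are hypothetical; the real `md_config_t` internals are not shown in this log and are not modelled here.

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical option table used only to illustrate the parameter type.
class OptionTable {
public:
  void set(std::string key, std::string value) {
    values_[std::move(key)] = std::move(value);
  }

  // Accepting std::string_view lets a caller pass a string literal
  // without first materializing a temporary std::string.
  std::string_view get_val(std::string_view key) const {
    auto it = values_.find(key);
    return it == values_.end() ? std::string_view{}
                               : std::string_view{it->second};
  }

private:
  // std::less<> enables heterogeneous lookup, so find() can take a
  // string_view directly.
  std::map<std::string, std::string, std::less<>> values_;
};
```

With this shape, a call such as `table.get_val("some_option")` performs the lookup without allocating a `std::string` for the key.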
crimson/monc: honor auth_result_t::canceled as the result of do_auth().
An attempt to `Connection::do_auth()` may finish in one of three states:
_success_, _failure_ and _cancellation_. Unfortunately, its callers were
missing the third, treating cancellation like a failure. This was the root
cause of the following failure at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-06_22:08:43-rados-master-distro-basic-smithi/6102605$ less ./remote/smithi204/log/ceph-osd.3.log.gz
...
WARN 2021-05-06 22:35:40,464 [shard 0] osd - ms_handle_reset
...
INFO 2021-05-06 22:35:40,465 [shard 0] monc - do_auth_single: connection closed
INFO 2021-05-06 22:35:40,465 [shard 0] ms - [osd.3(client) v2:172.21.15.204:6808/31418@57568 >> mon.? v2:172.21.15.204:3300/0] execute_connecting(): protocol aborted at CLOSING -- std::system_error (error crimson::net:6, protocol aborted)
...
ERROR 2021-05-06 22:35:40,465 [shard 0] osd - mon.osd.3 dispatch() ms_handle_reset caught exception: std::system_error (error crimson::net:3, negotiation failure)
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3909-g81233a18/rpm/el8/BUILD/ceph-17.0.0-3909-g81233a18/src/crimson/common/gated.h:36: crimson::common::Gated::dispatch(const char*, T&, Func&&) [with Func = crimson::mon::Client::ms_handle_reset(crimson::net::ConnectionRef, bool)::<lambda()>&; T = crimson::mon::Client]::<lambda(std::__exception_ptr::exception_ptr)>: Assertion `*eptr.__cxa_exception_type() == typeid(seastar::gate_closed_exception)' failed.
Aborting on shard 0.
Backtrace:
0# 0x00005618C973932F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F7BB592EB20 in /lib64/libpthread.so.0
4# gsignal in /lib64/libc.so.6
5# abort in /lib64/libc.so.6
6# 0x00007F7BB3F29B09 in /lib64/libc.so.6
7# 0x00007F7BB3F37DE6 in /lib64/libc.so.6
8# 0x00005618C9FF295C in ceph-osd
9# 0x00005618C3907313 in ceph-osd
10# 0x00005618CCA2F84F in ceph-osd
11# 0x00005618CCA34D90 in ceph-osd
12# 0x00005618CCBEC9BB in ceph-osd
13# 0x00005618CC744E9A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
daemon-helper: command crashed with signal 6
```
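The three outcomes of `do_auth()` described before the log can be sketched as an exhaustive switch. The enumerator names other than `canceled`, the `Action` type, and what a caller does in each case are assumptions made for illustration; the actual handling in crimson/monc is not shown in this log.

```cpp
// Hypothetical mirror of the three outcomes of an auth attempt.
enum class AuthResult { success, failure, canceled };

// Hypothetical follow-up actions a caller might take.
enum class Action { proceed, report_failure, stop_quietly };

// Handle every outcome explicitly so that cancellation is never
// folded into the failure path by accident.
inline Action handle_auth_result(AuthResult result) {
  switch (result) {
  case AuthResult::success:
    return Action::proceed;
  case AuthResult::failure:
    return Action::report_failure;
  case AuthResult::canceled:
    return Action::stop_quietly;
  }
  // Not reached for valid enumerators; kept as a conservative default.
  return Action::report_failure;
}
```

Listing all enumerators without a `default:` label lets compilers warn when a new state is added but not handled.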
The low-level signal handler above assumes `local_engine._backend`
is not null, which stays true only for threads from the S*'s world.
Unfortunately, as we don't block the `SIGHUP` for alien threads, the
kernel is perfectly authorized to pick one of them to run the handler,
leading to weird-looking segfaults like this one:
```
INFO 2021-04-23 07:06:57,807 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:58,753 [shard 0] ms - [osd.1(client) v2:172.21.15.100:6802/30478@51064 >> mgr.4105 v2:172.21.15.109:6800/29891] --> #7 === pg_stats(0 pgs seq 55834574872 v 0) v2 (87)
...
INFO 2021-04-23 07:06:58,813 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:59,753 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO 2021-04-23 07:06:59,753 [shard 0] osd - asok response length: 2947
INFO 2021-04-23 07:06:59,817 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:59,865 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO 2021-04-23 07:06:59,866 [shard 0] osd - asok response length: 2947
DEBUG 2021-04-23 07:07:00,020 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO 2021-04-23 07:07:00,020 [shard 0] osd - asok response length: 2947
INFO 2021-04-23 07:07:00,820 [shard 0] bluestore - stat
...
Backtrace:
0# 0x00005600CD0D6AAF in ceph-osd
1# FatalSignal::signaled(int) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F5877C7EB20 in /lib64/libpthread.so.0
4# 0x00005600CD830B81 in ceph-osd
5# 0x00007F5877C7EB20 in /lib64/libpthread.so.0
6# pthread_cond_timedwait in /lib64/libpthread.so.0
7# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
8# 0x00007F5877999BA3 in /lib64/libstdc++.so.6
9# 0x00007F5877C7414A in /lib64/libpthread.so.0
10# clone in /lib64/libc.so.6
daemon-helper: command crashed with signal 11
```
Ultimately, it turned out the thread came out from a syscall (`futex`)
and started crunching the `SIGHUP` handler's code in which a nullptr
dereference happened.
This patch blocks `SIGHUP` for all threads spawned by `AlienStore`.
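One way to keep a signal away from a worker thread is to block it in that thread's signal mask with `pthread_sigmask()`. The sketch below assumes POSIX threads; the helper names are hypothetical and this is not the code added to `AlienStore`.

```cpp
#include <pthread.h>
#include <signal.h>
#include <thread>

// Block SIGHUP for the calling thread only. Returns true on success.
inline bool block_sighup_in_this_thread() {
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, SIGHUP);
  return pthread_sigmask(SIG_BLOCK, &mask, nullptr) == 0;
}

// Report whether SIGHUP is blocked in the calling thread.
inline bool sighup_is_blocked() {
  sigset_t current;
  sigemptyset(&current);
  // A null new-set pointer leaves the mask unchanged and only queries it.
  pthread_sigmask(SIG_BLOCK, nullptr, &current);
  return sigismember(&current, SIGHUP) == 1;
}

// Hypothetical worker: blocks SIGHUP first, then checks its own mask.
inline bool run_worker_with_sighup_blocked() {
  bool blocked = false;
  std::thread worker([&blocked] {
    block_sighup_in_this_thread();
    blocked = sighup_is_blocked();
  });
  worker.join();
  return blocked;
}
```

An alternative with the same effect is to block the signal in the spawning thread just before creating the workers, since a new thread inherits its creator's signal mask, and to restore the mask afterwards.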
Kefu Chai [Fri, 7 May 2021 13:36:48 +0000 (21:36 +0800)]
crimson/os/seastore: use map::merge() to merge maps
C++17's std::map allows us to merge two maps, and in this case, we can
even consume `child_result`. so map::merge() is used instead of insert()
in the hope of avoiding the memcpy and allocation of pair<> nodes.
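A minimal sketch of `std::map::merge()` in C++17. The `Results` alias and `absorb()` are hypothetical names; the key and value types used in seastore are not shown in this log.

```cpp
#include <map>
#include <string>

using Results = std::map<int, std::string>;

// Splice every node of `child` whose key is not yet in `parent` into
// `parent`. Nodes are relinked rather than copied, so no new pair<>
// nodes are allocated. Nodes with duplicate keys stay in `child`.
inline void absorb(Results& parent, Results& child) {
  parent.merge(child);
}
```

Unlike `insert()` over a range, `merge()` modifies its source, so it only fits when the source map may be consumed, as `child_result` is here.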
Lucian Petrut [Fri, 7 May 2021 09:23:30 +0000 (09:23 +0000)]
win*.sh,cmake: Fix Windows linking errors
The Windows build is hitting linking errors after
bumping the Boost version to 1.75. The issue is that Boost
is now setting the zlib dependency using INTERFACE_LINK_LIBRARIES,
which means that it's no longer located using the standard
"find_package" mechanism.
In order for the linker to locate zlib, we'll add it to the
linker search path.