crimson/osd: sending EVENT_DISCONNECT becomes implementation detail of Watch.
In contrast to ceph-osd, crimson sends CEPH_WATCH_EVENT_DISCONNECT directly
from the timeout handler and after CEPH_WATCH_EVENT_NOTIFY_COMPLETE.
This simplifies the Watch::remove() interface as callers are no longer
obliged to decide whether EVENT_DISCONNECT needs to be sent or not -- it
becomes an implementation detail of Watch.
crimson/osd: PG::repair_object() doesn't depend on MOSDOp anymore.
Before this commit, the method depended on `MOSDOp::get_min_epoch()`
to start an `UrgentRecovery`. However, it seems `PG::get_osdmap_epoch()`
would be sufficient here as the very early stages of the processing
in `ClientRequest` ensure the PG fits the `get_min_epoch()` requirement.
In the classical OSD the counterpart code looks like below:
crimson/osd: introduce RollbackOrchestrator to OpsExecuter.
If the execution of an `OSDOp` fails, we're left with a potentially
altered `ObjectContext`. We deal with that by reloading `obc` if
there was any modification to it. To figure this out, `has_seen_write()`
on `OpsExecuter` is being called. Unfortunately, the current impl.
has the following drawbacks:
* `has_seen_write()` can be called after `std::move(ox).flush_...()`
which is very inelegant;
* it requires catching both `ObjectContext` and `OpsExecuter` while
the latter already references the former;
* there is no explicitly given reason in the header for justifying
the presence of `has_seen_write()`.
crimson/osd: split PG::do_osd_ops() to facilitate InternalClientRequest.
This commit brings `PG::do_osd_ops_execute()`, a subset of
`PG::do_osd_ops()`; it handles the ops execution through
`OpsExecuter` and the `submit_transaction()` but it stays
independent from `MOSDOp` and `MOSDOpReply`. This trait
facilitates the `InternalClientRequest`.
This commit introduces an `ObjectContext`-taking variant of
`PG::with_locked_obc()`. The upcoming internal counterpart
for the `ClientRequest` is the intended audience.
crimson/osd: ObjectContext allows the hobject_t to be std::moved in ctor.
Taken with "crimson/osd: use obc->get_oid() instead of passing
hobject_t around" and enriched with the move-constructing down
the `ObjectState` path, this should allow saving some work on
e.g. `std::string` instances that are part of the `hobject_t`.
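A minimal sketch of the idea, assuming simplified stand-in types: `ObjectId`, `State` and `Context` below are hypothetical and only mimic the roles of `hobject_t`, `ObjectState` and `ObjectContext`. Each constructor takes its argument by value and moves it one level down, so the string buffer is handed over rather than copied.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for hobject_t; only the string member matters here.
struct ObjectId {
  std::string name;
};

// Hypothetical stand-in for ObjectState.
struct State {
  ObjectId oid;
  // Take by value, then move into the member.
  explicit State(ObjectId o) : oid(std::move(o)) {}
};

// Hypothetical stand-in for ObjectContext.
struct Context {
  State state;
  // The caller decides whether to copy or to std::move into this ctor;
  // from here on the id is only moved down the chain.
  explicit Context(ObjectId o) : state(std::move(o)) {}
  const ObjectId& get_oid() const { return state.oid; }
};
```

A caller that no longer needs its id can write `Context ctx(std::move(id));` and later read it back through `ctx.get_oid()`.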
Kefu Chai [Sat, 8 May 2021 08:43:55 +0000 (16:43 +0800)]
crimson/common: use string_view when appropriate
the typical use case of get_val() passes a literal string as the key.
in that case, there is no need to create a std::string, as
md_config_t::get_val() always accepts a string_view as the option name.
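a minimal sketch of the pattern, assuming a toy store: `MiniConfig` is hypothetical and is not the real md_config_t. the lookup takes a string_view and the map uses a transparent comparator, so a literal key is never turned into a temporary std::string.

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical miniature config store, for illustration only.
class MiniConfig {
  // std::less<> enables heterogeneous lookup: find() accepts a string_view.
  std::map<std::string, std::string, std::less<>> values_;

public:
  void set(std::string key, std::string value) {
    values_.insert_or_assign(std::move(key), std::move(value));
  }

  // Returns an empty view if the option is unknown.
  std::string_view get_val(std::string_view key) const {
    auto it = values_.find(key);
    return it == values_.end() ? std::string_view{}
                               : std::string_view{it->second};
  }
};
```

with this shape, `conf.get_val("osd_op_queue")` builds only a string_view over the literal.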
crimson/monc: honor auth_result_t::canceled as the result of do_auth().
An attempt to `Connection::do_auth()` may finish in one of three states:
_success_, _failure_ and _cancellation_. Unfortunately, its callers were
missing the third, treating cancellation like a failure. This was the root
cause of the following failure at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-06_22:08:43-rados-master-distro-basic-smithi/6102605$ less ./remote/smithi204/log/ceph-osd.3.log.gz
...
WARN 2021-05-06 22:35:40,464 [shard 0] osd - ms_handle_reset
...
INFO 2021-05-06 22:35:40,465 [shard 0] monc - do_auth_single: connection closed
INFO 2021-05-06 22:35:40,465 [shard 0] ms - [osd.3(client) v2:172.21.15.204:6808/31418@57568 >> mon.? v2:172.21.15.204:3300/0] execute_connecting(): protocol aborted at CLOSING -- std::system_error (error crimson::net:6, protocol aborted)
...
ERROR 2021-05-06 22:35:40,465 [shard 0] osd - mon.osd.3 dispatch() ms_handle_reset caught exception: std::system_error (error crimson::net:3, negotiation failure)
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3909-g81233a18/rpm/el8/BUILD/ceph-17.0.0-3909-g81233a18/src/crimson/common/gated.h:36: crimson::common::Gated::dispatch(const char*, T&, Func&&) [with Func = crimson::mon::Client::ms_handle_reset(crimson::net::ConnectionRef, bool)::<lambda()>&; T = crimson::mon::Client]::<lambda(std::__exception_ptr::exception_ptr)>: Assertion `*eptr.__cxa_exception_type() == typeid(seastar::gate_closed_exception)' failed.
Aborting on shard 0.
Backtrace:
0# 0x00005618C973932F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F7BB592EB20 in /lib64/libpthread.so.0
4# gsignal in /lib64/libc.so.6
5# abort in /lib64/libc.so.6
6# 0x00007F7BB3F29B09 in /lib64/libc.so.6
7# 0x00007F7BB3F37DE6 in /lib64/libc.so.6
8# 0x00005618C9FF295C in ceph-osd
9# 0x00005618C3907313 in ceph-osd
10# 0x00005618CCA2F84F in ceph-osd
11# 0x00005618CCA34D90 in ceph-osd
12# 0x00005618CCBEC9BB in ceph-osd
13# 0x00005618CC744E9A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
daemon-helper: command crashed with signal 6
```
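The three-state handling described above can be sketched as follows. This is an illustration under stated assumptions, not the actual monc code: the `auth_result` and `next_step` enums and the `handle_auth()` helper are hypothetical, and what a caller should do on cancellation (here: stop without reporting an error) is assumed.

```cpp
// Hypothetical mirror of the three outcomes of an authentication attempt.
enum class auth_result { success, failure, canceled };

// Hypothetical actions a caller may take afterwards.
enum class next_step { proceed, report_error, stop_quietly };

// Every outcome gets its own branch, so cancellation is no longer
// folded into the failure path.
next_step handle_auth(auth_result r) {
  switch (r) {
  case auth_result::success:
    return next_step::proceed;
  case auth_result::failure:
    return next_step::report_error;
  case auth_result::canceled:
    // Assumption: a canceled attempt is not an error.
    return next_step::stop_quietly;
  }
  return next_step::report_error;  // unreachable for valid enum values
}
```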
The low-level signal handler above assumes `local_engine._backend`
is not null which stays true only for threads from the S*'s world.
Unfortunately, as we don't block the `SIGHUP` for alien threads,
the kernel is perfectly authorized to pick one of them to run the handler,
leading to weirdly-looking segfaults like this one:
```
INFO 2021-04-23 07:06:57,807 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:58,753 [shard 0] ms - [osd.1(client) v2:172.21.15.100:6802/30478@51064 >> mgr.4105 v2:172.21.15.109:6800/29891] --> #7 === pg_stats(0 pgs seq 55834574872 v 0) v2 (87)
...
INFO 2021-04-23 07:06:58,813 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:59,753 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO 2021-04-23 07:06:59,753 [shard 0] osd - asok response length: 2947
INFO 2021-04-23 07:06:59,817 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:59,865 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO 2021-04-23 07:06:59,866 [shard 0] osd - asok response length: 2947
DEBUG 2021-04-23 07:07:00,020 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO 2021-04-23 07:07:00,020 [shard 0] osd - asok response length: 2947
INFO 2021-04-23 07:07:00,820 [shard 0] bluestore - stat
...
Backtrace:
0# 0x00005600CD0D6AAF in ceph-osd
1# FatalSignal::signaled(int) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F5877C7EB20 in /lib64/libpthread.so.0
4# 0x00005600CD830B81 in ceph-osd
5# 0x00007F5877C7EB20 in /lib64/libpthread.so.0
6# pthread_cond_timedwait in /lib64/libpthread.so.0
7# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
8# 0x00007F5877999BA3 in /lib64/libstdc++.so.6
9# 0x00007F5877C7414A in /lib64/libpthread.so.0
10# clone in /lib64/libc.so.6
daemon-helper: command crashed with signal 11
```
Ultimately, it turned out the thread came out from a syscall (`futex`)
and started crunching the `SIGHUP` handler's code in which a nullptr
dereference happened.
This patch blocks `SIGHUP` for all threads spawned by `AlienStore`.
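A minimal sketch of per-thread signal blocking with POSIX `pthread_sigmask()`. The helper names are hypothetical and this is not the `AlienStore` code itself; it only shows a worker thread masking `SIGHUP` so that the kernel cannot choose it to run the handler.

```cpp
#include <pthread.h>
#include <signal.h>
#include <thread>

// Add SIGHUP to the calling thread's signal mask.
void block_sighup_in_current_thread() {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, SIGHUP);
  pthread_sigmask(SIG_BLOCK, &set, nullptr);
}

// Query the calling thread's mask without changing it.
bool sighup_blocked_in_current_thread() {
  sigset_t current;
  sigemptyset(&current);
  pthread_sigmask(SIG_BLOCK, nullptr, &current);
  return sigismember(&current, SIGHUP) == 1;
}

// Hypothetical worker: masks SIGHUP first, then reports its own mask.
bool run_worker_with_sighup_blocked() {
  bool blocked = false;
  std::thread worker([&blocked] {
    block_sighup_in_current_thread();
    blocked = sighup_blocked_in_current_thread();
  });
  worker.join();
  return blocked;
}
```

A new thread inherits its creator's signal mask, so blocking `SIGHUP` in the spawning thread before creating the worker (and restoring the mask afterwards) would also close the short window before the worker masks the signal itself.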
Kefu Chai [Fri, 7 May 2021 13:36:48 +0000 (21:36 +0800)]
crimson/os/seastore: use map::merge() to merge maps
C++17's std::map allows us to merge two maps, and in this case, we can
even consume `child_result`. so map::merge() is used instead of insert()
in the hope of avoiding the memcpy and allocation of pair<> nodes.
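a minimal sketch of the technique, assuming toy types: `result_map` and `absorb()` are hypothetical names, not the seastore code. merge() splices nodes out of the source map instead of copying their contents.

```cpp
#include <map>
#include <string>
#include <utility>

using result_map = std::map<std::string, int>;

// Move every node of `child` whose key is not yet in `total` over to `total`.
// No pair<> is copied or reallocated; nodes are relinked between the trees.
void absorb(result_map& total, result_map&& child) {
  total.merge(child);
}
```

note that an element whose key already exists in the destination is left behind in the source map, whereas insert() would simply skip it and leave the source untouched.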
Lucian Petrut [Fri, 7 May 2021 09:23:30 +0000 (09:23 +0000)]
win*.sh,cmake: Fix Windows linking errors
The Windows build is hitting linking errors after
bumping the Boost version to 1.75. The issue is that Boost
is now setting the zlib dependency using INTERFACE_LINK_LIBRARIES,
which means that it's no longer located using the standard
"find_package" mechanism.
In order for the linker to locate zlib, we'll add it to the
linker search path.
Samuel Just [Wed, 28 Apr 2021 07:24:00 +0000 (00:24 -0700)]
crimson/os/seastore: clean up meta implementation
There's really no reason to cache the decoded representation here since
the meta keys are only accessed during startup and mkfs. This approach is
much simpler.
Neha Ojha [Mon, 3 May 2021 19:28:27 +0000 (19:28 +0000)]
qa/standalone: use osd op queue = wpq
mclock_scheduler is now the default and some of these tests need to be modified
to run well with it. Continue using wpq until
https://tracker.ceph.com/issues/50574 is addressed.