]> git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/commit
crimson/monc: fix races between on_session_opened() and the reset sequence. 41756/head
authorRadoslaw Zarzynski <rzarzyns@redhat.com>
Tue, 8 Jun 2021 10:33:31 +0000 (10:33 +0000)
committerRadoslaw Zarzynski <rzarzyns@redhat.com>
Tue, 8 Jun 2021 10:57:03 +0000 (10:57 +0000)
commitb5f3dc4d9d537ee6bece8a70f5b3243817041b44
tree9a112bdedea43120b47db3704e8187eb68f7efa2
parenta5fd875665a62c8fc02b0c9473174ef383656ef0
crimson/monc: fix races between on_session_opened() and the reset sequence.

The `active_con` can get invalidated every single time when there is
a preemption point. This includes even the middle connection open
sequence as it's spread across multiple continuations!

Unfortunately, we don't check for `active_con` in the lambdas inside
the `on_session_opened()` method. That was the reason of the following
crash at Sepia [1]:

```
INFO  2021-06-08 09:36:23,992 [shard 0] monc - do_auth_single: connection closed
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4967-g96cdf983/rpm/el8/BUILD/ceph-17.0.0-4967-g96cdf983/src/crimson/mon/MonClient.cc:399:10: runtime error: member access within null pointer of type 'struct Connection'
Segmentation fault on shard 0.
Backtrace:
 0# 0x000055C3C1CA860F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007FAAE1713B20 in /lib64/libpthread.so.0
 4# crimson::mon::Connection::get_conn() in ceph-osd
 5# 0x000055C3C2532DA8 in ceph-osd
 6# 0x000055C3C2535CB5 in ceph-osd
 7# 0x000055C3BBC9FC70 in ceph-osd
 8# 0x000055C3C76FAE5F in ceph-osd
 9# 0x000055C3C77003A0 in ceph-osd
10# 0x000055C3C78B240B in ceph-osd
11# 0x000055C3C740FE8A in ceph-osd
12# 0x000055C3C7419FAE in ceph-osd
13# main in ceph-osd
14# __libc_start_main in /lib64/libc.so.6
15# _start in ceph-osd
Fault at location: 0x98
```

[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-06-08_09:11:05-rados-master-distro-basic-smithi/6159602

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
src/crimson/mon/MonClient.cc