crimson/monc: honor auth_result_t::canceled as the result of do_auth().
An attempt to `Connection::do_auth()` may finish in one of three states:
_success_, _failure_ and _cancellation_. Unfortunately, its callers were
missing the third, treating cancellation like a failure. This was the root
cause of the following failure at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-06_22:08:43-rados-master-distro-basic-smithi/6102605$ less ./remote/smithi204/log/ceph-osd.3.log.gz
...
WARN 2021-05-06 22:35:40,464 [shard 0] osd - ms_handle_reset
...
INFO 2021-05-06 22:35:40,465 [shard 0] monc - do_auth_single: connection closed
INFO 2021-05-06 22:35:40,465 [shard 0] ms - [osd.3(client) v2:172.21.15.204:6808/31418@57568 >> mon.? v2:172.21.15.204:3300/0] execute_connecting(): protocol aborted at CLOSING -- std::system_error (error crimson::net:6, protocol aborted)
...
ERROR 2021-05-06 22:35:40,465 [shard 0] osd - mon.osd.3 dispatch() ms_handle_reset caught exception: std::system_error (error crimson::net:3, negotiation failure)
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3909-g81233a18/rpm/el8/BUILD/ceph-17.0.0-3909-g81233a18/src/crimson/common/gated.h:36: crimson::common::Gated::dispatch(const char*, T&, Func&&) [with Func = crimson::mon::Client::ms_handle_reset(crimson::net::ConnectionRef, bool)::<lambda()>&; T = crimson::mon::Client]::<lambda(std::__exception_ptr::exception_ptr)>: Assertion `*eptr.__cxa_exception_type() == typeid(seastar::gate_closed_exception)' failed.
Aborting on shard 0.
Backtrace:
0# 0x00005618C973932F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F7BB592EB20 in /lib64/libpthread.so.0
4# gsignal in /lib64/libc.so.6
5# abort in /lib64/libc.so.6
6# 0x00007F7BB3F29B09 in /lib64/libc.so.6
7# 0x00007F7BB3F37DE6 in /lib64/libc.so.6
8# 0x00005618C9FF295C in ceph-osd
9# 0x00005618C3907313 in ceph-osd
10# 0x00005618CCA2F84F in ceph-osd
11# 0x00005618CCA34D90 in ceph-osd
12# 0x00005618CCBEC9BB in ceph-osd
13# 0x00005618CC744E9A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
daemon-helper: command crashed with signal 6
```
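The three-way result described above can be sketched as follows. This is a minimal illustration, not the actual crimson code: the enum mirrors the three states named in the commit message, while `reopen_outcome` and `handle_auth_result()` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in for the three outcomes named in the commit message.
enum class auth_result_t { success, failure, canceled };

// Hypothetical summary of what a caller does next.
enum class reopen_outcome { authenticated, aborted };

// Hypothetical caller-side handling: only a genuine failure becomes an
// error; a canceled attempt is reported without throwing.
reopen_outcome handle_auth_result(auth_result_t r) {
  switch (r) {
  case auth_result_t::success:
    return reopen_outcome::authenticated;
  case auth_result_t::canceled:
    return reopen_outcome::aborted;
  case auth_result_t::failure:
    throw std::runtime_error("negotiation failure");
  }
  throw std::logic_error("unexpected auth_result_t value");
}
```

Handling `canceled` as its own case keeps an aborted connection attempt from surfacing as an exception that the caller is not prepared to see.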
Sage Weil [Thu, 6 May 2021 14:33:00 +0000 (10:33 -0400)]
Merge PR #41107 into master
* refs/pull/41107/head:
mgr/cephadm: apply hostname/addr checks to 'orch host set-addr' too
mgr/cephadm: make 'host add' idempotent
mgr/cephadm: set host crush location based on HostSpec
python-common: add location property to HostSpec, + tests
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Kefu Chai [Thu, 6 May 2021 07:36:27 +0000 (15:36 +0800)]
doc/_ext: rewrite directive using ObjectDescription
This allows us to use a different scheme when defining an option.
Without this change, if two options in different mgr modules share the
same name, we cannot differentiate them. After this change, their ids
are prefixed with the module name.
Zac Dover [Thu, 6 May 2021 00:54:24 +0000 (10:54 +1000)]
doc/cephadm: rewrite "using customized con..."
This PR rewrites the text in "Using Customized
Container Images" so that it is just a bit
clearer, and it also formats the prompt in the
text correctly.
Kefu Chai [Tue, 4 May 2021 03:08:15 +0000 (11:08 +0800)]
doc/man/8/cephfs-shell: reformat options
* format global options using option directive
* fix the header, so man/conf.py is able to parse
the description
* define "Synopsis" section to be consistent with other manpages.
* drop references to the glossary using "term", as the manpage does not have
references to glossary entries.
Sage Weil [Tue, 4 May 2021 16:06:17 +0000 (11:06 -0500)]
ceph_test_rados_api_service: stop threads before asserting
Otherwise, if we assert, we'll hang here:
Thread 1 (Thread 0x7f74eba79580 (LWP 1688617)):
#0 0x00007f74eb2aa529 in futex_wait (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1 futex_wait_simple (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/nptl/futex-internal.h:135
#2 __pthread_cond_destroy (cond=0x7ffd642b4b30) at pthread_cond_destroy.c:54
#3 0x0000563ff2e5a891 in LibRadosService_StatusFormat_Test::TestBody (this=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:78
#4 0x0000563ff2e9dc3a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (location=0x563ff2ea72e4 "the test body", method=<optimized out>, object=0x563ff422a6d0)
at ./src/googletest/googletest/src/gtest.cc:2605
#5 testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0x563ff422a6d0, method=<optimized out>, location=location@entry=0x563ff2ea72e4 "the test body")
at ./src/googletest/googletest/src/gtest.cc:2641
#6 0x0000563ff2e908c3 in testing::Test::Run (this=0x563ff422a6d0) at ./src/googletest/googletest/src/gtest.cc:2680
#7 0x0000563ff2e90a25 in testing::TestInfo::Run (this=0x563ff41a3b70) at ./src/googletest/googletest/src/gtest.cc:2858
#8 0x0000563ff2e90ec1 in testing::TestSuite::Run (this=0x563ff41b6230) at ./src/googletest/googletest/src/gtest.cc:3012
#9 0x0000563ff2e92bdc in testing::internal::UnitTestImpl::RunAllTests (this=<optimized out>) at ./src/googletest/googletest/src/gtest.cc:5723
#10 0x0000563ff2e9e14a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (location=0x563ff2ea8728 "auxiliary test code (environments or event listeners)",
method=<optimized out>, object=0x563ff41a2d10) at ./src/googletest/googletest/src/gtest.cc:2605
#11 testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x563ff41a2d10, method=<optimized out>,
location=location@entry=0x563ff2ea8728 "auxiliary test code (environments or event listeners)") at ./src/googletest/googletest/src/gtest.cc:2641
#12 0x0000563ff2e90ae8 in testing::UnitTest::Run (this=0x563ff30c0660 <testing::UnitTest::GetInstance()::instance>) at ./src/googletest/googletest/src/gtest.cc:5306
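The ordering the commit title asks for (stop the threads first, assert afterwards) can be sketched as below. This is an illustration only and assumes nothing about the real test: `run_workers_and_collect()` and its spinning workers are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: starts `n` workers that spin until told to stop,
// then stops and joins all of them BEFORE returning the value that the
// caller will assert on.
int run_workers_and_collect(int n) {
  std::atomic<bool> stop{false};
  std::atomic<int> started{0};
  std::vector<std::thread> workers;
  for (int i = 0; i < n; ++i) {
    workers.emplace_back([&] {
      ++started;
      while (!stop.load()) {
        std::this_thread::yield();
      }
    });
  }
  // Wait until every worker is running, then record the observation.
  while (started.load() < n) {
    std::this_thread::yield();
  }
  const int observed = started.load();
  // Shut everything down first; a failing assertion afterwards can no
  // longer leave threads blocked on objects that are being destroyed.
  stop = true;
  for (auto& t : workers) {
    t.join();
  }
  return observed;
}
```

A caller would write `assert(run_workers_and_collect(4) == 4);`, so that the check runs only once no worker thread is still alive.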
Dennis Körner [Tue, 4 May 2021 15:59:59 +0000 (17:59 +0200)]
Add Rocky Linux to supported DISTRO_NAMES
Rocky Linux is a RHEL clone. I did a test installation of Ceph Pacific on Rocky Linux RC1 with cephadm. As far as I can see, everything works as expected.
Signed-off-by: Dennis Körner <koerner@netzwerge.de>
Kefu Chai [Tue, 4 May 2021 13:07:01 +0000 (21:07 +0800)]
cmake: let libglobal_obj depend on legacy-option-headers
to address the following build failure:
FAILED: src/global/CMakeFiles/libglobal_objs.dir/global_init.cc.o ...
src/global/CMakeFiles/libglobal_objs.dir/global_init.cc.o -MF src/global/CMakeFiles/libglobal_objs.dir/global_init.cc.o.d -o src/global/CMakeFiles/libglobal_objs.dir/global_init.cc.o -c
../src/global/global_init.cc
In file included from ../src/global/global_init.cc:26:
In file included from ../src/common/config.h:26:
In file included from ../src/common/config_values.h:59:
../src/common/options/legacy_config_opts.h:1:10: fatal error: 'global_legacy_options.h' file not found
^~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
librbd/mirror/snapshot: avoid UnlinkPeerRequest with an unlinked peer
CreatePrimaryRequest could create some UnlinkPeerRequest with an already
unlinked peer in a scenario where you have multiple peers. This request
will not remove the peer (as it's already not linked to the requested
peer) and will skip deletion of the mirror snapshot if another peer
remains. Eventually the code will go through an infinite recursive loop
between CreatePrimaryRequest and UnlinkPeerRequest and segfault.
This commit adds an extra condition to make sure to not submit a
UnlinkPeerRequest if the peer is not linked to the current snapshot. If
there is already no peer in the list it will submit a UnlinkPeerRequest
to remove the snapshot.
Fixes: https://tracker.ceph.com/issues/50439
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
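The extra condition described in the librbd/mirror/snapshot commit can be sketched as a small predicate. This is a simplified model under stated assumptions, not the librbd implementation: `MirrorSnapshot` and `should_submit_unlink()` are hypothetical names, and a snapshot is reduced to the set of peer UUIDs still linked to it.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified view of a mirror snapshot: only the set of
// peer UUIDs that are still linked to it.
struct MirrorSnapshot {
  std::set<std::string> linked_peers;
};

// Returns true when an unlink request should be submitted for `peer`.
bool should_submit_unlink(const MirrorSnapshot& snap,
                          const std::string& peer) {
  // No peers left: the request is still needed so that the snapshot
  // itself gets removed.
  if (snap.linked_peers.empty()) {
    return true;
  }
  // Otherwise submit only if the peer is actually linked; an unlinked
  // peer would make the request a no-op and restart the cycle.
  return snap.linked_peers.count(peer) > 0;
}
```

With a snapshot linked only to peer `"a"`, the predicate accepts `"a"` and rejects `"b"`, which is the case that previously led to the recursion.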
Merge pull request #40313 from jmolmo/purge_iscsi_config
mgr/cephadm: Remove gateway.conf from iscsi pool when service is removed
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Sage Weil <sage@newdream.net>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
* refs/pull/40962/head:
test: add test to validate snap synchronization with parent directory snapshots
cephfs-mirror: ignore parent directory snapshots when building snap map
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/40903/head:
test: add test case for MDS privated inos accessing
mds: make the lost+found dir accessible from clients
mds: move the inos 1,2 and 3 macros to ceph_fs.h
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/41120/head:
doc/_ext: ignore desc if it is unavailable
doc/_ext: check "default" for None
doc/_ext: print 0B if option value is 0
doc/cephfs: render options using confval directive
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
librbd/cache/pwl/ssd/WriteLog: don't crash on split log entries
write_log_entries() will split a log entry at the end of the log; the
remainder is written to the beginning at DATA_RING_BUFFER_OFFSET. On
the read side, aio_read_data_block() doesn't handle this case and just
crashes. Unless the workload in use is <= 4K, the image is rendered
unusable sooner or later.
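A read that copes with an entry split across the end of the log can be sketched as below. This is a toy model, not the pwl/ssd code: `ring_read()` is a hypothetical helper, the log is an in-memory string, and `ring_start` plays the role of DATA_RING_BUFFER_OFFSET.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical ring-buffer read. Bytes before `ring_start` are treated
// as a header and are never part of the data ring.
std::string ring_read(const std::string& log, std::size_t ring_start,
                      std::size_t pos, std::size_t len) {
  const std::size_t end = log.size();
  assert(pos >= ring_start && pos < end && len <= end - ring_start);
  // First piece: from `pos` up to the end of the log (or fewer bytes if
  // the entry does not reach the end).
  const std::size_t first = (len < end - pos) ? len : end - pos;
  std::string out = log.substr(pos, first);
  // Second piece: whatever remains was written at the start of the ring.
  out += log.substr(ring_start, len - first);
  return out;
}
```

For a log `"HHcdXXab"` with `ring_start` 2, reading 4 bytes from position 6 joins the tail `"ab"` with the wrapped remainder `"cd"`.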
Paul Cuzner [Thu, 8 Apr 2021 04:43:22 +0000 (16:43 +1200)]
mgr/dashboard: include compression stats on pool dashboard
This is a replacement dashboard configuration for the
pool overview page. It provides a cluster-wide view of
capacity consumed and compression effectiveness, and
breaks this down by each pool within the configuration.
Fixes: https://tracker.ceph.com/issues/50226
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
Ilya Dryomov [Sun, 2 May 2021 21:13:29 +0000 (23:13 +0200)]
qa/workunits/rbd: disable qemu-iotest test 055 globally
It doesn't work on Focal and is already disabled on CentOS 7 and 8. More
importantly, it doesn't actually test rbd -- it always tests "file", no
matter which protocol is specified in IMGPROTO.
Kefu Chai [Sun, 2 May 2021 11:57:27 +0000 (19:57 +0800)]
doc/_ext: ignore desc if it is unavailable
There is a chance that we don't have desc, desc_long or fmt_desc. In that
case, we should just skip desc before checking its length. So, just use
'if desc', which is able to check for None or an empty string.