Kefu Chai [Wed, 25 Mar 2020 01:54:04 +0000 (09:54 +0800)]
qa/suites/perf-basic: only test on bionic
centos8/rhel8 no longer package collectl or pdsh, but CBT requires
these packages for collecting performance stats. So instead of testing
on all supported distros, run the perf tests only on distros that offer
these packages.
Sage Weil [Tue, 24 Mar 2020 20:38:04 +0000 (15:38 -0500)]
Merge PR #34085 into master
* refs/pull/34085/head:
debian: add ceph-grafana-dashboards package
ceph.spec: put prometheus alerts in vendor-neutral location
mgr/cephadm: include prom alerts, if present in the container
Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
"The default values are handled by mgr_module.py's _get_module_option();
the or here means that we break any non-true (0, false, none) value and
override it with the default."
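The pitfall described in the quote above can be illustrated with a minimal Python sketch. The names `DEFAULTS`, `get_option_with_or`, and `get_option_checked` are hypothetical and are not taken from mgr_module.py; the sketch only shows why `or` is unsafe when falsy values are legitimate settings.

```python
# Hypothetical illustration; not the actual mgr_module.py implementation.
DEFAULTS = {"retries": 3, "verbose": True}


def get_option_with_or(stored, key):
    # Problematic pattern: `or` replaces every falsy value (0, False, None)
    # with the default, so an explicitly configured 0 or False is lost.
    return stored.get(key) or DEFAULTS[key]


def get_option_checked(stored, key):
    # Safer pattern: fall back to the default only when the option is unset.
    value = stored.get(key)
    return DEFAULTS[key] if value is None else value


stored = {"retries": 0, "verbose": False}
print(get_option_with_or(stored, "retries"))   # the explicit 0 is overridden
print(get_option_checked(stored, "retries"))   # the explicit 0 is preserved
```

Under these assumptions, the first call returns the default (3) even though 0 was stored, while the second returns the stored 0.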
Nathan Cutler [Mon, 23 Mar 2020 15:46:02 +0000 (16:46 +0100)]
script/ceph-backport.sh: set target_branch in API case
When falling back to the GitHub API to determine the milestone
number, we were not initializing target_branch, so the script was
broken for octopus backports.
Kiefer Chang [Tue, 10 Mar 2020 11:43:42 +0000 (19:43 +0800)]
mgr/orch: allow list daemons by service_name
Services like rgw and mds are differentiated by service_name. For
example: mds.xyz vs. mds.abc. With the current interface, we can't list
only the daemons belonging to mds.xyz. Add service_name as a new argument
to filter daemons by it.
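A minimal sketch of the filtering idea, assuming a simplified daemon record. The `Daemon` class and `list_daemons` function here are hypothetical stand-ins, not the orchestrator's real types or signatures.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Daemon:
    # Hypothetical, simplified record; the real orchestrator type differs.
    name: str          # e.g. "mds.xyz.host1"
    service_name: str  # e.g. "mds.xyz"


def list_daemons(daemons: Iterable[Daemon],
                 service_name: Optional[str] = None) -> List[Daemon]:
    # Without a filter, return every daemon (the previous behaviour).
    if service_name is None:
        return list(daemons)
    # With a filter, keep only daemons that belong to the given service.
    return [d for d in daemons if d.service_name == service_name]


daemons = [
    Daemon("mds.xyz.host1", "mds.xyz"),
    Daemon("mds.xyz.host2", "mds.xyz"),
    Daemon("mds.abc.host1", "mds.abc"),
]
for d in list_daemons(daemons, service_name="mds.xyz"):
    print(d.name)
```

Making the argument optional keeps existing callers working, since omitting it lists all daemons as before.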
Sage Weil [Tue, 24 Mar 2020 01:00:58 +0000 (20:00 -0500)]
Merge PR #34115 into octopus
* refs/pull/34115/head:
doc/releases/octopus: drop stray line
doc/releases/octopus: note about repository locations
doc/releases: include octopus in index
doc/install/get-packages: update package install instructions
doc/releases/octopus: final notes
Neha [Sun, 22 Mar 2020 20:01:23 +0000 (20:01 +0000)]
qa/*/osd-backfill-recovery-log.sh: flush_pg_stats before checking log length
It is possible for the pg dump to not be the latest when we check for newprimary
in _common_test(). This is because mgr_stats_period is 5 seconds, and we may not
have fetched the latest stats just yet. This causes the test to look at the same
stats before and after wait_for_clean.
Sebastian Wagner [Mon, 23 Mar 2020 13:27:51 +0000 (14:27 +0100)]
mgr/cephadm: Add example to run when debugging ssh failures
```
$ ceph orch host add foobar
Error ENOENT: Failed to connect to foobar (foobar). Check that the host is reachable and accepts connections using the cephadm SSH key
you may want to run:
> ssh -F =(ceph cephadm get-ssh-config) -i =(ceph config-key get mgr/cephadm/ssh_identity_key) rook@foobar
$ ssh -F =(ceph cephadm get-ssh-config) -i =(ceph config-key get mgr/cephadm/ssh_identity_key) rook@foobar
ssh: Could not resolve hostname foobar: Temporary failure in name resolution
```
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sage Weil [Mon, 23 Mar 2020 13:24:06 +0000 (08:24 -0500)]
Merge PR #34105 into master
* refs/pull/34105/head:
Merge PR #34042 into octopus
Merge PR #33959 into octopus
Merge PR #34067 into octopus
mgr/DaemonServer: add explicit check that acting matches for merge
Merge pull request #34040 from dillaman/wip-44396-partial-fix
Merge PR #34098 into octopus
mgr/rook: list rgw services
mgr/rook: tolerate timestamps that are None
mgr/orch: add 'subcluster' property to RGWSpec
mgr/rook: do not create radosgw pools
mgr/rook: refactor apply/add for rgw
Merge PR #34082 into octopus
Merge PR #34068 into octopus
cephadm: relabel /etc/ganesha mount
Merge PR #34046 into octopus
Merge PR #34092 into octopus
Merge pull request #33719 from ukernel/wip-44416
rbd-mirror: leader watcher should not cancel get locker if locker is invalid
rbd-mirror: snapshot sync request needs to check for interruption
librbd: request exclusive lock when moving to trash
rbd-mirror: basic integration with sync throttling
rbd-mirror: don't prematurely finish snapshot replay loop
rbd-mirror: pass InstanceWatcher to snapshot Replayer
doc/releases/octopus.rst: add note about ec recovery below min_size
mgr/cephadm: configure rgw_frontends for rgw service
cephadm: switch grafana image to the ceph repo
Merge PR #34034 into octopus
qa/suites/rados/cephadm/upgrade: update starting version
Merge PR #33540 into octopus
Merge PR #34023 into octopus
Merge PR #34044 into octopus
Merge PR #34030 into octopus
doc/orchestrator: update rgw creation
mgr/cephadm: clean up client.crash.* container_image settings after upgrade
cephadm: make add-repo --release and --version independent
cephadm: env over last used
mgr/orch: accept port and ssl flags to 'apply rgw'
mgr/orch: 'ceph upgrade ...' -> 'ceph orch upgrade ...'
cephadm: fall back to default for infer_image
cephadm: remove outdated check
cephadm: consolidate default image logic
remove ceph_test_rados_watch_notify
python-common/ceph/deployment/service_spec: add ssl to RGWSpec
cephadm: only infer image for shell, run, inspect-image, pull, ceph-volume
mgr/test_orchestrator: fix service filtering when using dummy data
mgr/dashboard: fix adding/removing host errors
mgr/rook: fix 'orch ps' for osds
qa: fix all the fsx.sh-invoking yaml files to install dependencies
mds: pass proper MutationImpl::LockOp to Locker::wrlock_start()
Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
The information about SocketConnection::side and
SocketConnection::ephemeral_port is not up-to-date in the log, because
these fields are not moved with Socket during connection replacement.
They are actually socket-level information.
Kefu Chai [Sat, 21 Mar 2020 12:18:50 +0000 (20:18 +0800)]
crimson/admin: do not reset connected_sock before closing
* no need to discard_result(), as `output_stream::close()` returns an
empty future<> already
* free the connected socket after the background task finishes, because
we should not free the connected socket before the promise referencing it is fulfilled.
Otherwise we get error messages from ASan, like
==287182==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000019aa0 at pc 0x55e2ae2de882 bp 0x7fff7e2bf080 sp 0x7fff7e2bf078
READ of size 8 at 0x611000019aa0 thread T0
#0 0x55e2ae2de881 in seastar::reactor_backend_aio::await_events(int, __sigset_t const*) ../src/seastar/src/core/reactor_backend.cc:396
#1 0x55e2ae2dfb59 in seastar::reactor_backend_aio::reap_kernel_completions() ../src/seastar/src/core/reactor_backend.cc:428
#2 0x55e2adbea397 in seastar::reactor::reap_kernel_completions_pollfn::poll() (/var/ssd/ceph/build/bin/crimson-osd+0x155e9397)
#3 0x55e2adaec6d0 in seastar::reactor::poll_once() ../src/seastar/src/core/reactor.cc:2789
#4 0x55e2adae7cf7 in operator() ../src/seastar/src/core/reactor.cc:2687
#5 0x55e2adb7c595 in __invoke_impl<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:60
#6 0x55e2adb699b0 in __invoke_r<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:113
#7 0x55e2adb50222 in _M_invoke /usr/include/c++/10/bits/std_function.h:291
#8 0x55e2adc2ba00 in std::function<bool ()>::operator()() const /usr/include/c++/10/bits/std_function.h:622
#9 0x55e2adaea491 in seastar::reactor::run() ../src/seastar/src/core/reactor.cc:2713
#10 0x55e2ad98f1c7 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) ../src/seastar/src/core/app-template.cc:199
#11 0x55e2a9e57538 in main ../src/crimson/osd/main.cc:148
#12 0x7fae7f20de0a in __libc_start_main ../csu/libc-start.c:308
#13 0x55e2a9d431e9 in _start (/var/ssd/ceph/build/bin/crimson-osd+0x117421e9)
0x611000019aa0 is located 96 bytes inside of 240-byte region [0x611000019a40,0x611000019b30)
freed by thread T0 here:
#0 0x7fae80a4e487 in operator delete(void*, unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.6+0xac487)
#1 0x55e2ae302a0a in seastar::aio_pollable_fd_state::~aio_pollable_fd_state() ../src/seastar/src/core/reactor_backend.cc:458
#2 0x55e2ae2e1059 in seastar::reactor_backend_aio::forget(seastar::pollable_fd_state&) ../src/seastar/src/core/reactor_backend.cc:524
#3 0x55e2adab9b9a in seastar::pollable_fd_state::forget() ../src/seastar/src/core/reactor.cc:1396
#4 0x55e2adab9d05 in seastar::intrusive_ptr_release(seastar::pollable_fd_state*) ../src/seastar/src/core/reactor.cc:1401
#5 0x55e2ace1b72b in boost::intrusive_ptr<seastar::pollable_fd_state>::~intrusive_ptr() /opt/ceph/include/boost/smart_ptr/intrusive_ptr.hpp:98
#6 0x55e2ace115a5 in seastar::pollable_fd::~pollable_fd() ../src/seastar/include/seastar/core/internal/pollable_fd.hh:109
#7 0x55e2ae0ed35c in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
#8 0x55e2ae0ed3cf in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
#9 0x55e2ae0ed943 in std::default_delete<seastar::net::api_v2::server_socket_impl>::operator()(seastar::net::api_v2::server_socket_impl*) const /usr/include/c++/10/bits/unique_ptr.h:81
#10 0x55e2ae0db357 in std::unique_ptr<seastar::net::api_v2::server_socket_impl, std::default_delete<seastar::net::api_v2::server_socket_impl> >::~unique_ptr() /usr/include/c++/10/bits/unique_ptr.h:357
#11 0x55e2ae1438b7 in seastar::api_v2::server_socket::~server_socket() ../src/seastar/src/net/stack.cc:195
#12 0x55e2aa1c7656 in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_destroy() /usr/include/c++/10/optional:260
#13 0x55e2aa16c84b in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_reset() /usr/include/c++/10/optional:280
#14 0x55e2ac24b2b7 in std::_Optional_base_impl<seastar::api_v2::server_socket, std::_Optional_base<seastar::api_v2::server_socket, false, false> >::_M_reset() /usr/include/c++/10/optional:432
#15 0x55e2ac23f37b in std::optional<seastar::api_v2::server_socket>::reset() /usr/include/c++/10/optional:975
#16 0x55e2ac21a2e7 in crimson::admin::AdminSocket::stop() ../src/crimson/admin/admin_socket.cc:265
#17 0x55e2aa099825 in operator() ../src/crimson/osd/osd.cc:450
#18 0x55e2aa0d4e3e in apply ../src/seastar/include/seastar/core/apply.hh:36
Sage Weil [Sun, 22 Mar 2020 23:32:11 +0000 (18:32 -0500)]
Merge PR #34042 into octopus
* refs/pull/34042/head:
mgr/rook: list rgw services
mgr/rook: tolerate timestamps that are None
mgr/orch: add 'subcluster' property to RGWSpec
mgr/rook: do not create radosgw pools
mgr/rook: refactor apply/add for rgw
mgr/cephadm: configure rgw_frontends for rgw service
mgr/orch: accept port and ssl flags to 'apply rgw'
python-common/ceph/deployment/service_spec: add ssl to RGWSpec
mgr/rook: fix 'orch ps' for osds
Kefu Chai [Sat, 21 Mar 2020 06:07:40 +0000 (14:07 +0800)]
cephadm: init config and keyring with None
We should not assume that both `config` and `keyring` are specified
when calling this method, because, for instance, `create_daemon_dirs()`
does handle the case where `config` and/or `keyring` is not specified.
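A minimal sketch of the pattern, assuming a simplified helper. `write_daemon_files` is a hypothetical function that returns the files it would write; it is not cephadm's actual code and does not reproduce the signature of `create_daemon_dirs()`.

```python
from typing import Dict, Optional


def write_daemon_files(config: Optional[str] = None,
                       keyring: Optional[str] = None) -> Dict[str, str]:
    # Hypothetical helper: both arguments default to None, so callers may
    # pass either one, both, or neither without raising an error.
    files: Dict[str, str] = {}
    if config is not None:
        files["config"] = config
    if keyring is not None:
        files["keyring"] = keyring
    return files


print(write_daemon_files())                                    # neither given
print(write_daemon_files(config="[global]\nfsid = example\n")) # config only
```

Checking `is not None` rather than truthiness keeps an explicitly passed empty string distinct from an omitted argument.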