Sage Weil [Tue, 4 May 2021 16:06:17 +0000 (11:06 -0500)]
ceph_test_rados_api_service: stop threads before asserting
Otherwise, if we assert, we'll hang here:
Thread 1 (Thread 0x7f74eba79580 (LWP 1688617)):
#0 0x00007f74eb2aa529 in futex_wait (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1 futex_wait_simple (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/nptl/futex-internal.h:135
#2 __pthread_cond_destroy (cond=0x7ffd642b4b30) at pthread_cond_destroy.c:54
#3 0x0000563ff2e5a891 in LibRadosService_StatusFormat_Test::TestBody (this=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:78
#4 0x0000563ff2e9dc3a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (location=0x563ff2ea72e4 "the test body", method=<optimized out>, object=0x563ff422a6d0)
at ./src/googletest/googletest/src/gtest.cc:2605
#5 testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0x563ff422a6d0, method=<optimized out>, location=location@entry=0x563ff2ea72e4 "the test body")
at ./src/googletest/googletest/src/gtest.cc:2641
#6 0x0000563ff2e908c3 in testing::Test::Run (this=0x563ff422a6d0) at ./src/googletest/googletest/src/gtest.cc:2680
#7 0x0000563ff2e90a25 in testing::TestInfo::Run (this=0x563ff41a3b70) at ./src/googletest/googletest/src/gtest.cc:2858
#8 0x0000563ff2e90ec1 in testing::TestSuite::Run (this=0x563ff41b6230) at ./src/googletest/googletest/src/gtest.cc:3012
#9 0x0000563ff2e92bdc in testing::internal::UnitTestImpl::RunAllTests (this=<optimized out>) at ./src/googletest/googletest/src/gtest.cc:5723
#10 0x0000563ff2e9e14a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (location=0x563ff2ea8728 "auxiliary test code (environments or event listeners)",
method=<optimized out>, object=0x563ff41a2d10) at ./src/googletest/googletest/src/gtest.cc:2605
#11 testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x563ff41a2d10, method=<optimized out>,
location=location@entry=0x563ff2ea8728 "auxiliary test code (environments or event listeners)") at ./src/googletest/googletest/src/gtest.cc:2641
#12 0x0000563ff2e90ae8 in testing::UnitTest::Run (this=0x563ff30c0660 <testing::UnitTest::GetInstance()::instance>) at ./src/googletest/googletest/src/gtest.cc:5306
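The ordering this commit describes can be sketched in Python. This is an analogous pattern, not the C++ test itself: the function name, the worker loop, and the polling interval are hypothetical. The point is that the condition is evaluated first, the background thread is stopped and joined, and only then is the assertion made, so a failing assertion never unwinds past a thread that is still running.

```python
import threading


def check_with_worker(condition):
    """Evaluate condition() while a worker runs; stop the worker, then assert."""
    stop = threading.Event()

    def worker():
        # Hypothetical background activity: idle until asked to stop.
        while not stop.wait(0.01):
            pass

    t = threading.Thread(target=worker)
    t.start()
    try:
        ok = condition()  # record the result, do not assert yet
    finally:
        stop.set()        # stop the thread first ...
        t.join()
    assert ok, "condition failed"  # ... and only then assert
    return t
```

Asserting inside the `try` block would still be safe here because of the `finally`, but keeping the assertion after the join makes the intended order explicit.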
Sage Weil [Wed, 28 Apr 2021 15:44:21 +0000 (10:44 -0500)]
Merge PR #40922 into pacific
* refs/pull/40922/head:
pybind/ceph_argparse: print --format flag name in help descs
mgr/cephadm: don't list non ceph daemons as needing upgrade in upgrade check
qa/tasks/cephadm: ignore --keep-logs failure
qa/tasks/cephadm: use yaml.dump_all()
qa/suites/rados/cephadm/smoke-*: use cephadm.wait_for_service
qa/tasks/cephadm: tear down clsuter before gathering logs
qa/suites/rados/cephadm/smoke-roleless: test rgw-ingress
mgr/cephadm: remove virtual_ip check during scheduling
mgr/orchestrator: orch ls: leave off virtual_ip prefixlen
qa/tasks/cephadm: add wait_for_service
qa/tasks/cephadm: allow skip_monitor_stack=true
qa/tasks/cephadm: do subst_vip for cephadm.shell and .apply
qa/tasks/vip: add vip task to allocate virtual IPs
qa/suites/rados/cephadm/smoke-roleless: add rgw-ingress test case
qa/tasks/cephadm: shell: take 'all-roles' or 'all-hosts'
qa/tasks/cephadm: let cephadm.shell take string or list
doc/cephadm: wrong command for single daemon events
mgr/cephadm: place maximum on placement count based on host count
mgr/cephadm: fix nfs-rgw stray daemon
mgr/cephadm: skip-ssh flag enables cephadm mgr module
mgr/cephadm: report exception during upgrade in upgrade status
qa/suites/rados/thrash: shorten radosbench
mgr/cephadm: remove old haproxy and keepalived templates
mgr/orchestrator: validate lists in spec jsons
python-common: Verify service spec is not None
python-common: Verify data_devices is not None
mgr/orchestrator: DG loads properly the unmanaged attribute
mgr/orchestractor: rgw realm and zone flags must both be provided
mgr/cephadm: make prometheus scrape ingress haproxy
doc/cephadm: remove big warning about stability
doc/cepham/compatibility: rgw-ha -> ingress; note possibility of breaking changes
doc/cephadm: rewrite "dry run" section in osd.rst
doc/cephadm: rewrite part of "deploy osds"
doc/cephadm: rewrite osd.rst "Remove an OSD"
doc/cephadm: rewrite osd.rst - list devices
doc/cephadm: break mon section into sections
doc/cephadm: rewrite "deploying add. mons"
doc: fixes for cephadm documentation
doc/cephadm: remove warning about cephadm in production
doc/cephadm: Add Compatibility with Podman Versions
doc/cephadm: rewrite "index.rst"
doc/cephadm: explicitly show host requirments in adding host section
mgr/cephadm: ingress: add optional virtual_interface_networks
doc/cephadm/rgw: clean up example spec
mgr/cephadm/services/ingress: less verbose about prepare_create
doc/cephadm/rgw: add note about which ethernet interface is used
cephadm: make keepalived unit fiddle sysctl settings
mgr/orchestrator: report external endpoints from 'orch ls'
mgr/orchestrator: drop - when no ports
doc/cephadm/rgw: update docs for ingress service
mgr/cephadm: use per_host_daemon feature in scheduler
cephadm: fix a typo
mgr/cephadm/schedule: add per_host_daemon_type support
mgr/cephadm: HA_RGW -> Ingress
mgr/cephadm: include daemon_type in DaemonPlacement
mgr/cephadm: update list-networks to report interface names too
mgr/orchestrator: streamline 'orch ps' PORTS formatting
mgr/cephadm/schedule: handle multiple ports per daemon
mgr/cephadm/utils: resolve_ip(): prefer IPv4
cephadm: cleanup extra slash in runtime dir
cephadm: use split cgroup strategy for podman
cephadm: use class to represent container engine
mgr/cephadm: don't cleanup the daemon keyring on failed redeploy
mgr/cephadm: fix orch host add with multiple labels and no addr
doc/cephadm: remove keepalived_user from haproxy docs
rpm: re-disable SUSE lttng build on z390x
ceph.spec.in: enable tcmalloc and lttng on s390x
pacific: mds: "cluster [WRN] Scrub error on inode 0x1000000039d (/client.0/tmp/blogbench-1.0/src/blogtest_in) see mds.a log and `damage ls` output for details"
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Thu, 15 Apr 2021 22:55:00 +0000 (17:55 -0500)]
qa/tasks/cephadm: tear down clsuter before gathering logs
We don't always stop all services, because teuthology doesn't know about
things it didn't start. Use rm-cluster to tear things down, but do not
remove the logs themselves. After we get logs, we'll clean up completely.
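A minimal sketch of the two-phase teardown described above, assuming the task shells out to `cephadm rm-cluster` and that `--keep-logs` is the flag that preserves the logs (the flag name appears in the merge list). The helper name and the argument layout are hypothetical and are not taken from qa/tasks/cephadm.

```python
def rm_cluster_args(cephadm_path, fsid, keep_logs):
    """Build an argv list for tearing down a cluster (hypothetical helper)."""
    args = ["sudo", cephadm_path, "rm-cluster", "--fsid", fsid, "--force"]
    if keep_logs:
        # First pass: stop and remove daemons but leave the logs in place.
        args.append("--keep-logs")
    return args


# Phase 1: tear down before gathering logs. Phase 2: clean up completely.
before_logs = rm_cluster_args("/usr/sbin/cephadm", "FSID", keep_logs=True)
after_logs = rm_cluster_args("/usr/sbin/cephadm", "FSID", keep_logs=False)
```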
mgr/orchestrator: DG loads properly the unmanaged attribute
Fixes: https://tracker.ceph.com/issues/49805
Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
(cherry picked from commit 0af4ad8614e426adf60eec32bd4b36974c5cb30b)
Zac Dover [Wed, 24 Mar 2021 15:47:17 +0000 (01:47 +1000)]
doc/cephadm: rewrite "dry run" section in osd.rst
This rewrites the "dry run" section of the "OSD Service"
chapter of the Cephadm documentation. This commit makes
minor changes that reduce the cognitive load of the
reader.
Zac Dover [Wed, 24 Mar 2021 14:39:01 +0000 (00:39 +1000)]
doc/cephadm: rewrite part of "deploy osds"
This reorganizes the section "Deploy OSDs"
in the "OSD Service" chapter of the Cephadm
Guide. Two new sections, "Listing Storage
Devices" and "Creating New OSDs" gather
information under headings in a sensible way,
making the information more accessible to someone
skimming this Guide.
Zac Dover [Sun, 28 Mar 2021 19:23:08 +0000 (05:23 +1000)]
doc/cephadm: rewrite osd.rst "Remove an OSD"
This commit rewrites the entire "Remove an OSD"
section of the "OSD Service" chapter of the
cephadm book.
I got carried away and didn't break this one into
four smaller PRs, and I'm sorry in advance to
whoever ends up reviewing this. I'll break "Advanced
OSD Service Specifications", the next section in the
queue, into multiple sections.
Zac Dover [Mon, 15 Mar 2021 15:03:06 +0000 (01:03 +1000)]
doc/cephadm: break mon section into sections
This PR breaks the "Deploy Additional Monitors" section
of the cephadm documentation into several subsections
whose titles spotlight the matter under discussion in
those respective subsections.
inb4: Another PR is on deck that rewrites the sentences
in this chapter of the cephadm documentation. I'd like
to get this chapter broken up into these subsections before
I rewrite those sentences. So I'm hoping for no grammatical
mission creep on this one. The grammar and clarity updates
are coming.
Jeff Layton [Fri, 29 Jan 2021 19:15:26 +0000 (14:15 -0500)]
doc: fixes for cephadm documentation
Be sure to note that python 3 is a prerequisite. Minimal centos 8
installs don't have it, for instance.
Also, we probably don't want to hardcode an octopus URL into the
suggested curl command. Change it to fill that in with
"|stable-release|", which should always point to the latest released
version name.
Fixes: https://tracker.ceph.com/issues/49806
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit bf69cdc68970789a7410928bd8a1af34d0d9b6a2)
It may be that the virtual IP we want to use is not in the same network
as any existing IPs on the host. In that case, allow the spec to specify
a list of networks to match against existing IPs so that a match will
identify an ethernet interface to use.
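The matching described above can be sketched as follows. This is an illustration under assumptions, not the mgr/cephadm code: the function name is hypothetical, and the host network data is assumed to be a mapping of subnet to interface name to IP list.

```python
import ipaddress


def pick_interface(virtual_ip, host_networks, virtual_interface_networks=()):
    """Return an interface name for virtual_ip, or None if nothing matches.

    host_networks is assumed to look like {subnet: {ifname: [ips]}}.
    """
    vip = ipaddress.ip_address(virtual_ip)
    # Preferred: the virtual IP lies inside a network the host already has.
    for subnet, ifaces in host_networks.items():
        if ifaces and vip in ipaddress.ip_network(subnet):
            return next(iter(ifaces))
    # Fallback: the spec names networks whose interface should be used.
    for subnet, ifaces in host_networks.items():
        if ifaces and subnet in virtual_interface_networks:
            return next(iter(ifaces))
    return None


nets = {
    "10.0.0.0/24": {"eth0": ["10.0.0.5"]},
    "192.168.1.0/24": {"eth1": ["192.168.1.5"]},
}
```

With this data, a virtual IP of 172.16.0.10 matches no host network on its own; passing `["192.168.1.0/24"]` as the extra network list selects `eth1`.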
胡玮文 [Thu, 11 Mar 2021 04:43:34 +0000 (12:43 +0800)]
cephadm: use split cgroup strategy for podman
Since systemd will create a cgroup for each service, we can instruct podman to
just split the current cgroup into sub-cgroups. This enables system admins to
use resource control features from systemd.
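A rough sketch of how a caller might opt in to this, assuming podman's `--cgroups=split` option is what implements the split strategy. The helper name, the version gate, and the `(2, 1, 0)` threshold are assumptions made for illustration, not values taken from this commit.

```python
def podman_run_args(image, podman_version, split_since=(2, 1, 0)):
    """Build a podman argv list, adding --cgroups=split when supported.

    split_since is an assumed minimum version, not confirmed by the commit.
    """
    args = ["podman", "run", "--rm"]
    if podman_version >= split_since:
        # Let podman subdivide the cgroup systemd created for the service.
        args.append("--cgroups=split")
    args.append(image)
    return args


new_args = podman_run_args("example/image", (3, 0, 1))
old_args = podman_run_args("example/image", (1, 9, 3))
```

Because the container then stays inside the unit's cgroup, systemd resource settings applied to the unit would cover the container processes as well.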