Sage Weil [Thu, 19 Mar 2020 00:04:14 +0000 (19:04 -0500)]
mgr/progress: fix duration strings
- simplify the code to just calculate the durations when we need them
(I'm not sure why we had those temporary strings!)
- use a nicer time delta format
Fixes: https://tracker.ceph.com/issues/44672
Signed-off-by: Sage Weil <sage@redhat.com>
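A minimal sketch of the idea described in this commit: compute the duration from stored timestamps at the point of use instead of caching a temporary string. This is not the actual mgr/progress code; `Event`, `duration_str`, `format_delta`, and the output format are hypothetical names and choices made for illustration.

```python
def format_delta(seconds: float) -> str:
    """Render a duration compactly, e.g. '42s', '2m 5s', or '1h 2m' (assumed format)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}h {minutes}m"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"


class Event:
    """Hypothetical progress event that stores only its start timestamp."""

    def __init__(self, started_at: float):
        self.started_at = started_at

    def duration_str(self, now: float) -> str:
        # Calculated when needed; no cached string to go stale.
        return format_delta(now - self.started_at)
```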
Jason Dillaman [Wed, 18 Mar 2020 16:54:16 +0000 (12:54 -0400)]
qa/workunits/rbd: use context managers to control Rados lifespan
There is a potential race between Python shutting down, after the
expected exceptions are thrown, and the librados background
threads. Ensure that librados is properly shut down prior to
exiting Python.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
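The pattern can be sketched as follows. `FakeCluster` is a stand-in written for this example (it is not the librados binding); it only mimics a connect/shutdown lifecycle so the snippet runs without a cluster. The point is that the `with` block guarantees shutdown even when the expected exception is raised.

```python
class FakeCluster:
    """Stand-in for a cluster handle with a connect/shutdown lifecycle (hypothetical)."""

    def __init__(self):
        self.connected = False
        self.shutdown_called = False

    def __enter__(self):
        self.connected = True
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always shut down, whether or not the body raised.
        self.shutdown()
        return False  # do not swallow the exception

    def shutdown(self):
        self.connected = False
        self.shutdown_called = True

    def open_missing_pool(self):
        # Simulates an operation that the test expects to fail.
        raise RuntimeError("pool not found")


def run_expected_failure(cluster) -> bool:
    """Return True if the expected exception was raised inside the managed block."""
    try:
        with cluster as c:
            c.open_missing_pool()
    except RuntimeError:
        return True
    return False
```

By the time the exception reaches the caller, `shutdown()` has already run, so no background work is left racing with interpreter exit.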
Sage Weil [Wed, 18 Mar 2020 14:45:16 +0000 (09:45 -0500)]
Merge PR #33981 into octopus
* refs/pull/33981/head:
doc/install: edits
doc/cephadm: more edits
doc/cephadm/install: edits
doc/cephadm/adoption: improvements
doc/cephadm/install: a few edits
doc/cephadm/install: do not install ceph-common on host (by default)
doc/cephadm: drop os recs link
doc/cephadm/upgrade: improvements
doc/cephadm/upgrade: document upgrade
doc/cephadm/install: revamp install docs
doc: reorganize cephadm docs
doc/cephadm/administration: update docs on customizing SSH config
doc/cephadm/administration: add a note about the 'removed' dir
Venky Shankar [Tue, 14 Jan 2020 09:13:16 +0000 (04:13 -0500)]
mgr/volumes: introduce 'canceled' state in clone op state machine
When fetching the next execution state, -EINTR jumps to 'canceled'
state signifying a canceled (interrupted) operation. Also include
a helper routine to check if a given state machine is in initial
state.
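A rough sketch of such a state machine, assuming that -EINTR cancels the operation from any state. Apart from 'canceled', the state names, the transition table, and the helper names `next_state` and `is_init_state` are assumptions for illustration and are not taken from mgr/volumes.

```python
import errno

INIT_STATE = "pending"  # assumed name of the initial state

# (current state, return value) -> next state; illustrative only.
TRANSITIONS = {
    ("pending", 0): "in-progress",
    ("in-progress", 0): "complete",
    ("in-progress", -errno.EIO): "failed",
}


def next_state(current: str, ret: int) -> str:
    """Fetch the next execution state for the given return value."""
    if ret == -errno.EINTR:
        # An interrupted operation is treated as canceled.
        return "canceled"
    try:
        return TRANSITIONS[(current, ret)]
    except KeyError:
        raise ValueError(f"no transition from {current!r} on {ret}")


def is_init_state(state: str) -> bool:
    """Helper: check whether the state machine is in its initial state."""
    return state == INIT_STATE
```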
Sage Weil [Sun, 15 Mar 2020 13:45:46 +0000 (08:45 -0500)]
doc: reorganize cephadm docs
- reorganized cephadm into a top-level item with a series of sub-items.
- condensed the 'install' page so that it doesn't create a zillion items
in the toctree on the left
- started updating the cephadm/install sequence (incomplete)
Sage Weil [Tue, 17 Mar 2020 20:03:32 +0000 (15:03 -0500)]
mgr/balancer: tolerate pgs outside of target weight map
We build a target weight map based on the primary crush weights, and
ignore weights that are 0. However, it's possible that existing PGs are
on other OSDs that have weight 0 because the weight-set weight is >0.
That leads to a KeyError exception when we do
pgs_by_osd[osd] += 1
and the key isn't present. Fix by simply populating those keys as we
encounter OSDs. Drop the old initialization loop. The net of this is
we may have OSDs outside of target_by_root (won't matter, as far as I can
tell) and we won't have keys for osds with weight 0 (also won't matter,
as far as I can tell).
Fixes: https://tracker.ceph.com/issues/42721
Signed-off-by: Sage Weil <sage@redhat.com>
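The fix described in this commit can be illustrated with a simplified sketch. The function name `count_pgs_by_osd` and the shape of `pg_up` (PG id mapped to a list of OSD ids) are assumptions for this example, not the balancer module's actual data structures. Keys are created as OSDs are encountered, so an OSD missing from the target weight map no longer raises KeyError.

```python
def count_pgs_by_osd(pg_up):
    """Count PGs per OSD, populating keys lazily instead of pre-initializing them."""
    pgs_by_osd = {}
    for osds in pg_up.values():
        for osd in osds:
            if osd not in pgs_by_osd:
                # Previously the keys came from an initialization loop over the
                # target weight map, so an OSD outside it caused a KeyError here.
                pgs_by_osd[osd] = 0
            pgs_by_osd[osd] += 1
    return pgs_by_osd
```

A consequence, as the commit message notes, is that OSDs holding no PGs get no key at all, so callers should not assume every OSD is present in the result.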
Sage Weil [Sat, 14 Mar 2020 21:35:07 +0000 (16:35 -0500)]
update default container images
- For tests, use bleeding-edge octopus branch
- For production defaults, use ceph/ceph:v15.2 tag
- For bootstrap, grab cephadm script from latest octopus branch
Sage Weil [Mon, 16 Mar 2020 22:36:43 +0000 (17:36 -0500)]
Merge PR #33952 into octopus
* refs/pull/33952/head:
qa/workunits/cephadm: --skip-mon-network when using 127.0.0.1
cephadm: add tests
qa/tasks/cephadm: pass -v to bootstrap
mgr/cephadm: only try to place mons on hosts matching public_network
mgr/cephadm: keep track of host networks, ips
cephadm: automatically infer mon public_network, if we can
cephadm: add list-networks command
Sage Weil [Mon, 16 Mar 2020 22:36:17 +0000 (17:36 -0500)]
Merge PR #33955 into octopus
* refs/pull/33955/head:
mgr/cephadm: respect 'unmanaged' flag in spec
mgr/orch: orch ls: show <no spec> or <unmanaged> as appropriate
mgr/orch: orch ls: rename SPEC -> PLACEMENT
mgr/orch: add 'unmanaged' property to ServiceSpec
mgr/orch: combine 'orch daemon add <type> ...' into one command
mgr/orch: combine 'orch apply <type> [<placement>]' into one command
Sage Weil [Fri, 13 Mar 2020 14:11:31 +0000 (09:11 -0500)]
mgr/cephadm: only try to place mons on hosts matching public_network
Only try to schedule new mons on hosts that match the configured
public_network, if any. If we do not have one configured, then don't
try to place new mons at all.
Note that there are other restrictions that ceph-mon supports that we
aren't considering here, public_network_interface in particular, which
might further limit which IPs we consider binding to.
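The placement rule can be sketched roughly as below, assuming a single public_network in CIDR form and a mapping of host names to their known IP addresses. The function name `hosts_matching_network` and the `host_ips` structure are hypothetical and do not reflect how mgr/cephadm stores host networks.

```python
import ipaddress


def hosts_matching_network(host_ips, public_network):
    """Return hosts that have at least one IP inside public_network.

    If no public_network is configured, return no candidates at all,
    mirroring the "don't try to place new mons" behavior described above.
    """
    if not public_network:
        return []
    network = ipaddress.ip_network(public_network)
    matches = []
    for host, ips in host_ips.items():
        # Membership is False when the address and network IP versions differ.
        if any(ipaddress.ip_address(ip) in network for ip in ips):
            matches.append(host)
    return matches
```

For example, with hosts `a` (10.0.0.5) and `b` (192.168.1.7) and a public_network of 10.0.0.0/24, only `a` would be considered for a new mon.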
Jason Dillaman [Mon, 16 Mar 2020 17:17:17 +0000 (13:17 -0400)]
librbd: defer event socket completion until after callback issued
A change post-Nautilus fixed an issue where multiple threads could
concurrently invoke callbacks to librbd clients. However, it also
introduced the potential that a callback hasn't yet fired by the
time the event socket is completed. This resulted in a crash of
fio under high-throughput testing since it expected both a callback
and the event socket completion, in that order.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Sage Weil [Fri, 13 Mar 2020 22:49:25 +0000 (17:49 -0500)]
cephadm: add-repo: add --version
Instead of --release octopus, which would get the latest octopus
version (whatever it might be), or possibly a repo with many build versions
inside, you can instead do --version 15.2.1 to get a repo with a
specific version and that version only.
Jason Dillaman [Mon, 16 Mar 2020 13:13:56 +0000 (09:13 -0400)]
rbd-mirror: hold lock while updating local image name
There is a potential for an independently scheduled status update to
request the local image name from the bootstrap state machine during its
initialization.
Fixes: https://tracker.ceph.com/issues/44391
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Sage Weil [Mon, 16 Mar 2020 12:44:59 +0000 (07:44 -0500)]
Merge PR #33980 into octopus
* refs/pull/33980/head:
cephadm: bootstrap: allow --output-dir
cephadm: do not infer image for bootstrap
cephadm: write output files to /etc/ceph by default
cephadm: verify the output files' containing directory exists
Kefu Chai [Mon, 16 Mar 2020 09:26:06 +0000 (17:26 +0800)]
doc/_static/css: fine tune the spacing
* `ul.simple > li` is also used on the front page, where the items are
now too sparse, so we need to restore the default spacing.
* `ul.simple > li > ul > li:last-child` is used to control the spacing
between nested unordered lists. We need more spacing there.
* `div.section > ul > li > p` is added to decrease the spacing of nested
list items.