Sage Weil [Mon, 9 Mar 2020 13:28:57 +0000 (08:28 -0500)]
Merge PR #33793 into master
* refs/pull/33793/head:
qa/suites/rados/cephadm/upgrade: new start point
qa/tasks/cephadm: put bootstrap config etc directly in /etc/ceph
cephadm: shell: default to config and keyring in /etc/ceph, if present
Sage Weil [Mon, 9 Mar 2020 13:28:37 +0000 (08:28 -0500)]
Merge PR #33808 into master
* refs/pull/33808/head:
mgr/cephadm: apply: fill in default placement if none is provided
mgr/cephadm: make placement truly optional (default to count=1)
mgr/cephadm: allow count == 0
mgr/cephadm: remove magic labels
Kefu Chai [Mon, 9 Mar 2020 03:48:07 +0000 (11:48 +0800)]
crimson/mgr: close() in background
As per Yingxin, application code is not required to wait for the
`close()` future; it is safe to ignore it, because:
- `close()` shuts down its socket synchronously;
- `close()` creates an internal `ConnectionRef` while it is closing;
- `Messenger` waits for all connections to be closed during `shutdown()`.
Chunsong Feng [Thu, 19 Dec 2019 09:32:09 +0000 (17:32 +0800)]
os/bluestore/spdk: Fix the overflow error of parsing spdk coremask
The coremask supports up to 256 bits in DPDK 19.05, but the use of stoll in
NVMEManager::try_get limits it to 64 bits. Parse the coremask one hex
character at a time, from low to high.
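A minimal sketch of the per-character approach described above, written in Python for illustration only (the actual fix lives in the C++ NVMEManager code). The function name `parse_coremask` and the optional "0x" prefix handling are assumptions, not taken from the patch. Python integers are not limited to 64 bits, so the loop is shown only to mirror the low-to-high walk over the hex digits.

```python
from typing import List


def parse_coremask(mask: str) -> List[int]:
    """Return the core indices set in a hexadecimal coremask string.

    Hypothetical helper: it walks the mask one hex character at a time,
    starting from the least significant (rightmost) digit, so the width
    of the mask is not bounded by a 64-bit integer conversion.
    """
    digits = mask.strip().lower()
    if digits.startswith("0x"):  # assumption: a "0x" prefix may be present
        digits = digits[2:]
    if not digits:
        raise ValueError("empty coremask")

    cores = []
    for pos, ch in enumerate(reversed(digits)):
        nibble = int(ch, 16)  # raises ValueError on a non-hex character
        for bit in range(4):
            if nibble & (1 << bit):
                cores.append(pos * 4 + bit)
    return cores
```

For example, a mask whose only set bit is bit 64 (a "1" followed by sixteen zeros) is handled the same way as a short mask, which is the case a 64-bit conversion cannot represent.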
Fixes: https://tracker.ceph.com/issues/43044
Signed-off-by: Hu Ye <yehu5@huawei.com>
Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Signed-off-by: luo rixin <luorixin@huawei.com>
Sage Weil [Mon, 9 Mar 2020 00:57:06 +0000 (19:57 -0500)]
Merge PR #33804 into master
* refs/pull/33804/head:
cephadm: ls: warn if daemon type (version) is not supported
cephadm: report grafana version
cephadm: report prometheus, node-exporter, alertmanager versions
cephadm: use None (not '<no value>') for monitoring daemon version
Sage Weil [Sun, 8 Mar 2020 22:29:00 +0000 (17:29 -0500)]
Merge PR #33792 into master
* refs/pull/33792/head:
doc/cephadm: fix formatting for osd section
doc/cephadm: update 'adding mons' section to suggest/prefer 'apply'
doc/cephadm: fix formatting, typos
mgr/cephadm: implement apply_mon
mgr/cephadm: allow mon creation without explicit ip or addr
mgr/cephadm: allow _apply_service to delete mon daemon's data
mgr/cephadm: remove mon from monmap before removing mon
mgr/cephadm: do not remove mon if it breaks quorum
Sage Weil [Sun, 8 Mar 2020 21:49:38 +0000 (16:49 -0500)]
Merge PR #33802 into master
* refs/pull/33802/head:
mgr/cephadm: sanity check upgrade version
mgr/cephadm: only need to invalidate once here
mgr/cephadm: upgrade requires root mode for now
Sage Weil [Sun, 8 Mar 2020 17:00:45 +0000 (12:00 -0500)]
mgr/cephadm: remove magic labels
Remove the magic label behavior. It makes the code confusing, it
makes the overall behavior hard to explain, and it makes the PlacementSpec
meaning different than what Rook is doing.
Instead, if you want mons on hosts with label 'mon', then say 'label:mon'.
Sage Weil [Fri, 6 Mar 2020 21:26:20 +0000 (15:26 -0600)]
qa/tasks/cephadm: put bootstrap config etc directly in /etc/ceph
This puts the conf and keyring in /etc/ceph earlier rather than later,
making them useful for debugging a live system *during* bootstrap. It's
also less code.
Sage Weil [Sat, 7 Mar 2020 19:45:16 +0000 (13:45 -0600)]
Merge PR #33706 into master
* refs/pull/33706/head:
qa/suites/rados/cephadm/upgrade: adjust starting version
mgr/orch: from_strings -> from_string; do not accept a list
mgr/volumes: pass placement as string, not list
qa/tasks/mgr/test_orchestrator_cli: adjust placement args
qa/tasks/cephadm: pass apply placement as a single arg
mgr/orch: PlacementSpec: allow 'count:123'
mgr/orch: PlacementSpec: make pretty_str() match input
mgr/orch: take single placement argument
mgr/orch: PlacementSpec.from_strings: take a string *or* a list
Xuehan Xu [Fri, 6 Mar 2020 10:55:07 +0000 (18:55 +0800)]
crimson: decouple mgr client reconnect and connect reset handling
As of now, the following invocation sequence triggers a deadlock when
closing crimson-osd's connection with mgr:
ProtocolV2::dispatch_reset() --> crimson::mgr::Client::ms_handle_reset
--> crimson::mgr::Client::reconnect --> crimson::net::SocketConnection::close
--> crimson::net::Protocol::close()
In the above invocation sequence, ProtocolV2::dispatch_reset() enters the gate
"pending_dispatch". Leaving that gate waits for crimson::net::Protocol::close()
to complete, which in turn waits for the gate's close() to complete.
Sage Weil [Tue, 3 Mar 2020 21:39:50 +0000 (15:39 -0600)]
mgr/orch: take single placement argument
This is maybe a wash on the 'ceph orch ...' portion of the CLI. However,
it means that elsewhere, like 'ceph fs volume ...', we can be consistent
and have placement be (1) optional and (2) a single arg so that it is
easier to use both positionally and as a flag (--placement=all:true).
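A rough sketch of what taking placement as one optional string could look like. This is not the real PlacementSpec code: the `Placement` class, the `parse_placement` function, and the whitespace-separated token format are all hypothetical, and only the 'count:N', 'label:X' and 'all:true' forms mentioned in this log are handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Placement:
    """Hypothetical stand-in for a placement spec (not the Ceph class)."""
    count: Optional[int] = None
    label: Optional[str] = None
    all_hosts: bool = False
    hosts: List[str] = field(default_factory=list)


def parse_placement(arg: Optional[str]) -> Placement:
    """Parse a single, optional placement argument.

    Assumption: tokens are separated by whitespace, and any token that is
    not 'count:N', 'label:X' or 'all:true' is treated as a host name.
    """
    if arg is None:
        return Placement()  # placement is optional
    if not isinstance(arg, str):
        raise TypeError("placement must be a single string, not a list")

    placement = Placement()
    for token in arg.split():
        key, sep, value = token.partition(":")
        if sep and key == "count":
            placement.count = int(value)
        elif sep and key == "label":
            placement.label = value
        elif sep and key == "all":
            placement.all_hosts = value.lower() == "true"
        else:
            placement.hosts.append(token)
    return placement
```

Because the argument is one string, the same value works positionally or as a flag value such as --placement=all:true.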
Sage Weil [Sat, 7 Mar 2020 03:19:49 +0000 (21:19 -0600)]
Merge PR #33700 into master
* refs/pull/33700/head:
mgr/cephadm: point dashboard at grafana automatically
doc/cephadm/monitoring: document process to set up monitoring with cephadm
Reviewed-by: Alexandra Settle <asettle@suse.com>
Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Sage Weil [Fri, 6 Mar 2020 17:26:47 +0000 (11:26 -0600)]
Merge PR #33614 into master
* refs/pull/33614/head:
mgr/cephadm: enable custom TLS certificates for grafana
mgr: enable verification of TLS certs without files
mgr/cephadm: dump config to JSON only once when creating daemons
Kefu Chai [Fri, 6 Mar 2020 04:17:40 +0000 (12:17 +0800)]
qa/tasks/ceph.py: quote "<kind>" in command line
Otherwise bash will interpret "kind" as a file when handling a command like
```
sudo zgrep <kind> /var/log/ceph/valgrind/* /dev/null | sort | uniq
```
and try to feed its content to zgrep, and write the output of zgrep
to /var/log/ceph/valgrind/*. This is not the intended behavior. What we
want to do is to pass "<kind>" as an argument to zgrep, along with
the globbed file names that match "/var/log/ceph/valgrind/*".
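One way to get this quoting from Python is the standard library's shlex.quote. The sketch below is illustrative and is not the code from this commit; the helper name `build_valgrind_grep` is hypothetical. The pattern is quoted so the shell does not treat "<" and ">" as redirections, while the glob is left unquoted so the shell still expands it.

```python
import shlex


def build_valgrind_grep(kind: str,
                        log_glob: str = "/var/log/ceph/valgrind/*") -> str:
    """Build the zgrep command line with the "<kind>" pattern quoted.

    Hypothetical helper for illustration only.
    """
    # shlex.quote wraps the pattern in single quotes because it contains
    # shell metacharacters, so zgrep receives it as a literal argument.
    pattern = shlex.quote("<{}>".format(kind))
    # The glob is deliberately left unquoted so the shell expands it.
    return "sudo zgrep {} {} /dev/null | sort | uniq".format(pattern, log_glob)
```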
Sage Weil [Fri, 6 Mar 2020 03:24:53 +0000 (21:24 -0600)]
mgr/cephadm: do not specify --image arg for non-ceph daemons; fix upgrade
If we are calling the cephadm script for a non-ceph daemon (prometheus,
etc), do not specify the --image argument, and do not pull it out of
the config db from sections that don't exist.
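The rule above could be sketched roughly as follows. The function name, the set of ceph daemon types, and the argument layout are assumptions for illustration and are not the mgr/cephadm implementation; only the --image flag comes from the commit message.

```python
from typing import List, Optional

# Assumption: an illustrative set of daemon types that run the ceph image.
CEPH_DAEMON_TYPES = {"mon", "mgr", "osd", "mds", "rgw"}


def build_cephadm_args(daemon_type: str,
                       command: str,
                       image: Optional[str] = None) -> List[str]:
    """Build a cephadm argument list (hypothetical helper).

    --image is only passed for ceph daemons; monitoring daemons such as
    prometheus get no --image argument at all.
    """
    args = []  # type: List[str]
    if daemon_type in CEPH_DAEMON_TYPES and image:
        args += ["--image", image]
    args.append(command)
    return args
```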
Sage Weil [Thu, 5 Mar 2020 16:42:26 +0000 (10:42 -0600)]
mgr/cephadm: make osd create on an existing LV idempotent
If we try to prepare an LV that was already prepared, ceph-volume will
return an error message and code. We want our osd create command to be
idempotent, though, so recognize the error string and continue.
This is an ugly hack, but quicker than changing ceph-volume behavior, and
it is sufficient to stop all of the teuthology failures.
The second part of this is that we have to deploy the daemon on OSDs that
are already prepared and already exist in our osdmap beforehand, but have
never started.
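The error-string check described above might look something like this sketch. The marker text, the function name, and the callable-based interface are all hypothetical; the commit message does not quote the actual ceph-volume error string.

```python
from typing import Callable, Tuple

# Hypothetical marker: the real ceph-volume message is not given here.
ALREADY_PREPARED_MARKER = "already prepared"


def prepare_idempotent(run_prepare: Callable[[], Tuple[int, str]]) -> bool:
    """Run a prepare step and tolerate the "already prepared" failure.

    run_prepare is assumed to return (exit_code, stderr_text).
    Returns True if the LV was newly prepared, False if it had already
    been prepared; any other failure is raised.
    """
    code, err = run_prepare()
    if code == 0:
        return True
    if ALREADY_PREPARED_MARKER in err:
        # Recognized error: treat it as success so the command can be
        # repeated safely, then carry on with deploying the daemon.
        return False
    raise RuntimeError("prepare failed with code {}: {}".format(code, err))
```

Matching on an error string is fragile, which is the "ugly hack" tradeoff the message acknowledges.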
Works-around: https://tracker.ceph.com/issues/44313
Signed-off-by: Sage Weil <sage@redhat.com>