Sage Weil [Thu, 18 Mar 2021 20:11:38 +0000 (16:11 -0400)]
Merge PR #40048 into master
* refs/pull/40048/head:
mgr/cephadm: stop conflicting daemon when deploying to a specific port
mgr/cephadm: make DaemonPlacement print nicer
mgr/cephadm: fix --force remove comment
mgr/cephadm/schedule: choose an IP from a subnet list
mgr/cephadm: rgw: clean up config and config-key values on removal
mgr/cephadm: rgw: drop .crt extension when storing cert in config-key
mgr/cephadm/services: allow beast/civetweb to bind to a particular IP
python-common: add 'networks' property to ServiceSpec
mgr/cephadm/schedule: match placement ip only combination with port
Reviewed-by: Sebastian Wagner <swagner@suse.com> Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
qa/tasks: Add additional wait_for_clean() check in lost_unfound tasks.
At the end of the lost_unfound tests add an additional wait_for_clean()
check to ensure that recoveries get enough time to complete before
proceeding and avoid failures down the line. For e.g. failure like
"Scrubbing terminated -- not all pgs were active and clean." is because
recoveries on the PGs did not get sufficient time to complete even though
they were bound to eventually complete.
Sage Weil [Wed, 17 Mar 2021 19:39:15 +0000 (15:39 -0400)]
mgr/cephadm: stop conflicting daemon when deploying to a specific port
If we are deploying a daemon to bind to a specific port and there is
an existing daemon we are removing that also binds to that port, stop
it first. Unless we are both binding to different IPs.
This resolves the case where daemons bind to * and we redeploy with a
subnet to bind to. It would eventually converge before, but would
throw a bind error in the process and take longer.
Sage Weil [Wed, 17 Mar 2021 19:50:50 +0000 (15:50 -0400)]
Merge PR #40160 into master
* refs/pull/40160/head:
qa/suites/rados/cephadm/orchestrator_cli: random-distro$ -> 0-random-distro$
qa/suites/rados/cephadm/smoke-roleless: distro -> 0-distro
qa/distros/podman: install kubic once per host, in parallel
qa/suites/fs/multiclient: use clients: not all: for pexec
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Jason Dillaman [Wed, 17 Mar 2021 19:29:37 +0000 (15:29 -0400)]
test: ignore failures to force-enable lockdep
PR #40062 tweaked the behavior of lockdep to compile it out
of the code entirely for release builds. This fixes several
gtests where lockdep was force-enabled.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jeff Layton [Wed, 17 Mar 2021 17:12:09 +0000 (13:12 -0400)]
ceph-debug-docker: podman build doesn't accept input via stdin
podman on centos 8 at least doesn't accept the Dockerfile being fed to
it via stdin. Change that branch of the script to use the same method
that the ubuntu side does.
Jeff Layton [Fri, 29 Jan 2021 19:15:26 +0000 (14:15 -0500)]
doc: fixes for cephadm documentation
Be sure to note that python 3 is a prerequisite. Minimal centos 8
installs don't have it, for instance.
Also, we probably don't want to hardcode an octopus URL into the
suggested curl command. Change it to fill that in with
"|stable-release|", which should always point to the latest released
version name.
Fixes: https://tracker.ceph.com/issues/49806 Signed-off-by: Jeff Layton <jlayton@redhat.com>
Ilya Dryomov [Wed, 17 Mar 2021 10:00:33 +0000 (11:00 +0100)]
qa: krbd_blkroset.t: update for separate hw and user read-only flags
Since kernel 5.12, hardware read-only state and user read-only
policy (BLKROGET/SET ioctls) are tracked separately in the block
layer. As the purpose of our ->set_read_only() method was exactly
that, it was removed.
As a side effect, BLKROSET no longer returns EROFS on an attempt
to make a read-only mapping read-write with "blockdev --setrw".
The policy gets updated, but the device remains read-only as before
because the hardware (== mapping) state is controlled by the driver.
Sage Weil [Thu, 11 Mar 2021 23:47:24 +0000 (18:47 -0500)]
mgr/cephadm/schedule: choose an IP from a subnet list
Choose an IP from the subnet list provided by the ServiceSpec.
A few caveats:
- we ignore hosts that don't have IPs in the given subnet
- the subnet matching is STRICT. That is, the CIDR name has to exactly
match what is configured on the host. That means you can't just say 10/8
to match any 10.whatever addres--you need the exactly network on the host
(e.g, 10.1.2.0/24).
- If you modify a servicespec and change the networks when there are
already deployed daemons, we will try to deploy the new instances on
the same ports but bound to a specific IP instead of *. Which will fail.
You need to remove the service first, or remove the old daemons manually
so that creating new ones will succeed.
Sage Weil [Tue, 16 Mar 2021 16:58:03 +0000 (12:58 -0400)]
mgr/cephadm: rgw: drop .crt extension when storing cert in config-key
This will no affect upgrades since we will run the config() method before
prepare_create() any time we deploy a new daemon on this service, which
means we'll re-store the cert in the new key location before we generate
a new rgw_frontends option that references it.
Rachanaben Patel [Tue, 16 Mar 2021 22:37:46 +0000 (15:37 -0700)]
doc/RBD:fixes for ceph-immutable-object-cache daemon enable command
Document for rbd-persistent-read-only-cache show how to manage
ceph-immutable-object-cache daemon using systemd.
command example needs fixing.It should be
Ken Dreyer [Tue, 16 Mar 2021 22:16:52 +0000 (16:16 -0600)]
rpm: ceph-resource-agents package is noarch
The ceph-resource-agents package contains an architecture-independent
bash script and parent directories. There are no architecture-dependent
files here, so we can use a single noarch RPM across all host
architectures.
Sage Weil [Thu, 11 Mar 2021 23:40:22 +0000 (18:40 -0500)]
mgr/cephadm/schedule: match placement ip only combination with port
1- We only have an IP to bind to if we also have a port, and
2- If we do, we want an exact match: if the DaemonPlacement has ip of
None, then the DaemonDescription should have None too.
Sage Weil [Tue, 16 Mar 2021 17:04:55 +0000 (13:04 -0400)]
Merge PR #39931 into master
* refs/pull/39931/head:
mgr/cephadm: fall back to service spec port if none on DaemonDescription
mgr/cephadm: fix redeploy when daemons have ip:port
mgr/cephadm/schedule: add test case
qa/suites/rados/cephadm/smoke-roleless: add rgw test on many ports
doc/cephadm/rgw: update docs to show count-per-host
mgr/cephadm: add support for rgw_frontend_type (beast or civetweb)
mgr/cephadm: remove ssl_frontend_ssl_key from RGWSpec
mgr/cephadm: fix beast private key config option
mgr/cephadm: fix rgw ssl cert/key config-key path
mgr/cephadm/schedule: dynamically assign ports for rgw
mgr/cephadm/schedule: only 1 port in DaemonPlacement
mgr/cephadm: move rgw frontend logic into RgwService
mgr/cephadm/schedule: return DaemonPlacement instead of HostPlacementSpec
mgr/cephadm/schedule: remove unused methods
mgr/cephadm: propagate ip:port from CephadmDaemoNDeploySpec to deployment
cephadm: populate ports if known and not included in unit.meta
mgr/cephadm: gather and report ports in 'orch ps' output
Zac Dover [Mon, 15 Mar 2021 15:03:06 +0000 (01:03 +1000)]
doc/cephadm: break mon section into sections
This PR breaks the "Deploy Additional Monitors" section
of the cephadm documentation into several subsections
whose titles spotlight the matter under discussion in
those respective subsections.
inb4: Another PR is on deck that rewrites the sentences
in this chapter of the cephadm documentation. I'd like
to get this chapter broken up into these subsections before
I rewrite those sentences. So I'm hoping for no grammatical
mission creep on this one. The grammar and clarity updates
are coming.
Xiubo Li [Thu, 4 Feb 2021 06:14:13 +0000 (14:14 +0800)]
mgr: enhance the rados service
For some use cases, like the tcmu-runner, there maybe handreds or
thousands of LUNs, and then for each LUN it will register one service
daemon, then in the `ceph -s` output will be full of useless info.
This will allow to classify the sevices service daemons in one
specified format by adding two pairs in metadata:
TYPE: will be used to replace the default "daemon(s)"
showed in `ceph -s`. If absent, the "daemon" will be used.
PREFIX: if present the active members will be classified
by the prefix instead of "daemon_name".
For exmaple for iscsi gateways, it will be something likes:
"daemon_type" : "portal"
"daemon_prefix" : "gw${N}"
Then the `ceph -s` output will be:
...
services:
mon: 3 daemons, quorum a,b,c (age 50m)
mgr: x(active, since 49m)
mds: a:1 {0=c=up:active} 2 up:standby
osd: 3 osds: 3 up (since 49m), 3 in (since 49m)
iscsi: 8 portals active (gw0, gw1, gw2, gw3, gw4, gw5, gw6, gw7)
...
Fixes: https://tracker.ceph.com/issues/49057 Signed-off-by: Xiubo Li <xiubli@redhat.com>
Sage Weil [Mon, 15 Mar 2021 22:58:15 +0000 (18:58 -0400)]
mgr/cephadm: fall back to service spec port if none on DaemonDescription
For an RGW instance that is deployed in an older version, we won't have
IP or port metadata in the DaemonDescription. In that case, fall back
to the port in the ServiceSpec. This should be safe since any such
instance will only have 1 daemon per host.
Sage Weil [Mon, 15 Mar 2021 19:34:13 +0000 (15:34 -0400)]
mgr/cephadm: fix redeploy when daemons have ip:port
The _daemon_action() method can be called directly by upgrade and by
the 'orch daemon <action> <name>' commands. When this happens, construct
a CephadmDaemonDeploySpec from the DaemonDescription that incldes the
metadata we assigned when teh service was created: the IP and port(s).
This fixes upgrade and the CLI.