This is just so we can load up a stored spec after upgrade. We'll silently
drop it, since we have the service_id, and this was only used to generate
that anyway.
Sage Weil [Mon, 15 Mar 2021 22:05:45 +0000 (17:05 -0500)]
mgr/orchestrator: drop $realm.$zone naming convention
- Let users name the rgw service(s) whatever they like
- Make the rgw_zone and rgw_realm arguments optional, so that they can
omit them and let radosgw start up in the generic single-cluster
configuration (which, notably, has no realm).
- Continue to set the rgw_realm and rgw_zone options for the daemon(s),
but only when those values are specified.
- Adjust the CLI accordingly. Drop the subcluster argument, which was
only used to generate a service_id.
- Adjust rook. This is actually a simplification and improved mapping onto
the rook CRD, *except* that for octopus we enforced the realm.zone
naming *and* realm==zone. I doubt a single user actually did this
so it is not worth dealing with, but we need a special case for
where there is a . in the service name.
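
As a rough illustration of the "only when those values are specified" rule
above, the sketch below builds the option set for an rgw daemon from an
optional realm and zone. The helper name rgw_daemon_options and its dict
return shape are hypothetical and not taken from the cephadm source; only the
option names rgw_realm and rgw_zone come from the commit message.

```python
from typing import Dict, Optional


def rgw_daemon_options(rgw_realm: Optional[str] = None,
                       rgw_zone: Optional[str] = None) -> Dict[str, str]:
    """Return the config options to set for an rgw daemon.

    Hypothetical helper: an option is emitted only when the caller
    actually supplied a value, so a daemon deployed without realm/zone
    starts in the generic single-cluster configuration.
    """
    options: Dict[str, str] = {}
    if rgw_realm:
        options["rgw_realm"] = rgw_realm
    if rgw_zone:
        options["rgw_zone"] = rgw_zone
    return options
```

With no arguments the helper returns an empty dict, so nothing realm-related
is set for the daemon.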
Sage Weil [Fri, 5 Mar 2021 18:13:56 +0000 (13:13 -0500)]
mgr/cephadm: rgw: do not mess with realm configuration
It is simpler to consider this out of scope for the orchestrator. The
user should set up their multisite realms/zones before deploying the
daemons (or the daemons will not start). In the future we can wrap this
with a more friendly tool, perhaps.
Paul Cuzner [Fri, 19 Feb 2021 02:09:02 +0000 (15:09 +1300)]
mgr/cephadm:Drop active healthcheck during a disable request
The healthcheck could already be active when the admin attempts
to disable it. This patch removes the related healthcheck if it's set
during a config-check disable request.
Paul Cuzner [Thu, 18 Feb 2021 23:50:22 +0000 (12:50 +1300)]
mgr/cephadm:skip an alert if the linkspeed is better than most
The logic was issuing a healthcheck if the linkspeed was different
from the majority. But if the difference is good (i.e. better!) we should
not be raising a healthcheck.
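
The intended behaviour can be sketched as follows: find the most common link
speed and flag only the hosts that are slower than it. The function name
hosts_with_slow_links and the host-to-speed mapping are assumptions made for
this example and do not reflect the actual checker code.

```python
from collections import Counter
from typing import Dict, List


def hosts_with_slow_links(link_speeds: Dict[str, int]) -> List[str]:
    """Return hosts whose link speed is below the most common speed.

    Hypothetical helper: hosts that are faster than the majority are
    deliberately not reported. If two speeds are equally common, the
    one seen first is treated as the majority.
    """
    if not link_speeds:
        return []
    majority_speed, _count = Counter(link_speeds.values()).most_common(1)[0]
    return sorted(host for host, speed in link_speeds.items()
                  if speed < majority_speed)
```

For example, with speeds of 10000, 10000, 1000 and 25000, only the 1000 host
is flagged; the 25000 host differs from the majority but is not reported.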
Paul Cuzner [Thu, 18 Feb 2021 23:23:13 +0000 (12:23 +1300)]
mgr/cephadm:Minor updates to address review comments
Changes to reflect review comments
- picked up on subscribed = unknown state
- using get_daemon_types() call
- use log.exception more
- changed logic and errors from the public_network check
Paul Cuzner [Thu, 18 Feb 2021 03:17:22 +0000 (16:17 +1300)]
mgr/cephadm:Multiple updates related to the addition of the CLI
Some changes needed to support the introduction of the CLI commands
used to manage the cephadm checks. For example, the main Cephadm
check class now interacts with the keystore directly to determine
status, and provides support for commands like ls to list the
check definitions. In addition, the main class now handles existing
configuration checks and ensures that the stored state in the keystore
matches the checks defined by the module.
Paul Cuzner [Thu, 18 Feb 2021 03:11:55 +0000 (16:11 +1300)]
mgr/cephadm:Moved 'ownership' of the checker to cephadm
Initial implementation used the Serve class as the owner of the
configuration checker. This patch moves the checker up to the
cephadm module itself, to make the CLI command logic cleaner
Paul Cuzner [Tue, 16 Feb 2021 23:45:38 +0000 (12:45 +1300)]
mgr/cephadm:Adds unit tests for the CephadmConfigChecks class
Add unit tests to test suite to verify functionality. The unit tests use
a sample host definition and scale that to simulate a cluster to run
the tests against
Paul Cuzner [Wed, 3 Mar 2021 00:12:05 +0000 (13:12 +1300)]
mgr/cephadm:Document the integration with libstoragemgmt
Updates the cephadm OSD documentation to describe the integration
with libstoragemgmt, including the potential hardware issues that
may arise.
Paul Cuzner [Mon, 22 Feb 2021 01:35:03 +0000 (14:35 +1300)]
mgr/cephadm:Enable cephadm device scan to use LSM
Using libstoragemgmt (LSM) in ceph-volume was disabled by default
(Nov 2020), which meant cephadm's inventory never had a way to
request the LSM data. This patch adds a module option called
'device_enhanced_scan' (bool), that if set will append the
--with-lsm parameter to the ceph-volume inventory call.
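
A minimal sketch of the behaviour described above: the --with-lsm flag is
appended to the inventory arguments only when the option is enabled. The
function name inventory_args and the bare ["inventory"] base argument list
are assumptions for illustration; the real call passes additional arguments.

```python
from typing import List


def inventory_args(device_enhanced_scan: bool = False) -> List[str]:
    """Build the argument list for a ceph-volume inventory call.

    Hypothetical helper: the base argument list is simplified to the
    subcommand name only.
    """
    args = ["inventory"]
    if device_enhanced_scan:
        # Request the extra libstoragemgmt (LSM) data from ceph-volume.
        args.append("--with-lsm")
    return args
```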
Daniel Pivonka [Mon, 8 Mar 2021 19:04:29 +0000 (14:04 -0500)]
mgr/cephadm: prevent traceback when invalid osd id passed to 'orch osd rm stop'
orch osd rm expects a str that can be converted to an int. If the user
passes something that can't be converted, it shows a traceback.
Catching the ValueError prevents this traceback.
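
The pattern is roughly the following: convert each id inside a try block and
collect the values that fail, so the caller can report them instead of letting
the exception propagate. The function name parse_osd_ids and its return shape
are hypothetical and are not the signature used in the module.

```python
from typing import List, Tuple


def parse_osd_ids(raw_ids: List[str]) -> Tuple[List[int], List[str]]:
    """Split user-supplied OSD ids into valid ints and rejected strings.

    Hypothetical helper: anything int() cannot convert is returned in
    the second list rather than raising ValueError.
    """
    valid: List[int] = []
    invalid: List[str] = []
    for raw in raw_ids:
        try:
            valid.append(int(raw))
        except ValueError:
            invalid.append(raw)
    return valid, invalid
```

A caller can then print a short error naming the rejected values when the
second list is non-empty.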
Sage Weil [Tue, 9 Mar 2021 18:15:20 +0000 (12:15 -0600)]
mgr/cephadm: do not prime service cache on reconfig
Ceph daemon reconfig does not need any daemon state refresh since we don't
do a restart--we just rewrite the ceph.conf. This also avoids priming
our cache with a 'starting' state when the daemon wasn't touched.
Sage Weil [Fri, 26 Feb 2021 13:46:26 +0000 (08:46 -0500)]
mgr/cephadm: optional pass 'known' through to ok_to_stop
Optionally provide a list of previously known-to-be-ok-to-stop items to
the ok_to_stop method. This has to get plumbed through a zillion instances
of this class method.
Or Ozeri [Tue, 9 Mar 2021 20:14:49 +0000 (22:14 +0200)]
librbd: crypto format api semantics change
This commit alters the semantics of the encryption format api
to also load the encryption after format completes.
Additionally, several other small changes in librbd crypto are included,
in preparation for supporting clone formatting.
Sage Weil [Thu, 11 Mar 2021 21:56:52 +0000 (16:56 -0500)]
cephadm: use image id, not name, when inspecting for RepoDigests
The name is ambiguous, but the image_id is not! This fixes problems
during upgrade where upgrade thinks the container is upgraded (due to
an incorrect digest) when in fact it is not.
Fixes: 0826c45e0cb5d60fcf8cd71cd14edd34a6997cd4
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit f6e802a0d865f64c5e408bdef8e6c3153b1e9842)
Sage Weil [Wed, 10 Mar 2021 20:27:27 +0000 (14:27 -0600)]
Merge PR #39807 into pacific
* refs/pull/39807/head:
cephadm: split custom container args into argv
cephadm: fix escaping/quoting of stderr-prefix arg for ceph daemons
cephadm: set CEPH_USE_RANDOM_NONCE if using --init
msg/Messenger: use random nonce if CEPH_USE_RANDOM_NONCE or pid == 1
Revert "Merge PR #39482 into master"
cephadm: remove redundant `ERROR` during check-host
cephadm: remove unused imports
cephadm: `cephadm ls` broken for SUSE downstream alertmanager container
cephadm: `cephadm ls` broken for SUSE downstream alertmanager container
mgr/cephadm: add ok-to-stop functions for ceph client services
mgr/test_orchestrator: Refactor create_osds
mgr/volumes: adapt to now orch interface
doc/mgr/orchestrator_modules: adapt to now orch interface
mgr/selftest: adapt to now orch interface
mgr/dashboard: adapt to now orch interface
mgr/mds_autoscaler: Add to tox.ini
mgr/mds_autoscaler: adapt to now orch interface
mgr/test_orchestrator: adapt to now orch interface
mgr/rook: Adapt to new orch interface
mgr/cephadm: Adapt cephadm to new orch interface
mgr/orch: Remove old tests
mgr/orch: adapt orchestrator CLI to new interface
mgr/orch: replace Completion with OrchResult(Generic[T])
mgr/orchestrator: Fix ceph orch ls in Rook
doc/cephadm: rewrite "install cephadm"
doc/cephadm: rewrite "bootstrap a new cluster"
cephadm: add docker.service dependency in systemd units
cephadm: add multi-digest test
mgr/orchestrator: validate config options at apply time
mgr/cephadm: disallow managed options in ServiceSpec config section
mgr/cephadm: add config section to ServiceSpec
doc/cephadm: s/togeter/together
cephadm: provide meta during bootstrap
mgr/cephadm: put service_name in unit.meta and use it when available
cephadm: accept arbitrary dict via --meta-json
mgr/cephadm: incorporte memory_{usage,request,limit} from 'ls'
cephadm: accept --memory-{request,limit}
cephadm: include memory_usage in 'ls' output
doc/cephadm: remove Orchestrator CLI from cephadm toc
doc/cephadm: move host labels to host mgmt
doc/cephadm: group MDS sections into one chapter
doc/cephadm: Add iscsi
doc/cephadm: group NFS sections into one chapter
doc/cephadm: rename monitoring chapter title
doc/cephadm: group MON sections into one chapter
doc/cephadm: make custom containers its own chapter
doc/cephadm: group RGW mgmt sections into one chapter
doc/cephadm: move scheduler topic to service mgmt
doc/cephadm: move unmanaged=true to service-mgmt.rst
doc/cephadm: group general service mgmt sections into one chapter
doc/cephadm: group OSD mgmt sections into one chapter
doc/cephadm: Move FQDN chapter to host mgmt.rst
doc/cephadm: Move SSH config from operations to host-mgmt.rst
doc/cephadm: group host mgmt sections into one chapter
cephadm: fix bug in orphan-initial-daemons logic
mgr/orch: drop __all__ from __init__.py
mgr/cephadm: add DaemonDescriptionStatus
cephadm: version command hide traceback when login is needed
doc/cephadm: troubleshooting: manually deploy MGR
cephadm: fix port_in_use when IPv6 is disabled
cephadm: Allow to use paths in all <_devices> drivegroup sections
mgr/cephadm: error if service action called with daemonless service
mgr/cephadm: fix up the strings reporting osd ids
mgr/cephadm: remove daemon before osd destroy/purge
mgr/cephadm: simplify OSD __str__ for drain
mgr/cephadm: make drain adjust crush weight if not replacing
mgr/cephadm: less log noise from osd drain code
mgr/cephadm: fix 'orch daemon add osd ...'
mgr/cephadm/upgrade: fix typo
mgr/cephadm: remove spec from CephadmDaemonDeploySpec
mgr/cephadm/upgrade: restart mgr after mons upgrade to pacific
mgr/cephadm: use get_foreign_ceph_option() instead of 'config get' mon command
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>