Or Ozeri [Tue, 9 Mar 2021 20:14:49 +0000 (22:14 +0200)]
librbd: crypto format api semantics change
This commit alters the semantics of the encryption format api
to also load the encryption after format completes.
Additionally, several other small changes in librbd crypto are included,
in preparation for supporting clone formatting.
Sage Weil [Thu, 11 Mar 2021 21:56:52 +0000 (16:56 -0500)]
cephadm: use image id, not name, when inspecting for RepoDigests
The name is ambiguous, but the image_id is not! This fixes problems
during upgrade where upgrade thinks the container is upgraded (due to
an incorrect digest) when in fact it is not.
Fixes: 0826c45e0cb5d60fcf8cd71cd14edd34a6997cd4
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit f6e802a0d865f64c5e408bdef8e6c3153b1e9842)
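A minimal sketch of the idea in the commit above, not cephadm's actual code: inspect the image by its ID rather than by its name, then read the RepoDigests from the result. The helper names `inspect_argv` and `parse_inspect` are hypothetical, and the `{{.ID}},{{.RepoDigests}}` format template and its output shape are assumptions made for illustration.

```python
from typing import List, Tuple


def inspect_argv(engine: str, image_id: str) -> List[str]:
    """Build an 'image inspect' command line keyed on the image ID.

    `engine` is the container engine binary (for example "podman" or "docker").
    The format template is an assumption of this sketch.
    """
    return [engine, "image", "inspect",
            "--format", "{{.ID}},{{.RepoDigests}}", image_id]


def parse_inspect(out: str) -> Tuple[str, List[str]]:
    """Split '<id>,[digest1 digest2]' into the ID and a list of digests."""
    image_id, _, digests = out.strip().partition(",")
    digests = digests.strip()
    # Go templates render a string slice as a bracketed, space-separated list.
    if digests.startswith("[") and digests.endswith("]"):
        digests = digests[1:-1]
    return image_id, digests.split()
```

Because the ID identifies exactly one image, the digests returned for it cannot belong to a different image that happens to share the name.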
Sage Weil [Wed, 10 Mar 2021 20:27:27 +0000 (14:27 -0600)]
Merge PR #39807 into pacific
* refs/pull/39807/head:
cephadm: split custom container args into argv
cephadm: fix escaping/quoting of stderr-prefix arg for ceph daemons
cephadm: set CEPH_USE_RANDOM_NONCE if using --init
msg/Messenger: use random nonce if CEPH_USE_RANDOM_NONCE or pid == 1
Revert "Merge PR #39482 into master"
cephadm: remove redundant `ERROR` during check-host
cephadm: remove unused imports
cephadm: `cephadm ls` broken for SUSE downstream alertmanager container
cephadm: `cephadm ls` broken for SUSE downstream alertmanager container
mgr/cephadm: add ok-to-stop functions for ceph client services
mgr/test_orchestrator: Refactor create_osds
mgr/volumes: adapt to now orch interface
doc/mgr/orchestrator_modules: adapt to now orch interface
mgr/selftest: adapt to now orch interface
mgr/dashboard: adapt to now orch interface
mgr/mds_autoscaler: Add to tox.ini
mgr/mds_autoscaler: adapt to now orch interface
mgr/test_orchestrator: adapt to now orch interface
mgr/rook: Adapt to new orch interface
mgr/cephadm: Adapt cephadm to new orch interface
mgr/orch: Remove old tests
mgr/orch: adapt orchestrator CLI to new interface
mgr/orch: replace Completion with OrchResult(Generic[T])
mgr/orchestrator: Fix ceph orch ls in Rook
doc/cephadm: rewrite "install cephadm"
doc/cephadm: rewrite "bootstrap a new cluster"
cephadm: add docker.service dependency in systemd units
cephadm: add multi-digest test
mgr/orchestrator: validate config options at apply time
mgr/cephadm: disallow managed options in ServiceSpec config section
mgr/cephadm: add config section to ServiceSpec
doc/cephadm: s/togeter/together
cephadm: provide meta during bootstrap
mgr/cephadm: put service_name in unit.meta and use it when available
cephadm: accept arbitrary dict via --meta-json
mgr/cephadm: incorporte memory_{usage,request,limit} from 'ls'
cephadm: accept --memory-{request,limit}
cephadm: include memory_usage in 'ls' output
doc/cephadm: remove Orchestrator CLI from cephadm toc
doc/cephadm: move host labels to host mgmt
doc/cephadm: group MDS sections into one chapter
doc/cephadm: Add iscsi
doc/cephadm: group NFS sections into one chapter
doc/cephadm: rename monitoring chapter title
doc/cephadm: group MON sections into one chapter
doc/cephadm: make custom containers its own chapter
doc/cephadm: group RGW mgmt sections into one chapter
doc/cephadm: move scheduler topic to service mgmt
doc/cephadm: move unmanaged=true to service-mgmt.rst
doc/cephadm: group general service mgmt sections into one chapter
doc/cephadm: group OSD mgmt sections into one chapter
doc/cephadm: Move FQDN chapter to host mgmt.rst
doc/cephadm: Move SSH config from operations to host-mgmt.rst
doc/cephadm: group host mgmt sections into one chapter
cephadm: fix bug in orphan-initial-daemons logic
mgr/orch: drop __all__ from __init__.py
mgr/cephadm: add DaemonDescriptionStatus
cephadm: version command hide traceback when login is needed
doc/cephadm: troubleshooting: manually deploy MGR
cephadm: fix port_in_use when IPv6 is disabled
cephadm: Allow to use paths in all <_devices> drivegroup sections
mgr/cephadm: error if service action called with daemonless service
mgr/cephadm: fix up the strings reporting osd ids
mgr/cephadm: remove daemon before osd destroy/purge
mgr/cephadm: simplify OSD __str__ for drain
mgr/cephadm: make drain adjust crush weight if not replacing
mgr/cephadm: less log noise from osd drain code
mgr/cephadm: fix 'orch daemon add osd ...'
mgr/cephadm/upgrade: fix typo
mgr/cephadm: remove spec from CephadmDaemonDeploySpec
mgr/cephadm/upgrade: restart mgr after mons upgrade to pacific
mgr/cephadm: use get_foreign_ceph_option() instead of 'config get' mon command
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
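One entry in the merge above replaces Completion with OrchResult(Generic[T]). The following is a rough sketch of what a generic result wrapper of that shape can look like; it is not the code from the Ceph tree, and the constructor arguments and the `raise_if_exception` helper shown here are assumptions made for illustration.

```python
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class OrchResult(Generic[T]):
    """Carries either a result of type T or the exception that prevented it."""

    def __init__(self, result: Optional[T],
                 exception: Optional[Exception] = None) -> None:
        self.result = result
        self.exception = exception


def raise_if_exception(res: OrchResult[T]) -> Optional[T]:
    """Return the wrapped result, re-raising the stored exception if any."""
    if res.exception is not None:
        raise res.exception
    return res.result
```

A caller would unwrap a value with `raise_if_exception(OrchResult(["host1"]))`, which lets a type checker see the element type instead of an opaque completion object.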
Nizamudeen A [Tue, 2 Feb 2021 12:26:13 +0000 (17:56 +0530)]
mgr/dashboard: Host Maintenance Feature
In Cluster -> Hosts, I've added a button to put the selected host into
maintenance mode or to take it out of maintenance mode. For some hosts
the ok-to-stop checks may trigger warnings, in which case --force must
be passed along with the maintenance enter command to put the host into
maintenance. In the UI this is handled with a confirmation modal. If
the check error is "It is NOT safe to stop the host", the host cannot
be put into maintenance mode.
The osd_fast_shutdown option may cause the cluster log to receive
too many entries of 'osd.X reported immediately failed by osd.Y',
depending on cluster scale.
This might be an issue for LMA stacks/tools that check ceph logs
for failed lines: they then need additional logic to filter out
an intended OSD (fast) shutdown, which might not be possible and
may require an admin to analyze.
So, add the osd_fast_shutdown_notify_mon option so that the OSD
also tells the monitor it is shutting down (as is already done in
slow/non-fast shutdown) when osd_fast_shutdown is enabled.
This introduces minimal delay (the ack from the mon is required
to prevent the messages), and addresses the cluster log issue.
Note: the osd_mon_shutdown_timeout option can be used to control
the maximum amount of time waiting for the monitor ack to arrive.
Fixes: http://tracker.ceph.com/issues/46978
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
(cherry picked from commit c75734729764868c5c501722fc8de08dac9ebd4a)
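To illustrate the kind of extra filtering logic the commit above says log-checking tools would otherwise need, here is a hedged sketch that counts cluster log lines of the quoted form 'osd.X reported immediately failed by osd.Y'. The function name and the assumption that each entry is a single line containing that phrase are illustrative only; real cluster log lines carry additional fields.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the phrase quoted in the commit message; X and Y are OSD ids.
IMMEDIATE_FAILURE = re.compile(
    r"osd\.(\d+) reported immediately failed by osd\.(\d+)")


def count_immediate_failures(lines: Iterable[str]) -> Counter:
    """Count 'reported immediately failed' entries per failed OSD id."""
    counts: Counter = Counter()
    for line in lines:
        match = IMMEDIATE_FAILURE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

A count like this cannot tell an intended fast shutdown from a real failure, which is the gap the new option is meant to close at the source.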
Jason Dillaman [Wed, 3 Mar 2021 19:38:35 +0000 (14:38 -0500)]
qa/objectstore: reduce debug log levels for bluestore
This will help speed up teuthology jobs for non-RADOS suites
where previously tests were IO bound due to excessive logging
and the artifact collection was slowed due to very large OSD
logs.
Jason Dillaman [Wed, 3 Mar 2021 19:26:38 +0000 (14:26 -0500)]
qa/suites: move RADOS tests to use new debug log objectstores
This will retain the debug log settings for all RADOS suites
that were previously symlinked to the 'objectstore'
directory. The next commit will reduce the debug log level
for the original 'objectstore' directory for the remainder
of tests.
Sage Weil [Sat, 27 Feb 2021 20:45:47 +0000 (15:45 -0500)]
msg/Messenger: use random nonce if CEPH_USE_RANDOM_NONCE or pid == 1
If we are in a container, then we do not have a unique pid, and need to
use a random nonce. We normally detect this if our pid is 1, but that
doesn't work when we have an init process--we'll (probably?) have a small
pid (in my tests, the OSDs were getting pid 7).
To be safe, also check for an environment variable set by cephadm.
This avoids problems that arise when we don't have a unique address.
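The selection rule described above can be sketched as follows. This is a Python illustration of the logic, not the C++ change in msg/Messenger; the function name `choose_nonce`, the 64-bit width, and treating any presence of the variable (regardless of its value) as "set" are assumptions of this sketch.

```python
import os
import secrets
from typing import Mapping, Optional


def choose_nonce(pid: Optional[int] = None,
                 env: Optional[Mapping[str, str]] = None) -> int:
    """Pick a messenger nonce: random inside a container, else the pid.

    A pid of 1 suggests we are the container's first process. With an init
    process in front of us the pid is small but not 1, so the
    CEPH_USE_RANDOM_NONCE environment variable is checked as well.
    """
    pid = os.getpid() if pid is None else pid
    env = os.environ if env is None else env
    if pid == 1 or "CEPH_USE_RANDOM_NONCE" in env:
        return secrets.randbits(64)  # assumed width for illustration
    return pid
```

For example, `choose_nonce(pid=7, env={"CEPH_USE_RANDOM_NONCE": "1"})` takes the random branch even though the pid is not 1, which is the case the commit describes for OSDs started under an init process.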
Daniel-Pivonka [Thu, 14 Jan 2021 22:18:43 +0000 (17:18 -0500)]
mgr/cephadm: add ok-to-stop functions for ceph client services
Signed-off-by: Daniel-Pivonka <dpivonka@redhat.com>
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 674912bfed92537a97e625bb79397bf97f10b24b)
Fixes: https://tracker.ceph.com/issues/49411
Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
(cherry picked from commit d070caedb5971351de9521e3125d162838270c2b)
Zac Dover [Sun, 28 Feb 2021 12:13:39 +0000 (22:13 +1000)]
doc/cephadm: rewrite "install cephadm"
This PR breaks the "Deploying a New Ceph Cluster"
section into several sub-sections, so that each sub-section
pertains to only one subject. I've also added some explanatory
text that gives the instructions more context than they had
before.
Zac Dover [Mon, 1 Mar 2021 14:01:05 +0000 (00:01 +1000)]
doc/cephadm: rewrite "bootstrap a new cluster"
This PR rewrites the section "Bootstrap A New
Cluster" in the Cephadm Guide, in the Install
Chapter. I've broken this section up into what
seem to me to be the topics that the content
naturally divides into.