Paul Cuzner [Thu, 26 Nov 2020 00:02:54 +0000 (13:02 +1300)]
cephadm: updates to cephadm configuration and runtime behaviour
Multiple changes based on PR feedback
- invocation of set-exporter-config now sets all parameters
- custom validation of exporter-config now throws
argparse exceptions
- config validation now checks for SSL crt/key format
- failing threads are now caught and change the http
response to the caller (500, and 503 may not be returned)
- bootstrap options descriptions tidied up
Paul Cuzner [Wed, 25 Nov 2020 23:52:47 +0000 (12:52 +1300)]
cephadm: Handle exporter config in a dedicated class
Several changes related to feedback during review
- repositioned stdlib imports
- added an exporterconfig class to handle config interactions
- service prepare and create simplified due to new class
Paul Cuzner [Tue, 17 Nov 2020 20:57:17 +0000 (09:57 +1300)]
cephadm: tests - updated to include cephadm-exporter
Main unit tests now include daemon and service
deployments for cephadm-exporter.
In addition, the assert_rm_daemon function has been
tweaked, to avoid problems with services that contain
'-' during fnmatch invocations. Service names that use
'-' could result in "bad character range" exceptions from
the re module, because '-' can be treated as a range separator.
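
The "bad character range" failure can be reproduced with the re module alone. This is a minimal sketch, not the code from the cephadm tests: the `compiles` helper and the way the service name is placed inside a character class are assumptions made for illustration.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if the re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Hypothetical service name containing '-'.
service_name = "cephadm-exporter"

# Inside [...] the 'm-e' part is read as a descending range and is rejected.
raw_ok = compiles("[" + service_name + "]")

# Escaping the name first turns '-' into a literal character.
escaped_ok = compiles("[" + re.escape(service_name) + "]")
```

Escaping the name (or avoiding pattern matching on raw service names) is one way to sidestep the problem; the commit itself only states that assert_rm_daemon was tweaked.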
Paul Cuzner [Tue, 17 Nov 2020 20:39:03 +0000 (09:39 +1300)]
cephadm: tests - add cephadm-exporter to service check
Add cephadm-exporter service, and confirm the instance
doesn't contain ceph methods like get_auth_entity (same
approach as taken for the monitoring stack)
Paul Cuzner [Thu, 12 Nov 2020 02:55:38 +0000 (15:55 +1300)]
cephadm: add exporter support to mgr/cephadm
Several changes to introduce support for the exporter
1. new commands for generating/managing the config
settings required;
- generate-exporter-config (defaults)
- get-exporter-config
- clear-exporter-config
- set-exporter-config token/port
- set-exporter-tls (crt/key)
2. exporter is not a container, so a patch is added to the
run_cephadm method to 'tolerate' cephadm-exporter
daemons
3. the purge method is called on removal of a service. This
has a default do-nothing response in CephService, but in
the CephadmExporter service, purge will remove its config
from the mon store
4. during deploy (_create_daemon) a new remoto call to copy
the cephadm binary to the target is needed - prior to
invoking the deploy request
5. Default placement for the exporter is *, so by default once
the service is created, exporter daemons will appear on all
cluster nodes
Paul Cuzner [Thu, 12 Nov 2020 02:43:39 +0000 (15:43 +1300)]
cephadm: updated binary to integrate the exporter
Calling the deploy from orch (stdin) forced changes to the
way the parameters are read, and how the validation of the
config is done. In addition the bootstrap has a couple of new
parameters to allow the exporter to be deployed
automatically (in future!). This patch is a first step with
--with-exporter, but without this parameter there are no changes.
In addition, since deployment needs cephadm commands the
full deployment will be enabled in a follow up PR (once the
new commands are merged in the mgr/cephadm!)
Paul Cuzner [Thu, 12 Nov 2020 02:33:58 +0000 (15:33 +1300)]
cephadm: add cephadm-exporter and service purge 'hook'
Adds a class for the cephadm-exporter that handles the
prepare and generate_config calls. Since the exporter uses
TLS, prepare validates the crt/key and generate_config
pulls the ssl settings from the store returning a valid config
that the deploy process can consume.
In addition a purge method has been added to the default
CephService class, that by default does nothing. In the
CephadmExporter class a purge method is provided which
deletes the variables in the mon store used by the exporter.
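
The purge 'hook' described above can be sketched as a base-class method that does nothing, overridden by the exporter service. This is an illustration only: the class names come from the commit message, but the dict standing in for the mon store, the `KEY_PREFIX` value and the `remove_service` helper are hypothetical.

```python
from typing import Dict


class CephService:
    def purge(self, store: Dict[str, str]) -> None:
        """Default hook: most services have nothing to clean up."""
        return None


class CephadmExporter(CephService):
    # Hypothetical key prefix; the real key names are not given in the log.
    KEY_PREFIX = "exporter_"

    def purge(self, store: Dict[str, str]) -> None:
        """Delete every exporter-owned key from the store."""
        for key in [k for k in store if k.startswith(self.KEY_PREFIX)]:
            del store[key]


def remove_service(service: CephService, store: Dict[str, str]) -> None:
    """Hypothetical removal path: always call the hook, whatever the service."""
    service.purge(store)
```

Because the base class supplies the no-op, the removal path does not need to know which services keep state in the store.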
Paul Cuzner [Tue, 3 Nov 2020 21:35:25 +0000 (10:35 +1300)]
cephadm: update self-signed cert generation
Self-signed cert generation has been patched to accept a
DNAME dictionary. This allows features like the cephadm
exporter to generate a crt/key pair that doesn't use CN
to make the same crt usable across multiple hosts.
Jason Dillaman [Fri, 23 Oct 2020 18:31:36 +0000 (14:31 -0400)]
qa/tasks: support explicit disk configuration for qemu task
The 'disks' key will now be treated as a dictionary where all previous
global settings can be individually applied. Additionally, a disk can be
pre-created and provided for use by qemu.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Wed, 21 Oct 2020 18:29:15 +0000 (14:29 -0400)]
rbd: add new 'migration prepare --import-only' CLI optional
The '--import-only' optional can be combined with either a
'--source-spec' or '--source-spec-path' optional to define
the source for the read-only import.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Samuel Just [Fri, 30 Oct 2020 21:18:27 +0000 (14:18 -0700)]
crimson/os/seastore/.../lba_btree_node_impl: hold reference to *this during lookup
4f2f4f modified TransactionManager::mount() to use a weak_transaction
while calling init_cached_extents. This is fine, but lookup() needs
to hold a reference to *this until the child lookup completes in order
to ensure residence in the lba pinning set.
Paul Cuzner [Thu, 8 Oct 2020 03:30:56 +0000 (16:30 +1300)]
mgr/prometheus: Add healthcheck metric for SLOW_OPS
SLOW_OPS is triggered by op tracker, and generates a health
alert but healthchecks do not create metrics for prometheus to
use as alert triggers. This change adds SLOW_OPS metric, and
provides a simple means to extend to other relevant health
checks in the future.
If extracting the value from the health check message fails,
we log an error and remove the metric from the metric set. In
addition the metric description has changed to better reflect
the scenarios where SLOW_OPS can be triggered.
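
The extract-or-remove behaviour described above might look roughly like the following. This is a sketch under stated assumptions, not the mgr/prometheus code: the message format matched by `SLOW_OPS_RE`, the plain dict used as the metric set, and the function names are all hypothetical.

```python
import logging
import re
from typing import Dict, Optional

logger = logging.getLogger(__name__)

# Assumed message shape: "<N> slow ops, ..." (not confirmed by the commit).
SLOW_OPS_RE = re.compile(r"^(\d+) slow ops")


def extract_slow_ops(message: str) -> Optional[int]:
    """Return the leading op count, or None if the message does not match."""
    match = SLOW_OPS_RE.match(message)
    return int(match.group(1)) if match else None


def update_metrics(metrics: Dict[str, int], message: str) -> None:
    """Set the SLOW_OPS metric, or drop it when the value cannot be extracted."""
    value = extract_slow_ops(message)
    if value is None:
        logger.error("unable to extract SLOW_OPS value from: %s", message)
        metrics.pop("SLOW_OPS", None)
    else:
        metrics["SLOW_OPS"] = value
```

Dropping the metric on a parse failure avoids exporting a stale value that an alert rule could otherwise keep firing on.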
Jason Dillaman [Tue, 27 Oct 2020 00:29:57 +0000 (20:29 -0400)]
rbd: show migration source spec in image status
When using the non-legacy mode image migration, the 'rbd status'
command will now show the source-spec for the migration if the
source pool id is invalid.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Fri, 30 Oct 2020 14:00:27 +0000 (10:00 -0400)]
librbd/migration: close image if failed to open source
To remain consistent with the native format, which will automatically
close the ImageCtx if it fails to open the image, the raw format should
also close the ImageCtx.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
For backing source formats that don't care about the low-level
RBD layout details, provide safe default values for the layout
to ensure that striping logic will still function.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman [Wed, 21 Oct 2020 18:27:38 +0000 (14:27 -0400)]
librbd: added new 'migration_prepare_import' API methods
These related methods accept a JSON-encoded source-spec that
describes how to read from the source image. The migration is
a one-way operation, so children/groups/etc won't be updated
to point to the new destination image.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Kefu Chai [Sun, 1 Nov 2020 03:23:13 +0000 (11:23 +0800)]
crimson/osd: merge pg_temp_wanted before sending them
There is a chance that a new pg_temp_wanted entry is added while we are
sending them in ShardServices::send_pg_temp(), but the pg temps are already
collected into the messages before seastar::parallel_for_each(). So we
cannot move all elements from pg_temp_wanted to pg_temp_pending after
sending the messages, as that would also move the unsent pg temps to
pg_temp_pending. We therefore have to do this before making the async call.
This issue was root-caused by Xuehan Xu <xxhdx1985126@gmail.com>.