Marcus Watts [Fri, 8 Jan 2021 22:49:20 +0000 (17:49 -0500)]
rgw/kms/vault - rework unit test logic for new transit logic.
The "new" transit logic is organized quite differently
than the old logic, so the existing unit test logic was
very broken. Additionally, it's possible to test the
input arguments and send_request() has more of them now,
so add logic to verify most of those arguments are correct.
Fixes: http://tracker.ceph.com/issues/48746 Signed-off-by: Marcus Watts <mwatts@redhat.com>
Marcus Watts [Tue, 5 Jan 2021 02:22:07 +0000 (21:22 -0500)]
rgw/kms/vault - document configuration for new transit logic
Using the new transit logic requires slightly different configuration.
Additionally, there is some backwards compatibility support, which
also needed documentation.
The existing description of how to configure hashicorp vault
to work with ceph was also incomplete. I've fleshed that out a bit,
including considerably more information on how to configure
and use the vault secret agent with ceph.
Fixes: http://tracker.ceph.com/issues/48746 Signed-off-by: Marcus Watts <mwatts@redhat.com>
Marcus Watts [Tue, 5 Jan 2021 02:21:33 +0000 (21:21 -0500)]
rgw/kms/vault - new transit logic - fix compat logic
Teuthology passes in a vault uri that ends in /export/encryption-key/
So: we need to handle (and ignore) trailing slashes when deciding
to enable compatibility support.
Fixes: http://tracker.ceph.com/issues/48746 Signed-off-by: Marcus Watts <mwatts@redhat.com>
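The trailing-slash handling described in the commit above could be sketched roughly as follows. This is only an illustration in Python, not the rgw C++ code: the function names are hypothetical, and the assumption that compatibility is detected by a suffix match on "/export/encryption-key" is inferred from the teuthology URI mentioned in the message.

```python
def strip_trailing_slashes(prefix: str) -> str:
    """Drop any number of trailing '/' characters from a vault prefix."""
    return prefix.rstrip("/")


def looks_like_old_export_prefix(prefix: str) -> bool:
    """Hypothetical check: does the prefix point at the old export endpoint?

    Trailing slashes are ignored, so ".../export/encryption-key" and
    ".../export/encryption-key/" are treated the same way.
    """
    return strip_trailing_slashes(prefix).endswith("/export/encryption-key")
```

With this shape, a prefix passed with or without the final slash leads to the same decision.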
Marcus Watts [Tue, 8 Dec 2020 23:09:04 +0000 (18:09 -0500)]
rgw/kms/vault - "compat" option
For transit engine:
"compat" option: 0=new only, 1=old & new, 2=old only.
This is just the option parsing itself: not the actual logic
for make_key | reconstitute_key.
Fixes: http://tracker.ceph.com/issues/48746 Signed-off-by: Marcus Watts <mwatts@redhat.com>
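As a rough illustration of the three "compat" values listed above, a parser might look like the sketch below. It is written in Python rather than the C++ used by rgw, and the names TransitCompat and parse_compat are hypothetical; only the meanings 0, 1 and 2 come from the commit message.

```python
from enum import IntEnum


class TransitCompat(IntEnum):
    """Meanings of the "compat" option as listed in the commit message."""
    NEW_ONLY = 0     # new transit logic only
    OLD_AND_NEW = 1  # accept both old and new
    OLD_ONLY = 2     # old logic only


def parse_compat(value: str) -> TransitCompat:
    """Parse a compat value, rejecting anything other than 0, 1 or 2."""
    try:
        return TransitCompat(int(value))
    except ValueError:
        # Raised both for non-numeric text and for out-of-range integers.
        raise ValueError(
            f"unsupported compat value {value!r}; expected 0, 1 or 2"
        ) from None
```

Raising an explicit error, instead of falling back to a default, keeps a misconfiguration visible.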
Marcus Watts [Mon, 7 Dec 2020 22:55:22 +0000 (17:55 -0500)]
rgw/kms/vault - encryption context - first part
This includes the logic to process the user provided
encryption context, turn it into "canonical json", and
to add in a default arn if it's not present.
Also present here is the start of logic to distinguish
between "prepare_encrypt" and "prepare_decrypt" at a lower
level. As "make_key" and "reconstitute_key", these will be
the functions that, respectively, create a new datakey using
the vault transit operation and retrieve a previously
stored datakey.
Fixes: http://tracker.ceph.com/issues/48746 Signed-off-by: Marcus Watts <mwatts@redhat.com>
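The step of adding a default arn to a user-provided encryption context could be sketched as below. This is a Python illustration under stated assumptions, not the rgw implementation: the function name is hypothetical, and both the context key "aws:s3:arn" and the "arn:aws:s3:::bucket/object" value format are assumptions based on AWS S3 conventions, not taken from the commit.

```python
import base64
import json

# Assumption: key name under which the default arn is stored.
ARN_KEY = "aws:s3:arn"


def context_with_default_arn(encoded_context, bucket, key):
    """Decode a base64 json context and add a default arn if none is present."""
    if encoded_context:
        context = json.loads(base64.b64decode(encoded_context))
    else:
        context = {}
    # setdefault leaves a user-supplied arn untouched.
    context.setdefault(ARN_KEY, f"arn:aws:s3:::{bucket}/{key}")
    return context
```

A context that already carries an arn is returned unchanged, so the default only applies when the user did not supply one.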
Marcus Watts [Mon, 7 Dec 2020 22:53:05 +0000 (17:53 -0500)]
rgw/kms/vault - define attribute to store encryption context
For rgw sse:kms use, the aws s3 standard provides an attribute
to store the base-64 encoded canonical json "encryption context".
This should be used to vary the per-object keys used for the
actual object encryption.
Fixes: http://tracker.ceph.com/issues/48746 Signed-off-by: Marcus Watts <mwatts@redhat.com>
Marcus Watts [Mon, 7 Dec 2020 22:48:31 +0000 (17:48 -0500)]
rgw/kms/vault - share get/set attr between rgw_crypt.cc and rgw_kms.cc
In order to pass down and manage "attrs" from crypt logic to kms
logic, it's necessary to share the functions that can get and
set strings in that structure. Eventually, I plan to have
the various engines store and retrieve a per-object "datakey" that
is encrypted (wrapped) by the named kms key.
Fixes: http://tracker.ceph.com/issues/48746 Signed-off-by: Marcus Watts <mwatts@redhat.com>
Marcus Watts [Mon, 7 Dec 2020 22:28:59 +0000 (17:28 -0500)]
rgw/kms/vault - relax configuration parsing for rgw_crypt_vault_secret_engine
To better manage forwards and backwards compatibility when using vault
transit for rgw object encryption (sse:kms), it is desirable to provide
parameters to control how this works. It was more attractive to overload
the existing rgw_crypt_vault_secret_engine parameter for this purpose
than to invent one or more all-new parameters.
Additionally, the enum support in the configuration parser looks like
it ought to have helpful syntax checking functionality. This is not so;
failure to provide a supported enum results in silently replacing that
with the default option, resulting in confusing and non-obvious behavior
that is not at all helpful.
This change removes the enum constraint on rgw_crypt_vault_secret_engine,
allowing for more useful messages from the rgw code, and the possibility
to also provide additional information on the same line.
Fixes: http://tracker.ceph.com/issues/48746 Signed-off-by: Marcus Watts <mwatts@redhat.com>
Marcus Watts [Mon, 7 Dec 2020 22:20:49 +0000 (17:20 -0500)]
rgw/kms/vault - need libicu to make canonical json for encryption contexts.
For encryption, aws s3 provides an "encryption context" to vary per-object
keys. The encryption context is a base64 encoded json structure, which
must be converted to a deterministic form -- "canonical json". This
requires converting all strings to a normalized canonical form, "utf-8 nfc".
It also requires that keys in objects be sorted in a fixed order, so some
form of sorting based on nfc is needed.
It turns out that libicu was the best way to produce utf-8 nfc (boost also
provides a mechanism, but it has many quirks). So, here are the hooks
to pull the system libicu into the build.
Fixes: http://tracker.ceph.com/issues/48746 Signed-off-by: Marcus Watts <mwatts@redhat.com>
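To illustrate what "canonical json" means here, the sketch below normalizes every string to NFC and sorts object keys before serializing. It uses Python's standard unicodedata module as a stand-in for libicu, and the function names are hypothetical; the exact sort order and output format used by rgw may differ.

```python
import json
import unicodedata


def nfc(value):
    """Recursively normalize all strings (keys and values) to Unicode NFC."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, dict):
        return {nfc(k): nfc(v) for k, v in value.items()}
    if isinstance(value, list):
        return [nfc(v) for v in value]
    return value


def canonical_json(value) -> str:
    """Serialize with NFC strings, sorted keys and no optional whitespace."""
    return json.dumps(
        nfc(value),
        sort_keys=True,          # fixed key order (by code point)
        separators=(",", ":"),   # no spaces between tokens
        ensure_ascii=False,      # keep utf-8 text rather than \u escapes
    )
```

Two contexts that differ only in key order or in the Unicode composition of their strings then serialize to the same text, which is what makes the derived per-object key reproducible.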
Sage Weil [Fri, 5 Mar 2021 20:33:50 +0000 (15:33 -0500)]
Merge PR #39817 into master
* refs/pull/39817/head:
qa/suites/rados/cephadm: drop centos/rhel cephadm tests for the moment
qa/sites/rados/cephadm/thrash: rename 3-tasks.yaml/ -> 3-tasks/
qa/suites/rados/cephadm: adjust distros
qa/suites/upgrade: use kubic; test all distros
qa/suites/rados/cephadm/upgrade: use kubic on centos
qa: new kubic distro files; use kubic podman for centos/rhel
* refs/pull/38859/head:
mds: don't start purging inodes in the middle of recovery
mds: purge orphan objects created by lost async file creation
mds: track free prealloc_inos and delegated_inos separately
mds: cleanup code that purges orphan objects created by lost unsafe file creation
mds: subtract inos_to_purge from prealloc_inos when session close is logged
mds: use vector to define old_pools in PurgeItem and inode_backtrace_t
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
This broke cephadm (by triggering CEPHADM_STRAY_DAEMON) because cephadm
assumes that a daemon named rgw.r.z.foo will register as rgw.r.z.foo.
It is not clear to me that there is a way to work around this naming
mismatch that makes much sense. I think it makes more sense to focus on
the use-case that needs daemons to register under unique names and perhaps
control that naming behavior via an option or invest in providing daemons
with unique ids up front.
Kefu Chai [Sat, 13 Feb 2021 04:57:19 +0000 (12:57 +0800)]
doc/_theme: customize sphinx_rtd_theme
* move the breadcrumbs to the top
* add border around admonition elements
* use different colors and fonts for section headers
* add decoration lines at the bottom of breadcrumbs
* remove left and right borders in tables
* override the injected versions. The name of the theme
is different from "sphinx_rtd_theme", but the
versions element is still displayed at the
bottom-left corner as "versions.html" defines.
Without overriding the .rst-badge CSS styling,
readthedocs puts the injected versions at
the default bottom-right corner; see
https://github.com/readthedocs/readthedocs.org/blob/2a519f1146142d18f6a63b61c2f08984067280e0/readthedocs/api/v2/templates/restapi/footer.html
Sage Weil [Wed, 3 Mar 2021 14:14:29 +0000 (08:14 -0600)]
qa: new kubic distro files; use kubic podman for centos/rhel
The current centos/rhel version of podman (2.2.1) is broken.
- create new qa/distros/podman/* files that install kubic podman
- include centos/rhel variants
- adjust cephadm jobs to use new yaml files
- remove old qa/distros/all/*_podman.yaml files
I did some visual cleanup too but mostly this changeset is to support
specifying subsets for each suite type. For now, only "fs" suite is
using partitions different from rados.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/39724/head:
qa: skip exit-on-first-failure option for valgrind on ubuntu
mds,qa: exit instead of respawn under valgrind
qa: skip chdir for fuse_mount
qa: ignore all slow request warnings
qa: add new mds beacon grace mon config
qa: wait for MDS to join fsmap
qa: move get_valgrind_args to qa
* refs/pull/38684/head:
qa: add _check_scrub_status helper to simplify the code
qa: add run_scrub helper in filesystem class
qa: add get_scrub_status helper in filesystem class
qa: wait the scrub task to complete
qa: remove passed_validation check for test_damage
qa: move wait_until_scrub_complete helper to filesystem class
mds: simplify the C_MDS_EnqueueScrub finish code
Reviewed-by: Rishabh Dave <ridave@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Thu, 4 Mar 2021 19:31:57 +0000 (14:31 -0500)]
Merge PR #39726 into master
* refs/pull/39726/head:
mgr/cephadm: document ok_to_stop output argument for clarity
mgr/DaemonServer: make warning language a bit friendlier
mgr/cephadm/upgrade: improve language a bit
mgr/cephadm/upgrade: restart multiple osds at once
mgr/cephadm: gather other osds that are safe to stop
mgr/cephadm: optional pass 'known' through to ok_to_stop
mgr/cephadm/upgrade: log start/stop/pause/resume
Sage Weil [Thu, 4 Mar 2021 13:35:24 +0000 (08:35 -0500)]
mgr/DaemonServer: osd ok-to-stop: return json when there are unknown PGs
In 791952cc01201010f298033003ba52374cc0159f we switched to return JSON
both on success and fail to describe which PGs are affected or are blocking
the ability to stop/restart OSDs. Do the same for the case where
some PG states are unknown (i.e., just after a mgr restart) so that
the cephadm upgrade process can unconditionally expect a JSON result.
Marcus Watts [Wed, 3 Feb 2021 19:26:46 +0000 (14:26 -0500)]
rgw/kms/kmip - document configuration for a new feature: kmip kms
I've written up a brief description of using kmip
with ceph. Major features:
* ceph configuration.
* making keys with a "paste-in" python script.
* pointers to PyKMIP and IBM SKLM.
Marcus Watts [Thu, 12 Nov 2020 03:38:18 +0000 (22:38 -0500)]
rgw/kms/kmip - rgw / kmip test integration.
s3tests needs to know key names in order to run kms tests.
It seems desirable to have s3tests default to discovering
the names that were created by the pykmip task, and that
if there is more than one rgw connected to more than one
pykmip, that names belonging to the appropriate pykmip
instance should be used.
This logic does the following:
rgw task: save pykmip role name.
s3tests task: set kms_key (and kms_keyid2) to
these in order of priority
1 s3tests client task property ['kms_key'] (or ['kms_key2'])
2 first (second) secret created in the matching pykmip instance.
3 testkey-1 (testkey-2)
For case 2, names from the secrets have an initial "token-" stripped from them.
The assumption here is that rgw is being run with a setting such as
rgw crypt kmip kms key template: pykmip-$keyid
therefore "pykmip-" will be prefixed back onto the key before use.
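The priority order above could be sketched as follows. This is a hypothetical Python helper, not the actual teuthology task code: the function name and arguments are assumptions, index 0 stands for kms_key and index 1 for the second key, and the stripped prefix is a parameter defaulting to the "token-" named in the message.

```python
def choose_kms_key(client_props, secret_names, index, strip_prefix="token-"):
    """Pick a kms key name using the three-step priority from the message.

    client_props: s3tests client task properties (dict).
    secret_names: names of secrets created in the matching pykmip instance.
    index: 0 for the first key, 1 for the second.
    """
    # 1. An explicit s3tests client task property wins.
    prop = "kms_key" if index == 0 else "kms_key2"
    if client_props.get(prop):
        return client_props[prop]
    # 2. Otherwise use the first (second) pykmip secret, minus its prefix.
    if index < len(secret_names):
        name = secret_names[index]
        if name.startswith(strip_prefix):
            name = name[len(strip_prefix):]
        return name
    # 3. Fall back to the fixed default names.
    return f"testkey-{index + 1}"
```

Only the leading prefix is removed, since rgw is expected to add its own prefix back through the key template.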
Marcus Watts [Thu, 29 Oct 2020 16:04:36 +0000 (12:04 -0400)]
rgw/kms/kmip - correct documentation.
The pykmip task should be after ceph, and before rgw.
kmip needs ssl certs in order to function correctly.
Because the openssl_keys task has an indeterminate
order of execution, it is best to create the ca as
a separate task. The ca can be shared with rgw, but
real life deployments of kmip are likely to have their
own CA.
In order to create kmip secrets, a client certificate
is necessary, so one must be supplied to the pykmip task.
Marcus Watts [Thu, 29 Oct 2020 03:40:58 +0000 (23:40 -0400)]
rgw/kms/kmip - pykmip.py needs to make keys too.
The logic to deploy pykmip in teuthology was not complete.
The necessary logic to add kmip keys was missing.
Existing logic for other key service providers could use REST-based
protocols directly from the teuthology host. For kmip, it is necessary
to use a special protocol, and it is more convenient to run this directly
on the pykmip server.
Marcus Watts [Tue, 27 Oct 2020 21:16:14 +0000 (17:16 -0400)]
rgw/kms/kmip - pykmip.py should actually run pykmip.
The logic to deploy pykmip in teuthology was not complete.
While it deployed all the code and certs to run pykmip,
it didn't actually run it. This commit fixes that.
Marcus Watts [Fri, 23 Oct 2020 23:07:09 +0000 (19:07 -0400)]
rgw/kms/kmip - python3 changes for testing.
python3 requires different imports and there's a different
way to get at the first element in a view.
This is to match changes introduced in the rest of ceph in these
commits: 24e7acc261a4d7258ea7fdcd
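The point about views can be shown with a small Python 3 example. In python2, dict.keys() returned a list that could be indexed; in python3 it returns a view, which cannot. The helper name and the sample dictionary are made up for illustration.

```python
def first(view):
    """Return the first element of a dict view (or any other iterable)."""
    return next(iter(view))


# Made-up sample data, only for illustration.
roles = {"client.0": "rgw", "client.1": "pykmip"}

# python2 style roles.keys()[0] raises TypeError on python3,
# because a keys view does not support indexing.
first_role = first(roles.keys())
```

An alternative is list(roles.keys())[0], which works but builds a full list just to read one element.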
Marcus Watts [Sun, 16 Feb 2020 02:08:29 +0000 (21:08 -0500)]
kmip: first pass at implementation logic.
This implements SSE-KMS for the radosgw using kmip.
This uses symmetric raw keys with a name attribute in kmip,
thus providing the same functionality as the "kv" key store
in hashicorp vault.
Nathan Cutler [Mon, 1 Mar 2021 11:07:29 +0000 (12:07 +0100)]
rpm: use PMDK system libraries on SUSE
As of a49d1dbb32e2436ff2836a85b2fa84418f0a5fff, when the rbd_rwl_cache and
rbd_ssd_cache bconds are enabled and WITH_SYSTEM_PMDK is disabled (as it is by
default), the RPM build attempts to
git clone https://github.com/ceph/pmdk.git
but of course that won't work in the OBS, where the build workers have no
Internet connectivity.
Fortunately, the openSUSE/SLE versions targeted by Ceph master and pacific ship
the necessary PMDK libraries as RPM packages.