Casey Bodley [Wed, 11 May 2022 18:49:49 +0000 (14:49 -0400)]
qa/rgw: use 'with-sse-s3' override for s3tests
don't rely on the ceph manager task to parse a config file. each rgw
could be using a different config. instead, revert to an s3tests
override called 'with-sse-s3'
this way, the only job that enables sse-s3, vault_transit.yaml, contains
both the 'rgw crypt sse s3' configurables, and the flag to enable the
associated test cases
Marcus Watts [Wed, 27 Apr 2022 22:50:56 +0000 (18:50 -0400)]
qa/rgw - run sse-s3 test cases only if configured or requested
This commit adds logic to automatically detect when sse-s3 is
available and if not, disables sse-s3 tests by default.
Configuration options are provided to override the default either way.
Marcus Watts [Fri, 4 Mar 2022 01:37:53 +0000 (20:37 -0500)]
rgw/crypt - fix rest call to fail if insufficient kms args supplied.
in s3-land, it is ok to supply incomplete kms args for bucket encryption
configuration, but not on the rest call. This is a fix to distinguish
between the two and error out in the case of the latter.
The existing logic for bucket encryption was incomplete. This adds the
rest of the changes necessary to support sse-kms with default bucket
encryption.
The new logic has these changes:
on input: SSEAlgorithm is now optional.
on output: emit xmlns attribute at top level; also output
BucketKeyEnabled and KMSMasterKeyID.
Handle "empty rule" case.
for testing and diagnostics:
support RGWBucketEncryptionConfig in ceph-dencoder
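As a rough illustration of the output shape described above, here is a minimal Python sketch that builds a bucket encryption document with the xmlns attribute at top level and the optional BucketKeyEnabled and KMSMasterKeyID elements. The element names follow the public S3 GetBucketEncryption response format; the function name and its parameters are hypothetical and are not taken from the rgw sources, and rendering the "empty rule" as a Rule element without children is only an assumption about what that case means.

```python
import xml.etree.ElementTree as ET

# Namespace used by S3 XML documents.
S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"


def build_bucket_encryption_xml(sse_algorithm=None, kms_master_key_id=None,
                                bucket_key_enabled=None):
    """Hypothetical helper: serialize a bucket encryption config to XML."""
    # xmlns is emitted on the top-level element.
    root = ET.Element("ServerSideEncryptionConfiguration", {"xmlns": S3_NS})
    rule = ET.SubElement(root, "Rule")
    # SSEAlgorithm is optional, so only emit what was actually supplied.
    if sse_algorithm is not None or kms_master_key_id is not None:
        default = ET.SubElement(rule, "ApplyServerSideEncryptionByDefault")
        if sse_algorithm is not None:
            ET.SubElement(default, "SSEAlgorithm").text = sse_algorithm
        if kms_master_key_id is not None:
            ET.SubElement(default, "KMSMasterKeyID").text = kms_master_key_id
    if bucket_key_enabled is not None:
        ET.SubElement(rule, "BucketKeyEnabled").text = (
            "true" if bucket_key_enabled else "false")
    return ET.tostring(root, encoding="unicode")


# "my-key" is a placeholder key id, not a real one.
xml_text = build_bucket_encryption_xml("aws:kms", "my-key", True)
empty_rule_text = build_bucket_encryption_xml()
```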
Marcus Watts [Tue, 15 Feb 2022 01:02:34 +0000 (20:02 -0500)]
rgw/crypt - remove old parts path for sse attributes
crypt_attribute_map is the place where sse attributes
should be found by the rest of the sse logic. There is
no longer any need to feed "parts" down to the crypto
logic; this commit removes the old data path.
Marcus Watts [Fri, 28 Jan 2022 10:34:43 +0000 (05:34 -0500)]
rgw/crypt - generalize putbucketencryption.
The previous logic only supported putbucketencryption to enable
sse-s3. The protocol allows putbucketencryption to be used to
enable sse-kms by default, and the surrounding logic is now ready
to do this as well. This commit removes the checks which stopped
this from working, so that it is now possible to use putbucketencryption
to default either sse-s3 or sse-kms on.
Marcus Watts [Fri, 28 Jan 2022 10:32:14 +0000 (05:32 -0500)]
rgw/crypt - fix sse-s3 logic.
The previous logic path was overly eager to do sse-s3. This version
ensures that the "no-encryption" case does not default to sse-s3.
It also removes some argument sanity checking which is now done before
this code is reached.
Marcus Watts [Sat, 18 Dec 2021 04:16:09 +0000 (23:16 -0500)]
rgw/sse-s3: +get_encryption_defaults, use new crypt_attribute_map
putobj and postobj: get_encryption_defaults
this fetches bucketencryption policy and resolves defaults.
also errors for various conflicts between parameters (& policy).
verify_permisions
fetch encryption attributes from crypt_attribute_map not x_meta_map
for postobj, x_meta_map only gets meta attributes, not sse.
if bucketencryption policy exists, it *may* be correct to
prepopulate this before bucket policy sees it.
map_qs_metadata
for putobj it now also copies sse attributes into crypt_attribute_map.
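The default resolution described in the commits above can be pictured with a small Python sketch. It is not the rgw implementation: the function name, the exception class, and the dict-based policy representation are all hypothetical. The header names are the standard S3 ones. It shows the request winning over the bucket policy, the "no-encryption" case staying unencrypted, and incomplete kms arguments on the rest call being rejected.

```python
class EncryptionConflict(ValueError):
    """Hypothetical error for conflicting or incomplete SSE parameters."""


def resolve_encryption(request_attrs, bucket_default=None):
    """Return the effective SSE settings for a request, or None for no encryption.

    request_attrs: SSE headers taken from the rest call (lower-cased names).
    bucket_default: settings from the bucket encryption policy, if any.
    """
    algo = request_attrs.get("x-amz-server-side-encryption")
    key_id = request_attrs.get("x-amz-server-side-encryption-aws-kms-key-id")

    if algo is None:
        if key_id is not None:
            raise EncryptionConflict("kms key id supplied without an algorithm")
        # Nothing requested: use the bucket policy if present, and do NOT
        # silently fall back to sse-s3 when there is none.
        return dict(bucket_default) if bucket_default else None
    if algo == "AES256":
        if key_id is not None:
            raise EncryptionConflict("sse-s3 does not take a kms key id")
        return {"algorithm": "AES256"}
    if algo == "aws:kms":
        if key_id is None:
            # Incomplete kms args are rejected on the rest call.
            raise EncryptionConflict("sse-kms requested without a key id")
        return {"algorithm": "aws:kms", "key_id": key_id}
    raise EncryptionConflict("unknown algorithm: %s" % algo)


plain = resolve_encryption({})
inherited = resolve_encryption({}, {"algorithm": "AES256"})
```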
Marcus Watts [Sat, 18 Dec 2021 04:13:09 +0000 (23:13 -0500)]
rgw/sse-s3: various improvements.
1. sse-s3 should not require bucketencryption policy, work w/ postobj
2. make bucket key name configurable
3. +rgw_remove_sse_s3_bucket_key
1. for sse-s3 should not require bucketencryption policy, work w/ postobj
get_crypt_attribute ->
using s->info.crypt_attribute_map instead of s->env to avoid having
to know about HTTP_X_AMZ_SERVER_SIDE_ENCRYPTION_CUSTOMER_ALGORITHM names,
crypt_attribute_get -> crypt_attributes.get
to consolidate crypt attribute sources
rework sse-s3 logic: sse-s3 can be specified entirely in the rest call,
so remove requirement that bucket has bucket encryption policy.
also avoid term "default encryption", prefer term "test key".
2. for make bucket key name configurable:
With this modification, sse-s3 key names default to being
the bucket id, but can be configured to instead consist
of the owners name, a fixed string, or variations thereof.
3. +rgw_remove_sse_s3_bucket_key
For sse-s3, keys are supposed to be managed entirely by s3.
This means when a bucket is removed, we should be removing its key,
which should no longer be in use for anything. This is only safe
if the key was constructed using "%bucket_id", otherwise it might be
used in another bucket and we can never remove it automatically.
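The key naming and removal rule above can be sketched in a few lines of Python. Only the "%bucket_id" placeholder appears in the commit message; the "%owner_id" placeholder, the function names, and the sample ids are assumptions made for illustration.

```python
def expand_key_name(template, bucket_id, owner_id):
    """Expand a key name template (hypothetical helper).

    "%bucket_id" comes from the commit message; "%owner_id" is an assumed
    placeholder standing in for "the owner's name".
    """
    return template.replace("%bucket_id", bucket_id).replace("%owner_id", owner_id)


def key_is_safe_to_remove(template):
    """A key may be removed with its bucket only if its name embeds the bucket id.

    Otherwise the same key could be shared with another bucket.
    """
    return "%bucket_id" in template


per_bucket = expand_key_name("%bucket_id", "b-1234", "alice")
shared = expand_key_name("rgw-%owner_id", "b-1234", "alice")
```

With these definitions a fixed string or an owner-based template yields a key that is never removed automatically, which matches the safety argument in the commit message.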
Marcus Watts [Sat, 18 Dec 2021 04:09:56 +0000 (23:09 -0500)]
rgw/sse-s3: save sse attributes in req_state->crypt_attribute_map
req_state->crypt_attribute_map to save sse-s3 cryptographic attributes
this is not quite a duplicate of x_meta_map because I think some
of its uses conflict with sse-s3. (for instance, bucketencryption vs. signatures)
rgw: Adding SSE-S3 support in GET and PUT paths (using Vault as KMS)
Added the support to generate KEK based on bucket owner UID in
PutBucketEncryption. This is stored in bucket x-attrs. The KEK-ID is
later used in GET and PUT paths.
In the PUT path, we check if BucketEncryption is enabled for the bucket.
If yes, we determine if the encryption type is AES256 (i.e., SSE-S3),
then we fetch the KEK-ID from the bucket x-attrs and use it to wrap the
data key. Thereafter, we call generate-data-key. We store the KEK-ID
and the wrapped data-key in the object x-attrs.
In the GET path, we simply pull out the KEK-ID from the object x-attr
and decrypt the object.
haoyixing [Fri, 25 Mar 2022 03:02:13 +0000 (03:02 +0000)]
mds: add a perf counter to record slow replies
Although we have the MDS_HEALTH_SLOW_METADATA_IO and MDS_HEALTH_SLOW_REQUEST health alerts, they are
neither precise nor cumulative. By comparing the slow reply counter with the reply counter, we can find
the ratio of slow requests through perf dump.
Fixes: https://tracker.ceph.com/issues/55126
Signed-off-by: haoyixing <haoyixing@kuaishou.com>
(cherry picked from commit e8e3b307c87dc9eec2d087b396c0e7a0248b4f1d)
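A minimal Python sketch of the ratio computation mentioned above, working on parsed perf dump JSON. The section name "mds" and the counter names "reply" and "slow_reply" are assumptions about the dump layout and should be checked against an actual perf dump; the sample numbers are made up.

```python
import json


def slow_reply_ratio(perf_dump, section="mds", total_key="reply",
                     slow_key="slow_reply"):
    """Return slow replies divided by all replies, or 0.0 if there were none.

    The section and counter names are assumed defaults, not verified names.
    """
    counters = perf_dump.get(section, {})
    total = counters.get(total_key, 0)
    slow = counters.get(slow_key, 0)
    if total == 0:
        return 0.0
    return slow / total


# Made-up sample data in the assumed layout.
sample = json.loads('{"mds": {"reply": 200, "slow_reply": 50}}')
ratio = slow_reply_ratio(sample)
```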
but when WITH_SYSTEM_ARROW is enabled, the targets we get from
find_package() do not carry this dependency. so rgw's cmake needs to
depend on both targets
windgmbh [Fri, 12 Nov 2021 15:51:03 +0000 (16:51 +0100)]
Apply sysctl.d migration from /usr/lib to /etc
A fix regarding the SYSCTL_DIR location (#53130) requires migrating
sysctl.d/*.conf files from /usr/lib to /etc.
Signed-off-by: Lukas Mayer <lmayer@wind.gmbh>
(cherry picked from commit a167a27f30536958e0f2c513d351642e81ba06d5)
windgmbh [Wed, 3 Nov 2021 17:16:53 +0000 (18:16 +0100)]
Fix sysctl.d location FHS compliance
This fixes #53130
Containers should not write to '/usr/lib'.
That location could be read-only or overwritten.
Signed-off-by: Lukas Mayer <lmayer@wind.gmbh>
(cherry picked from commit 77afa812ea8b7e1e802246e4aa3a31e7b644a502)
Adam King [Thu, 24 Mar 2022 13:59:10 +0000 (09:59 -0400)]
cephadm: pass "--security-opt label=disable" to node-exporter container
in order to support setting '--path.procfs=/host/proc','--path.sysfs=/host/sys',
'--path.rootfs=/rootfs' for node-exporter we need to disable selinux separation
between the node-exporter container and the host to avoid selinux denials
Adam King [Fri, 4 Mar 2022 02:47:47 +0000 (21:47 -0500)]
mgr/cephadm: offline host watcher
To be able to detect more quickly when certain hosts go
offline. Could be useful for the NFS HA feature, as this
requires moving nfs daemons off offline hosts within
90 seconds.
Redouane Kachach [Tue, 29 Mar 2022 16:37:10 +0000 (18:37 +0200)]
mgr/cephadm: fallback to normal sorted if cannot import natsorted
Fixes: https://tracker.ceph.com/issues/55113
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 19c07de8207de5038df6f510a3c2ff41b10f7e08)
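The fallback described in this commit follows a common optional-import pattern, sketched below in Python. This is an illustration and not the code from the commit; the host names are made up. Note that the two code paths order numeric suffixes differently, so callers should not depend on natural ordering being available.

```python
try:
    # natsort is an optional third-party package.
    from natsort import natsorted
except ImportError:
    # Fall back to the builtin, which sorts plain strings lexicographically.
    natsorted = sorted

hosts = ["host10", "host2", "host1"]
ordered = natsorted(hosts)
```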
Adam King [Tue, 22 Mar 2022 22:57:21 +0000 (18:57 -0400)]
mgr/cephadm: Reschedule nfs daemons from offline hosts
In order to improve nfs availability, if there are other
hosts we can place an nfs daemon on or if there is a host
with a lower rank nfs daemon when a higher rank one is on
an offline host, we should reschedule the nfs daemons
mgr/cephadm: Adding support to store ceph conf per cluster fsid
Fixes: https://tracker.ceph.com/issues/55185
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 2ea76173a163a93bbfbf69d0faa732d46eaf05ba)
mgr/cephadm: do not add _admin label when no-minimize-config is provided
Fixes: https://tracker.ceph.com/issues/52727
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 01c8999d0354a71a7ef8526aab9b39e30d67c1bb)
Redouane Kachach [Wed, 23 Mar 2022 17:24:01 +0000 (18:24 +0100)]
mgr/cephadm: Adding image tag and date to cephadm startup messages
Fixes: https://tracker.ceph.com/issues/55008
Fixes: https://tracker.ceph.com/issues/54373
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 92ecb58d46b6f75265a664f3165f4b3a0dd4993a)
Adam King [Wed, 6 Apr 2022 14:32:22 +0000 (10:32 -0400)]
mgr/cephadm: allow setting insecure_skip_verify for alertmanager
Add a "secure" parameter to alertmanager spec that will cause it
to deploy alertmanagers with insecure_skip_verify as true or false
depending on the value given for "secure".
NOTE: alertmanager must still be reconfigured after applying a yaml
with this option changed.
Fixes: https://tracker.ceph.com/issues/55272
Fixes: https://tracker.ceph.com/issues/55333
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit e583d4ef1ac23a7473d50d253e0edf70580542ae)
Moritz Röhrich [Mon, 21 Mar 2022 16:32:25 +0000 (17:32 +0100)]
cephadm: avoid crashing on expected non-zero exit
- Avoid crashing when a call out to an external program expectedly does
not return exit status zero.
There are programs that communicate other information than error/no
error through exit status. E.g. `systemctl status` will return different
exit codes depending on the actual status of the units in question.
In cases where this is expected crashing with a RuntimeError exception
is inappropriate and should be avoided.
Fixes: https://tracker.ceph.com/issues/55117
Signed-off-by: Moritz Röhrich <moritz.rohrich@suse.com>
(cherry picked from commit a02be6f22fa18094cd8758700ab74581b6ce1701)
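The idea of tolerating expected non-zero exit codes can be sketched as follows in Python. This is a simplified stand-in and not cephadm's own helper: the function name `call` and the `expected_codes` parameter are hypothetical. A small Python child process that exits with status 3 stands in for a tool such as `systemctl status`, so the example runs anywhere.

```python
import subprocess
import sys
from typing import Sequence, Tuple


def call(cmd: Sequence[str],
         expected_codes: Sequence[int] = (0,)) -> Tuple[str, str, int]:
    """Run cmd; raise only if the exit status is not one the caller expects."""
    proc = subprocess.run(list(cmd), stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, text=True)
    if proc.returncode not in expected_codes:
        raise RuntimeError("%r failed with exit status %d"
                           % (list(cmd), proc.returncode))
    return proc.stdout, proc.stderr, proc.returncode


# Stand-in for a program that reports state through its exit status.
probe = [sys.executable, "-c", "import sys; sys.exit(3)"]
out, err, code = call(probe, expected_codes=(0, 3))
```

The caller then inspects `code` to learn the reported state instead of catching an exception for a result that was anticipated.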
Melissa Li [Wed, 23 Mar 2022 15:38:37 +0000 (11:38 -0400)]
cephadm: show error message if private registry credentials not provided
Raise UnauthorizedRegistryError in `_pull_image` if user tries to pull from a private registry without authentication, handle error in `command_bootstrap`, `command_adopt`, `command_pull`
Fixes: https://tracker.ceph.com/issues/55015
Signed-off-by: Melissa Li <melissali@redhat.com>
(cherry picked from commit 4de0803ba893abf341ab634d1382208370de7c98)
mgr/cephadm: support non-root ssh-user w permissions
Restructured code, so that in case of non-root, the resulting file will
be created with permissions set to the ssh-user. This allows the
subsequent scp to write the file. The remaining code is kept the
same, so that file permissions are restored to the expected ones, but
it now runs after the scp.
Fixes: https://tracker.ceph.com/issues/54620
Signed-off-by: Christoph Glaubitz <c.glaubitz@syseleven.de>
(cherry picked from commit 452e52a7e39409e3409d59940133333416b830bc)
ceph-volume/tests: reject loop devices in lvm.conf
The current task doesn't work (typo?).
Otherwise api/lvm.py can't work properly; functions such as
`get_single_lv()` and many others don't return the expected results.
Indeed, lvm is confused because of the nvme_loop setup.
This adds the support of complex OSD creation with command
`orch daemon add osd`.
Any argument supported by `DriveGroupSpec()` can be passed on the command line.
Cephadm shouldn't try to deploy a disk reported as unavailable by ceph-volume.
The idea here is to check the rejection reason so we can still use DB devices
in case of OSD replacement.
Redouane Kachach [Fri, 11 Mar 2022 11:41:18 +0000 (12:41 +0100)]
mgr/cephadm: Show warning when user provides --fsid option
Fixes: https://tracker.ceph.com/issues/50804
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 8780aa04651fa2cddeec1d9d2dfcf4e08412d4ce)
mgr/cephadm: checking service name before removal
Fixes: https://tracker.ceph.com/issues/54503
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit b26c114c8456941d6cccf7d4355445f21cb373a7)
Joseph Sawaya [Fri, 11 Mar 2022 20:45:16 +0000 (15:45 -0500)]
doc: Add note to osds_per_device description about dual-actuator devices
This commit adds information about using dual-actuator devices with the
osds_per_device drive group option, letting users know they can create
an OSD for each actuator by setting this value to 2 in the drive group
they're using to apply OSDs to the device.