Ilya Dryomov [Mon, 17 Oct 2022 13:51:04 +0000 (15:51 +0200)]
librbd: non-pruning parent overlap handling fixes
Apply similar "reduce overlap and respect area" logic to places
that don't use prune_parent_extents(). Changes to FlattenRequest
and TrimRequest here should complete the long tail of encrypted
I/O path and flatten fixes.
Ilya Dryomov [Sat, 15 Oct 2022 16:31:45 +0000 (18:31 +0200)]
librbd: reduce overlap and respect area when pruning parent extents
DATA area in the parent may be smaller than the part of DATA area in
the clone that is still within the overlap. This would occur e.g. in
LUKS2-formatted parent + LUKS1-formatted clone case, due to LUKS2
header usually being bigger than LUKS1 header:
  parent:  raw size = 64M
           LUKS2 header area = 16M
           data area = 48M
  clone:   raw size = 64M (raw overlap 64M)
           LUKS1 header area = 4M
           data area = 60M
Currently, because parent extents are pruned only according to raw
overlap (64M), the clone ends up attempting to reach the parent for all
of its data area (60M < 64M) even though the parent only has 48M worth
of data. All kinds of bugs ensue for 48M..60M offsets and this range
basically becomes inaccessible to the user.
A related issue is that prune_parent_extents() ignores area.
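The arithmetic of the example above can be sketched as follows. This is
only an illustration of the sizes involved, not librbd code: the helper
name effective_overlap() and its parameters are hypothetical, and taking
the minimum of the two data areas is an assumption about what "reduce
overlap" amounts to.

```python
M = 1024 * 1024  # one mebibyte, for readability


def effective_overlap(raw_overlap, clone_header, parent_raw_size, parent_header):
    """Hypothetical helper: overlap expressed in the clone's DATA area.

    None of these names are librbd APIs; they only mirror the example.
    """
    # Part of the clone's DATA area that lies within the raw overlap.
    clone_data_in_overlap = max(raw_overlap - clone_header, 0)
    # Size of the parent's DATA area.
    parent_data = max(parent_raw_size - parent_header, 0)
    # The clone can only reach the parent for data the parent actually has.
    return min(clone_data_in_overlap, parent_data)


# LUKS2-formatted parent (16M header) + LUKS1-formatted clone (4M header).
reduced = effective_overlap(64 * M, 4 * M, 64 * M, 16 * M)
print(reduced // M)  # 48
```

With pruning by raw overlap alone, the clone would consider 60M of its
data area to be backed by the parent; with the reduced value only the
first 48M is.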
librbd: clip extents to their area instead of DATA area
This fixes cases where CRYPTO HEADER area is larger than DATA area.
In particular, it was effectively impossible to flatten unformatted
clones of such images.
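A minimal sketch of why the area matters when clipping, assuming an
image whose CRYPTO_HEADER area (16M) is larger than its DATA area (4M).
The Area enum and clip_extent() are hypothetical stand-ins, not the
librbd types.

```python
from enum import Enum

M = 1024 * 1024  # one mebibyte, for readability


class Area(Enum):
    """Hypothetical stand-in for the two image areas named in the log."""
    DATA = "data"
    CRYPTO_HEADER = "crypto_header"


def clip_extent(offset, length, area, area_sizes):
    """Clip an (offset, length) extent to the size of the given area."""
    limit = area_sizes[area]
    if offset >= limit:
        # The extent starts past the end of the area: nothing is left.
        return (offset, 0)
    return (offset, min(length, limit - offset))


sizes = {Area.DATA: 4 * M, Area.CRYPTO_HEADER: 16 * M}

# Clipping a header extent to its own area keeps all 16M.
print(clip_extent(0, 16 * M, Area.CRYPTO_HEADER, sizes)[1] // M)  # 16
# Clipping the same extent to the DATA area would truncate it to 4M.
print(clip_extent(0, 16 * M, Area.DATA, sizes)[1] // M)  # 4
```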
Ilya Dryomov [Fri, 14 Oct 2022 14:20:24 +0000 (16:20 +0200)]
librbd: introduce reduce_parent_overlap() and switch overlap API
When encryption is loaded, rbd_get_overlap() and Image::overlap() now
return "effective" overlap, similar to rbd_get_size() and Image::size().
Previously, returned overlap could have been bigger than "effective"
size.
Note that the successor to get_effective_image_size() doesn't take snap_id
because passing anything but ictx->snap_id was broken. Since the size
of the crypto header is stored in the crypto header itself, image areas
are defined only for the "opened at" snap_id. Getting "effective" size
for an arbitrary snapshot requires actually opening it and loading
encryption on it.
Ilya Dryomov [Fri, 14 Oct 2022 14:01:45 +0000 (16:01 +0200)]
librbd: tweak get_parent_overlap() signature
Make it clear that get_parent_overlap() returns the raw parent overlap
value (i.e. physical offset into the parent image). Also drop redundant
ceph_mutex_is_locked assert -- get_parent_info() already has one.
Ilya Dryomov [Thu, 20 Oct 2022 15:25:46 +0000 (17:25 +0200)]
librbd: remap resize target size if encryption is loaded
When encryption is loaded, rbd_get_size() and Image::size() return
"effective" size, but rbd_resize() and Image::resize() continue to take
raw size. The user has to constantly keep these domains in mind.
Saying that resize must be done without encryption loaded is not an
answer because shrinking a clone that has snapshots involves copying up
objects in the affected range (identical to flattening). In addition,
even if a clone doesn't have snapshots, shrinking it to a size that
isn't an object boundary is going to involve a copyup for the victim
object as well.
To avoid subtle data corruption on shrink, treat resize operation the
same as flatten operation (including on the CLI).
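The two size domains can be illustrated with a small sketch. It assumes
that the "effective" size is simply the raw size minus the crypto header
size; raw_resize_target() and its parameters are hypothetical names, not
part of the librbd API.

```python
M = 1024 * 1024  # one mebibyte, for readability


def raw_resize_target(requested_size, crypto_header_size, encryption_loaded):
    """Hypothetical: translate a user-facing target size into a raw size."""
    if not encryption_loaded:
        # Without encryption loaded, the two domains coincide.
        return requested_size
    # With encryption loaded, user-visible offsets start after the header.
    return requested_size + crypto_header_size


# A LUKS1-formatted image with a 4M header, resized to 60M "effective".
print(raw_resize_target(60 * M, 4 * M, True) // M)   # 64
print(raw_resize_target(60 * M, 4 * M, False) // M)  # 60
```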
librbd: check stripe pattern when loading encryption
Currently it's done in FormatRequest but not in LoadRequest. However,
an image can be deep copied or exported and imported with a different
stripe pattern such that an area boundary would fall in the middle of
an object.
Currently it's done in FormatRequest but not in LoadRequest. However,
an image can be shrunk to a size such that encryption can be loaded (i.e.
enough of the header is still present) but nothing else can, breaking
implicit assumptions all around.
Ilya Dryomov [Fri, 7 Oct 2022 09:56:27 +0000 (11:56 +0200)]
librbd: relax image size check in luks::FormatRequest
Proceed with formatting an image even if all space would be consumed by
the crypto header. There is no reason to be strict here since we allow
creating zero-sized images as well as shrinking any image to 0.
librbd: don't temporarily shut down crypto when flattening
(Temporarily) shutting down crypto can lead to data corruption in the
face of concurrent I/O, especially when flatten operation is proxied to
the remote lock owner. The shutdown was added so that the crypto header
could be read, optionally modified and written without itself being
subjected to remapping and encryption. read_header() and write_header()
now achieve that by specifying CRYPTO_HEADER area explicitly.
librbd: move get_file_offset() into CryptoObjectDispatch
This method doesn't propagate area. Since its only user is
CryptoObjectDispatch which is now applied only to DATA area,
move get_file_offset() there to avoid misuse in the future.
librbd: apply CryptoObjectDispatch layer only to DATA area
Objects in CRYPTO_HEADER area should not be subjected to encryption.
Unit tests needed adjustment because MockCryptoInterface is configured
with DATA_OFFSET = 4 * 1024 * 1024, thus disqualifying object 0.
- readahead and PWL cache are limited to DATA area as explained in
the previous commit
- DATA area is assumed for the journal as encryption can't be used
with journaling anyway
To postpone the churn associated with passing area through
ImageDispatchInterface (where only WriteLogImageDispatch and
ImageDispatch care), add a new image dispatch flag.
librbd: pass area to ImageDispatchSpec::create_*()
- DATA area is assumed at the API layer as there is no way to pass
an area
- DATA area is assumed by ImageWriteback because PWL cache persists
image extents as provided by the user without any kind of designator
and therefore can be active only in either area
- luks::FlattenRequest operates on CRYPTO_HEADER area
The passed area is acted upon in ImageDispatchSpec constructor in the
next commit.
Note that, as suggested by extents_to_file() signature, all returned
image extents would pertain to the same area. This means that an area
boundary must coincide with an object boundary.
luks::FormatRequest is actually more strict: crypto header area size is
set to a multiple of stripe period (i.e. one or more whole objects).
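The alignment constraint can be sketched as below, taking the stripe
period to be object size times stripe count. The function names are
hypothetical and the rounding helper is only an illustration of "a
multiple of stripe period", not the actual FormatRequest logic.

```python
M = 1024 * 1024  # one mebibyte, for readability


def stripe_period(object_size, stripe_count):
    # One stripe period spans stripe_count whole objects.
    return object_size * stripe_count


def is_aligned_header(header_size, object_size, stripe_count):
    """True if the header area ends exactly on a stripe period boundary."""
    return header_size % stripe_period(object_size, stripe_count) == 0


def round_up_header(header_size, object_size, stripe_count):
    """Round a header size up to the next multiple of the stripe period."""
    period = stripe_period(object_size, stripe_count)
    return -(-header_size // period) * period  # ceiling division


# 4M objects, stripe count 1: a 16M header ends on an object boundary.
print(is_aligned_header(16 * M, 4 * M, 1))     # True
# 4M objects, stripe count 2: a 4M header falls inside a stripe period.
print(is_aligned_header(4 * M, 4 * M, 2))      # False
print(round_up_header(4 * M, 4 * M, 2) // M)   # 8
```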
librbd: introduce ImageArea and split remap_extents() into two methods
Since remap in either direction can really be done only once, iterating
through image dispatch layers in ImageDispatcher::remap_extents() makes
no sense. For now, just replace the iteration with CryptoImageDispatch
lookup.
librbd: pass image_extents to create_{discard,write_same}()
These are still taking off and len separately which is inconsistent
with the rest of ImageDispatchSpec and also ImageDiscardRequest and
ImageWriteSameRequest.
Zac Dover [Thu, 1 Dec 2022 07:27:16 +0000 (17:27 +1000)]
doc/cephadm - remove "danger" admonition
Remove the "Danger!" admonition that warned against upgrading to Pacific
from an older version of Ceph. That admonition was meant to protect
against the issue discussed here: https://tracker.ceph.com/issues/53062.
That issue was fixed in this PR:
https://github.com/ceph/ceph/pull/43793.
gaoweinan [Thu, 1 Dec 2022 02:19:45 +0000 (10:19 +0800)]
doc/radosgw:Fix index error
Fix an index error. A note in the chapter "Pool Placement and storage
classes" says "If you have not done any previous Multisite
Configuration", which implies that the chapter "Multisite Configuration"
precedes it. In the index, however, "Multisite Configuration" follows
"Pool Placement and storage classes".
Zac Dover [Wed, 30 Nov 2022 03:56:52 +0000 (13:56 +1000)]
doc/cephadm: add airgapped install procedure
Add a procedure describing an installation with an airgapped registry.
This commit ingests work done in https://github.com/ceph/ceph/pull/44346
that was abandoned for lo these past eleven months. The PR connected
with this commit supersedes that PR.
rgw: refactor selected files for better above- vs below-the-line
Based on https://github.com/ceph/ceph/pull/48272, separate selected methods
into new files for above-the-line vs below-the-line linkage. This is more of
the work to prepare for eventual merging of the loadable module implementation.
"Utility" functions that don't reference the Store are moved, e.g. from
rgw_zone.cc, into the new rgw_zone_utils.cc. Methods in the new *_utils.cc
files are above-the-line.
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>