Ilya Dryomov [Sun, 10 Dec 2023 16:01:24 +0000 (17:01 +0100)]
test/pybind/rbd: don't ignore from_snapshot in check_diff()
Despite the test in test_diff_iterate() being correct, it started
failing:
> check_diff(self.image, 0, IMG_SIZE, 'snap1', [(0, 512, False)])
...
a = [], b = [(0, 512, False)]
...
> assert a == b
E AssertionError
This is because check_diff() drops 'snap1' argument on the floor and
passes None to image.diff_iterate() instead. This goes back to 2013,
see commit e88fe3cbbc8f ("rbd.py: add some missing functions").
- beginning of time -> HEAD, through intermediate snap
- snap -> snap, directly
- snap -> HEAD, directly
But coverage is too weak: none of the weird OBJECT_PENDING cases and
only a single diff-iterate vs deep-copy case is tested, for example.
Coverage is missing completely for:
- beginning of time -> HEAD, directly
- beginning of time -> snap, directly
- beginning of time -> snap, through intermediate snap
- snap -> snap, through intermediate snap
- snap -> HEAD, through intermediate snap
Ilya Dryomov [Fri, 8 Dec 2023 14:19:02 +0000 (15:19 +0100)]
librbd: OBJECT_PENDING should always be treated as dirty
OBJECT_PENDING is a transition state which normally isn't encountered
in (snapshot) object maps. In case it's encountered, for example when
a snapshot is taken after losing power at the time a discard was being
handled, the object should be treated as dirty and produce a diff as
a result.
Assuming an object is marked OBJECT_PENDING, theoretically there are
four cases with respect to object's state in the next snapshot:
Prior to commit b81cd2460de7 ("librbd/object_map: diff state machine
should track object existence"), (3) was handled incorrectly (diff set
to DIFF_STATE_NONE instead of DIFF_STATE_UPDATED).
Post commit 399a45e11332 ("librbd/object_map: rbd diff between two
snapshots lists entire image content"), (4) is handled incorrectly
(diff set to DIFF_STATE_DATA instead of DIFF_STATE_DATA_UPDATED).
Similar to DiffIterateTest.DiffIterateDeterministic, systematically
cover the most common cases involving full-object discards. With this
in place, issue [1] can be reproduced by any of:
(preparatory) before snap3 is taken
(1) beginning of time -> HEAD
(2) snap1 -> HEAD
(5) beginning of time -> snap3
(6) snap1 -> snap3
Sub-object discards aren't covered here because of further issues
[2][3].
Ilya Dryomov [Fri, 10 Nov 2023 10:14:42 +0000 (11:14 +0100)]
librbd: resurrect "exists" assert in simple_diff_cb()
This effectively reverts commit 3ccc3bb4bd35 ("librbd: diff_iterate
needs to handle holes in parent images") which just dropped the assert
instead of addressing the root cause of reported crashes.
Ilya Dryomov [Thu, 9 Nov 2023 19:44:18 +0000 (20:44 +0100)]
librbd: diff-iterate shouldn't ever report "new hole" against a hole
If an object doesn't exist in both start and end versions but there is
an intermediate snapshot which contains it (i.e. the object is written
to and captured at some point but then discarded prior to or in the end
version), diff-iterate reports "new hole" -- callback is invoked with
exists=false. This occurs both on the slow list_snaps path and in
fast-diff mode.
Despite going all the way back to the introduction of diff-iterate in
commit 0296c7cdae91 ("librbd: implement diff_iterate"), this behavior
is wrong and contradicts diff-iterate API documentation added in commit a69532e86450 ("librbd: document diff_iterate in header") in the same
series:
If the source snapshot name is NULL, we interpret that as
the beginning of time and return all allocated regions of the
image.
It also triggered an assert added in commit c680531e070a ("librbd:
change diff_iterate interface to be more C-friendly") in the same
series. Unfortunately, commit f1f6407221a0 ("test_librbd: add
diff_iterate test including discard"), also part of the same series,
added a test which expected the wrong behavior. Very confusing!
A year later, a different manifestation of this bug was fixed in commit 9a1ab95176fe ("rbd: Fix rbd diff for non-existent objects"), but the
fix only covered the case where calc_snap_set_diff() goes past the end
snap ID while processing clones. The case where it runs out of clones
to process before reaching the end snap ID remained mishandled.
A year after that, commit 3ccc3bb4bd35 ("librbd: diff_iterate needs to
handle holes in parent images") dropped the assert mentioned above and
this bug got enshrined in the newly introduced fast-diff mode.
Finally, a few years later, deep-copy actually started relying on this
bug in commit e5a21e904142 ("librbd: deep-copy image copy state machine
skips clean objects"). This necessitates bifurcation in DiffRequest
because deep-copy wants the "has this object been touched" semantics,
which is different from diff-iterate (and also potentially much more
expensive to produce!).
This commit brings a minimal update to TestMockObjectMapDiffRequest
tests and DiffIterateTest.DiffIterateDiscard. Coverage is expanded in
the following commits.
Zac Dover [Tue, 5 Dec 2023 19:46:26 +0000 (20:46 +0100)]
doc/radosgw: update link in rgw-cache.rst
Update link in doc/radosgw/rgw-cache.rst. The link updated here is a
link to all the Nginx configuration files. The old link was broken. This
update comes to us from an anonymous report on
https://pad.ceph.com/p/Report_Documentation_Bugs.
Casey Bodley [Tue, 5 Dec 2023 17:21:18 +0000 (12:21 -0500)]
common: use inline for monostate dencoders
fix a 'multiple definition' error when included by multiple sources:
src/common/versioned_variant.h:31: multiple definition of `ceph::encode(std::monostate const&, ceph::buffer::v15_2_0::list&)';
rgw_main.cc.o:src/common/versioned_variant.h:31: first defined here
Ilya Dryomov [Fri, 1 Dec 2023 17:29:12 +0000 (18:29 +0100)]
test/librbd: drop DiffIterateTest.DiffIterateRegression6926
This was added to test [1]. It's duplicated by several cases in
DiffIterateTest.DiffIterateDeterministicPP now. Specifically, the
issue could be reproduced by any of:
(8) beginning of time -> snap2
(9) snap1 -> snap2
(10) beginning of time -> snap1
Ilya Dryomov [Fri, 1 Dec 2023 17:54:19 +0000 (18:54 +0100)]
test/librbd: drop TestLibRBD.SnapDiff
This was added to integration test [1], separate from the fix which
went in only with unit test adjustments. It's duplicated by several
cases in DiffIterateTest.DiffIterateDeterministic now. Specifically,
the issue could be reproduced by any of:
(3) snap2 -> HEAD
(4) snap3 -> HEAD
(7) snap2 -> snap3
scribble()-based DiffIterate tests are too weak: at least two
regressions that should been caught by DiffIterate.DiffIterate or
DiffIterate.DiffIterateStress were missed [1][2]. Aside from the
randomness which can be both a good and a bad thing, asserts there
ensure only that the returned diff covers all changes that were made.
If the returned diff is too excessive or otherwise bogus, this isn't
detected [3].
Add a deterministic test to systematically cover the most common cases
that don't involve discards. A similar test for discards will be added
with the fix for [4].
Comment out debug log in vector_iterate_cb() like it's done in
iterate_cb().
Vallari Agrawal [Thu, 26 Oct 2023 07:55:44 +0000 (13:25 +0530)]
qa: add rbd/nvmeof test
A basic test for ceph-nvmeof[1] where
nvmeof initiator is created.
It requires use of a new task "nvmeof_gateway_cfg"
under cephadm which shares config information
between two remote hosts.
Zac Dover [Sun, 3 Dec 2023 12:17:46 +0000 (13:17 +0100)]
doc/rados: repair stretch-mode.rst
Remove a section of doc/rados/operations/stretch-mode.rst that I wrongly
re-included after its removal. The request for this (re)-removal is
here: https://github.com/ceph/ceph/pull/54689#discussion_r1413007655.
Ilya Dryomov [Mon, 27 Nov 2023 10:59:26 +0000 (11:59 +0100)]
librbd: fix read_whole_object handling in ObjectListSnapsRequest
Originally, in commit 2be4840afd4f ("librados/snap_set_diff: don't
assert on empty snapset"), exists was set to true. This didn't make
ObjectListSnapsRequest, causing the following deep-copy tests to fail
when run against calc_snap_set_diff() rigged to return "whole object"
as described in [1]:
This is a regression introduced in commit cc87a8bd697e ("librbd:
deep-copy object utilizes image-extent IO methods") by way of commit 11923e234efc ("librbd: generic object list snapshot request").
Ilya Dryomov [Mon, 27 Nov 2023 09:11:52 +0000 (10:11 +0100)]
librbd: fix LIST_SNAPS_FLAG_WHOLE_OBJECT behavior
Bundling read_whole_object and LIST_SNAPS_FLAG_WHOLE_OBJECT cases
together is wrong:
- In read_whole_object case, calc_snap_set_diff() sets just
read_whole_object. Everything else is zeroed out and may require
resetting to fit with the rest of ObjectListSnapsRequest logic.
- In LIST_SNAPS_FLAG_WHOLE_OBJECT case, only the diff should be
expanded. Everything else is set by calc_snap_set_diff() and should
be used as is. This goes for end_size in particular -- if it's reset
to object size, bogus zero extents may be returned as the object
would appear to have grown.
This is a regression introduced in commit 4429ed4f3f4c ("librbd: switch
diff iterate API to use new snaps list dispatch methods") by way of
commit 66dd53d9c4d9 ("librbd: optionally return full object extent for
any snapshot deltas").
Ilya Dryomov [Sun, 19 Nov 2023 21:44:28 +0000 (22:44 +0100)]
test/librbd: make ListSnapsWholeObject actually test stuff
Despite being added in commit 66dd53d9c4d9 ("librbd: optionally return
full object extent for any snapshot deltas") ostensibly to test the new
LIST_SNAPS_FLAG_WHOLE_OBJECT code, it surely doesn't do that because
the flag isn't even passed to MockObjectListSnapsRequest::create().
I can only guess, but it looks like snap ID 3 was intended to be
a starting point. Otherwise, with 0 and CEPH_NOSNAP passed as snap
IDs, the overlap that is set up for the clone wouldn't affect the
computation in any way.
Use snap ID 3 as a starting point and run both with and without
LIST_SNAPS_FLAG_WHOLE_OBJECT on the same snapset to pinpoint the
difference.
Ilya Dryomov [Sat, 11 Nov 2023 13:15:49 +0000 (14:15 +0100)]
librados/snap_set_diff: set end_size only if end object exists
Since commit 73f50a13109f ("rbd-mirror: use generalized deep copy for
image sync"), the only user of calc_snap_set_diff() immediately unsets
end_size otherwise.
calc_snap_set_diff() semantics are clearer if end_size is set together
with end_exists and clone_end_snap_id.
Zac Dover [Sat, 2 Dec 2023 05:38:28 +0000 (06:38 +0100)]
doc/radosgw: fix formatting
Repair the formatting of a string that had a string inside backticks
that itself was inside double asterisks. The presence of the asterisks
around the entire string caused the backticks to appear in the rendered
documentation.
Ronen Friedman [Fri, 1 Dec 2023 14:48:49 +0000 (08:48 -0600)]
tests/scrub: deactivate osd-scrub-dump stand-alone test
as the scrub reservation changes had made it obsolete.
Note - it is not an issue of fixing the test, but rather
that the tested functionality is no longer there.
Casey Bodley [Tue, 14 Nov 2023 01:05:47 +0000 (20:05 -0500)]
common: add versioned encodings for std::variant
adds two encoding strategies for `std::variant<>` under the namespaces
`ceph::versioned_variant` and `ceph::converted_variant`
these versioned encodings allow the variant to be extended with new
types, provided that they're always added to the end without changing
or removing existing types. because of this requirement, no default
encoding is provided for `std::variant`. callers must opt in to one
namespace or the other
the `converted_variant` encoding requires the variant's first type T
to use versioned encoding, and guarantees that the variant's encoding
is backward-compatible with T's
Nizamudeen A [Wed, 18 Oct 2023 17:46:09 +0000 (23:16 +0530)]
mgr/dashboard: cephfs subvolume list snapshots
Added a tab for displaying the subvolume snapshots
- this tab will show an info alert when there are no subvolumes present
- if the subvolume is present, then it'll be auto-selected by default
Implemented a filter to search the groups and subvolumes by its name.
Also added a scrollbar when there are too many items in the nav list
Modified the REST APIs to fetch only the names of the resources and
fetch the info when an API call is requesting for it.
Added unit tests
Fixes: https://tracker.ceph.com/issues/63237 Signed-off-by: Nizamudeen A <nia@redhat.com>
John Mulligan [Thu, 9 Nov 2023 19:26:35 +0000 (14:26 -0500)]
cephadm: convert all deploy tests to use funkypatch fixture
During the refactor of various daemon type classes some of the tests had
been converted to funkypatch in order to deal with imports occuring over
multiple files. However, this conversion was done piece by piece in
order to make clear what was changing. This left the functions in this
file inconsistent. Change all the remaining function to use funkypatch
for consistency.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 9 Nov 2023 18:46:04 +0000 (13:46 -0500)]
cephamd: update tests to use should_log_to_journald from context_getters
Update tests to import should_log_to_journald from context_getters - the
module that actually defines that function. This makes later refactoring
easier.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 9 Nov 2023 00:21:10 +0000 (19:21 -0500)]
cephadm: use funkypatch for setting up common patches in deploy tests
Add a shim function and convert to the use of the FunkyPatcher class in
the test_deploy.py test functions. Use a shim as to not have to change
all the tests (yet).
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 8 Nov 2023 19:31:12 +0000 (14:31 -0500)]
cephadm: convert test_mon_crush_location to use funkypatch fixture
The test_mon_crush_location test always seems to have me tinkering
with it during refactoring. Re-do the fixtures to use funkpatch instead
of mock.patch and normal monkeypatch. This looks nicer (IMO) and should
avoid having to frequently mess with it when moving functions during future
refactoring.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sun, 20 Aug 2023 17:50:00 +0000 (13:50 -0400)]
cephadm: add a new funkypatch fixture based on mock.patch and pytest
This fixture acts like a combination of mock.patch and pytest's
monkeypatch fixture. It has the additional feature of automatically
finding and patching the same object imported in other modules. If you
have 'from x import y', where y is a function or class, in both a.py and
b.py it will patch both instances (so long as both a and b are already
imported).
This behavior is useful for cephadm because of the heavy use of the
`from x import y` idiom and how cephadm is being actively refactored.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sun, 5 Nov 2023 21:03:34 +0000 (16:03 -0500)]
cephadm: add a make_run_dir function
This function is roughly the same as make_var_run only it doesn't rely
on shelling out to the install command. Eventually, it will be used
to replace make_var_run in certain locations.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Casey Bodley [Sat, 18 Nov 2023 16:27:50 +0000 (11:27 -0500)]
rgw/acl: ACLGrant uses variant for grantee types
use of `ACLGrant::get_id()` was awkward because most grantee types
returned nothing, but emails were returned as `struct rgw_user`. change
the internal representation into a variant, and expose getters for each
grantee type so callers can handle each type specifically. the encoded
format of `ACLGrant` remains unchanged
Casey Bodley [Sat, 18 Nov 2023 15:22:20 +0000 (10:22 -0500)]
rgw/acl: req_state stores ACLs by value instead of unique_ptr
we no longer rely on polymorphism for the s3/swift variants of
`RGWAccessControlPolicy`, so `req_state` can store `bucket_acl`,
`object_acl` and `user_acl` by value
most functions now take these acls by const- or mutable reference
instead of pointers since they won't be nullptr
some code paths won't initialize some of these bucket/object/user acl
variables, and we rely on `RGWAccessControlPolicy::verify_permissions()`
to return false for those because we won't match an empty owner or
array of grants
in only one case, `verify_user_permissions()` has to return true when
`user_acl` is uninitialized, because S3 doesn't have user acls so
uninitialized user acls should not deny access
Casey Bodley [Sat, 18 Nov 2023 02:29:25 +0000 (21:29 -0500)]
rgw/acl/s3: parse_policy() as free function
s3 acl parsing classes no longer inherit from the acl classes
themselves, and are all encapsulated in rgw_acl_s3.cc behind a single
rgw::s3::parse_policy() function