Casey Bodley [Wed, 28 Jun 2023 21:14:16 +0000 (17:14 -0400)]
rgw: fetch_remote_obj() preserves original part lengths for BlockDecrypt
because multisite replicates multipart objects as a single part, we lose
information about the part sizes from the original manifest that is
necessary to correctly decrypt across those part boundaries
on replication, parse the part lengths out of the source object's
manifest, and store them in a separate RGW_ATTR_CRYPT_PARTS for use on
decryption
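Note: a minimal sketch of why the original part lengths matter, assuming (as the message above implies) that decryption has to be restarted at each original part boundary. This is not RGW code; the function name `locate` and the plain list of lengths are hypothetical stand-ins for whatever is stored in RGW_ATTR_CRYPT_PARTS.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(part_lengths, offset):
    """Map a logical object offset to (part index, offset inside that part).

    part_lengths is a hypothetical list of the original multipart part sizes.
    """
    ends = list(accumulate(part_lengths))  # cumulative end offset of each part
    total = ends[-1] if ends else 0
    if not 0 <= offset < total:
        raise ValueError("offset is outside the object")
    idx = bisect_right(ends, offset)       # first part whose end is past offset
    start = ends[idx - 1] if idx else 0    # logical offset where that part begins
    return idx, offset - start
```

Without the per-part lengths, a replicated object that looks like one large part gives no way to recover `idx` or the in-part offset.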
Update the doc to match the reality in the code. I don't know where
the recommendation to have min_size = k+2 came from, but for a while
now we've defaulted to k+1. See PR #8008.
Signed-off-by: Dan van der Ster <dan.vanderster@clyso.com>
Ali Masarwa [Wed, 12 Jul 2023 10:41:47 +0000 (13:41 +0300)]
RGW: control the persistency of the notification
via adding expiry by number of retries and time
and controlling the frequency of retries by sleep duration (in the options of vstart)
Signed-off-by: Ali Masarwa <ali.saed.masarwa@gmail.com>
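Note: a minimal sketch of the retry policy described in the commit above, where a persistent notification expires after a number of retries or after a time limit, with a sleep between attempts. The parameter names (`max_retries`, `time_to_live`, `sleep_duration`) are hypothetical and are not the actual RGW option names; the `clock` and `sleep` arguments exist only so the loop can be exercised without real waiting.

```python
import time


def deliver_with_expiry(send, max_retries, time_to_live, sleep_duration,
                        clock=time.monotonic, sleep=time.sleep):
    """Call send() until it succeeds or the notification expires.

    Returns True on delivery, False once either limit is exceeded.
    """
    start = clock()
    failures = 0
    while True:
        if send():
            return True
        failures += 1
        # Expire by retry count (first attempt plus max_retries retries)
        # or by elapsed time, whichever limit is hit first.
        if failures > max_retries or clock() - start >= time_to_live:
            return False
        sleep(sleep_duration)  # controls how frequently retries happen
```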
Yuval Lifshitz [Wed, 2 Aug 2023 10:19:00 +0000 (10:19 +0000)]
rgw/amqp: skip idleness tests since it needs to sleep longer than 30s
The current idle timeout is 30s, so making the test sleep for 30s may not
be enough. Set the sleep time to be longer, and skip the test so it
won't take too long.
* refs/pull/52675/head:
test/TestOSDMap: don't use the deprecated std::random_shuffle method
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Thu, 18 May 2023 13:52:10 +0000 (09:52 -0400)]
qa/tasks/vstart_runner: allow writing to command's stdin
There's no technical reason to disallow this. The original intent was to
avoid deadlocks but this possibility is already present when interacting
with a teuthology RemoteProcess. Avoiding it only for local processes
does not make sense.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
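Note: a minimal sketch of feeding data to a local command's stdin with the Python standard library. It is not the vstart_runner implementation; the helper name `run_with_stdin` is hypothetical. `subprocess.run(input=...)` uses `communicate()` internally, which writes stdin and drains stdout/stderr together, so the classic full-pipe deadlock does not arise here.

```python
import subprocess
import sys


def run_with_stdin(args, data):
    """Run a local command, write data (bytes) to its stdin, return its stdout."""
    proc = subprocess.run(args, input=data, capture_output=True, check=True)
    return proc.stdout


# Child process: read everything from stdin and echo it back upper-cased.
child = "import sys; sys.stdout.write(sys.stdin.read().upper())"
out = run_with_stdin([sys.executable, "-c", child], b"hello")
```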
Prior to this set of commits, the MDS would write the ESubtreeMap to the
journal, trim everything up to that segment, then finally force the
trimming of that last segment (`MDLog::trim(0)`). This is awkward in the
new code which preserves a major segment boundary at the beginning of
the journal during trimming. Instead of writing a special case for this
situation, it seems more natural to just use a new "lid" or "cap" event
to mark the beginning of the journal when no subtree map can yet be
written but we need sequence numbers to tie in other MDS tables.
Like ESegment, ELid doesn't actually contain any state. It's just a
marker for the beginning of the log after rank deactivation or rank
creation. It can appear in the middle of the log if the shutdown
sequence is interrupted while writing the event but the MDS will skip it
during replay in that case.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Mon, 30 Jan 2023 19:33:32 +0000 (14:33 -0500)]
qa: add numerous subtree test
When the ESubtreeMap is very large (~5k+ subtrees), the MDS will
end up logging only a few events (as bad as 1) per segment as the
subtree map dominates the segment size.
This test simply creates an artificially large subtree and confirms that
other file system activity completes in a timely manner. This is now
taking advantage of the minor segments which allows for a normal set of
events per log segment (and fewer subtree maps). The test fails on the
current main HEAD.
Historical note: when I first observed this aberrant behavior, the
vstart cluster was actually using mds_debug_subtrees = True (the default
for every vstart cluster). This caused the MDS to write out the subtree
map (for debugging reasons) with every event. When testing the MDS with
large subtrees (distributed ephemeral pinning), this caused the MDS to
slow to a trickle of operations per second. Despite this unintentional
misconfiguration, the problem still exists but the number of auth
subtrees must be large for a particular rank to replicate the behavior.
On main HEAD, the creation of 10k files (workload stage) takes ~110
seconds. On this branch, it takes ~30 seconds.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This commit adds a new ESegment event type which can delineate
LogSegments. This event can be used as an alternative to the heavyweight
ESubtreeMap, which can be very expensive to generate when the MDS
has a large subtree map.
Fixes: https://tracker.ceph.com/issues/58154
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The major problem here is that the MDLog::_start_entry method puts the
current event sequence number in the EMetaBlob of the event (if
present). Because of this, no other event can be submitted as this would
invalidate the event sequence. Instead, fix up the event sequence during
submission and simplify related logic that uses it during EMetaBlob
construction.
Secondarily, for the purposes of this commit series, _start_entry
introduced recursive locks when generating the ESubtreeMap within
MDLog::_segment_upkeep. So, this commit is a necessary cleanup.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
John Mulligan [Mon, 25 Jul 2022 21:01:12 +0000 (17:01 -0400)]
script: add a new cpatch.py that is like cpatch but in python
This is a somewhat straightforward translation of the current cpatch
bash script into python. The image is built in the same manner but some
of the python components can be selected in a more granular manner.
Why do this? I had my own private copy of the cpatch script that I have
been modifying for my own needs, one that has the previously mentioned
more granular selection of python components. However, I had additional
plans for this script and found it difficult to manage and I felt that a
single-source-file python program would allow the use of python's nicer
(IMO) CLI parsing, data structures, and standard library modules. I also
think that this provides a cleaner program structure that will allow for
experimentation with different backends vs. the current approach of
copying files and creating a temporary Dockerfile.
For whatever reason the original cpatch was assuming that the python
version was 3.8, however in the current ceph images using centos 8
stream as a base, python 3.6 is the available python version.
Old versions of ceph kept cephadm as one large single source file in the
source tree. New versions "compile" cephadm into a python zipapp. Try to
automatically detect which case and "install" the correct form of
cephadm.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
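Note: one way the "which form of cephadm is this?" check described above could be done, shown as a sketch only. It is an assumption, not what cpatch.py actually does; the function name `cephadm_form` is hypothetical. It relies on the fact that a Python zipapp is a zip archive, possibly with a shebang line prepended, which `zipfile.is_zipfile()` still recognizes.

```python
import zipfile


def cephadm_form(path):
    """Guess whether a cephadm file is a compiled zipapp or one source file."""
    # is_zipfile() looks for the zip end-of-central-directory record, so a
    # zipapp with a leading "#!" interpreter line is still detected.
    return "zipapp" if zipfile.is_zipfile(path) else "single-file"
```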
[346/687] Building CXX object src/erasure-code/clay/CMakeFiles/ec_clay.dir/ErasureCodeClay.cc.o
In file included from /home/cbodley/ceph/src/common/dout.h:21,
from /home/cbodley/ceph/src/common/debug.h:18,
from /home/cbodley/ceph/src/erasure-code/clay/ErasureCodeClay.cc:22:
/home/cbodley/ceph/src/erasure-code/clay/ErasureCodeClay.cc: In member function ‘int ErasureCodeClay::repair_one_lost_chunk(std::map<int, ceph::buffer::v15_2_0::list>&, std::set<int>&, std::map<int, ceph::buffer::v15_2_0::list>&, int, std::vector<std::pair<int, int> >&)’:
/home/cbodley/ceph/src/erasure-code/clay/ErasureCodeClay.cc:620:41: warning: ‘lost_chunk’ may be used uninitialized [-Wmaybe-uninitialized]
620 | ceph_assert(y == lost_chunk / q);
| ~~~~~~~~~~~^~~
/home/cbodley/ceph/src/include/ceph_assert.h:106:6: note: in definition of macro ‘ceph_assert’
106 | ((expr) \
| ^~~~
/home/cbodley/ceph/src/erasure-code/clay/ErasureCodeClay.cc:510:7: note: ‘lost_chunk’ was declared here
510 | int lost_chunk;
| ^~~~~~~~~~
Adam Emerson [Wed, 19 Jul 2023 21:12:08 +0000 (17:12 -0400)]
build: Remove old ceph-libboost* packages in install-deps
Here, we extract `clean_boost_on_ubuntu()` and call it before other
installs on Debian distributions so that if we install a system boost,
a potentially newer `ceph-libboost` won't get in the way.
As the sources.list.d being removed in the original cleanup code isn't
the one we're currently installing in the install code, add a removal
for the currently used source, then do apt-update so packages from the
removed source are no longer included as available.
Two subsidiary dev packages from conflicting boost libraries can be
installed, but it leaves apt in an inconsistent state. To clean this
up, add `--fix-missing` to the removal line and call
`clean_boost_on_ubuntu()` before other uses of apt.
Fixes: https://tracker.ceph.com/issues/62097
Signed-off-by: Adam Emerson <aemerson@redhat.com>
The run_cluster_cmd() method is no longer available because it was
deleted in this PR:
https://github.com/ceph/ceph/pull/50569/files#diff-1c6c246ba42f343603d7174198dd1fb9c2654b6c883594d1a0891096b7a35875L408
Fixes: https://tracker.ceph.com/issues/62243
Signed-off-by: Rishabh Dave <ridave@redhat.com>