Patrick Donnelly [Tue, 30 Apr 2024 16:19:31 +0000 (12:19 -0400)]
Merge PR #56934 into main
* refs/pull/56934/head:
mds: move drop_locks to directly after rdonly check
qa: test quiesce.block is replicated
qa: test that ceph.dir.subvolume is replicated properly
mds: add debug "lock path" command
qa: move reqid_tostr helper
qa: return run_shell process for waiters
Add a list of default monitor images to the documentation. This commit
is made in response to a request from Eugen Block, and is made using the
information developed by Mr Block here:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/QGC66QIFBKRTPZAQMQEYFXOGZJ7RLWBN/.
At present, if a transaction gets interrupted right after it enters
WritePipeline::ReserveProjectedUsage and before any later continuations
get executed, WritePipeline::ReserveProjectedUsage will be locked
forever.
crimson/osd/pg: SnapTrimEvent to support interrupts
SnapTrimEvent operations are scheduled from `PG::on_active_actmap()`
using a `seastar::do_until` loop. This commit changes the loop to an
`interruptor::repeat`, and SnapTrimEvent operations are now scheduled by
`start_operation_may_interrupt`.
Previously, `SnapTrimEvent::start` handled interruptions by returning
a `crimson::ct_error::eagain::make();`. Now, the errorator is directly
returned via the `snap_trim_event_ret_t` and interrupts the loop
described above.
As a result, interruptions originating from interval changes are now
supported by SnapTrimEvent.
test/ceph_crypto: define __has_feature if the compiler doesn't have it
Refer to https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005ffeature.html
and https://clang.llvm.org/docs/LanguageExtensions.html#has-feature-and-has-extension
for further information
so, in this change, let's manage the lifecycle of the `CrushWrapper`
instance with a smart pointer, so that it is destroyed and freed
properly. this should silence the ASan warning.
erasure-code/shec: use free() to release alloc()'ed memory chunk
ASan warns
```
==445793==ERROR: AddressSanitizer: alloc-dealloc-mismatch (malloc vs operator delete) on 0x602000039b10
#0 0x5604a544112d in operator delete(void*) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_erasure_code_shec_all+0x1e012d) (BuildId: 8cfc74d22471b6905f9b23304aed2af945265a13)
#1 0x7fc14752f588 in ErasureCodeShecTableCache::~ErasureCodeShecTableCache() /home/jenkins-build/build/workspace/ceph-pull-requests/src/erasure-code/shec/ErasureCodeShecTableCache.cc:61:19
#2 0x5604a544ccbe in ParameterTest_parameter_all_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/TestErasureCodeShec_all.cc:263:1
...
0x602000039b10 is located 0 bytes inside of 4-byte region [0x602000039b10,0x602000039b14)
allocated by thread T0 here:
#0 0x5604a5405afe in malloc (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_erasure_code_shec_all+0x1a4afe) (BuildId: 8cfc74d22471b6905f9b23304aed2af945265a13)
#1 0x7fc1474c9617 in reed_sol_vandermonde_coding_matrix /home/jenkins-build/build/workspace/ceph-pull-requests/src/erasure-code/jerasure/jerasure/src/reed_sol.c:86:10
#2 0x7fc147528634 in ErasureCodeShec::shec_reedsolomon_coding_matrix(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/erasure-code/shec/ErasureCodeShec.cc:514:12
#3 0x7fc147526cd8 in ErasureCodeShecReedSolomonVandermonde::prepare() /home/jenkins-build/build/workspace/ceph-pull-requests/src/erasure-code/shec/ErasureCodeShec.cc:390:14
#4 0x7fc1475187aa in ErasureCodeShec::init(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::ostream*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/erasure-code/shec/ErasureCodeShec.cc:57:3
```
where we use `delete` to free the encoder matrix allocated using
`malloc()`. as jerasure is a library implemented in C,
unless we want to reimplement it in C++, we should use `free()` to
free the memory chunk allocated by
`reed_sol_vandermonde_coding_matrix()`. also, please note that
jerasure does not provide a function to free the memory allocated
by this function, so we have to look at its implementation and call
`free()` directly. this should silence the ASan warning.
erasure-code/shec: replace 0 with nullptr when appropriate
0 fails to send the message to human readers that the variable is
a pointer, but nullptr does. for improving the readability, let's
use nullptr when the variable in question is a pointer.
Explain that an error message received in response to
"redirect_resolve_ip_addr True" might be caused by having an
insufficiently recent release of Ceph running in your cluster.
John Mulligan [Tue, 23 Apr 2024 12:16:19 +0000 (08:16 -0400)]
qa/tasks/cephadm: add a wait_for_service_not_present task func
Add a wait_for_service_not_present task function that will wait until a
given service name is not present in the list of running cephadm
services. This is intended for testing service cleanup operations.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
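The wait-until-absent behavior described in this commit can be sketched as a small polling loop. This is only an illustration, not the task function added to qa/tasks/cephadm: `wait_for_absence`, `list_services`, and the `timeout`, `interval`, and `sleep` parameters are hypothetical names, and the service listing is supplied as a callable instead of being queried from cephadm.

```python
import time
from typing import Callable, Iterable


def wait_for_absence(
    list_services: Callable[[], Iterable[str]],
    service_name: str,
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until service_name is no longer listed; return the number of polls.

    Raises TimeoutError if the service is still listed once the timeout expires.
    All names here are hypothetical and chosen for illustration only.
    """
    deadline = time.monotonic() + timeout
    polls = 0
    while True:
        polls += 1
        # Success as soon as the service disappears from the listing.
        if service_name not in set(list_services()):
            return polls
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{service_name} still present after {timeout}s")
        sleep(interval)
```

Passing the listing function and the sleep function as arguments keeps the loop easy to exercise in a test without a running cluster.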
John Mulligan [Sat, 30 Mar 2024 20:50:29 +0000 (16:50 -0400)]
doc/mgr: add documentation for new smb mgr module
Add initial documentation for the new smb mgr module. It doesn't cover
every possible thing or expected future changes but it should cover
the basics of interacting with the module from the cli.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 20 Mar 2024 18:08:24 +0000 (14:08 -0400)]
ceph.spec.in: add smb module and python-dataclasses dependency
The only distro ceph squid+ is building for at the moment that does not
already have a python version that includes dataclasses is centos/rhel
8. Add a dependency for the backport package on rhel8.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 30 Jan 2024 21:49:25 +0000 (16:49 -0500)]
pybind/mgr: use black & isort on the smb module
Provide tox envs that check or reformat code with black and isort,
currently applied to only the new smb module.
This is similar to what we recently did for enabling tox in the
cephadmlib dir, as it only applies to new code. However, other modules
that want to opt in to the automated, python-community-wide,
stop-thinking-and-let-tools-do-it approach to code formatting can
be added to the new envs later on.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 30 Jan 2024 19:39:16 +0000 (14:39 -0500)]
pybind/mgr/smb: add resourcelib.py an internal resource mgmt lib
While I like the workflow that `ceph orch apply` provides, I find the
code a little too "loose". Create a new minimalistic un/re-structuring
library that is partly inspired by my work with Go, cephadm, and a little
by pydantic, but without adding any dependencies beyond python's
dataclasses.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
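As a rough idea of what an un/re-structuring helper built only on dataclasses might look like, here is a minimal sketch. It is not the resourcelib.py API: the `Share` class and the `unstructure` and `restructure` functions are hypothetical, and the real library may differ in scope and behavior.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Dict


@dataclass
class Share:
    # Hypothetical resource type, used only for this example.
    name: str
    path: str
    readonly: bool = False


def unstructure(obj: Any) -> Any:
    """Convert a dataclass instance (recursively) into plain dicts and lists."""
    if is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: unstructure(getattr(obj, f.name)) for f in fields(obj)}
    if isinstance(obj, list):
        return [unstructure(v) for v in obj]
    return obj


def restructure(cls: type, data: Dict[str, Any]) -> Any:
    """Build a flat dataclass from a dict, rejecting unknown keys."""
    known = {f.name for f in fields(cls)}
    unknown = set(data) - known
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return cls(**data)
```

Rejecting unknown keys is one way to make input handling stricter; whether the actual library does this is an assumption of the sketch.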
The config store abstraction is defined in proto.py; the config_store.py
configuration stores meet this protocol with wrappers around in-memory
structures.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
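The idea of a protocol that in-memory stores satisfy can be sketched with `typing.Protocol`. The method names, the key shape, and the `MemConfigStore` class below are assumptions made for illustration; they are not taken from proto.py or config_store.py.

```python
from typing import Dict, Iterator, Optional, Protocol, Tuple

Key = Tuple[str, str]  # (namespace, name); an assumed key shape


class ConfigStore(Protocol):
    """Hypothetical store protocol: any class with these methods conforms."""

    def get(self, key: Key) -> Optional[dict]: ...
    def set(self, key: Key, value: dict) -> None: ...
    def remove(self, key: Key) -> bool: ...
    def __iter__(self) -> Iterator[Key]: ...


class MemConfigStore:
    """In-memory store: a thin wrapper around a dict (no inheritance needed)."""

    def __init__(self) -> None:
        self._data: Dict[Key, dict] = {}

    def get(self, key: Key) -> Optional[dict]:
        return self._data.get(key)

    def set(self, key: Key, value: dict) -> None:
        self._data[key] = value

    def remove(self, key: Key) -> bool:
        # Report whether the key was actually present.
        return self._data.pop(key, None) is not None

    def __iter__(self) -> Iterator[Key]:
        return iter(list(self._data))


def count_entries(store: ConfigStore) -> int:
    """Works with any object that structurally matches ConfigStore."""
    return sum(1 for _ in store)
```

Because the protocol is structural, a persistent store could be swapped in later without the callers changing.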
John Mulligan [Tue, 9 Apr 2024 23:04:18 +0000 (19:04 -0400)]
cephadm: handle user_sources uri values in smb daemon
When a smb daemon is being configured it may have user_sources - a
field containing uris that are supplemental configurations expected
to define users and/or groups for a non-AD member server. Ensure these
uris get passed to the env var for the config uris to get processed.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 9 Apr 2024 21:41:39 +0000 (17:41 -0400)]
python-common: add a user_sources field to smb service spec
We had a mechanism for passing primary configs and join sources to the
smb service but need a way to pass configs containing user (and group)
definitions for non-AD scenarios.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Leonid Usov [Sat, 16 Mar 2024 15:41:47 +0000 (11:41 -0400)]
qa/tasks: vstart_runner: introduce --config-mode
The new mode of the vstart_runner allows for passing
paths to yaml configs that will be merged and then
run just as the teuthology would do it.
Building on the standard run method we can even
pass "-" as the config name and provide one on the stdin like
    python3 ../qa/tasks/vstart_runner.py --config-mode "-" << END
    tasks:
    - quiescer:
        quiesce_factor: 0.5
        min_quiesce: 10
        max_quiesce: 10
        initial_delay: 5
        cancelations_cap: 2
        paths:
          - a
          - b
          - c
    - waiter:
        on_exit: 100
    END
This commit does the minimum to allow testing of the quiescer,
but it also lays the groundwork for running arbitrary configs.
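Merging several configs into one can be pictured with the sketch below, which operates on already-parsed dictionaries. The merge rules here (dicts merge recursively, lists concatenate, other values are overridden by the later config) are an assumption for illustration; the rules actually applied by vstart_runner and teuthology may differ, and `deep_merge` and `merge_configs` are hypothetical names.

```python
from copy import deepcopy
from functools import reduce
from typing import Any, Dict, Iterable


def deep_merge(base: Any, update: Any) -> Any:
    """Return a new value combining base and update without mutating either."""
    if isinstance(base, dict) and isinstance(update, dict):
        merged = dict(base)
        for key, value in update.items():
            if key in merged:
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = deepcopy(value)
        return merged
    if isinstance(base, list) and isinstance(update, list):
        # Assumed rule: task lists from later configs are appended.
        return base + update
    # Scalars and mismatched types: the later config wins.
    return deepcopy(update)


def merge_configs(configs: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Fold a sequence of parsed configs into a single dict, left to right."""
    return reduce(deep_merge, configs, {})
```

For example, merging one config that lists a `quiescer` task with another that lists a `waiter` task yields a single `tasks` list containing both, in order.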
The cornerstone of the approach is to inject our local implementations
of the main fs suite classes. To be able to do that, some minor
refactoring was required in the corresponding modules:
the standard classes were renamed to have a *Base suffix, and the
former class name without the suffix is made a module level variable
initialized with the *Base implementation. This refactoring
is meant to be backward compatible.
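The renaming pattern described above can be illustrated as follows. This is a sketch under stated assumptions: the `suite` module object, `FilesystemBase`, `LocalFilesystem`, and `make_fs` are hypothetical stand-ins, not the actual qa classes.

```python
import types

# Stand-in for one of the fs suite modules (hypothetical).
suite = types.ModuleType("suite")


class FilesystemBase:
    """The standard implementation, now carrying the Base suffix."""

    def describe(self) -> str:
        return "remote filesystem"


suite.FilesystemBase = FilesystemBase
# The former class name is a module-level variable bound to the Base class,
# so existing code that refers to suite.Filesystem keeps working.
suite.Filesystem = FilesystemBase


def make_fs():
    # The name is looked up on the module at call time, so rebinding it
    # later changes which class gets instantiated.
    return suite.Filesystem()


class LocalFilesystem(FilesystemBase):
    """A local implementation to be injected by the runner."""

    def describe(self) -> str:
        return "local vstart filesystem"


before = make_fs().describe()
suite.Filesystem = LocalFilesystem  # inject the local implementation
after = make_fs().describe()
```

Code that imported the class object directly before the rebinding would still hold the Base class, which is why the lookup in `make_fs` goes through the module attribute.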