Kefu Chai [Mon, 25 Mar 2024 14:56:28 +0000 (22:56 +0800)]
test/rgw/test_rgw_iam_policy: do not increase ref when creating intrusive_ptr<CephContext>
before this change, we increment the refcount when constructing the
`cct` intrusive_ptr, but nobody owns this smart pointer. also,
`CephContext`'s constructor sets its refcount to 1. so, when the
test finishes, the refcount is still 1, and this leads to a leak of
the `CephContext` instance. LeakSanitizer points this out:
```
Indirect leak of 10880000 byte(s) in 1 object(s) allocated from:
#0 0xaaaac359c7c8 in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/bin/unittest_rgw_iam_policy+0x211c7c8) (BuildId: 060fadb10da261b52fd5757c7b1e9812d34542f1)
#1 0xffff96f764e4 in __gnu_cxx::new_allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0xffff96f757cc in std::allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
#3 0xffff96f757cc in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::allocate(unsigned long) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/boost/include/boost/circular_buffer/base.hpp:2396:39
#4 0xffff96f75500 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::initialize_buffer(unsigned long) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/boost/include/boost/circular_buffer/base.hpp:2494:18
#5 0xffff96f6ec4c in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::circular_buffer(unsigned long, std::allocator<ceph::logging::ConcreteEntry> const&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/boost/include/boost/circular_buffer/base.hpp:1039:9
#6 0xffff96f63528 in ceph::logging::Log::Log(ceph::logging::SubsystemMap const*) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/log/Log.cc:53:5
#7 0xffff96045300 in ceph::common::CephContext::CephContext(unsigned int, ceph::common::CephContext::create_options const&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/common/ceph_context.cc:729:16
#8 0xffff960446ec in ceph::common::CephContext::CephContext(unsigned int, code_environment_t, int) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/common/ceph_context.cc:697:5
#9 0xaaaac3629238 in IPPolicyTest::IPPolicyTest() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/test/rgw/test_rgw_iam_policy.cc:864:15
#10 0xaaaac3628da0 in IPPolicyTest_MaskedIPOperations_Test::IPPolicyTest_MaskedIPOperations_Test() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/test/rgw/test_rgw_iam_policy.cc:869:1
#11 0xaaaac3628d3c in testing::internal::TestFactoryImpl<IPPolicyTest_MaskedIPOperations_Test>::CreateTest() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/googletest/googletest/include/gtest/internal/gtest-internal.h:472:44
```
so, in this change, we do not increase the refcount when creating cct.
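The refcount arithmetic above can be sketched with a toy model. This is plain Python, not Ceph or Boost code: `RefCounted`, `IntrusivePtr` and `run_test` are hypothetical names, and the `add_ref` flag only mimics the optional second argument of the `boost::intrusive_ptr` constructor.

```python
class RefCounted:
    """Toy stand-in for an intrusively ref-counted object (hypothetical)."""

    def __init__(self):
        # As described in the commit message: the constructor starts at 1.
        self.nref = 1
        self.destroyed = False

    def get(self):
        self.nref += 1

    def put(self):
        self.nref -= 1
        if self.nref == 0:
            self.destroyed = True


class IntrusivePtr:
    """Toy smart pointer that optionally takes an extra reference."""

    def __init__(self, obj, add_ref=True):
        self.obj = obj
        if add_ref:
            obj.get()

    def release(self):
        # Models the smart pointer's destructor dropping one reference.
        self.obj.put()


def run_test(add_ref):
    obj = RefCounted()
    ptr = IntrusivePtr(obj, add_ref=add_ref)
    ptr.release()  # the smart pointer goes out of scope when the test ends
    return obj


leaked = run_test(add_ref=True)
freed = run_test(add_ref=False)
print(leaked.nref, leaked.destroyed)  # 1 False
print(freed.nref, freed.destroyed)    # 0 True
```

With the extra reference the count goes 1, 2, 1 and the object is never destroyed; without it the count goes 1, 0 and the object is released.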
Yingxin Cheng [Wed, 20 Mar 2024 07:25:33 +0000 (15:25 +0800)]
crimson/os/pg_map: allow multiple shards to create new pg mappings at the same time
Also:
* Better detection of inconsistent races, such as:
  * The new mapping is being created towards different cores.
  * Mapping creation is racing with its erasing.
  * Multiple shards are erasing the same mapping at the same time.
* Add more logs to debug in case of unexpected issues.
Kefu Chai [Mon, 25 Mar 2024 03:19:35 +0000 (11:19 +0800)]
test/common: do not leak in MemoryIsZeroSmallTest
before this change, we allocate memory chunks with specified
size using `new []`, but we never free them. when testing with
LeakSanitizer enabled, it rightly identifies the leakage:
```
Direct leak of 8754 byte(s) in 184 object(s) allocated from:
#0 0x55c0b2470f0d in operator new[](unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_memory+0x196f0d) (BuildId: d3267dd8819427b804c4729e0467dbe7601fb321)
#1 0x55c0b247456c in MemoryIsZeroSmallTest_MemoryIsZeroTestSmall_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/common/test_memory.cc:33:18
#2 0x55c0b2598ee6 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
#3 0x55c0b2553b92 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
#4 0x55c0b25049dc in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
#5 0x55c0b2506a12 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
#6 0x55c0b250804b in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
#7 0x55c0b25254d8 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
#8 0x55c0b25a16f6 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
#9 0x55c0b255a502 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
#10 0x55c0b2524862 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
#11 0x55c0b24ab4c0 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
#12 0x55c0b24ab451 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googlemock/src/gmock_main.cc:70:10
#13 0x7f45e065ad8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
```
Kefu Chai [Sun, 24 Mar 2024 23:43:26 +0000 (07:43 +0800)]
test/common: avoid leakage of CephContext
before this change, in test_util.cc, we increment the refcount of
`CephContext` when constructing it. but at that moment, nobody really
owns it. also, `CephContext`'s refcount is set to 1 in its constructor.
so, we should not do this. otherwise, the created `CephContext`
is leaked as LeakSanitizer rightly points out:
```
Indirect leak of 10880000 byte(s) in 1 object(s) allocated from:
#0 0x5632320d27ed in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_util+0x1917ed) (BuildId: ff1df1455bd07b651ad580584a17ea204afeb36e)
#1 0x7ff9d535b189 in __gnu_cxx::new_allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0x7ff9d535a563 in std::allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
#3 0x7ff9d535a563 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::allocate(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2396:39
#4 0x7ff9d535a2c0 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::initialize_buffer(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2494:18
#5 0x7ff9d5354192 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::circular_buffer(unsigned long, std::allocator<ceph::logging::ConcreteEntry> const&) /opt/ceph/include/boost/circular_buffer/base.hpp:1039:9
#6 0x7ff9d53471e4 in ceph::logging::Log::Log(ceph::logging::SubsystemMap const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/log/Log.cc:53:5
#7 0x7ff9d461d96d in ceph::common::CephContext::CephContext(unsigned int, ceph::common::CephContext::create_options const&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:729:16
#8 0x7ff9d461c93b in ceph::common::CephContext::CephContext(unsigned int, code_environment_t, int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:697:5
#9 0x5632320d52e0 in util_collect_sys_info_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/common/test_util.cc:34:27
#10 0x563232205c16 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
#11 0x5632321c2742 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
#12 0x5632321736dc in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
```
in this change, instead of using a raw pointer, let's
use `boost::intrusive_ptr<CephContext>` to manage the lifecycle
of `CephContext`. this also addresses the leakage reported by
LeakSanitizer.
Kefu Chai [Sun, 24 Mar 2024 23:37:58 +0000 (07:37 +0800)]
test: do not increase ref when creating intrusive_ptr<CephContext>
before this change, we increment the refcount when constructing the
`cct` intrusive_ptr, but nobody owns this smart pointer. also,
`CephContext`'s constructor sets its refcount to 1. so, when the
test finishes, the refcount is still 1, and this leads to a leak of
the `CephContext` instance. this not only annoys ASan, but also defeats
the purpose of 14d878c8.
```
Indirect leak of 10880000 byte(s) in 1 object(s) allocated from:
#0 0x5564d173537d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_ipaddr+0x19b37d) (BuildId: 45c0c7f28b253c04fcb7bb1a43aed52a5526d734)
#1 0x7fe7f2ccd189 in __gnu_cxx::new_allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0x7fe7f2ccc563 in std::allocator<ceph::logging::ConcreteEntry>::allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
#3 0x7fe7f2ccc563 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::allocate(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2396:39
#4 0x7fe7f2ccc2c0 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::initialize_buffer(unsigned long) /opt/ceph/include/boost/circular_buffer/base.hpp:2494:18
#5 0x7fe7f2cc6192 in boost::circular_buffer<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::circular_buffer(unsigned long, std::allocator<ceph::logging::ConcreteEntry> const&) /opt/ceph/include/boost/circular_buffer/base.hpp:1039:9
#6 0x7fe7f2cb91e4 in ceph::logging::Log::Log(ceph::logging::SubsystemMap const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/log/Log.cc:53:5
#7 0x7fe7f1f8f96d in ceph::common::CephContext::CephContext(unsigned int, ceph::common::CephContext::create_options const&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:729:16
#8 0x7fe7f1f8e93b in ceph::common::CephContext::CephContext(unsigned int, code_environment_t, int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/ceph_context.cc:697:5
#9 0x5564d1752eb9 in pick_address_find_ip_in_subnet_list_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/test_ipaddr.cc:706:47
#10 0x5564d18694d6 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
#11 0x5564d1820fc2 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
#12 0x5564d17d19dc in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2680:5
#13 0x5564d17d3a12 in testing::TestInfo::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2858:11
#14 0x5564d17d504b in testing::TestSuite::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:3012:28
#15 0x5564d17f24d8 in testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5723:44
#16 0x5564d1871d06 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2605:10
#17 0x5564d1827932 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2641:14
#18 0x5564d17f1862 in testing::UnitTest::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:5306:10
#19 0x5564d1775d80 in RUN_ALL_TESTS() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/include/gtest/gtest.h:2486:46
#20 0x5564d1775d11 in main /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googlemock/src/gmock_main.cc:70:10
```
so, in this change, we do not increase the refcount when
creating cct.
the same applies to `test/common/test_fault_injector.cc`.
Ernesto Puerta [Wed, 13 Mar 2024 13:06:10 +0000 (14:06 +0100)]
mgr/dashboard: fix NVMeoF API
* Update NVMe-oF gRPC Proto to 1.0.0
* Error handling
* Missing PATCH for certain namespace ops (resize, set QoS, set balance
  groups)
* Stop bypassing gRPC payloads and validate those in the back-end
* Fix incorrect HTTP 1.1 semantics for some POST/DELETE and URIs
* Catch errors/exceptions
* Clean-up EndpointDoc Params
* Run Black linter
* Remove most of NVMeoFClient glue code between gRPC and controller.
* Fix namespace delete endpoint by exposing trsvcid
* nvmeof io_stats support
Patrick Donnelly [Fri, 22 Mar 2024 15:56:08 +0000 (11:56 -0400)]
Merge PR #56271 into main
* refs/pull/56271/head:
qa/cephfs: stop ignoring MON_DOWN globally
qa: extend mon timeout coming up after mondb creation
qa: update dashboard schema for mon_status
mon: do not log MON_DOWN if monitor uptime is less than threshold
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Patrick Donnelly [Tue, 19 Mar 2024 13:56:20 +0000 (09:56 -0400)]
qa/cephfs: add probabilistic ignorelist for pg_health
PG_AVAILABILITY/PG_DEGRADED warnings are dominating fs runs. We want the
underlying issue fixed but it cannot continue to fail all of our tests 100% of
the time. Use a probabilistic addition of these warnings to the ignorelist.
Fixes: https://tracker.ceph.com/issues/64984
Related-to: https://tracker.ceph.com/issues/52624
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
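A minimal sketch of the idea of a probabilistic ignorelist, assuming a per-run coin flip. It does not show the mechanism the commit actually uses in the qa suite; `build_ignorelist`, the `probability` parameter and the `EXAMPLE_WARNING` entry are hypothetical.

```python
import random

PG_WARNINGS = ["PG_AVAILABILITY", "PG_DEGRADED"]


def build_ignorelist(base, probability, rng):
    """Return a copy of base, plus the PG warnings with the given probability."""
    ignorelist = list(base)
    if rng.random() < probability:
        ignorelist.extend(PG_WARNINGS)
    return ignorelist


# Simulate many runs: only some of them ignore the PG warnings, so the
# underlying issue still shows up in the remaining runs.
rng = random.Random(0)
runs = [build_ignorelist(["EXAMPLE_WARNING"], 0.5, rng) for _ in range(1000)]
ignored = sum("PG_DEGRADED" in run for run in runs)
print(f"{ignored} of {len(runs)} simulated runs ignore the PG warnings")
```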
Venky Shankar [Fri, 22 Mar 2024 07:24:29 +0000 (12:54 +0530)]
Merge PR #54581 into main
* refs/pull/54581/head:
doc/dev: update quiesce developer document
qa: wrap quiesce verification to dump debugging on error
qa: update quiesce tests for control via locallock
qa: set archive path in vstart_runner
qa: refactor CephFSMount.kill_background to optionally kill all background jobs
qa: use kwarg for rank parameter
qa: simplify calls to (rank|mds)_(tell|asok)
Revert "pybind/mgr/volumes: block quiesce for critical .meta file"
mds: remove is_root indication on quiesce_inode op
mds: prevent new lock cache cons when invalidating an existing one
mds: use XLOCK_WAIT For local lock xlockers
mds: prevent new wrlocks on LocalLock if there exists any xlock waiter
mds: block import discover when parent directory inode is quiesced
mds: avoid issuing exclusive caps to clients lacking w caps
mds: print lock cache during invalidation
mds: use inodeno_t to track quiesce requests
mds: dispatch quiesce_inode ops after dir traversal
mds: remove quiescelock handling for SimpleLock type
mds: quiescelock as local lock + cap masking
qa: run quiesce unit tests in fs:functional
qa: add quiesce protocol unit tests
qa: detect partial migrations during large config of dist epin
qa: use stdin-killer to timeout run_shell_payload
qa: simplify run_shell argument processing
doc: add dev docs for quiesce protocol
pybind/mgr/volumes: block quiesce for critical .meta file
mds: add vxattr to block quiesce on an inode
mds: convert encoded ephemeral dist pin to flags
mds: add counter to throttle quiesce
mds: add quiesce set feature flag
mds: skip non-head inodes for quiesce
mds: add quiesce op
mds: print all SimpleLock flags in debug output
mds: pretty print mutation when dumping lock
mds: add new inode quiescelock
mds: use 128 bits for waiters on MDSCacheObject
mds: provide mechanism to authpin while freezing
mds: add command to get specific op
mds: finish request before completing internal req
mds: complete internal op if killed
mds: avoid killing dead requests
mds: add command to kill request
mds: add path argument to `ops` and `dump tree` to stream result to local file
mds: print internal_request filepaths if present
mds: add more information to debug message
mds: remove redundant parenthesis
mds: implement Mutation::dump method
mds: make LockType fields const
mds: annotate mdr with try_rdlock_snap_layout failure
mds: refactor if into switch
mds: call Locker method using this
mds: simplify assert
mds: dump locks passed to Locker::acquire_locks
mds: add LockOp::print method for debugging
mds: use new insert template via print
mds: add request result to mutation for analysis by tests
mds: add comment on locking order rules
mds: allow specifying rdlock position
mds: remove dead method
common: provide a template for object dumps
common: support long running ops without slow warnings
common: simplify loop
common: add JSONFormatterFile class
common: use more efficient vector for stack
include: use larger int for large gathers
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Shachar Sharon [Wed, 13 Mar 2024 14:43:29 +0000 (16:43 +0200)]
qa/suites/orch: add minimal smb non-AD test
Test minimal SMB deployment over CephFS, using local users (non-AD).
Upon successful deployment run a minimal smbclient command ('ls') to probe
Samba's share liveness.
Co-authored-by: John Mulligan <jmulligan@redhat.com>
Signed-off-by: Shachar Sharon <ssharon@redhat.com>
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Fri, 15 Mar 2024 17:48:35 +0000 (13:48 -0400)]
qa/tasks: add a cephadm samba container helper func independent of AD DC
To have the standalone (non-AD) server test function similarly to the AD
member server test we need to set a variable for samba client container
command similar to how the AD setup command does it.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sat, 24 Feb 2024 15:52:53 +0000 (10:52 -0500)]
qa/suites/orch: add a new smb service cephadm sub-suite and test
Start a new subdir under cephadm suite for the new smb service
that cephadm can deploy. Add one new test that checks that a
smb service with domain membership can be deployed and connect
to it with smbclient from the samba client container image.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 27 Feb 2024 14:48:25 +0000 (09:48 -0500)]
qa/tasks: add error condition to exec functions
Looking at the code that expands `all-roles` and `all-hosts` there's no
proper error checking for when these values appear but there are >1
top-level roles in the task config. If a user does this it'll fail
but in a somewhat unclear manner. Add a new condition that raises a
clear exception in this case hopefully saving someone future debugging
time.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
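A minimal sketch of the kind of check described above, assuming the task config is a dict keyed by role. `check_special_roles` and the exception type are hypothetical; the real code in cephadm.py may differ.

```python
SPECIAL_KEYS = ("all-roles", "all-hosts")


def check_special_roles(config):
    """Raise a clear error if a special key is mixed with other top-level roles."""
    for key in SPECIAL_KEYS:
        if key in config and len(config) > 1:
            raise ValueError(
                f"{key!r} may not be combined with other roles: {sorted(config)}"
            )


check_special_roles({"all-roles": ["echo ok"]})  # a lone special key is accepted
try:
    check_special_roles({"all-roles": ["echo ok"], "host.a": ["echo hi"]})
except ValueError as err:
    print(err)
```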
John Mulligan [Tue, 27 Feb 2024 14:44:51 +0000 (09:44 -0500)]
qa/tasks: reduce duplicated code
All `exec`-style functions in teuthology appear to have a transformation
block that expands names like `all-roles` and `all-hosts`. With the new
cephadm.exec task that block appeared twice in cephadm.py. This change
removes the duplication by creating an _expand_roles function that
can be called from the command executing functions.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
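A sketch of what such a shared helper could look like. The signature is an assumption: here the helper takes a plain list of role names instead of a teuthology context, and it assumes that host roles are the ones prefixed with `host.`. `exec_a` and `exec_b` are hypothetical callers.

```python
def expand_roles(config, all_roles):
    """Expand a lone 'all-roles' or 'all-hosts' key into one entry per role."""
    if len(config) == 1 and "all-roles" in config:
        commands = config["all-roles"]
        return {r: commands for r in all_roles if not r.startswith("host.")}
    if len(config) == 1 and "all-hosts" in config:
        commands = config["all-hosts"]
        return {r: commands for r in all_roles if r.startswith("host.")}
    return config


def exec_a(config, all_roles):
    # Both command-executing functions call the helper instead of
    # carrying their own copy of the expansion block.
    return expand_roles(config, all_roles)


def exec_b(config, all_roles):
    return expand_roles(config, all_roles)


roles = ["host.a", "host.b", "client.0"]
print(exec_a({"all-hosts": ["uptime"]}, roles))
print(exec_b({"all-roles": ["uptime"]}, roles))
```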
John Mulligan [Mon, 26 Feb 2024 21:17:22 +0000 (16:17 -0500)]
qa/tasks: add a template filter to map a role name to a remote
Add a `role_to_remote` template filter function that has the ability to
map a role name to a remote. Attributes of the remote can then be
used to get the actual node ip or name.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Mon, 26 Feb 2024 21:16:57 +0000 (16:16 -0500)]
qa/tasks: a new cephadm exec task similar to vip.exec but generalized
Add a new cephadm.exec task that works similarly to the existing
vip.exec but instead of only considering VIP related string replacements
it uses the templating feature that was recently added to the
cephadm module for generalized string templating.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Mon, 26 Feb 2024 18:47:04 +0000 (13:47 -0500)]
qa/tasks: add a cephadm.exclude role
Add a cephadm.exclude role that excludes a test node from cluster setup
and related commands. I need this as I have a test node that will be set
up as an AD Domain Controller for testing Samba and do not want that
node to have *any* other services running on it.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sat, 24 Feb 2024 19:26:36 +0000 (14:26 -0500)]
qa/tasks: allow passing stdin string to cephadm shell commands
There are cases where I want to pass some large-ish strings to ceph
commands executed via cephadm shell. Allow items within the commands
list to be dicts containing a command (as before) and an optional
stdin variable. This change also supports possible future extensions as
well.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
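A minimal sketch of accepting both forms of list item. The dict key names `cmd` and `stdin` and the helper `normalize_command` are assumptions made for illustration, not taken from the commit.

```python
def normalize_command(item):
    """Return (command, stdin) for an item that is a str or a dict."""
    if isinstance(item, str):
        # Plain strings keep working as before, with nothing fed to stdin.
        return item, None
    if isinstance(item, dict):
        return item["cmd"], item.get("stdin")
    raise TypeError(f"unsupported command item: {item!r}")


commands = [
    "ceph status",
    {"cmd": "ceph config assimilate-conf -i -", "stdin": "[global]\n"},
]
for item in commands:
    print(normalize_command(item))
```

Using a dict per item leaves room for further optional keys later without changing the list format again.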
John Mulligan [Tue, 20 Feb 2024 23:28:58 +0000 (18:28 -0500)]
qa/tasks: add a new cephadm task for setting up samba ad dc
Add a new task function to cephadm.py that sets up a container running
the Samba based domain controller on a node using podman or docker.
Much of the function actually deals with disabling systemd-resolved
because that service conflicts with the DNS server component of the DC.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Fri, 5 Jan 2024 15:45:08 +0000 (10:45 -0500)]
mgr/cephadm: simplify _get_container_image a bit
Because the "if-ladder" was only ever assigning a single variable with
a value it can be directly replaced by a dict & dict-lookup which is
much more succinct.
Also take the opportunity to sort the (non-comment) lines as there's
no meaning to the previous order and this makes it easier for a reader
to scan through.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
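The general shape of this kind of refactoring can be sketched as follows. The function names and option strings are hypothetical and are not the contents of `_get_container_image`.

```python
def image_option_ladder(daemon_type):
    """Before: every branch only assigns the same single variable."""
    if daemon_type == "prometheus":
        option = "image_prometheus"
    elif daemon_type == "grafana":
        option = "image_grafana"
    elif daemon_type == "alertmanager":
        option = "image_alertmanager"
    else:
        option = None
    return option


# After: the same mapping expressed as data, with sorted keys so a reader
# can scan through it quickly.
IMAGE_OPTIONS = {
    "alertmanager": "image_alertmanager",
    "grafana": "image_grafana",
    "prometheus": "image_prometheus",
}


def image_option_lookup(daemon_type):
    return IMAGE_OPTIONS.get(daemon_type)


for name in ("grafana", "unknown"):
    print(name, image_option_ladder(name), image_option_lookup(name))
```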
John Mulligan [Thu, 4 Jan 2024 21:38:08 +0000 (16:38 -0500)]
mgr/cephadm: add various touch points to enable smb service
Add the smb service by name or by type to one of the many, many touch
points in the orchestrator and cephadm packages needed to get the
orchestrator aware of smb.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 14 Dec 2023 00:20:45 +0000 (19:20 -0500)]
python-common: reformat ServiceSpec class level service type lists
Reformat the ServiceSpec class-level properties KNOWN_SERVICE_TYPES and
REQUIRES_SERVICE_ID. These were previously strings that were converted
to lists via a call to split. With a string there's very little a human
or a tool can do to validate the content. Changing these into proper
lists in the source code brings clarity of intent and the ability to
analyze the code. Because there's no semantic difference in which services
are listed where (this means the type could probably be a set - a quest
for another day) I also took the opportunity to sort the contents of the
lists and add some basic comments for what these lists are for.
It also removes the use of (ugly, IMO) line continuations. The downside
is that it makes more total lines, but if that bugs you - use code
folding :-).
Signed-off-by: John Mulligan <jmulligan@redhat.com>
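A small before/after sketch of the reformatting described above. The variable names are hypothetical and the entries are an illustrative subset, not the real contents of KNOWN_SERVICE_TYPES.

```python
# Before: one string literal with line continuations, split at import time.
# A tool sees a single opaque string, not individual service names.
KNOWN_TYPES_OLD = "osd mon " \
                  "rgw mgr".split()

# After: a real list, one item per line, sorted, with a comment on its purpose.
# Service types this example knows about (illustrative subset).
KNOWN_TYPES_NEW = [
    "mgr",
    "mon",
    "osd",
    "rgw",
]

# Same members either way; only the representation in the source changed.
print(sorted(KNOWN_TYPES_OLD) == KNOWN_TYPES_NEW)  # True
```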
John Mulligan [Fri, 5 Jan 2024 15:24:10 +0000 (10:24 -0500)]
mgr/cephadm: refactor keyring simplification out of get_keyring_with_caps
Refactor get_keyring_with_caps such that the keyring simplification code
is moved into a new function that can be used in other locations.
get_keyring_with_caps will now call the new function to return the
simplified & consistent keyring output.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 13 Dec 2023 20:49:12 +0000 (15:49 -0500)]
mgr/cephadm: reformat the _service_classes variable
Reformat the _service_classes variable so that it uses a multi-line list
with a single item on each line in a more black-ish style that is more
readable (especially if you use code-folding wisely).
Sort the list while we're at it.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 13 Dec 2023 21:05:27 +0000 (16:05 -0500)]
mgr/orchestrator: fix the sorting of the imports
While ceph doesn't enforce sorted imports I prefer them when possible. I
had once sorted these imports but then nvmeof came along and ruined
things. Put nvmeof back in its place.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 13 Dec 2023 19:33:20 +0000 (14:33 -0500)]
mgr/cephadm: fix test failure on newer python
Tests that touch this enum fail for me locally but pass in the CI. This
seems to be due to new enum related behavior in Python 3.11.
See: https://blog.pecar.me/python-enum
Instead of fixing it as suggested in the above blog, adding a __str__
method works on all python versions I care to know about.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
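A minimal sketch of the `__str__` approach, assuming the enum in question mixes in `str` (the commit does not say which enum is affected). `ExampleState` is a hypothetical name. On Python 3.11 and later, `format()` and f-strings on such an enum follow `Enum.__str__()`, so defining `__str__` explicitly keeps the output the same across versions.

```python
import enum


class ExampleState(str, enum.Enum):
    """Hypothetical str-based enum, not the one touched by the commit."""

    running = "running"
    stopped = "stopped"

    def __str__(self):
        # Always render as the plain value, whatever the Python version.
        return self.value


state = ExampleState.running
print(str(state), f"{state}", "{}".format(state))  # running running running
```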
John Mulligan [Tue, 16 Jan 2024 20:37:27 +0000 (15:37 -0500)]
cephadm: fix issue joining to ad by using a virtual hostname
The not-a-real-fqdn hostname that the containers got was causing
performance issues joining AD (and running testjoin and winbind).
Define a virtual hostname that can be passed in from the service or
automatically derived from the system's hostname.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 6 Dec 2023 20:14:32 +0000 (15:14 -0500)]
cephadm: import and enable deployment of SMB daemon class
Enable the use of the SMB container daemon form class by importing, and
thus registering, it. Note that the only way to invoke this feature is
by hand rolling some JSON to feed to the `ceph _orch deploy` command.
Connecting this with the cephadm mgr module is left as a future task.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 6 Dec 2023 20:14:31 +0000 (15:14 -0500)]
cephadm: add an SMB daemon module and classes
Add an incomplete but largely viable SMB/Samba container daemon form
implementation to cephadm. Currently unused but it lays out some of the
basics needed to create smb sharing using samba containers under cephadm
orchestration.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sun, 3 Dec 2023 16:01:05 +0000 (11:01 -0500)]
cephadm: add generic methods for sharing namespaces across containers
In the future, some sidecar containers will need to share namespaces
with the primary container (or each other). Make it easy to set this up
by creating an enable_shared_namespaces function and Namespace enum.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
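A sketch of how such a helper could look. The signature, the enum members shown and the `to_option` method are assumptions made for illustration; the sketch relies only on the `--<namespace>=container:<name>` option form accepted by `podman run` and `docker run`.

```python
import enum


class Namespace(enum.Enum):
    """Namespaces a sidecar may share (illustrative subset)."""

    ipc = "ipc"
    network = "network"
    pid = "pid"

    def to_option(self, container):
        # e.g. --ipc=container:primary joins the IPC namespace of "primary".
        return f"--{self.value}=container:{container}"


def enable_shared_namespaces(args, container, namespaces):
    """Append one option per namespace so the new container joins `container`."""
    for ns in namespaces:
        args.append(ns.to_option(container))
    return args


args = enable_shared_namespaces(["run"], "primary", [Namespace.ipc, Namespace.pid])
print(args)  # ['run', '--ipc=container:primary', '--pid=container:primary']
```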