Rongqi Sun [Fri, 12 Apr 2024 06:51:34 +0000 (06:51 +0000)]
bluestore/bluestore_types: check 'it' valid before using
When sanitizer is enabled, unittest_bluestore_types fails as following
```
[ RUN ] sb_info_space_efficient_map_t.basic
=================================================================
==143714==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xffff99f8b7f4 at pc 0xaaaab50bde18 bp 0xffffebefcdb0 sp 0xffffebefcda8
READ of size 8 at 0xffff99f8b7f4 thread T0
#0 0xaaaab50bde14 in sb_info_t::get_sbid() const /root/ceph/src/os/bluestore/bluestore_types.h:1337:30
#1 0xaaaab50a5908 in sb_info_space_efficient_map_t::find(unsigned long) /root/ceph/src/os/bluestore/bluestore_types.h:1385:10
#2 0xaaaab50bd638 in sb_info_space_efficient_map_t::_add(long) /root/ceph/src/os/bluestore/bluestore_types.h:1424:15
#3 0xaaaab50a52bc in sb_info_space_efficient_map_t::add_maybe_stray(unsigned long) /root/ceph/src/os/bluestore/bluestore_types.h:1358:12
#4 0xaaaab4fec03c in sb_info_space_efficient_map_t_basic_Test::TestBody() /root/ceph/src/test/objectstore/test_bluestore_types.cc:113:11
#5 0xaaaab51e9a40 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
#6 0xaaaab5197040 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
#7 0xaaaab51488a4 in testing::Test::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2680:5
#8 0xaaaab514a7e8 in testing::TestInfo::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2858:11
#9 0xaaaab514bde8 in testing::TestSuite::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:3012:28
#10 0xaaaab5167bac in testing::internal::UnitTestImpl::RunAllTests() /root/ceph/src/googletest/googletest/src/gtest.cc:5723:44
#11 0xaaaab51f3940 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
#12 0xaaaab519e5d8 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
#13 0xaaaab5167024 in testing::UnitTest::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:5306:10
#14 0xaaaab50b4d6c in RUN_ALL_TESTS() /root/ceph/src/googletest/googletest/include/gtest/gtest.h:2486:46
#15 0xaaaab50a1080 in main /root/ceph/src/test/objectstore/test_bluestore_types.cc:2847:10
#16 0xffff9d6c73f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#17 0xffff9d6c74c8 in __libc_start_main csu/../csu/libc-start.c:392:3
#18 0xaaaab4f3812c in _start (/root/ceph/build/bin/unittest_bluestore_types+0xe4812c) (BuildId: cb75399658026f83a4e89012de8fb02f08f6d239)
0xffff99f8b7f4 is located 0 bytes to the right of 20-byte region [0xffff99f8b7e0,0xffff99f8b7f4)
allocated by thread T0 here:
#0 0xaaaab4fe636c in operator new[](unsigned long) (/root/ceph/build/bin/unittest_bluestore_types+0xef636c) (BuildId: cb75399658026f83a4e89012de8fb02f08f6d239)
#1 0xaaaab50c0d2c in mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t>::allocate(unsigned long, void*) /root/ceph/src/include/mempool.h:375:33
#2 0xaaaab50c0c0c in std::allocator_traits<mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> >::allocate(mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t>&, unsigned long) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:318:20
#3 0xaaaab50c044c in std::_Vector_base<sb_info_t, mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> >::_M_allocate(unsigned long) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
#4 0xaaaab50bf954 in void std::vector<sb_info_t, mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> >::_M_realloc_insert<long&>(__gnu_cxx::__normal_iterator<sb_info_t*, std::vector<sb_info_t, mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> > >, long&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:440:33
#5 0xaaaab50be0d8 in sb_info_t& std::vector<sb_info_t, mempool::pool_allocator<(mempool::pool_index_t)11, sb_info_t> >::emplace_back<long&>(long&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:121:4
#6 0xaaaab50bd760 in sb_info_space_efficient_map_t::_add(long) /root/ceph/src/os/bluestore/bluestore_types.h:1429:24
#7 0xaaaab50a5e78 in sb_info_space_efficient_map_t::add_or_adopt(unsigned long) /root/ceph/src/os/bluestore/bluestore_types.h:1361:15
#8 0xaaaab4feb07c in sb_info_space_efficient_map_t_basic_Test::TestBody() /root/ceph/src/test/objectstore/test_bluestore_types.cc:103:11
#9 0xaaaab51e9a40 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
#10 0xaaaab5197040 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
#11 0xaaaab51488a4 in testing::Test::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2680:5
#12 0xaaaab514a7e8 in testing::TestInfo::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:2858:11
#13 0xaaaab514bde8 in testing::TestSuite::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:3012:28
#14 0xaaaab5167bac in testing::internal::UnitTestImpl::RunAllTests() /root/ceph/src/googletest/googletest/src/gtest.cc:5723:44
#15 0xaaaab51f3940 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2605:10
#16 0xaaaab519e5d8 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph/src/googletest/googletest/src/gtest.cc:2641:14
#17 0xaaaab5167024 in testing::UnitTest::Run() /root/ceph/src/googletest/googletest/src/gtest.cc:5306:10
#18 0xaaaab50b4d6c in RUN_ALL_TESTS() /root/ceph/src/googletest/googletest/include/gtest/gtest.h:2486:46
#19 0xaaaab50a1080 in main /root/ceph/src/test/objectstore/test_bluestore_types.cc:2847:10
#20 0xffff9d6c73f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#21 0xffff9d6c74c8 in __libc_start_main csu/../csu/libc-start.c:392:3
#22 0xaaaab4f3812c in _start (/root/ceph/build/bin/unittest_bluestore_types+0xe4812c) (BuildId: cb75399658026f83a4e89012de8fb02f08f6d239)
SUMMARY: AddressSanitizer: heap-buffer-overflow /root/ceph/src/os/bluestore/bluestore_types.h:1337:30 in sb_info_t::get_sbid() const
Shadow bytes around the buggy address:
0x200ff33f16a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff33f16b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff33f16c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff33f16d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff33f16e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x200ff33f16f0: fa fa fa fa fa fa fa fa fa fa fa fa 00 00[04]fa
0x200ff33f1700: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff33f1710: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff33f1720: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff33f1730: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff33f1740: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==143714==ABORTING
```
'it' might be invalid, so before using 'it', need to figure validity out
Patrick Donnelly [Thu, 11 Apr 2024 23:00:53 +0000 (19:00 -0400)]
Merge PR #56839 into main
* refs/pull/56839/head:
script/ptl-tool: push branch to shaman for qa runs
script/ptl-tool: add release name to branch with switch
script/ptl-tool: alphabetize arguments
script/ptl-tool: avoid repo specific remotes entirely
script/ptl-tool: improve help text for envvars
Patrick Donnelly [Thu, 11 Apr 2024 17:10:59 +0000 (13:10 -0400)]
Merge PR #56835 into main
* refs/pull/56835/head:
script/ptl-tool: create qa trackers for test branches
script/ptl-tool: add switch for debugging
script/ptl-tool: add --stop-at-built flag
When sanitizer is enabled, unittest_mds_quiesce_agent fails as following
```
[==========] Running 5 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 5 tests from QuiesceAgentTest
[ RUN ] QuiesceAgentTest.ThreadManagement
[ OK ] QuiesceAgentTest.ThreadManagement (3 ms)
[ RUN ] QuiesceAgentTest.DbUpdates
[ OK ] QuiesceAgentTest.DbUpdates (1 ms)
[ RUN ] QuiesceAgentTest.QuiesceProtocol
[ OK ] QuiesceAgentTest.QuiesceProtocol (3 ms)
[ RUN ] QuiesceAgentTest.DuplicateQuiesceRequest
[ OK ] QuiesceAgentTest.DuplicateQuiesceRequest (2 ms)
[ RUN ] QuiesceAgentTest.TimeoutBeforeComplete
[ OK ] QuiesceAgentTest.TimeoutBeforeComplete (2 ms)
[----------] 5 tests from QuiesceAgentTest (11 ms total)
[----------] Global test environment tear-down
[==========] 5 tests from 1 test suite ran. (11 ms total)
[ PASSED ] 5 tests.
Remove information about the installation of the Zabbix module and link
to a discussion of the reasoning behind Ceph's refusal to support
Zabbix.
John Jasen developed a procedure explaining how to install "Zabbix 2".
This commit removes outdated procedures and explains why those
procedures were removed. Immediately following this explanation, the
text includes an explanation of how to install "Zabbix 2".
Adam King [Tue, 9 Apr 2024 17:59:56 +0000 (13:59 -0400)]
qa/cephadm: add ability to deploy raw OSDS from cephadm task
For now, it only works in a roleless fashion as OSD deployment
with roles is a bit more complicated. Unless we want to deploy
a mix of raw and lvm OSDs in one test, having it work in roleless
setups should be enough.
Adam King [Tue, 9 Apr 2024 16:19:06 +0000 (12:19 -0400)]
mgr/cephadm: make enable_monitor_client configurable for nvmeof
Currently, the mon client work is not merged on main, but our
default nvmeof container will attempt to make use of it by default,
causing it to crash. This makes it configurable and defaults the
behavior to false. That can be changed once the work is actually
present in main.
this may happen when a test fails, and does not cleanup topics
it created. other tests that verify the number of topics may fail
because of that.
all tests that verify number of topics, should delete all topics at the
start of the test.
doc: reorder "releases" entries for reef to fix the diagram
The Gantt diagram currently shows "reef (latest 18.2.0)" instead of
18.2.2. This is because ReleasesGantt expects releases array to be
sorted in reverse order:
for code_name, info in releases.items():
last_release = info['releases'][0]
first_release = info['releases'][-1]
Matt Benjamin [Tue, 30 Jan 2024 20:35:07 +0000 (15:35 -0500)]
rgw: ignore SIGXFSZ, which apparently can triggered by heavy ops-log traffic
Fixes: https://tracker.ceph.com/issues/64244
That is, it causes radosgw to ignore SIGXFSZ, which in the apparent
reproducer should not cause radosgw to terminate.
Updated to call unregister_async_signal_handler (frees a structure).
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Nizamudeen A [Wed, 27 Mar 2024 16:29:55 +0000 (21:59 +0530)]
install-deps: enable copr ceph/grpc
In dashboard, to generate nvmeof apis in el8 this is needed so that it
can download the python3-grpcio packages.
https://copr.fedorainfracloud.org/coprs/ceph/grpc/
Fixes: https://tracker.ceph.com/issues/65184 Signed-off-by: Nizamudeen A <nia@redhat.com>
Adam King [Wed, 3 Apr 2024 17:11:08 +0000 (13:11 -0400)]
mgr/cephadm: pass daemon's current image when reconfiguring
Important to note here is that a reconfig will rewrite
the config files of the daemon and restart it, but not
rewrite the unit files. This lead to a bug where the
grafana image we used between the quincy and squid release
used different UIDs internally, which caused us to rewrite
the config files as owned by a UID that worked with the
new image but did not work with the image still specified
in the unit.run file. This meant the grafana daemon was down
from a bit after the mgr upgrade until the end
of the upgrade when we redeploy all the monitoring images.
Fixes: https://tracker.ceph.com/issues/65234 Signed-off-by: Adam King <adking@redhat.com>
Adam King [Thu, 4 Apr 2024 19:49:27 +0000 (15:49 -0400)]
qa/cephadm: update images for test_cephadm workunit
This patch doesn't actually fix any tests, but looking
at https://tracker.ceph.com/issues/64208 has made me
realize we haven't been updating the images on here
at all. This patch just removes the octopus and pacific
image variables and puts in ones for reef and squid. I
think this is a reasonable way to try and maintain these,
as we're just keeping around the image for the last two
releases, which matches with what versions could one day
upgrade to what main will become (the "T" release as
of writing this)
rgw/notification: Load bucket attrs before calling publish_reserve.
As part of PR# 55657, publish_reserve would reload bucket to ensure bucket_attrs are loaded. However for lc events, where the bucket attrs were already loaded, the reloading was causing crash but there was no obvious root cause, so to avoid the crashes, remove reloading of bucket in publish_reserve and put the onus on callers to load the bucket before calling publish_reserve.