Change the wording of a sentence in doc/radosgw/metrics.rst so that its
articles read as though they were written by a native speaker of the
English language.
This commit is being raised as part of a diagnostic process aimed at
discovering why the ReadtheDocs check is failing on PR
https://github.com/ceph/ceph/pull/62877.
qa: Add Teuthology test for BlueStore ESB assertion failure
Adds a test to reproduce the !ito->is_valid() assertion in BlueStore
with bluestore_elastic_shared_blobs=true on a 2+1 EC pool using a
FIO randwrite workload (512 concurrent ops, 50G, 12,500 objects).
The test deploys a 6-OSD cluster and runs FIO for 1 hour via workunit,
failing if an OSD crashes.
Zac Dover [Mon, 2 Jun 2025 02:32:36 +0000 (12:32 +1000)]
doc/start: edit documenting-ceph.rst
Edit the section "Build the Source" in doc/start/documenting-ceph.rst.
Also correct a misuse of the word "presently", which means "in a little
while", not "now".
Zac Dover [Mon, 2 Jun 2025 02:16:47 +0000 (12:16 +1000)]
doc/dev/cephfs-mirroring: edit file 4 of x
Add prompts (and perform necessary corrections to glaring grammatical
errors) to doc/dev/cephfs-mirroring.rst, as requested by Jos Collin in
https://github.com/ceph/ceph/pull/63237/files#r2085886075.
This commit edits the fourth (and final) quarter of the
doc/dev/cephfs-mirroring.rst file.
Further refinements to the English in this file are possible.
Zac Dover [Sun, 1 Jun 2025 23:45:42 +0000 (09:45 +1000)]
doc/mgr: edit nfs.rst
Edit the "Updating an NFS Cluster" section of doc/mgr/nfs.rst. This
commit includes changes requested by Anthony D'Atri in
https://github.com/ceph/ceph/pull/63452.
Zac Dover [Sun, 1 Jun 2025 23:14:45 +0000 (09:14 +1000)]
doc/mgr: edit iostat.rst
Rewrite the first sentence in doc/mgr/iostat.rst. This follows up on a
request made by Anthony D'Atri in
https://github.com/ceph/ceph/pull/63418#discussion_r2102806688.
Zac Dover [Fri, 30 May 2025 12:38:03 +0000 (22:38 +1000)]
doc/rados/operations: edit cache-tiering.rst
Strengthen the warning against deploying cache tiering in releases after
Reef. This follows up on Anthony D'Atri's request in
https://github.com/ceph/ceph/pull/63465.
Previously, we had a memory leak in the test_bluestore_types.cc tests where
`BufferCacheShard` and `OnodeCacheShard` objects were allocated with
raw pointers but never freed, causing leaks detected by AddressSanitizer.
ASan rightly pointed this out:
```
Direct leak of 224 byte(s) in 1 object(s) allocated from:
#0 0x55a7432a079d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_bluestore_types+0xf2e79d) (BuildId: c3bec647afa97df6bb147bc82eac937531fc6272)
#1 0x55a743523340 in BlueStore::BufferCacheShard::create(BlueStore*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::common::PerfCounters*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/os/bluestore/BlueStore.cc:1678:9
#2 0x55a74330b617 in ExtentMap_seek_lextent_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluestore_types.cc:1077:7
#3 0x55a7434f2b2d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
#4 0x55a7434b5775 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2689:14
#5 0x55a74347005d in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2728:5
```
```
Direct leak of 9928 byte(s) in 1 object(s) allocated from:
#0 0x7ff249d21a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
#1 0x6048ed878b76 in BlueStore::OnodeCacheShard::create(ceph::common::CephContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::common::PerfCounters*) /home/kefu/dev/ceph/src/os/bluestore/BlueStore.cc:1219
#2 0x6048ed66d4f9 in GarbageCollector_BasicTest_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/test_bluestore_types.cc:2662
#3 0x6048ed820555 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
#4 0x6048ed80c78a in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
#5 0x6048ed7b8bfa in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728
```
In this change, we replace raw pointer allocation with unique_ptr to
ensure automatic cleanup when the objects go out of scope.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Casey Bodley [Wed, 28 May 2025 20:33:44 +0000 (16:33 -0400)]
script: ceph-backport.sh adds redmine key to api requests
The ceph-backport.sh script recently started failing with:
> ceph-backport.sh: DEBUG: Considering Redmine issue: https://tracker.ceph.com/issues/70374 - is it in the Backport tracker?
> ceph-backport.sh: DEBUG:
> ceph-backport.sh: ERROR: Issue https://tracker.ceph.com/issues/70374 is not a Backport
because the command `curl --silent https://tracker.ceph.com/issues/70374.json`
now fails with `HTTP/2 401` (Unauthorized) and returns an empty string.
The command succeeds after adding my Redmine key as a query parameter, as
some of the other Redmine requests already do.
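As an illustration only (the script itself is a shell script), the idea of appending the key to a request URL can be sketched in Python. The helper name `with_redmine_key` and the placeholder key value are hypothetical, and the parameter name `key` is an assumption based on Redmine's REST API conventions; the commit message does not spell it out.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_redmine_key(url: str, api_key: str) -> str:
    """Return `url` with a `key=<api_key>` query parameter appended.

    Hypothetical helper: the parameter name `key` is an assumption.
    """
    parts = urlsplit(url)
    # Preserve any query parameters that are already present.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("key", api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


# PLACEHOLDER_KEY stands in for a real API key, which should never be committed.
print(with_redmine_key("https://tracker.ceph.com/issues/70374.json", "PLACEHOLDER_KEY"))
```

Keeping the key out of logs and debug output is a separate concern that this sketch does not address.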
Fix the calculation of the effective blob size.
When fully non-compressible data was passed,
a few bytes at the end could be lost.
Example:
-107> 2025-05-17T20:40:50.468+0000 7f267a42f640 15 bluestore(/var/lib/ceph/osd/ceph-4) _do_write_v2_compressed 200000~78002 -> 200000~78002
-106> 2025-05-17T20:40:50.468+0000 7f267a42f640 20 blobs to put: 200000~f000(4d61) 20f000~f000(b51) 21e000~f000(b51) 22d000~f000(b51) 23c000~f000(b51) 24b000~f000(b51) 25a000~f000(b51) 269000~f000(b51)
As a result, we split 0x78002 into 8 * 0xf000, losing 0x2 in the process.
Calculations for original:
>>> size=0x78002
>>> blobs=(size+0xffff) / 0x10000
>>> blob_size = size / blobs
>>> print hex(size), blobs, hex(blob_size)
0x78002 8 0xf000 <-this means roundup is 0xf000
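The transcript above relies on Python 2 integer division. A minimal Python 3 sketch of the same arithmetic makes the lost bytes explicit. The names `MAX_BLOB`, `split_original`, and `split_ceil` are hypothetical, and the ceiling-division variant only illustrates how rounding up covers the whole range; it is not claimed to be the fix applied in BlueStore.

```python
MAX_BLOB = 0x10000  # divisor used in the transcript above


def split_original(size: int) -> tuple[int, int]:
    """Reproduce the original calculation: blob_size is rounded down."""
    blobs = (size + MAX_BLOB - 1) // MAX_BLOB
    blob_size = size // blobs
    return blobs, blob_size


def split_ceil(size: int) -> tuple[int, int]:
    """Illustrative variant (an assumption, not the actual fix):
    round blob_size up so that blobs * blob_size >= size."""
    blobs = (size + MAX_BLOB - 1) // MAX_BLOB
    blob_size = -(-size // blobs)  # ceiling division
    return blobs, blob_size


size = 0x78002
blobs, blob_size = split_original(size)
lost = size - blobs * blob_size
print(hex(size), blobs, hex(blob_size), "lost:", hex(lost))

blobs_c, blob_size_c = split_ceil(size)
print(hex(size), blobs_c, hex(blob_size_c), "covered:", blobs_c * blob_size_c >= size)
```

With rounding down, 8 blobs of 0xf000 cover only 0x78000 bytes, which matches the 0x2 shortfall seen in the log.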
Laura Flores [Tue, 27 May 2025 17:09:04 +0000 (12:09 -0500)]
qa/crontab: update priority for tentacle upgrade command
The current prio (100) results in this error:
```
teuthology.exceptions.ScheduleFailError: Scheduling failed: Unable to schedule 244 jobs with priority 100.
```
I tested a prio of 150 on my teuthology setup, and scheduling succeeds with this number of jobs.
Connor Fawcett [Mon, 9 Dec 2024 17:02:11 +0000 (17:02 +0000)]
qa/tasks: Add a task which performs an offline check of the consistency of parity shards
Add a Python script which can be used to scan a Ceph cluster, find any erasure coded data objects and
check them for consistency. This is achieved by reading the data shards for a given object, running the data shards
through the existing EC tool and verifying the output matches the parity shards stored on the OSDs.
This commit adds a new teuthology task but does not yet add it to any YAMLs. This work will be
expanded on in future commits.
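The verification idea described above (recompute parity from the data shards and compare it with what is stored) can be sketched as follows. This is a simplified stand-in: it uses a single XOR parity shard instead of the existing EC tool, and the function names are hypothetical rather than taken from the task.

```python
from functools import reduce


def xor_parity(data_shards: list[bytes]) -> bytes:
    """Compute a single XOR parity shard over equally sized data shards.

    XOR parity is a stand-in here; the real task runs the data shards
    through the existing EC tool instead.
    """
    if not data_shards:
        raise ValueError("at least one data shard is required")
    length = len(data_shards[0])
    if any(len(shard) != length for shard in data_shards):
        raise ValueError("all data shards must have the same length")
    # zip(*shards) yields one tuple of byte values per byte offset.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_shards))


def parity_is_consistent(data_shards: list[bytes], stored_parity: bytes) -> bool:
    """Return True if the recomputed parity matches the stored parity shard."""
    return xor_parity(data_shards) == stored_parity


shards = [b"\x01\x02", b"\x03\x04"]
print(parity_is_consistent(shards, b"\x02\x06"))  # parity matches the data
print(parity_is_consistent(shards, b"\x02\x07"))  # parity shard is corrupted
```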