Currently, we use the "Check ceph config" CI check to remind users about
any configuration changes that were detected in the PR. There's no easy
way for the script to detect whether the relevant documentation has been
updated for the config change that was detected.
Users might be confused to see the CI check still failing even after
updating the relevant docs. We update the text message to help defuse
the confusion. If users would still like to see the CI check go green,
they can comment `/config check ok` and re-run the failed test.
Kefu Chai [Wed, 18 Jun 2025 13:19:21 +0000 (21:19 +0800)]
ceph-object-corpus: update submodule
Update the ceph-object-corpus submodule to pick up the change to
mark cls_rbd_snap as forward incompatible since nautilus. This change
allows us to update ceph-dencoder to allocate fresh instances for each
decode operation instead of reusing existing ones. The upcoming
change in ceph-dencoder will allow us to identify the potential
compatibility break early.
Kefu Chai [Wed, 18 Jun 2025 09:22:36 +0000 (17:22 +0800)]
deb: use variable expansion to support systemd unit dir changes
Ubuntu changed the systemd unit directory location between releases:
- Jammy (22.04): /lib/systemd/system
- Noble (24.04): /usr/lib/systemd/system
To maintain compatibility across both versions, update .install files
to use brace expansion pattern {usr/,}lib/systemd/system/<service>.
This pattern works because dh_install uses bsd_glob() with GLOB_CSH
flags, which expands braces and matches files in both locations
depending on where CMakeLists.txt actually installed them.
Fixes installation issues when building packages on Noble while
maintaining backward compatibility with Jammy builds.
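The expansion described above can be sketched as follows. This is only an
illustration of how a flat (non-nested) csh-style brace group turns one
pattern into two candidate paths; it is not dh_install's or bsd_glob()'s
implementation. The helper name expand_braces and the unit name
example.service are hypothetical.

```python
import re
from typing import List


def expand_braces(pattern: str) -> List[str]:
    """Expand flat {a,b} groups left to right (nested groups are not handled)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    results: List[str] = []
    for alternative in match.group(1).split(","):
        # Recurse so that any later brace group in the tail is expanded too.
        results.extend(expand_braces(head + alternative + tail))
    return results


# The empty second alternative yields the Jammy-style path without "usr/".
candidates = expand_braces("{usr/,}lib/systemd/system/example.service")
```

Only the candidate that exists in the staging directory would then match,
which is why a single .install line can cover both releases.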
Kefu Chai [Sat, 14 Jun 2025 13:44:05 +0000 (21:44 +0800)]
cls/rbd: use default values for non-decoded fields in test instances
Previously, test instances for cls_rbd_snap used non-default values
for the "parent" field, which is ignored during decoding. The
check-generated.sh test passed because they reused the same instance
for re-encoding, preserving undecoded fields.
An upcoming change will allocate new instances for each encode/decode
verification instead of reusing instances. This will expose
discrepancies between original test instances and re-encoded values
when fields contain non-default values but aren't decoded.
This change sets ignored fields to their default values in test
instances, ensuring consistency between encoding and decoding
operations regardless of the verification approach used.
Since the incompatibility of cls_rbd_snap's on-disk format was
introduced in 32b14ed1, which was introduced in Ceph v14, we will
mark this version as the first incompatible version in ceph-object-corpus,
in the sense that the re-encoded cls_rbd_snap with v8 struct version
is different from the original copy if its parent field is set with
< v8 struct version.
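To illustrate why reusing an instance hides the discrepancy, here is a
toy model of the two verification approaches. It is a sketch only: the
Snap class, its fields and its encoding are hypothetical and do not
reflect the real cls_rbd_snap format or the ceph-dencoder code.

```python
from dataclasses import dataclass


@dataclass
class Snap:
    """Toy type whose decoder ignores one field (assumption for this sketch)."""
    name: str = ""
    parent: int = 0  # hypothetical field: not restored by decode()

    def encode(self) -> bytes:
        return self.name.encode("utf-8")

    def decode(self, payload: bytes) -> None:
        # Only `name` is restored; `parent` keeps whatever value it had.
        self.name = payload.decode("utf-8")


def roundtrip_reusing(instance: Snap) -> Snap:
    """Decode back into the same object: stale field values survive."""
    instance.decode(instance.encode())
    return instance


def roundtrip_fresh(instance: Snap) -> Snap:
    """Decode into a new object: ignored fields fall back to their defaults."""
    fresh = Snap()
    fresh.decode(instance.encode())
    return fresh
```

With a non-default parent, roundtrip_reusing() returns an object equal to
the original while roundtrip_fresh() does not; with parent left at its
default, both approaches agree.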
Kefu Chai [Tue, 17 Jun 2025 09:22:16 +0000 (17:22 +0800)]
cmake: use find_program(REQUIRED) to fail early on missing programs
Since upgrading the minimum CMake version to 3.22.1 (commit 469d82a1), we can
now use find_program(REQUIRED) which was introduced in CMake 3.18.
This change replaces manual FATAL_ERROR checks with the REQUIRED option
and adds it to programs that are actually needed during the build. This
ensures the build fails early during configuration rather than later
during compilation when missing programs are invoked.
Changes:
- Replace find_program() + message(FATAL_ERROR) patterns with REQUIRED
- Add REQUIRED to programs that are used during build but previously
had no error checking
Zac Dover [Tue, 17 Jun 2025 06:05:08 +0000 (16:05 +1000)]
doc: mgr/dashboard: add --enable-auth flag
Add an instruction that includes the --enable-auth flag in a "ceph orch
apply mgmt-gateway" command, in accordance with a request made by
afreen23 here: https://github.com/ceph/ceph/pull/60440#discussion_r1953530599
nvme-cli changed its JSON output in 2.11
and then reverted the change in the 2.13 release.
So this commit goes back to the 2.13 (or pre-2.11)
processing of "nvme list" JSON output.
Kefu Chai [Fri, 13 Jun 2025 12:22:20 +0000 (20:22 +0800)]
mds: generate symlink inode with correct mode
Fix test instance generation for InodeStoreBare and InodeStore to
properly set the mode field to S_IFLNK for symlink inodes.
Previously, generated test instances with symlink inodes had unset
mode fields, creating inconsistent data. This issue was masked because
ceph-dencoder reused existing instances during encode/decode consistency
tests, leaving stale values intact.
The problem would surface when check-generated.sh and readable.sh
allocate fresh instances for decoding tests, as the missing mode field
would cause decode/encode inconsistencies.
This change fixes generate_test_instances() to set the mode field to
S_IFLNK for symlink inodes, creating valid InodeStore and InodeStoreBare
instances with consistent field values for proper encode/decode testing.
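As a small illustration of the mode field involved, the file type lives
in the high bits of the mode, so an unset mode is not recognized as a
symlink. The helper names below are hypothetical, and the 0o777
permission bits are an assumption; the commit does not say which
permission bits the test instances use.

```python
import stat


def make_symlink_mode(permissions: int = 0o777) -> int:
    """Combine the symlink file-type bits with permission bits."""
    return stat.S_IFLNK | permissions


def looks_like_symlink(mode: int) -> bool:
    """True only if the file-type bits of `mode` say 'symlink'."""
    return stat.S_ISLNK(mode)
```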
Ville Ojamo [Fri, 13 Jun 2025 10:28:23 +0000 (17:28 +0700)]
doc: Fix sphinx warnings
doc/cephadm/services/snmp-gateway.rst: Don't use double backticks for
links. Makes it a link instead of rendering syntax verbatim.
Also for consistency use single backticks for links instead of a plain
trailing underscore.
Improve language of opening sentence.
doc/dev/cephfs-mirroring.rst: Add missing empty line before preformatted
blocks. No change in rendered docs.
doc/mgr/telemetry.rst: Fix external link syntax. Makes it a link instead
of rendering syntax and pointing to non-existing link.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Shachar Sharon [Tue, 22 Oct 2024 12:06:54 +0000 (15:06 +0300)]
client: fix memory leak in Client::CRF_iofinish::complete
Commit 1210ddf7a ("Client: Add non-blocking helper classes") introduced
Client::C_Read_Finisher Context object for async READ operations, but
it has a use-after-free bug which may cause a memory leak when calling
libcephfs's non-blocking ceph_ll_nonblocking_readv_writev API with async
READ:
  ceph_ll_nonblocking_readv_writev (READ)
    Client::ll_preadv_pwritev
      ...
      Client::_read_async
        Context::complete
          Client::CRF_iofinish::complete
            Client::CRF_iofinish::finish
              CRF->finish_io()
                Client::C_Read_Finisher::finish_io
                  ...
                  delete this; // frees CRF_iofinish->CRF
            if (CRF->iofinished) // use-after-free of CRF
              delete this; // may not get here
Whether the leak occurs depends on timing: if another thread's
allocation reuses the freed memory and changes the value read as
CRF->iofinished to false, the last delete operation is skipped.
The check of `if (CRF->iofinished)` is unnecessary: it is always set to
true upon calling CRF->finish_io(). Thus, there is no need to have the
override function Client::CRF_iofinish::complete() as it now has the
same logic as Context::complete(). Removed.
Ville Ojamo [Fri, 13 Jun 2025 10:02:33 +0000 (17:02 +0700)]
doc/cephfs: Improve formatting in mantle.rst
Use ordered lists instead of hardcoded list item number paragraphs.
Indent list item contents correctly so that a text block is not
rendered inside a previous preformatted block.
Also fix indentation of one preformatted block inside a list item to be
at the same amount of indent as other such blocks.
Use inline preformatted for commands, method/function names etc. instead
of italic/MD-style inline preformatted.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
```
CMake Deprecation Warning at src/dmclock/CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 3.10 will be removed from a future version of
CMake.
Update the VERSION argument <min> value. Or, use the <min>...<max> syntax
to tell CMake that the project requires at least <min> but has been updated
to work with policies introduced by <max> or earlier.
```
The upstream CI workflow uses Ubuntu 22.04 (CMake 3.22.1) and CentOS 9
(CMake 3.26.5), so bumping to 3.22.1 maintains compatibility with our
supported build environments while enabling access to newer CMake
features.
Kefu Chai [Fri, 13 Jun 2025 08:20:10 +0000 (16:20 +0800)]
cmake: drop c-ares::c-ares alias
Remove the c-ares::c-ares alias that was causing build failures after
bumping the minimum CMake version:
```
CMake Error at cmake/modules/Findc-ares.cmake:34 (add_library):
add_library cannot create ALIAS target "c-ares::c-ares" because another
target with the same name already exists.
Call Stack (most recent call first):
src/CMakeLists.txt:463 (_find_package)
src/seastar/cmake/SeastarDependencies.cmake:136 (find_package)
src/seastar/CMakeLists.txt:395 (seastar_find_dependencies)
```
The alias was originally added for backward compatibility with Seastar,
but is no longer needed since the updated Seastar submodule no longer
references the c-ares::c-ares target.
Matt Benjamin [Sun, 18 May 2025 01:02:34 +0000 (21:02 -0400)]
rgw: aws-chunked need not supply any content-length
The updated logic for aws chunked handling (2024) appears sufficient
to handle the cases produced by aws-sdk-go-v2.
Note that https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html
states that "For all requests, you must include the
x-amz-decoded-content-length header, specifying the size of the object in
bytes." (accessed 5/17/2025) (but now we do not enforce it).
Reported (with reproducer!) by: Fred Heinecke.
Fixes: https://tracker.ceph.com/issues/71183
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Matt Benjamin [Sat, 17 May 2025 19:52:20 +0000 (15:52 -0400)]
rgw: recognize checksum from x-amz-checksum-{type} alone
Some SDKs may also send x-amz-checksum-algorithm or
x-amz-sdk-checksum-algorithm regardless, but those are
only required if the checksum header is in the trailer section.
Fixes: https://tracker.ceph.com/issues/71350
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
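A rough sketch of recognizing the checksum from the x-amz-checksum-{type}
header alone is shown below. It is not the RGW implementation: the
function name, the header mapping and the list of accepted types are
assumptions made for illustration.

```python
from typing import Mapping, Optional, Tuple

# Checksum types assumed for this sketch; the set RGW accepts may differ.
KNOWN_TYPES = ("crc32", "crc32c", "crc64nvme", "sha1", "sha256")
PREFIX = "x-amz-checksum-"


def checksum_from_headers(headers: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (type, value) from the first x-amz-checksum-{type} header found."""
    for name, value in headers.items():
        lowered = name.lower()
        if lowered.startswith(PREFIX):
            kind = lowered[len(PREFIX):]
            # Skips x-amz-checksum-algorithm, which names a type but carries no digest.
            if kind in KNOWN_TYPES:
                return kind, value
    return None
```

In this sketch no algorithm header is consulted at all; the type is taken
from the header name itself.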
John Mulligan [Fri, 21 Mar 2025 18:28:25 +0000 (14:28 -0400)]
script/build-with-container: cache git branch result
Cache the branch we got from the git command as it is highly unlikely
to change during the script execution and if it does -- we mostly don't
care anyway.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
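The caching idea can be sketched as below. This is not the code from
build-with-container: the class name BranchCache and the injectable
lookup are hypothetical, and the use of "git rev-parse --abbrev-ref HEAD"
is an assumption, since the commit message does not say which git command
is run.

```python
import subprocess
from typing import Callable, Optional


def _git_branch() -> str:
    """Ask git for the current branch name (assumed command for this sketch)."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


class BranchCache:
    """Run the lookup at most once and reuse its result afterwards."""

    def __init__(self, lookup: Callable[[], str] = _git_branch) -> None:
        self._lookup = lookup
        self._branch: Optional[str] = None

    def get(self) -> str:
        if self._branch is None:
            self._branch = self._lookup()
        return self._branch
```

Passing the lookup in as a callable keeps the sketch testable without a
git checkout; a branch switch during the run is deliberately ignored.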
Shraddha Agrawal [Thu, 29 May 2025 10:10:01 +0000 (15:40 +0530)]
qa/standalone/mon/availability.sh: add test for config option
This commit adds two tests. The first ensures that we get an error
message when the feature is disabled; it checks that the config
option enable_availability_tracking is working properly.
The second test ensures that we actually do stop calculating the
score when the feature is disabled.
Shraddha Agrawal [Thu, 22 May 2025 10:26:41 +0000 (15:56 +0530)]
mon/MgrStatMonitor: ignore duration for which feature is off
When the availability tracking feature is disabled, we should not
be updating the score. We should start recalculating the score
when the user enables the feature again. Essentially, for the
purpose of calculating the score, we need to ignore the duration
for which the feature was turned off.
The score is calculated from the uptime and downtime durations
recorded in the `pool_availability` object. These durations are updated
in `calc_pool_availability` by adding the diff between last_uptime/
last_downtime and now.
To discard the duration for which the feature was turned off, we
need to offset the uptime/downtime by this duration. A simple way
to do this is to update the last_uptime and last_downtime to the
timestamp when the feature is toggled on again. To implement
this, we record the time at which the feature is toggled from off
to on. When `calc_pool_availability` is invoked, if a reset is
required, it resets last_uptime and last_downtime before proceeding
with availability calculations.
We only care about the state when the feature is toggled from off to
on. All other toggle states for the config option will not have any
effect on the score.
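The reset logic described above can be sketched as follows. This is a
simplified model, not the MgrStatMonitor code: the Tracker class, the
is_avail flag, the set_enabled() method and the choice to advance both
markers after every calculation are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PoolAvailability:
    """Hypothetical stand-in for the per-pool record."""
    uptime: float = 0.0
    downtime: float = 0.0
    last_uptime: float = 0.0
    last_downtime: float = 0.0
    is_avail: bool = True  # assumed flag: is the pool currently available?


class Tracker:
    def __init__(self, pool: PoolAvailability, enabled: bool = True) -> None:
        self.pool = pool
        self.enabled = enabled
        self.reset_at: Optional[float] = None  # pending off -> on timestamp

    def set_enabled(self, enabled: bool, now: float) -> None:
        # Only the off -> on transition schedules a reset.
        if enabled and not self.enabled:
            self.reset_at = now
        self.enabled = enabled

    def calc_pool_availability(self, now: float) -> None:
        if not self.enabled:
            return  # score is frozen while the feature is off
        if self.reset_at is not None:
            # Discard the disabled interval by moving both markers forward.
            self.pool.last_uptime = self.reset_at
            self.pool.last_downtime = self.reset_at
            self.reset_at = None
        if self.pool.is_avail:
            self.pool.uptime += now - self.pool.last_uptime
        else:
            self.pool.downtime += now - self.pool.last_downtime
        # Simplification: advance both markers so neither counter goes stale.
        self.pool.last_uptime = now
        self.pool.last_downtime = now
```

For example, 10 seconds of uptime, a disabled stretch from t=10 to t=60,
and a calculation at t=65 would add only the 5 seconds after re-enabling.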