Matan Breizman [Tue, 4 Feb 2025 10:24:43 +0000 (10:24 +0000)]
CMakeLists: Fallback to RelWithDebInfo
Currently, if .git exists, we set CMAKE_BUILD_TYPE=Debug.
Otherwise, we leave it empty and no optimization flags will
be used.
With this change, the fallback CMAKE_BUILD_TYPE is set
to RelWithDebInfo instead.
From CMAKE_BUILD_TYPE manual:
The default value is often an empty string, but this is usually not
desirable and one of the other standard build types is usually more appropriate.
Note: One notable change is that -DNDEBUG will now be defined.
Kefu Chai [Wed, 7 May 2025 00:42:52 +0000 (08:42 +0800)]
librbd, tools: migrate from boost::variant to std::variant
Complete the migration started in commit 017f333, replacing boost::variant with
std::variant throughout the librbd codebase. This change is part of our ongoing
effort to reduce third-party dependencies by leveraging C++ standard library
alternatives where possible.
Benefits include:
- Improved code readability and maintainability
- Reduced external dependency surface
- More consistent API usage with other components
Implementation note: Unlike Boost.variant, std::variant lacks built-in
operator<< support. This commit implements the necessary operator<< for
AttributeValue, our specific std::variant instantiation, to preserve the
existing behavior.
Also, although `apply_visit()` calls can be replaced with `visit()`
without the `std::` qualification because of ADL, we are taking this
opportunity to add the `std::` prefix for better readability.
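A minimal sketch of the operator<< approach described above, not the code from this commit. The `sketch` namespace, the `to_string()` helper and the two alternatives chosen for AttributeValue are assumptions made for illustration; the real instantiation may hold other types.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <variant>

namespace sketch {

// Hypothetical alternatives; the real AttributeValue may hold different types.
using AttributeValue = std::variant<std::uint64_t, std::string>;

// std::variant has no stream inserter of its own, so dispatch to the
// inserter of whichever alternative is currently held.
inline std::ostream& operator<<(std::ostream& os, const AttributeValue& value) {
  std::visit([&os](const auto& held) { os << held; }, value);
  return os;
}

// Helper for this sketch only: render a value through the operator above.
inline std::string to_string(const AttributeValue& value) {
  std::ostringstream oss;
  oss << value;
  return oss.str();
}

}  // namespace sketch
```

Because the alias names a type from namespace std, ADL does not find this operator from other namespaces, so in this sketch callers stream the value from inside `sketch` or go through `to_string()`.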
Matan Breizman [Sun, 4 May 2025 14:22:38 +0000 (14:22 +0000)]
crimson/osd: Logging fixes
* Fix "failed to log message"
* Move PGRecovery to the new logging macro
* Make PGRecovery print the pg prefix, as it's impossible to debug specific pg
recovery ops without it.
crimson/osd/pg: Let PGListener use start_peering_event_operation
PG::start_peering_event_operation is a template function while
PGRecovery::pg is of PGRecoveryListener* type. We can't expose a template
function through the PGRecoveryListener interface, since it would also
have to be virtual and member function templates cannot be virtual.
Instead, introduce start_peering_event_operation_listener which will act
as a wrapper to PG::start_peering_event_operation for PGRecovery to use
freely.
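The wrapper pattern can be sketched as follows. This is an illustration under assumptions, not the crimson code: `RecoveryListener`, `start_event()`, `start_event_listener()`, the `started` log and the `(epoch, event)` signature are all hypothetical stand-ins.

```cpp
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Interface seen by the recovery code. Virtual functions cannot be
// templates, so the interface exposes one fixed-signature entry point.
struct RecoveryListener {
  virtual ~RecoveryListener() = default;
  virtual void start_event_listener(int epoch, const std::string& event) = 0;
};

struct PG : RecoveryListener {
  std::vector<std::string> started;  // records events, for illustration only

  // Generic entry point: accepts whatever arguments the event needs.
  template <typename... Args>
  void start_event(Args&&... args) {
    started.push_back(describe(std::forward<Args>(args)...));
  }

  // Non-template virtual wrapper that forwards to the template above.
  void start_event_listener(int epoch, const std::string& event) override {
    start_event(epoch, event);
  }

 private:
  static std::string describe(int epoch, const std::string& event) {
    return std::to_string(epoch) + ":" + event;
  }
};

}  // namespace sketch
```

Code that only holds a `RecoveryListener*` can then reach the template through the virtual wrapper, at the cost of fixing the argument list in the interface.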
Rishabh Dave [Wed, 2 Apr 2025 15:31:31 +0000 (21:01 +0530)]
mgr/vol: handle case where path goes missing for a clone
A thread is spawned to get the value of a certain extended attribute to
generate the progress statistics for the ongoing clone operations. In
case source and/or destination path for a clone operation goes missing,
this thread crashes. Instead of crashing, handle this case gracefully.
Fixes: https://tracker.ceph.com/issues/71019
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Ronen Friedman [Fri, 2 May 2025 08:03:15 +0000 (03:03 -0500)]
osd/scrub: check all(*) conditions in restrictions_on_scrubbing()
Modified OsdScrub::restrictions_on_scrubbing() to check all(*)
conditions, instead of stopping at the first one that is true.
The "new" (since Tentacle) scrub-type-to-conditions mapping is no
longer a simple one (it is not "monotonic" in the sense of restrictions
always being removed as the scrub type becomes more important),
and the caller may want to know them all.
(*) The somewhat costly check for the random backoff is still only
performed if the OSD is not already running too many scrubs.
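A rough sketch of the "check everything" behavior described above. The struct fields, the `OsdState` snapshot and the `roll_backoff` callable are hypothetical names chosen for illustration; only the overall shape (collect all restrictions, skip the backoff draw when the OSD is already too busy) comes from the commit message.

```cpp
#include <functional>

namespace sketch {

// Hypothetical restriction flags; the real structure has its own fields.
struct ScrubRestrictions {
  bool max_concurrency_reached = false;
  bool random_backoff_active = false;
  bool outside_allowed_hours = false;
  bool cpu_overloaded = false;
};

// Hypothetical snapshot of the OSD state that the checks consult.
struct OsdState {
  bool too_many_scrubs;
  bool in_allowed_hours;
  bool load_is_low;
};

// Evaluate every condition instead of returning at the first hit.
// The backoff draw (passed in as a callable) is only invoked when the
// OSD is not already running too many scrubs.
inline ScrubRestrictions restrictions_on_scrubbing(
    const OsdState& osd, const std::function<bool()>& roll_backoff) {
  ScrubRestrictions r;
  r.max_concurrency_reached = osd.too_many_scrubs;
  if (!r.max_concurrency_reached) {
    r.random_backoff_active = roll_backoff();
  }
  r.outside_allowed_hours = !osd.in_allowed_hours;
  r.cpu_overloaded = !osd.load_is_low;
  return r;
}

}  // namespace sketch
```

Returning the full set lets the caller decide, per scrub type, which restrictions actually apply.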
rgw-admin: report correct error code for non-existent bucket on deletion
The admin API should return the correct error code when the bucket doesn't
exist on bucket deletion. This is apparently a regression introduced by
9ae2d8c4e95807179fc17f84be6754d2b19fe639.
Ville Ojamo [Wed, 30 Apr 2025 07:37:57 +0000 (14:37 +0700)]
doc/radosgw: Improve language, capitalization and use config database
Use "RADOS Gateway" instead of "Rados Gateway", "rados gateway" etc.
I am aware of the term "Ceph Object Gateway", but this change is intended
to be an uncontroversial, low-hanging-fruit fix of obviously incorrectly
capitalized terms.
Use "RGW daemon" instead of "Gateway", "Rados Gateway" etc.
Use "RGW instance" instead of "rados gateway" for consistency with
another, otherwise identical instance.
Where a clearly non-preferred term obviously refers to an instance of
the daemon, change it to "RGW daemon"; for example
when talking about restarting the RGW.
Do not touch other instances that are not 100% clear.
The files touched mostly do not use "Ceph Object Gateway" so changing
the term to it would create inconsistency, or several more changes
would need to be done to update all instances to use this terminology.
Use configuration database instead of ceph.conf in d3n_datacache.rst.
Improve language in d3n_datacache.rst.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Correct the presentation of an example string in doc/cephadm/rgw.rst in
order to obviate an error reading "rgw.rst:202: WARNING: Inline emphasis start-string without end-string."
doc/rados: Update mClock doc on steps to override OSD IOPS capacity config
Describe the steps involved to
- Specify a global value for osd_mclock_max_capacity_iops_{ssd,hdd}, and
- Override existing individually scoped values for OSDs determined during
start-up for osd_mclock_max_capacity_iops_{ssd,hdd}.
The above helps to:
- Override an existing setting with a global value.
- Reduce the number of entries in the mon store by using a single
global specification for all OSDs in the cluster when the underlying
hardware is the same for all OSDs.
N Balachandran [Wed, 30 Apr 2025 05:15:13 +0000 (10:45 +0530)]
rbd: write image mirror status if state is CREATING
It can take up to 30s for the image mirror status to be written
to rbd_mirroring on the secondary for a newly created image. This fix
attempts to reduce the time by writing the status to rbd_mirroring even
if the image state is set to CREATING.
Fixes: https://tracker.ceph.com/issues/71138
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
test/rgw/multisite: test error handling of forwarded s3:PutBucketPolicy
PutBucketPolicy doesn't parse the given policy until after it's
forwarded and applied on the master zone, so add a test that sends a
non-json policy document that will fail to parse
without the fix to rgw_forward_request_to_master(), the InvalidArgument
error Code is still mapped correctly, but the error Message is not
preserved:
test/rgw/multisite: test error handling of forwarded iam:DeleteRole
DeleteRole's conflict handling happens after forwarding, so use
test_role_delete_sync() to test that forwarded 409 Conflict errors
preserve the DeleteConflict code and error message
without the fix to forward_iam_request_to_master(), DeleteRole instead
fails with:
> botocore.exceptions.ClientError: An error occurred (BucketNotEmpty) when calling the DeleteRole operation: None
when a forwarded request fails on the master zone, the local zone should
return that same error response back to the client. this means
reproducing both the http error and the aws xml <Error> response
rgw_forward_request_to_master() stores these errors in s->err, and
set_req_state_err() now avoids overwriting an existing error
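The "don't overwrite an existing error" rule can be sketched like this. `ReqError`, its fields and `set_error_if_unset()` are hypothetical stand-ins for s->err and set_req_state_err(); the real types and signatures differ.

```cpp
#include <string>

namespace sketch {

// Hypothetical stand-in for the error state kept on the request.
struct ReqError {
  int http_ret = 200;
  std::string err_code;
  std::string message;

  // An error counts as recorded once an error code has been stored.
  bool is_set() const { return !err_code.empty(); }
};

// Fill in a locally derived error only if nothing was recorded earlier,
// so an error copied from the master zone's response survives intact.
inline void set_error_if_unset(ReqError& err, int http_ret,
                               const std::string& err_code) {
  if (err.is_set()) {
    return;
  }
  err.http_ret = http_ret;
  err.err_code = err_code;
}

}  // namespace sketch
```

With this guard, a forwarded 409 response keeps its own code and message even when a later, generic mapping step runs on the local zone.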
rgw/rest: RGWRESTConn::forward() prefers to return http errors
callers need to distinguish between transport errors (a failure to
forward the request) and http errors (successfully forwarded and got a
response). forward() was losing this information by mapping any http
errors to errnos
use tl::expected to differentiate between transport errors and http
errors, with the latter being the successful/expected case
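A sketch of the distinction a caller can now make. To keep the block standard-library only, std::variant is used here as a stand-in for tl::expected; `TransportError`, `HttpStatus`, `ForwardResult` and `describe()` are hypothetical names, and the actual return type of forward() may be shaped differently.

```cpp
#include <string>
#include <variant>

namespace sketch {

struct TransportError { int err; };  // errno-style: no response was received
struct HttpStatus { int code; };     // a response arrived, even if 4xx/5xx

// Stand-in for an expected-like type: HttpStatus is the "expected" case,
// TransportError the "unexpected" one.
using ForwardResult = std::variant<HttpStatus, TransportError>;

// Example caller: the two failure kinds stay distinguishable.
inline std::string describe(const ForwardResult& result) {
  if (const auto* transport = std::get_if<TransportError>(&result)) {
    return "transport error " + std::to_string(transport->err);
  }
  const auto& http = std::get<HttpStatus>(result);
  if (http.code >= 400) {
    return "http error " + std::to_string(http.code);
  }
  return "ok " + std::to_string(http.code);
}

}  // namespace sketch
```

Treating an http error as the successful branch means the caller still has the status in hand and can relay the master zone's response instead of a locally mapped errno.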