Matan Breizman [Sun, 4 May 2025 14:22:38 +0000 (14:22 +0000)]
crimson/osd: Logging fixes
* Fix "failed to log message"
* PGRecovery move to new logging macro
* PGRecovery to print pg prefix as it's impossible to debug specific pg
recovery ops without it.
crimson/osd/pg: Let PGListener use start_peering_event_operation
PG::start_peering_event_operation is a template function while
PGRecovery::pg is of PGRecoveryListener* type. We can't expose a template
function through the PGRecoveryListener interface since it must be
also virtual.
Instead, introduce start_peering_event_operation_listener which will act
as a wrapper to PG::start_peering_event_operation for PGRecovery to use
freely.
rgw-admin: report correct error code for non-existent bucket on deletion
admin api should return the correct error code when the bucket doesn't
exist on bucket deletion. apparently a regression by 9ae2d8c4e95807179fc17f84be6754d2b19fe639.
Ville Ojamo [Wed, 30 Apr 2025 07:37:57 +0000 (14:37 +0700)]
doc/radosgw: Improve language, capitalization and use config database
Use "RADOS Gateway" instead of "Rados Gateway", "rados gateway" etc.
I am aware of the term "Ceph Object Gateway" but this change intends to
be an uncontroversial low hanging fruit fix of obviously incorrectly
capitalized terms.
Use "RGW daemon" instead of "Gateway", "Rados Gateway" etc.
Use "RGW instance" instead of "rados gateway" for consistency with
exactly similar other instance.
If referring obviously clearly to an instance of the daemon with an
obviously not preferred term, change it to "RGW daemon"; for example
when talking about restarting the RGW.
Do not touch other instances that are not 100% clear.
The files touched mostly do not use "Ceph Object Gateway" so changing
the term to it would create inconsistency, or several more changes
would need to be done to update all instances to use this terminology.
Use configuration database instead of ceph.conf in d3n_datacache.rst.
Improve language in d3n_datacache.rst.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Correct the presentation of an example string in doc/cephadm/rgw.rst in
order to obviate an error reading "rgw.rst:202: WARNING: Inline emphasis start-string without end-string."
doc/rados: Update mClock doc on steps to override OSD IOPS capacity config
Describe the steps involved to
- Specify a global value for osd_mclock_max_capacity_iops_{ssd,hdd}, and
- Override existing individually scoped values for OSDs determined during
start-up for osd_mclock_max_capacity_iops_{ssd,hdd}.
The above is to help with the following:
- Steps to override existing setting with a global value.
- reduce the number of entries in the mon store and instead use a single
global specification for all OSDs in the cluster in case the underlying
hardware is the same for all OSDs.
test/rgw/multisite: test error handling of forwarded s3:PutBucketPolicy
PutBucketPolicy doesn't parse the given policy until after it's
forwarded and applied on the master zone, so add a test that sends a
non-json policy document that will fail to parse
without the fix to rgw_forward_request_to_master(), the InvalidArgument
error Code is still mapped correctly, but the error Message is not
preserved:
test/rgw/multisite: test error handling of forwarded iam:DeleteRole
DeleteRole's conflict handling happens after forwarding, so use
test_role_delete_sync() to test that forwarded 409 Conflict errors
preserve the DeleteConflict code and error message
without the fix to forward_iam_request_to_master(), DeleteRole instead
fails with:
> botocore.exceptions.ClientError: An error occurred (BucketNotEmpty) when calling the DeleteRole operation: None
when a forwarded request fails on the master zone, the local zone should
return that same error response back to the client. this means
reproducing both the http error and the aws xml <Error> response
rgw_forward_request_to_master() stores these errors in s->err, and
set_req_state_err() now avoids overwriting existing an error
rgw/rest: RGWRESTConn::forward() prefers to return http errors
callers need to distinguish between transport errors (a failure to
forward the request) and http errors (successfully forwarded and got a
response). forward() was losing this information by mapping any http
errors to errnos
use tl::expected to differentiate between transport errors and http
errors, with the latter being the successful/expected case
scan_for_backfill was seperated to scan_for_backfill_primary and
scan_for_backfill_replica.
The fix from:
https://github.com/ceph/ceph/pull/62837/commits/88432ebd7432c513ccd495e77425401beddb9953
was only copied to the replica version.
Ville Ojamo [Tue, 29 Apr 2025 06:20:26 +0000 (13:20 +0700)]
doc/radosgw: Use privileged prompt for CLI commands in admin.rst
Instead of not defining a prompt to use in CLI commands and falling back
to the default unprivileged prompt, use explicit privileged bash prompt
for CLI commands that require privileges.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Samuel Just [Sat, 5 Apr 2025 01:57:33 +0000 (18:57 -0700)]
crimson: fix DynamicPerfStats usage in ClientRequest
ClientRequest::get_connection() return l_conn, which will be
null by the time PG::add_client_request_lat is called in
ClientRequest::do_process. Modify get_connection() to
return a Connection& from whichever of l_conn or r_conn
isn't null.
rgw: utilize is_impersonating for forwarded sts requests
With the introduction of is_impersonating in SysReqApplier,
RoleApplier can now use the same mechanism to mark when a request
has been forwarded by a system user on behalf of another role (e.g.,
through STS) to mark it as a system request (s->system_request).
In rgw_sync_pipe_params, the mode can be either system or user.
When in system mode, no user is involved, but the current
implementation holds an empty rgw_user, which can cause confusion
in pipe_rules::find_basic_info_without_tags().
With this change, rgw_user is now optional, ensuring that when no
user is involved, it is explicitly nullopt rather than an empty object.
Seena Fallah [Fri, 28 Mar 2025 20:55:20 +0000 (21:55 +0100)]
rgw: remote copy obj pass rgwx-perm-check-uid for perm evaluation
When copying object from remote source (bucket from another zonegroup)
the perms of the source is not evaluated resulting in reading from
unauthorized buckets.
passing `rgwx-perm-check-uid` will let the source zone evaluates the
perm and close this bug.
Seena Fallah [Fri, 28 Mar 2025 20:52:47 +0000 (21:52 +0100)]
rgw: RGWRadosPutObj evals source bucket perm for backward compatibility
As of a3f40b4 we no longer evaluate perms locally for source bucket,
this could cause broken permission evaluation dusring upgrade as one
zone is not respecting the perm evaluation based on the `rgwx-perm-check-uid`
arg.
Seena Fallah [Fri, 28 Mar 2025 20:48:34 +0000 (21:48 +0100)]
rgw: give hint via header for perm evaluation in GetObj
Return `Rgwx-Perm-Checked` header as a hint for the destination zone
to know whether the perms where considered or not.
This is just a backward compatibility for upgrade and can be dropped
in T+2 release.
Seena Fallah [Thu, 27 Feb 2025 10:53:44 +0000 (11:53 +0100)]
rgw: take account GetObject(Version)Tagging when replicating
In case the uid has no permission to read tagging, the tags should
not be replicated.
Ref. https://docs.aws.amazon.com/AmazonS3/latest/userguide/setting-repl-config-perm-overview.html