Zac Dover [Tue, 13 May 2025 06:58:39 +0000 (16:58 +1000)]
doc/dev/cephfs-mirroring: edit file 2 of x
Add prompts (and perform necessary corrections to glaring grammatical
errors) to doc/dev/cephfs-mirroring.rst, as requested by Jos Collin in
https://github.com/ceph/ceph/pull/63237/files#r2085886075.
This commit edits the second quarter of the doc/dev/cephfs-mirroring.rst
file. This commit encompasses about one hundred lines of RST.
Ronen Friedman [Sun, 11 May 2025 05:24:33 +0000 (00:24 -0500)]
osd/scrub: remove the 'deadline' attribute from the scrub job
The scrub job's 'overdue' attribute is no longer calculated. The
only 'scrub is overdue' status remaining after the latest
scheduling refactor is the one computed in PGMap.cc (the
one affecting the 'health warning' status of the cluster).
Thus, there is no longer any reason to maintain a 'deadline'
attribute for the scrub scheduler.
Ronen Friedman [Fri, 9 May 2025 12:46:26 +0000 (07:46 -0500)]
osd/scrub: remove the deep-scrubs deadline attribute
As it is no longer meaningful in the context of the new
scrub scheduling design.
The change mandates fixes to the way 'schedule-[deeps]crub'
commands are implemented. The offset to use when forcing the
last-scrub timestamp to a new value is now calculated in
ScrubJob::guaranteed_offset(), as ScrubJob is where all
schedule adjustments (which employ the same logic) are
implemented.
Ronen Friedman [Thu, 8 May 2025 13:45:23 +0000 (08:45 -0500)]
osd/scrub: fix deadline calculations
The scrub scheduling deadlines are calculated based on pool and OSD
configuration parameters. The specifics of the calculations are
modified to match the new scrub scheduling design.
Comments and documentation are updated to reflect the fact that
the deadlines no longer have any meaningful effect on scrub
scheduling.
Afreen Misbah [Tue, 6 May 2025 14:27:03 +0000 (19:57 +0530)]
mgr/dashboard: Fix delete listener
- pass gw_group to the delete API in the frontend
- when more than one gateway group is present, deleting a listener fails with the error message: Multiple NVMe-oF gateway groups are configured. Please specify the 'gw_group' parameter in the request.
- add missing types and i18n
Afreen Misbah [Thu, 8 May 2025 04:09:59 +0000 (09:39 +0530)]
mgr/dashboard: Add default state when gateway groups are empty
Fixes https://tracker.ceph.com/issues/71247
- after upgrades, the nvmeof service spec does not contain the `group` field
- this causes internal errors in the UI combobox
- check for `group` in the spec and disable the selector
N Balachandran [Wed, 30 Apr 2025 05:15:13 +0000 (10:45 +0530)]
rbd: write image mirror status if state is CREATING
It can take up to 30s for the image mirror status to be written
to rbd_mirroring on the secondary for a newly created image. This fix
attempts to reduce the time by writing the status to rbd_mirroring even
if the image state is set to CREATING.
Fixes: https://tracker.ceph.com/issues/71138
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
(cherry picked from commit 25a8de9c3db8309387eed3502e781872bc1e035e)
Zac Dover [Thu, 8 May 2025 02:29:25 +0000 (12:29 +1000)]
doc/mgr: edit alerts.rst
Edit doc/mgr/alerts.rst as part of the project to determine where the
error is in https://github.com/ceph/ceph/pull/62782 that prevents the
Jenkins tests from passing.
This commit adds to the work done in
https://github.com/ceph/ceph/pull/62782 by correcting some of the
English that was present in that PR.
This is a change to one of twenty-five files in
https://github.com/ceph/ceph/pull/62782, and this commit represents one
of what will be at least twenty-five other commits made to track this
error down.
Zac Dover [Thu, 8 May 2025 00:08:06 +0000 (10:08 +1000)]
doc/mgr/ceph_api: edit index.rst
Edit doc/mgr/ceph_api/index.rst as part of the project to determine
where the error is in https://github.com/ceph/ceph/pull/62782 that
prevents the Jenkins tests from passing.
This is a change to one of twenty-five files in
https://github.com/ceph/ceph/pull/62782, and this commit represents one
of what will be at least twenty-five other commits made to track this
error down.
Adam Kupczyk [Wed, 7 May 2025 08:30:11 +0000 (08:30 +0000)]
os/bluestore/recompression: Estimator omits large compressed blobs
The problem was that Estimator accepted large compressed blobs for
recompression. The fix is to discourage such actions by penalizing
compressed blobs based on their size. In effect, a small compressed
blob is likely to be recompressed, and a large compressed blob is not.
Fixes: https://tracker.ceph.com/issues/71244
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
(cherry picked from commit bbc9e961e9046949138bb3d70e8dd91761fcb088)
Adam Kupczyk [Wed, 7 May 2025 08:25:19 +0000 (08:25 +0000)]
os/bluestore/recompression: Now able to reach left boundary
A bad comparison caused the recompression range to exclude its left
boundary point. In most cases this makes little difference, but it prevents:
1) including an extent starting at 0
2) including an extent at the beginning of an onode segment
Now fixed.
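The boundary problem described above can be illustrated with a minimal sketch. The `Extent` type and both helper functions are hypothetical stand-ins and not the BlueStore code; the sketch only contrasts a strict comparison, which drops an extent starting exactly at the left boundary, with an inclusive one.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an extent; not the BlueStore type.
struct Extent {
  uint64_t offset;
  uint64_t length;
};

// Buggy variant: the strict '>' skips an extent that starts exactly at `left`.
inline std::vector<Extent> select_exclusive(const std::vector<Extent>& in,
                                            uint64_t left, uint64_t right) {
  assert(left <= right);
  std::vector<Extent> out;
  for (const auto& e : in) {
    if (e.offset > left && e.offset + e.length <= right) out.push_back(e);
  }
  return out;
}

// Fixed variant: '>=' keeps the extent that starts at the left boundary.
inline std::vector<Extent> select_inclusive(const std::vector<Extent>& in,
                                            uint64_t left, uint64_t right) {
  assert(left <= right);
  std::vector<Extent> out;
  for (const auto& e : in) {
    if (e.offset >= left && e.offset + e.length <= right) out.push_back(e);
  }
  return out;
}
```

With `left` equal to 0, only the inclusive variant can select an extent that starts at offset 0.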
Fixes: https://tracker.ceph.com/issues/71244
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
(cherry picked from commit acfe527d9bbe3364f9e321ce6e790f93eafe41df)
Nitzan Mordechai [Wed, 26 Mar 2025 08:20:15 +0000 (08:20 +0000)]
osd_types: Restore new_object marking for delete missing entries
Recent changes (PR #29893) removed the “new_object” parameter from missing.add() and the
pg_missing_item constructor. As a result, when processing delete log entries,
if an object is found on disk, its on‑disk version is stored as “have” instead
of the default eversion_t() (0'0). The invariant in read_log_and_missing() then
fails because delete entries are expected to have “have” set to eversion_t().
This patch reintroduces the following check:
    if (have == eversion_t())
        clean_regions.mark_object_new();
By doing so, we ensure that when the on‑disk “have” is default, the missing record
is marked as new—restoring the previous behavior and satisfying the invariant for
delete operations.
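A minimal, self-contained sketch of the restored check follows. The types here are simplified stand-ins for the ones in osd_types, and `make_missing_item` is a hypothetical helper; the message above does not say exactly where the check is placed, so this only shows the effect of the condition.

```cpp
#include <cstdint>

// Simplified stand-in; the real eversion_t carries more functionality.
struct eversion_t {
  uint64_t version = 0;
  uint32_t epoch = 0;
  bool operator==(const eversion_t& o) const {
    return version == o.version && epoch == o.epoch;
  }
};

// Simplified stand-in that tracks only the "new object" flag.
struct ObjectCleanRegions {
  bool new_object = false;
  void mark_object_new() { new_object = true; }
};

// Simplified stand-in for a missing-set record.
struct pg_missing_item {
  eversion_t need;
  eversion_t have;
  ObjectCleanRegions clean_regions;
};

// Hypothetical helper: builds a record and applies the restored check.
inline pg_missing_item make_missing_item(eversion_t need, eversion_t have) {
  pg_missing_item item;
  item.need = need;
  item.have = have;
  // A default 'have' (0'0) means there is no usable on-disk version,
  // so the record is marked as a new object.
  if (have == eversion_t())
    item.clean_regions.mark_object_new();
  return item;
}
```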
Ronen Friedman [Fri, 2 May 2025 08:03:15 +0000 (03:03 -0500)]
osd/scrub: check all(*) conditions in restrictions_on_scrubbing()
Modified OsdScrub::restrictions_on_scrubbing() to check all(*)
conditions, instead of stopping at the first one that is true.
The "new" (since Tentacle) scrub-type-to-conditions mapping is no
longer a simple one (is not "monotonic" in the sense of restrictions
always being removed as the scrub type is more important),
and the caller may want to know them all.
(*) The somewhat costly check for the random backoff is still only
performed if the OSD is not already running too many scrubs.
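The "check all conditions" behavior can be sketched as follows. The struct fields, `OsdState`, and `restrictions_sketch` are hypothetical names chosen for illustration and do not reproduce the OsdScrub code; the sketch assumes a caller-supplied callable for the random backoff roll.

```cpp
// Hypothetical result type: one flag per restriction that currently applies.
struct Restrictions {
  bool max_concurrency_reached = false;
  bool random_backoff_active = false;
  bool load_too_high = false;
  bool outside_allowed_hours = false;
};

// Hypothetical snapshot of the OSD state relevant to scrubbing.
struct OsdState {
  int running_scrubs;
  int max_scrubs;
  bool load_too_high;
  bool in_allowed_hours;
};

// Collects every applicable restriction instead of returning at the first.
// The backoff roll is assumed to be the costly check, so it is only
// evaluated when the OSD is not already running too many scrubs.
template <typename BackoffFn>
Restrictions restrictions_sketch(const OsdState& s, BackoffFn&& roll_backoff) {
  Restrictions r;
  if (s.running_scrubs >= s.max_scrubs) {
    r.max_concurrency_reached = true;
  } else if (roll_backoff()) {
    r.random_backoff_active = true;
  }
  // The remaining conditions are always evaluated, even if one is already set.
  r.load_too_high = s.load_too_high;
  r.outside_allowed_hours = !s.in_allowed_hours;
  return r;
}
```

Returning the full set lets a caller decide per scrub type which restrictions matter, which a first-match return cannot support when the mapping is not monotonic.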
doc/rados: Update mClock doc on steps to override OSD IOPS capacity config
Describe the steps involved to
- Specify a global value for osd_mclock_max_capacity_iops_{ssd,hdd}, and
- Override existing individually scoped values for OSDs determined during
start-up for osd_mclock_max_capacity_iops_{ssd,hdd}.
The above is to help with the following:
- Override an existing setting with a global value.
- Reduce the number of entries in the mon store by using a single
global specification for all OSDs in the cluster, in case the underlying
hardware is the same for all OSDs.
crimson: Create the shared promise before waited upon
The RecoveryBackend::pushes map creates each shared_promise
in the wait_for_pushes call. There can be a situation where
set_pushed is called, because a push reply was handled (handle_push_reply),
before the shared_promise was even constructed, and backfill progress
gets stuck as a result.
Samuel Just [Sat, 5 Apr 2025 01:57:33 +0000 (18:57 -0700)]
crimson: fix DynamicPerfStats usage in ClientRequest
ClientRequest::get_connection() returns l_conn, which will be
null by the time PG::add_client_request_lat is called in
ClientRequest::do_process. Modify get_connection() to
return a Connection& from whichever of l_conn or r_conn
isn't null.
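A minimal sketch of the accessor shape described above, using stand-in types. `Connection` and `ClientRequestSketch` are hypothetical, and plain `std::shared_ptr` is used here in place of whatever reference types crimson actually holds.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a connection object.
struct Connection {
  int id;
};

struct ClientRequestSketch {
  std::shared_ptr<Connection> l_conn;
  std::shared_ptr<Connection> r_conn;

  // Returns a reference to whichever connection is currently set.
  // Assumes at least one of the two is non-null.
  Connection& get_connection() const {
    assert(l_conn || r_conn);
    return l_conn ? *l_conn : *r_conn;
  }
};
```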
Samuel Just [Thu, 3 Apr 2025 03:42:11 +0000 (03:42 +0000)]
crimson: remove CommonClientRequest, move do_recover_missing to PG
do_recover_missing was the only thing left, and inheriting from a class
to get a static method is somewhat confusing. Simply move
do_recover_missing to PG.
scan_for_backfill was split into scan_for_backfill_primary and
scan_for_backfill_replica.
The fix from:
https://github.com/ceph/ceph/pull/62837/commits/88432ebd7432c513ccd495e77425401beddb9953
was only copied to the replica version.
rgw/sts: adding validation of jwks_uri cert according
to https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_create_oidc_verify-thumbprint.html
for n&e which can be later used for all key types
(x5c, n&e).
rgw: utilize is_impersonating for forwarded sts requests
With the introduction of is_impersonating in SysReqApplier,
RoleApplier can now use the same mechanism to mark when a request
has been forwarded by a system user on behalf of another role (e.g.,
through STS) to mark it as a system request (s->system_request).
In rgw_sync_pipe_params, the mode can be either system or user.
When in system mode, no user is involved, but the current
implementation holds an empty rgw_user, which can cause confusion
in pipe_rules::find_basic_info_without_tags().
With this change, rgw_user is now optional, ensuring that when no
user is involved, it is explicitly nullopt rather than an empty object.
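The difference between an empty user object and an absent one can be sketched as follows. `UserId`, `PipeMode`, `PipeParams`, and `user_matches` are hypothetical stand-ins and not the rgw types; the sketch assumes C++17 for `std::optional`.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for rgw_user.
struct UserId {
  std::string id;
};

enum class PipeMode { System, User };

// Hypothetical stand-in for the sync pipe parameters.
struct PipeParams {
  PipeMode mode = PipeMode::System;
  // Absent in system mode, rather than a present-but-empty user.
  std::optional<UserId> user;
};

// A lookup can only match when a user is actually set, so an empty uid
// never matches a system-mode pipe by accident.
inline bool user_matches(const PipeParams& p, const std::string& uid) {
  return p.user.has_value() && p.user->id == uid;
}
```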
Seena Fallah [Fri, 28 Mar 2025 20:55:20 +0000 (21:55 +0100)]
rgw: remote copy obj pass rgwx-perm-check-uid for perm evaluation
When copying an object from a remote source (a bucket in another zonegroup),
the permissions of the source are not evaluated, which allows reading from
unauthorized buckets.
Passing `rgwx-perm-check-uid` lets the source zone evaluate the
permissions and closes this bug.
Seena Fallah [Fri, 28 Mar 2025 20:52:47 +0000 (21:52 +0100)]
rgw: RGWRadosPutObj evals source bucket perm for backward compatibility
As of a3f40b4 we no longer evaluate perms locally for the source bucket.
This could break permission evaluation during an upgrade, while one
zone does not yet respect the perm evaluation based on the
`rgwx-perm-check-uid` arg.
Seena Fallah [Fri, 28 Mar 2025 20:48:34 +0000 (21:48 +0100)]
rgw: give hint via header for perm evaluation in GetObj
Return the `Rgwx-Perm-Checked` header as a hint for the destination zone
to know whether the perms were considered or not.
This is only for backward compatibility during upgrades and can be dropped
in the T+2 release.
Seena Fallah [Thu, 27 Feb 2025 10:53:44 +0000 (11:53 +0100)]
rgw: take account GetObject(Version)Tagging when replicating
In case the uid has no permission to read tagging, the tags should
not be replicated.
Ref. https://docs.aws.amazon.com/AmazonS3/latest/userguide/setting-repl-config-perm-overview.html
Seena Fallah [Mon, 24 Feb 2025 22:41:13 +0000 (23:41 +0100)]
rgw: check source object replication by replication actions
Check for permissions of `s3:GetObjectVersionForReplication` in
addition to `s3:GetObject` and `s3:GetObjectVersion` when fetching
the object for multisite.
Seena Fallah [Mon, 24 Feb 2025 22:33:45 +0000 (23:33 +0100)]
rgw: only allow system override if identity is not impersonating
Since multisite now delegates permission checks for source objects
to the source zone (a3f40b4), we need to avoid allowing system-level
overrides when the request is impersonating another identity.
SysReqApplier should only grant override permission if the request
is truly system-authenticated and not acting on behalf of another
user or role (i.e., no rgwx-perm-check-uid or rgwx-perm-check-role
in the request).
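The rule above reduces to a small predicate. The sketch below uses hypothetical names (`RequestArgs`, `is_impersonating`, `allow_system_override`) and models only the presence of the two request arguments mentioned in the message, not the SysReqApplier code.

```cpp
// Hypothetical view of the request: whether each argument is present.
struct RequestArgs {
  bool has_perm_check_uid;   // rgwx-perm-check-uid present
  bool has_perm_check_role;  // rgwx-perm-check-role present
};

// A request acts on behalf of another identity if either argument is set.
inline bool is_impersonating(const RequestArgs& args) {
  return args.has_perm_check_uid || args.has_perm_check_role;
}

// The override is granted only to a system-authenticated request that is
// not impersonating another user or role.
inline bool allow_system_override(bool system_authenticated,
                                  const RequestArgs& args) {
  return system_authenticated && !is_impersonating(args);
}
```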