Ronen Friedman [Sat, 17 Aug 2024 16:08:19 +0000 (11:08 -0500)]
osd/scrub: delay both targets on some failures
If the failure of a scrub-job is due to a condition that affects
both targets, both should be delayed. Otherwise, we may end up
with the following bogus scenario:
A high priority deep target is scheduled, but scrub session initiation
fails due to, for example, a concurrent snap trim. The deep target
will be delayed. A second initiation attempt may happen after the
snap trimming is done, but before the deep target's updated not-before
time. As a result, the lower priority target will be scheduled before
the higher priority one, which is a bug.
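The scenario above can be sketched as follows. This is a minimal illustration only, not the scrubber's actual code: the class names mirror the commit message, but the fields, the integer urgency encoding, and the delay_on_failure() and pick_ready() helpers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SchedTarget:
    level: str         # "shallow" or "deep"
    urgency: int       # higher means more urgent (assumed encoding)
    not_before: float  # earliest time the target may be scheduled


@dataclass
class ScrubJob:
    shallow: SchedTarget
    deep: SchedTarget

    def delay_on_failure(self, failed, now, delay, affects_both):
        """Push back the failed target, or both when the cause is shared."""
        targets = (self.shallow, self.deep) if affects_both else (failed,)
        for t in targets:
            t.not_before = max(t.not_before, now + delay)

    def pick_ready(self, now):
        """Return the most urgent target whose not-before has passed."""
        ready = [t for t in (self.shallow, self.deep) if t.not_before <= now]
        return max(ready, key=lambda t: t.urgency, default=None)


def make_job():
    # Hypothetical job: a low-urgency shallow target and a high-urgency deep one.
    return ScrubJob(SchedTarget("shallow", 1, 0.0), SchedTarget("deep", 5, 0.0))
```

If only the deep target is delayed, a retry that lands between the end of the snap trim and the new not-before picks the shallow target. Delaying both keeps the deep target first in line once the delay expires.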
Ronen Friedman [Thu, 15 Aug 2024 13:17:48 +0000 (08:17 -0500)]
osd/scrub: reverse OSDRestrictions flags polarity
As most of the flags in OSDRestrictions are of 'true is bad' polarity,
reverse the two non-conforming flags - cpu load and time-of-day
restrictions - to match.
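A small sketch of why a uniform polarity is convenient. The class and all field names below are hypothetical and are not taken from the OSDRestrictions definition; the only thing carried over from the commit message is the 'true is bad' convention.

```python
from dataclasses import dataclass, fields


@dataclass
class OSDRestrictionsSketch:
    # Every flag reads "True means scrubbing is restricted".
    high_priority_only: bool = False
    recovery_in_progress: bool = False
    cpu_overloaded: bool = False   # hypothetical name, opposite of a "load is low" flag
    restricted_time: bool = False  # hypothetical name, opposite of a "time permits" flag

    def any_restriction(self):
        """With one polarity, a single any() covers all flags."""
        return any(getattr(self, f.name) for f in fields(self))
```

With mixed polarity, the same check would need a per-flag negation, which is easy to get wrong.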
This flag was used to indicate that a deep scrub should
be performed if a shallow scrub finds an error. It was
always set true for shallow, regular, scrubs - if
can_autorepair flag was set. Thus, the ephemeral flag in
the requested_scrub_t object is not really needed.
Ronen Friedman [Tue, 6 Aug 2024 13:07:17 +0000 (08:07 -0500)]
qa/standalone/scrub: disable scrub_extended_sleep test
Disabling osd-scrub-test.sh::TEST_scrub_extended_sleep,
as the test is no longer valid (updated code no longer
produces the same logs or the same behavior).
osd/scrub: OSD's scrub queue now holds SchedEntry-s
The OSD's scrub queue now holds SchedEntry-s, instead of ScrubJob-s.
The queue itself is implemented using the 'not_before_queue_t' class.
Note: this is not a stable state of the scrubber code. In the next
commits:
- modifying the way sched targets are modified and updated, to match the
new queue implementation.
- removing the 'planned scrub' flags.
Important note: the interaction of initiate_scrub() and pop_ready_pg()
is not changed by this commit. Namely:
Currently - pop..() loops over all eligible jobs, until it finds one
that matches the environment restrictions (which most of the time, as the
concurrency limit is usually reached, would be 'high-priority-only').
The other option is to maintain Sam's 'not_before_q' clean interface: we
always pop the top, and if that top fails the preconds tests - we delay and
re-push. This has the following troubling implications:
- it would take a long time to find a viable scrub job, if the problem
is related to, for example, 'no scrub'.
- local resources failure (inc_scrubs() failure) must be handled
separately, as we do not want to reshuffle the queue for this
very common case.
- but the real problem: unneeded shuffling of the queue, even as the
problem is not with the scrub job itself, but with the environment
(esp. no-scrub etc.).
This is a common case, and it would be wrong to reshuffle the queue
for that.
- and - remember that any change to a sched-entry must be done under PG
lock.
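The two options discussed above can be contrasted with a rough sketch. It assumes a plain list as the queue and a single 'high-priority-only' environment restriction; Entry, pop_ready_scan() and pop_top_and_repush() are hypothetical names, not the OSD's functions.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    pgid: str
    not_before: float
    high_priority: bool


def pop_ready_scan(queue, now, high_priority_only):
    """Scan the queue in order; remove and return the first permitted entry."""
    for i, e in enumerate(queue):
        if e.not_before > now:
            continue  # not ready yet
        if high_priority_only and not e.high_priority:
            continue  # blocked by the environment; the entry is left untouched
        return queue.pop(i)
    return None


def pop_top_and_repush(queue, now, high_priority_only, delay):
    """Pop the first ready entry; if the environment rejects it, delay and re-push."""
    ready = [e for e in queue if e.not_before <= now]
    if not ready:
        return None
    top = ready[0]
    queue.remove(top)
    if high_priority_only and not top.high_priority:
        top.not_before = now + delay  # entry is modified although it did nothing wrong
        queue.append(top)
        return None
    return top
```

In the second variant, an entry is pushed back because of the environment rather than because of anything in the entry itself, which is the reshuffling the note above objects to.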
osd/scrub: modify ScrubJob to hold two SchedTarget-s
ScrubJob will now hold two SchedTarget-s - two sets of scheduling
information (times, levels, etc.) for the next shallow and deep scrubs.
This is in preparation for the upcoming changes to the scheduling queue.
The change cannot stand on its own, as the partial implementation
creates some inconsistencies in the scheduling logic.
Specifically, here is what changes here, and how it differs from the
desired implementation:
- The OSD still maintains a queue of scrub jobs - one object only per
PG.
But now - each queue element holds two SchedTarget-s.
- When a scrub is initiated, the Scrubber is handed a ScrubJob object.
Only in the next commit will it also receive the ID of the selected
level. That causes some issues when re-determining the level of the
initiated scrub. A failure to match the queue "intent" results in
failures.
- the 'planned scrub' flags are still here, instead of directly
encoding the characteristics of the next scrub in the relevant
sched-entry.
- the 'urgency' levels do not cover the full required range of
behaviors and priorities.
osd/scrub: introducing the concept of a SchedEntry
SchedEntry holds the scheduling details for scrubbing a specific PG at
a specific scrub level. Namely - it identifies the [pg,level]
combination, the 'urgency' attribute of the scheduled scrub
(which determines most of its behavior and scheduling decisions)
and the actual time attributes for scheduling (target,
deadline, not_before).
Added a table detailing, for each type of scrub, what limitations apply
to it, and what restrictions are waived.
The following commits will reshape the ScrubJob objects to hold
two instances of SchedTarget-s - two wrappers around SchedEntry-s,
one for the next shallow scrub and one for the next deep scrub.
Sched-entries (wrapped in sched-targets) have a defined order:
For ready-to-scrub entries (those that have an n.b. in the past),
the order is first by urgency, then by target time (and then by
level - deep before shallow - and then by the n.b. itself).
'Future' entries are ordered by n.b., then urgency,
target time, and level.
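The ordering described above can be expressed as a sort key. This is a sketch under stated assumptions: the class and field names are illustrative, urgency is modeled as an integer where a larger value sorts first, and ready entries are placed ahead of future ones.

```python
from dataclasses import dataclass


@dataclass
class SchedEntrySketch:
    pgid: str
    level: str         # "deep" or "shallow"
    urgency: int       # assumed: larger value sorts first
    target: float
    not_before: float


def sort_key(entry, now):
    level_rank = 0 if entry.level == "deep" else 1  # deep before shallow
    if entry.not_before <= now:
        # Ready: urgency, target time, level, then not-before.
        return (0, -entry.urgency, entry.target, level_rank, entry.not_before)
    # Future: not-before, urgency, target time, then level.
    return (1, entry.not_before, -entry.urgency, entry.target, level_rank)


def ordered(entries, now):
    """Return the entries sorted as they would be considered at time 'now'."""
    return sorted(entries, key=lambda e: sort_key(e, now))
```

Because the key depends on the current time, an entry moves from the 'future' ordering to the 'ready' ordering once its not-before passes.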
NitzanMordhai [Tue, 28 Nov 2023 09:52:05 +0000 (09:52 +0000)]
mgr/rest: Trim request array and limit size
Presently, the requests array in the REST module has the potential to grow
indefinitely, leading to excessive memory consumption, particularly when
dealing with lengthy and intricate request results.
To address this issue, a limit will be imposed on the requests array within
the REST module.
This limitation will be governed by the `mgr/restful/x/max_requests` configuration
parameter specific to the REST module.
When submit_request is called, we check whether the requests array exceeds
the max_requests option. If it does, we check whether the requests about to
be trimmed have finished, and log an error message if we are trimming
unfinished requests.
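A minimal sketch of the trimming step, assuming the oldest requests are the ones dropped. RequestSketch, the logger name, and the return value are hypothetical and exist only to make the example self-contained; this is not the REST module's code.

```python
import logging

log = logging.getLogger("rest_sketch")


class RequestSketch:
    def __init__(self, name, finished):
        self.name = name
        self.finished = finished


def submit_request(requests, new_request, max_requests):
    """Append a request, then trim the oldest entries beyond max_requests.

    Returns the names of trimmed requests that had not finished.
    """
    requests.append(new_request)
    excess = len(requests) - max_requests
    unfinished = []
    if excess > 0:
        for old in requests[:excess]:
            if not old.finished:
                log.error("trimming unfinished request %s", old.name)
                unfinished.append(old.name)
        del requests[:excess]
    return unfinished
```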
Tobias Urdin [Thu, 15 Aug 2024 15:17:14 +0000 (17:17 +0200)]
qa: barbican: restrict python packages with upper-constraints
We install barbican by doing a pip install directly on the
cloned git repository but we don't honor the upper-constraints
from the OpenStack Requirements project, which specifies which
versions are supported.
This changes the pip install command that we issue when
installing barbican to honor the requirements for the
version (derived from the branch) that we use, in
this case it's the 2023.1 release upper-constraints [1].
This prevents us from pulling in untested Python packages.
This only updates Barbican because for the Keystone job
we don't issue pip directly; we install with tox, using the
`venv` environment, which already sets the constraints by
default, as you can see in [2].
Yuval Lifshitz [Mon, 19 Aug 2024 10:37:07 +0000 (13:37 +0300)]
Merge pull request #59239 from yuvalif/wip-yuval-67513
Reviewed-By: Casey Bodley <cbodley@ibm.com>
test/rgw/notification: use real ip address instead of localhost
Based on this comment:
https://tracker.ceph.com/issues/67206#note-6
the endpoint now uses the real IP address of the host where the
test script is running, rather than localhost.
We also changed the rabbitmq-server conf to allow the "guest"
user to connect over a non-localhost address.
The commit starts to submit OOL writes before submitting the journal
write, true, but it cannot guarantee that OOL writes finish before the
journal write.
Thus it is possible that, during a SeaStore restart, a journal record
appears valid but its dependent OOL records are only partially written,
which leads to corruption.
Zac Dover [Sat, 17 Aug 2024 03:44:30 +0000 (13:44 +1000)]
doc/cephfs: s/mountpoint/mount point/
Change the string "mountpoint" to "mount point" in English-language
strings (as opposed to in commands, where the string "mountpoint"
sometimes appears and is correct).
cf. https://github.com/ceph/ceph/pull/58908#discussion_r1697715486 in
which page 345 of The IBM Style Guide is referenced to back up this
change.
This commit alters only English-language text and example commands in
which the string "{mount point}" is meant to be replaced. No commands
meant for cutting-and-pasting have been altered in this commit.
Zac Dover [Sat, 17 Aug 2024 03:37:58 +0000 (13:37 +1000)]
doc/cephfs: s/mountpoint/mount point/
Change the string "mountpoint" to "mount point" in English-language
strings (as opposed to in commands, where the string "mountpoint"
sometimes appears and is correct).
cf. https://github.com/ceph/ceph/pull/58908#discussion_r1697715486
in which page 345 of The IBM Style Guide is referenced to back up this
change.
Yuval Lifshitz [Thu, 15 Aug 2024 14:34:57 +0000 (14:34 +0000)]
test/rgw/notification: use real ip address instead of localhost
Based on this comment:
https://tracker.ceph.com/issues/67206#note-6
the endpoint now uses the real IP address of the host where the
test script is running, rather than localhost.
We also changed the rabbitmq-server conf to allow the "guest"
user to connect over a non-localhost address.
Xiubo Li [Mon, 29 Jul 2024 06:20:41 +0000 (14:20 +0800)]
client: flush the caps release in filesystem sync
We have hit a race between cap releases and a cap revoke request
that causes check_caps() to miss sending a cap revoke ack
to the MDS. The client then depends on the cap release to release
the caps being revoked, which could be delayed for unknown reasons.
In the kclient we have identified the root cause of the race. A
way to explicitly trigger the cap release flush manually could help
get rid of the stuck caps revoke issue.
Fixes: https://tracker.ceph.com/issues/67221
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Adding a new oauth2-proxy service. The enable_auth flag enables SSO
authentication via the oauth2-proxy service. The user must ensure the
oauth2-proxy service is deployed before enabling this flag in the
mgmt-gateway service.
FQDN related changes: previously, we were obtaining the FQDN using a
call to the Python socket library run inside the container. While this
generally works, the FQDN returned inside a container can sometimes
differ from the one obtained outside the container. This discrepancy
could cause some issues. To ensure consistency, we now use the FQDN
from the inventory, which provides the correct value as recognized on the host.
Ramana Raja [Sun, 11 Aug 2024 02:18:07 +0000 (22:18 -0400)]
rbd: fix CLI output of `rbd group snap info` command
... when a group snapshot has no member images.
A group snapshot can be created with no member images. For such a group
snapshot, omit the 'image snap' and 'images' fields from the
unformatted CLI output of `rbd group snap info` command so as to not
confuse the user. In the librbd C/C++ data structures representing a
group snapshot with no member images, set the 'image_snap_name' data
member to an empty string.
Fixes: https://tracker.ceph.com/issues/67436
Signed-off-by: Ramana Raja <rraja@redhat.com>