Zac Dover [Wed, 4 Dec 2024 02:13:05 +0000 (03:13 +0100)]
doc/rados: fix sentences in health-checks (3 of x)
Make the sentences at the head of each section of doc/rados/operations/health-checks.rst agree in mood. The sentences were sometimes in the imperative mood and sometimes in the declarative mood.
This commit edits the final third of
doc/rados/operations/health-checks.rst.
Note to (I hope soon) future Zac: There are a couple of places near
the end of this file where the sentences are ungrammatical. Update these
in a separate PR (in isolation, so that the grammar and technical
accuracy of these sentences can be the primary focus of the reviewers).
Zac Dover [Tue, 3 Dec 2024 11:02:43 +0000 (12:02 +0100)]
doc/rados: fix sentences in health-checks (2 of x)
Make the sentences at the head of each section of doc/rados/operations/health-checks.rst agree in mood. The sentences were sometimes in the imperative mood and sometimes in the declarative mood.
This commit edits the second third of
doc/rados/operations/health-checks.rst.
Zac Dover [Tue, 3 Dec 2024 08:28:09 +0000 (09:28 +0100)]
doc/rados: make sentences agree in health-checks.rst
Make the sentences at the head of each section of doc/rados/operations/health-checks.rst agree in mood. The sentences were sometimes in the imperative mood and sometimes in the declarative mood.
This commit edits the first third of
doc/rados/operations/health-checks.rst.
Zac Dover [Sat, 30 Nov 2024 16:50:53 +0000 (17:50 +0100)]
doc/glossary.rst: add "Dashboard Plugin"
Add an entry for the (Mimic-era and therefore outdated, but nonetheless historically important) key word "Dashboard Plugin", which before now had never been added to the glossary.
Zac Dover [Fri, 29 Nov 2024 03:12:02 +0000 (13:12 +1000)]
doc/radosgw: update rgw_dns_name doc
Update doc/radosgw/s3/commons.rst with the changes made by Jiffin Tony
Thottan in https://github.com/ceph/ceph/pull/54524 and the suggestions
made in that same PR by Anthony D'Atri.
Explain how to set rgw_dns_name to a domain name in order to configure
access to virtual hosted buckets.
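For illustration only (the hostname, the config section, and the wildcard DNS record are assumptions made here, not part of this commit), the setup described in that section might look like this:
    # Point rgw_dns_name at the domain covered by a wildcard DNS record
    # (e.g. *.s3.example.com) that resolves to the RGW endpoints.
    ceph config set client.rgw rgw_dns_name s3.example.com
    # Restart the RGW daemons so the new value takes effect
    # (the service name "rgw.default" is an assumption).
    ceph orch restart rgw.default
With rgw_dns_name set this way, a bucket named "mybucket" can be addressed as mybucket.s3.example.com (virtual hosted style) instead of only by a path-style URL.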
common/options: Change HDD OSD shard configuration defaults for mClock
Based on tests performed at scale on an HDD-based cluster, it was found that scheduling with mClock was not optimal with multiple OSD shards. For example, in the scaled cluster with multiple OSD node failures, the client throughput was found to be inconsistent across test runs, and multiple slow requests were reported.
However, the same test with a single OSD shard and with multiple worker
threads yielded significantly better results in terms of consistency of
client and recovery throughput across multiple test runs.
For more details see https://tracker.ceph.com/issues/66289.
Therefore, as an interim measure until the issue with multiple OSD shards (or multiple mClock queues per OSD) is investigated and fixed, the default HDD OSD shard configuration is changed so that each HDD OSD uses a single shard with multiple worker threads per shard (see the sketch below).
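As a hedged sketch only (the exact new default values are not restated here; one shard with several worker threads per shard follows from the test description above, and the specific numbers shown are illustrative), the equivalent override on an existing cluster would look something like this:
    # Illustrative values: a single HDD OSD shard with multiple worker threads.
    ceph config set osd osd_op_num_shards_hdd 1
    ceph config set osd osd_op_num_threads_per_shard_hdd 5
    # These options are read at OSD startup, so restart the OSDs afterwards.
This is the same kind of override described in the troubleshooting entry mentioned below for clusters running older releases.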
The other changes in this commit include:
- Doc change to the OSD and mClock config reference describing
this change.
- An OSD troubleshooting entry describing the procedure to change the shard configuration on clusters running older releases that are affected by this issue.
- A release note for this change.
Conflicts:
doc/rados/troubleshooting/troubleshooting-osd.rst
- Included the troubleshooting entry before the "Flapping OSDs" section.
PendingReleaseNotes
- Moved the release note under the 19.0.0 section.
Zac Dover [Sat, 23 Nov 2024 12:32:13 +0000 (22:32 +1000)]
doc/cephadm: Clarify "Deploying a new Cluster"
Change the title of the section "Deploying a new Ceph cluster" to "Using
cephadm to Deploy a New Ceph Cluster". This is part of the initiative to
separate package-related documentation from container-based
documentation.
rgw/multisite: in order to sleep between mdlog polling events, we check whether mdlog_marker has been modified by comparing mdlog_marker and max_marker. But max_marker is exposed to changes from RGWReadMDLogEntriesCR, and a race with mdlog trimming can render max_marker empty, in which case the comparison that gates mdlog polling can be incorrect.
To fix this, we now save the previous mdlog marker and compare it with the updated mdlog marker.
Oshrey Avraham [Mon, 18 Nov 2024 10:06:22 +0000 (12:06 +0200)]
rgw/notification: fix segmentation fault and topic listing logic
- Fixed a segmentation fault caused by a null bucket pointer in RGWPSListTopicsOp::execute()
- Corrected the logic to use get_topics_v2 when supported, with a fallback otherwise
Zac Dover [Tue, 19 Nov 2024 00:37:56 +0000 (10:37 +1000)]
doc/start: update os-recommendations.rst
Remove information about the operating systems that support Ceph's official container images from the "Platforms" table in doc/start/os-recommendations.rst and add that information to a new table dedicated to the operating systems that support those container images.
Credit for this change should go to Enrico Bocchi, who noticed a
discrepancy that motivated it.
Zac Dover [Mon, 11 Nov 2024 23:31:28 +0000 (09:31 +1000)]
doc/rados: correct "full ratio" note
Correct a note that directed users not to add an OSD after the cluster
has reached its "full ratio". The note now says "Do not let your cluster
reach its full ratio before adding an OSD."
Hat tip: Oskar Berggren
Fixes: https://tracker.ceph.com/issues/68900
Co-authored-by: Oskar Berggren <oskar.berggren@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit f1a2637c79a15c26a769661dd72ca68d766b2f0d)
Laura Flores [Mon, 28 Oct 2024 22:40:13 +0000 (22:40 +0000)]
mgr/balancer: optimize 'balancer status detail'
Before, we were updating the balancer status by iterating through all pg upmap entries. This was affecting the loading time of other mgr modules on clusters with a large number of PGs (600+). This can be optimized by simply pulling from the incremental instead.
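For reference, the command whose handling this change speeds up is the balancer module's detailed status query:
    # Detailed balancer status; previously this walked every pg upmap entry.
    ceph balancer status detail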
Zac Dover [Tue, 29 Oct 2024 07:27:43 +0000 (17:27 +1000)]
doc/start: separate package chart from container chart
Separate the packages-and-containers chart into two charts:
(1) a chart that shows which OSes Ceph builds packages for
(2) a chart that shows which OSes support Ceph's containers
common,osd: Use last valid OSD IOPS value if measured IOPS is unrealistic
The OSD's IOPS capacity is used by the mClock scheduler to determine the
quantum of bandwidth allocation for the various operations on the OSD.
Prior to this commit, maybe_override_max_osd_capacity_for_qos() only checked whether the measured IOPS capacity exceeded the higher threshold defined by 'osd_mclock_iops_capacity_threshold_[hdd|ssd]' and, if so, fell back to the last valid or the default IOPS capacity as defined by osd_mclock_max_capacity_iops_[hdd|ssd].
It's quite possible that the reported IOPS is unrealistically low. This could be due to transient factors on the underlying device, or it could indicate bad health of the device. Either way, the safer option is to fall back to the last valid or the default IOPS setting for that OSD in order to avoid cluster performance issues (slow or stalled ops) down the line.
Therefore, to handle this case, this commit introduces additional config options, namely:
- osd_mclock_iops_capacity_low_threshold_hdd (set to 50 IOPS), and
- osd_mclock_iops_capacity_low_threshold_ssd (set to 1000 IOPS).
If the measured IOPS capacity doesn't fall within the low and high
threshold range, the default or the last valid IOPS capacity is used.
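For illustration (the OSD id and the override value are assumptions, not part of this commit), an operator who hits this case can inspect the IOPS capacity in effect and, if the measured value is known to be wrong, pin an explicit one:
    # Show the IOPS capacity currently in effect for one OSD (osd.0 is an example).
    ceph config show osd.0 osd_mclock_max_capacity_iops_hdd
    # Optionally override it with a value obtained from an external benchmark
    # (315 IOPS is purely illustrative).
    ceph config set osd.0 osd_mclock_max_capacity_iops_hdd 315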
The existing cluster log warning is suitably modified to convey the
reason.
Additionally, for a couple of valgrind-related teuthology tests, the cluster warning is added to the ignorelist, since the reported IOPS can be very low due to slowness.
Zac Dover [Wed, 6 Nov 2024 12:22:14 +0000 (22:22 +1000)]
doc/cephadm: link to "host pattern" matching sect
Link to the "Placement by Pattern Matching" section in
doc/cephadm/services/index.rst from the "Advanced OSD Service
Specifications" section in doc/cephadm/services/osd.rst.
Nizamudeen A [Mon, 4 Nov 2024 05:42:32 +0000 (11:12 +0530)]
mgr/dashboard: remove cherrypy_backports.py
Since it is mostly used only for older CherryPy versions, which we no longer support in any of our recent upstream releases, we can remove it completely.