Afreen [Wed, 6 Mar 2024 20:22:16 +0000 (01:52 +0530)]
mgr/dashboard: handle infinite values for pools
Fixes https://tracker.ceph.com/issues/64724
Issue:
======
JSON parsing fails because of Infinity values present in the pool
metadata, e.g. "read_balance": {"score_acting": Infinity, "score_stable":
Infinity}.
Because of this, the entire pool list is not rendered.
Fix:
====
Added a handler that checks for "inf" values and replaces them with the
string "Infinity" so that JSON parsing does not fail on the frontend.
Zac Dover [Fri, 13 Dec 2024 06:12:49 +0000 (16:12 +1000)]
doc/cephfs: edit 3rd 3rd of mount-using-kernel-driver
Edit the third third of doc/cephfs/mount-using-kernel-driver.rst in
preparation for correcting mount commands that may not work in Reef as
described in this documentation.
This commit edits only English-language strings in
doc/cephfs/mount-using-kernel-driver.rst. No technical content (that is,
no commands and no settings) has been altered in this commit.
Technical alterations to this file will be made only after the English
is unambiguous.
This PR follows these two PRs:
https://github.com/ceph/ceph/pull/61048 - 1st 3rd
https://github.com/ceph/ceph/pull/61049 - 2nd 3rd
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 9c7580a2935511d009c9e66885e76635aa504ee8)
Zac Dover [Wed, 4 Dec 2024 20:43:12 +0000 (21:43 +0100)]
doc/dev: instruct devs to backport
Add a note to doc/dev/development-workflow.rst that instructs developers
to do their own backports. This change was requested by Laura Flores on
04 Dec 2024.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 5d584b4badb606d372c266424f59076408f62f40)
Adam King [Tue, 3 Dec 2024 20:22:22 +0000 (15:22 -0500)]
qa/tasks/nvme_loop: update task to work with new nvme list format
Specifically, on some CentOS 9 tests we've seen that a newer
version of an nvme-related package causes this task to fail
with "KeyError: 'DevicePath'" because the format of the output
of the "nvme list" command has changed. This patch adds handling for
the new format we've seen while still supporting the old
format (necessary for the tests running on Ubuntu).
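A rough sketch of handling both layouts; the key names assumed for the newer nvme-cli output ("Subsystems", "Namespaces", "NameSpace") may differ from the real format:

    import json
    import subprocess


    def nvme_device_paths():
        """Return device paths from 'nvme list -o json', handling both the
        old flat layout (per-device 'DevicePath') and an assumed newer nested
        layout. Key names for the new layout are assumptions."""
        out = subprocess.check_output(["nvme", "list", "-o", "json"])
        data = json.loads(out)
        paths = []
        for dev in data.get("Devices", []):
            if "DevicePath" in dev:                  # old format
                paths.append(dev["DevicePath"])
            else:                                    # assumed new format
                for subsys in dev.get("Subsystems", []):
                    for ns in subsys.get("Namespaces", []):
                        name = ns.get("NameSpace")
                        if name:
                            paths.append("/dev/" + name)
        return paths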
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Fixes some doc lint and also fixes qa tests for having both 3 & 4 protocols
by default in the export config.
Avan Thakkar [Thu, 29 Aug 2024 10:39:14 +0000 (16:09 +0530)]
mgr/nfs: add additional tests for cmount_path & user_id deletion
Add unit tests for unique user ID generation, deletion, and `cmount_path` handling in FSAL exports:
- Ensure unique user ID generation for different FSAL blocks when creating exports.
- Test deletion behavior when multiple exports share the same user ID and one has a unique ID.
- Test default behavior when no `cmount_path` is provided (defaults to `/`).
- Add tests to validate error handling for invalid `cmount_path` values.
avanthakkar [Tue, 10 Oct 2023 17:06:28 +0000 (22:36 +0530)]
mgr/nfs: add cmount_path
Implement the changes necessary for the NFS module to align with the new export block format introduced in nfs-ganesha-V5.6.
The purpose of these changes is to enhance memory efficiency for exports. To achieve this goal, we have introduced a new field
called cmount_path under the FSAL block of the export. Initially, this is applicable only to CephFS-based exports.
Furthermore, newly created CephFS exports will now share the same user_id and secret_access_key, which are determined based
on the NFS cluster name and filesystem name. This results in each export on the same filesystem using a shared connection,
thereby optimizing resource usage.
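For illustration, a CephFS export block with the new field might look roughly like the following (all values are placeholders, not taken from this change):

    # Illustrative export spec with cmount_path in the FSAL block.
    # Cluster, filesystem, and path values are placeholders.
    export = {
        "export_id": 1,
        "path": "/volumes/group1/vol1",
        "cluster_id": "mycluster",
        "pseudo": "/cephfs",
        "access_type": "RW",
        "squash": "none",
        "protocols": [3, 4],
        "transports": ["TCP"],
        "fsal": {
            "name": "CEPH",
            "fs_name": "myfs",
            # Shared mount point for all exports of this filesystem; the
            # user_id is derived from the cluster and filesystem names.
            "cmount_path": "/",
            "user_id": "nfs.mycluster.myfs",
        },
        "clients": [],
    }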
Signed-off-by: avanthakkar <avanjohn@gmail.com>
mgr/nfs: fix a unit test failure
Signed-off-by: John Mulligan <jmulligan@redhat.com>
mgr/nfs: fix a unit test failure
Signed-off-by: John Mulligan <jmulligan@redhat.com>
mgr/nfs: fix a unit test failure
Signed-off-by: John Mulligan <jmulligan@redhat.com>
mgr/nfs: enable user management on a per-fs basis
Add back the ability to create a user for a cephfs export but do
it only for a cluster+filesystem combination. According to the
ganesha devs this ought to continue sharing a cephfs client connection
across multiple exports.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
mgr/nfs: add more unit tests with cmount_path
Add more unit tests for CephFS-based NFS exports with the
newly added cmount_path field under FSAL.
Signed-off-by: avanthakkar <avanjohn@gmail.com>
mgr/nfs: fix rgw nfs export when no existing exports
Signed-off-by: avanthakkar <avanjohn@gmail.com>
mgr/nfs: generate user_id & access_key for apply_export(CEPHFS)
Generate user_id & secret_access_key for CephFS-based exports
in apply_export. Also ensure the export FSAL block has
`cmount_path`.
- Improved validation to check cmount_path as a superset of path
- Replaced the format method with f-strings
- Added a generate_user_id function using a SHA-1 hash (see the sketch below)
- Enhanced error handling and integrated the new user_id generation
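A minimal sketch of the user-ID derivation, assuming the hash input is the cluster ID, filesystem name, and cmount_path; the exact inputs, truncation, and naming scheme are assumptions, not taken from this commit:

    import hashlib


    def generate_user_id(cluster_id: str, fs_name: str, cmount_path: str = "/") -> str:
        """Derive a deterministic cephx user ID for a CephFS NFS export so
        that exports of the same cluster+filesystem (+cmount_path) share one
        client connection. Hash inputs and format are assumptions."""
        digest = hashlib.sha1(f"{cluster_id}{fs_name}{cmount_path}".encode()).hexdigest()
        return f"nfs.{cluster_id}.{fs_name}.{digest[:8]}"


    # Two exports on the same filesystem map to the same user_id:
    assert generate_user_id("mycluster", "myfs") == generate_user_id("mycluster", "myfs")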
Dan Mick [Thu, 21 Nov 2024 02:18:59 +0000 (18:18 -0800)]
container/Containerfile: purge .repo files with secrets before commit
ceph.repo had creds in it for download.ceph.com/prerelease.
Remove the .repo files we construct, since they're not necessary
once the container is built (no one should be dnf'ing anything
in the container).
Dan Mick [Fri, 1 Nov 2024 02:55:36 +0000 (19:55 -0700)]
container/make-manifest-list.py
- don't print command failures in the worker; let the caller print them
  if desired (allow silent failure)
- allow for an empty tags list
- look for CEPH_SHA1. GIT_COMMIT was the sha1 of the ceph-container.git
  commit
- change default paths to prerelease
- add --dry-run to avoid the final push (see the sketch below)
- rename 'HOST' to 'CONTAINER_HOST'
- use ARCH_SPECIFIC_HOST instead of CONTAINER_HOST (which is used by podman)
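A hypothetical sketch of the --dry-run behaviour; argument names, the push command, and the overall structure are not taken from make-manifest-list.py:

    import argparse
    import subprocess


    def main() -> None:
        # Hypothetical: a dry run builds/prints the plan but skips the final push.
        parser = argparse.ArgumentParser()
        parser.add_argument("--dry-run", action="store_true",
                            help="prepare the manifest list but skip the final push")
        parser.add_argument("manifest", help="manifest list name, e.g. quay.io/org/ceph:tag")
        args = parser.parse_args()

        if args.dry_run:
            print(f"dry run: would push {args.manifest}")
            return
        subprocess.check_call(["podman", "manifest", "push", args.manifest])


    if __name__ == "__main__":
        main()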
Zac Dover [Wed, 4 Dec 2024 02:13:05 +0000 (03:13 +0100)]
doc/rados: fix sentences in health-checks (3 of x)
Make sentences agree at the head of each section in
doc/rados/operations/health-checks.rst. The sentences were sometimes in
the imperative mood and sometimes in the declarative mood.
This commit edits the third third of
doc/rados/operations/health-checks.rst.
Note to (I hope soon) future Zac: There are a couple of places near
the end of this file where the sentences are ungrammatical. Update these
in a separate PR (in isolation, so that the grammar and technical
accuracy of these sentences can be the primary focus of the reviewers).
Zac Dover [Tue, 3 Dec 2024 11:02:43 +0000 (12:02 +0100)]
doc/rados: fix sentences in health-checks (2 of x)
Make sentences agree at the head of each section in
doc/rados/operations/health-checks.rst. The sentences were sometimes in
the imperative mood and sometimes in the declarative mood.
This commit edits the second third of
doc/rados/operations/health-checks.rst.
Zac Dover [Tue, 3 Dec 2024 08:28:09 +0000 (09:28 +0100)]
doc/rados: make sentences agree in health-checks.rst
Make sentences agree at the head of each section in
doc/rados/operations/health-checks.rst. The sentences were sometimes in
the imperative mood and sometimes in the declarative mood.
This commit edits the first third of
doc/rados/operations/health-checks.rst.
Zac Dover [Sat, 30 Nov 2024 16:50:53 +0000 (17:50 +0100)]
doc/glossary.rst: add "Dashboard Plugin"
Add an entry for the (Mimic-era and therefore outdated but
nonetheless historically important) "Dashboard Plugin" key word to the
glossary, which before now had never included it.
Zac Dover [Fri, 29 Nov 2024 03:12:02 +0000 (13:12 +1000)]
doc/radosgw: update rgw_dns_name doc
Update doc/radosgw/s3/commons.rst with the changes made by Jiffin Tony
Thottan in https://github.com/ceph/ceph/pull/54524 and the suggestions
made in that same PR by Anthony D'Atri.
Explain how to set rgw_dns_name to a domain name in order to configure
access to virtual hosted buckets.
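For illustration only, a small sketch of how a client addresses a bucket once rgw_dns_name is set to a domain; the domain and bucket names are placeholders:

    # Virtual-hosted-style requests put the bucket in the host name, which is
    # why rgw_dns_name must be set to the serving domain.
    def virtual_hosted_url(rgw_dns_name: str, bucket: str, key: str = "") -> str:
        return f"https://{bucket}.{rgw_dns_name}/{key}"


    def path_style_url(rgw_dns_name: str, bucket: str, key: str = "") -> str:
        return f"https://{rgw_dns_name}/{bucket}/{key}"


    print(virtual_hosted_url("s3.example.com", "mybucket", "object.txt"))
    # -> https://mybucket.s3.example.com/object.txt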
common/options: Change HDD OSD shard configuration defaults for mClock
Based on tests performed at scale on an HDD-based cluster, it was found
that scheduling with mClock was not optimal with multiple OSD shards. For
example, in the scaled cluster with multiple OSD node failures, the client
throughput was found to be inconsistent across test runs, coupled with
multiple reported slow requests.
However, the same test with a single OSD shard and with multiple worker
threads yielded significantly better results in terms of consistency of
client and recovery throughput across multiple test runs.
For more details see https://tracker.ceph.com/issues/66289.
Therefore, as an interim measure until the issue with multiple OSD shards
(or multiple mClock queues per OSD) is investigated and fixed, the
following change to the default HDD OSD shard configuration is made:
The other changes in this commit include:
- Doc change to the OSD and mClock config reference describing
this change.
- OSD troubleshooting entry on the procedure to change the shard
configuration for clusters affected by this issue running on older
releases.
- Add release note for this change.
Conflicts:
doc/rados/troubleshooting/troubleshooting-osd.rst
- Included the troubleshooting entry before the "Flapping OSDs" section.
PendingReleaseNotes
- Moved the release note under 19.0.0 section.
Zac Dover [Sat, 23 Nov 2024 12:32:13 +0000 (22:32 +1000)]
doc/cephadm: Clarify "Deploying a new Cluster"
Change the title of the section "Deploying a new Ceph cluster" to "Using
cephadm to Deploy a New Ceph Cluster". This is part of the initiative to
separate package-related documentation from container-based
documentation.
rgw/multisite: fix the mdlog polling-delay check
To decide whether to sleep between mdlog polling events, we check that mdlog_marker has not
been modified by comparing mdlog_marker and max_marker. However, max_marker is exposed to changes
from RGWReadMDLogEntriesCR, and a race with mdlog trimming can render max_marker empty, in which
case the comparison used for mdlog polling can be incorrect.
To fix this, we now save the previous mdlog marker and compare it with the updated mdlog marker.
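An illustrative sketch of the corrected polling-delay logic (the real code is a C++ RGW coroutine; names here are placeholders):

    import time

    POLL_INTERVAL = 20  # seconds; placeholder value


    def poll_mdlog(read_entries, process):
        """Sketch only: remember the marker from the previous poll and sleep
        only when it has not advanced, instead of comparing against
        max_marker, which a concurrent mdlog trim can reset to empty."""
        prev_marker = ""
        while True:
            entries, marker = read_entries()   # analogous to RGWReadMDLogEntriesCR
            if entries:
                process(entries)
            if marker == prev_marker:          # no new entries since the last poll
                time.sleep(POLL_INTERVAL)
            prev_marker = marker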
Oshrey Avraham [Mon, 18 Nov 2024 10:06:22 +0000 (12:06 +0200)]
rgw/notification: fix segmentation fault and topic listing logic
- Fixed a segmentation fault caused by a null bucket pointer in RGWPSListTopicsOp::execute()
- Corrected logic to use get_topics_v2 when supported, with fallback otherwise
Zac Dover [Tue, 19 Nov 2024 00:37:56 +0000 (10:37 +1000)]
doc/start: update os-recommendations.rst
Remove information about the operating systems that support Ceph's
official container images from the "Platforms" table in
doc/start/os-recommendations.rst and add that information to the (new)
table that shows the operating systems that support Ceph's official
container images.
Credit for this change should go to Enrico Bocchi, who noticed a
discrepancy that motivated it.
Zac Dover [Mon, 11 Nov 2024 23:31:28 +0000 (09:31 +1000)]
doc/rados: correct "full ratio" note
Correct a note that directed users not to add an OSD after the cluster
has reached its "full ratio". The note now says "Do not let your cluster
reach its full ratio before adding an OSD."
Hat tip: Oskar Berggren
Fixes: https://tracker.ceph.com/issues/68900 Co-authored-by: Oskar Berggren <oskar.berggren@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit f1a2637c79a15c26a769661dd72ca68d766b2f0d)