Zac Dover [Thu, 8 May 2025 02:29:25 +0000 (12:29 +1000)]
doc/mgr: edit alerts.rst
Edit doc/mgr/alerts.rst as part of the project to determine where the
error is in https://github.com/ceph/ceph/pull/62782 that prevents the
Jenkins tests from passing.
This commit adds to the work done in
https://github.com/ceph/ceph/pull/62782 by correcting some of the
English that was present in that PR.
This is a change to one of twenty-five files in
https://github.com/ceph/ceph/pull/62782, and this commit represents one
of what will be at least twenty-five other commits made to track this
error down.
Zac Dover [Thu, 8 May 2025 00:08:06 +0000 (10:08 +1000)]
doc/mgr/ceph_api: edit index.rst
Edit doc/mgr/ceph_api/index.rst as part of the project to determine
where the error is in https://github.com/ceph/ceph/pull/62782 that
prevents the Jenkins tests from passing.
This is a change to one of twenty-five files in
https://github.com/ceph/ceph/pull/62782, and this commit represents one
of what will be at least twenty-five other commits made to track this
error down.
Correct the presentation of an example string in doc/cephadm/rgw.rst in
order to obviate an error reading "rgw.rst:202: WARNING: Inline emphasis start-string without end-string."
Patrick Donnelly [Wed, 30 Apr 2025 12:36:28 +0000 (08:36 -0400)]
Merge PR #61287 into squid
* refs/pull/61287/head:
mds: add or update MDS thread names
log: cache recent threads up to a day
common: cache pthread names
log: concatenate thread names and print once per thread
Janne Heß [Mon, 28 Apr 2025 09:04:25 +0000 (11:04 +0200)]
ceph-volume: Fix splitting with too many parts
The data can be anything and also contain a `=`, causing the line to
fail with `Too many values to unpack`. In my case, it failed with
`ID_FS_LABEL=pvc_name=rook-ceph-lvm-data-44f2gc`.
Regression was introduced here: https://github.com/ceph/ceph/pull/60006
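A minimal sketch of the failure and the fix, assuming the lines being parsed are plain `KEY=VALUE` pairs. The function name `parse_properties` and the sample input are illustrative only and are not taken from ceph-volume:

```python
def parse_properties(lines):
    """Parse KEY=VALUE lines, splitting only on the first '='."""
    props = {}
    for line in lines:
        if "=" not in line:
            continue  # not a KEY=VALUE pair, skip it
        # maxsplit=1 keeps any further '=' characters inside the value
        key, value = line.split("=", 1)
        props[key] = value
    return props


sample = [
    "ID_FS_TYPE=ext4",
    "ID_FS_LABEL=pvc_name=rook-ceph-lvm-data-44f2gc",
]
props = parse_properties(sample)
```

With an unbounded `line.split("=")`, the second sample line yields three parts and a two-name unpacking raises ValueError.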
doc/rados: Update mClock doc on steps to override OSD IOPS capacity config
Describe the steps involved to
- Specify a global value for osd_mclock_max_capacity_iops_{ssd,hdd}, and
- Override existing individually scoped values for OSDs determined during
start-up for osd_mclock_max_capacity_iops_{ssd,hdd}.
The above is to help with the following:
- Override existing settings with a global value.
- Reduce the number of entries in the mon store by using a single
global specification for all OSDs in the cluster when the underlying
hardware is the same for all OSDs.
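The two steps can be sketched as a helper that only builds the `ceph config` command lines and does not run them. The helper name, the IOPS value, the OSD ids, and the order of the two steps are assumptions for illustration; the documentation being updated is the authority on the exact procedure:

```python
def override_iops_capacity_cmds(device_class, iops, osd_ids):
    """Build (but do not execute) the commands for a global override.

    device_class: "ssd" or "hdd"
    iops: the global capacity value to set (illustrative)
    osd_ids: OSDs whose individually scoped values should be removed
    """
    if device_class not in ("ssd", "hdd"):
        raise ValueError("device_class must be 'ssd' or 'hdd'")
    option = f"osd_mclock_max_capacity_iops_{device_class}"
    # One global entry for the whole cluster ...
    cmds = [["ceph", "config", "set", "global", option, str(iops)]]
    # ... and removal of each per-OSD entry that would otherwise win.
    cmds += [["ceph", "config", "rm", f"osd.{i}", option] for i in osd_ids]
    return cmds


cmds = override_iops_capacity_cmds("ssd", 50000, [0, 1])
```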
Ronen Friedman [Wed, 26 Jun 2024 15:02:19 +0000 (10:02 -0500)]
qa/standalone/scrub: fix osd-scrub-test.sh
following changes in scrub code
(cherry picked from commit 24647e87e8fba9b16d81730662b22798ed1885cb)
Conflict resolved by:
- electing to keep the up-to-date order between 'set noscrub' and 'set ..chunk_max'
in 'step 2'
Ville Ojamo [Sat, 26 Apr 2025 04:17:16 +0000 (11:17 +0700)]
doc/radosgw: Fix RST syntax rendered as text in oidc.rst
An empty line is required after starting a pre-formatted block with the
double-colon syntax. Without it, the double-colon does nothing: it is
rendered as-is as "::" and no pre-formatted block follows.
Add empty lines after the double-colon syntax so that the following
block is rendered pre-formatted.
Also add bash privileged prompts to a block with 2 example CLI commands.
librbd: disallow "rbd trash mv" if image is in a group
Removing an image that is a member of a group has always been
disallowed. However, moving an image that is a member of a group to
trash is currently allowed and this is deceptive -- the only reason for
a user to move an image to trash should be the intent to remove it.
More importantly, group APIs operate in terms of image names -- there
are no corresponding variants that would operate in terms of image IDs.
For example, even though internally GroupImageSpec struct stores an
image ID, the public rbd_group_image_info_t struct insists on an image
name. When rbd_group_image_list() encounters a trashed member image
(i.e. one that doesn't have a name), it just fails with ENOENT and no
listing gets produced at all until the offending image is restored from
trash. Something like this can be very hard to debug for an average
user, so let's make rbd_trash_move() fail with EMLINK the same way as
rbd_remove() does in this scenario.
The one case where moving a member image to trash makes sense is live
migration where the source image gets trashed to be almost immediately
replaced by the destination image as part of preparing migration.
EMLINK is returned by rbd_remove() if the image is a member of a group.
Add a dedicated exception similar to ImageBusy or ImageHasSnapshots and
a test for it.
Conflicts:
src/test/pybind/test_rbd.py [ commits 68eea0eb814e
("src/tools/rbd: add group info command to output group id")
and e5ccce14c4b0 ("rbd: add group snap info command") not in
squid ]
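The errno-to-exception mapping described above can be sketched without the real bindings. Everything here is a toy stand-in: `ImageInGroupError`, `check_ret()` and `trash_move()` are hypothetical names, not the rbd Python API:

```python
import errno
import os


class Error(Exception):
    """Base error for this sketch."""


class ImageInGroupError(Error):
    """Hypothetical dedicated exception for EMLINK."""


# Map specific errno values to dedicated exception types.
ERRNO_TO_EXCEPTION = {errno.EMLINK: ImageInGroupError}


def check_ret(ret, what):
    """Raise the exception matching a negative errno-style return code."""
    if ret >= 0:
        return
    exc = ERRNO_TO_EXCEPTION.get(-ret, Error)
    raise exc(f"{what}: {os.strerror(-ret)}")


def trash_move(image_name, group_members):
    """Toy stand-in: refuse to trash an image that belongs to a group."""
    ret = -errno.EMLINK if image_name in group_members else 0
    check_ret(ret, f"error moving image {image_name} to trash")


try:
    trash_move("img1", {"img1"})
    outcome = "moved"
except ImageInGroupError:
    outcome = "rejected"
```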
Ilya Dryomov [Tue, 25 Mar 2025 08:13:27 +0000 (09:13 +0100)]
mgr/rbd_support: always parse interval and start_time in Schedules::remove()
Commit 1b62447071a9 ("mgr/rbd_support: fix schedule remove") addressed
the issue that it was concerned with in a rather suboptimal way: instead
of moving the parsing of interval and start_time upfront to be able to
bail early, it wrapped from_string() constructors with try/finally and
left the conditional behavior in place.
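A minimal sketch of the "parse upfront, bail early" shape, using a toy schedule store. The parsers, the accepted formats, and the `Schedules` layout below are simplified assumptions and do not reproduce the rbd_support module:

```python
import re


def parse_interval(text):
    """Toy parser: '<n>m', '<n>h' or '<n>d' -> minutes."""
    m = re.fullmatch(r"(\d+)([mhd])", text)
    if not m:
        raise ValueError(f"invalid interval: {text!r}")
    return int(m.group(1)) * {"m": 1, "h": 60, "d": 1440}[m.group(2)]


def parse_start_time(text):
    """Toy parser: 'HH:MM' -> minutes since midnight."""
    m = re.fullmatch(r"([01]\d|2[0-3]):([0-5]\d)", text)
    if not m:
        raise ValueError(f"invalid start time: {text!r}")
    return int(m.group(1)) * 60 + int(m.group(2))


class Schedules:
    def __init__(self):
        self.schedules = {}  # spec id -> set of (interval, start_time)

    def add(self, spec_id, interval, start_time=None):
        item = (parse_interval(interval),
                parse_start_time(start_time) if start_time else None)
        self.schedules.setdefault(spec_id, set()).add(item)

    def remove(self, spec_id, interval=None, start_time=None):
        # Parse first: bad input raises before any state is touched,
        # whether or not a schedule exists for spec_id.
        iv = parse_interval(interval) if interval is not None else None
        st = parse_start_time(start_time) if start_time is not None else None

        items = self.schedules.get(spec_id)
        if items is None:
            return
        if iv is None:
            del self.schedules[spec_id]  # no interval: drop everything
            return
        items.discard((iv, st))
        if not items:
            del self.schedules[spec_id]


s = Schedules()
s.add("pool1", "1h", "02:00")
```

Because parsing no longer depends on whether a schedule exists, invalid arguments are reported the same way in every case.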
Ilya Dryomov [Fri, 21 Mar 2025 13:43:50 +0000 (14:43 +0100)]
librbd: respect rbd_default_snapshot_quiesce_mode in group_snap_create()
Make group_snap_create() behave the same as snap_create() and
mirror_image_create_snapshot(): APIs that don't take RBD_SNAP_CREATE_
flags explicitly should respect rbd_default_snapshot_quiesce_mode
option.
osd/scrub: additional configuration params to trigger scrub reschedule
Adding the following parameters to the (small) set of configuration
options that, if changed, trigger re-computation of the next scrub
schedule:
- osd_scrub_interval_randomize_ratio,
(not cherry-picked) - osd_deep_scrub_interval_cv, and
- osd_deep_scrub_interval (which was missing in the list of
parameters watched by the OSD).
Fixes: https://tracker.ceph.com/issues/70909
Original tracker: https://tracker.ceph.com/issues/70806
(cherry picked from commit d56f613d5a69797e727938f04b66aed747cfb6b1)
Conflicts resolved by removing refs to the deep_scrub_interval_cv
parameter, which does not yet exist in this version.
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
N Balachandran [Mon, 21 Apr 2025 11:34:08 +0000 (17:04 +0530)]
rbd: display correct mirror state when creating
The mirror image state is set to MIRROR_IMAGE_STATE_CREATING
when the image is first created on the secondary, but was displayed
as "unknown" by the rbd info command. This has been fixed.
Fixes: https://tracker.ceph.com/issues/70963
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
(cherry picked from commit f2e35646721ed3076e3da54124f8d783c456b2dc)
Yuval Lifshitz [Tue, 1 Oct 2024 15:19:46 +0000 (15:19 +0000)]
common: missing std include with GCC 14
In file included from src/rgw/driver/posix/bucket_cache.h:19,
from src/test/rgw/test_posix_bucket_cache.cc:4:
src/common/cohort_lru.h: In member function 'void cohort::lru::TreeX<T, TTree, CLT, CEQ, K, LK>::lock()':
src/common/cohort_lru.h:334:14: error: 'for_each' is not a member of 'std'
334 | std::for_each(locks.begin(), locks.end(),
| ^~~~~~~~
src/common/cohort_lru.h: In member function 'void cohort::lru::TreeX<T, TTree, CLT, CEQ, K, LK>::unlock()':
/home/yuvalif/ceph5/src/common/cohort_lru.h:339:14: error: 'for_each' is not a member of 'std'
339 | std::for_each(locks.begin(), locks.end(),
| ^~~~~~~~
rgw/async/notifications: use common async waiter in pubsub push
* use the "yield_waiter" and "waiter" from common/async instead of the "waiter"
implemented inside the bucket notification code (this is so we don't
need separate investigations for 2 implementations)
* added a unit test that simulates how a separate thread (kafka or amqp) is
resuming a coroutine which is created by either the frontend or the
notification manager.
Before using "defer", the unit test passes; however,
when executed under thread sanitizer (using the WITH_TSAN cmake flag),
the following errors are observed: https://0x0.st/Xp4P.txt
After using "defer", the unit test passes under TSAN without errors.
rgw: metadata and data sync fairness notifications to retry upon any error case
This is a complementary fix to the earlier one described at #62156.
When the sync shard notification fails for any reason, including a timeout,
this change keeps the loop going for both metadata and data sync.
Ville Ojamo [Thu, 10 Apr 2025 10:34:57 +0000 (17:34 +0700)]
doc/radosgw: Promptify CLI, cosmetic fixes
Use the more modern prompt block for CLI commands
and use the right prompt ($ vs #).
Fix indentation on JSON example outputs and
some CLI command switches.
Add some arguably missing commas in JSON example output.
Add a full stop at the end of a one-sentence paragraph.
Remove extra comma mid-sentence in another.
Fix missing backslashes or typo at end of multiline commands.
Make lines under section headings as long as the heading text.
Fix hyperlinks. Fix list items prefixed with - instead of *.
Format configuration syntax in the middle of text as code.
Fix typo "PI" to "API" and remove extra space.
Remove colons at the end of section headers in a few places.
Use Title Case in section titles consistently with short words lowercase.
Possibly controversial: don't add whitespace before and
after main title section header text.
Possibly controversial: don't indent line continuation
backslashes, leave only 1 space before them.
Adam Kupczyk [Tue, 1 Apr 2025 14:01:23 +0000 (14:01 +0000)]
os/bluestore/bluefs: Fix race condition between truncate() and unlink()
It was possible for unlink() to interrupt ongoing truncate().
As a result, unlink() finishes properly, but truncate() is not aware
of it and:
1) updates a file that is already removed
2) releases the same allocations again
Now fixed by checking whether the file is deleted under the FILE lock.
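The shape of the fix can be sketched with a toy model: the deleted flag is checked while holding the per-file lock, so a truncate that loses the race becomes a no-op. `File`, `ToyFS` and the extent lists are illustrative assumptions; BlueFS's real locking and allocator are considerably more involved:

```python
import threading


class File:
    def __init__(self, extents):
        self.lock = threading.Lock()  # stands in for the FILE lock
        self.deleted = False
        self.extents = list(extents)


class ToyFS:
    def __init__(self):
        self.released = []  # every extent handed back to the allocator

    def unlink(self, f):
        with f.lock:
            f.deleted = True
            self.released.extend(f.extents)
            f.extents.clear()

    def truncate(self, f, keep):
        """Keep the first `keep` extents; return False if the file is gone."""
        with f.lock:
            # Check under the lock: unlink() may have completed already.
            if f.deleted:
                return False
            self.released.extend(f.extents[keep:])
            del f.extents[keep:]
            return True


fs = ToyFS()
f = File([10, 11, 12])
fs.unlink(f)
truncated = fs.truncate(f, 1)
```

Without the check, the late truncate would update a removed file and could release extents a second time.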
Adam Kupczyk [Mon, 31 Mar 2025 20:49:26 +0000 (20:49 +0000)]
test/store_test: Expose race in BlueFS truncate / remove
Created test that exposes race between BlueFS::truncate and BlueFS::unlink.
Test requires injection of 1ms sleep to BlueFS::truncate.
Therefore, in this form, it is unsuitable for merge.