Afreen Misbah [Tue, 6 May 2025 14:27:03 +0000 (19:57 +0530)]
mgr/dashboard: Fix delete listener
- pass gw_group to delete API in frontend
- when more than one gateway group is present, deleting a listener fails with the error message: Multiple NVMe-oF gateway groups are configured. Please specify the 'gw_group' parameter in the request.
- added missing types, i18n
Afreen Misbah [Thu, 8 May 2025 04:09:59 +0000 (09:39 +0530)]
mgr/dashboard: Add default state when gateway groups are empty
Fixes https://tracker.ceph.com/issues/71247
- after upgrades, the nvmeof service spec does not contain the `group` field
- this causes internal errors in the UI combobox
- check for `group` in the spec and disable the selector
Venky Shankar [Mon, 12 May 2025 13:02:40 +0000 (18:32 +0530)]
Merge PR #62250 into main
* refs/pull/62250/head:
qa/cephfs: increase data to be delay data sync by mirror daemon
cephfs-mirror: integrate blockdiff API for regular file transfers
mds: dout snapdiff snapid's before validation check
cephfs-mirror: current sync mechanism uses sync mechanism subclass'ing
qa: add test for syncing already existing snapshots
cephfs_mirror: avoid latest changes on the source fs to enable mirroring
Ronen Friedman [Sun, 11 May 2025 05:24:33 +0000 (00:24 -0500)]
osd/scrub: remove the 'deadline' attribute from the scrub job
The scrub job's 'overdue' attribute is no longer calculated. The
only 'scrub is overdue' status remaining after the latest
scheduling refactor is the one computed in PGMap.cc (the
one affecting the 'health warning' status of the cluster).
Thus, there is no longer any reason to maintain a 'deadline'
attribute for the scrub scheduler.
Ronen Friedman [Fri, 9 May 2025 12:46:26 +0000 (07:46 -0500)]
osd/scrub: remove the deep-scrubs deadline attribute
As it is no longer meaningful in the context of the new
scrub scheduling design.
The change mandates fixes to the way 'schedule-[deeps]crub'
commands are implemented. The offset to use when forcing the
last-scrub timestamp to a new value is now calculated in
ScrubJob::guaranteed_offset(), as ScrubJob is where all
schedule adjustments (which employ the same logic) are
implemented.
Samuel Just [Fri, 9 May 2025 16:46:48 +0000 (16:46 +0000)]
crimson/osd/pg_recovery: only reset_pglog_based_recovery_op if complete
ce4e9aaad, as part of its start_recovery_ops changes, changed the call to
reset_pglog_based_recovery_op to occur unconditionally rather than only
if recovery has completed.
Note, this fix only restores the prior behavior. There's actually still
a race here where a DeferRecovery could be processed between the call to
reset_pglog_based_recovery_op and the RequestBackfill or
AllReplicasRecovered being processed.
Introduced: ce4e9aaad8f2cafae24511fe1687c61dc41affc1
Related: https://tracker.ceph.com/issues/71267
Fixes: https://tracker.ceph.com/issues/70337
Signed-off-by: Samuel Just <sjust@redhat.com>
Kefu Chai [Wed, 7 May 2025 01:09:00 +0000 (09:09 +0800)]
tools/ceph_dedup: remove 'using namespace std'
Remove 'using namespace std' from common.h to maintain consistent coding
practices. Although common.h is only used by ceph_dedup implementation,
keeping namespace declarations out of header files prevents potential
name conflicts and follows best practices for C++ code organization.
This change improves code clarity and reduces the risk of symbol collisions
when standard library elements are used alongside custom
implementations.
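The kind of collision described above can be illustrated with a minimal sketch. The header contents, the `byte` alias, and the helpers `to_bytes` and `total_size` are hypothetical and are not taken from ceph_dedup's common.h; they only show why a header is safer without a using-directive.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// --- contents of a hypothetical header (not the real common.h) ---
// There is no `using namespace std;` here: every standard library name
// is spelled with its std:: prefix.

// A project-local alias that shares its name with std::byte (C++17).
// If this header contained `using namespace std;`, an unqualified `byte`
// in any file including it would be ambiguous between ::byte and std::byte.
using byte = unsigned char;

// Copy the characters of a string into a vector of project-local bytes.
inline std::vector<byte> to_bytes(const std::string& s) {
  return std::vector<byte>(s.begin(), s.end());
}

// Sum the sizes of all chunks.
inline std::size_t total_size(const std::vector<std::vector<byte>>& chunks) {
  std::size_t n = 0;
  for (const auto& c : chunks) {
    n += c.size();
  }
  return n;
}
```

Because the header never injects std names into the global namespace, files that include it stay free to pick their own names without checking them against the whole standard library.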
Ville Ojamo [Fri, 9 May 2025 08:17:00 +0000 (15:17 +0700)]
doc/radosgw: Use ref for hyperlinks, 1st batch
Use validated ":ref:" hyperlinks instead of "external links" in "target
definitions" when linking within the Ceph docs:
- Add a label at beginning of referenced files if missing.
- Remove unused "target definitions".
The rendered PR should look the same as the old docs, only differing in
the source RST.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Samuel Just [Thu, 24 Apr 2025 22:13:04 +0000 (15:13 -0700)]
vstart.sh: simplify crimson core assignment, use assign_crimson_cores.py
This commit simplifies the internal flow in a few ways:
- core assignment is entirely handled by prep_balance_cpu and
do_balance_cpu. The latter simply does as the cpu_table
instructs.
- assign_crimson_cores calls lscpu and taskset internally, no
need for temp files.
It also changes some defaults:
- if crimson-balance-cpu is unset or set to none, crimson-osd will not
pin cpus at all rather than using the simple sequential allocation
scheme, which could be much less efficient on platforms where
cpuids 0,1,2,3,... are on sockets 0,1,2,3,.... The "osd" and "socket"
options provide NUMA-aware assignments when requested.
New features:
- Alienstore cores are now assigned with assign_crimson_cores
using the same balance strategy using
--crimson-alien-num-cores.
- --crimson-reactor-physical-only and
--crimson-alienstore-physical-only will cause reactor or
alienstore cpus respectively to be allocated with one
cpu per physical core rather than including smt siblings.
Fixes: https://tracker.ceph.com/issues/71096
Signed-off-by: Samuel Just <sjust@redhat.com>
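The "one cpu per physical core" behaviour of the physical-only options can be sketched as follows. This is not the logic of assign_crimson_cores.py (which is a Python script); the `CpuInfo` struct and the `physical_only` function are hypothetical names, and the sketch assumes the topology has already been read into (cpu, socket, core) rows.

```cpp
#include <set>
#include <utility>
#include <vector>

// One row of CPU topology (hypothetical layout): a logical CPU id plus
// the socket and physical core it belongs to.
struct CpuInfo {
  int cpu;
  int socket;
  int core;
};

// Keep the first logical CPU seen for each (socket, core) pair, so that
// SMT siblings of an already selected physical core are skipped.
inline std::vector<int> physical_only(const std::vector<CpuInfo>& cpus) {
  std::set<std::pair<int, int>> seen;
  std::vector<int> selected;
  for (const auto& c : cpus) {
    // emplace() reports through .second whether the pair was newly inserted.
    if (seen.emplace(c.socket, c.core).second) {
      selected.push_back(c.cpu);
    }
  }
  return selected;
}
```

With logical CPUs 0 and 2 sharing core 0 and CPUs 1 and 3 sharing core 1 on the same socket, the sketch selects CPUs 0 and 1 only.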
Jos Collin [Mon, 27 Jan 2025 12:42:34 +0000 (18:12 +0530)]
cephfs_mirror: avoid latest changes on the source fs to enable mirroring
This avoids considering the latest changes from the source filesystem when
mirroring already existing snapshots. Thus the destination
filesystem and snapshots are created based only on the source snapshots.
The destination fs is a replica of the last snapshot taken.
Fixes: https://tracker.ceph.com/issues/68567
Signed-off-by: Jos Collin <jcollin@redhat.com>
Ronen Friedman [Thu, 8 May 2025 13:45:23 +0000 (08:45 -0500)]
osd/scrub: fix deadline calculations
The scrub scheduling deadlines are calculated based on pool and OSD
configuration parameters. The specifics of the calculations are
modified to match the new scrub scheduling design.
Comments and documentation are updated to reflect the fact that
the deadlines no longer have any meaningful effect on scrub
scheduling.
Zac Dover [Thu, 8 May 2025 02:29:25 +0000 (12:29 +1000)]
doc/mgr: edit alerts.rst
Edit doc/mgr/alerts.rst as part of the project to determine where the
error is in https://github.com/ceph/ceph/pull/62782 that prevents the
Jenkins tests from passing.
This commit adds to the work done in
https://github.com/ceph/ceph/pull/62782 by correcting some of the
English that was present in that PR.
This is a change to one of twenty-five files in
https://github.com/ceph/ceph/pull/62782, and this commit represents one
of what will be at least twenty-five other commits made to track this
error down.
Zac Dover [Thu, 8 May 2025 00:08:06 +0000 (10:08 +1000)]
doc/mgr/ceph_api: edit index.rst
Edit doc/mgr/ceph_api/index.rst as part of the project to determine
where the error is in https://github.com/ceph/ceph/pull/62782 that
prevents the Jenkins tests from passing.
This is a change to one of twenty-five files in
https://github.com/ceph/ceph/pull/62782, and this commit represents one
of what will be at least twenty-five other commits made to track this
error down.
Kefu Chai [Wed, 7 May 2025 00:42:52 +0000 (08:42 +0800)]
librbd, tools: migrate from boost::variant to std::variant
Complete migration started in commit 017f333, replacing boost::variant with
std::variant throughout the librbd codebase. This change is part of our ongoing
effort to reduce third-party dependencies by leveraging C++ standard library
alternatives where possible.
Benefits include:
- Improved code readability and maintainability
- Reduced external dependency surface
- More consistent API usage with other components
Implementation note: Unlike Boost.variant, std::variant lacks built-in
operator<< support. This commit implements the necessary operator<< for
AttributeValue, our specific std::variant instantiation, to preserve the
existing behavior.
Also, although `apply_visit()` calls could be replaced with an unqualified
`visit()` thanks to ADL, we take this opportunity to add the `std::`
prefix for better readability.
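A minimal sketch of the operator<< technique described above, under stated assumptions: the alternatives chosen for `AttributeValue` here are illustrative and may differ from the real librbd type, and `to_string` is a hypothetical helper added only to make the output easy to inspect.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <variant>

// Illustrative stand-in; the real AttributeValue alternatives may differ.
using AttributeValue = std::variant<std::int64_t, double, std::string>;

// std::variant has no built-in stream insertion, so dispatch to the
// operator<< of whichever alternative is currently held.
inline std::ostream& operator<<(std::ostream& os, const AttributeValue& value) {
  // An unqualified visit() would also be found through ADL because the
  // argument type lives in namespace std; the std:: prefix is kept for
  // readability.
  std::visit([&os](const auto& v) { os << v; }, value);
  return os;
}

// Hypothetical helper: render a value through the operator above.
inline std::string to_string(const AttributeValue& value) {
  std::ostringstream oss;
  oss << value;
  return oss.str();
}
```

Each alternative is printed by its own stream operator, so adding a new alternative to the variant only requires that the new type be streamable.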
Samuel Just [Tue, 29 Apr 2025 01:53:11 +0000 (01:53 +0000)]
tools/contrib: add assign_crimson_cores as a more general replacement for balance_cpu
Improvements:
- shorter
- has tests
- uses lscpu -e --json to get logical<->physical mappings and avoid
needing to parse cpu ranges in lscpu --json
- supports allocating alienstore threads
- supports requiring physical cores only independently for alienstore
and seastar reactors
Matan Breizman [Sun, 4 May 2025 14:22:38 +0000 (14:22 +0000)]
crimson/osd: Logging fixes
* Fix "failed to log message"
* Move PGRecovery to the new logging macro
* Make PGRecovery print the pg prefix, as it is impossible to debug
specific pg recovery ops without it.