git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
2 months agodoc/radosgw: Fix section header levels in multisite-sync-policy.rst 62966/head
Ville Ojamo [Fri, 25 Apr 2025 07:16:52 +0000 (14:16 +0700)]
doc/radosgw: Fix section header levels in multisite-sync-policy.rst

The section header levels are reversed, so the hierarchy in the TOC is
incorrect. Switch around the section header levels to make the TOC
hierarchy correct; for example, individual examples become children of
the "Examples" section.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
2 months agoMerge pull request #59673 from shraddhaag/availability-score-feature
Shraddha Agrawal [Fri, 25 Apr 2025 05:56:15 +0000 (11:26 +0530)]
Merge pull request #59673 from shraddhaag/availability-score-feature

monitor: add availability score feature

2 months agoMerge pull request #62937 from gbregman/main
Gil Bregman [Fri, 25 Apr 2025 05:34:07 +0000 (08:34 +0300)]
Merge pull request #62937 from gbregman/main

mgr/cephadm/nvmeof: Allow setting NVMEoF gateway huge pages count in the spec file

2 months agoMerge PR #62658 into main
Patrick Donnelly [Fri, 25 Apr 2025 02:41:14 +0000 (22:41 -0400)]
Merge PR #62658 into main

* refs/pull/62658/head:
libcephfs_proxy: Remove arithmetic on `void*`

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Xavi Hernandez <xhernandez@gmail.com>
2 months agomgr/cephadm/nvmeof: Allow setting NVMEoF gateway huge pages count in the spec file 62937/head
Gil Bregman [Wed, 23 Apr 2025 20:55:24 +0000 (23:55 +0300)]
mgr/cephadm/nvmeof: Allow setting NVMEoF gateway huge pages count in the spec file
Fixes https://tracker.ceph.com/issues/71043

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
2 months agoMerge pull request #62561 from rkachach/fix_issue_70359_v2
Adam King [Thu, 24 Apr 2025 18:40:28 +0000 (14:40 -0400)]
Merge pull request #62561 from rkachach/fix_issue_70359_v2

mgr/cephadm: harmonize mgmt-gateway and oauth2-proxy spec fields

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
2 months agoMerge pull request #62302 from thegreenbear/cephadm-sd-custom-containers
Adam King [Thu, 24 Apr 2025 18:34:21 +0000 (14:34 -0400)]
Merge pull request #62302 from thegreenbear/cephadm-sd-custom-containers

mgr/cephadm: enhanced service to allow discovery of custom containers

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
2 months agoMerge pull request #62936 from cbodley/wip-doc-rgw-getobjattrs
Casey Bodley [Thu, 24 Apr 2025 15:35:48 +0000 (11:35 -0400)]
Merge pull request #62936 from cbodley/wip-doc-rgw-getobjattrs

doc/rgw: release note for GetObjectAttributes

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2 months agoMerge pull request #62845 from rhcs-dashboard/fix-path
Pedro Gonzalez Gomez [Thu, 24 Apr 2025 15:26:18 +0000 (17:26 +0200)]
Merge pull request #62845 from rhcs-dashboard/fix-path

mgr/dashboard: fix smb edit resources

Reviewed-by: Afreen Misbah <afreen@ibm.com>
2 months agoMerge pull request #62715 from cbodley/wip-qa-rgw-no-gc
Casey Bodley [Thu, 24 Apr 2025 14:59:33 +0000 (10:59 -0400)]
Merge pull request #62715 from cbodley/wip-qa-rgw-no-gc

qa/rgw: run verify tests with garbage collection disabled

Reviewed-by: Jane Zhu <jzhu116@bloomberg.net>
2 months agoMerge pull request #62921 from idryomov/wip-71026
Ilya Dryomov [Thu, 24 Apr 2025 14:36:46 +0000 (16:36 +0200)]
Merge pull request #62921 from idryomov/wip-71026

librbd: disallow "rbd trash mv" if image is in a group

Reviewed-by: Ramana Raja <rraja@redhat.com>
2 months agoqa/standalone/misc/availability.sh: add tests 59673/head
Shraddha Agrawal [Mon, 6 Jan 2025 07:12:11 +0000 (07:12 +0000)]
qa/standalone/misc/availability.sh: add tests

This commit adds a standalone test for verifying if
the availability score of a pool comes down when there
are unfound objects present.

Fixes: https://tracker.ceph.com/issues/67777
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
2 months agosrc/mon/PGMap.cc: check unfound objects in `get_unavailable_pg_in_pool_map`
Shraddha Agrawal [Mon, 7 Oct 2024 06:16:34 +0000 (11:46 +0530)]
src/mon/PGMap.cc: check unfound objects in `get_unavailable_pg_in_pool_map`

If a pool has any PG with unfound objects, we should consider
it unavailable for the availability score. If a PG has unfound
objects, it will be recorded in PGMap.

In `get_unavailable_pg_in_pool_map`, if a PG has unfound objects,
we add it to `pool_pg_unavailable_map`.

Fixes: https://tracker.ceph.com/issues/67777
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
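
The rule described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `PGMap.cc`: `PgStat`, `num_objects_unfound`, and `collect_unavailable_pgs` are hypothetical names, and only the unfound-objects criterion is modelled.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified per-PG statistics record (not Ceph's pg_stat_t).
struct PgStat {
  int64_t pool_id;
  std::string pgid;
  uint64_t num_objects_unfound;
};

// Group PGs that have unfound objects by pool. A pool that appears as a key
// in the result would be treated as unavailable for the availability score.
inline std::map<int64_t, std::vector<std::string>>
collect_unavailable_pgs(const std::vector<PgStat>& stats) {
  std::map<int64_t, std::vector<std::string>> pool_pg_unavailable_map;
  for (const auto& s : stats) {
    if (s.num_objects_unfound > 0) {
      pool_pg_unavailable_map[s.pool_id].push_back(s.pgid);
    }
  }
  return pool_pg_unavailable_map;
}
```

A pool with no entry in the returned map has no PG flagged by this criterion.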
2 months agosrc/osd/PeeringState.cc: update last_unstale properly
Kamoltat [Tue, 21 Nov 2023 18:55:29 +0000 (18:55 +0000)]
src/osd/PeeringState.cc: update last_unstale properly

Problem:

When we update the `pg_stat` we don't
check whether the pg state is in `stale`.
Therefore, the attribute `last_unstale`
will always get updated even if the pg
state actually contains `stale`.

Solution:

Place a condition to only update
the attribute `last_unstale` when
the pg truly doesn't have `stale`
in its state.

Fixes: https://tracker.ceph.com/issues/67777
Signed-off-by: Kamoltat <ksirivad@redhat.com>
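
The fix described above amounts to guarding the timestamp update with a state check. A minimal sketch, assuming a bitmask state; `STATE_ACTIVE`, `STATE_STALE`, `PgStatSketch`, and `update_last_unstale` are hypothetical names and do not mirror the real definitions in `PeeringState.cc`:

```cpp
#include <cstdint>

// Hypothetical state bits; the real PG state flags are defined elsewhere in Ceph.
constexpr uint64_t STATE_ACTIVE = 1u << 0;
constexpr uint64_t STATE_STALE = 1u << 1;

// Hypothetical, simplified stand-in for the per-PG stats record.
struct PgStatSketch {
  uint64_t state = 0;
  int64_t last_unstale = 0;  // last time the PG was observed without `stale`
};

// Refresh last_unstale only when the stale bit is absent from the state.
inline void update_last_unstale(PgStatSketch& st, int64_t now) {
  if ((st.state & STATE_STALE) == 0) {
    st.last_unstale = now;
  }
}
```

With the guard in place, a PG whose state contains `stale` keeps its previous `last_unstale` value.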
2 months agosrc/mgr/OSDMonitor.cc Add command `ceph osd pool availability-status`
Kamoltat [Tue, 10 Oct 2023 15:15:35 +0000 (15:15 +0000)]
src/mgr/OSDMonitor.cc Add command `ceph osd pool availability-status`

```
ceph osd pool availability-status
```
outputs:

`POOL`
`UPTIME`
`DOWNTIME`
`NUMFAILURES`
`MTBF`
`MTTR`
`SCORE`
`AVAILABLE`

Fixes: https://tracker.ceph.com/issues/67777
Signed-off-by: Kamoltat <ksirivad@redhat.com>
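
The commit message lists the output columns but not how they are derived. The sketch below shows one conventional way such values relate; it assumes the textbook reliability definitions (MTBF = uptime / failures, MTTR = downtime / failures, score = MTBF / (MTBF + MTTR)), which may differ from what the monitor actually computes. `AvailabilityRow` and `compute_availability` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical row holding the derived columns.
struct AvailabilityRow {
  double mtbf;   // mean time between failures, in seconds
  double mttr;   // mean time to recovery, in seconds
  double score;  // fraction of time the pool was available, in [0, 1]
};

// Assumed formulas (not taken from the commit):
//   MTBF  = uptime / num_failures
//   MTTR  = downtime / num_failures
//   SCORE = MTBF / (MTBF + MTTR)
inline AvailabilityRow compute_availability(uint64_t uptime_s,
                                            uint64_t downtime_s,
                                            uint64_t num_failures) {
  AvailabilityRow row;
  if (num_failures == 0) {
    // No failure recorded: treat the pool as fully available.
    row.mtbf = static_cast<double>(uptime_s);
    row.mttr = 0.0;
    row.score = 1.0;
    return row;
  }
  row.mtbf = static_cast<double>(uptime_s) / static_cast<double>(num_failures);
  row.mttr = static_cast<double>(downtime_s) / static_cast<double>(num_failures);
  const double total = row.mtbf + row.mttr;
  row.score = total > 0.0 ? row.mtbf / total : 1.0;
  return row;
}
```

Under these assumptions, a pool with 900 s of uptime, 100 s of downtime, and 2 failures has an MTBF of 450 s, an MTTR of 50 s, and a score of about 0.9.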
2 months agosrc/mon/PGMap.cc: init pool_availability
Kamoltat [Thu, 26 Oct 2023 19:08:37 +0000 (19:08 +0000)]
src/mon/PGMap.cc: init pool_availability

Added PoolAvailability Struct

Modified PGMap.cc to include a k,v map:
`pool_availability`.

The key being the `poolid` and value
is `PoolAvailability`

Init the function:
`PGMap::get_unavailable_pg_in_pool_map()`
to identify and aggregate all the PGs we
mark as `unavailable` as well as the pool
that associates with the unavailable PG.

Also, included `pool_availability`
to `PGMapDigest::dump()`.

Fixes: https://tracker.ceph.com/issues/67777
Signed-off-by: Kamoltat <ksirivad@redhat.com>
2 months agoMerge pull request #62941 from MaxKellermann/mds_Locker__abort
Max Kellermann [Thu, 24 Apr 2025 09:12:12 +0000 (11:12 +0200)]
Merge pull request #62941 from MaxKellermann/mds_Locker__abort

mds/Locker: use ceph_abort_msg() instead of ceph_assert()

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 months agoMerge pull request #59248 from kamoltat/wip-ksirivad-improve-netsplit-warning
Radoslaw Zarzynski [Thu, 24 Apr 2025 06:17:51 +0000 (08:17 +0200)]
Merge pull request #59248 from kamoltat/wip-ksirivad-improve-netsplit-warning

HealthMonitor: Add topology-aware netsplit detection and warning

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
2 months agomds/Locker: use ceph_abort_msg() instead of ceph_assert() 62941/head
Max Kellermann [Thu, 24 Apr 2025 05:17:48 +0000 (07:17 +0200)]
mds/Locker: use ceph_abort_msg() instead of ceph_assert()

This ceph_assert() always fails. Depending on the configuration
value `ceph_assert_supresssions`, execution may nevertheless continue,
leaving the `dir` variable uninitialized.  This leads to a compiler
warning:

 /home/jenkins-build/build/workspace/ceph-api/src/mds/Locker.cc:451:22: error: variable 'dir' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]

clang then suggests to nullptr-initialize the variable:

 /home/jenkins-build/build/workspace/ceph-api/src/mds/Locker.cc:447:11: note: initialize the variable 'dir' to silence this warning
   447 |         CDir *dir;
       |                  ^
       |                   = nullptr

This, however, is a very bad idea because all this does is suppress
the warning; it still crashes the process.

Since there's no recovery from this problem, let's switch to
ceph_abort_msg() which is [[noreturn]] and the compiler can deduce
that `dir` is always initialized when it's used.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
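
The effect of a `[[noreturn]]` function on the compiler's flow analysis can be illustrated in isolation. This is a minimal sketch, not the code in `Locker.cc`: `abort_with_message` is a hypothetical stand-in for `ceph_abort_msg()`, and `Dir` and `dir_id_for` are made up for the example.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for ceph_abort_msg(): declared [[noreturn]], and it
// really never returns because std::abort() terminates the process.
[[noreturn]] inline void abort_with_message(const char* msg) {
  std::fprintf(stderr, "abort: %s\n", msg);
  std::abort();
}

// Hypothetical stand-in for the directory type.
struct Dir {
  int id;
};

// Every path that reaches the final return has assigned `dir`. The last
// branch cannot fall through, so the compiler has no uninitialized path to
// warn about and no `= nullptr` initializer is needed.
inline int dir_id_for(int type, Dir* a, Dir* b) {
  Dir* dir;
  if (type == 0) {
    dir = a;
  } else if (type == 1) {
    dir = b;
  } else {
    abort_with_message("unexpected type");
  }
  return dir->id;
}
```

Replacing the call in the last branch with an assertion that may return would reintroduce a path on which `dir` is read without having been assigned.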
2 months agoMerge pull request #62693 from ronen-fr/wip-rf-iocnt
Ronen Friedman [Thu, 24 Apr 2025 05:17:33 +0000 (08:17 +0300)]
Merge pull request #62693 from ronen-fr/wip-rf-iocnt

osd/scrub: performance counters for I/O performed by the scrubber

Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>
Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2 months agoMerge pull request #62898 from nbalacha/wip-nbalacha-70963
Ilya Dryomov [Wed, 23 Apr 2025 22:28:52 +0000 (00:28 +0200)]
Merge pull request #62898 from nbalacha/wip-nbalacha-70963

rbd: display mirror state creating

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2 months agoMerge pull request #60899 from clwluvw/curl-einval
Casey Bodley [Wed, 23 Apr 2025 22:28:16 +0000 (18:28 -0400)]
Merge pull request #60899 from clwluvw/curl-einval

rgw: handle EINVAL translation in forward_request

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 months agoMerge pull request #62888 from clwluvw/neorados-fifotrim
Casey Bodley [Wed, 23 Apr 2025 20:42:08 +0000 (16:42 -0400)]
Merge pull request #62888 from clwluvw/neorados-fifotrim

neorados: relax fifo trim error for ENODATA

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
2 months agosrc/pybind/mgr/cephadm/service_discovery: enhanced service to allow discovery of... 62302/head
Bernard Landon [Fri, 14 Mar 2025 14:25:00 +0000 (14:25 +0000)]
src/pybind/mgr/cephadm/service_discovery: enhanced service to allow discovery of custom containers

Fixes: https://tracker.ceph.com/issues/70482
Signed-off-by: Bernard Landon <bernard@lndn.ch>
2 months agodoc/rgw: release note for GetObjectAttributes 62936/head
Casey Bodley [Wed, 23 Apr 2025 19:06:19 +0000 (15:06 -0400)]
doc/rgw: release note for GetObjectAttributes

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2 months agoMerge pull request #62902 from cbodley/wip-70700-disable
Casey Bodley [Wed, 23 Apr 2025 18:46:59 +0000 (14:46 -0400)]
Merge pull request #62902 from cbodley/wip-70700-disable

cmake/common: temporarily remove decode_start_v_checker tests

Reviewed-by: Dan Mick <dmick@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
2 months agoMerge pull request #60227 from clwluvw/zonegroup-delbucket
Casey Bodley [Wed, 23 Apr 2025 18:02:16 +0000 (14:02 -0400)]
Merge pull request #60227 from clwluvw/zonegroup-delbucket

rgw: skip empty check on non-owned buckets by zonegroup

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2 months agoMerge pull request #62738 from clwluvw/copy-obj-remote-zonegroup
Casey Bodley [Wed, 23 Apr 2025 18:00:56 +0000 (14:00 -0400)]
Merge pull request #62738 from clwluvw/copy-obj-remote-zonegroup

rgw: dont store replication attrs on remote copy obj

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 months agoosd/scrub: count scrub I/O 62693/head
Ronen Friedman [Tue, 15 Apr 2025 08:34:06 +0000 (03:34 -0500)]
osd/scrub: count scrub I/O

Implement I/O counting in the PGBackend::be_scan_list()
and relevant functions it calls.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 months agoMerge pull request #62699 from Matan-B/wip-matanb-crimson-ignore-abort-v2
Matan Breizman [Wed, 23 Apr 2025 15:35:28 +0000 (18:35 +0300)]
Merge pull request #62699 from Matan-B/wip-matanb-crimson-ignore-abort-v2

crimson/common/errorator: rework aborts error handlers

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2 months agoMerge pull request #62556 from aainscow/ec_pr_and_prereqs
Radoslaw Zarzynski [Wed, 23 Apr 2025 15:19:31 +0000 (17:19 +0200)]
Merge pull request #62556 from aainscow/ec_pr_and_prereqs

osd: Optimised EC

Reviewed-by: Radoslaw Zarzynski <rzarzynski@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
2 months agorbd: display correct mirror state when creating 62898/head
N Balachandran [Mon, 21 Apr 2025 11:34:08 +0000 (17:04 +0530)]
rbd: display correct mirror state when creating

The mirror image state is set to MIRROR_IMAGE_STATE_CREATING
when the image is first created on the secondary, but was displayed
as "unknown" by the rbd info command. This has been fixed.

Fixes: https://tracker.ceph.com/issues/70963
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
2 months agoMerge pull request #62710 from bill-scales/ec_backfill
Laura Flores [Wed, 23 Apr 2025 15:06:56 +0000 (10:06 -0500)]
Merge pull request #62710 from bill-scales/ec_backfill

osd: EC Optimizations: Backfill changes for partial writes

2 months agoMerge pull request #62725 from VallariAg/nvmeof-teuthology-fio
Vallari Agrawal [Wed, 23 Apr 2025 13:17:12 +0000 (18:47 +0530)]
Merge pull request #62725 from VallariAg/nvmeof-teuthology-fio

qa/suites/nvmeof: Fix thrasher and fio script

2 months agoMerge pull request #60731 from joscollin/wip-B68954-check-headers-journal-recovery
Rishabh Dave [Wed, 23 Apr 2025 12:15:54 +0000 (17:45 +0530)]
Merge pull request #60731 from joscollin/wip-B68954-check-headers-journal-recovery

cephfs-journal-tool: check the headers in dump file after journal recovery

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
2 months agoMerge pull request #62914 from baum/ms_dispatch2_clean_up
baum [Wed, 23 Apr 2025 10:36:16 +0000 (13:36 +0300)]
Merge pull request #62914 from baum/ms_dispatch2_clean_up

src/nvmeof/NVMeofGwMonitorClient.cc: ms_dispatch2 clean up

2 months agoMerge pull request #62696 from anthonyeleven/mgr-prom
Zac Dover [Wed, 23 Apr 2025 09:27:00 +0000 (19:27 +1000)]
Merge pull request #62696 from anthonyeleven/mgr-prom

doc/mgr: Improve prometheus.rst

Reviewed-by: Zac Dover <zac.dover@proton.me>
2 months agoMerge PR #62577 into main
Venky Shankar [Wed, 23 Apr 2025 09:16:03 +0000 (14:46 +0530)]
Merge PR #62577 into main

* refs/pull/62577/head:
libcephfs_proxy: avoid libc buffering for logging

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
2 months agoMerge branch 'main' into mgr-prom 62696/head
Zac Dover [Wed, 23 Apr 2025 09:15:19 +0000 (19:15 +1000)]
Merge branch 'main' into mgr-prom

Signed-off-by: Zac Dover <zac.dover@proton.me>
2 months agoqa: test 'journal import' recognizes invalid headers post journal recovery 60731/head
Jos Collin [Tue, 11 Feb 2025 10:45:51 +0000 (16:15 +0530)]
qa: test 'journal import' recognizes invalid headers post journal recovery

Fixes: https://tracker.ceph.com/issues/68954
Signed-off-by: Jos Collin <jcollin@redhat.com>
2 months agocephfs-journal-tool: check the headers in dump file after journal recovery
Jos Collin [Thu, 14 Nov 2024 05:12:18 +0000 (10:42 +0530)]
cephfs-journal-tool: check the headers in dump file after journal recovery

Fixes: https://tracker.ceph.com/issues/68954
Signed-off-by: Jos Collin <jcollin@redhat.com>
2 months agodoc/mgr: Improve prometheus.rst
Anthony D'Atri [Mon, 7 Apr 2025 03:03:53 +0000 (23:03 -0400)]
doc/mgr: Improve prometheus.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
2 months agoMerge pull request #62911 from bluikko/doc-cleanup-radosgw
Anthony D'Atri [Wed, 23 Apr 2025 03:25:27 +0000 (23:25 -0400)]
Merge pull request #62911 from bluikko/doc-cleanup-radosgw

doc/radosgw: Fix indentation in admin.rst

2 months agoMerge pull request #62896 from zdover23/wip-doc-2025-04-21-revert-62782-c4f0f8e
Zac Dover [Tue, 22 Apr 2025 23:31:21 +0000 (09:31 +1000)]
Merge pull request #62896 from zdover23/wip-doc-2025-04-21-revert-62782-c4f0f8e

doc: Revert "doc/mgr: Promptify CLI commands and other formatting fixes"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2 months agodoc/rados/operations/health-checks: Add MON_NETSPLIT Warning 59248/head
Kamoltat Sirivadhna [Fri, 23 Aug 2024 20:24:36 +0000 (20:24 +0000)]
doc/rados/operations/health-checks: Add MON_NETSPLIT Warning

Fixes: https://tracker.ceph.com/issues/67371
Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
2 months agoHealthMonitor: Add topology-aware netsplit detection and warning
Kamoltat Sirivadhna [Thu, 15 Aug 2024 20:25:43 +0000 (20:25 +0000)]
HealthMonitor: Add topology-aware netsplit detection and warning

Problem:
Currently, Ceph cannot detect and report network partitions (netsplits)
between monitors in different topology locations in a consolidated way.
While stretch mode can handle partitions through monitor elections,
users lack visibility into the topology-level view of network
disconnections, making troubleshooting difficult.

Solution:
This implementation adds a hierarchical netsplit detection mechanism that:
- Uses DirectedGraph structure for netsplit detection
- Maps monitor disconnections to relevant CRUSH topology levels
- Aggregates individual disconnections into location-level reports when appropriate
- Detects complete location-level netsplits when ALL monitors between locations
  cannot communicate
- Reports specific topology locations experiencing complete communication failures
- Falls back to individual monitor-level reporting for partial disconnections
- Handles monitors with missing location data gracefully
- Leverages HealthMonitor::check_for_mon_down to receive a set of down monitors,
  efficiently avoiding false netsplit reports for monitors already known to be down
- Implements smart filtering that correctly excludes down monitors from location-based
  analysis, ensuring accurate netsplit reporting at both individual and topology levels

The implementation produces user-friendly health warnings:
1. For complete location netsplits: "Netsplit detected between dc1 and dc2"
2. For individual monitor disconnections: "Netsplit detected between mon.a and mon.d"

Performance considerations:
- Time complexity: O(m²) where m is the number of monitors
- Space complexity: O(m²) for connection tracking
- Practical impact is minimal as monitor count is typically small (3-7)

Fixes: https://tracker.ceph.com/issues/67371
Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
Conflicts:
src/mon/Elector.cc - Trivial Fix
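
The aggregation rule in this commit message (report a location-level netsplit only when every live monitor pair between two locations is disconnected, otherwise report individual monitors, and ignore monitors already known to be down) can be sketched as below. This is an illustration under stated assumptions, not the HealthMonitor implementation: it uses a plain set of monitor pairs instead of the DirectedGraph structure the commit mentions, and `Pair`, `ordered`, and `netsplit_warnings` are hypothetical names.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Pair = std::pair<std::string, std::string>;

// Order a pair so that (a, b) and (b, a) are treated as the same link.
inline Pair ordered(const std::string& a, const std::string& b) {
  return a < b ? Pair{a, b} : Pair{b, a};
}

// mon_location: monitor name -> topology location (for example a datacenter).
// disconnected: monitor pairs that cannot communicate.
// down: monitors already known to be down; they never produce a warning.
inline std::vector<std::string> netsplit_warnings(
    const std::map<std::string, std::string>& mon_location,
    const std::set<Pair>& disconnected,
    const std::set<std::string>& down) {
  const auto end = mon_location.end();

  // Count live monitors per location.
  std::map<std::string, int> live_per_loc;
  for (const auto& ml : mon_location) {
    if (down.count(ml.first) == 0) ++live_per_loc[ml.second];
  }

  // Keep only links between live monitors, normalised to one entry per link.
  std::set<Pair> live_cuts;
  for (const auto& p : disconnected) {
    if (down.count(p.first) || down.count(p.second)) continue;
    live_cuts.insert(ordered(p.first, p.second));
  }

  // Count broken links per pair of distinct locations.
  std::map<Pair, int> cuts_per_loc_pair;
  for (const auto& c : live_cuts) {
    auto la = mon_location.find(c.first);
    auto lb = mon_location.find(c.second);
    if (la == end || lb == end || la->second == lb->second) continue;
    ++cuts_per_loc_pair[ordered(la->second, lb->second)];
  }

  // A location pair is fully split when all of its live links are broken.
  std::set<Pair> split_locs;
  for (const auto& e : cuts_per_loc_pair) {
    int all_links = live_per_loc[e.first.first] * live_per_loc[e.first.second];
    if (e.second == all_links) split_locs.insert(e.first);
  }

  std::vector<std::string> out;
  for (const auto& l : split_locs) {
    out.push_back("Netsplit detected between " + l.first + " and " + l.second);
  }
  // Fall back to monitor-level reports for links not covered above,
  // including monitors with missing location data.
  for (const auto& c : live_cuts) {
    auto la = mon_location.find(c.first);
    auto lb = mon_location.find(c.second);
    bool covered = la != end && lb != end && la->second != lb->second &&
                   split_locs.count(ordered(la->second, lb->second)) > 0;
    if (!covered) {
      out.push_back("Netsplit detected between mon." + c.first + " and mon." + c.second);
    }
  }
  return out;
}
```

The pairwise bookkeeping is where the quadratic cost in the number of monitors comes from, which matches the complexity stated in the commit message.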

2 months agoMerge pull request #62416 from kamoltat/wip-ksirivad-fix-connection-score
Laura Flores [Tue, 22 Apr 2025 21:05:44 +0000 (16:05 -0500)]
Merge pull request #62416 from kamoltat/wip-ksirivad-fix-connection-score

2 months agolibrbd: disallow "rbd trash mv" if image is in a group 62921/head
Ilya Dryomov [Wed, 16 Apr 2025 11:15:19 +0000 (13:15 +0200)]
librbd: disallow "rbd trash mv" if image is in a group

Removing an image that is a member of a group has always been
disallowed.  However, moving an image that is a member of a group to
trash is currently allowed and this is deceptive -- the only reason for
a user to move an image to trash should be the intent to remove it.

More importantly, group APIs operate in terms of image names -- there
are no corresponding variants that would operate in terms of image IDs.
For example, even though internally GroupImageSpec struct stores an
image ID, the public rbd_group_image_info_t struct insists on an image
name.  When rbd_group_image_list() encounters a trashed member image
(i.e. one that doesn't have a name), it just fails with ENOENT and no
listing gets produced at all until the offending image is restored from
trash.  Something like this can be very hard to debug for an average
user, so let's make rbd_trash_move() fail with EMLINK the same way as
rbd_remove() does in this scenario.

The one case where moving a member image to trash makes sense is live
migration where the source image gets trashed to be almost immediately
replaced by the destination image as part of preparing migration.

Fixes: https://tracker.ceph.com/issues/71026
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
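
The behaviour change can be summarised in a small model. This is a sketch only, not librbd code: `ImageSketch`, `TrashSource`, and `trash_move_sketch` are hypothetical, the negative-errno return convention is assumed, and the exemption for live migration is inferred from the last paragraph of the commit message.

```cpp
#include <cerrno>
#include <string>

// Hypothetical, simplified image record.
struct ImageSketch {
  std::string name;
  bool in_group = false;
  bool trashed = false;
};

// Hypothetical tag for who is requesting the move to trash.
enum class TrashSource { USER, MIGRATION };

// Refuse to trash a group member on a user request, mirroring the EMLINK
// result described for rbd_remove(). The migration path is assumed exempt.
inline int trash_move_sketch(ImageSketch& img, TrashSource source) {
  if (img.in_group && source != TrashSource::MIGRATION) {
    return -EMLINK;
  }
  img.trashed = true;
  return 0;
}
```

In this model a caller would first remove the image from its group and then retry the move.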
2 months agopybind/rbd: add ImageMemberOfGroup exception
Ilya Dryomov [Mon, 21 Apr 2025 15:11:17 +0000 (17:11 +0200)]
pybind/rbd: add ImageMemberOfGroup exception

EMLINK is returned by rbd_remove() if the image is a member of a group.
Add a dedicated exception similar to ImageBusy or ImageHasSnapshots and
a test for it.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2 months agorbd: don't print "image will expire at" message when trash_move() fails
Ilya Dryomov [Mon, 21 Apr 2025 14:52:02 +0000 (16:52 +0200)]
rbd: don't print "image will expire at" message when trash_move() fails

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2 months agoMerge pull request #61212 from rishabh-d-dave/mgr-vol-count-clones
Rishabh Dave [Tue, 22 Apr 2025 15:51:52 +0000 (21:21 +0530)]
Merge pull request #61212 from rishabh-d-dave/mgr-vol-count-clones

mgr/vol: count number of ongoing clones in CloneProgressReporter...

Reviewed-by: Milind Changire <mchangir@redhat.com>
2 months agoMerge pull request #62870 from MaxKellermann/mds_includes
Max Kellermann [Tue, 22 Apr 2025 15:28:37 +0000 (17:28 +0200)]
Merge pull request #62870 from MaxKellermann/mds_includes

mds: include cleanup

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
2 months agocrimson: remove any assert_failure pre assert usages 62699/head
Matan Breizman [Mon, 21 Apr 2025 14:11:58 +0000 (14:11 +0000)]
crimson: remove any assert_failure pre assert usages

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: cleanup assert_failure pre_assert
Matan Breizman [Mon, 21 Apr 2025 14:11:52 +0000 (14:11 +0000)]
crimson/common/errorator: cleanup assert_failure pre_assert

Any usage should be replaced with a message that supports printing the
error.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: assert_failure to print error
Matan Breizman [Mon, 21 Apr 2025 14:11:47 +0000 (14:11 +0000)]
crimson/common/errorator: assert_failure to print error

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/osd: Verbose assert_all aborts
Matan Breizman [Mon, 21 Apr 2025 14:13:46 +0000 (14:13 +0000)]
crimson/osd: Verbose assert_all aborts

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: allow assert_all to accept c_str()
Matan Breizman [Mon, 21 Apr 2025 14:13:41 +0000 (14:13 +0000)]
crimson/common/errorator: allow assert_all to accept c_str()

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: Cleanup assert_all pre_assert
Matan Breizman [Mon, 21 Apr 2025 14:13:38 +0000 (14:13 +0000)]
crimson/common/errorator: Cleanup assert_all pre_assert

Not used

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: cleanup ErrorT::handler call
Matan Breizman [Mon, 21 Apr 2025 14:13:34 +0000 (14:13 +0000)]
crimson/common/errorator: cleanup ErrorT::handler call

Call error_t::handle directly, without declaring a handler and invoking it later on.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/ertr: assert_all informs about error being handled that way
Matan Breizman [Mon, 21 Apr 2025 14:13:31 +0000 (14:13 +0000)]
crimson/ertr: assert_all informs about error being handled that way

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2 months agotest/crimson/test_errorator: ignore assert_all
Matan Breizman [Mon, 7 Apr 2025 09:50:35 +0000 (09:50 +0000)]
test/crimson/test_errorator: ignore assert_all

This came up during: https://tracker.ceph.com/issues/69406#note-25
Where an "assert_all" was called but didn't cause an abort.
Added "ignore_assert_all" to showcase this scenario along with any
other case where we are expected to abort.
The tests could be used to verify errorator's aborting behavior.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: print abort message when possible
Matan Breizman [Thu, 10 Apr 2025 15:50:01 +0000 (15:50 +0000)]
crimson/common/errorator: print abort message when possible

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: Always check exception type
Matan Breizman [Thu, 10 Apr 2025 14:23:50 +0000 (14:23 +0000)]
crimson/common/errorator: Always check exception type

We shouldn't bypass this check in the is_same_v<return_t, no_touch_error_marker>
case.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: add TODO
Matan Breizman [Thu, 10 Apr 2025 13:12:36 +0000 (13:12 +0000)]
crimson/common/errorator: add TODO

There are a few TODOs around errorator code which might be worth looking
into: https://tracker.ceph.com/issues/70875

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: introduce take_exception_from_future
Matan Breizman [Thu, 10 Apr 2025 10:07:08 +0000 (10:07 +0000)]
crimson/common/errorator: introduce take_exception_from_future

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: move exception_comment
Matan Breizman [Thu, 10 Apr 2025 09:48:38 +0000 (09:48 +0000)]
crimson/common/errorator: move exception_comment

move the comment to where __cxa_exception_type is used
to keep handle() comments shorter.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agocrimson/common/errorator: fix skipped aborts
Xuehan Xu [Tue, 8 Apr 2025 12:50:47 +0000 (20:50 +0800)]
crimson/common/errorator: fix skipped aborts

We should also invoke the errfunc (which aborts) when the
return type is no_touch_error_marker.
Added comments explaining:
* why it's forbidden to return void
* why std::is_same_v<return_t, no_touch_error_marker> is checked

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2 months agoMerge pull request #62708 from rishabh-d-dave/vols-snap-path
Rishabh Dave [Tue, 22 Apr 2025 15:06:34 +0000 (20:36 +0530)]
Merge pull request #62708 from rishabh-d-dave/vols-snap-path

mgr/vol: add command to get snapshot path

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 months agoMerge pull request #62837 from athanatos/sjust/wip-crimson-stuck-backfilling
Samuel Just [Tue, 22 Apr 2025 15:02:59 +0000 (08:02 -0700)]
Merge pull request #62837 from athanatos/sjust/wip-crimson-stuck-backfilling

crimson: fix several bugs causing stuck backfills

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
2 months agoMerge pull request #62619 from athanatos/sjust/wip-replica-read-crimson-mosdpct
Samuel Just [Tue, 22 Apr 2025 14:59:41 +0000 (07:59 -0700)]
Merge pull request #62619 from athanatos/sjust/wip-replica-read-crimson-mosdpct

crimson: add MOSDPGPCT support

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
2 months agosrc/nvmeof/NVMeofGwMonitorClient.cc: ms_dispatch2 clean up 62914/head
Alexander Indenbaum [Tue, 22 Apr 2025 10:20:02 +0000 (13:20 +0300)]
src/nvmeof/NVMeofGwMonitorClient.cc: ms_dispatch2 clean up

- return ACKNOWLEDGED/HANDLED
- remove registration for unwanted keys

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
2 months agoMerge pull request #62899 from tchaikov/cmake-build-boost
Kefu Chai [Tue, 22 Apr 2025 13:27:46 +0000 (21:27 +0800)]
Merge pull request #62899 from tchaikov/cmake-build-boost

cmake: Fix b2 build with postfixed compiler versions

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
2 months agodoc/radosgw: Fix indentation in admin.rst 62911/head
Ville Ojamo [Tue, 22 Apr 2025 12:09:23 +0000 (19:09 +0700)]
doc/radosgw: Fix indentation in admin.rst

Indent the CLI command continuation lines correctly to start at the same
position as the other such commands, add one space on 2 lines.
Introduced in #62877.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
2 months agocmake/common: temporarily remove decode_start_v_checker tests 62902/head
Casey Bodley [Tue, 22 Apr 2025 12:19:18 +0000 (08:19 -0400)]
cmake/common: temporarily remove decode_start_v_checker tests

these test cases appear to be the cause of many 'make check' failures:

> error while loading shared libraries: path/to/libceph-common.so.2: file too short

Reported in https://tracker.ceph.com/issues/70700

disable the tests until we can fix them

Signed-off-by: Casey Bodley <cbodley@redhat.com>
2 months agoMerge pull request #61997 from Naveenaidu/wip-naveen-telemetry-show-labeled-perf...
Ronen Friedman [Tue, 22 Apr 2025 11:56:21 +0000 (14:56 +0300)]
Merge pull request #61997 from Naveenaidu/wip-naveen-telemetry-show-labeled-perf-counters

telemetry: include labeled perf counters in report

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2 months agoMerge pull request #62712 from leonidc/heuristic-redeploy-fix
leonidc [Tue, 22 Apr 2025 10:27:27 +0000 (13:27 +0300)]
Merge pull request #62712 from leonidc/heuristic-redeploy-fix

nvmeofgw: fix host issue during redeploy, improves previous redeploy fix

2 months agoMerge PR #62578 into main
Venky Shankar [Tue, 22 Apr 2025 10:14:11 +0000 (15:44 +0530)]
Merge PR #62578 into main

* refs/pull/62578/head:
mds: fix dump stray command

Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
2 months agoMerge PR #62674 into main
Venky Shankar [Tue, 22 Apr 2025 10:13:19 +0000 (15:43 +0530)]
Merge PR #62674 into main

* refs/pull/62674/head:
mon: Fix cast warning

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 months agoMerge pull request #62877 from bluikko/doc-formatting-radosgw
Zac Dover [Tue, 22 Apr 2025 09:47:09 +0000 (19:47 +1000)]
Merge pull request #62877 from bluikko/doc-formatting-radosgw

doc/radosgw: Improve and more consistent formatting

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2 months agodoc: Revert "doc/mgr: Promptify CLI commands and other formatting fixes" 62896/head
Zac Dover [Mon, 21 Apr 2025 09:33:01 +0000 (19:33 +1000)]
doc: Revert "doc/mgr: Promptify CLI commands and other formatting fixes"

This reverts commit c4f0f8edad46c852961a622795d14b735f660d94.

Signed-off-by: Zac Dover <zac.dover@proton.me>
2 months agomgr/dashboard: fix smb edit resources 62845/head
Pedro Gonzalez Gomez [Wed, 16 Apr 2025 09:48:01 +0000 (11:48 +0200)]
mgr/dashboard: fix smb edit resources

- Fixes smb cluster edit due to wrong path
- Disables authId and usersGroupsId for ad/standalone edit
- Removes hardcoded paths by unifying all smb resources path with constants
- Renames smb/clusters -> smb/cluster path for consistency
- Reorders task messages

Fixes: https://tracker.ceph.com/issues/70946
Signed-off-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
2 months agoMerge pull request #62869 from afreen23/wip-nvme
afreen23 [Tue, 22 Apr 2025 07:46:39 +0000 (13:16 +0530)]
Merge pull request #62869 from afreen23/wip-nvme

mgr/dashboard: Fix pool update on edit

Reviewed-by: Nizamudeen A <nia@redhat.com>
2 months agoosd: Introduce optimized EC 62556/head
Alex Ainscow [Thu, 3 Apr 2025 13:47:28 +0000 (14:47 +0100)]
osd: Introduce optimized EC

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agoosd: Install stub extent cache in OSD. 62555/head
Alex Ainscow [Mon, 7 Apr 2025 08:20:44 +0000 (09:20 +0100)]
osd: Install stub extent cache in OSD.

The extent cache in new EC is a per OSD-shard cache which caches
reads used by read-modify-write to improve performance of sequential
IO. We want to provide a single PR with all of EC in it, so this
PR provides a non-functional stub to allow all the non-EC code to
be installed.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agoosd: New options for configuring new EC
Alex Ainscow [Thu, 27 Mar 2025 15:37:57 +0000 (15:37 +0000)]
osd: New options for configuring new EC

Adding three new configuration options which will apply once new EC
is in place:

osd_pool_default_flag_ec_optimizations

This allows EC optimizations to be turned on by default.

ec_extent_cache_size

This allows the user to specify the size of the per-shard extent cache if
they feel that the default 10MiB is too large or too small.

The default value may well change following more extensive testing.

ec_pd_write_mode

This is a development flag for testing the parity delta write RMW mechanism
within the EC code.  Setting to anything other than 0 will cause performance
problems.  It is provided as a test mechanism for performance and
teuthology.  Performance may wish to turn off all PDW writes for a particular
IO pattern. This will allow us to determine if the automatic mode should be
using conventional RMW writes.  The force-on mode allows testing on more
unusual scenarios and on smaller configurations.

Finally, we tweak the way optimisations are enabled, so as to be common between
enabling and default-enabled.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agoosd: Add interface for new EC to determine the acting recovery backfill shard id...
Alex Ainscow [Thu, 27 Mar 2025 14:57:27 +0000 (14:57 +0000)]
osd: Add interface for new EC to determine the acting recovery backfill shard id set.

This is used to optimise the set of shards that will be written to by
EC writes.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agocrimson/osd: Add scrub stubs for crimson and classic, ready for new EC
Alex Ainscow [Thu, 27 Mar 2025 14:38:44 +0000 (14:38 +0000)]
crimson/osd: Add scrub stubs for crimson and classic, ready for new EC

The new optimised EC code is not backward compatible with old EC code.
Before this commit there is some stub code which assumes that an hinfo
xattr will exist and can be used for scrub. This is no longer the case in new EC.

We plan to first make the scrubbing changes for new EC in classic and will
subsequently port to crimson. It will not look like the code here, so there is
little point in keeping it.

Additionally, add some stubs for scrub in classic optimized EC.

There will be a later PR specifically for dealing with scrubbing in
new EC which will fix all the FIXMEs in classic.

The crimson code will be fixed up at a later date and will only
support optimised EC.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agotest: Add mempool (but don't use it yet)
Alex Ainscow [Thu, 27 Mar 2025 13:52:05 +0000 (13:52 +0000)]
test: Add mempool (but don't use it yet)

Optimised EC has an extent cache which consumes memory resources. As such, we create a new mempool to track these.  This is not yet used.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agoosd: Fix some signing and typo issues in debug output
Alex Ainscow [Thu, 27 Mar 2025 13:11:02 +0000 (13:11 +0000)]
osd: Fix some signing and typo issues in debug output

Some dout messages contained unsigned shard ids.  These should be signed, so that invalid shard ids show as -1, rather than max int.

Also a very basic typo.
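
The effect described above can be reproduced in isolation. This is a minimal sketch, assuming a small signed shard id type whose invalid value is -1; `shard_t`, `NO_SHARD`, and the helper functions are hypothetical names for illustration and are not taken from the Ceph code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical shard id type; -1 marks "no shard".
using shard_t = std::int8_t;
constexpr shard_t NO_SHARD = -1;

// Formatting through an unsigned type wraps -1 around to the maximum value.
std::string format_unsigned(shard_t s) {
  return std::to_string(static_cast<std::uint32_t>(s));
}

// Formatting through a signed type keeps -1 readable.
std::string format_signed(shard_t s) {
  return std::to_string(static_cast<int>(s));
}

// Documents the difference between the two formatting paths.
void demo_shard_formatting() {
  assert(format_unsigned(NO_SHARD) == "4294967295");
  assert(format_signed(NO_SHARD) == "-1");
}
```

In debug output the signed form makes an invalid shard id recognisable at a glance.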

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agocommon: Generalise to_interval_set to allow more interval_set implementations.
Alex Ainscow [Fri, 28 Mar 2025 13:46:30 +0000 (13:46 +0000)]
common: Generalise to_interval_set to allow more interval_set implementations.

This generalises to_interval_set so that the interval set does not need to share a
common internal map structure with interval_map. The implementation is achieved
through iteration, so there is no requirement for the old restriction.
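
The idea of building the set purely by iteration can be sketched as follows. This is an illustration under stated assumptions, not the Ceph implementation: `IntervalMap`, `MapSet`, `VectorSet`, and this `to_interval_set` signature are hypothetical, and the only requirement placed on the target type is an `insert(offset, length)` member.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an interval map: offset -> (length, payload).
using IntervalMap = std::map<std::uint64_t, std::pair<std::uint64_t, char>>;

// One set implementation: an ordered map that merges adjacent extents.
struct MapSet {
  std::map<std::uint64_t, std::uint64_t> extents;  // offset -> length
  void insert(std::uint64_t off, std::uint64_t len) {
    if (!extents.empty()) {
      auto last = std::prev(extents.end());
      std::uint64_t end = last->first + last->second;
      assert(end <= off);  // sketch assumes ascending, non-overlapping inserts
      if (end == off) {    // adjacent to the previous extent: extend it
        last->second += len;
        return;
      }
    }
    extents[off] = len;
  }
};

// A second implementation with a completely different internal layout.
struct VectorSet {
  std::vector<std::pair<std::uint64_t, std::uint64_t>> extents;
  void insert(std::uint64_t off, std::uint64_t len) {
    extents.emplace_back(off, len);
  }
};

// Works for any set type offering insert(offset, length); no shared
// internal map structure is needed because the map is only iterated.
template <typename IntervalSet>
IntervalSet to_interval_set(const IntervalMap& m) {
  IntervalSet out;
  for (const auto& kv : m) {
    out.insert(kv.first, kv.second.first);
  }
  return out;
}
```

Calling `to_interval_set<MapSet>(m)` and `to_interval_set<VectorSet>(m)` on the same map yields two differently structured sets covering the same ranges.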

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agocommon: bitset_set
Alex Ainscow [Thu, 27 Mar 2025 11:44:29 +0000 (11:44 +0000)]
common: bitset_set

This bitset_set change relaxes policing of bitset_set, so that
out-of-range can be queried in the contains interface. This means
that callers can simplify calls.  For example:

 if ((key == invalid) || !set.contains(key)) {
  do_stuff
 }

 becomes

  if (!set.contains(key)) {
   do_stuff
  }
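
A self-contained sketch of the relaxed behaviour, assuming a fixed-capacity set backed by `std::bitset`. The class name `small_bitset_set` and its interface are hypothetical and only illustrate the idea; they are not the Ceph `bitset_set` API.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity set of small integer keys.
template <std::size_t N>
class small_bitset_set {
  std::bitset<N> bits_;

 public:
  // Inserting still requires a key inside the valid range.
  void insert(int key) {
    assert(key >= 0 && static_cast<std::size_t>(key) < N);
    bits_.set(static_cast<std::size_t>(key));
  }

  // Querying an out-of-range key is allowed and reports "not present".
  bool contains(int key) const {
    if (key < 0 || static_cast<std::size_t>(key) >= N) {
      return false;
    }
    return bits_.test(static_cast<std::size_t>(key));
  }
};
```

With this shape, a caller holding a possibly invalid key (for example -1) can test membership directly, without a separate validity check.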

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
2 months agoosd: EC Optimizations: proc_master_log changes for partial logs 62523/head
Bill Scales [Wed, 26 Mar 2025 13:43:43 +0000 (13:43 +0000)]
osd: EC Optimizations: proc_master_log changes for partial logs

proc_master_log is part of the peering process that merges
the authoritative log (in the case of EC pools the log of the
shard missing the most updates) into the primary log.

When there are partial writes it is likely that the
authoritative log is behind because of partial writes that
did not update that shard. proc_master_log works out where
the logs diverge and then studies each additional log entry
to see if all the updates made in that log entry have been
applied. If any shard is missing an update then that log
entry (and all subsequent entries) need to be rolled back,
otherwise the entry can be rolled forward and included in
the authoritative log.
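
The roll-forward decision described above can be sketched as follows. This is a simplified model, not the proc_master_log code: `LogEntry`, `count_roll_forward`, and the representation of "applied" as a per-shard highest applied version are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical log entry: a version plus the shards the write touched.
struct LogEntry {
  std::uint64_t version;
  std::set<int> written_shards;
};

// Given the entries that extend past the authoritative log, return how many
// leading entries can be rolled forward. The first entry with a shard that
// has not applied its update, and every entry after it, must be rolled back.
std::size_t count_roll_forward(
    const std::vector<LogEntry>& extra,
    const std::map<int, std::uint64_t>& applied_version) {
  std::size_t n = 0;
  for (const LogEntry& e : extra) {
    // The sketch assumes entries are ordered by increasing version.
    assert(n == 0 || extra[n - 1].version < e.version);
    for (int shard : e.written_shards) {
      auto it = applied_version.find(shard);
      if (it == applied_version.end() || it->second < e.version) {
        return n;  // this shard is missing the update: stop here
      }
    }
    ++n;  // every shard touched by this entry has applied it
  }
  return n;
}
```

Shards that a partial write did not touch are never consulted for that entry, which is what allows a log that is "behind" to still be extended.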

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
2 months agoosd: EC Optimizations: Peer changes for partial logs
Bill Scales [Wed, 26 Mar 2025 13:25:07 +0000 (13:25 +0000)]
osd: EC Optimizations: Peer changes for partial logs

Changes to peering for replica/strays to handle partial
logs. For EC optimized pools shards may not have a complete
log if there have been partial writes that did not update
the shard. If the most recent entries in the log have all
skipped updating a shard then it will have a log that ends
earlier than other shards. During peering the primary which
has a full copy of the log works out whether other shards
have any missing objects and then communicates this to
the replica/stray shards during activation.

The primary uses the partial write last complete data in
pg_info_t to explain to other shards if they are missing
log entries and just need to update last_update and
last_complete.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
2 months agoosd: EC Optimizations: Get missing changes for partial logs
Bill Scales [Wed, 26 Mar 2025 13:15:45 +0000 (13:15 +0000)]
osd: EC Optimizations: Get missing changes for partial logs

Changes to the get missing step of peering to handle partial
writes. Having established the authoritative log the primary
works out which shards are missing objects. With partial
writes this code needs to differentiate between a shard that
missed an update (and hence has a missing object) versus a
shard that was not updated by a partial write. The divergent
log entries are examined to see if the updates were partial
writes that did not involve the shard.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
2 months agoosd: EC Optimizations: PG log changes for partial logs
Bill Scales [Wed, 26 Mar 2025 13:04:46 +0000 (13:04 +0000)]
osd: EC Optimizations: PG log changes for partial logs

Optimized EC pools will not add a log entry for shards that
are not modified by a partial write. This means the shard
will have a partial copy of the log.

There are several asserts in PGLog that assume that the log
is contiguous; these need to be relaxed when it is an optimized
EC pool (other pools retain the full strength asserts).

During peering the primary may provide a complete log to a
non-primary shard to merge into its log. This merge can skip
log entries for partial writes that do not update the shard.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
2 months agoosd: EC optimizations: Get log twice when auth_log_shard is a non-primary
Bill Scales [Wed, 26 Mar 2025 12:40:13 +0000 (12:40 +0000)]
osd: EC optimizations: Get log twice when auth_log_shard is a non-primary

When an event such as splitting the PG occurs the new primary does
not have any log at the start of peering. Non-primary shards in an
EC optimized pool may not have a complete log of writes due to
partial writes. If the chosen authoritative shard is a non-primary
shard then the new primary needs to first get a full copy of the
log (which extends past the authoritative shard log) from another
shard and then repeat the get log step to get the authoritative
shard's log so it can be merged rewinding divergent entries.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
2 months agoosd: EC Optimizations: Partial write changes to add_next_event
Bill Scales [Thu, 6 Mar 2025 09:47:17 +0000 (09:47 +0000)]
osd: EC Optimizations: Partial write changes to add_next_event

add_next_event is used during peering to process log entries
that a shard is missing to build up a list of missing objects.
With EC optimized pools and partial writes not every update
modifies every shard. The log entry contains details of which
shards were modified and this can be used to work out whether
a missing entry needs to be created/updated.
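
A minimal sketch of that check, under stated assumptions: `PartialLogEntry`, `MissingMap`, and `add_next_event_sketch` are hypothetical names, and the missing list is modelled as a simple map from object name to the version that is needed.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical log entry that records which shards the write modified.
struct PartialLogEntry {
  std::string object;
  std::uint64_t version;
  std::set<int> written_shards;
};

// Hypothetical missing list: object name -> version that is needed.
using MissingMap = std::map<std::string, std::uint64_t>;

// Process one log entry that shard `my_shard` has not seen. Only entries
// that actually modified this shard create or update a missing entry.
void add_next_event_sketch(MissingMap& missing, int my_shard,
                           const PartialLogEntry& e) {
  assert(my_shard >= 0);
  if (e.written_shards.count(my_shard) == 0) {
    return;  // partial write skipped this shard: nothing is missing
  }
  missing[e.object] = e.version;
}
```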

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
2 months agoosd: EC Optimizations: Relax reset_complete_to for partial writes 62522/head
Bill Scales [Wed, 26 Mar 2025 10:46:07 +0000 (10:46 +0000)]
osd: EC Optimizations: Relax reset_complete_to for partial writes

EC Optimized pools can have shards missing log entries because
of partial writes. This means it is possible to have a missing
entry with a newer version than the log. Relax an assert in
reset_complete_to to allow this.

reset_complete_to also resets last_complete to 0 when the
oldest missing object is before the first log entry. This
is too aggressive for partial writes and needs to be relaxed.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
2 months agoosd: EC Optimizations: Add shard_id_sets for backfill_target and ...
Bill Scales [Wed, 26 Mar 2025 10:05:07 +0000 (10:05 +0000)]
osd: EC Optimizations: Add shard_id_sets for backfill_target and ...
acting_recovery_backfill

Optimized EC code uses shard_id_sets as a convenient and fast way of
representing sets of shards. Peering calculates a backfill_target set
and an acting_recovery_backfill set as a map of pg_shard_ids, and
these are then used while processing I/O requests.

Modify peering so that it initializes a shard_id_set version of
these two sets and makes these available to ECBackend code.
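
As an illustration of why a compact shard set is convenient, the conversion might look like the sketch below. The types are assumptions: `pg_shard` is modelled as an (OSD id, shard id) pair, `shard_bits` as a `std::bitset`, and `MAX_SHARDS` is an arbitrary capacity chosen for the example. None of these are the Ceph definitions.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical capacity and types for the sketch.
constexpr std::size_t MAX_SHARDS = 64;
using shard_bits = std::bitset<MAX_SHARDS>;
using pg_shard = std::pair<int, int>;  // (osd id, shard id)

// Reduce a collection of (osd, shard) pairs to the set of shard ids,
// so that later membership tests are a single bit lookup.
shard_bits to_shard_bits(const std::set<pg_shard>& shards) {
  shard_bits out;
  for (const pg_shard& s : shards) {
    assert(s.second >= 0 &&
           static_cast<std::size_t>(s.second) < MAX_SHARDS);
    out.set(static_cast<std::size_t>(s.second));
  }
  return out;
}
```

Computing this once during peering, rather than on every I/O request, keeps the per-request path to cheap bit tests.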

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
2 months agoosd: EC Optimizations: Update pwlc for split/merge
Bill Scales [Wed, 26 Mar 2025 08:55:44 +0000 (08:55 +0000)]
osd: EC Optimizations: Update pwlc for split/merge

Update pwlc data in pg_info_t when splitting and
merging PGs.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>