git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
7 weeks agoMerge pull request #64594 from Hezko/wip-72181-tentacle
afreen23 [Tue, 22 Jul 2025 07:05:00 +0000 (12:35 +0530)]
Merge pull request #64594 from Hezko/wip-72181-tentacle

tentacle: mgr/dashboard: nvmeof cli feedback fixes

Reviewed-by: Afreen Misbah <afreen@ibm.com>
7 weeks agoMerge pull request #64506 from NitzanMordhai/wip-72119-tentacle
Yuri Weinstein [Mon, 21 Jul 2025 14:49:21 +0000 (07:49 -0700)]
Merge pull request #64506 from NitzanMordhai/wip-72119-tentacle

tentacle: mixed balance read and rwordered in read ops

Reviewed-by: Laura Flores <lflores@redhat.com>
7 weeks agoMerge pull request #63803 from badone/tentacle
Yuri Weinstein [Mon, 21 Jul 2025 14:47:20 +0000 (07:47 -0700)]
Merge pull request #63803 from badone/tentacle

Tentacle: OSDMonitor: Make sure pcm is initialised

Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
7 weeks agoMerge pull request #64415 from ljflores/wip-72053-tentacle
Radoslaw Zarzynski [Mon, 21 Jul 2025 13:48:56 +0000 (15:48 +0200)]
Merge pull request #64415 from ljflores/wip-72053-tentacle

tentacle: osd: Multiple fixes to optimized EC and peering

Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
7 weeks agoMerge pull request #64576 from ronen-fr/wip-rf-64567-tentacle
Ronen Friedman [Mon, 21 Jul 2025 12:54:16 +0000 (15:54 +0300)]
Merge pull request #64576 from ronen-fr/wip-rf-64567-tentacle

tentacle: osd/scrub: allow auto-repair on operator-initiated scrubs

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
7 weeks agomgr/dashboard: nvmeof cli rename ns to namespace, fixes for text responses, subsys... 64594/head
Tomer Haskalovitch [Sun, 6 Jul 2025 20:15:50 +0000 (23:15 +0300)]
mgr/dashboard: nvmeof cli rename ns to namespace, fixes for text responses, subsys add params

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit 702dfddf23036e6ec79e4b9d5eac7d09637971b8)

8 weeks agoMerge pull request #64570 from Hezko/wip-72180-tentacle
afreen23 [Mon, 21 Jul 2025 06:37:58 +0000 (12:07 +0530)]
Merge pull request #64570 from Hezko/wip-72180-tentacle

tentacle: mgr/dashboard: support human friendly size parameter split commands to separate api and cli functions

Reviewed-by: Nizamudeen A <nia@redhat.com>
8 weeks agoMerge pull request #64542 from Hezko/wip-72165-tentacle
afreen23 [Mon, 21 Jul 2025 06:37:01 +0000 (12:07 +0530)]
Merge pull request #64542 from Hezko/wip-72165-tentacle

tentacle: mgr/dashboard: add help for nvmeof cli

Reviewed-by: Nizamudeen A <nia@redhat.com>
8 weeks agoosd/scrub: allow auto-repair on operator-initiated scrubs 64576/head
Ronen Friedman [Thu, 17 Jul 2025 16:59:00 +0000 (11:59 -0500)]
osd/scrub: allow auto-repair on operator-initiated scrubs

Previously, operator-initiated scrubs would not auto-repair, regardless
of the value of the 'osd_scrub_auto_repair' config option.  This was
less confusing to the operator than it could have been, as most
operator commands would in fact cause a regular periodic scrub
to be initiated. However, that quirk is now fixed: operator commands
now trigger 'op-initiated' scrubs. Thus the need for this patch.

The original bug was fixed in https://github.com/ceph/ceph/pull/54615,
but was unfortunately re-introduced later on.
Fixes: https://tracker.ceph.com/issues/72178
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 97de817ad1c253ee1c7c9c9302981ad2435301b9)
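
The behaviour change described in this commit message can be pictured as a small decision function. This is only a sketch: `ScrubKind`, `should_auto_repair`, and the `fixed` switch are hypothetical names introduced for illustration and are not Ceph code. Only the `osd_scrub_auto_repair` option name comes from the message, and the sketch ignores every other condition the OSD may apply before repairing.

```python
from enum import Enum


class ScrubKind(Enum):
    """Hypothetical labels for the two scrub origins named in the message."""
    PERIODIC = "periodic"
    OP_INITIATED = "op-initiated"


def should_auto_repair(kind: ScrubKind,
                       osd_scrub_auto_repair: bool,
                       fixed: bool = True) -> bool:
    """Return whether a scrub of the given kind may auto-repair.

    fixed=False models the behaviour described as the bug: operator-initiated
    scrubs never auto-repair, whatever the config option says.
    """
    if kind is ScrubKind.OP_INITIATED and not fixed:
        return False
    # With the fix, both kinds simply follow the config option.
    return osd_scrub_auto_repair
```

With `fixed=True`, an op-initiated scrub follows the option in the same way a periodic scrub does.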

8 weeks agoMerge pull request #64442 from ronen-fr/wip-rf-noempty-64429-tentacle
SrinivasaBharathKanta [Fri, 18 Jul 2025 03:26:28 +0000 (08:56 +0530)]
Merge pull request #64442 from ronen-fr/wip-rf-noempty-64429-tentacle

tentacle: qa/standalone/scrub: fix "scrubbed in 0ms" in osd-scrub-test.sh

8 weeks agoMerge pull request #64419 from ljflores/wip-72023-tentacle
SrinivasaBharathKanta [Fri, 18 Jul 2025 03:26:17 +0000 (08:56 +0530)]
Merge pull request #64419 from ljflores/wip-72023-tentacle

tentacle: qa/tasks: generalize stuck pg ignorelist entry

8 weeks agoMerge pull request #64414 from ljflores/wip-72052-tentacle
Laura Flores [Thu, 17 Jul 2025 20:19:58 +0000 (15:19 -0500)]
Merge pull request #64414 from ljflores/wip-72052-tentacle

tentacle: Optimised EC: Ignore snapshot scrubbing on non-primary shards

8 weeks agomgr/dashboard: split ns add to separate api and cli functions 64570/head
Tomer Haskalovitch [Mon, 14 Jul 2025 18:53:30 +0000 (21:53 +0300)]
mgr/dashboard: split ns add to separate api and cli functions

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit f93afc474675f364972ee2719ad284f0ac850740)

8 weeks agoMerge pull request #64556 from zdover23/wip-doc-2025-07-17-backport-64537-to-tentacle...
Anthony D'Atri [Thu, 17 Jul 2025 13:32:23 +0000 (09:32 -0400)]
Merge pull request #64556 from zdover23/wip-doc-2025-07-17-backport-64537-to-tentacle-take-two

tentacle: doc/radosgw: Improve formatting and language in bucket_logging.rst

8 weeks agodoc/radosgw: Improve formatting and language in bucket_logging.rst 64556/head
Ville Ojamo [Wed, 16 Jul 2025 07:14:26 +0000 (14:14 +0700)]
doc/radosgw: Improve formatting and language in bucket_logging.rst

Trim trailing extra line characters around main title.

Add missing full stops in list items.

Use double backticks for configuration options, data etc.

Linkify reference to REST API.

No hyphen in "regular expression".

Fix section hierarchy by moving "Log Records" up 2 levels and try to
make the section title more consistent with another section title.

Try to improve partial sentences and try to simplify one sentence.

Remove whitespace at otherwise empty line.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
(cherry picked from commit a19834c2bbdef6feb7b4bf5266d40f4d427a8247)

8 weeks agoMerge pull request #64515 from afreen23/wip-72146-tentacle
afreen23 [Thu, 17 Jul 2025 09:16:01 +0000 (14:46 +0530)]
Merge pull request #64515 from afreen23/wip-72146-tentacle

tentacle: mgr/dashboard: Fix smb module enablement

Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
8 weeks agomgr/dashboard: add help for nvmeof cli 64542/head
Tomer Haskalovitch [Tue, 15 Jul 2025 07:40:07 +0000 (10:40 +0300)]
mgr/dashboard: add help for nvmeof cli

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit f7f93b2c7a8bf3730fe4f82a9f4a30bb2ee89b68)

8 weeks agoMerge pull request #63773 from zdover23/wip-doc-2025-06-06-backport-63740-to-tentacle
Zac Dover [Thu, 17 Jul 2025 06:35:27 +0000 (16:35 +1000)]
Merge pull request #63773 from zdover23/wip-doc-2025-06-06-backport-63740-to-tentacle

tentacle: doc/mgr: edit telemetry (3 of x)

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
8 weeks agoMerge pull request #64546 from zdover23/wip-doc-2025-07-17-backport-64532-to-tentacle
Zac Dover [Thu, 17 Jul 2025 06:34:46 +0000 (16:34 +1000)]
Merge pull request #64546 from zdover23/wip-doc-2025-07-17-backport-64532-to-tentacle

tentacle: doc/radosgw: edit "Lifecycle Settings"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
8 weeks agodoc/mgr: edit telemetry (3 of x) 63773/head
Zac Dover [Thu, 5 Jun 2025 02:24:08 +0000 (12:24 +1000)]
doc/mgr: edit telemetry (3 of x)

Improve the English and the formatting in doc/mgr/telemetry.rst. This
follows up on https://github.com/ceph/ceph/pull/63476.

This commit edits the third hundred lines in doc/mgr/telemetry.rst.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 3ce61e065121e07e2c37097f1fe6736bdf985e8e)

8 weeks agodoc/radosgw: edit "Lifecycle Settings" 64546/head
Zac Dover [Wed, 16 Jul 2025 12:11:03 +0000 (22:11 +1000)]
doc/radosgw: edit "Lifecycle Settings"

Edit the section "Lifecycle Settings" in the file
doc/radosgw/config-ref.rst. Remove solecisms and pleonasms and plain old
infelicitous formulations.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit ac2e5f502523d1bf326303e904ccb47236c81fcb)

8 weeks agoMerge pull request #64533 from zdover23/wip-doc-2025-07-16-backport-64328-to-tentacle
Zac Dover [Thu, 17 Jul 2025 03:29:31 +0000 (13:29 +1000)]
Merge pull request #64533 from zdover23/wip-doc-2025-07-16-backport-64328-to-tentacle

tentacle: doc/rgw/logging: fix journal record format

Reviewed-by: Yuval Lifshitz <ylifshit@ibm.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
8 weeks agoMerge pull request #63808 from zdover23/wip-doc-2025-06-09-backport-63781-to-tentacle
Zac Dover [Thu, 17 Jul 2025 03:28:46 +0000 (13:28 +1000)]
Merge pull request #63808 from zdover23/wip-doc-2025-06-09-backport-63781-to-tentacle

tentacle: doc/mgr: edit telemetry.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
8 weeks agoMerge pull request #64529 from zdover23/wip-doc-2025-07-16-backport-64433-to-tentacle
Anthony D'Atri [Wed, 16 Jul 2025 14:22:37 +0000 (10:22 -0400)]
Merge pull request #64529 from zdover23/wip-doc-2025-07-16-backport-64433-to-tentacle

tentacle: doc: update mgr modules notify_types

8 weeks agodoc/mgr: edit telemetry.rst 63808/head
Zac Dover [Fri, 6 Jun 2025 05:11:15 +0000 (15:11 +1000)]
doc/mgr: edit telemetry.rst

Edit doc/mgr/telemetry.rst.

Incorporate Anthony D'Atri's suggestions from
https://github.com/ceph/ceph/pull/63739.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit ac7f757db7b3644761a2295cfe5e1a9a55319f72)

8 weeks agodoc/rgw/logging: fix journal record format 64533/head
Yuval Lifshitz [Thu, 3 Jul 2025 10:24:30 +0000 (10:24 +0000)]
doc/rgw/logging: fix journal record format

Fixes: https://tracker.ceph.com/issues/71945
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
(cherry picked from commit 2dd5edf17aed392dc51a0fe9d55fa9963574ced1)

8 weeks agodoc: update mgr modules notify_types 64529/head
Nitzan Mordechai [Thu, 10 Jul 2025 10:03:06 +0000 (10:03 +0000)]
doc: update mgr modules notify_types

Signed-off-by: Nitzan Mordechai <nmordec@redhat.com>
(cherry picked from commit fc4396d6280fcbf0a95567cff144052d81dcd964)

2 months agomgr/dashboard: Fix smb module enablement 64515/head
Afreen Misbah [Thu, 10 Jul 2025 21:15:08 +0000 (02:45 +0530)]
mgr/dashboard: Fix smb module enablement

- changed prop name to `module_name` to avoid confusion while passing input props
- the module name is required to enable module

Signed-off-by: Afreen Misbah <afreen@ibm.com>
(cherry picked from commit 8f2a88eb4dd8e779044e7cd5b48c90f290303912)

2 months agoMerge pull request #64495 from zdover23/wip-doc-2025-07-15-backport-63877-to-tentacle
Anthony D'Atri [Tue, 15 Jul 2025 13:45:03 +0000 (09:45 -0400)]
Merge pull request #64495 from zdover23/wip-doc-2025-07-15-backport-63877-to-tentacle

tentacle: doc/rados/ops: edit cache-tiering.rst

2 months agoPrimaryLogPG: don't accept ops with mixed balance_reads and rwordered flags 64506/head
Nitzan Mordechai [Mon, 14 Apr 2025 11:50:23 +0000 (11:50 +0000)]
PrimaryLogPG: don't accept ops with mixed balance_reads and rwordered flags

do_op can't accept ops that mix the rwordered and balance_read flags

Fixes: https://tracker.ceph.com/issues/70715
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
(cherry picked from commit f68b2178a24613960fe1303ece413b24f3ea02e7)

2 months agoObjecter: remove balance_read and localize_read if rwordered
Nitzan Mordechai [Mon, 14 Apr 2025 11:49:30 +0000 (11:49 +0000)]
Objecter: remove balance_read and localize_read if rwordered

Objecter shouldn't send ops with mixed rwordered and balance_read flags

Fixes: https://tracker.ceph.com/issues/70715
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
(cherry picked from commit 40292f2fd10f338c9baab60a019dfe4806e642c7)
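
The two commits above describe a client-side and an OSD-side half of the same rule. A minimal sketch of that rule as bitmask logic follows. The flag constants and both function names are assumptions made for illustration: the bit values are placeholders, not taken from Ceph's headers, and the real Objecter and `do_op` code paths are more involved.

```python
# Placeholder bit values, chosen only so that the flags are distinct.
FLAG_BALANCE_READS = 0x0100
FLAG_LOCALIZE_READS = 0x2000
FLAG_RWORDERED = 0x4000


def sanitize_client_flags(flags: int) -> int:
    """Client side: drop the replica-read hints when the op is rwordered."""
    if flags & FLAG_RWORDERED:
        flags &= ~(FLAG_BALANCE_READS | FLAG_LOCALIZE_READS)
    return flags


def osd_accepts(flags: int) -> bool:
    """OSD side: refuse an op that mixes rwordered with balance_reads."""
    return not (flags & FLAG_RWORDERED and flags & FLAG_BALANCE_READS)
```

An op that passes through `sanitize_client_flags` can no longer trip the `osd_accepts` check, which is why the fix is applied on both sides.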

2 months agodoc/rados/ops: edit cache-tiering.rst 64495/head
Zac Dover [Wed, 11 Jun 2025 12:44:32 +0000 (22:44 +1000)]
doc/rados/ops: edit cache-tiering.rst

Add material to doc/rados/operations/cache-tiering.rst, as suggested by
Anthony D'Atri in
https://github.com/ceph/ceph/pull/63745#discussion_r2127887785.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit d3c46820a5fc72391ef46ab4b03bbe867e0e51d2)

2 months agoMerge pull request #64491 from zdover23/wip-doc-2025-07-15-backport-64483-to-tentacle
Anthony D'Atri [Tue, 15 Jul 2025 02:36:32 +0000 (22:36 -0400)]
Merge pull request #64491 from zdover23/wip-doc-2025-07-15-backport-64483-to-tentacle

tentacle: doc: add note admonitions in two files

2 months agodoc: add note admonitions in two files 64491/head
Zac Dover [Mon, 14 Jul 2025 14:40:21 +0000 (00:40 +1000)]
doc: add note admonitions in two files

Add note admonitions when discussing client package support in the
context of OS Recommendations in the following two files:

- doc/cephfs/ceph-dokan.rst
- doc/rbd/rbd-windows.rst

This addresses a change requested by Ilya Dryomov in
https://github.com/ceph/ceph/pull/64374#discussion_r2199756581.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 69d641f0207d803cd9a3c3e102d5b2073e6b0f77)

2 months agoMerge pull request #64480 from zdover23/wip-doc-2025-07-15-backport-64374-to-tentacle
Anthony D'Atri [Mon, 14 Jul 2025 17:22:20 +0000 (13:22 -0400)]
Merge pull request #64480 from zdover23/wip-doc-2025-07-15-backport-64374-to-tentacle

tentacle: doc: Clarify the status of MS Windows client support

2 months agodoc: Clarify the status of MS Windows client support 64480/head
Anthony D'Atri [Mon, 7 Jul 2025 15:47:02 +0000 (11:47 -0400)]
doc: Clarify the status of MS Windows client support

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 89eabfc3f0c8db3501b3ea3097e2983617c5234a)

2 months agoMerge pull request #64471 from zdover23/wip-doc-2025-07-14-backport-64462-to-tentacle
Anthony D'Atri [Mon, 14 Jul 2025 13:30:23 +0000 (09:30 -0400)]
Merge pull request #64471 from zdover23/wip-doc-2025-07-14-backport-64462-to-tentacle

tentacle: doc/cephfs: Improve mount-using-fuse.rst

2 months agoMerge pull request #64474 from zdover23/wip-doc-2025-07-14-backport-63080-to-tentacle
Anthony D'Atri [Mon, 14 Jul 2025 13:28:36 +0000 (09:28 -0400)]
Merge pull request #64474 from zdover23/wip-doc-2025-07-14-backport-63080-to-tentacle

tentacle: doc/radosgw: Improve rgw-cache.rst

2 months agodoc/radosgw: Improve rgw-cache.rst 64474/head
Ville Ojamo [Wed, 30 Apr 2025 18:17:14 +0000 (01:17 +0700)]
doc/radosgw: Improve rgw-cache.rst

Try to improve the language by completely rewriting some sentences.
Attempt to format the document more like the rest of the docs.
Fix several errors in punctuation, capitalization, spaces etc.
Use blocks with bash prompts for CLI commands instead of hardcoded
prompts.
Fix section hierarchy and section title underline lengths.
Use admonition.

Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
(cherry picked from commit 6e836f8f1e1e53bc7f8d8b497960b100e6b625d6)

2 months agodoc/cephfs: Improve mount-using-fuse.rst 64471/head
Anthony D'Atri [Fri, 11 Jul 2025 19:02:45 +0000 (15:02 -0400)]
doc/cephfs: Improve mount-using-fuse.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 329ee7b3038e49cf0def2f2628444e3e90796c05)

2 months agoMerge pull request #64460 from Hezko/wip-72093-tentacle
afreen23 [Sun, 13 Jul 2025 19:54:42 +0000 (01:24 +0530)]
Merge pull request #64460 from Hezko/wip-72093-tentacle

tentacle: mgr/dashboard: add missing commands for subsystem: change_key and del…

Reviewed-by: Afreen Misbah <afreen@ibm.com>
2 months agomgr/dashboard: add missing commands for subsystem: change_key and del_key and missing... 64460/head
Tomer Haskalovitch [Tue, 8 Jul 2025 17:45:09 +0000 (20:45 +0300)]
mgr/dashboard: add missing commands for subsystem: change_key and del_key and missing params for host add

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit eaab5a0bee0fb53569efc3b3893725705eeba805)

2 months agoMerge pull request #64268 from zdover23/wip-doc-2025-06-30-backport-64164-to-tentacle
afreen23 [Fri, 11 Jul 2025 15:31:03 +0000 (21:01 +0530)]
Merge pull request #64268 from zdover23/wip-doc-2025-06-30-backport-64164-to-tentacle

tentacle: mgr/dashboard: Fix inline markup warning in API documentation

Reviewed-by: Afreen Misbah <afreen@ibm.com>
2 months agoMerge pull request #64360 from soumyakoduri/wip-skoduri-tentacle
Casey Bodley [Fri, 11 Jul 2025 14:35:36 +0000 (10:35 -0400)]
Merge pull request #64360 from soumyakoduri/wip-skoduri-tentacle

[rgw][tentacle] Add Restore support from Glacier/Tape cloud endpoints

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 months agoMerge pull request #64389 from shraddhaag/wip-72027-tentacle
Shraddha Agrawal [Fri, 11 Jul 2025 07:41:18 +0000 (13:11 +0530)]
Merge pull request #64389 from shraddhaag/wip-72027-tentacle

tentacle: mon/MgrStatMonitor.cc: cleanup pool_availability

2 months agorgw: Fix the version of struct RGWZoneParams 64360/head
Soumya Koduri [Sun, 6 Jul 2025 16:58:22 +0000 (22:28 +0530)]
rgw: Fix the version of struct RGWZoneParams

Fix the version of `restore_pool` and `dedup_pool` to be
compatible with earlier releases.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit b6fc0be439f79f2aef833d703f7a6f9c2e48de02)

2 months agorgw/cloud-restore: Update doc with new options added
Soumya Koduri [Fri, 4 Jul 2025 07:20:53 +0000 (12:50 +0530)]
rgw/cloud-restore: Update doc with new options added

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit a981b4c0245eeafe077042a045cb05eeec9d8161)

2 months agorgw/restore: Update to neorados FIFO routines
Soumya Koduri [Mon, 7 Jul 2025 09:41:06 +0000 (15:11 +0530)]
rgw/restore: Update to neorados FIFO routines

Use new neorados/FIFO routines to store restore state.

Note: Old librados ioctx is also still retained as it is needed
by RestoreRadosSerializer.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit faf06bca959d8e8f2d40f610ae2ed409a69271f6)

2 months agoMerge pull request #64294 from ivancich/wip-71777-tentacle
Yuri Weinstein [Thu, 10 Jul 2025 21:18:27 +0000 (14:18 -0700)]
Merge pull request #64294 from ivancich/wip-71777-tentacle

tentacle: rgw: make sure max_objs_per_shard is appropriate in debugging scenarios

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 months agoMerge pull request #64201 from ideepika/wip-71154-tentacle
Yuri Weinstein [Thu, 10 Jul 2025 21:17:32 +0000 (14:17 -0700)]
Merge pull request #64201 from ideepika/wip-71154-tentacle

tentacle: rgw: make keystone work without admin token(service ac requirement)

Reviewed-by: Adam Emerson <aemerson@redhat.com>
2 months agorgw/restore: Use strtoull to read size till 2^64
Soumya Koduri [Fri, 23 May 2025 21:39:50 +0000 (03:09 +0530)]
rgw/restore: Use strtoull to read size till 2^64

Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit b3c867a121a7315b5a9e2d30d0af44c08676f8ca)

2 months agorgw/cloud-restore: Fixing issues with initializing and resetting FIFO
Soumya Koduri [Fri, 23 May 2025 21:37:58 +0000 (03:07 +0530)]
rgw/cloud-restore: Fixing issues with initializing and resetting FIFO

In addition, add some more debug statements and do some code cleanup.

Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit f00ac7c96f0ac48e0ca487ecb5918db42e6cf234)

2 months agorgw/cloud-restore: Handle failure with adding restore entry
Soumya Koduri [Fri, 23 May 2025 20:25:30 +0000 (01:55 +0530)]
rgw/cloud-restore: Handle failure with adding restore entry

In case adding restore entry to FIFO fails, reset the `restore_status`
of that object as "RestoreFailed" so that restore process can be
retried from the end S3 user.

Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit 9974f51eb61603b8117d7b50e6b0b4614fcce721)
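
The failure handling described in this commit message could look roughly like the sketch below. Everything except the `restore_status` attribute and the "RestoreFailed" value is hypothetical: `FifoError`, `queue_restore`, the dict-based object, and the `push_to_fifo` callable are stand-ins for illustration, not RGW interfaces.

```python
class FifoError(Exception):
    """Hypothetical error raised when a restore entry cannot be queued."""


def queue_restore(obj: dict, push_to_fifo) -> bool:
    """Try to queue a restore entry for obj; mark the object on failure.

    Returns True if the entry was queued. On failure the object's
    restore_status is reset to "RestoreFailed" so that the S3 user can
    issue the restore request again.
    """
    try:
        push_to_fifo(obj["name"])
    except FifoError:
        obj["restore_status"] = "RestoreFailed"
        return False
    return True
```

The point of the reset is that a failed enqueue does not leave the object looking as if a restore were still pending.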

2 months agorgw/cloud-restore: Support restoration of objects transitioned to Glacier/Tape endpoint
Soumya Koduri [Wed, 30 Apr 2025 20:36:21 +0000 (02:06 +0530)]
rgw/cloud-restore: Support restoration of objects transitioned to Glacier/Tape endpoint

Restoration of objects from certain cloud services (like Glacier/Tape) could
take a significant amount of time (even days). Hence store the state of such restore requests
and periodically process them.

Brief summary of changes

* Refactored existing restore code to consolidate and move all restore processing into rgw_restore* file/class

* RGWRestore class is defined to manage the restoration of objects.

* Lastly, for SAL_RADOS, FIFO is used to store and read restore entries.

Currently, this PR handles storing state of restore requests sent to cloud-glacier tier-type which need async processing.
The changes are tested with AWS Glacier Flexible Retrieval with tier_type Expedited and Standard.

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit ef96bb0d6137bacf45b9ee2f99ad5bcd8b3b6add)

2 months agoMerge pull request #64264 from benhanokh/wip-71899-tentacle
Casey Bodley [Thu, 10 Jul 2025 16:28:59 +0000 (12:28 -0400)]
Merge pull request #64264 from benhanokh/wip-71899-tentacle

tentacle: rgw/dedup: full object dedup

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 months agoqa/standalone/scrub: fix "scrubbed in 0ms" in osd-scrub-test.sh 64442/head
Ronen Friedman [Thu, 10 Jul 2025 07:57:37 +0000 (02:57 -0500)]
qa/standalone/scrub: fix "scrubbed in 0ms" in osd-scrub-test.sh

The specific test looks for a 'last scrub duration' higher than
0 as a sign that the scrub actually ran.  Previous code fixes
guaranteed that even a scrub duration as low as 1ms would be
reported as "1" (1s).  However, none of the 15 objects created
in this test were designated for the tested PG, which remained
empty.  As a result, the scrub duration was reported as "0".

The fix is to create a large enough number of objects so that
at least one of them is mapped to the tested PG.

Fixes: https://tracker.ceph.com/issues/71801
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit b303afed7a8b2a65043f56170ed478f8d2bc591a)
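
To see why 15 objects can leave one particular PG empty, a rough probability model helps. The sketch below assumes each object lands in a uniformly random PG, independently of the others. That is only an approximation of the real name-hash placement, which is deterministic, and the pool's PG count is not given in the message, so `pg_num` is a free parameter here. Both function names are made up for this illustration.

```python
def prob_pg_empty(num_objects: int, pg_num: int) -> float:
    """Chance that one given PG receives none of num_objects objects,
    under the uniform, independent placement assumption."""
    return (1.0 - 1.0 / pg_num) ** num_objects


def objects_needed(pg_num: int, max_empty_prob: float) -> int:
    """Smallest object count for which the chance of the given PG staying
    empty is no greater than max_empty_prob."""
    n = 0
    while prob_pg_empty(n, pg_num) > max_empty_prob:
        n += 1
    return n
```

Under this model the chance of an empty PG shrinks geometrically with the object count, which matches the fix of simply creating more objects.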

2 months agoMerge pull request #63880 from Matan-B/wip-matanb-crimson-tentacle-63376
Matan Breizman [Thu, 10 Jul 2025 12:45:41 +0000 (15:45 +0300)]
Merge pull request #63880 from Matan-B/wip-matanb-crimson-tentacle-63376

tentacle: crimson/os/seastore/omap_manager: handle the cases in which omap nodes are rewritten before seen by users

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2 months agoMerge pull request #64404 from rhcs-dashboard/wip-72036-tentacle
Nizamudeen A [Thu, 10 Jul 2025 09:33:17 +0000 (15:03 +0530)]
Merge pull request #64404 from rhcs-dashboard/wip-72036-tentacle

tentacle: mgr/dashboard: rm requirements-extra file

2 months agoMerge pull request #64413 from Matan-B/wip-matanb-crimson-tentacle-compalint-time
Matan Breizman [Thu, 10 Jul 2025 07:42:41 +0000 (10:42 +0300)]
Merge pull request #64413 from Matan-B/wip-matanb-crimson-tentacle-compalint-time

tentacle: qa/config/crimson_qa_overrides: increase complaint time

Reviewed-by: Samuel Just <sjust@redhat.com>
2 months agoMerge pull request #64178 from adk3798/tentacle-nvmeof-interval
Yuri Weinstein [Wed, 9 Jul 2025 21:18:12 +0000 (14:18 -0700)]
Merge pull request #64178 from adk3798/tentacle-nvmeof-interval

tentacle: mgr/cephadm/nvmeof: Allow setting NVMEoF gateway read notifications interval in the spec file

Reviewed-by: Adam King <adking@redhat.com>
2 months agoMerge pull request #64300 from aainscow/wip-71874-tentacle
Yuri Weinstein [Wed, 9 Jul 2025 21:16:34 +0000 (14:16 -0700)]
Merge pull request #64300 from aainscow/wip-71874-tentacle

tentacle: Fix _setattr() with rare memory alignments

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
2 months agoqa/tasks: generalize stuck pg ignorelist entry 64419/head
Laura Flores [Tue, 4 Mar 2025 21:42:37 +0000 (15:42 -0600)]
qa/tasks: generalize stuck pg ignorelist entry

Fixes: https://tracker.ceph.com/issues/70307
Signed-off-by: Laura Flores <lflores@ibm.com>
(cherry picked from commit 26fdfbb9171664f69038b1fda26f4a225d7e52f8)

2 months agoosd: Fix extent cache unit test 64415/head
Alex Ainscow [Tue, 24 Jun 2025 12:18:50 +0000 (13:18 +0100)]
osd: Fix extent cache unit test

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 71c915486fd14ec8f7e3a135d7251327c324dbfc)

2 months agoosd: Optimised EC should avoid decodes off the end of objects.
Alex Ainscow [Fri, 6 Jun 2025 11:09:04 +0000 (12:09 +0100)]
osd: Optimised EC should avoid decodes off the end of objects.

This was a particular edge case whereby the need to do an encode and a decode as part of recovery
was causing EC to attempt to do a decode off the end of the shard, despite this being
unnecessary.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit bd9d44dd89a7fa587182cf49c1f444c79126a99e)

2 months agoosd: During recovery, pass "for_recovery" when attempting re-reads
Alex Ainscow [Fri, 23 May 2025 08:59:31 +0000 (09:59 +0100)]
osd: During recovery, pass "for_recovery" when attempting re-reads

get_all_remaining_reads() is used for two purposes:

1. If a read unexpectedly fails, recover from other shards.
2. If a shard is missing, but is allowed to be missing (typically
   due to unequal shard sizes), we rely on this function to return
   no new reads, without an error.

The test failure we saw was case (2), but I think case (1) is important.

Most of the time we probably would not notice, but if insufficient redundancy
exists without the for_recovery being set, then this will result in
recovery failing.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 7ab0588ad2853030bf0e716a5f887e0fe714b314)

2 months agoosd: In optimized EC send empty transactions to non-primaries if object is recovering.
Alex Ainscow [Thu, 22 May 2025 13:37:56 +0000 (14:37 +0100)]
osd: In optimized EC send empty transactions to non-primaries if object is recovering.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit ce5d58d704238816e0ddbb1279c273f5c520e5ac)

2 months agoosd: Improve backfill in new EC.
Alex Ainscow [Fri, 2 May 2025 09:11:45 +0000 (10:11 +0100)]
osd: Improve backfill in new EC.

In old EC, the full stripe was always read and written.  In new EC, we only attempt
to recover the shards that were missing. If an old OSD is available, the read can
be directed there.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 1becd2c5f6ec1d4c31059243ac247f046efd4fe3)

2 months agoosd: Cope with empty reads from an OSD without panic.
Alex Ainscow [Thu, 8 May 2025 15:14:03 +0000 (16:14 +0100)]
osd: Cope with empty reads from an OSD without panic.

If a ReadOp from EC contains two objects where one object only reads from a single shard, but
other objects require other shards, then this bug can be hit.  The fix should make it clear
what the issue is.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 8c92dcf6c0092f4dd140bb712ce4a38990a3cba2)

2 months agoosd: Recover non-primary shards with the correct version.
Alex Ainscow [Thu, 8 May 2025 13:22:36 +0000 (14:22 +0100)]
osd: Recover non-primary shards with the correct version.

Scrub revealed a bug whereby the non-primary shards were being given a
version number in the OI which did not match the expected version in
the authoritative OI.

A secondary issue is that all attributes were being pushed to the non
primary shards, whereas only OI is actually needed.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
# Conflicts:
# src/osd/osd_types.h
(cherry picked from commit 4332a188b72eea09c48d913d1b4576259a32a4a4)

2 months agoosd: Do not do a read-modify-write if op.delete_first is set
Alex Ainscow [Wed, 7 May 2025 09:33:24 +0000 (10:33 +0100)]
osd: Do not do a read-modify-write if op.delete_first is set

Some client OPs are able to generate transactions which delete an
object and then write it again. This is used by the copy-from ops.

If such a write is not 4k aligned, then the new EC code was incorrectly doing
a read-modify-write on the misaligned 4k.  This causes some
garbage to be written to the backend OSD, off the end of the
object.  This is only a problem if the object is later extended
without the end being written.

Problematic sequence is:

1. Create two objects (A and B) of size X and Y where:
X > Y, (Y % 4096) != 0

2. copy_from  OP B -> A
3. Extend B without writing offset Y+1

This will result in a corrupt data buffer at Y+1 without this fix.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit dafe15ec4d7fd6c7deb77cfa00a3e737c71545dc)
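
The precondition in step 1 of the sequence above is plain arithmetic and can be checked as follows. This is a sketch of one reading of the message, not Ceph code: `is_problematic` and `stale_range` are hypothetical helpers, and treating the bytes between Y and the next 4k boundary as the affected region is an interpretation of "off the end of the object".

```python
from typing import Tuple

ALIGN = 4096  # the 4k alignment unit mentioned in the commit message


def is_problematic(size_a: int, size_b: int) -> bool:
    """True when sizes X (object A) and Y (object B) meet the stated
    precondition: X > Y and Y % 4096 != 0."""
    return size_a > size_b and size_b % ALIGN != 0


def stale_range(size_b: int) -> Tuple[int, int]:
    """Half-open byte range [Y, next 4k boundary) that lies past the end of
    the object but inside the 4k block a read-modify-write would rewrite.
    The range is empty when Y is already aligned."""
    rounded_up = -(-size_b // ALIGN) * ALIGN
    return (size_b, rounded_up)
```

For example, with X = 8192 and Y = 5000 the precondition holds and the range past the object's end is 5000 to 8192.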

2 months agoosd: Fix off-by-one error in shard_extent_map.
Alex Ainscow [Thu, 1 May 2025 09:09:15 +0000 (10:09 +0100)]
osd: Fix off-by-one error in shard_extent_map.

Inserting the first parity buffer was causing the ro-range within the SEM to be incorrectly calculated.
Simple fix and I have added some unit tests to guard against this error in the future.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 2a569d8680ff64287efea5d4b1790b396aae86cc)

2 months agoosd: Invalidate CRC on all slice iterator calls.
Alex Ainscow [Wed, 30 Apr 2025 09:34:30 +0000 (10:34 +0100)]
osd: Invalidate CRC on all slice iterator calls.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 29039e9441e4c05c5b59ca1a1f1064fed44f4078)

2 months agoosd: Minor performance improvement in ECUtil.cc
Alex Ainscow [Fri, 25 Apr 2025 18:15:34 +0000 (19:15 +0100)]
osd: Minor performance improvement in ECUtil.cc

Change the code to avoid creating and then erasing an empty
shard.
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit c2d6414f659b123fa7060442bff7a90a7ceeb7c0)

2 months agoosd: Use correct shard_versions when queuing multiple writes in optimized EC.
Alex Ainscow [Fri, 25 Apr 2025 08:13:46 +0000 (09:13 +0100)]
osd: Use correct shard_versions when queuing multiple writes in optimized EC.

The problem was that incorrect shard_versions were being detected
by scrub. The comment in the code explains the detailed problem
and solution.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 6ab686fefcef4bf83e2b77164c39c9c7d00c567c)

2 months agoosd: clone + delete ops should invalidate source new EC extent cache.
Alex Ainscow [Thu, 24 Apr 2025 13:02:41 +0000 (14:02 +0100)]
osd: clone + delete ops should invalidate source new EC extent cache.

The op.is_delete() function only returns true if the op is not ALSO
doing something else (e.g. a clone). This causes issues with clearing
the new EC extent cache.

Also improve some debug and code cleanliness.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 1224063e8f9b6fcafcba399373a43f0bc7488bcc)

2 months agoosd: Make projected_size in ECTransaction const
Alex Ainscow [Thu, 24 Apr 2025 12:57:45 +0000 (13:57 +0100)]
osd: Make projected_size in ECTransaction const

This does not need to change once set, so adapt constructor to
allow it to be const.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 350f3a7a7a8511899adb52b499bc537e178acb0e)

2 months agoosd: Fix parity updates in truncates.
Alex Ainscow [Wed, 23 Apr 2025 14:41:11 +0000 (15:41 +0100)]
osd: Fix parity updates in truncates.

Previously in optimised EC, when truncating to a partial
stripe, the parity was not being updated.  This fix reads
the non-truncated data from the final stripe and calculates
parity updates, which are written to the parity shards.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 54a06c97a279d842db0f8059ef991245c3350171)
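
The parity fix described above can be illustrated with a minimal sketch. It assumes a single XOR parity chunk as a stand-in for the real erasure code plugin; the helper names `xor_parity` and `truncate_final_stripe` are hypothetical and are not taken from the Ceph source.

```python
def xor_parity(chunks):
    """XOR equally sized chunks together (stand-in for a real EC plugin)."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return bytes(parity)


def truncate_final_stripe(data_chunks, keep_bytes):
    """Truncate one stripe to keep_bytes of user data.

    The surviving (non-truncated) data is kept, the rest is zeroed, and
    the parity is recalculated from the result so that it stays
    consistent with the data shards.
    """
    chunk_size = len(data_chunks[0])
    new_chunks = []
    for index, chunk in enumerate(data_chunks):
        start = index * chunk_size
        kept = max(0, min(chunk_size, keep_bytes - start))
        new_chunks.append(chunk[:kept] + bytes(chunk_size - kept))
    return new_chunks, xor_parity(new_chunks)
```

In this toy model, skipping the parity recalculation would leave the old parity in place, which no longer matches the zeroed tail of the stripe.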

2 months agoosd: Fix EC cache invalidation bug
Alex Ainscow [Tue, 22 Apr 2025 12:41:19 +0000 (13:41 +0100)]
osd: Fix EC cache invalidation bug

With optimised EC, there were two bugs with cache invalidation:
1. If two invalidates were in the queue, it was possible for the second
invalidate to be cleared by the first.

2. Reads were being requested if size was being reduced.

Also, added a few debug improvements and some new asserts.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 7fad8b94a7509d052e4f6f773fe718b18964c282)

2 months agoosd: Fix panic on fast_read completion in optimised EC
Alex Ainscow [Thu, 17 Apr 2025 21:53:31 +0000 (22:53 +0100)]
osd: Fix panic on fast_read completion in optimised EC

The completion of sub reads was incorrectly marking all processed reads complete on the first read.

This was causing an early attempt at reconstruct, which panics.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit b72a13ef1fcde577edd5ed397e924178ca083f9b)

2 months agoosd: Use partial read path for fast_reads
Alex Ainscow [Thu, 17 Apr 2025 21:51:51 +0000 (22:51 +0100)]
osd: Use partial read path for fast_reads

Previously fast reads had attempted to read entire stripes.  This is not necessary or desirable.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 9ff7589c873d368cfa797acdd970278232a0c906)

2 months agoosd: Fix access-freed-memory issue in EC extent cache.
Alex Ainscow [Thu, 17 Apr 2025 16:23:04 +0000 (17:23 +0100)]
osd: Fix access-freed-memory issue in EC extent cache.

A very similar issue has been in product code, but this was found using valgrind.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 1d7425ef1621816e061c4a8d22d9e4f7617912e8)

2 months agoosd: Make EC alignment independent of page size.
Alex Ainscow [Wed, 9 Apr 2025 12:49:49 +0000 (13:49 +0100)]
osd: Make EC alignment independent of page size.

Code which manipulates full pages is often faster. To exploit this,
optimised EC was written to deal with 4k alignment wherever possible.
When inputs are not aligned, they are quickly aligned to 4k.

Not all architectures use 4k page sizes. Some Power architectures, for
example, have a 64k page size.  In such situations, it is unlikely that
using 64k page alignment will provide any performance boost; indeed, it
is likely to hurt performance significantly.  As such, EC has been
moved to maintain its own internal alignment (4k), which can be configured.

This has the added advantage that we can potentially tweak this
value in the future.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 6bbc3f9b9947575d91748bba849d9e7a1e7d27e5)
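
As a rough illustration of the idea, the sketch below aligns extents to a fixed constant rather than to the operating system page size. The name `EC_ALIGN_SIZE` and the helper functions are hypothetical; only the 4k value comes from the commit message.

```python
# Hypothetical constant: EC keeps its own alignment instead of using the
# platform page size (which may be 64k on some architectures).
EC_ALIGN_SIZE = 4096


def align_down(offset, align=EC_ALIGN_SIZE):
    """Round offset down to the nearest multiple of align."""
    return offset - offset % align


def align_up(offset, align=EC_ALIGN_SIZE):
    """Round offset up to the nearest multiple of align."""
    return -(-offset // align) * align


def aligned_extent(offset, length, align=EC_ALIGN_SIZE):
    """Return (start, length) of the smallest aligned extent covering the input."""
    start = align_down(offset, align)
    end = align_up(offset + length, align)
    return start, end - start
```

Because the alignment is an ordinary parameter here, a different value can be passed in without touching the callers, which mirrors the "can be configured" remark above.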

2 months agoosd: Fix Truncates in Optimised EC
Alex Ainscow [Wed, 16 Apr 2025 09:41:48 +0000 (10:41 +0100)]
osd: Fix Truncates in Optimised EC

The previous truncate code attempted to perform a non-aligned truncate by
creating a zero buffer at the end of the object, which was written.

The new code initially truncates to the exact size of the user object before
growing the object to the required 4k alignment. This simpler arrangement
also simplifies the rollback.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit d17f06b5fcb2fc749c4d0cbae9beb963bd06c145)
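
The two-step arrangement described above (exact truncate, then grow to alignment) might look like the following sketch. It models a shard as an in-memory `bytearray` and assumes `user_size` does not exceed the current length; `truncate_shard` is a hypothetical name.

```python
def truncate_shard(buf, user_size, align=4096):
    """Truncate buf to user_size, then zero-extend it to the next align boundary.

    Assumes user_size <= len(buf). Step 1 discards everything past the
    user-visible size; step 2 grows the object with zeros up to the
    required alignment.
    """
    del buf[user_size:]                          # step 1: exact truncate
    padded = -(-user_size // align) * align      # round up to alignment
    buf.extend(bytes(padded - len(buf)))         # step 2: grow with zeros
    return buf
```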

2 months agoosd: Fix written shards policing for multiple loops through generate.
Alex Ainscow [Wed, 16 Apr 2025 06:44:25 +0000 (07:44 +0100)]
osd: Fix written shards policing for multiple loops through generate.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit c5f67d3771384e7a780ed6488019fca2509e31c2)

2 months agoosd: Fix shard ordering bug
Alex Ainscow [Thu, 24 Apr 2025 14:14:08 +0000 (15:14 +0100)]
osd: Fix shard ordering bug

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 2ead42b8bdf836acd662537a4f2ec3e0b3b61a34)

2 months agoosd: Remove EC-optimized only flag for not reset_complete_to
Alex Ainscow [Tue, 10 Jun 2025 14:58:28 +0000 (15:58 +0100)]
osd: Remove EC-optimized only flag for not reset_complete_to

The protection here applies to non-optimized EC and replica shards, but will
not be exercised as much, so this is essentially a cleanup.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit ca7eebcce7a7f7879deac5b917c3fcaf2249da14)

2 months agoosd: EC optimizations fix bug when recovering only partial write objects
Bill Scales [Fri, 6 Jun 2025 12:28:14 +0000 (13:28 +0100)]
osd: EC optimizations fix bug when recovering only partial write objects

PGLog::reset_complete_to is not handling the scenario where all the
missing objects have a partial write that excludes updating the shard being
recovered as their most recent update. In this scenario the oldest need
is newer than the newest log entry. Setting last_complete to the head of the
log confuses the code and makes it think that recovery has completed.

The fix is to hold last_complete one entry behind the head of the log
until all missing objects have been recovered.

PGLog::recover_got already does this when an object is recovered and the
remaining objects to recover match this scenario, so this fix just makes
reset_complete_to behave the same way as recover_got.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit 8ca209e33709b1915858a4cd9747d6c580797a4c)
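
A simplified model of the behaviour described above, using plain integers for versions. This is only a sketch under stated assumptions: the log is non-empty and sorted ascending, `0` stands for "nothing complete", and the function is a hypothetical stand-in rather than the real PGLog::reset_complete_to.

```python
def reset_complete_to(log_versions, missing_need):
    """Return the last_complete version for a shard.

    log_versions: non-empty ascending list of log entry versions.
    missing_need: dict mapping object name -> version needed.
    """
    head = log_versions[-1]
    if not missing_need:
        # Nothing is missing, so the whole log is complete.
        return head
    oldest_need = min(missing_need.values())
    if oldest_need > head:
        # Every missing object needs a version newer than the newest log
        # entry (partial writes skipped this shard). Hold last_complete
        # one entry behind head so recovery is not considered finished.
        return log_versions[-2] if len(log_versions) > 1 else 0
    # Otherwise everything strictly older than the oldest need is complete.
    complete = [v for v in log_versions if v < oldest_need]
    return complete[-1] if complete else 0
```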

2 months agoosd: EC optimizations correct pwlc after PG split
Bill Scales [Thu, 5 Jun 2025 10:17:06 +0000 (11:17 +0100)]
osd: EC optimizations correct pwlc after PG split

When a PG splits, the log entries are divided between the two PGs;
this can result in PWLC referring to log entries in the other PG.
Roll back PWLC after the split so it is not further advanced than
the most recently completed log entry.

Non-primary shards can be missing log entries and may roll back
PWLC too far because of this; however, this does not matter
because a split occurs at the start of a peering cycle and these
shards will be updated with the correct PWLC from the primary
shard later in the peering cycle when they are activated.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit 6c04e4cf2b81c768eb313c3d0e6ac37d8e69b150)

2 months agoosd: EC optimizations overaggressive check for missing objects
Bill Scales [Mon, 26 May 2025 13:33:12 +0000 (14:33 +0100)]
osd: EC optimizations overaggressive check for missing objects

Relax an assert in read_log_and_missing for optimized EC
pools. Because the log may not have entries for partial
writes, but the missing list is calculated from the full
log, the need version for a missing item may be newer than
the latest log entry for that object.

ceph_objectstore_tool needs care because we don't want to add
extra dependencies. To minimise the dependencies, we always
relax the asserts when using this tool.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 20e883fedaf19293e939c4cac44de196bd6c9c19)

2 months agoosd: EC Optimizations fix proc_master_log handling of splits
Bill Scales [Wed, 21 May 2025 17:16:50 +0000 (18:16 +0100)]
osd: EC Optimizations fix proc_master_log handling of splits

For optimized EC pools proc_master_log needs to deal with
the other log being merged being behind the local log because
it is missing partial writes. This is done by finding the
point where the logs diverge and then checking whether local
log entries have been committed on all the shards.

A bug in this code meant that after a PG split (where there
may be gaps in the log due to entries moving to the other PG)
the divergence point was not found and committed
partial writes ended up being discarded, which creates
unfound objects.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit 5f687c4a182b18cab31476854a6b04a46e8c8464)

2 months agoosd: EC Optimizations fix missing call to partial_write
Bill Scales [Tue, 20 May 2025 10:37:05 +0000 (11:37 +0100)]
osd: EC Optimizations fix missing call to partial_write

When a shard is backfilling and it receives a log entry where the
transaction is not applied it can skip the roll forward by
immediately advancing crt. However it is still necessary to
call partial_write in this scenario to keep the pwlc information
up to date.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit ddc306255868f26ae0a3951710ef18207fff9b30)

2 months agoosd: Do not complete log on non primary until missing recovered.
Alex Ainscow [Fri, 16 May 2025 13:52:22 +0000 (14:52 +0100)]
osd: Do not complete log on non primary until missing recovered.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 7f134a32b38f16555984b06f87a8ce581b492cf9)

2 months agoosd: EC Optimizations bug fix for flip/flop acting set
Bill Scales [Wed, 14 May 2025 07:39:40 +0000 (08:39 +0100)]
osd: EC Optimizations bug fix for flip/flop acting set

Optimized EC pools have a set of non-primary shards which
cannot become the primary because they do not have all the
metadata updates. If one of these shards is chosen as the
primary it will set the acting set to force another shard to
be chosen.

It is important that the selected acting set is the same
acting set that will be chosen by the next primary (assuming
nothing else changes) otherwise a PG can get into a state where
the acting set flip/flops between two different states causing
the PG to get stuck in peering and hanging I/O.

A bug in update_peer_info meant that non-primary shards did not
present the same info to choose_acting_set as primary shards
because they were not updating their pg_info_t based on pwlc
information from other shards.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit 54b265f811e545885916367d7d63c7f4d734fae0)

2 months agoosd: Refuse to commit/rollforward beyond end of log.
Alex Ainscow [Tue, 13 May 2025 11:55:14 +0000 (12:55 +0100)]
osd: Refuse to commit/rollforward beyond end of log.

In optimised EC, if a transaction is applied to all shards, followed by a
partial transaction AND these two transactions overlap, then it is
possible for the non-primary shards to commit a version which is after
the end of the log.

This commit changes the apply_log such that the commit version will be
changed to the head of the log in such situations.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 912437d47053f92086261e285462ac5b4d8d749a)
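
The clamping described above can be sketched as follows. Versions are modelled as `(epoch, version)` tuples, which Python compares lexicographically; `clamp_commit_version` is a hypothetical helper and not the actual apply_log code.

```python
def clamp_commit_version(requested, log_head):
    """Never commit or roll forward past the head of the local log.

    Both arguments are (epoch, version) tuples. If the requested commit
    version lies beyond the end of the log, the head of the log is used
    instead.
    """
    return min(requested, log_head)
```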

2 months agoosd: Refactor partial_write to address multiple issues.
Alex Ainscow [Tue, 29 Apr 2025 11:02:07 +0000 (12:02 +0100)]
osd: Refactor partial_write to address multiple issues.

We fix a number of issues with partial_write here.

Fix an issue where it is unclear whether the empty PWLC state is
newer or older than a populated PWLC on another shard by always
updating the pwlc with an empty range, rather than blank.

This is an unfortunate small increase in metadata, so we should
come back to this in a later commit (or possibly later PR).

Normally a PG log consists of a set of log entries, with each
log entry having a version number one greater than the previous
entry. When a PG splits, the PG log is split so that each of the
new PGs only has log entries for objects in that PG, which
means there can be gaps between version numbers.

PGBackend::partial_write is trying to keep track of adjacent
log updates that do not update a particular shard, storing
these as a range in partial_writes_last_complete. To do this
it must compare with the version number of the previous log
entry rather than testing for a version number increment of one.

Also simplify partial_writes to make it more readable.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 2467406f3e22c0746ce20cd04b838dccedadf055)
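
The adjacency check described above can be sketched with a toy model. Here `pwlc` is a plain dict mapping a shard to a `(from, to)` range, versions are integers, and a written shard is recorded as an empty range `(version, version)`; these representations and the function signature are assumptions for illustration, not the real PGBackend::partial_write.

```python
def partial_write(pwlc, shard, previous_version, version, written_shards):
    """Update the partial-write range for one shard and one log entry.

    previous_version is the version of the preceding log entry in this
    PG's log, which may be more than one less than version after a split.
    """
    if shard in written_shards:
        # The shard was updated: record an empty range rather than
        # leaving the state blank.
        pwlc[shard] = (version, version)
        return
    current = pwlc.get(shard)
    if current is not None and current[1] == previous_version:
        # Adjacent to the tracked range: extend it. Comparing against
        # previous_version (not version - 1) tolerates gaps in the log.
        pwlc[shard] = (current[0], version)
    else:
        # Not adjacent: start a new range.
        pwlc[shard] = (previous_version, version)
```

With log versions 10, 13 and 17 (gaps left by a split), a shard written at 10 and skipped at 13 and 17 ends up with the range `(10, 17)`, whereas a "plus one" test would have broken the range at each gap.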

2 months agoosd: nonprimary shards are permitted to have a crt newer than head
Alex Ainscow [Tue, 29 Apr 2025 10:59:24 +0000 (11:59 +0100)]
osd: nonprimary shards are permitted to have a crt newer than head

Non-primary shards do not get updates for some transactions.  It is possible
however for other transactions to increase the can_rollback_to to a later
version.  This causes an assert for some operations.

Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 6209c8cdf980f261c60073d4535d745f24538a7d)

2 months agoosd: overaggressive assert in read_log_and_missing with optimized EC pool
Bill Scales [Fri, 25 Apr 2025 14:03:02 +0000 (15:03 +0100)]
osd: overaggressive assert in read_log_and_missing with optimized EC pool

read_log_and_missing is called during OSD initialization to sanity check
the PG log. One of its checks is too aggressive for an optimized EC pool
where, because of a partial write, there can be a log entry but no update
to the object on this shard (other shards will have been updated). The
fix is to skip the checks when the log entry indicates this shard was
not updated.

Only affects pools with the allow_ec_optimizations flag on.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit c8739a2bdb0bf523402d85517a7fce39d445eef5)

2 months agoosd: EC optimizations rework for pg_temp
Bill Scales [Thu, 29 May 2025 11:53:27 +0000 (12:53 +0100)]
osd: EC optimizations rework for pg_temp

Bug fixes for how pg_temp is used with optimized EC pools. For these
pools pg_temp is re-ordered with non-primary shards last. The acting
set was undoing this re-ordering in PeeringState, but this is too
late and results in code getting the shard id wrong. One consequence
of this was an OSD refusing to create a PG because of an incorrect
shard id.

This commit moves the re-ordering earlier into OSDMap::_get_temp_osds;
some changes are then required to OSDMap::clean_temps.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit 6c8b0297aaafeb0cff7350e52212140c85435afe)

2 months agoosd: EC Optimizations OSDMap::clean_temps preventing change of primary
Bill Scales [Fri, 23 May 2025 09:45:46 +0000 (10:45 +0100)]
osd: EC Optimizations OSDMap::clean_temps preventing change of primary

clean_temps is clearing pg_temp if the acting set will be the same
as the up set. For optimized EC pools this is overaggressive because
there are scenarios where it is setting acting set to be the same as
up set to force an alternative shard to be chosen as primary - this
happens because the acting set is transformed to place non-primary
shards at the end of the pg_temp vector.

Detect this scenario and stop clean_temps from undoing the acting
set which is being set by PeeringState::choose_acting.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit 9d9265337a43b3edab8a3c41752baaca835be92a)

2 months agoosd: EC optimizations bug in OSDMap::clean_temps
Bill Scales [Thu, 22 May 2025 12:12:57 +0000 (13:12 +0100)]
osd: EC optimizations bug in OSDMap::clean_temps

OSDMap clean_temps clears pg_temp for a PG when the up set
matches the acting_set. For optimized EC pools the pg_temp
is reordered to place primary shards first, but this function
was not calling pgtemp_undo_primaryfirst to revert the
reordering.

This meant that a 2+1 EC PG with up set [1,2,3] and
a desired acting set [1,3,2] re-ordered the acting
set to produce pg_temp as [1,2,3] and then deleted this
because it equals the up set.

Calling pgtemp_undo_primaryfirst makes this code work
as intended.

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit ef0025ab168e6dd604465921dbecb7fa3b0331bd)
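
The 2+1 example above can be reproduced with a small sketch. It assumes that shard 1 is the only non-primary shard in a 2+1 pool, which is consistent with the reordering shown in the commit message; the Python helpers are hypothetical models loosely named after pgtemp_undo_primaryfirst and are not the OSDMap implementation.

```python
def primaryfirst(acting, nonprimary_shards):
    """Reorder an acting set so primary-capable shards come first."""
    primary = [osd for shard, osd in enumerate(acting)
               if shard not in nonprimary_shards]
    nonprimary = [osd for shard, osd in enumerate(acting)
                  if shard in nonprimary_shards]
    return primary + nonprimary


def undo_primaryfirst(pg_temp, nonprimary_shards):
    """Revert primaryfirst, returning OSDs in shard order."""
    size = len(pg_temp)
    slots = ([s for s in range(size) if s not in nonprimary_shards] +
             [s for s in range(size) if s in nonprimary_shards])
    acting = [None] * size
    for osd, shard in zip(pg_temp, slots):
        acting[shard] = osd
    return acting


def should_clear_pg_temp(pg_temp, up, nonprimary_shards):
    """Only clear pg_temp if the real (un-reordered) acting set equals up."""
    return undo_primaryfirst(pg_temp, nonprimary_shards) == up
```

Comparing the stored pg_temp `[1, 2, 3]` directly with the up set `[1, 2, 3]` would wrongly report a match; after undoing the reordering the acting set is `[1, 3, 2]`, so pg_temp is kept.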