Naman Munet [Mon, 7 Jul 2025 09:26:49 +0000 (14:56 +0530)]
mgr/dashboard: differentiate account users from rgw users in bucket form
fixes: https://tracker.ceph.com/issues/71523
commit includes:
1) Added a checkbox to select an account user, plus a dropdown listing the account users
2) Also fixed bucket replication, which was throwing an 'invalidBucketARN' error
osd/scrub: allow auto-repair on operator-initiated scrubs
Previously, operator-initiated scrubs would never auto-repair, regardless
of the value of the 'osd_scrub_auto_repair' config option. In practice this
was less confusing to the operator than it sounds, as most operator commands
would in fact initiate a regular periodic scrub. However, that quirk has since
been fixed: operator commands now trigger genuine 'op-initiated' scrubs. Thus
the need for this patch.
The original bug was fixed in https://github.com/ceph/ceph/pull/54615,
but was unfortunately re-introduced later on.
Fixes: https://tracker.ceph.com/issues/72178
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 97de817ad1c253ee1c7c9c9302981ad2435301b9)
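A minimal sketch of the restored behaviour, with hypothetical names (the real
gate lives in the OSD scrub scheduler):

    #include <cassert>

    // Illustrative only: 'auto_repair_cfg' mirrors the
    // 'osd_scrub_auto_repair' option; 'op_initiated' marks
    // operator-initiated scrubs.
    bool allow_auto_repair(bool auto_repair_cfg, bool op_initiated) {
        // Before the fix: return auto_repair_cfg && !op_initiated;
        (void)op_initiated;          // no longer disqualifies auto-repair
        return auto_repair_cfg;
    }

    int main() {
        assert(allow_auto_repair(true, /*op_initiated=*/true));
    }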
Patrick Donnelly [Fri, 27 Jun 2025 18:46:07 +0000 (14:46 -0400)]
mds: nudge log for unstable locks after early_reply
A getattr/lookup can cause a wrlock or xlock to become unstable after a request
(like rename) acquires it but before early reply. The MDS will not nudge the
log in this situation and the getattr/lookup will need to wait for the eventual
journal flush before the lock is released.
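A hedged sketch of the nudge, using illustrative types rather than the actual
MDS classes:

    #include <functional>
    #include <vector>

    struct Lock { bool stable; };  // stand-in for an MDS lock state

    // After an early reply, if the request left any lock unstable, flush
    // ("nudge") the journal so waiting getattr/lookup requests are not
    // stuck until the next periodic flush.
    void after_early_reply(const std::vector<Lock>& held,
                           const std::function<void()>& nudge_log) {
        for (const auto& l : held)
            if (!l.stable) { nudge_log(); return; }
    }

    int main() {
        bool nudged = false;
        after_early_reply({{true}, {false}}, [&] { nudged = true; });
        return nudged ? 0 : 1;
    }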
Patrick Donnelly [Fri, 27 Jun 2025 18:38:17 +0000 (14:38 -0400)]
mds: allow disabling batch ops
To address a current bug, and future ones, where batching lookup/getattr does
not help "kick" the MDS into switching state more quickly (e.g. flushing the
MDS journal).
Zac Dover [Wed, 11 Jun 2025 12:44:32 +0000 (22:44 +1000)]
doc/rados/ops: edit cache-tiering.rst
Add material to doc/rados/operations/cache-tiering.rst, as suggested by
Anthony D'Atri in
https://github.com/ceph/ceph/pull/63745#discussion_r2127887785.
Ville Ojamo [Wed, 30 Apr 2025 18:17:14 +0000 (01:17 +0700)]
doc/radosgw: Improve rgw-cache.rst
Try to improve the language by completely rewriting some sentences.
Attempt to format the document more like the rest of the docs.
Fix several errors in punctuation, capitalization, spacing, etc.
Use blocks with bash prompts for CLI commands instead of hardcoded
prompts.
Fix section hierarchy and section title underline lengths.
Use an admonition.
Soumya Koduri [Fri, 23 May 2025 21:39:50 +0000 (03:09 +0530)]
rgw/restore: Use strtoull to read sizes up to 2^64
Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit b3c867a121a7315b5a9e2d30d0af44c08676f8ca)
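The point of the change, in miniature: strtol-family parsers are signed and
clamp at 2^63 - 1, while strtoull covers the full unsigned 64-bit range:

    #include <cstdio>
    #include <cstdlib>

    int main() {
        const char* s = "18446744073709551615";  // 2^64 - 1
        unsigned long long size = std::strtoull(s, nullptr, 10);
        std::printf("%llu\n", size);  // prints 18446744073709551615;
                                      // strtol would clamp at LONG_MAX
    }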
Soumya Koduri [Fri, 23 May 2025 20:25:30 +0000 (01:55 +0530)]
rgw/cloud-restore: Handle failure with adding restore entry
In case adding a restore entry to the FIFO fails, reset the `restore_status`
of that object to "RestoreFailed" so that the restore process can be
retried by the end S3 user.
Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit 9974f51eb61603b8117d7b50e6b0b4614fcce721)
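A condensed sketch of the failure path (types and names are illustrative; the
real code lives in the rgw_restore* files):

    #include <string>

    enum class RestoreStatus { None, RestoreAlreadyInProgress, RestoreFailed };

    struct RestoreEntry { std::string bucket, object; };

    int push_to_fifo(const RestoreEntry&) { return -5; }  // stub: simulate EIO

    int queue_restore(const RestoreEntry& e, RestoreStatus& status) {
        int r = push_to_fifo(e);
        if (r < 0)
            status = RestoreStatus::RestoreFailed;  // lets the S3 user retry
        return r;
    }

    int main() {
        RestoreStatus s = RestoreStatus::None;
        return queue_restore({"b", "o"}, s) < 0 &&
               s == RestoreStatus::RestoreFailed ? 0 : 1;
    }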
rgw/cloud-restore: Support restoration of objects transitioned to Glacier/Tape endpoint
Restoration of objects from certain cloud services (like Glacier/Tape) could
take a significant amount of time (even days). Hence, store the state of such
restore requests and process them periodically.
Brief summary of changes
* Refactored existing restore code to consolidate and move all restore processing into rgw_restore* file/class
* RGWRestore class is defined to manage the restoration of objects.
* Lastly, for SAL_RADOS, FIFO is used to store and read restore entries.
Currently, this PR handles storing the state of restore requests sent to the
cloud-glacier tier type, which need async processing.
The changes are tested with AWS Glacier Flexible Retrieval with tier_type Expedited and Standard.
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit ef96bb0d6137bacf45b9ee2f99ad5bcd8b3b6add)
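A rough sketch of the async flow (illustrative; per the summary above, in
SAL_RADOS the pending queue is a FIFO):

    #include <cstddef>
    #include <deque>
    #include <string>

    struct RestoreReq { std::string obj; };

    // Stub standing in for a poll of the cloud tier's restore status.
    bool cloud_restore_complete(const RestoreReq&) { return false; }

    // Periodic pass: re-check each persisted request; requeue the ones
    // Glacier/Tape has not finished yet (this can take days).
    void process_pending(std::deque<RestoreReq>& fifo) {
        for (std::size_t i = 0, n = fifo.size(); i < n; ++i) {
            RestoreReq req = fifo.front();
            fifo.pop_front();
            if (!cloud_restore_complete(req))
                fifo.push_back(req);
        }
    }

    int main() {
        std::deque<RestoreReq> fifo{{"obj1"}};
        process_pending(fifo);
        return fifo.size() == 1 ? 0 : 1;
    }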
qa/standalone/scrub: fix "scrubbed in 0ms" in osd-scrub-test.sh
The specific test looks for a 'last scrub duration' higher than
0 as a sign that the scrub actually ran. Previous code fixes
guaranteed that even a scrub duration as low as 1ms would be
reported as "1" (1s). However, none of the 15 objects created
in this test were designated for the tested PG, which remained
empty. As a result, the scrub duration was reported as "0".
The fix is to create a large enough number of objects so that
at least one of them is mapped to the tested PG.
Alex Ainscow [Fri, 6 Jun 2025 11:09:04 +0000 (12:09 +0100)]
osd: Optimised EC should avoid decodes off the end of objects.
This was a particular edge case whereby the need to perform both an encode and
a decode as part of recovery caused EC to attempt a decode off the end of the
shard, despite this being unnecessary.
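The essence of the guard, sketched with made-up names:

    #include <algorithm>
    #include <cstdint>

    struct Extent { uint64_t off, len; };

    // Clamp a decode request to the shard's real size so recovery never
    // decodes past the end of the object.
    Extent clamp_to_shard(Extent want, uint64_t shard_size) {
        uint64_t end = std::min(want.off + want.len, shard_size);
        return { want.off, end > want.off ? end - want.off : 0 };
    }

    int main() {
        Extent e = clamp_to_shard({8192, 8192}, 12288);
        return e.len == 4096 ? 0 : 1;
    }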
Alex Ainscow [Fri, 23 May 2025 08:59:31 +0000 (09:59 +0100)]
osd: During recovery, pass "for_recovery" when attempting re-reads
get_all_remaining_reads() is used for two purposes:
1. If a read unexpectedly fails, recover from other shards.
2. If a shard is missing, but is allowed to be missing (typically
due to unequal shard sizes), we rely on this function to return
no new reads, without an error.
The test failure we saw was case (2), but I think case (1) is important too.
Most of the time we probably would not notice, but if insufficient redundancy
exists and for_recovery is not set, this will result in recovery failing.
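In miniature, the distinction the flag makes (hypothetical helper):

    #include <set>

    // A shard that is absent is an error for a normal re-read, but is
    // acceptable during recovery (shards can legitimately differ in size).
    bool missing_shard_is_error(int shard, const std::set<int>& present,
                                bool for_recovery) {
        if (present.count(shard)) return false;
        return !for_recovery;
    }

    int main() {
        return missing_shard_is_error(3, {0, 1, 2}, /*for_recovery=*/true)
                   ? 1 : 0;
    }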
Alex Ainscow [Fri, 2 May 2025 09:11:45 +0000 (10:11 +0100)]
osd: Improve backfill in new EC.
In old EC, the full stripe was always read and written. In new EC, we only attempt
to recover the shards that were missing. If an old OSD is available, the read can
be directed there.
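A sketch of the source-selection idea (names invented; details simplified):

    // With new EC, only the missing shards are rebuilt. If an old-style
    // OSD still holds a copy of a missing shard, read it directly rather
    // than decoding it from the other shards.
    enum class Source { direct_old_osd, decode_from_peers };

    Source pick_source(bool old_osd_has_shard) {
        return old_osd_has_shard ? Source::direct_old_osd
                                 : Source::decode_from_peers;
    }

    int main() {
        return pick_source(true) == Source::direct_old_osd ? 0 : 1;
    }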
Alex Ainscow [Thu, 8 May 2025 15:14:03 +0000 (16:14 +0100)]
osd: Cope with empty reads from an OSD without panic.
If a ReadOp from EC contains two objects, where one object reads from only a
single shard but other objects require other shards, then this bug can be hit.
The fix should make the issue clear.
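A compressed illustration of the shape of the bug (illustrative containers,
not the real ECCommon types):

    #include <map>
    #include <string>
    #include <vector>

    // obj -> (shard -> bytes read); one ReadOp can span several objects.
    using ReadResults =
        std::map<std::string, std::map<int, std::vector<char>>>;

    // An object that needed only a single shard can come back with no
    // data for the other shards; treat that as "nothing to do" rather
    // than an inconsistency.
    void finish_reads(const ReadResults& results) {
        for (const auto& [oid, per_shard] : results) {
            (void)oid;
            if (per_shard.empty())
                continue;  // legitimately empty read for this object
            // ... decode / deliver per_shard here ...
        }
    }

    int main() {
        finish_reads({{"a", {}}, {"b", {{0, {'x'}}}}});
    }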
Alex Ainscow [Thu, 8 May 2025 13:22:36 +0000 (14:22 +0100)]
osd: Recover non-primary shards with the correct version.
Scrub revealed a bug whereby the non-primary shards were being given a
version number in the OI which did not match the expected version in
the authoritative OI.
A secondary issue is that all attributes were being pushed to the non-primary
shards, whereas only the OI is actually needed.
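Sketched with simplified types; "_" is the xattr key under which the object
info (OI) is stored:

    #include <cstdint>
    #include <map>
    #include <string>
    #include <vector>

    using AttrMap = std::map<std::string, std::vector<uint8_t>>;

    // When pushing to a non-primary shard, send only the OI attribute
    // (carrying the authoritative version), not the whole attribute set.
    AttrMap attrs_for_push(const AttrMap& all, bool non_primary_shard) {
        if (!non_primary_shard)
            return all;
        AttrMap oi_only;
        if (auto it = all.find("_"); it != all.end())
            oi_only.insert(*it);
        return oi_only;
    }

    int main() {
        AttrMap all{{"_", {1}}, {"snapset", {2}}};
        return attrs_for_push(all, true).size() == 1 ? 0 : 1;
    }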
Alex Ainscow [Wed, 7 May 2025 09:33:24 +0000 (10:33 +0100)]
osd: Do not do a read-modify-write if op.delete_first is set
Some client OPs are able to generate transactions which delete an
object and then write it again. This is used by the copy-from ops.
If such a write is not 4k aligned, then the new EC code was incorrectly doing
a read-modify-write on the misaligned 4k region. This caused some garbage to
be written to the backend OSD, off the end of the object. This is only a
problem if the object is later extended without the end being written.
The problematic sequence is:
1. Create two objects (A and B) of size X and Y where:
X > Y, (Y % 4096) != 0
2. copy_from OP B -> A
3. Extend B without writing offset Y+1
This will result in a corrupt data buffer at Y+1 without this fix.
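The guard, reduced to arithmetic (EC_ALIGN is the 4k write alignment assumed
here):

    #include <cstdint>

    constexpr uint64_t EC_ALIGN = 4096;

    // If the transaction deletes the object first there is no old data to
    // merge with, so a misaligned tail must be zero-padded, not read back.
    bool needs_rmw(uint64_t write_end, bool delete_first) {
        if (delete_first)
            return false;                    // nothing on disk to read
        return (write_end % EC_ALIGN) != 0;  // misaligned tail needs RMW
    }

    int main() {
        return needs_rmw(6000, /*delete_first=*/true) ? 1 : 0;
    }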
Alex Ainscow [Thu, 1 May 2025 09:09:15 +0000 (10:09 +0100)]
osd: Fix off-by-one error in shard_extent_map.
Inserting the first parity buffer was causing the ro-range within the SEM to
be incorrectly calculated. A simple fix; I have added some unit tests to
defend against this error in the future.
Alex Ainscow [Fri, 25 Apr 2025 18:15:34 +0000 (19:15 +0100)]
osd: Minor performance improvement in ECUtil.cc
Code changes to prevent creating and then erasing an empty shard.
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit c2d6414f659b123fa7060442bff7a90a7ceeb7c0)
Alex Ainscow [Thu, 24 Apr 2025 13:02:41 +0000 (14:02 +0100)]
osd: clone + delete ops should invalidate source new EC extent cache.
The op.is_delete() function only returns true if the op is not ALSO
doing something else (e.g. a clone). This causes issues with clearing
the new EC extent cache.
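The distinction, reduced to a predicate (field names invented for
illustration):

    struct Op { bool deletes_object = false; bool also_clones = false; };

    // op.is_delete() is false when the op also clones, so the extent cache
    // must be invalidated whenever the source object's data is removed,
    // not only when the op is *purely* a delete.
    bool must_invalidate_extent_cache(const Op& op) {
        // buggy form: return op.deletes_object && !op.also_clones;
        return op.deletes_object;
    }

    int main() {
        Op clone_and_delete{true, true};
        return must_invalidate_extent_cache(clone_and_delete) ? 0 : 1;
    }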
Alex Ainscow [Wed, 23 Apr 2025 14:41:11 +0000 (15:41 +0100)]
osd: Fix parity updates in truncates.
Previously in optimised EC, when truncating to a partial
stripe, the parity was not being updated. This fix reads
the non-truncated data from the final stripe and calculates
parity updates, which are written to the parity shards.
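The stripe arithmetic behind the fix, with assumed chunk count and size:

    #include <cstdint>

    constexpr uint64_t CHUNK = 4096;  // chunk size (assumed)
    constexpr uint64_t K = 4;         // data shards (assumed)

    // After truncating into a partial stripe, the surviving bytes of the
    // final stripe still feed into parity, so they are read back and the
    // parity chunks recomputed and rewritten.
    void final_stripe_range(uint64_t new_size, uint64_t& off, uint64_t& len) {
        const uint64_t stripe = CHUNK * K;        // 16 KiB per stripe here
        off = (new_size / stripe) * stripe;       // start of final stripe
        len = new_size - off;                     // bytes to re-encode
    }

    int main() {
        uint64_t off = 0, len = 0;
        final_stripe_range(20000, off, len);      // off=16384, len=3616
        return off == 16384 && len == 3616 ? 0 : 1;
    }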
Alex Ainscow [Tue, 22 Apr 2025 12:41:19 +0000 (13:41 +0100)]
osd: Fix EC cache invalidation bug
With optimised EC, there were two bugs with cache invalidation:
1. If two invalidates were in the queue, it's possible the second
invalidate might be cleared by the first.
2. Reads were being requested even when the size was being reduced.
Also added a few debug improvements and some new asserts.
Alex Ainscow [Wed, 9 Apr 2025 12:49:49 +0000 (13:49 +0100)]
osd: Make EC alignment independent of page size.
Code which manipulates full pages is often faster. To exploit this,
optimised EC was written to use 4k alignment wherever possible.
When inputs are not aligned, they are quickly aligned to 4k.
Not all architectures use 4k page sizes. Some POWER architectures, for
example, have a 64k page size. In such situations it is unlikely that
using 64k page alignment would provide any performance boost; indeed it
is likely to hurt performance significantly. As such, EC has been changed
to maintain its own internal alignment (4k), which can be configured.
This has the added advantage that we can potentially tweak this value in
the future.
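In effect, the alignment constant is decoupled from the platform page size; a
minimal sketch:

    #include <cstdint>

    // Was effectively tied to the page size (64k on some POWER machines);
    // now an EC-internal, configurable value.
    constexpr uint64_t EC_ALIGN = 4096;

    constexpr uint64_t align_up(uint64_t v) {
        return (v + EC_ALIGN - 1) & ~(EC_ALIGN - 1);
    }

    static_assert(align_up(1) == 4096);
    static_assert(align_up(8192) == 8192);

    int main() {}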
Alex Ainscow [Wed, 16 Apr 2025 09:41:48 +0000 (10:41 +0100)]
osd: Fix Truncates in Optimised EC
The previous truncate code attempted to perform a non-aligned truncate by
creating a zero buffer at the end of the object and writing it out.
The new code first truncates to the exact size of the user object, then grows
the object to the required 4k alignment. This simpler arrangement also
simplifies the rollback.
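The new order of operations, sketched on a byte vector standing in for the
object:

    #include <cstdint>
    #include <vector>

    constexpr uint64_t EC_ALIGN = 4096;

    // Truncate to the exact user size first, then grow with zeroes to the
    // 4k boundary (the old code instead wrote a zero buffer at the tail).
    void truncate_aligned(std::vector<uint8_t>& obj, uint64_t new_size) {
        obj.resize(new_size);                              // exact truncate
        uint64_t aligned = (new_size + EC_ALIGN - 1) & ~(EC_ALIGN - 1);
        obj.resize(aligned, 0);                            // grow, zero-fill
    }

    int main() {
        std::vector<uint8_t> obj(10000, 0xff);
        truncate_aligned(obj, 6000);
        return obj.size() == 8192 && obj[6000] == 0 ? 0 : 1;
    }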
Bill Scales [Fri, 6 Jun 2025 12:28:14 +0000 (13:28 +0100)]
osd: EC optimizations fix bug when recovering only partial write objects
PGLog::reset_complete_to does not handle the scenario where, for every
missing object, the most recent update is a partial write that skips the
shard being recovered. In this scenario the oldest need is newer than the
newest log entry. Setting last_complete to the head of the log confuses
other code into thinking that recovery has completed.
The fix is to hold last_complete one entry behind the head of the log
until all missing objects have been recovered.
PGLog::recover_got already does this when an object is recovered and the
remaining objects to recover match this scenario, so this fix just makes
reset_complete_to behave the same way as recover_got.
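The invariant, reduced to a one-liner (plain integers stand in for
eversion_t):

    #include <cstdint>

    // While any objects are still missing, last_complete is pinned one
    // entry behind the log head; only an empty missing set lets it reach
    // the head. This mirrors what PGLog::recover_got already does.
    uint64_t clamped_last_complete(uint64_t head, uint64_t prev_entry,
                                   bool missing_empty) {
        return missing_empty ? head : prev_entry;
    }

    int main() {
        return clamped_last_complete(100, 99, false) == 99 ? 0 : 1;
    }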