
qa: Add mirror metrics testcases

Add testcases for the newly introduced mirror
metrics and validate them via the 'fs mirror peer status'
asok interface.

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>

doc: Update the mirroring doc with new metrics fields

Update the mirroring documentation and the
release notes with the newly introduced metrics and their
availability via the 'fs mirror peer status' asok
interface.

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>

qa: Fix the mirroring tests with new nested peer_status output

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>

tools/cephfs_mirror: Nest peer_status metrics by dir path and peer uuid

Restructure peer_status output so mirrored directory paths can be
shared by multiple peers without key collisions. Metrics are grouped
as metrics/<dir_path>/peer/<peer_uuid>/ instead of flat dir keys.

Sample output:
--------------
1. When two dirs are syncing.
{
    "metrics": {
        "/parent/d0": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "syncing",
                    "current_syncing_snap": {
                        "id": 2,
                        "name": "d0_snap0",
                        "sync-mode": "full",
                        "avg_read_throughput_bytes": "9.01 MiB/s",
                        "avg_write_throughput_bytes": "26.74 MiB/s",
                        "crawl": {
                            "state": "completed",
                            "duration": "2s"
                        },
                        "datasync_queue_wait": {
                            "state": "completed",
                            "duration": "0s"
                        },
                        "bytes": {
                            "sync_bytes": "60.83 MiB",
                            "total_bytes": "149.94 MiB",
                            "sync_percent": "40.57%"
                        },
                        "files": {
                            "sync_files": 2028,
                            "total_files": 5000,
                            "sync_percent": "40.56%"
                        },
                        "eta": "10s"
                    },
                    "snaps_synced": 0,
                    "snaps_deleted": 0,
                    "snaps_renamed": 0
                }
            }
        },
        "/parent/d1": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "syncing",
                    "current_syncing_snap": {
                        "id": 3,
                        "name": "d1_snap0",
                        "sync-mode": "full",
                        "avg_read_throughput_bytes": "6.80 MiB/s",
                        "avg_write_throughput_bytes": "20.04 MiB/s",
                        "crawl": {
                            "state": "in-progress",
                            "duration": "2s"
                        },
                        "datasync_queue_wait": {
                            "state": "completed",
                            "duration": "1s"
                        },
                        "bytes": {
                            "sync_bytes": "4.12 MiB",
                            "total_bytes": "124.98 MiB",
                            "sync_percent": "3.30%"
                        },
                        "files": {
                            "sync_files": 125,
                            "total_files": 4189,
                            "sync_percent": "2.98%"
                        },
                        "eta": "18s"
                    },
                    "snaps_synced": 0,
                    "snaps_deleted": 0,
                    "snaps_renamed": 0
                }
            }
        }
    }
}
---------
2. When two directories are synced

------------------------------------------
{
    "metrics": {
        "/parent/d0": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "idle",
                    "last_synced_snap": {
                        "id": 2,
                        "name": "d0_snap0",
                        "crawl_duration": "2s",
                        "datasync_queue_wait_duration": "0s",
                        "sync_duration": "30s",
                        "sync_time_stamp": "422538.254127s",
                        "sync_bytes": "149.94 MiB",
                        "sync_files": 5000
                    },
                    "snaps_synced": 1,
                    "snaps_deleted": 0,
                    "snaps_renamed": 0
                }
            }
        },
        "/parent/d1": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "idle",
                    "last_synced_snap": {
                        "id": 3,
                        "name": "d1_snap0",
                        "crawl_duration": "2s",
                        "datasync_queue_wait_duration": "1s",
                        "sync_duration": "33s",
                        "sync_time_stamp": "422546.205798s",
                        "sync_bytes": "149.94 MiB",
                        "sync_files": 5000
                    },
                    "snaps_synced": 1,
                    "snaps_deleted": 0,
                    "snaps_renamed": 0
                }
            }
        }
    }
}
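
As an illustration of the nested layout, the Python sketch below walks
metrics/<dir_path>/peer/<peer_uuid>/ and yields one row per directory and
peer. The helper name iter_peer_metrics is hypothetical and not part of
cephfs-mirror; the input is assumed to be the already-parsed JSON output,
abbreviated here from the second sample above.

```python
import json


def iter_peer_metrics(status):
    """Yield (dir_path, peer_uuid, metrics) for every directory/peer pair."""
    for dir_path, dir_entry in status.get("metrics", {}).items():
        for peer_uuid, peer_metrics in dir_entry.get("peer", {}).items():
            yield dir_path, peer_uuid, peer_metrics


# Abbreviated copy of the "two directories are synced" sample.
sample = json.loads("""
{
    "metrics": {
        "/parent/d0": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "idle",
                    "snaps_synced": 1
                }
            }
        },
        "/parent/d1": {
            "peer": {
                "8a85ab25-70f9-48e9-b82d-56324e75209b": {
                    "state": "idle",
                    "snaps_synced": 1
                }
            }
        }
    }
}
""")

for dir_path, peer_uuid, metrics in iter_peer_metrics(sample):
    print(dir_path, peer_uuid, metrics["state"])
```

Because the peer uuid is a key below the directory path, a second peer
mirroring the same directory would simply add another entry under "peer".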

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>

tools/cephfs_mirror: Add datasync_queue_wait_duration metric

Add a metric that measures the time a snapshot spends
in the data queue waiting for the datasync threads.

Sample output:
When still 'waiting' in queue
{
    "/d1": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 18,
            "name": "d1_snap5",
            "sync-mode": "delta",
            "avg_read_throughput_bytes": "0.00 B/s",
            "avg_write_throughput_bytes": "0.00 B/s",
            "crawl": {
                "state": "in-progress",
                "duration": "13s"
            },
            "datasync_queue_wait": {
                "state": "waiting",
                "duration": "12s"
            },
            "bytes": {
                "sync_bytes": "0.00 B",
                "total_bytes": "110.99 MiB",
                "sync_percent": "0.00%"
            },
            "files": {
                "sync_files": 0,
                "total_files": 3719,
                "sync_percent": "0.00%"
            },
            "eta": "calculating..."
        },
        "last_synced_snap": {
            "id": 15,
            "name": "d1_snap4"
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
---------------
After 'complete'
{
    "/d1": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 18,
            "name": "d1_snap5",
            "sync-mode": "delta",
            "avg_read_throughput_bytes": "11.66 MiB/s",
            "avg_write_throughput_bytes": "34.55 MiB/s",
            "crawl": {
                "state": "completed",
                "duration": "17s"
            },
            "datasync_queue_wait": {
                "state": "completed",
                "duration": "19s"
            },
            "bytes": {
                "sync_bytes": "149.94 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "100.00%"
            },
            "files": {
                "sync_files": 5000,
                "total_files": 5000,
                "sync_percent": "100.00%"
            },
            "eta": "0s"
        },
        "last_synced_snap": {
            "id": 15,
            "name": "d1_snap4"
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
-----
Also stored in the last_synced_snap section
{
    "/d1": {
        "state": "idle",
        "last_synced_snap": {
            "id": 18,
            "name": "d1_snap5",
            "crawl_duration": "17s",
            "datasync_queue_wait_duration": "19s",
            "sync_duration": "44s",
            "sync_time_stamp": "8172.009480s",
            "sync_bytes": "149.94 MiB",
            "sync_files": 5000
        },
        "snaps_synced": 1,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>

tools/cephfs_mirror: Add eta metrics

Add an estimated time of completion (ETA) for the
snapshot currently being synced. The calculation
uses the average read/write throughput since the
start of the snapshot sync, not the current
read/write throughput, so the ETA reflects the
average rate rather than the instantaneous one.

Sample output:
-------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "avg_read_throughput_bytes": "3.28 MiB/s",
            "avg_write_throughput_bytes": "71.03 MiB/s",
            "crawl": {
                "state": "completed",
                "duration": "1s"
            },
            "bytes": {
                "sync_bytes": "2.31 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "1.54%"
            },
            "files": {
                "sync_files": 67,
                "total_files": 5000,
                "sync_percent": "1.34%"
            },
            "eta": "calculating..."
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
------------------------------------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "avg_read_throughput_bytes": "12.17 MiB/s",
            "avg_write_throughput_bytes": "66.46 MiB/s",
            "crawl": {
                "state": "completed",
                "duration": "1s"
            },
            "bytes": {
                "sync_bytes": "26.64 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "17.77%"
            },
            "files": {
                "sync_files": 892,
                "total_files": 5000,
                "sync_percent": "17.84%"
            },
            "eta": "10s"
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
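
A minimal sketch of an ETA derived from average throughput, in Python. It
assumes the remaining bytes are divided by the slower of the two average
rates; that choice is an assumption made for illustration, not taken from
the cephfs-mirror source, and the function name is hypothetical.

```python
def estimate_eta_seconds(sync_mib, total_mib, avg_read_mib_s, avg_write_mib_s):
    """Return the estimated seconds left, or None if no rate is known yet."""
    # Assumption: the slower of the two average rates bounds the sync.
    slowest = min(avg_read_mib_s, avg_write_mib_s)
    if slowest <= 0:
        return None  # nothing measured yet, so no estimate can be made
    remaining = max(total_mib - sync_mib, 0.0)
    return round(remaining / slowest)


# Figures from the second sample above: 26.64 of 149.94 MiB synced.
print(estimate_eta_seconds(26.64, 149.94, 12.17, 66.46))
```

With the figures from the second sample this prints 10, in line with the
reported "10s". The first sample reports "calculating..." although the
throughput is already non-zero, so the real implementation evidently waits
for more data before estimating; the sketch does not model that.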

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>

tools/cephfs_mirror: Add read/write throughput

The new read throughput metric measures the bytes
read per second from the source Ceph file system.
Similarly, the new write throughput metric measures
the bytes written per second to the remote Ceph
file system. Both are derived from the time spent
in preadv and pwritev calls.

Sample output:
-------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "avg_read_throughput_bytes": "12.69 MiB/s",
            "avg_write_throughput_bytes": "54.49 MiB/s",
            "crawl": {
                "state": "completed",
                "duration": "1s"
            },
            "bytes": {
                "sync_bytes": "149.94 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "100.00%"
            },
            "files": {
                "sync_files": 5000,
                "total_files": 5000,
                "sync_percent": "100.00%"
            }
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
-------------
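
The idea of deriving throughput from time spent inside the I/O calls can be
sketched in Python as follows. ThroughputMeter and its methods are
hypothetical names used only for this illustration; the real code is C++ and
may account for time differently.

```python
import time


class ThroughputMeter:
    """Accumulates bytes moved and the time spent inside the I/O calls."""

    def __init__(self):
        self.total_bytes = 0
        self.total_seconds = 0.0

    def record(self, nbytes, seconds):
        """Account for one finished I/O call."""
        self.total_bytes += nbytes
        self.total_seconds += seconds

    def timed(self, io_call, *args):
        """Run io_call(*args), which must return a byte count, and record it."""
        start = time.monotonic()
        nbytes = io_call(*args)
        self.record(nbytes, time.monotonic() - start)
        return nbytes

    def bytes_per_second(self):
        """Average rate over all recorded calls; 0.0 before the first call."""
        if self.total_seconds == 0:
            return 0.0
        return self.total_bytes / self.total_seconds


read_meter = ThroughputMeter()
read_meter.record(4 * 1024 * 1024, 0.5)  # 4 MiB read in 0.5s
read_meter.record(4 * 1024 * 1024, 1.5)  # 4 MiB read in 1.5s
print(read_meter.bytes_per_second() / (1024 * 1024), "MiB/s")  # 4.0 MiB/s
```

Since only the time inside the calls is counted, the read and write rates
can differ for the same number of bytes, as in the sample output above.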

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>

tools/cephfs_mirror: Add crawl-state and sync-mode metric

The 'crawl' and 'sync-mode' metrics are added.

sync-mode: full/delta,
"crawl": {
           "state": "completed",
           "duration": "37s"
       }

sync-mode:
---------
The 'sync-mode: full/delta' field is added to peer status.
'delta' means that blockdiff, along with snapdiff, is
being used to sync the files, whereas 'full' means the
full directory is crawled and each file is synced
entirely.

crawl:
-----
The state can be in-progress/completed. This
identifies whether the crawler thread is done
queuing the files for data sync threads.

The crawl duration is also shown.
If the crawl is in-progress, the duration
would show the time taken till then from the
start of the crawl. If the crawl state is
completed, then duration indicates total
time taken for the crawl.

The crawl duration is shown in "d h m s" format.
The existing 'sync_duration' in last_synced_snap
is now formatted the same way.

Sample values are shown below. Once the crawl state
is completed, the 'total_files' metric no longer
grows.

crawl_duration:
--------------
The crawl_duration of last snapshot is saved in last_synced_snap
section as well.

Sample outputs:
---------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "crawl": {
                "state": "in-progress",
                "duration": "21s"
            },
            "bytes": {
                "sync_bytes": "149.25 MiB",
                "total_bytes": "176.47 MiB",
                "sync_percent": "84.57%"
            },
            "files": {
                "sync_files": 4931,
                "total_files": 5845,
                "sync_percent": "84.36%"
            }
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
------------------------------------------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync-mode": "full",
            "crawl": {
                "state": "completed",
                "duration": "37s"
            },
            "bytes": {
                "sync_bytes": "891.39 MiB",
                "total_bytes": "901.52 MiB",
                "sync_percent": "98.88%"
            },
            "files": {
                "sync_files": 29656,
                "total_files": 30000,
                "sync_percent": "98.85%"
            }
        },
        "snaps_synced": 0,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
---------
  {
        "/d0": {
            "state": "syncing",
            "current_syncing_snap": {
                "id": 3,
                "name": "d0_snap1",
                "sync-mode": "delta",
                "crawl": {
                    "state": "completed",
                    "duration": "15s"
                },
                "bytes": {
                    "sync_bytes": "120.20 MiB",
                    "total_bytes": "149.94 MiB",
                    "sync_percent": "80.16%"
                },
                "files": {
                    "sync_files": 4032,
                    "total_files": 5000,
                    "sync_percent": "80.64%"
                }
            },
            "last_synced_snap": {
                "id": 2,
                "name": "d0_snap0",
                "crawl_duration": "17s",
                "sync_duration": 45,
                "sync_time_stamp": "5642.805770s",
                "sync_bytes": "300.85 MiB",
                "sync_files": 10000
            },
            "snaps_synced": 1,
            "snaps_deleted": 0,
            "snaps_renamed": 0
        }
    }
-------------
{
    "/d0": {
        "state": "idle",
        "last_synced_snap": {
            "id": 2,
            "name": "d0_snap0",
            "crawl_duration": "17s",
            "sync_duration": "2m 38s",
            "sync_time_stamp": "9259.225009s",
            "sync_bytes": "901.52 MiB",
            "sync_files": 30000
        },
        "snaps_synced": 1,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
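
A small Python sketch of the "d h m s" duration format. The samples above
only confirm values such as "37s" and "2m 38s"; how zero-valued units in the
middle are printed (for example exactly one hour) is an assumption here, and
format_duration is a hypothetical name.

```python
def format_duration(total_seconds):
    """Format whole seconds as 'd h m s', dropping leading zero units."""
    total_seconds = int(total_seconds)
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)

    parts = []
    for value, suffix in ((days, "d"), (hours, "h"), (minutes, "m")):
        # Assumption: once a larger unit is printed, smaller ones are kept.
        if value or parts:
            parts.append(f"{value}{suffix}")
    parts.append(f"{seconds}s")  # seconds are always shown
    return " ".join(parts)


print(format_duration(37))   # 37s
print(format_duration(158))  # 2m 38s
```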

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>

tools/cephfs_mirror: Add inprogress bytes and files metric

Add the following mirroring progress metrics to
current_syncing_snap:

bytes:
  sync_bytes - bytes synced till now
  total_bytes - total bytes to be synced
  sync_percent - Percentage of bytes synced till now
files:
  total_files - Total files to be synced
  sync_files - files synced till now
  sync_percent - Percentage of files synced till now

sync_files and sync_bytes are also stored in last_synced_snap section
after the snapshot is synced.

Byte counts are formatted in human-readable units, as shown below.

Sample output:
--------
{
    "/d0": {
        "state": "syncing",
        "current_syncing_snap": {
            "id": 3,
            "name": "d0_snap1",
            "bytes": {
                "sync_bytes": "120.20 MiB",
                "total_bytes": "149.94 MiB",
                "sync_percent": "80.16%"
            },
            "files": {
                "sync_files": 4032,
                "total_files": 5000,
                "sync_percent": "80.64%"
            }
        },
        "last_synced_snap": {
            "id": 2,
            "name": "d0_snap0",
            "sync_duration": 45,
            "sync_time_stamp": "5642.805770s",
            "sync_bytes": "300.85 MiB",
            "sync_files": 10000
        },
        "snaps_synced": 1,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}
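
The progress fields can be reproduced with a short Python sketch. It assumes
binary units (B, KiB, MiB, ...) with two decimals, as the sample output
suggests; format_bytes and sync_percent are hypothetical helper names, not
functions from cephfs-mirror.

```python
UNITS = ("B", "KiB", "MiB", "GiB", "TiB", "PiB")


def format_bytes(nbytes):
    """Render a byte count with two decimals in the largest fitting unit."""
    value = float(nbytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


def sync_percent(done, total):
    """Percentage of work done; 0.00% while the total is still unknown."""
    if total == 0:
        return "0.00%"
    return f"{100.0 * done / total:.2f}%"


print(sync_percent(4032, 5000))       # 80.64%
print(format_bytes(0))                # 0.00 B
print(format_bytes(5 * 1024 * 1024))  # 5.00 MiB
```

The file percentage matches the sample (4032 of 5000 files is 80.64%). The
byte percentage in the sample is presumably computed from exact byte counts
rather than from the rounded MiB strings, so recomputing it from the
displayed values can differ in the last digit.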

Fixes: https://tracker.ceph.com/issues/73453
Signed-off-by: Kotresh HR <khiremat@redhat.com>

Merge pull request #68946 from tchaikov/wip-doc-vstart

doc/dev: refresh vstart.sh options in dev_cluster_deployment

Reviewed-by: Vikhyat Umrao <vikhyat@ibm.com>

Merge pull request #68758 from tchaikov/cmake-build-isal-lib

cmake/BuildISAL: build and install library targets only

Reviewed-by: Jamie Pryde <jamiepry@uk.ibm.com>

Merge pull request #68178 from rhcs-dashboard/start-with-libvirt-group

mgr/dashboard: run kcli commands in libvirtd group

Reviewed-by: Afreen Misbah <afreen@ibm.com>

Merge pull request #68967 from rhcs-dashboard/remove-mirroing

mgr/dashboard: Remove cephfs mirroring navigation from Umbrella

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #68970 from rhcs-dashboard/nfs-terminology

mgr/dashboard: NFS enhancements - terminology alignment

Reviewed-by: Afreen Misbah <afreen@ibm.com>

Merge pull request #68782 from smanjara/wip-fix-frontend-exception

rgw: catch exception from abort_early() on client disconnect

Merge pull request #68685 from perezjosibm/wip-perezjos-doc-crimson-dev

doc: crimson/dev - add a vstart.sh example using SeaStore options, minor formatting fixes

Merge pull request #68891 from rhcs-dashboard/carbonize-cluster-wide-osd-flags-modal

mgr/dashboard: Carbonize cluster-wide OSD flags modal

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: pujaoshahu <pshahu@redhat.com>

Merge pull request #68971 from rhcs-dashboard/carbonize-upgrade

Carbonize upgrade page

Reviewed-by: Devika Babrekar <devika.babrekar@ibm.com>

doc:crimson-dev: add RANDOM_BLOCK_SSD usage example, fix indentation

Signed-off-by: Jose J Palacios-Perez <perezjos@uk.ibm.com>

Merge PR #68937 into main

* refs/pull/68937/head:
.github/workflows/releng-audit: group events to serialize executions
.github/workflows/releng-audit: remove override on reopen
.github/workflows/releng-audit: refactor auth check to function

Reviewed-by: Yuri Weinstein <yweins@redhat.com>

Merge pull request #68868 from rhcs-dashboard/fix-edit

mgr/dashboard: Fix edit and delete access for pool-manager role

Reviewed-by: Abhishek Desai <abhishek.desai1@ibm.com>

Merge pull request #68951 from rhcs-dashboard/revert-nx

Revert: mgr/dashboard: reverting the nx tool changes

Reviewed-by: Nizamudeen A <nia@redhat.com>

mgr/dashboard: Remove cephfs mirroring navigation from Umbrella

Fixes: https://tracker.ceph.com/issues/76649
Signed-off-by: Dnyaneshwari Talwekar <dtalweka@redhat.com>

Merge pull request #67547 from mheler/wip-list-restorestatus

rgw: add RestoreStatus support to object listings

mgr/dashboard: fix logs e2e tests after carbonization

Update e2e test selectors to match the new Carbon component structure.
The .card-body and .message classes were replaced with .log-viewer
and .log-entry__message after carbonizing the logs component.

Assisted-by: Claude
Signed-off-by: Afreen Misbah <afreen@ibm.com>

Merge pull request #68953 from rhcs-dashboard/linter-modernization-research

mgr/dashboard: Replace htmllint with Prettier for HTML linting

Reviewed-by: Nizamudeen A <nia@redhat.com>

Revert "mgr/dashboard: set up dashboard as a app shell"

Fixes https://tracker.ceph.com/issues/74006

This reverts commit a0dd52fe100932922ceab9277490bfa2f8631431.

Conflicts:
src/pybind/mgr/dashboard/frontend/module-federation.config.ts
src/pybind/mgr/dashboard/frontend/package-lock.json
src/pybind/mgr/dashboard/frontend/package.json
src/pybind/mgr/dashboard/frontend/project.json

Signed-off-by: Afreen Misbah <afreen@ibm.com>

Revert " mgr/dashboard: add rollup as optional deps"

This reverts commit 6f14d6f25f06ed3d78a4c603e1ad9f10fc9c17d8.

Conflicts:
src/pybind/mgr/dashboard/frontend/package-lock.json
src/pybind/mgr/dashboard/frontend/package.json

Signed-off-by: Afreen Misbah <afreen@ibm.com>

mgr/dashboard: remove unused upgradable component

The upgradable component is no longer used after converting
the upgrade page to use Carbon tiles directly.

Assisted-by: Claude
Signed-off-by: Afreen Misbah <afreenmisbah@ibm.com>

mgr/dashboard: carbonize logs component

Fixes https://tracker.ceph.com/issues/68260

Assisted-by: Claude
Signed-off-by: Afreen Misbah <afreenmisbah@ibm.com>

mgr/dashboard: Carbonize upgrade page

- Made cluster status clickable to navigate to overview when not HEALTH_OK
- Replaced Bootstrap classes with Carbon design tokens
- Updated upgrade.component.scss to use CSS custom properties

Assisted-by: Claude
Signed-off-by: Afreen Misbah <afreenmisbah@ibm.com>

Merge pull request #66908 from rkachach/fix_nvmeof_dashboard_interface

mgr/cephadm: Add a new cephadm's API to get nvmeof TLS bundle

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>

mgr/dashboard: NFS enhancements - terminology alignment

Fixes: https://tracker.ceph.com/issues/76655
Signed-off-by: Dnyaneshwari Talwekar <dtalweka@redhat.com>

Merge pull request #68686 from rishabh-d-dave/fs-scrub-set-flag-for-dirfrags

mds/ScrubStack: set added_children to true for dirfrags too

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #67752 from supriti/wip-s3-policy-keystone-role

rgw: Inject keystone roles into IAM policy

Merge pull request #68740 from smanjara/wip-fix-multi-delete-crash

rgw: remove redundant close_section() call in RGWDeleteMultiObj end_response()

Merge pull request #68601 from aza547/multisite-data-log-fix

rgw: multisite sync data_log error handling broken in tentacle

Merge pull request #68567 from aza547/radosgw-sync-status-flush-fix

radosgw-admin: fix output of sync status

mgr/dashboard: Fix mon_allow_pool_delete unit test

Signed-off-by: Afreen Misbah <afreen@ibm.com>

mgr/dashboard: Fix edit and delete access for pool-manager role

Fixes https://tracker.ceph.com/issues/76561

- allows deleting pools in pool-manager role by bypassing config-opt read permissions
- allows editing in pool-manager role, which was failing due to missing rbd mirroring permissions
- fixes a bug with pool edit mode where when both compression and name are edited it fails due to an if-else logic bug

Signed-off-by: Afreen Misbah <afreen@ibm.com>

cmake/BuildISAL: build and install library targets only

Skip building the igzip executables; Ceph only needs libisal.la.
This should speed up the build a little bit, as we don't build the
executables previously built with "make".

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

Merge pull request #68949 from fultheim/fix-cleanr-space-leak

crimson/os/seastore: fix cleaner space leak from shadowed result list

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Kefu Chai <tchaikov@gmail.com>

mgr/dashboard: Replace htmllint with Prettier for HTML linting

Fixes: https://tracker.ceph.com/issues/76631
Signed-off-by: Afreen Misbah <afreenmisbah@example.com>

crimson/os/seastore: fix cleaner space leak from shadowed result list

TransactionManager::get_extents_if_live() declared an inner
std::list<CachedExtentRef> res inside the "extent is cached" branch
that shadowed the outer res returned by the coroutine. When the
queried extent was present in the cache, it was moved into the inner
list and immediately discarded, and the empty outer list was returned
to the caller.

The async cleaner uses this result to decide whether to rewrite an
extent or treat it as dead. For recently-allocated LBA tree internal
nodes (still hot in cache), the shadowed return caused the cleaner to
skip them, so mark_space_free() never paired with the earlier
mark_space_used(). Each affected reclaim leaked exactly one extent
(4 KiB for LADDR_INTERNAL), tripping the live_bytes != 0 assertion in
SegmentCleaner::clean_space() (async_cleaner.cc:1441) once a victim
segment with such a leftover was selected.

The reproducer (at ~70% full) deterministically aborted within ~3
minutes before this fix; with the fix the OSDs run cleanly past the
trigger point.

Fixes: 87a5984b3ae ("crimson/.../transaction_manager: convert get_extents_if_live to coroutine")
Signed-off-by: Shai Fultheim <shai.fultheim@gmail.com>

.github/workflows/releng-audit: group events to serialize executions

This avoids confusion when several events are fired for e.g. label
changes before the bot can validate each change is authorized.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Assisted-by: Gemini

.github/workflows/releng-audit: remove override on reopen

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Assisted-by: Gemini

.github/workflows/releng-audit: refactor auth check to function

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Assisted-by: Gemini

Merge pull request #68743 from tchaikov/mgr-get_metadata

pybind/mgr/status: drop asserts that fight the defaultdict defaults

Reviewed-by: Nitzan Mordechai <nmordec@ibm.com>

doc/dev: refresh vstart.sh options in dev_cluster_deployment

Bring doc/dev/dev_cluster_deployment.rst back in line with the current
src/vstart.sh:

* drop the removed -K/--kstore objectstore backend
* drop -N/--not-new, which was dropped in 8dd2e418; reusing the existing
cluster config is simply the default when -n is not given
* correct the --rgw_frontend default from civetweb to beast
* note that -b/--bluestore is the default objectstore backend
* update the example and add a note that a fresh build needs -n on the
first run, while later runs can omit it
* note that the option list is not exhaustive and point at src/vstart.sh

Fixes: https://tracker.ceph.com/issues/57272
Signed-off-by: Kefu Chai <k.chai@proxmox.com>

Merge pull request #68571 from lumir-sliva/wip-rgw-postobj-bytes-received

rgw: account presigned POST bytes_received in usage log

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #68932 from mheler/wip-mclock-docs

doc/rados/configuration: recommend wpq for EC clusters seeing slow ops

Merge pull request #68909 from ShwetaBhosale1/fix_nfs_version_build_issue

Use GANESHA_REPO_BASEURL for NFS-Ganesha on all distros

Merge PR #68931 into main

* refs/pull/68931/head:
doc/dev: fix release cycle diagram and missing text

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>

Merge PR #68923 into main

* refs/pull/68923/head:
script/ptl-tool: consolidate conflict reviews

Reviewed-by: Yuri Weinstein <yweins@redhat.com>

Merge PR #68921 into main

* refs/pull/68921/head:
.github/workflows/releng-audit: handle missing case of skipping audit on override

Reviewed-by: Yuri Weinstein <yweins@redhat.com>

doc/rados/configuration: recommend wpq for EC clusters seeing slow ops

On large EC clusters, mClock currently routes recovery EC sub-reads
through the immediate queue, skipping throttling. When many OSDs read
from one source during recovery, that source's high-priority queue
saturates and starves client work, producing slow ops. Recommend
falling back to wpq in the mClock config reference until the
scheduler treats those reads as background.

Signed-off-by: Matthew N. Heler <matthew.heler@hotmail.com>

doc/dev: fix release cycle diagram and missing text

Introduced in 0a54fcdfc491ce2b2bb3ded77e319a7cff785e73

Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>

Merge pull request #68359 from ronen-fr/wip-rf-cls-fromerror

cls: return EIO instead of ceph::from_error_code()

Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>

Merge pull request #68811 from tchaikov/wip-silence-cpp-btree-warnings

include/cpp-btree: fix false -Warray-bounds in child accessors

Reviewed-by: Matan Breizman <mbreizma@redhat.com>

script/ptl-tool: consolidate conflict reviews

To avoid saying the same things repeatedly.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

.github/workflows/releng-audit: handle missing case of skipping audit on override

If someone adds -fail/-pass and override exists, the label should be
removed and -override respected.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

Merge pull request #68721 from adamemerson/wip-boost-1.91-container-bug

rgw: Work around Boost.Containers bug in 1.91

Reviewed-by: Kefu Chai <tchaikov@gmail.com>

Merge PR #68913 into main

* refs/pull/68913/head:
.github/workflows/releng-audit: reuse existing redmine secret
.github/workflows/releng-audit: consolidate into single job
.github/workflows/releng-audit: handle simultaneous override and fail label changes

Reviewed-by: Yuri Weinstein <yweins@redhat.com>

Merge pull request #68409 from kamoltat/wip-ksirivad-hide-tiebreaker

mon: make tiebreaker mon optional in stretch-mode

Reviewed-by: Greg Farnum <gfarnum@redhat.com>

mgr/dashboard: adding daemon_name as an arg to nvmeof get bundle API

When cephadm-signed certificates are in use, we need to know exactly which
nvmeof daemon is being used so that we get the correct certificates for
that particular daemon.

Fixes: https://tracker.ceph.com/issues/74377
Signed-off-by: Redouane Kachach <rkachach@ibm.com>

Merge pull request #67858 from adk3798/cephadm-serialize-osd-rm-status

mgr/cephadm: serialize OSD class before returning for OSD rm status

Reviewed-by: John Mulligan <jmulligan@redhat.com>

Merge pull request #67694 from ashjosh1git/ceph-tracker-69477-pgscalar

Control PG autoscaler during upgrades with pg_autoscale_during_upgrade

Reviewed-by: Adam King <adking@redhat.com>

.github/workflows/releng-audit: reuse existing redmine secret

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

.github/workflows/releng-audit: consolidate into single job

In order to make this a required check someday, we can't have the main
job ever be skipped. So, consolidate into a single job and skip actions
based on the router logic.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

.github/workflows/releng-audit: handle simultaneous override and fail label changes

And add branch debugging.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

Merge PR #68703 into main

* refs/pull/68703/head:
script/ptl-tool: continue adding conflicts to review when interactive
script/ptl-tool: improve wording for rationale requests
script/ptl-tool: refactor verify_commit_parity
script/ptl-tool: replace gitauth redirection
doc: document the releng-audit workflow and update release examples
script/ptl-tool, actions: introduce event-driven CI backport auditing
script/ptl-tool: introduce interactive backport parity and conflict verification
script/ptl-tool: use Authorization header

Reviewed-by: John Mulligan <jmulligan@redhat.com>

Merge pull request #68866 from ochaze/wip-doc-rgw-usage-shards-warning

doc/rgw: warn about rgw_usage_max_shards consistency

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #66064 from mheler/lifecycle_monitoring

rgw/lc: add per-bucket lifecycle performance monitoring

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>

Use GANESHA_REPO_BASEURL for NFS-Ganesha on all distros

Fixes: https://tracker.ceph.com/issues/76603
Signed-off-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>

Merge pull request #68842 from ShwetaBhosale1/fix_issue_76504_nfs_to_reuse_cephfsclient_cache

mgr/nfs: reuse CephfsClient for path checks and earmark resolver

Reviewed-by: Kushal Deb <Kushal.Deb@ibm.com>
Reviewed-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>

Merge pull request #68646 from ShwetaBhosale1/fix_issue_76284_skip_rdma_device_check_for_nfs_during_upgarde

mgr/cephadm: Skip RDMA device check for NFS during upgrade

Reviewed-by: Redouane Kachach <rkachach@ibm.com>

Merge pull request #67070 from JoshuaGabriel/wip-cephadm-ssh-74551

mgr/cephadm: remove SSH error logs from health detail when host is unreachable

Reviewed-by: Redouane Kachach <rkachach@ibm.com>

Merge pull request #68699 from Shubhaj1810/fix-issue-IBMCEPH-13078

cephadm: improve oauth2-proxy validation error messaging

Reviewed-by: Adam King <adking@redhat.com>

Merge pull request #68712 from yzaken/oauth2_proxy_redirect_dahsboard_browser_to_correct_port

mgr/cephadm: redirect browser to correct port by identity provider

Reviewed-by: Redouane Kachach <rkachach@ibm.com>

mds/ScrubStack: set added_children to true for dirfrags too

Introduced-by: 9e83e1c
Fixes: https://tracker.ceph.com/issues/76321
Signed-off-by: Rishabh Dave <ridave@redhat.com>

Merge PR #64774 into main

* refs/pull/64774/head:
test_cephfs.py: delete purge_dir() helper method, use rmtree() instead
test_cephfs.py: remove redundant call to purge_dir()
test_cephfs.py: test rmtree on root
pybind/cephfs: don't attempt to unlink root in rmtree
test_cephfs.py: test rmtree with and without should_cancel
pybind/cephfs: make should_cancel option parameter for rmtree()
mgr/volumes: clone using cptree() from cephfs python bindings
test_cephfs: add unit tests for cptree() in cephfs python bindings
test/pybind/assertions: add helper method assert_less
pybind/cephfs: use depth-first, non-recursive approach for cloning
test_cephfs: call object setup/teardown for all tests in TestWithRootUser
test_cephfs.py: add tests for utimensat()
pybind/cephfs: add python bindings for utimensat()
qa/cephfs: add tests for chownat()
pybind/cephfs: add python bindings for chownat()
test_cephfs.py: add tests for chmodat()
pybind/cephfs: add python bindings for chmodat()
test_cephfs.py: add tests for symlinkat()
pybind/cephfs: add python binding for symlinkat()
test_cephfs.py: add test for readlinkat()
pybind/cephfs: add python binding for readlinkat()
pybind/cephfs: add tests for statxat()
pybind/cephfs: add python bindings for statxat()
test_cephfs.py: add tests for mkdirat()
pybind/cephfs: add python binding for mkdirat()

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #67087 from ShwetaBhosale1/fix_issue_74479_nfs_active_active_support_allow_colo

mgr/cephadm: Allow colocation of NFS daemon to support active-active mode

Reviewed-by: Adam King <adking@redhat.com>

mgr/dashboard: Carbonize cluster-wide OSD flags modal

Fixes: https://tracker.ceph.com/issues/76580
Signed-off-by: Sagar Gopale <sagar.gopale@ibm.com>

Merge pull request #68725 from ronen-fr/wip-rf-cmem-crimson

crimson/osd,qa: support OSD memory size in the OSD and in QA suites

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Jose J Palacios-Perez <perezjos@uk.ibm.com>

Merge pull request #68876 from tchaikov/wip-crimson-co-return

crimson/osd: drop redundant trailing co_return in pg_advance_map

Reviewed-by: Matan Breizman <mbreizma@redhat.com>

Merge pull request #68602 from phlogistonjohn/jjm-bwc-u26

script/build-with-container: add distro references for ubuntu 26.04

Merge pull request #68014 from adamemerson/wip-rgw-no-vla

rgw: VLAs are no longer welcome

Reviewed-by: Jesse F. Williamson <jfw@ibm.com>

Merge pull request #68761 from MaxKellermann/librbd__missing_includes

librbd: add missing includes

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

Merge PR #68781 into main

* refs/pull/68781/head:
doc/governance: remove Sam from CSC

Reviewed-by: Joseph Mundackal <jmundackal@bloomberg.net>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>

mgr/cephadm: serialize OSD class before returning for OSD rm status

Fixes: https://tracker.ceph.com/issues/74862
Signed-off-by: Adam King <adking@redhat.com>

Merge PR #68780 into main

* refs/pull/68780/head:
doc/governance: remove Ken and Jeff from CSC

Reviewed-by: Dan van der Ster <dan.vanderster@clyso.com>

Merge PR #68779 into main

* refs/pull/68779/head:
doc/governance: update Ceph Executive Council List

Reviewed-by: Dan van der Ster <dan.vanderster@clyso.com>

doc: Updated the doc for NFS colocating ports

Fixes: https://tracker.ceph.com/issues/74479
Signed-off-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>

Merge pull request #68801 from afreen23/custom-image

mgr/dashboard: Allow quick bootstrap script to use custom images

Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #68769 from guits/fix-76433

ceph-volume: fix argparse dmcrypt opts: use str type

Merge pull request #68765 from guits/cv-fix-get-file-contents

ceph-volume: fallback to default for empty get_file_contents values

Merge pull request #68844 from Matan-B/wip-matanb-java17-crimson-rgw

qa/suites/crimson-rados/rgw/sts/tasks/1-keycloak: dont install java-1…

Reviewed-by: Shraddha Agrawal <shraddhaag@ibm.com>

pybind/mgr/status: drop asserts that fight the defaultdict defaults

The 'assert metadata' checks in the status module were actually fighting
against our own defaults. Since an empty defaultdict is falsy, these
asserts would blow up the whole command if a single daemon was down
after a mgr restart.

This drops those four grumpy asserts. Now, instead of a traceback,
`ceph osd status` and `ceph fs status` will just show a blank hostname
or "unknown" version as intended.

The trigger is common in practice: any mgr restart leaves daemons
that are currently down without metadata in daemon_state, since
they never reconnect via MMgrOpen to repopulate it. After such a
restart, `ceph osd status` and `ceph fs status` blow up:
```
  Error EINVAL: Traceback (most recent call last):
    ...
    File ".../status/module.py", line 340, in handle_osd_status
      assert metadata
  AssertionError
```
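
The failure mode can be reproduced outside the mgr with a minimal sketch.
The `store` dict, the simplified `get_metadata()` helper and the
`defaultdict(str)` default are assumptions made for illustration; they are
not the actual status module code.

```python
from collections import defaultdict

# Hypothetical stand-in for the mgr metadata lookup: returns the stored
# metadata, or the caller-supplied default when the daemon has none.
def get_metadata(store, daemon_id, default=None):
    return store.get(daemon_id, default)

def hostname_with_assert(store, daemon_id):
    metadata = get_metadata(store, daemon_id, default=defaultdict(str))
    # An empty defaultdict is falsy, so this assert trips for any daemon
    # that has no metadata, even though a usable default was returned.
    assert metadata
    return metadata["hostname"]

def hostname_without_assert(store, daemon_id):
    metadata = get_metadata(store, daemon_id, default=defaultdict(str))
    # A missing key falls back to the defaultdict's value, an empty string.
    return metadata["hostname"]

# osd.0 has metadata; osd.1 stands in for a daemon that is down.
store = {"osd.0": {"hostname": "node1"}}
```

With this setup, `hostname_without_assert(store, "osd.1")` returns an empty
string, while `hostname_with_assert(store, "osd.1")` raises `AssertionError`.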

The bug was introduced in 5ac2901f54ff.

Fixes: https://tracker.ceph.com/issues/76416
Reported-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Signed-off-by: Kefu Chai <tchaikov@gmail.com>

mgr: narrow get_metadata return type with @overload

Enable type narrowing for get_metadata() when a non-None default is
provided. Previously, the return type was always `Optional[Dict[str, str]]`,
forcing callers to use defensive `assert metadata` checks even when
a result was guaranteed.

The wrapper returns either the metadata from `_ceph_get_metadata()` or the
caller-supplied default. Providing an `@overload` allows type checkers to
prove the result is non-None, avoiding invalid assertions for falsy
defaults (like an empty defaultdict).
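
A minimal sketch of the narrowing technique follows. The `_STORE` dict,
the `_lookup()` helper and the parameter names are assumptions made for
illustration; the real wrapper calls `_ceph_get_metadata()` and its exact
signature may differ.

```python
from typing import Dict, Optional, overload

# Hypothetical backing store standing in for the mgr's daemon metadata.
_STORE: Dict[str, Dict[str, str]] = {"osd.0": {"hostname": "node1"}}

def _lookup(svc_type: str, svc_id: str) -> Optional[Dict[str, str]]:
    return _STORE.get(f"{svc_type}.{svc_id}")

# No default (or an explicit None): the result may be None.
@overload
def get_metadata(svc_type: str, svc_id: str,
                 default: None = None) -> Optional[Dict[str, str]]: ...

# Non-None default: type checkers can prove the result is a dict.
@overload
def get_metadata(svc_type: str, svc_id: str,
                 default: Dict[str, str]) -> Dict[str, str]: ...

def get_metadata(svc_type: str, svc_id: str,
                 default: Optional[Dict[str, str]] = None
                 ) -> Optional[Dict[str, str]]:
    # Runtime behavior is the same for both overloads: return the stored
    # metadata, or the caller-supplied default when there is none.
    metadata = _lookup(svc_type, svc_id)
    if metadata is None:
        return default
    return metadata
```

A caller that passes a default, such as `get_metadata("osd", "9", {})`,
gets a value typed as `Dict[str, str]` and needs no `assert metadata` to
satisfy the type checker.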

This is a hygienic change with no runtime impact.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

Merge pull request #68814 from amathuria/wip-amat-fix-76447

crimson/osd: skip PGAdvanceMap on a deleted PG

Reviewed-by: Kefu Chai <tchaikov@gmail.com>