
18.2.6

Signed-off-by: Ceph Release Team <ceph-maintainers@ceph.io>

common/pick_address: Add IPv6 support to is_addr_in_subnet

Updated the is_addr_in_subnet function to work with both
IPv4 and IPv6 addresses. Previously, it only supported IPv4,
which caused failures when IPv6 addresses were passed in.

Changes:
- Use inet_pton to detect IPv4 (AF_INET) or IPv6 (AF_INET6).
- Add sockaddr_in6 for IPv6 handling while keeping sockaddr_in for IPv4.
- Adjust the family and ifa_addr dynamically based on the address type.
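
The family detection described above can be sketched in Python. This is
only an illustration of the idea, not the C++ code in common/pick_address:
the helper `detect_family` and the use of the `ipaddress` module for the
subnet test are assumptions made for this example.

```python
import ipaddress
import socket


def detect_family(addr):
    """Return AF_INET or AF_INET6 by probing with inet_pton (hypothetical helper)."""
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            socket.inet_pton(family, addr)
            return family
        except OSError:
            continue
    raise ValueError("not an IPv4 or IPv6 address: %r" % (addr,))


def is_addr_in_subnet(addr, subnet):
    """Python stand-in: True if addr belongs to subnet, for either family."""
    family = detect_family(addr)
    network = ipaddress.ip_network(subnet, strict=False)
    expected = socket.AF_INET if network.version == 4 else socket.AF_INET6
    if family != expected:
        # An IPv6 address can never be inside an IPv4 subnet, and vice versa.
        return False
    return ipaddress.ip_address(addr) in network


ipv4_ok = is_addr_in_subnet("10.1.2.3", "10.1.0.0/16")
ipv6_ok = is_addr_in_subnet("2001:db8::1", "2001:db8::/32")
```

An IPv4-only check would have rejected or failed on the second call; with
the family chosen per address, both calls go through the same path.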

Fixes: https://tracker.ceph.com/issues/67517
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
(cherry picked from commit d68857c1e57e93a68d9301b3beff7e652f327a9e)
(cherry picked from commit 23a110bfbaf886aeb14f3a3147f429a9cf86b70c)

ceph-volume: fix regex usage in `set_dmcrypt_no_workqueue`

- Updated the regex pattern to `r'(\d+\.?)+'` to more accurately
  capture version numbers.

- Replaced `re.match` with `re.search` to properly match the cryptsetup
  version in the output.

- `re.match` only checks for a match at the beginning of the string,
   while `re.search` looks for a match anywhere in the string.

This fix ensures that the function correctly retrieves the
cryptsetup version from the output.
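
The difference between the two calls can be seen with a short example.
The sample string below is an assumed shape for the `cryptsetup --version`
output, used only for illustration.

```python
import re

# Assumed sample output; the real string may carry extra fields.
sample_output = "cryptsetup 2.6.1"
pattern = r'(\d+\.?)+'

# re.match anchors at the start of the string, where "cryptsetup" begins,
# so the version is never reached.
anchored = re.match(pattern, sample_output)

# re.search scans forward until the pattern matches somewhere.
found = re.search(pattern, sample_output)

version = found.group(0) if found else None
```

Here `anchored` is `None`, while `version` holds the dotted version string.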

Fixes: https://tracker.ceph.com/issues/66393
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit 69e5789f4ac81d79393fcd8fcc0f65578c518c43)

18.2.5

Signed-off-by: Ceph Release Team <ceph-maintainers@ceph.io>

test/librbd/test_notify.py: conditionally ignore some errors

In 2020, commit 01ff1530544c ("librbd: make all maintenance op
notifications async") introduced a backwards compatibility issue where
if exclusive lock is held by an older (octopus and below) client and
a maintenance op is proxied to it from a newer client, the newer client
interprets the notification for the in-place completion of the op as
the notification for the acceptance of an async request and expects
another notification for the completion of the op which never comes.
In 2021, this bug was discovered and test_notify.py was amended to
ignore it in commit 9c0b239d70cd ("qa/upgrade: conditionally disable
update_features tests").

However the two update_features tests that started hanging and got
disabled weren't the only ones to misbehave.  Rename, create_snap and
remove_snap tests were affected too but didn't hang or fail because
librbd also filtered certain error codes like EEXIST and EINVAL.
Taking rename as an example:

1. a rename request is sent from a newer client (N) to an octopus
   client (O)
2. O successfully renames the image and sends a completion notification
   with result = 0
3. N mistakes it for async request acceptance
4. after a timeout, N resends the rename request to O
5. O sees that an image already has that name (after step 2) and sends
   a completion notification with result = EEXIST
6. N interprets it as async request denial and bubbles up EEXIST,
   however right before returning control from Operations::rename()
   EEXIST is filtered and 0 is returned to the user
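
The six steps above can be replayed with a toy model. The class and
function names are hypothetical, the timeout in step 4 is not modeled,
and none of this is librbd code; it only shows why the filtered path
returns 0 and the unfiltered path surfaces EEXIST.

```python
import errno


class OctopusLockOwner:
    """Toy stand-in for the older client (O) that holds the exclusive lock."""

    def __init__(self, image_names):
        self.image_names = set(image_names)

    def handle_rename(self, src, dst):
        # O completes the op in place and sends a single notification.
        if dst in self.image_names:
            return -errno.EEXIST
        self.image_names.discard(src)
        self.image_names.add(dst)
        return 0


def newer_client_rename(owner, src, dst, filter_errors):
    """Toy stand-in for the newer client (N) proxying a rename to O."""
    first = owner.handle_rename(src, dst)
    if first != 0:
        return first
    # N reads result 0 as "async request accepted" and waits for a
    # completion that never comes, then resends the request.
    second = owner.handle_rename(src, dst)
    if filter_errors and second == -errno.EEXIST:
        return 0  # old behavior: EEXIST filtered before returning
    return second


with_filtering = newer_client_rename(
    OctopusLockOwner({"img"}), "img", "img2", filter_errors=True)
without_filtering = newer_client_rename(
    OctopusLockOwner({"img"}), "img", "img2", filter_errors=False)
```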

So back then rename, create_snap and remove_snap tests continued to
pass but started taking 30+ seconds instead of completing immediately.
In 2025 we did away with filtering error codes in commit 66508cdaa190
("librbd: stop filtering async request error codes") and these tests
started to fail.  Following the approach taken in commit 9c0b239d70cd
("qa/upgrade: conditionally disable update_features tests"), let's
ignore these failures based on the same environment variable.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit e7aeb7e325b8777251051dcc32e3baa70880257d)

rgw: remove keep_tail from RGWObjState

Signed-off-by: Jane Zhu <jzhu116@bloomberg.net>
(cherry picked from commit fd76b6466c298121994ba65cce3c0e76f8568841)

Conflicts:
src/rgw/rgw_sal_store.h RGWObjState is in rgw_sal.h on reef
(cherry picked from commit b94fcdfb6a2e01fd471e8c6ebd34145bebb78e20)

rgw: keep the tails when copying object to itself

Signed-off-by: Jane Zhu <jzhu116@bloomberg.net>
(cherry picked from commit 333e4a9b0de745cf5be40c5f6c32df7a340b007a)

Conflicts:
src/rgw/driver/rados/rgw_rados.cc
src/rgw/driver/rados/rgw_rados.h
_do_write_meta() no req_context arg
complete_atomic_modification() no optional_yield arg
(cherry picked from commit fdea7f34829010aaf77e8bb7ae979b07887abe78)

reef: qa/cephfs: switch to ubuntu 22.04 for stock kernel testing

This is for reef only since we don't have rhel8 images (which results
in failure to schedule fs suite run), so switch to using ubuntu 22.04.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit ccd60c09424fe7f5c4d4e364871c7be091d3c0d7)

ceph-volume: allow zapping partitions on multipath devices

ceph-volume refuses to zap a device if it is a partition on a multipath
device due to an overly strict condition. This change ensures that only
full mapper devices (excluding partitions) are blocked from being zapped,
allowing partitions on multipath devices to be processed correctly.

Fixes: https://tracker.ceph.com/issues/70363
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit 16daa6a687c0536324b06536af12ce4e8fc04087)
(cherry picked from commit 29b6bcda3f69f594a751ec92b6985b3dfdd4d56b)

Merge pull request #62354 from aaSharma14/wip-70523-reef

reef: mgr/dashboard: When configuring the RGW Multisite endpoints from the UI allow FQDN(Not only IP)

Reviewed-by: Naman Munet <naman.munet@ibm.com>

Merge pull request #62376 from zdover23/wip-doc-2025-03-19-backport-62371-to-reef

reef: doc/dev/developer_guide/essentials: update mailing lists

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/dev/developer_guide/essentials: update mailing lists

Update information for subscribing to Ceph development mailing lists as current documentation is outdated.

Fixes: https://tracker.ceph.com/issues/64580
Signed-off-by: Laimis Juzeliunas <laimis.juzeliunas@oxylabs.io>
(cherry picked from commit e7bf607269335ac40d91cb4b8f265064ffaac402)

Merge pull request #62191 from ljflores/wip-reef-backport-69760

Merge pull request #62369 from phlogistonjohn/jjm-reef-more-type-ignore

reef: mgr/diskprediction_local: avoid more mypy errors

Similar to c4111033172db28c4737e8438f27901811919ce4 this patch
suppresses mypy errors in the diskprediction_local mgr module.
I probably put the magic comment on more lines than needed but
mypy does not have a block-comment method to suppress checking
for just a region of code today.
This patch is not a backport as the issue is only impacting
reef CI jobs and so it is applied directly to the reef branch.

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>

mgr/dashboard: When configuring the RGW Multisite endpoints from the UI allow FQDN(Not only IP)

Allow an FQDN when configuring the RGW Multisite endpoints from the UI. At the moment, using an FQDN is not allowed.

Fixes: https://tracker.ceph.com/issues/69055
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 9f3619af9ae911955916195084d225928d4b2f43)

Conflicts:
src/pybind/mgr/dashboard/frontend/package-lock.json (conflicts
with typescript package version, kept the existing one)
src/pybind/mgr/dashboard/frontend/package.json (conflicts with
typescript package version, kept the existing one)
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-multisite-migrate/rgw-multisite-migrate.component.ts (conflicts with automated system user creation in main)
src/pybind/mgr/dashboard/frontend/src/app/shared/forms/cd-validators.ts (conflicts with oauthAddressTest validator)

mon, osd: add command to remove invalid pg-upmap-primary entries

The current rm-pg-upmap-primary command checks that the pgid exists
in the pgmap before continuing to remove it. Due to https://tracker.ceph.com/issues/66867,
some invalid pg-upmap-primary entries may exist for pools that have been removed.
Currently, these mappings are impossible to remove since the pgids no longer
exist in the pgmap.

This new command, rm-pg-upmap-primary-all, allows users the ability to remove
any and all pg-upmap-primary mappings in the osdmap at once, which includes
valid and invalid entries.

This command may also be helpful when upgrading from versions where users
are plagued by https://tracker.ceph.com/issues/61948. Users may use an upgraded
mon to remove all pg-upmap-primary entries (valid and invalid) so they can continue
to upgrade to a safe version.
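
The reason the per-pgid command cannot clean up stale entries, and the
bulk command can, is sketched below with a toy model. `ToyOSDMap` and its
methods are hypothetical names for this example and do not mirror the
OSDMap or OSDMonitor code.

```python
class ToyOSDMap:
    """Toy model: known pgids plus a pgid -> primary OSD mapping table."""

    def __init__(self, existing_pgids, pg_upmap_primaries):
        self.existing_pgids = set(existing_pgids)
        self.pg_upmap_primaries = dict(pg_upmap_primaries)

    def rm_pg_upmap_primary(self, pgid):
        # Mirrors the described check: the pgid must still exist.
        if pgid not in self.existing_pgids:
            raise KeyError("pgid %s does not exist" % pgid)
        self.pg_upmap_primaries.pop(pgid, None)

    def rm_pg_upmap_primary_all(self):
        # No existence check: valid and invalid entries are dropped alike.
        removed = len(self.pg_upmap_primaries)
        self.pg_upmap_primaries.clear()
        return removed


# Pool 7 was removed, but its mapping "7.2" was left behind.
osdmap = ToyOSDMap(existing_pgids={"1.0"},
                   pg_upmap_primaries={"1.0": 3, "7.2": 5})

try:
    osdmap.rm_pg_upmap_primary("7.2")
    stale_entry_removable = True
except KeyError:
    stale_entry_removable = False

removed_count = osdmap.rm_pg_upmap_primary_all()
```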

See manual testing for this patch here: https://tracker.ceph.com/issues/67179#note-12

Fixes: https://tracker.ceph.com/issues/67179
Fixes: https://tracker.ceph.com/issues/69760
Signed-off-by: Laura Flores <lflores@ibm.com>
(cherry picked from commit 6e9e2033bf0f4779bdfac9a3a4f29115459c8c0e)

Conflicts:
src/osd/OSDMap.cc
src/osd/OSDMap.h
The `rm_all_upmap_prims` per pool function is part of
https://github.com/ceph/ceph/commit/2953db8b58535605882dff2e1d4ff36e6075e122, which
is related to the "size optimized" read balancer feature that
is only included >= Squid.

Merge pull request #62087 from aaSharma14/wip-70252-reef

reef: mgr: fix subuser creation via dashboard

Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #62321 from zdover23/wip-doc-2025-03-15-backport-62319-to-reef

reef: doc/rados/troubleshooting: Improve troubleshooting-pg.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/rados/troubleshooting: Improve troubleshooting-pg.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 56a21cbc03e352867608c4cf0623d5566fb60cc8)

Merge pull request #62318 from zdover23/wip-doc-2025-03-15-backport-62316-to-reef

reef: doc/rados/operations: improve crush-map-edits.rst

doc/rados/operations: improve crush-map-edits.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 59a894713a9e3540ed74c763cf856636bf300099)

Merge pull request #62218 from idryomov/wip-66419-reef

reef: qa/workunits/rbd: wait for resize to be applied in rbd-nbd

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #60615 from YiteGu/wip-68792-reef

reef: blk/KernelDevice: using join() to wait thread end is more safe

Merge pull request #62298 from zdover23/wip-doc-2025-03-14-backport-62119-to-reef

reef: doc: fixup #58689 - document SSE-C iam condition key

doc: fixup #58689 - document SSE-C iam condition key

Signed-off-by: dawg <code@dawg.eu>
(cherry picked from commit 7b4ac886621b71abb9356bce6c44b3c36b2c0ee2)

Merge pull request #62266 from zdover23/wip-doc-2025-03-13-backport-62249-to-reef

reef: doc/monitoring: Improve index.rst

Merge pull request #59697 from rhcs-dashboard/wip-67928-reef

reef: qa/mgr/dashboard: fix test race condition

Reviewed-by: Afreen Misbah <afreen@ibm.com>

mgr: fix subuser creation via dashboard

Subusers couldn't be created through the dashboard, because the get call was overwritten with Python magic due to it being the function under the HTTP call.
The get function was therefore split into an "external" and an "internal" function, one of which
can be used by other functions without triggering the magic. Since the user object was then returned correctly, json.loads could be removed.

Signed-off-by: Hannes Baum <hannes.baum@cloudandheat.com>
(cherry picked from commit 90e221d0b53ad137e912b8cbd84935a8755f1fe7)

qa/tests: retry the api call after making the request

based on the pointer from Bill in https://tracker.ceph.com/issues/62972#note-75

Fixes: https://tracker.ceph.com/issues/62972
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 1588712b558f229d54fdfab744f2480f15333067)

qa/dashboard: fix test_list_enabled_module failure

Check the ports availability and go for a new port if the current one is
not available
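
A minimal sketch of the port check, assuming a plain TCP bind test on
localhost. The helper names `is_port_free` and `pick_port` are hypothetical
and are not taken from the dashboard test code.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


def pick_port(preferred, host="127.0.0.1", attempts=20):
    """Return the preferred port, or the next free one after it."""
    for port in range(preferred, min(preferred + attempts, 65536)):
        if is_port_free(port, host):
            return port
    raise RuntimeError("no free port found near %d" % preferred)
```

A check like this is inherently racy, since another process can take the
port between the probe and the real bind, so it reduces collisions rather
than ruling them out.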

Fixes: https://tracker.ceph.com/issues/62972
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit a2a4a3415c0e5ceef2cb01d3bcdf5eb1fff23803)

qa/dashboard: fix tasks.mgr.dashboard.test_health.HealthTest

as per: https://tracker.ceph.com/issues/47612#note-14

Fixes: https://tracker.ceph.com/issues/47612
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 79d61bbb58cb34f9da678e37b4373fc84bd393f5)

qa/tests: fix test_list_enabled_modules timeout error

This test deals with enabling/disabling the modules. My assumption is
that after enabling the module, the test waits for an active mgr but
is not able to find it in time, so it fails. Taking inspiration from
https://github.com/ceph/ceph/pull/58995/commits/6c7253be6f6fbfa6faed7a539cb78847fec04580, add retries and logs to see if that's the case.

Fixes: https://tracker.ceph.com/issues/62972
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit b2da7394ee02abd6525372d94cab090818cd6c8e)

qa/mgr/dashboard: fix test race condition

Fixes: https://tracker.ceph.com/issues/66844
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 6c7253be6f6fbfa6faed7a539cb78847fec04580)

Merge pull request #62256 from rhcs-dashboard/wip-70424-reef

reef: mgr/dashboard: pin lxml to fix run-dashboard-tox-make-check failure

doc/monitoring: Improve index.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 1bc67295c8b75475455ff702998ae4d7fb2ec749)

mgr/dashboard: pin lxml to fix run-dashboard-tox-make-check failure

xmlsec had an upgrade yesterday night and python3-saml might need to
adapt its library accordingly I suppose. Testing a fix by pinning lxml

Another approach is being tried out separately
https://github.com/ceph/ceph/pull/62239, but that is failing with some
other errors.

Fixes: https://tracker.ceph.com/issues/70411
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 1f84505f1232dd8445df1a2a819fa000062d3934)

Conflicts:
src/pybind/mgr/dashboard/requirements.txt
- only kept the lxml pinning. didn't add the newer deps that are
present in main

Merge pull request #57590 from NitzanMordhai/wip-66141-reef

reef: common/pick_address: check if address in subnet all public address

Reviewed-by: Laura Flores <lflores@redhat.com>

Merge pull request #62124 from ifed01/wip-ifed-fragmentation-command-reef

reef: tool/ceph-bluestore-tool: fix wrong keyword for 'free-fragmentation' …

Reviewed-by: akupczyk@ibm.com

Merge pull request #62209 from aaSharma14/wip-67936-reef

reef: mgr/dashboard: Fix variable capitalization in embedded rbd-details panel

Reviewed-by: Naman Munet <naman.munet@ibm.com>

Merge pull request #61892 from k0ste/wip-70068-reef

reef: os/bluestore: fix the problem that _estimate_log_size_N calculates the log size incorrectly

qa/workunits/rbd: wait for resize to be applied in rbd-nbd

Implement the same logic as in commit 6f3d0f570f1a ("test/librbd/fsx:
wait for resize to propagate in krbd_resize()").

Fixes: https://tracker.ceph.com/issues/66419
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit bedc75fff2876d2edd4e5af2e008467fa480b4c6)

mgr/dashboard: Fix variable capitalization in embedded rbd-details panel

Fix capitalization of image and pool variables in embedded grafana rbd-details panel

Fixes: https://tracker.ceph.com/issues/67849
Signed-off-by: Juan Ferrer Toribio <22457707+juan-ferrer-toribio@users.noreply.github.com>
(cherry picked from commit dfca044b6466d599fc4eb50f31bc40949e91e70e)

Merge pull request #62162 from phlogistonjohn/wip-70345-reef

reef: build-with-container: fixes and enhancements

Reviewed-by: Adam King <adking@redhat.com>

Merge pull request #62175 from idryomov/wip-64063-reef

reef: rbd-nbd: use netlink interface by default

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #61379 from vshankar/wip-reef-client-secfix

reef: client: disallow unprivileged users to escalate root privileges

Reviewed-by: Milind Changire <mchangir@redhat.com>

Merge pull request #62193 from zdover23/wip-doc-2025-03-10-backport-62176-to-reef

reef: doc/releases: Add ordering comment to releases.yml

doc/releases: Add ordering comment to releases.yml

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 2290a904bd6a3194f77b76d7deb0c9d24c8b4b93)

Merge pull request #62065 from dmick/wip-70241-reef

reef: container/build.sh: remove local container images

Merge pull request #62129 from idryomov/wip-65720-reef

reef: librbd: add rbd_diff_iterate3() API to take source snapshot by ID

Reviewed-by: Vinay Bhaskar Varada <vvarada@redhat.com>

Merge pull request #62127 from idryomov/wip-70190-reef

reef: librbd: fix a deadlock on image_lock caused by Mirror::image_disable()

Reviewed-by: Vinay Bhaskar Varada <vvarada@redhat.com>

Revert "test/librbd/fsx: switch to netlink interface for rbd-nbd"

This reverts commit 1a128a8d8c5cc4313fa301db5381af9963940383.

With commit fcbf7367d285 ("rbd-nbd: map using netlink interface by
default") backported to reef, this reef-only fixup limited to fsx is no
longer needed.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd-nbd: map using netlink interface by default

Mapping rbd images to nbd devices using ioctl interface is not
robust. It was discovered that the device size or the md5 checksum
of the nbd device was incorrect immediately after mapping using
ioctl method. When using the nbd netlink interface to map RBD images
the issue was not encountered. Switch to using nbd netlink interface
for mapping.

Fixes: https://tracker.ceph.com/issues/64063
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit fcbf7367d285629b382e3d9d32ac354319d1cc66)

Conflicts:
PendingReleaseNotes [ moved to >=18.2.5 section ]

Merge pull request #62037 from ceph/template-reef

Links to Jenkins jobs in PR comment commands / Remove deprecated commands

test/pybind/rbd: fix read offset in write zeroes tests

Random data is written and write zeroes is invoked on 0~256, but the
read is done on 256~256. This means that if write zeroes malfunctions
the test wouldn't catch it (especially in the thick provision case).
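
The blind spot can be reproduced with a byte buffer standing in for the
image. This sketch assumes the data was written to 0~256 of an otherwise
untouched image; the helpers are hypothetical and a fixed pattern replaces
the random data so the result is deterministic.

```python
PATTERN = b"\xab" * 256   # stand-in for the random data
ZEROES = b"\0" * 256


def write(image, offset, data):
    image[offset:offset + len(data)] = data


def broken_write_zeroes(image, offset, length):
    """Simulates a malfunctioning write zeroes that does nothing."""


def read(image, offset, length):
    return bytes(image[offset:offset + length])


image = bytearray(512)
write(image, 0, PATTERN)
broken_write_zeroes(image, 0, 256)

# Old check: reads 256~256, which was never written, so it always
# sees zeroes and passes even though write zeroes did nothing.
old_check_passes = read(image, 256, 256) == ZEROES

# Fixed check: reads 0~256, the range that was actually zeroed.
fixed_check_passes = read(image, 0, 256) == ZEROES
```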

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit d41f0fa01f59a8d056dc28934b92212c78a05a62)

librbd: add rbd_diff_iterate3() API to take source snapshot by ID

Allow a diff to start from a non-user snapshot.  This would be used by
the "rbd du" command, and in other places, to account for non-user
snapshots, which are currently just skipped, potentially resulting in
underreported space usage.

Fixes: https://tracker.ceph.com/issues/65720
Co-authored-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Vinay Bhaskar Varada <vvarada@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 54f47cc28ffd2d29b4f8cfaf56a5a5be2909bde7)

Conflicts:
src/include/rbd/librbd.h [ commit e5ccce14c4b0 ("rbd: add group
  snap info command") not in reef ]
src/test/pybind/test_rbd.py [ commit d7fd66ec9944 ("librbd: add
  rbd_clone4() API to take parent snapshot by ID") not in reef ]

doc: document the new container build tool and link to it in README

Add a new markdown file in the root of the tree, ContainerBuild.md, that
can serve as a basic introduction to the new container build tools
recently merged to ceph.
Add a small 'breadcrumb' section to the project README.md to help find
this new document.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 313546146c429e784ec291b686907f47b33c595c)

script/build-with-container: add support for overlay dir

The source dir (aka homedir, default /ceph) is mounted in the container
read-write. This is needed as the various ceph build scripts expect to
write things into the tree - often this is in the build directory - but
not always. This can lead to small messes and/or situations that are
confusing to debug, especially if one is jumping between distros often.
Add an option to use an overlay volume for the homedir - by default we
enable a persistent overlay with a supplied "upper dir" where files that
were written will appear. One can also enable a temporary overlay that
forgets the writes when the container exits - maybe useful when doing
experiments in 'interactive' mode.

To use this option run the command with the `--overlay-dir=<dir>` option.
For example: `./src/script/build-with-container.py -b build.inner
--overlay-dir build.ovr`. This will create a directory
`build.ovr/content` automatically and all new files will appear there.
For example the build directory will appear at
`build.ovr/content/build.inner`.

To use the temporary overlay use a `-` as the directory name. For
example: `./src/script/build-with-container.py -b build.inner
--overlay-dir -`

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 794e3d0b25a05e019e549eb51ba0ddba1268d5a6)

script/build-with-container: skip dnf cache dir volume mounts on docker

When using docker the --volume option is not available during build
(docker [buildx] build), unlike podman. Since passing these volumes must
be conditional on them being set up I see no way to handle this short of
just disabling the option on docker. Log the fact that it's being
skipped - the only other issue is that we pointlessly set up some dirs
and the build may be a bit slower.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 4208a736652190fdaad3006c435f6c068e81a093)

script/build-with-container: remove default --volume arg from ctr build

On the original github pr #59841 user fayak kindly informed us that the
--volume option was not supported by docker build. Since this section
was a leftover from a previous way of constructing the builder image and
was no longer needed we simply removed it.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 612a9d6808f4f1d4f93aeca055acba064e7a1209)

script/build-with-container.py: build builder image with --pull=always

Construct the builder image using the --pull=always flag to initiate a
pull of the base image (centos, ubuntu, etc) in order to avoid using a
stale base image. Since the script automatically (by default) avoids
building if a matching tag is in local container storage it is handy to
use a fresh base when it *is* time to build something. Otherwise, you
end up in a situation like I sometimes do - using a months old base
unintentionally.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit f6e6188e30a9d765e86bd2d710666cfbdeb0818c)

script/build-with-container: add a common packages target

Add a `packages` target to build-with-container.py that requests a build
of packages, whatever package type is native to the distro selected.
For example `./src/script/build-with-container.py -d ubuntu22.04 -e
packages` will automatically select a deb packages build where
`./src/script/build-with-container.py -d centos9 -e packages` will
trigger rpm packages to be built. The underlying package-type specific
targets remain unchanged.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 37b7d509c59348ae11badd6673cb49ce9ce303fa)

script/build-with-container: support custom tag suffixes

Previously, one could use the `--tag` option to completely override the
container tag generated by the script. However, there are cases where
one may want to add information to the tag rather than override it.
Allow the tag value to start with a plus (+) character that indicates
that the remainder of the string is to be suffixed to the generated tag.
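
The override-versus-suffix rule can be sketched as follows. The function
name `resolve_tag` and the "." separator are assumptions for this example;
the script may join the suffix differently.

```python
def resolve_tag(generated, tag_option, sep="."):
    """Return the container tag for a given --tag value (hypothetical helper)."""
    if not tag_option:
        # No --tag given: keep the generated tag.
        return generated
    if tag_option.startswith("+"):
        # Leading plus: append the remainder to the generated tag.
        return "%s%s%s" % (generated, sep, tag_option[1:])
    # Anything else overrides the generated tag completely.
    return tag_option


default_tag = resolve_tag("main.centos9", None)
override_tag = resolve_tag("main.centos9", "mytag")
suffixed_tag = resolve_tag("main.centos9", "+debug")
```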

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 30836c4ed4b9332f22b31897ce4ece0ad4da6fc0)

script/build-with-container: add --base-branch cli option

Add a command line option --base-branch that allows the user to supply a
custom base branch name. git doesn't make determining this easy so we
always assume a base branch of 'main' by default - but this option lets
one change that.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit ff34bf7241f1a1072f74494cc8f50156e0076019)

src/script: rename CEPH_BRANCH to CEPH_BASE_BRANCH for build container

Previously, we were passing build argument of CEPH_BRANCH, but that was
a bit misleading as we expect the current branch to vary a bit (as users
will be using branches to develop and test the code). What we actually
care about is the base branch ('main', 'squid', etc) as that is fed into
our bootstrap script and we want the option to make simple variations based
on the name of said base branch.
Rename CEPH_BRANCH to CEPH_BASE_BRANCH for clarity.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit a1d49d557cfcc75bab6121e652350a6bfec3409f)

script/build-with-container: add --current-branch cli option

Add a new --current-branch argument that lets the user supply a name for
the current branch. This allows the automatic tag generation to avoid
calling git - something useful if the tree is not using a git checkout
(like a tarball). It also allows you to pull a temporary branch in git
but ignore it and act like the temporary branch is the base branch.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit c1713c5bc37b7e31bd84555066c06a72bb0f025b)

script/build-with-container: add more distro aliases

Add a system to define distro name aliases and use that to define some
additional aliases, primarily to match ubuntu codenames rather than
version numbers. Requested by Zack.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 65f055f0d8390b9787007433d16cf3a1737584ff)

script/build-with-container: apply black formatting to file

After the last set of fixes and enhancements I forgot to reformat the
file. This applies standard `black` formatting to the file.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit de855aec1c7a483ca5f0971a149860e8aaee8f7f)

Merge pull request #61531 from soumyakoduri/wip-skoduri-reef

reef: rgw: Fix LC process stuck issue

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

Merge pull request #55431 from adk3798/reef-mcltf-true

reef: qa/tasks/cephadm: enable mon_cluster_log_to_file

Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #61434 from idryomov/wip-57864-reef

reef: qa/tasks: Include stderr on tasks badness check.

librbd: fix a deadlock on image_lock caused by Mirror::image_disable()

With Mirror::image_disable() taking image_lock for write and calling
list_children() under it, the following deadlock is possible:

1. Mirror::image_disable() takes image_lock for write and calls
   list_children()
2. AbstractWriteLog::periodic_stats() timer fires (it runs every
   5 seconds) and ImageCacheState::write_image_cache_state() is called
   under a global timer_lock
3. ImageCacheState::write_image_cache_state() successfully takes
   owner_lock and blocks attempting to take image_lock for read because
   it's already held for write by Mirror::image_disable()
4. list_children() blocks inside of a call to ImageState::close() on
   a descendant image
5. The descendant image close can't proceed because TokenBucketThrottle
   requires a global timer_lock to complete QosImageDispatch shutdown
6. safe_timer thread which is holding timer_lock can't proceed because
   ImageCacheState::write_image_cache_state() is effectively blocked on
   the descendant image close through Mirror::image_disable()
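
The six steps above form a cycle in a wait-for graph, which the sketch
below makes explicit. The node labels and the `find_cycle` helper are
made up for this example, and the graph is a simplification in which each
party waits on at most one other.

```python
def find_cycle(waits_for):
    """Return the nodes of a cycle in a wait-for graph, or None."""
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path
            if node in seen:
                break
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None


DISABLE = "Mirror::image_disable (holds image_lock)"
CLOSE = "descendant image close (needs timer_lock)"
TIMER = "safe_timer thread (holds timer_lock)"

# image_lock taken for write: the timer thread blocks on image_lock.
write_locked = {DISABLE: CLOSE, CLOSE: TIMER, TIMER: DISABLE}

# image_lock taken for read: the timer thread's read lock is granted,
# so it no longer waits on Mirror::image_disable().
read_locked = {DISABLE: CLOSE, CLOSE: TIMER}

deadlock = find_cycle(write_locked)
no_deadlock = find_cycle(read_locked)
```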

Until commit 281a64acf920 ("librbd: remove snapshot mirror image-meta
when disabling"), Mirror::image_disable() was taking image_lock only for
read meaning that this deadlock wasn't possible.  The only other change
that commit 281a64acf920 made to the code block protected by image_lock
was using child_mirror_image_internal for cls_client::mirror_image_get()
call on descendant images instead of mirror_image_internal to preserve
the value of mirror_image_internal for later.  Both are local variables
that have nothing to do with image_lock, so I'm going back and making
Mirror::image_disable() take image_lock only for read again.

Fixes: https://tracker.ceph.com/issues/70190
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ff9aa20bc358775bf372052787322b1452886c13)

tool/ceph-bluestore-tool: fix wrong keyword for 'free-fragmentation' command.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 33037eccf07ded85ba9127bde333184a4de8f060)

Merge pull request #62104 from cbodley/wip-70152

reef: qa/rgw: avoid 'user rm' of keystone users

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

qa/rgw: avoid 'user rm' of keystone users

partial backport of 2390788b89037bf5121adf4251b980dc20a8f269 did not
include a nearby change from ff81a31ad678472e6847ad39f57e14efd89b0ead

Fixes: https://tracker.ceph.com/issues/70152
Signed-off-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #61575 from zdover23/wip-doc-2025-01-30-backport-61566-to-reef

reef: doc/cephadm: simplify confusing math proposition

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #56408 from batrick/wip-65082-reef

reef: mon: do not log MON_DOWN if monitor uptime is less than threshold

Merge pull request #62046 from pritha-srivastava/wip-69257-reef

reef: rgw/sts: fix to disallow unsupported JWT algorithms

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #62045 from nbalacha/wip-70098-reef

reef: librbd: fix a crash in get_rollback_snap_id

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #62043 from nbalacha/wip-69983-reef

reef: rbd-mirror: fix possible recursive lock of ImageReplayer::m_lock

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #61595 from yuvalif/wip-63630-reef

reef: rgw/test/kafka: let consumer read events from the beginning

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

Merge pull request #61973 from rhcs-dashboard/wip-70122-reef

reef: mgr/dashboard: disable deleting bucket with objects

Reviewed-by: Afreen Misbah <afreen@ibm.com>

Merge pull request #62078 from zdover23/wip-doc-2025-03-03-backport-62076-to-reef

reef: doc/rados/operations: Clarify stretch mode vs device class

doc/rados/operations: Clarify stretch mode vs device class

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 75be0272e8469ed214302b8f354bed675cdcaed6)

Merge pull request #61403 from ronen-fr/wip-rf-61289-reef

reef: common: fix md_config_cacher_t

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

container/build.sh: remove local container images

Removal can optionally be skipped by those who want to run build.sh
locally and use the images. The default is to remove, for Jenkins
builders, which will build, push, and rmi.

Fixes: https://tracker.ceph.com/issues/70196
Signed-off-by: Dan Mick <dan.mick@redhat.com>
(cherry picked from commit 642e5f2da00ad7382393c2b721078bccb9f823c0)

qa/workunits/rbd: add a test for force promote with a user snapshot

Add a reproducer for the crash on a bad variant access which was fixed
in commit 7d75161051da ("librbd: fix a crash in get_rollback_snap_id").

The reproducer deliberately works around many other issues with force
promote in snapshot-based mirroring: stopping rbd-mirror daemon
shouldn't be necessary (let alone with SIGKILL), get_rollback_snap_id()
and its caller can_create_primary_snapshot() are flawed and can pick
the wrong snapshot to roll back to or skip rollback when it's actually
required, the user snapshot in this scenario should be removed as part
of force promoting because it's incomplete and won't be usable after
the image is promoted, etc.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 0f4a37dd9f28070d0d421379385a5f2912cc9627)

Conflicts:
qa/workunits/rbd/rbd_mirror_journal.sh [ commits 3fd8a0388735
  ("qa/workunits/rbd: merge journal and snapshot test scripts")
  and 3fdbc160bb21 ("rbd-mirror: allow mirroring to a different
  namespace") not in reef ]
qa/workunits/rbd/rbd_mirror_snapshot.sh [ duplicated/cloned for
  snapshot-based mirroring ]

Merge pull request #62057 from zdover23/wip-doc-2025-02-28-backport-61626-to-reef

reef: doc/rados: improve pg_num/pgp_num info

doc/rados: improve pg_num/pgp_num info

Improve the guidance on setting pg_num, and clear up confusion about
whether pgp_num should, or even can, be set manually.

This PR was raised in response to Mark Schouten's email here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/CBDJTLTTIEZVG7GVZBX37UAWGYNSSMPD/

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit c43e7337212fe38e8db63d00345fa9858b3cb10a)

mgr/dashboard: disable deleting bucket with objects

Fixes: https://tracker.ceph.com/issues/70078
Signed-off-by: Naman Munet <naman.munet@ibm.com>
(cherry picked from commit 11677c29ee6ee60d9191edfdbfbe37b5308eb45e)

Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-list/rgw-bucket-list.component.ts

Merge pull request #61980 from rhcs-dashboard/wip-70139-reef

reef: mgr/dashboard: critical confirmation modal changes

Reviewed-by: Afreen Misbah <afreen@ibm.com>

[CVE-2024-48916] rgw/sts: fix to disallow unsupported JWT algorithms
while authenticating AssumeRoleWithWebIdentity using JWT obtained
from an external IDP.
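
The idea of rejecting unsupported algorithms before any signature check
can be sketched as follows. This is an illustration only, not the RGW
code: ALLOWED_ALGS, jwt_header(), check_alg() and make_token() are
hypothetical names, and the allowlist contents are an assumption, since
this log does not say which algorithms RGW accepts.

```python
import base64
import json

# Hypothetical allowlist (assumption): the set RGW actually accepts is
# not stated in this log.
ALLOWED_ALGS = frozenset({"RS256", "RS384", "RS512"})


def jwt_header(token: str) -> dict:
    """Decode the unverified JOSE header of a compact-form JWT."""
    header_b64 = token.split(".")[0]
    # base64url segments omit padding; restore it before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def check_alg(token: str) -> str:
    """Return the token's "alg", or raise ValueError if it is not allowed."""
    alg = jwt_header(token).get("alg")
    if alg not in ALLOWED_ALGS:
        raise ValueError(f"unsupported JWT algorithm: {alg!r}")
    return alg


def make_token(alg: str) -> str:
    """Build an unsigned demo token carrying the given alg (demo helper)."""
    raw = json.dumps({"alg": alg, "typ": "JWT"}).encode()
    header = base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{header}.e30.sig"
```

A token whose header names an algorithm outside the allowlist (for
example "none") is rejected before its signature is looked at.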

Fixes: https://tracker.ceph.com/issues/68836

Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
(cherry picked from commit 919da3696668a07c6810dfa39301950c81c2eba4)

librbd: fix a crash in get_rollback_snap_id

get_rollback_snap_id() did not check if the snapshot it was
accessing was a mirror snapshot, causing it to crash if it wasn't.
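
The class of bug can be sketched in Python as a type check before
touching fields that exist on only one alternative of a union. This is
a loose analogue, not the librbd logic: UserSnapshot, MirrorSnapshot,
Snap and last_complete_mirror_snap_id() are hypothetical names, and the
selection rule shown is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class UserSnapshot:
    name: str


@dataclass
class MirrorSnapshot:
    complete: bool


@dataclass
class Snap:
    snap_id: int
    # A snapshot namespace is one of several alternatives.
    namespace: Union[UserSnapshot, MirrorSnapshot]


def last_complete_mirror_snap_id(snaps: List[Snap]) -> Optional[int]:
    """Return the newest complete mirror snapshot id, skipping other kinds."""
    for snap in reversed(snaps):
        ns = snap.namespace
        # Check the alternative first: reading ns.complete on a
        # UserSnapshot would raise AttributeError.
        if not isinstance(ns, MirrorSnapshot):
            continue
        if ns.complete:
            return snap.snap_id
    return None
```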

Fixes: https://tracker.ceph.com/issues/70075
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
(cherry picked from commit 7d75161051dad8047925259555d7ddd1a3e98de2)

rbd-mirror: fix possible recursive lock of ImageReplayer::m_lock

If the periodic status update (a LambdaContext queued from
handle_update_mirror_image_replay_status()) races with shutdown and
ends up being the last in-flight operation that shutdown was pending
on, we attempt to recursively acquire m_lock in shut_down() because
m_in_flight_op_tracker.finish_op() is called with m_lock (and also
m_threads->timer_lock) held. These locks are needed only for the call
to schedule_update_mirror_image_replay_status() and should be unlocked
immediately.
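
The locking pattern described above can be sketched in Python with a
non-reentrant threading.Lock. This is a simplified model, not the
rbd-mirror code: OpTracker, Replayer, periodic_update() and the
"scheduled" counter are hypothetical stand-ins, and timer_lock is left
out.

```python
import threading


class OpTracker:
    """Counts in-flight ops and runs a callback once the last one ends."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0
        self._on_idle = None

    def start_op(self):
        with self._lock:
            self._in_flight += 1

    def wait_for_ops(self, callback):
        with self._lock:
            if self._in_flight > 0:
                self._on_idle = callback
                return
        callback()

    def finish_op(self):
        with self._lock:
            self._in_flight -= 1
            callback = self._on_idle if self._in_flight == 0 else None
            if callback is not None:
                self._on_idle = None
        # Run the callback outside the tracker's own lock.
        if callback is not None:
            callback()


class Replayer:
    def __init__(self):
        self.m_lock = threading.Lock()  # non-reentrant, like a plain mutex
        self.tracker = OpTracker()
        self.scheduled = 0
        self.stopped = False

    def shut_down(self):
        with self.m_lock:
            self.stopped = True

    def periodic_update(self):
        # Hold m_lock only for the scheduling step ...
        with self.m_lock:
            self.scheduled += 1
        # ... and release it before finish_op(), which may call
        # shut_down() and take m_lock again.
        self.tracker.finish_op()
```

If finish_op() were called inside the "with self.m_lock" block, the
shut_down() callback would try to take the same lock a second time on
the same thread and block forever.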

Fixes: https://tracker.ceph.com/issues/69978
Co-authored-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
(cherry picked from commit c60514087bc29540d3babd7855c5a4e28f2bf1b0)

Merge PR #57190 into reef

* refs/pull/57190/head:
pybind/mgr/mgr_module: turn off all automatic transactions
pybind/mgr: disable sqlite3/python autocommit
qa/tasks/mgr: add tests for sqlite autocommit
qa/tasks/vstart_runner: run daemons in foreground
qa/tasks/vstart_runner: add missing poll method
qa/suites/rados/mgr: add cli/devicehealth tasks
qa: reorganize mgr unit tests
qa: use position-independent link
qa: add missing terminating newline
pybind/mgr: add killpoint for sqlite3 database setup
mgr: allow specifying module option level
mon/MgrMonitor: promote standby when unsetting down flag
mon/MgrMonitor: only drop active if exists

Reviewed-by: Laura Flores <lflores@redhat.com>

doc: PR Template - Remove non-functional trigger phrases

Signed-off-by: David Galloway <david.galloway@ibm.com>

doc: PR Template - Add Jenkins job URLs to commands

Signed-off-by: David Galloway <david.galloway@ibm.com>

Merge pull request #61831 from idryomov/wip-69911-reef

reef: librbd: fix mirror image status summary in a namespace

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #61916 from adk3798/wip-68158-reef

reef: cephadm: Support Docker Live Restore

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>

Merge pull request #62005 from zdover23/wip-doc-2025-02-26-backport-62001-to-reef

reef: doc: fix incorrect radosgw-admin subcommand

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>