Merge PR #65817 into wip-jcollin-testing-20251016.055245-squid

* refs/pull/65817/head:
src/common: add helper to prepend "..." to trimmed paths
mds/ScrubStack: avoid generating inode path since it is unused
mds: fix few log entries
client: trim path before logging it
mds: log trimmed path wherever generating full path is necessary
mds: for logging generate only 10 final components of dentry path
mds: for logging generate only 10 final components of inode path
qa, test: run unit tests for cephfs.pyx with non-root user
test/pybind: add unit tests for rmtree() in cephfs python bindings
pybind/cephfs, mgr/volumes: refactor purge() to be non-recursive

Merge PR #65821 into wip-jcollin-testing-20251016.055245-squid

* refs/pull/65821/head:
client: Fix a deadlock when osd is full

Merge PR #65822 into wip-jcollin-testing-20251016.055245-squid

* refs/pull/65822/head:
mds/FSMap: fix join_fscid being incorrectly reset for active MDS during filesystem removal

Merge PR #65823 into wip-jcollin-testing-20251016.055245-squid

* refs/pull/65823/head:
mds: fix rank 0 marked damaged if stopping fails after Elid flush and log trimmed

Merge PR #65824 into wip-jcollin-testing-20251016.055245-squid

* refs/pull/65824/head:
mds: fix test that directory has no snaps
qa: test for child dir with first beyond parent snaps
qa: remove extraneous directory from test
qa: correct test description

Merge pull request #65945 from phlogistonjohn/jjm-bwc-variants-s

squid: build-with-container: build image variants

Merge pull request #65928 from rhcs-dashboard/wip-73509-squid

squid: mgr/dashboard: Fixed usage bar for secondary site in rbd mirroring

Reviewed-by: Afreen Misbah <afreen@ibm.com>

script/build-with-container: add build image variants

Allow the user to control the content of the build image with a
high-level `--image-variant=` switch. Currently the supported values are
`default` (the same maximal image we have been generating) and
`packages`, a slimmer image that avoids installing certain test-only
dependencies.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

Dockerfile.build: make FOR_MAKE_CHECK a build argument

Set it only during install time.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

install-deps.sh: let FOR_MAKE_CHECK variable take precedence

Previously, the FOR_MAKE_CHECK variable could only enable installing
extra (test) dependencies when install-deps.sh was used and it was
ignored if `tty -s` exited true. This change allows FOR_MAKE_CHECK to
take precedence over the tty check and to specify one of true, 1, yes to
enable extra "for make check" deps or false, 0, no to explicitly disable
the extra deps.
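
The precedence described above can be sketched in Python. This is only an
illustration: install-deps.sh is a shell script, and the function name
`resolve_for_make_check` and the `tty_decision` parameter are hypothetical.
Falling back to the tty-based decision for unset or unrecognized values is
an assumption, not something the commit message states.

```python
from typing import Optional

# Values named in the commit message.
TRUE_VALUES = {"true", "1", "yes"}
FALSE_VALUES = {"false", "0", "no"}


def resolve_for_make_check(env_value: Optional[str], tty_decision: bool) -> bool:
    """Return True if the extra "for make check" deps should be installed.

    env_value is the content of FOR_MAKE_CHECK (None when unset).
    tty_decision is whatever the `tty -s` based logic would have chosen.
    """
    if env_value in TRUE_VALUES:
        return True
    if env_value in FALSE_VALUES:
        return False
    # Assumption: unset or unrecognized values defer to the tty check.
    return tty_decision
```

With this shape, an explicit `FOR_MAKE_CHECK=no` wins even when the tty
check would have enabled the extra deps.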

Based-on-work-by: Dan Mick <dan.mick@redhat.com>
Signed-off-by: John Mulligan <jmulligan@redhat.com>

Merge pull request #60839 from vshankar/wip-68922-squid

squid: qa/cephfs: randomize configs in `fs:thrash:workloads`

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge pull request #61301 from batrick/wip-68722-squid

squid: qa/cephfs: override testing kernel with -k option

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge pull request #61303 from batrick/wip-68450-squid

squid: qa: ignore pg availability/degraded warnings

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge pull request #61304 from batrick/wip-68244-squid

squid: qa: correct daemon for warning conf

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge pull request #62091 from batrick/wip-70156-squid

squid: qa: ignore variant of down fs

Reviewed-by: Rishabh Dave <ridave@redhat.com>

mgr/dashboard: Fixed usage bar for secondary site in rbd mirroring
Fixes: https://tracker.ceph.com/issues/73447
Signed-off-by: Abhishek Desai <abhishek.desai1@ibm.com>
(cherry picked from commit 60140b1ccc8006325632320e39fc209724524aef)

Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-list/rbd-list.component.html

Merge pull request #62918 from rishabh-d-dave/wip-71018-squid

squid: mgr/vol: add command to get snapshot path

Merge pull request #63222 from rishabh-d-dave/wip-71276-squid

squid: mgr/vol: make "snapshot getpath" cmd work with v1 and legacy

Merge pull request #64205 from rishabh-d-dave/wip-71854-squid

squid: mgr/vol: include group name in subvolume's pool namespace name

Merge pull request #65838 from phlogistonjohn/jjm-rmc-backport-squid

squid: run-make-check.sh: handle sudo and command that may not run in container

qa: ignore variant of down fs

Fixes: https://tracker.ceph.com/issues/70107
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
(cherry picked from commit 1c0359dcf00309049d1b2081c64ea8bade5dafa1)

Conflicts:
qa/cephfs/overrides/ignorelist_health.yaml: trivial

Merge pull request #65444 from NitzanMordhai/wip-72919-squid

squid: suites/rados/cephadm: typo in ignore list for still running message

Merge pull request #65844 from phlogistonjohn/jjm-bwc-backports-s

squid: sync build-with-container patches from main

script/build-with-container: improve error handling for invalid distros

Instead of throwing a long obnoxious traceback at the user if the value
supplied to -d/--distro is invalid, do something nicer. For example:
```
$ ./src/script/build-with-container.py -d trixy -e build
usage: build-with-container.py [-h] [--help-build-steps]
build-with-container.py: error: argument --distro/-d: unknown distro: 'trixy' not in centos10, centos10stream, centos8, centos9, centos9stream, rocky9, rockylinux9, rocky10, rockylinux10, fedora41, fc41, fedora42, fc42, fedora43, fc43, ubuntu20.04, ubuntu-focal, focal, ubuntu22.04, ubuntu-jammy, jammy, ubuntu24.04, ubuntu-noble, noble, debian12, debian-bookworm, bookworm, debian13, debian-trixie, trixie

```
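
One way to get this kind of message from argparse is a `type=` callable
that raises `argparse.ArgumentTypeError`. The sketch below is not the
script's actual code: `KNOWN_DISTROS` is an abbreviated list and
`distro_arg` is a hypothetical name.

```python
import argparse

# Abbreviated, illustrative subset of the names shown in the error above.
KNOWN_DISTROS = ["centos9", "ubuntu22.04", "jammy", "debian13", "trixie"]


def distro_arg(value: str) -> str:
    """Validate a distro name; argparse turns the exception into a usage error."""
    if value not in KNOWN_DISTROS:
        raise argparse.ArgumentTypeError(
            f"unknown distro: {value!r} not in {', '.join(KNOWN_DISTROS)}"
        )
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="build-with-container.py")
    parser.add_argument("--distro", "-d", type=distro_arg)
    return parser
```

argparse catches the `ArgumentTypeError`, prints the usage line followed by
`error: argument --distro/-d: ...`, and exits with status 2 instead of
showing a traceback.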

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 72f3ad9549e84bdba7bdfd97d2ede3c55e02f103)

script/build-with-container: add debian 13 (trixie)

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit a13fa091dd6bad35c44076cb7c46cb7bcc17a7ac)

script/build-with-container: add ubuntu 20.04 (focal)

Add ubuntu 20.04 (focal) to the available list of distro kinds.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 7c40f7bd07ac935d0657b9284118da8590a5cf0d)

script/build-with-container: add a pair of fedora distro versions

Add fedora 42 and the soon-to-be-released fedora 43.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 76fe5ad298ee5626eeb63591a702e8f8cc9be7d0)

script/build-with-container: lightly organize the distro kind aliases

Do a tiny reorg of the distro kind aliases and container images to keep
the EL distros together and comment out each "section".

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 4430a5ad6be6f26309d5f5bea0e448a4bbd432e1)

script/build-with-container: be consistent with naming in distro kinds

Update the DistroKind enum and related items so that the naming is
applied consistently. That is: the canonical (no pun indented) form
of the name is "<name><version>" and codenames, such as "jammy" or
"bookworm" are aliases. This matches the previously existing code.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit ac11a80a63ab1909fbdf682d830acde96856f502)
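
The naming rule from the commit above (canonical "<name><version>" values
with codenames as aliases) could look roughly like this. It is a minimal
sketch with a hypothetical `_ALIASES` table and `from_alias` helper, and it
covers only a few of the names; it is not the enum from
build-with-container.py.

```python
import enum


class DistroKind(enum.Enum):
    # Canonical form: "<name><version>"
    CENTOS9 = "centos9"
    UBUNTU2204 = "ubuntu22.04"
    DEBIAN12 = "debian12"

    @classmethod
    def from_alias(cls, name: str) -> "DistroKind":
        """Map a canonical name or an alias to its DistroKind member."""
        return cls(_ALIASES.get(name, name))


# Aliases (codenames and alternate spellings) point at canonical values.
_ALIASES = {
    "centos9stream": "centos9",
    "ubuntu-jammy": "ubuntu22.04",
    "jammy": "ubuntu22.04",
    "debian-bookworm": "debian12",
    "bookworm": "debian12",
}
```

Keeping aliases in one table means every lookup ends at a single canonical
spelling, and an unknown name raises `ValueError` from the enum lookup.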

src/script: add bookworm to build-with-container.py

..and its friend buildcontainer-setup.sh

Signed-off-by: Dan Mick <dan.mick@redhat.com>
(cherry picked from commit 34b497c2f3652e7d30c7b7476b711fd9f1f4ecac)

build-with-container: ensure npm dir is set up before configure

When the npm cache path option is passed, the npm cache dir is passed
to all container `run` commands. Ensure the dir has been created
before the first container command (configure) is used.
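
A minimal sketch of the idea, assuming a hypothetical helper named
`ensure_npm_cache_dir` that is called before any container command runs
(the real script may structure this differently):

```python
import pathlib


def ensure_npm_cache_dir(npm_cache_path):
    """Create the npm cache dir if an npm cache path was supplied."""
    if not npm_cache_path:
        # Option not passed: nothing to create.
        return None
    cache_dir = pathlib.Path(npm_cache_path)
    # exist_ok makes repeated calls from later steps harmless.
    cache_dir.mkdir(parents=True, exist_ok=True)
    return cache_dir
```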

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 79166af192ea0b4b982b56ce521516d5a29e7a0d)

run-make-check.sh: handle sudo and command that may not run in container

Work around a known failure that sudo is not expected to be present in
container images. Prepare to handle a failure to set a sysctl param.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 9f44155dff195015186315968a0a1e8ce925ed5d)

install-deps: extract SUDO variable logic into a reusable function

While the function is pretty simple and could be copy-pasted I
prefer to extract things into functions to indicate that the
logic is used/repeated elsewhere to ward off making changes to
one copy vs the other.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit bbd7933598e11d84758a6f09fd176f47c744aaa2)

src/common: add helper to prepend "..." to trimmed paths

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit c38a9138ba8294ab1243cf03ad0c8b0df4901967)

mds/ScrubStack: avoid generating inode path since it is unused

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 10e4ccb104d84444f0047e166a9dff997c4e2736)

mds: fix few log entries

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit e4c301b9f0204b6a82490a68ec4c3a26db7b013f)

Conflicts:
src/mds/MDSAuthCaps.cc
- is_capable()'s log message is slightly different in Squid branch
leading to conflict.

client: trim path before logging it

A path can be virtually infinitely long, and logging a very long path
(imagine around 2000 path components) is not useful and lowers the
readability of the log. Therefore, trim the path before logging it.
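
The trimming idea can be sketched in Python as follows. The real change is
in C++; the function name `trim_path`, the exact ".../" prefix format, and
the default of 10 components (borrowed from the related MDS commits) are
assumptions made for illustration.

```python
def trim_path(path: str, keep: int = 10) -> str:
    """Keep only the final `keep` components of a path for logging."""
    components = [c for c in path.split("/") if c]
    if len(components) <= keep:
        # Short enough: log it unchanged.
        return path
    # Prepend "..." to signal that leading components were dropped.
    return ".../" + "/".join(components[-keep:])
```

A path with 2000 components is reduced to "..." followed by its last 10
components, while short paths are logged as they are.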

Fixes: https://tracker.ceph.com/issues/72993
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit bdc8aae400fbbdd61df811455d49176deab1f331)

Conflicts:
src/include/filepath.cc
src/include/filepath.h
- Unlike main branch, filepath.cc is absent in this branch. Therefore,
the changes must be moved to filepath.h.

mds: log trimmed path wherever generating full path is necessary

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 11de1e5772fa88125de10dc7972e0e31e33140d0)

Conflicts:
src/mds/MDSAuthCaps.cc
src/test/mds/TestMDSAuthCaps.cc
- is_capable takes one less argument in Squid compared to main branch
  version.

src/mds/Server.cc
src/mds/SessionMap.cc
-  Fewer header files are included in both these files on this branch,
   which made the patch difficult to apply in this region.

mds: fix test that directory has no snaps

Check whether the directory's first is beyond the last snap. This matches the behavior of lssnaps.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Fixes: https://tracker.ceph.com/issues/71462
(cherry picked from commit c22db4e683cf2e6b0decc937e9ab92ba15d46487)

qa: test for child dir with first beyond parent snaps

If the parent directory has snapshots but the child was created after, then we
should be able to modify its charmap.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Fixes: https://tracker.ceph.com/issues/71462
(cherry picked from commit 659e4262d042dc50a381846c25640c76a06bdec2)

qa: remove extraneous directory from test

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Fixes: https://tracker.ceph.com/issues/71462
(cherry picked from commit 7678dbfd8830141ece420fde66bbb1687c616206)

qa: correct test description

This test is checking for failure conditions.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Fixes: https://tracker.ceph.com/issues/71462
(cherry picked from commit c428149b9cb12c9e9b90d305131b669211a56b4b)

mds: fix rank 0 marked damaged if stopping fails after Elid flush and log trimmed

Steps to reproduce:
../src/vstart.sh --debug --new -x --localhost --bluestore
./bin/ceph tell mds.<rank 0> config set mds_kill_shutdown_at 10
./bin/ceph fs set <fs name> down true

Wait for a few seconds; the take-over MDS logs the following
and rank 0 is marked damaged:
2025-09-11T16:47:24.591+0800 785dabeaa6c0 -1 log_channel(cluster) log [ERR] : No subtrees found for root MDS rank!
2025-09-11T16:47:24.591+0800 785dabeaa6c0 5 mds.beacon.b set_want_state: up:rejoin -> down:damaged

During shutdown_pass, after submitting ELid and trimming the MDLog, the
MDS log contains only the ELid event, which does nothing at replay.
After replay, no subtree is found.

Fix this by checking whether the MDLog contains only one event.
If so, skip the subtree check for rank 0, and allow it to request
STATE_STOPPED just like the other ranks.

Fixes: https://tracker.ceph.com/issues/72983
Signed-off-by: ethanwu <ethanwu@synology.com>
(cherry picked from commit adb448b4f4e421f75275874f5a67c3a2ceb0214c)

mds/FSMap: fix join_fscid being incorrectly reset for active MDS during filesystem removal

Fix bug where active MDS daemons in remaining filesystems incorrectly
have their join_fscid cleared to FS_CLUSTER_ID_NONE when any other
filesystem is removed.

The issue was caused by variable name shadowing in erase_filesystem(),
where the loop variable 'fscid' shadowed the function parameter 'fscid'.
Inside the loop, `if (info.join_fscid == fscid)` compared against the
loop variable (remaining FS ID) instead of the parameter (removed FS ID).

Renamed the loop variable to 'remaining_fscid' to eliminate the shadowing
and ensure the comparison uses the correct filesystem ID.
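
The effect of the shadowing can be reproduced with a small Python analogue.
This is not the FSMap code: the dict-based data model and both function
names are hypothetical, -1 is used here as a placeholder for
FS_CLUSTER_ID_NONE, and in Python the loop variable rebinds the parameter
rather than shadowing it in a nested scope, which leads to the same wrong
comparison.

```python
FS_CLUSTER_ID_NONE = -1  # placeholder value for this sketch


def erase_filesystem_buggy(filesystems, fscid):
    """filesystems maps fscid -> list of MDS info dicts."""
    del filesystems[fscid]
    # Bug: the loop variable reuses the name of the parameter.
    for fscid, mds_infos in filesystems.items():
        for info in mds_infos:
            if info["join_fscid"] == fscid:  # compares to the remaining FS ID
                info["join_fscid"] = FS_CLUSTER_ID_NONE


def erase_filesystem_fixed(filesystems, fscid):
    del filesystems[fscid]
    # Fix: a distinct loop variable keeps the parameter intact.
    for remaining_fscid, mds_infos in filesystems.items():
        for info in mds_infos:
            if info["join_fscid"] == fscid:  # compares to the removed FS ID
                info["join_fscid"] = FS_CLUSTER_ID_NONE
```

With filesystems 1 and 2 and an MDS in filesystem 1 whose join_fscid is 1,
removing filesystem 2 resets that MDS in the buggy version and leaves it
untouched in the fixed one.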

Reproducer:
../src/vstart.sh --new -x --localhost --bluestore
FS=b
./bin/ceph osd pool create cephfs.${FS}.meta 64 64 replicated
./bin/ceph osd pool create cephfs.${FS}.data 64 64 replicated
./bin/ceph fs new ${FS} cephfs.${FS}.meta cephfs.${FS}.data
./bin/ceph config set mds.a mds_join_fs a
./bin/ceph config set mds.b mds_join_fs a
./bin/ceph fs fail ${FS}
./bin/ceph fs rm ${FS} --yes-i-really-mean-it

Then, in the output of ./bin/ceph fs dump, we can see that join_fscid
is reset for all active MDS daemons of filesystem 'a'. Since there are
standby MDS daemons with join_fscid=1, MDSMonitor thinks they have
better affinity and triggers a switchover.

Fixes: https://tracker.ceph.com/issues/73183
Signed-off-by: ethanwu <ethanwu@synology.com>
(cherry picked from commit cfecf7c867d20d7d05ab3f341844c7c2b9b733d0)

client: Fix a deadlock when osd is full

Problem:
When an OSD is full, the client receives the notification
and cancels the ongoing writes. If the ongoing writes
are async, this can cause a deadlock because the async
callback that was registered also takes the 'client_lock',
which handle_osd_map takes at the beginning.

The op_cancel_writes calls the callback registered for
the async write synchronously holding the 'client_lock'
causing the deadlock.

Earlier approach:
An earlier attempt tried to solve this issue by calling
'op_cancel_writes' without holding 'client_lock'. But this violated the
lock ordering between the objecter's 'rwlock' and the async write's
callback taking 'client_lock': the 'client_lock' should always be taken
before taking 'rwlock'. So that approach was dropped in favor of the
current approach.

Solution:
Use C_OnFinisher for objecter async write callback i.e., wrap
the async write's callback using the Finisher. This queues the
callback to the Finisher's context queue which the finisher
thread picks up and executes thus avoiding the deadlock.
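
The pattern can be illustrated with a small Python analogue. This is a
sketch of the general technique only, not Ceph's Finisher or C_OnFinisher:
the `Finisher` class here, `handle_full_flag`, and `on_write_finished` are
hypothetical stand-ins, and -28 mirrors the r=-28 seen in the traceback.

```python
import queue
import threading


class Finisher:
    """Runs queued callbacks on a dedicated thread."""

    def __init__(self):
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, callback, result):
        self._queue.put((callback, result))

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the thread
                break
            callback, result = item
            callback(result)

    def stop(self):
        self._queue.put(None)
        self._thread.join()


client_lock = threading.Lock()
completed = []


def on_write_finished(result):
    # Like the async write callback, this needs client_lock.
    with client_lock:
        completed.append(result)


def handle_full_flag(finisher, pending_callbacks):
    # The cancel path runs with client_lock held.
    with client_lock:
        for callback in pending_callbacks:
            # Calling callback(-28) directly here would block forever on
            # client_lock; queueing it defers the call to the finisher thread.
            finisher.submit(callback, -28)
```

The callback runs only after the cancel path has released the lock, so the
two never wait on each other.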

Testing:
The fix is tested in the vstart cluster with the following reproducer.
1. Mount the cephfs volume using nfs-ganesha at /mnt
2. Run fio on /mnt on one terminal
3. On the other terminal, blocklist the nfs client session
4. The fio would hang

It reproduces in the vstart cluster most of the time. I think
that's because it's slow. The same test written for teuthology does
not reproduce the issue. The test expects one or more writes
to be ongoing in rados when the client is blocklisted for the deadlock
to be hit.

Stripped down version of Traceback:
----------
0  0x00007f4d77274960 in __lll_lock_wait ()
1  0x00007f4d7727aff2 in pthread_mutex_lock@@GLIBC_2.2.5 ()
2  0x00007f4d7491b0a1 in __gthread_mutex_lock (__mutex=0x7f4d200f99b0)
3  std::mutex::lock (this=<optimized out>)
4  std::scoped_lock<std::mutex>::scoped_lock (__m=..., this=<optimized out>, this=<optimized out>, __m=...)
5  Client::C_Lock_Client_Finisher::finish (this=0x7f4ca0103550, r=-28)
6  0x00007f4d74888dfd in Context::complete (this=0x7f4ca0103550, r=<optimized out>)
7  0x00007f4d7498850c in std::__do_visit<...>(...) (__visitor=...)
8  std::visit<Objecter::Op::complete(...) (__visitor=...)
9  Objecter::Op::complete(...) (e=..., e=..., r=-28, ec=..., f=...)
10 Objecter::Op::complete (e=..., r=-28, ec=..., this=0x7f4ca022c7f0)
11 Objecter::op_cancel (this=0x7f4d200fab20, s=<optimized out>, tid=<optimized out>, r=-28)
12 0x00007f4d7498ea12 in Objecter::op_cancel_writes (this=0x7f4d200fab20, r=-28, pool=103)
13 0x00007f4d748e1c8e in Client::_handle_full_flag (this=0x7f4d200f9830, pool=103)
14 0x00007f4d748ed20c in Client::handle_osd_map (m=..., this=0x7f4d200f9830)
15 Client::ms_dispatch2 (this=0x7f4d200f9830, m=...)
16 0x00007f4d75b8add2 in Messenger::ms_deliver_dispatch (m=..., this=0x7f4d200ed3e0)
17 DispatchQueue::entry (this=0x7f4d200ed6f0)
18 0x00007f4d75c27fa1 in DispatchQueue::DispatchThread::entry (this=<optimized out>)
19 0x00007f4d77277c02 in start_thread ()
20 0x00007f4d772fcc40 in clone3 ()
--------

Fixes: https://tracker.ceph.com/issues/68641
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 60c58013c53d0f280b8f96b7caf9c255b54640fb)

mds: for logging generate only 10 final components of dentry path

Generating full absolute path for dentries for printing in MDS logs
slows down the FS to a great extent especially when the path is very
long (imagine a path with 2000 components). Printing such long paths in
MDS logs is not only pointless but also greatly reduces the readability
of MDS logs.

Therefore, generate only 10 final components of the dentry paths for logging.

Fixes: https://tracker.ceph.com/issues/72779
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 1430cd67d8f7bd7d98b241a7511fa3ceb7e5ba2e)

Conflicts:
src/include/filepath.cc
- Unlike main branch, this file is absent in squid

mds: for logging generate only 10 final components of inode path

Generating full absolute path for inodes for printing in MDS logs slows
down the FS to a great extent especially when the path is very long
(imagine a path with 2000 components). Also printing such long paths in
MDS logs is not only pointless but also greatly reduces the readability
of the MDS logs.

Therefore, generate only 10 final components of inode paths for logging.

Fixes: https://tracker.ceph.com/issues/72779
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 1518690210f3a4473978c7a9274e902fccaad862)

Conflicts:
src/mds/CDir.cc
- The code region that the original patch changed to generate a trimmed
inode path is absent on this branch.

qa, test: run unit tests for cephfs.pyx with non-root user

Run test_python.sh with non-root user. This makes it necessary to change
the owner user and group of file system root to be the same as this
non-root user. This brings testing closer to the real-world scenario and
also allows exercising negative tests where an FS op would fail for a
non-root user but it would pass for root user.

There are a few tests that exercise FS operations where root user is
needed. Group these tests under a separate class and add extra code for
this class that allows these tests to run with root UID and GID.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 6021dda7ed137445885979cd4d4b28c770abce13)

Conflicts:
src/test/pybind/test_cephfs.py
- Slight difference in this file compared to main branch version led to
conflict.

test/pybind: add unit tests for rmtree() in cephfs python bindings

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 05082a932984bb6329481c14ee76ae033c019f4e)

pybind/cephfs, mgr/volumes: refactor purge() to be non-recursive

Method purge() in trash.py calls rmtree(), which is a recursive method. To
avoid hitting Python's recursion limit, switch to a non-recursive approach.

The path to a directory and its directory handle are combined into a tuple,
and that tuple is stored on the stack. Storing the directory handle
dramatically reduces the number of calls to opendir().
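
The stack-of-tuples approach can be sketched against the local file system.
This is an analogue, not the cephfs binding code: it uses `os.scandir()` in
place of the binding's opendir(), and `rmtree_iterative` is a hypothetical
name.

```python
import os


def rmtree_iterative(root):
    """Remove a directory tree without recursion."""
    # Each stack entry is a (path, open directory handle) tuple.
    stack = [(root, os.scandir(root))]
    while stack:
        path, handle = stack[-1]
        entry = next(handle, None)
        if entry is None:
            # Directory exhausted: close the handle and remove the dir.
            handle.close()
            stack.pop()
            os.rmdir(path)
        elif entry.is_dir(follow_symlinks=False):
            # Descend; the parent's handle stays open on the stack, so
            # the parent is not reopened when we come back to it.
            stack.append((entry.path, os.scandir(entry.path)))
        else:
            os.unlink(entry.path)
```

Because each parent's handle stays open while its children are processed,
every directory is opened once. The trade-off is one open handle per level
of depth.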

Fixes: https://tracker.ceph.com/issues/71648
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f9046ca052d10a884a59c1d928cb0c8f0235696b)

Merge pull request #65462 from pdvian/wip-72853-squid

squid: mgr/DaemonState: Minimise time we hold the DaemonStateIndex lock

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Merge pull request #65214 from ifed01/wip-ifed-discard-threads-better-lifecycle-squi

squid: blk/kernel: improve DiscardThread life cycle.

Reviewed-by: YiteGu <yitegu0@gmail.com>

Merge pull request #65006 from mchangir/wip-72564-squid

squid: mgr: avoid explicit dropping of ref

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Merge pull request #65335 from abitdrag/wip-72817-squid

squid: auth: msgr2 can return incorrect allowed_modes through AuthBadMethodFrame

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

Merge pull request #64739 from VinayBhaskar-V/wip-72319-squid

squid: rbd-mirror: prevent image deletion if remote image is not primary

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #65665 from kchheda3/wip-73055-squid

squid: rgw/account: bucket acls are not completely migrated once the user is migrated to an account

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #65709 from aaSharma14/wip-73293-squid

squid: monitoring: fix MTU Mismatch alert rule and expr

Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>

Merge pull request #65706 from rhcs-dashboard/wip-73274-squid

squid: mgr/dashboard: Blank entry for Storage Capacity in dashboard under Cluster > Expand Cluster > Review

Reviewed-by: Pedro Gonzalez Gomez <pegonzal@ibm.com>

monitoring: fix MTU Mismatch alert rule and expr

Fixes: https://tracker.ceph.com/issues/73290
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit bee24dec441b9e6b263e4498c2ab333b0a60a52d)

Conflicts:
monitoring/ceph-mixin/prometheus_alerts.yml
monitoring/ceph-mixin/tests_alerts/test_alerts.yml
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/prometheus/active-alert-list/active-alert-list.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/prometheus/active-alert-list/active-alert-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/shared/datatable/table-key-value/table-key-value.component.scss

release note: add note for change in format of name of pool...

namespace of CephFS volumes.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f350d9800024661eecdfd7da6d57fa0e0324d981)

mgr/dashboard: Blank entry for Storage Capacity in dashboard under Cluster > Expand Cluster > Review

https://tracker.ceph.com/issues/73220

Signed-off-by: Naman Munet <naman.munet@ibm.com>
(cherry picked from commit a01909e7588c7ff757079475e3ea6f1dc3054db7)

Merge pull request #64456 from cbodley/wip-72090-squid

squid: deb/mgr: remove deprecated distutils from ceph-mgr.requires

Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #65141 from mchangir/wip-70925-squid

squid: mds: fix heap-use-after-free in C_Flush_Journal

Merge pull request #65620 from aaSharma14/wip-73167-squid

squid: mgr/dashboard: fix zone update API forcing STANDARD storage class

Reviewed-by: Afreen Misbah <afreen@ibm.com>

Merge pull request #65671 from aaSharma14/wip-73231-squid

squid: monitoring: fix "In" OSDs in Cluster-Advanced grafana panel. Also change units from decbytes to bytes wherever used in the panel

Reviewed-by: Afreen Misbah <afreen@ibm.com>

release note: add a note for "snapshot getpath" command

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit a59b1fa431e2b546877c160beb5f67f2970776f0)

doc/cephfs: add doc for "snapshot getpath" cmd

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 9e40a5c8d7a5cd6e4c1929559c4c7e3411653de5)

qa/cephfs: add tests for "subvolume snapshot getpath" cmd

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 870cbf62d288ae09ea06a5da112ea62156336924)

mgr/vol: add command to get snapshot path

Fixes: https://tracker.ceph.com/issues/70815
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 50d28992d99fcd67390815aa42f9da8ffaa82575)

Conflicts:
src/pybind/mgr/volumes/fs/volume.py
- Line where the original patch makes the change is slightly different
in main compared to Squid branch, leading to conflict.

monitoring/ceph_mixin: fix Cluster - Advanced OSD grafana panel

1. Fixes the promql expr used to calculate "In" OSDs in
ceph-cluster-advanced.json.
2. Fixes the color coding for the single state panels used in the OSDs
grafana panel like "In", "Out" etc

Fixes: https://tracker.ceph.com/issues/72810
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 53a6856d603e0fe4ff31f76e19263a80359a9f1d)

Merge pull request #65659 from ceph/wip-squid-noble

squid: cmake: remove _FORTIFY_SOURCE define

Merge pull request #64605 from cbodley/wip-72190-squid

squid: deb/cephadm: add explicit --home for cephadm user

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

rgw/account: bucket acls are not completely migrated once the user is migrated to an account

Signed-off-by: kchheda3 <kchheda3@bloomberg.net>
(cherry picked from commit 23dc3697cfd309b4d8736ec99490cd57db621cf7)

cmake: remove _FORTIFY_SOURCE define

according to `dpkg-buildflags`, ubuntu 24 raised this value to
`-D_FORTIFY_SOURCE=3` which causes `error: "_FORTIFY_SOURCE" redefined`
compilation failures because Ceph itself adds `-D_FORTIFY_SOURCE=2`

`_FORTIFY_SOURCE` is a hardening option. both our rpm and debian builds
already specify that via environment variables, so Ceph's cmake should
leave it alone

Fixes: https://tracker.ceph.com/issues/72361
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 66bec97b0dc90b91f5be586351f52082beb6374a)

Merge pull request #61166 from anoopcs9/wip-69306-squid

squid: client: Handle empty pathnames for `ceph_chownat()` and `ceph_statxat()`

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge pull request #65636 from adk3798/squid-cephadm-pin-cheroot

squid: pybind/mgr: pin cheroot version in requirements-required.txt

Reviewed-by: John Mulligan <jmulligan@redhat.com>

Merge pull request #65588 from adamemerson/wip-perfcounters-unique-string-squid

squid: common: Allow PerfCounters to return a provided service ID

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #65556 from jzhu116-bloomberg/wip-72972-squid

squid: rgw: discard olh_ attributes when copying object from a versioning-suspended bucket to a versioning-disabled bucket

Reviewed-by: Adam Emerson <aemerson@redhat.com>

mgr/dashboard: bump cheroot to > 10.0

Fixes: https://tracker.ceph.com/issues/55837
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 1ec74a8360d1c4abb39754320eba118d080e3499)

client: Gracefully handle empty pathname for statxat()

man statx(2)[1] says the following:
. . .
AT_EMPTY_PATH
    If pathname is an empty string, operate on the file referred to by
    dirfd (which may have been obtained using the open(2) O_PATH flag).
    In this case, dirfd can refer to any type of file, not just a
    directory.

    If dirfd is AT_FDCWD, the call operates on the current working
    directory.
. . .

Look out for an empty pathname and use the relative fd's inode in the
presence of AT_EMPTY_PATH flag before calling internal _getattr().
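
The check can be sketched as follows. This is an illustration in Python,
not the Client.cc change: `resolve_target` and the `path_walk` callable are
hypothetical, and 0x1000 is the Linux value of AT_EMPTY_PATH.

```python
AT_EMPTY_PATH = 0x1000  # value from Linux <fcntl.h>


def resolve_target(dirfd_inode, relpath, flags, path_walk):
    """Pick the inode an *at() call should operate on."""
    if relpath == "" and flags & AT_EMPTY_PATH:
        # Empty pathname plus AT_EMPTY_PATH: operate on dirfd itself.
        return dirfd_inode
    # Otherwise resolve relpath relative to dirfd as usual.
    return path_walk(dirfd_inode, relpath)
```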

Fixes: https://tracker.ceph.com/issues/68189
Review with: git show -w

[1] https://www.man7.org/linux/man-pages/man2/statx.2.html

Signed-off-by: Anoop C S <anoopcs@cryptolab.net>
(cherry picked from commit edd7fe76c4919bc243377c6d7aae20b0606b89c3)

Conflicts:
        src/client/Client.cc
- path_walk() refactor from https://github.com/ceph/ceph/pull/62095
  included the required core changes.

libcephfs.h: Fix API documentation for ceph_statxat

flags parameter for ceph_statxat() API is supposed to accept only
AT_STATX_DONT_SYNC and AT_SYMLINK_NOFOLLOW. Modify the corresponding
documentation to reflect the acceptance of above two flags.

Signed-off-by: Anoop C S <anoopcs@cryptolab.net>
(cherry picked from commit 92c5ab99b8dcaae56e4a92cfe72a7e3d343b8a0c)

client: Gracefully handle empty pathname for chownat()

man fchownat(2)[1] says the following:
. . .
AT_EMPTY_PATH (since Linux 2.6.39)
    If pathname is an empty string, operate on the file referred to by
    dirfd (which may have been obtained using the open(2) O_PATH flag).
    In this case, dirfd can refer to any type of file, not just a
    directory. If dirfd is AT_FDCWD, the call operates on the current
    working directory.
. . .

Look out for an empty pathname and use the relative fd's inode in the
presence of AT_EMPTY_PATH flag before calling internal _setattr().

Fixes: https://tracker.ceph.com/issues/68189
Review with: git show -w

[1] https://www.man7.org/linux/man-pages/man2/fchownat.2.html

Signed-off-by: Anoop C S <anoopcs@cryptolab.net>
(cherry picked from commit 829f38899226fcd1f603ba446b018f53c5b0921d)

Conflicts:
        src/client/Client.cc
- path_walk() refactor from https://github.com/ceph/ceph/pull/62095
  included the required core changes.

Merge pull request #65639 from zdover23/wip-doc-2025-09-23-squid-remove-cloud-restore-rst

squid: doc/radosgw: remove cloud-restore from squid

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #61451 from anoopcs9/wip-69556-squid

squid: mds: Fix invalid access of mdr->dn[0].back()

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge pull request #62391 from neesingh-rh/wip-70416-squid

squid: cephfs-shell: add option to remove xattr

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge pull request #64652 from rishabh-d-dave/wip-72200-squid

squid: mgr/vol: keep and show clone source info

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge pull request #65279 from joscollin/wip-67809-squid

squid: mds: add more debug logs and log events

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge pull request #65280 from joscollin/wip-69369-squid

squid: qa: use a larger timeout for kernel_untar_build workunit

Reviewed-by: Rishabh Dave <ridave@redhat.com>

test/rbd-mirror: eliminate a race in ResyncRequestedRemoteNotPrimary

Adjust the wait_for_notification call in TestMockImageReplayerSnapshotReplayer.ResyncRequestedRemoteNotPrimary
to expect 2 notifications instead of 1. This allows the test to correctly wait for both expected events,
i.e., for finish_sync() and handle_replay_complete(locker, -EREMOTEIO, "remote image demoted"), ensuring the
replayer transitions to STATE_COMPLETE and is_replaying() returns false as intended.

Fixes: https://tracker.ceph.com/issues/72325
Signed-off-by: VinayBhaskar-V <vvarada@redhat.com>
(cherry picked from commit b5a013f6170bb4445da8f5469243e4869b760a81)

rbd-mirror: prevent image deletion if remote image is not primary

A resync on a mirrored image may incorrectly result in the local
image being deleted even when the remote image is no longer primary.
This issue can occur under the following conditions:
* if resync is requested on the secondary before the remote image has
  been fully demoted
* if the demotion of the primary image is not mirrored
  due to the rbd-mirror daemon being offline.

This can be fixed by ensuring that image deletion during a resync is
only allowed when the remote image is confirmed to be primary.

This commit fixes the issue only for snapshot based mirroring mode

Fixes: https://tracker.ceph.com/issues/70948
Signed-off-by: VinayBhaskar-V <vvarada@redhat.com>
(cherry picked from commit e14afbc95a5fb8f5a33e7ea23a035992b966d671)

Merge pull request #63019 from batrick/wip-71094-squid

squid: mds: check for snapshots on parent snaprealms

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #62499 from batrick/wip-70663-squid

squid: client: ll_walk will process absolute paths as relative

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #65629 from phlogistonjohn/jjm-s-65514

squid: build-with-container: add argument groups to organize options

doc/radosgw: remove cloud-restore from squid

Remove doc/radosgw/cloud-restore.rst from the Squid branch.

cloud-restore does not appear in index.rst, so its removal from
index.rst is unnecessary.

Signed-off-by: Zac Dover <zac.dover@proton.me>

Merge pull request #64090 from vshankar/wip-cephfs-client-fixes-squid

squid: client: cephfs user-space client fixes

Reviewed-by: Jos Collin <jcollin@redhat.com>

pybind/mgr: pin cheroot version in requirements-required.txt

With python 3.10 (didn't seem to happen with python 3.12) the
pybind/mgr/cephadm/tests/test_node_proxy.py test times out.
This appears to be related to a new release of the cheroot
package, and a GitHub issue describing the same problem
we're seeing has been opened by another user:
https://github.com/cherrypy/cheroot/issues/769

It is worth noting that the workaround described in that
issue does also work for us. If you add

```
import cheroot
cheroot.server.HTTPServer._serve_unservicable = lambda: None
```

after the existing imports in test_node_proxy.py the
test hanging issue also disappears. Also worth noting the
particular pin of

cheroot~=10.0

was chosen as it matches the existing pin being used
in pybind/mgr/dashboard/constraints.txt

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 6231955b5d00ae6b3630ee94e85b2449092ef0fe)

Merge pull request #61274 from kotreshhr/wip-68940-squid

squid: ceph-fuse: Improve fuse mount usage message

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #62517 from salieri11/wip-70631-squid

squid: mds: add MDS asok command for dumping stray directories

Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #65133 from chrisphoffman/wip-72645-squid

squid: client: use path supplied in statfs

Reviewed-by: Jos Collin <jcollin@redhat.com>

build-with-container: add argument groups to organize options

Use the argparse add_argument_group feature to organize the mass of
arguments into more sensible categories. Hopefully, someone reading
over the `--help` output can now more easily see options that
are useful rather than being overwhelmed by a wall of text.
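
For readers unfamiliar with the feature, a minimal sketch of
`add_argument_group` follows. The group titles and the grouping itself are
made up for illustration and do not reflect the script's real layout; the
option names are the ones that appear elsewhere in this log.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="build-with-container.py")

    # Hypothetical group: options that select the build image.
    image = parser.add_argument_group(
        "Image options", "Control which build image is used"
    )
    image.add_argument("--distro", "-d")
    image.add_argument(
        "--image-variant", choices=["default", "packages"], default="default"
    )

    # Hypothetical group: options that select what to run.
    steps = parser.add_argument_group("Build step options")
    steps.add_argument("-e", dest="execute")
    return parser
```

In the `--help` output each group gets its own titled section, which is
what breaks up the wall of text.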

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 71a1be4dd0aea004da56c2f518ee70a281a3f7d3)