Jos Collin [Thu, 11 Sep 2025 04:06:12 +0000 (09:36 +0530)]
Merge PR #65259 into wip-jcollin-testing-20250911.040549-tentacle
* refs/pull/65259/head:
mds: fix test that directory has no snaps
qa: test for child dir with first beyond parent snaps
qa: remove extraneous directory from test
qa: correct test description
Jos Collin [Thu, 11 Sep 2025 04:06:03 +0000 (09:36 +0530)]
Merge PR #65262 into wip-jcollin-testing-20250911.040549-tentacle
* refs/pull/65262/head:
mgr/volumes: Fix json.loads for test on mon caps
mgr/volumes: Add test for mon caps if auth key has remaining mds/osd caps
mgr/volumes: Keep mon caps if auth key has remaining mds/osd caps
This refactors redundant device setup calls in the LvmBlueStore class:
calling the same function twice with different arguments for the WAL
and DB devices was inefficient and unnecessary.
The new implementation simplifies the logic by accessing `self.args`
directly, which removes the need to pass arguments manually.
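The shape of that refactor can be illustrated with a minimal sketch. It is not the ceph-volume code: the class name `LvmBlueStoreSketch`, the helper `_resolve`, the method `setup_metadata_devices` and the argument names `block_wal` / `block_db` are all assumptions made for illustration.

```python
from types import SimpleNamespace


class LvmBlueStoreSketch:
    """Hypothetical stand-in for LvmBlueStore; not the real ceph-volume class."""

    def __init__(self, args):
        self.args = args
        self.wal_device_path = ''
        self.block_db_device_path = ''

    def _resolve(self, device_type):
        # Hypothetical helper: the real code would prepare an LV or partition.
        # Here it only looks the device up on self.args.
        return getattr(self.args, f'block_{device_type}', None) or ''

    def setup_metadata_devices(self):
        # A single call handles both device types and reads what it needs
        # from self.args, so the caller passes no per-device arguments.
        self.wal_device_path = self._resolve('wal')
        self.block_db_device_path = self._resolve('db')
```

A caller would build the object once and invoke `setup_metadata_devices()` a single time, instead of calling a setup function once for WAL and once for DB.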
John Mulligan [Mon, 8 Sep 2025 18:13:59 +0000 (14:13 -0400)]
tentacle: update formatting to match across tentacle branches
I managed to create a bit of a mess with formatting changes after
a fix was cherry picked to `tentacle-release`. This change makes
the formatting on `tentacle-release` match that of `tentacle`.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Laura Flores [Fri, 5 Sep 2025 21:46:20 +0000 (16:46 -0500)]
doc/rados/operations: add kernel client procedure to read balancer documentation
As of now, the kernel client does not support `pg-upmap-primary`. I have
added some troubleshooting steps to help users who are unable to
mount images and filesystems with the kernel client while using `pg-upmap-primary`.
Once the feature is supported by the kernel client, users will be able
to perform mounts along with `pg-upmap-primary`.
Problem:
The readdir wouldn't list all the entries in the directory
when the osd is full and rstats is enabled.
Cause:
The issue happens only in a multi-mds cephfs cluster. If rstats
is enabled, the readdir requests the 'Fa' cap on every dentry,
basically to fetch the size of the directories. Note that 'Fa' is
CEPH_CAP_GWREXTEND, which maps to CEPH_CAP_FILE_WREXTEND and is
used by CEPH_STAT_RSTAT.
The request for the cap is a getattr call, which normally need not go
to the auth mds. If rstats is enabled, the getattr goes out with
the mask CEPH_STAT_RSTAT, which mandates the auth mds in
'handle_client_getattr', so the request gets forwarded to the
auth mds if the receiving mds is not the auth. But if the osd is full,
the inode is fetched in 'dispatch_client_request', before
the handler function of the respective op is called, to check
FULL cap access for certain metadata write operations. If the inode
doesn't exist, ESTALE is returned. This is wrong for operations
like getattr, where the inode might not be in memory on the non-auth
mds: returning ESTALE is confusing and the client wouldn't retry. This
was introduced by commit 6db81d8479b539d, which fixes subvolume
deletion when the osd is full.
Fix:
Fetch the inode required for the FULL cap access check only for the
relevant operations in the osd full scenario. This makes sense because
those operations would mostly be preceded by a lookup, which loads
the inode into memory, or they would handle ESTALE gracefully.
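The control flow of the fix can be modelled with a toy sketch. This is not the MDS code: the function `dispatch`, the set `FULL_CHECK_OPS` and the op names in it are assumptions, since the message only says "certain metadata write operations".

```python
import errno

# Hypothetical op names; which ops actually need the FULL cap check is
# not stated in the commit message.
FULL_CHECK_OPS = {"unlink", "rmdir"}


def dispatch(op, ino, inode_cache, osd_full, handlers):
    """Toy model of the dispatch step; returns a negative errno or the handler result."""
    if osd_full and op in FULL_CHECK_OPS:
        # Only these ops need the inode up front for the FULL cap check.
        if ino not in inode_cache:
            return -errno.ESTALE
        # (The FULL cap access check itself is omitted in this sketch.)
    # Every other op, such as getattr, reaches its handler, which can then
    # forward the request to the auth mds when needed.
    return handlers[op](ino)
```

In this model a getattr for an inode that is missing from the cache reaches its handler even when the osd is full, while an unlink in the same situation still gets ESTALE.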
Alex Ainscow [Fri, 27 Jun 2025 15:00:56 +0000 (16:00 +0100)]
osd: Deduplicate zeros in EC slice iterator
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 06658fdac16dde95d20a8907511afb7fde7313da)
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
Calling udevadm via subprocess can cause processes to pile up
under heavy load on production clusters.
This commit switches to reading udev data directly from /run/udev/data,
which is mounted as tmpfs.
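Reading those records without spawning a process could look roughly like the sketch below. It is not the ceph-volume implementation. The function names are hypothetical, and the record layout it relies on is an assumption about the udev database: one prefixed entry per line, with `E:` for properties and `S:` for symlinks, and block device files named `b<major>:<minor>`.

```python
from pathlib import Path


def parse_udev_record(text):
    """Parse the text of one udev database record into a dict."""
    record = {"properties": {}, "symlinks": []}
    for line in text.splitlines():
        prefix, sep, value = line.partition(":")
        if not sep:
            continue  # not a "<prefix>:<value>" line
        if prefix == "E":
            # Property line of the form KEY=VALUE
            key, _, val = value.partition("=")
            record["properties"][key] = val
        elif prefix == "S":
            # Symlink path relative to /dev
            record["symlinks"].append(value)
        # Other prefixes are ignored in this sketch.
    return record


def read_block_device(major, minor, root="/run/udev/data"):
    # Assumption: block device records are stored as "b<major>:<minor>".
    return parse_udev_record(Path(root, f"b{major}:{minor}").read_text())
```

Because the data is a plain file read from tmpfs, no child process is created per lookup, which is the point the commit message makes about heavy load.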
mgr/dashboard: Allow the user to re-use existing realm/zg/zone and setup replication
1. Currently, the multi-site replication wizard only lets the user
create a new realm/zg/zone and set up replication. The ask is to also
let the user select a pre-existing realm/zg/zone and set up replication
via automatic export and import of the token.
2. Enable the rgw module automatically in the selected cluster if it's
not enabled.
Update the "Disconnected+Remounted FS" section in
doc/cephfs/troubleshooting.rst, as suggested by Venky Shankar in https://github.com/ceph/ceph/pull/65129/files#r2312903062
according to `dpkg-buildflags`, ubuntu 24 raised this value to
`-D_FORTIFY_SOURCE=3` which causes `error: "_FORTIFY_SOURCE" redefined`
compilation failures because Ceph itself adds `-D_FORTIFY_SOURCE=2`
`_FORTIFY_SOURCE` is a hardening option. both our rpm and debian builds
already specify that via environment variables, so Ceph's cmake should
leave it alone
Dan Mick [Tue, 26 Aug 2025 00:45:21 +0000 (17:45 -0700)]
Remove git clean -fdx
either
1) a source tarball is supplied, in which case the local dir is
irrelevant, or
2) make-debs calls make-dist, which doesn't care about a dirty cwd
so it just punishes the unaware by removing things that they may
have wanted to keep.
Dan Mick [Sat, 23 Aug 2025 00:43:24 +0000 (17:43 -0700)]
make-debs.sh: invoke tar with --no-same-owner
When running as a normal user, tar does not attempt to preserve
owners set on the tar content files. When running as root, it does.
Containerized builds are running as root. Stop make-debs.sh from
trying to set other owners for files, and leaving files in the
host system with mapped UIDs other than the user running the container
(which causes jenkins to be unable to clear the workspace).
Dan Mick [Thu, 21 Aug 2025 20:00:43 +0000 (13:00 -0700)]
make-debs.sh: make "skip debug packages" conditional
Now that we're using make-debs.sh as a builder inside containers,
the default should be to build all the packages, including debug.
(Also, fix a typo.)
Nizamudeen A [Mon, 18 Aug 2025 07:47:01 +0000 (13:17 +0530)]
mgr/dashboard: expose image summary API
Introduce a new API for getting a per-image summary
```
╰─$ curl -kX GET "https://localhost:4200/api/block/mirroring/rbd/test/summary" \
-H "Accept: application/vnd.ceph.api.v1.0+json" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" | jq .
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 637 100 637 0 0 14597 0 --:--:-- --:--:-- --:--:-- 14813
{
"name": "test2",
"id": "10d618ea1a58",
"info": {
"global_id": "f25678be-64a2-481f-b96c-9bcc566dcbfe",
"state": 1,
"primary": true
},
"remote_statuses": [
{
"state": "Replaying",
"description": {
"bytes_per_second": 0.0,
"bytes_per_snapshot": 0.0,
"last_snapshot_bytes": 0,
"last_snapshot_sync_seconds": 0,
"local_snapshot_timestamp": 1755579780,
"remote_snapshot_timestamp": 1755579780,
"replay_state": "idle"
},
"last_update": "2025-08-19T05:03:17Z",
"up": true,
"mirror_uuid": "4d734616-5a38-4399-b743-86bcd8c1ab8f"
}
],
"state": 6,
"description": "local image is primary",
"last_update": "2025-08-19T05:03:10Z",
"up": true
}
```
Also update the existing API to add the image syncing status. The
/summary API's `image_ready` will also have the `remote_status`, which is
a list of dicts showing the status of all the remote clusters (one image
can be mirrored to more than one cluster).
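A client consuming the response shown above might reduce it to one row per remote cluster, as in this minimal sketch. The helper names are hypothetical. The field names (`remote_statuses`, `mirror_uuid`, `state`, `up`) are taken from the sample output, and the rest of the schema is assumed to match it.

```python
import json


def summarize_remote_statuses(summary):
    """Return a (mirror_uuid, state, up) tuple for each remote status entry."""
    return [
        (r.get("mirror_uuid"), r.get("state"), r.get("up"))
        for r in summary.get("remote_statuses", [])
    ]


def all_remotes_up(summary):
    """True only if at least one remote is reported and every remote is up."""
    rows = summarize_remote_statuses(summary)
    return bool(rows) and all(up for _, _, up in rows)


# Trimmed copy of the sample response from the commit message.
sample = json.loads("""
{
  "name": "test2",
  "remote_statuses": [
    {"state": "Replaying", "up": true,
     "mirror_uuid": "4d734616-5a38-4399-b743-86bcd8c1ab8f"}
  ],
  "up": true
}
""")
rows = summarize_remote_statuses(sample)
```

Since one image can be mirrored to several clusters, the helpers iterate over the whole list rather than reading a single status.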
In file included from /home/pdonnell/ceph/src/mds/FSMap.h:31,
from /home/pdonnell/ceph/src/mon/PaxosFSMap.h:20,
from /home/pdonnell/ceph/src/mon/MDSMonitor.h:26,
from /home/pdonnell/ceph/src/mon/FSCommands.cc:17:
/home/pdonnell/ceph/src/mds/MDSMap.h: In member function ‘int FileSystemCommandHandler::set_val(Monitor*, FSMap&, MonOpRequestRef, const cmdmap_t&, std::ostream&, FileSystemCommandHandler::fs_or_fscid, std::string, std::string)’:
/home/pdonnell/ceph/src/mds/MDSMap.h:223:40: warning: ‘fsp’ may be used uninitialized in this function [-Wmaybe-uninitialized]
223 | bool test_flag(int f) const { return flags & f; }
| ^~~~~
/home/pdonnell/ceph/src/mon/FSCommands.cc:417:21: note: ‘fsp’ was declared here
417 | const Filesystem* fsp;
| ^~~
This is required to test features involving
fixes in both the client and the mds. This is to make
sure the older clients are not broken by the
fix. Version 19.2.2 is used for the client.
The test suite sets up the cluster with squid
19.2.2 and upgrades only the ceph cluster node,
leaving the client node on 19.2.2.