Adam King [Wed, 30 Jul 2025 19:51:11 +0000 (15:51 -0400)]
mgr/cephadm: don't use list_servers to get active mgr host for prometheus SD config
Having a lot of calls into list_servers causes issues with
the core ceph mgr on large clusters. Additionally, we were
using it purely to get the active mgr's host here, which
cephadm should be able to do without needing a mgr api call
Adam King [Wed, 30 Jul 2025 19:49:20 +0000 (15:49 -0400)]
mgr/cephadm: add interval control for stray daemon checks
Primarily to avoid running list_servers too often (we more or less
need it for stray daemon checks, since the whole point is to check
against a source that isn't cephadm). On larger clusters, calling
into list_servers frequently was found to cause issues with the
core ceph mgr.
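The interval idea can be sketched roughly as below. This is only an illustration of rate-limiting an expensive check; `IntervalGate` and its `interval` parameter are hypothetical names, not the actual cephadm option or code.

```python
import time


class IntervalGate:
    """Let an expensive check run at most once per `interval` seconds.

    Hypothetical helper for illustration; not the cephadm implementation.
    """

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so the gate can be tested
        self._last_run = None

    def due(self):
        """Return True (and record the run) if the check should run now."""
        now = self.clock()
        if self._last_run is None or now - self._last_run >= self.interval:
            self._last_run = now
            return True
        return False
```

A caller would wrap the stray daemon check (and therefore the list_servers call) in `if gate.due():` so that the serve loop can iterate often without hitting the mgr each time.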
mgr/dashboard: Enable rgw module automatically in the primary and secondary cluster if not enabled during multi-site automation
1. Enable rgw module automatically in the primary and secondary cluster if not enabled during multi-site automation
2. Improve progress bar descriptions and add sub-descriptions for steps
libcephfs_proxy: fix userperm pointer decoding for older protocols
The random data used to decode pointers coming from the old protocol was
taken from the client instead of using the global_random data, which is
the correct one.
libcephfs_proxy: remove unnecessary protocol references in daemon
With the new protocol structure definitions, it's not necessary to
explicitly access each field inside its version substructure (v0, for
example). Now all fields of the latest version are declared inside an
anonymous substructure that can be accessed without a prefix.
libcephfs_proxy: remove unnecessary protocol references in client
With the new protocol structure definitions, it's not necessary to
explicitly access each field inside its version substructure (v0, for
example). Now all fields of the latest version are declared inside an
anonymous substructure that can be accessed without a prefix.
libcephfs_proxy: fix protocol structures for backward compatibility
The structures used for transferring data between the proxy client and
the proxy daemon had been reworked in a recent change to be able to
expand the protocol. This caused an inconsistency in the size of the
data transferred when communicating with a peer using the older version.
The result was that the peer receiving the data with an unexpected size
was closing the connection, causing unexpected errors.
The discrepancy in size is the result of how compilers pad structures
combined with the change in the structure layout introduced when
extending the protocol. With these changes, the computation of the size
of each version of the structures was not done correctly.
This change makes the layout equal to the older version, so that
computing the size of the structures becomes easier and doesn't depend
on unexpected paddings.
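As a generic illustration of the padding effect described above (not the proxy's actual structures), Python's `struct` module can show how a natively aligned layout differs in size from the plain sum of its fields. The one-byte-plus-eight-byte layout here is an arbitrary example.

```python
import struct

# One uint8 followed by one uint64.
# "@" uses the platform's native alignment, as a C compiler would
# when laying out the leading fields of a struct.
native_size = struct.calcsize("@BQ")

# "=" uses standard sizes with no alignment padding at all.
unpadded_size = struct.calcsize("=BQ")

# Bytes of padding inserted between the two fields on this platform.
padding = native_size - unpadded_size
```

On common 64-bit platforms the native size is larger than the unpadded 9 bytes, so a peer that computes the expected message size by summing field sizes disagrees with one that uses the compiler's layout.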
This refactors redundant device setup calls in the LvmBlueStore class:
Calling the same function twice with different arguments for WAL
and DB devices was inefficient and unnecessary.
The new implementation simplifies the logic by accessing `self.args`
directly, which removes the need to pass arguments manually.
ftbfs:
```
/home/jenkins-build/build/workspace/ceph-pull-requests/src/neorados/cls/fifo/detail/fifo.h:630:14: error: no member named 'parse' in namespace 'ceph'; did you mean 'pause'?
630 | auto n = ceph::parse<decltype(m.num)>(num);
| ^~~~~~~~~~~
```
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 3df1133c17b174c27c250cf7ac018199cc40b15b)
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Adam C. Emerson [Mon, 30 Jun 2025 20:54:46 +0000 (16:54 -0400)]
rgw/datalog: Manage and shutdown tasks properly
This is slightly ugly but good enough for now. Make sure we can block
when shutting down background tasks.
Remove a few `driver` parameters that are unused. This lets us
simplify the IAM Policy and Lua tests and not construct stores we
never use. (Which is good since we aren't running them under a cluster.)
Adam C. Emerson [Fri, 11 Jul 2025 18:57:02 +0000 (14:57 -0400)]
neorados/fifo: Rewrite as proper I/O object
Split nominal handle object and reference-counted
implementation. While we're at it, add lazy-open functionality.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 3097297dd39432d172d69454419fa83a908075f6)
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Adam C. Emerson [Thu, 26 Jun 2025 17:58:57 +0000 (13:58 -0400)]
{neorados,osdc}: Support subsystem cancellation
Tag operations with a subsystem so we can cancel them all in one go.
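The tagging idea can be sketched in a few lines of asyncio. This is only an analogy in Python; `OpTracker`, `submit`, and `cancel_subsystem` are hypothetical names and do not reflect the neorados or Objecter interfaces.

```python
import asyncio
from collections import defaultdict


class OpTracker:
    """Group in-flight operations by subsystem tag (illustrative only)."""

    def __init__(self):
        self._ops = defaultdict(set)

    def submit(self, subsystem, coro):
        """Start `coro` as a task and remember it under `subsystem`.

        Must be called from within a running event loop.
        """
        task = asyncio.ensure_future(coro)
        ops = self._ops[subsystem]
        ops.add(task)
        # Forget the task once it finishes, however it finishes.
        task.add_done_callback(ops.discard)
        return task

    def cancel_subsystem(self, subsystem):
        """Request cancellation of every pending op with this tag."""
        tasks = list(self._ops.pop(subsystem, ()))
        for task in tasks:
            task.cancel()
        return len(tasks)
```

Operations tagged with other subsystems are left untouched, which is the point of tagging: shutdown of one component does not have to enumerate its operations individually.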
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 2526eb573b789b33b7d9ebf1169491f13e2318bb)
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Conflicts:
src/rgw/driver/rados/rgw_service.cc
src/rgw/rgw_sal.cc
- `#ifdef`s for standalone Rados
src/rgw/driver/rados/rgw_datalog.cc
- Periodic re-run of recovery removed in main and pending backport
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Adam C. Emerson [Fri, 30 May 2025 20:54:45 +0000 (16:54 -0400)]
neorados: Hold reference to implementation across operations
Asynchrony combined with cancellations keeps leading to occasional
lifetime issues, so follow the best-practices of Asio I/O objects by
having completions keep a reference live.
The original NeoRados backing implements Asio's two-phase shutdown
properly.
The RadosClient backing does not, because it shares an Objecter with
completions that do not belong to it. In practice I don't think this
will matter since librados and neorados get shut down around the same
time.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 57c9723928b4d2b2148ca0dd4d505acdc071f8eb)
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
John Mulligan [Mon, 8 Sep 2025 18:13:59 +0000 (14:13 -0400)]
tentacle: update formatting to match across tentacle branches
I managed to create a bit of a mess with formatting changes after
a fix was cherry picked to `tentacle-release`. This change makes
the formatting on `tentacle-release` match that of `tentacle`.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Laura Flores [Fri, 5 Sep 2025 21:46:20 +0000 (16:46 -0500)]
doc/rados/operations: add kernel client procedure to read balancer documentation
As of now, the kernel client does not support `pg-upmap-primary`. I have
added some troubleshooting steps to help users who are unable to
mount images and filesystems with the kernel client while using `pg-upmap-primary`.
Once the feature is supported by the kernel client, users will be able
to perform mounts along with `pg-upmap-primary`.
N Balachandran [Thu, 28 Aug 2025 06:22:23 +0000 (11:52 +0530)]
rgw/logging: fixes data loss during rollover
Multiple threads attempting to roll over the same log object can result
in the creation of numerous orphan tail objects, each with a single record.
This occurs when a NULL RGWObjVersionTracker is used during the creation of
a new logging object. These records are inaccessible, leading to data loss,
which is particularly critical in Journal mode.
Furthermore, valid log tail objects may be added to the Garbage Collection (GC)
list, exacerbating data loss.
Fixes: https://tracker.ceph.com/issues/72740
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
(cherry picked from commit eea6525c031ae93f4ae846b06d55831e658faa2c)
Alex Ainscow [Fri, 27 Jun 2025 15:00:56 +0000 (16:00 +0100)]
osd: Deduplicate zeros in EC slice iterator
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
(cherry picked from commit 06658fdac16dde95d20a8907511afb7fde7313da)
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
Calling udevadm via subprocess can cause processes to pile up
under heavy load on production clusters.
This commit switches to reading udev data directly from /run/udev/data,
which is mounted as tmpfs.
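A minimal sketch of parsing such a file, assuming the line-prefixed layout udev commonly writes (`E:` for `KEY=value` properties, `S:` for symlinks). That layout is internal to udev rather than a stable interface, and `parse_udev_data` is a hypothetical helper, not the ceph-volume code.

```python
def parse_udev_data(text):
    """Parse the contents of one udev database file.

    Returns (properties, symlinks). Only the "E:" and "S:" line types
    are interpreted; every other line type is ignored.
    """
    props = {}
    symlinks = []
    for line in text.splitlines():
        prefix, sep, value = line.partition(":")
        if not sep:
            continue  # not a "<type>:<value>" line
        if prefix == "E":
            key, eq, val = value.partition("=")
            if eq:
                props[key] = val
        elif prefix == "S":
            symlinks.append(value)
    return props, symlinks
```

Reading a small tmpfs file this way avoids forking a `udevadm` process per device, which is the cost the commit is trying to remove.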
mgr/dashboard: Allow the user to re-use existing realm/zg/zone and setup replication
1. Currently, we only allow the user to create a new realm/zg/zone and set up replication using the multi-site replication wizard. The ask is to also allow the user to select a pre-existing realm/zg/zone and set up replication via automatic export and import of the token.
2. Enable rgw module automatically in the selected cluster if it's not
enabled
Update the "Disconnected+Remounted FS" section in
doc/cephfs/troubleshooting.rst, as suggested by Venky Shankar in https://github.com/ceph/ceph/pull/65129/files#r2312903062
according to `dpkg-buildflags`, ubuntu 24 raised this value to
`-D_FORTIFY_SOURCE=3` which causes `error: "_FORTIFY_SOURCE" redefined`
compilation failures because Ceph itself adds `-D_FORTIFY_SOURCE=2`
`_FORTIFY_SOURCE` is a hardening option. both our rpm and debian builds
already specify that via environment variables, so Ceph's cmake should
leave it alone
Dan Mick [Tue, 26 Aug 2025 00:45:21 +0000 (17:45 -0700)]
Remove git clean -fdx
either
1) a source tarball is supplied, in which case the local dir is
irrelevant, or
2) make-debs calls make-dist, which doesn't care about a dirty cwd
so it just punishes the unaware by removing things that they may
have wanted to keep.
Dan Mick [Sat, 23 Aug 2025 00:43:24 +0000 (17:43 -0700)]
make-debs.sh: invoke tar with --no-same-owner
When running as a normal user, tar does not attempt to preserve
owners set on the tar content files. When running as root, it does.
Containerized builds are running as root. Stop make-debs.sh from
trying to set other owners for files, and leaving files in the
host system with mapped UIDs other than the user running the container
(which causes jenkins to be unable to clear the workspace).
Dan Mick [Thu, 21 Aug 2025 20:00:43 +0000 (13:00 -0700)]
make-debs.sh: make "skip debug packages" conditional
Now that we're using make-debs.sh as a builder inside containers,
the default should be to build all the packages, including debug.
(Also, fix a typo.)
Nizamudeen A [Mon, 18 Aug 2025 07:47:01 +0000 (13:17 +0530)]
mgr/dashboard: expose image summary API
Introduce a new API for getting a per-image summary
```
$ curl -kX GET "https://localhost:4200/api/block/mirroring/rbd/test/summary" \
    -H "Accept: application/vnd.ceph.api.v1.0+json" \
    -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" | jq .
{
"name": "test2",
"id": "10d618ea1a58",
"info": {
"global_id": "f25678be-64a2-481f-b96c-9bcc566dcbfe",
"state": 1,
"primary": true
},
"remote_statuses": [
{
"state": "Replaying",
"description": {
"bytes_per_second": 0.0,
"bytes_per_snapshot": 0.0,
"last_snapshot_bytes": 0,
"last_snapshot_sync_seconds": 0,
"local_snapshot_timestamp": 1755579780,
"remote_snapshot_timestamp": 1755579780,
"replay_state": "idle"
},
"last_update": "2025-08-19T05:03:17Z",
"up": true,
"mirror_uuid": "4d734616-5a38-4399-b743-86bcd8c1ab8f"
}
],
"state": 6,
"description": "local image is primary",
"last_update": "2025-08-19T05:03:10Z",
"up": true
}
```
Also update the existing API to add the image syncing status. The
/summary API's `image_ready` will also include `remote_status`, a
list of dicts showing the status on each remote cluster (one image
can be mirrored to more than one cluster).
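A client consuming the per-image response might pull out the per-cluster state roughly as follows. The field names are taken from the sample output above; `remote_states` is a hypothetical helper, not part of the dashboard.

```python
import json


def remote_states(summary):
    """Return (mirror_uuid, state, up) for each remote cluster entry.

    `summary` is the decoded JSON body of the per-image summary call.
    Missing keys yield None rather than raising.
    """
    return [
        (r.get("mirror_uuid"), r.get("state"), r.get("up"))
        for r in summary.get("remote_statuses", [])
    ]


# Trimmed copy of the sample response shown in this commit message.
sample = json.loads("""
{
  "name": "test2",
  "remote_statuses": [
    {"state": "Replaying", "up": true,
     "mirror_uuid": "4d734616-5a38-4399-b743-86bcd8c1ab8f"}
  ],
  "description": "local image is primary"
}
""")
states = remote_states(sample)
```

Because an image can be mirrored to several clusters, the helper returns a list rather than a single status.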