David Galloway [Mon, 22 Dec 2025 21:31:17 +0000 (16:31 -0500)]
ceph-dev-pipeline: Still send with_crimson=true
The replacement of crimson builds with debug builds is still half-baked. ceph.spec and install-deps still expect with_crimson to be set if the crimson dependency packages should be installed.
See https://github.com/ceph/ceph/blame/main/ceph.spec.in#L367-L384 for the dependencies that will never get installed.
This is manifesting in:
```
+ rpmbuild --rebuild '-D_topdir /ceph/rpmbuild' --with=sccache --without=dwz --with=tcmalloc /ceph/ceph-20.3.0-4645.gcfa448da.el9.src.rpm
error: Failed build dependencies:
cryptopp-devel is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
gcc-toolset-13-gcc-plugin-annobin is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
gcc-toolset-13-libasan-devel is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
gcc-toolset-13-libubsan-devel is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
gnutls-devel is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
hwloc-devel is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
libasan is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
libpciaccess-devel is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
libubsan is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
lksctp-tools-devel is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
ragel is needed by ceph-2:20.3.0-4645.gcfa448da.el9.x86_64
Installing /ceph/ceph-20.3.0-4645.gcfa448da.el9.src.rpm
2025-12-22 17:48:10,148: INFO: step done: rpm failed in 00:00:27
```
Nothing will ever tell build-with-container.py or install-deps.sh to pull those dependencies in, because we're no longer setting WITH_CRIMSON since this removal: https://github.com/ceph/ceph-build/commit/0f0e4fd7dea0c06d855b93581e5b13cc0bf4c350#diff-d34216471695ce2f36f9cf1550524392c85b94d0566b3bc6d591383411b91f25R218-L381
Signed-off-by: David Galloway <david.galloway@ibm.com>
David Galloway [Sat, 20 Dec 2025 19:04:43 +0000 (14:04 -0500)]
ceph-windows: Update qcow2 image location. Bump VM spec.
The previous image lived on the LRC and was served via reverse proxy. The RDU lab was shut down earlier and I neglected to back the qcow2 image up. Who knows when the old LRC will be brought back online, so the image has been rebuilt.
Also bumping the VM resources.
Signed-off-by: David Galloway <david.galloway@ibm.com>
David Galloway [Thu, 18 Dec 2025 19:39:27 +0000 (14:39 -0500)]
ceph-windows-PRs: Fix virt-install command
Behavior changed in newer virt-install.
`--import` tells virt-install to skip OS installation and boot the prebuilt qcow2 image. Older virt-install versions tolerated `--boot hd` without `--import`, but now `--import` is required in this scenario and `--boot hd` is redundant.
Signed-off-by: David Galloway <david.galloway@ibm.com>
See the following comment:
```
Tentacle is the last release that needs dedicated Crimson builds,
Later releases are able to use Crimson with the default build.
As the "Crimson flavor" is no longer available, we need a *temporary* way
to be able to build Crimson for tentacle.
Note: This could be removed once we have Crimson Umbrella release builds.
```
Matan Breizman [Sun, 23 Nov 2025 12:18:11 +0000 (14:18 +0200)]
Cleanup "crimson" flavor
With https://github.com/ceph/ceph/pull/66229 merged,
Crimson is now included (though not used) by default in our RPM builds.
This means the existing default flavor can also be used for Crimson testing
by selecting Crimson as the default OSD package.
Notes:
* The previous workaround related to DWITH_STATIC_LIBSTDCXX is no longer
relevant for Crimson (it was tied to older compiler issues).
* The crimson-only branch name selection is also cleaned up,
as centos9-only can now be used instead.
* This change breaks Crimson Tentacle CI builds:
The packaging update that includes Crimson in RPM builds was not backported to Tentacle.
Tentacle builds would still require a dedicated flavor that enables WITH_CRIMSON.
However, since Crimson changes have not been backported to Tentacle (since the first RCA),
there is no strong reason to keep building and testing the same Crimson HEAD.
So we can use this opportunity to stop nightly Crimson/Tentacle builds and tests.
See the last Crimson tentacle run (which is not expected to change):
https://pulpito.ceph.com/teuthology-2025-11-22_22:56:11-crimson-rados-tentacle-distro-crimson-debug-smithi/
Matan Breizman [Sun, 23 Nov 2025 11:41:42 +0000 (13:41 +0200)]
Introduce "debug" flavor
Currently, the only flavor used for testing is "default".
While this ensures that the tested flavor matches the released flavor, it can also be a limitation.
Introducing a debug flavor would allow us to test branches that require additional or more thorough validation.
The main difference is that built-in assertions would be compiled in.
The reasons for this change are:
a) The Crimson suite uses a crimson-debug flavor for project PR gating.
The next commit will clean up the Crimson flavor entirely,
and the new debug flavor introduced here could be used as its replacement.
b) Good practice: having an additional build with debug enabled is useful
when retesting or performing extra checks.
Initially, the new debug flavor will only apply to centos9 builds.
If it proves valuable, we can expand support to other distros.
Note: The current way to schedule debug builds is by using a *-debug branch name.
Having a dedicated flavor seems more straightforward.
Dan Mick [Sat, 15 Nov 2025 00:00:40 +0000 (16:00 -0800)]
review comments:
- explain --stragglers a bit in help text
- add re match group names and use them
- use variables for URLs in messages
- dryrun -> dry-run
- remove dead code
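As an illustration of the "match group names" item above, here is a minimal sketch of named groups in Python's `re` module. The tag layout, the pattern `TAG_RE`, and the helper `parse_tag` are all hypothetical and are not taken from prune-quay.py; they only show why named groups read better than positional indexes.

```python
import re

# Hypothetical tag layout "<branch>-<sha1>-<distro>"; the real tag
# format used by the pruner is not shown in this log.
TAG_RE = re.compile(
    r"^(?P<branch>.+)-(?P<sha1>[0-9a-f]{7,40})-(?P<distro>[a-z]+\d+)$"
)


def parse_tag(tag):
    """Return a dict of the named groups, or None if the tag does not match."""
    match = TAG_RE.match(tag)
    if match is None:
        return None
    # match["sha1"] is self-describing, unlike match.group(2).
    return match.groupdict()


if __name__ == "__main__":
    print(parse_tag("main-cfa448d-centos9"))
    print(parse_tag("latest"))
```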
Dan Mick [Tue, 11 Nov 2025 00:17:18 +0000 (16:17 -0800)]
quay-pruner: completely overhaul prune-quay.py
Pruning had stopped working (up to >26000 image tags), and the
reasons were many: one, pruning's always been less deterministic
than I'd hope; two, when I switched us to ceph.git/container for
building images, I mistakenly changed the format of the 'fulltag'
(it no longer has a short sha1 in it), and that was sort of driving
the pruning process; three, I suspect some of the newer flavors
etc. were slipping through the cracks.
So here's an attempt to fix all that by changing the algorithm
fundamentally; now, tags of a certain manifest digest are considered
at the same time, and their sha1 checked in shaman as usual (but
only their sha1); if it's found, the tags are all left, and if not,
they're all removed. This should be cleaner, faster, and more
reliable.
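The keep-all-or-remove-all decision described above can be sketched roughly as follows. This is not the code from prune-quay.py: the tag record shape (`name`, `manifest_digest`, `sha1`), the function name `plan_prune`, and the `sha1_in_shaman` callable standing in for the shaman lookup are all assumptions made for illustration.

```python
from collections import defaultdict


def plan_prune(tags, sha1_in_shaman):
    """Group tags by manifest digest and decide each group's fate together.

    tags: iterable of dicts with 'name', 'manifest_digest' and 'sha1'
          keys (hypothetical shape; how the sha1 is obtained for a tag
          is not covered here).
    sha1_in_shaman: callable taking a sha1 and returning True if shaman
          still knows about that build (stand-in for the real query).
    Returns (keep, remove) as sorted lists of tag names.
    """
    by_digest = defaultdict(list)
    for tag in tags:
        by_digest[tag["manifest_digest"]].append(tag)

    keep, remove = [], []
    for group in by_digest.values():
        sha1s = {t["sha1"] for t in group if t.get("sha1")}
        # If any sha1 for this digest is found, every tag in the group stays;
        # otherwise every tag in the group is removed.
        found = any(sha1_in_shaman(s) for s in sha1s)
        target = keep if found else remove
        target.extend(t["name"] for t in group)
    return sorted(keep), sorted(remove)
```

Deciding per digest rather than per tag means that tags pointing at the same image can never end up half-pruned.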
Also refactored a lot of the worker routines to util.py so I could
add some helper/debug/info scripts:
get-tagdates.py generates JSON showing tag-to-age for examining the
state of things
delete-tags.py takes tags on the CLI to delete, or can be invoked
with '--stragglers <age>' to remove anything older than age, as long
as it's not in shaman and doesn't seem like it might be a
'distinguished' build (one with recent release names in its name).
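A rough sketch of the '--stragglers <age>' selection rule, under stated assumptions: `find_stragglers`, the `tag_dates` mapping (tag name to timezone-aware datetime), the `in_shaman` callable, and the `release_names` list are hypothetical names chosen for illustration, not the actual delete-tags.py interface.

```python
from datetime import datetime, timedelta, timezone


def find_stragglers(tag_dates, max_age, in_shaman, release_names, now=None):
    """Return tag names that are older than max_age and safe to delete.

    tag_dates: dict mapping tag name -> timezone-aware datetime (assumed shape)
    max_age: timedelta; tags newer than this are always kept
    in_shaman: callable(tag name) -> bool, stand-in for the shaman check
    release_names: substrings marking a 'distinguished' build (assumed list)
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    stragglers = []
    for name, modified in tag_dates.items():
        if modified >= cutoff:
            continue  # not old enough
        if in_shaman(name):
            continue  # still tracked by shaman
        if any(release in name for release in release_names):
            continue  # looks like a distinguished build
        stragglers.append(name)
    return sorted(stragglers)
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to pair with a dry-run mode or a unit test.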
prune-quay.py also now reports summary statistics of its operation.