John Mulligan [Sat, 15 Mar 2025 16:44:00 +0000 (12:44 -0400)]
install-deps: extract SUDO variable logic into a reusable function
While the function is pretty simple and could be copy-pasted I
prefer to extract things into functions to indicate that the
logic is used/repeated elsewhere to ward off making changes to
one copy vs the other.
John Mulligan [Wed, 8 Oct 2025 20:41:36 +0000 (16:41 -0400)]
script/build-with-container: improve error handling for invalid distros
Instead of throwing a long obnoxious traceback at the user if the value
supplied to -d/--distro is invalid do something nicer. For example:
```
$ ./src/script/build-with-container.py -d trixy -e build
usage: build-with-container.py [-h] [--help-build-steps]
build-with-container.py: error: argument --distro/-d: unknown distro: 'trixy' not in centos10, centos10stream, centos8, centos9, centos9stream, rocky9, rockylinux9, rocky10, rockylinux10, fedora41, fc41, fedora42, fc42, fedora43, fc43, ubuntu20.04, ubuntu-focal, focal, ubuntu22.04, ubuntu-jammy, jammy, ubuntu24.04, ubuntu-noble, noble, debian12, debian-bookworm, bookworm, debian13, debian-trixie, trixie
John Mulligan [Wed, 8 Oct 2025 14:23:25 +0000 (10:23 -0400)]
script/build-with-container: be consistent with naming in distro kinds
Update the DistroKind enum and related items so that the naming is
applied consistently. That is: the canonical (no pun indented) form
of the name is "<name><version>" and codenames, such as "jammy" or
"bookworm" are aliases. This matches the previously existing code.
John Mulligan [Thu, 28 Aug 2025 23:39:06 +0000 (19:39 -0400)]
build-with-container: ensure npm dir is set up before configure
When the npm cache path option is passed the npm cache dir is passed
to all container `run` commands, ensure the dir has been created
before the first container command (configure) is used.
John Mulligan [Fri, 12 Sep 2025 17:52:25 +0000 (13:52 -0400)]
build-with-container: add argument groups to organize options
Use the argparse add_argument_group feature to organize the mass of
arguments into more sensible categories. Hopefully, someone reading
over the `--help` output can now more easily see options that
are useful rather than being overwhelmed by a wall of text.
Dan Mick [Sat, 23 Aug 2025 00:43:24 +0000 (17:43 -0700)]
make-debs.sh: invoke tar with --no-same-owner
When running as a normal user, tar does not attempt to preserve
owners set on the tar content files. When running as root, it does.
Containerized builds are running as root. Stop make-debs.sh from
trying to set other owners for files, and leaving files in the
host system with mapped UIDs other than the user running the container
(which causes jenkins to be unable to clear the workspace).
Improve source rpm detection by adding a new detection method that
executes and rpm command in a container to get exactly the version of
the source rpm that the ceph.spec file would have generated. For
backwards compatibility and that I don't entirely trust myself to have
tested this the old methods are still available.
The old `--rpm-no-match-sha` is now an alias for `--srpm-match=any` to
cause it to build any (unique) ceph srpm it finds.
`--srpm-match=versionglob` retains the previous default behavior of
using a glob matching on the git id or ceph version value. The new
default of `--srpm-match=auto` implements the rpm command based behavior
described above.
All of this is wrapped in a new step `find-rpm` but that's mostly an
implementation detail and for testing.
John Mulligan [Fri, 20 Jun 2025 23:34:45 +0000 (19:34 -0400)]
Dockerfile.build: make WITH_CRIMSON a build arg
We've chosen to enable crimson by default to match the CI, but that
is not always something a developer may want, so make WITH_CRIMSON
a build argument that can be toggled off if necessary.
John Mulligan [Fri, 20 Jun 2025 23:46:16 +0000 (19:46 -0400)]
script/build-with-container: support --build-arg arguments
Allow passing --build-arg arguments to build-with-container.py
which are passed directly to the container build command.
This allows a developer to toggle certain features of the build
container, however this should not be used in CI.
Remove the unused build arg for JENKINS_HOME. This was
once used to try and create build images like the CI jobs. However,
the env var is now unconditionally set in the build script and must
be passed (or not) explicitly by the user.
John Mulligan [Fri, 21 Mar 2025 18:28:25 +0000 (14:28 -0400)]
script/build-with-container: cache git branch result
Cache the branch we got from the git command as it is highly unlikely
to change during the script execution and if it does -- we mostly don't
care anyway.
John Mulligan [Fri, 14 Mar 2025 18:39:09 +0000 (14:39 -0400)]
script/build-with-container: fix building on docker
Fix building images on docker by using the `--pull` option instead of
`--pull=always`. The latter apparently only works on podman. The former
should do the same thing on both container engines.
John Mulligan [Fri, 14 Mar 2025 18:37:02 +0000 (14:37 -0400)]
script/buildcontainer-setup: set JENKINS_HOME while building image
Set the JENKINS_HOME environment variable while building the builder
image. This is needed because parts of scripts like run-make.sh and
install-deps.sh key off of this variable. Since we want to be able to
use the build container to build, run "make check" and the like, we want
that environment to be as similar to the jenkins CI environment as we
can make it.
John Mulligan [Fri, 20 Jun 2025 23:03:22 +0000 (19:03 -0400)]
script/build-with-container: add rocky10 to built-in distros
Add "rocky10" (also aliased to "rockylinux10") to the known distro bases
so that the team can begin to experiment with the Rocky Linux 10 distro
for containerized builds.
Test procedure:
docker run --rm -ti -v /home/baum/ceph-ci:/home/ceph quay.io/centos/centos:stream9 bash
[root@a3c4b1545e93 /]# cd /home/ceph/
[root@a3c4b1545e93 ceph]# ./install-deps.sh 2>&1 tee install-deps.log
John Mulligan [Thu, 20 Feb 2025 00:17:30 +0000 (19:17 -0500)]
script/build-with-container: add support for overlay dir
The source dir (aka homedir, default /ceph) is mounted in the container
read-write. This is needed as the various ceph build scripts expect to
write things into the tree - often this is in the build directory - but
not always. This can lead to small messes and/or situations that are
confusing to debug, especially if one is jumping between distros often.
Add an option to use an overlay volume for the homedir - by default we
enable a persistent overlay with a supplied "upper dir" where files that
were written will appear. One can also enable a temporary overlay that
forgets the writes when the container exits - maybe useful when doing
experiments in 'interactive' mode.
To use this option run the command with the `--overlay=<dir>` option.
For example: `./src/script/build-with-container.py -b build.inner
--overlay-dir build.ovr`. This will create a directory
`build.ovr/content` automatically and all new files will appear there.
For example the build directory will appear at
`build.ovr/content/build.inner`.
To use the temporary overlay use a `-` as the directory name. For
example: `./src/script/build-with-container.py -b build.inner
--overlay-dir -`
John Mulligan [Thu, 20 Feb 2025 14:50:49 +0000 (09:50 -0500)]
script/build-with-container: skip dnf cache dir volume mounts on docker
When using docker the --volume option is not available during build
(docker [buildx] build), unlike podman. Since passing these volumes must
be conditional on them being set up I see no way to handle this short of
just disabling the option on docker. Log the fact that it's being
skipped - the only other issue is that we pointlessly set up some dirs
and the build may be a bit slower.
John Mulligan [Wed, 19 Feb 2025 18:20:36 +0000 (13:20 -0500)]
script/build-with-container: remove default --volume arg from ctr build
On the original github pr #59841 user fayak kindly informed us that the
--volume option was not supported by docker build. Since this section
was a leftover from a previous way of constructing the builder image and
was no longer needed we simply removed it.
John Mulligan [Wed, 19 Feb 2025 18:20:01 +0000 (13:20 -0500)]
script/build-with-container.py: build builder image with --pull=always
Construct the builder image using the --pull=always flag to initiate a
pull of the base image (centos, ubuntu, etc) in order to avoid using a
stale base image. Since the script automatically (by default) avoids
building if a matching tag is in local container storage it is handy to
use a fresh base when it *is* time to build something. Otherwise, you
end up in a situation like I sometimes do - using a months old base
unintentionally.
John Mulligan [Fri, 14 Feb 2025 19:50:42 +0000 (14:50 -0500)]
script/build-with-container: add a common packages target
Add a `packages` target to build-with-container.py that requests a build
of packages, whatever package type is native to the distro selected.
For example `./src/script/build-with-container.py -d ubuntu22.04 -e
packages` will automatically select a deb packages build where
`./src/script/build-with-container.py -d centos9 -e packages` will
trigger rpm packages to be built. The underlying package-type specific
targets remain unchanged.
John Mulligan [Fri, 14 Feb 2025 16:44:35 +0000 (11:44 -0500)]
script/build-with-container: support custom tag suffixes
Previously, one could use the `--tag` option to completely override the
container tag generated by the script. However, there are cases where
one may want to add information to the tag rather than override it.
Allow the tag value to start with a plus (+) character that indicates
that the remainder of the string is to be suffixed to the generated tag.
Add a command line option --base-branch that allows the user to supply a
custom base branch name. git doesn't make determining this easy so we
always assume a base branch of 'main' by default - but this option lets
one change that.
Add a new --current-branch argument that lets the user supply a name for
the current branch. This allows the automatic tag generation to avoid
calling git - something useful if the tree is not using a git checkout
(like a tarball). It also allows you to pull a temporary branch in git
but ignore it and act like the temporary branch is the base branch.
John Mulligan [Tue, 11 Feb 2025 23:36:13 +0000 (18:36 -0500)]
script/build-with-container: add more distro aliases
Add a system to define distro name aliases and use that to define some
additional aliases, primarily to match ubuntu codenames rather than
version numbers. Requested by Zack.
John Mulligan [Tue, 20 Aug 2024 19:01:05 +0000 (15:01 -0400)]
src/script: add a script to help build ceph using containers
The build-with-container script tries to encapsulate nearly all major
build tasks using docker/podman containers. If there's no build image
locally it will create one for your. It provides targets for building
(make), testing (make check), building rpm packages or deb packages and
is designed to be fairly easily extended.
View the comment at the top of the source file for usage details.
John Mulligan [Tue, 1 Nov 2022 18:51:57 +0000 (14:51 -0400)]
script: add discover_compiler function to lib-build.sh
The discover_compiler function is an abstraction over the current
compiler detection code in run-make.sh. It is intended to be flexible
enough to work on {centos,rhel} systems, but currently is just an
updated version of the logic from run-make.sh. The intent is that this
function will grow and become useful for other scripts used for
building (possibly do_cmake.sh for example).
John Mulligan [Tue, 1 Nov 2022 18:51:57 +0000 (14:51 -0400)]
script: add discover_compiler function to lib-build.sh
The discover_compiler function is an abstraction over the current
compiler detection code in run-make.sh. It is intended to be flexible
enough to work on {centos,rhel} systems, but currently is just an
updated version of the logic from run-make.sh. The intent is that this
function will grow and become useful for other scripts used for
building (possibly do_cmake.sh for example).
John Mulligan [Mon, 31 Oct 2022 19:06:25 +0000 (15:06 -0400)]
script: add a common ci_debug function to print ci debug lines
Reduces some of the boilerplate around emitting the "CI_DEBUG:"
prefixed debug lines for the CI. Additionally, enables using
the FORCE_CI_DEBUG var to enable ci debug lines even when not
in a jenkins environment.
John Mulligan [Thu, 6 Oct 2022 17:43:41 +0000 (13:43 -0400)]
script: have run-make.sh honor BUILD_DIR like do_cmake.sh does
The BUILD_DIR environment variable is honored by do_cmake.sh in order to
create multiple build output directories. Before this change run-make.sh
did not support BUILD_DIR the same way as do_cmake.sh. This change makes
it possible to use BUILD_DIR with run-make.sh.
Dan Mick [Wed, 26 Jun 2024 02:07:41 +0000 (19:07 -0700)]
Add Containerfile and build.sh to build it.
The intent is to replace ceph-container.git, at first for ci containers
only, and eventually production containers as well.
There is code present for production containers, including
a separate "make-manifest-list.py" to scan for and glue the two
arch-specific containers into a 'manifest-list' 'fat' container,
but that code is not yet fully tested.
This code will not be used until a corresponding change to the
Jenkins jobs in ceph-build.git is pushed.
Note that this tooling does not authenticate to the container repo;
it is assumed that will be done elsewhere. Authentication is
verified by pushing a minimal image to the requested repo.
qa/suites/upgrade/reef-p2p/reef-p2p-parallel: increment upgrade to 18.2.2
Instead of installing 18.2.0, which still contains the osdmap crc bug tracked
in https://tracker.ceph.com/issues/63389, we should install v18.2.2 since this contains
the fix. Then, we upgrade to reef_latest. In this scenario, we do not expect to see the
crc bug. If we test any upgrade path before that, we will hit the warning and the test will fail.
Nizamudeen A [Fri, 3 May 2024 08:56:19 +0000 (14:26 +0530)]
mgr/k8sevents: update V1Events to CoreV1Events
centos9 only provides kubernetes 26.1.0 as base dep and hence the
k8sevents code needs to be updated accordingly. the api changes happened
in kuberenetes while 19.0.0 was released
Zack Cerza [Fri, 14 Jun 2024 19:37:16 +0000 (13:37 -0600)]
qa/tasks/qemu: Fix OS version comparison
See: https://sentry.ceph.com/share/issue/21ed88d705854238bdafbf6711e795ee/
They're strings, not floats.
This surfaced as a result of https://github.com/ceph/teuthology/pull/1953
Dhairya Parmar [Mon, 6 Nov 2023 14:24:20 +0000 (19:54 +0530)]
qa: refactor client upgrade yamls and other minor touchups
* start testing new_ops and stress_tests with both the drivers(i.e. fuse and kclient)
therefore moved 0-clients/ from tasks/3-workload/new_ops/ to tasks/ and renamed it to
2-clients/
* since new_ops/ and stress_tests/ now share the common upgrade yaml, moved the
tests yamls(in stress_tests/1-tests) directly under 3-workload/stress_tests/
* renamed 1-client-sanity.yaml in new_ops/ to newops.yaml
Casey Bodley [Wed, 26 Jun 2024 16:11:10 +0000 (12:11 -0400)]
qa/rgw/upgrade/pacific: remove centos_8.stream.yaml and rely on ubuntu_20.04.yaml
we can't test this pacific->reef upgrade path on centos because pacific doesn't
have centos 9 builds, and reef no longer has centos 8 builds. only test
this upgrade on ubuntu focal which is still supported for both releases
this commit targets the reef branch directly because this rgw/upgrade/pacific
suite no longer exists on main and squid branches
Adam King [Fri, 14 Jun 2024 15:59:27 +0000 (11:59 -0400)]
qa/crimson-rados: remove centos 8 symlinks
As we're trying to drop centos 8 from the distros we
test on these symlinks are now dead and need to be
cleaned up. In main, there was no replacement for
these symlinks (it just relies on the
crimson-supposted-all-distro dir for its distro)
so I'm just removing them here.
Adam King [Fri, 7 Jun 2024 17:36:31 +0000 (13:36 -0400)]
qa/distros: add ubuntu 22.04 for containerized tests
Partial backport of 0fa3eb67387eaf403b5a6e716a81582949dcecf1
that adds the symlinks for the containerized tests to use
ubuntu 22.04 but leaves out the part dropping ubuntu 20.04
Adam King [Mon, 11 Dec 2023 20:44:30 +0000 (15:44 -0500)]
qa/cephadm: fix iscsi pids limit check for centos 9
Centos 9 uses cgroups v2 which has a slightly
different file location for the pids.max. This commit
updates the test to also check the new location
so the test can pass on centos 9
Adam King [Mon, 11 Dec 2023 18:59:42 +0000 (13:59 -0500)]
qa/cephadm: use quincy for add-repo test
There are no centos 9 build for octopus, so if we
want to start testing on cnetos 9 as a distro we need
the add-repo test to be done on a newer release
for which there are actual builds
the subsuite had a supported-all-distro$/ subdirectory, but that only
contained centos_8.yaml. qa/tasks/rabbitmq.py is hardcoded to use 'yum'
and rpm packages, so replace supported-all-distro$ with a link to
centos_latest.yaml
mon: validate SERVER_REEF on set-require-min-compat-client
Unit testing
-------------
```
[rzarzynski@o06 build]$ bin/unittest_features
...
[ RUN ] features.release_features
1 argonaut features 0x40000 looks like argonaut
2 bobtail features 0x40000 looks like argonaut
3 cuttlefish features 0x40000 looks like argonaut
4 dumpling features 0x42040000 looks like dumpling
5 emperor features 0x42040000 looks like dumpling
6 firefly features 0x20842040000 looks like firefly
7 giant features 0x20842040000 looks like firefly
8 hammer features 0x1020842040000 looks like hammer
9 infernalis features 0x1020842040000 looks like hammer
10 jewel features 0x401020842040000 looks like jewel
11 kraken features 0xc01020842040000 looks like kraken
12 luminous features 0xe01020842240000 looks like luminous
13 mimic features 0xe01020842240000 looks like luminous
14 nautilus features 0xe01020842240000 looks like luminous
15 octopus features 0xe01020842240000 looks like luminous
16 pacific features 0xe01020842240000 looks like luminous
17 quincy features 0xe01020842240000 looks like luminous
18 reef features 0xe010208d2240000 looks like reef
19 squid features 0xe010208d2240000 looks like reef
[ OK ] features.release_features (0 ms)
```
Manual testing
--------------
\### 'quincy` client connected to `main` cluster
There was `ceph -w` from `quincy` running in the background.
```
[rzarzynski@o06 build]$ bin/ceph osd set-require-min-compat-client reef
Error EPERM: cannot set require_min_compat_client to reef: 1 connected client(s) look like luminous (missing 0x80000000); add --yes-i-really-mean-it to do it anyway
```
mon, osd, *: expose upmap-primary in OSDMap::get_features()
This is a minimal fix to ensure only peers understanding
`pg-upmap-primary` are able to connect, and thus to exclude
the possibility of running into the `pg_upmap_primaries.empty()`
assertion in encoders.
Fixes for other problems will follow up.
The intention is to ship this patch in the very next minor
release of reef.
Manual testing
--------------
\### start using upmap-primar is presence of `quincy` client
NOTE: incompatible clients aren't disconnected but this is
known and expected as we lack the machinery.
\### `main` client is still able to connect
```
[rzarzynski@o06 build]$ bin/ceph -w
cluster:
id: d570a7cd-84ca-4fd0-aafb-80138762c6af
health: HEALTH_WARN
11 mgr modules have failed dependencies
1 pool(s) do not have an application enabled
services:
mon: 1 daemons, quorum a (age 64m)
mgr: x(active, since 64m)
osd: 3 osds: 3 up (since 64m), 3 in (since 64m)
\### `quincy` client may connect again
```
[rzarzynski@o06 build-quincy]$ bin/ceph -s -c /home/rzarzynski/ceph2/build/ceph.conf
cluster:
id: d570a7cd-84ca-4fd0-aafb-80138762c6af
health: HEALTH_WARN
11 mgr modules have failed dependencies
1 pool(s) do not have an application enabled
services:
mon: 1 daemons, quorum a (age 77m)
mgr: x(active, since 77m)
osd: 3 osds: 3 up (since 76m), 3 in (since 76m)
John Mulligan [Fri, 29 Mar 2024 18:04:33 +0000 (14:04 -0400)]
ceph.spec.in: remove command-with-macro line
A comment clearly left as a breadcrumb for a node-proxy manpage is
causing (intermittent) build failures. Remove the line and hope
the manpage is added if/when appropriate.
mds: raise health warning if client lacks feature for root_squash
Rather than evict all clients lacking this feature bit, raise a health error
that pushes the administrator to address it. This avoids the surprise of having
all affected clients suddenly evicted in the cluster.
Fixes: https://tracker.ceph.com/issues/65733 Fixes: 954ed30 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 66ff5c9fc8d4664f18b2fa462e96e5548c35951f)