John Mulligan [Fri, 14 Feb 2025 19:51:03 +0000 (14:51 -0500)]
doc: document the new container build tool and link to it in README
Add a new markdown file in the root of the tree, ContainerBuild.md, that
can serve as a basic introduction to the new container build tools
recently merged to ceph.
Add a small 'breadcrumb' section to the project README.md to help find
this new document.
John Mulligan [Thu, 20 Feb 2025 00:17:30 +0000 (19:17 -0500)]
script/build-with-container: add support for overlay dir
The source dir (aka homedir, default /ceph) is mounted in the container
read-write. This is needed as the various ceph build scripts expect to
write things into the tree - often this is in the build directory - but
not always. This can lead to small messes and/or situations that are
confusing to debug, especially if one is jumping between distros often.
Add an option to use an overlay volume for the homedir - by default we
enable a persistent overlay with a supplied "upper dir" where files that
were written will appear. One can also enable a temporary overlay that
forgets the writes when the container exits - maybe useful when doing
experiments in 'interactive' mode.
To use this option run the command with the `--overlay=<dir>` option.
For example: `./src/script/build-with-container.py -b build.inner
--overlay-dir build.ovr`. This will create a directory
`build.ovr/content` automatically and all new files will appear there.
For example the build directory will appear at
`build.ovr/content/build.inner`.
To use the temporary overlay use a `-` as the directory name. For
example: `./src/script/build-with-container.py -b build.inner
--overlay-dir -`
John Mulligan [Thu, 20 Feb 2025 14:50:49 +0000 (09:50 -0500)]
script/build-with-container: skip dnf cache dir volume mounts on docker
When using docker the --volume option is not available during build
(docker [buildx] build), unlike podman. Since passing these volumes must
be conditional on them being set up I see no way to handle this short of
just disabling the option on docker. Log the fact that it's being
skipped - the only other issue is that we pointlessly set up some dirs
and the build may be a bit slower.
John Mulligan [Wed, 19 Feb 2025 18:20:36 +0000 (13:20 -0500)]
script/build-with-container: remove default --volume arg from ctr build
On the original github pr #59841 user fayak kindly informed us that the
--volume option was not supported by docker build. Since this section
was a leftover from a previous way of constructing the builder image and
was no longer needed we simply removed it.
John Mulligan [Wed, 19 Feb 2025 18:20:01 +0000 (13:20 -0500)]
script/build-with-container.py: build builder image with --pull=always
Construct the builder image using the --pull=always flag to initiate a
pull of the base image (centos, ubuntu, etc) in order to avoid using a
stale base image. Since the script automatically (by default) avoids
building if a matching tag is in local container storage it is handy to
use a fresh base when it *is* time to build something. Otherwise, you
end up in a situation like I sometimes do - using a months old base
unintentionally.
John Mulligan [Fri, 14 Feb 2025 19:50:42 +0000 (14:50 -0500)]
script/build-with-container: add a common packages target
Add a `packages` target to build-with-container.py that requests a build
of packages, whatever package type is native to the distro selected.
For example `./src/script/build-with-container.py -d ubuntu22.04 -e
packages` will automatically select a deb packages build where
`./src/script/build-with-container.py -d centos9 -e packages` will
trigger rpm packages to be built. The underlying package-type specific
targets remain unchanged.
John Mulligan [Fri, 14 Feb 2025 16:44:35 +0000 (11:44 -0500)]
script/build-with-container: support custom tag suffixes
Previously, one could use the `--tag` option to completely override the
container tag generated by the script. However, there are cases where
one may want to add information to the tag rather than override it.
Allow the tag value to start with a plus (+) character that indicates
that the remainder of the string is to be suffixed to the generated tag.
Add a command line option --base-branch that allows the user to supply a
custom base branch name. git doesn't make determining this easy so we
always assume a base branch of 'main' by default - but this option lets
one change that.
John Mulligan [Fri, 14 Feb 2025 16:24:29 +0000 (11:24 -0500)]
src/script: rename CEPH_BRANCH to CEPH_BASE_BRANCH for build container
Previously, we were passing build argument of CEPH_BRANCH, but that was
a bit misleading as we expect the current branch to vary a bit (as users
will be using branches to develop and test the code). What we actually
care about is the base branch ('main', 'squid', etc) as that is fed into
our bootstrap script and we want the option to simple variations based
on the name of said base branch.
Rename CEPH_BRANCH to CEPH_BASE_BRANCH for clarity.
Add a new --current-branch argument that lets the user supply a name for
the current branch. This allows the automatic tag generation to avoid
calling git - something useful if the tree is not using a git checkout
(like a tarball). It also allows you to pull a temporary branch in git
but ignore it and act like the temporary branch is the base branch.
John Mulligan [Tue, 11 Feb 2025 23:36:13 +0000 (18:36 -0500)]
script/build-with-container: add more distro aliases
Add a system to define distro name aliases and use that to define some
additional aliases, primarily to match ubuntu codenames rather than
version numbers. Requested by Zack.
Ilya Dryomov [Thu, 20 Feb 2025 15:38:41 +0000 (16:38 +0100)]
qa/workunits/rbd: add a test for force promote with a user snapshot
Add a reproducer for the crash on a bad variant access which was fixed
in commit 7d75161051da ("librbd: fix a crash in get_rollback_snap_id").
The reproducer deliberately works around many other issues with force
promote in snapshot-based mirroring: stopping rbd-mirror daemon
shouldn't be necessary (let alone with SIGKILL), get_rollback_snap_id()
and its caller can_create_primary_snapshot() are flawed and can pick
the wrong snapshot to roll back to or skip rollback when it's actually
required, the user snapshot in this scenario should be removed as part
of force promoting because it's incomplete and won't be usable after
the image is promoted, etc.
Conflicts:
qa/workunits/rbd/rbd_mirror_journal.sh [ commits 3fd8a0388735
("qa/workunits/rbd: merge journal and snapshot test scripts")
and 3fdbc160bb21 ("rbd-mirror: allow mirroring to a different
namespace") not in reef ]
qa/workunits/rbd/rbd_mirror_snapshot.sh [ duplicated/cloned for
snapshot-based mirroring ]
Zac Dover [Mon, 3 Feb 2025 13:37:34 +0000 (23:37 +1000)]
doc/rados: improve pg_num/pgp_num info
Improve the guidance around setting pg_num, and clear up confusion
around whether pgp_num should be set manually or, indeed, if it even can
be set manually.
This PR was raised in response to Mark Schouten's email here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/CBDJTLTTIEZVG7GVZBX37UAWGYNSSMPD/
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit c43e7337212fe38e8db63d00345fa9858b3cb10a)
[CVE-2024-48916] rgw/sts: fix to disallow unsupported JWT algorithms
while authenticating AssumeRoleWithWebIdentity using JWT obtained
from an external IDP.
N Balachandran [Sat, 15 Feb 2025 13:26:31 +0000 (18:56 +0530)]
rbd-mirror: fix possible recursive lock of ImageReplayer::m_lock
If periodic status update (LambdaContext which is queued from
handle_update_mirror_image_replay_status()) races with shutdown and
ends up being the last in-flight operation that shutdown was pending
on, we attempt to recursively acquire m_lock in shut_down() because
m_in_flight_op_tracker.finish_op() is called with m_lock (and also
m_threads->timer_lock) held. These locks are needed only for the call
to schedule_update_mirror_image_replay_status() and should be unlocked
immediately.
Fixes: https://tracker.ceph.com/issues/69978 Co-authored-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
(cherry picked from commit c60514087bc29540d3babd7855c5a4e28f2bf1b0)
Patrick Donnelly [Fri, 28 Feb 2025 00:29:26 +0000 (19:29 -0500)]
Merge PR #57190 into reef
* refs/pull/57190/head:
pybind/mgr/mgr_module: turn off all automatic transactions
pybind/mgr: disable sqlite3/python autocommit
qa/tasks/mgr: add tests for sqlite autocommit
qa/tasks/vstart_runner: run daemons in foreground
qa/tasks/vstart_runner: add missing poll method
qa/suites/rados/mgr: add cli/devicehealth tasks
qa: reorganize mgr unit tests
qa: use position-independent link
qa: add missing terminating newline
pybind/mgr: add killpoint for sqlite3 database setup
mgr: allow specifying module option level
mon/MgrMonitor: promote standby when unsetting down flag
mon/MgrMonitor: only drop active if exists
Patrick Donnelly [Wed, 12 Feb 2025 02:28:40 +0000 (21:28 -0500)]
pybind/mgr/mgr_module: turn off all automatic transactions
I misunderstood autocommit=False in prior patches. The sqlite3 binding will
still create transactions automatically which confused newer bindings using
autocommit.
So, turn off automatic transaction management completely to maintain backwards
compatibility.
Fixes: https://tracker.ceph.com/issues/69912 Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
(cherry picked from commit df49652987019d5eeec31c86332d8e69995d931a)
Naman Munet [Wed, 22 Jan 2025 10:59:20 +0000 (16:29 +0530)]
mgr/dashboard: Add confirmation textbox for resource name on delete action
Before:
=====
User was able to delete a single or multiple critical resources like ( images, snapshots, subvolumes, subvolume-groups, pools, hosts , OSDs, buckets, file system, services ) by just clicking on a checkbox.
After:
=====
User now has to type the resource name that they are deleting in the textbox on the delete modal, and then only they will be able to delete the critical resource.
Also from now onwards multiple selection for deletions of critical resources is not possible. Hence, user can delete only single resource at a time. On the other side, non-critical resources can be deleted in one go.
Ilya Dryomov [Tue, 18 Feb 2025 16:51:47 +0000 (17:51 +0100)]
test/rbd_mirror: clear Namespace::s_instance at the end of a test
TestMockPoolReplayer.Namespaces and NamespacesError tests leave behind
a dangling pointer to a stack-allocated MockNamespace which leads to an
easily reproducible use-after-free and segfault when tests are shuffled.
Ilya Dryomov [Mon, 17 Feb 2025 11:41:51 +0000 (12:41 +0100)]
test/rbd_mirror: flush watch/notify callbacks in TestImageReplayer
TestImageReplayer establishes its own (i.e. outside of the SUT code)
watch on the header of the remote image to be able to synchronize the
execution of the test with certain notifications. This watch is
established before the remote image is opened and is teared down until
after the remote image is closed but while the image replayer is still
running. The flush that is part of image close sequence thus isn't
guaranteed to cover all callbacks, especially for snapshot-based
mirroring where UnlinkPeerRequest spawned from Replayer::unlink_peer()
generates a notification on the remote image for each completed unlink.
Since TestImageReplayer further immediately deletes C_WatchCtx, pretty
much any test can segfault when C_WatchCtx::handle_notify() is invoked
by TestWatchNotify infrastructure. Because it's a virtual method, the
segfault often involves a completely bogus instruction pointer:
John Mulligan [Thu, 27 Jul 2023 18:17:36 +0000 (14:17 -0400)]
python-common: fix valid_addr on python 3.11
The behavior on python 3.11 regarding IPv4 addresses in bracket has
changed:
```
$ python3.8 -c 'from urllib.parse import urlparse; urlparse("http://[192.168.0.1]")'
[john@edfu ~]$ python3.11 -c 'from urllib.parse import urlparse; urlparse("http://[192.168.0.1]")'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib64/python3.11/urllib/parse.py", line 395, in urlparse
splitresult = urlsplit(url, scheme, allow_fragments)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/urllib/parse.py", line 500, in urlsplit
_check_bracketed_host(bracketed_host)
File "/usr/lib64/python3.11/urllib/parse.py", line 448, in
_check_bracketed_host
raise ValueError(f"An IPv4 address cannot be in brackets")
ValueError: An IPv4 address cannot be in brackets
```
This breaks the test in test_valid_addr that asserts that function
valid_addr returns the string "IPv4 address wrapped in brackets is
invalid".
Move the step that checks for brackets and dots above the urllib
check so that the function continues returning the expected string.
Adam King [Wed, 12 Feb 2025 16:32:24 +0000 (11:32 -0500)]
mgr/cephadm: use double quotes for NFSv4 RecoveryBackend in ganesha conf
This came directly from someone on the ganesha team. We've actually had
this use single quotes for a long time (at least since mid 2020) but I
believe recent feature work on the ganesha side exposed the issue
Adam King [Thu, 30 Jan 2025 14:15:37 +0000 (09:15 -0500)]
mgr/cephadm: create OSD daemon deploy specs through make_daemon_spec
That function handles setting up the extra container/entrypoint
args for the daemon during initial deployment. Having the
CephadmDaemonDeploySpec made directly in the OSD deployment
workflow means initial deployments of OSDs won't have the
extra container/entrypoint args from the spec
Michal Nasiadka [Wed, 11 Sep 2024 12:26:37 +0000 (14:26 +0200)]
cephadm: Support Docker Live Restore
Currently with Docker Live Restore [1] enabled and while restarting
Docker Engine - all Ceph container images will get restarted,
while the feature allows restarting docker.service without
containers downtime.
This is due to Requires=docker.service in systemd units templates,
which mandates that on docker.service restart - the ceph container
systemd units will be restarted as well.
Reworking Requires= to Wants= that is a weaker version of the former,
see [2].
Leaving After= entries, because they should allow systemd to correctly
order the startup (first docker, then ceph containers).
orch: refactor boolean handling in drive group spec
The intent of 42721c03ee6f was to address an issue where boolean
parameters weren't handled correctly.
I noticed that a parameter (`tpm2`) was missed, which made me realize
that maintaining a list of these boolean parameters is necessary.
To simplify things, we should only accept `"true"` or `"false"` (in any case),
allowing us to avoid the need to maintain a list of boolean parameters.
This change introduces a `list_drive_group_spec_bool_arg` to store boolean
arguments related to drive group specifications, simplifying the validation
process for boolean values by directly checking if the values are 'true' or 'false'.