Add a new --current-branch argument that lets the user supply a name for
the current branch. This allows the automatic tag generation to avoid
calling git - something useful if the tree is not using a git checkout
(like a tarball). It also allows you to pull a temporary branch in git
but ignore it and act like the temporary branch is the base branch.
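A minimal sketch of the idea, not the script's actual code: the helper names `build_parser` and `current_branch` are hypothetical, and the git fallback shown is only one plausible way to look up the branch. When a name is supplied on the command line, git is never invoked.

```python
import argparse
import subprocess


def build_parser():
    """Build a parser with the (assumed) --current-branch option."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--current-branch",
        help="name to use for the current branch instead of asking git",
    )
    return parser


def current_branch(cli_value=None):
    """Return the branch name, calling git only when none was supplied."""
    if cli_value:
        return cli_value
    # Fallback: ask git. This fails outside a git checkout (e.g. a tarball).
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


args = build_parser().parse_args(["--current-branch", "reef"])
print(current_branch(args.current_branch))
```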
John Mulligan [Tue, 11 Feb 2025 23:36:13 +0000 (18:36 -0500)]
script/build-with-container: add more distro aliases
Add a system to define distro name aliases and use that to define some
additional aliases, primarily to match ubuntu codenames rather than
version numbers. Requested by Zack.
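One way such an alias table could look, as a sketch only: the canonical names (`ubuntu22.04` and so on), the `DISTRO_ALIASES` table and the `resolve_distro` helper are assumptions for illustration and may not match the names the script uses. The codename-to-version pairs themselves are the standard Ubuntu ones.

```python
# Hypothetical alias table: Ubuntu codename -> assumed canonical distro name.
DISTRO_ALIASES = {
    "focal": "ubuntu20.04",
    "jammy": "ubuntu22.04",
    "noble": "ubuntu24.04",
}


def resolve_distro(name):
    """Map an alias to its canonical name; pass other names through."""
    return DISTRO_ALIASES.get(name.lower(), name)


print(resolve_distro("jammy"))
```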
Ilya Dryomov [Thu, 20 Feb 2025 15:38:41 +0000 (16:38 +0100)]
qa/workunits/rbd: add a test for force promote with a user snapshot
Add a reproducer for the crash on a bad variant access which was fixed
in commit 7d75161051da ("librbd: fix a crash in get_rollback_snap_id").
The reproducer deliberately works around many other issues with force
promote in snapshot-based mirroring: stopping rbd-mirror daemon
shouldn't be necessary (let alone with SIGKILL), get_rollback_snap_id()
and its caller can_create_primary_snapshot() are flawed and can pick
the wrong snapshot to roll back to or skip rollback when it's actually
required, the user snapshot in this scenario should be removed as part
of force promoting because it's incomplete and won't be usable after
the image is promoted, etc.
Conflicts:
qa/workunits/rbd/rbd_mirror_journal.sh [ commits 3fd8a0388735
("qa/workunits/rbd: merge journal and snapshot test scripts")
and 3fdbc160bb21 ("rbd-mirror: allow mirroring to a different
namespace") not in reef ]
qa/workunits/rbd/rbd_mirror_snapshot.sh [ duplicated/cloned for
snapshot-based mirroring ]
Zac Dover [Mon, 3 Feb 2025 13:37:34 +0000 (23:37 +1000)]
doc/rados: improve pg_num/pgp_num info
Improve the guidance around setting pg_num, and clear up confusion
around whether pgp_num should be set manually or, indeed, if it even can
be set manually.
This PR was raised in response to Mark Schouten's email here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/CBDJTLTTIEZVG7GVZBX37UAWGYNSSMPD/
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit c43e7337212fe38e8db63d00345fa9858b3cb10a)
[CVE-2024-48916] rgw/sts: fix to disallow unsupported JWT algorithms
while authenticating AssumeRoleWithWebIdentity using JWT obtained
from an external IDP.
N Balachandran [Sat, 15 Feb 2025 13:26:31 +0000 (18:56 +0530)]
rbd-mirror: fix possible recursive lock of ImageReplayer::m_lock
If periodic status update (LambdaContext which is queued from
handle_update_mirror_image_replay_status()) races with shutdown and
ends up being the last in-flight operation that shutdown was pending
on, we attempt to recursively acquire m_lock in shut_down() because
m_in_flight_op_tracker.finish_op() is called with m_lock (and also
m_threads->timer_lock) held. These locks are needed only for the call
to schedule_update_mirror_image_replay_status() and should be unlocked
immediately.
Fixes: https://tracker.ceph.com/issues/69978
Co-authored-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
(cherry picked from commit c60514087bc29540d3babd7855c5a4e28f2bf1b0)
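The locking pattern described in this commit can be illustrated with a small single-threaded Python analogy. This is not the rbd-mirror code: the class and method names below are hypothetical stand-ins, and `threading.Lock` plays the role of a non-reentrant mutex. The buggy variant finishes the in-flight op while the lock is still held. The fixed variant releases the lock first.

```python
import threading


class ReplayerSketch:
    """Toy stand-in for an object guarded by a non-reentrant lock."""

    def __init__(self):
        self.lock = threading.Lock()  # non-reentrant, like a plain mutex
        self.updates_scheduled = 0
        self.shut_down_done = False

    def schedule_update(self):
        # Caller must hold self.lock.
        self.updates_scheduled += 1

    def shut_down(self):
        # Needs the lock itself. In this single-threaded sketch a failed
        # non-blocking acquire means the caller already holds it, so fail
        # fast instead of deadlocking.
        if not self.lock.acquire(blocking=False):
            raise RuntimeError("recursive lock attempt")
        try:
            self.shut_down_done = True
        finally:
            self.lock.release()

    def finish_op(self):
        # Assume this was the last in-flight op, so shutdown proceeds.
        self.shut_down()

    def update_buggy(self):
        with self.lock:
            self.schedule_update()
            self.finish_op()  # lock still held, shut_down() re-acquires it

    def update_fixed(self):
        with self.lock:
            self.schedule_update()
        self.finish_op()  # lock was released before finishing the op
```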
Patrick Donnelly [Fri, 28 Feb 2025 00:29:26 +0000 (19:29 -0500)]
Merge PR #57190 into reef
* refs/pull/57190/head:
pybind/mgr/mgr_module: turn off all automatic transactions
pybind/mgr: disable sqlite3/python autocommit
qa/tasks/mgr: add tests for sqlite autocommit
qa/tasks/vstart_runner: run daemons in foreground
qa/tasks/vstart_runner: add missing poll method
qa/suites/rados/mgr: add cli/devicehealth tasks
qa: reorganize mgr unit tests
qa: use position-independent link
qa: add missing terminating newline
pybind/mgr: add killpoint for sqlite3 database setup
mgr: allow specifying module option level
mon/MgrMonitor: promote standby when unsetting down flag
mon/MgrMonitor: only drop active if exists
Patrick Donnelly [Wed, 12 Feb 2025 02:28:40 +0000 (21:28 -0500)]
pybind/mgr/mgr_module: turn off all automatic transactions
I misunderstood autocommit=False in prior patches. The sqlite3 binding will
still create transactions automatically which confused newer bindings using
autocommit.
So, turn off automatic transaction management completely to maintain backwards
compatibility.
Fixes: https://tracker.ceph.com/issues/69912
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
(cherry picked from commit df49652987019d5eeec31c86332d8e69995d931a)
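As an illustration of transaction handling in Python's standard `sqlite3` module (not necessarily the exact change made in mgr_module), passing `isolation_level=None` stops the binding from opening transactions implicitly, so the caller issues BEGIN/COMMIT/ROLLBACK itself. The table and values are made up for the example.

```python
import sqlite3

# isolation_level=None: the binding no longer opens transactions implicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

# Without an explicit BEGIN, each statement takes effect immediately.
conn.execute("INSERT INTO kv VALUES ('a', '1')")
implicit_txn = conn.in_transaction

# Transactions are now managed explicitly by the caller.
conn.execute("BEGIN")
conn.execute("INSERT INTO kv VALUES ('b', '2')")
explicit_txn = conn.in_transaction
conn.execute("ROLLBACK")

rows = conn.execute("SELECT k FROM kv ORDER BY k").fetchall()
print(implicit_txn, explicit_txn, rows)
```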
Naman Munet [Wed, 22 Jan 2025 10:59:20 +0000 (16:29 +0530)]
mgr/dashboard: Add confirmation textbox for resource name on delete action
Before:
=====
Users could delete one or more critical resources (images, snapshots, subvolumes, subvolume groups, pools, hosts, OSDs, buckets, file systems, services) just by clicking a checkbox.
After:
=====
Users must now type the name of the resource they are deleting into the textbox on the delete modal before they can delete a critical resource.
Multiple selection is also no longer possible when deleting critical resources, so users can delete only one such resource at a time. Non-critical resources can still be deleted in one go.
Ilya Dryomov [Tue, 18 Feb 2025 16:51:47 +0000 (17:51 +0100)]
test/rbd_mirror: clear Namespace::s_instance at the end of a test
TestMockPoolReplayer.Namespaces and NamespacesError tests leave behind
a dangling pointer to a stack-allocated MockNamespace which leads to an
easily reproducible use-after-free and segfault when tests are shuffled.
Ilya Dryomov [Mon, 17 Feb 2025 11:41:51 +0000 (12:41 +0100)]
test/rbd_mirror: flush watch/notify callbacks in TestImageReplayer
TestImageReplayer establishes its own (i.e. outside of the SUT code)
watch on the header of the remote image to be able to synchronize the
execution of the test with certain notifications. This watch is
established before the remote image is opened and is not torn down until
after the remote image is closed but while the image replayer is still
running. The flush that is part of image close sequence thus isn't
guaranteed to cover all callbacks, especially for snapshot-based
mirroring where UnlinkPeerRequest spawned from Replayer::unlink_peer()
generates a notification on the remote image for each completed unlink.
Since TestImageReplayer further immediately deletes C_WatchCtx, pretty
much any test can segfault when C_WatchCtx::handle_notify() is invoked
by TestWatchNotify infrastructure. Because it's a virtual method, the
segfault often involves a completely bogus instruction pointer:
John Mulligan [Thu, 27 Jul 2023 18:17:36 +0000 (14:17 -0400)]
python-common: fix valid_addr on python 3.11
The behavior on python 3.11 regarding IPv4 addresses in brackets has
changed:
```
$ python3.8 -c 'from urllib.parse import urlparse; urlparse("http://[192.168.0.1]")'
$ python3.11 -c 'from urllib.parse import urlparse; urlparse("http://[192.168.0.1]")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python3.11/urllib/parse.py", line 395, in urlparse
    splitresult = urlsplit(url, scheme, allow_fragments)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/urllib/parse.py", line 500, in urlsplit
    _check_bracketed_host(bracketed_host)
  File "/usr/lib64/python3.11/urllib/parse.py", line 448, in _check_bracketed_host
    raise ValueError(f"An IPv4 address cannot be in brackets")
ValueError: An IPv4 address cannot be in brackets
```
This breaks the test in test_valid_addr that asserts that function
valid_addr returns the string "IPv4 address wrapped in brackets is
invalid".
Move the step that checks for brackets and dots above the urllib
check so that the function continues returning the expected string.
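The reordering can be sketched roughly as follows. This is a simplified stand-in, not the real valid_addr: the name `check_addr`, its `(bool, str)` return shape, and the exact bracket test are assumptions. The point is only that the bracket-and-dot check runs before urllib gets a chance to raise.

```python
from urllib.parse import urlparse


def check_addr(addr):
    """Hypothetical, simplified validator returning (ok, error_message)."""
    # Check for a bracketed IPv4 address first, so the message does not
    # depend on whether urlparse raises (as it does on Python 3.11).
    if "[" in addr and "]" in addr and "." in addr:
        return False, "IPv4 address wrapped in brackets is invalid"
    try:
        urlparse(f"http://{addr}")
    except ValueError as err:
        return False, str(err)
    return True, ""


print(check_addr("[192.168.0.1]"))
```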
Adam King [Wed, 12 Feb 2025 16:32:24 +0000 (11:32 -0500)]
mgr/cephadm: use double quotes for NFSv4 RecoveryBackend in ganesha conf
This came directly from someone on the ganesha team. We've actually had
this use single quotes for a long time (at least since mid-2020), but I
believe recent feature work on the ganesha side exposed the issue.
Adam King [Thu, 30 Jan 2025 14:15:37 +0000 (09:15 -0500)]
mgr/cephadm: create OSD daemon deploy specs through make_daemon_spec
That function handles setting up the extra container/entrypoint
args for the daemon during initial deployment. Having the
CephadmDaemonDeploySpec made directly in the OSD deployment
workflow means initial deployments of OSDs won't have the
extra container/entrypoint args from the spec.
Michal Nasiadka [Wed, 11 Sep 2024 12:26:37 +0000 (14:26 +0200)]
cephadm: Support Docker Live Restore
Currently, with Docker Live Restore [1] enabled, restarting Docker
Engine restarts all Ceph containers, even though the feature is meant
to allow restarting docker.service without container downtime.
This is due to Requires=docker.service in the systemd unit templates,
which mandates that when docker.service restarts, the Ceph container
systemd units are restarted as well.
Change Requires= to Wants=, which is a weaker version of the former
(see [2]).
Keep the After= entries, because they allow systemd to order the
startup correctly (first docker, then the Ceph containers).
orch: refactor boolean handling in drive group spec
The intent of 42721c03ee6f was to address an issue where boolean
parameters weren't handled correctly.
I noticed that a parameter (`tpm2`) was missed, which made me realize
that maintaining a list of these boolean parameters is necessary.
To simplify things, we should only accept `"true"` or `"false"` (in any case),
allowing us to avoid the need to maintain a list of boolean parameters.
This change introduces a `list_drive_group_spec_bool_arg` to store boolean
arguments related to drive group specifications, simplifying the validation
process for boolean values by directly checking if the values are 'true' or 'false'.
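A rough sketch of the stricter validation described above, under stated assumptions: the helper name `parse_bool_arg` and its error message are hypothetical and not taken from the orchestrator code. Only "true" or "false", in any letter case, is accepted.

```python
def parse_bool_arg(name, value):
    """Convert 'true'/'false' (any case) to a bool; reject anything else."""
    lowered = value.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"{name}: expected 'true' or 'false', got {value!r}")


print(parse_bool_arg("tpm2", "True"))
```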