Patrick Donnelly [Tue, 24 Mar 2026 14:47:10 +0000 (10:47 -0400)]
Merge PR #67954 into main
* refs/pull/67954/head:
orchestrator/test/test_orchestrator: fix return code to negative
mgr/mgr_module: fix tox test missing a type annotation
mgr/selftest: mypy error fix missing a type annotation
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
David Galloway [Mon, 23 Mar 2026 15:05:38 +0000 (11:05 -0400)]
container/build.sh: FROM_IMAGE=rockylinux-10 default for >=tentacle
We build centos9 and rocky10 packages and containers by default now for wip, main, and tentacle branches as of https://github.com/ceph/ceph-build/pull/2557.
Starting with tentacle, we want `podman pull quay.ceph.io/ceph-ci/ceph:tentacle` or `podman pull quay.ceph.io/ceph-ci/ceph:$SHA1` to fetch the container that uses Rocky 10 as its base OS image (FROM_IMAGE).
Fixes: https://tracker.ceph.com/issues/75673
Signed-off-by: David Galloway <david.galloway@ibm.com>
Patrick Donnelly [Fri, 20 Mar 2026 21:49:53 +0000 (17:49 -0400)]
Merge PR #67102 into main
* refs/pull/67102/head:
qa/workunits/rados/test_envlibrados_for_rocksdb.sh: Add Rocky support
qa/workunits/ceph-helpers-root: Add Rocky support for install packages
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Patrick Donnelly [Fri, 20 Mar 2026 21:49:06 +0000 (17:49 -0400)]
Merge PR #66396 into main
* refs/pull/66396/head:
neorados: specify alignments for aligned_storage
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kefu Chai <k.chai@proxmox.com>
Reviewed-by: Mark Kogan <mkogan@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Patrick Donnelly [Fri, 20 Mar 2026 21:44:45 +0000 (17:44 -0400)]
Merge PR #66244 into main
* refs/pull/66244/head:
mgr/Gil.cc: simplify Gil(), ~Gil()
mgr/Gil.cc: do not use PyGILState_Check()
mgr: add mgr_subinterpreter_modules config
python-common/.../service_spec: implement ServiceSpec.__getnewargs__ to allow unpickle to work correctly
mgr: serialize python objects sent between subinterpreters via remote
Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Patrick Donnelly [Fri, 20 Mar 2026 21:39:45 +0000 (17:39 -0400)]
Merge PR #63859 into main
* refs/pull/63859/head:
qa/workunits/mgr: account for nvmeof module being "always-on"
mgr, qa: clarify module checks in DaemonServer
mgr, qa: add `pending_modules` to asock command
mgr, common, qa, doc: issue health error after max expiration is exceeded
mgr: ensure that all modules have started before advertising active mgr
Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Ilya Dryomov [Wed, 11 Mar 2026 11:04:24 +0000 (12:04 +0100)]
librbd/migration/QCOWFormat: avoid use-after-free in execute_request()
Both L2TableCache and QCOWFormat can be destroyed after the completion
for the last L2 cache request is posted, particularly so in unit tests.
The strand destructor doesn't drain the handler queue in any way but
merely ensures that previously posted handlers would get dispatched in
a non-concurrent fashion. As a result, use-after-free can ensue when
execute_request() unnecessarily dispatches itself for the last time.
David Galloway [Mon, 26 Jan 2026 17:05:01 +0000 (12:05 -0500)]
qa: allowlist bpf podman denials on Rocky 10
Rocky Linux 10 logs SELinux AVCs for systemd BPF operations during container startup due to incomplete SELinux policy coverage. These AVCs occur in permissive mode, are reproducible without Ceph, and do not indicate functional failure. Tests should ignore this specific AVC class while continuing to fail on enforced denials.
Signed-off-by: David Galloway <david.galloway@ibm.com>
Dan Mick [Mon, 16 Mar 2026 20:13:30 +0000 (13:13 -0700)]
container/make-manifest-list.py: add version support
Add a mandatory -v/--version option to select the version to examine
(to allow multiple prerelease tags to exist). Reorder arguments so that
usage help in the 'missing version' case shows the long option names.
Requires a change to the ceph-release-containers job as well to pass
the --version argument.
This commit is part of a PR that includes an update to the "promote"
invocation of make-manifest-list.py, which is done manually and must
also contain the --version argument.
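
As a rough illustration of the argument handling described above, the following sketch uses Python's argparse with a mandatory version option whose long name is listed first, so that the generated usage text shows `--version`. It is not the actual code in make-manifest-list.py; the function name `build_parser` and the help text are assumptions made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser; not copied from make-manifest-list.py."""
    parser = argparse.ArgumentParser(prog="make-manifest-list.py")
    # Listing the long option first makes argparse show it in the usage line.
    parser.add_argument(
        "--version", "-v",
        required=True,
        help="version to examine (assumed help text)",
    )
    return parser
```

Calling `build_parser().parse_args(["-v", "some-version"])` stores the value in `args.version`, while calling it with no arguments makes argparse print the usage text and exit with a non-zero status.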
Normally, when fast devices are passed to the batch command but
no fast allocations can be found, the batch command does
nothing and returns an empty plan. This is a problem because
the empty return makes the failure silent, which makes it
hard to debug in certain scenarios. I propose
to raise an error instead, and have made changes in osd.py
to better log the errors and process the exceptions. This
shouldn't affect existing processes much, and the change in
osd.py ensures the raised errors will not interrupt the return
output. I've also changed the unit tests to account for this
change.
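
The behavior change described above can be sketched roughly as follows. This is an illustrative outline only, not the ceph-volume implementation: the names `NoFastAllocationError`, `plan_batch`, and `safe_plan` are hypothetical, and the plan is reduced to a list of device pairs.

```python
import logging

logger = logging.getLogger("batch_sketch")


class NoFastAllocationError(RuntimeError):
    """Hypothetical error: fast devices were given but none could be allocated."""


def plan_batch(data_devices, fast_devices, fast_allocations):
    """Pair each data device with a fast allocation (or None if no fast devices)."""
    if not fast_devices:
        return [(dev, None) for dev in data_devices]
    if not fast_allocations:
        # Previously this case would have produced an empty plan silently.
        raise NoFastAllocationError(
            "fast devices %s were passed but no fast allocations were found"
            % ", ".join(fast_devices)
        )
    return list(zip(data_devices, fast_allocations))


def safe_plan(data_devices, fast_devices, fast_allocations):
    """Caller-side handling: log the error but still return a plan object."""
    try:
        return plan_batch(data_devices, fast_devices, fast_allocations)
    except NoFastAllocationError as exc:
        logger.error("batch planning failed: %s", exc)
        return []
```

The caller still receives a list in the failure case, but the cause is now recorded in the log rather than hidden behind an empty plan.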
Ilya Dryomov [Mon, 9 Mar 2026 11:57:28 +0000 (12:57 +0100)]
qa/workunits/rbd: drop racy assert in test_tasks_recovery()
Even though "ceph rbd task list" is executed immediately after
a successful "ceph rbd task add flatten", the operation may complete
in the interim and the task listing may come back empty legitimately.
Given that we are asserting that flatten actually occurs based on
"rbd info" output, there is no real need to try to briefly observe
the flatten task in the task list.
Alex Ainscow [Wed, 18 Mar 2026 14:51:57 +0000 (14:51 +0000)]
src: Move the decision to build the ISA plugin to the top level make file
Previously, the first time you build ceph, common did not see the correct
value of WITH_EC_ISA_PLUGIN. The consequence is that the global.yaml gets
built with osd_erasure_code_plugins not including isa. This is not great
given it's our default plugin.
We considered simply removing this parameter from make entirely, but this
may require more discussion about supporting old hardware.
So the slightly ugly fix is to move this erasure-code specific declaration
to the top-level.
Fixes: https://tracker.ceph.com/issues/75537
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
CompleteMultipartUpload depends on this lock to ensure consistency of
uploads and protect against data loss, so we should try very hard to
hold this lock as long as it takes to complete successfully
MPRadosSerializer accomplishes this by spawning a background lock
renewal coroutine. this coroutine is started during a successful call to
try_lock(), and stopped before unlock() releases the lock
this duration ultimately gets passed down to cls_lock's set_duration()
function, which has overloads for both utime_t and ceph::timespan.
prefer ceph::timespan because it also works with boost asio timers
Casey Bodley [Thu, 12 Mar 2026 14:39:02 +0000 (10:39 -0400)]
rgw: check for broken lock before multipart complete
if lock renewal fails, is_locked() will return false. check that just
before upload->complete() goes on to write/overwrite the head object,
and return the same ERR_INTERNAL_ERROR from lock contention
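
The lock renewal pattern and the pre-completion check described in these commit messages can be sketched in Python with asyncio. This is a loose analogy, not the RGW code: `FakeLockBackend`, `RenewingLock`, and `complete_upload` are hypothetical names, the backend is an in-memory stand-in rather than cls_lock, and `datetime.timedelta` plays the role of the duration type.

```python
import asyncio
from datetime import timedelta


class FakeLockBackend:
    """Hypothetical in-memory lock service; it does not model cls_lock."""

    def __init__(self) -> None:
        self.held = False
        self.renewals = 0
        self.fail_renew = False

    async def acquire(self, duration: timedelta) -> bool:
        if self.held:
            return False
        self.held = True
        return True

    async def renew(self, duration: timedelta) -> None:
        if self.fail_renew:
            raise RuntimeError("renewal failed")
        self.renewals += 1

    async def release(self) -> None:
        self.held = False


class RenewingLock:
    """Lock wrapper that renews in the background while it is held."""

    def __init__(self, backend: FakeLockBackend, duration: timedelta) -> None:
        self.backend = backend
        self.duration = duration
        self._locked = False
        self._task = None

    async def try_lock(self) -> bool:
        if not await self.backend.acquire(self.duration):
            return False
        self._locked = True
        # Start the renewal coroutine only after a successful acquire.
        self._task = asyncio.create_task(self._renew_loop())
        return True

    def is_locked(self) -> bool:
        return self._locked

    async def _renew_loop(self) -> None:
        interval = self.duration.total_seconds() / 2
        try:
            while True:
                await asyncio.sleep(interval)
                await self.backend.renew(self.duration)
        except asyncio.CancelledError:
            raise
        except Exception:
            # A failed renewal means the lock can no longer be trusted.
            self._locked = False

    async def unlock(self) -> None:
        # Stop the renewal coroutine before releasing the lock.
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None
        await self.backend.release()
        self._locked = False


def complete_upload(lock: RenewingLock) -> str:
    # Check the lock just before the step that would write the head object.
    if not lock.is_locked():
        raise RuntimeError("lock lost before completion")
    return "completed"
```

Renewing at half the lock duration is an arbitrary choice for the sketch; the point is only that a renewal failure flips `is_locked()` to false so the completion step can refuse to proceed.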