mds: add retry request to MDSRank wait queue rather than via finisher
C_MDS_RetryRequest inherits from MDSInternalContext which does not
acquire mds_lock by itself. Adding to MDSRank wait queue will process
this via the progress thread which completes the context with mds_lock
acquired.
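The locking pattern described above can be sketched in Python. This is only an analogy under stated assumptions, not the MDS code: `Rank`, `queue_waiter`, `progress_once`, and `retry_request` are hypothetical stand-ins for MDSRank, its wait queue, the progress thread, and C_MDS_RetryRequest.

```python
import threading
from collections import deque


class Rank:
    """Hypothetical stand-in for MDSRank: one lock plus a wait queue."""

    def __init__(self):
        self.lock = threading.Lock()
        self.waiting = deque()
        self.completed = []

    def queue_waiter(self, ctx):
        # The context is only queued here; the caller never runs it.
        self.waiting.append(ctx)

    def progress_once(self):
        # The progress thread drains the queue while holding the lock,
        # so every queued context completes with the lock acquired.
        with self.lock:
            while self.waiting:
                ctx = self.waiting.popleft()
                ctx(self)


def retry_request(rank):
    # This context does not take the lock itself; it relies on its caller.
    assert rank.lock.locked()
    rank.completed.append("retried")


rank = Rank()
rank.queue_waiter(retry_request)
worker = threading.Thread(target=rank.progress_once)
worker.start()
worker.join()
```

Running the same context from a thread that does not hold the lock would trip the assertion, which is the hazard the commit avoids by not going through the finisher.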
Patrick Donnelly [Wed, 13 May 2026 12:40:50 +0000 (08:40 -0400)]
Merge PR #68476 into tentacle
* refs/pull/68476/head:
mgr/dashboard: Update permissions for pool-manager role
mgr/dashboard : Select replicated rule by default in pools form
mgr/dashboard : Fix application names in pools form
mgr/dashboard : add stretch cluster validation for pools form
Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
Remove 'success_message_template' from NvmeofCLICommand decorator of
NVMeoFSubsystem.add_network() and NVMeoFSubsystem.del_network().
This is because 'success_message_template' feature introduction PR
hasn't been backported to tentacle.
This commit can be reverted later in tentacle branch.
Gil Bregman [Mon, 13 Apr 2026 21:41:25 +0000 (00:41 +0300)]
mgr/dashboard: Add port and secure-listeners to subsystem add NVMeoF CLI command
Fixes: https://tracker.ceph.com/issues/75998
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 624adc09431dc2fdfa617940161f188c0831bf97)
Conflicts:
src/pybind/mgr/dashboard/controllers/nvmeof.py
Resolve conflict to use "traddr" instead of "server_address"
in NVMeoFSubsystem().create() parameters.
Main branch renamed the param ("traddr") to "server_address".
Tentacle
Vallari Agrawal [Thu, 12 Mar 2026 13:50:00 +0000 (19:20 +0530)]
mgr/dashboard: Add 'network_mask' to nvmeof cli
This commit adds the following to nvmeof cli:
0. Add new param `--network-mask` to 'subsystem add' cmd
It's a list parameter, so multiple netmasks can be passed with
`subsystem add --network-mask <subnet1> --network-mask <subnet2>`
1. Add new cli `subsystem add_network --network-mask <subnet>`
2. Add new cli `subsystem del_network --network-mask <subnet>`
3. Add column 'network_mask' to `subsystem list` output
4. Add column 'manual' to `listener list` output
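The repeated-option behaviour of `--network-mask` in item 0 can be illustrated with the standard library's argparse. This is a sketch only: the dashboard CLI is built on its own command decorator rather than argparse, and the program name and subnet values below are made up for the example.

```python
import argparse

# Hypothetical stand-alone parser; it only mimics the "list parameter" idea.
parser = argparse.ArgumentParser(prog="subsystem-add-sketch")
parser.add_argument(
    "--network-mask",
    action="append",   # each occurrence appends one value to the list
    default=[],
    dest="network_mask",
)

args = parser.parse_args(
    ["--network-mask", "10.0.0.0/24", "--network-mask", "192.168.1.0/24"]
)
```

After parsing, `args.network_mask` holds both subnets in the order given on the command line.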
Conflicts:
src/pybind/mgr/dashboard/controllers/nvmeof.py
The NVMeoFSubsystem controller uses the param name "traddr"
in the tentacle branch, and it's renamed to "server_address"
in the main branch. Since it's a breaking change, it will be
changed to "server_address" in the next major version.
So in this backport commit, we use "traddr" in create(),
add_network(), and del_network().
PR #67318 (boto3 migration) cherry-picked onto tentacle conflicted with
PR #66168 which had added test_suspended_delete_marker_incremental_sync
using the old boto API. The conflict resolution added the boto3-rewritten
version of the function but left the original old-API version in place,
resulting in duplicate definitions with the same name.
Remove the stale old-API duplicate; keep the boto3 version added by PR #67318.
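Why a leftover duplicate matters can be shown with plain Python semantics: a second `def` with the same name silently rebinds it, so only the definition that appears last in the file ever runs. The function name and return values here are placeholders, not the real test.

```python
def test_sync_sketch():
    # Stands in for the stale version written against the old boto API.
    return "old boto API"


def test_sync_sketch():  # same name: rebinds the module-level name
    # Stands in for the boto3 rewrite.
    return "boto3"


result = test_sync_sketch()
```

Python raises no error for the redefinition, which is how such a duplicate can survive a conflict resolution unnoticed; linters usually flag it as a redefinition.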
* refs/pull/68505/head:
rgw: reenable 'bucket stats' on indexless buckets
rgw: 'bucket stats' omits usage for buckets on other zonegroups
rgw/rados: pass SiteConfig into bucket_stats()
rgw: bucket_stats() uses local variable 'index'
* refs/pull/66769/head:
test/rgw/logging: run teuthology on erasure coded pool
rgw/bucket-logging: support for EC pools
rgw/logging: do not create empty temporary objects
rgw/logging: deleting the object holding the temp object name on cleanup
rgw/logging: make sure source bucket is in the target's list
rgw/logging: removed unused APIs from header
rgw/logging: fix race condition when name update returns ECANCELED
rgw/logging: add error message when log_record fails
rgw/logging: rollover objects when conf changes
rgw/logging: allow committing empty objects
rgw/logging: verify http method exists
rgw/logging: fix/remove/add bucket logging op names
rgw/logging: refactor canonical_name()
rgw/logging: fix canonical names
rgw: RGWPostBucketLoggingOp uses yield context
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
* refs/pull/67923/head:
rgw/test [tentacle]: add missing ceph and call_ceph shell functions
RGW/test_multi/RGWBucketFullSyncCR: test bucket full sync while source bucket is deleted in the middle
RGW/multisite/RGWListRemoteBucketCR: clear reused bucket_list_result to avoid stale listings
RGW/multisite: bucket_list_result object provides a method to reset its entries
RGW/multisite: add some more debug logs to sync codepath
RGW/test_multi: remove unused import
RGW/test_multi: allow Cluster object to run ceph admin commands
RGW: add delay injection options for integration testing
RGW: make SSTR macro safe against variable name collisions
* refs/pull/68374/head:
qa: enforce centos9 for test
qa: rename distro
qa/suites/fs/bugs: use centos9 for squid upgrade test
qa: remove unused variables
qa: use centos9 for fs suites using k-testing
qa: update fs suite to rocky10
qa: skip dashboard install due to dependency noise
qa: only setup nat rules during bridge creation
qa: correct wording of comment
qa: use nft instead of iptables
qa: use py3 builtin ipaddress module
* refs/pull/68569/head:
tentacle: only add package tasks if rocky is the final distro
qa/distros: add centos 9 stream back to supported distros
qa/distros: re-install nvme-cli package in rocky tests
qa: allowlist bpf podman denials on Rocky 10
qa/distros: bump rocky to 10.1
qa/distros: add rocky_10 as supported container host
qa/distros: bump rpm_latest.yaml to rocky_10.yaml
qa/distros: rename centos_latest.yaml to rpm_latest.yaml
qa/distros: add rocky_9 and rocky_10
* refs/pull/67507/head:
qa/workunits/rados/test_envlibrados_for_rocksdb.sh: Add Rocky support
qa/workunits/ceph-helpers-root: Add Rocky support for install packages
* refs/pull/67515/head:
pybind/orchestrator/cli: fix OrchestratorError retval sign
orchestrator/test/test_orchestrator: fix return code to negative
mgr/mgr_module: fix tox test missing a type annotation
mgr/selftest: mypy error fix missing a type annotation
mgr/dashboard: use __name__ for module-specific logging
selftest: Add logging self tests
pybind/mgr/mgr_module: isolate logging per mgr module
mgr/Gil.cc: simplify Gil(), ~Gil()
mgr/Gil.cc: do not use PyGILState_Check()
mgr: add mgr_subinterpreter_modules config
python-common/.../service_spec: implement ServiceSpec.__getnewargs__ to allow unpickle to work correctly
mgr: serialize python objects sent between subinterpreters via remote
Patrick Donnelly [Tue, 14 Apr 2026 00:47:43 +0000 (20:47 -0400)]
qa: rename distro
The kernel mount overrides for the distro have no effect if they are
applied before `supported-random-distro`.
Fixes:
2026-04-13T19:06:13.603 INFO:teuthology.task.pexec:sudo dnf remove nvme-cli -y
2026-04-13T19:06:13.603 INFO:teuthology.task.pexec:sudo dnf install nvmetcli nvme-cli -y
2026-04-13T19:06:13.626 INFO:teuthology.task.pexec:Running commands on host ubuntu@trial005.front.sepia.ceph.com
2026-04-13T19:06:13.627 INFO:teuthology.task.pexec:sudo dnf remove nvme-cli -y
2026-04-13T19:06:13.627 INFO:teuthology.task.pexec:sudo dnf install nvmetcli nvme-cli -y
2026-04-13T19:06:13.652 INFO:teuthology.orchestra.run.trial148.stderr:sudo: dnf: command not found
2026-04-13T19:06:13.653 DEBUG:teuthology.orchestra.run:got remote process result: 1
2026-04-13T19:06:13.654 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/run_tasks.py", line 105, in run_tasks
manager = run_one_task(taskname, ctx=ctx, config=config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/run_tasks.py", line 83, in run_one_task
return task(**kwargs)
^^^^^^^^^^^^^^
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/task/pexec.py", line 149, in task
with parallel() as p:
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/parallel.py", line 84, in __exit__
for result in self:
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/parallel.py", line 98, in __next__
resurrect_traceback(result)
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/parallel.py", line 30, in resurrect_traceback
raise exc.exc_info[1]
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/parallel.py", line 23, in capture_traceback
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/task/pexec.py", line 62, in _exec_host
tor.wait([r])
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/orchestra/run.py", line 485, in wait
proc.wait()
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/orchestra/run.py", line 161, in wait
self._raise_for_status()
File "/home/teuthworker/src/git.ceph.com_teuthology_426ec63bc4a39bba882efb593125294667afc593/teuthology/orchestra/run.py", line 181, in _raise_for_status
raise CommandFailedError(
teuthology.exceptions.CommandFailedError: Command failed on trial148 with status 1: 'TESTDIR=/home/ubuntu/cephtest bash -s'
This failure occurred because these dnf commands were pulled from rocky10.yaml via the kclient overrides, but ubuntu_latest was selected as the random distro.
Patrick Donnelly [Thu, 12 Feb 2026 15:36:29 +0000 (10:36 -0500)]
qa: use centos9 for fs suites using k-testing
A better approach would be to include centos9 OR rocky10 for
distribution choice. Then we can just filter out rocky10 when we're
testing the `testing` kernel but keep rocky10 coverage for other
testing.
Patrick Donnelly [Wed, 19 Nov 2025 17:25:45 +0000 (12:25 -0500)]
qa: skip dashboard install due to dependency noise
2025-11-18T19:46:46.226 INFO:teuthology.orchestra.run.smithi008.stdout:/usr/bin/ceph: stderr Error ENOTSUP: Module 'alerts' is not enabled/loaded (required by command 'dashboard set-ssl-certificate'): use `ceph mgr module enable alerts` to enable it
OrchestratorError stores errno as abs(), so e.errno is always positive.
Returning retval=e.errno (+22) caused the ceph CLI to exit 0 since it
only propagates the exit code when ret < 0.
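The sign problem can be reproduced in a few lines. This is a sketch under assumptions: `SketchOrchestratorError`, `retval_for`, and `cli_exit_code` are hypothetical names that model only what the message states (errno stored as an absolute value, and the CLI propagating an exit code only when ret < 0).

```python
import errno


class SketchOrchestratorError(Exception):
    """Hypothetical model: errno is normalised to a positive value."""

    def __init__(self, msg, errno_=-errno.EINVAL):
        super().__init__(msg)
        self.errno = abs(errno_)


def retval_for(e):
    # Fixed behaviour: hand back a negative return value.
    return -e.errno


def cli_exit_code(ret):
    # Assumption from the commit message: only ret < 0 is propagated.
    return abs(ret) if ret < 0 else 0


err = SketchOrchestratorError("invalid argument")
buggy_exit = cli_exit_code(err.errno)        # positive retval is swallowed
fixed_exit = cli_exit_code(retval_for(err))  # negative retval is propagated
```

With the positive value the command appears to succeed; negating it restores a non-zero exit status.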
NitzanMordhai [Tue, 3 Mar 2026 12:17:07 +0000 (12:17 +0000)]
mgr/dashboard: use __name__ for module-specific logging
Previously, using a hard-coded logger name like 'rgw_client' created
a top-level logger that bypassed the 'mgr.dashboard' hierarchy.
By switching to __name__, we ensure the logger identity follows the
package structure (e.g., 'mgr.dashboard.services.rgw_client').
Since propagate=True is enabled, this allows log records to flow
upward through the 'mgr' parent loggers, ensuring they are correctly
captured, formatted, and attributed to the dashboard module rather than
falling back to the root logger.
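The difference between the two logger names can be checked with the standard logging module alone. In this sketch the handler on 'mgr.dashboard' and the `records` list exist only for demonstration; the logger names are the ones quoted in the message.

```python
import logging

records = []


class ListHandler(logging.Handler):
    """Collects the names of loggers whose records reach this handler."""

    def emit(self, record):
        records.append(record.name)


parent = logging.getLogger("mgr.dashboard")
parent.setLevel(logging.INFO)
parent.addHandler(ListHandler())

# Hard-coded name: a top-level logger whose only ancestor is the root logger.
flat = logging.getLogger("rgw_client")
# What __name__ would yield inside the package, per the example above.
nested = logging.getLogger("mgr.dashboard.services.rgw_client")

flat.warning("not seen by the mgr.dashboard handler")
nested.warning("propagates up to the mgr.dashboard handler")
```

Only the record from the nested logger reaches the handler attached to 'mgr.dashboard'; the flat logger's record goes straight to the root logger.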
NitzanMordhai [Thu, 12 Feb 2026 09:13:41 +0000 (09:13 +0000)]
pybind/mgr/mgr_module: isolate logging per mgr module
After PR #66244, all mgr modules run inside the same Python interpreter.
That means they also share the same logging subsystem.
Previously, each module attached its handlers to the root logger. In practice,
whichever module initialized logging last effectively “owned” the root logger,
and log messages from other modules could end up attributed incorrectly.
This change scopes logging per module. Each module now registers its handlers
on a dedicated logger named after the module itself, with propagate=False to avoid
leaking messages into the root logger.
Now, the getLogger() default (no args) returns the module's named logger
rather than the root logger. This ensures self.log routes correctly.
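A minimal sketch of the per-module scoping, assuming nothing about the real mgr_module code beyond what is described above: `setup_module_logger`, `TaggingHandler`, and the module names are hypothetical.

```python
import logging


class TaggingHandler(logging.Handler):
    """Records which module's handler received each message."""

    def __init__(self, module_name, sink):
        super().__init__()
        self.module_name = module_name
        self.sink = sink

    def emit(self, record):
        self.sink.append((self.module_name, record.getMessage()))


def setup_module_logger(module_name, sink):
    # One named logger per module instead of handlers on the root logger.
    logger = logging.getLogger(module_name)
    logger.setLevel(logging.INFO)
    logger.addHandler(TaggingHandler(module_name, sink))
    logger.propagate = False  # keep records out of the root logger
    return logger


sink = []
log_a = setup_module_logger("sketch_module_a", sink)
log_b = setup_module_logger("sketch_module_b", sink)
log_a.info("from a")
log_b.info("from b")
```

Each message is handled exactly once, by the handler of the module that emitted it, regardless of which module set up logging last.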
Samuel Just [Fri, 7 Nov 2025 23:56:14 +0000 (23:56 +0000)]
mgr: add mgr_subinterpreter_modules config
This commit adds a mgr_subinterpreter_modules config to cause specified
modules (or all if * is specified) to be loaded in individual
subinterpreters.
This changes the default behavior of ceph-mgr from running each module
in a distinct subinterpreter to running them all in the same main
interpreter. We can reintroduce subinterpreter support over time by
adding modules to the list as we test them.
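The selection rule can be sketched as a small predicate. The comma-separated format and the helper name `uses_subinterpreter` are assumptions made for illustration; the message only states that listed modules, or all modules when * is given, are loaded in individual subinterpreters.

```python
def uses_subinterpreter(module_name, configured):
    """Return True if the module should get its own subinterpreter.

    `configured` models the mgr_subinterpreter_modules value as a
    comma-separated string (an assumption about the format).
    """
    names = {n.strip() for n in configured.split(",") if n.strip()}
    return "*" in names or module_name in names


default_choice = uses_subinterpreter("dashboard", "")
wildcard_choice = uses_subinterpreter("dashboard", "*")
listed_choice = uses_subinterpreter("dashboard", "dashboard, cephadm")
```

With an empty value every module stays in the main interpreter, matching the new default described above.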
Fixes: https://tracker.ceph.com/issues/73857
Fixes: https://tracker.ceph.com/issues/73859
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 239b0dc8a9c42449ee1faa1bf78bdcc380345ae2)
Conflicts:
- src/mgr/PyModule.cc
#include "common/JSONFormatter.h" - removed; commit 3ab70dd is not in tentacle
dtor - commit 3366ef5 is still missing on tentacle, causing conflicts;
taking tentacle changes with use_main_interpreter and end the interpreter pMyThreadState.ts
mgr/dashboard: Update permissions for pool-manager role
Fixes https://tracker.ceph.com/issues/76307
- says denied access when clicked on create pool table action
- this was happening due to the failing monitor API added for stretch cluster configuration
- also updates overview nav permissions
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/shared/models/service.interface.ts
- conflict because the certmgr interfaces are missing in tentacle;
they are not needed, hence removed them
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/cephfs/cephfs.module.ts
- While merging, TagModule was missing from ceph.module.ts in the incoming changes,
so it was added in this PR to align with the existing setup
Signed-off-by: pujaoshahu <pshahu@redhat.com>
* refs/pull/67566/head:
mgr/dashboard: show rados ns in 'ceph nvmeof top io'
mgr/dashboard: validate args in nvmeof top cmds
src/pybind/mgr: Add nvmeof-top tool
Gil Bregman [Mon, 23 Feb 2026 10:56:54 +0000 (12:56 +0200)]
nvmeof: Change the NVMEOF image version to 1.6
Fixes: https://tracker.ceph.com/issues/75097
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 02587347b0a4e7ae1d7f5d738bd33808e2d56bc9)
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
Conflicts:
src/python-common/ceph/cephadm/images.py
We forgot to set version 1.6 in time. By the time we got to it, the main
branch had already moved to version 1.7, which contains features
we don't have in tentacle. We want version 1.6 for tentacle, so we
changed the commit here to use 1.6.
Conflicts:
monitoring/ceph-mixin/dashboards_out/ceph-application-overview.json
- schema version different in tentacle (41)
- commit https://github.com/ceph/ceph/commit/6754d7a28fbf598468dd0a5d4792f177da239064 is absent in tentacle; it brings tags entries that are not defined in tentacle
Aashish Sharma [Tue, 31 Mar 2026 04:30:23 +0000 (10:00 +0530)]
mgr/dashboard: Add option to edit zone with keys/arguments like "sync_from" and "sync_from_all"
Currently, there is no option to configure the sync_from and sync_from_all keys directly while creating or editing a zone from the dashboard.
These arguments are particularly important when setting up archive zones: duplicate objects appear in an archive zone when sync_from_all is set to true (which is the default). The fix is to:
1. Set sync_from_all to false
2. Set sync_from to point to the master zone only
This ensures that the archive zone syncs exclusively from the master zone, preventing duplicate object issues.
Conflicts:
src/pybind/mgr/dashboard/frontend/src/styles.scss
- recently merged PRs appended more rules at the end of the file, which conflicted with this commit because it also adds a rule at the end