node-proxy: split out config, bootstrap and redfish logic
Refactor config, bootstrap, redfish layer, and monitoring.
This:
- adds a config module (CephadmConfig, load_cephadm_config and
get_node_proxy_config) and protocols for api/reporter.
- extracts redfish logic to redfish.py.
- adds a vendor registry with entrypoints.
- simplifies main() and NodeProxyManager().
This commit renames CONFIG to DEFAULTS and adds load_config() with
deep merge, refactors Config to use path + defaults, and makes the
node-proxy config path configurable via bootstrap JSON or env.
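The deep merge mentioned above can be sketched roughly as follows. This is only an illustration of the technique: the keys and values in DEFAULTS and the deep_merge() helper name are hypothetical, and the real load_config() may differ.

```python
import copy
from typing import Any, Dict

# Hypothetical defaults; the real DEFAULTS mapping is not shown in this log.
DEFAULTS: Dict[str, Any] = {
    'reporter': {'check_interval': 5, 'max_retries': 30},
    'api': {'port': 9456},
}


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which override wins and nested dicts are merged."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge recursively instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# A user config that overrides a single nested key.
user_config = {'reporter': {'check_interval': 10}}
config = deep_merge(DEFAULTS, user_config)
```

With a deep merge, a user file only needs to list the keys it changes; sibling keys under the same section keep their default values.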
node-proxy: introduce component spec registry and overrides for updates
This change introduces a single COMPONENT_SPECS dict and get_update_spec(component)
as the single source of truth for Redfish component update config (collection, path,
fields, attribute). To support hardware that uses different paths or attributes,
get_component_spec_overrides() allows overriding only those fields (via dataclasses.replace())
without duplicating the rest of the spec.
For instance, AtollonSystem uses an override to set the power path to 'PowerSubsystem'.
All _update_network, _update_power, etc. now call _run_update(component).
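A minimal sketch of the registry-plus-override pattern, assuming a frozen dataclass for the spec. The ComponentSpec class, the BaseSystem class, the field values and the shape of the overrides mapping are assumptions made for illustration; only the names COMPONENT_SPECS, get_update_spec, get_component_spec_overrides, AtollonSystem and the 'PowerSubsystem' path come from the commit message.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class ComponentSpec:
    # Hypothetical layout of one component's update config.
    collection: str
    path: str
    fields: Tuple[str, ...]
    attribute: str


# Illustrative values only; the real registry is not shown in this log.
COMPONENT_SPECS: Dict[str, ComponentSpec] = {
    'power': ComponentSpec('Chassis', 'Power', ('Name', 'Status'), 'PowerSupplies'),
}


class BaseSystem:
    def get_component_spec_overrides(self) -> Dict[str, Dict[str, Any]]:
        # No vendor-specific overrides by default.
        return {}

    def get_update_spec(self, component: str) -> ComponentSpec:
        spec = COMPONENT_SPECS[component]
        overrides = self.get_component_spec_overrides().get(component, {})
        # replace() copies the spec and changes only the overridden fields.
        return replace(spec, **overrides) if overrides else spec


class AtollonSystem(BaseSystem):
    def get_component_spec_overrides(self) -> Dict[str, Dict[str, Any]]:
        return {'power': {'path': 'PowerSubsystem'}}
```

Because the shared spec is never mutated, one vendor's override cannot leak into another vendor's view of the same component.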
mgr/cephadm: safe status/health access in node-proxy agent and inventory
This adds helpers in NodeProxyEndpoint and NodeProxyCache to safely
read status.health and status.state.
In NodeProxyEndpoint, methods _get_health_value() and _get_state_value()
are used in get_nok_members() to avoid KeyError on malformed data.
In NodeProxyCache, _get_health_value(), _has_health_value(),
_is_error_status(), and _is_unknown_status() are used in fullreport()
and when filtering 'non ok' members instead of accessing
status['status']['health'] inline.
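The idea behind these helpers can be illustrated with a small standalone sketch. The function bodies below are assumptions, not the code from this commit; in particular, skipping members that carry no health value at all is a choice made here for the example.

```python
from typing import Any, Dict, Optional


def get_health_value(member: Any) -> Optional[str]:
    """Return status.health as a string, or None if the data is malformed."""
    if not isinstance(member, dict):
        return None
    status = member.get('status')
    if not isinstance(status, dict):
        return None
    health = status.get('health')
    return health if isinstance(health, str) else None


def get_nok_members(members: Dict[str, Any]) -> Dict[str, Any]:
    """Collect members whose health is present and not 'ok'."""
    nok = {}
    for name, member in members.items():
        health = get_health_value(member)
        # Assumption: members without a readable health value are skipped.
        if health is not None and health.lower() != 'ok':
            nok[name] = member
    return nok
```

Compared with indexing member['status']['health'] directly, a missing or mistyped level yields None rather than a KeyError or TypeError.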
node-proxy: narrow build_data exception handling and re-raise
This commit catches only KeyError, TypeError, and
AttributeError in build_data() instead of Exception, and
re-raises after logging so callers get the actual error.
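The log-then-re-raise pattern looks roughly like this. The body of build_data() and the input layout are hypothetical; only the set of caught exceptions and the re-raise come from the commit message.

```python
import logging
from typing import Any, Dict

log = logging.getLogger('node-proxy-example')


def build_data(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Build a small summary dict from a (hypothetical) raw Redfish member."""
    try:
        return {'name': raw['Name'], 'health': raw['Status']['Health']}
    except (KeyError, TypeError, AttributeError) as e:
        # Log for the operator, then let the caller see the original error.
        log.error('could not build data: %r', e)
        raise
```

A bare `raise` keeps the original exception type and traceback, so callers can still tell a missing key from a wrongly typed value.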
node-proxy: refactor Endpoint/EndpointMgr and fix chassis paths
This commit refactors EndpointMgr and Endpoint to use explicit dicts
instead of dynamic attributes. It also fixes member path filtering
so chassis endpoints use Chassis paths.
node-proxy: reduce log verbosity for missing optional fields
Change missing field logging from warning to debug level in
RedfishDellSystem, as missing optional fields can be expected behavior
and don't require warning level logging.
Ville Ojamo [Tue, 3 Feb 2026 06:28:12 +0000 (13:28 +0700)]
doc: unpin pip in admin/doc-read-the-docs.txt
7dd00ca introduced a proper fix for pip 25.3/PEP517 compatibility by
adding pyproject.toml files and the workaround in a65c46c is no longer
necessary. RTD builds with pip 25.3 and later work with the proper fix.
Remove the pinned pip in admin/doc-read-the-docs.txt and let RTD use the
default pip version.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Shraddha Agrawal [Thu, 29 Jan 2026 04:28:00 +0000 (09:58 +0530)]
ceph-volume: support crimson osd binary
Prior to this commit, ceph-volume used a hardcoded OSD binary
to issue commands (e.g. to perform mkfs). This commit enables
ceph-volume to support crimson OSDs.
A new argument, --osd-type, is introduced with the default value
classic. When this parameter is set to 'crimson', the ceph-osd-crimson
binary is used to execute OSD commands.
This commit also enables deploying both classic and crimson
OSDs using cephadm. To that end, a new field,
osd_type, is added to DriveGroupSpec. Its default value
is classic, but it can also be set to crimson.
When cephadm reads the value crimson, the entrypoint is
changed from /usr/bin/ceph-osd to /usr/bin/ceph-osd-crimson.
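As a rough illustration of how an --osd-type option can select the binary, here is a standalone argparse sketch. It is not the ceph-volume code: the parser, the OSD_BINARIES mapping and the osd_binary() helper are hypothetical, and only the option name, its default and the two binary names come from the commit message.

```python
import argparse
from typing import List

# Mapping from OSD type to the binary used to run OSD commands.
OSD_BINARIES = {'classic': 'ceph-osd', 'crimson': 'ceph-osd-crimson'}

parser = argparse.ArgumentParser(prog='osd-type-example')
parser.add_argument(
    '--osd-type',
    choices=sorted(OSD_BINARIES),
    default='classic',
    help='which OSD implementation to use',
)


def osd_binary(argv: List[str]) -> str:
    """Return the OSD binary name selected by the given arguments."""
    args = parser.parse_args(argv)
    return OSD_BINARIES[args.osd_type]
```

Restricting the option with `choices` makes argparse reject unknown OSD types before any command is run.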
- updates tearsheet component css to match the carbon component
- adds loading state to submit button
- adds support for step validation when angular components are used for steps rather than plain html templates
- adds step one of nvmeof
Ilya Dryomov [Fri, 30 Jan 2026 15:32:35 +0000 (16:32 +0100)]
qa/tasks/rbd_mirror_thrash: don't use random.randrange() on floats
This stopped working in Python 3.12:
Changed in version 3.12: Automatic conversion of non-integer types
is no longer supported. Calls such as randrange(10.0) and
randrange(Fraction(10, 1)) now raise a TypeError.
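Two common ways to avoid passing floats to random.randrange() are shown below. Which one this commit uses is not stated here, and the variable names and bounds are hypothetical.

```python
import random

# Hypothetical float bounds, e.g. read from a YAML config.
min_interval = 10.0
max_interval = 60.0

# Option 1: keep float semantics and draw a float in [min, max].
delay = random.uniform(min_interval, max_interval)

# Option 2: keep randrange() but convert the bounds to int explicitly.
steps = random.randrange(int(min_interval), int(max_interval))
```

random.uniform() returns a float, while randrange() with integer bounds returns an int from the half-open range, so the choice depends on what the caller expects.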
Ilya Dryomov [Tue, 11 Nov 2025 15:33:16 +0000 (16:33 +0100)]
qa/tasks/qemu: install genisoimage package
genisoimage is expected to be included in our base images but currently
isn't on Rocky 10. Since it's quite a niche thing, let's install the
package explicitly.
Ilya Dryomov [Thu, 29 Jan 2026 20:41:03 +0000 (21:41 +0100)]
qa/workunits/rbd: reduce randomized sleeps in live import tests
These tests were tuned for slower hardware than what we have now.
Currently "rbd migration execute" always finishes (successfully) before
the NBD server is killed.
Ilya Dryomov [Tue, 11 Nov 2025 20:39:58 +0000 (21:39 +0100)]
qa/valgrind.supp: make gcm_cipher_internal suppression more resilient
gcm_cipher_internal() and ossl_gcm_stream_final() make it to the stack
trace only on CentOS Stream 9. On Ubuntu 22.04 and Rocky 10, it looks
as follows:
Thread 4 msgr-worker-1:
Conditional jump or move depends on uninitialised value(s)
at 0x70A36D4: ??? (in /usr/lib64/libcrypto.so.3.2.2)
by 0x70A39A1: ??? (in /usr/lib64/libcrypto.so.3.2.2)
by 0x6F8A09C: EVP_DecryptFinal_ex (in /usr/lib64/libcrypto.so.3.2.2)
by 0xB498C1F: ceph::crypto::onwire::AES128GCM_OnWireRxHandler::authenticated_decrypt_update_final(ceph::buffer::v15_2_0::list&) (crypto_onwire.cc:271)
by 0xB4992D7: ceph::msgr::v2::FrameAssembler::disassemble_preamble(ceph::buffer::v15_2_0::list&) (frames_v2.cc:281)
by 0xB482D98: ProtocolV2::handle_read_frame_preamble_main(std::unique_ptr<ceph::buffer::v15_2_0::ptr_node, ceph::buffer::v15_2_0::ptr_node::disposer>&&, int) (ProtocolV2.cc:1149)
by 0xB475318: ProtocolV2::run_continuation(Ct<ProtocolV2>&) (ProtocolV2.cc:54)
by 0xB457012: AsyncConnection::process() (AsyncConnection.cc:495)
by 0xB49E61A: EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) (Event.cc:492)
by 0xB49EA9D: UnknownInlinedFun (Stack.cc:50)
by 0xB49EA9D: UnknownInlinedFun (invoke.h:61)
by 0xB49EA9D: UnknownInlinedFun (invoke.h:111)
by 0xB49EA9D: std::_Function_handler<void (), NetworkStack::add_thread(Worker*)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (std_function.h:290)
by 0xBB11063: ??? (in /usr/lib64/libstdc++.so.6.0.33)
by 0x4F17119: start_thread (in /usr/lib64/libc.so.6)
The proposal to amend the existing suppression so that it's tied to the
specific callsite rather than libcrypto internals [1] received a thumbs
up from Radoslaw.
Roland Sommer [Fri, 30 Jan 2026 07:54:49 +0000 (08:54 +0100)]
debian: package mgr/smb in ceph-mgr-modules-core
The `BaseController` auto-imports the packaged `mgr/dashboard/controllers/smb.py`
file, which in turn wants to import `smb.enums` etc. which is part of the `smb`
package which is missing from `debian/ceph-mgr-modules-core.install`, thus
missing in the package. The missing module causes an exception
`ModuleNotFoundError: No module named 'smb'` on mgr instances when running a
ceph tentacle cluster installed from debian packages.
See: https://tracker.ceph.com/issues/74268
Signed-off-by: Roland Sommer <rol@ndsommer.de>
Afreen Misbah [Wed, 28 Jan 2026 09:59:08 +0000 (15:29 +0530)]
mgr/dashboard: fetch all namespaces in a gateway group
- adds a new API /api/gateway_group/{group}/namespace
- updates tests
- needed for UI flows and in general to fetch all namespaces; the existing API could not be changed because backward compatibility must be maintained
- server-side pagination will be added in a follow-up PR
Ville Ojamo [Fri, 30 Jan 2026 04:47:40 +0000 (11:47 +0700)]
doc/dev: add sequence diagrams back to health-reports.rst
The sequence diagrams were removed in ce96ddd because they were causing
issues. Add them back as SVG images. Include as comments the source code
used to generate the diagrams.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
John Mulligan [Thu, 29 Jan 2026 23:28:44 +0000 (18:28 -0500)]
Merge pull request #65632 from phlogistonjohn/jjm-smb-hosts-allow
smb: support shares equivalent for hosts allow
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
Reviewed-by: Shwetha Acharya <sacharya@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>
John Mulligan [Fri, 9 Jan 2026 16:25:43 +0000 (11:25 -0500)]
qa/workunits/smb: make the runner script easier to use manually
When testing the tests it can help speed things up to avoid
recreating the virtualenv. Allow an env var SMB_REUSE_VENV=<path>
to supply a specific virtual env dir to (re)use.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 8 Jan 2026 18:42:14 +0000 (13:42 -0500)]
qa/suites/orch/cephadm: enable hosts_access tests
Enable the hosts_access tests when running deploy_smb_mgr_basic.yaml,
deploy_smb_mgr_domain.yaml, deploy_smb_mgr_res_basic.yaml, or
deploy_smb_mgr_res_dom.yaml.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 8 Jan 2026 18:45:43 +0000 (13:45 -0500)]
qa/workunits/smb: add tests for hosts_access field
The recently added hosts_access field allows a share to be configured
to allow or deny hosts by IP or network. The new module reconfigures
a share to attempt a small set of access scenarios with the hosts_access
field.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 19 Nov 2025 22:26:27 +0000 (17:26 -0500)]
qa/workunits/smb: add utility module for cephadm shell commands
Add a helper module that makes it a bit cleaner and easier to
find and interact with the cluster's 'admin node', the node where
we can run `cephadm shell` and commands within that shell.
This will allow us to make modifications to smb resources via
the ceph command and JSON in order to test various features.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Fri, 9 Jan 2026 14:32:56 +0000 (09:32 -0500)]
qa/workunits/smb: make the smb_cfg fixture module scoped
This means the file will only be read when pytest changes modules.
This also allows this fixture to be used with other fixtures at the
module scope or any scope "higher" than the function scope.
John Mulligan [Fri, 9 Jan 2026 16:12:46 +0000 (11:12 -0500)]
qa/tasks: add client node info to smb workunit config dump
When generating the big ball of config JSON that helps define
parameters for the smb tests in the workunit add client "node"
info as well.
Add a function to avoid repeating the logic of getting node
info from the teuthology remote object.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 7 Jan 2026 23:02:21 +0000 (18:02 -0500)]
qa/tasks: embed use of ssh_keys task in smb workunit
Automatically use the ssh_keys tasks in the smb workunit task.
It can be disabled by passing false to `ssh_keys:` config key.
This allows the node running the tests to ssh into the node where
cephadm is installed in order to execute commands within
the cephadm shell.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Using the Share resource's hosts_access parameter, generate
smb.conf-equivalent configuration for the 'hosts allow' and 'hosts deny'
configuration parameters. Note that currently we automatically set hosts deny
to all if *any* hosts allow is set to avoid the possibly surprising
result of explicitly setting hosts to allow and then having the share
continue to allow hosts not explicitly listed.
If needed, in the future we could allow the user to override the
default deny - but I'm trying to keep it real simple for now.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
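The allow/deny behaviour described in this commit can be sketched as a small standalone function. This is not the mgr/smb code: the hosts_options() name, the space-separated output, the use of 'ALL' as the deny-everything value, and dropping explicit deny entries whenever an allow entry exists are all simplifying assumptions.

```python
from typing import Dict, List


def hosts_options(hosts_access: List[Dict[str, str]]) -> Dict[str, str]:
    """Turn hosts_access entries into smb.conf-style option values."""
    allow: List[str] = []
    deny: List[str] = []
    for entry in hosts_access:
        target = entry.get('address') or entry.get('network')
        if not target:
            raise ValueError('entry needs an address or a network')
        if entry.get('access') == 'allow':
            allow.append(target)
        elif entry.get('access') == 'deny':
            deny.append(target)
        else:
            raise ValueError('access must be allow or deny')
    opts: Dict[str, str] = {}
    if allow:
        opts['hosts allow'] = ' '.join(allow)
        # Any allow entry implies denying every host not listed.
        opts['hosts deny'] = 'ALL'
    elif deny:
        opts['hosts deny'] = ' '.join(deny)
    return opts
```

The implicit deny means a share with one allowed address stops accepting all other hosts, which matches the intent stated in the commit message.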
John Mulligan [Mon, 22 Sep 2025 18:44:30 +0000 (14:44 -0400)]
mgr/smb: add a new hosts_access field to the Share resource
This access list can be used to allow or deny access to hosts by
IP address or network (IP/prefixlen-style). It partially borrows
from the previous work to do IP address binds.
The structure would look something like the following:
```
hosts_access:
  - address: 192.168.7.200
    access: allow
  - address: 192.168.7.202
    access: allow
  - network: 10.10.220.0/24
    access: allow
```
or
```
hosts_access:
  - access: deny
    network: 10.10.220.0/24
```
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Fri, 26 Sep 2025 18:22:12 +0000 (14:22 -0400)]
python-common/smb: move network conversion validation func to common
Extract code from the service_spec.py file that parses, validates and
converts network or ip address strings into a network object into a new
file so that it can be re-used more widely later.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
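Parsing and validating a network or IP address string can be done with the standard ipaddress module, roughly as below. The to_network() name and the error message are hypothetical, and the extracted function may behave differently, for example in how it treats host bits.

```python
import ipaddress
from typing import Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def to_network(value: str) -> Network:
    """Convert an address or address/prefixlen string into a network object."""
    try:
        # A bare address becomes a single-host network (/32 or /128).
        return ipaddress.ip_network(value, strict=True)
    except ValueError as err:
        raise ValueError(f'invalid network or address: {value!r}') from err
```

With strict=True, a value such as 10.10.220.5/24 is rejected because host bits are set; passing strict=False would instead mask it down to 10.10.220.0/24.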
Kefu Chai [Wed, 28 Jan 2026 02:58:31 +0000 (10:58 +0800)]
pybind/rbd: move legacy_implicit_noexcept to rbd.pyx
Move the legacy_implicit_noexcept compiler directive from setup.py to
the top of rbd.pyx, making it consistent with how CephFS handles this
directive. This simplifies the build setup by:
- Removing conditional logic based on Cython version in setup.py
- Eliminating the need for compiler_directives dict and packaging import
- Making RBD's directive handling consistent with other bindings
The directive is needed for building with both Cython 0.x and Cython 3
from the same file while preserving the same behavior. Cython safely
ignores unknown compiler directives when specified at the top of .pyx
files, so this works across all supported Cython versions.
When Cython 0.x support is eventually dropped, this directive can be
replaced with explicit noexcept annotations on rbd_callback_t and
librbd_progress_fn_t type definitions.
Kefu Chai [Fri, 23 Jan 2026 01:36:22 +0000 (09:36 +0800)]
pybind: hardwire language_level to 3
Previously, to maintain backward compatibility with Python 2, we set
'language_level' to sys.version_info.major, so the value would be 2
when building with Python 2, and 3 with Python 3. Now that Python 2
support has been dropped, we can hardwire it to "3".
This change also removes the comment about switching to
`language_level=3str` in the future. According to the Cython 3.1+
documentation,
> language_level=3 is now the default. language_level=3str has become a
> legacy alias.
see https://cython.readthedocs.io/en/3.1.x/src/changes.html.
For context, in Cython < 3.1, language_level=3 and language_level=3str
had different meanings:
- 3 = unprefixed strings are unicode
- 3str = unprefixed strings follow Python version (bytes in Py2, unicode
in Py3)
Since we no longer support Python 2, this distinction is irrelevant and
the comment can be safely removed.