git.apps.os.sepia.ceph.com Git - ceph.git/log
Guillaume Abrioux [Wed, 22 Nov 2023 14:27:09 +0000 (14:27 +0000)]
node-proxy: run only when idrac details provided
This agent shouldn't run when no idrac details are
available.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 20 Nov 2023 14:55:26 +0000 (14:55 +0000)]
cephadm: inventory.NodeProxyCache() refactor
This modifies fullreport(), summary() and common() methods
so they use the same logic as firmwares() and criticals()
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 16 Nov 2023 13:35:51 +0000 (13:35 +0000)]
cephadm/agent: add docstring to NodeProxy class
This documents that part of the code and might
help to generate an API spec and documentation.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 30 Oct 2023 15:51:56 +0000 (15:51 +0000)]
node-proxy: implement criticals endpoint
This adds the required changes in order to implement the endpoint
'/criticals'.
The goal of this endpoint is to provide a report of all critical statuses
for either a given host or all hosts across the cluster.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 26 Oct 2023 14:34:10 +0000 (14:34 +0000)]
orch/cephadm: implement `ceph orch hardware` command
This adds a first implementation of the `ceph orch hardware` CLI.
Usage:
```
ceph orch hardware status [<hostname>] [--category <value>]
```
Omitting the `[<hostname>]` argument will generate a report for all hosts.
The default for argument `[--category]` is `summary`.
Example with `--category`:
```
+------------+-------------+-------+--------+---------+
| HOST | NAME | SPEED | STATUS | STATE |
+------------+-------------+-------+--------+---------+
| ceph-00001 | eno8303 | 0 | OK | Enabled |
| ceph-00001 | eno8403 | 0 | OK | Enabled |
| ceph-00001 | eno12399np0 | 10000 | OK | Enabled |
| ceph-00001 | eno12409np1 | 10000 | OK | Enabled |
| ceph-00001 | bond0 | 10000 | OK | Enabled |
+------------+-------------+-------+--------+---------+
```
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 16 Nov 2023 09:48:02 +0000 (09:48 +0000)]
node-proxy: validate_node_proxy_data() refactor
raise cherrypy.HTTPError() when the received data is
not valid instead of returning `self.validate_msg`
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 25 Oct 2023 15:07:09 +0000 (15:07 +0000)]
node-proxy: implement http_query() helper function
This lets us drop the dependency on `requests` and
use the same helper function from both reporter.py and redfish_client.py.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
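A minimal sketch of what such a helper could look like when built only on the standard library's `urllib`. The signature, the `build_request()` split, the default port and the 503 fallback are assumptions made for illustration; they are not taken from the commit itself.

```python
from typing import Dict, Optional, Tuple
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def build_request(addr: str,
                  port: str = '443',
                  data: Optional[str] = None,
                  headers: Optional[Dict[str, str]] = None,
                  method: Optional[str] = None,
                  endpoint: str = '',
                  scheme: str = 'https') -> Request:
    """Assemble the urllib Request without sending it (hypothetical helper)."""
    url = f'{scheme}://{addr}:{port}{endpoint}'
    body = data.encode('utf-8') if data is not None else None
    return Request(url, data=body, headers=headers or {}, method=method)


def http_query(addr: str, timeout: int = 10, **kwargs: object) -> Tuple[int, str]:
    """Send the request and return (status code, body or error reason)."""
    req = build_request(addr, **kwargs)  # type: ignore[arg-type]
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode('utf-8')
    except HTTPError as e:
        # HTTPError must be caught first: it is a subclass of URLError.
        return e.code, str(e.reason)
    except URLError as e:
        # Assumption: map "endpoint unreachable" to a 503-like status.
        return 503, str(e.reason)
```

Both a reporter and a redfish client could then share `http_query()` instead of each importing `requests`.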
Guillaume Abrioux [Tue, 24 Oct 2023 11:28:11 +0000 (11:28 +0000)]
node-proxy: address mypy and flake8 errors
This addresses some flake8 and python typing errors.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 24 Oct 2023 08:43:53 +0000 (08:43 +0000)]
node-proxy: fetch idrac details from NodeProxyCache()
The class `NodeProxyCache()` is intended for that: it already
has this information, so there's no need to call `get_store()`
each time we want to access idrac details.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 23 Oct 2023 15:28:35 +0000 (15:28 +0000)]
node-proxy: parametrize idrac port
This adds the missing piece to make the idrac port
a parameter that one can customize.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 23 Oct 2023 13:42:09 +0000 (13:42 +0000)]
cephadm: add new option to CLI
This adds the `--deploy-cephadm-agent` option to the cephadm
CLI's bootstrap subcommand.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 20 Oct 2023 16:12:55 +0000 (16:12 +0000)]
node-proxy: implement /led endpoint
This is the first 'act on node' feature implementation.
This adds the endpoint /led:
- a GET request to this endpoint returns the current status
of the enclosure LED,
- a PATCH request to this endpoint sets the
enclosure LED status.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 20 Oct 2023 09:21:16 +0000 (09:21 +0000)]
node-proxy: drop dispatch() in NodeProxy()
The current logic prevents the use of any cherrypy decorators
on actual endpoints as we use a set of 'proxy functions'
(index and dispatch) instead.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 19 Oct 2023 07:42:24 +0000 (07:42 +0000)]
node-proxy: local API (NodeProxy) refactor
- subclass cherrypy._cpserver.Server,
- drop cherrypy.quickstart() call,
- drop nested classes approach,
- make it run over https
- print tracebacks when an exception is raised
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 13 Oct 2023 12:15:21 +0000 (12:15 +0000)]
node-proxy: clean up node_proxy dir
This removes a legacy file that is not needed any longer.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 13 Oct 2023 12:09:56 +0000 (12:09 +0000)]
node-proxy: collect firmwares details
This makes all the required changes in order to support
collecting, pushing and exposing data regarding firmwares
status and versions for all the underlying hardware.
This also refactors the redfish dell corresponding logic:
Having so many nested/inheritance classes seems unnecessary.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 12 Oct 2023 13:29:19 +0000 (13:29 +0000)]
node-proxy: update the JSON data structure
Change the data structure from:
```
{
"storage": "ok",
"processors": "ok",
"network": "ok",
"memory": "ok",
"power": "ok",
"fans": "ok"
}
```
to:
```
{
"host": "node1",
"sn": "xxxx",
"status": {
"storage": {
}
}
}
```
This provides a unique and more reliable key (sn) at the top
level of the dict.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 11 Oct 2023 15:15:50 +0000 (15:15 +0000)]
node-proxy: quick clean up
This removes some files which are not needed any longer.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 11 Oct 2023 14:50:40 +0000 (14:50 +0000)]
node-proxy: run all update functions in parallel
This makes the update logic run faster.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
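The commit above does not say how the update functions are parallelized. One plausible approach, sketched here under that caveat, is a thread pool from `concurrent.futures`; the function name `run_updates()` and the shape of its argument are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_updates(update_funcs: Dict[str, Callable[[], dict]]) -> Dict[str, dict]:
    """Run every update function in its own worker thread and collect results."""
    workers = max(1, len(update_funcs))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit everything first so the calls overlap instead of running serially.
        futures = {name: pool.submit(func) for name, func in update_funcs.items()}
        # result() blocks until the call finishes and re-raises its exception, if any.
        return {name: fut.result() for name, fut in futures.items()}
```

Since the updates are mostly waiting on HTTP responses, threads are enough here; no process pool is needed.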
Guillaume Abrioux [Wed, 11 Oct 2023 08:34:38 +0000 (08:34 +0000)]
cephadm/node-proxy: reset ceph warning when needed
This makes the mgr reset the warning when the alert is fixed.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 10 Oct 2023 12:42:42 +0000 (12:42 +0000)]
node-proxy: rename server.py -> main.py
This is going to be the entrypoint of node-proxy, let's rename
this file to main.py
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 10 Oct 2023 12:41:09 +0000 (12:41 +0000)]
node-proxy: subclass Thread class
The idea is to subclass Thread so I can catch
exceptions in threads from the main process.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
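As an illustration of the idea described above, a `Thread` subclass can store the exception raised by its target so that the main thread can inspect it later. The class name `BaseThread` and the method `check_status()` are assumptions for this sketch, not necessarily what the commit implements.

```python
import threading
from typing import Any, Optional


class BaseThread(threading.Thread):
    """Thread that remembers the exception raised by its target."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        super().__init__(*args, **kwargs)
        self.exc: Optional[BaseException] = None

    def run(self) -> None:
        try:
            super().run()  # calls the target passed to the constructor
        except Exception as e:
            # Keep the exception instead of letting it die with the thread.
            self.exc = e

    def check_status(self) -> bool:
        """Re-raise the stored exception in the caller's thread, if any."""
        if self.exc is not None:
            raise self.exc
        return True
```

The main loop can call `check_status()` on each thread periodically and decide whether to restart or abort.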
Guillaume Abrioux [Tue, 10 Oct 2023 12:38:12 +0000 (12:38 +0000)]
node-proxy: drop current main.py
This file was there for devel purposes.
Let's drop it as it is not used any longer.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 6 Oct 2023 13:55:21 +0000 (13:55 +0000)]
cephadm/node-proxy: logging issues / error handling refactor
- fix the multiple logging issue caused by a new handler
being added each time `Logger` is called
- do not propagate to the parent (root) logger, as that makes it log the messages too
- implement a new method `is_logged()` in `RedFishClient`
- refactor the logic regarding caught errors in `RedFishClient`
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
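The first two points above can be sketched with the standard `logging` module. The function name `get_logger()` and the log format are assumptions; only the two behaviours (attach the handler once, do not propagate) come from the commit message.

```python
import logging


def get_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Return a logger that has exactly one handler, however often this is called."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # logging.getLogger() returns the same object for the same name, so adding
    # a handler on every call would print each message several times.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter('%(asctime)s %(name)s %(levelname)s %(message)s'))
        logger.addHandler(handler)
    # Without this, the root logger's handlers would log the message again.
    logger.propagate = False
    return logger
```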
Guillaume Abrioux [Fri, 6 Oct 2023 11:10:39 +0000 (11:10 +0000)]
mgr/cephadm: add NodeProxyCache class
This is for tracking and caching any node-proxy data.
The node-proxy API now uses this class to serve its data.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 4 Oct 2023 10:00:26 +0000 (10:00 +0000)]
monitoring: add new alerts
This adds new hardware monitoring alerts.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 29 Sep 2023 13:05:31 +0000 (13:05 +0000)]
node-proxy: validate_node_proxy_data() refactor
This introduces minor changes in order to improve error
handling in validate_node_proxy_data()
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 27 Sep 2023 13:00:17 +0000 (13:00 +0000)]
node-proxy: lower verbosity level
This reduces the verbosity level for some messages.
These generate a lot of messages although they are needed
only for debugging purposes.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 27 Sep 2023 09:41:49 +0000 (09:41 +0000)]
node-proxy: update alert names
Given that the 'node-proxy' terminology is internal, let's change
the few node-proxy related alert names to something
more user friendly as they are intended to be seen by the user
(NODE_PROXY_xxx > HARDWARE_xxx).
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 27 Sep 2023 08:27:28 +0000 (08:27 +0000)]
node-proxy: split redfishdell class
This refactor splits the redfishdell class in order
to collect power and thermal details from the redfish API.
'power' and 'thermal' details are very different in many points:
- not available at the same endpoint,
- data structure is different.
For these two reasons, let's split that class.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 21 Sep 2023 14:52:01 +0000 (14:52 +0000)]
cephadm/agent: endpoint refactor
These changes are required in order to be able to re-use
the existing agent endpoint. The current code doesn't ease/allow
adding a new application. The idea here is to add a new class for
handling the '/node-proxy' endpoint.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 19 Sep 2023 11:49:44 +0000 (11:49 +0000)]
node-proxy: raise ceph warning(s) if needed
This makes the agent endpoint raise alert(s) when one or multiple
members of a component are critical.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 19 Sep 2023 07:55:54 +0000 (07:55 +0000)]
node-proxy: drop redfish library dependency
Given that this library isn't packaged for both
upstream and downstream and we can achieve what it was used for
directly with a lib such as `urllib` (basically just auth), let's
drop this dependency.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 19 Sep 2023 07:46:42 +0000 (07:46 +0000)]
node-proxy: logging refactor
This makes `logger` a class attribute so we don't have
the `Logger` instantiation outside of the different classes.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 19 Sep 2023 07:41:57 +0000 (07:41 +0000)]
node-proxy: add __init__.py file
In order to make node-proxy a package.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 18 Sep 2023 06:50:24 +0000 (06:50 +0000)]
node-proxy: parametrize reporter url
node-proxy entrypoint (`server.main()`) now takes two parameters
(addr / port) in order to make the reporter agent know how to reach
the http agent endpoint hosted in the mgr daemon.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 14 Sep 2023 16:10:01 +0000 (16:10 +0000)]
node-proxy: modify the endpoint url from default config
This updates the endpoint url from DEFAULT_CONFIG in order
to match the new endpoint recently added.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 14 Sep 2023 16:08:26 +0000 (16:08 +0000)]
node-proxy: update reporter agent
This commit introduces the required changes in order to make
the reporter agent query the new mgr endpoint '/node-proxy/data'
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 14 Sep 2023 15:53:34 +0000 (15:53 +0000)]
node-proxy: fetch idrac details from ceph
The idrac details are now fetched from ceph (monitor kv store) and
passed by the cephadm binary at the agent startup.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 14 Sep 2023 15:41:32 +0000 (15:41 +0000)]
mgr/cephadm: add node-proxy endpoints to the mgr
This adds 2 endpoints to the existing http agent endpoint:
- '/node_proxy/idrac': supports POST requests only, although this endpoint
is intended for fetching the idrac credentials of a given node. As we pass
sensitive details (ceph secret), I didn't want to pass them as a query parameter
in the url. Passing them in an HTTP header is perhaps a better approach, but we already
do a similar thing for the endpoint '/data' (agent), so for consistency I stuck to
that.
- '/node_proxy/data': supports GET and POST requests. A GET will return the
aggregated data for all nodes within the cluster. node-proxy will use a POST
request to that endpoint to push its collected data.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 14 Sep 2023 15:32:38 +0000 (15:32 +0000)]
cephadm/binary: add `query_endpoint()` method
This encapsulates the existing code in a new method
`query_endpoint()`.
The idea is to avoid duplicating code if we need to make multiple
calls to the agent endpoint from the `run()` method.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 14 Sep 2023 15:27:45 +0000 (15:27 +0000)]
mgr/cephadm: store oob mgmt credentials in mon kv store
The idea is to store the oob mgmt credentials into the monitor kv store
when they are passed via a host spec.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 14 Sep 2023 15:16:57 +0000 (15:16 +0000)]
python-common: update HostSpec
This adds new parameters to the current spec 'HostSpec'.
The idea is to make it possible to pass idrac credentials so
it will be possible for the node-proxy agent to consume them in order
to communicate with the redfish API.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 17 Aug 2023 09:21:00 +0000 (11:21 +0200)]
node-proxy: migrate to cephadm-agent
This moves the existing files to the new directory 'cephadmlib' so
we can make the existing code for node-proxy run within the cephadm
agent. Indeed, we can leverage the existing code for the cephadm agent
given that both daemons would achieve the same thing.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 17 Aug 2023 09:18:10 +0000 (11:18 +0200)]
node-proxy: rename directory
This renames the node-proxy directory: node-proxy > node_proxy
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 22 Jun 2023 13:54:55 +0000 (15:54 +0200)]
node-proxy: add unit tests for node-proxy endpoint
This adds some unit tests for the node-proxy endpoint recently added to
the mgr.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 20 Jun 2023 12:35:02 +0000 (14:35 +0200)]
node-proxy: move administration operations to /admin path
This adds a new path /admin where all administrator operations are grouped.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 20 Jun 2023 12:33:42 +0000 (14:33 +0200)]
node-proxy: add new endpoint for flushing the data
Although this is mostly for devel and debug purposes at the moment,
it might be useful to be able to flush the data whenever the user needs it.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 20 Jun 2023 12:24:42 +0000 (14:24 +0200)]
node-proxy: try to acquire lock early in reporter's loop
The lock should be acquired early in this loop.
If the lock gets acquired by another call after we enter that condition *and*
before Reporter.loop() actually acquires it, it can lead to issues if the value
of `data_ready` gets modified during this short amount of time.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
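The race described above is the classic check-then-act problem. A minimal sketch of the fix, assuming a hypothetical `System` object that carries the lock, the `data_ready` flag and the collected data, takes the lock before the flag is even read:

```python
import threading
from typing import Callable


class System:
    """Hypothetical stand-in for the object shared with the collector threads."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.data_ready = False
        self.data: dict = {}


def report_once(system: System, send: Callable[[dict], None]) -> bool:
    """One iteration of a reporter loop; returns True if data was sent."""
    # Acquire the lock *before* checking data_ready, so nothing can change
    # the flag between the check and the moment the data is used.
    with system.lock:
        if not system.data_ready:
            return False
        send(dict(system.data))
        system.data_ready = False
        return True
```

Note that `send()` runs while the lock is held in this sketch, which is why a blocking call inside it would stall every other thread waiting on the same lock.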
Guillaume Abrioux [Tue, 20 Jun 2023 11:33:14 +0000 (13:33 +0200)]
node-proxy: variabilize the observer_url
Create a new parameter in DEFAULT_CONFIG for the reporter agent.
The default value (especially the tcp port) still has to be defined though.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 20 Jun 2023 11:31:40 +0000 (13:31 +0200)]
node-proxy: update endpoint url in Reporter.loop()
Change the path of the endpoint to something more generic.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 20 Jun 2023 11:30:36 +0000 (13:30 +0200)]
node-proxy: implement _update_memory() in redfish_dell.py
This implements the `_update_memory()` method in redfish_dell.py
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 20 Jun 2023 11:28:55 +0000 (13:28 +0200)]
node-proxy: redfish_dell.py refactor
This commit introduces a small refactor of `redfish_dell.py` in order
to avoid code redundancy.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 16 Jun 2023 11:09:48 +0000 (13:09 +0200)]
node-proxy: RedfishClient class refactor
This implements a BaseClient class and makes RedfishClient inherit from it.
Same logic as BaseSystem / RedfishSystem given that any other backend could
need to implement a new client for collecting the data.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 16 Jun 2023 11:07:34 +0000 (13:07 +0200)]
node-proxy: fix mypy warning regarding Config.logging
Config's attributes are dynamically created so mypy complains.
using `__dict__['logging']` addresses that.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
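To illustrate the workaround mentioned above: when attributes are created with `setattr()` at runtime, mypy cannot see them and reports an attr-defined error on `self.logging`, whereas a lookup through `__dict__` is just a dictionary access. The `Config` layout and the `log_level()` method below are assumptions for the sake of the example.

```python
from typing import Any, Dict


class Config:
    """Hypothetical config holder whose attributes are created dynamically."""

    def __init__(self, options: Dict[str, Any]) -> None:
        for key, value in options.items():
            # mypy has no way to know which attributes will exist after this loop.
            setattr(self, key, value)

    def log_level(self) -> Any:
        # `self.logging` would be flagged by mypy ("Config" has no attribute
        # "logging"); going through __dict__ avoids the static attribute lookup.
        return self.__dict__['logging']['level']
```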
Guillaume Abrioux [Fri, 16 Jun 2023 11:06:03 +0000 (13:06 +0200)]
node-proxy: rename server-v2.py
As the previous version has been removed, let's rename this file.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 16 Jun 2023 11:04:56 +0000 (13:04 +0200)]
node-proxy: drop old server.py
This version relies on flask.
At the end, we decided to migrate to cherrypy given that
we already use it quite a lot in ceph/ceph
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 16 Jun 2023 09:13:56 +0000 (11:13 +0200)]
node-proxy: create entrypoint main()
This creates a `main()` function in server.py that will be the
entrypoint of node-proxy.
This also implements arg parsing and adds a `--config` parameter
to specify the configuration file.
Finally, this introduces a small refactor of class `Config` and class
`Logger` in util.py because there was a circular dependency between them.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 16 Jun 2023 06:08:38 +0000 (08:08 +0200)]
node-proxy: rename System to BaseSystem
In order to avoid confusion or redefinition issue with class System()
defined in server.py.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 15 Jun 2023 14:23:13 +0000 (16:23 +0200)]
node-proxy: add a timeout when posting data
If this call is stuck for any reason, the report will block
the whole daemon given that at this point it has acquired a lock.
We need to make sure this call won't block the daemon for a long time,
let's add a timeout.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 15 Jun 2023 14:20:31 +0000 (16:20 +0200)]
node-proxy: (Redfish_System) reuse the existing client when possible
Otherwise, the method start_client() recreates a new client.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 15 Jun 2023 14:19:27 +0000 (16:19 +0200)]
node-proxy: remove a redundant message
This message is not needed given that there's the same in
the RedFishClient class.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 12 Jun 2023 12:36:54 +0000 (14:36 +0200)]
node-proxy: add requirements.txt
This adds the requirements.txt file in order to manage the required
libraries.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 9 Jun 2023 13:03:24 +0000 (15:03 +0200)]
node-proxy: add a retry on redfish_client.get_path() calls
The idea is to retry multiple times before stating the endpoint is
definitely unreachable.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Fri, 9 Jun 2023 12:58:02 +0000 (14:58 +0200)]
node-proxy: add a decorator 'retry'
This decorator will be useful for calls that should do multiple
attempts before actually failing.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
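A sketch of what such a decorator might look like. The parameter names (`exceptions`, `retries`, `delay`) and their defaults are assumptions chosen for the example; the commit message only states that the call should be attempted several times before actually failing.

```python
import functools
import time
from typing import Any, Callable, Tuple, Type


def retry(exceptions: Tuple[Type[BaseException], ...] = (Exception,),
          retries: int = 3,
          delay: float = 0.0) -> Callable:
    """Call the wrapped function up to `retries` times before giving up."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == retries:
                        # Last attempt failed: let the caller see the error.
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator
```

Applied to a call that talks to a redfish endpoint, this would make a transient failure invisible to the caller while a persistent one still surfaces after the last attempt.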
Guillaume Abrioux [Thu, 8 Jun 2023 16:31:38 +0000 (18:31 +0200)]
node-proxy: add type annotation
This commit adds the type annotation in all files.
This was missing since the initial implementation, let's add
it before the project gets bigger.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 8 Jun 2023 16:22:26 +0000 (18:22 +0200)]
node-proxy: address some flake8 linting errors
This addresses some flake8 errors.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 8 Jun 2023 13:12:16 +0000 (15:12 +0200)]
node-proxy: implement config & logging management
This adds the classes 'Config' and 'Logger' in order to manage
the logging and the configuration within the node-proxy daemon.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 7 Jun 2023 12:23:57 +0000 (14:23 +0200)]
node-proxy: catch RequestException in reporter
This catches the requests.exceptions.RequestException
exception in the reporter agent so we can better handle the
case where it can't reach the endpoint when trying to send the
collected data.
Before this change, if for some reason the refreshed data couldn't be
sent to the endpoint, it wouldn't have retried because
`self.system.previous_data` was overwritten anyway.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
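The behaviour described above can be sketched as follows: the previously sent data is only replaced once the send has succeeded, so a failed send is retried on the next iteration. The `Reporter` class here is a simplified, hypothetical stand-in, and `OSError` stands in for `requests.exceptions.RequestException` to keep the example free of third-party imports.

```python
from typing import Callable


class Reporter:
    """Simplified stand-in for the reporter agent (names are illustrative)."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        self.send = send
        self.previous_data: dict = {}

    def report(self, data: dict) -> bool:
        """Send `data` if it changed; return True only when it was delivered."""
        if data == self.previous_data:
            return False
        try:
            self.send(data)
        except OSError:
            # Leave previous_data untouched so the next call still sees a
            # difference and tries to send again.
            return False
        self.previous_data = data
        return True
```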
Guillaume Abrioux [Wed, 7 Jun 2023 12:20:07 +0000 (14:20 +0200)]
node-proxy: catch more error in redfish_client
This catches more potential exceptions in the redfish_client
class.
So if an error is caught we can log a more accurate and nicer message.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 22 May 2023 12:27:48 +0000 (14:27 +0200)]
node-proxy: add some logging in the reporter agent
This adds some calls to the logging module, mostly for
devel/debug purposes at the moment.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 22 May 2023 12:26:54 +0000 (14:26 +0200)]
node-proxy: fix a typo in redfish_system.get_status()
s/Status/status
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 22 May 2023 12:25:35 +0000 (14:25 +0200)]
node-proxy: redfish_system.get_system refactor
This method should return the 'unified structure' version of the
collected data instead of the huge json returned by redfish.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 22 May 2023 12:20:54 +0000 (14:20 +0200)]
node-proxy: add a lock mechanism
The loop in the reporter agent has to wait until all the data are
collected before checking and pushing them to the ceph-mgr (if needed).
The idea is to use the lock mechanism offered by the threading module
from python.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 22 May 2023 12:19:09 +0000 (14:19 +0200)]
node-proxy: migrate to cherrypy
cherrypy is already widely used in Ceph.
Let's not add new dependencies and use cherrypy instead of
python-flask
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 22 May 2023 12:15:05 +0000 (14:15 +0200)]
node-proxy: add method start_client() redfish_system class
This is going to be useful for a new endpoint '/start'
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Mon, 22 May 2023 12:09:03 +0000 (14:09 +0200)]
node-proxy: drop redfish_system._process_redfish_system method
This method isn't needed, let's drop it.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 11 May 2023 11:29:05 +0000 (13:29 +0200)]
node-proxy: display error messages when Exception is caught
This is mostly for development purposes.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 11 May 2023 11:25:36 +0000 (13:25 +0200)]
node-proxy: merge self._system with current values
Otherwise `self._system` gets reset in each iteration.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 11 May 2023 11:23:22 +0000 (13:23 +0200)]
node-proxy: add normalize_dict() function
This is to make sure all keys are converted to
lowercase.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
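A possible shape for such a function is shown below. Recursing into nested dictionaries and leaving non-string keys untouched are assumptions of this sketch; the commit message only says that all keys are converted to lowercase.

```python
from typing import Any, Dict


def normalize_dict(data: Dict[Any, Any]) -> Dict[Any, Any]:
    """Return a copy of `data` with every string key lowercased, recursively."""
    result: Dict[Any, Any] = {}
    for key, value in data.items():
        new_key = key.lower() if isinstance(key, str) else key
        # Assumption: nested dicts (as in redfish JSON payloads) are normalized too.
        result[new_key] = normalize_dict(value) if isinstance(value, dict) else value
    return result
```

For example, `normalize_dict({'Status': {'Health': 'OK'}})` returns `{'status': {'health': 'OK'}}`; values are left as they are.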
Guillaume Abrioux [Thu, 6 Apr 2023 15:29:28 +0000 (17:29 +0200)]
node-proxy: split RedfishSystem class
This class should be split because the logic will be different depending on the
hardware.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 6 Apr 2023 12:56:48 +0000 (14:56 +0200)]
node-proxy: implement storage endpoint
This adds the required logic for the endpoint '/system/storage'
to gather and return data about physical drives.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 6 Apr 2023 12:55:41 +0000 (14:55 +0200)]
node-proxy: implement network endpoint
This adds the required logic for the endpoint '/system/network'
to gather and return data about network interfaces.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Thu, 6 Apr 2023 12:53:41 +0000 (14:53 +0200)]
node-proxy: implement processors endpoint
This adds the required logic for the endpoint '/system/processors'
to gather and return data about CPUs.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 5 Apr 2023 12:18:19 +0000 (14:18 +0200)]
node-proxy: use `use_reloader=False`
This prevents the server from restarting in a loop
when an error shows up. Otherwise, it creates a bunch of new
redfish client sessions and quickly makes it unavailable due to the
session limit.
Probably not intended to be kept.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 5 Apr 2023 12:16:29 +0000 (14:16 +0200)]
node-proxy: add a /shutdown endpoint
Add a '/shutdown' endpoint to force the client to logout and delete its current
session.
This is for devel purposes and probably not intended to be kept.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 5 Apr 2023 12:14:40 +0000 (14:14 +0200)]
node-proxy: logout from redfish api on Exception
Otherwise it ends up creating a new session each time while the previous session
is left behind. After multiple failures, it reaches the limit and the leftover sessions need to be
cleaned up manually.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 5 Apr 2023 12:10:41 +0000 (14:10 +0200)]
node-proxy: variabilize the system_endpoint
This makes it possible to define the value of the 'System endpoint'.
This can be different according to the hardware.
This probably means that the class `RedfishSystem` should be split itself.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Wed, 5 Apr 2023 12:08:38 +0000 (14:08 +0200)]
node-proxy: improve logging
This adds a new file `util.py` with a logger function in order
to improve the logging a bit.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Guillaume Abrioux [Tue, 21 Mar 2023 06:07:54 +0000 (07:07 +0100)]
node-proxy: various unified interface changes
This slightly modifies the data structure of the unified interface.
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit b853761836febe92f6460a13d554cd966ff2e529)
Redouane Kachach [Wed, 8 Mar 2023 14:27:57 +0000 (15:27 +0100)]
First hardware-monitoring draft version
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
Ilya Dryomov [Thu, 25 Jan 2024 12:04:26 +0000 (13:04 +0100)]
Merge pull request #55287 from ajarr/wip-64139
rbd-nbd: fix resize of images mapped using netlink
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Nizamudeen A [Thu, 25 Jan 2024 10:10:43 +0000 (15:40 +0530)]
Merge pull request #55270 from afreen23/fix-cap-inconsistency-multisite
mgr/dashboard: Fix inconsistency in capitalisation of "Multi-site"
Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: rosinL <NOT@FOUND>
Redouane Kachach [Thu, 25 Jan 2024 09:23:43 +0000 (10:23 +0100)]
Merge pull request #55182 from rkachach/fix_issue_64029
mgr/rook: adding some basic rook e2e testing
Samuel Just [Thu, 25 Jan 2024 05:05:09 +0000 (21:05 -0800)]
Merge pull request #55266 from athanatos/sjust/wip-63996
crimson: retain map references in OSDSingletonState::store_maps
Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Samuel Just [Wed, 10 Jan 2024 17:43:45 +0000 (09:43 -0800)]
crimson/osd/shard_services: retain map references in OSDSingletonState::store_maps
Introduced: 3f11cd94
Fixes: https://tracker.ceph.com/issues/63996
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 10 Jan 2024 17:16:49 +0000 (17:16 +0000)]
crimson/osd/shard_service.cc: convert to newer logging machinery
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Sat, 6 Jan 2024 23:32:03 +0000 (15:32 -0800)]
crimson/osd/osd.cc: migrate logging to new style
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 25 Jan 2024 01:23:47 +0000 (17:23 -0800)]
Merge pull request #55288 from athanatos/sjust/wip-64140
Revert "crimson/os/alienstore/alien_log: _flush concurrently"
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yuri Weinstein [Wed, 24 Jan 2024 21:31:31 +0000 (13:31 -0800)]
Merge pull request #54987 from batrick/i63822
pybind/mgr/devicehealth: skip legacy objects that cannot be loaded
Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>