git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
18 months agonode-proxy: RedfishClient class refactor
Guillaume Abrioux [Fri, 16 Jun 2023 11:09:48 +0000 (13:09 +0200)]
node-proxy: RedfishClient class refactor

This implements a BaseClient class and makes RedfishClient inherit from it.
This is the same logic as BaseSystem / RedfishSystem, given that any other
backend could need to implement a new client for collecting the data.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
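
A minimal sketch of the BaseClient / RedfishClient split described in this
commit. Only the class names and `get_path()` come from the log; the `login()`
and `logout()` methods, the constructor arguments and the returned dict are
assumptions made for illustration, not the actual node-proxy code.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseClient(ABC):
    """Backend-agnostic client interface (hypothetical method set)."""

    def __init__(self, host: str, username: str, password: str) -> None:
        self.host = host
        self.username = username
        self.password = password

    @abstractmethod
    def login(self) -> None:
        """Open a session against the backend."""

    @abstractmethod
    def logout(self) -> None:
        """Close the session."""

    @abstractmethod
    def get_path(self, path: str) -> Dict[str, Any]:
        """Fetch the resource located at `path`."""


class RedfishClient(BaseClient):
    """Redfish flavour; a real one would talk HTTP to the BMC."""

    def __init__(self, host: str, username: str, password: str) -> None:
        super().__init__(host, username, password)
        self.logged_in = False

    def login(self) -> None:
        self.logged_in = True

    def logout(self) -> None:
        self.logged_in = False

    def get_path(self, path: str) -> Dict[str, Any]:
        if not self.logged_in:
            raise RuntimeError("not logged in")
        # Placeholder payload instead of a real HTTP GET.
        return {"host": self.host, "path": path}
```

Another backend would subclass `BaseClient` the same way, so the code that
collects data only depends on the abstract interface.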
18 months agonode-proxy: fix mypy warning regarding Config.logging
Guillaume Abrioux [Fri, 16 Jun 2023 11:07:34 +0000 (13:07 +0200)]
node-proxy: fix mypy warning regarding Config.logging

Config's attributes are created dynamically, so mypy complains.
Using `__dict__['logging']` addresses that.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
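
A small sketch of the situation this commit describes, assuming a `Config`
class that builds its attributes from a dict with `setattr()`. The class body
and the sample values are illustrative assumptions; only `Config` and the
`__dict__['logging']` access come from the commit message.

```python
import logging
from typing import Any, Dict


class Config:
    """Toy config whose attributes are created dynamically from a dict."""

    def __init__(self, values: Dict[str, Any]) -> None:
        for key, value in values.items():
            setattr(self, key, value)


config = Config({"logging": {"level": logging.INFO}})

# `config.logging` works at runtime, but a static checker cannot see an
# attribute that is never declared on the class, so mypy reports an
# attribute error. Reading the same object through __dict__ avoids the
# attribute lookup that mypy cannot verify.
level = config.__dict__["logging"]["level"]
```

Declaring the attribute on the class with a type annotation would be another
way to satisfy mypy; the commit chose the `__dict__` access instead.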
18 months agonode-proxy: rename server-v2.py
Guillaume Abrioux [Fri, 16 Jun 2023 11:06:03 +0000 (13:06 +0200)]
node-proxy: rename server-v2.py

As the previous version has been removed, let's rename this file.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: drop old server.py
Guillaume Abrioux [Fri, 16 Jun 2023 11:04:56 +0000 (13:04 +0200)]
node-proxy: drop old server.py

This version relies on flask.
In the end, we decided to migrate to cherrypy given that
we already use it quite a lot in ceph/ceph.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: create entrypoint main()
Guillaume Abrioux [Fri, 16 Jun 2023 09:13:56 +0000 (11:13 +0200)]
node-proxy: create entrypoint main()

This creates a `main()` function in server.py that will be the
entrypoint of node-proxy.

This also implements arg parsing and adds a `--config` parameter
to specify the configuration file.

Finally, this introduces a small refactor of class `Config` and class
`Logger` in util.py because there was a circular dependency between them.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
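
A sketch of what a `main()` entrypoint with a `--config` parameter can look
like, using `argparse` from the standard library. The default path, the help
text and the `parse_args()` helper are assumptions for illustration; the
commit only states that `main()` exists and that `--config` selects the
configuration file.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="node-proxy")
    parser.add_argument(
        "--config",
        default="/etc/ceph/node-proxy.yml",  # hypothetical default path
        help="path of the configuration file",
    )
    return parser.parse_args(argv)


def main(argv: Optional[List[str]] = None) -> int:
    args = parse_args(argv)
    # A real daemon would load the configuration and start the server here.
    print(f"using configuration file: {args.config}")
    return 0
```

Passing `argv` explicitly keeps the function easy to call from tests, while
`main()` with no argument falls back to `sys.argv`.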
18 months agonode-proxy: rename System to BaseSystem
Guillaume Abrioux [Fri, 16 Jun 2023 06:08:38 +0000 (08:08 +0200)]
node-proxy: rename System to BaseSystem

This avoids confusion or redefinition issues with the class System()
defined in server.py.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: add a timeout when posting data
Guillaume Abrioux [Thu, 15 Jun 2023 14:23:13 +0000 (16:23 +0200)]
node-proxy: add a timeout when posting data

If this call is stuck for any reason, the report will block
the whole daemon given that at this point it has acquired a lock.
We need to make sure this call won't block the daemon for a long time,
so let's add a timeout.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
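
To illustrate the idea of bounding the POST with a timeout, here is a sketch
that uses `urllib` from the standard library. The function name, the JSON
payload and the 5-second default are assumptions; the commit does not say
which HTTP call or timeout value node-proxy uses.

```python
import json
import urllib.request


def post_data(url: str, payload: dict, timeout: float = 5.0) -> bool:
    """POST `payload` as JSON; give up after `timeout` seconds."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError,
        # so an unreachable or slow endpoint ends up here instead of
        # blocking the caller while it holds the lock.
        return False
```

Without the `timeout` argument the call could wait indefinitely on a stuck
connection, which is the situation the commit wants to avoid.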
18 months agonode-proxy: (Redfish_System) reuse the existing client when possible
Guillaume Abrioux [Thu, 15 Jun 2023 14:20:31 +0000 (16:20 +0200)]
node-proxy: (Redfish_System) reuse the existing client when possible

Otherwise, the method start_client() recreates a new client.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: remove a redundant message
Guillaume Abrioux [Thu, 15 Jun 2023 14:19:27 +0000 (16:19 +0200)]
node-proxy: remove a redundant message

This message is not needed given that the same message is already
logged in the RedfishClient class.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: add requirements.txt
Guillaume Abrioux [Mon, 12 Jun 2023 12:36:54 +0000 (14:36 +0200)]
node-proxy: add requirements.txt

This adds the requirements.txt file in order to manage the required
libraries.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: add a retry on redfish_client.get_path() calls
Guillaume Abrioux [Fri, 9 Jun 2023 13:03:24 +0000 (15:03 +0200)]
node-proxy: add a retry on redfish_client.get_path() calls

The idea is to retry multiple times before stating the endpoint is
definitely unreachable.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: add a decorator 'retry'
Guillaume Abrioux [Fri, 9 Jun 2023 12:58:02 +0000 (14:58 +0200)]
node-proxy: add a decorator 'retry'

This decorator will be useful for calls that should make multiple
attempts before actually failing.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
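
A minimal sketch of such a `retry` decorator. The parameter names
(`exceptions`, `retries`, `delay`) and their defaults are assumptions; the
commit only says that the decorator makes multiple attempts before failing.

```python
import functools
import time
from typing import Any, Callable, Tuple, Type


def retry(
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
    retries: int = 3,
    delay: float = 0.0,
) -> Callable:
    """Call the wrapped function up to `retries` times before giving up."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == retries:
                        raise  # last attempt: let the error propagate
                    time.sleep(delay)

        return wrapper

    return decorator


# Usage: a call that fails twice and succeeds on the third attempt.
attempts = []


@retry(exceptions=(ConnectionError,), retries=3)
def flaky_get_path() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("endpoint not reachable yet")
    return "ok"
```

Re-raising on the last attempt keeps the original exception and traceback
visible to the caller, which can then decide that the endpoint is unreachable.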
18 months agonode-proxy: add type annotation
Guillaume Abrioux [Thu, 8 Jun 2023 16:31:38 +0000 (18:31 +0200)]
node-proxy: add type annotation

This commit adds type annotations to all files.
They have been missing since the initial implementation; let's add
them before the project gets bigger.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: address some flake8 linting errors
Guillaume Abrioux [Thu, 8 Jun 2023 16:22:26 +0000 (18:22 +0200)]
node-proxy: address some flake8 linting errors

This addresses some flake8 errors.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: implement config & logging management
Guillaume Abrioux [Thu, 8 Jun 2023 13:12:16 +0000 (15:12 +0200)]
node-proxy: implement config & logging management

This adds the classes 'Config' and 'Logger' in order to manage
the logging and the configuration within the node-proxy daemon.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: catch RequestException in reporter
Guillaume Abrioux [Wed, 7 Jun 2023 12:23:57 +0000 (14:23 +0200)]
node-proxy: catch RequestException in reporter

This catches the requests.exceptions.RequestException
exception in the reporter agent so we can better handle the
case where it can't reach the endpoint when trying to send the
collected data.
Before this change, if for some reason the refreshed data couldn't be
sent to the endpoint, it wouldn't have retried because
`self.system.previous_data` was overwritten anyway.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
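
The behaviour described in this commit can be sketched as follows: the
previous data is only overwritten once the send succeeded, so a failed send is
retried on the next iteration. The `Reporter` class, its `send` callable and
the return values are assumptions. `OSError` is caught here as a stand-in,
since `requests.exceptions.RequestException` derives from `IOError` and this
sketch avoids the third-party import.

```python
from typing import Any, Callable, Dict


class Reporter:
    """Pushes data to an endpoint only when it changed since the last push."""

    def __init__(self, send: Callable[[Dict[str, Any]], None]) -> None:
        self.send = send
        self.previous_data: Dict[str, Any] = {}

    def report(self, current_data: Dict[str, Any]) -> bool:
        if current_data == self.previous_data:
            return False  # nothing new to send
        try:
            self.send(current_data)
        except OSError:
            # Sending failed: keep previous_data untouched so that the
            # same data is detected as "new" and sent again next time.
            return False
        self.previous_data = current_data
        return True
```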
18 months agonode-proxy: catch more error in redfish_client
Guillaume Abrioux [Wed, 7 Jun 2023 12:20:07 +0000 (14:20 +0200)]
node-proxy: catch more error in redfish_client

This catches more potential exceptions in the redfish_client
class, so that if an error is caught we can log a more accurate and
nicer message.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: add some logging in the reporter agent
Guillaume Abrioux [Mon, 22 May 2023 12:27:48 +0000 (14:27 +0200)]
node-proxy: add some logging in the reporter agent

This adds some calls to the logging module, mostly for
devel/debug purposes at the moment.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: fix a typo in redfish_system.get_status()
Guillaume Abrioux [Mon, 22 May 2023 12:26:54 +0000 (14:26 +0200)]
node-proxy: fix a typo in redfish_system.get_status()

s/Status/status

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: redfish_system.get_system refactor
Guillaume Abrioux [Mon, 22 May 2023 12:25:35 +0000 (14:25 +0200)]
node-proxy: redfish_system.get_system refactor

This method should return the 'unified structure' version of the
collected data instead of the huge json returned by redfish.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: add a lock mechanism
Guillaume Abrioux [Mon, 22 May 2023 12:20:54 +0000 (14:20 +0200)]
node-proxy: add a lock mechanism

The loop in the reporter agent has to wait until all the data is
collected before checking and pushing it to the ceph-mgr (if needed).
The idea is to use the lock mechanism offered by Python's threading
module.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
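
A sketch of the lock mechanism described in this commit, using
`threading.Lock`. The `Collector` class and the `report_once()` function are
hypothetical names; the point is only that the collecting side holds the lock
for the whole refresh and the reporting side takes the same lock before
reading.

```python
import threading
from typing import Any, Dict, List


class Collector:
    """Holds the collected data and the lock that protects it."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.data: Dict[str, Any] = {}

    def collect(self, fresh: Dict[str, Any]) -> None:
        # Hold the lock for the whole refresh so that a reader never
        # sees a half-populated dict.
        with self.lock:
            self.data = {}
            for key, value in fresh.items():
                self.data[key] = value


def report_once(collector: Collector, pushed: List[Dict[str, Any]]) -> None:
    # Blocks until any in-progress collection has released the lock.
    with collector.lock:
        pushed.append(dict(collector.data))
```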
18 months agonode-proxy: migrate to cherrypy
Guillaume Abrioux [Mon, 22 May 2023 12:19:09 +0000 (14:19 +0200)]
node-proxy: migrate to cherrypy

cherrypy is already widely used in Ceph.
Let's not add new dependencies and use cherrypy instead of
python-flask

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: add method start_client() redfish_system class
Guillaume Abrioux [Mon, 22 May 2023 12:15:05 +0000 (14:15 +0200)]
node-proxy: add method start_client() redfish_system class

This is going to be useful for a new endpoint '/start'

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: drop redfish_system._process_redfish_system method
Guillaume Abrioux [Mon, 22 May 2023 12:09:03 +0000 (14:09 +0200)]
node-proxy: drop redfish_system._process_redfish_system method

This method isn't needed, let's drop it.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: display error messages when Exception is caught
Guillaume Abrioux [Thu, 11 May 2023 11:29:05 +0000 (13:29 +0200)]
node-proxy: display error messages when Exception is caught

This is mostly for development purposes.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: merge self._system with current values
Guillaume Abrioux [Thu, 11 May 2023 11:25:36 +0000 (13:25 +0200)]
node-proxy: merge self._system with current values

Otherwise `self._system` gets reset in each iteration.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: add normalize_dict() function
Guillaume Abrioux [Thu, 11 May 2023 11:23:22 +0000 (13:23 +0200)]
node-proxy: add normalize_dict() function

This makes sure all keys are converted to
lowercase.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
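
A possible shape for `normalize_dict()`, assuming that nested dictionaries
should be normalized as well. The recursion is an assumption; the commit only
states that all keys are converted to lowercase.

```python
from typing import Any, Dict


def normalize_dict(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `data` with every key lowercased, recursively."""
    result: Dict[str, Any] = {}
    for key, value in data.items():
        if isinstance(value, dict):
            value = normalize_dict(value)
        result[key.lower()] = value
    return result
```

For example, `normalize_dict({"Status": {"Health": "OK"}})` gives
`{"status": {"health": "OK"}}`; values are left untouched.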
18 months agonode-proxy: split RedfishSystem class
Guillaume Abrioux [Thu, 6 Apr 2023 15:29:28 +0000 (17:29 +0200)]
node-proxy: split RedfishSystem class

This class should be split because the logic will be different depending on the
hardware.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: implement storage endpoint
Guillaume Abrioux [Thu, 6 Apr 2023 12:56:48 +0000 (14:56 +0200)]
node-proxy: implement storage endpoint

This adds the required logic for the endpoint '/system/storage'
to gather and return data about physical drives.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: implement network endpoint
Guillaume Abrioux [Thu, 6 Apr 2023 12:55:41 +0000 (14:55 +0200)]
node-proxy: implement network endpoint

This adds the required logic for the endpoint '/system/network'
to gather and return data about network interfaces.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: implement processors endpoint
Guillaume Abrioux [Thu, 6 Apr 2023 12:53:41 +0000 (14:53 +0200)]
node-proxy: implement processors endpoint

This adds the required logic for the endpoint '/system/processors'
to gather and return data about CPUs.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: use `use_reloader=False`
Guillaume Abrioux [Wed, 5 Apr 2023 12:18:19 +0000 (14:18 +0200)]
node-proxy: use `use_reloader=False`

This prevents the server from restarting in a loop
when an error shows up. Otherwise, it creates a bunch of new
redfish client sessions and quickly makes the redfish API unavailable
due to the session limit.
Probably not intended to be kept.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: add a /shutdown endpoint
Guillaume Abrioux [Wed, 5 Apr 2023 12:16:29 +0000 (14:16 +0200)]
node-proxy: add a /shutdown endpoint

Add a '/shutdown' endpoint to force the client to logout and delete its current
session.
This is for devel purposes and probably not intended to be kept.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: logout from redfish api on Exception
Guillaume Abrioux [Wed, 5 Apr 2023 12:14:40 +0000 (14:14 +0200)]
node-proxy: logout from redfish api on Exception

Otherwise it ends up creating a new session each time while the previous session
is left open. After multiple failures, it reaches the session limit and the
leftover sessions need to be cleaned up manually.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: variabilize the system_endpoint
Guillaume Abrioux [Wed, 5 Apr 2023 12:10:41 +0000 (14:10 +0200)]
node-proxy: variabilize the system_endpoint

This makes it possible to define the value of the 'System endpoint'.
This can be different according to the hardware.

This probably means that the class `RedfishSystem` should be split itself.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: improve logging
Guillaume Abrioux [Wed, 5 Apr 2023 12:08:38 +0000 (14:08 +0200)]
node-proxy: improve logging

This adds a new file `util.py` with a logger function in order
to improve the logging a bit.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
18 months agonode-proxy: various unified interface changes
Guillaume Abrioux [Tue, 21 Mar 2023 06:07:54 +0000 (07:07 +0100)]
node-proxy: various unified interface changes

This slightly modifies the data structure of the unified interface.

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit b853761836febe92f6460a13d554cd966ff2e529)

18 months agoFirst hardware-monitoring draft version
Redouane Kachach [Wed, 8 Mar 2023 14:27:57 +0000 (15:27 +0100)]
First hardware-monitoring draft version

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
18 months agoMerge pull request #55287 from ajarr/wip-64139
Ilya Dryomov [Thu, 25 Jan 2024 12:04:26 +0000 (13:04 +0100)]
Merge pull request #55287 from ajarr/wip-64139

rbd-nbd: fix resize of images mapped using netlink

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
18 months agoMerge pull request #55270 from afreen23/fix-cap-inconsistency-multisite
Nizamudeen A [Thu, 25 Jan 2024 10:10:43 +0000 (15:40 +0530)]
Merge pull request #55270 from afreen23/fix-cap-inconsistency-multisite

mgr/dashboard: Fix inconsistency in capitalisation of "Multi-site"

Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: rosinL <NOT@FOUND>
18 months agoMerge pull request #55182 from rkachach/fix_issue_64029
Redouane Kachach [Thu, 25 Jan 2024 09:23:43 +0000 (10:23 +0100)]
Merge pull request #55182 from rkachach/fix_issue_64029

mgr/rook: adding some basic rook e2e testing

18 months agoMerge pull request #55266 from athanatos/sjust/wip-63996
Samuel Just [Thu, 25 Jan 2024 05:05:09 +0000 (21:05 -0800)]
Merge pull request #55266 from athanatos/sjust/wip-63996

crimson: retain map references in OSDSingletonState::store_maps

Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
18 months agocrimson/osd/shard_services: retain map references in OSDSingletonState::store_maps 55266/head
Samuel Just [Wed, 10 Jan 2024 17:43:45 +0000 (09:43 -0800)]
crimson/osd/shard_services: retain map references in OSDSingletonState::store_maps

Introduced: 3f11cd94
Fixes: https://tracker.ceph.com/issues/63996
Signed-off-by: Samuel Just <sjust@redhat.com>
18 months agocrimson/osd/shard_service.cc: convert to newer logging machinery
Samuel Just [Wed, 10 Jan 2024 17:16:49 +0000 (17:16 +0000)]
crimson/osd/shard_service.cc: convert to newer logging machinery

Signed-off-by: Samuel Just <sjust@redhat.com>
18 months agocrimson/osd/osd.cc: migrate logging to new style
Samuel Just [Sat, 6 Jan 2024 23:32:03 +0000 (15:32 -0800)]
crimson/osd/osd.cc: migrate logging to new style

Signed-off-by: Samuel Just <sjust@redhat.com>
18 months agoMerge pull request #55288 from athanatos/sjust/wip-64140
Samuel Just [Thu, 25 Jan 2024 01:23:47 +0000 (17:23 -0800)]
Merge pull request #55288 from athanatos/sjust/wip-64140

Revert "crimson/os/alienstore/alien_log: _flush concurrently"

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
18 months agoMerge pull request #54987 from batrick/i63822
Yuri Weinstein [Wed, 24 Jan 2024 21:31:31 +0000 (13:31 -0800)]
Merge pull request #54987 from batrick/i63822

pybind/mgr/devicehealth: skip legacy objects that cannot be loaded

Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
18 months agoMerge pull request #54491 from jianwei1216/fix_osd_pg_stat_report_interval_max_cmain
Yuri Weinstein [Wed, 24 Jan 2024 21:30:50 +0000 (13:30 -0800)]
Merge pull request #54491 from jianwei1216/fix_osd_pg_stat_report_interval_max_cmain

fix: resolve inconsistent judgment of osd_pg_stat_report_interval_max

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Matan Breizman <Matan.Brz@gmail.com>
18 months agoMerge pull request #53250 from YiteGu/add-perfcount-for-allocator
Yuri Weinstein [Wed, 24 Jan 2024 21:30:07 +0000 (13:30 -0800)]
Merge pull request #53250 from YiteGu/add-perfcount-for-allocator

os/bluestore: add perfcount for bluestore/bluefs allocator

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
18 months agoMerge pull request #52530 from amathuria/wip-amat-fix-59531
Yuri Weinstein [Wed, 24 Jan 2024 21:28:16 +0000 (13:28 -0800)]
Merge pull request #52530 from amathuria/wip-amat-fix-59531

osd: Add memstore to unsupported objstores for QoS

Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
18 months agorbd-nbd: log errors during netlink_resize() using derr 55287/head
Ramana Raja [Tue, 23 Jan 2024 21:07:04 +0000 (16:07 -0500)]
rbd-nbd: log errors during netlink_resize() using derr

When using rbd CLI to map the images to NBD devices via netlink,
any errors that arose during image resizing in netlink_resize()
were not logged. Switching the error logging from using cerr to
derr helps log the errors from netlink_resize().

Signed-off-by: Ramana Raja <rraja@redhat.com>
18 months agorbd_nbd: fix resize of images mapped using netlink
Ramana Raja [Mon, 22 Jan 2024 22:06:58 +0000 (17:06 -0500)]
rbd_nbd: fix resize of images mapped using netlink

Include device identifier or cookie in the message sent to the kernel
to resize images mapped to NBD devices using netlink. Otherwise,
netlink_resize() fails and the size of the device isn't updated.

Fixes: https://tracker.ceph.com/issues/64139
Signed-off-by: Ramana Raja <rraja@redhat.com>
18 months agoMerge pull request #49462 from rzarzynski/wip-bug-53789
Laura Flores [Wed, 24 Jan 2024 20:00:03 +0000 (14:00 -0600)]
Merge pull request #49462 from rzarzynski/wip-bug-53789

osdc: fix the ENOCONN normalization in Objecter::_linger_reconnect()

18 months agoMerge pull request #55219 from samarahu/rgw_asio_frontend_asserts
Casey Bodley [Wed, 24 Jan 2024 18:39:14 +0000 (18:39 +0000)]
Merge pull request #55219 from samarahu/rgw_asio_frontend_asserts

rgw/asio: Add asserts to rgw_asio_frontend.cc

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
18 months agomgr/rook: increase minikube mem to 6GB to avoid stability issues 55182/head
Redouane Kachach [Wed, 24 Jan 2024 18:03:56 +0000 (19:03 +0100)]
mgr/rook: increase minikube mem to 6GB to avoid stability issues

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
18 months agoMerge pull request #55192 from dparmar18/fix_docstrings_ceph_test_case
Gregory Farnum [Wed, 24 Jan 2024 17:37:34 +0000 (09:37 -0800)]
Merge pull request #55192 from dparmar18/fix_docstrings_ceph_test_case

qa: typo fixes in ceph_test_case docstrings

18 months agoMerge pull request #53320 from jzhu116-bloomberg/wip-62710
Casey Bodley [Wed, 24 Jan 2024 16:01:43 +0000 (16:01 +0000)]
Merge pull request #53320 from jzhu116-bloomberg/wip-62710

rgw/multisite: maintain endpoints connectable status and retry the requests to them when appropriate

Reviewed-by: Mark Kogan <mkogan@ibm.com>
18 months agoMerge pull request #54941 from samsungceph/vstart_network_v2
Adam King [Wed, 24 Jan 2024 15:25:23 +0000 (10:25 -0500)]
Merge pull request #54941 from samsungceph/vstart_network_v2

vstart: Pick only CIDR-formatted routes when cephadm enabled

Reviewed-by: Adam King <adking@redhat.com>
18 months agoMerge pull request #53668 from mdw-at-linuxbox/wip-master-update-kmip-1
Casey Bodley [Wed, 24 Jan 2024 13:30:25 +0000 (13:30 +0000)]
Merge pull request #53668 from mdw-at-linuxbox/wip-master-update-kmip-1

Update libkmip to pull in some portability changes.

Reviewed-by: Casey Bodley <cbodley@redhat.com>
18 months agomgr/rook: adding some basic rook e2e testing
Redouane Kachach [Mon, 15 Jan 2024 14:25:02 +0000 (15:25 +0100)]
mgr/rook: adding some basic rook e2e testing
Fixes: https://tracker.ceph.com/issues/64029
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
18 months agoRevert "crimson/os/alienstore/alien_log: _flush concurrently" 55288/head
Samuel Just [Tue, 23 Jan 2024 21:47:27 +0000 (21:47 +0000)]
Revert "crimson/os/alienstore/alien_log: _flush concurrently"

While submitting the log line asynchronously is reasonable,
with this implementation the EntryVector &q parameter does
not necessarily outlive the submission continuation.

This reverts commit 511af83e2747361350b60ce0ce88e67a726d9343.

Fixes: https://tracker.ceph.com/issues/64140
Signed-off-by: Samuel Just <sjust@redhat.com>
18 months agoMerge pull request #55183 from galsalomon66/s3select_fixes_QE_bugs
Gal Salomon [Tue, 23 Jan 2024 21:59:15 +0000 (23:59 +0200)]
Merge pull request #55183 from galsalomon66/s3select_fixes_QE_bugs

rgw/s3select: bug fixes per QE recent defects

18 months agoUpdate libkmip submodule to pull in some portability changes. 53668/head
Marcus Watts [Tue, 26 Sep 2023 07:04:35 +0000 (03:04 -0400)]
Update libkmip submodule to pull in some portability changes.

Signed-off-by: Marcus Watts <mwatts@redhat.com>
18 months agoMerge pull request #55217 from ronen-fr/wip-rf-old-reserv
Ronen Friedman [Tue, 23 Jan 2024 19:39:35 +0000 (21:39 +0200)]
Merge pull request #55217 from ronen-fr/wip-rf-old-reserv

osd/scrub: check reservation replies for relevance

Reviewed-by: Samuel Just <sjust@redhat.com>
18 months agoMerge pull request #55067 from yaarith/telemetry-pool-flags
Laura Flores [Tue, 23 Jan 2024 18:22:32 +0000 (12:22 -0600)]
Merge pull request #55067 from yaarith/telemetry-pool-flags

mgr/telemetry: add pool flags

18 months agoMerge pull request #55240 from rosinL/wip-fix-64032
Laura Flores [Tue, 23 Jan 2024 16:38:46 +0000 (10:38 -0600)]
Merge pull request #55240 from rosinL/wip-fix-64032

install-deps: Force remove ceph-libboost* packages

18 months agoMerge pull request #55278 from Himura2la/patch-2
zdover23 [Tue, 23 Jan 2024 15:58:23 +0000 (01:58 +1000)]
Merge pull request #55278 from Himura2la/patch-2

Docs: Specify correct fs type for mkfs on volume creation

Reviewed-by: Zac Dover <zac.dover@proton.me>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
18 months agobug fixes per QE recent defects 55183/head
galsalomon66 [Mon, 15 Jan 2024 14:38:19 +0000 (16:38 +0200)]
bug fixes per QE recent defects
update for the engine_version message
s3select submodule

Signed-off-by: galsalomon66 <gal.salomon@gmail.com>
18 months agoMerge pull request #55277 from yuvalif/wip-yuval-63578
Casey Bodley [Tue, 23 Jan 2024 14:18:22 +0000 (14:18 +0000)]
Merge pull request #55277 from yuvalif/wip-yuval-63578

rgw/lua: fix compilation issue when lua packages are disabled

Reviewed-by: Casey Bodley <cbodley@redhat.com>
18 months agodoc: specify correct fs type for mkfs 55278/head
Himura Kazuto [Tue, 23 Jan 2024 12:59:10 +0000 (12:59 +0000)]
doc: specify correct fs type for mkfs

The default value is ext2, which is not supported (anymore?).

Signed-off-by: Vladislav Glagolev <vladislav.glagolev@devexpress.com>
18 months agorgw/lua: fix compilation issue when lua packages are disabled 55277/head
Yuval Lifshitz [Tue, 23 Jan 2024 11:09:26 +0000 (11:09 +0000)]
rgw/lua: fix compilation issue when lua packages are disabled

Fixes: https://tracker.ceph.com/issues/63578#change-253102
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
18 months agomgr/dashboard: Fix inconsistency in capitalisation of "Multi-site" 55270/head
Afreen [Tue, 23 Jan 2024 02:34:32 +0000 (08:04 +0530)]
mgr/dashboard: Fix inconsistency in capitalisation of "Multi-site"

fixes https://tracker.ceph.com/issues/64125

Across the dashboard, two instances are present: Multi-site and
Multi-Site.
Making it consistent all over by using Multi-site.

Signed-off-by: Afreen <afreen23.git@gmail.com>
18 months agoosd/scrub: check reservation replies for relevance 55217/head
Ronen Friedman [Wed, 17 Jan 2024 15:36:16 +0000 (09:36 -0600)]
osd/scrub: check reservation replies for relevance

Compare a token (nonce) carried in the reservation reply with the remembered
token of the reservation request.  If they don't match, the reply is
stale and should be ignored (and logged).

Fixes: https://tracker.ceph.com/issues/64052
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
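
The token comparison described in this commit can be sketched as follows.
The actual change lives in the C++ OSD scrub code; Python is used here only
for illustration, and every name (`ReservationReply`, `ReplicaReservations`,
`send_request`, `handle_reply`) is hypothetical.

```python
import itertools
from dataclasses import dataclass
from typing import Dict


@dataclass
class ReservationReply:
    replica: int
    token: int


class ReplicaReservations:
    """Remembers the token of the latest request sent to each replica."""

    def __init__(self) -> None:
        self._tokens = itertools.count(1)
        self._pending: Dict[int, int] = {}  # replica id -> expected token

    def send_request(self, replica: int) -> int:
        token = next(self._tokens)
        self._pending[replica] = token
        return token

    def handle_reply(self, reply: ReservationReply) -> bool:
        expected = self._pending.get(reply.replica)
        if expected != reply.token:
            # Stale reply: it answers an older request, so ignore and log it.
            print(f"ignoring stale reply from replica {reply.replica} "
                  f"(token {reply.token}, expected {expected})")
            return False
        del self._pending[reply.replica]
        return True
```

A reply that carries the token of an earlier, abandoned request is dropped,
while the reply to the most recent request is accepted.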
18 months agoMerge pull request #55107 from ronen-fr/wip-rf-rm-penaltyq
Ronen Friedman [Tue, 23 Jan 2024 05:57:39 +0000 (07:57 +0200)]
Merge pull request #55107 from ronen-fr/wip-rf-rm-penaltyq

osd/scrub: remove the 'penalty queue' from the scrubber

Reviewed-by: Samuel Just <sjust@redhat.com>
18 months agoMerge pull request #55269 from zdover23/wip-doc-2024-01-23-radosgw-admin-read-write...
zdover23 [Tue, 23 Jan 2024 02:31:46 +0000 (12:31 +1000)]
Merge pull request #55269 from zdover23/wip-doc-2024-01-23-radosgw-admin-read-write-global-rate-limit-config

doc/radosgw: edit "read/write global rate limit" admin.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
18 months agodoc/radosgw: edit "read/write global rate limit" admin.rst 55269/head
Zac Dover [Tue, 23 Jan 2024 02:13:10 +0000 (12:13 +1000)]
doc/radosgw: edit "read/write global rate limit" admin.rst

Edit "Reading/Writing Global Rate Limit Configuration" in
doc/radosgw/admin.rst.

Signed-off-by: Zac Dover <zac.dover@proton.me>
18 months agoMerge pull request #55223 from athanatos/sjust/wip-64055
Samuel Just [Mon, 22 Jan 2024 21:26:19 +0000 (13:26 -0800)]
Merge pull request #55223 from athanatos/sjust/wip-64055

crimson: clear obc_registry on interval change

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
18 months agorgw/multisite: add multisite test cases with some rgw instances down 53320/head
Jane Zhu [Wed, 10 Jan 2024 05:40:35 +0000 (00:40 -0500)]
rgw/multisite: add multisite test cases with some rgw instances down

Signed-off-by: Juan Zhu <jzhu4@dev-10-34-20-139.pw1.bcc.bloomberg.com>
18 months agoMerge pull request #55070 from pdvian/wip-fix-progressevent
Yuri Weinstein [Mon, 22 Jan 2024 16:18:41 +0000 (08:18 -0800)]
Merge pull request #55070 from pdvian/wip-fix-progressevent

mon: initialize ProgressEvent::add_to_ceph_s

Reviewed-by: Laura Flores <lflores@redhat.com>
18 months agoMerge pull request #53154 from ifed01/wip-ifed-no-death-tests
Yuri Weinstein [Mon, 22 Jan 2024 16:17:35 +0000 (08:17 -0800)]
Merge pull request #53154 from ifed01/wip-ifed-no-death-tests

test/store_test: get rid of assert_death.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
18 months agoMerge pull request #49415 from ljflores/wip-update-telemetry-upgrade
Yuri Weinstein [Mon, 22 Jan 2024 16:13:17 +0000 (08:13 -0800)]
Merge pull request #49415 from ljflores/wip-update-telemetry-upgrade

qa/workunits: update telemetry quincy workunits with `basic_pool_options_bluestore` collection

Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
18 months agoosd/scrub: update job's NB on failure 55107/head
Ronen Friedman [Tue, 2 Jan 2024 16:09:06 +0000 (10:09 -0600)]
osd/scrub: update job's NB on failure

When a scrub job fails, update its NB to the current time plus a
fixed delay.  This prevents the job from being scheduled again
immediately.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
18 months agoosd/scrub: fix set_last_deep_scrub_stamp()
Ronen Friedman [Tue, 9 Jan 2024 14:15:33 +0000 (08:15 -0600)]
osd/scrub: fix set_last_deep_scrub_stamp()

The call should update last_scrub_stamp, too, without
requiring an extra call to on_scrub_schedule_input_change()

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
18 months agoosd/scrub: introduce a 'not before' attribute for scrub jobs
Ronen Friedman [Sun, 31 Dec 2023 16:18:09 +0000 (10:18 -0600)]
osd/scrub: introduce a 'not before' attribute for scrub jobs

The NB enables the OSD to delay the next attempt to schedule a specific
scrub job.  This is useful for jobs that have failed for whatever
reason, especially if the primary has failed to acquire the replicas.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
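
A sketch of how a 'not before' (NB) attribute can delay rescheduling of a
failed scrub job: on failure the NB is set to the current time plus a fixed
delay, and the scheduler skips jobs whose NB is still in the future. The real
code is C++; this Python version, the `ScrubJob` fields, the function names
and the 30-second delay are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List

FAILURE_DELAY = 30.0  # seconds; hypothetical value


@dataclass
class ScrubJob:
    pgid: str
    not_before: float = 0.0  # earliest time the job may be scheduled


def on_failure(job: ScrubJob, now: float) -> None:
    # Push the job into the future so it is not retried immediately.
    job.not_before = now + FAILURE_DELAY


def ready_jobs(jobs: List[ScrubJob], now: float) -> List[ScrubJob]:
    # Only jobs whose 'not before' time has passed are eligible.
    return [job for job in jobs if job.not_before <= now]
```

A per-job timestamp like this needs no separate queue for penalized jobs:
a delayed job simply becomes eligible again once the clock passes its NB.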
18 months agoosd/scrub: remove the 'penalized jobs' queue
Ronen Friedman [Sat, 30 Dec 2023 12:36:26 +0000 (06:36 -0600)]
osd/scrub: remove the 'penalized jobs' queue

The 'penalized jobs' queue was used to track scrub jobs that had failed
to acquire their replicas, and to prevent those jobs from being retried
too quickly.  This functionality will be replaced by a
simple 'not before' delay (see the next commits).

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
18 months agoMerge pull request #55262 from Matan-B/wip-matanb-crimson-bluestore-submit
Matan Breizman [Mon, 22 Jan 2024 08:44:23 +0000 (10:44 +0200)]
Merge pull request #55262 from Matan-B/wip-matanb-crimson-bluestore-submit

crimson/os/alienstore/alien_log: _flush concurrently

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
18 months agocrimson/os/alienstore/alien_log: _flush concurrently 55262/head
Matan Breizman [Sun, 21 Jan 2024 09:33:59 +0000 (09:33 +0000)]
crimson/os/alienstore/alien_log: _flush concurrently

In continuation to c15e56e386251403a876454f6a4aa186284565e1

Authored-by: Yingxin Cheng <yingxin.cheng@intel.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
18 months agoMerge pull request #55190 from zdover23/wip-doc-2024-01-16-radosgw-admin-enable-disab...
zdover23 [Sun, 21 Jan 2024 09:47:10 +0000 (19:47 +1000)]
Merge pull request #55190 from zdover23/wip-doc-2024-01-16-radosgw-admin-enable-disable-bucket-rate-limit

doc/radosgw: edit "Enable/Disable Bucket Rate Limit"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
18 months agoMerge pull request #53288 from rzarzynski/wip-crimson-dont-shadow-store-in-ecbackend
Matan Breizman [Sun, 21 Jan 2024 08:44:29 +0000 (10:44 +0200)]
Merge pull request #53288 from rzarzynski/wip-crimson-dont-shadow-store-in-ecbackend

crimson: drop store from ECBackend to not shadow PGBackend::store

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
18 months agoMerge pull request #54813 from amathuria/wip-crimson-amat-fix-config-set-cmd
Matan Breizman [Sun, 21 Jan 2024 08:43:59 +0000 (10:43 +0200)]
Merge pull request #54813 from amathuria/wip-crimson-amat-fix-config-set-cmd

src/crimson: Add support for the OSD to receive config changes

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
18 months agoMerge pull request #55127 from idryomov/wip-63341
Ilya Dryomov [Sat, 20 Jan 2024 17:43:35 +0000 (18:43 +0100)]
Merge pull request #55127 from idryomov/wip-63341

librbd: improve rbd_diff_iterate2() performance in fast-diff mode

Reviewed-by: Mykola Golub <mgolub@suse.com>
18 months agoPendingReleaseNotes: add rbd_diff_iterate2 note 55127/head
Ilya Dryomov [Sat, 20 Jan 2024 15:00:46 +0000 (16:00 +0100)]
PendingReleaseNotes: add rbd_diff_iterate2 note

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
18 months agolibrbd: try to preserve object map for diff-iterate in fast-diff mode
Ilya Dryomov [Sat, 6 Jan 2024 16:08:04 +0000 (17:08 +0100)]
librbd: try to preserve object map for diff-iterate in fast-diff mode

As an optimization, try to ensure that the object map for the end
version is preloaded through the acquisition of exclusive lock and
as a consequence remains around until exclusive lock is released.
If it's not around, DiffRequest would (re)load it on each call.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
18 months agolibrbd/object_map: potentially use in-memory object map in DiffRequest
Ilya Dryomov [Sat, 6 Jan 2024 16:05:39 +0000 (17:05 +0100)]
librbd/object_map: potentially use in-memory object map in DiffRequest

If the object map for the end version is around (already loaded in
memory, either due to the end version being a snapshot or due to
exclusive lock being held), use it to run diff-iterate against the
beginning of time.  Since it's the only object map needed in that
case, such calls would be satisfied locally.

Fixes: https://tracker.ceph.com/issues/63341
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
18 months agolibrbd/object_map: decouple object map processing in DiffRequest
Ilya Dryomov [Fri, 5 Jan 2024 12:15:54 +0000 (13:15 +0100)]
librbd/object_map: decouple object map processing in DiffRequest

In preparation for potentially using in-memory object map, decouple
object map processing from loading object maps and place the logic in
prepare_for_object_map() and process_object_map().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
18 months agocommon/bit_vector: fix iterator vs reference constness confusion
Ilya Dryomov [Fri, 5 Jan 2024 11:23:24 +0000 (12:23 +0100)]
common/bit_vector: fix iterator vs reference constness confusion

T (ConstIterator or Iterator) is confused with const T here:
IteratorImpl dereference operator is wrongly overloaded on const
and returns Reference instead of ConstReference for ConstIterator.
This then fails inside bufferlist bowels because Reference is
incompatible with bufferlist::const_iterator.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
18 months agolibrbd/object_map: make object map in handle_load_object_map() local
Ilya Dryomov [Thu, 4 Jan 2024 10:44:46 +0000 (11:44 +0100)]
librbd/object_map: make object map in handle_load_object_map() local

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
18 months agolibrbd/object_map: don't resize object map in handle_load_object_map()
Ilya Dryomov [Thu, 4 Jan 2024 10:39:20 +0000 (11:39 +0100)]
librbd/object_map: don't resize object map in handle_load_object_map()

Currently it's done in two cases:

- if the loaded object map is larger than expected based on byte size,
  it's truncated to expected number of objects
- in case of deep-copy, if the loaded object map is smaller than diff
  state, it's expanded to get "track the largest of all versions in the
  set" semantics

Both of these cases can be easily dealt with without modifying the
object map.  Being able to process a const object map is needed for
working on in-memory object map which is external to DiffRequest.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
18 months agocommon/bit_vector: fix IteratorImpl post-increment operator
Ilya Dryomov [Sat, 6 Jan 2024 11:22:35 +0000 (12:22 +0100)]
common/bit_vector: fix IteratorImpl post-increment operator

It's totally broken: instead of returning the current position and
moving to the next position, it returns the next position and doesn't
move anywhere.  Luckily it hasn't been used until now.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
18 months agolibrbd: drop DiffIterate::diff_object_map() declaration
Ilya Dryomov [Thu, 28 Dec 2023 09:52:11 +0000 (10:52 +0100)]
librbd: drop DiffIterate::diff_object_map() declaration

This is a leftover from commit 2b3a46801d39 ("librbd: switch
diff-iterate API to use new object-map diff helper").

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>