As ScrubResources is no longer involved in remote reservations, some
of the data listed by 'dump_scrub_reservations' is now collected by
OsdScrub itself (prior to this change, OsdScrub just forwarded the
request to ScrubResources).
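(For reference, this dump is typically fetched through the OSD admin socket, e.g. `ceph daemon osd.0 dump_scrub_reservations`; the exact invocation is an assumption here, not part of this change.)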
Josh Salomon [Wed, 24 Jan 2024 12:40:53 +0000 (14:40 +0200)]
osd: Add score for read balance osd size aware policy
This score works for pools in which the read_ratio
value is set.
Current limitations:
- This mechanism ignores osd read affinity
- There is a plan to add support for read affinity 0
in the next version.
- This mechanism works only when all PGs are full
- If read_ratio is not set, the existing mechanism (named
fair score) is used.
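Schematically (a guess at the shape, not the actual formula), the new score compares each OSD's projected load share with its capacity share, something like `score = max(load[i] / capacity_share[i] for i in osds)`, where `load[i]` mixes reads (served by primaries) and writes (served by all replicas) according to the pool's read_ratio; a score of 1.0 would mean reads are perfectly balanced for the given OSD sizes.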
Josh Salomon [Tue, 16 Jan 2024 18:33:47 +0000 (20:33 +0200)]
osd: Read balancer for OSDs with different sizes
This commit adds a calculation of the desired primary distribution which
takes the osd size into account. This way smaller OSDs can take more
read operations (by holding more primaries), the larger OSDs take fewer
primaries, and the load the cluster can sustain increases. (Under some
conditions this feature partially offsets the weakest-link-in-the-chain
effect.) In order to calculate the loads correctly there is a need to
know the read/write ratio of the pool, and this commit assumes the
read_ratio parameter is available for the pool.
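A rough sketch of the load model (illustrative names only; the real calculation lives in the OSDMap code and differs in detail): writes land on every replica while reads land only on the primary, so given the pool's read_ratio one can solve for the per-OSD primary count that makes total load proportional to OSD size:
```
# Sketch only: assumes 0 < read_ratio <= 1 and that replica placement is
# already fixed by CRUSH; all names here are illustrative.
def desired_primaries(replicas_per_osd, size_per_osd, read_ratio, total_pgs):
    r = read_ratio
    total_load = (1 - r) * sum(replicas_per_osd) + r * total_pgs
    total_size = sum(size_per_osd)
    desired = []
    for reps, size in zip(replicas_per_osd, size_per_osd):
        # Target load proportional to size; the write load is fixed by the
        # replica count, so the remaining (read) load sets the primaries.
        target = total_load * size / total_size
        desired.append(max((target - (1 - r) * reps) / r, 0.0))
    # Renormalize so the primary counts sum to the pool's PG count.
    scale = total_pgs / sum(desired)
    return [p * scale for p in desired]
```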
Josh Salomon [Tue, 26 Dec 2023 08:41:18 +0000 (10:41 +0200)]
osd: Add 'read_ratio' pool parameter
This parameter is used for better read balancing with non-identical
devices.
- This parameter is controlled using the commands 'ceph osd pool set/get'
- This parameter is applicable only to replicated pools
- Valid values are integers in the range [0..100] and represent the
percentage of read IOs out of all IOs in the pool
- A value of 0 unsets this parameter, so the default value is used
(this is the generic behavior of the command 'ceph osd pool set')
- The default value can be set by the config parameter
`osd_pool_default_read_ratio`
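For example (the pool name is hypothetical), a pool that serves roughly 70% reads would be configured with:
`ceph osd pool set mypool read_ratio 70`
`ceph osd pool get mypool read_ratio`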
Venky Shankar [Tue, 30 Jan 2024 07:40:19 +0000 (13:10 +0530)]
Merge PR #52652 into main
* refs/pull/52652/head:
PendingReleaseNotes: add note about new mdlog trimming configurations
mds: drive mdlog trimming via a separate thread
mds: allow runtime modification of mdlog trimming configuration
mds: remove a bunch of heuristics from MDLog::trim()
mds: add mdlog trimming threshold and decay counter
mds: remove a bunch of heuristics from MDLog::trim()
These were probably introduced to work around some sort of
resource overuse by the MDS during trimming, but now it
looks like they are not really needed, especially if we
introduce a dedicated thread for log trimming.
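A minimal sketch of the dedicated-thread idea (the real MDLog is C++, the names below are illustrative, and the decay counter is omitted for brevity):
```
import threading
import time

def mdlog_trim_loop(mdlog, threshold, interval, stop_event):
    # Wake periodically and trim whenever the journal exceeds the
    # configured threshold, instead of relying on heuristics scattered
    # through other MDS code paths.
    while not stop_event.is_set():
        if mdlog.num_segments() > threshold:
            mdlog.trim_to(threshold)
        time.sleep(interval)

# stop = threading.Event()
# threading.Thread(target=mdlog_trim_loop, args=(mdlog, 128, 1.0, stop),
#                  daemon=True).start()
```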
Venky Shankar [Thu, 25 Jan 2024 09:32:33 +0000 (15:02 +0530)]
qa: remove error string checks and check w/ return value
I ran into this failure once #54972 was merged. The test is validating
the error string returned due to the failed mount. There aren't any
return value checks - which is a _more_ important check. Generic error
string checks will fail once a (error) string is changed (typo, etc..).
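The principle, as a tiny sketch (mount_client is a hypothetical helper, not the actual qa harness API):
```
def test_mount_fails(mount_client, config):
    retval, stderr = mount_client(config)
    assert retval != 0          # robust: survives error-message rewording
    # Fragile check this change removes:
    # assert "expected error text" in stderr
```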
Venky Shankar [Tue, 30 Jan 2024 04:33:57 +0000 (10:03 +0530)]
Merge PR #54808 into main
* refs/pull/54808/head:
client: fix copying bufferlist to iovec structures in Client::_read
src/test: test sync call providing nullptr as ctx to async api
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Frank S. Filz <ffilzlnx@mindspring.com>
- Adds support for setting bucket policies through the Dashboard.
- Renames the rgw bucket policy from 'policy' to 'bucket policy' and the tab 'Permissions' to 'Policies'
- Fix: hides Tags when none are present in the bucket list details and sets the bucket form dirty after deleting a tag
- Adds a service to manage the formatting of a textArea that works with JSON
Signed-off-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
Fixes: https://tracker.ceph.com/issues/63942
Venky Shankar [Mon, 29 Jan 2024 13:12:36 +0000 (18:42 +0530)]
Merge PR #53734 into main
* refs/pull/53734/head:
qa: refactor client upgrade yamls and other minor touchups
qa/upgrade/nofs: upgrade pacific->reef
qa/upgrade/upgraded_client: upgrade nautilus->pacific and pacific->reef
luo rixin [Mon, 29 Jan 2024 11:25:00 +0000 (19:25 +0800)]
script/run-make: install lvm2 for make check cephadm test
The make check test `run-tox-cephadm` reports an error:
```
if errors:
> raise Error('\nERROR: '.join(errors))
E cephadmlib.exceptions.Error: lvcreate binary does not appear to be installed
cephadm.py:4434: Error
```
So let's install lvm2 for the make check cephadm test.
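(Schematically, that means adding lvm2 to the packages the script installs, e.g. `apt-get install -y lvm2` or `dnf install -y lvm2` depending on the distro; the exact install hook is an assumption here.)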
Fixes: https://tracker.ceph.com/issues/64122
Signed-off-by: luo rixin <luorixin@huawei.com>
Afreen [Mon, 29 Jan 2024 10:12:10 +0000 (15:42 +0530)]
mgr/dashboard: Create subvol of same name in different group
Fixes https://tracker.ceph.com/issues/64112
Issue:
Currently, we are unable to create a subvolume of the same name in a
different subvolume group.
Fix:
We were validating only the filesystem name of the subvolume,
which prevented creating a subvolume of the same name in another group.
Added more granularity by also taking the subvolume group name into
account.
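The validation idea, as a small sketch (illustrative Python; the Dashboard's real validator is TypeScript):
```
# Uniqueness must be keyed by (filesystem, group, subvolume), not by the
# filesystem name alone, so equal names in different groups are allowed.
def subvolume_exists(existing, fs_name, group_name, subvol_name):
    return (fs_name, group_name, subvol_name) in existing

existing = {("cephfs", "groupA", "sv1")}
assert subvolume_exists(existing, "cephfs", "groupA", "sv1")
assert not subvolume_exists(existing, "cephfs", "groupB", "sv1")  # now allowed
```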
Laura Flores [Mon, 29 Jan 2024 00:58:25 +0000 (00:58 +0000)]
mgr: pin pytest to version 7.4.4
On 2024-01-27, pytest was updated to 8.0.0,
which broke run-tox-mgr. See:
https://docs.pytest.org/en/stable/changelog.html
```
==================================== ERRORS ====================================
_____________________ ERROR collecting alerts/__init__.py ______________________
alerts/__init__.py:2: in <module>
    from .module import Alerts
alerts/module.py:6: in <module>
    from mgr_module import CLIReadCommand, HandleCommandResult, MgrModule, Option
mgr_module.py:1: in <module>
    import ceph_module # noqa
E   ModuleNotFoundError: No module named 'ceph_module'
______________________ ERROR collecting alerts/module.py _______________________
alerts/module.py:6: in <module>
    from mgr_module import CLIReadCommand, HandleCommandResult, MgrModule, Option
mgr_module.py:1: in <module>
    import ceph_module # noqa
E   ModuleNotFoundError: No module named 'ceph_module'
____________________ ERROR collecting balancer/__init__.py _____________________
balancer/__init__.py:2: in <module>
    from .module import Module
balancer/module.py:12: in <module>
    from mgr_module import CLIReadCommand, CLICommand, CommandResult, MgrModule, Option, OSDMap, CephReleases
mgr_module.py:1: in <module>
    import ceph_module # noqa
E   ModuleNotFoundError: No module named 'ceph_module'
_____________________ ERROR collecting balancer/module.py ______________________
balancer/module.py:12: in <module>
    from mgr_module import CLIReadCommand, CLICommand, CommandResult, MgrModule, Option, OSDMap, CephReleases
mgr_module.py:1: in <module>
    import ceph_module # noqa
E   ModuleNotFoundError: No module named 'ceph_module'
```
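In pip-requirements terms the pin is simply `pytest==7.4.4` (the exact requirements file used by run-tox-mgr is not shown here).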
Fixes: https://tracker.ceph.com/issues/64200
Signed-off-by: Laura Flores <lflores@ibm.com>
Laura Flores [Fri, 26 Jan 2024 17:32:43 +0000 (17:32 +0000)]
osd: clear out unneeded pending pg-upmap-primary mappings
If the score did not improve, we should clear out any
pending pg-upmap-primary mappings so they don't execute
in situations where the same incremental is used to balance
multiple pools (i.e. in the balancer mgr module).
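The gist, as a hedged sketch (the field and helper names are invented; the real logic lives in the balancer module and the OSDMap bindings):
```
def balance_reads(inc, pool, prev_score, optimize_reads):
    # optimize_reads stages pg-upmap-primary mappings on the shared
    # incremental 'inc' and returns the resulting score (lower is better).
    new_score = optimize_reads(inc, pool)
    if new_score >= prev_score:
        # No improvement: clear the staged mappings so reusing the same
        # incremental for the next pool does not execute them by accident.
        inc.pending_pg_upmap_primary.clear()
        return prev_score
    return new_score
```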
Laura Flores [Tue, 2 Jan 2024 21:28:03 +0000 (21:28 +0000)]
mgr/balancer: add pg_upmap_primaries to `balancer status detail`
Follow-up to https://github.com/ceph/ceph/pull/54801/commits/8a5553597ca6a428cb8ffc9fc5bebde048fbd068.
Streamlines some of the logic so pg upmap activity is properly
initialized and updated in offline mode as well as online.
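The detail view is queried with:
`ceph balancer status detail`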
Laura Flores [Thu, 18 Jan 2024 18:57:24 +0000 (18:57 +0000)]
mgr: add read balancer support inside the balancer module
Read balancing may now be managed automatically via the balancer
manager module. Users may choose between two new modes: ``upmap-read``, which
offers upmap and read optimization simultaneously, or ``read``, which may be used
to optimize reads only. Existing balancer commands have also been extended to
include more information about read balancing.
Run the following commands to test the new automatic behavior:
`ceph balancer on` (on by default)
`ceph balancer mode <read|upmap-read>`
`ceph balancer status`
Run the following commands to test the new supervised behavior:
`ceph balancer off`
`ceph balancer mode <read|upmap-read>`
`ceph balancer eval` | `ceph balancer eval <pool-name>`
`ceph balancer eval-verbose` | `ceph balancer eval-verbose <pool-name>`
`ceph balancer optimize <plan-name>`
`ceph balancer show <plan-name>`
`ceph balancer eval <plan-name>`
`ceph balancer execute <plan-name>`
In the balancer module, there is also a new "self_test" function which tests
the module's basic functionality. This test can be triggered with the following
commands:
`ceph mgr module enable selftest`
`ceph mgr self-test module balancer`
Related Trello: https://trello.com/c/sWoKctzL/859-add-read-balancer-support-inside-the-balancer-module
Signed-off-by: Laura Flores <lflores@ibm.com>
Ronen Friedman [Fri, 5 Jan 2024 15:07:19 +0000 (09:07 -0600)]
osd/scrub: add "queue my request" flag to replica reservation messages
Up-to-date primaries will set this flag when sending a reservation
request. The replica OSD, if too busy to handle the request immediately,
will queue it until the number of concurrent reservations falls below
the configured limit. The queued requests are honored in FIFO order.
Old primaries will not set this flag, and will receive the expected
grant or deny reply immediately.
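A schematic of the replica-side behavior described above (all names are invented for illustration; this is not the actual OSD code, which is C++):
```
from collections import deque

class ReplicaReservations:
    def __init__(self, limit, grant, deny):
        self.limit = limit        # concurrency cap for scrub reservations
        self.grant = grant        # callback sending the grant reply
        self.deny = deny          # callback sending the deny reply
        self.active = 0
        self.queue = deque()      # FIFO of requests that set the flag

    def handle_request(self, req, can_queue):
        if self.active < self.limit:
            self.active += 1
            self.grant(req)
        elif can_queue:           # flag set only by up-to-date primaries
            self.queue.append(req)
        else:
            self.deny(req)        # old primaries get an immediate reply

    def release(self):
        self.active -= 1
        if self.queue:            # promote the oldest queued request
            self.active += 1
            self.grant(self.queue.popleft())
```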
ceph-volume: fix partitions support in disk.get_devices()
The following:
```
is_part = get_file_contents(os.path.join(_sys_dev_block_path, item, 'partition')) == "1"
```
assumes that any `/sys/dev/block/x:y/partition` contains '1', which is wrong.
This file actually contains the corresponding partition number.
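A plausible shape of the fix (a sketch, not necessarily the merged code): since the `partition` attribute holds the partition number and only exists for partitions, test for its presence rather than its content:
```
import os

# The 'partition' sysfs attribute only exists for partitions, and holds
# the partition number (1, 2, ...), not always "1".
part_path = os.path.join(_sys_dev_block_path, item, 'partition')
is_part = os.path.exists(part_path)
```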