Kefu Chai [Tue, 13 Aug 2024 22:37:57 +0000 (06:37 +0800)]
ceph-volume: add "packaging" to install_requires
in 0985e201, "packaging" was introduced as a runtime dependency of
ceph-volume, and `ceph.spec.in` was updated accordingly to note
this new dependency. but the debian packaging was not updated.
in 80edcd40, the missing dependency was added to debian/control as
one of ceph-volume's runtime dependencies.
but dh_python3 is able to figure out the dependencies by reading
the egg's metadata of the ceph-volume python module. and as a
python project, ceph-volume is using its `setup.py` for
tracking its dependencies.
so in order to be more consistent, and to keep all of its dependencies
in one place, let's move this dependency to setup.py, as the
packaging tools in both distros are able to figure out the dependencies
from egg-info.
see also
- https://manpages.debian.org/testing/dh-python/dh_python3.1.en.html#dependencies
- https://docs.fedoraproject.org/en-US/packaging-guidelines/Python_201x/#_automatically_generated_dependencies
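For illustration, a minimal sketch of declaring the dependency in a setuptools-style `setup.py`. The dictionary below is a hypothetical excerpt, not the actual contents of ceph-volume's `setup.py`; in a real file these keywords would be passed to `setuptools.setup()`.

```python
# Hypothetical excerpt in the style of a setup.py (not the real ceph-volume file).
# In a real setup.py these keywords would be passed to setuptools.setup().
setup_kwargs = dict(
    name="ceph-volume",
    install_requires=[
        # Runtime dependency; it ends up in the egg-info metadata, which
        # the distro packaging tools can read to generate package dependencies.
        "packaging",
    ],
)
```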
ceph-volume: refactor device path handling for LVM lookups
This consolidates the conditional checks for device paths to
reduce redundancy and improve readability and adds logic to
handle both '/dev/mapper' and '/dev/dm-' paths uniformly by
introducing a utility function `get_lvm_mapper_path_from_dm()`.
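A rough sketch of the idea, not the actual implementation of `get_lvm_mapper_path_from_dm()`. The helper names, the `sysfs_root` parameter and the fallback behaviour are assumptions made for illustration; the sketch relies only on the Linux sysfs attribute `/sys/block/dm-N/dm/name`, which holds the device-mapper name.

```python
import os


def mapper_path_from_dm(dm_path, sysfs_root="/sys"):
    """Hypothetical helper: map '/dev/dm-N' to '/dev/mapper/<name>'.

    Returns an empty string if the sysfs name file cannot be read.
    """
    dm_name = os.path.basename(dm_path)  # e.g. 'dm-3'
    name_file = os.path.join(sysfs_root, "block", dm_name, "dm", "name")
    try:
        with open(name_file) as f:
            return os.path.join("/dev/mapper", f.read().strip())
    except OSError:
        return ""


def normalize_device_path(path, sysfs_root="/sys"):
    """Treat '/dev/mapper/...' and '/dev/dm-N' paths uniformly."""
    if path.startswith("/dev/dm-"):
        # Fall back to the original path if the lookup fails (an assumption).
        return mapper_path_from_dm(path, sysfs_root) or path
    return path
```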
Zac Dover [Wed, 7 Aug 2024 13:11:11 +0000 (23:11 +1000)]
doc/cephfs: add cache pressure information
Add information to doc/cephfs/cache-configuration.rst about how to deal
with a message that reads "clients failing to respond to cache
pressure". This procedure explains how to slow the growth of the
recall_caps value so that it does not exceed the
mds_recall_warning_threshold.
The information in this commit was developed by Eugen Block. See
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/5ROH5CWKKOEIQMVXOVRT5OO7CWK2HPM3/#J65DFUPP4BY57MICPANXKI7KAXSZ5Z5P
and https://www.spinics.net/lists/ceph-users/msg73188.html.
Fixes: https://tracker.ceph.com/issues/57115
Co-authored-by: Eugen Block <eblock@nde.ag>
Signed-off-by: Zac Dover <zac.dover@proton.me>
Michael J. Kidd [Fri, 19 Apr 2024 14:20:22 +0000 (07:20 -0700)]
PGMap: remove pool max_avail scale factor
The scaling of max_avail by the ratio of non-degraded to total objects
count results in the reported max_avail increasing proportionally to the
number of OSDs marked `down` but not `out`. This is counterintuitive
since OSDs going `down` should never result in more space being
available.
Removing the scale factor allows max_avail to remain unchanged until the
OSDs are marked `out`.
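The effect can be illustrated with a small arithmetic sketch. It assumes, based only on the description above, that max_avail is raw available space divided by a replication factor, and that the removed scale factor multiplied that divisor by the ratio of non-degraded to total object copies. All names and numbers are hypothetical.

```python
def max_avail(raw_avail, raw_used_rate, object_copies, objects_degraded, scale):
    """Illustrative model of pool max_avail (names are hypothetical)."""
    if scale and object_copies > 0:
        # The removed scale factor: shrinks the divisor when objects are degraded.
        raw_used_rate *= (object_copies - objects_degraded) / object_copies
    return raw_avail / raw_used_rate


# 3x replication, one third of the object copies degraded (OSDs down, not out).
scaled = max_avail(300.0, 3.0, 900, 300, scale=True)
unscaled = max_avail(300.0, 3.0, 900, 300, scale=False)
print(scaled, unscaled)  # the scaled value is larger, although no space was added
```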
Signed-off-by: Michael J. Kidd <linuxkidd@gmail.com>
Zac Dover [Thu, 8 Aug 2024 07:04:45 +0000 (17:04 +1000)]
doc/README.md - add "tip" alert styling
Add "tip" alert styling (what in DocBook XML is called "an admonition")
to information about Ninja in an ordered list (which is what Markdown
has here instead of procedures).
Adam Kupczyk [Thu, 1 Aug 2024 11:54:21 +0000 (11:54 +0000)]
os/bluestore: Write_v2 changes
1) moved stats and blobs update to Writer::do_write
2) preallocate space in Writer::_split_data
3) fixed Writer::_write_expand_l that could check one extent too many
Adam Kupczyk [Mon, 29 Jul 2024 13:32:58 +0000 (13:32 +0000)]
tests/bluestore_types: Fixed data generation for bluestore_blob_t::release_extents
The #1 and #2 elements could form a contiguous sequence but were still
not joined:
Expected equality of these values:
result
Which is: { 0x7b138000~48000, 0x883b0000~48000, 0xf0c10000~10000, 0x727b8000~38000 }
mid
Which is: { 0x7b138000~30000, 0x7b168000~18000, 0x883b0000~48000, 0xf0c10000~10000, 0x727b8000~38000 }
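To see why the two lists describe the same extents, here is a small sketch that joins adjacent extents. It assumes the `offset~length` notation uses hexadecimal for both fields; the function name is hypothetical and this is not the test's own code.

```python
def merge_contiguous(extents):
    """Join neighbouring (offset, length) extents that touch end to start."""
    merged = []
    for offset, length in extents:
        if merged and merged[-1][0] + merged[-1][1] == offset:
            # Previous extent ends exactly where this one begins: extend it.
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((offset, length))
    return merged


# The "mid" list from the failure output above.
mid = [
    (0x7B138000, 0x30000),
    (0x7B168000, 0x18000),  # starts at 0x7b138000 + 0x30000
    (0x883B0000, 0x48000),
    (0xF0C10000, 0x10000),
    (0x727B8000, 0x38000),
]
result = merge_contiguous(mid)
```

After merging, the first two elements of `mid` collapse into `0x7b138000~48000`, which matches the first element of `result` in the output.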
Adam Kupczyk [Thu, 25 Jul 2024 07:48:14 +0000 (07:48 +0000)]
os/bluestore: Add conf.bluestore_write_v2_random
Added conf.bluestore_write_v2_random. This is useful only for testing.
If set, it overrides the value of bluestore_write_v2 with a random
true/false selection.
It is useful for v1 / v2 compatibility testing.
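The override described above could be sketched as follows. This only illustrates the selection logic; the function name and the `rng` parameter are hypothetical and do not reflect how BlueStore reads its configuration.

```python
import random


def effective_write_v2(write_v2, write_v2_random, rng=random):
    """Return whether the v2 write path is used (illustrative only)."""
    if write_v2_random:
        # Testing mode: ignore the configured value and pick true/false at random.
        return rng.random() < 0.5
    return write_v2
```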
Adam Kupczyk [Thu, 25 Jul 2024 07:41:46 +0000 (07:41 +0000)]
os/bluestore: Writer, fix find_mutable_blob
1) Algorithm assumed that blob->blob_start() is aligned to csum size.
It is true for blobs created by write_v2, but write_v1 can generate
a blob like: begin = 0x9000, size = 0x6000, csum = 0x2000.
2) Blobs with unused were selected even if they needed to be expanded.
This is illegal since we cannot expand unused.
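The alignment problem in point 1 can be checked with simple arithmetic. The sketch below uses the values quoted above; the function name is hypothetical.

```python
def is_csum_aligned(blob_start, csum_chunk):
    """True if the blob start falls on a checksum-chunk boundary."""
    return blob_start % csum_chunk == 0


begin, size, csum = 0x9000, 0x6000, 0x2000

# 0x9000 is 0x1000 past the previous 0x2000 boundary (0x8000),
# so the assumption "blob_start is csum aligned" does not hold here.
misalignment = begin % csum
aligned = is_csum_aligned(begin, csum)
```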
Adam Kupczyk [Sat, 13 Jul 2024 18:08:16 +0000 (18:08 +0000)]
os/bluestore: Add Writer::_crop_allocs_to_io
Usually the data we put to disk is AU aligned.
In weird cases like AU=16K we put less data than we allocated.
_crop_allocs_to_io trims allocated extents into disk block extents
to reflect real IO.
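A simplified sketch of the cropping idea, assuming extents are `(offset, length)` pairs, that only the tail of the allocation is trimmed, and that the disk block size is 4 KiB. The function name, signature and block size are assumptions; the real `_crop_allocs_to_io` may differ.

```python
def crop_allocs_to_io(allocs, io_length, block_size=0x1000):
    """Trim allocated extents so that they cover only the blocks actually written."""
    # Round the IO length up to a whole number of disk blocks.
    need = -(-io_length // block_size) * block_size
    cropped = []
    for offset, length in allocs:
        if need == 0:
            break
        take = min(length, need)
        cropped.append((offset, take))
        need -= take
    return cropped


# AU = 16K, but only 4K of data is written: the IO covers one disk block.
io_extents = crop_allocs_to_io([(0x40000, 0x4000)], io_length=0x1000)
```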
Add new conf variable.
bluestore_write_v2 = true : use new _do_write_v2() function
bluestore_write_v2 = false : use legacy _do_write() function
This variable is read only at start time.
Adam Kupczyk [Fri, 5 Jan 2024 08:08:09 +0000 (08:08 +0000)]
os/bluestore: Introducing BlueStore::Writer
BlueStore::Writer is a toolkit that gives more options to control writes.
It gives more control over the compression process, letting the user of
the class manually split incoming data into blobs.
Now for large writes all but last blob can be fully filled with data.
There is now a single place that decides on deferred/direct.
Adam Kupczyk [Tue, 14 Nov 2023 16:25:01 +0000 (16:25 +0000)]
os/bluestore: Refactor of write path. New punch_hole_2 function.
Introducing new logic of Onode processing during write.
New punch_hole_2 function empties range, but keeps track of elements:
- allocations that are no longer used
- blobs that are now empty
- shared blobs that got modified
- statfs changes to apply later
This change allows allocations to be reused freely for deferred writes, which
means that in deferred mode we can use allocations in a blob other than the one
they came from.
Adam Kupczyk [Wed, 8 Nov 2023 13:37:15 +0000 (13:37 +0000)]
os/bluestore: New variant of bluestore_blob_t::release_extents
Created new variant of bluestore_blob_t::release_extents function.
Now the function takes range [offset~length] as an argument,
a simplification that allows it to have much better performance.
Created a comprehensive unit test that tests 40k random blobs.
The unit test does not test for a potential case of having
bluestore_blob_t.extents that are not allocation unit aligned.
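To illustrate what releasing a range `[offset~length]` means, here is a simplified model, not the actual `bluestore_blob_t::release_extents` code. It assumes a blob's extents are a list of `(disk_offset, length)` pairs laid out back to back in blob-logical space, with `None` as a hypothetical marker for an already released hole.

```python
INVALID = None  # hypothetical marker for a hole (already released space)


def release_range(extents, offset, length):
    """Release logical range [offset, offset+length); return (kept, released)."""
    end = offset + length
    pos = 0  # logical position of the current extent inside the blob
    kept, released = [], []
    for disk, ext_len in extents:
        lo = max(pos, offset)
        hi = min(pos + ext_len, end)
        if disk is INVALID or lo >= hi:
            kept.append((disk, ext_len))  # hole or untouched extent: keep as-is
        else:
            if lo > pos:
                kept.append((disk, lo - pos))  # part before the range
            released.append((disk + (lo - pos), hi - lo))
            kept.append((INVALID, hi - lo))  # released part becomes a hole
            if hi < pos + ext_len:
                kept.append((disk + (hi - pos), pos + ext_len - hi))  # part after
        pos += ext_len
    return kept, released


kept, released = release_range(
    [(0x10000, 0x8000), (0x40000, 0x8000)], offset=0x4000, length=0x8000
)
```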
Adam Kupczyk [Wed, 29 Nov 2023 11:55:44 +0000 (11:55 +0000)]
os/bluestore: Add improved printer for Onode
Added a nicer replacement for the dump_onode function.
Introduce a printer class that allows selecting which parts of an Onode are printed.
It severely reduces the amount of clutter in the output.
Usage:
using P = BlueStore::printer;
dout << blob->print(P::ptr + P::sdisk + P::schk + P::buf + P::attrs);
Adam Kupczyk [Wed, 18 Oct 2023 12:10:22 +0000 (12:10 +0000)]
os/bluestore: Add improved printer for Blob
Introduce a printer class that allows selecting which parts of a Blob are printed.
It severely reduces the amount of clutter in the output.
Usage:
using P = BlueStore::Blob::printer;
dout << blob->printer(P::ptr + P::sdisk + P::schk);