Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/nvmeof-namespaces-form/nvmeof-namespaces-form.component.ts
src/pybind/mgr/dashboard/frontend/src/app/shared/api/nvmeof.service.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/shared/api/nvmeof.service.ts
- restore it to the request type where groups is not present
workunit/dencoder: fix corpus test for backword and forward compability
- changed the check for non-deterministic, return code 1 is also legit
- unneeded check for is_dir, if it exist
- limit the number of threads to prevent error
Fixes: https://tracker.ceph.com/issues/67263 Signed-off-by: NitzanMordhai <nmordech@redhat.com>
(cherry picked from commit 30921272ddee5e7c8aaf4bdb8d69645ce92ba379)
Dan Mick [Thu, 27 Feb 2025 00:16:26 +0000 (16:16 -0800)]
container/build.sh: remove local container images
Optionally, for those that want to run build.sh locally and
use the images. The default is to remove, for Jenkins builders,
which will build, push, and rmi.
Zac Dover [Mon, 3 Feb 2025 13:37:34 +0000 (23:37 +1000)]
doc/rados: improve pg_num/pgp_num info
Improve the guidance around setting pg_num, and clear up confusion
around whether pgp_num should be set manually or, indeed, if it even can
be set manually.
This PR was raised in response to Mark Schouten's email here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/CBDJTLTTIEZVG7GVZBX37UAWGYNSSMPD/
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit c43e7337212fe38e8db63d00345fa9858b3cb10a)
Zac Dover [Tue, 25 Feb 2025 04:57:11 +0000 (14:57 +1000)]
doc/releases: correct squid release order
Put the releases of Squid in descending order. This change alters the
order of the Squid releases so that it is the same as the order of the
other Ceph releases.
ceph-volume: migrate unit tests from 'mock' to 'unittest.mock'
unit tests in ceph-volume was still using the external 'mock' library
for unit tests, which is unnecessary since 'unittest.mock' is part
of the Python standard library (available since Python 3.3).
This commit updates all imports to use 'unittest.mock' instead,
ensuring better maintainability and removing the need for an extra
dependency.
This refactors `get_physical_osds()`.
The calculation of `data_slots` is now more concise. The handling of
`dev_size`, `rel_data_size`, and `abs_size` is standardized.
The initialization of `free_size` is moved outside the loop
for clarity. Redundant checks and assignments are removed to simplify
the code.
ceph-volume: support splitting db even on collocated scenario
This change enables ceph-volume to create OSDs where the DB is
explicitly placed on a separate LVM partition, even in collocated
scenarios (i.e., block and DB on the same device).
This helps mitigate BlueStore fragmentation issues.
Given that ceph-volume can't automatically predict a proper default size for the db device,
the idea is to use the `--block-db-size` parameter:
Passing `--block-db-size` and `--db-devices` makes ceph-volume create db devices
on dedicated devices (current implementation):
```
Total OSDs: 2
Type Path LV Size % of device
----------------------------------------------------------------------------------------------------
data /dev/vdb 200.00 GB 100.00%
block_db /dev/vdd 4.00 GB 2.00%
----------------------------------------------------------------------------------------------------
data /dev/vdc 200.00 GB 100.00%
block_db /dev/vdd 4.00 GB 2.00%
```
Passing `--block-db-size` without `--db-devices` makes ceph-volume create a separate
LV for db device on the same device (new behavior):
```
Total OSDs: 2
Type Path LV Size % of device
----------------------------------------------------------------------------------------------------
data /dev/vdb 196.00 GB 98.00%
block_db /dev/vdb 4.00 GB 2.00%
----------------------------------------------------------------------------------------------------
data /dev/vdc 196.00 GB 98.00%
block_db /dev/vdc 4.00 GB 2.00%
```
This new behavior is supported with the `--osds-per-device` parameter:
```
Total OSDs: 4
Type Path LV Size % of device
----------------------------------------------------------------------------------------------------
data /dev/vdb 96.00 GB 48.00%
block_db /dev/vdb 4.00 GB 2.00%
----------------------------------------------------------------------------------------------------
data /dev/vdb 96.00 GB 48.00%
block_db /dev/vdb 4.00 GB 2.00%
----------------------------------------------------------------------------------------------------
data /dev/vdc 96.00 GB 48.00%
block_db /dev/vdc 4.00 GB 2.00%
----------------------------------------------------------------------------------------------------
data /dev/vdc 96.00 GB 48.00%
block_db /dev/vdc 4.00 GB 2.00%
```
This adds Python type annotations to `ceph_volume.util.device`,
along with all necessary adjustments to ensure compatibility
and maintain code clarity.
ceph-volume: set default value for BlueStore.block_lv to None
This change updates the `BlueStore` class in
`ceph_volume.objectstore` by initializing the `block_lv` attribute
to `None` with the type `Optional[Volume]`. This ensures that the
attribute has a default value and avoids potential runtime errors
when the attribute is accessed before being explicitly assigned.
ceph-volume: improve wipefs retry logic in lvm.zap
- Simplify the initialization of `tries` and `interval` variables for clarity.
- Adjust the retry logic in the `wipefs` function to:
- Include the attempt count in the warning message for better debugging.
- Start the retry loop at 1 and increment up to `tries`.
- Remove unnecessary unpacking of `stdout` and `stderr` since they were unused.
- Update the loop to increment `tries` by 1 to reflect the intended number of attempts.
This change improves code readability and makes retry behavior more transparent.
Ilya Dryomov [Tue, 18 Feb 2025 16:51:47 +0000 (17:51 +0100)]
test/rbd_mirror: clear Namespace::s_instance at the end of a test
TestMockPoolReplayer.Namespaces and NamespacesError tests leave behind
a dangling pointer to a stack-allocated MockNamespace which leads to an
easily reproducible use-after-free and segfault when tests are shuffled.
Ilya Dryomov [Mon, 17 Feb 2025 11:41:51 +0000 (12:41 +0100)]
test/rbd_mirror: flush watch/notify callbacks in TestImageReplayer
TestImageReplayer establishes its own (i.e. outside of the SUT code)
watch on the header of the remote image to be able to synchronize the
execution of the test with certain notifications. This watch is
established before the remote image is opened and is teared down until
after the remote image is closed but while the image replayer is still
running. The flush that is part of image close sequence thus isn't
guaranteed to cover all callbacks, especially for snapshot-based
mirroring where UnlinkPeerRequest spawned from Replayer::unlink_peer()
generates a notification on the remote image for each completed unlink.
Since TestImageReplayer further immediately deletes C_WatchCtx, pretty
much any test can segfault when C_WatchCtx::handle_notify() is invoked
by TestWatchNotify infrastructure. Because it's a virtual method, the
segfault often involves a completely bogus instruction pointer:
Kefu Chai [Sun, 24 Mar 2024 10:41:40 +0000 (18:41 +0800)]
squid: crush: use std::vector instead of variable length arrays
despite that variable length arrays (VLA for short) has been around
for a long time, it is an extension supported by GCC and Clang, and is
not a part of C++ standard, its implementation allocates the dynamically
sized array on stack, hence is a source of potential stack overflow.
when compiling with Clang, it complains. so in this change, we switch
to std::vector<>, which is defined by the C++ standard, and it allocates
the storage on heap, so it is immune to the possible stack overflow problem.
```
/home/kefu/dev/ceph/src/crush/CrushWrapper.h:1613:16: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
1613 | int rawout[maxout];
| ^~~~~~
/home/kefu/dev/ceph/src/crush/CrushWrapper.h:1613:16: note: function parameter 'maxout' with unknown value cannot be used in a constant expression
/home/kefu/dev/ceph/src/crush/CrushWrapper.h:1610:60: note: declared here
1610 | void do_rule(int rule, int x, std::vector<int>& out, int maxout,
| ^
/home/kefu/dev/ceph/src/crush/CrushWrapper.h:1614:15: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
1614 | char work[crush_work_size(crush, maxout)];
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/kefu/dev/ceph/src/crush/CrushWrapper.h:1614:31: note: implicit use of 'this' pointer is only allowed within the evaluation of a call to a 'constexpr' member function
1614 | char work[crush_work_size(crush, maxout)];
| ^
2 warnings generated.
```
Naman Munet [Wed, 22 Jan 2025 10:59:20 +0000 (16:29 +0530)]
mgr/dashboard: Add confirmation textbox for resource name on delete action
Before:
=====
User was able to delete a single or multiple critical resources like ( images, snapshots, subvolumes, subvolume-groups, pools, hosts , OSDs, buckets, file system, services ) by just clicking on a checkbox.
After:
=====
User now has to type the resource name that they are deleting in the textbox on the delete modal, and then only they will be able to delete the critical resource.
Also from now onwards multiple selection for deletions of critical resources is not possible. Hence, user can delete only single resource at a time. On the other side, non-critical resources can be deleted in one go.
Adam Kupczyk [Fri, 20 Dec 2024 20:21:14 +0000 (20:21 +0000)]
os/bluestore: Add health warning for bluestore fragmentation
Changed "bluestore/fragmentation_micros" from quick imprecise to
slow but more representative score.
Introduced config "bluestore_warn_on_free_fragmentation" that controls
when free space fragmentation score becomes a health warning.
Currently calculation of fragmentation score might be non-instant for
severly fragmented disks. It might induce stalls to write IO.
Config value "bluestore_fragmentation_check_period" control score
calculation period.
In future, costly score calculation will be replaced with method that
continously updates score.
Improve the grammar and correct the formatting of the "Upgrading root ca
certificates" procedure that was added to the documentation in https://github.com/ceph/ceph/pull/61867
Matan Breizman [Tue, 18 Feb 2025 10:40:24 +0000 (12:40 +0200)]
script/lib-build: Use clang 14
Updating to newer clang requires multiple fixes.
Don't use newer clang than 14. If needed, we could backport
the fixes from [1] and then use newer releases.
Laura Flores [Mon, 17 Feb 2025 22:53:56 +0000 (22:53 +0000)]
qa/tasks: improve ignorelist for thrashing OSDs
This yaml file is used in rados/thrash-old-clients.
In this commit, I added some pattern matching for
warnings that show up in the cluster log detail that
are related to degraded PGs. In these tests, we are intentionally
marking down or killing OSDs, which leads to these states
showing up in the cluster log. So, the warnings are intended
and can be ignored in the context of OSD thrashing.
This directly aligns the squid ignorelist with main's, rather than
just a direct backport of
https://github.com/ceph/ceph/pull/61723/commits/ec50fc720c4fc0bdd3165d0014f19c36e9819012.
Fixes: https://tracker.ceph.com/issues/69962 Signed-off-by: Laura Flores <lflores@ibm.com>
Ronen Friedman [Thu, 30 Jan 2025 09:27:58 +0000 (03:27 -0600)]
osd/scrub: discard repair_oinfo_oid()
repair_oinfo_oid(), called every scrub, has a very specific
functionality: fix the object ID specified in the Object Info
attribute, if different from the ID of the owning object.
This fix was added in 2017, as a response to a unique failure
scenario that was observed in Sepia - probably following a
filesystem bug. See https://tracker.ceph.com/issues/18409 &
https://tracker.ceph.com/issues/20471.
The limited functionality of repair_oinfo_oid() -
only repairing this one specific issue, and only if the OI_ATTR
exists and is decodable - does not justify the overhead of
running it every scrub.
Zac Dover [Mon, 10 Feb 2025 08:12:34 +0000 (18:12 +1000)]
doc/cephadm: improve "Activate Existing OSDs".
Make three minor changes to doc/cephadm/services/osd.rst. These three
changes were suggested by Eugen Block, who reviewed this procedure after
developing it.