Zac Dover [Mon, 10 Feb 2025 08:12:34 +0000 (18:12 +1000)]
doc/cephadm: improve "Activate Existing OSDs".
Make three minor changes to doc/cephadm/services/osd.rst. These three
changes were suggested by Eugen Block, who reviewed this procedure after
developing it.
Co-authored-by: Eugen Block <eblock@nde.ag>
Signed-off-by: Zac Dover <zac.dover@proton.me>
John Mulligan [Fri, 7 Feb 2025 16:41:41 +0000 (11:41 -0500)]
cephadm: add cephadmlib to tox coverage environment
When using the `coverage` tox environment we want to see the coverage
for the majority of the cephadm code. There is now a lot of code in
cephadmlib and so it makes sense to extend the default coverage report
to include cephadmlib.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
`os::Transaction::decode_bp()` has only one user: `_setattrs()`
of `BlueStore`. It uses that for optimization purposes: keeping
a contiguous buffer instead of a potentially fragmented `bufferlist`
that would require a rectifying memcpy later.
The problem is that `_setattrs()` also needs to avoid keeping large
raw buffers with only a small subset being referenced. It achieves
this by copying the data if `bufferptr::is_partial()` returns
`true`. However, this means the memcpy happens virtually always,
as it's hard to even imagine that the `val`, decoded from the wire,
can fulfill the zero-waste requirement.
Therefore the optimization doesn't make sense; it only imposes
costs in terms of complexity, breaking the symmetry between encode
and decode in `os::Transaction` (there is no `encode_bp()`).
This commit kills the optimization and simplifies `os::Transaction`.
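The reasoning above can be illustrated with a small Python model. This is only a sketch and not Ceph's C++ code: `BufferPtr` and `keep_attr` are hypothetical names, and the model assumes that a pointer counts as partial whenever its window does not span the whole raw buffer.

```python
from dataclasses import dataclass


@dataclass
class BufferPtr:
    """Hypothetical stand-in for a buffer pointer: a window into a raw buffer."""
    raw: bytes    # the underlying raw allocation
    offset: int   # start of the referenced window
    length: int   # size of the referenced window

    def is_partial(self) -> bool:
        # Partial when the window does not cover the whole raw buffer.
        return self.offset > 0 or self.offset + self.length < len(self.raw)

    def view(self) -> bytes:
        return self.raw[self.offset:self.offset + self.length]


def keep_attr(ptr: BufferPtr) -> BufferPtr:
    """Copy the referenced bytes when the pointer pins a larger raw buffer."""
    if ptr.is_partial():
        data = ptr.view()  # stands in for the memcpy
        return BufferPtr(raw=data, offset=0, length=len(data))
    return ptr


# A value decoded from the wire is normally a slice of the message buffer,
# so it is partial and gets copied regardless of how it was decoded.
message = b"header" + b"attr-value" + b"trailer"
val = BufferPtr(raw=message, offset=6, length=10)
kept = keep_attr(val)
```

In this model the copy is taken for any value that shares its raw buffer with other message data, which is why decoding into a contiguous pointer first saves nothing.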
Zac Dover [Fri, 7 Feb 2025 01:32:20 +0000 (11:32 +1000)]
doc/cephadm: improve "Activate Existing OSDs"
Improve the section "Activate Existing OSDs".
Supplement the information in the "Activate Existing OSDs" section with
a procedure developed by Eugen Block, here:
https://heiterbiswolkig.blogs.nde.ag/2025/02/06/cephadm-activate-existing-osds/
This procedure explains how to activate OSDs on a host that, for
whatever reason, has had to have its operating system reinstalled.
* refs/pull/61562/head:
qa: remove redundant and broken test
mds: skip scrubbing damaged dirfrag
tools/cephfs/DataScan: test equality of link including frag
tools/cephfs/DataScan: skip linkages that have been removed
tools/cephfs/DataScan: do not error out when failing to read a dentry
tools/cephfs/DataScan: create all ancestors during scan_inodes
tools/cephfs/DataScan: cleanup debug prints
qa: remove old MovedDir test
qa: add data scan tests for ancestry rebuild
qa: make the directory non-empty to force migration
qa: avoid unnecessary mds restart
Ivo Almeida [Wed, 13 Nov 2024 12:16:23 +0000 (12:16 +0000)]
mgr/dashboard: fixed unit tests
* fixed unit tests due to upgrade to angular v18
* run npm fix in order to fix code style violations
* upgraded eslint/* packages' versions
* fixed eslint errors and warnings
Fixes: https://tracker.ceph.com/issues/68896
Signed-off-by: Ivo Almeida <ialmeida@redhat.com>
John Mulligan [Tue, 20 Aug 2024 19:01:05 +0000 (15:01 -0400)]
src/script: add a script to help build ceph using containers
The build-with-container script tries to encapsulate nearly all major
build tasks using docker/podman containers. If there's no build image
locally it will create one for you. It provides targets for building
(make), testing (make check), building rpm packages or deb packages and
is designed to be fairly easily extended.
View the comment at the top of the source file for usage details.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 20 Aug 2024 19:00:57 +0000 (15:00 -0400)]
build: add files needed to create a build container
A build container contains all the tools and dependencies needed to
build ceph. It provides a Container file and small script that
helps bootstrap the container setup. This script installs a few extra
things we need before farming most of the work out to install-deps.sh.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sat, 14 Sep 2024 10:31:23 +0000 (06:31 -0400)]
build: small script tweak to allow different build dirs
Move the mkdir line to allow for other build dir naming schemes outside
of what appears in the .gitignore file. A tiny bit of added flexibility
at little cost.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Mon, 14 Nov 2022 15:57:25 +0000 (10:57 -0500)]
src/script: add helper function has_build_dir
This function returns successfully if $BUILD_DIR exists and is valid.
This is a useful building block for automation around the build and
can be used to avoid re-running commands that fail if the build dir
exists already.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Fri, 31 Jan 2025 00:04:56 +0000 (19:04 -0500)]
cephadm: use get_container_image_stats in cephadm.py
Replace the existing get_container_stats_by_image_name with a version
from container_engines.py that returns the parsed results of the container
status command (as a ContainerInfo, None on error).
Fix up a few tests. Part of the test is somewhat pointless now as the
input and the output are both the same ContainerInfo, but this was never
a great test to test parsing anyway.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Fri, 31 Jan 2025 00:04:32 +0000 (19:04 -0500)]
cephadm: add parsed_container_image_stats to container_engines
Add a new function that combines the call and parse operations that
exist in cephadm.py (currently get_container_stats_by_image_name and
related parsing code). This will be used in a future commit to replace
that code and reduce the size of cephadm.py.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 30 Jan 2025 23:49:01 +0000 (18:49 -0500)]
cephadm: replace get_container_stats in cephadm.py
Replace the existing get_container_stats with a version from
container_types.py that returns the parsed results of the
container status command (as a ContainerInfo, None on error).
Fix up a bunch of tests.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 30 Jan 2025 23:48:42 +0000 (18:48 -0500)]
cephadm: add get_container_stats to container_types
We're in the process of trying to remove get_container_stats from
cephadm.py. Most of the generic logic has been moved to
container_engines.py. However, the remaining parts of the current
get_container_stats rely on classes from container_types.py, so add a
future replacement for the existing get_container_stats to
container_types.py.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 30 Jan 2025 23:48:26 +0000 (18:48 -0500)]
cephadm: add parsed_container_stats to container_engines
Add a new function that combines the call and parse operations that
exist in cephadm.py (currently get_container_stats and related parsing
code). This will be used in a future commit to replace that code and
reduce the size of cephadm.py.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
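A combined call-and-parse helper of the kind described above might look roughly like the following sketch. It is not the cephadm implementation: the signature, the injectable `runner` parameter, the `run_command` helper, and the reduced `ContainerInfo` field set are all assumptions made for illustration.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class ContainerInfo:
    # Simplified, hypothetical field set for illustration only.
    container_id: str
    image_name: str
    image_id: str


def run_command(cmd: List[str]) -> Tuple[str, int]:
    """Run a command and return (stdout, return code)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.stdout, proc.returncode


def parsed_container_stats(
    engine: str,
    container_name: str,
    runner: Callable[[List[str]], Tuple[str, int]] = run_command,
) -> Optional[ContainerInfo]:
    """Call the engine's inspect command and parse its output in one step."""
    cmd = [engine, 'inspect', '--format',
           '{{.Id}},{{.Config.Image}},{{.Image}}', container_name]
    out, code = runner(cmd)
    if code != 0:
        return None  # None on error
    fields = out.strip().split(',')
    if len(fields) != 3:
        return None  # unexpected output shape
    return ContainerInfo(*fields)
```

Passing the runner in as a parameter lets tests supply canned output instead of invoking podman or docker.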
John Mulligan [Thu, 30 Jan 2025 22:54:22 +0000 (17:54 -0500)]
cephadm: move ContainerInfo class to container_engines.py
Move the ContainerInfo class, which has basically no dependencies, to
the container_engines module, which is the current home for generic
*low-level* container things.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Omid Yoosefi [Wed, 29 Jan 2025 16:33:49 +0000 (11:33 -0500)]
pybind/mgr/cephadm: fix issue with multiple nfs clusters on the same port
Currently, even if ingress and virtual_ip are used, the port_in_use
check in cephadm fails when the same port is used. This PR fixes
the issue by ensuring that the daemon spec includes the virtual
IP address of the endpoint to be checked, so that the check
doesn't default to checking `0.0.0.0:port` only.
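The difference between checking a specific address and checking the wildcard address can be sketched as follows. This is a minimal illustration, not the cephadm code: the function bodies and the `endpoint_to_check` helper are assumptions, and the real check may set socket options that this sketch omits.

```python
import socket
from typing import Optional, Tuple


def port_in_use(ip: str, port: int) -> bool:
    """Return True if a TCP socket cannot be bound to ip:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((ip, port))
        except OSError:
            return True
    return False


def endpoint_to_check(port: int,
                      virtual_ip: Optional[str] = None) -> Tuple[str, int]:
    # Fall back to the wildcard address only when no virtual IP is known.
    return (virtual_ip or '0.0.0.0', port)
```

Binding the wildcard address conflicts with a listener on any local address, so two services that share a port on different virtual IPs would be reported as colliding unless the check is given the specific address.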
Nizamudeen A [Fri, 22 Nov 2024 12:54:42 +0000 (18:24 +0530)]
mgr/dashboard: fix host form issues
Addressing the review comments in https://github.com/ceph/ceph/pull/60355#pullrequestreview-2391335490
Issues fixed:
- cluster expansion host form not closing form when submitting it
- cluster expansion host form changing the location when closing it
- put the helper text inside helper component since it's longer
- maintenance field should be hidden in cluster expansion form since
it's not a valid option while expanding a cluster. We already add host
in _no_schedule
Fixes: https://tracker.ceph.com/issues/69020
Signed-off-by: Nizamudeen A <nia@redhat.com>
Adam King [Wed, 29 Jan 2025 20:48:53 +0000 (15:48 -0500)]
mgr/cephadm: continue in nfs service purge if grace file is already deleted
The test_nfs task we run in teuthology creates and removes a number of
nfs clusters during the task. I think it's possible based on timing for
it to end up in a situation where it tries to remove an nfs service before
the grace file has been created. In that case, cephadm doesn't know it
hasn't created the grace file and just repeatedly fails forever attempting
to remove the nonexistent file. This patch adds handling for the error
case where we get a nonzero rc but the error message implies the command
failed because the file already does not exist.
Fixes: https://tracker.ceph.com/issues/69736
Signed-off-by: Adam King <adking@redhat.com>
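The error handling described above can be sketched like this. It is an illustration under assumptions, not the patch itself: the function name is hypothetical, and the exact error text that is matched is a guess at what a "file does not exist" failure looks like.

```python
def grace_file_removed(rc: int, err: str) -> bool:
    """Decide whether a grace-file removal attempt can be treated as done."""
    if rc == 0:
        return True  # the removal command succeeded
    # Assumption: a missing file is reported with this message. A file that
    # is already gone leaves nothing to remove, so the purge can continue.
    return 'No such file or directory' in err
```

Any other nonzero return code still counts as a failure, so genuine errors are not silently ignored.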
Vallari Agrawal [Tue, 4 Feb 2025 07:50:18 +0000 (13:20 +0530)]
qa/suites/nvmeof: use SCALING_DELAYS: '120'
Increase delays for qa/workunits/nvmeof/scalability_test.sh
as namespace rebalancing takes more time. After upscaling, a
gateway could initially be 'CREATED', which is a valid state during
gateway initialization, but the state should then progress
to 'AVAILABLE' within a couple of seconds.