Zac Dover [Mon, 10 Feb 2025 08:12:34 +0000 (18:12 +1000)]
doc/cephadm: improve "Activate Existing OSDs".
Make three minor changes to doc/cephadm/services/osd.rst. These three
changes were suggested by Eugen Block, who reviewed this procedure after
developing it.
Co-authored-by: Eugen Block <eblock@nde.ag> Signed-off-by: Zac Dover <zac.dover@proton.me>
`os::Transaction::decode_bp()` has only one user: `_setattrs()`
of `BlueStore`. It uses that for optimization purposes: keeping
a contiguous buffer instead of a potentially fragmented `bufferlist`
that would require a rectifying memcpy later.
The problem is that `_setattrs()` also needs to avoid keeping large
raw buffers with only a small subset being referenced. It achieves
this by copying the data if `bufferptr::is_partial()` returns
`true`. However, this means the memcpy happens virtually always,
as it's hard to even imagine that the `val`, decoded from the wire,
can fulfill the zero-waste requirement.
Therefore the optimization doesn't make sense; it only imposes
costs in terms of complexity, breaking the symmetry between encode
and decode in `os::Transaction` (there is no `encode_bp()`).
This commit kills the optimization and simplifies `os::Transaction`.
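The reasoning above can be illustrated with a small Python model. This is only a sketch: `BufferPtr` and `compact` are hypothetical stand-ins, not Ceph's C++ `bufferptr`, and the model assumes that "partial" means the pointer references less than its whole raw buffer.

```python
from dataclasses import dataclass


@dataclass
class BufferPtr:
    """Hypothetical model of a pointer into a shared raw buffer."""
    raw: bytes   # the underlying raw allocation
    off: int     # start of the referenced range
    length: int  # size of the referenced range

    def is_partial(self) -> bool:
        # Partial: some bytes of the raw buffer are not referenced.
        return self.off > 0 or self.off + self.length < len(self.raw)

    def data(self) -> bytes:
        return self.raw[self.off:self.off + self.length]


def compact(ptr: BufferPtr) -> BufferPtr:
    """Copy the referenced bytes if keeping `ptr` would pin unused memory."""
    if ptr.is_partial():
        data = ptr.data()  # this slice is the "memcpy"
        return BufferPtr(data, 0, len(data))
    return ptr


# A value decoded from a larger wire message is almost always partial.
wire = b"hdr" + b"value" + b"trailer"
val = BufferPtr(wire, 3, 5)
compacted = compact(val)
```

In this model a value carved out of a larger message always triggers the copy, which is why keeping it contiguous beforehand saves nothing.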
Zac Dover [Fri, 7 Feb 2025 01:32:20 +0000 (11:32 +1000)]
doc/cephadm: improve "Activate Existing OSDs"
Improve the section "Activate Existing OSDs".
Supplement the information in the "Activate Existing OSDs" section with
a procedure developed by Eugen Block, here:
https://heiterbiswolkig.blogs.nde.ag/2025/02/06/cephadm-activate-existing-osds/
This procedure explains how to activate OSDs on a host that, for
whatever reason, has had to have its operating system reinstalled.
* refs/pull/61562/head:
qa: remove redundant and broken test
mds: skip scrubbing damaged dirfrag
tools/cephfs/DataScan: test equality of link including frag
tools/cephfs/DataScan: skip linkages that have been removed
tools/cephfs/DataScan: do not error out when failing to read a dentry
tools/cephfs/DataScan: create all ancestors during scan_inodes
tools/cephfs/DataScan: cleanup debug prints
qa: remove old MovedDir test
qa: add data scan tests for ancestry rebuild
qa: make the directory non-empty to force migration
qa: avoid unnecessary mds restart
Ivo Almeida [Wed, 13 Nov 2024 12:16:23 +0000 (12:16 +0000)]
mgr/dashboard: fixed unit tests
* fixed unit tests due to upgrade to angular v18
* run npm fix in order to fix code style violations
* upgraded eslint/* packages' versions
* fixed eslint errors and warnings
Fixes: https://tracker.ceph.com/issues/68896 Signed-off-by: Ivo Almeida <ialmeida@redhat.com>
John Mulligan [Tue, 20 Aug 2024 19:01:05 +0000 (15:01 -0400)]
src/script: add a script to help build ceph using containers
The build-with-container script tries to encapsulate nearly all major
build tasks using docker/podman containers. If there's no build image
locally it will create one for you. It provides targets for building
(make), testing (make check), building rpm packages or deb packages and
is designed to be fairly easily extended.
View the comment at the top of the source file for usage details.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 20 Aug 2024 19:00:57 +0000 (15:00 -0400)]
build: add files needed to create a build container
A build container contains all the tools and dependencies needed to
build ceph. It provides a Container file and a small script that
helps bootstrap the container setup. This script installs a few extra
things we need before farming most of the work out to install-deps.sh.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sat, 14 Sep 2024 10:31:23 +0000 (06:31 -0400)]
build: small script tweak to allow different build dirs
Move the mkdir line to allow for other build dir naming schemes outside
of what appears in the .gitignore file. A tiny bit of added flexibility
at little cost.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Mon, 14 Nov 2022 15:57:25 +0000 (10:57 -0500)]
src/script: add helper function has_build_dir
This function returns successfully if $BUILD_DIR exists and is valid.
This is a useful building block for automation around the build and
can be used to avoid re-running commands that fail if the build dir
exists already.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Omid Yoosefi [Wed, 29 Jan 2025 16:33:49 +0000 (11:33 -0500)]
pybind/mgr/cephadm: fix issue with multiple nfs clusters on the same port
Currently, even if ingress and virtual_ip are used, the port_in_use
check in cephadm fails when the same port is used by more than one
nfs cluster. This PR fixes the issue by ensuring that the daemon spec
includes the virtual IP address of the endpoint to be checked, so the
check doesn't default to `0.0.0.0:port` only.
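A simplified model of why the checked address matters is sketched below. The names `Endpoint`, `endpoint_for`, and `conflicts` are hypothetical and are not cephadm's API; the example addresses and port are illustrative only.

```python
from typing import Iterable, NamedTuple, Optional

WILDCARD = "0.0.0.0"


class Endpoint(NamedTuple):
    ip: str
    port: int


def endpoint_for(port: int, virtual_ip: Optional[str] = None) -> Endpoint:
    """Endpoint to check: the virtual IP if given, else the wildcard."""
    return Endpoint(virtual_ip or WILDCARD, port)


def conflicts(candidate: Endpoint, bound: Iterable[Endpoint]) -> bool:
    """True if `candidate` collides with an already bound endpoint.

    The wildcard address overlaps every address on the same port.
    """
    for b in bound:
        if b.port != candidate.port:
            continue
        if WILDCARD in (b.ip, candidate.ip) or b.ip == candidate.ip:
            return True
    return False


# One nfs cluster already listens on its own virtual IP.
bound = [Endpoint("10.0.0.10", 2049)]
wildcard_check = conflicts(endpoint_for(2049), bound)
specific_check = conflicts(endpoint_for(2049, "10.0.0.11"), bound)
```

Checking the wildcard reports a conflict even though the second cluster would bind a different address; checking the specific virtual IP does not.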
Nizamudeen A [Fri, 22 Nov 2024 12:54:42 +0000 (18:24 +0530)]
mgr/dashboard: fix host form issues
Addressing the review comments in https://github.com/ceph/ceph/pull/60355#pullrequestreview-2391335490
Issues fixed:
- cluster expansion host form not closing when submitted
- cluster expansion host form changing the location when closed
- put the helper text inside the helper component since it's longer
- maintenance field should be hidden in the cluster expansion form since
it's not a valid option while expanding a cluster. We already add the host
in _no_schedule
Fixes: https://tracker.ceph.com/issues/69020 Signed-off-by: Nizamudeen A <nia@redhat.com>
Adam King [Wed, 29 Jan 2025 20:48:53 +0000 (15:48 -0500)]
mgr/cephadm: continue in nfs service purge if grace file is already deleted
The test_nfs task we run in teuthology creates and removes a number of
nfs clusters during the task. I think it's possible based on timing for
it to end up in a situation where it tries to remove an nfs service before
the grace file has been created. In that case, cephadm doesn't know it
hasn't created the grace file and just repeatedly fails forever attempting
to remove the nonexistent file. This patch adds handling for the error
case where we get a nonzero rc but the error message implies the command
failed because the file already does not exist.
Fixes: https://tracker.ceph.com/issues/69736 Signed-off-by: Adam King <adking@redhat.com>
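The error handling described above might look roughly like the following sketch. `purge_grace_file`, `run_rm`, and the matched error text are assumptions made for illustration; the actual command and message used by cephadm are not shown in this log.

```python
from typing import Callable, Tuple

# (return code, stdout, stderr) of the removal command
CmdResult = Tuple[int, str, str]

# Assumed marker text; the real error message may differ.
NOT_FOUND_MARKERS = ("No such file or directory",)


def purge_grace_file(run_rm: Callable[[], CmdResult]) -> bool:
    """Remove the grace file, tolerating the case where it is already gone.

    `run_rm` is a hypothetical callable that runs the removal command.
    """
    rc, _out, err = run_rm()
    if rc == 0:
        return True
    if any(marker in err for marker in NOT_FOUND_MARKERS):
        # Nothing to remove: treat as success so the purge can continue.
        return True
    raise RuntimeError(f"grace file removal failed (rc={rc}): {err}")


already_gone = purge_grace_file(
    lambda: (2, "", "error: No such file or directory")
)
```

Any other nonzero return code still raises, so genuine failures are not silently ignored.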
Vallari Agrawal [Tue, 4 Feb 2025 07:50:18 +0000 (13:20 +0530)]
qa/suites/nvmeof: use SCALING_DELAYS: '120'
Increase delays for qa/workunits/nvmeof/scalability_test.sh
as namespace rebalancing takes more time. After upscaling,
a gateway could initially be 'CREATED'. This is a valid state during
gateway initialization, but the state should then progress
to 'AVAILABLE' within a couple of seconds.
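Waiting for that state transition can be sketched as a polling loop. `wait_for_state` and `get_state` are hypothetical; the default timeout of 120 only echoes the value in the subject line and is assumed here to be in seconds.

```python
import time
from typing import Callable, Sequence


def wait_for_state(
    get_state: Callable[[], str],
    wanted: str = "AVAILABLE",
    transient: Sequence[str] = ("CREATED",),
    timeout: float = 120.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until `wanted` is reached; return False on timeout.

    States listed in `transient` are accepted while waiting; any other
    state raises immediately.
    """
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state == wanted:
            return True
        if state not in transient:
            raise RuntimeError(f"unexpected gateway state: {state}")
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated gateway that needs two polls before becoming available.
states = iter(["CREATED", "CREATED", "AVAILABLE"])
became_available = wait_for_state(lambda: next(states), sleep=lambda s: None)
```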
Kushal Deb [Fri, 29 Nov 2024 08:38:51 +0000 (14:08 +0530)]
cephadm: Add pre_remove and ensure deployment values are reset and API settings are updated when removing Prometheus or Alertmanager daemons
This fixes an issue where the dashboard API settings are not updated
properly when the active Prometheus or Alertmanager daemon is removed.
If the active daemon is removed, the settings are reconfigured to point
to a remaining daemon or reset if no daemons are available.
This avoids dashboard errors like "404 Not Found" caused by stale API
host settings.
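The selection logic described above can be sketched as follows. `api_host_after_removal` and the example URLs are hypothetical and only model the decision, not the actual pre_remove hook or the dashboard settings calls.

```python
from typing import Optional, Sequence


def api_host_after_removal(
    current: Optional[str],
    removed: str,
    remaining: Sequence[str],
) -> Optional[str]:
    """Return the API host setting to use after a daemon is removed.

    If the removed daemon was not the active one, keep the setting.
    Otherwise point at a remaining daemon, or reset (None) if none is left.
    """
    if current != removed:
        return current
    return remaining[0] if remaining else None


# Active daemon removed while another one is still deployed.
new_host = api_host_after_removal(
    current="http://host1:9095",
    removed="http://host1:9095",
    remaining=["http://host2:9095"],
)
```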