John Mulligan [Tue, 20 Aug 2024 19:01:05 +0000 (15:01 -0400)]
src/script: add a script to help build ceph using containers
The build-with-container script tries to encapsulate nearly all major
build tasks using docker/podman containers. If there's no build image
locally it will create one for you. It provides targets for building
(make), testing (make check), building rpm packages or deb packages and
is designed to be fairly easily extended.
View the comment at the top of the source file for usage details.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Tue, 20 Aug 2024 19:00:57 +0000 (15:00 -0400)]
build: add files needed to create a build container
A build container contains all the tools and dependencies needed to
build ceph. This change provides a Container file and a small script that
helps bootstrap the container setup. This script installs a few extra
things we need before farming most of the work out to install-deps.sh.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Sat, 14 Sep 2024 10:31:23 +0000 (06:31 -0400)]
build: small script tweak to allow different build dirs
Move the mkdir line to allow for other build dir naming schemes outside
of what appears in the .gitignore file. A tiny bit of added flexibility
at little cost.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Mon, 14 Nov 2022 15:57:25 +0000 (10:57 -0500)]
src/script: add helper function has_build_dir
This function returns successfully if $BUILD_DIR exists and is valid.
This is a useful building block for automation around the build and
can be used to avoid re-running commands that fail if the build dir
exists already.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Adam King [Wed, 29 Jan 2025 20:48:53 +0000 (15:48 -0500)]
mgr/cephadm: continue in nfs service purge if grace file is already deleted
The test_nfs task we run in teuthology creates and removes a number of
nfs clusters during the task. I think it's possible based on timing for
it to end up in a situation where it tries to remove an nfs service before
the grace file has been created. In that case, cephadm doesn't know it
hasn't created the grace file and just repeatedly fails forever attempting
to remove the nonexistent file. This patch adds handling for the error
case where we get a nonzero rc but the error message implies the command
failed because the file already does not exist.
Fixes: https://tracker.ceph.com/issues/69736
Signed-off-by: Adam King <adking@redhat.com>
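The error handling described above could look roughly like the following
sketch. This is not the cephadm code: the function name and the "no such
file" substring match are assumptions made for illustration, and the real
patch may inspect a different message.

```python
def grace_removal_done(rc: int, stderr: str) -> bool:
    """Return True if the grace file is gone after a removal attempt.

    Hypothetical helper, not part of cephadm. A zero rc means the file
    was removed. A nonzero rc is still treated as done when the error
    text implies the file was already absent.
    """
    if rc == 0:
        return True
    # Assumption: an absent file is reported with text like "No such file".
    return "no such file" in stderr.lower()
```

With this shape, a purge loop can continue instead of retrying forever when
the file was never created, while any other failure is still reported.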
Vallari Agrawal [Tue, 4 Feb 2025 07:50:18 +0000 (13:20 +0530)]
qa/suites/nvmeof: use SCALING_DELAYS: '120'
Increase delays for qa/workunits/nvmeof/scalability_test.sh
as namespace rebalancing takes more time. After upscaling,
a gateway may initially be 'CREATED', which is a valid state during
gateway initialization, but the state should then progress
to 'AVAILABLE' within a couple of seconds.
Kushal Deb [Fri, 29 Nov 2024 08:38:51 +0000 (14:08 +0530)]
cephadm: Add pre_remove and ensure deployment values are reset and API settings are updated when removing Prometheus or Alertmanager daemons
This fixes an issue where the dashboard API settings are not updated
properly when the active Prometheus or Alertmanager daemon is removed.
If the active daemon is removed, the settings are reconfigured to point
to a remaining daemon or reset if no daemons are available.
This avoids dashboard errors like "404 Not Found" caused by stale API
host settings.
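A minimal sketch of the selection rule described above, assuming daemons are
identified by plain host strings. The function name and arguments are
hypothetical and do not come from the cephadm source.

```python
from typing import Optional, Sequence


def next_api_host(removed: str, active: str,
                  remaining: Sequence[str]) -> Optional[str]:
    """Pick the API host the dashboard should use after a daemon removal.

    Hypothetical helper. If the removed daemon was not the active one,
    the setting is left alone. Otherwise point at a remaining daemon,
    or return None to signal that the setting should be reset.
    """
    if removed != active:
        return active
    return remaining[0] if remaining else None
```

Returning None for the "no daemons left" case keeps the reset path explicit,
so a stale host is never left behind in the settings.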
Let ReplicatedRecoveryBackend::handle_recovery_op route pushes
between handle_push and handle_pull_response instead of
ReplicatedRecoveryBackend::handle_push.
Adam King [Wed, 29 Jan 2025 17:02:50 +0000 (12:02 -0500)]
mgr/cephadm: add Server_Scope = <fsid> to NFSv4 section of ganesha conf
From the ganesha team
"""
In the NFSv4 param block, we need a parameter Server_Scope set to some value common among all servers in a cluster.
The default with it blank is to use the hostname which may be different for each server in the cluster.
"""
This is related to ongoing work on high availability nfs. From the cephadm side
we just need to make sure all nfs daemons in the cluster end up with
the same value for the Server_Scope field. This patch uses the cluster
id (which we already brought into the template as the "namespace" attribute)
as that value.
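To illustrate the requirement, here is a rough sketch that renders an NFSv4
block with a shared Server_Scope. The function is hypothetical, the exact
quoting and layout of the real ganesha template are assumptions, and the
other NFSv4 parameters are left out.

```python
def render_nfsv4_block(scope_value: str) -> str:
    """Render a minimal NFSv4 block with Server_Scope set.

    Hypothetical sketch, not the cephadm template. Every daemon in the
    cluster must be rendered with the same scope_value.
    """
    return (
        "NFSv4 {\n"
        f'        Server_Scope = "{scope_value}";\n'
        "}\n"
    )
```

Because the value is passed in rather than derived from the hostname, two
daemons on different hosts render identical blocks.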
Ilya Dryomov [Thu, 30 Jan 2025 19:30:18 +0000 (20:30 +0100)]
doc/rbd: use https links in live import examples
Even though it's explicitly said that "http" stream can be used to
import via both HTTP and HTTPS, it can still be confusing that "type":
"http" is expected to go with "url": "https://...". Switch example
URLs from HTTP to HTTPS to make it more obvious.
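As an illustration of the point above, the sketch below builds a source spec
whose stream type stays "http" while the URL uses HTTPS. The helper name and
the example.com URL are placeholders, and the "raw" format is just one
possible choice.

```python
import json


def http_stream_source_spec(url: str) -> str:
    """Build a source-spec JSON string for an "http" stream.

    Hypothetical helper. The stream type is "http" for both HTTP and
    HTTPS URLs, so only the URL scheme changes.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError("expected an http:// or https:// URL")
    spec = {"type": "raw", "stream": {"type": "http", "url": url}}
    return json.dumps(spec)


# Placeholder URL, for illustration only.
print(http_stream_source_spec("https://example.com/image.raw"))
```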
Adam King [Thu, 30 Jan 2025 14:15:37 +0000 (09:15 -0500)]
mgr/cephadm: create OSD daemon deploy specs through make_daemon_spec
That function handles setting up the extra container/entrypoint
args for the daemon during initial deployment. Having the
CephadmDaemonDeploySpec made directly in the OSD deployment
workflow means initial deployments of OSDs won't have the
extra container/entrypoint args from the spec
Fixes: https://tracker.ceph.com/issues/69734
Signed-off-by: Adam King <adking@redhat.com>
When the NBD server is killed, nbd_pread() can set errno to at least
ENOTCONN, EINVAL and 0 which is supposed to stand for "no additional
errno information is available for this error". Add a test to ensure
that "rbd migration execute" command always fails and that the image
isn't transitioned to MIGRATION_STATE_EXECUTED in this scenario.
Ilya Dryomov [Wed, 29 Jan 2025 11:56:34 +0000 (12:56 +0100)]
librbd: stop filtering async request error codes
The roots of this go back to 2015 when snap create was changed to
filter EEXIST in commit 63f6c9bac9a4 ("librbd: fixed snap create race
conditions") and flatten was changed to filter EINVAL in commit ef7e210c3f74
("librbd: better handling for duplicate flatten requests"). From there
this pattern made it to most other operations that can be proxied
including "rbd migration execute".
The motivation was to suppress generation of an "expected" error in
response to a duplicate async request notification for the operation.
However, doing this at the top of the handler (right before returning
to the caller) and for an error as generic as EINVAL is super fragile.
It's trivial for an error that is being filtered to sneak in with
a lower level change completely unnoticed. For example, live migration
recently added NBD stream which is implemented on top of libnbd and it
turns out that some libnbd APIs return EINVAL on various occasions when
the NBD endpoint disappears and an error like ENOTCONN would make more
sense. If this occurs during "rbd migration execute" operation, the
rest of librbd never learns that migration was disrupted and the image
is transitioned to MIGRATION_STATE_EXECUTED, thus handing a partially
imported (read: corrupted) image to the user.
Luckily, with commits 07fbc4b71df4 ("librbd: track complete async
operation requests") and 96bc20445afb ("librbd: track complete async
operation return code"), the scenario which originally prompted error
code filtering isn't an issue anymore. Despite a few shortcomings
(e.g. when an async request notification is acked with result 0, it's
impossible to tell whether a) a new operation was kicked off, b) there
is an operation that is still in progress or c) it's for an operation
that completed earlier but hasn't "expired" yet), even just commit
07fbc4b71df4 by itself prevents a duplicate notification from kicking
off a second operation that could generate an error for something that
actually succeeded. With that in mind, eradicate error code filtering
from Operations class.
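The idea of tracking async requests so that a duplicate notification cannot
start a second operation can be sketched as follows. This is a simplified
model rather than librbd code: the class, its method names and the use of
string request ids are all assumptions, and expiry of completed entries is
left out.

```python
from typing import Dict, Optional, Set


class AsyncRequestTracker:
    """Toy model of in-progress and completed async request tracking."""

    def __init__(self) -> None:
        self._in_progress: Set[str] = set()
        self._completed: Dict[str, int] = {}

    def try_start(self, request_id: str) -> bool:
        """Return True only for the first notification of a request."""
        if request_id in self._in_progress or request_id in self._completed:
            return False  # duplicate: do not kick off a second operation
        self._in_progress.add(request_id)
        return True

    def complete(self, request_id: str, rc: int) -> None:
        """Record the unfiltered return code of a finished operation."""
        self._in_progress.discard(request_id)
        self._completed[request_id] = rc

    def completed_rc(self, request_id: str) -> Optional[int]:
        """Return the recorded return code, or None if not completed."""
        return self._completed.get(request_id)
```

Since duplicates are rejected before any work starts, the handler has no
"expected" error to suppress and can pass every return code through as is.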
Vallari Agrawal [Tue, 28 Jan 2025 09:18:15 +0000 (14:48 +0530)]
qa/tasks/nvmeof.py: Fix do_checks() method
All checks currently run on the initiator node. Now
run all "ceph" commands on one of the gateway hosts
instead of the initiator node, and run the "nvme list"
and "nvme list-subsys" checks on the initiator node.
Add retry (5 times) to do_checks if any command fails.
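A retry wrapper of the kind described above might look like this sketch. The
function name, the delay parameter and the choice to re-raise the last error
are assumptions, not details taken from qa/tasks/nvmeof.py.

```python
import time
from typing import Callable


def run_with_retries(check: Callable[[], None], attempts: int = 5,
                     delay: float = 0.0) -> int:
    """Run check() until it succeeds, at most `attempts` times.

    Hypothetical helper. Returns the number of the attempt that
    succeeded and re-raises the last error once attempts run out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            check()
            return attempt
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)  # brief pause before the next try
    raise AssertionError("unreachable")
```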
Vallari Agrawal [Tue, 14 Jan 2025 03:52:31 +0000 (09:22 +0530)]
qa/tasks/nvmeof.py: Do not use 'systemctl start' in thrasher
Instead use 'daemon start' in revive_daemon() to bring
up gateways thrashed with 'systemctl stop'.
This is because the 'systemctl start' method seems to have temporary
issues.
Gil Bregman [Thu, 30 Jan 2025 11:33:51 +0000 (13:33 +0200)]
mgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration and remove obsolete enable_key_encryption
Fixes: https://tracker.ceph.com/issues/69731
Ronen Friedman [Thu, 30 Jan 2025 09:27:58 +0000 (03:27 -0600)]
osd/scrub: discard repair_oinfo_oid()
repair_oinfo_oid(), called every scrub, has a very specific
functionality: fix the object ID specified in the Object Info
attribute, if different from the ID of the owning object.
This fix was added in 2017, as a response to a unique failure
scenario that was observed in Sepia - probably following a
filesystem bug. See https://tracker.ceph.com/issues/18409 &
https://tracker.ceph.com/issues/20471.
The limited functionality of repair_oinfo_oid() -
only repairing this one specific issue, and only if the OI_ATTR
exists and is decodable - does not justify the overhead of
running it every scrub.
John Mulligan [Thu, 30 Jan 2025 00:25:52 +0000 (19:25 -0500)]
cephadm: use parsed_container_cpu_perc in cephadm.py
Replace the use of _parse_cpu_perc and related command calls with
parsed_container_cpu_perc. This needs no additional test updates
because the test updates in the previous patch that added
parsed_container_cpu_perc covered all of that already.
Signed-off-by: John Mulligan <jmulligan@redhat.com>