John Mulligan [Thu, 11 May 2023 14:25:38 +0000 (10:25 -0400)]
cephadm: split default_image decorator into two functions
Keep default_image as a decorator for functions that will only
ever need to update an image passed by the CLI. For other future
functions that want to execute code prior to assigning an image
from CLI parameters add `update_default_image` which takes a
CephadmContext and updates it and can be used by the caller
at an arbitrary point in the code flow.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 1 Jun 2023 15:16:13 +0000 (11:16 -0400)]
cephadm: change tests to assert using mock call not func.args
The Python docs [1] note that:
Changed in version 3.8: Added args and kwargs properties.
In order to run tests on python 3.6 (example: `tox -e py36`) we
change the tests to do the asserts using the older call(...)
comparison style, with mock.ANY used for args we don't care about.
Adam King [Thu, 13 Apr 2023 17:54:00 +0000 (13:54 -0400)]
cephadm: open ports in firewall when adopting monitoring stack daemons
Otherwise we risk the prometheus/alertmanager/grafana
not functioning properly after adoption due to the necessary
port in the firewall not being open.
Adam King [Thu, 13 Apr 2023 17:05:11 +0000 (13:05 -0400)]
cephadm: still try to open ports in firewall on redeploy/reconfig
Prior to this patch we were discarding the provided
ports on reconfig and redeploy in order to not fail
thinking there was a port conflict with the instance
of the daemon we were about to reconfig/redeploy. However,
it's still desirable for us to make sure the firewall ports
are open when we do a reconfig/redpeloy, so this refactors
the port handling approach to have it do that but
still avoid checking for port conflicts. It also include
an update of the type signature of deploy_daemon
to the py3 style. That wasn't needed for the change
but since I was added an arugment there I thought we might
as well do it now.
John Mulligan [Tue, 6 Jun 2023 17:24:37 +0000 (13:24 -0400)]
cephadm: use 0o600 as the default mode for write_new
Add a constant DEFAULT_MODE of `0o600`, and make it the default of
the perms argument to write_new. This reduces a lot of code since
0o600 is the majority of the permissions used. Other cases can continue
to pass None to indicate no particular permissions are desired.
John Mulligan [Tue, 6 Jun 2023 17:16:29 +0000 (13:16 -0400)]
cephadm: convert SNMPGateway create_daemon_conf to use write_new
While it is not entirely clear why this pattern of using os.open and
posix open flags instead of `open` directly was used I determined (using
strace) that the only major difference between these open flags and
those used by `open` was the lack of O_TRUNC. Unlike some other cases
this function does not use an intermediate temporary file. This means
that if the file being written already exists and the data being written
is smaller then the remaining data will not be over-written.
I looked over the context that this function is used in and decided that
this behavior must not be intentional. Thus it should be safe
to convert this function to `write_new`.
John Mulligan [Tue, 6 Jun 2023 16:37:14 +0000 (12:37 -0400)]
cephadm: convert more temporary file writes to use write_new
Some functions are using the pattern:
```
with open(os.open(name + '.new, os.O_CREAT | os.O_WRONLY, 0o600), 'w') as f:
f.write(...)
os.rename(name + '.new', name)
```
While it is not entirely clear why this pattern was first used,
it accomplishes the same goal as `write_new` only directly calling
the posix open call. I analyzed the open flags for `write_new` and
these calls using `strace` and noted that the only significant
difference was the lack of O_TRUNC in these cases. Since the ".new"
files should not exist the lack of O_TRUC ought not make any difference.
With this decided we can convert these instances to `write_new`.
John Mulligan [Tue, 6 Jun 2023 16:25:34 +0000 (12:25 -0400)]
cephadm: convert _write_custom_conf_files to use write_new
We double checked the meaning of "w+" and it will open the file
read-write. Since the file is never read there's no real reason
to keep it that way so its OK to convert to `write_new`.
John Mulligan [Tue, 6 Jun 2023 00:12:59 +0000 (20:12 -0400)]
cephadm: convert some functions to use write_new
Convert a lot of the basic uses of the pattern:
with open(...) as f:
f.write(...)
os.fchown(f, ...) # sometimes
os.fchmod(f, ...) # sometimes
os.rename(...) # sometimes
These are the most obvious cases to convert to `write_new`
and should largely be uncontroversial.
John Mulligan [Tue, 6 Jun 2023 00:08:49 +0000 (20:08 -0400)]
cephadm: create functional mock for fchown
The pyfakefs library apparently doesn't have its own mock for os.fchown.
This means that code using fchown currently calls into a mock with
no affect on the fake fs. For some reason I don't fully understand,
existing test cases work because they don't always follow the pattern
of open-write-rename. Switching to `write_new`, which always does a
rename, breaks some of the assertions performed in the tests on the fake
fs. Add a mock fchown that updates the state of the fake fs so
that converting call sites to use `write_new` will continue to work.
John Mulligan [Tue, 6 Jun 2023 00:12:10 +0000 (20:12 -0400)]
cephadm: add write_new function for robust file writes
The cephadm code has a very common pattern made of at least one of
the three following steps:
* call fchown on the open file to set ownership
* call fchmod on the open file to set permissions
* rename the file from a temp name to final name
Add the write_new function to encapsulate these common actions.
If owner is not None then fchown will be called.
If perms is not None then fchmod will be called.
An optional encoding value may be passed.
It always uses a temporary file as a temporary file ensures that
there can never be a partially written file even in the event of
a power outage or system crash.
Encapsulating this all into a function also allows us to make
changes to this approach in the future without touching every
call site using `open(..., "w")` etc.
John Mulligan [Mon, 8 May 2023 17:54:32 +0000 (13:54 -0400)]
qa/workunits/cephadm: align test_cephadm.sh with new cephadm version
The `cephadm version` command no longer bases the output on the
container images, rather it uses a special python file added to the
zipapp during the build to report on the version of cephadm (the
binary).
The other option was to preserve this behavior and add a new version
command or make it behave differently depending on what options were
provided. I discussed the options with AMK in person and we decided that
changing the tests was preferable.
John Mulligan [Thu, 27 Apr 2023 17:39:02 +0000 (13:39 -0400)]
cephadm: add executes_early decorator to version command
Add the executes_early decorator and apply it to the version command
such that the version command will function without the uaual
requirements and set up steps that a cephadm command needs.
This allows anyone with a binary, root or not, to check the version.
John Mulligan [Thu, 27 Apr 2023 17:36:51 +0000 (13:36 -0400)]
cephadm: get cephadm version information from zipapp module
Use the `_version.py` module that gets embedded into the zipapp at build
time to print cephadm's own version information.
The version string printed is basically copied from the one printed by
the `ceph` command.
John Mulligan [Tue, 27 Sep 2022 20:51:57 +0000 (16:51 -0400)]
cephadm: add generated version info file to zipapp
Pass --set-version-var=KEY=VALUE cli options to the script and it
will create an embedded "_version.py" file in the zipapp containing
the passed parameters.
Nizamudeen A [Tue, 29 Aug 2023 14:16:07 +0000 (19:46 +0530)]
mgr/dashboard: fix cephfs create form validator
dashboard was allowing to create a filesystem with / in its name but the
cli throws out error in doing so. And the created volume in dashboard
just gets ended up as a stale volume
Zac Dover [Wed, 23 Aug 2023 15:15:54 +0000 (01:15 +1000)]
doc/start: edit os-recommendations.rst
Improve the grammar in one sentence of the "Platforms" section of
doc/start/os-recommendations.rst. Improving that grammar involved
splitting the sentence into two sentences, but that's life. Update:
Anthony substantially rewrote this, so credit for this should rightly
go to him.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit e19b9cea62c56cbd73049957d69d5aac16a97c08)
Adam King [Fri, 2 Jun 2023 00:06:35 +0000 (20:06 -0400)]
cephadm: add tcmu-runner to logrotate config
This process could be used to set up the tcmu-runner
to log to a file much like other ceph daemons
- create /etc/tcmu directory
- create /etc/tcmu/tcmu.conf directory with default options
- change dir to /var/log
- change log level to 4
- add -v /etc/tcmu:/etc/tcmu to tcmu-runner container podman line in unit.run
In order to support this (mostly for debugging) we should
add tcmu-runner to the logrotate config
Adam King [Fri, 7 Jul 2023 15:03:56 +0000 (11:03 -0400)]
qa/cephadm: add test for ca signed keys
Test that bootstraps with a CA signed key using
the use_ca_signed_key cephadm override. Then follows
up by doing a check-host on each host which verifies
the cephadm mgr module can reach and authenticate with
the nodes using the new key setup.
This probably should really be a workunit, but
I didn't want to create a full new section for
this test and I needed a section that didn't
already run the cephadm task for every test. I could
see this being moved into some sort of
"test_special_deployment_scenarios" section in the future
Adam King [Fri, 7 Jul 2023 14:36:39 +0000 (10:36 -0400)]
qa/cephadm: add ca signed key to cephadm task
To allow bootstrapping a cluster using a CA signed
key instead of the standard pubkey authentication.
Will allow explicit testing of this as we add support
for it
Adam King [Sat, 3 Jun 2023 18:39:05 +0000 (14:39 -0400)]
doc/cephadm: document how to pass self made SSH key pairs to bootstrap
This didn't seem to exist in the install section of
the cephadm docs. Wanted to add it in before adding
documentation for bootstrapping with CA signed keys.
Error EINVAL: ServiceSpec: 'dict' object has no attribute 'validate'
which is not a useful error message. This is caused by the
spec assuming all osd specific fields are either defined
in the 'spec' section or outside of it, but not mixed in.
We could also just consider these specs to be invalid
and just raise a better error message, but it seems easier
to make the minor adjustment for it to work, given there doesn't
seem to be an issue with mixing the styles for specs for
other service types.
Adam King [Mon, 5 Jun 2023 17:18:06 +0000 (13:18 -0400)]
python-common/drive_selection: lower log level of limit policy message
This gets logged every time cephadm tries to apply a
relevant OSD spec and ends up spamming the logs. There's no reason
we really need this to be at info rather than debug level,
so let's lower it.
Adam King [Thu, 4 May 2023 16:55:55 +0000 (12:55 -0400)]
mgr/cephadm: also don't write client files/tuned profiles to maintenance hosts
Since they could have been taken offline for maintenance,
they should be treated the same as offline hosts in this
case and we should avoid trying to write files to them
Redouane Kachach [Fri, 31 Mar 2023 07:04:10 +0000 (09:04 +0200)]
mgr/cephadm: show meaningful messages when failing to execute cmds Fixes: https://tracker.ceph.com/issues/59254 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit a66bdaafbc26098faf6a105c2e0109816d63ad35)
Zac Dover [Tue, 22 Aug 2023 05:30:21 +0000 (15:30 +1000)]
doc/start: refactor ABC test chart
Refactor the ABC test chart so that the information about which tests
have been run is presented in the center of the chart instead of, as it
was before, in a superscript.