Vallari Agrawal [Tue, 20 Feb 2024 07:44:32 +0000 (13:14 +0530)]
qa/suite/rbd/nvmeof: Deploy multiple gateways and namespaces
1. Deploy 2 gateways on different nodes, then check for multi-path.
To add another gateway, only "roles" need to be changed in job yaml.
2. Create "n" nvmeof namespaces, configured by 'namespaces_count'.
3. Rename qa/suites/rbd/nvmeof/cluster/fixed-3.yaml to fixed-4.yaml
which contains 2 gateways and 2 initiators.
Ivo Almeida [Wed, 21 Feb 2024 13:02:19 +0000 (13:02 +0000)]
mgr/dashboard: fix retention add for subvolume
- Added parameters for subvolume and subvolume group when adding a new
snap schedule.
- Added call to remove retention policies when removing a snap schedule
if it is the last one with the same path.
Fixes: https://tracker.ceph.com/issues/64524
Signed-off-by: Ivo Almeida <ialmeida@redhat.com>
Sachin Punadikar [Tue, 19 Mar 2024 09:41:53 +0000 (05:41 -0400)]
vstart: Ganesha should not be started in DEBUG mode
Currently the vstart script deploys NFS Ganesha in DEBUG mode. Enabling
DEBUG mode for Ganesha logs a lot of debug messages, which may
not be required all the time. One can enable DEBUG mode as needed.
Hence remove the default DEBUG mode.
myoungwon oh [Mon, 18 Mar 2024 06:48:07 +0000 (06:48 +0000)]
crimson/os/seastore: cache metadata during trimming to prevent from disk read
I encountered continuous disk reads during trimming even though there is sufficient
cache available, in a 4K random write test with RBM (RBD).
This is because metadata is not cached if its source is a background transaction
within touch_extent(). So seastore, including the trimming process, needs to
constantly retrieve metadata (e.g., BACKREF_LEAF) from disk.
Based on the previous commits making the remote executables auditable
and explicit, document the admin's ability to restrict password-less
sudo access to only the set of commands cephadm actually uses.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 14 Mar 2024 18:02:17 +0000 (14:02 -0400)]
mgr/cephadm: add a simple unit test for RemoteCommand class
Converting a remote command to something that other libs use requires
converting the enum to a string. Python behavior in this area varies
across versions so add a unit test that verifies the conversion
behaves as intended.
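A minimal sketch of such a test, assuming a hypothetical `Executable` enum (not the actual cephadm class) that defines `__str__` explicitly so the conversion does not depend on the interpreter's default enum formatting:

```python
import enum


class Executable(enum.Enum):
    # Hypothetical members; each value is the command name used remotely.
    CHMOD = "chmod"
    MKDIR = "mkdir"

    def __str__(self) -> str:
        # Return the raw command name rather than the default "Executable.CHMOD".
        return self.value


def test_executable_to_str() -> None:
    # Both str() and f-string formatting should yield the plain command name.
    assert str(Executable.CHMOD) == "chmod"
    assert f"{Executable.MKDIR}" == "mkdir"
    assert [str(e) for e in Executable] == ["chmod", "mkdir"]


test_executable_to_str()
```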
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 14 Feb 2024 16:35:57 +0000 (11:35 -0500)]
mgr/cephadm: make remote command execution auditable
Update ssh.py and other code using it to only allow commands wrapped
in particular python types as executables on the remote hosts.
By using a specific type for remote executables we make the code more
auditable, avoiding the possibility of executing arbitrary strings
as commands with sudo. This is all enforced by mypy's type checking.
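The idea can be sketched as follows. This is an illustration only: `RemoteExecutable` here is a `typing.NewType` stand-in and `build_sudo_argv` is a hypothetical helper, neither taken from ssh.py. The check happens in mypy, not at runtime.

```python
from typing import List, NewType

# Hypothetical stand-in for the wrapper type; not the actual cephadm code.
RemoteExecutable = NewType("RemoteExecutable", str)

# The only places where a plain string becomes an executable are easy to grep for.
CHMOD = RemoteExecutable("chmod")
MKDIR = RemoteExecutable("mkdir")


def build_sudo_argv(exe: RemoteExecutable, args: List[str]) -> List[str]:
    """Build the argv for running an approved executable under sudo."""
    return ["sudo", str(exe), *args]


argv = build_sudo_argv(MKDIR, ["-p", "/var/lib/ceph"])

# mypy rejects the call below because a bare "str" is not a "RemoteExecutable":
# build_sudo_argv("rm", ["-rf", "/tmp/x"])
```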
The result is a list of commands that the cephadm mgr module may
execute on a remote host using sudo:
```
$ git ls-files -z | xargs -0 grep 'RemoteExecutable(' -d skip -h | grep \
    -v '(str)' | sed -e 's/.*RemoteExecutable(//' -e 's/)//' -e 's/,$//'
'which'
'/usr/bin/cephadm'
python
'chmod'
'ls'
'sysctl'
'chown'
'mkdir'
'mv'
'touch'
'rm'
'true'
```
Note that *python* is special as it is based on the output of which and
may vary from OS to OS. The quoted items are used exactly as named.
Only the binary at `/usr/bin/cephadm` _or_ the dynamically discovered
python3 binary will be used. This depends on a configuration option for
the cephadm module.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Adam King [Wed, 13 Mar 2024 19:30:25 +0000 (15:30 -0400)]
mgr/cephadm: refresh public_network for config checks before checking
The place it was being run before meant it would only grab the
public_network setting once at startup of the module. This meant
if a user changed the setting, which they are likely to do if they
get the warning, cephadm would ignore the change and continue
reporting that the hosts don't match up with the old setting
for the public_network. This moves the call to refresh the
setting to right before we actually run the checks. It does
mean we'll do the `ceph config dump --format json` call
each serve loop iteration, but I've found that only tends
to take a few milliseconds, which is nothing compared to
the time to refresh other things we check during the serve
loop.
I additionally modified the use of this option to use
the attribute on the mgr, rather than calling
`get_module_option`. This was just to get it more in
line with how we tend to handle other config options.
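A rough sketch of the ordering change, with hypothetical names (`ConfigChecker`, `fetch_public_network`) and a simplified check that only tests whether the configured network is among a host's networks:

```python
from typing import Callable, List


class ConfigChecker:
    """Hypothetical checker; names are illustrative, not cephadm's."""

    def __init__(self, fetch_public_network: Callable[[], str]) -> None:
        self._fetch = fetch_public_network
        self.public_network = ""

    def refresh(self) -> None:
        # Re-read the setting instead of keeping the value seen at startup.
        self.public_network = self._fetch()

    def run_checks(self, host_networks: List[str]) -> bool:
        # Refresh right before checking so a changed setting is picked up.
        self.refresh()
        return self.public_network in host_networks


settings = {"public_network": "10.0.0.0/24"}
checker = ConfigChecker(lambda: settings["public_network"])

before = checker.run_checks(["192.168.1.0/24"])
settings["public_network"] = "192.168.1.0/24"  # user fixes the setting
after = checker.run_checks(["192.168.1.0/24"])
```

Here `before` is False and `after` is True, because each call re-reads the setting rather than reusing the startup value.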
Fixes: https://tracker.ceph.com/issues/64902
Signed-off-by: Adam King <adking@redhat.com>
Adam King [Tue, 12 Mar 2024 14:26:18 +0000 (10:26 -0400)]
cephadm: fix `cephadm shell --name <daemon-name>` for stopped/failed daemon
This previously would always try to use 'podman
inspect' on the running container of the daemon,
but this doesn't work if the daemon is stopped
or failed. Doing this for stopped/failed daemons
is a valid use case as we recommend cephadm shell
with --name for running debugging tools (often
for OSDs).
Fixes: https://tracker.ceph.com/issues/64879
Signed-off-by: Adam King <adking@redhat.com>
Adam King [Mon, 11 Mar 2024 18:44:17 +0000 (14:44 -0400)]
cephadm: allow list_daemons for only a specific daemon
At the moment, my thoughts are to use this internally
in the binary for when we need info from list_daemons
but only for a specific daemon. I could also see wanting
this just on the command line to get info on a certain
daemon, so I've added it as a flag for `cephadm ls` as well.
After some tests, it turns out that depending on the hardware,
the 'Location' header returned by the server after logging in can be different.
I could notice the following:
rgw: Add missing empty checks to the split string in is_string_in_set().
In certain cases, where a user misconfigures a CORS rule, the entirety
of the string can be token characters (or, at least, the string before
and after a given token is all token characters), but != "*". If the
misconfigured string includes "*" we'll try to split the string and we
assume that we can pop the list of string elements when "*" isn't
first/last, but get_str_list() won't return anything for token-only
substrings and thus 'ssplit' will have fewer elements than would be
expected for a correct rule. In the case of an empty list, front() has
undefined behaviour; in our experience, it often results in a huge
allocation attempt because the code tries to copy the string into a
local variable 'sl'.
An example of this misconfiguration (and thus a reproduction case) is
configuring an origin of " *".
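The fix itself is in C++; the Python sketch below only mirrors the logic in simplified form. `split_tokens` is a hypothetical stand-in for get_str_list(), the delimiter set is an assumption, and only rules with a single "*" are handled. The point is the emptiness check before touching the first or last element.

```python
from typing import List

# Assumed delimiter set; the real get_str_list() call may use different characters.
TOKEN_CHARS = "* \t"


def split_tokens(text: str, delims: str = TOKEN_CHARS) -> List[str]:
    """Split on any delimiter character and drop empty pieces."""
    pieces: List[str] = []
    current = ""
    for ch in text:
        if ch in delims:
            if current:
                pieces.append(current)
            current = ""
        else:
            current += ch
    if current:
        pieces.append(current)
    return pieces


def origin_matches(rule: str, origin: str) -> bool:
    """Match origin against a rule containing at most one '*' wildcard."""
    if rule == "*":
        return True
    if "*" not in rule:
        return rule == origin
    pieces = split_tokens(rule)
    if not pieces:
        # Token-only rule such as " *": nothing to pop, so treat it as no match.
        return False
    prefix = "" if rule.startswith("*") else pieces[0]
    suffix = "" if rule.endswith("*") else pieces[-1]
    return origin.startswith(prefix) and origin.endswith(suffix)
```

With the guard in place, the misconfigured origin `" *"` is simply rejected instead of indexing into an empty list.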
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Matt Benjamin [Thu, 14 Mar 2024 00:19:01 +0000 (20:19 -0400)]
rgw_file: fix mv/rename cases broken by zipper integration
There were two problems. First, leaf object names must be
expressed as fully-qualified to the bucket as input to the
copy-object step. Second, s->object in the same step must
indicate the destination object being created by the copy;
this was correct in the original zipper change but broken
later.
* add a rename/mv unit test
Tests for the following cases added:
1. move between two sub-directory paths in a single bucket
2. move between two names at the top level of a single bucket
3. move between sub-directory paths in different buckets (cross-bucket rename)
Fixes: https://tracker.ceph.com/issues/64950
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
kchheda3 [Tue, 27 Feb 2024 20:59:15 +0000 (15:59 -0500)]
rgw/notification: Fix the filter_rules to be array vs dict in json output.
FilterRules, when processed as a dict in JSON, emits the same key name for prefix and suffix, causing a failure while parsing the JSON notification output.
So change the type of FilterRules from JsonDict to Array when dumping to JSON.
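A small illustration of why repeated keys are a problem, assuming a key name of "FilterRule" and made-up prefix/suffix values (neither is taken from the actual RGW output). Python's `json` module silently keeps only the last duplicate key, while stricter parsers may reject the document outright:

```python
import json

# Dict form: the same key appears twice (assumed key name and values).
as_dict = (
    '{"FilterRule": {"Name": "prefix", "Value": "logs/"},'
    ' "FilterRule": {"Name": "suffix", "Value": ".jpg"}}'
)
# Array form: each rule is its own element, so nothing collides.
as_array = (
    '[{"Name": "prefix", "Value": "logs/"},'
    ' {"Name": "suffix", "Value": ".jpg"}]'
)

parsed_dict = json.loads(as_dict)
parsed_array = json.loads(as_array)

# Only the last duplicate survives in the dict form; the prefix rule is lost.
names_from_dict = [parsed_dict["FilterRule"]["Name"]]
names_from_array = [rule["Name"] for rule in parsed_array]
```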
Patrick Donnelly [Thu, 14 Mar 2024 18:59:36 +0000 (14:59 -0400)]
qa/crontab: use historically normal priorities for nightlies
Stop using --force-priority except when necessary.
Squid still gets elevated priority due to the increased attention with the
imminent release.
I've differentiated the priorities some in that release branches should get
higher priority than the main branch and that older release branches should be
prioritized over newer ones. Finally, upgrade tests should be prioritized over
other nightlies.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>