Rishabh Dave [Thu, 7 Mar 2024 12:49:21 +0000 (18:19 +0530)]
qa/cephfs: fix caps_helper.py
Replace calls to run_cluster_cmd() with calls to run_ceph_cmd().
Ideally, a backport should have fixed this issue, but the method under
inspection here, "get_mon_cap_from_keyring()", has gone through lots of
changes compared to the main branch, along with the rest of caps_helper.py.
To avoid backporting a large amount of QA changes, writing a specific fix
will be easier and faster.
This will also help with the rest of the Quincy backports since this patch
series/PR makes an important change to QA code.
The run_cluster_cmd() method is not available anymore because it was deleted
in this PR -
https://github.com/ceph/ceph/pull/50569/files#diff-1c6c246ba42f343603d7174198dd1fb9c2654b6c883594d1a0891096b7a35875L408
Fixes: https://tracker.ceph.com/issues/62243
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 27f43c9a89bc96eb9691f5e2ba2e67d78f4be6be)
Rishabh Dave [Sat, 24 Jun 2023 04:17:12 +0000 (09:47 +0530)]
AuthMonitor: no need to check permission in MDS caps
For the "fs authorize" command, AuthMonitor.cc checks whether the permission
is "r" or begins with "rw". This check is now redundant.
AuthMonitor::valid_caps() runs MDSAuthCaps.parse(), which now runs the same
check for the MDS caps, regardless of the command.
Conflicts:
qa/tasks/cephfs/test_admin.py
The conflict occurred because the line numbers where the patch
should be applied differ between the quincy branch and the main
branch.
Rishabh Dave [Fri, 9 Jun 2023 18:54:12 +0000 (00:24 +0530)]
MDSAuthCaps: print a special error message for wrong permissions
Permissions mentioned in MDS caps flags can either begin with "r" or
"rw", or can be "*" or "all". But they can't start with or be just "w" or
something else. This is confusing for some CephFS users since MON caps
can be just "w".
The "ceph fs authorize" command complains about this to the user. But other
commands (specifically, "ceph auth add", "ceph auth caps",
"ceph auth get-or-create" and "ceph auth get-or-create-key") don't. Make
these commands print a helpful message too, the way the "ceph fs authorize"
command does.
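The permission rule described above can be sketched in Python as follows. This is only an illustration of the rule as worded in this message; the function name is_valid_mds_perm() is hypothetical, and the real check lives in the C++ MDSAuthCaps parser, which may be stricter about what follows the leading "r".

```python
def is_valid_mds_perm(perm: str) -> bool:
    """Check a permission string against the rule stated in the commit
    message. Hypothetical helper, not the actual MDSAuthCaps parser."""
    # "*" and "all" are accepted as complete permission strings.
    if perm in ("*", "all"):
        return True
    # Otherwise the string must begin with "r"; this also covers "rw".
    # A string that is just "w", or starts with anything else, is rejected.
    return perm.startswith("r")
```

With this sketch, "r", "rw" and "*" pass, while "w" is rejected, which is the case the new error message is meant to explain.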
Fixes: https://tracker.ceph.com/issues/61666
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f163dd3ef1fd9f05f05fa50eda9993225770d524)
Conflicts:
qa/tasks/cephfs/test_admin.py
Lines surrounding the patch to be applied were absent in quincy
branch.
src/mds/MDSAuthCaps.cc
Conflict was due to the fact that "std::string" was replaced by
"string".
Rishabh Dave [Sat, 24 Jun 2023 17:11:07 +0000 (22:41 +0530)]
qa/ceph_test_case: add a method to negative test Ceph commands
Also, add comments that explain to users the arguments accepted by the
run_ceph_cmd(), get_ceph_cmd_result(), get_ceph_cmd_stdout() and
negtest_ceph_cmd() methods of class RunCephCmd.
Rishabh Dave [Wed, 9 Aug 2023 12:40:32 +0000 (18:10 +0530)]
qa: inherit RunCephCmd in CephTestCase instead of CephFSTestCase
MgrTestCase also needs RunCephCmd. If RunCephCmd is inherited by
CephTestCase, instead of CephFSTestCase, MgrTestCase will automatically
inherit RunCephCmd because it inherits CephTestCase.
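The inheritance change described above can be pictured with a minimal sketch. The class names come from this message; the class bodies are stand-ins (the real classes do much more, and the return value of run_ceph_cmd() here is invented purely so the call can be observed).

```python
class RunCephCmd:
    """Stand-in for the mixin that provides the command helpers."""

    def run_ceph_cmd(self, *args):
        # Placeholder behaviour: report what would have been run.
        return ("ran", args)


class CephTestCase(RunCephCmd):
    """Base test case; it now inherits RunCephCmd directly."""


class CephFSTestCase(CephTestCase):
    """Gets RunCephCmd through CephTestCase instead of listing it itself."""


class MgrTestCase(CephTestCase):
    """Picks up RunCephCmd automatically, with no change of its own."""
```

Moving the mixin one level up means every subclass of CephTestCase gains the helpers without repeating the base class in each subclass.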
Fixes: https://tracker.ceph.com/issues/62084
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 4b369cf18ed1391a426ab4ae86da834e9c074f81)
Rishabh Dave [Mon, 27 Mar 2023 06:21:16 +0000 (11:51 +0530)]
qa/cephfs: use run_ceph_cmd() when cmd output is not needed
In filesystem.py and wherever instances of class Filesystem are used, use
run_ceph_cmd() instead of get_ceph_cluster_stdout() when the output of the
Ceph command is not required.
Conflicts:
qa/tasks/cephfs/filesystem.py
Patches from the commit couldn't be applied because the lines being
modified are present on the main branch but not on the
quincy branch.
Rishabh Dave [Mon, 27 Mar 2023 06:09:11 +0000 (11:39 +0530)]
qa/cephfs: add helper methods to filesystem.py
Add run_ceph_cmd(), get_ceph_cmd_stdout() and get_ceph_cmd_result() to
class Filesystem so that running Ceph commands is easier. This affects
not only methods inside class Filesystem but also methods elsewhere that
use an instance of class Filesystem to run Ceph commands.
Instead of "self.fs.mon_manager.raw_cluster_cmd()", writing
"self.fs.run_ceph_cmd()" will suffice.
Conflicts:
qa/tasks/cephfs/filesystem.py
qa/tasks/cephfs/mount.py
The commit had more patches for these files since they are larger
on the main branch.
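The Filesystem helper methods added by this commit can be sketched as thin wrappers around the manager object. The three helper names and the CephManager method names are taken from this patch series; the signatures, the FakeMonManager class, and which manager method each helper delegates to are assumptions made for illustration.

```python
class FakeMonManager:
    """Hypothetical stand-in for CephManager; it records commands
    instead of contacting a cluster."""

    def __init__(self):
        self.calls = []

    def run_cluster_cmd(self, *args):
        self.calls.append(("run", args))

    def raw_cluster_cmd(self, *args):
        self.calls.append(("stdout", args))
        return "output of: " + " ".join(args)

    def raw_cluster_cmd_result(self, *args):
        self.calls.append(("result", args))
        return 0


class Filesystem:
    def __init__(self, mon_manager):
        self.mon_manager = mon_manager

    def run_ceph_cmd(self, *args):
        # For callers that do not need the command's output.
        self.mon_manager.run_cluster_cmd(*args)

    def get_ceph_cmd_stdout(self, *args):
        # For callers that need the text printed by the command.
        return self.mon_manager.raw_cluster_cmd(*args)

    def get_ceph_cmd_result(self, *args):
        # For callers that need the command's return value.
        return self.mon_manager.raw_cluster_cmd_result(*args)
```

A test holding a Filesystem instance as self.fs would then call self.fs.run_ceph_cmd("fs", "ls") rather than reaching through self.fs.mon_manager.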
Rishabh Dave [Thu, 16 Mar 2023 10:02:39 +0000 (15:32 +0530)]
qa/cephfs: add and use get_ceph_cmd_stdout()
Add method get_ceph_cmd_stdout() to class CephFSTestCase so that one
doesn't have to type something as long as
"self.mds_cluster.mon_manager.raw_cluster_cmd()" to execute a
command and get its output. And delete and replace
CephFSTestCase.run_cluster_cmd() too.
Rishabh Dave [Thu, 16 Mar 2023 09:41:08 +0000 (15:11 +0530)]
qa/cephfs: add and use run_ceph_cmd()
Instead of writing something as long as
"self.mds_cluster.mon_manager.run_cluster_cmd()" to execute a command,
let's add a helper method to class CephFSTestCase and use it instead.
With this, running a command becomes simple - "self.run_ceph_cmd()".
Conflicts:
qa/tasks/cephfs/test_damage.py
Conflict was due to the fact that this file is slightly different on
quincy branch compared to the main branch version when the commit
being cherry-picked was merged.
Rishabh Dave [Tue, 14 Mar 2023 19:43:56 +0000 (01:13 +0530)]
qa/cephfs: add and use get_ceph_cmd_result()
To run a command and get its return value, instead of typing something
as long as "self.mds_cluster.mon_manager.raw_cluster_cmd_result", add a
helper method in CephFSTestCase and use it. This makes the task very
simple - "self.get_ceph_cmd_result()".
Also, remove method CephFSTestCase.run_cluster_cmd_result() in favour of
this new method.
Rishabh Dave [Mon, 13 Mar 2023 13:05:50 +0000 (18:35 +0530)]
qa/cephfs: create CephManager instance in CephFSTestCase
To run a Ceph command conveniently, run_cluster_cmd(), raw_cluster_cmd()
or raw_cluster_cmd_result() must be called. These methods are available
in class CephManager, which in turn is available only if an instance of
Filesystem, MDSCluster, CephCluster or MgrCluster is initialized. Having
an instance of CephManager in CephFSTestCase will provide easy access to
these methods.
For example, in CephFS tests, writing "self.mon_manager.raw_cluster_cmd()"
instead of "self.mds_cluster.mon_manager.raw_cluster_cmd()" will
suffice.
This commit provides a basis for upcoming commits in this patch series.
With the next patches, running a Ceph command will be further simplified.
Just writing self.run_ceph_cmd() will suffice for running a CephFS command.
Ronen Friedman [Mon, 22 May 2023 15:09:28 +0000 (18:09 +0300)]
osd/scrub: increasing max_osd_scrubs to 3
Bug reports seem to hint that the current default value of
'1' is too low: the cluster is susceptible to scrub scheduling
delays and issues stemming from local software/networking/hardware
problems, even if affecting a very small number of OSDs.
Squid will include a major overhaul of the way scrubs are counted
in the cluster, providing a better solution to the problem. For
now - modifying the default is an effective stop-gap measure.
Zac Dover [Thu, 7 Mar 2024 03:01:47 +0000 (13:01 +1000)]
doc/start: add Slack invite link
Add a link to the ceph-storage Slack invitation page. Previously the
link went to a plain old "this is the ceph-storage Slack" page that did
not direct the reader to sign up.
Adam King [Tue, 13 Jun 2023 23:54:30 +0000 (19:54 -0400)]
cephadm: run tcmu-runner through script to do restart on failure
Currently, cephadm runs tcmu-runner as a background
process inside the unit file deployed for iscsi
(rbd-target-api is the primary process). This means
if tcmu-runner crashes for whatever reason, systemd
will not attempt to restart it. This commit sets
up a script to serve as the container entrypoint
for the tcmu-runner container that will run
tcmu-runner and also restart it on failure
(unless there are too many failures in a short
period, at which point it gives up).
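The restart policy described above (restart on failure, give up after too many failures in a short period) can be sketched as a small supervision loop. This is not the script shipped by cephadm; the function name, the thresholds, and the choice to stop on a clean exit are all assumptions made for illustration.

```python
import time
from collections import deque


def supervise(run_once, max_failures=5, window_seconds=60.0,
              clock=time.monotonic):
    """Call run_once() until it exits cleanly or fails too often.

    run_once is any callable returning an exit code. Returns True on a
    clean exit (code 0) and False when max_failures failures fall within
    window_seconds. All parameter names and defaults are hypothetical.
    """
    recent_failures = deque()
    while True:
        if run_once() == 0:
            return True
        now = clock()
        recent_failures.append(now)
        # Forget failures that are older than the observation window.
        while recent_failures and now - recent_failures[0] > window_seconds:
            recent_failures.popleft()
        if len(recent_failures) >= max_failures:
            # Too many failures in a short period: give up.
            return False
```

Passing the clock and the process launcher as arguments keeps the loop testable without starting a real daemon.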
The hope is to eventually drop use of this script
for a better solution in squid onward, but this
should be helpful on older releases (quincy and
pacific at least) where we won't be able to
bring that better solution
Adam King [Fri, 2 Jun 2023 00:06:35 +0000 (20:06 -0400)]
cephadm: add tcmu-runner to logrotate config
This process could be used to set up the tcmu-runner
to log to a file much like other ceph daemons
- create /etc/tcmu directory
- create /etc/tcmu/tcmu.conf file with default options
- change dir to /var/log
- change log level to 4
- add -v /etc/tcmu:/etc/tcmu to tcmu-runner container podman line in unit.run
In order to support this (mostly for debugging) we should
add tcmu-runner to the logrotate config
Adam King [Fri, 7 Jul 2023 15:03:56 +0000 (11:03 -0400)]
qa/cephadm: add test for ca signed keys
Test that bootstraps with a CA signed key using
the use_ca_signed_key cephadm override. Then follows
up by doing a check-host on each host which verifies
the cephadm mgr module can reach and authenticate with
the nodes using the new key setup.
This probably should really be a workunit, but
I didn't want to create a full new section for
this test and I needed a section that didn't
already run the cephadm task for every test. I could
see this being moved into some sort of
"test_special_deployment_scenarios" section in the future
Adam King [Fri, 7 Jul 2023 14:36:39 +0000 (10:36 -0400)]
qa/cephadm: add ca signed key to cephadm task
To allow bootstrapping a cluster using a CA signed
key instead of the standard pubkey authentication.
Will allow explicit testing of this as we add support
for it
Adam King [Sat, 3 Jun 2023 18:39:05 +0000 (14:39 -0400)]
doc/cephadm: document how to pass self made SSH key pairs to bootstrap
This didn't seem to exist in the install section of
the cephadm docs. Wanted to add it in before adding
documentation for bootstrapping with CA signed keys.
Adam King [Thu, 1 Jun 2023 23:23:45 +0000 (19:23 -0400)]
mgr/cephadm: add is_host_<status> functions to HostCache
A bunch of places were doing list comprehension to see if a host
was unreachable/draining/schedulable by hostname. This is meant to
replace all those instances of list comprehension with a function
call that does the same
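The refactoring described above can be sketched as follows. The is_host_<status> naming follows the commit title; the Host tuple, its status field, and the get_*_hosts() helpers are simplified assumptions rather than the real HostCache internals.

```python
from typing import Iterable, List, NamedTuple


class Host(NamedTuple):
    hostname: str
    status: str  # hypothetical values: "", "unreachable", "draining"


class HostCache:
    def __init__(self, hosts: Iterable[Host]) -> None:
        self._hosts = list(hosts)

    def get_unreachable_hosts(self) -> List[Host]:
        return [h for h in self._hosts if h.status == "unreachable"]

    def get_draining_hosts(self) -> List[Host]:
        return [h for h in self._hosts if h.status == "draining"]

    # Each helper wraps the list comprehension that callers used to
    # write inline, so call sites shrink to a single method call.
    def is_host_unreachable(self, hostname: str) -> bool:
        return hostname in [h.hostname for h in self.get_unreachable_hosts()]

    def is_host_draining(self, hostname: str) -> bool:
        return hostname in [h.hostname for h in self.get_draining_hosts()]
```

A call site that previously built the hostname list itself would become cache.is_host_unreachable(name).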
Error EINVAL: ServiceSpec: 'dict' object has no attribute 'validate'
which is not a useful error message. This is caused by the
spec assuming all osd specific fields are either defined
in the 'spec' section or outside of it, but not mixed in.
We could also just consider these specs to be invalid
and just raise a better error message, but it seems easier
to make the minor adjustment for it to work, given there doesn't
seem to be an issue with mixing the styles for specs for
other service types.
Adam King [Mon, 5 Jun 2023 17:18:06 +0000 (13:18 -0400)]
python-common/drive_selection: lower log level of limit policy message
This gets logged every time cephadm tries to apply a
relevant OSD spec and ends up spamming the logs. There's no reason
we really need this to be at info rather than debug level,
so let's lower it.
Adam King [Mon, 19 Jun 2023 20:07:31 +0000 (16:07 -0400)]
mgr/cephadm: add extra_entrypoint_args to mon spec
There was no reason for the mon spec to not include
this option. I believe this was just an oversight caused
by the addition of the mon spec and extra_entrypoint_args
in separate PRs around the same time.
Adam King [Mon, 19 Jun 2023 19:46:45 +0000 (15:46 -0400)]
mgr/cephadm: add extra_container_args and custom_configs to CustomContainer
CustomContainer was skipped previously for the extra_container_args
and custom_configs feature as these could already be done
using other fields within the custom container service spec
(the "args" and "files" fields respectively). It seems
desirable for us to allow setting these things for custom
containers the same as for other services for uniformity's sake
and this allows us to use custom containers to test
these features.
Zac Dover [Mon, 4 Mar 2024 10:41:16 +0000 (20:41 +1000)]
doc/rados: link to pg setting commands
Link to the instructions for manually setting the number of PGs per
pool, from the mention of placement groups. These instructions are
included here in response to a request from Ronen Friedman on the
occasion of the removal of links to the PGcalc (see
https://github.com/ceph/ceph/pull/55899#pullrequestreview-1912940118).
Afreen [Tue, 6 Feb 2024 09:43:58 +0000 (15:13 +0530)]
mgr/dashboard: fix error while accessing roles tab when policy attached
Fixes https://tracker.ceph.com/issues/64270
Issue:
======
Accessing the Object->Users->Roles tab causes a 500 internal server error.
This happens because "PermissionPolicies" are attached to the role and the
backend was not handling this field for rgw roles.
Fix:
====
Added "PermissionPolicies" as a valid field in the backend and updated
the frontend to render the attached policy as formatted JSON.
Zac Dover [Sun, 3 Mar 2024 10:28:00 +0000 (20:28 +1000)]
doc/rados: remove PGcalc from docs
Remove mention of the "PG calc" tool from the documentation. I have
removed all mention of this in one fell swoop to help posterity restore
mention of this tool if we decide we need to do so.
Zac Dover [Fri, 1 Mar 2024 12:11:14 +0000 (22:11 +1000)]
doc/install: add manual RADOSGW install procedure
Add a manual RADOSGW installation procedure to
doc/install/manual-deployment.rst. This procedure was developed by Janne
Johansson and reported to the ceph-users mailing list on 29 Jan 2024
here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/LB3YRIKAPOHXYCW7MKLVUJPYWYRQVARU/
Co-authored-by: Janne Johansson <icepic.dz@gmail.com>
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 565bc9503838906995fa48f59debcd2843775b18)
Zac Dover [Thu, 29 Feb 2024 08:08:10 +0000 (18:08 +1000)]
doc/glossary: improve "MDS" entry
Improve the entry for "MDS" in doc/glossary.rst by linking to the
"ceph-mds" man page and mentioning the relationship between clients and
MDS (or MDSes).
Zac Dover [Mon, 26 Feb 2024 10:03:48 +0000 (20:03 +1000)]
doc/rados: add "change public network" procedure
Add a procedure to /doc/rados/operations/add-or-rm-mons.rst that
explains how to change the public_network in a Ceph cluster deployed
with cephadm. This procedure was developed by Eugen Block, and can be
seen in its original form here:
https://heiterbiswolkig.blogs.nde.ag/2024/02/22/cephadm-change-public-network/
John Mulligan [Wed, 12 Oct 2022 18:15:59 +0000 (14:15 -0400)]
cephadm: fix base class behavior on python3.6
This fixes the cephadm test files when running tox/pytest on python3.6
(centos/rhel 8).
Long story short, combining classmethod and property on py3.6 behaves
differently from py3.7 and up. Since the classmethod is actually
unnecessary for the base class to behave as it does, we drop that
decorator.
John Mulligan [Wed, 12 Oct 2022 18:06:40 +0000 (14:06 -0400)]
cephadm: fix running test suite on python3.6
While a new version of pyfakefs is available, version 5 is not available
for python 3.6. In order to run the test suite on centos 8 we will
continue to work with pyfakefs version 4.