Ronen Friedman [Mon, 22 May 2023 15:09:28 +0000 (18:09 +0300)]
osd/scrub: increasing max_osd_scrubs to 3
Bug reports seem to hint that the current default value of
'1' is too low: the cluster is susceptible to scrub scheduling
delays and issues stemming from local software/networking/hardware
problems, even if affecting a very small number of OSDs.
Squid will include a major overhaul of the way scrubs are counted
in the cluster, providing a better solution to the problem. For
now, modifying the default is an effective stop-gap measure.
Zac Dover [Thu, 7 Mar 2024 03:01:47 +0000 (13:01 +1000)]
doc/start: add Slack invite link
Add a link to the ceph-storage Slack invitation page. Previously the
link went to a plain old "this is the ceph-storage Slack" page that did
not direct the reader to sign up.
Adam King [Mon, 21 Aug 2023 17:48:56 +0000 (13:48 -0400)]
cephadm: make custom_configs work for tcmu-runner container
This is intended to be a temporary workaround that allows
custom config files to be mounted into
the tcmu-runner container. The hope is to refactor
cephadm's iscsi handling for squid, but a patch
like this could be useful for iscsi in older
releases, where custom config files are currently
unusable for the tcmu-runner container.
This patch writes the
custom config files to a dir for the tcmu-runner
container so that the rest of the logic works without
change. I thought this would be easier to remove later
than a patch that integrates more with the container
mounts or general deployment.
Adam King [Tue, 13 Jun 2023 23:54:30 +0000 (19:54 -0400)]
cephadm: run tcmu-runner through script to do restart on failure
Currently, cephadm runs tcmu-runner as a background
process inside the unit file deployed for iscsi
(rbd-target-api is the primary process). This means
if tcmu-runner crashes for whatever reason, systemd
will not attempt to restart it. This commit sets
up a script to serve as the container entrypoint
for the tcmu-runner container that will run
tcmu-runner and also restart it on failure
(unless there are too many failures in a short
period, at which point it gives up).
The hope is to eventually drop use of this script
for a better solution in squid onward, but this
should be helpful on older releases (quincy and
pacific at least) where we won't be able to
bring that better solution.
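The restart policy described above (restart on failure, give up after too
many failures in a short period) can be sketched roughly as follows. This is
a Python illustration only, not the script the commit adds; the function
name, the default limit of 5 failures and the 60 second window are all
assumptions made for the example.

```python
import subprocess
import time
from collections import deque
from typing import Callable, Deque, List


def run_with_restarts(
    cmd: List[str],
    max_failures: int = 5,          # assumed limit, not taken from the commit
    window_seconds: float = 60.0,   # assumed window, not taken from the commit
    run: Callable[[List[str]], int] = subprocess.call,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run cmd and restart it after every non-zero exit.

    Gives up once max_failures failures fall inside window_seconds.
    Returns the number of times cmd was started.
    """
    failures: Deque[float] = deque()
    starts = 0
    while True:
        starts += 1
        rc = run(cmd)
        if rc == 0:
            return starts  # clean exit, nothing to restart
        now = clock()
        failures.append(now)
        # Forget failures that are older than the window.
        while failures and now - failures[0] > window_seconds:
            failures.popleft()
        if len(failures) >= max_failures:
            return starts  # too many failures in a short period
```

The run and clock parameters exist only so the loop can be exercised without
starting a real process.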
Adam King [Fri, 2 Jun 2023 00:06:35 +0000 (20:06 -0400)]
cephadm: add tcmu-runner to logrotate config
This process could be used to set up tcmu-runner
to log to a file, much like other ceph daemons:
- create the /etc/tcmu directory
- create the /etc/tcmu/tcmu.conf file with default options
- change the log dir to /var/log
- change the log level to 4
- add -v /etc/tcmu:/etc/tcmu to the tcmu-runner container podman line in unit.run
In order to support this (mostly for debugging) we should
add tcmu-runner to the logrotate config.
Adam King [Fri, 7 Jul 2023 15:03:56 +0000 (11:03 -0400)]
qa/cephadm: add test for ca signed keys
Add a test that bootstraps with a CA signed key using
the use_ca_signed_key cephadm override, then follows
up by doing a check-host on each host, which verifies that
the cephadm mgr module can reach and authenticate with
the nodes using the new key setup.
This probably should really be a workunit, but
I didn't want to create a full new section for
this test and I needed a section that didn't
already run the cephadm task for every test. I could
see this being moved into some sort of
"test_special_deployment_scenarios" section in the future.
Adam King [Fri, 7 Jul 2023 14:36:39 +0000 (10:36 -0400)]
qa/cephadm: add ca signed key to cephadm task
Allow bootstrapping a cluster using a CA signed
key instead of the standard pubkey authentication.
This will allow explicit testing of the feature as we add support
for it.
Adam King [Sat, 3 Jun 2023 18:39:05 +0000 (14:39 -0400)]
doc/cephadm: document how to pass self made SSH key pairs to bootstrap
This didn't seem to exist in the install section of
the cephadm docs. Wanted to add it in before adding
documentation for bootstrapping with CA signed keys.
Adam King [Thu, 1 Jun 2023 23:23:45 +0000 (19:23 -0400)]
mgr/cephadm: add is_host_<status> functions to HostCache
A bunch of places were doing list comprehension to see if a host
was unreachable/draining/schedulable by hostname. This is meant to
replace all those instances of list comprehension with a function
call that does the same.
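As a rough illustration of the idea, the repeated pattern
`hostname in [h.hostname for h in some_host_list]` can be replaced by one
lookup function per status. The classes below are hypothetical stand-ins,
not cephadm's real HostCache, and the status strings ("offline", "draining",
empty for schedulable) are assumptions made for the sketch.

```python
from typing import Dict, List


class HostSketch:
    """Hypothetical host record; cephadm's real host type is richer."""

    def __init__(self, hostname: str, status: str = "") -> None:
        self.hostname = hostname
        self.status = status


class HostCacheSketch:
    """Hypothetical cache keyed by hostname."""

    def __init__(self, hosts: List[HostSketch]) -> None:
        self._hosts: Dict[str, HostSketch] = {h.hostname: h for h in hosts}

    def _has_status(self, hostname: str, status: str) -> bool:
        # Unknown hostnames never match any status.
        host = self._hosts.get(hostname)
        return host is not None and host.status == status

    def is_host_unreachable(self, hostname: str) -> bool:
        return self._has_status(hostname, "offline")

    def is_host_draining(self, hostname: str) -> bool:
        return self._has_status(hostname, "draining")

    def is_host_schedulable(self, hostname: str) -> bool:
        # Assumption: an empty status means the host can take daemons.
        return self._has_status(hostname, "")
```

Callers then write `cache.is_host_draining(name)` instead of rebuilding a
list of hostnames at every call site.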
Error EINVAL: ServiceSpec: 'dict' object has no attribute 'validate'
which is not a useful error message. This is caused by the
spec assuming all osd specific fields are either defined
in the 'spec' section or outside of it, but not mixed in.
We could also just consider these specs to be invalid
and just raise a better error message, but it seems easier
to make the minor adjustment for it to work, given there doesn't
seem to be an issue with mixing the styles for specs for
other service types.
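One way to picture the adjustment is a normalization step that merges the
fields nested under 'spec' with the fields written outside of it before the
spec object is built. This is a sketch under assumptions, not the code from
the commit: the function name is hypothetical, and rejecting a field that is
set in both places with different values is a choice made for the example.

```python
from typing import Any, Dict


def flatten_spec(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Merge fields nested under 'spec' with fields at the top level."""
    flat = {k: v for k, v in raw.items() if k != "spec"}
    nested = raw.get("spec") or {}
    for key, value in nested.items():
        # Assumption for this sketch: conflicting duplicates are an error.
        if key in flat and flat[key] != value:
            raise ValueError(f"field {key!r} is set twice with different values")
        flat[key] = value
    return flat


mixed = {
    "service_type": "osd",
    "service_id": "demo",
    "data_devices": {"all": True},   # written outside 'spec'
    "spec": {"encrypted": True},     # written inside 'spec'
}
print(flatten_spec(mixed))
```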
Adam King [Mon, 5 Jun 2023 17:18:06 +0000 (13:18 -0400)]
python-common/drive_selection: lower log level of limit policy message
This gets logged every time cephadm tries to apply a
relevant OSD spec and ends up spamming the logs. There's no reason
we really need this to be at info rather than debug level,
so let's lower it.
Adam King [Mon, 19 Jun 2023 20:07:31 +0000 (16:07 -0400)]
mgr/cephadm: add extra_entrypoint_args to mon spec
There was no reason for the mon spec to not include
this option. I believe this was just an oversight caused
by the addition of the mon spec and extra_entrypoint_args
in separate PRs around the same time.
Adam King [Mon, 19 Jun 2023 19:46:45 +0000 (15:46 -0400)]
mgr/cephadm: add extra_container_args and custom_configs to CustomContainer
CustomContainer was skipped previously for the extra_container_args
and custom_configs features, as these could already be done
using other fields within the custom container service spec
(the "args" and "files" fields respectively). It seems
desirable for us to allow setting these things for custom
containers the same as for other services for uniformity's sake,
and this allows us to use custom containers to test
these features.
Zac Dover [Mon, 4 Mar 2024 10:41:16 +0000 (20:41 +1000)]
doc/rados: link to pg setting commands
Link to the instructions for manually setting the number of PGs per
pool, from the mention of placement groups. These instructions are
included here in response to a request from Ronen Friedman on the
occasion of the removal of links to the PGcalc (see
https://github.com/ceph/ceph/pull/55899#pullrequestreview-1912940118).
Afreen [Tue, 6 Feb 2024 09:43:58 +0000 (15:13 +0530)]
mgr/dashboard: fix error while accessing roles tab when policy attached
Fixes https://tracker.ceph.com/issues/64270
Issue:
======
Accessing the Object->Users->Roles tab causes a 500 internal server error.
This is due to the "PermissionPolicies" that are attached to the role; the
backend was not handling this field for rgw roles.
Fix:
====
Added "PermissionPolicies" as a valid field in the backend and updated the
frontend to render the attached policy as formatted JSON.
Zac Dover [Sun, 3 Mar 2024 10:28:00 +0000 (20:28 +1000)]
doc/rados: remove PGcalc from docs
Remove mention of the "PG calc" tool from the documentation. I have
removed all mention of this in one fell swoop to help posterity restore
mention of this tool if we decide we need to do so.
Zac Dover [Fri, 1 Mar 2024 12:11:14 +0000 (22:11 +1000)]
doc/install: add manual RADOSGW install procedure
Add a manual RADOSGW installation procedure to
doc/install/manual-deployment.rst. This procedure was developed by Janne
Johansson and reported to the ceph-users mailing list on 29 Jan 2024
here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/LB3YRIKAPOHXYCW7MKLVUJPYWYRQVARU/
Co-authored-by: Janne Johansson <icepic.dz@gmail.com>
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 565bc9503838906995fa48f59debcd2843775b18)
Zac Dover [Thu, 29 Feb 2024 08:08:10 +0000 (18:08 +1000)]
doc/glossary: improve "MDS" entry
Improve the entry for "MDS" in doc/glossary.rst by linking to the
"ceph-mds" man page and mentioning the relationship between clients and
MDS (or MDSes).
Zac Dover [Mon, 26 Feb 2024 10:03:48 +0000 (20:03 +1000)]
doc/rados: add "change public network" procedure
Add a procedure to /doc/rados/operations/add-or-rm-mons.rst that
explains how to change the public_network in a Ceph cluster deployed
with cephadm. This procedure was developed by Eugen Block, and can be
seen in its original form here:
https://heiterbiswolkig.blogs.nde.ag/2024/02/22/cephadm-change-public-network/
John Mulligan [Wed, 12 Oct 2022 18:15:59 +0000 (14:15 -0400)]
cephadm: fix base class behavior on python3.6
This fixes the cephadm test files when running tox/pytest on python3.6
(centos/rhel 8).
Long story short, combining classmethod and property on py3.6 behaves
differently from py3.7 and up. Since the classmethod is actually
unnecessary for the base class to behave as it does, we drop that
decorator.
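The message does not show the class involved, so the following is only a
sketch with hypothetical names. It assumes the value is read from instances,
in which case a plain property on the base class is enough and no
classmethod wrapper is needed.

```python
class BaseSketch:
    """Hypothetical base class; not the class touched by the commit."""

    _name = "base"

    @property
    def name(self) -> str:
        # Plain property: looked up on instances, reads the class attribute
        # of whichever subclass the instance belongs to.
        return type(self)._name


class ChildSketch(BaseSketch):
    _name = "child"


print(BaseSketch().name, ChildSketch().name)
```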
John Mulligan [Wed, 12 Oct 2022 18:06:40 +0000 (14:06 -0400)]
cephadm: fix running test suite on python3.6
While a new version of pyfakefs is available, version 5 is not available
for python 3.6. In order to run the test suite on centos 8 we will
continue to work with pyfakefs version 4.