Ramana Raja [Tue, 25 Jan 2022 01:06:11 +0000 (20:06 -0500)]
mgr/nfs: allow dynamic update of cephfs nfs export
mgr/nfs module's apply_export() method is used to update an existing
CephFS NFS export. The method always restarted the ganesha service (
ganesha server cluster) after updating the export object and notifying
the ganesha servers to reload their exports. The restart temporarily
affected the clients connections of all the exports served by the
ganesha servers.
It is not always necessary to restart the ganesha servers. Only
updating the export ID, path, or FSAL block of a CephFS NFS export
requires a restart. So modify apply_export() to only restart the
ganesha servers for such export updates.
The mgr/nfs module creates a FSAL ceph user with read-only or
read-write path restricted MDS caps for each export. To change the
access type of the CephFS NFS export, the MDS caps of the export's FSAL
ceph user must also be changed. Ganesha can dynamically enforce an
export's access type changes, but Ceph server daemons can't dynamically
enforce changes in caps of the Ceph clients. To allow dynamic updates
of CephFS NFS exports, always create a FSAL Ceph user with read-write
path restricted MDS caps per export. Rely on the ganesha servers to
enforce the export access type changes for the NFS clients.
mgr/cephadm: adding logic to close ports when removing a daemon Fixes: https://tracker.ceph.com/issues/52906 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 4deb546ffd67ac8f05d2788150764a26b5671b87)
doc/cephadm/services: Add missing ceph command to orch apply
In cephadm service management documentation several of the
ceph orch commands are missing the ceph part, mostly in
ceph orch apply commands but not all of them.
Add ceph in the front of the command to make them consistent
with all other commands.
Zac Dover [Wed, 18 May 2022 10:36:53 +0000 (20:36 +1000)]
doc/start: s/3/three/ in intro.rst
I'm changing "3" to "three" for two reasons:
1. It's correct.
2. This allows me to test backports into Octopus, Pacific, and Quincy.
I am particularly interested to see what happens when I attempt
the backport into Octopus, because backports into Octopus have
failed. This will provide me with another unit of data.
Conflicts:
src/pybind/mgr/dashboard/frontend/cypress/integration/cluster/services.po.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/services/service-daemon-list/service-daemon-list.component.html
- Neglecting e2e changes as they aren't adopted fully to master yet- no regression. Adopting daemon-list html with master.
mgr/dashboard: introduce memory and cpu usage for daemons
Fixes: https://tracker.ceph.com/issues/55218 Signed-off-by: Avan Thakkar <athakkar@redhat.com> Co-authored-by: Aashish Sharma <aasharma@redhat.com>
Introducing 2 new columns in Cluster->Host->Daemons table for Memory and CPU usage.
Ilya Dryomov [Thu, 10 Mar 2022 17:32:30 +0000 (18:32 +0100)]
rpm: use system libpmem on Centos 9 Stream
We need libpmem 1.10 and Centos 9 Stream has it. On top of sticking
to distro-provided packages being generally a good thing, this fixes
a build failure: libpmem 1.10 doesn't build with LTO which is enabled
by default in Centos 9 Stream. The distro package works around it.
libpmem 1.10 is also there in Fedora 34 and Fedora 35.
haoyixing [Fri, 25 Mar 2022 03:02:13 +0000 (03:02 +0000)]
mds: add a perf counter to record slow replies
Though we have MDS_HEALTH_SLOW_METADATA_IO and MDS_HEALTH_SLOW_REQUEST health alert, but those are not
precise nor accumulated. With slow reply counter compared to reply counter, we can find out the ratio
of slow requests through perf dump.
Fixes: https://tracker.ceph.com/issues/55126 Signed-off-by: haoyixing <haoyixing@kuaishou.com>
(cherry picked from commit e8e3b307c87dc9eec2d087b396c0e7a0248b4f1d)
but when WITH_SYSTEM_ARROW is enabled, the targets we get from
find_package() do not carry this dependency. so rgw's cmake needs to
depend on both targets
windgmbh [Fri, 12 Nov 2021 15:51:03 +0000 (16:51 +0100)]
Apply sysctl.d migration from /usr/lib to /etc
A fix regarding the SYSCTL_DIR location (#53130) requires to migrate
sysctl.d/*.conf files from /usr/lib to /etc. Signed-off-by: Lukas Mayer <lmayer@wind.gmbh>
(cherry picked from commit a167a27f30536958e0f2c513d351642e81ba06d5)
windgmbh [Wed, 3 Nov 2021 17:16:53 +0000 (18:16 +0100)]
Fix sysctl.d location FHS compliance
This fixes #53130
Containers should not write to '/usr/lib'.
That location could be read-only or overwritten. Signed-off-by: Lukas Mayer <lmayer@wind.gmbh>
(cherry picked from commit 77afa812ea8b7e1e802246e4aa3a31e7b644a502)
Adam King [Thu, 24 Mar 2022 13:59:10 +0000 (09:59 -0400)]
cephadm: pass "--security-opt label=disable" to node-exporter container
in order to support setting '--path.procfs=/host/proc','--path.sysfs=/host/sys',
'--path.rootfs=/rootfs' for node-exporter we need to disable selinux separation
between the node-exporter container and the host to avoid selinux denials
Adam King [Fri, 4 Mar 2022 02:47:47 +0000 (21:47 -0500)]
mgr/cephadm: offline host watcher
To be able to detect if certain offline hosts go
offline quicker. Could be useful for the NFS
HA feature as this requires moving nfs daemons from
offline hosts within 90 seconds.
Redouane Kachach [Tue, 29 Mar 2022 16:37:10 +0000 (18:37 +0200)]
mgr/cephadm: fallback to normal sorted if cannot import natsorted Fixes: https://tracker.ceph.com/issues/55113 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 19c07de8207de5038df6f510a3c2ff41b10f7e08)
Adam King [Tue, 22 Mar 2022 22:57:21 +0000 (18:57 -0400)]
mgr/cephadm: Reschedule nfs daemons from offline hosts
In order to improve nfs availability, if there are other
hosts we can place an nfs daemon on or if there is a host
with a lower rank nfs daemon when a higher rank one is on
an offline host, we should reschedule the nfs daemons
mgr/cephadm: Adding support to store ceph conf per cluster fsid Fixes: https://tracker.ceph.com/issues/55185 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 2ea76173a163a93bbfbf69d0faa732d46eaf05ba)
mgr/cephadm: do not add _admin label when no-minimize-config is provided Fixes: https://tracker.ceph.com/issues/52727 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 01c8999d0354a71a7ef8526aab9b39e30d67c1bb)
Redouane Kachach [Wed, 23 Mar 2022 17:24:01 +0000 (18:24 +0100)]
mgr/cephadm: Adding image tag and date to cephadm startup messages Fixes: https://tracker.ceph.com/issues/55008 Fixes: https://tracker.ceph.com/issues/54373 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 92ecb58d46b6f75265a664f3165f4b3a0dd4993a)
Adam King [Wed, 6 Apr 2022 14:32:22 +0000 (10:32 -0400)]
mgr/cephadm: allow setting insecure_skip_verify for alertmanager
Add a "secure" parameter to alertmanager spec that will cause it
to deploy alertmanagers with insecure_skip_verify as true or false
depending on the value given for "secure".
NOTE: alertmanager must still be reconfigured after applying a yaml
with this option changed.
Fixes: https://tracker.ceph.com/issues/55272 Fixes: https://tracker.ceph.com/issues/55333 Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit e583d4ef1ac23a7473d50d253e0edf70580542ae)
Moritz Röhrich [Mon, 21 Mar 2022 16:32:25 +0000 (17:32 +0100)]
cephadm: avoid crashing on expected non-zero exit
- Avoid crashing when a call out to an external program expectedly does
not return exit status zero.
There are programs that communicate other information than error/no
error through exit status. E.g. `systemctl status` will return different
exit codes depending on the actual status of the units in question.
In cases where this is expected crashing with a RuntimeError exception
is inappropriate and should be avoided.
Fixes: https://tracker.ceph.com/issues/55117 Signed-off-by: Moritz Röhrich <moritz.rohrich@suse.com>
(cherry picked from commit a02be6f22fa18094cd8758700ab74581b6ce1701)
Melissa Li [Wed, 23 Mar 2022 15:38:37 +0000 (11:38 -0400)]
cephadm: show error message if private registry credentials not provided
Raise UnauthorizedRegistryError in `_pull_image` if user tries to pull from a private registry without authentication, handle error in `command_bootstrap`, `commond_adopt`, `command_pull`
Fixes: https://tracker.ceph.com/issues/55015 Signed-off-by: Melissa Li <melissali@redhat.com>
(cherry picked from commit 4de0803ba893abf341ab634d1382208370de7c98)
mgr/cephadm: support non-root ssh-user w permissions
Restructured code, so that in case of non-root, the resulting file will
be created with permissions set to the ssh-user. This allows the
subsequent scp to be able to write the file. The remaining code kept the
same, so that file permissions are restored to the expected ones, but
just runs after the scp.
Fixes: https://tracker.ceph.com/issues/54620 Signed-off-by: Christoph Glaubitz <c.glaubitz@syseleven.de>
(cherry picked from commit 452e52a7e39409e3409d59940133333416b830bc)
ceph-volume/tests: reject loop devices in lvm.conf
The current task doesn't works (typo?).
Otherwise api/lvm.py can't work properly, functions such as
`get_single_lv()` and many other don't return the expected results.
Indeed, lvm is confused because of the nvme_loop setup.
This adds the support of complex OSD creation with command
`orch daemon add osd`.
Any argument supported by `DriveGroupSpec()` can be passed on the command line.
Cephadm shouldn't try to deploy a disk reported as unavailable by ceph-volume.
The idea here is to check the rejection reason so we can still use DB devices
in case of OSD replacement.