Paul Cuzner [Wed, 2 Jun 2021 23:34:19 +0000 (11:34 +1200)]
mgr/cephadm: fix alerts sent to wrong URL
The path_prefix in prometheus.yml specified an endpoint
prefix for alertmanager, which was invalid. This resulted in 404
errors when Prometheus tried to send alerts to alertmanager and
blocked alerts from being passed on to the ceph-dashboard API
receiver. This fix removes the prefix.
Fixes: https://tracker.ceph.com/issues/51073
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
(cherry picked from commit 9d408a70c7d01fd7c94f9b814af916396d7cbf1f)
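A minimal sketch, in Python, of the corrected alerting stanza once the prefix is gone; the helper name and targets are illustrative, not the actual cephadm template code:
```
import yaml

def build_alerting_config(alertmanager_targets):
    # alertmanager is served at its root path, so no path_prefix:
    # with one, Prometheus POSTs alerts to a non-existent endpoint (404)
    return {
        'alerting': {
            'alertmanagers': [{
                'scheme': 'http',
                'static_configs': [{'targets': alertmanager_targets}],
            }],
        },
    }

print(yaml.safe_dump(build_alerting_config(['mon0:9093'])))
```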
Sage Weil [Thu, 3 Jun 2021 14:29:00 +0000 (10:29 -0400)]
mgr/cephadm/inventory: do not try to resolve current mgr host
The CNI configuration may set up a private network for the container, which
is mapped to the hostname in /etc/hosts. For example, my test box sets
up 10.88.0.0/24, because I was using crio + kubeadm on this host earlier
(at least I think that's why).
In any case, we should never trust a lookup of our own hostname from inside
a container!
This isn't quite sufficient, though: if this is a single-host cluster, then
we fall back to using get_mgr_ip(). That value may be distorted by the
public_network option on the mgr, but we don't have any other good
options here, and single-node clusters are unlikely to have complex
network configs.
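A sketch of the resulting lookup order; the known_addrs mapping stands in for the cephadm inventory, and none of these names are the actual mgr/cephadm identifiers:
```
def resolve_mgr_addr(known_addrs, my_hostname, get_mgr_ip):
    # prefer the address recorded in the inventory over resolving our
    # own hostname, which inside a container may map to a CNI-private
    # address (e.g. 10.88.0.0/24) via /etc/hosts
    addr = known_addrs.get(my_hostname)
    if addr:
        return addr
    # single-host cluster with no usable inventory entry: fall back to
    # get_mgr_ip(), even though public_network may skew that value
    return get_mgr_ip()
```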
Sage Weil [Wed, 2 Jun 2021 02:31:11 +0000 (22:31 -0400)]
pybind/mgr/mgr_module: make get_mgr_ip() return mgr's IP from mgrmap
The previous approach was convoluted: we tried a DNS lookup on the
hostname, which returns the wrong result if /etc/hosts has an entry for it.
With podman, it does, and the IP it carries will vary in all sorts of weird
ways. For example, CNI on my host means that I get a dynamic address in
10.88.0.0/24.
Avoid all of that nonsense and use the IP that is in the mgrmap. There
may be multiple IPs (v2 + v1, or maybe even IPv4 + v6 in the future); in
that case, use the first one.
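A sketch of the new approach; the mgrmap layout assumed here (active_addrs -> addrvec) is based on the JSON that 'ceph mgr dump' emits, not the exact mgr_module API:
```
def get_mgr_ip(mgrmap):
    # multiple addresses are possible (v2 + v1, perhaps IPv4 + IPv6
    # someday); simply take the first one
    first = mgrmap['active_addrs']['addrvec'][0]['addr']  # "10.0.0.5:6800"
    return first.rsplit(':', 1)[0]
```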
Sage Weil [Tue, 25 May 2021 17:55:08 +0000 (13:55 -0400)]
cephadm: stop passing --no-hosts to podman
This reverts cfc1f914ce74f1fd1f45e2efd3ba2ddcb2da129a, which is no longer
necessary because (1) we don't use socket.getfqdn(), and (2) we generally
do not rely on DNS or /etc/hosts at all anymore (with the exception of
the upgrade transition).
Sage Weil [Tue, 25 May 2021 20:10:49 +0000 (16:10 -0400)]
mgr/cephadm: convert host addr if non-IP to IP
Previously we allowed the host.addr to be a DNS name (short or fqdn).
This is problematic because of the inconsistent way that docker and podman
handle /etc/hosts, and undesirable because relying on DNS adds an
external source of failure for the cluster without any benefit in
return (simply updating DNS is not sufficient to make ceph behave).
So: update any non-IP to an IP as soon as we start up (presumably on
upgrade). If we get a loopback address (127.0.0.1 or 127.0.1.1), then
wait and hope that the next instance of the manager has better luck.
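A sketch of the startup conversion, assuming a plain getaddrinfo() resolution; the function name is illustrative:
```
import socket

def resolve_host_addr(name):
    try:
        ip = socket.getaddrinfo(name, None,
                                family=socket.AF_INET,
                                type=socket.SOCK_STREAM)[0][4][0]
    except socket.gaierror:
        return None  # keep the old value and retry on the next startup
    if ip in ('127.0.0.1', '127.0.1.1'):
        return None  # bad /etc/hosts entry; let the next mgr try its luck
    return ip
```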
Sage Weil [Tue, 25 May 2021 17:00:35 +0000 (13:00 -0400)]
mgr/dashboard,prometheus: new method of getting mgr IP
- Use a centralized method get_mgr_ip()
- Look up the hostname via DNS. This is a bit more reliable than
getfqdn() since it will work even when podman adds the container
name to /etc/hosts.
Sage Weil [Fri, 21 May 2021 17:31:31 +0000 (13:31 -0400)]
mgr/cephadm: use known host addr
If the host IP/addr is known, use that. The addr might even be a FQDN
instead of an IP address, in which case we want to look that up instead
of the bare hostname.
Sage Weil [Fri, 21 May 2021 16:32:49 +0000 (12:32 -0400)]
mgr/cephadm: resolve IP at 'orch host add' time
We prefer to always have a real IP for hosts in the cluster. This avoids
a reliance on DNS for most operations.
Perhaps more importantly, it means we are less sensitive to inconsistent
host lookup results, for example due to (1) mismatched /etc/hosts files
between machines, or (2) a lookup of the local hostname that returns
127.0.1.1.
Adjust with_hosts() fixture to take an addr, and adjust tests accordingly.
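A sketch of resolving at add time; the names and the error text are illustrative:
```
import ipaddress
import socket

def addr_for_new_host(hostname, addr=None):
    # pin a concrete IP now so later operations never depend on DNS or
    # on /etc/hosts being consistent across machines
    candidate = addr or hostname
    try:
        return str(ipaddress.ip_address(candidate))  # already an IP
    except ValueError:
        pass
    ip = socket.getaddrinfo(candidate, None,
                            family=socket.AF_INET)[0][4][0]
    if ip.startswith('127.'):
        raise ValueError(f'{candidate} resolved to loopback {ip}; '
                         f'pass an explicit addr')
    return ip
```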
Zac Dover [Thu, 27 May 2021 01:28:38 +0000 (11:28 +1000)]
doc/cephadm: enrich "service status"
This PR improves the wording of the "Service
Status" section of the "Service Management"
section of the cephadm guide. This includes a
pretty significant reworking of the information
in the section, so vetting this one might be
annoying. Anyway, I think I've lowered the
cognitive load on the reader.
Michael Fritch [Thu, 13 May 2021 23:03:32 +0000 (17:03 -0600)]
cephadm: clean-up error message
use the standard error message from FileNotFoundError:
```
cephadm bootstrap --mon-ip 192.168.1.1 --config ~/foobar
ERROR: [Errno 2] No such file or directory: '/root/foobar'
```
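A minimal sketch of the pattern, assuming a hypothetical read_config helper:
```
import sys

def read_config(path):
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError as e:
        # str(e) already carries "[Errno 2] No such file or directory: ..."
        sys.exit(f'ERROR: {e}')
```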
Sage Weil [Thu, 6 May 2021 22:47:27 +0000 (18:47 -0400)]
mgr/nfs: take --ingress argument to 'nfs cluster create'
It is likely that the rook/k8s variation of ingress will not take a
virtual_ip argument. We want to make sure that ingress yes/no can be
specified independent of the virtual_ip.
Sage Weil [Thu, 6 May 2021 14:57:46 +0000 (10:57 -0400)]
cephadm: --stop-signal=SIGTERM
haproxy's container image tells docker|podman to send SIGUSR1 for a "clean"
shutdown. For NFS, the connections never close, so we will always hit the
podman|docker 10s timeout and get a SIGKILL. That, in turn, causes haproxy
to exit with 143, and puts the systemd unit in a failed state.
This highlights a general problem(?) with stopping containers: if they don't
do it quickly then we'll end up in this error state. We don't directly
address that here.
Avoid this problem by always stopping containers with SIGTERM. In the
haproxy case, that means an immediate shutdown (no graceful drain of
open connections). In theory we could do this only for haproxy with
NFS, but we can easily imagine RGW connections that don't close in 10s
either, and we don't want containers exiting in error state; we just
want the proxy to stop quickly.
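A sketch of how the run arguments might carry the override; the helper is illustrative and the argument list is abbreviated:
```
def container_run_args(image, name):
    # override the image's STOPSIGNAL (haproxy's sets SIGUSR1) so
    # 'podman stop' triggers an immediate, clean exit instead of a
    # 10s wait, a SIGKILL, and exit code 143
    return [
        'podman', 'run', '--rm',
        '--name', name,
        '--stop-signal=SIGTERM',
        image,
    ]
```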
Sage Weil [Mon, 3 May 2021 15:48:45 +0000 (11:48 -0400)]
mgr/orchestrator: default nfs pool, namespaces
Apply nfs default pool (currently 'nfs-ganesha'), and default the
namespace to the service_id.
There is no practical reason for users to ever need to change this, and
requiring them to provide this information at config/apply time just
complicates life.
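A sketch of the defaulting, assuming a spec object with optional pool/namespace fields and a service_id:
```
def apply_nfs_defaults(spec):
    if not getattr(spec, 'pool', None):
        spec.pool = 'nfs-ganesha'       # current default pool
    if not getattr(spec, 'namespace', None):
        spec.namespace = spec.service_id
    return spec
```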
Sage Weil [Wed, 5 May 2021 16:59:44 +0000 (12:59 -0400)]
mgr/nfs: remove 'nfs cluster update'
This command is very awkward to implement unless all service spec fields
are always required. That will soon mean both the placement *and* the
virtual_ip (if any), making it much less convenient for a human to use.
Instead, let them update the yaml, or adjust the nfs and/or ingress specs
directly. I don't think this command is needed.
Sage Weil [Mon, 26 Apr 2021 18:48:03 +0000 (14:48 -0400)]
mgr/cephadm: nfs: add rank to grace file from mgr module
Do the grace file manipulation from the mgr module. For add, this isn't
especially important, but for remove it is very important. Clean out
old ranks from the grace table before we record that the rank has been
purged from the rank_map.
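A sketch of the ordering; grace_remove stands in for the actual grace-table manipulation, which this sketch does not model:
```
def purge_rank(rank_map, service_name, rank, grace_remove):
    # scrub the rank from the ganesha grace table *before* dropping it
    # from the rank_map, so a crash between the two steps cannot leave
    # a stale grace entry with no rank_map record
    grace_remove(f'{service_name}.{rank}')
    del rank_map[rank]
```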
Sage Weil [Fri, 23 Apr 2021 19:33:23 +0000 (15:33 -0400)]
mgr/cephadm: enable ranked daemons for nfs
Use ranked daemons for NFS. Ganesha does not like it if multiple
instances start up with the same rank, but we need stable ranks so that
a rank can "fail over" to a new instance of a new daemon on another host
(with the same rank) for NFS client reclaim to work.
Specify a nodeid of '{service_name}.{rank}' for ganesha.
Include a unique id in the daemon_id because this avoids some issues
with the create/destroy ordering, and because the daemon_id doesn't matter
much anymore now that we are using a stable rank.
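A sketch of the naming scheme; the daemon_id layout is illustrative, only the '{service_name}.{rank}' nodeid comes from the text above:
```
def ganesha_nodeid(service_name, rank):
    # stable across failover: a new daemon on another host reuses the
    # rank, so NFS clients can reclaim their state
    return f'{service_name}.{rank}'

def nfs_daemon_id(service_id, rank, unique):
    # the unique suffix sidesteps create/destroy ordering issues; the
    # rank, not the daemon_id, is what stays stable
    return f'{service_id}.{rank}.{unique}'
```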
Sage Weil [Fri, 23 Apr 2021 19:31:14 +0000 (15:31 -0400)]
mgr/cephadm: support creation of daemons with ranks
- we need to assign all names and update the rank_map before we start
creating daemons.
- if we are using ranks, we should delete old daemons first, and
fence them from the cluster (where possible).
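A sketch of those ordering constraints; every callable here is an illustrative stand-in for mgr/cephadm internals:
```
def deploy_ranked(rank_map, desired, existing, fence, remove, create):
    # 1. assign all names/ranks and persist the rank_map up front
    rank_map.update(desired)
    # 2. fence and delete daemons holding stale ranks before creating
    #    new ones, so two instances never run with the same rank
    for daemon in existing:
        if desired.get(daemon['rank']) != daemon['host']:
            fence(daemon)
            remove(daemon)
    # 3. only now create the ranked daemons
    for rank, host in desired.items():
        create(rank, host)
```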
Sage Weil [Fri, 23 Apr 2021 19:13:05 +0000 (15:13 -0400)]
mgr/cephadm: include service_name in generated DaemonDescription
This makes 'orch ls' match up daemons to services (and probably cleans up
other bits and pieces) when the old daemon id -> service name calculation
code can't do its thing.
sunilkumarn417 [Wed, 19 May 2021 10:02:45 +0000 (15:32 +0530)]
qa/tasks/cephadm: Include bootstrap registry options for downstream
- The registry-url, registry-username, and registry-password bootstrap options
are now supported. This is needed to access monitoring-service container images.
- Use the RHEL-distribution-based cephadm in the download_cephadm task.
* refs/pull/41053/head:
cephfs-top: set the cursor to be invisible
cephfs-top: self-adapt the display according to the window size
cephfs-top: use the default window object from curses.wrapper()
cephfs-top: improve the output
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/41389/head:
qa/tasks/nfs: add test to check if cmds fail on not passing required arguments
mgr/nfs: fix flake8 missing whitespace around parameter equals error
mgr/nfs: annotate _cmd_nfs_* methods return value
doc/cephfs/nfs: add section about ganesha logs
doc/cephfs/nfs: Replace volume/nfs with nfs
doc/cephfs/nfs: add note about export management with volume/nfs interface only
spec: add nfs to spec file
mgr/nfs: Don't enable nfs module by default
mgr/nfs: check for invalid chars in cluster id
mgr/nfs: Use CLICommand wrapper
mgr/nfs: reorg nfs files
mgr/nfs: Check if transport or protocol are list instances
mgr/nfs: reorg cluster class and common helper methods
mgr/nfs: move common export helper methods to ExportMgr class
mgr/nfs: move validate methods into new ValidateExport class
mgr/nfs: add custom exception module
mgr/nfs: create new module for export utils
mgr/nfs: rename fs dir to export
mgr/volumes/nfs: Move nfs code out of volumes plugin
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Ernesto Puerta [Mon, 31 May 2021 11:45:40 +0000 (13:45 +0200)]
mgr/dashboard: pass Grafana datasource in URL
PR https://github.com/ceph/ceph/pull/24314 added support for
specifying the Grafana datasource via $datasource template variable, but
this hadn't been used from the Dashboard side so far.
As per https://grafana.com/docs/grafana/latest/variables/#templates, by
adding `var-datasource=Dashboard1`, Dashboard can specify the
datasource.
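A sketch of the URL construction; the function and parameter names are illustrative, only the var-datasource parameter comes from the text above:
```
from urllib.parse import urlencode

def grafana_panel_url(base_url, dashboard_uid, datasource):
    query = urlencode({'var-datasource': datasource})
    return f'{base_url}/d/{dashboard_uid}?{query}'

# grafana_panel_url('https://grafana:3000', 'host-details', 'Dashboard1')
# -> 'https://grafana:3000/d/host-details?var-datasource=Dashboard1'
```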
With the recent support for async rbd operations in pacific+, when an
older client (without async support) is upgraded while simultaneously
interacting with a newer client that expects requests to be async, it
experiences a hang: the newer client treats the return code for request
completion as merely the acknowledgement of the async request, and then
keeps waiting for another acknowledgement of request completion.
This should be rare (it only happens when the lock owner is an old
client) and can be deferred if compatibility issues arise.