Kefu Chai [Wed, 12 Feb 2020 04:24:50 +0000 (12:24 +0800)]
rpm: define weak_deps for el8
RHEL/CentOS 8 comes with rpm 4.14, see
https://centos.pkgs.org/8/centos-baseos-x86_64/rpm-4.14.2-25.el8.x86_64.rpm.html
and
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/packaging_and_distributing_software/new-features-in-rhel-8_packaging-and-distributing-software
and since "Recommends" was introduced in rpm 4.12, see
https://fedoraproject.org/wiki/Changes/RPM-4.12 ,
we are able to use "Recommends" on el8 as well.
Kefu Chai [Thu, 7 Mar 2019 12:28:24 +0000 (20:28 +0800)]
rpm: use Recommends on fedora also
"Recommends" and other weak dependencies were introduced in rpm 4.12. it
is included by quite a few distros, including fedora 21 and up, and
recent SUSE distros. but RHEL7 still ships rpm 4.11. see
https://fedoraproject.org/wiki/Changes/RPM-4.12 and
https://software.opensuse.org/package/rpm . so we enable Recommends on
fedora and SUSE distros.
dashboard: Resolve FQDN / hostname mismatch in hosts overview panel
In the AVG Disk Utilization panel, the result is calculated
by combining the output of node_disk_io_time_seconds_total
with the output of ceph_disk_occupation. However, the
first vector encodes the instance label with the full FQDN
while the ceph label only contains the hostname:port. In
order for these to match correctly, the domain name and port
have to be stripped from the labels.
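The label normalization described above can be illustrated outside PromQL. This is a minimal Python sketch, assuming instance labels of the form `host[.domain][:port]`; the helper name `short_hostname` and the sample label values are hypothetical, and the actual fix was made in the dashboard's query expression, not in Python.

```python
def short_hostname(instance: str) -> str:
    """Strip the port and the domain name from an instance label.

    Hypothetical helper for illustration only; assumes a
    "host[.domain][:port]" label and does not handle IP addresses.
    """
    host = instance.split(":", 1)[0]   # drop ":port" if present
    return host.split(".", 1)[0]       # drop ".domain" if present


# Assumed sample labels: both styles reduce to the same join key.
node_label = "ceph-node1.example.com:9100"   # FQDN style
ceph_label = "ceph-node1:9283"               # hostname:port style
print(short_hostname(node_label) == short_hostname(ceph_label))
```

Once both sides carry the same short hostname, the two vectors can be matched one-to-one on that label.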
When moving to LVM-based ceph-volume setups, several
grafana dashboards stopped working. The problem is that
(device, instance) no longer results in unique labels
which causes errors like:
"many-to-many matching not allowed: matching labels must be unique on one side"
The references to `$osd_hosts` etc. were encoded as
`[[osd_hosts]]` in the PromQL expression divisor, and
the panel always displayed N/A as the result of the
query.
Replacing the `[[...]]` with `$...` makes the expression
work again.
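The substitution described above is mechanical, so it can be sketched as a small rewrite step. This is only an illustration: the function name `modernize_variables` and the sample expression are hypothetical, and the real change was made by editing the dashboard definitions.

```python
import re

# Matches the legacy [[variable]] template syntax.
LEGACY_VAR = re.compile(r"\[\[(\w+)\]\]")


def modernize_variables(expr: str) -> str:
    """Rewrite every [[name]] reference as $name."""
    return LEGACY_VAR.sub(r"$\1", expr)


# Made-up expression; PromQL range selectors such as [1m] use single
# brackets and are left untouched.
expr = 'sum(rate(some_metric[1m])) / count(up{instance=~"[[osd_hosts]]"})'
print(modernize_variables(expr))
```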
mgr/dashboard: show alert panel if prometheus/alertmanager is unconfigured
If the tabs under the "Monitoring" page aren't properly configured, a
notification is shown which explains to the user which setting needs to be
enabled and also provides a link to the corresponding documentation.
Fixes: https://tracker.ceph.com/issues/42877
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 460f7bb3272c6536c9a5fc0919071d7c17e9aa5a)
by adding the previously added monitoring related features as well as
the newest feature addition. Extends the documentation where necessary
to describe the Prometheus alert configuration.
Fixes: https://tracker.ceph.com/issues/42877
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 36421284c315baf7e79a8c0586ca98ac0126037e)
mgr/dashboard: move monitoring tabs to a single page
with a tab for 'active alerts', 'all alerts' and 'silences'. Due to
ambiguity with existing names, `AlertListComponent` has been renamed to
`ActiveAlertListComponent`. Introduces `MonitoringListComponent` as
first page for monitoring concerns, using path `/monitoring`.
Keeps the activated tab open, independent of the way that's used to go
back to the previous page, be it the cancel button or submit button or
the link on the breadcrumb. Also keeps the active tab open even when the
page is reloaded.
Fixes: https://tracker.ceph.com/issues/42877
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 855f214b29c8ed935c8f4ba0b8a8396692f946a1)
mgr/dashboard: refactor test of Prometheus alert service
The way the test was mocked removed the asynchronous nature of the
test. By using an Observable, the test stays asynchronous and that
behaviour can be tested as well.
because in teuthology we are using six.ensure_str, which was added in
six 1.12.0, see https://github.com/benjaminp/six/blob/1.12.0/CHANGES ,
we cannot continue using six 1.11.0. as a result, we need to switch over to
six>=1.12.0. since the latest stable version of six is now 1.14.0, let's
just use it.
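For readers unfamiliar with the helper, the following is a simplified, Python 3 only stand-in for what `six.ensure_str` provides. The name `ensure_str_py3` is hypothetical, and the real function in six also covers Python 2, which this sketch does not attempt.

```python
def ensure_str_py3(value, encoding="utf-8", errors="strict"):
    """Return `value` as a native str, decoding bytes if necessary.

    Simplified stand-in for illustration; not the six implementation.
    """
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    raise TypeError("not expecting type '%s'" % type(value))


print(ensure_str_py3(b"ceph"))
print(ensure_str_py3("ceph"))
```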
Brad Hubbard [Fri, 1 Nov 2019 01:08:36 +0000 (11:08 +1000)]
tools/rados: Unmask '-o' to restore original behaviour
0b369e1aff1 masked the original behaviour of '-o', which was to indicate
'outfile' as documented in the man page. Changing the object-size option
to a capital 'O' restores the original behaviour.
osd/OSDMap: Show health warning if a pool is configured with size 1
Introduce a config option called 'mon_warn_on_pool_no_redundancy' that is
used to show a health warning if any pool in the ceph cluster is
configured with a size of 1. The user can mute/unmute the warning using
'ceph health mute/unmute POOL_NO_REDUNDANCY'.
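The check described above can be sketched in a few lines. This is a minimal Python illustration, not the OSDMap code: the function name, the dictionary layout and the message wording are assumptions, and `warn_enabled` merely stands in for the `mon_warn_on_pool_no_redundancy` option.

```python
def pool_no_redundancy_check(pools, warn_enabled=True):
    """Return a health-warning dict, or None if there is nothing to report.

    `pools` maps pool name -> configured size (hypothetical layout).
    `warn_enabled` stands in for mon_warn_on_pool_no_redundancy.
    """
    if not warn_enabled:
        return None
    offenders = sorted(name for name, size in pools.items() if size == 1)
    if not offenders:
        return None
    return {
        "code": "POOL_NO_REDUNDANCY",
        # Illustrative wording only.
        "summary": "%d pool(s) configured with size 1" % len(offenders),
        "detail": ["pool '%s' has size 1" % name for name in offenders],
    }


pools = {"rbd": 3, "scratch": 1}
print(pool_no_redundancy_check(pools))
print(pool_no_redundancy_check(pools, warn_enabled=False))
```

Setting the option to false, as the test template does, corresponds to the `warn_enabled=False` path here.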
Add standalone test to verify warning on setting pool size=1. Set the
associated warning to 'false' in ceph.conf.template under qa/tasks so
that existing tests do not break.
Conflicts:
PendingReleaseNotes
- Added release notes under 14.2.9
qa/standalone/mon/health-mute.sh
- Deleted the script as 'health mute/unmute' cmd is unavailable in nautilus
qa/tasks/ceph.conf.template
- Removed a flag not available in nautilus
src/common/options.cc
- Removed a flag not available in nautilus
src/osd/OSDMap.cc
mgr/DaemonServer.{h,cc} deals with raw pointers while master uses ref_t<>
cast -- adjust to that. a minor conflict in the header and the metrics
templatization is not backported to nautilus. also, DaemonKey is a std::pair
in nautilus but a struct in master -- that requires a change in referencing
daemon type and name.
Venky Shankar [Sat, 8 Feb 2020 09:36:42 +0000 (04:36 -0500)]
mgr: helper function to check if a service is a normal ceph service
This would be widely required since ceph metadata server entries are
maintained in service map (DaemonServer::pending_service_map). Such
normal ceph services would need to be filtered when processing the service
map to avoid extraneous entries getting processed.
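The idea of the helper can be sketched as follows. This is a Python illustration of a change that was made in C++; the set of "normal" daemon types, the function names and the service-map layout are all assumptions, not the helper's actual definition.

```python
# Assumed set of core daemon types; the real helper's list may differ.
NORMAL_CEPH_SERVICES = frozenset({"mon", "mgr", "osd", "mds"})


def is_normal_ceph_service(service_type):
    """Return True if the service type is a core ceph daemon."""
    return service_type in NORMAL_CEPH_SERVICES


def extra_services(service_map):
    """Drop normal ceph services so only other entries are processed."""
    return {
        name: daemons
        for name, daemons in service_map.items()
        if not is_normal_ceph_service(name)
    }


# Hypothetical service map: service type -> daemon names.
service_map = {"mds": ["a"], "rgw": ["gw1"], "rbd-mirror": ["m1"]}
print(extra_services(service_map))
```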
This commit undoes the service daemon registration for the MDS. It doesn't look
absolutely necessary and it causes the MDS to be listed twice in the `ceph
versions` output:
Fixing that requires looking for duplicates or ignoring MDSs in the
service daemons when the mon processes `ceph versions`. I have a feeling
that it wasn't actually designed to be used by the MDS this way however.
Additionally, the reason for the "unknown" version is that the metadata
sent to the mgr does not include "ceph_version".
- Make explicit the check for getting removed from the MDSMap. This was
only done before by checking if MDS held a rank which does not check the
case where a standby is removed from the FSMap.
- Use mds_info_t::dump to simplify various debug output.
- Add a few sanity asserts for invalid state transitions.
mgr, mon: allow normal ceph services to register with manager
Additionally, introduce `task status` field in manager report
messages to forward status of executing tasks in daemons (e.g.,
status of executing scrubs in ceph metadata servers).
`task status` makes its way up to the service map, which is then used
to display the relevant information in ceph status.
"The default values are handled by mgr_module.py's _get_module_option();
the or here means that we break any non-true (0, false, none) value and
override it with the default."
Alfonso Martínez [Tue, 24 Mar 2020 08:34:55 +0000 (09:34 +0100)]
mgr/dashboard: fix error when enabling SSO with cert. file
Nautilus dedicated fix: added py2 compatibility code.
Also:
* Disabled security setting 'wantNameIdEncrypted': not all Identity Providers support this and we are already requiring encrypted assertions (which is the default).
Fixes: https://tracker.ceph.com/issues/44666
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Sage Weil [Tue, 21 Jan 2020 16:43:04 +0000 (10:43 -0600)]
pybind/mgr/*: fix config_notify handling of default values
The default values are handled by mgr_module.py's _get_module_option();
the or here means that we break any non-true (0, false, none) value and
override it with the default.
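The pitfall with `or` can be reproduced in isolation. This is a minimal sketch, assuming a plain dictionary of options; `DEFAULTS`, `get_option_buggy` and `get_option_fixed` are hypothetical names and do not reproduce the mgr_module.py API.

```python
DEFAULTS = {"retries": 3, "verbose": True}   # hypothetical module options


def get_option_buggy(configured, name):
    # Bug: `or` treats 0, False and "" as unset and replaces them
    # with the default.
    return configured.get(name) or DEFAULTS[name]


def get_option_fixed(configured, name):
    # Fall back to the default only when the option is really absent.
    value = configured.get(name)
    return DEFAULTS[name] if value is None else value


configured = {"retries": 0, "verbose": False}
print(get_option_buggy(configured, "retries"))   # 3: the explicit 0 is lost
print(get_option_fixed(configured, "retries"))   # 0
```

When the lookup layer already applies defaults, as the message says `_get_module_option()` does, the simplest fix is to drop the `or` fallback at the call site entirely.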
Conflicts:
src/pybind/mgr/cephadm/module.py
- nautilus has no "cephadm" module. It does have an "orchestrator_cli"
module but it doesn't contain the code being patched
src/pybind/mgr/hello/module.py
- nautilus has a "hello" module, but it doesn't contain the code being
patched
Since the codebase is very different and a backport is not recommended or even
possible, I have created this commit with only the minimal code necessary.
Matthew Oliver [Tue, 4 Feb 2020 02:29:48 +0000 (13:29 +1100)]
ceph_argparse: increment matchcnt on kwargs
Currently when you pass a param in on the ceph cli as a kwarg
(--<param_name>) the matchcnt isn't incremented in the validate method
which is used to choose the right command signature.
The '--realm_name' and '--zone_name' kwargs aren't counted towards matchcnt, so
'orchestrator rgw rm' isn't picked as the valid command.
This patch simply corrects this by incrementing matchcnt on the kwarg
validate path before short-circuiting the loop.
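The effect of the missing increment can be shown with a toy model. This sketch is not ceph_argparse's code: the signature layout, the `validate` function and the `count_kwargs` switch are hypothetical and exist only to contrast the behaviour before and after the patch.

```python
def validate(signature, words, kwargs, count_kwargs=True):
    """Toy model: return the number of matched arguments, or -1 on mismatch."""
    prefix = signature["prefix"]
    params = list(signature["params"])
    # Prefix words must match literally, one per position.
    if words[:len(prefix)] != prefix:
        return -1
    matchcnt = len(prefix)
    for name in kwargs:
        if name not in params:
            return -1
        params.remove(name)
        if count_kwargs:
            matchcnt += 1      # the increment the patch adds
    positional = words[len(prefix):]
    if len(positional) > len(params):
        return -1
    return matchcnt + len(positional)


sig = {"prefix": ["orchestrator", "rgw", "rm"],
       "params": ["realm_name", "zone_name"]}
words = ["orchestrator", "rgw", "rm"]
kwargs = {"realm_name": "r", "zone_name": "z"}
print(validate(sig, words, kwargs, count_kwargs=False))   # 3
print(validate(sig, words, kwargs, count_kwargs=True))    # 5
```

Because the signature with the highest count is the one chosen, an undercounted signature can lose to a less specific candidate.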
Fixes: https://tracker.ceph.com/issues/43803
Signed-off-by: Matthew Oliver <moliver@suse.com>
(cherry picked from commit cb37c9ee609864a078edf38d98608bd8cc18cbd7)
Conflicts:
test: exclude helper method from nosetest discovery
On nautilus the assertion helper was recognized by nosetest as a test
even though it doesn't start with the test_ prefix. Explicitly decorate it
with @nottest
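The decorator works by setting a marker attribute that the collector honours. This is a minimal sketch using a local stand-in so it runs without nose installed; the helper name `assert_test_result` is hypothetical, and real code would import `nottest` from `nose.tools` instead.

```python
def nottest(func):
    """Local stand-in for nose.tools.nottest: mark a callable as not a test."""
    func.__test__ = False
    return func


@nottest
def assert_test_result(actual, expected):   # hypothetical helper name
    """Shared assertion helper that must not be collected as a test."""
    assert actual == expected


def test_addition():
    assert_test_result(1 + 1, 2)


print(assert_test_result.__test__)
```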
Yao Zongyou [Tue, 3 Mar 2020 15:34:26 +0000 (15:34 +0000)]
rgw: clear ent_list for each loop of bucket list
if ent_list is not cleared, the old elements will be checked repeatedly
and will occupy more memory.
Fixes: http://tracker.ceph.com/issues/44394
Signed-off-by: Yao Zongyou <yaozongyou@vip.qq.com>
(cherry picked from commit f63bf47aa464c345c907c748dfdbbc5a239d8488)
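The ent_list fix above follows a common pattern for paginated listings. This is a Python sketch of that pattern, not the rgw code: `list_all`, `make_fetcher` and the assumption that each fetch appends to the caller's list are all hypothetical.

```python
def make_fetcher(keys):
    """Build a fake paginated source over `keys` (illustration only)."""
    def fetch_page(marker, limit, out):
        start = 0 if marker is None else keys.index(marker) + 1
        page = keys[start:start + limit]
        out.extend(page)                 # appends to the caller's list
        truncated = start + limit < len(keys)
        return (page[-1] if page else marker), truncated
    return fetch_page


def list_all(fetch_page, page_size=2):
    """Drain a paginated listing, processing each entry exactly once."""
    processed = []
    ent_list = []
    marker = None
    while True:
        ent_list.clear()                 # without this, old entries pile up
        marker, truncated = fetch_page(marker, page_size, ent_list)
        processed.extend(ent_list)       # stand-in for the per-entry checks
        if not truncated:
            return processed


print(list_all(make_fetcher(["a", "b", "c", "d", "e"])))
```

Without the `clear()` call, every iteration would re-process the entries of all earlier pages and the list would keep growing for the whole listing.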
anurag [Wed, 11 Mar 2020 14:17:05 +0000 (19:47 +0530)]
mgr/dashboard: Pool read/write OPS shows too many decimal places
Fixes: https://tracker.ceph.com/issues/39714
Signed-off-by: anurag <anurag@localhost.localdomain>
(cherry picked from commit 27a2bbb12614b7aba0561c027346d9b5427f2405)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/shared/datatable/table/table.component.spec.ts,
src/pybind/mgr/dashboard/frontend/src/app/shared/datatable/table/table-key-value/table-key-value.component.spec.ts:
added import of PipesModule to Angular unit tests