cephmetrics.git
11 months ago Merge pull request #254 from ceph/bz1842390 master v2.0.10
Boris Ranto [Mon, 13 Jul 2020 08:59:57 +0000 (10:59 +0200)]
Merge pull request #254 from ceph/bz1842390

ceph-at-a-glance: Fix Disk IOPS/Throughput

Signed-off-by: Boris Ranto <branto@redhat.com>
11 months ago tox: Fix flake8 invocation 254/head
Zack Cerza [Fri, 10 Jul 2020 19:35:32 +0000 (13:35 -0600)]
tox: Fix flake8 invocation

Signed-off-by: Zack Cerza <zack@redhat.com>
11 months ago ceph-at-a-glance: Fix Disk IOPS/Throughput
Zack Cerza [Thu, 9 Jul 2020 21:13:35 +0000 (15:13 -0600)]
ceph-at-a-glance: Fix Disk IOPS/Throughput

The node-exporter disk-related metrics will sometimes use the db_device
label instead of the wal_device label.

Resolves: rhbz#1842390
Signed-off-by: Zack Cerza <zack@redhat.com>
14 months ago Merge pull request #253 from ceph/bz1650209
Boris Ranto [Thu, 2 Apr 2020 06:12:27 +0000 (08:12 +0200)]
Merge pull request #253 from ceph/bz1650209

DNM: dashboards: Fix broken disk latency chart

Reviewed-by: David Galloway <dgallowa@redhat.com>
Reviewed-by: Boris Ranto <branto@redhat.com>
14 months ago dashboards: Fix broken disk latency chart 253/head
Zack Cerza [Wed, 1 Apr 2020 19:02:03 +0000 (13:02 -0600)]
dashboards: Fix broken disk latency chart

The "All OSD Hosts - Highest Latency" chart's query was missing a
parenthesis, throwing numbers off by large amounts.

Resolves: rhbz#1650209
Signed-off-by: Zack Cerza <zack@redhat.com>
14 months ago Merge pull request #252 from ceph/bz1652233
Boris Ranto [Wed, 1 Apr 2020 17:17:54 +0000 (19:17 +0200)]
Merge pull request #252 from ceph/bz1652233

dashboards: Fix slow/inaccurate OSD down counter

Reviewed-by: Andrew Schoen <aschoen@redhat.com>
Reviewed-by: David Galloway <dgallowa@redhat.com>
Reviewed-by: Boris Ranto <branto@redhat.com>
14 months ago dashboards: Fix slow/inaccurate OSD down counter 252/head
Zack Cerza [Wed, 1 Apr 2020 17:09:30 +0000 (11:09 -0600)]
dashboards: Fix slow/inaccurate OSD down counter

Resolves: rhbz#1652233
Signed-off-by: Zack Cerza <zack@redhat.com>
18 months ago Merge pull request #249 from ceph/wip-no-wal-device v2.0.9
Boris Ranto [Fri, 22 Nov 2019 08:23:58 +0000 (09:23 +0100)]
Merge pull request #249 from ceph/wip-no-wal-device

dashboards: Ignore wal_device label

Reviewed-by: Zack Cerza <zcerza@redhat.com>
18 months ago dashboards: Ignore wal_device label 249/head
Boris Ranto [Tue, 19 Nov 2019 22:17:47 +0000 (23:17 +0100)]
dashboards: Ignore wal_device label

The wal_device label was added to ceph_disk_occupation. We need to
ignore it in these queries to provide proper matching between values.
Otherwise, the query won't return any data. This is
backwards-compatible: if you ignore a non-existing label, nothing will
change.
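
The matching behaviour described above can be illustrated with a small
Python sketch. It is not the dashboard query itself: match_ignoring, the
sample label sets and the values are all hypothetical, and the function
only mimics one-to-one matching on every label except the ignored ones.

```python
def match_ignoring(left, right, ignored):
    """Pair samples whose labels agree once the `ignored` labels are dropped."""
    def key(labels):
        return frozenset((k, v) for k, v in labels.items() if k not in ignored)

    index = {key(labels): value for labels, value in right}
    return [
        (labels, value, index[key(labels)])
        for labels, value in left
        if key(labels) in index
    ]


# Hypothetical samples: a disk metric and two shapes of ceph_disk_occupation.
disk_stats = [({"instance": "osd1", "device": "sdb"}, 120.0)]
occupation_new = [({"instance": "osd1", "device": "sdb", "wal_device": "sdc"}, 1.0)]
occupation_old = [({"instance": "osd1", "device": "sdb"}, 1.0)]

# The extra wal_device label breaks strict matching, so nothing is returned.
strict = match_ignoring(disk_stats, occupation_new, ignored=set())
# Ignoring wal_device restores the match.
relaxed = match_ignoring(disk_stats, occupation_new, ignored={"wal_device"})
# Ignoring a label that does not exist changes nothing.
legacy = match_ignoring(disk_stats, occupation_old, ignored={"wal_device"})

print(len(strict), len(relaxed), len(legacy))
```

The last call is the backwards-compatible case: exporters that never emit
wal_device still match exactly as before.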

Signed-off-by: Boris Ranto <branto@redhat.com>
20 months ago Merge pull request #246 from ceph/bz-1731919 v2.0.8
Boris Ranto [Mon, 16 Sep 2019 12:28:12 +0000 (14:28 +0200)]
Merge pull request #246 from ceph/bz-1731919

Revert comparison "fix" from ansible-lint

Reviewed-by: Boris Ranto <branto@redhat.com>
21 months ago Revert comparison "fix" from ansible-lint 246/head
Zack Cerza [Wed, 11 Sep 2019 15:54:36 +0000 (09:54 -0600)]
Revert comparison "fix" from ansible-lint

In d737083, we thought we were fixing something, but it turns out the
comparison suggested will throw an error if container_name contains a
dash (which it likely will). Revert the change and instruct
ansible-lint to not complain about that error on that line.

Resolves: rhbz#1731919

Signed-off-by: Zack Cerza <zack@redhat.com>
21 months ago Merge pull request #245 from ceph/wip-revert-mgr-change v2.0.7
Zack Cerza [Wed, 11 Sep 2019 15:19:16 +0000 (09:19 -0600)]
Merge pull request #245 from ceph/wip-revert-mgr-change

Revert "Only run ceph-mgr role on a single node"

21 months ago Revert "Only run ceph-mgr role on a single node" 245/head
Boris Ranto [Thu, 5 Sep 2019 17:11:35 +0000 (19:11 +0200)]
Revert "Only run ceph-mgr role on a single node"

This reverts commit 6f7559dfbb7b38463cf5c5d31f6ba1d8c7dcef0c.

Signed-off-by: Boris Ranto <branto@redhat.com>
21 months ago Merge pull request #243 from ceph/wip-no-data-silence
Boris Ranto [Fri, 30 Aug 2019 07:57:42 +0000 (09:57 +0200)]
Merge pull request #243 from ceph/wip-no-data-silence

prometheus: Silence no data alerts

Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
21 months ago prometheus: Silence no data alerts 243/head
Boris Ranto [Fri, 23 Aug 2019 09:35:06 +0000 (11:35 +0200)]
prometheus: Silence no data alerts

Currently, we are sending an e-mail to the admin when the query returns
no data. This can get annoying as it does not mean that the alert has
actually been hit. We should fix this and alert only if the query says
we should.

Resolves: https://bugzilla.redhat.com/1663289

Signed-off-by: Boris Ranto <branto@redhat.com>
23 months ago Merge pull request #241 from ceph/wip-fix-typo v2.0.6
Zack Cerza [Thu, 11 Jul 2019 14:47:49 +0000 (08:47 -0600)]
Merge pull request #241 from ceph/wip-fix-typo

grafana: Fix typo in smtp configuration

23 months ago grafana: Fix typo in smtp configuration 241/head
Boris Ranto [Thu, 11 Jul 2019 12:21:05 +0000 (14:21 +0200)]
grafana: Fix typo in smtp configuration

The typo actually makes the ansible run fail if you don't have
smtp_enabled defined anywhere else.

Signed-off-by: Boris Ranto <branto@redhat.com>
23 months ago Merge pull request #240 from ceph/wip-preserve-config
Zack Cerza [Wed, 10 Jul 2019 22:24:30 +0000 (16:24 -0600)]
Merge pull request #240 from ceph/wip-preserve-config

grafana: Add option to preserve grafana config

23 months ago grafana: Add option to preserve grafana config 240/head
Boris Ranto [Wed, 10 Jul 2019 08:19:53 +0000 (10:19 +0200)]
grafana: Add option to preserve grafana config

This patch adds the grafana.overwrite_config option. You can change this
option if you want to keep your custom grafana configuration. The
scripts will still update the grafana config with the other configured
options but it won't overwrite your custom options.

Signed-off-by: Boris Ranto <branto@redhat.com>
23 months ago Merge pull request #239 from ceph/wip-fix-patch v2.0.5
Zack Cerza [Tue, 9 Jul 2019 18:44:30 +0000 (12:44 -0600)]
Merge pull request #239 from ceph/wip-fix-patch

patches: Fix patch to apply cleanly

23 months ago patches: Fix patch to apply cleanly 239/head
Boris Ranto [Tue, 9 Jul 2019 18:41:38 +0000 (20:41 +0200)]
patches: Fix patch to apply cleanly

Signed-off-by: Boris Ranto <branto@redhat.com>
23 months ago Merge pull request #238 from zmc/wip-trust-image v2.0.4
Boris Ranto [Thu, 20 Jun 2019 21:14:32 +0000 (23:14 +0200)]
Merge pull request #238 from zmc/wip-trust-image

Allow skipping image verification

Reviewed-by: Boris Ranto <branto@redhat.com>
23 months ago Allow skipping image verification 238/head
Zack Cerza [Thu, 20 Jun 2019 20:52:33 +0000 (14:52 -0600)]
Allow skipping image verification

For the prometheus and grafana containers, in some specific
circumstances it's desirable to skip verification of the container
image. Allow passing that value in via group_vars.

Resolves: rhbz#1636136
Signed-off-by: Zack Cerza <zack@redhat.com>
23 months ago Merge pull request #237 from ceph/wip-mds v2.0.3
Boris Ranto [Thu, 20 Jun 2019 20:39:03 +0000 (22:39 +0200)]
Merge pull request #237 from ceph/wip-mds

dashboards: Show only open sessions

Reviewed-by: Zack Cerza <zcerza@redhat.com>
23 months ago dashboards: Show only open sessions 237/head
Boris Ranto [Wed, 19 Jun 2019 19:40:36 +0000 (21:40 +0200)]
dashboards: Show only open sessions

We should use 'ceph_mds_sessions_sessions_open' instead of
'ceph_mds_sessions_session_count' to compute the number of (active)
clients. Otherwise, we include all the sessions, including the stale
ones.

Resolves: rhbz#1652896
Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago Merge pull request #233 from jjict/master
Boris Ranto [Wed, 12 Jun 2019 13:10:38 +0000 (15:10 +0200)]
Merge pull request #233 from jjict/master

Make SMTP, anonymous login and theme configuratebil

Reviewed-by: Boris Ranto <branto@redhat.com>
2 years ago Make SMTP, anonymous login and theme configuratebil 233/head
jjict [Fri, 29 Mar 2019 12:35:37 +0000 (13:35 +0100)]
Make SMTP, anonymous login and theme configuratebil

2 years ago Merge pull request #234 from marcosmamorim/pull_images
Zack Cerza [Wed, 5 Jun 2019 23:52:55 +0000 (17:52 -0600)]
Merge pull request #234 from marcosmamorim/pull_images

Prevent pull images for prometheus and grafana

2 years ago Merge pull request #236 from zmc/fix-lint
Boris Ranto [Wed, 5 Jun 2019 15:15:53 +0000 (17:15 +0200)]
Merge pull request #236 from zmc/fix-lint

Fix issues raised by newer ansible-lint versions

Reviewed-by: Boris Ranto <branto@redhat.com>
2 years ago ansible-lint: Use delegate_to: localhost 236/head
Zack Cerza [Tue, 4 Jun 2019 23:32:36 +0000 (17:32 -0600)]
ansible-lint: Use delegate_to: localhost

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago ansible-lint: Ignore pipefail warning
Zack Cerza [Tue, 4 Jun 2019 23:08:45 +0000 (17:08 -0600)]
ansible-lint: Ignore pipefail warning

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago ansible-lint: Don't compare to empty string
Zack Cerza [Tue, 4 Jun 2019 23:06:22 +0000 (17:06 -0600)]
ansible-lint: Don't compare to empty string

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago ansible-lint: Allow a tab in this particular line
Zack Cerza [Tue, 4 Jun 2019 23:04:27 +0000 (17:04 -0600)]
ansible-lint: Allow a tab in this particular line

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago Replace improperly copied file w/ symlink
Zack Cerza [Tue, 4 Jun 2019 22:58:49 +0000 (16:58 -0600)]
Replace improperly copied file w/ symlink

The rest of the roles symlink this file; this was a simple oversight.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago ansible-lint: Ignore these lines' length
Zack Cerza [Tue, 4 Jun 2019 22:57:35 +0000 (16:57 -0600)]
ansible-lint: Ignore these lines' length

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago ansible-lint: Ignore missing galaxy_info
Zack Cerza [Tue, 4 Jun 2019 22:57:05 +0000 (16:57 -0600)]
ansible-lint: Ignore missing galaxy_info

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago Merge pull request #235 from servesha/wip-bug-fix
Zack Cerza [Tue, 4 Jun 2019 22:48:13 +0000 (16:48 -0600)]
Merge pull request #235 from servesha/wip-bug-fix

firewalld: added port

2 years ago firewalld: added port 235/head
Servesha Dudhgaonkar [Tue, 23 Apr 2019 06:39:48 +0000 (12:09 +0530)]
firewalld: added port

Signed-off-by: Servesha Dudhgaonkar <sdudhgao@redhat.com>
2 years ago Prevent pull images for prometheus and grafana 234/head
Marcos Amorim [Thu, 4 Apr 2019 18:21:10 +0000 (14:21 -0400)]
Prevent pull images for prometheus and grafana

This patch adds a new prometheus and grafana variable to allow installation when the images have already been pulled by docker.

Signed-off-by: Marcos Amorim <mamorim@redhat.com>
2 years ago Merge pull request #229 from ceph/dashboard-diagrams
Zack Cerza [Tue, 5 Feb 2019 23:40:16 +0000 (16:40 -0700)]
Merge pull request #229 from ceph/dashboard-diagrams

Update dashboard relationship diagrams

2 years ago Update dashboard relationship diagrams 229/head
Paul Cuzner [Thu, 31 Jan 2019 22:33:35 +0000 (11:33 +1300)]
Update dashboard relationship diagrams

The relationships have been updated and an svg
file included for future changes

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2 years ago Merge pull request #228 from ceph/node-detail-fix v2.0.2
Zack Cerza [Thu, 13 Dec 2018 23:49:38 +0000 (16:49 -0700)]
Merge pull request #228 from ceph/node-detail-fix

Fix Host breakdown and disk graphs

2 years ago Fix Host breakdown and disk graphs 228/head
Paul Cuzner [Thu, 13 Dec 2018 22:13:24 +0000 (11:13 +1300)]
Fix Host breakdown and disk graphs

Host breakdown was hitting duplicate labels, and
the disk graphs needed a filter to only show the
disk stats for disks that relate to the host's
OSDs.

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2 years ago Merge pull request #225 from ceph/wip-cluster-osds v2.0.1
Zack Cerza [Tue, 9 Oct 2018 15:10:38 +0000 (09:10 -0600)]
Merge pull request #225 from ceph/wip-cluster-osds

dashboards: Fix cluster OSDs panel

2 years ago dashboards: Fix cluster OSDs panel 225/head
Boris Ranto [Tue, 18 Sep 2018 21:31:14 +0000 (23:31 +0200)]
dashboards: Fix cluster OSDs panel

We still use the old 'id' label for getting the number of OSDs in the
ceph cluster dashboard. However, the label was superseded by the
'ceph_daemon' label in one of the updates to the prometheus exporter in
ceph-mgr. This patch changes the query to what we use in other
dashboards -- i.e. 'ceph_daemon' instead of 'id'.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627725

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago Merge pull request #224 from zmc/wip-205 v2.0
pcuzner [Tue, 28 Aug 2018 19:44:38 +0000 (07:44 +1200)]
Merge pull request #224 from zmc/wip-205

dashboards: Don't filter out 10GbE iface names

2 years ago Merge pull request #222 from ceph/fix-health-status
Zack Cerza [Tue, 28 Aug 2018 19:41:08 +0000 (12:41 -0700)]
Merge pull request #222 from ceph/fix-health-status

Fix health state colors

2 years ago dashboards: Don't filter out 10GbE iface names 224/head
Zack Cerza [Mon, 27 Aug 2018 22:43:37 +0000 (15:43 -0700)]
dashboards: Don't filter out 10GbE iface names

https://github.com/ceph/cephmetrics/issues/205

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago Merge pull request #212 from zmc/wip-prom-etc-hosts
pcuzner [Fri, 24 Aug 2018 02:07:55 +0000 (14:07 +1200)]
Merge pull request #212 from zmc/wip-prom-etc-hosts

ceph-prometheus: Optionally add /etc/hosts entries

2 years ago Merge pull request #219 from ceph/wip-erro-panel
pcuzner [Fri, 24 Aug 2018 02:07:20 +0000 (14:07 +1200)]
Merge pull request #219 from ceph/wip-erro-panel

dashboards: Add ceph error panel to alert status dashboard

2 years ago Fix health state colors 222/head
Paul Cuzner [Fri, 24 Aug 2018 01:59:27 +0000 (13:59 +1200)]
Fix health state colors

Two dashboards were translating the values from
ceph_health_status incorrectly, resulting in
the wrong health state being shown.

Closes: https://github.com/ceph/cephmetrics/issues/202

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2 years ago dashboards: Add error panel+alerting 219/head
Boris Ranto [Tue, 21 Aug 2018 10:34:03 +0000 (12:34 +0200)]
dashboards: Add error panel+alerting

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago ceph-prometheus: Optionally add /etc/hosts entries 212/head
Zack Cerza [Thu, 2 Aug 2018 21:24:37 +0000 (14:24 -0700)]
ceph-prometheus: Optionally add /etc/hosts entries

This only supports containerized deployments.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago Merge pull request #215 from zmc/wip-nexp-service-name
Zack Cerza [Tue, 14 Aug 2018 03:09:05 +0000 (20:09 -0700)]
Merge pull request #215 from zmc/wip-nexp-service-name

ceph-node-exporter: Fix defaults for service_name

2 years ago Merge pull request #216 from zmc/wip-mgr-firewall
Zack Cerza [Tue, 14 Aug 2018 03:08:18 +0000 (20:08 -0700)]
Merge pull request #216 from zmc/wip-mgr-firewall

Open ports 9283 and 9090

2 years ago ceph-prometheus: Open port 9090 216/head
Zack Cerza [Wed, 8 Aug 2018 21:19:07 +0000 (14:19 -0700)]
ceph-prometheus: Open port 9090

https://github.com/ceph/cephmetrics/issues/214

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago ceph-mgr: Open port 9283
Zack Cerza [Tue, 7 Aug 2018 18:32:59 +0000 (11:32 -0700)]
ceph-mgr: Open port 9283

https://github.com/ceph/cephmetrics/issues/213

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago ceph-node-exporter: Fix defaults for service_name 215/head
Zack Cerza [Tue, 7 Aug 2018 05:51:26 +0000 (22:51 -0700)]
ceph-node-exporter: Fix defaults for service_name

Indentation was off.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago Merge pull request #211 from ceph/wip-mgr-optimize
Boris Ranto [Tue, 31 Jul 2018 18:13:00 +0000 (20:13 +0200)]
Merge pull request #211 from ceph/wip-mgr-optimize

Only run ceph-mgr role on a single node

Reviewed-by: Zack Cerza <zcerza@redhat.com>
2 years ago Merge pull request #210 from ceph/wip-cluster-name
Zack Cerza [Tue, 31 Jul 2018 16:43:46 +0000 (09:43 -0700)]
Merge pull request #210 from ceph/wip-cluster-name

ansible: Support non-default cluster name

2 years ago Only run ceph-mgr role on a single node 211/head
Boris Ranto [Sat, 28 Jul 2018 12:02:28 +0000 (14:02 +0200)]
Only run ceph-mgr role on a single node

We do not need to run the ceph-mgr role multiple times. The command that
enables the module only needs to be run on one of the machines; we choose
the first as it is the easiest.

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago ansible: Support non-default cluster name 210/head
Boris Ranto [Fri, 27 Jul 2018 23:48:19 +0000 (01:48 +0200)]
ansible: Support non-default cluster name

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago Merge pull request #209 from ceph/wip-branto
pcuzner [Thu, 26 Jul 2018 02:33:47 +0000 (14:33 +1200)]
Merge pull request #209 from ceph/wip-branto

Downstream fixes

2 years ago osd-node-detail: Fix value repetition 209/head
Boris Ranto [Wed, 25 Jul 2018 16:16:02 +0000 (18:16 +0200)]
osd-node-detail: Fix value repetition

We did not sum the values for RAM usage so we ended up with a couple of
entries being shown for each RAM usage query.

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago ansible: Fix service_name indentation
Boris Ranto [Wed, 25 Jul 2018 16:12:38 +0000 (18:12 +0200)]
ansible: Fix service_name indentation

The service_name for node_exporter was not indented properly.

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago Merge pull request #203 from ceph/wip-rpm-patch
Zack Cerza [Mon, 16 Jul 2018 15:57:09 +0000 (09:57 -0600)]
Merge pull request #203 from ceph/wip-rpm-patch

rpm: use_epel is no longer defined

2 years ago Merge pull request #208 from ceph/fix-throughput-units
Boris Ranto [Mon, 16 Jul 2018 07:30:32 +0000 (09:30 +0200)]
Merge pull request #208 from ceph/fix-throughput-units

Fix units used for throughput

Reviewed-by: Boris Ranto <branto@redhat.com>
2 years ago Fix units used on throughput charts 208/head
Paul Cuzner [Fri, 13 Jul 2018 00:24:35 +0000 (12:24 +1200)]
Fix units used on throughput charts

Applies the same change to the prometheus based
dashboards as commit 239afda4debc80544112fe98d31af3f692678f57 did for
the graphite based dashboards

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2 years ago Merge pull request #204 from ceph/osd-info-updates
pcuzner [Thu, 12 Jul 2018 23:51:24 +0000 (11:51 +1200)]
Merge pull request #204 from ceph/osd-info-updates

Multiple fixes to OSD information dashboard

2 years ago Fix units used for throughput
Paul Cuzner [Thu, 12 Jul 2018 23:49:33 +0000 (11:49 +1200)]
Fix units used for throughput

Throughput units were using binary representation,
whereas the expected unit is decimal (i.e.
MB/s not MiB/s)
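
The difference between the two representations is easy to see in a short
Python sketch. The function names and the sample rate below are
illustrative only and do not come from the dashboards.

```python
def to_decimal_mb(bytes_per_sec):
    """Convert bytes/s to decimal megabytes per second (MB/s, 10**6 bytes)."""
    return bytes_per_sec / 1_000_000


def to_binary_mib(bytes_per_sec):
    """Convert bytes/s to binary mebibytes per second (MiB/s, 2**20 bytes)."""
    return bytes_per_sec / (1024 * 1024)


rate = 104_857_600  # hypothetical throughput in bytes per second

print(f"{to_decimal_mb(rate):.4f} MB/s")
print(f"{to_binary_mib(rate):.4f} MiB/s")
```

The same byte rate reads roughly 4.9% higher in MB/s than in MiB/s, so a
chart that uses one unit cannot be compared directly with a figure quoted
in the other.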

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1496186

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2 years ago Multiple fixes to OSD information dashboard 204/head
Paul Cuzner [Tue, 10 Jul 2018 23:41:30 +0000 (11:41 +1200)]
Multiple fixes to OSD information dashboard

Bluestore tables and charts updated, including:
- switched units from ms to secs, which shows µs too
- changed metric from commit to KV latency
- updated thresholds in bluestore tables
- switched from rate to irate for bluestore metrics
- updated bluestore text box description

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2 years ago rpm: use_epel is no longer defined 203/head
Boris Ranto [Fri, 29 Jun 2018 11:25:17 +0000 (13:25 +0200)]
rpm: use_epel is no longer defined

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago Merge pull request #201 from zmc/wip-uneditable-dbs
pcuzner [Tue, 3 Jul 2018 21:51:47 +0000 (09:51 +1200)]
Merge pull request #201 from zmc/wip-uneditable-dbs

Make dashboards uneditable by default

2 years ago Merge pull request #200 from zmc/wip-osd-info-db
pcuzner [Tue, 3 Jul 2018 21:50:28 +0000 (09:50 +1200)]
Merge pull request #200 from zmc/wip-osd-info-db

ceph-osd-information: Fix numerous bugs

2 years ago ceph-osd-information: Fix numerous bugs 200/head
Zack Cerza [Thu, 28 Jun 2018 19:31:41 +0000 (13:31 -0600)]
ceph-osd-information: Fix numerous bugs

Too many to list here; almost every panel was
broken in some regard.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago Merge pull request #197 from ceph/add-rgw-latencies
Zack Cerza [Tue, 3 Jul 2018 18:07:58 +0000 (12:07 -0600)]
Merge pull request #197 from ceph/add-rgw-latencies

Added RGW GET/PUT Latencies

2 years ago Merge pull request #196 from ceph/iscsi-db-updates
Zack Cerza [Tue, 3 Jul 2018 18:07:02 +0000 (12:07 -0600)]
Merge pull request #196 from ceph/iscsi-db-updates

Updated to use OS metrics from default scrape job

2 years ago Merge pull request #199 from zmc/wip-cluster-db
Zack Cerza [Tue, 3 Jul 2018 18:06:08 +0000 (12:06 -0600)]
Merge pull request #199 from zmc/wip-cluster-db

ceph-cluster: Fix column styles for version tables

2 years ago Merge pull request #198 from zmc/wip-iscsi-prom
Zack Cerza [Fri, 29 Jun 2018 15:23:10 +0000 (09:23 -0600)]
Merge pull request #198 from zmc/wip-iscsi-prom

Scrape iSCSI-related exporters

2 years ago Make dashboards uneditable by default 201/head
Zack Cerza [Thu, 28 Jun 2018 22:18:51 +0000 (16:18 -0600)]
Make dashboards uneditable by default

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago ceph-prometheus: Scrape iscsi gateway exporter 198/head
Zack Cerza [Wed, 27 Jun 2018 22:58:42 +0000 (16:58 -0600)]
ceph-prometheus: Scrape iscsi gateway exporter

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago playbook: Install node_exporter on iscsi gateways
Zack Cerza [Wed, 27 Jun 2018 22:57:44 +0000 (16:57 -0600)]
playbook: Install node_exporter on iscsi gateways

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago Added RGW GET/PUT Latencies 197/head
Paul Cuzner [Thu, 28 Jun 2018 21:58:35 +0000 (09:58 +1200)]
Added RGW GET/PUT Latencies

Added multiple charts showing GET/PUT latencies
at overview and RGW detail levels. In addition
the failed HTTP request panel has been changed
from a singlestat to a graph to visualize the
failure rates across all RGW instances.

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2 years ago Updated to use OS metrics from default scrape job 196/head
Paul Cuzner [Thu, 28 Jun 2018 04:47:03 +0000 (16:47 +1200)]
Updated to use OS metrics from default scrape job

All node_exporter scrapes are now done under the
same job (called node) so the dashboard now uses
an updated template query to identify the correct
host to pull out the OS metrics by iscsi gateway

Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
2 years ago ceph-cluster: Fix column styles for version tables 199/head
Zack Cerza [Tue, 26 Jun 2018 20:32:19 +0000 (14:32 -0600)]
ceph-cluster: Fix column styles for version tables

We were using 'id', not 'ceph_daemon' and as a result that column wasn't
showing up.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago Merge pull request #195 from zmc/wip-dashboard-fixes
Zack Cerza [Tue, 26 Jun 2018 18:50:30 +0000 (12:50 -0600)]
Merge pull request #195 from zmc/wip-dashboard-fixes

Fix mon_server queries in network-usage-by-node and set home db

2 years ago Merge pull request #187 from ceph/wip-rpm
Zack Cerza [Tue, 26 Jun 2018 18:45:09 +0000 (12:45 -0600)]
Merge pull request #187 from ceph/wip-rpm

rpm: Update spec file for recent changes

2 years ago rpm: Modify node_exporter service name 187/head
Boris Ranto [Tue, 26 Jun 2018 18:20:43 +0000 (20:20 +0200)]
rpm: Modify node_exporter service name

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago ceph-grafana: Set the admin users's home db 195/head
Zack Cerza [Fri, 22 Jun 2018 21:58:16 +0000 (15:58 -0600)]
ceph-grafana: Set the admin users's home db

If a human (or API request) changes the home dashboard for the admin
account, our method of setting it at the org level will no longer be
effective. Let's keep the admin user's home dashboard set to
ceph-at-a-glance.

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago ceph-node-exporter: Fix inaccurate task name
Zack Cerza [Fri, 22 Jun 2018 18:57:00 +0000 (12:57 -0600)]
ceph-node-exporter: Fix inaccurate task name

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago network-usage-by-node: Fix mon_server queries
Zack Cerza [Fri, 22 Jun 2018 18:55:21 +0000 (12:55 -0600)]
network-usage-by-node: Fix mon_server queries

The variable queries needed to drop 'mon.' from the mon names, and the
panel queries needed '[[mon_servers]]' to be wrapped in parentheses.
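
Both fixes can be sketched in Python under some loud assumptions: the mon
names, the ":9100" suffix and the idea that the variable expands to a bare
"a|b" alternation are hypothetical, chosen only to show why the grouping
matters. The real dashboard queries are not reproduced here.

```python
import re


def strip_mon_prefix(name):
    """Drop a leading 'mon.' so the name can be compared with a host name."""
    return name[len("mon."):] if name.startswith("mon.") else name


mon_names = ["mon.ceph-a", "mon.ceph-b"]  # hypothetical mon names
hosts = [strip_mon_prefix(n) for n in mon_names]
alternation = "|".join(re.escape(h) for h in hosts)

# Without parentheses the suffix binds only to the last alternative.
bare = alternation + ":9100"
# With parentheses the suffix applies to every alternative.
grouped = "(" + alternation + "):9100"

print(hosts)
print(bool(re.fullmatch(bare, "ceph-a:9100")))
print(bool(re.fullmatch(grouped, "ceph-a:9100")))
```

In the unparenthesized pattern only the last host matches with its suffix,
which is the kind of silent mismatch that grouping avoids.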

Signed-off-by: Zack Cerza <zack@redhat.com>
2 years ago rpm: Modify container name/version
Boris Ranto [Wed, 6 Jun 2018 15:14:36 +0000 (17:14 +0200)]
rpm: Modify container name/version

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago rpm: Update spec file for recent changes
Boris Ranto [Wed, 23 May 2018 09:54:49 +0000 (11:54 +0200)]
rpm: Update spec file for recent changes

Signed-off-by: Boris Ranto <branto@redhat.com>
2 years ago Merge pull request #191 from zmc/wip-osp-fixes
Boris Ranto [Wed, 20 Jun 2018 07:44:32 +0000 (09:44 +0200)]
Merge pull request #191 from zmc/wip-osp-fixes

Fixes for OSP

Reviewed-by: Boris Ranto <branto@redhat.com>
2 years ago Merge pull request #192 from zmc/unpin-testinfra
Boris Ranto [Wed, 20 Jun 2018 07:31:23 +0000 (09:31 +0200)]
Merge pull request #192 from zmc/unpin-testinfra

Unpin testinfra

Reviewed-by: Boris Ranto <branto@redhat.com>
3 years ago Unpin testinfra 192/head
Zack Cerza [Wed, 13 Jun 2018 21:36:51 +0000 (15:36 -0600)]
Unpin testinfra

1.14.0 has been released with the fix we needed.

Signed-off-by: Zack Cerza <zack@redhat.com>
3 years ago node_exporter: Allow custom service name 191/head
Zack Cerza [Thu, 7 Jun 2018 22:40:24 +0000 (16:40 -0600)]
node_exporter: Allow custom service name

Signed-off-by: Zack Cerza <zack@redhat.com>
3 years ago ceph-mgr: Cope with differently-named containers
Zack Cerza [Wed, 6 Jun 2018 19:23:22 +0000 (13:23 -0600)]
ceph-mgr: Cope with differently-named containers

We were expecting ceph-mgr@hostname, but let's also look for
ceph-mgr-hostname.
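
One way to accept both naming schemes is a single pattern that allows
either separator. The Python sketch below is illustrative only: the helper
name and host name are hypothetical, and the role itself may look the
container up differently.

```python
import re


def mgr_container_pattern(hostname):
    """Match 'ceph-mgr@<hostname>' as well as 'ceph-mgr-<hostname>'."""
    return re.compile(r"ceph-mgr[@-]" + re.escape(hostname) + r"$")


pattern = mgr_container_pattern("node1")  # hypothetical host name

for name in ("ceph-mgr@node1", "ceph-mgr-node1", "ceph-mgr_node1"):
    print(name, bool(pattern.match(name)))
```

Escaping the host name keeps dots or dashes in it from being read as regex
syntax.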

Signed-off-by: Zack Cerza <zack@redhat.com>
3 years ago Merge pull request #188 from zmc/wip-containers
Zack Cerza [Wed, 6 Jun 2018 18:55:59 +0000 (12:55 -0600)]
Merge pull request #188 from zmc/wip-containers

containers: grafana version; systemd unit comments

3 years ago Merge pull request #190 from zmc/wip-db-fixes
Zack Cerza [Wed, 6 Jun 2018 18:55:44 +0000 (12:55 -0600)]
Merge pull request #190 from zmc/wip-db-fixes

Add some dashboard tests, and make them pass