git-server-git.apps.pok.os.sepia.ceph.com Git - cephmetrics.git/log
cephmetrics.git
8 years ago backend-storage: cosmetic changes to heatmap and graphs
Paul Cuzner [Thu, 29 Jun 2017 03:09:01 +0000 (15:09 +1200)]
backend-storage: cosmetic changes to heatmap and graphs

Heatmap 'spectrum' was not showing well in the light theme - this
change makes it more readable. In addition, 'info' has been added to explain
the heatmap and what it represents.

8 years ago frontend: layout of the pools is now by pool lining up iops/throughput charts across...
Paul Cuzner [Thu, 29 Jun 2017 03:06:15 +0000 (15:06 +1200)]
frontend: layout of the pools is now by pool lining up iops/throughput charts across rows

8 years ago at-a-glance : query and cosmetic changes
Paul Cuzner [Thu, 29 Jun 2017 03:04:17 +0000 (15:04 +1200)]
at-a-glance : query and cosmetic changes

Multiple changes, as follows:
- dashboard links (top): hover-over was misaligned; fixed
- forecast value now shows N/A if negative
- forecast and growth queries updated
- disks near full is set to 0 if there aren't any issues (instead of no value)
- added descriptions on various panels

8 years ago status-panel: Update the bkgnd color to support the light theme
Paul Cuzner [Thu, 29 Jun 2017 02:53:43 +0000 (14:53 +1200)]
status-panel: Update the bkgnd color to support the light theme

By default the panel just uses 'green', which with the light theme is too
dark, making the text on the panel difficult to read. This ansible step just
updates the color in the css to make the text more readable

8 years ago Merge pull request #41 from ceph/wip-prod
Zack Cerza [Tue, 27 Jun 2017 20:54:30 +0000 (14:54 -0600)]
Merge pull request #41 from ceph/wip-prod

Add ansible-syntax job

8 years ago Add ansible-syntax job 41/head
Zack Cerza [Tue, 27 Jun 2017 15:46:12 +0000 (09:46 -0600)]
Add ansible-syntax job

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Merge pull request #37 from ceph/wip-branto
Boris Ranto [Tue, 27 Jun 2017 11:56:42 +0000 (13:56 +0200)]
Merge pull request #37 from ceph/wip-branto

selinux: Additional policy changes

Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
8 years ago selinux: Additional policy changes 37/head
Boris Ranto [Mon, 26 Jun 2017 20:12:20 +0000 (22:12 +0200)]
selinux: Additional policy changes

This was required to access whoami files inside /var/lib/ceph/osd/
directory.

Signed-off-by: Boris Ranto <branto@redhat.com>
8 years ago Merge pull request #34 from zmc/wip-selinux
Boris Ranto [Tue, 27 Jun 2017 09:08:37 +0000 (11:08 +0200)]
Merge pull request #34 from zmc/wip-selinux

ansible: Build and install SELinux module

Reviewed-by: Boris Ranto <branto@redhat.com>
8 years ago Merge pull request #26 from zmc/wip-docs
Zack Cerza [Tue, 27 Jun 2017 00:35:59 +0000 (18:35 -0600)]
Merge pull request #26 from zmc/wip-docs

ansible: Document variables

8 years ago selinux: Allow collectd to write in /var/log/ 34/head
Zack Cerza [Mon, 26 Jun 2017 21:26:35 +0000 (15:26 -0600)]
selinux: Allow collectd to write in /var/log/

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago ansible: Build and install SELinux module
Zack Cerza [Mon, 26 Jun 2017 21:04:56 +0000 (15:04 -0600)]
ansible: Build and install SELinux module

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Merge pull request #24 from zmc/wip-firewalld
Zack Cerza [Tue, 27 Jun 2017 00:20:05 +0000 (18:20 -0600)]
Merge pull request #24 from zmc/wip-firewalld

ansible: Default to using firewalld's public zone

8 years ago Merge pull request #33 from zmc/wip-ansible-lint
Zack Cerza [Tue, 27 Jun 2017 00:19:51 +0000 (18:19 -0600)]
Merge pull request #33 from zmc/wip-ansible-lint

Add ansible-lint job

8 years ago Remove trailing whitespace 33/head
Zack Cerza [Mon, 26 Jun 2017 19:25:00 +0000 (13:25 -0600)]
Remove trailing whitespace

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Name unnamed tasks
Zack Cerza [Mon, 26 Jun 2017 19:19:21 +0000 (13:19 -0600)]
Name unnamed tasks

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Add ansible-lint job
Zack Cerza [Mon, 26 Jun 2017 19:17:46 +0000 (13:17 -0600)]
Add ansible-lint job

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Merge pull request #25 from zmc/wip-ansible-forks
Zack Cerza [Mon, 26 Jun 2017 23:40:53 +0000 (17:40 -0600)]
Merge pull request #25 from zmc/wip-ansible-forks

ansible: Use up to 50 forks

8 years ago Merge pull request #31 from zmc/wip-grafana-plugins
Zack Cerza [Mon, 26 Jun 2017 23:40:41 +0000 (17:40 -0600)]
Merge pull request #31 from zmc/wip-grafana-plugins

ansible: Add grafana-piechart-panel plugin

8 years ago Merge pull request #35 from zmc/wip-dashboard-update
Zack Cerza [Mon, 26 Jun 2017 23:40:09 +0000 (17:40 -0600)]
Merge pull request #35 from zmc/wip-dashboard-update

ansible: Replace dashboards by default when updating

8 years ago Replace dashboards by default when updating 35/head
Zack Cerza [Mon, 26 Jun 2017 22:25:35 +0000 (16:25 -0600)]
Replace dashboards by default when updating

This behavior can be disabled by setting 'replace_dashboards' to False

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Document variables 26/head
Zack Cerza [Fri, 23 Jun 2017 21:45:08 +0000 (15:45 -0600)]
Document variables

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago ansible: Use up to 50 forks 25/head
Zack Cerza [Fri, 23 Jun 2017 21:55:45 +0000 (15:55 -0600)]
ansible: Use up to 50 forks

This speeds up deployment. Ansible will use this value or the number of
hosts - whichever is smaller.

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Default to using firewalld's public zone 24/head
Zack Cerza [Fri, 23 Jun 2017 21:55:45 +0000 (15:55 -0600)]
Default to using firewalld's public zone

This will be appropriate for most users.

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Install grafana-piechart-panel 31/head
Zack Cerza [Mon, 26 Jun 2017 18:15:09 +0000 (12:15 -0600)]
Install grafana-piechart-panel

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Use grafana-cli to manage plugins
Zack Cerza [Mon, 26 Jun 2017 18:11:01 +0000 (12:11 -0600)]
Use grafana-cli to manage plugins

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Merge pull request #32 from zmc/wip-tests
Zack Cerza [Mon, 26 Jun 2017 19:10:38 +0000 (13:10 -0600)]
Merge pull request #32 from zmc/wip-tests

Add tox.ini with flake8 job

8 years ago Remove unused import 32/head
Zack Cerza [Mon, 26 Jun 2017 18:35:00 +0000 (12:35 -0600)]
Remove unused import

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Add tox.ini with flake8 job
Zack Cerza [Mon, 26 Jun 2017 18:32:48 +0000 (12:32 -0600)]
Add tox.ini with flake8 job

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Merge pull request #28 from ceph/wip-branto
Zack Cerza [Mon, 26 Jun 2017 18:22:27 +0000 (12:22 -0600)]
Merge pull request #28 from ceph/wip-branto

Couple of cephmetric fixes

8 years ago rpm: Add SELinux support 28/head
Boris Ranto [Mon, 26 Jun 2017 16:24:53 +0000 (18:24 +0200)]
rpm: Add SELinux support

Signed-off-by: Boris Ranto <branto@redhat.com>
8 years ago collectors: Pass through keyword arguments
Boris Ranto [Mon, 26 Jun 2017 12:30:38 +0000 (14:30 +0200)]
collectors: Pass through keyword arguments

We need this to pass the log_level keyword argument through to the base
class. Otherwise, collectd will fail because these classes receive an
unknown argument, log_level.

Signed-off-by: Boris Ranto <branto@redhat.com>
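
The keyword pass-through described in this commit can be sketched as follows. This is a minimal illustration, not the cephmetrics code: `BaseCollector`, `Mon` and the constructor signature are hypothetical names chosen for the example.

```python
class BaseCollector(object):
    """Hypothetical base class whose constructor accepts log_level."""

    def __init__(self, cluster_name, log_level="debug"):
        self.cluster_name = cluster_name
        self.log_level = log_level


class Mon(BaseCollector):
    """Hypothetical collector that forwards its arguments unchanged."""

    def __init__(self, *args, **kwargs):
        # Forward positional and keyword arguments, including log_level,
        # so the subclass never rejects an argument meant for the base class.
        super(Mon, self).__init__(*args, **kwargs)


mon = Mon("ceph", log_level="info")
print(mon.log_level)
```

Without `**kwargs` in the subclass constructor, the call `Mon("ceph", log_level="info")` would raise a `TypeError` for the unexpected keyword.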
8 years ago rpm: Support piechart plugin
Boris Ranto [Mon, 26 Jun 2017 12:25:57 +0000 (14:25 +0200)]
rpm: Support piechart plugin

Signed-off-by: Boris Ranto <branto@redhat.com>
8 years ago ansible: Write grafana config when grafana is down
Boris Ranto [Sun, 25 Jun 2017 08:13:51 +0000 (10:13 +0200)]
ansible: Write grafana config when grafana is down

We need to start and communicate with grafana only after we push our own
config to grafana. The old config can use different locations,
e.g. for the DB, and these won't get populated properly if we run
dashUpdater or post to the grafana API too early.

Signed-off-by: Boris Ranto <branto@redhat.com>
8 years ago backend-storage dashboard : fixed the disks near full panel
Paul Cuzner [Mon, 26 Jun 2017 07:17:51 +0000 (19:17 +1200)]
backend-storage dashboard : fixed the disks near full panel

The panel was just a placeholder - this change handles the query and formatting
to make the table useful

NB. It relies on the disk_full_threshold template variable

8 years ago Merge branch 'master' of github.com:ceph/cephmetrics
Paul Cuzner [Mon, 26 Jun 2017 05:40:59 +0000 (17:40 +1200)]
Merge branch 'master' of github.com:ceph/cephmetrics

8 years ago misc dashboards: minor updates, mainly for light theme support
Paul Cuzner [Mon, 26 Jun 2017 05:34:39 +0000 (17:34 +1200)]
misc dashboards: minor updates, mainly for light theme support

8 years ago at-a-glance: various enhancements
Paul Cuzner [Mon, 26 Jun 2017 05:33:45 +0000 (17:33 +1200)]
at-a-glance: various enhancements

Dashboard updates:
- initial light theme support - some colour changes on charts
- shuffled panel order so the rows basically cover - overview, client and OS
- added disks near full panel
- added growth and forecast panels
- added OSD level ram usage summary
- PGs now shown using Grafana labs pie chart plugin
- count of rbds now shown (on client row)
- osd host count query changed
- queries on Mons and OSDs changed to mitigate state flapping
- 'buttons' changed to make them work with light/dark themes

8 years ago osd/rgw : write the elapsed time of get_stats to it's log file
Paul Cuzner [Mon, 26 Jun 2017 05:13:23 +0000 (17:13 +1200)]
osd/rgw : write the elapsed time of get_stats to it's log file

8 years ago mon: updated for logging and additional metrics collected for rbd and osd hosts
Paul Cuzner [Mon, 26 Jun 2017 05:12:23 +0000 (17:12 +1200)]
mon: updated for logging and additional metrics collected for rbd and osd hosts

collector class now scans the cluster to determine the count of RBDs. Each monitor
will pick a discrete set of pools to scan, so the overall load is shared across monitors.
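
One way to give each monitor a discrete set of pools is sketched below. The commit does not say how the split is chosen, so the round-robin assignment over sorted names, the function name `pools_for_monitor` and the sample pool and monitor names are all assumptions made for illustration.

```python
def pools_for_monitor(pools, mon_names, my_name):
    """Return the subset of pools that the monitor `my_name` should scan."""
    mons = sorted(mon_names)
    my_index = mons.index(my_name)
    # Sorting both lists means every monitor computes the same split
    # independently, without any coordination between monitors.
    return [pool for i, pool in enumerate(sorted(pools))
            if i % len(mons) == my_index]


pools = ["rbd", "images", "volumes", "backups", "vms"]
mons = ["mon-a", "mon-b", "mon-c"]
for mon in mons:
    print(mon, pools_for_monitor(pools, mons, mon))
```

Each pool lands on exactly one monitor, so the RBD count for the cluster is the sum of the per-monitor counts.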

In addition, since the osd tree command is used to determine the up/down state of
the OSDs, the same output is used to determine the number of osd hosts in the
configuration. Prior to this change the determination was inferred through a
graphite query.
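
Deriving both values from a single osd tree dump might look like the sketch below. It assumes the JSON layout printed by `ceph osd tree -f json` (a `nodes` list whose entries carry a `type` and, for OSDs, a `status`); the sample data and the function name `summarize_osd_tree` are made up for the example.

```python
import json


def summarize_osd_tree(tree_json):
    """Count OSD hosts and list down OSDs from one osd tree JSON dump."""
    nodes = json.loads(tree_json).get("nodes", [])
    hosts = [n for n in nodes if n.get("type") == "host"]
    osds = [n for n in nodes if n.get("type") == "osd"]
    down = [n["name"] for n in osds if n.get("status") != "up"]
    return {"osd_hosts": len(hosts), "osds": len(osds), "down": down}


# Invented sample in the assumed layout: two hosts, three OSDs, one down.
SAMPLE = json.dumps({
    "nodes": [
        {"id": -1, "name": "default", "type": "root", "children": [-2, -3]},
        {"id": -2, "name": "host-a", "type": "host", "children": [0, 1]},
        {"id": 0, "name": "osd.0", "type": "osd", "status": "up"},
        {"id": 1, "name": "osd.1", "type": "osd", "status": "down"},
        {"id": -3, "name": "host-b", "type": "host", "children": [2]},
        {"id": 2, "name": "osd.2", "type": "osd", "status": "up"},
    ],
    "stray": [],
})

summary = summarize_osd_tree(SAMPLE)
print(summary)
```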

8 years ago cephmetrics: Use a default value type for graphite, and assign default logging
Paul Cuzner [Mon, 26 Jun 2017 05:07:50 +0000 (17:07 +1200)]
cephmetrics: Use a default value type for graphite, and assign default logging

Before this change, if a variable was defined in a class but NOT defined in its attributes,
the collector would fail. With this change a default of gauge is assigned.

In addition, a default logging level is set for all the collectors if one is not specified by LogLevel
in the collectd.conf plugin
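
The two fallbacks can be sketched as plain dictionary lookups with defaults. The class, the `attributes` mapping, the metric names and the default level of `debug` are hypothetical; only the `gauge` default and the `LogLevel` option name come from the commit message.

```python
DEFAULT_VALUE_TYPE = "gauge"
DEFAULT_LOG_LEVEL = "debug"  # assumed default, not stated in the commit


class Collector(object):
    # Hypothetical mapping of metric name to value type.
    attributes = {"num_osds": "gauge", "ops_total": "derive"}

    def value_type(self, name):
        # A metric missing from `attributes` no longer raises an error;
        # it is reported with the default type instead.
        return self.attributes.get(name, DEFAULT_VALUE_TYPE)


def resolve_log_level(plugin_config):
    """Use LogLevel from the plugin config when present, else the default."""
    return plugin_config.get("LogLevel", DEFAULT_LOG_LEVEL)


collector = Collector()
print(collector.value_type("ops_total"))
print(collector.value_type("new_metric"))
print(resolve_log_level({}))
```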

8 years ago base/common: updated to include basic logging per module
Paul Cuzner [Mon, 26 Jun 2017 05:04:02 +0000 (17:04 +1200)]
base/common: updated to include basic logging per module

The base class now creates a logging object, allowing the collectors
to log lower level debug and info messages outside of the collectd log
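
A per-module logger that writes to its own file, separate from the collectd log, could be set up roughly as below. The helper name `module_logger`, the logger name and the log file location in the system temp directory are assumptions for the sketch, not the project's actual choices.

```python
import logging
import os
import tempfile


def module_logger(name, log_file, level=logging.DEBUG):
    """Return a logger for one module that writes to its own file."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep messages out of parent/root handlers
    if not logger.handlers:  # avoid adding a second handler on repeat calls
        handler = logging.FileHandler(log_file)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s [%(levelname)s] %(name)s - %(message)s"))
        logger.addHandler(handler)
    return logger


log_path = os.path.join(tempfile.gettempdir(), "collector-example.log")
log = module_logger("collectors.example", log_path)
log.debug("get_stats call started")
```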

8 years ago INSTALL: steps updated to reflect the additional plugin now used on at-a-glance
Paul Cuzner [Mon, 26 Jun 2017 05:02:21 +0000 (17:02 +1200)]
INSTALL: steps updated to reflect the additional plugin now used on at-a-glance

at-a-glance now uses a pie-chart (plugin from grafana labs) to show the pg status

8 years ago Merge pull request #14 from b-ranto/wip-packaging
Zack Cerza [Fri, 23 Jun 2017 16:21:46 +0000 (10:21 -0600)]
Merge pull request #14 from b-ranto/wip-packaging

Add a couple of packaging changes

8 years ago ansible: Setup repos only in devel mode 14/head
Boris Ranto [Fri, 23 Jun 2017 07:09:57 +0000 (09:09 +0200)]
ansible: Setup repos only in devel mode

We should not touch the repos in the production mode.

Signed-off-by: Boris Ranto <branto@redhat.com>
8 years ago rpm: Add initial spec file
Boris Ranto [Tue, 20 Jun 2017 23:17:06 +0000 (01:17 +0200)]
rpm: Add initial spec file

Signed-off-by: Boris Ranto <branto@redhat.com>
8 years ago Move collectd module path to avoid conflicts
Boris Ranto [Tue, 20 Jun 2017 22:52:57 +0000 (00:52 +0200)]
Move collectd module path to avoid conflicts

Signed-off-by: Boris Ranto <branto@redhat.com>
8 years ago Merge pull request #22 from zmc/wip-prod
Boris Ranto [Fri, 23 Jun 2017 05:22:15 +0000 (07:22 +0200)]
Merge pull request #22 from zmc/wip-prod

Changes needed to support production deployment

Reviewed-by: Boris Ranto <branto@redhat.com>
8 years ago ceph-grafana: Add devel_mode switch 22/head
Zack Cerza [Thu, 22 Jun 2017 20:07:58 +0000 (14:07 -0600)]
ceph-grafana: Add devel_mode switch

This will be set to False for production deployments.

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago add home dashboard support
Paul Cuzner [Thu, 22 Jun 2017 21:16:08 +0000 (09:16 +1200)]
add home dashboard support

This change adds a _home_dashboard setting so that the grafana home dashboard
for the admin user can be changed to the ceph-at-a-glance dashboard.

8 years ago ceph-grafana: Configure firewalld earlier
Zack Cerza [Thu, 22 Jun 2017 20:18:14 +0000 (14:18 -0600)]
ceph-grafana: Configure firewalld earlier

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago ceph-grafana: Split out repo setup
Zack Cerza [Thu, 22 Jun 2017 20:11:18 +0000 (14:11 -0600)]
ceph-grafana: Split out repo setup

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago ceph-grafana: Split out plugin installation
Zack Cerza [Thu, 22 Jun 2017 20:06:07 +0000 (14:06 -0600)]
ceph-grafana: Split out plugin installation

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago ceph-collectd: Add devel_mode switch
Zack Cerza [Thu, 22 Jun 2017 19:35:19 +0000 (13:35 -0600)]
ceph-collectd: Add devel_mode switch

This will be set to False for production deployments.

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago ceph-collectd: Split out collectd configuration
Zack Cerza [Thu, 22 Jun 2017 19:44:17 +0000 (13:44 -0600)]
ceph-collectd: Split out collectd configuration

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago ceph-collectd: Split out repo setup
Zack Cerza [Thu, 22 Jun 2017 19:32:31 +0000 (13:32 -0600)]
ceph-collectd: Split out repo setup

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago at-a-glance: fix for health panel, and colour match with status panel
Paul Cuzner [Thu, 22 Jun 2017 02:53:57 +0000 (14:53 +1200)]
at-a-glance: fix for health panel, and colour match with status panel

The singlestat panel was using a value map, but singlestat appears to interpolate
the health value, which results in the value map not being used and a number appearing
on the dashboard. This update uses a range map to work around this behaviour.

In addition, cosmetic changes to the health, disk and latency panels make their warning
colour state match the status panel warning colour for consistency
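
The difference between a value map and a range map can be illustrated outside Grafana. The sketch below uses the health codes 0, 4 and 8 (OK, WARN, ERROR) mentioned elsewhere in this log; the range boundaries and the function names are assumptions, and the code only mimics the lookup behaviour rather than Grafana's implementation.

```python
# Exact-match lookup: only the three literal codes are recognised.
VALUE_MAP = {0: "OK", 4: "WARN", 8: "ERROR"}

# (lower bound inclusive, upper bound exclusive, label); bounds are assumed.
RANGE_MAP = [
    (0, 4, "OK"),
    (4, 8, "WARN"),
    (8, float("inf"), "ERROR"),
]


def by_value(value):
    # An interpolated value has no exact key, so the raw number leaks out.
    return VALUE_MAP.get(value, str(value))


def by_range(value):
    for low, high, label in RANGE_MAP:
        if low <= value < high:
            return label
    return str(value)


print(by_value(4.5))
print(by_range(4.5))
```

With an interpolated reading such as 4.5, the value map falls through to the number while the range map still yields a label.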

8 years ago Merge pull request #21 from zmc/wip-grafana-2
pcuzner [Thu, 22 Jun 2017 00:14:48 +0000 (12:14 +1200)]
Merge pull request #21 from zmc/wip-grafana-2

Ship grafana.ini from this repo

8 years ago Ship grafana.ini from this repo 21/head
Zack Cerza [Thu, 22 Jun 2017 00:07:26 +0000 (18:07 -0600)]
Ship grafana.ini from this repo

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Use a universal value for root_url
Zack Cerza [Thu, 22 Jun 2017 00:00:32 +0000 (18:00 -0600)]
Use a universal value for root_url

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Merge pull request #18 from zmc/wip-carbon
pcuzner [Thu, 22 Jun 2017 00:06:38 +0000 (12:06 +1200)]
Merge pull request #18 from zmc/wip-carbon

Ensure /var/lib/carbon has the right ownership

8 years ago Ensure /var/lib/carbon has the right ownership 18/head
Zack Cerza [Wed, 21 Jun 2017 23:46:49 +0000 (17:46 -0600)]
Ensure /var/lib/carbon has the right ownership

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Merge branch 'master' of github.com:pcuzner/cephmetrics
Paul Cuzner [Wed, 21 Jun 2017 21:01:26 +0000 (09:01 +1200)]
Merge branch 'master' of github.com:pcuzner/cephmetrics

8 years ago at-a-glance: correct grammar on status panels (mons/osds)
Paul Cuzner [Wed, 21 Jun 2017 21:00:46 +0000 (09:00 +1200)]
at-a-glance: correct grammar on status panels (mons/osds)

8 years ago Merge pull request #15 from zmc/wip-carbon
pcuzner [Wed, 21 Jun 2017 20:16:58 +0000 (08:16 +1200)]
Merge pull request #15 from zmc/wip-carbon

Resize whisper databases when necessary

8 years ago Resize whisper databases when necessary 15/head
Zack Cerza [Wed, 21 Jun 2017 18:00:12 +0000 (12:00 -0600)]
Resize whisper databases when necessary

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Merge pull request #11 from zmc/wip-ansible
pcuzner [Wed, 21 Jun 2017 03:39:14 +0000 (15:39 +1200)]
Merge pull request #11 from zmc/wip-ansible

More ansible updates

8 years ago rgw updates: tweak detection of rgw sockets, and modify RGW dashboards
Paul Cuzner [Wed, 21 Jun 2017 02:17:29 +0000 (14:17 +1200)]
rgw updates: tweak detection of rgw sockets, and modify RGW dashboards

The detection didn't work on some systems (I was expecting a pid in the name),
so the glob used is now more generic.

The rgw dashboard has been updated with a row that shows a roll-up of all rgws,
followed by repeated rows for each specific rgw node
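
The effect of loosening the glob can be shown with `fnmatch` on a list of file names. Both patterns and all socket names below are hypothetical; the commit does not give the actual glob, so this only illustrates why a pattern that expects a pid misses some sockets.

```python
import fnmatch

# Invented admin socket names; real names depend on the deployment.
socket_names = [
    "ceph-client.rgw.node1.asok",
    "ceph-client.rgw.node2.1234.asok",
    "ceph-osd.0.asok",
]

PID_PATTERN = "ceph-client.rgw.*.[0-9]*.asok"  # expects a pid component
GENERIC_PATTERN = "*rgw*.asok"                 # matches any rgw socket

with_pid = fnmatch.filter(socket_names, PID_PATTERN)
generic = fnmatch.filter(socket_names, GENERIC_PATTERN)

print(with_pid)
print(generic)
```

The stricter pattern finds only the socket that carries a pid, while the generic one finds both rgw sockets and still skips the OSD socket.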

8 years ago Compatibility for ansible < 2.3 11/head
Zack Cerza [Tue, 20 Jun 2017 22:09:04 +0000 (16:09 -0600)]
Compatibility for ansible < 2.3

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Install dependencies of collector plugins
Zack Cerza [Tue, 20 Jun 2017 19:13:09 +0000 (13:13 -0600)]
Install dependencies of collector plugins

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago By default, don't use EPEL
Zack Cerza [Tue, 20 Jun 2017 18:56:46 +0000 (12:56 -0600)]
By default, don't use EPEL

Instead, use a repo on ceph.com for now.

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Install unzip to extract the Vonage plugin
Zack Cerza [Tue, 20 Jun 2017 18:56:21 +0000 (12:56 -0600)]
Install unzip to extract the Vonage plugin

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Don't fail when firewalld isn't installed
Zack Cerza [Tue, 20 Jun 2017 18:53:53 +0000 (12:53 -0600)]
Don't fail when firewalld isn't installed

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Pass --dashboard-dir to dashUpdater.py
Zack Cerza [Tue, 20 Jun 2017 18:54:21 +0000 (12:54 -0600)]
Pass --dashboard-dir to dashUpdater.py

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago Make the dashboard dir an argument
Zack Cerza [Tue, 20 Jun 2017 18:53:11 +0000 (12:53 -0600)]
Make the dashboard dir an argument

Instead of hardcoding it.

Signed-off-by: Zack Cerza <zack@redhat.com>
8 years ago dashboards updated to account for null missing values
Paul Cuzner [Tue, 20 Jun 2017 04:16:52 +0000 (16:16 +1200)]
dashboards updated to account for null missing values

Prior to this change the charts had "Display/Null Value" set as null - but in
an environment where observations/metrics arrive late or go missing, the
resulting chart would be partly populated at best and, at worst, blank, with
only hover providing an indication that data points are present.

Ideally the data should be present - but by setting the option to connected, null values
will not stop the graphs from being rendered.

8 years ago dashUpdater - dashboard refresh mode needed overwrite set to True
Paul Cuzner [Tue, 20 Jun 2017 02:35:37 +0000 (14:35 +1200)]
dashUpdater - dashboard refresh mode needed overwrite set to True

8 years ago at-a-glance : health history chart updated
Paul Cuzner [Mon, 19 Jun 2017 20:47:25 +0000 (08:47 +1200)]
at-a-glance : health history chart updated

The health history query led to interpolation of health values, so you'd see 1s, 2s or 3s due to averaging. This
change updates the query to use the consolidateBy function, which should keep the values as intended - 0, 4, 8
representing OK, WARN, ERROR

INSTALL instructions updated since this chart problem was listed as a Known Issue
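
The averaging problem in the health history chart can be reproduced with a few lines of arithmetic. The sketch below collapses runs of health codes into single points, once by averaging and once by taking the maximum; the commit does not say which function was passed to consolidateBy, so `max` is an assumption, and the helper names and sample series are invented.

```python
def consolidate(points, bucket_size, func):
    """Collapse each run of `bucket_size` points into one value with `func`."""
    return [func(points[i:i + bucket_size])
            for i in range(0, len(points), bucket_size)]


def average(bucket):
    return sum(bucket) / float(len(bucket))


# Health codes over time: 0 = OK, 4 = WARN (three buckets of four samples).
health = [0, 0, 0, 4, 0, 0, 4, 4, 0, 4, 4, 4]

print(consolidate(health, 4, average))  # [1.0, 2.0, 3.0]
print(consolidate(health, 4, max))      # [4, 4, 4]
```

Averaging produces 1, 2 and 3, which are not valid health codes, while taking the maximum keeps each point on one of the intended values.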

8 years ago INSTALL info doc update
Paul Cuzner [Mon, 19 Jun 2017 04:44:24 +0000 (16:44 +1200)]
INSTALL info doc update

Removed some of the initial issues recorded, since they have been
resolved or worked around.

8 years ago added osd-state information
Paul Cuzner [Mon, 19 Jun 2017 04:10:09 +0000 (16:10 +1200)]
added osd-state information

The mon collector now includes the status of osds in its output
(where 0 = up, 1 = down). To use this additional info, the at-a-glance
and backend-storage dashboards have been updated.

at-a-glance now includes a link from the OSDs' 'down' value to the
backend storage dashboard

backend-storage dashboard includes an osd down table - which basically
shows osds with a status of > 0

8 years ago at-a-glance - mon reporting now uses status panel, not singlestat
Paul Cuzner [Sun, 18 Jun 2017 23:36:27 +0000 (11:36 +1200)]
at-a-glance - mon reporting now uses status panel, not singlestat

Status Panel allows the mons to be shown as a total/in quorum and down. In
addition, the panel uses a threshold and goes amber if a mon is down.

8 years ago dashUpdater - refactoring and startup options added
Paul Cuzner [Sun, 18 Jun 2017 22:08:23 +0000 (10:08 +1200)]
dashUpdater - refactoring and startup options added

Multiple changes
- refactored code
- added argparse to provide runtime options
- added the logging module to provide more debug information
- added checks to confirm the grafana port is available before attempting to fetch dashboards
- now issues a return code back to the calling shell
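
The combination of changes listed above could be structured roughly as in this sketch. It is not the dashUpdater code: the option names `--grafana-host`, `--grafana-port` and `--debug`, the logger name and the return values are assumptions, with 3000 used as Grafana's default HTTP port.

```python
import argparse
import logging
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    sock.close()
    return True


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Load dashboards into Grafana")
    parser.add_argument("--grafana-host", default="localhost")
    parser.add_argument("--grafana-port", type=int, default=3000)
    parser.add_argument("--debug", action="store_true",
                        help="log debug messages")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    logging.basicConfig(level=logging.DEBUG if args.debug else logging.INFO)
    log = logging.getLogger("dashUpdater-sketch")

    if not port_open(args.grafana_host, args.grafana_port):
        log.error("grafana is not listening on %s:%d",
                  args.grafana_host, args.grafana_port)
        return 1  # non-zero tells the calling shell that the run failed

    log.debug("port check passed, dashboards could be fetched now")
    return 0
```

A script built this way would end with `sys.exit(main())` so that the shell, or an ansible task calling it, sees the return code.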