Paul Cuzner [Thu, 29 Jun 2017 03:09:01 +0000 (15:09 +1200)]
backend-storage: cosmetic changes to heatmap and graphs
The heatmap 'spectrum' was not showing well in the light theme - this
change makes it more readable. In addition, 'info' has been added to explain the
heatmap and what it represents.
Paul Cuzner [Thu, 29 Jun 2017 03:04:17 +0000 (15:04 +1200)]
at-a-glance : query and cosmetic changes
Multiple changes as follows:
- dashboard links (top) hover-over was misaligned; fixed
- forecast value now shows N/A if negative
- forecast and growth queries updated
- disks near full now shows 0 when there are no issues (instead of no value)
- added descriptions on various panels
Paul Cuzner [Thu, 29 Jun 2017 02:53:43 +0000 (14:53 +1200)]
status-panel: Update the bkgnd color to support the light theme
By default the panel just uses 'green', which with the light theme is too
dark, making the text on the panel difficult to read. This ansible step just
updates the color in the CSS to make the text more readable.
Boris Ranto [Mon, 26 Jun 2017 12:30:38 +0000 (14:30 +0200)]
collectors: Pass through keyword arguments
We need this to pass the log_level keyword argument through to the base
class. Otherwise, collectd will fail because these classes receive an
unexpected log_level argument.
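A minimal sketch of the pass-through pattern described above. The class names and constructor signature are hypothetical stand-ins, not the actual cephmetrics classes; only the log_level keyword comes from the commit message.

```python
import logging


class BaseCollector(object):
    """Hypothetical stand-in for the shared collector base class."""

    def __init__(self, cluster_name, log_level=logging.DEBUG):
        self.cluster_name = cluster_name
        self.logger = logging.getLogger("collector")
        self.logger.setLevel(log_level)


class Mon(BaseCollector):
    """Hypothetical subclass that forwards keyword arguments upward."""

    def __init__(self, *args, **kwargs):
        # Forward everything, including log_level, to the base class.
        # A signature without **kwargs would raise TypeError as soon as
        # the caller supplied log_level.
        BaseCollector.__init__(self, *args, **kwargs)
        self.version = 1


mon = Mon("ceph", log_level=logging.INFO)
```

Accepting `*args, **kwargs` in the subclass means new base-class options do not require every collector to be touched again.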
Boris Ranto [Sun, 25 Jun 2017 08:13:51 +0000 (10:13 +0200)]
ansible: Write grafana config when grafana is down
We need to start and communicate with grafana only after we have pushed
our own config to it. The old config can use different locations,
e.g. for the DB, and these won't get populated properly if we run
dashUpdater or post to the grafana API too early.
Paul Cuzner [Mon, 26 Jun 2017 05:33:45 +0000 (17:33 +1200)]
at-a-glance: various enhancements
Dashboard updates;
- initial light theme support - some colour changes on charts
- shuffled panel order so the rows basically cover - overview, client and OS
- added disks near full panel
- added growth and forecast panels
- added OSD level ram usage summary
- PG's now shown using Grafana labs pie chart plugin
- count of rbds now shown (on client row)
- osd host count query changed
- queries on Mons and OSDs changed to mitigate state flapping
- 'buttons' changed to make them work with light/dark themes
Paul Cuzner [Mon, 26 Jun 2017 05:12:23 +0000 (17:12 +1200)]
mon: updated for logging and additional metrics collected for rbd and osd hosts
The collector class now scans the cluster to determine the count of RBDs. Each monitor
picks a discrete set of pools to scan, so the overall load is shared across the monitors.
In addition, since the osd tree command is used to determine the up/down state of
the OSDs, the same output is used to determine the number of osd hosts in the
configuration. Prior to this change the determination was inferred through a
graphite query.
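The two ideas above can be sketched as follows. This is an illustration under assumptions, not the collector's actual code: the round-robin split over sorted monitor names is one possible way to give each monitor a discrete set of pools, and the JSON layout ("nodes" entries carrying "type" and "status") is assumed to match what `ceph osd tree -f json` returns. The function names are hypothetical.

```python
import json


def pools_for_monitor(pools, mon_names, this_mon):
    """Return the pools this monitor should scan.

    Assumption: pools are dealt out round-robin over the sorted monitor
    names, so every pool is scanned by exactly one monitor.
    """
    mons = sorted(mon_names)
    idx = mons.index(this_mon)
    return sorted(pools)[idx::len(mons)]


def summarise_osd_tree(tree_json):
    """Count OSD hosts and map each OSD id to 0 (up) or 1 (down)."""
    nodes = json.loads(tree_json)["nodes"]
    host_count = sum(1 for n in nodes if n.get("type") == "host")
    osd_state = {
        n["id"]: 0 if n.get("status") == "up" else 1
        for n in nodes
        if n.get("type") == "osd"
    }
    return host_count, osd_state


# Hand-written sample in the assumed shape of the osd tree output.
sample = json.dumps({"nodes": [
    {"id": -1, "type": "root", "name": "default"},
    {"id": -2, "type": "host", "name": "node1"},
    {"id": 0, "type": "osd", "name": "osd.0", "status": "up"},
    {"id": -3, "type": "host", "name": "node2"},
    {"id": 1, "type": "osd", "name": "osd.1", "status": "down"},
]})
hosts, states = summarise_osd_tree(sample)
```

Deriving the host count from the same output that already supplies the up/down state avoids a second round trip and removes the dependency on graphite for this value.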
Paul Cuzner [Mon, 26 Jun 2017 05:07:50 +0000 (17:07 +1200)]
cephmetrics: Use a default value type for graphite, and assign default logging
Before this change, if a variable was defined in a class but NOT defined in its attributes,
the collector would fail. With this change a default of gauge is assigned.
In addition, a default logging level is set for all the collectors if one is not specified by
LogLevel in the collectd.conf plugin section.
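A rough sketch of the fallback described above, assuming a per-class attribute table that maps a metric name to a (description, value type) pair. The class, its fields and the table layout are hypothetical; only the gauge default comes from the commit message.

```python
DEFAULT_TYPE = "gauge"


class OSDStats(object):
    # Hypothetical attribute table: name -> (description, graphite value type)
    attributes = {
        "apply_latency": ("apply latency", "gauge"),
        "ops": ("operations", "derive"),
    }

    def __init__(self):
        self.apply_latency = 4
        self.ops = 1200
        self.new_counter = 7  # set on the instance but missing from attributes

    def value_types(self):
        """Return the value type for each instance variable.

        Variables without an entry in the attribute table fall back to
        DEFAULT_TYPE instead of raising KeyError.
        """
        fallback = (None, DEFAULT_TYPE)
        return {
            name: self.attributes.get(name, fallback)[1]
            for name in vars(self)
        }


types = OSDStats().value_types()
```

Using `dict.get` with a fallback keeps the collector running when a new variable is added before its attribute entry is written.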
Paul Cuzner [Thu, 22 Jun 2017 21:16:08 +0000 (09:16 +1200)]
add home dashboard support
This change adds a _home_dashboard setting such that the grafana home dashboard
for the admin user can be changed to be the ceph-at-a-glance dashboard.
Paul Cuzner [Thu, 22 Jun 2017 02:53:57 +0000 (14:53 +1200)]
at-a-glance: fix for health panel, and colour match with status panel
The singlestat panel was using a value map, but singlestat appears to interpolate
the health value, which results in the value map not being used and a number appearing
on the dashboard. This update uses a range map to work around this behaviour.
In addition, cosmetic changes were made to the health, disk and latency panels so that their
warning colour matches the status panel warning colour for consistency.
Paul Cuzner [Tue, 20 Jun 2017 04:16:52 +0000 (16:16 +1200)]
dashboards updated to account for null missing values
Prior to this change the charts had "Display/Null Value" set as null - but in
an environment where observations/metrics arrive late or go missing, the
resulting chart would be partly populated at best and blank at worst, with
only hover providing an indication that data points are present.
Ideally the data should be present - but by setting the option to connected,
null values will not stop the graphs from being rendered.
Paul Cuzner [Mon, 19 Jun 2017 20:47:25 +0000 (08:47 +1200)]
at-a-glance : health history chart updated
The health history query led to interpolation of health values, so you'd see 1, 2 or 3 due to
averaging. This change updates the query to use the consolidateBy function, which should keep
the values as intended - 0, 4, 8 representing OK, WARN, ERROR.
INSTALL instructions updated, since this chart problem was listed as a Known Issue.
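The effect of averaging can be shown with a small worked example. This is a plain Python illustration of bucket consolidation, not graphite's implementation, and it assumes the query consolidates by 'max' (the commit message does not say which function was chosen). The sample series and bucket size are made up.

```python
def consolidate(points, bucket_size, func):
    """Collapse consecutive buckets of points into one value each."""
    return [
        func(points[i:i + bucket_size])
        for i in range(0, len(points), bucket_size)
    ]


def average(values):
    return sum(values) / float(len(values))


# Health samples using the encoding from the commit: 0 OK, 4 WARN, 8 ERROR
health = [0, 0, 0, 4, 4, 8, 8, 4]

averaged = consolidate(health, 4, average)  # yields values outside {0, 4, 8}
peak = consolidate(health, 4, max)          # stays within {0, 4, 8}
```

With averaging, the first bucket becomes 1.0 and the second 6.0, neither of which maps to a health state. Taking the maximum reports the worst state seen in each bucket (4 and 8 here).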
Paul Cuzner [Mon, 19 Jun 2017 04:10:09 +0000 (16:10 +1200)]
added osd-state information
The mon collector now includes the status of OSDs in its output
(where 0 = up, 1 = down). To use this additional info, the at-a-glance
and backend-storage dashboards have been updated.
at-a-glance now includes a link from the OSD's 'down' value to the
backend storage dashboard
backend-storage dashboard includes an osd down table - which basically
shows osd's with a status of > 0
Paul Cuzner [Sun, 18 Jun 2017 22:08:23 +0000 (10:08 +1200)]
dashUpdater - refactoring and startup options added
Multiple changes
- refactored code
- added argparse to provide runtime options
- added logging module to provide more debug information
- added checks to confirm grafana port is available before attempting to fetch dashboards
- issue a return code back to the calling shell
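The listed changes could fit together roughly as below. This is a sketch under assumptions, not the dashUpdater source: the option names (--host, --port, --debug) and the return codes are hypothetical, and 3000 is used because it is Grafana's default HTTP port.

```python
import argparse
import logging
import socket


def parse_args(argv=None):
    """Runtime options; the flag names here are illustrative only."""
    parser = argparse.ArgumentParser(description="dashboard updater sketch")
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--port", type=int, default=3000)
    parser.add_argument("--debug", action="store_true")
    return parser.parse_args(argv)


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        return sock.connect_ex((host, port)) == 0
    except socket.error:
        return False
    finally:
        sock.close()


def main(argv=None):
    args = parse_args(argv)
    logging.basicConfig(level=logging.DEBUG if args.debug else logging.INFO)
    log = logging.getLogger("dashUpdater")

    # Confirm the grafana port is reachable before fetching dashboards
    if not port_open(args.host, args.port):
        log.error("grafana not reachable at %s:%d", args.host, args.port)
        return 1  # non-zero return code for the calling shell

    log.debug("port reachable, dashboards could be fetched here")
    return 0
```

A script entry point would then end with `sys.exit(main())`, so that an ansible task or shell wrapper can act on the return code.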