git-server-git.apps.pok.os.sepia.ceph.com Git - cephmetrics.git/log
pcuzner [Thu, 27 Jul 2017 19:59:06 +0000 (07:59 +1200)]
Merge pull request #81 from zmc/wip-passwd
ansible: Support non-default Grafana password
Zack Cerza [Mon, 24 Jul 2017 23:33:35 +0000 (16:33 -0700)]
Merge pull request #79 from ceph/wip-paulc
Latest updates covering feedback and RFE's
Zack Cerza [Mon, 24 Jul 2017 23:09:02 +0000 (16:09 -0700)]
Update SELinux policy
The collectors need to be able to determine whether an OSD uses
filestore or bluestore
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 24 Jul 2017 23:07:58 +0000 (16:07 -0700)]
Restore SELinux context of OSD journals
So that our SELinux policy can properly allow collectors to detect
whether an OSD uses filestore or bluestore
Signed-off-by: Zack Cerza <zack@redhat.com>
Paul Cuzner [Mon, 24 Jul 2017 02:13:09 +0000 (14:13 +1200)]
alert-status dashboard : Enable default alerts
dashUpdater has been updated to automatically set up a cephmetrics
notifications channel (if it's not already there), and the alert-status
dashboard is loaded, which references the cephmetrics channel.
The ansible templates have been updated to reflect the introduction of the
alert-status dashboard
Paul Cuzner [Fri, 21 Jul 2017 22:25:17 +0000 (10:25 +1200)]
osd: fix determination of osd type
The presence of the type file was being relied upon across versions.
However, not all versions provide this file (10.2.2 did, 10.2.7 didn't!), so
this fix looks for type and uses it if it's there; if not, it will
look for the presence of the journal link to determine if the osd
is filestore. It is assumed that bluestore will 'always' use the type
file.
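A minimal sketch of the fallback described above, not the collector's actual code. The function name osd_store_type and the "unknown" return value are assumptions made for illustration; only the type file and the journal link come from the commit message.

```python
import os


def osd_store_type(osd_path):
    """Guess the object store of an OSD from its data directory.

    osd_path is the OSD's data directory (hypothetical argument name).
    """
    type_file = os.path.join(osd_path, "type")
    if os.path.exists(type_file):
        # Prefer the type file whenever the Ceph version provides it.
        with open(type_file) as f:
            return f.read().strip()
    # No type file: a journal link is taken to mean filestore.
    if os.path.lexists(os.path.join(osd_path, "journal")):
        return "filestore"
    # Assumption: with neither marker present, report the type as unknown.
    return "unknown"
```

lexists is used so that a journal symlink counts even if its target cannot be read by the caller.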
Zack Cerza [Thu, 20 Jul 2017 23:21:50 +0000 (16:21 -0700)]
Optionally use a different Grafana admin password
Signed-off-by: Zack Cerza <zack@redhat.com>
Paul Cuzner [Fri, 21 Jul 2017 21:43:39 +0000 (09:43 +1200)]
osd-node-detail: fix to templating which caused charts to show no data
templating had a reference to a test server hard coded - resulting in
failed queries. This fix replaces this with $domain
Paul Cuzner [Fri, 21 Jul 2017 03:20:36 +0000 (15:20 +1200)]
dashboard relationships updated to show the alert-status dashboard
Paul Cuzner [Fri, 21 Jul 2017 02:57:49 +0000 (14:57 +1200)]
ceph-rgw-workload: updates to metric calculations and spark line colors
Spark lines now blue matching the at-a-glance view for consistency
Paul Cuzner [Fri, 21 Jul 2017 02:57:04 +0000 (14:57 +1200)]
ceph-rados: added the cluster flags
Cluster flags/features are shown as singlestat panels alongside the monitor
state. The flag states are 0 (enabled/inactive), 1 (active) and 2 (disabled).
Thresholds are used on these panels to indicate the above states
Paul Cuzner [Fri, 21 Jul 2017 02:54:56 +0000 (14:54 +1200)]
ceph-frontend: added recovery by pool - since this is the pool dashboard!
Paul Cuzner [Fri, 21 Jul 2017 02:52:58 +0000 (14:52 +1200)]
ceph-backend: fixed bug in osd-down table query
Query was showing OSDs that had been removed from the system
Zack Cerza [Thu, 20 Jul 2017 23:18:58 +0000 (16:18 -0700)]
Allow partial defaults overrides
With this change, it becomes possible to override certain keys in each configuration dict while accepting the default values for other keys.
Signed-off-by: Zack Cerza <zack@redhat.com>
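The idea of a partial override can be sketched in Python as a shallow merge of one configuration dict over its defaults. This is an illustration only: the playbook itself is Ansible, and the names merge_defaults, admin_user and admin_password are hypothetical.

```python
def merge_defaults(defaults, overrides):
    """Return defaults with any keys present in overrides replaced."""
    merged = dict(defaults)   # copy, so the defaults stay untouched
    merged.update(overrides)  # only the supplied keys change
    return merged


# Hypothetical keys: override one value and keep the default for the other.
grafana_defaults = {"admin_user": "admin", "admin_password": "admin"}
grafana = merge_defaults(grafana_defaults, {"admin_password": "s3cret"})
```

Here grafana keeps the default admin_user while taking the new admin_password.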
Zack Cerza [Thu, 20 Jul 2017 20:42:07 +0000 (13:42 -0700)]
Move Grafana configuration defaults
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Thu, 20 Jul 2017 21:59:35 +0000 (14:59 -0700)]
Merge pull request #71 from ceph/wip-whisper
Make whisper retention settings configurable
Zack Cerza [Wed, 19 Jul 2017 20:31:37 +0000 (13:31 -0700)]
Document whisper settings
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Thu, 13 Jul 2017 18:21:09 +0000 (12:21 -0600)]
Make whisper retention settings configurable
The first value must be '10s', still, for consistency with collectd
Signed-off-by: Zack Cerza <zack@redhat.com>
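The constraint on the first value could be checked with something like the sketch below. It assumes the retention setting is a carbon-style string of comma-separated precision:duration pairs; the function name and the example values are hypothetical.

```python
def first_precision_is_10s(retention):
    """Check that the first archive in a retention string uses 10s precision."""
    first_archive = retention.split(",")[0].strip()
    precision = first_archive.split(":")[0].strip()
    return precision == "10s"
```

For example, "10s:7d,1m:30d" would pass this check and "1m:30d,10s:7d" would not.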
pcuzner [Thu, 13 Jul 2017 22:11:34 +0000 (10:11 +1200)]
Merge pull request #69 from ceph/wip-ubuntu
ansible: devel_mode deployment for Ubuntu
Zack Cerza [Thu, 13 Jul 2017 16:46:40 +0000 (10:46 -0600)]
Mention Ubuntu support in ansible/README.md
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Wed, 12 Jul 2017 18:30:09 +0000 (12:30 -0600)]
Reload systemd later in the process
Since we're shipping a .service file for graphite-api
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Wed, 12 Jul 2017 17:33:34 +0000 (11:33 -0600)]
ceph-grafana: Use graphite-api for Ubuntu
They ship it already, and it's a bit less involved to set up.
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Wed, 12 Jul 2017 17:35:25 +0000 (11:35 -0600)]
ceph-grafana: Initial Ubuntu support
This commit lets us grab upstream Grafana packages in devel_mode
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Wed, 12 Jul 2017 16:14:50 +0000 (10:14 -0600)]
ceph-collectd: Support Ubuntu
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Tue, 11 Jul 2017 23:48:48 +0000 (17:48 -0600)]
Merge pull request #64 from ceph/wip-fixes
dashUpdater now removes $domain references if domain is not provided
Paul Cuzner [Tue, 11 Jul 2017 23:32:34 +0000 (11:32 +1200)]
screenshot updated to show current state @ 2017-07-12 (used in wiki)
Paul Cuzner [Fri, 7 Jul 2017 04:01:50 +0000 (16:01 +1200)]
dashUpdater: remove $domain from dashboards, if domain is not configured
For environments that don't use dns, collectd will not provide a FQDN
on the metric name. In these circumstances, the dashboards are empty.
This fix looks for the domain setting, and if it's not supplied the
$domain reference in all queries is removed before the dashboard is loaded
into grafana.
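A rough sketch of the substitution described above, not dashUpdater's actual code. It assumes the reference appears in queries as a dotted ".$domain" segment; the function name strip_domain is hypothetical.

```python
def strip_domain(dashboard_text, domain=None):
    """Drop $domain references from dashboard text when no domain is set."""
    if domain:
        # A domain is configured, so the templated queries are left alone.
        return dashboard_text
    # Assumption: the variable is written as a ".$domain" path segment.
    return dashboard_text.replace(".$domain", "")
```

With no domain configured, a query such as collectd.$hostname.$domain.cpu would become collectd.$hostname.cpu before the dashboard is loaded.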
Zack Cerza [Tue, 11 Jul 2017 22:42:20 +0000 (16:42 -0600)]
Merge pull request #58 from ceph/wip-paulc
Dashboard improvements and addition of OSD latencies
Paul Cuzner [Tue, 11 Jul 2017 01:17:52 +0000 (13:17 +1200)]
common/osd: fixes to support intelcas and nvme OSD/journals
Paul Cuzner [Fri, 7 Jul 2017 01:14:58 +0000 (13:14 +1200)]
dashboard added to ansible template
Add new ceph-osd-latency dashboard to template file
Paul Cuzner [Fri, 7 Jul 2017 00:16:32 +0000 (12:16 +1200)]
osd: remove unused import (flatten_dict)
Paul Cuzner [Thu, 6 Jul 2017 23:31:48 +0000 (11:31 +1200)]
osd: add support for osd related stats, and support journal devices
OSD daemons are now asked for perf data, so latencies within ceph can be
loaded to graphite. In addition the journal device is detected. If it's
not collocated on the osd device, additional disk metrics under a journal
subtree are created within graphite
Paul Cuzner [Thu, 6 Jul 2017 23:29:06 +0000 (11:29 +1200)]
common: changes to the Disk class
Two main things:
1. Disk instances are now initialized here, instead of with the caller
devices, simplifying code in the osd class
2. get_real_dev function added to convert a device name of an OSD to the
name we'll use as a metric. This now provides initial support for nvme
and intelcas based OSDs
Paul Cuzner [Thu, 6 Jul 2017 23:26:00 +0000 (11:26 +1200)]
base: _admin_socket function updated to allow easier reuse of the base class
Paul Cuzner [Thu, 6 Jul 2017 23:25:23 +0000 (11:25 +1200)]
mon: added debug messages to aid in diagnostics
Paul Cuzner [Thu, 6 Jul 2017 23:23:07 +0000 (11:23 +1200)]
osd-node-detail: network chart updates for device names and visualisation
device names are whitelisted as en,eth,bond and the rates are now stacked
on the chart so you can see total throughput more easily
Paul Cuzner [Thu, 6 Jul 2017 23:21:46 +0000 (11:21 +1200)]
dashboard.yml : new dashboard entry added
The entry is needed to ensure it gets updated by the dashUpdater.py
script
Paul Cuzner [Thu, 6 Jul 2017 23:20:38 +0000 (11:20 +1200)]
doc update : dashboard relationships diagram updated
new dashboard added, so the diagram now reflects where it fits in the
flow
Zack Cerza [Tue, 11 Jul 2017 18:45:43 +0000 (12:45 -0600)]
Merge pull request #62 from ceph/wip-docs
Install enhancements and docs improvements
Zack Cerza [Mon, 10 Jul 2017 20:27:59 +0000 (14:27 -0600)]
Ensure subscription-manager repos are enabled
We need these for dependencies of python-carbon and ceph-ansible
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 10 Jul 2017 19:54:34 +0000 (13:54 -0600)]
Install cephmetrics repo in production mode
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 7 Jul 2017 19:36:07 +0000 (13:36 -0600)]
Print dashboard URL at playbook end
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 7 Jul 2017 19:06:22 +0000 (13:06 -0600)]
Recommend running -playbook on ceph-ansible host
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 7 Jul 2017 18:33:53 +0000 (12:33 -0600)]
Correct URL in spec file
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 7 Jul 2017 15:57:47 +0000 (09:57 -0600)]
Install repo before package
Signed-off-by: Zack Cerza <zack@redhat.com>
pcuzner [Thu, 6 Jul 2017 23:10:33 +0000 (11:10 +1200)]
Merge pull request #57 from ceph/wip-docs
Mention enabling rhel-7-server-optional-rpms
Zack Cerza [Thu, 6 Jul 2017 21:59:44 +0000 (15:59 -0600)]
Mention enabling rhel-7-server-optional-rpms
It's needed for pyserial, python-twisted-core and python-zope-interface,
which are dependencies of python-carbon
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 30 Jun 2017 23:20:31 +0000 (17:20 -0600)]
Merge pull request #51 from ceph/wip-metrics-wusui
Non-developer installation instructions
Warren Usui [Thu, 29 Jun 2017 21:28:02 +0000 (17:28 -0400)]
Non-developer installation instructions
Signed-off-by: Warren Usui <wusui@magna002.ceph.redhat.com>
Zack Cerza [Fri, 30 Jun 2017 20:36:05 +0000 (14:36 -0600)]
Merge pull request #53 from ceph/wip-paulc
Dashboard fixes for at-a-glance and rados
Zack Cerza [Fri, 30 Jun 2017 19:51:41 +0000 (13:51 -0600)]
Merge pull request #54 from ceph/wip-branto
rpm: Support light theme better
Boris Ranto [Fri, 30 Jun 2017 08:10:00 +0000 (10:10 +0200)]
rpm: Support light theme better
Signed-off-by: Boris Ranto <branto@redhat.com>
Paul Cuzner [Fri, 30 Jun 2017 05:49:18 +0000 (17:49 +1200)]
at-a-glance: fix for calcs on growth and forecast
Paul Cuzner [Fri, 30 Jun 2017 02:05:33 +0000 (14:05 +1200)]
at-a-glance: multiple fixes to mon/osd/growth and forecast panels
MON/OSD panel queries updated to address the interpolation
problem where floats were shown. OSD panel also now shows
total OSDs
Templating update for the disk_full_threshold (2->80)
Growth/Forecast panel queries updated to account for data coming
from multiple mon's
Health Panel updated to show as RED when the cluster is in an
ERROR state
Paul Cuzner [Fri, 30 Jun 2017 02:01:19 +0000 (14:01 +1200)]
ceph-rados : fixes to capacity and health history charts
- Capacity chart has been extended to cover 7 days
- health history fixed - was showing an entry for each mon!
Zack Cerza [Thu, 29 Jun 2017 21:50:12 +0000 (15:50 -0600)]
Merge pull request #48 from ceph/wip-paulc
dashboard updates
Zack Cerza [Thu, 29 Jun 2017 20:54:08 +0000 (14:54 -0600)]
Merge pull request #50 from ceph/wip-branto
ansible: Fix a typo in purge.yml
Boris Ranto [Thu, 29 Jun 2017 20:36:44 +0000 (22:36 +0200)]
ansible: Fix a typo in purge.yml
Signed-off-by: Boris Ranto <branto@redhat.com>
Zack Cerza [Thu, 29 Jun 2017 16:38:11 +0000 (10:38 -0600)]
Merge pull request #49 from ceph/wip-branto
ansible: Implement purge playbook
Zack Cerza [Thu, 29 Jun 2017 16:26:38 +0000 (10:26 -0600)]
Merge pull request #47 from ceph/develop
Set WHISPER_AUTOFLUSH to True
Boris Ranto [Thu, 29 Jun 2017 14:07:17 +0000 (16:07 +0200)]
ansible: Implement purge playbook
This should be a good basis for purge playbook, it should support devel
as well as production modes.
Signed-off-by: Boris Ranto <branto@redhat.com>
Paul Cuzner [Thu, 29 Jun 2017 05:00:47 +0000 (17:00 +1200)]
screenshot changed for wiki
Paul Cuzner [Thu, 29 Jun 2017 04:50:30 +0000 (16:50 +1200)]
at-a-glance: pg status pie chart changes
a degraded state is now shown based on the diff of pg_active and
pg_active_clean. This intermediate metric has been added to the pie
chart so it shows; active+clean, degraded and peering.
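The intermediate metric amounts to a simple subtraction, sketched here in Python. The dashboard does this in its queries rather than in code; the function name and the pg_peering argument are assumptions for illustration.

```python
def pg_pie_slices(pg_active, pg_active_clean, pg_peering):
    """Build the three pie chart slices from placement group counts."""
    return {
        "active+clean": pg_active_clean,
        # Degraded is the share of active PGs that are not also clean.
        "degraded": pg_active - pg_active_clean,
        "peering": pg_peering,
    }
```

For example, 120 active PGs of which 100 are active+clean would give a degraded slice of 20.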
Paul Cuzner [Thu, 29 Jun 2017 03:26:01 +0000 (15:26 +1200)]
dashUpdater : Set the default Org's theme to light
Most Ceph UIs use a light theme, so this change aligns to that
trend.
Paul Cuzner [Thu, 29 Jun 2017 03:15:58 +0000 (15:15 +1200)]
network-usage: dashboard updated to track enX interface stats
graphite doesn't support blacklisting in queries, so interface names that
we're interested in have to be whitelisted. This fix now tracks enX, ethX and
bondX interface names.
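The whitelist can be illustrated in Python with glob patterns. This is only a sketch of the selection logic, assuming prefix matching on the three names listed; the dashboard applies the whitelist inside its graphite queries, and the names below are hypothetical.

```python
from fnmatch import fnmatchcase

# Assumed patterns for the enX, ethX and bondX names from the commit message.
TRACKED_PATTERNS = ("en*", "eth*", "bond*")


def is_tracked(interface):
    """Return True if the interface name matches one of the whitelisted patterns."""
    return any(fnmatchcase(interface, p) for p in TRACKED_PATTERNS)
```

Names such as ens3, eth0 and bond0 match, while lo or virbr0 are left out.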
Paul Cuzner [Thu, 29 Jun 2017 03:12:54 +0000 (15:12 +1200)]
ceph-rados: display fixes to several charts
health history was a mess with the light theme. The chart now uses threshold lines
(amber and red), and plots health against those lines. In addition small fixes to the capacity
chart (it was stacking values!), and the monitor status table
Paul Cuzner [Thu, 29 Jun 2017 03:09:01 +0000 (15:09 +1200)]
backend-storage: cosmetic changes to heatmap and graphs
Heatmap 'spectrum' was not showing well in the light theme - this
makes it more readable. In addition 'info' has been added to explain the
heatmap and what it represents.
Paul Cuzner [Thu, 29 Jun 2017 03:06:15 +0000 (15:06 +1200)]
frontend: layout of the pools is now by pool lining up iops/throughput charts across rows
Paul Cuzner [Thu, 29 Jun 2017 03:04:17 +0000 (15:04 +1200)]
at-a-glance : query and cosmetic changes
Multiple changes as follows;
- dashboard links (top) hover over was mis-aligned. fixed
- forecast value if negative now shows N/A
- forecast and growth queries updated
- disks near full set to 0, if there aren't any issues (instead of no value)
- added descriptions on various panels
Paul Cuzner [Thu, 29 Jun 2017 02:53:43 +0000 (14:53 +1200)]
status-panel: Update the bkgnd color to support the light theme
By default the panel just uses 'green', which with the light theme is too
dark, making the text on the panel difficult to read. This ansible step just
updates the color in the css to make the text more readable
Zack Cerza [Wed, 28 Jun 2017 22:33:56 +0000 (16:33 -0600)]
Set WHISPER_AUTOFLUSH to True
Related to https://github.com/ceph/cephmetrics/issues/45
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Wed, 28 Jun 2017 15:32:40 +0000 (09:32 -0600)]
Merge pull request #44 from ceph/wip-branto
rpm: Make devel_mode: false default in the ansible rpms
Boris Ranto [Wed, 28 Jun 2017 15:04:35 +0000 (17:04 +0200)]
rpm: Make devel_mode: false default in the ansible rpms
Signed-off-by: Boris Ranto <branto@redhat.com>
Zack Cerza [Tue, 27 Jun 2017 20:54:30 +0000 (14:54 -0600)]
Merge pull request #41 from ceph/wip-prod
Add ansible-syntax job
Zack Cerza [Tue, 27 Jun 2017 15:46:12 +0000 (09:46 -0600)]
Add ansible-syntax job
Signed-off-by: Zack Cerza <zack@redhat.com>
Boris Ranto [Tue, 27 Jun 2017 11:56:42 +0000 (13:56 +0200)]
Merge pull request #37 from ceph/wip-branto
selinux: Additional policy changes
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
Boris Ranto [Mon, 26 Jun 2017 20:12:20 +0000 (22:12 +0200)]
selinux: Additional policy changes
This was required to access whoami files inside /var/lib/ceph/osd/
directory.
Signed-off-by: Boris Ranto <branto@redhat.com>
Boris Ranto [Tue, 27 Jun 2017 09:08:37 +0000 (11:08 +0200)]
Merge pull request #34 from zmc/wip-selinux
ansible: Build and install SELinux module
Reviewed-by: Boris Ranto <branto@redhat.com>
Zack Cerza [Tue, 27 Jun 2017 00:35:59 +0000 (18:35 -0600)]
Merge pull request #26 from zmc/wip-docs
ansible: Document variables
Zack Cerza [Mon, 26 Jun 2017 21:26:35 +0000 (15:26 -0600)]
selinux: Allow collectd to write in /var/log/
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 26 Jun 2017 21:04:56 +0000 (15:04 -0600)]
ansible: Build and install SELinux module
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Tue, 27 Jun 2017 00:20:05 +0000 (18:20 -0600)]
Merge pull request #24 from zmc/wip-firewalld
ansible: Default to using firewalld's public zone
Zack Cerza [Tue, 27 Jun 2017 00:19:51 +0000 (18:19 -0600)]
Merge pull request #33 from zmc/wip-ansible-lint
Add ansible-lint job
Zack Cerza [Mon, 26 Jun 2017 19:25:00 +0000 (13:25 -0600)]
Remove trailing whitespace
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 26 Jun 2017 19:19:21 +0000 (13:19 -0600)]
Name unnamed tasks
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 26 Jun 2017 19:17:46 +0000 (13:17 -0600)]
Add ansible-lint job
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 26 Jun 2017 23:40:53 +0000 (17:40 -0600)]
Merge pull request #25 from zmc/wip-ansible-forks
ansible: Use up to 50 forks
Zack Cerza [Mon, 26 Jun 2017 23:40:41 +0000 (17:40 -0600)]
Merge pull request #31 from zmc/wip-grafana-plugins
ansible: Add grafana-piechart-panel plugin
Zack Cerza [Mon, 26 Jun 2017 23:40:09 +0000 (17:40 -0600)]
Merge pull request #35 from zmc/wip-dashboard-update
ansible: Replace dashboards by default when updating
Zack Cerza [Mon, 26 Jun 2017 22:25:35 +0000 (16:25 -0600)]
Replace dashboards by default when updating
This behavior can be disabled by setting 'replace_dashboards' to False
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 23 Jun 2017 21:45:08 +0000 (15:45 -0600)]
Document variables
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Fri, 23 Jun 2017 21:55:45 +0000 (15:55 -0600)]
ansible: Use up to 50 forks
This speeds up deployment. Ansible will use this value or the number of
hosts - whichever is smaller.
Signed-off-by: Zack Cerza <zack@redhat.com>
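The rule stated above reduces to taking the smaller of two numbers. The helper below is purely illustrative and is not part of Ansible or the playbook.

```python
def effective_forks(configured_forks, host_count):
    """Number of parallel workers actually used for a play."""
    return min(configured_forks, host_count)
```

With forks set to 50, a 12-host inventory would run 12 workers and a 200-host inventory would run 50.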
Zack Cerza [Fri, 23 Jun 2017 21:55:45 +0000 (15:55 -0600)]
Default to using firewalld's public zone
This will be appropriate for most users.
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 26 Jun 2017 18:15:09 +0000 (12:15 -0600)]
Install grafana-piechart-panel
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 26 Jun 2017 18:11:01 +0000 (12:11 -0600)]
Use grafana-cli to manage plugins
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 26 Jun 2017 19:10:38 +0000 (13:10 -0600)]
Merge pull request #32 from zmc/wip-tests
Add tox.ini with flake8 job
Zack Cerza [Mon, 26 Jun 2017 18:35:00 +0000 (12:35 -0600)]
Remove unused import
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 26 Jun 2017 18:32:48 +0000 (12:32 -0600)]
Add tox.ini with flake8 job
Signed-off-by: Zack Cerza <zack@redhat.com>
Zack Cerza [Mon, 26 Jun 2017 18:22:27 +0000 (12:22 -0600)]
Merge pull request #28 from ceph/wip-branto
Couple of cephmetric fixes
Boris Ranto [Mon, 26 Jun 2017 16:24:53 +0000 (18:24 +0200)]
rpm: Add SELinux support
Signed-off-by: Boris Ranto <branto@redhat.com>