Increase metric priorities to CRITICAL for metrics used in dashboards
During scale testing, we observed that the volume of metrics was very
high on large clusters. We compared the metrics being collected against
the complete list of metrics used in the dashboards and made the
following observations:
1. Only 17 metrics were used in the dashboards (Grafana and management UI)
2. Around 245 metrics in total were collected in the Prometheus stack
Collecting this many metrics incurs:
1. Greater CPU and memory demand for marshaling and unmarshaling
2. Greater storage volume
3. Increased per-scrape network consumption
We intend to bump all the metrics used in the Ceph monitoring dashboards
to prio_level CRITICAL and also raise the default ceph-exporter prio_level
to CRITICAL, so that Prometheus ends up with only the required metrics.
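The intended effect of the threshold can be sketched as follows. This is
only an illustration, not the ceph-exporter implementation: the numeric
priority values are assumed to mirror Ceph's perf counter levels, and the
counter names and the filter_by_priority helper are hypothetical.

```python
# Priority levels, assumed to mirror Ceph's perf counter priorities.
PRIO_CRITICAL = 10
PRIO_INTERESTING = 8
PRIO_USEFUL = 5
PRIO_UNINTERESTING = 2
PRIO_DEBUGONLY = 0


def filter_by_priority(counters, prio_limit):
    """Keep only counters whose priority is at or above prio_limit."""
    return {name: prio for name, prio in counters.items() if prio >= prio_limit}


# Hypothetical counter names, for illustration only.
counters = {
    "dashboard_metric_a": PRIO_CRITICAL,
    "dashboard_metric_b": PRIO_CRITICAL,
    "other_metric_c": PRIO_USEFUL,
    "debug_metric_d": PRIO_DEBUGONLY,
}

# With the limit at CRITICAL, only the dashboard metrics are exported.
exported = filter_by_priority(counters, PRIO_CRITICAL)
print(sorted(exported))
```

Lowering the limit in this sketch (for example to PRIO_USEFUL) widens the
exported set again, which is the trade-off described for customers who
need additional metrics.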
This is Part 1 of the effort to ask the metric implementation teams to
revisit the metric priorities.
If customers need other metrics, they can lower the ceph-exporter
prio_level and restart ceph-exporter after carefully evaluating the
storage, CPU, and networking costs.