Introduce a config option called 'mon_warn_on_pool_no_redundancy' that is
used to show a health warning if any pool in the ceph cluster is
configured with a size of 1. The user can mute/unmute the warning using
'ceph health mute/unmute POOL_NO_REDUNDANCY'.
Add standalone test to verify warning on setting pool size=1. Set the
associated warning to 'false' in ceph.conf.template under qa/tasks so
that existing tests do not break.
Fixes: https://tracker.ceph.com/issues/41666
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit
33c647e8114b37404d8d62a08c85664cea709118)
Conflicts:
PendingReleaseNotes
- Added release notes under 14.2.9
qa/standalone/mon/health-mute.sh
- Deleted the script as 'health mute/unmute' cmd is unavailable in nautilus
qa/tasks/ceph.conf.template
- Removed a flag not available in nautilus
src/common/options.cc
- Removed a flag not available in nautilus
src/osd/OSDMap.cc
ceph osd pool set <pool-name> pg_autoscale_mode warn
* The format of MDSs in `ceph fs dump` has changed.
+
+* Ceph will issue a health warning if a RADOS pool's ``size`` is set to 1
+ or in other words the pool is configured with no redundancy. This can
+ be fixed by setting the pool size to the minimum recommended value
+ with::
+
+ ceph osd pool set <pool-name> size <num-replicas>
+
+ The warning can be silenced with::
+
+ ceph config set global mon_warn_on_pool_no_redundancy false
:Default: ``0``
+``mon warn on pool no redundancy``
+
+:Description: Issue a ``HEALTH_WARN`` in cluster log if any pool is
+ configured with no replicas.
+:Type: Boolean
+:Default: ``True``
+
+
``mon cache target full warn ratio``
:Description: Position between pool's ``cache_target_full`` and
mon warn on osd down out interval zero = false
mon warn on too few osds = false
mon_warn_on_pool_pg_num_not_power_of_two = false
+ mon_warn_on_pool_no_redundancy = false
osd pool default erasure code profile = "plugin=jerasure technique=reed_sol_van k=2 m=1 ruleset-failure-domain=osd crush-failure-domain=osd"
.add_service("mon")
.set_description("issue POOL_PG_NUM_NOT_POWER_OF_TWO warning if pool has a non-power-of-two pg_num value"),
+ Option("mon_warn_on_pool_no_redundancy", Option::TYPE_BOOL, Option::LEVEL_ADVANCED)
+ .set_default(true)
+ .add_service("mon")
+ .set_description("Issue a health warning if any pool is configured with no replicas")
+ .add_see_also("osd_pool_default_size")
+ .add_see_also("osd_pool_default_min_size"),
+
Option("mon_warn_on_misplaced", Option::TYPE_BOOL, Option::LEVEL_ADVANCED)
.set_default(false)
.add_service("mgr")
d.detail.swap(detail);
}
}
+
+ // POOL_NO_REDUNDANCY
+ if (cct->_conf.get_val<bool>("mon_warn_on_pool_no_redundancy"))
+ {
+ list<string> detail;
+ for (auto it : get_pools()) {
+ if (it.second.get_size() == 1) {
+ ostringstream ss;
+ ss << "pool '" << get_pool_name(it.first)
+ << "' has no replicas configured";
+ detail.push_back(ss.str());
+ }
+ }
+ if (!detail.empty()) {
+ ostringstream ss;
+ ss << detail.size() << " pool(s) have no replicas configured";
+ auto& d = checks->add("POOL_NO_REDUNDANCY", HEALTH_WARN, ss.str());
+ d.detail.swap(detail);
+ }
+ }
}
int OSDMap::parse_osd_id_list(const vector<string>& ls, set<int> *out,