Problem:
In a stretch cluster, checking for dead CRUSH
zones hit an assert failure when one of the
dead buckets did not exist in the CRUSH map.
Solution:
Skip the nonexistent CRUSH bucket and log a
debug message instead of asserting.
Fixes: https://tracker.ceph.com/issues/63861
Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
(cherry picked from commit 5c46c482dde1073810d9d4c2f1c7f8c0b1630185)
bool really_down = false;
for (auto dbi : dead_buckets) {
const string& bucket_name = dbi.first;
- ceph_assert(osdmap.crush->name_exists(bucket_name));
+ if (!osdmap.crush->name_exists(bucket_name)) {
+ dout(10) << "CRUSH bucket " << bucket_name << " does not exist" << dendl;
+ continue;
+ }
int bucket_id = osdmap.crush->get_item_id(bucket_name);
dout(20) << "Checking " << bucket_name << " id " << bucket_id
<< " to see if OSDs are also down" << dendl;
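The skip-instead-of-assert pattern in the hunk above can be sketched in isolation as follows. This is a minimal illustration, not the Ceph implementation: `FakeCrush` and `existing_dead_bucket_ids` are hypothetical names, the `dead_buckets` type is an assumption, and the real code logs through `dout` rather than silently continuing.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the CRUSH name lookup; not the real CrushWrapper.
struct FakeCrush {
  std::map<std::string, int> ids;

  bool name_exists(const std::string& name) const {
    return ids.count(name) > 0;
  }

  int get_item_id(const std::string& name) const {
    auto it = ids.find(name);
    // In this sketch, callers must check name_exists() first.
    assert(it != ids.end());
    return it->second;
  }
};

// Collect the ids of dead buckets that still exist in the map.
// Names that are missing are skipped rather than asserted on.
// The map type used for dead_buckets is an assumption for illustration.
std::vector<int> existing_dead_bucket_ids(
    const FakeCrush& crush,
    const std::map<std::string, std::set<std::string>>& dead_buckets) {
  std::vector<int> out;
  for (const auto& dbi : dead_buckets) {
    const std::string& bucket_name = dbi.first;
    if (!crush.name_exists(bucket_name)) {
      // The patch logs at debug level 10 here before continuing.
      continue;
    }
    out.push_back(crush.get_item_id(bucket_name));
  }
  return out;
}
```

With this shape, a stale bucket name in `dead_buckets` no longer aborts the process; the loop simply moves on to the next entry.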