key/value set for a filesystem extended attributes. It effectively replaces
the old per-MDS `max_xattr_pairs_size` setting, which is now dropped.
Relevant tracker: https://tracker.ceph.com/issues/55725
+
+* Introduced a new file system flag `refuse_standby_for_another_fs` that can be
+set using the `fs set` command. When set, this flag prevents the monitors from
+assigning a standby configured for another file system (`mds_join_fs` = X) when
+no standby for the current file system is available.
+Relevant tracker: https://tracker.ceph.com/issues/61599
When failing over MDS daemons, a cluster's monitors will prefer standby daemons with
``mds_join_fs`` equal to the file system ``name`` with the failed ``rank``. If no
standby exists with ``mds_join_fs`` equal to the file system ``name``, it will
-choose an unqualified standby (no setting for ``mds_join_fs``) for the replacement,
-or any other available standby, as a last resort. Note, this does not change the
-behavior that ``standby-replay`` daemons are always selected before
-other standbys.
+choose an unqualified standby (no setting for ``mds_join_fs``) for the replacement.
+As a last resort, it will choose a standby for another file system, although this
+behavior can be disabled:
+
+::
+
+ ceph fs set <fs name> refuse_standby_for_another_fs true
+
+Note that configuring MDS file system affinity does not change the behavior that
+``standby-replay`` daemons are always selected before other standbys.
Even further, the monitors will regularly examine the CephFS file systems even when
stable to check if a standby with stronger affinity is available to replace an
#define CEPH_MDSMAP_ALLOW_STANDBY_REPLAY (1<<5) /* cluster alllowed to enable MULTIMDS */
#define CEPH_MDSMAP_REFUSE_CLIENT_SESSION (1<<6) /* cluster allowed to refuse client session
request */
+#define CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS (1<<7) /* fs is forbidden to use standby
+ for another fs */
#define CEPH_MDSMAP_DEFAULTS (CEPH_MDSMAP_ALLOW_SNAPS | \
CEPH_MDSMAP_ALLOW_MULTIMDS_SNAPS)
break;
} else if (info.join_fscid == FS_CLUSTER_ID_NONE) {
who = &info; /* vanilla standby */
- } else if (who == nullptr) {
+ } else if (who == nullptr &&
+ !fs.mds_map.test_flag(CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS)) {
who = &info; /* standby for another fs, last resort */
}
}
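Review note: the selection order in the hunk above can be exercised in isolation with a minimal sketch like the one below. It is not the actual `FSMap` code. `Standby`, `FSCID_NONE` (standing in for `FS_CLUSTER_ID_NONE`), and `pick_standby` are hypothetical names introduced only for illustration, and the loop is assumed to visit candidates in vector order.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for the Ceph types; not the real definitions.
constexpr int64_t FSCID_NONE = -1;  // plays the role of FS_CLUSTER_ID_NONE

struct Standby {
  std::string name;
  int64_t join_fscid = FSCID_NONE;  // file system this standby prefers, if any
};

// Returns the chosen standby for `fscid`, or nullptr if none qualifies.
const Standby* pick_standby(const std::vector<Standby>& standbys,
                            int64_t fscid,
                            bool refuse_standby_for_another_fs) {
  const Standby* who = nullptr;
  for (const auto& info : standbys) {
    if (info.join_fscid == fscid) {
      who = &info;  // affinity matches this file system: stop searching
      break;
    } else if (info.join_fscid == FSCID_NONE) {
      who = &info;  // vanilla standby, replaces any earlier candidate
    } else if (who == nullptr && !refuse_standby_for_another_fs) {
      who = &info;  // standby for another fs, last resort
    }
  }
  return who;
}
```

With only a standby for file system 2 available, `pick_standby(standbys, 1, false)` returns that standby, while the same call with `true` returns `nullptr`, which is the behavior the new flag is meant to enforce.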
f->dump_bool(flag_display.at(CEPH_MDSMAP_ALLOW_MULTIMDS_SNAPS), allows_multimds_snaps());
f->dump_bool(flag_display.at(CEPH_MDSMAP_ALLOW_STANDBY_REPLAY), allows_standby_replay());
f->dump_bool(flag_display.at(CEPH_MDSMAP_REFUSE_CLIENT_SESSION), test_flag(CEPH_MDSMAP_REFUSE_CLIENT_SESSION));
+ f->dump_bool(flag_display.at(CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS), test_flag(CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS));
f->close_section();
}
out << " " << flag_display.at(CEPH_MDSMAP_ALLOW_STANDBY_REPLAY);
if (test_flag(CEPH_MDSMAP_REFUSE_CLIENT_SESSION))
out << " " << flag_display.at(CEPH_MDSMAP_REFUSE_CLIENT_SESSION);
+ if (test_flag(CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS))
+ out << " " << flag_display.at(CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS);
}
void MDSMap::get_health(list<pair<health_status_t,string> >& summary,
{CEPH_MDSMAP_ALLOW_SNAPS, "allow_snaps"},
{CEPH_MDSMAP_ALLOW_MULTIMDS_SNAPS, "allow_multimds_snaps"},
{CEPH_MDSMAP_ALLOW_STANDBY_REPLAY, "allow_standby_replay"},
- {CEPH_MDSMAP_REFUSE_CLIENT_SESSION, "refuse_client_session"}
+ {CEPH_MDSMAP_REFUSE_CLIENT_SESSION, "refuse_client_session"},
+ {CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS, "refuse_standby_for_another_fs"}
};
};
WRITE_CLASS_ENCODER_FEATURES(MDSMap::mds_info_t)
ss << "client(s) already allowed to establish new session(s)";
}
}
+ } else if (var == "refuse_standby_for_another_fs") {
+ bool refuse_standby_for_another_fs = false;
+ int r = parse_bool(val, &refuse_standby_for_another_fs, ss);
+ if (r != 0) {
+ return r;
+ }
+
+ if (refuse_standby_for_another_fs) {
+ if (!(fs->mds_map.test_flag(CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS))) {
+ fsmap.modify_filesystem(
+ fs->fscid,
+ [](std::shared_ptr<Filesystem> fs)
+ {
+ fs->mds_map.set_flag(CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS);
+ });
+ ss << "set to refuse standby for another fs";
+ } else {
+ ss << "to refuse standby for another fs is already set";
+ }
+ } else {
+ if (fs->mds_map.test_flag(CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS)) {
+ fsmap.modify_filesystem(
+ fs->fscid,
+ [](std::shared_ptr<Filesystem> fs)
+ {
+ fs->mds_map.clear_flag(CEPH_MDSMAP_REFUSE_STANDBY_FOR_ANOTHER_FS);
+ });
+ ss << "allowed to use standby for another fs";
+ } else {
+ ss << "to use standby for another fs is already allowed";
+ }
+ }
} else {
ss << "unknown variable " << var;
return -EINVAL;
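Review note: the `refuse_standby_for_another_fs` branch above only modifies the map when the requested value differs from the current one, and reports a different message in each of the four cases. The sketch below reproduces that logic outside the monitor, assuming a hypothetical `FlagHolder` in place of `MDSMap` and a hypothetical helper `set_refuse_standby`. The bit value and message strings are copied from the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bit value copied from the diff; everything else is a hypothetical stand-in.
constexpr uint32_t REFUSE_STANDBY_FOR_ANOTHER_FS = 1u << 7;

struct FlagHolder {
  uint32_t flags = 0;
  bool test_flag(uint32_t f) const { return (flags & f) != 0; }
  void set_flag(uint32_t f) { flags |= f; }
  void clear_flag(uint32_t f) { flags &= ~f; }
};

// Changes the flag only when the requested value differs from the current
// state, and returns the status message for the case that was hit.
std::string set_refuse_standby(FlagHolder& map, bool refuse) {
  const bool current = map.test_flag(REFUSE_STANDBY_FOR_ANOTHER_FS);
  if (refuse) {
    if (current) {
      return "to refuse standby for another fs is already set";
    }
    map.set_flag(REFUSE_STANDBY_FOR_ANOTHER_FS);
    return "set to refuse standby for another fs";
  }
  if (!current) {
    return "to use standby for another fs is already allowed";
  }
  map.clear_flag(REFUSE_STANDBY_FOR_ANOTHER_FS);
  return "allowed to use standby for another fs";
}
```

Calling the helper twice with the same value leaves the flags untouched on the second call, which matches the patch skipping `modify_filesystem` when the flag already has the requested state.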
"|allow_new_snaps|inline_data|cluster_down|allow_dirfrags|balancer"
"|standby_count_wanted|session_timeout|session_autoclose"
"|allow_standby_replay|down|joinable|min_compat_client|bal_rank_mask"
- "|refuse_client_session|max_xattr_size "
+ "|refuse_client_session|max_xattr_size|refuse_standby_for_another_fs "
"name=val,type=CephString "
"name=yes_i_really_mean_it,type=CephBool,req=false "
"name=yes_i_really_really_mean_it,type=CephBool,req=false",