From bffe1ad2cc79f337a3c6609828ecc4344c57e9c2 Mon Sep 17 00:00:00 2001
From: Mauricio Faria de Oliveira
Date: Thu, 7 Jan 2021 19:44:44 -0300
Subject: [PATCH] osd: add osd_fast_shutdown_notify_mon option (default false)

The osd_fast_shutdown option may cause the cluster log to receive
too many entries of 'osd.X reported immediately failed by osd.Y',
depending on cluster scale.

This might be an issue for LMA stacks/tools that check ceph logs
for failed lines: they then require additional logic to filter out
an intended OSD (fast) shutdown, which might not be possible and
may require an admin to analyze.

So, add the osd_fast_shutdown_notify_mon option for the OSD to also
tell the monitor it is shutting down (as is done in slow/non-fast
shutdown) under osd_fast_shutdown.

This introduces minimal delay (the ack from the mon is required
to prevent the messages), and addresses the cluster log issue.

Note: the osd_mon_shutdown_timeout option can be used to control
the maximum amount of time waiting for the monitor ack to arrive.

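For illustration only, the intended ordering can be modelled as a small
standalone sketch. It is not the OSD.cc change itself: ShutdownConfig,
shutdown_steps and the step labels are hypothetical names made up for
this example, and osd_fast_shutdown is simply assumed to be enabled.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the two options involved; not Ceph's config type.
struct ShutdownConfig {
  bool osd_fast_shutdown = true;              // assumed enabled for this sketch
  bool osd_fast_shutdown_notify_mon = false;  // default false, per this patch
};

// Returns the ordered steps a shutdown would take under the given settings.
// The step names are illustrative labels, not Ceph function names.
std::vector<std::string> shutdown_steps(const ShutdownConfig& conf) {
  std::vector<std::string> steps;
  if (conf.osd_fast_shutdown) {
    // New behaviour: optionally tell the monitor before exiting right away.
    if (conf.osd_fast_shutdown_notify_mon) {
      steps.push_back("notify_mon_and_wait_for_ack");
    }
    steps.push_back("exit_immediately");
    return steps;
  }
  // Slow (non-fast) shutdown already tells the monitor it is going down.
  steps.push_back("notify_mon_and_wait_for_ack");
  steps.push_back("full_teardown");
  return steps;
}
```

With the default (notify off) the fast path is a single step. Enabling
the option only prepends the monitor notification, which is where the
small extra delay comes from.
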
Fixes: http://tracker.ceph.com/issues/46978
Signed-off-by: Mauricio Faria de Oliveira
(cherry picked from commit c75734729764868c5c501722fc8de08dac9ebd4a)
---
 src/common/legacy_config_opts.h | 1 +
 src/common/options.cc           | 6 ++++++
 src/osd/OSD.cc                  | 2 ++
 3 files changed, 9 insertions(+)

diff --git a/src/common/legacy_config_opts.h b/src/common/legacy_config_opts.h
index 6aa45b7e4cb11..203fe0b2ec147 100644
--- a/src/common/legacy_config_opts.h
+++ b/src/common/legacy_config_opts.h
@@ -766,6 +766,7 @@ OPTION(osd_op_history_slow_op_threshold, OPT_DOUBLE) // track the op if over thi
 OPTION(osd_target_transaction_size, OPT_INT) // to adjust various transactions that batch smaller items
 OPTION(osd_failsafe_full_ratio, OPT_FLOAT) // what % full makes an OSD "full" (failsafe)
 OPTION(osd_fast_shutdown, OPT_BOOL)
+OPTION(osd_fast_shutdown_notify_mon, OPT_BOOL) // tell mon the OSD is shutting down on osd_fast_shutdown
 OPTION(osd_fast_fail_on_connection_refused, OPT_BOOL) // immediately mark OSDs as down once they refuse to accept connections
 
 OPTION(osd_pg_object_context_cache_count, OPT_INT)
diff --git a/src/common/options.cc b/src/common/options.cc
index f74aa788a73b0..d4f8e62949632 100644
--- a/src/common/options.cc
+++ b/src/common/options.cc
@@ -3555,6 +3555,12 @@ std::vector