From 823359c76f6c94d77f0dc8bbe0b90d8150ff0529 Mon Sep 17 00:00:00 2001
From: Sage Weil
Date: Fri, 15 Nov 2019 09:31:50 -0600
Subject: [PATCH] osd: add osd_fast_shutdown option (default true)

If we get a SIGINT or SIGTERM or are deleted from the OSDMap, do a fast
shutdown by exiting immediately. This has a few important benefits:

- We immediately stop responding (binding) to any sockets, which means
  other OSDs will immediately decide we are down (and dead!). This
  minimizes IO interruption.
- We avoid the complex "clean" shutdown process, which is historically a
  source of bugs.

In reality, the only purpose of the "clean" shutdown is to try to tear down
everything in memory so we can do memory leak checking with valgrind. Set
this option to false for valgrind QA runs so we can still do that.

Note that with the new read leases in octopus, we rely on the default
behavior that an ECONNREFUSED is taken to mean that the OSD is fully dead,
so that we don't have to wait for any leases to time out. This works in
sane environments with normal IP networks, but that behavior could
conceivably be a bad idea if there are some weird network shenanigans
going on. If osd_fast_fail_on_connection_refused were disabled, then this
fast shutdown procedure might be *worse* than the clean shutdown because
we would have to wait for the heartbeat timeout.

Signed-off-by: Sage Weil
(cherry picked from commit cf352c3ac0bd87d8b7e0c52ac724f94576ae5aa7)
---
 qa/suites/fs/verify/validater/valgrind.yaml        | 2 ++
 qa/suites/rados/singleton-flat/valgrind-leaks.yaml | 2 ++
 qa/suites/rados/verify/validater/valgrind.yaml     | 2 ++
 qa/suites/rgw/multisite/valgrind.yaml              | 2 ++
 qa/suites/rgw/verify/validater/valgrind.yaml       | 2 ++
 src/common/legacy_config_opts.h                    | 1 +
 src/common/options.cc                              | 5 +++++
 src/osd/OSD.cc                                     | 6 ++++++
 src/vstart.sh                                      | 1 +
 9 files changed, 23 insertions(+)

diff --git a/qa/suites/fs/verify/validater/valgrind.yaml b/qa/suites/fs/verify/validater/valgrind.yaml
index fc4c459d1ac2e..a5c081542d809 100644
--- a/qa/suites/fs/verify/validater/valgrind.yaml
+++ b/qa/suites/fs/verify/validater/valgrind.yaml
@@ -17,6 +17,8 @@ overrides:
         mds heartbeat grace: 60
       mon:
         mon osd crush smoke test: false
+      osd:
+        osd fast shutdown: false
     valgrind:
       mon: [--tool=memcheck, --leak-check=full, --show-reachable=yes]
       osd: [--tool=memcheck]
diff --git a/qa/suites/rados/singleton-flat/valgrind-leaks.yaml b/qa/suites/rados/singleton-flat/valgrind-leaks.yaml
index e70a5e40a3f6f..c41f75fce02a9 100644
--- a/qa/suites/rados/singleton-flat/valgrind-leaks.yaml
+++ b/qa/suites/rados/singleton-flat/valgrind-leaks.yaml
@@ -23,6 +23,8 @@ overrides:
         osd max object namespace len: 64
       mon:
         mon osd crush smoke test: false
+      osd:
+        osd fast shutdown: false
     valgrind:
       mon: [--tool=memcheck, --leak-check=full, --show-reachable=yes]
       osd: [--tool=memcheck]
diff --git a/qa/suites/rados/verify/validater/valgrind.yaml b/qa/suites/rados/verify/validater/valgrind.yaml
index 8b907c25688f6..2ed6637777f1e 100644
--- a/qa/suites/rados/verify/validater/valgrind.yaml
+++ b/qa/suites/rados/verify/validater/valgrind.yaml
@@ -13,6 +13,8 @@ overrides:
         debug refs: 5
       mon:
         mon osd crush smoke test: false
+      osd:
+        osd fast shutdown: false
     log-whitelist:
       - overall HEALTH_
 # valgrind is slow.. we might get PGs stuck peering etc
diff --git a/qa/suites/rgw/multisite/valgrind.yaml b/qa/suites/rgw/multisite/valgrind.yaml
index 08fad9da02381..99489951b4a9b 100644
--- a/qa/suites/rgw/multisite/valgrind.yaml
+++ b/qa/suites/rgw/multisite/valgrind.yaml
@@ -11,6 +11,8 @@ overrides:
         osd heartbeat grace: 40
       mon:
         mon osd crush smoke test: false
+      osd:
+        osd fast shutdown: false
     valgrind:
       mon: [--tool=memcheck, --leak-check=full, --show-reachable=yes]
       osd: [--tool=memcheck]
diff --git a/qa/suites/rgw/verify/validater/valgrind.yaml b/qa/suites/rgw/verify/validater/valgrind.yaml
index 66571d34d7df3..4010ccf28b7f5 100644
--- a/qa/suites/rgw/verify/validater/valgrind.yaml
+++ b/qa/suites/rgw/verify/validater/valgrind.yaml
@@ -12,6 +12,8 @@ overrides:
         osd heartbeat grace: 40
       mon:
         mon osd crush smoke test: false
+      osd:
+        osd fast shutdown: false
     valgrind:
       mon: [--tool=memcheck, --leak-check=full, --show-reachable=yes]
       osd: [--tool=memcheck]
diff --git a/src/common/legacy_config_opts.h b/src/common/legacy_config_opts.h
index 79d9c1fa73782..7aae31e8eabfe 100644
--- a/src/common/legacy_config_opts.h
+++ b/src/common/legacy_config_opts.h
@@ -795,6 +795,7 @@ OPTION(osd_op_history_slow_op_size, OPT_U32) // Max number of slow ops
 OPTION(osd_op_history_slow_op_threshold, OPT_DOUBLE) // track the op if over this threshold
 OPTION(osd_target_transaction_size, OPT_INT) // to adjust various transactions that batch smaller items
 OPTION(osd_failsafe_full_ratio, OPT_FLOAT) // what % full makes an OSD "full" (failsafe)
+OPTION(osd_fast_shutdown, OPT_BOOL)
 OPTION(osd_fast_fail_on_connection_refused, OPT_BOOL) // immediately mark OSDs as down once they refuse to accept connections
 
 OPTION(osd_pg_object_context_cache_count, OPT_INT)
diff --git a/src/common/options.cc b/src/common/options.cc
index 6d82ed662b0d1..93e65fd2d6786 100644
--- a/src/common/options.cc
+++ b/src/common/options.cc
@@ -3864,6 +3864,11 @@ std::vector