Sage Weil [Fri, 16 Apr 2021 22:10:46 +0000 (18:10 -0400)]
Merge PR #40888 into master
* refs/pull/40888/head:
qa/tasks/cephadm: ignore --keep-logs failure
qa/tasks/cephadm: use yaml.dump_all()
qa/suites/rados/cephadm/smoke-*: use cephadm.wait_for_service
qa/tasks/cephadm: tear down cluster before gathering logs
qa/suites/rados/cephadm/smoke-roleless: test rgw-ingress
mgr/cephadm: remove virtual_ip check during scheduling
mgr/orchestrator: orch ls: leave off virtual_ip prefixlen
qa/tasks/cephadm: add wait_for_service
qa/tasks/cephadm: allow skip_monitor_stack=true
qa/tasks/cephadm: do subst_vip for cephadm.shell and .apply
qa/tasks/vip: add vip task to allocate virtual IPs
qa/suites/rados/cephadm/smoke-roleless: add rgw-ingress test case
qa/tasks/cephadm: shell: take 'all-roles' or 'all-hosts'
qa/tasks/cephadm: let cephadm.shell take string or list
Merge pull request #40732 from neha-ojha/wip-50217
common/options/global.yaml.in: increase default value of bluestore_cache_trim_max_skip_pinned
Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Mark Nelson <mnelson@redhat.com> Reviewed-by: Adam Kupczyk <akupczyk@redhat.com> Reviewed-by: Igor Fedotov <ifedotov@suse
common/options/global.yaml.in: increase default value of bluestore_cache_trim_max_skip_pinned
This option controls the rate of trimming of onodes and the earlier default of
64 has been seen to be too low for large clusters, leading to buildup of
onodes resulting in memory growth.
Increase the default value to 1000, since there are no known downsides to it.
common/options,doc: extract formatted desc into .yaml.in
* add a field named "fmt_desc", which is the description formatted using
reStructuredText. It is preserved as-is if it differs from the
desc or long_desc of an option. We can consolidate it with long_desc
in the future, and use a pretty printer with minimal support for
reStructuredText to print the formatted descriptions for a better
command-line user experience. For now, fmt_desc has
only one consumer: the "ceph_confval" sphinx extension, which extracts
and translates the options yaml file to reStructuredText, which is in
turn rendered by sphinx.
* remove unused options from the doc
- journal_queue_max_ops
- journal_queue_max_bytes
Commit 5505fc0051a3 ("common: generate legacy_config_opts.h from
.yaml.in files") inadvertently reverted a change of a default value by
adding a second "default" key with the old value. This was corrected
in commit 75e07f8638ef ("common/options/global: correct default of
auth_mon_ticket_ttl"), but highlights that mis-merging a yaml file
is rather easy.
To prevent this happening again, fail the build if duplicate keys
exist in any of src/common/options/*.yaml.in files.
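A minimal sketch of such a duplicate-key check, assuming PyYAML is available. The class and function names are hypothetical and this is not the check that was added to the build. A stock YAML loader silently keeps the last of two identical keys, so the loader is subclassed to raise instead:

```python
import yaml


class UniqueKeyLoader(yaml.SafeLoader):
    """SafeLoader that raises on a repeated mapping key (hypothetical name)."""

    def construct_mapping(self, node, deep=False):
        seen = set()
        # node.value is a list of (key_node, value_node) pairs.
        for key_node, _value_node in node.value:
            key = self.construct_object(key_node, deep=deep)
            if key in seen:
                raise yaml.constructor.ConstructorError(
                    None, None,
                    f"duplicate key {key!r}", key_node.start_mark)
            seen.add(key)
        return super().construct_mapping(node, deep=deep)


def check_no_duplicate_keys(text):
    """Parse YAML text; raises a yaml.YAMLError subclass on a duplicate key."""
    return yaml.load(text, Loader=UniqueKeyLoader)
```

A build step could call check_no_duplicate_keys() on each options file and exit non-zero on yaml.YAMLError. An option with two "default" keys, as in the mis-merge described above, would then be rejected instead of loading with whichever value came last.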
Sage Weil [Thu, 15 Apr 2021 22:55:00 +0000 (17:55 -0500)]
qa/tasks/cephadm: tear down cluster before gathering logs
We don't always stop all services, because teuthology doesn't know about
things it didn't start. Use rm-cluster to tear things down, but do not
remove the logs themselves. After we get logs, we'll clean up completely.
Sage Weil [Fri, 16 Apr 2021 12:14:28 +0000 (08:14 -0400)]
Merge PR #40870 into master
* refs/pull/40870/head:
auth/cephx: make KeyServer::build_session_auth_info() less confusing
auth/cephx: cap ticket validity by expiration of "next" key
auth/cephx: drop redundant KeyServerData::get_service_secret() overload
Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com> Reviewed-by: Adam King <adking@redhat.com> Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com> Reviewed-by: Michael Fritch <mfritch@suse.com>
Patrick Donnelly [Fri, 16 Apr 2021 04:06:31 +0000 (21:06 -0700)]
Merge PR #40539 into master
* refs/pull/40539/head:
cephfs-top: set the cursor to be invisible
cephfs-top: self-adapt the display according to the window size
cephfs-top: use the default window object from curses.wrapper()
cephfs-top: improve the output
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
Patrick Donnelly [Fri, 16 Apr 2021 04:03:19 +0000 (21:03 -0700)]
Merge PR #39660 into master
* refs/pull/39660/head:
qa: Update the mdsmap schema in mgr/dashboard/test_health.py
doc: add lsflags command to Administrative Commands document
qa: test fs lsflags command
mon: add command to print fs flags
mds: print each flag value
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
auth/cephx: make KeyServer::build_session_auth_info() less confusing
The second KeyServer::build_session_auth_info() overload is used only
by the monitor, for mon <-> mon authentication. The monitor passes in
service_secret (mon secret) and secret_id (-1). The TTL is irrelevant
because there is no rotation.
However the signature doesn't make it obvious. Clarify that
service_secret and secret_id are input parameters and info is the only
output parameter.
To silence the health warning "mons are allowing insecure global_id
reclaim", which prevents the cluster from being active+clean. A couple of
tests expect a warning-free cluster before they start.
This option is enabled by default to appease old clients, but for
most upstream testing we can just disable it.
auth/cephx: cap ticket validity by expiration of "next" key
If auth_mon_ticket_ttl is increased by several times as done in
commit 522a52e6c258 ("auth/cephx: rotate auth tickets less often"),
active clients eventually get stuck because the monitor sends out an
auth ticket with a bogus validity. The ticket is secured with the
"current" secret that is scheduled to expire according to the old TTL,
but the validity of the ticket is set to the new TTL. As a result,
the client simply doesn't attempt to renew, letting the secrets rotate
potentially more than once. When that happens, the client first hits
auth authorizer errors as it tries to renew service tickets and when
it finally gets to renewing the auth ticket, it hits the insecure
global_id reclaim wall.
Cap TTL by expiration of "next" key -- the "current" key may be
milliseconds away from expiration and still be used, legitimately.
Do it in KeyServerData alongside key rotation code and propagate the
capped TTL to the upper layer.
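The capping rule can be sketched as follows. This is an illustration in Python, not the KeyServerData code; the function name, parameters, and numbers are hypothetical, and times are plain seconds:

```python
def capped_ticket_ttl(now, requested_ttl, next_key_expires):
    """Return a ticket validity that does not outlive the "next" key.

    now              -- current time, in seconds
    requested_ttl    -- configured ticket TTL, in seconds
    next_key_expires -- expiration time of the "next" rotating key
    """
    remaining = next_key_expires - now
    # Never hand out a negative validity if the key is already expired.
    return max(0, min(requested_ttl, remaining))
```

With a requested TTL of 7200 and a "next" key that expires 3600 seconds from now, the ticket validity is capped at 3600, so the client attempts a renewal before the secrets it depends on have rotated away.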
The left-hand operand is promoted to off_t, which is a signed
integer, while rgw_max_chunk_size will be unsigned after the
yaml-to-cxx migration. So cast it to `off_t` before comparing
them.
The same applies to rgw_copy_obj_progress_every_bytes.
common: generate legacy_config_opts.h from .yaml.in files
* add a setting named "with_legacy" to .yaml.in files, so
each option with a true "with_legacy" will have an entry
in legacy_config_opts.h.
* preserve the comments from legacy_config_opts.h in .yaml.in.
Some of them are solely for developers, but some of them are
good reading for users as well. We can use them for the "desc"
field in a follow-up change.
* move common/legacy_config_opts.h to common/options/legacy_config_opts.h
as legacy_config_opts.h is "closer" to the options directory
than other source files under src/common.
* update y2c.py to generate separate .h files which are in turn
included by legacy_config_opts.h
* add a target named "legacy-option-headers", and let
some targets depend on it so that these headers generated by
y2c.py can be generated before the .cc files including them
are compiled.
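A rough sketch of the "with_legacy" selection, assuming the options have already been parsed into a list of dicts. The function names and the LEGACY_OPTION(...) line format are placeholders, not the actual output of y2c.py:

```python
def legacy_option_names(options):
    """Return the names of options whose "with_legacy" is true."""
    return [opt["name"] for opt in options if opt.get("with_legacy", False)]


def render_legacy_header(options):
    """Render one header line per legacy option.

    The LEGACY_OPTION(...) form is a placeholder for whatever entry
    the real generator writes.
    """
    return "\n".join(
        f"LEGACY_OPTION({name})" for name in legacy_option_names(options))
```

Options that omit "with_legacy" are treated as false here, so only options that explicitly opt in get an entry in the generated header.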
test/cls_cas: allow multi hobjects tracked by cls_cas
In d2737fd41a146e8efe3162cdc39845226bd5a756, we started to use a multiset
for tracking the references of an hobject for snapshot support, because the
same hobject maps to multiple snapshots and we don't want to consider
different snapshots as the same entry tracked by cls_cas.
But cls_cas.dup_get() still tries to verify that the `get` operation
is able to dedup the same referenced "source". This no longer applies
to the "by_object" chunk ref type.
Since we cannot check or choose the chunk ref type used by the OSD from the
client of cls_cas, in this change cls_cas.dup_get() is updated
to adapt to the change solely for "by_object". Otherwise we could skip
this test for the "by_object" type and/or define another test for other
chunk ref types.
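The difference between set and multiset tracking can be illustrated in Python. This is only a sketch of the idea, not the cls_cas implementation; the class and method names are hypothetical:

```python
from collections import Counter


class MultisetRefs:
    """Reference tracker where every get() adds one reference.

    A repeated get() from the same source is counted again rather
    than deduplicated, as a plain set would do.
    """

    def __init__(self):
        self._refs = Counter()

    def get(self, source):
        self._refs[source] += 1

    def put(self, source):
        if self._refs[source] == 0:
            raise KeyError(source)
        self._refs[source] -= 1
        if self._refs[source] == 0:
            del self._refs[source]

    def count(self):
        """Total number of outstanding references."""
        return sum(self._refs.values())
```

With this kind of tracker, two get() calls from the same source leave two references that need two put() calls to release, which is why a test that expects a duplicate get to be deduplicated no longer holds.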
J. Eric Ivancich [Wed, 14 Apr 2021 17:55:22 +0000 (13:55 -0400)]
rgw: during reshard lock contention, adjust logging
When RGW fails to get a lock on a reshard log, we log it in such a way
that it looks like an error. Instead we'll make sure that the log
message is informational.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
* CVE-2021-20288:
qa/standalone: default to disable insecure global id reclaim
qa/suites/upgrade/octopus-x: disable insecure global_id reclaim health warnings
qa/tasks/ceph[adm].conf[.template]: disable insecure global_id reclaim health alerts
cephadm: set auth_allow_insecure_global_id_reclaim for mon on bootstrap
mon/HealthMonitor: raise AUTH_INSECURE_GLOBAL_ID_RENEWAL[_ALLOWED]
auth/cephx: ignore CEPH_ENTITY_TYPE_AUTH in requested keys
auth/cephx: rotate auth tickets less often
mon: fail fast when unauthorized global_id (re)use is disallowed
auth/cephx: option to disallow unauthorized global_id (re)use
auth/cephx: make cephx_decode_ticket() take a const ticket_blob
auth/AuthServiceHandler: keep track of global_id and whether it is new
auth/AuthServiceHandler: build_cephx_response_header() is cephx-specific
auth/AuthServiceHandler: drop unused start_session() args
mon/MonClient: drop global_id arg from _add_conn() and _add_conns()
mon/MonClient: reset auth state in shutdown()
mon/MonClient: preserve auth state on reconnects
mon/MonClient: claim active_con's auth explicitly
mon/MonClient: resurrect "waiting for monmap|config" timeouts
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>