Greg Farnum [Tue, 30 Nov 2021 18:29:46 +0000 (18:29 +0000)]
osd: Check range_blocklist in is_blocklisted(): we actually blocklist ranges
Carry a parallel map from cidr addresses to a new
range_bits class (stored entirely as ephemeral state) so that we
don't need to re-compute masks and bit mappings too often, and to
separate out the unpleasant ipv6 bit mapping logic. Then check
against those with range_bits::matches() the same way we check
for equality on specific-entity matches. Nice and simple loops!
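
To illustrate the idea of caching a mask per CIDR entry and matching against it, here is a minimal Python sketch. It is not the C++ range_bits class from this commit: the class name RangeBits, its fields, and the is_blocklisted() helper below are hypothetical, and Python's ipaddress module stands in for the hand-written IPv4/IPv6 bit mapping.

```python
import ipaddress


class RangeBits:
    """Hypothetical analogue of a cached CIDR mask (not Ceph's range_bits)."""

    def __init__(self, cidr: str):
        net = ipaddress.ip_network(cidr, strict=False)
        # Compute the mask and masked prefix once, at construction time.
        self.version = net.version
        self.mask = int(net.netmask)
        self.prefix = int(net.network_address)

    def matches(self, addr: str) -> bool:
        ip = ipaddress.ip_address(addr)
        if ip.version != self.version:
            return False  # an IPv4 address never matches an IPv6 range
        return (int(ip) & self.mask) == self.prefix


def is_blocklisted(addr: str, exact_blocklist: set, range_blocklist: list) -> bool:
    # Exact-entity check first, then a simple loop over the ranges.
    if addr in exact_blocklist:
        return True
    return any(r.matches(addr) for r in range_blocklist)
```

Keeping the mask and prefix as plain integers means each lookup is one AND and one comparison per range, which is the cost the commit message is trying to keep low.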
Greg Farnum [Tue, 2 Nov 2021 00:38:50 +0000 (00:38 +0000)]
osdmap: convert get_blocklist() to provide the entity/IP and range blocklists
Providing a non-range-aware blocklist accessor would just be
asking for trouble, so don't.
The ugly part of this is how the Objecter is currently just
throwing the range blocklist on the end of its own list. The in-tree
callers are okay with this, and I'd like to look at removing the
blocklist events API from librados entirely -- it exposes "OSD-only"
state to clients and, as evidenced by this patch series, is not
particularly stable.
Greg Farnum [Wed, 8 Dec 2021 21:32:58 +0000 (21:32 +0000)]
mon: take blocklist ranges as a subcommand, not implicitly from address format
I discovered in testing with CephFS that this tends to interpret client IPs
(which don't have ports, but do have nonces) as invalid ranges. So give it
a separate input keyword that has to be applied first.
Greg Farnum [Mon, 25 Oct 2021 19:53:04 +0000 (19:53 +0000)]
msg: common: allow entity_addr_t to store a CIDR address range
This required very little change to the existing code. Use with care, because
existing code expects an IP address instead of a range, but it saves on
writing a new parser.
Greg Farnum [Tue, 2 Nov 2021 00:34:34 +0000 (00:34 +0000)]
mds: Server: Simplify apply_blocklist and usage of the OSDMap's blocklist
This previously re-implemented a bunch of the OSDMap::is_blocklisted()
function, and wasn't actually any faster to run -- the list of new blocklists
may be smaller than the full set, but OSDMap::blocklist is an unordered_map
of constant lookup time so it shouldn't slow things down. More importantly,
this is much simpler, less likely to be buggy from duplicate code, and lets
the MDS off the hook for dealing with range blocklisting.
Greg Farnum [Mon, 1 Nov 2021 23:52:53 +0000 (23:52 +0000)]
client: Simplify blocklist tracking and interface
I'm not sure if the blocklist events tracking in Client.cc was ever
the simplest way to track that state, but it definitely isn't now. We
can just hand our addr_vec to the OSDMap and ask it -- it handles
version compatibility issues and, happily, means the Client doesn't
need to learn to deal with ranges directly.
Venky Shankar [Tue, 29 Mar 2022 13:18:06 +0000 (09:18 -0400)]
mount.ceph: remove `ms_mode' mount option when switching to old-syntax
... and switch to using v1 addresses (if users haven't specified those
explicitly). Kernel versions < 5.11 do not understand the `ms_mode' mount
option, which would result in a mount failure.
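
The option filtering described above can be sketched as follows. mount.ceph itself is written in C; this Python function and its name are hypothetical and only show the effect on a comma-separated option string.

```python
def strip_ms_mode(options: str) -> str:
    """Drop any ms_mode=... entry from a comma-separated mount option string.

    Hypothetical helper for illustration; not the mount.ceph implementation.
    """
    kept = [
        opt
        for opt in options.split(",")
        if opt and not opt.startswith("ms_mode=")
    ]
    return ",".join(kept)
```

For example, strip_ms_mode("name=admin,ms_mode=crc,noatime") returns "name=admin,noatime".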
librbd/cache/pwl: avoid inconsistencies in ImageCacheState
When empty and/or clean bools are updated in I/O handling code paths,
ImageCacheState becomes inconsistent for a short while: e.g. with clean
transitioned to true, dirty_bytes counter could still be positive
because the counters are updated only in periodic_stats(). Move to
updating the counters in update_image_cache_state(Context*) to avoid
this.
update_image_cache_state(Context*) now requires m_lock -- most call
sites already hold it anyway. The only problematic call site was
AbstractWriteLog::shut_down() callback chain: perf_stop() needed to
be moved to the very end since perf counters must be alive now for
update_image_cache_state() to work.
Don't override expect_op_work_queue() in unit tests: completing
context in the same thread now results in a deadlock on m_lock in
all test cases that call AbstractWriteLog::init().
get_json_format() and create_image_cache_state() attempt to get
particular keys which could result in an unhandled std::runtime_error
exception. Conversely, ImageCacheState constructor just swallows that
exception which could leave the newly constructed object incorrectly
initialized. Avoid doing parsing in the constructor and introduce
init_from_config() and init_from_metadata() methods instead.
While at it, move everything out from under "persistent_cache" key.
Also fix init_state_json_write test case which stopped working now
that types are enforced by json_spirit.
Yin Congmin [Tue, 29 Mar 2022 08:59:05 +0000 (16:59 +0800)]
librbd/cache/pwl: add basic metrics to ImageCacheState
Add basic metrics to ImageCacheState and persist them, including
allocated_bytes, cached_bytes, dirty_bytes, free_bytes and hit/miss
info.
Leverage periodic_stats() timer to call update_image_cache_state.
In order to avoid outputting too much debug information, the original
statistics output log level is changed to 5.
Switch to json_spirit for encoding because encode_json encodes bool as
"true"/"false" string.
Remove rbd_persistent_cache_log_periodic_stats option because we need
to always update cache state.
[ idryomov: add cached_bytes and hits_partial; report misses and
miss_bytes instead of respective totals; naming ]
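
The bool-encoding concern in this commit can be shown with a small Python sketch. It uses Python's json module rather than json_spirit or encode_json, and the field names are only examples taken from the message above.

```python
import json

# A bool encoded as a real JSON boolean round-trips with its type intact.
typed = json.dumps({"clean": True, "dirty_bytes": 0})

# A bool encoded as a string is just text to the consumer, and a
# non-empty string such as "false" is truthy when tested naively.
stringly = json.dumps({"clean": "false"})

typed_clean = json.loads(typed)["clean"]        # True (a bool)
stringly_clean = json.loads(stringly)["clean"]  # "false" (a str)
naive_check = bool(stringly_clean)              # True, which is misleading
```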
cmake: move rgw_lua_request.cc from rgw_common target to rgw_a
resolves a clang linker error where `rgw::lua::request::RequestLog()`
from rgw_lua_request.cc (in rgw_common) looks for `rgw_log_op()` from
rgw_log.cc (in rgw_a)
rgw_a depends on rgw_common, not the other way around. so this moves
rgw_lua_request.cc into the same target as rgw_log.cc
lua is now a public dependency of rgw_common so it's not hidden from
rgw_a or unit tests
Melissa Li [Wed, 23 Mar 2022 15:38:37 +0000 (11:38 -0400)]
cephadm: show error message if private registry credentials not provided
Raise UnauthorizedRegistryError in `_pull_image` if user tries to pull from a private registry without authentication, handle error in `command_bootstrap`, `command_adopt`, `command_pull`
Fixes: https://tracker.ceph.com/issues/55015
Signed-off-by: Melissa Li <melissali@redhat.com>
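
A minimal sketch of the raise-then-handle pattern this commit describes. Only the exception name and the command_pull name come from the message above; the run_pull callable, the "unauthorized" substring check, and the return codes are assumptions made for illustration and are not cephadm's actual code.

```python
import sys


class UnauthorizedRegistryError(Exception):
    """Raised when a registry rejects an unauthenticated pull."""


def pull_image(image, run_pull):
    # run_pull is a hypothetical callable returning (stdout, stderr, returncode).
    _out, err, code = run_pull(image)
    if code == 0:
        return
    # Assumption: the container engine reports "unauthorized" in stderr.
    if "unauthorized" in err.lower():
        raise UnauthorizedRegistryError(
            f"Failed to pull {image}: registry credentials are required"
        )
    raise RuntimeError(f"Failed to pull {image}: {err.strip()}")


def command_pull(image, run_pull):
    try:
        pull_image(image, run_pull)
    except UnauthorizedRegistryError as e:
        # Show a clear message instead of a traceback.
        print(f"Error: {e}", file=sys.stderr)
        return 1
    return 0
```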
Laura Flores [Thu, 7 Apr 2022 22:20:14 +0000 (22:20 +0000)]
mgr/telemetry: anonymize daemons in telemetry perf_counters
In the telemetry perf channel we collect 'perf_counters' of individual daemons.
The monitors appear with their full name, which includes the host name.
The host name part must be anonymized.
To err on the safe side, I have anonymized all daemons except for osds,
since they are not attached to host names.
Fixes: https://tracker.ceph.com/issues/55229
Signed-off-by: Laura Flores <lflores@redhat.com>
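
The rule described above (leave osds alone, anonymize every other daemon name) could look roughly like this. The commit message does not say how names are anonymized, so the salted SHA-256 digest, the function name, and the salt parameter are all assumptions for illustration.

```python
import hashlib


def anonymize_daemon_name(name: str, salt: str) -> str:
    """Return name unchanged for osds, otherwise hide the daemon id.

    Hypothetical helper; the hashing scheme is an assumption, not the
    telemetry module's actual method.
    """
    daemon_type, _, daemon_id = name.partition(".")
    if daemon_type == "osd":
        return name  # osd ids are numeric and not tied to host names
    digest = hashlib.sha256((salt + daemon_id).encode()).hexdigest()[:12]
    return f"{daemon_type}.{digest}"
```

With this sketch, "osd.3" is returned as-is, while "mon.host-a" becomes "mon." followed by twelve hex characters that do not contain the host name.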
Nizamudeen A [Wed, 6 Apr 2022 07:39:26 +0000 (13:09 +0530)]
build: install-deps failing in docker build
install-deps.sh was failing in our docker build due to the recent change in
the script. Failure can be seen here: https://github.com/rhcs-dashboard/ceph-dev/runs/5844502455?check_suite_focus=true#step:3:2586
After:
-f {json,json-pretty,xml,xml-pretty,plain,yaml}, --format {json,json-pretty,xml,xml-pretty,plain,yaml}
Note: yaml is only valid for orch commands
Fixes: https://tracker.ceph.com/issues/53895
Signed-off-by: Laura Flores <lflores@redhat.com>
Merge pull request #45187 from rhcs-dashboard/update-monitoring-stack-versions
mgr/cephadm: update monitoring stack versions
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: p-se <NOT@FOUND>