This was attempted in commit 69a7ed4eab36 ("run-make-check: enable
WITH_RBD_RWL when WITH_PMEM is true") but never completed. We soon
bumped the requirement on libpmem, so WITH_SYSTEM_PMDK=ON wouldn't
have worked anyway.
Enable the RWL mode conditionally based on the WITH_RBD_RWL variable.
Enable the SSD mode unconditionally as it has no special dependencies
and can be built on any architecture.
test/encoding/check-generated.sh: show diff if binary reencode check fails
Take bf0b161115aa ("test/encoding/check-generated.sh: show diff if cmp
fails") a bit further. Suggesting "cmp $tmp1 $tmp2" isn't very helpful
since cmp would report just the mismatch offset.
librbd/cache/pwl: WriteLogCacheEntry constructor must initialize flags
Initializing the individual bit field members leaves the remaining two
bits uninitialized and that garbage state gets persisted.
In general, using bit fields in a structure where the layout actually
matters is not desirable. Even with a few single bits, such as here,
their order, strictly speaking, is not guaranteed:
An implementation may allocate any addressable storage unit large
enough to hold a bit-field. If enough space remains, a bit-field
that immediately follows another bit-field in a structure shall be
packed into adjacent bits of the same unit. If insufficient space
remains, whether a bit-field that does not fit is put into the next
unit or overlaps adjacent units is implementation-defined. The
order of allocation of bit-fields within a unit (high-order to
low-order or low-order to high-order) is implementation-defined.
The alignment of the addressable storage unit is unspecified.
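One way to sidestep both problems is to persist a single fixed-width integer and manipulate it with explicit masks. The following is only a sketch: `EntryFlags`, the mask names and the bit positions are assumptions made for illustration, not the actual WriteLogCacheEntry layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag masks; names and bit positions are assumptions made
// for illustration, not the on-disk pwl format.
constexpr std::uint8_t FLAG_ENTRY_VALID = 1u << 0;
constexpr std::uint8_t FLAG_SYNC_POINT  = 1u << 1;
constexpr std::uint8_t FLAG_SEQUENCED   = 1u << 2;

struct EntryFlags {
  // One byte with a defined value for all eight bits, so unused bits are
  // persisted as zeros rather than as whatever happened to be in memory.
  std::uint8_t flags = 0;

  void set(std::uint8_t mask, bool on) {
    flags = static_cast<std::uint8_t>(on ? (flags | mask) : (flags & ~mask));
  }
  bool test(std::uint8_t mask) const { return (flags & mask) != 0; }
};

// Builds the flags for a hypothetical valid, sequenced entry.
inline EntryFlags make_sequenced_entry_flags() {
  EntryFlags f;
  f.set(FLAG_ENTRY_VALID, true);
  f.set(FLAG_SEQUENCED, true);
  assert(!f.test(FLAG_SYNC_POINT));  // untouched bits stay clear
  return f;
}
```

With masks, the position of every bit is fixed by the source rather than by the compiler, and the default member initializer covers the bits that have no name yet.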
librbd/cache/pwl: remove RBD_FEATURE_DIRTY_CACHE check in DiscardRequest
"m_image_ctx.features &&RBD_FEATURE_DIRTY_CACHE" is obviously wrong
because it would pretty much always be true. However, even if bitwise
AND was used, this check would still be dead because DiscardRequest is
only invoked if RBD_FEATURE_DIRTY_CACHE is enabled:
  int invalidate_cache(ImageCtx *ictx)
  {
    ...
    // Delete writeback cache if it is not initialized
    if ((!ictx->exclusive_lock ||
         !ictx->exclusive_lock->is_lock_owner()) &&
        ictx->test_features(RBD_FEATURE_DIRTY_CACHE)) {
      C_SaferCond ctx3;
      ictx->plugin_registry->discard(&ctx3);
      r = ctx3.wait();
    }
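To illustrate why the logical AND is almost always true, here is a minimal sketch. The feature constants and their values are hypothetical and are not the real RBD_FEATURE_* definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits; the real RBD_FEATURE_* values differ.
constexpr std::uint64_t FEATURE_LAYERING    = 1ULL << 0;
constexpr std::uint64_t FEATURE_DIRTY_CACHE = 1ULL << 1;

// Logical AND: true whenever both operands are nonzero, regardless of
// which bits are set.
inline bool check_logical(std::uint64_t features, std::uint64_t flag) {
  return features && flag;
}

// Bitwise AND: true only if the requested bit is actually set.
inline bool check_bitwise(std::uint64_t features, std::uint64_t flag) {
  return (features & flag) != 0;
}

inline void feature_check_example() {
  const std::uint64_t features = FEATURE_LAYERING;  // cache bit NOT set
  assert(check_logical(features, FEATURE_DIRTY_CACHE));   // wrongly true
  assert(!check_bitwise(features, FEATURE_DIRTY_CACHE));  // correctly false
}
```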
librbd/cache/pwl: don't crash if cache file removal fails
The non-ec overload will throw fs::filesystem_error on any error
(e.g. EPERM due to unprivileged "rbd persistent-cache invalidate"
being brought up against a privileged workload).
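A minimal sketch of the non-throwing alternative, assuming std::filesystem (C++17). The helper name `remove_cache_file` and its return convention are hypothetical and not taken from the pwl code.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not the pwl code): removes a cache file and
// returns 0 on success or if the file was already gone, or a negative
// error value on failure, instead of letting an exception propagate.
inline int remove_cache_file(const fs::path& path) {
  assert(!path.empty());  // caller is expected to pass a real path
  std::error_code ec;
  fs::remove(path, ec);   // error_code overload: noexcept, reports via ec
  if (ec) {
    return -ec.value();   // on POSIX systems this is an errno value
  }
  return 0;
}
```

The caller can then log the failure and carry on, which is the behaviour the commit title asks for.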
Yin Congmin [Wed, 22 Dec 2021 07:07:11 +0000 (15:07 +0800)]
librbd/cache/pwl: rename persistent cache key
librbd "internal" metadata keys were changed to use the ".rbd" prefix.
Change the persistent cache key to use ".rbd" too.
The persistent cache key is currently named IMAGE_CACHE_STATE. Since
this key is planned to be used outside the pwl directory, rename it to
the clearer PERSISTENT_CACHE_STATE.
Venky Shankar [Tue, 29 Mar 2022 13:18:06 +0000 (09:18 -0400)]
mount.ceph: remove `ms_mode' mount option when switching to old-syntax
... and switch to using v1 addresses (if users haven't specified those
explicitly). Kernel versions older than 5.11 do not understand the
`ms_mode' mount option, which would result in a mount failure.
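Dropping one option from a comma-separated mount option string could look roughly like the sketch below. `strip_mount_option` is a hypothetical helper written for illustration; it is not the mount.ceph implementation and does not model the switch to v1 addresses.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not the mount.ceph code: returns `opts` (a
// comma-separated mount option string) without any `key` or `key=value`
// entry. Empty items are dropped as well.
inline std::string strip_mount_option(const std::string& opts,
                                      const std::string& key) {
  std::string out;
  std::size_t pos = 0;
  while (pos <= opts.size()) {
    std::size_t end = opts.find(',', pos);
    if (end == std::string::npos) end = opts.size();
    const std::string item = opts.substr(pos, end - pos);
    const std::string name = item.substr(0, item.find('='));
    if (!item.empty() && name != key) {
      if (!out.empty()) out += ',';
      out += item;
    }
    pos = end + 1;
  }
  return out;
}

inline void strip_mount_option_example() {
  assert(strip_mount_option("name=admin,ms_mode=crc,noatime", "ms_mode") ==
         "name=admin,noatime");
}
```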
librbd/cache/pwl: avoid inconsistencies in ImageCacheState
When empty and/or clean bools are updated in I/O handling code paths,
ImageCacheState becomes inconsistent for a short while: e.g. with clean
transitioned to true, dirty_bytes counter could still be positive
because the counters are updated only in periodic_stats(). Move to
updating the counters in update_image_cache_state(Context*) to avoid
this.
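The idea of updating the bools and the counters in one place can be sketched as follows. `CacheStateSketch` is a hypothetical, heavily simplified stand-in; only the field names are borrowed from this message.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical, simplified stand-in for the cache state; the class and
// its interface are assumptions made for illustration.
class CacheStateSketch {
public:
  struct Snapshot {
    bool empty;
    bool clean;
    std::uint64_t dirty_bytes;
  };

  // Updates the flags and the counter in one critical section, so a
  // reader never observes clean == true together with dirty_bytes > 0.
  void update(std::uint64_t cached_bytes, std::uint64_t dirty_bytes) {
    assert(dirty_bytes <= cached_bytes);
    std::lock_guard<std::mutex> lock(m_lock);
    m_state.empty = (cached_bytes == 0);
    m_state.clean = (dirty_bytes == 0);
    m_state.dirty_bytes = dirty_bytes;
  }

  Snapshot get() const {
    std::lock_guard<std::mutex> lock(m_lock);
    return m_state;
  }

private:
  mutable std::mutex m_lock;
  Snapshot m_state{true, true, 0};
};
```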
update_image_cache_state(Context*) now requires m_lock -- most call
sites already hold it anyway. The only problematic call site was
AbstractWriteLog::shut_down() callback chain: perf_stop() needed to
be moved to the very end since perf counters must be alive now for
update_image_cache_state() to work.
Don't override expect_op_work_queue() in unit tests: completing
context in the same thread now results in a deadlock on m_lock in
all test cases that call AbstractWriteLog::init().
get_json_format() and create_image_cache_state() attempt to get
particular keys which could result in an unhandled std::runtime_error
exception. Conversely, ImageCacheState constructor just swallows that
exception which could leave the newly constructed object incorrectly
initialized. Avoid doing parsing in the constructor and introduce
init_from_config() and init_from_metadata() methods instead.
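The two-phase pattern can be sketched as follows. This is an illustration only: a plain string map stands in for the parsed JSON metadata, and `StateSketch` and the "mode" and "size" keys are assumptions, not the real ImageCacheState interface.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical sketch: the class, its keys and the string map standing
// in for parsed JSON are all assumptions.
class StateSketch {
public:
  StateSketch() = default;  // trivial: no parsing, so it cannot fail

  // Returns false and leaves the object untouched if a key is missing
  // or malformed, rather than throwing or half-initializing.
  bool init_from_metadata(const std::map<std::string, std::string>& md) {
    assert(m_mode.empty());  // expected to run on a fresh object
    auto mode = md.find("mode");
    auto size = md.find("size");
    if (mode == md.end() || size == md.end()) {
      return false;
    }
    std::uint64_t parsed = 0;
    try {
      std::size_t used = 0;
      parsed = std::stoull(size->second, &used);
      if (used != size->second.size()) return false;  // trailing garbage
    } catch (const std::exception&) {
      return false;  // not a number, or out of range
    }
    m_mode = mode->second;
    m_size = parsed;
    return true;
  }

  const std::string& mode() const { return m_mode; }
  std::uint64_t size() const { return m_size; }

private:
  std::string m_mode;
  std::uint64_t m_size = 0;
};
```

Because the members are assigned only after every lookup and conversion has succeeded, a failed init leaves the object in its default state and the caller sees the failure as a return value.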
While at it, move everything out from under "persistent_cache" key.
Also fix init_state_json_write test case which stopped working now
that types are enforced by json_spirit.
Yin Congmin [Tue, 29 Mar 2022 08:59:05 +0000 (16:59 +0800)]
librbd/cache/pwl: add basic metrics to ImageCacheState
Add basic metrics to ImageCacheState and persist them, including
allocated_bytes, cached_bytes, dirty_bytes, free_bytes and hit/miss
info.
Leverage periodic_stats() timer to call update_image_cache_state.
In order to avoid outputting too much debug information, the original
statistics output log level is changed to 5.
Switch to json_spirit for encoding because encode_json encodes bool as
"true"/"false" string.
Remove rbd_persistent_cache_log_periodic_stats option because we need
to always update cache state.
[ idryomov: add cached_bytes and hits_partial; report misses and
miss_bytes instead of respective totals; naming ]
cmake: move rgw_lua_request.cc from rgw_common target to rgw_a
resolves a clang linker error where `rgw::lua::request::RequestLog()`
from rgw_lua_request.cc (in rgw_common) looks for `rgw_log_op()` from
rgw_log.cc (in rgw_a)
rgw_a depends on rgw_common, not the other way around. so this moves
rgw_lua_request.cc into the same target as rgw_log.cc
lua is now a public dependency of rgw_common so it's not hidden from
rgw_a or unit tests
Melissa Li [Wed, 23 Mar 2022 15:38:37 +0000 (11:38 -0400)]
cephadm: show error message if private registry credentials not provided
Raise UnauthorizedRegistryError in `_pull_image` if the user tries to pull from a private registry without authentication, and handle the error in `command_bootstrap`, `command_adopt` and `command_pull`.
Fixes: https://tracker.ceph.com/issues/55015
Signed-off-by: Melissa Li <melissali@redhat.com>
Laura Flores [Thu, 7 Apr 2022 22:20:14 +0000 (22:20 +0000)]
mgr/telemetry: anonymize daemons in telemetry perf_counters
In the telemetry perf channel we collect 'perf_counters' of individual daemons.
The monitors appear with their full name, which includes the host name.
The host name part must be anonymized.
To err on the safe side, I have anonymized all daemons except for osds,
since they are not attached to host names.
Fixes: https://tracker.ceph.com/issues/55229
Signed-off-by: Laura Flores <lflores@redhat.com>
Nizamudeen A [Wed, 6 Apr 2022 07:39:26 +0000 (13:09 +0530)]
build: install-deps failing in docker build
install-deps.sh was failing in our docker build due to the recent change in
the script. Failure can be seen here: https://github.com/rhcs-dashboard/ceph-dev/runs/5844502455?check_suite_focus=true#step:3:2586
After:
-f {json,json-pretty,xml,xml-pretty,plain,yaml}, --format {json,json-pretty,xml,xml-pretty,plain,yaml}
Note: yaml is only valid for orch commands
Fixes: https://tracker.ceph.com/issues/53895
Signed-off-by: Laura Flores <lflores@redhat.com>
Merge pull request #45187 from rhcs-dashboard/update-monitoring-stack-versions
mgr/cephadm: update monitoring stack versions
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: p-se <NOT@FOUND>