Melissa Li [Wed, 23 Mar 2022 15:38:37 +0000 (11:38 -0400)]
cephadm: show error message if private registry credentials not provided
Raise UnauthorizedRegistryError in `_pull_image` if user tries to pull from a private registry without authentication, handle error in `command_bootstrap`, `commond_adopt`, `command_pull`
Fixes: https://tracker.ceph.com/issues/55015 Signed-off-by: Melissa Li <melissali@redhat.com>
(cherry picked from commit 4de0803ba893abf341ab634d1382208370de7c98)
mgr/cephadm: support non-root ssh-user w permissions
Restructured code, so that in case of non-root, the resulting file will
be created with permissions set to the ssh-user. This allows the
subsequent scp to be able to write the file. The remaining code kept the
same, so that file permissions are restored to the expected ones, but
just runs after the scp.
Fixes: https://tracker.ceph.com/issues/54620 Signed-off-by: Christoph Glaubitz <c.glaubitz@syseleven.de>
(cherry picked from commit 452e52a7e39409e3409d59940133333416b830bc)
ceph-volume/tests: reject loop devices in lvm.conf
The current task doesn't works (typo?).
Otherwise api/lvm.py can't work properly, functions such as
`get_single_lv()` and many other don't return the expected results.
Indeed, lvm is confused because of the nvme_loop setup.
This adds the support of complex OSD creation with command
`orch daemon add osd`.
Any argument supported by `DriveGroupSpec()` can be passed on the command line.
Cephadm shouldn't try to deploy a disk reported as unavailable by ceph-volume.
The idea here is to check the rejection reason so we can still use DB devices
in case of OSD replacement.
Redouane Kachach [Fri, 11 Mar 2022 11:41:18 +0000 (12:41 +0100)]
mgr/cephadm: Show warning when user provides --fsid option Fixes: https://tracker.ceph.com/issues/50804 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 8780aa04651fa2cddeec1d9d2dfcf4e08412d4ce)
mgr/cephadm: checking service name before removal Fixes: https://tracker.ceph.com/issues/54503 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit b26c114c8456941d6cccf7d4355445f21cb373a7)
Joseph Sawaya [Fri, 11 Mar 2022 20:45:16 +0000 (15:45 -0500)]
doc: Add note to osds_per_device description about dual-actuator devices
This commit adds information about using dual-actuator devices with the
osds_per_device drive group option, letting users know they can create
an OSD for each actuator by setting this value to 2 in the drive group
they're using to apply OSDs to the device.
Redouane Kachach [Mon, 14 Feb 2022 14:17:31 +0000 (15:17 +0100)]
mgr/cephadm: Show an error when invalid format Fixes: https://tracker.ceph.com/issues/54198 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit d2fa22fd77d4a20f5db0d9315e5efebb016de481)
mgr/cephadm: Adding AGE field to device ls cmd Fixes: https://tracker.ceph.com/issues/53540 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 1c5b3e86f9b8ae0ca3ae41798dfa18e9ffe9fcb7)
Redouane Kachach [Tue, 25 Jan 2022 17:27:56 +0000 (18:27 +0100)]
mgr/cephadm: Adding logic to cleanup several dirs after an rm-cluster Fixes: https://tracker.ceph.com/issues/53010
https://tracker.ceph.com/issues/53815 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 0df6c04d8f99692a542b49c22bddbf12510801a5)
cmake/modules: use exact version of python3 when finding cython
* CMakeLists.txt:
always pass "EXACT" to find_package(Python3).
because per cmake document, "EXACT" only takes effect when
<Package>_FIND_VERSION_COUNT is greater than 1, where <Package>
is "Python3". see also cmake/modules/FindPython/Support.cmake
* cmake/modules/AddCephTest.cmake:
drop redundant find_package(Python3) calls. since Python3 is
a mandatory requirement for building Ceph, we only need a
single call of find_package(Python3..) in the top of the source
tree. the only possible case to repeat it is to ensure that we
have the correct version of Python3 used in following CMake
script. but there is no need to repeat it if we just want to
ensure that we have a python3 interpretor in place.
* cmake/modules/Distutils.cmake:
always pass "EXACT" to find_package(Python3).
we should always pass EXACT to find_package() when finding python3,
this is a follow-up of e2babdfae8c99f39f99a7c8a8f966299b2e62b19
This was attempted in commit 69a7ed4eab36 ("run-make-check: enable
WITH_RBD_RWL when WITH_PMEM is true") but never completed. We soon
bumped the requirement on libpmem, so WITH_SYSTEM_PMDK=ON wouldn't
have worked anyway.
Enable the RWL mode conditionally based on WITH_RBD_RWL variable.
Enable the SSD mode unconditionally as it has no special dependencies
and can be built on any architecture.
test/encoding/check-generated.sh: show diff if binary reencode check fails
Take bf0b161115aa ("test/encoding/check-generated.sh: show diff if cmp
fails") a bit further. Suggesting "cmp $tmp1 $tmp2" isn't very helpful
since cmp would report just the mismatch offset.
librbd/cache/pwl: WriteLogCacheEntry constructor must initialize flags
Initializing the individual bit field members leaves the remaining two
bits uninitialized and that garbage state gets persisted.
In general, using bit fields in a structure where the layout actually
matters is not desirable. Even with a few single bits, such as here,
their order, strictly speaking, is not guaranteed:
An implementation may allocate any addressable storage unit large
enough to hold a bit-field. If enough space remains, a bit-field
that immediately follows another bit-field in a structure shall be
packed into adjacent bits of the same unit. If insufficient space
remains, whether a bit-field that does not fit is put into the next
unit or overlaps adjacent units is implementation-defined. The
order of allocation of bit-fields within a unit (high-order to
low-order or low-order to high-order) is implementation-defined.
The alignment of the addressable storage unit is unspecified.
cmake/modules: always use the python3 specified in command line
if another python3 with higher version is found by
find_package(Python3), the cmake's install script would just
install the python modules/extensions into that python3's
dist-package directory, and the packaging script would fail
to find these artifacts when trying to package them.
so we need to ensure that the install directories for python
modeules/extensions are always "versioned" with WITH_PYTHON3
cmake option.
Jos Collin [Thu, 24 Mar 2022 04:57:58 +0000 (10:27 +0530)]
qa: make test_perf_stats_stale_metrics check only the clients created for the tests
Uses the client's global id to get the metrics, instead of using the index.
This ensures that test_perf_stats_stale_metrics checks only the clients mounted for
the tests.
Fixes: https://tracker.ceph.com/issues/54971 Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 1621308214cd18750c8be803fc014bdf73e2218a)
Jos Collin [Tue, 29 Jun 2021 09:59:01 +0000 (15:29 +0530)]
mgr, pybind/mgr, mgr/stats: be resilient to offline rank0 MDS
Reregister the user queries during the rank0 MDS failover event
by calling listener.handle_query_updated(). This enables
`ceph fs perf stats` to receive the updated metrics after the
failover.
Fixes: https://tracker.ceph.com/issues/50033 Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit c2470f271cce4d512f2cf00552c9b753e4c69f71)
1. Created subvolume
2. Written some I/O on the subvolume
3. Create snapshot of the subvolume
4. Create clone of the snapshot
5. Delete snapshot from back end (don't use subvolume interface) before
clone completes
6. Delete clone with force
7. Delete subvolume
8. Delete fs and associated pools
9. Created new fs
10 Created new subvolume,
11. Written some I/O on the subvolume
12. Create snapshot of the subvolume
13. Create clone of the snapshot <---------------THIS OPERATION HANGS -----------------
Root Cause:
Since the snapshot is deleted from the back end, the clone fails. But it
also fails to remove the clone index at '/volumes/_index/clone'. The
cloner thread goes to infinite loop of starting the clone and failing.
This involves taking 'self.async_job.lock()' and reads the clone index
to get the job and registers the above job.
While the 'cloner thread' is in above loop, the fs is destroyed. The
cloner threads which lives till the mgr/volumes is enabled in mgr, takes
the 'self.async_job.lock()' and hangs while reading the clone index.
Any further clone operations which also requires above lock hangs.
Fix:
Remove the clone index even though snapshot is not present.
cmake: resurrect mutex debugging in all Debug builds
Commit 403f1ec2888a ("cmake: make "WITH_CEPH_DEBUG_MUTEX" depend on
CMAKE_BUILD_TYPE") made WITH_CEPH_DEBUG_MUTEX depend on build type
being set to Debug, in CMakeLists.txt. However, if CMAKE_BUILD_TYPE
isn't specified by the user, we may still set it to Debug later, in
src/CMakeLists.txt, and in that case WITH_CEPH_DEBUG_MUTEX doesn't
get enabled. The result is that
$ do_cmake.sh -DCMAKE_BUILD_TYPE=Debug ...
debug builds have mutex debugging enabled, while
$ do_cmake.sh ...
builds, which are supposed to be the same, don't. Jenkins builders
don't pass -DCMAKE_BUILD_TYPE=Debug so that commit effectively turned
off all ceph_mutex_is_locked* asserts in "make check".
test/rbd_mirror: grab timer lock before calling add_event_after()
add_event_after() expects an externally provided mutex to be held
for the call. This was missed in commit 8965a0f2a6f7 ("rbd-mirror:
synchronize with in-flight stop in ImageReplayer::stop()").
Casey Bodley [Thu, 10 Mar 2022 20:32:48 +0000 (15:32 -0500)]
cls/rgw: rgw_dir_suggest_changes detects race with completion
if bucket listing races with a pending index transaction, its suggested
removal may be mistakenly applied if that index transaction completes
before the osd receives this suggestion
in `rgw_dir_suggest_changes()`, the sole condition for applying a
suggested change is that the `cur_disk.pending_map` is empty. this is
true after rgw_bucket_complete_op()
on index completion, `rgw_bucket_dir_entry::index_ver` is updated to match
the new value of `rgw_bucket_dir_header::ver`. because most of `struct
rgw_bucket_dir_entry` makes the round trip through bucket listing ->
dir_suggest, we have access to the index_ver of the suggested entry. by
comparing this against the stored entry, we can ignore any suggestions
that were sent before the most recent completion
librbd/cache/pwl: remove RBD_FEATURE_DIRTY_CACHE check in DiscardRequest
"m_image_ctx.features &&RBD_FEATURE_DIRTY_CACHE" is obviously wrong
because it would pretty much always be true. However, even if bitwise
AND was used, this check would still be dead because DiscardRequest is
only invoked if RBD_FEATURE_DIRTY_CACHE is enabled:
int invalidate_cache(ImageCtx *ictx) {
{
...
// Delete writeback cache if it is not initialized
if ((!ictx->exclusive_lock ||
!ictx->exclusive_lock->is_lock_owner()) &&
ictx->test_features(RBD_FEATURE_DIRTY_CACHE)) {
C_SaferCond ctx3;
ictx->plugin_registry->discard(&ctx3);
r = ctx3.wait();
}
librbd/cache/pwl: don't crash if cache file removal fails
The non-ec overload will throw fs::filesystem_error on any error
(e.g. EPERM due to unprivileged "rbd persistent-cache invalidate"
being brought up against a privileged workload).
Yin Congmin [Wed, 22 Dec 2021 07:07:11 +0000 (15:07 +0800)]
librbd/cache/pwl: rename persistent cache key
librbd "internal" metadata keys was change to ".rbd" prefix. Change
peristent cache to ".rbd" too.
And the name of persistent cache key is IMAGE_CACHE_STATE. Since
this key is planned to be used outside the pwl directory, it seems
more appropriate to change it to a clear name as PERSISTENT_CACHE_STATE.
librbd/cache/pwl: avoid inconsistencies in ImageCacheState
When empty and/or clean bools are updated in I/O handling code paths,
ImageCacheState becomes inconistent for a short while: e.g. with clean
transitioned to true, dirty_bytes counter could still be positive
because the counters are updated only in periodic_stats(). Move to
updating the counters in update_image_cache_state(Context*) to avoid
this.
update_image_cache_state(Context*) now requires m_lock -- most call
sites already hold it anyway. The only problematic call site was
AbstractWriteLog::shut_down() callback chain: perf_stop() needed to
be moved to the very end since perf counters must be alive now for
update_image_cache_state() to work.
Don't override expect_op_work_queue() in unit tests: completing
context in the same thread now results in a deadlock on m_lock in
all test cases that call AbstractWriteLog::init().
get_json_format() and create_image_cache_state() attempt to get
particular keys which could result in an unhandled std::runtime_error
exception. Conversely, ImageCacheState constructor just swallows that
exception which could leave the newly constructed object incorrectly
initialized. Avoid doing parsing in the constructor and introduce
init_from_config() and init_from_metadata() methods instead.
While at it, move everything out from under "persistent_cache" key.
Also fix init_state_json_write test case which stopped working now
that types are enforced by json_spirit.
Yin Congmin [Tue, 29 Mar 2022 08:59:05 +0000 (16:59 +0800)]
librbd/cache/pwl: add basic metrics to ImageCacheState
Add basic metrics to ImageCacheState and persist them, including
allocated_bytes, cached_bytes, dirty_bytes, free_bytes and hit/miss
info.
Leverage periodic_stats() timer to call update_image_cache_state.
In order to avoid outputting too much debug information, the original
statistics output log level is changed to 5.
Switch to json_spirit for encoding because encode_json encodes bool as
"true"/"false" string.
Remove rbd_persistent_cache_log_periodic_stats option because we need
to always update cache state.
[ idryomov: add cached_bytes and hits_partial; report misses and
miss_bytes instead of respective totals; naming ]