Jason Dillaman [Mon, 22 Feb 2021 15:23:01 +0000 (10:23 -0500)]
librbd: permit disabling QCOW migration format support
Downstream Red Hat products do not support the older QCOW format. This
will allow the support for the legacy QCOW format to be disabled for the
new RBD import-only migration support.
Mykola Golub [Mon, 22 Feb 2021 12:53:38 +0000 (12:53 +0000)]
test/librbd: extend TestLibRBD.RenameViaLockOwner
To cover the following case:
- Client A has image opened but does not owns the lock.
- Client B renames the image (client A is not aware of it).
- Client A becomes the lock owner.
- Client B requests rename, which is proxied to the client A.
Mykola Golub [Mon, 22 Feb 2021 16:22:54 +0000 (16:22 +0000)]
rbd-mirror: reset update_status_task pointer in timer thread
To avoid a time window when m_update_status_task is invalid. If
during this time the cancel_update_mirror_image_replay_status is
called, it may cancel some other's ImageReplayer task, if it
happened to add the task with the same address.
Lucian Petrut [Wed, 17 Feb 2021 13:27:11 +0000 (13:27 +0000)]
rbd: fix rbd-wnbd device status
The "rbd-wnbd show" command will always report the device status
as "inactive". This patch adds the missing check, similar to the
one used by the "list" command.
Lucian Petrut [Wed, 17 Feb 2021 12:49:02 +0000 (12:49 +0000)]
common: fix win32 event log source
The Windows "get_process_name" function uses the input buffer
to store the entire executable path, while the caller only
expects the filename.
The "get_process_name_cpp" function is using an insufficient
buffer, for which reason it will return "(unknown)" when the
executable path exceeds 32 characters.
Windows event log entries have the wrong source because of this.
We'll update "get_process_name" to use a separate buffer for the
full executable path and avoid requesting a larger buffer than
actually needed.
Jason Dillaman [Thu, 11 Feb 2021 20:54:01 +0000 (15:54 -0500)]
librbd/mirror: leave non-primary snapshot images in creating state
The creating state is a special case in rbd-mirror where it will
automatically delete the image since it assumes it's malformed.
A non-primary, snapshot-based mirror image needs to have at least
one non-primary snapshot and the first one is not created until
after replay has started. Now rbd-mirror will update the mirror
image state to the enabled state after creating the first
non-primary snapshot but before attempting the sync.
Fixes: https://tracker.ceph.com/issues/49238 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 43f2c208fa3042d93e4810d804ffe28e9ca7af77)
Jason Dillaman [Thu, 11 Feb 2021 20:45:01 +0000 (15:45 -0500)]
rbd-mirror: ensure that the last non-primary snapshot cannot be pruned
Tweak the normal pruning behavior to ensure that an incomplete initial
non-primary snapshot is not included in the prune set since we know
it will be complete since otherwise the image would have been deleted
due to not updating the mirror-image-state to enabled. Also ensure
we cannot prune a non-primary mirror snapshot if we don't have a
predecessor.
Jason Dillaman [Thu, 11 Feb 2021 20:23:56 +0000 (15:23 -0500)]
rbd-mirror: update snapshot mirror image state after snapshot creation
The non-primary mirror snapshot is what is used to link the non-primary
to the primary image. If there is an interruption between creating the
non-primary image and the creation of the first non-primary snapshot,
the images will be considerered unlinked.
A future commit will modify librbd to avoid setting the mirror image
state to enabled for non-primary snapshot-based mirroring images.
rbd-mirror will already automatically delete images in the CREATING
state during the bootstrap phase.
If the requested write length does not match the provided bufferlist
length, disable the move optimization and instead fallback to creating
a new sub-bufferlist for the object request.
Fixes: https://tracker.ceph.com/issues/49173 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 8dbb4a3d971d9a48c171f161f531956dd0030403)
It appears that commit 6eb8f30a238 broke the test utility and
its failure was masked by the test case that expected a failure
due to a timeout force-killing the app.
Fixes: https://tracker.ceph.com/issues/49117 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 8643b046fb4d5b05b4c75b83f16cd8ccc6a8b0a0)
Kotresh HR [Fri, 5 Feb 2021 18:05:22 +0000 (23:35 +0530)]
qa: Fix a few mgr/volume test cases
Recovering dirty auth metadata file might not retain the order,
fixed the comparison in 'test_recover_auth_metadata_during_authorize'
and 'test_recover_auth_metadata_during_deauthorize'.
Ilya Dryomov [Mon, 8 Feb 2021 16:01:47 +0000 (17:01 +0100)]
librbd: don't hold owner_lock for validate_image_removal()
handle_exclusive_lock() and handle_shut_down_exclusive_lock() call
validate_image_removal() without owner_lock held, so holding it in
shut_down_exclusive_lock() appears to be redundant.
Ilya Dryomov [Sun, 7 Feb 2021 14:09:24 +0000 (15:09 +0100)]
librbd: treat EROFS as expected in handle_acquire_lock()
If the peer refuses to release exclusive lock (e.g. in case automatic
exclusive lock transitions are disabled), EROFS is retured. Suppress
a rather confusing "Read-only file system" error message -- this case
is no different from EBUSY or EAGAIN.
Ilya Dryomov [Sun, 7 Feb 2021 12:46:15 +0000 (13:46 +0100)]
librbd: refuse to release exclusive lock when removing
Commit 25c2ffe145be ("librbd: acquire exclusive lock from peer when
removing") changed PreRemoveRequest to request exclusive lock from the
peer instead of giving up and proceeding without exclusive lock. This
caused one of the test cases that sometimes runs concurrent "rbd rm"
against the same image to fail intermittently, most often on assert
because exclusive lock is now automatically transitioned to another
"rbd rm" on its request.
The root cause is older and probably goes back to when synchronous
librbd::remove() which held owner_lock across all operations including
trim_image() was converted to a set of state machines. Since then, any
peer that requests exclusive lock (instead of trying once and backing
off) is able to mess with image removal.
Install StandardPolicy to disable automatic exclusive lock transitions
during image removal.
Jason Dillaman [Mon, 8 Feb 2021 16:53:28 +0000 (11:53 -0500)]
rbd-mirror: don't prune older mirror snapshots when pruning incomplete snapshot
Since we normally prune in order, we need to ensure that we don't prune older
snapshots when we need to delete an incomplete mirror snapshot since the
older snapshot might be the only remaining mirror snapshot.
Jason Dillaman [Fri, 29 Jan 2021 15:44:38 +0000 (10:44 -0500)]
librbd/deep_copy: skip snap list if object is known to be clean
If the fast-diff indicates that the destination object should exist
and that it hasn't changed, there shouldn't be a need to issue the
snap list operation. Instead, just update the destination object map
to indicate the existence of the object.
Jason Dillaman [Fri, 29 Jan 2021 02:42:09 +0000 (21:42 -0500)]
librbd/deep_copy: object-copy state machine must update object map
If there was no data to copy, the object-copy state machine was bypassing
the object-map update states and prematurely completing. Since the
object-map is default-initialized to all non-existent objects, this results
in incorrect state for OBJECT_EXISTS_CLEAN objects.
Jason Dillaman [Wed, 3 Feb 2021 18:21:34 +0000 (13:21 -0500)]
librbd/io: track object non-existence when computing snapshot deltas
Re-use the existing DNE state to track whether or not the object
already exists when computing snapshot deltas from an arbitrary
set of snapshots. Previously, the non-existence of the object was
only computed for snap id 0 for tracking whiteouts. In a future
commit, the deep-copy object-copy state machine will be able to
properly update the object-map state to indicate exists clean
vs non-existent state.
Jason Dillaman [Wed, 3 Feb 2021 15:13:28 +0000 (10:13 -0500)]
librbd/io: only track initial diff extents if no diffs exists
The purpose of the initial diff extents ({0, 0}) was to help track
whether or not objects exists for read-from-parent / whiteout
tracking. Once we have at least one set of diffs on the object, we
actually have enough information to know about the object state.
Jason Dillaman [Thu, 28 Jan 2021 23:30:16 +0000 (18:30 -0500)]
librbd/object_map: diff state machine should track object existence
The deep-copy snapshot-create state machine initializes the object-map
state to non-existent for all objects. There was an assumption that the
deep-copy object-copy state machine would always update the object map
but that was being skipped for clean objects as an optimization. This
change will support a future commit to run the object-copy state machine
for existing objects.
Nizamudeen A [Tue, 19 Jan 2021 12:35:43 +0000 (18:05 +0530)]
mgr/dashboard: Automatically refresh the crush map metadata table
If we make any change to the osd crush map like do an osd crush reweight from cli, for that change to be reflected on metadata table we need to reload the entire page. Instead this PR takes care of auto refreshing the tree view.
Fixes: https://tracker.ceph.com/issues/48922 Signed-off-by: Nizamudeen A <nia@redhat.com> Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit bc8562ef2a17b78e80bd4e1272d3fd1a512249bb)
Sebastian Wagner [Fri, 29 Jan 2021 10:10:38 +0000 (11:10 +0100)]
mgr/cephadm: Add strings to assert statements
This helps with: https://tracker.ceph.com/issues/48981
Looks like there is an assert somewhere:
```
Error EINVAL: Traceback (most recent call last):
File "/usr/share/ceph/mgr/mgr_module.py", line 1269, in _handle_command
return self.handle_command(inbuf, cmd)
...snip...
File "/usr/share/ceph/mgr/orchestrator/module.py", line 550, in _list_services
raise_if_exception(completion)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 653, in raise_if_exception
raise e
AssertionError
```
Sebastian Wagner [Fri, 15 Jan 2021 12:13:35 +0000 (13:13 +0100)]
mgr/cephadm: try again calling ceph-volume without --filter-for-batch
Fixes: https://tracker.ceph.com/issues/48870
This deals with a cephadm upgrade issue:
1. user calls `ceph orch upgrade`
2. mgr/cephadm calls `ceph orch config set mgr.x container_image <new-container>`
3. standby mgr gets upgraded
4. mgr failover to new mgr
5. mgr/cephadm calls `_refresh_host_devices`
6. `_refresh_host_devices` calls` ceph orch config get osd container_image`.
But this returns the old image
7. `_refresh_host_devices` calls `ceph-volume ... --filter-for-batch`
with an image that doesn't support `filter-for-batch`
The idea is to simply retiry calling ceph-volume inventory without `--filter-for-batch`
(also removed `out` being used without being declared)
Sage Weil [Tue, 26 Jan 2021 22:10:13 +0000 (16:10 -0600)]
mgr/cephadm/upgrade: scale down MDS cluster(s) for major version upgrades
For octopus -> pacific, as with other recent releases, we need to scale
down the MDS cluster(s) to a single daemon before upgrading. (This is
because the MDS intra-cluster protocols aren't fully versioned.)
Sage Weil [Wed, 27 Jan 2021 14:54:00 +0000 (08:54 -0600)]
mgr/cephadm/upgrade: match against any repo_digest, not image_id
The image id can vary across hosts and (most notably) docker vs podman.
Instead, use the repo_digest as an image identifier.
Unfortunately, a single image may have multiple digests, even within the
same registry, so keep a list of the digests for the image we are
upgrading to, and ensure that each container has a digest that matches at
least one of them.
This allows upgrade to proceed in mixed docker+podman clusters. However,
it does not yet address a cluster with mixed CPU architectures, because
the container image will have different digest(s) for each architecture
build.