Kotresh HR [Fri, 5 Feb 2021 18:05:22 +0000 (23:35 +0530)]
qa: Fix a few mgr/volume test cases
Recovering a dirty auth metadata file might not preserve the order of its
entries, so fix the comparisons in 'test_recover_auth_metadata_during_authorize'
and 'test_recover_auth_metadata_during_deauthorize' to be order-independent.
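A minimal sketch of an order-independent check (helper name hypothetical; it assumes the metadata serializes as JSON):
```
import json

def assert_auth_metadata_equal(expected_str, recovered_str):
    # Recovery may serialize entries in a different order, so compare
    # the parsed structures rather than the raw strings.
    assert json.loads(expected_str) == json.loads(recovered_str), \
        "recovered auth metadata does not match expected contents"
```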
Ilya Dryomov [Mon, 8 Feb 2021 16:01:47 +0000 (17:01 +0100)]
librbd: don't hold owner_lock for validate_image_removal()
handle_exclusive_lock() and handle_shut_down_exclusive_lock() call
validate_image_removal() without owner_lock held, so holding it in
shut_down_exclusive_lock() appears to be redundant.
Ilya Dryomov [Sun, 7 Feb 2021 14:09:24 +0000 (15:09 +0100)]
librbd: treat EROFS as expected in handle_acquire_lock()
If the peer refuses to release exclusive lock (e.g. when automatic
exclusive lock transitions are disabled), EROFS is returned. Suppress
the rather confusing "Read-only file system" error message -- this case
is no different from EBUSY or EAGAIN.
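In rough Python terms (an illustrative sketch, not librbd's actual C++ error path):
```
import errno
import logging

log = logging.getLogger("librbd.exclusive_lock")

# Return codes that simply mean "the peer kept the lock".
EXPECTED_ACQUIRE_ERRORS = {-errno.EROFS, -errno.EBUSY, -errno.EAGAIN}

def handle_acquire_lock(ret):
    if ret in EXPECTED_ACQUIRE_ERRORS:
        # Expected contention; no scary "Read-only file system" message.
        log.debug("peer declined to release exclusive lock: %d", ret)
    elif ret < 0:
        log.error("failed to acquire exclusive lock: %d", ret)
```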
Ilya Dryomov [Sun, 7 Feb 2021 12:46:15 +0000 (13:46 +0100)]
librbd: refuse to release exclusive lock when removing
Commit 25c2ffe145be ("librbd: acquire exclusive lock from peer when
removing") changed PreRemoveRequest to request exclusive lock from the
peer instead of giving up and proceeding without exclusive lock. This
caused one of the test cases that sometimes runs concurrent "rbd rm"
against the same image to fail intermittently, most often on an assert,
because the exclusive lock is now automatically transitioned to another
"rbd rm" on its request.
The root cause is older and probably goes back to when the synchronous
librbd::remove(), which held owner_lock across all operations including
trim_image(), was converted to a set of state machines. Since then, any
peer that requests exclusive lock (instead of trying once and backing
off) is able to mess with image removal.
Install StandardPolicy to disable automatic exclusive lock transitions
during image removal.
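A rough Python model of the two policies (the real classes are C++ in librbd's exclusive_lock code; the method name and return values here are illustrative):
```
class AutomaticPolicy:
    """Default policy: hand the exclusive lock to a requesting peer."""
    def lock_requested(self):
        return "release"

class StandardPolicy:
    """Installed for the duration of image removal: keep the lock, so a
    concurrent "rbd rm" cannot take over mid-removal."""
    def lock_requested(self):
        return "refuse"  # the peer sees an error and backs off
```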
Jason Dillaman [Mon, 8 Feb 2021 16:53:28 +0000 (11:53 -0500)]
rbd-mirror: don't prune older mirror snapshots when pruning incomplete snapshot
Since we normally prune in order, we need to ensure that we don't prune
older snapshots when deleting an incomplete mirror snapshot: the older
snapshot might be the only remaining mirror snapshot.
Jason Dillaman [Fri, 29 Jan 2021 15:44:38 +0000 (10:44 -0500)]
librbd/deep_copy: skip snap list if object is known to be clean
If the fast-diff indicates that the destination object should exist
and that it hasn't changed, there shouldn't be a need to issue the
snap list operation. Instead, just update the destination object map
to indicate the existence of the object.
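Sketched in Python (names hypothetical; the real logic is a C++ state machine in librbd/deep_copy):
```
def copy_object(object_no, fast_diff_state, dst_object_map):
    if fast_diff_state == "EXISTS_CLEAN":
        # Object exists and is unchanged: skip the snap list op, but the
        # destination object map must still record the object's existence.
        dst_object_map[object_no] = "EXISTS_CLEAN"
        return
    send_snap_list(object_no)  # full snapshot diff for changed objects

def send_snap_list(object_no):
    ...  # issue the snap list operation and continue the state machine
```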
Jason Dillaman [Fri, 29 Jan 2021 02:42:09 +0000 (21:42 -0500)]
librbd/deep_copy: object-copy state machine must update object map
If there was no data to copy, the object-copy state machine was bypassing
the object-map update states and completing prematurely. Since the
object map is default-initialized to all non-existent objects, this resulted
in incorrect state for OBJECT_EXISTS_CLEAN objects.
Jason Dillaman [Wed, 3 Feb 2021 18:21:34 +0000 (13:21 -0500)]
librbd/io: track object non-existence when computing snapshot deltas
Re-use the existing DNE state to track whether or not the object
already exists when computing snapshot deltas from an arbitrary
set of snapshots. Previously, the non-existence of the object was
only computed for snap id 0 for tracking whiteouts. In a future
commit, the deep-copy object-copy state machine will be able to
properly update the object-map state to distinguish exists-clean
from non-existent objects.
Jason Dillaman [Wed, 3 Feb 2021 15:13:28 +0000 (10:13 -0500)]
librbd/io: only track initial diff extents if no diffs exist
The purpose of the initial diff extents ({0, 0}) was to help track
whether or not objects exist for read-from-parent / whiteout
tracking. Once we have at least one set of diffs on the object, we
already have enough information to know the object's state.
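An illustrative sketch of the guard (names hypothetical):
```
def seed_initial_extent(diffs):
    # The {0, 0} placeholder only signals "object exists" for
    # read-from-parent / whiteout handling; once a real diff has been
    # recorded it carries no extra information, so only seed it when
    # no diffs exist yet.
    if not diffs:
        diffs.append((0, 0))
```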
Jason Dillaman [Thu, 28 Jan 2021 23:30:16 +0000 (18:30 -0500)]
librbd/object_map: diff state machine should track object existence
The deep-copy snapshot-create state machine initializes the object-map
state to non-existent for all objects. There was an assumption that the
deep-copy object-copy state machine would always update the object map
but that was being skipped for clean objects as an optimization. This
change supports a future commit that will run the object-copy state
machine for existing objects as well.
Nizamudeen A [Tue, 19 Jan 2021 12:35:43 +0000 (18:05 +0530)]
mgr/dashboard: Automatically refresh the crush map metadata table
If we make any change to the OSD crush map, such as an 'osd crush reweight' from the CLI, the entire page currently has to be reloaded for that change to show up in the metadata table. Instead, this PR auto-refreshes the tree view.
Fixes: https://tracker.ceph.com/issues/48922
Signed-off-by: Nizamudeen A <nia@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit bc8562ef2a17b78e80bd4e1272d3fd1a512249bb)
Sebastian Wagner [Fri, 29 Jan 2021 10:10:38 +0000 (11:10 +0100)]
mgr/cephadm: Add strings to assert statements
This helps with: https://tracker.ceph.com/issues/48981
Looks like there is an assert somewhere:
```
Error EINVAL: Traceback (most recent call last):
File "/usr/share/ceph/mgr/mgr_module.py", line 1269, in _handle_command
return self.handle_command(inbuf, cmd)
...snip...
File "/usr/share/ceph/mgr/orchestrator/module.py", line 550, in _list_services
raise_if_exception(completion)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 653, in raise_if_exception
raise e
AssertionError
```
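For illustration (generic Python, not the actual cephadm call sites), adding a message turns that bare AssertionError into a self-describing failure:
```
class Completion:
    def __init__(self, operation, finished):
        self.operation = operation
        self.is_finished = finished

completion = Completion("list_services", finished=False)

# Bare form -- the log ends in nothing more than "AssertionError":
#     assert completion.is_finished
# With a string, the same failure explains itself:
assert completion.is_finished, \
    f"completion for operation {completion.operation!r} never finished"
```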
Sebastian Wagner [Fri, 15 Jan 2021 12:13:35 +0000 (13:13 +0100)]
mgr/cephadm: try again calling ceph-volume without --filter-for-batch
Fixes: https://tracker.ceph.com/issues/48870
This deals with a cephadm upgrade issue:
1. user calls `ceph orch upgrade`
2. mgr/cephadm calls `ceph orch config set mgr.x container_image <new-container>`
3. standby mgr gets upgraded
4. mgr failover to new mgr
5. mgr/cephadm calls `_refresh_host_devices`
6. `_refresh_host_devices` calls `ceph orch config get osd container_image`.
But this returns the old image
7. `_refresh_host_devices` calls `ceph-volume ... --filter-for-batch`
with an image that doesn't support `filter-for-batch`
The idea is to simply retry calling ceph-volume inventory without `--filter-for-batch`
(also fixed `out` being used without being declared).
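A sketch of the fallback (helper names hypothetical):
```
def inventory(host):
    base_args = ["inventory", "--format=json"]
    try:
        out = run_ceph_volume(host, base_args + ["--filter-for-batch"])
    except RuntimeError:
        # Likely an old container image from before the flag existed;
        # the unfiltered inventory is still usable.
        out = run_ceph_volume(host, base_args)
    return out

def run_ceph_volume(host, args):
    ...  # run ceph-volume in the host's container, raise on non-zero rc
```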
Sage Weil [Tue, 26 Jan 2021 22:10:13 +0000 (16:10 -0600)]
mgr/cephadm/upgrade: scale down MDS cluster(s) for major version upgrades
For octopus -> pacific, as with other recent releases, we need to scale
down the MDS cluster(s) to a single daemon before upgrading. (This is
because the MDS intra-cluster protocols aren't fully versioned.)
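Operationally this amounts to capping each filesystem at a single active MDS before the daemons are restarted; a rough sketch using the mgr mon-command interface (simplified; helper signature hypothetical):
```
def scale_down_mds(mgr, fs_name):
    # Cap the filesystem at a single active MDS; standby daemons take
    # over again once max_mds is raised back after the upgrade.
    mgr.check_mon_command({
        'prefix': 'fs set',
        'fs_name': fs_name,
        'var': 'max_mds',
        'val': '1',
    })
```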
Sage Weil [Wed, 27 Jan 2021 14:54:00 +0000 (08:54 -0600)]
mgr/cephadm/upgrade: match against any repo_digest, not image_id
The image id can vary across hosts and (most notably) between docker and podman.
Instead, use the repo_digest as an image identifier.
Unfortunately, a single image may have multiple digests, even within the
same registry, so keep a list of the digests for the image we are
upgrading to, and ensure that each container has a digest that matches at
least one of them.
This allows upgrade to proceed in mixed docker+podman clusters. However,
it does not yet address a cluster with mixed CPU architectures, because
the container image will have different digest(s) for each architecture
build.
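The digest check itself reduces to a set intersection (illustrative sketch; the digest values in the example are made up):
```
def image_matches(target_digests, container_digests):
    # Up to date if any repo digest of the running container matches
    # any digest recorded for the target image.
    return bool(set(target_digests) & set(container_digests))

# e.g. one of the digests recorded for the target image matches the
# digest reported by podman on this host:
assert image_matches(
    ["quay.io/ceph/ceph@sha256:aaaa", "quay.io/ceph/ceph@sha256:bbbb"],
    ["quay.io/ceph/ceph@sha256:bbbb"])
```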
Paul Cuzner [Thu, 14 Jan 2021 22:08:48 +0000 (11:08 +1300)]
cephadm: install doc updated to include cluster-network parameter
Install guide updated to include a description of the --cluster-network
parameter. The text also links to the complete definition for cluster-network
on the rados/configuration/network-config-ref page.
Lucian Petrut [Wed, 3 Feb 2021 08:59:24 +0000 (08:59 +0000)]
win32*.sh: move debug symbols to separate files
This patch simplifies releasing Windows binaries along with debug
symbols.
By default, we're going to provide minimum debug information (-g1).
The symbols are extracted from the binaries and placed in separate
files in the ".debug" folder, which is used by gdb implicitly.
This is more convenient than having separate versions of the binaries,
with or without debug symbols.
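The split follows the standard GNU debug-link flow; sketched in Python for illustration (the actual change lives in the win32*.sh build scripts):
```
import os
import subprocess

def split_debug_symbols(binary):
    os.makedirs(".debug", exist_ok=True)
    debug_file = os.path.join(".debug", os.path.basename(binary) + ".debug")
    # Copy the debug info into a side file...
    subprocess.check_call(["objcopy", "--only-keep-debug", binary, debug_file])
    # ...strip it from the shipped binary...
    subprocess.check_call(["objcopy", "--strip-debug", binary])
    # ...and add a debuglink so gdb locates the side file automatically.
    subprocess.check_call(
        ["objcopy", "--add-gnu-debuglink=" + debug_file, binary])
```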
Lucian Petrut [Fri, 29 Jan 2021 11:03:20 +0000 (11:03 +0000)]
rbd: propagate WNBD start errors
This change will propagate the errors that WNBD may return when
spinning up the IO workers.
Also, we'll avoid removing the registry record for failed
non-persistent mappings. Those will be cleaned up when the service
restarts or when explicitly unmapped.