Adam King [Fri, 21 Apr 2023 14:07:09 +0000 (10:07 -0400)]
cephadm: require --image is passed to inspect-image
The selection of an image by default was likely unused and
has always been a bit of a flaky thing, especially if multiple
clusters are making use of the host where this is run. It seems
preferable to just require this argument. Additionally, the
command without the image specified is currently untested
and prone to being broken. All uses of inspect-image done
through the cephadm mgr module specify the image.
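The requirement above can be sketched as follows. This is a minimal illustration, not the actual cephadm code: `build_parser` and `require_image` are hypothetical names, and the parser is trimmed down to the one option and one command discussed here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical, trimmed-down parser; the real cephadm parser has many more options.
    parser = argparse.ArgumentParser(prog="cephadm")
    parser.add_argument("--image", help="container image to operate on")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("inspect-image")
    return parser


def require_image(args: argparse.Namespace) -> str:
    # Fail early instead of guessing a default image for the host.
    if not args.image:
        raise ValueError("inspect-image requires --image to be passed")
    return args.image
```

With this shape, `cephadm --image <name> inspect-image` proceeds, while a bare `cephadm inspect-image` is rejected before any image selection is attempted.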
doc: Add missing `ceph` command in documentation section `REPLACING AN OSD`
Signed-off-by: Alexander Proschek <alexander.proschek@protonmail.com>
(cherry picked from commit 0557d5e465556adba6d25db62a40ba55a5dd2400)
Zac Dover [Thu, 18 May 2023 21:07:02 +0000 (07:07 +1000)]
doc/radosgw: explain multisite dynamic sharding
Add a note to doc/radosgw/dynamicresharding.rst and a note to
doc/radosgw/multisite.rst that explains that dynamic resharding is not
supported in releases prior to Reef.
This commit is made in response to a request from Mathias Chapelain.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit d4ed4223d914328361528990f89f1ee4acd30e79)
Zac Dover [Wed, 17 May 2023 12:25:38 +0000 (22:25 +1000)]
doc/cephfs: line-edit "Mirroring Module"
Line-edit the "Mirroring Module" section of
doc/cephfs/cephfs-mirroring.rst. Add prompts and formatting where such
things contribute to the realization of adequate sentences.
This commit is a follow-up to https://github.com/ceph/ceph/pull/51505.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit dd8855d9a934bcdd6a026f1308ba7410b1e143e3)
Aashish Sharma [Mon, 8 May 2023 07:19:13 +0000 (12:49 +0530)]
mgr/dashboard: fix regression caused by cephPgImabalance alert
An earlier fix introduced a regression that prevents alerts from being
displayed in the active alerts tab.
This PR intends to fix this issue.
Venky Shankar [Tue, 16 May 2023 05:25:34 +0000 (10:55 +0530)]
doc: explain cephfs mirroring `peer_add` step in detail
@zdover23 reached out regarding missing explanation for `peer_add`
step in cephfs mirroring documentation. Add some explanation and
an example to make the step clear.
In the case where `iter->second.addr` is an empty address,
m_locker->address string is assigned with "0)/0" and therefore
will never result in an empty string.
Ramana Raja [Wed, 10 May 2023 18:37:44 +0000 (14:37 -0400)]
rbd_support: recover from "double blocklisting"
Recover from being blocklisted while recovering from blocklisting.
When the rbd_support module is being set up to recover from client
blocklisting, the module's new rados client connection can also get
blocklisted. Currently, this will cause the recovery to fail and
the module will remain inoperable. Instead, retry module recovery
when the new client gets blocklisted during the module setup in the
recovery thread.
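The retry described above might look roughly like the sketch below. It is not the module's actual code: `Blocklisted`, `recover_with_retry`, `setup_module`, and the `max_attempts` bound are all assumptions made for illustration.

```python
class Blocklisted(Exception):
    """Hypothetical stand-in for the error seen when a client is blocklisted."""


def recover_with_retry(setup_module, max_attempts=5):
    # setup_module() is assumed to create a new client and start the handlers.
    # The attempt bound is an assumption; the commit message does not state one.
    for attempt in range(1, max_attempts + 1):
        try:
            setup_module()
            return attempt  # number of attempts that were needed
        except Blocklisted:
            # The new client was blocklisted during setup; retry with a fresh one.
            continue
    raise RuntimeError("module recovery did not succeed")
```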
Ramana Raja [Wed, 15 Feb 2023 15:12:54 +0000 (10:12 -0500)]
mgr/rbd_support: recover from rados client blocklisting
In certain scenarios the OSDs were slow to process RBD requests.
This led to the rbd_support module's RBD client not being able to
gracefully hand over an RBD exclusive lock to another RBD client.
After the condition persisted for some time, the other RBD client
forcefully acquired the lock by blocklisting the rbd_support module's
RBD client, and consequently blocklisted the module's RADOS client. The
rbd_support module stopped working. To recover the module, the entire
mgr service had to be restarted which reloaded other mgr modules.
Instead of recovering the rbd_support module from client blocklisting
by being disruptive to other mgr modules, recover the module
automatically without restarting the mgr service. When the client gets
blocklisted, shut down the module's handlers and the blocklisted client,
create a new rados client for the module, and start the new handlers.
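The recovery sequence above can be sketched as a small state change. The class and attribute names here are hypothetical and the client is an opaque object; the real module deals with rados connections and several handler threads.

```python
class RecoverableModule:
    """Hypothetical model of a module that can replace its blocklisted client."""

    def __init__(self, new_client):
        self._new_client = new_client  # factory that returns a fresh client
        self.client = new_client()
        self.handlers_running = True

    def on_blocklisted(self):
        # 1. Shut down the handlers that depend on the blocklisted client.
        self.handlers_running = False
        # 2. Drop the blocklisted client.
        self.client = None
        # 3. Create a new client for the module.
        self.client = self._new_client()
        # 4. Start new handlers on top of the new client.
        self.handlers_running = True
```

Only this module's state is touched, which is the point of the change: other mgr modules are not reloaded.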
librbd: localize snap_remove op for mirror snapshots
A client may not issue its lock request quickly enough to
obtain the exclusive lock for an operation when a competing
client responds more quickly. This can happen when a peer site has
different performance characteristics or latency. Instead of
relying on this unpredictable behavior, localize the operation to
the primary cluster.
Fixes: https://tracker.ceph.com/issues/59393 Signed-off-by: Christopher Hoffman <choffman@redhat.com>
(cherry picked from commit ac552c9b4d65198db8038d397a3060d5a030917d)
librbd: always refresh after creating snapshot in CreatePrimaryRequest
Up until now this was conditioned on whether the caller expressed
interest in the ID of the created snapshot and happened to work only
because CreatePrimaryRequest wasn't actually consulting any mirror
snapshot metadata. This has just changed with unlink_peer() needing to
see an up-to-date complete flag which is set in SetImageStateRequest
following the write out of image state object(s).
librbd: remove previous incomplete primary snapshot after successfully creating a new one
Problem:
-------
At a high level, creating a primary snapshot consists of three steps:
1. actually creating a snapshot in the mirror namespace
2. generating a set of image state objects with additional metadata for
the snapshot
3. marking the snapshot as complete after the image state objects are
written out
Depending on the circumstances, a request to create a primary snapshot
can be forwarded to rbd-mirror daemon. If that happens and rbd-mirror
daemon gets axed for some practical reason after completing steps (1)
and/or (2) but before completing step (3), we are left with a
permanently incomplete primary snapshot because upon retrying that
primary snapshot creation request, librbd notices that such snapshot
already exists. It does not check whether this "pre-existing" snapshot
is complete.
Solution:
--------
As part of the next mirror snapshot create (say, triggered by the
scheduler), unlink_peer() is called. It checks whether any incomplete
snapshots exist and deletes them accordingly.
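The cleanup check can be illustrated with the sketch below. It is not librbd code: `MirrorSnapshot` and `incomplete_primary_snapshots` are hypothetical names, and the sketch assumes that snapshot IDs increase over time so that "previous" means "lower ID than the snapshot just created".

```python
from dataclasses import dataclass


@dataclass
class MirrorSnapshot:
    snap_id: int    # assumed to increase with each new snapshot
    primary: bool   # True for primary mirror snapshots
    complete: bool  # set once the image state objects are written out


def incomplete_primary_snapshots(snapshots, new_snap_id):
    # Candidates for removal: primary snapshots created before the new one
    # that never reached the "complete" step.
    return [
        s.snap_id
        for s in snapshots
        if s.primary and not s.complete and s.snap_id < new_snap_id
    ]
```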
galsalomon66 [Fri, 10 Mar 2023 12:27:05 +0000 (14:27 +0200)]
rgw: reef: adding s3test albin/json-op-serial
modify the json chunk processing function to handle offset/length the way csv-processing does
fix a valgrind warning :: Conditional jump or move depends on uninitialised value
when Trino is used, the Trino server issues multiple requests per single query; upon completion of all requests
the results are merged (by Trino). These requests split the input into equal parts; the RGW side should be aligned with Trino's expectations (for the result).
fixing the main routine for shaping the chunk (range-scan) for Trino processing
upon removing the payload-TAG, the response element index needs to change
handling more use cases for "shaping" the processed chunk by s3select per Trino request
re-shape the processed chunk only when Trino sent the request
bug-fix: the chunk offset was not handled correctly
bug-fix: progress-message calculation
modifying the range-request boundaries only upon a Trino request.
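To illustrate what "shaping" a chunk for a range request can mean, here is a sketch of one common convention for splitting newline-delimited input into byte ranges so that every row is processed exactly once: a range owns the rows that start inside it. This is an assumption made for illustration and not necessarily the logic RGW or s3select implements; `shape_chunk` is a hypothetical name.

```python
def shape_chunk(data: bytes, offset: int, length: int) -> bytes:
    """Return the whole rows that start inside [offset, offset + length)."""
    end = min(offset + length, len(data))
    if offset == 0:
        start = 0
    else:
        # Skip the row that began in the previous range: resume right after
        # the first newline found at or after offset - 1.
        nl = data.find(b"\n", offset - 1)
        if nl == -1:
            return b""
        start = nl + 1
    if start >= end:
        return b""  # no row starts inside this range
    # Extend past the range end to finish the last row that started inside it.
    nl = data.find(b"\n", end - 1)
    stop = len(data) if nl == -1 else nl + 1
    return data[start:stop]
```

Under this convention the shaped chunks of consecutive ranges neither overlap nor leave gaps, so results merged by the caller cover each row once.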
Zac Dover [Fri, 12 May 2023 10:35:25 +0000 (20:35 +1000)]
doc/cephfs: rectify prompts in fs-volumes.rst
Make sure all prompts are unselectable. This PR is meant to be
backported to Reef, Quincy, and Pacific, to get all of the prompts into
a fit state so that a line-edit can be performed on the English language
in this file.
Zac Dover [Fri, 5 May 2023 06:35:28 +0000 (16:35 +1000)]
doc/cephfs: repairing inaccessible FSes
Add a procedure to doc/cephfs/troubleshooting.rst that explains how to
restore access to FileSystems that became inaccessible after
post-Nautilus upgrades. The procedure included here was written by Harry
G Coin, and merely lightly edited by me. I include him here as a
"co-author", but it should be noted that he did the heavy lifting on
this.
See the email thread here for more context:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/HS5FD3QFR77NAKJ43M2T5ZC25UYXFLNW/
Co-authored-by: Harry G Coin <hgcoin@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
Signed-off-by: Pedro Gonzalez Gomez <pegonzal@redhat.com> Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
(cherry picked from commit 8177a748bd831568417df5c687109fbbbd9b981d)
Pere Diaz Bou [Mon, 6 Mar 2023 19:32:24 +0000 (20:32 +0100)]
mgr/dashboard: replace ajsf with formly
The ajsf json schema library for angular doesn't seem to be actively
maintained. Instead, formly is a well maintained replacement with extra
stuff like built-in validators, support for json schemas, custom
components, etc.
Textareas weren't supported on ajsf, therefore, it made sense to move to
this dep instead.
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com> Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 2c43dd0c16e3cc3b3eada03ed11958a689cc4bcd)
Laura Flores [Mon, 1 May 2023 16:28:54 +0000 (16:28 +0000)]
mgr: add urllib3==1.26.15 to mgr/requirements.txt
We do not depend on any particular version of
urllib3, but as a workaround to the incompatibility
of urllib3 constraints between kubernetes and
requests, we need to pin it temporarily to
the version both are happy with.
Fixes: https://tracker.ceph.com/issues/59591 Signed-off-by: Laura Flores <lflores@redhat.com>
(cherry picked from commit 80d460005e44649191aa862fa78bd278644b5237)