Rishabh Dave [Thu, 19 Nov 2020 13:06:38 +0000 (18:36 +0530)]
ceph-volume: allow listing devices by OSD ID
Adds the ability to list devices by OSD IDs, i.e. "ceph-volume lvm list
3" would list all devices under OSD ID 3 (which can be up to 2 and 3
devices under filestore and bluestore OSDs respectively).
Fixes: https://tracker.ceph.com/issues/41294 Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 0792a396ce61b356ad004ac3dbd96fabed739eac)
Melissa Li [Thu, 26 May 2022 18:07:30 +0000 (14:07 -0400)]
mgr/dashboard: add rbd status endpoint
Show "No RBD pools available" error page when accessing block/rbd if there are no rbd pools.
Add a "button_name" and "button_route" property to `ModuleStatusGuardService` config to customize the button on the error page.
Modify `ModuleStatusGuardService` to execute API calls to `/ui-api/<uiApiPath>/status` which uses the `UIRouter`.
Fixes: https://tracker.ceph.com/issues/42109 Signed-off-by: Melissa Li <melissali@redhat.com>
(cherry picked from commit 6ac9b3cfe171a8902454ea907b3ba37d83eda3dc)
Nizamudeen A [Mon, 6 Jun 2022 05:51:29 +0000 (11:21 +0530)]
mgr/dashboard: configure rbd mirroring
One-click button in the case of an orch cluster for configuring the
rbd-mirroring when its not properly setup. This button will create an
rbd-mirror service and also an rbd labelled pool(replicated: size-3) (if they are not
existing)
Fixes: https://tracker.ceph.com/issues/55646 Signed-off-by: Nizamudeen A <nia@redhat.com>
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/core/error/error.component.html
src/pybind/mgr/dashboard/frontend/src/app/shared/services/module-status-guard.service.ts
src/pybind/mgr/dashboard/tests/test_rbd_mirroring.py
error component and module-status-guard.service.ts had one minor
conflict which was resolved by getting incoming changes (add missing
logic).
test_rbd_mirroring had an import conflict which was resolved by
accepting incoming changes which had the same imports with new ones.
Pere Diaz Bou [Mon, 30 May 2022 14:10:11 +0000 (16:10 +0200)]
mgr/dashboard: move replaying images to Syncing tab
Images with 'Replaying' state will be displayed in Syncing tab. syncTmpl
removed as it was unnecessary if sate is provided from the backend.
Replaying images in contrast of Syncing images don't have a progress
percentage, nevertheless, we have an approximation of how much time left
there is until the image is fully synced. Therefore, we can use seconds_until_synced to represent the progress.
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
Pere Diaz Bou [Fri, 13 May 2022 15:15:33 +0000 (17:15 +0200)]
mgr/dashboard: snapshot mirroring from dashboard
Enable snapshot mirroring from the Pools -> Image
Also show the mirror-snapshot in the image where snapshot is enabled
When parsing images if an image has the snapshot mode enabled, it will
try to run commands that don't work with that mode. The solution was
not running those for now and appending the mode in the get call.
Fixes: https://tracker.ceph.com/issues/55648 Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com> Signed-off-by: Nizamudeen A <nia@redhat.com> Signed-off-by: Aashish Sharma <aasharma@redhat.com> Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 489a385a95d6ffa5dbd4c5f9c53c1f80ea179142)
Ilya Dryomov [Sun, 26 Jun 2022 11:05:09 +0000 (13:05 +0200)]
librbd: update progress for non-existent objects on deep-copy
As a side effect of commit e5a21e904142 ("librbd: deep-copy image copy
state machine skips clean objects"), handle_object_copy() stopped being
called for non-existent objects. This broke progress_object_no logic,
which expects to "see" all object numbers so that update_progress()
callback invocations can be ordered. Currently update_progress() based
progress reporting gets stuck after encountering a hole in the image.
To fix, arrange for handle_object_copy() to be called for all object
numbers, even if ObjectCopyRequest isn't created. Defer the extra call
to the image work queue to avoid locking issues.
Zac Dover [Wed, 29 Jun 2022 12:57:13 +0000 (22:57 +1000)]
doc/index.rst: add link to Dev Guide basic workfl.
This PR adds a link to the "Basic Workflow" section of the
Developer Guide on the landing page of docs.ceph.com.
This PR is meant to improve the documentation for developers
new to Ceph and to guide them to instructions that will allow
them to become full-fledged contributors to the Ceph project
as quickly as possible.
The "Basic Workflow" page of the Developer Guide contains
information that answers almost all of the questions that I had
about contributing to the Ceph project when I was new to it,
and I am finally acting on my long-held conviction that the
"Basic Workflow" page of the Developer Guide should have a more
prominent position in the documentation suite than it has had.
Laura Flores [Thu, 9 Jun 2022 18:55:48 +0000 (18:55 +0000)]
test/librados: modify LibRadosMiscConnectFailure.ConnectFailure to comply with new seconds unit
The unit type for `client_mount_timeout` was changed from "float" to "secs" in 983b10506dc8466a0e47ff0d320d480dd09999ec. To make this test comply with the new
seconds unit change, we need to change the value to an integer, as seconds
does not accept float values.
Fixes: https://tracker.ceph.com/issues/55971 Signed-off-by: Laura Flores <lflores@redhat.com>
(cherry picked from commit f357459e6b159229ad40491709f756b06a6e87f1) Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
client: Inode::hold_caps_until is time from monotonic clock now.
Inode::hold_caps_until storing time from ceph::coarse_mono_clock now.
This upstream code of this PR i.e. the parent PR contains the file
`src/common/options/mds-client.yaml.in` which intends to fix a part
of this PR whereas this file didn't exist in pacific branch.So, those
changes of `src/common/options/mds-client.yaml.in` are incorporated in
the below mentioned files in Conlicts. Fixes: https://tracker.ceph.com/issues/52982 Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
(cherry picked from commit 983b10506dc8466a0e47ff0d320d480dd09999ec)
Zac Dover [Tue, 21 Jun 2022 14:09:05 +0000 (00:09 +1000)]
doc/dev: add context note to dev guide config
This PR adds a note directing first-time cloners of
their Ceph git forks to make sure to cd into the ceph/
directory before trying to run the "git config" commands.
fix endianness issue with WriteLogCacheEntry encoding. abandon the
use of bits in the union. make '&' operation with the whole union
filed(flags) to get the bit information.
Ilya Dryomov [Sat, 18 Jun 2022 13:25:49 +0000 (15:25 +0200)]
rbd-mirror: spell out "remote image is not primary" status correctly
There is a difference: non-primary means NON_PRIMARY promotion state,
while "not primary" can refer to any of NON_PRIMARY, ORPHAN or UNKNOWN
promotion states.
Ilya Dryomov [Sat, 18 Jun 2022 11:00:34 +0000 (13:00 +0200)]
rbd-mirror: fix up PrepareReplayDisconnected test case
It was botched in commit 2bca9ee96c65 ("rbd-mirror: consolidate
prepare local/remote image steps to bootstrap") and went unnoticed
because currently no special handling is needed for disconnected
clients -- is_disconnected() check happens to be the last step
and it doesn't generate an error.
Ilya Dryomov [Mon, 20 Jun 2022 12:19:41 +0000 (14:19 +0200)]
rbd-mirror: generally skip replay/resync if remote image is not primary
Replay and resync should generally be skipped if the remote image is
not primary.
If this is not done for replay, snapshot-based mirroring can run into
a livelock if the primary image is demoted while a mirror snapshot is
being synced. On the demote site, rbd-mirror would pick up the just
demoted image, grab the exclusive lock on it and idle waiting for a new
mirror snapshot to be created. On the (still) non-primary site,
rbd-mirror would eventually finish syncing that mirror snapshot and
attempt to unlink from it on the demote site. These attempts would
fail with EROFS due to exclusive lock being held in the "refuse proxied
maintenance operations" mode, blocking forward progress (syncing of the
demotion snapshot so that the non-primary image can be orderly promoted
to primary, etc).
If this is not done for resync, data loss can ensue as the just demoted
image would be immediately trashed, underneath the non-primary site that
is still syncing.
Currently this is done in PrepareReplayRequest only for journal-based
mirroring. Note that it is conditional: if the local image is linked
to the remote image, proceeding is desirable.
Generalize this check, consolidate it with a related check in
PrepareRemoteImageRequest and move the result to BootstrapRequest to
cover both "local image does not exist" and "local image is unlinked"
cases for both modes.
Ilya Dryomov [Sat, 18 Jun 2022 10:35:51 +0000 (12:35 +0200)]
rbd-mirror: strengthen is_local_primary() and is_linked()
Initialize local_promotion_state and remote_promotion_state to UNKNOWN
instead of counterintuitive PRIMARY and NON_PRIMARY -- half the time the
final values are flipped. Then is_local_primary() and is_linked() can
be strengthened as a non-existent image should stay in UNKNOWN.
Kotresh HR [Fri, 18 Mar 2022 06:43:53 +0000 (12:13 +0530)]
mgr/volumes: Disable quota for mgr libcephfs connection
This is done to give 'mgr' libcephfs connection right to bypass
quota. The mgr/volumes plugin maintains configuration files
with in the directory where the user has enforced quota. So
when the quota is met, certain mgr/volumes apis don't work as
intended. e.g., When subvolumegroup quota is met, the group's
subvolume removal with '--retain-snapshots' fails.
This is done to give 'mgr' libcephfs connection right to bypass
quota. The mgr/volumes plugin maintains configuration files
with in the directory where the user has enforced quota. So
when the quota is met, certain mgr/volumes apis don't work as
intended. e.g., When subvolumegroup quota is met, the group's
subvolume removal with '--retain-snapshots' fails.
Conflicts:
src/pybind/mgr/volumes/fs/operations/group.py
- Updates in defination of create_groups
src/pybind/mgr/volumes/fs/volume.py
- Added set_group_attrs in import list and split long line
Xiubo Li [Wed, 8 Jun 2022 05:00:20 +0000 (13:00 +0800)]
qa: wait rank 0 to become up:active state before mounting fuse client
When setting the ec pool to the layout the filesystem may not be
ready, so when mounting a fuse client it will fail. To fix this we
need to wait at least the rank 0 to be in up:active state.
Xiubo Li [Fri, 27 May 2022 05:11:44 +0000 (13:11 +0800)]
client: choose auth MDS for getxattr with the Xs caps
If any 'x' caps is issued we can just choose the auth MDS instead
of the random replica MDSes. Because only when the Locker is in
LOCK_EXEC state will the loner client could get the 'x' caps. And
if we send the getattr requests to any replica MDS it must auth pin
and tries to rdlock from the auth MDS, and then the auth MDS need
to do the Locker state transition to LOCK_SYNC. And after that the
lock state will change back.
This cost much when doing the Locker state transition and usually
will need to revoke caps from clients.
And for the 'Xs' caps for getxattr we will also choose the auth MDS,
because the MDS side code is buggy due to setxattr won't notify the
increased xattr_version to replicated MDSes when the values changed
and the replica MDS will return the old xattr_version value. The
client will just drop the xattr values since it sees the xattr_version
doesn't change.
Xiubo Li [Wed, 16 Mar 2022 09:15:57 +0000 (17:15 +0800)]
mds, client: only send the metrices supported by MDSes
For the old ceph clusters the clients won't send any metrics to
them as default unless they have backported this commit, but there
has one option 'client_collect_and_send_global_metrics' still could
be used to enable it manually.
This will fix the crash bug when upgrading from old ceph clusters,
which will crash the MDSes once they receive unknown metrics.
mgr/cephadm: adding logic to close ports when removing a daemon Fixes: https://tracker.ceph.com/issues/52906 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 4deb546ffd67ac8f05d2788150764a26b5671b87)