Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit 149e13f8da960f3f96d2267596cea821e0e3ebe4)
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
- ceph-nvmeof-mon: nvme-gw create/delete
* move creation to the `daemon_check_post` method to prevent zombie creation.
This change aligns with the dashboard logic, which uses the same callback.
* implement a `purge` method to ensure that no zombie gateways are left behind
once the service is removed.
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
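A rough sketch of the idea behind these two changes (all class and method names below are illustrative, not the real cephadm API): register the gateway with the monitor only after the daemon is confirmed up, and on purge delete every gateway entry still tracked.

```python
# Hypothetical sketch: create the gateway entry only once the daemon is
# confirmed running (daemon_check_post), and on purge delete every entry
# still known, so no "zombie" gateways outlive the service.
# Names are illustrative, not the real cephadm service class.

class NvmeofServiceSketch:
    def __init__(self, mon_command):
        # mon_command(cmd: dict) models issuing a mon command
        self.mon_command = mon_command
        self.known_gateways = {}  # gw_id -> (pool, group)

    def daemon_check_post(self, daemon_id, pool, group):
        # Runs after the daemon is confirmed up, so a failed deployment
        # never registers a zombie gateway with the monitor.
        self.mon_command({'prefix': 'nvme-gw create',
                          'id': daemon_id, 'pool': pool, 'group': group})
        self.known_gateways[daemon_id] = (pool, group)

    def purge(self):
        # Remove every gateway the monitor still knows about.
        for gw_id, (pool, group) in list(self.known_gateways.items()):
            self.mon_command({'prefix': 'nvme-gw delete',
                              'id': gw_id, 'pool': pool, 'group': group})
            del self.known_gateways[gw_id]
```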
- nvmeof service spec: enable_monitor_client by default
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
- fix gw stuck in ana-state GW_WAIT_FAILBACK_PREPARED
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
(cherry picked from commit b0c764b6e22fbb51f59e469d9dd99895c152f73e)
- NVMeofGwMonitorClient: assert gateway state does not vanish
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
- send unavailable in beacon response to a gw that did not exit after failover
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
- fix blacklist start when osd is not writable
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
- Clean up
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
(cherry picked from commit 29299a83a4f4b79a83cc1187e32d40ce79f2bb9a)
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
- gw subsystems update
Propose a state change when the set of subsystems reported by the gateway
changes. Otherwise, only the initially reported subsystems are handled, and
any new subsystem's listener ANA groups could remain "unreachable".
Example of the bug: https://github.com/ceph/ceph-nvmeof/actions/runs/8603572675/job/23576286745?pr=560
Resolves a bug introduced in fcfa6e797f021886971e3af97a5158797cd2d96f
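The core of the fix can be modeled as follows (an illustrative sketch, not the actual monitor-client code): compare the subsystem list carried in each beacon against the last known list and propose a map update whenever it differs, not only on the first report.

```python
# Illustrative model of the fix: propose a state change whenever the
# reported subsystem set changes, not just on the initial report.
# Without this, listeners of a newly added subsystem could stay
# "unreachable" because the monitor never learns about them.

def handle_beacon(last_subsystems, beacon_subsystems):
    """Return (new_subsystems, should_propose)."""
    if set(beacon_subsystems) != set(last_subsystems):
        # Subsystem set changed: record it and propose a map update.
        return list(beacon_subsystems), True
    # No change: keep the current state, nothing to propose.
    return last_subsystems, False
```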
There are two 'fsid' references that are self-scoped variables upstream;
they should be local variables downstream.
This needed to use the container id it was passed, instead of ctx.image,
which is likely to be `None` when this is run.
Fixes: https://tracker.ceph.com/issues/64229
Signed-off-by: Adam King <adking@redhat.com>
- commit 5c613b3788d9ae686b4dc29d9414674ecb6f6adb
Author: Roy Sahar <royswi@gmail.com>
Date: Thu Feb 8 17:58:43 2024 +0200
nvmeof: Add mount for log location
Signed-off-by: Roy Sahar <royswi@gmail.com>
- commit 726050aa3fd1ce2b21539fdd632b16e880bf2946
Author: Roy Sahar <royswi@gmail.com>
Date: Wed Apr 3 12:21:05 2024 +0300
Ceph NVMEoF Gateway - add mtls folder for nvmeof container
Signed-off-by: Roy Sahar <royswi@gmail.com>
Resolves: rhbz#2273837
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
- fix assert issue in find_failback_gw - failover_peers array was removed
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit 4c6a9b3ed194eb7837e6b341aba227042e497078)
- try to resolve slow ops
It seems that if the monitor's preprocess_query(MonOpRequestRef op) returns
true, then mon.reply_command() should be called. Remove the path that
returns true without calling mon.reply_command().
(cherry picked from commit 99a345cb0d70851addc33aa3fa7fe07ed5d67b94)
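The invariant described above can be sketched with a toy model (Python stand-in for the C++ monitor code; names and command prefixes are illustrative): every path in preprocess_query that returns true must have already sent a reply, otherwise the op is never answered and shows up as a slow op.

```python
# Toy model of the monitor dispatch invariant: if preprocess_query()
# returns True (request fully handled, no paxos update needed), a reply
# must already have been sent. The buggy path returned True without
# replying, leaving the op stuck as a slow op.

class MiniMon:
    def __init__(self):
        self.replied = []  # (op, return code) pairs we answered

    def reply_command(self, op, code):
        self.replied.append((op, code))

    def preprocess_query(self, op):
        # Every early-return path replies before returning True.
        if op.get('prefix') == 'nvme-gw show':
            self.reply_command(op, 0)
            return True            # handled here, reply already sent
        return False               # fall through to the update path
```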
Leonid Chernin [Tue, 17 Oct 2023 13:25:07 +0000 (13:25 +0000)]
mon: add NVMe-oF gateway monitor and HA
- gateway submodule
Fixes: https://tracker.ceph.com/issues/64777
This PR adds high availability support for the nvmeof Ceph service. High availability means that even when a given GW is down, the initiator has another available path to continue I/O through another GW. It is achieved by running an nvmeof service consisting of at least 2 nvmeof GWs in the Ceph cluster. Every GW is seen by the host (initiator) as a separate path to the NVMe namespaces (volumes).
The implementation consists of the following main modules:
- NVMeofGWMon - a PaxosService. It is a monitor service that tracks the status of the running nvmeof services and takes action when services fail or are restored.
- NVMeofGwMonitorClient - an agent that runs as part of each nvmeof GW. It sends beacons to the monitor to signal that the GW is alive. As part of the beacon, the client also sends information about the service, which the monitor uses to make decisions and perform operations.
- MNVMeofGwBeacon - the structure used by the client and the monitor to send/receive the beacons.
- MNVMeofGwMap - the map tracks the status of the nvmeof GWs and defines the new role of every GW. When GWs go down or are restored, the map reflects the resulting role of each GW. The map is distributed to the NVMeofGwMonitorClient on each GW, which applies the required changes to the GW.
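The beacon flow between these modules can be modeled minimally as follows (an illustrative sketch; class names, the grace value, and state strings are assumptions, not the real Ceph types): each GW client periodically reports itself alive, and the monitor-side map marks a GW unavailable once no beacon arrives within a grace period.

```python
# Minimal model of the beacon/liveness flow described above
# (illustrative only): clients send timestamped beacons, and the map
# considers a GW unavailable once the grace period elapses without one.

GRACE = 10.0  # seconds without a beacon before a GW is considered down

class GwMapSketch:
    def __init__(self):
        self.last_beacon = {}  # gw_id -> timestamp of last beacon

    def on_beacon(self, gw_id, now):
        # Called when a beacon from a GW client arrives.
        self.last_beacon[gw_id] = now

    def availability(self, gw_id, now):
        last = self.last_beacon.get(gw_id)
        if last is None or now - last > GRACE:
            # Failover: peers would take over this GW's ANA groups.
            return 'UNAVAILABLE'
        return 'AVAILABLE'
```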
It also adds 3 new mon commands:
- nvme-gw create
- nvme-gw delete
- nvme-gw show
The commands are used by cephadm to inform the monitor that a new GW is
deployed. The monitor updates the map accordingly and tracks this GW until
it is deleted.
Casey Bodley [Wed, 19 Nov 2025 15:40:30 +0000 (10:40 -0500)]
doc/rgw: remove metrics.rst which did not apply to reef
metrics.rst was backported in full due to conflicts from changes
in 1a3f3c9a8d2cd20821ebf3d45a452e18ccad4a64. These features don't exist
on reef, so remove the whole page.
Rishabh Dave [Wed, 16 Apr 2025 08:24:27 +0000 (13:54 +0530)]
mgr/vol: don't delete user-created pool in "volume create" command
If one of the pool names passed to the "ceph fs volume create" command
(through --data-pool and --meta-pool) is absent, don't delete the pool
that is present and was passed to this command during the command's
cleanup code.
In other words, the "volume create" command should continue deleting pools
it created, but not delete pools created by the user.
Fixes: https://tracker.ceph.com/issues/70945
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 4299f660ba9c83ef9305b2834c195da9008810a9)
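The cleanup rule this commit enforces can be sketched as follows (an illustrative model; the function and parameter names are assumptions, not the real mgr/volumes code): on failure, delete only the pools the command created itself, never a pre-existing pool the user passed in.

```python
# Sketch of the "volume create" cleanup rule: remove only pools this
# command created; a user-supplied, pre-existing pool is left untouched
# even when the command fails partway through.

def cleanup_volume_pools(created_by_us, requested_pools, delete_pool):
    """Delete only pools this command created, not user-supplied ones."""
    for pool in requested_pools:
        if pool in created_by_us:
            delete_pool(pool)      # safe: we created it during this run
        # else: user-created pool, leave it in place
```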
Rishabh Dave [Mon, 3 Mar 2025 16:36:10 +0000 (22:06 +0530)]
doc/cephfs: mention new options for "fs volume create" cmd
The "ceph fs volume create" command accepts 2 new options that allow users
to pass data and metadata pool names. Update the docs to mention both
options.