librbd: localize snap_remove op for mirror snapshots
A client may not send its lock request quickly enough to
obtain the exclusive lock for an operation when a competing
client responds faster. This can happen when a peer site has
different performance characteristics or latency. Instead of
relying on this unpredictable behavior, localize the operation to
the primary cluster.
Fixes: https://tracker.ceph.com/issues/59393
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
Adam King [Mon, 20 Mar 2023 19:31:12 +0000 (15:31 -0400)]
mgr/cephadm: asyncio based universal timeout for ssh/cephadm commands
Since we already make use of asyncio for our ssh commands,
we can use asyncio's timeout on waiting for concurrent futures to complete
as a way to have universal timeouts on our cephadm commands.
This change also creates a contextmanager that will catch any asyncio.TimeoutError.
Using the contextmanager along with calls to the wait_async function
will catch any timeout exception raised and convert it into an appropriate
OrchestratorError that says what timed out and where, if that information
was provided (the host and the command that was run). This allows us to guarantee that a
background ssh command eventually returns and to inform users of any
timeouts by raising a health warning or logging the error instead
of sitting idle indefinitely.
Fixes: https://tracker.ceph.com/issues/54024
Signed-off-by: Adam King <adking@redhat.com>
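A minimal sketch of the pattern described in this commit message, not the actual cephadm code. It assumes a simplified `wait_async` built on `asyncio.wait_for`; the name `async_timeout_handler`, the message format, and the local `OrchestratorError` class are placeholders chosen for illustration.

```python
import asyncio
from contextlib import contextmanager
from typing import Any, Awaitable, Iterator, Optional


class OrchestratorError(Exception):
    """Local placeholder for the orchestrator's error type (assumption)."""


def wait_async(coro: Awaitable[Any], timeout: Optional[float] = None) -> Any:
    """Run a coroutine to completion; raise asyncio.TimeoutError on expiry."""
    async def _bounded() -> Any:
        return await asyncio.wait_for(coro, timeout)
    return asyncio.run(_bounded())


@contextmanager
def async_timeout_handler(host: Optional[str] = None,
                          cmd: Optional[str] = None) -> Iterator[None]:
    """Convert a timeout into an OrchestratorError naming the host and command."""
    try:
        yield
    except asyncio.TimeoutError:
        msg = f'Command "{cmd}" timed out' if cmd else "Command timed out"
        if host:
            msg += f" on host {host}"
        raise OrchestratorError(msg)


async def slow_command() -> str:
    # Hypothetical stand-in for an ssh command that hangs.
    await asyncio.sleep(5)
    return "done"
```

Wrapping a call such as `wait_async(slow_command(), timeout=0.01)` in `with async_timeout_handler(host="host1", cmd="cephadm ls"):` raises an `OrchestratorError` instead of blocking; the caller can then log it or raise a health warning.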
doc/start: edit first 50 lines of documenting-ceph
Edit the first 150 lines of doc/start/documenting-ceph.rst. This is part
of an initiative to harvest the fruits of Cephalocon 2023, at which
documentation proved to be in demand to a surprising degree.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
Mistakenly removed in commit d79f2a81541c ("docs: warning and remove
few docs section for Filestore Update docs after filestore removal.").
The kernel client, however new, will continue to be able to talk to
FileStore OSDs for as long as they exist.
Line-edit doc/rados/user-management.rst (2 of x). Some internal
references had to be removed, but these will be repaired when the next
part of this file is updated in a future PR.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
librbd: always refresh after creating snapshot in CreatePrimaryRequest
Up until now this was conditioned on whether the caller expressed
interest in the ID of the created snapshot and happened to work only
because CreatePrimaryRequest wasn't actually consulting any mirror
snapshot metadata. This has just changed with unlink_peer() needing to
see an up-to-date complete flag which is set in SetImageStateRequest
following the write out of image state object(s).
librbd: remove previous incomplete primary snapshot after successfully creating a new one
Problem:
-------
At a high level, creating a primary snapshot consists of three steps:
1. actually creating a snapshot in the mirror namespace
2. generating a set of image state objects with additional metadata for
the snapshot
3. marking the snapshot as complete after the image state objects are
written out
Depending on the circumstances, a request to create a primary snapshot
can be forwarded to rbd-mirror daemon. If that happens and rbd-mirror
daemon gets axed for some practical reason after completing steps (1)
and/or (2) but before completing step (3), we are left with a
permanently incomplete primary snapshot because upon retrying that
primary snapshot creation request, librbd notices that such a snapshot
already exists. It does not check whether this "pre-existing" snapshot
is complete.
Solution:
--------
As part of the next mirror snapshot creation (say, triggered by the
scheduler), unlink_peer() is called; it checks whether any
incomplete snapshots exist and deletes them accordingly.
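The three steps and the cleanup can be modeled with a small in-memory sketch. This is an illustration only, not librbd code: the `Image` and `MirrorSnapshot` classes, the `interrupt_after_step` parameter, and the `unlink_incomplete` helper are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MirrorSnapshot:
    snap_id: int
    has_image_state: bool = False  # set by step 2
    complete: bool = False         # set by step 3


@dataclass
class Image:
    snaps: List[MirrorSnapshot] = field(default_factory=list)
    next_id: int = 1


def unlink_incomplete(image: Image, keep: MirrorSnapshot) -> None:
    """Drop every incomplete snapshot except the one just created."""
    image.snaps = [s for s in image.snaps if s.complete or s is keep]


def create_primary_snapshot(image: Image,
                            interrupt_after_step: int = 0) -> MirrorSnapshot:
    """Run steps 1-3; interrupt_after_step simulates the daemon dying."""
    snap = MirrorSnapshot(image.next_id)  # step 1: create the snapshot
    image.next_id += 1
    image.snaps.append(snap)
    if interrupt_after_step == 1:
        return snap
    snap.has_image_state = True           # step 2: write image state objects
    if interrupt_after_step == 2:
        return snap
    snap.complete = True                  # step 3: mark as complete
    unlink_incomplete(image, keep=snap)   # clean up earlier leftovers
    return snap
```

Calling `create_primary_snapshot(image, interrupt_after_step=2)` leaves an incomplete snapshot behind; the next uninterrupted call creates a complete one and removes the leftover.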
* refs/pull/50089/head:
doc: add a note for minimum compatible python version and supported distros
tools/cephfs/top/CMakeList.txt: check the minimum compatible python version for cephfs-top
Rishabh Dave [Tue, 18 Apr 2023 14:55:01 +0000 (20:25 +0530)]
qa/cephfs/cap_tester: simplify CapTester and its instantiation
Class CapTester contains two distinct, immiscible groups of methods: one
that tests MON caps and another that tests MDS caps. When CapTester is used
for the former, instantiation needs neither the mount object nor
the path where test files will be created, and there is no need to run the
method that creates files for testing rw permissions. When
the class is used for the latter, the exact opposite is true.
Create a separate class for each of these purposes, plus a class that
inherits from both, so that instantiating the class becomes
as simple as it can be.
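A rough sketch of the resulting class layout, under stated assumptions: the class and method names below are illustrative, a plain dict stands in for a CephFS mount object, and the checks are far simpler than the real tests.

```python
from typing import Dict, Iterable, Optional


class MonCapTester:
    """Tests MON caps; needs no mount object and creates no files."""

    def run_mon_cap_tests(self, permitted_fs: Iterable[str],
                          fs_ls_output: str) -> None:
        for fs_name in permitted_fs:
            assert fs_name in fs_ls_output, f"{fs_name} missing from output"


class MdsCapTester:
    """Tests MDS caps; needs a mount and a path for the test file."""

    def __init__(self, mount: Optional[Dict[str, str]] = None,
                 path: str = "") -> None:
        self.mount = mount  # dict used as a fake mount (assumption)
        self.path = path

    def run_mds_cap_tests(self, data: str) -> None:
        assert self.mount is not None, "MDS cap tests need a mount"
        self.mount[self.path] = data          # write the test file
        assert self.mount[self.path] == data  # read it back


class CapTester(MonCapTester, MdsCapTester):
    """Combines both groups for tests that need MON and MDS checks."""

    def run_cap_tests(self, permitted_fs: Iterable[str],
                      fs_ls_output: str, data: str) -> None:
        self.run_mon_cap_tests(permitted_fs, fs_ls_output)
        self.run_mds_cap_tests(data)
```

With this split, `MonCapTester()` can be instantiated with no arguments, while `CapTester(mount, path)` is used only where both kinds of checks are needed.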
Rishabh Dave [Thu, 6 Apr 2023 09:42:14 +0000 (15:12 +0530)]
qa/cephfs: move few methods such that they can be reused
Move get_mon_cap_from_keyring() and get_fsnames_from_moncap() from class
CapTester to main namespace of caps_helper.py so that they can be
imported freely and reused by tests.
This method checks whether the output of the command "ceph fs ls" for the
client ID it receives is the same as the output printed for client.admin.
Don't do so; limit the test to checking only that "ceph fs ls --id client.x -k
keyring_file" prints the fs name for which client.x has permissions.
Rishabh Dave [Fri, 31 Mar 2023 19:14:52 +0000 (00:44 +0530)]
qa/cephfs: improve caps_helper.CapTester
Improvement #1:
CapTester.write_test_files() not only creates the test file but also
does the following for every mount object it receives in parameters -
* carefully produces the path for the test file as per parameters
received
* generates the unique data for each test file on a CephFS mount
* creates a data structure -- list of lists -- that holds all this
information along with mount object itself for each mount object so
that tests can be conducted at a later point
Untangle this mess of code by splitting this method into 3 separate
methods -
1. To produce the path for test file (as per user's need).
2. To generate the data that will be written into the test file.
3. To actually create the test file on CephFS.
Improvement #2:
Remove the internal data structure used for testing -- self.test_set --
and use separate class attributes to store all the data required for
testing instead of a tuple. This serves two purposes -
One, it makes it easy to manipulate all this data from helper methods
and during debugging, especially in a PDB session.
And two, it makes it impossible to have multiple mounts/multiple "test sets"
within the same CapTester instance, for the sake of simplicity. Users can
instead create two CapTester instances if needed.
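The two improvements can be sketched as follows. This is a simplified illustration, not the code in caps_helper.py: the method names, the fixed file name "testfile", and the dict used in place of a mount object are assumptions.

```python
import posixpath
import uuid
from typing import Dict


class CapTester:
    """One instance holds exactly one mount, one path, and one data blob."""

    def __init__(self, mount: Dict[str, str], testdir: str = "/") -> None:
        self.mount = mount  # dict used as a fake CephFS mount (assumption)
        self.path = self.gen_test_file_path(testdir)
        self.data = self.gen_test_file_data()
        self.write_test_file()

    def gen_test_file_path(self, testdir: str) -> str:
        # 1. Produce the path for the test file.
        return posixpath.join(testdir, "testfile")

    def gen_test_file_data(self) -> str:
        # 2. Generate unique data to write into the test file.
        return f"test data {uuid.uuid4().hex}"

    def write_test_file(self) -> None:
        # 3. Actually create the test file on the (fake) mount.
        self.mount[self.path] = self.data
```

Because `path` and `data` are ordinary attributes rather than entries in a list of lists, they can be inspected directly in a debugger, and a second mount simply gets its own `CapTester` instance.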
Rishabh Dave [Thu, 13 Apr 2023 19:08:33 +0000 (00:38 +0530)]
qa/cephfs: don't inherit CephFSTestCase in CapTester
Inheriting CephFSTestCase in CapTester just for methods assertEqual()
and assertIn() from class unittest.TestCase is odd and heavy-weight.
Don't inherit CephFSTestCase and use simple assert instead.
We're currently installing cython with pip when using Ubuntu
to cross compile Ceph for Windows. This can fail with recent
Python versions if attempting to use the global env:
error: externally-managed-environment
× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
python3-xyz, where xyz is the package you are trying to
install.
Cython isn't really needed by the Windows build so we can go
ahead and drop it. We were hoping to use the Python bindings
on Windows, however Python extensions can't be cross compiled.
We're no longer using pip either, so we're dropping the dependency.
g++ was getting installed as a pip dependency, so we'll have to
include that instead. Note that g++ is used when building the boost
b2 tool.
While at it, we'll also ensure that git is installed.
Merge pull request #51055 from ceph/wip-yuriw-release-16.2.12-main
doc: 16.2.12 Release Notes
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>