Any unknown exception causes the module to be unloaded and left unresponsive.
It is therefore best to catch all exceptions during command-line interaction
and report them instead of crashing with a traceback.
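A minimal sketch of the idea, not the actual patch: a decorator that wraps a
command handler and converts any uncaught exception into an error result. The
names `report_errors` and `handle_command`, the `(retcode, stdout, stderr)`
return shape, and the choice of `-EINVAL` are assumptions made for
illustration.

```python
import errno
import functools
import traceback


def report_errors(handler):
    """Turn any uncaught exception into an error result instead of a crash."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        except Exception as exc:
            # Keep the full traceback for the log; give the user a short message.
            traceback.print_exc()
            return -errno.EINVAL, "", f"{type(exc).__name__}: {exc}"
    return wrapper


@report_errors
def handle_command(cmd):
    # Hypothetical handler: a missing "prefix" key raises KeyError.
    return 0, f"ran {cmd['prefix']}", ""
```

With this wrapper, `handle_command({})` returns an error tuple describing the
`KeyError` rather than letting the exception propagate to the caller.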
Dhairya Parmar [Mon, 22 May 2023 10:37:34 +0000 (16:07 +0530)]
mds: remove code to bypass dumping empty header scrub info
Previously, when ~mdsdir was scrubbed at the CephFS root, its header
was kept empty, so it was necessary not to dump its values
for 'scrub status'. Now that both scrubs (~mdsdir and root)
run under the same header, this code is no longer needed.
Dhairya Parmar [Mon, 22 May 2023 10:36:24 +0000 (16:06 +0530)]
mds: dump_values no more needed
Previously, two individual scrubs were initiated to scrub ~mdsdir
at root, and the ~mdsdir scrub wasn't given a tag, so it
was necessary not to dump its values in the output of 'scrub start'.
Now that the ~mdsdir and root scrubs run under a single header, this is
no longer needed, so remove the redundant code.
Dhairya Parmar [Mon, 22 May 2023 07:04:51 +0000 (12:34 +0530)]
mds: enqueue ~mdsdir at the time of enqueing root
This avoids the need to run individual scrubs for
~mdsdir and root, i.e. both scrubs run under the
same header. It also avoids an edge case: when
~mdsdir is huge and takes a long time to scrub,
the scrub status would report something like this until
root inodes kick in:
{
"status": "scrub active (757 inodes in the stack)",
"scrubs": {}
}
In the case where `iter->second.addr` is an empty address, the
m_locker->address string is assigned "0)/0" and therefore
will never be an empty string.
Ilya Dryomov [Fri, 16 Jun 2023 12:01:52 +0000 (14:01 +0200)]
qa/workunits/rbd: make continuous export-diff test actually work
The current version is pretty useless:
- "rbd bench" writes the same byte (0xff) over and over again, so
almost all checksumming is in vain
- snapshots are taken in a steady state (i.e. not under I/O), so no
race conditions can get exposed
- even with these caveats, it's not wired up into the suite
Redo this workunit to be a reliable reproducer for the issue fixed
in the previous commit and wire it up for both krbd and rbd-nbd.
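To illustrate the first point above, here is a small, self-contained sketch
(not part of the workunit) of why checksums cannot expose a lost write when
every write carries the same byte. The block size, the helper names and the
use of MD5 are assumptions chosen for the illustration.

```python
import hashlib

BLOCK = 4096  # hypothetical block size for the illustration


def checksum(image: bytes) -> str:
    return hashlib.md5(image).hexdigest()


def apply_writes(image: bytes, writes) -> bytes:
    """Return a copy of image with (block_index, fill_byte) writes applied."""
    out = bytearray(image)
    for index, fill in writes:
        out[index * BLOCK:(index + 1) * BLOCK] = bytes([fill]) * BLOCK
    return bytes(out)


def lost_write_detected(base: bytes, writes) -> bool:
    """Compare a full copy against one that silently missed the last write."""
    source = apply_writes(base, writes)
    broken_copy = apply_writes(base, writes[:-1])
    return checksum(source) != checksum(broken_copy)


base = b"\xff" * (4 * BLOCK)        # image already filled by an earlier pass
same_byte = [(0, 0xFF), (2, 0xFF)]  # every write repeats 0xff
varied = [(0, 0x11), (2, 0x22)]     # every write carries new data

print(lost_write_detected(base, same_byte))  # False: the loss is invisible
print(lost_write_detected(base, varied))     # True: the checksums differ
```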
Ilya Dryomov [Tue, 13 Jun 2023 11:36:02 +0000 (13:36 +0200)]
librbd: stop passing IOContext to image dispatch write methods
This is a major footgun since any value passed e.g. at the API layer
may be stale by the time we get to object dispatch. All callers are
passing the IOContext returned by get_data_io_context() for their
ImageCtx anyway, highlighting that the parameter is fictitious.
Only the read method can meaningfully take IOContext.
Ilya Dryomov [Mon, 12 Jun 2023 19:45:03 +0000 (21:45 +0200)]
librbd: use an up-to-date snap context when owning the exclusive lock
By effectively moving capturing of the snap context to the API layer,
commit 1d0a3b17f590 ("librbd: pass IOContext to image-extent IO
dispatch methods") introduced a nasty regression. The snap context can
be captured only after exclusive lock is safely held for the duration
of dealing with the image request and even then must be refreshed if
a snapshot creation request is accepted from a peer. This is needed to
ensure correctness of the object map in general and fast-diff states in
particular (OBJECT_EXISTS vs OBJECT_EXISTS_CLEAN) and object deltas
computed based off of them. Otherwise the object map that is forked
for the snapshot isn't guaranteed to accurately reflect the contents of
the snapshot when the snapshot is taken under I/O (as in disabling the
object map may lead to different results being returned for reads).
The regression affects mainly differential backup and snapshot-based
mirroring use cases with object-map and/or fast-diff enabled: since
some object deltas may be incomplete, the destination image may get
corrupted.
This commit represents a reasonable minimal fix: IOContext passed
through to ImageDispatch is effective only for reads and just gets
ignored for writes. The next commit cleans up further by undoing the
passing of IOContext through the image dispatch layers for writes.
Conflicts:
src/test/librados/aio.cc:
removed test case for rados_aio_write_op_operate2()
which wasn't backported
test case for rados_aio_write_op_operate() uses rados_stat()
instead of rados_stat2() which doesn't exist on pacific
no test_data.m_oid, used "foo" for oids
rgw: avoid string_view to temporary in RGWBulkUploadOp
the `else` block below constructs a temporary std::string that destructs
at the end of the statement, leaving `filename` as a dangling view:
```
filename = file_prefix + std::string(header->get_filename());
```
store a copy of the `std::string` instead
Nitzan Mordechai [Wed, 10 May 2023 09:42:07 +0000 (09:42 +0000)]
mon/MonClient: before complete auth with error, reopen session
When MonClient tries to authenticate and fails with -EAGAIN, there is
a possibility that we are no longer hunting and do not have active_con.
That results in MonClient being disconnected while ticks continue
without an open session.
The solution is to check at the end of auth whether we got an -EAGAIN
error and, if so, reopen the session and try auth again on the next tick.
Casey Bodley [Tue, 23 May 2023 16:31:54 +0000 (12:31 -0400)]
librados: use ObjectOperationImpl for rados_write_op_t
the c++ api uses ObjectOperationImpl to wrap ObjectOperation with
additional storage for an optional mtime. the c api now reuses
ObjectOperationImpl for its write operations only - the mtime isn't
needed for read ops
librbd: localize snap_remove op for mirror snapshots
A client may not request the lock quickly enough to obtain the
exclusive lock for an operation when a competing client
responds quicker. This can happen when a peer site has
different performance characteristics or latency. Instead of
relying on this unpredictable behavior, localize the operation to
the primary cluster.
Fixes: https://tracker.ceph.com/issues/59393
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
(cherry picked from commit ac552c9b4d65198db8038d397a3060d5a030917d)
Conflicts:
src/cls/rbd/cls_rbd.cc [ commit 3a93b40 ("librbd:
s/boost::variant/std::variant/") not in pacific ]
src/librbd/mirror/snapshot/UnlinkPeerRequest.cc [ ditto ]
ceph-volume: fix a bug in `get_lvm_fast_allocs()` (batch)
`get_lvm_fast_allocs()` in `devices/lvm/batch.py` calls the property
`Device.used_by_ceph` in order to filter out devices that are already
used by ceph. The issue is that `Device.used_by_ceph` itself filters
out journal devices (db/wal), given that a db/wal device can be shared
between multiple OSDs. The consequence is that `Device.used_by_ceph`
always returns False for a db/wal device (even if it is actually
already used by ceph), so `get_lvm_fast_allocs()` always returns the
full list of the db/wal devices passed to the `lvm batch` CLI command.
Finally, the logic in `devices.lvm.batch.get_deployment_layout()`
checks whether the length of the list returned by `get_lvm_fast_allocs()`
is equal to `num_osds` (the number of OSDs being created); if not, it fails.
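The failure mode described above can be modelled with a toy example. This is
a simplified sketch under stated assumptions, not the ceph-volume code: the
`Device` dataclass, its `is_journal` and `has_ceph_lv` fields, the
`fast_allocs()` helper and the device paths are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Device:
    path: str
    is_journal: bool   # True for a db/wal device
    has_ceph_lv: bool  # True if an OSD already uses this device

    @property
    def used_by_ceph(self) -> bool:
        # Modelled on the description: journal (db/wal) devices are filtered
        # out, so the answer is False even when has_ceph_lv is True.
        return self.has_ceph_lv and not self.is_journal


def fast_allocs(devices):
    """Toy stand-in for get_lvm_fast_allocs(): drop devices used by ceph."""
    return [d for d in devices if not d.used_by_ceph]


db_devices = [
    Device("/dev/vg/db0", is_journal=True, has_ceph_lv=True),   # already used
    Device("/dev/vg/db1", is_journal=True, has_ceph_lv=False),  # free
]
num_osds = 1  # one new OSD is being created

allocs = fast_allocs(db_devices)
layout_ok = len(allocs) == num_osds
print(len(allocs), layout_ok)
```

In this model the already-used db device is not filtered out, so two
candidates come back for a single OSD and the length check fails.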
Laura Flores [Mon, 5 Jun 2023 20:23:42 +0000 (15:23 -0500)]
qa/suites/rados: remove rook coverage from the rados suite
The rook team relies on a daily CI system to validate
rook changes. It doesn't seem that the teuthology tests
are maintained, so it makes sense to remove them from the
rados suite.
By removing this symlink, rook test coverage will remain
in the orch suite, and coverage will only be removed from the
rados suite.
Workaround for: https://tracker.ceph.com/issues/58585
Signed-off-by: Laura Flores <lflores@redhat.com>
(cherry picked from commit c26674ef4c6cbbdd94c54cafbd66e98704f044d7)
This commit https://github.com/ceph/ceph/commit/bdb2241ca5a9758e8c52d47320d8b5ea0766aea9
was an update covering logging changes in quincy, but seems to have been
erroneously included in a pacific batch backport https://github.com/ceph/ceph/pull/42736
It doesn't work in pacific. For example:
[ceph: root@vm-00 /]# ceph version
ceph version 16.2.13-257-gd8c5d349 (d8c5d34975dce1c5eb0aa3a7979a4d9b9a99d1ec) pacific (stable)
[ceph: root@vm-00 /]# ceph config set global log_to_journald false
Error EINVAL: unrecognized config option 'log_to_journald'
Ilya Dryomov [Sat, 27 May 2023 10:28:40 +0000 (12:28 +0200)]
osd/OSDCap: allow rbd.metadata_list method under rbd-read-only profile
This was missed in commit acc447d5de7b ("osd/OSDCap: rbd profile
permits use of rbd.metadata_list cls method") which adjusted only
"profile rbd" OSD cap. Listing image metadata is an essential part
of opening the image and "profile rbd-read-only" OSD cap must allow
it too.
While at it, constrain the existing grant for rbd profile from "any
object in the pool" to just "rbd_info object in the global namespace of
the pool" as this is where pool-level image metadata actually lives.
Nitzan Mordechai [Wed, 17 May 2023 05:47:09 +0000 (05:47 +0000)]
test: correct osd pool default size
Using the default pool size of 2 with random EIO thrashing can cause
some objects to be marked as lost.
Fix the typo 'osd default pool size: 3' to read 'osd pool default size: 3'
so that the pool size is correctly set to 3.
Nitzan Mordechai [Thu, 18 May 2023 13:37:38 +0000 (13:37 +0000)]
test: monitor thrasher wait until quorum
With a 1-second delay we may sometimes fail to get the correct quorum
length, since the monitor wasn't updated in time.
With this fix, we wait for quorum, checking every few
seconds (3) until a timeout (30).
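The polling pattern described above can be sketched as follows. This is an
illustration only, not the thrasher's code: `wait_for_quorum_size` and the
`get_quorum_size` callable are hypothetical names, and the `sleep` and
`clock` parameters exist so the loop can be exercised without real delays.

```python
import time


def wait_for_quorum_size(get_quorum_size, expected, timeout=30, interval=3,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until the quorum has the expected size or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if get_quorum_size() == expected:
            return True
        if clock() >= deadline:
            return False
        # Not there yet: wait before asking the monitors again.
        sleep(interval)
```

A caller would pass a function that queries the current quorum and returns
its length, then treat a `False` result as a test failure.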
Zac Dover [Thu, 25 May 2023 09:01:49 +0000 (19:01 +1000)]
doc/rados: fix link in common.rst
Fix a link in doc/rados/configuration/common.rst that was missing its
final letter, causing a 404 error when readers attempted to follow it.
This bug was reported by stalwart friend of the Ceph documentation
project Eugen Block, who is here credited as a co-author. This bug was
reported at https://pad.ceph.com/p/Report_Documentation_Bugs.