liu shi [Fri, 14 May 2021 07:51:01 +0000 (03:51 -0400)]
cpu_profiler: fix asok command crash
Fixes: https://tracker.ceph.com/issues/50814 Signed-off-by: liu shi <liu.shi@navercorp.com>
(cherry picked from commit be7303aafe34ae470d2fd74440c3a8d51fcfa3ff)
Adam Emerson [Thu, 20 Jul 2023 01:03:44 +0000 (21:03 -0400)]
build: install-deps.sh installs system boost on Jammy
Since the system boost on Jammy is new enough for Pacific and we don't have
Jammy packages for the older boost (we only have those for Bionic), just
install the system packages rather than fetching ceph-libboost.
No analogous commit exists in main: while main's Jammy case installs
ceph-libboost, here we just need a system package.
Fixes: https://tracker.ceph.com/issues/62103 Signed-off-by: Adam Emerson <aemerson@redhat.com>
Adam Emerson [Wed, 19 Jul 2023 21:12:08 +0000 (17:12 -0400)]
build: Remove old ceph-libboost* packages in install-deps
Here, we extract `clean_boost_on_ubuntu()` and call it before other
installs on Debian distributions so that if we install a system boost,
a potentially newer `ceph-libboost` won't get in the way.
As the sources.list.d entry removed by the original cleanup code isn't
the one the install code currently adds, add a removal for the currently
used source, then do an apt update so packages from the removed source are
no longer listed as available.
Two subsidiary dev packages from conflicting boost libraries can both be
installed, but this leaves apt in an inconsistent state. To clean this
up, add `--fix-missing` to the removal line and call
`clean_boost_on_ubuntu()` before other uses of apt.
Fixes: https://tracker.ceph.com/issues/62097 Signed-off-by: Adam Emerson <aemerson@redhat.com>
(cherry picked from commit 0c3f511e14af639b6509e69b889258b2f718f8fd)
Conflicts:
install-deps.sh
- Different boost version for Pacific than Squid.
- ci_debug does not exist in Pacific
- whitespace
- No INSTALL_EXTRA
Fixes: https://tracker.ceph.com/issues/62103 Signed-off-by: Adam Emerson <aemerson@redhat.com>
Jos Collin [Mon, 24 Jul 2023 08:46:52 +0000 (14:16 +0530)]
qa: fix cephfs-mirror unwinding and 'fs volume create/rm' order
* Fixes the ordering so that 'fs volume create' happens before the cephfs-mirror daemon starts.
* Fixes the ordering so that 'fs volume rm' happens only after the cephfs-mirror daemon unwinds.
- This prevents the issue of the mirror daemon not returning from a libcephfs call because
the volumes were deleted during cephfs_mirror_thrashing.
Fixes: https://tracker.ceph.com/issues/61182 Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit b9a1a3cdf9770bcb27d6e08ddbc059f01674f4b8)
Jos Collin [Fri, 23 Jun 2023 06:16:26 +0000 (11:46 +0530)]
mds: MDLog::_recovery_thread: handle the errors gracefully
A write fails if the MDS is already blocklisted due to the 'fs fail' issued by the qa tests.
Handle those write failures gracefully, even when the MDS is stopping.
Fixes: https://tracker.ceph.com/issues/61201 Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit d562905dcfb5b8a45ce7042c543720ef8b0fa05b)
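A rough sketch of the approach, using hypothetical names rather than the actual MDLog code: a failed journal write is treated as an expected outcome when the MDS is blocklisted or stopping, instead of tripping an assert.
```
#include <cerrno>
#include <cstdlib>
#include <iostream>

// Hypothetical sketch, not the real MDLog::_recovery_thread code.
// Idea: a journal write that fails because the MDS was blocklisted
// (e.g. after 'fs fail' in the qa tests) or is already stopping is
// reported and handled gracefully instead of asserting.
int handle_recovery_write_result(int r, bool blocklisted, bool stopping)
{
  if (r >= 0)
    return 0;                      // write succeeded, keep recovering
  if (blocklisted || stopping) {
    std::cerr << "recovery write failed (" << r
              << "); MDS blocklisted or stopping, bailing out cleanly\n";
    return r;                      // graceful exit path, no assert
  }
  std::abort();                    // any other failure is still fatal
}
```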
Dhairya Parmar [Mon, 22 May 2023 10:37:34 +0000 (16:07 +0530)]
mds: remove code to bypass dumping empty header scrub info
Previously, when ~mdsdir was scrubbed at the CephFS root, its header
was kept empty, so its values had to be excluded from the output of
'scrub status'. Now that both scrubs (~mdsdir and root) run under the
same header, this code is no longer needed.
Dhairya Parmar [Mon, 22 May 2023 10:36:24 +0000 (16:06 +0530)]
mds: dump_values no longer needed
Previously, two individual scrubs were initiated to scrub ~mdsdir at
the root, and the ~mdsdir scrub wasn't given any tag, so its values had
to be excluded from the output of 'scrub start'. Now that the ~mdsdir
and root scrubs run under a single header, this is no longer needed,
so remove the redundant code.
Dhairya Parmar [Mon, 22 May 2023 07:04:51 +0000 (12:34 +0530)]
mds: enqueue ~mdsdir at the time of enqueuing root
This avoids the need to run individual scrubs for ~mdsdir and root,
i.e. both scrubs run under the same header. It also avoids the edge
case where, if ~mdsdir is huge and takes a long time to scrub, the
scrub status would report something like this until the root inodes
kick in:
{
"status": "scrub active (757 inodes in the stack)",
"scrubs": {}
}
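A rough sketch of the idea, with hypothetical names rather than the real ScrubStack API: both paths are pushed under one shared header, so 'scrub status' reports them as a single scrub.
```
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Hypothetical sketch, not the actual MDS ScrubStack code.
struct ScrubHeaderSketch {
  std::string tag;        // one tag now covers both trees
  bool recursive = true;
};

struct ScrubStackSketch {
  std::deque<std::pair<std::string, std::shared_ptr<ScrubHeaderSketch>>> stack;

  // Previously "/" and "~mdsdir" were enqueued with separate headers;
  // enqueuing both here keeps their progress under a single header, so
  // the status never shows an active scrub with an empty "scrubs" map.
  void enqueue_root_and_mdsdir(std::shared_ptr<ScrubHeaderSketch> header) {
    stack.emplace_back("~mdsdir", header);
    stack.emplace_back("/", header);
  }
};
```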
Xiubo Li [Wed, 28 Jun 2023 13:59:53 +0000 (21:59 +0800)]
client: force sending cap revoke ack always
If, just before the revoke request (which increases the 'seq') is sent
out, the client releases the corresponding caps and sends out a cap
update request with the old 'seq', the MDS will miss checking the seqs
and recalculating the caps.
Xiubo Li [Sat, 10 Apr 2021 04:52:24 +0000 (12:52 +0800)]
client: wait for rename to finish
In rare cases, if another thread tries to look up the dst dentry during
the rename, it may get an inconsistent result in which both the src
dentry and the dst dentry link to the same inode at the same time.
Xiubo Li [Thu, 1 Jun 2023 12:00:01 +0000 (20:00 +0800)]
client: do not send metrics until the MDS rank is ready
In some cases, when there are a lot of clients and those clients also
have a lot of known requests to replay, the metric requests will be
dropped by the MDS because it is still in the clientreplay state, and
the useless metric requests will also slow the MDS down.
Xiubo Li [Mon, 8 May 2023 05:48:43 +0000 (13:48 +0800)]
client: always add one new capsnap if Fb is used and Fw is not used
If we set 'writing' to 1 when the 'Fb' cap is used, then later, if we
have any dirty caps, adding a new capsnap will be skipped and the
existing capsnap reused, which is incorrect.
At the same time, trigger a flush of the buffer when making a snapshot
if Fb is being used.
Milind Changire [Mon, 8 May 2023 07:52:12 +0000 (13:22 +0530)]
mon: block osd pool mksnap for fs pools
Commit 23db15d5c2b disabled pool snaps for the rados mksnap path. But
ceph osd pool mksnap was an alternate way that pool snaps could be
created.
This commit disables pool snaps via this alternate path as well.
NOTE:
Pool-level snaps and fs-level snaps can't co-exist since snap IDs are
likely to clash between the two different mechanisms, which can result
in unintentional data loss when either of the snaps is deleted.
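A minimal sketch of the guard, using a hypothetical handler and a boolean standing in for the real lookup of whether a CephFS file system uses the pool (the actual check lives in the monitor's pool mksnap handling):
```
#include <cerrno>
#include <string>

// Hypothetical sketch, not the actual OSDMonitor code.  The guard mirrors
// the one already applied to the rados mksnap path: a pool that backs a
// CephFS file system refuses pool-level snapshots, since pool snap IDs and
// fs snap IDs can clash and deleting either can lose data.
int handle_pool_mksnap(bool pool_used_by_cephfs,
                       const std::string& snap_name,
                       std::string* err)
{
  if (pool_used_by_cephfs) {
    *err = "pool is in use by CephFS; pool snapshots are disallowed";
    return -EOPNOTSUPP;
  }
  (void)snap_name;  // the real code would record snap_name against the pool
  return 0;
}
```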
In the case where `iter->second.addr` is an empty address, the
m_locker->address string is assigned "0)/0" and therefore will never
result in an empty string.
Ilya Dryomov [Fri, 16 Jun 2023 12:01:52 +0000 (14:01 +0200)]
qa/workunits/rbd: make continuous export-diff test actually work
The current version is pretty useless:
- "rbd bench" writes the same byte (0xff) over and over again, so
almost all checksumming is in vain
- snapshots are taken in a steady state (i.e. not under I/O), so no
race conditions can get exposed
- even with these caveats, it's not wired up into the suite
Redo this workunit to be a reliable reproducer for the issue fixed
in the previous commit and wire it up for both krbd and rbd-nbd.
Ilya Dryomov [Tue, 13 Jun 2023 11:36:02 +0000 (13:36 +0200)]
librbd: stop passing IOContext to image dispatch write methods
This is a major footgun since any value passed e.g. at the API layer
may be stale by the time we get to object dispatch. All callers are
passing the IOContext returned by get_data_io_context() for their
ImageCtx anyway, highlighting that the parameter is fictitious.
Only the read method can meaningfully take IOContext.
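A simplified illustration of the interface change, with hypothetical types and signatures rather than the real librbd dispatch API: reads keep an explicit IOContext because they may target a snapshot, while writes drop the parameter and resolve the image's current data IOContext internally.
```
// Hypothetical sketch, not the real librbd image dispatch interface.
struct IOContextSketch { /* pool, namespace and snapshot selection */ };

struct ImageCtxSketch {
  IOContextSketch data_io_context;
  const IOContextSketch& get_data_io_context() const { return data_io_context; }
};

struct ImageDispatchSketch {
  ImageCtxSketch* ictx = nullptr;

  // read: the caller chooses the io_context, e.g. to read from a snapshot
  void read(const IOContextSketch& io_context) { (void)io_context; /* ... */ }

  // write: no io_context parameter; it is always derived from the image,
  // so a stale value can no longer be smuggled in from the API layer
  void write() {
    const IOContextSketch& io_context = ictx->get_data_io_context();
    (void)io_context;  // dispatch the write using the image's own context
  }
};
```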
Ilya Dryomov [Mon, 12 Jun 2023 19:45:03 +0000 (21:45 +0200)]
librbd: use an up-to-date snap context when owning the exclusive lock
By effectively moving capturing of the snap context to the API layer,
commit 1d0a3b17f590 ("librbd: pass IOContext to image-extent IO
dispatch methods") introduced a nasty regression. The snap context can
be captured only after exclusive lock is safely held for the duration
of dealing with the image request and even then must be refreshed if
a snapshot creation request is accepted from a peer. This is needed to
ensure correctness of the object map in general and fast-diff states in
particular (OBJECT_EXISTS vs OBJECT_EXISTS_CLEAN) and object deltas
computed based off of them. Otherwise the object map that is forked
for the snapshot isn't guaranteed to accurately reflect the contents of
the snapshot when the snapshot is taken under I/O (as in disabling the
object map may lead to different results being returned for reads).
The regression affects mainly differential backup and snapshot-based
mirroring use cases with object-map and/or fast-diff enabled: since
some object deltas may be incomplete, the destination image may get
corrupted.
This commit represents a reasonable minimal fix: IOContext passed
through to ImageDispatch is effected only for reads and just gets
ignored for writes. The next commit cleans up further by undoing the
passing of IOContext through the image dispatch layers for writes.
An MDS sends up:boot beacons until it sees an MDSMap in which it has joined.
If the mons are delaying the proposal of the new FSMap, including because of
quorum loss, the subsequent up:boot messages would cause the MDSMonitor to
wrongly interpret the booting MDS as replacing itself.
Instead, just ignore up:boot messages (as intended) when we know the MDS
has been added to the pending map.
Fixes: https://tracker.ceph.com/issues/59318 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 2e0bcc3c333d7fca2e06eafa1e3dc3a7c3ae1b36)
mon/MDSMonitor: batch last_metadata update with pending
I believe the problem here is that the last_metadata change is lost in an
ECANCELED/EAGAIN transaction while the pending map change goes through in
the next one. I've been unable to find an exact way to reproduce this.
The problem seems to occur when upgrades are performed, which would indicate
shuffling among the monitors where quorum would be lost repeatedly.
This seems to be the most likely explanation so let's go ahead and make
this change even without the reproducer. In any case, it has the added
benefit of batching the pending map update (to up:standby) with the
last_metadata update.
Fixes: https://tracker.ceph.com/issues/24403 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 6f69fe9739a23974a46a4e13e4c29b431d95acc4)
Conflicts:
src/test/librados/aio.cc:
- removed test case for rados_aio_write_op_operate2(), which wasn't backported
- test case for rados_aio_write_op_operate() uses rados_stat() instead of rados_stat2(), which doesn't exist on pacific
- no test_data.m_oid; used "foo" for oids
rgw: avoid string_view to temporary in RGWBulkUploadOp
the `else` block below constructs a temporary std::string that is destroyed
at the end of the statement, leaving `filename` as a dangling view:
```
filename = file_prefix + std::string(header->get_filename());
```
store a copy of the `std::string` instead
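A standalone illustration of the pattern (generic names, not the RGWBulkUploadOp code itself):
```
#include <iostream>
#include <string>
#include <string_view>

int main() {
  std::string file_prefix = "bulk/";
  std::string_view header_filename = "object.dat";  // stand-in for header->get_filename()

  // Buggy pattern: the concatenation yields a temporary std::string that is
  // destroyed at the end of the statement, so the view would dangle:
  //   std::string_view filename = file_prefix + std::string(header_filename);

  // Fix: keep the result in a std::string that owns its characters.
  std::string filename = file_prefix + std::string(header_filename);
  std::cout << filename << '\n';
  return 0;
}
```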
Nitzan Mordechai [Wed, 10 May 2023 09:42:07 +0000 (09:42 +0000)]
mon/MonClient: before complete auth with error, reopen session
When MonClient tries to authenticate and fails with -EAGAIN, there is
a possibility that we are no longer hunting and have no active_con.
That results in the MonClient being disconnected while ticks continue
without an open session.
The solution is to check at the end of auth whether we got an -EAGAIN
error, and if we did, reopen the session so the next tick tries auth again.
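A rough sketch of that control flow, with hypothetical names rather than the real MonClient implementation:
```
#include <cerrno>

// Hypothetical sketch, not the real MonClient code.  If authentication
// completes with -EAGAIN after hunting has stopped, there may be no
// active connection left, so the session is reopened and the next tick
// retries authentication instead of ticking forever without a session.
struct MonClientSketch {
  bool have_active_con = false;

  void reopen_session() { have_active_con = false; /* pick a mon, reconnect */ }
  void start_auth()     { /* kick off authentication on the new session */ }

  void handle_auth_done(int err) {
    if (err == -EAGAIN) {
      reopen_session();  // don't complete with an error and tick session-less
      return;
    }
    // complete(err): report success or a hard failure to the caller
  }

  void tick() {
    if (!have_active_con)
      start_auth();
    // ... renew tickets, schedule the next tick, etc.
  }
};
```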
Jos Collin [Mon, 22 May 2023 04:31:39 +0000 (10:01 +0530)]
mds: display sane hex value (0x0) for empty feature bit
Print a valid hex value (0x0) for an empty feature bit, so that the clients
can recognize it. When the _vec size becomes 0, the print() function creates
an invalid hex value (0x) and 'perf stats' crashes with the error below:
"
File "/opt/ceph/src/pybind/mgr/stats/fs/perf_stats.py", line 177, in notify_cmd
metric_features = int(metadata[CLIENT_METADATA_KEY]["metric_spec"]["metric_flags"]["feature_bits"], 16)
ValueError: invalid literal for int() with base 16: '0x'
"
This patch creates a valid hex value (0x0) when the _vec size is 0.
Fixes: https://tracker.ceph.com/issues/59551 Signed-off-by: Jos Collin <jcollin@redhat.com>
(cherry picked from commit 2ee9b3af82c788ecd68d09d5bd97d80f07dae0ca)
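A simplified sketch of the formatting fix (not the real feature_bitset_t::print(), which also pads each feature block):
```
#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Simplified sketch, not the real feature_bitset_t code.  With no feature
// blocks the old code emitted just "0x", which int(..., 16) on the mgr
// side cannot parse; emit "0x0" instead.
std::string print_feature_bits(const std::vector<uint64_t>& _vec) {
  std::ostringstream oss;
  oss << "0x";
  if (_vec.empty()) {
    oss << "0";                   // valid hex even when no bits are present
    return oss.str();
  }
  for (auto it = _vec.rbegin(); it != _vec.rend(); ++it)
    oss << std::hex << *it;
  return oss.str();
}

int main() {
  std::cout << print_feature_bits({}) << '\n';       // prints 0x0
  std::cout << print_feature_bits({0x7ff}) << '\n';  // prints 0x7ff
  return 0;
}
```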