Dhairya Parmar [Mon, 22 May 2023 10:37:34 +0000 (16:07 +0530)]
mds: remove code to bypass dumping empty header scrub info
Previously, when ~mdsdir was scrubbed at the CephFS root, its header
was kept empty, so it was necessary not to dump its values
for 'scrub status'. Now that both scrubs (~mdsdir and root)
run under the same header, this code is no longer needed.
Dhairya Parmar [Mon, 22 May 2023 10:36:24 +0000 (16:06 +0530)]
mds: dump_values no more needed
Previously, two individual scrubs were initiated to scrub ~mdsdir
at root, and the ~mdsdir scrub wasn't given a tag, so it
was necessary not to dump its values in the output of 'scrub start'.
Now that the ~mdsdir and root scrubs run under a single header, this
is no longer needed, so remove the redundant code.
Dhairya Parmar [Mon, 22 May 2023 07:04:51 +0000 (12:34 +0530)]
mds: enqueue ~mdsdir at the time of enqueuing root
This avoids the need to run individual scrubs for
~mdsdir and root, i.e. both scrubs run under the
same header. It also avoids an edge case where,
if ~mdsdir is huge and takes a long time to scrub,
the scrub status reports something like this until
root inodes kick in:
{
"status": "scrub active (757 inodes in the stack)",
"scrubs": {}
}
Venky Shankar [Tue, 16 May 2023 05:25:34 +0000 (10:55 +0530)]
doc: explain cephfs mirroring `peer_add` step in detail
@zdover23 reached out regarding the missing explanation for the `peer_add`
step in the cephfs mirroring documentation. Add some explanation and
an example to make the step clear.
Ramana Raja [Wed, 10 May 2023 18:37:44 +0000 (14:37 -0400)]
rbd_support: recover from "double blocklisting"
Recover from being blocklisted while recovering from blocklisting.
When the rbd_support module is being set up to recover from client
blocklisting, the module's new rados client connection can also get
blocklisted. Currently, this will cause the recovery to fail and
the module will remain inoperable. Instead, retry module recovery
when the new client gets blocklisted during the module setup in the
recovery thread.
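The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the rbd_support module's code: `Blocklisted`, `recover_module`, `make_flaky_setup` and the `max_attempts` bound are all hypothetical names introduced here, and the commit message does not say whether the real retries are bounded.

```python
class Blocklisted(Exception):
    """Hypothetical stand-in for the error seen when a client is blocklisted."""


def recover_module(setup, max_attempts=5):
    """Run setup() until it succeeds, retrying when the new client is blocklisted.

    Returns the number of attempts used. max_attempts is an assumption of
    this sketch, not something stated in the commit message.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            setup()  # open a fresh client connection and rebuild module state
            return attempt
        except Blocklisted:
            continue  # the new client was blocklisted as well: try again
    raise RuntimeError("module recovery did not succeed")


def make_flaky_setup(failures):
    """Build a setup() that is blocklisted `failures` times, then succeeds."""
    state = {"left": failures}

    def setup():
        if state["left"] > 0:
            state["left"] -= 1
            raise Blocklisted()

    return setup
```

Calling `recover_module(make_flaky_setup(2))` simulates a recovery that is blocklisted twice before the third setup attempt succeeds.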
Fixes: https://tracker.ceph.com/issues/59713
Signed-off-by: Ramana Raja <rraja@redhat.com>
Venky Shankar [Mon, 15 May 2023 16:50:50 +0000 (22:20 +0530)]
Merge PR #47752 into main
* refs/pull/47752/head:
test/libcephfs: add test case for revoking caps
mds: remove the cap directly when releasing the cap
mds: add the revoking caps back to _revokes list
mds: move confirm_receipt() to Capability.cc
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yongseok Oh [Mon, 20 Mar 2023 10:24:25 +0000 (19:24 +0900)]
mds: remove inappropriate initialization of num_imported
The variable num_imported is being passed by reference. Additionally,
the decode_import_dir() function is invoked from handle_export_dir(),
where num_imported is initialized and passed by reference.
Therefore, there is no need to initialize it again within
the decode_import_dir() function.
Zac Dover [Fri, 12 May 2023 10:35:25 +0000 (20:35 +1000)]
doc/cephfs: rectify prompts in fs-volumes.rst
Make sure all prompts are unselectable. This PR is meant to be
backported to Reef, Quincy, and Pacific, to get all of the prompts into
a fit state so that a line-edit can be performed on the English language
in this file.
Venky Shankar [Thu, 11 May 2023 05:51:14 +0000 (11:21 +0530)]
Merge PR #51251 into main
* refs/pull/51251/head:
PendingReleaseNotes: add a note about deleting files from lost+found directory
qa: add checks that validate removal of entries from lost+found dir
mds: allow unlink operation under lost+found directory
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Shilpa Jagannath [Tue, 14 Mar 2023 02:14:24 +0000 (22:14 -0400)]
rgw/multisite: - add a lost_bid variable to handle is_highest_bidder() control flow
- check for is_highest_bidder() before even attempting to take the lock
- don't block on RGWMetaSyncShardNotifyCR()
- other minor fixes
Or Friedmann [Wed, 16 Feb 2022 17:00:33 +0000 (17:00 +0000)]
rgw: multisite metadata sync fairness
The approach of this commit is to allow multiple RGWs to participate in multisite metadata sync.
Before this commit, a single RGW took all of the sync locks.
This feature uses a bidding algorithm.
For each lock, the RGW picks a random number between 0 and the shard count, i.e. it picks one random number per shard and uses it as the bid_amount.
The vector of bids that each RGW holds is sent using watch notify (based on RADOS watch notify).
Each time an RGW tries to take a lock, it compares its own bid for that lock with the bids of the other RGWs; if the current RGW has the highest bid, it tries to acquire the lock.
Important configs:
rgw_sync_work_period - How long the RGW syncs before it sends an unlock (very important in the beginning, because at that point only a single RGW holds the locks)
rgw_sync_lease_period - Not new in this commit, but it affects it. The number of seconds for which the RGW asks RADOS to keep the lock; mainly important in case of failure, so that an RGW that goes down automatically loses its lock
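The bidding scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the RGW implementation (which is C++): the helper names `make_bids` and `shards_to_try`, the peer names, the inclusive bid range, and the tie handling are all assumptions made for this sketch; only `is_highest_bidder` and the idea of one bid per shard come from the commit messages.

```python
import random


def make_bids(num_shards, rng):
    """Pick one random bid in [0, num_shards] for each shard lock (assumed range)."""
    return [rng.randint(0, num_shards) for _ in range(num_shards)]


def is_highest_bidder(shard, my_bids, peer_bids):
    """Return True if no peer bid for this shard exceeds ours.

    peer_bids maps a peer name to that peer's bid vector, as learned from
    notifications. Ties count as "highest" here (an assumption); the lock
    itself still decides which gateway actually wins.
    """
    mine = my_bids[shard]
    return all(bids[shard] <= mine for bids in peer_bids.values())


def shards_to_try(my_bids, peer_bids):
    """List the shards whose lock this gateway should attempt to take."""
    return [s for s in range(len(my_bids))
            if is_highest_bidder(s, my_bids, peer_bids)]


# Hypothetical bid vectors for three gateways and four shards.
my_bids = [3, 0, 2, 1]
peer_bids = {"rgw-b": [1, 4, 2, 0], "rgw-c": [0, 2, 1, 3]}
print(shards_to_try(my_bids, peer_bids))  # prints [0, 2]
```

Because every gateway draws its own random vector, the shards it wins differ from gateway to gateway, which is how the locks end up spread out instead of being held by a single RGW.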
Fixes: https://tracker.ceph.com/issues/41230
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Signed-off-by: Or Friedmann <ofriedma@ibm.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Venky Shankar [Wed, 10 May 2023 08:34:58 +0000 (14:04 +0530)]
Merge PR #43184 into main
* refs/pull/43184/head:
qa: fix journal flush failure issue due to the MDS daemon crashes
qa: add test support for the alloc ino failing
mds: do not take the ino which has been used
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
galsalomon66 [Fri, 10 Mar 2023 12:27:05 +0000 (14:27 +0200)]
adding s3test albin/json-op-serial
modify the JSON chunk-processing function to handle offset/length the same way CSV processing does
fix a valgrind error: Conditional jump or move depends on uninitialised value
when Trino is used, the Trino server issues multiple requests per single query; upon completion of all requests
the results are merged (by Trino). These requests split the input into equal parts; the RGW side should be aligned with Trino's expectations (for the result).
fix the main routine for shaping the chunk (range-scan) for Trino processing
upon removing the payload-TAG, the response element index needs to change
handle more use cases for "shaping" the processed chunk by s3select per Trino request
re-shape the processed chunk only when Trino sent the request
bug-fix: the chunk offset was not handled correctly
bug-fix: progress-message calculation
modify the range-request boundaries only upon a Trino request.
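To illustrate why the chunk has to be "shaped" when the input is split into equal byte ranges, here is a small sketch of one common range-scan convention: each request returns only the rows that start inside its byte range, and reads past the end of the range to finish its last row. This convention, the newline-delimited input, and the function name `rows_for_split` are assumptions of the sketch; the commit does not state which rule RGW or s3select actually applies.

```python
def rows_for_split(data, offset, length):
    """Return the complete rows that start inside [offset, offset + length)."""
    end = min(offset + length, len(data))
    start = offset
    if offset > 0:
        # Skip the tail of a row that began in an earlier split. Searching
        # from offset - 1 keeps a row that starts exactly at offset.
        nl = data.find(b"\n", offset - 1)
        if nl == -1:
            return []
        start = nl + 1
    rows = []
    while start < end:
        nl = data.find(b"\n", start)
        stop = len(data) if nl == -1 else nl
        rows.append(data[start:stop])  # may extend past `end` to finish the row
        start = stop + 1
    return rows


# Three equal 5-byte splits over a 15-byte object.
data = b"a,1\nbb,2\nccc,3\n"
for off in range(0, len(data), 5):
    print(off, rows_for_split(data, off, 5))
```

Under this rule every row is returned by exactly one split, so merging the per-request results (as Trino does) yields each row once, with no row cut in half at a range boundary.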
Xuehan Xu [Wed, 17 Aug 2022 10:07:42 +0000 (18:07 +0800)]
crimson/os/seastore/backref_manager: retrieve live backref extents through the backref tree
After involving intra-fixed-kv-btree parent-child pointers, we need to keep the
invariant that it's only when extents are not in transactions' read_set that
we can directly query cache with inspecting the transaction