Zac Dover [Wed, 7 Aug 2024 13:11:11 +0000 (23:11 +1000)]
doc/cephfs: add cache pressure information
Add information to doc/cephfs/cache-configuration.rst about how to deal
with a message that reads "clients failing to respond to cache
pressure". This procedure explains how to slow the growth of the
recall_caps value so that it does not exceed the
mds_recall_warning_threshold.
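For illustration, the two knobs the procedure adjusts (the values below are examples only; raise them gradually and monitor the cluster after each step):
```
# Raise the recall ceiling in small increments, watching for the warning:
ceph config set mds mds_recall_max_caps 40000
# Only if the warning persists, lower the decay half-life of the recall
# counter in small decrements (default is 2.5):
ceph config set mds mds_recall_max_decay_rate 2.0
```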
The information in this commit was developed by Eugen Block. See
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/5ROH5CWKKOEIQMVXOVRT5OO7CWK2HPM3/#J65DFUPP4BY57MICPANXKI7KAXSZ5Z5P
and https://www.spinics.net/lists/ceph-users/msg73188.html.
Fixes: https://tracker.ceph.com/issues/57115
Co-authored-by: Eugen Block <eblock@nde.ag>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit bf26274ae4737417193f8c2b56bea20eb2a358aa)
osd/scrub: exempt only operator scrubs from max_scrubs limit
Existing code exempts all 'high priority' scrubs from the limit, including,
for example, 'after_repair' and 'mandatory on invalid history' scrubs.
PGs that do not have valid last-scrub data (which is what we have when
a pool is first created) are set to shallow-scrub immediately.
Unfortunately, this type of scrub is (in the low granularity implemented
in the existing code) 'high priority'.
This means that a newly created pool will have all its PGs start
scrubbing, regardless of concurrency (or any other) limits.
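Illustrative pseudocode only (the real scheduler is C++ in osd_scrub_sched.cc; the names here are invented) showing the intended gating after this change:
```python
from dataclasses import dataclass

@dataclass
class ScrubJob:
    requested_by_operator: bool

def can_start_scrub(job: ScrubJob, running_scrubs: int, max_scrubs: int) -> bool:
    # Before this change, every 'high priority' scrub (after_repair,
    # mandatory-on-invalid-history, operator-requested, ...) bypassed
    # the max_scrubs concurrency cap; now only operator scrubs do.
    if job.requested_by_operator:
        return True
    return running_scrubs < max_scrubs
```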
Fixes: https://tracker.ceph.com/issues/67253
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit babd65e412266f5c734f7a2b57d87657d3470c47)
conflict resolution:
- eliminated irrelevant 'main' code that was picked into this branch.
- moved the code that sets the scrub_job's flag to osd_scrub_sched.cc,
where the corresponding function lives.
Some tests currently fail when `lvm2` isn't installed:
```
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestList::test_empty_device_json_zero_exit_status - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestList::test_empty_device_zero_exit_status - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_no_ceph_lvs - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_ceph_data_lv_reported - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_ceph_journal_lv_reported - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_ceph_wal_lv_reported - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_physical_2nd_device_gets_reported[journal] - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_physical_2nd_device_gets_reported[db] - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_physical_2nd_device_gets_reported[wal] - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_not_a_ceph_lv - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_a_ceph_lv - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_a_ceph_journal_device - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_osd_id_for_just_block_dev - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_osd_id_for_just_data_dev - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_osd_id_for_just_block_wal_and_db_dev - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_osd_id_for_data_and_journal_dev - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_nonexistent_osd_id - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_a_ceph_lv_with_no_matching_devices - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_migrate.py::TestNew::test_newdb_not_target_lvm - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_nothing_is_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_multiple_journals_are_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_multiple_dbs_are_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_multiple_wals_are_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_multiple_backing_devs_are_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/objectstore/test_lvmbluestore.py::TestLvmBlueStore::test_activate_all_osd_is_active - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
```
All of these calls should actually be mocked. This commit addresses that.
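A minimal sketch of the approach, assuming ceph-volume's LVM helpers get_pvs()/get_lvs() are the seams that shell out to `pvs`/`lvs` (the patch targets and return values here are illustrative; the merged tests may stub different layers):
```python
from unittest.mock import patch

import ceph_volume.api.lvm as lvm_api
from ceph_volume.devices.lvm.listing import List

def test_full_report_without_lvm2():
    # Stub the helpers that would otherwise fork `pvs`/`lvs`, so the
    # test passes on hosts where lvm2 is not installed.
    with patch.object(lvm_api, 'get_pvs', return_value=[]), \
         patch.object(lvm_api, 'get_lvs', return_value=[]):
        assert List([]).full_report() == {}
```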
ceph-volume: do not convert an LV's symlink to its real path
This commit:
- Adds a new function `get_lvm_mappers` in `ceph_volume/util/disk.py`
to retrieve a list of LVM device mappers (sketched below).
- Updates the `is_lv` property in `ceph_volume/util/device.py`
to use the new `get_lvm_mappers` function for better accuracy.
- Modifies the symlink handling in the `Device` class to properly
identify LVM logical volumes.
- Adds a new test `test_reject_lv_symlink_to_device` to ensure
LVM symlinks are correctly identified and handled.
- Updates relevant tests to cover the changes in LVM device detection.
These changes improve the reliability and accuracy of LVM device detection
and handling, ensuring that symlinks to LVM logical volumes are
correctly processed.
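A hedged sketch of how such a helper can identify LVM device mappers via sysfs (the merged implementation in `ceph_volume/util/disk.py` may differ in detail; the LVM- prefix in /sys/block/dm-*/dm/uuid is standard device-mapper behavior):
```python
import os
from typing import List

def get_lvm_mappers(sys_block_path: str = '/sys/block') -> List[str]:
    # Device-mapper targets expose their UUID in /sys/block/dm-*/dm/uuid;
    # targets owned by LVM carry an 'LVM-' prefix there.
    mappers: List[str] = []
    for dev in os.listdir(sys_block_path):
        uuid_path = os.path.join(sys_block_path, dev, 'dm', 'uuid')
        if not os.path.exists(uuid_path):
            continue
        with open(uuid_path) as f:
            if f.read().strip().startswith('LVM-'):
                mappers.append(f'/dev/{dev}')
    return mappers
```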
Fixes: https://tracker.ceph.com/issues/61597
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
Co-Authored-by: Jerry Pu <jerrypu@qnap.com>
(cherry picked from commit 729c5de4f852f1a1ee90e76b71157e9070af7d99)
Igor Fedotov [Fri, 31 May 2024 14:05:29 +0000 (17:05 +0300)]
ceph-volume: zap source devices if they are detached
One needs to zap the source device(s) after a DB/WAL migration.
The original implementation removed LVM tags only, which left the device(s)
in a state where "ceph-volume raw activate" still recognized them as
attached to the OSD, due to the information preserved in the bdev label.
Hence the need to do more thorough zapping.
Fixes: https://tracker.ceph.com/issues/66315
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
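A hedged example of the workflow this fixes (the OSD id, fsid and LV names are placeholders):
```
# Migrate the DB off its current device; with this change the detached
# source is zapped, not merely stripped of its LVM tags:
ceph-volume lvm migrate --osd-id 1 --osd-fsid <osd-fsid> --from db --target vg_new/lv_db
# Before the fix, the stale bdev label meant the old device still
# appeared as attached to the OSD in:
ceph-volume raw list
```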
(cherry picked from commit ae5ef432845dcf9b061258357ffd97f4eae59a63)
test/crimson/seastore/test_seastore.cc: should not return a value
clang++-14:
```
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/crimson/seastore/test_seastore.cc:86:5: error: void function 'do_transaction' should not return a value [-Wreturn-type]
return sharded_seastore->do_transaction(
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/crimson/seastore/test_seastore.cc:94:5: error: void function 'set_meta' should not return a value [-Wreturn-type]
return seastore->write_meta(key, value).get();
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.
```
Both functions are declared void, so the fix is simply to drop the stray `return` keywords.
Fixes: https://tracker.ceph.com/issues/66286
In RadosStore, the source and dest objects in the copy_object() call
used to share an obj_ctx. When obj_ctx was removed from the SAL API,
they each got their own, but RGWRados::copy_obj() still assumed they
shared one.
Pass in each one separately, and use the correct one for further calls.
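Illustrative pseudocode only (the real code is C++ in RGWRados; these names are invented) of the calling convention after the change:
```python
def read_obj(obj_ctx: dict, name: str) -> bytes:
    return obj_ctx[name]

def write_obj(obj_ctx: dict, name: str, data: bytes) -> None:
    obj_ctx[name] = data

def copy_obj(src_obj_ctx: dict, dest_obj_ctx: dict, src: str, dest: str) -> None:
    # Source and destination each carry their own context now: reads go
    # through the source's ctx and writes through the destination's,
    # instead of both assuming one shared obj_ctx.
    write_obj(dest_obj_ctx, dest, read_obj(src_obj_ctx, src))
```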
Signed-off-by: Daniel Gryniewicz <dang@fprintf.net>
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
(cherry picked from commit 111c54a19dc12b84cda785feddb0a0ba483b1f77)
Fixes: https://tracker.ceph.com/issues/66286
Improve the display of ref_count in the rados command-line utility
New test cases were added to detect behavior after server-side copy in the following cases:
1) delete original only
2) delete destination only
3) delete original, then delete destination (this will lead to orphaned tail objects without the changes made in this PR)
4) delete destination, then delete original (this will lead to orphaned tail objects without the changes made in this PR)
A call to GC was added between tests to help control the used disk space, since we keep writing huge files of 5 GB each.
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
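For reference, a hedged sketch of driving RGW garbage collection by hand between runs (these are standard radosgw-admin subcommands; pool and timing details are omitted):
```
# List objects queued for garbage collection, including entries whose
# grace period has not yet expired:
radosgw-admin gc list --include-all
# Trigger a collection pass immediately:
radosgw-admin gc process --include-all
```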
(cherry picked from commit d496d20c803590d41d711e446feab41476c0f20c)
RemoteApplier::load_acct_info() and create_account() decide whether to
add the implicit tenant. Store the resulting rgw_user for use in
get_aclowner() and get_tenant().
Nitzan Mordechai [Tue, 25 Jun 2024 09:06:45 +0000 (09:06 +0000)]
crimson/osd: adding osdmap subscribe
When the committed osdmap processing is complete, the OSD checks whether it
should restart. If it shouldn't restart but is still active, it subscribes
for the next osdmap, which it needs to continue the process.
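Illustrative pseudocode only (the real code is C++ in crimson; these names are invented) of the decision this adds:
```python
class OSDStub:
    # Toy stand-in for the crimson OSD, for illustration.
    def __init__(self, active: bool, needs_restart: bool) -> None:
        self.active = active
        self.needs_restart = needs_restart
        self.wanted_epoch = 0

    def on_committed_osdmap(self, epoch: int) -> None:
        # Restart if the new map requires it; otherwise, while still
        # active, subscribe to the next epoch so progress continues.
        if self.needs_restart:
            self.restart()
        elif self.active:
            self.wanted_epoch = epoch + 1  # i.e. subscribe to the next osdmap

    def restart(self) -> None:
        print("restarting OSD")
```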