Mark Kogan [Thu, 23 May 2024 12:17:22 +0000 (12:17 +0000)]
rgw: delay the RGW process exit until all actice requests have completed
Wait for up to `rgw_exit_timeout_secs` for all outstanding requests
to complete before exiting unconditionally.
(new HTTP requests will not be accepted during this time.)
Fixes: https://tracker.ceph.com/issues/66205 Signed-off-by: Mark Kogan <mkogan@ibm.com>
Patrick Donnelly [Sun, 23 Jun 2024 18:29:53 +0000 (14:29 -0400)]
Merge PR #57754 into main
* refs/pull/57754/head:
mds: set the proper extra bl for the create request
mds: encode the correct extra info depending on the feature bits
mds: add set_reply_extra_bl() helper support
mds: cleanup the code to make it to be more readable
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Laura Flores [Wed, 19 Jun 2024 21:57:45 +0000 (16:57 -0500)]
qa/suites/upgrade/telemetry-upgrade/reef-x: update how cephadm is pulled and change image reference
Update how cephadm is pulled:
`cephadm_git_url` and `cephadm_branch` are used in releases older than reef
to install cephadm. Both of these keys are needed to install it from the github
repo.
However, in reef and on, the compiled zipapp cephadm needs to be pulled differently
than the old single python script `cephadm` from earlier releases.
Laura Flores [Wed, 19 Jun 2024 21:07:31 +0000 (16:07 -0500)]
qa/suites/upgrade/telemetry-upgrade: add more ignorelist items and require_osd_release=squid
The warnings added to the ignorelist show up in the cluster log, but they are
expected during upgrades and should thus be ignored.
We also need to set require_osd_release=squid to avoid this warning:
```
cluster [WRN] Health check failed: all OSDs are running squid or later but require_osd_release < squid (OSD_UPGRADE_FINISHED)
```
Patrick Donnelly [Tue, 18 Jun 2024 17:31:14 +0000 (13:31 -0400)]
mon/AuthMonitor: add `ceph auth rotate` command
Add command to rotate the permanent key of an entity. This avoids the need to
delete / recreate the key when it is compromised, lost, or just scheduled for
rotation.
Fixes: https://tracker.ceph.com/issues/66509 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Note from Dan van der Ster:
This is a test that should succeed,
it definitely used to succeed back in the L/O days of Ceph. At some
point peering code changed and this behaviour regressed. In short,
an OSD goes down then comes up, and no objects were modified in the mean time.
There should be no degraded PGs in this case.
As this commit is currently breaking make check on all PRs, I think it should
be re-evaluated and merged so whatever fix is needed along with this test to
make it work are merged together.
Fixes: https://tracker.ceph.com/issues/66556 Signed-off-by: Laura Flores <lflores@ibm.com>
Patrick Donnelly [Thu, 20 Jun 2024 15:43:17 +0000 (11:43 -0400)]
script/backport-create-issue: update tag custom field
The redmineup_tags was added and, apparently, the old "Tags" custom field with
id == 3 was deleted and then recreated with id == 31. This broke the script.
Update and refactor the id for that "Tags" custom field.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Xiubo Li [Wed, 19 Jun 2024 03:27:33 +0000 (11:27 +0800)]
common/TrackedOp: do not count the ops marked as nowarn
If an op is marked as nowarn then it won't be counted as the slow
requests, but currently it will count the initiated time when
iterating the inflight ops.
For example:
[WRN] : 1 slow requests, 1 included below; oldest blocked for > 38.764892 secs
[WRN] : slow request 33.875059 seconds old, received at 2024-06-17T14:14:34.228261+0000: client_request(client.78109915:11369251 mkdir #0x1008ecedea2/chk-89588 2024-06-17T14:14:34.097825+0000 caller_uid=1002960000, caller_gid=0{0,1002960000,}) currently failed to wrlock, waiting
The oldest blocked request is 38.764892 old, but the oldest slow
request reported is 33.875059 old.
Fixes: commit e4160d7e783 ("mds: don't report slow request for blocked filelock request") Fixes: https://tracker.ceph.com/issues/66557 Signed-off-by: Xiubo Li <xiubli@redhat.com>
Patrick Donnelly [Wed, 19 Jun 2024 19:19:56 +0000 (15:19 -0400)]
Merge PR #55792 into main
* refs/pull/55792/head:
tools/cephfs: recover alternate_name of dentries from journal
qa: add test to verify recovery of alternate_name from journal
tools/cephfs/JournalTool: add some more debugging
tools/cephfs/JournalTool: remove extraneous 0x in debug output
mds: dump alternate_name to formatter
mds: add warning about encoding new fields
Reviewed-by: Christopher Hoffman <choffman@redhat.com> Reviewed-by: Xiubo Li <xiubli@redhat.com>
Afreen Misbah [Fri, 14 Jun 2024 05:21:47 +0000 (10:51 +0530)]
mgr/dashboard: fix service page e2e tests
- service page now uses defaults value for the placement count due to which mds test failing
- in test we pass "1" while "2" which is the default count for mds is already populated, making it 21 and causing unable to create mds service
Zack Cerza [Fri, 14 Jun 2024 19:37:16 +0000 (13:37 -0600)]
qa/tasks/qemu: Fix OS version comparison
See: https://sentry.ceph.com/share/issue/21ed88d705854238bdafbf6711e795ee/
They're strings, not floats.
This surfaced as a result of https://github.com/ceph/teuthology/pull/1953
Nizamudeen A [Sun, 16 Jun 2024 11:15:43 +0000 (16:45 +0530)]
mgr/dashboard: disable telemetry notification in e2e
After the new UI shell, the telemetry notifications are displayed on the
bottom side, which kind of exists on top of some buttons and that causes
some cypress failures while clicking Submit button in the form.
I am disabling the notification itself in e2e because its not checked in
the e2e at all so we can afford to disable it. Incase we decide to add
an e2e for the notification, we can just toggle it on later on and we
can check it
Fixes: https://tracker.ceph.com/issues/66506 Signed-off-by: Nizamudeen A <nia@redhat.com>
Xuehan Xu [Fri, 24 May 2024 09:30:41 +0000 (17:30 +0800)]
crimson/osd/osd_operations/client_request_common: `PeeringState::needs_recovery()`
may fail if the object is under backfill
Meanwhile, set the correct version for backfill:
From Classic:
```
if (is_degraded_or_backfilling_object(head)) {
if (can_backoff && g_conf()->osd_backoff_on_degraded) {
add_backoff(session, head, head);
maybe_kick_recovery(head);
}
```
John Mulligan [Thu, 2 May 2024 20:41:15 +0000 (16:41 -0400)]
mgr/smb: add validation funcs for custom parameter dictionaries
Custom parameter dictionaries will be used to pass options to samba
config without much filtering and control by the smb mgr module. Because
the risks that it entails the user must "agree" that using these options
can break their setup with a "magic" key-value pair.
This pair will be filtered out of the eventual data passed to samba.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 2 May 2024 20:40:53 +0000 (16:40 -0400)]
mgr/smb: convert failures to create a valid resource into error results
Convert failures to create a valid resource into error results that can
be reported back to the caller like the other error result types
generated by actually attempting to apply the resources.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 2 May 2024 20:33:01 +0000 (16:33 -0400)]
mgr/smb: add result type for reporting resource validation errors
This error type can not take a real resource object because the resource
object could not be constructed from the data. Use the raw data for
reporting the error result.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 2 May 2024 20:32:35 +0000 (16:32 -0400)]
mgr/smb: add resource construction error handling exception
Use error hook function to wrap plain ValueError instances to an
InvalidResourcError. This error retains the simplified data being
(re)constructed into an object and thus will be used later to generate
proper error results.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 2 May 2024 20:28:46 +0000 (16:28 -0400)]
mgr/smb: add error handling & conversion hook to resourcelib
Add a method to supply the Resource instances with a callback that
can handle errors that occur during object construction from simplified
data. If the callback is not set, the exceptions are handled as usual.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
some tests are currently failing when `lvm2` isn't installed:
```
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestList::test_empty_device_json_zero_exit_status - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestList::test_empty_device_zero_exit_status - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_no_ceph_lvs - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_ceph_data_lv_reported - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_ceph_journal_lv_reported - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_ceph_wal_lv_reported - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_physical_2nd_device_gets_reported[journal] - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_physical_2nd_device_gets_reported[db] - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestFullReport::test_physical_2nd_device_gets_reported[wal] - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_not_a_ceph_lv - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_a_ceph_lv - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_a_ceph_journal_device - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_osd_id_for_just_block_dev - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_osd_id_for_just_data_dev - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_osd_id_for_just_block_wal_and_db_dev - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_osd_id_for_data_and_journal_dev - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_by_nonexistent_osd_id - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_listing.py::TestSingleReport::test_report_a_ceph_lv_with_no_matching_devices - FileNotFoundError: [Errno 2] No such file or directory: 'pvs'
FAILED ceph_volume/tests/devices/lvm/test_migrate.py::TestNew::test_newdb_not_target_lvm - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_nothing_is_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_multiple_journals_are_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_multiple_dbs_are_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_multiple_wals_are_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/devices/lvm/test_zap.py::TestEnsureAssociatedLVs::test_multiple_backing_devs_are_found - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
FAILED ceph_volume/tests/objectstore/test_lvmbluestore.py::TestLvmBlueStore::test_activate_all_osd_is_active - FileNotFoundError: [Errno 2] No such file or directory: 'lvs'
```
Everything should be actually mocked. This commit addresses that.
Venky Shankar [Mon, 17 Jun 2024 09:27:52 +0000 (14:57 +0530)]
Merge PR #57991 into main
* refs/pull/57991/head:
qa: upgrade sub-suite upgraded_client from from n-1|n-2 releases
qa: upgrade sub-suite nofs from n-1 and n-2 releases
qa: use supported releases for featureful_client