Merge pull request #59420 from rishabh-d-dave/max-mds-confirm

author Rishabh Dave <ridave@redhat.com>

Fri, 18 Oct 2024 14:34:18 +0000 (20:04 +0530)

committer GitHub <noreply@github.com>

Fri, 18 Oct 2024 14:34:18 +0000 (20:04 +0530)
diff --cc PendingReleaseNotes

index 1a4e26e747fb27ec73a9f75db7cdac31ffa56ed1,c35924c6e869039eec46431b3ddaa9f18096c025..8af2a262dff91242691ffeadd2504965d5dbaf07
--- 1/PendingReleaseNotes
--- 2/PendingReleaseNotes
+++ b/PendingReleaseNotes
@@@ -12,20 -12,15 +12,28 @@@
     of the column showing the state of a group snapshot in the unformatted CLI
     output is changed from 'STATUS' to 'STATE'. The state of a group snapshot
     that was shown as 'ok' is now shown as 'complete', which is more descriptive.
+ +* Based on tests performed at scale on a HDD based Ceph cluster, it was found
+ +  that scheduling with mClock was not optimal with multiple OSD shards. For
+ +  example, in the test cluster with multiple OSD node failures, the client
+ +  throughput was found to be inconsistent across test runs coupled with multiple
+ +  reported slow requests. However, the same test with a single OSD shard and
+ +  with multiple worker threads yielded significantly better results in terms of
+ +  consistency of client and recovery throughput across multiple test runs.
+ +  Therefore, as an interim measure until the issue with multiple OSD shards
+ +  (or multiple mClock queues per OSD) is investigated and fixed, the following
+ +  change to the default HDD OSD shard configuration is made:
+ +   - osd_op_num_shards_hdd = 1 (was 5)
+ +   - osd_op_num_threads_per_shard_hdd = 5 (was 1)
+ +  For more details see https://tracker.ceph.com/issues/66289.
   
+ * CephFS: Modifying the FS setting variable "max_mds" when a cluster is
+   unhealthy now requires users to pass the confirmation flag
+   (--yes-i-really-mean-it). This has been added as a precaution to tell the
+   users that modifying "max_mds" may not help with troubleshooting or recovery
+   effort. Instead, it might further destabilize the cluster.
+ 
+ 
+ 
   >=19.0.0
   
   * cephx: key rotation is now possible using `ceph auth rotate`. Previously,
diff --cc qa/tasks/cephfs/test_admin.py
Simple merge
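
The CephFS release note above says that changing "max_mds" on an unhealthy cluster now requires the confirmation flag. A minimal sketch of how a wrapper script might build that command follows. The helper name `build_set_max_mds_cmd`, its `confirm` parameter, and the file system name "cephfs" are hypothetical and not part of this commit; only `ceph fs set <fs_name> max_mds <n>` and `--yes-i-really-mean-it` come from the note. The sketch only assembles the argument list and does not run anything against a cluster.

```python
import shlex

# Confirmation flag named in the release note.
CONFIRM_FLAG = "--yes-i-really-mean-it"


def build_set_max_mds_cmd(fs_name, max_mds, confirm=False):
    """Build the argv for `ceph fs set <fs_name> max_mds <n>`.

    Hypothetical helper: `confirm` must be set explicitly by the operator,
    so the flag is never added silently. How the cluster decides that it
    is unhealthy is not modeled here.
    """
    if not fs_name:
        raise ValueError("fs_name must be a non-empty string")
    if max_mds < 1:
        raise ValueError("max_mds must be at least 1")

    cmd = ["ceph", "fs", "set", fs_name, "max_mds", str(max_mds)]
    if confirm:
        cmd.append(CONFIRM_FLAG)
    return cmd


if __name__ == "__main__":
    # Print the command line instead of executing it.
    print(shlex.join(build_set_max_mds_cmd("cephfs", 2, confirm=True)))
```

Keeping `confirm` as an explicit opt-in mirrors the intent of the note: the flag is a precaution, so a script should pass it only after the operator has decided that changing "max_mds" is really wanted.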