osd/scrub: additional configuration params to trigger scrub reschedule
Adding the following parameters to the (small) set of configuration
options that, if changed, trigger re-computation of the next scrub
schedule:
- osd_scrub_interval_randomize_ratio,
- osd_deep_scrub_interval_cv (not cherry-picked), and
- osd_deep_scrub_interval (which was missing in the list of
parameters watched by the OSD).
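As a rough illustration of the idea only (the OSD's real logic is C++ and watches more options than shown here), the "watched option" check can be sketched in Python. The names `SCRUB_SCHEDULE_KEYS` and `needs_reschedule` are hypothetical; the option names are the ones listed above, with osd_deep_scrub_interval_cv left out because it does not exist in this version.

```python
# Hypothetical sketch, not Ceph code: a partial set of option names whose
# change should trigger re-computation of the next scrub schedule.
SCRUB_SCHEDULE_KEYS = frozenset({
    "osd_scrub_interval_randomize_ratio",
    "osd_deep_scrub_interval",
})


def needs_reschedule(changed_keys):
    """Return True if any changed option affects the next scrub schedule."""
    return not SCRUB_SCHEDULE_KEYS.isdisjoint(changed_keys)
```

With this sketch, a change to osd_deep_scrub_interval reports that a reschedule is needed, while a change to an unrelated option does not.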
Fixes: https://tracker.ceph.com/issues/70909
Original tracker: https://tracker.ceph.com/issues/70806
(cherry picked from commit d56f613d5a69797e727938f04b66aed747cfb6b1)
Conflicts resolved by removing refs to the deep_scrub_interval_cv
parameter, which does not yet exist in this version.
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
Ville Ojamo [Thu, 10 Apr 2025 10:34:57 +0000 (17:34 +0700)]
doc/radosgw: Promptify CLI, cosmetic fixes
Use the more modern prompt block for CLI commands
and use the right prompt, $ vs #.
Fix indentation on JSON example outputs and
some CLI command switches.
Add some arguably missing commas in JSON example output.
Add a full stop at the end of a one-sentence paragraph.
Remove extra comma mid-sentence in another.
Fix missing backslashes or typo at end of multiline commands.
Lines under section headings as long as heading text.
Fix hyperlinks. Fix list items prefixed with - instead of *.
Format configuration syntax in the middle of text as code.
Fix typo "PI" to "API" and remove extra space.
Remove colons at the end of section headers in a few places.
Use Title Case in section titles consistently with short words lowercase.
Possibly controversial: don't add whitespace before and
after main title section header text.
Possibly controversial: don't indent line continuation
backslashes, leave only 1 space before them.
Igor Fedotov [Mon, 17 Feb 2025 20:14:34 +0000 (23:14 +0300)]
os/bluestore: be less strict in main bdev label validation.
This stops treating as an error the case where valid bdev label(s)
exist at location(s) beyond the size in the bdev label.
This is effectively not an error, but _check_main_bdev_label() returns an
error in this case, which is undetectable by fsck and unrecoverable by repair.
Igor Fedotov [Sat, 15 Feb 2025 23:18:03 +0000 (02:18 +0300)]
os/bluestore: don't use bdev.size() when dealing with bdev labels in fsck.
This might cause assertions after incomplete volume expansion
(expand-device cmd hasn't been called) as allocmap bitmaps are initialized with
bdev label.size not bdev.size() and hence they are accessed
out-of-bound.
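A toy model of this failure mode, in Python rather than BlueStore's C++: the map is sized from the size recorded in the label, so any offset derived from the larger device size falls outside it. The function names, the 4096-byte allocation unit and the sizes are all hypothetical; where the real code hits an assertion, this sketch raises IndexError.

```python
# Hypothetical sketch, not BlueStore code.
def make_allocmap(label_size, alloc_unit):
    """One flag per allocation unit, sized from the size in the bdev label."""
    return [False] * (label_size // alloc_unit)


def mark_range(allocmap, offset, length, alloc_unit):
    """Mark [offset, offset + length) as used.

    Raises IndexError if the range lies beyond the end of the map.
    """
    first = offset // alloc_unit
    last = (offset + length) // alloc_unit
    for unit in range(first, last):
        allocmap[unit] = True
```

Marking a range that lies within the label size succeeds; marking one computed from a larger device size fails, which is why fsck should work with the label size here.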
Conflicts:
qa/cephfs/overrides/ignorelist_health.yaml
- this file in main had more entries than on this (squid) branch,
resulting in cherry-picking conflict.
qa/tasks/cephfs/test_admin.py
- this file in main had more tests and a new set of tests adjacent to
tests added by this patch-series, resulting in cherry-picking conflict.
Add a new command ("ceph mgr module force disable <module>") that allows
forcibly disabling an always-on module. This command should ideally only
be used for cluster recovery.
Fixes: https://tracker.ceph.com/issues/66005
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 9962772358048a98a6e871dccf1bfd0a15b4d791)
Rishabh Dave [Wed, 17 Jul 2024 12:35:33 +0000 (18:05 +0530)]
mon/MgrMonitor: improve a log message
The following log message has three distinct pieces of information (enabled
modules, modules that are always on, and the total number of commands enabled)
printed on the same line, which makes it hard to find any one of them and
also makes it comparatively hard to read -
mgr/dashboard: Fix empty ceph version in GET api/hosts
Fixes: https://tracker.ceph.com/issues/70821
Because of pagination, the host list is now fetched from the orchestrator, which caused a regression: in the orchestrator's list the ceph version is always marked empty.
Caused by https://github.com/ceph/ceph/pull/52154
Also fixed tests, as adding the new version caused the whole JSON object mock to fail in tests.
lu.shasha [Mon, 2 Dec 2024 09:10:23 +0000 (17:10 +0800)]
rgw: fix stale entries in bucket indexes
If rados_osd_op_timeout is set and the primary OSD is slow, the rgw_rados_operate call for deleting the rgw head obj may return -ETIMEDOUT.
In that case rgw can't determine whether or not the delete succeeded, so we shouldn't be calling index_op.complete_del() or index_op.cancel().
Instead, we should leave that pending entry in the index so that bucket listing can recover with check_disk_state() and cls_rgw_suggest_changes().
When raced with another delete op, deleting the rgw head obj may return ENOENT; in that case, call index_op.complete_del() instead of index_op.cancel().
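The decision described above can be summarized in a small Python sketch (the real code is C++ in rgw). The function name and the returned labels are hypothetical. Two branches are assumptions rather than statements from this message: that a successful delete completes the index op, and that any other error cancels it.

```python
import errno


def index_action_after_head_delete(ret):
    """Map the head-object delete result to a bucket-index action (sketch)."""
    if ret == -errno.ETIMEDOUT:
        # Outcome unknown: leave the pending entry so that a later bucket
        # listing can reconcile it against the actual object state.
        return "leave_pending"
    if ret == 0 or ret == -errno.ENOENT:
        # ENOENT means a racing delete already removed the head object,
        # so the delete is complete. (ret == 0 is an assumed success path.)
        return "complete_del"
    # Assumption: any other error cancels the pending index operation.
    return "cancel"
```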
In case of corrupted data, an invalid iterator could be dereferenced.
Fixes: https://tracker.ceph.com/issues/66361
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
(cherry picked from commit e59495b331765f4081d5aab66c939ec10b4b8344)
Aashish Sharma [Fri, 28 Feb 2025 06:12:13 +0000 (11:42 +0530)]
monitoring: Fix OSDs panel in host-details grafana dashboard
The OSDs panel in the host-details grafana dashboard shows the total of all
OSDs across all hosts even if a particular host is selected from the
ceph_hosts filter. This PR intends to fix this issue.
Kefu Chai [Sun, 30 Mar 2025 03:59:12 +0000 (11:59 +0800)]
cephfs-top: Removes unused `global` statements
Recent flake8 runs were failing with:
```
py3: flake8==7.2.0,mccabe==0.7.0,pip==25.0.1,pycodestyle==2.13.0,pyflakes==3.3.0,setuptools==75.8.0,wheel==0.45.1
py3: commands[0] /home/jenkins-build/build/workspace/ceph-pull-requests/src/tools/cephfs/top> flake8 --ignore=W503 --max-line-length=100 cephfs-top
cephfs-top:344:9: F824 `global fs_list` is unused: name is never assigned in scope
cephfs-top:466:13: F824 `global current_states` is unused: name is never assigned in scope
cephfs-top:872:9: F824 `global metrics_dict` is unused: name is never assigned in scope
cephfs-top:872:9: F824 `global current_states` is unused: name is never assigned in scope
cephfs-top:911:9: F824 `global fs_list` is unused: name is never assigned in scope
cephfs-top:981:9: F824 `global current_states` is unused: name is never assigned in scope
cephfs-top:1126:13: F824 `global current_states` is unused: name is never assigned in scope
py3: exit 1 (0.77 seconds) /home/jenkins-build/build/workspace/ceph-pull-requests/src/tools/cephfs/top> flake8 --ignore=W503 --max-line-length=100 cephfs-top pid=2309605
py3: FAIL code 1 (8.15=setup[7.38]+cmd[0.77] seconds)
evaluation failed :( (8.24 seconds)
```
Since these variables are only being referenced and not assigned within
their scopes, the `global` declarations are unnecessary and can be
safely removed. This change:
- Removes all flagged `global` statements
- Fixes the failing flake8 checks in the CI pipeline
- Maintains the original code behavior as variable references still work without the `global` keyword
The `global` keyword is only needed when assigning to global variables
within a function scope, not when simply referencing them.
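A minimal, self-contained illustration of that rule (the variable and function names are made up for the example and do not come from cephfs-top):

```python
counter = 0
names = ["a", "b"]


def read_only():
    # Reading a module-level name, or mutating the object it refers to,
    # does not rebind the name, so no `global` statement is needed.
    names.append("c")
    return counter


def rebind():
    # Assignment rebinds the name; without `global`, `counter += 1`
    # would raise UnboundLocalError because `counter` would be local.
    global counter
    counter += 1
    return counter
```

Adding `global names` to `read_only()` would be exactly the kind of declaration that flake8 reports as F824.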
Kefu Chai [Sun, 30 Mar 2025 03:48:28 +0000 (11:48 +0800)]
qa: Remove unnecessary global statements in tests
Removes unused `global` statements from Python test files to fix flake8
F824 errors.
Recent flake8 runs were failing with:
```
./tasks/radosgw_admin.py:330:5: F824 `global log` is unused: name is never assigned in scope
./workunits/dencoder/test_readable.py:99:5: F824 `global incompat_paths` is unused: name is never assigned in scope
./workunits/dencoder/test_readable.py:164:5: F824 `global backward_compat` is unused: name is never assigned in scope
./workunits/dencoder/test_readable.py:165:5: F824 `global fast_shouldnt_skip` is unused: name is never assigned in scope
```
Since these variables are only being referenced and not assigned within
their scopes, the `global` declarations are unnecessary and can be
safely removed. This change:
- Removes all flagged `global` statements
- Fixes the failing flake8 checks in the CI pipeline
- Maintains the original code behavior as variable references still work
without the `global` keyword
The `global` keyword is only needed when assigning to global variables
within a function scope, not when simply referencing them.
Aashish Sharma [Tue, 25 Mar 2025 11:35:05 +0000 (17:05 +0530)]
mgr/dashboard: fix image filter's query on rbd-details grafana panel
The image filter on the RBD Details grafana panel uses a query with a typo: "label_values(ceph_rbd_read_ops{cluster=~\"$cluster\", , pool=\"$pool\"}, image)". The extra comma needs to be removed.
Casey Bodley [Tue, 11 Mar 2025 16:07:22 +0000 (12:07 -0400)]
cls/rgw: non-versioned listings skip past version suffix
when skipping a versioned entry for a non-versioned listing, we must
advance the marker or risk infinite loops. in particular, plain entries
converted by convert_plain_entry_to_versioned() sort at the end of an
object's versions, but have an empty version id whose retry would start
back at the beginning of the object's versions