qa/standalone/scrub: fix "scrubbed in 0ms" in osd-scrub-test.sh
The specific test looks for a 'last scrub duration' higher than
0 as a sign that the scrub actually ran. Previous code fixes
guaranteed that even a scrub duration as low as 1ms would be
reported as "1" (1s). However, none of the 15 objects created
in this test were designated for the tested PG, which remained
empty. As a result, the scrub duration was reported as "0".
The fix is to create a large enough number of objects so that
at least one of them is mapped to the tested PG.
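As a rough illustration of why 15 objects can leave the tested PG empty, the sketch below estimates the chance that one particular PG receives no objects. It assumes objects are placed uniformly and independently across PGs, which only approximates real placement; the function names and the PG counts a caller passes in are hypothetical and not taken from the test.

```python
import math


def prob_pg_empty(num_objects: int, num_pgs: int) -> float:
    """Chance that one particular PG receives none of the objects,
    assuming uniform, independent placement (an approximation)."""
    return (1.0 - 1.0 / num_pgs) ** num_objects


def objects_needed(num_pgs: int, max_empty_prob: float) -> int:
    """Smallest object count that keeps the empty-PG chance at or
    below max_empty_prob under the same uniform assumption."""
    if num_pgs == 1:
        # With a single PG, any one object lands in it.
        return 1
    return math.ceil(math.log(max_empty_prob) / math.log(1.0 - 1.0 / num_pgs))
```

Under this model the empty-PG chance shrinks geometrically with the object count, so a modest increase in the number of objects makes an empty tested PG unlikely.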
Merge pull request #62713 from soumyakoduri/wip-skoduri-restore-glacier
rgw/cloud-restore [PART2] : Add Restore support from Glacier/Tape cloud endpoints
Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
This should fix the chmod 777 /var/log/ceph failures.
We were missing the install task which resulted in no /var/log/ceph:
```
2025-07-07T08:55:44.586 INFO:teuthology.run_tasks:Running task ceph...
2025-07-07T08:55:44.679 INFO:tasks.ceph:Making ceph log dir writeable by non-root...
2025-07-07T08:55:44.679 DEBUG:teuthology.orchestra.run.smithi144:> sudo chmod 777 /var/log/ceph
2025-07-07T08:55:44.711 INFO:teuthology.orchestra.run.smithi144.stderr:chmod: cannot access '/var/log/ceph': No such file or directory
```
Shachar Sharon [Thu, 26 Jun 2025 06:43:01 +0000 (09:43 +0300)]
mgr/smb: Enable per-share profile counters
Samba's commit 9f8d272 ("vfs_ceph_new: use per-share profile macros")
enables per-share profile counters for VFS ceph bridge. Enable those by
default for each smb-ceph share.
Matan Breizman [Sun, 8 Jun 2025 10:20:25 +0000 (10:20 +0000)]
crimson/CMakeLists: simplify crimson-common deps
Instead of appending conditional dependencies to crimson-common via
crimson_common_deps and crimson_common_public_deps, use
target_link_libraries directly.
Connor Fawcett [Tue, 24 Jun 2025 11:45:06 +0000 (12:45 +0100)]
Adds a new command-line utility which can check the consistency of objects within an erasure coded pool.
A new test-only inject tells the EC backend to return both data and parity shards to the client so that they can
be checked for consistency by the new tool.
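The kind of check such a tool performs can be illustrated with a toy single-parity scheme. This is only a sketch: it uses plain XOR parity rather than any real erasure-code plugin, and the function names are hypothetical, not the utility's actual interface.

```python
from functools import reduce


def xor_parity(data_shards):
    """Recompute a single XOR parity shard from equally sized data shards."""
    if not data_shards:
        raise ValueError("at least one data shard is required")
    length = len(data_shards[0])
    if any(len(shard) != length for shard in data_shards):
        raise ValueError("all data shards must have the same length")
    # XOR the bytes found at the same offset in every shard.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_shards))


def shards_consistent(data_shards, parity_shard):
    """True when the stored parity matches parity recomputed from the data."""
    return xor_parity(data_shards) == parity_shard
```

A client that receives both the data and the parity shards can re-encode the data and compare the result with the stored parity; a mismatch indicates an inconsistent object.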
Soumya Koduri [Fri, 23 May 2025 20:25:30 +0000 (01:55 +0530)]
rgw/cloud-restore: Handle failure with adding restore entry
If adding the restore entry to the FIFO fails, reset the `restore_status`
of that object to "RestoreFailed" so that the end S3 user can retry
the restore.
Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
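The failure handling for restore entries can be sketched as follows. This is a minimal illustration, not the RGW code: `FifoError`, `request_restore`, the `fifo_push` callback, the dict-based object and the "RestoreInProgress" value are all hypothetical; only `restore_status` and "RestoreFailed" come from the commit message.

```python
class FifoError(Exception):
    """Hypothetical error raised when the restore queue rejects an entry."""


def request_restore(obj: dict, fifo_push) -> bool:
    """Queue a restore entry for obj; on failure, mark it as retryable."""
    # Hypothetical marker for a restore that has been requested.
    obj["restore_status"] = "RestoreInProgress"
    try:
        fifo_push(obj["name"])
    except FifoError:
        # Without a queued entry nothing would ever process this restore,
        # so reset the status to let the user issue the request again.
        obj["restore_status"] = "RestoreFailed"
        return False
    return True
```

The point of the reset is that an object should not remain in an in-progress state when no entry exists to drive the restore to completion.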
rgw/cloud-restore: Support restoration of objects transitioned to Glacier/Tape endpoint
Restoration of objects from certain cloud services (like Glacier/Tape) can
take a significant amount of time (even days). Hence, store the state of such restore requests
and process them periodically.
Brief summary of changes
* Refactored existing restore code to consolidate and move all restore processing into rgw_restore* file/class
* RGWRestore class is defined to manage the restoration of objects.
* Lastly, for SAL_RADOS, FIFO is used to store and read restore entries.
Currently, this PR handles storing state of restore requests sent to cloud-glacier tier-type which need async processing.
The changes are tested with AWS Glacier Flexible Retrieval with tier_type Expedited and Standard.
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Dnyaneshwari [Thu, 22 May 2025 07:08:25 +0000 (12:38 +0530)]
mgr/dashboard: Local Storage Class - create and list
Fixes: https://tracker.ceph.com/issues/71460
Signed-off-by: Dnyaneshwari Talwekar <dtalwekar@redhat.com>
Shraddha Agrawal [Thu, 26 Jun 2025 12:27:45 +0000 (17:57 +0530)]
mon/MgrStatMonitor.cc: cleanup handle_conf_change
Prior to this change, we were using a flag value,
`reset_availability_last_uptime_downtime_val`, to record the
timestamp to which last_uptime and last_downtime should be
updated. This was originally done to avoid the values
being overwritten by a paxos update.
Now, instead of using an intermediate value, we immediately
clear the last_uptime and last_downtime values in the
pending_pool_availability object. Since we are updating the values
in the pending object, we will not lose this information due to
an incoming paxos update.
alienstore FTBFS [1] due to virtual-dtor warning when compiling seastar [2].
Instead of using alien::cflags, which defines INTERFACE_COMPILE_OPTIONS of
-Wno-non-virtual-dtor, directly add this compile option to
targets using seastar.
Crimson non-alien targets solve this with crimson::cflags, which
defines the relevant compile flag. However, we don't reuse it here since
it also carries WITH_CRIMSON.
Since both crimson::cflags and crimson-alienstore, which use seastar,
have to set -Wno-non-virtual-dtor, the compile option is moved to the common
cmake file instead of being set in both targets.
[1]
```
crimson/os/alienstore/alien_log.cc:21:28: required from here
seastar/include/seastar/core/future.hh:666:7:
warning: ‘class seastar::continuation_base<void>’ has virtual functions
and accessible non-virtual destructor [-Wnon-virtual-dtor]
```
mon: Integrate discard queue overflow into pg health warnings
Added a health warning mechanism to monitor the discard queue for potential overload
Emits a warning if the accumulated discarded bytes in the queue exceed the configured threshold
Introduced a debugging tool to simulate slow discard operations by adding a configurable delay
common/options: Added bdev_discard_max_bytes and bdev_debug_discard_sleep options
Added a health warning mechanism to monitor the discard queue for potential overload
Emits a warning if the accumulated discarded bytes in the queue exceed the configured threshold
Introduced a debugging tool to simulate slow discard operations by adding a configurable delay
Jaya Prakash [Thu, 9 Jan 2025 16:14:05 +0000 (21:44 +0530)]
blk: Warning added for discard queue overflow
Added a health warning mechanism to monitor the discard queue for potential overload
Emits a warning if the accumulated discarded bytes in the queue exceed the configured threshold
Introduced a debugging tool to simulate slow discard operations by adding a configurable delay
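The threshold check described above can be sketched roughly as follows. This is an illustration only, not the BlueStore code: the `DiscardQueue` class and its methods are hypothetical, and `max_bytes` merely stands in for the `bdev_discard_max_bytes` option.

```python
class DiscardQueue:
    """Tracks bytes waiting to be discarded and flags an overload."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes  # stands in for bdev_discard_max_bytes
        self.queued_bytes = 0

    def enqueue(self, length: int) -> None:
        """Account for a discard request that has been queued."""
        self.queued_bytes += length

    def complete(self, length: int) -> None:
        """Account for a discard that the device has finished."""
        self.queued_bytes = max(0, self.queued_bytes - length)

    def overloaded(self) -> bool:
        """True when the accumulated bytes exceed the threshold."""
        return self.queued_bytes > self.max_bytes
```

A health check would poll `overloaded()` and raise the warning while it returns true; a slow device, or an artificial delay like the one the debug option adds, lets `queued_bytes` grow faster than `complete()` drains it.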
Matan Breizman [Mon, 30 Jun 2025 09:44:24 +0000 (09:44 +0000)]
crimson: switch to ceph_abort_msg
ceph_abort doesn't print a message. Use ceph_abort_msg instead.
Most of the instances are not printing useful information but some are:
ceph_abort_msg("seastore device size setting is too small");
Leonid Chernin [Tue, 24 Jun 2025 13:00:49 +0000 (16:00 +0300)]
nvmeofgw: fixing GW delete issues
1. fix the issue where a gw is deleted based on invalid subsystem info
2. in function track_deleting_gws: break from the loop only if the
delete was really done
3. fix the published rebalance index: publish the ana-group instead of
the index
4. do not dump the gw-id string after the gw was removed
Fixes: https://tracker.ceph.com/issues/71896
Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
.github/workflows/scripts/config-diff-post-comment.js: fix config check ok logic
Currently, whenever a "config diff tool output" comment is created, it
also contains the string `/config check ok`. The next time the
test runs, it sees this text and assumes that the user has commented it.
Fix the logic to make sure that such cases are ignored.
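The corrected check can be sketched as below. The actual script is JavaScript; this Python version only illustrates the logic. `BOT_MARKER` and the function name are assumptions, since the exact text that identifies the tool's own comment is not given here.

```python
from typing import Iterable

# Assumed text identifying the tool's own comment (hypothetical marker).
BOT_MARKER = "config diff tool output"
OK_COMMAND = "/config check ok"


def user_acknowledged(comment_bodies: Iterable[str]) -> bool:
    """True only if a comment other than the tool's own contains the command."""
    for body in comment_bodies:
        if BOT_MARKER in body.lower():
            # The tool's comment mentions the command as an instruction,
            # so it must not count as an acknowledgement.
            continue
        if OK_COMMAND in body:
            return True
    return False
```

Filtering on the comment's author rather than its text would be another option, provided the workflow always posts under a known bot account.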