John Mulligan [Fri, 21 Mar 2025 17:09:01 +0000 (13:09 -0400)]
mgr/cephadm: do not log unexpected uri scheme at warning level
Lower the log level of the "unexpected uri scheme" log event emitted
when determining caps from the uri. The log gets triggered by
`rados:mon-config-key:*` type uris, and those are common now, so
the warning is just noise.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
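As an illustration of the change above, here is a minimal sketch of logging an unexpected uri scheme below warning level. The function name `pool_caps_for_uri`, the cap string, and the choice of debug level are assumptions made for this sketch and are not taken from the cephadm source.

```python
import logging

logger = logging.getLogger("uri_caps_sketch")

RADOS_PREFIX = "rados://"


def pool_caps_for_uri(uri):
    """Return hypothetical pool caps for a uri, or [] for other schemes."""
    if uri.startswith(RADOS_PREFIX):
        # rados://<pool>/<namespace>/<object>: only the pool is needed here.
        pool = uri[len(RADOS_PREFIX):].split("/", 1)[0]
        return [f"allow r pool={pool}"]  # illustrative cap string only
    # Uris such as rados:mon-config-key:* are common and harmless, so this
    # is logged at debug level (an assumption) rather than as a warning.
    logger.debug("ignoring unexpected uri scheme: %r", uri)
    return []
```

With this shape, a `rados:mon-config-key:*` uri simply yields no pool caps and leaves no trace in the log at the default level.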
Rishabh Dave [Mon, 3 Mar 2025 16:36:10 +0000 (22:06 +0530)]
doc/cephfs: mention new options for "fs volume create" cmd
Command "ceph fs volume create" accepts 2 new options that allow users to
pass the data and metadata pool names. Update the docs to mention both
options.
J. Eric Ivancich [Tue, 18 Mar 2025 18:33:42 +0000 (14:33 -0400)]
rgw: modify radoslist to better support the rgw-gap-list tool
When the `radosgw-admin bucket radoslist ...` sub-command was
introduced, it was written specifically for finding orphans. It has
since been updated to work for finding gaps, that is, indexed RGW
objects that are missing one or more supporting rados objects.
Previously, a head object that was not found was ignored. Now the
command produces output with the oid and related information for the
missing head object.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
J. Eric Ivancich [Tue, 18 Mar 2025 18:31:05 +0000 (14:31 -0400)]
rgw: fix regression in radoslist with SLO manifests
A regression was inadvertently introduced in commit bcd7883d7212c96ebfb89c938c79fc7efbb80d2f that then prevented
`radosgw-admin bucket radoslist ...` from working properly with
buckets using SLO manifests. This corrects that regression.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Nizamudeen A [Wed, 5 Mar 2025 16:46:03 +0000 (22:16 +0530)]
mgr/dashboard: fix access control permissions for roles
Since Prometheus is used on the dashboard page, we need to make
sure every role has read-only Prometheus access so that the dashboard
page can load the utilization metrics.
I also saw a permission issue with the OSD settings endpoint when it
tries to get the nearfull/full ratio. Instead of failing the entire
page, proceed with a chart that omits those details when
the user doesn't have permission to access the config option.
The Multisite page was not accessible to rgw-manager or
read-only users because it tries to show the status of the rgw module. This
is now handled gracefully as well: the alert is shown only when the user has
sufficient permission.
Fixes: https://tracker.ceph.com/issues/70331
Signed-off-by: Nizamudeen A <nia@redhat.com>
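The graceful handling described above can be sketched as follows. This is a language-neutral illustration written in Python; `PermissionDenied`, `build_capacity_chart`, and the dictionary keys are hypothetical names and do not come from the dashboard code.

```python
class PermissionDenied(Exception):
    """Hypothetical stand-in for an access-control failure."""


def build_capacity_chart(used_ratio, fetch_ratios):
    """Build chart data, adding thresholds only if they can be read.

    `fetch_ratios` is any callable returning a dict with the keys
    'nearfull_ratio' and 'full_ratio', or raising PermissionDenied.
    """
    chart = {"used_ratio": used_ratio}
    try:
        ratios = fetch_ratios()
    except PermissionDenied:
        # Do not fail the whole page: render the chart without thresholds.
        return chart
    chart["nearfull_ratio"] = ratios["nearfull_ratio"]
    chart["full_ratio"] = ratios["full_ratio"]
    return chart
```

A user without access to the config option still gets the utilization value, only without the nearfull/full markers.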
Alex Ainscow [Wed, 19 Mar 2025 13:59:35 +0000 (13:59 +0000)]
test/common: skip google tests which create core dumps in test_interval_set
CI pipelines are being broken because this test creates a number of core dumps. The
core dumps do not make the test fail, but they appear to be
breaking something in the CI pipeline. This commit is a workaround, and I will find a
better solution later.
Fixes: https://tracker.ceph.com/issues/70543
Signed-off-by: Alex Ainscow <aainscow@uk.ibm.com>
This commit fixes the documentation about the many-to-many topic relationship for notifications. The current sentence states the same fact twice instead of clarifying.
Ronen Friedman [Wed, 12 Mar 2025 09:26:54 +0000 (04:26 -0500)]
common: add cmd_getval_cast_or()
This slight variation of cmd_getval_or() can be used where
the object type is different from the configuration item
type (as when the object is a wrapper around an integer).
It allows specifying the 'default' value in the object type.
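The idea can be illustrated with a rough Python analog. This is not the C++ helper and does not reproduce its signature; `getval_cast_or` and the `Epoch` wrapper are hypothetical names used only to show a default given in the object type while the stored item has the plain configuration type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epoch:
    """Hypothetical wrapper around an integer."""
    value: int


def getval_cast_or(cmdmap, key, cast, default):
    """Return cast(cmdmap[key]) if the key is present, else `default`.

    `default` is already of the object type, so callers do not have to
    express it in the underlying configuration item type.
    """
    if key in cmdmap:
        return cast(cmdmap[key])
    return default
```

For example, `getval_cast_or({"epoch": 7}, "epoch", Epoch, Epoch(0))` yields `Epoch(7)`, while a missing key yields the `Epoch(0)` default unchanged.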
Fix a clang warning in proxy_async.c where an unsigned value was being
unnecessarily compared against 0:
```
/home/kefu/dev/ceph/src/libcephfs_proxy/proxy_async.c:29:12: warning: result of comparison of unsigned expression >= 0 is always true [-Wtautological-unsigned-zero-compare]
29 | if ((size >= 0) && !info->write) {
| ~~~~ ^ ~
1 warning generated.
```
Since unsigned values are always >= 0 by definition, remove this
tautological check to resolve the "-Wtautological-unsigned-zero-compare"
warning.
Laura Flores [Fri, 7 Mar 2025 06:22:00 +0000 (06:22 +0000)]
mon, osd: add command to remove invalid pg-upmap-primary entries
The current rm-pg-upmap-primary command checks that the pgid exists
in the pgmap before continuing to remove it. Due to https://tracker.ceph.com/issues/66867,
some invalid pg-upmap-primary entries may exist for pools that have been removed.
Currently, these mappings are impossible to remove since the pgids no longer
exist in the pgmap.
This new command, rm-pg-upmap-primary-all, allows users to remove
all pg-upmap-primary mappings in the osdmap at once, including both
valid and invalid entries.
This command may also be helpful when upgrading from versions where users
are plagued by https://tracker.ceph.com/issues/61948. Users may use an upgraded
mon to remove all pg-upmap-primary entries (valid and invalid) so they can continue
to upgrade to a safe version.
See manual testing for this patch here: https://tracker.ceph.com/issues/67179#note-12
Fixes: https://tracker.ceph.com/issues/67179
Fixes: https://tracker.ceph.com/issues/69760
Signed-off-by: Laura Flores <lflores@ibm.com>
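The difference between the two commands can be shown with a toy model. The dictionaries and function names below are illustrative assumptions and do not mirror the mon or osdmap code; they only show why a per-pgid existence check leaves stale entries stuck while a bulk removal does not.

```python
def rm_pg_upmap_primary(upmap_primaries, existing_pgids, pgid):
    """Per-PG removal: refuses pgids that no longer exist (toy model)."""
    if pgid not in existing_pgids:
        raise KeyError(f"pgid {pgid} does not exist")
    upmap_primaries.pop(pgid, None)


def rm_pg_upmap_primary_all(upmap_primaries):
    """Bulk removal: clears every entry, valid or invalid.

    Returns the number of entries removed.
    """
    removed = len(upmap_primaries)
    upmap_primaries.clear()
    return removed
```

In this model, an entry for a pgid whose pool was deleted can never pass the existence check in the first function, but it is cleared by the second along with every valid entry.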
Vallari Agrawal [Mon, 17 Mar 2025 16:28:19 +0000 (21:58 +0530)]
monitoring: rename NVMeoFSingleGatewayGroup alert
Rename the alert to NVMeoFSingleGateway.
The original name was confusing because it
could accidentally suggest that the alert is
triggered when there is a single gateway group.
The 'NVMeoFSingleGatewayGroup' alert actually means that
there is a single gateway in a group.
[3AZ Stretch pool]: Allow user to specify values when unsetting pools
Problem:
When we enable stretch mode on a pool, we modify 6 configs on the pool,
namely peering_crush_bucket_count, peering_crush_bucket_target,
peering_crush_bucket_barrier, crush_rule, size, and min_size.
Of these, only 3 configs, namely peering_crush_bucket_count,
peering_crush_bucket_target, and peering_crush_bucket_barrier,
are reset to 0 when stretch mode is unset.
The remaining 3 configs, namely crush_rule, size, and min_size,
are not reverted and keep the values that were set with the
stretch set command.
Solution:
The unset command now requires `crush_rule`, `size`, and `min_size`
to be specified.
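A toy model of the new unset behaviour, assuming a pool is represented as a plain dictionary. The function name `stretch_unset` is hypothetical and this is not the monitor's implementation; it only shows which keys are zeroed and which must be supplied by the caller.

```python
# Configs that exist only for stretch mode and are reset to 0 on unset.
STRETCH_ONLY_KEYS = (
    "peering_crush_bucket_count",
    "peering_crush_bucket_target",
    "peering_crush_bucket_barrier",
)


def stretch_unset(pool, crush_rule, size, min_size):
    """Return a copy of `pool` with stretch mode unset.

    The caller must supply crush_rule, size and min_size, because the
    values that were in place before stretch mode are not remembered.
    """
    updated = dict(pool)
    for key in STRETCH_ONLY_KEYS:
        updated[key] = 0
    updated.update(crush_rule=crush_rule, size=size, min_size=min_size)
    return updated
```

Making the three arguments mandatory avoids silently leaving a pool with the stretch-mode `crush_rule`, `size`, and `min_size` after the unset.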
Kefu Chai [Mon, 17 Mar 2025 09:52:25 +0000 (17:52 +0800)]
crypto: remove unused include
openssl_crypto_accel.cc does not need the declarations provided by
openssl/engine.h. In addition, openssl/engine.h was deprecated in favor of
the provider API, and engine support was removed in Fedora 41.
So, let's avoid including it. Please note that the definition of the
"ENGINE" struct is available in openssl/types.h.
Soumya Koduri [Sat, 1 Mar 2025 07:05:51 +0000 (12:35 +0530)]
rgw/cloudrestore: Add Restore support from Glacier/Tape cloud endpoints
Unlike regular S3 cloud services, restoring objects from S3/Tape or AWS Glacier services
would require special handling. We need to first restore the object using Glacier
RestoreObject API and then download it using GET.
https://docs.aws.amazon.com/cli/latest/reference/s3api/restore-object.html
This PR adds that support for the "Expedited" tier retrieval type, which means the
restore is expected to be quick and the object can be downloaded soon afterwards.
TODO: "Standard" tier-type support. We need to handle the case in which a restore from
the cloud endpoint takes a longer time and needs to be monitored periodically
in the background.
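The restore-then-GET flow described above can be sketched from the client side. The sketch assumes `client` exposes boto3-style S3 methods (`restore_object`, `head_object`, `get_object`); the function name, polling interval, and poll limit are hypothetical, and this is not how RGW implements the feature internally.

```python
import time


def restore_then_download(client, bucket, key, days=1,
                          poll_seconds=30, max_polls=20):
    """Request an Expedited restore, wait for it, then GET the object."""
    client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest={
            "Days": days,
            "GlacierJobParameters": {"Tier": "Expedited"},
        },
    )
    for _ in range(max_polls):
        # The "Restore" field reports ongoing-request="true" while the
        # restore is still in progress.
        status = client.head_object(Bucket=bucket, Key=key).get("Restore", "")
        if 'ongoing-request="false"' in status:
            return client.get_object(Bucket=bucket, Key=key)
        time.sleep(poll_seconds)
    raise TimeoutError(f"restore of {bucket}/{key} did not complete")
```

A "Standard" restore would fit the same shape, but with a wait long enough that it would more naturally run as a periodic background check, as the TODO notes.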
Soumya Koduri [Sat, 8 Feb 2025 18:34:01 +0000 (00:04 +0530)]
rgw/cloud-restore: Add new tier-type & options related to S3 Glacier
Unlike regular S3 cloud services, restoring objects from S3/Tape
or AWS Glacier services would require special handling. We need to
first restore the object using Glacier `RestoreObject` API and then
download it using `GET`.
https://docs.aws.amazon.com/cli/latest/reference/s3api/restore-object.html
A new cloud tier-type `s3-glacier` is added to handle S3 Glacier
endpoints, along with the following tier-config options:
`glacier_restore_days` - lifetime of the restored copy on the Glacier
endpoint; default: 1 day
`glacier_restore_tier_type` - retrieval tier at which the restore will be
processed. Only "Standard" (default) and "Expedited" are supported.
In addition, a new option `restore_storage_class` is added to configure
the storage class the objects need to be restored to. Default value:
STANDARD
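How the defaults and the allowed tier values fit together can be sketched as below. The option names and defaults are the ones listed above; the function `resolve_glacier_tier_config`, the dictionary representation, and grouping `restore_storage_class` with the other two options are assumptions made for illustration.

```python
# Defaults as stated in the commit message.
DEFAULTS = {
    "glacier_restore_days": 1,
    "glacier_restore_tier_type": "Standard",
    "restore_storage_class": "STANDARD",
}
ALLOWED_TIER_TYPES = ("Standard", "Expedited")


def resolve_glacier_tier_config(overrides):
    """Merge user overrides onto the defaults and validate the tier type."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    config = {**DEFAULTS, **overrides}
    if config["glacier_restore_tier_type"] not in ALLOWED_TIER_TYPES:
        raise ValueError("glacier_restore_tier_type must be Standard or Expedited")
    return config
```

Calling it with `{"glacier_restore_tier_type": "Expedited"}` keeps the 1-day lifetime and the STANDARD storage class while switching the retrieval tier.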