Samuel Just [Wed, 28 Aug 2024 01:54:04 +0000 (18:54 -0700)]
osd/OSDMap: require CRUSH_MSR if any rule is msr, even if used by no pool
OSDMap::get_features is used by
OSDMonitor::validate_crush_against_features via
OSDMap::get_min_compat_client() to check whether changes to the crushmap
will require newer features than the existing require_min_compat_client
field.
Monitor commands which create rules from ec profiles may result in msr
rules. While it might be harmless to allow msr rules to exist as long
as there aren't any pools actually using the rule, it's probably simpler
to disallow their creation in the first place until
require_min_compat_client is updated.
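The rule described above can be sketched as follows. This is a minimal Python illustration, not the C++ in OSDMap::get_features: the `Rule` type and the `required_features` helper are hypothetical, and the bit value is assumed to be the MSR feature only because it is the bit reported as missing in the manual test later in this log.

```python
from dataclasses import dataclass

# Assumption: this is the MSR feature bit. The value is taken from the
# "missing 0x4000000000" message in the manual test, not from Ceph headers.
CRUSH_MSR = 0x4000000000


@dataclass
class Rule:
    name: str
    is_msr: bool


def required_features(rules, pool_rule_names, base=0):
    """Return the feature mask needed to understand the crushmap.

    pool_rule_names is accepted only to show that it is deliberately
    ignored for MSR: the bit is required even if no pool uses the rule.
    """
    features = base
    if any(rule.is_msr for rule in rules):
        features |= CRUSH_MSR
    return features


rules = [Rule("replicated_rule", False), Rule("ec_msr_rule", True)]
# No pool references ec_msr_rule, yet the MSR bit is still required.
mask = required_features(rules, pool_rule_names={"replicated_rule"})
print(hex(mask))
```

Ignoring pool usage keeps the check simple: the mere existence of an msr rule is enough to raise the required feature set.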
mon: validate everybody understands MSR on set-require-min-compat-client
Unit testing
------------
```
[rzarzynski@o06 build]$ bin/unittest_features
...
[ RUN ] features.release_features
1 argonaut features 0x40000 looks like argonaut
2 bobtail features 0x40000 looks like argonaut
3 cuttlefish features 0x40000 looks like argonaut
4 dumpling features 0x42040000 looks like dumpling
5 emperor features 0x42040000 looks like dumpling
6 firefly features 0x20842040000 looks like firefly
7 giant features 0x20842040000 looks like firefly
8 hammer features 0x1020842040000 looks like hammer
9 infernalis features 0x1020842040000 looks like hammer
10 jewel features 0x401020842040000 looks like jewel
11 kraken features 0xc01020842040000 looks like kraken
12 luminous features 0xe01020842240000 looks like luminous
13 mimic features 0xe01020842240000 looks like luminous
14 nautilus features 0xe01020842240000 looks like luminous
15 octopus features 0xe01020842240000 looks like luminous
16 pacific features 0xe01020842240000 looks like luminous
17 quincy features 0xe01020842240000 looks like luminous
18 reef features 0xe010208d2240000 looks like reef
19 squid features 0xe010248d2240000 looks like squid
[ OK ] features.release_features (0 ms)
```
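One way to read the "looks like" column is as a subset test on feature masks. The sketch below is an assumption about the idea, not the implementation exercised by unittest_features: `RELEASES` and `looks_like` are illustrative names, and only the distinct masks from the output above are listed.

```python
# (release name, feature mask) pairs copied from the unittest output above.
# Releases that share a mask with an earlier one (e.g. mimic) are omitted,
# which is why they "look like" the earlier release.
RELEASES = [
    ("argonaut", 0x40000),
    ("dumpling", 0x42040000),
    ("firefly", 0x20842040000),
    ("hammer", 0x1020842040000),
    ("jewel", 0x401020842040000),
    ("kraken", 0xC01020842040000),
    ("luminous", 0xE01020842240000),
    ("reef", 0xE010208D2240000),
    ("squid", 0xE010248D2240000),
]


def looks_like(features):
    """Return the newest listed release whose bits are all set in features."""
    best = None
    for name, need in RELEASES:
        if (features & need) == need:
            best = name
    return best


print(looks_like(0xE010208D2240000))
```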
Manual testing
--------------
### `reef` client present in `squid` cluster
```
[rzarzynski@o06 build]$ bin/ceph daemon mon.a sessions | jq -jr '.[] | .name, "\t", .con_features, "\t", .con_features_hex, "\n"' | grep client
client.?	4540701547738038271	3f03cffffffdffff
client.?	4540138322906710015	3f01cfbffffdffff
[rzarzynski@o06 build]$ bin/ceph osd get-require-min-compat-client
luminous
[rzarzynski@o06 build]$ bin/ceph osd set-require-min-compat-client squid
Error EPERM: cannot set require_min_compat_client to squid: 1 connected client(s) look like reef (missing 0x4000000000); add --yes-i-really-mean-it to do it anyway
```
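The EPERM above can be reproduced by hand from the session output. This is a hedged sketch: `missing_bits` is a hypothetical helper, the two client masks are the trailing 16 hex digits of each `client.?` line, and the squid mask is the one printed by unittest_features. It is not the monitor's actual check.

```python
SQUID_FEATURES = 0xE010248D2240000  # squid mask from the unittest output

# con_features_hex values from the two client sessions shown above.
clients = {
    "client A": 0x3F03CFFFFFFDFFFF,
    "client B": 0x3F01CFBFFFFDFFFF,
}


def missing_bits(features, required):
    """Bits that are required but not advertised by the client."""
    return required & ~features


blocked = {}
for name, features in clients.items():
    missing = missing_bits(features, SQUID_FEATURES)
    if missing:
        blocked[name] = missing

for name, missing in blocked.items():
    print(f"{name} is missing {missing:#x}")
```

Exactly one of the two sessions lacks a bit that squid requires, which matches the "1 connected client(s)" count in the error message.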
Casey Bodley [Fri, 9 Aug 2024 16:49:05 +0000 (12:49 -0400)]
rgw: revert account-related changes to get_iam_policy_from_attr()
while bucket ARNs in iam policies don't include account names, policy
evaluation does need to differentiate between buckets in different
tenant namespaces
when requests pass bucket/object ARNs into
verify_bucket/object_permission(), those do include the bucket's tenant
name. to match against those ARNs, we also need to pass the requested
bucket's tenant name into get_iam_policy_from_attr()
Casey Bodley [Fri, 23 Aug 2024 19:03:31 +0000 (15:03 -0400)]
rgw: ignore zoneless default realm when not configured
"default" zone/zonegroup deployments without a realm can be broken by
the creation of an unrelated realm, because that realm is (was)
automatically set as the default
when startup detects an incomplete default realm (one that doesn't have
a default zone), fall back to the realmless "default" zone/zonegroup
instead
Laura Flores [Mon, 15 Jul 2024 22:04:41 +0000 (17:04 -0500)]
qa/suites/rados/thrash-old-clients: test with N-2 releases on centos 9
It was recently decided to stop building and releasing ubuntu focal
packages for squid. This decision extended to the Shaman builds.
When we stopped building focal for squid in Shaman, this failure
started happening, because the test was looking for nonexistent
squid focal packages:
```
no results found at https://shaman.ceph.com/api/search/?project=ceph&distros=ubuntu%2F20.04%2Fx86_64&flavor=default&sha1=81127b728ce57cc8b876f0f2dd3e436633549a67
```
After a discussion in Slack, we agreed the best option going forward
would be to test on centos 9 and drop pacific from the mix, since pacific
does not have centos 9 packages. To later incorporate pacific, we will work
on a containerized solution.
-----
Slack thread (may be expired):
https://ceph-storage.slack.com/archives/C1HFJ4VTN/p1721078395083699
Laura Flores
4:19 PM
@Dan Mick
I see we stopped building focal for squid on Shaman via Jenkins. I
know this is intended since we no longer plan to release squid focal
packages, but now the thrash-old-clients tests are failing on squid:
https://pulpito.ceph.com/teuthology-2024-07-14_21:00:02-rados-squid-distro-default-smithi/7801302/
These tests use an older client, i.e. reef, in a squid cluster. These
older clients go as far back as N-3 (so we test pacific, reef, and
quincy clients against a squid cluster). We need a distro that is shared
between all these releases in order to do that, which up until recently
was focal. Can we reintroduce focal shaman builds? We can put a note in
https://docs.ceph.com/en/latest/start/os-recommendations/#platforms to
explain that these packages are not released for squid, but are used to
test old clients.
Laura Flores
4:21 PM
In the above scenario, we could consider switching to centos 9 since squid,
reef and quincy share these. But we also test against pacific clients, and
pacific of course does not build c9.
Casey Bodley
41 minutes ago
it would be nice if those tests could eventually use containers for the upgraded
servers
Josh Durgin
4:46 PM
centos 9 is the easiest path for now, for quincy and reef
4:46
agree with casey containerized servers would be better going forward anyway
osd/scrub: exempt only operator scrubs from max_scrubs limit
Existing code exempts all 'high priority' scrubs, including for example
'after_repair' and 'mandatory on invalid history' scrubs from the limit.
PGs that do not have valid last-scrub data (which is what we have when
a pool is first created) are set to shallow-scrub immediately.
Unfortunately, this type of scrub is (at the low granularity implemented
in existing code) 'high priority',
which means that a newly created pool will have all its PGs start
scrubbing, regardless of concurrency (or any other) limits.
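The intended behavior can be sketched as a single admission check. The `Urgency` categories and `may_start_scrub` below are hypothetical names chosen for illustration; the real scheduler in the OSD is C++ and more involved.

```python
from enum import Enum, auto


class Urgency(Enum):
    # Hypothetical categories, named after the cases in the commit message.
    PERIODIC = auto()
    MANDATORY_INVALID_HISTORY = auto()
    AFTER_REPAIR = auto()
    OPERATOR_REQUESTED = auto()


def may_start_scrub(urgency, active_scrubs, max_scrubs):
    """Only operator-requested scrubs bypass the max_scrubs limit."""
    if urgency is Urgency.OPERATOR_REQUESTED:
        return True
    return active_scrubs < max_scrubs


# A newly created pool: every PG wants a "mandatory" scrub, but the limit
# still applies, so only the first one is admitted.
print(may_start_scrub(Urgency.MANDATORY_INVALID_HISTORY, 0, 1))
print(may_start_scrub(Urgency.MANDATORY_INVALID_HISTORY, 1, 1))
```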
Fixes: https://tracker.ceph.com/issues/67253
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit babd65e412266f5c734f7a2b57d87657d3470c47)
conflict resolution:
- eliminating irrelevant 'main' code that was picked into this branch.
- the code to set the scrub_job's flag moved to osd_scrub_sched.cc,
where the corresponding function is.
(cherry picked from commit a3f16627fde5426b19b932b9ef41c167e029d30f)
Fixes: https://tracker.ceph.com/issues/66286
(Line added by Gabriel)
In RadosStore, the source and dest objects in the copy_object() call
used to share an obj_ctx. When obj_ctx was removed from the SAL API,
they each got their own, but RGWRados::copy_obj() still assumed they
shared one.
Pass in each one separately, and use the correct one for further calls.
Signed-off-by: Daniel Gryniewicz <dang@fprintf.net>
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
(cherry picked from commit 111c54a19dc12b84cda785feddb0a0ba483b1f77)
Fixes: https://tracker.ceph.com/issues/66286
Improve display of ref_count in the rados commandline utility
New test cases were added to detect behavior after server side copy in the following cases:
1) delete original only
2) delete destination only
3) delete original then delete destination (this will lead to orphaned tail-objects without the changes made in this PR)
4) delete destination then delete original (this will lead to orphaned tail-objects without the changes made in this PR)
Add call to GC between tests to help control the used disk space since we keep writing huge files of 5GB each
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
(cherry picked from commit d496d20c803590d41d711e446feab41476c0f20c)
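The four test cases can be walked through with a toy reference-count model. This is an illustration of the expected outcome only: `ToyBucket` and its methods are hypothetical and do not reflect how RGW stores heads, tails, or ref_count.

```python
class ToyBucket:
    """Toy model: head objects point at a shared tail with a ref_count."""

    def __init__(self):
        self.heads = {}      # object name -> tail id
        self.ref_count = {}  # tail id -> number of heads referencing it

    def put(self, name):
        tail = f"tail-of-{name}"
        self.heads[name] = tail
        self.ref_count[tail] = 1

    def copy(self, src, dst):
        # Server-side copy: the destination shares the source's tail.
        tail = self.heads[src]
        self.heads[dst] = tail
        self.ref_count[tail] += 1

    def delete(self, name):
        tail = self.heads.pop(name)
        self.ref_count[tail] -= 1
        if self.ref_count[tail] == 0:
            del self.ref_count[tail]  # last reference gone, tail removed

    def orphaned_tails(self):
        # Tails still stored that no head object points at.
        return set(self.ref_count) - set(self.heads.values())


def run(deletes):
    bucket = ToyBucket()
    bucket.put("orig")
    bucket.copy("orig", "dest")
    for name in deletes:
        bucket.delete(name)
    return bucket


scenarios = {
    "1": ["orig"],
    "2": ["dest"],
    "3": ["orig", "dest"],
    "4": ["dest", "orig"],
}
results = {label: run(deletes) for label, deletes in scenarios.items()}
for label, bucket in results.items():
    print(label, bucket.ref_count, bucket.orphaned_tails())
```

In this model no scenario leaves an orphaned tail, which is the property the new tests are meant to detect in the real system.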
RemoteApplier::load_acct_info() and create_account() decide whether to
add the implicit tenant. store the resulting rgw_user for use in
get_aclowner() and get_tenant()
Nitzan Mordechai [Tue, 25 Jun 2024 09:06:45 +0000 (09:06 +0000)]
crimson/osd: adding osdmap subscribe
when the committed osdmap is complete, it will check whether it should
restart. in case we shouldn't restart but we are still active, we need
the next osdmap to continue the process.
hualong feng [Fri, 14 Jun 2024 07:50:53 +0000 (15:50 +0800)]
rgw: fixup compressor_message didn't store in some cases
When an object is uploaded to RGW via multipart, the head object's
xattr (user.rgw.compression) does not contain compressor_message,
even though the value should be valid and the part objects' xattrs
do contain it.
hualong feng [Thu, 6 Jun 2024 07:53:03 +0000 (15:53 +0800)]
compressor: Change data format to QZ_DEFLATE_GZIP_EXT for QAT zlib
The QAT zlib 'QZ_DEFLATE_RAW' data format cannot be decompressed
by QAT hardware. So here we replace the 'QZ_DEFLATE_RAW' data
format with 'QZ_DEFLATE_GZIP_EXT'.
The 'QZ_DEFLATE_GZIP_EXT' data format needs a gz_header, added
by deflateSetHeader() in QATzip. It also leads to multiple streams
in one compression for the hardware buffer. So the window bits
are important information for decompression, since they relate to
whether inflate removes the header.
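Why the window bits matter can be shown with plain zlib. The sketch below uses Python's standard zlib module, not QATzip or the Ceph compressor plugin, and `deflate` and `inflate_all` are illustrative helpers. It assumes only the standard zlib convention that negative window bits mean raw deflate and 16 + 15 means a gzip wrapper.

```python
import zlib

RAW, GZIP = -15, 31  # zlib convention: negative = raw deflate, 16 + 15 = gzip

payload = b"ceph " * 1000


def deflate(data, wbits):
    compressor = zlib.compressobj(wbits=wbits)
    return compressor.compress(data) + compressor.flush()


def inflate_all(data, wbits):
    """Decompress every concatenated stream in data with the given wbits."""
    out = []
    while data:
        decompressor = zlib.decompressobj(wbits=wbits)
        out.append(decompressor.decompress(data) + decompressor.flush())
        data = decompressor.unused_data  # bytes after the end of this stream
    return b"".join(out)


# Two gzip members back to back stand in for "multiple streams in one
# compression".
gzip_blob = deflate(payload[:2500], GZIP) + deflate(payload[2500:], GZIP)
roundtrip = inflate_all(gzip_blob, GZIP)

# With raw window bits the gzip header is not stripped, so inflate fails.
try:
    inflate_all(gzip_blob, RAW)
    raw_inflate_failed = False
except zlib.error:
    raw_inflate_failed = True

print(roundtrip == payload, raw_inflate_failed)
```

The decompressor has to be told which framing to expect; the same bytes that inflate cleanly with gzip window bits are rejected with raw window bits.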
All temp objects are added *only* to PGBackend::temp_contents.
Cleaning RecoveryBackend::temp_contents (which is always empty) instead
of PGBackend::temp_contents is wrong.
qa/workunits/rbd: avoid caching effects in luks-encryption.sh
Commit 40f6f5224bce ("qa/workunits/rbd: fix issues in
luks-encryption.sh") did the right thing for reads, which solved
most of the issue. However, it actually made a step in the opposite
direction for writes -- depending on the RBD cache settings, rbd-nbd
virtual devices can behave as physical devices with a volatile write
cache, so fsync is required.
While at it, involving O_DIRECT for reads isn't needed outside of
test_encryption_format().
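The write-side point can be illustrated with an explicit fsync. This is a sketch on an ordinary temporary file, assuming nothing about the actual shell commands in luks-encryption.sh; `write_durably` is a hypothetical helper.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data and force it past any volatile write cache."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to flush to the device


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "testfile")
    write_durably(path, b"x" * 4096)
    with open(path, "rb") as f:
        readback = f.read()

print(len(readback))
```

Without the fsync step, a device that behaves as if it had a volatile write cache may acknowledge the write before the data is stable, so a later comparison could observe cached rather than persisted contents.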