Add a second method of changing the value of osd_deep_scrub_interval to
remedy the condition indicated by the "PGs not deep-scrubbed in time"
warning.
This procedure was developed by Eugen Block, and is at the time of this
commit available on his blog at
https://heiterbiswolkig.blogs.nde.ag/2024/09/06/pgs-not-deep-scrubbed-in-time/
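As a hedged sketch of the kind of command such a procedure involves (the two-week value is an assumed example, not a recommendation):

```
# Raise the deep-scrub interval for all OSDs; the value is in seconds
# and is an assumed example only.
ceph config set osd osd_deep_scrub_interval 1209600
# Confirm the value now in effect:
ceph config get osd osd_deep_scrub_interval
```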
doc/install: Keep the name field of the created user consistent with the node name in the Start RADOSGW service command
If the user name does not match the name of the node that starts the RADOSGW service, newcomers to Ceph will be confused, because they will not be able to start the radosgw service as shown in the tutorial.
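A hedged sketch of the consistency being described, with `node1` as a placeholder node name; the entity in the keyring must match the instance name passed to the service:

```
# Create the RGW keyring for node1 (placeholder name):
ceph auth get-or-create client.rgw.node1 mon 'allow rw' osd 'allow rwx' \
    -o /etc/ceph/ceph.client.rgw.node1.keyring
# Start the service with the matching instance name:
systemctl start ceph-radosgw@rgw.node1
```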
doc/rados: add "pgs not deep scrubbed in time" info
Add a procedure to doc/rados/operations/health-warnings.rst that
explains how to remedy the "X PGs not deep-scrubbed in time" health
warning.
This procedure was developed by Eugen Block, and is at the time of this
commit available on his blog at
https://heiterbiswolkig.blogs.nde.ag/2024/09/06/pgs-not-deep-scrubbed-in-time/
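For context, the warning can also be addressed for individual PGs by deep-scrubbing them manually; a hedged sketch (the PG id is a placeholder):

```
# Identify the PGs named in the warning:
ceph health detail
# Manually deep-scrub one of them (2.5 is a placeholder PG id):
ceph pg deep-scrub 2.5
```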
Edit the section "bluefs-bdev-migrate" in
doc/man/8/ceph-bluestore-tool.rst to add the information that this
operation expands the target storage by updating its size label, making
"bluefs-bdev-expand" unnecessary.
Improve the subject-verb agreement in this section, and supply some
absent definite articles.
Co-authored-by: Peter Gervai <grin@drop.grin.hu>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 6b34707f827b2b197f53fe2e430d173b30b81401)
Ilya Dryomov [Fri, 30 Aug 2024 12:00:44 +0000 (14:00 +0200)]
rbd: mention namespace in "rbd mirror pool" command descriptions
Commit 5e64748927d0 ("doc/rbd: add namespace information for mirror
commands") did this for the man page; update the built-in help as well.
The "by default" bit in the description of "rbd mirror pool enable" and
"rbd mirror pool disable" commands is specific to pool mode which is in
turn specific to journal-based mirroring, so it's removed.
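For illustration, a hedged sketch of the namespace-aware forms now mentioned in the help (pool and namespace names are placeholders):

```
# Enable image-mode mirroring for a namespace within a pool:
rbd mirror pool enable mypool/ns1 image
# Disable mirroring for that namespace again:
rbd mirror pool disable mypool/ns1
```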
Make it clearer that, despite a full image or group spec being taken
for source and destination, an image or a group can be renamed only
within its pool or namespace.
Renaming across pools, or across namespaces within the same pool, is unsupported.
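A hedged sketch with placeholder names:

```
# Allowed: rename within the same pool (and namespace):
rbd rename mypool/image1 mypool/image2
# Not allowed: a rename that crosses pools (or namespaces):
rbd rename mypool/image1 otherpool/image1   # fails
```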
```
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph-authtool --gen-print-key
--> Failed to activate via raw: activate() takes 1 positional argument but 5 were given
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph-authtool --gen-print-key
--> Failed to activate via LVM: could not find osd.0 with osd_fsid 1e764f4a-db4b-4b41-86eb-468efe4c3f44
--> Failed to activate via simple: 'Namespace' object has no attribute 'json_config'
--> Failed to activate any OSD(s)
```
04c93a1ed42 seems to have broken it.
This commit fixes it.
Kefu Chai [Tue, 13 Aug 2024 22:37:57 +0000 (06:37 +0800)]
ceph-volume: add "packaging" to install_requires
In 0985e201, "packaging" was introduced as a runtime dependency of
ceph-volume, and `ceph.spec.in` was updated accordingly to note this
new dependency, but the Debian packaging was not updated.
In 80edcd40, the missing dependency was added to debian/control as
one of ceph-volume's runtime dependencies.
But dh_python3 is able to figure out the dependencies by reading the
egg metadata of the ceph-volume Python module, and, as a Python
project, ceph-volume uses its `setup.py` to track its dependencies.
So, in order to be more consistent and to keep all of the dependencies
in one place, let's move this dependency to setup.py, as the packaging
tooling in both distros is able to derive the dependencies from the
egg-info.
see also
- https://manpages.debian.org/testing/dh-python/dh_python3.1.en.html#dependencies
- https://docs.fedoraproject.org/en-US/packaging-guidelines/Python_201x/#_automatically_generated_dependencies
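A hedged, hypothetical sketch of where the dependency now lives (the excerpt below is illustrative, not verbatim):

```
# The runtime dependency is declared once, in ceph-volume's setup.py,
# and both rpm and deb tooling derive it from the generated egg metadata.
$ grep -A2 install_requires src/ceph-volume/setup.py
    install_requires=[
        'packaging',
    ],
```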
Thomas Lamprecht [Wed, 31 Jul 2024 07:48:08 +0000 (09:48 +0200)]
debian pkg: record python3-packaging dependency for ceph-volume
Since commit 0985e201342 ("ceph-volume: use 'no workqueue' options
with dmcrypt") the python "packaging" module is used to parse the
cryptsetup version output, but the debian packaging was not updated to
record that new dependency.
So simply record this in the d/control file; adding a <pkg>.requires
file did not seem to win us anything here.
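A hedged, hypothetical excerpt of the resulting d/control entry (surrounding dependencies elided):

```
# Illustrative check only; the real Depends line lists more packages.
$ grep -B1 python3-packaging debian/control
Depends: ${misc:Depends},
         python3-packaging,
```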
Fixes: https://tracker.ceph.com/issues/67290
Fixes: 0985e201342fa53c014a811156aed661b4b8f994
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit 80edcd40e8092d9fb3b45c1a6c7f9b7f4f37b69e)
Zac Dover [Fri, 30 Aug 2024 11:16:57 +0000 (21:16 +1000)]
doc/ceph-volume: add spillover fix procedure
Add a procedure that explains how, after an upgrade, to move bytes that
have spilled over to a relatively slow device back to the faster device.
This procedure was developed by Chris Dunlop on the [ceph-users] mailing
list, here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/POPUFSZGXR3P2RPYPJ4WJ4HGHZ3QESF6/
Eugen Block requested the addition of this procedure to the
documentation on 30 Aug 2024.
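For orientation, a hedged sketch of how the condition typically surfaces before the procedure is applied (output paraphrased; OSD id and sizes are placeholders):

```
# Spillover appears in the health output after the upgrade:
$ ceph health detail | grep -i spillover
osd.1 spilled over 128 MiB metadata from 'db' device (... used of ...) to slow device
```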
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 98618aaa1c8b786c7d240a210b62cc737fdb048d)
Samuel Just [Wed, 28 Aug 2024 01:54:04 +0000 (18:54 -0700)]
osd/OSDMap: require CRUSH_MSR if any rule is msr, even if used by no pool
OSDMap::get_features is used by
OSDMonitor::validate_crush_against_features via
OSDMap::get_min_compat_client() to check whether changes to the crushmap
will require newer features than the existing require_min_compat_client
field.
Monitor commands which create rules from ec profiles may result in msr
rules. While it might be harmless to allow msr rules to exist as long
as there aren't any pools actually using the rule, it's probably simpler
to disallow their creation in the first place until
require_min_compat_client is updated.
mon: validate everybody understands MSR on set-require-min-compat-client
Unit testing
------------
```
[rzarzynski@o06 build]$ bin/unittest_features
...
[ RUN ] features.release_features
1 argonaut features 0x40000 looks like argonaut
2 bobtail features 0x40000 looks like argonaut
3 cuttlefish features 0x40000 looks like argonaut
4 dumpling features 0x42040000 looks like dumpling
5 emperor features 0x42040000 looks like dumpling
6 firefly features 0x20842040000 looks like firefly
7 giant features 0x20842040000 looks like firefly
8 hammer features 0x1020842040000 looks like hammer
9 infernalis features 0x1020842040000 looks like hammer
10 jewel features 0x401020842040000 looks like jewel
11 kraken features 0xc01020842040000 looks like kraken
12 luminous features 0xe01020842240000 looks like luminous
13 mimic features 0xe01020842240000 looks like luminous
14 nautilus features 0xe01020842240000 looks like luminous
15 octopus features 0xe01020842240000 looks like luminous
16 pacific features 0xe01020842240000 looks like luminous
17 quincy features 0xe01020842240000 looks like luminous
18 reef features 0xe010208d2240000 looks like reef
19 squid features 0xe010248d2240000 looks like squid
[ OK ] features.release_features (0 ms)
```
Manual testing
--------------
### `reef` client present in `squid` cluster
```
[rzarzynski@o06 build]$ bin/ceph daemon mon.a sessions | jq -jr '.[] | .name, "\t", .con_features, "\t", .con_features_hex, "\n"' | grep client
client.?    4540701547738038271    3f03cffffffdffff
client.?    4540138322906710015    3f01cfbffffdffff
[rzarzynski@o06 build]$ bin/ceph osd get-require-min-compat-client
luminous
[rzarzynski@o06 build]$ bin/ceph osd set-require-min-compat-client squid
Error EPERM: cannot set require_min_compat_client to squid: 1 connected client(s) look like reef (missing 0x4000000000); add --yes-i-really-mean-it to do it anyway
```
Oguzhan Ozmen [Thu, 22 Aug 2024 02:44:01 +0000 (22:44 -0400)]
doc/rgw/account: Handling notification topics when migrating an existing user into an account
Add a subsection under "Migrate an existing User into an Account" to
describe how a client can seamlessly migrate the notification topics
after account migration.
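A hedged sketch of the migration step the subsection builds on (ids are placeholders; the topic-handling details are in the new subsection itself):

```
# Migrate an existing user into an account:
radosgw-admin user modify --uid=johndoe --account-id=RGW12345678901234567
```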
Kamoltat [Tue, 21 May 2024 19:02:03 +0000 (19:02 +0000)]
qa/suites/rados: 3-az-stretch-cluster-netsplit test
Test the case where two DCs lose connection with each other
in a 3-AZ stretch cluster with a stretch pool enabled.
Check whether the cluster is accessible and the PGs are active+clean
after they reconnect.
`ceph osd pool stretch set`
`ceph osd pool stretch unset`
`ceph osd pool stretch show`
`qa/workunits/mon/mon-stretch-pool.sh`
will create the stretch cluster
while performing input validation for the CLI
commands mentioned above.
`qa/tasks/stretch_cluster.py`
is in charge of
setting a pool as a stretch pool
and checks whether doing so prevents PGs
from going active when there are not
enough buckets available in the acting
set of the PGs to go active.
Also, test different MON failover scenarios
after setting the pool as stretch.
The user has the option of setting the value of
`peering_crush_bucket_{count|target|barrier}`.
This then allows the use of `calc_replicated_acting_stretch`,
since with `peering_crush_bucket_count != 0`
the pool is now a stretch pool and we can handle pg_temp
better by setting barriers and limits on how many OSDs
should be in a pg_temp.
This enables the specified pool to
handle pg_temp properly during create_acting, as a stretch pool
should.
The user can also use the command
`osd pool stretch show <pool>`
to show all the stretch-related information for the pool.
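A hedged sketch of the command shapes exercised here; the argument order and values below are assumptions for illustration only:

```
# Mark a pool as stretch (pool name, counts, bucket type, rule, and
# sizes are all placeholder values):
ceph osd pool stretch set mypool 2 2 datacenter replicated_rule 4 2
# Inspect the stretch-related fields of the pool:
ceph osd pool stretch show mypool
# Clear the stretch settings again:
ceph osd pool stretch unset mypool
```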
Yonatan Zaken [Mon, 12 Aug 2024 20:00:39 +0000 (23:00 +0300)]
mgr/orchestrator: fix encrypted flag handling in orch daemon add osd
The current implementation incorrectly parses the `encrypted` flag as a string rather than a boolean value.
This leads to unintended behavior, causing an LVM encryption layer to be created regardless of whether `encrypted=True` or `encrypted=False` is passed.
The only way to prevent this behavior is by omitting the `encrypted` flag entirely.
This change prevents potential errors, aligning the behavior with user expectations.
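A hedged sketch of the affected invocation (host and device are placeholders):

```
# With the fix, the boolean is honored; previously both of these
# resulted in an encrypted OSD:
ceph orch daemon add osd host1:data_devices=/dev/vdb,encrypted=true
ceph orch daemon add osd host1:data_devices=/dev/vdb,encrypted=false
```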
sajibreadd [Mon, 27 May 2024 07:30:06 +0000 (13:30 +0600)]
Warning added for slow operations and stalled reads in BlueStore.
The user can control how long the warning persists after the last
occurrence, and the number of operations that will be considered as
the threshold for the warning.
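A hedged sketch of tuning these controls; the option names below are assumptions inferred from the description, and the values are placeholders:

```
# Assumed option names; verify against `ceph config ls` before use.
ceph config set osd bluestore_slow_ops_warn_lifetime 600
ceph config set osd bluestore_slow_ops_warn_threshold 10
```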
Casey Bodley [Fri, 23 Aug 2024 19:03:31 +0000 (15:03 -0400)]
rgw: ignore zoneless default realm when not configured
"default" zone/zonegroup deployments without a realm can be broken by
the creation of an unrelated realm, because that realm is (was)
automatically set as the default
when startup detects an incomplete default realm (one that doesn't have
a default zone), fall back to the realmless "default" zone/zonegroup
instead
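A hedged sketch of how the breakage could be triggered (realm name is a placeholder):

```
# On a realmless "default" zone/zonegroup deployment, creating any
# unrelated realm used to mark it as the default and break startup:
radosgw-admin realm create --rgw-realm=unrelated
radosgw-admin realm list   # "unrelated" now appears as default_info
```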