Edit the "Data Scrubbing" listitem in the list of benefits conferred by
the use by OSDs of the aggregate power of the cluster, in the section
"Smart Daemons Enable Hyperscale" in doc/architecture.rst.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit d7b991646fcd36a1df7456e8e82c9d54a01e50f9)
Nizamudeen A [Wed, 19 Jul 2023 14:05:05 +0000 (19:35 +0530)]
install-deps: remove the legacy resolver flags
This was a workaround that was introduced a long time ago, and it is
something that could be deprecated at some point [1]. It is also preventing
some of the dependencies from being downloaded or stored in the wheelhouse:
deps like jsonschema, parse, mypy, cryptography etc.
Rewrite the explanation of how a client authenticates against a monitor.
This is a rewrite of a single paragraph, and has been set apart in its
own PR so that it can receive the maximum amount of scrutiny that the
upstream Ceph community can muster.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit c71cd84ec9e579ba0913c4952570bba6082e03b5)
rgw: fix FP error when calculating entries per bi shard
When calculating how many entries per shard to request during an
ordered bucket listing, we divide by the number of bucket index
shards. If this value is 0, then a floating point exception is
generated, crashing the RGW.
This addresses the proximate issue by detecting the situation and
returning an error rather than crashing.
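A minimal Python sketch of the guard described above, for illustration only. The helper name `entries_per_shard`, the plain ceiling division, and the choice of `EINVAL` as the returned error are assumptions; the actual RGW code is C++ and its calculation is not reproduced here.

```python
import errno


def entries_per_shard(num_entries: int, num_shards: int) -> int:
    """Return how many entries to request per shard, or a negative errno.

    Hypothetical helper: a ceiling division stands in for the real
    per-shard calculation.
    """
    if num_shards <= 0:
        # Detect the bad shard count up front and report an error
        # instead of dividing by zero.
        return -errno.EINVAL
    # Ceiling division using integer arithmetic only.
    return -(-num_entries // num_shards)
```

A caller would check for a negative return value and propagate it as an error rather than continue with the listing.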
Matt Benjamin [Mon, 31 Oct 2022 16:40:50 +0000 (12:40 -0400)]
rgwlc: prevent lc for one bucket from exceeding time budget
Fixes: https://tracker.ceph.com/issues/57951 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 617ffccbca0169ac0f1cd713962d44e8cc74a8af)
Seena Fallah [Fri, 9 Jun 2023 12:14:24 +0000 (14:14 +0200)]
rgw: pick http_date in case of http_x_amz_date absence
From the AWS doc:
The request date can be specified by using either the HTTP Date or the x-amz-date header. If both headers are present, x-amz-date takes precedence.
Refs: https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html
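The precedence rule quoted above can be sketched in Python as follows. The function name `pick_request_date` and the use of a plain mapping keyed by `HTTP_X_AMZ_DATE` and `HTTP_DATE` are assumptions made for illustration; this is not the RGW implementation.

```python
from typing import Mapping, Optional


def pick_request_date(env: Mapping[str, str]) -> Optional[str]:
    """Pick the request date, preferring x-amz-date over Date.

    `env` is a hypothetical mapping of CGI-style header names to values.
    """
    amz_date = env.get("HTTP_X_AMZ_DATE")
    if amz_date:
        # x-amz-date takes precedence when both headers are present.
        return amz_date
    # Fall back to the HTTP Date header; None if neither is set.
    return env.get("HTTP_DATE")
```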
rgw: Fix potential null pointer dereferences where `RGWEnv::get()` is called
Here are the changes I've made:
- Added `RGWEnv::get_optional` with an implementation similar to `RGWHTTPArgs::get_optional`
- Replaced `RGWEnv::get` with `RGWEnv::get_optional` in the RGW code where a null pointer dereference can happen, as long as the caller accepts `std::string`
- Otherwise, if the function calling `RGWEnv::get` accepts `char*`, I left it as it is
- Added null pointer checks to avoid the null pointer dereference
Rishabh Dave [Mon, 11 Sep 2023 09:55:46 +0000 (15:25 +0530)]
doc/cephfs: write cephfs commands fully in docs
We write CephFS commands incompletely in docs. For example, "ceph tell
mds.a help" is simply written as "tell mds.a help". This might confuse
the reader, and it does no harm to write the command in full.
Fixes: https://tracker.ceph.com/issues/62791 Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit e63b573d3edc272d83ee1b5eb3dace037f762d87)
fix 2 null versionID after convert_plain_entry_to_versioned
After a plain entry is converted to versioned, the converted entry's epoch is 1.
Setting this ensures that there is only one null version.
Fixes: https://tracker.ceph.com/issues/62013 Signed-off-by: rui ma <marui1@chinatelecom.cn> Signed-off-by: zhuo li <lizhuo@chinatelecom.cn>
(cherry picked from commit 14cfbfd60c45cc0f04f7a83057cb460731f3cc70)
Change the sentence structure of a sentence because the verb
"experience" looked like the abstract noun "experience" when I read it
with fresh eyes. I chose the perhaps TESOL-unfriendly verb "incur", but
I believe it is right.
qa/suites/upgrade/quincy-p2p: skip TestClsRbd.mirror_snapshot test
The behavior of the class method changed in reef; the change was
backported to pacific and quincy. An older quincy binary used against
newer quincy OSDs produces an expected failure:
[ RUN ] TestClsRbd.mirror_snapshot
.../ceph-17.2.0/src/test/cls_rbd/test_cls_rbd.cc:2278: Failure
Expected equality of these values:
-85
mirror_image_snapshot_unlink_peer(&ioctx, oid, 1, "peer2")
Which is: 0
[ FAILED ] TestClsRbd.mirror_snapshot (49 ms)
TestClsRbd.snapshots_namespaces test was removed in commit 4ad9d565a15c
("librbd: simplified retrieving snapshots from image header") many years
ago.
doc/man: remove docs about support for unix domain sockets
doc/man: support for unix domain sockets is not implemented, hence we
removed documentation about it.
(Note: the changes in this commit were the work of Rok Jaklič in
https://github.com/ceph/ceph/pull/48537. This pull request has been
raised because that pull request was for some mysterious reason causing
merge conflicts that were never resolved.)
Co-authored-by: Rok Jaklič <rjaklic@gmail.com> Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit fa40b7ef560fc60a107dad1604650e0bcf27e77e)
This makes ceph-volume report partitions in inventory.
A partition is a valid device for `ceph-volume lvm prepare`
so we should report them in inventory (when using `--list-all`
parameter).
This function works for what it is supposed to do:
check if a device is busy.
That being said, it induces a race condition in `get_devices()`.
Indeed, it does:
1/ `os.open()` with `(os.O_RDWR | os.O_EXCL)`
2/ `os.close()`
The second call has a side effect: it triggers a udev event which causes
systemd-udevd to re-process the device. This seems to be a matter of
milliseconds, but because of it, /sys (sysfs) isn't fully populated as
expected. Given that `get_devices()` collects a lot of details from sysfs
in a loop, some of these details can be missed.
ceph-volume overall doesn't make decisions based on `is_locked_raw_device()`.
This detail is used only for reporting (inventory).
For this reason, dropping this function seems reasonable.
As a compromise, we can check if the device has partitions and/or a filesystem
on it.
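For reference, the two-step check described above can be sketched as follows. This is a reconstruction from the two listed steps only, not the original function body; the boolean return value and the handling of `OSError` are assumptions.

```python
import os


def is_locked_raw_device(path: str) -> bool:
    """Report whether `path` cannot be opened read/write exclusively.

    Sketch only: reconstructed from the two steps in the commit message.
    """
    try:
        # Step 1: an exclusive read/write open fails if the device is busy.
        fd = os.open(path, os.O_RDWR | os.O_EXCL)
    except OSError:
        return True
    # Step 2: per the commit message, closing the device is what triggers
    # the udev event and the re-processing by systemd-udevd.
    os.close(fd)
    return False
```

Calling such a check inside a loop that also reads sysfs is what exposes the race: the read can land while the device is being re-processed.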
This adds a new config option 'inventory_list_all' so that one can make
the command `ceph orch device ls` report lvm devices too, as they are
valid devices that can be prepared as OSDs.
qa: add "failover / failback loop" test for rbd-mirror
For snapshot-based mirroring, check that demotion snapshots (or other
mirror snapshots) don't pile up. There is nothing in particular to assert
on for journal-based mirroring, but the test is still useful.
Ilya Dryomov [Sat, 26 Aug 2023 11:04:52 +0000 (13:04 +0200)]
librbd: make CreatePrimaryRequest remove any unlinked mirror snapshots
After commit ac552c9b4d65 ("librbd: localize snap_remove op for mirror
snapshots"), rbd-mirror daemon no longer removes mirror snapshots when
it's done syncing them -- instead it only unlinks from them. However,
CreatePrimaryRequest state machine was not adjusted to compensate and
hence two cases were missed:
- primary demotion snapshot (rbd-mirror daemon unlinks from primary
demotion snapshots just like it does from regular primary snapshots);
this comes up when an image is demoted but then promoted on the same
cluster
- non-primary demotion snapshot (unlike regular non-primary snapshots,
non-primary demotion snapshots store peer uuids and rbd-mirror daemon
does unlinking just like in the case of primary snapshots); this
comes up when an image is demoted and promoted on the other cluster
Related is the case of orphan snapshots. Since they are dummy to begin
with, CreatePrimaryRequest would now clean up the orphan snapshot after
the creation of the force promote snapshot.
Fixes: https://tracker.ceph.com/issues/61707 Co-authored-by: Christopher Hoffman <choffman@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 9c05d3d81f4b06af2cfd47376e9ad86369bdf8cf)
Conflicts:
src/librbd/mirror/snapshot/CreatePrimaryRequest.cc [ commit 3a93b40721a1 ("librbd: s/boost::variant/std::variant/") not
in quincy ]
Ilya Dryomov [Tue, 22 Aug 2023 15:27:50 +0000 (17:27 +0200)]
librbd: don't attempt to remove image state on orphan snapshots
Despite being mirror snapshots, orphan snapshots don't have image
state: see CreateNonPrimaryRequest::write_image_state() for a similar
is_orphan() check. Attempting to remove image state generates bogus
"failed to read image state object" and "failed to remove image state"
errors.