git.apps.os.sepia.ceph.com Git - ceph.git/log
13 months ago doc: clarify use of location: in host spec 57646/head
Matthew Vernon [Wed, 22 May 2024 15:31:33 +0000 (16:31 +0100)]
doc: clarify use of location: in host spec

It wasn't clear that you can specify more than one element of the CRUSH hierarchy in a spec file, nor that it might be useful to do so (e.g. to ensure the host ends up beneath the default root).

So update the text to make it clearer, and similarly the example.

Signed-off-by: Matthew Vernon <mvernon@wikimedia.org>
(cherry picked from commit 2366391ccec0fb6d8a1c159d6e3cdf5ff4f1d603)

13 months ago Merge PR #57342 into squid
Patrick Donnelly [Thu, 23 May 2024 00:58:23 +0000 (20:58 -0400)]
Merge PR #57342 into squid

* refs/pull/57342/head:
PendingReleaseNotes: add note on the client incompatibility health warning and feature bit
doc/cephfs: add client_mds_auth_caps client feature bit
doc/cephfs: add missing client feature bits
doc/cephfs: document MDS_CLIENTS_BROKEN_ROOTSQUASH health error
qa: add tests for MDS_CLIENTS_BROKEN_ROOTSQUASH
mds: raise health warning if client lacks feature for root_squash
mon/MDSMonitor: add note about missing metadata inclusion
mds: check relevant caps for fs include root_squash
mds: refactor out fs_name match in MDSAuthCaps
qa: test for root_squash with multiple caps
qa: pass kwargs to mount from remount
qa: simplify update_attrs and only update relevant keys
client: allow overriding client features

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
13 months ago Merge PR #57340 into squid
Patrick Donnelly [Wed, 22 May 2024 18:22:01 +0000 (14:22 -0400)]
Merge PR #57340 into squid

* refs/pull/57340/head:
qa: make quiesce ops dump world readable
qa: use specific ops/cache dump file names

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
13 months ago Merge PR #57179 into squid
Patrick Donnelly [Wed, 22 May 2024 18:21:18 +0000 (14:21 -0400)]
Merge PR #57179 into squid

* refs/pull/57179/head:
mds: encode flags for all inode types
qa: test file inode with F_QUIESCE_BLOCK is replicated

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
13 months ago Merge PR #57176 into squid
Patrick Donnelly [Wed, 22 May 2024 18:20:46 +0000 (14:20 -0400)]
Merge PR #57176 into squid

* refs/pull/57176/head:
mds: move drop_locks to directly after rdonly check
qa: test quiesce.block is replicated
qa: test that ceph.dir.subvolume is replicated properly
mds: add debug "lock path" command
qa: move reqid_tostr helper
qa: return run_shell process for waiters

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
13 months ago Merge PR #57175 into squid
Patrick Donnelly [Wed, 22 May 2024 18:20:19 +0000 (14:20 -0400)]
Merge PR #57175 into squid

* refs/pull/57175/head:
qa: extend rank 1 lockup for test_quiesce_authpin_wait

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
13 months ago Merge PR #57171 into squid
Patrick Donnelly [Wed, 22 May 2024 18:19:46 +0000 (14:19 -0400)]
Merge PR #57171 into squid

* refs/pull/57171/head:
qa: increase debugging for snap_schedule

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
13 months ago Merge PR #57063 into squid
Patrick Donnelly [Wed, 22 May 2024 18:19:17 +0000 (14:19 -0400)]
Merge PR #57063 into squid

* refs/pull/57063/head:
qa: do not iterate list being modified
qa: remove unnecessary background job cleanup

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
13 months ago Merge PR #57062 into squid
Patrick Donnelly [Wed, 22 May 2024 18:18:41 +0000 (14:18 -0400)]
Merge PR #57062 into squid

* refs/pull/57062/head:
mds: use mds_cache_quiesce_decay_rate to init quiesce_counter

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
13 months ago Merge PR #57061 into squid
Patrick Donnelly [Wed, 22 May 2024 18:18:07 +0000 (14:18 -0400)]
Merge PR #57061 into squid

* refs/pull/57061/head:
qa: add missing pg_health fragment links in fs suite
qa: ignore PG health warnings in CephFS QA

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
13 months ago Merge PR #57203 into squid
Patrick Donnelly [Wed, 22 May 2024 18:06:45 +0000 (14:06 -0400)]
Merge PR #57203 into squid

* refs/pull/57203/head:
mds: do not try fragmenting or exporting a quiesced directory
mds: set/test ALL_LOCKED on fragment_dir request
mds: pass bypassfreezing to parent auth pin req
qa: add quiesce tests during fragmentation
qa: translate empty output from rank_tell to empty dict
qa: move reqid_tostr helper

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
13 months ago Merge PR #57202 into squid
Patrick Donnelly [Wed, 22 May 2024 18:06:04 +0000 (14:06 -0400)]
Merge PR #57202 into squid

* refs/pull/57202/head:
squid: mds/cache: don't assume non-auth xlocks to be remote locks

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
13 months ago Merge PR #57013 into squid
Patrick Donnelly [Wed, 22 May 2024 18:04:24 +0000 (14:04 -0400)]
Merge PR #57013 into squid

* refs/pull/57013/head:
mds/quiesce: don't take mirrored cap-related locks on the replica
mds/quiesce: xlock the file to let clients keep their buffered writes

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
13 months ago Merge PR #56984 into squid
Patrick Donnelly [Wed, 22 May 2024 18:03:59 +0000 (14:03 -0400)]
Merge PR #56984 into squid

* refs/pull/56984/head:
mds/quiesce: agent: avoid a race condition with rapid db updates

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
13 months ago Merge pull request #56478 from sseshasa/wip-65150-squid
Yuri Weinstein [Wed, 22 May 2024 14:39:42 +0000 (07:39 -0700)]
Merge pull request #56478 from sseshasa/wip-65150-squid

squid: common/LogEntry: Add log level to str helper for fmt::formatter<LogEntry>

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
13 months ago Merge pull request #56477 from sseshasa/wip-65151-squid
Yuri Weinstein [Wed, 22 May 2024 14:38:55 +0000 (07:38 -0700)]
Merge pull request #56477 from sseshasa/wip-65151-squid

squid: qa: Add benign cluster warning from ec-inconsistent-hinfo test to ignorelist

Reviewed-by: Laura Flores <lflores@redhat.com>
13 months ago Merge pull request #57514 from Matan-B/wip-56610-squid
Matan Breizman [Wed, 22 May 2024 11:29:11 +0000 (14:29 +0300)]
Merge pull request #57514 from Matan-B/wip-56610-squid

squid: crimson/osd/replicated_recovery_backend: Fix recovery obc usage

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57511 from Matan-B/wip-56844-squid
Matan Breizman [Wed, 22 May 2024 11:28:22 +0000 (14:28 +0300)]
Merge pull request #57511 from Matan-B/wip-56844-squid

squid: crimson/common/tri_mutex: make locking/promotion atomic if possible

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57510 from Matan-B/wip-56606-squid
Matan Breizman [Wed, 22 May 2024 11:28:00 +0000 (14:28 +0300)]
Merge pull request #57510 from Matan-B/wip-56606-squid

squid: crimson/osd/ops_executer: fix snap overlap range error

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57507 from Matan-B/wip-56848-squid
Matan Breizman [Wed, 22 May 2024 11:27:40 +0000 (14:27 +0300)]
Merge pull request #57507 from Matan-B/wip-56848-squid

squid: crimson/osd/recovery_backends: discard outdated recovery ops

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57505 from Matan-B/wip-56277-squid
Matan Breizman [Wed, 22 May 2024 11:26:15 +0000 (14:26 +0300)]
Merge pull request #57505 from Matan-B/wip-56277-squid

squid: crimson: Add support for pool compression

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57502 from Matan-B/wip-56511-squid
Matan Breizman [Wed, 22 May 2024 11:25:46 +0000 (14:25 +0300)]
Merge pull request #57502 from Matan-B/wip-56511-squid

squid: qa/suites/crimson-rados/thrash: enable chance_down

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57559 from ceph/wip-lusov-await-eperm-squid
Zac Dover [Wed, 22 May 2024 06:04:53 +0000 (16:04 +1000)]
Merge pull request #57559 from ceph/wip-lusov-await-eperm-squid

squid: mds/quiesce: db: quiesce-await should EPERM if a set is past QS_QUIESCED

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
13 months ago PendingReleaseNotes: add note on the client incompatibility health warning and featur... 57342/head
Patrick Donnelly [Fri, 3 May 2024 00:45:43 +0000 (20:45 -0400)]
PendingReleaseNotes: add note on the client incompatibility health warning and feature bit

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit e70f005f1b2f4ba5466d254ec4a6432297d3fbf4)

13 months ago doc/cephfs: add client_mds_auth_caps client feature bit
Patrick Donnelly [Fri, 3 May 2024 00:46:17 +0000 (20:46 -0400)]
doc/cephfs: add client_mds_auth_caps client feature bit

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 536b47cbfc669b5a3f04d93964408a2258d05ad0)

13 months ago doc/cephfs: add missing client feature bits
Patrick Donnelly [Fri, 3 May 2024 00:38:19 +0000 (20:38 -0400)]
doc/cephfs: add missing client feature bits

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 79ba8970d7cbc714e160d5957bd849eede93e5a3)

13 months ago doc/cephfs: document MDS_CLIENTS_BROKEN_ROOTSQUASH health error
Patrick Donnelly [Thu, 2 May 2024 23:33:50 +0000 (19:33 -0400)]
doc/cephfs: document MDS_CLIENTS_BROKEN_ROOTSQUASH health error

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit b810bc9c54515b69aefffb36f1a47235b3c9125d)

13 months ago qa: add tests for MDS_CLIENTS_BROKEN_ROOTSQUASH
Patrick Donnelly [Fri, 3 May 2024 00:52:29 +0000 (20:52 -0400)]
qa: add tests for MDS_CLIENTS_BROKEN_ROOTSQUASH

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 9d0ab233d822668e88c873bc1314e984feaf1296)

13 months ago mds: raise health warning if client lacks feature for root_squash
Patrick Donnelly [Fri, 3 May 2024 00:50:37 +0000 (20:50 -0400)]
mds: raise health warning if client lacks feature for root_squash

Rather than evict all clients lacking this feature bit, raise a health error
that pushes the administrator to address it. This avoids the surprise of having
all affected clients suddenly evicted in the cluster.

Fixes: https://tracker.ceph.com/issues/65733
Fixes: 954ed30
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 66ff5c9fc8d4664f18b2fa462e96e5548c35951f)

13 months ago mon/MDSMonitor: add note about missing metadata inclusion
Patrick Donnelly [Fri, 3 May 2024 00:49:22 +0000 (20:49 -0400)]
mon/MDSMonitor: add note about missing metadata inclusion

There is a "client_count" metadata on the health warning that apparently was
intended to be used for aggregating warnings but never was. Add a TODO item for
that.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 6517b704e311fd24dbf5bfbdec2ddd23b17d4092)

13 months ago mds: check relevant caps for fs include root_squash
Patrick Donnelly [Wed, 1 May 2024 01:41:14 +0000 (21:41 -0400)]
mds: check relevant caps for fs include root_squash

When denying client reconnects because the MDS caps include root_squash and the
client features do not include CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK, ensure those
caps are only for the file system the MDS is joined to.

Fixes: https://tracker.ceph.com/issues/65733
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit f79ae86f2c23388f6ecc3177764735e071998e09)

13 months ago mds: refactor out fs_name match in MDSAuthCaps
Patrick Donnelly [Thu, 2 May 2024 12:55:36 +0000 (08:55 -0400)]
mds: refactor out fs_name match in MDSAuthCaps

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 247b4fed28617c104473d1586b66a8735bff0411)

13 months ago qa: test for root_squash with multiple caps
Patrick Donnelly [Thu, 2 May 2024 01:08:57 +0000 (21:08 -0400)]
qa: test for root_squash with multiple caps

Where the client has root_squash for one cap but not for another. The fs
without root_squash should not necessarily reject the client.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit bccc8ceb471c441ec04d7eb2c353630f8c5ce843)

Conflicts:
qa/tasks/cephfs/test_admin.py: missing test

13 months ago qa: pass kwargs to mount from remount
Patrick Donnelly [Thu, 2 May 2024 02:06:54 +0000 (22:06 -0400)]
qa: pass kwargs to mount from remount

So we can pass mntargs.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit afcbfc040b56779e58563f715f26a0fe25e9f916)

13 months ago qa: simplify update_attrs and only update relevant keys
Patrick Donnelly [Thu, 2 May 2024 02:04:57 +0000 (22:04 -0400)]
qa: simplify update_attrs and only update relevant keys

So we can just pass the caller's kwargs to update_attrs.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 597ff3cb15e7a7ce527b35eb01d9958b755bbf01)

13 months ago client: allow overriding client features
Patrick Donnelly [Thu, 2 May 2024 00:51:59 +0000 (20:51 -0400)]
client: allow overriding client features

For testing purposes.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit d9239f9375c1ae92a4990950f40078766bd912e8)

13 months ago Merge pull request #57513 from Matan-B/wip-57129-squid
Matan Breizman [Tue, 21 May 2024 12:37:28 +0000 (15:37 +0300)]
Merge pull request #57513 from Matan-B/wip-57129-squid

squid: crimson/os/seastore/transaction_manager: fix write pipeline phase leak

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57509 from Matan-B/wip-56383-squid
Matan Breizman [Tue, 21 May 2024 12:28:12 +0000 (15:28 +0300)]
Merge pull request #57509 from Matan-B/wip-56383-squid

squid: crimson/os/seastore/btree: clean up `FixedKVLeafNode::get_logical_child()`

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57508 from Matan-B/wip-56875-squid
Matan Breizman [Tue, 21 May 2024 12:27:48 +0000 (15:27 +0300)]
Merge pull request #57508 from Matan-B/wip-56875-squid

squid: crimson/osd/osd_meta: load incremental osdmap from "inc_osdmap.XXX"

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57506 from Matan-B/wip-56353-squid
Matan Breizman [Tue, 21 May 2024 12:26:28 +0000 (15:26 +0300)]
Merge pull request #57506 from Matan-B/wip-56353-squid

squid: crimson/os/seastore: avoid new allocation when overwriting data in RBM for performance

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57504 from Matan-B/wip-56611-squid
Matan Breizman [Tue, 21 May 2024 12:25:24 +0000 (15:25 +0300)]
Merge pull request #57504 from Matan-B/wip-56611-squid

squid: crimson/osd/replicated_recovery_backend: prepare_pull use pg_info

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57503 from Matan-B/wip-56912-squid
Matan Breizman [Tue, 21 May 2024 12:24:57 +0000 (15:24 +0300)]
Merge pull request #57503 from Matan-B/wip-56912-squid

squid: crimson/common/operation: fix and move exit() after entering the next phase

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57501 from Matan-B/wip-56775-squid
Matan Breizman [Tue, 21 May 2024 12:24:21 +0000 (15:24 +0300)]
Merge pull request #57501 from Matan-B/wip-56775-squid

squid: crimson/osd: implement basic reactor-utilization stats report to log

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57500 from Matan-B/wip-56627-squid
Matan Breizman [Tue, 21 May 2024 12:23:44 +0000 (15:23 +0300)]
Merge pull request #57500 from Matan-B/wip-56627-squid

squid: crimson/os/seastore: alloc mapping with refcount when rewriting logical extents

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57499 from Matan-B/wip-55847-squid
Matan Breizman [Tue, 21 May 2024 12:23:11 +0000 (15:23 +0300)]
Merge pull request #57499 from Matan-B/wip-55847-squid

squid: crimson: convert some of client_request to use coroutines

Reviewed-by: Samuel Just <sjust@redhat.com>
13 months ago Merge pull request #57413 from neha-ojha/wip-squid-rc
Neha Ojha [Mon, 20 May 2024 21:35:42 +0000 (17:35 -0400)]
Merge pull request #57413 from neha-ojha/wip-squid-rc

squid: src/ceph_release: dev -> rc

Reviewed-by: Laura Flores <lflores@redhat.com>
13 months ago Merge pull request #57532 from rhcs-dashboard/wip-66092-squid
Adam King [Mon, 20 May 2024 17:50:37 +0000 (13:50 -0400)]
Merge pull request #57532 from rhcs-dashboard/wip-66092-squid

squid: mgr/k8sevents: update V1Events to CoreV1Events

Reviewed-by: Laura Flores <lflores@ibm.com>
13 months ago doc/dev/release-checklists.rst: squid milestone added 57413/head
Neha Ojha [Mon, 20 May 2024 16:53:01 +0000 (16:53 +0000)]
doc/dev/release-checklists.rst: squid milestone added

Signed-off-by: Neha Ojha <nojha@redhat.com>
13 months ago .github/milestone.yml: add squid
Neha Ojha [Fri, 10 May 2024 20:59:25 +0000 (20:59 +0000)]
.github/milestone.yml: add squid

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit bca45c7df2c26b105401b299161f553ffe4e32c1)

14 months ago mds/quiesce: db: quiesce-await should EPERM if a set is past QS_QUIESCED 57559/head
Leonid Usov [Fri, 26 Apr 2024 00:20:42 +0000 (03:20 +0300)]
mds/quiesce: db: quiesce-await should EPERM if a set is past QS_QUIESCED

Fixes: https://tracker.ceph.com/issues/65669
Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit fd88b52d6fb11c206b3a06e8533c4551c902f173)
Fixes: https://tracker.ceph.com/issues/66034
14 months ago Merge pull request #57547 from zdover23/wip-doc-2024-05-19-backport-57542-to-squid
Zac Dover [Mon, 20 May 2024 06:12:48 +0000 (16:12 +1000)]
Merge pull request #57547 from zdover23/wip-doc-2024-05-19-backport-57542-to-squid

doc/cephfs: Squid and later - subvolume quiesce

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
14 months ago Merge pull request #57540 from zdover23/wip-doc-2024-05-18-backport-57534-to-squid
Zac Dover [Mon, 20 May 2024 06:11:52 +0000 (16:11 +1000)]
Merge pull request #57540 from zdover23/wip-doc-2024-05-18-backport-57534-to-squid

squid: doc/cephfs: edit fs-volumes.rst (2 of x)

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
14 months ago doc/cephfs: Squid and later - subvolume quiesce 57547/head
Zac Dover [Sun, 19 May 2024 00:00:29 +0000 (10:00 +1000)]
doc/cephfs: Squid and later - subvolume quiesce

Add a note to the "Subvolume quiesce" section that says that the
information in the section applies only to the Squid and later releases
of Ceph. This is included here so that I don't overwrite the Reef and
Quincy documentation with irrelevant information, and so that I don't
overwrite the Squid information with blank space where the "Subvolume
quiesce" section should be.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit be63ca6d037a954caacb3b9737332e886a1f762f)

14 months ago doc/cephfs: edit fs-volumes.rst (2 of x) 57540/head
Zac Dover [Fri, 17 May 2024 10:46:28 +0000 (20:46 +1000)]
doc/cephfs: edit fs-volumes.rst (2 of x)

Edit doc/cephfs/fs-volumes up to (but not including) the section "Cloning
Snapshots".

Follows https://github.com/ceph/ceph/pull/57415

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 0a3981a011d5e768ab8a5782884283b6650af28a)

14 months ago Merge pull request #57520 from cbodley/wip-66069-squid
Casey Bodley [Fri, 17 May 2024 16:09:06 +0000 (17:09 +0100)]
Merge pull request #57520 from cbodley/wip-66069-squid

squid: cmake: disable WITH_QATLIB/ZIP on non-x86

Reviewed-by: Ken Dreyer <kdreyer@ibm.com>
14 months ago mgr/k8sevents: update V1Events to CoreV1Events 57532/head
Nizamudeen A [Fri, 3 May 2024 08:56:19 +0000 (14:26 +0530)]
mgr/k8sevents: update V1Events to CoreV1Events

centos9 only provides kubernetes 26.1.0 as a base dep, and hence the
k8sevents code needs to be updated accordingly. The API changes happened
in kubernetes when 19.0.0 was released.

Fixes: https://tracker.ceph.com/issues/65627
Fixes: https://tracker.ceph.com/issues/64981
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 6af964719217d720e6c2fd1ba2a607f6255d2604)

14 months ago cmake: disable WITH_QATLIB/ZIP on non-x86 57520/head
Ken Dreyer [Tue, 14 May 2024 18:53:51 +0000 (14:53 -0400)]
cmake: disable WITH_QATLIB/ZIP on non-x86

This feature is only relevant to x86 hosts.

Signed-off-by: Ken Dreyer <kdreyer@ibm.com>
Fixes: https://tracker.ceph.com/issues/66016
(cherry picked from commit 487cd2fddbab784269af9f48206a130e63f1eca3)

14 months ago crimson/osd/replicated_recovery_backend: don't resolve_oid on recovery 57514/head
Matan Breizman [Mon, 22 Apr 2024 08:29:22 +0000 (08:29 +0000)]
crimson/osd/replicated_recovery_backend: don't resolve_oid on recovery

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 48be64b686dee418595210733dbbe6411402adc8)

14 months ago crimson/osd/object_context_loader: SnapTrim to not resolve_oid
Matan Breizman [Mon, 8 Apr 2024 07:52:20 +0000 (07:52 +0000)]
crimson/osd/object_context_loader: SnapTrim to not resolve_oid

SnapTrimObjSubEvent::remove_or_update partially resolves the to-be-trimmed
clone, taking in_removed_snaps_queue into account.
The general resolve_oid is not suitable for this scenario.
Specifically, the following check:
```
    if (std::find(
      citer->second.begin(),
      citer->second.end(),
      oid.snap) == citer->second.end()) {
      logger().debug("{} {} does not contain {} -- DNE",
                     __func__, ss.clone_snaps, oid.snap);
      return std::nullopt;
    }
```
fails because of the earlier snap_map_modify call.

Example:
```
INFO  2024-04-07 13:44:01,118 [shard 0:main] osd - SnapTrimObjSubEvent(coid=2:e8855410:::folio011816418-576:8 snapid=8): 2:e8855410:::folio011816418-576:8 snaps [8, 7] -> {7}
DEBUG 2024-04-07 13:44:01,118 [shard 0:main] osd - snap_map_modify: soid 2:e8855410:::folio011816418-576:8, snaps {7}
...
This case will fail:
INFO  2024-04-07 13:44:04,139 [shard 0:main] osd - SnapTrimObjSubEvent(coid=2:e8855410:::folio011816418-576:8 snapid=7): 2:e8855410:::folio011816418-576:8 snaps [7] -> {} ... deleting
DEBUG 2024-04-07 13:44:04,139 [shard 0:main] osd - snap_map_remove: soid 2:e8855410:::folio011816418-576:8
```
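
A minimal, standalone sketch of why the quoted check reports DNE in the example above. The `clone_snaps_t` alias and the `resolve_clone` helper are hypothetical simplifications for illustration, not the crimson implementation; only the `std::find` test is taken from the commit message.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

using snapid_t = std::uint64_t;

// Hypothetical stand-in for the snapset's clone_snaps:
// clone id -> snaps that the clone still covers.
using clone_snaps_t = std::map<snapid_t, std::vector<snapid_t>>;

// Mirrors the quoted check: the clone is treated as non-existent (DNE)
// when the requested snap is no longer listed for it.
std::optional<snapid_t> resolve_clone(const clone_snaps_t& clone_snaps,
                                      snapid_t clone, snapid_t snap) {
  auto citer = clone_snaps.find(clone);
  if (citer == clone_snaps.end()) {
    return std::nullopt;
  }
  if (std::find(citer->second.begin(), citer->second.end(), snap) ==
      citer->second.end()) {
    return std::nullopt;  // "does not contain ... -- DNE"
  }
  return clone;
}
```

Under this reading, clone 8 initially covers snaps [8, 7], so `resolve_clone(cs, 8, 8)` succeeds. Once the first trim leaves {7}, the same call returns `std::nullopt`, which is the situation the second trim runs into.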

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 263b2ae77ac1df8a939b82101eca09f1d8d4089a)

14 months ago crimson/osd/object_context_loader: explicit with_head_obc call
Matan Breizman [Sun, 7 Apr 2024 12:25:09 +0000 (12:25 +0000)]
crimson/osd/object_context_loader: explicit with_head_obc call

No change in behavior, improved readability

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 09537e174f20722626da35c584eb6026de79b8a6)

14 months ago crimson/osd/object_context_loader: Simplify with_obc
Matan Breizman [Sun, 7 Apr 2024 12:23:56 +0000 (12:23 +0000)]
crimson/osd/object_context_loader: Simplify with_obc

No change in behavior

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 47af391c56e241e612efb656811d587cd309c82f)

14 months ago crimson/osd/object_context_loader: cleanup with_clone_obc_direct
Matan Breizman [Wed, 3 Apr 2024 08:50:55 +0000 (08:50 +0000)]
crimson/osd/object_context_loader: cleanup with_clone_obc_direct

ObjectContextLoader interface provides two variants:

* with_obc:
  // Use this variant by default
  // If oid is a clone object, the clone obc *and* its
  // matching head obc will be locked and can be used in func.

* with_clone_obc_only:
  // Use this variant in the case where the head object
  // obc is already locked and only the clone obc is needed.
  // Avoid nesting with_head_obc() calls by using with_clone_obc()
  // with an already locked head.

The with_clone_obc_direct variant is equivalent to with_obc on a clone obc,
since both the head and the clone obcs will be locked and can be used.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 9a27c95b958a6f5b6c52087e8d05899dc465befe)

14 months ago crimson/osd/object_context_loader: add comment to with_head_obc
Matan Breizman [Mon, 1 Apr 2024 08:20:14 +0000 (08:20 +0000)]
crimson/osd/object_context_loader: add comment to with_head_obc

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 7e208d97e83e780bb939391df674c231a5f2cad8)

14 months ago crimson/osd/object_context_loader: fix with_clone_obc on resolve_oid case
Matan Breizman [Sun, 7 Apr 2024 09:38:06 +0000 (09:38 +0000)]
crimson/osd/object_context_loader: fix with_clone_obc on resolve_oid case

resolve_oid on a clone object may actually return the head:
```
    // Because oid.snap > ss.seq, we are trying to read from a snapshot
    // taken after the most recent write to this object. Read from head.
```

In this case, with_clone_obc should apply `func` the same way with_head_obc would have.
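
As a rough illustration of the quoted rule, assuming a reduced model in which only the requested snap and the snapset's `seq` matter. The `resolved` enum and the `resolve` function are hypothetical names, not the crimson API:

```cpp
#include <cstdint>

using snapid_t = std::uint64_t;

// Hypothetical, reduced outcome of resolving an object id.
enum class resolved {
  head,           // serve the request from the head object
  search_clones,  // continue by looking for a matching clone
};

// Per the quoted comment: a snap newer than the snapset's seq was taken
// after the most recent write to the object, so the read goes to head.
resolved resolve(snapid_t requested_snap, snapid_t snapset_seq) {
  return requested_snap > snapset_seq ? resolved::head
                                      : resolved::search_clones;
}
```

A caller in the position of with_clone_obc would branch on the `head` outcome and run `func` on the head obc alone, rather than treating the head obc as a clone.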

Note: previously, with_clone_obc_only was called on the resolved head object.
While it didn't cause any errors, using the head_obc as clone is wrong.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit d8aad5576f67c0e255b42c5d26334c87ba705931)

14 months ago crimson/osd/object_context_loader: clones to support ssc
Matan Breizman [Wed, 3 Apr 2024 08:56:56 +0000 (08:56 +0000)]
crimson/osd/object_context_loader: clones to support ssc

Previously, only the head obc had an ssc reference. Let the clone obc
also reference its head ssc.

Fixes: https://tracker.ceph.com/issues/65203
Fixes: https://tracker.ceph.com/issues/65201
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 5de3da0edaeb3b415cfc9359efb06cd0d7fb58d0)

14 months ago crimson/os/seastore/transaction_manager: fix write pipeline phase leak 57513/head
Xuehan Xu [Wed, 24 Apr 2024 09:00:53 +0000 (17:00 +0800)]
crimson/os/seastore/transaction_manager: fix write pipeline phase leak

At present, if a transaction gets interrupted right after it enters
WritePipeline::ReserveProjectedUsage and before any later continuations
get executed, WritePipeline::ReserveProjectedUsage will be locked
forever.

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
(cherry picked from commit 6a6f340c09a8dfe4565e298db11a30345ef7f82f)

14 months ago crimson/common/tri_mutex: make promotion atomic with func 57511/head
Yingxin Cheng [Mon, 15 Apr 2024 02:03:35 +0000 (10:03 +0800)]
crimson/common/tri_mutex: make promotion atomic with func

Specifically, make promotion atomic with load-obc to fix
assert(readers/writers == 1) failures.

Fixes: https://tracker.ceph.com/issues/65451
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
(cherry picked from commit 676947e1f73a9d7f900df5a6854c283c4eef24a9)

14 months ago crimson/common/tri_mutex: use seastar::now()
Yingxin Cheng [Fri, 19 Apr 2024 06:28:33 +0000 (14:28 +0800)]
crimson/common/tri_mutex: use seastar::now()

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
(cherry picked from commit 26b96f93bb1922e1f430ae0480aafa8c7ee0d38a)

14 months ago crimson/common/tri_mutex: improve comments
Yingxin Cheng [Mon, 15 Apr 2024 02:02:48 +0000 (10:02 +0800)]
 crimson/common/tri_mutex: improve comments

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
(cherry picked from commit 8eb9f0380f4328c672cbde9d3e1971b730695d73)

14 months ago crimson/common/tri_mutex: drop the unused greedy param
Yingxin Cheng [Thu, 21 Mar 2024 01:43:34 +0000 (09:43 +0800)]
crimson/common/tri_mutex: drop the unused greedy param

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
(cherry picked from commit 9cdb67fa19ea7d912a5f3e7a14092a205dee80ac)

14 months ago crimson/osd/ops_executer: fix snap overlap range error 57510/head
junxiang Mu [Mon, 1 Apr 2024 07:00:14 +0000 (03:00 -0400)]
crimson/osd/ops_executer: fix snap overlap range error

Fixes: https://tracker.ceph.com/issues/65113
Signed-off-by: junxiang Mu <1948535941@qq.com>
(cherry picked from commit 7eca779627a90dc80f54957cc49b25b4c965044d)

14 months ago crimson/os/seastore/btree: clean up `FixedKVLeafNode::get_logical_child()` 57509/head
Xuehan Xu [Fri, 22 Mar 2024 07:22:46 +0000 (15:22 +0800)]
crimson/os/seastore/btree: clean up `FixedKVLeafNode::get_logical_child()`

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
(cherry picked from commit 6e4f52a8ff93484d156f1d9095608a55e9c46aed)

14 months ago crimson/osd/osd_meta: load incremental osdmap from "inc_osdmap.XXX" 57508/head
Xuehan Xu [Sun, 14 Apr 2024 07:20:31 +0000 (15:20 +0800)]
crimson/osd/osd_meta: load incremental osdmap from "inc_osdmap.XXX"

Fixes: https://tracker.ceph.com/issues/65474
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
(cherry picked from commit 69271aef7002d0a76b43e8d1fc0667355119823e)

14 months ago crimson/osd/recovery_backends: discard outdated recovery ops 57507/head
Xuehan Xu [Fri, 12 Apr 2024 06:01:16 +0000 (14:01 +0800)]
crimson/osd/recovery_backends: discard outdated recovery ops

Fixes: https://tracker.ceph.com/issues/65453
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
(cherry picked from commit 234f41c33917e921b53c091af7b3fcbf0f141b4a)

14 months ago crimson/seastore: add a TODO comment regarding is_data_stable() 57506/head
myoungwon oh [Fri, 19 Apr 2024 04:57:52 +0000 (04:57 +0000)]
crimson/seastore: add a TODO comment regarding is_data_stable()

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
(cherry picked from commit 5df6ffc79b415def0b7984096413e9562e83399a)

14 months ago common/options: correct explanation regarding delta_based_overwrite
myoungwon oh [Thu, 18 Apr 2024 11:30:45 +0000 (11:30 +0000)]
common/options: correct explanation regarding delta_based_overwrite

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
(cherry picked from commit 387c0c89992b5a28a7c5bba9f720a951aee1673b)

14 months ago test/crimson/seastore: rename set/unset_overwrite_threshold() to enable/disable_delta...
myoungwon oh [Tue, 9 Apr 2024 06:31:12 +0000 (06:31 +0000)]
test/crimson/seastore: rename set/unset_overwrite_threshold() to enable/disable_delta_based_overwrite()

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
(cherry picked from commit 24c0c198a5dd57070f210dc69dd83481df73fa15)

14 months ago crimson/os/seastore: avoid new allocation when overwriting data in RBM for performance
myoungwon oh [Thu, 21 Mar 2024 02:06:24 +0000 (02:06 +0000)]
crimson/os/seastore: avoid new allocation when overwriting data in RBM for performance

In a 4K random write test, after seastore is filled up with 4MB extents,
the current implementation performs a deep copy in duplicate_for_write(),
resulting in a significant performance degradation of 80%.
Therefore, this commit changes the deep copy of bufferptr during overwrite
to a shallow copy, leaving the original data untouched.

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
(cherry picked from commit 76b87855b4a6fb60a1fae59792aeebff3e8762d3)

14 months ago crimson: Add support for pool compression 57505/head
Aishwarya Mathuria [Wed, 31 Jan 2024 09:20:44 +0000 (09:20 +0000)]
crimson: Add support for pool compression

1. Send pool options to Bluestore, including the compression options.
2. Add pool-related stats to MPGStats so that all compression details are available in the 'ceph df detail' command.

Fixes: https://tracker.ceph.com/issues/59242
Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
(cherry picked from commit ba4f62c49ecee26d98100bb5cdb15ecfe212f0be)

14 months ago crimson/osd/replicated_recovery_backend: prepare_pull to use pg_info 57504/head
Matan Breizman [Mon, 1 Apr 2024 08:58:48 +0000 (08:58 +0000)]
crimson/osd/replicated_recovery_backend: prepare_pull to use pg_info

Don't use peer's info on prepare_pull.

Fixes: https://tracker.ceph.com/issues/65200
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 7586abfad239d433049fb714d6fb8f2530a1b9c6)

14 months ago crimson/common/operation: fix and move exit() after entering the next phase 57503/head
Yingxin Cheng [Tue, 16 Apr 2024 07:53:47 +0000 (15:53 +0800)]
crimson/common/operation: fix and move exit() after entering the next phase

If we exit/unlock the barrier before entering the next phase, the next
request can exit the barrier at the same time and enter the next phase
first, causing reordering issues.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
(cherry picked from commit 078768ff961abad12cf3e7f19190a6f986fc5fce)
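
The reordering described above can be reproduced in a toy model. This is a sketch under stated assumptions, not the crimson operation/pipeline API: `Stage` and `run` are hypothetical, the stage holds at most one waiter, and a waiter is assumed to run as soon as the stage is released.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical toy stage: one holder at a time, at most one queued waiter.
struct Stage {
  bool held = false;
  std::function<void()> waiter;

  void enter(std::function<void()> on_entered) {
    if (!held) {
      held = true;
      on_entered();
    } else {
      waiter = std::move(on_entered);
    }
  }

  void exit() {
    held = false;
    if (waiter) {
      auto next = std::move(waiter);
      waiter = nullptr;
      held = true;
      next();  // assumption: the waiter runs immediately on release
    }
  }
};

// Returns the order in which requests 1 and 2 reach the second stage.
std::vector<int> run(bool exit_first) {
  Stage first;
  std::vector<int> second_stage_order;

  // Moves request `id` from the first stage to the second one.
  auto advance = [&](int id) {
    if (exit_first) {
      first.exit();                      // the waiter may overtake us here
      second_stage_order.push_back(id);  // we enter the next phase too late
    } else {
      second_stage_order.push_back(id);  // enter the next phase first
      first.exit();                      // then release the previous one
    }
  };

  first.enter([] {});                // request 1 holds the first stage
  first.enter([&] { advance(2); });  // request 2 queues behind it
  advance(1);                        // request 1 moves on
  return second_stage_order;
}
```

In this model, `run(true)` lets request 2 reach the second stage before request 1, while `run(false)` preserves the submission order.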

14 months agoqa/suites/crimson-rados/thrash: enable chance_down 57502/head
Matan Breizman [Wed, 27 Mar 2024 08:37:40 +0000 (08:37 +0000)]
qa/suites/crimson-rados/thrash: enable chance_down

As the thrash tests were introduced, some options were disabled
until the tests are stabilized.

Re-enable chance_down option (default is 0.4) to detect bugs on restart.

Since it will probably take a few iterations before the thrash and recovery tests ('default.yaml')
pass successfully, add another 'simple.yaml' which should remain stable.

Fixes: https://tracker.ceph.com/issues/65130
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 372751789be1725ed06fd077d029295c8cab4c35)

14 months agocrimson/osd: implement basic reactor-utilization stats report to log 57501/head
Yingxin Cheng [Tue, 9 Apr 2024 03:06:36 +0000 (11:06 +0800)]
crimson/osd: implement basic reactor-utilization stats report to log

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
(cherry picked from commit 99909d45a282fb9dc89c6e8c98f4b866e67b09cb)

14 months agocrimson/os/seastore: alloc mapping with refcount when rewriting logical extents 57500/head
Zhang Song [Wed, 3 Apr 2024 09:21:43 +0000 (17:21 +0800)]
crimson/os/seastore: alloc mapping with refcount when rewriting logical extents

Signed-off-by: Zhang Song <zhangsong02@qianxin.com>
(cherry picked from commit 9269e68eacc825ccbe023e325c405bc0636d6b58)

14 months agocrimson/os/seastore/btree_lba_manager: update_refcount returns the refcount of interm...
Zhang Song [Wed, 3 Apr 2024 09:10:31 +0000 (17:10 +0800)]
crimson/os/seastore/btree_lba_manager: update_refcount returns the refcount of intermediate mapping

Signed-off-by: Zhang Song <zhangsong02@qianxin.com>
(cherry picked from commit 513bb4f3f21b739ea3f751680716b0fbcda9c3e1)

14 months agocrimson/os/seastore/btree_lba_manager: cleanup methods that return std::pair
Zhang Song [Wed, 3 Apr 2024 09:07:52 +0000 (17:07 +0800)]
crimson/os/seastore/btree_lba_manager: cleanup methods that return std::pair

Signed-off-by: Zhang Song <zhangsong02@qianxin.com>
(cherry picked from commit 36f96ca000ab8839b7649c32f6d53ea88f9d9700)

14 months agocrimson/os/seastore: introduce extent_ref_count_t
Zhang Song [Wed, 3 Apr 2024 08:27:50 +0000 (16:27 +0800)]
crimson/os/seastore: introduce extent_ref_count_t

Signed-off-by: Zhang Song <zhangsong02@qianxin.com>
(cherry picked from commit bbee55a105ffb31d01c9a024325b8eb06d8f56e7)

14 months agocrimson/os/seastore: remove unused return value of RecordScanner::scan_valid_records
Zhang Song [Wed, 3 Apr 2024 08:02:22 +0000 (16:02 +0800)]
crimson/os/seastore: remove unused return value of RecordScanner::scan_valid_records

Signed-off-by: Zhang Song <zhangsong02@qianxin.com>
(cherry picked from commit 1a914f64070299ab5712ce389096052709b72622)

14 months agocrimson/common/interruptible_future: add discard_result
Zhang Song [Wed, 3 Apr 2024 08:01:59 +0000 (16:01 +0800)]
crimson/common/interruptible_future: add discard_result

Signed-off-by: Zhang Song <zhangsong02@qianxin.com>
(cherry picked from commit 870441bb0c02992923ac6e0b6373b6d3770b470b)

14 months agocrimson/.../client_request: work around gcc bz98401 57499/head
Samuel Just [Fri, 9 Feb 2024 19:49:13 +0000 (11:49 -0800)]
crimson/.../client_request: work around gcc bz98401

See included comment for details.

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 1388173764408d047f2fbd5cf7b33dcf42971a03)

14 months agocrimson/.../client_request: work around gcc bz101244 and bz102217
Samuel Just [Fri, 9 Feb 2024 02:45:43 +0000 (18:45 -0800)]
crimson/.../client_request: work around gcc bz101244 and bz102217

See included comment for details.

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 90dcbfa0fa9d8a46f8075a7e6f2329652f0a312f)

14 months agocrimson/.../client_request: convert ClientRequest::do_process to coroutine
Samuel Just [Wed, 7 Feb 2024 02:05:42 +0000 (02:05 +0000)]
crimson/.../client_request: convert ClientRequest::do_process to coroutine

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 30237472bea485cb57f5c14031fb545aed97da15)

14 months agocrimson/.../client_request: convert process_op to coroutine
Samuel Just [Wed, 7 Feb 2024 01:19:37 +0000 (17:19 -0800)]
crimson/.../client_request: convert process_op to coroutine

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit a4905fb3e2f2400656f0b436275e9fa745f14cc1)

14 months agocrimson/.../client_request: make reply_op_error return interruptible_future<>
Samuel Just [Wed, 7 Feb 2024 01:19:13 +0000 (17:19 -0800)]
crimson/.../client_request: make reply_op_error return interruptible_future<>

co_await can't implicitly convert it.

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit ecf5795b6666f122fa3bf2771e83b066e611ade9)

14 months agocrimson/.../client_request: convert ClientRequest::process_pg_op to coroutine
Samuel Just [Wed, 7 Feb 2024 00:32:38 +0000 (00:32 +0000)]
crimson/.../client_request: convert ClientRequest::process_pg_op to coroutine

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 0de3613072659cfdcb41ea7b1df5865425774b3b)

14 months agocrimson/.../client_request: convert with_pg_process_interruptible to coroutine
Samuel Just [Mon, 5 Feb 2024 20:46:19 +0000 (20:46 +0000)]
crimson/.../client_request: convert with_pg_process_interruptible to coroutine

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 904a9937b6560f9a1f936f7f79badc56d8949dd1)

14 months agocrimson/.../client_request: factor out with_pg_interruptible
Samuel Just [Mon, 5 Feb 2024 00:04:49 +0000 (16:04 -0800)]
crimson/.../client_request: factor out with_pg_interruptible

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit b521cc3d822c39084f72c4f22f621928a184d731)

14 months agocrimson/.../client_request: don't pass Ref<PG> by reference
Samuel Just [Tue, 6 Feb 2024 04:46:57 +0000 (20:46 -0800)]
crimson/.../client_request: don't pass Ref<PG> by reference

If we only need a reference to the PG, pass a PG&.  Passing Ref<PG>&
makes it easy to inadvertently std::move() the passed value from
a caller.

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 4164b08887c2b2c7a9df0cd80a681bb61be28ee6)
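
The hazard can be shown with a small sketch. It assumes `std::shared_ptr` as a stand-in for `Ref<PG>`; the `PG` struct and the functions `takes_handle` and `uses_pg` are hypothetical and are not taken from the crimson sources.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical minimal PG type, for illustration only.
struct PG {
  int id = 0;
};

// Stand-in for Ref<PG>; an assumption made for this sketch.
using PGRef = std::shared_ptr<PG>;

// Risky signature: a non-const reference to the smart pointer lets the
// callee move from it, silently emptying the caller's handle.
PGRef takes_handle(PGRef& pg) {
  return std::move(pg);
}

// Safer signature: the callee can use the PG but has no access to the
// caller's handle, so it cannot move from it.
int uses_pg(PG& pg) {
  return pg.id;
}
```

After a call to `takes_handle(pg)`, the caller's `pg` is empty even though nothing at the call site suggests it. A call to `uses_pg(*pg)` leaves the caller's handle intact.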

14 months agocrimson/.../client_request: rename with_pg_int to with_pg_process
Samuel Just [Sun, 4 Feb 2024 22:29:38 +0000 (14:29 -0800)]
crimson/.../client_request: rename with_pg_int to with_pg_process

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 13042dde1134983f074f5bd4cf4c912d47d0193c)

14 months agocrimson/.../client_request.cc: move message decode check to with_pg
Samuel Just [Sun, 4 Feb 2024 22:28:46 +0000 (14:28 -0800)]
crimson/.../client_request.cc: move message decode check to with_pg

We only need to do this once; there is no need to recheck on requeue.

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 10a9a11f5d535044f081964dc6b79813aa9bb624)