git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
6 months agoos/bluestore: Fix BlueFS::truncate() 61314/head
Adam Kupczyk [Fri, 10 Jan 2025 08:26:54 +0000 (08:26 +0000)]
os/bluestore: Fix BlueFS::truncate()

In `struct bluefs_fnode_t` there is a vector `extents` and
the vector `extents_index` that is a log2 seek cache.

Until modifications to truncate() we never removed extents from files.
Modified truncate() did not update extents_index.

For example, a file with 10 extents, when truncated to 0, will have:
0 extents, 10 extents_index.
After writing some data to file:
1 extents, 11 extents_index.

Now, `bluefs_fnode_t::seek` will binary search extents_index;
let's say it located the seek position at item #3.
It will then jump from extent #0 (which exists) to extent #3, which
does not exist at all.
The worst part is that the code is now broken, as #3 != extents.end().

There are 3 parts of the fix:
1) assert in `bluefs_fnode_t::seek` to protect against
   jumping outside extents
2) code in BlueFS::truncate to sync up `extents_index` with `extents`
3) toning down the assert in _replay to give a way out of cases
   where an incorrect "offset 12345" (12345 is the file size) instead of
   "offset 20000" (allocations occupied) was written to the log.

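The failure mode described above can be reproduced with a small Python model. This is only a sketch under stated assumptions: `FNode`, `truncate_buggy`, `truncate_fixed` and the "number of extents to keep" argument are hypothetical stand-ins, the real `bluefs_fnode_t` is C++ and its index layout may differ, and an explicit exception plays the role of the new assert in `seek`.

```python
from bisect import bisect_right


class FNode:
    """Toy model of a file node; NOT the real bluefs_fnode_t."""

    def __init__(self):
        self.extents = []        # extent lengths, in file order
        self.extents_index = []  # file offset at which each extent starts
        self.allocated = 0       # total bytes covered by extents

    def append_extent(self, length):
        self.extents_index.append(self.allocated)
        self.extents.append(length)
        self.allocated += length

    def truncate_buggy(self, keep):
        # Drops extents but leaves extents_index untouched (the bug).
        del self.extents[keep:]
        self.allocated = sum(self.extents)

    def truncate_fixed(self, keep):
        # Part 2 of the fix: keep extents_index in sync with extents.
        del self.extents[keep:]
        del self.extents_index[keep:]
        self.allocated = sum(self.extents)

    def seek(self, offset):
        # Binary search the index, then walk forward extent by extent.
        i = bisect_right(self.extents_index, offset)
        pos = 0
        if i > 0:
            pos = i - 1
            offset -= self.extents_index[pos]
        # Part 1 of the fix: refuse to jump outside extents.
        if pos > len(self.extents):
            raise IndexError("seek jumped outside extents")
        while pos < len(self.extents) and offset >= self.extents[pos]:
            offset -= self.extents[pos]
            pos += 1
        return pos, offset


def build(truncate_name):
    # 10 extents, truncate to 0 extents, then write one new extent.
    node = FNode()
    for _ in range(10):
        node.append_extent(100)
    getattr(node, truncate_name)(0)
    node.append_extent(1000)
    return node
```

With `build("truncate_buggy")` the model ends up with 1 extent and 11 index entries, and `seek(350)` lands on index item #3 and trips the guard. With `build("truncate_fixed")` the same seek stays inside extent #0.
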
Fixes: https://tracker.ceph.com/issues/69481
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
6 months agoos/bluestore: bluefs unittest for truncate bug
Adam Kupczyk [Fri, 10 Jan 2025 10:07:18 +0000 (10:07 +0000)]
os/bluestore: bluefs unittest for truncate bug

Unittest showing 2 different flavours of problems:
1) bluefs log corruption
2) bluefs sigsegv

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
6 months agoMerge pull request #61254 from kamoltat/wip-ksirivad-fix-stretch-mode-doc
Kamoltat (Junior) Sirivadhna [Fri, 10 Jan 2025 02:53:59 +0000 (09:53 +0700)]
Merge pull request #61254 from kamoltat/wip-ksirivad-fix-stretch-mode-doc

doc/rados/operations/stretch-mode: Improve doc
Reviewed-by: zdover23
Reviewed-by: anthonyeleven
6 months agoMerge pull request #61288 from adamemerson/wip-69303
Adam Emerson [Fri, 10 Jan 2025 00:57:04 +0000 (19:57 -0500)]
Merge pull request #61288 from adamemerson/wip-69303

rgw: Don't crash on exceptions from pool listing

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 months agorgw: Don't crash on exceptions from pool listing 61288/head
Adam Emerson [Thu, 9 Jan 2025 16:46:32 +0000 (11:46 -0500)]
rgw: Don't crash on exceptions from pool listing

Fixes: https://tracker.ceph.com/issues/69303
Signed-off-by: Adam Emerson <aemerson@redhat.com>
6 months agoMerge pull request #60330 from JonBailey1993/JonBailey1993/ceph_test_rados_io_sequenc...
Jon Bailey [Thu, 9 Jan 2025 16:19:54 +0000 (16:19 +0000)]
Merge pull request #60330 from JonBailey1993/JonBailey1993/ceph_test_rados_io_sequence_inject_error

common/io_exerciser: Add support to ceph_test_rados_io_sequence injecting errors for testing how erasure coding deals with error scenarios

Reviewed-by: Ronen Friedman <rfriedma@ibm.com>
6 months agoMerge pull request #52791 from clwluvw/location-constraint
Casey Bodley [Thu, 9 Jan 2025 16:03:50 +0000 (11:03 -0500)]
Merge pull request #52791 from clwluvw/location-constraint

rgw: check for location constraint on master zonegroup

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 months agoMerge pull request #59960 from smanjara/wip-fix-missing-http-data
Casey Bodley [Thu, 9 Jan 2025 16:03:26 +0000 (11:03 -0500)]
Merge pull request #59960 from smanjara/wip-fix-missing-http-data

rgw/multisite: include request body when CreateBucket op is forwarded to master

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 months agoMerge pull request #60881 from NitzanMordhai/wip-nitzan-ipv6-subnet-checks
Laura Flores [Thu, 9 Jan 2025 15:48:20 +0000 (09:48 -0600)]
Merge pull request #60881 from NitzanMordhai/wip-nitzan-ipv6-subnet-checks

common/pick_address: Add IPv6 support to is_addr_in_subnet

6 months agoMerge pull request #60987 from drunkard/main
David Galloway [Thu, 9 Jan 2025 15:43:37 +0000 (10:43 -0500)]
Merge pull request #60987 from drunkard/main

doc: fixes in doc building

6 months agoMerge pull request #60883 from xxhdx1985126/wip-crimson-backfill-throttle
Matan Breizman [Thu, 9 Jan 2025 14:59:44 +0000 (16:59 +0200)]
Merge pull request #60883 from xxhdx1985126/wip-crimson-backfill-throttle

crimson/osd/pg_recovery: throttle backfills together with pglog based recoveries

Reviewed-by: Samuel Just <sjust@redhat.com>
6 months agoMerge pull request #61262 from VallariAg/fix-nvmeof-teuthology-basic-test
Vallari Agrawal [Thu, 9 Jan 2025 14:00:49 +0000 (19:30 +0530)]
Merge pull request #61262 from VallariAg/fix-nvmeof-teuthology-basic-test

qa/workunits/nvmeof/basic_tests.sh: fix nvme list assert

6 months agoMerge pull request #60623 from piyushagarwal1411/ceph-exporter
Ilya Dryomov [Thu, 9 Jan 2025 13:59:25 +0000 (14:59 +0100)]
Merge pull request #60623 from piyushagarwal1411/ceph-exporter

vstart: add ceph-exporter support

Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
6 months agoMerge pull request #60891 from xxhdx1985126/wip-seastore-fadvise-backfill
Yingxin Cheng [Thu, 9 Jan 2025 10:08:35 +0000 (18:08 +0800)]
Merge pull request #60891 from xxhdx1985126/wip-seastore-fadvise-backfill

crimson/os/seastore: add fadvise support to SeaStore and prevent recovery/backfill from polluting the cache of SeaStore

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
6 months agodoc/rados/operations/stretch-mode: Improve doc 61254/head
Kamoltat Sirivadhna [Tue, 7 Jan 2025 09:36:03 +0000 (09:36 +0000)]
doc/rados/operations/stretch-mode: Improve doc

Added more content and rewrote some sections

Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
6 months agoMerge pull request #60848 from cbodley/wip-rgw-deprecate-iam-tenant
Casey Bodley [Wed, 8 Jan 2025 18:02:17 +0000 (13:02 -0500)]
Merge pull request #60848 from cbodley/wip-rgw-deprecate-iam-tenant

docs/rgw: deprecate tenant-based IAM in favor of accounts

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
6 months agoMerge pull request #61074 from chardan/wip-radowsgw-admin-jfw-restructure_file
Jesse Williamson [Wed, 8 Jan 2025 17:53:07 +0000 (09:53 -0800)]
Merge pull request #61074 from chardan/wip-radowsgw-admin-jfw-restructure_file

rgw: migrate rgw_admin to new radosgw-admin directory

6 months agoMerge pull request #60379 from cbodley/wip-librados-cancellation
Casey Bodley [Wed, 8 Jan 2025 17:29:18 +0000 (12:29 -0500)]
Merge pull request #60379 from cbodley/wip-librados-cancellation

librados/asio: forward asio cancellations to AioCompletion::cancel()

Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
6 months agorgw/multisite: the create_bucket forward request omits the 59960/head
Shilpa Jagannath [Tue, 24 Sep 2024 21:12:02 +0000 (17:12 -0400)]
rgw/multisite: the create_bucket forward request omits the
request body, thus missing some data if specified inside the
CreateBucketConfiguration XML on the non-master zone.
Also, now that we perform cksum validation against empty payloads,
such a request would fail with -ERR_AMZ_CONTENT_SHA256_MISMATCH due
to a zero content-length but a non-empty payload hash.
This fix ensures that the request body is forwarded during create_bucket.
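
The mismatch can be illustrated with a short Python sketch. It assumes the payload hash is the hex SHA-256 of the request body; `payload_hash`, `forwarded_request_is_consistent` and the sample XML body are hypothetical and are not RGW code.

```python
import hashlib


def payload_hash(body: bytes) -> str:
    """Hex SHA-256 of a request body."""
    return hashlib.sha256(body).hexdigest()


def forwarded_request_is_consistent(declared_hash: str, forwarded_body: bytes) -> bool:
    """True when the hash declared by the client matches the forwarded body."""
    return declared_hash == payload_hash(forwarded_body)


# Illustrative body only; the zonegroup name "zg1" is made up.
original_body = (
    b"<CreateBucketConfiguration>"
    b"<LocationConstraint>zg1</LocationConstraint>"
    b"</CreateBucketConfiguration>"
)
declared = payload_hash(original_body)

# Forwarding without the body: the declared hash no longer matches.
dropped_body_ok = forwarded_request_is_consistent(declared, b"")
# Forwarding with the body: the hash matches again.
kept_body_ok = forwarded_request_is_consistent(declared, original_body)
```

Here `dropped_body_ok` is false and `kept_body_ok` is true, which is the difference between the old and the fixed forwarding behaviour in this simplified model.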

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
6 months agoMerge pull request #61127 from VallariAg/wip-nvmeof-delete-healthcheck
Vallari Agrawal [Wed, 8 Jan 2025 15:43:07 +0000 (21:13 +0530)]
Merge pull request #61127 from VallariAg/wip-nvmeof-delete-healthcheck

mon/NVMeofGwMap: add healthcheck warning NVMEOF_GATEWAY_DELETING

6 months agoMerge pull request #61136 from cbodley/wip-69301
Casey Bodley [Wed, 8 Jan 2025 15:36:25 +0000 (10:36 -0500)]
Merge pull request #61136 from cbodley/wip-69301

rgw: don't use merge_and_store_attrs() when recreating a bucket

Reviewed-by: Pritha Srivastava <prsrivas@redhat.com>
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
6 months agoMerge pull request #61255 from Matan-B/wip-matanb-crimson-seastore-deafult-2
Matan Breizman [Wed, 8 Jan 2025 11:46:38 +0000 (13:46 +0200)]
Merge pull request #61255 from Matan-B/wip-matanb-crimson-seastore-deafult-2

common/options/crimson.yaml.in: fallback to Bluestore by default

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
6 months agoMerge pull request #61251 from guits/ceph-volume-69432
Guillaume Abrioux [Wed, 8 Jan 2025 10:24:35 +0000 (11:24 +0100)]
Merge pull request #61251 from guits/ceph-volume-69432

ceph-volume: fix loop devices support

6 months agovstart.sh: add support for launching the ceph-exporter daemon 60623/head
Piyush Agarwal [Wed, 8 Jan 2025 08:57:49 +0000 (14:27 +0530)]
vstart.sh: add support for launching the ceph-exporter daemon

Signed-off-by: Piyush Agarwal <piyushagarwal14.pa@gmail.com>
6 months agoMerge pull request #61252 from guits/hints-create_id
Guillaume Abrioux [Wed, 8 Jan 2025 08:03:15 +0000 (09:03 +0100)]
Merge pull request #61252 from guits/hints-create_id

ceph-volume: add python hints to util.prepare.create_id()

6 months agoqa/workunits/nvmeof/basic_tests.sh: fix connect-all assert 61262/head
Vallari Agrawal [Tue, 7 Jan 2025 13:35:35 +0000 (19:05 +0530)]
qa/workunits/nvmeof/basic_tests.sh: fix connect-all assert

There seems to be a change in the 'nvme list' JSON output
which caused failures in asserts after the 'nvme connect-all'
command.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
6 months agoMerge pull request #61213 from Matan-B/wip-matanb-crimson-shared-lru-dtor
Matan Breizman [Tue, 7 Jan 2025 14:44:32 +0000 (16:44 +0200)]
Merge pull request #61213 from Matan-B/wip-matanb-crimson-shared-lru-dtor

crimson/common/shared_lru: invalidate Deleter's cache

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
6 months agoMerge pull request #60666 from ThomasLamprecht/debian-fix-librgw2-lua-dependencies
Kefu Chai [Tue, 7 Jan 2025 14:19:43 +0000 (22:19 +0800)]
Merge pull request #60666 from ThomasLamprecht/debian-fix-librgw2-lua-dependencies

debian/control: fix overly broad lua dependency declaration for librgw2 package

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
6 months agoMerge pull request #60118 from kshtsk/wip-refactor-make-check
kyr [Tue, 7 Jan 2025 13:36:51 +0000 (14:36 +0100)]
Merge pull request #60118 from kshtsk/wip-refactor-make-check

script/run-make: stop args duplication

6 months agoMerge pull request #61241 from guits/ceph-volume-69430
Guillaume Abrioux [Tue, 7 Jan 2025 12:53:57 +0000 (13:53 +0100)]
Merge pull request #61241 from guits/ceph-volume-69430

ceph-volume: fix Zap.ensure_associated_raw()

6 months agoceph-volume: fix loop devices support 61251/head
Guillaume Abrioux [Tue, 7 Jan 2025 08:42:15 +0000 (08:42 +0000)]
ceph-volume: fix loop devices support

This commit updates the `is_device` function to correctly handle
loop devices.
The function now validates loop devices when they are explicitly
allowed, by checking their type (`loop`) in addition to `disk`
and `mpath`.

Changes include:
  - Extending the type check to include `loop` in the list of
    supported device types.
  - Enhancing the docstring for better documentation of the
    function's purpose and behavior.

These changes ensure that loop devices are properly recognized
and handled when configuring OSDs in ceph-volume.
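
A minimal sketch of the extended type check, assuming the device type is already available as a string. The helper names and the `allow_loop_devices` parameter are hypothetical; the real `is_device` in ceph-volume takes different inputs and does more than this.

```python
from typing import List


def supported_device_types(allow_loop_devices: bool) -> List[str]:
    """Device types accepted by the check; 'loop' only when explicitly allowed."""
    types = ["disk", "mpath"]
    if allow_loop_devices:
        types.append("loop")
    return types


def is_supported_device_type(dev_type: str, allow_loop_devices: bool = False) -> bool:
    """Hypothetical stand-in for the type test inside is_device."""
    return dev_type in supported_device_types(allow_loop_devices)
```

In this model a `loop` device is rejected by default and accepted only when the caller opts in, while `disk` and `mpath` are accepted either way.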

Fixes: https://tracker.ceph.com/issues/69432
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
6 months agocommon/io_exerciser: Make chunksize so initial generated value is 4096 and random... 60330/head
JonBailey1993 [Fri, 29 Nov 2024 16:15:32 +0000 (16:15 +0000)]
common/io_exerciser: Make chunksize so initial generated value is 4096 and random values are generated thereafter

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 months agocommon/io_exerciser: Make sure Sequence 10 removes objects after finishing running
JonBailey1993 [Fri, 29 Nov 2024 16:13:57 +0000 (16:13 +0000)]
common/io_exerciser: Make sure Sequence 10 removes objects after finishing running

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 months agosrc/common/json: Modified JSON structures so they take advantage of ceph_json.h fully.
JonBailey1993 [Mon, 25 Nov 2024 11:40:09 +0000 (11:40 +0000)]
src/common/json: Modified JSON structures so they take advantage of ceph_json.h fully.

Also moved and renamed JSONStructures files so the structures are more easily identifiable and usable by others if desired.

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 months agosrc/common/io_exerciser: Remove unnecessary override in data_generation::SeededRando...
JonBailey1993 [Fri, 15 Nov 2024 15:12:02 +0000 (15:12 +0000)]
src/common/io_exerciser: Remove unnecessary override in data_generation::SeededRandomGenerator

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 months agosrc/common/io_exerciser: add missing override statements to JsonStructures.h
JonBailey1993 [Thu, 14 Nov 2024 16:51:47 +0000 (16:51 +0000)]
src/common/io_exerciser: add missing override statements to JsonStructures.h

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 months agosrc/common/io_exerciser: Formatting improvements using clang format
JonBailey1993 [Thu, 14 Nov 2024 16:47:40 +0000 (16:47 +0000)]
src/common/io_exerciser: Formatting improvements using clang format

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 months agocommon/io_exerciser: Add simple sequences for testing error injects
JonBailey1993 [Tue, 15 Oct 2024 15:00:33 +0000 (16:00 +0100)]
common/io_exerciser: Add simple sequences for testing error injects

Add sequences to test IOs with simple error injects, along with some small fixes for previous error inject implementation.

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 months agocommon/io_exerciser: Add injecterror commands to ceph_test_rados_io_sequence interact...
JonBailey1993 [Fri, 11 Oct 2024 12:44:17 +0000 (13:44 +0100)]
common/io_exerciser: Add injecterror commands to ceph_test_rados_io_sequence interactive mode

Add injecterror commands that can be used in interactive mode to inject read and write errors as well as clear them

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 months agoosd: EC error inject interfaces
Bill Scales [Tue, 8 Oct 2024 08:51:14 +0000 (08:51 +0000)]
osd: EC error inject interfaces

Error inject interfaces for EC reads and writes using
the `ceph tell osd.<n>` interface

Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
6 months agosrc/common/io_exerciser: Code readability improvements
JonBailey1993 [Tue, 1 Oct 2024 10:25:48 +0000 (11:25 +0100)]
src/common/io_exerciser: Code readability improvements

Implement inheritance for ceph::io_exerciser::IoOp to allow better differentiation between the different Op types and allow more complex operations to be implemented

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
6 months agoRevert "doc/dev/crimson: update SeaStore as default backend" 61255/head
Matan Breizman [Tue, 7 Jan 2025 10:07:18 +0000 (10:07 +0000)]
Revert "doc/dev/crimson: update SeaStore as default backend"

This reverts commit 41327dcf0f733777f0d022d1bab7bf72b9807b3c.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
6 months agoqa/suites/crimson-rados-exp: add Seastore/thrash
Matan Breizman [Tue, 7 Jan 2025 10:05:16 +0000 (10:05 +0000)]
qa/suites/crimson-rados-exp: add Seastore/thrash

As Seastore/thrash was removed from the non-exp suite, it is
moved here until fully supported.
Follow-up to: 5150dae471c

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
6 months agoqa/suites/crimson-rados-exp: remove basic
Matan Breizman [Tue, 7 Jan 2025 10:03:33 +0000 (10:03 +0000)]
qa/suites/crimson-rados-exp: remove basic

"basic" directory is included in the non-exp suite and is fully supported

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
6 months agoqa/suites/crimson-rados: disable thrash/seastore
Matan Breizman [Tue, 7 Jan 2025 09:59:01 +0000 (09:59 +0000)]
qa/suites/crimson-rados: disable thrash/seastore

Seastore supports thrash_simple only until https://tracker.ceph.com/issues/69405
is resolved.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
6 months agocommon/options/crimson.yaml.in: Bluestore by default
Matan Breizman [Tue, 7 Jan 2025 09:56:14 +0000 (09:56 +0000)]
common/options/crimson.yaml.in: Bluestore by default

This partially reverts f4dee79c3309e5ac1e1a142b85f492851e6757e1
until https://tracker.ceph.com/issues/69402 is resolved.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
6 months agoceph-volume: add python hints to util.prepare.create_id() 61252/head
Guillaume Abrioux [Tue, 7 Jan 2025 09:22:00 +0000 (09:22 +0000)]
ceph-volume: add python hints to util.prepare.create_id()

This commit introduces type annotations to the `create_id` function in `ceph_volume.util.prepare`.
The parameters and return value are now typed as follows:
  - `fsid` is a `str`.
  - `json_secrets` is a `str`.
  - `osd_id` is an optional `str` (`Optional[str]`).
  - The function returns a `str`.
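
The annotations listed above correspond to a signature along these lines. This is a sketch only: the `None` default for `osd_id` is an assumption, and the body is a placeholder that does not reproduce what ceph-volume actually does.

```python
from typing import Optional


def create_id(fsid: str, json_secrets: str, osd_id: Optional[str] = None) -> str:
    """Signature sketch; the body below is a placeholder, not ceph-volume logic."""
    # Placeholder behaviour: echo the requested id, or a dummy id when none is given.
    return osd_id if osd_id is not None else "0"
```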

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
6 months agomon/NVMeofGwMap: add delay to NVMEOF_GATEWAY_DELETING warning 61127/head
Vallari Agrawal [Wed, 25 Dec 2024 05:01:21 +0000 (10:31 +0530)]
mon/NVMeofGwMap: add delay to NVMEOF_GATEWAY_DELETING warning

Instead of triggering immediately, have this healthcheck trigger
after some time has elapsed. This delay can be configured by
mon_nvmeofgw_delete_grace.

Track the time when gateways go into DELETING state in a new
member var (of NVMeofGwMon) 'gws_deleting_time'.
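
The grace-period check can be modelled roughly as follows. This is a sketch under assumptions: the monitor code is C++, `gateways_past_grace` is a hypothetical helper, timestamps are plain seconds, and whether the real comparison is strict is not stated in this commit.

```python
from typing import Dict, List

# 15 minutes, the default quoted for mon_nvmeofgw_delete_grace.
DEFAULT_DELETE_GRACE_SECS = 15 * 60


def gateways_past_grace(
    gws_deleting_time: Dict[str, float],
    now: float,
    grace: float = DEFAULT_DELETE_GRACE_SECS,
) -> List[str]:
    """Gateways that have been in DELETING state for longer than the grace period."""
    return sorted(
        gw for gw, since in gws_deleting_time.items() if now - since > grace
    )
```

For example, with one gateway that entered DELETING 1000 seconds ago and another 200 seconds ago, only the first would be reported under the default grace.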

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
6 months agoMerge pull request #61240 from neesingh-rh/wip-68974
Zac Dover [Tue, 7 Jan 2025 06:28:34 +0000 (16:28 +1000)]
Merge pull request #61240 from neesingh-rh/wip-68974

doc: add snapshots under cephfs concepts

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
6 months agorgw: migrate rgw_admin to new directory. 61074/head
Jesse Williamson [Thu, 12 Dec 2024 22:10:19 +0000 (14:10 -0800)]
rgw: migrate rgw_admin to new directory.

Some C++ fixes, macro fixes, etc. Gathers other core
files, makes names more closely match the executable.

Signed-off-by: Jesse F. Williamson <jfw@ibm.com>
6 months agorgw: don't use merge_and_store_attrs() when recreating a bucket 61136/head
Casey Bodley [Wed, 18 Dec 2024 16:28:02 +0000 (11:28 -0500)]
rgw: don't use merge_and_store_attrs() when recreating a bucket

https://github.com/ceph/ceph/pull/56583 recently fixed
merge_and_store_attrs() to preserve existing attrs, but this broke the
swift api's ability to remove container metadata. RGWCreateBucket
handles this merging itself with prepare_add_del_attrs(), so we should
just assign createparams.attrs to the bucket and store it with
bucket->put_info()

make the same change for RGWPutMetadataBucket which swift uses to
add/remove existing metadata

Fixes: https://tracker.ceph.com/issues/69301
Signed-off-by: Casey Bodley <cbodley@redhat.com>
6 months agoceph-volume: fix Zap.ensure_associated_raw() 61241/head
Guillaume Abrioux [Mon, 6 Jan 2025 16:12:22 +0000 (16:12 +0000)]
ceph-volume: fix Zap.ensure_associated_raw()

When an OSD creation fails, ceph-volume can zap unrelated
existing raw-based OSDs as part of the 'rollback step'.

Fixes: https://tracker.ceph.com/issues/69430
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
6 months agoMerge pull request #61243 from anthonyeleven/etags-fix
Anthony D'Atri [Mon, 6 Jan 2025 18:27:06 +0000 (13:27 -0500)]
Merge pull request #61243 from anthonyeleven/etags-fix

doc/radosgw/s3: correct eTag op match tables

6 months agodoc/radosgw/s3: correct eTag op match tables 61243/head
Anthony D'Atri [Mon, 6 Jan 2025 17:48:04 +0000 (12:48 -0500)]
doc/radosgw/s3: correct eTag op match tables

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
6 months agosrc/common/options/mon.yaml.in: add mon_nvmeofgw_delete_grace
Vallari Agrawal [Wed, 25 Dec 2024 04:50:26 +0000 (10:20 +0530)]
src/common/options/mon.yaml.in: add mon_nvmeofgw_delete_grace

This option configures the delay before triggering the
NVMEOF_GATEWAY_DELETING healthcheck warning, which is
triggered when NVMeoF gateways are in DELETING state
for too long (indicating a problem in namespace
load-balancing).
The default value for this config is 15 mins.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
6 months agodoc: add snapshots in docs under Cephfs concepts 61240/head
neeraj pratap singh [Mon, 6 Jan 2025 11:00:32 +0000 (16:30 +0530)]
doc: add snapshots in docs under Cephfs concepts

Fixes: https://tracker.ceph.com/issues/68974
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
6 months agoMerge pull request #60720 from batrick/i68913
Venky Shankar [Mon, 6 Jan 2025 07:00:03 +0000 (12:30 +0530)]
Merge pull request #60720 from batrick/i68913

qa: write out ESubtreeMap more frequently to find large events

Reviewed-by: Venky Shankar <vshankar@redhat.com>
6 months agocrimson/osd/replicated_recovery_backend: prevent recovery/backfills from 60891/head
Xuehan Xu [Sun, 15 Dec 2024 08:17:03 +0000 (16:17 +0800)]
crimson/osd/replicated_recovery_backend: prevent recovery/backfills from
polluting the cache of the underlying futurized store

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
6 months agocrimson/os: all "read/get" interfaces accept op_flags
Xuehan Xu [Sun, 15 Dec 2024 08:06:35 +0000 (16:06 +0800)]
crimson/os: all "read/get" interfaces accept op_flags

To be used exclusively by Seastore. See `seastore::cache_hint_t`

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
6 months agocrimson/os/seastore: introduce cache_hint_t
Xuehan Xu [Sat, 14 Dec 2024 13:26:33 +0000 (21:26 +0800)]
crimson/os/seastore: introduce cache_hint_t

Layers above Cache can use cache_hint_t to tell Cache whether to put
the extents into Cache::lru.

CEPH_OSD_OP_FLAG_FADVISE_DONTNEED and CEPH_OSD_OP_FLAG_FADVISE_NOCACHE
will be treated as "no need to put into Cache::lru"
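
The flag-to-hint mapping can be sketched in Python as below. The numeric flag values are illustrative placeholders rather than values taken from this commit, and `should_put_in_lru` is a hypothetical helper standing in for the C++ `cache_hint_t` logic.

```python
# Illustrative bit values; treat them as placeholders, not authoritative constants.
CEPH_OSD_OP_FLAG_FADVISE_DONTNEED = 0x20
CEPH_OSD_OP_FLAG_FADVISE_NOCACHE = 0x40


def should_put_in_lru(op_flags: int) -> bool:
    """False when either fadvise flag asks to keep the extents out of the LRU."""
    no_cache_mask = (
        CEPH_OSD_OP_FLAG_FADVISE_DONTNEED | CEPH_OSD_OP_FLAG_FADVISE_NOCACHE
    )
    return (op_flags & no_cache_mask) == 0
```

A read issued with no flags would populate the LRU in this model, while a read carrying either flag would not.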

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
6 months agocrimson/osd/pg_recovery: throttle backfills together with pg-log based 60883/head
Xuehan Xu [Wed, 27 Nov 2024 01:32:48 +0000 (09:32 +0800)]
crimson/osd/pg_recovery: throttle backfills together with pg-log based
recoveries

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
6 months agoMerge pull request #59593 from xxhdx1985126/wip-67888
Matan Breizman [Sun, 5 Jan 2025 14:35:42 +0000 (16:35 +0200)]
Merge pull request #59593 from xxhdx1985126/wip-67888

crimson/osd/backfill_state: treat `Cancelled` as a pause of the ongoing backfilling

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
6 months agoMerge pull request #60597 from xxhdx1985126/wip-68806
Matan Breizman [Sun, 5 Jan 2025 11:47:12 +0000 (13:47 +0200)]
Merge pull request #60597 from xxhdx1985126/wip-68806

crimson/osd/replicated_recovery_backend: call on_global_recover() only when all replicas and the primary have been recovered

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
6 months agoMerge pull request #60041 from xxhdx1985126/wip-68286
Matan Breizman [Sun, 5 Jan 2025 11:46:39 +0000 (13:46 +0200)]
Merge pull request #60041 from xxhdx1985126/wip-68286

crimson/osd/pg_shard_manager: discard outdated operations when the corresponding pgs are already removed

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
6 months agoMerge pull request #61216 from athanatos/sjust/wip-crimson-backfill-teuth
Matan Breizman [Sun, 5 Jan 2025 11:07:28 +0000 (13:07 +0200)]
Merge pull request #61216 from athanatos/sjust/wip-crimson-backfill-teuth

qa/suites/crimson-rados: enable short_pg_log

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
6 months agoMerge pull request #61227 from zdover23/wip-doc-2024-01-05-README
Zac Dover [Sun, 5 Jan 2025 09:38:45 +0000 (19:38 +1000)]
Merge pull request #61227 from zdover23/wip-doc-2024-01-05-README

doc: README.md - improve "Tshooting" and "Tips & Tricks"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
6 months agoMerge pull request #61148 from ronen-fr/wip-rf-abortReserv
Ronen Friedman [Sun, 5 Jan 2025 06:41:13 +0000 (08:41 +0200)]
Merge pull request #61148 from ronen-fr/wip-rf-abortReserv

osd/scrub: abort running scrub in replica-reservation if an operator-initiated scrub is requested

Reviewed-by: Samuel Just <sjust@redhat.com>
6 months agodoc: README.md - improve "Tshooting" and "Tips & Tricks" 61227/head
Zac Dover [Sat, 4 Jan 2025 20:54:48 +0000 (06:54 +1000)]
doc: README.md - improve "Tshooting" and "Tips & Tricks"

Improve the formatting and English language in the sections
"Troubleshooting" and "Tips and Tricks", and move those sections to a
place where they don't interrupt the flow of the vstart cluster
installation instructions. Some of the strings in "Tips and Tricks" are
not yet unambiguous sentences that will make sense to the uninitiated,
but this PR represents a step in that direction.

This PR is part of a series of PRs meant to preserve the integrity of
the README.md file after some recent additions that break the flow of
the document.

This PR follows https://github.com/ceph/ceph/pull/61226 and
https://github.com/ceph/ceph/pull/61221.

Signed-off-by: Zac Dover <zac.dover@proton.me>
6 months agoMerge pull request #61226 from zdover23/wip-doc-2025-01-04-README
Zac Dover [Sat, 4 Jan 2025 19:56:07 +0000 (05:56 +1000)]
Merge pull request #61226 from zdover23/wip-doc-2025-01-04-README

doc: README.md - format "Troubleshooting"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
6 months agoMerge pull request #61086 from athanatos/sjust/wip-rep-pipeline
Samuel Just [Fri, 3 Jan 2025 20:16:50 +0000 (12:16 -0800)]
Merge pull request #61086 from athanatos/sjust/wip-rep-pipeline

crimson: allow replica side write commits to pipeline

Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>
6 months agodoc: README.md - format "Troubleshooting" 61226/head
Zac Dover [Fri, 3 Jan 2025 19:52:24 +0000 (05:52 +1000)]
doc: README.md - format "Troubleshooting"

Format "Troubleshooting" into its own section so that it doesn't confuse
readers of the vstart installation procedure.

This PR is part of a series of PRs meant to preserve the integrity of
the README.md file after some recent additions that break the flow of
the document.

This PR follows https://github.com/ceph/ceph/pull/61221.

Signed-off-by: Zac Dover <zac.dover@proton.me>
6 months agoMerge pull request #61221 from zdover23/wip-doc-2025-01-03-README
Zac Dover [Fri, 3 Jan 2025 19:16:24 +0000 (05:16 +1000)]
Merge pull request #61221 from zdover23/wip-doc-2025-01-03-README

doc: README.md - format "Tips and Tricks"

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
6 months agoMerge pull request #60832 from clwluvw/bucket-delete-meta
J. Eric Ivancich [Fri, 3 Jan 2025 17:17:30 +0000 (12:17 -0500)]
Merge pull request #60832 from clwluvw/bucket-delete-meta

rgw: consider multi zonegroup for is_syncing_bucket_meta

Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 months agoMerge pull request #60685 from clwluvw/data-sync-perm
J. Eric Ivancich [Fri, 3 Jan 2025 17:16:39 +0000 (12:16 -0500)]
Merge pull request #60685 from clwluvw/data-sync-perm

rgw: respect policies in data sync in user mode

Reviewed-by: Adam Emerson <aemerson@redhat.com>
6 months agomon/NVMeofGwMap: add healthcheck warning NVMEOF_GATEWAY_DELETING
Vallari Agrawal [Wed, 18 Dec 2024 07:59:47 +0000 (13:29 +0530)]
mon/NVMeofGwMap: add healthcheck warning NVMEOF_GATEWAY_DELETING

Add a warning when NVMeoF gateways are in DELETING state.
This happens when there are namespaces under the deleted gateway's
ANA group ID.

The gateways are removed completely after users manually move these
namespaces to another load balancing group, or if a new gateway is
deployed on that host.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
6 months agoMerge pull request #61217 from gbregman/main
Gil Bregman [Fri, 3 Jan 2025 10:12:08 +0000 (12:12 +0200)]
Merge pull request #61217 from gbregman/main

mgr/cephadm/nvmeof: Add key verification field to NVMeOF configuration

6 months agodoc: README.md - format "Tips and Tricks" 61221/head
Zac Dover [Fri, 3 Jan 2025 09:23:54 +0000 (19:23 +1000)]
doc: README.md - format "Tips and Tricks"

Format "Tips and Tricks" into its own section so that it doesn't confuse
readers of the vstart installation procedure.

Signed-off-by: Zac Dover <zac.dover@proton.me>
6 months agomgr/cephadm/nvmeof: Add key verification field to NVMeOF configuration 61217/head
Gil Bregman [Thu, 2 Jan 2025 21:08:00 +0000 (23:08 +0200)]
mgr/cephadm/nvmeof: Add key verification field to NVMeOF configuration
Fixes https://tracker.ceph.com/issues/69413

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
6 months agoMerge pull request #59980 from kchheda3/wip-fix-v1-v2-notification
J. Eric Ivancich [Thu, 2 Jan 2025 19:31:41 +0000 (14:31 -0500)]
Merge pull request #59980 from kchheda3/wip-fix-v1-v2-notification

rgw/notification: Forward Topic & Notification creation request to master when notification_v2 enabled

Reviewed-by: Yuval Lifshitz <ylifshit@ibm.com>
6 months agoMerge pull request #60430 from ivancich/wip-fix-multipart-empty-storage-class
J. Eric Ivancich [Thu, 2 Jan 2025 19:30:51 +0000 (14:30 -0500)]
Merge pull request #60430 from ivancich/wip-fix-multipart-empty-storage-class

rgw: fix empty storage class on display of multipart uploads

Reviewed-by: Adam Emerson <aemerson@redhat.com>
6 months agoMerge pull request #59631 from thotz/create-user-without-creds-cli
J. Eric Ivancich [Thu, 2 Jan 2025 19:29:13 +0000 (14:29 -0500)]
Merge pull request #59631 from thotz/create-user-without-creds-cli

radosgw-admin: create user without creds cli

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
6 months agocrimson/common/shared_lru: rename Deleter::cache 61213/head
Matan Breizman [Thu, 2 Jan 2025 15:12:31 +0000 (15:12 +0000)]
crimson/common/shared_lru: rename Deleter::cache

to not be confused with SharedLRU::cache

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
6 months agocrimson/common/shared_lru: invalidate Deleter's cache
Matan Breizman [Tue, 24 Dec 2024 11:22:58 +0000 (11:22 +0000)]
crimson/common/shared_lru: invalidate Deleter's cache

Once we destruct SharedLRU, SharedLRU::weak_refs map is destroyed.
As a weak reference might outlive the SharedLRU itself, when destroying
the object via the custom Deleter, we try to access the already
destroyed SharedLRU instance's weak ref map.

Instead, invalidate the custom Deleter (Deleter::cache), when
destructing the SharedLRU.
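
The idea can be modelled loosely in Python. This is a sketch under heavy assumptions: the real code is C++ with custom deleters on shared pointers, `close()` stands in for the destructor, and how the real SharedLRU reaches its outstanding Deleters is not described in this commit.

```python
class Deleter:
    """Toy deleter holding a back-pointer to its cache."""

    def __init__(self, cache):
        self.cache = cache

    def invalidate(self):
        # Called when the owning cache goes away.
        self.cache = None

    def __call__(self, key):
        # Only touch the weak-ref map while the cache is still alive.
        if self.cache is None:
            return False
        self.cache.weak_refs.pop(key, None)
        return True


class SharedLRU:
    """Toy cache; not the crimson SharedLRU."""

    def __init__(self):
        self.weak_refs = {}
        self.deleters = []

    def make_deleter(self):
        deleter = Deleter(self)
        self.deleters.append(deleter)
        return deleter

    def close(self):
        # Stand-in for the destructor: invalidate deleters before the map goes away.
        for deleter in self.deleters:
            deleter.invalidate()
        self.weak_refs = None
```

A deleter invoked after `close()` returns without touching the destroyed map, which is the behaviour the fix aims for in this simplified model.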

Fixes: https://tracker.ceph.com/issues/66478
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
6 months agoMerge PR #60411 into main
Venky Shankar [Thu, 2 Jan 2025 06:45:55 +0000 (12:15 +0530)]
Merge PR #60411 into main

* refs/pull/60411/head:
client: Fix a deadlock when osd is full

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
6 months agoMerge pull request #61103 from yuvalif/wip-yuval-fix-test-names
Yuval Lifshitz [Wed, 1 Jan 2025 09:03:41 +0000 (11:03 +0200)]
Merge pull request #61103 from yuvalif/wip-yuval-fix-test-names

test/rgw/notifications: fix test names

Reviewed-By: Ali Masarwe <ali.masarwa@ibm.com>
6 months agoqa/standalone/scrub: osd-scrub-test.sh - test operator overrides 61148/head
Ronen Friedman [Wed, 25 Dec 2024 13:02:13 +0000 (07:02 -0600)]
qa/standalone/scrub: osd-scrub-test.sh - test operator overrides

verify that an operator scrub aborts a reserving scrub of the
same PG.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
6 months agoosd/scrub: reset m_active_target when the scrub ends
Ronen Friedman [Sun, 29 Dec 2024 11:44:47 +0000 (05:44 -0600)]
osd/scrub: reset m_active_target when the scrub ends

... as it is now queried to determine whether we are scrubbing,
but not yet 'm_active', as the scrubber is in ReservingReplicas.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
6 months agoosd/scrub: convey 'reserving replicas' status in query results
Ronen Friedman [Wed, 25 Dec 2024 15:12:10 +0000 (09:12 -0600)]
osd/scrub: convey 'reserving replicas' status in query results

... and not just in 'pg dump' output.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
6 months agoqa/standalone/scrub: add build_pg_dicts()
Ronen Friedman [Sun, 29 Dec 2024 11:26:28 +0000 (05:26 -0600)]
qa/standalone/scrub: add build_pg_dicts()

a helper function that builds bash dictionaries:
pg to acting set, pg to primary & pg to pool.

Also added are two helper functions that make use of the dictionaries:

count_common_active() to count the number of common OSDs
in the acting set of two PGs, and find_disjoint_but_primary()
to find a PG that is disjoint from the first PG, apart from
possibly having the same primary OSD.
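
The helpers in this commit are bash functions. The logic described above
can be illustrated with a language-neutral sketch, written here in C++17.
The `PgDicts` struct, the signatures, and the reading of "disjoint apart
from the primary" (a shared primary OSD is not counted as overlap) are
assumptions made for illustration, not the test script's actual code.

```cpp
#include <algorithm>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical in-memory form of two of the dictionaries.
struct PgDicts {
  std::map<std::string, std::vector<int>> acting;  // pg id -> acting set (OSD ids)
  std::map<std::string, int> primary;              // pg id -> primary OSD
};

// Count the OSDs that appear in the acting sets of both PGs.
int count_common_active(const PgDicts& d, const std::string& pg1,
                        const std::string& pg2) {
  const auto& a = d.acting.at(pg1);
  const auto& b = d.acting.at(pg2);
  return static_cast<int>(std::count_if(a.begin(), a.end(), [&b](int osd) {
    return std::find(b.begin(), b.end(), osd) != b.end();
  }));
}

// Find a PG whose acting set shares no OSD with first_pg, except that
// both PGs may have the same primary (assumed reading of the description).
std::optional<std::string> find_disjoint_but_primary(
    const PgDicts& d, const std::string& first_pg) {
  for (const auto& entry : d.acting) {
    const std::string& pg = entry.first;
    if (pg == first_pg) {
      continue;
    }
    int common = count_common_active(d, first_pg, pg);
    // A shared primary sits in both acting sets; do not count it.
    if (common > 0 && d.primary.at(pg) == d.primary.at(first_pg)) {
      --common;
    }
    if (common == 0) {
      return pg;
    }
  }
  return std::nullopt;
}
```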

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
6 months agoMerge pull request #61197 from bebehei/fix-haproxy-dashboard-docs
Anthony D'Atri [Mon, 30 Dec 2024 15:05:10 +0000 (10:05 -0500)]
Merge pull request #61197 from bebehei/fix-haproxy-dashboard-docs

doc/mgr/dashboard: Fix HAProxy TLS example

6 months agodoc/mgr/dashboard: Fix HAProxy TLS example 61197/head
Benedikt Heine [Mon, 30 Dec 2024 14:26:16 +0000 (15:26 +0100)]
doc/mgr/dashboard: Fix HAProxy TLS example

With `ssl` set on the `server` option, HAProxy strips the TLS protocol
for all clients. You would need to connect to it with `http://<ip>:443`.

To have an active health check, which uses SSL, but does not strip it
for clients, you'd need to add:

- `check` to enable active health checks.
- `check-ssl` to instruct the health check to use TLS
- `verify none` to skip verification on the health check requests from
  HAProxy
- _REMOVE_ `ssl` to stop stripping TLS

The active health checks are required so that no requests are routed to
the inactive managers. Those would redirect clients to an IP of the
active mgr, which may be unusable.
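
Putting the options listed above together, a backend section might look
like the following. This is a sketch only: the backend name, server
names, addresses and port are placeholders, and the `option httpchk` and
`http-check` lines are assumed from a typical HAProxy setup rather than
taken from this commit.

```
backend ceph_dashboard_backend
    mode tcp
    option httpchk GET /
    http-check expect status 200
    # No `ssl` here: TLS passes through to the manager untouched.
    # `check check-ssl verify none`: health checks use TLS, unverified.
    server mgr1 192.0.2.11:8443 check check-ssl verify none
    server mgr2 192.0.2.12:8443 check check-ssl verify none
```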

---

Alternatively you could add another certificate in the frontend and then
re-encrypt the traffic. But this would require tracking the certs also
in HAProxy.

Signed-off-by: Benedikt Heine <bebe@bebehei.de>
6 months agotest/rgw/noitifications: fix test names 61103/head
Yuval Lifshitz [Mon, 16 Dec 2024 17:16:36 +0000 (17:16 +0000)]
test/rgw/noitifications: fix test names

for persistent topic stats tests

Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
6 months agoMerge pull request #60794 from dparmar18/wip-68571
Zac Dover [Mon, 30 Dec 2024 07:26:45 +0000 (17:26 +1000)]
Merge pull request #60794 from dparmar18/wip-68571

doc/cephfs: document purge queue and its perf counters

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
6 months agoqa: write out ESubtreeMap more frequently to find large events 60720/head
Patrick Donnelly [Wed, 13 Nov 2024 03:29:19 +0000 (22:29 -0500)]
qa: write out ESubtreeMap more frequently to find large events

With the trimming changes by 9d2b3aa, ESubtreeMap wasn't written reliably often
enough to pass the test.

Fixes: https://tracker.ceph.com/issues/68913
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
6 months agoMerge pull request #61191 from zdover23/wip-doc-2024-12-29-README-cleanup
Zac Dover [Sun, 29 Dec 2024 19:21:39 +0000 (05:21 +1000)]
Merge pull request #61191 from zdover23/wip-doc-2024-12-29-README-cleanup

doc: README.md - format admonition

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
6 months agodoc: README.md - format admonition 61191/head
Zac Dover [Sun, 29 Dec 2024 13:24:46 +0000 (23:24 +1000)]
doc: README.md - format admonition

Format an admonition correctly. This commit is a prelude to a cleanup of
a recent addition to README.md.

Signed-off-by: Zac Dover <zac.dover@proton.me>
6 months agoosd/scrub: abort reserving scrub if an operator-initiated scrub is
Ronen Friedman [Thu, 19 Dec 2024 16:02:08 +0000 (10:02 -0600)]
osd/scrub: abort reserving scrub if an operator-initiated scrub is
requested

Handle the case of receiving an operator command while the PG is
scrubbing but is still waiting for replicas' reservations.

Now that the reservations are queued, the wait may be very prolonged.
Usually, an operator's direct scrub command has a priority high enough
to not require waiting for reservations. But in the current
implementation, it would wait until the running scrub session
terminates, and only then rerun at that high priority. This is not the
intended behavior.

The solution is to abort the existing scrub session and start the new
one.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
6 months agoosd/scrub: register for 'osd_max_scrubs' config changes
Ronen Friedman [Thu, 26 Dec 2024 13:06:10 +0000 (07:06 -0600)]
osd/scrub: register for 'osd_max_scrubs' config changes

Since https://github.com/ceph/ceph/pull/55340, osd_max_scrubs also
affects the parameters of the async scrub reserver used by the
replicas. Thus, the code must notice and react to changes to this
config option.
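
The general pattern of registering for changes to one config key can be
sketched as follows. Everything here is hypothetical: `ConfigStore`,
`ReplicaReserver` and their members are invented for illustration and do
not reflect Ceph's actual config-observer interface or the reserver's
real parameters.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical config store that notifies subscribers on every change.
class ConfigStore {
  std::map<std::string, int> values;
  std::map<std::string, std::vector<std::function<void(int)>>> observers;

public:
  // Register a callback for changes to one key.
  void subscribe(const std::string& key, std::function<void(int)> cb) {
    observers[key].push_back(std::move(cb));
  }

  // Store the new value, then notify only the observers of that key.
  void set(const std::string& key, int value) {
    values[key] = value;
    for (auto& cb : observers[key]) {
      cb(value);
    }
  }

  int get(const std::string& key) const { return values.at(key); }
};

// Hypothetical stand-in for the replica-side reserver.
struct ReplicaReserver {
  int max_allowed = 1;
  void update_max(int n) { max_allowed = n; }
};
```

Without the `subscribe` call, the reserver would keep the value it was
constructed with, which is the kind of staleness this commit addresses.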

Fixes: https://tracker.ceph.com/issues/69362
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
6 months agoMerge pull request #61142 from Dedsec0098/wip-doc-shrish
Zac Dover [Sat, 28 Dec 2024 15:32:26 +0000 (01:32 +1000)]
Merge pull request #61142 from Dedsec0098/wip-doc-shrish

doc: Update vstart section in readme.md

Reviewed-by: Zac Dover <zac.dover@proton.me>
6 months agoMerge pull request #61156 from zdover23/wip-doc-2024-12-20-radosgw-uadk-accel
Zac Dover [Sat, 28 Dec 2024 09:38:26 +0000 (19:38 +1000)]
Merge pull request #61156 from zdover23/wip-doc-2024-12-20-radosgw-uadk-accel

doc/radosgw: line edit uadk-accel.rst (1st half)

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>