git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
20 months agoMerge pull request #53759 from cbodley/wip-63058-pacific
Yuri Weinstein [Wed, 15 Nov 2023 21:02:30 +0000 (13:02 -0800)]
Merge pull request #53759 from cbodley/wip-63058-pacific

pacific: rgw: fix unwatch crash at radosgw startup

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #53593 from trociny/wip-58478-pacific
Yuri Weinstein [Wed, 15 Nov 2023 21:01:45 +0000 (13:01 -0800)]
Merge pull request #53593 from trociny/wip-58478-pacific

pacific: rgw: fix FP error when calculating enteries per bi shard

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #53474 from k0ste/wip-55701-pacific
Yuri Weinstein [Wed, 15 Nov 2023 21:01:16 +0000 (13:01 -0800)]
Merge pull request #53474 from k0ste/wip-55701-pacific

pacific: radosgw-admin: don't crash on --placement-id without --storage-class

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #53472 from k0ste/wip-57635-pacific
Yuri Weinstein [Wed, 15 Nov 2023 21:00:49 +0000 (13:00 -0800)]
Merge pull request #53472 from k0ste/wip-57635-pacific

pacific: rgw: Drain async_processor request queue during shutdown

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #53439 from k0ste/wip-62823-pacific
Yuri Weinstein [Wed, 15 Nov 2023 21:00:19 +0000 (13:00 -0800)]
Merge pull request #53439 from k0ste/wip-62823-pacific

pacific: RadosGW API: incorrect bucket quota in response to HEAD /{bucket}/?usage

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #53410 from trociny/wip-62308-pacific
Yuri Weinstein [Wed, 15 Nov 2023 20:59:46 +0000 (12:59 -0800)]
Merge pull request #53410 from trociny/wip-62308-pacific

pacific: rgw/sync-policy: Correct "sync status" & "sync group" commands

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #53400 from trociny/wip-62751-pacific
Yuri Weinstein [Wed, 15 Nov 2023 20:59:14 +0000 (12:59 -0800)]
Merge pull request #53400 from trociny/wip-62751-pacific

pacific: rgw: fix 2 null versionID after convert_plain_entry_to_versioned

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #53376 from jzhu116-bloomberg/wip-59692-pacific
Yuri Weinstein [Wed, 15 Nov 2023 20:51:27 +0000 (12:51 -0800)]
Merge pull request #53376 from jzhu116-bloomberg/wip-59692-pacific

pacific: rgw/notification: remove non x-amz-meta-* attributes from bucket notifications

Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
20 months agoMerge pull request #53356 from k0ste/wip-53658-pacific
Yuri Weinstein [Wed, 15 Nov 2023 20:50:05 +0000 (12:50 -0800)]
Merge pull request #53356 from k0ste/wip-53658-pacific

pacific: rgw: fix UploadPartCopy error code when src object not exist and src bucket not exist

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #52936 from k0ste/wip-58902-pacific
Yuri Weinstein [Wed, 15 Nov 2023 20:49:17 +0000 (12:49 -0800)]
Merge pull request #52936 from k0ste/wip-58902-pacific

pacific: rgw: Fix Browser POST content-length-range min value

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #52797 from cbodley/wip-62300-pacific
Yuri Weinstein [Wed, 15 Nov 2023 20:48:47 +0000 (12:48 -0800)]
Merge pull request #52797 from cbodley/wip-62300-pacific

pacific: rgw: retry metadata cache notifications with INVALIDATE_OBJ

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #52729 from theanalyst/wip-58817
Yuri Weinstein [Wed, 15 Nov 2023 20:47:48 +0000 (12:47 -0800)]
Merge pull request #52729 from theanalyst/wip-58817

pacific: rgw: swift : check for valid key in POST forms

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #52113 from cbodley/wip-61728-pacific
Yuri Weinstein [Wed, 15 Nov 2023 20:45:37 +0000 (12:45 -0800)]
Merge pull request #52113 from cbodley/wip-61728-pacific

pacific: rgw/beast: add max_header_size option with 16k default, up from 4k

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #54167 from cbodley/wip-58238-pacific
Yuri Weinstein [Wed, 15 Nov 2023 16:28:52 +0000 (08:28 -0800)]
Merge pull request #54167 from cbodley/wip-58238-pacific

pacific: rgw: beast frontend checks for local_endpoint() errors

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #54120 from dparmar18/wip-63269-pacific
Yuri Weinstein [Wed, 15 Nov 2023 15:55:34 +0000 (07:55 -0800)]
Merge pull request #54120 from dparmar18/wip-63269-pacific

pacific: mds: report clients laggy due laggy OSDs only after checking any OSD is laggy

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53916 from kotreshhr/wip-63164-pacific
Yuri Weinstein [Wed, 15 Nov 2023 15:55:00 +0000 (07:55 -0800)]
Merge pull request #53916 from kotreshhr/wip-63164-pacific

pacific: pybind/mgr/volumes: log mutex locks to help debug deadlocks

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53576 from mchangir/wip-57157-pacific
Yuri Weinstein [Wed, 15 Nov 2023 15:54:26 +0000 (07:54 -0800)]
Merge pull request #53576 from mchangir/wip-57157-pacific

pacific: doc/cephfs: note regarding start time time zone

Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
20 months agoMerge pull request #53556 from batrick/wip-62731-pacific
Yuri Weinstein [Wed, 15 Nov 2023 15:53:11 +0000 (07:53 -0800)]
Merge pull request #53556 from batrick/wip-62731-pacific

pacific: mds: add event for batching getattr/lookup

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53555 from batrick/wip-62897-pacific
Yuri Weinstein [Wed, 15 Nov 2023 15:52:40 +0000 (07:52 -0800)]
Merge pull request #53555 from batrick/wip-62897-pacific

pacific: qa: lengthen shutdown timeout for thrashed MDS

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53550 from batrick/wip-62902-pacific
Yuri Weinstein [Wed, 15 Nov 2023 15:51:56 +0000 (07:51 -0800)]
Merge pull request #53550 from batrick/wip-62902-pacific

pacific: mds: log message when exiting due to asok command

Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Christopher Hoffman <choffman@redhat.com>
20 months agoMerge pull request #53495 from lxbsz/wip-62859
Yuri Weinstein [Wed, 15 Nov 2023 15:51:08 +0000 (07:51 -0800)]
Merge pull request #53495 from lxbsz/wip-62859

pacific: mds: fix deadlock between unlinking and linkmerge

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53486 from batrick/wip-62854-pacific
Yuri Weinstein [Wed, 15 Nov 2023 15:50:36 +0000 (07:50 -0800)]
Merge pull request #53486 from batrick/wip-62854-pacific

pacific: qa: ignore expected cluster warning from damage tests

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53453 from joscollin/wip-62834-pacific
Yuri Weinstein [Wed, 15 Nov 2023 15:49:44 +0000 (07:49 -0800)]
Merge pull request #53453 from joscollin/wip-62834-pacific

pacific: cephfs-top: include the missing fields in --dump output

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge PR #52852 into pacific
Patrick Donnelly [Mon, 13 Nov 2023 14:46:16 +0000 (09:46 -0500)]
Merge PR #52852 into pacific

* refs/pull/52852/head:
mds: remove calculating caps after adding revokes back
test/libcephfs: add test case for revoking caps
client: issue a cap release immediately if no cap exists
mds: add the revoking caps back to _revokes list
mds: move confirm_receipt() to Capability.cc

Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
20 months agoMerge pull request #54268 from ronen-fr/wip-63372-pacific
Yuri Weinstein [Thu, 9 Nov 2023 17:01:41 +0000 (09:01 -0800)]
Merge pull request #54268 from ronen-fr/wip-63372-pacific

pacific: osd: fix use-after-move in build_incremental_map_msg()

Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
20 months agoMerge pull request #53662 from lxbsz/wip-62523
Yuri Weinstein [Thu, 9 Nov 2023 16:59:13 +0000 (08:59 -0800)]
Merge pull request #53662 from lxbsz/wip-62523

pacific: ceph: allow xlock state to be LOCK_PREXLOCK when putting it

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53645 from vshankar/wip-61803-pacific
Yuri Weinstein [Thu, 9 Nov 2023 16:58:43 +0000 (08:58 -0800)]
Merge pull request #53645 from vshankar/wip-61803-pacific

pacific: cephfs-journal-tool: disambiguate usage of all keyword (in tool help).

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53640 from vshankar/wip-62949-pacific
Yuri Weinstein [Thu, 9 Nov 2023 16:58:06 +0000 (08:58 -0800)]
Merge pull request #53640 from vshankar/wip-62949-pacific

pacific: cephfs-mirror: do not run concurrent C_RestartMirroring context

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53362 from k0ste/wip-57110-pacific
Yuri Weinstein [Thu, 9 Nov 2023 16:57:30 +0000 (08:57 -0800)]
Merge pull request #53362 from k0ste/wip-57110-pacific

pacific: mds: replacing bootstrap session only if handle client session message

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53270 from mchangir/wip-59001-pacific
Yuri Weinstein [Thu, 9 Nov 2023 16:57:03 +0000 (08:57 -0800)]
Merge pull request #53270 from mchangir/wip-59001-pacific

pacific: cephfs_mirror: correctly set top level dir permissions

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #53169 from leonid-s-usov/bp/cap-throttle-event/pacific
Yuri Weinstein [Thu, 9 Nov 2023 16:56:20 +0000 (08:56 -0800)]
Merge pull request #53169 from leonid-s-usov/bp/cap-throttle-event/pacific

pacific: mds/Server: mark a cap acquisition throttle event in the request

Reviewed-by: Rishabh Dave <ridave@redhat.com>
20 months agoMerge pull request #54294 from ajarr/wip-63385-pacific
Yuri Weinstein [Wed, 8 Nov 2023 21:37:59 +0000 (13:37 -0800)]
Merge pull request #54294 from ajarr/wip-63385-pacific

pacific: qa/suites/rbd: add test to check rbd_support module recovery

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
20 months agoMerge pull request #54293 from ajarr/wip-63382-pacific
Yuri Weinstein [Wed, 8 Nov 2023 21:37:19 +0000 (13:37 -0800)]
Merge pull request #54293 from ajarr/wip-63382-pacific

pacific: mgr/rbd_support: fix recursive locking on CreateSnapshotRequests lock

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
20 months agoMerge pull request #54256 from pkalever/wip-63351-pacific
Yuri Weinstein [Wed, 8 Nov 2023 21:36:35 +0000 (13:36 -0800)]
Merge pull request #54256 from pkalever/wip-63351-pacific

pacific: rbd-nbd: fix stuck with disable request

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
20 months agomgr/rbd_support: remove CreateSnapshotRequests __del__() 54293/head
Ramana Raja [Mon, 30 Oct 2023 15:05:27 +0000 (11:05 -0400)]
mgr/rbd_support: remove CreateSnapshotRequests __del__()

There is no need for CreateSnapshotRequests.__del__() that calls
CreateSnapshotRequests.wait_for_pending().
MirrorSnapshotScheduleHandler.shutdown() already calls
CreateSnapshotRequests.wait_for_pending().

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit fed1e87685a698876cf167b3681327e5b0066ee6)

Conflicts:
       src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py
 - Above conflict was due to commit e4a16e2
   ("mgr/rbd_support: add type annotation") not in pacific

20 months agomgr/rbd_support: fix recursive locking on CreateSnapshotRequests lock
Ramana Raja [Thu, 26 Oct 2023 17:18:52 +0000 (13:18 -0400)]
mgr/rbd_support: fix recursive locking on CreateSnapshotRequests lock

The MirrorSnapshotScheduleHandler's run thread issues asynchronous
create snapshot requests using a CreateSnapshotRequests instance. When
the thread invokes a CreateSnapshotRequests instance's get_ioctx(),
the instance's class variable lock is acquired. With the class
variable lock held, the garbage collection of a CreateSnapshotRequests
instance may race in the thread. The thread would then call
CreateSnapshotRequests __del__() that tries to acquire the class
variable lock that the thread already holds. Fix this
recursive deadlock by converting the CreateSnapshotRequests lock from
a class variable to an instance variable. There is no need to share
the lock across CreateSnapshotRequests instances.

Also convert MirrorSnapshotScheduleHandler, PerfHandler and
TrashPurgeScheduleHandler class variables to instance variables
that don't need to be shared across the instances.
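
The difference between a class-variable lock and an instance-variable lock can be sketched in a few lines of Python. This is only an illustration, not the rbd_support code: the class names below are hypothetical stand-ins, and a non-blocking acquire is used so the sketch reports the would-be deadlock instead of hanging.

```python
import threading


class SharedLockRequests:
    """Hypothetical stand-in: one lock shared by all instances (class variable)."""
    lock = threading.Lock()


class PerInstanceLockRequests:
    """Hypothetical stand-in: each instance owns its lock (instance variable)."""

    def __init__(self):
        self.lock = threading.Lock()


def second_acquire_succeeds(holder, other):
    """Hold holder's lock, then try other's lock from the same thread.

    This mimics a finalizer of one instance running while the thread
    already holds the lock through another instance.
    """
    with holder.lock:
        got = other.lock.acquire(blocking=False)
        if got:
            other.lock.release()
        return got


# Same underlying lock: a blocking acquire here would deadlock the thread.
shared_ok = second_acquire_succeeds(SharedLockRequests(), SharedLockRequests())
# Distinct locks: the second acquire is independent of the first.
per_instance_ok = second_acquire_succeeds(
    PerInstanceLockRequests(), PerInstanceLockRequests()
)
print(shared_ok, per_instance_ok)
```

With a plain threading.Lock the second acquire on the shared lock cannot succeed while the first is held, which is why moving the lock onto the instance removes the recursion.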

Fixes: https://tracker.ceph.com/issues/62994
Signed-off-by: Ramana Raja <rraja@redhat.com>
Co-Authored-By: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 4452bc22d1c6c8499cf55d6e39090adf7ae1dcbf)

 Conflicts:
src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py
src/pybind/mgr/rbd_support/perf.py
src/pybind/mgr/rbd_support/trash_purge_schedule.py
 - Above conflicts were due to commit e4a16e2
   ("mgr/rbd_support: add type annotation") not in pacific

20 months agoMerge pull request #53970 from Matan-B/wip-63179-pacific
Matan Breizman [Sun, 5 Nov 2023 08:08:07 +0000 (10:08 +0200)]
Merge pull request #53970 from Matan-B/wip-63179-pacific

pacific: osd/OSD: introduce reset_purged_snaps_last

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radosław Zarzyński <rzarzyns@redhat.com>
20 months agoMerge PR #53103 into pacific
Patrick Donnelly [Fri, 3 Nov 2023 20:32:01 +0000 (16:32 -0400)]
Merge PR #53103 into pacific

* refs/pull/53103/head:
libcephsqlite: fill 0s in unread portion of buffer

Reviewed-by: Laura Flores <lflores@redhat.com>
20 months agoMerge pull request #53984 from sseshasa/wip-63185-pacific
Yuri Weinstein [Thu, 2 Nov 2023 17:45:12 +0000 (10:45 -0700)]
Merge pull request #53984 from sseshasa/wip-63185-pacific

pacific: mon/ConfigMonitor: Show localized name in "config dump --format json" output

Reviewed-by: Laura Flores <lflores@redhat.com>
20 months agoMerge pull request #53430 from pdvian/wip-fix-scrubmsg
Yuri Weinstein [Thu, 2 Nov 2023 17:44:04 +0000 (10:44 -0700)]
Merge pull request #53430 from pdvian/wip-fix-scrubmsg

pacific: osd/scrub: Fix scrub starts messages spamming the cluster log

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
20 months agoqa/suites/rbd: add test to check rbd_support module recovery 54294/head
Ramana Raja [Mon, 18 Sep 2023 02:52:56 +0000 (22:52 -0400)]
qa/suites/rbd: add test to check rbd_support module recovery

... on repeated blocklisting of its client.

There were issues with rbd_support module not being able to recover
from its RADOS client being repeatedly blocklisted. This occurred for
example in clusters with OSDs slow to process RBD requests while the
module's mirror_snapshot_scheduler was taking mirror snapshots by
requesting exclusive locks on the RBD images and workloads were running
on the snapshotted images via kernel clients.

Fixes: https://tracker.ceph.com/issues/62891
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit 2f2cd3bcff82afc3a4d251143eb462e700e7fc60)

20 months agoosd: fix use-after-move in build_incremental_map_msg() 54268/head
Ronen Friedman [Wed, 25 Oct 2023 07:24:18 +0000 (02:24 -0500)]
osd: fix use-after-move in build_incremental_map_msg()

Fixes: https://tracker.ceph.com/issues/63310
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 9e2b8b0e8235b36e55310aab49b8f760e8d57cad)

20 months agoMerge pull request #53135 from ifed01/wip-ifed-verbose-open-col-pac
Igor Fedotov [Tue, 31 Oct 2023 09:20:22 +0000 (12:20 +0300)]
Merge pull request #53135 from ifed01/wip-ifed-verbose-open-col-pac

pacific: osd,bluestore: gracefully handle a failure during meta collection load

Reviewed-by: Adam Kupczyk <akupczyk@ibm.com>
20 months agoMerge pull request #53587 from ifed01/wip-ifed-vselector-53906-pac
Yuri Weinstein [Mon, 30 Oct 2023 18:12:03 +0000 (11:12 -0700)]
Merge pull request #53587 from ifed01/wip-ifed-vselector-53906-pac

pacific: bluestore: Fix problem with volume selector

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
20 months agoMerge pull request #52948 from ifed01/wip-ifed-fix-55260-pac
Yuri Weinstein [Mon, 30 Oct 2023 18:08:58 +0000 (11:08 -0700)]
Merge pull request #52948 from ifed01/wip-ifed-fix-55260-pac

pacific: os/bluestore: don't require bluestore_db_block_size when attaching new

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
20 months agotest/librbd/fsx: wait for resize to propagate in krbd_resize() 54256/head
Prasanna Kumar Kalever [Fri, 20 Oct 2023 10:11:05 +0000 (15:41 +0530)]
test/librbd/fsx: wait for resize to propagate in krbd_resize()

With this change, a resize request no longer blocks until the resize is
completed. Because of this, the fsx test fails, as it assumes that a
resize request immediately changes the device size.

Hence we have to add a wait in the resize handler of fsx for the device to
actually get resized.

Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
(cherry picked from commit 6f3d0f570f1a262b06d4c661582091d8ddb11bfa)
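
The "wait for the device to actually get resized" idea can be sketched as a bounded polling loop. This is a Python illustration only (fsx itself is C); wait_for_size, FakeDevice and the timeout values are hypothetical names and numbers chosen for the example.

```python
import time


def wait_for_size(get_size, expected, timeout=5.0, interval=0.01):
    """Poll get_size() until it returns expected or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if get_size() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeDevice:
    """Simulated device whose reported size changes only after a few polls."""

    def __init__(self, old, new, polls_until_resized):
        self.old = old
        self.new = new
        self.remaining = polls_until_resized

    def size(self):
        if self.remaining > 0:
            self.remaining -= 1
            return self.old
        return self.new


dev = FakeDevice(1024, 2048, polls_until_resized=3)
resized = wait_for_size(dev.size, 2048)
# A device that never changes size makes the wait give up after the timeout.
never = wait_for_size(lambda: 1024, 2048, timeout=0.05)
print(resized, never)
```

Bounding the wait with a deadline keeps a test from hanging forever if the resize never propagates.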

20 months agorbd-nbd: fix stuck with disable request
Prasanna Kumar Kalever [Tue, 12 Sep 2023 12:15:05 +0000 (17:45 +0530)]
rbd-nbd: fix stuck with disable request

Problem:
-------
Trying to disable any feature on an rbd image mapped with nbd causes rbd-nbd
to get stuck.

rbd-nbd registers a watcher callback, NBDWatchCtx::handle_notify(), to detect
image resize. handle_notify() calls the image info method, which calls
refresh_if_required(), and it gets stuck there.

It gets stuck in ImageState::refresh_if_required() because
DisableFeaturesRequest issues update notifications while still holding
the exclusive lock, with everything that depends on it blocked.

Solution:
--------
Set only a notify flag in NBDWatchCtx::handle_notify() and handle
the resize detection in a different thread.

Fixes: https://tracker.ceph.com/issues/58740
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
(cherry picked from commit dbb4daff404c5d2da32c33f4e852e84a257c0b8d)
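
The pattern described in the solution (the callback only records that a notification arrived, and a separate thread does the potentially blocking work) can be sketched in Python as follows. This is not the rbd-nbd implementation, which is C++; ResizeWatcher and its members are hypothetical names used for illustration.

```python
import threading


class ResizeWatcher:
    """Hypothetical sketch: defer work from a notify callback to a thread."""

    def __init__(self, get_size):
        self._get_size = get_size
        self._lock = threading.Lock()
        self._pending = False
        self._stop = False
        self._wakeup = threading.Event()
        self.seen_sizes = []
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def handle_notify(self):
        # Callback context: never block here, only record the notification.
        with self._lock:
            self._pending = True
        self._wakeup.set()

    def _run(self):
        while True:
            self._wakeup.wait()
            self._wakeup.clear()
            with self._lock:
                pending, self._pending = self._pending, False
                stop = self._stop
            if pending:
                # The possibly slow size query runs outside the callback.
                self.seen_sizes.append(self._get_size())
            if stop:
                return

    def shutdown(self):
        with self._lock:
            self._stop = True
        self._wakeup.set()
        self._thread.join()


watcher = ResizeWatcher(get_size=lambda: 4096)
watcher.handle_notify()
watcher.shutdown()
print(watcher.seen_sizes)
```

Because the flags are set before the event and read after it is cleared, a notification that arrives just before shutdown is still processed once.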

20 months agoMerge pull request #54053 from idryomov/wip-63028-pacific
Yuri Weinstein [Mon, 30 Oct 2023 15:22:43 +0000 (08:22 -0700)]
Merge pull request #54053 from idryomov/wip-63028-pacific

pacific: pybind/rbd: don't produce info on errors in aio_mirror_image_get_info()

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
20 months agoMerge pull request #54144 from rishabh-d-dave/wip-62879-pacific
Anthony D'Atri [Tue, 24 Oct 2023 23:51:40 +0000 (19:51 -0400)]
Merge pull request #54144 from rishabh-d-dave/wip-62879-pacific

pacific: cephfs: upgrade cephfs-shell's path wherever necessary

20 months agorgw: beast frontend checks for local_endpoint() errors 54167/head
Casey Bodley [Thu, 6 Oct 2022 17:22:35 +0000 (13:22 -0400)]
rgw: beast frontend checks for local_endpoint() errors

socket.local_endpoint() throws on error. use the error_code overload
instead and return on failure

Fixes: https://tracker.ceph.com/issues/57784
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 60af907c91210f60d0009318b8ca2ccd87941bb9)

20 months agoqa/tasks/ceph_manager: thrash - add reset_purged_snaps_last 53970/head
Matan Breizman [Thu, 31 Aug 2023 09:55:33 +0000 (09:55 +0000)]
qa/tasks/ceph_manager: thrash - add reset_purged_snaps_last

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 680e088b8d48e51dfb3aaa4207de67025a0fdabd)

20 months agoosd/OSD: introduce reset_purged_snaps_last
Matan Breizman [Thu, 21 Sep 2023 12:10:07 +0000 (12:10 +0000)]
osd/OSD: introduce reset_purged_snaps_last

When the OSD preboots, it sends an MMonGetPurgedSnaps message to
the monitor (`_get_purged_snaps`).
The monitor replies with all the purged snapshots whose purged_epoch_ is in the
range of superblock.purged_snaps_last + 1 up to the last superblock.current_epoch + 1.
When the OSD handles the reply from the mon (`handle_get_purged_snaps_reply`),
it calls `record_purged_snaps` to write those purged snapshots to the
OSD store as well (PSN_ keys).

Once purged_snaps_last is reset, on the following OSD reboot the snapshots that were marked as
purged (purged_snaps_ keys) in the mon's store will also be marked,
correspondingly, in the OSD store.
That way `scrub_purged_snaps` will be able to re-trim the snapshots that weren't
marked as purged on the OSD side (for some reason).

Fixes: https://tracker.ceph.com/issues/62981
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 120ed0f0e8f65c18bfcd1d649617770c2c5af663)
Manual conflict fixes: 'scrubdebug' command was removed since it's
                       not part of the original commit.

                       write_superblock() parameters were changed

20 months agoMerge pull request #53758 from cbodley/wip-63040-pacific
Yuri Weinstein [Mon, 23 Oct 2023 15:26:06 +0000 (08:26 -0700)]
Merge pull request #53758 from cbodley/wip-63040-pacific

pacific: [CVE-2023-43040] rgw: Fix bucket validation against POST policies

Reviewed-by: Casey Bodley <cbodley@redhat.com>
20 months agoMerge pull request #49477 from aaSharma14/wip-58299-pacific
Pedro Gonzalez Gomez [Mon, 23 Oct 2023 08:23:05 +0000 (10:23 +0200)]
Merge pull request #49477 from aaSharma14/wip-58299-pacific

pacific: mgr/dashboard: Fix CephPoolGrowthWarning alert

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
20 months agocephfs: upgrade cephfs-shell's path wherever necessary 54144/head
Rishabh Dave [Mon, 20 Feb 2023 04:47:10 +0000 (10:17 +0530)]
cephfs: upgrade cephfs-shell's path wherever necessary

Commit dc69033763cc116c6ccdf1f97149a74248691042 moves cephfs-shell from
"<CEPH-REPO-ROOT>/src/tools/cephfs/" to
"<CEPH-REPO-ROOT>/src/tools/cephfs/shell" but cephfs-shell's location in
src/vstart.sh and qa/tasks/cephfs/test_cephfs_shell.py is left
un-updated. This produces a broken vstart_environment.sh and broken
export command in test_cephfs_shell.py.

Introduced-by: dc69033763cc116c6ccdf1f97149a74248691042
Fixes: https://tracker.ceph.com/issues/58795
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 48ef0444774934dd6d0d3e026142d95e4098bebd)

 Conflicts:
qa/tasks/cephfs/test_cephfs_shell.py
- Comment present at the top of file was different in Pacific
  compared to main branch.

21 months agoMerge pull request #53808 from cfsnyder/wip-62945-pacific
Yuri Weinstein [Fri, 20 Oct 2023 14:53:53 +0000 (07:53 -0700)]
Merge pull request #53808 from cfsnyder/wip-62945-pacific

pacific: rgw: add radosgw-admin bucket check olh/unlinked commands

Reviewed-by: Casey Bodley <cbodley@redhat.com>
21 months agoMerge pull request #53784 from idryomov/wip-63010-pacific
Yuri Weinstein [Fri, 20 Oct 2023 14:53:04 +0000 (07:53 -0700)]
Merge pull request #53784 from idryomov/wip-63010-pacific

pacific: qa/suites/krbd: stress test for recovering from watch errors

Reviewed-by: Mykola Golub <mgolub@suse.com>
21 months agoMerge pull request #53562 from cfsnyder/wip-58787-pacific
Yuri Weinstein [Fri, 20 Oct 2023 14:51:43 +0000 (07:51 -0700)]
Merge pull request #53562 from cfsnyder/wip-58787-pacific

pacific: rgwlc: prevent lc for one bucket from exceeding time budget

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
21 months agoMerge pull request #53295 from ajarr/wip-62686-pacific
Yuri Weinstein [Fri, 20 Oct 2023 14:50:25 +0000 (07:50 -0700)]
Merge pull request #53295 from ajarr/wip-62686-pacific

pacific: librbd: kick ExclusiveLock state machine on client being blocklisted when waiting for lock

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
21 months agoMerge pull request #53274 from idryomov/wip-61707-pacific
Yuri Weinstein [Fri, 20 Oct 2023 14:49:22 +0000 (07:49 -0700)]
Merge pull request #53274 from idryomov/wip-61707-pacific

pacific: librbd: make CreatePrimaryRequest remove any unlinked mirror snapshots

Reviewed-by: Ramana Raja <rraja@redhat.com>
21 months agoqa: enhance test cases 54120/head
Dhairya Parmar [Thu, 12 Oct 2023 12:29:04 +0000 (17:59 +0530)]
qa: enhance test cases

Fixes: https://tracker.ceph.com/issues/63105
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit 9005451882371948359a1466fca10256476c5c37)

21 months agomds: erase clients getting evicted from laggy_clients
Dhairya Parmar [Wed, 11 Oct 2023 07:27:04 +0000 (12:57 +0530)]
mds: erase clients getting evicted from laggy_clients

Fixes: https://tracker.ceph.com/issues/63105
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit 754b6022fb9fda075d38dcb1d058482f75dcff4d)

21 months agomds: report clients laggy due laggy OSDs only after checking any OSD is laggy
Dhairya Parmar [Thu, 5 Oct 2023 12:11:38 +0000 (17:41 +0530)]
mds: report clients laggy due laggy OSDs only after checking any OSD is laggy

Fixes: https://tracker.ceph.com/issues/63105
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit 8a5677f956d1b18ebae22c27b690b83e82db13cc)

21 months agoMerge pull request #53978 from adk3798/wip-63115-pacific
Adam King [Tue, 17 Oct 2023 19:14:10 +0000 (15:14 -0400)]
Merge pull request #53978 from adk3798/wip-63115-pacific

pacific: mgr/cephadm: ceph orch add fails when ipv6 address is surrounded by square brackets.

Reviewed-by: John Mulligan <jmulligan@redhat.com>
21 months agoMerge pull request #53977 from adk3798/wip-62802-pacific
Adam King [Tue, 17 Oct 2023 19:13:25 +0000 (15:13 -0400)]
Merge pull request #53977 from adk3798/wip-62802-pacific

pacific: cephadm: run tcmu-runner through script to do restart on failure

Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
21 months agoMerge pull request #53975 from adk3798/wip-62469-pacific
Adam King [Tue, 17 Oct 2023 19:12:14 +0000 (15:12 -0400)]
Merge pull request #53975 from adk3798/wip-62469-pacific

pacific: cephadm: add tcmu-runner to logrotate config

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
21 months agoMerge pull request #53974 from adk3798/wip-62448-pacific
Adam King [Tue, 17 Oct 2023 19:11:24 +0000 (15:11 -0400)]
Merge pull request #53974 from adk3798/wip-62448-pacific

pacific: mgr/cephadm: Add "networks" parameter to orch apply rgw

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
21 months agoMerge pull request #53469 from adk3798/pacific-tcmu-custom-configs
Adam King [Tue, 17 Oct 2023 19:10:24 +0000 (15:10 -0400)]
Merge pull request #53469 from adk3798/pacific-tcmu-custom-configs

pacific: cephadm: make custom_configs work for tcmu-runner container

Reviewed-by: John Mulligan <jmulligan@redhat.com>
21 months agoMerge pull request #52413 from adk3798/wip-61686-pacific
Adam King [Tue, 17 Oct 2023 19:09:29 +0000 (15:09 -0400)]
Merge pull request #52413 from adk3798/wip-61686-pacific

pacific: python-common/drive_group: handle fields outside of 'spec' even when 'spec' is provided

Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
21 months agoMerge pull request #52412 from adk3798/wip-61683-pacific
Adam King [Tue, 17 Oct 2023 19:08:48 +0000 (15:08 -0400)]
Merge pull request #52412 from adk3798/wip-61683-pacific

pacific: python-common/drive_selection: lower log level of limit policy message

Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
21 months agoMerge pull request #52411 from adk3798/wip-61544-pacific
Adam King [Tue, 17 Oct 2023 19:07:30 +0000 (15:07 -0400)]
Merge pull request #52411 from adk3798/wip-61544-pacific

pacific: cephadm: Adding support to configure public_network cfg section

Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
21 months agoMerge pull request #52083 from adk3798/wip-61677-pacific
Adam King [Tue, 17 Oct 2023 19:06:04 +0000 (15:06 -0400)]
Merge pull request #52083 from adk3798/wip-61677-pacific

pacific: cephadm: allow ports to be opened in firewall during adoption, reconfig, redeploy

Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
21 months agocephadm: make custom_configs work for tcmu-runner container 53469/head
Adam King [Mon, 21 Aug 2023 17:48:56 +0000 (13:48 -0400)]
cephadm: make custom_configs work for tcmu-runner container

This is intended to be a temporary workaround so that
custom config files can be mounted into
the tcmu-runner container. The hope is to refactor
cephadm's iscsi handling for squid, but a patch
like this could be useful for iscsi in older
releases where custom config files are currently
unusable for the tcmu-runner container.

What this patch actually does is have us write the
custom config files to a dir for the tcmu-runner
container so that the rest of the logic works without
change. I thought this would be easier to remove later
than a patch that integrates more with the container
mounts or general deployment

The use case in mind is something like

service_type: iscsi
service_id: foo
service_name: iscsi.foo
placement:
  hosts:
  - host1
custom_configs:
  -  mount_path: /etc/tcmu/tcmu.conf
     content: |
       log_level = 4
spec:
  api_password: admin
  api_port: 5000
  api_user: admin
  pool: foo

which would allow users to modify the logging of the
tcmu-runner container for debugging purposes

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit de92392708bf456bba975cc18b3138035d79ae05)

21 months agomgr/rbd_support: make type hits on aio_mirror_image_*() callbacks better 54053/head
Ilya Dryomov [Thu, 12 Oct 2023 19:32:53 +0000 (21:32 +0200)]
mgr/rbd_support: make type hits on aio_mirror_image_*() callbacks better

Make it clear that mirror mode, mirror info and snap ID can be None if
the respective operation fails.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 01fff6a72a328459c1d153e5dc1de6a34e48a82f)

Conflicts:
src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py [ commit
  e4a16e261370 ("mgr/rbd_support: add type annotation") not in
  pacific ]

21 months agopybind/rbd: don't produce info on errors in aio_mirror_image_get_info()
Ilya Dryomov [Thu, 12 Oct 2023 17:03:10 +0000 (19:03 +0200)]
pybind/rbd: don't produce info on errors in aio_mirror_image_get_info()

Check completion return value before attempting to decode c_info.
Otherwise we are guaranteed to access invalid memory in decode_cstr()
while trying to compute global_id string length when the client is
blocklisted for example.

Fixes: https://tracker.ceph.com/issues/63028
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit a81bd2db3af4d7b53736be8e42a3eaa53028d60c)
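
The rule "check the return value before decoding the result" can be illustrated with a small Python sketch. It does not reproduce the pybind/rbd code: on_get_info_complete, the dictionary layout and the error value -1 are assumptions made for the example.

```python
def on_get_info_complete(retval, raw_info):
    """Hypothetical completion handler returning (retval, info_or_None)."""
    if retval < 0:
        # The operation failed, so raw_info holds nothing valid to decode.
        return retval, None
    info = {"global_id": raw_info["global_id"].decode("utf-8")}
    return retval, info


ok = on_get_info_complete(0, {"global_id": b"abc-123"})
# On failure the raw result is never touched, even if it is garbage or None.
failed = on_get_info_complete(-1, None)
print(ok, failed)
```

Returning None for the decoded value on error lets the caller distinguish a failed request from an empty result.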

21 months agoRadosGW API: incorrect bucket quota in response to HEAD /{bucket}/?usage 53439/head
shreyanshjain7174 [Mon, 11 Sep 2023 10:40:33 +0000 (06:40 -0400)]
RadosGW API: incorrect bucket quota in response to HEAD /{bucket}/?usage

Querying bucket usage via HEAD /{bucket}/?usage on the RGW API endpoint (for example with curl) does not return the updated information. The endpoint always returned the user quota rather than the actual bucket quota.

Fixes: https://tracker.ceph.com/issues/62737
Signed-off-by: shreyanshjain7174 <ssanchet@redhat.com>
(cherry picked from commit 78cd82b6e9f36a91f47d44ad2cfae89add335d4c)

Conflicts:
  - path: src/rgw/rgw_rest_s3.cc
    comment: resolve minor conflict

21 months agorgw: Drain async_processor request queue during shutdown 53472/head
root [Thu, 8 Sep 2022 16:21:50 +0000 (12:21 -0400)]
rgw: Drain async_processor request queue during shutdown

Drain outstanding requests from the async_processor before stopping
the sync threads to avoid any use-after-free of their local variables.

Fixes: https://tracker.ceph.com/issues/49666
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit 9b451763ff583f25c821aecf3446884c0fb95273)
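
The shutdown ordering described above (finish outstanding requests first, stop the threads afterwards) can be sketched with Python's queue module. This is only an analogy for the C++ async_processor; run_and_shutdown and the doubling "work" are hypothetical.

```python
import queue
import threading


def run_and_shutdown(items):
    """Process items on a worker thread, draining the queue before stopping."""
    requests = queue.Queue()
    results = []

    def worker():
        while True:
            item = requests.get()
            try:
                if item is None:
                    # Sentinel: stop only after all earlier requests are done.
                    return
                results.append(item * 2)
            finally:
                requests.task_done()

    thread = threading.Thread(target=worker)
    thread.start()
    for item in items:
        requests.put(item)

    # Drain: block until every queued request has been fully processed.
    requests.join()
    # Only now ask the worker to stop, then wait for it to exit.
    requests.put(None)
    thread.join()
    return results


drained = run_and_shutdown([1, 2, 3])
print(drained)
```

Draining before stopping means no request is left referencing state that the shutdown path is about to destroy.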

21 months agomds: remove calculating caps after adding revokes back 52852/head
Xiubo Li [Fri, 23 Jun 2023 14:44:23 +0000 (22:44 +0800)]
mds: remove calculating caps after adding revokes back

The calc_issued() makes no sense and will blindly set the 'issued'
to the 'pending', which is incorrect.

For the cap update msg it will pass the client's 'implemented' caps
to MDS, and MDS will use the 'implemented' to calculate the 'issued'
and 'pending' members and also will adjust the revoke list.

The confirm_receipt() has already correctly calculated the 'issued'
and 'pending' members. After adding the cap back to the revoke list
we should mark it notable, which will move the cap object to the
front of the session list.

Fixes: https://tracker.ceph.com/issues/61781
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit b6e2681ebd87ccbc964d0cb4758a26748d517fcf)

21 months agotest/libcephfs: add test case for revoking caps
Xiubo Li [Tue, 11 Oct 2022 04:53:17 +0000 (12:53 +0800)]
test/libcephfs: add test case for revoking caps

When writing to a file and approaching max_size, the client will try
to call check_caps() and flush the caps to the MDS. But if the MDS is
revoking Fsxrw caps at that moment, the client, which keeps writing
and holding the Fw caps, may release only part of the caps and not
Fw.

Fixes: https://tracker.ceph.com/issues/57244
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 3c63980b9d38aa935cf920512e129968c15b5aa9)

21 months agoclient: issue a cap release immediately if no cap exists
Xiubo Li [Tue, 16 May 2023 01:18:15 +0000 (09:18 +0800)]
client: issue a cap release immediately if no cap exists

In case:

           mds                             client
                                - Releases cap and put Inode
  - Increase cap->seq and sends
    revokes req to the client
  - Receives release req and    - Receives & drops the revoke req
    skip removing the cap and
    then eval the CInode and
    issue or revoke caps again.
                                - Receives & drops the caps update
                                  or revoke req
  - Health warning for client
    isn't responding to
    mclientcaps(revoke)

All the IMPORT/REVOKE/GRANT cap ops will increase the session seq
on the MDS side, so the client needs to issue a cap release to
unblock the MDS, letting it remove the corresponding cap and unblock
possible waiters.

Fixes: https://tracker.ceph.com/issues/57244
Fixes: https://tracker.ceph.com/issues/61148
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 7aaf4ba81b978db63b9cb11a90f881196530e5d5)

21 months agomds: add the revoking caps back to _revokes list
Xiubo Li [Thu, 2 Mar 2023 14:01:08 +0000 (22:01 +0800)]
mds: add the revoking caps back to _revokes list

When revoking caps from clients, the clients may release only some
of the cap references and still send a cap update request back to
the MDS, yet confirm_receipt() will clear the _revokes list anyway.

But this cap will still be kept in revoking_caps list.

At the same time add one debug log when revocation is not totally
finished.

Fixes: https://tracker.ceph.com/issues/57244
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 0454c449132abd1499d671990d680bd0794abc5a)

21 months agomds: move confirm_receipt() to Capability.cc
Xiubo Li [Thu, 2 Mar 2023 13:58:56 +0000 (21:58 +0800)]
mds: move confirm_receipt() to Capability.cc

Will add the debug logs later in confirm_receipt().

Fixes: https://tracker.ceph.com/issues/57244
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit f9aa17c90ee01b0f6df56af3fa76a80870ecf38a)

21 months agoPendingReleaseNotes: Note change to 'ceph config dump' pretty-print output. 53984/head
Sridhar Seshasayee [Tue, 29 Aug 2023 04:29:15 +0000 (09:59 +0530)]
PendingReleaseNotes: Note change to 'ceph config dump' pretty-print output.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 401b30f19f51e86f1471447a6af788b94e283ff0)
Conflicts:
    PendingReleaseNotes
- Remove unrelated release note related to Cephfs
- Move related release note under the new ">=16.2.15" section

21 months agomon/ConfigMonitor: Show localized name in "config dump --format json" output
Sridhar Seshasayee [Wed, 9 Aug 2023 12:52:29 +0000 (18:22 +0530)]
mon/ConfigMonitor: Show localized name in "config dump --format json" output

The "ceph config dump" command without the json formatted output shows
the localized option names and their values. An example of a normalized
vs localized option is shown below:

Normalized: mgr/dashboard/ssl_server_port (maintained within Option struct)
Localized: mgr/dashboard/x/ssl_server_port (maintained in mon store)

But the "ceph config dump --format json*" output showed the normalized
option names which was not consistent with the "config dump" output.
The output of the command along with variations for pretty printing must
show the same content.

This commit introduces a new member within the ConfigMap's MaskedOption
struct called "localized_name". This is initialized to the localized name
as part of ConfigMonitor::load_config() method.

The MaskedOption::dump() used for the json formatting is modified to
display the localized_name instead of the normalized name.

Fixes: https://tracker.ceph.com/issues/62379
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 3821722e5660437298a7c0f41e1061d363090103)
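
The change to the JSON dump described above can be pictured with a small Python model. This is a simplified, hypothetical analogue of the C++ MaskedOption struct: the "name" and "value" JSON keys, the dump_json helper, and any option value passed in are assumptions for illustration, not the monitor's actual output format.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class MaskedOption:
    """Hypothetical, simplified model of one config entry."""

    name: str            # normalized name, e.g. mgr/dashboard/ssl_server_port
    localized_name: str  # localized name, e.g. mgr/dashboard/x/ssl_server_port
    value: str

    def dump(self) -> dict:
        # Emit the localized name so the JSON output matches what the
        # plain "config dump" output shows.
        return {"name": self.localized_name, "value": self.value}


def dump_json(options: List[MaskedOption]) -> str:
    """Serialize all entries the way a JSON formatter might (assumed layout)."""
    return json.dumps([opt.dump() for opt in options], indent=2)
```

Storing the localized name alongside the normalized one at load time keeps dump() a pure lookup, with no need to reconstruct the localized form later.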

21 months agoceph orch add fails when ipv6 address is surrounded by square brackets. 53978/head
Teoman ONAY [Mon, 3 Jul 2023 14:00:20 +0000 (16:00 +0200)]
ceph orch add fails when ipv6 address is surrounded by square brackets.

fixes: https://tracker.ceph.com/issues/61885
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2153448

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit 1ea71bee6197ed0357b586498a43d9d726160a43)
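
The commit message does not describe the fix itself. One plausible way to tolerate a bracketed IPv6 address is to strip the brackets only when the enclosed text parses as IPv6, as in this minimal Python sketch. normalize_host_addr is a hypothetical helper name, not a function from the orchestrator code.

```python
import ipaddress


def normalize_host_addr(addr: str) -> str:
    """Return addr without surrounding brackets if it is a bracketed IPv6 address."""
    candidate = addr.strip()
    if candidate.startswith("[") and candidate.endswith("]"):
        inner = candidate[1:-1]
        try:
            # Only strip the brackets when the content really is IPv6.
            return str(ipaddress.IPv6Address(inner))
        except ipaddress.AddressValueError:
            return candidate
    return candidate
```

Anything that is not a bracketed IPv6 literal is passed through unchanged, so hostnames and IPv4 addresses are unaffected.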

21 months agocephadm: run tcmu-runner through script to do restart on failure 53977/head
Adam King [Tue, 13 Jun 2023 23:54:30 +0000 (19:54 -0400)]
cephadm: run tcmu-runner through script to do restart on failure

Currently, cephadm runs tcmu-runner as a background
process inside the unit file deployed for iscsi
(rbd-target-api is the primary process). This means
if tcmu-runner crashes for whatever reason, systemd
will not attempt to restart it. This commit sets
up a script to serve as the container entrypoint
for the tcmu-runner container that will run
tcmu-runner and also restart it on failure
(unless there are too many failures in a short
period, at which point it gives up).

The hope is to eventually drop use of this script
for a better solution in squid onward, but this
should be helpful on older releases (quincy and
pacific at least) where we won't be able to
bring that better solution.

Fixes: https://tracker.ceph.com/issues/61667
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 47eb6b3f62afe993073429b02051ae0343d7aea3)

Conflicts:
src/cephadm/cephadm
src/cephadm/tests/test_cephadm.py
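
The restart policy described above (restart on failure, give up after too many failures in a short period) can be sketched in Python. This is not the entrypoint script added by the commit: supervise, run_once, and the max_failures and window_seconds defaults are hypothetical and chosen only for illustration.

```python
import time
from collections import deque
from typing import Callable


def supervise(
    run_once: Callable[[], int],
    max_failures: int = 5,
    window_seconds: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Re-run run_once() after each failure; give up on a burst of failures.

    run_once() returns an exit status (0 means a clean exit).
    Returns the number of times run_once() was started.
    """
    recent: deque = deque()  # timestamps of failures inside the window
    starts = 0
    while True:
        starts += 1
        if run_once() == 0:
            return starts  # clean exit: nothing to restart
        now = clock()
        recent.append(now)
        # Forget failures that fell out of the sliding window.
        while recent and now - recent[0] > window_seconds:
            recent.popleft()
        if len(recent) >= max_failures:
            return starts  # too many failures in a short period
```

A sliding window lets an occasional crash be restarted indefinitely while still stopping a tight crash loop.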

21 months agocephadm: Fix extra_container_args for iSCSI
Raimund Sacherer [Fri, 26 May 2023 15:52:57 +0000 (17:52 +0200)]
cephadm: Fix extra_container_args for iSCSI

extra_container_args were only applied for the rbd_target_api container and
not for the tcmu-runner container.

Signed-off-by: Raimund Sacherer <rsachere@redhat.com>
(cherry picked from commit ad60fc3db644b8bf44a582e79888e2fb15d7ce3a)

Conflicts:
src/cephadm/cephadm

21 months agocephadm: add tcmu-runner to logrotate config 53975/head
Adam King [Fri, 2 Jun 2023 00:06:35 +0000 (20:06 -0400)]
cephadm: add tcmu-runner to logrotate config

This process could be used to set up the tcmu-runner
to log to a file much like other ceph daemons:

- create /etc/tcmu directory
- create /etc/tcmu/tcmu.conf file with default options
- change dir to /var/log
- change log level to 4
- add -v /etc/tcmu:/etc/tcmu to tcmu-runner container podman line in unit.run

In order to support this (mostly for debugging) we should
add tcmu-runner to the logrotate config

Fixes: https://tracker.ceph.com/issues/61571
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit d5d40e07cae8a1d6a94029c4354d146b0baa3971)

21 months agomgr/cephadm: Add "networks" parameter to orch apply rgw 53974/head
Teoman ONAY [Wed, 26 Jul 2023 14:40:11 +0000 (16:40 +0200)]
mgr/cephadm: Add "networks" parameter to orch apply rgw

This parameter is available in specs but not available as a parameter.
Having it will ease its use in cephadm-adopt playbook in ceph-ansible.

fixes: https://tracker.ceph.com/issues/62185

Signed-off-by: Teoman ONAY <tonay@ibm.com>
(cherry picked from commit 7f33397c76540dff8ed724caf2aa14ac94c73e03)

21 months agolibrbd/ManagedLock: kickstart ExclusiveLock state machine 53295/head
Ramana Raja [Mon, 2 Oct 2023 16:39:26 +0000 (12:39 -0400)]
librbd/ManagedLock: kickstart ExclusiveLock state machine

... that is stalled waiting for lock. Do this when trying to reacquire
lock in the ImageWatcher's rewatch mechanism. This would enable the
ExclusiveLock state machine to propagate the blocklist error to the
caller trying to perform an image operation requiring an exclusive
lock.

Previous attempt, e66db763, to fix the hang due to exclusive lock
acquisition (stuck waiting for lock) racing with client blocklisting
did not always work. e66db763 kickstarted the ExclusiveLock state
machine when the ImageWatcher tried to schedule an exclusive lock
request and the blocklisting was detected. However, there is a short
window between a watch getting deregistered and client blocklisting
getting detected as part of rewatching. If hit when trying to schedule
a lock request, the ExclusiveLock state machine wasn't kickstarted,
blocklist error wasn't propagated, and the hang resurfaced.

A more robust approach is taken to resume the ExclusiveLock state
machine stuck waiting for lock during client blocklisting. Whenever
a client's ImageWatcher loses connection to the cluster, as it happens
during blocklisting, the ImageWatcher initiates a mechanism to rewatch
the image and tries to reacquire the lock. Piggyback on this rewatch
mechanism that gets triggered during client blocklisting. And when
trying to reacquire the lock, kickstart the ExclusiveLock state
machine stalled waiting for lock (STATE_WAITING_FOR_LOCK).

Fixes: https://tracker.ceph.com/issues/63009
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit 18b018578cf8ac51a7e7a7d25f62d7bde345461a)

21 months agopybind/mgr/volumes: log mutex locks to help debug deadlocks 53916/head
Kotresh HR [Thu, 10 Aug 2023 10:02:23 +0000 (15:32 +0530)]
pybind/mgr/volumes: log mutex locks to help debug deadlocks

This patch adds logging for the mutex locks that commit
cf2a1ad65120 missed. Refer to [1] for more details.

[1] https://tracker.ceph.com/issues/49605#note-5
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 5e689f8792da712d1fcc61d07b135267aed7e3a8)

21 months agoMerge pull request #49478 from aaSharma14/wip-58301-pacific
Pedro Gonzalez Gomez [Tue, 10 Oct 2023 08:23:59 +0000 (10:23 +0200)]
Merge pull request #49478 from aaSharma14/wip-58301-pacific

pacific: mgr/dashboard: fix CephPGImbalance alert

Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
21 months agoMerge pull request #48524 from k0ste/wip-57887-pacific
Avan [Thu, 5 Oct 2023 09:35:07 +0000 (15:05 +0530)]
Merge pull request #48524 from k0ste/wip-57887-pacific

pacific: mgr/prometheus: avoid duplicates and deleted entries for rbd_stats_pools

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
21 months agorgw: fix output formatting of bucket index check admin api 53808/head
Cory Snyder [Mon, 25 Sep 2023 10:06:41 +0000 (10:06 +0000)]
rgw: fix output formatting of bucket index check admin api

The bucket index check admin API was previously returning invalid
JSON.

Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit 32fb6a1a68398a99324b2e64ebe3bcf3a9ccf02a)

21 months agorgw: fix radosgw-admin bucket check stat calculation bug
Cory Snyder [Fri, 22 Sep 2023 21:08:25 +0000 (21:08 +0000)]
rgw: fix radosgw-admin bucket check stat calculation bug

Fixes a regression with radosgw-admin bucket check stat
calculation and bucket reshard stat calculation when
there are objects that have transitioned from unversioned
to versioned. The bug was introduced in
152aadb71b61c53a4832a1c8cf82fce3d64b68d1.

Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit 4728daa5557bfb79a608dd903b8630e2b15fcb2c)

21 months agorgw: add test case to reproduce bucket check stats bug for versioned bucket
Cory Snyder [Fri, 22 Sep 2023 21:00:46 +0000 (21:00 +0000)]
rgw: add test case to reproduce bucket check stats bug for versioned bucket

Reproduces a regression where radosgw-admin bucket check incorrectly counts
objects that started as unversioned and later transitioned to versioned.

Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit 340522f9aed50d65137568c1f9dcf4b1e7945a79)

21 months agorgw: radosgw-admin bucket check should only print index entries with --check-objects...
Cory Snyder [Fri, 22 Sep 2023 08:35:16 +0000 (08:35 +0000)]
rgw: radosgw-admin bucket check should only print index entries with --check-objects flag

Printing all index entries can be very time consuming for large
buckets and the inability to switch this behavior off makes it
cumbersome to use the command for fixing bucket stats. This was
also preventing the command from outputting recalculated bucket
stats when the --fix flag wasn't specified.

Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit 6b057fe55413c0eaf9959f006584cba6cc4c192a)

Conflicts:
src/rgw/driver/rados/rgw_bucket.cc

Cherry-pick notes:
- Conflicts due to rgw_bucket.cc moved into driver dir in later versions

21 months agorgw: prevent another leftover bucket index olh entry scenario
Cory Snyder [Thu, 21 Sep 2023 19:27:51 +0000 (19:27 +0000)]
rgw: prevent another leftover bucket index olh entry scenario

If a call to bucket_index_link_olh or bucket_index_unlink_instance
fails, its associated pending xattr may have prevented the olh object
from being removed by another thread. We should do a best effort
cleanup attempt for this case by calling update_olh before returning
an error to the caller.

Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit 570adec5bb8142f5baf1f05f0040e8afdb11ec05)

Conflicts:
src/rgw/driver/rados/rgw_rados.cc

Cherry-pick notes:
- Conflicts due to rgw_rados file being moved into driver dir in later versions
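
The best-effort cleanup described above can be sketched in Python with errno-style return codes. index_op_with_cleanup and its two callbacks are hypothetical stand-ins for the index operation and the update_olh call; the sketch shows only the control flow, not the RGW implementation.

```python
from typing import Callable


def index_op_with_cleanup(
    index_op: Callable[[], int],
    update_olh: Callable[[], int],
) -> int:
    """Run an index operation; on failure, attempt cleanup before returning."""
    ret = index_op()
    if ret < 0:
        # Best effort: the cleanup's own result is deliberately ignored so
        # the caller still sees the original error.
        update_olh()
    return ret
```

The original error code is returned unchanged, so the cleanup attempt never masks the failure that triggered it.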

21 months agorgw: fix rgw versioned bucket stat accounting during reshard and check index
Cory Snyder [Thu, 7 Sep 2023 17:23:14 +0000 (17:23 +0000)]
rgw: fix rgw versioned bucket stat accounting during reshard and check index

Fixes: https://tracker.ceph.com/issues/62760
Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit 152aadb71b61c53a4832a1c8cf82fce3d64b68d1)

Conflicts:
src/cls/rgw/cls_rgw.cc

Cherry-pick notes:
- Some function ordering changed

21 months agoqa/workunits/rgw: add tests that reproduce bucket stats inconsistency bugs
Cory Snyder [Thu, 7 Sep 2023 14:43:23 +0000 (14:43 +0000)]
qa/workunits/rgw: add tests that reproduce bucket stats inconsistency bugs

Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit b79dcf640ac2cc3dacf1b87bbe351db823c445d0)