git.apps.os.sepia.ceph.com Git - ceph.git/log
Prasanna Kumar Kalever [Fri, 22 Nov 2024 19:46:39 +0000 (01:16 +0530)]
cls/rbd: changes needed to align rest with the proposed ones
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Fri, 22 Nov 2024 16:26:20 +0000 (21:56 +0530)]
cls/rbd: proposed changes
Changes proposed by 'N Balachandran'
Signed-off-by: N Balachandran <nibalach@redhat.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Tue, 22 Oct 2024 07:15:46 +0000 (12:45 +0530)]
rbd-mirror: group replayer work in-progress changes
This commit is a WIP targeting test stability; it might be folded
into various existing commits.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Fri, 22 Nov 2024 07:12:19 +0000 (12:42 +0530)]
qa/workunits/rbd: mirror group tests improvements
Signed-off-by: N Balachandran <nibalach@redhat.com>
Signed-off-by: John Agombar <agombar@uk.ibm.com>
Signed-off-by: Adam Lyon-Jones <adamlyon@uk.ibm.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Tue, 3 Sep 2024 16:03:53 +0000 (21:33 +0530)]
rbd-mirror: move the rename detection logic to snapshot GroupReplayer
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Tue, 27 Aug 2024 07:03:21 +0000 (12:33 +0530)]
rbd-mirror: use group_header object for resync flagging
Also move the resync checks to snapshot GroupReplayer
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Mon, 2 Sep 2024 08:54:35 +0000 (14:24 +0530)]
rbd-mirror: rollback to last good snapshot just while force promote
fix the group membership to match the rollback snapshot
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Fri, 9 Aug 2024 06:33:12 +0000 (12:03 +0530)]
rbd-mirror: fix the below bugs
* fix braces in the imageMap update_images_added & update_images_removed
* do not allow image add from non-primary
* `down+unknown` status shown on querying individual images which are part
of group enabled for mirroring
* `mirror pool status` shows down+unknown status
* fix imageMap being overwritten when multiple images are enabled for mirroring
* fix misleading error msg when getting status of a non-mirror enabled group
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Mon, 5 Aug 2024 04:14:38 +0000 (09:44 +0530)]
rbd-mirror: prune snapshots added to the list on image_replayer shutdown
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Thu, 1 Aug 2024 13:41:52 +0000 (19:11 +0530)]
rbd-mirror: discover primary demote snapshot in group_replayer
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Sun, 21 Jul 2024 15:52:12 +0000 (21:22 +0530)]
rbd-mirror: remove group snaps on primary at snapshot creation
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Mon, 5 Aug 2024 06:35:26 +0000 (12:05 +0530)]
rbd-mirror: Independent GroupReplayer
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Fri, 12 Mar 2021 11:57:57 +0000 (11:57 +0000)]
qa/workunits/rbd: mirror group functional tests
Also sets the RBD_MIRROR_INSTANCES to 1 to avoid any deviations.
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Fri, 3 May 2024 08:16:53 +0000 (13:46 +0530)]
rbd-mirror: implement group resync functionality
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Mon, 22 Apr 2024 11:37:01 +0000 (17:07 +0530)]
rbd-mirror: support mirroring regular non-mirror group snapshots
This commit also enables deep copying `.group` snaps.
$ rbd --cluster site-a snap ls pool1/test_image1 --all --debug-rbd=0
SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
4 .group.2_10416b8b4567_104d6b8b4567 128 MiB Mon Apr 22 17:07:57 2024 group (pool1/test_group@group_snap1)
7 .group.2_10416b8b4567_104f6b8b4567 128 MiB Mon Apr 22 17:07:59 2024 group (pool1/test_group@group_snap2)
8 .mirror.primary.72855be4-1ffb-4094-8426-fb1d5f082c21.71b62dad-f515-43cf-91b2-bf1225c5a0fc 128 MiB Mon Apr 22 17:08:03 2024 mirror (primary peer_uuids:[5f9ea7aa-fa5b-4d2e-a098-3afe006361aa])
$ rbd --cluster site-b snap ls pool1/test_image1 --all --debug-rbd=0
SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
5 .group.2_10416b8b4567_104d6b8b4567 128 MiB Mon Apr 22 17:08:07 2024 group
7 .group.2_10416b8b4567_104f6b8b4567 128 MiB Mon Apr 22 17:08:08 2024 group
8 .mirror.non_primary.72855be4-1ffb-4094-8426-fb1d5f082c21.64488da7-3350-4d04-8e4d-90047182e004 128 MiB Mon Apr 22 17:08:09 2024 mirror (non-primary peer_uuids:[] 6d963873-6679-483b-b05a-8bf2536f4fdf:8 copied)
$ rbd --cluster site-a group snap ls pool1/test_group --debug-rbd=0
NAME STATUS
group_snap1 ok
group_snap2 ok
.mirror.2_10416b8b4567_10536b8b4567 ok
$ rbd --cluster site-b group snap ls pool1/test_group --debug-rbd=0
NAME STATUS
.mirror.2_10376b8b4567_1037327b23c6 ok
group_snap1 ok
group_snap2 ok
.mirror.2_10416b8b4567_10536b8b4567 ok
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Fri, 26 Apr 2024 15:33:28 +0000 (21:03 +0530)]
rbd-mirror: support rename with group mirroring
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Mon, 26 Feb 2024 08:53:28 +0000 (14:23 +0530)]
rbd-mirror: add undo code, exclusive locking and quiescing
* add essential logic to undo partially succeeded APIs such as group promote,
group demote, group enable, group disable, group image add and
group image remove
* add exclusive locking and quiescing within all the required group APIs
* address code duplication and optimization within the group APIs
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Prasanna Kumar Kalever [Wed, 14 Feb 2024 15:01:21 +0000 (20:31 +0530)]
rbd-mirror: address partial group snapshots case
Make sure a group snapshot doesn't get copied to the secondary if it is
incomplete on the primary. When creating a group snapshot on the primary,
make sure to delete the previous snapshot if it is incomplete.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
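The partial-snapshot rule in the commit above can be illustrated with a minimal sketch. This is not the rbd-mirror implementation; `GroupSnap`, `snaps_to_copy` and `create_group_snap` are hypothetical names, and a group snapshot is assumed to carry a simple complete/incomplete flag.

```python
from dataclasses import dataclass


@dataclass
class GroupSnap:
    """Hypothetical stand-in for a group snapshot record on the primary."""
    name: str
    complete: bool


def snaps_to_copy(primary_snaps, secondary_names):
    """Names of primary snapshots eligible for copying to the secondary.

    Incomplete snapshots are skipped, as are ones the secondary already has.
    """
    return [s.name for s in primary_snaps
            if s.complete and s.name not in secondary_names]


def create_group_snap(primary_snaps, name, complete=True):
    """Append a new snapshot, first dropping the previous one if incomplete."""
    if primary_snaps and not primary_snaps[-1].complete:
        primary_snaps.pop()
    primary_snaps.append(GroupSnap(name, complete))
```

In this sketch an incomplete snapshot never reaches the secondary, and it disappears from the primary as soon as the next snapshot is created.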
Mykola Golub [Thu, 18 Mar 2021 18:38:26 +0000 (18:38 +0000)]
rbd-mirror: request group snapshot creation when creating group image snapshot
This makes the group image replayers synchronize and ensures that the
group snapshot is created.
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Fri, 19 Mar 2021 13:09:35 +0000 (13:09 +0000)]
rbd-mirror: unlink group snapshot when pruning non-primary snapshot
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Sun, 17 Jan 2021 10:36:02 +0000 (10:36 +0000)]
rbd-mirror: initial group replayer implementation
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Tue, 5 Jan 2021 16:18:57 +0000 (16:18 +0000)]
rbd-mirror: hook GroupReplayer
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Thu, 21 Jan 2021 09:18:36 +0000 (09:18 +0000)]
rbd-mirror: make pool watcher also refresh groups
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Wed, 30 Dec 2020 16:37:31 +0000 (16:37 +0000)]
rbd-mirror: introduce generalized mirror entity (type, global_id, size)
where type may be image or group. Make pool watcher and image
mapper use it currently for images. The group support is coming.
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Fri, 25 Dec 2020 09:28:37 +0000 (09:28 +0000)]
rbd-mirror: remove dead code
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Sun, 13 Dec 2020 16:01:15 +0000 (16:01 +0000)]
rbd: add 'mirror group' commands
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Fri, 19 Mar 2021 14:37:17 +0000 (14:37 +0000)]
librbd: link group snapshot when creating non-primary snapshot
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Fri, 19 Feb 2021 17:57:06 +0000 (17:57 +0000)]
librbd: unlink group snapshot when removing mirror snapshot
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Wed, 27 Jan 2021 17:23:52 +0000 (17:23 +0000)]
librbd: allow to add image to group on creation
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Sat, 28 Nov 2020 10:15:26 +0000 (10:15 +0000)]
librbd: add mirror group API
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Fri, 18 Dec 2020 10:15:11 +0000 (10:15 +0000)]
librbd: don't send 'image updated' notifications
when enabling/disabling group mirroring
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Thu, 17 Dec 2020 12:22:57 +0000 (12:22 +0000)]
librbd: introduce 'group updated' mirroring watcher notification
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Wed, 27 Jan 2021 17:19:32 +0000 (17:19 +0000)]
librbd: introduce 'group add/remove image' async requests
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Fri, 23 Jul 2021 15:04:48 +0000 (16:04 +0100)]
librbd/api: assume user namespace for group snapshot
when the snapshot is specified by name only
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Thu, 18 Feb 2021 12:41:53 +0000 (12:41 +0000)]
cls/rbd: add method to unlink image snapshot from group snapshot
When no image snapshots are left the group snapshot will be
automatically removed.
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
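The behaviour described for the unlink method (the group snapshot is removed once its last image snapshot is unlinked) might look roughly like the sketch below. The data layout is an assumption made for illustration: a plain dict from a group snapshot ID to a set of `(image_id, snap_id)` pairs, which is not how cls/rbd stores it, and `unlink_image_snap` is a hypothetical name.

```python
def unlink_image_snap(group_snaps, group_snap_id, image_snap):
    """Unlink one image snapshot from a group snapshot.

    group_snaps maps group_snap_id -> set of (image_id, snap_id) pairs.
    Returns True if the group snapshot became empty and was removed.
    """
    members = group_snaps[group_snap_id]
    members.discard(image_snap)
    if not members:
        # No image snapshots left: drop the group snapshot itself.
        del group_snaps[group_snap_id]
        return True
    return False
```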
Mykola Golub [Tue, 19 Jan 2021 13:48:07 +0000 (13:48 +0000)]
cls/rbd: make possible mirror_image_list filter out group images
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Fri, 15 Jan 2021 16:23:37 +0000 (16:23 +0000)]
cls/rbd: add async versions of some group functions
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Mon, 7 Dec 2020 10:40:40 +0000 (10:40 +0000)]
cls/rbd: add group_spec and group_snap_id to mirror snapshot
which are going to be used when creating a group mirror snapshot.
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Sat, 28 Nov 2020 10:14:27 +0000 (10:14 +0000)]
cls/rbd: add mirror group types and methods
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Sun, 6 Dec 2020 13:12:39 +0000 (13:12 +0000)]
cls/rbd: add group snapshot namespace
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Mykola Golub [Wed, 23 Jun 2021 16:56:20 +0000 (17:56 +0100)]
cls/rbd: rename GroupSnapshotNamespace to GroupImageSnapshotNamespace
Signed-off-by: Mykola Golub <mgolub@suse.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Casey Bodley [Thu, 24 Apr 2025 15:35:48 +0000 (11:35 -0400)]
Merge pull request #62936 from cbodley/wip-doc-rgw-getobjattrs
doc/rgw: release note for GetObjectAttributes
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Pedro Gonzalez Gomez [Thu, 24 Apr 2025 15:26:18 +0000 (17:26 +0200)]
Merge pull request #62845 from rhcs-dashboard/fix-path
mgr/dashboard: fix smb edit resources
Reviewed-by: Afreen Misbah <afreen@ibm.com>
Casey Bodley [Thu, 24 Apr 2025 14:59:33 +0000 (10:59 -0400)]
Merge pull request #62715 from cbodley/wip-qa-rgw-no-gc
qa/rgw: run verify tests with garbage collection disabled
Reviewed-by: Jane Zhu <jzhu116@bloomberg.net>
Ilya Dryomov [Thu, 24 Apr 2025 14:36:46 +0000 (16:36 +0200)]
Merge pull request #62921 from idryomov/wip-71026
librbd: disallow "rbd trash mv" if image is in a group
Reviewed-by: Ramana Raja <rraja@redhat.com>
Max Kellermann [Thu, 24 Apr 2025 09:12:12 +0000 (11:12 +0200)]
Merge pull request #62941 from MaxKellermann/mds_Locker__abort
mds/Locker: use ceph_abort_msg() instead of ceph_assert()
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Radoslaw Zarzynski [Thu, 24 Apr 2025 06:17:51 +0000 (08:17 +0200)]
Merge pull request #59248 from kamoltat/wip-ksirivad-improve-netsplit-warning
HealthMonitor: Add topology-aware netsplit detection and warning
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Max Kellermann [Thu, 24 Apr 2025 05:17:48 +0000 (07:17 +0200)]
mds/Locker: use ceph_abort_msg() instead of ceph_assert()
This ceph_assert() always fails, but depending on the configuration
value `ceph_assert_supresssions`, execution may continue, but the
`dir` variable is left uninitialized. This leads to a compiler
warning:
/home/jenkins-build/build/workspace/ceph-api/src/mds/Locker.cc:451:22: error: variable 'dir' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
clang then suggests to nullptr-initialize the variable:
/home/jenkins-build/build/workspace/ceph-api/src/mds/Locker.cc:447:11: note: initialize the variable 'dir' to silence this warning
447 | CDir *dir;
| ^
| = nullptr
This, however, is a very bad idea because all this does is suppress
the warning; it still crashes the process.
Since there's no recovery from this problem, let's switch to
ceph_abort_msg() which is [[noreturn]] and the compiler can deduce
that `dir` is always initialized when it's used.
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Ronen Friedman [Thu, 24 Apr 2025 05:17:33 +0000 (08:17 +0300)]
Merge pull request #62693 from ronen-fr/wip-rf-iocnt
osd/scrub: performance counters for I/O performed by the scrubber
Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>
Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Ilya Dryomov [Wed, 23 Apr 2025 22:28:52 +0000 (00:28 +0200)]
Merge pull request #62898 from nbalacha/wip-nbalacha-70963
rbd: display mirror state creating
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Casey Bodley [Wed, 23 Apr 2025 22:28:16 +0000 (18:28 -0400)]
Merge pull request #60899 from clwluvw/curl-einval
rgw: handle EINVAL translation in forward_request
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Casey Bodley [Wed, 23 Apr 2025 20:42:08 +0000 (16:42 -0400)]
Merge pull request #62888 from clwluvw/neorados-fifotrim
neorados: relax fifo trim error for ENODATA
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Casey Bodley [Wed, 23 Apr 2025 19:06:19 +0000 (15:06 -0400)]
doc/rgw: release note for GetObjectAttributes
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Casey Bodley [Wed, 23 Apr 2025 18:46:59 +0000 (14:46 -0400)]
Merge pull request #62902 from cbodley/wip-70700-disable
cmake/common: temporarily remove decode_start_v_checker tests
Reviewed-by: Dan Mick <dmick@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Casey Bodley [Wed, 23 Apr 2025 18:02:16 +0000 (14:02 -0400)]
Merge pull request #60227 from clwluvw/zonegroup-delbucket
rgw: skip empty check on non-owned buckets by zonegroup
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Casey Bodley [Wed, 23 Apr 2025 18:00:56 +0000 (14:00 -0400)]
Merge pull request #62738 from clwluvw/copy-obj-remote-zonegroup
rgw: dont store replication attrs on remote copy obj
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Ronen Friedman [Tue, 15 Apr 2025 08:34:06 +0000 (03:34 -0500)]
osd/scrub: count scrub I/O
Implement I/O counting in the PGBackend::be_scan_list()
and relevant functions it calls.
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
Matan Breizman [Wed, 23 Apr 2025 15:35:28 +0000 (18:35 +0300)]
Merge pull request #62699 from Matan-B/wip-matanb-crimson-ignore-abort-v2
crimson/common/errorator: rework aborts error handlers
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Radoslaw Zarzynski [Wed, 23 Apr 2025 15:19:31 +0000 (17:19 +0200)]
Merge pull request #62556 from aainscow/ec_pr_and_prereqs
osd: Optimised EC
Reviewed-by: Radoslaw Zarzynski <rzarzynski@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
N Balachandran [Mon, 21 Apr 2025 11:34:08 +0000 (17:04 +0530)]
rbd: display correct mirror state when creating
The mirror image state is set to MIRROR_IMAGE_STATE_CREATING
when the image is first created on the secondary, but was displayed
as "unknown" by the rbd info command. This has been fixed.
Fixes: https://tracker.ceph.com/issues/70963
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
Laura Flores [Wed, 23 Apr 2025 15:06:56 +0000 (10:06 -0500)]
Merge pull request #62710 from bill-scales/ec_backfill
osd: EC Optimizations: Backfill changes for partial writes
Vallari Agrawal [Wed, 23 Apr 2025 13:17:12 +0000 (18:47 +0530)]
Merge pull request #62725 from VallariAg/nvmeof-teuthology-fio
qa/suites/nvmeof: Fix thrasher and fio script
Rishabh Dave [Wed, 23 Apr 2025 12:15:54 +0000 (17:45 +0530)]
Merge pull request #60731 from joscollin/wip-B68954-check-headers-journal-recovery
cephfs-journal-tool: check the headers in dump file after journal recovery
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
baum [Wed, 23 Apr 2025 10:36:16 +0000 (13:36 +0300)]
Merge pull request #62914 from baum/ms_dispatch2_clean_up
src/nvmeof/NVMeofGwMonitorClient.cc: ms_dispatch2 clean up
Zac Dover [Wed, 23 Apr 2025 09:27:00 +0000 (19:27 +1000)]
Merge pull request #62696 from anthonyeleven/mgr-prom
doc/mgr: Improve prometheus.rst
Reviewed-by: Zac Dover <zac.dover@proton.me>
Venky Shankar [Wed, 23 Apr 2025 09:16:03 +0000 (14:46 +0530)]
Merge PR #62577 into main
* refs/pull/62577/head:
libcephfs_proxy: avoid libc buffering for logging
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
Zac Dover [Wed, 23 Apr 2025 09:15:19 +0000 (19:15 +1000)]
Merge branch 'main' into mgr-prom
Signed-off-by: Zac Dover <zac.dover@proton.me>
Jos Collin [Tue, 11 Feb 2025 10:45:51 +0000 (16:15 +0530)]
qa: test 'journal import' recognizes invalid headers post journal recovery
Fixes: https://tracker.ceph.com/issues/68954
Signed-off-by: Jos Collin <jcollin@redhat.com>
Jos Collin [Thu, 14 Nov 2024 05:12:18 +0000 (10:42 +0530)]
cephfs-journal-tool: check the headers in dump file after journal recovery
Fixes: https://tracker.ceph.com/issues/68954
Signed-off-by: Jos Collin <jcollin@redhat.com>
Anthony D'Atri [Mon, 7 Apr 2025 03:03:53 +0000 (23:03 -0400)]
doc/mgr: Improve prometheus.rst
Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Anthony D'Atri [Wed, 23 Apr 2025 03:25:27 +0000 (23:25 -0400)]
Merge pull request #62911 from bluikko/doc-cleanup-radosgw
doc/radosgw: Fix indentation in admin.rst
Zac Dover [Tue, 22 Apr 2025 23:31:21 +0000 (09:31 +1000)]
Merge pull request #62896 from zdover23/wip-doc-2025-04-21-revert-62782-c4f0f8e
doc: Revert "doc/mgr: Promptify CLI commands and other formatting fixes"
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Kamoltat Sirivadhna [Fri, 23 Aug 2024 20:24:36 +0000 (20:24 +0000)]
doc/rados/operations/health-checks: Add MON_NETSPLIT Warning
Fixes: https://tracker.ceph.com/issues/67371
Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
Kamoltat Sirivadhna [Thu, 15 Aug 2024 20:25:43 +0000 (20:25 +0000)]
HealthMonitor: Add topology-aware netsplit detection and warning
Problem:
Currently, Ceph cannot detect and report network partitions (netsplits)
between monitors in different topology locations in a consolidated way.
While stretch mode can handle partitions through monitor elections,
users lack visibility into the topology-level view of network
disconnections, making troubleshooting difficult.
Solution:
This implementation adds a hierarchical netsplit detection mechanism that:
- Uses DirectedGraph structure for netsplit detection
- Maps monitor disconnections to relevant CRUSH topology levels
- Aggregates individual disconnections into location-level reports when appropriate
- Detects complete location-level netsplits when ALL monitors between locations
cannot communicate
- Reports specific topology locations experiencing complete communication failures
- Falls back to individual monitor-level reporting for partial disconnections
- Handles monitors with missing location data gracefully
- Leverages HealthMonitor::check_for_mon_down to receive a set of down monitors,
efficiently avoiding false netsplit reports for monitors already known to be down
- Implements smart filtering that correctly excludes down monitors from location-based
analysis, ensuring accurate netsplit reporting at both individual and topology levels
The implementation produces user-friendly health warnings:
1. For complete location netsplits: "Netsplit detected between dc1 and dc2"
2. For individual monitor disconnections: "Netsplit detected between mon.a and mon.d"
Performance considerations:
- Time complexity: O(m²) where m is the number of monitors
- Space complexity: O(m²) for connection tracking
- Practical impact is minimal as monitor count is typically small (3-7)
Fixes: https://tracker.ceph.com/issues/67371
Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
Conflicts:
src/mon/Elector.cc - Trivial Fix
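The aggregation rule described in the commit above (report a location-level netsplit only when ALL monitors between two locations cannot communicate, otherwise fall back to per-monitor reports) can be sketched as follows. This is an illustrative model, not the HealthMonitor code: `detect_netsplits` and its arguments are hypothetical, and a pair is treated as split if either direction is reported broken, which is an assumption.

```python
from itertools import combinations


def detect_netsplits(locations, disconnected, down=frozenset()):
    """Return netsplit warning strings.

    locations:    dict of monitor name -> location name (None if unknown)
    disconnected: set of directed (a, b) pairs meaning a cannot reach b
    down:         monitors already known to be down; they are ignored
    """
    live = sorted(m for m in locations if m not in down)

    def split(a, b):
        # Assumption: one broken direction is enough to count as a split.
        return (a, b) in disconnected or (b, a) in disconnected

    # Group live monitors by location; monitors without one stay ungrouped.
    by_loc = {}
    for m in live:
        loc = locations.get(m)
        if loc is not None:
            by_loc.setdefault(loc, []).append(m)

    warnings = []
    covered = set()  # monitor pairs already explained at location level
    for l1, l2 in combinations(sorted(by_loc), 2):
        pairs = [(a, b) for a in by_loc[l1] for b in by_loc[l2]]
        if all(split(a, b) for a, b in pairs):
            warnings.append(f"Netsplit detected between {l1} and {l2}")
            covered.update(frozenset(p) for p in pairs)

    # Partial disconnections fall back to individual monitor reports.
    for a, b in combinations(live, 2):
        if split(a, b) and frozenset((a, b)) not in covered:
            warnings.append(f"Netsplit detected between mon.{a} and mon.{b}")
    return warnings
```

Both loops visit each monitor pair at most once, which matches the O(m²) time bound quoted in the commit message.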
Laura Flores [Tue, 22 Apr 2025 21:05:44 +0000 (16:05 -0500)]
Merge pull request #62416 from kamoltat/wip-ksirivad-fix-connection-score
Ilya Dryomov [Wed, 16 Apr 2025 11:15:19 +0000 (13:15 +0200)]
librbd: disallow "rbd trash mv" if image is in a group
Removing an image that is a member of a group has always been
disallowed. However, moving an image that is a member of a group to
trash is currently allowed and this is deceptive -- the only reason for
a user to move an image to trash should be the intent to remove it.
More importantly, group APIs operate in terms of image names -- there
are no corresponding variants that would operate in terms of image IDs.
For example, even though internally GroupImageSpec struct stores an
image ID, the public rbd_group_image_info_t struct insists on an image
name. When rbd_group_image_list() encounters a trashed member image
(i.e. one that doesn't have a name), it just fails with ENOENT and no
listing gets produced at all until the offending image is restored from
trash. Something like this can be very hard to debug for an average
user, so let's make rbd_trash_move() fail with EMLINK the same way as
rbd_remove() does in this scenario.
The one case where moving a member image to trash makes sense is live
migration where the source image gets trashed to be almost immediately
replaced by the destination image as part of preparing migration.
Fixes: https://tracker.ceph.com/issues/71026
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Ilya Dryomov [Mon, 21 Apr 2025 15:11:17 +0000 (17:11 +0200)]
pybind/rbd: add ImageMemberOfGroup exception
EMLINK is returned by rbd_remove() if the image is a member of a group.
Add a dedicated exception similar to ImageBusy or ImageHasSnapshots and
a test for it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
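How a dedicated exception for EMLINK might be wired up is sketched below. This is a self-contained model, not the pybind/rbd source: `RbdError`, `raise_for_ret` and the toy `trash_move` are hypothetical, and the EBUSY and ENOTEMPTY pairings are assumptions of the sketch. Only the EMLINK case for a group member image comes from the commit messages above.

```python
import errno
import os


class RbdError(Exception):
    """Hypothetical base class for this sketch."""


class ImageBusy(RbdError):
    pass


class ImageHasSnapshots(RbdError):
    pass


class ImageMemberOfGroup(RbdError):
    pass


# errno -> exception class; anything unlisted falls back to RbdError.
ERRNO_TO_EXCEPTION = {
    errno.EBUSY: ImageBusy,                # assumed pairing
    errno.ENOTEMPTY: ImageHasSnapshots,    # assumed pairing
    errno.EMLINK: ImageMemberOfGroup,      # image is a member of a group
}


def raise_for_ret(ret, msg):
    """Raise the mapped exception for a negative errno-style return code."""
    if ret >= 0:
        return
    exc = ERRNO_TO_EXCEPTION.get(-ret, RbdError)
    raise exc(f"{msg}: {os.strerror(-ret)}")


def trash_move(image, group_members):
    """Toy stand-in: refuse to trash an image that belongs to a group."""
    ret = -errno.EMLINK if image in group_members else 0
    raise_for_ret(ret, f"error moving image {image!r} to trash")
```

A caller can then catch `ImageMemberOfGroup` specifically and tell the user to remove the image from its group first, instead of inspecting a raw errno.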
Ilya Dryomov [Mon, 21 Apr 2025 14:52:02 +0000 (16:52 +0200)]
rbd: don't print "image will expire at" message when trash_move() fails
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Rishabh Dave [Tue, 22 Apr 2025 15:51:52 +0000 (21:21 +0530)]
Merge pull request #61212 from rishabh-d-dave/mgr-vol-count-clones
mgr/vol: count number of ongoing clones in CloneProgressReporter...
Reviewed-by: Milind Changire <mchangir@redhat.com>
Max Kellermann [Tue, 22 Apr 2025 15:28:37 +0000 (17:28 +0200)]
Merge pull request #62870 from MaxKellermann/mds_includes
mds: include cleanup
Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Matan Breizman [Mon, 21 Apr 2025 14:11:58 +0000 (14:11 +0000)]
crimson: remove any assert_failure pre assert usages
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Mon, 21 Apr 2025 14:11:52 +0000 (14:11 +0000)]
crimson/common/errorator: cleanup assert_failure pre_assert
Any usage should be replaced with a message that supports printing the
error.
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Mon, 21 Apr 2025 14:11:47 +0000 (14:11 +0000)]
crimson/common/errorator: assert_failure to print error
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Mon, 21 Apr 2025 14:13:46 +0000 (14:13 +0000)]
crimson/osd: Verbose assert_all aborts
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Mon, 21 Apr 2025 14:13:41 +0000 (14:13 +0000)]
crimson/common/errorator: allow assert_all to accept c_str()
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Mon, 21 Apr 2025 14:13:38 +0000 (14:13 +0000)]
crimson/common/errorator: Cleanup assert_all pre_assert
Not used
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Mon, 21 Apr 2025 14:13:34 +0000 (14:13 +0000)]
crimson/common/errorator: cleanup ErrorT::handler call
Call error_t::handle directly instead of declaring a handler and invoking it later on.
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Mon, 21 Apr 2025 14:13:31 +0000 (14:13 +0000)]
crimson/ertr: assert_all informs about error being handled that way
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Matan Breizman [Mon, 7 Apr 2025 09:50:35 +0000 (09:50 +0000)]
test/crimson/test_errorator: ignore assert_all
This came up during: https://tracker.ceph.com/issues/69406#note-25
Where an "assert_all" was called but didn't cause an abort.
Added "ignore_assert_all" to showcase this scenario along with any
other case where we are expected to abort.
The tests could be used to verify errorator's aborting behavior.
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Thu, 10 Apr 2025 15:50:01 +0000 (15:50 +0000)]
crimson/common/errorator: print abort message when possible
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Thu, 10 Apr 2025 14:23:50 +0000 (14:23 +0000)]
crimson/common/errorator: Always check exception type
We shouldn't bypass this check in the is_same_v<return_t, no_touch_error_marker>
case.
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Thu, 10 Apr 2025 13:12:36 +0000 (13:12 +0000)]
crimson/common/errorator: add TODO
There are a few TODOs around the errorator code which might be worth looking
into: https://tracker.ceph.com/issues/70875
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Thu, 10 Apr 2025 10:07:08 +0000 (10:07 +0000)]
crimson/common/errorator: introduce take_exception_from_future
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Matan Breizman [Thu, 10 Apr 2025 09:48:38 +0000 (09:48 +0000)]
crimson/common/errorator: move exception_comment
move the comment to where __cxa_exception_type is used
to keep handle() comments shorter.
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Xuehan Xu [Tue, 8 Apr 2025 12:50:47 +0000 (20:50 +0800)]
crimson/common/errorator: fix skipped aborts
We should also invoke the errfunc (which aborts) when the
return type is no_touch_error_marker.
Added comments explaining:
* why it's forbidden to return void
* why std::is_same_v<return_t, no_touch_error_marker> is checked
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Rishabh Dave [Tue, 22 Apr 2025 15:06:34 +0000 (20:36 +0530)]
Merge pull request #62708 from rishabh-d-dave/vols-snap-path
mgr/vol: add command to get snapshot path
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Samuel Just [Tue, 22 Apr 2025 15:02:59 +0000 (08:02 -0700)]
Merge pull request #62837 from athanatos/sjust/wip-crimson-stuck-backfilling
crimson: fix several bugs causing stuck backfills
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Samuel Just [Tue, 22 Apr 2025 14:59:41 +0000 (07:59 -0700)]
Merge pull request #62619 from athanatos/sjust/wip-replica-read-crimson-mosdpct
crimson: add MOSDPGPCT support
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Alexander Indenbaum [Tue, 22 Apr 2025 10:20:02 +0000 (13:20 +0300)]
src/nvmeof/NVMeofGwMonitorClient.cc: ms_dispatch2 clean up
- return ACKNOWLEDGED/HANDLED
- remove registration for unwanted keys
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
Kefu Chai [Tue, 22 Apr 2025 13:27:46 +0000 (21:27 +0800)]
Merge pull request #62899 from tchaikov/cmake-build-boost
cmake: Fix b2 build with postfixed compiler versions
Reviewed-by: Matan Breizman <mbreizma@redhat.com>