git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 months agoqa: ignore warnings variations 59309/head
Patrick Donnelly [Mon, 19 Aug 2024 13:04:18 +0000 (09:04 -0400)]
qa: ignore warnings variations

Fixes: https://tracker.ceph.com/issues/67601
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agoMerge PR #59171 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:20:54 +0000 (13:20 -0400)]
Merge PR #59171 into main

* refs/pull/59171/head:
client: use vectors for context lists

Reviewed-by: Venky Shankar <vshankar@redhat.com>
10 months agoMerge PR #59176 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:12:11 +0000 (13:12 -0400)]
Merge PR #59176 into main

* refs/pull/59176/head:
mds: encode quiesce payload on demand
mds: print quiesce message name in debug log

Reviewed-by: Venky Shankar <vshankar@redhat.com>
10 months agoMerge PR #58419 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:10:54 +0000 (13:10 -0400)]
Merge PR #58419 into main

* refs/pull/58419/head:
mds: generate correct path for unlinked snapped files
qa: add test for cephx path check on unlinked snapped dir tree
mds: add debugging for stray_prior_path

Reviewed-by: Milind Changire <mchangir@redhat.com>
10 months agoMerge PR #58987 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:10:10 +0000 (13:10 -0400)]
Merge PR #58987 into main

* refs/pull/58987/head:
qa/cephfs: update ignorelist

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agoMerge PR #59088 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:09:38 +0000 (13:09 -0400)]
Merge PR #59088 into main

* refs/pull/59088/head:
mds: add compile time checks for sortedness
mds: sort conf keys

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
10 months agoMerge PR #59095 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:09:11 +0000 (13:09 -0400)]
Merge PR #59095 into main

* refs/pull/59095/head:
qa: wait for file creation before changing mode

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agoMerge PR #59162 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:08:33 +0000 (13:08 -0400)]
Merge PR #59162 into main

* refs/pull/59162/head:
client: Prevent race condition when printing Inode in ll_sync_inode

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agoMerge PR #59173 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:07:37 +0000 (13:07 -0400)]
Merge PR #59173 into main

* refs/pull/59173/head:
mds: fix spelling typo

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
Reviewed-by: Neeraj Pratap Singh <neesingh@redhat.com>
10 months agoMerge pull request #59423 from idryomov/wip-67698
Ilya Dryomov [Tue, 27 Aug 2024 15:06:54 +0000 (17:06 +0200)]
Merge pull request #59423 from idryomov/wip-67698

rbd: "rbd bench" always writes the same byte

Reviewed-by: Mykola Golub <mgolub@suse.com>
10 months agoMerge pull request #59409 from adk3798/teuth-reinstall-nvme-cli
Adam King [Tue, 27 Aug 2024 12:48:26 +0000 (08:48 -0400)]
Merge pull request #59409 from adk3798/teuth-reinstall-nvme-cli

qa/distros: reinstall nvme-cli on centos 9 nodes

Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
10 months agoMerge pull request #57952 from NitzanMordhai/wip-nitzan-bench-osd-admin-command
Matan Breizman [Tue, 27 Aug 2024 10:03:02 +0000 (13:03 +0300)]
Merge pull request #57952 from NitzanMordhai/wip-nitzan-bench-osd-admin-command

crimson: Add support for bench osd command

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
10 months agoMerge pull request #59189 from xxhdx1985126/wip-67508
Matan Breizman [Tue, 27 Aug 2024 08:25:03 +0000 (11:25 +0300)]
Merge pull request #59189 from xxhdx1985126/wip-67508

crimson/osd/recovery_backend: restart object pulls that are blocked by down osds

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
10 months agoMerge pull request #59085 from VallariAg/update-default-nvmeof-img
Aviv Caro [Tue, 27 Aug 2024 08:20:17 +0000 (11:20 +0300)]
Merge pull request #59085 from VallariAg/update-default-nvmeof-img

mgr/cephadm: bump DEFAULT_NVMEOF_IMAGE to 1.2.17

10 months agoMerge pull request #59433 from idryomov/wip-drop-xmlstarlet-variable
Ilya Dryomov [Tue, 27 Aug 2024 06:53:38 +0000 (08:53 +0200)]
Merge pull request #59433 from idryomov/wip-drop-xmlstarlet-variable

qa: drop XMLSTARLET variable, use xmlstarlet directly

Reviewed-by: Ramana Raja <rraja@redhat.com>
10 months agocrimson/osd/pg: add logs for repeating pulls 59189/head
Xuehan Xu [Thu, 22 Aug 2024 09:54:02 +0000 (17:54 +0800)]
crimson/osd/pg: add logs for repeating pulls

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
10 months agoMerge pull request #58870 from rhcs-dashboard/fix-67194-main
afreen23 [Tue, 27 Aug 2024 02:31:38 +0000 (08:01 +0530)]
Merge pull request #58870 from rhcs-dashboard/fix-67194-main

mgr/dashboard: fix typo in Multi-Cluster > Manager Cluster to Manage Clusters

Reviewed-by: Afreen Misbah <afreen23.git@gmail.com>
10 months agoMerge pull request #59376 from rhcs-dashboard/Upgrade-page-scroll-issue
afreen23 [Tue, 27 Aug 2024 02:20:30 +0000 (07:50 +0530)]
Merge pull request #59376 from rhcs-dashboard/Upgrade-page-scroll-issue

mgr/dashboard: can't scroll to the end of the page

Reviewed-by: Afreen Misbah <afreen23.git@gmail.com>
10 months agocrimson/osd/recovery_backend: restart object pulling for recoveries that
Xuehan Xu [Tue, 13 Aug 2024 07:32:02 +0000 (15:32 +0800)]
crimson/osd/recovery_backend: restart object pulling for recoveries that
are blocked pulling from down osds

Fixes: https://tracker.ceph.com/issues/67508
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
10 months agocrimson/common/interruptible_future: new interruptor function `repeat_eagain`
Xuehan Xu [Tue, 13 Aug 2024 06:59:23 +0000 (14:59 +0800)]
crimson/common/interruptible_future: new interruptor function `repeat_eagain`

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
10 months agoMerge pull request #59332 from afreen23/nvmeof-group-mtls
afreen23 [Tue, 27 Aug 2024 01:27:02 +0000 (06:57 +0530)]
Merge pull request #59332 from afreen23/nvmeof-group-mtls

mgr/dashboard: Add group field in nvmeof service form

Reviewed-by: Afreen Misbah <afreen23.git@gmail.com>
10 months agoMerge pull request #59422 from cbodley/wip-67697
Casey Bodley [Mon, 26 Aug 2024 21:52:22 +0000 (17:52 -0400)]
Merge pull request #59422 from cbodley/wip-67697

rgw: ignore zoneless default realm when not configured

Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>
10 months agoMerge pull request #59227 from xxhdx1985126/wip-67564
Matan Breizman [Mon, 26 Aug 2024 15:19:58 +0000 (18:19 +0300)]
Merge pull request #59227 from xxhdx1985126/wip-67564

crimson/osd/pg: implement PG::PGLogEntryHandler::remove()

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
10 months agoMerge pull request #59117 from cbodley/wip-67468
Casey Bodley [Mon, 26 Aug 2024 15:12:27 +0000 (11:12 -0400)]
Merge pull request #59117 from cbodley/wip-67468

rgw/rados: zero-init shard_count in RGWBucket::check_index_unlinked()

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
10 months agoMerge pull request #59172 from clwluvw/enoent-loglevel
Casey Bodley [Mon, 26 Aug 2024 15:12:03 +0000 (11:12 -0400)]
Merge pull request #59172 from clwluvw/enoent-loglevel

rgw: increase log level for enoent caused by clients

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59243 from cbodley/wip-67522
Casey Bodley [Mon, 26 Aug 2024 15:04:38 +0000 (11:04 -0400)]
Merge pull request #59243 from cbodley/wip-67522

rgw/http: finish_request() after logging errors

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59329 from smanjara/wip-data-sync-full-initialize
Casey Bodley [Mon, 26 Aug 2024 15:04:24 +0000 (11:04 -0400)]
Merge pull request #59329 from smanjara/wip-data-sync-full-initialize

rgw/multisite: initialize sync_status in RGWDataFullSyncSingleEntryCR ctor

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #57956 from tobias-urdin/remove-keystone-v2
Casey Bodley [Mon, 26 Aug 2024 15:03:42 +0000 (11:03 -0400)]
Merge pull request #57956 from tobias-urdin/remove-keystone-v2

rgw/auth: Remove Keystone v2.0 API support

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agotest/rgw: include --rgw-realm/zonegroup/zone args for 'account create' 59422/head
Casey Bodley [Fri, 23 Aug 2024 19:55:44 +0000 (15:55 -0400)]
test/rgw: include --rgw-realm/zonegroup/zone args for 'account create'

Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 months agotest/rgw: test_multi.py creates realm with --default
Casey Bodley [Fri, 23 Aug 2024 19:54:18 +0000 (15:54 -0400)]
test/rgw: test_multi.py creates realm with --default

mstart.sh relies on default realm/zonegroup/zone configuration, because
it doesn't supply them to radosgw as config options

Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 months agorgw: ignore zoneless default realm when not configured
Casey Bodley [Fri, 23 Aug 2024 19:03:31 +0000 (15:03 -0400)]
rgw: ignore zoneless default realm when not configured

"default" zone/zonegroup deployments without a realm can be broken by
the creation of an unrelated realm, because that realm is (was)
automatically set as the default

when startup detects an incomplete default realm (one that doesn't have
a default zone), fall back to the realmless "default" zone/zonegroup
instead

Fixes: https://tracker.ceph.com/issues/67697
Signed-off-by: Casey Bodley <cbodley@redhat.com>
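
The fallback described in the commit message above can be sketched roughly as follows. This is an illustrative model only, not the rgw startup code: the function name `resolve_default` and the dictionary keys (`name`, `default_zonegroup`, `default_zone`) are assumptions made for the example.

```python
from typing import Optional


def resolve_default(default_realm: Optional[dict]) -> dict:
    """Pick the realm/zonegroup/zone to start with when no realm is configured.

    Hypothetical helper: a realm is treated as incomplete when it has no
    default zone, in which case the realmless "default" names are used.
    """
    if default_realm is None or not default_realm.get("default_zone"):
        # No default realm, or an incomplete one: fall back to the
        # realmless "default" zone/zonegroup.
        return {"realm": None, "zonegroup": "default", "zone": "default"}
    return {
        "realm": default_realm["name"],
        "zonegroup": default_realm.get("default_zonegroup"),
        "zone": default_realm["default_zone"],
    }


# An unrelated realm that was created without any zone no longer
# breaks a realmless "default" deployment in this model.
print(resolve_default({"name": "unrelated"}))
print(resolve_default({"name": "gold", "default_zonegroup": "us", "default_zone": "us-east"}))
```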
10 months agoradosgw-admin: add 'realm default rm' command
Casey Bodley [Fri, 23 Aug 2024 18:53:46 +0000 (14:53 -0400)]
radosgw-admin: add 'realm default rm' command

the 'realm default' command could only set a different realm as the
default, and provided no way to clear the default

Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59301 from xxhdx1985126/wip-67604
Matan Breizman [Mon, 26 Aug 2024 10:55:47 +0000 (13:55 +0300)]
Merge pull request #59301 from xxhdx1985126/wip-67604

crimson/common/tri_mutex: also wake up waiters when demoting

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
10 months agoMerge pull request #58136 from xxhdx1985126/wip-66372
Matan Breizman [Mon, 26 Aug 2024 10:50:10 +0000 (13:50 +0300)]
Merge pull request #58136 from xxhdx1985126/wip-66372

crimson/osd/osd: mark down connections to downed osds

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
10 months agoMerge pull request #54620 from rishabh-d-dave/mgr-vol-clone-stats
Venky Shankar [Mon, 26 Aug 2024 10:14:53 +0000 (15:44 +0530)]
Merge pull request #54620 from rishabh-d-dave/mgr-vol-clone-stats

mgr/vol: show progress and stats for the subvolume snapshot clones

Reviewed-by: Venky Shankar <vshankar@redhat.com>
10 months agoMerge pull request #59428 from zdover23/wip-doc-2024-08-26-cephadm-services-osd
Zac Dover [Mon, 26 Aug 2024 08:09:16 +0000 (18:09 +1000)]
Merge pull request #59428 from zdover23/wip-doc-2024-08-26-cephadm-services-osd

doc/cephadm: how to get exact size_spec from device

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
10 months agocrimson: Add support for bench osd command 57952/head
Nitzan Mordechai [Mon, 10 Jun 2024 10:51:03 +0000 (10:51 +0000)]
crimson: Add support for bench osd command

This commit adds support for the 'bench' admin command in the OSD,
allowing administrators to perform benchmark tests on the OSD. The
'bench' command accepts 4 optional parameters with the following
default values:

1. count - Total number of bytes to write (default: 1GB).
2. size - Block size for each write operation (default: 4MB).
3. object_size - Size of each object to write (default: 0).
4. object_num - Number of objects to write (default: 0).

The results of the benchmark are returned in a JSON formatted output,
which includes the following fields:

1. bytes_written - Total number of bytes written during the benchmark.
2. blocksize - Block size used for each write operation.
3. elapsed_sec - Total time taken to complete the benchmark in seconds.
4. bytes_per_sec - Write throughput in bytes per second.
5. iops - Number of input/output operations per second.

Example JSON output:

```json
{
  "osd_bench_results": {
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "elapsed_sec": 0.5,
    "bytes_per_sec": 2147483648,
    "iops": 512
  }
}
```

Fixes: https://tracker.ceph.com/issues/66380
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
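
As a sanity check on the example output above, the two derived fields can be recomputed from the raw counters. This is a minimal sketch based only on the field descriptions in the commit message. The helper `derive_bench_rates` is hypothetical and not part of Ceph, and it assumes that one I/O operation corresponds to one block-sized write.

```python
import json


def derive_bench_rates(bytes_written: int, blocksize: int, elapsed_sec: float) -> dict:
    """Recompute throughput and IOPS from the raw counters (hypothetical helper)."""
    bytes_per_sec = bytes_written / elapsed_sec
    # Assumption: each I/O operation writes exactly one block.
    iops = (bytes_written / blocksize) / elapsed_sec
    return {"bytes_per_sec": bytes_per_sec, "iops": iops}


example = json.loads("""
{
  "osd_bench_results": {
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "elapsed_sec": 0.5,
    "bytes_per_sec": 2147483648,
    "iops": 512
  }
}
""")

results = example["osd_bench_results"]
rates = derive_bench_rates(
    results["bytes_written"], results["blocksize"], results["elapsed_sec"]
)
print(rates)
```

Under that assumption the recomputed values agree with the `bytes_per_sec` and `iops` fields in the example: 1 GiB written in 0.5 s is 2 GiB/s, and 256 blocks of 4 MiB in 0.5 s is 512 operations per second.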
10 months agoMerge pull request #59392 from cyx1231st/wip-inplace-rewrite-comments
Yingxin [Mon, 26 Aug 2024 03:28:03 +0000 (11:28 +0800)]
Merge pull request #59392 from cyx1231st/wip-inplace-rewrite-comments

crimson/os/seastore: refine documents related to inplace rewrite

Reviewed-by: Myoungwon Oh <myoungwon.oh@samsung.com>
10 months agodoc/cephadm: how to get exact size_spec from device 59428/head
Zac Dover [Sun, 25 Aug 2024 20:03:34 +0000 (06:03 +1000)]
doc/cephadm: how to get exact size_spec from device

Add instructions for retrieving the exact size of block devices.

Fixes: https://tracker.ceph.com/issues/66754
Signed-off-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge pull request #59053 from baum/wip-baum-20240806-00
baum [Sun, 25 Aug 2024 18:10:46 +0000 (21:10 +0300)]
Merge pull request #59053 from baum/wip-baum-20240806-00

nvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons

10 months agoMerge pull request #58858 from ronen-fr/wip-rf-entry
Ronen Friedman [Sun, 25 Aug 2024 16:44:03 +0000 (19:44 +0300)]
Merge pull request #58858 from ronen-fr/wip-rf-entry

osd/scrub: a scrub queue of level-specific entries

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
10 months agotest/osd/scrub: fix searched-for log string 58858/head
Ronen Friedman [Sun, 25 Aug 2024 08:57:42 +0000 (03:57 -0500)]
test/osd/scrub: fix searched-for log string

To match the modified log message in
OsdScrub::restrictions_on_scrubbing().

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix missing 'const' on some formatters
Ronen Friedman [Sat, 24 Aug 2024 11:41:44 +0000 (06:41 -0500)]
osd/scrub: fix missing 'const' on some formatters

required to pass CI checks.

co-author: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agotest/osd/scrub: disable tests for deleted scrub functionality
Ronen Friedman [Sat, 24 Aug 2024 05:36:44 +0000 (00:36 -0500)]
test/osd/scrub: disable tests for deleted scrub functionality

The scrub scheduler no longer "upgrades" shallow scrubs into
deep ones on error, so the tests that check this functionality
are no longer valid.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agotest/osd: test new functionality added to the not-before queue
Ronen Friedman [Sun, 18 Aug 2024 17:33:38 +0000 (12:33 -0500)]
test/osd: test new functionality added to the not-before queue

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: delay both targets on some failures
Ronen Friedman [Sat, 17 Aug 2024 16:08:19 +0000 (11:08 -0500)]
osd/scrub: delay both targets on some failures

If the failure of a scrub-job is due to a condition that affects
both targets, both should be delayed. Otherwise, we may end up
with the following bogus scenario:

A high priority deep target is scheduled, but scrub session initiation
fails due to, for example, a concurrent snap trim. The deep target
will be delayed. A second initiation attempt may happen after the
snap trimming is done, but before the updated deep target not-before.
As a result - the lower priority target will be scheduled before the
higher priority one - which is a bug.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
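
The scenario in the commit message above can be reproduced with a small model. This is a sketch under stated assumptions, not the scrubber code: `Target`, `pick_ready`, `delay_on_failure` and the numeric urgency and time values are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Target:
    level: str         # "shallow" or "deep"
    urgency: int       # higher means more urgent (illustrative scale)
    not_before: float  # earliest time the target may be scheduled


def pick_ready(targets, now):
    """Return the most urgent target whose not_before has passed, or None."""
    ready = [t for t in targets if t.not_before <= now]
    return max(ready, key=lambda t: t.urgency, default=None)


def delay_on_failure(targets, failed, now, delay, affects_both):
    """Push back not_before for the failed target only, or for every target."""
    for t in targets:
        if affects_both or t is failed:
            t.not_before = now + delay


def simulate(affects_both):
    deep = Target("deep", urgency=2, not_before=0.0)
    shallow = Target("shallow", urgency=1, not_before=0.0)
    targets = [deep, shallow]
    # t=0: the deep target is selected, but initiation fails (e.g. snap trim).
    failed = pick_ready(targets, now=0.0)
    delay_on_failure(targets, failed, now=0.0, delay=10.0, affects_both=affects_both)
    # t=5: trimming is done, but the deep target's not_before has not passed yet.
    chosen = pick_ready(targets, now=5.0)
    return chosen.level if chosen else None


print(simulate(affects_both=False))  # the shallow target jumps ahead of the deep one
print(simulate(affects_both=True))   # nothing is scheduled until both are ready again
```

Delaying only the failed target lets the lower-priority shallow target run first at t=5. Delaying both keeps the relative order intact.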
10 months agoosd/scrub: reverse OSDRestrictions flags polarity
Ronen Friedman [Thu, 15 Aug 2024 13:17:48 +0000 (08:17 -0500)]
osd/scrub: reverse OSDRestrictions flags polarity

As most of the flags in OSDRestrictions are of 'true is bad' polarity,
reverse the two non-conforming flags - cpu load and time-of-day
restrictions - to match.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix the conditions for auto-repair scrubs
Ronen Friedman [Thu, 15 Aug 2024 12:51:15 +0000 (07:51 -0500)]
osd/scrub: fix the conditions for auto-repair scrubs

The conditions for auto-repair scrubs should have been changed
when need_auto lost some of its setters.

Also fix the rescheduling of repair scrubs
when the last scrub ended with errors.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove requested_scrub_t::deep_scrub_on_error
Ronen Friedman [Thu, 8 Aug 2024 13:49:57 +0000 (08:49 -0500)]
osd/scrub: remove requested_scrub_t::deep_scrub_on_error

This flag was used to indicate that a deep scrub should
be performed if a shallow scrub finds an error. It was
always set true for shallow, regular, scrubs - if
can_autorepair flag was set. Thus, the ephemeral flag in
the requested_scrub_t object is not really needed.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoqa/standalone/scrub: disable scrub_extended_sleep test
Ronen Friedman [Tue, 6 Aug 2024 13:07:17 +0000 (08:07 -0500)]
qa/standalone/scrub: disable scrub_extended_sleep test

Disabling osd-scrub-test.sh::TEST_scrub_extended_sleep,
as the test is no longer valid (updated code no longer
produces the same logs or the same behavior).

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove non-display usage of target's is_high_priority()
Ronen Friedman [Tue, 30 Jul 2024 12:12:54 +0000 (07:12 -0500)]
osd/scrub: remove non-display usage of target's is_high_priority()

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove 'calculated_to_deep' flag
Ronen Friedman [Mon, 29 Jul 2024 04:34:32 +0000 (23:34 -0500)]
osd/scrub: remove 'calculated_to_deep' flag

as once a sched-target was selected, we know the level of the scrub.
Also removed: the ephemeral 'time_for_deep' flag.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: modify after-repair-scrub triggering
Ronen Friedman [Sun, 28 Jul 2024 12:37:07 +0000 (07:37 -0500)]
osd/scrub: modify after-repair-scrub triggering

... to manipulate the relevant scrub target directly, instead
of using the 'planned scrub' flags.

The relevant condition flag was moved from the PG and into the scrubber.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix ReplicaReservations ctor to use correct query
Ronen Friedman [Sun, 28 Jul 2024 10:52:38 +0000 (05:52 -0500)]
osd/scrub: fix ReplicaReservations ctor to use correct query

when determining whether replica reservations are required.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix parameters validation on scrub start
Ronen Friedman [Sun, 28 Jul 2024 06:09:25 +0000 (01:09 -0500)]
osd/scrub: fix parameters validation on scrub start

... as the selected target already determines the
scrub level & type.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix reserve_local()
Ronen Friedman [Sun, 28 Jul 2024 10:20:38 +0000 (05:20 -0500)]
osd/scrub: fix reserve_local()

to use the correct method when determining whether we should
perform the reservation.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix initiation path of operator-commanded scrubs
Ronen Friedman [Sat, 27 Jul 2024 17:59:46 +0000 (12:59 -0500)]
osd/scrub: fix initiation path of operator-commanded scrubs

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon/not_before_queue: extending the container's API
Ronen Friedman [Tue, 30 Jul 2024 10:59:00 +0000 (05:59 -0500)]
common/not_before_queue: extending the container's API

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: OSD's scrub queue now holds SchedEntry-s
Ronen Friedman [Wed, 24 Jul 2024 07:02:46 +0000 (02:02 -0500)]
osd/scrub: OSD's scrub queue now holds SchedEntry-s

The OSD's scrub queue now holds SchedEntry-s, instead of ScrubJob-s.
The queue itself is implemented using the 'not_before_queue_t' class.

Note: this is not a stable state of the scrubber code. In the next
commits:
- modifying the way sched targets are modified and updated, to match the
  new queue implementation.
- removing the 'planned scrub' flags.

Important note: the interaction of initiate_scrub() and pop_ready_pg()
is not changed by this commit. Namely:

Currently - pop..() loops over all eligible jobs, until it finds one
that matches the environment restrictions (which most of the time, as the
concurrency limit is usually reached, would be 'high-priority-only').

The other option is to maintain Sam's 'not_before_q' clean interface: we
always pop the top, and if that top fails the preconds tests - we delay and
re-push. This has the following troubling implications:

- it would take a long time to find a viable scrub job, if the problem
  is related to, for example, 'no scrub'.
- local resources failure (inc_scrubs() failure) must be handled
  separately, as we do not want to reshuffle the queue for this
  very very common case.
- but the real problem: unneeded shuffling of the queue, even as the
  problem is not with the scrub job itself, but with the environment
  (esp. no-scrub etc.).
  This is a common case, and it would be wrong to reshuffle the queue
  for that.
- and - remember that any change to a sched-entry must be done under PG
  lock.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
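
The two dequeue strategies contrasted in the commit message above can be compared with a toy queue. This is a simplified sketch, not the `not_before_queue_t` API: entries are plain tuples, every entry is assumed to be ready, and the function names are hypothetical.

```python
import heapq

# Each entry is (not_before, pg_id, high_priority); all values are illustrative.
ENTRIES = [(1, "pg.a", False), (2, "pg.b", False), (3, "pg.c", True)]


def high_priority_only(entry):
    """Environment restriction: only high-priority jobs may start."""
    return entry[2]


def pop_first_eligible(queue, is_allowed):
    """Scan entries in order and remove the first one the environment allows.

    Entries that are skipped keep their position in the queue.
    """
    for entry in sorted(queue):
        if is_allowed(entry):
            queue.remove(entry)
            return entry
    return None


def pop_top_or_requeue(heap, is_allowed, delay):
    """Pop only the top entry; if it is not allowed, delay it and re-push it."""
    entry = heapq.heappop(heap)
    if is_allowed(entry):
        return entry
    not_before, pg_id, high = entry
    heapq.heappush(heap, (not_before + delay, pg_id, high))
    return None


# Strategy 1: loop over eligible jobs until one matches the restrictions.
scan_queue = list(ENTRIES)
found = pop_first_eligible(scan_queue, high_priority_only)

# Strategy 2: always pop the top, delay and re-push on failure.
heap = list(ENTRIES)
heapq.heapify(heap)
attempts = 0
popped = None
while popped is None:
    attempts += 1
    popped = pop_top_or_requeue(heap, high_priority_only, delay=10)

print(found, scan_queue)
print(popped, attempts, sorted(heap))
```

Both strategies end up selecting `pg.c`, but the second needs three attempts and rewrites the `not_before` of two entries that had nothing wrong with them, which is the reshuffling the commit message warns about.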
10 months agocommon/not_before_queue: move status_t out of container_t
Ronen Friedman [Tue, 30 Jul 2024 10:54:59 +0000 (05:54 -0500)]
common/not_before_queue: move status_t out of container_t

for readability

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon/not_before_queue: some spelling fixes
Ronen Friedman [Mon, 29 Jul 2024 03:58:22 +0000 (22:58 -0500)]
common/not_before_queue: some spelling fixes

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon: add not_before_queue_t
Samuel Just [Fri, 16 Dec 2022 18:30:18 +0000 (18:30 +0000)]
common: add not_before_queue_t

Signed-off-by: Samuel Just <sjust@redhat.com>
10 months agoosd/scrub: modify ScrubJob to hold two SchedTarget-s
Ronen Friedman [Fri, 12 Jul 2024 13:18:30 +0000 (08:18 -0500)]
osd/scrub: modify ScrubJob to hold two SchedTarget-s

ScrubJob will now hold two SchedTarget-s - two sets of scheduling
information (times, levels, etc.) for the next shallow and deep scrubs.

This is in preparation for the upcoming changes to the scheduling queue.
The change cannot stand on its own, as the partial implementation
creates some inconsistencies in the scheduling logic.

Specifically, here is what changes here, and how it differs from the
desired implementation:
- The OSD still maintains a queue of scrub jobs - one object only per
  PG.
  But now - each queue element holds two SchedTarget-s.
- When a scrub is initiated, the Scrubber is handed a ScrubJob object.
  Only in the next commit will it also receive the ID of the selected
  level. That causes some issues when re-determining the level of the
  initiated scrub. A failure to match the queue "intent" results in
  failures.
- the 'planned scrub' flags are still here, instead of directly
  encoding the characteristics of the next scrub in the relevant
  sched-entry.
- the 'urgency' levels do not cover the full required range of
  behaviors and priorities.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agonvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons 59053/head
Alexander Indenbaum [Mon, 5 Aug 2024 09:50:27 +0000 (09:50 +0000)]
nvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons

Add beacon_lock to mitigate potential beacon delays caused by slow message
handling, particularly in handle_nvmeof_gw_map.

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
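
The effect of a dedicated beacon lock can be illustrated with a toy client. This is a Python analogy under stated assumptions, not the C++ NVMeofGwMonitorClient: the class `GatewayClient` and its methods are invented for the example, and only the name `beacon_lock` comes from the commit message.

```python
import threading


class GatewayClient:
    """Toy stand-in for a monitor client (hypothetical, for illustration)."""

    def __init__(self):
        self.lock = threading.Lock()         # guards map handling
        self.beacon_lock = threading.Lock()  # guards beacon state only
        self.beacons_sent = 0

    def handle_map(self, started, release):
        with self.lock:
            started.set()   # signal that the slow handler holds the main lock
            release.wait()  # simulate slow message handling

    def send_beacon(self):
        # Beacons take only beacon_lock, so a slow handler cannot delay them.
        with self.beacon_lock:
            self.beacons_sent += 1


client = GatewayClient()
started, release = threading.Event(), threading.Event()
handler = threading.Thread(target=client.handle_map, args=(started, release))
handler.start()
started.wait()

# The handler still holds client.lock here, yet the beacon goes out.
client.send_beacon()
main_lock_busy = client.lock.locked()

release.set()
handler.join()
print(client.beacons_sent, main_lock_busy)
```

If `send_beacon` took `self.lock` instead, the call would block until the slow handler finished.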
10 months agoqa: drop XMLSTARLET variable, use xmlstarlet directly 59433/head
Ilya Dryomov [Sun, 25 Aug 2024 11:22:08 +0000 (13:22 +0200)]
qa: drop XMLSTARLET variable, use xmlstarlet directly

The variable was added in commit 9b6b7c35d03f ("Handle
differently-named xmlstarlet binary for *suse") but this
compatibility business is long outdated:

  Mon Oct 13 08:52:37 UTC 2014 - toms@opensuse.org

  - SPEC file changes
    - Added link from /usr/bin/xml to /usr/bin/xmlstarlet as other
      distributions do the same
    - Did the same for the manpage

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agorbd: "rbd bench" always writes the same byte 59423/head
Ilya Dryomov [Fri, 23 Aug 2024 21:00:24 +0000 (23:00 +0200)]
rbd: "rbd bench" always writes the same byte

It's expected that the buffer is filled with the same byte, but the
byte should differ from run to run:

    memset(bp.c_str(), rand() & 0xff, io_size);

This was broken in commit c7f71d14a5d3 ("rbd: migrated existing command
logic to new namespaces") which inadvertently moved the call to srand(),
leaving rand() unseeded for the above memset().

Fixes: https://tracker.ceph.com/issues/67698
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
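
The symptom described above follows from how C's `rand()` behaves without a prior `srand()` call: it produces the same sequence as if it had been seeded with 1. The sketch below models that in Python rather than reproducing the rbd code. The functions `fill_byte`, `run_unseeded` and `run_seeded` are hypothetical names, and Python's generator is only an analogy for the C library one.

```python
import random


def fill_byte(rng: random.Random) -> int:
    """Pick the byte used to fill the buffer (analogous to rand() & 0xff)."""
    return rng.randrange(256)


def run_unseeded() -> int:
    # Models the bug: with no srand() call, every process starts from the
    # same fixed seed, so the first value drawn is always the same.
    return fill_byte(random.Random(1))


def run_seeded(seed: int) -> int:
    # Models the fix: each run seeds the generator with a run-specific value.
    return fill_byte(random.Random(seed))


same = {run_unseeded() for _ in range(5)}
varied = {run_seeded(seed) for seed in range(50)}
print(len(same), len(varied) > 1)
```

Every "unseeded" run picks the identical fill byte, while per-run seeds yield different bytes across runs.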
10 months agorgw: realm create only sets default realm on --default
Casey Bodley [Fri, 23 Aug 2024 18:49:32 +0000 (14:49 -0400)]
rgw: realm create only sets default realm on --default

Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge PR #58487 into main
Venky Shankar [Fri, 23 Aug 2024 16:32:34 +0000 (22:02 +0530)]
Merge PR #58487 into main

* refs/pull/58487/head:
qa/suites/fs/workload: drop mgrmodules stanza
qa/tasks/ceph: fix "ceph mgr module enable" command

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
10 months agoMerge pull request #58336 from Svelar/uadk
Casey Bodley [Fri, 23 Aug 2024 14:32:47 +0000 (10:32 -0400)]
Merge pull request #58336 from Svelar/uadk

Compressor: add UADK support

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
10 months agoMerge pull request #59418 from zdover23/wip-doc-2024-08-23-glossary-object-storage
Anthony D'Atri [Fri, 23 Aug 2024 14:11:16 +0000 (10:11 -0400)]
Merge pull request #59418 from zdover23/wip-doc-2024-08-23-glossary-object-storage

doc/glossary: add "object storage"

10 months agoMerge pull request #59086 from phlogistonjohn/jjm-smb-ctdb-clustering
Adam King [Fri, 23 Aug 2024 13:06:33 +0000 (09:06 -0400)]
Merge pull request #59086 from phlogistonjohn/jjm-smb-ctdb-clustering

smb: ctdb clustering

Reviewed-by: Adam King <adking@redhat.com>
10 months agoMerge pull request #59175 from Yonatan-Zaken/fix_boolean_flags_handling_for_ceph_orch...
Adam King [Fri, 23 Aug 2024 12:54:21 +0000 (08:54 -0400)]
Merge pull request #59175 from Yonatan-Zaken/fix_boolean_flags_handling_for_ceph_orch_daemon_add_osd

mgr/orchestrator: fix encrypted flag handling in orch daemon add osd

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
10 months agodoc/glossary: add "object storage" 59418/head
Zac Dover [Fri, 23 Aug 2024 12:36:16 +0000 (22:36 +1000)]
doc/glossary: add "object storage"

Add a (very basic) definition of object storage.

Signed-off-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge PR #56602 into main
Venky Shankar [Fri, 23 Aug 2024 09:21:41 +0000 (14:51 +0530)]
Merge PR #56602 into main

* refs/pull/56602/head:
mds: always make getattr wait for xlock to be released by the previous client

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 months agocrimson/os/seastore: refine documents related to inplace rewrite 59392/head
Yingxin Cheng [Thu, 22 Aug 2024 02:34:47 +0000 (10:34 +0800)]
crimson/os/seastore: refine documents related to inplace rewrite

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
10 months agoMerge pull request #44470 from orozery/rbd-external-migrate
Ilya Dryomov [Fri, 23 Aug 2024 08:20:47 +0000 (10:20 +0200)]
Merge pull request #44470 from orozery/rbd-external-migrate

librbd/migration: add external clusters support

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
10 months agoMerge pull request #59393 from anthonyeleven/caps-man-caps
Zac Dover [Thu, 22 Aug 2024 22:08:06 +0000 (08:08 +1000)]
Merge pull request #59393 from anthonyeleven/caps-man-caps

doc/releases: Correct mimic.rst

Reviewed-by: Zac Dover <zac.dover@proton.me>
10 months agoqa/distros: reinstall nvme-cli on centos 9 nodes 59409/head
Adam King [Thu, 22 Aug 2024 17:53:38 +0000 (13:53 -0400)]
qa/distros: reinstall nvme-cli on centos 9 nodes

To work around a potential linking issue between
nvme-cli and libnvme that prevents nvme-cli from
correctly generating a hostnqn, causing

nvme_fabrics: found same hostid edb4e426-766f-44c6-b127-da2a5b7446ef but different hostnqn hostnqn

messages in dmesg and the inability to set up nvme
loop devices

Fixes: https://tracker.ceph.com/issues/67684
Signed-off-by: Adam King <adking@redhat.com>
10 months agoMerge pull request #59253 from clwluvw/copy-source-attrs
Casey Bodley [Thu, 22 Aug 2024 18:25:11 +0000 (14:25 -0400)]
Merge pull request #59253 from clwluvw/copy-source-attrs

rgw: load copy source bucket attrs in putobj

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59318 from adk3798/cephadm-osd-original-weight-param
Adam King [Thu, 22 Aug 2024 17:38:23 +0000 (13:38 -0400)]
Merge pull request #59318 from adk3798/cephadm-osd-original-weight-param

mgr/cephadm: add "original_weight" parameter to OSD class

Reviewed-by: John Mulligan <jmulligan@redhat.com>
10 months agoMerge pull request #59204 from tchaikov/wip-ceph-volume-deps
Guillaume Abrioux [Thu, 22 Aug 2024 13:54:53 +0000 (15:54 +0200)]
Merge pull request #59204 from tchaikov/wip-ceph-volume-deps

ceph-volume: add "packaging" to install_requires

10 months agoMerge pull request #58990 from Matan-B/wip-matanb-fmt-draft
Matan Breizman [Thu, 22 Aug 2024 11:53:31 +0000 (14:53 +0300)]
Merge pull request #58990 from Matan-B/wip-matanb-fmt-draft

fmt: bump up version + related changes

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Radosław Zarzyński <rzarzyns@redhat.com>
10 months agoqa/workunits/rbd: exercise snap_{name,id} parsing in test_import_native_format() 44470/head
Ilya Dryomov [Wed, 21 Aug 2024 19:16:30 +0000 (21:16 +0200)]
qa/workunits/rbd: exercise snap_{name,id} parsing in test_import_native_format()

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agodoc/rbd: clarify when image_id is expected for import-only migration
Ilya Dryomov [Sat, 17 Aug 2024 08:28:50 +0000 (10:28 +0200)]
doc/rbd: clarify when image_id is expected for import-only migration

"optional if image in trash" can be easily interpreted as "required if
image not in trash".

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agolibrbd/migration: add external clusters support
Ilya Dryomov [Fri, 16 Aug 2024 17:09:39 +0000 (19:09 +0200)]
librbd/migration: add external clusters support

This commit extends NativeFormat (aka migration where the migration
source is an RBD image) to support external Ceph clusters, limited to
import-only mode.

Co-authored-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agoMerge PR #55144 into main
Venky Shankar [Thu, 22 Aug 2024 09:24:13 +0000 (14:54 +0530)]
Merge PR #55144 into main

* refs/pull/55144/head:
client: fix file cache cap leak which can stall async read call
test/client: test contiguous read for a non-contiguous write

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
10 months agoMerge PR #56816 into main
Venky Shankar [Thu, 22 Aug 2024 09:22:33 +0000 (14:52 +0530)]
Merge PR #56816 into main

* refs/pull/56816/head:
doc: mention the peer status failed when snapshot created on the remote filesystem.
qa: add test_cephfs_mirror_remote_snap_corrupt_fails_synced_snapshot
cephfs_mirror: update peer status for invalid metadata in remote snapshot

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
10 months agoMerge PR #59166 into main
Venky Shankar [Thu, 22 Aug 2024 09:20:51 +0000 (14:50 +0530)]
Merge PR #59166 into main

* refs/pull/59166/head:
mon/thrasher: set stopping

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agodoc/releases: Correct mimic.rst 59393/head
Anthony D'Atri [Thu, 22 Aug 2024 03:55:34 +0000 (23:55 -0400)]
doc/releases: Correct mimic.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
10 months agoosd/scrub: introducing the concept of a SchedEntry
Ronen Friedman [Sun, 7 Jul 2024 17:46:25 +0000 (12:46 -0500)]
osd/scrub: introducing the concept of a SchedEntry

SchedEntry holds the scheduling details for scrubbing a specific PG at
a specific scrub level. Namely - it identifies the [pg,level]
combination, the 'urgency' attribute of the scheduled scrub
(which determines most of its behavior and scheduling decisions)
and the actual time attributes for scheduling (target,
deadline, not_before).

Added a table detailing, for each type of scrub, what limitations apply
to it, and what restrictions are waived.

The following commits will reshape the ScrubJob objects to hold
two instances of SchedTarget-s - two wrappers around SchedEntry-s,
one for the next shallow scrub and one for the next deep scrub.

Sched-entries (wrapped in sched-targets) have a defined order:

For ready-to-scrub entries (those whose not_before, or n.b., is in the past),
the order is first by urgency, then by target time (and then by
level - deep before shallow - and then by the n.b. itself).

'Future' entries are ordered by n.b., then urgency,
target time, and level.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
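
The ordering described in the commit message above can be sketched as a sort key. This is a minimal illustration in Python, not the OSD's actual C++ code. The field names follow the message, but the `Level` enum, the `sort_key` and `order` helpers, the numeric urgency scale (larger means more urgent), the rule that ready entries sort ahead of future ones, and treating "in the past" as `not_before <= now` are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, List, Tuple


class Level(IntEnum):
    """Scrub level; DEEP is ordered before SHALLOW when breaking ties."""
    SHALLOW = 0
    DEEP = 1


@dataclass(frozen=True)
class SchedEntry:
    pg: str             # placement group id, e.g. "1.a"
    level: Level
    urgency: int        # assumption: a larger value is more urgent
    target: float       # target time (seconds; any monotonic scale works)
    deadline: float
    not_before: float   # the "n.b." of the commit message


def sort_key(e: SchedEntry, now: float) -> Tuple:
    """Key that orders ready entries first, then future ones."""
    if e.not_before <= now:
        # Ready to scrub: urgency, target time, level (deep first), n.b.
        return (0, -e.urgency, e.target, -e.level, e.not_before)
    # Future entry: n.b., then urgency, target time, level (deep first).
    return (1, e.not_before, -e.urgency, e.target, -e.level)


def order(entries: Iterable[SchedEntry], now: float) -> List[SchedEntry]:
    """Return the entries in scheduling order at time `now`."""
    return sorted(entries, key=lambda e: sort_key(e, now))
```

Calling `order(entries, now)` re-evaluates readiness on every call, so an entry moves from the 'future' group to the ready group once `now` passes its `not_before`.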
10 months agoMerge pull request #59348 from zdover23/wip-doc-2024-08-20-rados-ops-cache-tiering
Zac Dover [Wed, 21 Aug 2024 11:26:54 +0000 (21:26 +1000)]
Merge pull request #59348 from zdover23/wip-doc-2024-08-20-rados-ops-cache-tiering

doc/rados: document unfound object cache-tiering scenario

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
10 months agomgr/dashboard: can't scroll to the end of the page 59376/head
Dnyaneshwari [Wed, 21 Aug 2024 07:31:43 +0000 (13:01 +0530)]
mgr/dashboard: can't scroll to the end of the page

Fixes: https://tracker.ceph.com/issues/67549
Signed-off-by: Dnyaneshwari Talwekar <dtalweka@redhat.com>
10 months agoMerge pull request #59362 from gbregman/main
Gil Bregman [Wed, 21 Aug 2024 05:46:29 +0000 (08:46 +0300)]
Merge pull request #59362 from gbregman/main

mgr/cephadm: change SPDK RPC fields in nvmeof configuration

10 months agoMerge pull request #59323 from yuvalif/wip-yuval-67514
Yuval Lifshitz [Wed, 21 Aug 2024 05:09:34 +0000 (08:09 +0300)]
Merge pull request #59323 from yuvalif/wip-yuval-67514

test/rgw/notifications: don't check for full queue if topics expired

Reviewed-by: Casey Bodley <cbodley@ibm.com>
10 months agoMerge pull request #54984 from NitzanMordhai/wip-nitzan-restful-un-boundary-keep...
NitzanMordhai [Tue, 20 Aug 2024 16:16:25 +0000 (19:16 +0300)]
Merge pull request #54984 from NitzanMordhai/wip-nitzan-restful-un-boundary-keep-requests

mgr/rest: Trim requests array and limit size

10 months agodoc: add clustering related items to smb docs 59086/head
John Mulligan [Wed, 14 Aug 2024 18:19:17 +0000 (14:19 -0400)]
doc: add clustering related items to smb docs

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agoqa/suites/orch: add a pair of teuthology tests for ctdb smb clusters
John Mulligan [Sat, 10 Aug 2024 18:42:16 +0000 (14:42 -0400)]
qa/suites/orch: add a pair of teuthology tests for ctdb smb clusters

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agoqa/suites/orch: old smb tests need placement count 1 to avoid using clustering
John Mulligan [Sat, 10 Aug 2024 16:49:24 +0000 (12:49 -0400)]
qa/suites/orch: old smb tests need placement count 1 to avoid using clustering

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agomgr/cephadm: enable the smb service to prevent stray ctdb services
John Mulligan [Mon, 12 Aug 2024 14:56:51 +0000 (10:56 -0400)]
mgr/cephadm: enable the smb service to prevent stray ctdb services

Tell cephadm that any `ctdb` services are "owned" by the smb service
and should not be treated as strays.
Ideally, we would do this on a per-service basis, but the info that the
ctdb lock helper provides to its registration function is pretty generic.
Future versions of samba may improve upon this.

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agomgr/cephadm: extend stray service detection with a general ignore hook
John Mulligan [Mon, 12 Aug 2024 14:56:36 +0000 (10:56 -0400)]
mgr/cephadm: extend stray service detection with a general ignore hook

Extend the system's current stray service detection with a new method on
the service classes so that new classes can hook into stray service
detection in the case that ceph services and cephadm services have
differing names or use subsystems that call into ceph with different
names (my use case).

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
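
The ignore hook described in the two cephadm commits above might look roughly like the following Python sketch. It is not the cephadm implementation. The class names, the `ignore_possible_stray` method name and signature, and the `find_strays` helper are hypothetical, chosen only to show how a per-class hook can veto a stray report for `ctdb` services.

```python
from typing import Iterable, List, Set, Tuple


class Service:
    """Hypothetical base service class: by default nothing is ignored."""

    def ignore_possible_stray(self, service_type: str, daemon_id: str,
                              name: str) -> bool:
        return False


class SmbService(Service):
    """Claims every `ctdb` service as its own, so none is flagged as stray."""

    def ignore_possible_stray(self, service_type: str, daemon_id: str,
                              name: str) -> bool:
        return service_type == "ctdb"


def find_strays(reported: Iterable[Tuple[str, str]],
                managed: Set[str],
                services: List[Service]) -> List[str]:
    """Return names that ceph reports but that nothing accounts for."""
    strays = []
    for service_type, daemon_id in reported:
        name = f"{service_type}.{daemon_id}"
        if name in managed:
            continue  # deployed and tracked by the orchestrator
        if any(s.ignore_possible_stray(service_type, daemon_id, name)
               for s in services):
            continue  # some service class owns this name
        strays.append(name)
    return strays
```

Because the check matches on the service type alone, every `ctdb` registration is ignored regardless of which smb cluster it belongs to, which mirrors the coarse, not-per-service behaviour the commit message acknowledges.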