git.apps.os.sepia.ceph.com Git - ceph.git/log
10 months agoqa/tasks/nvmeof.py: add nvmeof gw-group to deployment 59434/head
Vallari Agrawal [Mon, 26 Aug 2024 04:23:07 +0000 (09:53 +0530)]
qa/tasks/nvmeof.py: add nvmeof gw-group to deployment

The group was made a required parameter of
`ceph orch apply nvmeof <pool> <group>` in
https://github.com/ceph/ceph/pull/58860.
That broke the `nvmeof` suite, so this PR fixes it.

Right now, all gateways are deployed in a single group.
Later, this will be changed to use multiple groups for a better test.

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
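
As a rough illustration of the now-required group argument, the helper below builds the command line quoted in the message above. The function name `build_nvmeof_apply_cmd` and the example pool and group names are hypothetical; this is not the code in qa/tasks/nvmeof.py.

```python
def build_nvmeof_apply_cmd(pool: str, group: str) -> list[str]:
    """Build the argv for `ceph orch apply nvmeof <pool> <group>`.

    Both arguments are treated as mandatory, mirroring the change that
    made the gateway group a required parameter.
    """
    if not pool or not group:
        raise ValueError("both pool and group are required")
    return ["ceph", "orch", "apply", "nvmeof", pool, group]


# Hypothetical names: every gateway lands in the same single group.
cmd = build_nvmeof_apply_cmd("mypool", "mygroup0")
```
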
10 months agoMerge pull request #59053 from baum/wip-baum-20240806-00
baum [Sun, 25 Aug 2024 18:10:46 +0000 (21:10 +0300)]
Merge pull request #59053 from baum/wip-baum-20240806-00

nvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons

10 months agoMerge pull request #58858 from ronen-fr/wip-rf-entry
Ronen Friedman [Sun, 25 Aug 2024 16:44:03 +0000 (19:44 +0300)]
Merge pull request #58858 from ronen-fr/wip-rf-entry

osd/scrub: a scrub queue of level-specific entries

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
10 months agotest/osd/scrub: fix searched-for log string 58858/head
Ronen Friedman [Sun, 25 Aug 2024 08:57:42 +0000 (03:57 -0500)]
test/osd/scrub: fix searched-for log string

To match the modified log message in
OsdScrub::restrictions_on_scrubbing().

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix missing 'const' on some formatters
Ronen Friedman [Sat, 24 Aug 2024 11:41:44 +0000 (06:41 -0500)]
osd/scrub: fix missing 'const' on some formatters

required to pass CI checks.

co-author: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agotest/osd/scrub: disable tests for deleted scrub functionality
Ronen Friedman [Sat, 24 Aug 2024 05:36:44 +0000 (00:36 -0500)]
test/osd/scrub: disable tests for deleted scrub functionality

The scrub scheduler no longer "upgrades" shallow scrubs into
deep ones on error, so the tests that check this functionality
are no longer valid.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agotest/osd: test new functionality added to the not-before queue
Ronen Friedman [Sun, 18 Aug 2024 17:33:38 +0000 (12:33 -0500)]
test/osd: test new functionality added to the not-before queue

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: delay both targets on some failures
Ronen Friedman [Sat, 17 Aug 2024 16:08:19 +0000 (11:08 -0500)]
osd/scrub: delay both targets on some failures

If the failure of a scrub-job is due to a condition that affects
both targets, both should be delayed. Otherwise, we may end up
with the following bogus scenario:

A high priority deep target is scheduled, but scrub session initiation
fails due to, for example, a concurrent snap trim. The deep target
will be delayed. A second initiation attempt may happen after the
snap trimming is done, but before the updated deep target not-before.
As a result - the lower priority target will be scheduled before the
higher priority one - which is a bug.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
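
The scenario in this message can be modelled with a small sketch. It assumes two targets per PG, each with an urgency and a not-before time; `Target`, `delay_on_failure`, and `pick_ready` are hypothetical names used only for illustration, not the scrubber's actual types.

```python
from dataclasses import dataclass


@dataclass
class Target:
    level: str         # "shallow" or "deep"
    urgency: int       # assumption: a larger value means more urgent
    not_before: float  # earliest time the target may be scheduled


def delay_on_failure(failed, other, now, delay, affects_both):
    """Push not_before forward for the failed target, and for its sibling
    when the cause of the failure applies to the whole PG."""
    failed.not_before = max(failed.not_before, now + delay)
    if affects_both:
        other.not_before = max(other.not_before, now + delay)


def pick_ready(targets, now):
    """Return the most urgent target whose not_before has passed, if any."""
    ready = [t for t in targets if t.not_before <= now]
    return max(ready, key=lambda t: t.urgency, default=None)
```

If only the deep target is delayed, a second attempt made before its new not-before selects the shallow target, which is the inversion the message describes. Delaying both targets leaves nothing ready until the delay has passed, after which the deep target is picked first again.
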
10 months agoosd/scrub: reverse OSDRestrictions flags polarity
Ronen Friedman [Thu, 15 Aug 2024 13:17:48 +0000 (08:17 -0500)]
osd/scrub: reverse OSDRestrictions flags polarity

As most of the flags in OSDRestrictions are of 'true is bad' polarity,
reverse the two non-conforming flags - cpu load and time-of-day
restrictions - to match.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix the conditions for auto-repair scrubs
Ronen Friedman [Thu, 15 Aug 2024 12:51:15 +0000 (07:51 -0500)]
osd/scrub: fix the conditions for auto-repair scrubs

The conditions for auto-repair scrubs should have been changed
when need_auto lost some of its setters.

Also fix the rescheduling of repair scrubs
when the last scrub ended with errors.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove requested_scrub_t::deep_scrub_on_error
Ronen Friedman [Thu, 8 Aug 2024 13:49:57 +0000 (08:49 -0500)]
osd/scrub: remove requested_scrub_t::deep_scrub_on_error

This flag was used to indicate that a deep scrub should
be performed if a shallow scrub finds an error. It was
always set true for shallow, regular, scrubs - if
can_autorepair flag was set. Thus, the ephemeral flag in
the requested_scrub_t object is not really needed.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoqa/standalone/scrub: disable scrub_extended_sleep test
Ronen Friedman [Tue, 6 Aug 2024 13:07:17 +0000 (08:07 -0500)]
qa/standalone/scrub: disable scrub_extended_sleep test

Disabling osd-scrub-test.sh::TEST_scrub_extended_sleep,
as the test is no longer valid (updated code no longer
produces the same logs or the same behavior).

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove non-display usage of target's is_high_priority()
Ronen Friedman [Tue, 30 Jul 2024 12:12:54 +0000 (07:12 -0500)]
osd/scrub: remove non-display usage of target's is_high_priority()

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove 'calculated_to_deep' flag
Ronen Friedman [Mon, 29 Jul 2024 04:34:32 +0000 (23:34 -0500)]
osd/scrub: remove 'calculated_to_deep' flag

as once a sched-target was selected, we know the level of the scrub.
Also removed: the ephemeral 'time_for_deep' flag.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: modify after-repair-scrub triggering
Ronen Friedman [Sun, 28 Jul 2024 12:37:07 +0000 (07:37 -0500)]
osd/scrub: modify after-repair-scrub triggering

... to manipulate the relevant scrub target directly, instead
of using the 'planned scrub' flags.

The relevant condition flag was moved from the PG and into the scrubber.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix ReplicaReservations ctor to use correct query
Ronen Friedman [Sun, 28 Jul 2024 10:52:38 +0000 (05:52 -0500)]
osd/scrub: fix ReplicaReservations ctor to use correct query

when determining whether replica reservations are required.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix parameters validation on scrub start
Ronen Friedman [Sun, 28 Jul 2024 06:09:25 +0000 (01:09 -0500)]
osd/scrub: fix parameters validation on scrub start

... as the selected target already determines the
scrub level & type.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix reserve_local()
Ronen Friedman [Sun, 28 Jul 2024 10:20:38 +0000 (05:20 -0500)]
osd/scrub: fix reserve_local()

to use the correct method when determining whether we should
perform the reservation.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix initiation path of operator-commanded scrubs
Ronen Friedman [Sat, 27 Jul 2024 17:59:46 +0000 (12:59 -0500)]
osd/scrub: fix initiation path of operator-commanded scrubs

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon/not_before_queue: extending the container's API
Ronen Friedman [Tue, 30 Jul 2024 10:59:00 +0000 (05:59 -0500)]
common/not_before_queue: extending the container's API

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: OSD's scrub queue now holds SchedEntry-s
Ronen Friedman [Wed, 24 Jul 2024 07:02:46 +0000 (02:02 -0500)]
osd/scrub: OSD's scrub queue now holds SchedEntry-s

The OSD's scrub queue now holds SchedEntry-s, instead of ScrubJob-s.
The queue itself is implemented using the 'not_before_queue_t' class.

Note: this is not a stable state of the scrubber code. In the next
commits:
- modifying the way sched targets are modified and updated, to match the
  new queue implementation.
- removing the 'planned scrub' flags.

Important note: the interaction of initiate_scrub() and pop_ready_pg()
is not changed by this commit. Namely:

Currently - pop..() loops over all eligible jobs, until it finds one
that matches the environment restrictions (which most of the time, as the
concurrency limit is usually reached, would be 'high-priority-only').

The other option is to maintain Sam's 'not_before_q' clean interface: we
always pop the top, and if that top fails the preconds tests - we delay and
re-push. This has the following troubling implications:

- it would take a long time to find a viable scrub job, if the problem
  is related to, for example, 'no scrub'.
- local resources failure (inc_scrubs() failure) must be handled
  separately, as we do not want to reshuffle the queue for this
  very very common case.
- but the real problem: unneeded shuffling of the queue, even as the
  problem is not with the scrub job itself, but with the environment
  (esp. no-scrub etc.).
  This is a common case, and it would be wrong to reshuffle the queue
  for that.
- and - remember that any change to a sched-entry must be done under PG
  lock.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
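
The two dequeue strategies contrasted in this message can be sketched as follows. This is a simplified model that assumes queue entries are `(not_before, pg_name)` tuples in a binary heap; it does not reproduce the `not_before_queue_t` interface, and both function names are hypothetical.

```python
import heapq


def pop_first_eligible(queue, is_allowed):
    """Scan entries in queue order and remove the first one that the current
    environment restrictions allow. Skipped entries keep their position."""
    for entry in sorted(queue):
        if is_allowed(entry):
            queue.remove(entry)
            heapq.heapify(queue)  # restore the heap invariant after removal
            return entry
    return None


def pop_top_or_delay(queue, is_allowed, delay):
    """Always pop the top entry. If it fails the precondition test, delay it
    and push it back, which reshuffles the queue."""
    if not queue:
        return None
    not_before, name = heapq.heappop(queue)
    if is_allowed((not_before, name)):
        return (not_before, name)
    heapq.heappush(queue, (not_before + delay, name))
    return None
```

With the second strategy, an environment-level restriction such as no-scrub changes the stored not-before of an entry that had nothing wrong with it, which is the unneeded shuffling the message warns about.
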
10 months agocommon/not_before_queue: move status_t out of container_t
Ronen Friedman [Tue, 30 Jul 2024 10:54:59 +0000 (05:54 -0500)]
common/not_before_queue: move status_t out of container_t

for readability

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon/not_before_queue: some spelling fixes
Ronen Friedman [Mon, 29 Jul 2024 03:58:22 +0000 (22:58 -0500)]
common/not_before_queue: some spelling fixes

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon: add not_before_queue_t
Samuel Just [Fri, 16 Dec 2022 18:30:18 +0000 (18:30 +0000)]
common: add not_before_queue_t

Signed-off-by: Samuel Just <sjust@redhat.com>
10 months agoosd/scrub: modify ScrubJob to hold two SchedTarget-s
Ronen Friedman [Fri, 12 Jul 2024 13:18:30 +0000 (08:18 -0500)]
osd/scrub: modify ScrubJob to hold two SchedTarget-s

ScrubJob will now hold two SchedTarget-s - two sets of scheduling
information (times, levels, etc.) for the next shallow and deep scrubs.

This is in preparation for the upcoming changes to the scheduling queue.
The change cannot stand on its own, as the partial implementation
creates some inconsistencies in the scheduling logic.

Specifically, here is what changes here, and how it differs from the
desired implementation:
- The OSD still maintains a queue of scrub jobs - one object only per
  PG.
  But now - each queue element holds two SchedTarget-s.
- When a scrub is initiated, the Scrubber is handed a ScrubJob object.
  Only in the next commit will it also receive the ID of the selected
  level. That causes some issues when re-determining the level of the
  initiated scrub. A failure to match the queue "intent" results in
  failures.
- the 'planned scrub' flags are still here, instead of directly
  encoding the characteristics of the next scrub in the relevant
  sched-entry.
- the 'urgency' levels do not cover the full required range of
  behaviors and priorities.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agonvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons 59053/head
Alexander Indenbaum [Mon, 5 Aug 2024 09:50:27 +0000 (09:50 +0000)]
nvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons

Add beacon_lock to mitigate potential beacon delays caused by slow message
handling, particularly in handle_nvmeof_gw_map.

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
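
A minimal sketch of the idea, written in Python rather than the C++ of the actual client: beacon state gets its own lock, so a slow map handler holding the general lock cannot delay a beacon. The class and method names are hypothetical; only `beacon_lock` is taken from the message.

```python
import threading


class GwMonitorClientSketch:
    def __init__(self):
        self.lock = threading.Lock()         # guards map handling
        self.beacon_lock = threading.Lock()  # guards beacon state only
        self.beacons_sent = 0

    def handle_gw_map(self, work):
        """Run potentially slow map handling under the general lock."""
        with self.lock:
            work()

    def send_beacon(self):
        """Send a beacon without waiting for the general lock."""
        with self.beacon_lock:
            self.beacons_sent += 1
```
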
10 months agoMerge PR #58487 into main
Venky Shankar [Fri, 23 Aug 2024 16:32:34 +0000 (22:02 +0530)]
Merge PR #58487 into main

* refs/pull/58487/head:
qa/suites/fs/workload: drop mgrmodules stanza
qa/tasks/ceph: fix "ceph mgr module enable" command

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
10 months agoMerge pull request #58336 from Svelar/uadk
Casey Bodley [Fri, 23 Aug 2024 14:32:47 +0000 (10:32 -0400)]
Merge pull request #58336 from Svelar/uadk

Compressor: add UADK support

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
10 months agoMerge pull request #59418 from zdover23/wip-doc-2024-08-23-glossary-object-storage
Anthony D'Atri [Fri, 23 Aug 2024 14:11:16 +0000 (10:11 -0400)]
Merge pull request #59418 from zdover23/wip-doc-2024-08-23-glossary-object-storage

doc/glossary: add "object storage"

10 months agoMerge pull request #59086 from phlogistonjohn/jjm-smb-ctdb-clustering
Adam King [Fri, 23 Aug 2024 13:06:33 +0000 (09:06 -0400)]
Merge pull request #59086 from phlogistonjohn/jjm-smb-ctdb-clustering

smb: ctdb clustering

Reviewed-by: Adam King <adking@redhat.com>
10 months agoMerge pull request #59175 from Yonatan-Zaken/fix_boolean_flags_handling_for_ceph_orch...
Adam King [Fri, 23 Aug 2024 12:54:21 +0000 (08:54 -0400)]
Merge pull request #59175 from Yonatan-Zaken/fix_boolean_flags_handling_for_ceph_orch_daemon_add_osd

mgr/orchestrator: fix encrypted flag handling in orch daemon add osd

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
10 months agodoc/glossary: add "object storage" 59418/head
Zac Dover [Fri, 23 Aug 2024 12:36:16 +0000 (22:36 +1000)]
doc/glossary: add "object storage"

Add a (very basic) definition of object storage.

Signed-off-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge PR #56602 into main
Venky Shankar [Fri, 23 Aug 2024 09:21:41 +0000 (14:51 +0530)]
Merge PR #56602 into main

* refs/pull/56602/head:
mds: always make getattr wait for xlock to be released by the previous client

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 months agoMerge pull request #44470 from orozery/rbd-external-migrate
Ilya Dryomov [Fri, 23 Aug 2024 08:20:47 +0000 (10:20 +0200)]
Merge pull request #44470 from orozery/rbd-external-migrate

librbd/migration: add external clusters support

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
10 months agoMerge pull request #59393 from anthonyeleven/caps-man-caps
Zac Dover [Thu, 22 Aug 2024 22:08:06 +0000 (08:08 +1000)]
Merge pull request #59393 from anthonyeleven/caps-man-caps

doc/releases: Correct mimic.rst

Reviewed-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge pull request #59253 from clwluvw/copy-source-attrs
Casey Bodley [Thu, 22 Aug 2024 18:25:11 +0000 (14:25 -0400)]
Merge pull request #59253 from clwluvw/copy-source-attrs

rgw: load copy source bucket attrs in putobj

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59318 from adk3798/cephadm-osd-original-weight-param
Adam King [Thu, 22 Aug 2024 17:38:23 +0000 (13:38 -0400)]
Merge pull request #59318 from adk3798/cephadm-osd-original-weight-param

mgr/cephadm: add "original_weight" parameter to OSD class

Reviewed-by: John Mulligan <jmulligan@redhat.com>
10 months agoMerge pull request #59204 from tchaikov/wip-ceph-volume-deps
Guillaume Abrioux [Thu, 22 Aug 2024 13:54:53 +0000 (15:54 +0200)]
Merge pull request #59204 from tchaikov/wip-ceph-volume-deps

ceph-volume: add "packaging" to install_requires

10 months agoMerge pull request #58990 from Matan-B/wip-matanb-fmt-draft
Matan Breizman [Thu, 22 Aug 2024 11:53:31 +0000 (14:53 +0300)]
Merge pull request #58990 from Matan-B/wip-matanb-fmt-draft

fmt: bump up version + related changes

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Radosław Zarzyński <rzarzyns@redhat.com>
10 months agoqa/workunits/rbd: exercise snap_{name,id} parsing in test_import_native_format() 44470/head
Ilya Dryomov [Wed, 21 Aug 2024 19:16:30 +0000 (21:16 +0200)]
qa/workunits/rbd: exercise snap_{name,id} parsing in test_import_native_format()

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agodoc/rbd: clarify when image_id is expected for import-only migration
Ilya Dryomov [Sat, 17 Aug 2024 08:28:50 +0000 (10:28 +0200)]
doc/rbd: clarify when image_id is expected for import-only migration

"optional if image in trash" can be easily interpreted as "required if
image not in trash".

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agolibrbd/migration: add external clusters support
Ilya Dryomov [Fri, 16 Aug 2024 17:09:39 +0000 (19:09 +0200)]
librbd/migration: add external clusters support

This commit extends NativeFormat (aka migration where the migration
source is an RBD image) to support external Ceph clusters, limited to
import-only mode.

Co-authored-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agoMerge PR #55144 into main
Venky Shankar [Thu, 22 Aug 2024 09:24:13 +0000 (14:54 +0530)]
Merge PR #55144 into main

* refs/pull/55144/head:
client: fix file cache cap leak which can stall async read call
test/client: test contiguous read for a non-contiguous write

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
10 months agoMerge PR #56816 into main
Venky Shankar [Thu, 22 Aug 2024 09:22:33 +0000 (14:52 +0530)]
Merge PR #56816 into main

* refs/pull/56816/head:
doc: mention the peer status failed when snapshot created on the remote filesystem.
qa: add test_cephfs_mirror_remote_snap_corrupt_fails_synced_snapshot
cephfs_mirror: update peer status for invalid metadata in remote snapshot

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
10 months agoMerge PR #59166 into main
Venky Shankar [Thu, 22 Aug 2024 09:20:51 +0000 (14:50 +0530)]
Merge PR #59166 into main

* refs/pull/59166/head:
mon/thrasher: set stopping

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agodoc/releases: Correct mimic.rst 59393/head
Anthony D'Atri [Thu, 22 Aug 2024 03:55:34 +0000 (23:55 -0400)]
doc/releases: Correct mimic.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
10 months agoosd/scrub: introducing the concept of a SchedEntry
Ronen Friedman [Sun, 7 Jul 2024 17:46:25 +0000 (12:46 -0500)]
osd/scrub: introducing the concept of a SchedEntry

SchedEntry holds the scheduling details for scrubbing a specific PG at
a specific scrub level. Namely - it identifies the [pg,level]
combination, the 'urgency' attribute of the scheduled scrub
(which determines most of its behavior and scheduling decisions)
and the actual time attributes for scheduling (target,
deadline, not_before).

Added a table detailing, for each type of scrub, what limitations apply
to it, and what restrictions are waived.

The following commits will reshape the ScrubJob objects to hold
two instances of SchedTarget-s - two wrappers around SchedEntry-s,
one for the next shallow scrub and one for the next deep scrub.

Sched-entries (wrapped in sched-targets) have a defined order:

For ready-to-scrub entries (those that have an n.b. in the past),
the order is first by urgency, then by target time (and then by
level - deep before shallow - and then by the n.b. itself).

'Future' entries are ordered by n.b., then urgency,
target time, and level.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
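
The ordering rules above can be written down as a sort key. The sketch assumes that a larger urgency value means more urgent and that ready entries always sort ahead of future ones; the field names follow the message, but `sort_key` and the Python representation are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchedEntry:
    pgid: str
    level: str         # "deep" or "shallow"
    urgency: int       # assumption: a larger value means more urgent
    target: float
    not_before: float


def sort_key(entry, now):
    """Key under which the next entry to scrub sorts first."""
    level_rank = 0 if entry.level == "deep" else 1  # deep before shallow
    if entry.not_before <= now:
        # Ready: urgency, then target time, then level, then not-before.
        return (0, -entry.urgency, entry.target, level_rank, entry.not_before)
    # Future: not-before, then urgency, target time, and level.
    return (1, entry.not_before, -entry.urgency, entry.target, level_rank)
```

A list of entries can then be ordered with `sorted(entries, key=lambda e: sort_key(e, now))`. Because readiness depends on `now`, the order has to be re-evaluated as time advances.
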
10 months agoMerge pull request #59348 from zdover23/wip-doc-2024-08-20-rados-ops-cache-tiering
Zac Dover [Wed, 21 Aug 2024 11:26:54 +0000 (21:26 +1000)]
Merge pull request #59348 from zdover23/wip-doc-2024-08-20-rados-ops-cache-tiering

doc/rados: document unfound object cache-tiering scenario

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
10 months agoMerge pull request #59362 from gbregman/main
Gil Bregman [Wed, 21 Aug 2024 05:46:29 +0000 (08:46 +0300)]
Merge pull request #59362 from gbregman/main

mgr/cephadm: change SPDK RPC fields in nvmeof configuration

10 months agoMerge pull request #59323 from yuvalif/wip-yuval-67514
Yuval Lifshitz [Wed, 21 Aug 2024 05:09:34 +0000 (08:09 +0300)]
Merge pull request #59323 from yuvalif/wip-yuval-67514

test/rgw/notifications: don't check for full queue if topics expired

Reviewed-By: Casey Bodley <cbodley@ibm.com>
10 months agoMerge pull request #54984 from NitzanMordhai/wip-nitzan-restful-un-boundary-keep...
NitzanMordhai [Tue, 20 Aug 2024 16:16:25 +0000 (19:16 +0300)]
Merge pull request #54984 from NitzanMordhai/wip-nitzan-restful-un-boundary-keep-requests

mgr/rest: Trim requests array and limit size

10 months agodoc: add clustering related items to smb docs 59086/head
John Mulligan [Wed, 14 Aug 2024 18:19:17 +0000 (14:19 -0400)]
doc: add clustering related items to smb docs

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agoqa/suites/orch: add a pair of teuthology tests for ctdb smb clusters
John Mulligan [Sat, 10 Aug 2024 18:42:16 +0000 (14:42 -0400)]
qa/suites/orch: add a pair of teuthology tests for ctdb smb clusters

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agoqa/suites/orch: old smb tests need placement count 1 to avoid using clustering
John Mulligan [Sat, 10 Aug 2024 16:49:24 +0000 (12:49 -0400)]
qa/suites/orch: old smb tests need placement count 1 to avoid using clustering

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agomgr/cephadm: enable the smb service to prevent stray ctdb services
John Mulligan [Mon, 12 Aug 2024 14:56:51 +0000 (10:56 -0400)]
mgr/cephadm: enable the smb service to prevent stray ctdb services

Tell cephadm that any `ctdb` services are "owned" by the smb service
and should not be treated as strays.
Ideally, we would do this on a per-service basis, but the info that the ctdb
lock helper provides to its registration function is pretty generic.
Future versions of samba may improve upon this.

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agomgr/cephadm: extend stray service detection with a general ignore hook
John Mulligan [Mon, 12 Aug 2024 14:56:36 +0000 (10:56 -0400)]
mgr/cephadm: extend stray service detection with a general ignore hook

Extend the system's current stray service detection with a new method on
the service classes so that new classes can hook into the stray services
in the case that ceph services and cephadm services have differing names
or use subsystems that call into ceph with different names (my use
case).

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agomgr/cephadm: move logic determining name in stray func
John Mulligan [Mon, 12 Aug 2024 13:52:20 +0000 (09:52 -0400)]
mgr/cephadm: move logic determining name in stray func

Encapsulate the logic determining the name of a stray service into a
method, reducing the length and levels of indentation in the stray checker
function.

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agomgr/smb: enable clustering when setting up a cluster
John Mulligan [Mon, 15 Jul 2024 19:41:56 +0000 (15:41 -0400)]
mgr/smb: enable clustering when setting up a cluster

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add a cluster resource field to manage clustering
John Mulligan [Mon, 15 Jul 2024 19:41:43 +0000 (15:41 -0400)]
mgr/smb: add a cluster resource field to manage clustering

Add a new `clustering` field to the smb cluster resource. This field can
be used to select either automatic clustering with ctdb, or disable it,
or require it. The default is automatic and is based on the count value
in the placement spec. A count of 1 disables clustering; any other
value enables it.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
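
The resolution rule described here fits in a few lines. The mode names `"auto"`, `"disabled"`, and `"required"` are placeholders chosen for this sketch; the actual values accepted by the `clustering` field are not given in the message.

```python
def clustering_enabled(clustering: str, count: int) -> bool:
    """Decide whether ctdb clustering is used for an smb cluster.

    `clustering` is one of the hypothetical modes "auto", "disabled", or
    "required"; `count` is the count value from the placement spec.
    """
    if clustering == "disabled":
        return False
    if clustering == "required":
        return True
    if clustering == "auto":
        # Automatic mode: a single instance needs no clustering.
        return count != 1
    raise ValueError(f"unknown clustering mode: {clustering!r}")
```
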
10 months agomgr/cephadm: configure ctdb cluster metadata from cephadm smb service
John Mulligan [Thu, 15 Aug 2024 20:40:47 +0000 (16:40 -0400)]
mgr/cephadm: configure ctdb cluster metadata from cephadm smb service

Add support to the smb service module so that cephadm will provide
information about the layout of the smb daemons to the clustermeta
module that, in turn, will provide the information sambacc needs to
configure ctdb.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add a python module to help manage the ctdb cluster
John Mulligan [Mon, 15 Jul 2024 19:39:19 +0000 (15:39 -0400)]
mgr/smb: add a python module to help manage the ctdb cluster

Add a new module clustermeta that implements a JSON based interface
compatible with sambacc. This module will be called directly by cephadm
as it places the daemons on the cluster nodes.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add function to parse rados psuedo-uri values
John Mulligan [Mon, 15 Jul 2024 19:22:43 +0000 (15:22 -0400)]
mgr/smb: add function to parse rados psuedo-uri values

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add support for rados locks to rados store
John Mulligan [Mon, 15 Jul 2024 19:22:22 +0000 (15:22 -0400)]
mgr/smb: add support for rados locks to rados store

Add support for using rados object locks to the rados store classes.
Callers directly using the rados store outside the store interface will
be able to make use of locking.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/cephadm: improve key management of smb service
John Mulligan [Mon, 15 Jul 2024 19:38:12 +0000 (15:38 -0400)]
mgr/cephadm: improve key management of smb service

The clustered mode of a logical smb cluster needs certain additional
capabilities in the rados pool. Improve and reorganize the key
configuration functions, and add the new caps.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agopython-common: add clustering related params to SMBSpec
John Mulligan [Mon, 15 Jul 2024 19:16:56 +0000 (15:16 -0400)]
python-common: add clustering related params to SMBSpec

Add parameters related to ctdb clustering to the smb service
deployment spec.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agocephadm: add ctdb support to smb daemon type
John Mulligan [Mon, 15 Jul 2024 19:16:04 +0000 (15:16 -0400)]
cephadm: add ctdb support to smb daemon type

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agocephadm: allow longer subcomponent names
John Mulligan [Mon, 15 Jul 2024 19:14:37 +0000 (15:14 -0400)]
cephadm: allow longer subcomponent names

Allow subcomponent names up to 32 chars long.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agocephadm: add a new context getter for rank
John Mulligan [Mon, 15 Jul 2024 19:14:13 +0000 (15:14 -0400)]
cephadm: add a new context getter for rank

Add a new context getter function to fetch a daemon's rank and rank
generation value.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/cephadm: change SPDK RPC fields in nvmeof configuration 59362/head
Gil Bregman [Tue, 20 Aug 2024 13:29:57 +0000 (16:29 +0300)]
mgr/cephadm: change SPDK RPC fields in nvmeof configuration
Fixes https://tracker.ceph.com/issues/67629

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
10 months agopython-common/ceph/deployment: change SPDK RPC fields in nvmeof configuration
Gil Bregman [Tue, 20 Aug 2024 13:28:12 +0000 (16:28 +0300)]
python-common/ceph/deployment: change SPDK RPC fields in nvmeof configuration
Fixes https://tracker.ceph.com/issues/67629

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
10 months agodoc/rados: document unfound object cache-tiering scenario 59348/head
Zac Dover [Tue, 20 Aug 2024 12:45:29 +0000 (22:45 +1000)]
doc/rados: document unfound object cache-tiering scenario

Explain how to deal with "unfound objects" when restarting OSDs in a
cache-tiered environment.

Fixes: https://tracker.ceph.com/issues/44286
Signed-off-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge pull request #58460 from rkachach/fix_issue_oauth2_support
Adam King [Tue, 20 Aug 2024 12:35:44 +0000 (08:35 -0400)]
Merge pull request #58460 from rkachach/fix_issue_oauth2_support

adding support for SSO based on auth2-proxy

Reviewed-by: Adam King <adking@redhat.com>
10 months agoMerge pull request #58860 from adk3798/cephadm-nvmeof-require-group
Adam King [Tue, 20 Aug 2024 12:20:02 +0000 (08:20 -0400)]
Merge pull request #58860 from adk3798/cephadm-nvmeof-require-group

mgr/cephadm: require "group" parameter in nvmeof specs

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
10 months agoMerge pull request #59165 from NitzanMordhai/wip-nitzan-test-rados-tools-newline...
NitzanMordhai [Tue, 20 Aug 2024 12:07:21 +0000 (15:07 +0300)]
Merge pull request #59165 from NitzanMordhai/wip-nitzan-test-rados-tools-newline-trim

test: test_rados_tools compare output without trimming newline

10 months agodoc/mgr/restful: update max_request config 54984/head
nmordech@redhat.com [Wed, 21 Feb 2024 10:01:25 +0000 (10:01 +0000)]
doc/mgr/restful: update max_request config

Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
10 months agoPendingReleaseNotes: Adding note about rest module change and adding max_request...
nmordech@redhat.com [Wed, 21 Feb 2024 09:21:25 +0000 (09:21 +0000)]
PendingReleaseNotes: Adding note about rest module change and adding max_request option

Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
10 months agomgr/rest: Trim request array and limit size
NitzanMordhai [Tue, 28 Nov 2023 09:52:05 +0000 (09:52 +0000)]
mgr/rest: Trim request array and limit size

Presently, the requests array in the REST module has the potential to grow
indefinitely, leading to excessive memory consumption, particularly when
dealing with lengthy and intricate request results.

To address this issue, a limit will be imposed on the requests array within
the REST module.
This limitation will be governed by the `mgr/restful/x/max_requests` configuration
parameter specific to the REST module.
When submit_request is called, we check whether the requests array exceeds the
max_request option. If it does, we check whether the requests about to be trimmed
have finished, and log an error message if we are trimming unfinished requests.

Fixes: https://tracker.ceph.com/issues/59580
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
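
The trimming behaviour can be sketched as below. This is a simplified model, not the REST module's code: requests are plain dicts with hypothetical `id` and `finished` keys, the new request is appended before the size check, and the oldest entries are the ones dropped. Those last two points are assumptions.

```python
import logging

log = logging.getLogger("restful-sketch")


def submit_request(requests, request, max_requests):
    """Append a request, then trim the oldest entries beyond max_requests."""
    requests.append(request)
    excess = len(requests) - max_requests
    if excess > 0:
        for old in requests[:excess]:
            if not old["finished"]:
                # The caller is losing a request that is still running.
                log.error("trimming unfinished request %s", old["id"])
        del requests[:excess]
    return request
```
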
10 months agoMerge pull request #59153 from ajarr/wip-67436
Ilya Dryomov [Tue, 20 Aug 2024 10:19:23 +0000 (12:19 +0200)]
Merge pull request #59153 from ajarr/wip-67436

rbd: fix CLI output of `rbd group snap info` command when a group snapshot with no member images

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sunil Angadi <Sunil.Angadi@ibm.com>
10 months agoMerge pull request #59292 from cyx1231st/wip-seastore-revert-decouple-ool-writes
Yingxin [Tue, 20 Aug 2024 08:30:57 +0000 (16:30 +0800)]
Merge pull request #59292 from cyx1231st/wip-seastore-revert-decouple-ool-writes

Revert "crimson/os/seastore: wait ool writes in DeviceSubmission phase"

Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>
Reviewed-by: Myoungwon Oh <myoungwon.oh@samsung.com>
10 months agoMerge pull request #59241 from tobias-urdin/openstack-upperconstraints
Casey Bodley [Mon, 19 Aug 2024 17:10:57 +0000 (13:10 -0400)]
Merge pull request #59241 from tobias-urdin/openstack-upperconstraints

qa: barbican: restrict python packages with upper-constraints

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agotest/rgw/notifications: don't check for full queue if topics expired 59323/head
Yuval Lifshitz [Mon, 19 Aug 2024 16:48:29 +0000 (16:48 +0000)]
test/rgw/notifications: don't check for full queue if topics expired

There are other tests for queue length, so we can skip this check
if the test takes too long.
Also remove unnecessary delays from the test.

Fixes: https://tracker.ceph.com/issues/67514?tab=history
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
10 months agomgr/cephadm: add "original_weight" parameter to OSD class 59318/head
Adam King [Mon, 19 Aug 2024 16:30:24 +0000 (12:30 -0400)]
mgr/cephadm: add "original_weight" parameter to OSD class

Fixes: https://tracker.ceph.com/issues/67329
Signed-off-by: Adam King <adking@redhat.com>
10 months agoMerge pull request #58961 from NitzanMordhai/wip-nitzan-dencoder-test-forward-incompa...
Yuri Weinstein [Mon, 19 Aug 2024 14:25:47 +0000 (07:25 -0700)]
Merge pull request #58961 from NitzanMordhai/wip-nitzan-dencoder-test-forward-incompat-fix

workunit/dencoder: dencoder test forward incompat fix

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months ago Merge pull request #58594 from jamiepryde/isa-xor-raid
Yuri Weinstein [Mon, 19 Aug 2024 14:24:56 +0000 (07:24 -0700)]
Merge pull request #58594 from jamiepryde/isa-xor-raid

erasure-code/isa: Use isa/raid's xor_gen() instead of the region_xor(…

Reviewed-by: Mark Nelson <mnelson@redhat.com>
10 months ago qa: barbican: restrict python packages with upper-constraints 59241/head
Tobias Urdin [Thu, 15 Aug 2024 15:17:14 +0000 (17:17 +0200)]
qa: barbican: restrict python packages with upper-constraints

We install barbican by doing a pip install directly on the
cloned git repository, but we don't honor the upper-constraints
from the OpenStack Requirements project, which defines which
versions are supported.

This changes the pip install command that we issue when
installing barbican to honor the requirements for the
version (derived from the branch) that we use, in
this case it's the 2023.1 release upper-constraints [1].

This prevents us from pulling in untested Python packages.

This only updates Barbican because for the Keystone job
we don't issue pip directly; we install with tox using the
`venv` environment, which already sets the constraints by
default, as you can see in [2].

[1] https://releases.openstack.org/constraints/upper/2023.1
[2] https://github.com/openstack/keystone/blob/stable/2023.1/tox.ini#L12

Fixes: https://tracker.ceph.com/issues/67444
Signed-off-by: Tobias Urdin <tobias.urdin@binero.com>
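
The constraint handling described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the Ceph QA task: the helper names `release_from_branch` and `pip_install_cmd`, the `stable/2023.1` branch name, and the `barbican` checkout directory are assumptions made for the example. Only the constraints URL (reference [1]) comes from the commit message, and `-c` is pip's standard `--constraint` option.

```python
import shlex

# Base URL taken from reference [1] in the commit message.
CONSTRAINTS_BASE = "https://releases.openstack.org/constraints/upper"


def release_from_branch(branch):
    """Map a branch such as 'stable/2023.1' to the release name '2023.1'.

    Hypothetical helper: the real task may derive the release differently.
    """
    return branch.rsplit("/", 1)[-1]


def pip_install_cmd(repo_dir, branch):
    """Build a pip command that installs repo_dir under upper-constraints."""
    constraints = f"{CONSTRAINTS_BASE}/{release_from_branch(branch)}"
    # `-c` is pip's --constraint option; it accepts a local path or a URL.
    return ["pip", "install", "-c", constraints, repo_dir]


# Print the command line that would be run for an assumed local checkout.
print(shlex.join(pip_install_cmd("barbican", "stable/2023.1")))
```

Passing the constraints file caps every transitive dependency at the version tested for that OpenStack release, without adding any package that the project itself does not require.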
10 months ago Merge pull request #59239 from yuvalif/wip-yuval-67513
Yuval Lifshitz [Mon, 19 Aug 2024 10:37:07 +0000 (13:37 +0300)]
Merge pull request #59239 from yuvalif/wip-yuval-67513

Reviewed-By: Casey Bodley <cbodley@ibm.com>
test/rgw/notification: use real ip address instead of localhost

Based on this comment:
https://tracker.ceph.com/issues/67206#note-6
the address used by the endpoint is now the real IP address of the
host where the test script is running, not localhost.

We also changed the rabbitmq-server conf to allow the "guest"
user to connect over a non-localhost address.

Fixes: https://tracker.ceph.com/issues/67206
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
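
One common way to obtain the host's real IP address, as opposed to localhost, is sketched below. This is an assumption for illustration only; the commit does not say how the test script determines the address, and `get_host_ip` is a hypothetical helper name.

```python
import socket


def get_host_ip(fallback="127.0.0.1"):
    """Best-effort lookup of the host's outgoing IPv4 address.

    Returns `fallback` when no route is available.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            # Connecting a UDP socket sends no packets; it only asks the
            # kernel which local address it would use for this route.
            # 192.0.2.1 is a documentation-only (TEST-NET-1) address.
            s.connect(("192.0.2.1", 80))
            return s.getsockname()[0]
    except OSError:
        return fallback


print(get_host_ip())
```

An endpoint built from this address is reachable from other hosts or containers, which an endpoint bound to localhost is not.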
10 months ago Merge pull request #59200 from ifed01/wip-ifed-fix-store-test-col-ref
Igor Fedotov [Mon, 19 Aug 2024 09:47:40 +0000 (12:47 +0300)]
Merge pull request #59200 from ifed01/wip-ifed-fix-store-test-col-ref

test/store_test: fix assertions due to unclosed collection refs.

Reviewed-by: Pere Diaz Bou <pere-altea@hotmail.com>
11 months ago Merge pull request #59256 from zdover23/wip-doc-2024-08-17-cephfs-ceph-dokan-mount...
Zac Dover [Mon, 19 Aug 2024 07:21:51 +0000 (17:21 +1000)]
Merge pull request #59256 from zdover23/wip-doc-2024-08-17-cephfs-ceph-dokan-mount-point

doc/cephfs: s/mountpoint/mount point/

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
11 months ago Merge pull request #58995 from rhcs-dashboard/fix-66844-main
Nizamudeen A [Mon, 19 Aug 2024 05:49:52 +0000 (11:19 +0530)]
Merge pull request #58995 from rhcs-dashboard/fix-66844-main

qa/mgr/dashboard: fix test race condition

Reviewed-by: Nizamudeen A <nia@redhat.com>
11 months ago Merge pull request #59212 from cyx1231st/wip-seastore-more-reports
Yingxin [Mon, 19 Aug 2024 02:18:32 +0000 (10:18 +0800)]
Merge pull request #59212 from cyx1231st/wip-seastore-more-reports

crimson/os/seastore/cache: report lru usage/in/out with trans and extent type

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>
11 months ago Revert "crimson/os/seastore: wait ool writes in DeviceSubmission phase" 59292/head
Yingxin Cheng [Mon, 19 Aug 2024 01:48:28 +0000 (09:48 +0800)]
Revert "crimson/os/seastore: wait ool writes in DeviceSubmission phase"

This reverts commit c9e423facea79d42f0496264f267adee5d911b87.

The commit starts to submit OOL writes before submitting the journal
write, true, but it cannot guarantee that OOL writes finish before the
journal write.

Thus it is possible that during SeaStore restart, a journal record
appears valid but its dependent OOL records are partially written, which
leads to corruption.

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
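
The ordering hazard described in the commit above can be illustrated with a toy model. This is not SeaStore code: the `write` coroutine, the function names, and the latencies are invented for the example. It only shows that starting one write before another does not mean it completes first, whereas awaiting it does.

```python
import asyncio


async def write(name, latency, completed):
    """Simulate a device write that completes after `latency` seconds."""
    await asyncio.sleep(latency)
    completed.append(name)


async def submit_without_waiting():
    """Start the OOL write first, but do not wait for it."""
    completed = []
    ool = asyncio.create_task(write("ool", 0.05, completed))
    journal = asyncio.create_task(write("journal", 0.01, completed))
    await asyncio.gather(ool, journal)
    return completed


async def submit_with_barrier():
    """Wait for the OOL write to finish before issuing the journal write."""
    completed = []
    await write("ool", 0.05, completed)
    await write("journal", 0.01, completed)
    return completed


# Completion order when the OOL write is merely started first.
print(asyncio.run(submit_without_waiting()))
# Completion order when the journal write waits for the OOL write.
print(asyncio.run(submit_with_barrier()))
```

In the first variant, a crash between the two completions would leave a journal record on disk whose OOL data is incomplete, which is the situation the commit message warns about.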
11 months ago librbd/migration: move away from util::create_ioctx() in NativeFormat
Ilya Dryomov [Mon, 5 Aug 2024 15:52:10 +0000 (17:52 +0200)]
librbd/migration: move away from util::create_ioctx() in NativeFormat

This is another step towards supporting migration from external
clusters, where creating an IoCtx from a Rados instance that has
nothing to do with dst_io_ctx would be needed.  It also gets rid
of a pool lookup in the middle of the parsing code.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
11 months ago common/config: export CEPH_CONF_FILE_DEFAULT
Ilya Dryomov [Fri, 16 Aug 2024 12:12:38 +0000 (14:12 +0200)]
common/config: export CEPH_CONF_FILE_DEFAULT

It used to be exported until commit 318c62f8ae16 ("common/config:
cleanup remove some unused macros").  Having CEPH_CONF_FILE_DEFAULT
available is handy to prevent parse_config_files() from picking up
the CEPH_CONF environment variable.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
11 months ago librbd: RefreshParentRequest::m_parent_snap_id is unused
Ilya Dryomov [Wed, 14 Aug 2024 16:36:57 +0000 (18:36 +0200)]
librbd: RefreshParentRequest::m_parent_snap_id is unused

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
11 months ago librbd: assert on parent in ImageCtx destructor
Ilya Dryomov [Wed, 14 Aug 2024 17:42:09 +0000 (19:42 +0200)]
librbd: assert on parent in ImageCtx destructor

... and switch to in-class initializers while at it.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
11 months ago qa/tasks/qemu: remove hard-coding of cluster name
Or Ozeri [Mon, 6 Nov 2023 11:56:27 +0000 (13:56 +0200)]
qa/tasks/qemu: remove hard-coding of cluster name

This commit allows running the qemu task on an arbitrary cluster name.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
11 months ago qa/tasks/rbd: support non-default ceph clusters
Or Ozeri [Wed, 15 Nov 2023 09:47:54 +0000 (11:47 +0200)]
qa/tasks/rbd: support non-default ceph clusters

This commit allows running the rbd task on an arbitrary cluster name.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
11 months ago librbd/migration: don't clone when flattening
Or Ozeri [Tue, 31 Jan 2023 11:08:22 +0000 (13:08 +0200)]
librbd/migration: don't clone when flattening

When the flatten flag is set, instead of creating the
destination image by cloning, create it independently,
as the parent relation is unnecessary in this case.
This will be particularly useful when the migration source
is located in an external Ceph cluster, which will soon be
supported.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
11 months ago Merge pull request #59290 from anthonyeleven/mountpoint
Anthony D'Atri [Sun, 18 Aug 2024 15:43:00 +0000 (08:43 -0700)]
Merge pull request #59290 from anthonyeleven/mountpoint

doc: Harmonize 'mountpoint'

11 months ago doc: Harmonize 'mountpoint' 59290/head
Anthony D'Atri [Sun, 18 Aug 2024 15:23:39 +0000 (11:23 -0400)]
doc: Harmonize 'mountpoint'

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>