git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 months agoqa/suites/orch: add test for smb with ctdb and cluster public ips 59419/head
John Mulligan [Fri, 23 Aug 2024 14:16:27 +0000 (10:16 -0400)]
qa/suites/orch: add test for smb with ctdb and cluster public ips

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agodoc: add documentation for (cluster_)public_addrs options
John Mulligan [Fri, 23 Aug 2024 14:01:08 +0000 (10:01 -0400)]
doc: add documentation for (cluster_)public_addrs options

Document the spec and resource options (they're basically the same) for
specifying public addresses that will be managed automatically
by CTDB.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add cluster public ip information to service spec
John Mulligan [Thu, 22 Aug 2024 18:08:16 +0000 (14:08 -0400)]
mgr/smb: add cluster public ip information to service spec

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: extend cluster resource type to define public ip addrs
John Mulligan [Thu, 22 Aug 2024 18:08:06 +0000 (14:08 -0400)]
mgr/smb: extend cluster resource type to define public ip addrs

When a cluster defines public IPs it will pass this information along to
the smb service spec.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/cephadm: pass public addresses for a cluster to cephadm binary
John Mulligan [Wed, 21 Aug 2024 21:02:57 +0000 (17:02 -0400)]
mgr/cephadm: pass public addresses for a cluster to cephadm binary

Add the strictly-formed public addresses list as one of the config blobs
we pass to the binary for smb container deployment.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agopython-common/deployment: add a cluster public ip spec for smb
John Mulligan [Wed, 21 Aug 2024 15:31:52 +0000 (11:31 -0400)]
python-common/deployment: add a cluster public ip spec for smb

This spec can be used to define one or more public addresses that will
be automatically assigned to hosts by CTDB. The address can be specified
in the "interface" form - an address plus prefix length.  Optionally,
networks to bind to can be specified. The network value will be
converted to a network device name later by cephadm.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
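
As an illustration of the "interface" form described above, here is a minimal Python sketch. The class name `PublicAddr` and its field names are hypothetical and are not taken from the actual spec classes; only the standard-library `ipaddress` module is assumed.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List


@dataclass
class PublicAddr:
    """Hypothetical stand-in for one cluster public address entry."""

    address: str  # "interface" form: address plus prefix length
    destination: List[str] = field(default_factory=list)  # optional networks

    def validate(self) -> None:
        # ip_interface() would silently default to /32 (or /128),
        # so require an explicit prefix length first.
        if "/" not in self.address:
            raise ValueError("a prefix length is required: " + self.address)
        ipaddress.ip_interface(self.address)
        for net in self.destination:
            # ip_network() is strict by default and rejects host bits.
            ipaddress.ip_network(net)
```

A value such as `PublicAddr("192.168.4.51/24", ["192.168.4.0/24"])` passes validation, while a bare address without a prefix length is rejected.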
10 months agocephadm: add support for cluster public ip addresses to smb daemon
John Mulligan [Wed, 21 Aug 2024 21:03:40 +0000 (17:03 -0400)]
cephadm: add support for cluster public ip addresses to smb daemon

When a list of public addresses (and optional network destination(s))
is supplied at deploy time, convert the networks to device names
and pass that result to the sambacc ctdb configuration.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
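
The network-to-device conversion mentioned above could look roughly like the following Python sketch. The function name `device_for_network` and the shape of the `host_addrs` mapping are assumptions made for illustration; the real cephadm code may gather and match addresses differently.

```python
import ipaddress
from typing import Dict, List, Optional


def device_for_network(network: str,
                       host_addrs: Dict[str, List[str]]) -> Optional[str]:
    """Return the first device that holds an address inside `network`.

    `host_addrs` maps a device name to its addresses in interface form
    (hypothetical input; not the real cephadm data structure).
    """
    net = ipaddress.ip_network(network)
    for device, addrs in host_addrs.items():
        for addr in addrs:
            ip = ipaddress.ip_interface(addr).ip
            # Compare only addresses of the same IP version.
            if ip.version == net.version and ip in net:
                return device
    return None
```

Given `{"eth0": ["10.0.0.5/24"], "eth1": ["192.168.4.10/24"]}`, the network `192.168.4.0/24` resolves to `eth1`, and a network with no matching address resolves to `None`.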
10 months agomgr/smb: simplify orch backend enablement
John Mulligan [Wed, 21 Aug 2024 21:03:19 +0000 (17:03 -0400)]
mgr/smb: simplify orch backend enablement

We have a developer/debug module option that allows one to disable
triggering orchestration. When I tried to use it I thought it was
buggy and I had trouble diagnosing it. The mistake was on my side,
but the code change makes it much clearer what is being enabled
so I want to keep it.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agoMerge pull request #57956 from tobias-urdin/remove-keystone-v2
Casey Bodley [Mon, 26 Aug 2024 15:03:42 +0000 (11:03 -0400)]
Merge pull request #57956 from tobias-urdin/remove-keystone-v2

rgw/auth: Remove Keystone v2.0 API support

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59301 from xxhdx1985126/wip-67604
Matan Breizman [Mon, 26 Aug 2024 10:55:47 +0000 (13:55 +0300)]
Merge pull request #59301 from xxhdx1985126/wip-67604

crimson/common/tri_mutex: also wake up waiters when demoting

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
10 months agoMerge pull request #58136 from xxhdx1985126/wip-66372
Matan Breizman [Mon, 26 Aug 2024 10:50:10 +0000 (13:50 +0300)]
Merge pull request #58136 from xxhdx1985126/wip-66372

crimson/osd/osd: mark down connections to downed osds

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
10 months agoMerge pull request #54620 from rishabh-d-dave/mgr-vol-clone-stats
Venky Shankar [Mon, 26 Aug 2024 10:14:53 +0000 (15:44 +0530)]
Merge pull request #54620 from rishabh-d-dave/mgr-vol-clone-stats

mgr/vol: show progress and stats for the subvolume snapshot clones

Reviewed-by: Venky Shankar <vshankar@redhat.com>
10 months agoMerge pull request #59428 from zdover23/wip-doc-2024-08-26-cephadm-services-osd
Zac Dover [Mon, 26 Aug 2024 08:09:16 +0000 (18:09 +1000)]
Merge pull request #59428 from zdover23/wip-doc-2024-08-26-cephadm-services-osd

doc/cephadm: how to get exact size_spec from device

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
10 months agoMerge pull request #59392 from cyx1231st/wip-inplace-rewrite-comments
Yingxin [Mon, 26 Aug 2024 03:28:03 +0000 (11:28 +0800)]
Merge pull request #59392 from cyx1231st/wip-inplace-rewrite-comments

crimson/os/seastore: refine documents related to inplace rewrite

Reviewed-by: Myoungwon Oh <myoungwon.oh@samsung.com>
10 months agodoc/cephadm: how to get exact size_spec from device 59428/head
Zac Dover [Sun, 25 Aug 2024 20:03:34 +0000 (06:03 +1000)]
doc/cephadm: how to get exact size_spec from device

Add instructions for retrieving the exact size of block devices.

Fixes: https://tracker.ceph.com/issues/66754
Signed-off-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge pull request #59053 from baum/wip-baum-20240806-00
baum [Sun, 25 Aug 2024 18:10:46 +0000 (21:10 +0300)]
Merge pull request #59053 from baum/wip-baum-20240806-00

nvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons

10 months agoMerge pull request #58858 from ronen-fr/wip-rf-entry
Ronen Friedman [Sun, 25 Aug 2024 16:44:03 +0000 (19:44 +0300)]
Merge pull request #58858 from ronen-fr/wip-rf-entry

osd/scrub: a scrub queue of level-specific entries

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
10 months agotest/osd/scrub: fix searched-for log string 58858/head
Ronen Friedman [Sun, 25 Aug 2024 08:57:42 +0000 (03:57 -0500)]
test/osd/scrub: fix searched-for log string

To match the modified log message in
OsdScrub::restrictions_on_scrubbing().

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix missing 'const' on some formatters
Ronen Friedman [Sat, 24 Aug 2024 11:41:44 +0000 (06:41 -0500)]
osd/scrub: fix missing 'const' on some formatters

required to pass CI checks.

co-author: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agotest/osd/scrub: disable tests for deleted scrub functionality
Ronen Friedman [Sat, 24 Aug 2024 05:36:44 +0000 (00:36 -0500)]
test/osd/scrub: disable tests for deleted scrub functionality

The scrub scheduler no longer "upgrades" shallow scrubs into
deep ones on error, so the tests that check this functionality
are no longer valid.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agotest/osd: test new functionality added to the not-before queue
Ronen Friedman [Sun, 18 Aug 2024 17:33:38 +0000 (12:33 -0500)]
test/osd: test new functionality added to the not-before queue

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: delay both targets on some failures
Ronen Friedman [Sat, 17 Aug 2024 16:08:19 +0000 (11:08 -0500)]
osd/scrub: delay both targets on some failures

If the failure of a scrub-job is due to a condition that affects
both targets, both should be delayed. Otherwise, we may end up
with the following bogus scenario:

A high priority deep target is scheduled, but scrub session initiation
fails due to, for example, a concurrent snap trim. The deep target
will be delayed. A second initiation attempt may happen after the
snap trimming is done, but before the updated deep target not-before.
As a result - the lower priority target will be scheduled before the
higher priority one - which is a bug.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
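
The scenario above can be sketched in Python. The `Target` and `Job` classes and the `delay_after_failure` method are hypothetical simplifications, not the actual C++ scrubber types; the point is only that a failure affecting both levels pushes both not-before times forward.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical per-level scheduling target."""

    level: str         # "shallow" or "deep"
    not_before: float  # earliest time the target may be scheduled


@dataclass
class Job:
    """Hypothetical scrub job holding one target per level."""

    shallow: Target
    deep: Target

    def delay_after_failure(self, failed_level: str, now: float,
                            delay: float, affects_both: bool) -> None:
        if affects_both:
            targets = [self.shallow, self.deep]
        else:
            targets = [self.deep if failed_level == "deep" else self.shallow]
        for t in targets:
            # Never move a not-before time backwards.
            t.not_before = max(t.not_before, now + delay)
```

With `affects_both=True` (for example, a concurrent snap trim), the shallow target cannot become ready before the delayed deep target, which avoids the priority inversion described in the commit message.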
10 months agoosd/scrub: reverse OSDRestrictions flags polarity
Ronen Friedman [Thu, 15 Aug 2024 13:17:48 +0000 (08:17 -0500)]
osd/scrub: reverse OSDRestrictions flags polarity

As most of the flags in OSDRestrictions are of 'true is bad' polarity,
reverse the two non-conforming flags - cpu load and time-of-day
restrictions - to match.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix the conditions for auto-repair scrubs
Ronen Friedman [Thu, 15 Aug 2024 12:51:15 +0000 (07:51 -0500)]
osd/scrub: fix the conditions for auto-repair scrubs

The conditions for auto-repair scrubs should have been changed
when need_auto lost some of its setters.

Also fix the rescheduling of repair scrubs
when the last scrub ended with errors.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove requested_scrub_t::deep_scrub_on_error
Ronen Friedman [Thu, 8 Aug 2024 13:49:57 +0000 (08:49 -0500)]
osd/scrub: remove requested_scrub_t::deep_scrub_on_error

This flag was used to indicate that a deep scrub should
be performed if a shallow scrub finds an error. It was
always set to true for regular shallow scrubs whenever the
can_autorepair flag was set. Thus, the ephemeral flag in
the requested_scrub_t object is not really needed.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoqa/standalone/scrub: disable scrub_extended_sleep test
Ronen Friedman [Tue, 6 Aug 2024 13:07:17 +0000 (08:07 -0500)]
qa/standalone/scrub: disable scrub_extended_sleep test

Disabling osd-scrub-test.sh::TEST_scrub_extended_sleep,
as the test is no longer valid (updated code no longer
produces the same logs or the same behavior).

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove non-display usage of target's is_high_priority()
Ronen Friedman [Tue, 30 Jul 2024 12:12:54 +0000 (07:12 -0500)]
osd/scrub: remove non-display usage of target's is_high_priority()

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove 'calculated_to_deep' flag
Ronen Friedman [Mon, 29 Jul 2024 04:34:32 +0000 (23:34 -0500)]
osd/scrub: remove 'calculated_to_deep' flag

as once a sched-target was selected, we know the level of the scrub.
Also removed: the ephemeral 'time_for_deep' flag.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: modify after-repair-scrub triggering
Ronen Friedman [Sun, 28 Jul 2024 12:37:07 +0000 (07:37 -0500)]
osd/scrub: modify after-repair-scrub triggering

... to manipulate the relevant scrub target directly, instead
of using the 'planned scrub' flags.

The relevant condition flag was moved from the PG and into the scrubber.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix ReplicaReservations ctor to use correct query
Ronen Friedman [Sun, 28 Jul 2024 10:52:38 +0000 (05:52 -0500)]
osd/scrub: fix ReplicaReservations ctor to use correct query

when determining whether replica reservations are required.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix parameters validation on scrub start
Ronen Friedman [Sun, 28 Jul 2024 06:09:25 +0000 (01:09 -0500)]
osd/scrub: fix parameters validation on scrub start

... as the selected target already determines the
scrub level & type.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix reserve_local()
Ronen Friedman [Sun, 28 Jul 2024 10:20:38 +0000 (05:20 -0500)]
osd/scrub: fix reserve_local()

to use the correct method when determining whether we should
perform the reservation.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix initiation path of operator-commanded scrubs
Ronen Friedman [Sat, 27 Jul 2024 17:59:46 +0000 (12:59 -0500)]
osd/scrub: fix initiation path of operator-commanded scrubs

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon/not_before_queue: extending the container's API
Ronen Friedman [Tue, 30 Jul 2024 10:59:00 +0000 (05:59 -0500)]
common/not_before_queue: extending the container's API

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: OSD's scrub queue now holds SchedEntry-s
Ronen Friedman [Wed, 24 Jul 2024 07:02:46 +0000 (02:02 -0500)]
osd/scrub: OSD's scrub queue now holds SchedEntry-s

The OSD's scrub queue now holds SchedEntry-s, instead of ScrubJob-s.
The queue itself is implemented using the 'not_before_queue_t' class.

Note: this is not a stable state of the scrubber code. In the next
commits:
- modifying the way sched targets are modified and updated, to match the
  new queue implementation.
- removing the 'planned scrub' flags.

Important note: the interaction of initiate_scrub() and pop_ready_pg()
is not changed by this commit. Namely:

Currently - pop..() loops over all eligible jobs, until it finds one
that matches the environment restrictions (which most of the time, as the
concurrency limit is usually reached, would be 'high-priority-only').

The other option is to maintain Sam's 'not_before_q' clean interface: we
always pop the top, and if that top fails the preconds tests - we delay and
re-push. This has the following troubling implications:

- it would take a long time to find a viable scrub job, if the problem
  is related to, for example, 'no scrub'.
- local resources failure (inc_scrubs() failure) must be handled
  separately, as we do not want to reshuffle the queue for this
  very very common case.
- but the real problem: unneeded shuffling of the queue, even as the
  problem is not with the scrub job itself, but with the environment
  (esp. no-scrub etc.).
  This is a common case, and it would be wrong to reshuffle the queue
  for that.
- and - remember that any change to a sched-entry must be done under PG
  lock.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
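
The first approach discussed above (looping over eligible jobs instead of popping the top and re-pushing) can be illustrated with a small Python sketch. The `Entry` tuple layout and the function name `pop_first_eligible` are assumptions for illustration; the real queue is the C++ `not_before_queue_t`.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical entry layout: (not_before time, PG id).
Entry = Tuple[float, str]


def pop_first_eligible(queue: List[Entry], now: float,
                       allowed: Callable[[Entry], bool]) -> Optional[Entry]:
    """Remove and return the first ready entry accepted by `allowed`.

    Entries that fail the environment check stay in the queue untouched,
    so a global restriction never reshuffles the queue.
    """
    ready = sorted(e for e in queue if e[0] <= now)
    for entry in ready:
        if allowed(entry):
            queue.remove(entry)
            return entry
    return None
```

When `allowed` rejects everything (for example, under a 'no scrub' restriction), the call returns `None` and the queue is left exactly as it was, which is the property the commit message argues for.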
10 months agocommon/not_before_queue: move status_t out of container_t
Ronen Friedman [Tue, 30 Jul 2024 10:54:59 +0000 (05:54 -0500)]
common/not_before_queue: move status_t out of container_t

for readability

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon/not_before_queue: some spelling fixes
Ronen Friedman [Mon, 29 Jul 2024 03:58:22 +0000 (22:58 -0500)]
common/not_before_queue: some spelling fixes

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon: add not_before_queue_t
Samuel Just [Fri, 16 Dec 2022 18:30:18 +0000 (18:30 +0000)]
common: add not_before_queue_t

Signed-off-by: Samuel Just <sjust@redhat.com>
10 months agoosd/scrub: modify ScrubJob to hold two SchedTarget-s
Ronen Friedman [Fri, 12 Jul 2024 13:18:30 +0000 (08:18 -0500)]
osd/scrub: modify ScrubJob to hold two SchedTarget-s

ScrubJob will now hold two SchedTarget-s - two sets of scheduling
information (times, levels, etc.) for the next shallow and deep scrubs.

This is in preparation for the upcoming changes to the scheduling queue.
The change cannot stand on its own, as the partial implementation
creates some inconsistencies in the scheduling logic.

Specifically, here is what changes here, and how it differs from the
desired implementation:
- The OSD still maintains a queue of scrub jobs - one object only per
  PG.
  But now - each queue element holds two SchedTarget-s.
- When a scrub is initiated, the Scrubber is handed a ScrubJob object.
  Only in the next commit will it also receive the ID of the selected
  level. That causes some issues when re-determining the level of the
  initiated scrub. A failure to match the queue "intent" results in
  failures.
- the 'planned scrub' flags are still here, instead of directly
  encoding the characteristics of the next scrub in the relevant
  sched-entry.
- the 'urgency' levels do not cover the full required range of
  behaviors and priorities.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agonvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons 59053/head
Alexander Indenbaum [Mon, 5 Aug 2024 09:50:27 +0000 (09:50 +0000)]
nvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons

Add beacon_lock to mitigate potential beacon delays caused by slow message
handling, particularly in handle_nvmeof_gw_map.

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
10 months agoMerge PR #58487 into main
Venky Shankar [Fri, 23 Aug 2024 16:32:34 +0000 (22:02 +0530)]
Merge PR #58487 into main

* refs/pull/58487/head:
qa/suites/fs/workload: drop mgrmodules stanza
qa/tasks/ceph: fix "ceph mgr module enable" command

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
10 months agoMerge pull request #58336 from Svelar/uadk
Casey Bodley [Fri, 23 Aug 2024 14:32:47 +0000 (10:32 -0400)]
Merge pull request #58336 from Svelar/uadk

Compressor: add UADK support

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
10 months agoMerge pull request #59418 from zdover23/wip-doc-2024-08-23-glossary-object-storage
Anthony D'Atri [Fri, 23 Aug 2024 14:11:16 +0000 (10:11 -0400)]
Merge pull request #59418 from zdover23/wip-doc-2024-08-23-glossary-object-storage

doc/glossary: add "object storage"

10 months agoMerge pull request #59086 from phlogistonjohn/jjm-smb-ctdb-clustering
Adam King [Fri, 23 Aug 2024 13:06:33 +0000 (09:06 -0400)]
Merge pull request #59086 from phlogistonjohn/jjm-smb-ctdb-clustering

smb: ctdb clustering

Reviewed-by: Adam King <adking@redhat.com>
10 months agoMerge pull request #59175 from Yonatan-Zaken/fix_boolean_flags_handling_for_ceph_orch...
Adam King [Fri, 23 Aug 2024 12:54:21 +0000 (08:54 -0400)]
Merge pull request #59175 from Yonatan-Zaken/fix_boolean_flags_handling_for_ceph_orch_daemon_add_osd

mgr/orchestrator: fix encrypted flag handling in orch daemon add osd

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
10 months agodoc/glossary: add "object storage" 59418/head
Zac Dover [Fri, 23 Aug 2024 12:36:16 +0000 (22:36 +1000)]
doc/glossary: add "object storage"

Add a (very basic) definition of object storage.

Signed-off-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge PR #56602 into main
Venky Shankar [Fri, 23 Aug 2024 09:21:41 +0000 (14:51 +0530)]
Merge PR #56602 into main

* refs/pull/56602/head:
mds: always make getattr wait for xlock to be released by the previous client

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 months agocrimson/os/seastore: refine documents related to inplace rewrite 59392/head
Yingxin Cheng [Thu, 22 Aug 2024 02:34:47 +0000 (10:34 +0800)]
crimson/os/seastore: refine documents related to inplace rewrite

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
10 months agoMerge pull request #44470 from orozery/rbd-external-migrate
Ilya Dryomov [Fri, 23 Aug 2024 08:20:47 +0000 (10:20 +0200)]
Merge pull request #44470 from orozery/rbd-external-migrate

librbd/migration: add external clusters support

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
10 months agoMerge pull request #59393 from anthonyeleven/caps-man-caps
Zac Dover [Thu, 22 Aug 2024 22:08:06 +0000 (08:08 +1000)]
Merge pull request #59393 from anthonyeleven/caps-man-caps

doc/releases: Correct mimic.rst

Reviewed-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge pull request #59253 from clwluvw/copy-source-attrs
Casey Bodley [Thu, 22 Aug 2024 18:25:11 +0000 (14:25 -0400)]
Merge pull request #59253 from clwluvw/copy-source-attrs

rgw: load copy source bucket attrs in putobj

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59318 from adk3798/cephadm-osd-original-weight-param
Adam King [Thu, 22 Aug 2024 17:38:23 +0000 (13:38 -0400)]
Merge pull request #59318 from adk3798/cephadm-osd-original-weight-param

mgr/cephadm: add "original_weight" parameter to OSD class

Reviewed-by: John Mulligan <jmulligan@redhat.com>
10 months agoMerge pull request #59204 from tchaikov/wip-ceph-volume-deps
Guillaume Abrioux [Thu, 22 Aug 2024 13:54:53 +0000 (15:54 +0200)]
Merge pull request #59204 from tchaikov/wip-ceph-volume-deps

ceph-volume: add "packaging" to install_requires

10 months agoMerge pull request #58990 from Matan-B/wip-matanb-fmt-draft
Matan Breizman [Thu, 22 Aug 2024 11:53:31 +0000 (14:53 +0300)]
Merge pull request #58990 from Matan-B/wip-matanb-fmt-draft

fmt: bump up version + related changes

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Radosław Zarzyński <rzarzyns@redhat.com>
10 months agoqa/workunits/rbd: exercise snap_{name,id} parsing in test_import_native_format() 44470/head
Ilya Dryomov [Wed, 21 Aug 2024 19:16:30 +0000 (21:16 +0200)]
qa/workunits/rbd: exercise snap_{name,id} parsing in test_import_native_format()

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agodoc/rbd: clarify when image_id is expected for import-only migration
Ilya Dryomov [Sat, 17 Aug 2024 08:28:50 +0000 (10:28 +0200)]
doc/rbd: clarify when image_id is expected for import-only migration

"optional if image in trash" can be easily interpreted as "required if
image not in trash".

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agolibrbd/migration: add external clusters support
Ilya Dryomov [Fri, 16 Aug 2024 17:09:39 +0000 (19:09 +0200)]
librbd/migration: add external clusters support

This commit extends NativeFormat (aka migration where the migration
source is an RBD image) to support external Ceph clusters, limited to
import-only mode.

Co-authored-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agoMerge PR #55144 into main
Venky Shankar [Thu, 22 Aug 2024 09:24:13 +0000 (14:54 +0530)]
Merge PR #55144 into main

* refs/pull/55144/head:
client: fix file cache cap leak which can stall async read call
test/client: test contiguous read for a non-contiguous write

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
10 months agoMerge PR #56816 into main
Venky Shankar [Thu, 22 Aug 2024 09:22:33 +0000 (14:52 +0530)]
Merge PR #56816 into main

* refs/pull/56816/head:
doc: mention the peer status failed when snapshot created on the remote filesystem.
qa: add test_cephfs_mirror_remote_snap_corrupt_fails_synced_snapshot
cephfs_mirror: update peer status for invalid metadata in remote snapshot

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
10 months agoMerge PR #59166 into main
Venky Shankar [Thu, 22 Aug 2024 09:20:51 +0000 (14:50 +0530)]
Merge PR #59166 into main

* refs/pull/59166/head:
mon/thrasher: set stopping

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agodoc/releases: Correct mimic.rst 59393/head
Anthony D'Atri [Thu, 22 Aug 2024 03:55:34 +0000 (23:55 -0400)]
doc/releases: Correct mimic.rst

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
10 months agoosd/scrub: introducing the concept of a SchedEntry
Ronen Friedman [Sun, 7 Jul 2024 17:46:25 +0000 (12:46 -0500)]
osd/scrub: introducing the concept of a SchedEntry

SchedEntry holds the scheduling details for scrubbing a specific PG at
a specific scrub level. Namely - it identifies the [pg,level]
combination, the 'urgency' attribute of the scheduled scrub
(which determines most of its behavior and scheduling decisions)
and the actual time attributes for scheduling (target,
deadline, not_before).

Added a table detailing, for each type of scrub, what limitations apply
to it, and what restrictions are waived.

The following commits will reshape the ScrubJob objects to hold
two instances of SchedTarget-s - two wrappers around SchedEntry-s,
one for the next shallow scrub and one for the next deep scrub.

Sched-entries (wrapped in sched-targets) have a defined order:

For ready-to-scrub entries (those whose not-before (n.b.) time is in the
past), the order is first by urgency, then by target time (and then by
level - deep before shallow - and then by the n.b. itself).

'Future' entries are ordered by n.b., then urgency,
target time, and level.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
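
The ordering described above can be expressed as a sort key. The Python sketch below is illustrative only: the field names mirror the commit message, but the numeric `urgency` encoding (larger means more urgent) and the placement of all ready entries ahead of all future ones are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SchedEntry:
    """Hypothetical Python mirror of the scheduling entry."""

    pg: str
    level: str        # "deep" or "shallow"
    urgency: int      # assumption: larger value means more urgent
    target: float
    not_before: float


def sort_key(e: SchedEntry, now: float) -> Tuple:
    level_rank = 0 if e.level == "deep" else 1  # deep before shallow
    if e.not_before <= now:
        # Ready entries: urgency, target time, level, then not-before.
        return (0, -e.urgency, e.target, level_rank, e.not_before)
    # Future entries: not-before, urgency, target time, then level.
    return (1, e.not_before, -e.urgency, e.target, level_rank)
```

Sorting with `sorted(entries, key=lambda e: sort_key(e, now))` places the most urgent ready entry first and any entry whose not-before time is still in the future last.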
10 months agoMerge pull request #59348 from zdover23/wip-doc-2024-08-20-rados-ops-cache-tiering
Zac Dover [Wed, 21 Aug 2024 11:26:54 +0000 (21:26 +1000)]
Merge pull request #59348 from zdover23/wip-doc-2024-08-20-rados-ops-cache-tiering

doc/rados: document unfound object cache-tiering scenario

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
10 months agoMerge pull request #59362 from gbregman/main
Gil Bregman [Wed, 21 Aug 2024 05:46:29 +0000 (08:46 +0300)]
Merge pull request #59362 from gbregman/main

mgr/cephadm: change SPDK RPC fields in nvmeof configuration

10 months agoMerge pull request #59323 from yuvalif/wip-yuval-67514
Yuval Lifshitz [Wed, 21 Aug 2024 05:09:34 +0000 (08:09 +0300)]
Merge pull request #59323 from yuvalif/wip-yuval-67514

test/rgw/notifications: don't check for full queue if topics expired

Reviewed-By: Casey Bodley <cbodley@ibm.com>
10 months agoMerge pull request #54984 from NitzanMordhai/wip-nitzan-restful-un-boundary-keep...
NitzanMordhai [Tue, 20 Aug 2024 16:16:25 +0000 (19:16 +0300)]
Merge pull request #54984 from NitzanMordhai/wip-nitzan-restful-un-boundary-keep-requests

mgr/rest: Trim requests array and limit size

10 months agodoc: add clustering related items to smb docs 59086/head
John Mulligan [Wed, 14 Aug 2024 18:19:17 +0000 (14:19 -0400)]
doc: add clustering related items to smb docs

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agoqa/suites/orch: add a pair of teuthology tests for ctdb smb clusters
John Mulligan [Sat, 10 Aug 2024 18:42:16 +0000 (14:42 -0400)]
qa/suites/orch: add a pair of teuthology tests for ctdb smb clusters

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agoqa/suites/orch: old smb tests need placement count 1 to avoid using clustering
John Mulligan [Sat, 10 Aug 2024 16:49:24 +0000 (12:49 -0400)]
qa/suites/orch: old smb tests need placement count 1 to avoid using clustering

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agomgr/cephadm: enable the smb service to prevent stray ctdb services
John Mulligan [Mon, 12 Aug 2024 14:56:51 +0000 (10:56 -0400)]
mgr/cephadm: enable the smb service to prevent stray ctdb services

Tell cephadm that any `ctdb` services are "owned" by the smb service
and should not be treated as strays.
Ideally, we would do this on a per-service basis, but the info that the ctdb
lock helper provides to its registration function is pretty generic.
Future versions of samba may improve upon this.

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agomgr/cephadm: extend stray service detection with a general ignore hook
John Mulligan [Mon, 12 Aug 2024 14:56:36 +0000 (10:56 -0400)]
mgr/cephadm: extend stray service detection with a general ignore hook

Extend the system's current stray service detection with a new method on
the service classes so that new classes can hook into stray service
detection in the case that ceph services and cephadm services have
differing names or use subsystems that call into ceph with different
names (my use case).

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
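
A general ignore hook of the kind described above might be shaped like the following Python sketch. All class, method, and function names here are hypothetical and chosen for illustration; they are not claimed to match the cephadm module's real interfaces.

```python
from typing import Iterable, List, Set, Tuple


class Service:
    """Hypothetical base class for orchestrated services."""

    def ignore_possible_stray(self, service_type: str,
                              daemon_id: str, name: str) -> bool:
        # By default a service claims nothing.
        return False


class SMBService(Service):
    """Hypothetical smb service that claims all ctdb registrations."""

    def ignore_possible_stray(self, service_type: str,
                              daemon_id: str, name: str) -> bool:
        return service_type == "ctdb"


def find_strays(reported: Iterable[Tuple[str, str]], known: Set[str],
                services: Iterable[Service]) -> List[str]:
    """Return names reported to ceph that no service knows or claims."""
    services = list(services)
    strays = []
    for service_type, daemon_id in reported:
        name = f"{service_type}.{daemon_id}"
        if name in known:
            continue
        if any(s.ignore_possible_stray(service_type, daemon_id, name)
               for s in services):
            continue
        strays.append(name)
    return strays
```

In this sketch a `ctdb` registration is skipped as soon as any service class claims it, while an unknown daemon of another type is still reported as a stray.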
10 months agomgr/cephadm: move logic determining name in stray func
John Mulligan [Mon, 12 Aug 2024 13:52:20 +0000 (09:52 -0400)]
mgr/cephadm: move logic determining name in stray func

Encapsulate the logic determining the name of a stray service into a
method, reducing the length and levels of indent in the stray checker
function.

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agomgr/smb: enable clustering when setting up a cluster
John Mulligan [Mon, 15 Jul 2024 19:41:56 +0000 (15:41 -0400)]
mgr/smb: enable clustering when setting up a cluster

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add a cluster resource field to manage clustering
John Mulligan [Mon, 15 Jul 2024 19:41:43 +0000 (15:41 -0400)]
mgr/smb: add a cluster resource field to manage clustering

Add a new `clustering` field to the smb cluster resource. This field can
be used to select automatic clustering with ctdb, to disable it,
or to require it. The default is automatic and is based on the count value
in the placement spec. A count of 1 disables clustering; any other
value enables it.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
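
The decision rule described above can be sketched in Python. The mode strings `"default"`, `"always"`, and `"never"` and the function name are assumptions for illustration; the commit message only states that the field selects automatic, disabled, or required clustering.

```python
from typing import Optional


def clustering_enabled(mode: str, count: Optional[int]) -> bool:
    """Decide whether ctdb clustering should be enabled.

    `mode` uses hypothetical names for the three choices; `count` is
    the placement count, or None when the placement sets no count.
    """
    if mode == "never":
        return False
    if mode == "always":
        return True
    if mode == "default":
        # Automatic: a count of exactly 1 disables clustering,
        # any other value (including no count) enables it.
        return count != 1
    raise ValueError("unknown clustering mode: " + mode)
```

Under the automatic mode a placement count of 1 yields a non-clustered deployment, which matches the note elsewhere in this log that the older smb tests need a placement count of 1 to avoid clustering.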
10 months agomgr/cephadm: configure ctdb cluster metadata from cephadm smb service
John Mulligan [Thu, 15 Aug 2024 20:40:47 +0000 (16:40 -0400)]
mgr/cephadm: configure ctdb cluster metadata from cephadm smb service

Add support to the smb service module so that cephadm will provide
information about the layout of the smb daemons to the clustermeta
module that, in turn, will provide the information sambacc needs to
configure ctdb.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add a python module to help manage the ctdb cluster
John Mulligan [Mon, 15 Jul 2024 19:39:19 +0000 (15:39 -0400)]
mgr/smb: add a python module to help manage the ctdb cluster

Add a new module clustermeta that implements a JSON based interface
compatible with sambacc. This module will be called directly by cephadm
as it places the daemons on the cluster nodes.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add function to parse rados pseudo-uri values
John Mulligan [Mon, 15 Jul 2024 19:22:43 +0000 (15:22 -0400)]
mgr/smb: add function to parse rados pseudo-uri values

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add support for rados locks to rados store
John Mulligan [Mon, 15 Jul 2024 19:22:22 +0000 (15:22 -0400)]
mgr/smb: add support for rados locks to rados store

Add support for using rados object locks to the rados store classes.
Callers directly using the rados store outside the store interface will
be able to make use of locking.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/cephadm: improve key management of smb service
John Mulligan [Mon, 15 Jul 2024 19:38:12 +0000 (15:38 -0400)]
mgr/cephadm: improve key management of smb service

The clustered mode of a logical smb cluster needs certain additional
capabilities in the rados pool. Improve and reorganize the key
configuration functions, and add the new caps.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agopython-common: add clustering related params to SMBSpec
John Mulligan [Mon, 15 Jul 2024 19:16:56 +0000 (15:16 -0400)]
python-common: add clustering related params to SMBSpec

Add parameters related to ctdb clustering to the smb service
deployment spec.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agocephadm: add ctdb support to smb daemon type
John Mulligan [Mon, 15 Jul 2024 19:16:04 +0000 (15:16 -0400)]
cephadm: add ctdb support to smb daemon type

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agocephadm: allow longer subcomponent names
John Mulligan [Mon, 15 Jul 2024 19:14:37 +0000 (15:14 -0400)]
cephadm: allow longer subcomponent names

Allow subcomponent names up to 32 chars long.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agocephadm: add a new context getter for rank
John Mulligan [Mon, 15 Jul 2024 19:14:13 +0000 (15:14 -0400)]
cephadm: add a new context getter for rank

Add a new context getter function to fetch a daemon's rank and rank
generation value.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/cephadm: change SPDK RPC fields in nvmeof configuration 59362/head
Gil Bregman [Tue, 20 Aug 2024 13:29:57 +0000 (16:29 +0300)]
mgr/cephadm: change SPDK RPC fields in nvmeof configuration
Fixes https://tracker.ceph.com/issues/67629

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
10 months agopython-common/ceph/deployment: change SPDK RPC fields in nvmeof configuration
Gil Bregman [Tue, 20 Aug 2024 13:28:12 +0000 (16:28 +0300)]
python-common/ceph/deployment: change SPDK RPC fields in nvmeof configuration
Fixes https://tracker.ceph.com/issues/67629

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
10 months agodoc/rados: document unfound object cache-tiering scenario 59348/head
Zac Dover [Tue, 20 Aug 2024 12:45:29 +0000 (22:45 +1000)]
doc/rados: document unfound object cache-tiering scenario

Explain how to deal with "unfound objects" when restarting OSDs in a
cache-tiered environment.

Fixes: https://tracker.ceph.com/issues/44286
Signed-off-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge pull request #58460 from rkachach/fix_issue_oauth2_support
Adam King [Tue, 20 Aug 2024 12:35:44 +0000 (08:35 -0400)]
Merge pull request #58460 from rkachach/fix_issue_oauth2_support

adding support for SSO based on oauth2-proxy

Reviewed-by: Adam King <adking@redhat.com>
10 months agoMerge pull request #58860 from adk3798/cephadm-nvmeof-require-group
Adam King [Tue, 20 Aug 2024 12:20:02 +0000 (08:20 -0400)]
Merge pull request #58860 from adk3798/cephadm-nvmeof-require-group

mgr/cephadm: require "group" parameter in nvmeof specs

Reviewed-by: Redouane Kachach <rkachach@ibm.com>
10 months agoMerge pull request #59165 from NitzanMordhai/wip-nitzan-test-rados-tools-newline...
NitzanMordhai [Tue, 20 Aug 2024 12:07:21 +0000 (15:07 +0300)]
Merge pull request #59165 from NitzanMordhai/wip-nitzan-test-rados-tools-newline-trim

test: test_rados_tools compare output without trimming newline

10 months agodoc/mgr/restful: update max_request config 54984/head
nmordech@redhat.com [Wed, 21 Feb 2024 10:01:25 +0000 (10:01 +0000)]
doc/mgr/restful: update max_request config

Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
10 months agoPendingReleaseNotes: Adding note about rest module change and adding max_request...
nmordech@redhat.com [Wed, 21 Feb 2024 09:21:25 +0000 (09:21 +0000)]
PendingReleaseNotes: Adding note about rest module change and adding max_request option

Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
10 months agomgr/rest: Trim request array and limit size
NitzanMordhai [Tue, 28 Nov 2023 09:52:05 +0000 (09:52 +0000)]
mgr/rest: Trim request array and limit size

Presently, the requests array in the REST module can grow
indefinitely, leading to excessive memory consumption, particularly when
request results are long and complex.

To address this issue, impose a limit on the requests array within
the REST module.
The limit is governed by the `mgr/restful/x/max_requests` configuration
parameter specific to the REST module.
When submit_request is called, check whether the requests array exceeds the
limit set by that option. If it does, check whether each request about to be
trimmed has finished, and log an error message if an unfinished request is
trimmed.
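
The trimming behaviour described above can be sketched roughly as follows.
This is a minimal illustration, not the restful module's actual code: the
`Request` and `RequestStore` classes, the `finished` flag, and the logger name
are all hypothetical, and it assumes the oldest requests are the ones trimmed.

```python
import logging

log = logging.getLogger("rest_trim_sketch")  # hypothetical logger name


class Request:
    """Hypothetical stand-in for a submitted REST request."""

    def __init__(self, name, finished=False):
        self.name = name
        self.finished = finished


class RequestStore:
    """Keeps at most max_requests entries, dropping the oldest first."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.requests = []

    def submit_request(self, request):
        self.requests.append(request)
        excess = len(self.requests) - self.max_requests
        if excess > 0:
            # Log an error for every request that is trimmed before it finished.
            for old in self.requests[:excess]:
                if not old.finished:
                    log.error("trimming unfinished request %s", old.name)
            # Keep only the newest max_requests entries.
            self.requests = self.requests[excess:]
        return request


store = RequestStore(max_requests=2)
for req in (Request("a", finished=True), Request("b"), Request("c")):
    store.submit_request(req)
```

In this sketch the check runs on every submission, so the array never holds
more than `max_requests` entries once `submit_request` returns.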

Fixes: https://tracker.ceph.com/issues/59580
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
10 months agoMerge pull request #59153 from ajarr/wip-67436
Ilya Dryomov [Tue, 20 Aug 2024 10:19:23 +0000 (12:19 +0200)]
Merge pull request #59153 from ajarr/wip-67436

rbd: fix CLI output of `rbd group snap info` command when a group snapshot has no member images

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sunil Angadi <Sunil.Angadi@ibm.com>
10 months agoMerge pull request #59292 from cyx1231st/wip-seastore-revert-decouple-ool-writes
Yingxin [Tue, 20 Aug 2024 08:30:57 +0000 (16:30 +0800)]
Merge pull request #59292 from cyx1231st/wip-seastore-revert-decouple-ool-writes

Revert "crimson/os/seastore: wait ool writes in DeviceSubmission phase"

Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>
Reviewed-by: Myoungwon Oh <myoungwon.oh@samsung.com>
10 months agocrimson/common/tri_mutex: also wake up waiters when demoting 59301/head
Xuehan Xu [Mon, 19 Aug 2024 10:22:01 +0000 (18:22 +0800)]
crimson/common/tri_mutex: also wake up waiters when demoting

Fixes: https://tracker.ceph.com/issues/67604
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
10 months agoMerge pull request #59241 from tobias-urdin/openstack-upperconstraints
Casey Bodley [Mon, 19 Aug 2024 17:10:57 +0000 (13:10 -0400)]
Merge pull request #59241 from tobias-urdin/openstack-upperconstraints

qa: barbican: restrict python packages with upper-constraints

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agotest/rgw/notifications: don't check for full queue if topics expired 59323/head
Yuval Lifshitz [Mon, 19 Aug 2024 16:48:29 +0000 (16:48 +0000)]
test/rgw/notifications: don't check for full queue if topics expired

There are other tests for queue length, so we can skip this check
if the test takes too long.
Also remove unnecessary delays from the test.

Fixes: https://tracker.ceph.com/issues/67514?tab=history
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
10 months agomgr/cephadm: add "original_weight" parameter to OSD class 59318/head
Adam King [Mon, 19 Aug 2024 16:30:24 +0000 (12:30 -0400)]
mgr/cephadm: add "original_weight" parameter to OSD class

Fixes: https://tracker.ceph.com/issues/67329
Signed-off-by: Adam King <adking@redhat.com>
10 months agoMerge pull request #58961 from NitzanMordhai/wip-nitzan-dencoder-test-forward-incompa...
Yuri Weinstein [Mon, 19 Aug 2024 14:25:47 +0000 (07:25 -0700)]
Merge pull request #58961 from NitzanMordhai/wip-nitzan-dencoder-test-forward-incompat-fix

workunit/dencoder: dencoder test forward incompat fix

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #58594 from jamiepryde/isa-xor-raid
Yuri Weinstein [Mon, 19 Aug 2024 14:24:56 +0000 (07:24 -0700)]
Merge pull request #58594 from jamiepryde/isa-xor-raid

erasure-code/isa: Use isa/raid's xor_gen() instead of the region_xor(…

Reviewed-by: Mark Nelson <mnelson@redhat.com>