git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 months agoMerge branch 'main' into add-email-contact 59484/head
Shraddha Agrawal [Thu, 29 Aug 2024 05:13:14 +0000 (10:43 +0530)]
Merge branch 'main' into add-email-contact

10 months agoMerge pull request #59390 from ceph/wip-oozmen-67656
Casey Bodley [Wed, 28 Aug 2024 19:29:02 +0000 (15:29 -0400)]
Merge pull request #59390 from ceph/wip-oozmen-67656

doc/rgw/account: Handling notification topics when migrating an existing user into an account

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Yuval Lifshitz <ylifshit@ibm.com>
10 months agoMerge pull request #53915 from pritha-srivastava/wip-rgw-sts-update-oidc-provider
Casey Bodley [Wed, 28 Aug 2024 19:07:21 +0000 (15:07 -0400)]
Merge pull request #53915 from pritha-srivastava/wip-rgw-sts-update-oidc-provider

rgw/iam: add AddClientIdToOIDCProvider/UpdateOidcProviderThumbprint

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59249 from pritha-srivastava/wip-rgw-sts-err-message
Casey Bodley [Wed, 28 Aug 2024 15:27:55 +0000 (11:27 -0400)]
Merge pull request #59249 from pritha-srivastava/wip-rgw-sts-err-message

rgw/sts: correcting the error message returned for an sts key

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agomailmap, githubmap, organisationmap: Add Shraddha Agrawal
Shraddha Agrawal [Wed, 28 Aug 2024 13:27:53 +0000 (18:57 +0530)]
mailmap, githubmap, organisationmap: Add Shraddha Agrawal

Signed-off-by: Shraddha Agrawal <shraddhaag@ibm.com>
10 months agoMerge PR #59300 into main
Venky Shankar [Wed, 28 Aug 2024 13:53:02 +0000 (19:23 +0530)]
Merge PR #59300 into main

* refs/pull/59300/head:
client: calls to _ll_fh_exists() should hold client_lock

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
10 months agoMerge pull request #59419 from phlogistonjohn/jjm-smb-ctdb-vips
Adam King [Wed, 28 Aug 2024 12:45:43 +0000 (08:45 -0400)]
Merge pull request #59419 from phlogistonjohn/jjm-smb-ctdb-vips

smb: cluster public ip addresses support

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Michael Adam <obnox@samba.org>
10 months agoMerge pull request #59434 from VallariAg/fix-nvmeof-apply-teuthology
Vallari Agrawal [Wed, 28 Aug 2024 12:37:35 +0000 (18:07 +0530)]
Merge pull request #59434 from VallariAg/fix-nvmeof-apply-teuthology

qa/tasks/nvmeof.py: add nvmeof gw-group to deployment

10 months agoMerge pull request #59385 from leonidc/wip-leonidc-20242108-fixing-gw-bugs
Aviv Caro [Wed, 28 Aug 2024 10:35:01 +0000 (13:35 +0300)]
Merge pull request #59385 from leonidc/wip-leonidc-20242108-fixing-gw-bugs

mon: handle gw fast-reboot, proper handle of gw delete scenarios

10 months agoMerge pull request #59240 from leonidc/wip-leonidc-20241508-upgrade-rules-centos9...
Aviv Caro [Wed, 28 Aug 2024 10:34:37 +0000 (13:34 +0300)]
Merge pull request #59240 from leonidc/wip-leonidc-20241508-upgrade-rules-centos9-only

upgrade rules for NVMeofGW monitors and gateways

10 months agoMerge pull request #59401 from nbalacha/wip-nbalacha-check-mirror-ns
Ilya Dryomov [Wed, 28 Aug 2024 06:56:57 +0000 (08:56 +0200)]
Merge pull request #59401 from nbalacha/wip-nbalacha-check-mirror-ns

rbd-mirror: use correct ioctx for namespace

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
10 months agoMerge pull request #57876 from Suyashd999/rgw-false-positives
Yuval Lifshitz [Wed, 28 Aug 2024 05:18:27 +0000 (08:18 +0300)]
Merge pull request #57876 from Suyashd999/rgw-false-positives

rgw: rgw_auth.cc: disable false use-after-move clang-tidy warning

Reviewed-By: Ronen Friedman <rfriedma@ibm.com>, Yuval Lifshitz <ylifshit@ibm.com>
10 months agoMerge pull request #58959 from rhcs-dashboard/fix-67192-main
Aashish Sharma [Wed, 28 Aug 2024 05:09:51 +0000 (10:39 +0530)]
Merge pull request #58959 from rhcs-dashboard/fix-67192-main

mgr/dashboard: Add Performance Details grafana charts for individual clusters in Manage-clusters page

Reviewed-by: Nizamudeen A <nia@redhat.com>
10 months agoqa/suites/orch: add test for smb with ctdb and cluster public ips 59419/head
John Mulligan [Fri, 23 Aug 2024 14:16:27 +0000 (10:16 -0400)]
qa/suites/orch: add test for smb with ctdb and cluster public ips

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agodoc: add documentation for (cluster_)public_addrs options
John Mulligan [Fri, 23 Aug 2024 14:01:08 +0000 (10:01 -0400)]
doc: add documentation for (cluster_)public_addrs options

Document the spec and resource options (they're basically the same) for
specifying public addresses that will be managed automatically
by CTDB.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: add cluster public ip information to service spec
John Mulligan [Thu, 22 Aug 2024 18:08:16 +0000 (14:08 -0400)]
mgr/smb: add cluster public ip information to service spec

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/smb: extend cluster resource type to define public ip addrs
John Mulligan [Thu, 22 Aug 2024 18:08:06 +0000 (14:08 -0400)]
mgr/smb: extend cluster resource type to define public ip addrs

When a cluster defines public IPs it will pass this information along to
the smb service spec.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agomgr/cephadm: pass public addresses for a cluster to cephadm binary
John Mulligan [Wed, 21 Aug 2024 21:02:57 +0000 (17:02 -0400)]
mgr/cephadm: pass public addresses for a cluster to cephadm binary

Add the strictly-formed public addresses list as one of the config blobs
we pass to the binary for smb container deployment.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agopython-common/deployment: add a cluster public ip spec for smb
John Mulligan [Wed, 21 Aug 2024 15:31:52 +0000 (11:31 -0400)]
python-common/deployment: add a cluster public ip spec for smb

This spec can be used to define one or more public addresses that will
be automatically assigned to hosts by CTDB. The address can be specified
in the "interface" form - an address plus prefix length.  Optionally,
networks to bind to can be specified. The network value will be
converted to a network device name later by cephadm.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
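
The "interface" form described above (an address plus a prefix length, with optional networks to bind to) can be illustrated with a minimal sketch. The class name `PublicAddrSketch` and its field names are hypothetical and are not taken from the Ceph python-common code; only the standard-library `ipaddress` module is used.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List


@dataclass
class PublicAddrSketch:
    """Hypothetical stand-in for the spec type; not the real Ceph class."""

    address: str  # "interface" form: address/prefix-length
    destinations: List[str] = field(default_factory=list)  # optional networks

    def validate(self) -> None:
        # A bare address would silently become a /32 (or /128), so the
        # prefix length is required explicitly in this sketch.
        if "/" not in self.address:
            raise ValueError(f"{self.address!r}: a prefix length is required")
        # ip_interface accepts host bits, e.g. 192.168.76.10/24.
        ipaddress.ip_interface(self.address)
        for net in self.destinations:
            # ip_network is strict by default and rejects host bits.
            ipaddress.ip_network(net)


addr = PublicAddrSketch("192.168.76.10/24", ["192.168.76.0/24"])
addr.validate()  # raises ValueError on malformed input
```

Requiring the prefix length up front keeps the address usable as-is when it is later assigned to a network device.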
10 months agoMerge pull request #58380 from adk3798/squid-base-mds-upgrade-sequence-cephadm
Adam King [Tue, 27 Aug 2024 17:33:15 +0000 (13:33 -0400)]
Merge pull request #58380 from adk3798/squid-base-mds-upgrade-sequence-cephadm

qa/suites/fs: pull compiled cephadm for squid branch in mds_upgrade_sequence

Reviewed-by: John Mulligan <jmulligan@redhat.com>
10 months agoMerge pull request #59421 from phlogistonjohn/jjm-teuth-cephadm-from-ctr
Adam King [Tue, 27 Aug 2024 17:32:43 +0000 (13:32 -0400)]
Merge pull request #59421 from phlogistonjohn/jjm-teuth-cephadm-from-ctr

qa/tasks: add a new cephadm_from_container feature to cephadm task

Reviewed-by: Adam King <adking@redhat.com>
10 months agoMerge PR #59171 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:20:54 +0000 (13:20 -0400)]
Merge PR #59171 into main

* refs/pull/59171/head:
client: use vectors for context lists

Reviewed-by: Venky Shankar <vshankar@redhat.com>
10 months agoMerge PR #59176 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:12:11 +0000 (13:12 -0400)]
Merge PR #59176 into main

* refs/pull/59176/head:
mds: encode quiesce payload on demand
mds: print quiesce message name in debug log

Reviewed-by: Venky Shankar <vshankar@redhat.com>
10 months agoMerge PR #58419 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:10:54 +0000 (13:10 -0400)]
Merge PR #58419 into main

* refs/pull/58419/head:
mds: generate correct path for unlinked snapped files
qa: add test for cephx path check on unlinked snapped dir tree
mds: add debugging for stray_prior_path

Reviewed-by: Milind Changire <mchangir@redhat.com>
10 months agoMerge PR #58987 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:10:10 +0000 (13:10 -0400)]
Merge PR #58987 into main

* refs/pull/58987/head:
qa/cephfs: update ignorelist

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agoMerge PR #59088 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:09:38 +0000 (13:09 -0400)]
Merge PR #59088 into main

* refs/pull/59088/head:
mds: add compile time checks for sortedness
mds: sort conf keys

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
10 months agoMerge PR #59095 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:09:11 +0000 (13:09 -0400)]
Merge PR #59095 into main

* refs/pull/59095/head:
qa: wait for file creation before changing mode

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agoMerge PR #59162 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:08:33 +0000 (13:08 -0400)]
Merge PR #59162 into main

* refs/pull/59162/head:
client: Prevent race condition when printing Inode in ll_sync_inode

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
10 months agoMerge PR #59173 into main
Patrick Donnelly [Tue, 27 Aug 2024 17:07:37 +0000 (13:07 -0400)]
Merge PR #59173 into main

* refs/pull/59173/head:
mds: fix spelling typo

Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
Reviewed-by: Neeraj Pratap Singh <neesingh@redhat.com>
10 months agoMerge pull request #59423 from idryomov/wip-67698
Ilya Dryomov [Tue, 27 Aug 2024 15:06:54 +0000 (17:06 +0200)]
Merge pull request #59423 from idryomov/wip-67698

rbd: "rbd bench" always writes the same byte

Reviewed-by: Mykola Golub <mgolub@suse.com>
10 months agoMerge pull request #59409 from adk3798/teuth-reinstall-nvme-cli
Adam King [Tue, 27 Aug 2024 12:48:26 +0000 (08:48 -0400)]
Merge pull request #59409 from adk3798/teuth-reinstall-nvme-cli

qa/distros: reinstall nvme-cli on centos 9 nodes

Reviewed-by: Guillaume Abrioux <gabrioux@ibm.com>
10 months agoMerge pull request #57952 from NitzanMordhai/wip-nitzan-bench-osd-admin-command
Matan Breizman [Tue, 27 Aug 2024 10:03:02 +0000 (13:03 +0300)]
Merge pull request #57952 from NitzanMordhai/wip-nitzan-bench-osd-admin-command

crimson: Add support for bench osd command

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
10 months agoMerge pull request #59189 from xxhdx1985126/wip-67508
Matan Breizman [Tue, 27 Aug 2024 08:25:03 +0000 (11:25 +0300)]
Merge pull request #59189 from xxhdx1985126/wip-67508

crimson/osd/recovery_backend: restart object pulls that are blocked by down osds

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
10 months agoMerge pull request #59085 from VallariAg/update-default-nvmeof-img
Aviv Caro [Tue, 27 Aug 2024 08:20:17 +0000 (11:20 +0300)]
Merge pull request #59085 from VallariAg/update-default-nvmeof-img

mgr/cephadm: bump DEFAULT_NVMEOF_IMAGE to 1.2.17

10 months agoMerge pull request #59433 from idryomov/wip-drop-xmlstarlet-variable
Ilya Dryomov [Tue, 27 Aug 2024 06:53:38 +0000 (08:53 +0200)]
Merge pull request #59433 from idryomov/wip-drop-xmlstarlet-variable

qa: drop XMLSTARLET variable, use xmlstarlet directly

Reviewed-by: Ramana Raja <rraja@redhat.com>
10 months agorbd-mirror: use correct ioctx for namespace 59401/head
N Balachandran [Thu, 22 Aug 2024 08:15:36 +0000 (13:45 +0530)]
rbd-mirror: use correct ioctx for namespace

The PoolReplayer uses the ioctx for the default namespace
to check if other namespaces are enabled for mirroring, causing
it to incorrectly conclude that they are all enabled.

Fixes: https://tracker.ceph.com/issues/67676
Signed-off-by: N Balachandran <nibalach@redhat.com>
10 months agocrimson/osd/pg: add logs for repeating pulls 59189/head
Xuehan Xu [Thu, 22 Aug 2024 09:54:02 +0000 (17:54 +0800)]
crimson/osd/pg: add logs for repeating pulls

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
10 months agoMerge pull request #58870 from rhcs-dashboard/fix-67194-main
afreen23 [Tue, 27 Aug 2024 02:31:38 +0000 (08:01 +0530)]
Merge pull request #58870 from rhcs-dashboard/fix-67194-main

mgr/dashboard: fix typo in Multi-Cluster > Manager Cluster to Manage Clusters

Reviewed-by: Afreen Misbah <afreen23.git@gmail.com>
10 months agoMerge pull request #59376 from rhcs-dashboard/Upgrade-page-scroll-issue
afreen23 [Tue, 27 Aug 2024 02:20:30 +0000 (07:50 +0530)]
Merge pull request #59376 from rhcs-dashboard/Upgrade-page-scroll-issue

mgr/dashboard: can't scroll to the end of the page

Reviewed-by: Afreen Misbah <afreen23.git@gmail.com>
10 months agocrimson/osd/recovery_backend: restart object pulling for recoveries that
Xuehan Xu [Tue, 13 Aug 2024 07:32:02 +0000 (15:32 +0800)]
crimson/osd/recovery_backend: restart object pulling for recoveries that
are blocked pulling from down osds

Fixes: https://tracker.ceph.com/issues/67508
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
10 months agocrimson/common/interruptible_future: new interruptor function `repeat_eagain`
Xuehan Xu [Tue, 13 Aug 2024 06:59:23 +0000 (14:59 +0800)]
crimson/common/interruptible_future: new interruptor function `repeat_eagain`

Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
10 months agoMerge pull request #59332 from afreen23/nvmeof-group-mtls
afreen23 [Tue, 27 Aug 2024 01:27:02 +0000 (06:57 +0530)]
Merge pull request #59332 from afreen23/nvmeof-group-mtls

mgr/dashboard: Add group field in nvmeof service form

Reviewed-by: Afreen Misbah <afreen23.git@gmail.com>
10 months agoMerge pull request #59422 from cbodley/wip-67697
Casey Bodley [Mon, 26 Aug 2024 21:52:22 +0000 (17:52 -0400)]
Merge pull request #59422 from cbodley/wip-67697

rgw: ignore zoneless default realm when not configured

Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>
10 months agocephadm: add support for cluster public ip addresses to smb daemon
John Mulligan [Wed, 21 Aug 2024 21:03:40 +0000 (17:03 -0400)]
cephadm: add support for cluster public ip addresses to smb daemon

When a list of public addresses (and optional network destination(s))
are supplied at deploy time, convert the networks to device names
and pass that result to the sambacc ctdb configuration.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
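
One way to picture the network-to-device conversion mentioned above is the following sketch. It assumes a pre-gathered mapping of device names to their addresses; the function name `device_for_network` and the `host_addrs` structure are hypothetical and do not reflect how cephadm actually collects or resolves this information.

```python
import ipaddress
from typing import Dict, List, Optional


def device_for_network(
    network: str, host_addrs: Dict[str, List[str]]
) -> Optional[str]:
    """Return the first device holding an address inside `network`, else None."""
    net = ipaddress.ip_network(network)
    for device, addrs in host_addrs.items():
        for addr in addrs:
            # Membership is False when the IP versions differ.
            if ipaddress.ip_interface(addr).ip in net:
                return device
    return None


# Hypothetical host inventory: device name -> addresses in interface form.
host_addrs = {"eth0": ["10.0.0.5/24"], "eth1": ["192.168.76.21/24"]}
device = device_for_network("192.168.76.0/24", host_addrs)
```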
10 months agomgr/smb: simplify orch backend enablement
John Mulligan [Wed, 21 Aug 2024 21:03:19 +0000 (17:03 -0400)]
mgr/smb: simplify orch backend enablement

We have a developer/debug module option that allows one to disable
triggering orchestration. When I tried to use it I thought it was
buggy and I had trouble diagnosing it. The mistake was on my side,
but the code change makes it much clearer what is being enabled
so I want to keep it.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
10 months agodoc/rgw/account: Handling notification topics when migrating an existing user into... 59390/head
Oguzhan Ozmen [Thu, 22 Aug 2024 02:44:01 +0000 (22:44 -0400)]
doc/rgw/account: Handling notification topics when migrating an existing user into an account

Add a subsection under "Migrate an existing User into an Account" to
describe how a client can seamlessly migrate the notification topics
after account migration.

Fixes: https://tracker.ceph.com/issues/67656

Signed-off-by: Oguzhan Ozmen <oozmen@bloomberg.net>
10 months agoMerge pull request #59227 from xxhdx1985126/wip-67564
Matan Breizman [Mon, 26 Aug 2024 15:19:58 +0000 (18:19 +0300)]
Merge pull request #59227 from xxhdx1985126/wip-67564

crimson/osd/pg: implement PG::PGLogEntryHandler::remove()

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
10 months agoMerge pull request #59117 from cbodley/wip-67468
Casey Bodley [Mon, 26 Aug 2024 15:12:27 +0000 (11:12 -0400)]
Merge pull request #59117 from cbodley/wip-67468

rgw/rados: zero-init shard_count in RGWBucket::check_index_unlinked()

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
10 months agoMerge pull request #59172 from clwluvw/enoent-loglevel
Casey Bodley [Mon, 26 Aug 2024 15:12:03 +0000 (11:12 -0400)]
Merge pull request #59172 from clwluvw/enoent-loglevel

rgw: increase log level for enoent caused by clients

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59243 from cbodley/wip-67522
Casey Bodley [Mon, 26 Aug 2024 15:04:38 +0000 (11:04 -0400)]
Merge pull request #59243 from cbodley/wip-67522

rgw/http: finish_request() after logging errors

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59329 from smanjara/wip-data-sync-full-initialize
Casey Bodley [Mon, 26 Aug 2024 15:04:24 +0000 (11:04 -0400)]
Merge pull request #59329 from smanjara/wip-data-sync-full-initialize

rgw/multisite: initialize sync_status in RGWDataFullSyncSingleEntryCR ctor

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #57956 from tobias-urdin/remove-keystone-v2
Casey Bodley [Mon, 26 Aug 2024 15:03:42 +0000 (11:03 -0400)]
Merge pull request #57956 from tobias-urdin/remove-keystone-v2

rgw/auth: Remove Keystone v2.0 API support

Reviewed-by: Casey Bodley <cbodley@redhat.com>
10 months agotest/rgw: include --rgw-realm/zonegroup/zone args for 'account create' 59422/head
Casey Bodley [Fri, 23 Aug 2024 19:55:44 +0000 (15:55 -0400)]
test/rgw: include --rgw-realm/zonegroup/zone args for 'account create'

Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 months agotest/rgw: test_multi.py creates realm with --default
Casey Bodley [Fri, 23 Aug 2024 19:54:18 +0000 (15:54 -0400)]
test/rgw: test_multi.py creates realm with --default

mstart.sh relies on default realm/zonegroup/zone configuration, because
it doesn't supply them to radosgw as config options

Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 months agorgw: ignore zoneless default realm when not configured
Casey Bodley [Fri, 23 Aug 2024 19:03:31 +0000 (15:03 -0400)]
rgw: ignore zoneless default realm when not configured

"default" zone/zonegroup deployments without a realm can be broken by
the creation of an unrelated realm, because that realm is (was)
automatically set as the default

when startup detects an incomplete default realm (one that doesn't have
a default zone), fall back to the realmless "default" zone/zonegroup
instead

Fixes: https://tracker.ceph.com/issues/67697
Signed-off-by: Casey Bodley <cbodley@redhat.com>
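
The fallback described above can be sketched roughly as follows. This is an illustration of the decision only, not the radosgw startup code: the function name, its parameters, and the rule that an explicitly configured realm is always honored are assumptions made for the example.

```python
from typing import Dict, Optional


def resolve_realm(
    configured_realm: Optional[str],
    default_realm: Optional[str],
    default_zone_of: Dict[str, str],
) -> Optional[str]:
    """Return the realm to start with, or None for the realmless "default" setup.

    `default_zone_of` maps a realm name to its default zone; a realm that is
    missing from the mapping is treated as incomplete (zoneless).
    """
    if configured_realm:
        # Assumption: an explicit configuration is never overridden.
        return configured_realm
    if default_realm and default_realm in default_zone_of:
        return default_realm
    # Incomplete or absent default realm: use the realmless "default"
    # zone/zonegroup instead.
    return None


# An unrelated realm was created and became the default, but has no zone yet.
chosen = resolve_realm(None, "unrelated", default_zone_of={})
```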
10 months agoradosgw-admin: add 'realm default rm' command
Casey Bodley [Fri, 23 Aug 2024 18:53:46 +0000 (14:53 -0400)]
radosgw-admin: add 'realm default rm' command

the 'realm default' command could only set a different realm as the
default, and provided no way to clear the default

Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 months agoMerge pull request #59301 from xxhdx1985126/wip-67604
Matan Breizman [Mon, 26 Aug 2024 10:55:47 +0000 (13:55 +0300)]
Merge pull request #59301 from xxhdx1985126/wip-67604

crimson/common/tri_mutex: also wake up waiters when demoting

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
10 months agoMerge pull request #58136 from xxhdx1985126/wip-66372
Matan Breizman [Mon, 26 Aug 2024 10:50:10 +0000 (13:50 +0300)]
Merge pull request #58136 from xxhdx1985126/wip-66372

crimson/osd/osd: mark down connections to downed osds

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
10 months agoMerge pull request #54620 from rishabh-d-dave/mgr-vol-clone-stats
Venky Shankar [Mon, 26 Aug 2024 10:14:53 +0000 (15:44 +0530)]
Merge pull request #54620 from rishabh-d-dave/mgr-vol-clone-stats

mgr/vol: show progress and stats for the subvolume snapshot clones

Reviewed-by: Venky Shankar <vshankar@redhat.com>
10 months agoqa/tasks/nvmeof.py: add nvmeof gw-group to deployment 59434/head
Vallari Agrawal [Mon, 26 Aug 2024 04:23:07 +0000 (09:53 +0530)]
qa/tasks/nvmeof.py: add nvmeof gw-group to deployment

The group was made a required parameter, as in
`ceph orch apply nvmeof <pool> <group>`, by
https://github.com/ceph/ceph/pull/58860.
That broke the `nvmeof` suite, so this PR fixes it.

Right now, all gateways are deployed in a single group.
Later, this will be changed to use multiple groups for a better test.

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
10 months ago mon/NVMeofGw*: fixing bugs - handle gw fast-reboot, proper handle of gw delete scenarios 59385/head
Leonid Chernin [Wed, 21 Aug 2024 16:30:14 +0000 (16:30 +0000)]
mon/NVMeofGw*: fixing bugs - handle gw fast-reboot, proper handle of gw delete scenarios

Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
10 months agoMerge pull request #59428 from zdover23/wip-doc-2024-08-26-cephadm-services-osd
Zac Dover [Mon, 26 Aug 2024 08:09:16 +0000 (18:09 +1000)]
Merge pull request #59428 from zdover23/wip-doc-2024-08-26-cephadm-services-osd

doc/cephadm: how to get exact size_spec from device

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
10 months agomon/NVMeofGw*: support upgrades from prior out-of-tree nvmeofha implementation (nvmeo... 59240/head
Leonid Chernin [Sun, 18 Aug 2024 05:16:14 +0000 (05:16 +0000)]
mon/NVMeofGw*: support upgrades from prior out-of-tree nvmeofha implementation (nvmeof-reef)

This commit adds upgrade support for users running an experimental
nvmeofha implementation which can be found in the nvmeof-reef branch in
ceph.git.

Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
10 months agoinclude/ceph_features: add NVMEOFHA feature bit
Samuel Just [Wed, 14 Aug 2024 19:40:50 +0000 (12:40 -0700)]
include/ceph_features: add NVMEOFHA feature bit

Normally, we'd just use the SERVER_SQUID or SERVER_T flags instead of
using an extra feature bit.  However, the nvmeof ha monitor paxos
service has had a more complex development journey.  There are users
interested in using the nvmeof ha feature in squid, but it didn't make
the cutoff for backporting it.  There's an upstream nvmeof-squid branch
in the ceph.git repository with the patches backported for anyone
interested in building it.

However, that means that users of our normal stable releases will see
the feature added to the monitor one release after anyone who chooses to
use the nvmeof-squid branch.  We could disallow upgrades from
nvmeof-squid to T, but by adding a feature bit here we make such a
restriction unnecessary.

Signed-off-by: Samuel Just <sjust@redhat.com>
10 months agoinclude/ceph_features: remove stray available marker
Samuel Just [Wed, 14 Aug 2024 19:22:23 +0000 (12:22 -0700)]
include/ceph_features: remove stray available marker

Should have been removed in caa9e7a45e.

Signed-off-by: Samuel Just <sjust@redhat.com>
10 months agocrimson: Add support for bench osd command 57952/head
Nitzan Mordechai [Mon, 10 Jun 2024 10:51:03 +0000 (10:51 +0000)]
crimson: Add support for bench osd command

This commit adds support for the 'bench' admin command in the OSD,
allowing administrators to perform benchmark tests on the OSD. The
'bench' command accepts 4 optional parameters with the following
default values:

1. count - Total number of bytes to write (default: 1GB).
2. size - Block size for each write operation (default: 4MB).
3. object_size - Size of each object to write (default: 0).
4. object_num - Number of objects to write (default: 0).

The results of the benchmark are returned in a JSON formatted output,
which includes the following fields:

1. bytes_written - Total number of bytes written during the benchmark.
2. blocksize - Block size used for each write operation.
3. elapsed_sec - Total time taken to complete the benchmark in seconds.
4. bytes_per_sec - Write throughput in bytes per second.
5. iops - Number of input/output operations per second.

Example JSON output:

```json
{
  "osd_bench_results": {
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "elapsed_sec": 0.5,
    "bytes_per_sec": 2147483648,
    "iops": 512
  }
}
```

Fixes: https://tracker.ceph.com/issues/66380
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
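
As a rough illustration of how the fields in the example output relate to each other, the sketch below parses the sample JSON from the commit message and derives the throughput in MiB/s and the number of block writes. The helper name `summarize` and the derived field names are hypothetical; only the JSON keys come from the commit message.

```python
import json

# Sample output copied from the commit message above.
sample = """
{
  "osd_bench_results": {
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "elapsed_sec": 0.5,
    "bytes_per_sec": 2147483648,
    "iops": 512
  }
}
"""


def summarize(raw: str) -> dict:
    """Derive a few convenience figures from a bench result document."""
    res = json.loads(raw)["osd_bench_results"]
    writes = res["bytes_written"] // res["blocksize"]
    return {
        "mib_per_sec": res["bytes_per_sec"] / (1024 * 1024),
        "writes": writes,
        # Should agree with the reported "iops" field for this sample.
        "derived_iops": writes / res["elapsed_sec"],
    }


summary = summarize(sample)
```

For the sample, 1 GiB written in 4 MiB blocks is 256 writes, which over 0.5 seconds matches the reported 512 operations per second.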
10 months agoMerge pull request #59392 from cyx1231st/wip-inplace-rewrite-comments
Yingxin [Mon, 26 Aug 2024 03:28:03 +0000 (11:28 +0800)]
Merge pull request #59392 from cyx1231st/wip-inplace-rewrite-comments

crimson/os/seastore: refine documents related to inplace rewrite

Reviewed-by: Myoungwon Oh <myoungwon.oh@samsung.com>
10 months agodoc/cephadm: how to get exact size_spec from device 59428/head
Zac Dover [Sun, 25 Aug 2024 20:03:34 +0000 (06:03 +1000)]
doc/cephadm: how to get exact size_spec from device

Add instructions for retrieving the exact size of block devices.

Fixes: https://tracker.ceph.com/issues/66754
Signed-off-by: Zac Dover <zac.dover@proton.me>
10 months agoMerge pull request #59053 from baum/wip-baum-20240806-00
baum [Sun, 25 Aug 2024 18:10:46 +0000 (21:10 +0300)]
Merge pull request #59053 from baum/wip-baum-20240806-00

nvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons

10 months agoMerge pull request #58858 from ronen-fr/wip-rf-entry
Ronen Friedman [Sun, 25 Aug 2024 16:44:03 +0000 (19:44 +0300)]
Merge pull request #58858 from ronen-fr/wip-rf-entry

osd/scrub: a scrub queue of level-specific entries

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
10 months agotest/osd/scrub: fix searched-for log string 58858/head
Ronen Friedman [Sun, 25 Aug 2024 08:57:42 +0000 (03:57 -0500)]
test/osd/scrub: fix searched-for log string

To match the modified log message in
OsdScrub::restrictions_on_scrubbing().

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix missing 'const' on some formatters
Ronen Friedman [Sat, 24 Aug 2024 11:41:44 +0000 (06:41 -0500)]
osd/scrub: fix missing 'const' on some formatters

required to pass CI checks.

co-author: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agotest/osd/scrub: disable tests for deleted scrub functionality
Ronen Friedman [Sat, 24 Aug 2024 05:36:44 +0000 (00:36 -0500)]
test/osd/scrub: disable tests for deleted scrub functionality

The scrub scheduler no longer "upgrades" shallow scrubs into
deep ones on error, so the tests that check this functionality
are no longer valid.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agotest/osd: test new functionality added to the not-before queue
Ronen Friedman [Sun, 18 Aug 2024 17:33:38 +0000 (12:33 -0500)]
test/osd: test new functionality added to the not-before queue

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: delay both targets on some failures
Ronen Friedman [Sat, 17 Aug 2024 16:08:19 +0000 (11:08 -0500)]
osd/scrub: delay both targets on some failures

If the failure of a scrub-job is due to a condition that affects
both targets, both should be delayed. Otherwise, we may end up
with the following bogus scenario:

A high priority deep target is scheduled, but scrub session initiation
fails due to, for example, a concurrent snap trim. The deep target
will be delayed. A second initiation attempt may happen after the
snap trimming is done, but before the updated deep target not-before.
As a result - the lower priority target will be scheduled before the
higher priority one - which is a bug.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
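
The scenario above can be reproduced with a small model. This is a simplified sketch, not the scrubber code: the `Target` class, the numeric priorities, and the helper names are all hypothetical, and selection is reduced to "highest priority among targets whose not-before time has passed".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Target:
    level: str         # "shallow" or "deep"
    priority: int      # higher value means more urgent
    not_before: float  # earliest time the target may be scheduled


def delay_targets(targets: List[Target], failed_level: str, now: float,
                  delay: float, affects_both: bool) -> None:
    """Push back the failed target, or every target if the cause is shared."""
    for t in targets:
        if affects_both or t.level == failed_level:
            t.not_before = max(t.not_before, now + delay)


def pick(targets: List[Target], now: float) -> Optional[Target]:
    """Return the highest-priority target that is ready at `now`, if any."""
    ready = [t for t in targets if t.not_before <= now]
    return max(ready, key=lambda t: t.priority, default=None)


def make_job() -> List[Target]:
    return [Target("deep", 2, 0.0), Target("shallow", 1, 0.0)]


# Only the deep target is delayed: the shallow one wins at t=5 (the bug).
buggy = make_job()
delay_targets(buggy, "deep", now=0.0, delay=10.0, affects_both=False)
buggy_choice = pick(buggy, now=5.0)

# Both targets are delayed: nothing runs at t=5, and deep wins at t=10.
fixed = make_job()
delay_targets(fixed, "deep", now=0.0, delay=10.0, affects_both=True)
fixed_choice = pick(fixed, now=10.0)
```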
10 months agoosd/scrub: reverse OSDRestrictions flags polarity
Ronen Friedman [Thu, 15 Aug 2024 13:17:48 +0000 (08:17 -0500)]
osd/scrub: reverse OSDRestrictions flags polarity

As most of the flags in OSDRestrictions are of 'true is bad' polarity,
reverse the two non-conforming flags - cpu load and time-of-day
restrictions - to match.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix the conditions for auto-repair scrubs
Ronen Friedman [Thu, 15 Aug 2024 12:51:15 +0000 (07:51 -0500)]
osd/scrub: fix the conditions for auto-repair scrubs

The conditions for auto-repair scrubs should have been changed
when need_auto lost some of its setters.

Also fix the rescheduling of repair scrubs
when the last scrub ended with errors.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove requested_scrub_t::deep_scrub_on_error
Ronen Friedman [Thu, 8 Aug 2024 13:49:57 +0000 (08:49 -0500)]
osd/scrub: remove requested_scrub_t::deep_scrub_on_error

This flag was used to indicate that a deep scrub should
be performed if a shallow scrub finds an error. It was
always set true for shallow, regular, scrubs - if
can_autorepair flag was set. Thus, the ephemeral flag in
the requested_scrub_t object is not really needed.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoqa/standalone/scrub: disable scrub_extended_sleep test
Ronen Friedman [Tue, 6 Aug 2024 13:07:17 +0000 (08:07 -0500)]
qa/standalone/scrub: disable scrub_extended_sleep test

Disabling osd-scrub-test.sh::TEST_scrub_extended_sleep,
as the test is no longer valid (updated code no longer
produces the same logs or the same behavior).

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove non-display usage of target's is_high_priority()
Ronen Friedman [Tue, 30 Jul 2024 12:12:54 +0000 (07:12 -0500)]
osd/scrub: remove non-display usage of target's is_high_priority()

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: remove 'calculated_to_deep' flag
Ronen Friedman [Mon, 29 Jul 2024 04:34:32 +0000 (23:34 -0500)]
osd/scrub: remove 'calculated_to_deep' flag

as once a sched-target was selected, we know the level of the scrub.
Also removed: the ephemeral 'time_for_deep' flag.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: modify after-repair-scrub triggering
Ronen Friedman [Sun, 28 Jul 2024 12:37:07 +0000 (07:37 -0500)]
osd/scrub: modify after-repair-scrub triggering

... to manipulate the relevant scrub target directly, instead
of using the 'planned scrub' flags.

The relevant condition flag was moved from the PG and into the scrubber.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix ReplicaReservations ctor to use correct query
Ronen Friedman [Sun, 28 Jul 2024 10:52:38 +0000 (05:52 -0500)]
osd/scrub: fix ReplicaReservations ctor to use correct query

when determining whether replica reservations are required.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix parameters validation on scrub start
Ronen Friedman [Sun, 28 Jul 2024 06:09:25 +0000 (01:09 -0500)]
osd/scrub: fix parameters validation on scrub start

... as the selected target already determines the
scrub level & type.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix reserve_local()
Ronen Friedman [Sun, 28 Jul 2024 10:20:38 +0000 (05:20 -0500)]
osd/scrub: fix reserve_local()

to use the correct method when determining whether we should
perform the reservation.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: fix initiation path of operator-commanded scrubs
Ronen Friedman [Sat, 27 Jul 2024 17:59:46 +0000 (12:59 -0500)]
osd/scrub: fix initiation path of operator-commanded scrubs

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon/not_before_queue: extending the container's API
Ronen Friedman [Tue, 30 Jul 2024 10:59:00 +0000 (05:59 -0500)]
common/not_before_queue: extending the container's API

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agoosd/scrub: OSD's scrub queue now holds SchedEntry-s
Ronen Friedman [Wed, 24 Jul 2024 07:02:46 +0000 (02:02 -0500)]
osd/scrub: OSD's scrub queue now holds SchedEntry-s

The OSD's scrub queue now holds SchedEntry-s, instead of ScrubJob-s.
The queue itself is implemented using the 'not_before_queue_t' class.

Note: this is not a stable state of the scrubber code. In the next
commits:
- modifying the way sched targets are modified and updated, to match the
  new queue implementation.
- removing the 'planned scrub' flags.

Important note: the interaction of initiate_scrub() and pop_ready_pg()
is not changed by this commit. Namely:

Currently - pop..() loops over all eligible jobs, until it finds one
that matches the environment restrictions (which most of the time, as the
concurrency limit is usually reached, would be 'high-priority-only').

The other option is to maintain Sam's 'not_before_q' clean interface: we
always pop the top, and if that top fails the preconds tests - we delay and
re-push. This has the following troubling implications:

- it would take a long time to find a viable scrub job, if the problem
  is related to, for example, 'no scrub'.
- local resources failure (inc_scrubs() failure) must be handled
  separately, as we do not want to reshuffle the queue for this
  very very common case.
- but the real problem: unneeded shuffling of the queue, even as the
  problem is not with the scrub job itself, but with the environment
  (esp. no-scrub etc.).
  This is a common case, and it would be wrong to reshuffle the queue
  for that.
- and - remember that any change to a sched-entry must be done under PG
  lock.
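
The two dequeue strategies discussed above can be contrasted with a small
sketch. This is a toy model under stated assumptions, not the
not_before_queue_t API: Entry, pop_first_allowed(), pop_top_or_delay(), the
integer clock and the 'delay' parameter are all hypothetical names used for
illustration only.

```cpp
#include <algorithm>
#include <functional>
#include <optional>
#include <vector>

// Toy stand-in for a scheduling entry (hypothetical, not Ceph's SchedEntry).
struct Entry {
  int pg;              // hypothetical PG identifier
  int not_before;      // earliest time at which the entry is eligible
  bool high_priority;  // whether a 'high-priority-only' environment allows it
};

using EnvCheck = std::function<bool(const Entry&)>;

// Keep the queue ordered by not_before (earliest first).
inline void order_queue(std::vector<Entry>& q) {
  std::stable_sort(q.begin(), q.end(), [](const Entry& a, const Entry& b) {
    return a.not_before < b.not_before;
  });
}

// Strategy 1: loop over all eligible entries and dequeue the first one the
// environment allows. Skipped entries keep their not_before values.
std::optional<Entry> pop_first_allowed(std::vector<Entry>& q, int now,
                                       const EnvCheck& env_allows) {
  order_queue(q);
  for (auto it = q.begin(); it != q.end(); ++it) {
    if (it->not_before > now) break;  // the remaining entries are not ripe yet
    if (env_allows(*it)) {
      Entry picked = *it;
      q.erase(it);
      return picked;
    }
  }
  return std::nullopt;
}

// Strategy 2: always pop the top; if it fails the environment check, delay it
// and push it back. The entry is modified even though the problem lies with
// the environment rather than with the entry itself.
std::optional<Entry> pop_top_or_delay(std::vector<Entry>& q, int now,
                                      const EnvCheck& env_allows, int delay) {
  order_queue(q);
  if (q.empty() || q.front().not_before > now) return std::nullopt;
  Entry top = q.front();
  q.erase(q.begin());
  if (env_allows(top)) return top;
  top.not_before = now + delay;
  q.push_back(top);
  return std::nullopt;
}
```

With a queue holding a ripe low-priority entry ahead of a ripe high-priority
one, and an environment that admits high-priority entries only, the first
strategy returns the high-priority entry and leaves the other untouched. The
second returns nothing on that call and rewrites the not_before of the
low-priority entry.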

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon/not_before_queue: move status_t out of container_t
Ronen Friedman [Tue, 30 Jul 2024 10:54:59 +0000 (05:54 -0500)]
common/not_before_queue: move status_t out of container_t

for readability

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon/not_before_queue: some spelling fixes
Ronen Friedman [Mon, 29 Jul 2024 03:58:22 +0000 (22:58 -0500)]
common/not_before_queue: some spelling fixes

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agocommon: add not_before_queue_t
Samuel Just [Fri, 16 Dec 2022 18:30:18 +0000 (18:30 +0000)]
common: add not_before_queue_t

Signed-off-by: Samuel Just <sjust@redhat.com>
10 months agoosd/scrub: modify ScrubJob to hold two SchedTarget-s
Ronen Friedman [Fri, 12 Jul 2024 13:18:30 +0000 (08:18 -0500)]
osd/scrub: modify ScrubJob to hold two SchedTarget-s

ScrubJob will now hold two SchedTarget-s - two sets of scheduling
information (times, levels, etc.) for the next shallow and deep scrubs.

This is in preparation for the upcoming changes to the scheduling queue.
The change cannot stand on its own, as the partial implementation
creates some inconsistencies in the scheduling logic.

Specifically, here is what changes here, and how it differs from the
desired implementation:
- The OSD still maintains a queue of scrub jobs - one object only per
  PG.
  But now - each queue element holds two SchedTarget-s.
- When a scrub is initiated, the Scrubber is handed a ScrubJob object.
  Only in the next commit will it also receive the ID of the selected
  level. That causes some issues when re-determining the level of the
  initiated scrub. A failure to match the queue "intent" results in
  failures.
- the 'planned scrub' flags are still here, instead of directly
  encoding the characteristics of the next scrub in the relevant
  sched-entry.
- the 'urgency' levels do not cover the full required range of
  behaviors and priorities.
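
As a rough illustration of "one queue element per PG, holding two
SchedTarget-s", here is a minimal sketch. The type and member names (Level,
Target, Job, earliest()) and the integer timestamps are assumptions made for
this example and are not taken from the scrubber code; the tie-breaking rule
is likewise arbitrary.

```cpp
#include <string>

// Hypothetical scrub levels.
enum class Level { shallow, deep };

// Hypothetical, reduced scheduling information for one level.
struct Target {
  Level level;
  int not_before;  // earliest time at which this target may be scrubbed
};

// One job per PG, carrying a separate target for each level.
struct Job {
  std::string pgid;
  Target shallow{Level::shallow, 0};
  Target deep{Level::deep, 0};

  // Return the target that becomes eligible first; on a tie this sketch
  // arbitrarily prefers the shallow target.
  const Target& earliest() const {
    return deep.not_before < shallow.not_before ? deep : shallow;
  }
};
```

Because the queue element here is the whole job, a consumer handed only the
Job has to work out again which of the two targets was intended, which is the
kind of mismatch the second bullet above describes.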

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
10 months agonvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons 59053/head
Alexander Indenbaum [Mon, 5 Aug 2024 09:50:27 +0000 (09:50 +0000)]
nvmeof/NVMeofGwMonitorClient: use a separate mutex for beacons

Add beacon_lock to mitigate potential beacon delays caused by slow message
handling, particularly in handle_nvmeof_gw_map.
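
The idea of a dedicated beacon mutex can be sketched as follows. This is a
simplified model, not the NVMeofGwMonitorClient code: the GwClient class, its
members other than the name beacon_lock, and the callback parameter are
hypothetical.

```cpp
#include <functional>
#include <mutex>

// Hypothetical, simplified client with two independent locks.
class GwClient {
  std::mutex lock;         // guards map state; may be held for a long time
  std::mutex beacon_lock;  // guards beacon state only
  int map_epoch = 0;
  int beacons_sent = 0;

public:
  // Simulated slow message handler: holds 'lock' while 'slow_work' runs.
  void handle_map(int epoch, const std::function<void()>& slow_work) {
    std::lock_guard<std::mutex> l(lock);
    map_epoch = epoch;
    slow_work();
  }

  // The beacon path takes only beacon_lock, so it does not have to wait for
  // a handler that is still holding 'lock'.
  void send_beacon() {
    std::lock_guard<std::mutex> l(beacon_lock);
    ++beacons_sent;
  }

  int beacon_count() {
    std::lock_guard<std::mutex> l(beacon_lock);
    return beacons_sent;
  }
};
```

In this sketch a beacon can be sent even while handle_map() is in progress;
had send_beacon() taken the same mutex as the handler, it would have had to
wait for the handler to finish.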

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
10 months agoqa: drop XMLSTARLET variable, use xmlstarlet directly 59433/head
Ilya Dryomov [Sun, 25 Aug 2024 11:22:08 +0000 (13:22 +0200)]
qa: drop XMLSTARLET variable, use xmlstarlet directly

The variable was added in commit 9b6b7c35d03f ("Handle
differently-named xmlstarlet binary for *suse") but this
compatibility business is long outdated:

  Mon Oct 13 08:52:37 UTC 2014 - toms@opensuse.org

  - SPEC file changes
    - Added link from /usr/bin/xml to /usr/bin/xmlstarlet as other
      distributions do the same
    - Did the same for the manpage

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agorbd: "rbd bench" always writes the same byte 59423/head
Ilya Dryomov [Fri, 23 Aug 2024 21:00:24 +0000 (23:00 +0200)]
rbd: "rbd bench" always writes the same byte

It's expected that the buffer is filled with the same byte, but the
byte should differ from run to run:

    memset(bp.c_str(), rand() & 0xff, io_size);

This was broken in commit c7f71d14a5d3 ("rbd: migrated existing command
logic to new namespaces") which inadvertently moved the call to srand(),
leaving rand() unseeded for the above memset().
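
The effect is easy to reproduce in isolation. The helper below is a
hypothetical sketch (make_bench_buffer() and its parameters are not part of
rbd); it only shows that the fill byte is determined by the seed in effect
when rand() is called. The C standard specifies that calling rand() before
any srand() behaves as if srand(1) had been called, which is why an unseeded
run keeps producing the same byte.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical helper: seed the C PRNG, then fill a buffer with a single
// pseudo-random byte, mirroring the memset() quoted above.
std::vector<char> make_bench_buffer(unsigned seed, std::size_t io_size) {
  std::srand(seed);  // skipping this leaves rand() on its default sequence
  std::vector<char> buf(io_size);
  if (io_size > 0) {
    std::memset(buf.data(), std::rand() & 0xff, io_size);
  }
  return buf;
}
```

Two calls with the same seed yield identical buffers, so the seed has to vary
between runs (for example, a time-based value) for the written byte to vary.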

Fixes: https://tracker.ceph.com/issues/67698
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
10 months agorgw: realm create only sets default realm on --default
Casey Bodley [Fri, 23 Aug 2024 18:49:32 +0000 (14:49 -0400)]
rgw: realm create only sets default realm on --default

Signed-off-by: Casey Bodley <cbodley@redhat.com>
10 months agoqa/tasks: add a new cephadm_from_container feature to cephadm task 59421/head
John Mulligan [Fri, 9 Aug 2024 18:37:43 +0000 (14:37 -0400)]
qa/tasks: add a new cephadm_from_container feature to cephadm task

The cephadm_from_container allows one to do a single container build
and then point teuthology at that image as the "single source of truth".
I find this extremely convenient when running teuthology locally and
I keep carrying this patch around - I figure having it upstream will
simplify my workflow. Maybe someday it'll benefit others too.

To use it I set up a yaml overrides file with the following content:
```yaml
overrides:
  cephadm:
    image: "quay.io/phlogistonjohn/ceph:dev"
    cephadm_from_container: true
  verify_ceph_hash: false
verify_ceph_hash: false
```

This lets me test my custom builds fairly easily!

Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>
10 months agoMerge PR #58487 into main
Venky Shankar [Fri, 23 Aug 2024 16:32:34 +0000 (22:02 +0530)]
Merge PR #58487 into main

* refs/pull/58487/head:
qa/suites/fs/workload: drop mgrmodules stanza
qa/tasks/ceph: fix "ceph mgr module enable" command

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
10 months agoMerge pull request #58336 from Svelar/uadk
Casey Bodley [Fri, 23 Aug 2024 14:32:47 +0000 (10:32 -0400)]
Merge pull request #58336 from Svelar/uadk

Compressor: add UADK support

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
10 months agoMerge pull request #59418 from zdover23/wip-doc-2024-08-23-glossary-object-storage
Anthony D'Atri [Fri, 23 Aug 2024 14:11:16 +0000 (10:11 -0400)]
Merge pull request #59418 from zdover23/wip-doc-2024-08-23-glossary-object-storage

doc/glossary: add "object storage"