git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
5 months agomgr/cephadm: continue in nfs service purge if grace file is already deleted 61594/head
Adam King [Wed, 29 Jan 2025 20:48:53 +0000 (15:48 -0500)]
mgr/cephadm: continue in nfs service purge if grace file is already deleted

The test_nfs task we run in teuthology creates and removes a number of
nfs clusters during the task. I think it's possible based on timing for
it to end up in a situation where it tries to remove an nfs service before
the grace file has been created. In that case, cephadm doesn't know it
hasn't created the grace file and just repeatedly fails forever attempting
to remove the nonexistent file. This patch adds handling for the error
case where we get a nonzero rc but the error message implies the command
failed because the file already does not exist.

Fixes: https://tracker.ceph.com/issues/69736
Signed-off-by: Adam King <adking@redhat.com>
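
The error handling this commit describes can be sketched roughly as follows. This is a minimal illustration and not the cephadm code: the function name and the marker strings are hypothetical, and it assumes an "already deleted" failure can be recognised from the error text.

```python
def should_treat_as_removed(rc: int, stderr: str) -> bool:
    """Return True when removal of the grace file can be considered done."""
    if rc == 0:
        # The removal command succeeded outright.
        return True
    # Assumption: a nonzero rc caused by a missing file carries a recognisable
    # message. These marker strings are hypothetical, not taken from cephadm.
    missing_markers = ("no such file", "does not exist", "not found")
    lowered = stderr.lower()
    return any(marker in lowered for marker in missing_markers)
```

With a check of this kind, the purge can continue when the file is already gone instead of retrying forever, while any other nonzero rc is still reported as a failure.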
5 months agoMerge pull request #61589 from piyushagarwal1411/fix-69727-main
afreen23 [Tue, 4 Feb 2025 20:02:18 +0000 (01:32 +0530)]
Merge pull request #61589 from piyushagarwal1411/fix-69727-main

mgr/dashboard: Add 'Browse Dashboards' button in Grafana dashboards

Reviewed-by: Afreen Misbah <afreen@ibm.com>
5 months agoMerge pull request #61634 from VallariAg/wip-vallari-nvme-maxgroup-alert
Vallari Agrawal [Tue, 4 Feb 2025 14:56:49 +0000 (20:26 +0530)]
Merge pull request #61634 from VallariAg/wip-vallari-nvme-maxgroup-alert

monitoring: add NVMeoFMaxGatewayGroups alert

5 months agoMerge pull request #61357 from VallariAg/wip-nvmeof-teuthology-test-fix-ha
Vallari Agrawal [Tue, 4 Feb 2025 14:55:32 +0000 (20:25 +0530)]
Merge pull request #61357 from VallariAg/wip-nvmeof-teuthology-test-fix-ha

qa: fix nvmeof teuthology thrasher fix

5 months agoMerge pull request #61620 from anthonyeleven/remove-obsolete-sample-conf
Zac Dover [Tue, 4 Feb 2025 11:51:39 +0000 (21:51 +1000)]
Merge pull request #61620 from anthonyeleven/remove-obsolete-sample-conf

src: modernize sample.ceph.conf

Reviewed-by: Zac Dover <zac.dover@proton.me>
5 months agoqa/suites/nvmeof: use SCALING_DELAYS: '120' 61357/head
Vallari Agrawal [Tue, 4 Feb 2025 07:50:18 +0000 (13:20 +0530)]
qa/suites/nvmeof: use SCALING_DELAYS: '120'

Increase delays for qa/workunits/nvmeof/scalability_test.sh
as namespace rebalancing takes more time. After upscaling,
a gateway may initially be 'CREATED', which is a valid state during
gateway initialization, but the state should then progress
to 'AVAILABLE' within a couple of seconds.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
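
The state progression described above ('CREATED' first, then 'AVAILABLE') is the kind of condition a test usually polls for. A minimal sketch, assuming a hypothetical get_state callable that returns the gateway state as a string; it is not the logic of scalability_test.sh:

```python
import time


def wait_for_state(get_state, wanted="AVAILABLE", attempts=10, delay=1.0,
                   sleep=time.sleep):
    """Poll get_state() until it returns `wanted` or attempts run out."""
    for _ in range(attempts):
        if get_state() == wanted:
            return True
        # 'CREATED' is a valid intermediate state, so wait and poll again.
        sleep(delay)
    return False
```

Passing the sleep function as a parameter keeps the sketch easy to exercise without real delays.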
5 months agoMerge pull request #61567 from idryomov/wip-58185
Ilya Dryomov [Tue, 4 Feb 2025 09:58:24 +0000 (10:58 +0100)]
Merge pull request #61567 from idryomov/wip-58185

librbd: stop filtering async request error codes

Reviewed-by: Ramana Raja <rraja@redhat.com>
5 months agomgr/dashboard: Add 'Browse Dashboards' button in Grafana dashboards 61589/head
Piyush Agarwal [Thu, 30 Jan 2025 09:12:37 +0000 (14:42 +0530)]
mgr/dashboard: Add 'Browse Dashboards' button in Grafana dashboards

Fixes: https://tracker.ceph.com/issues/69727
Signed-off-by: Piyush Agarwal <piyushagarwal14.pa@gmail.com>
5 months agoMerge pull request #60686 from zhsgao/mds_bal_overload_epochs
Venky Shankar [Tue, 4 Feb 2025 07:51:37 +0000 (13:21 +0530)]
Merge pull request #60686 from zhsgao/mds_bal_overload_epochs

mds: fix option mds_bal_overload_epochs

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 months agoMerge pull request #61632 from gbregman/main
Gil Bregman [Mon, 3 Feb 2025 23:52:08 +0000 (01:52 +0200)]
Merge pull request #61632 from gbregman/main

mgr/cephadm/nvmeof: Add max_hosts field to NVMeOF configuration

5 months agomgr/cephadm/nvmeof: Add max_hosts field to NVMeOF configuration and update default... 61632/head
Gil Bregman [Mon, 3 Feb 2025 21:13:49 +0000 (23:13 +0200)]
mgr/cephadm/nvmeof: Add max_hosts field to NVMeOF configuration and update default values
Fixes https://tracker.ceph.com/issues/69759

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
5 months agoMerge pull request #61627 from petrutlucian94/zlib-fix
Ilya Dryomov [Mon, 3 Feb 2025 19:47:04 +0000 (20:47 +0100)]
Merge pull request #61627 from petrutlucian94/zlib-fix

win32_deps_build.sh: pin zlib tag

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
5 months agoMerge pull request #61374 from myoungwon/fix-68518
Radoslaw Zarzynski [Mon, 3 Feb 2025 19:07:49 +0000 (20:07 +0100)]
Merge pull request #61374 from myoungwon/fix-68518

src/test: allow ENOENT if target object of tier_flush has snapshots

Reviewed-by: Laura Flores <lflores@redhat.com>
5 months agomonitoring: add tests for NVMeoFMaxGatewayGroups 61634/head
Vallari Agrawal [Mon, 3 Feb 2025 18:27:30 +0000 (23:57 +0530)]
monitoring: add tests for NVMeoFMaxGatewayGroups

Add unit tests for alert NVMeoFMaxGatewayGroups
in monitoring/ceph-mixin/tests_alerts/test_alerts.yml

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agomonitoring: add alert NVMeoFMaxGatewayGroups
Vallari Agrawal [Mon, 3 Feb 2025 18:24:50 +0000 (23:54 +0530)]
monitoring: add alert NVMeoFMaxGatewayGroups

Add alert NVMeoFMaxGatewayGroups to prometheus_alerts.yml
and prometheus_alerts.libsonnet.

This alert indicates whether the maximum number of NVMeoF gateway
groups has been reached in a cluster.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agomonitoring: add NVMeoFMaxGatewayGroups
Vallari Agrawal [Mon, 3 Feb 2025 18:22:47 +0000 (23:52 +0530)]
monitoring: add NVMeoFMaxGatewayGroups

Add config NVMeoFMaxGatewayGroups to config.libsonnet
and set it to 4 (groups).

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoMerge pull request #61628 from kamoltat/wip-ksirivad-fix-stretch-mode-doc
Zac Dover [Mon, 3 Feb 2025 17:41:14 +0000 (03:41 +1000)]
Merge pull request #61628 from kamoltat/wip-ksirivad-fix-stretch-mode-doc

doc/rados/operations/stretch-mode: fix mistake in stretch mode

Reviewed-by: Zac Dover <zac.dover@proton.me>
5 months agodoc/rados/operations/stretch-mode: fix mistake in stretch mode 61628/head
Kamoltat Sirivadhna [Mon, 3 Feb 2025 17:18:44 +0000 (17:18 +0000)]
doc/rados/operations/stretch-mode: fix mistake in stretch mode

Degraded stretch mode should only halve the "min_size", not
"size".

Fixes: No tracker (doc changes)
Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
5 months agoMerge pull request #61232 from xxhdx1985126/wip-67888-followup
Yuri Weinstein [Mon, 3 Feb 2025 15:59:32 +0000 (07:59 -0800)]
Merge pull request #61232 from xxhdx1985126/wip-67888-followup

osd/PeeringState: rename "cancel_backfill" to "suspend_backfill"

Reviewed-by: Samuel Just <sjust@redhat.com>
5 months agoMerge pull request #61397 from amathuria/wip-amat-test-osdmap-pruning
SrinivasaBharathKanta [Mon, 3 Feb 2025 15:43:28 +0000 (21:13 +0530)]
Merge pull request #61397 from amathuria/wip-amat-test-osdmap-pruning

mon/test_mon_osdmap_prune: Use first_pinned instead of first_committed

5 months agoMerge pull request #61365 from Matan-B/wip-matanb-snapmapper-logs
SrinivasaBharathKanta [Mon, 3 Feb 2025 15:43:09 +0000 (21:13 +0530)]
Merge pull request #61365 from Matan-B/wip-matanb-snapmapper-logs

osd/SnapMapper: Improve logging

5 months agoMerge pull request #61328 from adamemerson/wip-64191
SrinivasaBharathKanta [Mon, 3 Feb 2025 15:42:43 +0000 (21:12 +0530)]
Merge pull request #61328 from adamemerson/wip-64191

test/neorados: Silence mismatched new/delete warning

5 months agoMerge pull request #60945 from NitzanMordhai/wip-nitzan-crushwrapper-corpus-squid
SrinivasaBharathKanta [Mon, 3 Feb 2025 15:42:19 +0000 (21:12 +0530)]
Merge pull request #60945 from NitzanMordhai/wip-nitzan-crushwrapper-corpus-squid

dencoder tests fix type backwards incompatible checks

5 months agowin32_deps_build.sh: pin zlib tag 61627/head
Lucian Petrut [Mon, 3 Feb 2025 14:53:05 +0000 (14:53 +0000)]
win32_deps_build.sh: pin zlib tag

The zlib Windows build started to fail, probably because of this:
https://github.com/madler/zlib/issues/1038

  Cloning into 'zlib'...
  make: *** No rule to make target 'zconf.h', needed by 'adler32.o'.

We'll pin the zlib version for now to unblock the Windows build.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
5 months agoqa/suites/nvmeof: Remove watchdog from thrasher
Vallari Agrawal [Thu, 30 Jan 2025 12:13:48 +0000 (17:43 +0530)]
qa/suites/nvmeof: Remove watchdog from thrasher

This commit does the following:
1. remove watchdog from thrasher
2. remove wait from fio_test
3. change thrasher switcher wait-time to 10 mins

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agosrc: modernize sample.ceph.conf 61620/head
Anthony D'Atri [Sun, 2 Feb 2025 21:38:14 +0000 (16:38 -0500)]
src: modernize sample.ceph.conf

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
5 months agoMerge pull request #61577 from ronen-fr/wip-rf-just-me
Ronen Friedman [Sun, 2 Feb 2025 14:22:07 +0000 (16:22 +0200)]
Merge pull request #61577 from ronen-fr/wip-rf-just-me

osd/scrub: remove unnecessary loop

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
5 months agoMerge pull request #61538 from leonidc/fix-duplicated-optimized
leonidc [Sun, 2 Feb 2025 14:05:02 +0000 (16:05 +0200)]
Merge pull request #61538 from leonidc/fix-duplicated-optimized

nvmeofgw* : fix duplicated optimized host's paths

5 months agoMerge pull request #61590 from ronen-fr/wip-rf-noinfo-repair
Ronen Friedman [Sun, 2 Feb 2025 14:02:21 +0000 (16:02 +0200)]
Merge pull request #61590 from ronen-fr/wip-rf-noinfo-repair

osd/scrub: discard repair_oinfo_oid()

Reviewed-by: Samuel Just <sjust@redhat.com>
5 months agoMerge pull request #61394 from ronen-fr/wip-rf-cacher-v2
Ronen Friedman [Sun, 2 Feb 2025 13:55:09 +0000 (15:55 +0200)]
Merge pull request #61394 from ronen-fr/wip-rf-cacher-v2

common: modify md_config_obs_impl API

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 months agoMerge pull request #60426 from ronen-fr/wip-rf-svwperf
Ronen Friedman [Sun, 2 Feb 2025 13:49:49 +0000 (15:49 +0200)]
Merge pull request #60426 from ronen-fr/wip-rf-svwperf

common/perf_counters: enabling 'find()' by logger name

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
5 months agoMerge pull request #61613 from zdover23/wip-doc-2025-02-02-architecture 61617/head
Zac Dover [Sat, 1 Feb 2025 21:38:32 +0000 (07:38 +1000)]
Merge pull request #61613 from zdover23/wip-doc-2025-02-02-architecture

doc/architecture: remove sentence

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
5 months agodoc/architecture: remove sentence 61613/head
Zac Dover [Sat, 1 Feb 2025 21:15:32 +0000 (07:15 +1000)]
doc/architecture: remove sentence

Remove a sentence that is more marketing than reference.

Signed-off-by: Zac Dover <zac.dover@proton.me>
5 months agoMerge pull request #61561 from athanatos/sjust/wip-crimson-recovery-69412
Samuel Just [Fri, 31 Jan 2025 18:44:49 +0000 (10:44 -0800)]
Merge pull request #61561 from athanatos/sjust/wip-crimson-recovery-69412

crimson: take obc lock during push commit on primary

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
5 months agoMerge pull request #61001 from MaxKellermann/common_includes
Ilya Dryomov [Fri, 31 Jan 2025 10:50:57 +0000 (11:50 +0100)]
Merge pull request #61001 from MaxKellermann/common_includes

common: add missing includes

Reviewed-by: Adam Emerson <aemerson@redhat.com>
5 months agoMerge pull request #61598 from idryomov/wip-rbd-migration-https-doc
Ilya Dryomov [Thu, 30 Jan 2025 23:01:10 +0000 (00:01 +0100)]
Merge pull request #61598 from idryomov/wip-rbd-migration-https-doc

doc/rbd: use https links in live import examples

Reviewed-by: Zac Dover <zac.dover@proton.me>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
5 months agocrimson/.../replicated_recovery_backend: take excl lock while pushes commit 61561/head
Samuel Just [Wed, 22 Jan 2025 02:41:48 +0000 (18:41 -0800)]
crimson/.../replicated_recovery_backend: take excl lock while pushes commit

Fixes: https://tracker.ceph.com/issues/69412
Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../replicated_recovery_backend: route pushes earlier
Samuel Just [Wed, 22 Jan 2025 02:47:09 +0000 (18:47 -0800)]
crimson/.../replicated_recovery_backend: route pushes earlier

Let ReplicatedRecoveryBackend::handle_recovery_op route pushes
between handle_push and handle_pull_response instead of
ReplicatedRecoveryBackend::handle_push.

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agodoc/rbd: use https links in live import examples 61598/head
Ilya Dryomov [Thu, 30 Jan 2025 19:30:18 +0000 (20:30 +0100)]
doc/rbd: use https links in live import examples

Even though it's explicitly said that "http" stream can be used to
import via both HTTP and HTTPS, it can still be confusing that "type":
"http" is expected to go with "url": "https://...".  Switch example
URLs from HTTP to HTTPS to make it more obvious.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
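
The pairing this commit describes ("type": "http" together with an https URL) can be illustrated with a small sketch that builds such a JSON fragment. The outer "raw" format wrapper and the URL are assumptions made for illustration; only the "type" and "url" keys come from the commit message.

```python
import json


def build_source_spec(url: str) -> str:
    """Build a JSON source spec whose stream type is "http"."""
    spec = {
        "type": "raw",  # assumption: a raw-format image
        "stream": {
            # "http" names the stream type; it is used for both HTTP and HTTPS.
            "type": "http",
            "url": url,
        },
    }
    return json.dumps(spec)


# Hypothetical URL, used only to show the https pairing.
spec = build_source_spec("https://example.com/image.raw")
```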
5 months agoMerge pull request #57551 from linuxbox2/wip-rgwlc-66111
Matt Benjamin [Thu, 30 Jan 2025 17:15:36 +0000 (12:15 -0500)]
Merge pull request #57551 from linuxbox2/wip-rgwlc-66111

rgwlc: send pool transition notifications too

5 months agoMerge pull request #60250 from aainscow/interval_set_enhancements
Alex Ainscow [Thu, 30 Jan 2025 17:06:00 +0000 (17:06 +0000)]
Merge pull request #60250 from aainscow/interval_set_enhancements

include: interval_set: Relax requirements and enhance performance of interval sets

5 months agoMerge pull request #61135 from rkachach/fix_issue_cephadm_services_registry
Adam King [Thu, 30 Jan 2025 16:43:58 +0000 (11:43 -0500)]
Merge pull request #61135 from rkachach/fix_issue_cephadm_services_registry

mgr/cephadm: using service registry pattern for cephadm services

Reviewed-by: Adam King <adking@redhat.com>
5 months agoMerge pull request #59480 from bill-scales/ec_partial_read
Bill Scales [Thu, 30 Jan 2025 16:17:25 +0000 (16:17 +0000)]
Merge pull request #59480 from bill-scales/ec_partial_read

Further EC partial stripe read fixes

5 months agoMerge pull request #61591 from gbregman/main
Gil Bregman [Thu, 30 Jan 2025 15:59:11 +0000 (17:59 +0200)]
Merge pull request #61591 from gbregman/main

mgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration

5 months agoMerge PR #61537 into main
Venky Shankar [Thu, 30 Jan 2025 12:44:05 +0000 (18:14 +0530)]
Merge PR #61537 into main

* refs/pull/61537/head:
libcephfs_proxy: implement ceph_readdir_r()

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
5 months agoqa/workunits/rbd: add test_import_nbd_stream_disconnected() 61567/head
Ilya Dryomov [Tue, 28 Jan 2025 08:33:37 +0000 (09:33 +0100)]
qa/workunits/rbd: add test_import_nbd_stream_disconnected()

When the NBD server is killed, nbd_pread() can set errno to at least
ENOTCONN, EINVAL and 0 which is supposed to stand for "no additional
errno information is available for this error".  Add a test to ensure
that "rbd migration execute" command always fails and that the image
isn't transitioned to MIGRATION_STATE_EXECUTED in this scenario.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
5 months agolibrbd: stop filtering async request error codes
Ilya Dryomov [Wed, 29 Jan 2025 11:56:34 +0000 (12:56 +0100)]
librbd: stop filtering async request error codes

The roots of this go back to 2015 when snap create was changed to
filter EEXIST in commit 63f6c9bac9a4 ("librbd: fixed snap create race
conditions") and flatten respectively EINVAL in commit ef7e210c3f74
("librbd: better handling for duplicate flatten requests").  From there
this pattern made it to most other operations that can be proxied
including "rbd migration execute".

The motivation was to suppress generation of an "expected" error in
response to a duplicate async request notification for the operation.
However, doing this at the top of the handler (right before returning
to the caller) and for an error as generic as EINVAL is super fragile.
It's trivial for an error that is being filtered to sneak in with
a lower level change completely unnoticed.  For example, live migration
recently added NBD stream which is implemented on top of libnbd and it
turns out that some libnbd APIs return EINVAL on various occasions when
the NBD endpoint disappears and an error like ENOTCONN would make more
sense.  If this occurs during "rbd migration execute" operation, the
rest of librbd never learns that migration was disrupted and the image
is transitioned to MIGRATION_STATE_EXECUTED, thus handing a partially
imported (read: corrupted) image to the user.

Luckily, with commits 07fbc4b71df4 ("librbd: track complete async
operation requests") and 96bc20445afb ("librbd: track complete async
operation return code"), the scenario which originally prompted error
code filtering isn't an issue anymore.  Despite a few shortcomings
(e.g. when an async request notification is acked with result 0, it's
impossible to tell whether a) a new operation was kicked off, b) there
is an operation that is still in progress or c) it's for an operation
that completed earlier but hasn't "expired" yet), even just commit
07fbc4b71df4 by itself prevents a duplicate notification from kicking
off a second operation that could generate an error for something that
actually succeeded.  With that in mind, eradicate error code filtering
from Operations class.

Fixes: https://tracker.ceph.com/issues/58185
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
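
The fragility described in this commit can be shown with a toy model. This is a sketch in Python rather than the librbd C++ code, and all three function names are hypothetical stand-ins: a lower layer reports EINVAL for a real failure, and a handler that filters EINVAL as an "expected" duplicate-request error hides it.

```python
import errno


def lower_level_import(endpoint_alive: bool) -> int:
    """Hypothetical lower layer: returns 0 or a negative errno value."""
    # Per the commit, some libnbd APIs return EINVAL when the endpoint is gone.
    return 0 if endpoint_alive else -errno.EINVAL


def handler_with_filter(endpoint_alive: bool) -> int:
    r = lower_level_import(endpoint_alive)
    # Filtering at the top of the handler: EINVAL is assumed to mean
    # "duplicate request" and is swallowed, whatever its real origin.
    if r == -errno.EINVAL:
        return 0
    return r


def handler_without_filter(endpoint_alive: bool) -> int:
    # No filtering: the caller sees the real result.
    return lower_level_import(endpoint_alive)
```

In this model handler_with_filter(False) reports success even though the import was disrupted, which mirrors the partially imported image scenario in the commit message.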
5 months agoqa/tasks/nvmeof.py: Add teardown() method
Vallari Agrawal [Wed, 29 Jan 2025 15:34:04 +0000 (21:04 +0530)]
qa/tasks/nvmeof.py: Add teardown() method

Add teardown method to remove the nvmeof service
before the rest of the cluster tears down.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa/tasks/nvmeof.py: Ignore systemctl_stop thrashing method
Vallari Agrawal [Tue, 28 Jan 2025 12:43:17 +0000 (18:13 +0530)]
qa/tasks/nvmeof.py: Ignore systemctl_stop thrashing method

Do not use systemctl_stop method to thrash daemons,
just use 'ceph orch daemon stop' and 'ceph orch daemon rm'
methods to thrash nvmeof gateways.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa/tasks/nvmeof.py: Fix do_checks() method
Vallari Agrawal [Tue, 28 Jan 2025 09:18:15 +0000 (14:48 +0530)]
qa/tasks/nvmeof.py: Fix do_checks() method

All checks currently run on the initiator node. Now
run all "ceph" commands on one of the gateway hosts
instead of the initiator node, and run the "nvme list"
and "nvme list-subsys" checks on the initiator node.

Add retry (5 times) to do_checks if any command fails.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
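
The retry added to do_checks could look roughly like the sketch below. The helper name is hypothetical, and it assumes that "5 times" means five attempts in total and that a failing command raises an exception; the actual task code may differ on both points.

```python
import time


def run_with_retries(check, retries=5, delay=0.0, sleep=time.sleep):
    """Call check() until it succeeds, making at most `retries` attempts."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return check()
        except Exception as exc:  # broad on purpose: any failed command retries
            last_error = exc
            if attempt < retries:
                sleep(delay)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```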
5 months agoqa/tasks/nvmeof.py: make separate calls in do_checks()
Vallari Agrawal [Mon, 20 Jan 2025 11:43:44 +0000 (17:13 +0530)]
qa/tasks/nvmeof.py: make separate calls in do_checks()

When running 'nvme list-subsys <device>' command
in do_checks(), instead of combining command for
all devices with '&&', make separate calls.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
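
The difference between the two approaches can be sketched as follows. The helper names are hypothetical and the sketch only builds the commands; it does not run them and is not the code in qa/tasks/nvmeof.py.

```python
def combined_command(devices):
    """Old style: one shell string chaining every device with '&&'."""
    return " && ".join(f"nvme list-subsys {dev}" for dev in devices)


def separate_commands(devices):
    """New style: one argument list per device, each run as its own call."""
    return [["nvme", "list-subsys", dev] for dev in devices]
```

With separate calls, a failure on one device does not prevent the remaining devices from being checked, and each result can be reported per device.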
5 months agoqa/tasks/nvmeof.py: Do not use 'systemctl start' in thrasher
Vallari Agrawal [Tue, 14 Jan 2025 03:52:31 +0000 (09:22 +0530)]
qa/tasks/nvmeof.py: Do not use 'systemctl start' in thrasher

Instead use 'daemon start' in revive_daemon() to bring
up gateways thrashed with 'systemctl stop'.
This is because the 'systemctl start' method seems to cause temporary
issues.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa/workunits/nvmeof/fio_test.sh: fix fio filenames
Vallari Agrawal [Tue, 14 Jan 2025 03:49:03 +0000 (09:19 +0530)]
qa/workunits/nvmeof/fio_test.sh: fix fio filenames

Filenames were provided to fio as nvme1n1:nvme1n2, but they
should be full paths (/dev/nvme1n1:/dev/nvme1n2).

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
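
The filename fix amounts to prefixing each namespace with its device directory before joining with ':'. A minimal sketch in Python (the actual script is a shell script, and the helper name here is hypothetical):

```python
def fio_filename_arg(namespaces, dev_dir="/dev"):
    """Join namespace names into a colon-separated list of device paths."""
    return ":".join(f"{dev_dir}/{ns}" for ns in namespaces)
```

For the two namespaces named in the commit message, this yields /dev/nvme1n1:/dev/nvme1n2.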
5 months agoqa/tasks/nvmeof.py: Add stop_and_join method to thrasher
Vallari Agrawal [Mon, 13 Jan 2025 19:10:35 +0000 (00:40 +0530)]
qa/tasks/nvmeof.py: Add stop_and_join method to thrasher

Also add nvme-gw show command output in do_checks()
and revive daemons with 'ceph orch daemon start' in
revive_daemon() method.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa: Add qa/suites/nvmeof/thrash/gateway-initiator-setup/2-subsys-8-namespace.yaml
Vallari Agrawal [Mon, 13 Jan 2025 09:39:39 +0000 (15:09 +0530)]
qa: Add qa/suites/nvmeof/thrash/gateway-initiator-setup/2-subsys-8-namespace.yaml

This allows running the nvmeof thrasher test on smaller
configurations, which finish faster than the 120subsys-8ns
config.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa/tasks/nvmeof: Add --refresh flag in do_checks() cmds
Vallari Agrawal [Mon, 13 Jan 2025 09:33:27 +0000 (15:03 +0530)]
qa/tasks/nvmeof: Add --refresh flag in do_checks() cmds

This ensures that the latest state of the services is displayed.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agomgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration and remove... 61591/head
Gil Bregman [Thu, 30 Jan 2025 11:33:51 +0000 (13:33 +0200)]
mgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration and remove obsolete enable_key_encryption
Fixes https://tracker.ceph.com/issues/69731

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
5 months agoMerge pull request #61540 from idryomov/wip-69679
Ilya Dryomov [Thu, 30 Jan 2025 10:23:16 +0000 (11:23 +0100)]
Merge pull request #61540 from idryomov/wip-69679

mon/OSDMonitor: relax cap enforcement for unmanaged snapshots

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
5 months agoosd/scrub: discard repair_oinfo_oid() 61590/head
Ronen Friedman [Thu, 30 Jan 2025 09:27:58 +0000 (03:27 -0600)]
osd/scrub: discard repair_oinfo_oid()

repair_oinfo_oid(), called every scrub, has a very specific
functionality: fix the object ID specified in the Object Info
attribute, if different from the ID of the owning object.

This fix was added in 2017, as a response to a unique failure
scenario that was observed in Sepia - probably following a
filesystem bug. See https://tracker.ceph.com/issues/18409 &
https://tracker.ceph.com/issues/20471.

The limited functionality of repair_oinfo_oid() -
only repairing this one specific issue, and only if the OI_ATTR
exists and is decodable - does not justify the overhead of
running it every scrub.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
5 months agoosd/scrub: remove unnecessary loop 61577/head
Ronen Friedman [Wed, 29 Jan 2025 19:09:36 +0000 (13:09 -0600)]
osd/scrub: remove unnecessary loop

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
5 months agoMerge pull request #61471 from idryomov/wip-65889-more
Ilya Dryomov [Thu, 30 Jan 2025 08:38:29 +0000 (09:38 +0100)]
Merge pull request #61471 from idryomov/wip-65889-more

cls/rbd: don't use read API for write-like methods

Reviewed-by: Ramana Raja <rraja@redhat.com>
5 months agocommon: ceph_context: make use of get_tracked_keys() 61394/head
Ronen Friedman [Wed, 15 Jan 2025 07:49:46 +0000 (01:49 -0600)]
common: ceph_context: make use of get_tracked_keys()

modify some configuration object registrations
in common/ceph_context to use the updated
md_config_obs_t::get_tracked_keys() API

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
5 months agoMerge pull request #61545 from Hezko/nvmeof-cli-add-cmnds 60583/head
Hezko [Thu, 30 Jan 2025 07:07:27 +0000 (09:07 +0200)]
Merge pull request #61545 from Hezko/nvmeof-cli-add-cmnds

mgr/dashboard: Add additional API and CLI endpoints

5 months agoMerge pull request #61465 from ArbitCode/wip-raja-fix-multipart-upload-cant-get-obj-tag
Raja [Thu, 30 Jan 2025 06:29:44 +0000 (11:59 +0530)]
Merge pull request #61465 from ArbitCode/wip-raja-fix-multipart-upload-cant-get-obj-tag

RGW:fix obj by multipart upload cant get tag

5 months agomds: fix option mds_bal_overload_epochs 60686/head
Zhansong Gao [Mon, 11 Nov 2024 05:26:03 +0000 (13:26 +0800)]
mds: fix option mds_bal_overload_epochs

When option mds_bal_overload_epochs was added, two positions
should have been modified, but one of them was overlooked.

Fixes: https://tracker.ceph.com/issues/68953
Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
5 months agoMerge pull request #61510 from dmick/container-no-repo-creds
Dan Mick [Wed, 29 Jan 2025 19:18:24 +0000 (11:18 -0800)]
Merge pull request #61510 from dmick/container-no-repo-creds

container/build.sh: don't require repo creds on NO_PUSH

5 months agoMerge pull request #61566 from zdover23/wip-doc-2025-01-30-cephadm-services-osd
Zac Dover [Wed, 29 Jan 2025 17:29:34 +0000 (03:29 +1000)]
Merge pull request #61566 from zdover23/wip-doc-2025-01-30-cephadm-services-osd

doc/cephadm: simplify confusing math proposition

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
5 months agoMerge pull request #61530 from aza547/ssl_cert
Adam King [Wed, 29 Jan 2025 16:28:18 +0000 (11:28 -0500)]
Merge pull request #61530 from aza547/ssl_cert

cephadm: rgw: allow specifying the ssl_certificate by filepath

Reviewed-by: Adam King <adking@redhat.com>
5 months agoMerge pull request #61511 from phlogistonjohn/jjm-ctr-label-ceph
Adam King [Wed, 29 Jan 2025 16:25:09 +0000 (11:25 -0500)]
Merge pull request #61511 from phlogistonjohn/jjm-ctr-label-ceph

container: add label ceph=True back

Reviewed-by: Adam King <adking@redhat.com>
5 months agoMerge pull request #61565 from AliMasarweh/wip-alimasa-bn-policy-with-tenant
Ali Masarwa [Wed, 29 Jan 2025 15:48:12 +0000 (17:48 +0200)]
Merge pull request #61565 from AliMasarweh/wip-alimasa-bn-policy-with-tenant

RGW | bucket notifications: support cross tenant operations

Reviewed-by: yuvalif<ylifshit@redhat.com>
5 months agoMerge pull request #61559 from linuxbox2/wip-new-zppbits
Matt Benjamin [Wed, 29 Jan 2025 14:50:35 +0000 (09:50 -0500)]
Merge pull request #61559 from linuxbox2/wip-new-zppbits

rgw: update to latest zpp_bits.h to compile w/gcc-14 & clang 19

5 months agoMerge pull request #60246 from jamiepryde/SIMD-align-64
SrinivasaBharathKanta [Wed, 29 Jan 2025 14:16:23 +0000 (19:46 +0530)]
Merge pull request #60246 from jamiepryde/SIMD-align-64

erasure-code: Increase SIMD_ALIGN from 32 to 64

5 months agoMerge pull request #59679 from ceph/add-new-ec-plugins-for-qa
SrinivasaBharathKanta [Wed, 29 Jan 2025 14:15:32 +0000 (19:45 +0530)]
Merge pull request #59679 from ceph/add-new-ec-plugins-for-qa

qa/erasure-code: add new teuthology isa configs

5 months agodoc/cephadm: simplify confusing math proposition 61566/head
Zac Dover [Wed, 29 Jan 2025 14:05:59 +0000 (00:05 +1000)]
doc/cephadm: simplify confusing math proposition

s/This means that the exact device size is 3.64 * 1000, or 3640GB"/This
means that the exact device size is 3.64TB, or 3640 GB"/

In the original text, the number "3.64" appears to refer to a quantity
(and indeed, it is a quantity of Terabytes), but it is unlabeled. Also,
on repeated recent readings of this sentence I found it more puzzling
than enlightening. So I made this commit.

Signed-off-by: Zac Dover <zac.dover@proton.me>
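
The arithmetic behind the reworded sentence is a decimal (SI) unit conversion, where 1 TB equals 1000 GB. A small sketch, using Decimal to avoid binary floating-point rounding; the helper name is hypothetical:

```python
from decimal import Decimal


def tb_to_gb(terabytes: str) -> Decimal:
    """Convert decimal terabytes to decimal gigabytes (1 TB = 1000 GB)."""
    return Decimal(terabytes) * 1000
```

Calling tb_to_gb("3.64") gives a value equal to 3640, which matches the figure in the documentation text.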
5 months agoRGW | bucket notifications: support cross tenant operations 61565/head
Ali Masarwa [Wed, 29 Jan 2025 12:09:22 +0000 (14:09 +0200)]
RGW | bucket notifications: support cross tenant operations

Signed-off-by: Ali Masarwa <amasarwa@redhat.com>
5 months agoMerge pull request #61123 from rhcs-dashboard/raise-smb-msg-exception
Pedro Gonzalez Gomez [Wed, 29 Jan 2025 12:20:51 +0000 (13:20 +0100)]
Merge pull request #61123 from rhcs-dashboard/raise-smb-msg-exception

mgr/dashboard: smb raise exception for unsuccessful resource update

Reviewed-by: Afreen Misbah <afreen@ibm.com>
5 months agolibcephfs_proxy: implement ceph_readdir_r() 61537/head
Xavi Hernandez [Mon, 27 Jan 2025 11:07:58 +0000 (12:07 +0100)]
libcephfs_proxy: implement ceph_readdir_r()

Signed-off-by: Xavi Hernandez <xhernandez@gmail.com>
5 months agoMerge pull request #61317 from rhcs-dashboard/add-smb-service-msg
afreen23 [Wed, 29 Jan 2025 09:28:37 +0000 (14:58 +0530)]
Merge pull request #61317 from rhcs-dashboard/add-smb-service-msg

mgr/dashboard: add warning message on smb service management

Reviewed-by: Afreen Misbah <afreen@ibm.com>
5 months agoMerge pull request #61033 from rhcs-dashboard/rgw-user-accounts-ui
Nizamudeen A [Wed, 29 Jan 2025 05:14:47 +0000 (10:44 +0530)]
Merge pull request #61033 from rhcs-dashboard/rgw-user-accounts-ui

mgr/dashboard: RGW user accounts UI

Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
5 months agoMerge pull request #60819 from rhcs-dashboard/smb-cluster-form
afreen23 [Wed, 29 Jan 2025 05:09:22 +0000 (10:39 +0530)]
Merge pull request #60819 from rhcs-dashboard/smb-cluster-form

mgr/dashboard: create smb cluster

Reviewed-by: Afreen Misbah <afreen@ibm.com>
5 months agocrimson/.../replicate_recovery_backend: remove unnecessary check
Samuel Just [Wed, 22 Jan 2025 02:46:23 +0000 (18:46 -0800)]
crimson/.../replicate_recovery_backend: remove unnecessary check

Already checked in handle_recovery_op.

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../object_context_loader: add lock_excl_sync method
Samuel Just [Wed, 22 Jan 2025 03:26:35 +0000 (19:26 -0800)]
crimson/.../object_context_loader: add lock_excl_sync method

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../object_context_loader: add obc get_obc_manager variant
Samuel Just [Wed, 22 Jan 2025 02:41:17 +0000 (18:41 -0800)]
crimson/.../object_context_loader: add obc get_obc_manager variant

Avoids extra lookup.

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../replicated_recovery_backend: move do_transaction to _handle_pull_response
Samuel Just [Wed, 22 Jan 2025 02:13:42 +0000 (02:13 +0000)]
crimson/.../replicated_recovery_backend: move do_transaction to _handle_pull_response

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../replicated_recovery_backend: only call _committed_pushed_object if complete
Samuel Just [Wed, 15 Jan 2025 22:34:31 +0000 (22:34 +0000)]
crimson/.../replicated_recovery_backend: only call _committed_pushed_object if complete

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../replicated_recovery_backend: convert handle_pull_response to coroutine
Samuel Just [Wed, 15 Jan 2025 22:34:09 +0000 (22:34 +0000)]
crimson/.../replicated_recovery_backend: convert handle_pull_response to coroutine

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../replicated_recovery_backend: convert _handle_pull_response to coroutine
Samuel Just [Wed, 15 Jan 2025 22:15:36 +0000 (22:15 +0000)]
crimson/.../replicated_recovery_backend: convert _handle_pull_response to coroutine

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../replicated_recovery_backend: convert submit_push_data to coroutine
Samuel Just [Fri, 10 Jan 2025 22:50:14 +0000 (22:50 +0000)]
crimson/.../replicated_recovery_backend: convert submit_push_data to coroutine

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../replicated_recovery_backend: convert prep_push_target to coroutine
Samuel Just [Fri, 10 Jan 2025 22:46:43 +0000 (22:46 +0000)]
crimson/.../replicated_recovery_backend: convert prep_push_target to coroutine

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../recovery_backend: convert to logging macros, some formatting changes
Samuel Just [Wed, 8 Jan 2025 00:51:18 +0000 (00:51 +0000)]
crimson/.../recovery_backend: convert to logging macros, some formatting changes

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../replicated_recovery_backend: convert to logging macros, some formatting...
Samuel Just [Tue, 7 Jan 2025 20:55:48 +0000 (12:55 -0800)]
crimson/.../replicated_recovery_backend: convert to logging macros, some formatting changes

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agorgw: update to latest zpp_bits.h to compile w/gcc-14 & clang 19 61559/head
Matt Benjamin [Tue, 28 Jan 2025 20:07:06 +0000 (15:07 -0500)]
rgw: update to latest zpp_bits.h to compile w/gcc-14 & clang 19

Fixes: https://tracker.ceph.com/issues/69696
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
5 months agonvmeofgw*: 2 fixes - for duplicated optimized paths and fix for GW startup 61538/head
Leonid Chernin [Tue, 21 Jan 2025 13:05:09 +0000 (13:05 +0000)]
nvmeofgw*: 2 fixes - for duplicated optimized paths and fix for GW startup
 1. fix duplicated optimized host's paths - trigger process_gw_down upon
    fast-gw reboot, removed old fast-reboot handlers
 2. fix GW startup - trigger process_gw_down when the WAIT_BLOCKLIST timer expires

Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
5 months agoMerge pull request #61549 from zdover23/2025-01-28-radosgw-multisite
Zac Dover [Tue, 28 Jan 2025 18:10:00 +0000 (04:10 +1000)]
Merge pull request #61549 from zdover23/2025-01-28-radosgw-multisite

doc/radosgw: s/zonegroup/pools/

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
5 months agoMerge pull request #60735 from adk3798/cephadm-remove-daemon-service-name
Adam King [Tue, 28 Jan 2025 17:03:03 +0000 (12:03 -0500)]
Merge pull request #60735 from adk3798/cephadm-remove-daemon-service-name

mgr/cephadm: set service name for DaemonDescription object used during daemon removal

Reviewed-by: John Mulligan <jmulligan@redhat.com>
5 months agoMerge pull request #61215 from ArbitCode/wip-rgw-raja-feature-64526
Raja [Tue, 28 Jan 2025 16:30:26 +0000 (22:00 +0530)]
Merge pull request #61215 from ArbitCode/wip-rgw-raja-feature-64526

RGW:support x-amz-expected-bucket-owner to verify bucket ownership wi…

5 months agomgr/dashboard: smb raise exception for unsuccessful resource update 61123/head
Pedro Gonzalez Gomez [Tue, 17 Dec 2024 20:08:55 +0000 (21:08 +0100)]
mgr/dashboard: smb raise exception for unsuccessful resource update

Adds a decorator that raises a DashboardException carrying the error message of an unsuccessful smb resource update

Fixes: https://tracker.ceph.com/issues/69286
Signed-off-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>
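
A decorator of the kind this commit describes might look like the sketch below. Everything here is illustrative: the DashboardException class is a local stand-in for the dashboard's own type, and the decorator name, the result shape ('success' and 'msg' keys) and apply_resource are hypothetical.

```python
import functools


class DashboardException(Exception):
    """Local stand-in for the dashboard's exception type (assumption)."""

    def __init__(self, msg, component=None):
        super().__init__(msg)
        self.component = component


def raise_on_failure(func):
    """Raise DashboardException when the wrapped call reports a failure."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # Assumption: results are dicts with 'success' and 'msg' keys.
        if isinstance(result, dict) and result.get("success") is False:
            raise DashboardException(
                result.get("msg", "smb resource update failed"),
                component="smb",
            )
        return result
    return wrapper


@raise_on_failure
def apply_resource(resource):
    # Hypothetical update call that reports failure in its return value.
    if not resource.get("valid", True):
        return {"success": False, "msg": "invalid resource"}
    return {"success": True}
```

Wrapping the call this way turns a failure reported in the return value into an exception, so the caller cannot silently ignore it.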
5 months agodoc/radosgw: s/zonegroup/pools/ 61549/head
Zac Dover [Tue, 28 Jan 2025 06:30:24 +0000 (16:30 +1000)]
doc/radosgw: s/zonegroup/pools/

s/zonegroup/pools/, where this change makes the text clearer.

This change was made in response to an upstream comment on
https://pad.ceph.com/p/Report_Documentation_Bugs.

Fixes: https://tracker.ceph.com/issues/69689
Signed-off-by: Zac Dover <zac.dover@proton.me>
5 months agomgr/dashboard: Add additional cli endpoints to align with existing nvmeof cli 61545/head
Tomer Haskalovitch [Wed, 15 Jan 2025 09:49:18 +0000 (11:49 +0200)]
mgr/dashboard: Add additional cli endpoints to align with existing nvmeof cli

Added new endpoints to ceph cli and dashboard API to align with cli commands that already exist in existing nvmeof cli.

fixes: https://tracker.ceph.com/issues/62705

Signed-off-by: Tomer Haskalovitch <il033030@Tomers-MBP.lan>
5 months agoMerge pull request #61551 from leonidc/level_of_critical_mon_ev
leonidc [Tue, 28 Jan 2025 14:15:51 +0000 (16:15 +0200)]
Merge pull request #61551 from leonidc/level_of_critical_mon_ev

nvmeofgw*: change log level of critical nvmeof monitor events to 1