git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
5 months agosrc/script: add a script to help build ceph using containers 59841/head
John Mulligan [Tue, 20 Aug 2024 19:01:05 +0000 (15:01 -0400)]
src/script: add a script to help build ceph using containers

The build-with-container script tries to encapsulate nearly all major
build tasks using docker/podman containers. If there's no build image
locally it will create one for you. It provides targets for building
(make), testing (make check), building rpm packages or deb packages and
is designed to be fairly easily extended.

View the comment at the top of the source file for usage details.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
5 months agobuild: add files needed to create a build container
John Mulligan [Tue, 20 Aug 2024 19:00:57 +0000 (15:00 -0400)]
build: add files needed to create a build container

A build container contains all the tools and dependencies needed to
build ceph. It provides a Container file and small script that
helps bootstrap the container setup. This script installs a few extra
things we need before farming most of the work out to install-deps.sh.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
5 months agobuild: small script tweak to allow different build dirs
John Mulligan [Sat, 14 Sep 2024 10:31:23 +0000 (06:31 -0400)]
build: small script tweak to allow different build dirs

Move the mkdir line to allow for other build dir naming schemes outside
of what appears in the .gitignore file. A tiny bit of added flexibility
at little cost.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
5 months agosrc/script: add helper function has_build_dir
John Mulligan [Mon, 14 Nov 2022 15:57:25 +0000 (10:57 -0500)]
src/script: add helper function has_build_dir

This function returns successfully if $BUILD_DIR exists and is valid.
This is a useful building block for automation around the build and
can be used to avoid re-running commands that fail if the build dir
exists already.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
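
A rough Python analogue of the idea behind has_build_dir, for illustration only. The real helper is a shell function, and this log does not say what makes a build dir "valid", so the CMakeCache.txt marker check and the configure_once wrapper are assumptions of this sketch.

```python
import os


def has_build_dir(build_dir, marker="CMakeCache.txt"):
    """Return True if build_dir exists and looks like a configured build tree.

    The marker file is an assumption made for this sketch.
    """
    return os.path.isdir(build_dir) and os.path.isfile(
        os.path.join(build_dir, marker)
    )


def configure_once(build_dir, configure):
    """Call configure(build_dir) only when no valid build dir exists yet.

    Returns True if configure was run, False if it was skipped.
    """
    if has_build_dir(build_dir):
        return False
    configure(build_dir)
    return True
```

Automation can call configure_once() repeatedly without re-running a configure step that would fail on an existing build dir.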
5 months agoMerge pull request #61470 from aclamk/wip-aclamk-bluefs-bdev-expand-addendum
Adam Kupczyk [Wed, 5 Feb 2025 19:42:41 +0000 (20:42 +0100)]
Merge pull request #61470 from aclamk/wip-aclamk-bluefs-bdev-expand-addendum

os/bluestore: CBT bluefs-bdev-expand addendum

5 months agoMerge pull request #61019 from MaxKellermann/test_objectstore__WITH_BLUESTORE-1
Adam Kupczyk [Wed, 5 Feb 2025 19:37:13 +0000 (20:37 +0100)]
Merge pull request #61019 from MaxKellermann/test_objectstore__WITH_BLUESTORE-1

test/objectstore: extend `#ifdef WITH_BLUESTORE`

5 months agoMerge pull request #60547 from MaxKellermann/without_bluestore
Adam Kupczyk [Wed, 5 Feb 2025 19:36:50 +0000 (20:36 +0100)]
Merge pull request #60547 from MaxKellermann/without_bluestore

Fix two build failures with `WITH_BLUESTORE=no`

5 months agoMerge pull request #59633 from YiteGu/optimize-offline-trim-report-info
Adam Kupczyk [Wed, 5 Feb 2025 19:36:27 +0000 (20:36 +0100)]
Merge pull request #59633 from YiteGu/optimize-offline-trim-report-info

tools/ceph-bluestore-tool: optimize offline trim report info

5 months agoMerge pull request #61633 from chardan/wip-jfw-rgw-fix-editor-mode
Jesse Williamson [Wed, 5 Feb 2025 18:07:22 +0000 (10:07 -0800)]
Merge pull request #61633 from chardan/wip-jfw-rgw-fix-editor-mode

rgw: fixup for emacs/vim modes, moved to top of file.

5 months agoMerge pull request #61594 from adk3798/test-nfs-task-cluster-purge-fixup
Adam King [Wed, 5 Feb 2025 15:23:58 +0000 (10:23 -0500)]
Merge pull request #61594 from adk3798/test-nfs-task-cluster-purge-fixup

mgr/cephadm: continue in nfs service purge if grace file is already deleted

Reviewed-by: John Mulligan <jmulligan@redhat.com>
5 months agoMerge pull request #61593 from adk3798/cephadm-osd-extra-args-initial-deploy
Adam King [Wed, 5 Feb 2025 15:22:35 +0000 (10:22 -0500)]
Merge pull request #61593 from adk3798/cephadm-osd-extra-args-initial-deploy

mgr/cephadm: create OSD daemon deploy specs through make_daemon_spec

Reviewed-by: John Mulligan <jmulligan@redhat.com>
5 months agoMerge pull request #61579 from phlogistonjohn/jjm-cephadm-small-moves
Adam King [Wed, 5 Feb 2025 15:21:43 +0000 (10:21 -0500)]
Merge pull request #61579 from phlogistonjohn/jjm-cephadm-small-moves

cephadm: move three functions out of cephadm.py

Reviewed-by: Adam King <adking@redhat.com>
5 months agoMerge pull request #61578 from omidyoosefi/monitor-port-nfs
Adam King [Wed, 5 Feb 2025 15:20:06 +0000 (10:20 -0500)]
Merge pull request #61578 from omidyoosefi/monitor-port-nfs

pybind/mgr/cephadm: allow setting custom monitoring_port for nfs

Reviewed-by: Adam King <adking@redhat.com>
5 months agoMerge pull request #61571 from adk3798/cephadm-ganesha-server-scope
Adam King [Wed, 5 Feb 2025 15:17:56 +0000 (10:17 -0500)]
Merge pull request #61571 from adk3798/cephadm-ganesha-server-scope

mgr/cephadm: add Server_Scope = <fsid> to NFSv4 section of ganesha conf

Reviewed-by: John Mulligan <jmulligan@redhat.com>
5 months agoMerge pull request #61389 from Kushal-deb/fix-issue-69435_NVMe-of_service
Adam King [Wed, 5 Feb 2025 15:15:57 +0000 (10:15 -0500)]
Merge pull request #61389 from Kushal-deb/fix-issue-69435_NVMe-of_service

mgr/cephadm: Abort nvme deployment with pool that doesn't exist

Reviewed-by: Adam King <adking@redhat.com>
5 months agoMerge pull request #60991 from ShwetaBhosale1/fix_issue_69153_nfs_to_show_ingress_mode
Adam King [Wed, 5 Feb 2025 15:14:47 +0000 (10:14 -0500)]
Merge pull request #60991 from ShwetaBhosale1/fix_issue_69153_nfs_to_show_ingress_mode

mgr/nfs: Show ingress mode in output of 'ceph nfs cluster info' command

Reviewed-by: Adam King <adking@redhat.com>
5 months agoMerge pull request #60915 from Kushal-deb/fix-issue-2313279
Adam King [Wed, 5 Feb 2025 15:13:40 +0000 (10:13 -0500)]
Merge pull request #60915 from Kushal-deb/fix-issue-2313279

cephadm: Add pre_remove and ensure deployment values are reset and API settings are updated when removing Prometheus or Alertmanager daemons

Reviewed-by: Adam King <adking@redhat.com>
5 months agoMerge pull request #61368 from cbodley/wip-69527
J. Eric Ivancich [Wed, 5 Feb 2025 14:04:06 +0000 (09:04 -0500)]
Merge pull request #61368 from cbodley/wip-69527

rgw/s3: remove local variable 'uri' that shadows member variable

Reviewed-by: Yixin Jin <yjin77@yahoo.ca>
5 months agoMerge pull request #61037 from thotz/make-restore-attrs-humanreadable
J. Eric Ivancich [Wed, 5 Feb 2025 14:00:59 +0000 (09:00 -0500)]
Merge pull request #61037 from thotz/make-restore-attrs-humanreadable

rgw/rgw_admin.cc : Make restore attrs readable in admin cli

Reviewed-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-by: Adam Emerson <aemerson@redhat.com>
Shreyansh Sancheti <ssanchet@redhat.com>

5 months agoMerge pull request #59937 from BBoozmen/oozmen-enhancing-fetch-remote-obj-logs
J. Eric Ivancich [Wed, 5 Feb 2025 13:59:31 +0000 (08:59 -0500)]
Merge pull request #59937 from BBoozmen/oozmen-enhancing-fetch-remote-obj-logs

RGW: add src/dest object info to fetch_remote_obj()'s debug log events

Reviewed-by: Adam Emerson <aemerson@redhat.com>
5 months agoMerge pull request #61505 from cbodley/wip-69582
J. Eric Ivancich [Wed, 5 Feb 2025 13:58:51 +0000 (08:58 -0500)]
Merge pull request #61505 from cbodley/wip-69582

examples/rgw: add type to HeadBucketOutput for old boto

Reviewed-by: Yuval Lifshitz <ylifshit@ibm.com>
5 months agoMerge pull request #59913 from clwluvw/bucketreplication-uid
J. Eric Ivancich [Wed, 5 Feb 2025 13:58:10 +0000 (08:58 -0500)]
Merge pull request #59913 from clwluvw/bucketreplication-uid

rgw: use effective owner in PutBucketReplication

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 months agoMerge pull request #61366 from aclamk/wip-aclamk-bluefs-unittest-string-fill-fix
Adam Kupczyk [Wed, 5 Feb 2025 10:55:14 +0000 (11:55 +0100)]
Merge pull request #61366 from aclamk/wip-aclamk-bluefs-unittest-string-fill-fix

os/bluestore: Fix unittest_bluefs

5 months agoMerge pull request #61466 from rhcs-dashboard/storage-class-management
afreen23 [Wed, 5 Feb 2025 09:52:02 +0000 (15:22 +0530)]
Merge pull request #61466 from rhcs-dashboard/storage-class-management

mgr/dashboard: Storage Class Management

Reviewed-by: Afreen Misbah <afreen@ibm.com>
5 months agomgr/dashboard: Storage Class Management 61466/head
Dnyaneshwari [Fri, 17 Jan 2025 10:06:50 +0000 (15:36 +0530)]
mgr/dashboard: Storage Class Management

Fixes: https://tracker.ceph.com/issues/69606
Signed-off-by: Dnyaneshwari Talwekar <dtalweka@redhat.com>
5 months agoMerge pull request #61067 from MaxKellermann/librados_static
SrinivasaBharathKanta [Wed, 5 Feb 2025 03:59:22 +0000 (09:29 +0530)]
Merge pull request #61067 from MaxKellermann/librados_static

librados: disable symbol versions when building statically

5 months agoMerge pull request #61025 from MaxKellermann/config_legacy_values__static
SrinivasaBharathKanta [Wed, 5 Feb 2025 03:59:08 +0000 (09:29 +0530)]
Merge pull request #61025 from MaxKellermann/config_legacy_values__static

common/config: make `legacy_values` static

5 months agoMerge pull request #58706 from xxhdx1985126/wip-67065
Radoslaw Zarzynski [Wed, 5 Feb 2025 00:22:49 +0000 (01:22 +0100)]
Merge pull request #58706 from xxhdx1985126/wip-67065

test: fix ld link errors

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
5 months agomgr/cephadm: continue in nfs service purge if grace file is already deleted 61594/head
Adam King [Wed, 29 Jan 2025 20:48:53 +0000 (15:48 -0500)]
mgr/cephadm: continue in nfs service purge if grace file is already deleted

The test_nfs task we run in teuthology creates and removes a number of
nfs clusters during the task. I think it's possible based on timing for
it to end up in a situation where it tries to remove an nfs service before
the grace file has been created. In that case, cephadm doesn't know it
hasn't created the grace file and just repeatedly fails forever attempting
to remove the nonexistent file. This patch adds handling for the error
case where we get a nonzero rc but the error message implies the command
failed because the file already does not exist.

Fixes: https://tracker.ceph.com/issues/69736
Signed-off-by: Adam King <adking@redhat.com>
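
The error handling described above could look roughly like the following. This is a minimal sketch, not the cephadm code: the function name is hypothetical, rc and stderr stand in for the result of whatever command removes the grace file, and matching on a "no such file" substring is an assumption.

```python
def grace_file_removed(rc, stderr):
    """Decide whether a grace-file removal attempt can be treated as done.

    rc and stderr are the exit code and error output of the removal
    command. The substring match below is an assumption for this sketch.
    """
    if rc == 0:
        return True
    # Nonzero rc, but the message implies the file is already gone,
    # so the purge can continue instead of retrying forever.
    return "no such file" in stderr.lower()
```

Any other nonzero result still counts as a failure, so real errors are not hidden.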
5 months agoMerge pull request #61589 from piyushagarwal1411/fix-69727-main
afreen23 [Tue, 4 Feb 2025 20:02:18 +0000 (01:32 +0530)]
Merge pull request #61589 from piyushagarwal1411/fix-69727-main

mgr/dashboard: Add 'Browse Dashboards' button in Grafana dashboards

Reviewed-by: Afreen Misbah <afreen@ibm.com>
5 months agoMerge pull request #61634 from VallariAg/wip-vallari-nvme-maxgroup-alert
Vallari Agrawal [Tue, 4 Feb 2025 14:56:49 +0000 (20:26 +0530)]
Merge pull request #61634 from VallariAg/wip-vallari-nvme-maxgroup-alert

monitoring: add NVMeoFMaxGatewayGroups alert

5 months agoMerge pull request #61357 from VallariAg/wip-nvmeof-teuthology-test-fix-ha
Vallari Agrawal [Tue, 4 Feb 2025 14:55:32 +0000 (20:25 +0530)]
Merge pull request #61357 from VallariAg/wip-nvmeof-teuthology-test-fix-ha

qa: fix nvmeof teuthology thrasher fix

5 months agoMerge pull request #61620 from anthonyeleven/remove-obsolete-sample-conf
Zac Dover [Tue, 4 Feb 2025 11:51:39 +0000 (21:51 +1000)]
Merge pull request #61620 from anthonyeleven/remove-obsolete-sample-conf

src: modernize sample.ceph.conf

Reviewed-by: Zac Dover <zac.dover@proton.me>
5 months agoqa/suites/nvmeof: use SCALING_DELAYS: '120' 61357/head
Vallari Agrawal [Tue, 4 Feb 2025 07:50:18 +0000 (13:20 +0530)]
qa/suites/nvmeof: use SCALING_DELAYS: '120'

Increase delays for qa/workunits/nvmeof/scalability_test.sh
as namespace rebalancing takes more time. After upscaling, a
gateway could initially be 'CREATED', which is a valid state during
gateway initialization, but the state should then progress
to 'AVAILABLE' within a couple of seconds.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
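
The state progression described above can be checked with a simple polling loop. This is a hedged sketch: wait_for_available and get_state are hypothetical names, and using 120 as a timeout in seconds is an assumption, since the log does not give the unit of SCALING_DELAYS.

```python
import time


def wait_for_available(get_state, timeout=120.0, interval=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns 'AVAILABLE' or the timeout expires.

    'CREATED' (or any other state) is tolerated while waiting.
    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if get_state() == "AVAILABLE":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```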
5 months agoMerge pull request #61567 from idryomov/wip-58185
Ilya Dryomov [Tue, 4 Feb 2025 09:58:24 +0000 (10:58 +0100)]
Merge pull request #61567 from idryomov/wip-58185

librbd: stop filtering async request error codes

Reviewed-by: Ramana Raja <rraja@redhat.com>
5 months agomgr/dashboard: Add 'Browse Dashboards' button in Grafana dashboards 61589/head
Piyush Agarwal [Thu, 30 Jan 2025 09:12:37 +0000 (14:42 +0530)]
mgr/dashboard: Add 'Browse Dashboards' button in Grafana dashboards

Fixes: https://tracker.ceph.com/issues/69727
Signed-off-by: Piyush Agarwal <piyushagarwal14.pa@gmail.com>
5 months agoMerge pull request #60686 from zhsgao/mds_bal_overload_epochs
Venky Shankar [Tue, 4 Feb 2025 07:51:37 +0000 (13:21 +0530)]
Merge pull request #60686 from zhsgao/mds_bal_overload_epochs

mds: fix option mds_bal_overload_epochs

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 months agocephadm: Add pre_remove and ensure deployment values are reset and API settings are... 60915/head
Kushal Deb [Fri, 29 Nov 2024 08:38:51 +0000 (14:08 +0530)]
cephadm: Add pre_remove and ensure deployment values are reset and API settings are updated when removing Prometheus or Alertmanager daemons

This fixes an issue where the dashboard API settings are not updated
properly when the active Prometheus or Alertmanager daemon is removed.
If the active daemon is removed, the settings are reconfigured to point
to a remaining daemon or reset if no daemons are available.

This avoids dashboard errors like "404 Not Found" caused by stale API
host settings.

Signed-off-by: Kushal Deb <Kushal.Deb@ibm.com>
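
The reconfiguration rule described above can be summarised as a small function. This is only a sketch of the decision, not the cephadm implementation: next_api_host is a hypothetical name, and using an empty string as the "reset" value is an assumption.

```python
def next_api_host(current_host, removed_host, remaining_hosts):
    """Pick the API host setting after a monitoring daemon is removed.

    - If the removed daemon was not the active one, keep the setting.
    - Otherwise point at a remaining daemon if there is one.
    - Otherwise reset the setting (empty string in this sketch).
    """
    if current_host != removed_host:
        return current_host
    if remaining_hosts:
        return remaining_hosts[0]
    return ""
```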
5 months agoFixup for emacs/vim modes, moved to top of file. 61633/head
Jesse F. Williamson [Mon, 3 Feb 2025 22:36:14 +0000 (14:36 -0800)]
Fixup for emacs/vim modes, moved to top of file.

Signed-off-by: Jesse F. Williamson <jfw@ibm.com>
5 months agoMerge pull request #61632 from gbregman/main
Gil Bregman [Mon, 3 Feb 2025 23:52:08 +0000 (01:52 +0200)]
Merge pull request #61632 from gbregman/main

mgr/cephadm/nvmeof: Add max_hosts field to NVMeOF configuration

5 months agomgr/cephadm/nvmeof: Add max_hosts field to NVMeOF configuration and update default... 61632/head
Gil Bregman [Mon, 3 Feb 2025 21:13:49 +0000 (23:13 +0200)]
mgr/cephadm/nvmeof: Add max_hosts field to NVMeOF configuration and update default values
Fixes https://tracker.ceph.com/issues/69759

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
5 months agoMerge pull request #61627 from petrutlucian94/zlib-fix
Ilya Dryomov [Mon, 3 Feb 2025 19:47:04 +0000 (20:47 +0100)]
Merge pull request #61627 from petrutlucian94/zlib-fix

win32_deps_build.sh: pin zlib tag

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
5 months agoMerge pull request #61374 from myoungwon/fix-68518
Radoslaw Zarzynski [Mon, 3 Feb 2025 19:07:49 +0000 (20:07 +0100)]
Merge pull request #61374 from myoungwon/fix-68518

src/test: allow ENOENT if target object of tier_flush has snapshots

Reviewed-by: Laura Flores <lflores@redhat.com>
5 months agomonitoring: add tests for NVMeoFMaxGatewayGroups 61634/head
Vallari Agrawal [Mon, 3 Feb 2025 18:27:30 +0000 (23:57 +0530)]
monitoring: add tests for NVMeoFMaxGatewayGroups

Add unit tests for alert NVMeoFMaxGatewayGroups
in monitoring/ceph-mixin/tests_alerts/test_alerts.yml

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agomonitoring: add alert NVMeoFMaxGatewayGroups
Vallari Agrawal [Mon, 3 Feb 2025 18:24:50 +0000 (23:54 +0530)]
monitoring: add alert NVMeoFMaxGatewayGroups

Add alert NVMeoFMaxGatewayGroups to prometheus_alerts.yml
and prometheus_alerts.libsonnet.

This alert indicates whether the maximum number of NVMeoF gateway
groups has been reached in a cluster.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agomonitoring: add NVMeoFMaxGatewayGroups
Vallari Agrawal [Mon, 3 Feb 2025 18:22:47 +0000 (23:52 +0530)]
monitoring: add NVMeoFMaxGatewayGroups

Add config NVMeoFMaxGatewayGroups to config.libsonnet
and set it to 4 (groups).

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoMerge pull request #61628 from kamoltat/wip-ksirivad-fix-stretch-mode-doc
Zac Dover [Mon, 3 Feb 2025 17:41:14 +0000 (03:41 +1000)]
Merge pull request #61628 from kamoltat/wip-ksirivad-fix-stretch-mode-doc

doc/rados/operations/stretch-mode: fix mistake in stretch mode

Reviewed-by: Zac Dover <zac.dover@proton.me>
5 months agodoc/rados/operations/stretch-mode: fix mistake in stretch mode 61628/head
Kamoltat Sirivadhna [Mon, 3 Feb 2025 17:18:44 +0000 (17:18 +0000)]
doc/rados/operations/stretch-mode: fix mistake in stretch mode

Degraded stretch mode should only halve the "min_size", not
"size".

Fixes: No tracker (doc changes)
Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
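
As a worked illustration of the corrected statement: only "min_size" changes in degraded stretch mode. The function name is hypothetical, and the floor division and the lower bound of 1 are assumptions of this sketch.

```python
def degraded_stretch_values(size, min_size):
    """Return (size, min_size) as they would apply in degraded stretch mode.

    "size" is left untouched; only "min_size" is halved.
    """
    return size, max(1, min_size // 2)
```

For a pool with size 4 and min_size 2, this yields size 4 and min_size 1.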
5 months agoMerge pull request #61232 from xxhdx1985126/wip-67888-followup
Yuri Weinstein [Mon, 3 Feb 2025 15:59:32 +0000 (07:59 -0800)]
Merge pull request #61232 from xxhdx1985126/wip-67888-followup

osd/PeeringState: rename "cancel_backfill" to "suspend_backfill"

Reviewed-by: Samuel Just <sjust@redhat.com>
5 months agoMerge pull request #61397 from amathuria/wip-amat-test-osdmap-pruning
SrinivasaBharathKanta [Mon, 3 Feb 2025 15:43:28 +0000 (21:13 +0530)]
Merge pull request #61397 from amathuria/wip-amat-test-osdmap-pruning

mon/test_mon_osdmap_prune: Use first_pinned instead of first_committed

5 months agoMerge pull request #61365 from Matan-B/wip-matanb-snapmapper-logs
SrinivasaBharathKanta [Mon, 3 Feb 2025 15:43:09 +0000 (21:13 +0530)]
Merge pull request #61365 from Matan-B/wip-matanb-snapmapper-logs

osd/SnapMapper: Improve logging

5 months agoMerge pull request #61328 from adamemerson/wip-64191
SrinivasaBharathKanta [Mon, 3 Feb 2025 15:42:43 +0000 (21:12 +0530)]
Merge pull request #61328 from adamemerson/wip-64191

test/neorados: Silence mismatched new/delete warning

5 months agoMerge pull request #60945 from NitzanMordhai/wip-nitzan-crushwrapper-corpus-squid
SrinivasaBharathKanta [Mon, 3 Feb 2025 15:42:19 +0000 (21:12 +0530)]
Merge pull request #60945 from NitzanMordhai/wip-nitzan-crushwrapper-corpus-squid

dencoder tests fix type backwards incompatible checks

5 months agowin32_deps_build.sh: pin zlib tag 61627/head
Lucian Petrut [Mon, 3 Feb 2025 14:53:05 +0000 (14:53 +0000)]
win32_deps_build.sh: pin zlib tag

The zlib Windows build started to fail, probably because of this:
https://github.com/madler/zlib/issues/1038

  Cloning into 'zlib'...
  make: *** No rule to make target 'zconf.h', needed by 'adler32.o'.

We'll pin the zlib version for now to unblock the Windows build.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
5 months agoqa/suites/nvmeof: Remove watchdog from thrasher
Vallari Agrawal [Thu, 30 Jan 2025 12:13:48 +0000 (17:43 +0530)]
qa/suites/nvmeof: Remove watchdog from thrasher

This commit does the following:
1. remove watchdog from thrasher
2. remove wait from fio_test
3. change thrasher switcher wait-time to 10 mins

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agosrc: modernize sample.ceph.conf 61620/head
Anthony D'Atri [Sun, 2 Feb 2025 21:38:14 +0000 (16:38 -0500)]
src: modernize sample.ceph.conf

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
5 months agoMerge pull request #61577 from ronen-fr/wip-rf-just-me
Ronen Friedman [Sun, 2 Feb 2025 14:22:07 +0000 (16:22 +0200)]
Merge pull request #61577 from ronen-fr/wip-rf-just-me

osd/scrub: remove unnecessary loop

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
5 months agoMerge pull request #61538 from leonidc/fix-duplicated-optimized
leonidc [Sun, 2 Feb 2025 14:05:02 +0000 (16:05 +0200)]
Merge pull request #61538 from leonidc/fix-duplicated-optimized

nvmeofgw* : fix duplicated optimized host's pathes

5 months agoMerge pull request #61590 from ronen-fr/wip-rf-noinfo-repair
Ronen Friedman [Sun, 2 Feb 2025 14:02:21 +0000 (16:02 +0200)]
Merge pull request #61590 from ronen-fr/wip-rf-noinfo-repair

osd/scrub: discard repair_oinfo_oid()

Reviewed-by: Samuel Just <sjust@redhat.com>
5 months agoMerge pull request #61394 from ronen-fr/wip-rf-cacher-v2
Ronen Friedman [Sun, 2 Feb 2025 13:55:09 +0000 (15:55 +0200)]
Merge pull request #61394 from ronen-fr/wip-rf-cacher-v2

common: modify md_config_obs_impl API

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 months agoMerge pull request #60426 from ronen-fr/wip-rf-svwperf
Ronen Friedman [Sun, 2 Feb 2025 13:49:49 +0000 (15:49 +0200)]
Merge pull request #60426 from ronen-fr/wip-rf-svwperf

common/perf_counters: enabling 'find()' by logger name

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
5 months agoMerge pull request #61613 from zdover23/wip-doc-2025-02-02-architecture 61617/head
Zac Dover [Sat, 1 Feb 2025 21:38:32 +0000 (07:38 +1000)]
Merge pull request #61613 from zdover23/wip-doc-2025-02-02-architecture

doc/architecture: remove sentence

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
5 months agodoc/architecture: remove sentence 61613/head
Zac Dover [Sat, 1 Feb 2025 21:15:32 +0000 (07:15 +1000)]
doc/architecture: remove sentence

Remove a sentence that is more marketing than reference.

Signed-off-by: Zac Dover <zac.dover@proton.me>
5 months agoMerge pull request #61561 from athanatos/sjust/wip-crimson-recovery-69412
Samuel Just [Fri, 31 Jan 2025 18:44:49 +0000 (10:44 -0800)]
Merge pull request #61561 from athanatos/sjust/wip-crimson-recovery-69412

crimson: take obc lock during push commit on primary

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
5 months agoMerge pull request #61001 from MaxKellermann/common_includes
Ilya Dryomov [Fri, 31 Jan 2025 10:50:57 +0000 (11:50 +0100)]
Merge pull request #61001 from MaxKellermann/common_includes

common: add missing includes

Reviewed-by: Adam Emerson <aemerson@redhat.com>
5 months agoMerge pull request #61598 from idryomov/wip-rbd-migration-https-doc
Ilya Dryomov [Thu, 30 Jan 2025 23:01:10 +0000 (00:01 +0100)]
Merge pull request #61598 from idryomov/wip-rbd-migration-https-doc

doc/rbd: use https links in live import examples

Reviewed-by: Zac Dover <zac.dover@proton.me>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
5 months agocrimson/.../replicated_recovery_backend: take excl lock while pushes commit 61561/head
Samuel Just [Wed, 22 Jan 2025 02:41:48 +0000 (18:41 -0800)]
crimson/.../replicated_recovery_backend: take excl lock while pushes commit

Fixes: https://tracker.ceph.com/issues/69412
Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agocrimson/.../replicated_recovery_backend: route pushes earlier
Samuel Just [Wed, 22 Jan 2025 02:47:09 +0000 (18:47 -0800)]
crimson/.../replicated_recovery_backend: route pushes earlier

Let ReplicatedRecoveryBackend::handle_recovery_op route pushes
between handle_push and handle_pull_response instead of
ReplicatedRecoveryBackend::handle_push.

Signed-off-by: Samuel Just <sjust@redhat.com>
5 months agopybind/mgr/cephadm: allow setting custom monitoring_port for nfs 61578/head
Omid Yoosefi [Wed, 29 Jan 2025 20:48:52 +0000 (15:48 -0500)]
pybind/mgr/cephadm: allow setting custom monitoring_port for nfs

ganesha config allows this, so allow users to set their own custom
ports in case they wish to do so.

Signed-off-by: Omid Yoosefi <omidyoosefi@ibm.com>
5 months agomgr/cephadm: add Server_Scope = <fsid> to NFSv4 section of ganesha conf 61571/head
Adam King [Wed, 29 Jan 2025 17:02:50 +0000 (12:02 -0500)]
mgr/cephadm: add Server_Scope = <fsid> to NFSv4 section of ganesha conf

From the ganesha team

"""
In the NFSv4 param block, we need a parameter Server_Scope set to some value common among all servers in a cluster.

The default with it blank is to use the hostname which may be different for each server in the cluster.
"""

This is related to ongoing work on high availability nfs. From the cephadm side
we just need to make sure all nfs daemons in the cluster end up with
the same value for the Server_Scope field. This patch uses the cluster
id (which we already brought into the template as the "namespace" attribute).

Signed-off-by: Adam King <adking@redhat.com>
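
To illustrate the point of the change: if every NFS daemon renders its config from the same cluster id, all of them end up with an identical Server_Scope. This is a sketch only. render_nfsv4_block is a hypothetical helper, the real template is not shown in this log, and the exact quoting and layout of the block are assumptions.

```python
def render_nfsv4_block(cluster_fsid):
    """Render an NFSv4 config block with Server_Scope set to the cluster fsid."""
    return (
        "NFSv4 {\n"
        f'        Server_Scope = "{cluster_fsid}";\n'
        "}\n"
    )


# Two daemons on different hosts render from the same fsid (placeholder value).
fsid = "00000000-0000-0000-0000-000000000000"
conf_host_a = render_nfsv4_block(fsid)
conf_host_b = render_nfsv4_block(fsid)
```

Because the value no longer defaults to the hostname, conf_host_a and conf_host_b are identical.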
5 months agodoc/rbd: use https links in live import examples 61598/head
Ilya Dryomov [Thu, 30 Jan 2025 19:30:18 +0000 (20:30 +0100)]
doc/rbd: use https links in live import examples

Even though it's explicitly said that "http" stream can be used to
import via both HTTP and HTTPS, it can still be confusing that "type":
"http" is expected to go with "url": "https://...".  Switch example
URLs from HTTP to HTTPS to make it more obvious.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
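
The pairing the message describes, "type": "http" together with an https:// URL, can be shown with a small sketch. The helper name and the example URL are placeholders, and the outer "type": "raw" wrapper is an assumption about the source-spec layout rather than something stated in this commit.

```python
import json


def http_stream_source_spec(url):
    """Build a source spec whose stream type is "http".

    The stream type stays "http" for both http:// and https:// URLs.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError("expected an http:// or https:// URL")
    return {"type": "raw", "stream": {"type": "http", "url": url}}


spec = http_stream_source_spec("https://example.com/image.raw")
print(json.dumps(spec, indent=4))
```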
5 months agoMerge pull request #57551 from linuxbox2/wip-rgwlc-66111
Matt Benjamin [Thu, 30 Jan 2025 17:15:36 +0000 (12:15 -0500)]
Merge pull request #57551 from linuxbox2/wip-rgwlc-66111

rgwlc: send pool transition notifications too

5 months agoMerge pull request #60250 from aainscow/interval_set_enhancements
Alex Ainscow [Thu, 30 Jan 2025 17:06:00 +0000 (17:06 +0000)]
Merge pull request #60250 from aainscow/interval_set_enhancements

include: interval_set: Relax requirements and enhance performance of interval sets

5 months agoMerge pull request #61135 from rkachach/fix_issue_cephadm_services_registry
Adam King [Thu, 30 Jan 2025 16:43:58 +0000 (11:43 -0500)]
Merge pull request #61135 from rkachach/fix_issue_cephadm_services_registry

mgr/cephadm: using service registry pattern for cephadm services

Reviewed-by: Adam King <adking@redhat.com>
5 months agoMerge pull request #59480 from bill-scales/ec_partial_read
Bill Scales [Thu, 30 Jan 2025 16:17:25 +0000 (16:17 +0000)]
Merge pull request #59480 from bill-scales/ec_partial_read

Further EC partial stripe read fixes

5 months agoMerge pull request #61591 from gbregman/main
Gil Bregman [Thu, 30 Jan 2025 15:59:11 +0000 (17:59 +0200)]
Merge pull request #61591 from gbregman/main

mgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration

5 months agomgr/cephadm: create OSD daemon deploy specs through make_daemon_spec 61593/head
Adam King [Thu, 30 Jan 2025 14:15:37 +0000 (09:15 -0500)]
mgr/cephadm: create OSD daemon deploy specs through make_daemon_spec

That function handles setting up the extra container/entrypoint
args for the daemon during initial deployment. Having the
CephadmDaemonDeploySpec made directly in the OSD deployment
workflow means initial deployments of OSDs won't have the
extra container/entrypoint args from the spec.

Fixes: https://tracker.ceph.com/issues/69734
Signed-off-by: Adam King <adking@redhat.com>
5 months agoMerge PR #61537 into main
Venky Shankar [Thu, 30 Jan 2025 12:44:05 +0000 (18:14 +0530)]
Merge PR #61537 into main

* refs/pull/61537/head:
libcephfs_proxy: implement ceph_readdir_r()

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
5 months agoqa/workunits/rbd: add test_import_nbd_stream_disconnected() 61567/head
Ilya Dryomov [Tue, 28 Jan 2025 08:33:37 +0000 (09:33 +0100)]
qa/workunits/rbd: add test_import_nbd_stream_disconnected()

When the NBD server is killed, nbd_pread() can set errno to at least
ENOTCONN, EINVAL and 0 which is supposed to stand for "no additional
errno information is available for this error".  Add a test to ensure
that "rbd migration execute" command always fails and that the image
isn't transitioned to MIGRATION_STATE_EXECUTED in this scenario.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
5 months agolibrbd: stop filtering async request error codes
Ilya Dryomov [Wed, 29 Jan 2025 11:56:34 +0000 (12:56 +0100)]
librbd: stop filtering async request error codes

The roots of this go back to 2015 when snap create was changed to
filter EEXIST in commit 63f6c9bac9a4 ("librbd: fixed snap create race
conditions") and flatten respectively EINVAL in commit ef7e210c3f74
("librbd: better handling for duplicate flatten requests").  From there
this pattern made it to most other operations that can be proxied
including "rbd migration execute".

The motivation was to suppress generation of an "expected" error in
response to a duplicate async request notification for the operation.
However, doing this at the top of the handler (right before returning
to the caller) and for an error as generic as EINVAL is super fragile.
It's trivial for an error that is being filtered to sneak in with
a lower level change completely unnoticed.  For example, live migration
recently added NBD stream which is implemented on top of libnbd and it
turns out that some libnbd APIs return EINVAL on various occasions when
the NBD endpoint disappears and an error like ENOTCONN would make more
sense.  If this occurs during "rbd migration execute" operation, the
rest of librbd never learns that migration was disrupted and the image
is transitioned to MIGRATION_STATE_EXECUTED, thus handing a partially
imported (read: corrupted) image to the user.

Luckily, with commits 07fbc4b71df4 ("librbd: track complete async
operation requests") and 96bc20445afb ("librbd: track complete async
operation return code"), the scenario which originally prompted error
code filtering isn't an issue anymore.  Despite a few shortcomings
(e.g. when an async request notification is acked with result 0, it's
impossible to tell whether a) a new operation was kicked off, b) there
is an operation that is still in progress or c) it's for an operation
that completed earlier but hasn't "expired" yet), even just commit
07fbc4b71df4 by itself prevents a duplicate notification from kicking
off a second operation that could generate an error for something that
actually succeeded.  With that in mind, eradicate error code filtering
from Operations class.

Fixes: https://tracker.ceph.com/issues/58185
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
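
The fragility described above can be demonstrated with a toy model. This is not librbd code: the function names are hypothetical, and the "operation" is a stand-in for a lower layer that returns EINVAL when its endpoint disappears.

```python
import errno


def run_filtered(operation):
    """Old pattern: swallow EINVAL right before returning to the caller."""
    r = operation()
    if r == -errno.EINVAL:
        # Assumed to mean "duplicate request", which may be wrong.
        return 0
    return r


def run_unfiltered(operation):
    """New pattern: pass the result through unchanged."""
    return operation()


def import_with_lost_endpoint():
    """Stand-in for a lower layer reporting EINVAL after the endpoint vanished."""
    return -errno.EINVAL
```

With filtering, the failed import is reported as success (0). Without it, the caller sees -EINVAL and can refuse to mark the operation as completed.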
5 months agoqa/tasks/nvmeof.py: Add teardown() method
Vallari Agrawal [Wed, 29 Jan 2025 15:34:04 +0000 (21:04 +0530)]
qa/tasks/nvmeof.py: Add teardown() method

Add a teardown method to remove the nvmeof service
before the rest of the cluster is torn down.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa/tasks/nvmeof.py: Ignore systemctl_stop thrashing method
Vallari Agrawal [Tue, 28 Jan 2025 12:43:17 +0000 (18:13 +0530)]
qa/tasks/nvmeof.py: Ignore systemctl_stop thrashing method

Do not use systemctl_stop method to thrash daemons,
just use 'ceph orch daemon stop' and 'ceph orch daemon rm'
methods to thrash nvmeof gateways.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa/tasks/nvmeof.py: Fix do_checks() method
Vallari Agrawal [Tue, 28 Jan 2025 09:18:15 +0000 (14:48 +0530)]
qa/tasks/nvmeof.py: Fix do_checks() method

All checks currently run on the initiator node. Now
run all "ceph" commands on one of the gateway hosts
instead of the initiator node, and run the "nvme list"
and "nvme list-subsys" checks on the initiator node.

Add retry (5 times) to do_checks if any command fails.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
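
A retry wrapper of the kind described above might look like this. It is a generic sketch, not the code in qa/tasks/nvmeof.py: run_with_retries and its parameters are hypothetical, and the delay between attempts is an assumption.

```python
import time


def run_with_retries(check, attempts=5, delay=1.0, sleep=time.sleep):
    """Call check() up to `attempts` times, re-raising the last failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return check()
        except Exception:
            if attempt == attempts:
                raise
            # Give the cluster a moment to settle before the next try.
            sleep(delay)
```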
5 months agoqa/tasks/nvmeof.py: make separate calls in do_checks()
Vallari Agrawal [Mon, 20 Jan 2025 11:43:44 +0000 (17:13 +0530)]
qa/tasks/nvmeof.py: make separate calls in do_checks()

When running 'nvme list-subsys <device>' command
in do_checks(), instead of combining command for
all devices with '&&', make separate calls.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
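
The difference between one '&&' chain and one call per device can be sketched as follows. The helper names and the `run` callable are hypothetical; only the 'nvme list-subsys <device>' command comes from the commit message.

```python
def list_subsys_commands(devices):
    """Build one 'nvme list-subsys <device>' argv per device."""
    return [["nvme", "list-subsys", dev] for dev in devices]


def run_each(commands, run):
    """Run every command separately and record the result per device.

    `run` is any callable that takes an argv list and returns a result,
    for example an exit code.
    """
    return {argv[-1]: run(argv) for argv in commands}
```

With separate calls, a failure can be attributed to a specific device, and a failure on one device does not stop the remaining checks from running.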
5 months agoqa/tasks/nvmeof.py: Do not use 'systemctl start' in thrasher
Vallari Agrawal [Tue, 14 Jan 2025 03:52:31 +0000 (09:22 +0530)]
qa/tasks/nvmeof.py: Do not use 'systemctl start' in thrasher

Instead use 'daemon start' in revive_daemon() to bring
up gateways thrashed with 'systemctl stop'.
This is because the 'systemctl start' method seems to have temporary
issues.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa/workunits/nvmeof/fio_test.sh: fix fio filenames
Vallari Agrawal [Tue, 14 Jan 2025 03:49:03 +0000 (09:19 +0530)]
qa/workunits/nvmeof/fio_test.sh: fix fio filenames

Filenames were provided to fio as nvme1n1:nvme1n2;
they should be full paths (/dev/nvme1n1:/dev/nvme1n2).

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
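
The filename fix amounts to prefixing each device name with its directory before joining with ':'. The real script is shell; this Python sketch with a hypothetical helper name only illustrates the string being built.

```python
def fio_filename_arg(device_names, dev_dir="/dev"):
    """Join device names into a colon-separated list of full paths."""
    return ":".join(f"{dev_dir}/{name}" for name in device_names)
```

For the devices in the message, fio_filename_arg(["nvme1n1", "nvme1n2"]) gives /dev/nvme1n1:/dev/nvme1n2.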
5 months agoqa/tasks/nvmeof.py: Add stop_and_join method to thrasher
Vallari Agrawal [Mon, 13 Jan 2025 19:10:35 +0000 (00:40 +0530)]
qa/tasks/nvmeof.py: Add stop_and_join method to thrasher

Also add nvme-gw show command output in do_checks()
and revive daemons with 'ceph orch daemon start' in
revive_daemon() method.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa: Add qa/suites/nvmeof/thrash/gateway-initiator-setup/2-subsys-8-namespace.yaml
Vallari Agrawal [Mon, 13 Jan 2025 09:39:39 +0000 (15:09 +0530)]
qa: Add qa/suites/nvmeof/thrash/gateway-initiator-setup/2-subsys-8-namespace.yaml

This allows running the nvmeof thrasher test on smaller
configurations, which finish faster than the 120subsys-8ns
config.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agoqa/tasks/nvmeof: Add --refresh flag in do_checks() cmds
Vallari Agrawal [Mon, 13 Jan 2025 09:33:27 +0000 (15:03 +0530)]
qa/tasks/nvmeof: Add --refresh flag in do_checks() cmds

This ensures that the latest state of the services is displayed.

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
5 months agomgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration and remove... 61591/head
Gil Bregman [Thu, 30 Jan 2025 11:33:51 +0000 (13:33 +0200)]
mgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration and remove obsolete enable_key_encryption

Fixes: https://tracker.ceph.com/issues/69731

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
5 months agoMerge pull request #61540 from idryomov/wip-69679
Ilya Dryomov [Thu, 30 Jan 2025 10:23:16 +0000 (11:23 +0100)]
Merge pull request #61540 from idryomov/wip-69679

mon/OSDMonitor: relax cap enforcement for unmanaged snapshots

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
5 months agoosd/scrub: discard repair_oinfo_oid() 61590/head
Ronen Friedman [Thu, 30 Jan 2025 09:27:58 +0000 (03:27 -0600)]
osd/scrub: discard repair_oinfo_oid()

repair_oinfo_oid(), called every scrub, has a very specific
functionality: fix the object ID specified in the Object Info
attribute, if different from the ID of the owning object.

This fix was added in 2017, as a response to a unique failure
scenario that was observed in Sepia - probably following a
filesystem bug. See https://tracker.ceph.com/issues/18409 &
https://tracker.ceph.com/issues/20471.

The limited functionality of repair_oinfo_oid() -
only repairing this one specific issue, and only if the OI_ATTR
exists and is decodable - does not justify the overhead of
running it every scrub.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
5 months agoosd/scrub: remove unnecessary loop 61577/head
Ronen Friedman [Wed, 29 Jan 2025 19:09:36 +0000 (13:09 -0600)]
osd/scrub: remove unnecessary loop

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
5 months agoMerge pull request #61471 from idryomov/wip-65889-more
Ilya Dryomov [Thu, 30 Jan 2025 08:38:29 +0000 (09:38 +0100)]
Merge pull request #61471 from idryomov/wip-65889-more

cls/rbd: don't use read API for write-like methods

Reviewed-by: Ramana Raja <rraja@redhat.com>
5 months agocommon: ceph_context: make use of get_tracked_keys() 61394/head
Ronen Friedman [Wed, 15 Jan 2025 07:49:46 +0000 (01:49 -0600)]
common: ceph_context: make use of get_tracked_keys()

Modify some configuration object registrations
in common/ceph_context to use the updated
md_config_obs_t::get_tracked_keys() API.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
5 months agoMerge pull request #61545 from Hezko/nvmeof-cli-add-cmnds 60583/head
Hezko [Thu, 30 Jan 2025 07:07:27 +0000 (09:07 +0200)]
Merge pull request #61545 from Hezko/nvmeof-cli-add-cmnds

mgr/dashboard: Add additional API and CLI endpoints

5 months agoMerge pull request #61465 from ArbitCode/wip-raja-fix-multipart-upload-cant-get-obj-tag
Raja [Thu, 30 Jan 2025 06:29:44 +0000 (11:59 +0530)]
Merge pull request #61465 from ArbitCode/wip-raja-fix-multipart-upload-cant-get-obj-tag

RGW:fix obj by multipart upload cant get tag

5 months agomds: fix option mds_bal_overload_epochs 60686/head
Zhansong Gao [Mon, 11 Nov 2024 05:26:03 +0000 (13:26 +0800)]
mds: fix option mds_bal_overload_epochs

When option mds_bal_overload_epochs was added, two positions
should have been modified, but one of them was overlooked.

Fixes: https://tracker.ceph.com/issues/68953
Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
5 months agocephadm: remove some lines that are now redundant 61579/head
John Mulligan [Thu, 30 Jan 2025 00:27:06 +0000 (19:27 -0500)]
cephadm: remove some lines that are now redundant

The previous set of patches replaced some function calls and now there
are unnecessary lines present. Remove them.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
5 months agocephadm: use parsed_container_cpu_perc in cephadm.py
John Mulligan [Thu, 30 Jan 2025 00:25:52 +0000 (19:25 -0500)]
cephadm: use parsed_container_cpu_perc in cephadm.py

Replace the use of _parse_cpu_perc and related command calls with
parsed_container_cpu_perc.  This needs no additional test updates
because the test updates in the previous patch that added
parsed_container_cpu_perc covered all of that already.

Signed-off-by: John Mulligan <jmulligan@redhat.com>